Revision as of 14:44, 21 April 2023 by Proton2: Created the Common Lisp entry
Revision as of 16:10, 21 April 2023 by Thundergnat (minor edit): Add draft markup, related task, Raku example
Line 1:
{{draft task}}
An N-gram is a sequence of N contiguous elements of a given text. Although N-grams sometimes refer to words or syllables, this task considers only sequences of characters.
Given a text and an integer size for the desired N-grams, the task is to find all the distinct contiguous sequences of N characters, together with the number of times each appears in the text.

Line 16 ⟶ 18:
Note that space and other non-alphanumeric characters are taken into account.

;See also
;* [[Sorensen–Dice_coefficient|Related task: Sorensen–Dice coefficient]]

Line 42 ⟶ 48:
("ND" . 1) ("D " . 1) ("LE" . 1) ("ET" . 1) ("T " . 1))
</syntaxhighlight>

=={{header|Raku}}==
<syntaxhighlight lang="raku" line>sub n-gram ($this, $N=2) { Bag.new( flat $this.uc.map: { .comb.rotor($N => -($N-1))».join } ) }
dd 'Live and let live'.&n-gram; # bi-gram
dd 'Live and let live'.&n-gram(3); # tri-gram</syntaxhighlight>
{{out}}
<pre>("IV"=>2,"T "=>1,"VE"=>2,"E "=>1,"LE"=>1,"AN"=>1,"LI"=>2,"ND"=>1,"ET"=>1," L"=>2," A"=>1,"D "=>1).Bag
("ET "=>1,"AND"=>1,"LIV"=>2," LI"=>1,"ND "=>1," LE"=>1,"IVE"=>2,"E A"=>1,"VE "=>1,"T L"=>1,"D L"=>1,"LET"=>1," AN"=>1).Bag</pre>
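For readers less familiar with Raku, the task can also be sketched in Python. This is only an illustrative sketch, not part of the revision: the helper name `char_ngrams` is hypothetical, and upper-casing the input is an assumption carried over from the `.uc` call in the Raku entry rather than something the task description requires.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 2) -> Counter:
    """Count every contiguous run of n characters in text (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Assumption: fold to upper case so "Li" and "li" count as the same N-gram,
    # mirroring the .uc call in the Raku entry.
    text = text.upper()
    # Slide a window of width n across the text. Spaces and punctuation are
    # kept, as the task requires. A text shorter than n yields no windows,
    # so the result is an empty Counter.
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


bigrams = char_ngrams("Live and let live")
trigrams = char_ngrams("Live and let live", 3)
print(bigrams.most_common(3))
print(trigrams.most_common(2))
```

For the sample sentence this yields 12 distinct bigrams and 13 distinct trigrams, matching the number of entries in the two Raku bags; `Counter` and the Raku `Bag` play the same role of mapping each N-gram to its count.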

N-grams: Difference between revisions
