
Word frequency: Difference between revisions

→‎{{header|REXX}}: added the REXX language.
m (→‎{{header|Perl 6}}: more concisely)
Line 147:
("that" . 7922)
("it" . 6659))</pre>
 
=={{header|REXX}}==
This REXX version doesn't need to sort the list of words; it repeatedly scans the word list for the highest remaining count.
<lang rexx>/*REXX program reads and displays a count of words in a file. Word case is ignored.*/
parse arg fID top . /*obtain optional arguments from the CL*/
if fID=='' | fID=="," then fID= 'les_mes.TXT' /*None specified? Then use the default.*/
if top=='' | top=="," then top= 10 /* " " " " " " */
c=0; @.=0 /*initialize # of unique words; counts.*/
!.= /* " the original word instance*/
do #=1 while lines(fID)\==0 /*loop whilst there are lines in file. */
y=space( linein(fID) ) /*remove superfluous blanks in the line*/
$= /*$: is a list of words in this line. */
do j=1 for length(y); _=substr(y,j,1) /*obtain a character of the word found.*/
if datatype(_, 'M') then $=$ || _ /*Is it a letter? Append to $. */
else $=$ || ' ' /*Is it not a letter? Append blank. */
end /*j*/
$=strip($) /*strip any leading and trailing blanks*/
do while $\=''; parse var $ z $ /*now, process each word in the $ list.*/
oz=z; upper z /*obtain an uppercase version of word. */
if @.z==0 then do; c=c+1; !.c=z; end /*bump the word#; assign word to array*/
@@.z=oz /*save the original case of the word. */
@.z=@.z + 1 /*bump the count of occurrences of word*/
end /*while*/
end /*#*/
say right('word',40) " " center(' rank ',6) " count " /*display a title for output*/
say right('════',40) " " center('══════',6) "═══════" /* " a title separator.*/
 
do tops=1 by 0 until tops>top; mc=0 /*process enough words to satisfy TOP.*/
tl= /*initialize (possibly) a list of words*/
do n=1 for c; z=!.n; count=@.z /*process the list of words in the file*/
if count<1 then iterate /*Is count too small? Then ignore it.*/
if count==mc then tl=tl z /*handle cases of tied number of words.*/
if count>mc then do; mc=count /*this word count is the current max. */
tl=z /* " word " " " " */
end
end /*n*/
if tl=='' then leave /*no words remain? Then stop listing. */
w=0 /*will be the maximum length of count. */
wr=max( length(' rank '), length(top) ) /*find the maximum length of the rank #*/
do d=1 for words(tl); _=word(tl, d)
if d==1 then w=max(8, length(@._)) /*use the width of the largest count. */
say right(@@._, 40 ) right(tops, wr) right(@._, w)
@._=0 /*nullify this word count for next time*/
end /*d*/
tops=tops + words(tl) /*correctly handle the tied rankings. */
end /*tops*/ /*stick a fork in it, we're all done. */</lang>
{{out|output|text=&nbsp; when using the default inputs:}}
 
This output agrees with '''UNIX Shell'''.
<pre>
                                    word    rank   count
                                    ════   ══════ ═══════
                                     the      1    41089
                                      of      2    19949
                                     and      3    14942
                                       a      4    14608
                                      to      5    13951
                                      in      6    11214
                                      he      7     9648
                                     was      8     8621
                                    that      9     7924
                                      it     10     6661
</pre>
 
=={{header|UNIX Shell}}==