Open a text file and count the occurrences of each letter.
Some of these programs count all characters (including punctuation), but some only count letters A to Z.
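As a neutral reference point, the task can be sketched in Python as follows. This is only an illustration: it assumes UTF-8 input and the letters-only, case-insensitive reading of the task, and the names <code>count_letters</code> and <code>count_letters_in_file</code> are made up for this sketch.

```python
from collections import Counter
import string


def count_letters(text):
    """Count ASCII letters in text, ignoring case (assumed interpretation)."""
    return Counter(ch.lower() for ch in text if ch in string.ascii_letters)


def count_letters_in_file(path, encoding="utf-8"):
    """Read a whole text file and count its letters (UTF-8 is an assumption)."""
    with open(path, encoding=encoding) as handle:
        return count_letters(handle.read())


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        for letter, count in sorted(count_letters_in_file(sys.argv[1]).items()):
            print(f"{letter}: {count}")
```

Programs that count every character instead would simply drop the <code>string.ascii_letters</code> filter.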
=={{header|11l}}==
<syntaxhighlight lang="11l">F countletters(s)
   DefaultDict[Char, Int] results
   L(char) s
      V c = char.lowercase()
      I c C ‘a’..‘z’
         results[c]++
   R results

L(letter, count) countletters(File(:argv[1]).read())
   print(letter‘=’count)</syntaxhighlight>
=={{header|8080 Assembly}}==
This program prints the frequency of each printable ASCII character
contained in the file.
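For comparison, the same behaviour can be sketched in a high-level language. The Python below is an illustration only, not a translation of the listing: it assumes the file is available as raw bytes, treats byte 26 (^Z) as the CP/M end-of-text marker, and wraps each counter at 16 bits as a two-byte counter would. The function names are hypothetical.

```python
CPM_EOF = 26        # ^Z marks the end of text in a CP/M file
RECORD_SIZE = 128   # CP/M files are stored in 128-byte records


def printable_ascii_counts(data):
    """Return {char: count} for printable ASCII (32..126) bytes before ^Z."""
    counts = [0] * 256
    for start in range(0, len(data), RECORD_SIZE):
        record = data[start:start + RECORD_SIZE]
        for byte in record:
            if byte == CPM_EOF:
                return _printable(counts)
            counts[byte] = (counts[byte] + 1) & 0xFFFF  # 16-bit wraparound
    return _printable(counts)


def _printable(counts):
    """Keep only the non-zero counters for character codes 32..126."""
    return {chr(code): counts[code] for code in range(32, 127) if counts[code]}
```

Every byte value is counted, but only printable characters with a non-zero count are reported, which matches the description above.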
<syntaxhighlight lang="8080asm">bdos: equ 5 ; CP/M syscalls
putch: equ 2 ; Print a character
puts: equ 9 ; Print a string
fopen: equ 15 ; Open a file
fread: equ 20 ; Read a file
fcb: equ 5ch ; FCB for file given on command line
dma: equ 80h ; Default DMA
org 100h ; CP/M loads the program starting at page 1
;; Zero out pages two and three (to keep a 16-bit counter
;; for each possible byte in the file).
;; We can do this because this program is small enough to
;; fit in page 1 in its entirety.
xra a ; Zero A.
mov b,a ; Zero B too (make it loop 256 times)
lxi d,200h ; Start of page two
zero: stax d ; Zero out a byte (store A, which is zero)
inx d ; Next byte
stax d ; Zero out another byte
inx d ; Next byte
dcr b ; Decrement the loop counter.
jnz zero ; Continue until B comes back to zero.
;; Open the file given on the command line.
lxi d,fcb ; CP/M always tries to parse the command line,
mvi c,fopen ; and gives us a file "object" in page zero.
call bdos ; We can just call fopen on it.
inr a ; It sets A=FF on error, so if incrementing A
jz error ; rolls back over to 0, that's an error.
;; Process the file record by record.
;; In CP/M, each file consists of a number of 128-byte
;; records. An exact size is not kept.
;; If a text file is not an exact multiple of 128 bytes
;; long, the last record will contain a ^Z (26 decimal),
;; and anything after that byte should be ignored.
read: lxi d,fcb ; From the file control block (the "object"),
mvi c,fread ; read one record. By default it ends up in
call bdos ; the last half of page zero.
ana a ; Zero carry flag.
rar ; Low bit says if end reached
jc output ; If so, go print the table
ana a ; If any other bits are set, that's a
jnz error ; read error.
;; Count the characters in the current record.
lxi d,dma-1 ; Set DE to point just before the record
byte: inr e ; Go to the next byte.
jz read ; If end of record, go get next record.
ldax d ; Grab the current byte
cpi 26 ; If it is EOF, we're done.
jz output ; Go print the table
mov l,a ; Otherwise, increment the counter for this
mvi h,2 ; character: the low byte is kept in page 2.
inr m ; 'm' means the value in memory at HL.
jnz byte ; If no rollover, we're done; count next byte
inr h ; But we're keeping a 16-bit counter, so
inr m ; if there is rollover, increment high byte.
jmp byte ; The high byte is in page 3 -unorthodox, but
; it's easy to access here.
;; We've done the whole file. For each printable
;; ASCII character (32..126), print the character and
;; the count.
output: mvi a,32 ; Start at 32.
;; Print a character and its counter
char: mov l,a ; Load 16-bit counter into DE. Low byte
mvi h,2 ; is in page 2 at a;
mov e,m
inr h ; And the high byte is in page 3.
mov d,m
mov a,d ; Test if the counter is zero
ora e
mov a,l ; Put the character back in A
jz next ; If zero, don't print anything.
push psw ; If not, push the character,
push d ; and the counter.
mvi c,putch ; Print the current character
mov e,a
call bdos
lxi d,separator ; Then print ': '
call outs
;; Then convert the counter to ASCII
pop d ; Retrieve the counter
lxi h,numend ; Get pointer to end of digit string
push h ; And put it on the stack
dgtloop: xchg ; Put counter in HL (16-bit accumulator)
lxi b,-10 ; Divisor is 10 (added as -10 to subtract)
mov d,b ; Start quotient at -1 (we'll loop once
mov e,b ; too many, this corrects for it)
divloop: inx d ; Increment the quotient,
dad b ; subtract 10 from the dividend,
jc divloop ; and keep doing it until it goes negative
lxi b,10+'0' ; Add 10 back to get the remainder,
dad b ; plus '0' to make it ASCII.
mov a,l
pop h ; Retrieve digit pointer
dcx h ; Decrement it (to point at current digit)
mov m,a ; Store the digit
push h ; And store the new pointer
mov a,d ; Check if the quotient is now zero
ora e
jnz dgtloop ; If not, do the next digit.
pop d ; Set DE to point at the first digit
call outs ; And output it as a string.
pop psw ; Restore the character
next: inr a ; Increment it
cpi 127 ; Did we just do the last character?
jnz char ; If not, go do the next character.
ret ; If so, we're done.
;; Print the error message
error: lxi d,errmsg
;; Print string
outs: mvi c,puts
jmp bdos
;; Strings
errmsg: db '?$' ; "Error message" (if file error)
separator: db ': $' ; Goes in between character and number
number: db '00000' ; Space to keep ASCII representation of
numend: db 13,10,'$' ; a 16-bit number, plus newline.
</syntaxhighlight>
=={{header|8th}}==
<syntaxhighlight lang="8th">
needs map/iter
8 var, numtasks
var tasks
0 args "Give filename as param" thrownull
f:slurp >s s:len numtasks @ n:/ n:ceil 1 a:close s:/ a:len numtasks ! constant work
m:new constant result
: print-character-count \ m s -- m
swap over m:@ rot s:size 1 n:= over -1 s:@ nip 31 n:< and if
-1 s:@ nip "<%d>" s:strfmt
"'%s'" s:strfmt
"%s: %d\n" s:strfmt . ;
: print-results
tasks @ a:len ( a:pop t:result nip ( result -rot m:[]! drop ) m:each drop ) swap times drop
result ( nip array? if ' n:+ 0 a:reduce then ) m:map
m:keys ' s:cmp a:sort ' print-character-count m:iter drop ;
: task \ slice --
"" s:/ ' noop a:group
( nip a:len nip ) m:map ;
: start-tasks
( work a:pop nip 1 ' task t:task-n a:push ) numtasks @ times
tasks ! ;
: wait-tasks
tasks @ t:wait ;
: app:main
bye ;
</syntaxhighlight>
=={{header|AArch64 Assembly}}==
{{works with|as|Raspberry Pi 3B version Buster 64 bits <br> or android 64 bits with application Termux }}
<syntaxhighlight lang="aarch64 assembly">
/* ARM assembly AARCH64 Raspberry PI 3B */
/* program cptletters64.s */
/* Constants */
/* for this file see task include a file in language AArch64 assembly*/
.include "../includeConstantesARM64.inc"
.equ BUFFERSIZE, 300000
/* Initialized data */
szMessOpen: .asciz "File open error.\n"
szMessStat: .asciz "File information error.\n"
szMessRead: .asciz "File read error.\n"
szMessClose: .asciz "File close error.\n"
szMessDecryptText: .asciz "Decrypted text :\n"
szMessCryptText: .asciz "Encrypted text :\n"
szMessErrorChar: .asciz "Character text not Ok!\n"
szFileName: .asciz "unixdict.txt"
//szFileName: .asciz "test1.txt"
szMessResult: .asciz "Result: = "
szCarriageReturn: .asciz "\n"
szMessStart: .asciz "Program 64 bits start.\n"
/* UnInitialized data */
sZoneConv: .skip 24
tabCptLetters: .skip 8 * 52 // (A-Z a-z) counter array
sBuffer: .skip BUFFERSIZE // file buffer
/* code section */
.global main
main: // entry of program
ldr x0,qAdrszMessStart
bl affichageMess
mov x0,AT_FDCWD
ldr x1,qAdrszFileName // file name
mov x2,#O_RDWR // flags
mov x3,#0 // mode
mov x8,#OPEN // file open
svc 0
cmp x0,#0 // error ?
ble 99f
mov x9,x0 // FD save
mov x0,x9
ldr x1,qAdrsBuffer
ldr x2,#iBufferSize
mov x8,#READ // call system read file
svc 0
cmp x0,#0 // error read ?
blt 97f
mov x6,x0 // file size
mov x0,x9
mov x8,#CLOSE // call system close file
svc 0
cmp x0,#0 // error close ?
blt 96f
ldr x0,qAdrsBuffer
mov x1,x6
bl cptLetters
b 100f
96: // close error
ldr x0,qAdrszMessClose
bl affichageMess
mov x0,#-1 // error
b 100f
97: // read error
ldr x0,qAdrszMessRead
bl affichageMess
mov x0,#-1 // error
b 100f
99: // open error
ldr x0,qAdrszMessOpen
bl affichageMess
mov x0,#-1 // error
100: // standard end of the program
mov x0, #0 // return code
mov x8, #EXIT // request to exit program
svc 0 // perform the system call
qAdrsZoneConv: .quad sZoneConv
qAdrszMessResult: .quad szMessResult
qAdrszCarriageReturn: .quad szCarriageReturn
qAdrszMessStart: .quad szMessStart
qAdrszFileName: .quad szFileName
qAdrszMessOpen: .quad szMessOpen
qAdrszMessRead: .quad szMessRead
qAdrszMessStat: .quad szMessStat
qAdrszMessClose: .quad szMessClose
qAdrsBuffer: .quad sBuffer
iBufferSize: .quad BUFFERSIZE
/* letters frequency */
/* x0 contains a file buffer */
/* x1 contains string length */
cptLetters:
stp x1,lr,[sp,-16]!
stp x2,x3,[sp,-16]!
stp x4,x5,[sp,-16]!
stp x6,x7,[sp,-16]!
ldr x4,qAdrtabCptLetters // counter array
mov x3,#0 // index string
1:
ldrb w2,[x0,x3] // load byte of string
cmp x2,#'A' // select alpha characters lower or upper
blt 5f
cmp x2,#'Z'
bgt 2f
sub x5,x2,#65 // convert ASCII upper case to array index (0-25)
b 3f
2:
cmp x2,#'z'
bgt 5f
cmp x2,#'a'
blt 5f
sub x5,x2,#97 - 26 // convert ASCII lower case to array index (26-51)
3:
ldr x7,[x4,x5,lsl #3] // load counter of loaded character
add x7,x7,#1 // increment counter
str x7,[x4,x5,lsl #3] // and store
5:
add x3,x3,#1 // increment text index
cmp x3,x1 // end ?
blt 1b // and loop
ldr x7,qAdrszMessResult
mov x2,65 // for upper case ASCII character
mov x3,#0
6: // result display
ldr x1,[x4,x3,lsl #3] // load counter
cmp x1,#0 // if zero not display
beq 7f
cmp x3,#25 // upper ?
add x2,x3,65 // for upper case ASCII character
add x8,x3,#97 - 26 // lower
csel x6,x2,x8,le // compute ASCII character
strb w6,[x7,#9] // store in message
mov x0,x1 // convert count in decimal
ldr x1,qAdrsZoneConv
bl conversion10
ldr x0,qAdrszMessResult // and display
bl affichageMess
ldr x0,qAdrsZoneConv
bl affichageMess
ldr x0,qAdrszCarriageReturn
bl affichageMess
7:
add x3,x3,#1
cmp x3,#52
blt 6b
ldp x6,x7,[sp],16
ldp x4,x5,[sp],16
ldp x2,x3,[sp],16
ldp x1,lr,[sp],16
ret // return to caller
qAdrtabCptLetters: .quad tabCptLetters
/* for this file see task include a file in language AArch64 assembly*/
.include "../includeARM64.inc"
</syntaxhighlight>
{{out}}
<pre>
Program 64 bits start.
Result: a = 16421
Result: b = 4115
Result: c = 8216
Result: d = 5799
Result: e = 20144
Result: f = 2662
Result: g = 4129
Result: h = 5208
Result: i = 13980
Result: j = 430
Result: k = 1925
Result: l = 10061
Result: m = 5828
Result: n = 12097
Result: o = 12738
Result: p = 5516
Result: q = 378
Result: r = 13436
Result: s = 10210
Result: t = 12836
Result: u = 6489
Result: v = 1902
Result: w = 1968
Result: x = 617
Result: y = 3633
Result: z = 433
</pre>
=={{header|ACL2}}==
<syntaxhighlight lang="lisp">(defun increment-alist (tbl key)
(cond ((endp tbl) (list (cons key 1)))
((eql (car (first tbl)) key)

(defun letter-freq (str)
(freq-table (coerce str 'list)))</syntaxhighlight>
=={{header|Action!}}==
{{libheader|Action! Tool Kit}}
<syntaxhighlight lang="action!">INCLUDE "D2:PRINTF.ACT" ;from the Action! Tool Kit
CARD ARRAY histogram(256)
PROC Clear()
FOR i=0 TO 255
DO histogram(i)=0 OD
PROC ProcessLine(CHAR ARRAY line)
FOR i=1 TO line(0)
PROC ProcessFile(CHAR ARRAY fname)
CHAR ARRAY line(255)
BYTE dev=[1]
WHILE Eof(dev)=0
PROC PrintResult()
FOR i=0 TO 255
IF histogram(i) THEN
PrintF(" %C:%-5S",i,s)
PROC Main()
LMARGIN=0 ;remove left margin on the screen
Put(125) PutE() ;clear the screen
PrintF("Reading ""%S""...%E%E",fname)
LMARGIN=old ;restore left margin on the screen
</syntaxhighlight>
{{out}}
[https://gitlab.com/amarok8bit/action-rosetta-code/-/raw/master/images/Letter_frequency.png Screenshot from Atari 8-bit computer]
<pre>
Reading "H6:LETTE_KJ.ACT"...
:150 !:1 ":12 $:1 %:5
(:27 ):27 +:1 ,:8 -:1
.:5 0:7 1:5 2:7 4:1
5:10 6:2 ::3 ;:4 =:13
A:25 B:2 C:19 D:12 E:16
F:10 G:4 H:8 I:13 J:1
K:2 L:9 M:5 N:15 O:20
P:16 R:44 S:4 T:20 U:6
W:1 Y:8 [:1 ]:1 _:1
a:16 c:9 d:10 e:49 f:9
g:8 h:9 i:36 l:20 m:14
n:29 o:23 p:2 r:25 s:24
t:24 u:5 v:7
</pre>
=={{header|Ada}}==
<syntaxhighlight lang="ada">with Ada.Text_IO;
procedure Letter_Frequency is

end if;
end loop;
end Letter_Frequency;</syntaxhighlight>
Sample Output (counting the characters of its own source code):
' ': 122
=={{header|Aikido}}==
<syntaxhighlight lang="aikido">import ctype
var letters = new int [26]

foreach i letters.size() {
    println (cast<char>('a' + i) + " " + letters[i])
}</syntaxhighlight>
=={{header|Aime}}==
Letters proper:
<syntaxhighlight lang="aime">file f;
index x;
integer c;

while ((c = f.pick) ^ -1) {
    x[c] += 1;
}

c = 'A';
while (c <= 'Z') {
    o_form("%c: /w5/\n", c, x[c] += x[c + 'a' - 'A'] += 0);
    c += 1;
}</syntaxhighlight>
All chars:
<syntaxhighlight lang="aime">file f;
index x;
integer c, n;

while ((c = f.pick) ^ -1) {
    x[c] += 1;
}

for (c, n in x) {
    o_form("%c: /w5/\n", c, n);
}</syntaxhighlight>
=={{header|ALGOL 68}}==
<syntaxhighlight lang="algol68">
[0:max abs char]INT histogram;
FOR i FROM 0 TO max abs char DO histogram[i] := 0 OD;
FILE input file;
STRING input file name = "Letter_frequency.a68";
IF open (input file, input file name, stand in channel) /= 0 THEN
put (stand error, ("Cannot open ", input file name, newline));
on file end (input file, (REF FILE f) BOOL: (close (f); GOTO finished))
get (input file, (s, newline));
CHAR c = s[i];
IF "A" <= c AND c <= "Z" OR "a" <= c AND c <= "z" THEN
histogram[ABS c] PLUSAB 1
close (input file);
FOR i FROM ABS "A" TO ABS "Z" DO printf (($a3xg(0)l$, REPR i, histogram[i])) OD;
FOR i FROM ABS "a" TO ABS "z" DO printf (($a3xg(0)l$, REPR i, histogram[i])) OD
</syntaxhighlight>
{{out}} Counting letters in its own source code:
A 11
B 9
C 2
D 13
E 11
F 14
G 4
H 3
I 10
J 0
[[ Omitted for K – Z and a – p ]]
q 1
r 15
s 19
t 24
u 10
v 0
w 3
x 4
y 1
z 2
=={{header|APL}}==
<syntaxhighlight lang="apl"> freq←{(⍪∪⍵),+/(∪⍵)∘.⍷⍵}
freq 0 1 2 3 2 3 4 3 4 4 4
0 1
1 1
2 2
3 3
4 4
freq 'balloon'
b 1
a 1
l 2
o 2
n 1</syntaxhighlight>
The above solution doesn't do the "open a text file" part of the task. File I/O is implementation-dependent, but here's how to do it in Dyalog:
{{works with|Dyalog APL}}
<syntaxhighlight lang="apl"> text ← ⊃⎕nget 'filename'</syntaxhighlight>
... after which the above <tt>freq</tt> function can be applied to <tt>text</tt>.
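For readers who do not read APL, the behaviour shown above can be sketched in Python. This is an illustrative equivalent rather than a translation of the APL expression: it pairs each distinct item with its count, in order of first appearance, and reuses the name <tt>freq</tt> only for comparison.

```python
def freq(items):
    """Pair each distinct item with its count, in order of first appearance."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return list(counts.items())


for item, count in freq("balloon"):
    print(item, count)
```

It relies on Python dictionaries preserving insertion order (guaranteed since Python 3.7).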
=={{header|AppleScript}}==
This is probably best handled with vanilla AppleScript and ASObjC, each doing what it does best. The test text used here is the one specified for the [https://www.rosettacode.org/wiki/Word_frequency Word frequency] task.
<syntaxhighlight lang="applescript">use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions
on letterFrequencyinFile(theFile)
-- Read the file as an NSString, letting the system guess the encoding.
set fileText to current application's class "NSString"'s stringWithContentsOfFile:(POSIX path of theFile) ¬
usedEncoding:(missing value) |error|:(missing value)
-- Get the NSString's non-letter delimited runs, lower-cased, as an AppleScript list of texts.
-- The switch to vanilla objects is for speed and the ability to extract 'characters'.
set nonLetterSet to current application's class "NSCharacterSet"'s letterCharacterSet()'s invertedSet()
script o
property letterRuns : (fileText's lowercaseString()'s componentsSeparatedByCharactersInSet:(nonLetterSet)) as list
end script
-- Extract the characters from the runs and add them to an NSCountedSet to have the occurrences of each value counted.
-- No more than 50,000 characters are extracted in one go to avoid slowing or freezing the script.
set countedSet to current application's class "NSCountedSet"'s new()
repeat with i from 1 to (count o's letterRuns)
set thisRun to item i of o's letterRuns
set runLength to (count thisRun)
repeat with i from 1 to runLength by 50000
set j to i + 49999
if (j > runLength) then set j to runLength
tell countedSet to addObjectsFromArray:(characters i thru j of thisRun)
end repeat
end repeat
-- Work through the counted set's contents and build a list of records showing how many of what it received.
set output to {}
repeat with thisLetter in countedSet's allObjects()
set thisCount to (countedSet's countForObject:(thisLetter))
set end of output to {letter:thisLetter, |count|:thisCount}
end repeat
-- Derive an array of dictionaries from the list and sort it on the letters.
set output to current application's class "NSMutableArray"'s arrayWithArray:(output)
set byLetter to current application's class "NSSortDescriptor"'s sortDescriptorWithKey:("letter") ¬
ascending:(true) selector:("localizedStandardCompare:")
tell output to sortUsingDescriptors:({byLetter})
-- Convert back to a list of records and return the result.
return output as list
end letterFrequencyinFile
-- Test with the text file for the "Word frequency" task.
set theFile to ((path to desktop as text) & "135-0.txt") as alias
return letterFrequencyinFile(theFile)</syntaxhighlight>
Output (using the text file for the "Word frequency" task):
====Minimal new code====
If we just want to get something up and running (and tabulating output) with a minimum of new code –
enough to read the frequencies of shortish texts – we can quickly click together a composition of generic functions.
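The shape of that composition (sort the characters, group equal neighbours, pair each group's head with its length, then sort by descending count) can be sketched in Python. This is an illustration of the pipeline only, not a translation of the AppleScript; the name <code>char_counts</code> is hypothetical.

```python
from itertools import groupby


def char_counts(text):
    """Sort the characters, group equal neighbours, then order by count."""
    groups = (list(group) for _, group in groupby(sorted(text)))
    pairs = [(chars[0], len(chars)) for chars in groups]
    # Python's sort is stable, so ties keep their alphabetical order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


for char, count in char_counts("balloon"):
    print(f"{char!r} -> {count}")
```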
<syntaxhighlight lang="applescript">use AppleScript version "2.4"
use framework "Foundation"
use scripting additions
------------- CHARACTER COUNTS FROM FILE PATH -------------
-- charCounts :: FilePath -> Either String [(Char, Int)]
on charCounts(fp)
script go
on |λ|(s)
|Right|(sortBy(flip(comparing(my snd)), ¬
map(fanArrow(my head, my |length|), ¬
groupBy(my eq, sort(characters of s)))))
end |λ|
end script
bindLR(readFileLR(fp), go)
end charCounts
-------------------------- TEST ---------------------------
on run
set intColumns to 4
either(identity, frequencyTabulation(intColumns), ¬
end run
------------------------- DISPLAY -------------------------
-- frequencyTabulation :: Int -> [(Char, Int)] -> String
on frequencyTabulation(intCols)
on |λ|(xs)
set w to length of (snd(item 1 of xs) as string)
script go
on |λ|(x)
justifyRight(5, " ", showChar(fst(x))) & ¬
" -> " & justifyRight(w, " ", snd(x) as string)
end |λ|
end script
showColumns(intCols, map(go, xs))
end |λ|
end script
end frequencyTabulation
-------------------- GENERIC FUNCTIONS --------------------
-- Left :: a -> Either a b
on |Left|(x)
{type:"Either", |Left|:x, |Right|:missing value}
end |Left|
-- Right :: b -> Either a b
on |Right|(x)
{type:"Either", |Left|:missing value, |Right|:x}
end |Right|
-- Tuple (,) :: a -> b -> (a, b)
on Tuple(a, b)
-- Constructor for a pair of values, possibly of two different types.
{type:"Tuple", |1|:a, |2|:b, length:2}
end Tuple
-- Absolute value.
-- abs :: Num -> Num
on abs(x)
if 0 > x then
-x
else
x
end if
end abs
-- bindLR (>>=) :: Either a -> (a -> Either b) -> Either b
on bindLR(m, mf)
if missing value is not |Left| of m then
mReturn(mf)'s |λ|(|Right| of m)
end if
end bindLR
-- chunksOf :: Int -> [a] -> [[a]]
on chunksOf(n, xs)
set lng to length of xs
script go
on |λ|(a, i)
set x to (i + n) - 1
if x ≥ lng then
a & {items i thru -1 of xs}
a & {items i thru x of xs}
end if
end |λ|
end script
foldl(go, {}, enumFromThenTo(1, 1 + n, lng))
end chunksOf
-- comparing :: (a -> b) -> (a -> a -> Ordering)
on comparing(f)
on |λ|(a, b)
tell mReturn(f)
set fa to |λ|(a)
set fb to |λ|(b)
if fa < fb then
else if fa > fb then
end if
end tell
end |λ|
end script
end comparing
-- concatMap :: (a -> [b]) -> [a] -> [b]
on concatMap(f, xs)
set lng to length of xs
set acc to {}
tell mReturn(f)
repeat with i from 1 to lng
set acc to acc & (|λ|(item i of xs, i, xs))
end repeat
end tell
return acc
end concatMap
-- either :: (a -> c) -> (b -> c) -> Either a b -> c
on either(lf, rf, e)
if missing value is |Left| of e then
tell mReturn(rf) to |λ|(|Right| of e)
tell mReturn(lf) to |λ|(|Left| of e)
end if
end either
-- enumFromThenTo :: Int -> Int -> Int -> [Int]
on enumFromThenTo(x1, x2, y)
set xs to {}
set gap to x2 - x1
set d to max(1, abs(gap)) * (signum(gap))
repeat with i from x1 to y by d
set end of xs to i
end repeat
return xs
end enumFromThenTo
-- eq (==) :: Eq a => a -> a -> Bool
on eq(a, b)
a = b
end eq
-- Compose a function from a simple value to a tuple of
-- the separate outputs of two different functions
-- fanArrow (&&&) :: (a -> b) -> (a -> c) -> (a -> (b, c))
on fanArrow(f, g)
on |λ|(x)
Tuple(mReturn(f)'s |λ|(x), mReturn(g)'s |λ|(x))
end |λ|
end script
end fanArrow
-- flip :: (a -> b -> c) -> b -> a -> c
on flip(f)
property g : mReturn(f)
on |λ|(x, y)
g's |λ|(y, x)
end |λ|
end script
end flip
-- foldl :: (a -> b -> a) -> a -> [b] -> a
on foldl(f, startValue, xs)
tell mReturn(f)
set v to startValue
set lng to length of xs
repeat with i from 1 to lng
set v to |λ|(v, item i of xs, i, xs)
end repeat
return v
end tell
end foldl
-- fst :: (a, b) -> a
on fst(tpl)
if class of tpl is record then
|1| of tpl
item 1 of tpl
end if
end fst
-- Typical usage: groupBy(on(eq, f), xs)
-- groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
on groupBy(f, xs)
set mf to mReturn(f)
script enGroup
on |λ|(a, x)
if length of (active of a) > 0 then
set h to item 1 of active of a
set h to missing value
end if
if h is not missing value and mf's |λ|(h, x) then
{active:(active of a) & {x}, sofar:sofar of a}
{active:{x}, sofar:(sofar of a) & {active of a}}
end if
end |λ|
end script
if length of xs > 0 then
set dct to foldl(enGroup, {active:{item 1 of xs}, sofar:{}}, rest of xs)
if length of (active of dct) > 0 then
sofar of dct & {active of dct}
sofar of dct
end if
end if
end groupBy
-- head :: [a] -> a
on head(xs)
if xs = {} then
missing value
item 1 of xs
end if
end head
-- identity :: a -> a
on identity(x)
-- The argument unchanged.
x
end identity
-- justifyRight :: Int -> Char -> String -> String
on justifyRight(n, cFiller, strText)
if n > length of strText then
text -n thru -1 of ((replicate(n, cFiller) as text) & strText)
end if
end justifyRight
-- length :: [a] -> Int
on |length|(xs)
set c to class of xs
if list is c or string is c then
length of xs
(2 ^ 29 - 1) -- (maxInt - simple proxy for non-finite)
end if
end |length|
-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
-- The list obtained by applying f
-- to each element of xs.
tell mReturn(f)
set lng to length of xs
set lst to {}
repeat with i from 1 to lng
set end of lst to |λ|(item i of xs, i, xs)
end repeat
return lst
end tell
end map
-- max :: Ord a => a -> a -> a
on max(x, y)
if x > y then
x
else
y
end if
end max
-- maximum :: Ord a => [a] -> a
on maximum(xs)
on |λ|(a, b)
if a is missing value or b > a then
end if
end |λ|
end script
foldl(result, missing value, xs)
end maximum
-- partition :: (a -> Bool) -> [a] -> ([a], [a])
on partition(f, xs)
tell mReturn(f)
set ys to {}
set zs to {}
repeat with x in xs
set v to contents of x
if |λ|(v) then
set end of ys to v
set end of zs to v
end if
end repeat
end tell
Tuple(ys, zs)
end partition
-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
-- 2nd class handler function lifted into 1st class script wrapper.
if script is class of f then
property |λ| : f
end script
end if
end mReturn
-- readFileLR :: FilePath -> Either String IO String
on readFileLR(strPath)
set ca to current application
set e to reference
set {s, e} to (ca's NSString's ¬
stringWithContentsOfFile:((ca's NSString's ¬
stringWithString:strPath)'s ¬
stringByStandardizingPath) ¬
encoding:(ca's NSUTF8StringEncoding) |error|:(e))
if s is missing value then
|Left|((localizedDescription of e) as string)
|Right|(s as string)
end if
end readFileLR
-- Egyptian multiplication - progressively doubling a list, appending
-- stages of doubling to an accumulator where needed for binary
-- assembly of a target length
-- replicate :: Int -> a -> [a]
on replicate(n, a)
set out to {}
if 1 > n then return out
set dbl to {a}
repeat while (1 < n)
if 0 < (n mod 2) then set out to out & dbl
set n to (n div 2)
set dbl to (dbl & dbl)
end repeat
return out & dbl
end replicate
-- showChar :: Char -> String
on showChar(c)
if space is c then
else if tab is c then
else if linefeed is c then
end if
end showChar
-- showColumns :: Int -> [String] -> String
on showColumns(n, xs)
set w to maximum(map(my |length|, xs))
set m to (length of xs) div n
unlines(map(my unwords, ¬
transpose(chunksOf(m, xs))))
end showColumns
-- signum :: Num -> Num
on signum(x)
if x < 0 then
-1
else if x = 0 then
0
else
1
end if
end signum
-- snd :: (a, b) -> b
on snd(tpl)
if class of tpl is record then
|2| of tpl
item 2 of tpl
end if
end snd
-- sort :: Ord a => [a] -> [a]
on sort(xs)
((current application's NSArray's arrayWithArray:xs)'s ¬
sortedArrayUsingSelector:"compare:") as list
end sort
-- Enough for small scale sorts.
-- Use instead sortOn (Ord b => (a -> b) -> [a] -> [a])
-- which is equivalent to the more flexible sortBy(comparing(f), xs)
-- and uses a much faster ObjC NSArray sort method
-- sortBy :: (a -> a -> Ordering) -> [a] -> [a]
on sortBy(f, xs)
if length of xs > 1 then
set h to item 1 of xs
set f to mReturn(f)
on |λ|(x)
f's |λ|(x, h) ≤ 0
end |λ|
end script
set lessMore to partition(result, rest of xs)
sortBy(f, |1| of lessMore) & {h} & ¬
sortBy(f, |2| of lessMore)
end if
end sortBy
-- transpose :: [[String]] -> [[String]]
on transpose(rows)
script cols
on |λ|(_, iCol)
script cell
on |λ|(row)
if iCol > length of row then
item iCol of row
end if
end |λ|
end script
concatMap(cell, rows)
end |λ|
end script
map(cols, item 1 of rows)
end transpose
-- unlines :: [String] -> String
on unlines(xs)
-- A single string formed by the intercalation
-- of a list of strings with the newline character.
set {dlm, my text item delimiters} to ¬
{my text item delimiters, linefeed}
set str to xs as text
set my text item delimiters to dlm
return str
end unlines
-- unwords :: [String] -> String
on unwords(xs)
set {dlm, my text item delimiters} to ¬
{my text item delimiters, space}
set s to xs as text
set my text item delimiters to dlm
return s
end unwords</syntaxhighlight>
Output (using script's own file):
<pre>SPACE -> 1330 p -> 138 " -> 28 k -> 5
e -> 598 | -> 132 L -> 20 U -> 5
TAB -> 584 g -> 129 F -> 20 < -> 4
t -> 562 x -> 121 E -> 20 W -> 3
LF -> 509 > -> 107 I -> 18 G -> 3
n -> 462 : -> 102 O -> 17 / -> 3
s -> 423 , -> 98 B -> 17 V -> 2
- -> 384 y -> 70 A -> 17 H -> 2
i -> 372 b -> 70 & -> 17 D -> 2
o -> 365 R -> 42 ' -> 15 4 -> 2
r -> 316 λ -> 39 ¬ -> 13 + -> 2
a -> 311 w -> 39 N -> 13 ≥ -> 1
l -> 241 v -> 35 2 -> 13 ≤ -> 1
f -> 240 ] -> 35 = -> 12 ~ -> 1
d -> 198 [ -> 35 q -> 11 _ -> 1
) -> 181 C -> 35 0 -> 10 ^ -> 1
( -> 181 T -> 33 . -> 9 Y -> 1
m -> 154 S -> 32 P -> 8 9 -> 1
h -> 152 1 -> 31 M -> 8 8 -> 1
c -> 149 } -> 30 j -> 6 5 -> 1
u -> 141 { -> 30 z -> 5 * -> 1</pre>
====Minimal run-time====
For longer texts, calculating only the frequencies of the (case-insensitive and accent-insensitive) a-z alphabetics, and re-using here the Project Gutenberg ''Misérables'' text from the ''Word Frequency'' task,
we can do something a little faster with a list of simple regular expressions, again composing a solution from existing generic functions.
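The idea of one simple pattern per letter, applied to a decomposed and lower-cased copy of the text, can be sketched in Python. This is an illustration under assumptions, not the AppleScript below: NFD normalization stands in for the canonical decomposition, and the name <code>roman_letter_frequencies</code> is borrowed only for comparison.

```python
import re
import string
import unicodedata


def roman_letter_frequencies(text):
    """Count a-z case- and accent-insensitively, most frequent first."""
    # Canonical decomposition splits an accented letter into its base
    # letter plus a combining mark, so only the base letter is matched.
    decomposed = unicodedata.normalize("NFD", text).lower()
    counts = [(letter, len(re.findall(letter, decomposed)))
              for letter in string.ascii_lowercase]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)
```

Each pattern scans the whole text once, so the cost is 26 passes regardless of the text's length.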
<syntaxhighlight lang="applescript">use AppleScript version "2.4"
use framework "Foundation"
use scripting additions
-- romanLetterFrequencies :: FilePath -> Maybe [(Char, Int)]
on romanLetterFrequencies(fp)
if doesFileExist(fp) then
set patterns to enumFromToChar("a", "z")
set counts to ap(map(my matchCount, patterns), ¬
{readFile(fp)'s ¬
decomposedStringWithCanonicalMapping's ¬
sortBy(flip(comparing(my snd)))'s ¬
|λ|(zip(patterns, counts))
missing value
end if
end romanLetterFrequencies
--------------------------- TEST -------------------------
on run
set fpText to scriptFolder() & "miserables.txt"
set azFrequencies to romanLetterFrequencies(fpText)
if missing value is not azFrequencies then
script arrow
on |λ|(kv)
set {k, v} to kv
unwords({k, "->", v})
end |λ|
end script
unlines(map(arrow, azFrequencies))
display dialog "Text file not found in this script's folder:" & ¬
linefeed & tab & fpText
end if
end run
------------------------- GENERIC ------------------------
-- Tuple (,) :: a -> b -> (a, b)
on Tuple(a, b)
-- Constructor for a pair of values, possibly of two different types.
{a, b}
end Tuple
-- ap (<*>) :: [(a -> b)] -> [a] -> [b]
on ap(fs, xs)
-- e.g. [(*2),(/2), sqrt] <*> [1,2,3]
-- --> ap([dbl, hlf, root], [1, 2, 3])
-- --> [2,4,6,0.5,1,1.5,1,1.4142135623730951,1.7320508075688772]
-- Each member of a list of functions applied to
-- each of a list of arguments, deriving a list of new values
set lst to {}
repeat with f in fs
tell mReturn(contents of f)
repeat with x in xs
set end of lst to |λ|(contents of x)
end repeat
end tell
end repeat
return lst
end ap
-- comparing :: (a -> b) -> (a -> a -> Ordering)
on comparing(f)
on |λ|(a, b)
tell mReturn(f)
set fa to |λ|(a)
set fb to |λ|(b)
if fa < fb then
else if fa > fb then
end if
end tell
end |λ|
end script
end comparing
-- doesFileExist :: FilePath -> IO Bool
on doesFileExist(strPath)
set ca to current application
set oPath to (ca's NSString's stringWithString:strPath)'s ¬
set {bln, int} to (ca's NSFileManager's defaultManager's ¬
fileExistsAtPath:oPath isDirectory:(reference))
bln and (int ≠ 1)
end doesFileExist
-- enumFromToChar :: Char -> Char -> [Char]
on enumFromToChar(m, n)
set {intM, intN} to {id of m, id of n}
if intM ≤ intN then
set xs to {}
repeat with i from intM to intN
set end of xs to character id i
end repeat
return xs
end if
end enumFromToChar
-- flip :: (a -> b -> c) -> b -> a -> c
on flip(f)
property g : mReturn(f)
on |λ|(x, y)
g's |λ|(y, x)
end |λ|
end script
end flip
-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
-- The list obtained by applying f
-- to each element of xs.
tell mReturn(f)
set lng to length of xs
set lst to {}
repeat with i from 1 to lng
set end of lst to |λ|(item i of xs, i, xs)
end repeat
return lst
end tell
end map
-- matchCount :: String -> NSString -> Int
on matchCount(regexString)
-- A count of the matches for a regular expression
-- in a given NSString
on |λ|(s)
set ca to current application
((ca's NSRegularExpression's ¬
regularExpressionWithPattern:regexString ¬
options:(ca's NSRegularExpressionAnchorsMatchLines) ¬
|error|:(missing value))'s ¬
numberOfMatchesInString:s ¬
options:0 ¬
range:{location:0, |length|:s's |length|()}) as integer
end |λ|
end script
end matchCount
-- min :: Ord a => a -> a -> a
on min(x, y)
if y < x then
y
else
x
end if
end min
-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
-- 2nd class handler function lifted into 1st class script wrapper.
if script is class of f then
property |λ| : f
end script
end if
end mReturn
-- partition :: (a -> Bool) -> [a] -> ([a], [a])
on partition(f, xs)
tell mReturn(f)
set ys to {}
set zs to {}
repeat with x in xs
set v to contents of x
if |λ|(v) then
set end of ys to v
set end of zs to v
end if
end repeat
end tell
{ys, zs}
end partition
-- readFile :: FilePath -> IO NSString
on readFile(strPath)
set ca to current application
set e to reference
set {s, e} to (ca's NSString's ¬
stringWithContentsOfFile:((ca's NSString's ¬
stringWithString:strPath)'s ¬
stringByStandardizingPath) ¬
encoding:(ca's NSUTF8StringEncoding) |error|:(e))
if missing value is e then
s
else
(localizedDescription of e) as string
end if
end readFile
-- scriptFolder :: () -> IO FilePath
on scriptFolder()
-- The path of the folder containing this script
tell application "Finder" to ¬
POSIX path of ((container of (path to me)) as alias)
end scriptFolder
-- snd :: (a, b) -> b
on snd(tpl)
item 2 of tpl
end snd
-- sortBy :: (a -> a -> Ordering) -> [a] -> [a]
on sortBy(f)
-- Enough for small scale sorts.
-- The NSArray sort method in the Foundation library
-- gives better performance for longer lists.
script go
on |λ|(xs)
if length of xs > 1 then
set h to item 1 of xs
set f to mReturn(f)
on |λ|(x)
f's |λ|(x, h) ≤ 0
end |λ|
end script
set lessMore to partition(result, rest of xs)
|λ|(item 1 of lessMore) & {h} & ¬
|λ|(item 2 of lessMore)
end if
end |λ|
end script
end sortBy
-- unlines :: [String] -> String
on unlines(xs)
-- A single string formed by the intercalation
-- of a list of strings with the newline character.
set {dlm, my text item delimiters} to ¬
{my text item delimiters, linefeed}
set s to xs as text
set my text item delimiters to dlm
end unlines
-- unwords :: [String] -> String
on unwords(xs)
set {dlm, my text item delimiters} to ¬
{my text item delimiters, space}
set s to xs as text
set my text item delimiters to dlm
return s
end unwords
-- zip :: [a] -> [b] -> [(a, b)]
on zip(xs, ys)
zipWith(Tuple, xs, ys)
end zip
-- zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
on zipWith(f, xs, ys)
set lng to min(length of xs, length of ys)
set lst to {}
if 1 > lng then
return {}
tell mReturn(f)
repeat with i from 1 to lng
set end of lst to |λ|(item i of xs, item i of ys)
end repeat
return lst
end tell
end if
end zipWith</syntaxhighlight>
<pre>e -> 332590
t -> 235526
a -> 207252
o -> 184422
h -> 176839
i -> 175345
n -> 169956
s -> 162047
r -> 148671
d -> 108747
l -> 99543
u -> 68336
c -> 67404
m -> 62219
w -> 56513
f -> 56206
g -> 48598
p -> 43387
y -> 39183
b -> 37506
v -> 26268
k -> 14433
j -> 5840
x -> 4027
q -> 2533
z -> 1906</pre>
=={{header|Applesoft BASIC}}==
<syntaxhighlight lang="gwbasic"> 100 LET F$ = "TEXT FILE"
110 LET D$ = CHR$ (4)
120 DIM C(255)
130 PRINT D$"OPEN "F$
140 FOR Q = 0 TO 1 STEP 0
150 PRINT D$"READ "F$
160 ONERR GOTO 240
170 GET C$
180 POKE 216,0
190 LET C = ASC (C$)
200 LET C(C) = C(C) + 1
220 NEXT
230 STOP
240 POKE 216,0
250 LET E = PEEK (222)
260 PRINT D$"CLOSE "F$
270 IF E < > 5 THEN RESUME
280 FOR I = 0 TO 255
290 IF C(I) THEN GOSUB 320
300 NEXT I
310 END
320 IF I < 32 THEN PRINT "^" CHR$ (64 + I);
330 IF I > = 32 AND I < 128 THEN PRINT CHR$ (I);
340 IF I > 127 THEN PRINT "CHR$("I")";
350 PRINT "="C(I)" ";
360 RETURN</syntaxhighlight>
=={{header|ARM Assembly}}==
{{works with|as|Raspberry Pi <br> or android 32 bits with application Termux}}
<syntaxhighlight lang="ARM Assembly">
/* ARM assembly Raspberry PI */
/* program cptletters.s */
/* Constantes */
/* for this file see task include a file in language ARM assembly*/
.include "../constantes.inc"
.equ READ, 3
.equ WRITE, 4
.equ OPEN, 5
.equ CLOSE, 6
.equ O_RDWR, 0x0002 @ open for reading and writing
.equ BUFFERSIZE, 300000
/* Initialized data */
.data
szMessOpen: .asciz "File open error.\n"
szMessStat: .asciz "File information error.\n"
szMessRead: .asciz "File read error.\n"
szMessClose: .asciz "File close error.\n"
szMessDecryptText: .asciz "Decrypted text :\n"
szMessCryptText: .asciz "Encrypted text :\n"
szMessErrorChar: .asciz "Character text not Ok!\n"
szFileName: .asciz "unixdict.txt"
//szFileName: .asciz "test1.txt"
szMessResult: .asciz "Result: = "
szCarriageReturn: .asciz "\n"
szMessStart: .asciz "Program 32 bits start.\n"
/* UnInitialized data */
.bss
sZoneConv: .skip 24
tabCptLetters: .skip 4 * 52 @ (A-Z a-z) counter array
sBuffer: .skip BUFFERSIZE @ file buffer
/* code section */
.text
.global main
main: @ entry of program
ldr r0,iAdrszMessStart
bl affichageMess
ldr r0,iAdrszFileName @ file name
mov r1,#O_RDWR @ flags
mov r2,#0 @ mode
mov r7,#OPEN @ file open
svc 0
cmp r0,#0 @ error ?
ble 99f
mov r8,r0 @ FD save
mov r0,r8
ldr r1,iAdrsBuffer
ldr r2,iBufferSize @ buffer size
mov r7,#READ @ call system read file
svc 0
cmp r0,#0 @ error read ?
blt 97f
mov r6,r0 @ file size
mov r0,r8
mov r7,#CLOSE @ call system close file
svc 0
cmp r0,#0 @ error close ?
blt 96f
ldr r0,iAdrsBuffer
mov r1,r6
bl cptLetters
b 100f
96:
ldr r0,iAdrszMessClose
bl affichageMess
mov r0,#-1 @ error
b 100f
97:
ldr r0,iAdrszMessRead
bl affichageMess
mov r0,#-1 @ error
b 100f
99:
ldr r0,iAdrszMessOpen
bl affichageMess
mov r0,#-1 @ error
100: @ standard end of the program
mov r0, #0 @ return code
mov r7, #EXIT @ request to exit program
svc 0 @ perform the system call
iAdrsZoneConv: .int sZoneConv
iAdrszMessResult: .int szMessResult
iAdrszCarriageReturn: .int szCarriageReturn
iAdrszMessStart: .int szMessStart
iAdrszFileName: .int szFileName
iAdrszMessOpen: .int szMessOpen
iAdrszMessRead: .int szMessRead
iAdrszMessStat: .int szMessStat
iAdrszMessClose: .int szMessClose
iAdrsBuffer: .int sBuffer
iBufferSize: .int BUFFERSIZE
/* letters frequency */
/* r0 contains a file buffer */
/* r1 contains string length */
cptLetters:
push {r1-r7,lr} @ save registers
ldr r4,iAdrtabCptLetters @ counter array
mov r3,#0 @ index string
1:
ldrb r2,[r0,r3] @ load byte of string
cmp r2,#'A' @ select alpha characters lower or upper
blt 5f
cmp r2,#'Z'
bgt 2f
sub r5,r2,#65 @ convert ascii upper in index array (0-25)
b 3f
2:
cmp r2,#'z'
bgt 5f
cmp r2,#'a'
blt 5f
sub r5,r2,#97 - 26 @ convert ascii lower in index array (26-51)
3:
ldr r7,[r4,r5,lsl #2] @ load counter of current character
add r7,r7,#1 @ increment counter
str r7,[r4,r5,lsl #2] @ and store
5:
add r3,r3,#1 @ increment text index
cmp r3,r1 @ end ?
blt 1b @ and loop
ldr r7,iAdrszMessResult
mov r3,#0
6: @ result display
ldr r1,[r4,r3,lsl #2] @ load counter
cmp r1,#0 @ if zero not display
beq 7f
cmp r3,#25 @ upper ?
addle r6,r3,#65 @ yes compute ascii character
addgt r6,r3,#97 - 26 @ lower
strb r6,[r7,#9] @ store in message
mov r0,r1 @ convert count in decimal
ldr r1,iAdrsZoneConv
bl conversion10
ldr r0,iAdrszMessResult @ and display
bl affichageMess
ldr r0,iAdrsZoneConv
bl affichageMess
ldr r0,iAdrszCarriageReturn
bl affichageMess
7:
add r3,r3,#1
cmp r3,#52
blt 6b
pop {r1-r7,pc}
iAdrtabCptLetters: .int tabCptLetters
/* for this file see task include a file in language ARM assembly*/
.include "../affichage.inc"
</syntaxhighlight>
Output with the file unixdict.txt:
Program 32 bits start.
Result: a = 16421
Result: b = 4115
Result: c = 8216
Result: d = 5799
Result: e = 20144
Result: f = 2662
Result: g = 4129
Result: h = 5208
Result: i = 13980
Result: j = 430
Result: k = 1925
Result: l = 10061
Result: m = 5828
Result: n = 12097
Result: o = 12738
Result: p = 5516
Result: q = 378
Result: r = 13436
Result: s = 10210
Result: t = 12836
Result: u = 6489
Result: v = 1902
Result: w = 1968
Result: x = 617
Result: y = 3633
Result: z = 433
=={{header|Arturo}}==
<syntaxhighlight lang="rebol">source: {
The Red Death had long devastated the country.
No pestilence had ever been so fatal, or so hideous.
Blood was its Avator and its seal—the redness and the horror of blood.
There were sharp pains, and sudden dizziness,
and then profuse bleeding at the pores, with dissolution.
The scarlet stains upon the body and especially upon the face of the victim,
were the pest ban which shut him out from the aid and from the sympathy of his fellow-men.
And the whole seizure, progress and termination of the disease,
were the incidents of half an hour.
}
valid: split "abcdefghijklmnopqrstuvwxyz"
frequencies: #[]
loop split lower source 'ch [
if in? ch valid [
if not? key? frequencies ch ->
set frequencies ch 0
set frequencies ch (get frequencies ch)+1
]
]
inspect frequencies</syntaxhighlight>
Output:
<pre>[ :dictionary
t : 39 :integer
h : 33 :integer
e : 60 :integer
r : 24 :integer
d : 27 :integer
a : 33 :integer
l : 15 :integer
o : 34 :integer
n : 30 :integer
g : 3 :integer
v : 4 :integer
s : 34 :integer
c : 8 :integer
u : 11 :integer
y : 5 :integer
p : 11 :integer
i : 25 :integer
b : 6 :integer
f : 12 :integer
w : 8 :integer
z : 3 :integer
m : 7 :integer
]</pre>
=={{header|AutoHotkey}}==
This is the earlier version rewritten as a function; it now lists only the letters that occur in the input and omits those with a count of zero.
<syntaxhighlight lang="autohotkey">LetterFreq(Var) {
Loop, 26
{
StrReplace(Var, Chr(96+A_Index), , Count) ; Count receives the number of occurrences
if Count
out .= Chr(96+A_Index) ": " Count "`n"
}
return out
}
var := "The dog jumped over the lazy fox"
var2 := "foo bar"
Msgbox, % LetterFreq(var)
Msgbox, % LetterFreq(var2)</syntaxhighlight>
Output:
<pre>a: 1
d: 2
e: 4
f: 1
g: 1
h: 2
j: 1
l: 1
m: 1
o: 3
p: 1
r: 1
t: 2
u: 1
v: 1
x: 1
y: 1
z: 1

a: 1
b: 1
f: 1
o: 2
r: 1</pre>
This function prints the letter frequency of a given text file.
You can choose whether the search is case-sensitive and whether special characters are counted too.
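For comparison, the two switches described above can be sketched in Python. This is only an illustration, not a translation of the function below: the name <code>letter_frequency</code> and the parameters <code>case_sensitive</code> and <code>special_chars</code> are assumptions made for this sketch, and it works on a string rather than a file path.

```python
from collections import Counter
import string


def letter_frequency(text, case_sensitive=True, special_chars=True):
    """Count characters in text, honouring the two switches (hypothetical names)."""
    if not case_sensitive:
        # Fold case so that 'a' and 'A' share one counter.
        text = text.upper()
    if not special_chars:
        # Keep only the ASCII letters A-Z and a-z.
        text = (ch for ch in text if ch in string.ascii_letters)
    return Counter(text)


if __name__ == "__main__":
    result = letter_frequency("Hello, World!", case_sensitive=False, special_chars=False)
    for ch, n in sorted(result.items()):
        print(f"{ch} : {n}")
```

Returning a <code>Counter</code> instead of printing directly keeps the counting step separate from the output format, so the caller decides how to display the result.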
<syntaxhighlight lang="text">
Func _Letter_frequency($Path, $fcase = True, $fspecial_chars = True)
Local $hFile, $sRead, $iupto, $iStart, $iCount
If Not $fcase Then $fcase = False
If Not $fspecial_chars Then
$iStart = 64
If Not $fcase Then
$iupto = 26
Else
$iupto = 58
EndIf
Else
$iStart = 31
$iupto = 224
EndIf
$hFile = FileOpen($Path, 0)
$sRead = FileRead($hFile)
FileClose($hFile)
For $i = 1 To $iupto
If Not $fspecial_chars Then
If $iStart + $i > 90 And $iStart + $i < 97 Then ContinueLoop
EndIf
$sRead = StringReplace($sRead, Chr($iStart + $i), "", 0, $fcase)
$iCount = @extended
If $iCount > 0 Then ConsoleWrite(Chr($iStart + $i) & " : " & $iCount & @CRLF)
Next
EndFunc ;==>_Letter_frequency
</syntaxhighlight>
Example use (based on a configuration file from another task):
<pre>A : 32
B : 2
C : 15
E : 31
F : 10
[several lines omitted]
u : 5
v : 1
w : 1
x : 14</pre>
=={{header|AWK}}==
<syntaxhighlight lang="awk">
# usage: awk -f letters.awk HolyBible.txt
BEGIN { FS="" }
{ for(i=1;i<=NF;i++) m[$i]++ }
END { for(i in m) printf("%9d %-14s\n", m[i],i) }
</syntaxhighlight>
<syntaxhighlight lang="freebasic">txt$ = LOAD$("bible.txt")
FOR x = 97 TO 122
PRINT CHR$(x-32), " ", CHR$(x), " : ", COUNT(txt$, x-32), " - ", COUNT(txt$, x)
NEXT</syntaxhighlight>
A a : 17915 - 257815
B b : 4714 - 44161
C c : 1698 - 53373
D d : 8782 - 149313
E e : 2710 - 409525
F f : 2386 - 81157
G g : 6206 - 49096
H h : 3208 - 279471
I i : 13302 - 180660
J j : 6374 - 2515
K k : 547 - 21745
L l : 9222 - 120716
M m : 3056 - 76884
N n : 1891 - 223166
O o : 8896 - 234290
P p : 1877 - 41377
Q q : 6 - 958
R r : 7568 - 162761
S s : 4906 - 185124
T t : 7763 - 309983
U u : 333 - 83140
V v : 107 - 30258
W w : 2408 - 63079
X x : 2 - 1476
Y y : 569 - 58007
Z z : 904 - 2068
=={{header|BBC BASIC}}==
<syntaxhighlight lang="bbcbasic"> DIM cnt%(255)
file% = OPENIN("C:\unixdict.txt")
IF file%=0 ERROR 100, "Could not open file"
WHILE NOT EOF#file%
A$ = GET$#file%
L% = LEN(A$)
IF L% THEN
FOR I% = 1 TO L%
cnt%(ASCMID$(A$,I%)) += 1
NEXT
ENDIF
ENDWHILE
CLOSE #file%
FOR c% = &41 TO &5A
PRINT CHR$(c%)CHR$(c%+32) ": " cnt%(c%)+cnt%(c%+32)
NEXT</syntaxhighlight>
Aa: 16421
Bb: 4115
Cc: 8216
Dd: 5799
Ee: 20144
Ff: 2662
Gg: 4129
Hh: 5208
Ii: 13980
Jj: 430
Kk: 1925
Ll: 10061
Mm: 5828
Nn: 12097
Oo: 12738
Pp: 5516
Qq: 378
Rr: 13436
Ss: 10210
Tt: 12836
Uu: 6489
Vv: 1902
Ww: 1968
Xx: 617
Yy: 3633
Zz: 433
<syntaxhighlight lang="bcpl">get "libhdr"
let start() be
$( let count = vec 255
let file = findinput("unixdict.txt")
for i = 0 to 255 do i!count := 0
selectinput(file)
$( let ch = rdch()
if ch = endstreamch then break
ch!count := ch!count + 1
$) repeat
endread()
for i = 'A' to 'Z' do
$( let n = i!count + (i|32)!count
unless n = 0 do
writef("%C%C: %I5*N", i, i|32, n)
$)
$)</syntaxhighlight>
<pre>Aa: 16421
Bb: 4115
Cc: 8216
Dd: 5799
Ee: 20144
Ff: 2662
Gg: 4129
Hh: 5208
Ii: 13980
Jj: 430
Kk: 1925
Ll: 10061
Mm: 5828
Nn: 12097
Oo: 12738
Pp: 5516
Qq: 378
Rr: 13436
Ss: 10210
Tt: 12836
Uu: 6489
Vv: 1902
Ww: 1968
Xx: 617
Yy: 3633
Zz: 433</pre>
<code>"sample.txt"</code> can be substituted with any filename.
<syntaxhighlight lang="bqn">Freq←⍷≍/⁼∘⊐
Freq "balloon"
# For a file:
Freq •FLines "sample.txt"</syntaxhighlight>
<syntaxhighlight lang="text">┌─
╵ 'b' 'a' 'l' 'o' 'n'
  1   1   2   2   1
                      ┘</syntaxhighlight>
=={{header|Bracmat}}==
<syntaxhighlight lang="bracmat">(lc=
counts c
. fil$(!arg,r) {open file for reading}
lc$"valid.bra" {example: count letters in Bracmat's validation suite.}</syntaxhighlight>
<syntaxhighlight lang="bracmat">107*A
+ 33*B
+ 37*C
+ 685*y
+ 211*z
+ 1035*i</syntaxhighlight>
=={{header|C}}==
<syntaxhighlight lang="c">/* declare array */
int frequency[26];
int ch;
else if ('A' <= ch && ch <= 'Z') /* upper case */</syntaxhighlight>
=={{header|C sharp}}==
<syntaxhighlight lang="csharp">using System;
using System.Collections.Generic;
using System.IO;</syntaxhighlight>
Sample output:
<pre> : 1
!: 1
r: 1
w: 1</pre>
Declarative approach:
<syntaxhighlight lang="csharp">
var freq = from c in str
where char.IsLetter(c)
orderby c
group c by c into g
select g.Key + ":" + g.Count();
foreach(var g in freq)
Console.WriteLine(g);
</syntaxhighlight>
=={{header|C++}}==
<syntaxhighlight lang="cpp">#include <fstream>
#include <iostream></syntaxhighlight>
Example output when file contains "Hello, world!" (without quotes):
! = 1
w = 1
<syntaxhighlight lang="clojure">(println (sort-by second >
(frequencies (map #(java.lang.Character/toUpperCase %)
(filter #(java.lang.Character/isLetter %) (slurp "text.txt"))))))</syntaxhighlight>
=={{header|Common Lisp}}==
<syntaxhighlight lang="lisp">(defun letter-freq (file)
(with-open-file (stream file)
(let ((str (make-string (file-length stream)))
(if (zerop (rem i 8)) #\newline #\tab))))))
(letter-freq "test.lisp")</syntaxhighlight>
<syntaxhighlight lang="cobol">
PROGRAM-ID. Letter-Frequency.
AUTHOR. Bill Gunshannon.
DATE-WRITTEN. 12 December 2021.
** Program Abstract:
** A rather simplistic program to do the kind of thing
** that COBOL does really well.
SELECT Text-File ASSIGN TO "File.txt"
FD Text-File
01 Record-Name PIC X(80).
01 Eof PIC X VALUE 'F'.
01 Letter-cnt.
05 A-cnt PIC 9(5) VALUE 0.
05 B-cnt PIC 9(5) VALUE 0.
05 C-cnt PIC 9(5) VALUE 0.
05 D-cnt PIC 9(5) VALUE 0.
05 E-cnt PIC 9(5) VALUE 0.
05 F-cnt PIC 9(5) VALUE 0.
05 G-cnt PIC 9(5) VALUE 0.
05 H-cnt PIC 9(5) VALUE 0.
05 I-cnt PIC 9(5) VALUE 0.
05 J-cnt PIC 9(5) VALUE 0.
05 K-cnt PIC 9(5) VALUE 0.
05 L-cnt PIC 9(5) VALUE 0.
05 M-cnt PIC 9(5) VALUE 0.
05 N-cnt PIC 9(5) VALUE 0.
05 O-cnt PIC 9(5) VALUE 0.
05 P-cnt PIC 9(5) VALUE 0.
05 Q-cnt PIC 9(5) VALUE 0.
05 R-cnt PIC 9(5) VALUE 0.
05 S-cnt PIC 9(5) VALUE 0.
05 T-cnt PIC 9(5) VALUE 0.
05 U-cnt PIC 9(5) VALUE 0.
05 V-cnt PIC 9(5) VALUE 0.
05 W-cnt PIC 9(5) VALUE 0.
05 X-cnt PIC 9(5) VALUE 0.
05 Y-cnt PIC 9(5) VALUE 0.
05 Z-cnt PIC 9(5) VALUE 0.
01 Letter-disp.
05 A-cnt PIC ZZZZ9.
05 B-cnt PIC ZZZZ9.
05 C-cnt PIC ZZZZ9.
05 D-cnt PIC ZZZZ9.
05 E-cnt PIC ZZZZ9.
05 F-cnt PIC ZZZZ9.
05 G-cnt PIC ZZZZ9.
05 H-cnt PIC ZZZZ9.
05 I-cnt PIC ZZZZ9.
05 J-cnt PIC ZZZZ9.
05 K-cnt PIC ZZZZ9.
05 L-cnt PIC ZZZZ9.
05 M-cnt PIC ZZZZ9.
05 N-cnt PIC ZZZZ9.
05 O-cnt PIC ZZZZ9.
05 P-cnt PIC ZZZZ9.
05 Q-cnt PIC ZZZZ9.
05 R-cnt PIC ZZZZ9.
05 S-cnt PIC ZZZZ9.
05 T-cnt PIC ZZZZ9.
05 U-cnt PIC ZZZZ9.
05 V-cnt PIC ZZZZ9.
05 W-cnt PIC ZZZZ9.
05 X-cnt PIC ZZZZ9.
05 Y-cnt PIC ZZZZ9.
05 Z-cnt PIC ZZZZ9.
READ Text-File
AT END MOVE 'T' to Eof
TALLYING A-cnt OF Letter-cnt FOR ALL 'A'
TALLYING B-cnt OF Letter-cnt FOR ALL 'B'
TALLYING C-cnt OF Letter-cnt FOR ALL 'C'
TALLYING D-cnt OF Letter-cnt FOR ALL 'D'
TALLYING E-cnt OF Letter-cnt FOR ALL 'E'
TALLYING F-cnt OF Letter-cnt FOR ALL 'F'
TALLYING G-cnt OF Letter-cnt FOR ALL 'G'
TALLYING H-cnt OF Letter-cnt FOR ALL 'H'
TALLYING I-cnt OF Letter-cnt FOR ALL 'I'
TALLYING J-cnt OF Letter-cnt FOR ALL 'J'
TALLYING K-cnt OF Letter-cnt FOR ALL 'K'
TALLYING L-cnt OF Letter-cnt FOR ALL 'L'
TALLYING M-cnt OF Letter-cnt FOR ALL 'M'
TALLYING N-cnt OF Letter-cnt FOR ALL 'N'
TALLYING O-cnt OF Letter-cnt FOR ALL 'O'
TALLYING P-cnt OF Letter-cnt FOR ALL 'P'
TALLYING Q-cnt OF Letter-cnt FOR ALL 'Q'
TALLYING R-cnt OF Letter-cnt FOR ALL 'R'
TALLYING S-cnt OF Letter-cnt FOR ALL 'S'
TALLYING T-cnt OF Letter-cnt FOR ALL 'T'
TALLYING U-cnt OF Letter-cnt FOR ALL 'U'
TALLYING V-cnt OF Letter-cnt FOR ALL 'V'
TALLYING W-cnt OF Letter-cnt FOR ALL 'W'
TALLYING X-cnt OF Letter-cnt FOR ALL 'X'
TALLYING Y-cnt OF Letter-cnt FOR ALL 'Y'
TALLYING Z-cnt OF Letter-cnt FOR ALL 'Z'
CLOSE Text-File.
MOVE CORRESPONDING Letter-cnt To Letter-disp.
DISPLAY 'Letter Frequency Distribution'.
DISPLAY '-----------------------------'.
DISPLAY 'A : ' A-cnt OF Letter-disp ' '
'N : ' N-cnt OF Letter-disp.
DISPLAY 'B : ' B-cnt OF Letter-disp ' '
'O : ' O-cnt OF Letter-disp.
DISPLAY 'C : ' C-cnt OF Letter-disp ' '
'P : ' P-cnt OF Letter-disp.
DISPLAY 'D : ' D-cnt OF Letter-disp ' '
'Q : ' Q-cnt OF Letter-disp.
DISPLAY 'E : ' E-cnt OF Letter-disp ' '
'R : ' R-cnt OF Letter-disp.
DISPLAY 'F : ' F-cnt OF Letter-disp ' '
'S : ' S-cnt OF Letter-disp.
DISPLAY 'G : ' G-cnt OF Letter-disp ' '
'T : ' T-cnt OF Letter-disp.
DISPLAY 'H : ' H-cnt OF Letter-disp ' '
'U : ' U-cnt OF Letter-disp.
DISPLAY 'I : ' I-cnt OF Letter-disp ' '
'V : ' V-cnt OF Letter-disp.
DISPLAY 'J : ' J-cnt OF Letter-disp ' '
'W : ' W-cnt OF Letter-disp.
DISPLAY 'K : ' K-cnt OF Letter-disp ' '
'X : ' X-cnt OF Letter-disp.
DISPLAY 'L : ' L-cnt OF Letter-disp ' '
'Y : ' Y-cnt OF Letter-disp.
DISPLAY 'M : ' M-cnt OF Letter-disp ' '
'Z : ' Z-cnt OF Letter-disp.</syntaxhighlight>
Letter Frequency Distribution
A : 416 N : 434
B : 120 O : 545
C : 316 P : 215
D : 267 Q : 12
E : 679 R : 436
F : 122 S : 432
G : 171 T : 493
H : 131 U : 180
I : 429 V : 57
J : 12 W : 97
K : 17 X : 35
L : 303 Y : 50
M : 162 Z : 60
=={{header|Component Pascal}}==
BlackBox Component Builder
<syntaxhighlight lang="oberon2">
MODULE LetterFrecuency;
IMPORT Files,StdLog,Strings;
PROCEDURE Do*;
VAR
loc: Files.Locator;
fd: Files.File;
rd: Files.Reader;
x: BYTE;
frecuency: ARRAY 26 OF LONGINT;
c: CHAR;
i: INTEGER;
BEGIN
loc := Files.dir.This("BBTest/Mod");
fd := Files.dir.Old(loc,"LetterFrecuency.odc",FALSE);
rd := fd.NewReader(NIL);
(* init the frequency array *)
FOR i := 0 TO LEN(frecuency) - 1 DO frecuency[i] := 0 END;
(* collect frequencies *)
WHILE ~rd.eof DO
rd.ReadByte(x);c := CAP(CHR(x));
(* convert vowels with diacritics *)
CASE ORD(c) OF
193: c := 'A';
|201: c := 'E';
|205: c := 'I';
|211: c := 'O';
|218: c := 'U';
ELSE
END;
IF (c >= 'A') & (c <= 'Z') THEN
INC(frecuency[ORD(c) - ORD('A')]);
END
END;
(* show data *)
FOR i := 0 TO LEN(frecuency) - 1 DO
StdLog.Char(CHR(i + ORD('A')));StdLog.String(":> ");StdLog.Int(frecuency[i]);
StdLog.Ln
END
END Do;
END LetterFrecuency.
</syntaxhighlight>
Execute: ^Q LetterFrecuency.Do
A:> 28
B:> 7
C:> 100
D:> 94
E:> 168
F:> 30
G:> 10
H:> 11
I:> 49
J:> 0
K:> 1
L:> 67
M:> 25
N:> 57
O:> 81
P:> 3
Q:> 0
R:> 91
S:> 90
T:> 94
U:> 32
V:> 14
W:> 15
X:> 15
Y:> 17
Z:> 3
<syntaxhighlight lang="cowgol">include "cowgol.coh";
include "argv.coh";
include "file.coh";
# Get filename from command line
var file := ArgvNext();
if file == (0 as [uint8]) then
print("error: no file name\n");
ExitWithError();
end if;
# Open the file
var fcb: FCB;
if FCBOpenIn(&fcb, file) != 0 then
print("error: cannot open file\n");
ExitWithError();
end if;
# Counters for each letter
var letterCount: uint32[26];
MemZero(&letterCount as [uint8], @bytesof letterCount);
# Count every letter
var len := FCBExt(&fcb);
while len != 0 loop
len := len - 1;
var ch := (FCBGetChar(&fcb) | 32) - 'a';
if ch >= @sizeof letterCount then
continue;
end if;
letterCount[ch] := letterCount[ch] + 1;
end loop;
# Close the file
var foo := FCBClose(&fcb);
# Print value for each letter
ch := 0;
while ch < @sizeof letterCount loop
print_char(ch + 'A');
print(": ");
print_i32(letterCount[ch]);
print_nl();
ch := ch + 1;
end loop;</syntaxhighlight>
The result of running the program on its own source file:
<pre>A: 22
B: 11
C: 46
D: 9
E: 80
F: 32
G: 6
H: 26
I: 42
J: 0
K: 0
L: 39
M: 7
N: 53
O: 45
P: 14
Q: 0
R: 47
S: 8
T: 59
U: 18
V: 11
W: 5
X: 4
Y: 2
Z: 3</pre>
=={{header|D}}==
<syntaxhighlight lang="d">void main() {
import std.stdio, std.ascii, std.algorithm, std.range;
uint[26] frequency;
foreach (const buffer; "unixdict.txt".File.byChunk(2 ^^ 15))
foreach (immutable c; buffer.filter!isAlpha)
frequency[c.toLower - 'a']++;
writefln("%(%(%s, %),\n%)", frequency[].chunks(10));
}</syntaxhighlight>
<pre>16421, 4115, 8216, 5799, 20144, 2662, 4129, 5208, 13980, 430,
1925, 10061, 5828, 12097, 12738, 5516, 378, 13436, 10210, 12836,
6489, 1902, 1968, 617, 3633, 433</pre>
See [https://www.rosettacode.org/wiki/Letter_frequency#Pascal Pascal].
<syntaxhighlight lang="draco">proc nonrec main() void:
file() infile;
[256] char linebuf;
[256] word count;
*char line;
char c;
byte i;
word n;
channel input text filech;
channel input text linech;
for i from 0 upto 255 do count[i] := 0 od;
line := &linebuf[0];
open(filech, infile, "unixdict.txt");
while readln(filech; line) do
open(linech, line);
while read(linech; c) do
i := pretend(c, byte);
count[i] := count[i] + 1
od;
close(linech)
od;
close(filech);
for c from 'A' upto 'Z' do
i := pretend(c, byte);
n := count[i] + count[i | 32];
writeln(c, pretend(i | 32, char), ": ", n:5)
od
corp</syntaxhighlight>
<pre>Aa: 16421
Bb: 4115
Cc: 8216
Dd: 5799
Ee: 20144
Ff: 2662
Gg: 4129
Hh: 5208
Ii: 13980
Jj: 430
Kk: 1925
Ll: 10061
Mm: 5828
Nn: 12097
Oo: 12738
Pp: 5516
Qq: 378
Rr: 13436
Ss: 10210
Tt: 12836
Uu: 6489
Vv: 1902
Ww: 1968
Xx: 617
Yy: 3633
Zz: 433</pre>
<syntaxhighlight lang="easylang">
len d[] 26
repeat
s$ = input
until s$ = ""
for c$ in strchars s$
c = strcode c$
if c >= 97 and c <= 122
c -= 32
.
if c >= 65 and c <= 90
d[c - 64] += 1
.
.
.
for i to 26
write strchar (96 + i) & ": "
print d[i]
.
input_data
Open a text file and count the occurrences of each letter.
Some of these programs count all characters (including
punctuation), but some only count letters A to Z.
Other tasks related to string operations:
</syntaxhighlight>
We use a property list (plist for short), which is a hash table, to store the (letter . count) pairs.
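The same idea, a hash table that maps each character to its running count, can be sketched in Python with a plain dictionary. This is an illustrative sketch only; the function name <code>count_chars</code> is an assumption and not part of the EchoLisp solution.

```python
def count_chars(text):
    """Return (character, count) pairs sorted by character."""
    counts = {}  # plays the role of the plist / hash table
    for ch in text:
        # A missing key starts at 0 and is then bumped by one.
        counts[ch] = counts.get(ch, 0) + 1
    return sorted(counts.items())


if __name__ == "__main__":
    for ch, n in count_chars("hello world"):
        print(repr(ch), n)
```

Sorting the pairs at the end, rather than keeping the table ordered while counting, mirrors the two-step structure of counting first and sorting the result afterwards.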
<syntaxhighlight lang="lisp">
;; bump count when letter added
(define (hash-counter hash key )
;; (set! key (string-downcase key)) - if ignore case wanted
(putprop hash (1+ (or (getprop hash key) 0 )) key))
;; apply to exploded string
;; and sort result
(define (hash-compare a b) ( < (first a) (first b)))
(define (count-letters hash string)
(map (curry hash-counter hash) (string->list string))
(list-sort hash-compare (symbol-plist hash)))
</syntaxhighlight>
<syntaxhighlight lang="lisp">
(define (file-stats file string)
(set-plist! 'file-stats null) ; reset counters
(writeln (count-letters 'file-stats string))
(writeln "Total letters:" (string-length string))
(writeln "Total lines:" (getprop 'file-stats "#\\newline")))
; frequency for 'help.html' file
(file->string file-stats) ; browser 'open' dialog
</syntaxhighlight>
➛ help.html -> string
➛ (( 28918) (! 138) (# 1035) (#\newline 4539) (#\tab 409) ($ 7) (% 24) (& 136) (' 1643) ((3577) () 3583) (* 233)
(+ 303) (, 599) (- 3164) (. 1454) (/ 5388) (0 1567) (1 1769) (2 1258) (3 857) (4 1872) (5 453) (6 581) (7 344)
(8 337) (9 411) (: 1235) (; 647) (< 9951) (= 1834) (> 10255) (? 392) (@ 11) (A 166) (B 92) (C 144) (D 72) (E 224)
(F 52) (G 35) (H 42) (I 193) (J 31) (K 36) (L 196) (M 82) (N 94) (O 132) (P 192) (Q 27) (R 56) (S 220) (T 226) (U 37)
(V 51) (W 28) (X 6) (Y 38) (Z 2) ([ 237) (\ 12) (] 215) (^ 28) (_ 107) (` 7) (a 8420) (b 4437) (c 3879) (d 4201)
(e 11905) (f 2989) (g 2068) (h 3856) (i 11313) (j 334) (k 653) (l 5748) (m 3048) (n 7020) (o 7207) (p 3585) (q 249)
(r 8312) (s 8284) (t 8704) (u 3833) (v 1135) (w 861) (x 1172) (y 1451) (z 268) ({ 123) (| 62) (} 123) (~ 7) (§ 1) (© 1)
(« 1) (» 1) (É 2) (à 18) (â 3) (ç 3) (è 6) (é 53) (î 1) (ö 9) (û 1) (œ 1) (ε 2) (λ 12) (μ 1) (ο 2) (ς 1)
(τ 1) (а 1) (д 1) (е 1) (з 1) (л 1) (м 1) (н 1) (я 3) (ἄ 1) (— 2) (“ 2) (” 2) (… 184) (→ 465) (∅ 57) (∈ 4) (∏ 1)
(∑ 2) (∘ 6) (√ 4)(∞ 12) (∫ 2) (⌚ 2) (⌛ 1) (⏳ 4) (☕ 1) (♠ 7) (♡ 2) (♢ 2) (♣ 6) (♤ 2) (♥ 8) (♦ 8)
(♧ 2) (⚁ 1) (⚃ 2) (⚪ 1) (⛔ 1) (✋ 1) (❄ 1) (❅ 1) (❆ 1) (❇ 1) (❈ 1) (❉ 1) (❊ 1) (❋ 1) (❌ 3) (❍ 1)
(❎ 1) (❗ 1) (➛ 900) (➰ 1) (⭕ 2) ... )
➛ Total letters: 212631
➛ Total lines: 4539
<syntaxhighlight lang="eiffel">class
feature {NONE} -- Initialization
-- Read from the file and print frequencies.
create file.make_open_read("input.txt")
across get_frequencies(file.last_string) as f loop
print(f.key.out + ": " + f.item.out + "%N")
feature -- Access
-- Hash table of counts for alphabetic characters in `s'.
create Result.make(0)
across s.area as st loop
char := st.item
if char.is_alpha then
if Result.has(char) then
Result.force(Result.at(char) + 1, char)
Result.put (1, char)
Output when file contains "Hello, Eiffel world!":
<pre>H: 1
e: 2
l: 4
o: 2
E: 1
i: 1
f: 2
w: 1
r: 1
d: 1</pre>
<syntaxhighlight lang="elixir">file = hd(System.argv)
File.read!(file)
|> String.upcase
|> String.graphemes
|> Enum.filter(fn c -> c =~ ~r/[A-Z]/ end)
|> Enum.reduce(Map.new, fn c,acc -> Map.update(acc, c, 1, &(&1+1)) end)
|> Enum.sort_by(fn {_k,v} -> -v end)
|> Enum.each(fn {k,v} -> IO.puts "#{k} #{v}" end)</syntaxhighlight>
Output:
E 20144
A 16421
I 13980
R 13436
T 12836
O 12738
N 12097
S 10210
L 10061
C 8216
U 6489
M 5828
D 5799
P 5516
H 5208
G 4129
B 4115
Y 3633
F 2662
W 1968
K 1925
V 1902
X 617
Z 433
J 430
Q 378
=={{header|Emacs Lisp}}==
<syntaxhighlight lang="lisp">
(defun tally-letter-frequency-in-file ()
"Open a file and count the number of times each letter appears."
(let ((alphabet "abcdefghijklmnopqrstuvwxyz") ; variable to hold letters we will be counting
(current-letter) ; variable to hold current letter we will be counting
(count) ; variable to count how many times current letter appears
(case-fold-search t)) ; ignores case
(find-file "~/Documents/Elisp/MobyDick.txt") ; open file in a buffer (or switch to buffer if file is already open)
(while (>= (length alphabet) 1) ; as long as there is at least 1 letter left in alphabet
(beginning-of-buffer) ; go to the beginning of the buffer
(setq current-letter (substring alphabet 0 1)) ; set current-letter to first letter of alphabet
(setq count (how-many current-letter)) ; count how many of this letter in file
(end-of-buffer) ; go to the end of the buffer
(insert (format "\n%s%s - %7d" current-letter (upcase current-letter) count)) ; write how many times that letter appears
(setq alphabet (substring alphabet 1 nil))) ; remove first letter from alphabet
(insert "\n")))
</syntaxhighlight>
aA - 79220
bB - 17203
cC - 23318
dD - 38834
eE - 119345
fF - 21252
gG - 21287
hH - 63769
iI - 66671
jJ - 1176
kK - 8228
lL - 43349
mM - 23626
nN - 66778
oO - 70808
pP - 17873
qQ - 1581
rR - 53589
sS - 65136
tT - 89874
uU - 27205
vV - 8724
wW - 22556
xX - 1064
yY - 17242
zZ - 635
<syntaxhighlight lang="erlang">%% Implemented by Arjun Sunel
-module(letter_frequency).
-export([main/0, letter_freq/1]).
main() ->
case file:read_file("file.txt") of
{ok, FileData} ->
letter_freq(binary_to_list(FileData));
_FileNotExist ->
io:format("File does not exist~n")
end.
letter_freq(Data) ->
lists:foreach(fun(Char) ->
LetterCount = lists:foldl(fun(Element, Count) ->
case Element =:= Char of
true ->
Count + 1;
false ->
Count
end
end, 0, Data),
case LetterCount >0 of
true ->
io:format("~p : ~p~n", [[Char], LetterCount]);
false ->
ok
end
end, lists:seq(0, 222)).
</syntaxhighlight>
Output:
" " : 4
"," : 1
"." : 22
":" : 3
"M" : 1
"a" : 2
"e" : 2
"i" : 1
"j" : 1
"l" : 1
"m" : 1
"n" : 3
"r" : 1
"s" : 2
"u" : 2
"y" : 1
"}" : 2
Alternatively letter_freq/1 above can be replaced with
<syntaxhighlight lang="erlang">
letter_freq( Data ) ->
Dict = lists:foldl( fun (Char, Dict) -> dict:update_counter( Char, 1, Dict ) end, dict:new(), Data ),
[io:fwrite( "~p : ~p~n", [[X], dict:fetch(X, Dict)]) || X <- dict:fetch_keys(Dict)].
</syntaxhighlight>
Using ERRE help file for testing.
<syntaxhighlight lang="erre">PROGRAM LETTER
DIM CNT[255]
FOR C%=$41 TO $5A DO
PRINT(CHR$(C%);CHR$(C%+32);": ";CNT[C%]+CNT[C%+32])
{{works with|OpenEuphoria}}
<syntaxhighlight lang="euphoria">
-- LetterFrequency.ex
-- Count frequency of each letter in own source code.
include std/console.e
include std/io.e
include std/text.e
sequence letters = repeat(0,26)
sequence content = read_file("LetterFrequency.ex")
content = lower(content)
for i = 1 to length(content) do
if content[i] > 96 and content[i] < 123 then
letters[content[i]-96] += 1
end if
end for
for i = 1 to 26 do
printf(1,"%s: %d\n",{i+96,letters[i]})
end for
</syntaxhighlight>
Output:
a: 4
b: 0
c: 21
x: 3
y: 3
z: 0
<syntaxhighlight lang="fsharp">let alphabet =
    ['A'..'Z'] |> Set.ofList
let letterFreq (text : string) =
    text.ToUpper().ToCharArray()
    |> Array.filter (fun x -> alphabet.Contains(x))
    |> Seq.countBy (fun x -> x)
    |> Seq.sort
let v = "Now is the time for all good men to come to the aid of the party"
let res = letterFreq v
for (letter, freq) in res do
    printfn "%A, %A" letter freq</syntaxhighlight>
<syntaxhighlight lang="factor">USING: hashtables locals io assocs kernel io.encodings.utf8 io.files formatting ;
IN: count-letters
: count-from-stream ( -- counts )
52 <hashtable>
[ read1 dup ] [ over inc-at ] while
drop ;
: print-counts ( counts -- )
[ "%c: %d\n" printf ] assoc-each ;
: count-letters ( filename -- )
utf8 [ count-from-stream ] with-file-reader
print-counts ;
</syntaxhighlight>
The result of the first evaluation of ASC() is retained in the symbol ASC for later use. This is a standard feature of FBSL functions. The ascii array is dynamic. Command(1) is the name of the script file.
<syntaxhighlight lang="qbasic">#APPTYPE CONSOLE
'Open a text file and count the occurrences of each letter.
FUNCTION countBytes(fileName AS STRING)
DIM ascii[]
c = FILEGETC(handle)
ascii[ASC] = ascii[ASC(c)] + 1
RETURN ascii
DIM counters = countBytes(COMMAND(1))
FOR DIM i = LBOUND(counters) TO UBOUND(counters)
PRINT i, TAB, IIF(i <= 32, i, CHR(i)), TAB, counters[i]
=={{header|Forth}}==
<syntaxhighlight lang="forth">create counts 26 cells allot
: freq ( filename -- )
counts 26 cells erase
slurp-file bounds do
i c@ 32 or [char] a -
dup 0 26 within if
cells counts +
1 swap +!
else drop then
loop
26 0 do
cr [char] ' emit [char] a i + emit ." ': "
counts i cells + @ .
loop ;
s" example.txt" freq</syntaxhighlight>
Using the configuration file of the J example (which has changed since that example was documented), compiling and running this program on a GNU/Linux system gives:
<syntaxhighlight lang="fortran">
-*- mode: compilation; default-directory: "/tmp/" -*-
Compilation started at Sat May 18 18:09:46
a=./F && make $a && $a < configuration.file
f95 -Wall -ffree-form F.F -o F
92 21 17 24 82 19 19 22 67 0 2 27 27 57 55 31 1 61 43 60 20 6 2 0 10 0
Compilation finished at Sat May 18 18:09:46
</syntaxhighlight>
And here's the FORTRAN90 program source. The program reads stdin and writes the result to stdout. Future enhancement: use block size records.
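The "block size records" enhancement mentioned above could look roughly like the following Python sketch, which reads a binary stream in fixed-size blocks instead of one byte per read. It is not a port of the Fortran program: the function name <code>count_letters</code> and the default <code>block_size</code> of 65536 bytes are assumptions chosen for illustration.

```python
import io


def count_letters(stream, block_size=65536):
    """Count a-z (case-insensitively) in a binary stream, reading block by block."""
    counts = [0] * 26
    while True:
        block = stream.read(block_size)
        if not block:  # an empty read signals end of input
            break
        for byte in block:
            # Setting bit 5 folds ASCII upper case onto lower case.
            t = (byte | 32) - ord("a")
            if 0 <= t < 26:
                counts[t] += 1
    return counts


if __name__ == "__main__":
    # Passing sys.stdin.buffer instead of this in-memory stream would read standard input.
    demo = io.BytesIO(b"Hello, World!")
    print(*count_letters(demo, block_size=4))
```

Reading in blocks keeps the number of read calls small while the per-byte counting logic stays the same as in the byte-at-a-time version.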
<syntaxhighlight lang="fortran">
! count letters from stdin
program LetterFrequency
implicit none
character (len=1) :: s
integer, dimension(26) :: a
integer :: ios, i, t
data a/26*0/,i/0/
open(unit=7, file='/dev/stdin', access='direct', form='formatted', recl=1, status='old', iostat=ios)
if (ios .ne. 0) then
write(0,*)'Opening stdin failed'
stop
end if
do i=1, huge(i)
read(unit=7, rec = i, fmt = '(a)', iostat = ios ) s
if (ios .ne. 0) then
!write(0,*)'ios on failure is ',ios
close(unit=7)
exit
end if
t = ior(iachar(s(1:1)), 32) - iachar('a')
if ((0 .le. t) .and. (t .le. 25)) then ! keep only a-z (offsets 0..25)
t = t+1
a(t) = a(t) + 1
end if
end do
write(6, *) a
end program LetterFrequency
</syntaxhighlight>
<syntaxhighlight lang="freebasic">' FB 1.05.0 Win64
Dim a(65 to 90) As Integer ' array to hold frequency of each letter, all elements zero initially
Dim fileName As String = "input.txt"
Dim s As String
Dim i As Integer
Open fileName For Input As #1
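While Not Eof(1)
Line Input #1, s
s = UCase(s)
For i = 0 To Len(s) - 1
' count only A-Z so the index stays within the bounds of a()
If s[i] >= 65 AndAlso s[i] <= 90 Then a(s[i]) += 1
Next
Wend
Close #1
Print "The frequency of each letter in the file "; fileName; " is as follows:"
For i = 65 To 90
If a(i) > 0 Then
Print Chr(i); " : "; a(i)
End If
Next
Print
Print "Press any key to quit"
Sleep</syntaxhighlight>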
results for input.txt which contains the single line:
The quick brown fox jumps over the lazy dog.
The frequency of each letter in the file input.txt is as follows:
A : 1
B : 1
C : 1
D : 1
E : 3
F : 1
G : 1
H : 2
I : 1
J : 1
K : 1
L : 1
M : 1
N : 1
O : 4
P : 1
Q : 1
R : 2
S : 1
T : 2
U : 2
V : 1
W : 1
X : 1
Y : 1
Z : 1
Sample text:
system that ships with most desktops, laptops, and servers works just fine? To answer that question, I would pose another question.
Does that operating system you’re currently using really work “just fine”? Or are you constantly battling viruses, malware, slow
downs, crashes, costly repairs, and licensing fees?
If you struggle with the above, and want to free yourself from the constant fear of losing data or having to take your computer in
for the “yearly clean up,” Linux might be the perfect platform for you. Linux has evolved into one of the most reliable computer
ecosystems on the planet. Combine that reliability with zero cost of entry and you have the perfect solution for a desktop platform.
This example shows some of the subtle and non-obvious power of Frink in processing text files in a language-aware and Unicode-aware fashion:
* Frink has a Unicode-aware function, <CODE>graphemeList[''str'']</CODE>, which intelligently enumerates through what a human would consider to be a single visible character, including "characters" composed of multiple Unicode codepoints.
* The file fetched from Project Gutenberg is supposed to be encoded in UTF-8 character encoding, but their servers incorrectly send either that it is Windows-1252 encoded or send no character encoding at all, so this program fixes that.
* Frink has a Unicode-aware lowercase function, <CODE>lc[''str'']</CODE> that correctly handles accented characters and may even make a string longer.
* This uses full Unicode tables to determine what is a "letter."
* This works with high Unicode characters, that is above \uFFFF.
* Frink can normalize Unicode characters with its <CODE>normalizeUnicode</CODE> function so the same grapheme encoded two different ways in Unicode can be treated consistently. A Unicode string can encode what is essentially the same character/glyph in more than one way. For example, the character <CODE>ô</CODE> can be represented as either <CODE>"\u00F4"</CODE> or <CODE>"\u006F\u0302"</CODE>. The former is a "precomposed" character, <CODE>"LATIN SMALL LETTER O WITH CIRCUMFLEX"</CODE>, and the latter is two Unicode codepoints, an <CODE>o</CODE> (<CODE>LATIN SMALL LETTER O</CODE>) followed by <CODE>"COMBINING CIRCUMFLEX ACCENT"</CODE>. (This is usually referred to as a "decomposed" representation.) Unicode normalization rules can convert these "equivalent" encodings into a canonical representation, so two strings that look identical to a human (but differ in their codepoints) are treated as the same by a computer, and these programs count them the same. Even if the Project Gutenberg document mixes precomposed and decomposed representations for the same characters, this program counts them together. See the [http://unicode.org/reports/tr15/ Unicode Normal Forms] specification for more about these normalization rules. Frink implements all of them (NFC, NFD, NFKC, NFKD). NFC is the default in <CODE>normalizeUnicode[''str'', ''encoding=NFC'']</CODE>. They're interesting!
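The normalization point above can be illustrated with a small Python sketch. It is not a port of the Frink program: it uses the standard library's <code>unicodedata.normalize</code> and <code>str.isalpha</code>, and the helper name <code>letter_counts</code> is an assumption. Unlike <CODE>graphemeList</CODE>, it counts code points rather than grapheme clusters, so it only approximates grapheme counting for text that NFC can fully compose.

```python
import unicodedata
from collections import Counter


def letter_counts(text):
    """Lowercase, normalize to NFC, then count the characters classed as letters."""
    text = unicodedata.normalize("NFC", text.lower())
    return Counter(ch for ch in text if ch.isalpha())


if __name__ == "__main__":
    precomposed = "\u00F4"       # LATIN SMALL LETTER O WITH CIRCUMFLEX
    decomposed = "\u006F\u0302"  # o followed by COMBINING CIRCUMFLEX ACCENT
    upper = "\u00D4"             # LATIN CAPITAL LETTER O WITH CIRCUMFLEX
    # All three spellings end up under the same key after lowercasing and NFC.
    print(letter_counts(precomposed + decomposed + upper))
```

Without the <code>normalize</code> call, the decomposed spelling would be counted as a plain <code>o</code>, because the combining accent on its own is not a letter.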
How many other languages in this page do all or any of this correctly?
Output:
e 330603
t 235571
a 207101
o 184385
h 176823
i 175320
n 169922
s 162043
r 148632
d 108724
l 99567
u 68295
c 67332
m 62212
w 56507
f 56187
g 48543
p 43366
y 39183
b 37461
v 26258
k 14427
j 5838
x 4026
q 2533
z 1905
é 1473
è 299
æ 116
ê 74
à 64
â 56
ç 50
ü 39
î 39
œ 38
ô 34
ù 18
ï 18
û 9
ë 5
ñ 2
A clone of the Objective-C solution.
This code assumes a text file named "MyTextFile.txt" is in the same folder as the code file.
<syntaxhighlight lang="futurebasic">include "NSLog.incl"
include resources "MyTextFile.txt"

void local fn DoIt
  CFURLRef      url
  CFStringRef   string
  NSUInteger    length, index
  unichar       chr
  CountedSetRef set
  CFNumberRef   number

  url = fn BundleURLForResource( fn BundleMain, @"MyTextFile", @"txt", NULL )
  string = fn StringWithContentsOfURL( url, NSUTF8StringEncoding, NULL )
  if ( string )
    set = fn CountedSetWithCapacity(0)
    length = len(string)
    // add every UTF-16 code unit of the text to the counted set
    for index = 0 to length - 1
      chr = fn StringCharacterAtIndex( string, index )
      CountedSetAddObject( set, @(chr) )
    next
    // report each distinct character with its count
    for number in set
      NSLog(@"%C = %ld",intVal(number),fn CountedSetCountForObject( set, number ))
    next
  end if
end fn

fn DoIt

HandleEvents</syntaxhighlight>
<syntaxhighlight lang="gambas">Public Sub Form_Open()
Dim sData As String = File.Load("data.txt")
Dim iCount, iSpaces, iLetters, iOther As Integer
Dim bPunctuation As Boolean

For iCount = 1 To Len(sData)
  If InStr("ABCDEFGHIJKLMNOPQRSTUVWXYZ", UCase(Mid(sData, iCount, 1))) Then
    Inc iLetters
    bPunctuation = True
  End If
  If Mid(sData, iCount, 1) = " " Then
    Inc iSpaces
    bPunctuation = True
  End If
  'Anything that is neither a letter nor a space is counted as "other"
  If bPunctuation = False Then Inc iOther
  bPunctuation = False
Next

Message("Text contains " & Len(sData) & " characters\n" & iLetters & " Letters\n" & iSpaces & " Spaces\n" & iOther & " Punctuation, newlines etc.")

End</syntaxhighlight>
Output:
677 Letters
135 Spaces
42 Punctuation, newlines etc.
<syntaxhighlight lang="go">package main
import (
// ...
func (lfs lfList) Swap(i, j int) {
lfs[i], lfs[j] = lfs[j], lfs[i]
file: unixdict.txt
<syntaxhighlight lang="groovy">def frequency = { it.inject([:]) { map, value -> map[value] = (map[value] ?: 0) + 1; map } }

frequency(new File('frequency.groovy').text).each { key, value ->
    println "'$key': $value"
}</syntaxhighlight>
<pre>'d': 1
'e': 19
...
'"': 2
'$': 2</pre>
<syntaxhighlight lang="visualfoxpro">PROCEDURE Main()
LOCAL s := hb_MemoRead( Left( __FILE__ , At( ".", __FILE__ )) +"prg")
LOCAL c, n, i
LOCAL a := {}
IF Asc( c ) > 31
AAdd( a, c )
a := ASort( a )
i := 1
WHILE i <= Len( a )
c := a[i] ; n := 1
IF i < Len(a) .AND. c == a[i]
WHILE c == a[i]
n++ ; i++
?? "'" + c + "'" + "=" + hb_NtoS( n ) + " "
Output (counting the printable characters of its own source code):
' '=190 '"'=12 ' ' '=2 '('=10 ')'=10 '+'=12 ','=5 '.'=3 '1'=3 '3'=1 ':'=6 ';'=2 '<'=2 '='=12
'>'=1 '?'=2 'A'=10 'C'=5 'D'=6 'E'=13 'F'=7 'H'=3 'I'=9 'L'=13 'M'=2 'N'=9 'O'=5 'P'=1
'R'=6 'S'=2 'T'=2 'U'=2 'W'=2 'X'=1 '['=3 ']'=3 '_'=10 'a'=12 'b'=2 'c'=9 'd'=3 'e'=5
'f'=1 'g'=1 'h'=2 'i'=11 'm'=1 'n'=7 'o'=3 'p'=1 'r'=2 's'=3 't'=5 'w'=1 '{'=1 '}'=1
Short version:
<syntaxhighlight lang="haskell">import Data.List (group, sort)

main :: IO ()
main = interact (show . fmap ((,) . head <*> length) . group . sort)</syntaxhighlight>

or, as an alternative to sorting and grouping the whole string, we could use some kind of container as the accumulator for a single fold, for example:
<syntaxhighlight lang="haskell">import Data.List (sortBy)
import qualified Data.Map.Strict as M
import Data.Ord (comparing)

charCounts :: String -> M.Map Char Int
charCounts = foldr (M.alter f) M.empty
  where
    f (Just x) = Just (succ x)
    f _ = Just 1

main :: IO ()
main =
  readFile "miserables.txt"
    >>= mapM_ print
      . sortBy
        (flip $ comparing snd)
      . M.toList
      . charCounts</syntaxhighlight>
Output:
=={{header|Icon}} and {{header|Unicon}}==
The example below counts (case insensitive) letters and was run on a version of this source file.
<syntaxhighlight lang="icon">link printf
procedure main(A)
# ...
every c := key(T) do
   printf("%s - %d\n",c,T[c])
end</syntaxhighlight>
{{libheader|Icon Programming Library}}
[http://www.cs.arizona.edu/icon/library/src/procs/printf.icn printf.icn provides formatting]
Output:
<pre>c - 17
k - 5
s - 10
...
n - 28
v - 4</pre>
Works with the Node REPL.
<syntaxhighlight lang="insitux">
(-> (read "README.md")
(filter letter?)
(map upper-case)
(sort-by 1)
(map (join " "))
(join " "))
Output:
<syntaxhighlight lang="is-basic">100 PROGRAM "Letters.bas"
110 NUMERIC LETT(65 TO 90)
120 FOR I=65 TO 90
130 LET LETT(I)=0
140 NEXT
150 LET EOF=0
160 OPEN #1:"list.txt"
180 DO
190 GET #1:A$
200 LET A$=UCASE$(A$)
210 IF A$>="A" AND A$<="Z" THEN LET LETT(ORD(A$))=LETT(ORD(A$))+1
240 FOR I=65 TO 90
250 PRINT CHR$(I);":";LETT(I),
260 NEXT
280 LET EOF=-1
290 CLOSE #1
310 END HANDLER</syntaxhighlight>
Input is a directory-path with filename. Result is 26 integers representing counts of each letter, in alphabetic order (a's count is first).
<syntaxhighlight lang="j">ltrfreq=: 3 : 0
  letters=. u: 65 + i.26  NB. upper case letters
  <: #/.~ letters (, -. -.~) toupper fread y
)</syntaxhighlight>
Example use (based on a configuration file from another task):
<syntaxhighlight lang="j"> ltrfreq 'config.file'
88 17 17 24 79 18 19 19 66 0 2 26 26 57 54 31 1 53 43 59 19 6 2 0 8 0</syntaxhighlight>
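The shape of that result, 26 integers in alphabetical order, can be illustrated with a short Python sketch. This is not a translation of the J verb: it takes the text directly rather than a file path, and the function name simply mirrors the J one for readability.
<syntaxhighlight lang="python">import string
from collections import Counter


def ltrfreq(text):
    """Return 26 counts, one per letter a..z, ignoring case and non-letters."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    # Counter returns 0 for letters that never occurred
    return [counts[letter] for letter in string.ascii_lowercase]


result = ltrfreq("Hello, world!")
print(result)</syntaxhighlight>
Letters that do not occur still get a slot in the list, which is why zeros appear in the J output above.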
This implementation will capture the frequency of all characters
<syntaxhighlight lang="java5">
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;

public class LetterFrequency {
    public static void main(String[] args) throws IOException {
        Map<Integer, Integer> frequencies = frequencies("src/LetterFrequency.java");
        System.out.println(print(frequencies));
    }

    static String print(Map<Integer, Integer> frequencies) {
        StringBuilder string = new StringBuilder();
        int key;
        for (Map.Entry<Integer, Integer> entry : frequencies.entrySet()) {
            key = entry.getKey();
            /* the count comes first, followed by the character */
            string.append("%-8d".formatted(entry.getValue()));
            /* display the hexadecimal value for non-printable characters */
            if ((key >= 0 && key < 32) || key == 127) {
                string.append("%02x%n".formatted(key));
            } else {
                string.append("%s%n".formatted((char) key));
            }
        }
        return string.toString();
    }

    static Map<Integer, Integer> frequencies(String path) throws IOException {
        try (InputStreamReader reader = new InputStreamReader(new FileInputStream(path))) {
            /* key = character, and value = occurrences */
            Map<Integer, Integer> map = new HashMap<>();
            int value;
            while ((value = reader.read()) != -1) {
                if (map.containsKey(value)) {
                    map.put(value, map.get(value) + 1);
                } else {
                    map.put(value, 1);
                }
            }
            return map;
        }
    }
}
</syntaxhighlight>
Output:
1 !
8 "
5 %
2 &
33 (
33 )
4 *
1 +
9 ,
3 -
29 .
5 /
2 0
4 1
3 2
1 3
1 7
1 8
1 :
19 ;
7 <
12 =
7 >
2 B
4 E
4 F
2 H
18 I
2 K
2 L
8 M
3 O
3 R
13 S
1 V
1 [
1 ]
73 a
3 b
28 c
19 d
121 e
13 f
25 g
11 h
53 i
6 j
8 k
25 l
22 m
67 n
24 o
44 p
8 q
81 r
30 s
87 t
34 u
15 v
7 w
5 x
20 y
11 {
2 |
11 }
<br />
{{works with|Java|5+}}
<syntaxhighlight lang="java5">import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
{{works with|Java|7+}}
In Java 7, we can use try with resources. The <code>countLetters</code> method would look like this:
<syntaxhighlight lang="java5">public static int[] countLetters(String filename) throws IOException{
    int[] freqs = new int[26];
    try(BufferedReader in = new BufferedReader(new FileReader(filename))){
        // ...
    }
    return freqs;
}</syntaxhighlight>
{{works with|Java|8+}}
In Java 8, we can use streams. This code also handles Unicode code points. The <code>countLetters</code> method would look like this:
<syntaxhighlight lang="java5">public static Map<Integer, Long> countLetters(String filename) throws IOException {
    return Files.lines(Paths.get(filename))
        .flatMapToInt(String::codePoints) // one int per Unicode code point
        .filter(Character::isLetter)
        .boxed()
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
}</syntaxhighlight>
JavaScript is no longer used only in environments which are carefully isolated from file systems, but JavaScript standards still do not specify standard file-system functions.
Leaving aside the particular and variable details of how files will be opened and read in environments like Node.js and OS X JavaScript for Automation etc.,
we can still use core JavaScript (ES5 in the example below) to count the characters in a text once it has been read from a file system.
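<syntaxhighlight lang="javascript">(function(txt) {

    var cs = txt.split(''),
        i = cs.length,
        dct = {},
        c = '',
        keys;

    // tally each character, walking the array from the end
    while (i--) {
        c = cs[i];
        dct[c] = (dct[c] || 0) + 1;
    }

    // return [character, count] pairs in sorted key order
    keys = Object.keys(dct);
    keys.sort();
    return keys.map(function (c) { return [c, dct[c]]; });
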
})("Not all that Mrs. Bennet, however, with the assistance of her five\
daughters, could ask on the subject, was sufficient to draw from her\
husband any satisfactory description of Mr. Bingley. They attacked him\
in various ways--with barefaced questions, ingenious suppositions, and\
distant surmises; but he eluded the skill of them all, and they were at\
last obliged to accept the second-hand intelligence of their neighbour,\
Lady Lucas. Her report was highly favourable. Sir William had been\
delighted with him. He was quite young, wonderfully handsome, extremely\
agreeable, and, to crown the whole, he meant to be at the next assembly\
with a large party. Nothing could be more delightful! To be fond of\
dancing was a certain step towards falling in love; and very lively\
hopes of Mr. Bingley's heart were entertained."); </syntaxhighlight>
<syntaxhighlight lang="javascript">[[" ", 121], ["!", 1], ["'", 1], [",", 13], ["-", 3], [".", 9], [";", 2],
["B", 3], ["H", 2], ["L", 2], ["M", 3], ["N", 2], ["S", 1], ["T", 2], ["W", 1],
["a", 53], ["b", 13], ["c", 17], ["d", 29], ["e", 82], ["f", 17], ["g", 16], ["h", 36],
["i", 44], ["j", 1], ["k", 3], ["l", 34], ["m", 11], ["n", 41], ["o", 40], ["p", 8],
["q", 2], ["r", 35], ["s", 39], ["t", 55], ["u", 20], ["v", 7], ["w", 17], ["x", 2], ["y", 16]]</syntaxhighlight>
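For comparison, the file-reading step that the snippet above leaves out is short in a language whose standard library does specify file access. The following is a minimal Python sketch, assuming a UTF-8 text file; the function name <code>char_counts</code> and its parameters are chosen only for this illustration.
<syntaxhighlight lang="python">from collections import Counter
from pathlib import Path


def char_counts(path, encoding="utf-8"):
    """Read a whole text file and return sorted (character, count) pairs."""
    text = Path(path).read_text(encoding=encoding)
    return sorted(Counter(text).items())</syntaxhighlight>
As in the JavaScript version, every character is counted, including spaces and punctuation, and the pairs come back sorted by character.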
Using the 'JavaScript for Automation' embedding of a JSContext on macOS, for access to the file system:
<syntaxhighlight lang="javascript">(() => {
'use strict';
// charCounts :: String -> [(Char, Int)]
const charCounts = s =>
(a, c) => (
a[c] = 1 + (a[c] || 0),
), {}
// ----------------------- TEST -----------------------
// main :: IO ()
const main = () =>
either(msg => msg)(
// -----------------GENERIC FUNCTIONS -----------------
// Left :: a -> Either a b
const Left = x => ({
type: 'Either',
Left: x
// Right :: b -> Either a b
const Right = x => ({
type: 'Either',
Right: x
// chars :: String -> [Char]
const chars = s =>
// comparing :: (a -> b) -> (a -> a -> Ordering)
const comparing = f =>
x => y => {
a = f(x),
b = f(y);
return a < b ? -1 : (a > b ? 1 : 0);
// compose (<<<) :: (b -> c) -> (a -> b) -> a -> c
const compose = (...fs) =>
(f, g) => x => f(g(x)),
x => x
// either :: (a -> c) -> (b -> c) -> Either a b -> c
const either = fl =>
fr => e => 'Either' === e.type ? (
undefined !== e.Left ? (
) : fr(e.Right)
) : undefined;
// flip :: (a -> b -> c) -> b -> a -> c
const flip = f =>
1 < f.length ? (
(a, b) => f(b, a)
) : (x => y => f(y)(x));
// map :: (a -> b) -> [a] -> [b]
const map = f =>
// The list obtained by applying f
// to each element of xs.
// (The image of xs under f).
xs => (
Array.isArray(xs) ? (
) : xs.split('')
// readFileLR :: FilePath -> Either String IO String
const readFileLR = fp => {
e = $(),
ns = $.NSString
return ns.isNil() ? (
) : Right(ObjC.unwrap(ns));
// snd :: (a, b) -> b
const snd = tpl => tpl[1];
// sortBy :: (a -> a -> Ordering) -> [a] -> [a]
const sortBy = f =>
xs => xs.slice()
.sort((a, b) => f(a)(b));
// unlines :: [String] -> String
const unlines = xs =>
// A single string formed by the intercalation
// of a list of strings with the newline character.
// MAIN ---
return main();
<pre>[" ",516452]
Or, using an object as a hash-table, and the reduce method:
(note that this version omits the opening of a text file which is specified in the task description):
<syntaxhighlight lang="javascript">(() => {
    'use strict';

    // Spread the text into characters, then fold them into an object
    const letterfreq = text => [...text]
        .reduce(
            (a, c) => (a[c] = (a[c] || 0) + 1, a),
            {}
        );

    return JSON.stringify(
        letterfreq(
            `remember, remember, the fifth of november
gunpowder treason and plot
I see no reason why gunpowder treason
should ever be forgot`
        ),
        null, 2
    );
})();</syntaxhighlight>
Using the spread operator, you get the unicode characters rather than the UTF-16 code units.
"r": 12,
"e": 19,
"m": 5,
"b": 4,
",": 2,
" ": 56,
"t": 6,
"h": 4,
"f": 4,
"i": 1,
"o": 12,
"n": 8,
"v": 2,
"\n": 3,
"g": 3,
"u": 3,
"p": 3,
"w": 3,
"d": 4,
"a": 4,
"s": 5,
"l": 2,
"I": 1,
"y": 1
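The difference between counting Unicode characters and counting UTF-16 code units only shows up for characters above U+FFFF. The following Python sketch (not JavaScript) makes the two tallies visible side by side; the sample string is an arbitrary choice for this illustration.
<syntaxhighlight lang="python">from collections import Counter

text = "a\U0001F600a"  # 'a', one character above U+FFFF, 'a'

# Python strings iterate by code point, like [...text] in JavaScript
by_code_point = Counter(text)

# Splitting the UTF-16 encoding into 2-byte units mimics text.split('')
raw = text.encode("utf-16-le")
units = [raw[i:i + 2] for i in range(0, len(raw), 2)]
by_code_unit = Counter(units)

print(len(text), len(units))</syntaxhighlight>
The code-unit tally sees four items, because the astral character is stored as a surrogate pair, while the code-point tally sees three.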
The following program will report the frequency of all characters in the input file, including newlines, returns, etc., provided the file will fit in memory.
<syntaxhighlight lang="jq">
# Input: an array of strings.
# Output: an object with the strings as keys,
# the values of which are the corresponding frequencies.
def counter:
  reduce .[] as $item ( {}; .[$item] += 1 ) ;

# For neatness we sort the keys:
explode | map( [.] | implode ) | counter | . as $counter
  | keys | sort[] | [., $counter[.] ]
</syntaxhighlight>
Example:<syntaxhighlight lang="sh">jq -s -R -c -f Letter_frequency.jq somefile.txt</syntaxhighlight>
[" ",124]
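When the file may not fit in memory, the same tally can be built incrementally. The following Python sketch is an alternative technique, not the jq approach: it reads fixed-size chunks and updates a running counter. The function name and the default chunk size are assumptions made for this example.
<syntaxhighlight lang="python">from collections import Counter


def count_chars_streaming(path, chunk_size=64 * 1024, encoding="utf-8"):
    """Count every character of a text file without loading it all at once."""
    counts = Counter()
    # newline="" keeps carriage returns and line feeds exactly as stored
    with open(path, encoding=encoding, newline="") as handle:
        while True:
            chunk = handle.read(chunk_size)  # up to chunk_size characters
            if not chunk:
                break
            counts.update(chunk)
    return counts</syntaxhighlight>
Because the file is opened in text mode, multi-byte characters that straddle a chunk boundary are still decoded correctly.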
<syntaxhighlight lang="julia">using DataStructures

function letterfreq(file::AbstractString; fltr::Function=(_) -> true)
    sort(Dict(counter(filter(fltr, read(file, String)))))
end

display(letterfreq("src/Letter_frequency.jl"; fltr=isletter))</syntaxhighlight>
DataStructures.OrderedDict{Char,Int64} with 29 entries:
'A' => 1
'C' => 1
'D' => 2
'F' => 1
'L' => 3
'S' => 2
'a' => 9
'b' => 1
'c' => 13
'd' => 5
'e' => 30
'f' => 13
'g' => 4
'h' => 10
'i' => 14
'j' => 1
'k' => 3
'l' => 11
'n' => 15
⋮ => ⋮
<syntaxhighlight lang="k">+(?a;#:'=a:,/0:`)</syntaxhighlight>
Example: The file "hello.txt" contains the string "Hello, world!"
Sort on decreasing occurrences:
<syntaxhighlight lang="scala">// version 1.1.2

import java.io.File

fun main(args: Array<String>) {
    val text = File("input.txt").readText().toLowerCase()
    val letterMap = text.filter { it in 'a'..'z' }.groupBy { it }.toSortedMap()
    for (letter in letterMap) println("${letter.key} = ${letter.value.size}")
    val sum = letterMap.values.sumBy { it.size }
    println("\nTotal letters = $sum")
}</syntaxhighlight>
'input.txt' just contains two pangrams:
The quick brown fox jumps over the lazy dog.
Sphinx of black quartz, judge my vow.
a = 3
b = 2
c = 2
d = 2
e = 4
f = 2
g = 2
h = 3
i = 2
j = 2
k = 2
l = 2
m = 2
n = 2
o = 6
p = 2
q = 2
r = 3
s = 2
t = 3
u = 4
v = 2
w = 2
x = 2
y = 2
z = 2
Total letters = 64
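As a cross-check of the figures above, the same two pangrams can be tallied in a few lines of Python. This is an independent sketch rather than a port of the Kotlin program, and the variable names are illustrative only.
<syntaxhighlight lang="python">import string
from collections import Counter

pangrams = [
    "The quick brown fox jumps over the lazy dog.",
    "Sphinx of black quartz, judge my vow.",
]

# Lowercase everything and keep only the letters a..z
counts = Counter(ch for ch in " ".join(pangrams).lower() if "a" <= ch <= "z")

uses_every_letter = set(counts) == set(string.ascii_lowercase)
total = sum(counts.values())
print(uses_every_letter, total)</syntaxhighlight>
Every letter appears at least twice because each sentence is a pangram on its own, which explains why no count in the listing is below 2.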
<syntaxhighlight lang="ksh">
# Count the occurrences of each character

# main #
typeset -iA freqCnt
while read; do
	for ((i=0; i<${#REPLY}; i++)); do
		(( freqCnt[${REPLY:i:1}]++ ))
	done
done < $0	## Count chars of this code file

for ch in "${!freqCnt[@]}"; do
	[[ ${ch} == ?(\S) ]] && print -- "${ch} ${freqCnt[${ch}]}"
done
</syntaxhighlight>
Counts the characters of the source code file
! 2
" 4
# 19
$ 8
& 2
( 5
) 5
+ 4
- 3
/ 2
0 2
1 1
: 2
; 5
< 2
= 3
? 1
@ 1
A 1
C 6
E 2
L 2
P 2
R 2
S 1
Y 2
[ 5
] 5
a 6
b 1
c 12
d 8
e 18
f 9
h 11
i 12
k 1
l 2
m 1
n 14
o 14
p 2
q 4
r 13
s 5
t 12
u 3
w 1
y 1
{ 7
In this entry we choose to show how lambdatalk can use any existing JavaScript code (say, the JavaScript entry on this page) and build an interface that exposes it as a standard lambdatalk function. Applied to any string, the W.frequency primitive returns a pair structure containing the array of characters and the corresponding array of frequencies.
<syntaxhighlight lang="scheme">
// W.frequency is added to the lambdatalk dictionary via the {script ...} special form
LAMBDATALK.DICT['W.frequency'] = function() {
// 1) simply copied from the rosetta.org #Javascript entry
var frequency = function(txt) {
var cs = txt.split(''),
i = cs.length,
dct = {},
c = '';
while (i--) {
c = cs[i];
dct[c] = (dct[c] || 0) + 1;
var keys = Object.keys(dct);
return keys.map(function (c) { return [c, dct[c]]; });
// 2) then interfaced with lambdatalk
var args = arguments[0].trim().replace( /\s+/g, "␣" );
var output = frequency( args );
for (var a=[], b=[], i=0; i< output.length; i++) {
a.push( output[i][0] );
b.push( output[i][1] );
var pair = "{cons {A.new " + a.join(" ") +
"} {A.new " + b.join(" ") + "}}"
return LAMBDATALK.eval_forms( pair );
{def S3
Not all that Mrs. Bennet, however, with the assistance of her five daughters, could ask on the subject, was sufficient to draw from her husband any satisfactory description of Mr. Bingley. They attacked him in various ways--with barefaced questions, ingenious suppositions, and distant surmises; but he eluded the skill of them all, and they were at last obliged to accept the second-hand intelligence of their neighbour, Lady Lucas. Her report was highly favourable. Sir William had been delighted with him. He was quite young, wonderfully handsome, extremely agreeable, and, to crown the whole, he meant to be at the next assembly with a large party. Nothing could be more delightful! To be fond of dancing was a certain step towards falling in love; and very lively hopes of Mr. Bingley's heart were entertained.
-> S3
{def S3.freq {W.frequency {S3}}}
-> S3.freq
characters: {car {S3.freq}}
-> [!,',,,-,.,;,B,H,L,M,N,S,T,W,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,␣]
frequencies: {cdr {S3.freq}}
-> [1,1,13,3,9,2,3,2,2,3,2,1,2,1,53,13,17,29,82,17,16,36,44,1,3,34,11,41,40,8,2,35,39,55,20,7,17,2,16,132]
<syntaxhighlight lang="langur">
val countLetters = fn(s) {
for[={:}] s2 in split(replace(s, RE/\P{L}/)) {
_for[s2; 0] += 1
val counts = countLetters(readfile("./fuzz.txt"))
writeln join("\n", map(fn(k) { "{{k}}: {{counts[k]}}" }, keys(counts)))
The input contains "fuzzy furry kittens ασδ ξκλ ασδ ξα" (random Greek letters at the end) and the output is as follows.
<pre>f: 2
u: 2
z: 2
y: 2
r: 2
k: 1
i: 1
t: 2
e: 1
n: 1
s: 1
α: 3
σ: 2
δ: 2
ξ: 2
κ: 1
λ: 1
<syntaxhighlight lang="lasso">local(
str = 'Hello world!',
freq = map
// as a loop. arguably quicker than query expression
loop(#str->size) => {
#freq->keys !>> #str->get(loop_count) ?
#freq->insert(#str->get(loop_count) = #str->values->find(#str->get(loop_count))->size)
// or
str = 'Hello world!',
freq = map
// as query expression, less code
with i in #str->values where #freq->keys !>> #i do => {
#freq->insert(#i = #str->values->find(#i)->size)
// output #freq
with elem in #freq->keys do => {^
'"'+#elem+'": '+#freq->find(#elem)+'\r'
=={{header|Liberty BASIC}}==
Un-rem a line to convert to all upper case.
Letter frequency is printed as percentages.
<syntaxhighlight lang="lb">
open "text.txt" for input as #i
txt$ =input$( #i, lof( #i))
This solution counts letters only, which could be changed by altering the pattern argument to 'gmatch' on line 31. It also treats upper and lower case letters as distinct, which could be changed by changing everything to upper or lower case with string.upper() or string.lower() before tallying.
<syntaxhighlight lang="lua">-- Return entire contents of named file
function readFile (filename)
    local file = assert(io.open(filename, "r"))
    local contents = file:read("*all")
    file:close()
    return contents
end

-- Return a closure to keep track of letter counts
function tally ()
    local t = {}

    -- Add x to tally if supplied, return tally list otherwise
    local function count (x)
        if x then
            if t[x] then
                t[x] = t[x] + 1
            else
                t[x] = 1
            end
        else
            return t
        end
    end

    return count
end

-- Main procedure
local letterCount = tally()
for letter in readFile(arg[1] or arg[0]):gmatch("%a") do
    letterCount(letter)
end
for k, v in pairs(letterCount()) do
    print(k, v)
end</syntaxhighlight>
Output from running this script on itself:
<pre>i 24
f 16
R 2
v 2
c 19
k 4
M 1
s 14
d 17
l 40
e 61
t 54
m 4
r 34
u 18
C 3
o 32
A 1
g 3
x 7
F 2
y 4
w 1
n 42
h 4
a 25
p 7</pre>
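The case-folding variation mentioned in the description above (lowercasing everything before tallying) can be sketched as follows. The example is in Python rather than Lua, and the function name and the <code>fold_case</code> flag are hypothetical names used only for this illustration.
<syntaxhighlight lang="python">from collections import Counter


def tally_letters(text, fold_case=False):
    """Tally ASCII letters; optionally treat upper and lower case as the same."""
    if fold_case:
        text = text.lower()
    # Restrict to ASCII letters, roughly what %a matches in the default C locale
    return Counter(ch for ch in text if ch.isascii() and ch.isalpha())


print(tally_letters("Aa b!"))
print(tally_letters("Aa b!", fold_case=True))</syntaxhighlight>
With <code>fold_case</code> left off, <code>A</code> and <code>a</code> are separate keys, matching the behaviour of the Lua script as written.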
=={{header|M2000 Interpreter}}==
<syntaxhighlight lang="m2000 interpreter">
document file1$={Open a text file and count the occurrences of each letter.
Some of these programs count all characters (including punctuation), but some only count letters A to Z
const Ansi=3, nl$=chr$(13)+chr$(10), Console=-2
save.doc file1$, "checkdoc.txt", Ansi
open "checkdoc.txt" for input as F
buffer onechar as byte
dim m(65 to 90)
while not eof(#F)
get #F, onechar
if a$ ~ "[A-Za-z]" then
end if
end while
close #F
document Export$
for i=65 to 90
if m(i)>0 then Export$=format$("{0} - {1:2:4}%",chr$(i),m(i)/m*100)+nl$
print #Console, Export$
clipboard Export$
<pre style="height:30ex;overflow:scroll">
A - 6,87%
B - 0,76%
C - 8,40%
D - 1,53%
E - 12,2%
F - 2,29%
G - 1,53%
H - 3,05%
I - 3,05%
L - 5,34%
M - 2,29%
N - 8,40%
O - 9,92%
P - 2,29%
R - 6,11%
S - 5,34%
T - 12,2%
U - 6,11%
X - 0,76%
Y - 0,76%
Z - 0,76%
</pre>
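Each percentage in a listing like the one above is a letter's count divided by the total number of letters, times 100. The following Python sketch shows that calculation; it is not the M2000 program, and the function name is made up for this example.
<syntaxhighlight lang="python">from collections import Counter


def letter_percentages(text):
    """Map each letter A..Z that occurs to its share of all letters, in percent."""
    counts = Counter(ch for ch in text.upper() if "A" <= ch <= "Z")
    total = sum(counts.values())
    if total == 0:
        return {}  # no letters at all, so there is nothing to divide by
    return {letter: 100.0 * n / total for letter, n in sorted(counts.items())}


for letter, share in letter_percentages("Open a text file").items():
    print(f"{letter} - {share:.2f}%")</syntaxhighlight>
Only letters enter the total, so the percentages sum to 100 regardless of how much punctuation or whitespace the text contains.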
<syntaxhighlight lang="maple">StringTools:-CharacterFrequencies(readbytes("File.txt",infinity,TEXT))</syntaxhighlight>
=={{header|Mathematica}} / {{header|Wolfram Language}}==
<syntaxhighlight lang="mathematica">Tally[Characters[Import["file.txt","Text"]]]</syntaxhighlight>
=={{header|MATLAB}} / {{header|Octave}}==
<syntaxhighlight lang="matlab">function u = letter_frequency(t)
if ischar(t)
t = abs(t);
A = sparse(t+1,1,1,256,1);
<syntaxhighlight lang="nanoquery">// define a list to hold characters and amounts
characters = list()
amounts = list()
// define the alphabet as a string to check only letters and numbers
alpha = "abcdefghijklmnopqrstuvwxyz0123456789"
// get the filename as an argument
fname = args[len(args) - 1]
// read the entire file into a string
contents = new(Nanoquery.IO.File, fname).readAll()
// loop through all the characters in the array
for i in range(0, len(contents) - 1)
// get the character to check
toCheck = str(contents[i]).toLowerCase()
// check if the current character is in the array
if ((alpha .contains. toCheck) && (characters .contains. toCheck))
// if it's there, increment its amount
index = characters[toCheck]
amounts[index] = amounts[index] + 1
if (alpha .contains. toCheck)
// if it's not, add it
append characters toCheck
append amounts 0
end if
end for
// output the amounts
println format("%-20s %s", "Character", "Amount")
println "=" * 30
for i in range(0, len(characters) - 1)
println format("%-20s %d", characters[i], amounts[i])
end for</syntaxhighlight>
<pre>$ java -jar ../nanoquery-2.3_1462.jar -b letterfreq.nq sherlock-holmes.txt
Character Amount
p 7239
r 25708
o 34866
j 544
e 54972
c 11118
t 40545
g 8311
u 13604
n 29701
b 6645
s 27941
h 29588
a 36146
d 19064
v 4567
f 9362
l 17633
k 3684
m 12150
y 9776
i 31240
w 11554
2 45
9 23
0 104
1 127
6 30
8 46
z 152
x 578
q 437
5 27
4 29
7 25
3 25</pre>
<syntaxhighlight lang="netrexx">/* NetRexx ************************************************************
* 22.05.2013 Walter Pachl translated from REXX
options replace format comments java crossref symbols nobinary
parse arg dsn .
if dsn = '' then
dsn = 'test.txt'
totChars=0 /*count of the total num of chars*/
totLetters=0 /*count of the total num letters.*/
indent=' '.left(20) /*used for indentation of output.*/
lines = scanFile(dsn)
loop l_ = 1 to lines[0]
line = lines[l_]
Say '>'line'<' line.length /* that's in test.txt */
Parse lrx leftx rightx
Say ' 'leftx
Say ' 'rightx
loop k=1 for line.length() /*loop over characters */
totChars=totChars+1 /*Increment total number of chars*/
c=line.substr(k,1) /*get character number k */
cnt[c]=cnt[c]+1 /*increment the character's count*/
end l_
w=totChars.length /*used for right-aligning counts.*/
say 'file -----' dsn "----- has" lines[0] 'records.'
say 'file -----' dsn "----- has" totChars 'characters.'
Loop L=0 to 255 /* display nonzero letter counts */
c=l.d2c /* the character in question */
if cnt[c]>0 & c.datatype('M')>0 Then Do /* was found in the file */
/* and is a latin letter */
say indent "(Latin) letter " c 'count:' cnt[c].right(w) /* tell */
totLetters=totLetters+cnt[c] /* increment number of letters */
say 'file -----' dsn "----- has" totLetters '(Latin) letters.'
say ' other charactes follow'
loop m=0 to 255 /* now for non-letters */
c=m.d2c /* the character in question */
y=c.c2x /* the hex representation */
if cnt[c]>0 & c.datatype('M')=0 Then Do /* was found in the file */
/* and is not a latin letter */
other=other+cnt[c] /* increment count */
_=cnt[c].right(w) /* prepare output of count */
select /*make the character viewable. */
when c<<' ' | m==255 then say indent "'"y"'x character count:" _
when c==' ' then say indent "blank character count:" _
otherwise say indent " " c 'character count:' _
say 'file -----' dsn "----- has" other 'other characters.'
say 'file -----' dsn "----- has" totLetters 'letters.'
-- Read a file and return contents as a Rexx indexed string
method scanFile(dsn) public static returns Rexx
fileLines = ''
inFile = File(dsn)
inFileScanner = Scanner(inFile)
loop l_ = 1 while inFileScanner.hasNext()
fileLines[0] = l_
fileLines[l_] = inFileScanner.nextLine()
end l_
catch ex = FileNotFoundException
return fileLines</syntaxhighlight>
<syntaxhighlight lang="nim">import tables, os

var t = initCountTable[char]()
for l in paramStr(1).lines:
  for c in l:
    t.inc(c)
echo t</syntaxhighlight>
<syntaxhighlight lang="objeck">
use IO;
<syntaxhighlight lang="objc">#import <Foundation/Foundation.h>

int main (int argc, const char *argv[]) {
  @autoreleasepool {

    NSData *data = [NSData dataWithContentsOfFile:@(argv[1])];
    NSString *string = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
    NSCountedSet *countedSet = [[NSCountedSet alloc] init];
    NSUInteger len = [string length];
    for (NSUInteger i = 0; i < len; i++) {
      unichar c = [string characterAtIndex:i];
      if ([[NSCharacterSet letterCharacterSet] characterIsMember:c])
        [countedSet addObject:@(c)];
    }
    for (NSNumber *chr in countedSet) {
      NSLog(@"%C => %lu", (unichar)[chr integerValue], [countedSet countForObject:chr]);
    }

  }
  return 0;
}</syntaxhighlight>
We open a text file and compute the letter frequency. Characters other than [a-z] and [A-Z] are ignored, and upper case letters are converted to lower case before the frequencies are computed.
<syntaxhighlight lang="ocaml">let () =
  let ic = open_in Sys.argv.(1) in
  let base = int_of_char 'a' in
  (* ... *)
  for i=0 to 25 do
    Printf.printf "%c -> %d\n" (char_of_int(i + base)) arr.(i)
  done</syntaxhighlight>
If we want to count all the characters in a UTF-8 file, we must use an external library, for example Batteries. The following function takes as input a string containing the path to the file, and prints every character together with its frequency on standard output, ordered by increasing frequency.
<syntaxhighlight lang="ocaml">
open Batteries
let frequency file =
let freq = Hashtbl.create 52 in
File.with_file_in file
(Enum.iter (fun c -> Hashtbl.modify_def 1 c succ freq) % Text.chars_of);
List.iter (fun (k,v) -> Text.write_text stdout k;
Printf.printf " %d\n" v)
@@ List.sort (fun (_,v) (_,v') -> compare v v')
  @@ Hashtbl.fold (fun k v l -> (Text.of_uchar k,v) :: l) freq []
</syntaxhighlight>
<syntaxhighlight lang="scheme">
(define source (bytes->string (file->bytestream "letter_frequency.scm"))) ; utf-8
(define dict (lfold (lambda (ff char)
(put ff char (+ 1 (get ff char 0))))
(str-iter source)))
; that's all.
; just print the dictionary in human readable format:
(for-each (lambda (kv)
(let* ((key value kv))
(case key
(display "NEWLINE"))
(display "TAB"))
(display "SPACE"))
((< key #\space) => (lambda (_)
(display "char ") (display key)))
(display (string key))))
(display " --> ")
(display value))
(ff->alist dict))
<pre>NEWLINE --> 27
SPACE --> 374
" --> 12
# --> 5
' --> 1
( --> 37
) --> 37
* --> 1
+ --> 1
- --> 8
. --> 3
/ --> 4
0 --> 1
1 --> 1
8 --> 1
: --> 2
; --> 4
< --> 1
= --> 1
> --> 5
A --> 2
B --> 1
C --> 1
E --> 3
I --> 1
L --> 2
N --> 2
O --> 1
P --> 1
S --> 1
T --> 1
W --> 1
\ --> 4
_ --> 3
a --> 35
b --> 7
c --> 17
d --> 19
e --> 41
f --> 17
g --> 4
h --> 9
i --> 25
j --> 1
k --> 8
l --> 25
m --> 7
n --> 13
o --> 9
p --> 14
q --> 2
r --> 23
s --> 24
t --> 31
u --> 10
v --> 4
w --> 2
y --> 18
{ --> 1
} --> 1
<syntaxhighlight lang="oxygenbasic">
indexbase 0
sys a,e,i,c[255]
string s=getfile "t.txt"
e=len s
for i=1 to e
pr="Char Frequencies" cr cr
for i=32 to 255
pr+=chr(i) chr(9) c(i) cr
print pr
'putfile "CharCount.txt",pr
<syntaxhighlight lang="parigp">v=vector(26);
{{works with|Extended Pascal}}
<syntaxhighlight lang="pascal">program letterFrequency(input, output);
var
	chart: array[char] of 0..maxInt value [otherwise 0];
	c: char;
begin
	{ parameter-less EOF checks for EOF(input) }
	while not EOF do
	begin
		read(c);
		chart[c] := chart[c] + 1
	end;
	
	{ now, chart[someLetter] gives you the letter’s frequency }
end.</syntaxhighlight>
A variant that reads a named file and prints every non-zero count:
<syntaxhighlight lang="pascal">program LetterFrequency;
var
  textFile: text;
  character: char;
  counter: array[0..255] of integer;
  i: integer;
begin
  for i := low(counter) to high(counter) do
    counter[i] := 0;
  assign(textFile, 'a_text_file.txt');
  reset(textFile);
  while not eof(textFile) do
  begin
    while not eoln(textFile) do
    begin
      read(textFile, character);
      inc(counter[ord(character)]);
    end;
    readln(textFile);
  end;
  for i := low(counter) to high(counter) do
    if counter[i] > 0 then
      writeln(char(i), ': ', counter[i]);
end.</syntaxhighlight>
<pre>>: ./LetterFrequency
3: 2
a: 4
d: 3
e: 3
f: 3
g: 2
q: 1
r: 4
s: 3
t: 2
w: 2</pre>
Counts letters in files given on command line or piped to stdin. Case insensitive.
<syntaxhighlight lang="perl">while (<>) { $cnt{lc chop}++ while length }
print "$_: ", $cnt{$_}//0, "\n" for 'a' .. 'z';</syntaxhighlight>
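The "files on the command line, or else standard input" convention used above also exists in Python's standard <code>fileinput</code> module. The following sketch assumes that module and ASCII letters only; the function name <code>count_letters</code> is illustrative, and this is not a translation of the Perl one-liner.
<syntaxhighlight lang="python">import fileinput
import string
from collections import Counter


def count_letters(files=None):
    """Count a..z case-insensitively over the given files.

    With files=None, fileinput falls back to the paths in sys.argv[1:],
    or to standard input when no paths were given.
    """
    counts = Counter()
    with fileinput.input(files=files) as lines:
        for line in lines:
            counts.update(ch for ch in line.lower() if ch in string.ascii_lowercase)
    return counts


# Typical use from a script:
#     counts = count_letters()
#     for letter in string.ascii_lowercase:
#         print(f"{letter}: {counts[letter]}")</syntaxhighlight>
As in the Perl version, letters that never occur are still printed, with a count of 0, because the counter returns 0 for missing keys.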
=={{header|Phix}}==
<del>Counts own source or supplied filename</del> Now counts words in unixdict.txt of 17 letters or more (to make it js compatible)
<!--<syntaxhighlight lang="phix">(phixonline)-->
<span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span>
<span style="color: #004080;">sequence</span> <span style="color: #000000;">lc</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">repeat</span><span style="color: #0000FF;">(</span><span style="color: #000000;">0</span><span style="color: #0000FF;">,</span><span style="color: #000000;">#7E</span><span style="color: #0000FF;">)</span>
<span style="color: #000080;font-style:italic;">--string text = get_text(command_line()[$])</span>
<span style="color: #004080;">string</span> <span style="color: #000000;">text</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">join</span><span style="color: #0000FF;">(</span><span style="color: #7060A8;">unix_dict</span><span style="color: #0000FF;">(</span><span style="color: #000000;">17</span><span style="color: #0000FF;">))</span>
<span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">text</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">ch</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">text</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;">=-</span><span style="color: #000000;">1</span> <span style="color: #008080;">then</span> <span style="color: #008080;">exit</span> <span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;">>=</span><span style="color: #008000;">' '</span> <span style="color: #008080;">and</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;"><</span><span style="color: #000000;">#7F</span> <span style="color: #008080;">then</span>
<span style="color: #000000;">lc</span><span style="color: #0000FF;">[</span><span style="color: #000000;">ch</span><span style="color: #0000FF;">]</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">count</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">0</span>
<span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #008000;">' '</span> <span style="color: #008080;">to</span> <span style="color: #000000;">#7E</span> <span style="color: #008080;">do</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">lc</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]!=</span><span style="color: #000000;">0</span> <span style="color: #008080;">then</span>
<span style="color: #000000;">count</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">mod</span><span style="color: #0000FF;">(</span><span style="color: #000000;">count</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #000000;">10</span><span style="color: #0000FF;">)</span>
<span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"'%c': %-2d%s"</span><span style="color: #0000FF;">,{</span><span style="color: #000000;">i</span><span style="color: #0000FF;">,</span><span style="color: #000000;">lc</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">],</span><span style="color: #008080;">iff</span><span style="color: #0000FF;">(</span><span style="color: #000000;">count</span><span style="color: #0000FF;">=</span><span style="color: #000000;">0</span><span style="color: #0000FF;">?</span><span style="color: #008000;">"\n"</span><span style="color: #0000FF;">:</span><span style="color: #008000;">" "</span><span style="color: #0000FF;">)})</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
' ': 14 'a': 15 'b': 2 'c': 20 'd': 7 'e': 33 'g': 7 'h': 16 'i': 22 'l': 16
'm': 5 'n': 14 'o': 27 'p': 15 'r': 26 's': 15 't': 24 'u': 6 'v': 1 'y': 4
<syntaxhighlight lang="phixmonti">0 255 repeat var ascCodes
"unixdict.txt" "r" fopen var file
file 0 < not
file fgets
dup 0 < not
len 1 swap 2 tolist
var i
i get ascCodes over get 1 + rot set var ascCodes
ascCodes len
var i
i get
i tochar print " = " print i get print nl
file fclose
<syntaxhighlight lang="php"><?php
Sorting on the frequency (decreasing).
<syntaxhighlight lang="picat">go =>
% removing '\n' first
Chars = delete_all(read_file_chars("unixdict.txt"),'\n'),
M = letter_freq(Chars),
% Get the letter frequency
letter_freq(S) = Map =>
Map = new_map(),
foreach(C in S)
% Different sorting function on maps
sort_map(Map,values) = [K=V:_=(K=V) in sort([V=(K=V): K=V in Map])].
sort_map(Map,keys) = sort([KV : KV in Map]).
sort_map(Map) = sort_map(Map,keys).</syntaxhighlight>
<pre>[e = 20144,a = 16421,i = 13980,r = 13436,t = 12836,o = 12738,n = 12097,s = 10210,
l = 10061,c = 8216,u = 6489,m = 5828,d = 5799,p = 5516,h = 5208,g = 4129,b = 4115,
y = 3633,f = 2662,w = 1968,k = 1925,v = 1902,x = 617,z = 433,j = 430,q = 378,
' = 105,. = 6,& = 6,1 = 2,9 = 1,8 = 1,7 = 1,6 = 1,5 = 1,4 = 1,3 = 1,2 = 1,0 = 1]</pre>
<syntaxhighlight lang="picolisp">(let Freq NIL
(in "file.txt"
(while (char) (accu 'Freq @ 1)) )
(sort Freq) )</syntaxhighlight>
For a "file.txt":
<pre>-> (("^J" . 2) ("a" . 1) ("b" . 1) ("c" . 2) ("d" . 2) ("e" . 1) ("f" . 1))</pre>
<syntaxhighlight lang="pike">
string all = Stdio.read_file("README.md");
mapping res = ([]);
foreach(all/1, string char)
write("%O\n", res);
([ /* 26 elements */
"\n": 2,
" ": 12,
".": 2,
"/": 3,
":": 1,
"P": 1,
"T": 1,
"a": 5,
"c": 1,
"d": 2,
"e": 10,
"f": 3,
"g": 1,
"h": 2,
"i": 5,
"k": 1,
"l": 4,
"m": 3,
"n": 1,
"o": 7,
"p": 4,
"r": 4,
"s": 10,
"t": 5,
"u": 2,
"x": 2
<syntaxhighlight lang="pli">
frequencies: procedure options (main);
declare tallies(26) fixed binary static initial ((26) 0);
put skip list (substr(alphabet, i, 1), tallies(i));
end frequencies;</syntaxhighlight>
Letter Frequency
Y 1
Z 1
<syntaxhighlight lang="powershell">
function frequency ($string) {
$arr = $string.ToUpper().ToCharArray() |where{$_ -match '[A-KL-Z]'}
$n = $arr.count
$arr | group | foreach{
[pscustomobject]@{letter = "$($_.name)"; frequency = "$([math]::round($($_.Count/$n),5))"; count = "$($_.count)"}
} | sort letter
$file = "$($MyInvocation.MyCommand.Name )" #Put the name of your file here
frequency $(get-content $file -Raw)
letter frequency count
------ --------- -----
A 0.06809 16
B 0.00426 1
C 0.06809 16
D 0.00851 2
E 0.11064 26
F 0.0383 9
G 0.01702 4
H 0.02979 7
I 0.03404 8
J 0.00426 1
K 0.00426 1
L 0.02553 6
M 0.04255 10
N 0.09362 22
O 0.08085 19
P 0.02128 5
Q 0.01277 3
R 0.10638 25
S 0.02128 5
T 0.10213 24
U 0.05957 14
V 0.00426 1
W 0.00851 2
Y 0.02979 7
Z 0.00426 1
Works with SWI-Prolog. <br>
Only alphabetic codes are counted, after conversion to uppercase. <br>
Uses '''packlist/2''' defined at [[Run-length encoding#Prolog]]. <br>
<syntaxhighlight lang="prolog">frequency(File) :-
read_file_to_codes(File, Code, []),
run(Var,[Other|RRest], [1,Var],[Other|RRest]):-
{{out}} for this file
<pre>Number of A : 63
Number of B : 7
true .
Alphabetic codes are converted to uppercase before being used and no other codes are used as part of the calculations. <br>
<syntaxhighlight lang="purebasic">Procedure countLetters(Array letterCounts(1), textLine.s)
;counts only letters A -> Z, uses index 0 of letterCounts() to keep a total of all counts
Protected i, lineLength = Len(textLine), letter
Print(#CRLF$ + #CRLF$ + "Press ENTER to exit"): Input()
Sample output:
<pre>File: D:\_T\Text\dictionary.txt
====Using collections.Counter====
{{works with|Python|2.7+ and 3.1+}}
<syntaxhighlight lang="python">import collections, sys
def filecharcount(openfile):
    return sorted(collections.Counter(c for l in openfile for c in l).items())
f = open(sys.argv[1])
print(filecharcount(f))</syntaxhighlight>
====As a fold====
Character counting can be conveniently expressed in terms of fold/reduce. See the example below, which also generates column-wrapped output:
{{Works with|Python|3}}
<syntaxhighlight lang="python">'''Character counting as a fold'''
from functools import reduce
from itertools import repeat
from os.path import expanduser
# charCounts :: String -> Dict Char Int
def charCounts(s):
'''A dictionary of
(character, frequency) mappings
def tally(dct, c):
dct[c] = 1 + dct[c] if c in dct else 1
return dct
return reduce(tally, list(s), {})
# TEST ----------------------------------------------------
# main :: IO ()
def main():
'''Listing in descending order of frequency.'''
'Descending order of frequency:\n'
# GENERIC -------------------------------------------------
# chunksOf :: Int -> [a] -> [[a]]
def chunksOf(n):
'''A series of lists of length n,
subdividing the contents of xs.
Where the length of xs is not evenly divisible,
the final list will be shorter than n.'''
return lambda xs: reduce(
lambda a, i: a + [xs[i:n + i]],
range(0, len(xs), n), []
) if 0 < n else []
# compose (<<<) :: (b -> c) -> (a -> b) -> a -> c
def compose(g):
'''Right to left function composition.'''
return lambda f: lambda x: g(f(x))
# fst :: (a, b) -> a
def fst(tpl):
'''First member of a pair.'''
return tpl[0]
# readFile :: FilePath -> IO String
def readFile(fp):
'''The contents of any file at the path
derived by expanding any ~ in fp.'''
with open(expanduser(fp), 'r', encoding='utf-8') as f:
return f.read()
# paddedMatrix :: a -> [[a]] -> [[a]]
def paddedMatrix(v):
'''A list of rows padded to equal length
(where needed) with instances of the value v.'''
def go(rows):
return paddedRows(
len(max(rows, key=len))
return lambda rows: go(rows) if rows else []
# paddedRows :: Int -> a -> [[a]] -> [[a]]
def paddedRows(n):
'''A list of rows padded (but never truncated)
to length n with copies of value v.'''
def go(v, xs):
def pad(x):
d = n - len(x)
return (x + list(repeat(v, d))) if 0 < d else x
return list(map(pad, xs))
return lambda v: lambda xs: go(v, xs) if xs else []
# showColumns :: Int -> [String] -> String
def showColumns(n):
'''A column-wrapped string
derived from a list of rows.'''
def go(xs):
def fit(col):
w = len(max(col, key=len))
def pad(x):
return x.ljust(4 + w, ' ')
return ''.join(map(pad, col)).rstrip()
q, r = divmod(len(xs), n)
return '\n'.join(map(
chunksOf(q + int(bool(r)))(xs)
return lambda xs: go(xs)
# snd :: (a, b) -> b
def snd(tpl):
'''Second member of a pair.'''
return tpl[1]
# stet :: a -> a
def stet(x):
'''The identity function.
The usual 'id' is reserved in Python.'''
return x
# swap :: (a, b) -> (b, a)
def swap(tpl):
'''The swapped components of a pair.'''
return (tpl[1], tpl[0])
# tabulated :: String -> (a -> String) ->
# (b -> String) ->
# Int ->
# (a -> b) -> [a] -> String
def tabulated(s):
'''Heading -> x display function -> fx display function ->
number of columns -> f -> value list -> tabular string.'''
def go(xShow, fxShow, intCols, f, xs):
def mxw(fshow, g):
return max(map(compose(len)(fshow), map(g, xs)))
w = mxw(xShow, lambda x: x)
fw = mxw(fxShow, f)
return s + '\n' + showColumns(intCols)([
xShow(x).rjust(w, ' ') + ' -> ' + (
fxShow(f(x)).rjust(fw, ' ')
for x in xs
return lambda xShow: lambda fxShow: lambda nCols: (
lambda f: lambda xs: go(
xShow, fxShow, nCols, f, xs
# MAIN ---
if __name__ == '__main__':
<pre>Descending order of frequency:
' ' -> 568 ')' -> 62 'v' -> 25 'w' -> 7 '5' -> 3
'\t' -> 382 '(' -> 62 '1' -> 24 'k' -> 7 '4' -> 3
'e' -> 274 'd' -> 60 'G' -> 22 '9' -> 6 '+' -> 3
'n' -> 233 'g' -> 59 ']' -> 17 'S' -> 5 '¬' -> 2
'\n' -> 228 'u' -> 58 '[' -> 17 'R' -> 5 '=' -> 2
't' -> 204 '|' -> 54 'λ' -> 16 'M' -> 5 '.' -> 2
's' -> 198 'x' -> 53 '2' -> 15 'F' -> 5 'L' -> 1
'-' -> 178 'm' -> 52 'N' -> 11 '<' -> 5 'C' -> 1
'i' -> 145 'c' -> 52 '}' -> 10 '6' -> 5 'A' -> 1
'o' -> 126 'h' -> 47 '{' -> 10 'z' -> 4 '3' -> 1
'f' -> 100 ':' -> 47 'T' -> 10 "'" -> 4 '&' -> 1
'r' -> 96 ',' -> 38 'I' -> 10 '^' -> 3 '$' -> 1
'a' -> 86 'b' -> 32 '0' -> 10 'E' -> 3
'l' -> 70 'y' -> 31 '"' -> 10 '8' -> 3
'p' -> 68 '>' -> 28 'J' -> 9 '7' -> 3</pre>
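The core of the fold can be shown in far less space. The following is only a minimal sketch of the same idea, not the entry's own code: the names <code>char_counts</code> and <code>by_frequency</code> are hypothetical, and breaking ties by character order is an assumption made here so that the result is deterministic.

```python
from functools import reduce


def char_counts(text):
    """Fold text into a {character: count} dictionary."""
    def tally(counts, ch):
        # Update the accumulator and return it so that reduce
        # can hand it to the next step of the fold.
        counts[ch] = counts.get(ch, 0) + 1
        return counts
    return reduce(tally, text, {})


def by_frequency(counts):
    """(character, count) pairs, most frequent first; ties by character."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == '__main__':
    # In-memory sample; a file's contents could be passed in the same way.
    for ch, n in by_frequency(char_counts('abaabdc')):
        print(repr(ch), '->', n)
```

Passing the accumulator dictionary through <code>reduce</code> keeps the tally step a plain two-argument function, which is all that a fold requires.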
====Without using collections.Counter====
<syntaxhighlight lang="python">import string
if hasattr(string, 'ascii_lowercase'):
letters = string.ascii_lowercase # Python 2.2 and later
def countletters(file_handle):
"""Traverse a file and compute the number of occurrences of each letter
return results as a simple 26 element list of integers."""
results = [0] * len(letters)
for line in file_handle:
char = char.lower()
if char in letters:
results[ord(char) - offset] += 1
# Ordinal of any lowercase ASCII letter minus ordinal of 'a' -> 0..25
return results
lettercounts = countletters(sourcedata)
for i in xrange(len(lettercounts)):
print "%s=%d" % (chr(i + ord('a')), lettercounts[i]),</syntaxhighlight>
This example defines the function and provides a sample usage. The ''if ... __main__...'' line allows it to be cleanly imported into any other Python code while also allowing it to function as a standalone script. (A very common Python idiom).
Using a numerically indexed array (list) for this is artificial and clutters the code somewhat.
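For readers on Python 3, the list-indexed approach and the ''if ... __main__...'' idiom described above might look roughly as follows. This is a sketch under stated assumptions rather than the original entry: the name <code>count_letters</code> and the usage message are hypothetical, and the file name is taken from the first command-line argument.

```python
import string
import sys


def count_letters(lines):
    """Return a 26-element list of counts for the letters a..z."""
    results = [0] * 26
    offset = ord('a')
    for line in lines:
        for char in line.lower():
            if char in string.ascii_lowercase:
                # ord(char) - ord('a') maps 'a'..'z' onto 0..25
                results[ord(char) - offset] += 1
    return results


if __name__ == '__main__':
    # This branch runs only when the file is executed as a script,
    # so importing the module elsewhere has no side effects.
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as source:
            counts = count_letters(source)
        print(' '.join('%s=%d' % (chr(i + ord('a')), n)
                       for i, n in enumerate(counts)))
    else:
        print('usage: python letters.py FILE')  # hypothetical script name
```

Because <code>count_letters</code> accepts any iterable of strings, it can be exercised with a plain list of lines as well as with an open file.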
====Using defaultdict====
{{works with|Python|2.5+ and 3.x}}
<syntaxhighlight lang="python">...
from collections import defaultdict
def countletters(file_handle):
c = char.lower()
results[c] += 1
return results</syntaxhighlight>
This eliminates the ungainly fiddling with ordinal values and offsets in the countletters function of the previous example. More importantly, it allows the results to be printed more simply using:
<syntaxhighlight lang="python">lettercounts = countletters(sourcedata)
for letter,count in lettercounts.iteritems():
    print "%s=%s" % (letter, count),</syntaxhighlight>
This again eliminates all fussing with the details of converting letters into list indices.
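The snippets above use Python 2 syntax (<code>print</code> statement, <code>iteritems</code>). A self-contained Python 3 sketch of the same defaultdict technique is given below. It assumes that only the ASCII letters a to z should be counted; the name <code>count_letters</code> and the in-memory sample are hypothetical stand-ins for the entry's function and an open file.

```python
from collections import defaultdict


def count_letters(lines):
    """Map each lowercase ASCII letter to its number of occurrences."""
    results = defaultdict(int)  # missing keys start at 0
    for line in lines:
        for c in line.lower():
            if 'a' <= c <= 'z':
                results[c] += 1
    return results


if __name__ == '__main__':
    sample = ['Hello\n', 'World\n']  # stand-in for an open file
    # items() replaces the Python 2 iteritems(); sorting gives a stable order.
    for letter, count in sorted(count_letters(sample).items()):
        print('%s=%s' % (letter, count), end=' ')
    print()
```

Since the dictionary is keyed by the letters themselves, no conversion between characters and list indices is needed when printing.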
<syntaxhighlight lang="quackery"> [ [] 26 times [ 0 join ] ] is makecountnest ( --> [ )
[ char A char Z 1+ within ] is ischar ( c --> b )
[ char A -
2dup peek 1+ unrot poke ] is tallychar ( [ c --> [ )
[ makecountnest swap
[ upper dup ischar iff
else drop ] ] is countchars ( $ --> [ )
[ say "Letter count:" cr
[ say " "
i^ char A + emit
say ":" echo
cr ] ] is echocount ( [ --> )
[ sharefile 0 = if
[ $ " not found."
join fail ]
echocount ] is fileletters ( $ --> )</syntaxhighlight>
'''Testing in Quackery shell:'''
<pre>O> $ "quackery.py" fileletters ( i.e. the Quackery source file )
Letter count:
Stack empty.
/O> $ "nosuchfile.txt" fileletters
Problem: nosuchfile.txt not found.
Quackery Stack:
Return stack: {[...] 0} {quackery 1} {[...] 11} {shell 5} {quackery 1} {[...] 1} {fileletters 4}</pre>
=={{header|Quick Basic/QBASIC/PDS 7.1/VB-DOS}}==
This version counts valid letters from A to Z (including Ñ in the Spanish alphabet) or all characters in a file. It takes accented vowels into account. It runs in QB, QBASIC, PDS 7.1 and VB-DOS as is.
<syntaxhighlight lang="vb">
' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
' Program CountCar '
' '
' This program counts how many distinct characters '
' have a text file specified by the user. '
' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
' OPTION EXPLICIT ' Remove comment in VB-DOS
' Register
TYPE regChar
Character AS STRING * 3
' Var
CONST LngReg = 256
CONST Letters = 1
'------Main program cycle
' Initialize variables
strDate = DATE$
strTime = TIME$
PRINT "This program counts letters or characters in a text file."
INPUT "File to open: ", strFile
IF LOF(iFile) > 0 THEN
PRINT "Count: 1) Letters 2) Characters (1 or 2)";
strKey = INKEY$
LOOP UNTIL strKey = "1" OR strKey = "2"
PRINT ". Option selected: "; strKey
iCL = VAL(strKey)
sTime = TIMER
iP = POS(0)
lHowMany = LOF(iFile)
strTxt = SPACE$(LngReg)
IF iCL = Letters THEN
iMaxIdx = 26
iMaxIdx = 255
IF iMaxIdx <> iPMI THEN
iPMI = iMaxIdx
REDIM rChar(0 TO iMaxIdx) AS regChar
FOR i = 0 TO iMaxIdx
IF iCL = Letters THEN
strTxt = CHR$(i + 65)
IF i = 26 THEN strTxt = CHR$(165)
CASE 0: strTxt = "nul"
CASE 7: strTxt = "bel"
CASE 9: strTxt = "tab"
CASE 10: strTxt = "lf"
CASE 11: strTxt = "vt"
CASE 12: strTxt = "ff"
CASE 13: strTxt = "cr"
CASE 28: strTxt = "fs"
CASE 29: strTxt = "gs"
CASE 30: strTxt = "rs"
CASE 31: strTxt = "us"
CASE 32: strTxt = "sp"
CASE ELSE: strTxt = CHR$(i)
rChar(i).Character = strTxt
FOR i = 0 TO iMaxIdx
rChar(i).Count = 0
PRINT "Looking for ";
IF iCL = Letters THEN PRINT "letters."; ELSE PRINT "characters.";
PRINT " File is"; STR$(lHowMany); " in size. Working"; : COLOR 23: PRINT "..."; : COLOR (7)
DO WHILE LOC(iFile) < LOF(iFile)
IF LOC(iFile) + LngReg > LOF(iFile) THEN
strTxt = SPACE$(LOF(iFile) - LOC(iFile))
GET #iFile, , strTxt
FOR i = 1 TO LEN(strTxt)
IF iCL = Letters THEN
iChar = ASC(UCASE$(MID$(strTxt, i, 1)))
CASE 164: iChar = 165
CASE 160: iChar = 65
CASE 130, 144: iChar = 69
CASE 161: iChar = 73
CASE 162: iChar = 79
CASE 163, 129: iChar = 85
iChar = iChar - 65
' Validates if iChar is a letter
IF iChar >= 0 AND iChar <= 25 THEN
rChar(iChar).Count = rChar(iChar).Count + 1
ELSEIF iChar = 100 THEN ' CHR$(165)
rChar(iMaxIdx).Count = rChar(iMaxIdx).Count + 1
iChar = ASC(MID$(strTxt, i, 1))
rChar(iChar).Count = rChar(iChar).Count + 1
CLOSE #iFile
' Show the characters found
lMUC = 0
iMUI = 0
lLUC = 2147483647
iLUI = 0
iPrint = FALSE
lTotChars = 0
iCountChars = 0
iPause = FALSE
IF iCL = Letters THEN PRINT "Letters found: "; ELSE PRINT "Characters found: ";
FOR i = 0 TO iMaxIdx
' Most Used Character
IF lMUC < rChar(i).Count THEN
lMUC = rChar(i).Count
iMUI = i
' Print character
IF rChar(i).Count > 0 THEN
strTxt = ""
IF iPrint THEN strTxt = ", " ELSE iPrint = TRUE
strTxt = strTxt + LTRIM$(RTRIM$(rChar(i).Character))
strTxt = strTxt + "=" + LTRIM$(STR$(rChar(i).Count))
iP = POS(0)
IF iP + LEN(strTxt) + 1 >= 80 AND iPrint THEN
iPause = TRUE
PRINT "Press a key to continue..."
strKey = INKEY$
LOOP UNTIL strKey <> ""
strTxt = MID$(strTxt, 3)
PRINT strTxt;
lTotChars = lTotChars + rChar(i).Count
iCountChars = iCountChars + 1
' Least Used Character
IF lLUC > rChar(i).Count THEN
lLUC = rChar(i).Count
iLUI = i
' Shows the summary
PRINT "File analyzed....................: "; strFile
PRINT "Looked for.......................: "; : IF iCL = Letters THEN PRINT "Letters" ELSE PRINT "Characters"
PRINT "Total characters in file.........:"; lHowMany
PRINT "Total characters counted.........:"; lTotChars
IF iCL = Letters THEN PRINT "Characters discarded on count....:"; lHowMany - lTotChars
PRINT "Distinct characters found in file:"; iCountChars; "of"; iMaxIdx + 1
PRINT "Most used character was..........: ";
iPrint = FALSE
FOR i = 0 TO iMaxIdx
IF rChar(i).Count = lMUC THEN
IF iPrint THEN PRINT ", "; ELSE iPrint = TRUE
PRINT RTRIM$(LTRIM$(rChar(i).Character));
PRINT " ("; LTRIM$(STR$(rChar(iMUI).Count)); " times)"
PRINT "Least used character was.........: ";
iPrint = FALSE
FOR i = 0 TO iMaxIdx
IF rChar(i).Count = lLUC THEN
IF iPrint THEN PRINT ", "; ELSE iPrint = TRUE
PRINT RTRIM$(LTRIM$(rChar(i).Character));
PRINT " ("; LTRIM$(STR$(rChar(iLUI).Count)); " times)"
PRINT "Time spent in the process........:"; TIMER - sTime; "seconds"
' File does not exist
CLOSE #iFile
KILL strFile
PRINT "File does not exist."
' Again?
PRINT "Again? (Y/n)"
strTxt = UCASE$(INKEY$)
LOOP UNTIL strTxt = "N" OR strTxt = "Y" OR strTxt = CHR$(13) OR strTxt = CHR$(27)
LOOP UNTIL strTxt = "N" OR strTxt = CHR$(27)
PRINT "End of execution."
PRINT "Start time: "; strDate; " "; strTime; ", end time: "; DATE$; " "; TIME$; "."
' ---End of main program cycle
This program counts letters or characters in a text file.
File to open: readme.txt
Count: 1) Letters 2) Characters (1 or 2). Option selected: 1
Looking for letters. File is 23769 in size. Working...
Letters found: A=1427, B=306, C=583, D=530, E=2098, F=279, G=183, H=501,
I=1177, J=15, K=34, L=741, M=379, N=1219, O=1183, P=312, Q=32, R=1105, S=1079,
T=1309, U=660, V=346, W=147, X=190, Y=242, Z=70, Ñ=5.
File analyzed....................: readme.txt
Looked for.......................: Letters
Total characters in file.........: 23769
Total characters counted.........: 16152
Characters discarded on count....: 7617
Distinct characters found in file: 27 of 27
Most used character was..........: E (2098 times)
Least used character was.........: Ñ (5 times)
Time spent in the process........: .3789063 seconds
Again? (Y/n)
===Using summary===
<syntaxhighlight lang="rsplus">letter.frequency <- function(filename)
file <- paste(readLines(filename), collapse = '')
chars <- strsplit(file, NULL)[[1]]
Usage on itself:
<syntaxhighlight lang="rsplus">> source('letter.frequency.r')
> letter.frequency('letter.frequency.r')
- , . ' ( ) [ ] { } < = 1 a c d e f h i l L m n N o p q r s t u U y
22 3 2 1 2 6 6 2 2 1 1 3 1 1 9 6 1 14 7 2 7 8 3 4 6 1 3 3 1 8 8 7 3 1 2 </syntaxhighlight>
===Using table===
R's table function is more idiomatic. For variety, we will use read.delim rather than readLines and show how to only count letters. It is worth noting that readLines is prone to counting empty lines. This may be undesirable.
<syntaxhighlight lang="rsplus">letterFreq <- function(filename, lettersOnly)
txt <- read.delim(filename, header = FALSE, stringsAsFactors = FALSE, allowEscapes = FALSE, quote = "")
count <- table(strsplit(paste0(txt[,], collapse = ""), ""))
if(lettersOnly) count[names(count) %in% c(LETTERS, letters)] else count
For fun, we'll use this page for input. However, HTML rarely parses well and the variety of text here is so large that I suspect inaccurate output.
<pre>> print(letterFreq('https://rosettacode.org/wiki/Letter_frequency', TRUE))
a A b B c C d D e E f F g G h H i I
38186 666 8008 350 16585 1263 4151 277 15020 713 3172 529 3079 149 4549 161 9397 690
j J k K l L m M n N o O p P q Q r R
311 113 3294 76 15906 928 3333 322 26795 355 8926 456 22702 497 1877 39 15055 591
s S t T u U v V w W x X y Y z Z
46527 695 15549 597 5268 269 1003 128 4134 148 1239 144 3037 55 127 77 </pre>
<syntaxhighlight lang="racket">
#lang racket
(require math)
(define (letter-frequencies ip)
(port->list read-char ip)))
(letter-frequencies (open-input-string "abaabdc"))
'(#\a #\b #\d #\c)
'(3 2 1 1)
Using input from a text file:
<syntaxhighlight lang="racket">
(letter-frequencies (open-input-file "somefile.txt"))
(formerly Perl 6)
In Raku, whenever you want to count things in a collection, the rule of thumb is to use the Bag structure.
<div style="background:#ffffee;padding:1em;">
''In response to some of the breathless exposition in the [[Letter_frequency#Frink|Frink]] entry.''
* Has a Unicode-aware function, which intelligently enumerates through what a human would consider to be a single visible character. - Raku doesn't have '''a''' special Unicode aware string processing function, rather '''all''' of Raku's text/string wrangling functions are Unicode aware by default.
* Unicode-aware lowercase function. - See above: lowercase, uppercase, fold-case, title-case, all Unicode aware by default.
* Uses full Unicode tables to determine what is a "letter." - Um... Yeah, Unicode aware by default.
* Works with high Unicode characters, that is above \uFFFF. - OK, now we're just repeating ourselves.
* Normalize Unicode characters with its normalizeUnicode function. - Raku; normalized by default.
* Implements all of (NFC, NFD, NFKC, NFKD). NFC is the default. - Yep.
"How many other languages in this page do all or any of this correctly?" Quite a few I suspect. Some even more so than Frink.
<syntaxhighlight lang="raku" line>.&ws.say for slurp.comb.Bag.sort: -*.value;
sub ws ($pair) {
$pair.key ~~ /\n/
?? ('NEW LINE' => $pair.value)
!! $pair.key ~~ /\s/
?? ($pair.key.uniname => $pair.value)
!! $pair
{{Out|Output when fed the same Les Misérables text file as used in the [[Word_frequency#Raku|Word frequency]] task}}
<pre>SPACE => 522095
e => 325692
t => 222916
a => 199790
o => 180974
h => 170210
n => 167006
i => 165201
s => 157585
r => 145118
d => 106987
l => 97131
NEW LINE => 67662
u => 67340
c => 62717
m => 56021
f => 53494
w => 53301
, => 48784
g => 46060
p => 39932
y => 37985
b => 34276
. => 30589
v => 24045
" => 14340
k => 14169
T => 12547
- => 11037
I => 10067
A => 7355
H => 6600
M => 6206
; => 5885
E => 4968
C => 4583
S => 4392
' => 3938
x => 3692
! => 3539
R => 3531
P => 3424
O => 3401
j => 3390
B => 3185
W => 3180
N => 3053
? => 2976
F => 2754
G => 2508
: => 2468
J => 2448
L => 2444
q => 2398
V => 2200
_ => 2070
z => 1847
D => 1756
é => 1326
Y => 1238
U => 895
1 => 716
8 => 412
X => 333
K => 321
è => 292
3 => 259
2 => 248
5 => 220
0 => 218
* => 181
4 => 181
) => 173
( => 173
6 => 167
É => 146
7 => 143
Q => 135
] => 122
[ => 122
9 => 117
æ => 106
= => 75
ê => 74
Z => 59
à => 59
â => 56
> => 50
< => 50
/ => 50
ç => 48
î => 39
ü => 37
| => 36
ô => 34
# => 26
ù => 18
ï => 18
Æ => 10
û => 9
+ => 5
È => 5
ë => 5
À => 4
@ => 2
ñ => 2
Ç => 2
$ => 2
% => 1
& => 1
{ => 1
} => 1
½ => 1
<syntaxhighlight lang="raven">define count_letters use $words
{ } as $wordHash [ ] as $keys [ ] as $vals
$words each chr
dup $wordHash swap get 0 prefer 1 + # stack: chr cnt
swap $wordHash swap set
$wordHash keys copy sort each
dup $keys push
$wordHash swap get $vals push
$keys $vals combine print "\n" print
"test.dat" as $file
$file read as $all_data
$all_data count_letters</syntaxhighlight>
<syntaxhighlight lang="refal">$ENTRY Go {
, <Arg 1>: {
= <Prout 'No filename given'>;
e.File, <ReadFile 1 e.File>: e.Text,
<Tally e.Text>: e.Counts
= <ShowLetterCounts (e.Counts) <Letters>>;
Letters {
ShowLetterCounts {
(e.T) = ;
(e.T) s.L e.Ls,
<Upper s.L>: s.UL, <Item (e.T) s.UL>: s.ULN,
<Lower s.L>: s.LL, <Item (e.T) s.LL>: s.LLN,
<+ s.ULN s.LLN>: s.Total
= <Prout s.UL s.LL ': ' <Symb s.Total>>
<ShowLetterCounts (e.T) e.Ls>;
ReadFile {
s.Chan e.Filename =
<Open 'r' s.Chan e.Filename>
<ReadFile (s.Chan)>;
(s.Chan), <Get s.Chan>: {
0 = <Close s.Chan>;
e.Line = e.Line '\n' <ReadFile (s.Chan)>;
Tally {
(e.T) = e.T;
(e.T) s.X e.Xs = <Tally (<Inc (e.T) s.X>) e.Xs>;
e.Xs = <Tally () e.Xs>;
Inc {
(e.1 (s.I s.N) e.2) s.I = e.1 (s.I <+ 1 s.N>) e.2;
(e.X) s.I = e.X (s.I 1);
Item {
(e.1 (s.I s.N) e.2) s.I = s.N;
(e.X) s.I = 0;
The result of running the program on its own source file:
<pre>Aa: 22
Bb: 2
Cc: 16
Dd: 5
Ee: 75
Ff: 10
Gg: 5
Hh: 11
Ii: 26
Jj: 1
Kk: 1
Ll: 49
Mm: 8
Nn: 33
Oo: 18
Pp: 6
Qq: 1
Rr: 17
Ss: 55
Tt: 44
Uu: 14
Vv: 2
Ww: 5
Xx: 12
Yy: 7
Zz: 1</pre>
===version 1===
It should be noted that the file being read is read one line at a time, so the line-end characters (presumably the
<br>line-feed, carriage return, new-line, or whatever control characters are being used) are not reported.
These characters could be read and reported if the &nbsp; '''charin''' &nbsp; BIF would be used instead of the &nbsp; '''linein''' &nbsp; BIF.
Also note that this REXX program is ASCII or EBCDIC independent, but what constitutes a letter is restricted to
<br>the Latin (Roman) alphabet (that is, which characters are considered to be letters of a particular language).
The version of REXX that was used was the '''English''' version of Regina REXX. &nbsp; It should be noted that almost all
<br>REXX interpreters assume the English language for such things as determining what characters are considered
<br>letters unless another language is specified &nbsp; (Regina REXX uses an environmental variable for this purpose).
All characters are still counted, whether a letter or not, including non-displayable characters.
<syntaxhighlight lang="rexx">/*REXX program counts the occurrences of all characters in a file, and note that all */
/* Latin alphabet letters are uppercased for also counting {Latin} letters (both cases).*/
abc = 'abcdefghijklmnopqrstuvwxyz' /*define an (Latin or English) alphabet*/
abcU= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' /*define an uppercase version of [↑]. */
parse arg fileID . /*this last char isn't a middle dot: · */
if fileID=='' then fileID= 'JUNK.TXT' /*¿none specified? Then use the default*/
totChars= 0; totLetters= 0 /*count of all chars and of all letters*/
pad= left('',18); pad9= left('', 18%2) /*used for the indentations of output. */
@.= 0 /*wouldn't it be neat to use Θ instead?*/
do j=1 while lines(fileID)\==0 /*read the file 'til the cows come home*/
rec= linein(fileID) /*get a line/record from the input file*/
/* [↓] process all characters in REC.*/
do k=1 for length(rec) /*examine/count each of the characters.*/
totChars= totChars + 1 /*bump count of number of characters. */
c= substr(rec, k, 1); @.c= @.c + 1 /*Peel off a character; bump its count.*/
if \datatype(c, 'M') then iterate /*Not a Latin letter? Get next char.⌠*/
totLetters= totLetters + 1 /*bump the count for [Latin] letters. ⌡*/
upper c /* ◄─────◄ uppercase a Latin character.*/
@..c= @..c + 1 /*bump the (Latin) letter's count. */
end /*k*/ /*no Greek glyphs: αßΓπΣσµτΦΘΩδφε ··· */
end /*j*/ /*maybe we're ½ done by now, or mäÿbé ¬*/
LL= '(Latin) letter' /*literal used for a "SAY" (below). */
w= length(totChars) /*used for right─aligning the counts. */
say 'file ─────' fileId "───── has" j-1 'records and has' totLetters LL"s."; say
do L=0 for 256; c= d2c(L) /*display all none─zero letter counts. */
if @..c==0 then iterate /*Has a zero count? Then skip character*/
say pad9 LL' ' c " (also" translate(c,abc,abcU)') count:' right(@..c, w)
end /*L*/ /*we may be in a rut, but not a cañyon.*/
say /*¡The old name for Eygpt was Æygpt! _*/
say 'file ─────' fileId "───── has" totChars 'characters.' /* √ */
say /*The name for « » chars is guillemets.*/
do #=0 for 256; y= d2c(#) /*display all none─zero char counts. */
if @.y==0 then iterate /*¿Å zero count? Then ignore character*/
c= d2c(#); ch= c /*C is the character glyph of a char. */
if c<<' ' | #==255 then ch= /*don't show some control characters. */
if c==' ' then ch= 'blank' /*show a blank's {true} name. */
say pad right(ch, 5) " ('"d2x(#,2)"'x character count:" right(@.c, w)
end /*#*/ /*255 isn't quite ∞, but sometimes ∙∙∙ */
say /*not a good place for dithering: ░▒▓█ */
say pad pad9 '☼ end─of─list ☼' /*show we are at the end of the list. */
/*§§§§ Talk about a mishmash of 2¢ comments. ▬▬^▬▬ stick a fork in it, we're all done. ☻*/</syntaxhighlight>
'''output''' &nbsp; when using the (above) REXX program for the input file:
Note that this REXX program works with ASCII or EBCDIC, but the order of the output will
<br>be different because of the order in which EBCDIC and ASCII store characters.
file ───── JUNK.TXT ───── has 42 records and has 1652 (Latin) letters.
(Latin) letter A (also a) count: 146
(Latin) letter B (also b) count: 26
(Latin) letter C (also c) count: 104
(Latin) letter D (also d) count: 58
(Latin) letter E (also e) count: 187
(Latin) letter F (also f) count: 53
(Latin) letter G (also g) count: 25
(Latin) letter H (also h) count: 80
(Latin) letter I (also i) count: 89
(Latin) letter J (also j) count: 6
(Latin) letter K (also k) count: 13
(Latin) letter L (also l) count: 97
(Latin) letter M (also m) count: 28
(Latin) letter N (also n) count: 102
(Latin) letter O (also o) count: 106
(Latin) letter P (also p) count: 38
(Latin) letter Q (also q) count: 3
(Latin) letter R (also r) count: 111
(Latin) letter S (also s) count: 96
(Latin) letter T (also t) count: 175
(Latin) letter U (also u) count: 48
(Latin) letter V (also v) count: 3
(Latin) letter W (also w) count: 18
(Latin) letter X (also x) count: 9
(Latin) letter Y (also y) count: 25
(Latin) letter Z (also z) count: 6
file ───── JUNK.TXT ───── has 3778 characters.
('02'x character count: 1
('0F'x character count: 2
('11'x character count: 2
('15'x character count: 4
('16'x character count: 4
('18'x character count: 1
('19'x character count: 1
blank ('20'x character count: 1477
! ('21'x character count: 1
" ('22'x character count: 14
# ('23'x character count: 6
% ('25'x character count: 1
' ('27'x character count: 47
( ('28'x character count: 23
) ('29'x character count: 22
* ('2A'x character count: 86
+ ('2B'x character count: 4
, ('2C'x character count: 16
- ('2D'x character count: 1
. ('2E'x character count: 40
/ ('2F'x character count: 88
0 ('30'x character count: 8
1 ('31'x character count: 10
2 ('32'x character count: 11
5 ('35'x character count: 7
6 ('36'x character count: 2
8 ('38'x character count: 2
9 ('39'x character count: 3
: ('3A'x character count: 5
; ('3B'x character count: 8
< ('3C'x character count: 2
= ('3D'x character count: 38
? ('3F'x character count: 5
@ ('40'x character count: 9
A ('41'x character count: 2
B ('42'x character count: 1
C ('43'x character count: 8
D ('44'x character count: 6
E ('45'x character count: 5
F ('46'x character count: 1
G ('47'x character count: 3
H ('48'x character count: 2
I ('49'x character count: 8
J ('4A'x character count: 2
K ('4B'x character count: 2
L ('4C'x character count: 22
M ('4D'x character count: 2
N ('4E'x character count: 3
O ('4F'x character count: 1
P ('50'x character count: 2
Q ('51'x character count: 1
R ('52'x character count: 3
S ('53'x character count: 2
T ('54'x character count: 9
U ('55'x character count: 4
V ('56'x character count: 1
W ('57'x character count: 1
X ('58'x character count: 4
Y ('59'x character count: 2
Z ('5A'x character count: 1
[ ('5B'x character count: 3
\ ('5C'x character count: 2
] ('5D'x character count: 3
^ ('5E'x character count: 1
_ ('5F'x character count: 1
a ('61'x character count: 144
b ('62'x character count: 25
c ('63'x character count: 96
d ('64'x character count: 52
e ('65'x character count: 182
f ('66'x character count: 52
g ('67'x character count: 22
h ('68'x character count: 78
i ('69'x character count: 81
j ('6A'x character count: 4
k ('6B'x character count: 11
l ('6C'x character count: 75
m ('6D'x character count: 26
n ('6E'x character count: 99
o ('6F'x character count: 105
p ('70'x character count: 36
q ('71'x character count: 2
r ('72'x character count: 108
s ('73'x character count: 94
t ('74'x character count: 166
u ('75'x character count: 44
v ('76'x character count: 2
w ('77'x character count: 17
x ('78'x character count: 5
y ('79'x character count: 23
z ('7A'x character count: 5
{ ('7B'x character count: 2
| ('7C'x character count: 1
} ('7D'x character count: 2
~ ('7E'x character count: 10
é ('82'x character count: 1
ä ('84'x character count: 1
Å ('8F'x character count: 1
Æ ('92'x character count: 1
ÿ ('98'x character count: 1
¢ ('9B'x character count: 1
ñ ('A4'x character count: 1
¿ ('A8'x character count: 2
¬ ('AA'x character count: 1
½ ('AB'x character count: 1
¡ ('AD'x character count: 1
« ('AE'x character count: 1
» ('AF'x character count: 1
░ ('B0'x character count: 1
▒ ('B1'x character count: 1
▓ ('B2'x character count: 1
─ ('C4'x character count: 30
═ ('CD'x character count: 76
█ ('DB'x character count: 1
α ('E0'x character count: 1
ß ('E1'x character count: 1
Γ ('E2'x character count: 1
π ('E3'x character count: 1
Σ ('E4'x character count: 1
σ ('E5'x character count: 1
µ ('E6'x character count: 1
τ ('E7'x character count: 1
Φ ('E8'x character count: 1
Θ ('E9'x character count: 2
Ω ('EA'x character count: 1
δ ('EB'x character count: 1
∞ ('EC'x character count: 1
φ ('ED'x character count: 1
ε ('EE'x character count: 1
⌠ ('F4'x character count: 1
⌡ ('F5'x character count: 1
∙ ('F9'x character count: 3
· ('FA'x character count: 4
√ ('FB'x character count: 1
☼ end─of─list ☼
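The remark above about ASCII versus EBCDIC ordering can be illustrated outside REXX. The following is a minimal Python sketch, not part of the REXX solution; the helper name <code>sort_by_code</code> is hypothetical, and <code>cp037</code> is assumed to be a representative EBCDIC code page.

```python
# Sketch: the same characters sort differently under ASCII and EBCDIC.
# "cp037" is Python's codec for one common EBCDIC code page (an assumption
# about which EBCDIC variant is of interest).

def sort_by_code(chars, encoding):
    """Return the characters ordered by their single-byte code in `encoding`."""
    return sorted(chars, key=lambda ch: ch.encode(encoding)[0])

sample = "aA1 "
print(sort_by_code(sample, "ascii"))  # [' ', '1', 'A', 'a']
print(sort_by_code(sample, "cp037"))  # [' ', 'a', 'A', '1']
```

In ASCII the digits precede the letters and upper case precedes lower case; in this EBCDIC code page lower case comes first and the digits come last, which is why a loop over code points 0 to 255 lists the counts in a different order.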
===Version 2 (for TSO)===
<syntaxhighlight lang="rexx">/*REXX program counts the occurrences of all characters in a file
* Adapted version 1 for TSO (EXECIO instead of linein)
* No translation to uppercase takes place
* There is no need for tails being hex
* 25.07.2012 Walter Pachl
***********************************************************************/
  Parse arg dsn .                     /*Data set to be processed       */
  if dsn='' Then                      /*none specified?                */
    dsn='PRIV.V100(TEST)'             /* Use default.                  */
  c.=0                                /* Character counts              */
  "ALLOC FI(IN) DA("dsn") SHR REUSE"  /* allocate the data set         */
  'EXECIO * DISKR IN (STEM L. FINIS'  /* read all records into stem L. */
  'FREE FI(IN)'
  totChars=0                          /*count of the total num of chars*/
  totLetters=0                        /*count of the total num letters.*/
  indent=left('',20)                  /*used for indentation of output.*/
  do j=1 to l.0                       /*process all lines              */
    rec=l.j                           /*take line number j             */
    Say '>'rec'<' length(rec)         /*that's in PRIV.V100(TEST)      */
    Say ' E8C44D8FF015674BCDEF'
    Say ' 61100711200000000002'
    do k=1 for length(rec)            /*loop over characters           */
      totChars=totChars+1             /*Increment total number of chars*/
      c=substr(rec,k,1)               /*get character number k         */
      c.c=c.c+1                       /*increment the character's count*/
      End
    End
  w=length(totChars)                  /*used for right-aligning counts.*/
  say 'file -----' dsn "----- has" j-1 'records.'
  say 'file -----' dsn "----- has" totChars 'characters.'
  do L=0 to 255                       /* display nonzero letter counts */
    c=d2c(l)                          /* the character in question     */
    if c.c>0 &,                       /* was found in the file         */
       datatype(c,'M')>0 Then Do      /* and is a Latin letter         */
      say indent "(Latin) letter " c 'count:' right(c.c,w) /* tell     */
      totLetters=totLetters+c.c       /* increment number of letters   */
      End
    End
  say 'file -----' dsn "----- has" totLetters '(Latin) letters.'
  say ' other characters follow'
  other=0                             /* count of non-letters          */
  do m=0 to 255                       /* now for non-letters           */
    c=d2c(m)                          /* the character in question     */
    y=c2x(c)                          /* the hex representation        */
    if c.c>0 &,                       /* was found in the file         */
       datatype(c,'M')=0 Then Do      /* and is not a Latin letter     */
      other=other+c.c                 /* increment count               */
      _=right(c.c,w)                  /* prepare output of count       */
      select                          /*make the character viewable.   */
        when c<<' ' | m==255 then say indent "'"y"'x character count:" _
        when c==' ' then say indent "blank character count:" _
        otherwise say indent " " c 'character count:' _
        End
      End
    End
  say 'file -----' dsn "----- has" other 'other characters.'</syntaxhighlight>
'''output'''
<pre>
>WaA Pa12 :&-: :äüÖ2< 20
file ----- PRIV.V100(TEST) ----- has 1 records.
file ----- PRIV.V100(TEST) ----- has 20 characters.
(Latin) letter a count: 2
(Latin) letter A count: 1
(Latin) letter P count: 1
(Latin) letter W count: 1
file ----- PRIV.V100(TEST) ----- has 5 (Latin) letters.
other characters follow
'00'x character count: 1
'10'x character count: 1
blank character count: 3
& character count: 1
- character count: 1
: character count: 1
: character count: 1
ä character count: 1
ü character count: 1
Ö character count: 1
1 character count: 1
2 character count: 2
file ----- PRIV.V100(TEST) ----- has 15 other characters.</pre>
=={{header|Ring}}==
<syntaxhighlight lang="ring">
textData = read("C:\Ring\ReadMe.txt")
ln = len(textData)
charCount = list(255)
totCount = 0

for i = 1 to ln
    char = ascii(substr(textData,i,1))
    charCount[char] = charCount[char] + 1
    if char > 31 totCount = totCount + 1 ok
next

for i = 32 to 255
    if charCount[i] > 0 see char(i) + " = " + charCount[i] + " " + (charCount[i]/totCount)*100 + " %" + nl ok
next
</syntaxhighlight>
=={{header|RPL}}==
 « → text
   « { 26 } 0 CON
     1 text SIZE '''FOR''' j
        text j DUP SUB NUM
        '''IF''' DUP 97 ≥ OVER 122 ≤ AND '''THEN''' 32 - '''END'''
        '''IF''' DUP 65 ≥ OVER 90 ≤ AND '''THEN''' 64 - DUP2 GET 1 + PUT '''ELSE''' DROP '''END'''
     '''NEXT'''
     { }
     1 26 '''FOR''' j
        '''IF''' OVER j GET '''THEN''' LASTARG j 64 + CHR →TAG + '''END'''
     '''NEXT'''
 » » '<span style="color:blue">AZFREQ</span>' STO
'<span style="color:blue">AZFREQ</span>' DUP RCL →STR SWAP EVAL <span style="color:grey">@ have the program count its own letters</span>
1: { :A: 5 :B: 1 :C: 2 :D: 9 :E: 17 :F: 5 :G: 4 :H: 4 :I: 4 :J: 5 :L: 1 :M: 1 :N: 12 :O: 6 :P: 5 :R: 7 :S: 3 :T: 16 :U: 7 :V: 3 :X: 5 :Z: 1 }
=={{header|Ruby}}==
<syntaxhighlight lang="ruby">def letter_frequency(file)
  letters = 'a' .. 'z'
  File.read(file) .
letter_frequency(ARGV[0]).sort_by {|key, val| -val}.each {|pair| p pair}</syntaxhighlight>
example output, using the program file as input:
<pre>$ ruby letterFrequency.rb letterFrequency.rb
["z", 1]
["w", 1]</pre>
===Ruby 2.0===
<syntaxhighlight lang="ruby">def letter_frequency(file)
  freq = Hash.new(0)
  file.each_char.lazy.grep(/[[:alpha:]]/).map(&:upcase).each_with_object(freq) do |char, freq_map|
    freq_map[char] += 1
  end
end

letter_frequency(ARGF).sort.each do |letter, frequency|
  puts "#{letter}: #{frequency}"
end</syntaxhighlight>
Note that this version ''should'' use less memory, even on a gigantic file, because it uses lazy enumerators, which Ruby 2.0 introduced.
Example output, using the (somewhat large) dictionary file as input. Note also that this version works on Unicode text.
<pre>$ ruby letter_frequency.rb /usr/share/dict/words
A: 64439
B: 15526
C: 31872
D: 28531
E: 88833
F: 10675
G: 22712
H: 19320
I: 66986
J: 1948
K: 8409
L: 41107
M: 22508
N: 57144
O: 48944
P: 22274
Q: 1524
R: 57347
S: 90113
T: 53006
U: 26118
V: 7989
W: 7530
X: 2124
Y: 12652
Z: 3281
Å: 1
á: 10
â: 6
ä: 7
å: 3
ç: 5
è: 28
é: 144
ê: 6
í: 2
ñ: 8
ó: 8
ô: 2
ö: 16
û: 3
ü: 12</pre>
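The memory argument made for the lazy version can be sketched in another language. The following Python sketch is only an illustration under stated assumptions: the function name <code>letter_frequency</code> and the <code>chunk_size</code> parameter are hypothetical, and an in-memory stream stands in for a large file.

```python
import io

def letter_frequency(stream, chunk_size=65536):
    """Count letters (upper-cased) from a text stream, one chunk at a time."""
    counts = {}
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:              # an empty string signals end of input
            break
        for ch in chunk:
            if ch.isalpha():
                key = ch.upper()
                counts[key] = counts.get(key, 0) + 1
    return counts

# Usage: a tiny chunk size forces several reads even on a short text.
demo = io.StringIO("Hello, World!")
for letter, n in sorted(letter_frequency(demo, chunk_size=4).items()):
    print(f"{letter}: {n}")
```

Only one chunk is held in memory at a time, so the peak memory use depends on the chunk size and on the number of distinct letters, not on the size of the file.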
===Ruby 2.7===
Ruby 2.7 introduced "tally", which delivers a tally on anything enumerable.
<syntaxhighlight lang="ruby">p File.open("/usr/share/dict/words","r").each_char.tally</syntaxhighlight>
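For comparison, a one-call tally also exists in other standard libraries. This Python sketch assumes <code>collections.Counter</code> as the closest counterpart; the wrapper name <code>tally</code> is hypothetical.

```python
from collections import Counter

def tally(text):
    """Map each character in `text` to its number of occurrences."""
    return Counter(text)

print(tally("banana"))  # Counter({'a': 3, 'n': 2, 'b': 1})
```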
=={{header|Run BASIC}}==
<syntaxhighlight lang="runbasic">open "c:\rbp101\public\textFile.txt" for input as #f
textData$ = input$(#f, lof( #f))
ln =len(textData$)
for i = 32 to 255
if charCount(i) > 0 then print "Ascii:";using("###",i);" char:";chr$(i);" Count:";using("#######",charCount(i));" ";using("##.#",(charCount(i) / totCount) * 100);"%"
next i</syntaxhighlight>
Output uses this program to count itself:
Ascii:120 char:x Count: 7 1.5%
=={{header|Rust}}==
Works with all UTF-8 characters.
<syntaxhighlight lang="rust">use std::collections::btree_map::BTreeMap;
use std::{env, process};
use std::io::{self, Read, Write};
use std::fmt::Display;
use std::fs::File;

fn main() {
    let filename = env::args().nth(1)
        .ok_or("Please supply a file name")
        .unwrap_or_else(|e| exit_err(e, 1));

    let mut buf = String::new();
    let mut count = BTreeMap::new();

    File::open(&filename)
        .unwrap_or_else(|e| exit_err(e, 2))
        .read_to_string(&mut buf)
        .unwrap_or_else(|e| exit_err(e, 3));

    for c in buf.chars() {
        *count.entry(c).or_insert(0) += 1;
    }

    println!("Number of occurences per character");
    for (ch, count) in &count {
        println!("{:?}: {}", ch, count);
    }
}

#[inline]
fn exit_err<T>(msg: T, code: i32) -> ! where T: Display {
    writeln!(&mut io::stderr(), "{}", msg).expect("Could not write to stderr");
    process::exit(code)
}</syntaxhighlight>
Output when run on source file:
Number of occurences per character
'\n': 35
' ': 167
'!': 4
'\"': 10
'#': 1
'&': 4
'(': 25
')': 25
'*': 1
'+': 1
',': 12
'-': 1
'.': 10
'0': 1
'1': 3
'2': 2
'3': 2
':': 37
';': 13
'<': 1
'=': 4
'>': 2
'?': 1
'B': 2
'C': 1
'D': 2
'F': 2
'M': 2
'N': 1
'P': 1
'R': 1
'S': 1
'T': 5
'W': 1
'[': 1
']': 1
'_': 15
'a': 20
'b': 5
'c': 22
'd': 12
'e': 75
'f': 14
'g': 5
'h': 6
'i': 29
'k': 1
'l': 23
'm': 13
'n': 36
'o': 28
'p': 17
'r': 45
's': 33
't': 42
'u': 24
'v': 2
'w': 8
'x': 6
'y': 4
'{': 9
'|': 6
'}': 9
=={{header|S-BASIC}}==
Because S-BASIC lacks an EOF function, some extra care is required to avoid reading beyond the end of file. (CP/M text files are normally terminated with a Ctrl-Z byte, but not all text editors enforce this convention if the file would otherwise end on a sector boundary.)
<syntaxhighlight lang="s-basic">
$constant EOF = 1AH rem normal end-of-file marker
rem Convert character to upper case
function upcase(ch = char) = char
if ch >= 'a' and ch <= 'z' then
ch = ch - 32
end = ch
rem Convert string to all upper case characters
function allcaps(source = string) = string
var p = integer
for p = 1 to len(source) do
mid(source,p,1) = upcase(mid(source,p,1))
next p
end = source
Preserve console and printer channels (#0 and #1)
Channel #2 declared as sequential ASCII
files d, d, sa(1)
var ch = char
var i = integer
based errcode = integer
base errcode at 103H rem S-BASIC stores run-time error code here
var filename = string
var total = real
dim real freq(26)
input "Name of text file to process: "; filename
filename = allcaps(filename)
open #2; filename
on error goto 7_trap rem In case input file lacks terminating ^Z
rem Initialize letter counts to zero
for i = 1 to 26
freq(i) = 0
next i
rem Process the file
total = 0
input3 #2; ch
while ch <> EOF do
ch = upcase(ch);
if ch >= "A" and ch <= "Z" then
freq(ch - 64) = freq(ch - 64) + 1
total = total + 1
input3 #2; ch
goto 8_done rem Jump around error trap
7_trap if errcode <> 15 then
print "Runtime error = ";errcode
goto 9_exit
rem otherwise fall through on attempted read past EOF (err = 15)
close #2
rem Report results
print "Letter Count Percent"
for I = 1 to 26
print chr(i+64);" ";
print using " ##,###"; freq(i);
print using " ##.#"; freq(i) / total * 100
next i
With Lincoln's Second Inaugural Address used as input
Letter Count Percent
A 101 8.8
B 14 1.2
C 31 2.7
D 59 5.1
E 165 14.3
F 27 2.3
G 28 2.4
H 80 7.0
I 68 5.9
J 0 0.0
K 3 0.3
L 42 3.7
M 13 1.1
N 78 6.8
O 93 8.1
P 15 1.3
Q 1 0.1
R 79 6.9
S 44 3.8
T 126 11.0
U 21 1.8
V 23 2.0
W 28 2.4
X 0 0.0
Y 11 1.0
Z 0 0.0
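The end-of-file caveat described for S-BASIC (stop at Ctrl-Z, but also cope with a file that lacks the marker) can be sketched in Python. This is an illustration only, not a translation of the S-BASIC program; the function name <code>count_letters_until_eof</code> is hypothetical and in-memory byte streams stand in for CP/M files.

```python
import io

CTRL_Z = 0x1A  # conventional CP/M end-of-file marker

def count_letters_until_eof(stream):
    """Count A-Z (case-insensitive) in a byte stream.

    Reading stops at the first Ctrl-Z byte or at the physical end of the
    stream, whichever comes first.
    """
    freq = [0] * 26
    while True:
        byte = stream.read(1)
        if not byte or byte[0] == CTRL_Z:
            break
        ch = byte.upper()
        if b"A" <= ch <= b"Z":
            freq[ch[0] - ord("A")] += 1
    return freq

with_marker = io.BytesIO(b"Abc\x1aignored")
without_marker = io.BytesIO(b"Abc")
print(count_letters_until_eof(with_marker)[:3])     # [1, 1, 1]
print(count_letters_until_eof(without_marker)[:3])  # [1, 1, 1]
```

Both inputs give the same counts: everything after the marker is ignored, and a missing marker simply ends the loop at the end of the data instead of raising an error.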
=={{header|Scala}}==
<syntaxhighlight lang="scala">import io.Source.fromFile
def letterFrequencies(filename: String) =
fromFile(filename).mkString groupBy (c => c) mapValues (_.length)</syntaxhighlight>
=={{header|Scheme}}==
Using Guile Scheme 2.0.11.
Note that this prints the Scheme representations of characters in no particular order.
<syntaxhighlight lang="scheme">(use-modules (ice-9 format))

(define (char-freq port table)
  (if
   (eof-object? (peek-char port))
   table
   (char-freq port (add-char (read-char port) table))))

(define (add-char char table)
  (cond
   ((null? table) (list (list char 1)))
   ((eq? (caar table) char) (cons (list char (+ (cadar table) 1)) (cdr table)))
   (#t (cons (car table) (add-char char (cdr table))))))

(define (format-table table)
  (for-each (lambda (t) (format #t "~10s~10d~%" (car t) (cadr t))) table))

(define (print-freq filename)
  (format-table (char-freq (open-input-file filename) '())))

(print-freq "letter-frequency.scm")</syntaxhighlight>
Output when reading own source:
#\( 45
#\u 5
#\s 9
#\e 47
#\- 19
#\m 9
#\o 16
#\d 19
#\l 25
#\space 83
#\i 15
#\c 28
#\9 1
#\f 20
#\r 39
#\a 47
#\t 36
#\) 45
#\newline 21
#\n 15
#\h 14
#\q 7
#\p 9
#\b 16
#\j 1
#\? 3
#\k 1
#\1 4
#\+ 1
#\# 2
#\" 4
#\~ 3
#\0 2
#\% 1
#\' 1
#\y 1
#\. 1
An implementation for CHICKEN Scheme:
<syntaxhighlight lang="scheme">
(with-input-from-string "foobar"
  (lambda ()
    (port-fold (lambda (x s)
                 (alist-update x
                               (add1 (alist-ref x s eq? 0))
                               s))
               '()
               read-char)))
</syntaxhighlight>
which shows: ((#\f . 1) (#\o . 2) (#\b . 1) (#\a . 1) (#\r . 1))
=={{header|Seed7}}==
<syntaxhighlight lang="seed7">$ include "seed7_05.s7i";
const type: charHash is hash [char] integer;
writeln(ch <& " " <& numberOfChars[ch]);
end for;
end func;</syntaxhighlight>
Output when the program uses itself as input:
<pre>
w 3
y 2</pre>
=={{header|SenseTalk}}==
<syntaxhighlight lang="sensetalk">
put file "~/Documents/addresses.csv" into source
repeat with each character of source
	if it is a controlChar then next repeat -- skip control characters
	if it is a lowercase then put "." after it -- make keys distinct
	add 1 to counts.(it)
end repeat
repeat with each (theChar, count) in counts
	put char 1 of theChar & " —> " & count
end repeat
</syntaxhighlight>
—> 2862
" —> 11180
# —> 109
& —> 58
, —> 5646
- —> 2009
. —> 1629
/ —> 1000
0 —> 1496
1 —> 1665
2 —> 1487
3 —> 1481
4 —> 1405
5 —> 1416
6 —> 1260
7 —> 1499
8 —> 1323
9 —> 1349
: —> 500
@ —> 500
_ —> 127
A —> 558
a —> 4082
B —> 290
b —> 455
C —> 572
c —> 2387
D —> 273
d —> 1230
E —> 265
e —> 4493
F —> 177
f —> 392
G —> 146
g —> 911
H —> 239
h —> 1699
I —> 235
i —> 2935
J —> 212
j —> 147
K —> 131
k —> 646
L —> 302
l —> 2602
M —> 428
m —> 1912
N —> 319
n —> 3237
O —> 136
o —> 4018
P —> 294
p —> 1141
Q —> 6
q —> 228
R —> 288
r —> 3124
S —> 600
s —> 2229
T —> 196
t —> 3328
U —> 21
u —> 899
V —> 65
v —> 508
W —> 222
w —> 1937
X —> 34
x —> 153
Y —> 99
y —> 858
Z —> 14
z —> 145
=={{header|Sidef}}==
<syntaxhighlight lang="ruby">func letter_frequency(File file) {
    file.read.chars.grep{.match(/[[:alpha:]]/)} \
        .group_by {|letter| letter.downcase} \
        .map_val {|_, val| val.len} \
        .sort_by {|_, val| -val}
}

var top = letter_frequency(File(__FILE__))
top.each{|pair| say "#{pair[0]}: #{pair[1]}"}</syntaxhighlight>
e: 22
l: 17
a: 16
t: 14
r: 14
p: 12
f: 8
i: 8
n: 7
c: 6
u: 6
o: 6
v: 6
y: 5
s: 5
h: 3
w: 2
q: 2
b: 2
m: 2
g: 2
d: 1
=={{header|SIMPOL}}==
Example: open a text file and compute letter frequency.
<syntaxhighlight lang="simpol">constant iBUFSIZE 500
function main(string filename)
end while
end if
end function s</syntaxhighlight>
As this was being created I realized that in [SIMPOL] I wouldn't have done it this way (in fact, I wrote it differently the first time and had to go back and change it to use an array afterward). In [SIMPOL] we would have used the set object. It acts similarly to a single-dimensional array, but also supports various set operations, such as difference, unite, intersect, etc. One of the interesting things is that each unique value is stored only once, and the number of duplicates is stored with it. The sample then looks a little cleaner:
<syntaxhighlight lang="simpol">constant iBUFSIZE 500
function main(string filename)
end while
end if
end function s</syntaxhighlight>
The final stage simply reads the totals for each character. One caveat: if a character is unrepresented, it will not show up at all in this second implementation.
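The caveat about unrepresented characters applies to any multiset-style container. The following Python sketch uses <code>collections.Counter</code> as a stand-in for the SIMPOL set object (an assumption; the two are not the same API) to show the difference between iterating the multiset and iterating a fixed alphabet.

```python
from collections import Counter
import string

# A multiset: each distinct character is stored once, together with its count.
counts = Counter("hello")

# Iterating the multiset only visits characters that actually occurred.
print(sorted(counts.items()))  # [('e', 1), ('h', 1), ('l', 2), ('o', 1)]

# Iterating a fixed alphabet instead also reports zeros for absent letters.
full = {ch: counts[ch] for ch in string.ascii_lowercase}
print(full["z"])  # 0
```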
=={{header|Smalltalk}}==
Make it a bag of characters and get the counts:
{{works with|Smalltalk/X}}
<syntaxhighlight lang="smalltalk">bagOfChars := 'someFile' asFilename contentsAsString asBag.
bag sortedCounts
select:[:assoc | assoc value isLetter ]
thenDo:[:assoc | assoc printCR].</syntaxhighlight>
If the file is huge, you may not want to read it in as a big string first, but feed the chars linewise into the bag:
<syntaxhighlight lang="smalltalk">bagOfChars := Bag new.
'someFile' asFilename readingLinesDo:[:eachLine | bagOfChars addAll:eachLine].
bag sortedCounts ...</syntaxhighlight>
To show all counts (as opposed to selecting the letter counts only), replace the "select:thenDo:" by a simple "do:", as in:
<syntaxhighlight lang="smalltalk">bag sortedCounts do:[:assoc | assoc printCR].</syntaxhighlight>
or even shorter:
<syntaxhighlight lang="smalltalk">bag sortedCounts do:#printCR.</syntaxhighlight>
If you prefer seeing the character first, followed by the count, replace the do-loop's action with:
<syntaxhighlight lang="smalltalk">... do:[:assoc | '%s -> %s\n' printf:{assoc value . assoc key} on:Stdout ].</syntaxhighlight>
<pre>e -> 27
n -> 20
u -> 16
d -> 16
</pre>
=={{header|Swift}}==
<syntaxhighlight lang="swift">import Foundation

let dictPath: String

switch CommandLine.arguments.count {
case 2:
  dictPath = CommandLine.arguments[1]
case _:
  dictPath = "/usr/share/dict/words"
}

let wordsData = FileManager.default.contents(atPath: dictPath)!
let allWords = String(data: wordsData, encoding: .utf8)!
let words = allWords.components(separatedBy: "\n")
let counts = words.flatMap({ $0.map({ ($0, 1) }) }).reduce(into: [:], { $0[$1.0, default: 0] += $1.1 })

for (char, count) in counts {
  print("\(char): \(count)")
}</syntaxhighlight>
=={{header|Tcl}}==
<syntaxhighlight lang="tcl">proc letterHistogram {fileName} {
    # Initialize table (in case of short texts without every letter)
    for {set i 97} {$i<=122} {incr i} {
letterHistogram the/sample.txt</syntaxhighlight>
=={{header|TUSCRIPT}}==
<syntaxhighlight lang="tuscript">
words = REQUEST ("http://www.puzzlers.org/pub/wordlists/unixdict.txt")
</syntaxhighlight>
<pre style='height:30ex;overflow:scroll'>
</pre>
=={{header|TXR}}==
===TXR Extraction Language plus TXR Lisp===
<syntaxhighlight lang="txr">@(do (defvar h (make-hash nil nil :equal-based)))
@(collect :vars ())
@(coll :vars ())@\
@{letter /[A-Za-z]/}@(filter :upcase letter)@\
@(do (dohash (key value h)
       (format t "~a: ~a\n" key value)))</syntaxhighlight>
<pre>$ ./txr letterfreq.txr /usr/share/dict/words
Z: 3238</pre>
===TXR Lisp===
<syntaxhighlight lang="txrlisp">(let* ((s (open-file "/usr/share/dict/words" "r"))
       (chrs [keep-if* chr-isalpha (gun (get-char s))])
       (h [group-reduce (hash) chr-toupper (op succ @1) chrs 0]))
  (dohash (key value h)
    (put-line `@key: @value`)))</syntaxhighlight>
=={{header|Vala}}==
Counts every character except the newline character.
<syntaxhighlight lang="vala">
using Gee;
</syntaxhighlight>
Sample output (run on its own source code) with several lines omitted:
l occured 22 times
=={{header|VBA}}==
<syntaxhighlight lang="vba">
Public Sub LetterFrequency(fname)
'count number of letters in text file "fname" (ASCII-coded)
End Sub
</syntaxhighlight>
z 4159
=={{header|VBScript}}==
<syntaxhighlight lang="vb">
Set objfso = CreateObject("Scripting.FileSystemObject")
Set objdict = CreateObject("Scripting.Dictionary")
Set objfile = objfso.OpenTextFile(filepath,1)
txt = objfile.ReadAll

For i = 1 To Len(txt)
	char = Mid(txt,i,1)
	If objdict.Exists(char) Then
		objdict.Item(char) = objdict.Item(char) + 1
	Else
		objdict.Add char,1
	End If
Next

For Each key In objdict.Keys
	WScript.StdOut.WriteLine key & " = " & objdict.Item(key)
Next

Set objfso = Nothing
Set objdict = Nothing
</syntaxhighlight>
=={{header|Vedit macro language}}==
<syntaxhighlight lang="vedit">File_Open("c:\txt\a_text_file.txt")
#2 = Search(@103, BEGIN+ALL+NOERR)
Message(@103) Num_Type(#2)
</syntaxhighlight>
Example output:
Y 16
Z 2
=={{header|V (Vlang)}}==
<syntaxhighlight lang="v (vlang)">import os
struct LetterFreq {
rune int
freq int
fn main(){
file := os.read_file('unixdict.txt')?
mut freq := map[rune]int{}
for c in file {
mut lf := []LetterFreq{}
for k,v in freq {
lf << LetterFreq{u8(k),v}
lf.sort_with_compare(fn(a &LetterFreq, b &LetterFreq)int{
if a.freq > b.freq {
return -1
if a.freq < b.freq {
return 1
return 0
for f in lf {
println('${u8(f.rune).ascii_str()} ${f.rune} $f.freq')
<pre> D 25103
A 25103
e 65 20144
a 61 16421
i 69 13980
r 72 13436
t 74 12836
o 6F 12738
n 6E 12097
s 73 10210
l 6C 10061
c 63 8216
u 75 6489
m 6D 5828
d 64 5799
p 70 5516
h 68 5208
g 67 4129
b 62 4115
y 79 3633
f 66 2662
w 77 1968
k 6B 1925
v 76 1902
x 78 617
z 7A 433
j 6A 430
q 71 378
' 27 105
& 26 6
. 2E 6
1 31 2
8 38 1
7 37 1
6 36 1
5 35 1
4 34 1
3 33 1
2 32 1
0 30 1
9 39 1
=={{header|Whitespace}}==
<syntaxhighlight lang="whitespace">
<syntaxhighlight lang="asm">push 127
; Initialize a slot in the heap for each ASCII character.
push 0
push 1
jn 1
jump 0
; Read until EOF, incrementing the relevant heap slot.
push 0
jn 2 ; Done reading, proceed to print.
push 1
jump 1
; Stack is [-1 -1], but [0] would be nice.
; Print characters with tallies greater than 0.
push 1
push 128
jz 4 ; All done.
jz 3 ; Don't print if no occurrences.
ochr ; Display the character,
push 32
ochr ; a space,
onum ; its frequency,
push 10
ochr ; and a newline.
jump 3
<pre>$ cat freq.ws | wspace freq.ws
=={{header|Wren}}==
As we have a copy to hand, we count the number of letters in the MIT 10000 word list, which apparently contains nothing other than lower case letters.
<syntaxhighlight lang="wren">import "io" for File
import "./fmt" for Fmt

var text = File.read("mit10000.txt")
var freqs = List.filled(26, 0)
for (c in text.codePoints) {
    if (c >= 97 && c <= 122) {
        freqs[c-97] = freqs[c-97] + 1
    }
}
var totalFreq = freqs.reduce { |sum, f| sum + f }
System.print("Frequencies of letters in mit10000.txt:")
System.print("\n freq \%")
for (i in 0..25) {
    Fmt.print("$c $5d $6.2f", i+97, freqs[i], freqs[i]/totalFreq * 100)
}
System.print(" ----- ------")
Fmt.print(" $5d 100.00", totalFreq)
Fmt.print("\nTotal characters in text file = $d minus 10000 \\n's = $d", text.count, totalFreq)</syntaxhighlight>
Frequencies of letters in mit10000.txt:
freq %
a 5378 8.16
b 1141 1.73
c 3025 4.59
d 2507 3.81
e 7601 11.54
f 927 1.41
g 1717 2.61
h 1429 2.17
i 5461 8.29
j 183 0.28
k 592 0.90
l 3231 4.90
m 1912 2.90
n 4822 7.32
o 4252 6.45
p 2027 3.08
q 123 0.19
r 4860 7.38
s 5085 7.72
t 4760 7.23
u 1939 2.94
v 849 1.29
w 632 0.96
x 264 0.40
y 1027 1.56
z 136 0.21
----- ------
65880 100.00
Total characters in text file = 75880 minus 10000 \n's = 65880
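The percentage column in the table above is each letter's count divided by the total number of letters, times 100. A minimal Python sketch of that calculation follows; the function name <code>letter_percentages</code> and the short sample text are hypothetical and unrelated to the MIT word list.

```python
def letter_percentages(text):
    """Return (total, rows) where rows holds (letter, count, percent) for a-z."""
    freqs = [0] * 26
    for ch in text:
        if "a" <= ch <= "z":
            freqs[ord(ch) - 97] += 1
    total = sum(freqs)
    rows = [
        (chr(i + 97), n, 100 * n / total if total else 0.0)
        for i, n in enumerate(freqs)
    ]
    return total, rows

total, rows = letter_percentages("abba\ncab\n")
for letter, n, pct in rows[:3]:
    print(f"{letter} {n:5d} {pct:6.2f}")
print(total)  # 7 letters; the two newlines are not counted
```

As in the Wren output, the characters that are not lower case letters (here the newlines) account for the difference between the file length and the letter total.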
=={{header|XPL0}}==
character ($1A). Usage: count <filename.ext
<syntaxhighlight lang="xpl0">include c:\cxpl\codes; \intrinsic 'code' declarations
int A(256), C, I;
[for C:= 0 to 256-1 do A(C):= 0;
if (I&7) = 7 then [CrLf(0); C:= C-8*16+1];
</syntaxhighlight>
Example output of count.xpl counting itself:
=={{header|Yabasic}}==
<syntaxhighlight lang="yabasic">dim ascCodes(255)

f = open("unixdict.txt", "r")

if f then
    while(not eof(#f))
        line input #f a$
        for i = 1 to len(a$)
            c = asc(mid$(a$, i, 1))
            ascCodes(c) = ascCodes(c) + 1
        next
    wend

    for i = 1 to 255
        c = ascCodes(i)
        if c print chr$(i), " = ", c
    next

    close #f
end if</syntaxhighlight>
=={{header|zkl}}==
<syntaxhighlight lang="zkl">fcn ccnt(textInBitBucket){
letters:=["a".."z"].pump(List().write,0); // array of 26 zeros
textInBitBucket.howza(0).pump(Void,'wrap(c){ // pump text as ints
if(97<=c<=122) c-=97;
else if(65<=c<=90) c-=65;
else return(Void.Skip);
sum:=letters.sum(); println(sum," letters");
ccnt(Data(0,Int,"This is a test"));
11 letters
181171 letters
=={{header|Zoea}}==
<syntaxhighlight lang="zoea">
program: letter_frequency
input: 'cbcacb' # can be literal value, stdin or file url at runtime
derive: [[a,1],[b,2],[c,3]]
output: 'a : 1\nb : 2\nc : 3\n'
=={{header|Zoea Visual}}==
[http://zoea.co.uk/examples/zv-rc/Letter_frequency.png Letter Frequency]