I before E except after C

From Rosetta Code

The phrase [[wp:I before E except after C|"I before E, except after C"]] is a
widely known mnemonic which is supposed to help when spelling English words.

;Task Description:
Using the word list from &nbsp; [http://wiki.puzzlers.org/pub/wordlists/unixdict.txt http://wiki.puzzlers.org/pub/wordlists/unixdict.txt],
<br>check if the two sub-clauses of the phrase are plausible individually:
:::# &nbsp; ''"I before E when not preceded by C"''
:::# &nbsp; ''"E before I when preceded by C"''

If both sub-phrases are plausible then the original phrase can be said to be plausible.

Something is plausible if the number of words having the feature is more than two times the number of words having the opposite feature (where feature is 'ie' or 'ei' preceded or not by 'c' as appropriate).
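
The plausibility criterion above can be sketched in a few lines of Python. This is only an illustration: the function name <code>is_plausible</code> and its <code>ratio</code> parameter are assumptions made for this example, and the sample counts are the ones several solutions on this page report for unixdict.txt.

```python
def is_plausible(feature_count: int, opposite_count: int, ratio: int = 2) -> bool:
    """Return True when the feature occurs more than `ratio` times as often as its opposite."""
    return feature_count > ratio * opposite_count


# Counts reported by several solutions on this page for unixdict.txt.
print(is_plausible(465, 213))  # "I before E when not preceded by C"
print(is_plausible(13, 24))    # "E before I when preceded by C"
```

Note that the comparison is strict: a feature that occurs exactly twice as often as its opposite does not qualify.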

;Stretch goal:
As a stretch goal use the entries from the table of [http://ucrel.lancs.ac.uk/bncfreq/lists/1_2_all_freq.txt Word Frequencies in Written and Spoken English: based on the British National Corpus], (selecting those rows with three space or tab separated words only), to see if the phrase is plausible when word frequencies are taken into account.

''Show your output here as well as your program.''
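
One possible reading of the stretch goal, sketched in Python. The row layout (word, part of speech, frequency) and the names <code>PATTERNS</code> and <code>weighted_counts</code> are assumptions made for this example, not something the task prescribes. Rows that do not split into exactly three fields, or whose third field is not a number (such as a header row), are skipped.

```python
import re
from collections import Counter

# Hypothetical category names; each pattern is tested once per word.
PATTERNS = {
    "not_c_ie": re.compile(r"(^|[^c])ie"),
    "not_c_ei": re.compile(r"(^|[^c])ei"),
    "cie": re.compile(r"cie"),
    "cei": re.compile(r"cei"),
}


def weighted_counts(lines):
    """Sum the frequency of every three-field row whose word matches a pattern."""
    counts = Counter()
    for line in lines:
        fields = line.split()          # splits on spaces and tabs alike
        if len(fields) != 3:
            continue                   # multi-word entries and malformed rows
        word, _part_of_speech, freq = fields
        if not freq.isdigit():
            continue                   # e.g. a header row
        for name, pattern in PATTERNS.items():
            if pattern.search(word.lower()):
                counts[name] += int(freq)
    return counts


# Made-up sample rows, for illustration only.
sample = [
    "\tWord\tPoS\tFreq",
    "\tbelieve\tVerb\t100",
    "\treceive\tVerb\t40",
    "\ttheir\tDet\t300",
    "\tscience\tNoC\t50",
    "\ta lot of\tDet\t10",
]
print(weighted_counts(sample))
```

The resulting totals can then be fed to the same "more than two times" test used for the plain word list.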


* [http://www.youtube.com/watch?v=duqlZXiIZqA I Before E Except After C] - [[wp:QI|QI]] Series 8 Ep 14, (humorous)
* [http://ucrel.lancs.ac.uk/bncfreq/ Companion website] for the book: "Word Frequencies in Written and Spoken English: based on the British National Corpus".


<syntaxhighlight lang="11l">V PLAUSIBILITY_RATIO = 2

F plausibility_check(comment, x, y)
   print("\n Checking plausibility of: #.".format(comment))
   I x > :PLAUSIBILITY_RATIO * y
      print(‘ PLAUSIBLE. As we have counts of #. vs #., a ratio of #2.1 times’.format(x, y, Float(x) / y))
   E
      I x > y
         print(‘ IMPLAUSIBLE. As although we have counts of #. vs #., a ratio of #2.1 times does not make it plausible’.format(x, y, Float(x) / y))
      E
         print(‘ IMPLAUSIBLE, probably contra-indicated. As we have counts of #. vs #., a ratio of #2.1 times’.format(x, y, Float(x) / y))
   R x > :PLAUSIBILITY_RATIO * y

F simple_stats()
V words = File(‘unixdict.txt’).read().split("\n")
V cie = Set(words.filter(word -> ‘cie’ C word)).len
V cei = Set(words.filter(word -> ‘cei’ C word)).len
V not_c_ie = Set(words.filter(word -> re:‘(^ie|[^c]ie)’.search(word))).len
V not_c_ei = Set(words.filter(word -> re:‘(^ei|[^c]ei)’.search(word))).len
R (cei, cie, not_c_ie, not_c_ei)

F print_result(cei, cie, not_c_ie, not_c_ei)
I (plausibility_check(‘I before E when not preceded by C’, not_c_ie, not_c_ei) & plausibility_check(‘E before I when preceded by C’, cei, cie))
print(‘(To be plausible, one count must exceed another by #. times)’.format(:PLAUSIBILITY_RATIO))

print(‘Checking plausibility of "I before E except after C":’)
V (cei, cie, not_c_ie, not_c_ei) = simple_stats()
print_result(cei, cie, not_c_ie, not_c_ei)</syntaxhighlight>

<pre>Checking plausibility of "I before E except after C":

Checking plausibility of: I before E when not preceded by C
PLAUSIBLE. As we have counts of 465 vs 213, a ratio of 2.2 times

Checking plausibility of: E before I when preceded by C
IMPLAUSIBLE, probably contra-indicated. As we have counts of 13 vs 24, a ratio of 0.5 times

(To be plausible, one count must exceed another by 2 times)</pre>

=={{header|8080 Assembly}}==

This program is written to run under CP/M. It takes the filename on the command line.
The file can be as large as you like; it does not need to fit in memory at once.
(Indeed, <code>unixdict.txt</code> is 206k.)

<syntaxhighlight lang="8080asm"> ;;; I before E, except after C
fcb1: equ 5Ch ; FCB 1 (populated by file on command line)
dma: equ 80h ; Standard DMA location
bdos: equ 5 ; CP/M entry point
puts: equ 9 ; CP/M call to write a string to the console
fopen: equ 0Fh ; CP/M call to open a file
fread: equ 14h ; CP/M call to read from a file
CR: equ 13
LF: equ 10
EOF: equ 26
org 100h
;;; Open the file given on the command line
lxi d,fcb1
mvi c,fopen
call bdos
inr a ; FF = error
jz die
;;; We can only read one 128-byte block at a time, and the file
;;; will not fit in memory (max 64 k). So there are two things
;;; going on here: we copy from the block into a word buffer
;;; until we see the end of a line, at which point we process
;;; the word. In the meantime, if while copying we reach the end
;;; of the block, we read the next block.
lxi b,curwrd ; Word pointer
block: push b ; Keep word pointer while reading
lxi d,fcb1 ; Read a block from the file
mvi c,fread
call bdos
pop b ; Restore word pointer
dcr a ; 1 = EOF
jz done
inr a ; otherwise, <>0 = error
jnz die
lxi h,dma ; Start reading at DMA
char: mov a,m ; Get character
cpi EOF ; If it's an EOF character, we're done
jz done
stax b ; Store character in current word
inx b
cpi LF ; If it's LF, then we've got a full word
cz word ; Process the word
inr l ; Go to next character
jz block ; If we're done with this block, get next one
jmp char
;;; When done, report the statistics
done: lxi d,scie ; CIE
call sout
lhld cie
call puthl
lxi d,sxie ; xIE
call sout
lhld xie
call puthl
lxi d,scei ; CEI
call sout
lhld cei
call puthl
lxi d,sxei ; xEI
call sout
lhld xei
call puthl
;;; Then say what is and isn't plausible
lxi d,s_ienc ; I before E when not preceded by C
call sout ; plausible if 2*xIE>CIE
lhld cie
xchg ; DE = words with the opposite feature
lhld xie
call pplaus
lxi d,s_eic ; E before I when preceded by C
call sout ; plausible if 2*CEI>xEI
lhld xei
xchg ; DE = words with the opposite feature
lhld cei
;;; If HL = amount of words with feature, and
;;; DE = amount of words with opposite feature, then print
;;; '(not) plausible', as appropriate.
pplaus: dad h ; 2 * feature
mov a,d ; Compare high byte
cmp h
jc plaus ; If 2*H>D then plausible
mov a,e ; Otherwise, compare low byte
cmp l
jc plaus ; If 2*L>E then plausible
lxi d,snop ; Otherwise, not plausible
jmp sout
plaus: lxi d,splau
jmp sout
;;; Process a word
word: push h ; Save file read address
xra a ; Zero out end of word
stax b
dcx b
lxi h,curwrd ; Scan word
start: mov a,m ; Get current character
inx h ; Move pointer ahead
ana a ; If zero,
jz w_end ; we're done
cpi 'c' ; Did we find a 'c'?
jz findc
cpi 'e' ; Otherwise, did we find 'e'?
jz finde
cpi 'i' ; Otherwise, did we find 'i'?
jz findi
jmp start ; Otherwise, keep going
;;; We found an 'e'
finde: mov a,m ; Get following character
cpi 'i' ; Is it 'i'?
jnz start ; If not, keep going
inx h ; Otherwise, move past it,
xchg ; keep pointer in DE,
lhld xie ; We found ie without c
inx h
shld xie
xchg ; Restore scan pointer to HL
jmp start
;;; We found an 'i'
findi: mov a,m ; Get following character
cpi 'e' ; Is it 'e'?
jnz start ; If not, keep going
inx h ; Otherwise, move past it,
xchg ; keep pointer in DE,
lhld xei ; We found ei without c
inx h
shld xei
xchg ; Restore scan pointer to HL
jmp start
;;; We found a 'c'
findc: mov a,m ; Get following character
cpi 'e' ; Is it 'e'?
jz findce ; Then we have 'ce'
cpi 'i' ; Is it 'i'?
jz findci ; Then we have 'ci'
jmp start ; Otherwise, just keep going
findce: mov d,h ; set DE = start of 'e?'
mov e,l
inx d ; Get next character
ldax d
cpi 'i' ; Is it 'i'?
jnz start ; If not, do nothing
lhld cei ; But if so, we found 'cei'
inx h ; Increment the counter
shld cei
xchg ; Keep scanning _after_ the 'cei'
inx h
jmp start
findci: mov d,h ; set DE = start of 'i?'
mov e,l
inx d ; Get next character
ldax d
cpi 'e' ; Is it 'e'?
jnz start ; If not, do nothing
lhld cie ; But if so, we found 'cie'
inx h ; Increment the counter
shld cie
xchg ; Keep scanning _after_ the 'cie'
inx h
jmp start
w_end: lxi b,curwrd ; Set word pointer to beginning
pop h ; Restore file read address
ret
;;; Print error message and stop the program
die: lxi d,errmsg
mvi c,puts
call bdos
rst 0
;;; Print string
sout: mvi c,puts
jmp bdos
;;; Print HL to the console as a decimal number
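puthl: push h
lxi h,num
xthl ; HL = number, stack = output pointer
lxi b,-10
dgt: lxi d,-1
clcdgt: inx d
dad b
jc clcdgt
mov a,l
adi 10+'0'
xthl ; HL = output pointer
dcx h
mov m,a
xthl ; Put output pointer back on the stack
xchg ; HL = quotient
mov a,h
ora l
jnz dgt
pop d
mvi c,puts
jmp bdos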
errmsg: db 'Error$' ; Good enough
s_ienc: db 'I before E when not preceded by C:$'
s_eic: db 'E before I when preceded by C:$'
snop: db ' not'
splau: db ' plausible',CR,LF,'$'
scie: db 'CIE: $' ; Report strings
sxie: db 'xIE: $'
scei: db 'CEI: $'
sxei: db 'xEI: $'
db '00000'
num: db CR,LF,'$' ; Space for number
;;; Counters
xie: dw 0 ; I before E when not preceded by C
cie: dw 0 ; I before E when preceded by C
cei: dw 0 ; E before I when preceded by C
xei: dw 0 ; E before I when not preceded by C
curwrd: equ $ ; Current word stored here</syntaxhighlight>


<pre>A>iec unixdict.txt
CIE: 24
xIE: 217
CEI: 13
xEI: 464
I before E when not preceded by C: plausible
E before I when preceded by C: not plausible</pre>
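
The block-at-a-time reading described for the 8080 Assembly entry can be sketched in Python for comparison. This is a loose illustration, not a translation: the name <code>iter_words</code> is an assumption made for this example, and the CP/M end-of-file character (26) that the assembly checks for is not modelled here.

```python
import io


def iter_words(stream, block_size=128):
    """Yield one word per line while reading `stream` in fixed-size blocks."""
    pending = b""                      # partial word carried across blocks
    while True:
        block = stream.read(block_size)
        if not block:
            break
        pending += block
        # Everything before the last newline is complete; the rest is carried over.
        *complete, pending = pending.split(b"\n")
        for word in complete:
            yield word.decode("ascii", errors="replace").rstrip("\r")
    if pending:                        # final word without a trailing newline
        yield pending.decode("ascii", errors="replace").rstrip("\r")


data = io.BytesIO(b"ceiling\nbelief\nscience")
print(list(iter_words(data, block_size=4)))
```

Only the current block and the unfinished word are held in memory, so the size of the word list does not matter.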

=={{header|ALGOL 68}}==
{{works with|ALGOL 68G|Any - tested with release 2.8.3.win32}} Uses the non-standard procedure <code>to lower</code>, which is available in Algol 68G.
<syntaxhighlight lang="algol68"># tests the plausibility of "i before e except after c" using unixdict.txt #

# implements the plausibility test specified by the task #
# returns TRUE if with > 2 * without #
PROC plausible = ( INT with, without )BOOL: with > 2 * without;

# shows the plausibility of with and without #
PROC show plausibility = ( STRING legend, INT with, without )VOID:
print( ( legend, IF plausible( with, without ) THEN " is plausible" ELSE " is not plausible" FI, newline ) );

IF FILE input file;
STRING file name = "unixdict.txt";
open( input file, file name, stand in channel ) /= 0
# failed to open the file #
print( ( "Unable to open """ + file name + """", newline ) )
# file opened OK #
BOOL at eof := FALSE;
# set the EOF handler for the file #
on logical file end( input file, ( REF FILE f )BOOL:
# note that we reached EOF on the #
# latest read #
at eof := TRUE;
# return TRUE so processing can continue #
INT cei := 0;
INT xei := 0;
INT cie := 0;
INT xie := 0;
get( input file, ( word, newline ) );
NOT at eof
# examine the word for cie, xie (x /= c), cei and xei (x /= c) #
FOR pos FROM LWB word TO UPB word DO word[ pos ] := to lower( word[ pos ] ) OD;
IF word = "ie" THEN
xie +:= 1
ELIF word = "ei" THEN
xei +:= 1
INT length = ( UPB word - LWB word ) + 1;
IF length > 1 THEN
IF word[ LWB word ] = "i" AND word[ LWB word + 1 ] = "e" THEN
# word starts ie #
xie +:= 1
ELIF word[ LWB word ] = "e" AND word[ LWB word + 1 ] = "i" THEN
# word starts ei #
xei +:= 1
FOR pos FROM LWB word + 1 TO UPB word - 1 DO
IF word[ pos ] = "i" AND word[ pos + 1 ] = "e" THEN
# have i before e, check the preceding character #
IF word[ pos - 1 ] = "c" THEN cie ELSE xie FI +:= 1
ELIF word[ pos ] = "e" AND word[ pos + 1 ] = "i" THEN
# have e before i, check the preceding character #
IF word[ pos - 1 ] = "c" THEN cei ELSE xei FI +:= 1
# close the file #
close( input file );

# test the hypothesis #
print( ( "cie occurrences: ", whole( cie, 0 ), newline ) );
print( ( "xie occurrences: ", whole( xie, 0 ), newline ) );
print( ( "cei occurrences: ", whole( cei, 0 ), newline ) );
print( ( "xei occurrences: ", whole( xei, 0 ), newline ) );
show plausibility( "i before e except after c", xie, cie );
show plausibility( "e before i except after c", xei, cei );
show plausibility( "i before e when after c", cie, xie );
show plausibility( "e before i when after c", cei, xei );
show plausibility( "i before e in general", xie + cie, xei + cei );
show plausibility( "e before i in general", xei + cei, xie + cie )</syntaxhighlight>
<pre>cie occurrences: 24
xie occurrences: 466
cei occurrences: 13
xei occurrences: 217
i before e except after c is plausible
e before i except after c is plausible
i before e when after c is not plausible
e before i when after c is not plausible
i before e in general is plausible
e before i in general is not plausible</pre>
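
The ALGOL 68 entry above counts every occurrence of a letter pair rather than counting each word once, which is one reason its totals differ slightly from those of other entries. A minimal Python sketch of that occurrence-based counting follows; the name <code>count_occurrences</code> and the sample words are assumptions made for this example.

```python
def count_occurrences(words):
    """Count every 'ie'/'ei' pair, keyed by whether a 'c' immediately precedes it."""
    counts = {"cie": 0, "xie": 0, "cei": 0, "xei": 0}
    for word in words:
        word = word.lower()
        for pos in range(len(word) - 1):
            pair = word[pos:pos + 2]
            if pair not in ("ie", "ei"):
                continue
            # A pair at the start of a word has no preceding letter, so it counts as "x".
            prefix = "c" if pos > 0 and word[pos - 1] == "c" else "x"
            counts[prefix + pair] += 1
    return counts


print(count_occurrences(["science", "their", "receive", "believe", "ie", "societies"]))
```

In this sample "societies" contributes to both <code>cie</code> and <code>xie</code>, because each of its two "ie" pairs is counted separately.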

Ignoring the fact that all exceptions to the rule in unixdict.txt occur where the rule doesn't apply anyway, such as in diphthongs, adjacent syllables, foreign or borrowed words, ''etc.'':


<syntaxhighlight lang="applescript">on ibeeac()
script o
property wordList : words of (read file ((path to desktop as text) & "www.rosettacode.org:unixdict.txt") as «class utf8»)
-- Subhandler called if thisWord contains either "ie" or "ei". Checks if there's an instance not preceded by "c".
on testWithoutC(thisWord, letterPair)
set AppleScript's text item delimiters to letterPair
repeat with i from 1 to (count thisWord's text items) - 1
if (text item i of thisWord does not end with "c") then return true
end repeat
return false
end testWithoutC
end script
-- Counters: {i before e not after c, i before e after c, e before i not after c, e before i after c}.
set {xie, cie, xei, cei} to {0, 0, 0, 0}
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "ie"
repeat with thisWord in o's wordList
set thisWord to thisWord's contents
if (thisWord contains "ie") then
if (thisWord contains "cie") then set cie to cie + 1
if (o's testWithoutC(thisWord, "ie")) then set xie to xie + 1
end if
if (thisWord contains "ei") then
if (thisWord contains "cei") then set cei to cei + 1
if (o's testWithoutC(thisWord, "ei")) then set xei to xei + 1
end if
end repeat
set AppleScript's text item delimiters to astid
set |1 is plausible| to (xie / cie > 2)
set |2 is plausible| to (cei / xei > 2)
return {|"I before E not after C" is plausible|:|1 is plausible|} & ¬
{|"E before I after C" is plausible|:|2 is plausible|} & ¬
{|Both are plausible|:(|1 is plausible| and |2 is plausible|)}
end ibeeac</syntaxhighlight>


<syntaxhighlight lang="applescript">{|"I before E not after C" is plausible|:true, |"E before I after C" is plausible|:false, |Both are plausible|:false}</syntaxhighlight>


<syntaxhighlight lang="applescript">use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

on ibeeac()
set wordList to words of ¬
(read (((path to desktop as text) & "www.rosettacode.org:unixdict.txt") as «class furl») as «class utf8»)
set wordArray to current application's class "NSArray"'s arrayWithArray:(wordList)
set counters to {}
repeat with letterPair in {"ie", "ei"}
set filter to (current application's class "NSPredicate"'s ¬
predicateWithFormat_("(self CONTAINS[c] %@)", letterPair))
set relevants to (wordArray's filteredArrayUsingPredicate:(filter))
set filter to (current application's class "NSPredicate"'s ¬
predicateWithFormat_("NOT (self CONTAINS[c] %@)", "c" & letterPair))
set end of counters to (relevants's filteredArrayUsingPredicate:(filter))'s |count|()
set filter to (current application's class "NSPredicate"'s ¬
predicateWithFormat_("(self CONTAINS[c] %@)", "c" & letterPair))
set end of counters to (relevants's filteredArrayUsingPredicate:(filter))'s |count|()
end repeat
set {xie, cie, xei, cei} to counters
set |1 is plausible| to (xie / cie > 2)
set |2 is plausible| to (cei / xei > 2)
return {|"I before E not after C" is plausible|:|1 is plausible|} & ¬
{|"E before I after C" is plausible|:|2 is plausible|} & ¬
{|Both are plausible|:(|1 is plausible| and |2 is plausible|)}
end ibeeac</syntaxhighlight>


<syntaxhighlight lang="applescript">{|"I before E not after C" is plausible|:true, |"E before I after C" is plausible|:false, |Both are plausible|:false}</syntaxhighlight>

<syntaxhighlight lang="applescript">use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

---------------------- TEST OF CLAIMS --------------------
on run
set fpWordList to scriptFolder() & "unixdict.txt"
if doesFileExist(fpWordList) then
set patterns to {"[^c]ie", "[^c]ei", "cei", "cie"}
set counts to ap(map(matchCount, patterns), ¬
script test
on |λ|(kvs)
set {common, rare} to kvs
set {ck, cv} to common
set {rk, rv} to rare
set ratio to roundTo(2, cv / rv)
if ratio > 2 then
set verdict to "plausible"
else
set verdict to "unsupported"
end if
unwords({ck, ">", rk, "->", cv, "/", rv, ¬
"=", ratio, "::", verdict})
end |λ|
end script
unlines(map(test, chunksOf(2, zip(patterns, counts))))
else
display dialog "Word list not found in this script's folder:" & ¬
linefeed & tab & fpWordList
end if
end run

------------------------- GENERIC ------------------------

-- Tuple (,) :: a -> b -> (a, b)
on Tuple(a, b)
-- Constructor for a pair of values, possibly of two different types.
{a, b}
end Tuple

-- ap (<*>) :: [(a -> b)] -> [a] -> [b]
on ap(fs, xs)
-- e.g. [(*2),(/2), sqrt] <*> [1,2,3]
-- --> ap([dbl, hlf, root], [1, 2, 3])
-- --> [2,4,6,0.5,1,1.5,1,1.4142135623730951,1.7320508075688772]
-- Each member of a list of functions applied to
-- each of a list of arguments, deriving a list of new values
set lst to {}
repeat with f in fs
tell mReturn(contents of f)
repeat with x in xs
set end of lst to |λ|(contents of x)
end repeat
end tell
end repeat
return lst
end ap

-- chunksOf :: Int -> [a] -> [[a]]
on chunksOf(k, xs)
script
on go(ys)
set ab to splitAt(k, ys)
set a to item 1 of ab
if {} ≠ a then
{a} & go(item 2 of ab)
else
a
end if
end go
end script
result's go(xs)
end chunksOf

-- doesFileExist :: FilePath -> IO Bool
on doesFileExist(strPath)
set ca to current application
set oPath to (ca's NSString's stringWithString:strPath)'s ¬
set {bln, int} to (ca's NSFileManager's defaultManager's ¬
fileExistsAtPath:oPath isDirectory:(reference))
bln and (int ≠ 1)
end doesFileExist

-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
-- The list obtained by applying f
-- to each element of xs.
tell mReturn(f)
set lng to length of xs
set lst to {}
repeat with i from 1 to lng
set end of lst to |λ|(item i of xs, i, xs)
end repeat
return lst
end tell
end map

-- matchCount :: String -> NSString -> Int
on matchCount(regexString)
-- A count of the matches for a regular expression
-- in a given NSString
script
on |λ|(s)
set ca to current application
((ca's NSRegularExpression's ¬
regularExpressionWithPattern:regexString ¬
options:(ca's NSRegularExpressionAnchorsMatchLines) ¬
|error|:(missing value))'s ¬
numberOfMatchesInString:s ¬
options:0 ¬
range:{location:0, |length|:s's |length|()}) as integer
end |λ|
end script
end matchCount

-- min :: Ord a => a -> a -> a
on min(x, y)
if y < x then
y
else
x
end if
end min

-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
-- 2nd class handler function lifted into 1st class script wrapper.
if script is class of f then
f
else
script
property |λ| : f
end script
end if
end mReturn

-- readFile :: FilePath -> IO NSString
on readFile(strPath)
set ca to current application
set e to reference
set {s, e} to (ca's NSString's ¬
stringWithContentsOfFile:((ca's NSString's ¬
stringWithString:strPath)'s ¬
stringByStandardizingPath) ¬
encoding:(ca's NSUTF8StringEncoding) |error|:(e))
if missing value is e then
(localizedDescription of e) as string
end if
end readFile

-- roundTo :: Int -> Float -> Float
on roundTo(n, x)
set d to 10 ^ n
(round (x * d)) / d
end roundTo

-- scriptFolder :: () -> IO FilePath
on scriptFolder()
-- The path of the folder containing this script
tell application "Finder" to ¬
POSIX path of ((container of (path to me)) as alias)
end scriptFolder

-- splitAt :: Int -> [a] -> ([a], [a])
on splitAt(n, xs)
if n > 0 and n < length of xs then
if class of xs is text then
{items 1 thru n of xs as text, ¬
items (n + 1) thru -1 of xs as text}
else
{items 1 thru n of xs, items (n + 1) thru -1 of xs}
end if
else
if n < 1 then
{{}, xs}
else
{xs, {}}
end if
end if
end splitAt

-- unlines :: [String] -> String
on unlines(xs)
-- A single string formed by the intercalation
-- of a list of strings with the newline character.
set {dlm, my text item delimiters} to ¬
{my text item delimiters, linefeed}
set s to xs as text
set my text item delimiters to dlm
s
end unlines

-- unwords :: [String] -> String
on unwords(xs)
set {dlm, my text item delimiters} to ¬
{my text item delimiters, space}
set s to xs as text
set my text item delimiters to dlm
return s
end unwords

-- zip :: [a] -> [b] -> [(a, b)]
on zip(xs, ys)
zipWith(Tuple, xs, ys)
end zip

-- zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
on zipWith(f, xs, ys)
set lng to min(length of xs, length of ys)
set lst to {}
if 1 > lng then
return {}
else
tell mReturn(f)
repeat with i from 1 to lng
set end of lst to |λ|(item i of xs, item i of ys)
end repeat
return lst
end tell
end if
end zipWith</syntaxhighlight>
<pre>[^c]ie > [^c]ei -> 466 / 217 = 2.15 :: plausible
cei > cie -> 13 / 24 = 0.54 :: unsupported</pre>

<syntaxhighlight lang="arturo">rule1: {"I before E when not preceded by C"}
rule2: {"E before I when preceded by C"}
phrase: {"I before E except after C"}

plausibility: #[
false: "not plausible",
true: "plausible"
]

checkPlausible: function [rule, count1, count2][
result: count1 > 2 * count2
print ["The rule" rule "is" plausibility\[result] ":"]
print ["\tthere were" count1 "examples and" count2 "counter-examples."]
return result
]

words: read.lines relative "unixdict.txt"

[nie,cie,nei,cei]: 0

loop words 'word [
if contains? word "ie" ->
inc (contains? word "cie")? -> 'cie -> 'nie
if contains? word "ei" ->
inc (contains? word "cei")? -> 'cei -> 'nei
]

p1: checkPlausible rule1 nie nei
p2: checkPlausible rule2 cei cie

print ["\nSo the phrase" phrase "is" (to :string plausibility\[and? p1 p2]) ++ "."]</syntaxhighlight>


<pre>The rule "I before E when not preceded by C" is plausible :
there were 465 examples and 213 counter-examples.
The rule "E before I when preceded by C" is not plausible :
there were 13 examples and 24 counter-examples.

So the phrase "I before E except after C" is not plausible.</pre>

<syntaxhighlight lang="autohotkey">WordList := URL_ToVar("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
WordList := RegExReplace(WordList, "i)cie", "", cieN)
WordList := RegExReplace(WordList, "i)cei", "", ceiN)
return, WebRequest.ResponseText
<pre>"I before E when not preceded by C" is plausible.

<syntaxhighlight lang="awk">#!/usr/bin/awk -f

/.ei/ {nei+=cnt($3)}
print "E before I when preceded by C: is"v2" plausible";
print "Overall rule is"v" plausible";

=={{header|Batch File}}==
First download the text file, then put it in the same directory as this sample code:
<syntaxhighlight lang="dos">::I before E except after C task from Rosetta Code Wiki
::Batch File Implementation

exit /b 0</syntaxhighlight>
<pre>Plausibility of "I before E when not preceded by C": TRUE (465 VS 213)
Overall plausibility of "I before E EXCEPT after C": FALSE
Press any key to continue . . .</pre>

==== '''Fast solution using standard external commands FINDSTR and FIND:''' ====
Each word is counted once if it contains at least one occurrence of the test string (a word with two or more occurrences still counts only once).
The same word may count toward different categories.
<syntaxhighlight lang="dos">@echo off
setlocal enableDelayedExpansion
for /f %%A in ('findstr /i "^ie [^c]ie" unixdict.txt ^| find /c /v ""') do set Atrue=%%A
for /f %%A in ('findstr /i "^ei [^c]ei" unixdict.txt ^| find /c /v ""') do set Afalse=%%A
for /f %%A in ('findstr /i "[c]ei" unixdict.txt ^| find /c /v ""') do set Btrue=%%A
for /f %%A in ('findstr /i "[c]ie" unixdict.txt ^| find /c /v ""') do set Bfalse=%%A
set /a "Aresult=Atrue/Afalse/2, Bresult=Btrue/Bfalse/2, Result=^!^!Aresult*Bresult"
set "Answer1=Plausible" & set "Answer0=Implausible"
echo I before E when not preceded by C: True=%Atrue% False=%Afalse% : !Answer%Aresult%!
echo E before I when preceded by C: True=%Btrue% False=%Bfalse% : !Answer%Bresult%!
echo I before E, except after C : !Answer%Result%!</syntaxhighlight>
<pre>I before E when not preceded by C: True=465 False=213 : Plausible
E before I when preceded by C: True=13 False=24 : Implausible
I before E, except after C : Implausible</pre>
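
The per-word counting rule used by the FINDSTR solution above (each word counts at most once per category, but may count in several categories) can be sketched in Python as follows. The names <code>CATEGORIES</code> and <code>count_words</code>, and the sample words, are assumptions made for this example; the regular expressions only approximate the FINDSTR search strings.

```python
import re

# One pattern per category, mirroring the four FINDSTR searches.
CATEGORIES = {
    "A_true": re.compile(r"(^|[^c])ie", re.IGNORECASE),   # ie not after c
    "A_false": re.compile(r"(^|[^c])ei", re.IGNORECASE),  # ei not after c
    "B_true": re.compile(r"cei", re.IGNORECASE),
    "B_false": re.compile(r"cie", re.IGNORECASE),
}


def count_words(words):
    """Count the words matching each category; a word counts once per category."""
    return {
        name: sum(1 for word in words if pattern.search(word))
        for name, pattern in CATEGORIES.items()
    }


print(count_words(["societies", "science", "receive", "their", "deficiencies"]))
```

Here "societies" is counted under both <code>A_true</code> and <code>B_false</code>, while "deficiencies" adds only one to <code>B_false</code> despite containing "cie" twice.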

==== '''Stretch solution using standard external command FINDSTR:''' ====
Each word frequency is included once if the word contains at least one occurrence of the test string (a word with two or more occurrences is still included only once).
The same word frequency may count toward different categories.
<syntaxhighlight lang="dos">@echo off
setlocal enableDelayedExpansion
set /a Atrue=Afalse=Btrue=Bfalse=0
for /f "tokens=3*" %%A in ('findstr /i "[^c]ie" 1_2_all_freq.txt') do if "%%B" equ "" set /a Atrue+=%%A
for /f "tokens=3*" %%A in ('findstr /i "[^c]ei" 1_2_all_freq.txt') do if "%%B" equ "" set /a Afalse+=%%A
for /f "tokens=3*" %%A in ('findstr /i "[c]ei" 1_2_all_freq.txt') do if "%%B" equ "" set /a Btrue+=%%A
for /f "tokens=3*" %%A in ('findstr /i "[c]ie" 1_2_all_freq.txt') do if "%%B" equ "" set /a Bfalse+=%%A
set /a "Aresult=Atrue/Afalse/2, Bresult=Btrue/Bfalse/2, Result=^!^!Aresult*Bresult"
set "Answer1=Plausible" & set "Answer0=Implausible"
echo I before E when not preceded by C: True=%Atrue% False=%Afalse% : !Answer%Aresult%!
echo E before I when preceded by C: True=%Btrue% False=%Bfalse% : !Answer%Bresult%!
echo I before E, except after C : !Answer%Result%!</syntaxhighlight>
<pre>I before E when not preceded by C: True=8192 False=4826 : Implausible
E before I when preceded by C: True=327 False=994 : Implausible
I before E, except after C : Implausible</pre>

<syntaxhighlight lang="basic">10 DEFINT A-Z
80 PRINT "xIE:";XI
100 PRINT "xEI:";XE
120 PRINT "I before E when not preceded by C: ";
130 IF 2*XI <= CI THEN PRINT "not ";
140 PRINT "plausible."
150 PRINT "E before I when preceded by C: ";
160 IF 2*CE <= XE THEN PRINT "not ";
170 PRINT "plausible."</syntaxhighlight>
<pre>CIE: 24
xIE: 465
CEI: 13
xEI: 213

I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.</pre>

<syntaxhighlight lang="freebasic">CI = 0 : XI = 0 : CE = 0 : XE = 0
open 1, "unixdict.txt"

pal$ = readline (1)
if instr(pal$, "ie") then
if instr(pal$, "cie") then CI += 1 else XI += 1
if instr(pal$, "ei") then
if instr(pal$, "cei") then CE += 1 else XE += 1
until eof(1)
close 1

print "CIE: "; CI
print "xIE: "; XI
print "CEI: "; CE
print "xEI: "; XE
print "I before E when not preceded by C: ";
if 2 * XI <= CI then print "not ";
print "plausible."
print "E before I when preceded by C: ";
if 2 * CE <= XE then print "not ";
print "plausible."</syntaxhighlight>

=={{header|BBC BASIC}}==
{{works with|BBC BASIC for Windows}}
<syntaxhighlight lang="bbcbasic"> F%=OPENIN"unixdict.txt"
IF F% == 0 ERROR 100, "unixdict not found!"

CI=0 : XI=0 : CE=0 : XE=0
P%=INSTR(Line$, "ie")
IF MID$(Line$, P% - 1, 1) == "c" CI+=1 ELSE XI+=1
P%=INSTR(Line$, "ie", P% + 1)
P%=INSTR(Line$, "ei")
IF MID$(Line$, P% - 1, 1) == "c" CE+=1 ELSE XE+=1
P%=INSTR(Line$, "ei", P% + 1)

PRINT "Instances of 'ie', preceded by a 'c' = ";CI
PRINT "Instances of 'ie', NOT preceded by a 'c' = ";XI
P1%=XI * 2 > CI
PRINT "Therefore 'I before E when not preceded by C' is" FNTest(P1%)

PRINT "Instances of 'ei', preceded by a 'c' = ";CE
PRINT "Instances of 'ei', NOT preceded by a 'c' = ";XE
P2%=CE * 2 > XE
PRINT "Therefore 'E before I when preceded by C' is" FNTest(P2%)

PRINT "oth sub-phrases are plausible, therefore the phrase " +\
\ "'I before E, except after C' can be said to be" FNTest(P1% AND P2%) "!"

DEF FNTest(plausible%)=MID$(" not plausible", 1 - 4 * plausible%)</syntaxhighlight>
<pre>Instances of 'ie', preceded by a 'c' = 24
Instances of 'ie', NOT preceded by a 'c' = 466
Therefore 'I before E when not preceded by C' is plausible

Instances of 'ei', preceded by a 'c' = 13
Instances of 'ei', NOT preceded by a 'c' = 217
Therefore 'E before I when preceded by C' is not plausible

Not both sub-phrases are plausible, therefore the phrase 'I before E, except after C' can be said to be not plausible!</pre>

<syntaxhighlight lang="bcpl">get "libhdr"

// Read word from selected input
let readword(v) = valof
$( let ch = ?
v%0 := 0
$( ch := rdch()
if ch = endstreamch then resultis false
if ch = '*N' then resultis true
v%0 := v%0 + 1
v%(v%0) := ch
$) repeat

// Does s1 contain s2?
let contains(s1, s2) = valof
$( for i = 1 to s1%0 - s2%0 + 1
if valof
$( for j = 1 to s2%0
unless s1%(i+j-1) = s2%j resultis false
resultis true
$) resultis true
resultis false

// Test unixdict.txt
let start() be
$( let word = vec 2+64/BYTESPERWORD
let file = findinput("unixdict.txt")
let ncie, ncei, nxie, nxei = 0, 0, 0, 0
while readword(word)
test contains(word, "ie")
test contains(word, "cie")
do ncie := ncie + 1
or nxie := nxie + 1
or if contains(word, "ei")
test contains(word, "cei")
do ncei := ncei + 1
or nxei := nxei + 1
// Show results
writef("CIE: %N*N", ncie)
writef("xIE: %N*N", nxie)
writef("CEI: %N*N", ncei)
writef("xEI: %N*N", nxei)
writef("I before E when not preceded by C: %Splausible.*N",
2*nxie > ncie -> "", "not ")
writef("E before I when preceded by C: %Splausible.*N",
2*ncei > nxei -> "", "not ")</syntaxhighlight>
<pre>CIE: 24
xIE: 465
CEI: 13
xEI: 209
I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.</pre>

This may in turn motivate me to provide a second J solution as a single pass FSM.
Please find the program output hidden at the top of the source as part of the build and example run.
<syntaxhighlight lang="c">
return 0;

=={{header|C sharp|C#}}==
<syntaxhighlight lang="csharp">using System;
using System.Collections.Generic;
using System.IO;

namespace IBeforeE {
    class Program {
        // True if the word is a counter-example to the rule.
        static bool IsOppPlausibleWord(string word) {
            if (!word.Contains("c") && word.Contains("ei")) {
                return true;
            }
            if (word.Contains("cie")) {
                return true;
            }
            return false;
        }

        // True if the word supports the rule.
        static bool IsPlausibleWord(string word) {
            if (!word.Contains("c") && word.Contains("ie")) {
                return true;
            }
            if (word.Contains("cei")) {
                return true;
            }
            return false;
        }

        static bool IsPlausibleRule(string filename) {
            IEnumerable<string> wordSource = File.ReadLines(filename);
            int trueCount = 0;
            int falseCount = 0;

            foreach (string word in wordSource) {
                if (IsPlausibleWord(word)) {
                    trueCount++;
                }
                else if (IsOppPlausibleWord(word)) {
                    falseCount++;
                }
            }

            Console.WriteLine("Plausible count: {0}", trueCount);
            Console.WriteLine("Implausible count: {0}", falseCount);
            return trueCount > 2 * falseCount;
        }

        static void Main(string[] args) {
            if (IsPlausibleRule("unixdict.txt")) {
                Console.WriteLine("Rule is plausible.");
            }
            else {
                Console.WriteLine("Rule is not plausible.");
            }
        }
    }
}</syntaxhighlight>
<pre>Plausible count: 384
Implausible count: 204
Rule is not plausible.</pre>

Line 193: Line 1,226:
:* (Test used 4.4, so only a limited number of C++11 features were used.)
:* (Test used 4.4, so only a limited number of C++11 features were used.)

<lang cpp>#include <iostream>
<syntaxhighlight lang="cpp">#include <iostream>
#include <fstream>
#include <fstream>
#include <string>
#include <string>
Line 294: Line 1,327:
return 0;
return 0;

Line 304: Line 1,337:
Overall plausibility: no
Overall plausibility: no


The output here was generated with the files as of 21st June 2016.

<syntaxhighlight lang="clojure">
(ns i-before-e.core
(:require [clojure.string :as s])

(def patterns {:cie #"cie" :ie #"(?<!c)ie" :cei #"cei" :ei #"(?<!c)ei"})

(defn update-counts
"Given a map of counts of matching patterns and a word, increment any count if the word matches its pattern."
[counts [word freq]]
(apply hash-map (mapcat (fn [[k v]] [k (if (re-seq (patterns k) word) (+ freq v) v)]) counts)))

(defn count-ie-ei-combinations
"Update counts of all ie and ei combinations"
(reduce update-counts {:ie 0 :cie 0 :ei 0 :cei 0} words))

(defn apply-freq-1
"Apply a frequency of one to words"
(map #(vector % 1) words))

(defn- format-plausible
(if plausible? "plausible" "implausible"))

(defn- apply-rule [desc examples contra]
(let [plausible? (<= (* 2 contra) examples)]
(println (format "The sub rule %s is %s. There are %d examples and %d counter-examples.\n" desc (format-plausible plausible?) examples contra))

(defn i-before-e-except-after-c-plausible?
"Check if i before e after c plausible?"
[description words]
(println description)
(let [counts (count-ie-ei-combinations words)
subrule1 (apply-rule "I before E when not preceeded by C" (:ie counts) (:ei counts))
subrule2 (apply-rule "E before I when preceeded by C" (:cei counts) (:cie counts))
rule (and subrule1 subrule2)]
(println (format "Overall the rule 'I before E except after C' is %s" (format-plausible rule)))

(defn format-freq-line [line] (letfn [(format-line [xs] [(first xs) (read-string (last xs))])]
(-> line
(s/split #"\s")

(defn -main []
(with-open [rdr (clojure.java.io/reader "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")]
(i-before-e-except-after-c-plausible? "Check unixdist list" (apply-freq-1 (line-seq rdr))))
(with-open [rdr (clojure.java.io/reader "http://ucrel.lancs.ac.uk/bncfreq/lists/1_2_all_freq.txt")]
(i-before-e-except-after-c-plausible? "Word frequencies (stretch goal)" (map format-freq-line (drop 1 (line-seq rdr))))))

lein run
Check unixdist list
The sub rule I before E when not preceeded by C is plausible. There are 465 examples and 213 counter-examples.

The sub rule E before I when preceeded by C is implausible. There are 13 examples and 24 counter-examples.

Overall the rule 'I before E except after C' is implausible
Word frequencies (stretch goal)
The sub rule I before E when not preceeded by C is implausible. There are 8192 examples and 4826 counter-examples.

The sub rule E before I when preceeded by C is implausible. There are 327 examples and 994 counter-examples.

Overall the rule 'I before E except after C' is implausible

<syntaxhighlight lang="clu">report = cluster is new, classify, results
rep = record[cie, xie, cei, xei, words: int]
new = proc () returns (cvt)
return(rep${cie: 0, xie: 0, cei: 0, xei: 0, words: 0})
end new
classify = proc (r: cvt, word: string)
r.words := r.words + 1
if string$indexs("ie", word) ~= 0 then
if string$indexs("cie", word) ~= 0
then r.cie := r.cie + 1
else r.xie := r.xie + 1
elseif string$indexs("ei", word) ~= 0 then
if string$indexs("cei", word) ~= 0
then r.cei := r.cei + 1
else r.xei := r.xei + 1
end classify
stat = proc (s: stream, name: string, val: int)
stream$puts(s, name)
stream$puts(s, ": ")
stream$putl(s, int$unparse(val))
end stat
plausible = proc (s: stream, feature: string, match, nomatch: int)
returns (bool)
stream$puts(s, feature)
stream$puts(s, ": ")
plaus: bool := 2 * match > nomatch;
if ~plaus then stream$puts(s, "not ") end
stream$putl(s, "plausible.");
end plausible
results = proc (r: cvt) returns (string)
ss: stream := stream$create_output()
stat(ss, "Amount of words", r.words)
stat(ss, "CIE", r.cie)
stat(ss, "xIE", r.xie)
stat(ss, "CEI", r.cei)
stat(ss, "xEI", r.xei)
stream$putl(ss, "")
xie_p: bool := plausible(ss, "I before E when not preceded by C", r.xie, r.cie)
cei_p: bool := plausible(ss, "E before I when preceded by C", r.cei, r.xei)
stream$puts(ss, "I before E, except after C: ")
if ~(xie_p & cei_p) then stream$puts(ss, "not ") end
stream$putl(ss, "plausible.")
end results
end report

lines = iter (s: stream) yields (string)
while true do
except when end_of_file: break end
end lines

start_up = proc ()
po: stream := stream$primary_output()
file: file_name := file_name$parse("unixdict.txt")
fstream: stream := stream$open(file, "read")
r: report := report$new()
for line: string in lines(fstream) do
report$classify(r, line)
stream$puts(po, report$results(r))
end start_up </syntaxhighlight>
<pre>Amount of words: 25104
CIE: 24
xIE: 465
CEI: 13
xEI: 209

I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.
I before E, except after C: not plausible.</pre>

Line 311: Line 1,506:
Now we can do the task:
Now we can do the task:

<lang coco>ie-npc = ei-npc = ie-pc = ei-pc = 0
<syntaxhighlight lang="coco">ie-npc = ei-npc = ie-pc = ei-pc = 0
for word of dict.toLowerCase!.match /\S+/g
for word of dict.toLowerCase!.match /\S+/g
++ie-npc if /(^|[^c])ie/.test word
++ie-npc if /(^|[^c])ie/.test word
Line 323: Line 1,518:
console.log '(1) is%s plausible.', if p1 then '' else ' not'
console.log '(1) is%s plausible.', if p1 then '' else ' not'
console.log '(2) is%s plausible.', if p2 then '' else ' not'
console.log '(2) is%s plausible.', if p2 then '' else ' not'
console.log 'The whole phrase is%s plausible.', if p1 and p2 then '' else ' not'</lang>
console.log 'The whole phrase is%s plausible.', if p1 and p2 then '' else ' not'</syntaxhighlight>

=={{header|Common Lisp}}==
=={{header|Common Lisp}}==

<lang lisp>
<syntaxhighlight lang="lisp">
(defun test-rule (rule-name examples counter-examples)
(defun test-rule (rule-name examples counter-examples)
(let ((plausible (if (> examples (* 2 counter-examples)) 'plausible 'not-plausible)))
(let ((plausible (if (> examples (* 2 counter-examples)) 'plausible 'not-plausible)))
Line 362: Line 1,557:
(plausibility "Dictionary" #p"unixdict.txt" #'parse-dict)
(plausibility "Dictionary" #p"unixdict.txt" #'parse-dict)
(plausibility "Word frequencies (stretch goal)" #p"1_2_all_freq.txt" #'parse-freq)
(plausibility "Word frequencies (stretch goal)" #p"1_2_all_freq.txt" #'parse-freq)

Line 377: Line 1,572:
Overall the rule is NOT-PLAUSIBLE
Overall the rule is NOT-PLAUSIBLE

The stretch goal has not been attempted.
<syntaxhighlight lang="d">import std.file;
import std.stdio;

int main(string[] args) {
if (args.length < 2) {
stderr.writeln(args[0], " filename");
return 1;

int cei, cie, ie, ei;
auto file = File(args[1]);
foreach(line; file.byLine) {
auto res = eval(cast(string) line);
cei += res.cei;
cie += res.cie;
ei += res.ei;
ie += res.ie;

writeln("CEI: ", cei, "; CIE: ", cie);
writeln("EI: ", ei, "; IE: ", ie);

writeln("'I before E when not preceded by C' is ", verdict(ie, ei));
writeln("'E before I when preceded by C' is ", verdict(cei, cie));

return 0;

string verdict(int a, int b) {
import std.format;
if (a > 2*b) {
return format("plausible with evidence %f", cast(double)a/b);
return format("not plausible with evidence %f", cast(double)a/b);

struct Evidence {
int cei;
int cie;
int ei;
int ie;

Evidence eval(string word) {
enum State {

State state;
Evidence cnt;
for(int i=0; i<word.length; ++i) {
char c = word[i];
switch(state) {
case State.START:
if (c == 'c') {
state = State.C;
if (c == 'e') {
state = State.E;
if (c == 'i') {
state = State.I;
case State.C:
if (c == 'e') {
state = State.CE;
} else if (c == 'i') {
state = State.CI;
} else if (c != 'c') {
state = State.START;
case State.E:
if (c == 'c') {
state = State.C;
} else if (c == 'i') {
state = State.I;
} else if (c != 'e') {
state = State.START;
case State.I:
if (c == 'c') {
state = State.C;
} else if (c == 'e') {
state = State.E;
} else if (c != 'i') {
state = State.START;
case State.CE:
if (c == 'i') {
state = State.I;
if (c == 'c') {
state = State.C;
state = State.START;
case State.CI:
if (c == 'e') {
state = State.E;
if (c == 'c') {
state = State.C;
state = State.START;
return cnt;

<pre>CEI: 13; CIE: 24
EI: 217; IE: 466
'I before E when not preceded by C' is plausible with evidence 2.147465
'E before I when preceded by C' is not plausible with evidence 0.541667</pre>

{{libheader| System.SysUtils}}
{{libheader| System.IOUtils}}
{{Trans|C sharp}}
<syntaxhighlight lang="delphi">
program I_before_E_except_after_C;

System.SysUtils, System.IOUtils;

function IsOppPlausibleWord(w: string): Boolean;
if ((not w.Contains('c')) and (w.Contains('ei'))) then

if (w.Contains('cie')) then


function IsPlausibleWord(w: string): Boolean;
if ((not w.Contains('c')) and (w.Contains('ie'))) then

if (w.Contains('cei')) then


function IsPlausibleRule(filename: TFileName): Boolean;
words: TArray<string>;
trueCount, falseCount: Cardinal;
w: string;
words := TFile.ReadAllLines(filename, TEncoding.UTF8);
trueCount := 0;
falseCount := 0;

for w in words do
if (IsPlausibleWord(w)) then
else if (IsOppPlausibleWord(w)) then


Writeln('Plausible count: ', trueCount);
Writeln('Implausible count: ', falseCount);

Result := trueCount > 2 * falseCount;


if (IsPlausibleRule('unixdict.txt')) then
Writeln('Rule is plausible.')
Writeln('Rule is not plausible.');


<syntaxhighlight lang="draco">\util.g

/* variables to hold totals for each possibility */
word cie, xie, cei, xei;

/* classify a word and add it to the proper total */
proc nonrec classify(*char w) void:
if CharsIndex(w, "ie") /= -1 then
if CharsIndex(w, "cie") /= -1
then cie := cie + 1
else xie := xie + 1
elif CharsIndex(w, "ei") /= -1 then
if CharsIndex(w, "cei") /= -1
then cei := cei + 1
else xei := xei + 1

/* see if a clause is plausible */
proc nonrec plausible(*char clause; word match, nomatch) bool:
bool p;
p := 2*match > nomatch;
writeln(clause, ": ", if p then "" else "not " fi, "plausible.");

proc nonrec main() void:
file() dict_file;
channel input text dict_ch;
[256] char line;
bool p;
cie := 0;
xie := 0;
cei := 0;
xei := 0;
/* read every word */
open(dict_ch, dict_file, "unixdict.txt");
while readln(dict_ch; &line[0]) do
/* print statistics */
writeln("CIE: ", cie:5);
writeln("xIE: ", xie:5);
writeln("CEI: ", cei:5);
writeln("xEI: ", xei:5);
/* see if the propositions are plausible */
p := plausible("I before E when not preceded by C", xie, cie);
p := plausible("E before I when preceded by C", cei, xei) and p;
writeln("I before E except after C: ",
if p then "" else "not " fi,
<pre>CIE: 24
xIE: 465
CEI: 13
xEI: 209
I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.
I before E except after C: not plausible.</pre>


There are two files, one per hypothesis.

<syntaxhighlight lang="sed">
# i-before-e.ed
# Remove all the non-rule-related words
# Replace the occurrences with one-letter markers
# Remove 1 occurrence of e (alternative) per two i (null)
# Check whether there are more i's in the output (null hypothesis true) or not

<syntaxhighlight lang="sed">
# e-before-i-with-c.ed
# Remove all the non-rule-related words
# Replace the occurrences with one-letter markers
# Remove 1 occurrence of i (alternative) per two e (null)
# Check whether there are more e's in the output (null hypothesis true) or not


<pre>$ cat i-before-e.ed | ed -lEGs unixdict.txt

Has more i's so the "i before e" hypothesis is plausible.

<pre>$ cat e-before-i-with-c.ed | ed -lEGs unixdict.txt

Has more i's, so the "e before i when preceded by c" hypothesis is not plausible.
Thus, the whole rule is not plausible.
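
The cancellation idea described in the script comments can also be sketched outside ed. The Python below is not a translation of the ed scripts. It assumes the marker stream has already been produced, and the function name <code>survives_cancellation</code> is made up for illustration. The marker counts in the demo are the word counts reported by other solutions on this page (465 against 213, and 13 against 24).

```python
def survives_cancellation(markers: str, null: str, alt: str) -> bool:
    """Cancel two `null` markers against one `alt` marker for as long as possible.

    Returns True only when some `null` markers are left and every `alt`
    marker has been cancelled, which is equivalent to
    count(null) > 2 * count(alt).
    """
    null_left = markers.count(null)
    alt_left = markers.count(alt)
    while null_left >= 2 and alt_left >= 1:
        null_left -= 2
        alt_left -= 1
    return null_left > 0 and alt_left == 0


if __name__ == "__main__":
    # First hypothesis: 'i' marks a supporting word, 'e' a counter-example.
    print(survives_cancellation("i" * 465 + "e" * 213, null="i", alt="e"))
    # Second hypothesis: 'e' marks a supporting word, 'i' a counter-example.
    print(survives_cancellation("e" * 13 + "i" * 24, null="e", alt="i"))
```

Cancelling two against one is just another way of writing the "more than two times" test from the task description, so the order of the markers in the stream does not matter.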

<lang elixir>defmodule RC do
<syntaxhighlight lang="elixir">defmodule RC do
def task(path) do
def task(path) do
plausibility_ratio = 2
plausibility_ratio = 2
Line 408: Line 1,961:

path = hd(System.argv)
path = hd(System.argv)
IO.inspect RC.task(path)</lang>
IO.inspect RC.task(path)</syntaxhighlight>

Line 421: Line 1,974:

<lang erlang>
<syntaxhighlight lang="erlang">
Line 446: Line 1,999:
nomatch -> count(T,Pattern, Acc)
nomatch -> count(T,Pattern, Acc)
Line 453: Line 2,006:
Proposition 2. is not plausible: cei 13, cie 24
Proposition 2. is not plausible: cei 13, cie 24
The rule 'is not' plausible
The rule 'is not' plausible

<syntaxhighlight lang="factor">USING: combinators formatting generalizations io.encodings.utf8
io.files kernel literals math prettyprint regexp sequences ;
IN: rosetta-code.i-before-e

: correct ( #correct #incorrect rule-str -- )
pprint " is correct for %d and incorrect for %d.\n" printf ;

: plausibility ( #correct #incorrect -- str )
2 * > "plausible" "implausible" ? ;
: output ( #correct #incorrect rule-str -- )
[ correct ] curry
[ plausibility "This is %s.\n\n" printf ] 2bi ;
"unixdict.txt" utf8 file-lines ${
R/ cei/ R/ cie/ R/ [^c]ie/ R/ [^c]ei/
[ count-matches ]
[ map-sum ]
[ 4 apply-curry ] bi@
} cleave

"I before E when not preceded by C"
"E before I when preceded by C" [ output ] bi@</syntaxhighlight>
"I before E when not preceded by C" is correct for 465 and incorrect for 195.
This is plausible.

"E before I when preceded by C" is correct for 13 and incorrect for 24.
This is implausible.

Please find the Linux build instructions, along with an example run, in the comments at the beginning of the f90 source. Thank you.
Please find the Linux build instructions, along with an example run, in the comments at the beginning of the f90 source. Thank you.
<syntaxhighlight lang="fortran">
<lang FORTRAN>
!-*- mode: compilation; default-directory: "/tmp/" -*-
!-*- mode: compilation; default-directory: "/tmp/" -*-
!Compilation started at Sat May 18 22:19:19
!Compilation started at Sat May 18 22:19:19
Line 529: Line 2,115:
end function plausibility
end function plausibility
end program cia
end program cia

<lang FreeBASIC>Function getfile(file As String) As String
<syntaxhighlight lang="freebasic">Function getfile(file As String) As String
Dim As Integer F = Freefile
Dim As Integer F = Freefile
Dim As String text,intext
Dim As String text,intext
Line 581: Line 2,167:
print "So, the idea is not plausible."
print "So, the idea is not plausible."

<pre>The number of words in unixdict.txt 25104
<pre>The number of words in unixdict.txt 25104
Line 595: Line 2,181:
ei is not plausible when preceeded by c, the ratio is 0.5416666666666666
ei is not plausible when preceeded by c, the ratio is 0.5416666666666666
So, the idea is not plausible.</pre>
So, the idea is not plausible.</pre>

<syntaxhighlight lang="futurebasic">include "NSLog.incl"

#plist NSAppTransportSecurity @{NSAllowsArbitraryLoads:YES}

void local fn CheckWord( wrd as CFStringRef, txt as CFStringRef, c as ^long, x as ^long )
CFRange range = fn StringRangeOfString( wrd, txt )
while ( range.location != NSNotFound )
if ( range.location > 0 )
select ( fn StringCharacterAtIndex( wrd, range.location-1 ) )
case _"c"
*c += 1
case else
*x += 1
end select
*x += 1
end if
range.length = len(wrd) - range.location
range = fn StringRangeOfStringWithOptionsInRange( wrd, txt, 0, range )
end fn

void local fn Doit
CFURLRef url = fn URLWithString( @"http://wiki.puzzlers.org/pub/wordlists/unixdict.txt" )
CFStringRef string = fn StringWithContentsOfURL( url, NSUTF8StringEncoding, NULL )
CFArrayRef words = fn StringComponentsSeparatedByCharactersInSet( string, fn CharacterSetNewlineSet )
long cei = 0, cie = 0, xei = 0, xie = 0
CFStringRef wrd, result
for wrd in words
fn CheckWord( wrd, @"ei", @cei, @xei )
fn CheckWord( wrd, @"ie", @cie, @xie )
NSLog(@"cei: %ld",cei)
NSLog(@"cie: %ld",cie)
NSLog(@"xei: %ld",xei)
NSLog(@"xie: %ld",xie)
if 2 * xie <= cie then result = @"not plausible" else result = @"plausible"
NSLog( @"\nI before E when not preceded by C: %@.\n¬
There are %ld examples and %ld counter-examples for a ratio of %f.\n", ¬
result, xie, xei, ( ( (float)xie - (float)cie ) / ( (float)xei - (float)cei ) ) )
if 2 * cei <= xei then result = @"not plausible" else result = @"plausible"
NSLog( @"E before I when preceded by C: %@.\n¬
There are %ld examples and %ld counter-examples for a ratio of %f.\n", ¬
result, cei, cie, ( (float)cei / (float)cie ) )
end fn

fn DoIt


<pre>cei: 13
cie: 24
xei: 217
xie: 466

I before E when not preceded by C: plausible.
There are 466 examples and 217 counter-examples for a ratio of 2.166667.

E before I when preceded by C: not plausible.
There are 13 examples and 24 counter-examples for a ratio of 0.541667.</pre>

<lang go>package main
<syntaxhighlight lang="go">package main

import (
import (
Line 664: Line 2,318:
return false
return false

Line 678: Line 2,332:
This solution does not attempt the stretch goal.
This solution does not attempt the stretch goal.

<lang Haskell>import Network.HTTP
<syntaxhighlight lang="haskell">import Network.HTTP
import Text.Regex.TDFA
import Text.Regex.TDFA
import Text.Printf
import Text.Printf
Line 686: Line 2,340:
response <- simpleHTTP.getRequest$ url
response <- simpleHTTP.getRequest$ url
getResponseBody response
getResponseBody response
where url = "http://www.puzzlers.org/pub/wordlists/unixdict.txt"
where url = "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt"

main = do
main = do
Line 702: Line 2,356:
rule2Plausible = numTrueRule2 > (2*numFalseRule2)
rule2Plausible = numTrueRule2 > (2*numFalseRule2)
printf "Rule 2 is correct for %d\n incorrect for %d\n" numTrueRule2 numFalseRule2
printf "Rule 2 is correct for %d\n incorrect for %d\n" numTrueRule2 numFalseRule2
printf "*** Rule 2 is %splausible.\n" (if rule2Plausible then "" else "im")</lang>
printf "*** Rule 2 is %splausible.\n" (if rule2Plausible then "" else "im")</syntaxhighlight>

Line 723: Line 2,377:
same input line should all be tested.
same input line should all be tested.

<lang Unicon>import Utils # To get the FindFirst class
<syntaxhighlight lang="unicon">import Utils # To get the FindFirst class

procedure main(a)
procedure main(a)
Line 741: Line 2,395:

if \showCounts then every write(phrase := !phrases,": ",totals[phrase])
if \showCounts then every write(phrase := !phrases,": ",totals[phrase])

{{out}} of running with <tt>--showcounts</tt> flag:
{{out}} of running with <tt>--showcounts</tt> flag:
Line 758: Line 2,412:
=== stretch goal ===
=== stretch goal ===

<lang Unicon>import Utils # To get the FindFirst class
<syntaxhighlight lang="unicon">import Utils # To get the FindFirst class

procedure main(a)
procedure main(a)
Line 782: Line 2,436:

if \showCounts then every write(phrase := !phrases,": ",totals[phrase])
if \showCounts then every write(phrase := !phrases,": ",totals[phrase])

Line 801: Line 2,455:
After downloading unixdict to /tmp:
After downloading unixdict to /tmp:

<lang J> dict=:tolower fread '/tmp/unixdict.txt'</lang>
<syntaxhighlight lang="j"> dict=:tolower fread '/tmp/unixdict.txt'</syntaxhighlight>

Investigating the rules:
Investigating the rules:

<lang J> +/'cie' E. dict
<syntaxhighlight lang="j"> +/'cie' E. dict
+/'cei' E. dict
+/'cei' E. dict
Line 812: Line 2,466:
+/'ei' E. dict
+/'ei' E. dict

So, based on unixdict.txt, the "I before E" rule seems plausible (490 > 230 by more than a factor of 2), but the exception does not make much sense (we see almost twice as many i before e after a c as we see e before i after a c).
So, based on unixdict.txt, the "I before E" rule seems plausible (490 > 230 by more than a factor of 2), but the exception does not make much sense (we see almost twice as many i before e after a c as we see e before i after a c).
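
For readers who do not know J, the comparison in the paragraph above can be sketched in Python. This is only an illustration under assumptions: the helper names are invented here, the sample text is a three-word stand-in for the dictionary, and occurrences are counted over the whole text rather than per word, which is how the summed <code>E.</code> matches are read here.

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count every position in text where pattern starts (overlaps included)."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def more_than_twice(feature: int, opposite: int) -> bool:
    """Plausibility test from the task: feature count exceeds twice the opposite count."""
    return feature > 2 * opposite


if __name__ == "__main__":
    sample = "science\nreceive\nfield\n"  # hypothetical stand-in for unixdict.txt
    for pattern in ("ie", "ei", "cie", "cei"):
        print(pattern, count_occurrences(sample, pattern))
    # Counts quoted in the paragraph above: 490 for "ie" against 230 for "ei".
    print(more_than_twice(490, 230))
```

With the quoted counts the test passes because 490 exceeds 2 * 230 = 460.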
Line 822: Line 2,476:
After downloading 1_2_all_freq to /tmp, we can read it into J, and break out the first column (as words) and the third column as numbers:
After downloading 1_2_all_freq to /tmp, we can read it into J, and break out the first column (as words) and the third column as numbers:

<lang J>allfreq=: |:}.<;._1;._2]1!:1<'/tmp/1_2_all_freq.txt'
<syntaxhighlight lang="j">allfreq=: |:}.<;._1;._2]1!:1<'/tmp/1_2_all_freq.txt'

words=: >0 { allfreq
words=: >0 { allfreq
freqs=: 0 {.@".&>2 { allfreq</lang>
freqs=: 0 {.@".&>2 { allfreq</syntaxhighlight>

With these definitions, we can define a prevalence verb which will tell us how often a particular substring appears in use:
With these definitions, we can define a prevalence verb which will tell us how often a particular substring appears in use:

<lang J>prevalence=:verb define
<syntaxhighlight lang="j">prevalence=:verb define
(y +./@E."1 words) +/ .* freqs
(y +./@E."1 words) +/ .* freqs

Investigating our original proposed rules:
Investigating our original proposed rules:

<lang J> 'ie' %&prevalence 'ei'
<syntaxhighlight lang="j"> 'ie' %&prevalence 'ei'

A generic "i before e" rule is not looking quite as good now - words that have i before e are used less than twice as much as words which use e before i.
A generic "i before e" rule is not looking quite as good now - words that have i before e are used less than twice as much as words which use e before i.
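
The frequency-weighted comparison can be sketched in Python as well. This is a rough equivalent under assumptions, not the J code itself: the function name <code>prevalence</code> is borrowed from the verb above, and the word list and frequencies are invented purely for illustration.

```python
def prevalence(substring, words, freqs):
    """Sum the frequencies of all words that contain substring."""
    return sum(f for w, f in zip(words, freqs) if substring in w)


if __name__ == "__main__":
    # Hypothetical data, not taken from 1_2_all_freq.txt.
    words = ["their", "field", "science", "receive"]
    freqs = [10, 3, 2, 1]

    ie = prevalence("ie", words, freqs)
    ei = prevalence("ei", words, freqs)
    cie = prevalence("cie", words, freqs)
    cei = prevalence("cei", words, freqs)

    print("ie / ei   =", ie / ei)
    print("cei / cie =", cei / cie)
    # Same correction as in the J session: drop the "after c" cases from both sides.
    print("(ie - cie) / (ei - cei) =", (ie - cie) / (ei - cei))
```

A single frequent word such as "their" can outweigh many rare words, which is why the weighted ratio can differ so much from the plain dictionary count.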

<lang J> 'cei' %&prevalence 'cie'
<syntaxhighlight lang="j"> 'cei' %&prevalence 'cie'

An "except after c" variant is looking awful now - words that use the cie sequence are three times as likely as words that use the cei sequence. So, of course, if we modified our original rule with this exception it would weaken the original rule:
An "except after c" variant is looking awful now - words that use the cie sequence are three times as likely as words that use the cei sequence. So, of course, if we modified our original rule with this exception it would weaken the original rule:

<lang J> ('ie' -&prevalence 'cie') % ('ei' -&prevalence 'cei')
<syntaxhighlight lang="j"> ('ie' -&prevalence 'cie') % ('ei' -&prevalence 'cei')

Note that we might also want to consider non-adjacent matches (the regular expression 'i.*e' instead of 'ie' or perhaps 'c.*ie' or 'c.*i.*e' instead of 'cie') - this would be straightforward to check, but this would bulk up the page. (And, to be meaningful, we'd want a more constrained wildcard than <code>.*</code> -- at the very least we would not want to span words.)

Note that we might also want to consider non-adjacent matches (the regular expression 'i.*e' instead of 'ie' or perhaps 'c.*ie' or 'c.*i.*e' instead of 'cie') - this would be straightforward to check, but this would bulk up the page.
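
The point about constraining the wildcard can be shown with a small Python sketch. It is an illustration only: the pattern <code>i[a-z]*e</code> and the helper name are assumptions made here, not something the J solution uses.

```python
import re

# "." matches a space, so "i.*e" can start in one word and end in the next.
LOOSE = re.compile(r"i.*e")

# "[a-z]*" matches letters only, so a match can never cross whitespace.
CONSTRAINED = re.compile(r"i[a-z]*e")


def count_words_matching(pattern, text):
    """Count whitespace-separated words in which pattern matches somewhere."""
    return sum(1 for word in text.split() if pattern.search(word))


if __name__ == "__main__":
    text = "bit red fire"
    print(bool(LOOSE.search("bit red")))        # the loose pattern spans two words
    print(bool(CONSTRAINED.search("bit red")))  # the constrained pattern does not
    print(count_words_matching(CONSTRAINED, text))
```

Testing word by word, as <code>count_words_matching</code> does, is a second way to keep a match from spanning words.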
<syntaxhighlight lang="java">
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
<syntaxhighlight lang="java">
public static void main(String[] args) throws URISyntaxException, IOException {
System.out.printf("%-10s %,d%n", "total", total);
System.out.printf("%-10s %,d%n", "'cei'", cei);
System.out.printf("%-10s %,d%n", "'cie'", cie);
System.out.printf("%,d > (%,d * 2) = %b%n", cei, cie, cei > (cie * 2));
System.out.printf("%,d > (%,d * 2) = %b", cie, cei, cie > (cei * 2));

static int total = 0;
static int cei = 0;
static int cie = 0;

static void count() throws URISyntaxException, IOException {
URL url = new URI("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt").toURL();
try (BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()))) {
String line;
while ((line = reader.readLine()) != null) {
if (line.matches(".*?(?:[^c]ie|cei).*")) {
} else if (line.matches(".*?(?:[^c]ei|cie).*")) {
total 25,104
'cei' 477
'cie' 215
477 > (215 * 2) = true
215 > (477 * 2) = false
<br />
An alternate demonstration<br>
Download and save wordlist to unixdict.txt.
Download and save wordlist to unixdict.txt.

<lang java>
<syntaxhighlight lang="java">
import java.io.BufferedReader;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileReader;
Line 911: Line 2,612:

Line 917: Line 2,618:
Implausible count: 204
Implausible count: 204
Rule is not plausible.</pre>
Rule is not plausible.</pre>

Line 923: Line 2,623:

WARNING: The problem statement is misleading as the rule only applies to syllables that rhyme with "see".
WARNING: The problem statement is misleading as the rule only applies to syllables that rhyme with "see".
<lang jq>def plausibility_ratio: 2;
<syntaxhighlight lang="jq">def plausibility_ratio: 2;

# scan/2 produces a stream of matches but the first match of a segment (e.g. cie)
# scan/2 produces a stream of matches but the first match of a segment (e.g. cie)
Line 957: Line 2,657:
as ratio = \($x)/\($y) ~ \($ratio * 100 |round)%" ;
as ratio = \($x)/\($y) ~ \($ratio * 100 |round)%" ;

"Using the problematic criterion specified in the task requirements:", assess</lang>
"Using the problematic criterion specified in the task requirements:", assess</syntaxhighlight>
Using http://www.puzzlers.org/pub/wordlists/unixdict.txt as of June 2015:
Using http://www.puzzlers.org/pub/wordlists/unixdict.txt as of June 2015:
<lang sh>$ jq -s -R -r -f I_before_E_except_after_C.jq unixdict.txt
<syntaxhighlight lang="sh">$ jq -s -R -r -f I_before_E_except_after_C.jq unixdict.txt
Using the problematic criterion specified in the task requirements:
Using the problematic criterion specified in the task requirements:
-- the rule "E before I when preceded by C" is implausible
-- the rule "E before I when preceded by C" is implausible
as ratio = 13/24 ~ 54%
as ratio = 13/24 ~ 54%
-- the rule "I before E when not preceded by C" is plausible
-- the rule "I before E when not preceded by C" is plausible
as ratio = 464/217 ~ 214%</lang>
as ratio = 464/217 ~ 214%</syntaxhighlight>

<syntaxhighlight lang="julia"># v0.0.6

open("unixdict.txt") do txtfile
    rule1, notrule1, rule2, notrule2 = 0, 0, 0, 0
    for word in eachline(txtfile)
        # "I before E when not preceded by C"
        if ismatch(r"ie"i, word)
            if ismatch(r"cie"i, word)
                notrule1 += 1
            else
                rule1 += 1
            end
        end
        # "E before I when preceded by C"
        if ismatch(r"ei"i, word)
            if ismatch(r"cei"i, word)
                rule2 += 1
            else
                notrule2 += 1
            end
        end
    end

    print("Plausibility of \"I before E when not preceded by C\": ")
    println(rule1 > 2 * notrule1 ? "PLAUSIBLE" : "UNPLAUSIBLE")
    print("Plausibility of \"E before I when preceded by C\":")
    println(rule2 > 2 * notrule2 ? "PLAUSIBLE" : "UNPLAUSIBLE")
end</syntaxhighlight>

<pre>Plausibility of "I before E when not preceded by C": PLAUSIBLE
Plausibility of "E before I when preceded by C":UNPLAUSIBLE</pre>

<syntaxhighlight lang="scala">// version 1.0.6

import java.net.URL
import java.io.InputStreamReader
import java.io.BufferedReader

fun isPlausible(n1: Int, n2: Int) = n1 > 2 * n2

fun printResults(source: String, counts: IntArray) {
println("Results for $source")
println(" i before e except after c")
println(" for ${counts[0]}")
println(" against ${counts[1]}")
val plausible1 = isPlausible(counts[0], counts[1])
println(" sub-rule is${if (plausible1) "" else " not"} plausible\n")
println(" e before i when preceded by c")
println(" for ${counts[2]}")
println(" against ${counts[3]}")
val plausible2 = isPlausible(counts[2], counts[3])
println(" sub-rule is${if (plausible2) "" else " not"} plausible\n")
val plausible = plausible1 && plausible2
println(" rule is${if (plausible) "" else " not"} plausible")

fun main(args: Array<String>) {
val url = URL("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
val isr = InputStreamReader(url.openStream())
val reader = BufferedReader(isr)
val regexes = arrayOf(
Regex("(^|[^c])ie"), // i before e when not preceded by c (includes words starting with ie)
Regex("(^|[^c])ei"), // e before i when not preceded by c (includes words starting with ei)
Regex("cei"), // e before i when preceded by c
Regex("cie") // i before e when preceded by c
val counts = IntArray(4) // corresponding counts of occurrences
var word = reader.readLine()
while (word != null) {
for (i in 0..3) counts[i] += regexes[i].findAll(word).toList().size
word = reader.readLine()
printResults("unixdict.txt", counts)

val url2 = URL("http://ucrel.lancs.ac.uk/bncfreq/lists/1_2_all_freq.txt")
val isr2 = InputStreamReader(url2.openStream())
val reader2 = BufferedReader(isr2)
val counts2 = IntArray(4)
reader2.readLine() // read header line
var line = reader2.readLine() // read first line and store it
var words: List<String>
val splitter = Regex("""(\t+|\s+)""")
while (line != null) {
words = line.split(splitter)
if (words.size == 4) // first element is empty
for (i in 0..3) counts2[i] += regexes[i].findAll(words[1]).toList().size * words[3].toInt()
line = reader2.readLine()
printResults("British National Corpus", counts2)

Results for unixdict.txt
i before e except after c
for 466
against 217
sub-rule is plausible

e before i when preceded by c
for 13
against 24
sub-rule is not plausible

rule is not plausible

Results for British National Corpus
i before e except after c
for 8192
against 4826
sub-rule is not plausible

e before i when preceded by c
for 327
against 994
sub-rule is not plausible

rule is not plausible

<syntaxhighlight lang="langur">
val words = split("\n", readfile("./data/unixdict.txt")) -> rest

val print = impure fn(support, against) {
val ratio = support / against
writeln "{{support}} / {{against}} = {{ratio : r2}}:", (ratio < 2) * " NOT", " PLAUSIBLE"
return if(ratio >= 2: 1; 0)

val ks = fw/ei cei ie cie/
var cnt = {:}

for w in words {
for k in ks {
cnt[k; 0] += if(k in w: 1; 0)

var support = cnt'ie - cnt'cie
var against = cnt'ei - cnt'cei

var result = print(support, against)
result += print(cnt'cei, cnt'cie)

writeln "Overall:", (result < 2) * " NOT", " PLAUSIBLE\n"

<pre>465 / 213 = 2.18: PLAUSIBLE
13 / 24 = 0.54: NOT PLAUSIBLE

<lang lasso>
<syntaxhighlight lang="lasso">
local(cie,cei,ie,ei) = (:0,0,0,0)
local(cie,cei,ie,ei) = (:0,0,0,0)

Line 974: Line 2,836:
local(match_ei) = regExp(`[^c]ei`)
local(match_ei) = regExp(`[^c]ei`)

with word in include_url(`http://www.puzzlers.org/pub/wordlists/unixdict.txt`)->asString->split("\n")
with word in include_url(`http://wiki.puzzlers.org/pub/wordlists/unixdict.txt`)->asString->split("\n")
where #word >> `ie` or #word >> `ei`
where #word >> `ie` or #word >> `ei`
do {
do {
Line 1,002: Line 2,864:
stdoutnl(`Overall the rule is ` + (#ie_plausible and #cei_plausible ? `` | `NOT-`) + `PLAUSIBLE`)
stdoutnl(`Overall the rule is ` + (#ie_plausible and #cei_plausible ? `` | `NOT-`) + `PLAUSIBLE`)
Line 1,009: Line 2,871:
Overall the rule is NOT-PLAUSIBLE
Overall the rule is NOT-PLAUSIBLE

<syntaxhighlight lang="lua">-- Needed to get dictionary file from web server
local http = require("socket.http")

-- Return count of words that contain pattern
function count (pattern, wordList)
    local total = 0
    for word in wordList:gmatch("%S+") do
        if word:match(pattern) then total = total + 1 end
    end
    return total
end

-- Check plausibility of case given its opposite
function plaus (case, opposite, words)
    if count(case, words) > 2 * count(opposite, words) then
        print("PLAUSIBLE")
        return true
    else
        print("IMPLAUSIBLE")
        return false
    end
end

-- Main procedure
local page = http.request("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
io.write("I before E when not preceded by C: ")
local sub1 = plaus("[^c]ie", "cie", page)
io.write("E before I when preceded by C: ")
local sub2 = plaus("cei", "[^c]ei", page)
io.write("Overall the phrase is ")
if not (sub1 and sub2) then io.write("not ") end
print("plausible.")</syntaxhighlight>
<pre>I before E when not preceded by C: PLAUSIBLE
E before I when preceded by C: IMPLAUSIBLE
Overall the phrase is not plausible.</pre>

<syntaxhighlight lang="maple">words:= HTTP:-Get("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt"):
lst := StringTools:-Split(words[2],"\n"):
xie, cie, cei, xei := 0, 0, 0, 0:
for item in lst do
	if searchtext("ie", item) <> 0 then
		if searchtext("cie", item) <> 0 then
			cie := cie + 1:
		else
			xie := xie + 1:
		end if:
	end if:
	if searchtext("ei", item) <> 0 then
		if searchtext("cei", item) <> 0 then
			cei := cei + 1:
		else
			xei := xei + 1:
		end if:
	end if:
end do:
p1, p2 := evalb(xie > 2*xei),evalb(cei > 2*cie);
printf("The first phrase is %s with supporting features %d and anti features %d\n", piecewise(p1, "plausible", "not plausible"), xie, xei);
printf("The second phrase is %s with supporting features %d and anti features %d\n", piecewise(p2, "plausible", "not plausible"), cei, cie);
printf("The overall phrase is %s\n", piecewise(p1 and p2, "plausible", "not plausible")):</syntaxhighlight>
<pre>The first phrase is plausible with supporting features 465 and anti features 213
The second phrase is not plausible with supporting features 13 and anti features 24
The overall phrase is not plausible</pre>
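The verdicts printed above follow directly from the task's threshold: a clause is plausible only when its supporting count is more than twice the opposing count. As a minimal sketch (the helper name `is_plausible` is an assumption made for illustration, not part of any solution on this page), the arithmetic can be checked with the counts shown in the output:

```python
def is_plausible(support, against):
    """Return True when support is more than twice against (the task's threshold)."""
    return support > 2 * against

# Counts copied from the output shown above.
clause1 = is_plausible(465, 213)  # "I before E when not preceded by C"
clause2 = is_plausible(13, 24)    # "E before I when preceded by C"
overall = clause1 and clause2     # the phrase needs both clauses

print(clause1, clause2, overall)
```

Note that the comparison is strict, so a count that is exactly double its opposite does not qualify.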

=={{header|Mathematica}} / {{header|Wolfram Language}}==
=={{header|Mathematica}} / {{header|Wolfram Language}}==
<lang mathematica>wordlist =
<syntaxhighlight lang="mathematica">wordlist =
Print["The number of words in unixdict.txt = " <>
Print["The number of words in unixdict.txt = " <>
Line 1,037: Line 2,966:
Print["Overall the rule is " <>
Print["Overall the rule is " <>
If[test1 && test2, "PLAUSIBLE", "NOT PLAUSIBLE" ]]</lang>
If[test1 && test2, "PLAUSIBLE", "NOT PLAUSIBLE" ]]</syntaxhighlight>

<lang mathematica>The number of words in unixdict.txt = 25104
<syntaxhighlight lang="mathematica">The number of words in unixdict.txt = 25104
The rule "I before E when not preceded by C" is PLAUSIBLE
The rule "I before E when not preceded by C" is PLAUSIBLE
There were 465 examples and 213 counter examples, for a ratio of 2.1831
There were 465 examples and 213 counter examples, for a ratio of 2.1831
Line 1,046: Line 2,975:
There were 13 examples and 24 counter examples, for a ratio of 0.541667
There were 13 examples and 24 counter examples, for a ratio of 0.541667
Overall the rule is NOT PLAUSIBLE
Overall the rule is NOT PLAUSIBLE

=={{header|MATLAB}} / {{header|Octave}}==
{{incomplete|MATLAB|Is the original phrase plausible?}}

<syntaxhighlight lang="matlab">
<lang MATLAB>function i_before_e_except_after_c(f)
function iBeforeE()

function check(URL)
fid = fopen(f,'r');
fprintf('For %s:\n', URL)
nei = 0;
[~, name, ext] = fileparts(URL);
fn = [name ext];
if exist(fn,'file')
lines = readlines(fn, 'EmptyLineRule', 'skip');
fprintf('Reading data from %s\n', URL)
lines = readlines(URL, 'EmptyLineRule', 'skip');
% Save the file for later
includesFrequencyData = length(split(lines(1))) > 1;
ie = 0;
cie = 0;
ei = 0;
cei = 0;
cei = 0;
for i = 1:size(lines,1)
nie = 0;
if includesFrequencyData
cie = 0;
fields = split(strtrim(lines(i)));
while ~feof(fid)
if length(fields) ~= 3 || i == 1
c = strsplit(strtrim(fgetl(fid)),char([9,32]));
if length(c) > 2,
n = str2num(c{3});
word = fields(1);
frequency = str2double(fields(3));
n = 1;
word = lines(i);
if strfind(c{1},'ei')>1, nei=nei+n; end;
frequency = 1;
if strfind(c{1},'cei'), cei=cei+n; end;
if strfind(c{1},'ie')>1, nie=nie+n; end;
if strfind(c{1},'cie'), cie=cie+n; end;
ie = ie + length(strfind(word,'ie')) * frequency;
ei = ei + length(strfind(word,'ei')) * frequency;
cie = cie + length(strfind(word,'cie')) * frequency;
cei = cei + length(strfind(word,'cei')) * frequency;
rule1 = "I before E when not preceded by C";
p1 = reportPlausibility(rule1, ie-cie, ei-cei );
rule2 = "E before I when preceded by C";
p2 = reportPlausibility(rule2, cei, cie );
combinedRule = "I before E, except after C";
fprintf('Hence the combined rule \"%s\" is ', combinedRule);
if ~(p1 && p2)
fprintf('NOT ');

function plausible = reportPlausibility(claim, positive, negative)
printf('cie: %i\nnie: %i\ncei: %i\nnei: %i\n',cie,nie-cie,cei,nei-cei);
v = '';
plausible = true;
fprintf('\"%s\" is ', claim);
if (nie < 3 * cie)
if positive <= 2*negative
v=' not';
plausible = false;
fprintf('NOT ')
fprintf('PLAUSIBLE,\n since the ratio of positive to negative examples is %d/%d = %0.2f.\n', positive, negative, positive/negative )
printf('I before E when not preceded by C: is%s plausible\n',v);
v = '';
if (nei > 3 * cei)
v=' not';
printf('E before I when preceded by C: is%s plausible\n',v);

<pre>octave:23> i_before_e_except_after_c 1_2_all_freq.txt
>> iBeforeE
cie: 994
For http://wiki.puzzlers.org/pub/wordlists/unixdict.txt:
nie: 8133
"I before E when not preceded by C" is PLAUSIBLE,
cei: 327
since the ratio of positive to negative examples is 466/217 = 2.15.
nei: 4274
I before E when not preceded by C: is plausible
"E before I when preceded by C" is NOT PLAUSIBLE,
since the ratio of positive to negative examples is 13/24 = 0.54.
E before I when preceded by C: is not plausible
Hence the combined rule "I before E, except after C" is NOT PLAUSIBLE.
octave:24> i_before_e_except_after_c unixdict.txt

cie: 24
For http://ucrel.lancs.ac.uk/bncfreq/lists/1_2_all_freq.txt:
nie: 464
"I before E when not preceded by C" is NOT PLAUSIBLE,
cei: 13
since the ratio of positive to negative examples is 8207/4826 = 1.70.
nei: 191
I before E when not preceded by C: is plausible
"E before I when preceded by C" is NOT PLAUSIBLE,
since the ratio of positive to negative examples is 327/994 = 0.33.
E before I when preceded by C: is not plausible</pre>
Hence the combined rule "I before E, except after C" is NOT PLAUSIBLE.

<syntaxhighlight lang="modula2">MODULE IEC;
FROM InOut IMPORT WriteString, WriteCard, WriteLn;
FROM Strings IMPORT Pos;

VAR words, cie, cei, xie, xei: CARDINAL;
xie_plausible, cei_plausible: BOOLEAN;
end := Pos("", word);
IF Pos("ie", word) # end THEN
IF Pos("cie", word) # end
THEN INC(cie);
ELSE INC(xie);
ELSIF Pos("ei", word) # end THEN
IF Pos("cei", word) # end
THEN INC(cei);
ELSE INC(xei);
END Classify;

PROCEDURE ProcessFile(filename: ARRAY OF CHAR);
VAR file: SeqIO.FILE;
dict: Texts.TEXT;
word: ARRAY [0..63] OF CHAR;
fs: SeqIO.FileState;
ts: Texts.TextState;
fs := SeqIO.Open(file, filename);
ts := Texts.Connect(dict, file);
WHILE NOT Texts.EOT(dict) DO
Texts.ReadLn(dict, word);
ts := Texts.Disconnect(dict);
fs := SeqIO.Close(file);
END ProcessFile;

WriteString(": ");
WriteCard(num, 0);
END WriteStat;

PROCEDURE Plausible(feature: ARRAY OF CHAR; match, nomatch: CARDINAL): BOOLEAN;
VAR plausible: BOOLEAN;
WriteString(": ");
plausible := match > 2 * nomatch;
IF NOT plausible THEN
WriteString("not ");
RETURN plausible;
END Plausible;

words := 0;
cie := 0;
cei := 0;
xie := 0;
xei := 0;
WriteStat("Amount of words", words);
WriteStat("CIE", cie);
WriteStat("xIE", xie);
WriteStat("CEI", cei);
WriteStat("xEI", xei);
xie_plausible :=
Plausible("I before E when not preceded by C", xie, cie);
cei_plausible :=
Plausible("E before I when preceded by C", cei, xei);
WriteString("I before E, except after C: ");
IF NOT (xie_plausible AND cei_plausible) THEN
WriteString("not ");
END IEC.</syntaxhighlight>
<pre>Amount of words: 50209
CIE: 24
xIE: 465
CEI: 13
xEI: 209

I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.
I before E, except after C: not plausible.</pre>

<syntaxhighlight lang="nim">import httpclient, strutils, strformat

const
  Rule1 = "\"I before E when not preceded by C\""
  Rule2 = "\"E before I when preceded by C\""
  Phrase = "\"I before E except after C\""
  PlausibilityText: array[bool, string] = ["not plausible", "plausible"]

proc plausibility(rule: string; count1, count2: int): bool =
  ## Compute, display and return plausibility.
  result = count1 > 2 * count2
  stdout.write &"The rule {rule} is {PlausibilityText[result]}: "
  echo &"there were {count1} examples and {count2} counter-examples."

let client = newHttpClient()

var nie, cie, nei, cei = 0
for word in client.getContent("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt").split():
  if word.contains("ie"):
    if word.contains("cie"):
      inc cie
    else:
      inc nie
  if word.contains("ei"):
    if word.contains("cei"):
      inc cei
    else:
      inc nei

let p1 = plausibility(Rule1, nie, nei)
let p2 = plausibility(Rule2, cei, cie)
echo &"So the phrase {Phrase} is {PlausibilityText[p1 and p2]}."</syntaxhighlight>

<pre>The rule "I before E when not preceded by C" is plausible: there were 465 examples and 213 counter-examples.
The rule "E before I when preceded by C" is not plausible: there were 13 examples and 24 counter-examples.
So the phrase "I before E except after C" is not plausible.</pre>
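The counts above are per word: a word is tested with a substring check, so it contributes at most one to each of the "ie" and "ei" tallies, however many times the letter pair occurs in it. A minimal Python sketch of that word-level classification follows; the function name `classify_words` and the sample word list are assumptions chosen for illustration, not taken from unixdict.txt processing on this page.

```python
def classify_words(words):
    """Tally words by feature; each word counts at most once per letter pair."""
    counts = {"cie": 0, "xie": 0, "cei": 0, "xei": 0}
    for word in words:
        if "ie" in word:
            # A word containing "cie" anywhere is counted as "cie", otherwise as "xie".
            counts["cie" if "cie" in word else "xie"] += 1
        if "ei" in word:
            counts["cei" if "cei" in word else "xei"] += 1
    return counts

# Hypothetical sample; "deficiencies" holds "cie" twice but is counted once.
sample = ["believe", "science", "receive", "weird", "deficiencies"]
print(classify_words(sample))
```

Solutions that count every occurrence instead of every word can report slightly different totals for the same word list.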

<lang objeck>
<syntaxhighlight lang="objeck">
use HTTP;
use HTTP;
use Collection;
use Collection;
Line 1,108: Line 3,217:
class HttpTest {
class HttpTest {
function : Main(args : String[]) ~ Nil {
function : Main(args : String[]) ~ Nil {

Line 1,164: Line 3,273:

Line 1,176: Line 3,285:
(To be plausible, one word count must exceed another by 2 times)
(To be plausible, one word count must exceed another by 2 times)

{{incomplete|MATLAB|Is the original phrase plausible?}}

<syntaxhighlight lang="matlab">function i_before_e_except_after_c(f)

fid = fopen(f,'r');
nei = 0;
cei = 0;
nie = 0;
cie = 0;
while ~feof(fid)
c = strsplit(strtrim(fgetl(fid)),char([9,32]));
if length(c) > 2,
n = str2num(c{3});
n = 1;
if strfind(c{1},'ei')>1, nei=nei+n; end;
if strfind(c{1},'cei'), cei=cei+n; end;
if strfind(c{1},'ie')>1, nie=nie+n; end;
if strfind(c{1},'cie'), cie=cie+n; end;

printf('cie: %i\nnie: %i\ncei: %i\nnei: %i\n',cie,nie-cie,cei,nei-cei);
v = '';
if (nie < 3 * cie)
v=' not';
printf('I before E when not preceded by C: is%s plausible\n',v);
v = '';
if (nei > 3 * cei)
v=' not';
printf('E before I when preceded by C: is%s plausible\n',v);

<pre>octave:23> i_before_e_except_after_c 1_2_all_freq.txt
cie: 994
nie: 8133
cei: 327
nei: 4274
I before E when not preceded by C: is plausible
E before I when preceded by C: is not plausible
octave:24> i_before_e_except_after_c unixdict.txt
cie: 24
nie: 464
cei: 13
nei: 191
I before E when not preceded by C: is plausible
E before I when preceded by C: is not plausible</pre>

<lang perl>#!/usr/bin/perl
<syntaxhighlight lang="perl">#!/usr/bin/perl
use warnings;
use warnings;
use strict;
use strict;
Line 1,211: Line 3,372:
$result += result($support, $against);
$result += result($support, $against);

print 'Overall: ', 'NOT ' x ($result < 2), "PLAUSIBLE.\n";</lang>
print 'Overall: ', 'NOT ' x ($result < 2), "PLAUSIBLE.\n";</syntaxhighlight>

Line 1,220: Line 3,381:
===Perl: Stretch Goal===
===Perl: Stretch Goal===
Just replace the while loop with the following one:
Just replace the while loop with the following one:
<lang perl>while (<>) {
<syntaxhighlight lang="perl">while (<>) {
my @columns = split;
my @columns = split;
next if 3 < @columns;
next if 3 < @columns;
Line 1,227: Line 3,388:
$count{$k} += $freq if -1 != index $word, $k;
$count{$k} += $freq if -1 != index $word, $k;
<pre>I before E when not preceded by C: 8148 / 4826 = 1.69. NOT PLAUSIBLE
<pre>I before E when not preceded by C: 8148 / 4826 = 1.69. NOT PLAUSIBLE
Line 1,233: Line 3,394:
Overall: NOT PLAUSIBLE.</pre>
Overall: NOT PLAUSIBLE.</pre>
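For the stretch goal, each matching word adds its corpus frequency rather than 1. The following Python sketch shows the idea under stated assumptions: rows are whitespace-separated as word, part of speech, frequency; rows without exactly three fields or without a numeric frequency are skipped; and the function name `weighted_counts` is hypothetical. It is not a translation of the Perl loop above.

```python
def weighted_counts(lines):
    """Sum word frequencies per feature from 'word PoS freq' rows."""
    counts = {"cie": 0, "xie": 0, "cei": 0, "xei": 0}
    for line in lines:
        fields = line.split()
        # Keep only rows with exactly three fields and a numeric frequency;
        # this drops the header row and multi-word entries.
        if len(fields) != 3 or not fields[2].isdigit():
            continue
        word, freq = fields[0].lower(), int(fields[2])
        if "ie" in word:
            counts["cie" if "cie" in word else "xie"] += freq
        if "ei" in word:
            counts["cei" if "cei" in word else "xei"] += freq
    return counts

# A few rows in the shape of the frequency table, used here only as sample input.
sample_rows = [
    "\tWord\tPoS\tFreq",
    "\tother than\tPrep\t43",
    "\tvisited\tVerb\t43",
    "\tlie\tVerb\t43",
    "\tgrown\tVerb\t43",
]
print(weighted_counts(sample_rows))
```

Only "lie" contributes in this sample: the header has a non-numeric frequency and "other than" splits into four fields.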

=={{header|Perl 6}}==
Kept dirt simple, difficult to imagine any other approach being faster than this.
This solution uses grammars and actions to parse the given file, the <tt>Bag</tt> for tallying up occurrences of each possible thing we're looking for ("ie", "ei", "cie", and "cei"), and junctions to determine the plausibility of a phrase from the subphrases. Note that a version of rakudo newer than the January 2014 compiler or Star releases is needed, as this code relies on a recent bugfix to the <tt>make</tt> function.
<!--<syntaxhighlight lang="phix">(phixonline)-->
<lang perl6>grammar CollectWords {
<span style="color: #000080;font-style:italic;">-- demo\rosetta\IbeforeE.exw</span>
token TOP {
<span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span>
[^^ <word> $$ \n?]+
<span style="color: #008080;">procedure</span> <span style="color: #000000;">show_plausibility</span><span style="color: #0000FF;">(</span><span style="color: #004080;">string</span> <span style="color: #000000;">msg</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">w</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">wo</span><span style="color: #0000FF;">)</span>
<span style="color: #004080;">string</span> <span style="color: #000000;">no</span> <span style="color: #0000FF;">=</span> <span style="color: #008080;">iff</span><span style="color: #0000FF;">(</span><span style="color: #000000;">w</span><span style="color: #0000FF;"><</span><span style="color: #000000;">2</span><span style="color: #0000FF;">*</span><span style="color: #000000;">wo</span><span style="color: #0000FF;">?</span><span style="color: #008000;">" not"</span><span style="color: #0000FF;">:</span><span style="color: #008000;">""</span><span style="color: #0000FF;">)</span>
<span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"%s (pro: %3d, anti: %3d) is%s plausible\n"</span><span style="color: #0000FF;">,{</span><span style="color: #000000;">msg</span><span style="color: #0000FF;">,</span><span style="color: #000000;">w</span><span style="color: #0000FF;">,</span><span style="color: #000000;">wo</span><span style="color: #0000FF;">,</span><span style="color: #000000;">no</span><span style="color: #0000FF;">})</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">procedure</span>
<span style="color: #004080;">string</span> <span style="color: #000000;">text</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">join</span><span style="color: #0000FF;">(</span><span style="color: #7060A8;">unix_dict</span><span style="color: #0000FF;">())</span>
<span style="color: #000080;font-style:italic;">-- Note: my unixdict.txt begins with "10th" and ends with "zygote", so
-- boundary checks such as "i&gt;=2 and i+1&lt;=length(text)" can be skipped.</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">cei</span><span style="color: #0000FF;">=</span><span style="color: #000000;">0</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">xei</span><span style="color: #0000FF;">=</span><span style="color: #000000;">0</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">cie</span><span style="color: #0000FF;">=</span><span style="color: #000000;">0</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">xie</span><span style="color: #0000FF;">=</span><span style="color: #000000;">0</span>
<span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">text</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">text</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]=</span><span style="color: #008000;">'i'</span> <span style="color: #008080;">then</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">text</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">-</span><span style="color: #000000;">1</span><span style="color: #0000FF;">]=</span><span style="color: #008000;">'e'</span> <span style="color: #008080;">then</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">text</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">-</span><span style="color: #000000;">2</span><span style="color: #0000FF;">]=</span><span style="color: #008000;">'c'</span> <span style="color: #008080;">then</span>
<span style="color: #000000;">cei</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">else</span>
<span style="color: #000000;">xei</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #000080;font-style:italic;">-- (nb not elsif here; "eie" occurs twice)</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">text</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">]=</span><span style="color: #008000;">'e'</span> <span style="color: #008080;">then</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">text</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">-</span><span style="color: #000000;">1</span><span style="color: #0000FF;">]=</span><span style="color: #008000;">'c'</span> <span style="color: #008080;">then</span>
<span style="color: #000000;">cie</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">else</span>
<span style="color: #000000;">xie</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"occurances: cie:%d, xie:%d, cei:%d, xei:%d\n"</span><span style="color: #0000FF;">,</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">cie</span><span style="color: #0000FF;">,</span><span style="color: #000000;">xie</span><span style="color: #0000FF;">,</span><span style="color: #000000;">cei</span><span style="color: #0000FF;">,</span><span style="color: #000000;">xei</span><span style="color: #0000FF;">})</span>
<span style="color: #000000;">show_plausibility</span><span style="color: #0000FF;">(</span> <span style="color: #008000;">"i before e except after c"</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">xie</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">cie</span> <span style="color: #0000FF;">);</span>
<span style="color: #000000;">show_plausibility</span><span style="color: #0000FF;">(</span> <span style="color: #008000;">"e before i except after c"</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">xei</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">cei</span> <span style="color: #0000FF;">);</span>
<span style="color: #000000;">show_plausibility</span><span style="color: #0000FF;">(</span> <span style="color: #008000;">"i before e when after c"</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">cie</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">cei</span> <span style="color: #0000FF;">);</span>
<span style="color: #000000;">show_plausibility</span><span style="color: #0000FF;">(</span> <span style="color: #008000;">"e before i when after c"</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">cei</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">cie</span> <span style="color: #0000FF;">);</span>
<span style="color: #000000;">show_plausibility</span><span style="color: #0000FF;">(</span> <span style="color: #008000;">"i before e in general"</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">xie</span> <span style="color: #0000FF;">+</span> <span style="color: #000000;">cie</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">xei</span> <span style="color: #0000FF;">+</span> <span style="color: #000000;">cei</span> <span style="color: #0000FF;">);</span>
<span style="color: #000000;">show_plausibility</span><span style="color: #0000FF;">(</span> <span style="color: #008000;">"e before i in general"</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">xei</span> <span style="color: #0000FF;">+</span> <span style="color: #000000;">cei</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">xie</span> <span style="color: #0000FF;">+</span> <span style="color: #000000;">cie</span> <span style="color: #0000FF;">)</span>
Although the output matches, I decided to use different metrics from ALGOL 68 for the middle two conclusions.<br>
I am not confident these are meaningful/correct logical inferences anyway, but the raw numbers are right.<br>
(Being told ib4eeac is more often wrong than right has quite clearly made me start to doubt myself.)
occurances: cie:24, xie:466, cei:13, xei:217
i before e except after c (pro: 466, anti: 24) is plausible
e before i except after c (pro: 217, anti: 13) is plausible
i before e when after c (pro: 24, anti: 13) is not plausible
e before i when after c (pro: 13, anti: 24) is not plausible
i before e in general (pro: 490, anti: 230) is plausible
e before i in general (pro: 230, anti: 490) is not plausible

token word {
<syntaxhighlight lang="picat">main =>
[ <with_c> | <no_c> | \N ]+
Words = read_file_lines("unixdict.txt"),
IEWords = [Word : Word in Words, find(Word,"ie",_,_)],
EIWords = [Word : Word in Words, find(Word,"ei",_,_)],

token with_c {
% cie vs not cie
[CIE_len, CIE_not_len] = partition_len(IEWords,"cie"),
c <ie_part>

token no_c {
% cei vs not cei
[CEI_len, CEI_not_len] = partition_len(EIWords,"cei"),

token ie_part {
printf("I before E when not preceeded by C (%d vs %d): %w\n",
ie | ei | eie # a couple words in the list have "eie"
printf("E before I when preceeded by C (%d cs %d): %w\n",

plausible(Len1,Len2) = cond(Len1 / Len2 > 2,"plausible","not plausible").
class CollectWords::Actions {
method TOP($/) {
make $<word>».ast.Bag;

partition_len(Words,Sub) = [True.len, False.len] =>
method word($/) {
True = [],
if $<with_c> + $<no_c> {
False = [],
make ($<with_c>».ast, $<no_c>».ast);
foreach(Word in Words)
} else {
if find(Word,Sub,_,_) then
make ();
True := [Word|True]
False := [Word|False]

method with_c($/) {
make "c" X~ $<ie_part>.ast;

method no_c($/) {
make "!c" X~ $<ie_part>.ast;

method ie_part($/) {
if ~$/ eq 'eie' {
make ('ei', 'ie');
} else {
make ~$/;

sub plausible($good, $bad, $msg) {
if $good > 2*$bad {
say "$msg: PLAUSIBLE ($good ✔ vs. $bad ✘)";
return True;
} else {
say "$msg: NOT PLAUSIBLE ($good ✔ vs. $bad ✘)";
return False;

my $results = CollectWords.parsefile("unixdict.txt", :actions(CollectWords::Actions)).ast;

my $phrasetest = [&] plausible($results<!cie>, $results<!cei>, "I before E when not preceded by C"),
plausible($results<cei>, $results<cie>, "E before I when preceded by C");

say "I before E except after C: ", $phrasetest ?? "PLAUSIBLE" !! "NOT PLAUSIBLE";</lang>

<pre>[cie = 24,cie_not = 465]
[cei = 13,cei_not = 213]

<pre>I before E when not preceded by C: PLAUSIBLE (466 ✔ vs. 217 ✘)
I before E when not preceeded by C (465 vs 213): plausible
E before I when preceded by C: NOT PLAUSIBLE (13 ✔ vs. 24 ✘)
E before I when preceeded by C (13 cs 24): not plausible</pre>
I before E except after C: NOT PLAUSIBLE</pre>

===Perl 6: Stretch Goal===
<syntaxhighlight lang="picolisp">(de ibEeaC (File . Prg)
Note that within the original text file, a tab character was erroneously replaced with a space. Thus, the following changes to the text file are needed before this solution will run:
<pre>--- orig_1_2_all_freq.txt 2014-02-01 14:36:53.124121018 -0800
(Cie (let N 0 (in File (while (from "cie") (run Prg))))
+++ 1_2_all_freq.txt 2014-02-01 14:37:10.525552980 -0800
Nie (let N 0 (in File (while (from "ie") (run Prg))))
@@ -2488,7 +2488,7 @@
Cei (let N 0 (in File (while (from "cei") (run Prg))))
other than Prep 43
Nei (let N 0 (in File (while (from "ei") (run Prg)))) )
visited Verb 43
(prinl "cie: " Cie)
cross NoC 43
(prinl "nie: " (dec 'Nie Cie))
- lie Verb 43
(prinl "cei: " Cei)
+ lie Verb 43
(prinl "nei: " (dec 'Nei Cei))
grown Verb 43
(let (NotI (> (* 3 Cie) Nie) NotE (> Nei (* 3 Cei)))
crowd NoC 43
recognised Verb 43</pre>
"I before E except after C: is"
(and NotI " not")
" plausible" )
"E before I when after C: is"
(and NotE " not")
" plausible" )
"Overall rule is"
(and (or NotI NotE) " not")
" plausible" ) ) ) )

(ibEeaC "unixdict.txt"
This solution requires just a few modifications to the grammar and actions from the non-stretch goal.
(inc 'N) )
<lang perl6>grammar CollectWords {
token TOP {
^^ \t Word \t PoS \t Freq $$ \n
[^^ <word> $$ \n?]+

token word {
[ <with_c> | <no_c> | \T ]+ \t+
\T+ \t+ # PoS doesn't matter to us, so ignore it
$<freq>=[<.digit>+] \h*

(ibEeaC "1_2_all_freq.txt"
token with_c {
(inc 'N (format (stem (line) "\t"))) )</syntaxhighlight>
c <ie_part>
<pre>cie: 24
nie: 466
cei: 13
nei: 217
I before E except after C: is plausible
E before I when after C: is not plausible
Overall rule is not plausible

cie: 994
token no_c {
nie: 8148
cei: 327
nei: 4826
I before E except after C: is plausible
E before I when after C: is not plausible
Overall rule is not plausible</pre>

token ie_part {
<syntaxhighlight lang="pli">iBeforeE: procedure options(main);
ie | ei
declare dict file;
open file(dict) title('unixdict.txt');
on endfile(dict) go to report;
declare (cie, xie, cei, xei) fixed;
declare word char(32) varying;
cie = 0;
xie = 0;
cei = 0;
xei = 0;
do while('1'b);
get file(dict) list(word);
if index(word, 'ie') ^= 0 then
if index(word, 'cie') ^= 0 then
cie = cie + 1;
xie = xie + 1;
if index(word, 'ei') ^= 0 then
if index(word, 'cei') ^= 0 then
cei = cei + 1;
xei = xei + 1;
close file(dict);
put skip list('CIE:', cie);
put skip list('xIE:', xie);
put skip list('CEI:', cei);
put skip list('xEI:', xei);
declare (ieNotC, eiC) bit;
ieNotC = xie > cie * 2;
eiC = cei > xei * 2;

put skip list('I before E when not preceded by C:');
class CollectWords::Actions {
if ^ieNotC then put list('not');
method TOP($/) {
put list('plausible.');
make $<word>».ast».flat.Bag;

put skip list('E before I when preceded by C:');
method word($/) {
if $<with_c> + $<no_c> {
if ^eiC then put list('not');
put list('plausible.');
make ($<with_c>».ast xx $<freq>, $<no_c>».ast xx $<freq>);
} else {
make ();

method with_c($/) {
make "c" ~ $<ie_part>;

method no_c($/) {
make "!c" ~ $<ie_part>;

sub plausible($good, $bad, $msg) {
if $good > 2*$bad {
say "$msg: PLAUSIBLE ($good ✔ vs. $bad ✘)";
return True;
} else {
say "$msg: NOT PLAUSIBLE ($good ✔ vs. $bad ✘)";
return False;

# can't use .parsefile like before due to the non-Unicode £ in this file.
my $file = slurp("1_2_all_freq.txt", :enc<iso-8859-1>);
my $results = CollectWords.parse($file, :actions(CollectWords::Actions)).ast;

my $phrasetest = [&] plausible($results<!cie>, $results<!cei>, "I before E when not preceded by C"),
plausible($results<cei>, $results<cie>, "E before I when preceded by C");

say "I before E except after C: ", $phrasetest ?? "PLAUSIBLE" !! "NOT PLAUSIBLE";</lang>

put skip list('I before E, except after C:');
if ^(ieNotC & eiC) then put list('not');
put list('plausible.');
end iBeforeE;</syntaxhighlight>
<pre>CIE: 24
<pre>I before E when not preceded by C: NOT PLAUSIBLE (8222 ✔ vs. 4826 ✘)
xIE: 465
E before I when preceded by C: NOT PLAUSIBLE (327 ✔ vs. 994 ✘)
CEI: 13
I before E except after C: NOT PLAUSIBLE</pre>
xEI: 213

I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.
I before E, except after C: not plausible.</pre>

<lang Powershell>$Web = New-Object -TypeName Net.Webclient
<syntaxhighlight lang="powershell">$Web = New-Object -TypeName Net.Webclient
$Words = $web.DownloadString('http://www.puzzlers.org/pub/wordlists/unixdict.txt')
$Words = $web.DownloadString('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt')
$IE = $EI = $CIE = $CEI = @()
$IE = $EI = $CIE = $CEI = @()
Line 1,427: Line 3,626:
if ($Clause1 -and $Clause2)
if ($Clause1 -and $Clause2)
{$MainClause = $True}
{$MainClause = $True}
"The plausibility of the phrase 'I before E except after C' is $MainClause"</lang>
"The plausibility of the phrase 'I before E except after C' is $MainClause"</syntaxhighlight>
Line 1,434: Line 3,633:
The plausibility of the phrase 'I before E except after C' is False
The plausibility of the phrase 'I before E except after C' is False

==={{header|Alternative Implementation}}===
<syntaxhighlight lang="powershell">$Web = New-Object -TypeName Net.Webclient
$Words = $web.DownloadString('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt')
$IE = $EI = $CIE = $CEI = @()
$Clause1 = $Clause2 = $MainClause = $false
foreach ($Word in $Words.split())
{
    switch ($Word)
    {
        {$_ -like '*cei*'} {$CEI += $Word; break}
        {$_ -like '*cie*'} {$CIE += $Word; break}
        {$_ -like '*ie*'} {$IE += $Word}
        {$_ -like '*ei*'} {$EI += $Word}
    }
}
if ($IE.count -gt $EI.count * 2)
{$Clause1 = $true}
"The plausibility of 'I before E when not preceded by C' is $Clause1"
if ($CEI.count -gt $CIE.count * 2)
{$Clause2 = $true}
"The plausibility of 'E before I when preceded by C' is $Clause2"
if ($Clause1 -and $Clause2)
{$MainClause = $True}
"The plausibility of the phrase 'I before E except after C' is $MainClause"</syntaxhighlight>
The plausibility of 'I before E when not preceded by C' is True
The plausibility of 'E before I when preceded by C' is False
The plausibility of the phrase 'I before E except after C' is False
==={{header|Alternative Implementation 2}}===
A single pass through the wordlist using the regex engine.
<syntaxhighlight lang="powershell">$webResult = Invoke-WebRequest -Uri http://wiki.puzzlers.org/pub/wordlists/unixdict.txt -UseBasicParsing

$cie, $cei, $_ie, $_ei = 0, 0, 0, 0

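# Counter-examples ('cie' and non-c 'ei') are added with weight 2, so the plain
# -gt comparisons below test "more than twice the opposite count".
[regex]::Matches($webResult.Content, '.(ie|ei)').foreach{
    if ($_.Value -eq 'cie') { $cie+=2 }
    elseif ($_.Value -eq 'cei') { $cei++ }
    elseif ($_.Value[1] -eq 'i' ) { $_ie++ }
    else { $_ei+=2 }
}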

"I before E when not preceded by C is plausible: $($_ie -gt $_ei)"
"E before I when preceded by C is plausible: $($cei -gt $cie)"
"I before E, except after C is plausible: $(($_ie -gt $_ei) -and ($cei -gt $cie))"</syntaxhighlight>
I before E when not preceded by C is plausible: True
E before I when preceded by C is plausible: False
I before E, except after C is plausible: False
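The single-pass idea can also be sketched in Python with the `re` module. This is an illustrative variant under stated assumptions, not a port of the PowerShell code: the function name `count_features` is hypothetical, and the pattern uses a lookahead so that overlapping hits (as in "eie") and pairs at the very start of a word are both seen, which the `.(ie|ei)` pattern does not attempt.

```python
import re
from collections import Counter

# The lookahead makes every match zero-width, so the scan can find overlapping
# hits; the (?<!c) lookbehind keeps a pair from being counted twice after "c".
PATTERN = re.compile(r"(?=(cie|cei|(?<!c)ie|(?<!c)ei))")

def count_features(text):
    """Count every occurrence of cie, cei, non-c ie and non-c ei in one scan."""
    hits = Counter(m.group(1) for m in PATTERN.finditer(text.lower()))
    return {key: hits[key] for key in ("cie", "cei", "ie", "ei")}

# Hypothetical sample text standing in for the downloaded word list.
print(count_features("ceiling science die weird deity"))
```

Because this counts occurrences rather than words, its totals on a full word list need not equal those of the per-word solutions.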

<lang purebasic>If ReadFile(1,GetPathPart(ProgramFilename())+"wordlist(en).txt")
<syntaxhighlight lang="purebasic">If ReadFile(1,GetPathPart(ProgramFilename())+"wordlist(en).txt")
While Not Eof(1)
While Not Eof(1)
Line 1,457: Line 3,716:
Print("Overall the rule is : ")
Print("Overall the rule is : ")
If cei>cie And ie>ei : PrintN("PLAUSIBLE") : Else : PrintN("NOT PLAUSIBLE") : EndIf
If cei>cie And ie>ei : PrintN("PLAUSIBLE") : Else : PrintN("NOT PLAUSIBLE") : EndIf
Line 1,471: Line 3,730:
Overall the rule is : NOT PLAUSIBLE
Overall the rule is : NOT PLAUSIBLE

<lang python>import urllib.request
<syntaxhighlight lang="python">import urllib.request
import re
import re

Line 1,491: Line 3,751:

def simple_stats(url='http://www.puzzlers.org/pub/wordlists/unixdict.txt'):
def simple_stats(url='http://wiki.puzzlers.org/pub/wordlists/unixdict.txt'):
words = urllib.request.urlopen(url).read().decode().lower().split()
words = urllib.request.urlopen(url).read().decode().lower().split()
cie = len({word for word in words if 'cie' in word})
cie = len({word for word in words if 'cie' in word})
Line 1,508: Line 3,768:

print('Checking plausibility of "I before E except after C":')
print('Checking plausibility of "I before E except after C":')

Line 1,524: Line 3,784:
===Python: Stretch Goal===
===Python: Stretch Goal===
Add the following to the bottom of the previous program:
Add the following to the bottom of the previous program:
<lang python>def stretch_stats(url='http://ucrel.lancs.ac.uk/bncfreq/lists/1_2_all_freq.txt'):
<syntaxhighlight lang="python">def stretch_stats(url='http://ucrel.lancs.ac.uk/bncfreq/lists/1_2_all_freq.txt'):
freq = [line.strip().lower().split()
freq = [line.strip().lower().split()
for line in urllib.request.urlopen(url)
for line in urllib.request.urlopen(url)
Line 1,539: Line 3,799:
print('\n\nChecking plausibility of "I before E except after C"')
print('\n\nChecking plausibility of "I before E except after C"')
print('And taking account of word frequencies in British English:')
print('And taking account of word frequencies in British English:')

{{out|Produces this extra output}}
{{out|Produces this extra output}}
Line 1,553: Line 3,813:
(To be plausible, one count must exceed another by 2 times)</pre>
(To be plausible, one count must exceed another by 2 times)</pre>

<syntaxhighlight lang="qbasic">DEFINT A-Z
IF INSTR(W, "ie") THEN IF INSTR(W, "cie") THEN CI = CI + 1 ELSE XI = XI + 1
IF INSTR(W, "ei") THEN IF INSTR(W, "cei") THEN CE = CE + 1 ELSE XE = XE + 1

PRINT "I before E when not preceded by C: ";
IF 2 * XI <= CI THEN PRINT "not ";
PRINT "plausible."
PRINT "E before I when preceded by C: ";
IF 2 * CE <= XE THEN PRINT "not ";
PRINT "plausible."</syntaxhighlight>

<lang rsplus>words = tolower(readLines("http://www.puzzlers.org/pub/wordlists/unixdict.txt"))
<syntaxhighlight lang="rsplus">words = tolower(readLines("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt"))
ie.npc = sum(grepl("(?<!c)ie", words, perl = T))
ie.npc = sum(grepl("(?<!c)ie", words, perl = T))
ei.npc = sum(grepl("(?<!c)ei", words, perl = T))
ei.npc = sum(grepl("(?<!c)ei", words, perl = T))
Line 1,566: Line 3,852:
message("(1) is ", (if (p1) "" else "not "), "plausible.")
message("(1) is ", (if (p1) "" else "not "), "plausible.")
message("(2) is ", (if (p2) "" else "not "), "plausible.")
message("(2) is ", (if (p2) "" else "not "), "plausible.")
message("The whole phrase is ", (if (p1 && p2) "" else "not "), "plausible.")</lang>
message("The whole phrase is ", (if (p1 && p2) "" else "not "), "plausible.")</syntaxhighlight>

Line 1,574: Line 3,860:

<lang racket>#lang racket
<syntaxhighlight lang="racket">#lang racket

(define (get-tallies filename line-parser . patterns)
(define (get-tallies filename line-parser . patterns)
Line 1,605: Line 3,891:

(plausibility "Dictionary" "unixdict.txt" (λ (line) (list line 1))) (newline)
(plausibility "Dictionary" "unixdict.txt" (λ (line) (list line 1))) (newline)
(plausibility "Word frequencies (stretch goal)" "1_2_all_freq.txt" parse-frequency-data)</lang>
(plausibility "Word frequencies (stretch goal)" "1_2_all_freq.txt" parse-frequency-data)</syntaxhighlight>

Line 1,621: Line 3,907:
Overall, the rule "I before E, except after C" is IMPLAUSIBLE.
Overall, the rule "I before E, except after C" is IMPLAUSIBLE.

(formerly Perl 6)
This solution uses grammars and actions to parse the given file, the <tt>Bag</tt> for tallying up occurrences of each possible thing we're looking for ("ie", "ei", "cie", and "cei"), and junctions to determine the plausibility of a phrase from the subphrases. Note that a version of rakudo newer than the January 2014 compiler or Star releases is needed, as this code relies on a recent bugfix to the <tt>make</tt> function.
<syntaxhighlight lang="raku" line>grammar CollectWords {
    token TOP {
        [^^ <word> $$ \n?]+
    }

    token word {
        [ <with_c> | <no_c> | \N ]+
    }

    token with_c {
        c <ie_part>
    }

    token no_c {
        <ie_part>
    }

    token ie_part {
        ie | ei | eie # a couple words in the list have "eie"
    }
}

class CollectWords::Actions {
    method TOP($/) {
        make $<word>».ast.flat.Bag;
    }

    method word($/) {
        if $<with_c> + $<no_c> {
            make flat $<with_c>».ast, $<no_c>».ast;
        } else {
            make ();
        }
    }

    method with_c($/) {
        make "c" X~ $<ie_part>.ast;
    }

    method no_c($/) {
        make "!c" X~ $<ie_part>.ast;
    }

    method ie_part($/) {
        if ~$/ eq 'eie' {
            make ('ei', 'ie');
        } else {
            make ~$/;
        }
    }
}

sub plausible($good, $bad, $msg) {
    if $good > 2*$bad {
        say "$msg: PLAUSIBLE ($good vs. $bad ✘)";
        return True;
    } else {
        say "$msg: NOT PLAUSIBLE ($good vs. $bad ✘)";
        return False;
    }
}

my $results = CollectWords.parsefile("unixdict.txt", :actions(CollectWords::Actions)).ast;

my $phrasetest = [&] plausible($results<!cie>, $results<!cei>, "I before E when not preceded by C"),
                     plausible($results<cei>, $results<cie>, "E before I when preceded by C");

say "I before E except after C: ", $phrasetest ?? "PLAUSIBLE" !! "NOT PLAUSIBLE";</syntaxhighlight>


<pre>I before E when not preceded by C: PLAUSIBLE (466 vs. 217 ✘)
E before I when preceded by C: NOT PLAUSIBLE (13 vs. 24 ✘)
I before E except after C: NOT PLAUSIBLE</pre>
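
The tallying idea described for this solution (count each of "ie", "ei", "cie" and "cei" in a bag, then compare the totals) can be sketched in Python as follows. This is only a rough analogue, not a translation of the grammar: the names <tt>tally</tt> and <tt>plausible</tt> and the sample words are made up for illustration, and the second pair inside "eie" is always counted as not preceded by C, which may differ from what the grammar does.

```python
import re
from collections import Counter

def tally(words):
    """Count every 'ie'/'ei' occurrence, keyed by whether a 'c' precedes it."""
    counts = Counter()
    for word in words:
        word = word.lower()
        # The lookahead lets matches overlap, so "eie" yields both "ei" and "ie".
        for m in re.finditer(r"(?=(ie|ei))", word):
            after_c = m.start() > 0 and word[m.start() - 1] == "c"
            counts[("c" if after_c else "!c") + m.group(1)] += 1
    return counts

def plausible(good, bad):
    """A feature is plausible if it occurs more than twice as often as its opposite."""
    return good > 2 * bad

sample = ["believe", "receive", "science", "weird", "hygieia"]  # made-up sample words
counts = tally(sample)
print(counts)
print(plausible(counts["!cie"], counts["!cei"]), plausible(counts["cei"], counts["cie"]))
```

A <tt>Counter</tt> plays the role of the <tt>Bag</tt> here: a missing key simply counts as zero.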

===Raku: Stretch Goal===
Note that within the original text file, a tab character was erroneously replaced with a space. Thus, the following changes to the text file are needed before this solution will run:
<pre>--- orig_1_2_all_freq.txt 2014-02-01 14:36:53.124121018 -0800
+++ 1_2_all_freq.txt 2014-02-01 14:37:10.525552980 -0800
@@ -2488,7 +2488,7 @@
other than Prep 43
visited Verb 43
cross NoC 43
- lie Verb 43
+ lie Verb 43
grown Verb 43
crowd NoC 43
recognised Verb 43</pre>
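
As an aside, a parser that splits on any run of whitespace would not need this patch. The Python sketch below assumes each data row is "word, part of speech, frequency"; the function name <tt>parse_row</tt> and the sample rows are hypothetical and only illustrate the idea.

```python
def parse_row(line):
    """Return (word, part_of_speech, frequency), or None for rows that do not fit."""
    fields = line.split()  # splits on runs of spaces and/or tabs
    if len(fields) != 3 or not fields[2].isdigit():
        return None
    word, pos, freq = fields
    return word.lower(), pos, int(freq)

rows = [
    "\tlie\tVerb\t43",         # well-formed row
    "\tlie Verb\t43",          # a tab replaced by a space still parses
    "\tother than\tPrep\t43",  # a phrase gives four fields and is skipped
    "\tWord\tPoS\tFreq",       # the header row has no numeric frequency
]
parsed = [parse_row(r) for r in rows]
print(parsed)
```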

This solution requires just a few modifications to the grammar and actions from the non-stretch goal.
<syntaxhighlight lang="raku" line>grammar CollectWords {
    token TOP {
        ^^ \t Word \t PoS \t Freq $$ \n
        [^^ <word> $$ \n?]+
    }

    token word {
        \t+
        [ <with_c> | <no_c> | \T ]+ \t+
        \T+ \t+ # PoS doesn't matter to us, so ignore it
        $<freq>=[<.digit>+] \h*
    }

    token with_c {
        c <ie_part>
    }

    token no_c {
        <ie_part>
    }

    token ie_part {
        ie | ei
    }
}

class CollectWords::Actions {
    method TOP($/) {
        make $<word>».ast.flat.Bag;
    }

    method word($/) {
        if $<with_c> + $<no_c> {
            make flat $<with_c>».ast xx +$<freq>, $<no_c>».ast xx +$<freq>;
        } else {
            make ();
        }
    }

    method with_c($/) {
        make "c" ~ $<ie_part>;
    }

    method no_c($/) {
        make "!c" ~ $<ie_part>;
    }
}

sub plausible($good, $bad, $msg) {
    if $good > 2*$bad {
        say "$msg: PLAUSIBLE ($good vs. $bad ✘)";
        return True;
    } else {
        say "$msg: NOT PLAUSIBLE ($good vs. $bad ✘)";
        return False;
    }
}

# can't use .parsefile like before due to the non-Unicode £ in this file.
my $file = slurp("1_2_all_freq.txt", :enc<iso-8859-1>);
my $results = CollectWords.parse($file, :actions(CollectWords::Actions)).ast;

my $phrasetest = [&] plausible($results<!cie>, $results<!cei>, "I before E when not preceded by C"),
                     plausible($results<cei>, $results<cie>, "E before I when preceded by C");

say "I before E except after C: ", $phrasetest ?? "PLAUSIBLE" !! "NOT PLAUSIBLE";</syntaxhighlight>

<pre>I before E when not preceded by C: NOT PLAUSIBLE (8222 vs. 4826 ✘)
E before I when preceded by C: NOT PLAUSIBLE (327 vs. 994 ✘)
I before E except after C: NOT PLAUSIBLE</pre>

The script processes both the task and the stretch goal.
In the stretch goal, "rows with three space or tab separated words only" (7574 out of 7726) are processed, excluding all expressions like "out of".
<syntaxhighlight lang="red">Red ["i before e except after c"]

testlist: function [wordlist /wfreq] [
    cie: cei: ie: ei: 0
    if not wfreq [forall wordlist [insert wordlist: next wordlist 1]]
    foreach [word freq] wordlist [
        parse word [ some [
            "cie" (cie: cie + freq) |
            "cei" (cei: cei + freq) |
            "ie" (ie: ie + freq) |
            "ei" (ei: ei + freq) |
            skip
        ]]
    ]
    print rejoin [
        "i is before e " ie " times, and also " cie " times following c.^/"
        "i is after e " ei " times, and also " cei " times following c.^/"
        "Hence ^"i before e^" is " either a: 2 * ei < ie [""] ["not "] "plausible,^/"
        "while ^"except after c^" is " either b: 2 * cie < cei [""] ["not "] "plausible.^/"
        "Overall the rule is " either a and b [""] ["not "] "plausible."]
]

print "Results for unixdict.txt:"
testlist read/lines http://wiki.puzzlers.org/pub/wordlists/unixdict.txt

print "^/Results for British National Corpus:"
bnc: next read/lines %1_2_all_freq.txt
spaces: charset "^- "
bnclist: collect [ foreach w bnc [
    if 3 = length? seq: split trim w spaces [
        keep seq/1 keep to-integer seq/3
    ]
]]
testlist/wfreq bnclist</syntaxhighlight>

<pre>Results for unixdict.txt:
i is before e 464 times, and also 24 times following c.
i is after e 217 times, and also 13 times following c.
Hence "i before e" is plausible,
while "except after c" is not plausible.
Overall the rule is not plausible.

Results for British National Corpus:
i is before e 8207 times, and also 994 times following c.
i is after e 4826 times, and also 327 times following c.
Hence "i before e" is not plausible,
while "except after c" is not plausible.
Overall the rule is not plausible.</pre>
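
The way a single routine serves both the plain dictionary and the frequency list (every dictionary word is simply given a frequency of 1) can be sketched in Python. This is an illustrative sketch only: <tt>weighted_counts</tt> and the sample pairs are made-up names and data, and the scanning loop merely imitates the "try the longer patterns first, otherwise skip one character" order of the parse rule above.

```python
def weighted_counts(pairs):
    """Sum frequencies per pattern; 'cie'/'cei' are tried before 'ie'/'ei'."""
    totals = {"cie": 0, "cei": 0, "ie": 0, "ei": 0}
    for word, freq in pairs:
        i = 0
        while i < len(word):
            for key in ("cie", "cei", "ie", "ei"):
                if word.startswith(key, i):
                    totals[key] += freq
                    i += len(key)  # consume the matched pattern
                    break
            else:
                i += 1  # no pattern here: skip one character
    return totals

# Made-up sample data: (word, frequency) pairs.
weighted = weighted_counts([("believe", 10), ("receive", 5), ("science", 2), ("their", 100)])
# An unweighted word list is the same call with every frequency set to 1.
unweighted = weighted_counts([(w, 1) for w in ("believe", "receive", "science", "their")])
print(weighted)
print(unweighted)
```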

The following assumptions were made about the (default) dictionary:
The following assumptions were made about the (default) dictionary:
:* there could be leading and/or trailing blanks or tabs
::* &nbsp; there could be leading and/or trailing blanks or tabs
:* the dictionary words are in mixed case.
::* &nbsp; the dictionary words are in mixed case.
:* there could be blank lines
::* &nbsp; there could be blank lines
:* there may be more than one occurrence of a target string within a word [einsteinium]
::* &nbsp; there may be more than one occurrence of a target string within a word &nbsp; [einsteinium]
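
The assumptions listed above can be illustrated with a short Python sketch (not the REXX code itself). The helper names <tt>normalize</tt> and <tt>occurrences</tt> and the sample lines are hypothetical.

```python
def normalize(lines):
    """Yield upper-cased words with all blanks and tabs removed, skipping blank lines."""
    for line in lines:
        word = "".join(line.split()).upper()
        if word:
            yield word

def occurrences(word, target):
    """Return (preceded_by_C, not_preceded_by_C) counts of target within word."""
    with_c = without_c = 0
    start = word.find(target)
    while start != -1:
        if start > 0 and word[start - 1] == "C":
            with_c += 1
        else:
            without_c += 1
        start = word.find(target, start + 1)  # keep looking for further occurrences
    return with_c, without_c

words = list(normalize(["  Einsteinium\t", "", "   ", "ceiling"]))
print(words)
print(occurrences("EINSTEINIUM", "EI"), occurrences("CEILING", "EI"))
```

Searching again from <tt>start + 1</tt> is what lets a word such as "einsteinium" contribute two occurrences.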

===unweighted version===
===unweighted version===
<lang rexx>/*REXX pgm shows plausibility of I before E when not preceded by C, and*/
<syntaxhighlight lang="rexx">/*REXX program shows plausibility of "I before E" when not preceded by C, and */
/*────────────────────────────── E before I when preceded by C. */
/*───────────────────────────────────── "E before I" when preceded by C. */
#.=0 /*zero out various word counters.*/
parse arg iFID . /*obtain optional argument from the CL.*/
parse arg iFID .; if iFID=='' then iFID='UNIXDICT.TXT' /*use default?*/
if iFID=='' | iFID=="," then iFID='UNIXDICT.TXT' /*Not specified? Then use the default.*/
#.=0 /*zero out the various word counters. */
do r=0 while lines(iFID)\==0 /*keep reading the dictionary 'til done*/
u=space( lineIn(iFID), 0); upper u /*elide superfluous blanks and tabs. */
if u=='' then iterate /*Is it a blank line? Then ignore it.*/
#.words=#.words + 1 /*keep running count of number of words*/
if pos('EI', u)\==0 & pos('IE', u)\==0 then #.both=#.both + 1 /*the word has both*/
call find 'ie' /*look for ie */
call find 'ei' /* " " ei */
end /*r*/ /*at exit of DO loop, R = # of lines.*/

L=length(#.words) /*use this to align the output numbers.*/
do r=0 while lines(ifid)\==0; _=linein(iFID) /*get a single line.*/
say 'lines in the ' iFID " dictionary: " r
u=translate(space(_,0)) /*elide superfluous blanks & tabs*/
if u=='' then iterate /*if a blank line, then ignore it*/
say 'words in the ' iFID " dictionary: " #.words
#.words=#.words+1 /*keep a running count of #words.*/
if pos('EI',u)\==0 & pos('IE',u)\==0 then #.both=#.both+1 /*has both.*/
call find 'ie'
call find 'ei'
end /*r*/

L=length(#.words) /*use this to align the output #s*/
say 'lines in the ' ifid ' dictionary: ' r
say 'words in the ' ifid ' dictionary: ' #.words
say 'words with "IE" and "EI" (in same word): ' right(#.both,L)
say 'words with "IE" and "EI" (in same word): ' right(#.both, L)
say 'words with "IE" and preceded by "C": ' right(#.ie.c ,L)
say 'words with "IE" and preceded by "C": ' right(#.ie.c ,L)
say 'words with "IE" and not preceded by "C": ' right(#.ie.z ,L)
say 'words with "IE" and not preceded by "C": ' right(#.ie.z ,L)
say 'words with "EI" and preceded by "C": ' right(#.ei.c ,L)
say 'words with "EI" and preceded by "C": ' right(#.ei.c ,L)
say 'words with "EI" and not preceded by "C": ' right(#.ei.z ,L)
say 'words with "EI" and not preceded by "C": ' right(#.ei.z ,L)
say; mantra='The spelling mantra '
say; mantra= 'The spelling mantra '
p1=#.ie.z/max(1,#.ei.z); phrase='"I before E when not preceded by C"'
p1=#.ie.z / max(1, #.ei.z); phrase= '"I before E when not preceded by C"'
say mantra phrase ' is ' word("im", 1+(p1>2))'plausible.'
say mantra phrase ' is ' word("im", 1 + (p1>2) )'plausible.'
p2=#.ie.c/max(1,#.ei.c); phrase='"E before I when preceded by C"'
p2=#.ie.c / max(1, #.ei.c); phrase= '"E before I when preceded by C"'
say mantra phrase ' is ' word("im", 1+(p2>2))'plausible.'
say mantra phrase ' is ' word("im", 1 + (p2>2) )'plausible.'
po=p1>2 & p2>2; say 'Overall, it is' word("im",1+po)'plausible.'
po=(p1>2 & p2>2); say 'Overall, it is' word("im", 1 + po)'plausible.'
exit /*stick a fork in it, we're done.*/
exit /*stick a fork in it, we're all done. */
/*──────────────────────────────────FIND subroutine─────────────────────*/
find: arg x; s=1; do forever; _=pos(x,u,s); if _==0 then leave
find: arg x; s=1; do forever; _=pos(x, u, s); if _==0 then return
if substr(u,_-1+(_==1)*999,1)=='C' then #.x.c=#.x.c+1
if substr(u, _ - 1 + (_==1)*999, 1)=='C' then #.x.c=#.x.c + 1
else #.x.z=#.x.z+1
else #.x.z=#.x.z + 1
s=_+1 /*handle case of multiple finds. */
s=_ + 1 /*handle the cases of multiple finds. */
end /*forever*/
end /*forever*/</syntaxhighlight>
{{out|output|text=&nbsp; when using the default dictionary:}}
{{out}} when using the default dictionary
lines in the UNIXDICT.TXT dictionary: 25104
lines in the UNIXDICT.TXT dictionary: 25104
Line 1,674: Line 4,176:
words with "IE" and "EI" (in same word): 4
words with "IE" and "EI" (in same word): 4
words with "IE" and preceded by "C": 24
words with "IE" and preceded by "C": 24
words with "IE" and not preceded by "C": 465
words with "IE" and not preceded by "C": 466
words with "EI" and preceded by "C": 13
words with "EI" and preceded by "C": 13
words with "EI" and not preceded by "C": 213
words with "EI" and not preceded by "C": 217

The spelling mantra "I before E when not preceded by C" is plausible.
The spelling mantra "I before E when not preceded by C" is plausible.
Line 1,685: Line 4,187:
===weighted version===
===weighted version===
Using the default word frequency count file, several discrepancies (or not) became apparent:
Using the default word frequency count file, several discrepancies (or not) became apparent:
:* some "words" were in fact, phrases
::* &nbsp; some "words" were, in fact, &nbsp; phrases
:* some words were in the form of &nbsp; &nbsp; x / y &nbsp; &nbsp; indicating x OR y
::* &nbsp; some words were in the form of &nbsp; &nbsp; x / y &nbsp; &nbsp; indicating x OR y
:* some words were in the form of &nbsp; &nbsp; x/y &nbsp; &nbsp; (with no blanks) &nbsp; indicating x OR y, &nbsp; or a word)
::* &nbsp; some words were in the form of &nbsp; &nbsp; x/y &nbsp; &nbsp; &nbsp; (with no blanks) &nbsp; indicating x OR y, &nbsp; or a word
:* some words had a '''~''' prefix
::* &nbsp; some words had a &nbsp; '''~''' &nbsp; prefix
:* some words had a '''*''' suffix
::* &nbsp; some words had a &nbsp; '''*''' &nbsp; suffix
:* some words had a '''~''' suffix
::* &nbsp; some words had a &nbsp; '''~''' &nbsp; suffix
:* some words had a '''~''' and '''*''' suffix
::* &nbsp; some words had a &nbsp; '''~''' &nbsp; and &nbsp; '''*''' &nbsp; suffix
:* one word had a '''~''' prefix and a '''~''' suffix
::* &nbsp; one word had a &nbsp; '''~''' &nbsp; prefix and a &nbsp; '''~''' &nbsp; suffix
:* some lines had an imbedded '''[xxx]''' comment
::* &nbsp; some lines had an embedded &nbsp; '''[xxx]''' &nbsp; comment
:* some words had a &nbsp; ''' ' ''' &nbsp; (quote) &nbsp; prefix to indicate a:
::* &nbsp; some words had a &nbsp; ''' ' ''' &nbsp; (quote) &nbsp; prefix to indicate a:
::* possessive
::::* &nbsp; possessive
::* plural
::::* &nbsp; plural
::* contraction
::::* &nbsp; contraction
::* word &nbsp; (as is)
::::* &nbsp; word &nbsp; (as is)
All of the cases where an asterisk ['''*'''] or tilde ['''~'''] were used were '''not''' programmatically handled within the REXX program; &nbsp; it is assumed that prefixes and suffixes were being used to indicate multiple words that either begin or end with (any) string &nbsp; (or in some case, both).
None of the cases where an asterisk &nbsp; ['''*'''] &nbsp; or tilde &nbsp; ['''~'''] &nbsp; was used is handled programmatically within the REXX program; &nbsp; it is assumed that these prefixes and suffixes indicate multiple words that either begin or end with (any) string &nbsp; (or, in some cases, both).
<br>A cursory look at the file seems to indicate that the use of the tilde and/or asterisk doesn't affect the rules for the mantra phrases.
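
How such rows might be folded into a frequency table (last token taken as the frequency, &nbsp; x/y &nbsp; split into two words, tildes treated like asterisks) is sketched below in Python. This is a simplified illustration rather than the REXX logic: <tt>frequency_table</tt> and the sample rows are made up, and the table starts each word at zero.

```python
def frequency_table(lines):
    """Build {WORD: summed frequency}; rows without a numeric last token are skipped."""
    table = {}
    for line in lines:
        tokens = line.upper().replace("~", "*").split()
        if not tokens or not tokens[-1].isdigit():
            continue  # blank line, header, or comment row
        freq = int(tokens[-1])
        first, _, second = " ".join(tokens).partition("/")  # handle the x/y form
        seen = []
        for part in (first, second):
            pieces = part.split()
            if pieces and pieces[0] not in seen:  # ignore a repeated second word
                seen.append(pieces[0])
        for word in seen:
            table[word] = table.get(word, 0) + freq
    return table

# Made-up sample rows in the general shape of the frequency list.
table = frequency_table([
    "\tlie\tVerb\t43",
    "\tcan/may\tVMod\t10",
    "\talbeit ~\tConj\t3",
    "",
    "\tWord\tPoS\tFreq",
])
print(table)
```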
<lang rexx>/*REXX pgm shows plausibility of I before E when not preceded by C, and*/
/*────────────────────────────── E before I when preceded by C using a*/
/*────────────────────────────── weighted frequency for each word. */
#.=0 /*zero out various word counters.*/
parse arg iFID wFID .
if iFID=='' | iFID==',' then iFID='UNIXDICT.TXT' /*use the default? */
if wFID=='' | wFID==',' then wFID='WORDFREQ.TXT' /*use the default? */
tabs=xrange('0'x, "f"x)
f.=1 /*default word freq. multiplier. */

A cursory look at the file seems to indicate that the use of the tilde and/or asterisk doesn't affect the rules for the mantra phrases.
do recs=0 while lines(wFID)\==0; _=linein(wFID) /*get a record. */
<syntaxhighlight lang="rexx">/*REXX program shows plausibility of "I before E" when not preceded by C, and */
u=translate(_,,tabs); upper u /*trans various tabs & low hexex.*/
/*───────────────────────────────────── "E before I" when preceded by C, using a */
u=translate(u,'*', "~") /*translate tildes to an asterisk*/
/*───────────────────────────────────── weighted frequency for each word. */
if u=='' then iterate /*if a blank line, then ignore it*/
freq=word(u,words(u)) /*get the last token on the line.*/
parse arg iFID wFID . /*obtain optional arguments from the CL*/
if \datatype(freq,'W') then iterate /*Not numeric? Then ignore it. */
if iFID=='' | iFID=="," then iFID='UNIXDICT.TXT' /*Not specified? Then use the default.*/
parse var u w.1 '/' w.2 . /*handle case of: ααα/ßßß ... */
if wFID=='' | wFID=="," then wFID='WORDFREQ.TXT' /* " " " " " " */
cntl=xrange(, ' ') /*get all manner of tabs, control chars*/
#.=0 /*zero out the various word counters. */
f.=1 /*default word frequency multiplier. */
do recs=0 while lines(wFID)\==0 /*read a record from the file 'til done*/
u=translate( linein(wFID), , cntl); upper u /*translate various tabs and cntl chars*/
u=translate(u, '*', "~") /*translate tildes (~) to an asterisk.*/
if u=='' then iterate /*Is this a blank line? Then ignore it.*/
freq=word(u, words(u) ) /*obtain the last token on the line. */
if \datatype(freq, 'W') then iterate /*FREQ not an integer? Then ignore it.*/
parse var u w.1 '/' w.2 . /*handle case of: ααα/ßßß ··· */

do j=1 for 2; w.j=word(w.j,1) /*strip leading/trailing blanks */
do j=1 for 2; w.j=word(w.j, 1) /*strip leading and/or trailing blanks.*/
_=w.j; if _=='' then iterate /*if not present, then ignore it.*/
_=w.j; if _=='' then iterate /*if not present, then ignore it. */
if j==2 then if w.2==w.1 then iterate /*2nd word=1st word? skip.*/
if j==2 then if w.2==w.1 then iterate /*second word ≡ first word? Then skip.*/
#.freqs = #.freqs + 1 /*bump word count in FREQ list.*/
#.freqs=#.freqs + 1 /*bump word counter in the FREQ list.*/
f._ = f._ + freq /*add to a word's frequency count*/
f._=f._ + freq /*add to a word's frequency count. */
end /*ws*/
end /*ws*/
end /*recs*/ /*at exit of DO loop, RECS = # of recs.*/

if recs\==0 then say 'lines in the ' wFID " list: " recs
end /*recs*/
if #.freqs\==0 then say 'words in the ' wFID " list: " #.freqs

if #.freqs ==0 then weighted=
if recs\==0 then say 'lines in the ' wFID ' list: ' recs
if #.freqs\==0 then say 'words in the ' wFID ' list: ' #.freqs
else weighted= ' (weighted)'
if #.freqs==0 then weighted=
else weighted=' (weighted)'
do r=0 while lines(iFID)\==0 /*keep reading the dictionary 'til done*/
u=space( linein(iFID), 0); upper u /*elide superfluous blanks and tabs. */
if u=='' then iterate /*Is it a blank line? Then ignore it.*/
#.words=#.words + 1 /*keep running count of number of words*/
if pos('EI', u)\==0 & pos('IE', u)\==0 then #.both=#.both + one /*the word has both*/
call find 'ie' /*look for ie */
call find 'ei' /* " " ei */
end /*r*/ /*at exit of DO loop, R = # of lines.*/

L=length(#.words) /*use this to align the output numbers.*/
do r=0 while lines(iFID)\==0; _=linein(iFID) /*get a single line.*/
u=space(_,0); upper u /*elide superfluous blanks & tabs*/
say 'lines in the ' iFID ' dictionary: ' r
if u=='' then iterate /*if a blank line, then ignore it*/
say 'words in the ' iFID ' dictionary: ' #.words
#.words=#.words+1 /*keep a running count of #words.*/
if pos('EI',u)\==0 & pos('IE',u)\==0 then #.both=#.both+one /*has both*/
call find 'ie'
call find 'ei'
end /*r*/

L=length(#.words) /*use this to align the output #s*/
say 'lines in the ' iFID ' dictionary: ' r
say 'words in the ' iFID ' dictionary: ' #.words
say 'words with "IE" and "EI" (in same word): ' right(#.both,L) weighted
say 'words with "IE" and "EI" (in same word): ' right(#.both, L) weighted
say 'words with "IE" and preceded by "C": ' right(#.ie.c ,L) weighted
say 'words with "IE" and preceded by "C": ' right(#.ie.c ,L) weighted
say 'words with "IE" and not preceded by "C": ' right(#.ie.z ,L) weighted
say 'words with "IE" and not preceded by "C": ' right(#.ie.z ,L) weighted
say 'words with "EI" and preceded by "C": ' right(#.ei.c ,L) weighted
say 'words with "EI" and preceded by "C": ' right(#.ei.c ,L) weighted
say 'words with "EI" and not preceded by "C": ' right(#.ei.z ,L) weighted
say 'words with "EI" and not preceded by "C": ' right(#.ei.z ,L) weighted
say; mantra='The spelling mantra '
say; mantra= 'The spelling mantra '
p1=#.ie.z/max(1,#.ei.z); phrase='"I before E when not preceded by C"'
p1=#.ie.z / max(1, #.ei.z); phrase= '"I before E when not preceded by C"'
say mantra phrase ' is ' word("im", 1+(p1>2))'plausible.'
say mantra phrase ' is ' word("im", 1 + (p1>2) )'plausible.'
p2=#.ie.c/max(1,#.ei.c); phrase='"E before I when preceded by C"'
p2=#.ie.c / max(1, #.ei.c); phrase= '"E before I when preceded by C"'
say mantra phrase ' is ' word("im", 1+(p2>2))'plausible.'
say mantra phrase ' is ' word("im", 1 + (p2>2) )'plausible.'
po=p1>2 & p2>2; say 'Overall, it is' word("im",1+po)'plausible.'
po=(p1>2 & p2>2); say 'Overall, it is' word("im",1 + po)'plausible.'
exit /*stick a fork in it, we're done.*/
exit /*stick a fork in it, we're all done. */
/*──────────────────────────────────FIND subroutine─────────────────────*/
find: arg x; s=1; do forever; _=pos(x,u,s); if _==0 then leave
find: arg x; s=1; do forever; _=pos(x, u, s); if _==0 then return
if substr(u,_-1+(_==1)*999,1)=='C' then #.x.c=#.x.c+one
if substr(u, _ - 1 + (_==1)*999, 1)=='C' then #.x.c=#.x.c + one
else #.x.z=#.x.z+one
else #.x.z=#.x.z + one
s=_+1 /*handle case of multiple finds. */
s=_ + 1 /*handle the cases of multiple finds. */</syntaxhighlight>
{{out|output|text=&nbsp; when using the default dictionary and default word frequency list:}}
end /*forever*/
{{out}} when using the default dictionary and default word frequency list
lines in the WORDFREQ.TXT list: 7727
lines in the WORDFREQ.TXT list: 7727
Line 1,784: Line 4,282:
The spelling mantra "E before I when preceded by C" is plausible.
The spelling mantra "E before I when preceded by C" is plausible.
Overall, it is implausible.
Overall, it is implausible.

<syntaxhighlight lang="ring">
# Project : I before E except after C

fn1 = "unixdict.txt"

fp = fopen(fn1,"r")
str = fread(fp, getFileSize(fp))
strcount = str2list(str)
see "The number of words in unixdict : " + len(strcount) + nl
cei = count(str, "cei")
cie = count(str, "cie")
ei = count(str, "ei")
ie = count(str, "ie")
see "Instances of cei : " + cei + nl
see "Instances of cie : " + cie + nl
see "Rule: 'e' before 'i' when preceded by 'c' is = "
if cei>cie see "plausible" + nl else see"not plausible" + nl ok
see "Instances of *ei, where * is not c : " + (ei-cei) + nl
see "Instances of *ie, where * is not c: " + (ie-cie) + nl
see "Rule: 'i' before 'e' when not preceded by 'c' is = "
if ie>ei see "plausible" + nl else see "not plausible" + nl ok
see "Overall the rule is : "
if cei>cie and ie>ei see "PLAUSIBLE" + nl else see "NOT PLAUSIBLE" + nl ok

func getFileSize fp
    c_filestart = 0
    c_fileend = 2
    fseek(fp,0,c_fileend)
    nfilesize = ftell(fp)
    fseek(fp,0,c_filestart)
    return nfilesize

func count(cString,dString)
    sum = 0
    while substr(cString,dString) > 0
        sum = sum + 1
        cString = substr(cString,substr(cString,dString)+len(string(sum)))
    end
    return sum
</syntaxhighlight>
{{out}}
<pre>The number of words in unixdict : 25104
Instances of cei : 13
Instances of cie : 24
Rule: 'e' before 'i' when preceded by 'c' is = not plausible
Instances of *ei, where * is not c : 217
Instances of *ie, where * is not c: 466
Rule: 'i' before 'e' when not preceded by 'c' is = plausible
Overall the rule is : NOT PLAUSIBLE</pre>

<lang ruby>require 'open-uri'
<syntaxhighlight lang="ruby">require 'open-uri'

plausibility_ratio = 2
plausibility_ratio = 2
counter = Hash.new(0)
counter = Hash.new(0)
path = 'http://www.puzzlers.org/pub/wordlists/unixdict.txt'
path = 'http://wiki.puzzlers.org/pub/wordlists/unixdict.txt'
rules = [['I before E when not preceded by C:', 'ie', 'ei'],
rules = [['I before E when not preceded by C:', 'ie', 'ei'],
['E before I when preceded by C:', 'cei', 'cie']]
['E before I when preceded by C:', 'cei', 'cie']]
Line 1,806: Line 4,358:

puts "Overall: #{overall_plausible ? 'Plausible' : 'Implausible'}."
puts "Overall: #{overall_plausible ? 'Plausible' : 'Implausible'}."
Line 1,815: Line 4,367:
Overall: Implausible.
Overall: Implausible.

<syntaxhighlight lang="rust">use std::default::Default;
use std::ops::AddAssign;

use itertools::Itertools;
use reqwest::get;

#[derive(Default, Debug)]
struct Feature<T> {
    pub cie: T,
    pub xie: T,
    pub cei: T,
    pub xei: T,
}

impl AddAssign<Feature<bool>> for Feature<u64> {
    fn add_assign(&mut self, rhs: Feature<bool>) {
        self.cei += rhs.cei as u64;
        self.xei += rhs.xei as u64;
        self.cie += rhs.cie as u64;
        self.xie += rhs.xie as u64;
    }
}

fn check_feature(word: &str) -> Feature<bool> {
    let mut feature: Feature<bool> = Default::default();

    // Slide a three-character window over the word.
    for window in word.chars().tuple_windows::<(char, char, char)>() {
        match window {
            ('c', 'e', 'i') => { feature.cei = true }
            ('c', 'i', 'e') => { feature.cie = true }
            (not_c, 'e', 'i') if not_c != 'c' => (feature.xei = true),
            (not_c, 'i', 'e') if not_c != 'c' => (feature.xie = true),
            _ => {}
        }
    }

    feature
}

fn maybe_is_feature_plausible(feature_count: u64, opposing_count: u64) -> Option<bool> {
    if feature_count > 2 * opposing_count {
        Some(true)
    } else if opposing_count > 2 * feature_count {
        Some(false)
    } else {
        None
    }
}

fn print_feature_plausibility(feature_plausibility: Option<bool>, feature_name: &str) {
    let plausible_msg =
        match feature_plausibility {
            None => "is implausible",
            Some(true) => "is plausible",
            Some(false) => "is definitely implausible",
        };

    println!("{} {}", feature_name, plausible_msg)
}

fn main() {
    let mut res = get("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt").unwrap();
    let texts = res.text().unwrap();

    let mut feature_count: Feature<u64> = Default::default();
    for word in texts.lines() {
        let feature = check_feature(word);
        feature_count += feature;
    }

    println!("Counting {:#?}", feature_count);

    let xie_plausibility =
        maybe_is_feature_plausible(feature_count.xie, feature_count.cie);
    let cei_plausibility =
        maybe_is_feature_plausible(feature_count.cei, feature_count.xei);

    print_feature_plausibility(xie_plausibility, "I before E when not preceded by C");
    print_feature_plausibility(cei_plausibility, "E before I when preceded by C");
    println!("The rule in general is {}",
             if xie_plausibility.unwrap_or(false) && cei_plausibility.unwrap_or(false)
             { "Plausible" } else { "Implausible" }
    );
}</syntaxhighlight>
{{out}}
<pre>Counting Feature {
    cie: 24,
    xie: 464,
    cei: 13,
    xei: 194,
}
I before E when not preceded by C is plausible
E before I when preceded by C is definitely implausible
The rule in general is Implausible</pre>

<lang Scala>object I_before_E_except_after_C extends App {
<syntaxhighlight lang="scala">object I_before_E_except_after_C extends App {
val testIE1 = "(^|[^c])ie".r // i before e when not preceded by c
val testIE1 = "(^|[^c])ie".r // i before e when not preceded by c
val testIE2 = "cie".r // i before e when preceded by c
val testIE2 = "cie".r // i before e when preceded by c
Line 1,825: Line 4,471:
var countsCEI = (0,0)
var countsCEI = (0,0)

scala.io.Source.fromURL("http://www.puzzlers.org/pub/wordlists/unixdict.txt").getLines.map(_.toLowerCase).foreach{word =>
scala.io.Source.fromURL("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt").getLines.map(_.toLowerCase).foreach{word =>
if (testIE1.findFirstIn(word).isDefined) countsIE = (countsIE._1 + 1, countsIE._2)
if (testIE1.findFirstIn(word).isDefined) countsIE = (countsIE._1 + 1, countsIE._2)
if (testIE2.findFirstIn(word).isDefined) countsIE = (countsIE._1, countsIE._2 + 1)
if (testIE2.findFirstIn(word).isDefined) countsIE = (countsIE._1, countsIE._2 + 1)
Line 1,838: Line 4,484:
println("E before I when preceded by C: "+plausibility(countsCEI))
println("E before I when preceded by C: "+plausibility(countsCEI))
println("Overall: "+plausibility(plausible(countsIE) && plausible(countsCEI)))
println("Overall: "+plausibility(plausible(countsIE) && plausible(countsCEI)))
<pre>I before E when not preceded by C: plausible
<pre>I before E when not preceded by C: plausible
Line 1,845: Line 4,491:

<lang seed7>$ include "seed7_05.s7i";
<syntaxhighlight lang="seed7">$ include "seed7_05.s7i";
include "gethttp.s7i";
include "gethttp.s7i";
include "float.s7i";
include "float.s7i";
Line 1,890: Line 4,536:
var integer: not_c_ei is 0;
var integer: not_c_ei is 0;
words := split(lower(getHttp("www.puzzlers.org/pub/wordlists/unixdict.txt")), "\n");
words := split(lower(getHttp("wiki.puzzlers.org/pub/wordlists/unixdict.txt")), "\n");
cie := count("cie", words);
cie := count("cie", words);
cei := count("cei", words);
cei := count("cei", words);
Line 1,903: Line 4,549:
writeln("(To be plausible, one word count must exceed another by " <& PLAUSIBILITY_RATIO <& " times)");
writeln("(To be plausible, one word count must exceed another by " <& PLAUSIBILITY_RATIO <& " times)");
end if;
end if;
end func;</lang>
end func;</syntaxhighlight>

Line 1,915: Line 4,561:
(To be plausible, one word count must exceed another by 2 times)
(To be plausible, one word count must exceed another by 2 times)

<syntaxhighlight lang="setl">program i_before_e_except_after_c;
init cie := 0, xie := 0, cei := 0, xei := 0;

dict := open("unixdict.txt", "r");
loop doing word := getline(dict); while word /= om do
    classify(word);
end loop;

p :=
plausible("I before E when not preceded by C", xie, cie) and
plausible("E before I when preceded by C", cei, xei);
print("I before E, except after C:" + (if p then "" else " not" end)
+ " plausible.");

proc classify(word);
if "ie" in word then
if "cie" in word then cie +:= 1;
else xie +:= 1;
end if;
elseif "ei" in word then
if "cei" in word then cei +:= 1;
else xei +:= 1;
end if;
end if;
end proc;

proc plausible(clause, feature, opposite);
p := 2 * feature > opposite;
print(clause + ":" + (if p then "" else " not" end) + " plausible.");
return p;
end proc;
end program;</syntaxhighlight>
<pre>I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.

I before E, except after C: not plausible.</pre>

Using [https://github.com/johnno1962/SwiftRegex/blob/master/SwiftRegex.swift SwiftRegex] for easy regex in strings.
Using [https://github.com/johnno1962/SwiftRegex/blob/master/SwiftRegex.swift SwiftRegex] for easy regex in strings.
<lang Swift>import Foundation
<syntaxhighlight lang="swift">import Foundation

let request = NSURLRequest(URL: NSURL(string: "http://www.puzzlers.org/pub/wordlists/unixdict.txt")!)
let request = NSURLRequest(URL: NSURL(string: "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")!)

NSURLConnection.sendAsynchronousRequest(request, queue: NSOperationQueue()) {res, data, err in
NSURLConnection.sendAsynchronousRequest(request, queue: NSOperationQueue()) {res, data, err in
Line 1,971: Line 4,659:

Line 1,977: Line 4,665:
E before I when preceded by C is not plausable
E before I when preceded by C is not plausable
I before E except after C is not plausible</pre>
I before E except after C is not plausible</pre>

=={{header|True BASIC}}==
<syntaxhighlight lang="qbasic">DEF EOF(f)
    IF END #f THEN LET EOF = -1 ELSE LET EOF = 0
END DEF

OPEN #1: NAME "UNIXDICT.TXT", org text, ACCESS INPUT, create old
DO
    LINE INPUT #1: w$
    IF POS(w$,"ie")<>0 THEN
        IF POS(w$,"cie")<>0 THEN LET ci = ci+1 ELSE LET xi = xi+1
    END IF
    IF POS(w$,"ei")<>0 THEN
        IF POS(w$,"cei")<>0 THEN LET ce = ce+1 ELSE LET xe = xe+1
    END IF
LOOP UNTIL EOF(1)
CLOSE #1

PRINT "CIE:"; ci
PRINT "xIE:"; xi
PRINT "CEI:"; ce
PRINT "xEI:"; xe
PRINT "I before E when not preceded by C: ";
IF 2*xi <= ci THEN PRINT "not ";
PRINT "plausible."
PRINT "E before I when preceded by C: ";
IF 2*ce <= xe THEN PRINT "not ";
PRINT "plausible."
END</syntaxhighlight>

{{trans|Python}}<!-- very approximately, mainly for the messages -->
<syntaxhighlight lang="tcl">package require http

set t [http::geturl http://wiki.puzzlers.org/pub/wordlists/unixdict.txt]
set words [split [http::data $t] "\n"]
http::cleanup $t

puts "\n(To be plausible, one word count must exceed another by\
$PLAUSIBILITY_RATIO times)"</syntaxhighlight>
<!-- note that checking the pronunciation of the words indicates a key guard on the real rule that isn't normally stated -->

(To be plausible, one word count must exceed another by 2.0 times)

<syntaxhighlight lang="tuscript">

LOOP word=words
IF (word.nc." ie "," ei ") CYCLE

IF (word.ct." ie "&& word.ct." ei ") THEN
IF (word.ct." Cie ") THEN
ELSEIF (word.ct." Cei ") THEN

IF (word.ct." ie ") THEN
IF (word.ct." Cie ") THEN
ELSEIF (word.ct." ei ") THEN
IF (word.ct." Cei ") THEN


PRINT "ieee ", ieei
PRINT "cie ", cie
PRINT "xie ", xie
PRINT "cei ", cei
PRINT "xei ", xei


IF (xie>doublexei) THEN
check1="not plausible"

IF (cei>xei) THEN
check2="not plausible"
IF (check1==check2) THEN
checkall="not plausible"

TRACE *check1,check2,checkall
</syntaxhighlight>
{{out}}
<pre>
ieee 4
cie 24
xie 465
cei 13
xei 213
TRACE * 62 -*SKRIPTE 203
check1 = plausible
check2 = not plausible
checkall = not plausible
</pre>

<syntaxhighlight lang="text">If Set(a, Open ("unixdict.txt", "r")) < 0 Then Print "Cannot open \qunixdict.txt\q" : End

x = Set (y, Set (p, Set (q, 0)))

Do While Read (a)
  w = Tok(0)
  If FUNC(_Search(w, "cei")) > -1 Then x = x + 1
  If FUNC(_Search(w, "cie")) > -1 Then y = y + 1
  If FUNC(_Search(w, "ie")) > -1 Then p = p + 1
  If FUNC(_Search(w, "ei")) > -1 Then q = q + 1
Loop

Print "The plausibility of 'I before E when not preceded by C' is ";
Print Show (Iif (p>(q+q), "True", "False"))

Print "The plausibility of 'E before I when preceded by C' is ";
Print Show (Iif (x>(y+y), "True", "False"))

Print "The plausibility of the phrase 'I before E except after C' is ";
Print Show (Iif ((x>(y+y))*(p>(q+q)), "True", "False"))

Close a
End

_Search
  Param (2)
  Local (1)
  For c@ = 0 to Len (a@) - Len (b@)
    If Comp(Clip(Chop(a@,c@),Len(a@)-c@-Len(b@)),b@)=0 Then Unloop : Return (c@)
  Next
Return (-1)</syntaxhighlight>
<pre>The plausibility of 'I before E when not preceded by C' is True
The plausibility of 'E before I when preceded by C' is False
The plausibility of the phrase 'I before E except after C' is False

0 OK, 0:800 </pre>

=={{header|UNIX Shell}}==
<syntaxhighlight lang="bash">#!/bin/sh

matched() {

echo "Overall, the rule is not plausible"

The sample text was downloaded and saved in the same folder as the script.
<syntaxhighlight lang="vb">
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set srcFile = objFSO.OpenTextFile(objFSO.GetParentFolderName(WScript.ScriptFullName) &_

Set objFSO = Nothing

Overall it is NOT plausible.

=={{header|Visual Basic .NET}}==
'''Compiler:''' Roslyn Visual Basic (language version >= 15.3)
{{works with|.NET Core|2.1}}

Implemented using both a single-pass loop and regex. The implementation used is selected by the compiler constant <code>USE_REGEX</code>.

The regex implementation does not strictly conform to the task specification, because it counts occurrences of "ie" and "ei" rather than the number of words that contain them.
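
The difference between counting occurrences and counting words can be illustrated with a small language-neutral sketch in Python. The helper names <code>count_occurrences</code> and <code>count_words</code> are hypothetical and are not part of the Visual Basic .NET entry; the sketch assumes one word per line.

```python
import re


def count_occurrences(text, pattern):
    """Count every non-overlapping match of pattern in the whole text."""
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def count_words(text, pattern):
    """Count the words (whitespace-separated) that contain at least one match."""
    rx = re.compile(pattern, re.IGNORECASE)
    return sum(1 for word in text.split() if rx.search(word))


if __name__ == "__main__":
    sample = "hierarchies\nfriend\n"
    # "hierarchies" contains two "ie" pairs not preceded by "c".
    print(count_occurrences(sample, r"[^c]ie"))
    print(count_words(sample, r"[^c]ie"))
```

A word such as "hierarchies" contributes two occurrences but only one word, which is why the two approaches can report slightly different totals.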

<syntaxhighlight lang="vbnet">Option Compare Binary
Option Explicit On
Option Infer On
Option Strict On

Imports System.Text.RegularExpressions

#Const USE_REGEX = False

Module Program
' Supports both local and remote files
Const WORDLIST_URI = "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt"

' The support factor of a word for EI or IE is the number of occurrences that support the rule minus the number that oppose it.
' I.e., for IE:
' - increased when not preceded by C
' - decreased when preceded by C
' and for EI:
' - increased when preceded by C
' - decreased when not preceded by C
Private Function GetSupportFactor(word As String) As (IE As Integer, EI As Integer)
Dim IE, EI As Integer

' Enumerate the letter pairs in the word.
For i = 0 To word.Length - 2
Dim pair = word.Substring(i, 2)

' Instances at the beginning of a word count towards the factor and are treated as not preceded by C.
Dim prevIsC As Boolean = i > 0 AndAlso String.Equals(word(i - 1), "c"c, StringComparison.OrdinalIgnoreCase)

If pair.Equals("ie", StringComparison.OrdinalIgnoreCase) Then
IE += If(Not prevIsC, 1, -1)
ElseIf pair.Equals("ei", StringComparison.OrdinalIgnoreCase) Then
EI += If(prevIsC, 1, -1)
End If
Next

If Math.Abs(IE) > 1 Or Math.Abs(EI) > 1 Then Debug.WriteLine($"{word}: IE={IE}, EI={EI}")
Return (IE, EI)
End Function

' Returns the number of words that support or oppose the rule.
Private Function GetPlausabilities(words As IEnumerable(Of String)) As (ieSuppCount As Integer, ieOppCount As Integer, eiSuppCount As Integer, eiOppCount As Integer)
Dim ieSuppCount, ieOppCount, eiSuppCount, eiOppCount As Integer

For Each word In words
Dim status = GetSupportFactor(word)
If status.IE > 0 Then
ieSuppCount += 1
ElseIf status.IE < 0 Then
ieOppCount += 1
End If
If status.EI > 0 Then
eiSuppCount += 1
ElseIf status.EI < 0 Then
eiOppCount += 1
End If
Next

Return (ieSuppCount, ieOppCount, eiSuppCount, eiOppCount)
End Function

' Takes entire file instead of individual words.
' Returns the number of instances of IE or EI that support or oppose the rule.
Private Function GetPlausabilitiesRegex(words As String) As (ieSuppCount As Integer, ieOppCount As Integer, eiSuppCount As Integer, eiOppCount As Integer)
' Gets number of occurrences of the pattern, case-insensitive.
Dim count = Function(pattern As String) Regex.Matches(words, pattern, RegexOptions.IgnoreCase).Count

Dim ie = count("[^c]ie")
Dim ei = count("[^c]ei")
Dim cie = count("cie")
Dim cei = count("cei")

Return (ie, cie, cei, ei)
End Function

Sub Main()
Dim file As String
Dim wc As New Net.WebClient()
Try
Console.WriteLine("Fetching file...")
file = wc.DownloadString(WORDLIST_URI)
Catch ex As Net.WebException
Exit Sub
End Try

#If USE_REGEX Then
Dim res = GetPlausabilitiesRegex(file)
#Else
Dim words = file.Split({vbCr, vbLf}, StringSplitOptions.RemoveEmptyEntries)
Dim res = GetPlausabilities(words)
#End If

Dim PrintResult =
Function(suppCount As Integer, oppCount As Integer, printEI As Boolean) As Boolean
Dim ratio = suppCount / oppCount,
plausible = ratio > 2
#If Not USE_REGEX Then
Console.WriteLine($" Words with no instances of {If(printEI, "EI", "IE")} or equal numbers of supporting/opposing occurrences: {words.Length - suppCount - oppCount}")
#End If
Console.WriteLine($" Number supporting: {suppCount}")
Console.WriteLine($" Number opposing: {oppCount}")
Console.WriteLine($" {suppCount}/{oppCount}={ratio:N3}")
Console.WriteLine($" Rule therefore IS {If(plausible, "", "NOT ")}plausible.")
Return plausible
End Function

#If USE_REGEX Then
Console.WriteLine($"Total occurrences of IE: {res.ieOppCount + res.ieSuppCount}")
Console.WriteLine($"Total occurrences of EI: {res.eiOppCount + res.eiSuppCount}")
#Else
Console.WriteLine($"Total words: {words.Length}")
#End If

Console.WriteLine("""IE is not preceded by C""")
Dim iePlausible = PrintResult(res.ieSuppCount, res.ieOppCount, False)

Console.WriteLine("""EI is preceded by C""")
Dim eiPlausible = PrintResult(res.eiSuppCount, res.eiOppCount, True)

Console.WriteLine($"Rule thus overall IS {If(iePlausible AndAlso eiPlausible, "", "NOT ")}plausible.")
End Sub
End Module</syntaxhighlight>

{{out|case=Loop implementation}}
<pre>Fetching file...

Total words: 25104

"IE is not preceded by C"
Words with no instances of IE or equal numbers of supporting/opposing occurrences: 24615
Number supporting: 465
Number opposing: 24
Rule therefore IS plausible.

"EI is preceded by C"
Words with no instances of EI or equal numbers of supporting/opposing occurrences: 24878
Number supporting: 13
Number opposing: 213
Rule therefore IS NOT plausible.

Rule thus overall IS NOT plausible.</pre>

{{out|case=Regex implementation}}
<pre>Fetching file...

Total occurrences of IE: 490
Total occurrences of EI: 230

"IE is not preceded by C"
Number supporting: 466
Number opposing: 24
Rule therefore IS plausible.

"EI is preceded by C"
Number supporting: 13
Number opposing: 217
Rule therefore IS NOT plausible.

Rule thus overall IS NOT plausible.</pre>

=={{header|V (Vlang)}}==
<syntaxhighlight lang="v (vlang)">import os
import strconv

fn main() {
mut cei, mut cie, mut ie, mut ei := f32(0), f32(0), f32(0), f32(0)
unixdict := os.read_file('./unixdict.txt') or {println('Error: file not found') exit(1)}
words := unixdict.split_into_lines()
println("The number of words in unixdict: ${words.len}")
for word in words {
cei += word.count('cei')
cie += word.count('cie')
ei += word.count('ei')
ie += word.count('ie')
}
print("Rule: 'e' before 'i' when preceded by 'c' at the ratio of ")
print("${strconv.f64_to_str_lnd1((cei / cie), 2)} is ")
if cei > cie {println("plausible.")} else {println("implausible.")}
println("$cei cases for and $cie cases against.")

print("Rule: 'i' before 'e' except after 'c' at the ratio of ")
print("${strconv.f64_to_str_lnd1(((ie - cie) / (ei - cei)), 2)} is ")
if ie > ei {println("plausible.")} else {println("implausible.")}
println("${(ie - cie)} cases for and ${(ei - cei)} cases against.")

print("Overall the rules are ")
if cei > cie && ie > ei {println("plausible.")} else {println("implausible.")}
}</syntaxhighlight>
{{out}}
<pre>
The number of words in unixdict: 25104
Rule: 'e' before 'i' when preceded by 'c' at the ratio of 0.54 is implausible.
13 cases for and 24 cases against.
Rule: 'i' before 'e' except after 'c' at the ratio of 2.15 is plausible.
466 cases for and 217 cases against.
Overall the rules are implausible.
</pre>

It's a moot point whether one should include words beginning with "ei" or "ie" in this analysis as I've certainly never applied the rule to them and there are clearly a lot more of the former than the latter (22 to 1 for unixdict.txt). Despite this reservation I've included them anyway.

Also there are seven words which fall into two categories and which have therefore been double-counted.
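
As a rough, language-neutral illustration of how a word can land in more than one category, the following Python sketch classifies each word against the six patterns used here and lists the words that match more than one. The names <code>CATEGORIES</code>, <code>categories</code> and <code>multi_category</code> are hypothetical, and the sample words are assumptions chosen for the example rather than taken from unixdict.txt.

```python
import re

# One pattern per category; a word may match several of them.
CATEGORIES = {
    "[^c]ie": re.compile(r"[^c]ie"),
    "[^c]ei": re.compile(r"[^c]ei"),
    "cie": re.compile(r"cie"),
    "cei": re.compile(r"cei"),
    "^ie": re.compile(r"^ie"),
    "^ei": re.compile(r"^ei"),
}


def categories(word):
    """Return the set of category names whose pattern occurs in word."""
    return {name for name, rx in CATEGORIES.items() if rx.search(word)}


def multi_category(words):
    """Return the words that fall into more than one category."""
    return [w for w in words if len(categories(w)) > 1]


if __name__ == "__main__":
    for w in multi_category(["friend", "societies", "einstein", "receive"]):
        print(w, sorted(categories(w)))
```

Any word reported by <code>multi_category</code> would be counted once per matching category, which is the double counting described above.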
<syntaxhighlight lang="wren">import "io" for File
import "./pattern" for Pattern
import "./fmt" for Fmt

var yesNo = Fn.new { |b| (b) ? "yes" : "no" }

var plausRatio = 2

var count1 = 0 // [^c]ie
var count2 = 0 // [^c]ei
var count3 = 0 // cie
var count4 = 0 // cei
var count5 = 0 // ^ie
var count6 = 0 // ^ei

var p1 = Pattern.new("^cie")
var p2 = Pattern.new("^cei")

var words = File.read("unixdict.txt").split("\n").map { |w| w.trim() }.where { |w| w != "" }
System.print("The following words fall into more than one category")
System.print("and so are counted more than once:")
for (word in words) {
var tc1 = count1 + count2 + count3 + count4 + count5 + count6
if (p1.isMatch(word)) count1 = count1 + 1
if (p2.isMatch(word)) count2 = count2 + 1
if (word.contains("cie")) count3 = count3 + 1
if (word.contains("cei")) count4 = count4 + 1
if (word.startsWith("ie")) count5 = count5 + 1
if (word.startsWith("ei")) count6 = count6 + 1
var tc2 = count1 + count2 + count3 + count4 + count5 + count6
if ((tc2 -tc1) > 1) System.print(" " + word)
}

System.print("\nChecking plausability of \"i before e except after c\":")
var nFor = count1 + count5
var nAgst = count2 + count6
var ratio = nFor / nAgst
var plaus = (ratio > plausRatio)
Fmt.print(" Cases for : $d", nFor)
Fmt.print(" Cases against : $d", nAgst)
Fmt.print(" Ratio for/agst : $4.2f", ratio)
Fmt.print(" Plausible : $s", yesNo.call(plaus))

System.print("\nChecking plausability of \"e before i when preceded by c\":")
var ratio2 = count4 / count3
var plaus2 = (ratio2 > plausRatio)
Fmt.print(" Cases for : $d", count4)
Fmt.print(" Cases against : $d", count3)
Fmt.print(" Ratio for/agst : $4.2f", ratio2)
Fmt.print(" Plausible : $s", yesNo.call(plaus2))

Fmt.print("\nPlausible overall: $s", yesNo.call(plaus && plaus2))</syntaxhighlight>

{{out}}
<pre>
The following words fall into more than one category
and so are counted more than once:

Checking plausability of "i before e except after c":
Cases for : 465
Cases against : 216
Ratio for/agst : 2.15
Plausible : yes

Checking plausability of "e before i when preceded by c":
Cases for : 13
Cases against : 24
Ratio for/agst : 0.54
Plausible : no

Plausible overall: no
</pre>

And the code and results for the 'stretch goal' which has just the one double-counted word:

<syntaxhighlight lang="wren">import "io" for File
import "./pattern" for Pattern
import "./fmt" for Fmt

var yesNo = Fn.new { |b| (b) ? "yes" : "no" }

var plausRatio = 2

var count1 = 0 // [^c]ie
var count2 = 0 // [^c]ei
var count3 = 0 // cie
var count4 = 0 // cei
var count5 = 0 // ^ie
var count6 = 0 // ^ei

var p0 = Pattern.new("+1/s")
var p1 = Pattern.new("^cie")
var p2 = Pattern.new("^cei")

var entries = File.read("corpus.txt").split("\n").map { |w| w.trim() }.where { |w| w != "" }
System.print("The following words fall into more than one category")
System.print("and so are counted more than their frequency:")
for (entry in entries.skip(1)) {
var items = p0.splitAll(entry)
if (items.count == 3) {
var word = items[0] // leave any trailing * in place
var freq = Num.fromString(items[2])
var tc1 = count1 + count2 + count3 + count4 + count5 + count6
if (p1.isMatch(word)) count1 = count1 + freq
if (p2.isMatch(word)) count2 = count2 + freq
if (word.contains("cie")) count3 = count3 + freq
if (word.contains("cei")) count4 = count4 + freq
if (word.startsWith("ie")) count5 = count5 + freq
if (word.startsWith("ei")) count6 = count6 + freq
var tc2 = count1 + count2 + count3 + count4 + count5 + count6
if ((tc2 -tc1) > freq) System.print(" " + word)
}
}

System.print("\nChecking plausability of \"i before e except after c\":")
var nFor = count1 + count5
var nAgst = count2 + count6
var ratio = nFor / nAgst
var plaus = (ratio > plausRatio)
Fmt.print(" Cases for : $d", nFor)
Fmt.print(" Cases against : $d", nAgst)
Fmt.print(" Ratio for/agst : $4.2f", ratio)
Fmt.print(" Plausible : $s", yesNo.call(plaus))

System.print("\nChecking plausability of \"e before i when preceded by c\":")
var ratio2 = count4 / count3
var plaus2 = (ratio2 > plausRatio)
Fmt.print(" Cases for : $d", count4)
Fmt.print(" Cases against : $d", count3)
Fmt.print(" Ratio for/agst : $4.2f", ratio2)
Fmt.print(" Plausible : $s", yesNo.call(plaus2))

Fmt.print("\nPlausible overall: $s", yesNo.call(plaus && plaus2))</syntaxhighlight>

{{out}}
<pre>
The following words fall into more than one category
and so are counted more than their frequency:

Checking plausability of "i before e except after c":
Cases for : 8192
Cases against : 4826
Ratio for/agst : 1.70
Plausible : no

Checking plausability of "e before i when preceded by c":
Cases for : 327
Cases against : 994
Ratio for/agst : 0.33
Plausible : no

Plausible overall: no
</pre>

<syntaxhighlight lang="freebasic">open "unixdict.txt" for reading as #1

repeat
    line input #1 pal$
    if instr(pal$, "ie") then
        if instr(pal$, "cie") then CI = CI + 1 else XI = XI + 1 : fi
    fi
    if instr(pal$, "ei") then
        if instr(pal$, "cei") then CE = CE + 1 else XE = XE + 1 : fi
    fi
until eof(1)
close #1

print "CIE: ", CI
print "xIE: ", XI
print "CEI: ", CE
print "xEI: ", XE
print "\nI before E when not preceded by C: ";
if 2 * XI <= CI then print "not "; : fi
print "plausible."
print "E before I when preceded by C: ";
if 2 * CE <= XE then print "not "; : fi
print "plausible."
end</syntaxhighlight>

<syntaxhighlight lang="zkl">fcn wcnt(wordList,altrs,aAdjust,bltrs,bAdjust,text){
a:=wordList.reduce('wrap(cnt,word){ cnt+word.holds(altrs) },0) - aAdjust;
b:=wordList.reduce('wrap(cnt,word){ cnt+word.holds(bltrs) },0) - bAdjust;

<syntaxhighlight lang="zkl">a,b,r1:=wcnt(wordList,"cei",0,"cie",0,"E before I when preceded by C");
_,_,r2:=wcnt(wordList,"ie",b,"ei",a, "I before E when not preceded by C");
"Overall the rule is %splausible".fmt((r1<2 or r2<2) and "im" or "").println();</syntaxhighlight>

<syntaxhighlight lang="zkl">fcn wc2(wordList,altrs,aAdjust,bltrs,bAdjust,text){
// don't care if line is "Word PoS Freq" or "as yet Adv 14"

Latest revision as of 23:15, 30 June 2024

I before E except after C
You are encouraged to solve this task according to the task description, using any language you may know.

The phrase "I before E, except after C" is a widely known mnemonic which is supposed to help when spelling English words.


Using the word list from http://wiki.puzzlers.org/pub/wordlists/unixdict.txt,
check if the two sub-clauses of the phrase are plausible individually:

  1.   "I before E when not preceded by C"
  2.   "E before I when preceded by C"

If both sub-phrases are plausible then the original phrase can be said to be plausible.

Something is plausible if the number of words having the feature is more than two times the number of words having the opposite feature (where feature is 'ie' or 'ei' preceded or not by 'c' as appropriate).
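
The plausibility rule above can be sketched in Python as follows. This is a minimal illustration rather than a reference solution: the function names <code>count_words</code>, <code>plausible</code> and <code>check_rule</code> are hypothetical, words are counted once per category regardless of how many times the letter pair occurs, and a pair at the start of a word is treated as not preceded by "c".

```python
import re

# Patterns for the four features; "x" stands for "not preceded by c".
PATTERNS = {
    "xie": re.compile(r"(^|[^c])ie"),
    "xei": re.compile(r"(^|[^c])ei"),
    "cie": re.compile(r"cie"),
    "cei": re.compile(r"cei"),
}


def count_words(words):
    """Count how many words show each feature (one count per word)."""
    counts = dict.fromkeys(PATTERNS, 0)
    for word in words:
        lowered = word.lower()
        for name, rx in PATTERNS.items():
            if rx.search(lowered):
                counts[name] += 1
    return counts


def plausible(feature, opposite, ratio=2):
    """Plausible when feature words outnumber opposite words by more than ratio."""
    return feature > ratio * opposite


def check_rule(words):
    """Return (clause 1, clause 2, overall) plausibility for a word list."""
    c = count_words(words)
    clause1 = plausible(c["xie"], c["xei"])  # I before E when not preceded by C
    clause2 = plausible(c["cei"], c["cie"])  # E before I when preceded by C
    return clause1, clause2, clause1 and clause2


if __name__ == "__main__":
    print(check_rule(["believe", "fierce", "friend", "weird", "receive", "science"]))
```

To apply the sketch to the task's word list, the lines of unixdict.txt would be passed to <code>check_rule</code> as the word sequence.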

Stretch goal

As a stretch goal use the entries from the table of Word Frequencies in Written and Spoken English: based on the British National Corpus, (selecting those rows with three space or tab separated words only), to see if the phrase is plausible when word frequencies are taken into account.
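
One way to approach the stretch goal is sketched below in Python. It assumes each usable row has exactly three whitespace-separated fields (word, part of speech, frequency) and skips every other row, including the header; the function name <code>weighted_counts</code> and the sample rows are hypothetical and not taken from the corpus file.

```python
import re

# "x" stands for "not preceded by c"; a pair at the start of a word counts as "x".
FEATURES = {
    "xie": re.compile(r"(^|[^c])ie"),
    "xei": re.compile(r"(^|[^c])ei"),
    "cie": re.compile(r"cie"),
    "cei": re.compile(r"cei"),
}


def weighted_counts(lines):
    """Sum the frequency of every three-field row whose word shows a feature."""
    totals = dict.fromkeys(FEATURES, 0)
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # multi-word entries such as "as yet Adv 14" are skipped
        word, _part_of_speech, freq = fields
        if not freq.isdigit():
            continue  # skips the header row and malformed frequencies
        for name, rx in FEATURES.items():
            if rx.search(word.lower()):
                totals[name] += int(freq)
    return totals


if __name__ == "__main__":
    rows = [
        "Word PoS Freq",
        "believe Verb 100",
        "their Det 50",
        "receive Verb 30",
        "science NoC 20",
        "as yet Adv 14",
    ]
    print(weighted_counts(rows))
```

The resulting totals can then be compared with the same "more than two times" test used for the plain word counts.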

Show your output here as well as your program.




Translation of: Python

V PLAUSIBILITY_RATIO = 2

F plausibility_check(comment, x, y)
   print("\n  Checking plausibility of: #.".format(comment))
   I x > :PLAUSIBILITY_RATIO * y
      print(‘    PLAUSIBLE. As we have counts of #. vs #., a ratio of #2.1 times’.format(x, y, Float(x) / y))
   E
      I x > y
         print(‘    IMPLAUSIBLE. As although we have counts of #. vs #., a ratio of #2.1 times does not make it plausible’.format(x, y, Float(x) / y))
      E
         print(‘    IMPLAUSIBLE, probably contra-indicated. As we have counts of #. vs #., a ratio of #2.1 times’.format(x, y, Float(x) / y))
   R x > :PLAUSIBILITY_RATIO * y

F simple_stats()
   V words = File(‘unixdict.txt’).read().split("\n")
   V cie = Set(words.filter(word -> ‘cie’ C word)).len
   V cei = Set(words.filter(word -> ‘cei’ C word)).len
   V not_c_ie = Set(words.filter(word -> re:‘(^ie|[^c]ie)’.search(word))).len
   V not_c_ei = Set(words.filter(word -> re:‘(^ei|[^c]ei)’.search(word))).len
   R (cei, cie, not_c_ie, not_c_ei)

F print_result(cei, cie, not_c_ie, not_c_ei)
   I (plausibility_check(‘I before E when not preceded by C’, not_c_ie, not_c_ei) & plausibility_check(‘E before I when preceded by C’, cei, cie))
      print("\nOVERALL IT IS PLAUSIBLE!")
   print(‘(To be plausible, one count must exceed another by #. times)’.format(:PLAUSIBILITY_RATIO))

print(‘Checking plausibility of "I before E except after C":’)
V (cei, cie, not_c_ie, not_c_ei) = simple_stats()
print_result(cei, cie, not_c_ie, not_c_ei)
Checking plausibility of "I before E except after C":

  Checking plausibility of: I before E when not preceded by C
    PLAUSIBLE. As we have counts of 465 vs 213, a ratio of  2.2 times

  Checking plausibility of: E before I when preceded by C
    IMPLAUSIBLE, probably contra-indicated. As we have counts of 13 vs 24, a ratio of  0.5 times

(To be plausible, one count must exceed another by 2 times)

8080 Assembly

This program is written to run under CP/M. It takes the filename on the command line. The file can be as large as you like, it does not need to fit in memory at once. (Indeed, unixdict.txt is 206k.)
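
The block-at-a-time reading described above can be sketched in a high-level language as follows. This Python version is only an illustration of the buffering idea, not a translation of the assembly: the function name <code>words_from_blocks</code> is hypothetical, the 128-byte default mirrors the CP/M record size, and handling of the CP/M end-of-file character is omitted.

```python
import io


def words_from_blocks(stream, block_size=128):
    """Yield one word per line from a binary stream read in fixed-size blocks.

    Only the current block and the unfinished word are held in memory, so the
    file never needs to fit in memory at once.
    """
    pending = b""  # bytes of a word that continues into the next block
    while True:
        block = stream.read(block_size)
        if not block:
            break
        pending += block
        # Everything before the last newline is complete; the rest is carried over.
        *complete, pending = pending.split(b"\n")
        for raw in complete:
            yield raw.rstrip(b"\r").decode("ascii")
    if pending:
        yield pending.rstrip(b"\r").decode("ascii")


if __name__ == "__main__":
    data = io.BytesIO(b"believe\r\nreceive\r\nweird\r\n")
    # A tiny block size forces words to straddle block boundaries.
    print(list(words_from_blocks(data, block_size=4)))
```

Each yielded word can then be classified exactly as it would be if the whole file had been loaded.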

	;;; I before E, except after C
fcb1:	equ	5Ch	; FCB 1 (populated by file on command line)
dma:	equ	80h	; Standard DMA location
bdos:	equ	5	; CP/M entry point
puts:	equ	9	; CP/M call to write a string to the console
fopen:	equ	0Fh	; CP/M call to open a file
fread:	equ	14h	; CP/M call to read from a file
CR:	equ	13
LF:	equ	10
EOF:	equ	26
	org	100h
	;;;	Open the file given on the command line
	lxi	d,fcb1
	mvi	c,fopen
	call	bdos
	inr	a		; FF = error
	jz	die
	;;;	We can only read one 128-byte block at a time, and the file
	;;;	will not fit in memory (max 64 k). So there are two things
	;;;	going on here: we copy from the block into a word buffer
	;;;	until we see the end of a line, at which point we process
	;;;	the word. In the meantime, if while copying we reach the end
	;;;	of the block, we read the next block.
	lxi	b,curwrd	; Word pointer
block:	push	b		; Keep word pointer while reading
	lxi	d,fcb1		; Read a block from the file
	mvi	c,fread
	call	bdos
	pop	b		; Restore word pointer
	dcr	a		; 1 = EOF
	jz	done
	inr	a		; otherwise, <>0 = error
	jnz	die
	lxi	h,dma		; Start reading at DMA
char:	mov	a,m		; Get character
	cpi	EOF		; If it's an EOF character, we're done
	jz	done
	stax	b		; Store character in current word
	inx	b
	cpi	LF		; If it's LF, then we've got a full word
	cz	word		; Process the word
	inr 	l		; Go to next character
	jz	block		; If we're done with this block, get next one
	jmp	char		
	;;;	When done, report the statistics
done:	lxi	d,scie		; CIE
	call	sout
	lhld	cie
	call	puthl
	lxi	d,sxie		; xIE
	call	sout
	lhld	xie
	call	puthl
	lxi	d,scei		; CEI
	call	sout
	lhld	cei
	call	puthl
	lxi	d,sxei		; xEI 
	call	sout
	lhld	xei
	call	puthl
	;;;	Then say what is and isn't plausible
	lxi 	d,s_ienc	; I before E when not preceded by C
	call	sout		; plausible if 2*xIE>CIE
	lhld	cie
	xchg			; DE = count of words with the opposite feature
	lhld	xie
	call	pplaus
	lxi	d,s_eic		; E before I when preceded by C
	call	sout		; plausible if 2*CEI>xEI
	lhld	xei
	xchg			; DE = count of words with the opposite feature
	lhld	cei		; Execution falls through into pplaus
	;;;	If HL = amount of words with feature, and
	;;;	DE = amount of words with opposite feature, then print
	;;;	'(not) plausible', as appropriate.
pplaus:	dad	h		; 2 * feature
	mov	a,d		; Compare high byte
	cmp	h
	jc	plaus		; If 2*H>D then plausible
	mov	a,e		; Otherwise, compare low byte
	cmp	l
	jc	plaus		; If 2*L>E then plausible
	lxi	d,snop		; Otherwise, not plausible
	jmp	sout
plaus:	lxi	d,splau
	jmp	sout
	;;;	Process a word
word:	push	h		; Save file read address 
	xra	a		; Zero out end of word
	stax	b
	dcx	b
	lxi	h,curwrd	; Scan word
start:	mov	a,m		; Get current character
	inx	h		; Move pointer ahead
	ana	a		; If zero,
	jz	w_end		; we're done
	cpi	'c'		; Did we find a 'c'?
	jz	findc
	cpi	'e'		; Otherwise, did we find 'e'?
	jz	finde
	cpi	'i'		; Otherwise, did we find 'i'?
	jz	findi
	jmp	start		; Otherwise, keep going
	;;;	We found an 'e'
finde:	mov	a,m		; Get following character
	cpi	'i'		; Is it 'i'?
	jnz	start		; If not, keep going
	inx	h		; Otherwise, move past it,
	xchg			; keep pointer in DE,
	lhld	xie		; We found ie without c
	inx	h
	shld	xie
	xchg			; Restore scan pointer to HL
	jmp	start
	;;;	We found an 'i'
findi:	mov	a,m		; Get following character
	cpi	'e'		; Is it 'e'?
	jnz	start		; If not, keep going
	inx	h		; Otherwise, move past it,
	xchg			; keep pointer in DE,
	lhld	xei		; We found ei without c
	inx	h
	shld	xei
	xchg			; Restore scan pointer to HL
	jmp	start
	;;;	We found a 'c'
findc:	mov	a,m		; Get following character
	cpi	'e'		; Is it 'e'?
	jz	findce		; Then we have 'ce'
	cpi	'i'		; Is it 'i'?
	jz	findci		; Then we have 'ci'
	jmp	start		; Otherwise, just keep going
findce:	mov	d,h		; set DE = start of 'e?'
	mov	e,l
	inx	d		; Get next character
	ldax	d
	cpi	'i'		; Is it 'i'?
	jnz	start		; If not, do nothing
	lhld	cei		; But if so, we found 'cei'
	inx	h		; Increment the counter
	shld	cei
	xchg			; Keep scanning _after_ the 'cei'
	inx	h
	jmp	start
findci:	mov	d,h		; set DE = start of 'i?'
	mov	e,l
	inx	d		; Get next character
	ldax	d
	cpi	'e'		; Is it 'e'?
	jnz	start		; If not, do nothing
	lhld	cie		; But if so, we found 'cie'
	inx	h		; Increment the counter
	shld	cie
	xchg			; Keep scanning _after_ the 'cie'
	inx	h
	jmp	start	
w_end:	lxi	b,curwrd	; Set word pointer to beginning
	pop	h		; Restore file read address
	ret
	;;;	Print error message and stop the program
die:	lxi	d,errmsg
	mvi	c,puts
	call	bdos
	rst	0
	;;;	Print string
sout:	mvi	c,puts
	jmp	bdos
	;;;	Print HL to the console as a decimal number
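puthl:	push	h
	lxi	h,num
	xthl			; HL = number, stack = digit pointer
	lxi	b,-10
dgt:	lxi	d,-1
clcdgt:	inx	d
	dad	b
	jc	clcdgt
	mov	a,l
	adi	10+'0'
	xthl			; HL = digit pointer
	dcx	h
	mov	m,a
	xthl			; Put digit pointer back on the stack
	xchg			; HL = quotient
	mov	a,h
	ora	l
	jnz	dgt
	pop	d
	mvi	c,puts
	jmp	bdos	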
errmsg:	db	'Error$'	; Good enough
s_ienc:	db	'I before E when not preceded by C:$'
s_eic:	db	'E before I when preceded by C:$'
snop:	db	' not'
splau:	db	' plausible',CR,LF,'$'
scie:	db	'CIE: $'	; Report strings
sxie:	db	'xIE: $'	
scei:	db	'CEI: $'
sxei:	db	'xEI: $'
	db	'00000'
num:	db	CR,LF,'$'	; Space for number
	;;;	Counters
xie:	dw	0		; I before E when not preceded by C
cie:	dw	0		; I before E when preceded by C 
cei:	dw	0		; E before I when preceded by C
xei:	dw	0		; E before I when not preceded by C
curwrd:	equ	$		; Current word stored here
A>iec unixdict.txt
CIE: 24
xIE: 217
CEI: 13
xEI: 464
I before E when not preceded by C: plausible
E before I when preceded by C: not plausible


Works with: ALGOL 68G version Any - tested with release 2.8.3.win32

Uses non-standard procedure to lower available in Algol 68G.

# tests the plausibility of "i before e except after c" using unixdict.txt #

# implements the plausibility test specified by the task                   #
# returns TRUE if with > 2 * without                                       #
PROC plausible = ( INT with, without )BOOL: with > 2 * without;

# shows the plausibility of with and without                               #
PROC show plausibility = ( STRING legend, INT with, without )VOID:
     print( ( legend, IF plausible( with, without ) THEN " is plausible" ELSE " is not plausible" FI, newline ) );

IF  FILE input file;
    STRING file name = "unixdict.txt";
    open( input file, file name, stand in channel ) /= 0
THEN
    # failed to open the file #
    print( (  "Unable to open """ + file name + """", newline ) )
ELSE
    # file opened OK #
    BOOL at eof := FALSE;
    # set the EOF handler for the file #
    on logical file end( input file, ( REF FILE f )BOOL:
                                     BEGIN
                                         # note that we reached EOF on the #
                                         # latest read #
                                         at eof := TRUE;
                                         # return TRUE so processing can continue #
                                         TRUE
                                     END
                       );
    INT    cei := 0;
    INT    xei := 0;
    INT    cie := 0;
    INT    xie := 0;
    WHILE STRING word;
          get( input file, ( word, newline ) );
          NOT at eof
    DO
        # examine the word for cie, xie (x /= c), cei and xei (x /= c)      #
        FOR pos FROM LWB word TO UPB word DO word[ pos ] := to lower( word[ pos ] ) OD;
        IF   word = "ie" THEN
            xie +:= 1
        ELIF word = "ei" THEN
            xei +:= 1
        ELSE
            INT length = ( UPB word - LWB word ) + 1;
            IF length > 1 THEN
                IF   word[ LWB word ] = "i" AND word[ LWB word + 1 ] = "e" THEN
                    # word starts ie                                        #
                    xie +:= 1
                ELIF word[ LWB word ] = "e" AND word[ LWB word + 1 ] = "i" THEN
                    # word starts ei                                        #
                    xei +:= 1
                FI;
                FOR pos FROM LWB word + 1 TO UPB word - 1 DO
                    IF   word[ pos ] = "i" AND word[ pos + 1 ] = "e" THEN
                        # have i before e, check the preceding character    #
                        IF word[ pos - 1 ] = "c" THEN cie ELSE xie FI +:= 1
                    ELIF word[ pos ] = "e" AND word[ pos + 1 ] = "i" THEN
                        # have e before i, check the preceding character    #
                        IF word[ pos - 1 ] = "c" THEN cei ELSE xei FI +:= 1
                    FI
                OD
            FI
        FI
    OD;
    # close the file #
    close( input file );

    # test the hypothesis                                                    #
    print( ( "cie occurances: ", whole( cie, 0 ), newline ) );
    print( ( "xie occurances: ", whole( xie, 0 ), newline ) );
    print( ( "cei occurances: ", whole( cei, 0 ), newline ) );
    print( ( "xei occurances: ", whole( xei, 0 ), newline ) );
    show plausibility( "i before e except after c", xie, cie );
    show plausibility( "e before i except after c", xei, cei );
    show plausibility( "i before e   when after c", cie, xie );
    show plausibility( "e before i   when after c", cei, xei );
    show plausibility( "i before e     in general", xie + cie, xei + cei );
    show plausibility( "e before i     in general", xei + cei, xie + cie )
FI
cie occurances: 24
xie occurances: 466
cei occurances: 13
xei occurances: 217
i before e except after c is plausible
e before i except after c is plausible
i before e   when after c is not plausible
e before i   when after c is not plausible
i before e     in general is plausible
e before i     in general is not plausible


Ignoring the fact that all exceptions to the rule in unixdict.txt occur where the rule doesn't apply anyway, such as in diphthongs, adjacent syllables, foreign or borrowed words, etc.:


on ibeeac()
    script o
        property wordList : words of (read file ((path to desktop as text) & "www.rosettacode.org:unixdict.txt") as «class utf8»)
        -- Subhandler called if thisWord contains either "ie" or "ei". Checks if there's an instance not preceded by "c".
        on testWithoutC(thisWord, letterPair)
            set AppleScript's text item delimiters to letterPair
            repeat with i from 1 to (count thisWord's text items) - 1
                if (text item i of thisWord does not end with "c") then return true
            end repeat
            return false
        end testWithoutC
    end script
    -- Counters: {i before e not after c, i before e after c, e before i not after c, e before i after c}.
    set {xie, cie, xei, cei} to {0, 0, 0, 0}
    set astid to AppleScript's text item delimiters
    set AppleScript's text item delimiters to "ie"
    repeat with thisWord in o's wordList
        set thisWord to thisWord's contents
        if (thisWord contains "ie") then
            if (thisWord contains "cie") then set cie to cie + 1
            if (o's testWithoutC(thisWord, "ie")) then set xie to xie + 1
        end if
        if (thisWord contains "ei") then
            if (thisWord contains "cei") then set cei to cei + 1
            if (o's testWithoutC(thisWord, "ei")) then set xei to xei + 1
        end if
    end repeat
    set AppleScript's text item delimiters to astid
    set |1 is plausible| to (xie / cie > 2)
    set |2 is plausible| to (cei / xei > 2)
    return {|"I before E not after C" is plausible|:|1 is plausible|} & ¬
        {|"E before I after C" is plausible|:|2 is plausible|} & ¬
        {|Both are plausible|:(|1 is plausible| and |2 is plausible|)}
end ibeeac

{|"I before E not after C" is plausible|:true, |"E before I after C" is plausible|:false, |Both are plausible|:false}


use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

on ibeeac()
    set wordList to words of ¬
        (read (((path to desktop as text) & "www.rosettacode.org:unixdict.txt") as «class furl») as «class utf8»)
    set wordArray to current application's class "NSArray"'s arrayWithArray:(wordList)
    set counters to {}
    repeat with letterPair in {"ie", "ei"}
        set filter to (current application's class "NSPredicate"'s ¬
            predicateWithFormat_("(self CONTAINS[c] %@)", letterPair))
        set relevants to (wordArray's filteredArrayUsingPredicate:(filter))
        set filter to (current application's class "NSPredicate"'s ¬
            predicateWithFormat_("NOT (self CONTAINS[c] %@)", "c" & letterPair))
        set end of counters to (relevants's filteredArrayUsingPredicate:(filter))'s |count|()
        set filter to (current application's class "NSPredicate"'s ¬
            predicateWithFormat_("(self CONTAINS[c] %@)", "c" & letterPair))
        set end of counters to (relevants's filteredArrayUsingPredicate:(filter))'s |count|()
    end repeat
    set {xie, cie, xei, cei} to counters
    set |1 is plausible| to (xie / cie > 2)
    set |2 is plausible| to (cei / xei > 2)
    return {|"I before E not after C" is plausible|:|1 is plausible|} & ¬
        {|"E before I after C" is plausible|:|2 is plausible|} & ¬
        {|Both are plausible|:(|1 is plausible| and |2 is plausible|)}
end ibeeac

{|"I before E not after C" is plausible|:true, |"E before I after C" is plausible|:false, |Both are plausible|:false}


use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

---------------------- TEST OF CLAIMS --------------------
on run
    set fpWordList to scriptFolder() & "unixdict.txt"
    if doesFileExist(fpWordList) then
        set patterns to {"[^c]ie", "[^c]ei", "cei", "cie"}
        set counts to ap(map(matchCount, patterns), ¬
            {readFile(fpWordList)})
        script test
            on |λ|(kvs)
                set {common, rare} to kvs
                set {ck, cv} to common
                set {rk, rv} to rare
                set ratio to roundTo(2, cv / rv)
                if ratio > 2 then
                    set verdict to "plausible"
                else
                    set verdict to "unsupported"
                end if
                unwords({ck, ">", rk, "->", cv, "/", rv, ¬
                    "=", ratio, "::", verdict})
            end |λ|
        end script
        unlines(map(test, chunksOf(2, zip(patterns, counts))))
    else
        display dialog "Word list not found in this script's folder:" & ¬
            linefeed & tab & fpWordList
    end if
end run

------------------------- GENERIC ------------------------

-- Tuple (,) :: a -> b -> (a, b)
on Tuple(a, b)
    -- Constructor for a pair of values, possibly of two different types.
    {a, b}
end Tuple

-- ap (<*>) :: [(a -> b)] -> [a] -> [b]
on ap(fs, xs)
    -- e.g. [(*2),(/2), sqrt] <*> [1,2,3]
    -- -->  ap([dbl, hlf, root], [1, 2, 3])
    -- -->  [2,4,6,0.5,1,1.5,1,1.4142135623730951,1.7320508075688772]
    -- Each member of a list of functions applied to
    -- each of a list of arguments, deriving a list of new values
    set lst to {}
    repeat with f in fs
        tell mReturn(contents of f)
            repeat with x in xs
                set end of lst to |λ|(contents of x)
            end repeat
        end tell
    end repeat
    return lst
end ap

-- chunksOf :: Int -> [a] -> [[a]]
on chunksOf(k, xs)
    script
        on go(ys)
            set ab to splitAt(k, ys)
            set a to item 1 of ab
            if {} ≠ a then
                {a} & go(item 2 of ab)
            else
                a
            end if
        end go
    end script
    result's go(xs)
end chunksOf

-- doesFileExist :: FilePath -> IO Bool
on doesFileExist(strPath)
    set ca to current application
    set oPath to (ca's NSString's stringWithString:strPath)'s ¬
        stringByStandardizingPath
    set {bln, int} to (ca's NSFileManager's defaultManager's ¬
        fileExistsAtPath:oPath isDirectory:(reference))
    bln and (int ≠ 1)
end doesFileExist

-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
    -- The list obtained by applying f
    -- to each element of xs.
    tell mReturn(f)
        set lng to length of xs
        set lst to {}
        repeat with i from 1 to lng
            set end of lst to |λ|(item i of xs, i, xs)
        end repeat
        return lst
    end tell
end map

-- matchCount :: String -> NSString -> Int
on matchCount(regexString)
    -- A count of the matches for a regular expression
    -- in a given NSString
    script
        on |λ|(s)
            set ca to current application
            ((ca's NSRegularExpression's ¬
                regularExpressionWithPattern:regexString ¬
                    options:(ca's NSRegularExpressionAnchorsMatchLines) ¬
                    |error|:(missing value))'s ¬
                numberOfMatchesInString:s ¬
                    options:0 ¬
                    range:{location:0, |length|:s's |length|()}) as integer
        end |λ|
    end script
end matchCount

-- min :: Ord a => a -> a -> a
on min(x, y)
    if y < x then
        y
    else
        x
    end if
end min

-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
    -- 2nd class handler function lifted into 1st class script wrapper. 
    if script is class of f then
        f
    else
        script
            property |λ| : f
        end script
    end if
end mReturn

-- readFile :: FilePath -> IO NSString
on readFile(strPath)
    set ca to current application
    set e to reference
    set {s, e} to (ca's NSString's ¬
        stringWithContentsOfFile:((ca's NSString's ¬
            stringWithString:strPath)'s ¬
            stringByStandardizingPath) ¬
            encoding:(ca's NSUTF8StringEncoding) |error|:(e))
    if missing value is e then
        s
    else
        (localizedDescription of e) as string
    end if
end readFile

-- roundTo :: Int -> Float -> Float
on roundTo(n, x)
    set d to 10 ^ n
    (round (x * d)) / d
end roundTo

-- scriptFolder :: () -> IO FilePath
on scriptFolder()
    -- The path of the folder containing this script
    tell application "Finder" to ¬
        POSIX path of ((container of (path to me)) as alias)
end scriptFolder

-- splitAt :: Int -> [a] -> ([a], [a])
on splitAt(n, xs)
    if n > 0 and n < length of xs then
        if class of xs is text then
            {items 1 thru n of xs as text, ¬
                items (n + 1) thru -1 of xs as text}
        else
            {items 1 thru n of xs, items (n + 1) thru -1 of xs}
        end if
    else
        if n < 1 then
            {{}, xs}
        else
            {xs, {}}
        end if
    end if
end splitAt

-- unlines :: [String] -> String
on unlines(xs)
    -- A single string formed by the intercalation
    -- of a list of strings with the newline character.
    set {dlm, my text item delimiters} to ¬
        {my text item delimiters, linefeed}
    set s to xs as text
    set my text item delimiters to dlm
    return s
end unlines

-- unwords :: [String] -> String
on unwords(xs)
    set {dlm, my text item delimiters} to ¬
        {my text item delimiters, space}
    set s to xs as text
    set my text item delimiters to dlm
    return s
end unwords

-- zip :: [a] -> [b] -> [(a, b)]
on zip(xs, ys)
    zipWith(Tuple, xs, ys)
end zip

-- zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
on zipWith(f, xs, ys)
    set lng to min(length of xs, length of ys)
    set lst to {}
    if 1 > lng then
        return {}
    else
        tell mReturn(f)
            repeat with i from 1 to lng
                set end of lst to |λ|(item i of xs, item i of ys)
            end repeat
            return lst
        end tell
    end if
end zipWith
[^c]ie > [^c]ei -> 466 / 217 = 2.15 :: plausible
cei > cie -> 13 / 24 = 0.54 :: unsupported


rule1: {"I before E when not preceded by C"}
rule2: {"E before I when preceded by C"}
phrase: {"I before E except after C"}

plausibility: #[
    false: "not plausible", 
    true: "plausible"
]

checkPlausible: function [rule, count1, count2][
    result: count1 > 2 * count2
    print ["The rule" rule "is" plausibility\[result] ":"]
    print ["\tthere were" count1 "examples and" count2 "counter-examples."]
    return result
]

words: read.lines relative "unixdict.txt"

[nie,cie,nei,cei]: 0

loop words 'word [
    if contains? word "ie" ->
        inc (contains? word "cie")? -> 'cie -> 'nie
    if contains? word "ei" ->
        inc (contains? word "cei")? -> 'cei -> 'nei
]

p1: checkPlausible rule1 nie nei
p2: checkPlausible rule2 cei cie

print ["\nSo the phrase" phrase "is" (to :string plausibility\[and? p1 p2]) ++ "."]
The rule "I before E when not preceded by C" is plausible : 
	there were 465 examples and 213 counter-examples. 
The rule "E before I when preceded by C" is not plausible : 
	there were 13 examples and 24 counter-examples. 

So the phrase "I before E except after C" is not plausible.


WordList := URL_ToVar("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
WordList := RegExReplace(WordList, "i)cie", "", cieN)
WordList := RegExReplace(WordList, "i)cei", "", ceiN)
RegExReplace(WordList, "i)ie", "", ieN)
RegExReplace(WordList, "i)ei", "", eiN)
cei := ceiN / cieN > 2 ? "plausible" : "implausible"
ei  := ieN  / eiN  > 2 ? "plausible" : "implausible"
ova := cei = "plausible" && ei = "plausible" ? "plausible" : "implausible"
MsgBox, % """I before E when not preceded by C"" is " ei ".`n"
        . ieN " cases for and " eiN " cases against is a ratio of " ieN / eiN ".`n`n"
        . """E before I when preceded by C"" is " cei ".`n"
        . ceiN " cases for and " cieN " cases against is a ratio of " ceiN / cieN ".`n`n"
        . "Overall the rule is " ova "."
URL_ToVar(URL) {
    WebRequest := ComObjCreate("WinHttp.WinHttpRequest.5.1")
    WebRequest.Open("GET", URL)
    WebRequest.Send()
    return, WebRequest.ResponseText
}
"I before E when not preceded by C" is plausible.
466 cases for and 217 cases against is a ratio of 2.147465.

"E before I when preceded by C" is implausible.
13 cases for and 24 cases against is a ratio of 0.541667.

Overall the rule is implausible.


#!/usr/bin/awk -f 

/.ei/ {nei+=cnt($3)}
/cei/ {cei+=cnt($3)}

/.ie/ {nie+=cnt($3)}
/cie/ {cie+=cnt($3)}

function cnt(c) {
	if (c<1) return 1; 
	return c;
}

END {
	printf("cie: %i\nnie: %i\ncei: %i\nnei: %i\n",cie,nie-cie,cei,nei-cei);
	v = v2 = "";
	if (nie < 3 * cie) {
		v =" not";
	}
	print "I before E when not preceded by C: is"v" plausible";
	if (nei > 3 * cei)  {
		v = v2 =" not";
	}
	print "E before I when preceded by C: is"v2" plausible";
	print "Overall rule is"v" plausible";
}


$ awk -f ./i_before_e_except_after_c.awk unixdict.txt 
cie: 24
nie: 464
cei: 13
nei: 194
I before E when not preceded by C: is plausible
E before I when preceded by C: is not plausible

$ awk -f i_before_e_except_after_c.awk 1_2_all_freq.txt 
cie: 994
nie: 8148
cei: 327
nei: 4826
I before E when not preceded by C: is plausible
E before I when preceded by C: is not plausible
Overall rule is not plausible

Batch File

First download the text file, then put it in the same directory as this sample code:

::I before E except after C task from Rosetta Code Wiki
::Batch File Implementation

@echo off
setlocal enabledelayedexpansion
set ie=0
set ei=0
set cie=0
set cei=0

set propos1=FALSE
set propos2=FALSE
set propos3=FALSE

	::Do the matching
for /f %%X in (unixdict.txt) do (
	set word=%%X
	if not "!word:ie=!"=="!word!" if "!word:cie=!"=="!word!" (set /a ie+=1)
	if not "!word:ei=!"=="!word!" if "!word:cei=!"=="!word!" (set /a ei+=1)
	if not "!word:cei=!"=="!word!" (set /a cei+=1)
	if not "!word:cie=!"=="!word!" (set /a cie+=1)
)

set /a "counter1=!ei!*2,counter2=!cie!*2"

if !ie! gtr !counter1! set propos1=TRUE
echo.Plausibility of "I before E when not preceded by C": !propos1! (!ie! VS !ei!)

if !cei! gtr !counter2! set propos2=TRUE
echo.Plausibility of "E before I when preceded by C": !propos2! (!cei! VS !cie!)

if !propos1!==TRUE if !propos2!==TRUE (set propos3=TRUE)
echo.Overall plausibility of "I before E EXCEPT after C": !propos3!

exit /b 0
Plausibility of "I before E when not preceded by C": TRUE (465 VS 213)
Plausibility of "E before I when preceded by C": FALSE (13 VS 24)
Overall plausibility of "I before E EXCEPT after C": FALSE
Press any key to continue . . .

Fast solution using standard external commands FINDSTR and FIND:

Each word is counted at most once per category, even if it contains two or more occurrences of the test string. The same word may still count toward several categories.

@echo off
setlocal enableDelayedExpansion
for /f %%A in ('findstr /i "^ie [^c]ie" unixdict.txt ^| find /c /v ""') do set Atrue=%%A
for /f %%A in ('findstr /i "^ei [^c]ei" unixdict.txt ^| find /c /v ""') do set Afalse=%%A
for /f %%A in ('findstr /i "[c]ei" unixdict.txt ^| find /c /v ""') do set Btrue=%%A
for /f %%A in ('findstr /i "[c]ie" unixdict.txt ^| find /c /v ""') do set Bfalse=%%A
set /a "Aresult=Atrue/Afalse/2, Bresult=Btrue/Bfalse/2, Result=^!^!Aresult*Bresult"
set "Answer1=Plausible" & set "Answer0=Implausible"
echo I before E when not preceded by C: True=%Atrue% False=%Afalse% : !Answer%Aresult%!
echo E before I when preceded by C: True=%Btrue% False=%Bfalse% : !Answer%Bresult%!
echo I before E, except after C : !Answer%Result%!
I before E when not preceded by C: True=465 False=213 : Plausible
E before I when preceded by C: True=13 False=24 : Implausible
I before E, except after C : Implausible
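The counting rule described above (at most one count per word per category, with one word allowed to fall into several categories) can be sketched in Python as follows. This is only an illustration: the category names, the helper functions and the five sample words are assumptions made for the example, and the regular expressions are a per-word approximation of the FINDSTR patterns rather than a literal port of them.

```python
import re

# Hypothetical category names; each regex is applied to one word at a time.
PATTERNS = {
    "ie_not_after_c": re.compile(r"(?:^|[^c])ie", re.IGNORECASE),
    "ei_not_after_c": re.compile(r"(?:^|[^c])ei", re.IGNORECASE),
    "cei": re.compile(r"cei", re.IGNORECASE),
    "cie": re.compile(r"cie", re.IGNORECASE),
}


def count_words(words):
    """Count words per category; a word adds at most 1 to each category."""
    counts = dict.fromkeys(PATTERNS, 0)
    for word in words:
        for name, pattern in PATTERNS.items():
            # search() only reports whether a match exists, so a word with
            # two occurrences (e.g. "siegfried") is still counted once.
            if pattern.search(word):
                counts[name] += 1
    return counts


def is_plausible(examples, counter_examples):
    """Task criterion: examples must exceed twice the counter-examples."""
    return examples > 2 * counter_examples


# Illustrative sample only, not taken from unixdict.txt as a whole.
sample = ["siegfried", "science", "receive", "weird", "friend"]
counts = count_words(sample)
print(counts)
print(is_plausible(counts["ie_not_after_c"], counts["ei_not_after_c"]))
```

Using `search` rather than counting all matches is what gives the "once per word" behaviour; switching to `len(pattern.findall(word))` would count every occurrence instead.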

Stretch solution using standard external command FINDSTR:

Each word's frequency is added at most once per category, even if the word contains two or more occurrences of the test string. The same frequency may still count toward several categories.

@echo off
setlocal enableDelayedExpansion
set /a Atrue=Afalse=Btrue=Bfalse=0
for /f "tokens=3*" %%A in ('findstr /i "[^c]ie" 1_2_all_freq.txt') do if "%%B" equ "" set /a Atrue+=%%A
for /f "tokens=3*" %%A in ('findstr /i "[^c]ei" 1_2_all_freq.txt') do if "%%B" equ "" set /a Afalse+=%%A
for /f "tokens=3*" %%A in ('findstr /i "[c]ei" 1_2_all_freq.txt') do if "%%B" equ "" set /a Btrue+=%%A
for /f "tokens=3*" %%A in ('findstr /i "[c]ie" 1_2_all_freq.txt') do if "%%B" equ "" set /a Bfalse+=%%A
set /a "Aresult=Atrue/Afalse/2, Bresult=Btrue/Bfalse/2, Result=^!^!Aresult*Bresult"
set "Answer1=Plausible" & set "Answer0=Implausible"
echo I before E when not preceded by C: True=%Atrue% False=%Afalse% : !Answer%Aresult%!
echo E before I when preceded by C: True=%Btrue% False=%Bfalse% : !Answer%Bresult%!
echo I before E, except after C : !Answer%Result%!
I before E when not preceded by C: True=8192 False=4826 : Implausible
E before I when preceded by C: True=327 False=994 : Implausible
I before E, except after C : Implausible
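A frequency-weighted variant of the same idea might look like the sketch below. It assumes rows of three whitespace-separated fields (word, part of speech, frequency) and skips every other row, which is how the stretch goal describes the data; the function name, the category names and the sample rows are invented for illustration, and the frequencies shown are not real corpus values. Unlike the FINDSTR version, the patterns are applied to the word field only.

```python
import re

# Hypothetical category names; each regex is applied to the word field only.
PATTERNS = {
    "ie_not_after_c": re.compile(r"(?:^|[^c])ie", re.IGNORECASE),
    "ei_not_after_c": re.compile(r"(?:^|[^c])ei", re.IGNORECASE),
    "cei": re.compile(r"cei", re.IGNORECASE),
    "cie": re.compile(r"cie", re.IGNORECASE),
}


def weighted_totals(rows):
    """Sum frequencies per category, adding each row at most once per category."""
    totals = dict.fromkeys(PATTERNS, 0)
    for row in rows:
        fields = row.split()
        # Keep only rows with exactly three fields whose last field is a number;
        # this drops the header row and multi-word entries.
        if len(fields) != 3 or not fields[2].isdigit():
            continue
        word, frequency = fields[0], int(fields[2])
        for name, pattern in PATTERNS.items():
            if pattern.search(word):
                totals[name] += frequency
    return totals


# Made-up rows in a word / part-of-speech / frequency layout.
rows = [
    "\tWord\tPoS\tFreq",
    "\tbelieve\tVerb\t96",
    "\ttheir\tDet\t2608",
    "\treceive\tVerb\t120",
    "\tscience\tNoC\t50",
    "\ta bit\tAdv\t10",
]
totals = weighted_totals(rows)
print(totals)
```

The plausibility test is then the same comparison as before, applied to the summed frequencies instead of the word counts.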


70 PRINT "CIE:";CI
80 PRINT "xIE:";XI
90 PRINT "CEI:";CE
100 PRINT "xEI:";XE
110 PRINT
120 PRINT "I before E when not preceded by C: ";
130 IF 2*XI <= CI THEN PRINT "not ";
140 PRINT "plausible."
150 PRINT "E before I when preceded by C: ";
160 IF 2*CE <= XE THEN PRINT "not ";
170 PRINT "plausible."
CIE: 24
xIE: 465
CEI: 13
xEI: 213

I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.


Translation of: BASIC
CI = 0 : XI = 0 : CE = 0 : XE = 0
open 1, "unixdict.txt"

do
	pal$ = readline (1)
	if instr(pal$, "ie") then
		if instr(pal$, "cie") then CI += 1 else XI += 1
	end if
	if instr(pal$, "ei") then
		if instr(pal$, "cei") then CE += 1 else XE += 1
	end if
until eof(1)
close 1

print "CIE: "; CI
print "xIE: "; XI
print "CEI: "; CE
print "xEI: "; XE
print "I before E when not preceded by C: ";
if 2 * XI <= CI then print "not ";
print "plausible."
print "E before I when preceded by C: ";
if 2 * CE <= XE then print "not ";
print "plausible."


      F%=OPENIN("unixdict.txt")
      IF F% == 0 ERROR 100, "unixdict not found!"

      CI=0 : XI=0 : CE=0 : XE=0
      WHILE NOT EOF#F%
        Line$=GET$#F%
        P%=INSTR(Line$, "ie")
        WHILE P%
          IF MID$(Line$, P% - 1, 1) == "c" CI+=1 ELSE XI+=1
          P%=INSTR(Line$, "ie", P% + 1)
        ENDWHILE
        P%=INSTR(Line$, "ei")
        WHILE P%
          IF MID$(Line$, P% - 1, 1) == "c" CE+=1 ELSE XE+=1
          P%=INSTR(Line$, "ei", P% + 1)
        ENDWHILE
      ENDWHILE
      CLOSE#F%

      PRINT "Instances of 'ie', proceeded by a 'c'     = ";CI
      PRINT "Instances of 'ie', NOT proceeded by a 'c' = ";XI
      P1%=XI * 2 > CI
      PRINT "Therefore 'I before E when not preceded by C' is" FNTest(P1%)

      PRINT "Instances of 'ei', proceeded by a 'c'     = ";CE
      PRINT "Instances of 'ei', NOT proceeded by a 'c' = ";XE
      P2%=CE * 2 > XE
      PRINT "Therefore 'E before I when preceded by C' is" FNTest(P2%)

      IF P1% AND P2% PRINT "B"; ELSE PRINT "Not b";
      PRINT "oth sub-phrases are plausible, therefore the phrase " +\
      \     "'I before E, except after C' can be said to be" FNTest(P1% AND P2%) "!"

      END

      DEF FNTest(plausible%)=MID$(" not plausible", 1 - 4 * plausible%)
Instances of 'ie', proceeded by a 'c'     = 24
Instances of 'ie', NOT proceeded by a 'c' = 466
Therefore 'I before E when not preceded by C' is plausible

Instances of 'ei', proceeded by a 'c'     = 13
Instances of 'ei', NOT proceeded by a 'c' = 217
Therefore 'E before I when preceded by C' is not plausible

Not both sub-phrases are plausible, therefore the phrase 'I before E, except after C' can be said to be not plausible!


get "libhdr"

// Read word from selected input
let readword(v) = valof 
$(  let ch = ?
    v%0 := 0
    $(  ch := rdch()
        if ch = endstreamch then resultis false
        if ch = '*N' then resultis true
        v%0 := v%0 + 1
        v%(v%0) := ch
    $) repeat
$)

// Does s1 contain s2?
let contains(s1, s2) = valof
$(  for i = 1 to s1%0 - s2%0 + 1
        if valof
        $(  for j = 1 to s2%0
                unless s1%(i+j-1) = s2%j resultis false
            resultis true
        $) resultis true
    resultis false
$)

// Test unixdict.txt
let start() be
$(  let word = vec 2+64/BYTESPERWORD
    let file = findinput("unixdict.txt")
    let ncie, ncei, nxie, nxei = 0, 0, 0, 0
    // readword reads from the selected input, so select the file first
    selectinput(file)
    while readword(word)
        test contains(word, "ie")
            test contains(word, "cie")
                do ncie := ncie + 1
                or nxie := nxie + 1
        or if contains(word, "ei")
            test contains(word, "cei")
                do ncei := ncei + 1
                or nxei := nxei + 1
    endread()
    // Show results
    writef("CIE: %N*N", ncie)
    writef("xIE: %N*N", nxie)
    writef("CEI: %N*N", ncei)
    writef("xEI: %N*N", nxei)
    writef("I before E when not preceded by C: %Splausible.*N",
        2*nxie > ncie -> "", "not ")
    writef("E before I when preceded by C: %Splausible.*N",
        2*ncei > nxei -> "", "not ")
$)
CIE: 24
xIE: 465
CEI: 13
xEI: 209
I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.


Inspired by the J solution, but implemented as a single pass through the data, we have flex build the finite state machine in C. This may in turn motivate me to provide a second J solution as a single pass FSM. Please find the program output hidden at the top of the source as part of the build and example run.

%{
  /*
    compilation and example on a GNU linux system:

    $ flex --case-insensitive --noyywrap --outfile=cia.c source.l
    $ make LOADLIBES=-lfl cia 
    $ ./cia < unixdict.txt 
    I before E when not preceded by C: plausible
    E before I when preceded by C: implausible
    Overall, the rule is: implausible 
  */
  int cie, cei, ie, ei;
%}

%%

cie ++cie, ++ie; /* longer patterns are matched preferentially, consuming input */
cei ++cei, ++ei;
ie ++ie;
ei ++ei;
.|\n ;

%%

int main() {
  cie = cei = ie = ei = 0;
  yylex();
  printf("%s: %s\n","I before E when not preceded by C", (2*ei < ie ? "plausible" : "implausible"));
  printf("%s: %s\n","E before I when preceded by C", (2*cie < cei ? "plausible" : "implausible"));
  printf("%s: %s\n","Overall, the rule is", (2*(cie+ei) < (cei+ie) ? "plausible" : "implausible"));
  return 0;
}

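For readers without flex, the single-pass idea can be approximated by hand. The Python sketch below is an illustration, not a translation of the generated scanner: the function name `scan` and the sample text are assumptions, and it imitates flex's preference for the longest match by trying the three-letter patterns before the two-letter ones at each position. As in the flex rules, a `cie` or `cei` match also increments the plain `ie` or `ei` counter, and matched input is consumed.

```python
def scan(text):
    """Single left-to-right pass; at each position the longest pattern wins."""
    text = text.lower()  # stands in for flex's --case-insensitive option
    counts = {"cie": 0, "cei": 0, "ie": 0, "ei": 0}
    i = 0
    while i < len(text):
        if text.startswith("cie", i):
            counts["cie"] += 1
            counts["ie"] += 1
            i += 3  # consume the whole match
        elif text.startswith("cei", i):
            counts["cei"] += 1
            counts["ei"] += 1
            i += 3
        elif text.startswith("ie", i):
            counts["ie"] += 1
            i += 2
        elif text.startswith("ei", i):
            counts["ei"] += 1
            i += 2
        else:
            i += 1  # any other character, including newlines, is skipped
    return counts


# Illustrative input only; the real run reads unixdict.txt on standard input.
result = scan("science\nreceive\nfriend\nweird\n")
print(result)
```

Because input is consumed after each match, overlapping candidates such as the `ie` inside `eie` are not counted twice, which mirrors how a scanner generated by flex advances through its input.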

Translation of: Java
using System;
using System.Collections.Generic;
using System.IO;

namespace IBeforeE {
    class Program {
        static bool IsOppPlausibleWord(string word) {
            if (!word.Contains("c") && word.Contains("ei")) {
                return true;
            }
            if (word.Contains("cie")) {
                return true;
            }
            return false;
        }

        static bool IsPlausibleWord(string word) {
            if (!word.Contains("c") && word.Contains("ie")) {
                return true;
            }
            if (word.Contains("cei")) {
                return true;
            }
            return false;
        }

        static bool IsPlausibleRule(string filename) {
            IEnumerable<string> wordSource = File.ReadLines(filename);
            int trueCount = 0;
            int falseCount = 0;

            foreach (string word in wordSource) {
                if (IsPlausibleWord(word)) {
                    trueCount++;
                }
                else if (IsOppPlausibleWord(word)) {
                    falseCount++;
                }
            }

            Console.WriteLine("Plausible count: {0}", trueCount);
            Console.WriteLine("Implausible count: {0}", falseCount);
            return trueCount > 2 * falseCount;
        }

        static void Main(string[] args) {
            if (IsPlausibleRule("unixdict.txt")) {
                Console.WriteLine("Rule is plausible.");
            }
            else {
                Console.WriteLine("Rule is not plausible.");
            }
        }
    }
}
Plausible count: 384
Implausible count: 204
Rule is not plausible.


  • If the file changes, the outcome will possibly be different.
  • sha1 of file 2013-12-30: 058f8872306ef36f679d44f1b556334a13a85b57 unixdict.txt
  • Build with: g++ -Wall -std=c++0x thisfile.cpp -lboost_regex
  • (Test used 4.4, so only a limited number of C++11 features were used.)
#include <cstdint>
#include <iostream>
#include <fstream>
#include <string>
#include <tuple>
#include <vector>
#include <stdexcept>
#include <boost/regex.hpp>

struct Claim {
        Claim(const std::string& name) : name_(name), pro_(0), against_(0), propats_(), againstpats_() {
        }
        void add_pro(const std::string& pat) { 
               propats_.push_back(std::make_tuple(boost::regex(pat), pat[0] == '^')); 
        }
        void add_against(const std::string& pat) { 
               againstpats_.push_back(std::make_tuple(boost::regex(pat), pat[0] == '^')); 
        }
        bool plausible() const { return pro_ > against_*2; }
        void check(const char * buf, uint32_t len) {
                for (auto i = propats_.begin(), ii = propats_.end(); i != ii; ++i) {
                        uint32_t pos = 0;
                        boost::cmatch m;
                        if (std::get<1>(*i) && pos > 0) continue;
                        while (pos < len && boost::regex_search(buf+pos, buf+len, m, std::get<0>(*i))) {
                                ++pro_;
                                if (pos > 0) std::cerr << name_ << " [pro] multiple matches in: " << buf << "\n";
                                pos += m.position() + m.length();
                        }
                }
                for (auto i = againstpats_.begin(), ii = againstpats_.end(); i != ii; ++i) {
                        uint32_t pos = 0;
                        boost::cmatch m;
                        if (std::get<1>(*i) && pos > 0) continue;
                        while (pos < len && boost::regex_search(buf+pos, buf+len, m, std::get<0>(*i))) {
                                ++against_;
                                if (pos > 0) std::cerr << name_ << " [against] multiple matches in: " << buf << "\n";
                                pos += m.position() + m.length();
                        }
                }
        }
        friend std::ostream& operator<<(std::ostream& os, const Claim& c);
        std::string name_;
        uint32_t pro_;
        uint32_t against_;
        // tuple<regex,begin only>
        std::vector<std::tuple<boost::regex,bool>> propats_;
        std::vector<std::tuple<boost::regex,bool>> againstpats_;
};

std::ostream& operator<<(std::ostream& os, const Claim& c) {
        os << c.name_ << ": matches: " << c.pro_ << " vs. counter matches: " << c.against_ << ". ";
        os << "Plausibility: " << (c.plausible() ? "yes" : "no") << ".";
        return os;
}

int main(int argc, char ** argv) {
        try {
                if (argc < 2) throw std::runtime_error("No input file.");
                std::ifstream is(argv[1]);
                if (! is) throw std::runtime_error("Input file not valid.");

                // Patterns below are assumed from the claim names and the begin-only ('^') flag.
                Claim ieclaim("[^c]ie");
                ieclaim.add_pro("[^c]ie");
                ieclaim.add_pro("^ie");
                ieclaim.add_against("[^c]ei");
                ieclaim.add_against("^ei");

                Claim ceiclaim("cei");
                ceiclaim.add_pro("cei");
                ceiclaim.add_against("cie");

                {
                        const uint32_t MAXLEN = 32;
                        char buf[MAXLEN];
                        uint32_t longest = 0;
                        while (is) {
                                is.getline(buf, sizeof(buf));
                                if (is.gcount() <= 0) break;
                                else if (is.gcount() > longest) longest = is.gcount();
                                ieclaim.check(buf, is.gcount());
                                ceiclaim.check(buf, is.gcount());
                        }
                        if (longest >= MAXLEN) throw std::runtime_error("Buffer too small.");
                }

                std::cout << ieclaim << "\n";
                std::cout << ceiclaim << "\n";
                std::cout << "Overall plausibility: " << (ieclaim.plausible() && ceiclaim.plausible() ? "yes" : "no") << "\n";

        } catch (const std::exception& ex) {
                std::cerr << "*** Error: " << ex.what() << "\n";
                return -1;
        }
        return 0;
}
[^c]ie [pro] multiple matches in: siegfried
[^c]ie [against] multiple matches in: weinstein
[^c]ie: matches: 466 vs. counter matches: 217. Plausibility: yes.
cei: matches: 13 vs. counter matches: 24. Plausibility: no.
Overall plausibility: no


The output here was generated with the files as of 21st June 2016.

(ns i-before-e.core
  (:require [clojure.string :as s]))

(def patterns {:cie #"cie" :ie #"(?<!c)ie" :cei #"cei" :ei #"(?<!c)ei"})

(defn update-counts
  "Given a map of counts of matching patterns and a word, increment any count if the word matches its pattern."
  [counts [word freq]]
  (apply hash-map (mapcat (fn [[k v]] [k (if (re-seq (patterns k) word) (+ freq v) v)]) counts)))

(defn count-ie-ei-combinations
  "Update counts of all ie and ei combinations"
  [words]
  (reduce update-counts {:ie 0 :cie 0 :ei 0 :cei 0} words))

(defn apply-freq-1
  "Apply a frequency of one to words"
  [words]
  (map #(vector % 1) words))

(defn- format-plausible
  [plausible?]
  (if plausible? "plausible" "implausible"))

(defn- apply-rule [desc examples contra]
  ;; plausible only if examples are more than twice the counter-examples
  (let [plausible? (< (* 2 contra) examples)]
    (println (format "The sub rule %s is %s. There are %d examples and %d counter-examples.\n" desc (format-plausible plausible?) examples contra))
    plausible?))

(defn i-before-e-except-after-c-plausible?
  "Check if i before e after c plausible?"
  [description words]
  (do
    (println description)
    (let [counts (count-ie-ei-combinations words)
          subrule1 (apply-rule "I before E when not preceeded by C" (:ie counts) (:ei counts))
          subrule2 (apply-rule "E before I when preceeded by C" (:cei counts) (:cie counts))
          rule (and subrule1 subrule2)]
      (println (format "Overall the rule 'I before E except after C' is %s" (format-plausible rule)))
      rule)))

(defn format-freq-line [line] (letfn [(format-line [xs] [(first xs) (read-string (last xs))])]
                                       (-> line
                                           s/trim
                                           (s/split #"\s")
                                           format-line)))

(defn -main []
  (with-open [rdr (clojure.java.io/reader "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")]
   (i-before-e-except-after-c-plausible? "Check unixdist list" (apply-freq-1 (line-seq rdr))))
  (with-open [rdr (clojure.java.io/reader "http://ucrel.lancs.ac.uk/bncfreq/lists/1_2_all_freq.txt")]
   (i-before-e-except-after-c-plausible? "Word frequencies (stretch goal)" (map format-freq-line (drop 1 (line-seq rdr))))))
lein run
Check unixdist list
The sub rule I before E when not preceeded by C is plausible. There are 465 examples and 213 counter-examples.

The sub rule E before I when preceeded by C is implausible. There are 13 examples and 24 counter-examples.

Overall the rule 'I before E except after C' is implausible
Word frequencies (stretch goal)
The sub rule I before E when not preceeded by C is implausible. There are 8192 examples and 4826 counter-examples.

The sub rule E before I when preceeded by C is implausible. There are 327 examples and 994 counter-examples.

Overall the rule 'I before E except after C' is implausible


report = cluster is new, classify, results
    rep = record[cie, xie, cei, xei, words: int]
    new = proc () returns (cvt)
        return(rep${cie: 0, xie: 0, cei: 0, xei: 0, words: 0})
    end new
    classify = proc (r: cvt, word: string)
        r.words := r.words + 1
        if string$indexs("ie", word) ~= 0 then
            if string$indexs("cie", word) ~= 0
                then r.cie := r.cie + 1
                else r.xie := r.xie + 1
            end
        elseif string$indexs("ei", word) ~= 0 then
            if string$indexs("cei", word) ~= 0
                then r.cei := r.cei + 1
                else r.xei := r.xei + 1
            end
        end
    end classify
    stat = proc (s: stream, name: string, val: int)
        stream$puts(s, name)
        stream$puts(s, ": ")
        stream$putl(s, int$unparse(val))
    end stat
    plausible = proc (s: stream, feature: string, match, nomatch: int) 
                returns (bool)
        stream$puts(s, feature)
        stream$puts(s, ": ")
        plaus: bool := 2 * match > nomatch;
        if ~plaus then stream$puts(s, "not ") end
        stream$putl(s, "plausible.");
        return(plaus)
    end plausible
    results = proc (r: cvt) returns (string)
        ss: stream := stream$create_output()
        stat(ss, "Amount of words", r.words)
        stat(ss, "CIE", r.cie)
        stat(ss, "xIE", r.xie)
        stat(ss, "CEI", r.cei)
        stat(ss, "xEI", r.xei)
        stream$putl(ss, "")
        xie_p: bool := plausible(ss, "I before E when not preceded by C", r.xie, r.cie)
        cei_p: bool := plausible(ss, "E before I when preceded by C", r.cei, r.xei)
        stream$puts(ss, "I before E, except after C: ")
        if ~(xie_p & cei_p) then stream$puts(ss, "not ") end
        stream$putl(ss, "plausible.")
        return(stream$get_contents(ss))
    end results
end report

lines = iter (s: stream) yields (string)
    while true do
        yield(stream$getl(s))
        except when end_of_file: break end
    end
end lines

start_up = proc ()
    po: stream := stream$primary_output()
    file: file_name := file_name$parse("unixdict.txt")
    fstream: stream := stream$open(file, "read")
    r: report := report$new()
    for line: string in lines(fstream) do
        report$classify(r, line)
    end
    stream$close(fstream)
    stream$puts(po, report$results(r))
end start_up
Amount of words: 25104
CIE: 24
xIE: 465
CEI: 13
xEI: 209

I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.
I before E, except after C: not plausible.


First we need to set the variable dict to the text of the dictionary as a string. How to do this depends on your JavaScript platform. Using Node.js, for example, you could download a copy of the dictionary to /tmp/unixdict.txt and then say dict = fs.readFileSync '/tmp/unixdict.txt', {encoding: 'UTF-8'}.

Now we can do the task:

ie-npc = ei-npc = ie-pc = ei-pc = 0
for word of dict.toLowerCase!.match /\S+/g
    ++ie-npc if /(^|[^c])ie/.test word
    ++ei-npc if /(^|[^c])ei/.test word
    ++ie-pc if word.indexOf('cie') > -1
    ++ei-pc if word.indexOf('cei') > -1

p1 = ie-npc > 2 * ei-npc
p2 = ei-pc > 2 * ie-pc
console.log '(1) is%s plausible.', if p1 then '' else ' not'
console.log '(2) is%s plausible.', if p2 then '' else ' not'
console.log 'The whole phrase is%s plausible.', if p1 and p2 then '' else ' not'

Common Lisp

(defun test-rule (rule-name examples counter-examples)
  (let ((plausible (if (> examples (* 2 counter-examples)) 'plausible 'not-plausible)))
    (list rule-name plausible examples counter-examples)))

(defun plausibility (result-string file parser)
  (let ((cei 0) (cie 0) (ie 0) (ei 0))
    (macrolet ((search-count (&rest terms)
                 (when terms
                   `(progn
                      (when (search ,(string-downcase (symbol-name (car terms))) word)
                        (incf ,(car terms) freq))
                      (search-count ,@(cdr terms))))))
      (with-open-file (stream file :external-format :latin-1)
        (loop :for raw-line = (read-line stream nil 'eof)
              :until (eq raw-line 'eof)
              :for line = (string-trim '(#\Tab #\Space) raw-line)
              :for (word freq) = (funcall parser line)
              :do (search-count cei cie ie ei))
        (print-result result-string cei cie ie ei)))))

(defun print-result (result-string cei cie ie ei)
  (let ((results (list (test-rule "I before E when not preceded by C" (- ie cie) (- ei cei))
                       (test-rule "E before I when preceded by C" cei cie))))
    (format t "~a:~%~{~{~2TThe rule \"~a\" is ~S. There were ~a examples and ~a counter-examples.~}~^~%~}~%~%~2TOverall the rule is ~S~%~%"
            result-string results (or (find 'not-plausible (mapcar #'cadr results)) 'plausible))))

(defun parse-dict (line) (list line 1))

(defun parse-freq (line)
  (list (subseq line 0 (position #\Tab line))
        (parse-integer (subseq line (position #\Tab line :from-end t)) :junk-allowed t)))

(plausibility "Dictionary" #p"unixdict.txt" #'parse-dict)
(plausibility "Word frequencies (stretch goal)" #p"1_2_all_freq.txt" #'parse-freq)
Dictionary:
  The rule "I before E when not preceded by C" is PLAUSIBLE. There were 465 examples and 213 counter-examples.
  The rule "E before I when preceded by C" is NOT-PLAUSIBLE. There were 13 examples and 24 counter-examples.

  Overall the rule is NOT-PLAUSIBLE

Word frequencies (stretch goal):
  The rule "I before E when not preceded by C" is NOT-PLAUSIBLE. There were 8163 examples and 4826 counter-examples.
  The rule "E before I when preceded by C" is NOT-PLAUSIBLE. There were 327 examples and 994 counter-examples.

  Overall the rule is NOT-PLAUSIBLE


The extra work has not been attempted

import std.file;
import std.stdio;

int main(string[] args) {
    if (args.length < 2) {
        stderr.writeln(args[0], " filename");
        return 1;
    }

    int cei, cie, ie, ei;
    auto file = File(args[1]);
    foreach(line; file.byLine) {
        auto res = eval(cast(string) line);
        cei += res.cei;
        cie += res.cie;
        ei += res.ei;
        ie += res.ie;
    }

    writeln("CEI: ", cei, "; CIE: ", cie);
    writeln("EI: ", ei, "; IE: ", ie);

    writeln("'I before E when not preceded by C' is ", verdict(ie, ei));
    writeln("'E before I when preceded by C' is ", verdict(cei, cie));

    return 0;
}

string verdict(int a, int b) {
    import std.format;
    if (a > 2*b) {
        return format("plausible with evidence %f", cast(double)a/b);
    }
    return format("not plausible with evidence %f", cast(double)a/b);
}

struct Evidence {
    int cei;
    int cie;
    int ei;
    int ie;
}

Evidence eval(string word) {
    enum State {

    State state;
    Evidence cnt;
    for(int i=0; i<word.length; ++i) {
        char c = word[i];
        switch(state) {
            case State.START:
                if (c == 'c') {
                    state = State.C;
                if (c == 'e') {
                    state = State.E;
                if (c == 'i') {
                    state = State.I;
            case State.C:
                if (c == 'e') {
                    state = State.CE;
                } else if (c == 'i') {
                    state = State.CI;
                } else if (c != 'c') {
                    state = State.START;
            case State.E:
                if (c == 'c') {
                    state = State.C;
                } else if (c == 'i') {
                    state = State.I;
                } else if (c != 'e') {
                    state = State.START;
            case State.I:
                if (c == 'c') {
                    state = State.C;
                } else if (c == 'e') {
                    state = State.E;
                } else if (c != 'i') {
                    state = State.START;
            case State.CE:
                if (c == 'i') {
                    state = State.I;
                if (c == 'c') {
                    state = State.C;
                state = State.START;
            case State.CI:
                if (c == 'e') {
                    state = State.E;
                if (c == 'c') {
                    state = State.C;
                state = State.START;
    return cnt;
CEI: 13; CIE: 24
EI: 217; IE: 466
'I before E when not preceded by C' is plausible with evidence 2.147465
'E before I when preceded by C' is not plausible with evidence 0.541667


Translation of: C sharp
program I_before_E_except_after_C;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils;

// A word counts against the rule if it has "ei" with no "c", or contains "cie".
function IsOppPlausibleWord(w: string): Boolean;
begin
  if ((not w.Contains('c')) and (w.Contains('ei'))) then
    exit(True);

  if (w.Contains('cie')) then
    exit(True);

  exit(False);
end;

// A word supports the rule if it has "ie" with no "c", or contains "cei".
function IsPlausibleWord(w: string): Boolean;
begin
  if ((not w.Contains('c')) and (w.Contains('ie'))) then
    exit(True);

  if (w.Contains('cei')) then
    exit(True);

  exit(False);
end;

function IsPlausibleRule(filename: TFileName): Boolean;
var
  words: TArray<string>;
  trueCount, falseCount: Cardinal;
  w: string;
begin
  words := TFile.ReadAllLines(filename, TEncoding.UTF8);
  trueCount := 0;
  falseCount := 0;

  for w in words do
  begin
    if (IsPlausibleWord(w)) then
      Inc(trueCount)
    else if (IsOppPlausibleWord(w)) then
      Inc(falseCount);
  end;

  Writeln('Plausible count: ', trueCount);
  Writeln('Implausible  count: ', falseCount);

  Result := trueCount > 2 * falseCount;
end;

begin
  if (IsPlausibleRule('unixdict.txt')) then
    Writeln('Rule is plausible.')
  else
    Writeln('Rule is not plausible.');
end.




/* variables to hold totals for each possibility */
word cie, xie, cei, xei;

/* classify a word and add it to the proper total */
proc nonrec classify(*char w) void:
    if CharsIndex(w, "ie") /= -1 then
        if CharsIndex(w, "cie") /= -1
            then cie := cie + 1
            else xie := xie + 1
        fi
    elif CharsIndex(w, "ei") /= -1 then
        if CharsIndex(w, "cei") /= -1
            then cei := cei + 1
            else xei := xei + 1
        fi
    fi
corp

/* see if a clause is plausible */
proc nonrec plausible(*char clause; word match, nomatch) bool:
    bool p;
    /* plausible when matches outnumber counter-examples by more than 2 to 1 */
    p := match > 2*nomatch;
    writeln(clause, ": ", if p then "" else "not " fi, "plausible.");
    p
corp

proc nonrec main() void:
    file() dict_file;
    channel input text dict_ch;
    [256] char line;
    bool p;
    cie := 0;
    xie := 0;
    cei := 0;
    xei := 0;
    /* read every word */
    open(dict_ch, dict_file, "unixdict.txt");
    while readln(dict_ch; &line[0]) do
        classify(&line[0])
    od;
    close(dict_ch);
    /* print statistics */
    writeln("CIE: ", cie:5);
    writeln("xIE: ", xie:5);
    writeln("CEI: ", cei:5);
    writeln("xEI: ", xei:5);
    /* see if the propositions are plausible */
    p := plausible("I before E when not preceded by C", xie, xei);
    p := plausible("E before I when preceded by C", cei, cie) and p;
    writeln("I before E except after C: ",
            if p then "" else "not " fi,
            "plausible.")
corp
CIE:    24
xIE:   465
CEI:    13
xEI:   209
I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.
I before E except after C: not plausible.


There are two files, one per hypothesis.

# i-before-e.ed
# Remove all the non-rule-related words
# Replace the occurrences with one-letter markers
# Remove 1 occurrence of e (alternative) per two i (null)
# Check whether there are more i's in the output (null hypothesis true) or not
# e-before-i-with-c.ed
# Remove all the non-rule-related words
# Replace the occurrences with one-letter markers
# Remove 1 occurrence of i (alternative) per two e (null)
# Check whether there are more e's in the output (null hypothesis true) or not
$ cat i-before-e.ed | ed -lEGs unixdict.txt 

The output has more i's, so the "i before e" hypothesis is plausible.

$ cat e-before-i-with-c.ed | ed -lEGs unixdict.txt 

The output has more i's, so the "e before i when preceded by c" hypothesis is not plausible. Thus, the whole rule is not plausible.
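
The cancellation idea described in the script comments above (drop one alternative marker for every two null markers, then see which kind is left over) can be sketched in Python. This is only an illustration of the counting argument, not a translation of the ed scripts; the function name `plausible_by_cancellation` is hypothetical, and the marker strings are built from the xIE/xEI and CEI/CIE counts reported by other entries on this page.

```python
def plausible_by_cancellation(markers: str, null: str, alt: str) -> bool:
    """Decide plausibility from a string of one-letter markers.

    `null` marks a word supporting the hypothesis, `alt` a word against it.
    Cancelling two null markers per alt marker leaves null markers over
    exactly when nulls > 2 * alts.
    """
    nulls = markers.count(null)
    alts = markers.count(alt)
    return nulls - 2 * alts > 0


# "I before E when not preceded by C": 465 supporting, 209 opposing
print(plausible_by_cancellation("i" * 465 + "e" * 209, null="i", alt="e"))
# "E before I when preceded by C": 13 supporting, 24 opposing
print(plausible_by_cancellation("e" * 13 + "i" * 24, null="e", alt="i"))
```

The comparison is strict, matching the task's "more than two times" wording.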


Translation of: Ruby
defmodule RC do
  def task(path) do
    plausibility_ratio = 2
    rules = [ {"I before E when not preceded by C:", "ie", "ei"},
              {"E before I when preceded by C:", "cei", "cie"} ]
    regex = ~r/ie|ei|cie|cei/
    counter = File.read!(path) |> countup(regex)
    Enum.all?(rules, fn {str, x, y} ->
      nx = counter[x]
      ny = counter[y]
      ratio = nx / ny
      plausibility = if ratio > plausibility_ratio, do: "Plausible", else: "Implausible"
      IO.puts str
      IO.puts "  #{x}: #{nx}; #{y}: #{ny}; Ratio: #{Float.round(ratio,3)}: #{plausibility}"
      ratio > plausibility_ratio
    end)
  end

  def countup(binary, regex) do
    String.split(binary)
    |> Enum.reduce(Map.new, fn word,acc ->
         if match = Regex.run(regex, word),
             do: Map.update(acc, hd(match), 1, &(&1+1)), else: acc
       end)
  end
end
path = hd(System.argv)
IO.inspect RC.task(path)
C:\Elixir>elixir test.exs \work\unixdict.txt
I before E when not preceded by C:
  ie: 462; ei: 212; Ratio: 2.179: Plausible
E before I when preceded by C:
  cei: 13; cie: 24; Ratio: 0.542: Implausible


plaus() ->                                                                    
    {ok,Words} = file:read_file("unixdict.txt"),                              
    Swords = string:tokens(erlang:binary_to_list(Words), "\n"),                                                        
    EiF = count(Swords,"[^c]ei",0),                                               
    IeF = count(Swords,"[^c]ie",0),                                               
    CeiF = count(Swords,"cei",0),                                             
    CieF = count(Swords,"cie",0),                                             
    if CeiF >= 2 * CieF -> P1= 'is'; true -> P1 = 'is not' end,               
    if IeF >= 2 * EiF -> P2 = 'is'; true -> P2 = 'is not' end,                
    if P1 == 'is' andalso P2 == 'is' -> P3 ='is'; true -> P3 = 'is not' end,
    io:format("Proposition 1. ~w plausible: ie ~w, ei ~w~n", [P2,IeF,EiF]),    
    io:format("Proposition 2. ~w plausible: cei ~w, cie ~w~n", [P1,CeiF,CieF]),
    io:format("The rule ~w plausible~n", [P3]).                               
count(List,Pattern,Acc) when length(List) == 0 -> Acc;                        
count(List,Pattern,Acc) ->                                                    
    [H|T] = List,                                                             
    case re:run(H,Pattern,[global,{capture,none}]) of                         
        match -> count(T,Pattern, Acc + 1);                                   
        nomatch -> count(T,Pattern, Acc)
    end.
69> cei:plaus().
Proposition 1. is plausible: ie 464, ei 194
Proposition 2. is not plausible: cei 13, cie 24
The rule 'is not' plausible


USING: combinators formatting generalizations io.encodings.utf8
io.files kernel literals math prettyprint regexp sequences ;
IN: rosetta-code.i-before-e

: correct ( #correct #incorrect rule-str -- )
    pprint " is correct for %d and incorrect for %d.\n" printf ;

: plausibility ( #correct #incorrect -- str )
    2 * > "plausible" "implausible" ? ;
: output ( #correct #incorrect rule-str -- )
    [ correct ] curry
    [ plausibility "This is %s.\n\n" printf ] 2bi ;
"unixdict.txt" utf8 file-lines ${
    R/ cei/ R/ cie/ R/ [^c]ie/ R/ [^c]ei/
    [ count-matches ]
    [ map-sum       ]
    [ 4 apply-curry ] bi@
} cleave

"I before E when not preceded by C"
"E before I when preceded by C" [ output ] bi@
"I before E when not preceded by C" is correct for 465 and incorrect for 195.
This is plausible.

"E before I when preceded by C" is correct for 13 and incorrect for 24.
This is implausible.


The Linux build instructions and an example run are given in the comments at the beginning of the f90 source.

!-*- mode: compilation; default-directory: "/tmp/" -*-
!Compilation started at Sat May 18 22:19:19
!a=./F && make $a && $a < unixdict.txt
!f95 -Wall -ffree-form F.F -o F
!   ie   ei  cie  cei
!  490  230   24   13
!         [^c]ie plausible                       
!            cei implausible                     
! ([^c]ie)|(cei) implausible                     
!Compilation finished at Sat May 18 22:19:19

! test the plausibility of i before e except...
program cia
  implicit none
  character (len=256) :: s
  integer :: ie, ei, cie, cei
  integer :: ios
  data ie, ei, cie, cei/4*0/
  do while (.true.)
    read(5,*,iostat = ios)s
    if (0 .ne. ios) then
      exit
    endif
    call lower_case(s)
    cie = cie + occurrences(s, 'cie')
    cei = cei + occurrences(s, 'cei')
    ie = ie + occurrences(s, 'ie')
    ei = ei + occurrences(s, 'ei')
  enddo
  write(6,'(1x,4(a4,1x))') 'ie','ei','cie','cei'
  write(6,'(1x,4(i4,1x))') ie,ei,cie,cei ! 488 230 24 13
  write(6,'(1x,2(a,1x))') '        [^c]ie',plausibility(ie,ei)
  write(6,'(1x,2(a,1x))') '           cei',plausibility(cei,cie)
  write(6,'(1x,2(a,1x))') '([^c]ie)|(cei)',plausibility(ie+cei,ei+cie)

contains

  subroutine lower_case(s)
    character(len=*), intent(inout) :: s
    integer :: i
    do i=1, len_trim(s)
      s(i:i) = achar(ior(iachar(s(i:i)),32))
    end do
  end subroutine lower_case

  integer function occurrences(a,b)
    character(len=*), intent(in) :: a, b
    integer :: i, j, n
    n = 0
    i = 0
    j = index(a, b)
    do while (0 .lt. j)
      n = n+1
      i = i+len(b)+j-1
      j = index(a(i:), b)
    end do
    occurrences = n
  end function occurrences

  character*(32) function plausibility(da, nyet)
    integer, intent(in) :: da, nyet
    if (nyet*2 .lt. da) then
      plausibility = 'plausible'
    else
      plausibility = 'implausible'
    end if
  end function plausibility
end program cia


Function getfile(file As String) As String
    Dim As Integer F = Freefile
    Dim As String text,intext
    Open file For Input As #F
    Line Input #F,text
    While Not Eof(F) 
        Line Input #F,intext
        text=text+Chr(10)+intext
    Wend
    close #F
    Return text
End Function

Function TALLY(instring As String,PartString As String) As Integer
        Dim count As Integer
        var lens2=Len(PartString)
        Dim As String s=instring 
        Dim As Integer position=Instr(s,PartString)
        If position=0 Then Return 0
        While position>0
            count+=1
            position=Instr(position+lens2,s,PartString)
        Wend
        Return count
    End Function
Dim As String myfile="unixdict.txt"

Dim As String wordlist= getfile(myfile)

print "The number of words in unixdict.txt  ",TALLY(wordlist,chr(10))+1
dim as integer cei=TALLY(wordlist,"cei")
print "Instances of cei",cei
dim as integer cie=TALLY(wordlist,"cie")
print "Instances of cie",cie
dim as integer ei=TALLY(wordlist,"ei")
print "Instances of *ei, where * is not c",ei-cei
dim as integer ie=TALLY(wordlist,"ie")
print "Instances of *ie, where * is not c",ie-cie
print "Conclusion:"
print "ie is plausible when not preceded by c, the ratio is ";(ie-cie)/(ei-cei)
print "ei is not plausible when preceded by c, the ratio is ";cei/cie
print "So, the idea is not plausible."

The number of words in unixdict.txt        25104

Instances of cei             13
Instances of cie             24

Instances of *ei, where * is not c         217
Instances of *ie, where * is not c         466

ie is plausible when not preceded by c, the ratio is  2.147465437788018
ei is not plausible when preceded by c, the ratio is  0.5416666666666666
So, the idea is not plausible.


include "NSLog.incl"

#plist NSAppTransportSecurity @{NSAllowsArbitraryLoads:YES}

void local fn CheckWord( wrd as CFStringRef, txt as CFStringRef, c as ^long, x as ^long )
  CFRange range = fn StringRangeOfString( wrd, txt )
  while ( range.location != NSNotFound )
    if ( range.location > 0 )
      select ( fn StringCharacterAtIndex( wrd, range.location-1 ) )
        case _"c"
          *c += 1
        case else
          *x += 1
      end select
    else
      *x += 1
    end if
    // continue the search just past the start of this match
    range.location++
    range.length = len(wrd) - range.location
    range = fn StringRangeOfStringWithOptionsInRange( wrd, txt, 0, range )
  wend
end fn

void local fn Doit
  CFURLRef    url    = fn URLWithString( @"http://wiki.puzzlers.org/pub/wordlists/unixdict.txt" )
  CFStringRef string = fn StringWithContentsOfURL( url, NSUTF8StringEncoding, NULL )
  CFArrayRef  words  = fn StringComponentsSeparatedByCharactersInSet( string, fn CharacterSetNewlineSet )
  long        cei    = 0, cie = 0, xei = 0, xie = 0
  CFStringRef wrd, result
  for wrd in words
    fn CheckWord( wrd, @"ei", @cei, @xei )
    fn CheckWord( wrd, @"ie", @cie, @xie )
  next
  NSLog(@"cei: %ld",cei)
  NSLog(@"cie: %ld",cie)
  NSLog(@"xei: %ld",xei)
  NSLog(@"xie: %ld",xie)
  if xie <= 2 * xei then result = @"not plausible" else result = @"plausible"
  NSLog( @"\nI before E when not preceded by C: %@.\n¬
  There are %ld examples and %ld counter-examples for a ratio of %f.\n", ¬
  result, xie, xei, ( (float)xie / (float)xei ) )
  if cei <= 2 * cie then result = @"not plausible" else result = @"plausible"
  NSLog( @"E before I when preceded by C: %@.\n¬
  There are %ld examples and %ld counter-examples for a ratio of %f.\n", ¬
  result, cei, cie, ( (float)cei / (float)cie ) )
end fn

fn DoIt

cei: 13
cie: 24
xei: 217
xie: 466

I before E when not preceded by C: plausible.
There are 466 examples and 217 counter-examples for a ratio of 2.147465.

E before I when preceded by C: not plausible.
There are 13 examples and 24 counter-examples for a ratio of 0.541667.


package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"regexp"
	"strings"
)

func main() {
	f, err := os.Open("unixdict.txt")
	if err != nil {
		log.Fatalln(err)
	}
	defer f.Close()

	s := bufio.NewScanner(f)
	rie := regexp.MustCompile("^ie|[^c]ie")
	rei := regexp.MustCompile("^ei|[^c]ei")
	var cie, ie int
	var cei, ei int
	for s.Scan() {
		line := s.Text()
		if strings.Contains(line, "cie") {
			cie++
		}
		if strings.Contains(line, "cei") {
			cei++
		}
		if rie.MatchString(line) {
			ie++
		}
		if rei.MatchString(line) {
			ei++
		}
	}
	err = s.Err()
	if err != nil {
		log.Fatalln(err)
	}

	if check(ie, ei, "I before E when not preceded by C") &&
		check(cei, cie, "E before I when preceded by C") {
		fmt.Println("Both plausible.")
		fmt.Println(`"I before E, except after C" is plausible.`)
	} else {
		fmt.Println("One or both implausible.")
		fmt.Println(`"I before E, except after C" is implausible.`)
	}
}

// check checks if a statement is plausible. Something is plausible if a is more
// than two times b.
func check(a, b int, s string) bool {
	switch {
	case a > b*2:
		fmt.Printf("%q is plausible (%d vs %d).\n", s, a, b)
		return true
	case a >= b:
		fmt.Printf("%q is implausible (%d vs %d).\n", s, a, b)
	default:
		fmt.Printf("%q is implausible and contra-indicated (%d vs %d).\n",
			s, a, b)
	}
	return false
}
"I before E when not preceded by C" is plausible (465 vs 213).
"E before I when preceded by C" is implausible and contra-indicated (13 vs 24).
One or both implausible.
"I before E, except after C" is implausible.


Using Regular Expressions, you can quickly count all occurrences of words that follow this rule and words that don't. In this solution, TDFA -- a fast, POSIX ERE engine -- was used. However, substituting any other regex engine for TDFA should only require changing the import statement. See this page for a list of the most common regex engines available in Haskell.

This solution does not attempt the stretch goal.

import Network.HTTP
import Text.Regex.TDFA
import Text.Printf

getWordList :: IO String
getWordList  =  do
    response  <-  simpleHTTP.getRequest$ url
    getResponseBody response
        where url = "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt"

main = do
    words <- getWordList
    putStrLn "Checking Rule 1: \"I before E when not preceded by C\"..."
    let numTrueRule1   =  matchCount (makeRegex "[^c]ie" :: Regex) words
        numFalseRule1  =  matchCount (makeRegex "[^c]ei" :: Regex) words
        rule1Plausible  =  numTrueRule1 > (2*numFalseRule1)
    printf "Rule 1 is correct for %d\n        incorrect for %d\n" numTrueRule1 numFalseRule1
    printf "*** Rule 1 is %splausible.\n" (if rule1Plausible then "" else "im")
    putStrLn "Checking Rule 2: \"E before I when preceded by C\"..."
    let numTrueRule2   =  matchCount (makeRegex "cei" :: Regex) words
        numFalseRule2  =  matchCount (makeRegex "cie" :: Regex) words
        rule2Plausible  =  numTrueRule2 > (2*numFalseRule2)
    printf "Rule 2 is correct for %d\n        incorrect for %d\n" numTrueRule2 numFalseRule2
    printf "*** Rule 2 is %splausible.\n" (if rule2Plausible then "" else "im")
Checking Rule 1: "I before E when not preceded by C"...
Rule 1 is correct for 465
        incorrect for 195
*** Rule 1 is plausible.
Checking Rule 2: "E before I when preceded by C"...
Rule 2 is correct for 13
        incorrect for 24
*** Rule 2 is implausible.
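
The regex-counting approach used above can also be sketched in Python with the standard `re` module. This is a minimal illustration on a five-word sample, not a re-run of the task: the helper names `count_matches` and `is_plausible` and the sample words are assumptions introduced here. Note that a pattern such as `[^c]ie` needs a character before "ie", so how word-initial "ie"/"ei" is counted depends on whether the engine lets `[^c]` match a line break.

```python
import re


def count_matches(pattern: str, text: str) -> int:
    """Count non-overlapping matches of `pattern` in `text`."""
    return len(re.findall(pattern, text))


def is_plausible(supporting: int, opposing: int) -> bool:
    """Task criterion: strictly more than twice as many supporting cases."""
    return supporting > 2 * opposing


# Tiny hypothetical sample, one word per line (not the real word list).
sample = "believe\nreceive\nscience\nweird\nfriend"

xie = count_matches(r"[^c]ie", sample)  # believe, friend
xei = count_matches(r"[^c]ei", sample)  # weird
cei = count_matches(r"cei", sample)     # receive
cie = count_matches(r"cie", sample)     # science

print(xie, xei, cei, cie)
print(is_plausible(xie, xei), is_plausible(cei, cie))
```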

Icon and Unicon

This solution only works in Unicon, but wouldn't be hard to adapt to Icon. Assumes that words that start with "ei" violate "i before e except after c" and that occurrences of "ei" and "ie" that occur multiple times in the same input line should all be tested.

import Utils		# To get the FindFirst class

procedure main(a)
    showCounts := "--showcounts" == !a
    totals := table(0)
    phrases := ["cei","cie","ei","ie"]  # Longer phrases first
    ff := FindFirst(phrases)

    every map(!&input) ?
        while totals[2(tab(ff.locate()), ff.moveMatch(), move(-1))] +:= 1

    eiP := totals["cei"] > 2* totals["cie"]
    ieP := (totals["ie"]+totals["cei"]) > 2* totals["ei"]
    write("phrase is ",((\ieP & \eiP),"plausible")|"not plausible")
    write("ie is ",(\ieP,"plausible")|"not plausible")
    write("ei is ",(\eiP,"plausible")|"not plausible")

    if \showCounts then every write(phrase := !phrases,": ",totals[phrase])
end

Output of running with the --showcounts flag:

-> ei --showcounts <unixdict.txt
phrase is not plausible
ie is plausible
ei is not plausible
cei: 13
cie: 24
ei: 217
ie: 466
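
The scanning strategy described for this entry (try the longer phrases first, count every occurrence in a line, and back up one character after each match) can be sketched in Python as follows. This is one reading of the approach rather than a translation of the Unicon code; the names `PHRASES` and `tally` are hypothetical.

```python
import re
from collections import Counter

# Longer alternatives are listed first so "cie"/"cei" win over "ie"/"ei".
PHRASES = re.compile(r"cei|cie|ei|ie")


def tally(line: str) -> Counter:
    """Count every phrase occurrence in a line, resuming one character
    before the end of each match so that, e.g., the "ei" in "iei" is seen."""
    counts = Counter()
    text = line.lower()
    pos = 0
    while (m := PHRASES.search(text, pos)) is not None:
        counts[m.group()] += 1
        pos = m.end() - 1  # always advances: every phrase is >= 2 chars long
    return counts


print(tally("science"))
print(tally("deities"))
```

Because "cie" starts one character earlier than the "ie" inside it, the leftmost-match rule means the inner "ie" is never counted separately.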

stretch goal

import Utils		# To get the FindFirst class

procedure main(a)
    WS := " \t"
    showCounts := "--showcounts" == !a
    phrases := ["cei","cie","ei","ie"]
    ff := FindFirst(phrases)
    totals := table(0)

    every map(!&input) ? {
        w := (tab(many(WS)),tab(upto(WS)))             # word
        (tab(many(WS)),tab(upto(WS)))                  # Skip part of speech
        n := integer((tab(many(WS)),tab(upto(WS)|0))) | next   # frequency?
        \w ? while totals[2(tab(ff.locate()), ff.moveMatch(), move(-1))] +:= n
    }

    eiP := totals["cei"] > 2* totals["cie"]
    ieP := (totals["ie"]+totals["cei"]) > 2* totals["ei"]
    write("phrase is ",((\ieP & \eiP),"plausible")|"not plausible")
    write("ie is ",(\ieP,"plausible")|"not plausible")
    write("ei is ",(\eiP,"plausible")|"not plausible")

    if \showCounts then every write(phrase := !phrases,": ",totals[phrase])
end

->ei2 --showcounts <1_2*txt
phrase is not plausible
ie is not plausible
ei is not plausible
cei: 327
cie: 994
ei: 4826
ie: 8207


After downloading unixdict to /tmp:

   dict=:tolower fread '/tmp/unixdict.txt'

Investigating the rules:

   +/'cie' E. dict
   +/'cei' E. dict
   +/'ie' E. dict
   +/'ei' E. dict

So, based on unixdict.txt, the "I before E" rule seems plausible (490 > 230 by more than a factor of 2), but the exception does not make much sense (we see almost twice as many i before e after a c as we see e before i after a c).

Note that if we looked at frequency of use for words, instead of considering all words to have equal weights, we might come up with a different answer.
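
To illustrate what weighting by frequency of use means, here is a minimal Python sketch. The function name `weighted_count` is hypothetical and the frequencies are invented toy values chosen only to show the mechanics; they are not taken from any corpus.

```python
def weighted_count(fragment: str, entries: list[tuple[str, int]]) -> int:
    """Sum the frequencies of all words that contain `fragment`."""
    return sum(freq for word, freq in entries if fragment in word)


# Invented (word, frequency) pairs for illustration only.
toy_entries = [
    ("believe", 100),
    ("receive", 80),
    ("science", 120),
    ("their", 500),
]

ie_weight = weighted_count("ie", toy_entries)  # believe + science
ei_weight = weighted_count("ei", toy_entries)  # receive + their
print(ie_weight, ei_weight, ie_weight > 2 * ei_weight)
```

With equal weights the toy list has two "ie" words and two "ei" words; once frequencies are applied, a single very common word can dominate the comparison.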

stretch goal

After downloading 1_2_all_freq to /tmp, we can read it into J, and break out the first column (as words) and the third column as numbers:

allfreq=: |:}.<;._1;._2]1!:1<'/tmp/1_2_all_freq.txt'

words=:         >0 { allfreq
freqs=: 0 {.@".&>2 { allfreq

With these definitions, we can define a prevalence verb which will tell us how often a particular substring appears in use:

prevalence=:verb define
  (y +./@E."1 words) +/ .* freqs
)

Investigating our original proposed rules:

   'ie' %&prevalence 'ei'

A generic "i before e" rule is not looking quite as good now - words that have i before e are used less than twice as much as words which use e before i.

   'cei' %&prevalence 'cie'

An "except after c" variant is looking awful now - words that use the cie sequence are three times as likely as words that use the cei sequence. So, of course, if we modified our original rule with this exception it would weaken the original rule:

   ('ie' -&prevalence 'cie') % ('ei' -&prevalence 'cei')

Note that we might also want to consider non-adjacent matches (the regular expression 'i.*e' instead of 'ie' or perhaps 'c.*ie' or 'c.*i.*e' instead of 'cie') - this would be straightforward to check, but this would bulk up the page. (And, to be meaningful, we'd want a more constrained wildcard than .* -- at the very least we would not want to span words.)
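
A more constrained wildcard of the kind suggested above could look like the following Python sketch. It is an assumption about one reasonable constraint (only lowercase letters in the gap, and at most `max_gap` of them), not something this entry implements; the name `has_gapped` is hypothetical.

```python
import re


def has_gapped(text: str, first: str, second: str, max_gap: int = 2) -> bool:
    """True if `first` is followed by `second` with at most `max_gap`
    lowercase letters in between. Spaces and line breaks are not letters,
    so a match can never span two words."""
    pattern = re.escape(first) + r"[a-z]{0,%d}" % max_gap + re.escape(second)
    return re.search(pattern, text) is not None


print(has_gapped("science", "i", "e"))             # adjacent "ie"
print(has_gapped("like", "i", "e", max_gap=1))     # one letter in the gap
print(has_gapped("in the", "i", "e", max_gap=2))   # gap would cross a space
```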


import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
public static void main(String[] args) throws URISyntaxException, IOException {
    count();
    System.out.printf("%-10s %,d%n", "total", total);
    System.out.printf("%-10s %,d%n", "'cei'", cei);
    System.out.printf("%-10s %,d%n", "'cie'", cie);
    System.out.printf("%,d > (%,d * 2) = %b%n", cei, cie, cei > (cie * 2));
    System.out.printf("%,d > (%,d * 2) = %b", cie, cei, cie > (cei * 2));
}

static int total = 0;
static int cei = 0;
static int cie = 0;

static void count() throws URISyntaxException, IOException {
    URL url = new URI("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt").toURL();
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()))) {
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.matches(".*?(?:[^c]ie|cei).*")) {
                cei++;  // word supports the rule
            } else if (line.matches(".*?(?:[^c]ei|cie).*")) {
                cie++;  // word contradicts the rule
            }
            total++;
        }
    }
}
total      25,104
'cei'      477
'cie'      215
477 > (215 * 2) = true
215 > (477 * 2) = false

An alternate demonstration
Download and save wordlist to unixdict.txt.

import java.io.BufferedReader;
import java.io.FileReader;

public class IbeforeE 
{
	public static void main(String[] args)
	{
		IbeforeE now=new IbeforeE();
		String wordlist="unixdict.txt";
		if(now.isPlausibleRule(wordlist))
			System.out.println("Rule is plausible.");
		else
			System.out.println("Rule is not plausible.");
	}
	boolean isPlausibleRule(String filename)
	{
		int truecount=0,falsecount=0;
		try
		{
			BufferedReader br=new BufferedReader(new FileReader(filename));
			String word;
			while((word=br.readLine())!=null)
			{
				if(isPlausibleWord(word))
					truecount++;
				else if(isOppPlausibleWord(word))
					falsecount++;
			}
			br.close();
		}
		catch(Exception e)
		{
			System.out.println("Something went horribly wrong: "+e.getMessage());
		}
		System.out.println("Plausible count: "+truecount);
		System.out.println("Implausible count: "+falsecount);
		if(truecount>2*falsecount)
			return true;
		return false;
	}
	boolean isPlausibleWord(String word)
	{
		if(!word.contains("c")&&word.contains("ie"))
			return true;
		else if(word.contains("cei"))
			return true;
		return false;
	}
	boolean isOppPlausibleWord(String word)
	{
		if(!word.contains("c")&&word.contains("ei"))
			return true;
		else if(word.contains("cie"))
			return true;
		return false;
	}
}
Plausible count: 384
Implausible count: 204
Rule is not plausible.


Works with: jq version with regex support

WARNING: The problem statement is misleading as the rule only applies to syllables that rhyme with "see".

def plausibility_ratio: 2;

# scan/2 produces a stream of matches but the first match of a segment (e.g. cie)
# blocks further matches with that segment, and therefore if scan produces "ie",
# it was NOT preceded by "c".
def dictionary:
  reduce .[] as $word
    ( {};
      reduce ($word | scan("ie|ei|cie|cei")) as $found ( .; .[$found] += 1 ));

def rules:
  { "I before E when not preceded by C": ["ie",  "ei"],
    "E before I when preceded by C":     ["cei", "cie"]
  };

# Round to nearest integer or else "round-up"
def round:
  if . < 0 then (-1 * ((- .) | round) | if . == -0 then 0 else . end)
  else floor as $x | if (. - $x) < 0.5 then $x else $x+1 end
  end ;

def assess:
  (split("\n") | dictionary) as $dictionary
  | rules as $rules
  | ($rules | keys[]) as $key
  | $rules[$key] as $fragments
  | $dictionary[$fragments[0]] as $x
  | $dictionary[$fragments[1]] as $y
  | ($x / $y) as $ratio
  | (if $ratio > plausibility_ratio then "plausible"
     else "implausible" end) as $plausibility
  | " -- the rule \"\($key)\" is \($plausibility)
    as ratio = \($x)/\($y) ~ \($ratio * 100 |round)%"  ;

"Using the problematic criterion specified in the task requirements:", assess

Using http://www.puzzlers.org/pub/wordlists/unixdict.txt as of June 2015:

$ jq -s -R -r -f I_before_E_except_after_C.jq unixdict.txt
Using the problematic criterion specified in the task requirements:
 -- the rule "E before I when preceded by C" is implausible
    as ratio = 13/24 ~ 54%
 -- the rule "I before E when not preceded by C" is plausible
    as ratio = 464/217 ~ 214%


# v0.0.6

open("unixdict.txt") do txtfile
    rule1, notrule1, rule2, notrule2 = 0, 0, 0, 0
    for word in eachline(txtfile)
        # "I before E when not preceded by C"
        if ismatch(r"ie"i, word)
            if ismatch(r"cie"i, word)
                notrule1 += 1
            else
                rule1 += 1
            end
        end
        # "E before I when preceded by C"
        if ismatch(r"ei"i, word)
            if ismatch(r"cei"i, word)
                rule2 += 1
            else
                notrule2 += 1
            end
        end
    end

    print("Plausibility of \"I before E when not preceded by C\": ")
    println(rule1 > 2 * notrule1 ? "PLAUSIBLE" : "UNPLAUSIBLE")
    print("Plausibility of \"E before I when preceded by C\":")
    println(rule2 > 2 * notrule2 ? "PLAUSIBLE" : "UNPLAUSIBLE")
end
Plausibility of "I before E when not preceded by C": PLAUSIBLE
Plausibility of "E before I when preceded by C":UNPLAUSIBLE


// version 1.0.6

import java.net.URL
import java.io.InputStreamReader
import java.io.BufferedReader

fun isPlausible(n1: Int, n2: Int) = n1 > 2 * n2

fun printResults(source: String, counts: IntArray) {
    println("Results for $source")
    println("  i before e except after c")
    println("    for     ${counts[0]}")
    println("    against ${counts[1]}")
    val plausible1 = isPlausible(counts[0], counts[1])
    println("  sub-rule is${if (plausible1) "" else " not"} plausible\n")
    println("  e before i when preceded by c")
    println("    for     ${counts[2]}")
    println("    against ${counts[3]}")
    val plausible2 = isPlausible(counts[2], counts[3])
    println("  sub-rule is${if (plausible2) "" else " not"} plausible\n")
    val plausible = plausible1 && plausible2
    println("  rule is${if (plausible) "" else " not"} plausible")
}

fun main(args: Array<String>) {
    val url = URL("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
    val isr = InputStreamReader(url.openStream())
    val reader = BufferedReader(isr)
    val regexes = arrayOf(
        Regex("(^|[^c])ie"),     // i before e when not preceded by c (includes words starting with ie)
        Regex("(^|[^c])ei"),     // e before i when not preceded by c (includes words starting with ei)
        Regex("cei"),            // e before i when preceded by c
        Regex("cie")             // i before e when preceded by c
    )
    val counts = IntArray(4) // corresponding counts of occurrences
    var word = reader.readLine()
    while (word != null) {
        for (i in 0..3) counts[i] += regexes[i].findAll(word).toList().size
        word = reader.readLine()
    }
    printResults("unixdict.txt", counts)

    val url2 = URL("http://ucrel.lancs.ac.uk/bncfreq/lists/1_2_all_freq.txt")
    val isr2 = InputStreamReader(url2.openStream())
    val reader2 = BufferedReader(isr2)
    val counts2 = IntArray(4) 
    reader2.readLine() // read header line
    var line = reader2.readLine() // read first line and store it
    var words: List<String>
    val splitter = Regex("""(\t+|\s+)""")
    while (line != null) {
        words = line.split(splitter)
        if (words.size == 4)  // first element is empty
            for (i in 0..3) counts2[i] += regexes[i].findAll(words[1]).toList().size * words[3].toInt()
        line = reader2.readLine()
    }
    printResults("British National Corpus", counts2)
}
Results for unixdict.txt
  i before e except after c
    for     466
    against 217
  sub-rule is plausible

  e before i when preceded by c
    for     13
    against 24
  sub-rule is not plausible

  rule is not plausible

Results for British National Corpus
  i before e except after c
    for     8192
    against 4826
  sub-rule is not plausible

  e before i when preceded by c
    for     327
    against 994
  sub-rule is not plausible

  rule is not plausible


Translation of: Perl
val words = split("\n", readfile("./data/unixdict.txt")) -> rest

val print = impure fn(support, against) {
    val ratio = support / against
    writeln "{{support}} / {{against}} = {{ratio : r2}}:", (ratio < 2) * " NOT", " PLAUSIBLE"
    return if(ratio >= 2: 1; 0)
}

val ks = fw/ei cei ie cie/
var cnt = {:}

for w in words {
    for k in ks {
        cnt[k; 0] += if(k in w: 1; 0)
    }
}

var support = cnt'ie - cnt'cie
var against = cnt'ei - cnt'cei

var result = print(support, against)
result += print(cnt'cei, cnt'cie)

writeln "Overall:", (result < 2) * " NOT", " PLAUSIBLE\n"
465 / 213 = 2.18: PLAUSIBLE
13 / 24 = 0.54: NOT PLAUSIBLE


local(cie,cei,ie,ei) = (:0,0,0,0)

local(match_ie) = regExp(`[^c]ie`)
local(match_ei) = regExp(`[^c]ei`)

with word in include_url(`http://wiki.puzzlers.org/pub/wordlists/unixdict.txt`)->asString->split("\n")
where #word >> `ie` or #word >> `ei`
do {
    #word >> `cie`
        ? #cie++
    #word >> `cei`
        ? #cei++

    #match_ie->reset(-input=#word, -ignoreCase)&find
        ? #ie++
    #match_ei->reset(-input=#word, -ignoreCase)&find
        ? #ei++

local(ie_plausible)  = (#ie  >= (2 * #ei))
local(cei_plausible) = (#cei >= (2 * #cie))

    `The rule "I before E when not preceded by C" is ` +
    (#ie_plausible ? '' | 'NOT-') + `PLAUSIBLE. There were ` +
    #ie + ` examples and ` + #ei + ` counter-examples.`
    `The rule "E before I when preceded by C" is ` +
    (#cei_plausible ? `` | `NOT-`) + `PLAUSIBLE. There were ` +
    #cei + ` examples and ` + #cie + ` counter-examples.`
stdoutnl(`Overall the rule is ` + (#ie_plausible and #cei_plausible ? `` | `NOT-`) + `PLAUSIBLE`)
The rule "I before E when not preceded by C" is PLAUSIBLE. There were 464 examples and 194 counter-examples.
The rule "E before I when preceded by C" is NOT-PLAUSIBLE. There were 13 examples and 24 counter-examples.
Overall the rule is NOT-PLAUSIBLE


-- Needed to get dictionary file from web server
local http = require("socket.http")

-- Return count of words that contain pattern
function count (pattern, wordList)
    local total = 0
    for word in wordList:gmatch("%S+") do
        if word:match(pattern) then total = total + 1 end
    end
    return total
end

-- Check plausibility of case given its opposite
function plaus (case, opposite, words)
    if count(case, words) > 2 * count(opposite, words) then
        print("PLAUSIBLE")
        return true
    else
        print("IMPLAUSIBLE")
        return false
    end
end

-- Main procedure
local page = http.request("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
io.write("I before E when not preceded by C: ")
local sub1 = plaus("[^c]ie", "cie", page)
io.write("E before I when preceded by C: ")
local sub2 = plaus("cei", "[^c]ei", page)
io.write("Overall the phrase is ")
if not (sub1 and sub2) then io.write("not ") end
print("plausible.")
I before E when not preceded by C: PLAUSIBLE
E before I when preceded by C: IMPLAUSIBLE
Overall the phrase is not plausible.


words:= HTTP:-Get("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt"):
lst := StringTools:-Split(words[2],"\n"):
xie, cie, cei, xei := 0, 0, 0, 0:
for item in lst do
	if searchtext("ie", item) <> 0 then
		if searchtext("cie", item) <> 0 then
			cie := cie + 1:
		else
			xie := xie + 1:
		end if:
	end if:
	if searchtext("ei", item) <> 0 then
		if searchtext("cei", item) <> 0 then
			cei := cei + 1:
		else
			xei := xei + 1:
		end if:
	end if:
end do:
p1, p2 := evalb(xie > 2*xei),evalb(cei > 2*cie);
printf("The first phrase is %s with supporting features %d, anti features %d\n", piecewise(p1, "plausible", "not plausible"), xie, xei);
printf("The second phrase is %s with supporting features %d, anti features %d\n", piecewise(p2, "plausible", "not plausible"), cei, cie);
printf("The overall phrase is %s\n", piecewise(p1 and p2, "plausible", "not plausible")):
The first phrase is plausible with supporting features 465 and anti features 213
The second phrase is not plausible with supporting features 13 and anti features 24
The overall phrase is not plausible

Mathematica / Wolfram Language

wordlist = 
Print["The number of words in unixdict.txt = " <> 
StringMatchQ[#, ___ ~~ "c" ~~ "i" ~~ "e" ~~ ___] & /@ wordlist ;
cie = Count[%, True];
StringMatchQ[#, ___ ~~ "c" ~~ "e" ~~ "i" ~~ ___] & /@ wordlist ;
cei = Count[%, True];
StringMatchQ[#, ___ ~~ "i" ~~ "e" ~~ ___] & /@ wordlist ;
ie = Count[%, True] - cie;
StringMatchQ[#, ___ ~~ "e" ~~ "i" ~~ ___] & /@ wordlist ;
ei = Count[%, True] - cei;
test1 = ie > 2 ei;
Print["The rule \"I before E when not preceded by C\" is " <> 
Print["There were " <> ToString[ie] <> " examples and " <> 
  ToString[ei]  <> " counter examples, for a ratio of " <> 
test2 = cei > 2 cie;
Print["The rule \"E before I when preceded by C\" is " <> 
Print["There were " <> ToString[cei] <> " examples and " <> 
  ToString[cie]  <> " counter examples, for a ratio of " <> 
Print["Overall the rule is " <> 
  If[test1 && test2, "PLAUSIBLE", "NOT PLAUSIBLE" ]]
The number of words in unixdict.txt = 25104
The rule "I before E when not preceded by C" is PLAUSIBLE
There were 465 examples and 213 counter examples, for a ratio of 2.1831
The rule "E before I when preceded by C" is NOT PLAUSIBLE
There were 13 examples and 24 counter examples, for a ratio of 0.541667
Overall the rule is NOT PLAUSIBLE


function iBeforeE()

function check(URL)
fprintf('For %s:\n', URL)
[~, name, ext] = fileparts(URL);
fn = [name ext];
if exist(fn,'file')
    lines = readlines(fn, 'EmptyLineRule', 'skip');
    fprintf('Reading data from %s\n', URL)
    lines = readlines(URL, 'EmptyLineRule', 'skip');
    % Save the file for later
includesFrequencyData = length(split(lines(1))) > 1;
ie = 0;
cie = 0;
ei = 0;
cei = 0;
for i = 1:size(lines,1)
    if includesFrequencyData
        fields = split(strtrim(lines(i)));
        if length(fields) ~= 3 || i == 1
        word = fields(1);
        frequency = str2double(fields(3));
        word = lines(i);
        frequency = 1;
    ie = ie + length(strfind(word,'ie')) * frequency;
    ei = ei + length(strfind(word,'ei')) * frequency;
    cie = cie + length(strfind(word,'cie')) * frequency;
    cei = cei + length(strfind(word,'cei')) * frequency;
rule1 =  "I before E when not preceded by C";
p1 = reportPlausibility(rule1, ie-cie, ei-cei );
rule2 =  "E before I when preceded by C";
p2 = reportPlausibility(rule2, cei, cie );
combinedRule = "I before E, except after C";
fprintf('Hence the combined rule \"%s\" is ', combinedRule);
if ~(p1 && p2)
    fprintf('NOT ');

function plausible = reportPlausibility(claim, positive, negative)
plausible = true;
fprintf('\"%s\" is ', claim);
if positive <= 2*negative
    plausible = false;
    fprintf('NOT ')
fprintf('PLAUSIBLE,\n  since the ratio of positive to negative examples is %d/%d = %0.2f.\n', positive, negative, positive/negative )
>> iBeforeE
For http://wiki.puzzlers.org/pub/wordlists/unixdict.txt:
"I before E when not preceded by C" is PLAUSIBLE,
  since the ratio of positive to negative examples is 466/217 = 2.15.
"E before I when preceded by C" is NOT PLAUSIBLE,
  since the ratio of positive to negative examples is 13/24 = 0.54.
Hence the combined rule "I before E, except after C" is NOT PLAUSIBLE.

For http://ucrel.lancs.ac.uk/bncfreq/lists/1_2_all_freq.txt:
"I before E when not preceded by C" is NOT PLAUSIBLE,
  since the ratio of positive to negative examples is 8207/4826 = 1.70.
"E before I when preceded by C" is NOT PLAUSIBLE,
  since the ratio of positive to negative examples is 327/994 = 0.33.
Hence the combined rule "I before E, except after C" is NOT PLAUSIBLE.


FROM InOut IMPORT WriteString, WriteCard, WriteLn;
FROM Strings IMPORT Pos;

VAR words, cie, cei, xie, xei: CARDINAL;
    xie_plausible, cei_plausible: BOOLEAN;
    VAR end: CARDINAL;
    end := Pos("", word);
    IF Pos("ie", word) # end THEN
        IF Pos("cie", word) # end 
        THEN INC(cie);
        ELSE INC(xie);
    ELSIF Pos("ei", word) # end THEN
        IF Pos("cei", word) # end
        THEN INC(cei);
        ELSE INC(xei);
END Classify;

PROCEDURE ProcessFile(filename: ARRAY OF CHAR);
    VAR file: SeqIO.FILE;
        dict: Texts.TEXT;
        word: ARRAY [0..63] OF CHAR;
        fs: SeqIO.FileState;
        ts: Texts.TextState;
    fs := SeqIO.Open(file, filename);
    ts := Texts.Connect(dict, file);
    WHILE NOT Texts.EOT(dict) DO
        Texts.ReadLn(dict, word);
    ts := Texts.Disconnect(dict);
    fs := SeqIO.Close(file);
END ProcessFile;

    WriteString(": ");
    WriteCard(num, 0);
END WriteStat;

PROCEDURE Plausible(feature: ARRAY OF CHAR; match, nomatch: CARDINAL): BOOLEAN;
    VAR plausible: BOOLEAN;
    WriteString(": ");
    plausible := 2 * match > nomatch;
    IF NOT plausible THEN
        WriteString("not ");
    RETURN plausible;
END Plausible;

    words := 0;
    cie := 0;
    cei := 0;
    xie := 0;
    xei := 0;
    WriteStat("Amount of words", words);
    WriteStat("CIE", cie);
    WriteStat("xIE", xie);
    WriteStat("CEI", cei);
    WriteStat("xEI", xei);
    xie_plausible := 
        Plausible("I before E when not preceded by C", xie, cie);
    cei_plausible :=
        Plausible("E before I when preceded by C", cei, xei);
    WriteString("I before E, except after C: ");
    IF NOT (xie_plausible AND cei_plausible) THEN
        WriteString("not ");
Amount of words: 50209
CIE: 24
xIE: 465
CEI: 13
xEI: 209

I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.
I before E, except after C: not plausible.


import httpclient, strutils, strformat

const
  Rule1 = "\"I before E when not preceded by C\""
  Rule2 = "\"E before I when preceded by C\""
  Phrase = "\"I before E except after C\""
  PlausibilityText: array[bool, string] = ["not plausible", "plausible"]

proc plausibility(rule: string; count1, count2: int): bool =
  ## Compute, display and return plausibility.
  result = count1 > 2 * count2
  stdout.write &"The rule {rule} is {PlausibilityText[result]}: "
  echo &"there were {count1} examples and {count2} counter-examples."

let client = newHttpClient()

var nie, cie, nei, cei = 0
for word in client.getContent("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt").split():
  if word.contains("ie"):
    if word.contains("cie"):
      inc cie
    else:
      inc nie
  if word.contains("ei"):
    if word.contains("cei"):
      inc cei
    else:
      inc nei

let p1 = plausibility(Rule1, nie, nei)
let p2 = plausibility(Rule2, cei, cie)
echo &"So the phrase {Phrase} is {PlausibilityText[p1 and p2]}."
The rule "I before E when not preceded by C" is plausible: there were 465 examples and 213 counter-examples.
The rule "E before I when preceded by C" is not plausible: there were 13 examples and 24 counter-examples.
So the phrase "I before E except after C" is not plausible.


Translation of: Seed7
use HTTP;
use Collection;

class HttpTest {
  function : Main(args : String[]) ~ Nil {

  function : PlausibilityCheck(comment : String, x : Int, y : Int) ~ Bool {
    ratio := x->As(Float) / y->As(Float);
    "  Checking plausibility of: {$comment}"->PrintLine();
    if(x > 2 * y) {
      "    PLAUSIBLE. As we have counts of {$x} vs {$y} words, a ratio of {$ratio} times"->PrintLine();
    else if(x > y) {
      "    IMPLAUSIBLE. As although we have counts of {$x} vs {$y} words, a ratio of {$ratio} times does not make it plausible"->PrintLine();
    else {
      "    IMPLAUSIBLE, probably contra-indicated. As we have counts of {$x} vs {$y} words, a ratio of {$ratio} times"->PrintLine();

    return x > 2 * y;

  function : IsPlausibleRule(url : String) ~ Nil {
    truecount := 0;
    falsecount := 0;

    client := HttpClient->New();
    data := client->Get(url)->Get(0)->As(String);
    data := data->ToLower();
    words := data->Split("\n");

    cie := Count("cie", words);
    cei := Count("cei", words);
    not_c_ie := Count("ie", words) - cie;
    not_c_ei := Count("ei", words) - cei;

    "Checking plausibility of \"I before E except after C\":"->PrintLine();
    if(PlausibilityCheck("I before E when not preceded by C", not_c_ie, not_c_ei) &
        PlausibilityCheck("E before I when preceded by C", cei, cie)) {
      "OVERALL IT IS PLAUSIBLE!"->PrintLine();
    else {
      "OVERALL IT IS IMPLAUSIBLE!"->PrintLine();
      "(To be plausible, one word count must exceed another by 2 times)"->PrintLine();

  function : Count(check: String, words : String[]) ~ Int {
    count := 0;

    each(i : words) {
      if(words[i]->Find(check) > -1) {
        count += 1;

    return count;
Checking plausibility of "I before E except after C":
  Checking plausibility of: I before E when not preceded by C
    PLAUSIBLE. As we have counts of 465 vs 213 words, a ratio of 2.183 times
  Checking plausibility of: E before I when preceded by C
            IMPLAUSIBLE, probably contra-indicated. As we have counts of 13 vs 24 words, a ratio of 0.542 times
(To be plausible, one word count must exceed another by 2 times)


This example is incomplete. Is the original phrase plausible? Please ensure that it meets all task requirements and remove this message.
function i_before_e_except_after_c(f)

fid = fopen(f,'r');
nei = 0;
cei = 0;
nie = 0;
cie = 0;
while ~feof(fid)
	c = strsplit(strtrim(fgetl(fid)),char([9,32]));
	if length(c) > 2, 
		n = str2num(c{3});
		n = 1;
	if strfind(c{1},'ei')>1, nei=nei+n; end;
	if strfind(c{1},'cei'),  cei=cei+n; end;
	if strfind(c{1},'ie')>1, nie=nie+n; end;
	if strfind(c{1},'cie'),  cie=cie+n; end;

printf('cie: %i\nnie: %i\ncei: %i\nnei: %i\n',cie,nie-cie,cei,nei-cei);
v = '';
if (nie < 3 * cie)
	v=' not';
printf('I before E when not preceded by C: is%s plausible\n',v);
v = '';
if (nei > 3 * cei) 
	v=' not';
printf('E before I when preceded by C: is%s plausible\n',v);
octave:23> i_before_e_except_after_c 1_2_all_freq.txt 
cie: 994
nie: 8133
cei: 327
nei: 4274
I before E when not preceded by C: is plausible
E before I when preceded by C: is not plausible
octave:24> i_before_e_except_after_c unixdict.txt
cie: 24
nie: 464
cei: 13
nei: 191
I before E when not preceded by C: is plausible
E before I when preceded by C: is not plausible


use warnings;
use strict;

sub result {
    my ($support, $against) = @_;
    my $ratio  = sprintf '%.2f', $support / $against;
    my $result = $ratio >= 2;
    print "$support / $against = $ratio. ", 'NOT ' x !$result, "PLAUSIBLE\n";
    return $result;
}

my @keys  = qw(ei cei ie cie);
my %count;

while (<>) {
    for my $k (@keys) {
        $count{$k}++ if -1 != index $_, $k;
    }
}

my ($support, $against, $result);

print 'I before E when not preceded by C: ';
$support = $count{ie} - $count{cie};
$against = $count{ei} - $count{cei};
$result += result($support, $against);

print 'E before I when preceded by C: ';
$support = $count{cei};
$against = $count{cie};
$result += result($support, $against);

print 'Overall: ', 'NOT ' x ($result < 2), "PLAUSIBLE.\n";
I before E when not preceded by C: 465 / 213 = 2.18. PLAUSIBLE
E before I when preceded by C: 13 / 24 = 0.54. NOT PLAUSIBLE

Perl: Stretch Goal

Just replace the while loop with the following one:

while (<>) {
    my @columns = split;
    next if 3 < @columns;
    my ($word, $freq) = @columns[0, 2];
    for my $k (@keys) {
        $count{$k} += $freq if -1 != index $word, $k;
    }
}
I before E when not preceded by C: 8148 / 4826 = 1.69. NOT PLAUSIBLE
E before I when preceded by C: 327 / 994 = 0.33. NOT PLAUSIBLE


Kept dirt simple; it is difficult to imagine any other approach being faster than this.

-- demo\rosetta\IbeforeE.exw
with javascript_semantics
procedure show_plausibility(string msg, integer w, wo)
    string no = iff(w<2*wo?" not":"")
    printf(1, "%s (pro: %3d, anti: %3d) is%s plausible\n",{msg,w,wo,no})
end procedure

string text = join(unix_dict())
-- Note: my unixdict.txt begins with "10th" and ends with "zygote", so 
-- boundary checks such as "i>=2 and i+1<=length(text)" can be skipped.
integer cei=0, xei=0, cie=0, xie=0
for i=1 to length(text) do
    if text[i]='i' then
        if text[i-1]='e' then
            if text[i-2]='c' then
                cei += 1
            else
                xei += 1
            end if
        end if
        -- (nb not elsif here; "eie" occurs twice)
        if text[i+1]='e' then
            if text[i-1]='c' then
                cie += 1
            else
                xie += 1
            end if
        end if
    end if
end for
printf(1,"occurances: cie:%d, xie:%d, cei:%d, xei:%d\n", {cie,xie,cei,xei})
show_plausibility( "i before e except after c", xie, cie );
show_plausibility( "e before i except after c", xei, cei );
show_plausibility( "i before e   when after c", cie, cei );
show_plausibility( "e before i   when after c", cei, cie );
show_plausibility( "i before e     in general", xie + cie, xei + cei );
show_plausibility( "e before i     in general", xei + cei, xie + cie )

Although the output matches, I decided to use different metrics from ALGOL 68 for the middle two conclusions.
I am not confident these are meaningful/correct logical inferences anyway, but the raw numbers are right.
(Being told ib4eeac is more often wrong than right has quite clearly made me start to doubt myself.)

occurances: cie:24, xie:466, cei:13, xei:217
i before e except after c (pro: 466, anti:  24) is plausible
e before i except after c (pro: 217, anti:  13) is plausible
i before e   when after c (pro:  24, anti:  13) is not plausible
e before i   when after c (pro:  13, anti:  24) is not plausible
i before e     in general (pro: 490, anti: 230) is plausible
e before i     in general (pro: 230, anti: 490) is not plausible
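
For comparison, the task's own test ("more than two times") can be applied directly to the raw counts printed above. The following Python sketch is only an illustration: the `plausible` helper and the dictionary layout are assumptions made for this example and are not part of the Phix program.

```python
# Raw counts copied from the Phix output above.
counts = {"cie": 24, "xie": 466, "cei": 13, "xei": 217}

def plausible(pro, anti):
    """Task definition: the supporting count must be more than twice the opposing count."""
    return pro > 2 * anti

# Sub-rule 1: "ie" not after c, against "ei" not after c (466 vs 2*217 = 434).
rule1 = plausible(counts["xie"], counts["xei"])
# Sub-rule 2: "cei" against "cie" (13 vs 2*24 = 48).
rule2 = plausible(counts["cei"], counts["cie"])

print("I before E when not preceded by C:", rule1)
print("E before I when preceded by C:", rule2)
print("I before E, except after C:", rule1 and rule2)
```

With this pairing the first sub-rule passes and the second fails, so the overall phrase is not plausible, whichever of the other metrics are preferred.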


main =>
  Words = read_file_lines("unixdict.txt"),
  IEWords = [Word : Word in Words, find(Word,"ie",_,_)],
  EIWords = [Word : Word in Words, find(Word,"ei",_,_)],  

  % cie vs not cie
  [CIE_len, CIE_not_len] = partition_len(IEWords,"cie"),

  % cei vs not cei
  [CEI_len, CEI_not_len] = partition_len(EIWords,"cei"),

  printf("I before E when not preceeded by C (%d vs %d): %w\n",
  printf("E before I when preceeded by C (%d cs %d): %w\n",

plausible(Len1,Len2) = cond(Len1 / Len2 > 2,"plausible","not plausible").

partition_len(Words,Sub) = [True.len, False.len] =>
  True = [],
  False = [],
  foreach(Word in Words)
    if find(Word,Sub,_,_) then
      True := [Word|True]
    else
      False := [Word|False]
    end
  end.
[cie = 24,cie_not = 465]
[cei = 13,cei_not = 213]

I before E when not preceeded by C (465 vs 213): plausible
E before I when preceeded by C (13 cs 24): not plausible


(de ibEeaC (File . Prg)
      (Cie (let N 0 (in File (while (from "cie") (run Prg))))
         Nie (let N 0 (in File (while (from "ie") (run Prg))))
         Cei (let N 0 (in File (while (from "cei") (run Prg))))
         Nei (let N 0 (in File (while (from "ei") (run Prg)))) )
      (prinl "cie: " Cie)
      (prinl "nie: " (dec 'Nie Cie))
      (prinl "cei: " Cei)
      (prinl "nei: " (dec 'Nei Cei))
      (let (NotI (> (* 3 Cie) Nie)  NotE (> Nei (* 3 Cei)))
            "I before E except after C: is"
            (and NotI " not")
            " plausible" )
            "E before I when after C: is"
            (and NotE " not")
            " plausible" )
            "Overall rule is"
            (and (or NotI NotE) " not")
            " plausible" ) ) ) )

(ibEeaC "unixdict.txt"
   (inc 'N) )


(ibEeaC "1_2_all_freq.txt"
   (inc 'N (format (stem (line) "\t"))) )


cie: 24
nie: 466
cei: 13
nei: 217
I before E except after C: is plausible
E before I when after C: is not plausible
Overall rule is not plausible

cie: 994
nie: 8148
cei: 327
nei: 4826
I before E except after C: is plausible
E before I when after C: is not plausible
Overall rule is not plausible


iBeforeE: procedure options(main);
    declare dict file;
    open file(dict) title('unixdict.txt');
    on endfile(dict) go to report;
    declare (cie, xie, cei, xei) fixed;
    declare word char(32) varying;
    cie = 0;
    xie = 0;
    cei = 0;
    xei = 0;
    do while('1'b);
        get file(dict) list(word);
        if index(word, 'ie') ^= 0 then
            if index(word, 'cie') ^= 0 then
                cie = cie + 1;
            else
                xie = xie + 1;
        if index(word, 'ei') ^= 0 then
            if index(word, 'cei') ^= 0 then
                cei = cei + 1;
            else
                xei = xei + 1;
    end;

report:
    close file(dict);
    put skip list('CIE:', cie);
    put skip list('xIE:', xie);
    put skip list('CEI:', cei);
    put skip list('xEI:', xei);
    declare (ieNotC, eiC) bit;
    ieNotC = xie * 2 > cie;
    eiC = cei * 2 > xei;

    put skip list('I before E when not preceded by C:');
    if ^ieNotC then put list('not');
    put list('plausible.');

    put skip list('E before I when preceded by C:');
    if ^eiC then put list('not');
    put list('plausible.');

    put skip list('I before E, except after C:');
    if ^(ieNotC & eiC) then put list('not');
    put list('plausible.');
end iBeforeE;
CIE:        24
xIE:       465
CEI:        13
xEI:       213
I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.
I before E, except after C: not plausible.


$Web = New-Object -TypeName Net.Webclient
$Words = $web.DownloadString('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt')
$IE = $EI = $CIE = $CEI = @()
$Clause1 = $Clause2 = $MainClause = $false
foreach ($Word in $Words.split())
{
    switch ($Word)
    {
        {($_ -like '*ie*') -and ($_ -notlike '*cie*')} {$IE += $Word}
        {($_ -like '*ei*') -and ($_ -notlike '*cei*')} {$EI += $Word}
        {$_ -like '*cei*'} {$CEI += $Word}
        {$_ -like '*cie*'} {$CIE += $Word}
    }
}
if ($IE.count -gt $EI.count * 2)
{$Clause1 = $true}
"The plausibility of 'I before E when not preceded by C' is $Clause1"
if ($CEI.count -gt $CIE.count * 2)
{$Clause2 = $true}
"The plausibility of 'E before I when preceded by C' is $Clause2"
if ($Clause1 -and $Clause2)
{$MainClause = $True}
"The plausibility of the phrase 'I before E except after C' is $MainClause"
The plausibility of 'I before E when not preceded by C' is True
The plausibility of 'E before I when preceded by C' is False
The plausibility of the phrase 'I before E except after C' is False

Alternative Implementation

$Web = New-Object -TypeName Net.Webclient
$Words = $web.DownloadString('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt')
$IE = $EI = $CIE = $CEI = @()
$Clause1 = $Clause2 = $MainClause = $false
foreach ($Word in $Words.split())
{
    switch ($Word)
    {
        {$_ -like '*cei*'} {$CEI += $Word; break}
        {$_ -like '*cie*'} {$CIE += $Word; break}
        {$_ -like '*ie*'}  {$IE += $Word}
        {$_ -like '*ei*'}  {$EI += $Word}
    }
}
if ($IE.count -gt $EI.count * 2)
{$Clause1 = $true}
"The plausibility of 'I before E when not preceded by C' is $Clause1"
if ($CEI.count -gt $CIE.count * 2)
{$Clause2 = $true}
"The plausibility of 'E before I when preceded by C' is $Clause2"
if ($Clause1 -and $Clause2)
{$MainClause = $True}
"The plausibility of the phrase 'I before E except after C' is $MainClause"
The plausibility of 'I before E when not preceded by C' is True
The plausibility of 'E before I when preceded by C' is False
The plausibility of the phrase 'I before E except after C' is False

Alternative Implementation 2

A single pass through the wordlist using the regex engine.

$webResult = Invoke-WebRequest -Uri http://wiki.puzzlers.org/pub/wordlists/unixdict.txt -UseBasicParsing

$cie, $cei, $_ie, $_ei = 0, 0, 0, 0

[regex]::Matches($webResult.Content, '.(ie|ei)').foreach{
  if     ($_.Value    -eq 'cie') { $cie+=2 }
  elseif ($_.Value    -eq 'cei') { $cei++  }
  elseif ($_.Value[1] -eq  'i' ) { $_ie++  }
  else                           { $_ei+=2 }
}

"I before E when not preceded by C is plausible: $($_ie -gt $_ei)"
"E before I when preceded by C is plausible: $($cei -gt $cie)"
"I before E, except after C is plausible: $(($_ie -gt $_ei) -and ($cei -gt $cie))"
I before E when not preceded by C is plausible: True
E before I when preceded by C is plausible: False
I before E, except after C is plausible: False
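
The single-pass regex idea described above can also be sketched in Python. This is a minimal illustration rather than a translation of the PowerShell code: the names `tally` and `PAIR` and the sample word list are made up for the example, and a lookahead is used (an assumption of this sketch) so that overlapping pairs such as the "eie" in a word are both counted.

```python
import re
from collections import Counter

# Zero-width lookahead: finds every "ie"/"ei", including overlapping ones.
PAIR = re.compile(r'(?=(ie|ei))')

def tally(text):
    """Count cie, cei, xie and xei occurrences in one pass over text."""
    counts = Counter(cie=0, cei=0, xie=0, xei=0)
    for m in PAIR.finditer(text):
        pair = m.group(1)
        # Character before the pair; empty at the very start of the text.
        prev = text[m.start() - 1] if m.start() > 0 else ''
        key = ('c' if prev == 'c' else 'x') + pair
        counts[key] += 1
    return counts

def plausible(pro, anti):
    """Task definition: more than two times the opposite count."""
    return pro > 2 * anti

# Hypothetical sample standing in for the downloaded word list.
sample = "believe\nreceive\nscience\nfierce\nfriend\nweird\npiece\n"
counts = tally(sample)
print(dict(counts))
print("xie vs xei plausible:", plausible(counts["xie"], counts["xei"]))
print("cei vs cie plausible:", plausible(counts["cei"], counts["cie"]))
```

Keeping the factor of two inside `plausible` leaves the tallies as plain occurrence counts, at the cost of one extra multiplication per comparison.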


If ReadFile(1,GetPathPart(ProgramFilename())+"wordlist(en).txt")
  While Not Eof(1)

PrintN("Number of words in [wordlist(en).txt]: "+CountString(wl$,";"))
cei.i=CountString(wl$,"cei") : PrintN("Instances of [cei]                   : "+Str(cei))
cie.i=CountString(wl$,"cie") : PrintN("Instances of [cie]                   : "+Str(cie))
Print("Rule: 'e' before 'i' when preceded by 'c' is = ")
If cei>cie : PrintN("plausible") : Else : PrintN("not plausible") : EndIf
wl$=RemoveString(wl$,"cei")  : wl$=RemoveString(wl$,"cie")
ei.i=CountString(wl$,"ei")   : PrintN("Instances of [*ei] '*'<>'c'          : "+Str(ei))
ie.i=CountString(wl$,"ie")   : PrintN("Instances of [*ie] '*'<>'c'          : "+Str(ie))
Print("Rule: 'i' before 'e' when not preceded by 'c' is = ")
If ie>ei : PrintN("plausible") : Else : PrintN("not plausible") : EndIf
Print("Overall the rule is : ")
If cei>cie And ie>ei : PrintN("PLAUSIBLE") : Else : PrintN("NOT PLAUSIBLE") :  EndIf
Number of words in [wordlist(en).txt]: 25104
Instances of [cei]                   : 13
Instances of [cie]                   : 24
Rule: 'e' before 'i' when preceded by 'c' is = not plausible

Instances of [*ei] '*'<>'c'          : 217
Instances of [*ie] '*'<>'c'          : 466
Rule: 'i' before 'e' when not preceded by 'c' is = plausible

Overall the rule is : NOT PLAUSIBLE


import urllib.request
import re

PLAUSIBILITY_RATIO = 2


def plausibility_check(comment, x, y):
    print('\n  Checking plausibility of: %s' % comment)
    if x > PLAUSIBILITY_RATIO * y:
        print('    PLAUSIBLE. As we have counts of %i vs %i, a ratio of %4.1f times'
              % (x, y, x / y))
    else:
        if x > y:
            print('    IMPLAUSIBLE. As although we have counts of %i vs %i, a ratio of %4.1f times does not make it plausible'
                  % (x, y, x / y))
        else:
            print('    IMPLAUSIBLE, probably contra-indicated. As we have counts of %i vs %i, a ratio of %4.1f times'
                  % (x, y, x / y))
    return x > PLAUSIBILITY_RATIO * y

def simple_stats(url='http://wiki.puzzlers.org/pub/wordlists/unixdict.txt'):
    words = urllib.request.urlopen(url).read().decode().lower().split()
    cie = len({word for word in words if 'cie' in word})
    cei = len({word for word in words if 'cei' in word})
    not_c_ie = len({word for word in words if re.search(r'(^ie|[^c]ie)', word)})
    not_c_ei = len({word for word in words if re.search(r'(^ei|[^c]ei)', word)})
    return cei, cie, not_c_ie, not_c_ei

def print_result(cei, cie, not_c_ie, not_c_ei):
    if ( plausibility_check('I before E when not preceded by C', not_c_ie, not_c_ei)
         & plausibility_check('E before I when preceded by C', cei, cie) ):
        print('\nOVERALL IT IS PLAUSIBLE!')
    else:
        print('\nOVERALL IT IS IMPLAUSIBLE!')
    print('(To be plausible, one count must exceed another by %i times)' % PLAUSIBILITY_RATIO)

print('Checking plausibility of "I before E except after C":')
print_result(*simple_stats())
Checking plausibility of "I before E except after C":

  Checking plausibility of: I before E when not preceded by C
    PLAUSIBLE. As we have counts of 465 vs 213, a ratio of  2.2 times

  Checking plausibility of: E before I when preceded by C
    IMPLAUSIBLE, probably contra-indicated. As we have counts of 13 vs 24, a ratio of  0.5 times

(To be plausible, one count must exceed another by 2 times)

Python: Stretch Goal

Add the following to the bottom of the previous program:

def stretch_stats(url='http://ucrel.lancs.ac.uk/bncfreq/lists/1_2_all_freq.txt'):
    freq = [line.strip().lower().split()
            for line in urllib.request.urlopen(url)
            if len(line.strip().split()) == 3]
    wordfreq = [(word.decode(), int(frq))
                for word, pos, frq in freq[1:]
                if (b'ie' in word) or (b'ei' in word)]
    cie = sum(frq for word, frq in wordfreq if 'cie' in word)
    cei = sum(frq for word, frq in wordfreq if 'cei' in word)
    not_c_ie = sum(frq for word, frq in wordfreq if re.search(r'(^ie|[^c]ie)', word))
    not_c_ei = sum(frq for word, frq in wordfreq if re.search(r'(^ei|[^c]ei)', word))
    return cei, cie, not_c_ie, not_c_ei

print('\n\nChecking plausibility of "I before E except after C"')
print('And taking account of word frequencies in British English:')
print_result(*stretch_stats())
Produces this extra output:
Checking plausibility of "I before E except after C"
And taking account of word frequencies in British English:

  Checking plausibility of: I before E when not preceded by C
    IMPLAUSIBLE. As although we have counts of 8192 vs 4826, a ratio of  1.7 times does not make it plausible

  Checking plausibility of: E before I when preceded by C
    IMPLAUSIBLE, probably contra-indicated. As we have counts of 327 vs 994, a ratio of  0.3 times

(To be plausible, one count must exceed another by 2 times)


Translation of: BASIC
    LINE INPUT #1, W
    IF INSTR(W, "ie") THEN IF INSTR(W, "cie") THEN CI = CI + 1 ELSE XI = XI + 1
    IF INSTR(W, "ei") THEN IF INSTR(W, "cei") THEN CE = CE + 1 ELSE XE = XE + 1

PRINT "I before E when not preceded by C: ";
IF 2 * XI <= CI THEN PRINT "not ";
PRINT "plausible."
PRINT "E before I when preceded by C: ";
IF 2 * CE <= XE THEN PRINT "not ";
PRINT "plausible."


words = tolower(readLines("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt"))
ie.npc = sum(grepl("(?<!c)ie", words, perl = T))
ei.npc = sum(grepl("(?<!c)ei", words, perl = T))
ie.pc = sum(grepl("cie", words, fixed = T))
ei.pc = sum(grepl("cei", words, fixed = T))

p1 = ie.npc > 2 * ei.npc
p2 = ei.pc > 2 * ie.pc

message("(1) is ", (if (p1) "" else "not "), "plausible.")
message("(2) is ", (if (p2) "" else "not "), "plausible.")
message("The whole phrase is ", (if (p1 && p2) "" else "not "), "plausible.")
(1) is plausible.
(2) is not plausible.
The whole phrase is not plausible.


#lang racket

(define (get-tallies filename line-parser . patterns)
  (for/fold ([totals (make-list (length patterns) 0)])
    ([line (file->lines filename)])
    (match-let ([(list word n) (line-parser line)])
      (for/list ([p patterns] [t totals])
        (if (regexp-match? p word) 
            (+ n t) t)))))

(define (plausible test) (string-append (if test "" "IM") "PLAUSIBLE"))

(define (subrule description examples counters)
  (let ([result (> examples (* 2 counters))])
    (printf "  The sub-rule \"~a\" is ~a.  There were ~a examples and ~a counter-examples.\n" 
            description (plausible result) examples counters)
    result))

(define (plausibility description filename parser)
  (printf "~a:\n" description)
  (match-let ([(list cei cie ie ei) (get-tallies filename parser "cei" "cie" "ie" "ei")])
    (let ([rule1 (subrule "I before E when not preceded by C" (- ie cie) (- ei cei))]
          [rule2 (subrule "E before I when preceded by C" cei cie)])
      (printf "\n  Overall, the rule \"I before E, except after C\" is ~a.\n"
              (plausible (and rule1 rule2))))))

(define (parse-frequency-data line)
  (let ([words (string-split line)])
    (list (string-join (drop-right words 2)) (string->number (last words)))))

(plausibility "Dictionary" "unixdict.txt" (λ (line) (list line 1))) (newline)
(plausibility "Word frequencies (stretch goal)" "1_2_all_freq.txt" parse-frequency-data)
Dictionary:
  The sub-rule "I before E when not preceded by C" is PLAUSIBLE.  There were 465 examples and 213 counter-examples.
  The sub-rule "E before I when preceded by C" is IMPLAUSIBLE.  There were 13 examples and 24 counter-examples.

  Overall, the rule "I before E, except after C" is IMPLAUSIBLE.

Word frequencies (stretch goal):
  The sub-rule "I before E when not preceded by C" is IMPLAUSIBLE.  There were 8163 examples and 4826 counter-examples.
  The sub-rule "E before I when preceded by C" is IMPLAUSIBLE.  There were 327 examples and 994 counter-examples.

  Overall, the rule "I before E, except after C" is IMPLAUSIBLE.


(formerly Perl 6) This solution uses grammars and actions to parse the given file, the Bag for tallying up occurrences of each possible thing we're looking for ("ie", "ei", "cie", and "cei"), and junctions to determine the plausibility of a phrase from the subphrases. Note that a version of rakudo newer than the January 2014 compiler or Star releases is needed, as this code relies on a recent bugfix to the make function.

grammar CollectWords {
    token TOP {
        [^^ <word> $$ \n?]+
    }

    token word {
        [ <with_c> | <no_c> | \N ]+
    }

    token with_c {
        c <ie_part>
    }

    token no_c {
        <ie_part>
    }

    token ie_part {
        ie | ei | eie # a couple words in the list have "eie"
    }
}

class CollectWords::Actions {
    method TOP($/) {
        make $<word>».ast.flat.Bag;
    }

    method word($/) {
        if $<with_c> + $<no_c> {
            make flat $<with_c>».ast, $<no_c>».ast;
        } else {
            make ();
        }
    }

    method with_c($/) {
        make "c" X~ $<ie_part>.ast;
    }

    method no_c($/) {
        make "!c" X~ $<ie_part>.ast;
    }

    method ie_part($/) {
        if ~$/ eq 'eie' {
            make ('ei', 'ie');
        } else {
            make ~$/;
        }
    }
}

sub plausible($good, $bad, $msg) {
    if $good > 2*$bad {
        say "$msg: PLAUSIBLE ($good  vs. $bad ✘)";
        return True;
    } else {
        say "$msg: NOT PLAUSIBLE ($good  vs. $bad ✘)";
        return False;
    }
}
my $results = CollectWords.parsefile("unixdict.txt", :actions(CollectWords::Actions)).ast;

my $phrasetest = [&] plausible($results<!cie>, $results<!cei>, "I before E when not preceded by C"),
                     plausible($results<cei>, $results<cie>, "E before I when preceded by C");

say "I before E except after C: ", $phrasetest ?? "PLAUSIBLE" !! "NOT PLAUSIBLE";
I before E when not preceded by C: PLAUSIBLE (466  vs. 217 ✘)
E before I when preceded by C: NOT PLAUSIBLE (13  vs. 24 ✘)
I before E except after C: NOT PLAUSIBLE

Raku: Stretch Goal

Note that within the original text file, a tab character was erroneously replaced with a space. Thus, the following changes to the text file are needed before this solution will run:

--- orig_1_2_all_freq.txt	2014-02-01 14:36:53.124121018 -0800
+++ 1_2_all_freq.txt	2014-02-01 14:37:10.525552980 -0800
@@ -2488,7 +2488,7 @@
 	other than	Prep	43
 	visited	Verb	43
 	cross	NoC	43
-	lie Verb	43
+	lie	Verb	43
 	grown	Verb	43
 	crowd	NoC	43
 	recognised	Verb	43

This solution requires just a few modifications to the grammar and actions from the non-stretch goal.

grammar CollectWords {
    token TOP {
        ^^ \t Word \t PoS \t Freq $$ \n
        [^^ <word> $$ \n?]+
    }

    token word {
        \t+
        [ <with_c> | <no_c> | \T ]+ \t+
        \T+ \t+ # PoS doesn't matter to us, so ignore it
        $<freq>=[<.digit>+] \h*
    }

    token with_c {
        c <ie_part>
    }

    token no_c {
        <ie_part>
    }

    token ie_part {
        ie | ei
    }
}

class CollectWords::Actions {
    method TOP($/) {
        make $<word>».ast.flat.Bag;
    }

    method word($/) {
        if $<with_c> + $<no_c> {
            make flat $<with_c>».ast xx +$<freq>, $<no_c>».ast xx +$<freq>;
        } else {
            make ();
        }
    }

    method with_c($/) {
        make "c" ~ $<ie_part>;
    }

    method no_c($/) {
        make "!c" ~ $<ie_part>;
    }
}

sub plausible($good, $bad, $msg) {
    if $good > 2*$bad {
        say "$msg: PLAUSIBLE ($good  vs. $bad ✘)";
        return True;
    } else {
        say "$msg: NOT PLAUSIBLE ($good  vs. $bad ✘)";
        return False;
    }
}
# can't use .parsefile like before due to the non-Unicode £ in this file.
my $file = slurp("1_2_all_freq.txt", :enc<iso-8859-1>);
my $results = CollectWords.parse($file, :actions(CollectWords::Actions)).ast;

my $phrasetest = [&] plausible($results<!cie>, $results<!cei>, "I before E when not preceded by C"),
                     plausible($results<cei>, $results<cie>, "E before I when preceded by C");

say "I before E except after C: ", $phrasetest ?? "PLAUSIBLE" !! "NOT PLAUSIBLE";
I before E when not preceded by C: NOT PLAUSIBLE (8222  vs. 4826 ✘)
E before I when preceded by C: NOT PLAUSIBLE (327  vs. 994 ✘)
I before E except after C: NOT PLAUSIBLE


The script processes both the task and the stretch goal. In the stretch goal, "rows with three space or tab separated words only" (7574 out of 7726) are processed, excluding all expressions like "out of".

Red ["i before e except after c"]

testlist: function [wordlist /wfreq] [
	cie: cei: ie: ei: 0
	if not wfreq [forall wordlist [insert wordlist: next wordlist 1]]
	foreach [word freq] wordlist [
		parse word [ some [
			"cie" (cie: cie + freq)	|
			"cei" (cei: cei + freq)	|
			"ie"  (ie: ie + freq)	|
			"ei"  (ei: ei + freq)	|
			skip
		]]
	]
	print rejoin [
	"i is before e " ie " times, and also " cie " times following c.^/"
	"i is after e " ei " times, and also " cei " times following c.^/"
	"Hence ^"i before e^" is " either a: 2 * ei < ie [""] ["not "] "plausible,^/"
	"while ^"except after c^" is " either b: 2 * cie < cei [""] ["not "] "plausible.^/"
	"Overall the rule is " either a and b [""] ["not "] "plausible."]
]

print "Results for unixdict.txt:"
testlist read/lines http://wiki.puzzlers.org/pub/wordlists/unixdict.txt

print "^/Results for British National Corpus:"
bnc: next read/lines %1_2_all_freq.txt
spaces: charset "^- "
bnclist: collect [ foreach w bnc [	
	if 3 = length? seq: split trim w spaces [
		keep seq/1 keep to-integer seq/3
	]
]]
testlist/wfreq bnclist
Results for unixdict.txt:
i is before e 464 times, and also 24 times following c.
i is after e 217 times, and also 13 times following c.
Hence "i before e" is plausible,
while "except after c" is not plausible.
Overall the rule is not plausible.

Results for British National Corpus:
i is before e 8207 times, and also 994 times following c.
i is after e 4826 times, and also 327 times following c.
Hence "i before e" is not plausible,
while "except after c" is not plausible.
Overall the rule is not plausible.
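
The row filter described for the stretch goal (keep only rows with exactly three space or tab separated fields, then weight each match by the frequency) can be sketched in Python as follows. This is a minimal illustration, not a translation of the Red program: the function name weighted_tally and the sample rows are hypothetical, and the ordered alternation cie|cei|ie|ei is assumed to mirror the order of the alternatives in the Red parse rule.

```python
import re

# Ordered alternation: the three-letter forms are tried before the two-letter ones,
# so "ie" and "ei" only count occurrences that are not consumed as "cie"/"cei".
PATTERN = re.compile(r"cie|cei|ie|ei")

def weighted_tally(rows):
    """Tally cie/cei/ie/ei occurrences, weighted by the frequency column."""
    counts = {"cie": 0, "cei": 0, "ie": 0, "ei": 0}
    for row in rows:
        fields = row.split()              # split on any run of spaces or tabs
        if len(fields) != 3 or not fields[2].isdigit():
            continue                      # skips phrases such as "other than" and the header
        word, freq = fields[0].lower(), int(fields[2])
        for match in PATTERN.findall(word):
            counts[match] += freq
    return counts

# Hypothetical sample rows in the layout of the frequency list.
sample_rows = [
    "\tWord\tPoS\tFreq",
    "\tlie\tVerb\t43",
    "\tother than\tPrep\t43",
    "\treceive\tVerb\t10",
    "\tscience\tNoC\t5",
]
print(weighted_tally(sample_rows))
```

Rows whose word column contains a space split into more than three fields, which is how multi-word expressions drop out without any special handling.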


The following assumptions were made about the (default) dictionary:

  •   there could be leading and/or trailing blanks or tabs
  •   the dictionary words are in mixed case.
  •   there could be blank lines
  •   there may be more than one occurrence of a target string within a word   [einsteinium]
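
The assumptions above can be illustrated with a short Python sketch. It is not the REXX program: the function name tally and the keys cie/xie/cei/xei (x meaning "not preceded by C") are hypothetical, and surrounding whitespace is removed with strip() rather than by eliding all blanks.

```python
def tally(lines):
    """Count every occurrence of "ie"/"ei", split by whether a "c" precedes it."""
    counts = {"cie": 0, "xie": 0, "cei": 0, "xei": 0}
    for line in lines:
        word = line.strip().lower()       # leading/trailing blanks or tabs, mixed case
        if not word:
            continue                      # blank line
        for pair in ("ie", "ei"):
            start = word.find(pair)
            while start != -1:            # a word may contain the pair more than once
                after_c = start > 0 and word[start - 1] == "c"
                counts[("c" if after_c else "x") + pair] += 1
                start = word.find(pair, start + 1)
    return counts

# Hypothetical input exercising each assumption.
print(tally(["  Einsteinium\t", "", "RECEIVE", "science ", "friend"]))
```

Because each occurrence is counted, "einsteinium" contributes two to the "EI not preceded by C" total rather than one.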

unweighted version

/*REXX program shows  plausibility  of  "I before E"  when not preceded by C,  and      */
/*───────────────────────────────────── "E before I"  when     preceded by C.           */
parse arg iFID .                                 /*obtain optional argument from the CL.*/
if iFID=='' | iFID=="," then iFID='UNIXDICT.TXT' /*Not specified?  Then use the default.*/
#.=0                                             /*zero out the various word counters.  */
     do r=0  while  lines(iFID)\==0              /*keep reading the dictionary 'til done*/
     u=space( lineIn(iFID), 0);      upper u     /*elide superfluous blanks and tabs.   */
     if u==''  then iterate                      /*Is it a blank line?   Then ignore it.*/
     #.words=#.words + 1                         /*keep running count of number of words*/
     if pos('EI', u)\==0 & pos('IE', u)\==0  then #.both=#.both + 1  /*the word has both*/
     call find  'ie'                                                 /*look for   ie    */
     call find  'ei'                                                 /*  "   "    ei    */
     end   /*r*/                                 /*at exit of DO loop,   R = # of lines.*/

L=length(#.words)                                /*use this to align the output numbers.*/
say 'lines in the  '         iFID         " dictionary: "            r
say 'words in the  '         iFID         " dictionary: "            #.words
say 'words with "IE" and "EI" (in same word): '    right(#.both, L)
say 'words with "IE" and     preceded by "C": '    right(#.ie.c ,L)
say 'words with "IE" and not preceded by "C": '    right(#.ie.z ,L)
say 'words with "EI" and     preceded by "C": '    right(#.ei.c ,L)
say 'words with "EI" and not preceded by "C": '    right(#.ei.z ,L)
say;                         mantra= 'The spelling mantra  '
p1=#.ie.z / max(1, #.ei.z);  phrase= '"I before E when not preceded by C"'
say mantra phrase   ' is '   word("im", 1 + (p1>2) )'plausible.'
p2=#.ei.c / max(1, #.ie.c);  phrase= '"E before I when     preceded by C"'
say mantra phrase   ' is '   word("im", 1 + (p2>2) )'plausible.'
po=(p1>2 & p2>2);            say 'Overall, it is'    word("im", 1 + po)'plausible.'
exit                                             /*stick a fork in it,  we're all done. */
find: arg x;  s=1;  do forever;           _=pos(x, u, s);          if _==0  then return
                    if substr(u, _ - 1 + (_==1)*999, 1)=='C'  then #.x.c=#.x.c + 1
                                                              else #.x.z=#.x.z + 1
                    s=_ + 1                      /*handle the cases of multiple finds.  */
                    end   /*forever*/
output   when using the default dictionary:
lines in the   UNIXDICT.TXT  dictionary:  25104
words in the   UNIXDICT.TXT  dictionary:  25104

words with "IE" and "EI" (in same word):      4
words with "IE" and     preceded by "C":     24
words with "IE" and not preceded by "C":    466
words with "EI" and     preceded by "C":     13
words with "EI" and not preceded by "C":    217

The spelling mantra   "I before E when not preceded by C"  is  plausible.
The spelling mantra   "E before I when     preceded by C"  is  implausible.
Overall, it is implausible.

weighted version

Inspecting the default word frequency file revealed several irregularities:

  •   some "words" were in fact,   phrases
  •   some words were in the form of     x / y     indicating x OR y
  •   some words were in the form of     x/y       (with no blanks)   indicating x OR y,   or a word
  •   some words had a   ~   prefix
  •   some words had a   *   suffix
  •   some words had a   ~   suffix
  •   some words had a   ~   and   *   suffix
  •   one word had a   ~   prefix and a   ~   suffix
  •   some lines had an imbedded   [xxx]   comment
  •   some words had a   '   (quote)   prefix to indicate a:
        •   possessive
        •   plural
        •   contraction
        •   word   (as is)

None of the cases where an asterisk   [*]   or tilde   [~]   was used are handled programmatically by the REXX program;   it is assumed that these prefixes and suffixes indicate multiple words that either begin or end with (any) string   (or, in some cases, both).

A cursory look at the file seems to indicate that the use of the tilde and/or asterisk doesn't affect the rules for the mantra phrases.
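
One possible way to normalise such entries is sketched below in Python. It is an assumption-laden illustration rather than what the REXX program does: the hypothetical helper normalise_entry splits on every slash, keeps only the first token of each alternative, and strips tilde and asterisk markers outright (the REXX program instead leaves those markers in place).

```python
def normalise_entry(entry):
    """Return the distinct candidate words encoded by one word-column entry."""
    words = []
    for part in entry.split("/"):         # "x / y" and "x/y" both mean x OR y
        tokens = part.split()
        if not tokens:
            continue
        word = tokens[0].strip("~*").lower()   # drop ~ and * prefixes/suffixes
        if word and word not in words:    # skip an alternative identical to an earlier one
            words.append(word)
    return words

# Hypothetical entries covering the forms listed above.
for entry in ("a / an", "he/she", "~ing*", "other than"):
    print(entry, "->", normalise_entry(entry))
```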

/*REXX program shows  plausibility  of  "I before E"  when not preceded by C,  and      */
/*───────────────────────────────────── "E before I"  when     preceded by C,  using a  */
/*───────────────────────────────────── weighted frequency for each word.               */
parse arg iFID wFID .                            /*obtain optional arguments from the CL*/
if iFID=='' | iFID=="," then iFID='UNIXDICT.TXT' /*Not specified?  Then use the default.*/
if wFID=='' | wFID=="," then wFID='WORDFREQ.TXT' /* "      "         "   "   "     "    */
cntl=xrange(, ' ')                               /*get all manner of tabs, control chars*/
#.=0                                             /*zero out the various word counters.  */
f.=1                                             /*default word frequency multiplier.   */
    do recs=0  while lines(wFID)\==0             /*read a record from the file 'til done*/
    u=translate( linein(wFID), , cntl);  upper u /*translate various tabs and cntl chars*/
    u=translate(u, '*', "~")                     /*translate tildes (~)  to an asterisk.*/
    if u==''                 then iterate        /*Is this a blank line? Then ignore it.*/
    freq=word(u, words(u) )                      /*obtain the last token on the line.   */
    if \datatype(freq, 'W')  then iterate        /*FREQ not an integer?  Then ignore it.*/
    parse var  u   w.1  '/'  w.2  .              /*handle case of:   ααα/ßßß  ···       */

         do j=1  for 2;  w.j=word(w.j, 1)        /*strip leading and/or trailing blanks.*/
         _=w.j;   if _==''          then iterate /*if not present, then ignore it.      */
         if j==2  then if w.2==w.1  then iterate /*second word ≡ first word?  Then skip.*/
         #.freqs=#.freqs + 1                     /*bump word counter in the  FREQ  list.*/
         f._=f._ + freq                          /*add to a word's frequency count.     */
         end   /*ws*/
    end        /*recs*/                          /*at exit of DO loop, RECS = # of recs.*/

if    recs\==0  then say 'lines in the  '        wFID        "       list: "      recs
if #.freqs\==0  then say 'words in the  '        wFID        "       list: "      #.freqs
if #.freqs ==0  then weighted=
                else weighted= ' (weighted)'
    do r=0  while  lines(iFID)\==0               /*keep reading the dictionary 'til done*/
    u=space( linein(iFID), 0);      upper u      /*elide superfluous blanks and tabs.   */
    if u==''  then iterate                       /*Is it a blank line?   Then ignore it.*/
    #.words=#.words + 1                          /*keep running count of number of words*/
    if pos('EI', u)\==0 & pos('IE', u)\==0  then #.both=#.both + one /*the word has both*/
    call find  'ie'                                                  /*look for   ie    */
    call find  'ei'                                                  /*  "   "    ei    */
    end   /*r*/                                  /*at exit of DO loop,   R = # of lines.*/

L=length(#.words)                                /*use this to align the output numbers.*/
say 'lines in the  '         iFID         ' dictionary: '             r
say 'words in the  '         iFID         ' dictionary: '             #.words
say 'words with "IE" and "EI" (in same word): '    right(#.both, L)   weighted
say 'words with "IE" and     preceded by "C": '    right(#.ie.c ,L)   weighted
say 'words with "IE" and not preceded by "C": '    right(#.ie.z ,L)   weighted
say 'words with "EI" and     preceded by "C": '    right(#.ei.c ,L)   weighted
say 'words with "EI" and not preceded by "C": '    right(#.ei.z ,L)   weighted
say;                         mantra= 'The spelling mantra  '
p1=#.ie.z / max(1, #.ei.z);  phrase= '"I before E when not preceded by C"'
say mantra phrase   ' is '   word("im", 1 + (p1>2) )'plausible.'
p2=#.ei.c / max(1, #.ie.c);  phrase= '"E before I when     preceded by C"'
say mantra phrase   ' is '   word("im", 1 + (p2>2) )'plausible.'
po=(p1>2 & p2>2);            say 'Overall, it is'    word("im",1 + po)'plausible.'
exit                                             /*stick a fork in it,  we're all done. */
find: arg x;  s=1;  do forever;           _=pos(x, u, s);          if _==0  then return
                    if substr(u, _ - 1 + (_==1)*999, 1)=='C'  then #.x.c=#.x.c + one
                                                              else #.x.z=#.x.z + one
                    s=_ + 1                      /*handle the cases of multiple finds.  */
                    end   /*forever*/
output   when using the default dictionary and default word frequency list:
lines in the   WORDFREQ.TXT        list:  7727
words in the   WORDFREQ.TXT        list:  7728

lines in the   UNIXDICT.TXT  dictionary:  25104
words in the   UNIXDICT.TXT  dictionary:  25104

words with "IE" and "EI" (in same word):      4  (weighted)
words with "IE" and     preceded by "C":    719  (weighted)
words with "IE" and not preceded by "C":   3818  (weighted)
words with "EI" and     preceded by "C":    100  (weighted)
words with "EI" and not preceded by "C":   4875  (weighted)

The spelling mantra   "I before E when not preceded by C"  is  implausible.
The spelling mantra   "E before I when     preceded by C"  is  implausible.
Overall, it is implausible.


# Project : I before E except after C

fn1 = "unixdict.txt"

fp = fopen(fn1,"r")
str = fread(fp, getFileSize(fp))
strcount = str2list(str)
see "The number of words in unixdict : " + len(strcount) + nl
cei = count(str, "cei")
cie = count(str, "cie")
ei = count(str, "ei")
ie = count(str, "ie")
see "Instances of cei : " + cei + nl
see "Instances of cie : " + cie + nl
see "Rule: 'e' before 'i' when preceded by 'c' is = "
if cei > 2*cie see "plausible" + nl else see "not plausible" + nl ok
see "Instances of *ei, where * is not c : " + (ei-cei) + nl
see "Instances of *ie, where * is not c: " + (ie-cie) + nl
see "Rule: 'i' before 'e' when not preceded by 'c' is = " 
if (ie-cie) > 2*(ei-cei) see "plausible" + nl else see "not plausible" + nl ok
see "Overall the rule is : "
if cei > 2*cie and (ie-cie) > 2*(ei-cei) see "PLAUSIBLE" + nl else see "NOT PLAUSIBLE" + nl ok

func getFileSize fp
       c_filestart = 0
       c_fileend = 2
       fseek(fp,c_filestart,c_fileend)
       nfilesize = ftell(fp)
       fseek(fp,c_filestart,c_filestart)
       return nfilesize

func count(cString,dString)
       sum = 0
       while substr(cString,dString) > 0
               sum = sum + 1
               cString = substr(cString,substr(cString,dString)+len(string(sum)))
       end
       return sum


The number of words in unixdict : 25104
Instances of cei : 13
Instances of cie : 24
Rule: 'e' before 'i' when preceded by 'c' is = not plausible
Instances of *ei, where * is not c : 217
Instances of *ie, where * is not c: 466
Rule: 'i' before 'e' when not preceded by 'c' is = plausible
Overall the rule is : NOT PLAUSIBLE


require 'open-uri'

plausibility_ratio = 2
counter = Hash.new(0)
path = 'http://wiki.puzzlers.org/pub/wordlists/unixdict.txt'
rules = [['I before E when not preceded by C:', 'ie', 'ei'],
         ['E before I when preceded by C:', 'cei', 'cie']]

URI.open(path){|f| f.each{|line| line.scan(/ie|ei|cie|cei/){|match| counter[match] += 1 }}}

overall_plausible = rules.all? do |(str, x, y)|
  num_x, num_y, ratio = counter[x], counter[y], counter[x] / counter[y].to_f
  plausibility = ratio > plausibility_ratio
  puts str
  puts "#{x}: #{num_x}; #{y}: #{num_y}; Ratio: #{ratio.round(2)}: #{ plausibility ? 'Plausible' : 'Implausible'}"
  plausibility
end

puts "Overall: #{overall_plausible ? 'Plausible' : 'Implausible'}."
I before E when not preceded by C:
ie: 464; ei: 217; Ratio: 2.14: Plausible
E before I when preceded by C:
cei: 13; cie: 24; Ratio: 0.54: Implausible
Overall: Implausible.


use std::default::Default;
use std::ops::AddAssign;

use itertools::Itertools;
use reqwest::get;

#[derive(Default, Debug)]
struct Feature<T> {
    pub cie: T,
    pub xie: T,
    pub cei: T,
    pub xei: T,
}

impl AddAssign<Feature<bool>> for Feature<u64> {
    fn add_assign(&mut self, rhs: Feature<bool>) {
        self.cei += rhs.cei as u64;
        self.xei += rhs.xei as u64;
        self.cie += rhs.cie as u64;
        self.xie += rhs.xie as u64;
    }
}

fn check_feature(word: &str) -> Feature<bool> {
    let mut feature: Feature<bool> = Default::default();

    for window in word.chars().tuple_windows::<(char, char, char)>() {
        match window {
            ('c', 'e', 'i') => { feature.cei = true }
            ('c', 'i', 'e') => { feature.cie = true }
            (not_c, 'e', 'i') if not_c != 'c' => { feature.xei = true }
            (not_c, 'i', 'e') if not_c != 'c' => { feature.xie = true }
            _ => {}
        }
    }

    feature
}

fn maybe_is_feature_plausible(feature_count: u64, opposing_count: u64) -> Option<bool> {
    if feature_count > 2 * opposing_count { Some(true) } else if opposing_count > 2 * feature_count { Some(false) } else { None }
}

fn print_feature_plausibility(feature_plausibility: Option<bool>, feature_name: &str) {
    let plausible_msg =
        match feature_plausibility {
            None => " is implausible",
            Some(true) => "is plausible",
            Some(false) => "is definitely implausible",
        };

    println!("{} {}", feature_name, plausible_msg)
}

fn main() {
    let mut res = get("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt").unwrap();
    let texts = res.text().unwrap();

    let mut feature_count: Feature<u64> = Default::default();
    for word in texts.lines() {
        let feature = check_feature(word);
        feature_count += feature;
    }

    println!("Counting {:#?}", feature_count);

    let xie_plausibility =
        maybe_is_feature_plausible(feature_count.xie, feature_count.cie);
    let cei_plausibility =
        maybe_is_feature_plausible(feature_count.cei, feature_count.xei);

    print_feature_plausibility(xie_plausibility, "I before E when not preceded by C");
    print_feature_plausibility(cei_plausibility, "E before I when preceded by C");
    println!("The rule in general is {}",
             if xie_plausibility.unwrap_or(false) && cei_plausibility.unwrap_or(false)
             { "Plausible" } else { "Implausible" }
    );
}
Counting Feature {
    cie: 24,
    xie: 464,
    cei: 13,
    xei: 194,
}
I before E when not preceded by C is plausible
E before I when preceded by C is definitely implausible
The rule in general is Implausible


object I_before_E_except_after_C extends App {
  val testIE1 = "(^|[^c])ie".r // i before e when not preceded by c
  val testIE2 = "cie".r // i before e when preceded by c
  var countsIE = (0,0)

  val testCEI1 = "cei".r // e before i when preceded by c
  val testCEI2 = "(^|[^c])ei".r // e before i when not preceded by c
  var countsCEI = (0,0)

  scala.io.Source.fromURL("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt").getLines.map(_.toLowerCase).foreach{word =>
    if (testIE1.findFirstIn(word).isDefined) countsIE = (countsIE._1 + 1, countsIE._2)
    if (testIE2.findFirstIn(word).isDefined) countsIE = (countsIE._1, countsIE._2 + 1)
    if (testCEI1.findFirstIn(word).isDefined) countsCEI = (countsCEI._1 + 1, countsCEI._2)
    if (testCEI2.findFirstIn(word).isDefined) countsCEI = (countsCEI._1, countsCEI._2 + 1)
  }

  def plausible(counts: (Int,Int)) = counts._1 > (2 * counts._2)
  def plausibility(plausible: Boolean) = if (plausible) "plausible" else "implausible"
  def plausibility(counts: (Int, Int)): String = plausibility(plausible(counts))
  println("I before E when not preceded by C: "+plausibility(countsIE))
  println("E before I when preceded by C: "+plausibility(countsCEI))
  println("Overall: "+plausibility(plausible(countsIE) && plausible(countsCEI)))
}
I before E when not preceded by C: plausible
E before I when preceded by C: implausible
Overall: implausible


$ include "seed7_05.s7i";
  include "gethttp.s7i";
  include "float.s7i";

const integer: PLAUSIBILITY_RATIO is 2;

const func boolean: plausibilityCheck (in string: comment, in integer: x, in integer: y) is func
  result
    var boolean: plausible is FALSE;
  begin
    writeln("  Checking plausibility of: " <& comment);
    if x > PLAUSIBILITY_RATIO * y then
      writeln("    PLAUSIBLE. As we have counts of " <& x <& " vs " <& y <&
              " words, a ratio of " <& flt(x) / flt(y) digits 1 lpad 4 <& " times");
    elsif x > y then
      writeln("    IMPLAUSIBLE. As although we have counts of " <& x <& " vs " <& y <&
              " words, a ratio of " <& flt(x) / flt(y) digits 1 lpad 4 <& " times does not make it plausible");
    else
      writeln("    IMPLAUSIBLE, probably contra-indicated. As we have counts of " <& x <& " vs " <& y <&
              " words, a ratio of " <& flt(x) / flt(y) digits 1 lpad 4 <& " times");
    end if;
    plausible := x > PLAUSIBILITY_RATIO * y;
  end func;

const func integer: count (in string: stri, in array string: words) is func
  result
    var integer: count is 0;
  local
    var integer: index is 0;
  begin
    for key index range words do
      if pos(words[index], stri) <> 0 then
        incr(count);
      end if;
    end for;
  end func;

const proc: main is func
  local
    var array string: words is 0 times "";
    var integer: cie is 0;
    var integer: cei is 0;
    var integer: not_c_ie is 0;
    var integer: not_c_ei is 0;
  begin
    words := split(lower(getHttp("wiki.puzzlers.org/pub/wordlists/unixdict.txt")), "\n");
    cie := count("cie", words);
    cei := count("cei", words);
    not_c_ie := count("ie", words) - cie;
    not_c_ei := count("ei", words) - cei;
    writeln("Checking plausibility of \"I before E except after C\":");
    if plausibilityCheck("I before E when not preceded by C", not_c_ie, not_c_ei) and
        plausibilityCheck("E before I when preceded by C", cei, cie) then
      writeln("OVERALL IT IS PLAUSIBLE!");
    else
      writeln("OVERALL IT IS IMPLAUSIBLE!");
      writeln("(To be plausible, one word count must exceed another by " <& PLAUSIBILITY_RATIO <& " times)");
    end if;
  end func;
Checking plausibility of "I before E except after C":
  Checking plausibility of: I before E when not preceded by C
    PLAUSIBLE. As we have counts of 465 vs 213 words, a ratio of  2.2 times
  Checking plausibility of: E before I when preceded by C
    IMPLAUSIBLE, probably contra-indicated. As we have counts of 13 vs 24 words, a ratio of  0.5 times
OVERALL IT IS IMPLAUSIBLE!
(To be plausible, one word count must exceed another by 2 times)


program i_before_e_except_after_c;
    init cie := 0, xie := 0, cei := 0, xei := 0;

    dict := open("unixdict.txt", "r");
    loop doing word := getline(dict); while word /= om do
        classify(word);
    end loop;
    close(dict);

    p :=
        plausible("I before E when not preceded by C", xie, cie) and
        plausible("E before I when preceded by C", cei, xei);
    print("I before E, except after C:" + (if p then "" else " not" end)
        + " plausible.");

    proc classify(word);
        if "ie" in word then
            if "cie" in word then cie +:= 1;
            else xie +:= 1;
            end if;
        elseif "ei" in word then
            if "cei" in word then cei +:= 1;
            else xei +:= 1;
            end if;
        end if;
    end proc;

    proc plausible(clause, feature, opposite);
        p := feature > 2 * opposite;
        print(clause + ":" + (if p then "" else " not" end) + " plausible.");
        return p;
    end proc;
end program;
I before E when not preceded by C: plausible.
E before I when preceded by C: not plausible.

I before E, except after C: not plausible.


Using SwiftRegex for easy regex in strings.

import Foundation

let request = NSURLRequest(URL: NSURL(string: "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")!)

NSURLConnection.sendAsynchronousRequest(request, queue: NSOperationQueue()) {res, data, err in
    if (data != nil) {
        if let fileAsString = NSString(data: data, encoding: NSUTF8StringEncoding) {
            var firstCase = false
            var secondCase = false
            var cie = 0
            var cei = 0
            var not_c_ie = 0
            var not_c_ei = 0
            let words = fileAsString.componentsSeparatedByString("\n")
            for word in words {
                var wordRegex = RegexMutable(word as String)
                if (wordRegex["cie"]) {
                    cie++
                }
                if (wordRegex["cei"]) {
                    cei++
                }
                if (wordRegex["(^ie|[^c]ie)"].matches().count != 0) {
                    not_c_ie++
                }
                if (wordRegex["(^ei|[^c]ei)"].matches().count != 0) {
                    not_c_ei++
                }
            }

            if (not_c_ie > not_c_ei * 2) {
                println("I before E when not preceded by C is plausible")
                firstCase = true
            } else {
                println("I before E when not preceded by C is not plausible")
            }

            if (cei > cie * 2) {
                secondCase = true
                println("E before I when preceded by C is plausible")
            } else {
                println("E before I when preceded by C is not plausible")
            }

            if (firstCase && secondCase) {
                println("I before E except after C is plausible")
            } else {
                println("I before E except after C is not plausible")
            }
        }
    }
}

CFRunLoopRun()

I before E when not preceded by C is plausible
E before I when preceded by C is not plausible
I before E except after C is not plausible


Translation of: BASIC

OPEN #1: NAME "UNIXDICT.TXT", org text, ACCESS INPUT, create old
DO WHILE MORE #1
   LINE INPUT #1: w$
   IF POS(w$,"ie")<>0 THEN
      IF POS(w$,"cie")<>0 THEN LET ci = ci+1 ELSE LET xi = xi+1
   END IF
   IF POS(w$,"ei")<>0 THEN
      IF POS(w$,"cei")<>0 THEN LET ce = ce+1 ELSE LET xe = xe+1
   END IF
LOOP
CLOSE #1

PRINT "CIE:"; ci
PRINT "xIE:"; xi
PRINT "CEI:"; ce
PRINT "xEI:"; xe
PRINT "I before E when not preceded by C: ";
IF xi <= 2*ci THEN PRINT "not ";
PRINT "plausible."
PRINT "E before I when preceded by C: ";
IF ce <= 2*xe THEN PRINT "not ";
PRINT "plausible."
END


Translation of: Python
package require http

variable PLAUSIBILITY_RATIO 2.0
proc plausible {description x y} {
    variable PLAUSIBILITY_RATIO
    puts "  Checking plausibility of: $description"
    if {$x > $PLAUSIBILITY_RATIO * $y} {
	set conclusion "PLAUSIBLE"
	set fmt "As we have counts of %i vs %i words, a ratio of %.1f times"
	set result true
    } elseif {$x > $y} {
	set conclusion "IMPLAUSIBLE"
	set fmt "As although we have counts of %i vs %i words,"
	append fmt " a ratio of %.1f times does not make it plausible"
	set result false
    } else {
	set conclusion "IMPLAUSIBLE, probably contra-indicated"
	set fmt "As we have counts of %i vs %i words, a ratio of %.1f times"
	set result false
    }
    puts [format "    %s.\n    $fmt" $conclusion $x $y [expr {double($x)/$y}]]
    return $result
}

set t [http::geturl http://wiki.puzzlers.org/pub/wordlists/unixdict.txt]
set words [split [http::data $t] "\n"]
http::cleanup $t
foreach {name pattern} {ie (?:^|[^c])ie ei (?:^|[^c])ei cie cie cei cei} {
    set count($name) [llength [lsearch -nocase -all -regexp $words $pattern]]
}

puts "Checking plausibility of \"I before E except after C\":"
if {
    [plausible "I before E when not preceded by C" $count(ie) $count(ei)] &&
    [plausible "E before I when preceded by C" $count(cei) $count(cie)]
} then {
    puts "\nOVERALL IT IS PLAUSIBLE!\n"
} else {
    puts "\nOVERALL IT IS IMPLAUSIBLE!\n"
}
puts "\n(To be plausible, one word count must exceed another by\
	$PLAUSIBILITY_RATIO times)"
Checking plausibility of "I before E except after C":
  Checking plausibility of: I before E when not preceded by C
    PLAUSIBLE.
    As we have counts of 465 vs 213 words, a ratio of 2.2 times
  Checking plausibility of: E before I when preceded by C
    IMPLAUSIBLE, probably contra-indicated.
    As we have counts of 13 vs 24 words, a ratio of 0.5 times

OVERALL IT IS IMPLAUSIBLE!

(To be plausible, one word count must exceed another by 2.0 times)



LOOP word=words
IF (word.nc." ie "," ei ") CYCLE

IF (word.ct." ie "&& word.ct." ei ") THEN
  IF (word.ct." Cie ") THEN
  ELSEIF (word.ct." Cei ") THEN

IF (word.ct." ie ") THEN
  IF (word.ct." Cie ") THEN
ELSEIF (word.ct." ei ") THEN
  IF (word.ct." Cei ") THEN


PRINT "ieee ", ieei
PRINT "cie  ", cie
PRINT "xie  ", xie
PRINT "cei  ", cei
PRINT "xei  ", xei


IF (xie>doublexei) THEN
 check1="not plausible"

IF (cei>xei) THEN
 check2="not plausible"
IF (check1==check2) THEN
 checkall="not plausible"

TRAcE *check1,check2,checkall


ieee 4
cie  24
xie  465
cei  13
xei  213
TRACE *    62    -*SKRIPTE  203
check1       = plausible
check2       = not plausible
checkall     = not plausible


Translation of: PowerShell
If Set(a, Open ("unixdict.txt", "r")) < 0 Then Print "Cannot open \qunixdict.txt\q" : End

x = Set (y, Set (p, Set (q, 0)))

Do While Read (a)
  w = Tok(0)
  If FUNC(_Search(w, "cei")) > -1 Then x = x + 1
  If FUNC(_Search(w, "cie")) > -1 Then y = y + 1
  If FUNC(_Search(w, "ie"))  > -1 Then p = p + 1
  If FUNC(_Search(w, "ei"))  > -1 Then q = q + 1
Loop

Print "The plausibility of 'I before E when not preceded by C' is ";
Print Show (Iif (p>(q+q), "True", "False"))

Print "The plausibility of 'E before I when preceded by C' is ";
Print Show (Iif (x>(y+y), "True", "False"))

Print "The plausibility of the phrase 'I before E except after C' is ";
Print Show (Iif ((x>(y+y))*(p>(q+q)), "True", "False"))

Close a
End

_Search
  Param (2)
  Local (1)
  For c@ = 0 to Len (a@) - Len (b@)
    If Comp(Clip(Chop(a@,c@),Len(a@)-c@-Len(b@)),b@)=0 Then Unloop : Return (c@)
  Next
Return (-1)
The plausibility of 'I before E when not preceded by C' is True
The plausibility of 'E before I when preceded by C' is False
The plausibility of the phrase 'I before E except after C' is False

0 OK, 0:800 

UNIX Shell


matched() {
  grep -Poe "$1" unixdict.txt | wc -l
}

check() {
  local num_for="$(matched "$3")"
  local num_against="$(matched "$2")"
  if [ "$num_for" -le "$(expr 2 \* "$num_against")" ]; then
    echo "Clause $1 not plausible ($num_for examples; $num_against counterexamples)"
    return 1
  else
    echo "Clause $1 is plausible ($num_for examples; $num_against counterexamples)"
    return 0
  fi
}

check 1 '(?<!c)ei' '(?<!c)ie'
PLAUSIBLE_1=$?
check 2 'cie' 'cei'
PLAUSIBLE_2=$?
if [ $PLAUSIBLE_1 -eq 0 -a $PLAUSIBLE_2 -eq 0 ]; then
  echo "Overall, the rule is plausible"
else
  echo "Overall, the rule is not plausible"
fi
Clause 1 is plausible (466 examples; 217 counterexamples)
Clause 2 not plausible (13 examples; 24 counterexamples)
Overall, the rule is not plausible


The sample text was downloaded and saved in the same folder as the script.

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set srcFile = objFSO.OpenTextFile(objFSO.GetParentFolderName(WScript.ScriptFullName) &_
	"\unixdict.txt", 1) 'assumed file name (the task's word list); 1 = ForReading
cei = 0 : cie = 0 : ei = 0 : ie = 0

Do Until srcFile.AtEndOfStream
	word = srcFile.ReadLine
	If InStr(word,"cei") Then
		cei = cei + 1
	ElseIf InStr(word,"cie") Then
		cie = cie + 1
	ElseIf InStr(word,"ei") Then
		ei = ei + 1
	ElseIf InStr(word,"ie") Then
		ie = ie + 1
	End If
Loop

FirstClause = False
SecondClause = False
Overall = False

'testing the first clause
If  ie > ei*2 Then
	WScript.StdOut.WriteLine "I before E when not preceded by C is plausible."
	FirstClause = True
Else
	WScript.StdOut.WriteLine "I before E when not preceded by C is NOT plausible."
End If

'testing the second clause
If cei > cie*2 Then
	WScript.StdOut.WriteLine "E before I when preceded by C is plausible."
	SecondClause = True
Else
	WScript.StdOut.WriteLine "E before I when preceded by C is NOT plausible."
End If

'overall clause
If FirstClause And SecondClause Then
	WScript.StdOut.WriteLine "Overall it is plausible."
Else
	WScript.StdOut.WriteLine "Overall it is NOT plausible."
End If

srcFile.Close
Set objFSO = Nothing
I before E when not preceded by C is plausible.
E before I when preceded by C is NOT plausible.
Overall it is NOT plausible.

Visual Basic .NET

Compiler: Roslyn Visual Basic (language version >= 15.3)

Works with: .NET Core version 2.1

Implemented both as a single-pass loop and with regular expressions. The USE_REGEX compiler constant selects which implementation is compiled.

Regex implementation does not technically conform to specification because it counts the number of occurrences of "ie" and "ei" instead of the number of words.
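
The difference between counting occurrences and counting words can be illustrated with a short Python sketch. It is not part of the VB.NET entry: the helper names and the sample list are assumptions made for illustration, and "deiein" is a made-up token that contains "ei" twice.

```python
import re

def count_occurrences(text, pattern):
    """Count every match of pattern in the whole text (occurrence tally)."""
    return len(re.findall(pattern, text))

def count_words(words, pattern):
    """Count the words that contain at least one match of pattern."""
    rx = re.compile(pattern)
    return sum(1 for w in words if rx.search(w))

# Hypothetical sample, not taken from unixdict.txt.
sample_words = ["receive", "deiein", "science", "field"]
sample_text = "\n".join(sample_words)

# "ei" not preceded by "c": two occurrences, but only one word.
occurrences = count_occurrences(sample_text, r"(?<!c)ei")
words_with = count_words(sample_words, r"(?<!c)ei")
print(occurrences, words_with)
```

A word with two qualifying letter pairs adds two to the occurrence tally but only one to the word tally, which is why the two outputs below can differ slightly.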

Option Compare Binary
Option Explicit On
Option Infer On
Option Strict On

Imports System.Text.RegularExpressions

#Const USE_REGEX = False

Module Program
    ' Supports both local and remote files
    Const WORDLIST_URI = "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt"

    ' The support factor of a word for EI or IE is the number of occurrences that support the rule minus the number that oppose it.
    ' I.e., for IE:
    '   - increased when not preceded by C
    '   - decreased when preceded by C
    ' and for EI:
    '   - increased when preceded by C
    '   - decreased when not preceded by C
    Private Function GetSupportFactor(word As String) As (IE As Integer, EI As Integer)
        Dim IE, EI As Integer

        ' Enumerate the letter pairs in the word.
        For i = 0 To word.Length - 2
            Dim pair = word.Substring(i, 2)

            ' Instances at the beginning of a word count towards the factor and are treated as not preceded by C.
            Dim prevIsC As Boolean = i > 0 AndAlso String.Equals(word(i - 1), "c"c, StringComparison.OrdinalIgnoreCase)

            If pair.Equals("ie", StringComparison.OrdinalIgnoreCase) Then
                IE += If(Not prevIsC, 1, -1)
            ElseIf pair.Equals("ei", StringComparison.OrdinalIgnoreCase) Then
                EI += If(prevIsC, 1, -1)
            End If
        Next

        If Math.Abs(IE) > 1 Or Math.Abs(EI) > 1 Then Debug.WriteLine($"{word}: IE={IE}, EI={EI}")
        Return (IE, EI)
    End Function

    ' Returns the number of words that support or oppose the rule.
    Private Function GetPlausabilities(words As IEnumerable(Of String)) As (ieSuppCount As Integer, ieOppCount As Integer, eiSuppCount As Integer, eiOppCount As Integer)
        Dim ieSuppCount, ieOppCount, eiSuppCount, eiOppCount As Integer

        For Each word In words
            Dim status = GetSupportFactor(word)
            If status.IE > 0 Then
                ieSuppCount += 1
            ElseIf status.IE < 0 Then
                ieOppCount += 1
            End If
            If status.EI > 0 Then
                eiSuppCount += 1
            ElseIf status.EI < 0 Then
                eiOppCount += 1
            End If
        Next

        Return (ieSuppCount, ieOppCount, eiSuppCount, eiOppCount)
    End Function

    ' Takes entire file instead of individual words.
    ' Returns the number of instances of IE or EI that support or oppose the rule.
    Private Function GetPlausabilitiesRegex(words As String) As (ieSuppCount As Integer, ieOppCount As Integer, eiSuppCount As Integer, eiOppCount As Integer)
        ' Gets number of occurrences of the pattern, case-insensitive.
        Dim count = Function(pattern As String) Regex.Matches(words, pattern, RegexOptions.IgnoreCase).Count

        Dim ie = count("[^c]ie")
        Dim ei = count("[^c]ei")
        Dim cie = count("cie")
        Dim cei = count("cei")

        Return (ie, cie, cei, ei)
    End Function

    Sub Main()
        Dim file As String
        Dim wc As New Net.WebClient()
        Try
            Console.WriteLine("Fetching file...")
            file = wc.DownloadString(WORDLIST_URI)
        Catch ex As Net.WebException
            ' Report the failure instead of exiting silently.
            Console.WriteLine(ex.Message)
            Exit Sub
        End Try

#If USE_REGEX Then
        Dim res = GetPlausabilitiesRegex(file)
#Else
        Dim words = file.Split({vbCr, vbLf}, StringSplitOptions.RemoveEmptyEntries)
        Dim res = GetPlausabilities(words)
#End If

        Dim PrintResult =
        Function(suppCount As Integer, oppCount As Integer, printEI As Boolean) As Boolean
            Dim ratio = suppCount / oppCount,
                plausible = ratio > 2
#If Not USE_REGEX Then
            Console.WriteLine($"    Words with no instances of {If(printEI, "EI", "IE")} or equal numbers of supporting/opposing occurrences: {words.Length - suppCount - oppCount}")
#End If
            Console.WriteLine($"    Number supporting: {suppCount}")
            Console.WriteLine($"    Number opposing: {oppCount}")
            Console.WriteLine($"    {suppCount}/{oppCount}={ratio:N3}")
            Console.WriteLine($"    Rule therefore IS {If(plausible, "", "NOT ")}plausible.")
            Return plausible
        End Function

#If USE_REGEX Then
        Console.WriteLine($"Total occurrences of IE: {res.ieOppCount + res.ieSuppCount}")
        Console.WriteLine($"Total occurrences of EI: {res.eiOppCount + res.eiSuppCount}")
#Else
        Console.WriteLine($"Total words: {words.Length}")
#End If

        Console.WriteLine("""IE is not preceded by C""")
        Dim iePlausible = PrintResult(res.ieSuppCount, res.ieOppCount, False)

        Console.WriteLine("""EI is preceded by C""")
        Dim eiPlausible = PrintResult(res.eiSuppCount, res.eiOppCount, True)

        Console.WriteLine($"Rule thus overall IS {If(iePlausible AndAlso eiPlausible, "", "NOT ")}plausible.")
    End Sub
End Module
Output  —  Loop implementation:
Fetching file...

Total words: 25104

"IE is not preceded by C"
    Words with no instances of IE or equal numbers of supporting/opposing occurrences: 24615
    Number supporting: 465
    Number opposing: 24
    Rule therefore IS plausible.

"EI is preceded by C"
    Words with no instances of EI or equal numbers of supporting/opposing occurrences: 24878
    Number supporting: 13
    Number opposing: 213
    Rule therefore IS NOT plausible.

Rule thus overall IS NOT plausible.
Output  —  Regex implementation:
Fetching file...

Total occurrences of IE: 490
Total occurrences of EI: 230

"IE is not preceded by C"
    Number supporting: 466
    Number opposing: 24
    Rule therefore IS plausible.

"EI is preceded by C"
    Number supporting: 13
    Number opposing: 217
    Rule therefore IS NOT plausible.

Rule thus overall IS NOT plausible.

V (Vlang)

import os
import strconv

fn main() {
	mut cei, mut cie, mut ie, mut ei := f32(0), f32(0), f32(0), f32(0)
	unixdict := os.read_file('./unixdict.txt') or {
		println('Error: file not found')
		exit(1)
	}
	words := unixdict.split_into_lines() 
	println("The number of words in unixdict: ${words.len}")
	for word in words {
		cei += word.count('cei')
		cie += word.count('cie')
		ei += word.count('ei')
		ie += word.count('ie')
	}
	// A clause is plausible only if its supporting cases exceed twice the opposing cases.
	print("Rule: 'e' before 'i' when preceded by 'c' at the ratio of ")
	print("${strconv.f64_to_str_lnd1((cei / cie), 2)} is ")
	if cei > 2 * cie {println("plausible.")} else {println("implausible.")}
	println("$cei cases for and $cie cases against.")

	print("Rule: 'i' before 'e' except after 'c' at the ratio of ")
	print("${strconv.f64_to_str_lnd1(((ie - cie) / (ei - cei)), 2)} is ") 
	if (ie - cie) > 2 * (ei - cei) {println("plausible.")} else {println("implausible.")}
	println("${(ie - cie)} cases for and ${(ei - cei)} cases against.")

	print("Overall the rules are ")
	if cei > 2 * cie && (ie - cie) > 2 * (ei - cei) {println("plausible.")} else {println("implausible.")}
}
The number of words in unixdict: 25104
Rule: 'e' before 'i' when preceded by 'c' at the ratio of 0.54 is implausible.
13 cases for and 24 cases against.
Rule: 'i' before 'e' except after 'c' at the ratio of 2.15 is plausible.
466 cases for and 217 cases against.
Overall the rules are implausible.


Wren

Library: Wren-pattern
Library: Wren-fmt

It's a moot point whether one should include words beginning with "ei" or "ie" in this analysis as I've certainly never applied the rule to them and there are clearly a lot more of the former than the latter (22 to 1 for unixdict.txt). Despite this reservation I've included them anyway.

Also there are seven words which fall into two categories and which have therefore been double-counted.
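
As a rough illustration of how a word can land in more than one category, here is a small Python sketch. It is not the Wren code: the category names mirror the comments in the program below, while the regular expressions, the function names and the sample words are assumptions chosen for the example.

```python
import re

# The six categories named in the comments of the Wren program,
# written here as ordinary regular expressions (an assumption).
CATEGORIES = {
    "[^c]ie": re.compile(r"[^c]ie"),
    "[^c]ei": re.compile(r"[^c]ei"),
    "cie": re.compile(r"cie"),
    "cei": re.compile(r"cei"),
    "^ie": re.compile(r"^ie"),
    "^ei": re.compile(r"^ei"),
}

def categories_of(word):
    """Return the names of every category the word falls into."""
    return [name for name, rx in CATEGORIES.items() if rx.search(word)]

def double_counted(words):
    """Return the words that fall into more than one category."""
    return [w for w in words if len(categories_of(w)) > 1]

# Hypothetical sample: "societies" contains both "cie" and "tie".
sample = ["field", "receive", "societies", "either"]
print(double_counted(sample))
```

A word such as "societies" supports one clause and opposes the other, so it is tallied once for each.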

import "io" for File
import "./pattern" for Pattern
import "./fmt" for Fmt

var yesNo = Fn.new { |b| (b) ? "yes" : "no" }

var plausRatio = 2

var count1 = 0  // [^c]ie
var count2 = 0  // [^c]ei
var count3 = 0  // cie
var count4 = 0  // cei
var count5 = 0  // ^ie
var count6 = 0  // ^ei

var p1 = Pattern.new("^cie")
var p2 = Pattern.new("^cei")

var words = File.read("unixdict.txt").split("\n").map { |w| w.trim() }.where { |w| w != "" }
System.print("The following words fall into more than one category")
System.print("and so are counted more than once:")
for (word in words) {
    var tc1 = count1 + count2 + count3 + count4 + count5 + count6
    if (p1.isMatch(word)) count1 = count1 + 1
    if (p2.isMatch(word)) count2 = count2 + 1
    if (word.contains("cie")) count3 = count3 + 1
    if (word.contains("cei")) count4 = count4 + 1
    if (word.startsWith("ie")) count5 = count5 + 1
    if (word.startsWith("ei")) count6 = count6 + 1
    var tc2 = count1 + count2 + count3 + count4 + count5 + count6
    if ((tc2 -tc1) > 1) System.print("  " + word)
}

System.print("\nChecking plausability of \"i before e except after c\":")
var nFor  = count1 + count5
var nAgst = count2 + count6
var ratio = nFor / nAgst
var plaus = (ratio > plausRatio)
Fmt.print("  Cases for      : $d", nFor)
Fmt.print("  Cases against  : $d", nAgst)
Fmt.print("  Ratio for/agst : $4.2f", ratio)
Fmt.print("  Plausible      : $s", yesNo.call(plaus))

System.print("\nChecking plausability of \"e before i when preceded by c\":")
var ratio2 = count4 / count3
var plaus2 = (ratio2 > plausRatio)
Fmt.print("  Cases for      : $d", count4)
Fmt.print("  Cases against  : $d", count3)
Fmt.print("  Ratio for/agst : $4.2f", ratio2)
Fmt.print("  Plausible      : $s", yesNo.call(plaus2))

Fmt.print("\nPlausible overall: $s", yesNo.call(plaus && plaus2))
The following words fall into more than one category
and so are counted more than once:

Checking plausability of "i before e except after c":
  Cases for      : 465
  Cases against  : 216
  Ratio for/agst : 2.15
  Plausible      : yes

Checking plausability of "e before i when preceded by c":
  Cases for      : 13
  Cases against  : 24
  Ratio for/agst : 0.54
  Plausible      : no

Plausible overall: no

And the code and results for the 'stretch goal' which has just the one double-counted word:

import "io" for File
import "./pattern" for Pattern
import "./fmt" for Fmt

var yesNo = Fn.new { |b| (b) ? "yes" : "no" }

var plausRatio = 2

var count1 = 0  // [^c]ie
var count2 = 0  // [^c]ei
var count3 = 0  // cie
var count4 = 0  // cei
var count5 = 0  // ^ie
var count6 = 0  // ^ei

var p0 = Pattern.new("+1/s")
var p1 = Pattern.new("^cie")
var p2 = Pattern.new("^cei")

var entries = File.read("corpus.txt").split("\n").map { |w| w.trim() }.where { |w| w != "" }
System.print("The following words fall into more than one category")
System.print("and so are counted more than their frequency:")
for (entry in entries.skip(1)) {
    var items = p0.splitAll(entry)
    if (items.count == 3) {
        var word = items[0]  // leave any trailing * in place
        var freq = Num.fromString(items[2])
        var tc1 = count1 + count2 + count3 + count4 + count5 + count6
        if (p1.isMatch(word)) count1 = count1 + freq
        if (p2.isMatch(word)) count2 = count2 + freq
        if (word.contains("cie")) count3 = count3 + freq
        if (word.contains("cei")) count4 = count4 + freq
        if (word.startsWith("ie")) count5 = count5 + freq
        if (word.startsWith("ei")) count6 = count6 + freq
        var tc2 = count1 + count2 + count3 + count4 + count5 + count6
        if ((tc2 -tc1) > freq) System.print("  " + word)
    }
}

System.print("\nChecking plausability of \"i before e except after c\":")
var nFor  = count1 + count5
var nAgst = count2 + count6
var ratio = nFor / nAgst
var plaus = (ratio > plausRatio)
Fmt.print("  Cases for      : $d", nFor)
Fmt.print("  Cases against  : $d", nAgst)
Fmt.print("  Ratio for/agst : $4.2f", ratio)
Fmt.print("  Plausible      : $s", yesNo.call(plaus))

System.print("\nChecking plausability of \"e before i when preceded by c\":")
var ratio2 = count4 / count3
var plaus2 = (ratio2 > plausRatio)
Fmt.print("  Cases for      : $d", count4)
Fmt.print("  Cases against  : $d", count3)
Fmt.print("  Ratio for/agst : $4.2f", ratio2)
Fmt.print("  Plausible      : $s", yesNo.call(plaus2))

Fmt.print("\nPlausible overall: $s", yesNo.call(plaus && plaus2))
The following words fall into more than one category
and so are counted more than their frequency:

Checking plausability of "i before e except after c":
  Cases for      : 8192
  Cases against  : 4826
  Ratio for/agst : 1.70
  Plausible      : no

Checking plausability of "e before i when preceded by c":
  Cases for      : 327
  Cases against  : 994
  Ratio for/agst : 0.33
  Plausible      : no

Plausible overall: no
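
The frequency-weighted tally used for the stretch goal can be sketched in Python as follows. This is a minimal sketch, not the Wren implementation: it assumes rows of the form "word part-of-speech frequency", keeps only rows with exactly three fields as the task describes, and uses made-up frequencies in the sample. The names `weighted_counts` and `plausible` are hypothetical.

```python
import re

# "ie"/"ei" not preceded by "c" (xie, xei) and preceded by "c" (cie, cei).
PATTERNS = {
    "xie": re.compile(r"(?<!c)ie"),
    "xei": re.compile(r"(?<!c)ei"),
    "cie": re.compile(r"cie"),
    "cei": re.compile(r"cei"),
}

def weighted_counts(lines):
    """Add each word's frequency to every category the word matches."""
    totals = dict.fromkeys(PATTERNS, 0)
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # keep only rows with exactly three fields
        word, _pos, freq_text = fields
        if not freq_text.isdigit():
            continue  # skips the header row
        freq = int(freq_text)
        for name, rx in PATTERNS.items():
            if rx.search(word.lower()):
                totals[name] += freq
    return totals

def plausible(n_for, n_against):
    """Plausible when the supporting count is more than twice the opposing count."""
    return n_for > 2 * n_against

# Made-up rows for illustration only; the frequencies are not real corpus data.
sample = [
    "Word PoS Freq",
    "believe Verb 100",
    "receive Verb 40",
    "as yet Adv 14",
    "science NoC 30",
    "their Det 500",
]
totals = weighted_counts(sample)
clause1 = plausible(totals["xie"], totals["xei"])
clause2 = plausible(totals["cei"], totals["cie"])
print(totals, clause1, clause2)
```

Weighting by frequency lets a handful of very common words dominate the totals, which is one way the corpus-based verdict can differ from the plain word-list count.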


Yabasic

Translation of: BASIC
open "unixdict.txt" for reading as #1

repeat
    line input #1 pal$
    if instr(pal$, "ie") then 
		if instr(pal$, "cie") then CI = CI + 1 else XI = XI + 1 : fi
    fi
    if instr(pal$, "ei") then 
		if instr(pal$, "cei") then CE = CE + 1 else XE = XE + 1 : fi
    fi
until eof(1)
close #1

print "CIE: ", CI
print "xIE: ", XI
print "CEI: ", CE
print "xEI: ", XE
print "\nI before E when not preceded by C: ";
if XI <= 2 * XE then print "not "; : fi
print "plausible."
print "E before I when preceded by C: ";
if CE <= 2 * CI then print "not "; : fi
print "plausible."


zkl

fcn wcnt(wordList,altrs,aAdjust,bltrs,bAdjust,text){
   a:=wordList.reduce('wrap(cnt,word){ cnt+word.holds(altrs) },0) - aAdjust;
   b:=wordList.reduce('wrap(cnt,word){ cnt+word.holds(bltrs) },0) - bAdjust;
   ratio:=a.toFloat()/b;
   "%s is %splausible".fmt(text,ratio<2 and "im" or "").println();
   "  %d cases for and %d cases against is a ratio of %.3f.".fmt(a,b,ratio).println();
   return(a,b,ratio);
}
// Assumed: the word list is read from unixdict.txt in the current directory.
wordList:=File("unixdict.txt").read();
a,b,r1:=wcnt(wordList,"cei",0,"cie",0,"E before I when preceded by C");
_,_,r2:=wcnt(wordList,"ie",b,"ei",a,  "I before E when not preceded by C");
"Overall the rule is %splausible".fmt((r1<2 or r2<2) and "im" or "").println();
E before I when preceded by C is implausible
  13 cases for and 24 cases against is a ratio of 0.542.
I before E when not preceded by C is plausible
  465 cases for and 213 cases against is a ratio of 2.183.
Overall the rule is implausible


fcn wc2(wordList,altrs,aAdjust,bltrs,bAdjust,text){
      // don't care if line is "Word PoS Freq" or "as yet Adv 14"
      if(word.holds(altrs)) cnts[0]=cnts[0]+n;
      if(word.holds(bltrs)) cnts[1]=cnts[1]+n;
   a-=aAdjust; b-=bAdjust;
   "%s is %splausible".fmt(text,ratio<2 and "im" or "").println();
   "  %d cases for and %d cases against is a ratio of %.3f.".fmt(a,b,ratio).println();
E before I when preceded by C is implausible
  327 cases for and 994 cases against is a ratio of 0.329.
I before E when not preceded by C is implausible
  8148 cases for and 4826 cases against is a ratio of 1.688.
Overall the rule is implausible