Change e letters to i in words: Difference between revisions

add SETL
(4 intermediate revisions by 4 users not shown)
Line 212:
===Core language only===
Because of the huge size of the original word list and the number of changed words to check, it's nearly 100 times as fast to ensure the list is sorted and to use a binary search handler as it is to use the language's built-in '''is in''' command! (1.17 seconds instead of 110 on my current machine.) A further, lesser but interesting optimisation is to work through the sorted list in reverse, storing possible "i" word candidates encountered before getting to any "e" words from which they can be derived. Changed "e" words then only need to be checked against this smaller collection.
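The reverse-scan idea described above can be sketched in Python rather than AppleScript. This is only an illustration under stated assumptions: the input is assumed to be lower-case, and the names <code>contains_desc</code> and <code>e_to_i_pairs</code> are hypothetical, not part of the AppleScript solution. It relies on the fact that changing "e" to "i" always produces a word that sorts after the original, so every possible "i" candidate has already been seen by the time its "e" word is reached in a reverse pass.

```python
def contains_desc(items, target):
    """Binary search in a list sorted in DESCENDING order."""
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid] > target:
            lo = mid + 1      # target, if present, lies further right
        elif items[mid] < target:
            hi = mid          # target, if present, lies further left
        else:
            return True
    return False


def e_to_i_pairs(words, min_len=6):
    """Return (e_word, i_word) pairs, assuming lower-case input words."""
    ordered = sorted(set(words))
    candidates = []           # "i" words without "e", collected in descending order
    pairs = []
    for word in reversed(ordered):
        if len(word) < min_len:
            continue
        if "e" in word:
            changed = word.replace("e", "i")
            # Only the smaller candidate collection is searched.
            if contains_desc(candidates, changed):
                pairs.append((word, changed))
        elif "i" in word:
            candidates.append(word)
    pairs.reverse()           # restore ascending order of the "e" words
    return pairs
```

Appending to <code>candidates</code> during a reverse pass keeps that list in descending order, which is why the search compares in the opposite direction to a conventional binary search.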
<syntaxhighlight lang="applescript">use AppleScript version "2.3.1" -- Mac OS X 10.9 (Mavericks) or later.
use sorter : script "Insertion sort" -- <https://rosettacode.org/wiki/Sorting_algorithms/Insertion_sort#AppleScript>
use scripting additions
Line 226 ⟶ 224:
repeat until (l = r)
set m to (l + r) div 2
if (o's lst's item m < v) then
set l to m + 1
Line 233 ⟶ 231:
end repeat
if (o's lst's item l = v) then return l
return 0
end binarySearch
on replace(a, b, txt)
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to a
set txt to txt's text items
set AppleScript's text item delimiters to b
set txt to txt as text
set AppleScript's text item delimiters to astid
return txt
end replace
on task(minWordLength)
Line 246 ⟶ 254:
set wordCount to (count o's wordList)
ignoring case
tell sorter to sort(o's wordList, 1, wordCount) -- Not actually needed with unixdict.txt.
set iWordCount to 0
repeat with i from wordCount to 1 by -1
set thisWord to o's wordList's item i
if ((count thisWord) < minWordLength) then
else if ((thisWord contains "e") and (iWordCount > 0)) then
set changedWord to replace("e", "i", thisWord)
if (binarySearch(changedWord, o's iWords, 1, iWordCount) > 0) then
set beginning of o's output to {thisWord, changedWord}
end if
else if (thisWord contains "i") then
set beginning of o's iWords to thisWord
set iWordCount to iWordCount + 1
end if
end repeat
end ignoring
return o's output
Line 530 ⟶ 535:
victor <- vector
willis <- welles</pre>
Line 1,108 ⟶ 1,114:
<syntaxhighlight lang="futurebasic">include "NSLog.incl"
#plist NSAppTransportSecurity @{NSAllowsArbitraryLoads:YES}
void local fn DoIt
Line 1,970 ⟶ 1,979:
The only way to use <code>unixdict.txt</code> as input is to convert it into a list of 25104 strings. Fortunately, emulators can handle such a big data structure in RAM.
{{works with|Halcyon Calc|4.2.7}}
≪ → words
≪ { }
<span style="color:red">1</span> words EVAL SIZE '''FOR''' j
words j GET
'''IF''' DUP SIZE <span style="color:red">5</span> ≤ OVER <span style="color:red">"e"</span> POS NOT OR '''THEN''' DROP '''ELSE'''
<span style="color:red">""</span>
<span style="color:red">1 3</span> PICK SIZE '''FOR''' j
NUM R→B <span style="color:red">#20h</span> OR B→R CHR <span style="color:grey">@turn into lowercase</span>
DUP <span style="color:red">"e"</span> == <span style="color:red">"i"</span> ROT IFTE +
'''IF''' words EVAL OVER POS '''THEN''' <span style="color:red">" → "</span> SWAP + + + '''ELSE''' DROP2 '''END'''
'''END NEXT'''
≫ ≫ ‘<span style="color:blue">E→I</span>’ STO
1: { "analyses → analysis" "atlantes → atlantis" "bellow → billow" "breton → briton" "clench → clinch" "convect → convict" "crises → crisis" "diagnoses → diagnosis" "enfant → infant" "enquiry → inquiry" "frances → francis" "galatea → galatia" "harden → hardin" "heckman → hickman" "inequity → iniquity" "inflect → inflict" "jacobean → jacobian" "marten → martin" "module → moduli" "pegging → pigging" "psychoses → psychosis" "rabbet → rabbit" "sterling → stirling" "synopses → synopsis" "vector → victor" "welles → willis" }
<syntaxhighlight lang="ruby">words = File.readlines("unixdict.txt").map(&:chomp)
Line 2,008 ⟶ 2,039:
welles -> willis
<syntaxhighlight lang="rust">use std::collections::BTreeSet;
Line 2,070 ⟶ 2,102:
26. welles -> willis
<syntaxhighlight lang="setl">program change_e_letters_to_i_in_words;
dictfile := open("unixdict.txt", "r");
dict := {getline(dictfile) : until eof(dictfile)};
loop for word in dict | #word > 5 do
if "e" notin word then continue; end if;
iword := replaceall(word, "e", "i");
if iword notin dict then continue; end if;
print([word, iword]);
end loop;
proc replaceall(word, x, y);
loop while x in word do
word(x) := y;
end loop;
return word;
end proc;
end program;</syntaxhighlight>
<pre>[analyses analysis]
[atlantes atlantis]
[bellow billow]
[breton briton]
[clench clinch]
[convect convict]
[crises crisis]
[diagnoses diagnosis]
[enfant infant]
[enquiry inquiry]
[frances francis]
[galatea galatia]
[harden hardin]
[heckman hickman]
[inequity iniquity]
[inflect inflict]
[jacobean jacobian]
[marten martin]
[module moduli]
[pegging pigging]
[psychoses psychosis]
[rabbet rabbit]
[sterling stirling]
[synopses synopsis]
[vector victor]
[welles willis]</pre>
<syntaxhighlight lang="ruby">var file = File("unixdict.txt")
Line 2,126 ⟶ 2,206:
26: welles <-> willis
<syntaxhighlight lang="swift">import Foundation
Line 2,248 ⟶ 2,329:
<syntaxhighlight lang="wren">import "io" for File
import "./sort" for Find
import "./fmt" for Fmt
var wordList = "unixdict.txt" // local copy
Line 2,295 ⟶ 2,376:
26: welles -> willis
<syntaxhighlight lang="xpl0">string 0; \use zero-terminated strings
