Textonyms
When entering text on a phone's keypad, a particular combination of digits may correspond to more than one word. Words that share a digit combination are called textonyms.
Assuming the digit keys are mapped to letters as follows:
2 -> ABC
3 -> DEF
4 -> GHI
5 -> JKL
6 -> MNO
7 -> PQRS
8 -> TUV
9 -> WXYZ
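The mapping above can be sketched in Python as a lookup table built from the key groups. The names `KEYPAD`, `LETTER_TO_DIGIT` and `word_to_digits` are illustrative and not part of the task, and returning `None` for a word containing an unmappable character is just one possible convention.

```python
# Key groups copied from the mapping above.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

# Invert the table so that each letter points at its digit.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def word_to_digits(word):
    """Return the digit combination for word, or None if any
    character has no key (an assumed convention for this sketch)."""
    digits = []
    for ch in word.upper():
        if ch not in LETTER_TO_DIGIT:
            return None
        digits.append(LETTER_TO_DIGIT[ch])
    return "".join(digits)


print(word_to_digits("criticisms"))  # 2748424767
print(word_to_digits("can't"))       # None
```

Upper-casing the input first means "Briticisms" and "criticisms" land on the same digit combination.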
- Task
Write a program that finds textonyms in a list of words such as Textonyms/wordlist or unixdict.txt.
The task should produce a report:
There are #{0} words in #{1} which can be represented by the digit key mapping.
They require #{2} digit combinations to represent them.
#{3} digit combinations represent Textonyms.
Where:
#{0} is the number of words in the list which can be represented by the digit key mapping.
#{1} is the URL of the wordlist being used.
#{2} is the number of digit combinations required to represent the words in #{0}.
#{3} is the number of #{2} which represent more than one word.
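As a rough illustration of how the four report values relate to each other, here is a minimal Python sketch. The function name `textonym_report`, the `source` parameter and the five-word sample are assumptions made for this example; a real solution would read a word list such as unixdict.txt.

```python
from collections import defaultdict

LETTER_TO_DIGIT = {
    letter: digit
    for digit, letters in {
        "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
        "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
    }.items()
    for letter in letters
}


def textonym_report(words, source):
    """Group words by digit combination and format the report.
    `source` stands in for #{1}, the name or URL of the word list."""
    groups = defaultdict(list)
    for word in words:
        lowered = word.lower()
        # Skip empty lines and words with characters that have no key.
        if lowered and all(ch in LETTER_TO_DIGIT for ch in lowered):
            key = "".join(LETTER_TO_DIGIT[ch] for ch in lowered)
            groups[key].append(word)
    representable = sum(len(g) for g in groups.values())        # #{0}
    combinations = len(groups)                                  # #{2}
    textonyms = sum(1 for g in groups.values() if len(g) > 1)   # #{3}
    report = (
        f"There are {representable} words in {source} which can be "
        "represented by the digit key mapping.\n"
        f"They require {combinations} digit combinations to represent them.\n"
        f"{textonyms} digit combinations represent Textonyms."
    )
    return groups, report


# Tiny made-up sample, not a real word list.
sample = ["bode", "code", "coed", "rosetta", "can't"]
groups, report = textonym_report(sample, "a five-word sample")
print(report)
```

For this sample, four words are representable ("can't" is rejected), they need two digit combinations, and one of those combinations (2633) represents textonyms.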
At your discretion, show a couple of examples of textonyms found by your solution.
For example:
2748424767 -> "Briticisms", "criticisms"
- Extra credit
Use a word list and keypad mapping other than English.
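One way to approach the extra credit is to pass the keypad in as data instead of hard-coding it. In the Python sketch below, the Greek key groups are copied from the commented-out `KEYMAP` in the AWK entry on this page; the function names `build_lookup` and `encode` and the sample words are assumptions for illustration only.

```python
def build_lookup(keymap):
    """Turn {'2': 'αβγά', ...} into a letter -> digit table."""
    return {
        letter: digit
        for digit, letters in keymap.items()
        for letter in letters
    }


def encode(word, lookup):
    """Return the digit combination for word, or None if a
    character has no key in this keypad."""
    try:
        return "".join(lookup[ch] for ch in word.lower())
    except KeyError:
        return None


# Greek key groups as listed in the AWK entry's alternative KEYMAP.
GREEK_KEYMAP = {
    "2": "αβγά", "3": "δεζέ", "4": "ηθιήίϊΐ", "5": "κλμ",
    "6": "νξοό", "7": "πρσς", "8": "τυφύϋΰ", "9": "χψωώ",
}
greek = build_lookup(GREEK_KEYMAP)

print(encode("και", greek))    # 524
print(encode("hello", greek))  # None
```

Because the keypad is a plain dictionary, the same two functions work unchanged with the English mapping or any other alphabet.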
11l
[Char = String] CH2NUM
L(chars) ‘abc def ghi jkl mno pqrs tuv wxyz’.split(‘ ’)
   V num = L.index + 2
   L(ch) chars
      CH2NUM[ch] = String(num)
F mapnum2words(words)
   DefaultDict[String, [String]] number2words
   V reject = 0
   L(word) words
      X.try
         number2words[word.map(ch -> :CH2NUM[ch]).join(‘’)].append(word)
      X.catch KeyError
         reject++
   R (number2words, reject)
V words = File(‘unixdict.txt’).read().rtrim("\n").split("\n")
print(‘Read #. words from 'unixdict.txt'’.format(words.len))
V wordset = Set(words)
V (num2words, reject) = mapnum2words(words)
F interactiveconversions()
   L(inp) (‘rosetta’, ‘code’, ‘2468’, ‘3579’)
      print("\nType a number or a word to get the translation and textonyms: "inp)
      I all(inp.map(ch -> ch C ‘23456789’))
         I inp C :num2words
            print(‘ Number #. has the following textonyms in the dictionary: #.’.format(inp, (:num2words[inp]).join(‘, ’)))
         E
            print(‘ Number #. has no textonyms in the dictionary.’.format(inp))
      E I all(inp.map(ch -> ch C :CH2NUM))
         V num = inp.map(ch -> :CH2NUM[ch]).join(‘’)
         print(‘ Word #. is#. in the dictionary and is number #. with textonyms: #.’.format(inp, (I inp C :wordset {‘’} E ‘n't’), num, (:num2words[num]).join(‘, ’)))
      E
         print(‘ I don't understand '#.'’.format(inp))
V morethan1word = sum(num2words.keys().filter(w -> :num2words[w].len > 1).map(w -> 1))
V maxwordpernum = max(num2words.values().map(values -> values.len))
print(‘
There are #. words in #. which can be represented by the Textonyms mapping.
They require #. digit combinations to represent them.
#. digit combinations represent Textonyms.’.format(words.len - reject, ‘'unixdict.txt'’, num2words.len, morethan1word))
print("\nThe numbers mapping to the most words map to #. words each:".format(maxwordpernum))
V maxwpn = sorted(num2words.filter((key, val) -> val.len == :maxwordpernum))
L(num, wrds) maxwpn
   print(‘ #. maps to: #.’.format(num, wrds.join(‘, ’)))
interactiveconversions()
- Output:
Read 25104 words from 'unixdict.txt'
There are 24978 words in 'unixdict.txt' which can be represented by the Textonyms mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.
The numbers mapping to the most words map to 9 words each:
 269 maps to: amy, any, bmw, bow, box, boy, cow, cox, coy
 729 maps to: paw, pax, pay, paz, raw, ray, saw, sax, say
Type a number or a word to get the translation and textonyms: rosetta
 Word rosetta is in the dictionary and is number 7673882 with textonyms: rosetta
Type a number or a word to get the translation and textonyms: code
 Word code is in the dictionary and is number 2633 with textonyms: bode, code, coed
Type a number or a word to get the translation and textonyms: 2468
 Number 2468 has the following textonyms in the dictionary: ainu, chou
Type a number or a word to get the translation and textonyms: 3579
 Number 3579 has no textonyms in the dictionary.
ALGOL 68
Uses the Algol 68G specific "to upper" procedure.
# find textonyms in a list of words #
# use the associative array in the Associative array/Iteration task #
PR read "aArray.a68" PR
# returns the number of occurrences of ch in text #
PROC count = ( STRING text, CHAR ch )INT:
BEGIN
INT result := 0;
FOR c FROM LWB text TO UPB text DO IF text[ c ] = ch THEN result +:= 1 FI OD;
result
END # count # ;
CHAR invalid char = "*";
# returns text with the characters replaced by their text digits #
PROC to text = ( STRING text )STRING:
BEGIN
STRING result := text;
FOR pos FROM LWB result TO UPB result DO
CHAR c = to upper( result[ pos ] );
IF c = "A" OR c = "B" OR c = "C" THEN result[ pos ] := "2"
ELIF c = "D" OR c = "E" OR c = "F" THEN result[ pos ] := "3"
ELIF c = "G" OR c = "H" OR c = "I" THEN result[ pos ] := "4"
ELIF c = "J" OR c = "K" OR c = "L" THEN result[ pos ] := "5"
ELIF c = "M" OR c = "N" OR c = "O" THEN result[ pos ] := "6"
ELIF c = "P" OR c = "Q" OR c = "R" OR c = "S" THEN result[ pos ] := "7"
ELIF c = "T" OR c = "U" OR c = "V" THEN result[ pos ] := "8"
ELIF c = "W" OR c = "X" OR c = "Y" OR c = "Z" THEN result[ pos ] := "9"
ELSE # not a character that can be encoded # result[ pos ] := invalid char
FI
OD;
result
END # to text # ;
# read the list of words and store in an associative array #
CHAR separator = "/"; # character that will separate the textonyms #
IF FILE input file;
STRING file name = "unixdict.txt";
open( input file, file name, stand in channel ) /= 0
THEN
# failed to open the file #
print( ( "Unable to open """ + file name + """", newline ) )
ELSE
# file opened OK #
BOOL at eof := FALSE;
# set the EOF handler for the file #
on logical file end( input file, ( REF FILE f )BOOL:
BEGIN
# note that we reached EOF on the #
# latest read #
at eof := TRUE;
# return TRUE so processing can continue #
TRUE
END
);
REF AARRAY words := INIT LOC AARRAY;
INT word count := 0;
INT combinations := 0;
INT multiple count := 0;
INT max length := 0;
WHILE STRING word;
get( input file, ( word, newline ) );
NOT at eof
DO
STRING text word = to text( word );
IF count( text word, invalid char ) = 0 THEN
# the word can be fully encoded #
word count +:= 1;
INT length := ( UPB word - LWB word ) + 1;
IF length > max length THEN
# this word is longer than the maximum length found so far #
max length := length
FI;
IF ( words // text word ) = "" THEN
# first occurrence of this encoding #
combinations +:= 1;
words // text word := word
ELSE
# this encoding has already been used #
IF count( words // text word, separator ) = 0
THEN
# this is the second time this encoding is used #
multiple count +:= 1
FI;
words // text word +:= separator + word
FI
FI
OD;
# close the file #
close( input file );
# find the maximum number of textonyms #
INT max textonyms := 0;
REF AAELEMENT e := FIRST words;
WHILE e ISNT nil element DO
INT textonyms := count( value OF e, separator );
IF textonyms > max textonyms
THEN
max textonyms := textonyms
FI;
e := NEXT words
OD;
print( ( "There are ", whole( word count, 0 ), " words in ", file name, " which can be represented by the digit key mapping.", newline ) );
print( ( "They require ", whole( combinations, 0 ), " digit combinations to represent them.", newline ) );
print( ( whole( multiple count, 0 ), " combinations represent Textonyms.", newline ) );
# show the textonyms with the maximum number #
print( ( "The maximum number of textonyms for a particular digit key mapping is ", whole( max textonyms + 1, 0 ), " as follows:", newline ) );
e := FIRST words;
WHILE e ISNT nil element DO
IF INT textonyms := count( value OF e, separator );
textonyms = max textonyms
THEN
print( ( " ", key OF e, " encodes ", value OF e, newline ) )
FI;
e := NEXT words
OD;
# show the textonyms with the maximum length #
print( ( "The longest words are ", whole( max length, 0 ), " chracters long", newline ) );
print( ( "Encodings with this length are:", newline ) );
e := FIRST words;
WHILE e ISNT nil element DO
IF max length = ( UPB key OF e - LWB key OF e ) + 1
THEN
print( ( " ", key OF e, " encodes ", value OF e, newline ) )
FI;
e := NEXT words
OD;
FI
- Output:
There are 24978 words in unixdict.txt which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 combinations represent Textonyms.
The maximum number of textonyms for a particular digit key mapping is 9 as follows:
 269 encodes amy/any/bmw/bow/box/boy/cow/cox/coy
 729 encodes paw/pax/pay/paz/raw/ray/saw/sax/say
The longest words are 22 chracters long
Encodings with this length are:
 3532876362374256472749 encodes electroencephalography
AppleScript
Vanilla
use AppleScript version "2.3.1" -- OS X 10.9 (Mavericks) or later.
-- https://rosettacode.org/wiki/Sorting_algorithms/Quicksort#Straightforward
use sorter : script "Quicksort"
use scripting additions
on textonyms(posixPath, query)
set digits to "23456789"
set keys to {"", "abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"}
set {mv, LF} to {missing value, linefeed}
-- Check input.
try
set reporting to (query's class is not text)
if (not reporting) then
repeat with chr in query
if (chr is not in digits) then error "Invalid digit input"
end repeat
set digitCount to (count query)
end if
script o
property |words| : (do shell script ("cat " & posixPath))'s paragraphs
property combos : mv
end script
on error errMsg
display alert "Textonyms handler: parameter error" message ¬
errMsg as critical buttons {"Stop"} default button 1
error number -128
end try
ignoring case
-- Lose obvious no-hope words.
set alphabet to join(keys's rest, "")
repeat with i from 1 to (count o's |words|)
set wrd to o's |words|'s item i
if ((reporting) or (wrd's length = digitCount)) then
repeat with chr in wrd
if (chr is not in alphabet) then
set o's |words|'s item i to mv
exit repeat
end if
end repeat
else
set o's |words|'s item i to mv
end if
end repeat
set o's |words| to o's |words|'s every text
set wordCount to (count o's |words|)
-- Derive digit combinations from the rest.
set txt to join(o's |words|, LF)
repeat with d in digits
set d to d's contents
repeat with letter in keys's item d
set txt to replaceText(txt, letter's contents, d)
end repeat
end repeat
set o's combos to txt's paragraphs
end ignoring
-- Return the appropriate result
considering case -- Case insensitivity not needed with digits.
if (reporting) then
tell sorter to sort(o's combos, 1, wordCount)
set {previousCombo, comboCount, textonymCount, counting} to ¬
{"", wordCount, 0, true}
repeat with i from 1 to wordCount
set thisCombo to o's combos's item i
if (thisCombo = previousCombo) then
set comboCount to comboCount - 1
if (counting) then
set textonymCount to textonymCount + 1
set counting to false
end if
else
set previousCombo to thisCombo
set counting to true
end if
end repeat
set output to (wordCount as text) & " words in '" & ¬
(do shell script ("basename " & posixPath)) & ¬
"' can be represented by the digit key mapping." & ¬
(LF & comboCount & " digit combinations are required to represent them.") & ¬
(LF & textonymCount & " of the digit combinations represent Textonyms.")
else
set output to {}
repeat with i from 1 to wordCount
if (o's combos's item i = query) then set output's end to o's |words|'s item i
end repeat
if ((count output) = 1) then set output to {}
end if
end considering
return output
end textonyms
on join(lst, delim)
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to delim
set txt to lst as text
set AppleScript's text item delimiters to astid
return txt
end join
on replaceText(mainText, searchText, replacementText)
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to searchText
set textItems to mainText's text items
set AppleScript's text item delimiters to replacementText
set mainText to textItems as text
set AppleScript's text item delimiters to astid
return mainText
end replaceText
on task()
set posixPath to "~/Desktop/www.rosettacode.org/unixdict.txt"
set report to textonyms(posixPath, missing value)
set output to {report, "", "Examples:"}
repeat with digitCombo in {"729", "723353", "25287876746242"}
set foundWords to textonyms(posixPath, digitCombo's contents)
set output's end to digitCombo & " --> {" & join(foundWords, ", ") & "}"
end repeat
return join(output, linefeed)
end task
task()
- Output:
"24978 words in 'unixdict.txt' can be represented by the digit key mapping.
22903 digit combinations are required to represent them.
1473 of the digit combinations represent Textonyms.
Examples:
729 --> {paw, pax, pay, paz, raw, ray, saw, sax, say}
723353 --> {paddle, raffle, saddle}
25287876746242 --> {claustrophobia, claustrophobic}"
AppleScriptObjC
use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions
on textonyms(posixPath, query)
set digits to "23456789"
set keys to {"", "[abc]", "[def]", "[ghi]", "[jkl]", "[mno]", "[pqrs]", "[tuv]", "[wxyz]"}
set {mv, LF} to {missing value, linefeed}
-- Check input.
try
set reporting to (query's class is not text)
if (not reporting) then
repeat with chr in query
if (chr is not in digits) then error "Invalid digit input"
end repeat
set digitCount to (count query)
end if
set || to current application
set pathStr to (||'s NSString's stringWithString:(posixPath))'s ¬
stringByExpandingTildeInPath()
set {txt, err} to ||'s NSMutableString's stringWithContentsOfFile:(pathStr) ¬
usedEncoding:(mv) |error|:(reference)
if (err ≠ mv) then error (err's localizedDescription() as text)
on error errMsg
display alert "Textonyms handler: parameter error" message ¬
errMsg as critical buttons {"Stop"} default button 1
error number -128
end try
-- Lose obvious no-hope words.
set regex to ||'s NSRegularExpressionSearch
txt's replaceOccurrencesOfString:("\\R") withString:(LF) ¬
options:(regex) range:({0, txt's |length|()})
set |words| to txt's componentsSeparatedByString:(LF)
if ((reporting) or (digitCount > 9)) then
set predFormat to "(self MATCHES '(?i)[a-z]++')"
else
set predFormat to "(self MATCHES '(?i)[a-z]{" & digitCount & "}+')"
end if
set predicate to ||'s NSPredicate's predicateWithFormat:(predFormat)
set |words| to |words|'s filteredArrayUsingPredicate:(predicate)
set wordCount to |words|'s |count|()
-- Derive digit combinations from the rest.
set txt to (|words|'s componentsJoinedByString:(LF))'s mutableCopy()
set range to {0, txt's |length|()}
repeat with d in digits
(txt's replaceOccurrencesOfString:("(?i)" & keys's item d) withString:(d) ¬
options:(regex) range:(range))
end repeat
set combos to txt's componentsSeparatedByString:(LF)
-- Return the appropriate result.
if (reporting) then
set comboSet to ||'s NSSet's setWithArray:(combos)
set comboCount to comboSet's |count|()
set textonymSet to ||'s NSCountedSet's alloc()'s initWithArray:(combos)
textonymSet's minusSet:(comboSet)
set textonymCount to textonymSet's |count|()
set output to (wordCount as text) & " words in '" & ¬
(pathStr's lastPathComponent()) & ¬
"' can be represented by the digit key mapping." & ¬
(LF & comboCount & " digit combinations are required to represent them.") & ¬
(LF & textonymCount & " of the digit combinations represent Textonyms.")
else
set output to {}
set range to {0, wordCount}
set i to combos's indexOfObject:(query) inRange:(range)
repeat until (i > wordCount)
set output's end to (|words|'s objectAtIndex:(i)) as text
set range to {i + 1, wordCount - (i + 1)}
set i to combos's indexOfObject:(query) inRange:(range)
end repeat
if ((count output) = 1) then set output to {}
end if
return output
end textonyms
on join(lst, delim)
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to delim
set txt to lst as text
set AppleScript's text item delimiters to astid
return txt
end join
on task()
set posixPath to "~/Desktop/www.rosettacode.org/unixdict.txt"
set report to textonyms(posixPath, missing value)
set output to {report, "", "Examples:"}
repeat with digitCombo in {"729", "723353", "25287876746242"}
set foundWords to textonyms(posixPath, digitCombo's contents)
set output's end to digitCombo & " --> {" & join(foundWords, ", ") & "}"
end repeat
return join(output, linefeed)
end task
task()
- Output:
Same as for the "vanilla" solution.
Arturo
words: read.lines relative "unixdict.txt" | select => [match? & {/^[a-z]+$/}]
nums: "22233344455566677778889999"
phone: $ => [
join map &'c -> nums\[sub to :integer c 97]
]
textonyms: #[]
tcount: 0
loop words 'w [
p: phone w
if? key? textonyms p [
textonyms\[p]: textonyms\[p] ++ w
if 2 = size textonyms\[p] -> 'tcount + 1
]
else -> textonyms\[p]: @[w]
]
print ~{
There are |size words| words in unixdict.txt which can be represented by the digit key mapping.
They require |size keys textonyms| digit combinations to represent them.
|tcount| digit combinations represent Textonyms.
7325 -> |textonyms\["7325"]|
}
- Output:
There are 24978 words in unixdict.txt which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.
7325 -> [peak peal peck real reck seal]
AWK
#!/usr/bin/env -S gawk -E
BEGIN { # user's configuration area
KEYMAP="2 abc 3 def 4 ghi 5 jkl 6 mno 7 pqrs 8 tuv 9 wxyz"
FNAME="/usr/share/dict/american-english" # 0.5 MB; 102775 words;
#KEYMAP="2 αβγά 3 δεζέ 4 ηθιήίϊΐ 5 κλμ 6 νξοό 7 πρσς 8 τυφύϋΰ 9 χψωώ"
#FNAME="/usr/share/dict/greek" # 19.5MB; 828808 words;
# where generated data will be written,
# or comment out a line if you don’t need it.
EXPORT_TXN="/tmp/textonyms"
EXPORT_ALL="/tmp/phonewords"
EXPORT_BAD="/tmp/invalidwords" #also the line ‘BUFF_ERRW = BUFF_...’
}
BEGIN { # main
delete ARGV; ARGC=1 # do not accept command line arguments
delete XEK # reserve id for use only as a hash table
delete TXN # reserve id ...
AZ="" # generated Alphabet
EE=0 # invalid word Counter
KK=0 # valid word Counter
TT=0 # textonym groups in the table TXN
BUFF_ERRW="" # invalid word buffer
TOTAL=1 # enum
COUNT=2 # enum
STDERR="/dev/stderr"
OLD_RS=RS
OLD_FS=FS
processFile()
generateReport()
userQuery()
}
function processFile( ii,jj,nn,errW,ss,aKey,aGroup,qqq){
$0=KEYMAP
AZ=" "
for (ii=1; ii<=NF; ii=ii+2) {
aKey=$ii; aGroup=$(ii+1)
nn=split(aGroup, qqq, //)
for (jj=1; jj<=nn; jj++) {ss=qqq[jj]; XEK[ss]=aKey; AZ = AZ ss " " }
}
AZ = AZ " "
######################
RS="^$" #
FS="[\n\t ]+" #
######################
if ((getline <FNAME) <= 0) {
printf "unexpected EOF or error: ‘%s’ %s\n",FNAME,ERRNO >STDERR
exit 1
} else printf "total words in the file ‘%s’: %s\n", FNAME,NF
for (ii=1; ii<=NF; ii++) {
errW=0
ss=tolower($ii)
nn=split(ss, qqq, //)
nmb=""
for (jj=1; jj<=nn; jj++) {
lchr=qqq[jj]
if (index(AZ," "lchr" ")>0) { nmb = nmb XEK[lchr] }
else {
EE++
errW=1
BUFF_ERRW = BUFF_ERRW $ii "\n"
break
}
}
if (errW) { continue }
T9=TXN[nmb][TOTAL]
if (index(T9" "," "ss" ")==0) {
TXN[nmb][TOTAL] = T9 " " ss
TXN[nmb][COUNT]++
}
KK++
}
}
function generateReport( elm){
for (elm in TXN) { if (TXN[elm][COUNT]>1) { TT++ } }
printf "valid words: %9s\n", KK
printf "invalid words: %9s\n", EE
printf "table indices for valid words: %9s\n", length(TXN)
printf "textonym groups in the table: %9s\n", TT
exportData()
close(EXPORT_BAD); close(EXPORT_TXN); close(EXPORT_ALL)
}
function exportData( elm){
if (EXPORT_BAD != "") print BUFF_ERRW >EXPORT_BAD
if (EXPORT_TXN != "" && EXPORT_ALL != "") {
printf "%s\n",
"number-of-textonyms\tword's-length\tkeys\tlist-of-textonyms" >EXPORT_ALL
printf "%s\n",
"number-of-textonyms\tword's-length\tkeys\tlist-of-textonyms" >EXPORT_TXN
for (elm in TXN) {
printf "%s\t%s\t%s\t%s\n",
TXN[elm][COUNT], length(elm), elm, TXN[elm][TOTAL] >EXPORT_ALL
if (TXN[elm][COUNT]>1) {
printf "%s\t%s\t%s\t%s\n",
TXN[elm][COUNT], length(elm), elm, TXN[elm][TOTAL] >EXPORT_TXN
}
}
return ## return ## return ## return ##
} else if (EXPORT_ALL != "") {
printf "%s\n",
"number-of-textonyms\tword's-length\tkeys\tlist-of-textonyms" >EXPORT_ALL
for (elm in TXN) {
printf "%s\t%s\t%s\t%s\n",
TXN[elm][COUNT], length(elm), elm, TXN[elm][TOTAL] >EXPORT_ALL
}
}
else if (EXPORT_TXN != "") {
printf "%s\n",
"number-of-textonyms\tword's-length\tkeys\tlist-of-textonyms" >EXPORT_TXN
for (elm in TXN) {
if (TXN[elm][COUNT]>1) {
printf "%s\t%s\t%s\t%s\n",
TXN[elm][COUNT], length(elm), elm, TXN[elm][TOTAL] >EXPORT_TXN
}
}
}
}
function userQuery( userasks,ss,ss1,nn,key,words){
printf "txn>> "
RS=OLD_RS
FS=OLD_FS
while ((getline ) > 0) {
userasks=$1
if (NF==0){ printf "txn>> "; continue }
else if (userasks ~ /^(-e|--ex|--exit)$/) { exit } # anchor every alternative, not just the first and last
else if (userasks ~ /^[0-9]+$/) {
nn=TXN[userasks][COUNT]+0
words=TXN[userasks][TOTAL]
if (nn == 0) { printf "%s -> %s\n", userasks,"no matching words" }
else { printf "%s -> (%s) %s\n", userasks,nn,words }
}
else {
ss=tolower(userasks)
if ((key=keySeq_orElse_zero(ss))>0) {
ss1=(index((TXN[key][TOTAL]" ") , " "ss" ")>0) ?
", and the word is in" : ", but the word is not in"
printf "%s -> %s; the key is%s in the table%s\n", ss,key,
((key in TXN) ?"":" not"),ss1
}
else {
printf "%s -> not a valid word for the alphabet:\n%s\n", userasks,AZ
}
}
printf "txn>> "
}
printf "\n"
}
function keySeq_orElse_zero(aWord, qqq,lchr,nn,jj,buf){
nn=split(aWord, qqq, //)
for (jj=1; jj<=nn; jj++) {
lchr=qqq[jj]
if (index(AZ," "lchr" ")>0) { buf = buf XEK[lchr] } else { return 0 }
}
return buf
}
- Output:
# Run, assuming the code is in the txn.awk
$ LANG=en_US.UTF-8 ./txn.awk
total words in the file ‘/usr/share/dict/american-english’: 102775
valid words: 73318
invalid words: 29457
table indices for valid words: 65817
textonym groups in the table: 4670
txn>> cafe
cafe -> 2233; the key is in the table, but the word is not in
txn>> 2233
2233 -> (3) abed aced bade
txn>> café
café -> not a valid word for the alphabet:
 a b c d e f g h i j k l m n o p q r s t u v w x y z
txn>> --exit
$
$
$ egrep 'café' "/tmp/invalidwords"
café
café's
cafés
$
$ sort -n -b -k 1 "/tmp/textonyms" | tail -n 7
8 6 782537 quaker pucker quakes rubles stakes staler stales sucker
9 3 269 amy bmw cox coy any bow box boy cow
9 4 2273 case acre bard bare barf base cape card care
9 4 7277 parr sars paps pars pass raps rasp saps sass
9 4 7867 pump puns rump rums runs stop sump sums suns
9 5 46637 homer goner goods goofs homes hones hoods hoofs inner
12 5 22737 acres bards barer bares barfs baser bases caper capes cards cares cases
$
$
$ sort -n -b -k 2 "/tmp/phonewords" | tail -n 5
1 20 86242722837478422559 uncharacteristically
1 21 353287636237425647267 electroencephalograms
1 21 353287636237425647274 electroencephalograph
1 22 2686837738658846627437 counterrevolutionaries
1 22 3532876362374256472747 electroencephalographs
$
C
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* strcmp, strlen */
#include <glib.h>
char text_char(char c) {
switch (c) {
case 'a': case 'b': case 'c':
return '2';
case 'd': case 'e': case 'f':
return '3';
case 'g': case 'h': case 'i':
return '4';
case 'j': case 'k': case 'l':
return '5';
case 'm': case 'n': case 'o':
return '6';
case 'p': case 'q': case 'r': case 's':
return '7';
case 't': case 'u': case 'v':
return '8';
case 'w': case 'x': case 'y': case 'z':
return '9';
default:
return 0;
}
}
bool text_string(const GString* word, GString* text) {
g_string_set_size(text, word->len);
for (size_t i = 0; i < word->len; ++i) {
char c = text_char(g_ascii_tolower(word->str[i]));
if (c == 0)
return false;
text->str[i] = c;
}
return true;
}
typedef struct textonym_tag {
const char* text;
size_t length;
GPtrArray* words;
} textonym_t;
int compare_by_text_length(const void* p1, const void* p2) {
const textonym_t* t1 = p1;
const textonym_t* t2 = p2;
if (t1->length > t2->length)
return -1;
if (t1->length < t2->length)
return 1;
return strcmp(t1->text, t2->text);
}
int compare_by_word_count(const void* p1, const void* p2) {
const textonym_t* t1 = p1;
const textonym_t* t2 = p2;
if (t1->words->len > t2->words->len)
return -1;
if (t1->words->len < t2->words->len)
return 1;
return strcmp(t1->text, t2->text);
}
void print_words(GPtrArray* words) {
for (guint i = 0, n = words->len; i < n; ++i) {
if (i > 0)
printf(", ");
printf("%s", (const char*)g_ptr_array_index(words, i));
}
printf("\n");
}
void print_top_words(GArray* textonyms, guint top) {
for (guint i = 0; i < top; ++i) {
const textonym_t* t = &g_array_index(textonyms, textonym_t, i);
printf("%s = ", t->text);
print_words(t->words);
}
}
void free_strings(gpointer ptr) {
g_ptr_array_free(ptr, TRUE);
}
bool find_textonyms(const char* filename, GError** error_ptr) {
GError* error = NULL;
GIOChannel* channel = g_io_channel_new_file(filename, "r", &error);
if (channel == NULL) {
g_propagate_error(error_ptr, error);
return false;
}
GHashTable* ht = g_hash_table_new_full(g_str_hash, g_str_equal,
g_free, free_strings);
GString* word = g_string_sized_new(64);
GString* text = g_string_sized_new(64);
guint count = 0;
gsize term_pos;
while (g_io_channel_read_line_string(channel, word, &term_pos,
&error) == G_IO_STATUS_NORMAL) {
g_string_truncate(word, term_pos);
if (!text_string(word, text))
continue;
GPtrArray* words = g_hash_table_lookup(ht, text->str);
if (words == NULL) {
words = g_ptr_array_new_full(1, g_free);
g_hash_table_insert(ht, g_strdup(text->str), words);
}
g_ptr_array_add(words, g_strdup(word->str));
++count;
}
g_io_channel_unref(channel);
g_string_free(word, TRUE);
g_string_free(text, TRUE);
if (error != NULL) {
g_propagate_error(error_ptr, error);
g_hash_table_destroy(ht);
return false;
}
GArray* words = g_array_new(FALSE, FALSE, sizeof(textonym_t));
GHashTableIter iter;
gpointer key, value;
g_hash_table_iter_init(&iter, ht);
while (g_hash_table_iter_next(&iter, &key, &value)) {
GPtrArray* v = value;
if (v->len > 1) {
textonym_t textonym;
textonym.text = key;
textonym.length = strlen(key);
textonym.words = v;
g_array_append_val(words, textonym);
}
}
printf("There are %u words in '%s' which can be represented by the digit key mapping.\n",
count, filename);
guint size = g_hash_table_size(ht);
printf("They require %u digit combinations to represent them.\n", size);
guint textonyms = words->len;
printf("%u digit combinations represent Textonyms.\n", textonyms);
guint top = 5;
if (textonyms < top)
top = textonyms;
printf("\nTop %u by number of words:\n", top);
g_array_sort(words, compare_by_word_count);
print_top_words(words, top);
printf("\nTop %u by length:\n", top);
g_array_sort(words, compare_by_text_length);
print_top_words(words, top);
g_array_free(words, TRUE);
g_hash_table_destroy(ht);
return true;
}
int main(int argc, char** argv) {
if (argc != 2) {
fprintf(stderr, "usage: %s word-list\n", argv[0]);
return EXIT_FAILURE;
}
GError* error = NULL;
if (!find_textonyms(argv[1], &error)) {
if (error != NULL) {
fprintf(stderr, "%s: %s\n", argv[1], error->message);
g_error_free(error);
}
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
- Output:
There are 24978 words in 'unixdict.txt' which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.
Top 5 by number of words:
269 = amy, any, bmw, bow, box, boy, cow, cox, coy
729 = paw, pax, pay, paz, raw, ray, saw, sax, say
2273 = acre, bard, bare, base, cape, card, care, case
726 = pam, pan, ram, ran, sam, san, sao, scm
426 = gam, gao, ham, han, ian, ibm, ibn
Top 5 by length:
25287876746242 = claustrophobia, claustrophobic
7244967473642 = schizophrenia, schizophrenic
666628676342 = onomatopoeia, onomatopoeic
49376746242 = hydrophobia, hydrophobic
2668368466 = contention, convention
C++
#include <cctype>
#include <fstream>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
struct Textonym_Checker {
private:
int total;
int elements;
int textonyms;
int max_found;
std::vector<std::string> max_strings;
std::unordered_map<std::string, std::vector<std::string>> values;
int get_mapping(std::string &result, const std::string &input)
{
static std::unordered_map<char, char> mapping = {
{'A', '2'}, {'B', '2'}, {'C', '2'},
{'D', '3'}, {'E', '3'}, {'F', '3'},
{'G', '4'}, {'H', '4'}, {'I', '4'},
{'J', '5'}, {'K', '5'}, {'L', '5'},
{'M', '6'}, {'N', '6'}, {'O', '6'},
{'P', '7'}, {'Q', '7'}, {'R', '7'}, {'S', '7'},
{'T', '8'}, {'U', '8'}, {'V', '8'},
{'W', '9'}, {'X', '9'}, {'Y', '9'}, {'Z', '9'}
};
result = input;
for (char &c : result) {
if (!isalnum(c)) return 0;
if (isalpha(c)) c = mapping[toupper(c)];
}
return 1;
}
public:
Textonym_Checker() : total(0), elements(0), textonyms(0), max_found(0) { }
~Textonym_Checker() { }
void add(const std::string &str) {
std::string mapping;
total++;
if (!get_mapping(mapping, str)) return;
const int num_strings = values[mapping].size();
if (num_strings == 1) textonyms++;
elements++;
if (num_strings > max_found) {
max_strings.clear();
max_strings.push_back(mapping);
max_found = num_strings;
}
else if (num_strings == max_found)
max_strings.push_back(mapping);
values[mapping].push_back(str);
}
void results(const std::string &filename) {
std::cout << "Read " << total << " words from " << filename << "\n\n";
std::cout << "There are " << elements << " words in " << filename;
std::cout << " which can be represented by the digit key mapping.\n";
std::cout << "They require " << values.size() <<
" digit combinations to represent them.\n";
std::cout << textonyms << " digit combinations represent Textonyms.\n\n";
std::cout << "The numbers mapping to the most words map to ";
std::cout << max_found + 1 << " words each:\n";
for (auto it1 : max_strings) {
std::cout << '\t' << it1 << " maps to: ";
for (auto it2 : values[it1])
std::cout << it2 << " ";
std::cout << '\n';
}
std::cout << '\n';
}
void match(const std::string &str) {
auto match = values.find(str);
if (match == values.end()) {
std::cout << "Key '" << str << "' not found\n";
}
else {
std::cout << "Key '" << str << "' matches: ";
for (auto it : values[str])
std::cout << it << " ";
std::cout << '\n';
}
}
};
int main()
{
auto filename = "unixdict.txt";
std::ifstream input(filename);
Textonym_Checker tc;
if (input.is_open()) {
std::string line;
while (getline(input, line))
tc.add(line);
}
input.close();
tc.results(filename);
tc.match("001");
tc.match("228");
tc.match("27484247");
tc.match("7244967473642");
}
- Output:
Read 25104 words from unixdict.txt
There are 24988 words in unixdict.txt which can be represented by the digit key mapping.
They require 22905 digit combinations to represent them.
1477 digit combinations represent Textonyms.
The numbers mapping to the most words map to 9 words each:
269 maps to: amy any bmw bow box boy cow cox coy
729 maps to: paw pax pay paz raw ray saw sax say
Key '001' not found
Key '228' matches: aau act bat cat
Key '27484247' not found
Key '7244967473642' matches: schizophrenia schizophrenic
Clojure
Unlike the Tcl example, which counts all the words that share a digit sequence with another word, this solution (like the other examples) treats a textonym as a digit sequence that maps to more than one word.
(def table
{\a 2 \b 2 \c 2 \A 2 \B 2 \C 2
\d 3 \e 3 \f 3 \D 3 \E 3 \F 3
\g 4 \h 4 \i 4 \G 4 \H 4 \I 4
\j 5 \k 5 \l 5 \J 5 \K 5 \L 5
\m 6 \n 6 \o 6 \M 6 \N 6 \O 6
\p 7 \q 7 \r 7 \s 7 \P 7 \Q 7 \R 7 \S 7
\t 8 \u 8 \v 8 \T 8 \U 8 \V 8
\w 9 \x 9 \y 9 \z 9 \W 9 \X 9 \Y 9 \Z 9})
(def words-url "http://www.puzzlers.org/pub/wordlists/unixdict.txt")
(def words (-> words-url slurp clojure.string/split-lines))
(def digits (partial map table))
(let [textable (filter #(every? table %) words) ;; words with letters only
mapping (group-by digits textable) ;; map of digits to words
textonyms (filter #(< 1 (count (val %))) mapping)] ;; textonyms only
(print
(str "There are " (count textable) " words in " \' words-url \'
" which can be represented by the digit key mapping. They require "
(count mapping) " digit combinations to represent them. "
(count textonyms) " digit combinations represent Textonyms.")))
- Output:
There are 24978 words in 'http://www.puzzlers.org/pub/wordlists/unixdict.txt' which can be represented by the digit key mapping. They require 22903 digit combinations to represent them. 1473 digit combinations represent Textonyms.
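The note at the top of this section contrasts two ways of counting: digit sequences that map to more than one word, versus the words that share such a sequence. The following Python sketch is only an illustration of that distinction, not part of the Clojure solution; the function name count_textonyms and the sample word list are hypothetical.

```python
from collections import defaultdict

# Letter-to-digit table for the keypad given in the task description.
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def count_textonyms(words):
    """Return (ambiguous_combinations, ambiguous_words) for a word list.

    Words containing anything other than the letters a-z (in either case)
    are skipped, as they cannot be typed with the digit keys alone.
    """
    groups = defaultdict(list)
    for word in words:
        lowered = word.lower()
        if lowered and all(ch in LETTER_TO_DIGIT for ch in lowered):
            key = "".join(LETTER_TO_DIGIT[ch] for ch in lowered)
            groups[key].append(word)
    # Keep only the digit sequences shared by two or more words.
    shared = [ws for ws in groups.values() if len(ws) > 1]
    return len(shared), sum(len(ws) for ws in shared)


# Hypothetical sample, not taken from any real word list.
sample = ["bat", "cat", "act", "dog", "fog", "zebra", "it's"]
print(count_textonyms(sample))  # (2, 5)
```

Here "bat", "cat" and "act" all encode to 228, and "dog" and "fog" to 364, so there are 2 ambiguous digit combinations (the figure the task asks for) covering 5 words (the figure a per-word count would give).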
D
void main() {
import std.stdio, std.string, std.range, std.algorithm, std.ascii;
immutable src = "unixdict.txt";
const words = src.File.byLineCopy.map!strip.filter!(w => w.all!isAlpha).array;
immutable table = makeTrans("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ",
"2223334445556667777888999922233344455566677778889999");
string[][string] dials;
foreach (const word; words)
dials[word.translate(table)] ~= word;
auto textonyms = dials.byPair.filter!(p => p[1].length > 1).array;
writefln("There are %d words in %s which can be represented by the digit key mapping.", words.length, src);
writefln("They require %d digit combinations to represent them.", dials.length);
writefln("%d digit combinations represent Textonyms.", textonyms.length);
"\nTop 5 in ambiguity:".writeln;
foreach (p; textonyms.schwartzSort!(p => -p[1].length).take(5))
writefln(" %s => %-(%s %)", p[]);
"\nTop 5 in length:".writeln;
foreach (p; textonyms.schwartzSort!(p => -p[0].length).take(5))
writefln(" %s => %-(%s %)", p[]);
}
- Output:
There are 24978 words in unixdict.txt which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.
Top 5 in ambiguity:
729 => paw pax pay paz raw ray saw sax say
269 => amy any bmw bow box boy cow cox coy
2273 => acre bard bare base cape card care case
726 => pam pan ram ran sam san sao scm
426 => gam gao ham han ian ibm ibn
Top 5 in length:
25287876746242 => claustrophobia claustrophobic
7244967473642 => schizophrenia schizophrenic
666628676342 => onomatopoeia onomatopoeic
49376746242 => hydrophobia hydrophobic
6388537663 => mettlesome nettlesome
Delphi
program Textonyms;
{$APPTYPE CONSOLE}
uses
System.SysUtils,
System.Classes,
System.Generics.Collections,
System.Character;
const
TEXTONYM_MAP = '22233344455566677778889999';
type
TextonymsChecker = class
private
Total, Elements, Textonyms, MaxFound: Integer;
MaxStrings: TList<string>;
Values: TDictionary<string, TList<string>>;
FFileName: TFileName;
function Map(c: Char): Char;
function GetMapping(var return: string; const Input: string): Boolean;
public
constructor Create(FileName: TFileName);
destructor Destroy; override;
procedure Add(const Str: string);
procedure Load(FileName: TFileName);
procedure Test;
function Match(const str: string): Boolean;
property FileName: TFileName read FFileName;
end;
{ TextonymsChecker }
procedure TextonymsChecker.Add(const Str: string);
var
mapping: string;
num_strings: Integer;
procedure AddValues(mapping: string; NewItem: string);
begin
if not Values.ContainsKey(mapping) then
Values.Add(mapping, TList<string>.Create);
Values[mapping].Add(NewItem);
end;
begin
inc(total);
if not GetMapping(mapping, Str) then
Exit;
if Values.ContainsKey(mapping) then
num_strings := Values[mapping].Count
else
num_strings := 0;
inc(Textonyms, ord(num_strings = 1));
inc(Elements);
if (num_strings > maxfound) then
begin
MaxStrings.Clear;
MaxStrings.Add(mapping);
MaxFound := num_strings;
end
else if num_strings = MaxFound then
begin
MaxStrings.Add(mapping);
end;
AddValues(mapping, Str);
end;
constructor TextonymsChecker.Create(FileName: TFileName);
begin
FFileName := FileName; // keep the name so the FileName property is not empty in reports
MaxStrings := TList<string>.Create;
Values := TDictionary<string, TList<string>>.Create;
Total := 0;
Textonyms := 0;
MaxFound := 0;
Elements := 0;
Load(FileName);
end;
destructor TextonymsChecker.Destroy;
var
key: string;
begin
for key in Values.Keys do
Values[key].Free;
Values.Free;
MaxStrings.Free;
inherited;
end;
function TextonymsChecker.GetMapping(var return: string; const Input: string): Boolean;
var
i: Integer;
begin
return := Input;
for i := 1 to return.Length do
begin
if not return[i].IsLetterOrDigit then
exit(False);
if return[i].IsLetter then
return[i] := Map(return[i]);
end;
Result := True;
end;
procedure TextonymsChecker.Load(FileName: TFileName);
var
i: Integer;
begin
if not FileExists(FileName) then
begin
writeln('File "', FileName, '" not found');
exit;
end;
with TStringList.Create do
begin
LoadFromFile(FileName);
for i := 0 to count - 1 do
begin
self.Add(Strings[i]);
end;
Free;
end;
end;
function TextonymsChecker.Map(c: Char): Char;
begin
Result := TEXTONYM_MAP.Chars[Ord(UpCase(c)) - Ord('A')];
end;
function TextonymsChecker.Match(const str: string): Boolean;
var
w: string;
begin
Result := Values.ContainsKey(str);
if not Result then
begin
writeln('Key "', str, '" not found');
end
else
begin
write('Key "', str, '" matches: ');
for w in Values[str] do
begin
write(w, ' ');
end;
writeln;
end;
end;
procedure TextonymsChecker.Test;
var
i, j: Integer;
begin
writeln('Read ', Total, ' words from ', FileName, #10);
writeln('There are ', Elements, ' words in ', FileName,
' which can be represented by the digit key mapping.');
writeln('They require ', Values.Count, ' digit combinations to represent them.');
writeln(textonyms, ' digit combinations represent Textonyms.', #10);
write('The numbers mapping to the most words map to ');
writeln(MaxFound + 1, ' words each:');
for i := 0 to MaxStrings.Count - 1 do
begin
write(^I, MaxStrings[i], ' maps to: ');
for j := 0 to Values[MaxStrings[i]].Count - 1 do
begin
write(Values[MaxStrings[i]][j], ' ');
end;
Writeln;
end;
end;
var
Tc: TextonymsChecker;
begin
Tc := TextonymsChecker.Create('unixdict.txt');
Tc.Test;
tc.match('001');
tc.match('228');
tc.match('27484247');
tc.match('7244967473642');
Tc.Free;
readln;
end.
- Output:
Read 25104 words from which can be represented by the digit key mapping. They require 22905 digit combinations to represent them. 1477 digit combinations represent Textonyms. The numbers mapping to the most words map to9 words each: 269 maps to: amy any bmw bow box boy cow cox coy 729 maps to: paw pax pay paz raw ray saw sax say Key "001" not found Key "228" matches: aau act bat cat Key "27484247" not found Key "7244967473642" matches: schizophrenia schizophrenic
Factor
USING: assocs assocs.extras interpolate io io.encodings.utf8
io.files kernel literals math math.parser prettyprint sequences
unicode ;
<< CONSTANT: src "unixdict.txt" >>
CONSTANT: words
$[ src utf8 file-lines [ [ letter? ] all? ] filter ]
CONSTANT: digits "22233344455566677778889999"
: >phone ( str -- n )
[ CHAR: a - digits nth ] map string>number ;
: textonyms ( seq -- assoc )
[ [ >phone ] keep ] map>alist expand-keys-push-at ;
: #textonyms ( assoc -- n )
[ nip length 1 > ] assoc-filter assoc-size ;
words length src words textonyms [ assoc-size ] keep #textonyms
[I There are ${} words in ${} which can be represented by the digit key mapping.
They require ${} digit combinations to represent them.
${} digit combinations represent Textonyms.I] nl nl
"7325 -> " write words textonyms 7325 of .
- Output:
There are 24978 words in unixdict.txt which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.
7325 -> V{ "peak" "peal" "peck" "real" "reck" "seal" }
FreeBASIC
Type KeyValuePair
As String key
As String value
End Type
' Simulate a dictionary with an array
Dim Shared keyMap(0 To 7) As KeyValuePair
keyMap(0).key = "ABC" : keyMap(0).value = "2"
keyMap(1).key = "DEF" : keyMap(1).value = "3"
keyMap(2).key = "GHI" : keyMap(2).value = "4"
keyMap(3).key = "JKL" : keyMap(3).value = "5"
keyMap(4).key = "MNO" : keyMap(4).value = "6"
keyMap(5).key = "PQRS" : keyMap(5).value = "7"
keyMap(6).key = "TUV" : keyMap(6).value = "8"
keyMap(7).key = "WXYZ" : keyMap(7).value = "9"
Function GetKeyMapValue(char As String) As String
For i As Integer = Lbound(keyMap) To Ubound(keyMap)
If Instr(keyMap(i).key, Ucase(char)) > 0 Then Return keyMap(i).value
Next
Return ""
End Function
Function ArrayExists(arr() As String, value As String) As Boolean
For i As Integer = Lbound(arr) To Ubound(arr)
If arr(i) = value Then Return True
Next
Return False
End Function
Dim As Integer TotalWords = 0
Dim As Integer UniqueCombinations = 0
Dim As String uniqueWords(), moreThanOneWord()
Dim As String inputFile = "unixdict.txt"
Dim As Integer ff = Freefile()
Open inputFile For Input As #ff
If Err <> 0 Then Print "Error: Unable to open file '" & inputFile & "'": End 1
Dim As String linea, num, char, digit
Dim As Integer c, i
Do Until Eof(ff)
Line Input #ff, linea
If Len(linea) > 0 Then
num = ""
c = 0
For i = 1 To Len(linea)
char = Mid(linea, i, 1)
digit = GetKeyMapValue(char)
If digit <> "" Then
num &= digit
c += 1
End If
Next i
If c = Len(linea) Then
TotalWords += 1
If Ubound(uniqueWords) = -1 Orelse Not ArrayExists(uniqueWords(), num) Then
Redim Preserve uniqueWords(0 To Ubound(uniqueWords) + 1)
uniqueWords(Ubound(uniqueWords)) = num
UniqueCombinations += 1
Else
If Ubound(moreThanOneWord) = -1 Orelse Not ArrayExists(moreThanOneWord(), num) Then
Redim Preserve moreThanOneWord(0 To Ubound(moreThanOneWord) + 1)
moreThanOneWord(Ubound(moreThanOneWord)) = num
End If
End If
End If
End If
Loop
Close #ff
Print "There are " & TotalWords & " words in ""unixdict.txt"" which can be represented by the digit key mapping."
Print "They require " & UniqueCombinations & " digit combinations to represent them."
Print Ubound(moreThanOneWord) + 1 & " digit combinations represent Textonyms."
Sleep
- Output:
Same as VBScript entry.
Go
Uses a local file and shows its name rather than re-fetching a URL each run and printing that URL.
Like the Python example, the examples shown are the numbers that map to the most words.
package main
import (
"bufio"
"flag"
"fmt"
"io"
"log"
"os"
"strings"
"unicode"
)
func main() {
log.SetFlags(0)
log.SetPrefix("textonyms: ")
wordlist := flag.String("wordlist", "wordlist", "file containing the list of words to check")
flag.Parse()
if flag.NArg() != 0 {
flag.Usage()
os.Exit(2)
}
t := NewTextonym(phoneMap)
_, err := ReadFromFile(t, *wordlist)
if err != nil {
log.Fatal(err)
}
t.Report(os.Stdout, *wordlist)
}
// phoneMap is the digit to letter mapping of a typical phone.
var phoneMap = map[byte][]rune{
'2': []rune("ABC"),
'3': []rune("DEF"),
'4': []rune("GHI"),
'5': []rune("JKL"),
'6': []rune("MNO"),
'7': []rune("PQRS"),
'8': []rune("TUV"),
'9': []rune("WXYZ"),
}
// ReadFromFile is a generic convenience function that allows the use of a
// filename with an io.ReaderFrom and handles errors related to opening and
// closing the file.
func ReadFromFile(r io.ReaderFrom, filename string) (int64, error) {
f, err := os.Open(filename)
if err != nil {
return 0, err
}
n, err := r.ReadFrom(f)
if cerr := f.Close(); err == nil && cerr != nil {
err = cerr
}
return n, err
}
type Textonym struct {
numberMap map[string][]string // map numeric string into words
letterMap map[rune]byte // map letter to digit
count int // total number of words in numberMap
textonyms int // number of numeric strings with >1 words
}
func NewTextonym(dm map[byte][]rune) *Textonym {
lm := make(map[rune]byte, 26)
for d, ll := range dm {
for _, l := range ll {
lm[l] = d
}
}
return &Textonym{letterMap: lm}
}
func (t *Textonym) ReadFrom(r io.Reader) (n int64, err error) {
t.numberMap = make(map[string][]string)
buf := make([]byte, 0, 32)
sc := bufio.NewScanner(r)
sc.Split(bufio.ScanWords)
scan:
for sc.Scan() {
buf = buf[:0]
word := sc.Text()
// XXX we only bother approximating the number of bytes
// consumed. This isn't used in the calling code and was
// only included to match the io.ReaderFrom interface.
n += int64(len(word)) + 1
for _, r := range word {
d, ok := t.letterMap[unicode.ToUpper(r)]
if !ok {
//log.Printf("ignoring %q\n", word)
continue scan
}
buf = append(buf, d)
}
//log.Printf("scanned %q\n", word)
num := string(buf)
t.numberMap[num] = append(t.numberMap[num], word)
t.count++
if len(t.numberMap[num]) == 2 {
t.textonyms++
}
//log.Printf("%q → %v\t→ %v\n", word, num, t.numberMap[num])
}
return n, sc.Err()
}
func (t *Textonym) Most() (most int, subset map[string][]string) {
for k, v := range t.numberMap {
switch {
case len(v) > most:
subset = make(map[string][]string)
most = len(v)
fallthrough
case len(v) == most:
subset[k] = v
}
}
return most, subset
}
func (t *Textonym) Report(w io.Writer, name string) {
// Could be fancy and use text/template package but fmt is sufficient
fmt.Fprintf(w, `
There are %v words in %q which can be represented by the digit key mapping.
They require %v digit combinations to represent them.
%v digit combinations represent Textonyms.
`,
t.count, name, len(t.numberMap), t.textonyms)
n, sub := t.Most()
fmt.Fprintln(w, "\nThe numbers mapping to the most words map to",
n, "words each:")
for k, v := range sub {
fmt.Fprintln(w, "\t", k, "maps to:", strings.Join(v, ", "))
}
}
- Output:
There are 13085 words in "wordlist" which can be represented by the digit key mapping.
They require 11932 digit combinations to represent them.
661 digit combinations represent Textonyms.
The numbers mapping to the most words map to 15 words each:
27 maps to: AP, AQ, AR, AS, Ar, As, BP, BR, BS, Br, CP, CQ, CR, Cr, Cs
- Output with "-wordlist unixdict.txt":
There are 24978 words in "unixdict.txt" which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.
The numbers mapping to the most words map to 9 words each:
269 maps to: amy, any, bmw, bow, box, boy, cow, cox, coy
729 maps to: paw, pax, pay, paz, raw, ray, saw, sax, say
Haskell
import Data.Char (toUpper)
import Data.Function (on)
import Data.List (groupBy, sortBy)
import Data.Maybe (fromMaybe, isJust, isNothing)
toKey :: Char -> Maybe Char
toKey ch
| ch < 'A' = Nothing
| ch < 'D' = Just '2'
| ch < 'G' = Just '3'
| ch < 'J' = Just '4'
| ch < 'M' = Just '5'
| ch < 'P' = Just '6'
| ch < 'T' = Just '7'
| ch < 'W' = Just '8'
| ch <= 'Z' = Just '9'
| otherwise = Nothing
toKeyString :: String -> Maybe String
toKeyString st
| any isNothing mch = Nothing
| otherwise = Just $ map (fromMaybe '!') mch
where
mch = map (toKey . toUpper) st
showTextonym :: [(String, String)] -> String
showTextonym ts =
fst (head ts)
++ " => "
++ concat
[ w ++ " "
| (_, w) <- ts
]
main :: IO ()
main = do
let src = "unixdict.txt"
contents <- readFile src
let wordList = lines contents
keyedList =
[ (key, word)
| (Just key, word) <-
filter (isJust . fst) $
zip (map toKeyString wordList) wordList
]
groupedList =
groupBy ((==) `on` fst) $
sortBy (compare `on` fst) keyedList
textonymList = filter ((> 1) . length) groupedList
mapM_ putStrLn $
[ "There are "
++ show (length keyedList)
++ " words in "
++ src
++ " which can be represented by the digit key mapping.",
"They require "
++ show (length groupedList)
++ " digit combinations to represent them.",
show (length textonymList) ++ " digit combinations represent Textonyms.",
"",
"Top 5 in ambiguity:"
]
++ fmap
showTextonym
( take 5 $
sortBy (flip compare `on` length) textonymList
)
++ ["", "Top 5 in length:"]
++ fmap
showTextonym
(take 5 $ sortBy (flip compare `on` (length . fst . head)) textonymList)
- Output:
There are 24978 words in unixdict.txt which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.
Top 5 in ambiguity:
269 => amy any bmw bow box boy cow cox coy
729 => paw pax pay paz raw ray saw sax say
2273 => acre bard bare base cape card care case
726 => pam pan ram ran sam san sao scm
426 => gam gao ham han ian ibm ibn
Top 5 in length:
25287876746242 => claustrophobia claustrophobic
7244967473642 => schizophrenia schizophrenic
666628676342 => onomatopoeia onomatopoeic
49376746242 => hydrophobia hydrophobic
2668368466 => contention convention
Or, in terms of Data.Map and traverse:
import Data.Function (on)
import Data.List (groupBy, maximum, sortBy, sortOn)
import qualified Data.Map as M
import Data.Maybe (mapMaybe)
import Data.Ord (comparing)
------------------------ TEXTONYMS -----------------------
digitEncoded ::
M.Map Char Char ->
[String] ->
[(String, String)]
digitEncoded dict =
mapMaybe $
((>>=) . traverse (`M.lookup` dict))
<*> curry Just
charDict :: M.Map Char Char
charDict =
M.fromList $
concat $
zipWith
(fmap . flip (,))
(head . show <$> [2 ..])
(words "abc def ghi jkl mno pqrs tuv wxyz")
definedSamples ::
Int ->
[[(String, String)]] ->
[[(String, String)] -> Int] ->
[[[(String, String)]]]
definedSamples n xs fs =
[take n . flip sortBy xs] <*> (flip . comparing <$> fs)
--------------------------- TEST -------------------------
main :: IO ()
main = do
let fp = "unixdict.txt"
s <- readFile fp
let encodings = digitEncoded charDict $ lines s
codeGroups =
groupBy
(on (==) snd)
. sortOn snd
$ encodings
textonyms = filter ((1 <) . length) codeGroups
mapM_
putStrLn
[ "There are "
<> show (length encodings)
<> " words in "
<> fp
<> " which can be represented\n"
<> "by the digit key mapping.",
"\nThey require "
<> show (length codeGroups)
<> " digit combinations to represent them.",
show (length textonyms)
<> " digit combinations represent textonyms.",
""
]
let codeLength = length . snd . head
[ambiguous, longer] =
definedSamples
5
textonyms
[length, codeLength]
[wa, wl] =
maximum . fmap codeLength
<$> [ambiguous, longer]
mapM_ putStrLn $
"Five most ambiguous:" :
fmap (showTextonym wa) ambiguous
<> ( "" :
"Five longest:" :
fmap
(showTextonym wl)
longer
)
------------------------- DISPLAY ------------------------
showTextonym :: Int -> [(String, String)] -> String
showTextonym w ts =
concat
[ rjust w ' ' (snd (head ts)),
" -> ",
unwords $ fmap fst ts
]
where
rjust n c = (drop . length) <*> (replicate n c <>)
- Output:
There are 24978 words in unixdict.txt which can be represented
by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent textonyms.
Five most ambiguous:
269 -> amy any bmw bow box boy cow cox coy
729 -> paw pax pay paz raw ray saw sax say
2273 -> acre bard bare base cape card care case
726 -> pam pan ram ran sam san sao scm
426 -> gam gao ham han ian ibm ibn
Five longest:
25287876746242 -> claustrophobia claustrophobic
7244967473642 -> schizophrenia schizophrenic
666628676342 -> onomatopoeia onomatopoeic
49376746242 -> hydrophobia hydrophobic
2668368466 -> contention convention
Io
main := method(
setupLetterToDigitMapping
file := File clone openForReading("./unixdict.txt")
words := file readLines
file close
wordCount := 0
textonymCount := 0
dict := Map clone
words foreach(word,
(key := word asPhoneDigits) ifNonNil(
wordCount = wordCount+1
value := dict atIfAbsentPut(key,list())
value append(word)
if(value size == 2,textonymCount = textonymCount+1)
)
)
write("There are ",wordCount," words in ",file name)
writeln(" which can be represented by the digit key mapping.")
writeln("They require ",dict size," digit combinations to represent them.")
writeln(textonymCount," digit combinations represent Textonyms.")
samplers := list(maxAmbiquitySampler, noMatchingCharsSampler)
dict foreach(key,value,
if(value size == 1, continue)
samplers foreach(sampler,sampler examine(key,value))
)
samplers foreach(sampler,sampler report)
)
setupLetterToDigitMapping := method(
fromChars := Sequence clone
toChars := Sequence clone
list(
list("ABC", "2"), list("DEF", "3"), list("GHI", "4"),
list("JKL", "5"), list("MNO", "6"), list("PQRS","7"),
list("TUV", "8"), list("WXYZ","9")
) foreach( map,
fromChars appendSeq(map at(0), map at(0) asLowercase)
toChars alignLeftInPlace(fromChars size, map at(1))
)
Sequence asPhoneDigits := block(
str := call target asMutable translate(fromChars,toChars)
if( str contains(0), nil, str )
) setIsActivatable(true)
)
maxAmbiquitySampler := Object clone do(
max := list()
samples := list()
examine := method(key,textonyms,
i := key size - 1
if(i > max size - 1,
max setSize(i+1)
samples setSize(i+1)
)
nw := textonyms size
nwmax := max at(i)
if( nwmax isNil or nw > nwmax,
max atPut(i,nw)
samples atPut(i,list(key,textonyms))
)
)
report := method(
writeln("\nExamples of maximum ambiquity for each word length:")
samples foreach(sample,
sample ifNonNil(
writeln(" ",sample at(0)," -> ",sample at(1) join(" "))
)
)
)
)
noMatchingCharsSampler := Object clone do(
samples := list()
examine := method(key,textonyms,
for(i,0,textonyms size - 2 ,
for(j,i+1,textonyms size - 1,
if( _noMatchingChars(textonyms at(i), textonyms at(j)),
samples append(list(textonyms at(i),textonyms at(j)))
)
)
)
)
_noMatchingChars := method(t1,t2,
t1 foreach(i,ich,
if(ich == t2 at(i), return false)
)
true
)
report := method(
write("\nThere are ",samples size," textonym pairs which ")
writeln("differ at each character position.")
if(samples size > 10, writeln("The ten largest are:"))
samples sortInPlace(at(0) size negate)
if(samples size > 10,samples slice(0,10),samples) foreach(sample,
writeln(" ",sample join(" ")," -> ",sample at(0) asPhoneDigits)
)
)
)
main
- Output:
There are 24978 words in unixdict.txt which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.
Examples of maximum ambiquity for each word length:
7 -> p q r s
46 -> gm go ho in io
269 -> amy any bmw bow box boy cow cox coy
2273 -> acre bard bare base cape card care case
42779 -> garry gassy happy harpy harry
723353 -> paddle raffle saddle
2667678 -> comport compost consort
38465649 -> ethology etiology
468376377 -> governess inverness
6388537663 -> mettlesome nettlesome
49376746242 -> hydrophobia hydrophobic
666628676342 -> onomatopoeia onomatopoeic
7244967473642 -> schizophrenia schizophrenic
25287876746242 -> claustrophobia claustrophobic
There are 275 textonym pairs which differ at each character position.
The ten largest are:
pistol shrunk -> 747865
hotbed invade -> 468233
aback cabal -> 22225
about bantu -> 22688
adams bebop -> 23267
rival shuck -> 74825
astor crump -> 27867
knack local -> 56225
rice shad -> 7423
ammo coon -> 2666
J
require'regex strings web/gethttp'
strip=:dyad define
(('(?s)',x);'') rxrplc y
)
fetch=:monad define
txt=. '.*<pre>' strip '</pre>.*' strip gethttp y
cutopen tolower txt-.' '
)
keys=:noun define
2 abc
3 def
4 ghi
5 jkl
6 mno
7 pqrs
8 tuv
9 wxyz
)
reporttext=:noun define
There are #{0} words in #{1} which can be represented by the digit key mapping.
They require #{2} digit combinations to represent them.
#{3} digit combinations represent Textonyms.
)
report=:dyad define
x rplc (":&.>y),.~('#{',":,'}'"_)&.>i.#y
)
textonymrpt=:dyad define
'digits letters'=. |:>;,&.>,&.>/&.>/"1 <;._1;._2 x
valid=. (#~ */@e.&letters&>) fetch y NB. ignore illegals
reps=. {&digits@(letters&i.)&.> valid NB. reps is digit seq
reporttext report (#valid);y;(#~.reps);+/(1<#)/.~reps
)
Required example:
keys textonymrpt 'http://rosettacode.org/wiki/Textonyms/wordlist'
There are 13085 words in http://rosettacode.org/wiki/Textonyms/wordlist which can be represented by the digit key mapping.
They require 11932 digit combinations to represent them.
661 digit combinations represent Textonyms.
In this example, the intermediate results in textonymrpt would look like this (showing only the first 5 elements of the really big values):
digits
22233344455566677778889999
letters
abcdefghijklmnopqrstuvwxyz
5{.valid
┌─┬──┬───┬───┬──┐
│a│aa│aaa│aam│ab│
└─┴──┴───┴───┴──┘
5{.reps
┌─┬──┬───┬───┬──┐
│2│22│222│226│22│
└─┴──┴───┴───┴──┘
Here's another example:
keys textonymrpt 'http://www.puzzlers.org/pub/wordlists/unixdict.txt'
There are 24978 words in http://www.puzzlers.org/pub/wordlists/unixdict.txt which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.
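For readers unfamiliar with J, the intermediate values shown earlier in this section (letters, digits, valid and reps) can be mirrored in a few lines of Python. This is a rough sketch for illustration only; the names TABLE and encode are assumptions of this sketch and do not appear in the J solution.

```python
import string

# The two aligned 26-character strings from the J session above.
letters = string.ascii_lowercase
digits = "22233344455566677778889999"

# Character-for-character translation table: letters[i] -> digits[i].
TABLE = str.maketrans(letters, digits)


def encode(word):
    """Map a lowercase word to its keypad digit sequence."""
    return word.translate(TABLE)


# The first five entries of `valid` from the session above.
valid = ["a", "aa", "aaa", "aam", "ab"]
reps = [encode(w) for w in valid]
print(reps)  # ['2', '22', '222', '226', '22']
```

Note that "aa" and "ab" both encode to 22, which is exactly the kind of collision the report counts.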
Java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.Vector;
public class RTextonyms {
private static final Map<Character, Character> mapping;
private int total, elements, textonyms, max_found;
private String filename, mappingResult;
private Vector<String> max_strings;
private Map<String, Vector<String>> values;
static {
mapping = new HashMap<Character, Character>();
mapping.put('A', '2'); mapping.put('B', '2'); mapping.put('C', '2');
mapping.put('D', '3'); mapping.put('E', '3'); mapping.put('F', '3');
mapping.put('G', '4'); mapping.put('H', '4'); mapping.put('I', '4');
mapping.put('J', '5'); mapping.put('K', '5'); mapping.put('L', '5');
mapping.put('M', '6'); mapping.put('N', '6'); mapping.put('O', '6');
mapping.put('P', '7'); mapping.put('Q', '7'); mapping.put('R', '7'); mapping.put('S', '7');
mapping.put('T', '8'); mapping.put('U', '8'); mapping.put('V', '8');
mapping.put('W', '9'); mapping.put('X', '9'); mapping.put('Y', '9'); mapping.put('Z', '9');
}
public RTextonyms(String filename) {
this.filename = filename;
this.total = this.elements = this.textonyms = this.max_found = 0;
this.values = new HashMap<String, Vector<String>>();
this.max_strings = new Vector<String>();
return;
}
public void add(String line) {
String mapping = "";
total++;
if (!get_mapping(line)) {
return;
}
mapping = mappingResult;
if (values.get(mapping) == null) {
values.put(mapping, new Vector<String>());
}
int num_strings;
num_strings = values.get(mapping).size();
textonyms += num_strings == 1 ? 1 : 0;
elements++;
if (num_strings > max_found) {
max_strings.clear();
max_strings.add(mapping);
max_found = num_strings;
}
else if (num_strings == max_found) {
max_strings.add(mapping);
}
values.get(mapping).add(line);
return;
}
public void results() {
System.out.printf("Read %,d words from %s%n%n", total, filename);
System.out.printf("There are %,d words in %s which can be represented by the digit key mapping.%n", elements,
filename);
System.out.printf("They require %,d digit combinations to represent them.%n", values.size());
System.out.printf("%,d digit combinations represent Textonyms.%n", textonyms);
System.out.printf("The numbers mapping to the most words map to %,d words each:%n", max_found + 1);
for (String key : max_strings) {
System.out.printf("%16s maps to: %s%n", key, values.get(key).toString());
}
System.out.println();
return;
}
public void match(String key) {
Vector<String> match;
match = values.get(key);
if (match == null) {
System.out.printf("Key %s not found%n", key);
}
else {
System.out.printf("Key %s matches: %s%n", key, match.toString());
}
return;
}
private boolean get_mapping(String line) {
mappingResult = line;
StringBuilder mappingBuilder = new StringBuilder();
for (char cc : line.toCharArray()) {
if (Character.isAlphabetic(cc)) {
mappingBuilder.append(mapping.get(Character.toUpperCase(cc)));
}
else if (Character.isDigit(cc)) {
mappingBuilder.append(cc);
}
else {
return false;
}
}
mappingResult = mappingBuilder.toString();
return true;
}
public static void main(String[] args) {
String filename;
if (args.length > 0) {
filename = args[0];
}
else {
filename = "./unixdict.txt";
}
RTextonyms tc;
tc = new RTextonyms(filename);
Path fp = Paths.get(filename);
try (Scanner fs = new Scanner(fp, StandardCharsets.UTF_8.name())) {
while (fs.hasNextLine()) {
tc.add(fs.nextLine());
}
}
catch (IOException ex) {
ex.printStackTrace();
}
List<String> numbers = Arrays.asList(
"001", "228", "27484247", "7244967473642",
"."
);
tc.results();
for (String number : numbers) {
if (number.equals(".")) {
System.out.println();
}
else {
tc.match(number);
}
}
return;
}
}
- Output with "java RTextonyms ./unixdict.txt":
Read 25,104 words from ./unixdict.txt
There are 24,988 words in ./unixdict.txt which can be represented by the digit key mapping.
They require 22,905 digit combinations to represent them.
1,477 digit combinations represent Textonyms.
The numbers mapping to the most words map to 9 words each:
269 maps to: [amy, any, bmw, bow, box, boy, cow, cox, coy]
729 maps to: [paw, pax, pay, paz, raw, ray, saw, sax, say]
Key 001 not found
Key 228 matches: [aau, act, bat, cat]
Key 27484247 not found
Key 7244967473642 matches: [schizophrenia, schizophrenic]
jq
The following requires a version of jq with "gsub".
def textonym_value:
gsub("a|b|c|A|B|C"; "2")
| gsub("d|e|f|D|E|F"; "3")
| gsub("g|h|i|G|H|I"; "4")
| gsub("j|k|l|J|K|L"; "5")
| gsub("m|n|o|M|N|O"; "6")
| gsub("p|q|r|s|P|Q|R|S"; "7")
| gsub("t|u|v|T|U|V"; "8")
| gsub("w|x|y|z|W|X|Y|Z"; "9");
def explore:
# given an array (or hash), find the maximum length of the items (or values):
def max_length: [.[] | length] | max;
# The length of the longest textonym in the dictionary of numericString => array:
def longest:
[to_entries[] | select(.value|length > 1) | .key | length] | max;
# pretty-print a key-value pair:
def pp: "\(.key) maps to: \(.value|tostring)";
split("\n")
| map(select(test("^[a-zA-Z]+$"))) # select the strictly alphabetic strings
| length as $nwords
| reduce .[] as $line
( {};
($line | textonym_value) as $key
| .[$key] += [$line] )
| max_length as $max_length
| longest as $longest
| "There are \($nwords) words in the Textonyms/wordlist word list that can be represented by the digit-key mapping.",
"They require \(length) digit combinations to represent them.",
"\( [.[] | select(length>1) ] | length ) digit combinations represent Textonyms.",
"The numbers mapping to the most words map to \($max_length) words:",
(to_entries[] | select((.value|length) == $max_length) | pp ),
"The longest Textonyms in the word list have length \($longest):",
(to_entries[] | select((.key|length) == $longest and (.value|length > 1)) | pp)
;
explore
- Output:
$ jq -R -r -c -s -f textonyms.jq textonyms.txt
There are 13085 words in the Textonyms/wordlist word list that can be represented by the digit-key mapping.
They require 11932 digit combinations to represent them.
661 digit combinations represent Textonyms.
The numbers mapping to the most words map to 15 words:
27 maps to: ["AP","AQ","AR","AS","Ar","As","BP","BR","BS","Br","CP","CQ","CR","Cr","Cs"]
The longest Textonyms in the word list have length 11:
26456746242 maps to: ["Anglophobia","Anglophobic"]
24636272673 maps to: ["CinemaScope","Cinemascope"]
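For comparison, the same report logic (group words by key, then pick the key with the most words and the longest key that has more than one word) can be sketched in Python. This is only an illustrative sketch, not part of the jq entry: the function name `explore` and the shape of the returned dictionary are assumptions made here.

```python
import re
from collections import defaultdict

# Letter-to-digit table covering both cases, like the jq gsub chain.
TABLE = str.maketrans(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "22233344455566677778889999" * 2,
)


def explore(text):
    """Summarise a newline-separated word list (hypothetical helper)."""
    # Keep only strictly alphabetic lines, as the jq filter does.
    words = [w for w in text.split("\n") if re.fullmatch(r"[a-zA-Z]+", w)]
    groups = defaultdict(list)
    for w in words:
        groups[w.translate(TABLE)].append(w)
    max_length = max((len(v) for v in groups.values()), default=0)
    textonyms = {k: v for k, v in groups.items() if len(v) > 1}
    longest = max((len(k) for k in textonyms), default=0)
    return {
        "words": len(words),
        "keys": len(groups),
        "textonyms": len(textonyms),
        "most": {k: v for k, v in groups.items() if len(v) == max_length},
        "longest": {k: v for k, v in textonyms.items() if len(k) == longest},
    }


if __name__ == "__main__":
    sample = "Anglophobia\nAnglophobic\nAP\nAQ\nAR\ncat\n10th\n"
    print(explore(sample))
```

Returning a dictionary rather than printing keeps the counting separate from the report wording, so the same function can feed any of the report formats used on this page.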
Julia
This solution uses an aspell dictionary on the local machine as its word source. The character-to-number mapping is done via regular expressions and Julia's replace function. Because this list contains accented characters, the matching expressions were expanded to include such characters. Words are case sensitive, but the mapping is not, so for example both "Homer" and "homer" are included in the tabulation, each coded as "46637".
Function
using Printf
const tcode = (Regex=>Char)[r"A|B|C|Ä|Å|Á|Â|Ç" => '2',
r"D|E|F|È|Ê|É" => '3',
r"G|H|I|Í" => '4',
r"J|K|L" => '5',
r"M|N|O|Ó|Ö|Ô|Ñ" => '6',
r"P|Q|R|S" => '7',
r"T|U|V|Û|Ü" => '8',
r"W|X|Y|Z" => '9']
function tpad(str::IOStream)
tnym = (String=>Array{String,1})[]
for w in eachline(str)
w = chomp(w)
t = uppercase(w)
for (k,v) in tcode
t = replace(t, k, v)
end
t = replace(t, r"\D", '1')
tnym[t] = [get(tnym, t, String[]), w]
end
return tnym
end
Main
dname = "/usr/share/dict/american-english"
DF = open(dname, "r")
tnym = tpad(DF)
close(DF)
println("The character to digit mapping is done according to")
println("these regular expressions (following uppercase conversion):")
for k in sort(collect(keys(tcode)), by=x->tcode[x])
println(" ", tcode[k], " -> ", k)
end
println("Unmatched non-digit characters are mapped to 1")
println()
print("There are ", sum(map(x->length(x), values(tnym))))
println(" words in ", dname)
println(" which can be represented by the digit key mapping.")
print("They require ", length(keys(tnym)))
println(" digit combinations to represent them.")
print(sum(map(x->length(x)>1, values(tnym))))
println(" digit combinations represent Textonyms.")
println()
println("The degeneracies of telephone key encodings are:")
println(" Words Encoded Number of codes")
dgen = zeros(maximum(map(x->length(x), values(tnym))))
for v in values(tnym)
dgen[length(v)] += 1
end
for (i, d) in enumerate(dgen)
println(@sprintf "%10d %15d" i d)
end
println()
dgen = length(dgen) - 2
println("Codes mapping to ", dgen, " or more words:")
for (k, v) in tnym
dgen <= length(v) || continue
println(@sprintf "%7s (%2d) %s" k length(v) join(v, ", "))
end
- Output:
The character to digit mapping is done according to
these regular expressions (following uppercase conversion):
 2 -> r"A|B|C|Ä|Å|Á|Â|Ç"
 3 -> r"D|E|F|È|Ê|É"
 4 -> r"G|H|I|Í"
 5 -> r"J|K|L"
 6 -> r"M|N|O|Ó|Ö|Ô|Ñ"
 7 -> r"P|Q|R|S"
 8 -> r"T|U|V|Û|Ü"
 9 -> r"W|X|Y|Z"
Unmatched non-digit characters are mapped to 1

There are 99171 words in /usr/share/dict/american-english
 which can be represented by the digit key mapping.
They require 89353 digit combinations to represent them.
6860 digit combinations represent Textonyms.

The degeneracies of telephone key encodings are:
 Words Encoded Number of codes
         1           82493
         2            5088
         3            1104
         4             383
         5             159
         6              72
         7              24
         8              16
         9               8
        10               4
        11               1
        12               1

Codes mapping to 10 or more words:
    269 (11) Amy, BMW, Cox, Coy, any, bow, box, boy, cow, cox, coy
  22737 (12) acres, bards, barer, bares, barfs, baser, bases, caper, capes, cards, cares, cases
   2273 (10) Case, acre, bard, bare, barf, base, cape, card, care, case
  46637 (10) Homer, goner, goods, goofs, homer, homes, hones, hoods, hoofs, inner
   7217 (10) PA's, PC's, Pa's, Pb's, Ra's, Rb's, SC's, Sb's, Sc's, pa's
   4317 (10) GE's, Gd's, Ge's, HF's, He's, Hf's, ID's, he's, id's, if's
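The mapping rule described for the Julia entry (uppercase the word, map plain and accented letters to their key, and send every remaining non-digit character to 1) can be sketched in Python as follows. The names `GROUPS`, `CHAR_TO_DIGIT` and `encode` are hypothetical; the accented letters are the ones listed in the Julia regular expressions above.

```python
# Key groups, including the accented letters from the Julia entry.
GROUPS = {
    "2": "ABCÄÅÁÂÇ",
    "3": "DEFÈÊÉ",
    "4": "GHIÍ",
    "5": "JKL",
    "6": "MNOÓÖÔÑ",
    "7": "PQRS",
    "8": "TUVÛÜ",
    "9": "WXYZ",
}
CHAR_TO_DIGIT = {ch: digit for digit, chars in GROUPS.items() for ch in chars}


def encode(word):
    """Return the key sequence for word (hypothetical helper).

    Known letters map to their key, ASCII digits are kept as they are,
    and any other character (apostrophes, unlisted accents) becomes '1'.
    """
    out = []
    for ch in word.upper():
        if ch in CHAR_TO_DIGIT:
            out.append(CHAR_TO_DIGIT[ch])
        elif ch in "0123456789":
            out.append(ch)
        else:
            out.append("1")
    return "".join(out)


if __name__ == "__main__":
    for w in ("Homer", "homer", "PA's"):
        print(w, "->", encode(w))
```

Because nothing is rejected under this rule, every line of the dictionary contributes to the word count, which is why possessives such as "PA's" show up in the output above under a code containing 1.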
Kotlin
// version 1.1.4-3
import java.io.File
val wordList = "unixdict.txt"
val url = "http://www.puzzlers.org/pub/wordlists/unixdict.txt"
const val DIGITS = "22233344455566677778889999"
val map = mutableMapOf<String, MutableList<String>>()
fun processList() {
var countValid = 0
val f = File(wordList)
val sb = StringBuilder()
f.forEachLine { word->
var valid = true
sb.setLength(0)
for (c in word.toLowerCase()) {
if (c !in 'a'..'z') {
valid = false
break
}
sb.append(DIGITS[c - 'a'])
}
if (valid) {
countValid++
val key = sb.toString()
if (map.containsKey(key)) {
map[key]!!.add(word)
}
else {
map.put(key, mutableListOf(word))
}
}
}
var textonyms = map.filter { it.value.size > 1 }.toList()
val report = "There are $countValid words in '$url' " +
"which can be represented by the digit key mapping.\n" +
"They require ${map.size} digit combinations to represent them.\n" +
"${textonyms.size} digit combinations represent Textonyms.\n"
println(report)
val longest = textonyms.sortedByDescending { it.first.length }
val ambiguous = longest.sortedByDescending { it.second.size }
println("Top 8 in ambiguity:\n")
println("Count Textonym Words")
println("====== ======== =====")
var fmt = "%4d %-8s %s"
for (a in ambiguous.take(8)) println(fmt.format(a.second.size, a.first, a.second))
fmt = fmt.replace("8", "14")
println("\nTop 6 in length:\n")
println("Length Textonym Words")
println("====== ============== =====")
for (l in longest.take(6)) println(fmt.format(l.first.length, l.first, l.second))
}
fun main(args: Array<String>) {
processList()
}
- Output:
There are 24978 words in 'http://www.puzzlers.org/pub/wordlists/unixdict.txt' which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.

Top 8 in ambiguity:

Count   Textonym  Words
======  ========  =====
   9    269       [amy, any, bmw, bow, box, boy, cow, cox, coy]
   9    729       [paw, pax, pay, paz, raw, ray, saw, sax, say]
   8    2273      [acre, bard, bare, base, cape, card, care, case]
   8    726       [pam, pan, ram, ran, sam, san, sao, scm]
   7    4663      [gone, good, goof, home, hone, hood, hoof]
   7    7283      [pate, pave, rate, rave, saud, save, scud]
   7    426       [gam, gao, ham, han, ian, ibm, ibn]
   7    782       [pta, pub, puc, pvc, qua, rub, sub]

Top 6 in length:

Length  Textonym        Words
======  ==============  =====
  14    25287876746242  [claustrophobia, claustrophobic]
  13    7244967473642   [schizophrenia, schizophrenic]
  12    666628676342    [onomatopoeia, onomatopoeic]
  11    49376746242     [hydrophobia, hydrophobic]
  10    2668368466      [contention, convention]
  10    6388537663      [mettlesome, nettlesome]
Lua
-- Global variables
http = require("socket.http")
keys = {"VOICEMAIL", "abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"}
dictFile = "http://www.puzzlers.org/pub/wordlists/unixdict.txt"
-- Return the sequence of keys required to type a given word
function keySequence (str)
local sequence, noMatch, letter = ""
for pos = 1, #str do
letter = str:sub(pos, pos)
for i, chars in pairs(keys) do
noMatch = true
if chars:match(letter) then
sequence = sequence .. tostring(i)
noMatch = false
break
end
end
if noMatch then return nil end
end
return tonumber(sequence)
end
-- Generate table of words grouped by key sequence
function textonyms (dict)
local combTable, keySeq = {}
for word in dict:gmatch("%S+") do
keySeq = keySequence(word)
if keySeq then
if combTable[keySeq] then
table.insert(combTable[keySeq], word)
else
combTable[keySeq] = {word}
end
end
end
return combTable
end
-- Analyse sequence table and print details
function showReport (keySeqs)
local wordCount, seqCount, tCount = 0, 0, 0
for seq, wordList in pairs(keySeqs) do
wordCount = wordCount + #wordList
seqCount = seqCount + 1
if #wordList > 1 then tCount = tCount + 1 end
end
print("There are " .. wordCount .. " words in " .. dictFile)
print("which can be represented by the digit key mapping.")
print("They require " .. seqCount .. " digit combinations to represent them.")
print(tCount .. " digit combinations represent Textonyms.")
end
-- Main procedure
showReport(textonyms(http.request(dictFile)))
- Output:
There are 24983 words in http://www.puzzlers.org/pub/wordlists/unixdict.txt
which can be represented by the digit key mapping.
They require 22908 digit combinations to represent them.
1473 digit combinations represent Textonyms.
Mathematica/Wolfram Language
ClearAll[Numerify,rls]
rls={"A"->2,"B"->2,"C"->2,"D"->3,"E"->3,"F"->3,"G"->4,"H"->4,"I"->4,"J"->5,"K"->5,"L"->5,"M"->6,"N"->6,"O"->6,"P"->7,"Q"->7,"R"->7,"S"->7,"T"->8,"U"->8,"V"->8,"W"->9,"X"->9,"Y"->9,"Z"->9};
Numerify[s_String]:=Characters[ToUpperCase[s]]/.rls
dict=Once[Import["http://www.rosettacode.org/wiki/Textonyms/wordlist","XML"]];
dict=Cases[dict,XMLElement["pre",{},{x_}]:>x,\[Infinity]];
dict=TakeLargestBy[dict,ByteCount,1][[1]];
dict=DeleteDuplicates[StringTrim/*ToUpperCase/@StringSplit[dict]];
dict=Select[dict,StringMatchQ[(Alternatives@@Keys[rls])..]];
Print["Number of words from Textonyms/wordlist are: ",Length[dict]]
grouped=GroupBy[dict[[;;;;10]],Numerify];
Print["Number of unique numbers: ",Length[grouped]]
grouped=Select[grouped,Length/*GreaterThan[1]];
Print["Most with the same number:"]
KeyValueMap[List,TakeLargestBy[grouped,Length,1]]//Grid
Print["5 longest words with textonyms:"]
List@@@Normal[ReverseSortBy[grouped,First/*Length][[;;5]]]//Grid
- Output:
Number of words from Textonyms/wordlist are: 71125
Number of unique numbers: 7030
Most with the same number:
{2,6,6,6} {AMON,COMO,CONN,ANON}
5 longest words with textonyms:
{2,4,6,6,4,7,4,6,4} {CHONGQING,AGONISING}
{3,5,3,2,8,4,6,6} {EJECTION,ELECTION}
{2,8,7,8,4,3,7,8} {BUSTIEST,CURVIEST}
{2,8,7,3,8,8,3,7} {BURETTES,CURETTES}
{3,7,8,2,8,3,7} {EQUATES,ESTATES}
MiniScript
This solution assumes the Mini Micro environment (providing the listUtil and mapUtil modules, as well as the englishWords.txt file).
import "listUtil"
import "mapUtil"
groups = "abc def ghi jkl mno pqrs tuv wxyz".split
charToNum = {}
for i in groups.indexes
for ch in groups[i]
charToNum[ch] = i + 2
end for
end for
words = file.readLines("/sys/data/englishWords.txt")
wordToNum = function(word)
parts = word.split("")
parts.apply function(ch)
return charToNum[ch]
end function
return parts.join("")
end function
numToWords = {}
moreThan1Word = 0
for word in words
num = wordToNum(word.lower)
if numToWords.hasIndex(num) then
numToWords[num].push word
else
numToWords[num] = [word]
end if
if numToWords[num].len == 2 then moreThan1Word = moreThan1Word + 1
end for
print "There are " + words.len + " words in englishWords.txt which can be represented by the digit key mapping."
print "They require " + numToWords.len + " digit combinations to represent them."
print moreThan1Word + " digit combinations represent Textonyms."
while true
print
inp = input("Enter a word or digit combination: ")
if not inp then break
if val(inp) > 0 then
print inp + " -> " + numToWords.get(inp)
else
num = wordToNum(inp.lower)
print "Digit key combination for """ + inp + """ is: " + num
print num + " -> " + numToWords.get(num)
end if
end while
- Output:
There are 64664 words in englishWords.txt which can be represented by the digit key mapping.
They require 59148 digit combinations to represent them.
4028 digit combinations represent Textonyms.

Enter a word or digit combination: 2877464
2877464 -> ["burping", "bussing", "cupping", "cursing", "cussing"]

Enter a word or digit combination: phoning
Digit key combination for "phoning" is: 7466464
7466464 -> ["phoning", "pinning", "rimming", "shooing", "sinning"]
Nim
import algorithm, sequtils, strformat, strutils, tables
const
  WordList = "unixdict.txt"
  Url = "http://www.puzzlers.org/pub/wordlists/unixdict.txt"
  Digits = "22233344455566677778889999"
proc processList(wordFile: string) =
  var mapping: Table[string, seq[string]]
  var countValid = 0
  for word in wordFile.lines:
    var valid = true
    var key: string
    for c in word.toLowerAscii:
      if c notin 'a'..'z':
        valid = false
        break
      key.add Digits[ord(c) - ord('a')]
    if valid:
      inc countValid
      mapping.mgetOrPut(key, @[]).add word
  let textonyms = toSeq(mapping.pairs).filterIt(it[1].len > 1)
  echo &"There are {countValid} words in '{Url}' ",
       &"which can be represented by the digit key mapping."
  echo &"They require {mapping.len} digit combinations to represent them."
  echo &"{textonyms.len} digit combinations represent Textonyms.\n"
  let longest = textonyms.sortedByIt(-it[0].len)
  let ambiguous = longest.sortedByIt(-it[1].len)
  echo "Top 8 in ambiguity:\n"
  echo "Count Textonym Words"
  echo "====== ======== ====="
  for a in ambiguous[0..7]:
    echo &"""{a[1].len:4} {a[0]:>8} {a[1].join(", ")}"""
  echo "\nTop 6 in length:\n"
  echo "Length Textonym Words"
  echo "====== ============== ====="
  for l in longest[0..5]:
    echo &"""{l[0].len:4} {l[0]:>14} {l[1].join(", ")}"""
processList(WordList)
- Output:
There are 24978 words in 'http://www.puzzlers.org/pub/wordlists/unixdict.txt' which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.

Top 8 in ambiguity:

Count   Textonym  Words
======  ========  =====
   9         729  paw, pax, pay, paz, raw, ray, saw, sax, say
   9         269  amy, any, bmw, bow, box, boy, cow, cox, coy
   8        2273  acre, bard, bare, base, cape, card, care, case
   8         726  pam, pan, ram, ran, sam, san, sao, scm
   7        4663  gone, good, goof, home, hone, hood, hoof
   7        7283  pate, pave, rate, rave, saud, save, scud
   7         782  pta, pub, puc, pvc, qua, rub, sub
   7         426  gam, gao, ham, han, ian, ibm, ibn

Top 6 in length:

Length  Textonym        Words
======  ==============  =====
  14    25287876746242  claustrophobia, claustrophobic
  13     7244967473642  schizophrenia, schizophrenic
  12      666628676342  onomatopoeia, onomatopoeic
  11       49376746242  hydrophobia, hydrophobic
  10        2668368466  contention, convention
  10        6388537663  mettlesome, nettlesome
OCaml
module IntMap = Map.Make(Int)
let seq_lines ch =
let rec repeat () =
match input_line ch with
| s -> Seq.Cons (s, repeat)
| exception End_of_file -> Nil
in repeat
(* simply use bijective numeration in base 8 for keys *)
let key_of_char = function
| 'a' .. 'c' -> Some 1
| 'd' .. 'f' -> Some 2
| 'g' .. 'i' -> Some 3
| 'j' .. 'l' -> Some 4
| 'm' .. 'o' -> Some 5
| 'p' .. 's' -> Some 6
| 't' .. 'v' -> Some 7
| 'w' .. 'z' -> Some 8
| _ -> None
let keys_of_word =
let next k c =
Option.bind (key_of_char c) (fun d -> Option.map (fun k -> k lsl 3 + d) k)
in String.fold_left next (Some 0)
let update m k =
IntMap.update k (function Some n -> Some (succ n) | None -> Some 1) m
let map_from ch =
seq_lines ch |> Seq.filter_map keys_of_word |> Seq.fold_left update IntMap.empty
let count _ n (words, keys, txtns) =
words + n, succ keys, if n > 1 then succ txtns else txtns
let show src (words, keys, txtns) = Printf.printf "\
There are %u words in %s which can be represented by the digit key mapping.\n\
They require %u digit combinations to represent them.\n\
%u digit combinations represent Textonyms.\n" words src keys txtns
let () =
show "stdin" (IntMap.fold count (map_from stdin) (0, 0, 0))
- Output:
There are 24978 words in stdin which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.
...when fed unixdict.txt on standard input.
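The integer key used by the OCaml entry (bijective numeration in base 8, with each letter group contributing a digit from 1 to 8) can be illustrated with a short Python sketch. Because no group maps to 0, key sequences of different lengths can never collapse onto the same integer. The names `KEY`, `keys_of_word` and `summarize` are assumptions made for this illustration only.

```python
from collections import Counter

# Letter groups numbered 1..8 (not 2..9), as in the OCaml key_of_char.
GROUPS = ["abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"]
KEY = {ch: d for d, group in enumerate(GROUPS, start=1) for ch in group}


def keys_of_word(word):
    """Fold a lowercase word into one integer, or None if unmappable."""
    k = 0
    for c in word:
        d = KEY.get(c)
        if d is None:
            return None
        k = (k << 3) + d  # shift by 3 bits, i.e. multiply by 8
    return k


def summarize(words):
    """Return (word count, key count, textonym count) for mappable words."""
    counts = Counter(
        k for k in (keys_of_word(w) for w in words) if k is not None
    )
    textonyms = sum(1 for n in counts.values() if n > 1)
    return sum(counts.values()), len(counts), textonyms


if __name__ == "__main__":
    print(keys_of_word("a"), keys_of_word("aa"))
    print(summarize(["a", "b", "ad", "x1"]))
```

Like the OCaml version, this sketch stores only a count per key, which is enough for the three numbers in the report but not for listing the words themselves.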
Perl
my $src = 'unixdict.txt';
# filter word-file for valid input, transform to low-case
open $fh, "<", $src;
@words = grep { /^[a-zA-Z]+$/ } <$fh>;
map { tr/A-Z/a-z/ } @words;
# translate words to dials
map { tr/abcdefghijklmnopqrstuvwxyz/22233344455566677778889999/ } @dials = @words;
# get unique values (modify @dials) and non-unique ones (are textonyms)
@dials = grep {!$h{$_}++} @dials;
@textonyms = grep { $h{$_} > 1 } @dials;
print "There are @{[scalar @words]} words in '$src' which can be represented by the digit key mapping.
They require @{[scalar @dials]} digit combinations to represent them.
@{[scalar @textonyms]} digit combinations represent Textonyms.";
- Output:
There are 24978 words in 'unixdict.txt' which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.
Phix
with javascript_semantics
sequence digit = repeat(-1,255)
digit['a'..'c'] = '2'
digit['d'..'f'] = '3'
digit['g'..'i'] = '4'
digit['j'..'l'] = '5'
digit['m'..'o'] = '6'
digit['p'..'s'] = '7'
digit['t'..'v'] = '8'
digit['w'..'z'] = '9'
function digits(string word)
    string keycode = repeat(' ',length(word))
    for i=1 to length(word) do
        integer ch = word[i]
        assert(ch>='a' and ch<='z')
        keycode[i] = digit[ch]
    end for
    return {keycode,word}
end function
function az(string word)
    return min(word)>='a' and max(word)<='z'
end function
sequence words = apply(filter(unix_dict(),az),digits), max_idx, long_idx
string word, keycode, last = ""
integer keycode_count = 0, textonyms = 0,
        this_count = 0, max_count = 0, longest = 0
printf(1,"There are %d words in unixdict.txt which can be represented by the digit key mapping.\n",{length(words)})
-- Sort by keycode: while words are ordered we get
-- eg {"a","ab","b","ba"} -> {"2","22","2","22"}
words = sort(deep_copy(words))
for i=1 to length(words) do
    {keycode,word} = words[i]
    if keycode=last then
        textonyms += this_count=1
        this_count += 1
        if this_count>=max_count then
            if this_count>max_count then
                max_idx = {i}
            else
                max_idx &= i
            end if
            max_count = this_count
        end if
    else
        keycode_count += 1
        last = keycode
        this_count = 1
    end if
    if length(word)>=longest then
        if length(word)>longest then
            long_idx = {i}
        else
            long_idx &= i
        end if
        longest = length(word)
    end if
end for
printf(1,"They require %d digit combinations to represent them.\n",{keycode_count})
printf(1,"%d digit combinations represent Textonyms.\n",{textonyms})
printf(1,"The maximum number of textonyms for a particular digit key mapping is %d:\n",{max_count})
for i=1 to length(max_idx) do
    integer k = max_idx[i], l = k-max_count+1
    string dups = join(vslice(words[l..k],2),"/")
    printf(1," %s encodes %s\n",{words[k][1],dups})
end for
printf(1,"The longest words are %d characters long\n",longest)
printf(1,"Encodings with this length are:\n")
for i=1 to length(long_idx) do
    printf(1," %s encodes %s\n",words[long_idx[i]])
end for
- Output:
(my unixdict.txt seems to have grown by 4 entries sometime in the past couple of years...)
There are 24981 words in unixdict.txt which can be represented by the digit key mapping.
They require 22906 digit combinations to represent them.
1473 digit combinations represent Textonyms.
The maximum number of textonyms for a particular digit key mapping is 9:
 269 encodes amy/any/bmw/bow/box/boy/cow/cox/coy
 729 encodes paw/pax/pay/paz/raw/ray/saw/sax/say
The longest words are 22 characters long
Encodings with this length are:
 3532876362374256472749 encodes electroencephalography
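The Phix entry avoids a dictionary altogether: it sorts (keycode, word) pairs and counts runs of equal keycodes. A minimal Python sketch of that sort-and-scan idea is shown here, using `itertools.groupby` in place of the hand-written run tracking. The function name `scan_sorted` is hypothetical.

```python
from itertools import groupby

TABLE = str.maketrans(
    "abcdefghijklmnopqrstuvwxyz", "22233344455566677778889999"
)


def scan_sorted(words):
    """Return (word count, keycode count, textonym count).

    Only words made entirely of a-z are kept, mirroring the az() filter.
    """
    pairs = sorted(
        (w.translate(TABLE), w)
        for w in words
        if w.isascii() and w.isalpha() and w.islower()
    )
    keycode_count = 0
    textonyms = 0
    # After sorting, equal keycodes are adjacent, so each run is one key.
    for _keycode, run in groupby(pairs, key=lambda pair: pair[0]):
        size = sum(1 for _ in run)
        keycode_count += 1
        if size > 1:
            textonyms += 1
    return len(pairs), keycode_count, textonyms


if __name__ == "__main__":
    print(scan_sorted(["a", "ab", "b", "ba", "10th"]))
```

The sort is needed even when the word list is already alphabetical, for the reason given in the Phix comment: alphabetical word order does not imply keycode order.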
PowerShell
$url = "http://www.puzzlers.org/pub/wordlists/unixdict.txt"
$file = "$env:TEMP\unixdict.txt"
(New-Object System.Net.WebClient).DownloadFile($url, $file)
$unixdict = Get-Content -Path $file
[string]$alpha = "abcdefghijklmnopqrstuvwxyz"
[string]$digit = "22233344455566677778889999"
$table = [ordered]@{}
for ($i = 0; $i -lt $alpha.Length; $i++)
{
$table.Add($alpha[$i], $digit[$i])
}
$words = foreach ($word in $unixdict)
{
if ($word -match "^[a-z]*$")
{
[PSCustomObject]@{
Word = $word
Number = ($word.ToCharArray() | ForEach-Object {$table.$_}) -join ""
}
}
}
$digitCombinations = $words | Group-Object -Property Number
$textonyms = $digitCombinations | Where-Object -Property Count -GT 1 | Sort-Object -Property Count -Descending
Write-Host ("There are {0} words in {1} which can be represented by the digit key mapping." -f $words.Count, $url)
Write-Host ("They require {0} digit combinations to represent them." -f $digitCombinations.Count)
Write-Host ("{0} digit combinations represent Textonyms.`n" -f $textonyms.Count)
Write-Host "Top 5 in ambiguity:"
$textonyms | Select-Object -First 5 -Property Count,
@{Name="Textonym"; Expression={$_.Name}},
@{Name="Words" ; Expression={$_.Group.Word -join ", "}} | Format-Table -AutoSize
Write-Host "Top 5 in length:"
$textonyms | Sort-Object {$_.Name.Length} -Descending |
Select-Object -First 5 -Property @{Name="Length" ; Expression={$_.Name.Length}},
@{Name="Textonym"; Expression={$_.Name}},
@{Name="Words" ; Expression={$_.Group.Word -join ", "}} | Format-Table -AutoSize
Remove-Item -Path $file -Force -ErrorAction SilentlyContinue
- Output:
There are 24978 words in http://www.puzzlers.org/pub/wordlists/unixdict.txt which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.

Top 5 in ambiguity:

Count Textonym Words
----- -------- -----
    9 729      paw, pax, pay, paz, raw, ray, saw, sax, say
    9 269      amy, any, bmw, bow, box, boy, cow, cox, coy
    8 726      pam, pan, ram, ran, sam, san, sao, scm
    8 2273     acre, bard, bare, base, cape, card, care, case
    7 426      gam, gao, ham, han, ian, ibm, ibn

Top 5 in length:

Length Textonym       Words
------ --------       -----
    14 25287876746242 claustrophobia, claustrophobic
    13 7244967473642  schizophrenia, schizophrenic
    12 666628676342   onomatopoeia, onomatopoeic
    11 49376746242    hydrophobia, hydrophobic
    10 6388537663     mettlesome, nettlesome
Python
from collections import defaultdict
import urllib.request
CH2NUM = {ch: str(num) for num, chars in enumerate('abc def ghi jkl mno pqrs tuv wxyz'.split(), 2) for ch in chars}
URL = 'http://www.puzzlers.org/pub/wordlists/unixdict.txt'
def getwords(url):
    return urllib.request.urlopen(url).read().decode("utf-8").lower().split()
def mapnum2words(words):
    number2words = defaultdict(list)
    reject = 0
    for word in words:
        try:
            number2words[''.join(CH2NUM[ch] for ch in word)].append(word)
        except KeyError:
            # Reject words with non a-z e.g. '10th'
            reject += 1
    return dict(number2words), reject
def interactiveconversions():
    global inp, ch, num
    while True:
        inp = input("\nType a number or a word to get the translation and textonyms: ").strip().lower()
        if inp:
            if all(ch in '23456789' for ch in inp):
                if inp in num2words:
                    print(" Number {0} has the following textonyms in the dictionary: {1}".format(inp, ', '.join(
                        num2words[inp])))
                else:
                    print(" Number {0} has no textonyms in the dictionary.".format(inp))
            elif all(ch in CH2NUM for ch in inp):
                num = ''.join(CH2NUM[ch] for ch in inp)
                print(" Word {0} is{1} in the dictionary and is number {2} with textonyms: {3}".format(
                    inp, ('' if inp in wordset else "n't"), num, ', '.join(num2words[num])))
            else:
                print(" I don't understand %r" % inp)
        else:
            print("Thank you")
            break
if __name__ == '__main__':
    words = getwords(URL)
    print("Read %i words from %r" % (len(words), URL))
    wordset = set(words)
    num2words, reject = mapnum2words(words)
    morethan1word = sum(1 for w in num2words if len(num2words[w]) > 1)
    maxwordpernum = max(len(values) for values in num2words.values())
    print("""
There are {0} words in {1} which can be represented by the Textonyms mapping.
They require {2} digit combinations to represent them.
{3} digit combinations represent Textonyms.\
""".format(len(words) - reject, URL, len(num2words), morethan1word))
    print("\nThe numbers mapping to the most words map to %i words each:" % maxwordpernum)
    maxwpn = sorted((key, val) for key, val in num2words.items() if len(val) == maxwordpernum)
    for num, wrds in maxwpn:
        print(" %s maps to: %s" % (num, ', '.join(wrds)))
    interactiveconversions()
- Output:
Read 25104 words from 'http://www.puzzlers.org/pub/wordlists/unixdict.txt'

There are 24978 words in http://www.puzzlers.org/pub/wordlists/unixdict.txt which can be represented by the Textonyms mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.

The numbers mapping to the most words map to 9 words each:
 269 maps to: amy, any, bmw, bow, box, boy, cow, cox, coy
 729 maps to: paw, pax, pay, paz, raw, ray, saw, sax, say

Type a number or a word to get the translation and textonyms: rosetta
 Word rosetta is in the dictionary and is number 7673882 with textonyms: rosetta

Type a number or a word to get the translation and textonyms: code
 Word code is in the dictionary and is number 2633 with textonyms: bode, code, coed

Type a number or a word to get the translation and textonyms: 2468
 Number 2468 has the following textonyms in the dictionary: ainu, chou

Type a number or a word to get the translation and textonyms: 3579
 Number 3579 has no textonyms in the dictionary.

Type a number or a word to get the translation and textonyms:
Thank you
Racket
This version allows digits to be used (since you can usually enter them through an SMS-style keypad). unixdict.txt has words like 2nd, which would not be valid using letters only but are textable.
#lang racket
(module+ test (require tests/eli-tester))
(module+ test
(test
(map char->sms-digit (string->list "ABCDEFGHIJKLMNOPQRSTUVWXYZ."))
=> (list 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 7 8 8 8 9 9 9 9 #f)))
(define char->sms-digit
(match-lambda
[(? char-lower-case? (app char-upcase C)) (char->sms-digit C)]
;; Digits, too, can be entered on a text pad!
[(? char-numeric? (app char->integer c)) (- c (char->integer #\0))]
[(or #\A #\B #\C) 2]
[(or #\D #\E #\F) 3]
[(or #\G #\H #\I) 4]
[(or #\J #\K #\L) 5]
[(or #\M #\N #\O) 6]
[(or #\P #\Q #\R #\S) 7]
[(or #\T #\U #\V) 8]
[(or #\W #\X #\Y #\Z) 9]
[_ #f]))
(module+ test
(test
(word->textonym "criticisms") => 2748424767
(word->textonym "Briticisms") => 2748424767
(= (word->textonym "Briticisms") (word->textonym "criticisms"))))
(define (word->textonym w)
(for/fold ((n 0)) ((s (sequence-map char->sms-digit (in-string w))) #:final (not s))
(and s (+ (* n 10) s))))
(module+ test
(test
((cons-uniquely 'a) null) => '(a)
((cons-uniquely 'a) '(b)) => '(a b)
((cons-uniquely 'a) '(a b c)) => '(a b c)))
(define ((cons-uniquely a) d)
(if (member a d) d (cons a d)))
(module+ test
(test
(with-input-from-string "criticisms" port->textonym#) =>
(values 1 (hash 2748424767 '("criticisms")))
(with-input-from-string "criticisms\nBriticisms" port->textonym#) =>
(values 2 (hash 2748424767 '("Briticisms" "criticisms")))
(with-input-from-string "oh-no!-dashes" port->textonym#) =>
(values 0 (hash))))
(define (port->textonym#)
(for/fold
((n 0) (t# (hash)))
((w (in-port read-line)))
(define s (word->textonym w))
(if s
(values (+ n 1) (hash-update t# s (cons-uniquely w) null))
(values n t#))))
(define (report-on-file f-name)
(define-values (n-words textonym#) (with-input-from-file f-name port->textonym#))
(define n-textonyms (for/sum ((v (in-hash-values textonym#)) #:when (> (length v) 1)) 1))
(printf "--- report on ~s ends ---~%" f-name)
(printf
#<<EOS
There are ~a words in ~s which can be represented by the digit key mapping.
They require ~a digit combinations to represent them.
~a digit combinations represent Textonyms.
EOS
n-words f-name (hash-count textonym#) n-textonyms)
;; Show all the 6+ textonyms
(newline)
(for (((k v) (in-hash textonym#)) #:when (>= (length v) 6)) (printf "~a -> ~s~%" k v))
(printf "--- report on ~s ends ---~%" f-name))
(module+ main
(report-on-file "data/unixdict.txt"))
- Output:
--- report on "data/unixdict.txt" ends ---
There are 24988 words in "data/unixdict.txt" which can be represented by the digit key mapping.
They require 22905 digit combinations to represent them.
1477 digit combinations represent Textonyms.
226 -> ("can" "cam" "ban" "bam" "acm" "abo")
269 -> ("coy" "cox" "cow" "boy" "box" "bow" "bmw" "any" "amy")
426 -> ("ibn" "ibm" "ian" "han" "ham" "gao" "gam")
529 -> ("lay" "lax" "law" "kay" "jay" "jaw")
627 -> ("oar" "ncr" "nbs" "nap" "mar" "map")
729 -> ("say" "sax" "saw" "ray" "raw" "paz" "pay" "pax" "paw")
726 -> ("scm" "sao" "san" "sam" "ran" "ram" "pan" "pam")
782 -> ("sub" "rub" "qua" "pvc" "puc" "pub" "pta")
786 -> ("sun" "sum" "run" "rum" "quo" "pun")
843 -> ("vie" "vhf" "uhf" "tie" "tid" "the")
2273 -> ("case" "care" "card" "cape" "base" "bare" "bard" "acre")
2253 -> ("calf" "cake" "bale" "bald" "bake" "able")
2666 -> ("coon" "conn" "boon" "boom" "bonn" "ammo")
4663 -> ("hoof" "hood" "hone" "home" "goof" "good" "gone")
7283 -> ("scud" "save" "saud" "rave" "rate" "pave" "pate")
7243 -> ("said" "sage" "raid" "rage" "paid" "page")
7325 -> ("seal" "reck" "real" "peck" "peal" "peak")
7673 -> ("sore" "rose" "rope" "pose" "pore" "pope")
--- report on "data/unixdict.txt" ends ---
1 test passed
3 tests passed
3 tests passed
3 tests passed
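The Racket entry's rule that digits are textable and stand for themselves can be sketched in Python as below. As in the Racket code, the key is accumulated as an integer, and a word containing any other character yields no key at all. The names `LETTER_TO_DIGIT` and `word_to_textonym` are assumptions made for this sketch.

```python
KEYPAD = {
    2: "abc", 3: "def", 4: "ghi", 5: "jkl",
    6: "mno", 7: "pqrs", 8: "tuv", 9: "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def word_to_textonym(word):
    """Return the integer key for word, or None if it is not textable.

    Letters map through the keypad (case-insensitively); ASCII digits
    map to themselves; anything else makes the word untextable.
    """
    n = 0
    for ch in word.lower():
        if ch in "0123456789":
            d = int(ch)
        elif ch in LETTER_TO_DIGIT:
            d = LETTER_TO_DIGIT[ch]
        else:
            return None
        n = n * 10 + d
    return n


if __name__ == "__main__":
    for w in ("criticisms", "Briticisms", "2nd", "oh-no!-dashes"):
        print(w, "->", word_to_textonym(w))
```

Accepting digits is what raises the word count for unixdict.txt from 24978 in the letters-only entries to 24988 here. One consequence of using an integer key in this sketch is that leading zeros are not preserved, so a string key would be the safer choice if words starting with 0 matter.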
Raku
(formerly Perl 6)
my $src = 'unixdict.txt';
my @words = slurp($src).lines.grep(/ ^ <alpha>+ $ /);
my @dials = @words.classify: {
.trans('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
=> '2223334445556667777888999922233344455566677778889999');
}
my @textonyms = @dials.grep(*.value > 1);
say qq:to 'END';
There are {+@words} words in $src which can be represented by the digit key mapping.
They require {+@dials} digit combinations to represent them.
{+@textonyms} digit combinations represent Textonyms.
END
say "Top 5 in ambiguity:";
say " ",$_ for @textonyms.sort(-*.value)[^5];
say "\nTop 5 in length:";
say " ",$_ for @textonyms.sort(-*.key.chars)[^5];
- Output:
There are 24978 words in unixdict.txt which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.

Top 5 in ambiguity:
 269 => amy any bmw bow box boy cow cox coy
 729 => paw pax pay paz raw ray saw sax say
 2273 => acre bard bare base cape card care case
 726 => pam pan ram ran sam san sao scm
 426 => gam gao ham han ian ibm ibn

Top 5 in length:
 25287876746242 => claustrophobia claustrophobic
 7244967473642 => schizophrenia schizophrenic
 666628676342 => onomatopoeia onomatopoeic
 49376746242 => hydrophobia hydrophobic
 2668368466 => contention convention
REXX
Extra code was added to detect and display a count of illegal words (words not representable by the key digits), and
also of duplicate words in the dictionary.
/*REXX program counts and displays the number of textonyms that are in a dictionary file*/
parse arg iFID . /*obtain optional fileID from the C.L. */
if iFID=='' | iFID=="," then iFID='UNIXDICT.TXT' /*Not specified? Then use the default.*/
@.= 0 /*the placeholder of digit combinations*/
!.=; $.= /*sparse array of textonyms; words. */
alphabet= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' /*the supported alphabet to be used. */
digitKey= 22233344455566677778889999 /*translated alphabet to digit keys. */
digKey= 0; #word= 0 /*number digit combinations; word count*/
ills= 0 ; dups= 0; longest= 0; mostus= 0 /*illegals; duplicated words; longest..*/
first=. ; last= .; long= 0; most= 0 /*first, last, longest, most counts. */
call linein iFID, 1, 0 /*point to the first char in dictionary*/
#= 0 /*number of textonyms in file (so far).*/
do while lines(iFID)\==0; x= linein(iFID) /*keep reading the file until exhausted*/
y= x; upper x /*save a copy of X; uppercase X. */
if \datatype(x, 'U') then do; ills= ills + 1; iterate; end /*Not legal? Skip.*/
if $.x==. then do; dups= dups + 1; iterate; end /*Duplicate? Skip.*/
$.x= . /*indicate that it's a righteous word. */
#word= #word + 1 /*bump the word count (for the file). */
z= translate(x, digitKey, alphabet) /*build a translated digit key word. */
@.z= @.z + 1 /*flag that the digit key word exists. */
!.z= !.z y; _= words(!.z) /*build list of equivalent digit key(s)*/
if _>most then do; mostus= z; most= _; end /*remember the "mostus" digit keys. */
if @.z==2 then do; #= # + 1 /*bump the count of the textonyms. */
if first==. then first=z /*the first textonym found. */
last= z /* " last " " */
_= length(!.z) /*the length (# chars) of the digit key*/
if _>longest then long= z /*is this the longest textonym ? */
longest= max(_, longest) /*now, use this length as a target/goal*/
end /* [↑] discretionary (extra credit). */
if @.z==1 then digKey= digKey + 1 /*bump the count of digit key words. */
end /*while*/
@dict= 'in the dictionary file' /*literal used for some displayed text.*/
L= length(commas(max(#word,ills,dups,digKey,#))) /*find length of max # being displayed.*/
say 'The dictionary file being used is: ' iFID
say
call tell #word, 'words' @dict,
"which can be represented by digit key mapping"
if ills>0 then call tell ills, 'word's(ills) "that contain illegal characters" @dict
if dups>0 then call tell dups, 'duplicate word's(dups) "detected" @dict
call tell digKey, 'combination's(digKey) "required to represent them"
call tell #, 'digit combination's(#) "that can represent Textonyms"
say
if first \== . then say ' first digit key=' !.first
if last \== . then say ' last digit key=' !.last
if long \== 0 then say ' longest digit key=' !.long
if most \== 0 then say ' numerous digit key=' !.mostus " ("most 'words)'
exit # /*stick a fork in it, we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
commas: parse arg _; do jc=length(_)-3 to 1 by -3; _=insert(',', _, jc); end; return _
tell: arg ##; say 'There are ' right(commas(##), L)' ' arg(2).; return /*commatize #*/
s: if arg(1)==1 then return ''; return "s" /*a simple pluralizer.*/
- output when using the default input file:
The dictionary file being used is: UNIXDICT.TXT

There are 24,978 words in the dictionary file which can be represented by digit key mapping.
There are 126 words that contain illegal characters in the dictionary file.
There are 22,903 combinations required to represent them.
There are 1,473 digit combinations that can represent Textonyms.

first digit key= aaa aba abc cab
last digit key= woe zoe
longest digit key= claustrophobia claustrophobic
numerous digit key= amy any bmw bow box boy cow cox coy (9 words)
- output when using the input file: textonyms.txt
The dictionary file being used is: TEXTONYMS.TXT

There are 12,990 words in the dictionary file which can be represented by digit key mapping.
There are 95 duplicate words detected in the dictionary file.
There are 11,932 combinations required to represent them.
There are 650 digit combinations that can represent Textonyms.

first digit key= AA AB AC BA BB BC CA CB
last digit key= Phillip Phillis
longest digit key= Anglophobia Anglophobic
numerous digit key= AP AQ AR AS BP BR BS CP CQ CR Cs (11 words)
Ruby
CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
NUMS = "22233344455566677778889999" * 2
dict = "unixdict.txt"
# Keep only words made up entirely of letters; anything else has no digit key representation.
words = File.readlines(dict, chomp: true).grep(/\A[a-zA-Z]+\z/)
textonyms = words.group_by {|word| word.tr(CHARS, NUMS) }
puts "There are #{words.size} words in #{dict} which can be represented by the digit key mapping.
They require #{textonyms.size} digit combinations to represent them.
#{textonyms.count{|_,v| v.size > 1}} digit combinations represent Textonyms."
puts "\n25287876746242: #{textonyms["25287876746242"].join(", ")}"
- Output:
There are 24978 words in unixdict.txt which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.

25287876746242: claustrophobia, claustrophobic
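The first figure of the report depends on deciding which words the keypad can represent at all. The following Python sketch is illustrative only and is not one of the solutions on this page; the names `LETTER_TO_DIGIT` and `digit_key` are hypothetical. It assumes an ASCII word list and treats a word as representable only when every character is a letter found on the keypad from the task description.

```python
# Keypad layout taken from the task description.
KEYPAD = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
          "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}

# Map each ASCII letter, in both cases, to its digit.
LETTER_TO_DIGIT = {}
for digit, letters in KEYPAD.items():
    for letter in letters:
        LETTER_TO_DIGIT[letter] = digit
        LETTER_TO_DIGIT[letter.lower()] = digit


def digit_key(word):
    """Return the digit combination for word, or None if it cannot be typed."""
    digits = []
    for ch in word:
        digit = LETTER_TO_DIGIT.get(ch)
        if digit is None:          # apostrophes, digits, accents, ...
            return None
        digits.append(digit)
    # An empty line is not a word, so it has no digit key either.
    return "".join(digits) if digits else None


if __name__ == "__main__":
    print(digit_key("criticisms"))  # 2748424767
    print(digit_key("Briticisms"))  # 2748424767
    print(digit_key("it's"))        # None
```

Returning `None` instead of passing unknown characters through keeps words such as "it's" out of both the word count and the grouping.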
Rust
use std::collections::HashMap;
use std::fs::File;
use std::io::{self, BufRead};
fn text_char(ch: char) -> Option<char> {
match ch {
'a' | 'b' | 'c' => Some('2'),
'd' | 'e' | 'f' => Some('3'),
'g' | 'h' | 'i' => Some('4'),
'j' | 'k' | 'l' => Some('5'),
'm' | 'n' | 'o' => Some('6'),
'p' | 'q' | 'r' | 's' => Some('7'),
't' | 'u' | 'v' => Some('8'),
'w' | 'x' | 'y' | 'z' => Some('9'),
_ => None,
}
}
fn text_string(s: &str) -> Option<String> {
let mut text = String::with_capacity(s.len());
for c in s.chars() {
if let Some(t) = text_char(c) {
text.push(t);
} else {
return None;
}
}
Some(text)
}
fn print_top_words(textonyms: &Vec<(&String, &Vec<String>)>, top: usize) {
for (text, words) in textonyms.iter().take(top) {
println!("{} = {}", text, words.join(", "));
}
}
fn find_textonyms(filename: &str) -> std::io::Result<()> {
let file = File::open(filename)?;
let mut table = HashMap::new();
let mut count = 0;
for line in io::BufReader::new(file).lines() {
let mut word = line?;
word.make_ascii_lowercase();
if let Some(text) = text_string(&word) {
let words = table.entry(text).or_insert(Vec::new());
words.push(word);
count += 1;
}
}
let mut textonyms: Vec<(&String, &Vec<String>)> =
table.iter().filter(|x| x.1.len() > 1).collect();
println!(
"There are {} words in '{}' which can be represented by the digit key mapping.",
count, filename
);
println!(
"They require {} digit combinations to represent them.",
table.len()
);
println!(
"{} digit combinations represent Textonyms.",
textonyms.len()
);
let top = std::cmp::min(5, textonyms.len());
textonyms.sort_by_key(|x| (std::cmp::Reverse(x.1.len()), x.0));
println!("\nTop {} by number of words:", top);
print_top_words(&textonyms, top);
textonyms.sort_by_key(|x| (std::cmp::Reverse(x.0.len()), x.0));
println!("\nTop {} by length:", top);
print_top_words(&textonyms, top);
Ok(())
}
fn main() {
let args: Vec<String> = std::env::args().collect();
if args.len() != 2 {
eprintln!("usage: {} word-list", args[0]);
std::process::exit(1);
}
match find_textonyms(&args[1]) {
Ok(()) => {}
Err(error) => eprintln!("{}: {}", args[1], error),
}
}
- Output:
There are 24978 words in 'unixdict.txt' which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.

Top 5 by number of words:
269 = amy, any, bmw, bow, box, boy, cow, cox, coy
729 = paw, pax, pay, paz, raw, ray, saw, sax, say
2273 = acre, bard, bare, base, cape, card, care, case
726 = pam, pan, ram, ran, sam, san, sao, scm
426 = gam, gao, ham, han, ian, ibm, ibn

Top 5 by length:
25287876746242 = claustrophobia, claustrophobic
7244967473642 = schizophrenia, schizophrenic
666628676342 = onomatopoeia, onomatopoeic
49376746242 = hydrophobia, hydrophobic
2668368466 = contention, convention
Sidef
var words = ARGF.grep(/^[[:alpha:]]+\z/);
var dials = words.group_by {
.tr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ',
'2223334445556667777888999922233344455566677778889999');
}
var textonyms = dials.grep_v { .len > 1 };
say <<-END;
There are #{words.len} words which can be represented by the digit key mapping.
They require #{dials.len} digit combinations to represent them.
#{textonyms.len} digit combinations represent Textonyms.
END
say "Top 5 in ambiguity:";
say textonyms.sort_by { |_,v| -v.len }.first(5).join("\n");
say "\nTop 5 in length:";
say textonyms.sort_by { |k,_| -k.len }.first(5).join("\n");
- Output:
$ sidef textonyms.sf < unixdict.txt
There are 24978 words which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.

Top 5 in ambiguity:
["729", ["paw", "pax", "pay", "paz", "raw", "ray", "saw", "sax", "say"]]
["269", ["amy", "any", "bmw", "bow", "box", "boy", "cow", "cox", "coy"]]
["2273", ["acre", "bard", "bare", "base", "cape", "card", "care", "case"]]
["726", ["pam", "pan", "ram", "ran", "sam", "san", "sao", "scm"]]
["782", ["pta", "pub", "puc", "pvc", "qua", "rub", "sub"]]

Top 5 in length:
["25287876746242", ["claustrophobia", "claustrophobic"]]
["7244967473642", ["schizophrenia", "schizophrenic"]]
["666628676342", ["onomatopoeia", "onomatopoeic"]]
["49376746242", ["hydrophobia", "hydrophobic"]]
["2668368466", ["contention", "convention"]]
Swift
import Foundation
func textCharacter(_ ch: Character) -> Character? {
switch (ch) {
case "a", "b", "c":
return "2"
case "d", "e", "f":
return "3"
case "g", "h", "i":
return "4"
case "j", "k", "l":
return "5"
case "m", "n", "o":
return "6"
case "p", "q", "r", "s":
return "7"
case "t", "u", "v":
return "8"
case "w", "x", "y", "z":
return "9"
default:
return nil
}
}
func textString(_ string: String) -> String? {
var result = String()
result.reserveCapacity(string.count)
for ch in string {
if let tch = textCharacter(ch) {
result.append(tch)
} else {
return nil
}
}
return result
}
func compareByWordCount(pair1: (key: String, value: [String]),
pair2: (key: String, value: [String])) -> Bool {
if pair1.value.count == pair2.value.count {
return pair1.key < pair2.key
}
return pair1.value.count > pair2.value.count
}
func compareByTextLength(pair1: (key: String, value: [String]),
pair2: (key: String, value: [String])) -> Bool {
if pair1.key.count == pair2.key.count {
return pair1.key < pair2.key
}
return pair1.key.count > pair2.key.count
}
func findTextonyms(_ path: String) throws {
var dict = Dictionary<String, [String]>()
let contents = try String(contentsOfFile: path, encoding: String.Encoding.ascii)
var count = 0
for line in contents.components(separatedBy: "\n") {
if line.isEmpty {
continue
}
let word = line.lowercased()
if let text = textString(word) {
dict[text, default: []].append(word)
count += 1
}
}
var textonyms = Array(dict.filter{$0.1.count > 1})
print("There are \(count) words in '\(path)' which can be represented by the digit key mapping.")
print("They require \(dict.count) digit combinations to represent them.")
print("\(textonyms.count) digit combinations represent Textonyms.")
let top = min(5, textonyms.count)
print("\nTop \(top) by number of words:")
textonyms.sort(by: compareByWordCount)
for (text, words) in textonyms.prefix(top) {
print("\(text) = \(words.joined(separator: ", "))")
}
print("\nTop \(top) by length:")
textonyms.sort(by: compareByTextLength)
for (text, words) in textonyms.prefix(top) {
print("\(text) = \(words.joined(separator: ", "))")
}
}
do {
try findTextonyms("unixdict.txt")
} catch {
print(error.localizedDescription)
}
- Output:
There are 24978 words in 'unixdict.txt' which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.

Top 5 by number of words:
269 = amy, any, bmw, bow, box, boy, cow, cox, coy
729 = paw, pax, pay, paz, raw, ray, saw, sax, say
2273 = acre, bard, bare, base, cape, card, care, case
726 = pam, pan, ram, ran, sam, san, sao, scm
426 = gam, gao, ham, han, ian, ibm, ibn

Top 5 by length:
25287876746242 = claustrophobia, claustrophobic
7244967473642 = schizophrenia, schizophrenic
666628676342 = onomatopoeia, onomatopoeic
49376746242 = hydrophobia, hydrophobic
2668368466 = contention, convention
Tcl
set keymap {
2 -> ABC
3 -> DEF
4 -> GHI
5 -> JKL
6 -> MNO
7 -> PQRS
8 -> TUV
9 -> WXYZ
}
set url http://www.puzzlers.org/pub/wordlists/unixdict.txt
set report {
There are %1$s words in %2$s which can be represented by the digit key mapping.
They require %3$s digit combinations to represent them.
%4$s digit combinations represent Textonyms.
A %5$s-letter textonym which has %6$s combinations is %7$s:
%8$s
}
package require http
proc geturl {url} {
try {
set tok [http::geturl $url]
return [http::data $tok]
} finally {
http::cleanup $tok
}
}
proc main {keymap url} {
foreach {digit -> letters} $keymap {
foreach l [split $letters ""] {
dict set strmap $l $digit
}
}
set doc [geturl $url]
foreach word [split $doc \n] {
if {![string is alpha -strict $word]} continue
dict lappend words [string map $strmap [string toupper $word]] $word
}
set ncombos [dict size $words]
set nwords 0
set ntextos 0
set nmax 0
set dmax ""
dict for {d ws} $words {
set n [llength $ws]
incr nwords $n
if {$n > 1} {
# count the digit combination once, not once per word it covers
incr ntextos
}
if {$n >= $nmax && [string length $d] > [string length $dmax]} {
set nmax $n
set dmax $d
}
}
set maxwords [dict get $words $dmax]
# length of the digit combination itself, in keypresses
set lenmax [string length $dmax]
format $::report $nwords $url $ncombos $ntextos $lenmax $nmax $dmax $maxwords
}
puts [main $keymap $url]
- Output:
There are 24978 words in http://www.puzzlers.org/pub/wordlists/unixdict.txt which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.

A 4-letter textonym which has 6 combinations is 2253:
able bake bald bale cake calf
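The report asks for the number of digit combinations that stand for more than one word, which is a different quantity from the number of words those combinations cover. A minimal Python sketch of the four tallies, run on a small hand-picked word list rather than unixdict.txt, makes the distinction concrete. The function name `report_counts` and the sample list are hypothetical, and the sketch assumes ASCII input.

```python
from collections import defaultdict

# Digits for a..z in alphabetical order, as used by several solutions on this page.
KEYS = "22233344455566677778889999"
TABLE = str.maketrans("abcdefghijklmnopqrstuvwxyz", KEYS)


def report_counts(words):
    """Return (words, combinations, textonym combinations, words covered by textonyms)."""
    groups = defaultdict(list)
    for word in words:
        lowered = word.lower()
        # Skip anything that is not purely ASCII letters.
        if lowered.isascii() and lowered.isalpha():
            groups[lowered.translate(TABLE)].append(word)

    n_words = sum(len(ws) for ws in groups.values())
    n_combos = len(groups)
    # A combination is a textonym when it maps to more than one word.
    n_textonyms = sum(1 for ws in groups.values() if len(ws) > 1)
    # This larger figure counts words, not combinations.
    n_words_covered = sum(len(ws) for ws in groups.values() if len(ws) > 1)
    return n_words, n_combos, n_textonyms, n_words_covered


if __name__ == "__main__":
    sample = ["amy", "any", "bow", "acre", "bard", "hello", "it's"]
    print(report_counts(sample))  # (6, 3, 2, 5)
```

In the sample, 269 and 2273 are the two textonym combinations, and together they cover five words; only the first of these two figures belongs in the report.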
VBScript
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objInFile = objFSO.OpenTextFile(objFSO.GetParentFolderName(WScript.ScriptFullName) &_
"\unixdict.txt",1)
Set objKeyMap = CreateObject("Scripting.Dictionary")
With objKeyMap
.Add "ABC", "2" : .Add "DEF", "3" : .Add "GHI", "4" : .Add "JKL", "5"
.Add "MNO", "6" : .Add "PQRS", "7" : .Add "TUV", "8" : .Add "WXYZ", "9"
End With
'Instantiate or Initialize Counters
TotalWords = 0
UniqueCombinations = 0
Set objUniqueWords = CreateObject("Scripting.Dictionary")
Set objMoreThanOneWord = CreateObject("Scripting.Dictionary")
Do Until objInFile.AtEndOfStream
Word = objInFile.ReadLine
c = 0
Num = ""
If Word <> "" Then
For i = 1 To Len(Word)
For Each Key In objKeyMap.Keys
If InStr(1,Key,Mid(Word,i,1),1) > 0 Then
Num = Num & objKeyMap.Item(Key)
c = c + 1
End If
Next
Next
If c = Len(Word) Then
TotalWords = TotalWords + 1
If objUniqueWords.Exists(Num) = False Then
objUniqueWords.Add Num, ""
UniqueCombinations = UniqueCombinations + 1
Else
If objMoreThanOneWord.Exists(Num) = False Then
objMoreThanOneWord.Add Num, ""
End If
End If
End If
End If
Loop
WScript.Echo "There are " & TotalWords & " words in ""unixdict.txt"" which can be represented by the digit key mapping." & vbCrLf &_
"They require " & UniqueCombinations & " digit combinations to represent them." & vbCrLf &_
objMoreThanOneWord.Count & " digit combinations represent Textonyms."
objInFile.Close
- Output:
There are 24978 words in "unixdict.txt" which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.
Wren
import "io" for File
import "./str" for Char, Str
import "./sort" for Sort
import "./fmt" for Fmt
var wordList = "unixdict.txt"
var DIGITS = "22233344455566677778889999"
var map = {}
var countValid = 0
var words = File.read(wordList).trimEnd().split("\n")
for (word in words) {
var valid = true
var sb = ""
for (c in Str.lower(word)) {
if (!Char.isLower(c)) {
valid = false
break
}
sb = sb + DIGITS[Char.code(c) - 97]
}
if (valid) {
countValid = countValid + 1
if (map.containsKey(sb)) {
map[sb].add(word)
} else {
map[sb] = [word]
}
}
}
var textonyms = map.toList.where { |me| me.value.count > 1 }.toList
var report = "There are %(countValid) words in '%(wordList)' " +
"which can be represented by the digit key mapping.\n" +
"They require %(map.count) digit combinations to represent them.\n" +
"%(textonyms.count) digit combinations represent Textonyms.\n"
System.print(report)
var longest = Sort.merge(textonyms) { |i, j| (j.key.count - i.key.count).sign }
var ambiguous = Sort.merge(longest) { |i, j| (j.value.count - i.value.count).sign }
System.print("Top 8 in ambiguity: \n")
System.print("Count Textonym Words")
System.print("====== ======== =====")
var f = "$4d $-8s $s"
for (a in ambiguous.take(8)) Fmt.print(f, a.value.count, a.key, a.value)
f = f.replace("8", "14")
System.print("\nTop 6 in length:\n")
System.print("Length Textonym Words")
System.print("====== ============== =====")
for (l in longest.take(6)) Fmt.print(f, l.key.count, l.key, l.value)
- Output:
There are 24978 words in 'unixdict.txt' which can be represented by the digit key mapping.
They require 22903 digit combinations to represent them.
1473 digit combinations represent Textonyms.

Top 8 in ambiguity:

Count  Textonym Words
====== ======== =====
     9 269      amy any bmw bow box boy cow cox coy
     9 729      paw pax pay paz raw ray saw sax say
     8 2273     acre bard bare base cape card care case
     8 726      pam pan ram ran sam san sao scm
     7 4663     gone good goof home hone hood hoof
     7 7283     pate pave rate rave saud save scud
     7 782      pta pub puc pvc qua rub sub
     7 426      gam gao ham han ian ibm ibn

Top 6 in length:

Length Textonym       Words
====== ============== =====
    14 25287876746242 claustrophobia claustrophobic
    13 7244967473642  schizophrenia schizophrenic
    12 666628676342   onomatopoeia onomatopoeic
    11 49376746242    hydrophobia hydrophobic
    10 2668368466     contention convention
    10 6388537663     mettlesome nettlesome
zkl
Like the Python example, this solution uses the Unix dictionary rather than the Textonyms word list, which would have to be parsed out of HTML.
URL:="http://www.puzzlers.org/pub/wordlists/unixdict.txt";
var ZC=Import("zklCurl");
var keypad=Dictionary(
"a",2,"b",2,"c",2, "d",3,"e",3,"f",3, "g",4,"h",4,"i",4,
"j",5,"k",5,"l",5, "m",6,"n",6,"o",6, "p",7,"q",7,"r",7,"s",7,
"t",8,"u",8,"v",8, "w",9,"x",9,"y",9,"z",9);
//fcn numerate(word){ word.toLower().apply(keypad.find.fp1("")) }
fcn numerate(word){ word.toLower().apply(keypad.get) } //-->textonym or error
println("criticisms --> ",numerate("criticisms"));
words:=ZC().get(URL); //--> T(Data,bytes of header, bytes of trailer)
words=words[0].del(0,words[1]); // remove HTTP header
println("Read %d words from %s".fmt(words.len(1),URL));
wcnt:=Dictionary();
foreach word in (words.walker(11)){ // iterate over stripped lines
w2n:=try{ numerate(word) }catch(NotFoundError){ continue };
wcnt.appendV(w2n,word); // -->[textonym:list of words]
}
moreThan1Word:=wcnt.reduce(fcn(s,[(k,v)]){ s+=(v.len()>1) },0);
maxWordPerNum:=(0).max(wcnt.values.apply("len"));
("There are %d digit combinations which represent the words that fit the keypad mapping.\n"
"There are %d overlaps.").fmt(wcnt.len(),moreThan1Word).println();
println("Max collisions: %d words:".fmt(maxWordPerNum));
foreach k,v in (wcnt.filter('wrap([(k,v)]){ v.len()==maxWordPerNum })){
println(" %s is the textonym of: %s".fmt(k,v.concat(", ")));
}
- Output:
criticisms --> 2748424767
Read 25104 words from http://www.puzzlers.org/pub/wordlists/unixdict.txt
There are 22903 digit combinations which represent the words that fit the keypad mapping.
There are 1473 overlaps.
Max collisions: 9 words:
 729 is the textonym of: paw, pax, pay, paz, raw, ray, saw, sax, say
 269 is the textonym of: amy, any, bmw, bow, box, boy, cow, cox, coy