Anagrams
You are encouraged to solve this task according to the task description, using any language you may know.
When two or more words are composed of the same characters, but in a different order, they are called anagrams.
- Task
Using the word list at http://wiki.puzzlers.org/pub/wordlists/unixdict.txt,
find the largest sets of anagrams: the sets of words that share the same characters and contain the most words.
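The task can be sketched in Python as follows. This is a minimal illustration, not one of the page's solutions: the function name `largest_anagram_sets` and the small in-memory sample list are assumptions made for the example, and a real run would read the words from unixdict.txt instead.

```python
from collections import defaultdict

def largest_anagram_sets(words):
    """Return the anagram groups that contain the most words."""
    groups = defaultdict(list)
    for word in words:
        # Anagrams share the same letters, so sorting the letters
        # gives every member of a group the same dictionary key.
        groups["".join(sorted(word))].append(word)
    if not groups:
        return []
    longest = max(len(group) for group in groups.values())
    return [group for group in groups.values() if len(group) == longest]

# Hypothetical sample; the task itself uses unixdict.txt.
sample = ["abel", "able", "bale", "bela", "elba", "evil", "live", "veil", "cat"]
for group in largest_anagram_sets(sample):
    print(" ".join(group))
```

Most of the solutions on this page use the same idea: a sorted-letter key identifies the group, and a second pass keeps only the groups of maximum size.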
- Related tasks
- Metrics
- Counting
- Word frequency
- Letter frequency
- Jewels and stones
- I before E except after C
- Bioinformatics/base count
- Count occurrences of a substring
- Count how many vowels and consonants occur in a string
- Remove/replace
- XXXX redacted
- Conjugate a Latin verb
- Remove vowels from a string
- String interpolation (included)
- Strip block comments
- Strip comments from a string
- Strip a set of characters from a string
- Strip whitespace from a string -- top and tail
- Strip control codes and extended characters from a string
- Anagrams/Derangements/shuffling
- Word wheel
- ABC problem
- Sattolo cycle
- Knuth shuffle
- Ordered words
- Superpermutation minimisation
- Textonyms (using a phone text pad)
- Anagrams
- Anagrams/Deranged anagrams
- Permutations/Derangements
- Find/Search/Determine
- ABC words
- Odd words
- Word ladder
- Semordnilap
- Word search
- Wordiff (game)
- String matching
- Tea cup rim text
- Alternade words
- Changeable words
- State name puzzle
- String comparison
- Unique characters
- Unique characters in each string
- Extract file extension
- Levenshtein distance
- Palindrome detection
- Common list elements
- Longest common suffix
- Longest common prefix
- Compare a list of strings
- Longest common substring
- Find common directory path
- Words from neighbour ones
- Change e letters to i in words
- Non-continuous subsequences
- Longest common subsequence
- Longest palindromic substrings
- Longest increasing subsequence
- Words containing "the" substring
- Sum of the digits of n is substring of n
- Determine if a string is numeric
- Determine if a string is collapsible
- Determine if a string is squeezable
- Determine if a string has all unique characters
- Determine if a string has all the same characters
- Longest substrings without repeating characters
- Find words which contains all the vowels
- Find words which contain the most consonants
- Find words which contains more than 3 vowels
- Find words whose first and last three letters are equal
- Find words with alternating vowels and consonants
- Formatting
- Substring
- Rep-string
- Word wrap
- String case
- Align columns
- Literals/String
- Repeat a string
- Brace expansion
- Brace expansion using ranges
- Reverse a string
- Phrase reversals
- Comma quibbling
- Special characters
- String concatenation
- Substring/Top and tail
- Commatizing numbers
- Reverse words in a string
- Suffixation of decimal numbers
- Long literals, with continuations
- Numerical and alphabetical suffixes
- Abbreviations, easy
- Abbreviations, simple
- Abbreviations, automatic
- Song lyrics/poems/Mad Libs/phrases
- Mad Libs
- Magic 8-ball
- 99 bottles of beer
- The Name Game (a song)
- The Old lady swallowed a fly
- The Twelve Days of Christmas
- Tokenize
- Text between
- Tokenize a string
- Word break problem
- Tokenize a string with escaping
- Split a character string based on change of character
- Sequences
11l
DefaultDict[String, Array[String]] anagram
L(word) File(‘unixdict.txt’).read().split("\n")
   anagram[sorted(word)].append(word)

V count = max(anagram.values().map(ana -> ana.len))

L(ana) anagram.values()
   I ana.len == count
      print(ana)
- Output:
[abel, able, bale, bela, elba]
[caret, carte, cater, crate, trace]
[angel, angle, galen, glean, lange]
[alger, glare, lager, large, regal]
[elan, lane, lean, lena, neal]
[evil, levi, live, veil, vile]
8th
\
\ anagrams.8th
\ Rosetta Code - Anagrams problem
\ Using the word list at:
\ http://wiki.puzzlers.org/pub/wordlists/unixdict.txt,
\ find the sets of words that share the same characters
\ that contain the most words in them.
\
ns: anagrams
m:new var, anamap
a:new var, anaptr
0 var, analen
\ sort a string
: s:sort \ s -- s \
null s:/ \ a
' s:cmpi a:sort \ a
"" a:join \ s
;
: process-words \ word -- \ word
s:lc \ word
dup \ word word
>r \ word | word
\ 1. we create a sorted version of the current word (sword)
s:sort \ sword | word
\ We check if sword can be found in map anamap
anamap @ \ sword anamap | word
over \ sword anamap sword | word
m:exists? \ sword anamap boolean | word
if \ sword anamap | word
\ If sword already exists in anamap:
\ - get mapvalue, which is an array
\ - add the original word to that array
\ - store the array in the map with key sword
over \ sword anamap sword | word
m:@ \ sword anamap array | word
r> \ sword anamap array word
a:push \ sword anamap array
rot \ anamap array sword
swap \ anamap sword array
m:! \ anamap
else \ sword anamap | word
\ If sword does not yet exist in anamap:
\ - create empty array
\ - put the original word into that array
\ - store the array in the map with key sword
swap \ anamap sword | word
a:new \ anamap sword array | word
r> \ anamap sword array word
a:push \ anamap sword array
m:! \ anamap
then
drop \
;
\ Read and check all words in array analist
: read-and-check-words \ -- \
"analist.txt" \ fname
f:open-ro \ f
' process-words f:eachline \ f
f:close \
;
: len< \ key array arraylen -- \ key array arraylen
2drop \ key
drop \
;
: len= \ key array arraylen -- \ key array arraylen
2drop \ key
anaptr @ \ key anaptr
swap \ anaptr key
a:push \ anaptr
drop \
;
: len> \ key array arraylen -- \ key array arraylen
analen \ key array arraylen analen
! \ key array
drop \ key
anaptr @ \ key anaptr
a:clear \ key anaptr
swap \ anaptr key
a:push \ anaptr
drop \
;
: fetch-longest-list \ key array -- \ key array
a:len \ key array arraylen
analen @ \ key array arraylen analen
2dup \ key array arraylen analen arraylen analen
n:cmp \ key array arraylen analen value
1 n:+ \ key array arraylen analen value
nip \ key array arraylen value
[ ' len< , ' len= , ' len> ] \ key array arraylen value swarr
swap \ key array arraylen swarr value
caseof \
;
: list-words-1 \ ix value -- \ ix value
nip \ value
"\t" . . \
;
: list-words \ ix value -- \ ix value
nip \ value
anamap @ \ value anamap
swap \ anamap value
m:@ \ anamap array
nip \ array
' list-words-1 a:each \ array
cr \ array
drop \
;
: app:main
\ Create a map, where the values are arrays, containing all words
\ which are the same when sorted (sword); sword is used as key
read-and-check-words
\ Create an array that holds the keys for anamap, for which the value,
\ which is the array of anagrams, has the biggest length found.
anamap @ ' fetch-longest-list m:each
\ Dump the resulting words to the console
anaptr @ ' list-words a:each drop
bye
;
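The words len<, len= and len> above implement a single pass that keeps only the keys of the largest groups seen so far. A rough Python sketch of that three-way decision is given below; the function name `keys_of_largest_groups` and the sample dictionary are illustrative assumptions, not part of the 8th program.

```python
def keys_of_largest_groups(groups):
    """groups maps a sorted-letter key to its list of words."""
    best_len = 0
    best_keys = []
    for key, words in groups.items():
        n = len(words)
        if n < best_len:
            continue                # shorter group: ignore it (len<)
        if n == best_len:
            best_keys.append(key)   # tie with the maximum: remember it (len=)
        else:
            best_len = n            # new maximum: restart the list (len>)
            best_keys = [key]
    return best_keys

# Hypothetical sample data for illustration.
groups = {
    "abel": ["abel", "able", "bale"],
    "act": ["act", "cat"],
    "eilv": ["evil", "live", "veil"],
}
print(keys_of_largest_groups(groups))
```

Clearing the result list whenever a longer group appears avoids a separate pass to find the maximum first.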
AArch64 Assembly
/* ARM assembly AARCH64 Raspberry PI 3B */
/* program anagram64.s */
/*******************************************/
/* Constantes file */
/*******************************************/
/* for this file see task include a file in language AArch64 assembly*/
.include "../includeConstantesARM64.inc"
.equ MAXI, 40000
.equ BUFFERSIZE, 300000
/*********************************/
/* Initialized data */
/*********************************/
.data
szFileName: .asciz "./listword.txt"
szMessErreur: .asciz "FILE ERROR."
szCarriageReturn: .asciz "\n"
szMessSpace: .asciz " "
ptBuffex1: .quad sBuffex1
/*********************************/
/* UnInitialized data */
/*********************************/
.bss
ptTabBuffer: .skip 8 * MAXI
ptTabAna: .skip 8 * MAXI
tbiCptAna: .skip 8 * MAXI
iNBword: .skip 8
sBuffer: .skip BUFFERSIZE
sBuffex1: .skip BUFFERSIZE
/*********************************/
/* code section */
/*********************************/
.text
.global main
main: // entry of program
mov x4,#0 // loop indice
mov x0,AT_FDCWD // current directory
ldr x1,qAdrszFileName // file name
mov x2,#O_RDWR // flags
mov x3,#0 // mode
mov x8,#OPEN //
svc 0
cmp x0,#0 // error open
ble 99f
mov x19,x0 // FD save Fd
ldr x1,qAdrsBuffer // buffer address
ldr x2,qSizeBuf // buffersize
mov x8, #READ
svc 0
cmp x0,#0 // error read ?
blt 99f
mov x5,x0 // save size read bytes
ldr x4,qAdrsBuffer // buffer address
ldr x0,qAdrsBuffer // start word address
mov x2,#0
mov x1,#0 // word length
1:
cmp x2,x5
bge 2f
ldrb w3,[x4,x2]
cmp w3,#0xD // end word ?
cinc x1,x1,ne // increment word length
cinc x2,x2,ne // increment indice
bne 1b // and loop
strb wzr,[x4,x2] // store final zero
bl anaWord // sort word letters
add x2,x2,#2 // jump OD and 0A
add x0,x4,x2 // new address begin word
mov x1,#0 // init length
b 1b // and loop
2:
strb wzr,[x4,x2] // zero final
bl anaWord // last word
mov x0,x19 // file Fd
mov x8, #CLOSE
svc 0
cmp x0,#0 // error close ?
blt 99f
ldr x0,qAdrptTabAna // address sorted string area
mov x1,#0 // first indice
ldr x2,qAdriNBword
ldr x2,[x2] // last indice
ldr x3,qAdrptTabBuffer // address sorted string area
bl triRapide // quick sort
ldr x4,qAdrptTabAna // address sorted string area
ldr x7,qAdrptTabBuffer // address of original word pointer table
ldr x10,qAdrtbiCptAna // address counter occurrences
mov x9,x2 // size word array
mov x8,#0 // index of first occurrence
ldr x3,[x4,x8,lsl #3] // load first value
mov x2,#1 // loop indice
mov x6,#0 // counter
mov x12,#0 // counter value max
3:
ldr x5,[x4,x2,lsl #3] // load next value
mov x0,x3
mov x1,x5
bl comparStrings
cmp x0,#0 // sorted strings equal ?
bne 4f
add x6,x6,#1 // yes increment counter
b 5f
4: // no
str x6,[x10,x8,lsl #3] // store counter in first occurrence
cmp x6,x12 // counter > value max
csel x12,x6,x12,gt // yes counter -> value max
//movgt x12,x6 // yes counter -> value max
mov x6,#0 // reset counter
mov x8,x2 // init index first occurrence
mov x3,x5 // init value first occurrence
5:
add x2,x2,#1 // increment indice
cmp x2,x9 // end word array ?
blt 3b // no -> loop
mov x2,#0 // raz indice
6: // display loop
ldr x6,[x10,x2,lsl #3] // load counter
cmp x6,x12 // equal to max value ?
bne 8f
ldr x0,[x7,x2,lsl #3] // load address first word
bl affichageMess
add x3,x2,#1 // increment new index
mov x4,#0 // counter
7:
ldr x0,qAdrszMessSpace
bl affichageMess
ldr x0,[x7,x3,lsl #3] // load address other word
bl affichageMess
add x3,x3,#1 // increment indice
add x4,x4,#1 // increment counter
cmp x4,x6 // max value ?
blt 7b // no loop
ldr x0,qAdrszCarriageReturn
bl affichageMess
8:
add x2,x2,#1 // increment indice
cmp x2,x9 // maxi ?
blt 6b // no -> loop
b 100f
99: // display error
ldr x0,qAdrszMessErreur
bl affichageMess
100: // standard end of the program
mov x0, #0 // return code
mov x8, #EXIT // request to exit program
svc #0 // perform the system call
qAdrszCarriageReturn: .quad szCarriageReturn
qAdrszFileName: .quad szFileName
qAdrszMessErreur: .quad szMessErreur
qAdrsBuffer: .quad sBuffer
qSizeBuf: .quad BUFFERSIZE
qAdrszMessSpace: .quad szMessSpace
qAdrtbiCptAna: .quad tbiCptAna
/******************************************************************/
/* analizing word */
/******************************************************************/
/* x0 word address */
/* x1 word length */
anaWord:
stp x1,lr,[sp,-16]! // save registers
stp x2,x3,[sp,-16]! // save registers
stp x4,x5,[sp,-16]! // save registers
stp x6,x7,[sp,-16]! // save registers
mov x5,x0
mov x6,x1
ldr x1,qAdrptTabBuffer
ldr x2,qAdriNBword
ldr x3,[x2]
str x0,[x1,x3,lsl #3]
ldr x1,qAdrptTabAna
ldr x4,qAdrptBuffex1
ldr x0,[x4]
add x6,x6,x0
add x6,x6,#1
str x6,[x4]
str x0,[x1,x3,lsl #3]
add x3,x3,#1
str x3,[x2]
mov x1,x0
mov x0,x5
bl triLetters // sort word letters
mov x2,#0
100:
ldp x6,x7,[sp],16 // restore 2 registers
ldp x4,x5,[sp],16 // restore 2 registers
ldp x2,x3,[sp],16 // restore 2 registers
ldp x1,lr,[sp],16 // restore 2 registers
ret // return to address lr x30
qAdrptTabBuffer: .quad ptTabBuffer
qAdrptTabAna: .quad ptTabAna
qAdriNBword: .quad iNBword
qAdrptBuffex1: .quad ptBuffex1
/******************************************************************/
/* sort word letters */
/******************************************************************/
/* x0 address begin word */
/* x1 address recept array */
triLetters:
stp x1,lr,[sp,-16]! // save registers
stp x2,x3,[sp,-16]! // save registers
stp x4,x5,[sp,-16]! // save registers
stp x6,x7,[sp,-16]! // save registers
mov x2,#0
1:
ldrb w3,[x0,x2] // load letter
cmp w3,#0 // end word ?
beq 6f
cmp x2,#0 // first letter ?
bne 2f
strb w3,[x1,x2] // yes store in first position
add x2,x2,#1 // increment indice
b 1b // and loop
2:
mov x4,#0
3: // begin loop to search insertion position
ldrb w5,[x1,x4] // load letter
cmp w3,w5 // compare
blt 4f // to low -> insertion
add x4,x4,#1 // increment indice
cmp x4,x2 // compare to letters number in place
blt 3b // search loop
strb w3,[x1,x2] // else store in last position
add x2,x2,#1
b 1b // and loop
4: // move first letters in one position
sub x6,x2,#1 // start indice
5:
ldrb w5,[x1,x6] // load letter
add x7,x6,#1 // destination index = source index + 1
strb w5,[x1,x7] // store letter
sub x6,x6,#1 // decrement indice
cmp x6,x4 // end ?
bge 5b // no loop
strb w3,[x1,x4] // else store letter in free position
add x2,x2,#1
b 1b // and loop
6:
strb wzr,[x1,x2] // final zero
100:
ldp x6,x7,[sp],16 // restore 2 registers
ldp x4,x5,[sp],16 // restore 2 registers
ldp x2,x3,[sp],16 // restore 2 registers
ldp x1,lr,[sp],16 // restore 2 registers
ret // return to address lr x30
/***************************************************/
/* Recursive quicksort */
/***************************************************/
/* x0 contains the address of table */
/* x1 contains index of first item */
/* x2 contains the number of elements > 0 */
/* x3 contains the address of table 2 */
triRapide:
stp x1,lr,[sp,-16]! // save registers
stp x2,x3,[sp,-16]! // save registers
stp x4,x5,[sp,-16]! // save registers
stp x6,x7,[sp,-16]! // save registers
mov x6,x3
sub x2,x2,#1 // last item index
cmp x1,x2 // first > last ?
bge 100f // yes -> end
mov x4,x0 // save x0
mov x5,x2 // save x2
mov x3,x6
bl partition1 // cutting into 2 parts
mov x2,x0 // index partition
mov x0,x4 // table address
bl triRapide // sort lower part
mov x0,x4 // table address
add x1,x2,#1 // index begin = index partition + 1
add x2,x5,#1 // number of elements
bl triRapide // sort higher part
100: // end function
ldp x6,x7,[sp],16 // restore 2 registers
ldp x4,x5,[sp],16 // restore 2 registers
ldp x2,x3,[sp],16 // restore 2 registers
ldp x1,lr,[sp],16 // restore 2 registers
ret // return to address lr x30
/******************************************************************/
/* Partition table elements */
/******************************************************************/
/* x0 contains the address of table */
/* x1 contains index of first item */
/* x2 contains index of last item */
/* x3 contains the address of table 2 */
partition1:
stp x1,lr,[sp,-16]! // save registers
stp x2,x3,[sp,-16]! // save registers
stp x4,x5,[sp,-16]! // save registers
stp x6,x7,[sp,-16]! // save registers
stp x8,x9,[sp,-16]! // save registers
stp x10,x12,[sp,-16]! // save registers
mov x8,x0 // save address table 2
mov x9,x1
ldr x10,[x8,x2,lsl #3] // load string address last index
mov x4,x9 // init with first index
mov x5,x9 // init with first index
1: // begin loop
ldr x6,[x8,x5,lsl #3] // load string address
mov x0,x6
mov x1,x10
bl comparStrings
cmp x0,#0
bge 2f
ldr x7,[x8,x4,lsl #3] // if < swap value table
str x6,[x8,x4,lsl #3]
str x7,[x8,x5,lsl #3]
ldr x7,[x3,x4,lsl #3] // swap array 2
ldr x12,[x3,x5,lsl #3]
str x7,[x3,x5,lsl #3]
str x12,[x3,x4,lsl #3]
add x4,x4,#1 // and increment index 1
2:
add x5,x5,#1 // increment index 2
cmp x5,x2 // end ?
blt 1b // no -> loop
ldr x7,[x8,x4,lsl #3] // swap value
str x10,[x8,x4,lsl #3]
str x7,[x8,x2,lsl #3]
ldr x7,[x3,x4,lsl #3] // swap array 2
ldr x12,[x3,x2,lsl #3]
str x7,[x3,x2,lsl #3]
str x12,[x3,x4,lsl #3]
mov x0,x4 // return index partition
100:
ldp x10,x12,[sp],16 // restore 2 registers
ldp x8,x9,[sp],16 // restore 2 registers
ldp x6,x7,[sp],16 // restore 2 registers
ldp x4,x5,[sp],16 // restore 2 registers
ldp x2,x3,[sp],16 // restore 2 registers
ldp x1,lr,[sp],16 // restore 2 registers
ret // return to address lr x30
/************************************/
/* Strings case sensitive comparisons */
/************************************/
/* x0 and x1 contain the addresses of the strings */
/* return 0 in x0 if equals */
/* return -1 if string x0 < string x1 */
/* return 1 if string x0 > string x1 */
comparStrings:
stp x1,lr,[sp,-16]! // save registers
stp x2,x3,[sp,-16]! // save registers
stp x4,x5,[sp,-16]! // save registers
mov x2,#0 // counter
1:
ldrb w3,[x0,x2] // byte string 1
ldrb w4,[x1,x2] // byte string 2
cmp w3,w4
blt 2f // smaller
bgt 3f // greater
cmp x3,#0 // 0 end string
beq 4f // end string
add x2,x2,#1 // else add 1 in counter
b 1b // and loop
2:
mov x0,#-1 // small
b 100f
3:
mov x0,#1 // greater
b 100f
4:
mov x0,#0 // equal
100:
ldp x4,x5,[sp],16 // restore 2 registers
ldp x2,x3,[sp],16 // restore 2 registers
ldp x1,lr,[sp],16 // restore 2 registers
ret // return to address lr x30
/********************************************************/
/* File Include fonctions */
/********************************************************/
/* for this file see task include a file in language AArch64 assembly */
.include "../includeARM64.inc"
~/.../rosetta/asm1 $ anagram64
bale able bela abel elba
cater carte crate caret trace
galen glean angle lange angel
regal glare alger lager large
lena lane lean elan neal
veil levi live vile evil
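The triLetters routine above builds each word's sorted key by inserting one letter at a time into a receiving buffer that is kept in order. A small Python sketch of that insertion step follows; the function name `sort_letters` is an assumption for illustration, and the sketch does not model the register usage or buffer layout of the assembly.

```python
def sort_letters(word):
    """Build the sorted-letter key of a word by repeated insertion."""
    out = []
    for ch in word:
        pos = 0
        # Walk past every letter that is not greater than the new one,
        # so the new letter lands before the first larger letter.
        while pos < len(out) and out[pos] <= ch:
            pos += 1
        out.insert(pos, ch)  # shifts the tail one place to the right
    return "".join(out)

print(sort_letters("trace"))
```

Insertion sort is quadratic in the word length, which is acceptable here because dictionary words are short.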
ABAP
report zz_anagrams no standard page heading.
define update_progress.
call function 'SAPGUI_PROGRESS_INDICATOR'
exporting
text = &1.
end-of-definition.
" Selection screen segment allowing the person to choose which file will act as input.
selection-screen begin of block file_choice.
parameters p_file type string lower case.
selection-screen end of block file_choice.
" When the user requests help with input, run the routine to allow them to navigate the presentation server.
at selection-screen on value-request for p_file.
perform getfile using p_file.
at selection-screen output.
%_p_file_%_app_%-text = 'Input File: '.
start-of-selection.
data: gt_data type table of string.
" Read the specified file from the presentation server into memory.
perform readfile using p_file changing gt_data.
" After the file has been read into memory, loop through it line-by-line and make anagrams.
perform anagrams using gt_data.
" Subroutine for generating a list of anagrams.
" The supplied input is a table, with each entry corresponding to a word.
form anagrams using it_data like gt_data.
types begin of ty_map.
types key type string.
types value type string.
types end of ty_map.
data: lv_char type c,
lv_len type i,
lv_string type string,
ls_entry type ty_map,
lt_anagrams type standard table of ty_map,
lt_c_tab type table of string.
field-symbols: <fs_raw> type string.
" Loop through each word in the table, and make an associative array.
loop at it_data assigning <fs_raw>.
" First, we need to re-order the word alphabetically. This generates a key. All anagrams will use this same key.
" Add each character to a table, which we will then sort alphabetically.
lv_len = strlen( <fs_raw> ).
refresh lt_c_tab.
do lv_len times.
lv_len = sy-index - 1.
append <fs_raw>+lv_len(1) to lt_c_tab.
enddo.
sort lt_c_tab as text.
" Now append the characters to a string and add it as a key into the map.
clear lv_string.
loop at lt_c_tab into lv_char.
concatenate lv_char lv_string into lv_string respecting blanks.
endloop.
ls_entry-key = lv_string.
ls_entry-value = <fs_raw>.
append ls_entry to lt_anagrams.
endloop.
" After we're done processing, output a list of the anagrams.
clear lv_string.
loop at lt_anagrams into ls_entry.
" Is it part of the same key --> Output in the same line, else a new entry.
if lv_string = ls_entry-key.
write: ', ', ls_entry-value.
else.
if sy-tabix <> 1.
write: ']'.
endif.
write: / '[', ls_entry-value.
endif.
lv_string = ls_entry-key.
endloop.
" Close last entry.
write ']'.
endform.
" Read a specified file from the presentation server.
form readfile using i_file type string changing it_raw like gt_data.
data: l_datat type string,
l_msg(2048),
l_lines(10).
" Read the file into memory.
update_progress 'Reading file...'.
call method cl_gui_frontend_services=>gui_upload
exporting
filename = i_file
changing
data_tab = it_raw
exceptions
others = 1.
" Output error if the file could not be uploaded.
if sy-subrc <> 0.
write : / 'Error reading the supplied file!'.
return.
endif.
endform.
- Output:
[ angel , angle , galen , glean , lange ]
[ elan , lane , lean , lena , neal ]
[ alger , glare , lager , large , regal ]
[ abel , able , bale , bela , elba ]
[ evil , levi , live , veil , vile ]
[ caret , carte , cater , crate , trace ]
Ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Containers.Indefinite_Ordered_Maps;
with Ada.Containers.Indefinite_Ordered_Sets;
procedure Words_Of_Equal_Characters is
package Set_Of_Words is new Ada.Containers.Indefinite_Ordered_Sets (String);
use Ada.Containers, Set_Of_Words;
package Anagrams is new Ada.Containers.Indefinite_Ordered_Maps (String, Set);
use Anagrams;
File : File_Type;
Result : Map;
Max : Count_Type := 1;
procedure Put (Position : Anagrams.Cursor) is
First : Boolean := True;
List : Set renames Element (Position);
procedure Put (Position : Set_Of_Words.Cursor) is
begin
if First then
First := False;
else
Put (',');
end if;
Put (Element (Position));
end Put;
begin
if List.Length = Max then
Iterate (List, Put'Access);
New_Line;
end if;
end Put;
begin
Open (File, In_File, "unixdict.txt");
loop
declare
Word : constant String := Get_Line (File);
Key : String (Word'Range) := (others => Character'Last);
List : Set;
Position : Anagrams.Cursor;
begin
for I in Word'Range loop
for J in Word'Range loop
if Key (J) > Word (I) then
Key (J + 1..I) := Key (J..I - 1);
Key (J) := Word (I);
exit;
end if;
end loop;
end loop;
Position := Find (Result, Key);
if Has_Element (Position) then
List := Element (Position);
Insert (List, Word);
Replace_Element (Result, Position, List);
else
Insert (List, Word);
Include (Result, Key, List);
end if;
Max := Count_Type'Max (Max, Length (List));
end;
end loop;
exception
when End_Error =>
Iterate (Result, Put'Access);
Close (File);
end Words_Of_Equal_Characters;
- Output:
abel,able,bale,bela,elba
caret,carte,cater,crate,trace
angel,angle,galen,glean,lange
alger,glare,lager,large,regal
elan,lane,lean,lena,neal
evil,levi,live,veil,vile
ALGOL 68
Uses the "read" PRAGMA of Algol 68 G to include the associative array code from the Associative_array/Iteration task.
# find longest list(s) of words that are anagrams in a list of words #
# use the associative array in the Associative array/Iteration task #
PR read "aArray.a68" PR
# returns the number of occurrences of ch in text #
PROC count = ( STRING text, CHAR ch )INT:
BEGIN
INT result := 0;
FOR c FROM LWB text TO UPB text DO IF text[ c ] = ch THEN result +:= 1 FI OD;
result
END # count # ;
# returns text with the characters sorted into ascending order #
PROC char sort = ( STRING text )STRING:
BEGIN
STRING sorted := text;
FOR end pos FROM UPB sorted - 1 BY -1 TO LWB sorted
WHILE
BOOL swapped := FALSE;
FOR pos FROM LWB sorted TO end pos DO
IF sorted[ pos ] > sorted[ pos + 1 ]
THEN
CHAR t := sorted[ pos ];
sorted[ pos ] := sorted[ pos + 1 ];
sorted[ pos + 1 ] := t;
swapped := TRUE
FI
OD;
swapped
DO SKIP OD;
sorted
END # char sort # ;
# read the list of words and store in an associative array #
CHAR separator = "|"; # character that will separate the anagrams #
IF FILE input file;
STRING file name = "unixdict.txt";
open( input file, file name, stand in channel ) /= 0
THEN
# failed to open the file #
print( ( "Unable to open """ + file name + """", newline ) )
ELSE
# file opened OK #
BOOL at eof := FALSE;
# set the EOF handler for the file #
on logical file end( input file, ( REF FILE f )BOOL:
BEGIN
# note that we reached EOF on the #
# latest read #
at eof := TRUE;
# return TRUE so processing can continue #
TRUE
END
);
REF AARRAY words := INIT LOC AARRAY;
WHILE NOT at eof
DO
STRING word;
get( input file, ( word, newline ) );
words // char sort( word ) +:= separator + word
OD;
# close the file #
close( input file );
# find the maximum number of anagrams #
INT max anagrams := 0;
REF AAELEMENT e := FIRST words;
WHILE e ISNT nil element DO
IF INT anagrams := count( value OF e, separator );
anagrams > max anagrams
THEN
max anagrams := anagrams
FI;
e := NEXT words
OD;
print( ( "Maximum number of anagrams: ", whole( max anagrams, -4 ), newline ) );
# show the anagrams with the maximum number #
e := FIRST words;
WHILE e ISNT nil element DO
IF INT anagrams := count( value OF e, separator );
anagrams = max anagrams
THEN
print( ( ( value OF e )[ ( LWB value OF e ) + 1: ], newline ) )
FI;
e := NEXT words
OD
FI
- Output:
Maximum number of anagrams:    5
abel|able|bale|bela|elba
elan|lane|lean|lena|neal
evil|levi|live|veil|vile
angel|angle|galen|glean|lange
alger|glare|lager|large|regal
caret|carte|cater|crate|trace
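The ALGOL 68 solution stores each anagram group as one string, with a separator placed before every word, so the number of separators equals the number of words in the group. A hedged Python sketch of that representation is shown below; the names `build_groups` and `largest_groups` and the sample word list are assumptions for the example.

```python
SEPARATOR = "|"

def build_groups(words):
    """Map each sorted-letter key to a string of the form '|w1|w2|...'."""
    groups = {}
    for word in words:
        key = "".join(sorted(word))
        groups[key] = groups.get(key, "") + SEPARATOR + word
    return groups

def largest_groups(groups):
    """Return the groups with the most separators, minus the leading one."""
    most = max((value.count(SEPARATOR) for value in groups.values()), default=0)
    return [value[1:] for value in groups.values()
            if value.count(SEPARATOR) == most]

# Hypothetical sample; the real program reads unixdict.txt.
print(largest_groups(build_groups(["evil", "levi", "live", "cat", "act"])))
```

This only works if the separator character never occurs inside a word, which is why a character such as "|" is chosen.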
Amazing Hopper
#include <basico.h>
#define MAX_LINE 30
algoritmo
fd=0, filas=0
word={}, 2da columna={}
old_word="",new_word=""
dimensionar (1,2) matriz de cadenas 'result'
pos=0
token.separador'""'
abrir para leer("basica/unixdict.txt",fd)
iterar mientras ' no es fin de archivo (fd) '
usando 'MAX_LINE', leer línea desde(fd),
---copiar en 'old_word'---, separar para 'word '
word, ---retener--- ordenar esto,
encadenar en 'new_word'
matriz.buscar en tabla (1,new_word,result)
copiar en 'pos'
si ' es negativo? '
new_word,old_word, pegar fila en 'result'
sino
#( result[pos,2] = cat(result[pos,2],cat(",",old_word) ) )
fin si
reiterar
cerrar archivo(fd)
guardar 'filas de (result)' en 'filas'
#( 2da columna = result[2:filas, 2] )
fijar separador '","'
tomar '2da columna'
contar tokens en '2da columna' ---retener resultado,
obtener máximo valor,es mayor o igual?, replicar esto
compactar esto
fijar separador 'NL', luego imprime todo
terminar
- Output:
abel,able,bale,bela,elba
alger,glare,lager,large,regal
angel,angle,galen,glean,lange
caret,carte,cater,crate,trace
elan,lane,lean,lena,neal
evil,levi,live,veil,vile
APL
This is a rough translation of the J version. For clarity of data flow, intermediate values are kept and verb trains are not used.
anagrams←{
tie←⍵ ⎕NTIE 0
dict←⎕NREAD tie 80(⎕NSIZE tie)0
boxes←((⎕UCS 10)≠dict)⊆dict
ana←(({⍵[⍋⍵]}¨boxes)({⍵}⌸)boxes)
({~' '∊¨(⊃/¯1↑[2]⍵)}ana)⌿ana ⋄ ⎕NUNTIE
}
On a Unix system we can assume wget exists and use it from Dyalog to download the file.
The ]display command formats the output with boxes.
Example:
⎕SH'wget http://wiki.puzzlers.org/pub/wordlists/unixdict.txt'
]display anagrams 'unixdict.txt'
Output:
┌→────────────────────────────────────────┐
↓ ┌→───┐  ┌→───┐  ┌→───┐  ┌→───┐  ┌→───┐  │
│ │abel│  │able│  │bale│  │bela│  │elba│  │
│ └────┘  └────┘  └────┘  └────┘  └────┘  │
│ ┌→────┐ ┌→────┐ ┌→────┐ ┌→────┐ ┌→────┐ │
│ │alger│ │glare│ │lager│ │large│ │regal│ │
│ └─────┘ └─────┘ └─────┘ └─────┘ └─────┘ │
│ ┌→────┐ ┌→────┐ ┌→────┐ ┌→────┐ ┌→────┐ │
│ │angel│ │angle│ │galen│ │glean│ │lange│ │
│ └─────┘ └─────┘ └─────┘ └─────┘ └─────┘ │
│ ┌→────┐ ┌→────┐ ┌→────┐ ┌→────┐ ┌→────┐ │
│ │caret│ │carte│ │cater│ │crate│ │trace│ │
│ └─────┘ └─────┘ └─────┘ └─────┘ └─────┘ │
│ ┌→───┐  ┌→───┐  ┌→───┐  ┌→───┐  ┌→───┐  │
│ │elan│  │lane│  │lean│  │lena│  │neal│  │
│ └────┘  └────┘  └────┘  └────┘  └────┘  │
│ ┌→───┐  ┌→───┐  ┌→───┐  ┌→───┐  ┌→───┐  │
│ │evil│  │levi│  │live│  │veil│  │vile│  │
│ └────┘  └────┘  └────┘  └────┘  └────┘  │
└∊────────────────────────────────────────┘
AppleScript
use AppleScript version "2.3.1" -- OS X 10.9 (Mavericks) or later.
use sorter : script ¬
"Custom Iterative Ternary Merge Sort" -- <www.macscripter.net/t/timsort-and-nigsort/71383/3>
use scripting additions
on join(lst, delim)
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to delim
set txt to lst as text
set AppleScript's text item delimiters to astid
return txt
end join
on largestAnagramGroups(listOfWords)
script o
property wordList : listOfWords
property groupingTexts : wordList's items
property largestGroupSize : 0
property largestGroupRanges : {}
on judgeGroup(i, j)
set groupSize to j - i + 1
if (groupSize < largestGroupSize) then -- Most likely.
else if (groupSize = largestGroupSize) then -- Next most likely.
set end of largestGroupRanges to {i, j}
else -- Largest group so far.
set largestGroupRanges to {{i, j}}
set largestGroupSize to groupSize
end if
end judgeGroup
on isGreater(a, b)
return a's beginning > b's beginning
end isGreater
end script
set wordCount to (count o's wordList)
ignoring case
-- Replace the words in the groupingTexts list with sorted-character versions.
repeat with i from 1 to wordCount
set chrs to o's groupingTexts's item i's characters
tell sorter to sort(chrs, 1, -1, {})
set o's groupingTexts's item i to join(chrs, "")
end repeat
-- Sort the list to group its contents and echo the moves in the original word list.
tell sorter to sort(o's groupingTexts, 1, wordCount, {slave:{o's wordList}})
-- Find the list range(s) of the longest run(s) of equal grouping texts.
set i to 1
set currentText to beginning of o's groupingTexts
repeat with j from 2 to wordCount
set thisText to o's groupingTexts's item j
if (thisText is not currentText) then
tell o to judgeGroup(i, j - 1)
set currentText to thisText
set i to j
end if
end repeat
if (j > i) then tell o to judgeGroup(i, j)
-- Extract the group(s) of words occupying the same range(s) in the original word list.
set output to {}
repeat with thisRange in o's largestGroupRanges
set {i, j} to thisRange
-- Add this group to the output.
set thisGroup to o's wordList's items i thru j
tell sorter to sort(thisGroup, 1, -1, {}) -- Not necessary with unixdict.txt. But hey.
set end of output to thisGroup
end repeat
-- As a final flourish, sort the groups on their first items.
tell sorter to sort(output, 1, -1, {comparer:o})
end ignoring
return output
end largestAnagramGroups
local wordFile, wordList
set wordFile to ((path to desktop as text) & "www.rosettacode.org:unixdict.txt") as «class furl»
set wordList to paragraphs of (read wordFile as «class utf8»)
return largestAnagramGroups(wordList)
- Output:
{{"abel", "able", "bale", "bela", "elba"}, {"alger", "glare", "lager", "large", "regal"}, {"angel", "angle", "galen", "glean", "lange"}, {"caret", "carte", "cater", "crate", "trace"}, {"elan", "lane", "lean", "lena", "neal"}, {"evil", "levi", "live", "veil", "vile"}}
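The AppleScript handler above (like the assembly versions) does not use a dictionary. It sorts the sorted-letter keys so that anagrams become adjacent, then scans for the longest runs of equal keys. A Python sketch of that sort-and-scan approach follows; the function name `largest_runs` and the sample list are assumptions, and lowercasing the key stands in for AppleScript's `ignoring case` block.

```python
def largest_runs(words):
    """Find the largest anagram groups by sorting keys and scanning runs."""
    # Pair each word with its key; sorting by key makes anagrams adjacent.
    pairs = sorted(("".join(sorted(w.lower())), w) for w in words)
    best_size, best_ranges = 0, []
    start = 0
    for end in range(1, len(pairs) + 1):
        # A run ends at the end of the list or where the key changes.
        if end == len(pairs) or pairs[end][0] != pairs[start][0]:
            size = end - start
            if size > best_size:
                best_size, best_ranges = size, [(start, end)]
            elif size == best_size:
                best_ranges.append((start, end))
            start = end
    # Sort the words in each group, then the groups by their first word.
    return sorted(sorted(w for _, w in pairs[a:b]) for a, b in best_ranges)

# Hypothetical sample; the script itself reads unixdict.txt.
print(largest_runs(["live", "angel", "evil", "glean", "veil", "angle", "cat"]))
```

Compared with a hash map, this trades extra sorting work for a simple linear scan with no auxiliary data structure beyond the paired list.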
ARM Assembly
/* ARM assembly Raspberry PI */
/* program anagram.s */
/* REMARK 1 : this program uses routines in an include file
see task Include a file language arm assembly
for the routine affichageMess conversion10
see at end of this program the instruction include */
/* for constants see task include a file in arm assembly */
/************************************/
/* Constantes */
/************************************/
.include "../constantes.inc"
.equ MAXI, 40000
.equ BUFFERSIZE, 300000
.equ READ, 3 @ system call
.equ OPEN, 5 @ system call
.equ CLOSE, 6 @ system call
.equ O_RDWR, 0x0002 @ open for reading and writing
/*********************************/
/* Initialized data */
/*********************************/
.data
szFileName: .asciz "./listword.txt"
szMessErreur: .asciz "FILE ERROR."
szCarriageReturn: .asciz "\n"
szMessSpace: .asciz " "
ptBuffer1: .int sBuffer1
/*********************************/
/* UnInitialized data */
/*********************************/
.bss
ptTabBuffer: .skip 4 * MAXI
ptTabAna: .skip 4 * MAXI
tbiCptAna: .skip 4 * MAXI
iNBword: .skip 4
sBuffer: .skip BUFFERSIZE
sBuffer1: .skip BUFFERSIZE
/*********************************/
/* code section */
/*********************************/
.text
.global main
main: @ entry of program
mov r4,#0 @ loop indice
ldr r0,iAdrszFileName @ file name
mov r1,#O_RDWR @ flags
mov r2,#0 @ mode
mov r7,#OPEN @
svc 0
cmp r0,#0 @ error open
ble 99f
mov r8,r0 @ FD save Fd
ldr r1,iAdrsBuffer @ buffer address
ldr r2,iSizeBuf @ buffersize
mov r7, #READ
svc 0
cmp r0,#0 @ error read ?
blt 99f
mov r5,r0 @ save size read bytes
ldr r4,iAdrsBuffer @ buffer address
ldr r0,iAdrsBuffer @ start word address
mov r2,#0
mov r1,#0 @ word length
1:
cmp r2,r5
bge 2f
ldrb r3,[r4,r2]
cmp r3,#0xD @ end word ?
addne r1,r1,#1 @ increment word length
addne r2,r2,#1 @ increment indice
bne 1b @ and loop
mov r3,#0
strb r3,[r4,r2] @ store final zero
bl anaWord @ sort word letters
add r2,r2,#2 @ jump OD and 0A
add r0,r4,r2 @ new address begin word
mov r1,#0 @ init length
b 1b @ and loop
2:
mov r3,#0 @ last word
strb r3,[r4,r2]
bl anaWord
mov r0,r8 @ file Fd
mov r7, #CLOSE
svc 0
cmp r0,#0 @ error close ?
blt 99f
ldr r0,iAdrptTabAna @ address sorted string area
mov r1,#0 @ first indice
ldr r2,iAdriNBword
ldr r2,[r2] @ last indice
ldr r3,iAdrptTabBuffer @ address sorted string area
bl triRapide @ quick sort
ldr r4,iAdrptTabAna @ address sorted string area
ldr r7,iAdrptTabBuffer @ address sorted string area
ldr r10,iAdrtbiCptAna @ address counter occurences
mov r9,r2 @ size word array
mov r8,#0 @ indice first occurence
ldr r3,[r4,r8,lsl #2] @ load first value
mov r2,#1 @ loop indice
mov r6,#0 @ counter
mov r12,#0 @ counter value max
3:
ldr r5,[r4,r2,lsl #2] @ load next value
mov r0,r3
mov r1,r5
bl comparStrings
cmp r0,#0 @ sorted strings equal ?
bne 4f
add r6,r6,#1 @ yes increment counter
b 5f
4: @ no
str r6,[r10,r8,lsl #2] @ store counter in first occurence
cmp r6,r12 @ counter > value max
movgt r12,r6 @ yes counter -> value max
mov r6,#0 @ raz counter
mov r8,r2 @ init index first occurence
mov r3,r5 @ init value first occurence
5:
add r2,r2,#1 @ increment indice
cmp r2,r9 @ end word array ?
blt 3b @ no -> loop
mov r2,#0 @ raz indice
6: @ display loop
ldr r6,[r10,r2,lsl #2] @ load counter
cmp r6,r12 @ equal to max value ?
bne 8f
ldr r0,[r7,r2,lsl #2] @ load address first word
bl affichageMess
add r3,r2,#1 @ index of the next word
mov r4,#0 @ counter
7:
ldr r0,iAdrszMessSpace
bl affichageMess
ldr r0,[r7,r3,lsl #2] @ load address other word
bl affichageMess
add r3,r3,#1 @ increment indice
add r4,r4,#1 @ increment counter
cmp r4,r6 @ max value ?
blt 7b @ no loop
ldr r0,iAdrszCarriageReturn
bl affichageMess
8:
add r2,r2,#1 @ increment indice
cmp r2,r9 @ maxi ?
blt 6b @ no -> loop
b 100f
99: @ display error
ldr r1,iAdrszMessErreur
bl displayError
100: @ standard end of the program
mov r0, #0 @ return code
mov r7, #EXIT @ request to exit program
svc #0 @ perform the system call
iAdrszCarriageReturn: .int szCarriageReturn
iAdrszFileName: .int szFileName
iAdrszMessErreur: .int szMessErreur
iAdrsBuffer: .int sBuffer
iSizeBuf: .int BUFFERSIZE
iAdrszMessSpace: .int szMessSpace
iAdrtbiCptAna: .int tbiCptAna
/******************************************************************/
/* analizing word */
/******************************************************************/
/* r0 word address */
/* r1 word length */
anaWord:
push {r1-r6,lr}
mov r5,r0
mov r6,r1
ldr r1,iAdrptTabBuffer
ldr r2,iAdriNBword
ldr r3,[r2]
str r0,[r1,r3,lsl #2]
ldr r1,iAdrptTabAna
ldr r4,iAdrptBuffer1
ldr r0,[r4]
add r6,r6,r0
add r6,r6,#1
str r6,[r4]
str r0,[r1,r3,lsl #2]
add r3,r3,#1
str r3,[r2]
mov r1,r0
mov r0,r5
bl triLetters @ sort word letters
mov r2,#0
100:
pop {r1-r6,pc}
iAdrptTabBuffer: .int ptTabBuffer
iAdrptTabAna: .int ptTabAna
iAdriNBword: .int iNBword
iAdrptBuffer1: .int ptBuffer1
/******************************************************************/
/* sort word letters */
/******************************************************************/
/* r0 address begin word */
/* r1 address recept array */
triLetters:
push {r1-r7,lr}
mov r2,#0
1:
ldrb r3,[r0,r2] @ load letter
cmp r3,#0 @ end word ?
beq 6f
cmp r2,#0 @ first letter ?
bne 2f
strb r3,[r1,r2] @ yes store in first position
add r2,r2,#1 @ increment indice
b 1b @ and loop
2:
mov r4,#0
3: @ begin loop to search insertion position
ldrb r5,[r1,r4] @ load letter
cmp r3,r5 @ compare
blt 4f @ to low -> insertion
add r4,r4,#1 @ increment indice
cmp r4,r2 @ compare to letters number in place
blt 3b @ search loop
strb r3,[r1,r2] @ else store in last position
add r2,r2,#1
b 1b @ and loop
4: @ move first letters in one position
sub r6,r2,#1 @ start indice
5:
ldrb r5,[r1,r6] @ load letter
add r7,r6,#1 @ destination index = source index + 1
strb r5,[r1,r7] @ store letter
sub r6,r6,#1 @ decrement indice
cmp r6,r4 @ end ?
bge 5b @ no loop
strb r3,[r1,r4] @ else store letter in free position
add r2,r2,#1
b 1b @ and loop
6:
mov r3,#0 @ final zero
strb r3,[r1,r2]
100:
pop {r1-r7,pc}
/***************************************************/
/* Recursive quicksort call */
/***************************************************/
/* r0 contains the address of table */
/* r1 contains index of first item */
/* r2 contains the number of elements > 0 */
/* r3 contains the address of table 2 */
triRapide:
push {r2-r6,lr} @ save registers
mov r6,r3
sub r2,#1 @ last item index
cmp r1,r2 @ first > last ?
bge 100f @ yes -> end
mov r4,r0 @ save r0
mov r5,r2 @ save r2
mov r3,r6
bl partition1 @ cutting into 2 parts
mov r2,r0 @ index partition
mov r0,r4 @ table address
bl triRapide @ sort lower part
mov r0,r4 @ table address
add r1,r2,#1 @ index begin = index partition + 1
add r2,r5,#1 @ number of elements
bl triRapide @ sort higher part
100: @ end function
pop {r2-r6,lr} @ restore registers
bx lr @ return
/******************************************************************/
/* Partition table elements */
/******************************************************************/
/* r0 contains the address of table */
/* r1 contains index of first item */
/* r2 contains index of last item */
/* r3 contains the address of table 2 */
partition1:
push {r1-r12,lr} @ save registers
mov r8,r0 @ save address table 2
mov r9,r1
ldr r10,[r8,r2,lsl #2] @ load string address last index
mov r4,r9 @ init with first index
mov r5,r9 @ init with first index
1: @ begin loop
ldr r6,[r8,r5,lsl #2] @ load string address
mov r0,r6
mov r1,r10
bl comparStrings
cmp r0,#0
ldrlt r7,[r8,r4,lsl #2] @ if < swap value table
strlt r6,[r8,r4,lsl #2]
strlt r7,[r8,r5,lsl #2]
ldrlt r7,[r3,r4,lsl #2] @ swap array 2
ldrlt r12,[r3,r5,lsl #2]
strlt r7,[r3,r5,lsl #2]
strlt r12,[r3,r4,lsl #2]
addlt r4,#1 @ and increment index 1
add r5,#1 @ increment index 2
cmp r5,r2 @ end ?
blt 1b @ no -> loop
ldr r7,[r8,r4,lsl #2] @ swap value
str r10,[r8,r4,lsl #2]
str r7,[r8,r2,lsl #2]
ldr r7,[r3,r4,lsl #2] @ swap array 2
ldr r12,[r3,r2,lsl #2]
str r7,[r3,r2,lsl #2]
str r12,[r3,r4,lsl #2]
mov r0,r4 @ return index partition
100:
pop {r1-r12,lr}
bx lr
/************************************/
/* Strings case sensitive comparisons */
/************************************/
/* r0 and r1 contain the addresses of the strings */
/* return 0 in r0 if equals */
/* return -1 if string r0 < string r1 */
/* return 1 if string r0 > string r1 */
comparStrings:
push {r1-r4} @ save registers
mov r2,#0 @ counter
1:
ldrb r3,[r0,r2] @ byte string 1
ldrb r4,[r1,r2] @ byte string 2
cmp r3,r4
movlt r0,#-1 @ smaller
movgt r0,#1 @ greater
bne 100f @ not equals
cmp r3,#0 @ 0 end string
moveq r0,#0 @ equals
beq 100f @ end string
add r2,r2,#1 @ else add 1 in counter
b 1b @ and loop
100:
pop {r1-r4}
bx lr
/***************************************************/
/* ROUTINES INCLUDE */
/***************************************************/
.include "../affichage.inc"
- Output:
bale able bela abel elba
cater carte crate caret trace
galen glean angle lange angel
regal glare alger lager large
lena lane lean elan neal
veil levi live vile evil
Arturo
wordset: map read.lines relative "unixdict.txt" => strip
anagrams: #[]
loop wordset 'word [
anagram: sort to [:char] word
unless key? anagrams anagram ->
anagrams\[anagram]: new []
anagrams\[anagram]: anagrams\[anagram] ++ word
]
loop select values anagrams 'x [5 =< size x] 'words ->
print join.with:", " words
- Output:
abel, able, bale, bela, elba
alger, glare, lager, large, regal
angel, angle, galen, glean, lange
caret, carte, cater, crate, trace
elan, lane, lean, lena, neal
evil, levi, live, veil, vile
AutoHotkey
The following code should work for AHK 1.0.* and 1.1.* versions:
FileRead, Contents, unixdict.txt
Loop, Parse, Contents, % "`n", % "`r"
{ ; parsing each line of the file we just read
Loop, Parse, A_LoopField ; parsing each letter/character of the current word
Dummy .= "," A_LoopField
Sort, Dummy, % "D," ; sorting those letters before removing the delimiters (comma)
StringReplace, Dummy, Dummy, % ",", % "", All
List .= "`n" Dummy " " A_LoopField , Dummy := ""
} ; at this point, we have a list where each line looks like <LETTERS><SPACE><WORD>
Count := 0, Contents := "", List := SubStr(List,2)
Sort, List
Loop, Parse, List, % "`n", % "`r"
{ ; now the list is sorted, parse it counting the consecutive lines with the same set of <LETTERS>
Max := (Count > Max) ? Count : Max
StringSplit, LinePart, A_LoopField, % " " ; (LinePart1 are the letters, LinePart2 is the word)
If ( PreviousLinePart1 = LinePart1 )
Count++ , WordList .= "," LinePart2
Else
var_Result .= ( Count <> Max ) ? "" ; don't append if the number of common words is too low
: "`n" Count "`t" PreviousLinePart1 "`t" SubStr(WordList,2)
, WordList := "" , Count := 0
PreviousLinePart1 := LinePart1
}
List := "", var_Result := SubStr(var_Result,2)
Sort, var_Result, R N ; make the higher scores appear first
Loop, Parse, var_Result, % "`n", % "`r"
If ( 1 == InStr(A_LoopField,Max) )
var_Output .= "`n" A_LoopField
Else ; output only those sets of letters that scored the maximum amount of common words
Break
MsgBox, % ClipBoard := SubStr(var_Output,2) ; the result is also copied to the clipboard
- Output:
4 aeln lane,lean,lena,neal
4 aeglr glare,lager,large,regal
4 aegln angle,galen,glean,lange
4 acert carte,cater,crate,trace
4 abel able,bale,bela,elba
4 eilv levi,live,veil,vile
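The sort-then-scan idea used above (tag each word with its sorted letters, sort the tagged list, then look at consecutive entries that share the same letters) can be sketched in Python. This is only an illustrative sketch, not a translation of the AutoHotkey code; the function name `largest_runs` and the sample word list are assumptions made for the example.

```python
from itertools import groupby

def largest_runs(words):
    """Return the largest anagram groups using a sort-then-scan pass."""
    # Tag every word with its sorted letters: (letters, word), then sort the tags.
    tagged = sorted(("".join(sorted(w)), w) for w in words)
    # After sorting, entries with equal letters are adjacent and form one group.
    runs = [[w for _, w in grp] for _, grp in groupby(tagged, key=lambda t: t[0])]
    longest = max((len(r) for r in runs), default=0)
    return [r for r in runs if len(r) == longest]

# Hypothetical sample data standing in for unixdict.txt.
sample = ["evil", "live", "veil", "lane", "lean", "neal", "cat"]
print(largest_runs(sample))
```

Unlike the listing above, this sketch counts every word of a group, including the first one.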
AWK
# JUMBLEA.AWK - words with the most duplicate spellings
# syntax: GAWK -f JUMBLEA.AWK UNIXDICT.TXT
{ for (i=1; i<=NF; i++) {
w = sortstr(toupper($i))
arr[w] = arr[w] $i " "
n = gsub(/ /,"&",arr[w])
if (max_n < n) { max_n = n }
}
}
END {
for (w in arr) {
if (gsub(/ /,"&",arr[w]) == max_n) {
printf("%s\t%s\n",w,arr[w])
}
}
exit(0)
}
function sortstr(str, i,j,leng) {
leng = length(str)
for (i=2; i<=leng; i++) {
for (j=i; j>1 && substr(str,j-1,1) > substr(str,j,1); j--) {
str = substr(str,1,j-2) substr(str,j,1) substr(str,j-1,1) substr(str,j+1)
}
}
return(str)
}
- Output:
ABEL abel able bale bela elba
ACERT caret carte cater crate trace
AEGLN angel angle galen glean lange
AEGLR alger glare lager large regal
AELN elan lane lean lena neal
EILV evil levi live veil vile
Alternatively, non-POSIX version:
#!/bin/gawk -f
{ patsplit($0, chars, ".")
asort(chars)
sorted = ""
for (i = 1; i <= length(chars); i++)
sorted = sorted chars[i]
if (++count[sorted] > countMax) countMax++
accum[sorted] = accum[sorted] " " $0
}
END {
for (i in accum)
if (count[i] == countMax)
print substr(accum[i], 2)
}
BASIC
BaCon
OPTION COLLAPSE TRUE
DECLARE idx$ ASSOC STRING
FOR w$ IN LOAD$("unixdict.txt") STEP NL$
set$ = SORT$(EXPLODE$(w$, 1))
idx$(set$) = APPEND$(idx$(set$), 0, w$)
total = AMOUNT(idx$(set$))
IF MaxCount < total THEN MaxCount = total
NEXT
PRINT "Analyzing took ", TIMER, " msecs.", NL$
LOOKUP idx$ TO n$ SIZE x
FOR y = 0 TO x-1
IF MaxCount = AMOUNT(idx$(n$[y])) THEN PRINT n$[y], ": ", idx$(n$[y])
NEXT
- Output:
Analyzing took 35 msecs.
a b e l: abel able bale bela elba
a e g l r: alger glare lager large regal
a e g l n: angel angle galen glean lange
a c e r t: caret carte cater crate trace
a e l n: elan lane lean lena neal
e i l v: evil levi live veil vile
BBC BASIC
INSTALL @lib$+"SORTLIB"
sort% = FN_sortinit(0,0)
REM Count number of words in dictionary:
nwords% = 0
dict% = OPENIN("unixdict.txt")
WHILE NOT EOF#dict%
word$ = GET$#dict%
nwords% += 1
ENDWHILE
CLOSE #dict%
REM Create arrays big enough to contain the dictionary:
DIM dict$(nwords%), sort$(nwords%)
REM Load the dictionary and sort the characters in the words:
dict% = OPENIN("unixdict.txt")
FOR word% = 1 TO nwords%
word$ = GET$#dict%
dict$(word%) = word$
sort$(word%) = FNsortchars(word$)
NEXT word%
CLOSE #dict%
REM Sort arrays using the 'sorted character' words as a key:
C% = nwords%
CALL sort%, sort$(1), dict$(1)
REM Count the longest sets of anagrams:
max% = 0
set% = 1
FOR word% = 1 TO nwords%-1
IF sort$(word%) = sort$(word%+1) THEN
set% += 1
ELSE
IF set% > max% THEN max% = set%
set% = 1
ENDIF
NEXT word%
REM Output the results:
set% = 1
FOR word% = 1 TO nwords%-1
IF sort$(word%) = sort$(word%+1) THEN
set% += 1
ELSE
IF set% = max% THEN
FOR anagram% = word%-max%+1 TO word%
PRINT dict$(anagram%),;
NEXT
PRINT
ENDIF
set% = 1
ENDIF
NEXT word%
END
DEF FNsortchars(word$)
LOCAL C%, char&()
DIM char&(LEN(word$))
$$^char&(0) = word$
C% = LEN(word$)
CALL sort%, char&(0)
= $$^char&(0)
- Output:
abel able bale bela elba
caret carte cater crate trace
angel angle galen glean lange
alger glare lager large regal
elan lane lean lena neal
evil levi live veil vile
BQN
words ← •FLines "unixdict.txt"
•Show¨{𝕩/˜(⊢=⌈´)≠¨𝕩} (⊐∧¨)⊸⊔ words
⟨ "abel" "able" "bale" "bela" "elba" ⟩
⟨ "alger" "glare" "lager" "large" "regal" ⟩
⟨ "angel" "angle" "galen" "glean" "lange" ⟩
⟨ "caret" "carte" "cater" "crate" "trace" ⟩
⟨ "elan" "lane" "lean" "lena" "neal" ⟩
⟨ "evil" "levi" "live" "veil" "vile" ⟩
Assumes that unixdict.txt is in the same folder. The JS implementation must be run in Node.js to have access to the filesystem.
(⊐∧¨)⊸⊔ is an expression which sorts the letters of each word and groups the words by the result.
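For readers unfamiliar with BQN, the two steps described above (group the words by their sorted letters, then keep only the groups whose length equals the maximum) can be sketched in Python. This is a hedged illustration rather than a translation; the names `group_by_sorted_letters` and `keep_largest` are assumptions introduced here.

```python
from collections import defaultdict

def group_by_sorted_letters(words):
    """Group words whose sorted letters are identical."""
    groups = defaultdict(list)
    for word in words:
        groups["".join(sorted(word))].append(word)
    return list(groups.values())

def keep_largest(groups):
    """Keep only the groups whose size equals the maximum size."""
    if not groups:
        return []
    top = max(len(g) for g in groups)
    return [g for g in groups if len(g) == top]

# Hypothetical sample data standing in for unixdict.txt.
print(keep_largest(group_by_sorted_letters(["abel", "able", "bale", "tea", "eat"])))
```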
Bracmat
This solution makes extensive use of Bracmat's computer algebra mechanisms. A trick is needed to handle words that are merely repetitions of a single letter, such as iii. That's why the variable sum isn't initialised with 0, but with a non-number, in this case the empty string. The correct handling of the characters 0-9 also needs a trick so that they are not added numerically: they are prefixed with a non-digit, an N in this case. After completely traversing the word list, the program writes a file product.txt that can be visually inspected.
The program is not fast. (Minutes rather than seconds.)
( get$("unixdict.txt",STR):?list
& 1:?product
& whl
' ( @(!list:(%?word:?w) \n ?list)
& :?sum
& whl
' ( @(!w:%?let ?w)
& (!let:~#|str$(N !let))+!sum:?sum
)
& !sum^!word*!product:?product
)
& lst$(product,"product.txt",NEW)
& 0:?max
& :?group
& ( !product
: ?
* ?^(%+%:?exp)
* ( ?
& !exp
: ?
+ ( [>!max:[?max&!exp:?group
| [~<!max&!group !exp:?group
)
& ~
)
| out$!group
)
);
- Output:
abel+able+bale+bela+elba
caret+carte+cater+crate+trace
angel+angle+galen+glean+lange
alger+glare+lager+large+regal
elan+lane+lean+lena+neal
evil+levi+live+veil+vile
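The Bracmat program keys each word by an algebraic sum of its letters, which the computer algebra engine normalises regardless of letter order. A comparable order-independent key in Python is a multiset of letters. The following is a minimal sketch, not a translation; `multiset_key` and `anagram_groups` are hypothetical names, and the sample words are made up for illustration.

```python
from collections import Counter, defaultdict

def multiset_key(word):
    """Order-independent, hashable key: the set of (letter, count) pairs."""
    return frozenset(Counter(word).items())

def anagram_groups(words):
    """Map each multiset key to the list of words that share it."""
    groups = defaultdict(list)
    for word in words:
        groups[multiset_key(word)].append(word)
    return groups

# Repeated letters and digits need no special handling here:
# "iii" and "i" get different keys, and "1" is just another character.
for key, members in anagram_groups(["able", "bale", "iii", "i", "10th"]).items():
    print(sorted(key), members)
```

Because letters are counted rather than added, the two tricks described in the text (the non-numeric initial sum and the N prefix for digits) have no counterpart in this sketch.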
C
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <time.h>
char *sortedWord(const char *word, char *wbuf)
{
char *p1, *p2, *endwrd;
char t;
int swaps;
strcpy(wbuf, word);
endwrd = wbuf+strlen(wbuf);
do {
swaps = 0;
p1 = wbuf; p2 = endwrd-1;
while (p1<p2) {
if (*p2 > *p1) {
t = *p2; *p2 = *p1; *p1 = t;
swaps = 1;
}
p1++; p2--;
}
p1 = wbuf; p2 = p1+1;
while(p2 < endwrd) {
if (*p2 > *p1) {
t = *p2; *p2 = *p1; *p1 = t;
swaps = 1;
}
p1++; p2++;
}
} while (swaps);
return wbuf;
}
static
short cxmap[] = {
0x06, 0x1f, 0x4d, 0x0c, 0x5c, 0x28, 0x5d, 0x0e, 0x09, 0x33, 0x31, 0x56,
0x52, 0x19, 0x29, 0x53, 0x32, 0x48, 0x35, 0x55, 0x5e, 0x14, 0x27, 0x24,
0x02, 0x3e, 0x18, 0x4a, 0x3f, 0x4c, 0x45, 0x30, 0x08, 0x2c, 0x1a, 0x03,
0x0b, 0x0d, 0x4f, 0x07, 0x20, 0x1d, 0x51, 0x3b, 0x11, 0x58, 0x00, 0x49,
0x15, 0x2d, 0x41, 0x17, 0x5f, 0x39, 0x16, 0x42, 0x37, 0x22, 0x1c, 0x0f,
0x43, 0x5b, 0x46, 0x4b, 0x0a, 0x26, 0x2e, 0x40, 0x12, 0x21, 0x3c, 0x36,
0x38, 0x1e, 0x01, 0x1b, 0x05, 0x4e, 0x44, 0x3d, 0x04, 0x10, 0x5a, 0x2a,
0x23, 0x34, 0x25, 0x2f, 0x2b, 0x50, 0x3a, 0x54, 0x47, 0x59, 0x13, 0x57,
};
#define CXMAP_SIZE (sizeof(cxmap)/sizeof(short))
int Str_Hash( const char *key, int ix_max )
{
const char *cp;
short mash;
int hash = 33501551;
for (cp = key; *cp; cp++) {
mash = cxmap[*cp % CXMAP_SIZE];
hash = (hash >>4) ^ 0x5C5CF5C ^ ((hash<<1) + (mash<<5));
hash &= 0x3FFFFFFF;
}
return hash % ix_max;
}
typedef struct sDictWord *DictWord;
struct sDictWord {
const char *word;
DictWord next;
};
typedef struct sHashEntry *HashEntry;
struct sHashEntry {
const char *key;
HashEntry next;
DictWord words;
HashEntry link;
short wordCount;
};
#define HT_SIZE 8192
HashEntry hashTable[HT_SIZE];
HashEntry mostPerms = NULL;
int buildAnagrams( FILE *fin )
{
char buffer[40];
char bufr2[40];
char *hkey;
int hix;
HashEntry he, *hep;
DictWord we;
int maxPC = 2;
int numWords = 0;
while ( fgets(buffer, 40, fin)) {
for(hkey = buffer; *hkey && (*hkey!='\n'); hkey++);
*hkey = 0;
hkey = sortedWord(buffer, bufr2);
hix = Str_Hash(hkey, HT_SIZE);
he = hashTable[hix]; hep = &hashTable[hix];
while( he && strcmp(he->key , hkey) ) {
hep = &he->next;
he = he->next;
}
if ( ! he ) {
he = malloc(sizeof(struct sHashEntry));
he->next = NULL;
he->key = strdup(hkey);
he->wordCount = 0;
he->words = NULL;
he->link = NULL;
*hep = he;
}
we = malloc(sizeof(struct sDictWord));
we->word = strdup(buffer);
we->next = he->words;
he->words = we;
he->wordCount++;
if ( maxPC < he->wordCount) {
maxPC = he->wordCount;
mostPerms = he;
he->link = NULL;
}
else if (maxPC == he->wordCount) {
he->link = mostPerms;
mostPerms = he;
}
numWords++;
}
printf("%d words in dictionary max ana=%d\n", numWords, maxPC);
return maxPC;
}
int main( )
{
HashEntry he;
DictWord we;
FILE *f1;
f1 = fopen("unixdict.txt","r");
buildAnagrams(f1);
fclose(f1);
f1 = fopen("anaout.txt","w");
// f1 = stdout;
for (he = mostPerms; he; he = he->link) {
fprintf(f1,"%d:", he->wordCount);
for(we = he->words; we; we = we->next) {
fprintf(f1,"%s, ", we->word);
}
fprintf(f1, "\n");
}
fclose(f1);
return 0;
}
- Output:
(less than 1 second on old P500)
5:vile, veil, live, levi, evil,
5:trace, crate, cater, carte, caret,
5:regal, large, lager, glare, alger,
5:neal, lena, lean, lane, elan,
5:lange, glean, galen, angle, angel,
5:elba, bela, bale, able, abel,
A much shorter version with no fancy data structures:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
typedef struct { const char *key, *word; int cnt; } kw_t;
int lst_cmp(const void *a, const void *b)
{
return strcmp(((const kw_t*)a)->key, ((const kw_t*)b)->key);
}
/* Bubble sort. Faster than stock qsort(), believe it or not */
void sort_letters(char *s)
{
int i, j;
char t;
for (i = 0; s[i] != '\0'; i++) {
for (j = i + 1; s[j] != '\0'; j++)
if (s[j] < s[i]) {
t = s[j]; s[j] = s[i]; s[i] = t;
}
}
}
int main()
{
struct stat s;
char *words, *keys;
size_t i, j, k, longest, offset;
int n_word = 0;
kw_t *list;
int fd = open("unixdict.txt", O_RDONLY);
if (fd == -1) return 1;
fstat(fd, &s);
words = malloc(s.st_size * 2);
keys = words + s.st_size;
read(fd, words, s.st_size);
memcpy(keys, words, s.st_size);
/* change newline to null for easy use; sort letters in keys */
for (i = j = 0; i < s.st_size; i++) {
if (words[i] == '\n') {
words[i] = keys[i] = '\0';
sort_letters(keys + j);
j = i + 1;
n_word ++;
}
}
list = calloc(n_word, sizeof(kw_t));
/* make key/word pointer pairs for sorting */
for (i = j = k = 0; i < s.st_size; i++) {
if (words[i] == '\0') {
list[j].key = keys + k;
list[j].word = words + k;
k = i + 1;
j++;
}
}
qsort(list, n_word, sizeof(kw_t), lst_cmp);
/* count each key's repetition */
for (i = j = k = offset = longest = 0; i < n_word; i++) {
if (!strcmp(list[i].key, list[j].key)) {
++k;
continue;
}
/* move current longest to beginning of array */
if (k < longest) {
k = 0;
j = i;
continue;
}
if (k > longest) offset = 0;
while (j < i) list[offset++] = list[j++];
longest = k;
k = 0;
}
/* show the longest */
for (i = 0; i < offset; i++) {
printf("%s ", list[i].word);
if (i < n_word - 1 && strcmp(list[i].key, list[i+1].key))
printf("\n");
}
/* free(list); free(words); */
close(fd);
return 0;
}
- Output:
abel able bale bela elba
caret carte cater crate trace
angel angle galen glean lange
alger glare lager large regal
elan lane lean lena neal
evil levi live veil vile
C#
using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;
namespace Anagram
{
class Program
{
const string DICO_URL = "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt";
static void Main( string[] args )
{
WebRequest request = WebRequest.Create(DICO_URL);
string[] words;
using (StreamReader sr = new StreamReader(request.GetResponse().GetResponseStream(), true)) {
words = Regex.Split(sr.ReadToEnd(), @"\r?\n");
}
var groups = from string w in words
group w by string.Concat(w.OrderBy(x => x)) into c
group c by c.Count() into d
orderby d.Key descending
select d;
foreach (var c in groups.First()) {
Console.WriteLine(string.Join(" ", c));
}
}
}
}
- Output:
abel able bale bela elba
alger glare lager large regal
angel angle galen glean lange
caret carte cater crate trace
elan lane lean lena neal
evil levi live veil vile
C++
#include <iostream>
#include <fstream>
#include <string>
#include <map>
#include <vector>
#include <algorithm>
#include <iterator>
int main() {
std::ifstream in("unixdict.txt");
typedef std::map<std::string, std::vector<std::string> > AnagramMap;
AnagramMap anagrams;
std::string word;
size_t count = 0;
while (std::getline(in, word)) {
std::string key = word;
std::sort(key.begin(), key.end());
// note: the [] op. automatically inserts a new value if key does not exist
AnagramMap::mapped_type & v = anagrams[key];
v.push_back(word);
count = std::max(count, v.size());
}
in.close();
for (AnagramMap::const_iterator it = anagrams.begin(), e = anagrams.end();
it != e; it++)
if (it->second.size() >= count) {
std::copy(it->second.begin(), it->second.end(),
std::ostream_iterator<std::string>(std::cout, ", "));
std::cout << std::endl;
}
return 0;
}
- Output:
abel, able, bale, bela, elba,
caret, carte, cater, crate, trace,
angel, angle, galen, glean, lange,
alger, glare, lager, large, regal,
elan, lane, lean, lena, neal,
evil, levi, live, veil, vile,
Clojure
Assume wordfile is the path of the local file containing the words. This code makes a map (groups) whose keys are sorted letters and values are lists of the key's anagrams. It then determines the length of the longest list, and prints out all the lists of that length.
(require '[clojure.java.io :as io])
(def groups
(with-open [r (io/reader wordfile)]
(group-by sort (line-seq r))))
(let [wordlists (sort-by (comp - count) (vals groups))
maxlength (count (first wordlists))]
(doseq [wordlist (take-while #(= (count %) maxlength) wordlists)]
(println wordlist)))
(->> (slurp "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
clojure.string/split-lines
(group-by sort)
vals
(sort-by count >) ;; sort in reverse
(partition-by count)
first)
;; (["caret" "carte" "cater" "crate" "trace"]
;; ["angel" "angle" "galen" "glean" "lange"]
;; ["elan" "lane" "lean" "lena" "neal"]
;; ["alger" "glare" "lager" "large" "regal"]
;; ["evil" "levi" "live" "veil" "vile"]
;; ["abel" "able" "bale" "bela" "elba"])
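The approach described for Clojure (build a map from sorted letters to word lists, order the lists by size, and take the leading lists that share the maximum length) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `largest_groups` is a hypothetical name, and the word list is passed in directly instead of being read from a file or URL.

```python
from itertools import takewhile

def largest_groups(words):
    """Group by sorted letters, order groups by size, keep the leading ties."""
    groups = {}
    for word in words:
        groups.setdefault("".join(sorted(word)), []).append(word)
    # Largest groups first; Python's sort is stable, so ties keep insertion order.
    ordered = sorted(groups.values(), key=len, reverse=True)
    if not ordered:
        return []
    top = len(ordered[0])
    return list(takewhile(lambda g: len(g) == top, ordered))

# Hypothetical sample data standing in for unixdict.txt.
sample = ["caret", "angel", "crate", "angle", "dog", "trace", "glean"]
print(largest_groups(sample))
```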
CLU
% Keep a list of anagrams
anagrams = cluster is new, add, largest_size, sets
anagram_set = struct[letters: string, words: array[string]]
rep = array[anagram_set]
new = proc () returns (cvt)
return(rep$[])
end new
% Sort the letters in a string
sort = proc (s: string) returns (string)
chars: array[int] := array[int]$fill(0,256,0) % Assuming ASCII here
for c: char in string$chars(s) do
i: int := char$c2i(c)
chars[i] := chars[i] + 1
end
sorted: array[char] := array[char]$predict(1,string$size(s))
for i: int in array[int]$indexes(chars) do
for j: int in int$from_to(1,chars[i]) do
array[char]$addh(sorted,char$i2c(i))
end
end
return(string$ac2s(sorted))
end sort
% Add a word
add = proc (a: cvt, s: string)
letters: string := sort(s)
as: anagram_set
begin
for t_as: anagram_set in rep$elements(a) do
if t_as.letters = letters then
as := t_as
exit found
end
end
as := anagram_set${letters: letters, words: array[string]$[]}
rep$addh(a, as)
end except when found: end
array[string]$addh(as.words, s)
end add
% Find the size of the largest set
largest_size = proc (a: cvt) returns (int)
size: int := 0
for as: anagram_set in rep$elements(a) do
cur: int := array[string]$size(as.words)
if cur > size then size := cur end
end
return(size)
end largest_size
% Yield all sets of a given size
sets = iter (a: cvt, s: int) yields (sequence[string])
for as: anagram_set in rep$elements(a) do
if array[string]$size(as.words) = s then
yield(sequence[string]$a2s(as.words))
end
end
end sets
end anagrams
start_up = proc ()
an: anagrams := anagrams$new()
dict: stream := stream$open(file_name$parse("unixdict.txt"), "read")
while true do
anagrams$add(an, stream$getl(dict))
except when end_of_file: break end
end
stream$close(dict)
po: stream := stream$primary_output()
max: int := anagrams$largest_size(an)
stream$putl(po, "Largest amount of anagrams per set: " || int$unparse(max))
stream$putl(po, "")
for words: sequence[string] in anagrams$sets(an, max) do
for word: string in sequence[string]$elements(words) do
stream$putleft(po, word, 7)
end
stream$putl(po, "")
end
end start_up
- Output:
Largest amount of anagrams per set: 5

abel able bale bela elba
alger glare lager large regal
angel angle galen glean lange
caret carte cater crate trace
elan lane lean lena neal
evil levi live veil vile
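The sort procedure in the CLU cluster orders a word's letters by counting occurrences per character code instead of comparing letters. A rough Python sketch of the same counting idea follows; `counting_sort_letters` is a hypothetical name, and, like the CLU comment, the sketch assumes character codes below 256.

```python
def counting_sort_letters(word):
    """Return the letters of word in ascending order using a counting sort."""
    counts = [0] * 256  # assumption: every character code is below 256
    for ch in word:
        counts[ord(ch)] += 1  # raises IndexError for codes >= 256
    # Emit each character code as many times as it was seen, in code order.
    return "".join(chr(code) * n for code, n in enumerate(counts))

print(counting_sort_letters("trace"))
```

The result can serve directly as the grouping key: two words are anagrams exactly when their counted-and-rebuilt strings are equal.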
COBOL
Tested with GnuCOBOL 2.0. ALLWORDS output display trimmed for width.
*> TECTONICS
*> wget http://wiki.puzzlers.org/pub/wordlists/unixdict.txt
*> or visit https://sourceforge.net/projects/souptonuts/files
*> or snag ftp://ftp.openwall.com/pub/wordlists/all.gz
*> for a 5 million all language word file (a few phrases)
*> cobc -xj anagrams.cob [-DMOSTWORDS -DMOREWORDS -DALLWORDS]
*> ***************************************************************
identification division.
program-id. anagrams.
environment division.
configuration section.
repository.
function all intrinsic.
input-output section.
file-control.
select words-in
assign to wordfile
organization is line sequential
status is words-status
.
REPLACE ==:LETTERS:== BY ==42==.
data division.
file section.
fd words-in record is varying from 1 to :LETTERS: characters
depending on word-length.
01 word-record.
05 word-data pic x occurs 0 to :LETTERS: times
depending on word-length.
working-storage section.
>>IF ALLWORDS DEFINED
01 wordfile constant as "/usr/local/share/dict/all.words".
01 max-words constant as 4802100.
>>ELSE-IF MOSTWORDS DEFINED
01 wordfile constant as "/usr/local/share/dict/linux.words".
01 max-words constant as 628000.
>>ELSE-IF MOREWORDS DEFINED
01 wordfile constant as "/usr/share/dict/words".
01 max-words constant as 100000.
>>ELSE
01 wordfile constant as "unixdict.txt".
01 max-words constant as 26000.
>>END-IF
*> The 5 million word file needs to restrict the word length
>>IF ALLWORDS DEFINED
01 max-letters constant as 26.
>>ELSE
01 max-letters constant as :LETTERS:.
>>END-IF
01 word-length pic 99 comp-5.
01 words-status pic xx.
88 ok-status values '00' thru '09'.
88 eof-status value '10'.
*> sortable word by letter table
01 letter-index usage index.
01 letter-table.
05 letters occurs 1 to max-letters times
depending on word-length
ascending key letter
indexed by letter-index.
10 letter pic x.
*> table of words
01 sorted-index usage index.
01 word-table.
05 word-list occurs 0 to max-words times
depending on word-tally
ascending key sorted-word
indexed by sorted-index.
10 match-count pic 999 comp-5.
10 this-word pic x(max-letters).
10 sorted-word pic x(max-letters).
01 sorted-display pic x(10).
01 interest-table.
05 interest-list pic 9(8) comp-5
occurs 0 to max-words times
depending on interest-tally.
01 outer pic 9(8) comp-5.
01 inner pic 9(8) comp-5.
01 starter pic 9(8) comp-5.
01 ender pic 9(8) comp-5.
01 word-tally pic 9(8) comp-5.
01 interest-tally pic 9(8) comp-5.
01 tally-display pic zz,zzz,zz9.
01 most-matches pic 99 comp-5.
01 matches pic 99 comp-5.
01 match-display pic z9.
*> timing display
01 time-stamp.
05 filler pic x(11).
05 timer-hours pic 99.
05 filler pic x.
05 timer-minutes pic 99.
05 filler pic x.
05 timer-seconds pic 99.
05 filler pic x.
05 timer-subsec pic v9(6).
01 timer-elapsed pic 9(6)v9(6).
01 timer-value pic 9(6)v9(6).
01 timer-display pic zzz,zz9.9(6).
*> ***************************************************************
procedure division.
main-routine.
>>IF ALLWORDS DEFINED
display "** Words limited to " max-letters " letters **"
>>END-IF
perform show-time
perform load-words
perform find-most
perform display-result
perform show-time
goback
.
*> ***************************************************************
load-words.
open input words-in
if not ok-status then
display "error opening " wordfile upon syserr
move 1 to return-code
goback
end-if
perform until exit
read words-in
if eof-status then exit perform end-if
if not ok-status then
display wordfile " read error: " words-status upon syserr
end-if
if word-length equal zero then exit perform cycle end-if
>>IF ALLWORDS DEFINED
move min(word-length, max-letters) to word-length
>>END-IF
add 1 to word-tally
move word-record to this-word(word-tally) letter-table
sort letters ascending key letter
move letter-table to sorted-word(word-tally)
end-perform
move word-tally to tally-display
display trim(tally-display) " words" with no advancing
close words-in
if not ok-status then
display "error closing " wordfile upon syserr
move 1 to return-code
end-if
*> sort word list by anagram check field
sort word-list ascending key sorted-word
.
*> first entry in a list will end up with highest match count
find-most.
perform varying outer from 1 by 1 until outer > word-tally
move 1 to matches
add 1 to outer giving starter
perform varying inner from starter by 1
until sorted-word(inner) not equal sorted-word(outer)
add 1 to matches
end-perform
if matches > most-matches then
move matches to most-matches
initialize interest-table all to value
move 0 to interest-tally
end-if
move matches to match-count(outer)
if matches = most-matches then
add 1 to interest-tally
move outer to interest-list(interest-tally)
end-if
end-perform
.
*> only display the words with the most anagrams
display-result.
move interest-tally to tally-display
move most-matches to match-display
display ", most anagrams: " trim(match-display)
", with " trim(tally-display) " set" with no advancing
if interest-tally not equal 1 then
display "s" with no advancing
end-if
display " of interest"
perform varying outer from 1 by 1 until outer > interest-tally
move sorted-word(interest-list(outer)) to sorted-display
display sorted-display
" [" trim(this-word(interest-list(outer)))
with no advancing
add 1 to interest-list(outer) giving starter
add most-matches to interest-list(outer) giving ender
perform varying inner from starter by 1
until inner = ender
display ", " trim(this-word(inner))
with no advancing
end-perform
display "]"
end-perform
.
*> elapsed time
show-time.
move formatted-current-date("YYYY-MM-DDThh:mm:ss.ssssss")
to time-stamp
compute timer-value = timer-hours * 3600 + timer-minutes * 60
+ timer-seconds + timer-subsec
if timer-elapsed = 0 then
display time-stamp
move timer-value to timer-elapsed
else
if timer-value < timer-elapsed then
add 86400 to timer-value
end-if
subtract timer-elapsed from timer-value
move timer-value to timer-display
display time-stamp ", " trim(timer-display) " seconds"
end-if
.
end program anagrams.
- Output:
prompt$ time cobc -xjd anagrams.cob
2016-05-04T07:13:23.225147
25,104 words, most anagrams: 5, with 6 sets of interest
abel [abel, able, bale, bela, elba]
acert [caret, carte, cater, crate, trace]
aegln [angel, angle, galen, glean, lange]
aeglr [alger, glare, lager, large, regal]
aeln [elan, lane, lean, lena, neal]
eilv [evil, levi, live, veil, vile]
2016-05-04T07:13:23.262851, 0.037704 seconds

real 0m0.191s
user 0m0.152s
sys 0m0.024s

prompt$ cobc -xjd anagrams.cob -DMOSTWORDS
2016-05-04T07:13:42.570360
627,999 words, most anagrams: 17, with 1 set of interest
aerst [arest, arets, aster, astre, earst, rates, reast, resat, serta, stare, stear, tares, tarse, taser, tears, teras, treas]
2016-05-04T07:13:43.832743, 1.262383 seconds

prompt$ cobc -xjd anagrams.cob -DALLWORDS
** Words limited to 26 letters **
2016-05-04T07:13:50.944146
4,802,017 words, most anagrams: 68, with 1 set of interest
aeinst [aisnet, aniets, anites, antesi, anties, antise, ...
2016-05-04T07:14:02.475959, 11.531813 seconds
CoffeeScript
http = require 'http'
show_large_anagram_sets = (word_lst) ->
anagrams = {}
max_size = 0
for word in word_lst
key = word.split('').sort().join('')
anagrams[key] ?= []
anagrams[key].push word
size = anagrams[key].length
max_size = size if size > max_size
for key, variations of anagrams
if variations.length == max_size
console.log variations.join ' '
get_word_list = (process) ->
options =
host: "wiki.puzzlers.org"
path: "/pub/wordlists/unixdict.txt"
req = http.request options, (res) ->
s = ''
res.on 'data', (chunk) ->
s += chunk
res.on 'end', ->
process s.split '\n'
req.end()
get_word_list show_large_anagram_sets
- Output:
> coffee anagrams.coffee
[ 'abel', 'able', 'bale', 'bela', 'elba' ]
[ 'alger', 'glare', 'lager', 'large', 'regal' ]
[ 'angel', 'angle', 'galen', 'glean', 'lange' ]
[ 'caret', 'carte', 'cater', 'crate', 'trace' ]
[ 'elan', 'lane', 'lean', 'lena', 'neal' ]
[ 'evil', 'levi', 'live', 'veil', 'vile' ]
Common Lisp
Uses the DRAKMA library to retrieve the wordlist.
(defun anagrams (&optional (url "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt"))
(let ((words (drakma:http-request url :want-stream t))
(wordsets (make-hash-table :test 'equalp)))
;; populate the wordsets and close stream
(do ((word (read-line words nil nil) (read-line words nil nil)))
((null word) (close words))
(let ((letters (sort (copy-seq word) 'char<)))
(multiple-value-bind (pair presentp)
(gethash letters wordsets)
(if presentp
(setf (car pair) (1+ (car pair))
(cdr pair) (cons word (cdr pair)))
(setf (gethash letters wordsets)
(cons 1 (list word)))))))
;; find and return the biggest wordsets
(loop with maxcount = 0 with maxwordsets = '()
for pair being each hash-value of wordsets
if (> (car pair) maxcount)
do (setf maxcount (car pair)
maxwordsets (list (cdr pair)))
else if (eql (car pair) maxcount)
do (push (cdr pair) maxwordsets)
finally (return (values maxwordsets maxcount)))))
Evaluating
(multiple-value-bind (wordsets count) (anagrams)
(pprint wordsets)
(print count))
- Output:
(("vile" "veil" "live" "levi" "evil")
 ("regal" "large" "lager" "glare" "alger")
 ("lange" "glean" "galen" "angle" "angel")
 ("neal" "lena" "lean" "lane" "elan")
 ("trace" "crate" "cater" "carte" "caret")
 ("elba" "bela" "bale" "able" "abel"))
5
Another method, assuming file is local:
(defun read-words (file)
(with-open-file (stream file)
(loop with w = "" while w collect (setf w (read-line stream nil)))))
(defun anagram (file)
(let ((wordlist (read-words file))
(h (make-hash-table :test #'equal))
longest)
(loop for w in wordlist with ws do
(setf ws (sort (copy-seq w) #'char<))
(setf (gethash ws h) (cons w (gethash ws h))))
(loop for w being the hash-keys in h using (hash-value wl)
with max-len = 0 do
(let ((l (length wl)))
(if (> l max-len) (setf longest nil max-len l))
(if (= l max-len) (push wl longest))))
longest))
(format t "~{~{~a ~}~^~%~}" (anagram "unixdict.txt"))
- Output:
elba bela bale able abel
regal large lager glare alger
lange glean galen angle angel
trace crate cater carte caret
neal lena lean lane elan
vile veil live levi evil
Component Pascal
BlackBox Component Builder
MODULE BbtAnagrams;
IMPORT StdLog,Files,Strings,Args;
CONST
MAXPOOLSZ = 1024;
TYPE
Node = POINTER TO LIMITED RECORD;
count: INTEGER;
word: Args.String;
desc: Node;
next: Node;
END;
Pool = POINTER TO LIMITED RECORD
capacity,max: INTEGER;
words: POINTER TO ARRAY OF Node;
END;
PROCEDURE NewNode(word: ARRAY OF CHAR): Node;
VAR
n: Node;
BEGIN
NEW(n);n.count := 0;n.word := word$;
n.desc := NIL;n.next := NIL;
RETURN n
END NewNode;
PROCEDURE Index(s: ARRAY OF CHAR;cap: INTEGER): INTEGER;
VAR
i,sum: INTEGER;
BEGIN
sum := 0;
FOR i := 0 TO LEN(s$) DO
INC(sum,ORD(s[i]))
END;
RETURN sum MOD cap
END Index;
PROCEDURE ISort(VAR s: ARRAY OF CHAR);
VAR
i, j: INTEGER;
t: CHAR;
BEGIN
FOR i := 0 TO LEN(s$) - 1 DO
j := i;
t := s[j];
WHILE (j > 0) & (s[j -1] > t) DO
s[j] := s[j - 1];
DEC(j)
END;
s[j] := t
END
END ISort;
PROCEDURE SameLetters(x,y: ARRAY OF CHAR): BOOLEAN;
BEGIN
ISort(x);ISort(y);
RETURN x = y
END SameLetters;
PROCEDURE NewPoolWith(cap: INTEGER): Pool;
VAR
i: INTEGER;
p: Pool;
BEGIN
NEW(p);
p.capacity := cap;
p.max := 0;
NEW(p.words,cap);
i := 0;
WHILE i < p.capacity DO
p.words[i] := NIL;
INC(i);
END;
RETURN p
END NewPoolWith;
PROCEDURE NewPool(): Pool;
BEGIN
RETURN NewPoolWith(MAXPOOLSZ);
END NewPool;
PROCEDURE (p: Pool) Add(w: ARRAY OF CHAR), NEW;
VAR
idx: INTEGER;
iter,n: Node;
BEGIN
idx := Index(w,p.capacity);
iter := p.words[idx];
n := NewNode(w);
WHILE(iter # NIL) DO
IF SameLetters(w,iter.word) THEN
INC(iter.count);
IF iter.count > p.max THEN p.max := iter.count END;
n.desc := iter.desc;
iter.desc := n;
RETURN
END;
iter := iter.next
END;
ASSERT(iter = NIL);
n.next := p.words[idx];p.words[idx] := n
END Add;
PROCEDURE ShowAnagrams(l: Node);
VAR
iter: Node;
BEGIN
iter := l;
WHILE iter # NIL DO
StdLog.String(iter.word);StdLog.String(" ");
iter := iter.desc
END;
StdLog.Ln
END ShowAnagrams;
PROCEDURE (p: Pool) ShowMax(),NEW;
VAR
i: INTEGER;
iter: Node;
BEGIN
FOR i := 0 TO LEN(p.words) - 1 DO
IF p.words[i] # NIL THEN
iter := p.words^[i];
WHILE iter # NIL DO
IF iter.count = p.max THEN
ShowAnagrams(iter);
END;
iter := iter.next
END
END
END
END ShowMax;
PROCEDURE GetLine(rd: Files.Reader; OUT str: ARRAY OF CHAR);
VAR
i: INTEGER;
b: BYTE;
BEGIN
rd.ReadByte(b);i := 0;
WHILE (~rd.eof) & (i < LEN(str)) DO
IF (b = ORD(0DX)) OR (b = ORD(0AX)) THEN str[i] := 0X; RETURN END;
str[i] := CHR(b);
rd.ReadByte(b);INC(i)
END;
str[LEN(str) - 1] := 0X
END GetLine;
PROCEDURE DoProcess*;
VAR
params : Args.Params;
loc: Files.Locator;
fd: Files.File;
rd: Files.Reader;
line: ARRAY 81 OF CHAR;
p: Pool;
BEGIN
Args.Get(params);
IF params.argc = 1 THEN
loc := Files.dir.This("Bbt");
fd := Files.dir.Old(loc,params.args[0]$,FALSE);
StdLog.String("Processing: " + params.args[0]);StdLog.Ln;StdLog.Ln;
rd := fd.NewReader(NIL);
p := NewPool();
REPEAT
GetLine(rd,line);
p.Add(line);
UNTIL rd.eof;
p.ShowMax()
ELSE
StdLog.String("Error: Missing file to process");StdLog.Ln
END;
END DoProcess;
END BbtAnagrams.
Execute: ^Q BbtAnagrams.DoProcess unixdict.txt~
- Output:
Processing: unixdict.txt

abel elba bela bale able
elan neal lena lean lane
evil vile veil live levi
angel lange glean galen angle
alger regal large lager glare
caret trace crate cater carte
Crystal
require "http/client"
response = HTTP::Client.get("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
if response.body?
words : Array(String) = response.body.split
anagram = {} of String => Array(String)
words.each do |word|
key = word.split("").sort.join
if !anagram[key]?
anagram[key] = [word]
else
anagram[key] << word
end
end
count = anagram.values.map { |ana| ana.size }.max
anagram.each_value { |ana| puts ana if ana.size >= count }
end
- Output:
["abel", "able", "bale", "bela", "elba"]
["alger", "glare", "lager", "large", "regal"]
["angel", "angle", "galen", "glean", "lange"]
["caret", "carte", "cater", "crate", "trace"]
["elan", "lane", "lean", "lena", "neal"]
["evil", "levi", "live", "veil", "vile"]
D
Short Functional Version
import std.stdio, std.algorithm, std.string, std.exception, std.file;
void main() {
string[][ubyte[]] an;
foreach (w; "unixdict.txt".readText.splitLines)
an[w.dup.representation.sort().release.assumeUnique] ~= w;
immutable m = an.byValue.map!q{ a.length }.reduce!max;
writefln("%(%s\n%)", an.byValue.filter!(ws => ws.length == m));
}
- Output:
["caret", "carte", "cater", "crate", "trace"]
["evil", "levi", "live", "veil", "vile"]
["abel", "able", "bale", "bela", "elba"]
["elan", "lane", "lean", "lena", "neal"]
["alger", "glare", "lager", "large", "regal"]
["angel", "angle", "galen", "glean", "lange"]
Runtime: about 0.07 seconds.
Faster Version
Less safe, same output.
void main() {
import std.stdio, std.algorithm, std.file, std.string;
auto keys = "unixdict.txt".readText!(char[]);
immutable vals = keys.idup;
string[][string] anags;
foreach (w; keys.splitter) {
immutable k = w.representation.sort().release.assumeUTF;
anags[k] ~= vals[k.ptr - keys.ptr .. k.ptr - keys.ptr + k.length];
}
//immutable m = anags.byValue.maxs!q{ a.length };
immutable m = anags.byValue.map!q{ a.length }.reduce!max;
writefln("%(%-(%s %)\n%)", anags.byValue.filter!(ws => ws.length == m));
}
Runtime: about 0.06 seconds.
Delphi
program AnagramsTest;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils,
System.Classes,
System.Diagnostics;
function Sort(s: string): string;
var
c: Char;
i, j, aLength: Integer;
begin
aLength := s.Length;
if aLength = 0 then
exit('');
Result := s;
for i := 1 to aLength - 1 do
for j := i + 1 to aLength do
if result[i] > result[j] then
begin
c := result[i];
result[i] := result[j];
result[j] := c;
end;
end;
function IsAnagram(s1, s2: string): Boolean;
begin
if s1.Length <> s2.Length then
exit(False);
Result := Sort(s1) = Sort(s2);
end;
function Split(s: string; var Count: Integer; var words: string): Boolean;
var
sCount: string;
begin
sCount := s.Substring(0, 4);
words := s.Substring(5);
Result := TryStrToInt(sCount, Count);
end;
function CompareLength(List: TStringList; Index1, Index2: Integer): Integer;
begin
result := List[Index1].Length - List[Index2].Length;
if Result = 0 then
Result := CompareText(Sort(List[Index2]), Sort(List[Index1]));
end;
var
Dict: TStringList;
i, j, Count, MaxCount, WordLength, Index: Integer;
words: string;
StopWatch: TStopwatch;
begin
StopWatch := TStopwatch.Create;
StopWatch.Start;
Dict := TStringList.Create();
Dict.LoadFromFile('unixdict.txt');
Dict.CustomSort(CompareLength);
Index := 0;
words := Dict[Index];
Count := 1;
while Index + Count < Dict.Count do
begin
if IsAnagram(Dict[Index], Dict[Index + Count]) then
begin
words := words + ',' + Dict[Index + Count];
Dict[Index + Count] := '';
inc(Count);
end
else
begin
Dict[Index] := format('%.4d', [Count]) + ',' + words;
inc(Index, Count);
words := Dict[Index];
Count := 1;
end;
end;
// The last entry did not match any other word
if not Dict[Dict.count - 1].IsEmpty then
Dict.Delete(Dict.count - 1);
Dict.Sort;
while Dict[0].IsEmpty do
Dict.Delete(0);
StopWatch.Stop;
Writeln(Format('Time pass: %d ms [i7-4500U Windows 7]', [StopWatch.ElapsedMilliseconds]));
Split(Dict[Dict.count - 1], MaxCount, words);
writeln(#10'The anagrams that contain the most words, has ', MaxCount, ' words:'#10);
writeln('Words found:'#10);
Writeln(' ', words);
for i := Dict.Count - 2 downto 0 do
begin
Split(Dict[i], Count, words);
if Count = MaxCount then
Writeln(' ', words)
else
Break;
end;
Dict.Free;
Readln;
end.
- Output:
Time pass: 700 ms [i7-4500U Windows 7]

The anagrams that contain the most words, has 5 words:

Words found:

 veil,live,vile,evil,levi
 trace,crate,cater,carte,caret
 regal,glare,large,lager,alger
 neal,lean,elan,lane,lena
 glean,angel,galen,angle,lange
 able,bale,abel,bela,elba
DuckDB
Most of the heavy lifting in the program presented here is done by DuckDB's histogram() function, which returns a DuckDB `MAP`. The following example illustrates this using the spectrum() helper that is defined further down in this entry:
D select spectrum('alpha');
┌───────────────────────┐
│   spectrum('alpha')   │
│ map(varchar, ubigint) │
├───────────────────────┤
│ {a=2, h=1, l=1, p=1}  │
└───────────────────────┘
Despite the name "map", the order of the keys in a DuckDB MAP is important for determining the equality of MAP objects. Fortunately, the histogram() function sorts the keys; otherwise, we'd have to normalize them.
-- The MAP giving the frequency counts of characters in the given string
create or replace function spectrum(str) as (
select histogram(c)
from (select unnest(regexp_extract_all(str,'.')) as c)
);
-- Find the anagram groups having the most members.
-- Each group is sorted, and the groups are sorted by first word in the group.
with words as (from read_csv('unixdict.txt', header=false) _(word)),
histograms as (select word, spectrum(word) as h from words),
groups as (select h, count(h) as c from histograms group by h order by c desc),
mx as (select max(c) as mx from groups),
maximals as (select h from groups, mx where c = mx.mx),
results as (select (select array_agg(word).list_sort()
from histograms
where histograms.h = maximals.h ) as anagrams
from maximals)
select anagrams
from results
order by anagrams[1] ;
- Output:
┌─────────────────────────────────────┐
│              anagrams               │
│              varchar[]              │
├─────────────────────────────────────┤
│ [abel, able, bale, bela, elba]      │
│ [alger, glare, lager, large, regal] │
│ [angel, angle, galen, glean, lange] │
│ [caret, carte, cater, crate, trace] │
│ [elan, lane, lean, lena, neal]      │
│ [evil, levi, live, veil, vile]      │
└─────────────────────────────────────┘
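The idea described for the DuckDB entry, keying each word by a letter-frequency histogram whose entries are kept in a fixed order so that equal histograms compare equal, can also be sketched in Python. This is only an illustration under stated assumptions: the name `spectrum` is borrowed from the SQL above, while `largest_anagram_groups` and the small inline word list are hypothetical stand-ins rather than part of the DuckDB solution.

```python
from collections import Counter, defaultdict


def spectrum(word):
    """Return the letter counts of word as a tuple of (letter, count) pairs.

    The pairs are sorted so that two words with the same letters always
    produce an identical, hashable key.
    """
    return tuple(sorted(Counter(word).items()))


def largest_anagram_groups(words):
    """Group words by spectrum and return the groups with the most members."""
    groups = defaultdict(list)
    for w in words:
        groups[spectrum(w)].append(w)
    biggest = max(map(len, groups.values()), default=0)
    # Sort each group, then sort the groups by their first word.
    return sorted(sorted(g) for g in groups.values() if len(g) == biggest)


# Hypothetical sample input; a real run would read unixdict.txt instead.
sample = ["abel", "able", "bale", "alpha", "evil", "live", "vile"]
print(spectrum("alpha"))
# (('a', 2), ('h', 1), ('l', 1), ('p', 1))
print(largest_anagram_groups(sample))
# [['abel', 'able', 'bale'], ['evil', 'live', 'vile']]
```

Sorting the histogram entries plays the role that histogram()'s key ordering plays in DuckDB: without it, two equal letter counts built in a different order would not be recognised as the same key.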
E
println("Downloading...")
when (def wordText := <http://wiki.puzzlers.org/pub/wordlists/unixdict.txt> <- getText()) -> {
def words := wordText.split("\n")
def storage := [].asMap().diverge()
def anagramTable extends storage {
to get(key) { return storage.fetch(key, fn { storage[key] := [].diverge() }) }
}
println("Grouping...")
var largestGroupSeen := 0
for word in words {
def anagramGroup := anagramTable[word.sort()]
anagramGroup.push(word)
largestGroupSeen max= anagramGroup.size()
}
println("Selecting...")
for _ => anagramGroup ? (anagramGroup.size() == largestGroupSeen) in anagramTable {
println(anagramGroup.snapshot())
}
}
EchoLisp
For a change, we will use the French dictionary - (lib 'dico.fr) - delivered within EchoLisp.
(require 'struct)
(require 'hash)
(require 'sql)
(require 'words)
(require 'dico.fr.no-accent)
(define mots-français (words-select #:any null 999999))
(string-delimiter "")
(define (string-sort str)
(list->string (list-sort string<? (string->list str))))
(define (ana-sort H words) ;; bump counter for each word
(for ((w words))
#:continue (< (string-length w) 4)
(let [(key (string-sort w))] (hash-set H key (1+ (hash-ref! H key 0))))))
;; input w word
;; output : list of matching words
(define (anagrams w words)
(set! w (string-sort w))
(make-set
(for/list (( ana words))
#:when (string=? w (string-sort ana))
ana)))
(define (task words)
(define H (make-hash))
(ana-sort H words) ;; build counters key= sorted-string, value = count
(hash-get-keys H ;; extract max count values
(for/fold (hmax 0) ((h H) )
#:when (>= (cdr h) hmax)
(cdr h))
))
- Output:
(length mots-français)
→ 209315
(task mots-français)
→ (aeilns acenr) ;; two winners
(anagrams "acenr" mots-français)
→ { ancre caner caren carne ceran cerna encra nacre nerac rance renac }
(anagrams "aeilns" mots-français)
→ { alisen enlias enlisa ensila islaen islean laines lianes salien saline selina }
Eiffel
class
ANAGRAMS
create
make
feature
make
-- Set of Anagrams, containing most words.
local
count: INTEGER
do
read_wordlist
across
words as wo
loop
if wo.item.count > count then
count := wo.item.count
end
end
across
words as wo
loop
if wo.item.count = count then
across
wo.item as list
loop
io.put_string (list.item + "%T")
end
io.new_line
end
end
end
original_list: STRING = "unixdict.txt"
feature {NONE}
read_wordlist
-- Preprocessed wordlist for finding Anagrams.
local
l_file: PLAIN_TEXT_FILE
sorted: STRING
empty_list: LINKED_LIST [STRING]
do
create l_file.make_open_read_write (original_list)
l_file.read_stream (l_file.count)
wordlist := l_file.last_string.split ('%N')
l_file.close
create words.make (wordlist.count)
across
wordlist as w
loop
create empty_list.make
sorted := sort_letters (w.item)
words.put (empty_list, sorted)
if attached words.at (sorted) as ana then
ana.extend (w.item)
end
end
end
wordlist: LIST [STRING]
sort_letters (word: STRING): STRING
--Sorted in alphabetical order.
local
letters: SORTED_TWO_WAY_LIST [STRING]
do
create letters.make
create Result.make_empty
across
1 |..| word.count as i
loop
letters.extend (word.at (i.item).out)
end
across
letters as s
loop
Result.append (s.item)
end
end
words: HASH_TABLE [LINKED_LIST [STRING], STRING]
end
- Output:
abel able bale bela elba
alger glare lager large regal
angel angle galen glean lange
caret carte cater crate trace
elan lane lean lena neal
evil levi live veil vile
Ela
open monad io list string
groupon f x y = f x == f y
lines = split "\n" << replace "\n\n" "\n" << replace "\r" "\n"
main = do
fh <- readFile "c:\\test\\unixdict.txt" OpenMode
f <- readLines fh
closeFile fh
let words = lines f
let wix = groupBy (groupon fst) << sort $ zip (map sort words) words
let mxl = maximum $ map length wix
mapM_ (putLn << map snd) << filter ((==mxl) << length) $ wix
- Output:
["vile","veil","live","levi","evil"]
["neal","lena","lean","lane","elan"]
["regal","large","lager","glare","alger"]
["lange","glean","galen","angle","angel"]
["trace","crate","cater","carte","caret"]
["elba","bela","bale","able","abel"]
Elena
ELENA 6.x:
import system'routines;
import system'calendar;
import system'io;
import system'collections;
import extensions;
import extensions'routines;
import extensions'text;
import algorithms;
extension op
{
string normalized()
= self.toArray().ascendant().summarize(new StringWriter());
}
public program()
{
var start := now;
auto dictionary := new Map<string,object>();
File.assign("unixdict.txt").forEachLine::(word)
{
var key := word.normalized();
var item := dictionary[key];
if (nil == item)
{
item := new ArrayList();
dictionary[key] := item
};
item.append(word)
};
dictionary.Values
.quickSort::(former,later => former.Item2.Length > later.Item2.Length )
.top(20)
.forEach::(pair){ console.printLine(pair.Item2) };
var end := now;
var diff := end - start;
console.printLine("Time elapsed in msec:",diff.Milliseconds);
console.readChar()
}
- Output:
abel,able,bale,bela,elba
alger,glare,lager,large,regal
evil,levi,live,veil,vile
elan,lane,lean,lena,neal
caret,carte,cater,crate,trace
angel,angle,galen,glean,lange
are,ear,era,rae
dare,dear,erda,read
diet,edit,tide,tied
cereus,recuse,rescue,secure
ames,mesa,same,seam
emit,item,mite,time
amen,mane,mean,name
enol,leon,lone,noel
esprit,priest,sprite,stripe
beard,bread,debar,debra
hare,hear,hera,rhea
apt,pat,pta,tap
aires,aries,arise,raise
keats,skate,stake,steak
Elixir
defmodule Anagrams do
def find(file) do
File.read!(file)
|> String.split
|> Enum.group_by(fn word -> String.codepoints(word) |> Enum.sort end)
|> Enum.group_by(fn {_,v} -> length(v) end)
|> Enum.max
|> print
end
defp print({_,y}) do
Enum.each(y, fn {_,e} -> Enum.sort(e) |> Enum.join(" ") |> IO.puts end)
end
end
Anagrams.find("unixdict.txt")
- Output:
caret carte cater crate trace
evil levi live veil vile
alger glare lager large regal
elan lane lean lena neal
angel angle galen glean lange
abel able bale bela elba
The same output, using File.stream! to generate tuples containing the word and its sorted value as strings.
File.stream!("unixdict.txt")
|> Stream.map(&String.trim &1)
|> Enum.group_by(&String.codepoints(&1) |> Enum.sort)
|> Map.values
|> Enum.group_by(&length &1)
|> Enum.max
|> elem(1)
|> Enum.each(fn n -> Enum.sort(n) |> Enum.join(" ") |> IO.puts end)
- Output:
caret carte cater crate trace
evil levi live veil vile
alger glare lager large regal
elan lane lean lena neal
angel angle galen glean lange
abel able bale bela elba
Emacs Lisp
(defun code-letters (str)
"Sort STR into alphabetized list of individual letters."
(sort (split-string str "" t) #'string<))
(defun code-letters-to-string (str)
"Sort STR alphabetically and combine into one string."
(apply #'concat (code-letters str)))
(defun remove-periods (str)
"Remove periods (full stops) from STR."
(string-replace "." "" str))
(defun list-pair (str)
"Create paired list from STR, STR (unchanged) and alphabetized order of STR."
;; Remove periods from alphabetized order to make regex matching easier
(let ((letter-list (remove-periods (code-letters-to-string str))))
(list letter-list str)))
(defun pair-up (words)
"Make list of lists of paired words, one alphabetized one original."
(let ((paired-list)
(temp-pair))
(dolist (word words)
(setq temp-pair (list-pair word))
(push temp-pair paired-list))
paired-list))
(defun create-list-of-numbers (my-list)
"Create list of numbers from MY-LIST."
(let ((list-of-numbers))
(dolist (one-pair my-list)
(push (car one-pair) list-of-numbers))
list-of-numbers))
(defun get-largest-number (my-list)
"Find largest number in MY-LIST."
(let ((list-of-numbers))
(setq list-of-numbers (create-list-of-numbers my-list))
(apply #'max list-of-numbers)))
(defun make-list-matching-words (coded-word-and-original number-and-code-pair)
"List original words whose code matches code in NUMBER-AND-CODE-PAIR."
(dolist (word-pair coded-word-and-original)
;; test if coded word in CODED-WORD-AND-ORIGINAL matches
;; coded word in NUMBER-AND-CODE-PAIR
(when (string= (nth 0 word-pair) (nth 1 number-and-code-pair))
;; insert the original word
(insert (format "%s " (nth 1 word-pair)))))
(insert "\n"))
(defun count-anagrams ()
"Count the number of anagrams in file wordlist.txt"
(let ((coded-word-and-original)
(just-coded-words)
(unique-coded-words)
(count-and-code)
(number-of-anagrams)
(largest-number))
;; Path below needs to be adapted to individual case
(find-file "~/Documents/Elisp/wordlist.txt")
(beginning-of-buffer)
;; create list of lists of coded words and originals
(setq coded-word-and-original (pair-up (split-string (buffer-string) "\n")))
(find-file "temp-all-coded")
(erase-buffer)
(dolist (number-and-code-pair coded-word-and-original)
;; make list of just the coded words
(push (nth 0 number-and-code-pair) just-coded-words))
(dolist (one-word just-coded-words)
;; write list of coded words to buffer for later processing
(insert (format "%s\n" one-word)))
;; create a list of coded words with no repetitions
(setq unique-coded-words (seq-uniq just-coded-words))
(dolist (one-code unique-coded-words)
(find-file "temp-all-coded")
(beginning-of-buffer)
;; count the number of times ONE-CODE appears in buffer
(setq number-of-anagrams (how-many (format "^%s$" one-code)))
(if (>= number-of-anagrams 1) ; eliminate "words" of zero length
(push (list number-of-anagrams one-code) count-and-code)))
(find-file "anagram-listing")
(erase-buffer)
(setq largest-number (get-largest-number count-and-code))
(dolist (number-and-code-pair count-and-code)
;; when the number in NUMBER-AND-CODE-PAIR = largest number of anagrams
(when (= (nth 0 number-and-code-pair) largest-number)
(make-list-matching-words coded-word-and-original number-and-code-pair)))))
An alternate version, shown below, uses a hash table and runs much faster:
(defun code-letters (str)
"Sort STR into alphabetized list of individual letters."
(sort (split-string str "" t) #'string<))
(defun code-letters-to-string (str)
"Sort STR alphabetically and combine into one string."
(apply #'concat (code-letters str)))
(defun add-to-hash (key value table)
"If KEY exists, add VALUE to list of values.
If KEY does not exist, associate value with KEY."
(let ((current-values))
(if (gethash key table)
(progn
(setq current-values (gethash key table))
(setq current-values (push value current-values))
(puthash key current-values table))
(puthash key (list value) table))))
(defun create-list-of-numbers (hash-table)
"Create a list of numbers from HASH-TABLE."
(let ((current-number)
(list-of-numbers))
(setq list-of-numbers (list)) ; omit?
(maphash (lambda (key value)
(setq current-number (car (gethash key hash-table)))
(push current-number list-of-numbers))
hash-table)
list-of-numbers))
(defun find-largest-number-in-hash (hash-table)
"Find largest number in HASH-TABLE."
(let ((list-of-numbers))
(setq list-of-numbers (create-list-of-numbers hash-table))
(apply #'max list-of-numbers)))
(defun find-longest-lists-of-anagrams (&optional file)
"Find the set(s) of largest number of anagrams in file wordlist.txt"
(let ((largest-number)
(hash-key)
(dictionary-table (make-hash-table :test 'equal)))
;; Path and filename below needs to be adapted to individual case if
;; FILE is *not* passed to this function
(with-temp-buffer
(insert-file-contents (or file "~/Documents/Elisp/wordlist.txt"))
(beginning-of-buffer)
;; set up hash table with key and word(s)
;; key = letters of word, but with letters in
;; alphabetical order. Create list word(s) associated
;; with key.
(dolist (current-word (split-string (buffer-string) "\n"))
(setq hash-key (code-letters-to-string current-word))
(add-to-hash hash-key current-word dictionary-table))
;; Count number of anagram words
(maphash (lambda (key value)
"Add number of anagram words to VALUE."
(add-to-hash key (length (gethash key dictionary-table)) dictionary-table))
dictionary-table)
;; find the size of the largest list(s) of anagrams
(setq largest-number (find-largest-number-in-hash dictionary-table)))
;; set up empty buffer to show results
(with-current-buffer (pop-to-buffer "anagram-listing")
(erase-buffer)
;; show results
(maphash (lambda (key value)
"Display longest lists of anagrams."
(when (= largest-number (car (gethash key dictionary-table)))
(mapc
(lambda (element)
"Insert ELEMENT followed by one space in buffer."
(insert (format "%s " element)))
(cdr (gethash key dictionary-table)))
(insert "\n")))
dictionary-table))))
- Output:
vile veil live levi evil
neal lena lean lane elan
trace crate cater carte caret
lange glean galen angle angel
regal large lager glare alger
elba bela bale able abel
Erlang
The function fetch/2 is also used to solve Anagrams/Deranged_anagrams. Please keep it backwards compatible when editing, or update the other module too.
-module(anagrams).
-compile(export_all).
play() ->
{ok, P} = file:read_file('unixdict.txt'),
D = dict:new(),
E=fetch(string:tokens(binary_to_list(P), "\n"), D),
get_value(dict:fetch_keys(E), E).
fetch([H|T], D) ->
fetch(T, dict:append(lists:sort(H), H, D));
fetch([], D) ->
D.
get_value(L, D) -> get_value(L,D,1,[]).
get_value([H|T], D, N, L) ->
Var = dict:fetch(H,D),
Len = length(Var),
if
Len > N ->
get_value(T, D, Len, [Var]);
Len == N ->
get_value(T, D, Len, [Var | L]);
Len < N ->
get_value(T, D, N, L)
end;
get_value([], _, _, L) ->
L.
- Output:
1> anagrams:play().
[["caret","carte","cater","crate","trace"],
 ["elan","lane","lean","lena","neal"],
 ["alger","glare","lager","large","regal"],
 ["angel","angle","galen","glean","lange"],
 ["evil","levi","live","veil","vile"],
 ["abel","able","bale","bela","elba"]]
2>
Euphoria
include sort.e
function compare_keys(sequence a, sequence b)
return compare(a[1],b[1])
end function
constant fn = open("unixdict.txt","r")
sequence words, anagrams
object word
words = {}
while 1 do
word = gets(fn)
if atom(word) then
exit
end if
word = word[1..$-1] -- truncate new-line character
words = append(words, {sort(word), word})
end while
close(fn)
integer maxlen
maxlen = 0
words = custom_sort(routine_id("compare_keys"), words)
anagrams = {words[1]}
for i = 2 to length(words) do
if equal(anagrams[$][1],words[i][1]) then
anagrams[$] = append(anagrams[$], words[i][2])
elsif length(anagrams[$]) = 2 then
anagrams[$] = words[i]
else
if length(anagrams[$]) > maxlen then
maxlen = length(anagrams[$])
end if
anagrams = append(anagrams, words[i])
end if
end for
if length(anagrams[$]) = 2 then
anagrams = anagrams[1..$-1]
end if
for i = 1 to length(anagrams) do
if length(anagrams[i]) = maxlen then
for j = 2 to length(anagrams[i]) do
puts(1,anagrams[i][j])
puts(1,' ')
end for
puts(1,"\n")
end if
end for
- Output:
abel bela bale elba able
crate cater carte caret trace
angle galen glean lange angel
regal lager large alger glare
elan lean neal lane lena
live veil vile levi evil
F#
Read the lines in the dictionary, group by the sorted letters in each word, find the length of the longest sets of anagrams, extract the longest sequences of words sharing the same letters (i.e. anagrams):
let xss = Seq.groupBy (Array.ofSeq >> Array.sort) (System.IO.File.ReadAllLines "unixdict.txt")
Seq.map snd xss |> Seq.filter (Seq.length >> ( = ) (Seq.map (snd >> Seq.length) xss |> Seq.max))
Note that it is necessary to convert the sorted letters in each word from sequences to arrays because the groupBy function uses the default comparison and sequences do not compare structurally (but arrays do in F#).
Takes 0.8s to return:
val it : string seq seq =
seq
[seq ["abel"; "able"; "bale"; "bela"; "elba"];
seq ["alger"; "glare"; "lager"; "large"; "regal"];
seq ["angel"; "angle"; "galen"; "glean"; "lange"];
seq ["caret"; "carte"; "cater"; "crate"; "trace"];
seq ["elan"; "lane"; "lean"; "lena"; "neal"];
seq ["evil"; "levi"; "live"; "veil"; "vile"]]
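The steps described for the F# entry (group the words by their sorted letters, find the size of the largest group, keep only the groups of that size) can be sketched in Python for comparison. This is a hedged illustration, not a translation of the F# code: the function name `longest_anagram_sets` and the sample words are assumptions made for the example. Python has a concern analogous to the one noted above about key comparison, since a list of letters cannot serve as a grouping key directly, so the sorted letters are joined back into a string.

```python
from itertools import groupby


def sorted_letters(word):
    """Key for grouping: the word's letters in sorted order, as a string."""
    return "".join(sorted(word))


def longest_anagram_sets(words):
    """Return the anagram groups that contain the most words."""
    # groupby only merges adjacent items, so sort by the key first.
    ordered = sorted(words, key=sorted_letters)
    groups = [list(g) for _, g in groupby(ordered, key=sorted_letters)]
    longest = max((len(g) for g in groups), default=0)
    return [g for g in groups if len(g) == longest]


# Hypothetical sample; a real run would pass the lines of unixdict.txt.
words = ["evil", "angel", "levi", "angle", "live", "glean", "foo"]
for group in longest_anagram_sets(words):
    print(" ".join(group))
# angel angle glean
# evil levi live
```

Because Python's sort is stable, the words inside each group keep their order from the input list.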
Fantom
class Main
{
// take given word and return a string rearranging characters in order
static Str toOrderedChars (Str word)
{
Str[] chars := [,]
word.each |Int c| { chars.add (c.toChar) }
return chars.sort.join("")
}
// add given word to anagrams map
static Void addWord (Str:Str[] anagrams, Str word)
{
Str orderedWord := toOrderedChars (word)
if (anagrams.containsKey (orderedWord))
anagrams[orderedWord].add (word)
else
anagrams[orderedWord] = [word]
}
public static Void main ()
{
Str:Str[] anagrams := [:] // map Str -> Str[]
// loop through input file, adding each word to map of anagrams
File (`unixdict.txt`).eachLine |Str word|
{
addWord (anagrams, word)
}
// loop through anagrams, keeping the keys with values of largest size
Str[] largestKeys := [,]
anagrams.keys.each |Str k|
{
if ((largestKeys.size < 1) || (anagrams[k].size == anagrams[largestKeys[0]].size))
largestKeys.add (k)
else if (anagrams[k].size > anagrams[largestKeys[0]].size)
largestKeys = [k]
}
largestKeys.each |Str k|
{
echo ("Key: $k -> " + anagrams[k].join(", "))
}
}
}
- Output:
Key: abel -> abel, able, bale, bela, elba
Key: aeln -> elan, lane, lean, lena, neal
Key: eilv -> evil, levi, live, veil, vile
Key: aegln -> angel, angle, galen, glean, lange
Key: aeglr -> alger, glare, lager, large, regal
Key: acert -> caret, carte, cater, crate, trace
Fortran
This program:
!***************************************************************************************
module anagram_routines
!***************************************************************************************
implicit none
!the dictionary file:
integer,parameter :: file_unit = 1000
character(len=*),parameter :: filename = 'unixdict.txt'
!maximum number of characters in a word:
integer,parameter :: max_chars = 50
!maximum number of characters in the string displaying the anagram lists:
integer,parameter :: str_len = 256
type word
character(len=max_chars) :: str = repeat(' ',max_chars) !the word from the dictionary
integer :: n = 0 !length of this word
integer :: n_anagrams = 0 !number of anagrams found
logical :: checked = .false. !if this one has already been checked
character(len=str_len) :: anagrams = repeat(' ',str_len) !the anagram list for this word
end type word
!the dictionary structure:
type(word),dimension(:),allocatable,target :: dict
contains
!***************************************************************************************
!******************************************************************************
function count_lines_in_file(fid) result(n_lines)
!******************************************************************************
implicit none
integer :: n_lines
integer,intent(in) :: fid
character(len=1) :: tmp
integer :: i
integer :: ios
!the file is assumed to be open already.
rewind(fid) !rewind to beginning of the file
n_lines = 0
do !read each line until the end of the file.
read(fid,'(A1)',iostat=ios) tmp
if (ios < 0) exit !End of file
n_lines = n_lines + 1 !row counter
end do
rewind(fid) !rewind to beginning of the file
!******************************************************************************
end function count_lines_in_file
!******************************************************************************
!******************************************************************************
pure elemental function is_anagram(x,y)
!******************************************************************************
implicit none
character(len=*),intent(in) :: x
character(len=*),intent(in) :: y
logical :: is_anagram
character(len=len(x)) :: x_tmp !a copy of x
integer :: i,j
!a character not found in any word:
character(len=1),parameter :: null = achar(0)
!x and y are assumed to be the same size.
x_tmp = x
do i=1,len_trim(x)
j = index(x_tmp, y(i:i)) !look for this character in x_tmp
if (j/=0) then
x_tmp(j:j) = null !clear it so it won't be checked again
else
is_anagram = .false. !character not found: x,y are not anagrams
return
end if
end do
!if we got to this point, all the characters
! were the same, so x,y are anagrams:
is_anagram = .true.
!******************************************************************************
end function is_anagram
!******************************************************************************
!***************************************************************************************
end module anagram_routines
!***************************************************************************************
!***************************************************************************************
program main
!***************************************************************************************
use anagram_routines
implicit none
integer :: n,i,j,n_max
type(word),pointer :: x,y
logical :: first_word
real :: start, finish
call cpu_time(start) !..start timer
!open the dictionary and read in all the words:
open(unit=file_unit,file=filename) !open the file
n = count_lines_in_file(file_unit) !count lines in the file
allocate(dict(n)) !allocate dictionary structure
do i=1,n !
read(file_unit,'(A)') dict(i)%str !each line is a word in the dictionary
dict(i)%n = len_trim(dict(i)%str) !saving length here to avoid trim's below
end do
close(file_unit) !close the file
!search dictionary for anagrams:
do i=1,n
x => dict(i) !pointer to simplify code
first_word = .true. !initialize
do j=i,n
y => dict(j) !pointer to simplify code
!checks to avoid checking words unnecessarily:
if (x%checked .or. y%checked) cycle !both must not have been checked already
if (x%n/=y%n) cycle !must be the same size
if (x%str(1:x%n)==y%str(1:y%n)) cycle !can't be the same word
! check to see if x,y are anagrams:
if (is_anagram(x%str(1:x%n), y%str(1:y%n))) then
!they are anagrams.
y%checked = .true. !don't check this one again.
x%n_anagrams = x%n_anagrams + 1
if (first_word) then
!this is the first anagram found for this word.
first_word = .false.
x%n_anagrams = x%n_anagrams + 1
x%anagrams = trim(x%anagrams)//x%str(1:x%n) !add first word to list
end if
x%anagrams = trim(x%anagrams)//','//y%str(1:y%n) !add next word to list
end if
end do
x%checked = .true. !don't check this one again
end do
!anagram groups with the most words:
write(*,*) ''
n_max = maxval(dict%n_anagrams)
do i=1,n
if (dict(i)%n_anagrams==n_max) write(*,'(A)') trim(dict(i)%anagrams)
end do
!anagram group containing longest words:
write(*,*) ''
n_max = maxval(dict%n, mask=dict%n_anagrams>0)
do i=1,n
if (dict(i)%n_anagrams>0 .and. dict(i)%n==n_max) write(*,'(A)') trim(dict(i)%anagrams)
end do
write(*,*) ''
call cpu_time(finish) !...stop timer
write(*,'(A,F6.3,A)') '[Runtime = ',finish-start,' sec]'
write(*,*) ''
!***************************************************************************************
end program main
!***************************************************************************************
- Output:
abel,able,bale,bela,elba
alger,glare,lager,large,regal
angel,angle,galen,glean,lange
caret,carte,cater,crate,trace
elan,lane,lean,lena,neal
evil,levi,live,veil,vile
conservation,conversation
[Runtime = 6.897 sec]
FBSL
A little bit of cheating: a literal re-implementation of the C solution in FBSL's Dynamic C layer.
#APPTYPE CONSOLE
DIM gtc = GetTickCount()
Anagram()
PRINT "Done in ", (GetTickCount() - gtc) / 1000, " seconds"
PAUSE
DYNC Anagram()
#include <windows.h>
#include <stdio.h>
char* sortedWord(const char* word, char* wbuf)
{
char* p1, *p2, *endwrd;
char t;
int swaps;
strcpy(wbuf, word);
endwrd = wbuf + strlen(wbuf);
do {
swaps = 0;
p1 = wbuf; p2 = endwrd - 1;
while (p1 < p2) {
if (*p2 > *p1) {
t = *p2; *p2 = *p1; *p1 = t;
swaps = 1;
}
p1++; p2--;
}
p1 = wbuf; p2 = p1 + 1;
while (p2 < endwrd) {
if (*p2 > *p1) {
t = *p2; *p2 = *p1; *p1 = t;
swaps = 1;
}
p1++; p2++;
}
} while (swaps);
return wbuf;
}
static short cxmap[] = {
0x06, 0x1f, 0x4d, 0x0c, 0x5c, 0x28, 0x5d, 0x0e, 0x09, 0x33, 0x31, 0x56,
0x52, 0x19, 0x29, 0x53, 0x32, 0x48, 0x35, 0x55, 0x5e, 0x14, 0x27, 0x24,
0x02, 0x3e, 0x18, 0x4a, 0x3f, 0x4c, 0x45, 0x30, 0x08, 0x2c, 0x1a, 0x03,
0x0b, 0x0d, 0x4f, 0x07, 0x20, 0x1d, 0x51, 0x3b, 0x11, 0x58, 0x00, 0x49,
0x15, 0x2d, 0x41, 0x17, 0x5f, 0x39, 0x16, 0x42, 0x37, 0x22, 0x1c, 0x0f,
0x43, 0x5b, 0x46, 0x4b, 0x0a, 0x26, 0x2e, 0x40, 0x12, 0x21, 0x3c, 0x36,
0x38, 0x1e, 0x01, 0x1b, 0x05, 0x4e, 0x44, 0x3d, 0x04, 0x10, 0x5a, 0x2a,
0x23, 0x34, 0x25, 0x2f, 0x2b, 0x50, 0x3a, 0x54, 0x47, 0x59, 0x13, 0x57,
};
#define CXMAP_SIZE (sizeof(cxmap) / sizeof(short))
int Str_Hash(const char* key, int ix_max)
{
const char* cp;
short mash;
int hash = 33501551;
for (cp = key; *cp; cp++) {
mash = cxmap[*cp % CXMAP_SIZE];
hash = (hash >> 4) ^ 0x5C5CF5C ^ ((hash << 1) + (mash << 5));
hash &= 0x3FFFFFFF;
}
return hash % ix_max;
}
typedef struct sDictWord* DictWord;
struct sDictWord {
const char* word;
DictWord next;
};
typedef struct sHashEntry* HashEntry;
struct sHashEntry {
const char* key;
HashEntry next;
DictWord words;
HashEntry link;
short wordCount;
};
#define HT_SIZE 8192
HashEntry hashTable[HT_SIZE];
HashEntry mostPerms = NULL;
int buildAnagrams(FILE* fin)
{
char buffer[40];
char bufr2[40];
char* hkey;
int hix;
HashEntry he, *hep;
DictWord we;
int maxPC = 2;
int numWords = 0;
while (fgets(buffer, 40, fin)) {
for (hkey = buffer; *hkey && (*hkey != '\n'); hkey++);
*hkey = 0;
hkey = sortedWord(buffer, bufr2);
hix = Str_Hash(hkey, HT_SIZE);
he = hashTable[hix]; hep = &hashTable[hix];
while (he && strcmp(he->key, hkey)) {
hep = &he->next;
he = he->next;
}
if (! he) {
he = (HashEntry)malloc(sizeof(struct sHashEntry));
he->next = NULL;
he->key = strdup(hkey);
he->wordCount = 0;
he->words = NULL;
he->link = NULL;
*hep = he;
}
we = (DictWord)malloc(sizeof(struct sDictWord));
we->word = strdup(buffer);
we->next = he->words;
he->words = we;
he->wordCount++;
if (maxPC < he->wordCount) {
maxPC = he->wordCount;
mostPerms = he;
he->link = NULL;
}
else if (maxPC == he->wordCount) {
he->link = mostPerms;
mostPerms = he;
}
numWords++;
}
printf("%d words in dictionary max ana=%d\n", numWords, maxPC);
return maxPC;
}
void main()
{
HashEntry he;
DictWord we;
FILE* f1;
f1 = fopen("unixdict.txt", "r");
buildAnagrams(f1);
fclose(f1);
f1 = fopen("anaout.txt", "w");
for (he = mostPerms; he; he = he->link) {
fprintf(f1, "%d: ", he->wordCount);
for (we = he->words; we; we = we->next) {
fprintf(f1, "%s, ", we->word);
}
fprintf(f1, "\n");
}
fclose(f1);
}
END DYNC
- Output:
(2.2GHz Intel Core2 Duo)
25104 words in dictionary max ana=5
Done in 0.031 seconds
Press any key to continue...
"anaout.txt" listing:
5: vile, veil, live, levi, evil,
5: trace, crate, cater, carte, caret,
5: regal, large, lager, glare, alger,
5: neal, lena, lean, lane, elan,
5: lange, glean, galen, angle, angel,
5: elba, bela, bale, able, abel,
Factor
"resource:unixdict.txt" utf8 file-lines
[ [ natural-sort >string ] keep ] { } map>assoc sort-keys
[ [ first ] compare +eq+ = ] monotonic-split
dup 0 [ length max ] reduce '[ length _ = ] filter [ values ] map .
{
{ "abel" "able" "bale" "bela" "elba" }
{ "caret" "carte" "cater" "crate" "trace" }
{ "angel" "angle" "galen" "glean" "lange" }
{ "alger" "glare" "lager" "large" "regal" }
{ "elan" "lane" "lean" "lena" "neal" }
{ "evil" "levi" "live" "veil" "vile" }
}
FreeBASIC
' FB 1.05.0 Win64
Type IndexedWord
As String word
As Integer index
End Type
' selection sort, quick enough for sorting small number of letters
Sub sortWord(s As String)
Dim As Integer i, j, m, n = Len(s)
For i = 0 To n - 2
m = i
For j = i + 1 To n - 1
If s[j] < s[m] Then m = j
Next j
If m <> i Then Swap s[i], s[m]
Next i
End Sub
' selection sort, quick enough for sorting small array of IndexedWord instances by index
Sub sortIndexedWord(iw() As IndexedWord)
Dim As Integer i, j, m, n = UBound(iw)
For i = 1 To n - 1
m = i
For j = i + 1 To n
If iw(j).index < iw(m).index Then m = j
Next j
If m <> i Then Swap iw(i), iw(m)
Next i
End Sub
' quicksort for sorting whole dictionary of IndexedWord instances by sorted word
Sub quicksort(a() As IndexedWord, first As Integer, last As Integer)
Dim As Integer length = last - first + 1
If length < 2 Then Return
Dim pivot As String = a(first + length\ 2).word
Dim lft As Integer = first
Dim rgt As Integer = last
While lft <= rgt
While a(lft).word < pivot
lft +=1
Wend
While a(rgt).word > pivot
rgt -= 1
Wend
If lft <= rgt Then
Swap a(lft), a(rgt)
lft += 1
rgt -= 1
End If
Wend
quicksort(a(), first, rgt)
quicksort(a(), lft, last)
End Sub
Dim t As Double = timer
Dim As String w() '' array to hold actual words
Open "unixdict.txt" For Input As #1
Dim count As Integer = 0
While Not Eof(1)
count +=1
Redim Preserve w(1 To count)
Line Input #1, w(count)
Wend
Close #1
Dim As IndexedWord iw(1 To count) '' array to hold sorted words and their index into w()
Dim word As String
For i As Integer = 1 To count
word = w(i)
sortWord(word)
iw(i).word = word
iw(i).index = i
Next
quickSort iw(), 1, count '' sort the IndexedWord array by sorted word
Dim As Integer startIndex = 1, length = 1, maxLength = 1, ub = 1
Dim As Integer maxIndex(1 To ub)
maxIndex(ub) = 1
word = iw(1).word
For i As Integer = 2 To count
If word = iw(i).word Then
length += 1
Else
If length > maxLength Then
maxLength = length
Erase maxIndex
ub = 1
Redim maxIndex(1 To ub)
maxIndex(ub) = startIndex
ElseIf length = maxLength Then
ub += 1
Redim Preserve maxIndex(1 To ub)
maxIndex(ub) = startIndex
End If
startIndex = i
length = 1
word = iw(i).word
End If
Next
If length > maxLength Then
maxLength = length
Erase maxIndex
Redim maxIndex(1 To 1)
maxIndex(1) = startIndex
ElseIf length = maxLength Then
ub += 1
Redim Preserve maxIndex(1 To ub)
maxIndex(ub) = startIndex
End If
Print Str(count); " words in the dictionary"
Print "The anagram set(s) with the greatest number of words (namely"; maxLength; ") is:"
Print
Dim iws(1 To maxLength) As IndexedWord '' array to hold each anagram set
For i As Integer = 1 To UBound(maxIndex)
For j As Integer = maxIndex(i) To maxIndex(i) + maxLength - 1
iws(j - maxIndex(i) + 1) = iw(j)
Next j
sortIndexedWord iws() '' sort anagram set before displaying it
For j As Integer = 1 To maxLength
Print w(iws(j).index); " ";
Next j
Print
Next i
Print
Print "Took ";
Print Using "#.###"; timer - t;
Print " seconds on i3 @ 2.13 GHz"
Print
Print "Press any key to quit"
Sleep
- Output:
25104 words in the dictionary
The anagram set(s) with the greatest number of words (namely 5) is:
abel able bale bela elba
caret carte cater crate trace
angel angle galen glean lange
alger glare lager large regal
elan lane lean lena neal
evil levi live veil vile
Took 0.103 seconds on i3 @ 2.13 GHz
Frink
d = new dict
for w = lines["http://wiki.puzzlers.org/pub/wordlists/unixdict.txt"]
{
sorted = sort[charList[w]]
d.addToList[sorted, w]
}
most = sort[toArray[d], {|a,b| length[b@1] <=> length[a@1]}]
longest = length[most@0@1]
i = 0
while length[most@i@1] == longest
{
println[most@i@1]
i = i + 1
}
FutureBasic
Applications in the latest versions of Macintosh OS X 10.x are sandboxed and require setting special permissions to link to internet files. For illustration purposes here, this code uses the internal Unix dictionary file available in all versions of OS X.
include "NSLog.incl"
local fn Dictionary as CFArrayRef
CFURLRef url = fn URLFileURLWithPath( @"/usr/share/dict/words" )
CFStringRef string = fn StringWithContentsOfURL( url, NSUTF8StringEncoding, NULL )
end fn = fn StringComponentsSeparatedByString( string, @"\n" )
local fn IsAnagram( wrd1 as CFStringRef, wrd2 as CFStringRef ) as BOOL
NSUInteger i
BOOL result = NO
if ( len(wrd1) != len(wrd2) ) then exit fn
if ( fn StringCompare( wrd1, wrd2 ) == NSOrderedSame ) then exit fn
CFMutableArrayRef mutArr1 = fn MutableArrayWithCapacity(0) : CFMutableArrayRef mutArr2 = fn MutableArrayWithCapacity(0)
for i = 0 to len(wrd1) - 1
MutableArrayAddObject( mutArr1, fn StringWithFormat( @"%C", fn StringCharacterAtIndex( wrd1, i ) ) )
MutableArrayAddObject( mutArr2, fn StringWithFormat( @"%C", fn StringCharacterAtIndex( wrd2, i ) ) )
next
SortDescriptorRef sd = fn SortDescriptorWithKeyAndSelector( NULL, YES, @"caseInsensitiveCompare:" )
if ( fn ArrayIsEqual( fn ArraySortedArrayUsingDescriptors( mutArr1, @[sd] ), fn ArraySortedArrayUsingDescriptors( mutArr2, @[sd] ) ) ) then result = YES
end fn = result
void local fn FindAnagramsInDictionary( wd as CFStringRef, dict as CFArrayRef )
CFStringRef string, temp
CFMutableArrayRef words = fn MutableArrayWithCapacity(0)
for temp in dict
if ( fn IsAnagram( lcase( wd ), temp ) ) then MutableArrayAddObject( words, temp )
next
string = fn ArrayComponentsJoinedByString( words, @", " )
NSLogSetTextColor( fn ColorText ) : NSLog( @"Anagrams for %@:", lcase(wd) )
NSLogSetTextColor( fn ColorSystemBlue ) : NSLog(@"%@\n",string)
end fn
void local fn DoIt
CFArrayRef dictionary = fn Dictionary
dispatchglobal
CFStringRef string
CFArrayRef words = @[@"bade",@"abet",@"beast",@"tuba",@"mace",@"scare",@"marine",@"antler",@"spare",@"leading",@"alerted",@"allergy",@"research",@"hustle",@"oriental",@"creationism",@"resistance",@"mountaineer"]
for string in words
fn FindAnagramsInDictionary( string, dictionary )
next
dispatchend
end fn
fn DoIt
HandleEvents
- Output:
Anagrams for BADE: abed bade bead
Anagrams for ABET: abet bate beat beta
Anagrams for BEAST: baste beast tabes
Anagrams for TUBA: abut tabu tuba
Anagrams for MACE: acme came mace
Anagrams for SCARE: carse caser ceras scare scrae
Anagrams for MARINE: marine remain
Anagrams for ANTLER: altern antler learnt rental ternal
Anagrams for SPARE: asper parse prase spaer spare spear
Anagrams for LEADING: adeling dealing leading
Anagrams for ALERTED: delater related treadle
Anagrams for ALLERGY: allergy gallery largely regally
Anagrams for RESEARCH: rechaser research searcher
Anagrams for HUSTLE: hustle sleuth
Anagrams for ORIENTAL: oriental relation
Anagrams for CREATIONISM: anisometric creationism miscreation ramisection reactionism
Anagrams for RESISTANCE: resistance senatrices
Anagrams for MOUNTAINEER: enumeration mountaineer
This version fulfils the task description.
include "NSLog.incl"
#plist NSAppTransportSecurity @{NSAllowsArbitraryLoads:YES}
local fn Dictionary as CFArrayRef
CFURLRef url = fn URLWithString( @"http://wiki.puzzlers.org/pub/wordlists/unixdict.txt" )
CFStringRef string = fn StringWithContentsOfURL( url, NSUTF8StringEncoding, NULL )
end fn = fn StringComponentsSeparatedByCharactersInSet( string, fn CharacterSetNewlineSet )
local fn TestIndexes( array as CFArrayRef, obj as CFTypeRef, index as NSUInteger, stp as ^BOOL, userData as ptr ) as BOOL
end fn = fn StringIsEqual( obj, userData )
void local fn IndexSetEnumerator( set as IndexSetRef, index as NSUInteger, stp as ^BOOL, userData as ptr )
NSLog(@"\t%@\b",fn ArrayObjectAtIndex( userData, index ))
end fn
void local fn DoIt
CFArrayRef words
CFMutableArrayRef sortedWords, letters
CFStringRef string, sortedString
IndexSetRef indexes
long i, j, count, indexCount, maxCount = 0, length
CFMutableDictionaryRef anagrams
CFTimeInterval ti
ti = fn CACurrentMediaTime
NSLog(@"Searching...")
// create another word list with sorted letters
words = fn Dictionary
count = len(words)
sortedWords = fn MutableArrayWithCapacity(count)
for string in words
length = len(string)
letters = fn MutableArrayWithCapacity(length)
for i = 0 to length - 1
MutableArrayAddObject( letters, mid(string,i,1) )
next
MutableArraySortUsingSelector( letters, @"compare:" )
sortedString = fn ArrayComponentsJoinedByString( letters, @"" )
MutableArrayAddObject( sortedWords, sortedString )
next
// search for identical sorted words
anagrams = fn MutableDictionaryWithCapacity(0)
for i = 0 to count - 2
j = i + 1
indexes = fn ArrayIndexesOfObjectsAtIndexesPassingTest( sortedWords, fn IndexSetWithIndexesInRange( fn CFRangeMake(j,count-j) ), NSEnumerationConcurrent, @fn TestIndexes, (ptr)sortedWords[i] )
indexCount = len(indexes)
if ( indexCount > maxCount )
maxCount = indexCount
MutableDictionaryRemoveAllObjects( anagrams )
end if
if ( indexCount == maxCount )
MutableDictionarySetValueForKey( anagrams, indexes, words[i] )
end if
next
// show results
NSLogClear
for string in anagrams
NSLog(@"%@\b",string)
indexes = anagrams[string]
IndexSetEnumerateIndexes( indexes, @fn IndexSetEnumerator, (ptr)words )
NSLog(@"")
next
NSLog(@"\nCalculated in %0.6fs",fn CACurrentMediaTime - ti)
end fn
dispatchglobal
fn DoIt
dispatchend
HandleEvents
- Output:
alger glare lager large regal
caret carte cater crate trace
elan lane lean lena neal
abel able bale bela elba
evil levi live veil vile
angel angle galen glean lange
Calculated in 2.409008s
GAP
Anagrams := function(name)
local f, p, L, line, word, words, swords, res, cur, r;
words := [ ];
swords := [ ];
f := InputTextFile(name);
while true do
line := ReadLine(f);
if line = fail then
break;
else
word := Chomp(line);
Add(words, word);
Add(swords, SortedList(word));
fi;
od;
CloseStream(f);
p := SortingPerm(swords);
L := Permuted(words, p);
r := "";
cur := [ ];
res := [ ];
for word in L do
if SortedList(word) = r then
Add(cur, word);
else
if Length(cur) > 0 then
Add(res, cur);
fi;
r := SortedList(word);
cur := [ word ];
fi;
od;
if Length(cur) > 0 then
Add(res, cur);
fi;
return Filtered(res, v -> Length(v) > 1);
end;
ana := Anagrams("my/gap/unixdict.txt");;
# What is the longest anagram sequence ?
Maximum(List(ana, Length));
# 5
# Which are they ?
Filtered(ana, v -> Length(v) = 5);
# [ [ "abel", "able", "bale", "bela", "elba" ],
# [ "caret", "carte", "cater", "crate", "trace" ],
# [ "angel", "angle", "galen", "glean", "lange" ],
# [ "alger", "glare", "lager", "large", "regal" ],
# [ "elan", "lane", "lean", "lena", "neal" ],
# [ "evil", "levi", "live", "veil", "vile" ] ]
Go
package main
import (
"bytes"
"fmt"
"io/ioutil"
"net/http"
"sort"
)
func main() {
r, err := http.Get("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
if err != nil {
fmt.Println(err)
return
}
b, err := ioutil.ReadAll(r.Body)
r.Body.Close()
if err != nil {
fmt.Println(err)
return
}
var ma int
var bs byteSlice
m := make(map[string][][]byte)
for _, word := range bytes.Fields(b) {
bs = append(bs[:0], byteSlice(word)...)
sort.Sort(bs)
k := string(bs)
a := append(m[k], word)
if len(a) > ma {
ma = len(a)
}
m[k] = a
}
for _, a := range m {
if len(a) == ma {
fmt.Printf("%s\n", a)
}
}
}
type byteSlice []byte
func (b byteSlice) Len() int { return len(b) }
func (b byteSlice) Swap(i, j int) { b[i], b[j] = b[j], b[i] }
func (b byteSlice) Less(i, j int) bool { return b[i] < b[j] }
- Output:
[angel angle galen glean lange]
[elan lane lean lena neal]
[evil levi live veil vile]
[abel able bale bela elba]
[caret carte cater crate trace]
[alger glare lager large regal]
Groovy
This program:
def words = new URL('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt').text.readLines()
def groups = words.groupBy{ it.toList().sort() }
def bigGroupSize = groups.collect{ it.value.size() }.max()
def isBigAnagram = { it.value.size() == bigGroupSize }
println groups.findAll(isBigAnagram).collect{ it.value }.collect{ it.join(' ') }.join('\n')
- Output:
abel able bale bela elba
alger glare lager large regal
angel angle galen glean lange
caret carte cater crate trace
elan lane lean lena neal
evil levi live veil vile
Haskell
import Data.List
groupon f x y = f x == f y
main = do
f <- readFile "./../Puzzels/Rosetta/unixdict.txt"
let words = lines f
wix = groupBy (groupon fst) . sort $ zip (map sort words) words
mxl = maximum $ map length wix
mapM_ (print . map snd) . filter ((==mxl).length) $ wix
- Output:
*Main> main
["abel","able","bale","bela","elba"]
["caret","carte","cater","crate","trace"]
["angel","angle","galen","glean","lange"]
["alger","glare","lager","large","regal"]
["elan","lane","lean","lena","neal"]
["evil","levi","live","veil","vile"]
We can noticeably speed up the second-stage sorting and grouping by packing each sorted String (a list of Chars) into the Text type:
import Data.List (groupBy, maximumBy, sort)
import Data.Ord (comparing)
import Data.Function (on)
import Data.Text (pack)
main :: IO ()
main = do
f <- readFile "./unixdict.txt"
let ws = groupBy (on (==) fst) (sort (((,) =<< pack . sort) <$> lines f))
mapM_
(print . fmap snd)
(filter ((length (maximumBy (comparing length) ws) ==) . length) ws)
- Output:
["abel","able","bale","bela","elba"]
["caret","carte","cater","crate","trace"]
["angel","angle","galen","glean","lange"]
["alger","glare","lager","large","regal"]
["elan","lane","lean","lena","neal"]
["evil","levi","live","veil","vile"]
Icon and Unicon
Sample run:
->an <unixdict.txt
abel bale bela able elba
lean neal elan lane lena
angle galen lange angel glean
alger glare lager large regal
veil evil levi live vile
caret cater crate carte trace
->
J
If the unixdict file has been retrieved and saved in the current directory (for example, using wget):
(#~ a: ~: {:"1) (]/.~ /:~&>) <;._2 ] 1!:1 <'unixdict.txt'
+-----+-----+-----+-----+-----+
|abel |able |bale |bela |elba |
+-----+-----+-----+-----+-----+
|alger|glare|lager|large|regal|
+-----+-----+-----+-----+-----+
|angel|angle|galen|glean|lange|
+-----+-----+-----+-----+-----+
|caret|carte|cater|crate|trace|
+-----+-----+-----+-----+-----+
|elan |lane |lean |lena |neal |
+-----+-----+-----+-----+-----+
|evil |levi |live |veil |vile |
+-----+-----+-----+-----+-----+
Explanation:
<;._2 ] 1!:1 <'unixdict.txt'
This reads in the dictionary and produces a list of boxes. Each box contains one line (one word) from the dictionary.
(]/.~ /:~&>)
This groups the words into rows where anagram equivalents appear in the same row. In other words, it creates a copy of the original list in which the characters in each box have been sorted, then organizes the contents of the original list into rows, with each row keyed by the values in the new list.
(#~ a: ~: {:"1)
This selects rows whose last element is not an empty box.
(In the previous step we created an array of rows of boxes. The short rows were automatically padded with empty boxes so that all rows would be the same length.)
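The three steps explained above can also be sketched outside J. The following is a minimal Python sketch, not a translation of the J expression; the name largest_groups and the sample word list are hypothetical and chosen only for illustration. It mimics J's padding of short rows by filling them with empty strings, then keeps the rows whose last cell is not padding.

```python
from collections import defaultdict


def largest_groups(words):
    """Return the anagram groups that contain the most words."""
    # Key each word by its sorted letters and collect the words per key.
    groups = defaultdict(list)
    for word in words:
        groups["".join(sorted(word))].append(word)
    rows = list(groups.values())

    # J pads short rows with empty boxes so the table is rectangular;
    # here empty strings play the role of the padding.
    width = max(map(len, rows), default=0)
    padded = [row + [""] * (width - len(row)) for row in rows]

    # Only rows that needed no padding have a non-empty last cell,
    # and those are exactly the largest groups.
    return [row for row in padded if row[-1] != ""]


sample = ["evil", "levi", "live", "abel", "able", "cat"]
print(largest_groups(sample))
```

In plain Python the padding step is unnecessary (comparing each group's length with the maximum would do), but keeping it shows why testing the last cell is enough in the J version. The sketch assumes the input contains no empty strings, since those would be indistinguishable from padding.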
Java
The key to this algorithm is the sorting of the characters in each word from the dictionary. The line Arrays.sort(chars); sorts all of the letters in the word in ascending order using a built-in quicksort, so all of the words in the first group in the result end up under the key "aegln" in the anagrams map.
import java.net.*;
import java.io.*;
import java.util.*;
public class WordsOfEqChars {
public static void main(String[] args) throws IOException {
URL url = new URL("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt");
InputStreamReader isr = new InputStreamReader(url.openStream());
BufferedReader reader = new BufferedReader(isr);
Map<String, Collection<String>> anagrams = new HashMap<String, Collection<String>>();
String word;
int count = 0;
while ((word = reader.readLine()) != null) {
char[] chars = word.toCharArray();
Arrays.sort(chars);
String key = new String(chars);
if (!anagrams.containsKey(key))
anagrams.put(key, new ArrayList<String>());
anagrams.get(key).add(word);
count = Math.max(count, anagrams.get(key).size());
}
reader.close();
for (Collection<String> ana : anagrams.values())
if (ana.size() >= count)
System.out.println(ana);
}
}
An alternative version using Java 8 streams and lambdas:
import java.net.*;
import java.io.*;
import java.util.*;
import java.util.concurrent.*;
import java.util.function.*;
public interface Anagram {
public static <AUTOCLOSEABLE extends AutoCloseable, OUTPUT> Supplier<OUTPUT> tryWithResources(Callable<AUTOCLOSEABLE> callable, Function<AUTOCLOSEABLE, Supplier<OUTPUT>> function, Supplier<OUTPUT> defaultSupplier) {
return () -> {
try (AUTOCLOSEABLE autoCloseable = callable.call()) {
return function.apply(autoCloseable).get();
} catch (Throwable throwable) {
return defaultSupplier.get();
}
};
}
public static <INPUT, OUTPUT> Function<INPUT, OUTPUT> function(Supplier<OUTPUT> supplier) {
return i -> supplier.get();
}
public static void main(String... args) {
Map<String, Collection<String>> anagrams = new ConcurrentSkipListMap<>();
int count = tryWithResources(
() -> new BufferedReader(
new InputStreamReader(
new URL(
"http://wiki.puzzlers.org/pub/wordlists/unixdict.txt"
).openStream()
)
),
reader -> () -> reader.lines()
.parallel()
.mapToInt(word -> {
char[] chars = word.toCharArray();
Arrays.parallelSort(chars);
String key = Arrays.toString(chars);
// CopyOnWriteArrayList: words are added concurrently from the parallel stream
Collection<String> collection = anagrams.computeIfAbsent(
key, function(CopyOnWriteArrayList::new)
);
collection.add(word);
return collection.size();
})
.max()
.orElse(0),
() -> 0
).get();
anagrams.values().stream()
.filter(ana -> ana.size() >= count)
.forEach(System.out::println)
;
}
}
- Output:
[angel, angle, galen, glean, lange]
[elan, lane, lean, lena, neal]
[alger, glare, lager, large, regal]
[abel, able, bale, bela, elba]
[evil, levi, live, veil, vile]
[caret, carte, cater, crate, trace]
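The sorted-letters key described at the top of the Java entry can be checked in isolation. The following is a minimal Python sketch rather than part of the Java solution; anagram_key is a hypothetical helper name introduced here for illustration.

```python
def anagram_key(word):
    """Return the letters of word in ascending order, joined into one string."""
    return "".join(sorted(word))


group = ["angel", "angle", "galen", "glean", "lange"]

# Every word in the group collapses to the same key, which is why a map
# from key to word list puts all five words in one bucket.
keys = {anagram_key(word) for word in group}
print(keys)  # → {'aegln'}
```

Because the key depends only on which letters occur and how often, any two anagrams produce the same key, and two words that are not anagrams of each other never do.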
JavaScript
ES5
var fs = require('fs');
var words = fs.readFileSync('unixdict.txt', 'UTF-8').split('\n');
var i, item, max = 0,
anagrams = {};
for (i = 0; i < words.length; i += 1) {
var key = words[i].split('').sort().join('');
if (!anagrams.hasOwnProperty(key)) {//check if property exists on current obj only
anagrams[key] = [];
}
var count = anagrams[key].push(words[i]); //push returns new array length
max = Math.max(count, max);
}
//note, this returns all arrays that match the maximum length
for (item in anagrams) {
if (anagrams.hasOwnProperty(item)) {//check if property exists on current obj only
if (anagrams[item].length === max) {
console.log(anagrams[item].join(' '));
}
}
}
- Output:
[ 'abel', 'able', 'bale', 'bela', 'elba' ]
[ 'alger', 'glare', 'lager', 'large', 'regal' ]
[ 'angel', 'angle', 'galen', 'glean', 'lange' ]
[ 'caret', 'carte', 'cater', 'crate', 'trace' ]
[ 'elan', 'lane', 'lean', 'lena', 'neal' ]
[ 'evil', 'levi', 'live', 'veil', 'vile' ]
Alternative using reduce:
var fs = require('fs');
var dictionary = fs.readFileSync('unixdict.txt', 'UTF-8').split('\n');
//group anagrams
var sortedDict = dictionary.reduce(function (acc, word) {
var sortedLetters = word.split('').sort().join('');
if (acc[sortedLetters] === undefined) { acc[sortedLetters] = []; }
acc[sortedLetters].push(word);
return acc;
}, {});
//sort list by frequency
var keysSortedByFrequency = Object.keys(sortedDict).sort(function (keyA, keyB) {
if (sortedDict[keyA].length < sortedDict[keyB].length) { return 1; }
if (sortedDict[keyA].length > sortedDict[keyB].length) { return -1; }
return 0;
});
//print first 10 anagrams by frequency
keysSortedByFrequency.slice(0, 10).forEach(function (key) {
console.log(sortedDict[key].join(' '));
});
ES6
Using JavaScript for Automation (A JavaScriptCore interpreter on macOS with an Automation library).
(() => {
'use strict';
// largestAnagramGroups :: FilePath -> Either String [[String]]
const largestAnagramGroups = fp =>
either(msg => msg)(strLexicon => {
const
groups = sortBy(flip(comparing(length)))(
groupBy(on(eq)(fst))(
sortBy(comparing(fst))(
strLexicon
.split(/[\r\n]/)
.map(w => [w.split('').sort().join(''), w])
)
)
),
maxSize = groups[0].length;
return map(map(snd))(
takeWhile(x => maxSize === x.length)(
groups
)
)
})(readFileLR(fp));
// ------------------------TEST------------------------
const main = () =>
console.log(JSON.stringify(
largestAnagramGroups('unixdict.txt'),
null, 2
))
// -----------------GENERIC FUNCTIONS------------------
// Left :: a -> Either a b
const Left = x => ({
type: 'Either',
Left: x
});
// Right :: b -> Either a b
const Right = x => ({
type: 'Either',
Right: x
});
// Tuple (,) :: a -> b -> (a, b)
const Tuple = a =>
b => ({
type: 'Tuple',
'0': a,
'1': b,
length: 2
});
// comparing :: (a -> b) -> (a -> a -> Ordering)
const comparing = f =>
x => y => {
const
a = f(x),
b = f(y);
return a < b ? -1 : (a > b ? 1 : 0);
};
// either :: (a -> c) -> (b -> c) -> Either a b -> c
const either = fl =>
fr => e => 'Either' === e.type ? (
undefined !== e.Left ? (
fl(e.Left)
) : fr(e.Right)
) : undefined;
// eq (==) :: Eq a => a -> a -> Bool
const eq = a =>
// True when a and b are equivalent.
b => a === b
// flip :: (a -> b -> c) -> b -> a -> c
const flip = f =>
1 < f.length ? (
(a, b) => f(b, a)
) : (x => y => f(y)(x));
// fst :: (a, b) -> a
const fst = tpl =>
// First member of a pair.
tpl[0];
// groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
const groupBy = fEq => xs =>
// Typical usage: groupBy(on(eq)(f))(xs)
0 < xs.length ? (() => {
const
tpl = xs.slice(1).reduce(
(gw, x) => {
const
gps = gw[0],
wkg = gw[1];
return fEq(wkg[0])(x) ? (
Tuple(gps)(wkg.concat([x]))
) : Tuple(gps.concat([wkg]))([x]);
},
Tuple([])([xs[0]])
);
return tpl[0].concat([tpl[1]])
})() : [];
// length :: [a] -> Int
const length = xs => xs.length
// map :: (a -> b) -> [a] -> [b]
const map = f =>
// The list obtained by applying f
// to each element of xs.
// (The image of xs under f).
xs => xs.map(f);
// on :: (b -> b -> c) -> (a -> b) -> a -> a -> c
const on = f =>
// e.g. sortBy(on(compare,length), xs)
g => a => b => f(g(a))(g(b));
// readFileLR :: FilePath -> Either String IO String
const readFileLR = fp => {
const
e = $(),
ns = $.NSString
.stringWithContentsOfFileEncodingError(
$(fp).stringByStandardizingPath,
$.NSUTF8StringEncoding,
e
);
return ns.isNil() ? (
Left(ObjC.unwrap(e.localizedDescription))
) : Right(ObjC.unwrap(ns));
};
// snd :: (a, b) -> b
const snd = tpl => tpl[1];
// sortBy :: (a -> a -> Ordering) -> [a] -> [a]
const sortBy = f =>
xs => xs.slice()
.sort((a, b) => f(a)(b));
// takeWhile :: (a -> Bool) -> [a] -> [a]
// takeWhile :: (Char -> Bool) -> String -> String
const takeWhile = p =>
xs => {
const lng = xs.length;
return 0 < lng ? xs.slice(
0,
until(i => lng === i || !p(xs[i]))(
i => 1 + i
)(0)
) : [];
};
// until :: (a -> Bool) -> (a -> a) -> a -> a
const until = p => f => x => {
let v = x;
while (!p(v)) v = f(v);
return v;
};
// MAIN ---
return main();
})();
- Output:
[ [ "abel", "able", "bale", "bela", "elba" ], [ "caret", "carte", "cater", "crate", "trace" ], [ "angel", "angle", "galen", "glean", "lange" ], [ "alger", "glare", "lager", "large", "regal" ], [ "elan", "lane", "lean", "lena", "neal" ], [ "evil", "levi", "live", "veil", "vile" ] ]
jq
def anagrams:
(reduce .[] as $word (
{table: {}, max: 0}; # state
($word | explode | sort | implode) as $hash
| .table[$hash] += [ $word ]
| .max = ([ .max, ( .table[$hash] | length) ] | max ) ))
| .max as $max
| .table | .[] | select(length == $max) ;
# The task:
split("\n") | anagrams
- Output:
$ jq -M -s -c -R -f anagrams.jq unixdict.txt
["abel","able","bale","bela","elba"]
["alger","glare","lager","large","regal"]
["angel","angle","galen","glean","lange"]
["caret","carte","cater","crate","trace"]
["elan","lane","lean","lena","neal"]
["evil","levi","live","veil","vile"]
Jsish
Adapted from the JavaScript (Node.js) entry.
/* Anagrams, in Jsish */
var datafile = 'unixdict.txt';
if (console.args[0] == '-more' && Interp.conf('maxArrayList') > 500000)
datafile = '/usr/share/dict/words';
var words = File.read(datafile).split('\n');
puts(words.length, 'words');
var i, item, max = 0, anagrams = {};
for (i = 0; i < words.length; i += 1) {
var key = words[i].split('').sort().join('');
if (!anagrams.hasOwnProperty(key)) {
anagrams[key] = [];
}
var count = anagrams[key].push(words[i]);
max = Math.max(count, max);
}
// display all arrays that match the maximum length
for (item in anagrams) {
if (anagrams.hasOwnProperty(item)) {
if (anagrams[item].length === max) {
puts(anagrams[item].join(' '));
}
}
}
/*
=!EXPECTSTART!=
25108 words
abel able bale bela elba
caret carte cater crate trace
angel angle galen glean lange
alger glare lager large regal
elan lane lean lena neal
evil levi live veil vile
=!EXPECTEND!=
*/
- Output:
prompt$ jsish -u anagrams.jsi
[PASS] anagrams.jsi
To update the script so that it passes with a site-local words file, run jsish -u -update true anagrams.jsi.
Julia
url = "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt"
wordlist = open(readlines, download(url))
wsort(word::AbstractString) = join(sort(collect(word)))
function anagram(wordlist::Vector{<:AbstractString})
dict = Dict{String, Set{String}}()
for word in wordlist
sorted = wsort(word)
push!(get!(dict, sorted, Set{String}()), word)
end
wcnt = maximum(length, values(dict))
return collect(Iterators.filter((y) -> length(y) == wcnt, values(dict)))
end
println.(anagram(wordlist))
- Output:
Set(String["live", "vile", "veil", "evil", "levi"])
Set(String["abel", "able", "bale", "bela", "elba"])
Set(String["crate", "cater", "carte", "trace", "caret"])
Set(String["galen", "angel", "lange", "angle", "glean"])
Set(String["lager", "regal", "glare", "large", "alger"])
Set(String["neal", "elan", "lena", "lane", "lean"])
K
{x@&a=|/a:#:'x}{x g@&1<#:'g:={x@<x}'x}0::`unixdict.txt
Kotlin
import java.io.BufferedReader
import java.io.InputStreamReader
import java.net.URL
import kotlin.math.max
fun main() {
val url = URL("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
val isr = InputStreamReader(url.openStream())
val reader = BufferedReader(isr)
val anagrams = mutableMapOf<String, MutableList<String>>()
var count = 0
var word = reader.readLine()
while (word != null) {
val chars = word.toCharArray()
chars.sort()
val key = chars.joinToString("")
if (!anagrams.containsKey(key)) anagrams[key] = mutableListOf()
anagrams[key]?.add(word)
count = max(count, anagrams[key]?.size ?: 0)
word = reader.readLine()
}
reader.close()
anagrams.values
.filter { it.size == count }
.forEach { println(it) }
}
- Output:
[abel, able, bale, bela, elba]
[alger, glare, lager, large, regal]
[angel, angle, galen, glean, lange]
[caret, carte, cater, crate, trace]
[elan, lane, lean, lena, neal]
[evil, levi, live, veil, vile]
Lasso
local(
anagrams = map,
words = include_url('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt')->split('\n'),
key,
max = 0,
findings = array
)
with word in #words do {
#key = #word -> split('') -> sort& -> join('')
if(not(#anagrams >> #key)) => {
#anagrams -> insert(#key = array)
}
#anagrams -> find(#key) -> insert(#word)
}
with ana in #anagrams
let ana_size = #ana -> size
do {
if(#ana_size > #max) => {
#findings = array(#ana -> join(', '))
#max = #ana_size
else(#ana_size == #max)
#findings -> insert(#ana -> join(', '))
}
}
#findings -> join('<br />\n')
- Output:
abel, able, bale, bela, elba
caret, carte, cater, crate, trace
angel, angle, galen, glean, lange
alger, glare, lager, large, regal
elan, lane, lean, lena, neal
evil, levi, live, veil, vile
Liberty BASIC
' count the word list
open "unixdict.txt" for input as #1
while not(eof(#1))
line input #1,null$
numWords=numWords+1
wend
close #1
'import to an array appending sorted letter set
open "unixdict.txt" for input as #1
dim wordList$(numWords,3)
dim chrSort$(45)
wordNum=1
while wordNum<numWords
line input #1,actualWord$
wordList$(wordNum,1)=actualWord$
wordList$(wordNum,2)=sorted$(actualWord$)
wordNum=wordNum+1
wend
'sort on letter set
sort wordList$(),1,numWords,2
'count and store number of anagrams found
wordNum=1
startPosition=wordNum
numAnagrams=0
currentChrSet$=wordList$(wordNum,2)
while wordNum < numWords
while currentChrSet$=wordList$(wordNum,2)
numAnagrams=numAnagrams+1
wordNum=wordNum+1
wend
for n= startPosition to startPosition+numAnagrams
wordList$(n,3)=right$("0000"+str$(numAnagrams),4)+wordList$(n,2)
next
startPosition=wordNum
numAnagrams=0
currentChrSet$=wordList$(wordNum,2)
wend
'sort on number of anagrams+letter set
sort wordList$(),numWords,1,3
'display the top anagram sets found
wordNum=1
while wordNum<150
currentChrSet$=wordList$(wordNum,2)
print "Anagram set";
while currentChrSet$=wordList$(wordNum,2)
print " : ";wordList$(wordNum,1);
wordNum=wordNum+1
wend
print
currentChrSet$=wordList$(wordNum,2)
wend
close #1
end
function sorted$(w$)
nchr=len(w$)
for chr = 1 to nchr
chrSort$(chr)=mid$(w$,chr,1)
next
sort chrSort$(),1,nchr
sorted$=""
for chr = 1 to nchr
sorted$=sorted$+chrSort$(chr)
next
end function
LiveCode
LiveCode has no built-in command for sorting the characters of a string, so this code converts the letters into items and sorts those. I also wrote a merge sort for characters, but converting to items, using the built-in sort, and converting back to a string is about 10% faster, and certainly easier to write.
on mouseUp
put mostCommonAnagrams(url "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
end mouseUp
function mostCommonAnagrams X
put 0 into maxCount
repeat for each word W in X
get sortChars(W)
put W & comma after A[it]
add 1 to C[it]
if C[it] >= maxCount then
if C[it] > maxCount then
put C[it] into maxCount
put char 1 to -2 of A[it] into winnerList
else
put cr & char 1 to -2 of A[it] after winnerList
end if
end if
end repeat
return winnerList
end mostCommonAnagrams
function sortChars X
get charsToItems(X)
sort items of it
return itemsToChars(it)
end sortChars
function charsToItems X
repeat for each char C in X
put C & comma after R
end repeat
return char 1 to -2 of R
end charsToItems
function itemsToChars X
replace comma with empty in X
return X
end itemsToChars
- Output:
abel,able,bale,bela,elba
angel,angle,galen,glean,lange
elan,lane,lean,lena,neal
alger,glare,lager,large,regal
caret,carte,cater,crate,trace
evil,levi,live,veil,vile
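The character-sorting step described above (split the word into separate items, sort them with the built-in sort, join them back) can be sketched in Python. This is only an illustration of the idea, not a translation of the LiveCode handlers; the name sort_chars is hypothetical.

```python
def sort_chars(word: str) -> str:
    """Return the letters of `word` in sorted order (the anagram key)."""
    items = list(word)      # characters -> separate items
    items.sort()            # built-in sort on the items
    return "".join(items)   # items -> a single string again

print(sort_chars("glean"))                          # aegln
print(sort_chars("angel") == sort_chars("lange"))   # True
```

Two words are anagrams exactly when this key is equal for both, which is why every solution on this page groups words by it.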
Lua
Lua's core library is very small and does not include built-in network functionality. If a networking library were imported, the local file in the following script could be replaced with the remote dictionary file.
function sort(word)
local bytes = {word:byte(1, -1)}
table.sort(bytes)
return string.char(table.unpack(bytes))
end
-- Read in and organize the words.
-- word_sets[<alphabetized_letter_list>] = {<words_with_those_letters>}
local word_sets = {}
local max_size = 0
for word in io.lines('unixdict.txt') do
local key = sort(word)
if word_sets[key] == nil then word_sets[key] = {} end
table.insert(word_sets[key], word)
max_size = math.max(max_size, #word_sets[key])
end
-- Print out the answer sets.
for _, word_set in pairs(word_sets) do
if #word_set == max_size then
for _, word in pairs(word_set) do io.write(word .. ' ') end
print('') -- Finish with a newline.
end
end
- Output:
abel able bale bela elba
evil levi live veil vile
alger glare lager large regal
angel angle galen glean lange
caret carte cater crate trace
elan lane lean lena neal
M4
divert(-1)
changequote(`[',`]')
define([for],
[ifelse($#,0,[[$0]],
[ifelse(eval($2<=$3),1,
[pushdef([$1],$2)$4[]popdef([$1])$0([$1],incr($2),$3,[$4])])])])
define([_bar],include(t.txt))
define([eachlineA],
[ifelse(eval($2>0),1,
[$3(substr([$1],0,$2))[]eachline(substr([$1],incr($2)),[$3])])])
define([eachline],[eachlineA([$1],index($1,[
]),[$2])])
define([removefirst],
[substr([$1],0,$2)[]substr([$1],incr($2))])
define([checkfirst],
[ifelse(eval(index([$2],substr([$1],0,1))<0),1,
0,
[ispermutation(substr([$1],1),
removefirst([$2],index([$2],substr([$1],0,1))))])])
define([ispermutation],
[ifelse([$1],[$2],1,
eval(len([$1])!=len([$2])),1,0,
len([$1]),0,0,
[checkfirst([$1],[$2])])])
define([_set],[define($1<$2>,$3)])
define([_get],[defn([$1<$2>])])
define([_max],1)
define([_n],0)
define([matchj],
[_set([count],$2,incr(_get([count],$2)))[]ifelse(eval(_get([count],$2)>_max),
1,[define([_max],incr(_max))])[]_set([list],$2,[_get([list],$2) $1])])
define([checkwordj],
[ifelse(ispermutation([$1],_get([word],$2)),1,[matchj([$1],$2)],
[addwordj([$1],incr($2))])])
define([_append],
[_set([word],_n,[$1])[]_set([count],_n,1)[]_set([list],_n,
[$1 ])[]define([_n],incr(_n))])
define([addwordj],
[ifelse($2,_n,[_append([$1])],[checkwordj([$1],$2)])])
define([addword],
[addwordj([$1],0)])
divert
eachline(_bar,[addword])
_max
for([x],1,_n,[ifelse(_get([count],x),_max,[_get([list],x)
])])
Memory limitations keep this program from working on the full-sized dictionary.
- Output:
(using only the first 100 words as input)
2
abel able
aboard abroad
Maple
The first line downloads the specified dictionary. (You could, instead, read it from a file, or use one of Maple's built-in word lists.) Next, turn it into a list of words. The assignment to T is where the real work is done (via Classify, in the ListTools package). This creates sets of words all of which have the same "hash", which is, in this case, the sorted word. The convert call discards the hashes, which have done their job, and leaves us with a list L of anagram sets. Finally, we just note the size of the largest sets of anagrams, and pick those off.
words := HTTP:-Get( "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt" )[2]: # ignore errors
use StringTools, ListTools in
T := Classify( Sort, map( Trim, Split( words ) ) )
end use:
L := convert( T, 'list' ):
m := max( map( nops, L ) ); # what is the largest set?
A := select( s -> evalb( nops( s ) = m ), L ); # get the maximal sets of anagrams
The result of running this code is
A := [{"abel", "able", "bale", "bela", "elba"}, {"angel", "angle", "galen",
"glean", "lange"}, {"alger", "glare", "lager", "large", "regal"}, {"evil",
"levi", "live", "veil", "vile"}, {"caret", "carte", "cater", "crate", "trace"}
, {"elan", "lane", "lean", "lena", "neal"}];
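The classify-then-select approach described above can be sketched in Python for comparison. This is a minimal sketch under stated assumptions: the function name largest_anagram_sets and the sample word list are hypothetical, and a plain dictionary stands in for Maple's Classify.

```python
from collections import defaultdict

def largest_anagram_sets(words):
    """Group words by their sorted letters and keep only the largest groups."""
    classes = defaultdict(set)            # sorted letters ("hash") -> set of words
    for w in words:
        w = w.strip()
        if w:
            classes["".join(sorted(w))].add(w)
    groups = list(classes.values())       # the keys have done their job
    m = max(map(len, groups), default=0)  # size of the largest set
    return [g for g in groups if len(g) == m]

sample = ["abel", "able", "bale", "evil", "live", "zebra"]
for group in largest_anagram_sets(sample):
    print(sorted(group))   # ['abel', 'able', 'bale']
```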
Mathematica /Wolfram Language
Download the dictionary, split it into lines, then split each word into characters and sort them. Now sort the list by those sorted-letter keys and find runs of equal 'letter-hashes'. Return the longest runs:
list=Import["http://wiki.puzzlers.org/pub/wordlists/unixdict.txt","Lines"];
text={#,StringJoin@@Sort[Characters[#]]}&/@list;
text=SortBy[text,#[[2]]&];
splits=Split[text,#1[[2]]==#2[[2]]&][[All,All,1]];
maxlen=Max[Length/@splits];
Select[splits,Length[#]==maxlen&]
gives back:
{{abel,able,bale,bela,elba},{caret,carte,cater,crate,trace},{angel,angle,galen,glean,lange},{alger,glare,lager,large,regal},{elan,lane,lean,lena,neal},{evil,levi,live,veil,vile}}
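The sort-then-split idea used above (order the words by their letter-hash, then cut the ordered list into runs of equal hashes) can be sketched in Python with itertools.groupby. The names key and longest_runs are hypothetical, and the short word list is only for illustration.

```python
from itertools import groupby

def key(word):
    """Letter-hash of a word: its characters in sorted order."""
    return "".join(sorted(word))

def longest_runs(words):
    ordered = sorted(words, key=key)                        # equal hashes become adjacent
    runs = [list(g) for _, g in groupby(ordered, key=key)]  # split into runs of equal hash
    longest = max((len(r) for r in runs), default=0)
    return [r for r in runs if len(r) == longest]

print(longest_runs(["lane", "evil", "lean", "vile", "elan", "word"]))
# [['lane', 'lean', 'elan']]
```

groupby only merges adjacent items, so the sort beforehand is essential, just as SortBy has to come before Split.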
An alternative is faster, but requires version 7 (for Gather):
splits = Gather[list, Sort[Characters[#]] == Sort[Characters[#2]] &];
maxlen = Max[Length /@ splits];
Select[splits, Length[#] == maxlen &]
Or, using built-in functions for sorting and gathering elements in lists, it can be implemented as:
anagramGroups = GatherBy[SortBy[GatherBy[list,Sort[Characters[#]] &],Length],Length];
anagramGroups[[-1]]
Also, Mathematica's own word list is available; replacing the list definition with list = WordData[]; and forcing maxlen to 5 yields instead this result:
{{angered,derange,enraged,grandee,grenade}, {anisometric,creationism,miscreation,reactionism,romanticise}, {aper,pare,pear,rape,reap}, {ardeb,barde,bared,beard,bread,debar}, {aril,lair,lari,liar,lira,rail,rial}, {aster,rates,stare,tears,teras}, {caret,carte,cater,crate,react,trace}, {east,eats,sate,seat,seta}, {ester,reset,steer,teres,terse}, {inert,inter,niter,nitre,trine}, {latrine,ratline,reliant,retinal,trenail}, {least,slate,stale,steal,stela,tesla}, {luster,lustre,result,rustle,sutler,ulster}, {merit,miter,mitre,remit,timer}, {part,prat,rapt,tarp,trap}, {resin,rinse,risen,serin,siren}, {respect,scepter,sceptre,specter,spectre}}
With Mathematica 10, this becomes really concise:
list=Import["http://wiki.puzzlers.org/pub/wordlists/unixdict.txt","Lines"];
MaximalBy[GatherBy[list, Sort@*Characters], Length]
Maxima
read_file(name) := block([file, s, L], file: openr(name), L: [],
while stringp(s: readline(file)) do L: cons(s, L), close(file), L)$
u: read_file("C:/my/mxm/unixdict.txt")$
v: map(lambda([s], [ssort(s), s]), u)$
w: sort(v, lambda([x, y], orderlessp(x[1], y[1])))$
ana(L) := block([m, n, p, r, u, v, w],
L: endcons(["", ""], L),
n: length(L),
r: "",
m: 0,
v: [ ],
w: [ ],
for i from 1 thru n do (
u: L[i],
if r = u[1] then (
w: cons(u[2], w)
) else (
p: length(w),
if p >= m then (
if p > m then (m: p, v: []),
v: cons(w, v)
),
w: [u[2]],
r: u[1]
)
),
v)$
ana(w);
/* [["evil", "levi", "live", "veil", "vile"],
["elan", "lane", "lean", "lena", "neal"],
["alger", "glare", "lager", "large", "regal"],
["angel", "angle", "galen", "glean", "lange"],
["caret", "carte", "cater", "crate", "trace"],
["abel", "able", "bale", "bela", "elba"]] */
MiniScript
This implementation is for use with the Mini Micro version of MiniScript. The command-line version does not include an HTTP library. The script can be modified to use the file class to read a local copy of the word list.
wordList = http.get("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt").split(char(10))
makeKey = function(word)
return word.split("").sort.join("")
end function
wordSets = {}
for word in wordList
k = makeKey(word)
if not wordSets.hasIndex(k) then
wordSets[k] = [word]
else
wordSets[k].push(word)
end if
end for
counts = []
for wordSet in wordSets.values
counts.push([wordSet.len, wordSet])
end for
counts.sort(0, false)
maxCount = counts[0][0]
for count in counts
if count[0] == maxCount then print count[1]
end for
- Output:
["abel", "able", "bale", "bela", "elba"]
["alger", "glare", "lager", "large", "regal"]
["angel", "angle", "galen", "glean", "lange"]
["caret", "carte", "cater", "crate", "trace"]
["elan", "lane", "lean", "lena", "neal"]
["evil", "levi", "live", "veil", "vile"]
MUMPS
Anagrams New ii,file,longest,most,sorted,word
Set file="unixdict.txt"
Open file:"r" Use file
For Quit:$ZEOF DO
. New char,sort
. Read word Quit:word=""
. For ii=1:1:$Length(word) Do
. . Set char=$ASCII(word,ii)
. . If char>64,char<91 Set char=char+32
. . Set sort(char)=$Get(sort(char))+1
. . Quit
. Set (sorted,char)="" For Set char=$Order(sort(char)) Quit:char="" Do
. . For ii=1:1:sort(char) Set sorted=sorted_$Char(char)
. . Quit
. Set table(sorted,word)=1
. Quit
Close file
Set sorted="" For Set sorted=$Order(table(sorted)) Quit:sorted="" Do
. Set ii=0,word="" For Set word=$Order(table(sorted,word)) Quit:word="" Set ii=ii+1
. Quit:ii<2
. Set most(ii,sorted)=1
. Set longest($Length(sorted),sorted)=1
. Quit
Write !,"The anagrams with the most variations:"
Set ii=$Order(most(""),-1)
Set sorted="" For Set sorted=$Order(most(ii,sorted)) Quit:sorted="" Do
. Write ! Set word="" For Set word=$Order(table(sorted,word)) Quit:word="" Write " ",word
. Quit
Write !,"The longest anagrams:"
Set ii=$Order(longest(""),-1)
Set sorted="" For Set sorted=$Order(longest(ii,sorted)) Quit:sorted="" Do
. Write ! Set word="" For Set word=$Order(table(sorted,word)) Quit:word="" Write " ",word
. Quit
Quit
Do Anagrams
The anagrams with the most variations:
 abel able bale bela elba
 caret carte cater crate trace
 angel angle galen glean lange
 alger glare lager large regal
 elan lane lean lena neal
 evil levi live veil vile
The longest anagrams:
 conservation conversation
NetRexx
Java–Like
/* NetRexx */
options replace format comments java crossref symbols nobinary
class RAnagramsV01 public
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
method runSample(arg) public signals MalformedURLException, IOException
parse arg localFile .
isr = Reader
if localFile = '' then do
durl = URL("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
dictFrom = durl.toString()
isr = InputStreamReader(durl.openStream())
end
else do
dictFrom = localFile
isr = FileReader(localFile)
end
say 'Searching' dictFrom 'for anagrams'
dictionaryReader = BufferedReader(isr)
anagrams = Map HashMap()
aWord = String
count = 0
loop label w_ forever
aWord = dictionaryReader.readLine()
if aWord = null then leave w_
chars = aWord.toCharArray()
Arrays.sort(chars)
key = String(chars)
if (\anagrams.containsKey(key)) then do
anagrams.put(key, ArrayList())
end
(ArrayList anagrams.get(key)).add(Object aWord)
count = Math.max(count, (ArrayList anagrams.get(key)).size())
end w_
dictionaryReader.close
ani = anagrams.values().iterator()
loop label a_ while ani.hasNext()
ana = ani.next()
if (ArrayList ana).size() >= count then do
say ana
end
end a_
return
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
method main(args = String[]) public static
arg = Rexx(args)
Do
ra = RAnagramsV01()
ra.runSample(arg)
Catch ex = Exception
ex.printStackTrace()
End
return
- Output:
Searching http://wiki.puzzlers.org/pub/wordlists/unixdict.txt for anagrams
[abel, able, bale, bela, elba]
[elan, lane, lean, lena, neal]
[evil, levi, live, veil, vile]
[angel, angle, galen, glean, lange]
[alger, glare, lager, large, regal]
[caret, carte, cater, crate, trace]
Rexx–Like
Implemented with more NetRexx idioms such as indexed strings, PARSE and the NetRexx "built–in functions".
/* NetRexx */
options replace format comments java crossref symbols nobinary
runSample(arg)
return
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
method findMostAnagrams(arg) public static signals MalformedURLException, IOException
parse arg localFile .
isr = Reader
if localFile = '' then do
durl = URL("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
dictFrom = durl.toString()
isr = InputStreamReader(durl.openStream())
end
else do
dictFrom = localFile
isr = FileReader(localFile)
end
say 'Searching' dictFrom 'for anagrams'
dictionaryReader = BufferedReader(isr)
anagrams = 0
maxWords = 0
loop label w_ forever
aWord = dictionaryReader.readLine()
if aWord = null then leave w_
chars = aWord.toCharArray()
Arrays.sort(chars)
key = Rexx(chars)
parse anagrams[key] count aWords
aWords = (aWords aWord).space()
maxWords = maxWords.max(aWords.words())
anagrams[key] = aWords.words() aWords
end w_
dictionaryReader.close
loop key over anagrams
parse anagrams[key] count aWords
if count >= maxWords then
say aWords
else
anagrams[key] = null -- remove unwanted elements from the indexed string
end key
return
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
method runSample(arg) public static
Do
findMostAnagrams(arg)
Catch ex = Exception
ex.printStackTrace()
End
Return
- Output:
Searching http://wiki.puzzlers.org/pub/wordlists/unixdict.txt for anagrams
abel able bale bela elba
elan lane lean lena neal
evil levi live veil vile
angel angle galen glean lange
alger glare lager large regal
caret carte cater crate trace
NewLisp
;;; Get the words as a list, splitting at newline
(setq data
(parse (get-url "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
"\n"))
;
;;; Replace each word with a list of its key (list of sorted chars) and itself
;;; For example "hello" –> (("e" "h" "l" "l" "o") "hello")
(setq data (map (fn(x) (list (sort (explode x)) x)) data))
;
;;; Sort on the keys (data is modified); (x 0) is the same as (first x)
(sort data (fn(x y) (> (x 0)(y 0))))
;
;;; Return a list of lists of words with the same key
;;; An empty list at the head is inconsequential
(define (group-by-key)
(let (temp '() res '() oldkey '())
(dolist (x data)
(if (= (x 0) oldkey)
(push (x 1) temp)
(begin
(push temp res)
(setq temp (list (x 1)) oldkey (x 0)))))
(push temp res)
res))
;
;;; Print out only groups of more than 4 words
(map println (filter (fn(x) (> (length x) 4)) (group-by-key)))
- Output:
("abel" "able" "bale" "bela" "elba")
("caret" "carte" "cater" "crate" "trace")
("angel" "angle" "galen" "glean" "lange")
("alger" "glare" "lager" "large" "regal")
("elan" "lane" "lean" "lena" "neal")
("evil" "levi" "live" "veil" "vile")
Nim
import tables, strutils, algorithm
proc main() =
  var
    count = 0
    anagrams = initTable[string, seq[string]]()
  for word in "unixdict.txt".lines():
    var key = word
    key.sort(cmp[char])
    anagrams.mgetOrPut(key, newSeq[string]()).add(word)
    count = max(count, anagrams[key].len)
  for _, v in anagrams:
    if v.len == count:
      v.join(" ").echo
main()
- Output:
evil levi live veil vile
caret carte cater crate trace
elan lane lean lena neal
alger glare lager large regal
abel able bale bela elba
angel angle galen glean lange
Oberon-2
Oxford Oberon-2
MODULE Anagrams;
IMPORT Files,Out,In,Strings;
CONST
MAXPOOLSZ = 1024;
TYPE
String = ARRAY 80 OF CHAR;
Node = POINTER TO NodeDesc;
NodeDesc = RECORD;
count: INTEGER;
word: String;
desc: Node;
next: Node;
END;
Pool = POINTER TO PoolDesc;
PoolDesc = RECORD
capacity,max: INTEGER;
words: POINTER TO ARRAY OF Node;
END;
PROCEDURE InitNode(n: Node);
BEGIN
n^.count := 0;
n^.word := "";
n^.desc := NIL;
n^.next := NIL;
END InitNode;
PROCEDURE Index(s: ARRAY OF CHAR;cap: INTEGER): INTEGER;
VAR
i,sum: INTEGER;
BEGIN
sum := 0;
FOR i := 0 TO Strings.Length(s) DO
INC(sum,ORD(s[i]))
END;
RETURN sum MOD cap
END Index;
PROCEDURE ISort(VAR s: ARRAY OF CHAR);
VAR
i, j: INTEGER;
t: CHAR;
BEGIN
FOR i := 0 TO Strings.Length(s) - 1 DO
j := i;
t := s[j];
WHILE (j > 0) & (s[j -1] > t) DO
s[j] := s[j - 1];
DEC(j)
END;
s[j] := t
END
END ISort;
PROCEDURE SameLetters(x,y: ARRAY OF CHAR): BOOLEAN;
BEGIN
ISort(x);ISort(y);
RETURN (Strings.Compare(x,y) = 0)
END SameLetters;
PROCEDURE InitPool(p:Pool);
BEGIN
InitPoolWith(p,MAXPOOLSZ);
END InitPool;
PROCEDURE InitPoolWith(p:Pool;cap: INTEGER);
VAR
i: INTEGER;
BEGIN
p^.capacity := cap;
p^.max := 0;
NEW(p^.words,cap);
i := 0;
WHILE i < p^.capacity DO
p^.words^[i] := NIL;
INC(i);
END;
END InitPoolWith;
PROCEDURE (p: Pool) Add(w: ARRAY OF CHAR);
VAR
idx: INTEGER;
iter,n: Node;
BEGIN
idx := Index(w,p^.capacity);
iter := p^.words^[idx];
NEW(n);InitNode(n);COPY(w,n^.word);
WHILE(iter # NIL) DO
IF SameLetters(w,iter^.word) THEN
INC(iter^.count);
IF iter^.count > p^.max THEN p^.max := iter^.count END;
n^.desc := iter^.desc;
iter^.desc := n;
RETURN
END;
iter := iter^.next
END;
ASSERT(iter = NIL);
n^.next := p^.words^[idx];p^.words^[idx] := n
END Add;
PROCEDURE ShowAnagrams(l: Node);
VAR
iter: Node;
BEGIN
iter := l;
WHILE iter # NIL DO
Out.String(iter^.word);Out.String(" ");
iter := iter^.desc
END;
Out.Ln
END ShowAnagrams;
PROCEDURE (p: Pool) ShowMax();
VAR
i: INTEGER;
iter: Node;
BEGIN
FOR i := 0 TO LEN(p^.words^) - 1 DO
IF p^.words^[i] # NIL THEN
iter := p^.words^[i];
WHILE iter # NIL DO
IF iter^.count = p^.max THEN
ShowAnagrams(iter);
END;
iter := iter^.next
END
END
END
END ShowMax;
PROCEDURE DoProcess(fnm: ARRAY OF CHAR);
VAR
stdinBck,istream: Files.File;
line: String;
p: Pool;
BEGIN
istream := Files.Open(fnm,"r");
stdinBck := Files.stdin;
Files.stdin := istream;
NEW(p);InitPool(p);
WHILE In.Done DO
In.Line(line);
p.Add(line);
END;
Files.stdin := stdinBck;
Files.Close(istream);
p^.ShowMax();
END DoProcess;
BEGIN
DoProcess("unixdict.txt");
END Anagrams.
- Output:
abel elba bela bale able
elan neal lena lean lane
evil vile veil live levi
angel lange glean galen angle
alger regal large lager glare
caret trace crate cater carte
Objeck
use HTTP;
use Collection;
class Anagrams {
function : Main(args : String[]) ~ Nil {
lines := HttpClient->New()->Get("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt");
anagrams := StringMap->New();
count := 0;
if(lines->Size() = 1) {
line := lines->Get(0)->As(String);
words := line->Split("\n");
each(i : words) {
word := words[i]->Trim();
key := String->New(word->ToCharArray()->Sort());
list := anagrams->Find(key)->As(Vector);
if(list = Nil) {
list := Vector->New();
anagrams->Insert(key, list);
};
list->AddBack(word);
count := count->Max(list->Size());
};
lists := anagrams->GetValues();
each(i : lists) {
list := lists->Get(i)->As(Vector);
if(list->Size() >= count) {
'['->Print();
each(j : list) {
list->Get(j)->As(String)->Print();
if(j + 1 < list->Size()) {
','->Print();
};
};
']'->PrintLine();
};
};
};
}
}
- Output:
[abel,able,bale,bela,elba]
[caret,carte,cater,crate,trace]
[angel,angle,galen,glean,lange]
[alger,glare,lager,large,regal]
[elan,lane,lean,lena,neal]
[evil,levi,live,veil,vile]
OCaml
let explode str =
let l = ref [] in
let n = String.length str in
for i = n - 1 downto 0 do
l := str.[i] :: !l
done;
(!l)
let implode li =
(* Strings are immutable in current OCaml, so build the result in a Bytes buffer. *)
let n = List.length li in
let s = Bytes.create n in
List.iteri (fun i c -> Bytes.set s i c) li;
Bytes.to_string s
let () =
let h = Hashtbl.create 3571 in
let ic = open_in "unixdict.txt" in
try while true do
let w = input_line ic in
let k = implode (List.sort compare (explode w)) in
let l =
try Hashtbl.find h k
with Not_found -> []
in
Hashtbl.replace h k (w::l);
done with End_of_file -> ();
let n = Hashtbl.fold (fun _ lw n -> max n (List.length lw)) h 0 in
Hashtbl.iter (fun _ lw ->
if List.length lw >= n then
( List.iter (Printf.printf " %s") lw;
print_newline () )
) h
Oforth
import: mapping
import: collect
import: quicksort
: anagrams
| m |
"unixdict.txt" File new groupBy( #sort )
dup sortBy( #[ second size] ) last second size ->m
filter( #[ second size m == ] )
apply ( #[ second .cr ] )
;
- Output:
>anagrams
[abel, able, bale, bela, elba]
[alger, glare, lager, large, regal]
[angel, angle, galen, glean, lange]
[caret, carte, cater, crate, trace]
[elan, lane, lean, lena, neal]
[evil, levi, live, veil, vile]
ooRexx
Two versions of this, using different collection classes.
Version 1: Directory of arrays
-- This assumes you've already downloaded the following file and placed it
-- in the current directory: http://wiki.puzzlers.org/pub/wordlists/unixdict.txt
-- There are several different ways of reading the file. I chose the
-- supplier method just because I haven't used it yet in any other examples.
source = .stream~new('unixdict.txt')~supplier
-- this holds our mappings of the anagrams
anagrams = .directory~new
count = 0 -- this is used to keep track of the maximums
loop while source~available
word = source~item
-- this produces a string consisting of the characters in sorted order
-- Note: the ~~ used to invoke sort makes that message return value be
-- the target array. The sort method does not normally have a return value.
key = word~makearray('')~~sort~tostring("l", "")
-- make sure we have an accumulator collection for this key
list = anagrams[key]
if list == .nil then do
list = .array~new
anagrams[key] = list
end
-- this word is now associated with this key
list~append(word)
-- and see if this is a new highest count
count = max(count, list~items)
source~next
end
loop letters over anagrams
list = anagrams[letters]
if list~items >= count then
say letters":" list~makestring("l", ", ")
end
Version 2: Using the relation class
This version appears to be the fastest.
-- This assumes you've already downloaded the following file and placed it
-- in the current directory: http://wiki.puzzlers.org/pub/wordlists/unixdict.txt
-- There are several different ways of reading the file. I chose the
-- supplier method just because I haven't used it yet in any other examples.
source = .stream~new('unixdict.txt')~supplier
-- this holds our mappings of the anagrams. This is good use for the
-- relation class
anagrams = .relation~new
count = 0 -- this is used to keep track of the maximums
loop while source~available
word = source~item
-- this produces a string consisting of the characters in sorted order
-- Note: the ~~ used to invoke sort makes that message return value be
-- the target array. The sort method does not normally have a return value.
key = word~makearray('')~~sort~tostring("l", "")
-- add this to our mapping. This creates multiple entries for each
-- word that uses the same key
anagrams[key] = word
source~next
end
-- now get the set of unique keys
keys = .set~new~~putall(anagrams~allIndexes)
count = 0 -- this is used to keep track of the maximums
most = .directory~new
loop key over keys
words = anagrams~allAt(key)
newCount = words~items
if newCount > count then do
-- throw away our old set
most~empty
count = newCount
most[key] = words
end
-- matches our highest count, add it to the list
else if newCount == count then
most[key] = words
end
loop letters over most
words = most[letters]
say letters":" words~makestring("l", ", ")
end
Timings taken on my laptop:
Version 1    1.2 seconds
Version 2    0.4 seconds
Rexx        51.1 seconds (!) as of 04.08.2013 (using ooRexx after adapting the code for incompatibilities: @->y, a=, Upper)
REXX v1      1.7 seconds as of 05.08.2013 -"- (improved version of REXX code)
REXX v1      1.2 seconds 09.08.2013 -"-
REXX v2      1.2 seconds 09.08.2013
PL/I         4.3 seconds
NetRexx v1    .2 seconds (using local file, 4 seconds with remote)
NetRexx v2   .09 seconds (using local file)
It probably should be noted that the REXX timings are actually for ooRexx executing a modified version of the REXX code.
Statistics:
sets   number of words
22022  1
1089   2
155    3
31     4
6      5
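The statistics above count how many letter-sets contain one word, two words, and so on. A minimal Python sketch of how such a histogram could be produced follows; the function name set_size_histogram and the sample words are hypothetical, and this is not the program that generated the figures shown.

```python
from collections import Counter, defaultdict

def set_size_histogram(words):
    """Map each anagram-set size to the number of sets having that size."""
    groups = defaultdict(list)
    for w in words:
        groups["".join(sorted(w))].append(w)   # group by sorted letters
    return Counter(len(g) for g in groups.values())

hist = set_size_histogram(["abel", "able", "bale", "evil", "live", "zebra", "dog"])
for size in sorted(hist):
    print(hist[size], size)   # number of sets, then words per set
```

Run against the full unixdict.txt, the same function would be expected to yield a table in the layout shown above, with the largest key being the answer to the task's "most words" question.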
Oz
declare
%% Helper function
fun {ReadLines Filename}
File = {New class $ from Open.file Open.text end init(name:Filename)}
in
for collect:C break:B do
case {File getS($)} of false then {File close} {B}
[] Line then {C Line}
end
end
end
%% Groups anagrams by using a mutable dictionary
%% with sorted words as keys
WordDict = {Dictionary.new}
for Word in {ReadLines "unixdict.txt"} do
Keyword = {String.toAtom {Sort Word Value.'<'}}
in
WordDict.Keyword := Word|{CondSelect WordDict Keyword nil}
end
Sets = {Dictionary.items WordDict}
%% Filter such that only the largest sets remain
MaxSetSize = {FoldL {Map Sets Length} Max 0}
LargestSets = {Filter Sets fun {$ S} {Length S} == MaxSetSize end}
in
%% Display result (make sure strings are shown as string, not as number lists)
{Inspector.object configureEntry(widgetShowStrings true)}
{Inspect LargestSets}
Pascal
Program Anagrams;
// assumes a local file
uses
classes, math;
var
i, j, k, maxCount: integer;
sortedString: string;
WordList: TStringList;
SortedWordList: TStringList;
AnagramList: array of TStringlist;
begin
WordList := TStringList.Create;
WordList.LoadFromFile('unixdict.txt');
for i := 0 to WordList.Count - 1 do
begin
setLength(sortedString,Length(WordList.Strings[i]));
sortedString[1] := WordList.Strings[i][1];
// sorted assign
j := 2;
while j <= Length(WordList.Strings[i]) do
begin
k := j - 1;
while (k > 0) and (WordList.Strings[i][j] < sortedString[k]) do
begin
sortedString[k+1] := sortedString[k];
k := k - 1;
end;
sortedString[k+1] := WordList.Strings[i][j];
j := j + 1;
end;
// create the stringlists of the sorted letters and
// the list of the original words
if not assigned(SortedWordList) then
begin
SortedWordList := TStringList.Create;
SortedWordList.append(sortedString);
setlength(AnagramList,1);
AnagramList[0] := TStringList.Create;
AnagramList[0].append(WordList.Strings[i]);
end
else
begin
j := 0;
while sortedString <> SortedWordList.Strings[j] do
begin
inc(j);
if j = (SortedWordList.Count) then
begin
SortedWordList.append(sortedString);
setlength(AnagramList,length(AnagramList) + 1);
AnagramList[j] := TStringList.Create;
break;
end;
end;
AnagramList[j].append(WordList.Strings[i]);
end;
end;
maxCount := 1;
for i := 0 to length(AnagramList) - 1 do
maxCount := max(maxCount, AnagramList[i].Count);
// create output
writeln('The largest sets of words have ', maxCount, ' members:');
for i := 0 to length(AnagramList) - 1 do
begin
if AnagramList[i].Count = maxCount then
begin
write('"', SortedWordList.strings[i], '": ');
for j := 0 to AnagramList[i].Count - 2 do
write(AnagramList[i].strings[j], ', ');
writeln(AnagramList[i].strings[AnagramList[i].Count - 1]);
end;
end;
// Cleanup
WordList.Destroy;
SortedWordList.Destroy;
for i := 0 to length(AnagramList) - 1 do
AnagramList[i].Destroy;
end.
- Output:
The largest sets of words have 5 members:
"abel": abel, able, bale, bela, elba
"aeglr": alger, glare, lager, large, regal
"aegln": angel, angle, galen, glean, lange
"acert": caret, carte, cater, crate, trace
"aeln": elan, lane, lean, lena, neal
"eilv": evil, levi, live, veil, vile
PascalABC.NET
begin
var s := System.Net.WebClient.Create.DownloadString('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt');
var words := s.Split;
var groups := words.GroupBy(word -> word.Order.JoinToString);
var maxCount := groups.Max(gr -> gr.Count);
groups.Where(gr -> gr.Count = maxCount).PrintLines;
end.
- Output:
[abel,able,bale,bela,elba]
[alger,glare,lager,large,regal]
[angel,angle,galen,glean,lange]
[caret,carte,cater,crate,trace]
[elan,lane,lean,lena,neal]
[evil,levi,live,veil,vile]
Perl
use List::Util 'max';
my @words = split "\n", do { local( @ARGV, $/ ) = ( 'unixdict.txt' ); <> };
my %anagram;
for my $word (@words) {
push @{ $anagram{join '', sort split '', $word} }, $word;
}
my $count = max(map {scalar @$_} values %anagram);
for my $ana (values %anagram) {
print "@$ana\n" if @$ana == $count;
}
If we calculate $max ourselves, then we don't need the List::Util module:
push @{$anagram{ join '' => sort split '' }}, $_ for @words;
$max > @$_ or $max = @$_ for values %anagram;
@$_ == $max and print "@$_\n" for values %anagram;
- Output:
alger glare lager large regal
abel able bale bela elba
evil levi live veil vile
angel angle galen glean lange
elan lane lean lena neal
caret carte cater crate trace
Phix
copied from Euphoria and cleaned up slightly
integer fn = open("demo/unixdict.txt","r")
sequence words = {}, anagrams = {}, last="", letters
object word
integer maxlen = 1
while 1 do
    word = trim(gets(fn))
    if atom(word) then exit end if
    if length(word) then
        letters = sort(word)
        words = append(words, {letters, word})
    end if
end while
close(fn)
words = sort(words)
for i=1 to length(words) do
    {letters,word} = words[i]
    if letters=last then
        anagrams[$] = append(anagrams[$],word)
        if length(anagrams[$])>maxlen then
            maxlen = length(anagrams[$])
        end if
    else
        last = letters
        anagrams = append(anagrams,{word})
    end if
end for
puts(1,"\nMost anagrams:\n")
for i=1 to length(anagrams) do
    last = anagrams[i]
    if length(last)=maxlen then
        printf(1,"%s\n",{join(last,", ")})
    end if
end for
- Output:
Most anagrams: abel, able, bale, bela, elba caret, carte, cater, crate, trace angel, angle, galen, glean, lange alger, glare, lager, large, regal elan, lane, lean, lena, neal evil, levi, live, veil, vile
Phixmonti
include ..\Utilitys.pmt
"unixdict.txt" "r" fopen var f
( )
true while
f fgets
dup -1 == if
drop
f fclose
false
else
-1 del
dup sort swap 2 tolist 0 put
true
endif
endwhile
sort
"" var prev
( ) var prov
( ) var res
0 var maxlen
len for
get 1 get dup prev != if
res prov len maxlen > if len var maxlen endif
0 put var res ( ) var prov
endif
var prev
2 get nip
prov swap 0 put var prov
endfor
res
len for
get len maxlen == if ? else drop endif
endfor
Other solution
include ..\Utilitys.pmt
( )
newd var dict
0 var maxlen
"unixdict.txt" "r" fopen var f
true while
f fgets
dup -1 == if
drop
f fclose
false
else
-1 del
0 put
true
endif
endwhile
len for
get dup >ps sort dup >ps
dict swap getd dup
"Unfound" == if
drop ps> ps> 1 tolist
else
ps> swap ps> 0 put len maxlen max var maxlen
endif
2 tolist setd var dict
endfor
drop dict 2 get nip
len for
get len maxlen == if ? else drop endif
endfor
- Output:
["abel", "able", "bale", "bela", "elba"]["caret", "carte", "cater", "crate", "trace"] ["angel", "angle", "galen", "glean", "lange"] ["alger", "glare", "lager", "large", "regal"] ["elan", "lane", "lean", "lena", "neal"] ["evil", "levi", "live", "veil", "vile"]
=== Press any key to exit ===
PHP
<?php
$words = explode("\n", file_get_contents('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt'));
foreach ($words as $word) {
$chars = str_split($word);
sort($chars);
$anagram[implode($chars)][] = $word;
}
$best = max(array_map('count', $anagram));
foreach ($anagram as $ana)
if (count($ana) == $best)
print_r($ana);
?>
Picat
Using foreach loop:
go =>
Dict = new_map(),
foreach(Line in read_file_lines("unixdict.txt"))
Sorted = Line.sort(),
Dict.put(Sorted, Dict.get(Sorted,"") ++ [Line] )
end,
MaxLen = max([Value.length : _Key=Value in Dict]),
println(maxLen=MaxLen),
foreach(_Key=Value in Dict, Value.length == MaxLen)
println(Value)
end,
nl.
- Output:
maxLen = 5 [alger,glare,lager,large,regal] [evil,levi,live,veil,vile] [abel,able,bale,bela,elba] [caret,carte,cater,crate,trace] [angel,angle,galen,glean,lange] [elan,lane,lean,lena,neal]
Same idea, but shorter version by (mis)using list comprehensions.
go2 =>
M = new_map(),
_ = [_:W in read_file_lines("unixdict.txt"),S=sort(W),M.put(S,M.get(S,"")++[W])],
X = max([V.len : _K=V in M]),
println(maxLen=X),
[V : _=V in M, V.len=X].println.
- Output:
maxLen = 5 [[evil,levi,live,veil,vile],[abel,able,bale,bela,elba],[caret,carte,cater,crate,trace],[angel,angle,galen,glean,lange],[elan,lane,lean,lena,neal],[alger,glare,lager,large,regal]]
PicoLisp
A straightforward implementation using 'group' takes 48 seconds on a 1.7 GHz Pentium:
(flip
(by length sort
(by '((L) (sort (copy L))) group
(in "unixdict.txt" (make (while (line) (link @)))) ) ) )
Using a binary tree with the 'idx' function, it takes only 0.42 seconds on the same machine, faster by a factor of more than 100:
(let Words NIL
(in "unixdict.txt"
(while (line)
(let (Word (pack @) Key (pack (sort @)))
(if (idx 'Words Key T)
(push (car @) Word)
(set Key (list Word)) ) ) ) )
(flip (by length sort (mapcar val (idx 'Words)))) )
- Output:
-> (("vile" "veil" "live" "levi" "evil") ("trace" "crate" "cater" "carte" "caret") ("regal" "large" "lager" "glare" "alger") ("neal" "lena" "lean" "lane" "elan") ("lange" "glean" "galen" "angle" "angel") ("elba" "bela" "bale" "able" "abel") ("tulsa" "talus" "sault" "latus") ...
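The keyed-lookup idea behind the second PicoLisp version can be sketched in Python. This is not a translation of either snippet; it assumes a small in-memory word list, and the names `WORDS` and `group_by_key` are hypothetical. A dictionary keyed on the sorted letters stands in for the 'idx' tree, so each word costs one lookup.

```python
from collections import defaultdict

# Hypothetical sample; the task itself reads unixdict.txt.
WORDS = ["evil", "levi", "live", "veil", "vile", "lane", "lean", "neal", "cat"]

def group_by_key(words):
    """Map each sorted-letter key to the words that share it."""
    groups = defaultdict(list)
    for word in words:
        groups["".join(sorted(word))].append(word)
    return groups

groups = group_by_key(WORDS)
# Largest groups first, mirroring the (flip (by length sort ...)) step.
ranked = sorted(groups.values(), key=len, reverse=True)
print(ranked[0])
```

A hash table is used here only because it is Python's built-in keyed container; an ordered tree would serve the same purpose.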
PL/I
/* Search a list of words, finding those having the same letters. */
word_test: proc options (main);
declare words (50000) character (20) varying,
frequency (50000) fixed binary;
declare word character (20) varying;
declare (i, k, wp, most) fixed binary (31);
on endfile (sysin) go to done;
words = ''; frequency = 0;
wp = 0;
do forever;
get edit (word) (L);
call search_word_list (word);
end;
done:
put skip list ('There are ' || wp || ' words');
most = 0;
/* Determine the word(s) having the greatest number of anagrams. */
do i = 1 to wp;
if most < frequency(i) then most = frequency(i);
end;
put skip edit ('The following word(s) have ', trim(most), ' anagrams:') (a);
put skip;
do i = 1 to wp;
if most = frequency(i) then put edit (words(i)) (x(1), a);
end;
search_word_list: procedure (word) options (reorder);
declare word character (*) varying;
declare i fixed binary (31);
do i = 1 to wp;
if length(words(i)) = length(word) then
if is_anagram(word, words(i)) then
do;
frequency(i) = frequency(i) + 1;
return;
end;
end;
/* The word does not exist in the list, so add it. */
if wp >= hbound(words,1) then return;
wp = wp + 1;
words(wp) = word;
frequency(wp) = 1;
return;
end search_word_list;
/* Returns true if the words are anagrams, otherwise returns false. */
is_anagram: procedure (word1, word2) returns (bit(1)) options (reorder);
declare (word1, word2) character (*) varying;
declare tword character (20) varying, c character (1);
declare (i, j) fixed binary;
tword = word2;
do i = 1 to length(word1);
c = substr(word1, i, 1);
j = index(tword, c);
if j = 0 then return ('0'b);
substr(tword, j, 1) = ' ';
end;
return ('1'b);
end is_anagram;
end word_test;
- Output:
There are 23565 words The following word(s) have 5 anagrams: abel alger angel caret elan evil
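The PL/I program compares each new word against one stored representative per class instead of building a sorted key. A rough Python sketch of that technique follows; the function names are hypothetical, and `list.remove` plays the role of blanking out a matched letter in the PL/I `is_anagram`.

```python
def is_anagram(word1, word2):
    """True if word2 contains exactly the letters of word1."""
    if len(word1) != len(word2):
        return False
    remaining = list(word2)
    for ch in word1:
        if ch not in remaining:
            return False
        remaining.remove(ch)  # consume one matching letter
    return True

def count_classes(words):
    """Return (representative, count) pairs in order of first appearance."""
    reps, counts = [], []
    for word in words:
        for i, rep in enumerate(reps):
            if is_anagram(word, rep):
                counts[i] += 1
                break
        else:
            # No stored representative matched, so start a new class.
            reps.append(word)
            counts.append(1)
    return list(zip(reps, counts))

print(count_classes(["abel", "able", "cat", "bale"]))
```

Because every word may be checked against every representative seen so far, this approach does more comparisons than the sorted-key methods as the list grows, and it reports only the first word of each class, as the PL/I output does.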
Pointless
output =
readFileLines("unixdict.txt")
|> reduce(logWord, {})
|> vals
|> getMax
|> printLines
logWord(dict, word) =
(dict with $[chars] = [word] ++ getDefault(dict, [], chars))
where chars = sort(word)
getMax(groups) =
groups |> filter(g => length(g) == maxLength)
where maxLength = groups |> map(length) |> maximum
- Output:
["elba", "bela", "bale", "able", "abel"] ["neal", "lena", "lean", "lane", "elan"] ["vile", "veil", "live", "levi", "evil"] ["lange", "glean", "galen", "angle", "angel"] ["regal", "large", "lager", "glare", "alger"] ["trace", "crate", "cater", "carte", "caret"]
PowerShell
$c = New-Object Net.WebClient
$words = -split ($c.DownloadString('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt'))
$top_anagrams = $words `
| ForEach-Object {
$_ | Add-Member -PassThru NoteProperty Characters `
(-join (([char[]] $_) | Sort-Object))
} `
| Group-Object Characters `
| Group-Object Count `
| Sort-Object Count `
| Select-Object -First 1
$top_anagrams.Group | ForEach-Object { $_.Group -join ', ' }
- Output:
abel, able, bale, bela, elba alger, glare, lager, large, regal angel, angle, galen, glean, lange caret, carte, cater, crate, trace elan, lane, lean, lena, neal evil, levi, live, veil, vile
Another way with more .Net methods is quite a different style, but drops the runtime from 2 minutes to 1.5 seconds:
$Timer = [System.Diagnostics.Stopwatch]::StartNew()
$uri = 'http://wiki.puzzlers.org/pub/wordlists/unixdict.txt'
$words = -split [Net.WebClient]::new().DownloadString($uri)
$anagrams = @{}
$maxAnagramCount = 0
foreach ($w in $words)
{
# Sort the characters in the word into alphabetical order
$chars=[char[]]$w
[array]::sort($chars)
$orderedChars = [string]::Join('', $chars)
# If no anagrams list for these chars, make one
if (-not $anagrams.ContainsKey($orderedChars))
{
$anagrams[$orderedChars] = [Collections.Generic.List[String]]::new()
}
# Add current word as an anagram of these chars,
# in a way which keeps the list available
($list = $anagrams[$orderedChars]).Add($w)
# Keep running score of max number of anagrams seen
if ($list.Count -gt $maxAnagramCount)
{
$maxAnagramCount = $list.Count
}
}
foreach ($entry in $anagrams.GetEnumerator())
{
if ($entry.Value.Count -eq $maxAnagramCount)
{
[string]::join(', ', $entry.Value)
}
}
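The second PowerShell version groups the words and tracks the maximum group size in the same pass. A minimal Python sketch of that pattern, assuming an in-memory word list (the name `top_anagrams` is hypothetical):

```python
def top_anagrams(words):
    """Group words by sorted letters, tracking the largest group size as we go."""
    anagrams = {}
    max_count = 0
    for w in words:
        key = "".join(sorted(w))
        group = anagrams.setdefault(key, [])  # create the list on first sight
        group.append(w)
        max_count = max(max_count, len(group))
    # Second pass only over the groups, not over the words again.
    return [", ".join(g) for g in anagrams.values() if len(g) == max_count]

print(top_anagrams(["evil", "cat", "live", "act", "vile"]))
```

Keeping the running maximum avoids a separate scan over all groups just to find the largest size.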
Processing
import java.util.Map;
void setup() {
String[] words = loadStrings("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt");
topAnagrams(words);
}
void topAnagrams (String[] words){
HashMap<String, StringList> anagrams = new HashMap<String, StringList>();
int maxcount = 0;
for (String word : words) {
char[] chars = word.toCharArray();
chars = sort(chars);
String key = new String(chars);
if (!anagrams.containsKey(key)) {
anagrams.put(key, new StringList());
}
anagrams.get(key).append(word);
maxcount = max(maxcount, anagrams.get(key).size());
}
for (StringList ana : anagrams.values()) {
if (ana.size() >= maxcount) {
println(ana);
}
}
}
- Output:
StringList size=5 [ "evil", "levi", "live", "veil", "vile" ] StringList size=5 [ "abel", "able", "bale", "bela", "elba" ] StringList size=5 [ "elan", "lane", "lean", "lena", "neal" ] StringList size=5 [ "angel", "angle", "galen", "glean", "lange" ] StringList size=5 [ "alger", "glare", "lager", "large", "regal" ] StringList size=5 [ "caret", "carte", "cater", "crate", "trace" ]
Prolog
:- use_module(library( http/http_open )).
anagrams:-
% we read the URL of the words
http_open('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt', In, []),
read_file(In, [], Out),
close(In),
% we get a list of pairs key-value where key = a-word value = <list-of-its-codes>
% this list must be sorted
msort(Out, MOut),
% in order to gather values with the same keys
group_pairs_by_key(MOut, GPL),
% we sorted this list in decreasing order of the length of values
predsort(my_compare, GPL, GPLSort),
% we extract the first 6 items
GPLSort = [_H1-T1, _H2-T2, _H3-T3, _H4-T4, _H5-T5, _H6-T6 | _],
% Tnn are lists of codes (97 for 'a'), we create the strings
maplist(maplist(atom_codes), L, [T1, T2, T3, T4, T5, T6] ),
maplist(writeln, L).
read_file(In, L, L1) :-
read_line_to_codes(In, W),
( W == end_of_file ->
% the file is read
L1 = L
;
% we sort the list of codes of the line
msort(W, W1),
% to create the key in alphabetic order
atom_codes(A, W1),
% and we have the pair Key-Value in the result list
read_file(In, [A-W | L], L1)).
% predicate for sorting list of pairs Key-Values
% if the length of values is the same
% we sort the keys in alphabetic order
my_compare(R, K1-V1, K2-V2) :-
length(V1, L1),
length(V2, L2),
( L1 < L2 -> R = >; L1 > L2 -> R = <; compare(R, K1, K2)).
The result is
[abel,able,bale,bela,elba] [caret,carte,cater,crate,trace] [angel,angle,galen,glean,lange] [alger,glare,lager,large,regal] [elan,lane,lean,lena,neal] [evil,levi,live,veil,vile] true
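The sort, group, and re-sort pipeline of the Prolog solution can be sketched in Python. This is an illustration under assumptions, not a port: `ranked_groups` is a hypothetical name, and `my_compare` is rewritten as a Python three-way comparator that puts longer value lists first and breaks ties by key.

```python
from functools import cmp_to_key
from itertools import groupby

def my_compare(a, b):
    """Longer value lists first; ties broken by key order."""
    (k1, v1), (k2, v2) = a, b
    if len(v1) != len(v2):
        return -1 if len(v1) > len(v2) else 1
    return (k1 > k2) - (k1 < k2)

def ranked_groups(words):
    # Key-Value pairs, sorted so that equal keys are adjacent.
    pairs = sorted(("".join(sorted(w)), w) for w in words)
    grouped = [(k, [w for _, w in g])
               for k, g in groupby(pairs, key=lambda p: p[0])]
    return sorted(grouped, key=cmp_to_key(my_compare))

print(ranked_groups(["lane", "evil", "live", "lean", "cat"]))
```

Filtering on the length of the first group, rather than taking a fixed number of leading items, would avoid hard-coding how many groups share the maximum.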
PureBasic
InitNetwork() ;
OpenConsole()
Procedure.s sortWord(word$)
len.i = Len(word$)
Dim CharArray.s (len)
For n = 1 To len ; Transferring each single character
CharArray(n) = Mid(word$, n, 1) ; of the word into an array.
Next
SortArray(CharArray(),#PB_Sort_NoCase ) ; Sorting the array.
word$ =""
For n = 1 To len ; Writing back each single
word$ + CharArray(n) ; character of the array.
Next
ProcedureReturn word$
EndProcedure
;for a faster and more advanced alternative replace the previous procedure with this code
; Procedure.s sortWord(word$) ;returns a string with the letters of the word sorted
; Protected wordLength = Len(word$)
; Protected Dim letters.c(wordLength)
;
; PokeS(@letters(), word$) ;overwrite the array with the string's contents
; SortArray(letters(), #PB_Sort_Ascending, 0, wordLength - 1)
; ProcedureReturn PeekS(@letters(), wordLength) ;return the array's contents
; EndProcedure
tmpdir$ = GetTemporaryDirectory()
filename$ = tmpdir$ + "unixdict.txt"
Structure ana
isana.l
anas.s
EndStructure
NewMap anaMap.ana()
If ReceiveHTTPFile("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt", filename$)
If ReadFile(1, filename$)
Repeat
word$ = (ReadString(1)) ; Reading a word from a file.
key$ = (sortWord(word$)) ; Sorting the word and storing in key$.
If FindMapElement(anaMap(), key$) ; Looking up if a word already had the same key$.
; if yes
anaMap()\anas = anaMap()\anas+ ", " + word$ ; adding the word
anaMap()\isana + 1
Else
; if no
anaMap(key$)\anas = word$ ; applying a new record
anaMap()\isana = 1
EndIf
If anaMap()\isana > maxAnagrams ;make note of maximum anagram count
maxAnagrams = anaMap()\isana
EndIf
Until Eof(1)
CloseFile(1)
DeleteFile(filename$)
;----- output -----
ForEach anaMap()
If anaMap()\isana = maxAnagrams ; only emit elements that have the most hits
PrintN(anaMap()\anas)
EndIf
Next
PrintN("Press any key"): Repeat: Until Inkey() <> ""
EndIf
EndIf
- Output:
evil, levi, live, veil, vile angel, angle, galen, glean, lange alger, glare, lager, large, regal abel, able, bale, bela, elba elan, lane, lean, lena, neal caret, carte, cater, crate, trace
Python
Python 3.X Using defaultdict
Python 3.2 shell input (IDLE)
>>> import urllib.request
>>> from collections import defaultdict
>>> words = urllib.request.urlopen('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt').read().split()
>>> anagram = defaultdict(list) # map sorted chars to anagrams
>>> for word in words:
    anagram[tuple(sorted(word))].append( word )
>>> count = max(len(ana) for ana in anagram.values())
>>> for ana in anagram.values():
    if len(ana) >= count:
        print ([x.decode() for x in ana])
Python 2.7 version
Python 2.7 shell input (IDLE)
>>> import urllib
>>> from collections import defaultdict
>>> words = urllib.urlopen('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt').read().split()
>>> len(words)
25104
>>> anagram = defaultdict(list) # map sorted chars to anagrams
>>> for word in words:
    anagram[tuple(sorted(word))].append( word )
>>> count = max(len(ana) for ana in anagram.itervalues())
>>> for ana in anagram.itervalues():
    if len(ana) >= count:
        print ana
['angel', 'angle', 'galen', 'glean', 'lange']
['alger', 'glare', 'lager', 'large', 'regal']
['caret', 'carte', 'cater', 'crate', 'trace']
['evil', 'levi', 'live', 'veil', 'vile']
['elan', 'lane', 'lean', 'lena', 'neal']
['abel', 'able', 'bale', 'bela', 'elba']
>>> count
5
>>>
Python: Using groupby
sort and then group using groupby()
>>> import urllib, itertools
>>> words = urllib.urlopen('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt').read().split()
>>> len(words)
25104
>>> anagrams = [list(g) for k,g in itertools.groupby(sorted(words, key=sorted), key=sorted)]
>>> count = max(len(ana) for ana in anagrams)
>>> for ana in anagrams:
    if len(ana) >= count:
        print ana
['abel', 'able', 'bale', 'bela', 'elba']
['caret', 'carte', 'cater', 'crate', 'trace']
['angel', 'angle', 'galen', 'glean', 'lange']
['alger', 'glare', 'lager', 'large', 'regal']
['elan', 'lane', 'lean', 'lena', 'neal']
['evil', 'levi', 'live', 'veil', 'vile']
>>> count
5
>>>
Or, disaggregating, speeding up a bit by avoiding the slightly expensive use of sorted as a key, updating for Python 3, and using a local unixdict.txt:
'''Largest anagram groups found in list of words.'''
from os.path import expanduser
from itertools import groupby
from operator import eq
# main :: IO ()
def main():
    '''Largest anagram groups in local unixdict.txt'''
    print(unlines(
        largestAnagramGroups(
            lines(readFile('unixdict.txt'))
        )
    ))
# largestAnagramGroups :: [String] -> [[String]]
def largestAnagramGroups(ws):
    '''A list of the anagram groups
       of the largest size found in a
       given list of words.
    '''
    # wordChars :: String -> (String, String)
    def wordChars(w):
        '''A word paired with its
           AZ sorted characters
        '''
        return (''.join(sorted(w)), w)
    groups = list(map(
        compose(list)(snd),
        groupby(
            sorted(
                map(wordChars, ws),
                key=fst
            ),
            key=fst
        )
    ))
    intMax = max(map(len, groups))
    return list(map(
        compose(unwords)(curry(map)(snd)),
        filter(compose(curry(eq)(intMax))(len), groups)
    ))
# GENERIC -------------------------------------------------
# compose (<<<) :: (b -> c) -> (a -> b) -> a -> c
def compose(g):
    '''Right to left function composition.'''
    return lambda f: lambda x: g(f(x))
# curry :: ((a, b) -> c) -> a -> b -> c
def curry(f):
    '''A curried function derived
       from an uncurried function.'''
    return lambda a: lambda b: f(a, b)
# fst :: (a, b) -> a
def fst(tpl):
    '''First member of a pair.'''
    return tpl[0]
# lines :: String -> [String]
def lines(s):
    '''A list of strings,
       (containing no newline characters)
       derived from a single new-line delimited string.'''
    return s.splitlines()
# from os.path import expanduser
# readFile :: FilePath -> IO String
def readFile(fp):
    '''The contents of any file at the path
       derived by expanding any ~ in fp.'''
    with open(expanduser(fp), 'r', encoding='utf-8') as f:
        return f.read()
# snd :: (a, b) -> b
def snd(tpl):
    '''Second member of a pair.'''
    return tpl[1]
# unlines :: [String] -> String
def unlines(xs):
    '''A single string derived by the intercalation
       of a list of strings with the newline character.'''
    return '\n'.join(xs)
# unwords :: [String] -> String
def unwords(xs):
    '''A space-separated string derived from
       a list of words.'''
    return ' '.join(xs)
# MAIN ---
if __name__ == '__main__':
    main()
- Output:
caret carte cater crate creat creta react recta trace angor argon goran grano groan nagor orang organ rogan ester estre reest reset steer stere stree terse tsere
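The saving from not using sorted as a key twice can be made visible by counting key computations. The sketch below assumes a tiny in-memory word list; the counter `calls` and the helper `letters` are hypothetical names introduced only for the illustration.

```python
from itertools import groupby

calls = {"n": 0}  # counts how often the key function runs

def letters(word):
    calls["n"] += 1
    return "".join(sorted(word))

words = ["evil", "lane", "live", "lean", "cat"]

# Key function passed to both sorted() and groupby().
calls["n"] = 0
twice = [list(g) for _, g in groupby(sorted(words, key=letters), key=letters)]
calls_twice = calls["n"]

# Key computed once per word and carried along in a (key, word) pair.
calls["n"] = 0
pairs = sorted((letters(w), w) for w in words)
once = [[w for _, w in g] for _, g in groupby(pairs, key=lambda p: p[0])]
calls_once = calls["n"]

print(calls_twice, calls_once)
```

Both forms produce the same groups for this sample; the first computes the key twice per word (once for sorting, once for grouping), the second once.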
QB64
$CHECKING:OFF
' Warning: Keep the above line commented out until you know your newly edited code works.
' You can NOT stop a program in mid run (using top right x button) with checking off.
'
_TITLE "Rosetta Code Anagrams: mod #7 Best times yet w/o memory techniques by bplus 2017-12-12"
' This program now below .4 secs for average time to do 100 loops compared to 92 secs for 1
' loop on my "dinosaur" when I first coded a successful run.
'
' Steve McNeil at QB64.net has +7000 loops per sec on his machine with help of using
' memory techniques. see page 3 @ http://www.qb64.net/forum/index.php?topic=14622.30
'
' Thanks Steve! I learned a lot and am NOW very motivated to learn memory techniques.
'
' This program has timings for 1 loop broken into sections currently commented out and another
' set of timings for multiple loop testing currently set, now at 100 tests for a sort of average.
' But average is misleading, the first test is usually always the longest and really only one test
' is necessary to get the results from a data file that does not change.
'
' Breaking code into logical sections and timing those can help spot trouble areas or the difference
' in a small or great change.
'
' Here is review of speed tips commented as they occur in code:
'
DEFINT A-Z 'there are 25,105 words in the unixdict.txt file so main array index
' and pointers in sort can all be integers.
' The letters from a word read in from the dictionary file (really just a word list in alpha order)
' are to be counted and coded into an alpha order sequence of letters:
' eg. eilv is the same code for words: evil, levi, live, veil, vile
' The longest word in the file had 22 letters, they are all lower case but there are other symbols
' in file like ' and digits we want to filter out.
TYPE wordData
code AS STRING * 22
theWord AS STRING * 22
END TYPE
' I originally was coding a word into the whole list (array) of letter counts as a string.
' Then realized I could drop all the zeros if I converted the numbers back to letters.
' I then attached THE word to the end of the coded word using ! to separate the 2 sections.
' That was a lot of manipulation with INSTR to find the ! separator and then MID$ to extract the
' code or THE word when I needed the value. All this extra manipulation ended by using TYPE with
' the code part and the word part sharing the same index. Learned from Steve's example!
' Pick the lowest number type needed to cover the problem
DIM SHARED w(25105) AS wordData ' the main array
DIM anagramSetsCount AS _BYTE ' the Rosetta Code Challenge was to find only the largest sets of Anagrams
DIM codeCount AS _BYTE ' counting number of words with same code
DIM wordIndex AS _BYTE
DIM wordLength AS _BYTE
DIM flag AS _BIT 'flag used as true or false
DIM letterCounts(1 TO 26) AS _BYTE 'stores letter counts for coding word
' b$ always stands for building a string.
' For long and strings, I am using the designated suffix
t1# = TIMER: loops = 100
FOR test = 1 TO loops
'reset these for multiple loop tests
indexTop = 0 'indexTop for main data array
anagramSetsCount = 0 'anagrams count if exceed 4 for any one code
anagramList$ = "" 'list of anagrams
'get the file data loaded in one pop, disk access is slow!
OPEN "unixdict.txt" FOR BINARY AS #1
' http://wiki.puzzlers.org/pub/wordlists/unixdict.txt
' note: when I downloaded this file line breaks were by chr$(10) only.
' Steve had coded for either chr$(13) + chr$(10) or just chr$(10)
fileLength& = LOF(1): buf$ = SPACE$(fileLength&)
GET #1, , buf$
CLOSE #1
' Getting the data into a big long string saved a lot of time as compared to
' reading from the file line by line.
'Process the file data by extracting the word from the long file string and then
'coding each word of interest, loading up the w() array.
filePosition& = 1
WHILE filePosition& < fileLength&
nextPosition& = INSTR(filePosition&, buf$, CHR$(10))
wd$ = MID$(buf$, filePosition&, nextPosition& - filePosition&)
wordLength = LEN(wd$)
IF wordLength > 2 THEN
'From Steve's example, changing from REDIM to ERASE saved an amazing amount of time!
ERASE letterCounts: flag = 0: wordIndex = 1
WHILE wordIndex <= wordLength
'From Steve's example, I was not aware of this version of ASC with MID$ built-in
ansciChar = ASC(wd$, wordIndex) - 96
IF 0 < ansciChar AND ansciChar < 27 THEN letterCounts(ansciChar) = letterCounts(ansciChar) + 1 ELSE flag = 1: EXIT WHILE
wordIndex = wordIndex + 1
WEND
'don't code and store a word unless all letters, no digits or apostrophes
IF flag = 0 THEN
b$ = "": wordIndex = 1
WHILE wordIndex < 27
IF letterCounts(wordIndex) THEN b$ = b$ + STRING$(letterCounts(wordIndex), CHR$(96 + wordIndex))
wordIndex = wordIndex + 1
WEND
indexTop = indexTop + 1
w(indexTop).code = b$
w(indexTop).theWord = wd$
END IF
END IF
IF nextPosition& THEN filePosition& = nextPosition& + 1 ELSE filePosition& = fileLength&
WEND
't2# = TIMER
'PRINT t2# - t1#; " secs to load word array."
'Sort using a recursive Quick Sort routine on the code key of wordData Type defined.
QSort 0, indexTop
't3# = TIMER
'PRINT t3# - t2#; " secs to sort array."
'Now find all the anagrams, word permutations, from the same word "code" that we sorted by.
flag = 0: j = 0
WHILE j < indexTop
'Does the sorted code key match the next one on the list?
IF w(j).code <> w(j + 1).code THEN ' not matched so stop counting and add to report
IF codeCount > 4 THEN ' only want the largest sets of anagrams 5 or more
anagramList$ = anagramList$ + b$ + CHR$(10)
anagramSetsCount = anagramSetsCount + 1
END IF
codeCount = 0: b$ = "": flag = 0
ELSEIF flag THEN ' match and match flag set so just add to count and build set
b$ = b$ + ", " + RTRIM$(w(j + 1).theWord)
codeCount = codeCount + 1
ELSE ' no flag means first match, start counting and building a new set
b$ = RTRIM$(w(j).theWord) + ", " + RTRIM$(w(j + 1).theWord)
codeCount = 2: flag = 1
END IF
j = j + 1
WEND
't4# = TIMER
'PRINT t4# - t3#; " secs to count matches from array."
NEXT
PRINT "Ave time per loop"; (TIMER - t1#) / loops; " secs, there were"; anagramSetsCount; " anagrams sets of 5 or more words."
PRINT anagramList$
'This sub modified for wordData Type, to sort by the .code key, the w() array is SHARED
SUB QSort (Start, Finish)
i = Start: j = Finish: x$ = w(INT((i + j) / 2)).code
WHILE i <= j
WHILE w(i).code < x$: i = i + 1: WEND
WHILE w(j).code > x$: j = j - 1: WEND
IF i <= j THEN
SWAP w(i), w(j)
i = i + 1: j = j - 1
END IF
WEND
IF j > Start THEN QSort Start, j
IF i < Finish THEN QSort i, Finish
END SUB
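The word "code" described in the comments of the first QB64 program (count each letter, then write the letters back out in alphabetical order, rejecting words that contain digits or apostrophes) can be sketched in Python as follows. The function name `letter_code` is hypothetical, and the sketch uses 0-based letter indices where the QB64 listing uses 1 to 26.

```python
def letter_code(word):
    """Return the alphabetically ordered letter code of an all-lowercase word.

    Returns None if the word contains anything other than a-z.
    """
    counts = [0] * 26
    for ch in word:
        idx = ord(ch) - 97          # 'a' -> 0 ... 'z' -> 25
        if not 0 <= idx < 26:
            return None             # digit, apostrophe, etc.: skip the word
        counts[idx] += 1
    # Rebuild the code: each letter repeated as often as it was counted.
    return "".join(chr(97 + i) * n for i, n in enumerate(counts))

print(letter_code("evil"), letter_code("live"), letter_code("don't"))
```

Counting letters gives the same code as sorting the characters, without running a sort on every word.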
2nd solution (by Steve McNeill):
$CHECKING:OFF
SCREEN _NEWIMAGE(640, 480, 32)
_DELAY .5
_SCREENMOVE _MIDDLE
DEFLNG A-Z
TYPE DataType
Word AS _UNSIGNED INTEGER
Value AS STRING * 26
END TYPE
REDIM Words(0 TO 30000) AS DataType
REDIM WordList(0 TO 30000) AS STRING * 25
DIM Anagrams(0 TO 30000, 0 TO 10) AS LONG
DIM EndLine AS STRING, Endlength AS LONG
IF INSTR(temp$, CHR$(13)) THEN EndLine = CHR$(13) + CHR$(10) ELSE EndLine = CHR$(10)
Endlength = LEN(EndLine)
DIM t AS _FLOAT 'high precision timer
DIM t1 AS _FLOAT
DIM letters(97 TO 122) AS _UNSIGNED _BYTE
DIM m1 AS _MEM, m2 AS _MEM, m3 AS _MEM
DIM a AS _UNSIGNED _BYTE
DIM matched(30000) AS _BYTE
m1 = _MEM(letters()): m2 = _MEM(Words()): m3 = _MEM(WordList())
blank$ = STRING$(26, 0)
t1 = TIMER
oldenter = 1
DO UNTIL TIMER - t1 > 1
t = t1
looper = looper + 1
OPEN "unixdict.txt" FOR BINARY AS #1
temp$ = SPACE$(LOF(1))
GET #1, 1, temp$ 'just grab the whole datafile from the drive in one swoop
CLOSE #1
'PRINT USING "##.###### seconds to load data from disk."; TIMER - t
t = TIMER
index = -1 'we want our first word to be indexed at 0, for ease of array/mem swappage
DO 'and parse it manually into our array
skip:
enter = INSTR(oldenter, temp$, EndLine)
IF enter THEN
l = enter - oldenter - 1
wd$ = MID$(temp$, oldenter, l)
oldenter = enter + Endlength
ELSE
wd$ = MID$(temp$, oldenter)
l = LEN(wd$)
END IF
_MEMPUT m1, m1.OFFSET, blank$ 'ERASE letters
j = 1
DO UNTIL j > l
a = ASC(wd$, j)
IF a < 97 OR a > 122 GOTO skip
letters(a) = letters(a) + 1 'and count them
j = j + 1
LOOP
index = index + 1
WordList(index) = wd$
Words(index).Word = index
_MEMCOPY m1, m1.OFFSET, 26 TO m2, m2.OFFSET + m2.ELEMENTSIZE * (index) + 2
LOOP UNTIL enter = 0
CLOSE #1
'PRINT USING "##.###### seconds to parse data into array."; TIMER - t
t = TIMER
combsort Words(), index
i = 1
DO UNTIL i > index
IF matched(i) = 0 THEN
count = 0
DO
count = count + 1
c = i + count
IF c > index THEN EXIT DO
IF _STRICMP(Words(i).Value, Words(c).Value) <> 0 THEN EXIT DO
Anagrams(anagram_count, count) = c
matched(c) = -1
LOOP
IF count > 1 THEN
Anagrams(anagram_count, 0) = i
Anagrams(anagram_count, 10) = count
i = c - 1
anagram_count = anagram_count + 1
END IF
END IF
i = i + 1
LOOP
t2## = TIMER
'PRINT USING "##.###### seconds to make matches."; t2## - t
'PRINT USING "##.###### total time from start to finish."; t2## - t1
'PRINT
LOOP
$CHECKING:ON
PRINT "LOOPER:"; looper; "executions from start to finish, in one second."
PRINT "Note, this is including disk access for new data each time."
PRINT
PRINT USING "#.################ seconds on average to run"; 1## / looper
INPUT "Anagram Pool Limit Size (Or larger) =>"; limit
IF limit < 1 THEN END
FOR i = 0 TO anagram_count - 1
v = Anagrams(i, 10)
IF v >= limit THEN
FOR j = 0 TO v
SELECT CASE j
CASE 0
CASE v: PRINT
CASE ELSE: PRINT ", ";
END SELECT
PRINT LEFT$(WordList(Words(Anagrams(i, j)).Word), INSTR(WordList(Words(Anagrams(i, j)).Word), " "));
NEXT
END IF
NEXT
END
SUB combsort (array() AS DataType, index AS LONG)
DIM gap AS LONG
'This is the routine I tend to use personally and promote.
'It's short, simple, and easy to implement into code.
gap = index
DO
gap = INT(gap / 1.247330925103979)
IF gap < 1 THEN gap = 1
i = 0
swapped = 0
DO
IF array(i).Value > array(i + gap).Value THEN
SWAP array(i), array(i + gap)
swapped = -1
END IF
i = i + 1
LOOP UNTIL i + gap > index
LOOP UNTIL gap = 1 AND swapped = 0
END SUB
- Output:
LOOPER: 7134 executions from start to finish, in one second.
Note, this is including disk access for new data each time.
0.000140138155313 seconds on average to run
Anagram Pool Limit Size (or larger) =>? 5
veil, levi, live, vile, evil
lane, neal, lean, lena, elan
alger, lager, large, glare, regal
glean, angel, galen, angle, lange
caret, trace, crate, carte, cater
bale, abel, able, elba, bela
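The `combsort` routine in the second QB64 program can be sketched in Python. This is a generic comb sort that assumes plain comparable items rather than the `DataType` records of the listing; only the shrink factor is taken from the QB64 code, and the name `comb_sort` is hypothetical.

```python
SHRINK = 1.247330925103979  # shrink factor taken from the QB64 listing

def comb_sort(items):
    """Sort a list in place with comb sort and return it."""
    gap = len(items)
    swapped = True
    # Keep going until the gap has shrunk to 1 and a full pass made no swap.
    while gap > 1 or swapped:
        gap = max(1, int(gap / SHRINK))
        swapped = False
        for i in range(len(items) - gap):
            if items[i] > items[i + gap]:
                items[i], items[i + gap] = items[i + gap], items[i]
                swapped = True
    return items

print(comb_sort(["eilv", "aeln", "act", "abel"]))
```

Once the gap reaches 1 the passes behave like bubble sort, which is what guarantees a fully sorted result.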
Quackery
$ "rosetta/unixdict.txt" sharefile drop nest$
[] swap witheach
[ dup sort
nested swap nested join
nested join ]
sortwith [ 0 peek swap 0 peek $< ]
dup
[ dup [] ' [ [ ] ] rot
witheach
[ tuck 0 peek swap 0 peek = if
[ tuck nested join swap ] ]
drop
dup [] != while
nip again ]
drop
witheach
[ over witheach
[ 2dup 0 peek swap 0 peek = iff
[ 1 peek echo$ sp ]
else drop ]
drop cr ]
drop
- Output:
abel able bale bela elba caret carte cater crate trace angel angle galen glean lange alger glare lager large regal elan lane lean lena neal evil levi live veil vile
R
words <- readLines("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
word_group <- sapply(
strsplit(words, split=""), # this will split all words to single letters...
function(x) paste(sort(x), collapse="") # ...which we sort and paste again
)
counts <- tapply(words, word_group, length) # group words by class to get number of anagrams
anagrams <- tapply(words, word_group, paste, collapse=", ") # group to get string with all anagrams
# Results
table(counts)
counts
1 2 3 4 5
22263 1111 155 31 6
anagrams[counts == max(counts)]
abel acert
"abel, able, bale, bela, elba" "caret, carte, cater, crate, trace"
aegln aeglr
"angel, angle, galen, glean, lange" "alger, glare, lager, large, regal"
aeln eilv
"elan, lane, lean, lena, neal" "evil, levi, live, veil, vile"
Racket
#lang racket
(require net/url)
(define (get-lines url-string)
(define port (get-pure-port (string->url url-string)))
(for/list ([l (in-lines port)]) l))
(define (hash-words words)
(for/fold ([ws-hash (hash)]) ([w words])
(hash-update ws-hash
(list->string (sort (string->list w) < #:key (λ (c) (char->integer c))))
(λ (ws) (cons w ws))
(λ () '()))))
(define (get-maxes h)
(define max-ws (apply max (map length (hash-values h))))
(define max-keys (filter (λ (k) (= (length (hash-ref h k)) max-ws)) (hash-keys h)))
(map (λ (k) (hash-ref h k)) max-keys))
(get-maxes (hash-words (get-lines "http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")))
- Output:
'(("neal" "lena" "lean" "lane" "elan") ("trace" "crate" "cater" "carte" "caret") ("regal" "large" "lager" "glare" "alger") ("elba" "bela" "bale" "able" "abel") ("lange" "glean" "galen" "angle" "angel") ("vile" "veil" "live" "levi" "evil"))
Raku
(formerly Perl 6)
my @anagrams = 'unixdict.txt'.IO.words.classify(*.comb.sort.join).values;
my $max = @anagrams».elems.max;
.put for @anagrams.grep(*.elems == $max);
- Output:
caret carte cater crate trace angel angle galen glean lange alger glare lager large regal elan lane lean lena neal evil levi live veil vile abel able bale bela elba
Just for the fun of it, here's a one-liner that uses no temporaries. Since it would be rather long, we've oriented it vertically:
.put for # print each element of the array made this way:
'unixdict.txt'.IO.words # load words from file
.classify(*.comb.sort.join) # group by common anagram
.classify(*.value.elems) # group by number of anagrams in a group
.max(*.key).value # get the group with highest number of anagrams
.map(*.value) # get all groups of anagrams in the group just selected
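The two-level classification used by the Raku one-liner (group words by sorted letters, then group those groups by their size and take the largest size) can be sketched in Python. The helper `classify` is a hypothetical stand-in for Raku's `.classify`, not a standard-library function.

```python
from collections import defaultdict

def classify(items, key):
    """Group items into a dict of lists, keyed by key(item)."""
    out = defaultdict(list)
    for item in items:
        out[key(item)].append(item)
    return out

def largest_sets(words):
    by_letters = classify(words, lambda w: "".join(sorted(w)))
    by_size = classify(by_letters.values(), len)   # size -> list of groups
    return by_size[max(by_size)] if by_size else []

print(largest_sets(["evil", "cat", "live", "act", "dog"]))
```

Grouping by size means the maximum is found by comparing a handful of distinct sizes instead of scanning every group again.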
RapidQ
dim x as integer, y as integer
dim SortX as integer
dim StrOutPut as string
dim Count as integer
dim MaxCount as integer
dim AnaList as QStringlist
dim wordlist as QStringlist
dim Templist as QStringlist
dim Charlist as Qstringlist
function sortChars(expr as string) as string
Charlist.clear
for SortX = 1 to len(expr)
Charlist.AddItems expr[SortX]
next
charlist.sort
result = Charlist.text - chr$(10) - chr$(13)
end function
'--- Start main code
wordlist.loadfromfile ("unixdict.txt")
'create anagram list
for x = 0 to wordlist.itemcount-1
AnaList.AddItems sortChars(wordlist.item(x))
next
'Filter largest anagram lists
analist.sort
MaxCount = 0
for x = 0 to AnaList.Itemcount-1
Count = 0
for y = x+1 to AnaList.Itemcount-1
if AnaList.item(y) = AnaList.item(x) then
inc(count)
else
if count > MaxCount then
Templist.clear
MaxCount = Count
Templist.AddItems AnaList.item(x)
elseif count = MaxCount then
Templist.AddItems AnaList.item(x)
end if
exit for
end if
next
next
'Now get the words
for x = 0 to Templist.Itemcount-1
for y = 0 to wordlist.Itemcount-1
if Templist.item(x) = sortChars(wordlist.item(y)) then
StrOutPut = StrOutPut + wordlist.item(y) + " "
end if
next
StrOutPut = StrOutPut + chr$(13) + chr$(10)
next
ShowMessage StrOutPut
End
- Output:
abel able bale bela elba caret carte cater crate trace angel angle galen glean lange alger glare lager large regal elan lane lean lena neal evil levi live veil vile
Rascal
import Prelude;
list[str] OrderedRep(str word){
return sort([word[i] | i <- [0..size(word)-1]]);
}
public list[set[str]] anagram(){
allwords = readFileLines(|http://wiki.puzzlers.org/pub/wordlists/unixdict.txt|);
AnagramMap = invert((word : OrderedRep(word) | word <- allwords));
longest = max([size(group) | group <- range(AnagramMap)]);
return [AnagramMap[rep]| rep <- AnagramMap, size(AnagramMap[rep]) == longest];
}
Returns:
value: [
{"glean","galen","lange","angle","angel"},
{"glare","lager","regal","large","alger"},
{"carte","trace","crate","caret","cater"},
{"lane","lena","lean","elan","neal"},
{"able","bale","abel","bela","elba"},
{"levi","live","vile","evil","veil"}
]
Red
Red []
m: make map! [] 25000
maxx: 0
foreach word read/lines http://wiki.puzzlers.org/pub/wordlists/unixdict.txt [
sword: sort copy word ;; sorted characters of word
either find m sword [
append m/:sword word
maxx: max maxx length? m/:sword
] [
put m sword append copy [] word
]
]
foreach v values-of m [ if maxx = length? v [print v] ]
- Output:
abel able bale bela elba
alger glare lager large regal
angel angle galen glean lange
caret carte cater crate trace
elan lane lean lena neal
evil levi live veil vile
REXX
version 1.1, idiomatic
This version doesn't assume that the dictionary is in alphabetical order, nor does it assume the
words are in any specific case (lower/upper/mixed).
/*REXX program finds words with the largest set of anagrams (of the same size). */
iFID= 'unixdict.txt' /*the dictionary input File IDentifier.*/
$=; !.=; ww=0; uw=0; most=0 /*initialize a bunch of REXX variables.*/
/* [↓] read the entire file (by lines)*/
do while lines(iFID) \== 0 /*Got any data? Then read a record. */
parse value linein(iFID) with @ . /*obtain a word from an input line. */
len=length(@); if len<3 then iterate /*onesies and twosies words can't win. */
if \datatype(@, 'M') then iterate /*ignore any non─anagramable words. */
uw=uw + 1 /*count of the (useable) words in file.*/
_=sortA(@) /*sort the letters in the word. */
!._=!._ @; #=words(!._) /*append it to !._; bump the counter. */
if #==most then $=$ _ /*append the sorted word──► max anagram*/
else if #>most then do; $=_; most=#; if len>ww then ww=len; end
end /*while*/ /*$ ◄── list of high count anagrams. */
say '─────────────────────────' uw "usable words in the dictionary file: " iFID
say
do m=1 for words($); z=subword($, m, 1) /*the high count of the anagrams. */
say ' ' left(word(!.z, 1), ww) ' [anagrams: ' subword(!.z, 2)"]"
end /*m*/ /*WW is the maximum width of any word.*/
say
say '───── Found' words($) "words (each of which have" words(!.z)-1 'anagrams).'
exit /*stick a fork in it, we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
sortA: arg char 2 xx,@. /*get the first letter of arg; @.=null*/
@.char=char /*no need to concatenate the first char*/
/*[↓] sort/put letters alphabetically.*/
do length(xx); parse var xx char 2 xx; @.char=@.char || char; end
/*reassemble word with sorted letters. */
return @.a || @.b || @.c || @.d || @.e || @.f||@.g||@.h||@.i||@.j||@.k||@.l||@.m||,
@.n || @.o || @.p || @.q || @.r || @.s||@.t||@.u||@.v||@.w||@.x||@.y||@.z
Programming note: the long (wide) return expression @.a||... could have been coded as a compact do loop instead of hardcoding all 26 letters,
but because the dictionary (word list) is rather large, the longer hardcoded form was used for speed.
- output when using the default input (dictionary):
───────────────────────── 24819 usable words in the dictionary file: unixdict.txt

 abel [anagrams: able bale bela elba]
 angel [anagrams: angle galen glean lange]
 elan [anagrams: lane lean lena neal]
 alger [anagrams: glare lager large regal]
 caret [anagrams: carte cater crate trace]
 evil [anagrams: levi live veil vile]

───── Found 6 words (each of which have 4 anagrams).
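For comparison, the approach described for version 1.1 (no reliance on dictionary order or letter case, skipping words shorter than three letters and anything that is not purely alphabetic, and tracking the largest group while reading) might be sketched in Python as follows. This is only an illustration, not a translation of the REXX program; the function name `largest_anagram_sets` and the `min_len` parameter are hypothetical.

```python
from collections import defaultdict

def largest_anagram_sets(words, min_len=3):
    """Group words by their sorted, uppercased letters; return the largest groups."""
    groups = defaultdict(list)   # sorted-letter key -> words, in input order
    best_keys = []               # keys whose group currently has the maximum size
    most = 0
    for word in words:
        # Mirror the REXX filters: skip short words and non-alphabetic entries.
        if len(word) < min_len or not (word.isascii() and word.isalpha()):
            continue
        key = "".join(sorted(word.upper()))   # case-insensitive key
        groups[key].append(word)
        size = len(groups[key])
        if size > most:                       # a new maximum: restart the list
            most, best_keys = size, [key]
        elif size == most:                    # a tie: remember this key as well
            best_keys.append(key)
    return [groups[key] for key in best_keys]

# Tiny made-up sample; the real task reads unixdict.txt instead.
sample = ["Evil", "abc", "live", "x1y", "vile", "cab", "bca", "no"]
for group in largest_anagram_sets(sample):
    print(" ".join(group))
```

Tracking the maximum inside the reading loop avoids a second pass over all the groups, which is the same idea as the `most` and `$` variables in the REXX code.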
version 1.2, optimized
This optimized version eliminates the sortA subroutine and puts that subroutine's code in-line.
/*REXX program finds words with the largest set of anagrams (of the same size). */
iFID= 'unixdict.txt' /*the dictionary input File IDentifier.*/
$=; !.=; ww=0; uw=0; most=0 /*initialize a bunch of REXX variables.*/
/* [↓] read the entire file (by lines)*/
do while lines(iFID) \== 0 /*Got any data? Then read a record. */
parse value linein(iFID) with @ . /*obtain a word from an input line. */
len=length(@); if len<3 then iterate /*onesies and twosies words can't win. */
if \datatype(@, 'M') then iterate /*ignore any non─anagramable words. */
uw=uw + 1 /*count of the (useable) words in file.*/
@.=; parse upper var @ xx /*clear the letter bins; uppercase a copy of the word.*/
do length(xx); parse var xx char 2 xx; @.char=@.char || char; end /*drop each letter into its bin.*/
/*reassemble word with sorted letters. */
_=@.a || @.b || @.c || @.d || @.e || @.f||@.g||@.h||@.i||@.j||@.k||@.l||@.m||,
@.n || @.o || @.p || @.q || @.r || @.s||@.t||@.u||@.v||@.w||@.x||@.y||@.z
!._=!._ @; #=words(!._) /*append it to !._; bump the counter. */
if #==most then $=$ _ /*append the sorted word──► max anagram*/
else if #>most then do; $=_; most=#; if len>ww then ww=len; end
end /*while*/ /*$ ◄── list of high count anagrams. */
say '─────────────────────────' uw "usable words in the dictionary file: " iFID
say
do m=1 for words($); z=subword($, m, 1) /*the high count of the anagrams. */
say ' ' left(word(!.z, 1), ww) ' [anagrams: ' subword(!.z, 2)"]"
end /*m*/ /*WW is the maximum width of any word.*/
say
say '───── Found' words($) "words (each of which have" words(!.z)-1 'anagrams).'
exit /*stick a fork in it, we're all done. */
- output is the same as REXX version 1.1
Programming note: the above REXX programs adopted the method that REXX version 2 uses for extracting each character of a word.
The method is more obscure, but it is faster, which matters when the routine is invoked tens of thousands of times.
annotated version using PARSE
(This algorithm actually utilizes a bin sort, one bin for each Latin letter.)
u= 'Halloween' /*word to be sorted by (Latin) letter.*/
upper u /*fast method to uppercase a variable. */
/*another: u = translate(u) */
/*another: parse upper var u u */
/*another: u = upper(u) */
/*not always available [↑] */
say 'u=' u
_.=
do until u=='' /*keep truckin' until U is null. */
parse var u y +1 u /*get the next (first) character in U.*/
xx='?'y /*assign a prefixed character to XX. */
_.xx=_.xx || y /*append it to all the Y characters. */
end /*until*/ /*U now has the first character elided.*/
/*Note: the variable U is destroyed.*/
/* [↓] constructs a sorted letter word*/
z=_.?a||_.?b||_.?c||_.?d||_.?e||_.?f||_.?g||_.?h||_.?i||_.?j||_.?k||_.?l||_.?m||,
_.?n||_.?o||_.?p||_.?q||_.?r||_.?s||_.?t||_.?u||_.?v||_.?w||_.?x||_.?y||_.?z
/*Note: the ? is prefixed to the letter to avoid */
/*collisions with other REXX one-character variables.*/
say 'z=' z
- output:
u= HALLOWEEN
z= AEEHLLNOW
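The bin sort noted above (one bin per Latin letter) can also be sketched outside REXX. The Python version below is an illustration only: the function name `bin_sort_letters` is hypothetical, and it assumes that characters outside A to Z are simply dropped, which matches the REXX code in that only the 26 letter bins are ever concatenated.

```python
import string

def bin_sort_letters(word):
    """Return the letters of `word`, uppercased and ordered via one bin per Latin letter."""
    bins = {letter: "" for letter in string.ascii_uppercase}   # 26 empty bins
    for ch in word.upper():
        if ch in bins:          # assumption: characters outside A-Z are ignored
            bins[ch] += ch      # append the letter to its own bin
    # Concatenate the bins in alphabetical order to rebuild the sorted word.
    return "".join(bins[letter] for letter in string.ascii_uppercase)

print(bin_sort_letters("Halloween"))   # prints AEEHLLNOW
```

Here the final concatenation is written as a loop over the alphabet rather than as 26 hardcoded terms; the REXX programs hardcode the terms for speed.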
annotated version using a DO loop
u= 'Halloween' /*word to be sorted by (Latin) letter.*/
upper u /*fast method to uppercase a variable. */
L=length(u) /*get the length of the word (in bytes)*/
say 'u=' u
say 'L=' L
_.=
do k=1 for L /*keep truckin' for L characters. */
parse var u =(k) y +1 /*get the Kth character in U string.*/
xx='?'y /*assign a prefixed character to XX. */
_.xx=_.xx || y /*append it to all the Y characters. */
end /*do k*/ /*U is left intact by this method. */
/* [↓] construct a sorted letter word.*/
z=_.?a||_.?b||_.?c||_.?d||_.?e||_.?f||_.?g||_.?h||_.?i||_.?j||_.?k||_.?l||_.?m||,
_.?n||_.?o||_.?p||_.?q||_.?r||_.?s||_.?t||_.?u||_.?v||_.?w||_.?x||_.?y||_.?z
say 'z=' z
- output:
u= HALLOWEEN
L= 9
z= AEEHLLNOW
version 2
/*REXX program finds words with the largest set of anagrams (same size)
* 07.08.2013 Walter Pachl
* sorta for word compression courtesy Gerard Schildberger,
* modified, however, to obey lowercase
* 10.08.2013 Walter Pachl take care of mixed case dictionary
* following Version 1's method
**********************************************************************/
Parse Value 'A B C D E F G H I J K L M N O P Q R S T U V W X Y Z',
With a b c d e f g h i j k l m n o p q r s t u v w x y z
Call time 'R'
ifid='unixdict.txt' /* input file identifier */
words=0 /* number of usable words */
maxl=0 /* maximum number of anagrams */
wl.='' /* wl.ws words that have ws */
Do ri=1 By 1 While lines(ifid)\==0 /* read each word in file */
word=space(linein(ifid),0) /* pick off a word from the input.*/
If length(word)<3 Then /* onesies and twosies can't win. */
Iterate
If\datatype(word,'M') Then /* not an anagramable word */
Iterate
words=words+1 /* count of (useable) words. */
ws=sorta(word) /* sort the letters in the word. */
wl.ws=wl.ws word /* add word to list of ws */
wln=words(wl.ws) /* number of anagrams with ws */
Select
When wln>maxl Then Do /* a new maximum */
maxl=wln /* use this */
wsl=ws /* list of resulting ws values */
End
When wln=maxl Then /* same as the one found */
wsl=wsl ws /* add ws to the list */
Otherwise /* shorter */
Nop /* not yet of interest */
End
End
Say ' '
Say copies('-',10) ri-1 'words in the dictionary file: ' ifid
Say copies(' ',10) words 'thereof are anagram candidates'
Say ' '
Say 'There are' words(wsl) 'set(s) of anagrams with' maxl,
'elements each:'
Say ' '
Do while wsl<>''
Parse Var wsl ws wsl
Say ' 'wl.ws
End
Say time('E')
Exit
sorta:
/**********************************************************************
* sort the characters in word_p (lowercase translated to uppercase)
* 'chARa' -> 'AACHR'
**********************************************************************/
Parse Upper Arg word_p
c.=''
Do While word_p>''
Parse Var word_p cc +1 word_p
c.cc=c.cc||cc
End
Return c.a||c.b||c.c||c.d||c.e||c.f||c.g||c.h||c.i||c.j||c.k||c.l||,
c.m||c.n||c.o||c.p||c.q||c.r||c.s||c.t||c.u||c.v||c.w||c.x||c.y||c.z
- Output:
---------- 25108 words in the dictionary file: unixdict.txt
           24819 thereof are anagram candidates

There are 6 set(s) of anagrams with 5 elements each:

 abel able bale bela elba
 angel angle galen glean lange
 elan lane lean lena neal
 alger glare lager large regal
 caret carte cater crate trace
 evil levi live veil vile
1.170000
Ring
# Project : Anagrams
load "stdlib.ring"
fn1 = "unixdict.txt"
fp = fopen(fn1,"r")
str = fread(fp, getFileSize(fp))
fclose(fp)
strlist = str2list(str)
anagram = newlist(len(strlist), 5)
anag = list(len(strlist))
result = list(len(strlist))
for x = 1 to len(result)
result[x] = 0
next
for x = 1 to len(anag)
anag[x] = 0
next
for x = 1 to len(anagram)
for y = 1 to 5
anagram[x][y] = 0
next
next
for n = 1 to len(strlist)
for m = 1 to len(strlist)
sum = 0
if len(strlist[n]) = 4 and len(strlist[m]) = 4 and n != m
for p = 1 to len(strlist[m])
temp1 = count(strlist[n], strlist[m][p])
temp2 = count(strlist[m], strlist[m][p])
if temp1 = temp2
sum = sum + 1
ok
next
if sum = 4
anag[n] = anag[n] + 1
if anag[n] < 6 and result[n] = 0 and result[m] = 0
anagram[n][anag[n]] = strlist[m]
result[m] = 1
ok
ok
ok
next
if anag[n] > 0
result[n] = 1
ok
next
for n = 1 to len(anagram)
flag = 0
for m = 1 to 5
if anagram[n][m] != 0
if m = 1
see strlist[n] + " "
flag = 1
ok
see anagram[n][m] + " "
ok
next
if flag = 1
see nl
ok
next
func getFileSize fp
c_filestart = 0
c_fileend = 2
fseek(fp,0,c_fileend)
nfilesize = ftell(fp)
fseek(fp,0,c_filestart)
return nfilesize
func count(astring,bstring)
cnt = 0
while substr(astring,bstring) > 0
cnt = cnt + 1
astring = substr(astring,substr(astring,bstring)+len(string(sum)))
end
return cnt
Output:
abbe babe
abed bade bead
abel able bale bela
abet bate beat beta
alai alia
alex axle
bail bali
bake beak
bane bean
bard brad
bare bear brae
barn bran
beam bema
blot bolt
blow bowl
blur burl
body boyd
Ruby
require 'open-uri'
anagram = Hash.new {|hash, key| hash[key] = []} # map sorted chars to anagrams
URI.open('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt') do |f|
words = f.read.split
for word in words
anagram[word.split('').sort] << word
end
end
count = anagram.values.map {|ana| ana.length}.max
anagram.each_value do |ana|
if ana.length >= count
p ana
end
end
- Output:
["evil", "levi", "live", "veil", "vile"]
["abel", "able", "bale", "bela", "elba"]
["elan", "lane", "lean", "lena", "neal"]
["alger", "glare", "lager", "large", "regal"]
["angel", "angle", "galen", "glean", "lange"]
["caret", "carte", "cater", "crate", "trace"]
Short version (with lexical ordered result).
require 'open-uri'
anagrams = URI.open('http://wiki.puzzlers.org/pub/wordlists/unixdict.txt'){|f| f.read.split.group_by{|w| w.each_char.sort} }
anagrams.values.group_by(&:size).max.last.each{|group| puts group.join(", ") }
- Output:
abel, able, bale, bela, elba
alger, glare, lager, large, regal
angel, angle, galen, glean, lange
caret, carte, cater, crate, trace
elan, lane, lean, lena, neal
evil, levi, live, veil, vile
Run BASIC
sqliteconnect #mem, ":memory:"
mem$ = "CREATE TABLE anti(gram,ordr);
CREATE INDEX ord ON anti(ordr)"
#mem execute(mem$)
' read the file
a$ = httpGet$("http://wiki.puzzlers.org/pub/wordlists/unixdict.txt")
' break the file words apart
i = 1
while i <> 0
j = instr(a$,chr$(10),i+1)
if j = 0 then exit while
a1$ = mid$(a$,i,j-i)
q = instr(a1$,"'")
if q > 0 then a1$ = left$(a1$,q) + mid$(a1$,q)
ln = len(a1$)
s$ = a1$
' Split the characters of the word and sort them
s = 1
while s = 1
s = 0
for k = 1 to ln -1
if mid$(s$,k,1) > mid$(s$,k+1,1) then
h$ = mid$(s$,k,1)
h1$ = mid$(s$,k+1,1)
s$ = left$(s$,k-1) + h1$ + h$ + mid$(s$,k+2)
s = 1
end if
next k
wend
mem$ = "INSERT INTO anti VALUES('";a1$;"','";s$;"')" ' s$ holds the sorted letters of the word
#mem execute(mem$)
i = j +1
wend
' find all antigrams
mem$ = "SELECT count(*) as cnt,anti.ordr FROM anti GROUP BY ordr ORDER BY cnt desc"
#mem execute(mem$)
numDups = #mem ROWCOUNT() 'Get the number of rows
dim dups$(numDups)
for i = 1 to numDups
#row = #mem #nextrow()
cnt = #row cnt()
if i = 1 then maxCnt = cnt
if cnt < maxCnt then exit for
dups$(i) = #row ordr$()
next i
for i = 1 to i -1
mem$ = "SELECT anti.gram FROM anti
WHERE anti.ordr = '";dups$(i);"'
ORDER BY anti.gram"
#mem execute(mem$)
rows = #mem ROWCOUNT() 'Get the number of rows
for ii = 1 to rows
#row = #mem #nextrow()
gram$ = #row gram$()
print gram$;chr$(9);
next ii
print
next i
end
- Output:
abel able bale bela elba
caret carte cater crate trace
angel angle galen glean lange
alger glare lager large regal
elan lane lean lena neal
evil levi live veil vile