Anagram generator: Difference between revisions

From Rosetta Code
m (→‎{{header|J}}: bugfix (would have not rejected a non-anagram which only differed by case))
(Added Wren)
Revision as of 16:54, 7 July 2022

Anagram generator is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

There are already other tasks relating to finding existing anagrams. This one is about creating them.

Write a (set of) routine(s) that, when given a word list to work from, and a word or phrase as a seed, generates anagrams of that word or phrase. Feel free to ignore letter case, white-space, punctuation and symbols. Probably best to avoid numerics too, but feel free to include them if that floats your boat.
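
As a rough illustration of the normalisation described above (not part of the task text), the following Python sketch reduces a seed to a multiset of its letters and compares two phrases on that basis. The function names `letter_counts` and `is_anagram` are hypothetical, chosen only for this example.

```python
from collections import Counter

def letter_counts(text):
    """Count letters only, ignoring case, white-space, punctuation, symbols and digits."""
    return Counter(ch for ch in text.lower() if ch.isalpha())

def is_anagram(seed, candidate):
    """True if candidate uses exactly the same letters as seed (hypothetical helper)."""
    return letter_counts(seed) == letter_counts(candidate)

print(is_anagram("Clint Eastwood", "I contest waldo"))
print(is_anagram("Purefox", "Fur expos"))
```

Comparing letter counts rather than sorted strings makes it easy to reuse the same multiset later when checking whether a dictionary word fits inside the letters still available.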

It is not necessary to (only) generate anagrams that make sense. That is a hard problem, much more difficult than can realistically be done in a small program; though again, if you feel the need, you are invited to amaze your peers.

In general, try to form phrases made up of longer words. Feel free to manually reorder output words or add punctuation and/or case changes to get a better meaning.


Task

Write an anagram generator program.

Use a publicly and freely available word file as its word list.

unixdict.txt from http://wiki.puzzlers.org is a popular, though somewhat limited choice.
A much larger word list is the words_alpha.txt file from https://github.com/dwyl/english-words. It may offer better coverage but may return unreasonably large results.

Use your program to generate anagrams of some words / phrases / names of your choice. No need to show all the output. It is likely to be very large. Just pick out one or two of the best results and show the seed word/phrase and anagram.

For example, show the seed and one or two of the best anagrams:

Purefox -> Fur expo
Petelomax -> Metal expo

.oO(hmmm. Seem to be detecting something of a trend here...)


J

Implementation:

<lang J>anagen=: {{

 seed=. (tolower y)([-.-.)a.{~97+i.26
 letters=. ~.seed
 list=. <;._2 tolower fread x
 ok1=. */@e.&letters every list
 ref=. #/.~seed
 counts=. <: #/.~@(letters,])every ok1#list
 ok2=. counts */ .<:ref
 c=. ok2#counts
 maybe=. i.1,~#c
 while. #maybe do.
   done=. (+/"2 maybe{c)*/ .=ref
   if. 1 e. done do.
     r=. ;:inv ((done#maybe) { ok2#I.ok1){L:0 1 <;._2 fread x
     if. #r=. r #~ -. r -:"1&tolower y do. r return. end.
   end.
   maybe=. ; c {{
     <(#~ n */ .<:"1~ [: +/"2 {&m) y,"1 0 ({:y)}.i.#m
   }} ref"1(-.done)#maybe
 end.
 EMPTY

}}</lang>

Examples:

<lang J>   'unixdict.txt' anagen 'Rosettacode'
cetera stood
coat oersted
coda rosette
code rosetta
coed rosetta
create stood
creosote tad
derate scoot
detector sao
doctor tease
doctorate se
ostracod tee

   'unixdict.txt' anagen 'Thundergnat'
dragnet hunt
gannett hurd
ghent tundra
gnat thunder
hurd tangent
tang thunder

   'unixdict.txt' anagen 'Clint Eastwood'
atwood stencil
clio downstate
coil downstate
downcast eliot
downstate loci
edison walcott</lang>

Raku

Using the unixdict.txt word file by default.

<lang perl6>unit sub MAIN ($in is copy = '', :$dict = 'unixdict.txt');

say 'Enter a word or phrase to be anagramed. (Loading dictionary)' unless $in.chars;

# Load the words into a word / Bag hash
my %words = $dict.IO.slurp.lc.words.race.map: { .comb(/\w/).join => .comb(/\w/).Bag };

# Declare some globals
my ($phrase, $count, $bag);

loop {
    ($phrase, $count, $bag) = get-phrase;
    find-anagram Hash.new: %words.grep: { .value ⊆ $bag };
}

sub get-phrase {
    my $prompt = $in.chars ?? $in !! prompt "\nword or phrase? (press Enter to quit) ";
    $in = '';
    exit unless $prompt;
    $prompt,
    +$prompt.comb(/\w/),
    $prompt.lc.comb(/\w/).Bag;
}

sub find-anagram (%subset, $phrase is copy = '', $last = Inf) {
    my $remain = $bag ∖ $phrase.comb(/\w/).Bag;        # Find the remaining letters
    my %filtered = %subset.grep: { .value ⊆ $remain }; # Find words using the remaining letters
    my $sofar = +$phrase.comb(/\w/);                   # Get the count of the letters used so far
    for %filtered.sort: { -.key.chars, ~.key } {       # Sort by length then alphabetically then iterate
        my $maybe = +.key.comb(/\w/);                  # Get the letter count of the maybe addition
        next if $maybe > $last;                        # Next if it is longer than last - only consider descending length words
        next if $maybe == 1 and $last == 1;            # Only allow one one character word
        next if $count - $sofar - $maybe > $maybe;     # Try to balance word lengths
        if $sofar + $maybe == $count {                 # It's an anagram
            say $phrase ~ ' ' ~ .key and next;         # Display it and move on
        } else {                                       # Not yet a full anagram, recurse
            find-anagram %filtered, $phrase ~ ' ' ~ .key, $maybe;
        }
    }
}</lang>

Truncated to only show the best few as subjectively determined by me:

Punctuation, capitalization and (in some cases) word order manually massaged.

Enter a word or phrase to be anagramed. (Loading dictionary)

word or phrase? (press Enter to quit) Rosettacode
doctor tease

word or phrase? (press Enter to quit) thundergnat
dragnet hunt
Gent? Nah, turd.

word or phrase? (press Enter to quit) Clint Eastwood
downcast eliot
I contest waldo
nose to wildcat
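
For readers less familiar with Raku's Bag operators, the general idea in the comments of the Raku entry (keep only words whose letters fit in what remains, then recurse on the remainder) might be sketched in Python roughly as follows. This is a simplified illustration and not a translation: it omits the word-length balancing rules, all function names are hypothetical, and the tiny word list is made up for the example rather than read from unixdict.txt.

```python
from collections import Counter

def bag(text):
    """Multiset of the letters in text, ignoring case and non-letters."""
    return Counter(ch for ch in text.lower() if ch.isalpha())

def fits(word_bag, remaining):
    """Multiset inclusion: every letter the word needs is still available."""
    return all(remaining[ch] >= n for ch, n in word_bag.items())

def find_anagrams(words, seed):
    target = bag(seed)
    # Keep only words that fit in the seed; longest first, then alphabetical.
    candidates = sorted((w for w in words if bag(w) and fits(bag(w), target)),
                        key=lambda w: (-len(w), w))
    results = []

    def search(remaining, pool, chosen):
        if not remaining:                      # every letter has been used
            results.append(" ".join(chosen))
            return
        usable = [w for w in pool if fits(bag(w), remaining)]
        for i, w in enumerate(usable):
            # Passing usable[i:] keeps word lengths non-increasing and
            # avoids emitting the same set of words in a different order.
            search(remaining - bag(w), usable[i:], chosen + [w])

    search(target, candidates, [])
    return results

# Illustrative word list only (an assumption for this sketch).
sample = ["dragnet", "hunt", "ghent", "tundra", "gnat", "thunder", "rat", "zebra"]
print(find_anagrams(sample, "Thundergnat"))
```

Shrinking the candidate pool at every level is what keeps this kind of search tractable: a word that does not fit the remaining letters can never fit deeper in the recursion either.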

Wren

Library: Wren-str
Library: Wren-perm
Library: Wren-sort

Although reasonably thorough (at least for producing two-word anagrams), this is none too quick when there are more than 9 letters to juggle. Hence the need to impose a limit on the number of anagrams produced.

<lang ecmascript>import "io" for File
import "./str" for Str, Char
import "./perm" for Perm
import "./sort" for Find

var wordList = "unixdict.txt" // local copy
var words = File.read(wordList).trimEnd().split("\n").toList

var anagramGenerator = Fn.new { |text, limit|
    var letters = Str.lower(text).toList
    // remove any non-letters
    for (i in letters.count-1..0) {
        if (!Char.isLetter(letters[i])) letters.removeAt(i)
    }
    if (letters.count < 4) return
    var h = (letters.count/2).floor
    var count = 0
    var tried = {}
    for (n in h..2) {
        for (perm in Perm.list(letters)) {
            var letters1 = perm[0...n]
            for (perm2 in Perm.list(letters1)) {
                var word1 = perm2.join()
                if (tried[word1]) continue
                tried[word1] = true
                if (Find.first(words, word1) >= 0) {
                    var letters2 = perm[n..-1]
                    for (perm3 in Perm.list(letters2)) {
                        var word2 = perm3.join()
                        if (tried[word2]) continue
                        tried[word2] = true
                        if (Find.first(words, word2) >= 0) {
                            System.print("  " + word1 + " " + word2)
                            count = count + 1
                            if (count == limit) return
                        }
                    }
                }
            }
        }
    }
}

var tests = ["Rosetta", "PureFox", "Petelomax", "Wherrera", "Thundergnat"]
var limits = [10, 10, 10, 10, 1]
for (i in 0...tests.count) {
    System.print("\n%(tests[i])(<=%(limits[i])):")
    anagramGenerator.call(tests[i], limits[i])
}</lang>

Output:
Rosetta(<=10):
  rot east
  rot seat
  oar test
  ret taos
  toe star
  toe tsar
  ott sera
  ott sear
  ott ares
  oat rest

PureFox(<=10):
  fox peru
  fox pure

Petelomax(<=10):
  poem latex
  poem exalt
  apex motel
  alex tempo
  moat expel
  pax omelet
  lao exempt
  to example

Wherrera(<=10):
  wehr rear
  wehr rare
  wear herr

Thundergnat(<=1):
  ghent tundra
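
To give a feel for why the Wren entry slows down beyond about 9 letters, the Python sketch below counts the orderings of each seed's letters. It assumes that the outer `Perm.list(letters)` call enumerates all n! orderings of n letters, which is an assumption about the library rather than something stated in the entry, and it ignores the extra work done by the inner loops. The helper name `permutation_count` is hypothetical.

```python
from math import factorial

def permutation_count(text):
    """Number of orderings of the letters in text (repeated letters counted separately)."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    return factorial(len(letters))

for seed in ["Rosetta", "Petelomax", "Thundergnat"]:
    print(seed, len(seed), permutation_count(seed))
```

Under that assumption, going from 9 letters to 11 multiplies the outer loop's work by 110 (362,880 versus 39,916,800 orderings), which is consistent with the limit of a single result used for the longest test word.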