
Talk:Word wheel



It seemed logical to me to have the Python entry check each word of the dictionary against the grid constraints. The Julia entry seems to generate all possible conforming strings from the grid characters, then checks whether they are in the dictionary.
Nice. --Paddy3118 (talk) 15:30, 4 July 2020 (UTC)
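The "check each dictionary word" approach can be sketched as follows. This is an illustrative sketch, not any entry on the task page; the grid and centre letter ("satrelion" with centre "e") are taken from the discussion further down. A word fits the wheel if it is long enough, contains the centre letter, and uses no letter more often than the grid supplies it:

```python
from collections import Counter

# Grid and centre letter from the discussion below (illustrative only).
GRID, CENTRE = "satrelion", "e"
GRID_COUNTS = Counter(GRID)

def fits_wheel(word, grid_counts=GRID_COUNTS, centre=CENTRE, min_len=3):
    """True if the word satisfies the word-wheel constraints:
    length, mandatory centre letter, and per-letter counts bounded
    by the grid's letter counts."""
    if not (min_len <= len(word) <= sum(grid_counts.values())):
        return False
    if centre not in word:
        return False
    word_counts = Counter(word)
    return all(word_counts[c] <= grid_counts[c] for c in word_counts)
```

Filtering the whole dictionary is then a single pass: `[w for w in words if fits_wheel(w)]`.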

It seems a waste of memory and CPU time to generate all possible conforming strings instead of writing some simple filters that validate the words against the Rosetta Code task and grid (word wheel) constraints.   The REXX solution consumes almost all of its   (smallish, sub-second)   CPU time just reading in the dictionary.   The filters that I used for the REXX programming solution eliminate over half of the   25,105   words in the UNIXDICT file;   about  1/4  of the filtering time was spent detecting duplicate words   (there are none, however).   A very small fraction of that is used to validate that each letter (by count) is represented in the grid.   I wonder what the CPU consumption would be if the number of words (entries) in the dictionary were an order of magnitude larger.   My "personal" dictionary that I built has over   915,000   words in it.     -- Gerard Schildberger (talk) 21:34, 4 July 2020 (UTC)
Hi Gerard, no waste of memory in the Julia case, as nested loops are used to generate word candidates one at a time, each of which is then quickly checked against the set of dictionary words. That is probably hundreds of thousands of lookups, which is fine for today's laptops. As for dictionary size, the task specifies a particular dictionary to use; going so far outside of that may be of interest, but is outside the task boundary.
Cheers, --Paddy3118 (talk) 03:32, 5 July 2020 (UTC)
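The generate-then-look-up approach described above can be sketched in Python (a hedged sketch, not the actual Julia entry): permutations of the grid letters automatically respect the per-letter counts, are produced one at a time (so little memory is held), and each candidate costs only a set lookup. Lengths 3 through 9 over a 9-letter grid give roughly 986,000 candidates, which matches the "hundreds of thousands of lookups" estimate.

```python
from itertools import permutations

def matches_by_generation(grid, centre, dictionary, lengths=range(3, 10)):
    """Generate conforming strings from the grid letters one at a
    time and keep those that contain the centre letter and appear
    in the dictionary (a set, for O(1) lookups)."""
    found = set()
    for n in lengths:
        for letters in permutations(grid, n):
            word = "".join(letters)
            if centre in word and word in dictionary:
                found.add(word)
    return found
```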
I know it appears that this Rosetta Code task specifies a particular dictionary to use, but the very next sentence says   If you prefer to use a different dictionary ...,   so that more-or-less nullifies the requirement that a particular dictionary be used;   although I know that using one (directed) dictionary   (as per a task requirement)   helps to compare one's output to other computer programming language examples.   I do wish that a fuller dictionary were used, one which has duplicate words in it and/or words that are capitalized differently   (us   and   US),   to better reflect a "real-world" dictionary   (and expose problems with programming code that makes/takes/assumes shortcuts).   But, alas, it is what it is, and if I were to use (say) "my" dictionary of over   915,000   words,   it wouldn't lend itself to comparing the output to anyone else's output, although I did mention the results in the REXX output section in prose form.     -- Gerard Schildberger (talk) 13:45, 5 July 2020 (UTC)
Just ran the larger dictionary: it's 12x the size of the standard dictionary and runs in 15x the time using the Python code. (There is a lot of "cruft" padding out that larger dictionary, from the look of the first 100 words.) --Paddy3118 (talk) 10:18, 5 July 2020 (UTC)

more words with a different grid

I was wondering what the largest number of words   (when using the suggested dictionary)   would be with a different grid?

The 2nd REXX example uses what I thought might be one of the better grids,   but I only tried a few.

The grid that was used is:   satRELion,     with   E   being the center letter in the grid,   yielding   212   words.     -- Gerard Schildberger (talk) 14:54, 5 July 2020 (UTC)

Replace 'o' with 'p' for 234 words. --Paddy3118 (talk) 18:18, 5 July 2020 (UTC)
Thanks!!     I'll update the REXX (2nd) output.     -- Gerard Schildberger (talk) 18:33, 5 July 2020 (UTC)
The 9 most common letters in the dictionary are 'eratnilso', in order. I tried replacing the least common, 'o', with any other letter from a..z and found the 'p'. Later I decided to replace the least common three characters, 'lso', with any combination of three characters from a..z, which still only found the same replacement of 'o' by 'p'. --Paddy3118 (talk) 19:02, 5 July 2020 (UTC)
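The single-letter search described above can be sketched like this (assumed helper names, not Paddy3118's actual code): score a grid by counting the dictionary words it yields, then try substituting one chosen letter with each of a..z and keep the best scorer.

```python
from collections import Counter
from string import ascii_lowercase

def count_matches(grid, centre, words):
    """Count the dictionary words that fit the word-wheel constraints."""
    gc = Counter(grid)
    total = 0
    for w in words:
        if 3 <= len(w) <= len(grid) and centre in w:
            wc = Counter(w)
            if all(wc[c] <= gc[c] for c in wc):
                total += 1
    return total

def best_single_swap(grid, centre, swap_out, words):
    """Replace one occurrence of swap_out with each letter a..z and
    return (word_count, grid) for the best-scoring grid found."""
    best = (count_matches(grid, centre, words), grid)
    for c in ascii_lowercase:
        candidate = grid.replace(swap_out, c, 1)
        best = max(best, (count_matches(candidate, centre, words), candidate))
    return best
```

Replacing three letters at once, as tried above, is the same idea with `itertools.product(ascii_lowercase, repeat=3)` over the candidate triples.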
With unixdict.txt you can get 248 words from "setralinp" (i.e. same letters but with central letter "a"). Word wheel puzzles usually have the additional constraint that there is at least one nine-letter word to be found, in which case the best result with unixdict.txt is 215 words, which you can get from "spearmint" with "a" as the central letter. Using the dictionary "words_alpha.txt" from https://github.com/dwyl/english-words, "setralinp"/"a" generates 1033 words of 3 or more letters, including two nine-letter ones. Perhaps I'll add an "extra credit" part to this task along these lines. -- Simonjsaunders (talk) 12:38, 24 July 2020 (UTC)
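The nine-letter constraint mentioned above has a neat consequence: on a nine-letter wheel, a nine-letter answer must use every grid letter exactly once, i.e. it must be an anagram of the whole grid (and so it necessarily contains the centre letter). A hedged sketch of that extra check:

```python
def has_nine_letter_word(grid, words):
    """True if the word list contains an anagram of the full grid,
    i.e. a valid answer using every grid letter exactly once."""
    target = sorted(grid)
    return any(len(w) == len(grid) and sorted(w) == target for w in words)
```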