It seemed logical to me to have the Python entry check each word of the dictionary against the grid constraints. The Julia entry seems to generate all possible conforming strings from the grid characters and then check whether they are in the dictionary.
Nice. --Paddy3118 (talk) 15:30, 4 July 2020 (UTC)
- It seems a waste of memory and CPU time to generate all possible conforming strings instead of writing some simple filters that validate the words against the Rosetta Code task and grid (word wheel) constraints. The REXX solution consumes almost all of its (smallish, sub-second) CPU time just reading in the dictionary. The filters that I used for the REXX programming solution eliminate over half of the 25,105 words in the UNIXDICT file. About 1/4 of the filtering time was spent detecting duplicate words (there are none, however). A very small fraction of that is used to validate that each letter (by count) is represented in the grid. I wonder what the CPU consumption would be if the number of words (entries) in the dictionary were an order of magnitude larger. My "personal" dictionary that I built has over 915,000 words in it. -- Gerard Schildberger (talk) 21:34, 4 July 2020 (UTC)
- Hi Gerard, there is no waste of memory in the Julia case, as nested loops are used to generate word candidates one at a time, and each candidate is then quickly checked against the set of dictionary words. That is probably hundreds of thousands of lookups, which is OK for today's laptops. As for dictionary size, the task specifies a particular dictionary to use; going so far outside of that may be of interest, but it is outside the task boundary.
- Cheers, --Paddy3118 (talk) 03:32, 5 July 2020 (UTC)
- I know it appears that this Rosetta Code task specifies a particular dictionary to use, but the very next sentence says "If you prefer to use a different dictionary ...", which more or less nullifies the requirement that a particular dictionary be used; although I know that using one (directed) dictionary (as per a task requirement) helps to compare one's output to other computer programming language examples. I do wish that a fuller dictionary were used, one which has duplicate words in it and/or words that are capitalized differently (us and US), to better reflect a "real-world" dictionary (and expose problems with programming code that makes/takes/assumes shortcuts). But, alas, it is what it is, and if I were to use (say) "my" dictionary of over 915,000 words, it wouldn't lend itself to comparing the output to anyone else's output, although I did mention the results in the REXX output section in prose form. -- Gerard Schildberger (talk) 13:45, 5 July 2020 (UTC)
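For readers following the thread, the dictionary-filtering approach discussed above can be sketched roughly as follows. This is only an illustration, not the code of the Python or REXX entries: the function name `wheel_words`, the sample word list, and the grid letters are assumptions chosen for the example, and folding case while skipping duplicates is just one possible way to handle the "us"/"US" situation mentioned above.

```python
from collections import Counter

def wheel_words(words, grid, centre, min_len=3):
    """Hypothetical helper: keep words that fit the wheel constraints."""
    available = Counter(grid.lower())   # how many of each letter the grid offers
    seen = set()                        # guards against duplicate dictionary entries
    found = []
    for raw in words:
        word = raw.strip().lower()      # assumption: treat "us" and "US" as the same word
        if word in seen:
            continue
        seen.add(word)
        if len(word) < min_len or centre not in word:
            continue
        # Counter subtraction keeps only positive counts, so an empty result
        # means the word never uses a letter more often than the grid has it.
        if not (Counter(word) - available):
            found.append(word)
    return found

# Illustrative data only (not read from unixdict.txt).
sample = ["deke", "knee", "keen", "Ken", "eke", "den", "boo", "ken", "kid", "ned"]
print(wheel_words(sample, "ndeokgelw", "k"))
# prints ['deke', 'knee', 'keen', 'ken', 'eke']
```

Each dictionary word is examined once, so the work grows linearly with the size of the dictionary, whereas the candidate-generation approach depends on the number of strings the grid can produce rather than on the dictionary size.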
more words with a different grid
I was wondering what the largest number of words (when using the suggested dictionary) would be using a different grid?
The 2nd REXX example uses what I thought might be one of the better grids, but I only tried a few.
- The 9 most common letters in the dictionary are 'eratnilso', in order. I tried replacing the least common, 'o', with any other letter from a..z and found 'p'. Later I decided to replace the three least common characters, 'lso', with any combination of three characters from a..z, which still only found the same replacement of 'o' by 'p'. --Paddy3118 (talk) 19:02, 5 July 2020 (UTC)
- With unixdict.txt you can get 248 words from "setralinp" (i.e. same letters but with central letter "a"). Word wheel puzzles usually have the additional constraint that there is at least one nine-letter word to be found, in which case the best result with unixdict.txt is 215 words, which you can get from "spearmint" with "a" as the central letter. Using the dictionary "words_alpha.txt" from https://github.com/dwyl/english-words, "setralinp"/"a" generates 1033 words of 3 or more letters, including two nine-letter ones. Perhaps I'll add an "extra credit" part to this task along these lines. -- Simonjsaunders (talk) 12:38, 24 July 2020 (UTC)
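A search of the kind described in this section could be sketched along the following lines. The names `count_wheel_words`, `best_centre`, and `has_full_length_word` are hypothetical, and the tiny word list merely stands in for a real dictionary, so this sketch does not reproduce the counts quoted above.

```python
from collections import Counter

def count_wheel_words(words, grid, centre, min_len=3):
    """Count the words that use the centre letter and fit the grid letters."""
    available = Counter(grid)
    return sum(
        1
        for w in words
        if len(w) >= min_len and centre in w and not (Counter(w) - available)
    )

def best_centre(words, grid):
    """Return the grid letter that yields the most words when placed centrally.

    Ties are broken alphabetically, which is an arbitrary choice for this sketch.
    """
    return max(sorted(set(grid)), key=lambda c: count_wheel_words(words, grid, c))

def has_full_length_word(words, grid):
    """Check the extra puzzle constraint: some word uses every grid letter."""
    target = sorted(grid)
    return any(sorted(w) == target for w in words)

# Illustrative stand-in for a dictionary file.
words = ["rat", "tar", "art", "sat", "pal", "pin", "nip", "lie"]
grid = "setralinp"
centre = best_centre(words, grid)
print(centre, count_wheel_words(words, grid, centre))   # prints: a 5
print(has_full_length_word(words, grid))                # prints: False
```

Trying every replacement letter or every nine-letter word as a grid would wrap these helpers in an outer loop over candidate grids.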