Word search

From Rosetta Code
Revision as of 01:54, 27 March 2016 by Rdm (talk | contribs) (J: put a space on the left side of the key, for hopefully slightly better legibility)
Word search is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

A word search puzzle typically consists of a grid of letters in which words are hidden.

There are many varieties of word search puzzles. For the task at hand we will use a rectangular grid in which the words may be placed horizontally, vertically, or diagonally. The words may also be spelled backwards.

The words are not allowed to zigzag, or wrap around. They may overlap but not be completely embedded in another word.

The task: create a 10 by 10 word search and fill it using words from the unixdict. Use only words that are longer than 2, and contain no non-alphabetic characters.

The cells not used by the hidden words should contain the message: Rosetta Code, read from left to right, top to bottom. These letters should be somewhat evenly distributed over the grid, not clumped together. The message should be in upper case, the hidden words in lower case. All cells should either contain letters from the hidden words or from the message.

Pack a minimum of 25 words into the grid.

Print the resulting grid and the solutions.

For instance:

     0  1  2  3  4  5  6  7  8  9

0    n  a  y  r  y  R  e  l  m  f 
1    y  O  r  e  t  s  g  n  a  g 
2    t  n  e  d  i  S  k  y  h  E 
3    n  o  t  n  c  p  c  w  t  T 
4    a  l  s  u  u  n  T  m  a  x 
5    r  o  k  p  a  r  i  s  h  h 
6    a  A  c  f  p  a  e  a  c  C 
7    u  b  u  t  t  t  O  l  u  n 
8    g  y  h  w  a  D  h  p  m  u 
9    m  i  r  p  E  h  o  g  a  n 

parish     (3,5)(8,5)   gangster   (9,1)(2,1)
paucity    (4,6)(4,0)   guaranty   (0,8)(0,1)
prim       (3,9)(0,9)   huckster   (2,8)(2,1)
plasm      (7,8)(7,4)   fancy      (3,6)(7,2)
hogan      (5,9)(9,9)   nolo       (1,2)(1,5)
under      (3,4)(3,0)   chatham    (8,6)(8,0)
ate        (4,8)(6,6)   nun        (9,7)(9,9)
butt       (1,7)(4,7)   hawk       (9,5)(6,2)
why        (3,8)(1,8)   ryan       (3,0)(0,0)
fay        (9,0)(7,2)   much       (8,8)(8,5)
tar        (5,7)(5,5)   elm        (6,0)(8,0)
max        (7,4)(9,4)   pup        (5,3)(3,5)
mph        (8,8)(6,8)


J

Implementation:

<lang J>require'web/gethttp'

unixdict=:verb define

 if. _1 -: fread 'unixdict.txt' do.
   (gethttp 'http://www.puzzlers.org/pub/wordlists/unixdict.txt') fwrite 'unixdict.txt'
 end.
 fread 'unixdict.txt'

)

words=:verb define

 (#~ 1 - 0&e.@e.&'abcdefghijklmnopqrstuvwxyz'@>) (#~ [: (2&< * 10&>:) #@>) <;._2 unixdict

)

dirs=: 10#.0 0-.~>,{,~<i:1 lims=: _10+,"2 +/&>/"1 (0~:i:4)#>,{,~<<"1]1 10 1 +i."0]10*i:_1 dnms=: ;:'nw north ne west east sw south se'

genpuz=:verb define

 words=. words
 fill=. 'ROSETTACODE'
 grid=. ,10 10$' '
 inds=. ,i.10 10
 patience=. -:#words
 key=. i.0 0
 inuse=. i.0 2
 while. (26>#key)+.0<cap=. (+/' '=grid)-#fill do.
   word=. >({~ ?@#) words
   dir=. ?@#dirs
   offs=. (inds#~(#word)<:inds{dir{lims)+/(i.#word)*/dir{dirs
   cool=. ' '=offs{grid
   sel=. */"1 cool+.(offs{grid)="1 word
   offs=. (sel*cap>:+/"1 cool)#offs
   if. (#offs) do.
     off=. ({~ ?@#) offs
     loc=. ({.off),dir
     if. -. loc e. inuse do.
       inuse=. inuse,loc
       grid=. word off} grid
       patience=. patience+1
       key=. /:~ key,' ',(10{.word),(3":1+10 10#:{.off),' ',dir{::dnms
     end.
   else. NB. grr...
     if. 0 > patience=. patience-1 do.
       inuse=.i.0 2
       key=.i.0 0
       grid=. ,10 10$' '
       patience=. -:#words
     end.
   end.
 end.
 puz=. (_23{.":i.10),' ',1j1#"1(":i.10 1),.' ',.10 10$fill (I.grid=' ')} grid
 puz,' ',1 1}._1 _1}.":((</.~ <.) i.@# * 3%#)key

)</lang>

Notes:

While the result is square, we flatten our intermediate results to simplify the code.

dirs are index offsets within the flattened grid for each of the eight cardinal directions.

lims is, for each cardinal direction, and for each grid position, how long of a word can fit.

dnms are names for each of the cardinal directions.

words are the viable words from unixdict, and fill is what we're going to leave in the puzzle for spaces not occupied by any of those words (and this could be made into a parameter).

grid is our working copy of the text of the word search puzzle.

inds are the indices into grid - we will use these as the starting positions when we place the words.

patience is a guard variable, to avoid problems with infinite loops if we arbitrarily place words in a non-viable fashion.

key lists the words we are placing, and where we placed them.

inuse marks location+directions which already have a word (to prevent short words such as sal from being placed as prefixes of longer words such as sale).

Once we have these, we go into a loop where:

word is picked arbitrarily from the viable words from unixdict.

dir is picked arbitrarily from one of our eight cardinal directions.

offs are places where we might place the word (initially limited only by geometry, but we then constrain this based on what's already been placed).

cool marks where our word can be placed in unoccupied spaces (and also will be used later to count how many new spaces will be occupied by the word we pick.

sel marks where our word can be placed such that it does not conflict with existing words.

If this leaves us with any way to place the word, we pick one of them as off and combine the starting location with dir in loc to see if a word has already been placed there and if we're good, we place the word and update our key. (It's extremely rare that loc matches an inuse location, so just ignoring that word works just fine).

Otherwise, we check if we're getting impatient (in which case we scrap the entire thing and start over).

Once we're done, we reshape our grid so it's square and attach the key. Here, puz is the grid formatted for display (with a space between each column, and a numeric key for each row and column).

Example run:

<lang J> genpuz

   0 1 2 3 4 5 6 7 8 9                                                
                                                                      

0 y R p y r f O a p S 1 l o l s i f c c e a 2 l n v z i e n r n l 3 o p z s t e E i n l 4 h l s a v e r d a o 5 e a t a g r e e d y 6 m e m a g T f T A C 7 a y e r s p f z a p 8 O e c n a w o l l a 9 e s o p o r p c D E

acetate     1  8 sw   │ gam         7  5 west │ pol         1  3 sw   
acrid       1  8 south│ holly       5  1 north│ propose    10  7 west 
agreed      6  4 east │ massif      7  1 ne   │ rsvp        1  5 sw   
allowance   9 10 west │ neva        3  7 sw   │ sao         8  5 south
alloy       2 10 south│ offer       9  7 north│ save        5  3 east 
arm         9  5 nw   │ only        4  1 ne   │ sop        10  2 east 
ayers       8  1 east │ pap        10  4 ne   │ tee         4  5 se   
cop        10  8 nw   │ paz         8 10 west │ wan         9  6 west 
fizzle      1  6 sw   │ penna       1  9 south│                       

</lang>

Java

Works with: Java version 7

<lang java>import java.io.*; import static java.lang.String.format; import java.util.*;

public class WordSearch {

   static class Grid {
       int numAttempts;
       char[][] cells = new char[nRows][nCols];
       List<String> solutions = new ArrayList<>();
   }
   final static int[][] dirs = {{1, 0}, {0, 1}, {1, 1}, {1, -1}, {-1, 0},
   {0, -1}, {-1, -1}, {-1, 1}};
   final static int nRows = 10;
   final static int nCols = 10;
   final static int gridSize = nRows * nCols;
   final static int minWords = 25;
   final static Random rand = new Random();
   public static void main(String[] args) {
       printResult(createWordSearch(readWords("unixdict.txt")));
   }
   static List<String> readWords(String filename) {
       List<String> words = new ArrayList<>();
       try (Scanner sc = new Scanner(new FileReader(filename))) {
           while (sc.hasNext()) {
               String s = sc.next().trim().toLowerCase();
               if (s.matches("^[a-z]{3,}$"))
                   words.add(s);
           }
       } catch (FileNotFoundException e) {
           System.out.println(e);
       }
       return words;
   }
   static Grid createWordSearch(List<String> words) {
       Grid grid = null;
       int numAttempts = 0;
       outer:
       while (++numAttempts < 100) {
           Collections.shuffle(words);
           grid = new Grid();
           int messageLen = placeMessage(grid, "Rosetta Code");
           int target = gridSize - messageLen;
           int cellsFilled = 0;
           for (String word : words) {
               cellsFilled += tryPlaceWord(grid, word);
               if (cellsFilled == target && grid.solutions.size() >= minWords) {
                   grid.numAttempts = numAttempts;
                   break outer;
               }
           }
       }
       return grid;
   }
   static int placeMessage(Grid grid, String msg) {
       msg = msg.toUpperCase().replaceAll("[^A-Z]", "");
       int messageLen = msg.length();
       if (messageLen > 0 && messageLen < gridSize) {
           int gapSize = gridSize / messageLen;
           for (int i = 0; i < messageLen; i++) {
               int pos = i * gapSize + rand.nextInt(gapSize);
               grid.cells[pos / nCols][pos % nCols] = msg.charAt(i);
           }
       }
       return messageLen;
   }
   static int tryPlaceWord(Grid grid, String word) {
       int randDir = rand.nextInt(dirs.length);
       int randPos = rand.nextInt(gridSize);
       for (int dir = 0; dir < dirs.length; dir++) {
           dir = (dir + randDir) % dirs.length;
           for (int pos = 0; pos < gridSize; pos++) {
               pos = (pos + randPos) % gridSize;
               int lettersPlaced = tryLocation(grid, word, dir, pos);
               if (lettersPlaced > 0)
                   return lettersPlaced;
           }
       }
       return 0;
   }
   static int tryLocation(Grid grid, String word, int dir, int pos) {
       int r = pos / nCols;
       int c = pos % nCols;
       int len = word.length();
       //  check bounds
       if ((dirs[dir][0] == 1 && (len + c) > nCols)
               || (dirs[dir][0] == -1 && (len - 1) > c)
               || (dirs[dir][1] == 1 && (len + r) > nRows)
               || (dirs[dir][1] == -1 && (len - 1) > r))
           return 0;
       int rr, cc, i, overlaps = 0;
       // check cells
       for (i = 0, rr = r, cc = c; i < len; i++) {
           if (grid.cells[rr][cc] != 0 && grid.cells[rr][cc] != word.charAt(i))
               return 0;
           cc += dirs[dir][0];
           rr += dirs[dir][1];
       }
       // place
       for (i = 0, rr = r, cc = c; i < len; i++) {
           if (grid.cells[rr][cc] == word.charAt(i))
               overlaps++;
           else
               grid.cells[rr][cc] = word.charAt(i);
           if (i < len - 1) {
               cc += dirs[dir][0];
               rr += dirs[dir][1];
           }
       }
       int lettersPlaced = len - overlaps;
       if (lettersPlaced > 0) {
           grid.solutions.add(format("%-10s (%d,%d)(%d,%d)", word, c, r, cc, rr));
       }
       return lettersPlaced;
   }
   static void printResult(Grid grid) {
       if (grid == null || grid.numAttempts == 0) {
           System.out.println("No grid to display");
           return;
       }
       int size = grid.solutions.size();
       System.out.println("Attempts: " + grid.numAttempts);
       System.out.println("Number of words: " + size);
       System.out.println("\n     0  1  2  3  4  5  6  7  8  9");
       System.out.print("");
       for (int r = 0; r < nRows; r++) {
           System.out.printf("%n%d   ", r);
           for (int c = 0; c < nCols; c++)
               System.out.printf(" %c ", grid.cells[r][c]);
       }
       System.out.println("\n");
       for (int i = 0; i < size - 1; i += 2) {
           System.out.printf("%s   %s%n", grid.solutions.get(i),
                   grid.solutions.get(i + 1));
       }
       if (size % 2 == 1)
           System.out.println(grid.solutions.get(size - 1));
   }

}</lang>

Attempts: 2
Number of words: 27

     0  1  2  3  4  5  6  7  8  9

0    R  p  d  i  o  r  o  t  r  a 
1    O  a  o  e  s  b  l  o  c  S 
2    m  s  t  l  f  e  t  l  a  y 
3    E  t  e  i  y  o  t  s  T  i 
4    e  y  l  b  t  g  r  s  p  l 
5    r  l  T  i  A  h  o  e  e  l 
6    o  e  l  h  t  j  c  n  s  C 
7    z  l  o  u  a  a  O  t  a  t 
8    u  k  r  g  c  n  D  z  i  l 
9    o  t  r  a  v  e  l  E  v  w 

rototill   (8,0)(1,7)   polygonal  (1,0)(9,8)
fill       (4,2)(1,5)   goer       (3,8)(0,5)
travel     (1,9)(6,9)   deforest   (2,0)(9,7)
toroid     (7,0)(2,0)   truth      (1,9)(5,5)
estes      (8,5)(4,1)   ipecac     (9,3)(4,8)
ouzo       (0,9)(0,6)   pasty      (1,0)(1,4)
dote       (2,0)(2,3)   lay        (7,2)(9,2)
witch      (9,9)(5,5)   han        (3,6)(5,8)
bloc       (5,1)(8,1)   ill        (9,3)(9,5)
slot       (7,3)(7,0)   art        (9,0)(7,0)
ore        (0,6)(0,4)   bye        (3,4)(5,2)
elk        (1,6)(1,8)   jan        (5,6)(5,8)
liz        (9,8)(7,8)   dam        (2,0)(0,2)
via        (8,9)(8,7)

zkl

Repeat words allowed <lang zkl>fcn buildVectors(R,C){ //-->up to 8 vectors of wild card strings

  var [const] dirs=T(T(1,0), T(0,1), T(1,1), T(1,-1), T(-1,0),T(0,-1), T(-1,-1), T(-1,1));
  vs,v:=List(),List();
  foreach dr,dc in (dirs){ v.clear(); r,c:=R,C;
     while( (0<=r<10) and (0<=c<10) ){ v.append(grid[r][c]); r+=dr; c+=dc; }
     vs.append(T(v.concat(), // eg "???e??????" would match "cohen" or "mineral"
     	dr,dc));
  }
  vs.filter(fcn(v){ v[0].len()>2 }).shuffle()

} fcn findFit(vs,words){ //-->(n, word) ie (nth vector,word), empty vs not seen

  do(1000){ foreach n,v in (vs.enumerate()){ do(10){  // lots of ties
     word:=words[(0).random(nwds)];
     if(word.matches(v[0][0,word.len()])) return(word,n); // "??" !match "a
  }}}
  False

} fcn pasteWord(r,c, dr,dc, word) // jam word into grid along vector

  { foreach char in (word){ grid[r][c]=char; r+=dr; c+=dc; } }

fcn printGrid{

  println("   0 1 2 3 4 5 6 7 8 9");
  foreach n,line in (grid.enumerate()){ println(n,"  ",line.concat(" ")) }

} fcn stuff(msg){ msg=msg.toUpper() : Utils.Helpers.cycle(_);

  foreach r,c in (10,10){ if(grid[r][c]=="?") grid[r][c]=msg.next() }

}</lang>

<lang zkl>validWord:=RegExp("[A-Za-z]+\n$").matches; File("unixdict.txt").read(*) // dictionary file to blob, copied from web

  // blob to list of valid words
  .filter('wrap(w){ (3<w.len()<=10) and validWord(w) })  // "word\n"
  .howza(11).pump(List,"toLower")  // convert blob to list of words, removing \n
  : words:=(_);

reg fitted; do{

  var nwds=words.len(),grid=(10).pump(List(),(10).pump(List(),"?".copy).copy);
  fitted=List(); do(100){
     r,c:=(0).random(10), (0).random(10);
     if(grid[r][c]=="?"){

vs,wn:=buildVectors(r,c), findFit(vs,words); if(wn){ w,n:=wn; pasteWord(r,c,vs[n][1,*].xplode(),w); fitted.append(T(r,c,w)); }

     }}

}while(fitted.len()<25);

stuff("RosettaCode"); printGrid(); println(fitted.len()," words fitted"); fitted.pump(Console.println, T(Void.Read,3,False),

  fcn{ vm.arglist.pump(String,
     fcn([(r,c,w)]){ "%-19s".fmt("[%d,%d]: %s  ".fmt(r,c,w)) }) }

);</lang>

Output:
   0 1 2 3 4 5 6 7 8 9
0  n a n n a o r g d c
1  z i R n n a s d n a
2  p e s e y O w s o n
3  o S n e l h e a p o
4  t w b b l a e n m g
5  i n a a a l n m u a
6  o b o n n o e u t e
7  n h r o i g y n a n
8  s t e r n e l g n a
9  f r o n t n E e t l
26 words fitted
[3,5]: halogen     [0,4]: anna        [2,0]: potion      [9,2]: orion       
[5,4]: allyn       [4,8]: mutant      [5,7]: mung        [4,1]: wan         
[1,6]: sweeney     [7,1]: hen         [3,8]: pond        [2,7]: san         
[6,1]: bonn        [1,0]: zen         [0,9]: canoga      [9,0]: front       
[4,2]: bangle      [0,7]: groan       [4,3]: been        [9,9]: lane        
[2,2]: sin         [1,7]: dna         [8,4]: noon        [8,0]: stern