15 puzzle solver: Difference between revisions

(→‎The Code: Ah, EQUIVALENCE.)
Line 261:
Such a guide function is rather like a "distance" function, so along these lines, when a minimum-step move sequence is found, the distances of each position from ZERO are calculated by various distance functions and the results shown in the output. Notably, function ZDIST calculates an encodement of the board position by referring to the layout of the ZERO position. It relies on the first square of a position having sixteen possible values, then the second square has fifteen and so on down; the number of possible layouts is 16! rather than 16<sup>16</sup>. Given the list of values of the squares in the ZERO sequence, values that have been taken are marked off (in array AVAIL) and the count of possibilities remaining as each square is identified reduces. The ZDIST number is like an encodement of an integer from its digits, but with a successively-reducing base.
 
As an experiment, when the Red Tide procedd was poured to completion for a board of three rows and four columns, every position in the stash was presented to ZDIST and the result written out. Again, theThe filesystem presented difficulties: reading the stash file as a sequential file failed! So, a slog through reading specified record numbers, one after the other, taking half an hour. By contrast, the B6700 filesystem of the 1970s employed a feature called "diagonal I/O" whereby the style of the previous I/O for a file was retained and compared to the style of the new I/O operation: for matching styles, conclusions could be drawn. The headfirst and taillast few ZDIST values are 0, 105, 1, 328449, 609, 632, 4, 3271809, 3312009, ... 287247191, 446727869, 287039663. That last is entry 23950800 which is 12!/2: every possible attainable position has been tested and stored in the stash (once each), since a parity condition means that half the layout combinations have one parity and half the other, the possible moves being unable to change the parity. Alas, the ZDIST figures do not show anything so simple as odd/even for this. Every value should be unique, but in the past it has suspiciously often been easy to find a proof of a desired situation, so a test: sort the values and check. UltraEdit ran for a long time, then reported that there was insufficient memory to maintain an "undo" list. Very well. And then it wiped the file! Fortunately, the stash was still intact. On restarting in Linux the "sort" utility was available, and after some impressive running of all six cpus at 100% in cyclic bursts, I was looking at a file in a mad order: 0, 1, 10, 100, 1000, 10000, ... Apparently some sort of "word" order, even though the text file used leading spaces so that the numbers were all aligned right. As usual, the "man"ual information was brief and vague, but option "g" for "general number" looked likely, and so init proved. In sorted order: 0, 1, 4, 8, 9, 11, 12, 13, ... 479001589, 479001590, 479001593, 479001594, 479001597, 479001598. And 12! = 479001600. All values were unique.
 
The ZDIST function would be a candidate for a hash function, as it looks to give a good spray. But it requires rather more computation. It is always possible to calculate a "perfect" hash function, one that not only gives a distinct integer value for every possible key but also produces no gaps, so if there are N keys, there are N hash values in the range 1:N (or 0:N - 1) and the mapping is one to one. There even exist "hyper perfect" hash functions, that possess useful ordering properties: an example is the conversion of dates to a [[Calendar#Fortran|day number]], which is one-to-one, without gaps, and ordered by date. However, the calculation of such perfection may be lengthy, so a fast hash is preferable, except for special cases.
 
In the event, when running, TaskInfo showed that almost all of the time was spent in the filesystem's I/O procedures, no "user time" was visible. The hash calculation indeed was fast, but the I/O was slow. Probably because the disc file was scattered in blocks across the disc device (actually a solid-state drive) and finding the appropriate block took a lot of messing about. In the past, a large disc file would be in one or very few pieces on the actual disc, with a straightforward direct calculation between a record number and a disc's corresponding cylinder, surface, track and sector. These days, the operating system uses "spare" memory to hold the content of recently-used disc blocks, but alas, each index file is 781MB, and the stash files are over 3,000MB. Some systems have many gigabytes of memory, but this one has 4GB.
 
===The Code===
1,220

edits