Anagrams/Deranged anagrams: Difference between revisions

→‎{{header|FutureBasic}}: Solution trying for fastest execution time.
imported>Acediast
Took 0.089 seconds on i3 @ 2.13 GHz
</pre>
 
=={{header|FutureBasic}}==
While there is nothing time sensitive about this task, fast code is often efficient code. Several of the entries in this category show their computation times. This FutureBasic entry is designed to outrace them all.
 
The other entries examined start by sorting the letters in each word. Here we take a different approach by creating an "avatar" for each word. All anagrams of a word have the same avatar, and no sorting of letters is required. Here's how it works:<br>
An 8-byte variable can hold a lot of information. We create a 64-bit avatar that starts at the high end with 8 bits for the length of the word, so that longer words sort first. The remaining 56 bits hold a 2-bit field for each letter of the alphabet (26 fields, 52 bits). A 2-bit field can record from 0 to 3 occurrences of a letter. With 4 or more occurrences (think "Mississippi") the count carries into the next field, but a matching avatar would still almost always be an exact anagram: a false match needs a second word of the same length whose carries happen to produce the same value (the letter strings "aaaad" and "bcccc" are one contrived case), which is very unlikely among real dictionary words. Here's how the bits would be set for the word "Anagrams":
<syntaxhighlight lang="future basic">
Anagrams  (-- marks unused bits)
 length  --ZzYyXx WwVvUuTt SsRrQqPp OoNnMmLl KkJjIiHh GgFfEeDd CcBbAa--
00001000 00000000 00000000 01010000 00010100 00000000 01000000 00001100
</syntaxhighlight>
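As a cross-check of the bit layout above, here is a minimal Python sketch of the avatar calculation. It is an illustration only, not part of the FutureBasic entry: the names <code>avatar</code> and <code>LEN_SHIFT</code> are made up for this example, and it assumes words contain only ASCII letters.

```python
LEN_SHIFT = 56  # the word length occupies the top 8 bits, as described above


def avatar(word: str) -> int:
    """Pack per-letter counts and the word length into one 64-bit-sized integer."""
    value = 0
    for ch in word:
        # ord(ch) & 31 maps 'a'/'A' to 1 ... 'z'/'Z' to 26;
        # each letter owns a 2-bit field starting at twice that number.
        value += 1 << ((ord(ch) & 31) * 2)
    return value + (len(word) << LEN_SHIFT)


# Print the avatar of "Anagrams" as 64 binary digits for comparison with the table.
print(f"{avatar('Anagrams'):064b}")
# Anagrams of each other share one avatar.
print(avatar("listen") == avatar("silent"))
```

Because the avatar is built by addition, the order of the letters never matters, which is why no per-word sort is needed.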
 
Bit shifts and 8-byte comparisons are fast operations, and both contribute to the speed. As each avatar is generated, it is saved along with the offset to its word, and its index is inserted into a sorted list. This guarantees that the longest words come first and that all matching anagrams are adjacent.
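The sorted-insertion step can be sketched in Python as follows. This is a hedged illustration rather than a translation: it substitutes the standard <code>bisect</code> module for the hand-written binary search and block move, and the names <code>avatar</code> and <code>build_sorted_index</code> are hypothetical.

```python
import bisect

LEN_SHIFT = 56  # word length in the top 8 bits


def avatar(word):
    """Letter counts in 2-bit fields plus the word length in the top byte."""
    value = 0
    for ch in word:
        value += 1 << ((ord(ch) & 31) * 2)
    return value + (len(word) << LEN_SHIFT)


def build_sorted_index(words):
    """Return (keys, order): avatars in ascending order and the matching word indices."""
    keys = []   # avatars, kept sorted
    order = []  # order[k] is the index of the word whose avatar is keys[k]
    for i, word in enumerate(words):
        key = avatar(word)
        pos = bisect.bisect_left(keys, key)  # binary search for the insertion point
        keys.insert(pos, key)
        order.insert(pos, i)
    return keys, order


words = ["listen", "cat", "silent", "act", "enlist"]
keys, order = build_sorted_index(words)
# Walking the index from the high end visits the longest words first,
# with words that share an avatar next to each other.
for i in reversed(order):
    print(words[i])
```

Since the length sits in the most significant bits, an ordinary integer comparison orders words by length before anything else.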
 
When words have the same avatars, they are anagrams, but for this task we still need to check for letters occurring in the same location in both words. That is a quick check that only has to be done for otherwise qualified candidates.
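The position check itself is small. A minimal Python sketch, assuming the two words are already known to be anagrams and therefore the same length (the function name <code>is_deranged</code> is made up for this example):

```python
def is_deranged(a: str, b: str) -> bool:
    """True if no position holds the same character in both words.

    Assumes a and b are anagrams of each other, hence equal in length.
    """
    return all(x != y for x, y in zip(a, b))


print(is_deranged("excitation", "intoxicate"))  # True: no letter stays in place
print(is_deranged("listen", "silent"))          # False: 'i' is second in both
```

The check stops at the first shared position, so most candidate pairs are rejected after only a few comparisons.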
 
On a 1.2 GHz Quad-Core Intel Core i7 MacBook Pro, this code runs in ~6 ms, which is several times faster than the times reported by other entries. In that time, it finds not just the longest pair, but all 486 deranged anagrams in unixdict.txt. (Yes, there is an option to view all of them.)
 
FWIW, this code can easily be amended to show all 1800+ anagram pairs.
<syntaxhighlight lang="future basic">
#plist NSAppTransportSecurity @{NSAllowsArbitraryLoads:YES}
defstr long
begin globals
xref xwords( 210000 ) as char
long gAvatars( 26000 )
uint32 gwordNum, gfilen, gcount = 0, gOffset( 26000 )
uint16 gndx( 26000 ), deranged( 600, 1 )
long sh : sh = system( _scrnHeight ) -100
long sw : sw = (system( _scrnWidth ) -360 ) / 2
CFTimeInterval t
_len = 56
end globals
 
local fn loadDictionary
CFURLRef url = fn URLWithString( @"http://wiki.puzzlers.org/pub/wordlists/unixdict.txt" )
CFStringRef dictStr = fn StringWithContentsOfURL( url, NSUTF8StringEncoding, NULL )
dictStr = fn StringByAppendingString( @" ", dictStr )
xwords = fn StringUTF8String( dictstr )
gfilen = len(dictstr)
end fn
 
local fn deranagrams
uint64 ch, p, wordStart = 0
long avatar = 0
uint32 med, bot, top
byte chk, L
for p = 1 to gfilen
ch = xwords(p) //build avatar
if ch > _" " then avatar += (long) 1 << ( ch and 31 ) * 2: continue
avatar += (long)(p - wordStart - 1) << _len //complete avatar by adding word length
gAvatars(gWordNum) = avatar //store the avatar in list
gOffset( gWordNum) = wordStart //store offset to the word
//Insert into ordered list of avatars
bot = 0 : top = gwordNum //quick search for place to insert
while (top - bot) > 1
med = ( top + bot ) >> 1
if avatar > gAvatars(gndx(med)) then bot = med else top = med
wend
blockmove( @gndx( top ), @gndx( top + 1 ), ( gwordNum - top ) * 2 )
gndx(top) = gWordNum
gwordNum++ : wordStart = p : avatar = 0 //ready for new word
next p
//Check for matching avatars
for p = gWordNum to 1 step -1
chk = 1 //to make sure each word is compared with all matching avatars
while gAvatars( gndx( p ) ) == gAvatars( gndx( p - chk ) )
// found anagram; now check for chars in same position
L = ( gAvatars( gndx( p ) ) >> _len ) //get word length
while L
if xwords(gOffset(gndx(p)) +L) == xwords(gOffset(gndx(p-chk)) +L) then break
L--
wend
if L == 0
//no matching chars: found Deranged Anagram!
deranged( gcount, 0 ) = gndx( p )
deranged( gcount, 1 ) = gndx( p - chk )
gcount++
end if
chk++
wend
next
end fn
 
local fn printPair( ndx as uint32, chrsToCntr as byte )
ptr p : str255 pair : pair = ""
short n = ( gAvatars( deranged( ndx, 0 ) ) >> _len )
if n < chrsToCntr then print string$( chrsToCntr - n, " " );
p = xwords + gOffset( deranged( ndx, 0 ) )
p.0`` = n : print p.0$; " ";
p = xwords + gOffset( deranged( ndx, 1 ) )
p.0`` = n : print p.0$
end fn
 
local fn doDialog(evt as long)
if evt == _btnclick
long r
button -1 : window 1,,(sw,50,335,sh-50)
for r = 1 to gcount-1
fn printPair( r, 21 )
next
end if
end fn
 
fn loadDictionary : t = fn CACurrentMediaTime
fn deranagrams : t = fn CACurrentMediaTime - t
 
window 1, @"Deranged Anagrams in FutureBasic",(sw,sh-130,335,130)
printf @"\n %u deranged anagrams found among \n %u words ¬
in %.2f ms.\n", gcount, gWordNum, t * 1000
print " Longest:";: fn printPair( 0, 11 )
button 1,,,fn StringWithFormat(@"Show remaining %u deranged anagrams.",gcount-1),(24,20,285,34)
on dialog fn doDialog
handleevents
</syntaxhighlight>
{{out}}
[[File:FB output for Deranged Anagrams.png]]
 
=={{header|GAP}}==