Best shuffle: Difference between revisions

→‎{{header|Haskell}}: Changed to correct and efficient solution
(→‎{{header|Haskell}}: Changed to correct and efficient solution)
Line 1,569:
 
=={{header|Haskell}}==
{{incomplete|Haskell|No output given.}}
{{trans|Perl 6}}
<lang haskell>import Data.Function (on)
import Data.List
import Data.Maybe
import Data.Array
import Text.Printf
 
We demonstrate several approaches here. In order to test the program we define a testing suite:
main = mapM_ f examples
where examples = ["abracadabra", "seesaw", "elk", "grrrrrr", "up", "a"]
f s = printf "%s, %s, (%d)\n" s s' $ score s s'
where s' = bestShuffle s
 
<lang Haskell>shufflingQuality l1 l2 = length $ filter id $ zipWith (==) l1 l2
score :: Eq a => [a] -> [a] -> Int
score old new = length $ filter id $ zipWith (==) old new
 
printTest prog = mapM_ test texts
bestShuffle :: (Ord a, Eq a) => [a] -> [a]
where
bestShuffle s = elems $ array bs $ f positions letters
where positions test s = do
x <- prog s
concat $ sortBy (compare `on` length) $
putStrLn $ map (map fst)unwords $ groupBy[ ((==)show `on` snd) $s
sortBy (compare `on` snd) $ zip [0..] s , show x
, show $ shufflingQuality s x]
letters = map (orig !) positions
texts = [ "abba", "abracadabra", "seesaw", "elk" , "grrrrrr"
, "up", "a", "aaaaa.....bbbbb"
, "Rosetta Code is a programming chrestomathy site." ]</lang>
 
=== Deterministic List-based solution ===
f [] [] = []
f (p : ps) ls = (p, ls !! i) : f ps (removeAt i ls)
where i = fromMaybe 0 $ findIndex (/= o) ls
o = orig ! p
 
The core of the algorithm is swapping procedure similar to those implemented in AWK and Icon examples. It could be done by a pure program with use of immutable vectors (though it is possible to use mutable vectors living in <tt>ST</tt> or <tt>IO</tt>, but it won't make the program more clear).
orig = listArray bs s
bs = (0, length s - 1)
 
<lang Haskell>import Data.Vector ((//),(!), Vector)
removeAt :: Int -> [a] -> [a]
import qualified Data.Vector as V
removeAt 0 (x : xs) = xs
import Data.List (delete, find)
removeAt i (x : xs) = x : removeAt (i - 1) xs</lang>
 
swapShuffle :: Eq a => [a] -> [a] -> [a]
Here's a version of <code>bestShuffle</code> that's much simpler, but too wasteful of memory for inputs like "abracadabra":
swapShuffle lref lst = V.toList $ foldr adjust (V.fromList lst) [0..n-1]
where
vref = V.fromList lref
n = V.length vref
adjust i v = case find alternative [0.. n-1] of
Nothing -> v
Just j -> v // [(j, v!i), (i, v!j)]
where
alternative j = and [ v!i == vref!i
, i /= j
, v!i /= vref!j
, v!j /= vref!i ]
 
<lang haskell>bestShuffleshuffle :: Eq a => [a] -> [a]
shuffle lst = swapShuffle lst lst</lang>
bestShuffle s = minimumBy (compare `on` score s) $ permutations s</lang>
 
{{Out}}
<pre>λ> printTest (pure . shuffle)
"abba" "baab" 0
"abracadabra" "daabacarrab" 0
"seesaw" "esaews" 0
"elk" "lke" 0
"grrrrrr" "rrrrrrg" 5
"up" "pu" 0
"a" "a" 1
"aaaaa.....bbbbb" ".....bbbbbaaaaa" 0
"Rosetta Code is a programming chrestomathy site." "stetma Code is a programoing chrestomathy site.R" 0</pre>
 
The program works but shuffling is not good in case of a real text, which was just shifted. We can make it better using [[Perfect shuffle]] (faro shuffle) before the swapping procedure.
 
<lang Haskell>perfectShuffle :: [a] -> [a]
perfectShuffle [] = []
perfectShuffle lst | odd n = b : shuffle (zip bs a)
| even n = shuffle (zip (b:bs) a)
where
n = length lst
(a,b:bs) = splitAt (n `div` 2) lst
shuffle = foldMap (\(x,y) -> [x,y])
shuffleP :: Eq a => [a] -> [a]
shuffleP lst = swapShuffle lst $ perfectShuffle lst</lang>
 
{{Out}}
<pre>λ> qualityTest (pure . shuffleP)
"abba" "baab" 0
"abracadabra" "baadabrraac" 0
"seesaw" "assewe" 0
"elk" "lke" 0
"grrrrrr" "rrgrrrr" 5
"up" "pu" 0
"a" "a" 1
"aaaaa.....bbbbb" "bbb.baaaaba...." 0
"Rosetta Code is a programming chrestomathy site." " Rmoisnegt tcahmrCeosdteo miast hay psriotger.a" 0</pre>
 
That's much better.
 
=== Nondeterministic List-based solution ===
 
Adding randomness is easy: just perform random shuffle before swapping procedure.
 
<lang Haskell>randomShuffle :: [a] -> IO [a]
randomShuffle [] = return []
randomShuffle lst = do
i <- getRandomR (0,length lst-1)
let (a, x:b) = splitAt i lst
xs <- randomShuffle $ a ++ b
return (x:xs)
shuffleR :: Eq a => [a] -> IO [a]
shuffleR lst = swapShuffle lst <$> randomShuffle lst</lang>
 
{{Out}}
<pre>λ> qualityTest shuffleR
"abba" "baab" 0
"abracadabra" "raacadababr" 0
"seesaw" "wsaese" 0
"elk" "kel" 0
"grrrrrr" "rrrgrrr" 5
"up" "pu" 0
"a" "a" 1
"aaaaa.....bbbbb" "b.b.baababa.a.." 0
"Rosetta Code is a programming chrestomathy site." "esodmnithsrasrmeogReat taoCp gtrty i .mi as ohce" 0</pre>
 
Now everything is Ok except for the efficiency. Both randomization and swapping procedure are O[n^2], moreover the whole text must be kept in memory, so for large data sequences it will take a while to shuffle.
 
=== Nondeterministic Conduit-based solution ===
 
Using streaming technique it is possible to shuffle the sequence on the fly, using relatively small moving window (say of length k) for shuffling procedure. In that case the program will consume constant memory amount O[k] and require O[n*k] operations.
 
<lang Haskell>{-# LANGUAGE TupleSections, LambdaCase #-}
import Conduit
import Control.Monad.Random (getRandomR)
import Data.List (delete, find)
 
shuffleC :: Eq a => Int -> Conduit a IO a
shuffleC 0 = awaitForever yield
shuffleC k = takeC k .| sinkList >>= \v -> delay v .| randomReplace v
 
delay :: Monad m => [a] -> Conduit t m (a, [a])
delay [] = mapC $ \x -> (x,[x])
delay (b:bs) = await >>= \case
Nothing -> yieldMany (b:bs) .| mapC (,[])
Just x -> yield (b, [x]) >> delay (bs ++ [x])
 
randomReplace :: Eq a => [a] -> Conduit (a, [a]) IO a
randomReplace vars = awaitForever $ \(x,b) -> do
y <- case filter (/= x) vars of
[] -> pure x
vs -> lift $ (vs !!) <$> getRandomR (0, length vs - 1)
yield y
randomReplace $ b ++ delete y vars
 
shuffleW :: Eq a => Int -> [a] -> IO [a]
shuffleW k lst = yieldMany lst =$= shuffleC k $$ sinkList</lang>
 
Here we define a new conduit <code>shuffleC</code> which uses a moving window of length <tt>k</tt> and returns shuffled elements of upstream data.
 
{{Out}}
<pre>λ> qualityTest (shuffleW 8)
"abba" "baab" 0
"abracadabra" "daabrcabaar" 0
"seesaw" "eswesa" 0
"elk" "kel" 0
"grrrrrr" "rgrrrrr" 5
"up" "pu" 0
"a" "a" 1
"aaaaa.....bbbbb" "....baabaaa.bbb" 3
"Rosetta Code is a programming chrestomathy site." "sCaoeRei d os pttaogrr nrgshmeaotaichiy .ttmsme" 0</pre>
 
This program is good for real texts with high entropy. In case of homogeneous strings like <tt>"aaaaa.....bbbbb"</tt> it gives poor results for windows smaller then homogeneous regions.
 
The main goal of streaming solution is to be able to process data from any resources, so let's use it to shuffle texts being transferred from <tt>stdin</tt> to <tt>stdout</tt>.
 
<lang Haskell>import Data.ByteString.Builder (charUtf8)
import Data.ByteString.Char8 (ByteString, unpack, pack)
import Data.Conduit.ByteString.Builder (builderToByteString)
import System.IO (stdin, stdout)
 
shuffleBS :: Int -> ByteString -> IO ByteString
shuffleBS n s =
yieldMany (unpack s)
=$ shuffleC n
=$ mapC charUtf8
=$ builderToByteString
$$ foldC
main :: IO ()
main =
sourceHandle stdin
=$ mapMC (shuffleBS 10)
$$ sinkHandle stdout</lang>
 
{{Out}}
<pre>$ ghc --make -O3 ./shuffle
[1 of 1] Compiling Main ( shuffle.hs, shuffle.o )
Linking shuffle ...
 
$ cat input.txt
Rosetta Code is a programming chrestomathy site. The idea is to present solutions to the same task in as many different languages as possible, to demonstrate how languages are similar and different, and to aid a person with a grounding in one approach to a problem in learning another. Rosetta Code currently has 823 tasks, 193 draft tasks, and is aware of 642 languages, though we do not (and cannot) have solutions to every task in every language.
 
$ cat input.txt | ./shuffle
aeotdR s aoiCtrpmmgi crn theemaysg srioT the tseo.dih psae re isltn ountstoeo tosmaetia es nssimhn ad kaeeinrlataffauytse g oanbs ,e ol e sio ttngdasmw esphut ro ganeemas g alsi arlaeefn,ranifddoii a drnp det r toi ahowgnutan n rgneanppi raohi d oaop blrcst imeioaer ngohrla.eRotn Cst n dce aenletya th8r3 n2ssout1 3dasktaft,rrk9as,a ss iewarf6 d2l ogu asga te g un oa hn4d enaodho(ctt)n, eha laovnsotusw oeinyetsakvn eo ienlrav ygtnu aer. g</pre>
 
=={{header|Icon}} and {{header|Unicon}}==
Anonymous user