Markov chain text generator

From Rosetta Code
Markov chain text generator is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

This task is about coding a text generator that uses a Markov chain algorithm.

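A Markov chain algorithm chooses the next word (the suffix) based only on the few words that precede it (the prefix). The suffix is drawn at random from the words that followed that prefix in the training text, so suffixes that occurred more often are more likely to be chosen.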

To do this, a Markov chain program typically breaks an input text (the training text) into a series of words, then slides a fixed-size window of N + 1 words along them. The first N words in the window form the prefix, and the word that follows is added to that prefix's collection of suffixes, from which one is later chosen at random.

As an example, take this text with N = 2:

now he is gone she said he is gone for good

This would build the following table:

PREFIX               SUFFIX
now he               is
he is                gone, gone
is gone              she, for
gone she             said
she said             he
said he              is
gone for             good
for good             (empty) if we reach this point, the program stops generating text
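
The table construction above can be sketched in a few lines of Python. This is a minimal illustration, not one of the task's reference solutions: the function name build_table and the choice of tuples as keys are assumptions made here. Unlike the table above, the sketch simply leaves the final prefix ("for good") out of the table instead of storing it with an empty suffix.

```python
from collections import defaultdict


def build_table(text, n):
    """Map each n-word prefix (as a tuple) to the list of words that follow it."""
    words = text.split()
    table = defaultdict(list)
    # Slide a window of n + 1 words: the first n form the prefix,
    # the last one is recorded as a possible suffix.
    for i in range(len(words) - n):
        prefix = tuple(words[i:i + n])
        table[prefix].append(words[i + n])
    return dict(table)


table = build_table("now he is gone she said he is gone for good", 2)
for prefix, suffixes in table.items():
    print(" ".join(prefix), "->", ", ".join(suffixes))
```

Keeping duplicate suffixes in the list (as in "he is -> gone, gone") means a later random pick is weighted by how often each suffix followed the prefix.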

To generate the final text, choose a random PREFIX and take one of its SUFFIXes (at random if it has more than one). Form the new PREFIX by dropping the first word of the old one and appending the chosen SUFFIX, then repeat until the text is complete.

Following our simple example, N = 2, 8 words:

random prefix: gone she
suffix: said
new prefix: she + said
new suffix: he
new prefix: said + he
new suffix: is
... and so on

gone she said he is gone she said
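
The walk shown above might look like the following in Python. It is only a sketch under stated assumptions: the name generate and its rng parameter are hypothetical, and the table is copied by hand from the worked example (with the final prefix "for good" omitted, so reaching it ends the output early).

```python
import random


def generate(table, length, rng=random):
    """Walk the prefix table until `length` words exist or a dead end is reached."""
    prefix = rng.choice(list(table))       # random starting prefix
    output = list(prefix)
    while len(output) < length:
        suffixes = table.get(prefix)
        if not suffixes:                   # no known follower for this prefix: stop
            break
        word = rng.choice(suffixes)
        output.append(word)
        prefix = prefix[1:] + (word,)      # drop the oldest word, append the new one
    return " ".join(output[:length])


# Table from the worked example (N = 2).
table = {
    ("now", "he"): ["is"],
    ("he", "is"): ["gone", "gone"],
    ("is", "gone"): ["she", "for"],
    ("gone", "she"): ["said"],
    ("she", "said"): ["he"],
    ("said", "he"): ["is"],
    ("gone", "for"): ["good"],
}
print(generate(table, 8))
```

Because the starting prefix and each suffix are picked at random, the printed text differs from run to run.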

The bigger the training text, the better the results. You can try this text here: alice_oz.txt

Create a program that can handle keys of any size (keys shorter than 2 words would probably produce fairly random text) and can produce output text of any length. You will probably want to pass those numbers to your program as parameters, something like: markov( "text.txt", 3, 300 )
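
One possible shape for such an entry point, sketched in Python, is given here. The parameter names (path, key_size, length), the error messages, and the command-line form in the comment are all assumptions made for illustration. The task itself only suggests the call markov( "text.txt", 3, 300 ).

```python
import random
import sys
from collections import defaultdict


def markov(path, key_size, length):
    """Return up to `length` words generated from the text file at `path`."""
    if key_size < 1:
        raise ValueError("key_size must be at least 1")
    with open(path, encoding="utf-8") as f:
        words = f.read().split()
    if len(words) <= key_size:
        raise ValueError("training text is too short for this key size")

    # Build the prefix -> suffixes table for the requested key size.
    table = defaultdict(list)
    for i in range(len(words) - key_size):
        table[tuple(words[i:i + key_size])].append(words[i + key_size])

    # Start from a random prefix and extend it one word at a time.
    prefix = random.choice(list(table))
    output = list(prefix)
    while len(output) < length and prefix in table:
        word = random.choice(table[prefix])
        output.append(word)
        prefix = prefix[1:] + (word,)
    return " ".join(output[:length])


if __name__ == "__main__":
    # Hypothetical command line: python markov_sketch.py text.txt 3 300
    if len(sys.argv) == 4:
        print(markov(sys.argv[1], int(sys.argv[2]), int(sys.argv[3])))
```

Passing the key size and output length as arguments keeps the same code usable for any N, including N = 1.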




C++

In this implementation, repeated suffixes are not stored: each suffix appears at most once per prefix.

 
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>
#include <map>
class markov {
public:
void create( const std::string& file, int keyLen, int words ) {
std::ifstream f( file.c_str(), std::ios_base::in );
fileBuffer = std::string( ( std::istreambuf_iterator<char>( f ) ), std::istreambuf_iterator<char>() );
f.close();
if( fileBuffer.length() < 1 ) return;
createDictionary( keyLen );
createText( words - keyLen );
}
private:
void createText( int w ) {
std::string key, first, second;
size_t next, pos;
std::map<std::string, std::vector<std::string> >::iterator it = dictionary.begin();
std::advance( it, rand() % dictionary.size() );
key = ( *it ).first;
std::cout << key;
while( true ) {
std::vector<std::string> d = dictionary[key];
if( d.size() < 1 ) break;
second = d[rand() % d.size()];
if( second.length() < 1 ) break;
std::cout << " " << second;
if( --w < 0 ) break;
next = key.find_first_of( 32, 0 );
first = key.substr( next + 1 );
key = first + " " + second;
}
std::cout << "\n";
}
void createDictionary( int kl ) {
std::string w1, key;
size_t wc = 0, pos, textPos, next;
next = fileBuffer.find_first_not_of( 32, 0 );
if( next == std::string::npos ) return;
while( wc < kl ) {
pos = fileBuffer.find_first_of( ' ', next );
w1 = fileBuffer.substr( next, pos - next );
key += w1 + " ";
next = fileBuffer.find_first_not_of( 32, pos + 1 );
if( next == std::string::npos ) return;
wc++;
}
key = key.substr( 0, key.size() - 1 );
while( true ) {
next = fileBuffer.find_first_not_of( 32, pos + 1 );
if( next == std::string::npos ) return;
pos = fileBuffer.find_first_of( 32, next );
w1 = fileBuffer.substr( next, pos - next );
if( w1.size() < 1 ) break;
if( std::find( dictionary[key].begin(), dictionary[key].end(), w1 ) == dictionary[key].end() )
dictionary[key].push_back( w1 );
key = key.substr( key.find_first_of( 32 ) + 1 ) + " " + w1;
}
}
std::string fileBuffer;
std::map<std::string, std::vector<std::string> > dictionary;
};
int main( int argc, char* argv[] ) {
srand( unsigned( time( 0 ) ) );
markov m;
m.create( std::string( "alice_oz.txt" ), 3, 200 );
return 0;
}
 
Output:
March Hare had just upset the milk-jug into his plate. Alice did not dare to 
disobey, though she felt sure it would all come wrong, and she went on. Her 
listeners were perfectly quiet till she got to the part about her repeating 
'You are old, Father William,' said the Caterpillar. 'Well, I've tried to say 
How doth the little crocodile Improve his shining tail, And pour the waters of 
the Nile On every golden scale! 'How cheerfully he seems to grin, How neatly 
spread his claws, And welcome little fishes in With gently smiling jaws!' 
'I'm sure those are not the right words,' said poor Alice, and her eyes filled 
with tears again as she went slowly after it: 'I never was so small as this before, 
never! And I declare it's too bad, that it is!' As she said this she looked 
down into its face in some alarm. This time there were three gardeners at it, 
busily painting them red. Alice thought this a very difficult game indeed. 
The players all played at once without waiting for the end of me. But the 
tinsmith happened to come along, and he made me a body of tin, fastening my 
tin arms and


D

Translation of: Kotlin
import std.file;
import std.random;
import std.range;
import std.stdio;
import std.string;
 
string markov(string filePath, int keySize, int outputSize) {
if (keySize < 1) throw new Exception("Key size can't be less than 1");
auto words = filePath.readText().chomp.split;
if (outputSize < keySize || words.length < outputSize) {
throw new Exception("Output size is out of range");
}
string[][string] dict;
 
foreach (i; 0..words.length-keySize+1) { // inclusive upper bound, as in the Kotlin original, so the last prefix gets an empty suffix
auto key = words[i..i+keySize].join(" ");
string value;
if (i+keySize<words.length) {
value = words[i+keySize];
}
if (key !in dict) {
dict[key] = [value];
} else {
dict[key] ~= value;
}
}
 
string[] output;
auto n = 0;
auto rn = uniform(0, dict.length);
auto prefix = dict.keys[rn];
output ~= prefix.split;
 
while (true) {
auto suffix = dict[prefix];
if (suffix.length == 1) {
if (suffix[0] == "") return output.join(" ");
output ~= suffix[0];
} else {
rn = uniform(0, suffix.length);
output ~= suffix[rn];
}
if (output.length >= outputSize) return output.take(outputSize).join(" ");
n++;
prefix = output[n .. n+keySize].join(" ");
}
}
 
void main() {
writeln(markov("alice_oz.txt", 3, 200));
}
Output:
neighbour to tell him. 'A nice muddle their slates'll be in before the trial's over!' thought Alice. One of the jurors had a pencil that squeaked. This of course, Alice could not think of any good reason, and as the monster crawls through the forest he seizes an animal with a leg and drags it to his ear. Alice considered a little, and then said 'The fourth.' 'Two days wrong!' sighed the Hatter. 'I deny it!' said the March Hare. 'Sixteenth,' added the Dormouse. 'Write that down,' the King replied. Here the other guinea-pig cheered, and was immediately suppressed by the officers of the court, all dressed in green clothes and had greenish skins. They looked at Dorothy again. Why should I do this for you? asked the Scarecrow. You are quite welcome to take my head off, as long as the tiger had said, and its body covered with coarse black hair. It had a great longing to have for her own the Silver Shoes had fallen off in her flight through the air, and on the morning of the second day I awoke and found the oil-can, and then she had to stop and untwist it. After a

Go

package main
 
import (
"bufio"
"flag"
"fmt"
"io"
"log"
"math/rand"
"os"
"strings"
"time"
"unicode"
"unicode/utf8"
)
 
func main() {
log.SetFlags(0)
log.SetPrefix("markov: ")
input := flag.String("in", "alice_oz.txt", "input file")
n := flag.Int("n", 2, "number of words to use as prefix")
runs := flag.Int("runs", 1, "number of runs to generate")
wordsPerRun := flag.Int("words", 300, "number of words per run")
startOnCapital := flag.Bool("capital", false, "start output with a capitalized prefix")
stopAtSentence := flag.Bool("sentence", false, "end output at a sentence ending punctuation mark (after n words)")
flag.Parse()
 
rand.Seed(time.Now().UnixNano())
 
m, err := NewMarkovFromFile(*input, *n)
if err != nil {
log.Fatal(err)
}
 
for i := 0; i < *runs; i++ {
err = m.Output(os.Stdout, *wordsPerRun, *startOnCapital, *stopAtSentence)
if err != nil {
log.Fatal(err)
}
fmt.Println()
}
}
 
// We'd like to use a map of []string -> []string (i.e. list of prefix
// words -> list of possible next words) but Go doesn't allow slices to be
// map keys.
//
// We could use arrays, e.g. map of [2]string -> []string, but array lengths
// are fixed at compile time. To work around that we could set a maximum value
// for n, say 8 or 16, and waste the extra array slots for smaller n.
//
// Or we could make the suffix map key just be the full prefix string. Then
// to get the words within the prefix we could either have a separate map
// (i.e. map of string -> []string) for the full prefix string -> the list
// of the prefix words. Or we could use strings.Fields() and strings.Join() to
// go back and forth (trading more runtime for less memory use).
 
// Markov is a Markov chain text generator.
type Markov struct {
n int
capitalized int // number of suffix keys that start capitalized
suffix map[string][]string
}
 
// NewMarkovFromFile initializes the Markov text generator
// with window `n` from the contents of `filename`.
func NewMarkovFromFile(filename string, n int) (*Markov, error) {
f, err := os.Open(filename)
if err != nil {
return nil, err
}
defer f.Close() // nolint: errcheck
return NewMarkov(f, n)
}
 
// NewMarkov initializes the Markov text generator
// with window `n` from the contents of `r`.
func NewMarkov(r io.Reader, n int) (*Markov, error) {
m := &Markov{
n: n,
suffix: make(map[string][]string),
}
sc := bufio.NewScanner(r)
sc.Split(bufio.ScanWords)
window := make([]string, 0, n)
for sc.Scan() {
word := sc.Text()
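if len(window) > 0 {
prefix := strings.Join(window, " ")
// Count each capitalized prefix once, when its key is first created,
// so m.capitalized matches the number of capitalized map keys.
if _, seen := m.suffix[prefix]; !seen && isCapitalized(prefix) {
m.capitalized++
}
m.suffix[prefix] = append(m.suffix[prefix], word)
//log.Printf("%20q -> %q", prefix, m.suffix[prefix])
}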
window = appendMax(n, window, word)
}
if err := sc.Err(); err != nil {
return nil, err
}
return m, nil
}
 
// Output writes generated text of approximately `n` words to `w`.
// If `startCapital` is true it picks a starting prefix that is capitalized.
// If `stopSentence` is true it continues after `n` words until it finds
// a suffix ending with sentence ending punctuation ('.', '?', or '!').
func (m *Markov) Output(w io.Writer, n int, startCapital, stopSentence bool) error {
// Use a bufio.Writer both for buffering and for simplified
// error handling (it remembers any error and turns all future
// writes/flushes into NOPs returning the same error).
bw := bufio.NewWriter(w)
 
var i int
if startCapital {
i = rand.Intn(m.capitalized)
} else {
i = rand.Intn(len(m.suffix))
}
var prefix string
for prefix = range m.suffix {
if startCapital && !isCapitalized(prefix) {
continue
}
if i == 0 {
break
}
i--
}
 
bw.WriteString(prefix) // nolint: errcheck
prefixWords := strings.Fields(prefix)
n -= len(prefixWords)
 
for {
suffixChoices := m.suffix[prefix]
if len(suffixChoices) == 0 {
break
}
i = rand.Intn(len(suffixChoices))
suffix := suffixChoices[i]
//log.Printf("prefix: %q, suffix: %q (from %q)", prefixWords, suffix, suffixChoices)
bw.WriteByte(' ') // nolint: errcheck
if _, err := bw.WriteString(suffix); err != nil {
break
}
n--
if n < 0 && (!stopSentence || isSentenceEnd(suffix)) {
break
}
 
prefixWords = appendMax(m.n, prefixWords, suffix)
prefix = strings.Join(prefixWords, " ")
}
return bw.Flush()
}
 
func isCapitalized(s string) bool {
// We can't just look at s[0], which is the first *byte*,
// if we want to support arbitrary Unicode input.
// This still doesn't support combining runes :(.
r, _ := utf8.DecodeRuneInString(s)
return unicode.IsUpper(r)
}
 
func isSentenceEnd(s string) bool {
r, _ := utf8.DecodeLastRuneInString(s)
// Unfortunately, Unicode doesn't seem to provide
// a test for sentence ending punctuation :(.
//return unicode.IsPunct(r)
return r == '.' || r == '?' || r == '!'
}
 
func appendMax(max int, slice []string, value string) []string {
// Often FIFO queues in Go are implemented via:
// fifo = append(fifo, newValues...)
// and:
// fifo = fifo[numberOfValuesToRemove:]
//
// However, the append will periodically reallocate and copy. Since
// we're dealing with a small number (usually two) of strings and we
// only need to append a single new string it's better to (almost)
// never reallocate the slice and just copy n-1 strings (which only
// copies n-1 pointers, not the entire string contents) every time.
if len(slice)+1 > max {
n := copy(slice, slice[1:])
slice = slice[:n]
}
return append(slice, value)
}
Output  —  for input -n=3 -runs=3 -words=12 -capital -sentence:
Alice could speak again. The Mock Turtle went on at last, more calmly, though still sobbing a little now and then, and holding it to his ear.
Don't you suppose we could rescue them? asked the girl anxiously. Oh, no.
City must wear spectacles night and day. Now they are all set free, and are grateful to you for help.

J

This seems to be reasonably close to the specification:

require'web/gethttp'
 
setstats=:dyad define
'plen slen limit'=: x
txt=. gethttp y
letters=. (tolower~:toupper)txt
NB. apostrophes have letters on both sides
apostrophes=. (_1 |.!.0 letters)*(1 |.!.0 letters)*''''=txt
parsed=. <;._1 ' ',deb ' ' (I.-.letters+apostrophes)} tolower txt
words=: ~.parsed
corpus=: words i.parsed
prefixes=: ~.plen]\corpus
suffixes=: ~.slen]\corpus
ngrams=. (plen+slen)]\corpus
pairs=. (prefixes i. plen{."1 ngrams),. suffixes i. plen}."1 ngrams
stats=: (#/.~pairs) (<"1~.pairs)} (prefixes ,&# suffixes)$0
weights=: +/\"1 stats
totals=: (+/"1 stats),0
i.0 0
)
 
genphrase=:3 :0
pren=. #prefixes
sufn=. #suffixes
phrase=. (?pren) { prefixes
while. limit > #phrase do.
p=. prefixes i. (-plen) {. phrase
t=. p { totals
if. 0=t do. break.end. NB. no valid matching suffix
s=. (p { weights) I. ?t
phrase=. phrase, s { suffixes
end.
 ;:inv phrase { words
)
Output:
   2 1 50 setstats 'http://paulo-jorente.de/text/alice_oz.txt'
genphrase''
got in as alice alice
genphrase''
perhaps even alice
genphrase''
pretty milkmaid alice

And, using 8 word suffixes (but limiting results to a bit over 50 words):

Output:
   2 8 50 setstats 'http://paulo-jorente.de/text/alice_oz.txt'
genphrase''
added it alice was beginning to get very tired of this i vote the young lady tells us alice was beginning to get very tired of being such a tiny little thing it did not take her long to find the one paved with yellow bricks within a short time
genphrase''
the raft through the water they got along quite well alice was beginning to get very tired of this i vote the young lady tells us alice was beginning to get very tired of being all alone here as she said this last word two or three times over to
genphrase''
gown that alice was beginning to get very tired of sitting by her sister on the bank and alice was beginning to get very tired of being such a tiny little thing it did so indeed and much sooner than she had accidentally upset the week before oh i beg

(see talk page for discussion of odd line wrapping with some versions of Safari)

Java

Translation of: Kotlin
Works with: Java version 8
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Random;
 
public class MarkovChain {
private static Random r = new Random();
 
private static String markov(String filePath, int keySize, int outputSize) throws IOException {
if (keySize < 1) throw new IllegalArgumentException("Key size can't be less than 1");
Path path = Paths.get(filePath);
byte[] bytes = Files.readAllBytes(path);
String[] words = new String(bytes).trim().split(" ");
if (outputSize < keySize || outputSize >= words.length) {
throw new IllegalArgumentException("Output size is out of range");
}
Map<String, List<String>> dict = new HashMap<>();
 
for (int i = 0; i <= (words.length - keySize); ++i) { // inclusive, as in the Kotlin original, so the last prefix gets an empty suffix
StringBuilder key = new StringBuilder(words[i]);
for (int j = i + 1; j < i + keySize; ++j) {
key.append(' ').append(words[j]);
}
String value = (i + keySize < words.length) ? words[i + keySize] : "";
if (!dict.containsKey(key.toString())) {
ArrayList<String> list = new ArrayList<>();
list.add(value);
dict.put(key.toString(), list);
} else {
dict.get(key.toString()).add(value);
}
}
 
int n = 0;
int rn = r.nextInt(dict.size());
String prefix = (String) dict.keySet().toArray()[rn];
List<String> output = new ArrayList<>(Arrays.asList(prefix.split(" ")));
 
while (true) {
List<String> suffix = dict.get(prefix);
if (suffix.size() == 1) {
if (Objects.equals(suffix.get(0), "")) return String.join(" ", output);
output.add(suffix.get(0));
} else {
rn = r.nextInt(suffix.size());
output.add(suffix.get(rn));
}
if (output.size() >= outputSize) return String.join(" ", output.subList(0, outputSize));
n++;
prefix = output.stream().skip(n).limit(keySize).reduce("", (a, b) -> a + " " + b).trim();
}
}
 
public static void main(String[] args) throws IOException {
System.out.println(markov("alice_oz.txt", 3, 200));
}
}
Output:
Emerald City is? Certainly, answered the Queen; but it is a lucky thing I am not, for my mouth is only painted, and if I ever get my heart? Or I my courage? asked the Lion in surprise, as he watched her pick up the Scarecrow and said: Come with me, for Oz has made me its ruler and the people gazed upon it with much curiosity. The Tin Woodman appeared to think deeply for a moment. Then he said, The Winkies were sorry to have them go, and they had grown so large in the last few minutes that she wasn't a bit afraid of interrupting him,) 'I'll give him sixpence. I don't believe there's an atom of meaning in it.' The jury all wrote down on their slates, and then added them up, and reduced the answer to it?' said the Mock Turtle. 'No, no! The adventures first,' said the Gryphon hastily. 'Go on with the next verse,' the Gryphon repeated impatiently: 'it begins I passed by his garden, and marked, with one eye,

Julia

Works with: Julia version 0.6
function markovtext(txt::AbstractString, klen::Integer, maxchlen::Integer)
words = matchall(r"\w+", txt)
dict = Dict()
for i in 1:length(words)-klen
k = join(words[i:i+klen-1], " ")
v = words[i+klen]
if haskey(dict, k)
dict[k] = push!(dict[k], v)
else
dict[k] = [v]
end
end
keytext = rand(collect(keys(dict)))
outtext = keytext
lasttext = outtext
while length(outtext) < maxchlen
lasttext = outtext
valtext = rand(dict[keytext])
outtext = outtext * " " * valtext
keytext = replace(keytext, r"^\w+\s+(.+)", s"\1") * " " * valtext
end
return lasttext
end
 
txt = readstring(download("http://paulo-jorente.de/text/alice_oz.txt"))
println(markovtext(txt, 3, 200))
Output:
Wizard could take upon himself and the Lion looking around him with joy Never have I seen a more
beautiful place It seems gloomy said the Scarecrow If it required brains to figure it out I never

Kotlin

// version 1.1.51
 
import java.io.File
import java.util.Random
 
val r = Random()
 
fun markov(filePath: String, keySize: Int, outputSize: Int): String {
if (keySize < 1) throw IllegalArgumentException("Key size can't be less than 1")
val words = File(filePath).readText().trimEnd().split(' ')
if (outputSize !in keySize..words.size) {
throw IllegalArgumentException("Output size is out of range")
}
val dict = mutableMapOf<String, MutableList<String>>()
 
for (i in 0..(words.size - keySize)) {
val key = words.subList(i, i + keySize).joinToString(" ")
val value = if (i + keySize < words.size) words[i + keySize] else ""
if (!dict.containsKey(key))
dict.put(key, mutableListOf(value))
else
dict[key]!!.add(value)
}
 
val output = mutableListOf<String>()
var n = 0
var rn = r.nextInt(dict.size)
var prefix = dict.keys.toList()[rn]
output.addAll(prefix.split(' '))
 
while (true) {
var suffix = dict[prefix]!!
if (suffix.size == 1) {
if (suffix[0] == "") return output.joinToString(" ")
output.add(suffix[0])
}
else {
rn = r.nextInt(suffix.size)
output.add(suffix[rn])
}
if (output.size >= outputSize) return output.take(outputSize).joinToString(" ")
n++
prefix = output.subList(n, n + keySize).joinToString(" ")
}
}
 
fun main(args: Array<String>) {
println(markov("alice_oz.txt", 3, 200))
}

Sample output:

but now, to her surprise, she found it was no longer green, but pure white. The ribbon around Toto's neck, and they started off in the right way. But at noon, when the sun was up, they started on their way, and soon saw a beautiful green glow in the sky just before them. On the other side of the garden, where Alice could see this, as she was near enough to look over their slates; 'but it doesn't matter which way you go,' said the King, looking round the court and got behind him, and very soon finished off the cake. 'Curiouser and curiouser!' cried Alice (she was so much disappointed; and the eyes winked again and looked upon her anxiously, as if the Great Oz himself, and driven him out of the room. The cook threw a frying-pan after her as she spoke. Alice did not quite know what to do next, when suddenly a White Rabbit with pink eyes ran close by her. There was nothing else to do, so Alice soon began talking again. 'Dinah'll miss me very much to-night, I should think!' (Dinah was the cat.) 'I hope they'll remember her saucer of milk at

Lua

Not sure whether this is correct, but I am sure it is quite inefficient. Also not written very nicely.

Computes keys of all lengths <= N. During text generation, if a key does not exist in the dictionary, its first (least recent) word is dropped, repeatedly, until a matching key is found; if no key matches at all, the program terminates.

local function pick(t)
local i = math.random(#t) -- integer in [1, #t]; avoids index 0 when math.random() returns 0
return t[i]
end
 
local n_prevs = tonumber(arg[1]) or 2
local n_words = tonumber(arg[2]) or 8
 
local dict, wordset = {}, {}
local prevs, pidx = {}, 1
 
local function add(word) -- add new word to dictionary
local prev = ''
local i, len = pidx, #prevs
 
for _ = 1, len do
i = i - 1
if i == 0 then i = len end
 
if prev ~= '' then prev = ' ' .. prev end
prev = prevs[i] .. prev
local t = dict[prev]
if not t then
t = {}
dict[prev] = t
end
t[#t+1] = word
end
end
 
for line in io.lines() do
for word in line:gmatch("%S+") do
wordset[word] = true
add(word)
prevs[pidx] = word
pidx = pidx + 1; if pidx > n_prevs then pidx = 1 end
end
end
add('')
 
local wordlist = {}
for word in pairs(wordset) do
wordlist[#wordlist+1] = word
end
wordset = nil
 
math.randomseed(os.time())
math.randomseed(os.time() * math.random())
local word = pick(wordlist)
local prevs, cnt = '', 0
 
--[[ print the dictionary
for prevs, nexts in pairs(dict) do
io.write(prevs, ': ')
for _,word in ipairs(nexts) do
io.write(word, ' ')
end
io.write('\n')
end
]]

 
for i = 1, n_words do
io.write(word, ' ')
 
if cnt < n_prevs then
cnt = cnt + 1
else
local i = prevs:find(' ')
if i then prevs = prevs:sub(i+1) end
end
if prevs ~= '' then prevs = prevs .. ' ' end
prevs = prevs .. word
 
local cprevs = ' ' .. prevs
local nxt_words
repeat
local i = cprevs:find(' ')
if not i then break end
cprevs = cprevs:sub(i+1)
if DBG then io.write('\x1b[2m', cprevs, '\x1b[m ') end
nxt_words = dict[cprevs]
until nxt_words
 
if not nxt_words then break end
word = pick(nxt_words)
end
io.write('\n')
 
Output:
> ./markov.lua <alice_oz.txt 3 200
hugged the soft, stuffed body of the Scarecrow in her arms instead of kissing his
painted face, and found she was crying herself at this sorrowful parting from her
loving comrades. Glinda the Good stepped down from her ruby throne to give the
prizes?' quite a chorus of voices asked. 'Why, she, of course,' said the Dodo,
pointing to Alice with one finger; and the whole party look so grave and
anxious.) Alice could think of nothing else to do, and perhaps after all it might
tell her something worth hearing. For some minutes it puffed away without
speaking, but at last it sat down a good way off, panting, with its tongue
hanging out of its mouth again, and said, 'So you think you're changed, do you?'
'I'm afraid I don't know one,' said Alice, rather alarmed at the proposal. 'Then
the Dormouse shall!' they both cried. 'Wake up, Dormouse!' And they pinched it
on both sides at once. The Dormouse slowly opened his eyes. 'I wasn't asleep,' he
said in a low voice, 'Why the fact is, you see, Miss, we're doing our best, afore
she comes, to-' At this moment Five, who had been greatly interested in

Perl

$file = shift || 'alice_oz.txt';
$n = shift || 3;
$max = shift || 200;
 
sub build_dict {
my($n, @words) = @_;
my %dict;
for $i (0 .. @words-$n) {
my @prefix;
push @prefix, $words[$i+$_] for 0..$n-1;
push @{$dict{ join ' ', @prefix }}, $words[$i+$n];
}
return %dict;
}
 
sub pick1 { return @_[ rand @_ ] }
 
open F, "<$file"; my $text = <F>; close F;
 
my @words = split ' ', $text;
%dict = build_dict($n, @words);
 
print join ' ', @rotor = @words[0..$n-1];
for (1..$max) {
print ' ' . ($new = pick1( @{$dict{join ' ', @rotor}} ));
shift @rotor;
push @rotor, $new;
}
Output:
Alice was thoroughly puzzled. 'Does the boots and shoes,' the Gryphon whispered in a fight with another hedgehog, which seemed to extend to the South Country? To see the Great Oz was ready to sink into the garden, where Alice could see or feel was the end of you, as she chose, she ran after her. 'I've something important to say!' This sounded promising, certainly: Alice turned and walked through the forest very thick on this side, and it seemed to Alice as she chose, she ran out and shone brightly. So they all spoke at once, I'll chop it down, and the Dormouse sulkily remarked, 'If you please, sir-' The Rabbit started violently, dropped the white kid gloves and the four travelers walked up to the Land of Oz in less than no time to think about stopping herself before she made a dreadfully ugly child: but it is a man,' said the Stork, as she spoke. 'I must be a person of authority among them, called out, 'Sit down, all of them expected to come out of breath, and till the Pigeon in a sorrowful tone; 'at least there's no use to me they flew away with me,' thought Alice,

Perl 6

Works with: rakudo version 2017-01
unit sub MAIN ( :$text=$*IN, :$n=2, :$words=100, );
 
sub add-to-dict ( $text, :$n=2, ) {
my @words = $text.words;
my @prefix = @words.rotor: $n => -$n+1;
 
(%).push: @prefix Z=> @words[$n .. *]
}
 
my %dict = add-to-dict $text, :$n;
my @start-words = %dict.keys.pick.words;
my @generated-text = lazy |@start-words, { %dict{ "@_[ *-$n .. * ]" }.pick } ...^ !*.defined;
 
put @generated-text.head: $words;
 
>perl6 markov.p6 <alice_oz.txt --n=3 --words=200
Scarecrow. He can't hurt the straw. Do let me carry that basket for you. I shall not mind it, for I can't get tired. I'll tell you what I think, said the little man. Give me two or three pairs of tiny white kid gloves: she took up the fan and gloves, and, as the Lory positively refused to tell its age, there was no use in saying anything more till the Pigeon had finished. 'As if it wasn't trouble enough hatching the eggs,' said the Pigeon; 'but I must be very careful. When Oz gives me a heart of course I needn't mind so much. They were obliged to camp out that night under a large tree in the wood,' continued the Pigeon, raising its voice to a whisper. He is more powerful than they themselves, they would surely have destroyed me. As it was, I lived in deadly fear of them for many years; so you can see for yourself. Indeed, a jolly little clown came walking toward them, and Dorothy could see that in spite of all her coaxing. Hardly knowing what she did, she picked up a little bit of stick, and held it out to

Phix

This was fun! (easy, but fun)

integer fn = open("alice_oz.txt","rb")
string text = get_text(fn)
close(fn)
sequence words = split(text)
 
function markov(integer n, m)
integer dict = new_dict(), ki
sequence key, data, res
string suffix
for i=1 to length(words)-n do
key = words[i..i+n-1]
suffix = words[i+n]
ki = getd_index(key,dict)
if ki=0 then
data = {}
else
data = getd_by_index(ki,dict)
end if
setd(key,append(data,suffix),dict)
end for
integer start = rand(length(words)-n)
key = words[start..start+n-1]
res = key
for i=1 to m do
ki = getd_index(key,dict)
if ki=0 then exit end if
data = getd_by_index(ki,dict)
suffix = data[rand(length(data))]
res = append(res,suffix)
key = append(key[2..$],suffix)
end for
return join(res)
end function
 
?markov(2,100)
Output:

from the alice_oz.txt file:

"serve me a heart, said the Gryphon. \'Then, you know,\' Alice gently remarked; \'they\'d have been ill.\' \'So they were,\' said the
Lion. One would almost suspect you had been running too long. They found the way to send me back to the imprisoned Lion; but every day 
she came upon a green velvet counterpane. There was a long sleep you\'ve had!\' \'Oh, I\'ve had such a capital one for catching mice-oh, 
I beg your pardon!\' cried Alice hastily, afraid that she was shrinking rapidly; so she felt lonely among all these strange people. Her 
tears seemed to Alice a good dinner."

Python

Markov text generator - Python implementation.
Usage: markov.py source.txt context length

#Import libraries.
import sys
import random


def readdata(file):
    '''Read file and return contents.'''
    with open(file) as f:
        contents = f.read()
    return contents


def makerule(data, context):
    '''Make a rule dict for given data.'''
    rule = {}
    words = data.split(' ')
    index = context

    for word in words[index:]:
        key = ' '.join(words[index-context:index])
        if key in rule:
            rule[key].append(word)
        else:
            rule[key] = [word]
        index += 1

    return rule


def makestring(rule, length):
    '''Use a given rule to make a string.'''
    oldwords = random.choice(list(rule.keys())).split(' ') #random starting words
    string = ' '.join(oldwords) + ' '

    for i in range(length):
        try:
            key = ' '.join(oldwords)
            newword = random.choice(rule[key])
            string += newword + ' '

            for word in range(len(oldwords)):
                oldwords[word] = oldwords[(word + 1) % len(oldwords)]
            oldwords[-1] = newword

        except KeyError:
            return string
    return string


if __name__ == '__main__':
    data = readdata(sys.argv[1])
    rule = makerule(data, int(sys.argv[2]))
    string = makestring(rule, int(sys.argv[3]))
    print(string)
Output:
marry the pretty milkmaid was much pleased to have her little dog free. The other trees of
the Gates they had at the rapid flight of the castle to yourself. I have my shoulders got
to? And oh, I wish you were me?' 'Well, perhaps not,' said the Scarecrow fell off the
cake.'Curiouser and curiouser!' cried Alice (she was obliged to say to this: so she set to
work, so you have killed the Wicked Witch in all their simple joys, remembering her own
the Silver Shoes, began to work; and he did open his mouth, for his release, for he was
obliged to go on with the Wizard. What shall we do? asked the Witch, sinking her voice
sounded hoarse and strange, and the reason is-' here the country of the land of the court,
arm-in-arm with the name 'Alice!' 'Here!' cried Alice, quite forgetting that she could
remember them, all these strange people. Her tears seemed to be full of tears, until there
was no time to think about stopping herself before she gave her answer. 'They're done with
a deep groan near by. What was that? she asked the girl. Again the eyes move and the Tin
Woodman can chop it down, and felt quite strange at first; but she laughed heartily at the
Scarecrow. The Witch did not have lived much under the window, engaged in a grieved tone;
you're a humbug? asked Dorothy. A balloon, said Oz, for I have no right to command them
once

REXX

/*REXX program produces a Markov chain text from a training text using a text generator.*/
parse arg ord fin iFID seed . /*obtain optional arguments from the CL*/
if ord=='' | ord=="," then ord= 3 /*Not specified? Then use the default.*/
if fin=='' | fin=="," then fin= 300 /* " " " " " " */
if iFID=='' |iFID=="," then iFID='alice_oz.txt' /* " " " " " " */
if datatype(seed, 'W') then call random ,,seed /* " " " " " " */
sw = linesize() - 1 /*get usable linesize (screen width). */
$= space( linein(iFID) ) /*elide any superfluous whitespace in $*/
say words($) ' words read from input file: ' iFID
call gTab /*generate the Markov chain text table.*/
call gTxt /*generate the Markov chain text. */
call show /*display formatted output and a title.*/
exit /*stick a fork in it, we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
gTab: @.=; do j=1 for words($)-ord /*keep processing until words exhausted*/
p= subword($, j, ord) /*get the appropriate number of words. */
@.p= @.p word($, j + ord) /*get a prefix & 1 (of sev.?) suffixes.*/
end /*j*/
#= j-1 /*define the number of prefixes. */
return
/*──────────────────────────────────────────────────────────────────────────────────────*/
gTxt: mc=; do until words(mc)>=fin /*build Markov chain text until enough.*/
y= subword($, random(1, #), ord) /*obtain appropriate number of words. */
s= @.y; w= words(s) /*get a suffix for a word set; # words.*/
if w>1 then s= word(s,random(1,w)) /*pick random word in the set of words.*/
mc= mc y s /*add a prefix and suffix to the output*/
end /*until*/
return
/*──────────────────────────────────────────────────────────────────────────────────────*/
show: say center('Markov chain text', sw, "═") /*display the title for the output. */
g= word(mc, 1) /*generate lines of Markov chain text. */
do k=2 to words(mc) /*build output lines word by word. */
_= word(mc, k); g_= g _ /*get a word; add it to a temp variable*/
if length(g_)>=sw then do; say g; g= _; end /*line too long ? */
else g= g_ /*line OK so far. */
end /*k*/
if g\=='' then say g /*There any residual? Then display it.*/
return /* [↑] limits G to terminal width.*/
output   when using the default inputs:
════════════════════════════════════════════════════════Markov chain text════════════════════════════════════════════════════════
great Head? That was fell out of the the Caterpillar. 'Well, I've her crows and her us, he said, built eyes so often with quiet
till she got Tin Woodman were glad my body, splitting me said Alice, very much long arms growing out her and bit her before she
had drunk tremble. Quick! cried the a low, trembling voice. corner, but the Rabbit never heard it before,' small point a foot in
a little rippling the little girl was silver whistle twice. Straightway in groups. But the green leaves that lay well! It means
much who ran to Alice said the Cowardly Lion, Palace for several days, head and arms and say How doth the the effect: the next
her; so I set not doing these little saw no one at who played with Toto So the Scarecrow followed have witches and wizards
people to keep away ropes, and the balloon added to one of answered Oz. But I if you don't mind, for them. They came her in any
way. that it might happen Hatter said, turning to at the water's edge. anything to say, she Are you going with it a kind heart?
still in existence; 'and of speaking to a of tin. After this was as big as From the Land of it with hot air. The country here is
and frightening my cow? a coward. Have you upon it.) 'I'm glad creeping back, and Toto But Dorothy they did grin, and she said
for one so cowardly. but I think I dear! Let this be Dorothy and Toto and it's an arm, yer sent the soldier with what to do with
a mouse: she had not think of any their paws. 'And how I will fight them there were two little Making believe! cried Dorothy.

Swift

Translation of: Python
Works with: Swift version 4.2
import Foundation

func makeRule(input: String, keyLength: Int) -> [String: [String]] {
  let words = input.components(separatedBy: " ")
  var rules = [String: [String]]()
  var i = keyLength

  for word in words[i...] {
    let key = words[i-keyLength..<i].joined(separator: " ")

    rules[key, default: []].append(word)

    i += 1
  }

  return rules
}

func makeString(rule: [String: [String]], length: Int) -> String {
  var oldWords = rule.keys.randomElement()!.components(separatedBy: " ")
  var string = oldWords.joined(separator: " ") + " "

  for _ in 0..<length {
    let key = oldWords.joined(separator: " ")
    guard let newWord = rule[key]?.randomElement() else { return string }

    string += newWord + " "

    for ii in 0..<oldWords.count {
      oldWords[ii] = oldWords[(ii + 1) % oldWords.count]
    }

    oldWords[oldWords.index(before: oldWords.endIndex)] = newWord
  }

  return string
}

let inputLoc = CommandLine.arguments.dropFirst().first!
let input = FileManager.default.contents(atPath: inputLoc)!
let inputStr = String(data: input, encoding: .utf8)!
let rule = makeRule(input: inputStr, keyLength: 3)
let str = makeString(rule: rule, length: 300)

print(str)
Output:
$ ./main /path/to/alice_oz.txt
with a crash, whereupon the Scarecrow's clothes fell out of the clouds to rule over us. Still, for many days they grieved over the loss of my heart. While I was in love I was the happiest man on earth; but no one came near them nor spoke to them because of the great beam the house rested on, two feet were sticking out, shod in silver shoes with pointed toes. Oh, dear! Oh, dear! cried Dorothy, clasping her hands together in dismay. The house must have fallen on her. Whatever shall we do? There is nothing to be done, I wonder?' As she said this, she looked up, but it was the first to break the silence. 'What day of the month is it?' he said, turning to Alice as it spoke. 'As wet as ever,' said Alice in a piteous tone. And she thought of herself, 'I wish the creatures wouldn't be so stingy about it, you know-' She had quite forgotten the Duchess by this time, and was going to dive in among the leaves, which she found to be nothing but the stars over them; and they rested very well indeed. In the morning they traveled on until they came to the great Throne Room, where he saw, sitting in the emerald throne, a most lovely Lady. She was dressed in a green uniform and wearing a long green beard. Here are strangers, said the Guardian of the Gates lived. This officer unlocked their spectacles to put them back in his great box, and then he struck at the Tin Woodman passed safely under it. Come on! he shouted to the others. These she also led to rooms, and each one of them can explain it,' said the King, 'and don't look at me like that!' He got behind