Knuth-Morris-Pratt string search

< About Knuth-Morris-Pratt String Search Algorithm >

const text1 = "InhisbookseriesTheArtofComputerProgrammingpublishedbyAddisonWesleyDKnuthusesanimaginarycomputertheMIXanditsassociatedmachinecodeandassemblylanguagestoillustratetheconceptsandalgorithmsastheyarepresented" const text2 = "Nearby farms grew a half acre of alfalfa on the dairy's behalf, with bales of all that alfalfa exchanged for milk." const pat1, pat2, pat3 = "put", "and", "alfalfa"

println("Found <$pat1> at: (one-based indices) ", kmp_search(text1, pat1)) println("Found <$pat2> at: (one-based indices) ", kmp_search(text1, pat2)) println("Found <$pat3> at: (one-based indices) ", kmp_search(text2, pat3))

</lang>

Output:

Found <put> at: (one-based indices) [27, 91]
Found <and> at: (one-based indices) [102, 129, 172]
Found <alfalfa> at: (one-based indices) [34, 88]

Wren

This is based on the code here. The examples used are the same as in the Boyer-Moore_string_search#Wren task. <lang ecmascript>class KMP {

   static search(haystack, needle) {
       haystack = haystack.bytes.toList
       needle = needle.bytes.toList
       var hc = haystack.count
       var nc = needle.count
       var indices = []
       var i = 0 // index into haystack
       var j = 0 // index into needle
       var t = table_(needle)
       while (i < hc) {
           if (needle[j] == haystack[i]) {
               i = i + 1
               j = j + 1
           }
           if (j == nc) {
               indices.add(i - j)
               j = t[j-1]
           } else if (i < hc && needle[j] != haystack[i]) {
               if (j != 0) {
                   j = t[j-1]
               } else {
                   i = i + 1
               }
           }
       }
       return indices
   }

   static table_(needle) {
       var nc = needle.count
       var t = List.filled(nc, 0)
       var i = 1   // index into table
       var len = 0 // length of previous longest prefix
       while (i < nc) {
           if (needle[i] == needle[len]) {
              len = len + 1
              t[i] = len
              i = i + 1
           } else if (len != 0) {
               len = t[len-1]
           } else {
               t[i] = 0
               i = i + 1
           }
       }
       return t
   }

}

var texts = [

   "GCTAGCTCTACGAGTCTA",
   "GGCTATAATGCGTA",
   "there would have been a time for such a word",
   "needle need noodle needle",

"InhisbookseriesTheArtofComputerProgrammingpublishedbyAddisonWesleyDKnuthusesanimaginarycomputertheMIXanditsassociatedmachinecodeandassemblylanguagestoillustratetheconceptsandalgorithmsastheyarepresented",

   "Nearby farms grew a half acre of alfalfa on the dairy's behalf, with bales of all that alfalfa exchanged for milk."

] var pats = ["TCTA", "TAATAAA", "word", "needle", "put", "and", "alfalfa"] for (i in 0...texts.count) System.print("text%(i+1) = %(texts[i])") System.print() for (i in 0...pats.count) {

   var j = (i < 5) ? i : i-1
   System.print("Found '%(pats[i])' in 'text%(j+1)' at indices %(KMP.search(texts[j], pats[i]))")

}</lang>

Output:

text1 = GCTAGCTCTACGAGTCTA
text2 = GGCTATAATGCGTA
text3 = there would have been a time for such a word
text4 = needle need noodle needle
text5 = InhisbookseriesTheArtofComputerProgrammingpublishedbyAddisonWesleyDKnuthusesanimaginarycomputertheMIXanditsassociatedmachinecodeandassemblylanguagestoillustratetheconceptsandalgorithmsastheyarepresented
text6 = Nearby farms grew a half acre of alfalfa on the dairy's behalf, with bales of all that alfalfa exchanged for milk.

Found 'TCTA' in 'text1' at indices [6, 14]
Found 'TAATAAA' in 'text2' at indices []
Found 'word' in 'text3' at indices [40]
Found 'needle' in 'text4' at indices [0, 19]
Found 'put' in 'text5' at indices [26, 90]
Found 'and' in 'text5' at indices [101, 128, 171]
Found 'alfalfa' in 'text6' at indices [33, 87]

Knuth-Morris-Pratt string search

Emacs Lisp

Julia

Wren