Boyer-Moore string search: Difference between revisions

m
no edit summary
mNo edit summary
Line 68:
=={{header|J}}==
{{trans|Emacs Lisp}}
<langsyntaxhighlight lang=J>bmsearch0=: {{
startPos=. 0
rightMap=. (i.#x) (3 u: x)} 256#_1
Line 87:
end.
''
}}</langsyntaxhighlight>
 
Example use:
<lang Jpre> 'alfalfa' bmsearch0 'On behalf of an alfredo calfskin malfunction we donate alfalfa.'
55</langpre>
 
Caution: <tt>bmsearch0</tt> is an incomplete implementation of the algorithm. See talk page.
Line 97:
Here's a complete implementation of the algorithm:
 
<langsyntaxhighlight lang=J>Z=: {{ y {{ 1 i.~y~:(#y){.m }}\.y }}
 
bmsearch1=: {{
Line 132:
end.
EMPTY
}}</langsyntaxhighlight>
 
<lang Jpre> 'alfalfa' bmsearch1 'On behalf of an alfredo calfskin malfunction we donate alfalfa.'
55</langpre>
 
Testing performance on a relatively long text (and pattern):
 
<lang Jpre> text=: 'acgt'{~?.1e7#4
pat=. text{~9e6+i.1e5
pat bmsearch0 text
Line 146:
9000000
'pat bmsearch0 text' %&timex 'pat bmsearch1 text'
2.33477</langpre>
 
In other words: <tt>bmsearch0</tt> takes slightly over twice as long as <tt>bmsearch1</tt> for a somewhat randomly picked example search (though the results are the same).
Line 156:
"Algorithms on Strings, Trees, and Sequences", 1997, available online as a pdf in several publicly indexed
archives on the internet.
<langsyntaxhighlight lang=ruby>""" Rosetta Code task at rosettacode.org/mw/index.php?title=Boyer-Moore_string_search """
 
const ASCIIchars = String([Char(i) for i = 0:255]) # any standard or extended ASCII char
Line 468:
occurrences, num_alignments, num_character_comparisons = boyer_moore_with_counts(p7, p_bm7, t7)
@show occurrences, num_alignments, num_character_comparisons
</langsyntaxhighlight>{{out}}
<pre>
(p_bm.amap, p_bm.bad_char) = (Dict('A' => 1, 'G' => 3, 'T' => 4, 'C' => 2), [[0, 0, 0, 0], [0, 0, 0, 1], [0, 2, 0, 1], [3, 2, 0, 1]])
Line 607:
=={{header|Python}}==
Slightly modified from en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string-search_algorithm#Python_implementation.
<langsyntaxhighlight lang=python>"""
This version uses ASCII for case-sensitive matching. For Unicode you may want to match in UTF-8
bytes instead of creating a 0x10FFFF-sized table.
Line 772:
print("Found", PAT2, "at:", string_search(PAT2, TEXT1))
print("Found", PAT3, "at:", string_search(PAT3, TEXT2))
</langsyntaxhighlight>{{out}}
<pre>
Found put at: [26, 90]
Line 781:
=={{header|Raku}}==
{{trans|Phix}}
<langsyntaxhighlight lang=perl6># 20220818 Raku programming solution
 
sub suffixes (@pat,\m) {
Line 842:
my \j = $_ < 6 ?? $_ !! $_-1 ; # for searching text5 twice
say "Found '@pats[$_]' in 'text{j}' at indices ", BM @texts[j].comb, @pats[$_].comb
}</langsyntaxhighlight>
Output is the same as the [[Knuth-Morris-Pratt_string_search#Raku]] entry.
 
Line 849:
 
The same examples have been used as in the Julia entry above.
<langsyntaxhighlight lang=ecmascript>class BoyerMoore {
/**
* Returns the index within this string of the first occurrence of the
Line 976:
var j = (i < 5) ? i : i-1
System.print("Found '%(pats[i])' in 'text%(j+1)' at indices %(indicesOf.call(texts[j], pats[i]))")
}</langsyntaxhighlight>
 
{{out}}
59

edits