Talk:Boyer-Moore string search: Difference between revisions

Revision as of 14:00, 18 July 2022

I am not sure if this is intended to be a task.

I am also not sure how this algorithm's performance fares against the caching mechanisms of recent machines. (Other than branch prediction issues, Boyer-Moore is probably not penalized by caching implementations -- unlike tree search algorithms, which experience poor cache locality -- but there might be some minimum search string length necessary to see significant gains from this approach, especially in uncached contexts.) --Rdm (talk) 01:13, 6 July 2022 (UTC)

For strings I do not know, but variants of this algorithm are commonly used to search DNA databases. See for example this paper: http://www.ijsetr.com/uploads/625413IJSETR2868-162.pdf --Wherrera (talk) 03:03, 6 July 2022 (UTC)

Emacs Lisp incorrect?

No output for one and no indication of how it might be used for two. It certainly doesn't run on TIO, which fails with the message

Trailing garbage following expression: 
;;(print (bm_compile_pattern "abcdb"))

and replit just seems to run an editor... Then again I've never used Emacs, so what do I know? --Pete Lomax (talk) 15:55, 9 July 2022 (UTC)

https://www.gnu.org/software/emacs/manual/html_node/efaq/Evaluating-Emacs-Lisp-code.html explains how to run emacs lisp code. --Rdm (talk) 15:28, 17 July 2022 (UTC)

That said, currently, the Emacs Lisp implementation (and the current J implementation which copies it) does not implement the good suffix rule. So the algorithm exhibits poor performance in certain contexts.

This isn't the sort of thing which impacts the correctness of the returned result -- it's strictly a performance issue (omitting the rule is a performance improvement when "good suffixes" are rare or absent in the text being searched, and a performance problem when relatively long "good suffixes" appear relatively often in the text being searched). I've got other things to do right now, but I'll circle back on this when I have time. --Rdm (talk) 13:53, 18 July 2022 (UTC)
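
To make the point above concrete, here is a minimal Python sketch of a Boyer-Moore style search that uses only the bad character rule and no good suffix rule. It is not the Emacs Lisp or J code from the task page; the names `bad_char_table` and `bm_search_bad_char_only` are hypothetical, and the comparison counter exists only to illustrate the cost. Under those assumptions, it should return the same matches as a naive search while showing how the work grows when long matching suffixes are common.

```python
def bad_char_table(pattern):
    """Map each character to the index of its last occurrence in pattern."""
    return {ch: i for i, ch in enumerate(pattern)}


def bm_search_bad_char_only(text, pattern):
    """Return (match_indices, comparisons) using only the bad character rule.

    Hypothetical helper for illustration: the good suffix rule is
    deliberately left out.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return [], 0
    last = bad_char_table(pattern)
    matches, comparisons = [], 0
    s = 0  # current alignment of the pattern against the text
    while s <= n - m:
        j = m - 1
        # Compare the pattern against the text from right to left.
        while j >= 0:
            comparisons += 1
            if pattern[j] != text[s + j]:
                break
            j -= 1
        if j < 0:
            matches.append(s)
            s += 1  # conservative shift after a full match
        else:
            # Line the mismatched text character up with its last occurrence
            # in the pattern; never shift by less than one position.
            s += max(1, j - last.get(text[s + j], -1))
    return matches, comparisons


if __name__ == "__main__":
    print(bm_search_bad_char_only("abcabcdbabcdb", "abcdb"))
    # Long matching suffixes at every alignment: the bad character rule
    # alone can only shift by one position each time.
    print(bm_search_bad_char_only("a" * 20, "baaa"))
```

In the second call, every alignment matches the trailing `aaa` before failing on the leading `b`, so the bad character rule never shifts by more than one and the search does roughly `len(text) * len(pattern)` comparisons. With the good suffix rule, the shift at each such mismatch could be the full pattern length.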

Revision as of 13:53, 18 July 2022, Rdm (→Emacs Lisp incorrect?: sort of?)
Revision as of 14:00, 18 July 2022, Rdm m (→Emacs Lisp incorrect?)