Count occurrences of a substring: Difference between revisions

Line 2,328:

Picat has a predicate for searching for substrings (find/4) but on backtracking (e.g. in a findall/2 wrapper) it yields all overlapping substrings which is not correct in this task.

So we have to roll our own. Here are two versions:

So we have to roll our own. Here are two versions.

===Recursion===

⚫

* count_substrings_rec/2: ~~Uses recursion ("Prolog style")~~

⚫

<lang Picat>count_substrings_rec(S, SB) = C =>

* count_substrings_find: Iterative version: Uses find/4 in a while loop.

⚫

The time differences between these two versions are quite large which is shown in ~~this~~ benchmark of searching the substring "ab" in a string of 100 000 random characters from the set of "abcde":

* count_substrings_rec/2: 0.009s

⚫

* count_substrings_find: 0.501s

<lang Picat>% Recursive version

⚫

count_substrings_rec(S, SB) = C =>

count_rec(S,SB,0,C).

Line 2,350:

Line 2,340:

count_rec([T|Rest],SB,Count0,Count) :-

T != SB, % this character is not a substring

count_rec(Rest,SB,Count0,Count).

count_rec(Rest,SB,Count0,Count).</lang>

===Iterative===

% Iterative version using find/4 (wrap with once/1 to avoid backtracking).

Iterative version using find/4 (wrap with once/1 to avoid backtracking).

% (Inspired by the Euphoria version.)

count_substrings_it_find(S, SB) = C =>

<lang Picat>count_substrings_find(S, SB) = C =>

SLen = S.len,

Count = 0,

Line 2,368:

Line 2,359:

end,

C = Count.</lang>

⚫

The time differences between these two versions are quite large which is shown in a benchmark of searching the substring "ab" in a string of 100 000 random characters from the set of "abcde":

⚫

* count_substrings_rec/2: 0.009s

⚫

* count_substrings_find: 0.501s

=={{header|PicoLisp}}==