Talk:Count occurrences of a substring: Difference between revisions

From Rosetta Code
Content added Content deleted
(→‎More specific?: Possible suggestion)
Line 2: Line 2:
What happens when there are more than one way to match? Say you have string "aBaBaBa" and want to find <i>non-overlapping</i> "aBa"s, you could say both (aBa)B(aBa) and aB(aBa)Ba, where brackets denote the found matches, are valid. --[[User:Ledrug|Ledrug]] 17:07, 16 June 2011 (UTC)
What happens when there are more than one way to match? Say you have string "aBaBaBa" and want to find <i>non-overlapping</i> "aBa"s, you could say both (aBa)B(aBa) and aB(aBa)Ba, where brackets denote the found matches, are valid. --[[User:Ledrug|Ledrug]] 17:07, 16 June 2011 (UTC)
:My guess would be to specify to match left-to-right, but that might be too restrictive of possible methods. Maybe say "give the highest possible number of non-overlapping matches"? Can you think of any situations where that isn't the same as the count when matching from left-to-right (or even right-to-left)? --[[User:Mwn3d|Mwn3d]] 17:20, 16 June 2011 (UTC)
:My guess would be to specify to match left-to-right, but that might be too restrictive of possible methods. Maybe say "give the highest possible number of non-overlapping matches"? Can you think of any situations where that isn't the same as the count when matching from left-to-right (or even right-to-left)? --[[User:Mwn3d|Mwn3d]] 17:20, 16 June 2011 (UTC)
::Earliest match always produces most matches. Proof: suppose a string is matched in two ways:
::<lang>...(m1)...(more matches)...
....(m2)...(more)...</lang>
:: where m1 and m2 overlap, and suppose both are giving highest possible number of matches at their starting location; further suppose second way matches more times: but this can't be, because we can then give up m2 and use m1 while keeping the rest, and it would have more matches than first way, which is a contradiction. Hence, highest num of matches can be obtained by starting earliest.

:: Since reversing both pattern and string makes left-to-right problem into a right-to-left one, this also proves highest matches can be obtained from either end.

:: That exercise aside, I just thought the task should be worded so that any given input would produce an unambiguous result. --[[User:Ledrug|Ledrug]] 17:45, 16 June 2011 (UTC)

Revision as of 17:45, 16 June 2011

More specific?

What happens when there are more than one way to match? Say you have string "aBaBaBa" and want to find non-overlapping "aBa"s, you could say both (aBa)B(aBa) and aB(aBa)Ba, where brackets denote the found matches, are valid. --Ledrug 17:07, 16 June 2011 (UTC)

My guess would be to specify to match left-to-right, but that might be too restrictive of possible methods. Maybe say "give the highest possible number of non-overlapping matches"? Can you think of any situations where that isn't the same as the count when matching from left-to-right (or even right-to-left)? --Mwn3d 17:20, 16 June 2011 (UTC)
Earliest match always produces most matches. Proof: suppose a string is matched in two ways:
<lang>...(m1)...(more matches)...

....(m2)...(more)...</lang>

where m1 and m2 overlap, and suppose both are giving highest possible number of matches at their starting location; further suppose second way matches more times: but this can't be, because we can then give up m2 and use m1 while keeping the rest, and it would have more matches than first way, which is a contradiction. Hence, highest num of matches can be obtained by starting earliest.
Since reversing both pattern and string makes left-to-right problem into a right-to-left one, this also proves highest matches can be obtained from either end.
That exercise aside, I just thought the task should be worded so that any given input would produce an unambiguous result. --Ledrug 17:45, 16 June 2011 (UTC)