Substring/Top and tail: Difference between revisions

Aartaka (talk | contribs)
Add ed example
Kennypete (talk | contribs)
Added OmniMark solution
Line 1,710:
MyStrin
yStrin
</pre>

=={{header|OmniMark}}==
The "Simple ASCII example" below is limited: it only works when every character in the word is at or below U+00FF. That still covers most of the sample words used in other solutions to this task, such as "knight" and "brooms" (the latter is shown).

=== Simple ASCII example ===
<syntaxhighlight lang="omnimark">
process
  local stream s variable initial {'brooms', 'hit'}
  repeat over s
    output '---------------------%nWord: %g(s)%n'
    ; Skip the first character, then capture the rest.
    do scan s
      match any-text any-text+ => firstremoved
        output 'First removed: %x(firstremoved)%n'
    done
    ; Capture characters until only one remains before the end of the value.
    do scan s
      match ((lookahead not (any-text value-end)) any)+ => lastremoved
        output 'Last removed: %x(lastremoved)%n'
    done
    ; Combine both: skip the first character and stop before the last.
    do scan s
      match any-text ((lookahead not (any-text value-end)) any)+ => bothremoved
        output 'Both removed: %x(bothremoved)%n'
    done
  again
</syntaxhighlight>

{{out}}
<pre>
---------------------
Word: brooms
First removed: rooms
Last removed: broom
Both removed: room
---------------------
Word: hit
First removed: it
Last removed: hi
Both removed: i
</pre>
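
One way to see where the U+00FF limit of the simple example comes from is to slice at the byte level. The Python sketch below is only an illustration and is not part of the OmniMark solution. It assumes (the text above does not state this) that each pattern step consumes a single byte, and the helper name <code>top_and_tail_bytes</code> is hypothetical.

<syntaxhighlight lang="python">
def top_and_tail_bytes(data: bytes) -> tuple[bytes, bytes, bytes]:
    """Drop the first byte, the last byte, and both (hypothetical helper)."""
    return data[1:], data[:-1], data[1:-1]


# Single-byte characters: byte slicing and character slicing agree.
print(top_and_tail_bytes("brooms".encode("utf-8")))

# Characters above the BMP take four bytes each in UTF-8, so dropping a
# single byte cuts a character apart and the result no longer decodes.
word = "\U0001D4F1\U0001D4F2\U0001D4FD".encode("utf-8")
first_removed, _, _ = top_and_tail_bytes(word)
try:
    first_removed.decode("utf-8")
except UnicodeDecodeError as err:
    print("invalid UTF-8 after byte slicing:", err.reason)
</syntaxhighlight>

Under that assumption, a word made of multi-byte characters needs a pattern that consumes whole encoded characters rather than single bytes.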

=== Complete Unicode example ===
This example addresses the task's stricter requirement, namely that "'''it must work on any valid Unicode code point, whether in the Basic Multilingual Plane''' [BMP] '''or above it'''". It handles the word hit (U+0068 U+0069 U+0074) as well as 𝓱𝓲𝓽 (U+1D4F1 U+1D4F2 U+1D4FD), whose three characters all lie above the BMP. It also handles "là" (U+006C U+0061 U+0300), showing that the solution copes with combining characters such as U+0300 (COMBINING GRAVE ACCENT): the accent is treated as a ''character'' in its own right, which it is.

<syntaxhighlight lang="omnimark">
include "utf8pat.xin"

; Return the code points of chars as a sequence of U+XXXX labels.
define stream function ucps (value stream chars) as
  local integer num
  local stream unicodes
  open unicodes as buffer
  repeat scan chars
    match utf8-char => s-char
      set num to utf8-char-number(s-char)
      put unicodes 'U+%u16r4fzd(num) '
  again
  close unicodes
  return unicodes

process
  local stream s variable initial {'hit', '𝓱𝓲𝓽', 'là'}
  repeat over s
    output '----------------------------------------------------------------%n'
    output 'Word: %g(s)%n ' || ucps(s) || '%n'
    ; Skip the first UTF-8 character, then capture the rest.
    do scan s
      match utf8-char utf8-char+ => firstremoved
        output 'First removed: %x(firstremoved)%n '
        output ucps(firstremoved) || '%n'
    done
    ; Capture until only one UTF-8 character remains before the end of the value.
    do scan s
      match ((lookahead not (utf8-char value-end)) any)+ => lastremoved
        output 'Last removed: %x(lastremoved)%n '
        output ucps(lastremoved) || '%n'
    done
    ; Combine both: skip the first character and stop before the last.
    do scan s
      match utf8-char ((lookahead not (utf8-char value-end)) any)+ => bothremoved
        output 'Both removed: %x(bothremoved)%n '
        output ucps(bothremoved) || '%n'
    done
  again
</syntaxhighlight>

{{out}}
<pre>
----------------------------------------------------------------
Word: hit
U+0068 U+0069 U+0074
First removed: it
U+0069 U+0074
Last removed: hi
U+0068 U+0069
Both removed: i
U+0069
----------------------------------------------------------------
Word: 𝓱𝓲𝓽
U+1D4F1 U+1D4F2 U+1D4FD
First removed: 𝓲𝓽
U+1D4F2 U+1D4FD
Last removed: 𝓱𝓲
U+1D4F1 U+1D4F2
Both removed: 𝓲
U+1D4F2
----------------------------------------------------------------
Word: là
U+006C U+0061 U+0300
First removed: à
U+0061 U+0300
Last removed: la
U+006C U+0061
Both removed: a
U+0061
</pre>
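
The code point listings in this output can be cross-checked outside OmniMark. The Python sketch below is illustrative only and is not part of the solution. Python strings index by code point, so plain slicing reproduces the three results; the helper names <code>code_points</code> and <code>top_and_tail</code> are hypothetical.

<syntaxhighlight lang="python">
def code_points(text: str) -> str:
    """Format each code point as U+XXXX, padded to at least four hex digits."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def top_and_tail(text: str) -> tuple[str, str, str]:
    """Return text without its first, last, and first-and-last code points."""
    return text[1:], text[:-1], text[1:-1]


# 'hit', the same word in mathematical script letters (above the BMP),
# and 'la' followed by U+0300 COMBINING GRAVE ACCENT.
words = ["hit", "\U0001D4F1\U0001D4F2\U0001D4FD", "la\u0300"]
for word in words:
    print("Word:", word, "|", code_points(word))
    labels = ("First removed", "Last removed", "Both removed")
    for label, part in zip(labels, top_and_tail(word)):
        print(f"{label}: {part} | {code_points(part)}")
</syntaxhighlight>

As in the OmniMark output, removing the last code point of "là" drops only the combining accent and leaves "la".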