Substring/Top and tail: Difference between revisions
Add ed example
Added OmniMark solution
Line 1,710:
MyStrin
yStrin
</pre>

=={{header|OmniMark}}==

The following "Simple ASCII example" is limited: it only works when every character in the "word" is at or below U+00FF. That still covers most of the sample words other solutions use, such as "knight" and "brooms" (shown).

=== Simple ASCII example ===
<syntaxhighlight lang="omnimark">
process
  local stream s variable initial {'brooms', 'hit'}
  repeat over s
    output '---------------------%nWord: %g(s)%n'
    ; Skip the first character and capture the rest.
    do scan s
      match any-text any-text+ => firstremoved
        output 'First removed: %x(firstremoved)%n'
    done
    ; Capture characters until only the last one remains.
    do scan s
      match ((lookahead not (any-text value-end)) any)+ => lastremoved
        output 'Last removed: %x(lastremoved)%n'
    done
    ; Skip the first character, then stop before the last one.
    do scan s
      match any-text ((lookahead not (any-text value-end)) any)+ => bothremoved
        output 'Both removed: %x(bothremoved)%n'
    done
  again
</syntaxhighlight>

{{out}}
<pre>
---------------------
Word: brooms
First removed: rooms
Last removed: broom
Both removed: room
---------------------
Word: hit
First removed: it
Last removed: hi
Both removed: i
</pre>
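
As a cross-check of the limitation noted for the simple example, here is a minimal sketch in Python rather than OmniMark. It assumes the word is held as UTF-8 bytes and trimmed one byte at a time; that is an assumption about the mechanism, not something the OmniMark listing states. The helper name <code>top_and_tail_bytes</code> is hypothetical.

```python
# Illustrative only: not part of the OmniMark solution.
# Assumption: the word is stored as UTF-8 and trimmed one byte at a time.

def top_and_tail_bytes(data: bytes):
    """Return data without its first byte, without its last byte, and without both."""
    return data[1:], data[:-1], data[1:-1]


def is_valid_utf8(data: bytes) -> bool:
    """Report whether data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# One byte per character: trimming bytes is the same as trimming characters.
print(top_and_tail_bytes("brooms".encode("utf-8")))

# Each of these characters needs four bytes in UTF-8, so removing a single
# byte cuts a character in half and leaves an invalid sequence.
script_word = "\U0001D4F1\U0001D4F2\U0001D4FD".encode("utf-8")
for label, part in zip(("first", "last", "both"), top_and_tail_bytes(script_word)):
    print(label, "removed is valid UTF-8:", is_valid_utf8(part))
```

Under that assumption, byte-wise trimming is safe only while each character occupies exactly one byte, which is why the Unicode-aware version matters for words outside that range.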

=== Complete, Unicode, example ===

This example addresses the task's stricter requirement: "'''it must work on any valid Unicode code point, whether in the Basic Multilingual Plane''' [BMP] '''or above it'''". It handles the word hit (U+0068 U+0069 U+0074) as well as 𝓱𝓲𝓽 (U+1D4F1 U+1D4F2 U+1D4FD), whose three characters all lie above the BMP. It also handles "là" (U+006C U+0061 U+0300), showing that the solution copes with combining characters such as U+0300 (COMBINING GRAVE ACCENT): the accent is treated as a ''character'' in its own right, which it is.

<syntaxhighlight lang="omnimark">
include "utf8pat.xin"

; Return the code points of chars as a space-separated list of U+XXXX values.
define stream function ucps (value stream chars) as
  local integer num
  local stream unicodes
  open unicodes as buffer
  repeat scan chars
    match utf8-char => s-char
      set num to utf8-char-number(s-char)
      put unicodes 'U+%u16r4fzd(num) '
  again
  close unicodes
  return unicodes

process
  local stream s variable initial {'hit', '𝓱𝓲𝓽', 'là'}
  repeat over s
    output '----------------------------------------------------------------%n'
    output 'Word: %g(s)%n ' || ucps(s) || '%n'
    ; Skip the first UTF-8 character and capture the rest.
    do scan s
      match utf8-char utf8-char+ => firstremoved
        output 'First removed: %x(firstremoved)%n '
        output ucps(firstremoved) || '%n'
    done
    ; Capture input until only the last UTF-8 character remains.
    do scan s
      match ((lookahead not (utf8-char value-end)) any)+ => lastremoved
        output 'Last removed: %x(lastremoved)%n '
        output ucps(lastremoved) || '%n'
    done
    ; Skip the first UTF-8 character, then stop before the last one.
    do scan s
      match utf8-char ((lookahead not (utf8-char value-end)) any)+ => bothremoved
        output 'Both removed: %x(bothremoved)%n '
        output ucps(bothremoved) || '%n'
    done
  again
</syntaxhighlight>

{{out}}
<pre>
----------------------------------------------------------------
Word: hit
U+0068 U+0069 U+0074
First removed: it
U+0069 U+0074
Last removed: hi
U+0068 U+0069
Both removed: i
U+0069
----------------------------------------------------------------
Word: 𝓱𝓲𝓽
U+1D4F1 U+1D4F2 U+1D4FD
First removed: 𝓲𝓽
U+1D4F2 U+1D4FD
Last removed: 𝓱𝓲
U+1D4F1 U+1D4F2
Both removed: 𝓲
U+1D4F2
----------------------------------------------------------------
Word: là
U+006C U+0061 U+0300
First removed: à
U+0061 U+0300
Last removed: la
U+006C U+0061
Both removed: a
U+0061
</pre>
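
The code-point behaviour described for the Unicode example can be cross-checked with a short Python sketch. This is not a translation of the OmniMark listing. It relies on Python's <code>str</code> being indexed by code point, and the helper names <code>code_points</code> and <code>top_and_tail</code> are hypothetical.

```python
# Illustrative only: a cross-check, not a port of the OmniMark solution.

def code_points(text: str) -> str:
    """List the code points of text as space-separated U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def top_and_tail(word: str) -> dict:
    """Remove the first, the last, and both end code points of word."""
    # Python str slicing works on code points, so whole code points are dropped.
    return {
        "First removed": word[1:],
        "Last removed": word[:-1],
        "Both removed": word[1:-1],
    }


# Escapes keep the combining accent (U+0300) as its own code point.
words = ["hit", "\U0001D4F1\U0001D4F2\U0001D4FD", "la\u0300"]

for word in words:
    print(f"Word: {word}  {code_points(word)}")
    for label, value in top_and_tail(word).items():
        print(f"{label}: {value}  {code_points(value)}")
```

As in the output above, removing the last code point of "là" drops only the combining accent and leaves "la", because the sketch counts code points rather than user-perceived characters.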