Substring/Top and tail: Difference between revisions


@@ Line 1,710: / Line 1,710: @@
 MyStrin
 yStrin
+</pre>
+=={{header|OmniMark}}==
+The following "Simple ASCII example" is limited: it only works when every character in the word is in the range up to U+00FF. That still covers words such as "knight" and "brooms" (shown), which are typical of the example text used in other solutions.
+=== Simple ASCII example ===
+<syntaxhighlight lang="omnimark">
+process
+  local stream s variable initial {'brooms', 'hit'}
+  repeat over s
+    output '---------------------%nWord:          %g(s)%n'
+    do scan s
+      match any-text any-text+ => firstremoved
+        output 'First removed: %x(firstremoved)%n'
+    done
+    do scan s
+      match ((lookahead not (any-text value-end)) any)+ => lastremoved
+        output 'Last removed:  %x(lastremoved)%n'
+    done
+    do scan s
+      match any-text ((lookahead not (any-text value-end)) any)+ => bothremoved
+        output 'Both removed:  %x(bothremoved)%n'
+    done
+  again
+</syntaxhighlight>
+{{out}}
+<pre>
+---------------------
+Word:          brooms
+First removed: rooms
+Last removed:  broom
+Both removed:  room
+---------------------
+Word:          hit
+First removed: it
+Last removed:  hi
+Both removed:  i
+</pre>
+=== Complete Unicode example ===
+This example addresses the task's stricter requirement: "'''it must work on any valid Unicode code point, whether in the Basic Multilingual Plane''' [BMP] '''or above it'''". It handles the word hit (U+0068 U+0069 U+0074) as well as 𝓱𝓲𝓽 (U+1D4F1 U+1D4F2 U+1D4FD), whose three characters all lie above the BMP. It also handles "là" (U+006C U+0061 U+0300), which shows that the solution treats a combining character such as U+0300 (COMBINING GRAVE ACCENT) as a ''character'' in its own right, which it is.
+<syntaxhighlight lang="omnimark">
+include "utf8pat.xin"
+define stream function ucps (value stream chars) as
+  local integer num
+  local stream unicodes
+  open unicodes as buffer
+  repeat scan chars
+    match utf8-char => s-char
+      set num to utf8-char-number(s-char)
+      put unicodes 'U+%u16r4fzd(num) '
+  again
+  close unicodes
+  return unicodes
+process
+  local stream s variable initial {'hit', '𝓱𝓲𝓽', 'là'}
+  repeat over s
+    output '----------------------------------------------------------------%n'
+    output 'Word:          %g(s)%n               ' || ucps(s) || '%n'
+    do scan s
+      match utf8-char utf8-char+ => firstremoved
+        output 'First removed: %x(firstremoved)%n               '
+        output ucps(firstremoved) || '%n'
+    done
+    do scan s
+      match ((lookahead not (utf8-char value-end)) any)+ => lastremoved
+        output 'Last removed:  %x(lastremoved)%n               '
+        output ucps(lastremoved) || '%n'
+    done
+    do scan s
+      match utf8-char ((lookahead not (utf8-char value-end)) any)+ => bothremoved
+        output 'Both removed:  %x(bothremoved)%n               '
+        output ucps(bothremoved) || '%n'
+    done
+  again
+</syntaxhighlight>
+{{out}}
+<pre>
+----------------------------------------------------------------
+Word:          hit
+               U+0068 U+0069 U+0074
+First removed: it
+               U+0069 U+0074
+Last removed:  hi
+               U+0068 U+0069
+Both removed:  i
+               U+0069
+----------------------------------------------------------------
+Word:          𝓱𝓲𝓽
+               U+1D4F1 U+1D4F2 U+1D4FD
+First removed: 𝓲𝓽
+               U+1D4F2 U+1D4FD
+Last removed:  𝓱𝓲
+               U+1D4F1 U+1D4F2
+Both removed:  𝓲
+               U+1D4F2
+----------------------------------------------------------------
+Word:          là
+               U+006C U+0061 U+0300
+First removed: à
+               U+0061 U+0300
+Last removed:  la
+               U+006C U+0061
+Both removed:  a
+               U+0061
 </pre>
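For comparison only, the code-point behaviour described for the Unicode example can be sketched in Python, whose <code>str</code> type is indexed by code point. This is an illustrative sketch and not part of the OmniMark solution; the helper names <code>top_and_tail</code> and <code>code_points</code> are hypothetical and chosen here just for the illustration.

```python
def top_and_tail(word: str) -> dict[str, str]:
    """Return the three substrings the task asks for, counted in code points."""
    return {
        "first_removed": word[1:],    # drop the first code point
        "last_removed": word[:-1],    # drop the last code point
        "both_removed": word[1:-1],   # drop both ends
    }


def code_points(text: str) -> str:
    """Format text as space-separated U+XXXX labels (at least 4 hex digits)."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    # Same three words as the Unicode example: ASCII, above the BMP,
    # and a base letter followed by a combining accent.
    for word in ["hit", "\U0001D4F1\U0001D4F2\U0001D4FD", "la\u0300"]:
        print(f"Word: {word}  ({code_points(word)})")
        for label, part in top_and_tail(word).items():
            print(f"  {label}: {part}  ({code_points(part)})")
```

As in the OmniMark output, removing the last code point of "là" leaves "la", because the combining accent U+0300 counts as a code point of its own.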