Burrows–Wheeler transform: Difference between revisions
Content added Content deleted
(→{{header|J}}: concise implementation) |
(→{{header|TXR}}: New section.) |
||
Line 2,662: | Line 2,662: | ||
SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES -> |STEXYDST.E.IXXIIXXSSMPPS.B..EE.^.USFXDIIOIIIT -> SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES |
SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES -> |STEXYDST.E.IXXIIXXSSMPPS.B..EE.^.USFXDIIOIIIT -> SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES |
||
^ABC| -> error -> error</pre> |
^ABC| -> error -> error</pre> |
||
=={{header|TXR}}== |
|||
We use the U+DC00 code point as the EOF sentinel. In TXR terminology, this code is called the <i>pseudo-null</i>. It plays a special significance in that when a NUL byte occurs in UTF-8 external data, TXR's decoder maps it the U+DC00 point. When a string containing U+DC00 is converted to UTF-8, that code becomes a NUL again. |
|||
<syntaxhighlight lang="txrlisp">(defun bwt (str) |
|||
(let* ((seof `@str\xDC00`) |
|||
(n (len seof))) |
|||
(flow (collect-each ((i 0..n)) |
|||
(rot seof i)) |
|||
sort |
|||
(mappend last)))) |
|||
(defun ibwt (str) |
|||
(let* ((n (len str)) |
|||
(row (vector n ""))) |
|||
(dotimes (i n) |
|||
(set [row i] (mkstring 0))) |
|||
(dotimes (i n) |
|||
(each ((c str) |
|||
(j 0)) |
|||
(set [row j] `@c@[row j]`)) |
|||
(nsort row)) |
|||
[(find-if (op ends-with "\xDC00") row) 0..-1]))</syntaxhighlight> |
|||
At the REPL: |
|||
<pre>1> (bwt "^BANANA") |
|||
"BNN^AA\xDC00;A" |
|||
2> (ibwt *1) |
|||
"^BANANA" |
|||
3> (bwt "SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES") |
|||
"TEXYDST.E.IXIXIXXSSMPPS.B..E.\xDC00.UESFXDIIOIIITS" |
|||
4> (ibwt *3) |
|||
"SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES"</pre> |
|||
=={{header|Visual Basic .NET}}== |
=={{header|Visual Basic .NET}}== |
||
{{trans|C#}} |
{{trans|C#}} |
||
Line 2,788: | Line 2,823: | ||
--> ERROR: Input can't contain STX or ETX |
--> ERROR: Input can't contain STX or ETX |
||
--></pre> |
--></pre> |
||
=={{header|Wren}}== |
=={{header|Wren}}== |
||
{{trans|Go}} |
{{trans|Go}} |