Burrows–Wheeler transform: Difference between revisions
(→{{header|TXR}}: Replace collect-each in bwt with mapcar.) |
(→{{header|TXR}}: Add missing definition of eof!) |
||
Line 2,666: | Line 2,666: | ||
We use the U+DC00 code point as the EOF sentinel. In TXR terminology, this code point is called the <i>pseudo-null</i>. It has special significance: when a NUL byte occurs in UTF-8 external data, TXR's decoder maps it to the U+DC00 code point. When a string containing U+DC00 is converted to UTF-8, that code point becomes a NUL again. |
We use the U+DC00 code point as the EOF sentinel. In TXR terminology, this code point is called the <i>pseudo-null</i>. It has special significance: when a NUL byte occurs in UTF-8 external data, TXR's decoder maps it to the U+DC00 code point. When a string containing U+DC00 is converted to UTF-8, that code point becomes a NUL again. |
||
<syntaxhighlight lang="txrlisp">( |
<syntaxhighlight lang="txrlisp">(defvarl eof "\xDC00") |
||
(defun bwt (str) |
|||
(if (contains eof str) |
(if (contains eof str) |
||
(error "~s: input may not contain ~a" %fun% eof)) |
(error "~s: input may not contain ~a" %fun% eof)) |