Unicode strings


PicoLisp can directly handle _only_ Unicode (UTF-8) strings. The real question is therefore how to handle non-Unicode strings: they must be pre- or post-processed by external tools, typically through pipes during I/O. For example, to read a line from a file in ISO-8859-15 encoding:

<lang PicoLisp>(in '(iconv "-f" "ISO-8859-15" "-t" "UTF-8" "file.txt") (line))</lang>
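
To make the conversion step concrete, here is a minimal sketch in Python (not PicoLisp) of what the external converter does to the bytes before PicoLisp reads them. The function name <code>latin9_to_utf8</code> and the sample bytes are hypothetical and chosen only for illustration.

```python
def latin9_to_utf8(raw: bytes) -> bytes:
    """Transcode ISO-8859-15 bytes to UTF-8, as an external converter would."""
    # Decode the single-byte encoding to code points, then re-encode as UTF-8.
    return raw.decode("iso8859-15").encode("utf-8")


# Hypothetical sample: "café 5€" stored as ISO-8859-15 (one byte per character).
sample = b"caf\xe9 5\xa4"
converted = latin9_to_utf8(sample)

# The non-ASCII characters grow to multi-byte sequences in UTF-8.
print(len(sample), len(converted))
print(converted.decode("utf-8"))
```

The byte count grows because "é" takes two bytes and "€" takes three bytes in UTF-8, while both take a single byte in ISO-8859-15.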

=={{header|Seed7}}==

The Unicode encoding of Seed7 [http://seed7.sourceforge.net/manual/types.htm#char characters] and [http://seed7.sourceforge.net/manual/types.htm#string strings] is UTF-32. Seed7 source files use UTF-8 encoding. [http://seed7.sourceforge.net/manual/tokens.htm#Character Character literals] and [http://seed7.sourceforge.net/manual/tokens.htm#String_literals string literals] are therefore written with UTF-8 encoding. Unicode characters are allowed in comments, but not in identifiers and keywords. Functions that send strings to the operating system convert them to the encoding used by the OS. Strings received from the operating system are converted to UTF-32. Seed7 supports reading and writing [http://seed7.sourceforge.net/libraries/external_file.htm Latin-1], [http://seed7.sourceforge.net/libraries/utf8.htm UTF-8] and [http://seed7.sourceforge.net/libraries/utf16.htm UTF-16] files. Because every UTF-32 character has the same fixed width, a position in a string is always a character position; there is no separate byte position to track.
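
The difference between byte position and character position can be sketched in Python (not Seed7). This is only an illustration of the encodings themselves, not of Seed7's implementation; the helper names <code>utf8_offset</code> and <code>utf32_offset</code> and the sample text are hypothetical.

```python
# Sample text with characters that need 1, 2, 3 and 4 bytes in UTF-8.
text = "a\u00e9\u20ac\U0001d11e"  # a, é, €, 𝄞

utf8 = text.encode("utf-8")
utf32 = text.encode("utf-32-le")  # little-endian, no byte order mark


def utf8_offset(s: str, index: int) -> int:
    """Byte offset of character `index` in UTF-8: depends on all earlier characters."""
    return len(s[:index].encode("utf-8"))


def utf32_offset(s: str, index: int) -> int:
    """Byte offset of character `index` in UTF-32: always four bytes per character."""
    return 4 * index


# In UTF-8 the offsets are irregular; in UTF-32 they are a fixed multiple of the index.
for i in range(len(text)):
    print(i, utf8_offset(text, i), utf32_offset(text, i))

# Fixed width means a character can be sliced out directly by its index.
last = utf32[utf32_offset(text, 3):utf32_offset(text, 4)].decode("utf-32-le")
```

With a variable-width encoding, finding the n-th character requires scanning from the start of the string; with UTF-32 it is a single multiplication, which is why string indices and character positions coincide.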

=={{header|Tcl}}==