Unicode strings: Difference between revisions
m (Link to our Unicode page.)
(Add Seed7 example)
Line 191:
PicoLisp can directly handle _only_ Unicode (UTF-8) strings. So the problem is rather how to handle non-Unicode strings: they must be pre- or post-processed by external tools, typically with pipes during I/O. For example, to read a line from a file in ISO-8859-15 encoding:

<lang PicoLisp>(in '(iconv "-f" "ISO-8859-15" "file.txt") (line))</lang>

=={{header|Seed7}}==
Seed7 [http://seed7.sourceforge.net/manual/types.htm#char characters] and
[http://seed7.sourceforge.net/manual/types.htm#string strings] are encoded as UTF-32. Seed7 source files use
UTF-8 encoding. [http://seed7.sourceforge.net/manual/tokens.htm#Character Character literals] and
[http://seed7.sourceforge.net/manual/tokens.htm#String_literals string literals] are
therefore written with UTF-8 encoding. Unicode characters are allowed in comments,
but not in identifiers and keywords. Functions that send strings to the operating system convert
them to the encoding used by the OS. Strings received from the operating system are converted to UTF-32.
Seed7 supports reading and writing [http://seed7.sourceforge.net/libraries/external_file.htm Latin-1],
[http://seed7.sourceforge.net/libraries/utf8.htm UTF-8] and
[http://seed7.sourceforge.net/libraries/utf16.htm UTF-16] files.
Because strings are UTF-32, every character (code point) occupies exactly one fixed-width code unit, so a string index always refers to a whole character and never to a position inside a multi-unit sequence.
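The point about positions can be illustrated outside Seed7. The following is a minimal sketch in Python, not Seed7 (Python strings happen to be indexed by code point as well). The sample string and the helper <code>code_unit_offset</code> are assumptions made for this illustration and are not part of any Seed7 library. It shows that only in UTF-32 does the character index equal the code-unit offset.

```python
# Illustration only: this is Python, not Seed7.
# The sample text is arbitrary: 'a', EURO SIGN, GRINNING FACE, 'z'.
text = "a\u20ac\U0001F600z"

# Encode the same four characters in the three Unicode encoding forms.
# The "-le" variants are used so that no byte order mark is prepended.
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
utf32 = text.encode("utf-32-le")


def code_unit_offset(s: str, char_index: int, encoding: str, unit_size: int) -> int:
    """Return the code-unit offset at which character number char_index starts.

    Hypothetical helper for this sketch: it encodes the prefix before the
    character and divides the byte count by the size of one code unit.
    """
    return len(s[:char_index].encode(encoding)) // unit_size


if __name__ == "__main__":
    print("characters:", len(text))
    print("bytes in UTF-8 / UTF-16 / UTF-32:", len(utf8), len(utf16), len(utf32))
    for i in range(len(text)):
        print(
            "char", i,
            "utf-8 unit", code_unit_offset(text, i, "utf-8", 1),
            "utf-16 unit", code_unit_offset(text, i, "utf-16-le", 2),
            "utf-32 unit", code_unit_offset(text, i, "utf-32-le", 4),
        )
```

In UTF-8 and UTF-16 the offsets drift away from the character index as soon as a character needs more than one code unit. In UTF-32 they stay identical, which is the property the Seed7 string type relies on.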

=={{header|Tcl}}==