Unicode strings: Difference between revisions

Added Pike implementation
Line 1,106:
PicoLisp can directly handle _only_ Unicode (UTF-8) strings, so the problem is rather how to handle non-Unicode strings: they must be pre- or post-processed by external tools, typically with pipes during I/O. For example, to read a line from a file in ISO-8859-15 encoding:
<lang PicoLisp>(in '(iconv "-f" "ISO-8859-15" "file.txt") (line))</lang>
 
=={{header|Pike}}==
All strings in Pike are Unicode internally. The charset of the source
can be changed at any line with the "#charset" preprocessor
directive. The default charset is ISO-8859-1, and any of the ~400
charsets supported by the Charset module can be used in source
code. It is also possible to implement your own charset and load it
with a #charset directive.
 
Regardless of the source charset, it is always possible to enter
characters in string literals as escapes, such as "\u03bb"
(hexadecimal) or "\0344" (octal).
 
All I/O is done on untouched byte streams. Since the terminal probably
does not want a stream of raw Unicode code points, we manually encode
each string as UTF-8 before writing it out.
 
<lang Pike>
#charset utf8
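void main()
{
    // Escapes work in any source charset: \u03bb is hexadecimal (λ), \0344 is octal (ä).
    string nånsense = "\u03bb \0344 \n";
    // With "#charset utf8", literals and identifiers may contain non-ASCII characters.
    string hello = "你好";
    string 水果 = "pineapple";
    string 真相 = sprintf("%s, %s goes really well on pizza\n", hello, 水果);
    // Strings are Unicode internally, so encode them as UTF-8 for the terminal.
    write( string_to_utf8(真相) );
    write( string_to_utf8(nånsense) );
}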
</lang>
{{Out}}
<pre>
你好, pineapple goes really well on pizza
λ ä
</pre>
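
The difference between the internal Unicode string and its UTF-8 encoding can be made visible by comparing lengths and character widths. The following is only a minimal sketch, assuming the built-ins <code>string_to_utf8</code>, <code>utf8_to_string</code>, <code>sizeof</code> and <code>String.width</code> behave as documented; the variable names are illustrative and the source is plain ASCII, so no #charset directive is needed.

<lang Pike>
int main()
{
    // Three characters: λ (U+03BB), a space, and ä (U+00E4).
    string s = "\u03bb \u00e4";

    // The encoded form holds one byte per element; λ and ä take
    // two bytes each in UTF-8, so it has five elements.
    string bytes = string_to_utf8(s);

    // sizeof() counts characters for s and bytes for the encoded string.
    write("%d characters, %d bytes\n", sizeof(s), sizeof(bytes));

    // String.width() reports the bits needed per character (8, 16 or 32).
    write("width: %d -> %d bits\n", String.width(s), String.width(bytes));

    // Decoding should give back the original string (prints 1 if so).
    write("round trip ok: %d\n", utf8_to_string(bytes) == s);
    return 0;
}
</lang>

Keeping strings decoded internally and encoding only at the I/O boundary means that indexing and <code>sizeof</code> operate on characters rather than on bytes.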
 
=={{header|Python}}==
Anonymous user