Unicode strings: Difference between revisions
{{draft task}}
Demonstrate how one is expected to handle Unicode strings. Some example considerations: can a Unicode string be written directly in the source code? How does one do I/O with Unicode strings? Can these strings be manipulated easily? Can non-ASCII characters be used for keywords, identifiers, etc.? What encodings (UTF-8, UTF-16, etc.) can your language accept without much trouble?
=={{header|J}}==
Unicode characters can be represented directly in J strings:
<lang j> '♥♦♣♠'
♥♦♣♠</lang>
By default, they are represented as utf-8:
<lang j> #'♥♦♣♠'
12</lang>
However, they can be represented as utf-16 instead:
<lang j> 7 u:'♥♦♣♠'
♥♦♣♠
#7 u:'♥♦♣♠'
4</lang>
These forms are not treated as equivalent:
<lang j> '♥♦♣♠' -: 7 u:'♥♦♣♠'
0</lang>
I/O uses characters in whatever format they happen to be in.
See also: http://www.jsoftware.com/help/dictionary/duco.htm
Unicode characters are not legal tokens or names in current versions of J.
=={{header|Perl}}==
In Perl, "Unicode" means "UTF-8". If you want to include UTF-8 characters in your source file, unless you have set the <code>PERL_UNICODE</code> environment variable correctly, you should do<lang Perl>use utf8;</lang> or you risk the parser treating the file as raw bytes.
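A minimal sketch of the difference this makes, assuming the source file is saved as UTF-8 (the variable names <code>$suits</code> and <code>$bytes</code> are only illustrative). With <code>use utf8;</code> in effect the literal is a string of four characters; encoding it explicitly with the core <code>Encode</code> module gives the twelve-byte UTF-8 form:
<lang perl>use strict;
use warnings;
use utf8;                              # string literals in this file are decoded from UTF-8
use Encode qw(encode);                 # core module, used here to get the byte form

binmode STDOUT, ':encoding(UTF-8)';    # encode characters again when printing

my $suits = '♥♦♣♠';                    # illustrative name: a character string
my $bytes = encode('UTF-8', $suits);   # illustrative name: the same text as UTF-8 bytes

print length($suits), "\n";            # 4  (characters)
print length($bytes), "\n";            # 12 (bytes, three per suit symbol)
print $suits, "\n";                    # ♥♦♣♠</lang>
The <code>binmode</code> line is one way to avoid "Wide character" warnings when printing; it is shown here as an assumption about the desired output encoding, not as the only option.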