Unicode strings

{{draft task}}
Demonstrate how one is expected to handle Unicode strings. Some example considerations: Can a Unicode string be written directly in the source code? How does one do I/O with Unicode strings? Can these strings be manipulated easily? Can non-ASCII characters be used for keywords, identifiers, and so on? Which encodings (UTF-8, UTF-16, etc.) can your language accept without much trouble?
=={{header|J}}==
 
Unicode characters can be represented directly in J strings:
 
<lang j> '♥♦♣♠'
♥♦♣♠</lang>
 
By default, they are represented as UTF-8, so the length of the string is its byte count:
 
<lang j> #'♥♦♣♠'
12</lang>
 
However, they can be represented as UTF-16 instead, in which case the length is the number of 16-bit characters:
 
<lang j> 7 u:'♥♦♣♠'
♥♦♣♠
 #7 u:'♥♦♣♠'
4</lang>
 
These forms are not treated as equivalent:
 
<lang j> '♥♦♣♠' -: 7 u:'♥♦♣♠'
0</lang>
 
I/O uses characters in whatever format they happen to be in.
 
See also: http://www.jsoftware.com/help/dictionary/duco.htm
 
Unicode characters are not legal tokens or names in current versions of J.
 
=={{header|Perl}}==
In Perl, "Unicode" means "UTF-8". If you want to include UTF-8 characters in your source file, then unless you have set the <code>PERL_UNICODE</code> environment variable correctly, you should do <lang Perl>use utf8;</lang> or you risk the parser treating the file as raw bytes.
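As a minimal sketch of the effect of <code>use utf8;</code>, the following assumes the source file is saved as UTF-8 and that the terminal displays UTF-8. The variable name <code>$suits</code> is illustrative only. It prints the string, then its length in characters, then its length in bytes once encoded:
<lang Perl>use strict;
use warnings;
use utf8;                             # string literals in this file are decoded from UTF-8
use Encode qw(encode);                # core module, used here only to count encoded bytes

binmode(STDOUT, ':encoding(UTF-8)');  # encode characters as UTF-8 on output

my $suits = '♥♦♣♠';                   # hypothetical example string
print "$suits\n";
print length($suits), "\n";                   # 4: length counts characters
print length(encode('UTF-8', $suits)), "\n";  # 12: each suit symbol takes 3 bytes in UTF-8</lang>
Without <code>use utf8;</code>, the same literal would be read as 12 separate bytes, and <code>length($suits)</code> would report 12 rather than 4.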