Unicode strings: Difference between revisions

 
The Tcl language does not define how characters are encoded in memory (the implementation uses byte arrays, UTF-16 arrays and UCS-2 strings as appropriate), and the only characters with any restriction on use in command or variable names are the ASCII parentheses and colon. However, the <code>$var</code> shorthand syntax is much more restricted (ASCII alphanumerics plus underscore only); other names have to use the more verbose form: <code>[set funny–var–name]</code>.
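A minimal sketch of the difference between the two forms follows. The variable name is the hypothetical one used above (it contains en dashes, U+2013), the values are invented for illustration, and the script is assumed to be read in an encoding that preserves those characters.

```tcl
# Hypothetical variable whose name contains non-ASCII en dashes (U+2013).
set funny–var–name 42

# Verbose form: the whole word after "set" is taken as the variable name.
puts [set funny–var–name]

# The braced form ${...} is standard Tcl and also accepts such names.
puts ${funny–var–name}

# Shorthand form: per the restriction described above, the name is expected
# to end at the first character that is not an ASCII alphanumeric or
# underscore, so this reads a separate variable called "funny" and treats
# the rest of the word as literal text.
set funny abc
puts $funny–var–name
```

The first two <code>puts</code> calls read the same variable; the last one does not, which is why names of this kind need the verbose or braced form.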
 
=={{header|TXR}}==
 
TXR source code and I/O are all assumed to be UTF-8 encoded text. The UTF-8 handling is self-contained and does not rely on any encoding library. TXR ignores LANG and similar locale environment variables.
 
Characters can be written directly in the source, or indirectly using hexadecimal escape sequences.
 
TXR's own regular expression engine supports Unicode; one of the regression test cases uses Japanese text.
 
As of version 035, identifiers such as variable names are restricted to English letters, digits and underscores.
 
=={{header|UNIX Shell}}==