Unicode strings: Difference between revisions
(added C#) |
|||
Line 542: | Line 542: | ||
return 0; |
return 0; |
||
}</lang> |
}</lang> |
||
=={{header|C#}}== |
|||
In C#, the native string representation is determined by the Common Language Runtime (CLR). In the CLR, a string is a sequence of char values, and each char represents one UTF-16 code unit. The native representation is therefore essentially UTF-16, except that a string may hold code-unit sequences that are not well-formed UTF-16, namely unpaired or misordered high and low surrogates.
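
The following is a minimal sketch of the point above, not part of the original entry. The class name and the sample character (U+1D11E) are chosen purely for illustration; it shows that one code point outside the Basic Multilingual Plane occupies two char values, and that a lone surrogate is still an acceptable string value.

<lang csharp>using System;

class Utf16CodeUnitsDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // U+1D11E MUSICAL SYMBOL G CLEF is outside the BMP,
        // so it is stored as a surrogate pair (two UTF-16 code units).
        string clef = "\U0001D11E";
        Console.WriteLine(clef.Length);                                 // 2
        Console.WriteLine(char.IsHighSurrogate(clef[0]));               // True
        Console.WriteLine(char.IsLowSurrogate(clef[1]));                // True
        Console.WriteLine(char.ConvertToUtf32(clef, 0).ToString("X"));  // 1D11E

        // A string holding only a high surrogate is not well-formed UTF-16,
        // but the runtime accepts it as a string value.
        string lone = ((char)0xD834).ToString();
        Console.WriteLine(lone.Length);                                 // 1
        Console.WriteLine(char.IsSurrogate(lone, 0));                   // True
    }
}</lang>

Because Length counts code units rather than code points, code that indexes into a string may land in the middle of a surrogate pair.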
|||
C# string literals support the \u escape sequence, which takes exactly 4 hexadecimal digits, and the \U escape sequence, which takes exactly 8 hexadecimal digits and is needed for code points above U+FFFF. UTF-encoded source code is also supported, so "Unicode strings" can be included in the source code as-is.
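
As a short illustration of the escape forms (a sketch added here, not taken from the original entry), the snippet below writes the same Greek letter three ways. The class name is hypothetical, and the last comparison assumes the source file is saved in an encoding the compiler reads correctly, such as UTF-8.

<lang csharp>using System;

class EscapeSequencesDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        string fourDigit  = "\u03B1";      // \u takes exactly 4 hex digits
        string eightDigit = "\U000003B1";  // \U takes exactly 8 hex digits
        string literal    = "α";           // typed directly; assumes a UTF-encoded source file

        Console.WriteLine(fourDigit == eightDigit);  // True
        Console.WriteLine(fourDigit == literal);     // True, given the encoding assumption above

        // Code points above U+FFFF need \U (or an explicit surrogate pair).
        string clef = "\U0001D11E";
        Console.WriteLine(clef == "\uD834\uDD1E");   // True
    }
}</lang>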
|||
C# benefits from the extensive support for Unicode in the .NET Base Class Library, including:
* Various UTF encodings (UTF-8, UTF-16, UTF-32)
* String normalization
* Access to a subset of the Unicode Character Database
* Breaking strings into text elements
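
The four facilities listed above can be sketched roughly as follows. This example is an addition for illustration; the class name and sample string are assumptions, and normalization results depend on the platform's globalization support being available.

<lang csharp>using System;
using System.Globalization;
using System.Text;

class BclUnicodeDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // 'e' followed by U+0301 COMBINING ACUTE ACCENT (decomposed form of é)
        string decomposed = "e\u0301";

        // Various UTF encodings: byte counts differ per encoding (no BOM is emitted by GetBytes).
        Console.WriteLine(Encoding.UTF8.GetBytes(decomposed).Length);     // 3
        Console.WriteLine(Encoding.Unicode.GetBytes(decomposed).Length);  // 4 (UTF-16LE)
        Console.WriteLine(Encoding.UTF32.GetBytes(decomposed).Length);    // 8

        // String normalization: Form C composes the pair into U+00E9.
        string composed = decomposed.Normalize(NormalizationForm.FormC);
        Console.WriteLine(composed == "\u00E9");                          // True

        // Unicode character database subset: general category lookup.
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory('\u0301'));  // NonSpacingMark

        // Text elements: base character plus combining mark count as one element.
        Console.WriteLine(decomposed.Length);                             // 2 code units
        Console.WriteLine(new StringInfo(decomposed).LengthInTextElements); // 1 text element
    }
}</lang>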
|||
=={{header|Common Lisp}}== |