Revision as of 05:23, 31 October 2019, Eoraptor, m (+Stata (not done yet))
Revision as of 05:47, 31 October 2019, Eoraptor (→{{header|Stata}})
Line 1,322:

=={{header|Stata}}==
See ''[https://www.stata.com/features/overview/unicode/ Unicode support]'' on the Stata website. See also the help on [https://www.stata.com/help.cgi?unicode Unicode utilities].

Unicode support was added in Stata 14.

# How easy is it to present Unicode strings in source code?
:Any Unicode character can be included in source code. Code is stored as UTF-8 text files with extension .do, .ado or .mata. The ''Output window'' can print Unicode characters as well.
# Can Unicode literals be written directly, or be part of identifiers/keywords/etc?
:Yes. Unicode characters can be part of variable names everywhere: in datasets, in scalar and matrix variables, and in Mata variables.
# How well can the language communicate with the rest of the world?
:Stata datasets (extension .dta) are stored in UTF-8. I/O with CSV files can use any encoding supported by Java (see the list [https://docs.oracle.com/en/java/javase/11/intl/supported-encodings.html here]).
# Is it good at input/output with Unicode?
:Yes.
# Is it convenient to manipulate Unicode strings in the language?
:Stata has string functions to manipulate Unicode strings, alongside legacy functions that treat strings as byte sequences. The Unicode variant is prefixed with "u": for instance, ''strtrim'' is the byte-oriented (ASCII) function and ''ustrtrim'' is its Unicode counterpart.
# How broad/deep does the language support Unicode?
:Unicode support is good. There is one missing function: while it is easy to get the character for a numeric Unicode code point with the [https://www.stata.com/help.cgi?uchar() uchar] function, the converse is not easy. However, it is possible to convert a Unicode string to ''escaped'' hex values, e.g. <code>ustrtohex("Ж")</code> returns "\u0416", and the converse operation is done with [https://www.stata.com/help.cgi?ustrunescape() ustrunescape].
# What encodings (e.g. UTF-8, UTF-16, etc) can be used?
:Data and code are stored in UTF-8. I/O with CSV data files can be done in any encoding supported by Java, which includes UTF-8, UTF-16 and UTF-32.
# Does it support normalization?
:Yes. See the help for the [https://www.stata.com/help.cgi?ustrnormalize() ustrnormalize] function. It supports the NFC, NFD, NFKC, NFKD and NFKCC forms.

=={{header|Tcl}}==

Unicode strings: Difference between revisions
