Unicode strings: Difference between revisions
Drkameleon (talk | contribs) (Added Arturo implementation) |
Thundergnat (talk | contribs) m (→{{header|Raku}}: typo, formatting, update) |
||
Line 1,215: | Line 1,215: | ||
=={{header|Raku}}== |
=={{header|Raku}}== |
||
(formerly Perl 6) |
(formerly Perl 6) |
||
Raku programs and strings are all in Unicode and operate at a grapheme abstraction level, which is agnostic to underlying encodings or normalizations. (These are generally handled at program boundaries.) Opened files default to UTF-8 encoding. All Unicode character properties are in play, so any appropriate characters may be used as parts of identifiers, whitespace, or user-defined operators. For instance: |
Raku programs and strings are all in Unicode and operate at a grapheme abstraction level, which is agnostic to underlying encodings or normalizations. (These are generally handled at program boundaries.) Opened files default to UTF-8 encoding. All Unicode character properties are in play, so any appropriate characters may be used as parts of identifiers, whitespace, or user-defined operators. For instance: |
||
Line 1,223: | Line 1,224: | ||
Raku tracks the Unicode consortium standards releases and is generally up to the latest |
Raku tracks the Unicode consortium standards releases and is generally up to the latest |
||
standard within a month or so of its release. (currently at |
standard within a month or so of its release. (currently at 13.1 as of May 2021) |
||
* Supports the normalized forms NFC, NFD, NFKC, and NFKD, and character equivalence as specified in [http://unicode.org/reports/tr15/ Unicode technical report #15]. |
* Supports the normalized forms NFC, NFD, NFKC, and NFKD, and character equivalence as specified in [http://unicode.org/reports/tr15/ Unicode technical report #15]. |
||
Line 1,232: | Line 1,233: | ||
* Provides built-in routines to access character names and to do name-to-character, character-to-ordinal, and ordinal-to-character conversions. |
* Provides built-in routines to access character names and to do name-to-character, character-to-ordinal, and ordinal-to-character conversions. |
||
* Works seamlessly with upper plane and private use plane character codepoints. |
* Works seamlessly with upper plane and private use plane character codepoints. |
||
* Provides tools to deal with strings that contain invalid Unicode |
* Provides tools to deal with strings that contain invalid Unicode characters. |
||
In general, it tries to make dealing with Unicode "just work". |
In general, it tries to make dealing with Unicode "just work". |
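A few of the points listed above can be illustrated with a minimal sketch. It is not taken from the page itself; the variable names are purely illustrative, and only core routines (<code>.chars</code>, <code>.NFD</code>, <code>.NFC</code>, <code>.uniname</code>, <code>.ord</code>, <code>.chr</code>, and the <code>\c[...]</code> escape) are assumed.

```raku
# Illustrative sketch only; variable names are hypothetical.

# 'e' followed by U+0301 COMBINING ACUTE ACCENT: two codepoints, one grapheme.
my $decomposed = "e\x[301]";
my $composed   = "\x[E9]";            # precomposed U+00E9

say $decomposed.chars;                # 1 (strings are counted in graphemes)
say $decomposed.NFD.list.elems;       # 2 (codepoints in the decomposed form)
say $decomposed eq $composed;         # True (canonically equivalent strings compare equal)

# Character names, name-to-character, character-to-ordinal, ordinal-to-character.
say $composed.uniname;                # LATIN SMALL LETTER E WITH ACUTE
my $alpha = "\c[GREEK SMALL LETTER ALPHA]";
say $alpha.ord;                       # 945
say 945.chr eq $alpha;                # True

# Upper-plane codepoints round-trip like any other character.
my $upper = 0x1F600.chr;
say $upper.chars;                     # 1
say $upper.ord == 0x1F600;            # True
```

Because strings are compared at the grapheme level, the decomposed and precomposed spellings above are treated as the same string without any explicit normalization call.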