Character codes: Difference between revisions
SqrtNegInf (talk | contribs) (→{{header|Perl}}: Unicode)
Thundergnat (talk | contribs) m (→{{header|Perl 6}}: UTF16)
Line 1,543:
Both Perl 5 and Perl 6 have good Unicode support. Note that Perl 6 treats even multi-code-point emoji and characters outside the BMP as single characters.
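To illustrate the single-character claim, here is a minimal sketch (not part of the task solution) that builds the flag emoji from the two code points listed in the output below and measures it three ways. The variable name <code>$flag</code> is purely illustrative, and the results in the comments assume a Rakudo release with current emoji grapheme support.

<lang perl6># Build the flag from its two regional-indicator code points
my $flag = (0x1F1FA, 0x1F1F8).chrs;

say $flag.chars;                 # 1  (graphemes, i.e. "characters")
say $flag.codes;                 # 2  (Unicode code points)
say $flag.encode('UTF8').bytes;  # 8  (bytes once encoded as UTF-8)</lang>

The same distinction explains why <code>.comb</code> yields one element per visible character even when that character spans several ordinals.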
<lang perl6>for 'AΑА𪚥🇺🇸👨‍👩‍👧‍👦'.comb {
    # Pair each label with the matching property of the current character
    .put for
    ['Character', 'Character name', 'Ordinal(s)', 'Hex ordinal(s)', 'UTF-8',
     'UTF-16LE', 'UTF-16BE', 'Round trip']».fmt('%14s:')
    Z
    [$_, .uninames, .ords, .ords.fmt('0x%X'), .encode('UTF8')».base(16),
     .encode('utf16le')».fmt('%02X').join.comb(4),
     .encode('utf16be')».fmt('%02X').join.comb(4), .ords.chrs
    ];
    say '';
}</lang>
{{out}}
<pre>
Character: A
Character name: LATIN CAPITAL LETTER A
Line 1,555 ⟶ 1,560:
Hex ordinal(s): 0x41
UTF-8: 41
UTF-16LE: 4100
UTF-16BE: 0041
Round trip: A
Line 1,562 ⟶ 1,569:
Hex ordinal(s): 0x391
UTF-8: CE 91
UTF-16LE: 9103
UTF-16BE: 0391
Round trip: Α
Line 1,569 ⟶ 1,578:
Hex ordinal(s): 0x410
UTF-8: D0 90
UTF-16LE: 1004
UTF-16BE: 0410
Round trip: А
Character: 𪚥
Character name:
Ordinal(s): 173733
Hex ordinal(s): 0x2A6A5
UTF-8: F0 AA 9A A5
UTF-16LE: 69D8 A5DE
UTF-16BE: D869 DEA5
Round trip: 𪚥
Line 1,583 ⟶ 1,596:
Hex ordinal(s): 0x1F1FA 0x1F1F8
UTF-8: F0 9F 87 BA F0 9F 87 B8
UTF-16LE: 3CD8 FADD 3CD8 F8DD
UTF-16BE: D83C DDFA D83C DDF8
Round trip: 🇺🇸
Line 1,590 ⟶ 1,605:
Hex ordinal(s): 0x1F468 0x200D 0x1F469 0x200D 0x1F467 0x200D 0x1F466
UTF-8: F0 9F 91 A8 E2 80 8D F0 9F 91 A9 E2 80 8D F0 9F 91 A7 E2 80 8D F0 9F 91 A6
UTF-16LE: 3DD8 68DC 0D20 3DD8 69DC 0D20 3DD8 67DC 0D20 3DD8 66DC
UTF-16BE: D83D DC68 200D D83D DC69 200D D83D DC67 200D D83D DC66
Round trip: 👨‍👩‍👧‍👦
</pre>
=={{header|Phix}}==