Character codes: Difference between revisions

Line 1,543:
Both Perl 5 and Perl 6 have good Unicode support. Note that Perl 6 treats each grapheme as a single character, so even emoji built from several code points and characters outside the BMP count as one character each.
<lang perl6>for 'AΑА𪚥🇺🇸👨‍👩‍👧‍👦'.comb {
    .put for
    [ 'Character', 'Character name', 'Ordinal(s)', 'Hex ordinal(s)', 'UTF-8',
      'UTF-16LE', 'UTF-16BE', 'Round trip']».fmt('%14s:')
    Z
    [ $_, .uninames, .ords, .ords.fmt('0x%X'), .encode('utf8')».fmt('%02X'),
      .encode('utf16le')».fmt('%02X').join.comb(4),
      .encode('utf16be')».fmt('%02X').join.comb(4), .ords.chrs
    ];
    say '';
}</lang>
{{out}}
<pre><nowiki>
Character: A
Character name: LATIN CAPITAL LETTER A
Line 1,555 ⟶ 1,560:
Hex ordinal(s): 0x41
UTF-8: 41
UTF-16LE: 4100
UTF-16BE: 0041
Round trip: A
 
Line 1,562 ⟶ 1,569:
Hex ordinal(s): 0x391
UTF-8: CE 91
UTF-16LE: 9103
UTF-16BE: 0391
Round trip: Α
 
Line 1,569 ⟶ 1,578:
Hex ordinal(s): 0x410
UTF-8: D0 90
UTF-16LE: 1004
UTF-16BE: 0410
Round trip: А
 
Character: 𪚥
Character name: CJK UNIFIED IDEOGRAPH-2A6A5
Ordinal(s): 173733
Hex ordinal(s): 0x2A6A5
UTF-8: F0 AA 9A A5
UTF-16LE: 69D8 A5DE
UTF-16BE: D869 DEA5
Round trip: 𪚥
 
Line 1,583 ⟶ 1,596:
Hex ordinal(s): 0x1F1FA 0x1F1F8
UTF-8: F0 9F 87 BA F0 9F 87 B8
UTF-16LE: 3CD8 FADD 3CD8 F8DD
UTF-16BE: D83C DDFA D83C DDF8
Round trip: 🇺🇸
 
Line 1,590 ⟶ 1,605:
Hex ordinal(s): 0x1F468 0x200D 0x1F469 0x200D 0x1F467 0x200D 0x1F466
UTF-8: F0 9F 91 A8 E2 80 8D F0 9F 91 A9 E2 80 8D F0 9F 91 A7 E2 80 8D F0 9F 91 A6
UTF-16LE: 3DD8 68DC 0D20 3DD8 69DC 0D20 3DD8 67DC 0D20 3DD8 66DC
UTF-16BE: D83D DC68 200D D83D DC69 200D D83D DC67 200D D83D DC66
Round trip: 👨‍👩‍👧‍👦
</nowiki></pre>
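
As a minimal sketch of the single-character claim, the following compares the grapheme count, code point count and UTF-8 byte count for three of the sample characters. It assumes a Rakudo release recent enough to segment emoji ZWJ sequences as one grapheme; the loop variable <code>$s</code> is only illustrative.
<lang perl6># Sketch: one grapheme may span several code points and many bytes
for '𪚥', '🇺🇸', '👨‍👩‍👧‍👦' -> $s {
    printf "%s  chars: %d  codes: %d  UTF-8 bytes: %d\n",
        $s, $s.chars, $s.codes, $s.encode('utf8').bytes;
}</lang>
Each string should report one character. The code point and byte counts should match the number of ordinals and UTF-8 bytes listed for the same character in the output above.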
 
=={{header|Phix}}==