Read a file character by character/UTF8: Difference between revisions
(→{{header|NetRexx}}: Tidy up description)
(add Perl solution)
CodePoint: index="008" character_count="1" id="U+00031" hex="0x000031" dec="0000049" oct="0000061" char="1" utf-16="0031" utf-8="31" name="DIGIT ONE"
CodePoint: index="009" character_count="1" id="U+00032" hex="0x000032" dec="0000050" oct="0000062" char="2" utf-16="0032" utf-8="32" name="DIGIT TWO"
</pre>

=={{header|Perl}}==
<lang perl># Open the file with a UTF-8 decoding layer so that read() counts characters, not bytes.
open my $fh, "<:encoding(UTF-8)", "input.txt" or die "$!";
# Read one decoded character at a time; read() returns 0 at end of file.
while (read $fh, my $char, 1) {
    printf "got Unicode character U+%04x\n", ord $char;
}
close $fh;</lang>
If the ''input.txt'' file contains <code>AB€⼈⼥</code> followed by a newline, the output would be:
<pre>
got Unicode character U+0041
got Unicode character U+0042
got Unicode character U+20ac
got Unicode character U+2f08
got Unicode character U+2f25
got Unicode character U+000a
</pre>