Read a file character by character/UTF8: Difference between revisions
Revision as of 20:17, 29 December 2013
m (→{{header|REXX}}: added whitespace to a REXX statement. -- ~~~~)
Walterpachl (talk | contribs) (added REXX version 2 (show my understanding of this task))
Line 56:
</lang>
=={{header|REXX}}==
===version 1===
REXX doesn't support UTF-8 encoded wide characters; it works on bytes only.
<br><br>The task requires that '''EOF''' be returned upon reaching the end of the file, so this programming example is written as a subroutine (procedure).
Line 95 ⟶ 96:
18 character, (hex,char) 454F46 EOF
</pre>
===version 2===
<lang rexx>/* REXX ---------------------------------------------------------------
* 29.12.2013 Walter Pachl
* read one utf8 character at a time
* see http://de.wikipedia.org/wiki/UTF-8#Kodierung
* sorry this is in German but the encoding table should be obvious
*--------------------------------------------------------------------*/
oid='utf8.txt';'erase' oid /* first create file containing utf8 chars*/
Call charout oid,'79'x
Call charout oid,'C3A4'x
Call charout oid,'C2AE'x
Call charout oid,'E282AC'x
Call charout oid,'F09D849E'x
Call lineout oid
fid='utf8.txt' /* then read it and show the contents */
Do While chars(fid)>0
c8=get_utf8char(fid)
Say left(c8,4) c2x(c8)
End
Say 'EOF'
Exit
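get_utf8char: Procedure          /* return the next UTF-8 character   */
  Parse Arg f
  c=charin(f)                    /* lead byte                         */
  b=c2b(c)                       /* its 8 bits as a character string  */
  If left(b,1)=0 Then            /* 0xxxxxxx: single-byte character   */
    Nop
  Else Do
    p=pos('0',b)                 /* p-1 leading 1-bits = total bytes  */
    Do i=1 To p-2                /* read the p-2 continuation bytes   */
      If chars(f)=0 Then Do      /* file ended inside a sequence      */
        Say 'illegal contents in file' f
        Leave
        End
      c=c||charin(f)
      End
    End
  Return c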
c2b: Return x2b(c2x(arg(1)))</lang>
output:
<pre>y 79
ä C3A4
® C2AE
€ E282AC
𝄞 F09D849E
EOF</pre>
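For comparison, the lead-byte rule that <code>get_utf8char</code> relies on can be sketched in Python. This is only an illustrative sketch and not part of the REXX solution: the names <code>utf8_sequence_length</code>, <code>read_utf8_char</code> and <code>demo</code> are hypothetical, the input is held in an in-memory byte stream rather than in <code>utf8.txt</code>, and, unlike the REXX version, an invalid lead byte or a truncated sequence raises an error instead of being returned as is.

```python
import io


def utf8_sequence_length(lead: int) -> int:
    """Return the total number of bytes announced by a UTF-8 lead byte."""
    if lead < 0x80:            # 0xxxxxxx: single-byte (ASCII) character
        return 1
    if 0xC0 <= lead < 0xE0:    # 110xxxxx: one continuation byte follows
        return 2
    if 0xE0 <= lead < 0xF0:    # 1110xxxx: two continuation bytes follow
        return 3
    if 0xF0 <= lead < 0xF8:    # 11110xxx: three continuation bytes follow
        return 4
    raise ValueError(f"invalid UTF-8 lead byte: {lead:#04x}")


def read_utf8_char(stream):
    """Read one UTF-8 character from a binary stream; None at end of file."""
    first = stream.read(1)
    if not first:
        return None
    needed = utf8_sequence_length(first[0]) - 1
    rest = stream.read(needed)
    if len(rest) != needed:
        raise ValueError("file ended inside a multi-byte sequence")
    return (first + rest).decode("utf-8")


def demo():
    """Decode the same five characters that the REXX program writes."""
    stream = io.BytesIO(bytes.fromhex("79C3A4C2AEE282ACF09D849E"))
    result = []
    while (ch := read_utf8_char(stream)) is not None:
        result.append((ch, ch.encode("utf-8").hex().upper()))
    return result


if __name__ == "__main__":
    for ch, hexbytes in demo():
        print(ch, hexbytes)
    print("EOF")
```

The range checks on the lead byte play the same role as <code>pos('0',b)</code> in the REXX code: the number of leading 1-bits gives the total length of the sequence.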
=={{header|Ruby}}==