Talk:String length: Difference between revisions

The byte length calculations for Unicode appear generally incorrect. They're only valid for codepoints in the Basic Multilingual Plane, not for those in the supplementary planes. E.g. 🀁 (U+1F001) wouldn't fit within a single 16-bit wide character; it would be represented in UTF-16 as 0xD83C and 0xDC01 (if I've done the math right). --[[User:Short Circuit|Michael Mol]] 18:24, 15 March 2012 (UTC)
: If you want to be completely general, there are [https://en.wikipedia.org/wiki/Unicode_normalization other issues to consider.] Note, in particular, that not all combined forms have their own (precomposed) codepoints. --[[User:Rdm|Rdm]] 18:31, 15 March 2012 (UTC)
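
For anyone wanting to check the arithmetic discussed above, here is a minimal Python sketch. It is not taken from the task page; the helper name <code>utf16_surrogates</code> is made up for illustration, and the q-with-tilde example is just one convenient sequence that has no precomposed codepoint. It computes the UTF-16 surrogate pair for a supplementary-plane codepoint and shows how the count of codepoints depends on the normalization form.

```python
import unicodedata


def utf16_surrogates(codepoint):
    """Split a supplementary-plane codepoint into (high, low) UTF-16 surrogates.

    Hypothetical helper for illustration; not part of any library.
    """
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("codepoint is not in a supplementary plane")
    offset = codepoint - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)        # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)       # bottom 10 bits
    return high, low


tile = "\U0001F001"                       # the mahjong tile from the comment above
high, low = utf16_surrogates(ord(tile))
print(f"0x{high:04X} 0x{low:04X}")        # 0xD83C 0xDC01

# One codepoint, but four bytes in both UTF-8 and UTF-16.
print(len(tile), len(tile.encode("utf-8")), len(tile.encode("utf-16-le")))

# Normalization changes the codepoint count when a precomposed form exists...
e_acute = "e\u0301"                       # 'e' + COMBINING ACUTE ACCENT
print(len(unicodedata.normalize("NFC", e_acute)))   # 1
print(len(unicodedata.normalize("NFD", e_acute)))   # 2

# ...but not when there is none: 'q' + COMBINING TILDE stays two codepoints.
q_tilde = "q\u0303"
print(len(unicodedata.normalize("NFC", q_tilde)))   # 2
```

The surrogate values agree with what Python's own <code>utf-16-be</code> encoder produces for the same character, which is an easy cross-check.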

== PL/I error ==

The last line ( put skip list ('Byte length=', length(trim(SM)); )
<br>is syntactically incorrect (a closing parenthesis is missing).
<br>I tried adding it and got this error message:
<br>IBM1569I S 9.0 SIZE argument must be a CONNECTED reference.
<br>--[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 18:43, 22 October 2013 (UTC)