Talk:String length: Difference between revisions

Revision as of 11:20, 9 December 2011 by rosettacode>Mischi (Component Pascal is UTF-16 only.)
Latest revision as of 08:03, 30 August 2014 by Walterpachl (→Incorrect: please provide a reference file)
(4 intermediate revisions by 3 users not shown)
The example for character length does not deal with UTF-8 and, as far as I understand, also fails with non-BMP code points.

== Incorrect ==

The byte length calculations for Unicode appear generally incorrect. They're only valid for code points in the Basic Multilingual Plane, not for those in the supplementary planes. For example, 🀁 wouldn't fit within a single wide character; it would be represented in UTF-16 as 0xD83C and 0xDC01 (if I've done the math right). --[[User:Short Circuit|Michael Mol]] 18:24, 15 March 2012 (UTC)

: If you want to be completely general, there exist [https://en.wikipedia.org/wiki/Unicode_normalization other issues to consider.] Note, in particular, that not all combining forms have codepoints. --[[User:Rdm|Rdm]] 18:31, 15 March 2012 (UTC)

:: Could we see a text file that contains the various BYTE strings and the expected length results? --[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 08:03, 30 August 2014 (UTC)

== PL/I error ==

The last line ( <code>put skip list ('Byte length=', length(trim(SM));</code> )
<br>is syntactically incorrect (a closing parenthesis is missing).
<br>I tried adding it and got an error message:
<br>IBM1569I S 9.0 SIZE argument must be a CONNECTED reference.
<br>--[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 18:43, 22 October 2013 (UTC)
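
The surrogate-pair arithmetic raised under "Incorrect" can be checked with a small script. This is only a minimal sketch, assuming Python 3; the helper names <code>utf16_surrogates</code> and <code>lengths</code> are made up for illustration and do not come from the task page. It might also serve as a starting point for the reference table of expected lengths requested above, though it is not meant as a definitive one.

```python
def utf16_surrogates(code_point):
    """Split a supplementary-plane code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000        # 20-bit value to split across two units
    high = 0xD800 + (offset >> 10)       # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits go into the low surrogate
    return high, low


def lengths(text):
    """Return the code point count and the encoded byte lengths of a string."""
    return {
        "code_points": len(text),                       # Python 3 str counts code points
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_bytes": len(text.encode("utf-16-le")),   # "-le" avoids the 2-byte BOM
    }


tile = "\U0001F001"                      # the Mahjong tile from the comment above
high, low = utf16_surrogates(ord(tile))
print(f"U+{ord(tile):04X} -> 0x{high:04X} 0x{low:04X}")
print(lengths(tile))
print(lengths("e\u0301"))                # "e" plus a combining acute accent
```

The last call touches on the normalization point as well: a base letter followed by a combining mark counts as two code points even though it displays as a single character.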