Compare length of two strings: Difference between revisions

🤔🇺🇸: characters:2, Unicode code points:3, UTF-8 bytes:12, UTF-16 bytes:12
👨‍👩‍👧‍👦: characters:1, Unicode code points:7, UTF-8 bytes:25, UTF-16 bytes:22</pre>
 
=={{header|Wren}}==
{{libheader|Wren-upc}}
In Wren, a string (i.e. an object of the String class) is an immutable sequence of bytes that is usually, but not necessarily, interpreted as UTF-8.
 
With regard to string length, the ''String.count'' method returns the number of codepoints in the string. If the string contains bytes that are not valid UTF-8, each such byte adds one to the count.
 
To find the number of bytes one can use ''String.bytes.count''.
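
As a minimal sketch of how the two counts can differ (the sample strings are illustrative assumptions; ''\xff'' is used because that byte can never occur in valid UTF-8):
<lang ecmascript>var valid   = "niño"    // 'ñ' is one codepoint encoded as two UTF-8 bytes
var invalid = "a\xffb"  // \xff is a raw byte that is not valid UTF-8

System.print(valid.count)          // 4 codepoints
System.print(valid.bytes.count)    // 5 bytes
System.print(invalid.count)        // 3: the invalid byte adds one to the count
System.print(invalid.bytes.count)  // 3 bytes</lang>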
 
Wren does not directly support Unicode grapheme clusters, in which what appears to be a single 'character' may in fact be an amalgam of several codepoints. However, the length of a string in grapheme clusters (i.e. the number of ''user perceived characters'') can be measured using the ''Graphemes.clusterCount'' method of the Wren-upc module.
<lang ecmascript>import "./upc" for Graphemes

// Prints both strings with their lengths, longer one first.
// If the lengths are equal, the second string is printed first.
var printCounts = Fn.new { |s1, s2, c1, c2|
    var l1 = (c1 > c2) ? [s1, c1] : [s2, c2]
    var l2 = (c1 > c2) ? [s2, c2] : [s1, c1]
    System.print("%(l1[0]) : length %(l1[1])")
    System.print("%(l2[0]) : length %(l2[1])\n")
}

// Compares lengths measured in codepoints.
var codepointCounts = Fn.new { |s1, s2|
    var c1 = s1.count
    var c2 = s2.count
    System.print("Comparison by codepoints:")
    printCounts.call(s1, s2, c1, c2)
}

// Compares lengths measured in bytes.
var byteCounts = Fn.new { |s1, s2|
    var c1 = s1.bytes.count
    var c2 = s2.bytes.count
    System.print("Comparison by bytes:")
    printCounts.call(s1, s2, c1, c2)
}

// Compares lengths measured in grapheme clusters.
var graphemeCounts = Fn.new { |s1, s2|
    var c1 = Graphemes.clusterCount(s1)
    var c2 = Graphemes.clusterCount(s2)
    System.print("Comparison by grapheme clusters:")
    printCounts.call(s1, s2, c1, c2)
}

for (pair in [ ["nino", "niño"], ["👨‍👩‍👧‍👦", "🤔🇺🇸"] ]) {
    codepointCounts.call(pair[0], pair[1])
    byteCounts.call(pair[0], pair[1])
    graphemeCounts.call(pair[0], pair[1])
}</lang>
 
{{out}}
<pre>
Comparison by codepoints:
niño : length 4
nino : length 4
 
Comparison by bytes:
niño : length 5
nino : length 4
 
Comparison by grapheme clusters:
niño : length 4
nino : length 4
 
Comparison by codepoints:
👨‍👩‍👧‍👦 : length 7
🤔🇺🇸 : length 3
 
Comparison by bytes:
👨‍👩‍👧‍👦 : length 25
🤔🇺🇸 : length 12
 
Comparison by grapheme clusters:
🤔🇺🇸 : length 2
👨‍👩‍👧‍👦 : length 1
</pre>
 
=={{header|Z80 Assembly}}==