Substring/Top and tail: Difference between revisions

sock
room</pre>
 
Nearly all current solutions for this task are incorrect. The task states: "The program must reference logical characters (code points), not 8-bit code units for UTF-8 or 16-bit code units for UTF-16." The program above, like most on this page, works only for code points in the Basic Multilingual Plane, since a character outside that plane occupies two 16-bit code units (a surrogate pair) in a Java string. The code below works correctly with all Unicode code points.
 
<syntaxhighlight lang="java">public class SubstringTopAndTail {
    public static void main(String[] args) {
        var s = "\uD83D\uDC0Eabc\uD83D\uDC0E"; // Horse emoji, a, b, c, horse emoji: "🐎abc🐎"

        // Size, in UTF-16 code units (1 or 2), of the first and last code points.
        // offsetByCodePoints steps over whole code points, so a surrogate pair
        // is never split; stepping back from the end also works for 1-unit strings.
        var sizeOfFirstChar = s.offsetByCodePoints(0, 1);
        var sizeOfLastChar = s.length() - s.offsetByCodePoints(s.length(), -1);

        // Removing both ends assumes the string holds at least two code points.
        var removeFirst = s.substring(sizeOfFirstChar);
        var removeLast = s.substring(0, s.length() - sizeOfLastChar);
        var removeBoth = s.substring(sizeOfFirstChar, s.length() - sizeOfLastChar);

        System.out.println(removeFirst);
        System.out.println(removeLast);
        System.out.println(removeBoth);
    }
}</syntaxhighlight>
 
Results:
<pre>abc🐎
🐎abc
abc</pre>
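
The same three operations can also be sketched by working on an array of code points instead of computing offsets into the UTF-16 string. This is only an illustrative alternative, not part of the solution above: the class name TopAndTailByCodePoints and the helper trimCodePoints are hypothetical names chosen for this example. It copies the string into an int array, so it trades some memory for simpler index arithmetic.

```java
public class TopAndTailByCodePoints {
    // Hypothetical helper: drops `fromStart` code points from the front and
    // `fromEnd` code points from the back of `s`.
    static String trimCodePoints(String s, int fromStart, int fromEnd) {
        int[] codePoints = s.codePoints().toArray(); // one int per code point
        int count = codePoints.length - fromStart - fromEnd;
        if (count < 0) {
            throw new IllegalArgumentException("string has too few code points");
        }
        // String(int[], int, int) rebuilds a string from a slice of code points.
        return new String(codePoints, fromStart, count);
    }

    public static void main(String[] args) {
        String s = "\uD83D\uDC0Eabc\uD83D\uDC0E"; // "🐎abc🐎"

        // The two horse emoji each take two UTF-16 code units.
        System.out.println(s.length());                      // 7 code units
        System.out.println(s.codePointCount(0, s.length())); // 5 code points

        System.out.println(trimCodePoints(s, 1, 0)); // first removed
        System.out.println(trimCodePoints(s, 0, 1)); // last removed
        System.out.println(trimCodePoints(s, 1, 1)); // both removed
    }
}
```

Unlike the offset-based version, this helper reports a string that is too short with an explicit exception rather than an index error from substring.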
 
=={{header|JavaScript}}==