Unicode strings: Difference between revisions

 \uF8FF
</pre>
 
=={{header|Swift}}==
{{libheader|Swift}}
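Swift has an [https://swiftdoc.org/v5.1/type/string/ advanced string type] whose default operations (counting, comparison) are Unicode-aware, working on user-perceived characters and canonical equivalence, and which exposes the individual encodings through views: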
 
<lang swift>let flag = "🇵🇷"
print(flag.count) // the older flag.characters view was removed in Swift 5
// Prints "1"
print(flag.unicodeScalars.count)
// Prints "2"
print(flag.utf16.count)
// Prints "4"
print(flag.utf8.count)
// Prints "8"
 
let nfc = "\u{01FA}"//Ǻ LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE
let nfd = "\u{0041}\u{030A}\u{0301}"//Latin Capital Letter A + ◌̊ COMBINING RING ABOVE + ◌́ COMBINING ACUTE ACCENT
let nfkx = "\u{FF21}\u{030A}\u{0301}"//Fullwidth Latin Capital Letter A + ◌̊ COMBINING RING ABOVE + ◌́ COMBINING ACUTE ACCENT
print(nfc == nfd) //NFx: true
print(nfc == nfkx) //NFKx: false
</lang>
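
The comparison above only covers canonical equivalence. As a minimal sketch, assuming Foundation is available (it supplies the <code>precomposedStringWithCompatibilityMapping</code> property, which is not part of the standard library), a compatibility-normalized (NFKC) copy can be compared instead. The variable names are illustrative only:

<lang swift>import Foundation

let composed = "\u{01FA}"                   // Ǻ, precomposed
let fullwidth = "\u{FF21}\u{030A}\u{0301}"  // fullwidth A + combining marks

// NFKC folds the fullwidth letter to its plain Latin counterpart.
let folded = fullwidth.precomposedStringWithCompatibilityMapping

print(composed == fullwidth) // false: == only applies canonical equivalence
print(composed == folded)    // true once compatibility mapping is applied
</lang>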
 
Swift [https://forums.swift.org/t/string-s-abi-and-utf-8/17676 apparently uses a null-terminated UTF-8 buffer] for native string storage, which provides compatibility with C, but does a lot of work under the covers to make things more ergonomic:
 
<blockquote>Although strings in Swift have value semantics, strings use a copy-on-write strategy to store their data in a buffer. This buffer can then be shared by different copies of a string. A string’s data is only copied lazily, upon mutation, when more than one string instance is using the same buffer. Therefore, the first in any sequence of mutating operations may cost O(n) time and space.
</blockquote>
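
A small sketch of what this looks like from user code, assuming Foundation is imported for <code>strlen</code>. The copy-on-write sharing itself is not observable here; only the value semantics and the C-compatible view are shown:

<lang swift>import Foundation // only needed for strlen

let original = "caf\u{E9}"  // "café" with a precomposed é
var copy = original         // cheap: storage may be shared until mutation
copy.append("!")            // mutating the copy leaves the original untouched

print(original) // café
print(copy)     // café!

// withCString hands C code a null-terminated UTF-8 pointer,
// so strlen reports UTF-8 code units, not characters.
let cLength = original.withCString { strlen($0) }
print(cLength)        // 5
print(original.count) // 4
</lang>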
 
See also:
 
* 'smol' strings: [https://swift.org/blog/utf8-string/ a small-string representation] that avoids heap allocation by storing up to 10 (32-bit systems) or 15 (64-bit systems) UTF-8 code units directly in the string value.
* [https://forums.swift.org/t/string-s-abi-and-utf-8/17676 String’s ABI and UTF-8] mentions data structures that can be used to share the backing UTF-8 data.
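
The limits above can be illustrated with a rough sketch. <code>fitsSmallStringLimit</code> is a hypothetical helper, not a standard library function: Swift does not publicly expose which representation a string uses, so this only checks the documented size limit. <code>withUTF8</code> (Swift 5.1 and later) gives temporary access to the contiguous UTF-8 code units:

<lang swift>// Hypothetical helper: checks only the size limit quoted above (15 code
// units on 64-bit, 10 on 32-bit); it cannot inspect the real representation.
func fitsSmallStringLimit(_ s: String) -> Bool {
    let limit = MemoryLayout<Int>.size == 8 ? 15 : 10
    return s.utf8.count <= limit
}

print(fitsSmallStringLimit("h\u{E9}llo"))                   // true (6 code units)
print(fitsSmallStringLimit("a considerably longer string")) // false (28 code units)

// withUTF8 is mutating because it may first make the storage contiguous.
var text = "h\u{E9}llo"
let bytes = text.withUTF8 { Array($0) } // copy the code units out of the buffer
print(bytes) // [104, 195, 169, 108, 108, 111]
</lang>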
 
 
=={{header|Rust}}==