Revision as of 19:04, 21 September 2020 rosettacode>Pdolezal (Add comment for Rust)
Revision as of 05:24, 4 December 2020 rosettacode>Indolering (Added Swift)
Line 1,360:
\uF8FF
</pre>

=={{header|Swift}}==
{{libheader|Swift}}
Swift has an [https://swiftdoc.org/v5.1/type/string/ advanced string type] whose default operations are Unicode-aware (counting works on user-perceived characters and comparison uses canonical equivalence) and which exposes the underlying encodings through views:

<lang swift>let flag = "🇵🇷"
print(flag.count) // Prints "1" (user-perceived characters)
print(flag.unicodeScalars.count) // Prints "2"
print(flag.utf16.count) // Prints "4"
print(flag.utf8.count) // Prints "8"

let nfc = "\u{01FA}" // Ǻ LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE
let nfd = "\u{0041}\u{030A}\u{0301}" // LATIN CAPITAL LETTER A + ◌̊ COMBINING RING ABOVE + ◌́ COMBINING ACUTE ACCENT
let nfkx = "\u{FF21}\u{030A}\u{0301}" // FULLWIDTH LATIN CAPITAL LETTER A + ◌̊ COMBINING RING ABOVE + ◌́ COMBINING ACUTE ACCENT

print(nfc == nfd) // true: canonically equivalent (NFC/NFD)
print(nfc == nfkx) // false: only compatibility-equivalent (NFKC/NFKD)
</lang>

Swift [https://forums.swift.org/t/string-s-abi-and-utf-8/17676 apparently uses a null-terminated char array] for storage to provide compatibility with C, but does a lot of work under the covers to make things more ergonomic:

<blockquote>Although strings in Swift have value semantics, strings use a copy-on-write strategy to store their data in a buffer. This buffer can then be shared by different copies of a string. A string’s data is only copied lazily, upon mutation, when more than one string instance is using the same buffer. Therefore, the first in any sequence of mutating operations may cost O(n) time and space.</blockquote>

See also:
* 'smol': [https://swift.org/blog/utf8-string/ a stack-allocated string type] that can store up to 10 (32-bit systems) or 15 (64-bit systems) UTF-8 code units.
* [https://forums.swift.org/t/string-s-abi-and-utf-8/17676 String’s ABI and UTF-8] mentions data structures that can be used to share the backing UTF-8 data.

=={{header|Rust}}==

Unicode strings: Difference between revisions

