Category talk:Wren-upc: Difference between revisions

m
→‎User-perceived characters: Updated blurb to Unicode version 15.0.
m (→‎Source code: Now uses Wren S/H lexer.)
m (→‎User-perceived characters: Updated blurb to Unicode version 15.0.)
 
(One intermediate revision by the same user not shown)
Line 4:
Given the complexity of this process, Wren doesn't have built-in support for it and this module aims to remedy that situation. It is based on Oliver Kuederle's [https://github.com/rivo/uniseg Unicode Text Segmentation for Go library] version 0.1.0 which is subject to the [https://github.com/rivo/uniseg/blob/master/LICENSE.txt MIT License] and is based on Unicode version 12.0.
 
At the time of writing (November 2023) there has been a subsequent release for this library (version 0.3.0) which includes support for word breaking, sentence breaking and line breaking and is based on Unicode version 14.0 (subsequently updated to version 15.0 though not yet released). These additional features are unlikely to see much (if any) use on Rosetta Code and, given the amount of code needed to deal with them, I have decided not to support them in upc.wren. However, I have updated the property table so it now includes characters added up to and including Unicode version 15.0.
 
If anyone is interested in using the additional features of version 0.3.0, then it would probably be easier to wrap the library using a Go host rather than attempting to translate all of it to Wren.
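For readers unfamiliar with how a range-based property table such as the one below is typically consulted, here is a minimal sketch in Go (the language of the upstream library, not Wren). It is an illustration only: the names <code>codeRange</code>, <code>lookup</code> and the default <code>other</code> property are hypothetical, the table is a five-row excerpt of rows quoted on this page, and binary search is an assumed lookup strategy rather than a description of what upc.wren actually does.

```go
package main

import "sort"

// property mirrors the labels used in the table rows on this page.
type property string

const (
	other                property = "other" // hypothetical default for code points not in the table
	extend               property = "extend"
	spacingMark          property = "spacingMark"
	prepend              property = "prepend"
	control              property = "control"
	extendedPictographic property = "extendedPictographic"
)

// codeRange is one table row: an inclusive code point range and its property.
type codeRange struct {
	lo, hi rune
	prop   property
}

// table is a small excerpt of the rows shown on this page.
// It must stay sorted by lo and free of overlaps for the search to work.
var table = []codeRange{
	{0x0CF3, 0x0CF3, spacingMark},
	{0x0EC8, 0x0ECE, extend},
	{0x11F02, 0x11F02, prepend},
	{0x13430, 0x1343F, control},
	{0x1FAE8, 0x1FAE8, extendedPictographic},
}

// lookup returns the property of r, or other if no row covers it.
func lookup(r rune) property {
	// Find the first row whose upper bound is at or beyond r.
	i := sort.Search(len(table), func(i int) bool { return table[i].hi >= r })
	// That row covers r only if its lower bound is not above r.
	if i < len(table) && table[i].lo <= r {
		return table[i].prop
	}
	return other
}
```

With this excerpt, <code>lookup(0x0ECE)</code> yields <code>extend</code>, reflecting the range that was widened for Unicode 15.0, while <code>lookup(0x0ECF)</code> falls through to the default.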
Line 174:
[0x0CD5, 0x0CD6, extend],
[0x0CE2, 0x0CE3, extend],
[0x0CF3, 0x0CF3, spacingMark],
[0x0D00, 0x0D01, extend],
[0x0D02, 0x0D03, spacingMark],
Line 203 ⟶ 204:
[0x0EB3, 0x0EB3, spacingMark],
[0x0EB4, 0x0EBC, extend],
[0x0EC8, 0x0ECE, extend],
[0x0F18, 0x0F19, extend],
[0x0F35, 0x0F35, extend],
Line 1,367 ⟶ 1,368:
[0x10D24, 0x10D27, extend],
[0x10EAB, 0x10EAC, extend],
[0x10EFD, 0x10EFF, extend],
[0x10F46, 0x10F50, extend],
[0x10F82, 0x10F85, extend],
Line 1,406 ⟶ 1,408:
[0x11236, 0x11237, extend],
[0x1123E, 0x1123E, extend],
[0x11241, 0x11241, extend],
[0x112DF, 0x112DF, extend],
[0x112E0, 0x112E2, spacingMark],
Line 1,525 ⟶ 1,528:
[0x11EF3, 0x11EF4, extend],
[0x11EF5, 0x11EF6, spacingMark],
[0x11F00, 0x11F01, extend],
[0x11F02, 0x11F02, prepend],
[0x11F03, 0x11F03, spacingMark],
[0x11F34, 0x11F35, spacingMark],
[0x11F36, 0x11F3A, extend],
[0x11F3E, 0x11F3F, spacingMark],
[0x11F40, 0x11F40, extend],
[0x11F41, 0x11F41, spacingMark],
[0x11F42, 0x11F42, extend],
[0x13430, 0x1343F, control],
[0x13440, 0x13440, extend],
[0x13447, 0x13455, extend],
[0x16AF0, 0x16AF4, extend],
[0x16B30, 0x16B36, extend],
Line 1,558 ⟶ 1,572:
[0x1E023, 0x1E024, extend],
[0x1E026, 0x1E02A, extend],
[0x1E08F, 0x1E08F, extend],
[0x1E130, 0x1E136, extend],
[0x1E2AE, 0x1E2AE, extend],
[0x1E2EC, 0x1E2EF, extend],
[0x1E4EC, 0x1E4EF, extend],
[0x1E8D0, 0x1E8D6, extend],
[0x1E944, 0x1E94A, extend],
Line 1,811 ⟶ 1,827:
[0x1F6D5, 0x1F6D5, extendedPictographic],
[0x1F6D6, 0x1F6D7, extendedPictographic],
[0x1F6D8, 0x1F6DB, extendedPictographic],
[0x1F6DC, 0x1F6DC, extendedPictographic],
[0x1F6DD, 0x1F6DF, extendedPictographic],
[0x1F6E0, 0x1F6E5, extendedPictographic],
Line 1,893 ⟶ 1,910:
[0x1FA80, 0x1FA82, extendedPictographic],
[0x1FA83, 0x1FA86, extendedPictographic],
[0x1FA87, 0x1FA88, extendedPictographic],
[0x1FA89, 0x1FA8F, extendedPictographic],
[0x1FA90, 0x1FA95, extendedPictographic],
[0x1FA96, 0x1FAA8, extendedPictographic],
Line 1,900 ⟶ 1,918:
[0x1FAB0, 0x1FAB6, extendedPictographic],
[0x1FAB7, 0x1FABA, extendedPictographic],
[0x1FABB, 0x1FABD, extendedPictographic],
[0x1FABE, 0x1FABE, extendedPictographic],
[0x1FABF, 0x1FABF, extendedPictographic],
[0x1FAC0, 0x1FAC2, extendedPictographic],
[0x1FAC3, 0x1FAC5, extendedPictographic],
[0x1FAC6, 0x1FACD, extendedPictographic],
[0x1FACE, 0x1FACF, extendedPictographic],
[0x1FAD0, 0x1FAD6, extendedPictographic],
[0x1FAD7, 0x1FAD9, extendedPictographic],
[0x1FADA, 0x1FADB, extendedPictographic],
[0x1FADC, 0x1FADF, extendedPictographic],
[0x1FAE0, 0x1FAE7, extendedPictographic],
[0x1FAE8, 0x1FAE8, extendedPictographic],
[0x1FAE9, 0x1FAEF, extendedPictographic],
[0x1FAF0, 0x1FAF6, extendedPictographic],
[0x1FAF7, 0x1FAF8, extendedPictographic],
[0x1FAF9, 0x1FAFF, extendedPictographic],
[0x1FC00, 0x1FFFD, extendedPictographic],
[0xE0000, 0xE0000, control],