Talk:Unicode variable names

:: They're not ''semantically'' interchangeable. They might not have the same glyph in all fonts. They might participate in ligatures differently (Latin-based writing systems are largely simple that way, but other writing systems are very much not). And anyway, it tends to not be such a huge problem in practice. Individual (human) languages don't have to deal with the problem in the first place; it's only when you try to support all known writing systems that you run into problems. (If you want an area where there ''are'' problems due to the sorts of issues you mention, try Unicode domain names; that's a totally different problem from source code though.) –[[User:Dkf|Donal Fellows]] 23:45, 10 July 2011 (UTC)
 
==Security considerations==
I want to add an overview of which approach each language takes to Unicode identifiers:
 
* ASCII only
* No restrictions (such as PHP, D, Nim, Crystal, ...)
* Certain ranges only (e.g. AltId in C11; used by C and C++), which is prone to e.g. RTL attacks
* Only TR31 (letters, numbers, ...)
* Disallow excluded and limited-use scripts
* Check for mixed scripts
* Require normalization, and which form (such as the odd choice of NFKC in Python 3)
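To illustrate the last two items (mixed-script checks and normalization), here is a minimal Python sketch using only the standard <code>unicodedata</code> module. The helper names <code>nfkc_equal</code>, <code>rough_scripts</code> and <code>is_mixed_script</code> are made up for this example. The script detection is only a heuristic that reads the first word of each character's Unicode name, since the standard library does not expose the Script property; a real checker would use the Unicode Scripts data instead.

```python
import unicodedata


def nfkc_equal(a: str, b: str) -> bool:
    """Return True if both identifiers become the same string under NFKC."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


def rough_scripts(ident: str) -> set:
    """Heuristic script detection (an assumption of this sketch).

    Uses the first word of each letter's Unicode character name,
    e.g. 'LATIN' or 'CYRILLIC'. This is NOT the real Script property.
    """
    scripts = set()
    for ch in ident:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(ident: str) -> bool:
    """Return True if the letters of the identifier span more than one script."""
    return len(rough_scripts(ident)) > 1


if __name__ == "__main__":
    # U+FB01 (LATIN SMALL LIGATURE FI) folds to "fi" under NFKC.
    print(nfkc_equal("\ufb01le", "file"))
    # U+0430 is CYRILLIC SMALL LETTER A, visually close to Latin "a".
    print(rough_scripts("p\u0430ypal"), is_mixed_script("p\u0430ypal"))
```

The first check shows why the choice of normalization form matters: under NFKC the two spellings name the same identifier, while under NFC they would stay distinct. The second shows the kind of confusable a mixed-script check is meant to catch.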
 
I'm working on such an overview, along with tools, for all mainstream languages (and filesystems and everything else with names), because 99% of them are insecure and do not follow the Unicode security guidelines for identifiers, leading to unidentifiable identifiers. https://github.com/rurban/libu8ident --[[User:ReiniUrban|ReiniUrban]] ([[User talk:ReiniUrban|talk]]) 09:34, 30 December 2021 (UTC)