Unicode strings: Difference between revisions


* [[Unicode variable names]]

* [[Terminal control/Display an extended character]]

=={{header|80386 Assembly}}==

* How well prepared is the programming language for Unicode? - Well prepared. Assembly language can do anything the computer can do.

* How easy is it to present Unicode strings in source code? - Easy, as long as you write them as hexadecimal code unit values.

* Can Unicode literals be written directly? - Depends on the assembler. MASM does not allow this, but we could create a translator that converts them before assembly.

* Or be part of identifiers/keywords/etc? - Depends on the assembler. Intel notation does not use Unicode identifiers or mnemonics. Assembly source is translated to numeric machine code, and mnemonics are only a notation for that code. You can use your own mnemonics, but you need to be able to assemble them. One way to do this is to write a wrapper that converts your Unicode mnemonic notation to the notation that the assembler expects.

* How well can the language communicate with the rest of the world? - With difficulty. This is a low-level language, so any kind of communication is possible, but you have to set up data structures and write many lines of code for even basic tasks.

* Is it good at input/output with Unicode? - Yes and no. The Unicode part is easy, but for input/output we have to set up data structures and write many lines of code, or link to code libraries.

* Is it convenient to manipulate Unicode strings in the language? - No. String manipulation requires lots of code. We can link to code libraries, but this is not as straightforward as it would be in a higher-level language.

* How broad/deep does the language support Unicode? - We can do anything in assembly language, so support is 100%, but nothing is convenient with respect to Unicode. Strings are just a series of bytes. Whether a series of bytes is treated as a string is down to the assembler, if it provides string support as an extension. You need to be prepared to define data structures containing the values that you want.

* What encodings (e.g. UTF-8, UTF-16, etc) can be used? - All encodings are supported, but again, nothing is convenient with respect to encodings, although hexadecimal notation is good to use in assembly language.

* Normalization? - This can be done, but nothing is built in. All characters are just hexadecimal values, so be prepared to write lots of code.

* Canonization? - Good. You create your own data structures, so you can canonize however you like. However, be prepared to write lots of code.
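
The points above about hexadecimal strings and hand-written output can be illustrated with a minimal sketch. It assumes NASM syntax and 32-bit Linux system calls, neither of which this entry specifies (it only mentions MASM), so treat the assembler directives, the labels <code>msg</code> and <code>len</code>, and the system call numbers as assumptions of the example rather than part of the task description. The string is the UTF-8 encoding of U+20AC (EURO SIGN), written as three hexadecimal bytes followed by a newline.

<syntaxhighlight lang="asm">
; Sketch only: NASM syntax, 32-bit Linux (assumed, not stated in this entry).
; Assumed build steps: nasm -f elf32 euro.asm && ld -m elf_i386 -o euro euro.o

section .data
msg:    db 0xE2, 0x82, 0xAC     ; U+20AC EURO SIGN encoded as UTF-8
        db 0x0A                 ; newline
len     equ $ - msg             ; length of the string in bytes, not characters

section .text
global _start
_start:
        mov     eax, 4          ; sys_write
        mov     ebx, 1          ; file descriptor 1 (stdout)
        mov     ecx, msg        ; address of the byte sequence
        mov     edx, len        ; number of bytes to write
        int     0x80

        mov     eax, 1          ; sys_exit
        xor     ebx, ebx        ; exit status 0
        int     0x80
</syntaxhighlight>

The assembler only sees four bytes here. Whether they appear as a euro sign depends on the terminal interpreting the output as UTF-8. Anything beyond emitting raw bytes, such as counting characters or normalizing, would have to be written by hand or linked in from a library.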

=={{header|C}}==