Special characters: Difference between revisions

Line 32:
* <code>.</code> macro concatenation - ex: <code>&VAR.A</code>
* <code>.</code> macro sequence symbol - ex: <code>.SEQ</code>
=={{header|6502 Assembly}}==
Syntax varies between assemblers but most of these are standard:
* <code>,</code> macro argument separator, data block element separator
* <code>#</code> tells the assembler the number after it is a constant, not a memory address (e.g. <code>LDA #3</code>)
* <code>$</code> tells the assembler the number after it is hexadecimal (e.g. <code>LDA #$21</code>)
* <code>%</code> tells the assembler the number after it is binary (e.g. <code>LDA #%11110111</code>)
* <code><</code> represents the "low byte" of a number greater than 8 bits (e.g. <code><$1234</code> gets replaced with <code>$34</code>)
* <code>></code> represents the "high byte" of a number greater than 8 bits (e.g. <code>>$1234</code> gets replaced with <code>$12</code>)
* <code>"</code> strings are enclosed in double quotes, e.g. <code>db "Hello World",0</code>
* <code>'</code> single ASCII values are enclosed in single quotes, e.g. <code>LDA #'J'</code>
* <code>;</code> comment character
* <code>:</code> marks the end of a label name. Some assemblers require a colon for a label to count, others do not.
* <code>@</code> or <code>.</code> as a label prefix indicates that the label is local, i.e. it is only in "scope" between two labels with no prefix.
* <code>\</code> prefix for a macro argument in a macro definition
* <code>+-*/()</code> mathematical operators for constant label offsets (e.g. <code>myLabel+1</code> = the address just after <code>myLabel</code>)
* <code>$</code> or <code>*</code>, as an operand of an instruction, represents the current program counter. This can be used to implement an endless loop (e.g. <code>jmp *</code>) or to load from a program counter relative offset.
* The comma <code>,</code> separates entries in a <code>db</code> or <code>dw</code> statement that all exist on the same line.
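
The arithmetic behind the <code><</code> and <code>></code> operators can be modelled outside the assembler. The following Python sketch is only an illustration: the function names are invented here, and it assumes a 16-bit value. It is not how any particular assembler is implemented.
<syntaxhighlight lang="python">def low_byte(value):
    """Model of the assembler's < operator: keep bits 0-7."""
    return value & 0xFF

def high_byte(value):
    """Model of the assembler's > operator: keep bits 8-15."""
    return (value >> 8) & 0xFF

print(hex(low_byte(0x1234)))   # 0x34
print(hex(high_byte(0x1234)))  # 0x12</syntaxhighlight>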
 
There are no "escape sequences" per se, but if you need to load (for example) an apostrophe, it is easiest to use its ASCII code instead of trying to write it as a literal:
<syntaxhighlight lang="6502asm">LDA #''' ;is likely to confuse the assembler, and is just hard to read.
LDA #$27 ;gets the same result and isn't ambiguous.</syntaxhighlight>
 
=={{header|68000 Assembly}}==
Syntax varies between assemblers but most of these are standard:
* <code>,</code> separator for macro arguments, data block elements, and instruction operands
* <code>#</code> tells the assembler the number after it is a constant, not a memory address
* <code>$</code> prefix that identifies a hexadecimal number
* <code>%</code> prefix that identifies a binary number
* <code>"</code> strings are enclosed in double quotes
* <code>'</code> single ASCII values are enclosed in single quotes
* <code>;</code> comment
* <code>:</code> marks the end of a label name
* <code>@</code> prefix for a local label
* <code>.</code> prefix for a local label
* <code>\</code> prefix for a macro argument in a macro definition
* <code>+-*/()</code> mathematical operators for constant label offsets (e.g. <code>myLabel+1</code> = the address just after <code>myLabel</code>)
* Parentheses <code>()</code> around an address register mean that the address stored at that address register is being dereferenced and the data it points to is the actual operand of the instruction, e.g. <code>MOVE.L (A0),D0</code>
 
 
Special characters only have meaning during the assembly process. They merely tell the assembler how to assemble your code. There is no built-in understanding of control codes like <code>%d</code> or <code>\n</code> for string operations, for example.
 
If you need to load a character whose literal syntax is likely to be confusing, such as an apostrophe, it is easiest to specify its ASCII code directly:
<syntaxhighlight lang="68000devpac">MOVE.B #''',D0 ;is likely to cause a parsing error when assembling.
MOVE.B #$27,D0 ;same result but works every time.</syntaxhighlight>
 
Even better is to use an <code>equ</code> directive so that you aren't resorting to "magic numbers":
<syntaxhighlight lang="68000devpac">ascii_apostrophe equ $27
MOVE.B #ascii_apostrophe,D0</syntaxhighlight>
 
=={{header|8086 Assembly}}==
Syntax varies between assemblers, but the following are typically standard:
* <code>;</code> is the comment character.
* <code>" "</code> encloses literal strings in embedded data
* <code>' '</code> encloses a single string character as an instruction operand (e.g. <code>MOV AL,'3'</code>). Note that a character in single or double quotes is really just an alias for the ASCII code. The assembly program makes no distinction between numbers and text.
* <code>:</code> is placed at the end of a code label. The label represents the address of the instruction directly below it, and can be used with <code>JMP</code> and other such branching commands.
* <code>:</code> between a segment register and a label or standard register represents a segment-offset pair to load from/store to. (e.g. <code>MOV [ES:DI],AX</code>)
* A <code>0x</code> prefix or a <code>h</code> suffix indicates that the numeric quantity is hexadecimal. When using a "h" suffix, an extra leading zero is required if the leading digit is A-F. (e.g. 0A000h, 0FFh. This leading zero does not count toward the size of the operand.)
* A <code>0b</code> prefix or a <code>b</code> suffix indicates that the numeric quantity is binary.
* <code>[ ]</code> around a register or label name indicates that the register or label represents a pointer being dereferenced, and the actual operand is the data at that address. To remove ambiguity, one can type "byte ptr" or "word ptr" before the brackets to specify the type associated with the pointer. (e.g. <code>inc byte ptr [bx]</code>)
 
=={{header|ActionScript}}==
Line 56 ⟶ 115:
 
There are no escape sequences in character literals. Any character supported by the source encoding is allowed. The only escape sequence in string literals is "" (a doubled double quotation mark), which denotes ". When characters need to be specified by their code positions (in Unicode), the 'Val attribute is used:
<syntaxhighlight lang="ada">with Ada.Text_IO; use Ada.Text_IO;

procedure Test is
begin
Put ("Quote """ & ''' & """" & Character'Val (10));
end Test;</syntaxhighlight>
{{out}}
<pre>
Line 68 ⟶ 127:
</pre>
Note that character and string literals serve all character and string types. For example with Wide_Wide characters (32-bit) and strings:
<syntaxhighlight lang="ada">with Ada.Wide_Wide_Text_IO; use Ada.Wide_Wide_Text_IO;

procedure Test is
begin
Put ("Unicode """ & ''' & """" & Wide_Wide_Character'Val (10));
end Test;</syntaxhighlight>
 
=={{header|ALGOL 68}}==
Line 80 ⟶ 139:
the character displayed when a number cannot be printed in the width provided. And
the null character indicating the end of characters in a BYTES array.
<syntaxhighlight lang="algol68">printf(($"flip:"g"!"l$,flip));
printf(($"flop:"g"!"l$,flop));
printf(($"blank:"g"!"l$,blank));
printf(($"error char:"g"!"l$,error char));
printf(($"null character:"g"!"l$,null character))</syntaxhighlight>
{{out}}
<pre>
Line 95 ⟶ 154:
To handle the output movement to (and input movement from) a device
ALGOL 68 has the following four positioning procedures:
<syntaxhighlight lang="algol68">print(("new page:",new page));
print(("new line:",new line));
print(("space:",space));
print(("backspace:",backspace))</syntaxhighlight>
These procedures may not all be supported on a particular device.
 
If a particular device (CHANNEL) is '''set possible''', then there
Line 200 ⟶ 259:
<br>
Certain combinations of letters form reserved words in Algol W and cannot be used as identifiers. Also "not", "not =" and "comment" can be used as alternatives for ¬ and ¬= and % (however only ; can terminate a comment started with "comment").
 
=={{header|Arturo}}==
 
<table border="1" class="docutils"><tr><th>Escape sequence</th><th>Meaning</th></tr>
<tr><td><tt class="docutils literal"><span class="pre">\a</span></tt></td><td><span id="alert_2043891028">alert</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\b</span></tt></td><td><span id="backspace_1274784623">backspace</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\e</span></tt></td><td><span id="escape_471864567">escape</span> <span id="esc_611322509">[ESC]</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\f</span></tt></td><td><span id="form-feed_295412702">form feed</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\n</span></tt></td><td><span id="line-feed_1443601756">line feed</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\r</span></tt></td><td><span id="carriage-return_731232527">carriage return</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\t</span></tt></td><td><span id="tabulator_1835392927">tabulator</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\v</span></tt></td><td><span id="vertical-tabulator_359268340">vertical tabulator</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\\</span></tt></td><td><span id="backslash_112366856">backslash</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\&quot;</span></tt></td><td><span id="quotation-mark_2111627364">quotation mark</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\'</span></tt></td><td><span id="apostrophe_551932232">apostrophe</span></td></tr>
</table>
 
=={{header|AutoHotkey}}==
Line 419 ⟶ 494:
 
And unlike the earlier versions, when Befunge-98 encounters an unsupported instruction (typically an unassigned fingerprint character), the program counter is reflected instead of the character being ignored.
 
=={{header|Binary Lambda Calculus}}==
 
BLC has no special characters and no need to escape any. That's essential for a language in order to be universal in the sense of https://gist.github.com/tromp/86b3184f852f65bfb814e3ab0987d861#universality
 
=={{header|Bracmat}}==
Line 559 ⟶ 638:
\U000000CF
However, letters in the basic execution character set may not be written in this form (but since all those characters are in standard ASCII, writing them as universal character constants would only obfuscate anyway). If the compiler accepts direct usage of non-ASCII characters somewhere in the code, the result must be the same as with the corresponding universal character name. For example, the following two lines, if accepted by the compiler, should have the same effect:
<syntaxhighlight lang="cpp">std::cout << "Tür\n";
std::cout << "T\u00FCr\n";</syntaxhighlight>
Note that, in principle, C++ also allows such letters in identifiers, e.g.
<syntaxhighlight lang="cpp">extern int Tür; // if the compiler allows literal ü
extern int T\u00FCr; // should in theory work everywhere</syntaxhighlight>
but that's not generally supported by existing compilers (e.g. g++ 4.1.2 doesn't support it).

Another escape sequence that works everywhere is the escaped newline: if a backslash is at the end of a line, the next line is pasted to it without any space in between. For example:
<syntaxhighlight lang="cpp">int const\
ant; // defines a variable of type int named constant, not a variable of type int const named ant</syntaxhighlight>
 
=== String and character literal ===
A string literal is surrounded by double quotes (<tt>"</tt>). A character literal is surrounded by single quotes (<tt>'</tt>). Example:
<syntaxhighlight lang="cpp">char const* str = "a string literal"; // pointer to the first character of the literal
char c = 'x'; // a character literal</syntaxhighlight>
 
The following escape sequences are only allowed inside string constants and character constants:
Line 599 ⟶ 678:
The <tt>#</tt> character in C++ is special as it is interpreted only in the preprocessing phase, and shouldn't occur (outside of character/string constants) after preprocessing.
*If <tt>#</tt> appears as the first non-whitespace character in the line, it introduces a preprocessor directive. For example:
<syntaxhighlight lang="cpp">#include <iostream></syntaxhighlight>
*Inside macro definitions, a single <tt>#</tt> is the stringification operator, which turns its argument into a string. For example:
<syntaxhighlight lang="cpp">#include <iostream>
#define STR(x) #x
int main()
{
  std::cout << STR(Hello world) << std::endl; // STR(Hello world) expands to "Hello world"
}</syntaxhighlight>
*Also inside macro definitions, <tt>##</tt> is the token pasting operator. For example:
<syntaxhighlight lang="cpp">#define THE(x) the_ ## x
int THE(answer) = 42; // THE(answer) expands to the_answer</syntaxhighlight>
 
Note that the # character is not interpreted specially inside character or string literals.
Line 647 ⟶ 726:
Within E ''quasiliterals'', backslash is not special and <code>$\</code> plays the same role:

<syntaxhighlight lang="e">? println(`1 + 1$\n= ${1 + 1}`)
1 + 1
= 2</syntaxhighlight>
 
=={{header|Erlang}}==
Line 658 ⟶ 737:
When Forth fails to interpret a symbol as a defined word, an attempt is made to interpret it as a number. In numerical interpretation there arise a number of special characters:

<syntaxhighlight lang="forth">
10 \ single cell number
-10 \ negative single cell number
10. \ double cell number
10e \ floating-point number</syntaxhighlight>

Many systems - and the Forth200x standard - extend this set with base prefixes:

<syntaxhighlight lang="forth">
#10 \ decimal
$10 \ hex
%10 \ binary</syntaxhighlight>
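
The effect of the base prefixes can be sketched in Python. This is a rough model, not a description of any real Forth system: the name <code>parse_prefixed</code> is invented here, only single-cell integers are handled (double-cell and floating-point forms are rejected), and Python's <code>int</code> accepts a few spellings that Forth would not.
<syntaxhighlight lang="python"># Prefix characters and the number base each one selects
PREFIX_BASES = {"#": 10, "$": 16, "%": 2}

def parse_prefixed(token, base=10):
    """Convert a token such as $10 to an integer; return None if it is not a number."""
    digits = token
    if token[:1] in PREFIX_BASES:
        base = PREFIX_BASES[token[0]]  # the prefix overrides the current base
        digits = token[1:]
    try:
        return int(digits, base)
    except ValueError:
        return None

for token in ["#10", "$10", "%10", "10", "DUP"]:
    print(token, parse_prefixed(token))</syntaxhighlight>
Without a prefix the token is read in the current base, which is passed here as the <code>base</code> argument.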
 
Of strings, Forth200x [http://www.forth200x.org/escaped-strings.html Escaped Strings] adds a string-parsing word with very familiar backslashed escapes.
Line 815 ⟶ 894:
1 Advance to the top of the next page.
Other output devices may or may not follow this scheme. A teletype attached to a PDP-15 in the 1970s certainly would chug along to the top of the next page, but screens attached to IBM mainframes would not start at the top line. Still later systems have generally ignored this protocol, but the output from free-format WRITE statements still starts with a space, just in case.
 
 
=={{header|FreeBASIC}}==
 
=== Comments ===
 
* ' Comment
* /' '/ Multi-line Comment
 
=== Assignment operator symbols ===
 
* = assignment operator
 
=== Arithmetic operator symbols ===
 
* + addition
* - subtraction
* * multiplication
* / division
* \ integer division
* ^ exponentiation
* += addition and assign
* -= subtraction and assign
* *= multiplication and assign
* /= division and assign
* \= integer division and assign
* ^= exponentiation and assign
 
=== String operator symbols ===
 
* + concatenation
* & concatenation with conversion
* &= concatenation and assign
 
=== Comparative operator symbols ===
 
* = equality
* < less than
* > greater than
* <= less than or equal to
* >= greater than or equal to
* <> inequality
 
=== Statement and argument separators ===
 
* : separates multiple statements on a line
 
=== Enclosures ===
 
* " " used as enclosures for strings
* ' ' used as enclosures for strings
* ( ) function argument enclosures, array element reference, and used to dictate mathematical precedence
 
=== Output separators ===
 
* ; move cursor to next column instead of newline and separates redirection stream from data
* , move cursor to next tabstop instead of newline and alternative to semicolon for separation of stream from data
 
=== Preprocessor operator ===
 
* # preprocessor operator to convert macro arguments to strings
* ## preprocessor operator to concatenate strings
 
=== Escape sequences ===
 
* ! indicates that escape sequences can be used in literal strings.
* $ explicitly indicates that a string literal should not be processed for escape sequences.
 
The accepted escape sequences in text are:
<table border>
<tr bgcolor="#C0C0C0" ><th>Escape Sequence<th class="head">Meaning
<tr><td><span class="pre">\a</span><td>beep
<tr><td><span class="pre">\b</span><td>backspace (BS)
<tr><td><span class="pre">\f</span><td>formfeed (FF)
<tr><td><span class="pre">\n</span> or <span class="pre">\l</span><td>newline
<tr><td><span class="pre">\r</span><td>carriage return (CR)
<tr><td><span class="pre">\t</span><td>horizontal tab
<tr><td><span class="pre">\unnnn</span><td> unicode char in hex
<tr><td><span class="pre">\v</span><td>vertical tab
<tr><td><span class="pre">\nnn</span><td>ascii char in decimal
<tr><td><span class="pre">\&hnn</span><td>ascii char in hex
<tr><td><span class="pre">\&onnn</span><td>ascii char in octal
<tr><td><span class="pre">\&bnnnnnnnn</span><td>ascii char in binary
<tr><td><span class="pre">\\</span><td>backslash
<tr><td><span class="pre">\"</span><td>double quote
<tr><td><span class="pre">\'</span><td>single quote
</table>
 
=== Pointer operations symbols ===
 
* * Dereferences a pointer
* @ Returns the address of a string literal, variable, object or procedure
* [] Returns a reference to memory offset from an address
 
=== Miscellaneous ===
* ? shortcut for ‘print’
 
 
=={{header|Gambas}}==
Line 862 ⟶ 1,038:
 
Comments
<syntaxhighlight lang="haskell">-- comment here until end of line
{- comment here -}</syntaxhighlight>

Operator symbols (nearly any sequence can be used)
<syntaxhighlight lang="haskell">! # $ % & * + - . / < = > ? @ \ ^ | - ~ :
: as first character denotes constructor</syntaxhighlight>

Reserved symbol sequences
<syntaxhighlight lang="haskell">.. : :: = \ | <- -> @ ~ => _</syntaxhighlight>

Infix quotes
<syntaxhighlight lang="haskell">`identifier` (to use as infix operator)</syntaxhighlight>

Characters
<syntaxhighlight lang="haskell">'.'
\ escapes</syntaxhighlight>

Strings
<syntaxhighlight lang="haskell">"..."
\ escapes</syntaxhighlight>

Special escapes
<syntaxhighlight lang="haskell">\a alert
\b backspace
\f form feed
Line 890 ⟶ 1,066:
\r carriage return
\t horizontal tab
\v vertical tab</syntaxhighlight>

Other
<syntaxhighlight lang="haskell">( ) (grouping)
( , ) (tuple type/tuple constructor)
{ ; } (grouping inside let, where, do, case without layout)
[ , ] (list type/list constructor)
[ | ] (list comprehension)</syntaxhighlight>

Unicode characters, according to category:
<syntaxhighlight lang="haskell">Upper case (identifiers)
Lower case (identifiers)
Digits (numbers)
Symbol/punctuation (operators)</syntaxhighlight>
 
=={{header|HicEst}}==
Line 947 ⟶ 1,123:
The closest thing J has to an escape sequence is that paired quotes, in a character literal, represent a single quote character.

<syntaxhighlight lang="j"> '' NB. empty string

'''' NB. one quote character
'
'''''' NB. two quote characters
''</syntaxhighlight>
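
The quote-doubling rule can be modelled with a few lines of Python. This is a sketch for illustration only: <code>j_literal_value</code> is a name invented here, and the function covers only the quoting rule, not J's full word formation.
<syntaxhighlight lang="python">def j_literal_value(literal):
    """Return the characters denoted by a J-style quoted literal."""
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("not a quoted literal")
    # Drop the enclosing quotes, then turn each doubled quote into one quote.
    return literal[1:-1].replace("''", "'")

print(repr(j_literal_value("''")))      # empty string
print(repr(j_literal_value("''''")))    # one quote character
print(repr(j_literal_value("''''''")))  # two quote characters</syntaxhighlight>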
 
Since it's not clear what "special characters" would mean, in the context of J, here is an informal treatment of J's word forming rules:
Line 960 ⟶ 1,136:
A character literal consists of paired quote characters with any other characters between them.

<syntaxhighlight lang="j"> 'For example, this is a character literal'</syntaxhighlight>

A numeric literal consists of a leading numeric character (a digit or _) followed by alphanumeric (numeric or alphabetic) characters, dots and spaces. A sequence of spaces will end a numeric literal if it is not immediately followed by a numeric character.

<syntaxhighlight lang="j"> 1
1 0 1 0 1 0 1
_3.14159e6</syntaxhighlight>

Some numeric literals are not implemented by the language:

<syntaxhighlight lang="j"> 3l1t3
|ill-formed number</syntaxhighlight>

Words consist of an alphabetic character (a-z or A-Z) followed by alphanumeric characters and optionally followed by a sequence of dots or colons. Words which do not contain . or : can be given definitions by the user. The special word NB. continues to the end of the line and is ignored (it's a comment) during execution. Words may also contain the underscore character (_), but a trailing underscore, or two adjacent underscores in a word, has special significance in name lookup.

<syntaxhighlight lang="j"> example=: ARGV NB. example and ARGV are user definable words</syntaxhighlight>

Tokens consist of any other printable character optionally followed by a sequence of dots or colons. (Tokens which begin with . or : must be preceded by a space character).

<syntaxhighlight lang="j"> +/ .* NB. + / . and * are all meaningful tokens in J</syntaxhighlight>
 
Additionally, starting in J9, <code>{{</code> and <code>}}</code> are treated as tokens if they are not immediately followed by a dot (<code>.</code>) or colon (<code>:</code>). Previously, each of these would have been two separate tokens. Additionally, these delimit nestable blocks (previously, J did not implement anonymous nestable blocks -- instead each block needed to be named to be referred to from inside another block). And, in this context, <code>)</code> is a special character. For example <code>{{)n</code> begins a 'noun block' -- a sequence of literal characters, terminated by a linefeed followed by <code>}}</code>.
 
=={{header|Java}}==
Math:
<syntaxhighlight lang="java">& | ^ ~ //bitwise AND, OR, XOR, and NOT
>> << //bitwise arithmetic shift
>>> //bitwise logical shift
+ - * / = % //+ can also be used for String concatenation</syntaxhighlight>
Any of the previous math operators can be placed in front of an equals sign to make a self-operation replacement:
<syntaxhighlight lang="java">x = x + 2 is the same as x += 2
++ -- //increment and decrement--before a variable for pre (++x), after for post(x++)
== < > != <= >= //comparison</syntaxhighlight>
Boolean:
<syntaxhighlight lang="java">! //NOT
&& || //short-circuit AND, OR
^ & | //long-circuit XOR, AND, OR</syntaxhighlight>
Other:
<syntaxhighlight lang="java">{ } //scope
( ) //for functions
; //statement terminator
Line 1,005 ⟶ 1,183:
// //comment prefix (can be escaped by \u unicode escape sequence see below)
/* */ //comment enclosures (can be escaped by \u unicode escape sequence see below)
</syntaxhighlight>
Escape characters:
<syntaxhighlight lang="java">\b //Backspace
\n //Line Feed
\r //Carriage Return
Line 1,016 ⟶ 1,194:
\" //Double Quote
\\ //Backslash
\DDD //Octal Escape Sequence, D is a number between 0 and 7; can only express characters from 0 to 255 (i.e. \0 to \377)</syntaxhighlight>
Unicode escapes:
<syntaxhighlight lang="java">\uHHHH //Unicode Escape Sequence, H is any hexadecimal digit between 0 and 9 and between A and F</syntaxhighlight>
Be extremely careful with Unicode escapes. Unicode escapes are special and are substituted with the specified character ''before'' the source code is parsed. In other words, they apply anywhere in the code, not just inside character and string literals. Variable names can contain foreign characters. It also means that you can use Unicode escapes to write any character in the source code, and it would work. For example, you can say <code>\u002b</code> instead of saying <code>+</code> for addition; you can say <code>String\u0020foo</code> and it would be interpreted as two identifiers: <code>String foo</code>; you can even write the entire Java source file with Unicode escapes, as a poor form of obfuscation.

However, this leads to many problems:
* <code>\u000A</code> will become a line return in the code, which will terminate line-end comments:
<syntaxhighlight lang="java">// hello \u000A this looks like a comment</syntaxhighlight>
: is a syntax error, because the part after <code>\u000A</code> is on the next line and no longer in the comment
* <code>\u0022</code> will become a double-quote in the code, which ends / begins a string literal:
<syntaxhighlight lang="java">"hello \u0022 is this a string?"</syntaxhighlight>
: is a syntax error, because the part after <code>\u0022</code> is outside the string literal
* An invalid sequence of <code>\u</code>, even in comments that usually are ignored, will cause a parsing error:
<syntaxhighlight lang="java">/*
* c:\unix\home\
*/</syntaxhighlight>
: is a syntax error, because <code>\unix</code> is not a valid Unicode escape, even though you think that it should be inside a comment
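
The "substituted before parsing" behaviour can be imitated with a small Python sketch. It is a simplified model only: the function name is invented here, it ignores the Java rule about a backslash that is itself preceded by a backslash, and it silently leaves invalid sequences such as <code>\unix</code> alone instead of reporting an error.
<syntaxhighlight lang="python">import re

# A backslash, one or more u characters, then exactly four hexadecimal digits
UNICODE_ESCAPE = re.compile(r"\\u+([0-9A-Fa-f]{4})")

def translate_unicode_escapes(source):
    """Replace every Unicode escape in the raw source text, regardless of context."""
    return UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), source)

print(translate_unicode_escapes(r"int x = 1 \u002b 2;"))
print(translate_unicode_escapes(r"// hello \u000A this looks like a comment"))</syntaxhighlight>
The second call prints two lines, which shows why the text after <code>\u000A</code> is no longer part of the comment by the time the code is parsed.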
 
Line 1,040 ⟶ 1,218:
Any JSON entity can be specified in a jq program in accordance with the JSON specification. See json.org for details. The following discussion accordingly ignores JSON literals.
 
jq severely restricts the characters that can be used as "identifiers" in a jq program. The regular expression governing the choice of identifiers is currently:<syntaxhighlight lang="jq">^[a-zA-Z_][a-zA-Z_0-9]*$</syntaxhighlight>
 
That is, identifiers are alphanumeric except that _ may also be used.
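
The regular expression above can be tried out directly. The following Python sketch uses it unchanged; <code>is_identifier</code> and the sample names are invented for this illustration and are not part of jq.
<syntaxhighlight lang="python">import re

# The identifier pattern quoted above
IDENTIFIER = re.compile(r"^[a-zA-Z_][a-zA-Z_0-9]*$")

def is_identifier(name):
    """Report whether name matches the identifier pattern."""
    return IDENTIFIER.match(name) is not None

for candidate in ["foo", "_tmp1", "2fast", "my-var"]:
    print(candidate, is_identifier(candidate))</syntaxhighlight>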
Line 1,077 ⟶ 1,255:
Lasso has the following special characters (excluding math / string functions) [http://lassoguide.com/language/operators.html].

<syntaxhighlight lang="lasso"># defined local ie. #mylocal will fail if not defined
$ defined variable ie. $myvar will fail if not defined
= assignment
Line 1,097 ⟶ 1,275:
// comment
/* open comment
*/ close comment</syntaxhighlight>
 
=={{header|LaTeX}}==
Line 1,105 ⟶ 1,283:
To make some of these characters appear literally in output, prefix the character with a \. For example, to typeset 5% of $10 you would type

<syntaxhighlight lang="latex">\documentclass{minimal}
\begin{document}
5\% of \$10
\end{document}</syntaxhighlight>
 
Note that the set of special characters in LaTeX isn't really fixed, but can be changed by LaTeX code. For example, the package <tt>ngerman</tt> (providing German-specific definitions, including easier access to umlaut letters) re-defines the double quote character (") as special character, so you can more easily write German words like "hören" (as <tt>h"oren</tt> instead of <tt>h{\"o}ren</tt>).
Line 1,149 ⟶ 1,330:
<br />
In literal string assignments only " has to be escaped. This can be done by using the constant QUOTE:
<syntaxhighlight lang="lingo">str = "Hello " & QUOTE & "world!" & QUOTE
put str
-- "Hello "world!""</syntaxhighlight>
<br />
The special characters listed above are not allowed in variable or function names.
Line 1,225 ⟶ 1,406:
* ---[[ ]] Uncommented block enclosures
 
=={{header|Mathematica}}/{{header|Wolfram Language}}==
<syntaxhighlight lang="mathematica">Markup :
() Sequence
{} List
Line 1,237 ⟶ 1,418:

Within expression:
\ At end of line: Continue on next line, skipping white space</syntaxhighlight>
 
=={{header|MBS}}==
Line 1,251 ⟶ 1,432:
* '''( )''' function argument enclosures
* '''/*''' and '''*/''' enclosure symbols for alternative style comments
 
=={{header|MIPS Assembly}}==
These are usually standard but may vary depending on the assembler you're using.
* <code>;</code> is the comment character. Block comments are not usually supported.
* <code>$</code> as a prefix indicates a register, e.g. <code>$zero</code>, <code>$a0</code>, <code>$v0</code>, <code>$t0</code>.
* <code>0x</code> before a number indicates that this is a hexadecimal quantity. (All numeric literals get converted to binary eventually, but you can specify them in your source code however you like.)
* <code>0b</code> is the same as above but for binary instead.
* <code>" "</code> are used as enclosures for embedded strings, typically defined as data blocks using the <code>.byte</code> or <code>.ascii</code> directives. A null terminator is not necessarily provided for you.
* <code>' '</code> are used as enclosures for short string operands of instructions, e.g. <code>li $a0,'AB'</code>
* <code>[ ]</code>, when used with the load and store instructions, specifies the register that is being dereferenced. In MIPS Assembly, reading from/writing to memory requires that the desired memory address be loaded into a register first. For example, <code>lw $a0,[$a1]</code> treats the value in <code>$a1</code> as a memory address, loads the 32-bit value stored at that memory address, and puts it into <code>$a0</code>.
* <code>.</code> is often required before an assembler directive.
 
In addition, the typical C-style math operators exist, but can only be used at compile time. They are handy for explaining the meaning of what would otherwise be "magic numbers" (i.e. numeric constants with little to no explanation or context.) Combining C-style operators with <code>equ</code> directives is very helpful for documentation.
<syntaxhighlight lang="mips">VRAMBASE equ 0xA1000008
VRAMSIZE equ 0x12C00
SCREENWIDTH equ 320
 
la $t0,VRAMBASE+SCREENWIDTH ;skip the first row of pixels
li $t1,VRAMSIZE/2 ;we're going to fill half the screen</syntaxhighlight>
 
One challenge you might face is trying to write an apostrophe or quotation mark. The most portable way to do this is to specify its ASCII code directly. As for control codes such as tab, new line, etc., the assembler is unlikely to recognize the C standard codes. Using the ASCII codes for them is your best bet.
<syntaxhighlight lang="mips"> li $t0,0x27 ;trying to use li $t0,''' may confuse the assembler.
 
MyString:
.byte "Hello World",13,10,0 ;13 = carriage return, 10 = linefeed, 0 = null terminator
.align 4</syntaxhighlight>
 
=={{header|MUMPS}}==
Line 1,270 ⟶ 1,477:
From [http://nim-lang.org/manual.html#string-literals the Nim Manual]:
<table border="1" class="docutils"><tr><th>Escape sequence</th><th>Meaning</th></tr>
<tr><td><tt class="docutils literal"><span class="pre">\r</span></tt>, <tt class="docutils literal"><span class="pre">\c</span></tt></td><td><span id="carriage-return_731232527">carriage return</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\n</span></tt>, <tt class="docutils literal"><span class="pre">\l</span></tt></td><td><span id="line-feed_1443601756">line feed</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\f</span></tt></td><td><span id="form-feed_295412702">form feed</span></td></tr>
<tr><td><tt class="docutils literal"><span class="pre">\t</span></tt></td><td><span id="tabulator_1835392927">tabulator</span></td></tr>
Line 1,287 ⟶ 1,493:
 
There are also raw string literals that are preceded with the letter r (or R) and are delimited by matching double quotes (just like ordinary string literals) and do not interpret the escape sequences. This is especially convenient for regular expressions or Windows paths:
<syntaxhighlight lang="nim">var f = open(r"C:\texts\text.txt") # a raw string, so ``\t`` is no tab</syntaxhighlight>
To produce a single " within a raw string literal, it has to be doubled:
<syntaxhighlight lang="nim">r"a""b"</syntaxhighlight>
Produces:
<syntaxhighlight lang="nim">a"b</syntaxhighlight>
 
=={{header|OASYS Assembler}}==
Line 1,327 ⟶ 1,533:
=={{header|Objeck}}==

<syntaxhighlight lang="objeck">
\b //Backspace
\n //Line Feed
Line 1,335 ⟶ 1,541:
\' //Single Quote
\" //Double Q
</syntaxhighlight>
 
Unicode escapes:
<syntaxhighlight lang="objeck">\uHHHH //Unicode Escape Sequence, H is any hexadecimal digit between 0 and 9 and between A and F</syntaxhighlight>
 
=={{header|OCaml}}==
Line 1,369 ⟶ 1,575:
 
Any reserved character can be used as a character value with the #\ prefix.
<syntaxhighlight lang="scheme">
(print #\|) (print #\#)
; ==> 124
; ==> 35
</syntaxhighlight>
 
Any reserved character can be used as part of a symbol name by enclosing the name in | characters. The | character itself cannot be part of a symbol name without using internal libraries.
<syntaxhighlight lang="scheme">
(define |I'm the ,`stra[]ge symbol :))| 123)

Line 1,384 ⟶ 1,590:
(print (+ |I'm the ,`stra[]ge symbol :))| 17))
; ==> 140
</syntaxhighlight>
 
=={{header|PARI/GP}}==
 
While not a special character as such, whitespace is handled differently in gp than in most languages. While whitespace is said to be ignored in free-form languages, it is truly ignored in gp scripts: the gp parser literally removes whitespace outside of strings. Thus
<syntaxhighlight lang="parigp">is square(9)</syntaxhighlight>
is interpreted the same as
<syntaxhighlight lang="parigp">issquare(9)</syntaxhighlight>
or even
<syntaxhighlight lang="parigp">iss qua re(9)</syntaxhighlight>
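
The stripping rule can be modelled in a few lines of Python. This is only an illustrative sketch, not gp's actual preprocessor: the function name <code>strip_outside_strings</code> is made up here, and the handling of backslash escapes inside strings is an assumption.

```python
def strip_outside_strings(src):
    """Drop every whitespace character that is not inside a double-quoted string."""
    out = []
    in_string = False
    escaped = False  # assumption: a backslash escapes the next character in a string
    for ch in src:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif not ch.isspace():
            out.append(ch)
    return "".join(out)


print(strip_outside_strings("iss qua re(9)"))         # issquare(9)
print(strip_outside_strings('print("a b") ; x = 1'))  # print("a b");x=1
```

Under this model all three spellings above reduce to the same text before parsing, while whitespace inside a string is preserved.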
 
===Enclosures===
Can also optionally terminate declarations, eg integer a,b,c and integer a,b,c,$ are equivalent.
: namespace qualification, for example arwen:hiWord() means the one in arwen, not some other hiWord(). See also :=
. decimal separator, or part of .. for slices, or class/struct field access such as this.that.theother.
% deprecated. Was once used for things like %isVar, nowadays it is illegal.
\ outside strings, the only other place this can be used is as part of a path in an include statement.
^ illegal. I suspect it is only in the syntax file to stop error files from being painful on the eyes.
, argument and sequence element separator
= assignment or equality operator, depending on context
; (optional) statement separator
: already described
| illegal. It is probably only in the syntax file to stop profile listings from being painful on the eyes.
() parameter delimiters and precedence override
[] subscripts
However, in string constants, enclosed in apostrophes or (since PL/I for OS/2) quotation marks, a single apostrophe/quote in the string
must be duplicated, thus:
<syntaxhighlight lang="pli">'John''s pen' which is stored as <<John's pen>>
"He said ""Go!"" and opened the door" which is stored as <<He said "Go!" and opened the door>></syntaxhighlight>
Of course, in either of the above the string can be enclosed with the "other" delimiter and no duplication is required.
 
 
The code is based on readable words and only a semicolon (;) as start-of-comment & a normal colon (:) as command separator are used.
<syntaxhighlight lang="purebasic">a=1 ; The ';' indicates that a comment starts
b=2*a: a=b*33 ; b will now be 2, and a=66</syntaxhighlight>
 
=={{header|Python}}==
character set. In a Unicode literal, these escapes denote a Unicode character
with the given value.</li></ol>
 
=={{header|Quackery}}==
 
Quackery has no special symbols and no escape sequences.
 
Any sequence of one or more characters in the range ! to ~ (i.e. ASCII characters 33 to 126, the printable characters) is a valid identifier.

The Quackery compiler first searches the builders dictionary (i.e. compiler directives) from most to least recent, then the names dictionary (i.e. defined functions or procedures, "words" in the Quackery nomenclature) from most to least recent, and finally attempts to parse the identifier as a number in the current base (usually decimal, but anywhere in the range 2 to 36). The search is case sensitive.

This limits which tokens can be overwritten. Creating a word with the same identifier as a builder has no effect, because the builders dictionary is searched first. Creating a builder or a word whose token could be parsed as a number makes that number unavailable as a number. By convention, therefore, lower case is used throughout, except for numbers in a base higher than ten that include letters, which should be all upper case. Names and builders that consist only of digits, with or without a leading minus sign, are to be avoided.
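
The lookup order described above can be sketched in Python. This is a rough model under stated assumptions, not Quackery's implementation: the function <code>resolve</code>, the list-of-pairs dictionaries, and the sample entries are all hypothetical, and Python's <code>int()</code> is more permissive than a real number parser would be (it also accepts lower-case digits, for instance).

```python
def resolve(token, builders, names, base=10):
    """Classify a token: builder first, then word, then number in the given base.

    Each dictionary is a list of (name, definition) pairs, oldest entry first.
    """
    for kind, dictionary in (("builder", builders), ("word", names)):
        for name, definition in reversed(dictionary):  # most recent first
            if name == token:                          # case-sensitive match
                return kind, definition
    try:
        return "number", int(token, base)
    except ValueError:
        return "unknown", token


# Illustrative entries only; the definitions are placeholder strings.
builders = [("[", "start nest"), ("]", "end nest")]
names = [("dup", "old dup"), ("dup", "new dup"), ("10", "a word named 10")]

print(resolve("dup", builders, names))          # ('word', 'new dup')
print(resolve("10", builders, names))           # ('word', 'a word named 10')
print(resolve("FF", builders, names, base=16))  # ('number', 255)
print(resolve("Dup", builders, names))          # ('unknown', 'Dup')
```

The second call shows why all-digit names are discouraged: once a word named 10 exists, the token is never reached by the number parser.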
 
=={{header|Racket}}==
===blanks in digraphs and trigraphs===
Digraphs and trigraphs can have imbedded blanks (or whitespace) between the characters, so:
<syntaxhighlight lang="rexx"> if a¬==b then say 'not equal'
if a ¬== b then say 'not equal'
if a ¬ = = b then say 'not equal'
if a ¬ ,
= = b then say 'not equal'</syntaxhighlight>
are equivalent &nbsp; (assuming the &nbsp; '''¬''' &nbsp; symbol is supported by the REXX interpreter).
===assignment operator symbol===
 
Note that /= is a valid infix operator in some Rexx implementations meaning 'not equal' as in
<syntaxhighlight lang="rexx">If a/=b Then Say 'a is not equal to b'</syntaxhighlight>
Note: the above is not an infix operator for an '''assignment''' as this section is named.
 
* ''' ( ''' &nbsp; &nbsp; is the '''start''' of an expression.
* ''' ) ''' &nbsp; &nbsp; is the &nbsp;'''end'''&nbsp; of an expression.
<syntaxhighlight lang="rexx"> y = (a+b)/(c-d)
z = ((y-2)/(y+2)) / ((a**2 * b**2)* abs(j))
p = 2**(2**3)</syntaxhighlight>
 
===function/subroutine argument enclosures, separators===
* ''' , ''' &nbsp; &nbsp; are used to separate arguments (if any) or to indicate omitted arguments for functions/subroutines.
Arguments may be omitted.
<syntaxhighlight lang="rexx"> tn = time()
tc = time('C')
x = strip(y,,'+')</syntaxhighlight>
 
===comment enclosures===
* ''' ' ''' &nbsp; &nbsp; is called an apostrophe &nbsp; (also called a ''single'' quote)
* ''' " ''' &nbsp; &nbsp; is called a ''double'' quote
<syntaxhighlight lang="rexx"> a = 'ready, set, go!'</syntaxhighlight>
or
<syntaxhighlight lang="rexx"> a = "ready, set, go!"</syntaxhighlight>
To assign a null, two formats that can be used are:
<syntaxhighlight lang="rexx"> nuttin =''
nothing=""</syntaxhighlight>
 
====apostrophe [''' ' '''] duplication====
In string constants (enclosed in ''single'' apostrophes), a literal apostrophe in the string must be duplicated, thus:
<syntaxhighlight lang="rexx"> yyy = 'John''s pen'</syntaxhighlight>
which is stored as
<pre>John's pen
</pre>
An alternate way of expressing the above is:
<syntaxhighlight lang="rexx"> yyy = "John's pen"</syntaxhighlight>
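
The doubling rule is the same for either delimiter, and it can be illustrated with a small Python sketch. The helper <code>decode_literal</code> is hypothetical and is not part of any REXX interpreter; it assumes it is handed one complete, well-formed literal and does not check for stray undoubled quotes inside it.

```python
def decode_literal(literal):
    """Return the stored value of a quoted literal that uses quote doubling."""
    if len(literal) < 2 or literal[0] not in "'\"" or literal[-1] != literal[0]:
        raise ValueError("not a quoted literal")
    quote = literal[0]
    body = literal[1:-1]
    # Only the delimiter that encloses the literal needs to be doubled inside it.
    return body.replace(quote * 2, quote)


print(decode_literal("'John''s pen'"))   # John's pen
print(decode_literal('"John\'s pen"'))   # John's pen
print(decode_literal('"Quote the Raven, ""Nevermore"""'))
```

The last call prints the sentence with a single pair of double quotes around Nevermore.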
 
====quotation mark [''' " '''] duplication====
In string constants (enclosed in ''double'' quotes), a literal double quote in the string must be duplicated, thus:
<syntaxhighlight lang="rexx"> jjj = "Quote the Raven, ""Nevermore"""</syntaxhighlight>
which is stored as
<pre>
The lowercase &nbsp; '''b''' &nbsp; or uppercase &nbsp; '''B''' &nbsp; (letter) acts as a literal character notation marker, enabling binary literals to be stored as
<br>character strings by using binary (bit) representation of character codes:
<syntaxhighlight lang="rexx"> lf = '000001010'b /* (ASCII) */
cr = "1101"B /* (ASCII) */
greeting = '100100001100101011011000110110001101111'b /* (ASCII) Hello */</syntaxhighlight>
 
====hexadecimal literals====
The lowercase &nbsp; '''x''' &nbsp; or uppercase &nbsp; '''X''' &nbsp; (letter) acts as a literal character notation marker, enabling hexadecimal literals to be stored as
<br>character strings by using hexadecimal representation of character codes (which can be in lower or uppercase):
<syntaxhighlight lang="rexx"> lf = '0A'x /* (ASCII) */
lf = 'a'x /*same as above. */
lf = "a"X /*same as above. */
cr = '0D'x /* (ASCII) */
greeting = '48656C6C6F'x /* (ASCII) Hello */</syntaxhighlight>
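
How the hexadecimal digits map onto characters can be sketched in Python. This is an illustration only: <code>pack_hex</code> is a made-up name, and the left-padding of an odd number of digits is assumed from the <code>'a'x</code> example above, not taken from an interpreter's source.

```python
def pack_hex(digits):
    """Pack hexadecimal digits (either case) into a character string."""
    digits = digits.replace(" ", "")
    if len(digits) % 2:
        digits = "0" + digits  # assumed: an odd digit count is padded on the left
    # latin-1 maps each byte value 0..255 onto the character with the same code.
    return bytes.fromhex(digits).decode("latin-1")


print(repr(pack_hex("0A")))     # '\n'
print(repr(pack_hex("a")))      # '\n'
print(pack_hex("48656C6C6F"))   # Hello
```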
 
====literal character digraphs aren't supported====
 
Note that in later versions of the Regina interpreter, &nbsp; '''--''' &nbsp; ''may'' &nbsp; be used to introduce a line comment (if the appropriate option is in effect.)
<syntaxhighlight lang="rexx">something=12 -- an assignment, what else</syntaxhighlight>
<br>The above usage invalidates negative unary operators for classic Rexx. &nbsp; In the following:
<syntaxhighlight lang="rexx">something=--12</syntaxhighlight>
or the string may not be hardcoded within a single REXX statement:
<syntaxhighlight lang="rexx">x='--12 ++12 -12.000 +12 12 12. 012 0012 0012.00 1.2E1 1.2e+1 --12e-00 120e-1 ++12 ++12. 0--12 00--12 --12'
do j=1 for words(x)
interpret 'something=' word(x,j)
say 'j=' j ' x=' x ' something='something
end</syntaxhighlight>
or the expression may be user specified as in:
<syntaxhighlight lang="rexx">say 'enter an expression:'
parse pull x
interpret 'expression=' x
say 'expression=' expression</syntaxhighlight>
can do completely different assignments [or evaluation(s)], depending what version of (classic) REXX is being used.
<br>The following REXX interpreters (for the 1<sup>st</sup> example) assign a &nbsp; '''12''' &nbsp; to the variable &nbsp; '''something''':
Normally, the ''end-of-line'' (or ''end-of-program'') is assumed the end of a REXX statement,
<br>but multiple REXX statements can be used (on one line) by separating them with a semicolon &nbsp; [''';'''].
<syntaxhighlight lang="rexx"> x=5; y=6;z=y**2; say x y z</syntaxhighlight>
 
===continuation symbol===
A REXX statement can be continued on the next line by appending a comma [''','''] as the
<br>last significant symbol on the line to be continued.
<syntaxhighlight lang="rexx"> say 'The sum of the first three numbers in the file "INSTANCE.INPUTS" is',
a b c ' and the total of all three is' grandTotal</syntaxhighlight>
<syntaxhighlight lang="rexx"> say 'The sums of the first three numbers in the file "INSTANCE.INPUTS" is', /*tell sums.*/
a b c ' and the total of all three is' grandTotal</syntaxhighlight>
 
===statement label===
To separate the arguments of a ''subroutine'', ''function'', or '''BIF''' (built-in function) calls/invocations,
<br>the comma &nbsp; [''','''] &nbsp; is used.
<syntaxhighlight lang="rexx"> secondChar = substr(xxx, 2, 1)
thirdChar = substr(xxx,3,1)
pruned = strip(zebra, 'Trailing', ".") /*remove trailing period(s).*/</syntaxhighlight>
Note that a comma is also used for continuation if it is the last significant character on a REXX statement &nbsp; (see ''continuation character'' above).
<br>Also note that REXX only examines the 1<sup>st</sup> character of the (''trailing'') option, and that the ''case'' of the letter is irrelevant.
 
=={{header|Scala}}==
Most of Java's special characters are available in Scala. The big difference is that they are not built into the compiler but defined in the appropriate library classes. Because operators work on classes, they are actual methods of those classes. Example: <syntaxhighlight lang="scala">val n = 1 + 2</syntaxhighlight>This is interpreted as: "1" is of class Int, and its method "+" is called with the parameter "2". (Don't worry, this is later unboxed to e.g. native JVM primitives.)
 
This flexible approach makes it possible to define and redefine (override) operators. Sometimes new ones are invented, but the recommendation is to use this with care.
=={{header|Tcl}}==
As documented in man Tcl, the following special characters are defined:
<syntaxhighlight lang="tcl">{...} ;# group in one word, without substituting content; nests
"..." ;# group in one word, with substituting content
[...] ;# evaluate content as script, then substitute with its result; nests

;# (extends till end of line)
{*} ;# if first characters of a word, interpret as list of words to substitute,
;# not single word (introduced with Tcl 8.5)</syntaxhighlight>
 
=={{header|TXR}}==
* [[ ]] bash specific feature
* (( )) arithmetic expansion enclosures
 
=={{header|V (Vlang)}}==
Note: Further elaboration can be found in the documentation.
<syntaxhighlight lang="Vlang">
// escape codes
windows_newline := '\r\n' // escape special characters like in C
assert windows_newline.len == 2
 
// arbitrary bytes can be directly specified using `\x##` notation where `#` is
// a hex digit
aardvark_str := '\x61ardvark'
assert aardvark_str == 'aardvark'
assert '\xc0'[0] == u8(0xc0)
 
// or using octal escape `\###` notation where `#` is an octal digit
aardvark_str2 := '\141ardvark'
assert aardvark_str2 == 'aardvark'
 
// Unicode can be specified directly as `\u####` where # is a hex digit
// and will be converted internally to its UTF-8 representation
star_str := '\u2605' // ★
assert star_str == '★'
assert star_str == '\xe2\x98\x85' // UTF-8 can be specified this way too.
</syntaxhighlight>
 
=={{header|Wren}}==
 
Apart from these the only character which can be regarded as ''special'' is the underscore. Apart from letters and digits this is the only other character permitted in an identifier. However, an identifier which starts with an underscore is ''always'' an instance field and an identifier which starts with two (or more) underscores is ''always'' a static field of a class. They cannot be used for any other purpose.
 
=={{header|XPL0}}==
The operators that XPL0 supports are listed in the [[Operator_precedence#XPL0]] task, and the escape characters used are listed in the [[Literals/String#XPL0]] task.
 
Other than these, the only character that can be regarded as ''special'' is the underscore. Apart from letters and digits, this is the only other character permitted in an identifier. An identifier must start with a capital letter or an underscore. This avoids the problem of certain words being reserved by the language and not available as identifier names. For example, ''For'' could be used as a variable name, and it would not conflict with the ''for'' loop command word.
 
=={{header|XSLT}}==
*Star(*): rest of, forever: [0..*], [1,*]
*Underline(_): valid in a name but also used as a throw away: a,_,c=...
 
=={{header|Z80 Assembly}}==
* <code>$</code> or <code>&</code> represents a hexadecimal quantity, e.g. <code>&C000</code>
* <code>%</code> represents a binary quantity.
* <code>z</code>, <code>nz</code>: In <code>JP/JR/CALL/RET</code>, this represents that the zero flag is set (equals 1) or clear (equals 0) respectively.
* <code>c</code>, <code>nc</code>: In <code>JP/JR/CALL/RET</code>, this represents that the carry flag is set (equals 1) or clear (equals 0) respectively.
* <code>pe</code>, <code>po</code>: In <code>JP/JR/CALL/RET</code>, this represents that the parity/overflow flag is set (equals 1) or clear (equals 0) respectively. Unfortunately, the same syntax is used regardless of what the flag is representing.
* <code>m</code>, <code>p</code>: In <code>JP/JR/CALL/RET</code>, this represents that the sign flag is set (equals 1) or clear (equals 0) respectively.
* <code>'</code> represents a shadow register (e.g. <code>BC'</code>,<code>DE'</code>,<code>HL'</code>, or <code>AF'</code>.) These registers are never written to directly; rather, an instruction such as <code>EX AF,AF'</code> must be used to exchange the shadow register with the "real" one.
* <code>$</code> or <code>*</code>, when used as the operand of an instruction, can represent the current program counter value, e.g. <code>jp *</code> creates an endless loop, and <code>ld (*+2),a</code> stores the accumulator at the address two bytes after this instruction.
* <code><</code> and <code>></code> represent "the low byte of" and "the high byte of," respectively. For example, if you type <code><&1234</code> the assembler will replace this with <code>&34</code>.
* The comma <code>,</code> separates instruction operands. The operand on the left is the destination, the operand on the right is the source. For example, <code>LD A,B</code> copies the value stored in B, into A.
* Parentheses <code>( )</code> around a register or number operand, indicate a dereference of a pointer. Essentially the equivalent of the unary <code>*</code> operator in C.
* Double quotes are for strings defined as data blocks, e.g. <code>db "Hello World",13,10,0</code>. There are no escape sequences recognized by the assembler. Control codes are specified by separating them by commas ''outside'' the quotation marks, as in the example above.
* Single quotes are typically used for single-character operands, e.g. <code>CP '\'</code> to compare the accumulator to the ASCII code for backslash. Note that all characters in single or double quotes are really aliases for the numeric quantity equal to their ASCII code. So technically you can type any string by typing their ASCII codes without quotation marks, and separating each character with commas. However, it's much more efficient and readable to use quoted strings.
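
The arithmetic behind the <code><</code> and <code>></code> operators, and the equivalence between a quoted string and its list of ASCII codes, can be checked with a short Python sketch. The helper names are made up for illustration; an assembler does this work at assembly time.

```python
def low_byte(value):
    """Low 8 bits of a 16-bit value, the effect of the < prefix."""
    return value & 0xFF


def high_byte(value):
    """High 8 bits of a 16-bit value, the effect of the > prefix."""
    return (value >> 8) & 0xFF


def data_block(text, *codes):
    """Bytes that a line such as db "text",code,code would emit."""
    return list(text.encode("ascii")) + list(codes)


print(hex(low_byte(0x1234)))         # 0x34
print(hex(high_byte(0x1234)))        # 0x12
print(data_block("Hi", 13, 10, 0))   # [72, 105, 13, 10, 0]
```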
 
[[Category: Syntax elements]]