Special characters

From Rosetta Code

Special characters are symbols (single characters or sequences of characters) that have a "special" built-in meaning in the language and typically cannot be used in identifiers.

Escape sequences are methods that the language uses to remove the special meaning from the symbol, enabling it to be used as a normal character, or sequence of characters when this can be done.


Task

List the special characters and show escape sequences in the language.

See also: Quotes

ActionScript

  • , function argument separator
  •  ; statement separator
  • // comment prefix

Enclosures

  • " " literal string enclosures
  • ( ) function argument enclosures
  • /* */ comment block enclosures

Ada

Ada uses the following characters:

  • The alphabet A-Z
  • The digits 0-9
  • The following characters: "#&'()*+,-./:;<=>_|
  • The space character
  • The following compound characters: => .. ** := /= >= <= << >> <>

Identifiers consist of a letter followed by any number of letters or digits, which may be separated by single underscores.

There are no escape sequences in character literals. Any character supported by the source encoding is allowed. The only escape sequence in string literals is "" (a doubled double quotation mark), which denotes ". When characters need to be specified by their code positions (in Unicode), this is done using the 'Val attribute:

with Ada.Text_IO;  use Ada.Text_IO;
 
procedure Test is
begin
Put ("Quote """ & ''' & """" & Character'Val (10));
end Test;
Output:
Quote "'"

Note that character and string literals serve all character and string types. For example with Wide_Wide characters (32-bit) and strings:

with Ada.Wide_Wide_Text_IO;  use Ada.Wide_Wide_Text_IO;
 
procedure Test is
begin
Put ("Unicode """ & ''' & """" & Wide_Wide_Character'Val (10));
end Test;
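
The doubled-quote rule described above can be modelled in a few lines. The following Python sketch is only an illustration of that one rule, not part of any Ada toolchain; the function name decode_ada_string_literal is hypothetical.

```python
def decode_ada_string_literal(literal: str) -> str:
    """Return the text denoted by an Ada-style string literal.

    Models only the rule stated in the text: inside the literal,
    a doubled quotation mark ("") stands for a single one (").
    """
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("a string literal must start and end with a double quote")
    body = literal[1:-1]
    result = []
    i = 0
    while i < len(body):
        if body[i] == '"':
            # A quotation mark inside the literal must be followed by a second one.
            if body[i + 1:i + 2] != '"':
                raise ValueError("unpaired double quote inside literal")
            result.append('"')
            i += 2
        else:
            result.append(body[i])
            i += 1
    return "".join(result)


# The first and third operands of the Put call in the Ada example:
print(decode_ada_string_literal('"Quote """'))
print(decode_ada_string_literal('""""'))
```

The two literals decode to the text Quote " and to a single quotation mark, which matches the pieces concatenated in the Ada example.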

ALGOL 68

ALGOL 68 has several built-in character constants. The following characters are (respectively) the representations of TRUE and FALSE, the blank character ".", the character displayed when a number cannot be printed in the width provided, and the null character indicating the end of characters in a BYTES array.

printf(($"flip:"g"!"l$,flip));
printf(($"flop:"g"!"l$,flop));
printf(($"blank:"g"!"l$,blank));
printf(($"error char:"g"!"l$,error char));
printf(($"null character:"g"!"l$,null character))
Output:
flip:T!
flop:F!
blank: !
error char:*!
null character:

To handle the output movement to (and input movement from) a device ALGOL 68 has the following four positioning procedures:

print(("new page:",new page));
print(("new line:",new line));
print(("space:",space));
print(("backspace:",backspace))
These procedures may not all be supported on a particular device.

If a particular device (CHANNEL) is set possible, then there are three built-in procedures that allow movement about this device.

  • set char number - set the position in the current line.
  • reset - move to the first character of the first line of the first page. For example a home or tape rewind.
  • set - allows the movement to selected page, line and character.

ALGOL 68 pre-dates the current ASCII standard, and hence supports many non-ASCII characters. Moreover, ALGOL 68 had to work on hardware with 6 bits per byte, hence it was necessary to be able to write the same ALGOL 68 code strictly in upper case. Here are the special characters together with their upper-case alternatives (referred to as "worthy characters").

Character  ASCII      Worthy  bold
¢          #          CO      co
≥          >=         GE      ge
≤          <=         LE      le
≠          /= or ~=   NE      ne
¬          ~          NOT     not
∨          \/         OR      or
∧          /\ or &    AND     and
÷          %          OVER    over
×          *          TIMES   times
↑          **         UP      up
↓                     DOWN    down
→          ->         OF      of
⊥ or ×+    *+         I       i
⏨          \          E       e
                      NIL     nil
                      ELEM    elem
                      LWB     lwb
                      UPB     upb
                      LWS     lws
                      UPS     ups

Most of these characters made their way into European standard character sets (e.g. ALCOR and GOST). Ironically, the ¢ character was dropped from later versions of America's own ASCII character set.

The character "⏨" is one ALGOL 68 byte (not two bytes).

AutoHotkey

The escape character defaults to accent/backtick (`).

  • `, = , (literal comma). Note: Commas that appear within the last parameter of a command do not need to be escaped because the program knows to treat them literally. The same is true for all parameters of MsgBox because it has smart comma handling.
  • `% = % (literal percent)
  • `` = ` (literal accent; i.e. two consecutive escape characters result in a single literal character)
  • `; = ; (literal semicolon). Note: This is necessary only if a semicolon has a space or tab to its left. If it does not, it will be recognized correctly without being escaped.
  • `n = newline (linefeed/LF)
  • `r = carriage return (CR)
  • `b = backspace
  • `t = tab (the more typical horizontal variety)
  • `v = vertical tab -- corresponds to ASCII value 11. It can also be produced in some applications by typing Control+K.
  • `a = alert (bell) -- corresponds to ASCII value 7. It can also be produced in some applications by typing Control+G.
  • `f = formfeed -- corresponds to ASCII value 12. It can also be produced in some applications by typing Control+L.
  • Send = When the Send command or Hotstrings are used in their default (non-raw) mode, characters such as {}^!+# have special meaning. Therefore, to use them literally in these cases, enclose them in braces. For example: Send {^}{!}{{}
  • "" = Within an expression, two consecutive quotes enclosed inside a literal string resolve to a single literal quote. For example: Var := "The color ""red"" was found."

AWK

AWK uses the following special characters:

  •  ! logical NOT
  •  ; statement separator
  • # comment marker
  • $ field reference operator and regular expression anchor
  •  % modulus, output format specifier prefix
  • * multiplication operator and regular expression repetition operator
  • + addition operator and regular expression operator
  • - subtraction operator, regular expression range operator
  • , separates items in a list, range pattern delimiter
  • . decimal point and regular expression operator
  • / division operator and regular expression enclosure symbol
  •  : ternary operation component
  •  ; statement and rule separator
  • = assignment operator
  • < comparative less than operator, infeed operator
  • > comparative greater than operator, outfeed operator
  •  ? regular expression match no more than once, ternary operation component
  • \ escape sequence, line continuation, suppression of interpolation (in code and regular expressions), literal character insertion
  • ^ exponentiation, regular expression anchor and complemented bracket expression indicator
  • | regular expression alternation operator, pipefeed operator
  • ~ regular expression binding operator

Digraphs

  • == equality comparative operator
  •  != inequality comparative operator
  • <= comparative less than or equal to
  • >= comparative greater than or equal to
  • >> appendfeed operator
  • && logical AND
  • || logical OR
  • ++ increment nudge operator
  • -- decrement nudge operator
  • += addition compound assignment operator
  • -= subtraction compound assignment operator
  • *= multiplication compound assignment operator
  • /= division compound assignment operator
  • ^= exponent compound assignment operator
  •  %= modulus compound assignment operator
  •  !~ regular expression non containment operator

Enclosures

  • " " literal string enclosures
  • / / regular expression enclosures
  • { } body of code, code block, action enclosure, or body of if/for/while block
  • ( ) conditional constructs in for/while loops; arguments to a function, grouping expression components, regular expression subexpression enclosures, overriding precedence
  • [ ] array element enclosures

In addition, regular expressions and (s)printf have their own "little languages". Note that the ampersand, snail, underscore, backtick and apostrophe symbols have no special meanings in isolation.

BASIC

Assignment operator symbols

  • = assignment operator

Arithmetic operator symbols

  • + addition
  • - subtraction
  • * multiplication
  • / division
  • \ integer division

Data type indicators

  •  % suffix sigil following integer variable names
  • $ suffix sigil following string variable names

Comparative operator symbols

  • = equality
  • < less than
  • > greater than
  • <= less than or equal to
  • >= greater than or equal to
  • <> inequality

Enclosures

  • " " used as enclosures for strings
  • ( ) function argument enclosures, array element reference, and used to dictate mathematical precedence

Output separators

  •  ; move cursor to next column instead of newline and separates redirection stream from data
  • , move cursor to next tabstop instead of newline and alternative to semicolon for separation of stream from data

Statement and argument separators

  •  : separates multiple statements on a line
  • , separates multiple arguments to functions

Redirection operator

  • # prefixes a stream number for input or output redirection

Batch File

Basically, these are the special characters in Batch Files:

  • % (Escape Sequence: %%) - Used for using variables.
  • & (Escape Sequence: ^&) - Used for executing multiple commands in one line.
  • ( and ) (Escape Sequence: ^( and ^), respectively) - grouping symbols, works similar to the curly brackets in Java, C, etc.
  • > (Escape Sequence: ^>) - The "redirection" symbol, used for redirecting the output of a command to a file.
  • < (Escape Sequence: ^<) - Used for sending the content of a file into a command.
  • | (Escape Sequence: ^|) - The "pipe" symbol, Used for sending the output of a command into another command.
  • ^ (Escape Sequence: ^^) - Escapes the next character. (quite weird...)
  • ! (Escape Sequence: ^^!) - Used for using delayed variables. (Required iff delayed variable expansion is enabled)
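
As a rough illustration, the escape sequences listed above can be applied mechanically to a piece of unquoted text. The Python sketch below follows only that list; the names BATCH_ESCAPES and escape_for_batch are hypothetical, and quoting rules or other cmd.exe parsing details are deliberately not modelled.

```python
# Escape sequences copied from the list above; nothing else is modelled.
BATCH_ESCAPES = {
    "%": "%%",
    "&": "^&",
    "(": "^(",
    ")": "^)",
    ">": "^>",
    "<": "^<",
    "|": "^|",
    "^": "^^",
    "!": "^^!",  # only needed when delayed variable expansion is enabled
}


def escape_for_batch(text: str, delayed_expansion: bool = False) -> str:
    """Escape each special character in text according to BATCH_ESCAPES."""
    out = []
    for ch in text:
        if ch == "!" and not delayed_expansion:
            # Without delayed expansion, ! has no special meaning.
            out.append(ch)
        else:
            out.append(BATCH_ESCAPES.get(ch, ch))
    return "".join(out)


print(escape_for_batch("50% & more"))
print(escape_for_batch("done!", delayed_expansion=True))
```

Each character is handled exactly once, so a caret produced by one replacement is never escaped again.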

BBC BASIC

These are the principal special characters, in addition to the regular symbols used in BASIC for arithmetic operations, comparisons, delimiters etc.:

?     A unary or dyadic operator giving 8 bit indirection. 
!     A unary or dyadic operator giving 32 bit indirection. 
#     As a prefix indicates a file channel number.
      As a suffix indicates a 64-bit numeric variable or constant. 
$     As a prefix indicates a 'fixed string' (string indirection).
      As a suffix indicates a string variable.  
%     As a prefix indicates a binary constant e.g. %11101111.
      As a suffix indicates an integer (signed 32-bit) variable. 
&     As a prefix indicates a hexadecimal constant e.g. &EF.
      As a suffix indicates a byte (unsigned 8-bit) variable. 
'     Causes an additional new-line in PRINT or INPUT.
;     Suppresses a forthcoming action, e.g. the new-line in PRINT. 
@     A prefix character for 'system' variables. 
^     A unary operator returning a pointer (address of an object).
      The dyadic exponentiation (raise to the power) operator.
\     The line continuation character, to split code across lines. 
[ ]   Delimiters for assembler statements.
{ }   Indicates a structure.  
~     Causes conversion to hexadecimal, in PRINT and STR$. 
|     A unary operator giving floating-point indirection.
      A delimiter in the VDU statement.  

bc

  • = assignment operator
  •  % modulus operator
  • - negative number prefix
  • == equality comparative operator

Enclosures

  • " " literal string enclosures
  • ( ) expression enclosures, precedence override
  • /* */ block comment enclosures

Bracmat

Almost all ASCII characters that are not alphanumeric are special characters in Bracmat. The one exception (April 2014) is the closing square bracket ]. Any character can be part of an identifier if the identifier is enclosed in double quotes. The only characters that must be escaped are \ and ".

Some special characters (the prefix characters [ ~ / # < > % @ ` ? ! -) can occur inside an unquoted identifier, as long as they are preceded by an alphanumeric character.

The usual control codes can occur in unquoted identifiers if represented as escape sequences \a \b \t \n \v \f \r. If a control code occurs literally in an identifier, then the identifier must be quoted.

Also \" and \\ can occur in unquoted identifiers.

If in doubt whether an identifier needs quotes, use them in your code and see whether Bracmat needs them by inspecting the result of a program listing produced by the built-in function lst$. If the quotes have disappeared, they were not necessary. It is never wrong to enclose an identifier in quotes.

Brainf***

The only characters that mean anything in BF are its commands:

> move the pointer one to the right

< move the pointer one to the left

+ increment the value at the pointer

- decrement the value at the pointer

, input one byte to memory at the pointer

. output one byte from memory at the pointer

[ begin loop if the value at the pointer is not 0

] end loop

All other characters are comments.

C

See C++.

As in C++, ?, #, \, ' and " have special meaning (together with { and }). Trigraphs also work (they are an "old" way to avoid the "old" difficulty of finding characters like { } etc. on some keyboards).

  • = assignment operator, enumeration value
  • , function parameter separator
  •  ; statement separator, loop construct component
  • . element member selector for structures or unions, decimal point
  •  ! logical NOT operator
  • ~ bitwise BWNOT
  • # preprocessor stringize operator, preprocessor directive prefix
  • \ literal character notation prefix, line continuation
  • & address resolution prefix, bitwise BWAND operator
  • * multiplication, pointer resolution, file pointer prefix, indirection
  • + addition, optional unary positive, prefix increment, postfix increment
  • - subtraction, unary negative
  • / division
  • ^ bitwise BWXOR operator
  • | bitwise BWOR operator
  • _ internal library identifier prefix
  •  % modulus, output format specifier prefix
  • > greater than comparative operator
  • < less than comparative operator

Digraphs

  • == equality operator
  •  != inequality operator
  • ++ incremental nudge operator
  • -- decremental nudge operator
  • && logical AND
  • || logical OR
  • += additive compound assignment operator
  • -= subtractive compound assignment operator
  • *= multiplication compound assignment operator
  • /= division compound assignment operator
  •  %= modulus assignment operator
  • <= less than or equal to comparative operator
  • >= greater than or equal to comparative operator
  • &= bitwise BWAND combination assignment operator
  • |= bitwise BWOR combination assignment operator
  • ^= bitwise BWXOR combination assignment operator
  • << bitwise left shift
  • >> bitwise right shift
  • .* pointer to object member
  • -> pointer to structure or union element
  •  :> base operator
  • ## preprocessor concatenation operator
  • #@ preprocessor charizing operator

Trigraphs

  • ->* pointer to pointer member
  • <<= bitwise left shift compound assignment
  • >>= bitwise right shift compound assignment
  • ... used in variadic function declaration

Enclosures

  • " " literal string enclosures
  • { } Group statements together into blocks of code
  • ( ) Enclosure for function parameters
  • < > header filename enclosure
  • /* */ Comment enclosures

Ternary operators

  •  ? , : The hook and colon are used together to produce ternary operation syntax

C99 Extensions

  • // Comment prefix

The C99 standard (but not earlier standards) also recognizes universal character names, as C++ does.

String and character literals are like C++ (or rather the other way around!), and even the meaning and usage of the # character is the same.

C++

C++ has several types of escape sequences, which are interpreted in various contexts. The main characters with special properties are the question mark (?), the pound sign (#), the backslash (\), the single quote (') and the double quote (").

Digraphs

  • // comment prefix
  • << stream insertion (output) operator
  • >> stream extraction (input) operator
  •  :: scope modifier

Trigraphs

Trigraphs are certain character sequences starting with two question marks, which can be used instead of certain characters, and which are always and in all contexts interpreted as the replacement character. They can be used anywhere in the source, including, but not limited to string constants. The complete list is:

Trigraph  Replacement letter
  ??(       [
  ??)       ]
  ??<       {
  ??>       }
  ??/       \
  ??=       #
  ??'       ^
  ??!       |
  ??-       ~

Note that interpretation of those trigraphs is the very first step in C++ compilation, therefore the trigraphs can be used instead of their replacement letters everywhere, including in all of the following escape sequences (e.g. instead of \u00FC (see next section) you can also write ??/u00FC, and it will be interpreted the same way).

Also note that some compilers don't interpret trigraphs by default, since today's character sets all contain the replacement characters, and therefore trigraphs are practically not used. However, accidentally using them (e.g. in a string constant) may change the code semantics on some compilers, so one should still be aware of them.
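
To make the "very first step" concrete, here is a small Python sketch that applies the trigraph table above to a piece of source text, scanning left to right. It is only a model of that one translation step, not a C++ preprocessor; the names TRIGRAPHS and replace_trigraphs are hypothetical.

```python
import re

# The nine trigraphs from the table above, keyed by their third character.
TRIGRAPHS = {
    "(": "[", ")": "]", "<": "{", ">": "}", "/": "\\",
    "=": "#", "'": "^", "!": "|", "-": "~",
}

# Two question marks followed by one of the nine trigraph characters.
_TRIGRAPH_RE = re.compile(r"\?\?([()<>/='!\-])")


def replace_trigraphs(source: str) -> str:
    """Replace every trigraph in source, regardless of context."""
    return _TRIGRAPH_RE.sub(lambda m: TRIGRAPHS[m.group(1)], source)


print(replace_trigraphs("??=include <iostream>"))
print(replace_trigraphs('std::cout << "What??!";'))
```

The second call shows the accidental case mentioned above: the ??! inside the string constant is replaced as well, because the substitution happens before string literals are recognized.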

Universal character names and escaping newlines

Moreover, C++ allows arbitrary Unicode letters to be represented using only the basic execution character set (which is a subset of ASCII), by means of a so-called universal character name. These have one of the forms

\uXXXX
\UXXXXXXXX

where each X is to be replaced by a hex digit. For example, the German umlaut letter ü can be written as

\u00FC

or

\U000000FC

However, letters in the basic execution character set may not be written in this form (but since all those characters are in standard ASCII, writing them as universal character constants would only obfuscate anyway). If the compiler accepts direct usage of non-ASCII characters somewhere in the code, the result must be the same as with the corresponding universal character name. For example, the following two lines, if accepted by the compiler, should have the same effect:

std::cout << "Tür\n";
std::cout << "T\u00FC\n";

Note that in principle, C++ would also allow such letters in identifiers, e.g.

extern int Tür; // if the compiler allows literal ü
extern int T\u00FCr; // should in theory work everywhere

but that's not generally supported by existing compilers (e.g. g++ 4.1.2 doesn't support it).

Another escape sequence working everywhere is to escape the newline: If a backslash is at the end of the line, the next line is pasted to it without any space in between. For example:

int const\
ant; // defines a variable of type int named constant, not a variable of type int const named ant

String and character literal

A string literal is surrounded by double quotes ("). A character literal is surrounded by single quotes ('). Example:

char const* str = "a string literal";
char c = 'x'; // a character literal

The following escape sequences are only allowed inside string constants and character constants:

escape seq.  meaning          ASCII character/codepoint
 \a           alert             BEL ^G/7
 \b           backspace         BS  ^H/8
 \f           form feed         FF  ^L/12
 \n           newline           LF  ^J/10
 \r           carriage return   CR  ^M/13
 \t           tab               TAB ^I/9
 \v           vertical tab      VT  ^K/11
 \'           single quote      '           (unescaped ' would end character constant)
 \"           double quote      "           (unescaped " would end string constant)
 \\           backslash         \           (unescaped \ would introduce escape sequence)
 \?           question mark     ?           (useful to break trigraphs in strings)
 \0           string end marker NUL ^@/0    (special case of octal char value)
 \nnn         (octal char value)            (each n must be an octal digit)
 \xnn         (hex char value)              (each n must be a hexadecimal digit)

Note that C++ doesn't guarantee ASCII. On non-ASCII platforms (e.g. EBCDIC), the rightmost column of course doesn't apply. However, \0 unconditionally has the value 0.

Also note that some compilers add the non-standard escape sequence \e for Escape (that is, the ASCII escape character).

The # character

The # character in C++ is special as it is interpreted only in the preprocessing phase, and shouldn't occur (outside of character/string constants) after preprocessing.

  • If # appears as first non-whitespace character in the line, it introduces a preprocessor directive. For example
#include <iostream>
  • Inside macro definitions, a single # is the stringification operator, which turns its argument into a string. For example:
#include <iostream>
#define STR(x) #x
int main()
{
  std::cout << STR(Hello world) << std::endl; // STR(Hello world) expands to "Hello world"
}
  • Also inside macro definitions, ## is the token pasting operator. For example:
#define THE(x) the_ ## x
int THE(answer) = 42; // THE(answer) expands to the_answer

Note that the # character is not interpreted specially inside character or string literals.

Clojure

See Clojure's Reader documentation.

E

E uses typical C-style backslash escapes within literals. The defined escapes are:

Sequence Unicode Meaning
\b U+0008 (Backspace)
\t U+0009 (Tab)
\n U+000A (Line feed)
\f U+000C (Form feed)
\r U+000D (Carriage return)
\" U+0022 "
\' U+0027 '
\\ U+005C \
\<newline> None (Line continuation -- stands for no characters)
\uXXXX U+XXXX (BMP Unicode character, 4 hex digits)

Consensus has not been reached on handling non-BMP characters. All other backslash-followed-by-character sequences are syntax errors.

Within E quasiliterals, backslash is not special and $\ plays the same role:

? println(`1 + 1$\n= ${1 + 1}`)
1 + 1
= 2

Erlang

Erlang variables can use A-Z, a-z, _ and 0-9. They must start with A-Z or _. Erlang atoms (sort of enums) can use the same characters, but must start with a-z. If the atom is quoted, i.e. starts and ends with ', it can contain any character (except ').
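
The naming rules above translate directly into two patterns. The following Python sketch classifies a token using only the rules as stated in this section; the function name classify_erlang_name is hypothetical, and details of real Erlang that the text does not mention (reserved words, for instance) are ignored.

```python
import re

# Variables: start with A-Z or _, then letters, digits or underscores.
_VARIABLE_RE = re.compile(r"[A-Z_][A-Za-z0-9_]*")
# Unquoted atoms: same characters, but must start with a-z.
_BARE_ATOM_RE = re.compile(r"[a-z][A-Za-z0-9_]*")


def classify_erlang_name(token: str) -> str:
    """Classify token as 'variable', 'atom', 'quoted atom' or 'neither'."""
    if _VARIABLE_RE.fullmatch(token):
        return "variable"
    if _BARE_ATOM_RE.fullmatch(token):
        return "atom"
    # Quoted atoms: delimited by ' and, per the text, containing no ' inside.
    if len(token) >= 2 and token[0] == "'" and token[-1] == "'" and "'" not in token[1:-1]:
        return "quoted atom"
    return "neither"


for name in ["Count", "_ignored", "ok", "'hello world'", "9lives"]:
    print(name, "->", classify_erlang_name(name))
```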

Forth

When Forth fails to interpret a symbol as a defined word, an attempt is made to interpret it as a number. In numerical interpretation there arise a number of special characters:

10 \ single cell number
-10 \ negative single cell number
10. \ double cell number
10e \ floating-point number

Many systems - and the Forth200x standard - extend this set with base prefixes:

#10 \ decimal
$10 \ hex
%10 \ binary

For strings, the Forth200x Escaped Strings extension adds a string-parsing word with very familiar backslashed escapes.

There are otherwise no special characters or escapes in Forth.

Gambas

  • # prefixes stream numbers for input / output redirection
  • ' comment prefix
  •  ; output separator moves cursor to next column instead of newline, separates redirection stream from data
  •  : statement separator
  • . object element separator
  • , separates arguments to functions
  • = assignment, equality
  • + addition, optional unary positive
  • - subtraction, unary negative
  • * multiplication
  • / division
  • < less than
  • > greater than

Digraphs

  • <= less than or equal to
  • >= greater than or equal to
  • <> inequality

Enclosures

  • " " string enclosures
  • ( ) function parameter enclosures, overriding precedence

Go

Within character literals and string literals, the backslash is a special character that begins an escape sequence. Examples are '\n' and "\xFF". These sequences are documented in the language specification.

Special-purpose escape sequences are also defined within the context of certain packages in the standard library, such as html and regexp.

Go keywords, operators, and delimiters are all predefined and are all composed of ASCII characters; however, the character encoding of Go source code is specified to be UTF-8. This allows user-defined identifiers and literals to incorporate non-ASCII characters.

Whitespace is generally ignored except where it delimits tokens, with one exception: newline is a very special character. As explained by the language specification, translation (that is, compilation) involves a step where the tokenizer converts (most) newlines to semicolons, which are then handled as terminators in the grammar of the formal language. Of course you as the programmer, or user of the language, are not involved in this intermediate stage of the compilation process and so the effect you see is somewhat different. The effect for the programmer is that the grammatical structure is partially determined by the 2D layout of the source code.

GUISS

  • , statement separator
  •  : Used as a separator (usually between the user interface component and the component name or gist)
  • > Used to specify user input or selected item
  • [ ] Enclosure for symbol or digraph names

Haskell

Comments

-- comment here until end of line
{- comment here -}

Operator symbols (nearly any sequence can be used)

! # $ % & * + . / < = > ? @ \ ^ | - ~ :
: as first character denotes constructor

Reserved symbol sequences

.. : :: = \ | <- -> @ ~ => _

Infix quotes

`identifier` (to use as infix operator)

Characters

'.'
\ escapes

Strings

"..."
\ escapes

Special escapes

\a alert
\b backspace
\f form feed
\n new line
\r carriage return
\t horizontal tab
\v vertical tab

Other

( )   (grouping)
( , ) (tuple type/tuple constructor)
{ ; } (grouping inside let, where, do, case without layout)
[ , ] (list type/list constructor)
[ | ] (list comprehension)

Unicode characters, according to category:

Upper case (identifiers)
Lower case (identifiers)
Digits (numbers)
Symbol/punctuation (operators)

HicEst

HicEst has no escape characters. Strings may contain all characters. String constants can be delimited by most non-standard characters, usually ' or ".

  •  ! starts a comment. The comment extends to the end of the line.
  • The global variable $ is the current linear left hand side array index in array expressions
  • The global variable $$ is set to the sequence number of either of the last activated toolbar button number, or menu item number, or popup item number
  • If # appears as the first character in a line, it starts the optional appendix section of the script. This terminates the program section. Appendix chapters are not compiled and are therefore not executable. They serve to store information that can be retrieved by the APPENDIX function.

HTML

  • = assignment (within a tag)
  • / prefixes tag closures (within tag enclosures)

Enclosures

  • " " string value enclosures (within a tag)
  • < > tag enclosures
  • <!-- --> comment enclosures

Icon and Unicon

Icon and Unicon strings and csets may contain the following special characters

\b backspace
\d delete
\e escape
\f formfeed
\l linefeed
\n newline
\r return
\t horizontal tab
\v vertical tab
\' single quote
\" double quote
\\ backslash
\ddd octal code
\xdd hexadecimal code
\^c control code

J

The closest thing J has to an escape sequence is that paired quotes, in a character literal, represent a single quote character.

   ''      NB. empty string
 
'''' NB. one quote character
'
'''''' NB. two quote characters
''

Since it's not clear what "special characters" would mean, in the context of J, here is an informal treatment of J's word forming rules:

Lines are terminated by newline characters, and J sentences are separated by newline characters. J sometimes treats sequences of lines specially, in which case a line with a single right parenthesis terminates the sequence.

A character literal consists of paired quote characters with any other characters between them.

   'For example, this is a character literal'

A numeric literal consists of a leading numeric character (a digit or _) followed by alphanumeric (numeric or alphabetic) characters, dots and spaces. A sequence of spaces will end a numeric literal if it is not immediately followed by a numeric character.

   1
1 0 1 0 1 0 1
_3.14159e6

Some numeric literals are not implemented by the language

   3l1t3
|ill-formed number

Words consist of an alphabetic character (a-z or A-Z) followed by alphanumeric characters and optionally followed by a sequence of dots or colons. Words which do not contain . or : can be given definitions by the user. The special word NB. continues to the end of the line and is ignored (it's a comment) during execution. Words may also contain the underscore character (_) but if there's a trailing underscore, or if there's two adjacent underscores in a word, that has special significance in name lookup.

   example=: ARGV NB. example and ARGV are user definable words

Tokens consist of any other printable character optionally followed by a sequence of dots or colons. (Tokens which begin with . or : must be preceded by a space character).

  +/ .*  NB. + / . and * are all meaningful tokens in J

Java

Math:

& | ^ ~ //bitwise AND, OR, XOR, and NOT
>> << //bitwise arithmetic shift
>>> //bitwise logical shift
+ - * / = % //(+ can also be used for String concatenation)

Any of the previous math operators can be placed in front of an equals sign to make a self-operation replacement:

x = x + 2 is the same as x += 2
++ -- //increment and decrement--before a variable for pre (++x), after for post(x++)
== < > != <= >= //comparison

Boolean:

! //NOT
&& || //short-circuit AND, OR
^ & | //non-short-circuit XOR, AND, OR

Other:

{ } //scope
( ) //for functions
; //statement terminator
[ ] //array index
" //string literal
' //character literal
? : //ternary operator
// //comment prefix (can be escaped by \u unicode escape sequence see below)
/* */ //comment enclosures (can be escaped by \u unicode escape sequence see below)

Escape characters:

\b     //Backspace
\n //Line Feed
\r //Carriage Return
\f //Form Feed
\t //Tab
\0 //Null (note: this is actually an octal escape, but handy nonetheless)
\' //Single Quote
\" //Double Quote
\\ //Backslash
\DDD //Octal Escape Sequence, D is a number between 0 and 7; can only express characters from 0 to 255 (i.e. \0 to \377)

Unicode escapes:

\uHHHH //Unicode Escape Sequence, H is any hexadecimal digit between 0 and 9 and between A and F

Be extremely careful with Unicode escapes. Unicode escapes are special and are substituted with the specified character before the source code is parsed. In other words, they apply anywhere in the code, not just inside character and string literals. Variable names can contain foreign characters. It also means that you can use Unicode escapes to write any character in the source code, and it would work. For example, you can say \u002b instead of saying + for addition; you can say String\u0020foo and it would be interpreted as two identifiers: String foo; you can even write the entire Java source file with Unicode escapes, as a poor form of obfuscation.

However, this leads to many problems:

  • \u000A will become a line break in the code, which will terminate line-end comments:
// hello \u000A this looks like a comment
is a syntax error, because the part after \u000A is on the next line and no longer in the comment
  • \u0022 will become a double-quote in the code, which ends / begins a string literal:
"hello \u0022 is this a string?"
is a syntax error, because the part after \u0022 is outside the string literal
  • An invalid sequence of \u, even in comments that usually are ignored, will cause a parsing error:
/*
* c:\unix\home\
*/
is a syntax error, because \unix is not a valid Unicode escape, even though you think that it should be inside a comment
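
The three problems above all follow from the fact that Unicode escapes are substituted before the source is parsed. The Python sketch below models that substitution pass on its own, so its effect on each example can be inspected. It is a simplified illustration, not the Java compiler's implementation; the function name translate_unicode_escapes is hypothetical, and the handling of doubled backslashes is an assumption of this sketch.

```python
import re

_HEX4 = re.compile(r"[0-9a-fA-F]{4}")


def translate_unicode_escapes(source: str) -> str:
    """Replace each \\uHHHH in source with the character it names."""
    out = []
    i = 0
    while i < len(source):
        if source[i] != "\\":
            out.append(source[i])
            i += 1
        elif source[i + 1:i + 2] == "u":
            j = i + 1
            while source[j:j + 1] == "u":  # skip one or more 'u' characters
                j += 1
            digits = source[j:j + 4]
            if not _HEX4.fullmatch(digits):
                raise ValueError("invalid Unicode escape at offset %d" % i)
            out.append(chr(int(digits, 16)))
            i = j + 4
        else:
            # Assumption: a backslash followed by anything else is copied
            # together with the next character, so \\u0041 is left alone.
            out.append(source[i:i + 2])
            i += 2
    return "".join(out)


print(translate_unicode_escapes(r"int x = 1 \u002b 2;"))
print(translate_unicode_escapes(r"// hello \u000A this looks like a comment"))
```

The first call yields ordinary addition, and the second shows the comment being split across two lines. Passing the comment text c:\unix\home\ raises ValueError, mirroring the parsing error described in the last bullet.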

JavaScript

See Java

jq

Any JSON entity can be specified in a jq program in accordance with the JSON specification. See json.org for details. The following discussion accordingly ignores JSON literals.

jq severely restricts the characters that can be used as "identifiers" in a jq program. The regular expression governing the choice of identifiers is currently:
^[a-zA-Z_][a-zA-Z_0-9]*$

That is, identifiers are alphanumeric except that _ may also be used.

jq variables take the form of an identifier preceded by "$", e.g. "$a".

Almost all the ASCII printing characters that are invalid in jq identifiers have special significance in jq programs. There are currently just five exceptions -- ~`^&' -- but "^" has its usual significance in connection with regular expression specifications.

Object Keys[edit]

The restriction on jq identifiers does not apply to object keys. If o is an object, and if k is a key (i.e. a JSON string), then the value of k in o can be accessed as o[k], e.g. o["mykey"]; furthermore, as a convenience, if the string of characters in the key conforms to the rules for jq identifiers, then the form o.id may be used, where id is the key without the enclosing quotation marks. For example, .["mykey"] is synonymous with .mykey.
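As a rough illustration of the rule above, the following Python sketch chooses between the two forms for a given key. The helper name jq_key_path is hypothetical, and the sketch ignores the possibility that some jq versions treat keywords specially in the short form.

```python
import json
import re

# The identifier rule quoted earlier: a letter or underscore,
# followed by letters, digits or underscores.
JQ_IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z_0-9]*")


def jq_key_path(key: str) -> str:
    """Return a jq path expression that looks up `key` in the input object."""
    if JQ_IDENTIFIER.fullmatch(key):
        return "." + key                      # short form, e.g. .mykey
    return ".[" + json.dumps(key) + "]"       # general form, e.g. .["my key"]


for key in ("mykey", "my key", "2nd"):
    print(key, "->", jq_key_path(key))
```

The key is quoted with json.dumps because an object key is a JSON string.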

String Interpolation[edit]

jq also supports "string interpolation". To interpolate the string value of any JSON entity, e, a string literal such as "\(e)" is used. Notice that such interpolating strings are not valid JSON strings themselves.

Lasso[edit]

Lasso has the following special characters (excluding math / string functions) [1].

#	defined local ie. #mylocal will fail if not defined
$ defined variable ie. $myvar will fail if not defined
= assignment
:= assign and return the assigned value
? ternary conditional true ? this
| ternary else false ? this | that
|| or
&& and
! negative operator
{ open capture
} close capture
=> specify givenblock / capture
-> invoke method: mytype->mymethod
& retarget: mytype->mymethod& // returns mytype
^ autocollect from capture: {^ 'this will be outputted' ^}
:: tag prefix, ie. ::mytype->gettype // returns mytype
:: type constraint, ie. define mymethod(p::integer) => #i * 2
\ escape method: ie. \mymethod->invoke(2)
// comment
/* open comment
*/ close comment

LaTeX[edit]

LaTeX has ten special characters: # $ % & ~ _ ^ \ { }

To make some of these characters appear literally in output, prefix the character with a \. For example, to typeset 5% of $10 you would type

5\% of \$10
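A program that generates LaTeX source can apply this escaping mechanically. The Python sketch below is one possible approach under the default special characters; the function name escape_latex is hypothetical. The backslash, tilde and caret are not simply prefixed with \ (that would produce other commands), so the sketch maps them to the standard text commands instead.

```python
# Replacement for each of the ten default special characters.
LATEX_REPLACEMENTS = {
    "#": r"\#",
    "$": r"\$",
    "%": r"\%",
    "&": r"\&",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "\\": r"\textbackslash{}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text: str) -> str:
    """Escape the default LaTeX special characters in plain text."""
    # Working character by character avoids escaping a backslash
    # that was itself produced by an earlier replacement.
    return "".join(LATEX_REPLACEMENTS.get(ch, ch) for ch in text)


print(escape_latex("5% of $10"))
```

The sketch does not account for packages that change the set of special characters.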

Note that the set of special characters in LaTeX isn't really fixed, but can be changed by LaTeX code. For example, the package ngerman (providing German-specific definitions, including easier access to umlaut letters) re-defines the double quote character (") as special character, so you can more easily write German words like "hören" (as h"oren instead of h{\"o}ren).

Lilypond[edit]

  • = assignment
  •  % comment prefix
  • \ command prefix, function name prefix
  • / time signature notation separator
  • # direct scheme expression prefix
  • - sharp or flat semipitch prefix (note: this is not special in a direct scheme expression)
  • . dotted note suffix
  • | barline marker
  • ' raise octave suffix
  • , lower octave suffix
  •  ? cautionary accidental
  • ~ accidental tie
  • \\ combined voice fragment separator

Enclosures[edit]

  • " " title enclosure, voice name enclosure
  • { } compound music expression enclosure, markup text enclosure, expression enclosure
  • < > chord grouping enclosures
  •  %{ %} multiline comment enclosures
  • << >> combined voice fragment enclosures

Lingo[edit]

  • Assignment: =
  • Comment prefix: --
  • Arithmetic operators: * / + -
  • Comparative operators: = <> < > <= >=
  • Decimal point: .
  • Literal string enclosure: "
  • String concatenation operators: & &&
  • Operator for extracting parts of a string: ..
  • Overriding precedence and function argument enclosures: ( )
  • List/property list enclosures: [ ]
  • List/property list element separator: ,
  • Property list key/value assignment: :
  • Line continuation character: \


In literal string assignments only " has to be escaped. This can be done by using the constant QUOTE:

str = "Hello " & QUOTE & "world!" & QUOTE
put str
-- "Hello "world!""


The special characters listed above are not allowed in variable or function names.

Lua[edit]

Assignment[edit]

  • = Assignment

Arithmetic Operators[edit]

  • + Addition, Optional unary positive
  • - Subtraction, Optional unary minus
  • * Multiplication
  • / Division
  •  % Modulus (also character class prefix)
  • ^ Exponent

Comparative Operators[edit]

  • == equality
  • < less than
  • > greater than
  • <= less than or equal to
  • >= greater than or equal to
  • ~= inequality

Concatenation Operators[edit]

  • .. concatenation operator

Length Counter[edit]

  • # Length operator (also used as a directive prefix)

Logical Operators[edit]

  • and logical and
  • or logical or
  • not logical not

Markup Components[edit]

  •  ;
  •  :
  • , list separator, function argument separator
  • . decimal point
  • ... vararg placeholder in function definition
  •  ;:=
  •  ::=

Prefixes[edit]

  • \ Literal character representation prefix
  • -- Comment prefix
  • __ metamethod prefix

Regular expression components[edit]

  • * regular expression repetition operator
  • + regular expression repetition operator
  • - regular expression range operator

Enclosures[edit]

  • ' ' Literal string enclosures (interpolated)
  • " " Literal string enclosures (interpolated)
  • ( ) Overriding precedence, function argument enclosures
  • [ ] Element number enclosure, regular expression enclosure
  • [^ ] complement box enclosure
  • { } Enumeration enclosures
  • --[[ ]] Comment block enclosures
  • ---[[ ]] Uncommented block enclosures

Mathematica[edit]

Markup : 
() Sequence
{} List
" String
\ Escape for following character
(* *) Comment block
base^^number`s
` Context
[[]] Indexed reference
 
Within expression:
\ At end of line: Continue on next line, skipping white space

MBS[edit]

  • ! start of comment line
  • := assignment operator
  • = comparative equality operator
  • ; end of statement marker
  • , argument separator
  • + addition operator
  • * string length assignment
  • " " used as enclosures for strings
  • ( ) function argument enclosures
  • /* and */ enclosure symbols for alternative style comments

MUMPS[edit]

MUMPS doesn't have any special characters among the printable ASCII set. The double quote character, ", is a bit odd when it is intended to be part of a string. You double it, which can look quite odd when it's adjacent to the delimiting edge of a string.
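The doubling rule can be sketched in Python as follows. The helper name mumps_literal is hypothetical; it only shows how a string value would be written as a MUMPS string literal.

```python
def mumps_literal(text: str) -> str:
    """Write `text` as a MUMPS string literal by doubling each double quote."""
    return '"' + text.replace('"', '""') + '"'


print(mumps_literal('"Hello, World!" she typed.'))
```

A value that starts or ends with a quote yields three quotes in a row at the edge of the literal: one delimiter plus the doubled quote.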
USER>Set S1="Hello, World!"  Write S1
Hello, World!
USER>Set S2=""Hello, World!"" Write S2
 
SET S2=""Hello, World!"" Write S2
^
<SYNTAX>
USER>Set S3="""Hello, World!"" she typed." Write S3
"Hello, World!" she typed.
USER>Set S4="""""""Wow""""""" Write S4
"""Wow"""

Nim[edit]

From the Nim Manual:

Escape sequence   Meaning
\n                newline
\r, \c            carriage return
\l                line feed
\f                form feed
\t                tabulator
\v                vertical tabulator
\\                backslash
\"                quotation mark
\'                apostrophe
\ '0'..'9'+       character with decimal value d; all decimal digits directly following are used for the character
\a                alert
\b                backspace
\e                escape [ESC]
\x HH             character with hex value HH; exactly two hex digits are allowed

There are also raw string literals that are preceded with the letter r (or R) and are delimited by matching double quotes (just like ordinary string literals) and do not interpret the escape sequences. This is especially convenient for regular expressions or Windows paths:

var f = openFile(r"C:\texts\text.txt") # a raw string, so ``\t`` is no tab

To produce a single " within a raw string literal, it has to be doubled:

r"a""b"

Produces:

a"b

OASYS Assembler[edit]

The special characters are:

;   Comment (to end of line)
-   Introduces a negative number (not used for subtraction)
=   Load an include file or define a macro
&   Prefix for a method
%   Prefix for a global variable
!   Prefix for an object
.   Prefix for a property
'   Prefix for a vocabulary word
?   Prefix for a class; check if an object is of this class
*   Prefix for a class; create a new object of this class
:   Prefix for a label; define the label
/   Prefix for a label; jump if true
\   Prefix for a label; jump if false
|   Prefix for a label; jump unconditionally
,   Prefix for a local variable or argument
[   Begin a declaration heading or phrase
]   End a declaration heading or phrase
(   Begin a dispatch method
)   End a dispatch method
<   Read through a pointer
>   Write through a pointer
+   The object that the method was called on
"   Begin string literal
{   Begin string literal
~   Special (used for advanced macros)
^   Suffix for pointer types
@   Suffix for object type
#   Suffix for integer type
$   Suffix for string type

Objeck[edit]

 
\b //Backspace
\n //Line Feed
\r //Carriage Return
\t //Tab
\0 //Null
\' //Single Quote
\" //Double Quote

Unicode escapes:

\uHHHH //Unicode Escape Sequence, H is any hexadecimal digit between 0 and 9 and between A and F

OCaml[edit]

Character escape sequences

\\     backslash
\" double quote
\' single quote
\n line feed
\r carriage return
\t tab
\b backspace
\ (backslash followed by a space) space
\DDD where D is a decimal digit; the character with code DDD in decimal
\xHH where H is a hex digit; the character with code HH in hex
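Note that \DDD is decimal here, whereas the same spelling is octal in C-like languages. The Python sketch below decodes the escapes in the table to make the difference concrete. The function name decode_ocaml_escapes is hypothetical, and the sketch covers only the escapes listed above.

```python
import re

SIMPLE_ESCAPES = {
    "\\": "\\", '"': '"', "'": "'",
    "n": "\n", "r": "\r", "t": "\t", "b": "\b", " ": " ",
}

# Backslash followed by three decimal digits, x plus two hex digits,
# or any single character.
ESCAPE = re.compile(r"\\(?:([0-9]{3})|x([0-9A-Fa-f]{2})|(.))", re.DOTALL)


def decode_ocaml_escapes(body: str) -> str:
    """Decode the escapes from the table inside the body of a literal."""
    def replace(match):
        decimal, hexadecimal, simple = match.groups()
        if decimal is not None:
            code = int(decimal, 10)           # decimal, not octal
            if code > 255:
                raise ValueError("character code out of range: " + decimal)
            return chr(code)
        if hexadecimal is not None:
            return chr(int(hexadecimal, 16))
        if simple in SIMPLE_ESCAPES:
            return SIMPLE_ESCAPES[simple]
        raise ValueError("unknown escape: \\" + simple)

    return ESCAPE.sub(replace, body)


print(decode_ocaml_escapes(r"\065\x42"))      # decimal 65, then hex 42
print("\065")                                 # Python reads this one as octal
```

With the table's rules, \065 denotes the character with code 65 (A); Python's own "\065" is octal and denotes the digit 5.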

PARI/GP[edit]

\e escape
\t tab
\n newline

Any other character that is quoted simply becomes itself. In particular, \" is useful for adding quotes inside strings.

While not a special character as such, whitespace is handled differently in gp than in most languages. While whitespace is said to be ignored in free-form languages, it is truly ignored in gp scripts: the gp parser literally removes whitespace outside of strings. Thus

is square(9)

is interpreted the same as

issquare(9)

or even

iss qua re(9)
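The effect can be approximated in Python. This is only a sketch of the behaviour described above, not gp's actual preprocessor; the function name strip_gp_whitespace is hypothetical, and the sketch assumes that a backslash inside a string quotes the next character.

```python
def strip_gp_whitespace(line: str) -> str:
    """Remove whitespace outside string literals, as described for gp."""
    out = []
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(line):
                # Keep the quoted character, so \" does not end the string.
                out.append(line[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif not ch.isspace():
            out.append(ch)
        i += 1
    return "".join(out)


for text in ("is square(9)", "iss qua re(9)", 'print( "a b" )'):
    print(strip_gp_whitespace(text))
```

The first two inputs both come out as issquare(9), while the space inside the string literal of the third survives.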

Enclosures[edit]

  • ( ) function parameter enclosures

Pascal[edit]

  •  ; statement separator
  • . end of program marker
  • , declaration separator, function argument separator
  • = equality operator
  • + addition operator, string concatenation
  • - subtraction operator
  • * multiplication operator

Digraphs[edit]

  •  := assignment operator

Enclosures[edit]

  • ' ' literal string enclosures
  • ( ) function parameter enclosures
  • { } comment enclosures

Perl[edit]

Note that in perl quotation operator designations may temporarily change a symbol into an enclosure and the meaning of symbols may vary according to context.

Assignment operator symbols[edit]

  • = assignment operator

Arithmetic operator symbols[edit]

  • + addition (also optional unary positive number designator)
  • - subtraction (also negative number designator and named unary operator prefix)
  • * multiplication (also used as a prefix sigil for typeglob variables)
  • / division (also used as a regular expression delimiter)
  •  % modulus (also used as a placeholder prefix)
  • ** exponent

Note that perl does not provide an integer division operator, but does support modulus.

Bitwise operator symbols[edit]

  • & bitwise AND operator
  • | bitwise OR operator
  • ^ bitwise XOR operator
  • ~ bitwise NOT operator

Comparative operator symbols[edit]

  • == numeric equality
  • < numeric less than (also used as a readline enclosure)
  • > numeric greater than (also used as a readline enclosure)
  • <= numeric less than or equal to
  • >= numeric greater than or equal to
  • != numeric inequality
  • <=> numeric tristate comparative
  • ~~ smartmatch operator

Comment markers[edit]

  • # prefixes comments

Context switching operators[edit]

  • =()= Array context operator
  • ~~ String context operator
  • -+- Convert numerical prefix to numerical context
  • 0+ Numerical context prefix

Data type indicators[edit]

  • $ prefix sigil and prototyping placeholder for scalar variables (also used as a placeholder modifier for element reordering)
  • @ prefix sigil and prototyping placeholder for array variables
  •  % prefix sigil and prototyping placeholder for associative container variables
  • & prefix sigil and prototyping placeholder for subroutine names (also used as a bitwise AND operator)
  • * prefix sigil and prototyping placeholder for typeglob variables

Enclosures[edit]

  • ' Literal string enclosures
  • " Interpolated string enclosures
  • / Regular expression enclosures
  • ( ) Overriding precedence, list construction, control element enclosures, treat functions as terms rather than operators
  • [ ] Array reference enclosure, Array definition structure
  • < > Readline enclosures
  • { } Group statements together into blocks of code, literal character representation construct
  • $( ) Dereferencing enclosures
  • @{[ ]} Interpolates enclosed array inside a string

Escape sequences[edit]

These escape sequences can be used in any construct with interpolation. See Quote-and-Quote-like-Operators for more info.

\t tab (HT,TAB)
\n newline (NL)
\r carriage return (CR)
\f form feed (FF)
\b backspace (BS)
\a alarm (BEL)
\e escape (ESC)
\0?? octal char example: \033 (ESC)
\x?? hex char example: \x1b (ESC)
\x{???} wide hex char example: \x{263a} (SMILEY)
\c? control char example: \c[ (ESC)
\N{U+????} Unicode character example: \N{U+263D} (FIRST QUARTER MOON)
\N{????} named Unicode character example: \N{FIRST QUARTER MOON}, see charnames

Here document allocation[edit]

  • << The double open chevron symbol may be used to allocate here documents

Nudge operators[edit]

  • ++ incremental nudge operator
  • -- decremental nudge operator
  • ~- decremental nudge (positive numbers only)

Shift operators[edit]

  • << bitshift left (dyadic) (also here document allocation)
  • >> bitshift right (dyadic)

Combination assignment operators[edit]

Arithmetic Combination Assignment Operators[edit]

  • += addition
  • -= subtraction
  • *= multiplication
  • /= division
  • **= exponent
  •  %= modulus

String Manipulation Combination Assignment Operators[edit]

  • x= repetition
  • .= concatenation

Shift Combination Assignment Operators[edit]

  • <<= Binary Shift Left
  • >>= Binary Shift Right

Logical Combination Assignment Operators[edit]

  • ||= OR
  • &&= AND

Bitwise Combination Assignment Operators[edit]

  • |= BWOR
  • &= BWAND
  • ^= BWXOR

Ellipsis, Range, Flip Flop, Concatenation, Repetition[edit]

  • . concatenation (also regular expression wildcard)
  • x repetition operator
  • .. range or flipflop operator, depends on context
  • ... ellipsis operator

Quoting Operators[edit]

  • q literal string enclosure designator
  • qq interpolated string enclosure designator
  • qr regular expression enclosure designator
  • qw word list enclosure designator
  • qx external command enclosure designator

Referencing and dereferencing operators[edit]

  • \ referencing operator (also escape sequencing prefix, and regular expression symbol grouping)
  • $$ dereferencing operator
  • -> dereferencing and associative container lookup

Regular expression symbols[edit]

  • / modifier and delimiter
  • =~ regular expression binding operator
  • ~ regular expression negation operator
  • [ ] match box enclosure
  • [^ ] complement box enclosure
  • \ symbol grouping prefix character
  • . wildcard
  • ( ) grouping subexpressions, phrase enclosure, marked subexpression definition, negation operation enclosures
  • (?: ) non backreferenced grouping enclosures
  • (?= ) positive lookahead enclosures
  • (?! ) negative lookahead enclosures
  • (?<= ) positive lookbehind enclosures
  • (?<! ) negative lookbehind enclosures
  • (? ) enforcement or negation operation enclosures
  • + repetition operator
  • * repetition operator
  • | alternation operator
  • , count separator
  • $ anchor

Special variables[edit]

  • $. sequence number
  • $, output field separator
  • $;
  • $_ default variable
  • $" alternative output field separator
  • $# output specifier for formatted numbers
  • $*
  • $! last system error (errno)
  • $+ last substring matched to a regular expression subpattern
  • $0
  • $/
  • $\
  • $| autoflush flag
  • $& string matched by last regular expression
  • $' substring following last matched regular expression
  • $` substring preceding last matched regular expression
  •  %ENV associative container holding the environment variables
  •  %SIG
  • @+
  • @-
  • @ARGV array containing the command line parameters
  • @F
  • @INC library search path

Statement, argument and element separators[edit]

  •  ; end of statement marker
  • , function argument separator, list element separator

Ternary operators[edit]

  •  ? , : The hook and colon are used together to produce ternary operation syntax

Miscellaneous symbols[edit]


Perl 6[edit]

Technically speaking, all characters are special in Perl 6, since they're all just the result of particular mixes of parse rules. Nevertheless, some characters may appear to be more special than others. (However, we will not document any operators here, which contain plenty of odd characters in Perl 6.)

Sigils:

    $   Item
    @	Positional
    %	Associative
    &	Callable

Twigils may occur only after a sigil, and indicate special scoping:

    *	Dynamically scoped
    ?	Compile-time constant
    ^	Positional placeholder
    :	Named placeholder
    !	Private attribute
    .	Public attribute/accessor
    ~	Slang
    =	Pod data
    <	Named match from $/

Quote characters:

    ''	Single quotes
    ""	Double quotes
    //	Regex
    「」	Quotes that allow no interpolation at all
    <>	Quote words
    «»	Shell-style words

Escapes within single quotes:

    \\	Backslash
    \'	Quote char

Escapes within double quotes:

    {}	Interpolate results of block
    $	Interpolate item
    @	Interpolate list (requires postcircumfix)
    %	Interpolate hash (requires postcircumfix)
    &	Interpolate call (requires postcircumfix)
    \\	Backslash
    \"	Quote char
    \a	Alarm
    \b	Backspace
    \c	Decimal, control, or named char
    \e	Escape
    \f	Formfeed
    \n	Newline
    \o	Octal char
    \r	Return
    \t	Tab
    \x	Hex char
    \0	Null

Escapes within character classes and translations include most of the double-quote backslashes, plus:

    ..	range

Escapes within regexes include most of the double-quote escapes, plus:

    :	Some kind of declaration
    <>	Some kind of assertion
    []	Simple grouping
    ()	Capture grouping
    {}	Side effect block
    .	Any character
    \d	Digit
    \w	Alphanumeric
    \s	Whitespace
    \h	Horizontal whitespace
    \v	Vertical whitespace
    ''	Single quoted string
    ""	Double quoted string
    «	Word initial boundary
    »	Word final boundary
    ^	String start
    ^^	Line start
    $	String end
    $$	Line end

Note that all non-alphanumeric characters are reserved for escapes and operators in Perl 6 regexes.

Any lowercase backslash escape in a regex may be uppercased to negate it, hence \N matches anything that is not a newline.

Phix[edit]

In terms of special characters, Phix is pretty much the polar opposite of languages like Perl, APL, and J, and needs a touch fewer brackets than C-based languages (and obviously far fewer than lisp-based languages).

The following are taken directly from the Phix.syn (syntax colouring) file, which can be edited as needed (for errors or new compiler features):

Delimiters #$:.%\^
Operators , = := == != < <= > >= @= @== @!= @< @<= @> @>= + - * / += -= *= /= @+= @-= @*= @/= .. & &= ? ; : |
Braces ()[]{}
BlockComment /* */ --/* --*/
LineComment --
TokenStart abcedfghijklmnopqrstuvwxyz
TokenStart ABCDEFGHIJKLMNOPQRSTUVWXYZ_
TokenChar 0123456789
Escapes \rnt\'"eE#x0buU

The last line means that escapes in string literals start with a backslash, and there are 14 of them: CR, LF, TAB, backslash, single and double quotes, escape (#1B, e and E allowed), hex byte (# and x allowed), NUL, backspace, and 4 and 8-digit unicode characters.

Taking the others in order:

# hex literal, or #iXXX compiler directive (see Pragmatic directives)
$ roughly means "end". For instance s[2..$] is equivalent to s[2..length(s)]. 
Can also optionally terminate declarations, eg integer a,b,c and integer a,b,c,$ are equivalent.
: namespace qualification, for example arwen:hiWord() means the one in arwen, not some other hiWord(). See also :=
. decimal separator, or part of .. Note there is no dot notation in Phix, such as this.that.theother.
% deprecated. Was once used for things like %isVar, nowadays it is illegal.
\ outside strings, the only other place this can be used is as part of a path in an include statement.
^ illegal. I think it is only in the syntax file to stop error files from being painful on the eyes.
, argument and sequence element separator
= assignment or equality operator, depending on context
:= assignment operator, also used for named parameters
== equality operator. := and == are just slightly more explicit forms of =
!= < <= > >= standard comparison operators
@ roughly means "all", as above but apply to an entire sequence (rarely used)
+ - * / standard maths operators
+= -= *= /= ditto, with assignment/implied operand
@+= @-= @*= @/= as above but applying to an entire sequence (ditto)
.. slice, for example s[4..6] is three elements of s
& concatenation operator, eg "this"&"that" is "thisthat"
&= ditto, with assignment/implied operand
? debug print shorthand
; (optional) statement separator
: already described
| illegal. I think it is only in the syntax file to stop profile listings from being painful on the eyes.
() parameter delimiters and precedence override
[] subscripts
{} sequence formation

PicoLisp[edit]

Markup:
   () []    List
   .        Dotted pair (when surrounded by white space)
   "        Transient symbol (string)
   {}       External symbol (database object)
   \        Escape for following character
   #        Comment line
   #{ }#    Comment block


Read macros:
   '        The 'quote' function
   `        Evaluate and insert a list element
   ~        Evaluate and splice a partial list
   ,        Indexed reference

Within strings:
   ^        ASCII control character
   \        At end of line: Continue on next line, skipping white space

plainTeX[edit]

TeX attaches a category code to each character, which determines its "meaning" for TeX. Macro packages can redefine the category code of any character. Ignoring category codes 10 (blank), 11 (letters) and 12 (a category covering all characters that are neither letters nor "special" characters according to TeX), and a few more that are not of interest here, the only characters whose category code makes them "special" for the purpose of this page when TeX starts are

  • \ %

plainTeX then assigns a few more (some non-printable characters that also get a "special" category code are not listed here)

  • { } $ & # ^ _ ~

and these all are "inherited" by a lot of other macro packages (among these, LaTeX).


PL/I[edit]

PL/I has no escape characters as such. However, in string constants, enclosed in apostrophes or (since PL/I for OS/2) quotation marks, a single apostrophe/quote in the string must be duplicated, thus:

'John''s pen' which is stored as <<John's pen>>
"He said ""Go!"" and opened the door" which is stored as <<He said "Go!" and opened the door>>

Of course, in either of the above the string can be enclosed with the "other" delimiter and no duplication is required.
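The choice between the two delimiters can be sketched in Python. The helper name pli_literal is hypothetical, and the sketch assumes a compiler that accepts quotation marks as delimiters (per the text above, PL/I for OS/2 or later).

```python
def pli_literal(text: str) -> str:
    """Write `text` as a PL/I string constant, avoiding doubling where possible."""
    if "'" in text and '"' not in text:
        # Only apostrophes inside: use quotation marks, no doubling needed.
        return '"' + text + '"'
    # Otherwise use apostrophes and double any apostrophe in the text.
    return "'" + text.replace("'", "''") + "'"


print(pli_literal("John's pen"))
print(pli_literal('He said "Go!" and opened the door'))
```

Doubling is only unavoidable when the text contains both kinds of quote.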

PowerShell[edit]

PowerShell is unusual in that it retains many of the escape sequences of languages descended from C, except that unlike these languages it uses a backtick ` as the escape character rather than a backslash \. For example `n is a new line and `t is a tab.

Progress[edit]

  • . - End of statement marker
  • @ - (alternative ;&)
  • [ - (alternative ;<)
  • ] - (alternative ;>)
  • ^ - (alternative ;*)
  • ` - (alternative ;')
  • { - (alternative ;()
  • | - (alternative ;%)
  • } - (alternative ;))
  • ~ - (alternative ;?)

PureBasic[edit]

There are no escape sequences in character literals. Any character supported by the source encoding is allowed; to insert the double quote (") character, either the constant #DOUBLEQUOTE$ or its ASCII code can be used.

The code is based on readable words and only a semicolon (;) as start-of-comment & a normal colon (:) as command separator are used.

a=1             ; The ';' indicates that a comment starts
b=2*a: a=b*33 ; b will now be 2, and a=66

Python[edit]

(From the Python Documentation):

Unless an 'r' or 'R' prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C. The recognized escape sequences are:

Escape Sequence Meaning Notes
\newline Ignored  
\\ Backslash (\)  
\' Single quote (')  
\" Double quote (")  
\a ASCII Bell (BEL)  
\b ASCII Backspace (BS)  
\f ASCII Formfeed (FF)  
\n ASCII Linefeed (LF)  
\N{name} Character named name in the Unicode database (Unicode only)  
\r ASCII Carriage Return (CR)  
\t ASCII Horizontal Tab (TAB)  
\uxxxx Character with 16-bit hex value xxxx (Unicode only) (1)
\Uxxxxxxxx Character with 32-bit hex value xxxxxxxx (Unicode only) (2)
\v ASCII Vertical Tab (VT)  
\ooo Character with octal value ooo (3,5)
\xhh Character with hex value hh (4,5)

Notes:

  1. Individual code units which form parts of a surrogate pair can be encoded using this escape sequence.
  2. Any Unicode character can be encoded this way, but characters outside the Basic Multilingual Plane (BMP) will be encoded using a surrogate pair if Python is compiled to use 16-bit code units (the default). Individual code units which form parts of a surrogate pair can be encoded using this escape sequence.
  3. As in Standard C, up to three octal digits are accepted.
  4. Unlike in Standard C, exactly two hex digits are required.
  5. In a string literal, hexadecimal and octal escapes denote the byte with the given value; it is not necessary that the byte encodes a character in the source character set. In a Unicode literal, these escapes denote a Unicode character with the given value.
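A few of these escapes can be checked directly. The sketch below assumes Python 3.3 or later, where every str is a Unicode string, so the "Unicode only" restriction and the surrogate-pair caveat in the notes (which describe older versions) do not apply.

```python
hex_escape = "\x41"                                   # exactly two hex digits
octal_escape = "\101"                                 # up to three octal digits
bmp_escape = "\u00e9"                                 # four hex digits
wide_escape = "\U0001F600"                            # eight hex digits
named_escape = "\N{LATIN SMALL LETTER E WITH ACUTE}"  # name from the Unicode database
raw = r"\n"                                           # 'r' prefix: no escape processing

print(hex_escape == octal_escape == "A")
print(bmp_escape == named_escape)
print(len(raw), len("\n"))
print(len(wide_escape))
```

The raw string keeps the backslash and the letter n as two separate characters, while "\n" is a single line feed.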

Racket[edit]

Racket, like many Schemes, has very little "special" syntax. You can use pretty much anything in identifiers in your code, including exotic Unicode characters. (As usual in other Lisps, symbols share the same syntax as identifiers.) Notable exceptions:

  • Parentheses: round, square, curly (all equivalent, only required to be balanced)
  • Spaces are obviously the usual delimiters
  • A period is used for improper pairs and related things, but it is fine if it's in an identifier that has more characters
  • The hash character "#" is used as a general mechanism for various new syntaxes, but it is fine to use in the middle of an identifier
  • Backslash is used to escape any character, making the above characters possible to use
  • Vertical bars can be placed around an identifier to quote it

REXX[edit]

blanks in digraphs and trigraphs[edit]

Digraphs and trigraphs can have imbedded blanks (or whitespace) between the characters, so:

  if a¬==b      then say 'not equal'
if a ¬== b then say 'not equal'
if a ¬ = = b then say 'not equal'
if a ¬ ,
= = b then say 'not equal'

are equivalent   (assuming the   ¬   symbol is supported by the REXX interpreter).

assignment operator symbol[edit]

  • = assignment operator       (There are other methods to assign variables, however.)

compound assignment operators[edit]

REXX doesn't support compound assignment operators, so the

  • +=
  • -=
  • *=
  • /=

digraphs (above) aren't legal for assignments in classic REXX.

Note that /= is a valid infix operator in some Rexx implementations meaning 'not equal' as in

If a/=b Then Say 'a is not equal to b'

Note: the above is not an infix operator for an assignment as this section is named.

arithmetic operator symbols[edit]

  • + addition
  • - subtraction
  • * multiplication
  • / division
  • % integer division
  • // modulus   (or remainder division)
  • ** exponent (only for integer powers)

unary operator symbols[edit]

  • + positive value
  • - negative value
  • \ logical negation
  • ¬ logical negation (some REXXes)
  • ~ logical negation (some REXXes)

comparative operator symbols[edit]

  • = equal to
  • == strictly equal to
  • \= not equal to
  • ¬= not equal to (some REXXes)
  • /= not equal to (some REXXes)
  • ~= not equal to (some REXXes)
  • \== strictly not equal to
  • /== strictly not equal to
  • ¬== strictly not equal to (some REXXes)
  • ~== strictly not equal to (some REXXes)
  • < less than
  • > greater than
  • <= less than or equal to
  • >= greater than or equal to
  • \< not less than
  • ¬< not less than (some REXXes)
  • ~< not less than (some REXXes)
  • \> not greater than
  • ¬> not greater than (some REXXes)
  • ~> not greater than (some REXXes)
  • <> not equal to
  • >< not equal to
  • << strictly less than
  • >> strictly greater than
  • <<= strictly less than or equal to
  • >>= strictly greater than or equal to
  • \<< strictly not less than
  • ¬<< strictly not less than (some REXXes)
  • ~<< strictly not less than (some REXXes)
  • \>> strictly not greater than
  • ¬>> strictly not greater than (some REXXes)
  • ~>> strictly not greater than (some REXXes)
  • ··· and others

logical operator symbols[edit]

  • & logical AND
  • | logical OR
  • && logical XOR
  • \ logical not
  • / logical not (some REXXes)
  • ¬ logical not (some REXXes)
  • ~ logical not (some REXXes)

concatenation operator symbol[edit]

  • || concatenation (or abuttal)

expression enclosures[edit]

The   (   and   )   symbols are used as enclosures for expressions to help/clarify the priority/priorities for evaluation expressions in REXX.

  • (     is the start of an expression.
  • )     is the  end  of an expression.
  y = (a+b)/(c-d)
z = ((y-2)/(y+2)) / ((a**2 * b**2)* abs(j))
p = 2**(2**3)

function/subroutine argument enclosures, separators[edit]

The   (   and   )   symbols are used as enclosures for function arguments in REXX.

  • (     is the start of a list of arguments.
  • )     is the  end  of a list of arguments.
  • ,     are used to separate arguments (if any) or to indicate omitted arguments for functions/subroutines.

Arguments may be omitted.

  tn = time()
tc = time('C')
x = strip(y,,'+')

comment enclosures[edit]

The   /*   and   */   symbols are used as enclosures for comments in REXX.

  • /*     is the start of a comment.
  • */     is the  end  of a comment.

REXX comments may span multiple lines.

Note that REXX supports nesting of comments, so nested comments must
have matching opening (start) and closing (end) comment delimiters.

literal enclosures[edit]

simple literals[edit]

The   '   and   "   symbols are used as enclosures for literal (character) strings.

  • '     is called an apostrophe   (also called a single quote)
  • "     is called a double quote
  a = 'ready, set, go!'

or

  a = "ready, set, go!"

To assign a null, two formats that can be used are:

  nuttin =''
nothing=""

apostrophe [ ' ] duplication[edit]

In string constants (enclosed in single apostrophes), a literal apostrophe in the string must be duplicated, thus:

  yyy = 'John''s pen'

which is stored as

John's pen

An alternate way of expressing the above is:

  yyy = "John's pen"

quotation mark [ " ] duplication[edit]

In string constants (enclosed in double quotes), a literal double quote in the string must be duplicated, thus:

  jjj = "Quote the Raven, ""Nevermore"""

which is stored as

Quote the Raven, "Nevermore"

binary literals[edit]

The lowercase   b   or uppercase   B   (letter) acts as a literal character notation marker, enabling binary literals to be stored as
character strings by using binary (bit) representation of character codes:

  lf = '00001010'b                                       /* (ASCII)       */
cr = "1101"B /* (ASCII) */
greeting = '100100001100101011011000110110001101111'b /* (ASCII) Hello */

hexadecimal literals[edit]

The lowercase   x   or uppercase   X   (letter) acts as a literal character notation marker, enabling hexadecimal literals to be stored as
character strings by using hexadecimal representation of character codes (which can be in lower or uppercase):

  lf       = '0A'x            /* (ASCII)       */
  lf       = 'a'x             /*same as above. */
  lf       = "a"X             /*same as above. */
  cr       = '0D'x            /* (ASCII)       */
  greeting = '48656C6C6F'x    /* (ASCII) Hello */

literal character digraphs aren't supported[edit]

The REXX language doesn't support the use of character representation digraphs (escape sequences) using a backslash [\] symbol.

nudge operators[edit]

REXX doesn't support the use of nudge operators, so the   ++   and   --   symbols aren't special in REXX other than that they are used as unary prefix operators.

Note that in later versions of the Regina interpreter,   --   may   be used to introduce a line comment (if the appropriate option is in effect.)

something=12 -- an assignment, what else


The above usage invalidates negative unary operators for classic Rexx.   In the following:

something=--12

or the string may not be hardcoded within a single REXX statement:

x='--12 ++12 -12.000 +12 12 12. 012 0012 0012.00 1.2E1 1.2e+1 --12e-00 120e-1 ++12 ++12. 0--12 00--12 --12'
do j=1 for words(x)
   interpret 'something=' word(x,j)
   say 'j=' j ' x=' x ' something='something
end

or the expression may be user specified as in:

say 'enter an expression:'
parse pull x
interpret 'expression=' x
say 'expression=' expression

these can perform completely different assignments [or evaluation(s)], depending on which version of (classic) REXX is being used.
The following REXX interpreters (for the 1st example) assign a   12   to the variable   something:

  • PC/REXX
  • Personal REXX
  • R4
  • REXX/imc
  • BREXX
  • CTC REXX
  • CRC REXX
  • OS/2 REXX
  • KEXX
  • OS/400 REXX
  • Regina 3.3 and earlier
  • Regina 3.4 and later if the option is in effect: noSingle_line_comments
  • CMS REXX
  • TSO REXX
  • REXX compiler (IBM)
  • and others.

-- Gerard Schildberger 20:57, 17 February 2013 (UTC)

end-of-statement symbol[edit]

Normally, the end-of-line (or end-of-program) is assumed to be the end of a REXX statement,
but multiple REXX statements can be placed on one line by separating them with a semicolon   [;].

  x=5;  y=6;z=y**2;    say x y z

continuation symbol[edit]

A REXX statement can be continued on the next line by appending a comma [,] as the
last significant symbol on the line to be continued.

  say 'The sum of the first three numbers in the file "INSTANCE.INPUTS" is',
      a b c ' and the total of all three is' grandTotal
  say 'The sums of the first three numbers in the file "INSTANCE.INPUTS" is',  /*tell sums.*/
      a b c ' and the total of all three is' grandTotal

statement label[edit]

A REXX statement label marks a point in the program that can be "jumped to/branched to" by:

  • a signal instruction
  • a call instruction
  • being invoked as a function   x=func(y+4)

A REXX label is any valid REXX symbol followed by a colon [:] with or without leading/intervening/trailing blanks.

argument separator[edit]

To separate the arguments of a subroutine, function, or BIF (built-in function) calls/invocations,
the comma   [,]   is used.

  secondChar = substr(xxx, 2, 1)
  thirdChar  = substr(xxx,3,1)
  pruned     = strip(zebra, 'Trailing', ".")    /*remove trailing period(s).*/

Note that a comma is also used for continuation if it is the last significant character on a REXX statement   (see continuation symbol above).
Also note that REXX only examines the 1st character of the (trailing) option, and that the case of the letter is irrelevant.

period [.][edit]

A period   [.]   can be used for:

  • a decimal point   (in a number)
  • as part of a label   (a label can start and/or end with one or more periods)
  • as part of a variable name   (a variable can't start with a period, but it can end with one or more periods)
  • as a placeholder in a parsing template to indicate that one token is to be ignored (skipped)
  • as a generic stemmed array element;   to assign all elements:   K.=2   assigns   2   to all elements in   K.
  • as a stemmed array index delimiter   (to indicate multiple indexes):   G.2.x = "tuna"
  • (in Regina) a variable starting with a period can be one of several global scope variables that can't be modified by the programmer.

Ruby[edit]

assignment operator symbol[edit]

  • = assignment operator

here document markers[edit]

  • << here document marker
  • <<- alternative here document marker

Scala[edit]

Most of Java's special characters are available in Scala. The big difference is that operators are not built into the compiler but are defined in the appropriate library classes. Because operators work on classes, they are actual methods of those classes. Example:
val n = 1 + 2
This is interpreted as: "1" is of class Int, and its method "+" is called with the parameter "2". (Don't worry, later this will be unboxed to, e.g., native JVM primitives.)

This flexible approach makes it possible to define and redefine (override) operators. Sometimes new ones are invented, but the recommendation is to do this with care.

Seed7[edit]

Within character literals and string literals, the backslash is a special character that begins an escape sequence:

    audible alert    BEL      \a    backslash    (\)   \\
    backspace        BS       \b    apostrophe   (')   \'   
    escape           ESC      \e    double quote (")   \"
    formfeed         FF       \f
    newline          NL (LF)  \n    control-A          \A
    carriage return  CR       \r      ...
    horizontal tab   HT       \t    control-Z          \Z
    vertical tab     VT       \v

Additionally the following escape sequences can be used:

  • A backslash followed by an integer literal and a semicolon is interpreted as the character with the specified ordinal number. Note that the integer literal is interpreted as decimal unless it is written as a based integer.
  • Two backslashes with a sequence of blanks, horizontal tabs, carriage returns and new lines between them are completely ignored. The ignored characters are not part of the string. This can be used to continue a string in the following line. Note that in this case the leading spaces in the new line are not part of the string. Although this possibility exists also for character literals it makes more sense to use it with string literals.

E.g.:

"\""   "'"   "A\"B !"   "Euro: \8364;"   "CRLF\r\n"

SQL[edit]

Any sequence of characters can be used as an identifier if it is enclosed in double quotes.

Other than that, the special characters are:

' '  String literal
" "  Identifier
[ ]  Identifier
` `  Identifier
?    Numbered host parameter
:    Named host parameter
$    Named host parameter
@    Named host parameter
( )  Parentheses
+    Add
-    Subtract/negative
*    Multiply
/    Divide
%    Modulo
&    Bitwise AND
|    Bitwise OR
~    Bitwise NOT
<<   Left shift
>>   Right shift
=    Equal
==   Equal
<>   Not equal
!=   Not equal
<    Less
>    Greater
<=   Less or equal
>=   Greater or equal
||   String concatenation

Tcl[edit]

As documented in man Tcl, the following special characters are defined:

{...}     ;# group in one word, without substituting content; nests
"..."     ;# group in one word, with substituting content
[...]     ;# evaluate content as script, then substitute with its result; nests
$foo      ;# substitute with content of variable foo
$bar(foo) ;# substitute with content of element 'foo' of array 'bar'
\a        ;# audible alert (bell)
\b        ;# backspace
\f        ;# form feed
\n        ;# newline
\r        ;# carriage return
\t        ;# tab
\v        ;# vertical tab
\\        ;# backslash
\ooo      ;# the Unicode character with octal value 'ooo'
\xhh      ;# the character with hexadecimal value 'hh'
\uhhhh    ;# the Unicode character with hexadecimal value 'hhhh'
#         ;# if first character of a word expected to be a command, begin comment
          ;# (extends till end of line)
{*}       ;# if first characters of a word, interpret as list of words to substitute,
          ;# not single word (introduced with Tcl 8.5)

TXR[edit]

Text not containing the character @ is a TXR query representing a match for that text. The sequence @@ encodes a single literal @.

All other special syntax is introduced by @:

  • @# comment
  • @\n # escaped character, embedded into surrounding text. Similar to C escapes, with \e for ASCII ESC.
  • @\x1234 @\1234 Hex or octal escapes: Unicode width, not byte.
  • @symbol variable ref
  • @*symbol variable ref with longest match semantics
  • @{symbol expr ...} variable ref extended syntax
  • @expr directive

Where expr is Lispy syntax which can be an atom, or a list of atoms or lists in parentheses, or possibly a dotted list (terminated by an atom other than nil):

  • (elem1 elem2 ... elemn) proper
  • (elem1 elem2 ... elemn . atom) dotted

Atoms can be:

  • ABc123_4 symbols, represented by tokens consisting of letters, underscores and digits, beginning with a letter. Symbols have packages, e.g., system:foo, but this is not accessible from the TXR lexical conventions.
  • :FoO42 keyword symbols, denoted by colon, which is not part of the symbol name.
  • "string literals"
  • `quasi @literals` with embedded @ syntax
  • 'c' characters
  • 123 integers
  • /reg/ regular expressions

Within literals and regexes:

  • \r various backslash escapes similar to C
  • \\ single backslash

Within literals, quasiliterals and character constants:

  • \' \" \` escape any of the quotes: not available within regex.

The regex syntax is fairly standard fare, with these extensions:

  • ~R complement of R: set of strings other than those that match R
  • R%S match shortest number of repetitions of R prior to S.
  • R&S match R and S simultaneously: the intersection of the set of strings matching S and the set matching R.
  • [] empty class; match nothing, not even the empty string.

UNIX Shell[edit]

The Bourne shell treats the following as special characters:

  • # comment marker
  •  ! logical not (within a test command), complement box operator
  • $ variable referencing prefix
  • & referencing open file descriptors and background process marker
  • * filename and string matching wildcard
  • . inclusion command
  • / pathname separator
  •  : parameter expansion and do nothing command
  •  ; command separator
  • = assignment and parameter expansion operator
  • \ escape sequence prefix
  •  ? wildcard metacharacter
  • ) switch conditional component
  • ` external command enclosure
  • | pipeline connector
  • - parameter expansion operator
  • + parameter expansion operator

Digraphs[edit]

  • *) switch conditional component
  • #! hashbang
  •  ;; switch conditional component
  • << here document operator
  • >> append redirection operator
  • $* single element command line expansion special variable
  • $@ multiple element command line expansion special variable
  • $# number of command line parameters special variable
  • $? special variable holding return value of last operation
  •  :- parameter expansion operator
  •  := parameter expansion operator
  •  :+ parameter expansion operator
  •  :? parameter expansion operator
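
The parameter expansion operators and special variables listed above can be sketched as follows. This is only an illustration in POSIX sh syntax, not a complete reference: the variable names (user, color, missing, marked and so on) are made up, and the $* and $@ forms are shown inside double quotes, which is where their single-word and multiple-word behaviour differs.

```shell
# Sketch only: user, color, missing and the other names are made up.
unset user color missing

shown=${user:-guest}          # :- substitutes a default; user itself stays unset
alt=${user:+known}            # :+ substitutes only when user is set and non-empty
: "${color:=blue}"            # := assigns the default; ':' is the do-nothing command

# :? reports an error when the variable is unset or empty. It runs in a
# subshell here so that the error does not end the script.
( : "${missing:?is not set}" ) 2>/dev/null || failed=yes

set -- one "two words" three  # replace the positional parameters
count=$#                      # $# number of parameters
joined="$*"                   # "$*" expands to a single word
marked=
for arg in "$@"; do           # "$@" expands to one word per parameter
    marked="${marked}[$arg]"
done

echo "$shown $color $count"
echo "$joined"
echo "$marked"
```

The brackets added in the loop make the word boundaries visible: the second parameter keeps its embedded space as one word.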

Enclosures[edit]

  • " " interpolated string enclosures
  • ' ' non interpolated string enclosures
  • [ ] test command substitute and character range enclosures
  • ( ) subshell execution enclosures, empty for function declaration
  • `( )` external subshell execution enclosures
  • [! ] complement box enclosures
  • { } code block enclosures and variable name isolation enclosures
  • ${ } variable name isolation enclosures
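
A short sketch of how several of these enclosures behave, together with the backslash escape prefix from the first list. It assumes a POSIX-compatible shell with the standard tr utility available; fruit, veg and the other names are made up for illustration.

```shell
# Sketch only: fruit, veg and the other names are made up for illustration.
fruit=apple

interpolated="I ate an $fruit"    # " " : $fruit is expanded
literal='I ate an $fruit'         # ' ' : text is kept exactly as written
escaped="I ate an \$fruit"        # \   : removes the special meaning of $
isolated="${fruit}s"              # ${ }: separates the name from the text after it
captured=`echo "$fruit" | tr 'a-z' 'A-Z'`   # ` ` : replaced by the command's output

( fruit=pear )                    # ( ) : subshell, so this assignment is lost
{ veg=leek; }                     # { } : grouped in the current shell, so it persists

if [ "$fruit" = apple ]; then     # [ ] : substitute for the test command
    is_apple=yes
fi

case $fruit in
    [!a-m]*) half=second ;;       # [! ]: any first character NOT in a-m
    *)       half=first  ;;       # *)  : catch-all branch
esac

echo "$interpolated | $literal | $isolated | $captured | $fruit $veg | $half"
```

Single quotes and the backslash give the same result here: both leave the text $fruit in the string unexpanded.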

Extended shell features[edit]

The Korn shell, Bourne Again Shell and POSIX shell provide the following additional special characters:

  • - Korn shell unary arithmetic operator
  • { brace expansion marker
  • ~ home directory expansion operator
  • && Extended syntax for execute if true (on success)
  • || Extended syntax for execute if false (on failure)
  • -- Extended syntax marker for end of command line switches
  • == string comparison operator inside [[ ]] (bash specific feature)

Extended shell enclosures[edit]

  • $( ) Extended syntax for external command capture construct
  • [[ ]] extended conditional expression enclosures (Korn shell and bash)
  • (( )) arithmetic expansion enclosures
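
A minimal sketch of a few of the extended forms, assuming a POSIX-compatible shell such as ksh, bash or dash, with the standard basename and grep utilities. Arithmetic is written in the $(( )) expansion form, and all variable names are made up for illustration.

```shell
# Sketch only; the variable names are made up for illustration.
base=$(basename /usr/local/bin)   # $( )  : capture a command's output
answer=$(( 6 * 7 ))               # $(( )): arithmetic expansion
true  && on_success=yes           # &&    : run the right side only on success
false || on_failure=yes           # ||    : run the right side only on failure

# -- ends the command line switches, so '-v' is taken as the pattern to
# search for rather than as a switch to grep.
hits=$(printf '%s\n' 'a -v b' | grep -c -- '-v')

echo "$base $answer $on_success $on_failure $hits"
```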

XSLT[edit]

XSLT is based on XML, and so has the same special characters which must be escaped using character entities:

  • & - &amp;
  • < - &lt;
  • > - &gt;
  • " - &quot;
  • ' - &apos;

Any Unicode character may also be represented via its decimal code point (&#nnnn;) or hexadecimal code point (&#xdddd;).

zkl[edit]

  • In a string: The C ones: \r \n \t \f \e \b \ (the escape)
  • //, #, /* */ comments, no escaping
  • C-like math & logic: == != > < <= >= + - * / += -= /= *= and or
  • Block scope: { }
  • Brackets([]): Subscripting, range([0..10]), attributes(class [static] foo {})
  • Function/method call: name(...)
  • Method access: .method
  • Compose this chunk of stuff in that chunk of stuff: :(colon)
  • List comprehension: [[ ]], [& ]]
  • Ops (for use in function calls): '+ '- '* '/ '> '>= '< '<=
  • Closures: 'wrap
  • Comma(,): separate parameters, list assignment
  • Star(*): rest of, forever: [0..*], [1,*]
  • Underline(_): valid in a name but also used as a throw away: a,_,c=...