Quoting constructs

From Rosetta Code
You are encouraged to solve this task according to the task description, using any language you may know.

Pretty much every programming language has some form of quoting construct to allow embedding of data in a program, be it literal strings, numeric data or some combination thereof.

Show examples of the quoting constructs in your language. Explain where they would likely be used, what their primary use is, what limitations they have and why one might be preferred over another. Is one style interpolating and another not? Are there restrictions on the size of the quoted data? The type? The format?

This is intended to be open-ended and free form. If you find yourself writing more than a few thousand words of explanation, summarize and provide links to relevant documentation; but do provide at least a fairly comprehensive summary here, on this page, NOT just a link to [See the language docs].

Note: This is primarily for quoting constructs for data to be "embedded" in some way into a program. If there is some special format for external data, it may be mentioned but that isn't the focus of this task.

6502 Assembly

The following uses VASM syntax for quoting constructs. There is no built-in support for interpolation, escape characters, etc. What constitutes an escape character depends on the code that is using the embedded strings. A function that needs this embedded data can take as an argument a pointer to the data, which can easily be obtained by loading the low byte and high byte of the label into consecutive zero page RAM locations.

LookUpTable: db $00,$03,$06,$09,$12             ;a sequence of pre-defined bytes

MyString: db "Hello World!",0                   ;a null-terminated string

GraphicsData: incbin "C:\game\gfx\tilemap.chr"  ;a file containing the game's graphics

Most 6502 assemblers use Motorola syntax, which uses the following conventions:

  • A value with no prefix is interpreted as a base 10 (decimal) number. $ represents hexadecimal and % represents binary. Single or double quotes represent ASCII.
  • Unlike immediate values in instructions, quoted data does NOT begin with a #. For example, dw $1234 represents the literal value 0x1234, not "the value stored at memory address 0x1234."
  • Multiple values can be put on the same line, separated by commas. DB only needs to be before the first data value on that line. Or, you can put each value on its own line. Both are valid and have the same end result when the code is assembled.

6502 Assembly uses db or byte for 8-bit data and dw or word for 16-bit data. 16-bit values are written by the programmer in big-endian, but stored little-endian. For example, the following two data blocks are equivalent. You can write it either way, but the end result is the same.

dw $ABCD
db $CD,$AB
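
As a rough illustration outside the assembler, the equivalence of the two data blocks can be checked in Python. This is only a sketch: the helper names dw and db are invented here to mirror the directives, and it assumes the assembler emits each 16-bit value low byte first, as described above.

```python
import struct

def dw(*values: int) -> bytes:
    """Model of `dw`: each 16-bit value is emitted low byte first."""
    return b"".join(struct.pack("<H", v) for v in values)

def db(*values: int) -> bytes:
    """Model of `db`: each value is emitted as a single byte."""
    return bytes(values)

# The two data blocks produce the same bytes in memory.
assert dw(0xABCD) == db(0xCD, 0xAB)
print(dw(0xABCD).hex())  # cdab
```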

Most assemblers support "C-like" operators, and there are a few additional ones:

  • < or LOW() means "The low byte of." For example, <$3456 evaluates to $56.
  • > or HIGH() means "The high byte of." For example, >$78AB evaluates to $78.

These two operators are most frequently used with labeled memory addresses, like so:

lookup_table_lo:
byte <Table00,<Table01,<Table02
lookup_table_hi:
byte >Table00,>Table01,>Table02
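
The split into a low-byte table and a high-byte table can be modelled in Python as follows. This is a sketch only: the three addresses are hypothetical stand-ins for Table00, Table01 and Table02, and the function names are made up for the illustration.

```python
def low(value: int) -> int:
    """Low byte of a 16-bit value, like the < operator."""
    return value & 0xFF

def high(value: int) -> int:
    """High byte of a 16-bit value, like the > operator."""
    return (value >> 8) & 0xFF

# Hypothetical addresses standing in for Table00, Table01, Table02.
table_addresses = [0x8000, 0x8123, 0x82FF]

lookup_table_lo = bytes(low(a) for a in table_addresses)
lookup_table_hi = bytes(high(a) for a in table_addresses)

def table_address(index: int) -> int:
    """Rebuild a 16-bit address from the two parallel byte tables."""
    return lookup_table_lo[index] | (lookup_table_hi[index] << 8)

assert [table_address(i) for i in range(3)] == table_addresses
```

Keeping the two halves in separate tables means a single 8-bit index selects both bytes of an entry, which is presumably why the pattern is common on an 8-bit CPU.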


68000 Assembly

Formatting is largely dependent on the assembler and the syntax. Generally speaking, assemblers that use Motorola syntax follow these conventions:

  • A value with no prefix is interpreted as a base 10 (decimal) number. $ represents hexadecimal and % represents binary. Single or double quotes represent ASCII.
  • Unlike immediate values in instructions, quoted data does NOT begin with a #. For example, DC.L $12345678 represents the literal value 0x12345678, not "the value stored at memory address 0x12345678."
  • The length of the data must be specified, and if the value given is smaller than that size, it will get padded to the left with zeroes. If the value is too big to fit in the specified size, the assembler reports an error and assembly is cancelled.
  • Multiple values can be put on the same line, separated by commas. DC._ only needs to be before the first data value on that line. Or, you can put each value on its own line. Both are valid and have the same end result when the code is assembled. You should always follow byte data with EVEN which will add an extra byte of padding if the total number of bytes before it was odd. This is necessary for your code to comply with the CPU's alignment rules.
ByteData:
DC.B $01,$02,$03,$04,$05
even
WordData:
DC.W $01,$02
DC.W $03,$04
;the above was the same as DC.W $0001,$0002,$0003,$0004
LongData:
DC.L $00000001,$00000002,$00000004,$00000008

MyString:
DC.B "Hello World!",0 ;a null terminator will not be automatically placed. 
even
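
The size, padding and alignment rules above can be sketched in Python. This is a model under stated assumptions, not an assembler: the helper names dc and even are invented to mirror the directives, values are taken to be unsigned, and an oversized value raises ValueError in place of the assembler's error.

```python
import struct

# The 68000 stores multi-byte data big-endian (most significant byte first).
_FORMATS = {"B": ">B", "W": ">H", "L": ">I"}

def dc(size: str, *values: int) -> bytes:
    """Model of DC.B / DC.W / DC.L: pad each value on the left to the given size."""
    fmt = _FORMATS[size]
    try:
        return b"".join(struct.pack(fmt, v) for v in values)
    except struct.error as exc:
        raise ValueError(f"value does not fit in DC.{size}") from exc

def even(data: bytes) -> bytes:
    """Model of EVEN: add one padding byte if the length so far is odd."""
    return data + b"\x00" if len(data) % 2 else data

byte_data = even(dc("B", 0x01, 0x02, 0x03, 0x04, 0x05))
word_data = dc("W", 0x01, 0x02) + dc("W", 0x03, 0x04)

assert len(byte_data) == 6                                    # five bytes plus one pad byte
assert word_data == dc("W", 0x0001, 0x0002, 0x0003, 0x0004)   # same bytes either way
```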

DS._ reserves a block of space. The number after it specifies how many bytes/words/longs' worth of zeroes to place. Some assemblers support fill values besides zero, others do not.

DS.B 8  ;8 bytes, each equals 0
DS.W 16 ;16 words, each equals zero
DS.L 20 ;20 longs, each equals zero

In addition to constants, a label can also be specified. If a label is defined with an EQU statement, the label will be replaced with the assigned value during the assembly process.

ScreenSize equ $1200

MOVE.W (MyData),D0

MyData
DC.W ScreenSize

Code labels, on the other hand, get replaced with the memory address they point to. This can be used to make a lookup table of various data areas, functions, etc. Since there is no "24-bit" data constant directive, you'll have to use DC.L for code labels. The top byte will always be zero in this case.

PrintString:
;insert your code here

FunctionTable:
DC.L PrintString ;represents the address of the function "PrintString"

Most constants can be derived from compile-time expressions, which are useful for explaining what the data actually means. The expressions are evaluated during the assembly process, and the resulting object code will have these calculations already completed, so your program doesn't have to waste time doing them. Most "C-like" operators are supported, but as always the exact syntax depends on your assembler. Parentheses will aid the assembler in getting these correct, but sometimes it still doesn't do what you expect.

DC.B $0200>>3     ;evaluates to $0040. As long as the final result fits within the designated storage size, you're good.
DC.W 4+5          ;evaluates to $0009
DC.W (40*30)-1    ;evaluates to 1199 decimal ($04AF)
DC.L MyFunction+4 ;evaluates to the address of MyFunction, plus 4.

We can use this technique to get the length of a region of data, which the assembler can calculate for us.

TilemapCollision:
	DC.B $11,$11,$11,$11,$11,$11,$11,$11,$11,$11
	DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01
	DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01
	DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01
	DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01
	DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01
	DC.B $11,$00,$00,$00,$00,$00,$00,$00,$00,$01
	DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01
	DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01
	DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01
	DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01
	DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01
	DC.B $10,$00,$00,$00,$00,$00,$00,$00,$00,$01
	DC.B $11,$11,$11,$11,$11,$11,$11,$11,$11,$11
TilemapCollisionEnd:

MOVE.W #(TilemapCollisionEnd-TilemapCollision)-1,D0
;gets the length of this region of memory, minus 1, into D0. 
;  Again, even though the "operands" of this expression are longs, 
;  their difference fits in 16 bits and that's all that matters.
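
The label arithmetic can be checked by hand with a short Python sketch. The start address is hypothetical (the assembler picks the real one), and only the size of the table matters here, so the contents are placeholders.

```python
ROWS = 14           # number of DC.B lines in the table above
BYTES_PER_ROW = 10  # values on each DC.B line

# Placeholder contents; only the length is relevant to the calculation.
tilemap_collision = bytes(ROWS * BYTES_PER_ROW)

start = 0x00001000                    # hypothetical address of TilemapCollision
end = start + len(tilemap_collision)  # address of TilemapCollisionEnd

operand = (end - start) - 1
assert operand == 139
assert 0 <= operand <= 0xFFFF         # fits in the 16-bit MOVE.W operand
```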


For quoting binary data in another file, you can use the incbin directive to embed it directly in your source code. This is handy for graphics data and music.

Applesoft BASIC

Real precision numbers (also called "floating point" numbers) and quoted strings are constructs within expressions. Literal strings, real precision numbers, and quoted strings are constructs within DATA statements.

Numbers start with a digit from 0 to 9 or a sign + or - and can include a two digit signed exponent. The real numbers must be in the range from -1.7E+38 to 1.7E+38. Reals with an absolute value less than 2.9388E-39 are converted to zero.

Quoted strings start with the double quote. Quoted strings are terminated with the double quote or by the end of a statement. A quote cannot be embedded in a quoted string. Most control characters can be embedded in quoted strings, but this is usually discouraged.

Literals can be used in DATA statements. These are strings that do not start with a double quote and can have a double quote included in the literal string.

Quoted constructs within expressions

? 0 : ? -326.12E-5 : ? HELLO : ? "HELLO" : ? "HELLO

The literal HELLO is interpreted as a variable name, and its value 0 is printed.

Output:
0
-3.2612E-03
0
HELLO
HELLO

Quoted constructs within DATA statements

 10  DATA 0,-326.12E-5,HELLO,"HELLO","HELLO
 20  READ A%: PRINT A%: READ A: PRINT A: READ A$: PRINT A$: READ A$: PRINT A$: READ A$: PRINT A$
 30  DATA AB"C
 40  READ A$: PRINT A$
Output:
0
-3.2612E-03
HELLO
HELLO
HELLO
AB"C
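
The way items in a DATA statement are delimited can be approximated with a small Python model. This is a simplified sketch based only on the rules and output shown above, not on the interpreter's actual code; the function name data_items is made up, and skipping leading spaces before an item is an assumption.

```python
def data_items(statement: str) -> list:
    """Split the text after the DATA keyword into raw string items."""
    items = []
    pos = 0
    while True:
        # Assumption: spaces before an item are not part of the item.
        while pos < len(statement) and statement[pos] == " ":
            pos += 1
        if statement.startswith('"', pos):
            close = statement.find('"', pos + 1)
            if close == -1:
                # Unterminated quoted string: it runs to the end of the statement.
                items.append(statement[pos + 1:])
                return items
            items.append(statement[pos + 1:close])
            comma = statement.find(",", close + 1)
        else:
            # Unquoted literal: runs to the next comma and may contain a quote.
            comma = statement.find(",", pos)
            items.append(statement[pos:] if comma == -1 else statement[pos:comma])
        if comma == -1:
            return items
        pos = comma + 1

print(data_items('0,-326.12E-5,HELLO,"HELLO","HELLO'))
print(data_items('AB"C'))
```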

BQN

BQN programs manipulate data of seven types:

  • Character
'a'
'b'
@

@ is a symbol that represents the null character. A character literal can contain a newline, although @+10 is the recommended way to write one.

  • Number
123
1.23
123E5
¯1234
∞
π

∞, ¯∞ and π are constants which represent infinity, negative infinity and pi.

  • Function: A block {} which takes one or two arguments: 𝕩 and/or 𝕨
  • 1-Modifier: A block similar to a function, which can take 1 extra function argument 𝔽 on the left.
  • 2-Modifier: A modifier which can take two function arguments, 𝔽 and 𝔾.
  • Namespace: A block where any data member is exported using assignment.
  • Array: consists of any of the above.
    • Regular array notation
      ⟨1, 2, 3⟩
      You can nest arrays in arrays. Separators can be , ⋄ and newline.
    • Stranding
      1‿2‿3
      Any expression which doesn't fit in a single atom must be put in parentheses.
    • Strings
      "Hello World"
      "Quoted "" String"
      any sequence of characters including newlines can be put inside a string. Quotes are escaped by typing two quotes.
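
The quote-doubling rule for strings is simple enough to model in a few lines of Python. This is an illustrative sketch only; the two function names are invented and nothing here comes from a BQN implementation.

```python
def bqn_string_literal(text: str) -> str:
    """Write text as a double-quoted literal, doubling any embedded quote."""
    return '"' + text.replace('"', '""') + '"'

def bqn_string_value(literal: str) -> str:
    """Recover the text from a double-quoted literal with doubled quotes."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("not a double-quoted literal")
    return literal[1:-1].replace('""', '"')

assert bqn_string_literal('Quoted " String') == '"Quoted "" String"'
assert bqn_string_value('"Quoted "" String"') == 'Quoted " String'
```

Since the doubled quote is the only escape, every other character, newlines included, passes through unchanged.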

FreeBASIC

Translation of: Ring
'FB has no built-in substr function, so this one is
'taken from the forum at https://www.freebasic.net/forum/index.php

Function substr(Byref soriginal As String, Byref spattern As Const String, Byref sreplacement As Const String) As String
    ' in <soriginal> replace all occurrences of <spattern> by <sreplacement>
    Dim As Uinteger p, q
    
    If sreplacement <> spattern Then
        p = Instr(soriginal, spattern)
        If p Then
            q = Len(sreplacement)
            If q = 0 Then q = 1
            Do
                soriginal = Left(soriginal, p - 1) + sreplacement + Mid(soriginal, p + Len(spattern))
                p = Instr(p + q, soriginal, spattern)
            Loop Until p = 0
        End If
    End If
    Return soriginal
End Function

Dim As String text(1 To 3)
text(1) = "This is 'first' example for quoting"
text(2) = "This is second 'example' for quoting"
text(3) = "This is third example 'for' quoting"

For n As Integer = 1 To Ubound(text)
    Print !"text for quoting:\n"; text(n)
    Print !"quoted text:\n"; substr(text(n),"'",""); !"\n"
Next n
Sleep
Output:
Same as Ring input.

Go

package main

import (
    "fmt"
    "os"
    "regexp"
    "strconv"
)

/* Quoting constructs in Go. */

// In Go a Unicode codepoint, expressed as a 32-bit integer, is referred to as a 'rune'
// but the more familiar term 'character' will be used instead here.

// Character literal (single quotes).
// Can contain any single character including an escaped character.
var (
    rl1 = 'a'
    rl2 = '\'' // single quote can only be included in escaped form
)

// Interpreted string literal (double quotes).
// A sequence of characters including escaped characters.
var (
    is1 = "abc"
    is2 = "\"ab\tc\"" // double quote can only be included in escaped form
)

// Raw string literal (single back quotes).
// Can contain any character including a 'physical' new line but excluding back quote.
// Escaped characters are interpreted literally i.e. `\n` is backslash followed by n.
// Raw strings are typically used for hard-coding pieces of text perhaps
// including single and/or double quotes without the need to escape them.
// They are particularly useful for regular expressions.
var (
    rs1 = `
first"
second'
third"
`
    rs2 = `This is one way of including a ` + "`" + ` in a raw string literal.`
    rs3 = `\d+` // a sequence of one or more digits in a regular expression
)

func main() {
    fmt.Println(rl1, rl2) // prints the code point value not the character itself
    fmt.Println(is1, is2)
    fmt.Println(rs1)
    fmt.Println(rs2)
    re := regexp.MustCompile(rs3)
    fmt.Println(re.FindString("abcd1234efgh"))

    /* None of the above quoting constructs can deal directly with interpolation.
       This is done instead using library functions.
    */

    // C-style using %d, %f, %s etc. within a 'printf' type function.
    n := 3
    fmt.Printf("\nThere are %d quoting constructs in Go.\n", n)

    // Using a function such as fmt.Println which can take a variable
    // number of arguments, of any type, and print them out separated by spaces.
    s := "constructs"
    fmt.Println("There are", n, "quoting", s, "in Go.")

    // Using the function os.Expand which requires a mapper function to fill placeholders
    // denoted by ${...} within a string.
    mapper := func(placeholder string) string {
        switch placeholder {
        case "NUMBER":
            return strconv.Itoa(n)
        case "TYPES":
            return s
        }
        return ""
    }
    fmt.Println(os.Expand("There are ${NUMBER} quoting ${TYPES} in Go.", mapper))
}
Output:
97 39
abc "ab	c"

first"
second'
third"

This is one way of including a ` in a raw string literal.
1234

There are 3 quoting constructs in Go.
There are 3 quoting constructs in Go.
There are 3 quoting constructs in Go.

J

J provides four mechanisms for inline data:

  • A sequence of numbers, for example 1 2 3
  • A sequence of characters, for example '1 2 3'
  • A newline terminated multiline script, for example:
    0 :0
      1 2 3
      4 5 6
    )
    
  • (in recent J versions), an embeddable newline terminated multiline script, for example:
    {{)n
      1 2 3
      4 5 6
    }}
    

Note that a multiline {{)n construct discards the leading newline, but the construct can also be used for single line strings (more concise than ' delimited strings when ' appears multiple times in the string). For example {{)n1 2 3}} is the same value as '1 2 3'.

Sequences of numbers or characters which contain exactly one element are treated specially -- they do not have a length of their own.

J also has a constant language for numbers, which gives special significance to embedded letters. For example 12e3 is the floating point value 12000 (but J extends this notation to support some numbers in bases other than 10, extended precision integers, rational values and complex values and approximations involving certain commonly used constants, such as pi).

The multiline scripts are special cases of the mechanisms for defining verbs, adverbs and conjunctions (what might be called functions or macros or operators or procedures in other languages) which instead provide the raw characters of the definition. The old form (beginning with 0 : 0 and ending with a line containing a single right parenthesis and no other displayable characters) is different from the new form (beginning with {{)n and ending with a line which has }} and no other characters preceding it) in the way that any following part of a surrounding sentence is arranged. These values of A would be equivalent:

NB. no trailing linefeed
A=: '1 2 3'

NB. removing linefeed
A=: 0 : 0-.LF
1 2 3
)

NB. removing linefeed
A=: {{)n
1 2 3
}}-.LF

Also, the {{}} forms are nestable. So, for example, this would also define an equivalent value for A:

{{
   {{
      A=: {{)n
1 2 3
}}-.LF
   }}''
}}''

Here, we are defining verbs inline and immediately evaluating them (by providing an argument (which is ignored because it is not referenced)).

The use of an unbalanced right parenthesis as an escape character was inherited from APL. The double curly brace mechanism was a compromise between J's existing use of curly braces and visual conventions used in a variety of other languages.

jq

"Data" in the context of jq can be understood to mean a JSON value or a stream of such values. Such data can be included in a jq program wherever an expression is allowed, but in a jq program, consecutive JSON values must be specified using "," as a separator, as shown in this snippet:

def data:
  "A string", 1, {"a":0}, [1,2,[3]]
;
Long JSON strings can be broken up into smaller JSON strings and concatenated using the infix "+" operator, e.g.
"This is not such a "
+ "long string after all."
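
For readers who do not know jq, the two ideas above (a definition that emits a stream of JSON values, and a long string built by concatenation) have a loose Python analogue. This is a sketch for comparison only; a Python generator is not a jq filter, and the names are chosen to echo the snippet above.

```python
import json

def data():
    """Loose analogue of the jq definition `def data: ...;` shown above."""
    yield "A string"
    yield 1
    yield {"a": 0}
    yield [1, 2, [3]]

# Each value in the stream is emitted separately, as jq would do.
for value in data():
    print(json.dumps(value))

# String pieces joined with +; note the space kept at the end of the first piece.
long_string = "This is not such a " + "long string after all."
```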

"Raw data", such as character strings that are not expressed as JSON strings, cannot be included in jq programs textually but must be "imported" in some manner, e.g. from environment variables, text files, or using command-line options.

Julia

Note: Almost all of the documentation below is quoted from various portions of the Julia documentation at docs.julialang.org.

  • As with all objects in Julia, the size of a quoted string is limited by the maximum allocatable memory object in the underlying OS (32 or 64 bit).
  • Quoted strings are considered to contain a series of Unicode characters, but invalid Unicode within strings does not itself throw any errors. Therefore, strings may potentially contain any values.
  • String literals are delimited by double quotes or triple double quotes:
julia> str = "Hello, world.\n"
"Hello, world.\n"

julia> """Contains "quote" characters and
a newline"""
"Contains \"quote\" characters and \na newline"
  • Both single- and triple-quoted strings may contain interpolated values. Triple-quoted strings are also dedented to the level of the least-indented line. This is useful for defining strings within code that is indented. For example:
julia> str = """
           Hello,
           world.
         """
"  Hello,\n  world.\n"
  • Julia allows interpolation into string literals using $:
julia> greet = "Hello"; whom = "world";

julia> "$greet, $whom.\n"
"Hello, world.\n"
  • The shortest complete expression after the $ is taken as the expression whose value is to be interpolated into the string. Thus, you can interpolate any expression into a string using parentheses:
julia> "1 + 2 = $(1 + 2)"
"1 + 2 = 3"
  • Julia reserves the single quote ' for character literals, not for strings:
julia> 'π'
'π': Unicode U+03C0 (category Ll: Letter, lowercase)
  • Julia requires commands sent to functions such as run() be surrounded by backticks. Such expressions create a Cmd object, which is used for running a child process from Julia:
julia> mycommand = `echo hello`
`echo hello`

julia> typeof(mycommand)
Cmd

julia> run(mycommand);
hello
  • Julia uses the colon : in metaprogramming for quoting symbols and other code:
julia> a  = :+
:+

julia> typeof(a)
Symbol

julia> b = quote + end
quote
    #= REPL[3]:1 =#
    +
end

julia> typeof(b)
Expr

julia> eval(a) == eval(b)
true

julia> c = :(2 + 3)
:(2 + 3)

julia> eval(c)
5
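
The dedenting rule for triple-quoted strings described earlier in this section can be modelled outside Julia. The Python sketch below is a simplified approximation, not Julia's implementation: it handles spaces only, the function name julia_dedent is made up, and it assumes that the line holding the closing delimiter counts towards the indentation level even when it is blank, while other blank lines do not.

```python
def julia_dedent(raw: str) -> str:
    """Approximate how a triple-quoted literal's body is dedented."""
    # A newline directly after the opening delimiter is dropped.
    if raw.startswith("\n"):
        raw = raw[1:]
    lines = raw.split("\n")

    def indent(line: str) -> int:
        return len(line) - len(line.lstrip(" "))

    # Assumption: blank lines are ignored, except the last line,
    # which is the one that holds the closing delimiter.
    last = len(lines) - 1
    considered = [ln for i, ln in enumerate(lines) if ln.strip() or i == last]
    margin = min(indent(ln) for ln in considered)
    return "\n".join(ln[margin:] for ln in lines)

# Body of the indented example from this section: 11 spaces before the
# words, 9 spaces before the closing delimiter.
body = "\n           Hello,\n           world.\n         "
assert julia_dedent(body) == "  Hello,\n  world.\n"
```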

Lua

Lua has three string definition syntaxes: single and double quotes, which are equivalent, and long-bracket pairs [[ ]], which may span multiple lines. Long-bracket pairs may be written at an arbitrary level ([=[ ]=], [==[ ]==], and so on), which may be useful for quoting Lua source code itself (which might use long brackets). Lua strings are variable-length arrays of bytes, not 0-terminated (as in C), so they may contain arbitrary raw binary data. Commonly escaped characters and decimal (\ddd) and hexadecimal (\xXX) notation are supported.

s1 = "This is a double-quoted 'string' with embedded single-quotes."
s2 = 'This is a single-quoted "string" with embedded double-quotes.'
s3 = "this is a double-quoted \"string\" with escaped double-quotes."
s4 = 'this is a single-quoted \'string\' with escaped single-quotes.'
s5 = [[This is a long-bracket "'string'" with embedded single- and double-quotes.]]
s6 = [=[This is a level 1 long-bracket ]]string[[ with [[embedded]] long-brackets.]=]
s7 = [==[This is a level 2 long-bracket ]=]string[=[ with [=[embedded]=] level 1 long-brackets, etc.]==]
s8 = [[This is
a long-bracket string
with embedded
line feeds]]
s9 = "any \0 form \1 of \2 string \3 may \4 contain \5 raw \6 binary \7 data \xDB"
print(s1)
print(s2)
print(s3)
print(s4)
print(s5)
print(s6)
print(s7)
print(s8)
print(s9) -- with audible "bell" from \7 if supported by os
print("some raw binary:", #s9, s9:byte(5), s9:byte(12), s9:byte(17))
Output:
This is a double-quoted 'string' with embedded single-quotes.
This is a single-quoted "string" with embedded double-quotes.
this is a double-quoted "string" with escaped double-quotes.
this is a single-quoted 'string' with escaped single-quotes.
This is a long-bracket "'string'" with embedded single- and double-quotes.
This is a level 1 long-bracket ]]string[[ with [[embedded]] long-brackets.
This is a level 2 long-bracket ]=]string[=[ with [=[embedded]=] level 1 long-brackets, etc.
This is
a long-bracket string
with embedded
line feeds
any   form ☺ of ☻ string ♥ may ♦ contain ♣ raw ♠ binary  data █
some raw binary:        64      0       1       2
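
Choosing a long-bracket level that is safe for a given piece of text can be automated. The following Python sketch (not Lua, and not part of any Lua tool; the function name is hypothetical) picks the lowest level whose closing bracket cannot be matched early, and also accounts for Lua dropping a newline that immediately follows the opening bracket. It ignores Lua's normalization of line endings inside long strings.

```python
def lua_long_string(text: str) -> str:
    """Wrap text in the lowest-level Lua long brackets that can hold it."""
    level = 0
    while True:
        close = "]" + "=" * level + "]"
        # Safe only if the first match of the closing bracket is the real one,
        # which also covers text that ends in "]" or "]=".
        if (text + close).find(close) == len(text):
            break
        level += 1
    opener = "[" + "=" * level + "["
    # Lua skips a newline right after the opening bracket, so add one
    # to preserve a leading newline in the data.
    if text.startswith("\n"):
        opener += "\n"
    return opener + text + close

assert lua_long_string("plain") == "[[plain]]"
assert lua_long_string("a ]] b") == "[=[a ]] b]=]"
assert lua_long_string("x]") == "[=[x]]=]"
```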

Nim

Nim uses single quote for characters.
It uses double quotes for single line strings and triple double quotes for multiline strings.
The standard module strformat provides interpolation in strings.
Nim allows defining special string literals, for instance to describe regular expressions or PEGs (parsing expression grammars).
Array literals are defined as a list of values between brackets.
Preceding an array literal with an @ creates a sequence literal instead.
Tuple literals are defined as a list of values between parentheses. Field names may be specified by preceding a value by the name followed by a colon.

echo "A simple string."
echo "A simple string including tabulation special character \\t: \t."

echo """
First part of a multiple string,
followed by second part
and third part.
"""

echo r"A raw string containing a \."

# Interpolation in strings.
import strformat
const C = "constant"
const S = fmt"A string with interpolation of a {C}."
echo S
var x = 3
echo fmt"A string with interpolation of expression “2 * x + 3”: {2 * x + 3}."
echo fmt"Displaying “x” with an embedded format: {x:05}."

# Regular expression string.
import re
let r = re"\d+"

# Pegs string.
import pegs
let e = peg"\d+"

# Array literal.
echo [1, 2, 3]        # Element type is implicit ("int" here).
echo [byte(1), 2, 3]  # Element type is specified by the first element type.
echo [byte 1, 2, 3]   # An equivalent way to specify the type.

echo @[1, 2, 3]       # Sequence of ints.

# Tuples.
echo ('a', 1, true)   # Tuple without explicit field names.
echo (x: 1, y: 2)     # Tuple with two int fields "x" and "y".
Output:
A simple string.
A simple string including tabulation special character \t: 	.
First part of a multiple string,
followed by second part
and third part.

A raw string containing a \.
A string with interpolation of a constant.
A string with interpolation of expression “2 * x + 3”: 9.
Displaying “x” with an embedded format: 00003.
[1, 2, 3]
[1, 2, 3]
[1, 2, 3]
@[1, 2, 3]
('a', 1, true)
(x: 1, y: 2)

Phix

Library: Phix/basics

Single quotes are used for single ascii characters, eg 'A'. Multibyte unicode characters are typically held as utf-8 strings.
Double quotes are used for single-line strings, with backslash interpretation, eg "one\ntwo\nthree\n".
The concatenation operator & along with a couple more quotes can certainly be used to mimic string continuation, however it is technically an implementation detail rather than part of the language specification as to whether that occurs at compile-time or run-time.
Phix does not support interpolation other than printf-style, eg printf(1,"Hello %s,\nYour account balance is %3.2f\n",{name,balance}).
Back-ticks and triple-quotes are used for multi-line strings, without backslash interpretation, eg

constant t123 = `
one
two
three
`

or (entirely equivalent, except the following can contain back-ticks which the above cannot, and vice versa for triple quotes)

constant t123 = """
one
two
three
"""

Both are also equivalent to the top double-quote one-liner. Note that a single leading '\n' is automatically stripped.
Several builtins such as substitute, split, and join are often used to convert such strings into the required internal form.
Regular expressions are usually enclosed in back-ticks, specifically to avoid backslash interpretation.
You can also declare hexadecimal strings, eg

x"1 2 34 5678_AbC" -- same as {0x01, 0x02, 0x34, 0x56, 0x78, 0xAB, 0x0C}
                   -- note however it displays as {1,2,52,86,120,171,12}
                   -- whereas x"414243" displays as "ABC" (as all chars)
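
The grouping of hex digits in the example can be reproduced with a short Python sketch. The rule used here is inferred from that single example and is an assumption, not taken from the Phix documentation: spaces and underscores separate groups, each group is read two digits at a time, and a lone trailing digit forms a byte on its own. The function name is made up.

```python
import re

def phix_hex_string(body: str) -> list:
    """Decode the text between the quotes of an x"..." literal into byte values."""
    values = []
    for group in re.split(r"[ _]+", body):
        if not group:
            continue
        # Assumed rule: two hex digits per byte, a leftover digit is its own byte.
        for i in range(0, len(group), 2):
            values.append(int(group[i:i + 2], 16))
    return values

assert phix_hex_string("1 2 34 5678_AbC") == [0x01, 0x02, 0x34, 0x56, 0x78, 0xAB, 0x0C]
assert bytes(phix_hex_string("414243")).decode("ascii") == "ABC"
```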

Literal sequences are represented with curly braces, and can be nested to any depth, eg

{2, 3, 5, 7, 11, 13, 17, 19}
{1, 2, {3, 3, 3}, 4, {5, {6}}}
{{"John", "Smith"}, 52389, 97.25}
{}  -- the 0-element sequence

Raku

The Perl philosophy, which Raku thoroughly embraces, is that "There Is More Than One Way To Do It" (often abbreviated to TIMTOWTDI). Quoting constructs is an area where this is enthusiastically espoused.

Raku has a whole quoting specific sub-language built in called Q. Q changes the parsing rules inside the quoting structure and allows extremely fine control over how the enclosed data is parsed. Every quoting construct in Raku is some form of a Q syntactic structure, using adverbs to fine tune the desired behavior, though many of the most commonly used have some form of "shortcut" syntax for quick and easy use. Usually, when using an adverbial form, you may omit the Q: and just use the adverb.

In general, quoting structures have theoretically unlimited length but are in practice limited by memory size. It is probably better to keep them under a gigabyte or so, though they can be read as a supply, which avoids holding the whole thing in memory at once. They can hold multiple lines of data; how the new-line characters are treated depends entirely on any white-space adverb applied. The Q forms use some bracketing character to delimit the quoted data, usually a Unicode bracket pair ( [], {}, <>, ⟪⟫, whatever) that has an "open" and a "close" character, but they may use any non-identifier character as both opener and closer: ||, //, ??, the list goes on. The universal escape character for constructs that allow escaping is backslash "\".

The following exposition barely scratches the surface. For much more detail see the Raku documentation for quoting constructs for a comprehensive list of adverbs and examples.

The most commonly used
  • Q[ ], common shortcut: 「 」
The most basic form of quoting. No interpolation, no escaping. What is inside is what you get. No exceptions.
「Ze backslash characters!\ \Zay do NUSSING!! \」 -> Ze backslash characters!\ \Zay do NUSSING!! \


  • "Single quote" quoting. - Q:q[ ], adverbial: q[ ], common shortcut: ' '
No interpolation, but allow escaping quoting characters.
'Don\'t panic!' -> Don't panic!


  • "Double quote" quoting. - Q:qq[ ], adverbial: qq[ ], common shortcut: " "
Interpolates: embedded variables, logical characters, character codes, continuations.
"Hello $name, today is {Date.today} \c[grinning face] \n🦋" -> Hello Dave, today is 2020-03-25 😀
🦋
Where $name is a variable containing a name (one would imagine), {Date.today} is a continuation - a code block to be executed and the result inserted, \c[grinning face] is the literal emoji character 😀 as a character code, \n is a new-line character and 🦋 is an emoji butterfly. Allows escape sequences, and indeed, requires them when embedding data that looks like it may be an interpolation target but isn't.


Every adverbial form has both a q and a qq variant to give the 'single quote' or "double quote" semantics. Only the most commonly used are listed here.


  • "Quote words" - Q:qw[ ], adverbial: qw[ ], common shortcut: < >
No interpolation, but allow escaping quote characters. (Inherited from the q[] escape semantics)
< a β 3 Б 🇩🇪 >
Parses whatever is inside as a white-space separated list of words. Returns a list with all white space removed. Any numeric values are returned as allomorphs.
That list may be operated on directly with any listy operator or it may be assigned to a variable.
say < a β 3 Б 🇩🇪 >[*-1] # What is the last item in the list? (🇩🇪)
say +< a β 3 Б 🇩🇪 > # How many items are in the list? (5)


  • "Quote words with white space protection" - Q:qww[ ], adverbial: qww[ ]
May preserve white space by enclosing it in single or double quote characters, otherwise identical to qw[ ].
say qww< a β '3 Б' 🇩🇪 >[2] # Item 2 in the list? (3 Б)


  • "Double quote words" quoting. - Q:qqw[ ], adverbial: qqw[ ], common shortcut: << >> or « »
Interpolates similar to standard double quote, but then interprets the interpolated string as a white space separated list.


  • "Double quoted words with white space protection" - Q:qqww[ ], adverbial: qqww[ ]
Same as qqw[ ] but retains quoted white space.


  • "System command" - Q:qx[ ], adverbial: qx[ ]
Execute the string inside the construct as a system command and return the result.


  • "Heredoc format" - Q:q:to/END/; END, adverbial: q:to/END/; END
Return structured text between two textual delimiters. Depending on the adverb, may or may not interpolate (same rules as other adverbial forms.) Will return the text with the same indent as the indent of the final delimiter. The text delimiter is user chosen (and is typically, though not necessarily, uppercase) as is the delimiter bracket character.

There are other adverbs that give precise control over what interpolates or doesn't, and they may be applied to any of the above constructs. See the doc page for details. There is another whole sub-genre dedicated to quoting regexes.
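
For comparison, the difference between plain word quoting (qw) and word quoting with white space protection (qww) resembles the difference between two Python standard-library calls. This is only an analogy under stated assumptions: shlex follows POSIX shell quoting rules rather than Raku's, and neither call produces allomorphs for numeric words.

```python
import shlex

# Like qw: split on white space only, quotes get no special treatment.
words = "a β 3 Б 🇩🇪".split()
assert len(words) == 5
assert words[-1] == "🇩🇪"

# Like qww: quoted white space stays inside a single word.
protected = shlex.split("a β '3 Б' 🇩🇪")
assert protected[2] == "3 Б"
assert len(protected) == 4
```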

REXX

There are no "escape" characters used in the REXX language.

The different types (or styles) of incorporating quoted constructs are largely a matter of style.

/*REXX program demonstrates various ways to express a string of characters  or  numbers.*/
a= 'This is one method of including a '' (an apostrophe) within a string.'
b= "This is one method of including a ' (an apostrophe) within a string."

                                                 /*sometimes,  an apostrophe is called  */
                                                 /*a quote.                             */
/*──────────────────────────────────────────────────────────────────────────────────────*/
c= "This is one method of including a "" (a double quote) within a string."
d= 'This is one method of including a " (a double quote) within a string.'

                                                 /*sometimes,  a double quote is also   */
                                                 /*called a quote,  which can make for  */
                                                 /*some confusion and bewilderment.     */
/*──────────────────────────────────────────────────────────────────────────────────────*/
f= 'This is one method of expressing a long literal by concatenations,  the '     ||  ,
   'trailing character of the above clause must contain a trailing '              ||  ,
   'comma (,)  === note the embedded trailing blank in the above 2 statements.'
/*──────────────────────────────────────────────────────────────────────────────────────*/
g= 'This is another method of expressing a long literal by '         ,
   "abutments,  the trailing character of the above clause must "    ,
   'contain a trailing comma (,)'
/*──────────────────────────────────────────────────────────────────────────────────────*/
h= 'This is another method of expressing a long literal by '       ,  /*still continued.*/
   "abutments,  the trailing character of the above clause must "                  ,
   'contain a trailing comma (,)  ---  in this case, the comment  /* ... */  is '   ,
   'essentially not considered to be "part of" the REXX clause.'
/*──────────────────────────────────────────────────────────────────────────────────────*/
i= 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109

                                                 /*This is one way to express a list of */
                                                 /*numbers that don't have a sign.      */
/*──────────────────────────────────────────────────────────────────────────────────────*/
j= 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109,
   71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181

                                                 /*This is one way to express a long    */
                                                 /*list of numbers that don't have a    */
                                                 /*sign.                                */
                                                 /*Note that this form of continuation  */
                                                 /*implies a blank is abutted to first  */
                                                 /*part of the REXX statement.          */
                                                 /*Also note that some REXXs have a     */
                                                 /*maximum clause length.               */
/*──────────────────────────────────────────────────────────────────────────────────────*/
k= 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109,
   71 73   79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181

                                                 /*The  J  and  K  values are identical,*/
                                                 /*superfluous and extraneous blanks are*/
                                                 /*ignored.                             */
/*──────────────────────────────────────────────────────────────────────────────────────*/
l= '-2 3 +5 7 -11 13 17 19 -23 29 -31 37 -41 43 47 -53 59 -61 67 -71 73 79 -83 89 97 -101'

                                                 /*This is one way to express a list of */
                                                 /*numbers that have a sign.            */
/*──────────────────────────────────────────────────────────────────────────────────────*/
m= a b c d f g h i j k l                         /*this will create a list of all the   */
                                                 /*listed strings used  (so far)  into  */
                                                 /*the variable     M      (with an     */
                                                 /*intervening blank between each       */
                                                 /*variable's value).                   */
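
The doubling rule used for the A and C strings above can be modelled outside REXX. The following is a minimal Python sketch, not REXX code; the function name rexx_literal_value is hypothetical, and it assumes the input is one well-formed literal (it does not check for a stray, undoubled delimiter inside the body).

```python
def rexx_literal_value(literal: str) -> str:
    """Return the string value denoted by a REXX-style quoted literal.

    Assumption: `literal` is a single well-formed literal, delimited by
    either apostrophes or double quotes.
    """
    quote = literal[0]
    if quote not in "'\"" or len(literal) < 2 or literal[-1] != quote:
        raise ValueError("not a quoted literal")
    body = literal[1:-1]
    # A doubled delimiter inside the body stands for one delimiter character.
    # The other kind of quote needs no special treatment.
    return body.replace(quote * 2, quote)


print(rexx_literal_value("'It''s'"))
print(rexx_literal_value('"It\'s"'))
```

Both calls denote the same five-character value, which is why the choice between the two delimiters is mostly one of style.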


Ring[edit]

This example is incomplete. Explain where they would likely be used, what their primary use is, what limitations they have and why one might be preferred over another. Is one style interpolating and another not? Are there restrictions on the size of the quoted data? The type? The format? Please ensure that it meets all task requirements and remove this message.
text = list(3)

text[1] = "This is 'first' example for quoting"
text[2] = "This is second 'example' for quoting"
text[3] = "This is third example 'for' quoting"

for n = 1 to len(text)
    see "text for quoting: " + nl + text[n] + nl
    str = substr(text[n],"'","")
    see "quoted text:" + nl + str + nl + nl
next
Output:

text for quoting:
This is 'first' example for quoting
quoted text:
This is first example for quoting

text for quoting:
This is second 'example' for quoting
quoted text:
This is second example for quoting

text for quoting:
This is third example 'for' quoting
quoted text:
This is third example for quoting

Smalltalk[edit]

Characters[edit]

Character literals are prefixed by a "$". Conceptually, they are the elements of strings, although effectively only the codePoint is stored in strings. But when accessing a string element, instances of Character are used. Characters can be asked whether they are uppercase, lowercase, etc.

$a
$Å
$日

Strings[edit]

String literals are enclosed in single quotes. Conceptually, they hold instances of Character as elements, but the underlying storage representation is chosen to be space efficient. Typically, underneath are classes like SingleByteString, TwoByteString and FourByteString, but this is transparent to the programmer. Strings can hold any Unicode character; UTF-8 is only used when strings are exchanged with the external world (which is good, as it makes operations like stringLength much easier).

'hello'
'日本語'

Traditional Smalltalk-80 does not support any escapes inside strings, which is occasionally inconvenient.
Smalltalk/X supports an extended syntax for C-like strings:

Works with: Smalltalk/X
c'hello\nthis\tis a C string\0x0D'

and also embedded expressions:

e'hello world; it is now {Time now}\n'

Arrays[edit]

Literal arrays are written as #(...), where the elements are space separated; each literal array element can be any type of literal again, optionally omitting the '#'-character:

#( 1 1.234 (1/2) foo 'hello' #(9 8 7) (99 88 77)  '日本語' true [1 2 3] false)

Here, the third element is a fraction, followed by the symbol #'foo', a string, two arrays, another string, the boolean true, a byteArray and the boolean false.

ByteArrays[edit]

A dense collection of byte valued integers is written as #[..]. Conceptually, they are arrays of integer values in the range 0..255, but use only one byte of storage per element. They are typically used for bulk storage such as bitmap images, or when exchanging such data with external functions.

#[ 1 2 16rFF 2r0101010 ]

Symbols[edit]

These are like symbol atoms in Lisp/Scheme, written as #'...' (i.e. like a string with a hash prefix). If the symbol contains no special characters, or has the form allowed for a message selector, the quotes can be omitted. Symbols are often used as keys in dictionaries, especially for message selectors and global/namespace name bindings. They can be quickly compared using "==", which is a pointer compare (identity), instead of "=", which compares the contents (equality).

#'foo'
#'foo bar baz'
#foo.  " same as #'foo' "
#'++'
#++  " same as #'++' "
#a:b:c:  " same as #'a:b:c:' "
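
The difference between an identity compare and a contents compare can be illustrated by analogy in Python, whose sys.intern function gives strings a unique, symbol-like representative. This is only a sketch of the general interning idea, not a description of how any Smalltalk implementation stores symbols.

```python
import sys

# Build the same text twice at run time, so that there are two distinct
# string objects before interning.
parts = ["foo", " bar", " baz"]
first = "".join(parts)
second = "".join(parts)

# Interning maps equal contents to one shared object, much like a symbol table.
sym_a = sys.intern(first)
sym_b = sys.intern(second)

print(first == second)   # contents compare (equality)
print(sym_a is sym_b)    # pointer compare (identity)
```

Once both values are interned, the identity test is enough, and it does not need to look at the characters at all.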

Blocks[edit]

Somewhat between literal constants and instantiated objects are blocks, which represent a closure (lambda function). Here, the code object is constructed as a literal at compile time, and a closure object (which wraps the code plus the visible variables) is created at execution time. Blocks thus represent a piece of code which can be stored in an instance variable, passed as an argument or returned from a method.
Block syntax is very compact:

[ expression . expression ... expression ]

or for a block with arguments:

[:arg1 :arg2 :... :argN | expression . expression ... expression ]

Blocks are one of the fundamental building blocks of Smalltalk (no pun here), as the language (Compiler) does not specify any syntax for control structures. Control structures like if, while, etc. are all implemented as library functions, and defined e.g. in the Boolean, Block or Collection classes.
If you have a block at hand, it can be evaluated by sending it a "value" message:

aBlock value.  "evaluate the block, passing no argument"
anotherBlock value:1 value:2. "evaluate the block, passing two arguments"

The most basic implementation of such a control structure is found in the Boolean subclasses True and False, which implement e.g. "ifTrue:arg" and "ifFalse:". Here is "ifTrue:" in those two classes as a concrete example:

in the True class:
ifTrue: aBlock
    ^ aBlock value "I am true, so I evaluate the block"

in the False class:
ifTrue: aBlock
    ^ nil  "I am false, so I ignore the block"

Thus, the expression "someBoolean ifTrue:[ 'hello' print ]" will either evaluate the lambda or not, depending on the someBoolean receiver. Obviously, you can teach other objects how to respond to "value" messages and then use them as if they were blocks. Actually, the Object class also implements "value", so you can also write: "a := someCondition ifTrue:10 ifFalse:20". This works because "Object value" simply returns the receiver.
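
The same dispatch idea can be sketched in Python, with zero-argument lambdas standing in for blocks. The class and method names below (TrueObject, FalseObject, if_true) are hypothetical stand-ins chosen for this illustration; they are not part of any Smalltalk or Python library.

```python
class TrueObject:
    """Stand-in for Smalltalk's True class."""

    def if_true(self, block):
        # I am true, so I evaluate the block and return its result.
        return block()


class FalseObject:
    """Stand-in for Smalltalk's False class."""

    def if_true(self, block):
        # I am false, so I ignore the block; None plays the role of nil.
        return None


evaluated = []
TrueObject().if_true(lambda: evaluated.append("true branch ran"))
FalseObject().if_true(lambda: evaluated.append("false branch ran"))
print(evaluated)
```

No conditional statement appears anywhere: which branch runs is decided purely by which class receives the message, and the block is only evaluated when the receiver chooses to call it.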

In addition, some Smalltalk dialects implement additional syntax extensions.

Inline Object[edit]

Works with: Smalltalk/X
#{
    foo:  <someConstant>
    bar:  <someConstant>
}

Generates a literal constant instance of an anonymous class, with two instance vars: foo and bar. The object is dumb in that it only provides getter and setter functions. These are used eg. when returning structured multiple values from a method.

Dense Typed Arrays[edit]

Works with: Smalltalk/X

Similar to byteArrays, there are dense arrays of ints, floats, doubles or bits (i.e. they use much less memory compared to regular arrays, which hold pointers to their elements). They are also perfect when calling out to C-language functions. The syntax is analogous to the Scheme language's syntax:

#u16( 1 2 3 ).  " an array of unsigned int16s "
#u32( 1 2 3 ).  " an array of unsigned int32s "
#u64( 1 2 3 ).  " an array of unsigned int64s "
#s16( -1 2 3 ). " an array of signed int16s "
#s32( -1 2 3 ). " an array of signed int32s "
#s64( -1 2 3 ). " an array of signed int64s "
#f16( -1 2.0 3 ). " an array of float16s "
#f32( -1 2.0 3 ). " an array of float32s "
#f64( -1 2.0 3 ). " an array of float64s "
#b( 1 0 1 1 0 0 ). " an array of bits "
#B( true false true true ). " an array of booleans "

Wren[edit]

Library: Wren-fmt

Wren has two quoting constructs: ordinary string literals and (from v0.4.0) raw string literals.

An ordinary string literal is a sequence of characters (usually interpreted as UTF-8) enclosed in double-quotes. It can include various escape sequences as listed in the Literals/String#Wren task.

It also supports interpolation which enables any Wren expression, whatever its type or complexity, to be embedded in an ordinary string literal by placing it in parentheses immediately preceded by a % character. A literal % character is represented by the escape sequence \%.

If the expression is not a string, then it is automatically converted to one by applying its type's toString method. All classes have such a method which is usually written explicitly or can be just inherited from the Object class which sits at the top of the type hierarchy.

Interpolated expressions can also be nested though this is not usually a good idea as they can quickly become unreadable.

It can be argued that interpolated strings which contain anything other than simple expressions (for example formatting information) are hard to read anyway. Although it is not part of the standard language, the Wren-fmt module listed above contains methods modelled after C's 'printf' function family to meet this objection.

A raw string literal is any text delimited by triple double-quotes, """, and is interpreted verbatim i.e. any control codes and/or interpolations are not processed as such.

If a triple quote appears on its own line then any trailing whitespace on that line is ignored.

Here are some examples of all this.

import "/fmt" for Fmt

// simple string literal
System.print("Hello world!")

// string literal including an escape sequence
System.print("Hello tabbed\tworld!")

// interpolated string literal
var w = "world"
System.print("Hello interpolated %(w)!")

// 'printf' style
Fmt.print("Hello 'printf' style $s!", w)

// more complicated interpolated string literal
var h = "Hello"
System.print("%(Fmt.s(-8, h)) more complicated interpolated %(w.map { |c| "%(c + "\%")" }.join())!")

// more complicated 'printf' style
Fmt.print("$-8s more complicated 'printf' style $s\%!", h, w.join("\%"))

// raw string literal
var r = """
Hello, raw string literal which interprets a control code such as "\n" and an 
interpolation such as %(h) as verbatim text.
Single (") or dual ("") double-quotes can be included without problem. 
"""
System.print(r)
Output:
Hello world!
Hello tabbed	world!
Hello interpolated world!
Hello 'printf' style world!
Hello    more complicated interpolated w%o%r%l%d%!
Hello    more complicated 'printf' style w%o%r%l%d%!
Hello, raw string literal which interprets a control code such as "\n" and an 
interpolation such as %(h) as verbatim text.
Single (") or dual ("") double-quotes can be included without problem. 

Z80 Assembly[edit]

Translation of: 6502 Assembly

Quoting constructs are very straightforward. The use of quotation marks tells the assembler that the data inside those quotation marks is to be assembled as ASCII values. A null terminator must go outside the quotation marks. There is no built-in support for control codes, as the assembler does not assume that any "putS" routine exists. Whatever functions you create that use these strings will have to be given the capability of handling control codes.

Unlike some languages, there is no limit on how long a string can be. A text string can span as many lines as you want it to, since the null terminator is the only way the CPU knows where it ends (assuming that your "putS" routine uses a null terminator).

MyString: 
     byte "Hello World",0              ;a null-terminated string
LookupTable: 
     byte &03,&06,&09,&0C              ;a pre-defined sequence of bytes (similar in concept to enum in C)
TileGfx: 
     incbin "Z:\game\gfx\tilemap.bmp"  ;a file containing bitmap graphics data

For most Z80 assemblers, the following are standard:

  • A value with no prefix is interpreted as a base 10 (decimal) number. $ or & represents hexadecimal and % represents binary. Single or double quotes represent ASCII.

Multiple values can be put on the same line, separated by commas. DB only needs to be before the first data value on that line. Or, you can put each value on its own line. Both are valid and have the same end result when the code is assembled.

Z80 Assembly uses db or byte for 8-bit data and dw or word for 16-bit data. 16-bit values are written by the programmer in big-endian, but stored little-endian. For example, the following two data blocks are equivalent. You can write it either way, but the end result is the same.

word $ABCD
byte $CD,$AB
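
The byte order described above can be checked with a short Python sketch. It uses the standard struct module to pack the 16-bit value the way a little-endian CPU such as the Z80 stores it; this only models the memory layout and is not assembler output.

```python
import struct

# "<H" means: little-endian, unsigned 16-bit integer.
stored = struct.pack("<H", 0xABCD)

# The low byte comes first in memory, followed by the high byte.
low_byte, high_byte = stored[0], stored[1]
print(f"{low_byte:02X} {high_byte:02X}")
```

The value is written as ABCD in the source, but the first byte in memory is CD and the second is AB.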

Most assemblers support "C-like" operators, and there are a few additional ones:

  • < or LOW() means "The low byte of." For example, <$3456 evaluates to 56.
  • > or HIGH() means "The high byte of." For example, >$78AB evaluates to 78.

These two operators are most frequently used with labeled memory addresses, like so:

lookup_table_lo:
byte <Table00,<Table01,<Table02
lookup_table_hi:
byte >Table00,>Table01,>Table02
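
What the assembler computes for these split tables can be sketched in Python. The helper names low and high and the three table addresses are hypothetical values chosen for illustration; a real assembler resolves the labels to whatever addresses the tables end up at.

```python
def low(value: int) -> int:
    """Low byte of a 16-bit value, like the < operator or LOW()."""
    return value & 0xFF


def high(value: int) -> int:
    """High byte of a 16-bit value, like the > operator or HIGH()."""
    return (value >> 8) & 0xFF


# Hypothetical addresses for Table00, Table01 and Table02.
table_addresses = [0x8000, 0x8123, 0x9ABC]

lookup_table_lo = bytes(low(addr) for addr in table_addresses)
lookup_table_hi = bytes(high(addr) for addr in table_addresses)

# Reading the same index from both tables rebuilds the full address.
rebuilt = [(hi << 8) | lo for lo, hi in zip(lookup_table_lo, lookup_table_hi)]
print([hex(addr) for addr in rebuilt])
```

Splitting the addresses into two byte tables lets an 8-bit index select both halves of a pointer without multiplying the index by two.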

The incbin directive can be used for embedding raw graphics data, text, or music.

zkl[edit]

Quoting text: zkl has two types of text: parsed and raw. Strings are limited to one line, no continuations.

Parsed text is in double quotes ("text\n") and supports escapes ("\n" is newline, "\Ubd;" or "\u00bd" is a UTF-8 encoded character, etc.).

"Raw" text is unparsed, useful for things like regular expressions and unit testing of source code. It uses the form 0'<sigil>text<sigil>. For example 0'<text\n> is the text "text\\n". There is no restriction on sigil (other than it is one character).

Text blocks are multiple lines of text that are gathered into one line and then evaluated (thus can be anything, such as string or code and are often mixed). #<<< (at the start of a line) begins and ends the block. A #<<<" beginning tag prepends a " to the block. For example:

#<<<
text:=
"
A
";
#<<<

is parsed as text:="\nA\n";

Other data types are pretty much as in other languages.