S-expressions: Difference between revisions

Content deleted Content added
→‎{{header|Python}}: Added support for negative numbers.
Hout (talk | contribs)
 
(138 intermediate revisions by 41 users not shown)
Line 1:
{{task|Data Structures}}
[[wp:S-Expression|S-Expressions]]   are one convenient way to parse and store data.
 
 
;Task:
Write a simple reader and writer for S-Expressions that handles quoted and unquoted strings, integers and floats.
 
The reader should read a single but nested S-Expression from a string and store it in a suitable datastructure (list, array, etc). Newlines and other whitespace may be ignored unless contained within a quoted string. “<tt>()</tt>” inside quoted strings are not interpreted, but treated as part of the string. Handling escaped quotes inside a string is optional; thus “<tt>(foo"bar)</tt>” maybe treated as a string “<tt>foo"bar</tt>”, or as an error.
 
Newlines and other whitespace may be ignored unless contained within a quoted string.
For this, the reader need not recognise “<tt>\</tt>” for escaping, but should, in addition, recognize numbers if the language has appropriate datatypes.
 
“<tt>()</tt>” &nbsp; inside quoted strings are not interpreted, but treated as part of the string.
 
Handling escaped quotes inside a string is optional; &nbsp; thus “<tt>(foo"bar)</tt>” maybe treated as a string “<tt>foo"bar</tt>”, or as an error.
 
For this, the reader need not recognize “<tt>\</tt>” for escaping, but should, in addition, recognize numbers if the language has appropriate datatypes.
 
Languages that support it may treat unquoted strings as symbols.
Line 13 ⟶ 21:
 
The reader should be able to read the following input
<langsyntaxhighlight lang="lisp">((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))</langsyntaxhighlight>
and turn it into a native datastructure. (see the [[#Pike|Pike]], [[#Python|Python]] and [[#Ruby|Ruby]] implementations for examples of native data structures.)
 
Line 20 ⟶ 28:
Strings that don't contain whitespace or parentheses () don't need to be quoted in the resulting S-Expression, but as a simplification, any string may be quoted.
 
 
'''Extra Credit:''' Let the writer produce pretty printed output with indenting and line-breaks
;Extra Credit:
Let the writer produce pretty printed output with indenting and line-breaks.
<br><br>
 
=={{header|11l}}==
{{trans|Nim}}
 
<syntaxhighlight lang="11l">T Token
T.enum Kind
INT
FLOAT
STRING
IDENT
LPAR
RPAR
END
 
Kind kind
String val
 
F (kind, val = ‘’)
.kind = kind
.val = val
 
F lex(input_str)
[Token] result
V pos = 0
 
F current()
R I @pos < @input_str.len {@input_str[@pos]} E Char("\0")
 
L pos < input_str.len
V ch = input_str[pos]
I ch == ‘(’
pos++
result.append(Token(Token.Kind.LPAR))
E I ch == ‘)’
pos++
result.append(Token(Token.Kind.RPAR))
E I ch C ‘0’..‘9’
V num = ‘’
V kind = Token.Kind.INT
L current() C ‘0’..‘9’
num ‘’= current()
pos++
I current() == ‘.’
num ‘’= current()
kind = FLOAT
pos++
L current() C ‘0’..‘9’
num ‘’= current()
pos++
result.append(Token(kind, num))
E I ch C (‘ ’, "\t", "\n", "\r")
pos++
E I ch == ‘"’
V str = ‘’
pos++
L current() != ‘"’
str ‘’= current()
pos++
pos++
result.append(Token(Token.Kind.STRING, str))
E
V BannedChars = Set([‘ ’, "\t", ‘"’, ‘(’, ‘)’, ‘;’])
V ident = ‘’
L current() !C BannedChars
ident ‘’= current()
pos++
result.append(Token(Token.Kind.IDENT, ident))
 
result.append(Token(Token.Kind.END))
R result
 
F indent(s, count)
R (count * ‘ ’)‘’s.replace("\n", "\n"(count * ‘ ’))
 
T SExpr
T.enum Kind
INT
FLOAT
STRING
IDENT
LIST
 
Kind kind
String val
[SExpr] children
 
F (kind, val = ‘’)
.kind = kind
.val = val
 
F to_str()
I .kind C (SExpr.Kind.INT, SExpr.Kind.FLOAT, SExpr.Kind.IDENT)
R .val
E I .kind == STRING
R ‘"’(.val)‘"’
E I .kind == LIST
V result = ‘(’
L(i, ex) enumerate(.children)
I ex.kind == LIST & ex.children.len > 1
result ‘’= "\n"
result ‘’= indent(ex.to_str(), 2)
E
I i > 0
result ‘’= ‘ ’
result ‘’= ex.to_str()
R result‘)’
assert(0B)
 
V input_str = ‘
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
V tokens = lex(input_str)
V pos = 0
 
F current()
R I :pos < :tokens.len {:tokens[:pos]} E Token(Token.Kind.END)
 
F parse() -> SExpr
V token = current()
:pos++
I token.kind == INT
R SExpr(SExpr.Kind.INT, token.val)
E I token.kind == FLOAT
R SExpr(SExpr.Kind.FLOAT, token.val)
E I token.kind == STRING
R SExpr(SExpr.Kind.STRING, token.val)
E I token.kind == IDENT
R SExpr(SExpr.Kind.IDENT, token.val)
E I token.kind == LPAR
V result = SExpr(SExpr.Kind.LIST)
L current().kind !C (Token.Kind.RPAR, Token.Kind.END)
result.children.append(parse())
assert(current().kind != END, ‘Missing right paren ')'’)
:pos++
R result
assert(0B)
 
print(parse().to_str())</syntaxhighlight>
 
{{out}}
<pre>
(
(data "quoted data" 123 4.5)
(data
(!@# (4.5) "(more" "data)")))
</pre>
 
=={{header|Ada}}==
Line 28 ⟶ 186:
Specification of package S_Expr:
 
<langsyntaxhighlight Adalang="ada">with Ada.Strings.Unbounded;
private with Ada.Containers.Indefinite_Vectors;
 
Line 82 ⟶ 240:
end record;
 
end S_Expr;</langsyntaxhighlight>
 
The implementation of S_Expr:
 
<langsyntaxhighlight Adalang="ada">with Ada.Integer_Text_IO, Ada.Float_Text_IO;
 
package body S_Expr is
Line 140 ⟶ 298:
end Print;
 
end S_Expr;</langsyntaxhighlight>
 
Specification and Implementation of S_Expr.Parser (a child package of S_Expr):
 
<langsyntaxhighlight Adalang="ada">generic -- child of a generic package must be a generic unit
package S_Expr.Parser is
 
Line 150 ⟶ 308:
-- the result of a parse process is always a list of expressions
 
end S_Expr.Parser;</langsyntaxhighlight>
 
<langsyntaxhighlight Adalang="ada">with Ada.Integer_Text_IO, Ada.Float_Text_IO;
 
package body S_Expr.Parser is
Line 261 ⟶ 419:
end Parse;
 
end S_Expr.Parser;</langsyntaxhighlight>
 
The main program Test_S_Expr:
 
<langsyntaxhighlight Adalang="ada">with S_Expr.Parser, Ada.Text_IO;
 
procedure Test_S_Expr is
Line 287 ⟶ 445:
Expression_List.First.Print(Indention => 0);
-- Parse will output a list of S-Expressions. We need the first Expression.
end Test_S_Expr;</langsyntaxhighlight>
 
{{out}}
The output:<pre> (
<pre> (
(
data
Line 312 ⟶ 471:
)
) </pre>
 
=={{header|ALGOL 68}}==
<syntaxhighlight lang="algol68"># S-Expressions #
CHAR nl = REPR 10;
# mode representing an S-expression #
MODE SEXPR = STRUCT( UNION( VOID, STRING, REF SEXPR ) element, REF SEXPR next );
# creates an initialises an SEXPR #
PROC new s expr = REF SEXPR: HEAP SEXPR := ( EMPTY, NIL );
# reports an error #
PROC error = ( STRING msg )VOID: print( ( "**** ", msg, newline ) );
# S-expression reader - reads and returns an S-expression from the string s #
PROC s reader = ( STRING s )REF SEXPR:
BEGIN
PROC at end = BOOL: s pos > UPB s;
PROC curr = CHAR: IF at end THEN REPR 0 ELSE s[ s pos ] FI;
PROC skip spaces = VOID: WHILE NOT at end AND ( curr = " " OR curr = nl ) DO s pos +:= 1 OD;
PROC end of list = BOOL: at end OR curr = ")";
INT s pos := LWB s;
INT t pos;
[ ( UPB s - LWB s ) + 1 ]CHAR token; # token text - large enough to hold the whole string if necessary #
# adds the current character to the token #
PROC add curr = VOID: token[ t pos +:= 1 ] := curr;
# get an s expression element from s #
PROC get element = REF SEXPR:
BEGIN
REF SEXPR result = new s expr;
skip spaces;
# get token text #
IF at end THEN
# no element #
element OF result := EMPTY
ELIF curr = "(" THEN
s pos +:= 1;
skip spaces;
IF NOT end of list
THEN
REF SEXPR nested expression = get element;
REF SEXPR element pos := nested expression;
element OF result := nested expression;
skip spaces;
WHILE NOT end of list
DO
element pos := next OF element pos := get element;
skip spaces
OD
FI;
IF curr = ")" THEN
s pos +:= 1
ELSE
error( "Missing "")""" )
FI
ELIF curr = ")" THEN
s pos +:= 1;
error( "Unexpected "")""" );
element OF result := EMPTY
ELSE
# quoted or unquoted string #
t pos := LWB token - 1;
IF curr /= """" THEN
# unquoted string #
WHILE add curr;
s pos +:= 1;
NOT at end AND curr /= " " AND curr /= "("
AND curr /= ")" AND curr /= """"
AND curr /= nl
DO SKIP OD
ELSE
# quoted string #
WHILE add curr;
s pos +:= 1;
NOT at end AND curr /= """"
DO SKIP OD;
IF curr /= """" THEN
# missing string quote #
error( "Unterminated string: <<" + token[ : t pos ] + ">>" )
ELSE
# have the closing quote #
add curr;
s pos +:= 1
FI
FI;
element OF result := token[ : t pos ]
FI;
result
END # get element # ;
 
REF SEXPR s expr = get element;
skip spaces;
IF NOT at end THEN
# extraneuos text after the expression #
error( "Unexpected text at end of expression: " + s[ s pos : ] )
FI;
 
s expr
END # s reader # ;
# prints an S expression #
PROC s writer = ( REF SEXPR s expr )VOID:
BEGIN
# prints an S expression with a suitable indent #
PROC print indented s expression = ( REF SEXPR s expr, INT indent )VOID:
BEGIN
REF SEXPR s pos := s expr;
WHILE REF SEXPR( s pos ) ISNT REF SEXPR( NIL ) DO
FOR i TO indent DO print( ( " " ) ) OD;
CASE element OF s pos
IN (VOID ): print( ( "()", newline ) )
, (STRING s): print( ( s, newline ) )
, (REF SEXPR e): BEGIN
print( ( "(", newline ) );
print indented s expression( e, indent + 4 );
FOR i TO indent DO print( ( " " ) ) OD;
print( ( ")", newline ) )
END
OUT
error( "Unexpected S expression element" )
ESAC;
s pos := next OF s pos
OD
END # print indented s expression # ;
 
print indented s expression( s expr, 0 )
END # s writer # ;
# test the eader and writer with the example from the task #
s writer( s reader( "((data ""quoted data"" 123 4.5)"
+ nl
+ " (data (!@# (4.5) ""(more"" ""data)"")))"
+ nl
)
)</syntaxhighlight>
{{out}}
<pre>
(
(
data
"quoted data"
123
4.5
)
(
data
(
!@#
(
4.5
)
"(more"
"data)"
)
)
)
</pre>
 
=={{header|APL}}==
{{works with|Dyalog APL}}
 
These are two functions, <code>sexp</code> parses an S-expression, and
<code>pretty</code> prettyprints a parsed expression.
 
The S-expression is represented as a nested APL vector, where every
item is a tuple consisting of a number representing the type, and
the value of the item.
 
The simple types are string (1), number (2), and atom (3); in these
cases the value is a character vector (for string and atom) or a number.
Type 0 is a list, and the value is a vector of items.
 
As an example, this is how a list may be defined in APL itself:
 
<pre>
pretty⊂(0((3 'Hi')(3 'Bye')(1 'A string')(0((3 'Depth')(2 42)))))
(
Hi
Bye
"A string"
(
Depth
42
)
)
</pre>
 
The following is the result of parsing and then prettyprinting the given input:
 
<pre>
⍴r
65
r
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
sexp r
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│┌─┬───────────────────────────────────────────────────────────────────────────────────────────────────────────────┐│
││0│┌──────────────────────────────────────────────┬──────────────────────────────────────────────────────────────┐││
││ ││┌─┬──────────────────────────────────────────┐│┌─┬──────────────────────────────────────────────────────────┐│││
││ │││0│┌────────┬───────────────┬───────┬───────┐│││0│┌────────┬───────────────────────────────────────────────┐││││
││ │││ ││┌─┬────┐│┌─┬───────────┐│┌─┬───┐│┌─┬───┐││││ ││┌─┬────┐│┌─┬───────────────────────────────────────────┐│││││
││ │││ │││3│data│││1│quoted data│││2│123│││2│4.5│││││ │││3│data│││0│┌───────┬─────────────┬─────────┬─────────┐││││││
││ │││ ││└─┴────┘│└─┴───────────┘│└─┴───┘│└─┴───┘││││ ││└─┴────┘││ ││┌─┬───┐│┌─┬─────────┐│┌─┬─────┐│┌─┬─────┐│││││││
││ │││ │└────────┴───────────────┴───────┴───────┘│││ ││ ││ │││3│!@#│││0│┌───────┐│││1│(more│││1│data)││││││││
││ ││└─┴──────────────────────────────────────────┘││ ││ ││ ││└─┴───┘││ ││┌─┬───┐│││└─┴─────┘│└─┴─────┘│││││││
││ ││ ││ ││ ││ ││ ││ │││2│4.5││││ │ │││││││
││ ││ ││ ││ ││ ││ ││ ││└─┴───┘│││ │ │││││││
││ ││ ││ ││ ││ ││ ││ │└───────┘││ │ │││││││
││ ││ ││ ││ ││ ││ │└─┴─────────┘│ │ │││││││
││ ││ ││ ││ ││ │└───────┴─────────────┴─────────┴─────────┘││││││
││ ││ ││ ││ │└─┴───────────────────────────────────────────┘│││││
││ ││ ││ │└────────┴───────────────────────────────────────────────┘││││
││ ││ │└─┴──────────────────────────────────────────────────────────┘│││
││ │└──────────────────────────────────────────────┴──────────────────────────────────────────────────────────────┘││
│└─┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────┘│
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
pretty sexp r
(
(
data
"quoted data"
123
4.5
)
(
data
(
!@#
(
4.5
)
"(more"
"data)"
)
)
)
</pre>
 
 
 
<syntaxhighlight lang="apl">sexp←{
wspace←' ',⎕TC ⍝ whitespace is space, tab, cr, lf
 
⍝ turn string into number if possible
num←{
0=≢⍵:⍬ ⍝ empty = nope
(⊃⍵)∊'-¯':-∇1↓⍵ ⍝ negative?
(1≥⍵+.='.')∧⍵∧.∊⊂⎕D,'.':⍎⍵ ⍝ number: all digits and 0 or 1 points
⍬ ⍝ otherwise, nope.
}
 
⍝ tokenize (0=brackets, 1=strings, 2=atoms)
tok←{
d←(~∧\⍵∊wspace)/⍵ ⍝ ignore leading whitespace
d≡'':d ⍝ empty input = empty output
s←1↑d ⋄ r←1↓d ⍝ start and rest
s∊'()':(⊂0,⊂s),∇r ⍝ brackets: just the bracket
sb←∧\~('"'=r)∧'\'≠¯1⌽r ⍝ strings: up to first " not preceded by \
sd←(1⌽sd≠'"')/sd←sb/r ⍝ without escape characters
s='"':(⊂1,⊂sd),∇1↓(~sb)/r
atm←∧\~d∊wspace,'()"' ⍝ atom: up to next whitespace, () or "
(⊂2,⊂atm/d),∇(~atm)/d
}
 
⍝ build structure from tokens
build←{
⍺←⍬
0=≢⍵:⍺ ⍬ ⍝ empty input = done
typ tok←⊃⍵ ⍝ current token and type
rst←1↓⍵ ⍝ rest of tokens
tok≡,'(':(⍺,⊂0 l)∇r⊣l r←∇rst ⍝ open bracket: go down a level
tok≡,')':⍺ rst ⍝ close bracket: go up a level
typ=1:(⍺,⊂1 tok)∇rst ⍝ string: type 1
0≠≢n←num tok:(⍺,⊂2(,n))∇rst ⍝ number: type 2
(⍺,⊂3 tok)∇rst ⍝ symbol: type 3
}
 
⍝ check that a string was passed in
(''≢0↑⍵)∨1≠⍴⍴⍵:⎕SIGNAL⊂('EN'11)('Message' 'Input must be a char vector')
 
⍝ check that all strings are closed
quot←('"'=⍵)∧'\'≠¯1⌽⍵
0≠2|+/quot:⎕SIGNAL⊂('EN'11)('Message' 'Open string')
 
⍝ check that all brackets match (except those in strings)
nest←+\+⌿1 ¯1×[1]'()'∘.=(~2|+\quot)/⍵
(0≠¯1↑nest)∨0<.∨nest:⎕SIGNAL⊂('EN'11)('Message' 'Mismatched parentheses')
 
⊃build tok ⍵
}
 
pretty←{
⍝ Prettyprinter for parsed S-expressions
NL←⎕tc[2]
∊∇{
typ itm←⍵
typ=3:itm,NL ⍝ Atom
typ=2:(⍕itm),NL ⍝ Number
typ=1:('"',('"'⎕R'\\"'⊢itm),'"'),NL ⍝ String
typ=0:'(',NL,('^'⎕R' '⊢⍺⍺ itm),')',NL ⍝ List
}¨⍵
}
 
</syntaxhighlight>
 
=={{header|Arturo}}==
<syntaxhighlight lang="rebol">code: {
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
}
 
s: first to :block code
inspect.muted s
print as.code s</syntaxhighlight>
 
{{out}}
 
<pre>[ :inline
[ :inline
data :word
quoted data :string
123 :integer
4.5 :floating
]
[ :inline
data :word
[ :inline
! :symbol
@ :symbol
# :symbol
[ :inline
4.5 :floating
]
(more :string
data) :string
]
]
]
((data "quoted data" 123 4.5) (data (! @ # (4.5) "(more" "data)")))</pre>
 
=={{header|AutoHotkey}}==
<syntaxhighlight lang="autohotkey">S_Expressions(Str){
Str := RegExReplace(Str, "s)(?<![\\])"".*?[^\\]""(*SKIP)(*F)|((?<![\\])[)(]|\s)", "`n$0`n")
Str := RegExReplace(Str, "`am)^\s*\v+") , Cnt := 0
loop, parse, Str, `n, `r
{
Cnt := A_LoopField=")" ? Cnt-1 : Cnt
Res .= tabs(Cnt) A_LoopField "`r`n"
Cnt := A_LoopField="(" ? Cnt+1 : Cnt
}
return Res
}
tabs(n){
loop, % n
Res .= "`t"
return Res
}</syntaxhighlight>
Examples:<syntaxhighlight lang="autohotkey">Str =
(
((data da\(\)ta "quot\\ed data" 123 4.5)
("data" (!@# (4.5) "(mo\"re" "data)")))
)
MsgBox, 262144, , % S_Expressions(Str)</syntaxhighlight>
{{out}}
<pre>(
(
data
da\(\)ta
"quot\\ed data"
123
4.5
)
(
"data"
(
!@#
(
4.5
)
"(mo\"re"
"data)"
)
)
)
 
</pre>
 
=={{header|C}}==
<langsyntaxhighlight lang="c">#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
Line 539 ⟶ 1,079:
print_expr(x, 0);
return 0;
}</syntaxhighlight>
}</lang>output<lang>input is:
{{out}}<syntaxhighlight lang="text">input is:
((data da\(\)ta "quot\\ed data" 123 4.5)
("data" (!@# (4.5) "(mo\"re" "data)")))
Line 562 ⟶ 1,103:
)
)
)</langsyntaxhighlight>
 
=={{header|C sharp|C#}}==
Implementation of S-expression serializer & deserializer in C# 6.0 language.
 
Git repository with code and tests can be found here: https://github.com/ichensky/SExpression/tree/rosettacode
 
<syntaxhighlight lang="csharp">
using System;
using System.Collections.Generic;
using System.Text;
 
public class SNode
{
private List<SNode> _items;
public string Name { get; set; }
public IReadOnlyCollection<SNode> Items { get { return _items.AsReadOnly(); } }
public SNode()
{
this._items = new List<SNode>();
}
public SNode(string name):this()
{
this.Name=name;
}
public void AddNode(SNode node)
{
this._items.Add(node);
}
}
 
public class SNodeFull : SNode
{
private bool _isLeaf;
public bool IsLeaf { get => _isLeaf; }
public SNodeFull(bool isLeaf) : base()
{
this._isLeaf = isLeaf;
}
 
public SNodeFull(string name, bool isLeaf) : base(name)
{
this._isLeaf = isLeaf;
}
 
public SNodeFull RootNode { get; set; }
 
public void AddNode(SNodeFull node)
{
base.AddNode(node);
node.RootNode = this;
}
}
 
</syntaxhighlight>
<syntaxhighlight lang="csharp">
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
 
namespace SExpression
{
public partial class SExpression
{
public const string ErrorStrNotValidFormat = "Not valid format.";
}
public partial class SExpression : ISExpression
{
public String Serialize(SNode root)
{
if (root == null)
{
throw new ArgumentNullException();
}
var sb = new StringBuilder();
Serialize(root, sb);
return sb.ToString();
}
private void Serialize(SNode node, StringBuilder sb)
{
sb.Append('(');
 
if (node.Items.Count > 0)
{
int x = 0;
foreach (var item in node.Items)
{
if (x>0)
{
sb.Append(' ');
}
else
{
x++;
}
if (item.Items.Count > 0)
{
Serialize(item, sb);
}
else
{
SerializeItem(item, sb);
}
}
}
 
sb.Append(')');
}
private void SerializeItem(SNode node, StringBuilder sb)
{
if (node.Name == null)
{
sb.Append("()");
return;
}
node.Name = node.Name.Replace("\"", "\\\"");
if (node.Name.IndexOfAny(new char[] { ' ', '"', '(', ')' }) != -1 || node.Name == string.Empty)
{
sb.Append('"').Append(node.Name).Append('"');
return;
}
sb.Append(node.Name);
}
}
public partial class SExpression
{
public SNode Deserialize(string st)
{
if (st==null)
{
return null;
}
st = st.Trim();
if (string.IsNullOrEmpty(st))
{
return null;
}
 
var begin = st.IndexOf('(');
if (begin != 0)
{
throw new Exception();
}
var end = st.LastIndexOf(')');
if (end != st.Length - 1)
{
throw new Exception(ErrorStrNotValidFormat);
}
st = st.Remove(st.Length-1).Remove(0, 1).ToString();
var node = new SNodeFull(false);
Deserialize(ref st, node);
return node;
}
 
private void Deserialize(ref string st, SNodeFull root)
{
st = st.Trim();
if (string.IsNullOrEmpty(st))
{
return;
}
 
SNodeFull node = null;
SNodeFull r = root;
do
{
while (st[0] == ')')
{
st = st.Remove(0, 1).Trim();
if (st.Length==0)
{
return;
}
r = root.RootNode;
if (r==null)
{
throw new Exception(ErrorStrNotValidFormat);
}
}
node = DeserializeItem(ref st);
st = st.Trim();
 
r.AddNode(node);
if (!node.IsLeaf)
{
Deserialize(ref st,node);
}
}
while (st.Length > 0);
}
 
private SNodeFull DeserializeItem(ref string st)
{
if (st[0] == '(')
{
st = st.Remove(0, 1);
return new SNodeFull(false);
}
 
var x = 0;
var esc = 0;
for (int i = 0; i < st.Length; i++)
{
if (st[i] == '"')
{
if (esc == 0)
{
esc = 1;
}
else if(esc == 1 && (i> 0 && st[i - 1] == '\\'))
{
throw new Exception(ErrorStrNotValidFormat);
}
else
{
esc = 2;
break;
}
}
else if (esc==0 && " ()".Contains(st[i]))
{
break;
}
 
x++;
}
if (esc == 1)
{
throw new Exception(ErrorStrNotValidFormat);
}
 
var head = esc==0? st.Substring(0, x): st.Substring(1,x-1);
st = st.Remove(0, esc ==0 ? x: x + 2);
return new SNodeFull(head, true);
}
}
}
</syntaxhighlight>
<syntaxhighlight lang="csharp">
using System;
using System.Collections.Generic;
 
namespace SExpression.Test
{
class Program
{
static void Main(string[] args)
{
var str =
@"((data ""quoted data"" 123 4.5)
(data(!@# (4.5) ""(more"" ""data)"")))";
 
var se = new SExpression();
var node = se.Deserialize(str);
var result = se.Serialize(node);
Console.WriteLine(result);
}
}
}
</syntaxhighlight>
{{out}}
<pre>
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))
</pre>
 
=={{header|C++}}==
The function s_expr::parse parses an object from an input stream. An object may be a
quoted string, symbol (i.e. unquoted string), a number (integer or float) or a list
of objects. The "\" character is used for escaping special characters.
C++ 17 is required. As it stands, you can't really do anything with the parsed object
apart from writing it out, which fulfils this task's requirements. With some more work
this code might actually be useful.
<syntaxhighlight lang="cpp">#include <cctype>
#include <iomanip>
#include <iostream>
#include <list>
#include <memory>
#include <sstream>
#include <string>
#include <variant>
 
namespace s_expr {
 
enum class token_type { none, left_paren, right_paren, symbol, string, number };
enum class char_type { left_paren, right_paren, quote, escape, space, other };
enum class parse_state { init, quote, symbol };
 
struct token {
token_type type = token_type::none;
std::variant<std::string, double> data;
};
 
char_type get_char_type(char ch) {
switch (ch) {
case '(':
return char_type::left_paren;
case ')':
return char_type::right_paren;
case '"':
return char_type::quote;
case '\\':
return char_type::escape;
}
if (isspace(static_cast<unsigned char>(ch)))
return char_type::space;
return char_type::other;
}
 
bool parse_number(const std::string& str, token& tok) {
try {
size_t pos = 0;
double num = std::stod(str, &pos);
if (pos == str.size()) {
tok.type = token_type::number;
tok.data = num;
return true;
}
} catch (const std::exception&) {
}
return false;
}
 
bool get_token(std::istream& in, token& tok) {
char ch;
parse_state state = parse_state::init;
bool escape = false;
std::string str;
token_type type = token_type::none;
while (in.get(ch)) {
char_type ctype = get_char_type(ch);
if (escape) {
ctype = char_type::other;
escape = false;
} else if (ctype == char_type::escape) {
escape = true;
continue;
}
if (state == parse_state::quote) {
if (ctype == char_type::quote) {
type = token_type::string;
break;
}
else
str += ch;
} else if (state == parse_state::symbol) {
if (ctype == char_type::space)
break;
if (ctype != char_type::other) {
in.putback(ch);
break;
}
str += ch;
} else if (ctype == char_type::quote) {
state = parse_state::quote;
} else if (ctype == char_type::other) {
state = parse_state::symbol;
type = token_type::symbol;
str = ch;
} else if (ctype == char_type::left_paren) {
type = token_type::left_paren;
break;
} else if (ctype == char_type::right_paren) {
type = token_type::right_paren;
break;
}
}
if (type == token_type::none) {
if (state == parse_state::quote)
throw std::runtime_error("syntax error: missing quote");
return false;
}
tok.type = type;
if (type == token_type::string)
tok.data = str;
else if (type == token_type::symbol) {
if (!parse_number(str, tok))
tok.data = str;
}
return true;
}
 
void indent(std::ostream& out, int level) {
for (int i = 0; i < level; ++i)
out << " ";
}
 
class object {
public:
virtual ~object() {}
virtual void write(std::ostream&) const = 0;
virtual void write_indented(std::ostream& out, int level) const {
indent(out, level);
write(out);
}
};
 
class string : public object {
public:
explicit string(const std::string& str) : string_(str) {}
void write(std::ostream& out) const { out << std::quoted(string_); }
private:
std::string string_;
};
 
class symbol : public object {
public:
explicit symbol(const std::string& str) : string_(str) {}
void write(std::ostream& out) const {
for (char ch : string_) {
if (get_char_type(ch) != char_type::other)
out << '\\';
out << ch;
}
}
private:
std::string string_;
};
 
class number : public object {
public:
explicit number(double num) : number_(num) {}
void write(std::ostream& out) const { out << number_; }
private:
double number_;
};
 
class list : public object {
public:
void write(std::ostream& out) const;
void write_indented(std::ostream&, int) const;
void append(const std::shared_ptr<object>& ptr) {
list_.push_back(ptr);
}
private:
std::list<std::shared_ptr<object>> list_;
};
 
void list::write(std::ostream& out) const {
out << "(";
if (!list_.empty()) {
auto i = list_.begin();
(*i)->write(out);
while (++i != list_.end()) {
out << ' ';
(*i)->write(out);
}
}
out << ")";
}
 
void list::write_indented(std::ostream& out, int level) const {
indent(out, level);
out << "(\n";
if (!list_.empty()) {
for (auto i = list_.begin(); i != list_.end(); ++i) {
(*i)->write_indented(out, level + 1);
out << '\n';
}
}
indent(out, level);
out << ")";
}
 
class tokenizer {
public:
tokenizer(std::istream& in) : in_(in) {}
bool next() {
if (putback_) {
putback_ = false;
return true;
}
return get_token(in_, current_);
}
const token& current() const {
return current_;
}
void putback() {
putback_ = true;
}
private:
std::istream& in_;
bool putback_ = false;
token current_;
};
 
std::shared_ptr<object> parse(tokenizer&);
 
std::shared_ptr<list> parse_list(tokenizer& tok) {
std::shared_ptr<list> lst = std::make_shared<list>();
while (tok.next()) {
if (tok.current().type == token_type::right_paren)
return lst;
else
tok.putback();
lst->append(parse(tok));
}
throw std::runtime_error("syntax error: unclosed list");
}
 
std::shared_ptr<object> parse(tokenizer& tokenizer) {
if (!tokenizer.next())
return nullptr;
const token& tok = tokenizer.current();
switch (tok.type) {
case token_type::string:
return std::make_shared<string>(std::get<std::string>(tok.data));
case token_type::symbol:
return std::make_shared<symbol>(std::get<std::string>(tok.data));
case token_type::number:
return std::make_shared<number>(std::get<double>(tok.data));
case token_type::left_paren:
return parse_list(tokenizer);
default:
break;
}
throw std::runtime_error("syntax error: unexpected token");
}
 
} // namespace s_expr
 
void parse_string(const std::string& str) {
std::istringstream in(str);
s_expr::tokenizer tokenizer(in);
auto exp = s_expr::parse(tokenizer);
if (exp != nullptr) {
exp->write_indented(std::cout, 0);
std::cout << '\n';
}
}
 
int main(int argc, char** argv) {
std::string test_string =
"((data \"quoted data\" 123 4.5)\n"
" (data (!@# (4.5) \"(more\" \"data)\")))";
if (argc == 2)
test_string = argv[1];
try {
parse_string(test_string);
} catch (const std::exception& ex) {
std::cerr << ex.what() << '\n';
}
return 0;
}</syntaxhighlight>
 
{{out}}
<pre>
(
(
data
"quoted data"
123
4.5
)
(
data
(
!@#
(
4.5
)
"(more"
"data)"
)
)
)
</pre>
 
=={{header|Ceylon}}==
<syntaxhighlight lang="ceylon">class Symbol(symbol) {
shared String symbol;
string => symbol;
}
 
abstract class Token()
of DataToken | leftParen | rightParen {}
 
abstract class DataToken(data)
of StringToken | IntegerToken | FloatToken | SymbolToken
extends Token() {
 
shared String|Integer|Float|Symbol data;
string => data.string;
}
 
class StringToken(String data) extends DataToken(data) {}
class IntegerToken(Integer data) extends DataToken(data) {}
class FloatToken(Float data) extends DataToken(data) {}
class SymbolToken(Symbol data) extends DataToken(data) {}
 
object leftParen extends Token() {
string => "(";
}
object rightParen extends Token() {
string => ")";
}
 
class Tokens(String input) satisfies {Token*} {
shared actual Iterator<Token> iterator() => object satisfies Iterator<Token> {
 
variable value index = 0;
 
shared actual Token|Finished next() {
 
while(exists nextChar = input[index], nextChar.whitespace) {
index++;
}
 
if(index >= input.size) {
return finished;
}
 
assert(exists char = input[index]);
 
if(char == '(') {
index++;
return leftParen;
}
if(char == ')') {
index++;
return rightParen;
}
 
if(char == '"') {
value builder = StringBuilder();
while(exists nextChar = input[++index]) {
if(nextChar == '"') {
index++;
break;
}
if(nextChar == '\\') {
if(exists nextNextChar = input[++index]) {
switch(nextNextChar)
case('\\') {
builder.append("\\");
}
case('t') {
builder.append("\t");
}
case('n') {
builder.append("\n");
}
case('"') {
builder.append("\"");
}
else {
throw Exception("unknown escaped character");
}
} else {
throw Exception("unclosed string");
}
} else {
builder.appendCharacter(nextChar);
}
}
return StringToken(builder.string);
}
 
value builder = StringBuilder();
while(exists nextChar = input[index], !nextChar.whitespace && nextChar != '(' && nextChar != ')') {
builder.appendCharacter(nextChar);
index++;
}
value string = builder.string;
if(is Integer int = Integer.parse(string)) {
return IntegerToken(int);
} else if(is Float float = Float.parse(string)) {
return FloatToken(float);
} else {
return SymbolToken(Symbol(string));
}
}
};
}
 
abstract class Node() of Atom | Group {}
 
class Atom(data) extends Node() {
shared String|Integer|Float|Symbol data;
string => data.string;
}
class Group() extends Node() satisfies {Node*} {
shared variable Node[] nodes = [];
string => nodes.string;
shared actual Iterator<Node> iterator() => nodes.iterator();
 
}
 
Node buildTree(Tokens tokens) {
 
[Group, Integer] recursivelyBuild(Token[] tokens, variable Integer index = 0) {
value result = Group();
while(exists token = tokens[index]) {
switch (token)
case (leftParen) {
value [newNode, newIndex] = recursivelyBuild(tokens, index + 1);
index = newIndex;
result.nodes = result.nodes.append([newNode]);
}
case (rightParen) {
return [result, index];
}
else {
result.nodes = result.nodes.append([Atom(token.data)]);
}
index++;
}
return [result, index];
}
 
value root = recursivelyBuild(tokens.sequence())[0];
return root.first else Group();
}
 
void prettyPrint(Node node, Integer indentation = 0) {
 
void paddedPrint(String s) => print(" ".repeat(indentation) + s);
 
if(is Atom node) {
paddedPrint(node.string);
} else {
paddedPrint("(");
for(n in node.nodes) {
prettyPrint(n, indentation + 2);
}
paddedPrint(")");
}
}
 
shared void run() {
value tokens = Tokens("""((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))""");
print(tokens);
 
value tree = buildTree(tokens);
prettyPrint(tree);
}
</syntaxhighlight>
{{out}}
<pre>
(
(
data
quoted data
123
4.5
)
(
data
(
!@#
(
4.5
)
(more
data)
)
)
)
</pre>
 
=={{header|CoffeeScript}}==
{{improve|CoffeeScript|This solution does not reproduce unquoted strings as per task description}}
<langsyntaxhighlight lang="coffeescript">
# This code works with Lisp-like s-expressions.
#
Line 697 ⟶ 1,999:
console.log "output:\n#{pp output}\n"
console.log "round trip:\n#{sexp output}\n"
</syntaxhighlight>
</lang>
{{out}}
output
<syntaxhighlight lang="text">
> coffee sexp.coffee
input:
Line 729 ⟶ 2,031:
round trip:
(("data" "quoted data with escaped \"" 123 4.5 "14") ("data" ("!@#" (4.5) "(more" "data)")))
</syntaxhighlight>
</lang>
 
=={{header|Common Lisp}}==
===Parsing S-Expressions===
Like most Lisp dialects, Common Lisp uses s-expressions to represent both source code and data. Therefore, a properly formatted s-expression is easily "read" then "evaluated" within the top-level REPL (read-eval-print-loop).
both source code and data.
Therefore, a properly formatted s-expression is easily "read"
then "evaluated" within the top-level REPL (read-eval-print-loop).
<pre>CL-USER> (read-from-string "((data \"quoted data\" 123 4.5) (data (!@# (4.5) \"(more\" \"data)\")))")
((DATA "quoted data" 123 4.5) (DATA (!@# (4.5) "[more" "data]")))
Line 744 ⟶ 2,049:
Unfortunately, our pointy-haired boss has asked you to write a parser for an unusual s-expression syntax that uses square brackets instead of parenthesis. In most programming languages, this would necessitate writing an entire parser. Fortunately, the Common Lisp reader can be modified through the use of macro-characters to accomplish this task. When the reader parses a macro-character token, a function associated with the macro-character is called. As evidenced below, modifying the behavior of the Lisp reader by setting macro-character functions to handle additional sytax requires far less work than writing a complete parser from scratch.
 
<langsyntaxhighlight lang="lisp">(defun lsquare-reader (stream char)
(declare (ignore char))
(read-delimited-list #\] stream t))
(set-macro-character #\[ #'lsquare-reader) ;;Call the lsquare-reader function when a '[' token is parsed
(set-macro-character #\] (get-macro-character #\) nil)) ;;Do the same thing as ')' when a ']' token is parsed</langsyntaxhighlight>
Unit test code:
<langsyntaxhighlight lang="lisp">;;A list of unit tests. Each test is a cons in which the car (left side) contains the
;;test string and the cdr (right side) the expected result of reading the S-Exp.
(setf unit-tests
Line 765 ⟶ 2,070:
(dolist (test unit-tests)
(format t "String: ~23s Expected: ~23s Actual: ~s~%"
(car test) (cdr test) (read-from-string (car test)))))</langsyntaxhighlight>
{{out| Unit test output:}}
<pre>CL-USER> (run-tests)
String: "[]" Expected: NIL Actual: NIL
Line 779 ⟶ 2,084:
(read-from-string "[)") ;;Error: An object cannot start with ')'
(read-from-string "(]") ;;Error: An object cannot start with ']' </pre>
{{out|Task output:}}
<pre>CL-USER> (setf task "[[data \"quoted data\" 123 4.5] [data [!@# [4.5] \"[more\" \"data]\"]]]")
"[[data \"quoted data\" 123 4.5] [data [!@# [4.5] \"[more\" \"data]\"]]]"
Line 799 ⟶ 2,104:
===Writing S-Expressions===
The next step in this task is to write a standard Lisp s-expression in the square bracket notation.
<langsyntaxhighlight lang="lisp">(defun write-sexp (sexp)
"Writes a Lisp s-expression in square bracket notation."
(labels ((parse (sexp)
Line 822 ⟶ 2,127:
(subseq str 0 last-char)
str)))))))
(concatenate 'string "[" (fix-spacing (parse sexp)) "]")))</langsyntaxhighlight>
Unit test code:
<langsyntaxhighlight lang="lisp">(setf unit-tests '(((1 2) (3 4)) (1 2 3 4) ("ab(cd" "mn)op")
(1 (2 (3 (4)))) ((1) (2) (3)) ()))
 
Line 830 ⟶ 2,135:
(dolist (test unit-tests)
(format t "Before: ~18s After: ~s~%"
test (write-sexp test))))</langsyntaxhighlight>
{{out|Unit test output:}}
<pre>CL-USER> (run-tests)
Before: ((1 2) (3 4)) After: "[[1 2] [3 4]]"
Line 839 ⟶ 2,144:
Before: ((1) (2) (3)) After: "[[1] [2] [3]]"
Before: NIL After: "[]"</pre>
And finally{{out|Finally, round trip output for the original task example:}}
<pre>CL-USER> task
"[[data \"quoted data\" 123 4.5] [data [!@# [4.5] \"[more\" \"data]\"]]]"
Line 848 ⟶ 2,153:
"[[DATA \"quoted data\" 123 4.5] [DATA [!@# [4.5] \"[more\" \"data]\"]]]"
CL-USER> </pre>
 
=={{header|Cowgol}}==
 
This parser allows both the use of quotes in symbols (e.g. <code>abc"def</code>),
as well as escaped quotes in quoted strings (e.g. <code>"Hello \" world"</code>).
Integer numbers are recognized and stored as such, but since Cowgol does not have
a native floating point type, floating point numbers are not.
 
<syntaxhighlight lang="cowgol">include "cowgol.coh";
include "strings.coh";
include "malloc.coh";
 
const MAXDEPTH := 256; # Maximum depth (used for stack sizes)
const MAXSTR := 256; # Maximum string length
 
# Type markers
const T_ATOM := 1;
const T_STRING := 2;
const T_NUMBER := 3;
const T_LIST := 4;
 
# Value union
record SVal is
number @at(0): int32;
string @at(0): [uint8]; # also used for atoms
list @at(0): [SExp];
end record;
 
# Holds a linked list of items
record SExp is
type: uint8;
next: [SExp];
val: SVal;
end record;
 
# Free an S-Expression
sub FreeSExp(exp: [SExp]) is
var stack: [SExp][MAXDEPTH];
stack[0] := exp;
var sp: @indexof stack := 1;
while sp > 0 loop
sp := sp - 1;
exp := stack[sp];
while exp != 0 as [SExp] loop
var next := exp.next;
case exp.type is
when T_ATOM:
Free(exp.val.string);
when T_STRING:
Free(exp.val.string);
when T_LIST:
stack[sp] := exp.val.list;
sp := sp + 1;
end case;
Free(exp as [uint8]);
exp := next;
end loop;
end loop;
end sub;
 
# Build an S-Expression
sub ParseSExp(in: [uint8]): (out: [SExp]) is
out := 0 as [SExp];
sub SkipSpace() is
while ([in] != 0) and ([in] <= 32) loop
in := @next in;
end loop;
end sub;
sub AtomEnd(): (space: [uint8]) is
space := in;
while ([space] > 32)
and ([space] != '(')
and ([space] != ')') loop
space := @next space;
end loop;
end sub;
record Stk is
start: [SExp];
cur: [SExp];
end record;
 
var strbuf: uint8[MAXSTR];
var stridx: @indexof strbuf := 0;
var item: [SExp];
var stack: Stk[MAXDEPTH];
var sp: @indexof stack := 0;
stack[0].start := 0 as [SExp];
stack[0].cur := 0 as [SExp];
sub Store(item: [SExp]) is
if stack[sp].start == 0 as [SExp] then
stack[sp].start := item;
end if;
if stack[sp].cur != 0 as [SExp] then
stack[sp].cur.next := item;
end if;
stack[sp].cur := item;
end sub;
# called on error to clean up memory
sub FreeAll() is
loop
FreeSExp(stack[sp].start);
stack[sp].start := 0;
if sp == 0 then break; end if;
sp := sp - 1;
end loop;
end sub;
loop
SkipSpace();
case [in] is
when 0: break;
when '"':
var escape: uint8 := 0;
stridx := 0;
loop
in := @next in;
if [in] == 0 then break;
elseif escape == 1 then
strbuf[stridx] := [in];
stridx := stridx + 1;
escape := 0;
elseif [in] == '\\' then escape := 1;
elseif [in] == '"' then break;
else
strbuf[stridx] := [in];
stridx := stridx + 1;
end if;
end loop;
if [in] == 0 then
# missing _"_
FreeAll();
return;
end if;
in := @next in;
strbuf[stridx] := 0;
stridx := stridx + 1;
item := Alloc(@bytesof SExp) as [SExp];
item.type := T_STRING;
item.val.string := Alloc(stridx as intptr);
CopyString(&strbuf[0], item.val.string);
Store(item);
when '(':
in := @next in;
sp := sp + 1;
stack[sp].start := 0 as [SExp];
stack[sp].cur := 0 as [SExp];
when ')':
in := @next in;
if sp == 0 then
# stack underflow, error
FreeAll();
return;
else
item := Alloc(@bytesof SExp) as [SExp];
item.type := T_LIST;
item.val.list := stack[sp].start;
sp := sp - 1;
Store(item);
end if;
when else:
var aend := AtomEnd();
item := Alloc(@bytesof SExp) as [SExp];
# if this is a valid integer number then store as number
var ptr: [uint8];
(item.val.number, ptr) := AToI(in);
if ptr == aend then
# a number was parsed and the whole atom consumed
item.type := T_NUMBER;
else
# not a valid integer number, store as atom
item.type := T_ATOM;
var length := aend - in;
item.val.string := Alloc(length + 1);
MemCopy(in, length, item.val.string);
[item.val.string + length] := 0;
end if;
in := aend;
Store(item);
end case;
end loop;
if sp != 0 then
# unterminated list!
FreeAll();
return;
else
# return start of list
out := stack[0].start;
end if;
end sub;
 
# Prettyprint an S-Expression with types
sub prettyprint(sexp: [SExp]) is
sub PrintNum(n: int32) is
var buf: uint8[16];
[IToA(n, 10, &buf[0])] := 0;
print(&buf[0]);
end sub;
sub PrintQuoteStr(s: [uint8]) is
print_char('"');
while [s] != 0 loop
if [s] == '"' or [s] == '\\' then
print_char('\\');
end if;
print_char([s]);
s := @next s;
end loop;
print_char('"');
end sub;
var stack: [SExp][MAXDEPTH];
var sp: @indexof stack := 1;
stack[0] := sexp;
sub Indent(n: @indexof stack) is
while n > 0 loop
print(" ");
n := n - 1;
end loop;
end sub;
loop
sp := sp - 1;
while stack[sp] != 0 as [SExp] loop
Indent(sp);
case stack[sp].type is
when T_ATOM:
print(stack[sp].val.string);
print(" :: Atom");
stack[sp] := stack[sp].next;
when T_STRING:
PrintQuoteStr(stack[sp].val.string);
print(" :: String");
stack[sp] := stack[sp].next;
when T_NUMBER:
PrintNum(stack[sp].val.number);
print(" :: Number");
stack[sp] := stack[sp].next;
when T_LIST:
print_char('(');
sp := sp + 1;
stack[sp] := stack[sp-1].val.list;
stack[sp-1] := stack[sp-1].next;
end case;
print_nl();
end loop;
if sp == 0 then
break;
end if;
Indent(sp-1);
print_char(')');
print_nl();
end loop;
end sub;
 
var str := "((data \"quoted data\" 123 4.5)\n (data (!@# (4.5) \"(more\" \"data)\")))";
 
print("Input:\n");
print(str);
print_nl();
 
print("Parsed:\n");
prettyprint(ParseSExp(str));
print_nl();</syntaxhighlight>
 
{{out}}
 
<pre>Input:
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
Parsed:
(
(
data :: Atom
"quoted data" :: String
123 :: Number
4.5 :: Atom
)
(
data :: Atom
(
!@# :: Atom
(
4.5 :: Atom
)
"(more" :: String
"data)" :: String
)
)
)</pre>
 
=={{header|D}}==
<langsyntaxhighlight lang="d">import std.stdio:, writestd.conv, writeln;std.algorithm, std.variant, std.uni,
import std.conv: textfunctional, parsestd.string;
import std.algorithm: canFind;
import std.variant: Variant;
import std.uni: isAlpha, isNumber, isWhite;
 
alias Sexp = Variant;
Line 860 ⟶ 2,462:
struct Symbol {
private string name;
string toString() @safe const pure nothrow { return name; }
}
 
Sexp parseSexp(in string rawtxt) @safe pure /*nothrow*/ {
static bool isIdentChar(in char c) @safe pure /*nothrow*/ {
return c.isAlpha || "0123456789!@#-".representation.canFind(c);
}
 
size_t pos = 0;
 
while (isWhite(raw[pos])) pos++;
Sexp _parse() /*nothrow*/ {
size_tauto i = pos + 1;
scope (exit)
pos = i;
if (rawtxt[pos] == '"') {
while (rawtxt[i] != '"' && i < rawtxt.length)
i++;
i++;
return Sexp(rawtxt[pos + 1 .. i - 1]);
} else if (isNumber(rawtxt[pos]).isNumber) {
while (isNumber(rawtxt[i]).isNumber && i < rawtxt.length)
i++;
if (rawtxt[i] == '.') {
i++;
while (isNumber(rawtxt[i]).isNumber && i < rawtxt.length)
i++;
returnauto Sexp(parse!double(rawaux = txt[pos .. i])); //
return aux.parse!double.Sexp;
}
returnauto Sexp(parse!ulong(rawaux = txt[pos .. i])); //
return aux.parse!ulong.Sexp;
} else if (isIdentChar(raw[pos])) {
} else whileif (isIdentChar(rawtxt[ipos])) && i < raw.length){
while (isIdentChar(txt[i]) && i < txt.length)
i++;
return Sexp(Symbol(rawtxt[pos .. i]));
} else if (rawtxt[pos] == '(') {
Sexp[] lst;
while (rawtxt[i] != ')') {
while (isWhite(rawtxt[i]).isWhite)
i++;
pos = i;
lst ~= _parse();
i = pos;
while (isWhite(rawtxt[i]).isWhite)
i++;
}
Line 909 ⟶ 2,513:
return Sexp(null);
}
 
return _parse();
txt = txt.find!(not!isWhite);
return _parse;
}
 
void writeSexp(Sexp expr) {
if (expr.type == typeid(string)) {
write('"\"', expr, '"');
write(expr);
write("\"");
} else if (expr.type == typeid(Sexp[])) {
'('.write("(");
auto arr = expr.get!(Sexp[]);
foreach (immutable i, e; arr) {
writeSexp(e).writeSexp;
if (i + 1 < arr.length)
write("' ")'.write;
}
')'.write(")");
} else {
write(expr).write;
}
}
 
void main() {
auto testpTest = `((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))`.parseSexp;
auto pTest = parseSexp(test);
writeln("Parsed: ", pTest);
write("Printed: ").write;
writeSexp(pTest).writeSexp;
}</syntaxhighlight>
writeln();
}</lang>
{{out}}
<pre>Parsed: [[data, quoted data, 123, 4.5], [data, [!@#, [4.5], (more, data)]]]
Printed: ((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))</pre>
 
=={{header|EchoLisp}}==
The '''(read-from-string input-string)''' function parses a string into an s-expression, which is the native representation of program/data in EchoLisp and the majority of Lisps .
<syntaxhighlight lang="lisp">
(define input-string #'((data "quoted data" 123 4.5)\n(data (!@# (4.5) "(more" "data)")))'#)
 
input-string
→ "((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))"
 
(define s-expr (read-from-string input-string))
s-expr
→ ((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))
 
(first s-expr)
→ (data "quoted data" 123 4.5)
(first(first s-expr))
→ data
(first(rest s-expr))
→ (data (!@# (4.5) "(more" "data)"))
</syntaxhighlight>
 
=={{header|F_Sharp|F#}}==
 
Implementation of S-expression parser in F# 4.7 language.
 
Visual Studio COmmunity 2019 Edition - Version 16.4.5.
 
Learn more about F# at https://fsharp.org
 
 
The file <code>SExpr.fs</code> containing the implementation:
 
<syntaxhighlight lang="fsharp">
module SExpr
(* This module is a very simple port of the OCaml version to F# (F-Sharp) *)
(* The original OCaml setatment is comment out and the F# statement(s) follow *)
(* Port performed by Bob Elward 23 Feb 2020 *)
 
(* The .Net standard would use "+" and not "^" for string concatenation *)
(* I kept the "^" to be compatable with the "ml" standard *)
(* The line below eliminates the warning/suggestion to use "+" *)
#nowarn "62"
 
(** This module is a very simple parsing library for S-expressions. *)
(* Copyright (C) 2009 Florent Monnier, released under MIT license. *)
(* modified to match the task description *)
 
(* Code obtained from: https://rosettacode.org/wiki/S-Expressions#OCaml *)
(* Note: The type below defines the grammar for this S-Expression (S-Expr).
An S-Expr is either an Atom or an S-Expr
*)
 
open System.Text
open System.IO
open System
 
type sexpr = Atom of string | Expr of sexpr list
 
type state =
| Parse_root of sexpr list
| Parse_content of sexpr list
| Parse_word of StringBuilder * sexpr list
| Parse_string of bool * StringBuilder * sexpr list
 
let parse pop_char =
let rec aux st =
match pop_char() with
| None ->
begin match st with
| Parse_root sl -> (List.rev sl)
| Parse_content _
| Parse_word _
| Parse_string _ ->
failwith "Parsing error: content not closed by parenthesis"
end
| Some c ->
match c with
| '(' ->
begin match st with
| Parse_root sl ->
let this = aux(Parse_content []) in
aux(Parse_root((Expr this)::sl))
| Parse_content sl ->
let this = aux(Parse_content []) in
aux(Parse_content((Expr this)::sl))
| Parse_word(w, sl) ->
let this = aux(Parse_content []) in
(* aux(Parse_content((Expr this)::Atom(Buffer.contents w)::sl)) *)
aux(Parse_content((Expr this)::Atom(w.ToString())::sl))
| Parse_string(_, s, sl) ->
//Buffer.add_char s c;
s.Append(c) |> ignore;
aux(Parse_string(false, s, sl))
end
| ')' ->
begin match st with
| Parse_root sl ->
failwith "Parsing error: closing parenthesis without openning"
| Parse_content sl -> (List.rev sl)
(* | Parse_word(w, sl) -> List.rev(Atom(Buffer.contents w)::sl) *)
| Parse_word(w, sl) -> List.rev(Atom(w.ToString())::sl)
| Parse_string(_, s, sl) ->
(* Buffer.add_char s c; *)
s.Append(c) |> ignore;
aux(Parse_string(false, s, sl))
end
| ' ' | '\n' | '\r' | '\t' ->
begin match st with
| Parse_root sl -> aux(Parse_root sl)
| Parse_content sl -> aux(Parse_content sl)
(* | Parse_word(w, sl) -> aux(Parse_content(Atom(Buffer.contents w)::sl)) *)
| Parse_word(w, sl) -> aux(Parse_content(Atom(w.ToString())::sl))
| Parse_string(_, s, sl) ->
//Buffer.add_char s c;
s.Append(c) |> ignore;
aux(Parse_string(false, s, sl))
end
| '"' ->
(* '"' *)
begin match st with
| Parse_root _ -> failwith "Parse error: double quote at root level"
| Parse_content sl ->
(* let s = Buffer.create 74 in *)
let s = StringBuilder(74) in
aux(Parse_string(false, s, sl))
| Parse_word(w, sl) ->
(* let s = Buffer.create 74 in *)
let s = StringBuilder(74) in
(* aux(Parse_string(false, s, Atom(Buffer.contents w)::sl)) *)
aux(Parse_string(false, s, Atom(w.ToString())::sl))
| Parse_string(true, s, sl) ->
(* Buffer.add_char s c; *)
s.Append(c) |> ignore;
aux(Parse_string(false, s, sl))
| Parse_string(false, s, sl) ->
(* aux(Parse_content(Atom(Buffer.contents s)::sl)) *)
aux(Parse_content(Atom(s.ToString())::sl))
end
| '\\' ->
begin match st with
| Parse_string(true, s, sl) ->
(* Buffer.add_char s c; *)
s.Append(c) |> ignore;
aux(Parse_string(false, s, sl))
| Parse_string(false, s, sl) ->
aux(Parse_string(true, s, sl))
| _ ->
failwith "Parsing error: escape character in wrong place"
end
| _ ->
begin match st with
| Parse_root _ ->
failwith(Printf.sprintf "Parsing error: char '%c' at root level" c)
| Parse_content sl ->
(* let w = Buffer.create 16 in *)
let w = StringBuilder(16) in
(* Buffer.add_char w c; *)
w.Append(c) |> ignore;
aux(Parse_word(w, sl))
| Parse_word(w, sl) ->
(* Buffer.add_char w c; *)
w.Append(c) |> ignore;
aux(Parse_word(w, sl))
| Parse_string(_, s, sl) ->
(* Buffer.add_char s c; *)
s.Append(c) |> ignore;
aux(Parse_string(false, s, sl))
end
in
aux (Parse_root [])
 
 
let string_pop_char str =
let len = String.length str in
let i = ref(-1) in
(function () -> incr i; if !i >= len then None else Some(str.[!i]))
 
 
let parse_string str =
parse (string_pop_char str)
 
(*
let ic_pop_char ic =
(function () ->
try Some(input_char ic)
with End_of_file -> (None))
*)
 
let ic_pop_char (ic:TextReader) =
(function () -> try Some(Convert.ToChar(ic.Read()))
with _End_of_file -> (None)
)
 
 
let parse_ic ic =
parse (ic_pop_char ic)
 
 
let parse_file filename =
(* let ic = open_in filename in *)
let ic = File.OpenText filename in
let res = parse_ic ic in
(* close_in ic; *)
ic.Close();
(res)
 
 
let quote s =
"\"" ^ s ^ "\""
 
let needs_quote s =
(* List.exists (String.contains s) [' '; '\n'; '\r'; '\t'; '('; ')'] *)
List.exists (fun elem -> (String.exists (fun c -> c = elem) s)) [' '; '\n'; '\r'; '\t'; '('; ')']
 
let protect s =
(* There is no need to "escape" .Net strings the framework takes care of this *)
(* let s = String.escaped s in *)
if needs_quote s then quote s else s
 
 
let string_of_sexpr s =
let rec aux acc = function
| (Atom tag)::tl -> aux ((protect tag)::acc) tl
| (Expr e)::tl ->
let s =
"(" ^
(String.concat " " (aux [] e))
^ ")"
in
aux (s::acc) tl
| [] -> (List.rev acc)
in
String.concat " " (aux [] s)
 
 
let print_sexpr s =
(* print_endline (string_of_sexpr s) *)
printfn "%s" (string_of_sexpr s)
 
 
let string_of_sexpr_indent s =
let rec aux i acc = function
| (Atom tag)::tl -> aux i ((protect tag)::acc) tl
| (Expr e)::tl ->
let s =
(*
"\n" ^ (String.make i ' ') ^ "(" ^
(String.concat " " (aux (succ i) [] e))
^ ")"
*)
"\n" ^ (String.replicate i " ") ^ "(" ^
(String.concat " " (aux (i + 1) [] e))
^ ")"
in
aux i (s::acc) tl
| [] -> (List.rev acc)
in
String.concat "\n" (aux 0 [] s)
 
 
let print_sexpr_indent s =
(* print_endline (string_of_sexpr_indent s) *)
printfn "%s" (string_of_sexpr_indent s)
</syntaxhighlight>
 
 
The file <code>Program.fs</code> containing the main module to
call the S-Expression parser and printer. Two options are shown:
Read the experession from a file of preset it in the code.
 
<syntaxhighlight lang="fsharp">
module Program
(* Learn more about F# at https://fsharp.org *)
 
open System
 
[<EntryPoint>]
let main argv =
let sexpr =
(* Data from file supplied at runtime or a preset string? *)
if argv.Length > 0 then
(* Data from file supplied at runtime *)
begin
(* Get the file to parse *)
let name = argv.[0] in
(* parse the program file *)
SExpr.parse_file name
end
else
(* Data from a preset string *)
begin
(* preset the string *)
let data= "((data \"quoted data\" 123 4.5) (data (!@# (4.5) \"(more\" \"data)\")))" in
 
(* parse the program file *)
SExpr.parse_string data
end
(* Print the parsed program token list *)
(printf "\nSExpr: \n");
SExpr.print_sexpr sexpr;
(printf "\nSExpr - Indented: \n");
SExpr.print_sexpr_indent sexpr;
(* return an integer exit code *)
0
</syntaxhighlight>
 
{{out}}
<pre>
 
SExpr:
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))
 
SExpr - Indented:
 
(
(data "quoted data" 123 4.5)
(data
(!@#
(4.5) "(more" "data)")))
</pre>
 
=={{header|Factor}}==
Factor's nested sequences are close enough to s-expressions that in most cases we can simply <code>eval</code> s-expression strings after some minor whitespace/bracket/parenthesis transformations. However, if we wish to support symbols, this approach becomes complicated because symbols need to be declared before use. This means we need to go into the string and identify them, so we may as well parse the s-expression properly while we're there.
 
We have a nice tool at our disposal for doing this. In its standard library, Factor contains a domain-specific language for defining [[wp:extended Backus-Naur form|extended Backus-Naur form]] (EBNF) grammars. EBNF is a convenient, declarative way to describe different parts of a grammar where later rules build on earlier ones until the final rule defines the entire grammar. Upon calling the word defined by <code>EBNF:</code>, an input string will be tokenized according to the declared rules and stored in an abstract syntax tree.
 
To get an idea of how this works, look at the final rule. It declares that an s-expression is any number of objects (comprised of numbers, floats, strings, and symbols) and s-expressions (the rule is recursive, allowing for nested s-expressions) surrounded by parenthesis which are in turn surrounded by any amount of whitespace. This implementation of EBNF allows us to define actions: the quotation after the <code>=></code> is called on the rule token just before being added to the abstract syntax tree. This is convenient for our use case where we need to parse different types of objects into our sequence structure.
 
Factor has a comprehensive prettyprinter which can print any Factor object in a readable way. Not only can we leverage it to easily print our native data structure, but we can also call <code>unparse</code> to convert it to a string. This leaves us with a string reminiscent of the original input, and we are able to take it the rest of the way with two simple regular expressions.
 
<syntaxhighlight lang="factor">USING: formatting kernel math.parser multiline peg peg.ebnf
regexp sequences prettyprint words ;
IN: rosetta-code.s-expressions
 
STRING: input
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
;
 
EBNF: sexp>seq [=[
ws = [\n\t\r ]* => [[ drop ignore ]]
digits = [0-9]+
number = digits => [[ string>number ]]
float = digits:a "." digits:b => [[ a b "." glue string>number ]]
string = '"'~ [^"]* '"'~ => [[ "" like ]]
symbol = [!#-'*-~]+ => [[ "" like <uninterned-word> ]]
object = ws ( float | number | string | symbol ) ws
sexp = ws "("~ ( object | sexp )* ")"~ ws => [[ { } like ]]
]=]
 
: seq>sexp ( seq -- str )
unparse R/ {\s+/ "(" R/ \s+}/ ")" [ re-replace ] 2bi@ ;
input [ "Input:\n%s\n\n" printf ] [
sexp>seq dup seq>sexp
"Native:\n%u\n\nRound trip:\n%s\n" printf
] bi</syntaxhighlight>
{{out}}
<pre>
Input:
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
 
Native:
{
{ data "quoted data" 123 4.5 }
{ data { !@# { 4.5 } "(more" "data)" } }
}
 
Round trip:
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
</pre>
 
=={{header|Go}}==
<langsyntaxhighlight lang="go">package main
 
import (
Line 1,118 ⟶ 3,099:
fmt.Println(s.i)
}
}</langsyntaxhighlight>
{{out}}
Output:
<pre>
input:
Line 1,144 ⟶ 3,125:
main.qString: "data)"
</pre>
 
=={{header|Haskell}}==
<syntaxhighlight lang="haskell">import qualified Data.Functor.Identity as F
import qualified Text.Parsec.Prim as Prim
import Text.Parsec
((<|>), (<?>), many, many1, char, try, parse, sepBy, choice,
between)
import Text.Parsec.Token
(integer, float, whiteSpace, stringLiteral, makeTokenParser)
import Text.Parsec.Char (noneOf)
import Text.Parsec.Language (haskell)
 
data Val
= Int Integer
| Float Double
| String String
| Symbol String
| List [Val]
deriving (Eq, Show)
 
tProg :: Prim.ParsecT String a F.Identity [Val]
tProg = many tExpr <?> "program"
where
tExpr = between ws ws (tList <|> tAtom) <?> "expression"
ws = whiteSpace haskell
tAtom =
(try (Float <$> float haskell) <?> "floating point number") <|>
(try (Int <$> integer haskell) <?> "integer") <|>
(String <$> stringLiteral haskell <?> "string") <|>
(Symbol <$> many1 (noneOf "()\"\t\n\r ") <?> "symbol") <?>
"atomic expression"
tList = List <$> between (char '(') (char ')') (many tExpr) <?> "list"
 
p :: String -> IO ()
p = either print (putStrLn . unwords . map show) . parse tProg ""
 
main :: IO ()
main = do
let expr =
"((data \"quoted data\" 123 4.5)\n (data (!@# (4.5) \"(more\" \"data)\")))"
putStrLn ("The input:\n" ++ expr ++ "\n\nParsed as:")
p expr</syntaxhighlight>
{{Out}}
<pre>The input:
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
 
Parsed as:
List [List [Symbol "data",String "quoted data",Int 123,Float 4.5],List [Symbol "data",List [Symbol "!@#",List [Float 4.5],String "(more",String "data)"]]]</pre>
 
 
Or, parsing by hand (rather than with a parser combinator library) and printing a parse tree diagram:
<syntaxhighlight lang="haskell">{-# LANGUAGE TupleSections #-}
import Data.Bifunctor (bimap)
import Data.List (mapAccumL)
import Data.List.Split (splitOn)
import Data.Maybe (catMaybes, fromMaybe, listToMaybe)
import Data.Tree (Forest, Tree (..), drawForest)
------------------------ DATA TYPE -----------------------
data Val
= Int Integer
| Float Double
| String String
| Symbol String
| List [Val]
deriving (Eq, Show, Read)
instance Semigroup Val where
List a <> List b = List (a <> b)
instance Monoid Val where
mempty = List []
--------------------------- MAIN -------------------------
main :: IO ()
main = do
let expr =
unlines
[ "((data \"quoted data\" 123 4.5)",
" (data (!@# (4.5) \"(more\" \"data)\")))"
]
parse = fst (parseExpr (tokenized expr))
putStrLn $ treeDiagram $ forestFromVal parse
putStrLn "Serialized from the parse tree:\n"
putStrLn $ litVal parse
------------------- S-EXPRESSION PARSER ------------------
parseExpr :: [String] -> (Val, [String])
parseExpr = until finished parseToken . (mempty,)
finished :: (Val, [String]) -> Bool
finished (_, []) = True
finished (_, token : _) = ")" == token
parseToken :: (Val, [String]) -> (Val, [String])
parseToken (v, "(" : rest) =
bimap
((v <>) . List . return)
tail
(parseExpr rest)
parseToken (v, ")" : rest) = (v, rest)
parseToken (v, t : rest) = (v <> List [atom t], rest)
----------------------- TOKEN PARSER ---------------------
atom :: String -> Val
atom [] = mempty
atom s@('"' : _) =
fromMaybe mempty (maybeRead ("String " <> s))
atom s =
headDef (Symbol s) $
catMaybes $
maybeRead . (<> (' ' : s)) <$> ["Int", "Float"]
maybeRead :: String -> Maybe Val
maybeRead = fmap fst . listToMaybe . reads
----------------------- TOKENIZATION ---------------------
tokenized :: String -> [String]
tokenized s = quoteTokens '"' s >>= go
where
go [] = []
go token@('"' : _) = [token]
go s = words $ spacedBrackets s
quoteTokens :: Char -> String -> [String]
quoteTokens q s = snd $ mapAccumL go False (splitOn [q] s)
where
go b s
| b = (False, '"' : s <> "\"")
| otherwise = (True, s)
spacedBrackets :: String -> String
spacedBrackets [] = []
spacedBrackets (c : cs)
| c `elem` "()" = ' ' : c : " " <> spacedBrackets cs
| otherwise = c : spacedBrackets cs
------------------------- DIAGRAMS -----------------------
treeDiagram :: Forest Val -> String
treeDiagram = drawForest . fmap (fmap show)
forestFromVal :: Val -> Forest Val
forestFromVal (List xs) = treeFromVal <$> xs
treeFromVal :: Val -> Tree Val
treeFromVal (List xs) =
Node (Symbol "List") (treeFromVal <$> xs)
treeFromVal v = Node v []
---------------------- SERIALISATION ---------------------
litVal (Symbol x) = x
litVal (Int x) = show x
litVal (Float x) = show x
litVal (String x) = '"' : x <> "\""
litVal (List [List xs]) = litVal (List xs)
litVal (List xs) = '(' : (unwords (litVal <$> xs) <> ")")
------------------------- GENERIC ------------------------
headDef :: a -> [a] -> a
headDef d [] = d
headDef _ (x : _) = x</syntaxhighlight>
{{Out}}
<pre>Symbol "List"
|
+- Symbol "List"
| |
| +- Symbol "data"
| |
| +- String "quoted data"
| |
| +- Int 123
| |
| `- Float 4.5
|
`- Symbol "List"
|
+- Symbol "data"
|
`- Symbol "List"
|
+- Symbol "!@#"
|
+- Symbol "List"
| |
| `- Float 4.5
|
+- String "(more"
|
`- String "data)"
 
 
Serialized from the parse tree:
 
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))</pre>
 
=={{header|Icon}} and {{header|Unicon}}==
The following should suffice as a demonstration. <br> String escaping and quoting could be handled more robustly. The example takes single and double qoutes. Single quotes were used instead of doubles in the input.
String escaping and quoting could be handled more robustly.
<lang Icon>link ximage
The example takes single and double quotes. <br>
Single quotes were used instead of doubles in the input.
<syntaxhighlight lang="icon">link ximage
 
procedure main()
Line 1,206 ⟶ 3,394:
}
return T
end</langsyntaxhighlight>
 
{{libheader|Icon Programming Library}}
[http://www.cs.arizona.edu/icon/library/src/procs/ximage.icn ximage.icn formats arbitrary structures into printable strings]
 
{{Out}}
Output:<pre>Input: "((data 'quoted data' 123 4.5) (data (!@# (4.5) '(more' 'data)')))"
<pre>Input: "((data 'quoted data' 123 4.5) (data (!@# (4.5) '(more' 'data)')))"
Structure:
L2 := list(1)
Line 1,236 ⟶ 3,425:
This implementation does not support escape characters. If escape characters were added, we would need additional support in the tokenizer (an extra character class, and in the state table an extra column and two extra rows, or almost double the number of state transitions: 35 instead of 20), and additional support in the data language (unfmt would need to strip out escape characters and fmt would need to insert escape characters -- so each of these routines would also perhaps double in size.) And that's a lot of bulk for serialize/deserialize mechanism which, by design, cannot represent frequently used data elements (such as matrices and gerunds).
 
<langsyntaxhighlight lang="j">NB. character classes: 0: paren, 1: quote, 2: whitespace, 3: wordforming (default)
chrMap=: '()';'"';' ',LF,TAB,CR
 
NB. state columns correspond to the above character classes
NB. first digit chooses next state.
NB. second digit is action 0: do nothing, 1: start wordtoken, 2: end wordtoken
states=: 10 10#: ".;._2]0 :0
11 21 00 31 NB. state 0: initial state
Line 1,287 ⟶ 3,476:
 
readSexpr=: fmt L:0 @rdSexpr :.writeSexpr
writeSexpr=: wrSexpr @(unfmt L:0) :.readSexpr</langsyntaxhighlight>
 
 
Example use:
 
<langsyntaxhighlight lang="j"> readSexpr '((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))'
┌───────────────────────────┬────────────────────────────────┐
│┌─────┬───────────┬───┬───┐│┌─────┬────────────────────────┐│
Line 1,303 ⟶ 3,492:
└───────────────────────────┴────────────────────────────────┘
writeSexpr readSexpr '((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))'
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))</langsyntaxhighlight>
 
=={{header|Haskell}}==
<lang haskell>
import Text.ParserCombinators.Parsec ((<|>), (<?>), many, many1, char, try, parse, sepBy, choice)
import Text.ParserCombinators.Parsec.Char (noneOf)
import Text.ParserCombinators.Parsec.Token (integer, float, whiteSpace, stringLiteral, makeTokenParser)
import Text.ParserCombinators.Parsec.Language (haskellDef)
 
 
data Val = Int Integer
| Float Double
| String String
| Symbol String
| List [Val] deriving (Eq, Show)
 
lexer = makeTokenParser haskellDef
 
tInteger = (integer lexer) >>= (return . Int) <?> "integer"
 
tFloat = (float lexer) >>= (return . Float) <?> "floating point number"
 
tString = (stringLiteral lexer) >>= (return . String) <?> "string"
 
tSymbol = (many1 $ noneOf "()\" \t\n\r") >>= (return . Symbol) <?> "symbol"
 
tAtom = choice [try tFloat, try tInteger, tSymbol, tString] <?> "atomic expression"
 
tExpr = do
whiteSpace lexer
expr <- tList <|> tAtom
whiteSpace lexer
return expr
<?> "expression"
 
tList = do
char '('
list <- many tExpr
char ')'
return $ List list
<?> "list"
 
tProg = many tExpr <?> "program"
 
p ex = case parse tProg "" ex of
Right x -> putStrLn $ unwords $ map show x
Left err -> print err
 
main = do
let expr = "((data \"quoted data\" 123 4.5)\n (data (!@# (4.5) \"(more\" \"data)\")))"
putStrLn $ "The input:\n" ++ expr ++ "\n"
putStr "Parsed as:\n"
p expr
</lang>
 
=={{header|Java}}==
Line 1,366 ⟶ 3,502:
 
====LispTokenizer.java====
<langsyntaxhighlight lang="java">package jfkbits;
 
import java.io.BufferedReader;
Line 1,476 ⟶ 3,612:
{
}
}</langsyntaxhighlight>
 
====Token.java====
<langsyntaxhighlight lang="java">package jfkbits;
import java.io.StreamTokenizer;
 
Line 1,507 ⟶ 3,643:
}
}
}</langsyntaxhighlight>
 
====Atom.java====
<langsyntaxhighlight lang="java">package jfkbits;
 
import jfkbits.LispParser.Expr;
Line 1,526 ⟶ 3,662:
}
 
}</langsyntaxhighlight>
 
====StringAtom.java====
<langsyntaxhighlight lang="java">package jfkbits;
 
public class StringAtom extends Atom
Line 1,549 ⟶ 3,685:
}
}
</syntaxhighlight>
</lang>
 
====ExprList.java====
<langsyntaxhighlight lang="java">package jfkbits;
 
import java.util.AbstractCollection;
Line 1,622 ⟶ 3,758:
}
 
}</langsyntaxhighlight>
 
====LispParser.java====
<langsyntaxhighlight lang="java">package jfkbits;
 
 
Line 1,672 ⟶ 3,808:
 
}
</syntaxhighlight>
</lang>
 
====LispParserDemo.java====
<langsyntaxhighlight lang="java">import jfkbits.ExprList;
import jfkbits.LispParser;
import jfkbits.LispParser.ParseException;
Line 1,700 ⟶ 3,836:
}
}
}</langsyntaxhighlight>
 
=={{header|JavaScript}}==
(for a '''bug-fix''' concerning \" and \n in strings see the [[Talk:S-expressions#JavaScript_version_bugfix_for_%5C%22_and_%5Cn_in_strings|Discussion]])
===Procedural===
<syntaxhighlight lang="javascript">String.prototype.parseSexpr = function() {
var t = this.match(/\s*("[^"]*"|\(|\)|"|[^\s()"]+)/g)
for (var o, c=0, i=t.length-1; i>=0; i--) {
var n, ti = t[i].trim()
if (ti == '"') return
else if (ti == '(') t[i]='[', c+=1
else if (ti == ')') t[i]=']', c-=1
else if ((n=+ti) == ti) t[i]=n
else t[i] = '\'' + ti.replace('\'', '\\\'') + '\''
if (i>0 && ti!=']' && t[i-1].trim()!='(' ) t.splice(i,0, ',')
if (!c) if (!o) o=true; else return
}
return c ? undefined : eval(t.join(''))
}
 
Array.prototype.toString = function() {
var s=''; for (var i=0, e=this.length; i<e; i++) s+=(s?' ':'')+this[i]
return '('+s+')'
}
 
Array.prototype.toPretty = function(s) {
if (!s) s = ''
var r = s + '(<br>'
var s2 = s + Array(6).join('&nbsp;')
for (var i=0, e=this.length; i<e; i+=1) {
var ai = this[i]
r += ai.constructor != Array ? s2+ai+'<br>' : ai.toPretty(s2)
}
return r + s + ')<br>'
}
 
var str = '((data "quoted data" 123 4.5)\n (data (!@# (4.5) "(more" "data)")))'
document.write('text:<br>', str.replace(/\n/g,'<br>').replace(/ /g,'&nbsp;'), '<br><br>')
var sexpr = str.parseSexpr()
if (sexpr === undefined)
document.write('Invalid s-expr!', '<br>')
else
document.write('s-expr:<br>', sexpr, '<br><br>', sexpr.constructor != Array ? '' : 'pretty print:<br>' + sexpr.toPretty())</syntaxhighlight>
{{out}}
<pre>text:
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
 
s-expr:
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))
 
pretty print:
(
(
data
"quoted data"
123
4.5
)
(
data
(
!@#
(
4.5
)
"(more"
"data)"
)
)
)</pre>
 
 
===Functional===
Showing the parse tree in an indented JSON format, and writing out a reserialization:
<syntaxhighlight lang="javascript">(() => {
"use strict";
 
// ------------------ S-EXPRESSIONS ------------------
const main = () => {
const expr = [
"((data \"quoted data\" 123 4.5)",
" (data (!@# (4.5) \"(more\" \"data)\")))"
]
.join("\n");
 
const [parse, residue] = parseExpr(
tokenized(expr)
);
 
return 0 < residue.length
? `Unparsed tokens: ${JSON.stringify(residue)}`
: 0 < parse.length
? [
JSON.stringify(parse, null, 2),
"Reserialized from parse:",
parse.map(serialized).join(" ")
]
.join("\n\n")
: "Could not be parsed";
};
 
// ---------------- EXPRESSION PARSER ----------------
 
// parseExpr [String] -> ([Expr], [String])
const parseExpr = tokens =>
// A tuple of (parsed trees, residual tokens)
// derived from a list of tokens.
until(finished)(readToken)([
[], tokens
]);
 
 
// finished :: ([Expr], [String]) -> Bool
const finished = ([, tokens]) =>
// True if no tokens remain, or the next
// closes a sub-expression.
0 === tokens.length || ")" === tokens[0];
 
 
// readToken :: ([Expr], [String]) -> ([Expr], [String])
const readToken = ([xs, tokens]) => {
// A tuple of enriched expressions and
// depleted tokens.
const [token, ...ts] = tokens;
 
// An open bracket introduces recursion over
// a sub-expression to define a sub-list.
return "(" === token
? (() => {
const [expr, rest] = parseExpr(ts);
 
return [xs.concat([expr]), rest.slice(1)];
})()
: ")" === token
? [xs, token]
: [xs.concat(atom(token)), ts];
};
 
// ------------------- ATOM PARSER -------------------
 
// atom :: String -> Expr
const atom = s =>
0 < s.length
? isNaN(s)
? "\"'".includes(s[0])
? s.slice(1, -1)
: {name: s}
: parseFloat(s, 10)
: "";
 
 
// ------------------ TOKENIZATION -------------------
 
// tokenized :: String -> [String]
const tokenized = s =>
// Brackets and quoted or unquoted atomic strings.
quoteTokens("\"")(s).flatMap(
segment => "\"" !== segment[0]
? segment.replace(/([()])/gu, " $1 ")
.split(/\s+/u)
.filter(Boolean)
: [segment]
);
 
 
// quoteTokens :: Char -> String -> [String]
const quoteTokens = q =>
// Alternating unquoted and quoted segments.
s => s.split(q).flatMap(
(k, i) => even(i)
? 0 < k.length
? [k]
: []
: [`${q}${k}${q}`]
);
 
// ------------------ SERIALIZATION ------------------
 
// serialized :: Expr -> String
const serialized = e => {
const t = typeof e;
 
return "number" === t
? `${e}`
: "string" === t
? `"${e}"`
: "object" === t
? Array.isArray(e)
? `(${e.map(serialized).join(" ")})`
: e.name
: "?";
};
 
 
// --------------------- GENERIC ---------------------
 
// even :: Int -> Bool
const even = n =>
// True if 2 is a factor of n.
0 === n % 2;
 
 
// until :: (a -> Bool) -> (a -> a) -> a -> a
const until = p =>
// The value resulting from repeated applications
// of f to the seed value x, terminating when
// that result returns true for the predicate p.
f => {
const go = x =>
p(x)
? x
: go(f(x));
 
return go;
};
 
return main();
})();</syntaxhighlight>
{{Out}}
<pre>[
[
[
{
"name": "data"
},
"quoted data",
123,
4.5
],
[
{
"name": "data"
},
[
{
"name": "!@#"
},
[
4.5
],
"(more",
"data)"
]
]
]
]
 
Reserialized from parse:
 
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))</pre>
 
=={{header|jq}}==
{{works with|jq}}
'''Also works with gojq, the Go implementation of jq'''
[[Category:PEG]]
This entry is based on a Parsing Expression Grammar (PEG) for S-expressions.
The idea is to pass a JSON object `{remainder:_, result:_ }` through a
jq pipeline corresponding to a PEG for S-expressions, consuming the
text in `.remainder` and building up `.result`.
For further details about this approach, see e.g.
[https://github.com/stedolan/jq/wiki/Parsing-Expression-Grammars jq as a PEG Engine].
<syntaxhighlight lang=jq>
# PEG infrastructure
def star(E): ((E | star(E)) // .) ;
 
### Helper functions:
# Consume a regular expression rooted at the start of .remainder, or emit empty;
# on success, update .remainder and set .match but do NOT update .result
def consume($re):
# on failure, match yields empty
(.remainder | match("^" + $re)) as $match
| .remainder |= .[$match.length :]
| .match = $match.string;
 
def parse($re):
consume($re)
| .result = .result + [.match] ;
 
def parseNumber($re):
consume($re)
| .result = .result + [.match|tonumber] ;
 
def eos: select(.remainder == "");
 
# whitespace
def ws: consume("[ \t\r\n]*");
 
def box(E):
((.result = null) | E) as $e
| .remainder = $e.remainder
| .result += [$e.result] # the magic sauce
;
 
# S-expressions
 
# Input: a string
# Output: an array representation of the input if it is an S-expression
def SExpression:
def string: consume("\"") | parse("[^\"]") | consume("\"");
def identifier: parse("[^ \t\n\r()]+");
def decimal: parseNumber("[0-9]+([.][0-9]*)?");
def hex: parse("0x[0-9A-Fa-f]+") ;
def number: hex // decimal;
def atom: ws | (string // number // identifier);
 
def SExpr: ws | consume("[(]") | ws | box(star(atom // SExpr)) | consume("[)]");
 
{remainder: .} | SExpr | ws | eos | .result;
 
SExpression
</syntaxhighlight>
'''Invocation:'''
<pre>
cat << EOF |
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
EOF
jq -Rsc -f s-expression.jq
</pre>
{{output}}
<pre>
[[["data","\"quoted","data\"",123,4.5],["data",["!@#",[4.5],"\"",["more\"","\"data"],"\""]]]]
</pre>
 
=={{header|Julia}}==
<syntaxhighlight lang="julia">
function rewritequotedparen(s)
segments = split(s, "\"")
for i in 1:length(segments)
if i & 1 == 0 # even i
ret = replace(segments[i], r"\(", s"_O_PAREN")
segments[i] = replace(ret, r"\)", s"_C_PAREN")
end
end
join(segments, "\"")
end
 
function reconsdata(n, s)
if n > 1
print(" ")
end
if s isa String && ismatch(r"[\$\%\!\$\#]", s) == false
print("\"$s\"")
else
print(s)
end
end
 
function printAny(anyarr)
print("(")
for (i, el) in enumerate(anyarr)
if el isa Array
print("(")
for (j, el2) in enumerate(el)
if el2 isa Array
print("(")
for(k, el3) in enumerate(el2)
if el3 isa Array
print(" (")
for(n, el4) in enumerate(el3)
reconsdata(n, el4)
end
print(")")
else
reconsdata(k, el3)
end
end
print(")")
else
reconsdata(j, el2)
end
end
if i == 1
print(")\n ")
else
print(")")
end
end
end
println(")")
end
 
removewhitespace(s) = replace(replace(s, r"\n", " "), r"^\s*(\S.*\S)\s*$", s"\1")
quote3op(s) = replace(s, r"([\$\!\@\#\%]{3})", s"\"\1\"")
paren2bracket(s) = replace(replace(s, r"\(", s"["), r"\)", s"]")
data2symbol(s) = replace(s, "[data", "[:data")
unrewriteparens(s) = replace(replace(s, "_C_PAREN", ")"), "_O_PAREN", "(")
addcommas(s) = replace(replace(s, r"\]\s*\[", "],["), r" (?![a-z])", ",")
 
inputstring = """
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
"""
 
println("The input string is:\n", inputstring)
processed = (inputstring |> removewhitespace |> rewritequotedparen |> quote3op
|> paren2bracket |> data2symbol |> unrewriteparens |> addcommas)
nat = eval(parse("""$processed"""))
println("The processed native structure is:\n", nat)
println("The reconstructed string is:\n"), printAny(nat)
</syntaxhighlight>
{{output}}<pre>
The input string is:
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
 
The processed native structure is:
Array{Any,1}[Any[:data, "quoted data", 123, 4.5], Any[:data, Any["!@#", [4.5], "(more", "data)"]]]
 
The reconstructed string is:
((data "quoted data" 123 4.5)
(data(!@# (4.5) "(more" "data)")))
</pre>
 
=={{header|Kotlin}}==
{{trans|JavaScript}}
<syntaxhighlight lang="groovy">// version 1.2.31
 
const val INDENT = 2
 
fun String.parseSExpr(): List<String>? {
val r = Regex("""\s*("[^"]*"|\(|\)|"|[^\s()"]+)""")
val t = r.findAll(this).map { it.value }.toMutableList()
if (t.size == 0) return null
var o = false
var c = 0
for (i in t.size - 1 downTo 0) {
val ti = t[i].trim()
val nd = ti.toDoubleOrNull()
if (ti == "\"") return null
if (ti == "(") {
t[i] = "["
c++
}
else if (ti == ")") {
t[i] = "]"
c--
}
else if (nd != null) {
val ni = ti.toIntOrNull()
if (ni != null) t[i] = ni.toString()
else t[i] = nd.toString()
}
else if (ti.startsWith("\"")) { // escape embedded double quotes
var temp = ti.drop(1).dropLast(1)
t[i] = "\"" + temp.replace("\"", "\\\"") + "\""
}
if (i > 0 && t[i] != "]" && t[i - 1].trim() != "(") t.add(i, ", ")
if (c == 0) {
if (!o) o = true else return null
}
}
return if (c != 0) null else t
}
 
fun MutableList<String>.toSExpr(): String {
for (i in 0 until this.size) {
this[i] = when (this[i]) {
"[" -> "("
"]" -> ")"
", " -> " "
else -> {
if (this[i].startsWith("\"")) { // unescape embedded quotes
var temp = this[i].drop(1).dropLast(1)
"\"" + temp.replace("\\\"", "\"") + "\""
}
else this[i]
}
}
}
return this.joinToString("")
}
 
fun List<String>.prettyPrint() {
var level = 0
loop@for (t in this) {
var n: Int
when(t) {
", ", " " -> continue@loop
"[", "(" -> {
n = level * INDENT + 1
level++
}
"]", ")" -> {
level--
n = level * INDENT + 1
}
else -> {
n = level * INDENT + t.length
}
}
println("%${n}s".format(t))
}
}
 
fun main(args: Array<String>) {
val str = """((data "quoted data" 123 4.5)""" + "\n" +
""" (data (!@# (4.5) "(more" "data)")))"""
val tokens = str.parseSExpr()
if (tokens == null)
println("Invalid s-expr!")
else {
println("Native data structure:")
println(tokens.joinToString(""))
println("\nNative data structure (pretty print):")
tokens.prettyPrint()
 
val tokens2 = tokens.toMutableList()
println("\nRecovered S-Expression:")
println(tokens2.toSExpr())
println("\nRecovered S-Expression (pretty print):")
tokens2.prettyPrint()
}
}</syntaxhighlight>
 
{{out}}
<pre>
Native data structure:
[[data, "quoted data", 123, 4.5], [data, [!@#, [4.5], "(more", "data)"]]]
 
Native data structure (pretty print):
[
[
data
"quoted data"
123
4.5
]
[
data
[
!@#
[
4.5
]
"(more"
"data)"
]
]
]
 
Recovered S-Expression:
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))
 
Recovered S-Expression (pretty print):
(
(
data
"quoted data"
123
4.5
)
(
data
(
!@#
(
4.5
)
"(more"
"data)"
)
)
)
</pre>
 
=={{header|Lua}}==
 
This uses LPeg, a parsing expression grammar library written by one the authors of Lua.
Tested with Lua 5.3.5 and LPeg 1.0.2-1.
 
<syntaxhighlight lang="lua">lpeg = require 'lpeg' -- see http://www.inf.puc-rio.br/~roberto/lpeg/
 
imports = 'P R S C V match'
for w in imports:gmatch('%a+') do _G[w] = lpeg[w] end -- make e.g. 'lpeg.P' function available as 'P'
 
function tosymbol(s) return s end
function tolist(x, ...) return {...} end -- ignore the first capture, the whole sexpr
ws = S' \t\n'^0 -- whitespace, 0 or more
 
digits = R'09'^1 -- digits, 1 or more
Tnumber = C(digits * (P'.' * digits)^-1) * ws / tonumber -- ^-1 => at most 1
 
Tstring = C(P'"' * (P(1) - P'"')^0 * P'"') * ws
 
sep = S'()" \t\n'
symstart = (P(1) - (R'09' + sep))
symchar = (P(1) - sep)
Tsymbol = C(symstart * symchar^0) * ws / tosymbol
 
atom = Tnumber + Tstring + Tsymbol
lpar = P'(' * ws
rpar = P')' * ws
sexpr = P{ -- defines a recursive pattern
'S';
S = ws * lpar * C((atom + V'S')^0) * rpar / tolist
}</syntaxhighlight>
 
Now to use the <i>sexpr</i> pattern:
 
<syntaxhighlight lang="lua">eg_input = [[
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
]]
 
eg_produced = match(sexpr, eg_input)
 
eg_expected = { -- expected Lua data structure of the reader (lpeg.match)
{'data', '"quoted data"', 123, 4.5},
{'data', {'!@#', {4.5}, '"(more"', '"data)"'}}
}
 
function check(produced, expected)
assert(type(produced) == type(expected))
if type(expected) == 'table' then -- i.e. a list
assert(#produced == #expected)
for i = 1, #expected do check(produced[i], expected[i]) end
else
assert(produced == expected)
end
end
 
check(eg_produced, eg_expected)
print("checks out!") -- won't get here if any <i>check()</i> assertion fails
</syntaxhighlight>
 
And here's the pretty printer, whose output looks like all the others:
 
<syntaxhighlight lang="lua">function pprint(expr, indent)
local function prindent(fmt, expr)
io.write(indent) -- no line break
print(string.format(fmt, expr))
end
if type(expr) == 'table' then
if #expr == 0 then
prindent('()')
else
prindent('(')
local indentmore = ' ' .. indent
for i= 1,#expr do pprint(expr[i], indentmore) end
prindent(')')
end
elseif type(expr) == 'string' then
if expr:sub(1,1) == '"' then
prindent("%q", expr:sub(2,-2)) -- print as a Lua string
else
prindent("%s", expr) -- print as a symbol
end
else
prindent("%s", expr)
end
end
 
pprint(eg_expected, '')</syntaxhighlight>
 
=={{header|Nim}}==
 
<syntaxhighlight lang="nim">import strutils
 
const Input = """
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
"""
 
type
TokenKind = enum
tokInt, tokFloat, tokString, tokIdent
tokLPar, tokRPar
tokEnd
Token = object
case kind: TokenKind
of tokString: stringVal: string
of tokInt: intVal: int
of tokFloat: floatVal: float
of tokIdent: ident: string
else: discard
 
proc lex(input: string): seq[Token] =
var pos = 0
 
template current: char =
if pos < input.len: input[pos]
else: '\x00'
 
while pos < input.len:
case current
of ';':
inc(pos)
while current notin {'\r', '\n'}:
inc(pos)
if current == '\r': inc(pos)
if current == '\n': inc(pos)
of '(': inc(pos); result.add(Token(kind: tokLPar))
of ')': inc(pos); result.add(Token(kind: tokRPar))
of '0'..'9':
var
num = ""
isFloat = false
while current in Digits:
num.add(current)
inc(pos)
if current == '.':
num.add(current)
isFloat = true
inc(pos)
while current in Digits:
num.add(current)
inc(pos)
result.add(if isFloat: Token(kind: tokFloat, floatVal: parseFloat(num))
else: Token(kind: tokInt, intVal: parseInt(num)))
of ' ', '\t', '\n', '\r': inc(pos)
of '"':
var str = ""
inc(pos)
while current != '"':
str.add(current)
inc(pos)
inc(pos)
result.add(Token(kind: tokString, stringVal: str))
else:
const BannedChars = {' ', '\t', '"', '(', ')', ';'}
var ident = ""
while current notin BannedChars:
ident.add(current)
inc(pos)
result.add(Token(kind: tokIdent, ident: ident))
result.add(Token(kind: tokEnd))
 
type
SExprKind = enum
seInt, seFloat, seString, seIdent, seList
SExpr = ref object
case kind: SExprKind
of seInt: intVal: int
of seFloat: floatVal: float
of seString: stringVal: string
of seIdent: ident: string
of seList: children: seq[SExpr]
ParseError = object of CatchableError
 
proc `$`*(se: SExpr): string =
case se.kind
of seInt: result = $se.intVal
of seFloat: result = $se.floatVal
of seString: result = '"' & se.stringVal & '"'
of seIdent: result = se.ident
of seList:
result = "("
for i, ex in se.children:
if ex.kind == seList and ex.children.len > 1:
result.add("\n")
result.add(indent($ex, 2))
else:
if i > 0:
result.add(" ")
result.add($ex)
result.add(")")
 
var
tokens = lex(Input)
pos = 0
 
template current: Token =
if pos < tokens.len: tokens[pos]
else: Token(kind: tokEnd)
 
proc parseInt(token: Token): SExpr =
result = SExpr(kind: seInt, intVal: token.intVal)
 
proc parseFloat(token: Token): SExpr =
result = SExpr(kind: seFloat, floatVal: token.floatVal)
 
proc parseString(token: Token): SExpr =
result = SExpr(kind: seString, stringVal: token.stringVal)
 
proc parseIdent(token: Token): SExpr =
result = SExpr(kind: seIdent, ident: token.ident)
 
proc parse(): SExpr
 
proc parseList(): SExpr =
result = SExpr(kind: seList)
while current.kind notin {tokRPar, tokEnd}:
result.children.add(parse())
if current.kind == tokEnd:
raise newException(ParseError, "Missing right paren ')'")
else:
inc(pos)
 
proc parse(): SExpr =
var token = current
inc(pos)
result =
case token.kind
of tokInt: parseInt(token)
of tokFloat: parseFloat(token)
of tokString: parseString(token)
of tokIdent: parseIdent(token)
of tokLPar: parseList()
else: nil
 
echo parse()</syntaxhighlight>
 
{{out}}
 
<pre>
(
(data "quoted data" 123 4.5)
(data
(!@# (4.5) "(more" "data)")))
</pre>
 
=={{header|OCaml}}==
You may be interested in this [https://dev.realworldocaml.org/data-serialization.html chapter of the book Real World OCaml].
 
The file <code>SExpr.mli</code> containing the interface:
 
<langsyntaxhighlight lang="ocaml">(** This module is a very simple parsing library for S-expressions. *)
(* Copyright (C) 2009 Florent Monnier, released under MIT license. *)
 
Line 1,734 ⟶ 4,683:
 
val string_of_sexpr_indent : sexpr list -> string
(** same than [string_of_sexpr] but with indentation *)</langsyntaxhighlight>
 
The file <code>SExpr.ml</code> containing the implementation:
 
<langsyntaxhighlight lang="ocaml">(** This module is a very simple parsing library for S-expressions. *)
(* Copyright (C) 2009 Florent Monnier, released under MIT license. *)
(* modified to match the task description *)
Line 1,915 ⟶ 4,864:
 
let print_sexpr_indent s =
print_endline (string_of_sexpr_indent s)</langsyntaxhighlight>
 
Then we compile this small module and test it in the interactive loop:
Line 1,953 ⟶ 4,902:
 
=={{header|Perl}}==
<syntaxhighlight lang="perl">#!/usr/bin/perl -w
<lang perl>use Text::Balanced qw(extract_delimited extract_bracketed);
use strict;
use warnings;
 
sub sexpr
{
my $txt@stack = $_([0]);
local $txt_ =~ s/^\s+//s$_[0];
$txt =~ s/\s+$//s;
$txt =~ /^\((.*)\)$/s or die "Not an s-expression: <<<$txt>>>";
$txt = $1;
 
while (m{
my $ret = [];
\G # start match right at the end of the previous one
my $w;
\s*+ # skip whitespaces
while ($txt ne '') {
# now try to match any of possible tokens in THIS order:
my $c = substr $txt,0,1;
if ($c eq '?<lparen>\(') {|
(?<rparen>\)) |
($w, $txt) = extract_bracketed($txt, '()');
(?<FLOAT>[0-9]*+\.[0-9]*+) |
$w = sexpr($w);
(?<INT>[0-9]++) |
} elsif ($c eq '"') {
(?:"(?<STRING>([^\"\\]|\\.)*+)") |
($w, $txt) = extract_delimited($txt, '"');
(?<IDENTIFIER>[^\s()]++)
$w =~ s/^"(.*)"/$1/;
# Flags:
# g = match the same string repeatedly
# m = ^ and $ match at \n
# s = dot and \s matches \n
# x = allow comments within regex
}gmsx)
{
die "match error" if 0+(keys %+) != 1;
 
my $token = (keys %+)[0];
my $val = $+{$token};
 
if ($token eq 'lparen') {
my $a = [];
push @{$stack[$#stack]}, $a;
push @stack, $a;
} elsif ($token eq 'rparen') {
pop @stack;
} else {
push @{$stack[$#stack]}, bless \$val, $token;
$txt =~ s/^(\S+)// and $w = $1;
}
push @$ret, $w;
$txt =~ s/^\s+//s;
}
return $retstack[0]->[0];
}
 
sub quote
{ (local $_ = $_[0]) =~ /[\s\"\(\)]/s ? do{s/\"/\\\"/gs; qq{"$_"}} : $_; }
 
sub sexpr2txt
{
{ qq{(@{[ map { ref($_) eq '' ? quote($_) : sexpr2txt($_) } @{$_[0]} ]})} }</lang>
qq{(@{[ map {
ref($_) eq '' ? quote($_) :
ref($_) eq 'STRING' ? quote($$_) :
ref($_) eq 'ARRAY' ? sexpr2txt($_) : $$_
} @{$_[0]} ]})}
}</syntaxhighlight>
Check:
<langsyntaxhighlight lang="perl">my $s = sexpr(q{
 
((data "quoted data" 123 4.5)
Line 2,000 ⟶ 4,970:
 
# Convert back
print sexpr2txt($s)."\n";</langsyntaxhighlight>
Output:
<pre>$VAR1 = [
[
bless( do{\(my $o = 'data')}, 'IDENTIFIER' ),
bless( do{\(my $o = 'quoted data')}, 'STRING' ),
bless( do{\(my $o = '123')}, 'INT' ),
bless( do{\(my $o = '4.5')}, 'FLOAT' )
],
[
bless( do{\(my $o = 'data')}, 'IDENTIFIER' ),
[
bless( do{\(my $o = '!@#')}, 'IDENTIFIER' ),
[
bless( do{\(my $o = '4.5')}, 'FLOAT' )
],
bless( do{\(my $o = '(more')}, 'STRING' ),
bless( do{\(my $o = 'data)')}, 'STRING' )
]
]
Line 2,023 ⟶ 4,993:
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))</pre>
 
=={{header|Perl 6Phix}}==
The distinction between a symbol data and a quoted string "data" is simple: both are represented as strings, with the symbol
{{works with|Rakudo|2011.11-105-g48514ff on parrot 3.9.0}}
being held as "data" and the quoted string being held as "\"data\"", and you can test for the latter by seeing if the first
character is a double quote. Internally, it is easy to differentiate between a symbol (held as a string) and a number, but
that may not be clear on the display: 4e-5 and 4-e5 may appear similar but the latter is probably a parse failure. It may
be more sensible for get_term() to raise an error if the scanf fails, than assume it is a symbol like it does now.
<!--<syntaxhighlight lang="phix">(phixonline)-->
<span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span>
<span style="color: #008080;">constant</span> <span style="color: #000000;">s_expr_str</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">"""
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))"""</span>
<span style="color: #008080;">function</span> <span style="color: #000000;">skip_spaces</span><span style="color: #0000FF;">(</span><span style="color: #004080;">string</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">sidx</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">while</span> <span style="color: #000000;">sidx</span><span style="color: #0000FF;"><=</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">and</span> <span style="color: #7060A8;">find</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s</span><span style="color: #0000FF;">[</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">],{</span><span style="color: #008000;">' '</span><span style="color: #0000FF;">,</span><span style="color: #008000;">'\t'</span><span style="color: #0000FF;">,</span><span style="color: #008000;">'\r'</span><span style="color: #0000FF;">,</span><span style="color: #008000;">'\n'</span><span style="color: #0000FF;">})</span> <span style="color: #008080;">do</span> <span style="color: #000000;">sidx</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span> <span style="color: #008080;">end</span> <span style="color: #008080;">while</span>
<span style="color: #008080;">return</span> <span style="color: #000000;">sidx</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
<span style="color: #008080;">function</span> <span style="color: #000000;">get_term</span><span style="color: #0000FF;">(</span><span style="color: #004080;">string</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">sidx</span><span style="color: #0000FF;">)</span>
<span style="color: #000080;font-style:italic;">-- get a single quoted string, symbol, or number.</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">ch</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">[</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">]</span>
<span style="color: #004080;">string</span> <span style="color: #000000;">res</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">""</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;">=</span><span style="color: #008000;">'\"'</span> <span style="color: #008080;">then</span>
<span style="color: #000000;">res</span> <span style="color: #0000FF;">&=</span> <span style="color: #000000;">ch</span>
<span style="color: #008080;">while</span> <span style="color: #000000;">1</span> <span style="color: #008080;">do</span>
<span style="color: #000000;">sidx</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
<span style="color: #000000;">ch</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">[</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">]</span>
<span style="color: #000000;">res</span> <span style="color: #0000FF;">&=</span> <span style="color: #000000;">ch</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;">=</span><span style="color: #008000;">'\\'</span> <span style="color: #008080;">then</span>
<span style="color: #000000;">sidx</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
<span style="color: #000000;">ch</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">[</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">]</span>
<span style="color: #000000;">res</span> <span style="color: #0000FF;">&=</span> <span style="color: #000000;">ch</span>
<span style="color: #008080;">elsif</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;">=</span><span style="color: #008000;">'\"'</span> <span style="color: #008080;">then</span>
<span style="color: #000000;">sidx</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">exit</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">while</span>
<span style="color: #008080;">else</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">asnumber</span> <span style="color: #0000FF;">=</span> <span style="color: #0000FF;">(</span><span style="color: #000000;">ch</span><span style="color: #0000FF;">>=</span><span style="color: #008000;">'0'</span> <span style="color: #008080;">and</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;"><=</span><span style="color: #008000;">'9'</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">while</span> <span style="color: #008080;">not</span> <span style="color: #7060A8;">find</span><span style="color: #0000FF;">(</span><span style="color: #000000;">ch</span><span style="color: #0000FF;">,{</span><span style="color: #008000;">')'</span><span style="color: #0000FF;">,</span><span style="color: #008000;">' '</span><span style="color: #0000FF;">,</span><span style="color: #008000;">'\t'</span><span style="color: #0000FF;">,</span><span style="color: #008000;">'\r'</span><span style="color: #0000FF;">,</span><span style="color: #008000;">'\n'</span><span style="color: #0000FF;">})</span> <span style="color: #008080;">do</span>
<span style="color: #000000;">res</span> <span style="color: #0000FF;">&=</span> <span style="color: #000000;">ch</span>
<span style="color: #000000;">sidx</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">sidx</span><span style="color: #0000FF;">></span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">then</span> <span style="color: #008080;">exit</span> <span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #000000;">ch</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">[</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">]</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">while</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">asnumber</span> <span style="color: #008080;">then</span>
<span style="color: #004080;">sequence</span> <span style="color: #000000;">scanres</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">scanf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">res</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"%f"</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">if</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">scanres</span><span style="color: #0000FF;">)=</span><span style="color: #000000;">1</span> <span style="color: #008080;">then</span> <span style="color: #008080;">return</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">scanres</span><span style="color: #0000FF;">[</span><span style="color: #000000;">1</span><span style="color: #0000FF;">][</span><span style="color: #000000;">1</span><span style="color: #0000FF;">],</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">}</span> <span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #000080;font-style:italic;">-- error? (failed to parse number)</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">return</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">res</span><span style="color: #0000FF;">,</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">}</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
<span style="color: #008080;">function</span> <span style="color: #000000;">parse_s_expr</span><span style="color: #0000FF;">(</span><span style="color: #004080;">string</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">sidx</span><span style="color: #0000FF;">)</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">ch</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">[</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">]</span>
<span style="color: #004080;">sequence</span> <span style="color: #000000;">res</span> <span style="color: #0000FF;">=</span> <span style="color: #0000FF;">{}</span>
<span style="color: #004080;">object</span> <span style="color: #000000;">element</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;">!=</span><span style="color: #008000;">'('</span> <span style="color: #008080;">then</span> <span style="color: #0000FF;">?</span><span style="color: #000000;">9</span><span style="color: #0000FF;">/</span><span style="color: #000000;">0</span> <span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #000000;">sidx</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
<span style="color: #008080;">while</span> <span style="color: #000000;">1</span> <span style="color: #008080;">do</span>
<span style="color: #000000;">sidx</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">skip_spaces</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s</span><span style="color: #0000FF;">,</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">)</span>
<span style="color: #000080;font-style:italic;">-- error? (if past end of string/missing ')')</span>
<span style="color: #000000;">ch</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">[</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">]</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;">=</span><span style="color: #008000;">')'</span> <span style="color: #008080;">then</span> <span style="color: #008080;">exit</span> <span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">ch</span><span style="color: #0000FF;">=</span><span style="color: #008000;">'('</span> <span style="color: #008080;">then</span>
<span style="color: #0000FF;">{</span><span style="color: #000000;">element</span><span style="color: #0000FF;">,</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">}</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">parse_s_expr</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s</span><span style="color: #0000FF;">,</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">else</span>
<span style="color: #0000FF;">{</span><span style="color: #000000;">element</span><span style="color: #0000FF;">,</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">}</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">get_term</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s</span><span style="color: #0000FF;">,</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #000000;">res</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">append</span><span style="color: #0000FF;">(</span><span style="color: #000000;">res</span><span style="color: #0000FF;">,</span><span style="color: #000000;">element</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">while</span>
<span style="color: #000000;">sidx</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">skip_spaces</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s</span><span style="color: #0000FF;">,</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">return</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">res</span><span style="color: #0000FF;">,</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">}</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
<span style="color: #004080;">sequence</span> <span style="color: #000000;">s_expr</span>
<span style="color: #004080;">integer</span> <span style="color: #000000;">sidx</span>
<span style="color: #0000FF;">{</span><span style="color: #000000;">s_expr</span><span style="color: #0000FF;">,</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">}</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">parse_s_expr</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s_expr_str</span><span style="color: #0000FF;">,</span><span style="color: #000000;">1</span><span style="color: #0000FF;">)</span>
<span style="color: #008080;">if</span> <span style="color: #000000;">sidx</span><span style="color: #0000FF;"><=</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s_expr_str</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">then</span>
<span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"incomplete parse(\"%s\")\n"</span><span style="color: #0000FF;">,{</span><span style="color: #000000;">s_expr_str</span><span style="color: #0000FF;">[</span><span style="color: #000000;">sidx</span><span style="color: #0000FF;">..$]})</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
<span style="color: #7060A8;">puts</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"\nThe string:\n"</span><span style="color: #0000FF;">)</span>
<span style="color: #0000FF;">?</span><span style="color: #000000;">s_expr_str</span>
<span style="color: #7060A8;">puts</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"\nDefault pretty printing:\n"</span><span style="color: #0000FF;">)</span>
<span style="color: #000080;font-style:italic;">--?s_expr</span>
<span style="color: #7060A8;">pp</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s_expr</span><span style="color: #0000FF;">)</span>
<span style="color: #7060A8;">puts</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"\nBespoke pretty printing:\n"</span><span style="color: #0000FF;">)</span>
<span style="color: #000080;font-style:italic;">--ppEx(s_expr,{pp_Nest,1,pp_StrFmt,-1,pp_IntCh,false,pp_Brkt,"()"})</span>
<span style="color: #7060A8;">ppEx</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s_expr</span><span style="color: #0000FF;">,{</span><span style="color: #004600;">pp_Nest</span><span style="color: #0000FF;">,</span><span style="color: #000000;">4</span><span style="color: #0000FF;">,</span><span style="color: #004600;">pp_StrFmt</span><span style="color: #0000FF;">,-</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #004600;">pp_IntCh</span><span style="color: #0000FF;">,</span><span style="color: #004600;">false</span><span style="color: #0000FF;">,</span><span style="color: #004600;">pp_Brkt</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"()"</span><span style="color: #0000FF;">})</span>
<!--</syntaxhighlight>-->
{{out}}
<pre>
The string:
"((data \"quoted data\" 123 4.5)\n (data (!@# (4.5) \"(more\" \"data)\")))"
 
Default pretty printing:
This parses the task, but it isn't really a good lisp parser, because it always wants whitespace between lists, so <code>(()())</code> will fail ( <code>(() ())</code> wont)
{{"data", "\"quoted data\"", 123'{',4.5},
{"data", {"!@#", {4.5}, "\"(more\"", "\"data)\""}}}
 
Bespoke pretty printing:
<lang perl6>grammar S-Exp {
((data,
rule TOP {^ <s-list> $};
"quoted data",
 
123,
token s-list { '(' ~ ')' [ <in_list> ** [\s+] | '' ] }
4.5),
token in_list { <s-token> | <s-list> }
(data,
(!@#,
proto token s-token {*}
(4.5),
token s-token:sym<Num> {\d*\.?\d+}
"(more",
token s-token:sym<String> {'"' ['\"' |<-[\\"]>]*? '"'} #'
"data)")))
token s-token:sym<Atom> {<-[()\s]>+}
</pre>
}
# The Actions class, for each syntactic rule there is a method
# that stores some data in the abstract syntax tree with make
class S-Exp::ACTIONS {
method TOP ($/) {make $<s-list>.ast}
method s-list ($/) {make [$<in_list>».ast]}
method in_list ($/) {make $/.values[0].ast}
method s-token:sym<Num> ($/){make +$/}
method s-token:sym<String> ($/){make ~$/.substr(1,*-1)}
method s-token:sym<Atom> ($/){make ~$/}
}
multi s-exp_writer (Positional $ary) {'(' ~ $ary.map(&s-exp_writer).join(' ') ~ ')'}
multi s-exp_writer (Numeric $num) {~$num}
multi s-exp_writer (Str $str) {
return $str unless $str ~~ /<[(")]>|\s/;
return '()' if $str eq '()';
'"' ~ $str.subst('"', '\"' ) ~ '"';
}
my $s-exp = '((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))';
my $actions = S-Exp::ACTIONS.new();
my $perl_array = (S-Exp.parse($s-exp, :$actions)).ast;
say "the expression:\n$s-exp\n";
say "the perl6-expression:\n{$perl_array.perl}\n";
say "and back:\n{s-exp_writer($perl_array)}";</lang>
 
Output:
<pre>the expression:
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
 
the perl6-expression:
[["data", "quoted data", "123", 9/2], ["data", ["!\@#", [9/2], "(more", "data)"]]]
 
and back:
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))</pre>
 
=={{header|PicoLisp}}==
The '[http://software-lab.de/doc/refA.html#any any]' function parses an s-expression from a string (indentical to the way '[http://software-lab.de/doc/refR.html#read read]' does this from an input stream).
<langsyntaxhighlight PicoLisplang="picolisp">: (any "((data \"quoted data\" 123 4.5) (data (!@# (4.5) \"(more\" \"data)\")))")
-> ((data "quoted data" 123 5) (data (!@# (5) "(more" "data)")))
 
Line 2,105 ⟶ 5,129:
+-- "(more"
|
+-- "data)"</langsyntaxhighlight>
Implementing a subset of 'any' explicitly:
<langsyntaxhighlight PicoLisplang="picolisp">(de readSexpr ()
(case (skip)
("(" (char) (readList))
Line 2,132 ⟶ 5,156:
(until (or (sp? (peek)) (member (peek) '("(" ")")))
(link (char)) ) )
(or (format X) (intern (pack X))) ) )</langsyntaxhighlight>
It can be used in a pipe to read from a string:
<langsyntaxhighlight PicoLisplang="picolisp">: (pipe (prin "((data \"quoted data\" 123 4.5) (data (!@# (4.5) \"(more\" \"data)\")))") (readSexpr))
-> ((data "quoted data" 123 5) (data (!@# (5) "(more" "data)")))</langsyntaxhighlight>
'[http://software-lab.de/doc/refS.html#sym sym]' does the reverse (i.e. builds a symbol (string) from an expression).
<langsyntaxhighlight PicoLisplang="picolisp">: (sym @@)
-> "((data \"quoted data\" 123 5) (data (!@# (5) \"(more\" \"data)\")))"</langsyntaxhighlight>
Implementing a subset of the built-in printer:
<langsyntaxhighlight PicoLisplang="picolisp">(de printSexpr (Expr Fun)
(cond
((pair Expr)
Line 2,153 ⟶ 5,177:
(mapc Fun (chop Expr))
(Fun "\"") )
(T (mapc Fun (chop Expr))) ) )</langsyntaxhighlight>
This can be used for plain printing
<langsyntaxhighlight PicoLisplang="picolisp">: (printSexpr
'((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))
prin )
((data "quoted data" 123 5) (data (!@# (5) "(more" "data)")))</langsyntaxhighlight>
or to collect the characters into a string:
<langsyntaxhighlight PicoLisplang="picolisp">: (pack
(make
(printSexpr
'((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))
link ) ) )
-> "((data \"quoted data\" 123 5) (data (!@# (5) \"(more\" \"data)\")))"</langsyntaxhighlight>
 
=={{header|Pike}}==
<langsyntaxhighlight lang="pike">class Symbol(string name)
{
string _sprintf(int type)
Line 2,273 ⟶ 5,297:
string input = "((data \"quoted data\" 123 4.5)\n (data (!@# (4.5) \"(more\" \"data)\")))";
array data = group(tokenizer(input))[0];
string output = sexp(data);</langsyntaxhighlight>
 
Output:
Line 2,282 ⟶ 5,306:
 
((data "quoted data" 123 4.5) (data (!@# (45) "(more" "data)")))
 
=={{header|Potion}}==
How values are stored: Tuples for list, integers for integers, floats for floats, strings for symbols, quoted strings for strings. This implementation is not the most elegant/succinct or practical (it's trusty and has no real error handling).
<syntaxhighlight lang="potion">isdigit = (c): 47 < c ord and c ord < 58.
iswhitespace = (c): c ord == 10 or c ord == 13 or c == " ".
 
# str: a string of the form "...<nondigit>[{<symb>}]..."
# i: index to start at (must be the index of <nondigit>)
# => returns (<the symbol as a string>, <index after the last char>)
parsesymbol = (str, i) :
datum = ()
while (str(i) != "(" and str(i) != ")" and not iswhitespace(str(i)) and str(i) != "\"") :
datum append(str(i++))
.
(datum join, i)
.
 
# str: a string of the form "...[<minus>]{<digit>}[<dot>{<digit>}]..."
# i: index to start at (must be the index of the first token)
# => returns (<float or int>, <index after the last digit>)
parsenumber = (str, i) :
datum = ()
dot = false
while (str(i) != "(" and str(i) != ")" and not iswhitespace(str(i)) and str(i) != "\"") :
if (str(i) == "."): dot = true.
datum append(str(i++))
.
if (dot): (datum join number, i).
else: (datum join number integer, i).
.
 
# str: a string of the form "...\"....\"..."
# i: index to start at (must be the index of the first quote)
# => returns (<the string>, <index after the last quote>)
parsestring = (str, i) :
datum = ("\"")
while (str(++i) != "\"") :
datum append(str(i))
.
datum append("\"")
(datum join, ++i)
.
 
# str: a string of the form "...(...)..."
# i: index to start at
# => returns (<tuple/list>, <index after the last paren>)
parselist = (str, i) :
lst = ()
data = ()
while (str(i) != "("): i++.
i++
while (str(i) != ")") :
if (not iswhitespace(str(i))) :
if (isdigit(str(i)) or (str(i) == "-" and isdigit(str(i + 1)))): data = parsenumber(str, i).
elsif (str(i) == "\""): data = parsestring(str, i).
elsif (str(i) == "("): data = parselist(str, i).
else: data = parsesymbol(str, i).
lst append(data(0))
i = data(1)
. else :
++i
.
.
(lst, ++i)
.
 
parsesexpr = (str) :
parselist(str, 0)(0)
.
 
parsesexpr("(define (factorial x) \"compute factorial\" (version 2.0) (apply * (range 1 x)))") string print
"\n" print
parsesexpr("((data \"quoted data\" 123 4.5)
(data (!@# (4.5) \"(more\" \"data)\")))") string print
"\n" print</syntaxhighlight>
 
=={{header|Python}}==
===Procedural===
<lang python>import re
<syntaxhighlight lang="python">import re
 
dbg = False
Line 2,292 ⟶ 5,392:
(?P<brackl>\()|
(?P<brackr>\))|
(?P<num>\-?\d+\.\d+|\-?\d+)\b|
(?P<sq>"[^"]*")|
(?P<s>[^(^)\Ss]+)\b
)'''
 
Line 2,343 ⟶ 5,443:
print("\nParsed to Python:", parsed)
 
print("\nThen back to: '%s'" % print_sexp(parsed))</langsyntaxhighlight>
 
;Output:
Line 2,354 ⟶ 5,454:
;Simpler parser:
Note that in the example above the parser also recognises and changes the type of some tokens as well as generating a nested list. If that functionality is not needed, or better done elsewhere, then the parse function can be achieved more simply by just applying the regexp:
<langsyntaxhighlight lang="python">>>> from pprint import pprint as pp
>>> x = [[(t,v) for t,v in termtypes.groupdict().items() if v][0] for termtypes in re.finditer(term_regex, sexp)]
>>> pp(x)
Line 2,376 ⟶ 5,476:
('brackr', ')'),
('brackr', ')')]
>>> </langsyntaxhighlight>
 
===Functional===
Composing functionally, and writing out a tree diagram, and a serialization, of the parse.
<syntaxhighlight lang="python">'''S-expressions'''
 
from itertools import chain, repeat
import re
 
 
def main():
'''Sample s-expression parsed, diagrammed,
and reserialized from the parse tree.
'''
expr = "((data \"quoted data\" 123 4.5)\n" + (
" (data (!@# (4.5) \"(more\" \"data)\")))"
)
parse = parseExpr(tokenized(expr))[0]
print(
drawForest([
fmapTree(str)(tree) for tree
in forestFromExprs(parse)
])
)
print(
f'\nReserialized from parse:\n\n{serialized(parse)}'
)
 
 
# ----------------- S-EXPRESSION PARSER ------------------
 
# parseExpr :: [String] -> ([Expr], [String]
def parseExpr(tokens):
'''A tuple of a nested list with any
unparsed tokens that remain.
'''
return until(finished)(parseToken)(
([], tokens)
)
 
 
# finished :: ([Expr], [String]) -> Bool
def finished(xr):
'''True if no tokens remain,
or the next token is a closing bracket.
'''
r = xr[1]
return (not r) or (r[0] == ")")
 
 
# parseToken :: ([Expr], [String]) -> ([Expr], [String])
def parseToken(xsr):
'''A tuple of an expanded expression list
and a reduced token list.
'''
xs, r = xsr
h, *t = r
if "(" == h:
expr, rest = parseExpr(t)
return xs + [expr], rest[1:]
else:
return (xs, t) if ")" == h else (
xs + [atom(h)], t
)
 
# --------------------- ATOM PARSER ----------------------
 
# atom :: String -> Expr
def atom(s):
'''A Symbol, String, Float, or Int derived from s.
Symbol is represented as a dict with a 'name' key.
'''
def n(k):
return float(k) if '.' in k else int(k)
 
return s if '"' == s[0] else (
n(s) if s.replace('.', '', 1).isdigit() else {
"name": s
}
)
 
 
# --------------------- TOKENIZATION ---------------------
 
# tokenized :: String -> [String]
def tokenized(s):
'''A list of the tokens in s.
'''
return list(chain.from_iterable(map(
lambda token: [token] if '"' == token[0] else (
x for x in re.split(
r'\s+',
re.sub(r"([()])", r" \1 ", token)
) if x
) if token else [], (
x if (0 == i % 2) else f'"{x}"'
for (i, x) in enumerate(s.split('"'))
)
)))
 
 
# -------------------- SERIALIZATION ---------------------
 
# serialized :: Expr -> String
def serialized(e):
'''An s-expression written out from the parse tree.
'''
k = typename(e)
 
return str(e) if k in ['int', 'float', 'str'] else (
(
f'({" ".join([serialized(x) for x in e])})' if (
(1 < len(e)) or ('list' != typename(e[0]))
) else serialized(e[0])
) if 'list' == k else (
e.get("name") if 'dict' == k else "?"
)
)
 
 
# typename :: a -> String
def typename(x):
'''Name property of the type of a value.'''
return type(x).__name__
 
 
# ------------------- TREE DIAGRAMMING -------------------
 
# Node :: a -> [Tree a] -> Tree a
def Node(v):
'''Constructor for a Tree node which connects a
value of some kind to a list of zero or
more child trees.
'''
return lambda xs: {'type': 'Tree', 'root': v, 'nest': xs}
 
 
# append :: [a] -> [a] -> [a]
def append(a, b):
'''Concatenation.'''
return a + b
 
 
# draw :: Tree a -> [String]
def draw(node):
'''List of the lines of an ASCII
diagram of a tree.
'''
def shift_(h, other, xs):
return list(map(
append,
chain(
[h], (
repeat(other, len(xs) - 1)
)
),
xs
))
 
def drawSubTrees(xs):
return (
(
['|'] + shift_(
'├─ ', '│ ', draw(xs[0])
) + drawSubTrees(xs[1:])
) if 1 < len(xs) else ['|'] + shift_(
'└─ ', ' ', draw(xs[0])
)
) if xs else []
 
return (root(node)).splitlines() + (
drawSubTrees(nest(node))
)
 
 
# drawForest :: [Tree String] -> String
def drawForest(trees):
'''A simple unicode character representation of
a list of trees.
'''
return '\n'.join(map(drawTree, trees))
 
 
# drawTree :: Tree a -> String
def drawTree(tree):
'''ASCII diagram of a tree.'''
return '\n'.join(draw(tree))
 
 
# fmapTree :: (a -> b) -> Tree a -> Tree b
def fmapTree(f):
'''A new tree holding the results of
an application of f to each root in
the existing tree.
'''
def go(x):
return Node(
f(root(x))
)([go(v) for v in nest(x)])
return go
 
 
# forestFromExprs :: [Expr] -> [Tree Expr]
def forestFromExprs(es):
'''A list of expressions rewritten as a forest.
'''
return [treeFromExpr(x) for x in es]
 
 
# nest :: Tree a -> [Tree a]
def nest(t):
'''Accessor function for children of tree node.'''
return t.get('nest')
 
 
# root :: Tree a -> a
def root(t):
'''Accessor function for data of tree node.'''
return t.get('root')
 
 
# treeFromExprs :: Expr -> Tree Expr
def treeFromExpr(e):
'''An expression rewritten as a tree.
'''
return (
Node({"name": "List"})(forestFromExprs(e))
) if type(e) is list else (
Node(e)([])
)
 
 
# ----------------------- GENERIC ------------------------
 
# until :: (a -> Bool) -> (a -> a) -> a -> a
def until(p):
'''The result of repeatedly applying f until p holds.
The initial seed value is x.
'''
def go(f):
def loop(x):
v = x
while not p(v):
v = f(v)
return v
return loop
return go
 
 
# MAIN ---
if __name__ == '__main__':
main()</syntaxhighlight>
{{Out}}
<pre>{'name': 'List'}
|
├─ {'name': 'List'}
│ |
│ ├─ {'name': 'data'}
│ |
│ ├─ "quoted data"
│ |
│ ├─ 123
│ |
│ └─ 4.5
|
└─ {'name': 'List'}
|
├─ {'name': 'data'}
|
└─ {'name': 'List'}
|
├─ {'name': '!@#'}
|
├─ {'name': 'List'}
│ |
│ └─ 4.5
|
├─ "(more"
|
└─ "data)"
 
Reserialized from parse:
 
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))</pre>
 
=={{header|Racket}}==
 
Racket has builtin support for S-expressions in the form of the read function.
<langsyntaxhighlight lang="racket">
#lang racket
(define input
Line 2,391 ⟶ 5,774:
 
(read (open-input-string input))
</syntaxhighlight>
</lang>
Output:
<pre>
Line 2,397 ⟶ 5,780:
</pre>
 
=={{header|REXXRaku}}==
(formerly Perl 6)
The checking of errors has been min minimized (issuing of error message is very rudimentary, as is the error recovery).
{{works with|Rakudo|2020.02}}
<br>More grouping symbols have been added: [] {}, as well as another type of literal.
<br>Also added were two more seperators (a comma and semicolon).
<br>Seperators that could be added are more whitespace characters (vertical/horizontal tabs, line feed, form feed, tab char, etc).
<br><br>It would normally be considered improper, but the literal string delimiters were left intact; making it much easier to understand what is/was being parsed.
<lang rexx>/*REXX program parses an S-expression and displays the results. */
input= '((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))'
say 'input:' /*indicate what is being shown. */
say input /*echo the input to the screen. */
say copies('═',length(input)) /*display a header fence. */
$.= /*stem array to hold the tokens. */
groupO.1 = '{' ; groupC.1 = '}' /*grouping symbols (Open & Close)*/
groupO.2 = '[' ; groupC.2 = ']' /* " " " " " */
groupO.3 = '(' ; groupC.3 = ')' /* " " " " " */
groupSym = 3 /*the number of grouping symbols.*/
# = 0 /*the number of tokens. */
tabs = 10 /*used for indenting the levels. */
q.1 = "'" /*literal string delimiter, 1st. */
q.2 = '"' /* " " " 2nd. */
numLits = 2 /*number of kinds of literals. */
seps = ',;' /*characters used for separation.*/
atoms = ' 'seps /*characters used to sep atoms. */
level = 0 /*current level being processed. */
quoted = 0 /*quotation level (when nested). */
groupu = /*used to go ↑ an expresion level*/
groupd = /* " " " ↓ " " " */
do n=1 for groupSym /*handle for # grouping symbols. */
atoms = atoms || groupO.n || groupC.n
groupu = groupu || groupO.n
groupd = groupd || groupC.n
end /*n*/
literals=
do k=1 for numLits
literals = literals || q.k
end /*k*/
!=
/*═════════════════════════════════════start of the text parsing.═══════*/
do j=1 to length(input); _ = substr(input,j,1)
if quoted then do
!=! || _
if _==literalStart then quoted=0
iterate
end
 
This parses the task, but it isn't really a good lisp parser, because it always wants whitespace between lists, so <code>(()())</code> will fail ( <code>(() ())</code> wont)
if pos(_,literals)\==0 then do
literalStart = _
! = ! || _
quoted = 1
iterate
end
 
<syntaxhighlight lang="raku" line>grammar S-Exp {
if pos(_,atoms)==0 then do; !=! || _ ; iterate; end
rule TOP {^ <s-list> $};
else do; call add!; ! = _ ; end
 
token s-list { '(' ~ ')' [ <in_list>+ % [\s+] | '' ] }
if pos(_,literals)==0 then do
token in_list { <s-token> | <s-list> }
if pos(_,groupu)\==0 then level=level+1
call add!
proto token s-token {*}
if pos(_,groupd)\==0 then level=level-1
token s-token:sym<Num> {\d*\.?\d+}
if level<0 then say 'oops, mismatched' _
token s-token:sym<String> {'"' ['\"' |<-[\\"]>]*? '"'} #'
iterate
token s-token:sym<Atom> {<-[()\s]>+}
end
end /*j*/
}
# The Actions class, for each syntactic rule there is a method
# that stores some data in the abstract syntax tree with make
class S-Exp::ACTIONS {
method TOP ($/) {make $<s-list>.ast}
method s-list ($/) {make [$<in_list>».ast]}
method in_list ($/) {make $/.values[0].ast}
method s-token:sym<Num> ($/){make +$/}
method s-token:sym<String> ($/){make ~$/.substr(1,*-1)}
method s-token:sym<Atom> ($/){make ~$/}
}
multi s-exp_writer (Positional $ary) {'(' ~ $ary.map(&s-exp_writer).join(' ') ~ ')'}
multi s-exp_writer (Numeric $num) {~$num}
multi s-exp_writer (Str $str) {
return $str unless $str ~~ /<[(")]>|\s/;
return '()' if $str eq '()';
'"' ~ $str.subst('"', '\"' ) ~ '"';
}
my $s-exp = '((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))';
my $actions = S-Exp::ACTIONS.new();
my $raku_array = (S-Exp.parse($s-exp, :$actions)).ast;
say "the expression:\n$s-exp\n";
say "the Raku expression:\n{$raku_array.raku}\n";
say "and back:\n{s-exp_writer($raku_array)}";</syntaxhighlight>
 
{{out}}
call add! /*handle any residual tokens. */
<pre>the expression:
if level\==0 then say 'oops, mismatched grouping symbol'
((data "quoted data" 123 4.5)
if quoted then say 'oops, no end of quoted literal' literalStart
(data (!@# (4.5) "(more" "data)")))
/*═════════════════════════════════════end of text parsing.═════════════*/
 
the Raku expression:
do j=1 for #
[["data", "quoted data", "123", 9/2], ["data", ["!\@#", [9/2], "(more", "data)"]]]
say $.j
end /*j*/
exit /*stick a fork in it, we're done.*/
 
and back:
/*──────────────────────────────────ADD! subroutine─────────────────────*/
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))</pre>
add!: if !\='' then do
 
#=#+1
=={{header|REXX}}==
$.#=left('',max(0,tabs*(level-1)))!
The checking of errors has been minimized (issuing of error message is very rudimentary, as is the error recovery).
end
<br>More grouping symbols have been added &nbsp; (brackets &nbsp; '''[ ]''', &nbsp; braces &nbsp; '''{ }''', &nbsp; and &nbsp; guillemets &nbsp; <big>'''« »''')</big>, &nbsp; as well as another types of literals.
!=
<br>Also added were two more separators &nbsp; (a comma and semicolon).
return</lang>
<br>Separators that could be added are more whitespace characters (vertical/horizontal tabs, line feed, form feed, tab char, etc).
'''output'''
 
<pre style="overflow:scroll">
It would normally be considered improper, but the literal string delimiters were left intact; making it much easier to understand what is/was being parsed.
input:
<syntaxhighlight lang="rexx">/*REXX program parses an S-expression and displays the results to the terminal. */
input= '((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))'
say center('input', length(input), "═") /*display the header title to terminal.*/
say input /* " " input data " " */
say copies('═', length(input) ) /* " " header sep " " */
grpO.=; grpO.1 = '{' ; grpC.1 = "}" /*pair of grouping symbol: braces */
grpO.2 = '[' ; grpC.2 = "]" /* " " " " brackets */
grpO.3 = '(' ; grpC.3 = ")" /* " " " " parentheses */
grpO.4 = '«' ; grpC.4 = "»" /* " " " " guillemets */
q.=; q.1 = "'" ; q.2 = '"' /*1st and 2nd literal string delimiter.*/
# = 0 /*the number of tokens found (so far). */
tabs = 10 /*used for the indenting of the levels.*/
seps = ',;' /*characters used for separation. */
atoms = ' 'seps /* " " to separate atoms. */
level = 0 /*the current level being processed. */
quoted = 0 /*quotation level (for nested quotes).*/
grpU = /*used to go up an expression level.*/
grpD = /* " " " down " " " */
@.=; do n=1 while grpO.n\==''
atoms = atoms || grpO.n || grpC.n /*add Open and Closed groups to ATOMS.*/
grpU = grpU || grpO.n /*add Open groups to GRPU, */
grpD = grpD || grpC.n /*add Closed groups to GRPD, */
end /*n*/ /* [↑] handle a bunch of grouping syms*/
literals=
do k=1 while q.k\==''; literals= literals || q.k /*add literal delimiters*/
end /*k*/
!=; literalStart=
do j=1 to length(input); $= substr(input, j, 1) /* ◄■■■■■text parsing*/
/* ◄■■■■■text parsing*/
if quoted then do; !=! || $; if $==literalStart then quoted= 0 /* ◄■■■■■text parsing*/
iterate /* ◄■■■■■text parsing*/
end /* [↑] handle running quoted string. */ /* ◄■■■■■text parsing*/
/* ◄■■■■■text parsing*/
if pos($, literals)\==0 then do; literalStart= $; != ! || $; quoted= 1 /* ◄■■■■■text parsing*/
iterate /* ◄■■■■■text parsing*/
end /* [↑] handle start of quoted strring.*/ /* ◄■■■■■text parsing*/
/* ◄■■■■■text parsing*/
if pos($, atoms)==0 then do; != ! || $; iterate; end /*is an atom?*/ /* ◄■■■■■text parsing*/
else do; call add!; != $; end /*isn't " " ?*/ /* ◄■■■■■text parsing*/
/* ◄■■■■■text parsing*/
if pos($, literals)==0 then do; if pos($, grpU)\==0 then level= level + 1 /* ◄■■■■■text parsing*/
call add! /* ◄■■■■■text parsing*/
if pos($, grpD)\==0 then level= level - 1 /* ◄■■■■■text parsing*/
if level<0 then say 'error, mismatched' $ /* ◄■■■■■text parsing*/
end /* ◄■■■■■text parsing*/
end /*j*/ /* ◄■■■■■text parsing*/
/* ◄■■■■■text parsing*/
call add! /*process any residual tokens. */ /* ◄■■■■■text parsing*/
if level\==0 then say 'error, mismatched grouping symbol' /* ◄■■■■■text parsing*/
if quoted then say 'error, no end of quoted literal' literalStart /* ◄■■■■■text parsing*/
 
do m=1 for #; say @.m /*display the tokens ───► terminal. */
end /*m*/
exit 0 /*stick a fork in it, we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
add!: if !='' then return; #=#+1; @.#=left("", max(0, tabs*(level-1)))!; !=; return</syntaxhighlight>
{{out|output|text=&nbsp; when using the default input:}}
<pre>
══════════════════════════════input══════════════════════════════
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))
═════════════════════════════════════════════════════════════════
Line 2,486 ⟶ 5,912:
(
data
"quoted data"
123
4.5
)
(
Line 2,497 ⟶ 5,923:
4.5
)
"(more"
"data)"
)
)
Line 2,506 ⟶ 5,932:
=={{header|Ruby}}==
{{works with|Ruby|1.9}}
<langsyntaxhighlight lang="ruby">class SExpr
def initialize(str)
@original = str
Line 2,512 ⟶ 5,938:
end
attr_reader :data, :original
 
def to_sexpr
@data.to_sexpr
end
 
private
 
def parse_sexpr(str)
state = :token_start
Line 2,525 ⟶ 5,951:
str.each_char do |char|
case state
 
when :token_start
case char
Line 2,541 ⟶ 5,967:
word = char
end
 
when :read_quoted_string
case char
Line 2,550 ⟶ 5,976:
word << char
end
 
when :read_string_or_number
case char
Line 2,565 ⟶ 5,991:
end
end
 
sexpr_tokens_to_array(tokens)
end
 
def symbol_or_number(word)
beginInteger(word)
rescue ArgumentError
Integer(word)
begin
Float(word)
rescue ArgumentError
begin word.to_sym
Float(word)
rescue ArgumentError
word.to_sym
end
end
end
 
def sexpr_tokens_to_array(tokens, idx = 0)
result = []
Line 2,626 ⟶ 6,050:
(data (!@# (4.5) "(more" "data)")))
END
 
puts "original sexpr:\n#{sexpr.original}"
puts "\nruby data structure:\n#{sexpr.data.inspect}"
puts "\nand back to S-Expr:\n#{sexpr.to_sexpr}"</langsyntaxhighlight>
 
{{out}}
outputs
<pre>original sexpr:
original sexpr:
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
Line 2,640 ⟶ 6,065:
 
and back to S-Expr:
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))</pre>
</pre>
 
=={{header|Rust}}==
lib.rs:
<syntaxhighlight lang="rust">
//! This implementation isn't based on anything in particular, although it's probably informed by a
//! lot of Rust's JSON encoding code. It should be very fast (both encoding and decoding the toy
//! example here takes under a microsecond on my machine) and tries to avoid unnecessary allocation.
//!
//! In a real implementation, most of this would be private, with only a few visible functions, and
//! there would be somewhat nicer signatures (in particular, the fact that `ParseContext` has to be
//! mutable would get annoying in real code pretty quickly, so it would probably be split out).
//!
//! It supports the ability to read individual atoms, not just lists, although whether this is
//! useful is questionable.
//!
//! Caveats: Does not support symbols vs. non-symbols (it wouldn't be hard, but it would greatly
//! complicate setting up our test structure since we'd have to force it to go through functions
//! that checked to make sure `Symbol`s couldn't have spaces, or slow down our parser by checking
//! for this information each time, which is obnoxious). Does not support string escaping, because
//! the decoding technique doesn't allocate extra space for strings. Does support numbers, but
//! only float types (supporting more types is possible but would complicate the code
//! significantly).
 
extern crate typed_arena;
 
use typed_arena::Arena;
 
use self::Error::*;
use self::SExp::*;
use self::Token::*;
use std::io;
use std::num::FpCategory;
use std::str::FromStr;
 
/// The actual `SExp` structure. Supports `f64`s, lists, and string literals. Note that it takes
/// everything by reference, rather than owning it--this is mostly done just so we can allocate
/// `SExp`s statically (since we don't have to call `Vec`). It does complicate the code a bit,
/// requiring us to have a `ParseContext` that holds an arena where lists are actually allocated.
#[derive(PartialEq, Debug)]
pub enum SExp<'a> {
/// Float literal: 0.5
F64(f64),
 
/// List of SExps: ( a b c)
List(&'a [SExp<'a>]),
 
/// Plain old string literal: "abc"
Str(&'a str),
}
 
/// Errors that can be thrown by the parser.
#[derive(PartialEq, Debug)]
pub enum Error {
/// If the float is `NaN`, `Infinity`, etc.
NoReprForFloat,
 
/// Missing an end double quote during string parsing
UnterminatedStringLiteral,
 
/// Some other kind of I/O error
Io,
 
/// ) appeared where it shouldn't (usually as the first token)
IncorrectCloseDelimiter,
 
/// Usually means a missing ), but could also mean there were no tokens at all.
UnexpectedEOF,
 
/// More tokens after the list is finished, or after a literal if there is no list.
ExpectedEOF,
}
 
impl From<io::Error> for Error {
fn from(_err: io::Error) -> Error {
Error::Io
}
}
 
/// Tokens returned from the token stream.
#[derive(PartialEq, Debug)]
enum Token<'a> {
/// Left parenthesis
ListStart,
 
/// Right parenthesis
ListEnd,
 
/// String or float literal, quotes removed.
Literal(SExp<'a>),
 
/// Stream is out of tokens.
Eof,
}
 
/// An iterator over a string that yields a stream of Tokens.
///
/// Implementation note: it probably seems weird to store first, rest, AND string, since they should
/// all be derivable from string. But see below.
#[derive(Copy, Clone, Debug)]
struct Tokens<'a> {
/// The part of the string that still needs to be parsed
string: &'a str,
 
/// The first character to parse
first: Option<char>,
 
/// The rest of the string after the first character
rest: &'a str,
}
 
impl<'a> Tokens<'a> {
/// Initialize a token stream for a given string.
fn new(string: &str) -> Tokens {
let mut chars = string.chars();
 
match chars.next() {
Some(ch) => Tokens {
string,
first: Some(ch),
rest: chars.as_str(),
},
None => Tokens {
string,
first: None,
rest: string,
},
}
}
 
/// Utility function to update information in the iterator. It might not be performant to keep
/// rest cached, but there are times where we don't know exactly what string is (at least, not
/// in a way that we can *safely* reconstruct it without allocating), so we keep both here.
/// With some unsafe code we could probably get rid of one of them (and maybe first, too).
fn update(&mut self, string: &'a str) {
self.string = string;
let mut chars = self.string.chars();
 
if let Some(ch) = chars.next() {
self.first = Some(ch);
self.rest = chars.as_str();
} else {
self.first = None;
};
}
 
/// This is where the lexing happens. Note that it does not handle string escaping.
fn next_token(&mut self) -> Result<Token<'a>, Error> {
loop {
match self.first {
// List start
Some('(') => {
self.update(self.rest);
return Ok(ListStart);
}
// List end
Some(')') => {
self.update(self.rest);
return Ok(ListEnd);
}
// Quoted literal start
Some('"') => {
// Split the string at most once. This lets us get a
// reference to the next piece of the string without having
// to loop through the string again.
let mut iter = self.rest.splitn(2, '"');
// The first time splitn is run it will never return None, so this is safe.
let str = iter.next().unwrap();
match iter.next() {
// Extract the interior of the string without allocating. If we want to
// handle string escaping, we would have to allocate at some point though.
Some(s) => {
self.update(s);
return Ok(Literal(Str(str)));
}
None => return Err(UnterminatedStringLiteral),
}
}
// Plain old literal start
Some(c) => {
// Skip whitespace. This could probably be made more efficient.
if c.is_whitespace() {
self.update(self.rest);
continue;
}
// Since we've exhausted all other possibilities, this must be a real literal.
// Unlike the quoted case, it's not an error to encounter EOF before whitespace.
let mut end_ch = None;
let str = {
let mut iter = self.string.splitn(2, |ch: char| {
let term = ch == ')' || ch == '(';
if term {
end_ch = Some(ch)
}
term || ch.is_whitespace()
});
// The first time splitn is run it will never return None, so this is safe.
let str = iter.next().unwrap();
self.rest = iter.next().unwrap_or("");
str
};
match end_ch {
// self.string will be incorrect in the Some(_) case. The only reason it's
// okay is because the next time next() is called in this case, we know it
// will be '(' or ')', so it will never reach any code that actually looks
// at self.string. In a real implementation this would be enforced by
// visibility rules.
Some(_) => self.first = end_ch,
None => self.update(self.rest),
}
return Ok(Literal(parse_literal(str)));
}
None => return Ok(Eof),
}
}
}
}
 
/// This is not the most efficient way to do this, because we end up going over numeric literals
/// twice, but it avoids having to write our own number parsing logic.
fn parse_literal(literal: &str) -> SExp {
match literal.bytes().next() {
Some(b'0'..=b'9') | Some(b'-') => match f64::from_str(literal) {
Ok(f) => F64(f),
Err(_) => Str(literal),
},
_ => Str(literal),
}
}
 
/// Parse context, holds information required by the parser (and owns any allocations it makes)
pub struct ParseContext<'a> {
/// The string being parsed. Not required, but convenient.
string: &'a str,
 
/// Arena holding any allocations made by the parser.
arena: Option<Arena<Vec<SExp<'a>>>>,
 
/// Stored in the parse context so it can be reused once allocated.
stack: Vec<Vec<SExp<'a>>>,
}
 
impl<'a> ParseContext<'a> {
/// Create a new parse context from a given string
pub fn new(string: &'a str) -> ParseContext<'a> {
ParseContext {
string,
arena: None,
stack: Vec::new(),
}
}
}
 
impl<'a> SExp<'a> {
/// Serialize a SExp.
fn encode<T: io::Write>(&self, writer: &mut T) -> Result<(), Error> {
match *self {
F64(f) => {
match f.classify() {
// We don't want to identify NaN, Infinity, etc. as floats.
FpCategory::Normal | FpCategory::Zero => {
write!(writer, "{}", f)?;
Ok(())
}
_ => Err(Error::NoReprForFloat),
}
}
List(l) => {
// Writing a list is very straightforward--write a left parenthesis, then
// recursively call encode on each member, and then write a right parenthesis. The
// only reason the logic is as long as it is is to make sure we don't write
// unnecessary spaces between parentheses in the zero or one element cases.
write!(writer, "(")?;
let mut iter = l.iter();
if let Some(sexp) = iter.next() {
sexp.encode(writer)?;
for sexp in iter {
write!(writer, " ")?;
sexp.encode(writer)?;
}
}
write!(writer, ")")?;
Ok(())
}
Str(s) => {
write!(writer, "\"{}\"", s)?;
Ok(())
}
}
}
 
/// Deserialize a SExp.
pub fn parse(ctx: &'a mut ParseContext<'a>) -> Result<SExp<'a>, Error> {
ctx.arena = Some(Arena::new());
// Hopefully this unreachable! gets optimized out, because it should literally be
// unreachable.
let arena = match ctx.arena {
Some(ref mut arena) => arena,
None => unreachable!(),
};
let ParseContext {
string,
ref mut stack,
..
} = *ctx;
// Make sure the stack is cleared--we keep it in the context to avoid unnecessary
// reallocation between parses (if you need to remember old parse information for a new
// list, you can pass in a new context).
stack.clear();
let mut tokens = Tokens::new(string);
// First, we check the very first token to see if we're parsing a full list. It
// simplifies parsing a lot in the subsequent code if we can assume that.
let next = tokens.next_token();
let mut list = match next? {
ListStart => Vec::new(),
Literal(s) => {
return if tokens.next_token()? == Eof {
Ok(s)
} else {
Err(ExpectedEOF)
};
}
ListEnd => return Err(IncorrectCloseDelimiter),
Eof => return Err(UnexpectedEOF),
};
 
// We know we're in a list if we got this far.
loop {
let tok = tokens.next_token();
match tok? {
ListStart => {
// We push the previous context onto our stack when we start reading a new list.
stack.push(list);
list = Vec::new()
}
Literal(s) => list.push(s), // Plain old literal, push it onto the current list
ListEnd => {
match stack.pop() {
// Pop the old context off the stack on list end.
Some(mut l) => {
// We allocate a slot for the current list in our parse context (needed
// for safety) before pushing it onto its parent list.
l.push(List(&*arena.alloc(list)));
// Now reset the current list to the parent list
list = l;
}
// There was nothing on the stack, so we're at the end of the topmost list.
// The check to make sure there are no more tokens is required for
// correctness.
None => {
return match tokens.next_token()? {
Eof => Ok(List(&*arena.alloc(list))),
_ => Err(ExpectedEOF),
};
}
}
}
// We encountered an EOF before the list ended--that's an error.
Eof => return Err(UnexpectedEOF),
}
}
}
 
/// Convenience method for the common case where you just want to encode a SExp as a String.
pub fn buffer_encode(&self) -> Result<String, Error> {
let mut m = Vec::new();
self.encode(&mut m)?;
// Because encode() only ever writes valid UTF-8, we can safely skip the secondary check we
// normally have to do when converting from Vec<u8> to String. If we didn't know that the
// buffer was already UTF-8, we'd want to call container_as_str() here.
unsafe { Ok(String::from_utf8_unchecked(m)) }
}
}
 
pub const SEXP_STRUCT: SExp<'static> = List(&[
List(&[Str("data"), Str("quoted data"), F64(123.), F64(4.5)]),
List(&[
Str("data"),
List(&[Str("!@#"), List(&[F64(4.5)]), Str("(more"), Str("data)")]),
]),
]);
 
pub const SEXP_STRING_IN: &str = r#"((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))"#;
 
 
and main.rs:
 
use s_expressions::{ParseContext, SExp, SEXP_STRING_IN, SEXP_STRUCT};
 
fn main() {
println!("{:?}", SEXP_STRUCT.buffer_encode());
let ctx = &mut ParseContext::new(SEXP_STRING_IN);
println!("{:?}", SExp::parse(ctx));
}
</syntaxhighlight>{{out}}
<pre>
Ok("((\"data\" \"quoted data\" 123 4.5) (\"data\" (\"!@#\" (4.5) \"(more\" \"data)\")))")
Ok(List([List([Str("data"), Str("quoted data"), F64(123.0), F64(4.5)]), List([Str("data"), List([Str("!@#"), List([F64(4.5)]), Str("(more"), Str("data)")])])]))
</pre>
 
 
=={{header|Scheme}}==
Because Scheme, like all serious lisp implementations, has a native function called read for parsing s-expressions,
{{improve|Scheme|Please demonstrate how to write a reader and printer in lisp. see [[Talk:S-Expressions#lisp_solutions]].}}
this code will never be used. It serves more as an example of how to write simple parsers in Scheme.
It also forgos turning things into their native scheme representation and uses strings for all atoms of data.
 
Note that this example includes erroneous closing quotes when checking for #\" because syntax highlighting sucks
Like Common Lisp, R5RS Scheme has a <code>read</code> function parses an s-expression from an input stream.
and no one should have to wade through blocks of red.
 
Using guile scheme 2.0.11
 
<syntaxhighlight lang="scheme">(define (sexpr-read port)
(define (help port)
(let ((char (read-char port)))
(cond
((or (eof-object? char) (eq? char #\) )) '())
((eq? char #\( ) (cons (help port) (help port)))
((char-whitespace? char) (help port))
((eq? char #\"") (cons (quote-read port) (help port)))
(#t (unread-char char port) (cons (string-read port) (help port))))))
; This is needed because the function conses all parsed sexprs onto something,
; so the top expression is one level too deep.
(car (help port)))
 
(define (quote-read port)
(define (help port)
(let ((char (read-char port)))
(if
(or (eof-object? char) (eq? char #\""))
'()
(cons char (help port)))))
(list->string (help port)))
 
(define (string-read port)
(define (help port)
(let ((char (read-char port)))
(cond
((or (eof-object? char) (char-whitespace? char)) '())
((eq? char #\) ) (unread-char char port) '())
(#t (cons char (help port))))))
(list->string (help port)))
 
(define (format-sexpr expr)
(define (help expr pad)
(if
(list? expr)
(begin
(format #t "~a(~%" (make-string pad #\tab))
(for-each (lambda (x) (help x (1+ pad))) expr)
(format #t "~a)~%" (make-string pad #\tab)))
(format #t "~a~a~%" (make-string pad #\tab) expr)))
(help expr 0))
 
(format-sexpr (sexpr-read
(open-input-string "((data \"quoted data\" 123 4.5) (data (!@# (4.5) \"(more\" \"data)\")))")))</syntaxhighlight>
 
(uses SRFI 6, Basic String Ports)
<lang scheme>(define input "((data \"quoted data\" 123 4.5)
(data (!@# (4.5) \"(more\" \"data)\")))")
(define data (read (open-input-string input)))
(define output (let ((out (open-output-string)))
(write data out)
(get-output-string out)))
(write input) (newline)
(write data) (newline)
(write output) (newline)</lang>
Output:
(
"((data \"quoted data\" 123 4.5)\n (data (!@# (4.5) \"(more\" \"data)\")))"
(
data
quoted data
123
4.5
)
(
data
(
!@#
(
4.5
)
(more
data)
)
)
)
 
=={{header|Sidef}}==
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))
{{trans|Perl}}
<syntaxhighlight lang="ruby">func sexpr(txt) {
txt.trim!
 
if (txt.match(/^\((.*)\)$/s)) {|m|
"((data \"quoted data\" 123 4.5) (data (!@# (4.5) \"(more\" \"data)\")))"
txt = m[0]
}
else {
die "Invalid: <<#{txt}>>"
}
 
var w
var ret = []
 
while (!txt.is_empty) {
given (txt.first) {
when('(') {
(w, txt) = txt.extract_bracketed('()');
w = sexpr(w)
}
when ('"') {
(w, txt) = txt.extract_delimited('"')
w.sub!(/^"(.*)"/, {|s1| s1 })
}
else {
txt.sub!(/^(\S+)/, {|s1| w = s1; '' })
}
}
ret << w
txt.trim_beg!
}
return ret
}
 
func sexpr2txt(String e) {
e ~~ /[\s"\(\)]/ ? do { e.gsub!('"', '\\"'); %Q("#{e}") } : e
}
 
func sexpr2txt(expr) {
'(' + expr.map {|e| sexpr2txt(e) }.join(' ') + ')'
}
 
var s = sexpr(%q{
 
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
 
})
 
say s # dump structure
say sexpr2txt(s) # convert back</syntaxhighlight>
{{out}}
<pre>
[["data", "quoted data", "123", "4.5"], ["data", ["!\@#", ["4.5"], "(more", "data)"]]]
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))
</pre>
 
=={{header|Tcl}}==
Note that because Tcl doesn't expose a type system (well, not in a conventional sense) the parts of the parsed out data structure are tagged lists; the first element is one of “<tt>string</tt>”, “<tt>int</tt>”, “<tt>real</tt>” and “<tt>atom</tt>” to indicate a leaf token, or “<tt>list</tt>” to indicate a sublist. A “native” data structure could also be generated, but then that would turn things into lists that are not in the original.
<langsyntaxhighlight lang="tcl">package require Tcl 8.5
 
proc fromSexp {str} {
Line 2,714 ⟶ 6,652:
return [lindex $content 0]
}
}</langsyntaxhighlight>
Demonstrating with the sample data:
<langsyntaxhighlight lang="tcl">set sample {((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))}
set parsed [fromSexp $sample]
puts "sample: $sample"
puts "parsed: $parsed"
puts "regen: [toSexp $parsed]"</langsyntaxhighlight>
Output:
<pre>
Line 2,730 ⟶ 6,668:
</pre>
As you can see, whitespace is not preserved in non-terminal locations.
 
=={{header|TXR}}==
 
TXR is in the Lisp family, and uses S-Expressions. So right from the system prompt we can do:
 
<pre>$ txr -p '(read)'
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
[Ctrl-D][Enter]
((data "quoted data" 123 4.5) (data (! (sys:var #(4.5)) "(more" "data)")))</pre>
 
However, note that the <code>@</code> character has a special meaning: <code>@obj</code> turns into <code>(sys:var obj)</code>. The purpose of this notation is to support Lisp code that requires meta-variables and meta-expressions. This can be used, for instance, in pattern matching to distinguish binding variables and matching operations from literal syntax.
 
The following solution avoids "cheating" in this way with the built-in parser; it implements a from-scratch S-exp parser which treats <code>!@#</code> as just a symbol.
 
The grammar is roughly as follows:
 
<pre>
expr := ws? atom
| ws? ( ws? expr* ws? )
 
atom := float | int | sym | str
 
float := sign? digit+ . digit* exponent?
| sign? digit* . digit+ exponent?
| sign? digit+ exponent
 
int := sign? digit+
 
str := " (\" | anychar )* "
 
sym := sym-char +
 
sym-char := /* non-whitespace, but not ( and not ) */</pre>
 
Code:
 
<syntaxhighlight lang="txr">@(define float (f))@\
@(local (tok))@\
@(cases)@\
@{tok /[+\-]?\d+\.\d*([Ee][+\-]?\d+)?/}@\
@(or)@\
@{tok /[+\-]?\d*\.\d+([Ee][+\-]?\d+)?/}@\
@(or)@\
@{tok /[+\-]?\d+[Ee][+\-]?\d+/}@\
@(end)@\
@(bind f @(flo-str tok))@\
@(end)
@(define int (i))@\
@(local (tok))@\
@{tok /[+\-]?\d+/}@\
@(bind i @(int-str tok))@\
@(end)
@(define sym (s))@\
@(local (tok))@\
@{tok /[^\s()]+/}@\
@(bind s @(intern tok))@\
@(end)
@(define str (s))@\
@(local (tok))@\
@{tok /"(\\"|[^"])*"/}@\
@(bind s @[tok 1..-1])@\
@(end)
@(define atom (a))@\
@(cases)@\
@(float a)@(or)@(int a)@(or)@(str a)@(or)@(sym a)@\
@(end)@\
@(end)
@(define expr (e))@\
@(cases)@\
@/\s*/@(atom e)@\
@(or)@\
@/\s*\(\s*/@(coll :vars (e))@(expr e)@/\s*/@(last))@(end)@\
@(end)@\
@(end)
@(freeform)
@(expr e)@junk
@(output)
expr: @(format nil "~s" e)
junk: @junk
@(end)</syntaxhighlight>
 
Run:
 
<pre>$ txr s-expressions.txr -
()
expr: nil
junk:
$ txr s-expressions.txr -
3e3
expr: 3000.0
junk:
$ txr s-expressions.txr -
+3
expr: 3
junk:
$ txr s-expressions.txr -
abc*
expr: abc*
junk:
$ txr s-expressions.txr -
abc*)
expr: abc*
junk: )
$ txr s-expressions.txr -
((data "quoted data" 123 4.5)
(data (!@# (4.5) "(more" "data)")))
expr: ((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))
junk:
</pre>
 
TODO: Note that the recognizer for string literals does not actually process the interior escape sequences <code>\"</code>; these remain as part of the string data. The only processing is the stripping of the outer quotes from the lexeme.
 
Explanation of most confusing line:
 
<syntaxhighlight lang="txr"> @/\s*\(\s*/@(coll :vars (e))@(expr e)@/\s*/@(last))@(end)</syntaxhighlight>
 
First, we match an open parenthesis that can be embedded in whitespace. Then we have a <code>@(coll)</code> construct which terminates with <code>@(end)</code>. This is a repetition construct for collecting zero or more items. The <code>:vars (e)</code> argument makes the collect strict: each repetition must bind the variable <code>e</code>. More importantly, in this case, if nothing is
collected, then <code>e</code> gets bound to <code>nil</code> (the empty list). The collect construct does not look at context beyond itself. To terminate the collect at the closing parenthesis we use <code>@(last))</code>. The second closing parenthesis here is literal text to be matched, not TXR syntax. This special clause establishes the terminating context without which the collect will munge all input. When the last clause matches, whatever it matches is consumed and the collect ends. (There is a related <code>@(until)</code> clause which terminates the collect, but leaves its own match unconsumed.)
 
=={{header|Wren}}==
{{trans|Kotlin}}
{{libheader|Wren-pattern}}
{{libheader|Wren-fmt}}
<syntaxhighlight lang="wren">import "./pattern" for Pattern
import "./fmt" for Fmt
 
var INDENT = 2
 
var parseSExpr = Fn.new { |str|
var ipat = " \t\n\f\v\r()\""
var p = Pattern.new("""+0/s["+0^""|(|)|"|+1/I]""", Pattern.within, ipat)
var t = p.findAll(str).map { |m| m.text }.toList
if (t.count == 0) return null
var o = false
var c = 0
for (i in t.count-1..0) {
var ti = t[i].trim()
var nd = Num.fromString(ti)
if (ti == "\"") return null
if (ti == "(") {
t[i] = "["
c = c + 1
} else if (ti == ")") {
t[i] = "]"
c = c - 1
} else if (nd) {
var ni = Num.fromString(ti)
t[i] = ni ? ni.toString : nd.toString
} else if (ti.startsWith("\"")) { // escape embedded double quotes
var temp = ti[1...-1]
t[i] = "\"" + temp.replace("\"", "\\\"") + "\""
}
if (i > 0 && t[i] != "]" && t[i - 1].trim() != "(") t.insert(i, ", ")
if (c == 0) {
if (!o) o = true else return null
}
}
return (c != 0) ? null : t
}
 
var toSExpr = Fn.new { |tokens|
for (i in 0...tokens.count) {
if (tokens[i] == "[") {
tokens[i] = "("
} else if (tokens[i] == "]") {
tokens[i] = ")"
} else if (tokens[i] == ", ") {
tokens[i] = " "
} else if (tokens[i].startsWith("\"")) { // unescape embedded quotes
var temp = tokens[i][1...-1]
tokens[i] = "\"" + temp.replace("\\\"", "\"") + "\""
}
}
return tokens.join()
}
 
var prettyPrint = Fn.new { |tokens|
var level = 0
for (t in tokens) {
var n
if (t == ", " || t == " ") {
continue
} else if (t == "[" || t == "(") {
n = level * INDENT + 1
level = level + 1
} else if (t == "]" || t == ")") {
level = level - 1
n = level * INDENT + 1
} else {
n = level * INDENT + t.count
}
Fmt.print("$*s", n, t)
}
}
 
var str = """((data "quoted data" 123 4.5)""" + "\n" +
""" (data (!@# (4.5) "(more" "data)")))"""
var tokens = parseSExpr.call(str)
if (!tokens) {
System.print("Invalid s-expr!")
} else {
System.print("Native data structure:")
System.print(tokens.join())
System.print("\nNative data structure (pretty print):")
prettyPrint.call(tokens)
 
System.print("\nRecovered S-Expression:")
System.print(toSExpr.call(tokens))
System.print("\nRecovered S-Expression (pretty print):")
prettyPrint.call(tokens)
}</syntaxhighlight>
 
{{out}}
<pre>
Native data structure:
[[data, "quoted data", 123, 4.5], [data, [!@#, [4.5], "(more", "data)"]]]
 
Native data structure (pretty print):
[
[
data
"quoted data"
123
4.5
]
[
data
[
!@#
[
4.5
]
"(more"
"data)"
]
]
]
 
Recovered S-Expression:
((data "quoted data" 123 4.5) (data (!@# (4.5) "(more" "data)")))
 
Recovered S-Expression (pretty print):
(
(
data
"quoted data"
123
4.5
)
(
data
(
!@#
(
4.5
)
"(more"
"data)"
)
)
)
</pre>
 
{{omit from|Brlcad}}