S-expressions: Difference between revisions
m (more clarifications)
(implement prototype in pike)
Revision as of 18:10, 15 October 2011
S-Expressions are one convenient way to parse and store data.
Write a simple reader and writer for S-Expressions that handles quoted and unquoted strings, integers and floats.
The reader should read a single but nested S-Expression from a string and store it in a suitable data structure (list, array, etc.). Newlines and other whitespace may be ignored unless contained within a quoted string. Parentheses inside quoted strings are not interpreted, but treated as part of the string. Handling escaped quotes inside a string is optional; thus (foo"bar) may be treated as a string 'foo"bar', or as an error.
Languages that support this may treat unquoted strings as symbols.
The reader should be able to read the following input
<lang lisp>((data "quoted data" 123 4.5)
 (data (123 (4.5) "(more" "data)")))</lang>
and, e.g., in Python produce a list such as:
<lang python>[["data", "quoted data", 123, 4.5],
 ["data", [123, [4.5], "(more", "data)"]]]</lang>
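The reader described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not a reference solution: the names `TOKEN`, `parse_atom` and `read_sexp` are made up for this example, escaped quotes are not handled, and an unquoted token containing a quote (as in `(foo"bar)`) is kept as a plain string.

```python
import re

# One token is a quoted string, a single parenthesis, or a run of
# characters that are neither whitespace nor parentheses.
TOKEN = re.compile(r'"([^"]*)"|([()])|([^\s()]+)')

def parse_atom(text):
    """Turn an unquoted token into an int or float if possible, else keep the string."""
    # Note: float() also accepts words such as "nan" and "inf".
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text

def read_sexp(source):
    """Parse a single (possibly nested) S-expression into nested Python lists."""
    stack = []      # enclosing lists that are still open
    current = None  # the list currently being filled
    for match in TOKEN.finditer(source):
        quoted, paren, atom = match.groups()
        if paren == "(":
            stack.append(current)
            current = []
        elif paren == ")":
            if current is None:
                raise ValueError("unmatched close parenthesis")
            parent = stack.pop()
            if parent is None:
                return current  # the outermost list is complete
            parent.append(current)
            current = parent
        else:
            if current is None:
                raise ValueError("atom outside of a list")
            current.append(quoted if quoted is not None else parse_atom(atom))
    raise ValueError("unexpected end of input")

source = '((data "quoted data" 123 4.5)\n (data (123 (4.5) "(more" "data)")))'
print(read_sexp(source))
# prints [['data', 'quoted data', 123, 4.5], ['data', [123, [4.5], '(more', 'data)']]]
```

Quoted strings are never passed through `parse_atom`, so `"123"` stays a string while an unquoted `123` becomes an integer.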
The writer should be able to take the produced list and turn it into a new S-Expression. Strings that don't contain whitespace or parentheses () don't need to be quoted in the resulting S-Expression, but as a simplification, any string may be quoted.
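A writer along these lines might look like the following Python sketch. The function names `looks_numeric`, `needs_quotes` and `write_sexp` are hypothetical, and two choices go beyond the task text: strings that look like numbers are quoted so they are not mistaken for numbers when read back, and strings containing a double quote are rejected because escaping is not implemented here.

```python
def looks_numeric(text):
    """True if the string would be read back as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def needs_quotes(text):
    """Quote empty strings, number-like strings, and strings with whitespace or parentheses."""
    return (text == ""
            or looks_numeric(text)
            or any(ch.isspace() or ch in "()" for ch in text))

def write_sexp(item):
    """Turn nested lists of strings and numbers into S-expression text."""
    if isinstance(item, list):
        return "(" + " ".join(write_sexp(child) for child in item) + ")"
    if isinstance(item, str):
        if '"' in item:
            # Assumption of this sketch: no escaping mechanism is defined.
            raise ValueError("embedded double quotes are not supported")
        return '"%s"' % item if needs_quotes(item) else item
    return str(item)  # ints and floats

data = [["data", "quoted data", 123, 4.5],
        ["data", [123, [4.5], "(more", "data)"]]]
print(write_sexp(data))
# prints ((data "quoted data" 123 4.5) (data (123 (4.5) "(more" "data)")))
```

Quoting every string unconditionally, as the task allows, would remove the need for `needs_quotes` at the cost of noisier output.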
Pike
This version doesn't yet handle ints and floats, and it doesn't remove unneeded quotes from simple strings.
<lang pike>string input = "((data \"quoted data\" 123 4.5)\n (data (123 (45) \"(more\" \"data)\")))";

// split the input into "(", ")" and string tokens
array tokenizer(string input)
{
    array output = ({});
    for(int i=0; i<sizeof(input); i++)
    {
        switch(input[i])
        {
            case '(': output += ({"("}); break;
            case ')': output += ({")"}); break;
            case '"': // quoted string: take everything up to the closing quote
                      output += array_sscanf(input[++i..], "%s\"%[ \t\n]")[0..0];
                      i += sizeof(output[-1]);
                      break;
            case ' ':
            case '\t':
            case '\n': break;
            default:  // unquoted string: ends at whitespace or a closing parenthesis
                      output += array_sscanf(input[i..], "%s%[) \t\n]")[0..0];
                      i += sizeof(output[-1])-1;
                      break;
        }
    }
    return output;
}

// this function is based on the logic in Parser.C.group() in the pike library
array group(array tokens)
{
    ADT.Stack stack=ADT.Stack();
    array ret =({});

    foreach(tokens;; string token)
    {
        switch(token)
        {
            case "(": stack->push(ret); ret=({}); break;
            case ")":
                // an empty list () is valid, so only an empty stack signals a mismatch
                if (!stack->ptr)
                {
                    // Mismatch
                    werror ("unmatched close parenthesis\n");
                    return ret;
                }
                ret=stack->pop()+({ ret });
                break;
            default: ret+=({token}); break;
        }
    }
    return ret;
}

// write a nested array back out as an S-Expression; every string is quoted
string sexp(array input)
{
    array output = ({});
    foreach(input;; mixed item)
    {
        if (arrayp(item))
            output += ({ sexp(item) });
        else
            output += ({ sprintf("%O", item) });
    }
    return "("+output*" "+")";
}

array data = group(tokenizer(input))[0];
string output = sexp(data);
</lang>
Output:
({({"data", "quoted data", "123", "4.5"}), ({"data", ({"123", ({"45"}), "(more", "data)"})})})
(("data" "quoted data" "123" "4.5") ("data" ("123" ("45") "(more" "data)")))