S-expressions: Difference between revisions

From Rosetta Code
m (more clarifications)
(implement prototype in pike)
Revision as of 18:10, 15 October 2011

S-expressions is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

S-Expressions are one convenient way to parse and store data.

Write a simple reader and writer for S-Expressions that handles quoted and unquoted strings, integers and floats.

The reader should read a single but nested S-Expression from a string and store it in a suitable data structure (list, array, etc.). Newlines and other whitespace may be ignored unless contained within a quoted string. Parentheses inside quoted strings are not interpreted, but treated as part of the string. Handling escaped quotes inside a string is optional; thus (foo"bar) may be treated as a string 'foo"bar', or as an error.

Languages that support this may treat unquoted strings as symbols.
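In a language without a native symbol type, the distinction can still be kept with a small marker type. The following is a minimal sketch in Python; the names `Symbol` and `make_atom` are hypothetical and chosen only for illustration.

```python
class Symbol(str):
    """Hypothetical marker type: an unquoted atom, as opposed to a quoted string."""

    def __repr__(self):
        # Show the marker so symbols and plain strings differ when printed.
        return "Symbol(" + str.__repr__(self) + ")"


def make_atom(text, was_quoted):
    """Return a plain str for quoted text and a Symbol for unquoted text."""
    return text if was_quoted else Symbol(text)
```

Because `Symbol` subclasses `str`, a symbol still compares equal to the same text as a plain string; a writer can use `isinstance(value, Symbol)` to decide whether quotes are needed.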

The reader should be able to read the following input:
<lang lisp>((data "quoted data" 123 4.5)
 (data (123 (4.5) "(more" "data)")))</lang>

and, e.g. in Python, produce a list such as:

<lang python>[["data", "quoted data", 123, 4.5],
 ["data", [123, [4.5], "(more", "data)"]]]</lang>

The writer should be able to take the produced list and turn it into a new S-Expression. Strings that don't contain whitespace or parentheses () don't need to be quoted in the resulting S-Expression, but as a simplification, any string may be quoted.
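The quoting rule for the writer could look roughly like the following Python sketch. The names `BARE`, `looks_numeric` and `write_sexp` are hypothetical. Two choices here are assumptions that go beyond the task text: a string that would read back as a number (such as "123") is quoted so it stays a string, and a string containing a double quote is rejected because this sketch has no escape syntax.

```python
import re

# A string may stay unquoted only if it has no whitespace, parentheses or quotes.
BARE = re.compile(r'[^\s()"]+')


def looks_numeric(text):
    """Return True if an unquoted occurrence of text would read back as a number."""
    for convert in (int, float):
        try:
            convert(text)
            return True
        except ValueError:
            pass
    return False


def write_sexp(value):
    """Serialize nested lists of strings, ints and floats as an S-expression."""
    if isinstance(value, list):
        return "(" + " ".join(write_sexp(item) for item in value) + ")"
    if isinstance(value, str):
        if '"' in value:
            raise ValueError("double quotes are not supported in this sketch")
        if BARE.fullmatch(value) and not looks_numeric(value):
            return value  # simple string: no quotes needed
        return '"' + value + '"'
    return repr(value)  # int or float
```

For the example list above, `write_sexp` returns `((data "quoted data" 123 4.5) (data (123 (4.5) "(more" "data)")))`.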

Pike

This version doesn't yet handle int and float, and it doesn't remove unneeded quotes from simple strings.
<lang pike>string input = "((data \"quoted data\" 123 4.5)\n (data (123 (45) \"(more\" \"data)\")))";

array tokenizer(string input)
{
   array output = ({}); 
   for(int i=0; i<sizeof(input); i++)
   { 
       switch(input[i])
       { 
           case '(': output+= ({"("}); break; 
           case ')': output += ({")"}); break; 
           case '"': output+=array_sscanf(input[++i..], "%s\"%[ \t\n]")[0..0]; 
                     i+=sizeof(output[-1]); 
                     break; 
           case ' ': 
           case '\t': 
           case '\n': break; 
           default: output+=array_sscanf(input[i..], "%s%[) \t\n]")[0..0]; 
                    i+=sizeof(output[-1])-1; break; 
       }
   }
   return output;

}

// This function is based on the logic in Parser.C.group() in the Pike library.
array group(array tokens)
{
   ADT.Stack stack=ADT.Stack();
   array ret =({});
   foreach(tokens;; string token)
   {
       switch(token)
       {
           case "(": stack->push(ret); ret=({}); break;
           case ")":
                   if (!sizeof(ret) || !stack->ptr) 
                   {
                     // Mismatch
                       werror ("unmatched close parenthesis\n");
                       return ret;
                   }
                   ret=stack->pop()+({ ret }); 
                   break;
           default: ret+=({token}); break;
       }
   }
   return ret;

}

string sexp(array input)
{
   array output = ({});
   foreach(input;; mixed item)
   {
       if (arrayp(item))
           output += ({ sexp(item) });
       else
           output += ({ sprintf("%O", item) });
   }
   return "("+output*" "+")";

}

array data = group(tokenizer(input))[0];
string output = sexp(data);
</lang>

Output:

({({"data", "quoted data", "123", "4.5"}), ({"data", ({"123", ({"45"}), "(more", "data)"})})})
(("data" "quoted data" "123" "4.5") ("data" ("123" ("45") "(more" "data)")))