Parse EBNF: Difference between revisions

Revision as of 13:53, 28 January 2024

3,017 bytes added

Added Algol 68

Tigerofdarkness

Revision as of 22:29, 25 August 2022 by Ssotka (→‎{{header|Raku}}); latest revision as of 13:53, 28 January 2024 by Tigerofdarkness (Added Algol 68)
(4 intermediate revisions by 2 users not shown)
Line 11:
See [[Parse EBNF/Tests|the tests]].
<br><br>

=={{header|ALGOL 68}}==
The source of the Algol 68 sample is somewhat largish, so is on a separate page on Rosetta Code here: [[Parse EBNF/ALGOL 68]].
{{out}}
<pre>
Valid EBNF: a/z: 'a' { a = 'a1' ( 'a2' | 'a3' ) { 'a4' } [ 'a5' ] 'a6' ; } 'z'
a: seq literal "a1" oneof seq literal "a2" seq literal "a3" list literal "a4" opt literal "a5" literal "a6"
"a1a3a4a4a5a6" is valid according to a
"a1 a2a6" is valid according to a
"a1 a3 a4 a6" is valid according to a
** Syntax error near "a1 a4 a5 a6"
"a1 a4 a5 a6" is not valid according to a
Syntax error near "a5 a5 a6"
"a1 a2 a4 a5 a5 a6" is not valid according to a
Unexpected text: "a7" at the end of source
"a1 a2 a4 a5 a6 a7" is not valid according to a
** Syntax error near "your ad here"
"your ad here" is not valid according to a
Valid EBNF: Arithmetic expressions: 'Arithmetic expressions' { expr = term { plus term } . term = factor { times factor } . factor = number | '(' expr ')' . plus = '+' | '-' . times = '*' | '/' . number = digit { digit } . digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' . 
}
expr: seq rule term list rule plus rule term
term: seq rule factor list rule times rule factor
factor: oneof seq rule number seq literal "(" rule expr literal ")"
plus: oneof seq literal "+" seq literal "-"
times: oneof seq literal "*" seq literal "/"
number: seq rule digit list rule digit
digit: oneof seq literal "0" seq literal "1" seq literal "2" seq literal "3" seq literal "4" seq literal "5" seq literal "6" seq literal "7" seq literal "8" seq literal "9"
"2" is valid according to Arithmetic expressions
"2*3 + 4/23 - 7" is valid according to Arithmetic expressions
"(3 + 4) * 6-2+(4*(4))" is valid according to Arithmetic expressions
* Syntax error near "-2"
"-2" is not valid according to Arithmetic expressions
Unexpected text: "+" at the end of source
"3 +" is not valid according to Arithmetic expressions
Syntax error near " 3"
"(4 + 3" is not valid according to Arithmetic expressions
Expected "{", not "a"
Invalid EBNF: a = '1';
Expected "}", not "(eof)"
Invalid EBNF: { a = '1' ;
Expected "=", not "world"
Invalid EBNF: { hello world = '1'; }
** Rule "bar" not defined
Invalid EBNF: { foo = bar . }
</pre>

=={{header|Go}}==
Line 16 ⟶ 149:
<br>
A more or less faithful translation except that indices are 0-based rather than 1-based and so 1 less than in the Phix results.
<syntaxhighlight lang="go">package main

import (
Line 420 ⟶ 553:
		fmt.Println()
	}
}</syntaxhighlight>

{{out}}
Line 503 ⟶ 636:
We use Parsec to generate Parsec.
<syntaxhighlight lang="haskell">import Control.Applicative
import Control.Monad
import Data.Maybe
Line 675 ⟶ 808:
lc c = char c <* ws
ws = many $ oneOf " \n\t"</syntaxhighlight>

=={{header|Julia}}==
Line 683 ⟶ 816:
Tested with Julia v1.7.2
<syntaxhighlight lang="julia">
struct Grammar
    regex::Regex
Line 835 ⟶ 968:
end
</syntaxhighlight>

=={{header|Modula-2}}==
<syntaxhighlight lang="modula2">MODULE EBNF;

FROM ASCII IMPORT EOL;
Line 927 ⟶ 1,060:
   Tabulate (T0);
   Tabulate (T1);
END EBNF.</syntaxhighlight>
And the source for the EBNF scanner. I hope you like nested procedures.
<syntaxhighlight lang="modula2">IMPLEMENTATION MODULE EBNFScanner;

FROM ASCII IMPORT LF;
Line 1,096 ⟶ 1,229:
   Ino := 0;
   ch := ' '
END EBNFScanner.</syntaxhighlight>

=={{header|Perl}}==
<syntaxhighlight lang="perl">#!/usr/bin/perl

use strict; # http://www.rosettacode.org/wiki/Parse_EBNF
Line 1,273 ⟶ 1,406:
----------------------------------------------------------------------
{ foo = bar . } "undefined production check"
----------------------------------------------------------------------</syntaxhighlight>
{{out}}
<pre>
Line 1,387 ⟶ 1,520:

=={{header|Phix}}==
<!--<syntaxhighlight lang="phix">(phixonline)-->
 <span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span>
 <span style="color: #004080;">string</span> <span style="color: #000000;">src</span>
Line 1,704 ⟶ 1,837:
     <span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
 <span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<!--</syntaxhighlight>-->
In real use, I would be tempted to use numeric literals rather than string tags in such structures, but the latter certainly make things ten times easier to debug, plus I got an instantly legible syntax tree dump (the bit just after "===>" below) practically for free.
{{out}}
Line 1,771 ⟶ 1,904:

=={{header|PicoLisp}}==
<syntaxhighlight lang="picolisp">(de EBNF
   "expr : term ( ( PLUS | MINUS ) term )* ;"
   "term : factor ( ( MULT | DIV ) factor )* ;"
Line 1,780 ⟶ 1,913:
      (unless (and (match '(@S : @E ;) (str E)) (not (cdr @S)))
         (quit "Invalid EBNF" E) )
      (put (car @S) 'ebnf @E) ) )</syntaxhighlight>
<syntaxhighlight lang="picolisp">(de matchEbnf (Pat)
   (cond
      ((asoq Pat '((PLUS . +) (MINUS . -) (MULT . *) (DIV . /)))
Line 1,823 ⟶ 1,956:
   (let Lst (str Str "")
      (catch NIL
         (parseLst (get 'expr 'ebnf)) ) ) )</syntaxhighlight>
Output:
<pre>: (parseEbnf "1 + 2 * -3 / 7 - 3 * 4")
Line 1,835 ⟶ 1,968:
It is implemented and exercised using the flavor of EBNF and test cases specified on the [[Parse EBNF/Tests|test page]].

<syntaxhighlight lang="raku" line># A Raku grammar to parse EBNF
grammar EBNF {
  rule TOP { ^ <title>? '{' [ <production> ]+ '}' <comment>? $ }
  rule production { <name> '=' <expression> <[.;]> }
  rule expression { <term> +% "|" }
  rule term { <factor>+ }
  rule factor { <group> | <repeat> | <optional> | <identifier> | <literal> }
  rule group { '(' <expression> ')' }
  rule repeat { '{' <expression> '}' }
  rule optional { '[' <expression> ']' }
  token identifier { <-[\|\(\)\{\}\[\]\.\;\"\'\s]>+ } #"
  token literal { ["'" <-[']>+ "'" | '"' <-["]>+ '"'] } #"
  token title { <literal> }
  token comment { <literal> }
  token name { <identifier> <?before \h* '='> }
}

class EBNF::Actions {
    method TOP($/) {
        say "Syntax Tree:\n", $/; # Dump the syntax tree to STDOUT
Line 1,861 ⟶ 1,993:
            ">]+\$\}\n " ~ $<production>>>.ast ~ "\}"
    }
    method production($/) { make 'token ' ~ $<name> ~ ' {' ~ $<expression>.ast ~ "}\n" }
    method expression($/) { make join '|', $<term>>>.ast }
    method term($/) { make join '\h', $<factor>>>.ast }
    method factor($/) {
        make $<literal> ?? $<literal> !!
            $<group> ?? '[' ~ $<group>.ast ~ ']' !!
Line 1,874 ⟶ 2,006:
            '<' ~ $<identifier> ~ '>'
    }
    method repeat($/) { make $<expression>.ast }
    method optional($/) { make $<expression>.ast }
    method group($/) { make $<expression>.ast }
}

# An array of test cases
my @tests = (
    {
        ebnf =>
Line 1,937 ⟶ 2,069:
        teststrings => ['foobar']
    }
);

# Test the parser.
my $i = 1;
for @tests -> $test {
    unless EBNF.parse($test<ebnf>) {
        say "Parsing EBNF grammar:\n";
Line 1,972 ⟶ 2,104:
    say '*' x 79, "\n";
    unlink $fn;
}</syntaxhighlight>

Output:
Line 2,244 ⟶ 2,375:
{{in progress|lang=Ruby|day=12|month=May|year=2011}}
{{incomplete|Ruby|The tokenizer is here, but the parser is very incomplete.}}
<syntaxhighlight lang="ruby">#--
# The tokenizer splits the input into Tokens like "identifier",
# ":", ")" and so on. This design uses a StringScanner on each line of
Line 2,382 ⟶ 2,513:
  parse
end
</syntaxhighlight>

=={{header|Tcl}}==
Line 2,388 ⟶ 2,519:
Demonstration lexer and parser. Note that this parser supports parenthesized expressions, making the grammar recursive.
<syntaxhighlight lang="tcl">package require Tcl 8.6

# Utilities to make the coroutine easier to use
Line 2,522 ⟶ 2,653:
    }
    throw SYNTAX "\"$payload\" at position $index"
}</syntaxhighlight>
<syntaxhighlight lang="tcl"># Demonstration code
puts [parse "1 - 2 - -3 * 4 + 5"]
puts [parse "1 - 2 - -3 * (4 + 5)"]</syntaxhighlight>
Output:
<pre>
Line 2,536 ⟶ 2,667:
{{trans|Phix}}
{{libheader|Wren-fmt}}
{{libheader|Wren-iterate}}
{{libheader|Wren-str}}
{{libheader|Wren-seq}}
Translated via the Go entry.
<syntaxhighlight lang="wren">import "./fmt" for Conv, Fmt
import "./iterate" for Stepped
import "./str" for Char
import "./seq" for Lst

var src = ""
Line 2,907 ⟶ 3,038:
    System.print()
    i = i + 1
}</syntaxhighlight>

{{out}}