Compiler/syntax analyzer
Latest revision as of 19:40, 30 May 2024
You are encouraged to solve this task according to the task description, using any language you may know.
A Syntax analyzer transforms a token stream (from the Lexical analyzer) into a Syntax tree, based on a grammar.
Take the output from the Lexical analyzer task, and convert it to an Abstract Syntax Tree (AST), based on the grammar below. The output should be in a flattened format.
The program should read input from a file and/or stdin, and write output to a file and/or stdout. If the language being used has a parser module/library/class, it would be great if two versions of the solution are provided: One without the parser module, and one with.
The simple programming language to be analyzed is more or less a (very tiny) subset of C. The formal grammar in Extended Backus-Naur Form (EBNF):
stmt_list = {stmt} ;
stmt = ';'
| Identifier '=' expr ';'
| 'while' paren_expr stmt
| 'if' paren_expr stmt ['else' stmt]
| 'print' '(' prt_list ')' ';'
| 'putc' paren_expr ';'
| '{' stmt_list '}'
;
paren_expr = '(' expr ')' ;
prt_list = (String | expr) {',' (String | expr)} ;
expr = and_expr {'||' and_expr} ;
and_expr = equality_expr {'&&' equality_expr} ;
equality_expr = relational_expr [('==' | '!=') relational_expr] ;
relational_expr = addition_expr [('<' | '<=' | '>' | '>=') addition_expr] ;
addition_expr = multiplication_expr {('+' | '-') multiplication_expr} ;
multiplication_expr = primary {('*' | '/' | '%') primary } ;
primary = Identifier
| Integer
| '(' expr ')'
| ('+' | '-' | '!') primary
;
The resulting AST should be formulated as a Binary Tree.
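The expression levels of the grammar imply a fixed operator-precedence table, from `||` (loosest) up to `*`, `/`, `%` (tightest). As a minimal sketch, the levels can be tabulated directly; the names below (`PRECEDENCE`, `binds_tighter`) are illustrative, not part of the task:

```python
# Binary-operator precedence implied by the grammar, lowest level first:
#   0: ||   1: &&   2: == !=   3: < <= > >=   4: + -   5: * / %
PRECEDENCE = {
    "||": 0,
    "&&": 1,
    "==": 2, "!=": 2,
    "<": 3, "<=": 3, ">": 3, ">=": 3,
    "+": 4, "-": 4,
    "*": 5, "/": 5, "%": 5,
}

def binds_tighter(a, b):
    """True when operator a binds more tightly than operator b."""
    return PRECEDENCE[a] > PRECEDENCE[b]
```

For example, `binds_tighter("*", "+")` is true, which is why `2 + 3 * 4` parses as `2 + (3 * 4)`.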
- Example - given the simple program (below), stored in a file called while.t, create the list of tokens, using one of the Lexical analyzer solutions
lex < while.t > while.lex
- Run one of the Syntax analyzer solutions
parse < while.lex > while.ast
- The following table shows the input to lex, lex output, and the AST produced by the parser
Input to lex:

count = 1;
while (count < 10) {
    print("count is: ", count, "\n");
    count = count + 1;
}

Output from lex, input to parse:

1 1 Identifier count
1 7 Op_assign
1 9 Integer 1
1 10 Semicolon
2 1 Keyword_while
2 7 LeftParen
2 8 Identifier count
2 14 Op_less
2 16 Integer 10
2 18 RightParen
2 20 LeftBrace
3 5 Keyword_print
3 10 LeftParen
3 11 String "count is: "
3 23 Comma
3 25 Identifier count
3 30 Comma
3 32 String "\n"
3 36 RightParen
3 37 Semicolon
4 5 Identifier count
4 11 Op_assign
4 13 Identifier count
4 19 Op_add
4 21 Integer 1
4 22 Semicolon
5 1 RightBrace
6 1 End_of_input

Output from parse:

Sequence
Sequence
;
Assign
Identifier count
Integer 1
While
Less
Identifier count
Integer 10
Sequence
Sequence
;
Sequence
Sequence
Sequence
;
Prts
String "count is: "
;
Prti
Identifier count
;
Prts
String "\n"
;
Assign
Identifier count
Add
Identifier count
Integer 1
- Specifications
- List of node type names
Identifier String Integer Sequence If Prtc Prts Prti While Assign Negate Not Multiply Divide Mod Add Subtract Less LessEqual Greater GreaterEqual Equal NotEqual And Or
In the text below, Null/Empty nodes are represented by ";".
- Non-terminal (internal) nodes
For Operators, the following nodes should be created:
Multiply Divide Mod Add Subtract Less LessEqual Greater GreaterEqual Equal NotEqual And Or
For each of the above nodes, the left and right sub-nodes are the operands of the respective operation.
In pseudo S-Expression format:
(Operator expression expression)
Negate, Not
For these node types, the left node is the operand, and the right node is null.
(Operator expression ;)
Sequence - sub-nodes are either statements or Sequences.
If - left node is the expression, the right node is another If node, with its left node being the if-true statement part, and its right node being the if-false (else) statement part.
(If expression (If statement else-statement))
If there is not an else, the tree becomes:
(If expression (If statement ;))
Prtc
(Prtc (expression) ;)
Prts
(Prts (String "the string") ;)
Prti
(Prti (Integer 12345) ;)
While - left node is the expression, the right node is the statement.
(While expression statement)
Assign - left node is the left-hand side of the assignment, the right node is the right-hand side of the assignment.
(Assign Identifier expression)
Terminal (leaf) nodes:
Identifier: (Identifier ident_name)
Integer: (Integer 12345)
String: (String "Hello World!")
";": Empty node
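Putting the node descriptions above together, a minimal node representation might look like this in Python (the `Node`/`make_node`/`make_leaf` names mirror the pseudo-code below; the exact record shape is an assumption):

```python
from collections import namedtuple

# One record shape for every AST node: interior nodes use left/right,
# leaf nodes use value, and the empty node is plain None.
Node = namedtuple("Node", "node_type left right value")

def make_node(node_type, left, right=None):
    return Node(node_type, left, right, None)

def make_leaf(node_type, value):
    return Node(node_type, None, None, value)

# The tree for "a = 11;", without the enclosing Sequence:
tree = make_node("Assign", make_leaf("Identifier", "a"), make_leaf("Integer", 11))
```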
- Some simple examples
Sequences denote a list node; they are used to represent a list. Semicolons represent a null node, e.g., the end of this path.
This simple program:
a=11;
Produces the following AST, encoded as a binary tree:
Under each non-leaf node are two '|' lines. The first represents the left sub-node, the second represents the right sub-node:
(1) Sequence
(2) |-- ;
(3) |-- Assign
(4)     |-- Identifier: a
(5)     |-- Integer: 11
In flattened form:
(1) Sequence
(2) ;
(3) Assign
(4) Identifier a
(5) Integer 11
This program:
a=11; b=22; c=33;
Produces the following AST:
( 1) Sequence
( 2) |-- Sequence
( 3) |   |-- Sequence
( 4) |   |   |-- ;
( 5) |   |   |-- Assign
( 6) |   |       |-- Identifier: a
( 7) |   |       |-- Integer: 11
( 8) |   |-- Assign
( 9) |       |-- Identifier: b
(10) |       |-- Integer: 22
(11) |-- Assign
(12)     |-- Identifier: c
(13)     |-- Integer: 33
In flattened form:
( 1) Sequence
( 2) Sequence
( 3) Sequence
( 4) ;
( 5) Assign
( 6) Identifier a
( 7) Integer 11
( 8) Assign
( 9) Identifier b
(10) Integer 22
(11) Assign
(12) Identifier c
(13) Integer 33
- Pseudo-code for the parser.
Uses Precedence Climbing for expression parsing, and Recursive Descent for statement parsing. The AST is also built:
def expr(p)
if tok is "("
x = paren_expr()
elif tok in ["-", "+", "!"]
gettok()
y = expr(precedence of operator)
if operator was "+"
x = y
else
x = make_node(operator, y)
elif tok is an Identifier
x = make_leaf(Identifier, variable name)
gettok()
elif tok is an Integer constant
x = make_leaf(Integer, integer value)
gettok()
else
error()
while tok is a binary operator and precedence of tok >= p
save_tok = tok
gettok()
q = precedence of save_tok
if save_tok is not right associative
q += 1
x = make_node(Operator save_tok represents, x, expr(q))
return x
def paren_expr()
expect("(")
x = expr(0)
expect(")")
return x
def stmt()
t = NULL
if accept("if")
e = paren_expr()
s = stmt()
t = make_node(If, e, make_node(If, s, accept("else") ? stmt() : NULL))
elif accept("putc")
t = make_node(Prtc, paren_expr())
expect(";")
elif accept("print")
expect("(")
repeat
if tok is a string
e = make_node(Prts, make_leaf(String, the string))
gettok()
else
e = make_node(Prti, expr(0))
t = make_node(Sequence, t, e)
until not accept(",")
expect(")")
expect(";")
elif tok is ";"
gettok()
elif tok is an Identifier
v = make_leaf(Identifier, variable name)
gettok()
expect("=")
t = make_node(Assign, v, expr(0))
expect(";")
elif accept("while")
e = paren_expr()
t = make_node(While, e, stmt())
elif accept("{")
while tok not equal "}" and tok not equal end-of-file
t = make_node(Sequence, t, stmt())
expect("}")
elif tok is end-of-file
pass
else
error()
return t
def parse()
t = NULL
gettok()
repeat
t = make_node(Sequence, t, stmt())
until tok is end-of-file
return t
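The expression half of the pseudo-code above can be sketched as runnable Python. This is a minimal illustration of precedence climbing over a pre-tokenized list: unary operators, identifiers, and error recovery are omitted, and the token/precedence tables are assumptions derived from the grammar:

```python
# Precedence and node-name tables assumed from the grammar; every binary
# operator in this language is left-associative.
PREC = {"||": 0, "&&": 1, "==": 2, "!=": 2, "<": 3, "<=": 3, ">": 3,
        ">=": 3, "+": 4, "-": 4, "*": 5, "/": 5, "%": 5}
NODE = {"||": "Or", "&&": "And", "==": "Equal", "!=": "NotEqual",
        "<": "Less", "<=": "LessEqual", ">": "Greater", ">=": "GreaterEqual",
        "+": "Add", "-": "Subtract", "*": "Multiply", "/": "Divide", "%": "Mod"}

def parse_expr(tokens, p=0):
    """tokens: a list where ints are operands and strings are operators."""
    tok = tokens.pop(0)
    if tok == "(":
        x = parse_expr(tokens, 0)
        assert tokens.pop(0) == ")", "')' expected"
    else:
        x = ("Integer", tok)                      # leaf node
    # Climb: recurse at precedence + 1 because the operator is left-associative.
    while tokens and tokens[0] in PREC and PREC[tokens[0]] >= p:
        op = tokens.pop(0)
        x = (NODE[op], x, parse_expr(tokens, PREC[op] + 1))
    return x
```

For example, `parse_expr([2, "+", 3, "*", 4])` yields `("Add", ("Integer", 2), ("Multiply", ("Integer", 3), ("Integer", 4)))`: the `*` binds tighter than the `+`, exactly as the grammar requires.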
- Once the AST is built, it should be output in a flattened format. This can be as simple as the following
def prt_ast(t)
if t == NULL
print(";\n")
else
print(t.node_type)
if t.node_type in [Identifier, Integer, String] # leaf node
print the value of the Ident, Integer or String, "\n"
else
print("\n")
prt_ast(t.left)
prt_ast(t.right)
- If the AST is correctly built, loading it into a subsequent program should be as simple as
def load_ast()
line = readline()
# Each line has at least one token
line_list = tokenize the line, respecting double quotes
text = line_list[0] # first token is always the node type
if text == ";" # a terminal node
return NULL
node_type = text # could convert to internal form if desired
# A line with two tokens is a leaf node
# Leaf nodes are: Identifier, Integer, String
# The 2nd token is the value
if len(line_list) > 1
return make_leaf(node_type, line_list[1])
left = load_ast()
right = load_ast()
return make_node(node_type, left, right)
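A direct Python transcription of this pseudo-code, reading from a list of lines rather than a file (a sketch; the tuple-shaped nodes are an assumption):

```python
import shlex

def load_ast(lines):
    """Rebuild the binary AST from its flattened form (one node per line)."""
    parts = shlex.split(lines.pop(0))   # tokenize, respecting double quotes
    node_type = parts[0]                # first token is always the node type
    if node_type == ";":                # a terminal (empty) node
        return None
    if len(parts) > 1:                  # leaf node: Identifier, Integer, String
        return (node_type, parts[1])
    left = load_ast(lines)
    right = load_ast(lines)
    return (node_type, left, right)

# The flattened output for "a=11;" from the earlier example:
flat = ["Sequence", ";", "Assign", "Identifier a", "Integer 11"]
ast = load_ast(flat)
# ast == ("Sequence", None, ("Assign", ("Identifier", "a"), ("Integer", "11")))
```

Note that `shlex.split` strips the double quotes from string literals, so a later pass can decide whether to keep values in external or internal form.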
Finally, the AST can also be tested by running it against one of the AST Interpreter solutions.
- Test program, assuming this is in a file called prime.t
- lex <prime.t | parse
Input to lex:
/*
 Simple prime number generator
 */
count = 1;
n = 1;
limit = 100;
while (n < limit) {
    k=3;
    p=1;
    n=n+2;
    while ((k*k<=n) && (p)) {
        p=n/k*k!=n;
        k=k+2;
    }
    if (p) {
        print(n, " is prime\n");
        count = count + 1;
    }
}
print("Total primes found: ", count, "\n");
Output from lex, input to parse:

4 1 Identifier count
4 7 Op_assign
4 9 Integer 1
4 10 Semicolon
5 1 Identifier n
5 3 Op_assign
5 5 Integer 1
5 6 Semicolon
6 1 Identifier limit
6 7 Op_assign
6 9 Integer 100
6 12 Semicolon
7 1 Keyword_while
7 7 LeftParen
7 8 Identifier n
7 10 Op_less
7 12 Identifier limit
7 17 RightParen
7 19 LeftBrace
8 5 Identifier k
8 6 Op_assign
8 7 Integer 3
8 8 Semicolon
9 5 Identifier p
9 6 Op_assign
9 7 Integer 1
9 8 Semicolon
10 5 Identifier n
10 6 Op_assign
10 7 Identifier n
10 8 Op_add
10 9 Integer 2
10 10 Semicolon
11 5 Keyword_while
11 11 LeftParen
11 12 LeftParen
11 13 Identifier k
11 14 Op_multiply
11 15 Identifier k
11 16 Op_lessequal
11 18 Identifier n
11 19 RightParen
11 21 Op_and
11 24 LeftParen
11 25 Identifier p
11 26 RightParen
11 27 RightParen
11 29 LeftBrace
12 9 Identifier p
12 10 Op_assign
12 11 Identifier n
12 12 Op_divide
12 13 Identifier k
12 14 Op_multiply
12 15 Identifier k
12 16 Op_notequal
12 18 Identifier n
12 19 Semicolon
13 9 Identifier k
13 10 Op_assign
13 11 Identifier k
13 12 Op_add
13 13 Integer 2
13 14 Semicolon
14 5 RightBrace
15 5 Keyword_if
15 8 LeftParen
15 9 Identifier p
15 10 RightParen
15 12 LeftBrace
16 9 Keyword_print
16 14 LeftParen
16 15 Identifier n
16 16 Comma
16 18 String " is prime\n"
16 31 RightParen
16 32 Semicolon
17 9 Identifier count
17 15 Op_assign
17 17 Identifier count
17 23 Op_add
17 25 Integer 1
17 26 Semicolon
18 5 RightBrace
19 1 RightBrace
20 1 Keyword_print
20 6 LeftParen
20 7 String "Total primes found: "
20 29 Comma
20 31 Identifier count
20 36 Comma
20 38 String "\n"
20 42 RightParen
20 43 Semicolon
21 1 End_of_input
Output from parse:

Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier count
Integer 1
Assign
Identifier n
Integer 1
Assign
Identifier limit
Integer 100
While
Less
Identifier n
Identifier limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier k
Integer 3
Assign
Identifier p
Integer 1
Assign
Identifier n
Add
Identifier n
Integer 2
While
And
LessEqual
Multiply
Identifier k
Identifier k
Identifier n
Identifier p
Sequence
Sequence
;
Assign
Identifier p
NotEqual
Multiply
Divide
Identifier n
Identifier k
Identifier k
Identifier n
Assign
Identifier k
Add
Identifier k
Integer 2
If
Identifier p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier n
;
Prts
String " is prime\n"
;
Assign
Identifier count
Add
Identifier count
Integer 1
;
Sequence
Sequence
Sequence
;
Prts
String "Total primes found: "
;
Prti
Identifier count
;
Prts
String "\n"
;
- Additional examples
Your solution should pass all the test cases above and the additional tests found Here.
The C and Python versions can be considered reference implementations.
- Related Tasks
ALGOL W
begin % syntax analyser %
% parse tree nodes %
record node( integer type
; reference(node) left, right
; integer iValue % nString/nIdentifier number or nInteger value %
);
integer nIdentifier, nString, nInteger, nSequence, nIf, nPrtc, nPrts
, nPrti, nWhile, nAssign, nNegate, nNot, nMultiply
, nDivide, nMod, nAdd, nSubtract, nLess, nLessEqual
, nGreater, nGreaterEqual, nEqual, nNotEqual, nAnd, nOr
;
string(14) array ndName ( 1 :: 25 );
% tokens - names must match those output by the lexical analyser %
integer tkType, tkLine, tkColumn, tkLength, tkIntegerValue;
integer tOp_multiply , tOp_divide , tOp_mod , tOp_add
, tOp_subtract , tOp_negate , tOp_less , tOp_lessequal
, tOp_greater , tOp_greaterequal , tOp_equal , tOp_notequal
, tOp_not , tOp_assign , tOp_and , tOp_or
, tLeftParen , tRightParen , tLeftBrace , tRightBrace
, tSemicolon , tComma , tKeyword_if , tKeyword_else
, tKeyword_while , tKeyword_print , tKeyword_putc , tIdentifier
, tInteger , tString , tEnd_of_input
, MAX_TOKEN_TYPE, PRIMARY_PREC
;
string(16) array tkName ( 1 :: 31 );
integer array tkPrec, tkNode ( 1 :: 31 );
% string literals and identifiers - uses a linked list - a hash table might be better... %
string(1) array text ( 0 :: 4095 );
integer textNext, TEXT_MAX;
record textElement ( integer start, length; reference(textElement) next );
reference(textElement) idList, stList;
% returns a new node with left and right branches %
reference(node) procedure opNode ( integer value opType; reference(node) value opLeft, opRight ) ; begin
node( opType, opLeft, opRight, 0 )
end opNode ;
% returns a new operand node %
reference(node) procedure operandNode ( integer value opType, opValue ) ; begin
node( opType, null, null, opValue )
end operandNode ;
% reports an error %
procedure synError( integer value line, column; string(80) value message ); begin
integer errorPos;
write( i_w := 1, s_w := 0, "**** Error at(", line, ",", column, "): " );
errorPos := 0;
while errorPos < 80 and message( errorPos // 1 ) not = "." do begin
writeon( s_w := 0, message( errorPos // 1 ) );
errorPos := errorPos + 1
end while_not_at_end_of_message ;
writeon( s_w := 0, "." )
end synError ;
% reports an error and stops %
procedure fatalError( integer value line, column; string(80) value message ); begin
synError( line, column, message );
assert( false )
end fatalError ;
% prints a node and its sub-nodes %
procedure writeNode( reference(node) value n ) ; begin
% prints an identifier or string from text %
procedure writeOnText( reference(textElement) value txHead; integer value txNumber ) ;
begin
reference(textElement) txPos;
integer count;
txPos := txHead;
count := 1;
while count < txNumber and txPos not = null do begin
txPos := next(txPos);
count := count + 1
end while_text_element_not_found ;
if txPos = null then fatalError( 0, txNumber, "INTERNAL ERROR: text not found." )
else for cPos := 0 until length(txPos) - 1 do writeon( text( start(txPos) + cPos ) );
if text( start(txPos) ) = """" then writeon( """" );
end writeOnText ;
if n = null then write( ";" )
else begin
write( ndName( type(n) ) );
if type(n) = nInteger then writeon( iValue(n) )
else if type(n) = nIdentifier then writeOnText( idList, iValue(n) )
else if type(n) = nString then writeOnText( stList, iValue(n) )
else begin
writeNode( left(n) );
writeNode( right(n) )
end
end
end writeNode ;
% reads a token from standard input %
procedure readToken ; begin
% parses a string from line and stores it in a string in the text array %
% - if it is not already present in the specified textElement list. %
% returns the position of the string in the text array %
integer procedure readString ( reference(textElement) value result txList; string(1) value terminator ) ; begin
string(256) str;
integer sLen, sPos, ePos;
logical found;
reference(textElement) txPos, txLastPos;
% get the text of the string %
str := " ";
sLen := 0;
str( sLen // 1 ) := line( lPos // 1 );
sLen := sLen + 1;
lPos := lPos + 1;
while lPos <= 255 and line( lPos // 1 ) not = terminator do begin
str( sLen // 1 ) := line( lPos // 1 );
sLen := sLen + 1;
lPos := lPos + 1
end while_more_string ;
if lPos > 255 then fatalError( tkLine, tkColumn, "Unterminated String in token file." );
% attempt to find the text in the list of strings/identifiers %
txLastPos := txPos := txList;
found := false;
ePos := 0;
while not found and txPos not = null do begin
ePos := ePos + 1;
found := ( length(txPos) = sLen );
sPos := 0;
while found and sPos < sLen do begin
found := str( sPos // 1 ) = text( start(txPos) + sPos );
sPos := sPos + 1
end while_not_found ;
txLastPos := txPos;
if not found then txPos := next(txPos)
end while_string_not_found ;
if not found then begin
% the string/identifier is not in the list - add it %
ePos := ePos + 1;
if txList = null then txList := textElement( textNext, sLen, null )
else next(txLastPos) := textElement( textNext, sLen, null );
if textNext + sLen > TEXT_MAX then fatalError( tkLine, tkColumn, "Text space exhausted." )
else begin
for cPos := 0 until sLen - 1 do begin
text( textNext ) := str( cPos // 1 );
textNext := textNext + 1
end for_cPos
end
end if_not_found ;
ePos
end readString ;
% gets an integer from the line - no checks for valid digits %
integer procedure readInteger ; begin
integer n;
while line( lPos // 1 ) = " " do lPos := lPos + 1;
n := 0;
while line( lPos // 1 ) not = " " do begin
n := ( n * 10 ) + ( decode( line( lPos // 1 ) ) - decode( "0" ) );
lPos := lPos + 1
end while_not_end_of_integer ;
n
end readInteger ;
string(256) line;
string(16) name;
integer lPos, tPos;
tPos := lPos := 0;
readcard( line );
% get the line and column numbers %
tkLine := readInteger;
tkColumn := readInteger;
% get the token name %
while line( lPos // 1 ) = " " do lPos := lPos + 1;
name := "";
while lPos < 256 and line( lPos // 1 ) not = " " do begin
name( tPos // 1 ) := line( lPos // 1 );
lPos := lPos + 1;
tPos := tPos + 1
end while_more_name ;
% determine the token type %
tkType := 1;
tkIntegerValue := 0;
while tkType <= MAX_TOKEN_TYPE and name not = tkName( tkType ) do tkType := tkType + 1;
if tkType > MAX_TOKEN_TYPE then fatalError( tkLine, tkColumn, "Malformed token" );
% handle the additional parameter for identifier/string/integer %
if tkType = tInteger or tkType = tIdentifier or tkType = tString then begin
while line( lPos // 1 ) = " " do lPos := lPos + 1;
if tkType = tInteger then tkIntegerValue := readInteger
else if tkType = tIdentifier then tkIntegerValue := readString( idList, " " )
else % tkType = tString % tkIntegerValue := readString( stList, """" )
end if_token_with_additional_parameter ;
end readToken ;
% parses a statement %
reference(node) procedure parseStatement ; begin
reference(node) stmtNode, stmtExpr;
% skips the current token if it is expectedToken, %
% returns true if the token was expectedToken, false otherwise %
logical procedure have ( integer value expectedToken ) ; begin
logical haveExpectedToken;
haveExpectedToken := ( tkType = expectedToken );
if haveExpectedToken and tkType not = tEnd_of_input then readToken;
haveExpectedToken
end have ;
% issues an error message and skips past the next semi-colon or to end of input %
procedure skipStatement ( string(80) value message ) ; begin
synError( tkLine, tkColumn, message );
while tkType not = tEnd_of_input and not have( tSemicolon ) do readToken
end skipStatement ;
% checks we have a semicolon, issues an error and skips the statement if not %
procedure mustBeEndOfStatement ; begin
if not have( tSemicolon ) then skipStatement( """;"" expected." )
end mustBeEndOfStatement ;
% skips the current token if it is "(" and issues an error if it isn't %
procedure mustBeLeftParen ; begin
if not have( tLeftParen ) then synError( tkLine, tkColumn, """("" expected." )
end % mustBeLeftParen % ;
% skips the current token if it is ")" and issues an error if it isn't %
procedure mustBeRightParen ; begin
if not have( tRightParen ) then synError( tkLine, tkColumn, """)"" expected." )
end % mustBeRightParen % ;
% gets the next token and parses an expression with the specified precedence %
reference(node) procedure nextAndParseExpr ( integer value precedence ) ; begin
readToken;
parseExpr( precedence )
end nextAndParseExpr ;
% parses an expression with the specified precedence %
% all operators are assumed to be left-associative %
reference(node) procedure parseExpr ( integer value precedence ) ; begin
% handles a single token primary %
reference(node) procedure simplePrimary ( integer value primaryNodeType ) ; begin
reference(node) primaryNode;
primaryNode := operandNode( primaryNodeType, tkIntegerValue );
readToken;
primaryNode
end simplePrimary ;
reference(node) exprNode;
if precedence < PRIMARY_PREC then begin
exprNode := parseExpr( precedence + 1 );
while tkPrec( tkType ) = precedence do begin
integer op;
op := tkNode( tkType );
exprNode := opNode( op, exprNode, nextAndParseExpr( precedence + 1 ) )
end while_op_at_this_precedence_level
end
else if tkType = tIdentifier then exprNode := simplePrimary( nIdentifier )
else if tkType = tInteger then exprNode := simplePrimary( nInteger )
else if tkType = tString then begin
synError( tkLine, tkColumn, "Unexpected string literal." );
exprNode := simplePrimary( nInteger )
end
else if tkType = tLeftParen then exprNode := parseParenExpr
else if tkType = tOp_add then exprNode := nextAndParseExpr( precedence )
else if tkType = tOp_subtract then exprNode := opNode( nNegate, nextAndParseExpr( precedence ), null )
else if tkType = tOp_not then exprNode := opNode( nNot, nextAndParseExpr( precedence ), null )
else begin
synError( tkLine, tkColumn, "Syntax error in expression." );
exprNode := simplePrimary( nInteger )
end;
exprNode
end parseExpr ;
% parses a parenthesised expression %
reference(node) procedure parseParenExpr ; begin
reference(node) exprNode;
mustBeLeftParen;
exprNode := parseExpr( 0 );
mustBeRightParen;
exprNode
end parseParenExpr ;
% parse statement depending on its first token %
if tkType = tIdentifier then begin % assignment statement %
stmtExpr := operandNode( nIdentifier, tkIntegerValue );
% skip the identifier and check for "=" %
readToken;
if not have( tOp_Assign ) then synError( tkLine, tkColumn, "Expected ""="" in assignment statement." );
stmtNode := opNode( nAssign, stmtExpr, parseExpr( 0 ) );
mustBeEndOfStatement
end
else if have( tKeyword_while ) then begin
stmtExpr := parseParenExpr;
stmtNode := opNode( nWhile, stmtExpr, parseStatement )
end
else if have( tkeyword_if ) then begin
stmtExpr := parseParenExpr;
stmtNode := opNode( nIf, stmtExpr, opNode( nIf, parseStatement, null ) );
if have( tKeyword_else ) then % have an "else" part % right(right(stmtNode)) := parseStatement
end
else if have( tKeyword_Print ) then begin
mustBeLeftParen;
stmtNode := null;
while begin
if tkType = tString then begin
stmtNode := opNode( nSequence
, stmtNode
, opNode( nPrts, operandNode( nString, tkIntegerValue ), null )
);
readToken
end
else stmtNode := opNode( nSequence, stmtNode, opNode( nPrti, parseExpr( 0 ), null ) );
have( tComma )
end do begin end;
mustBeRightparen;
mustBeEndOfStatement;
end
else if have( tKeyword_Putc ) then begin
stmtNode := opNode( nPrtc, parseParenExpr, null );
mustBeEndOfStatement
end
else if have( tLeftBrace ) then begin % block %
stmtNode := parseStatementList( tRightBrace );
if not have( tRightBrace ) then synError( tkLine, tkColumn, "Expected ""}""." );
end
else if have( tSemicolon ) then stmtNode := null
else begin % unrecognised statement %
skipStatement( "Unrecognised statement." );
stmtNode := null
end if_various_tokens ;
stmtNode
end parseStatement ;
% parses a statement list ending with the specified terminator %
reference(node) procedure parseStatementList ( integer value terminator ) ; begin
reference(node) listNode;
listNode := null;
while tkType not = terminator
and tkType not = tEnd_of_input do listNode := opNode( nSequence, listNode, parseStatement );
listNode
end parseStatementList ;
% sets a node code and name %
procedure setNode ( integer result nd; integer value ndCode; string(14) value name ) ;
begin nd := ndCode; ndName( ndCode ) := name end;
setNode( nIdentifier, 1, "Identifier" ); setNode( nString, 2, "String" );
setNode( nInteger, 3, "Integer" ); setNode( nSequence, 4, "Sequence" ); setNode( nIf, 5, "If" );
setNode( nPrtc, 6, "Prtc" ); setNode( nPrts, 7, "Prts" );
setNode( nPrti, 8, "Prti" ); setNode( nWhile, 9, "While" );
setNode( nAssign, 10, "Assign" ); setNode( nNegate, 11, "Negate" ); setNode( nNot, 12, "Not" );
setNode( nMultiply, 13, "Multiply" ); setNode( nDivide, 14, "Divide" ); setNode( nMod, 15, "Mod" );
setNode( nAdd, 16, "Add" ); setNode( nSubtract, 17, "Subtract" );
setNode( nLess, 18, "Less" ); setNode( nLessEqual, 19, "LessEqual" );
setNode( nGreater, 20, "Greater" );
setNode( nGreaterEqual, 21, "GreaterEqual" ); setNode( nEqual, 22, "Equal" );
setNode( nNotEqual, 23, "NotEqual" ); setNode( nAnd, 24, "And" ); setNode( nOr, 25, "Or" );
tOp_multiply := 1; tkName( tOp_multiply ) := "Op_multiply"; tkPrec( tOp_multiply ) := 5;
tOp_divide := 2; tkName( tOp_divide ) := "Op_divide"; tkPrec( tOp_divide ) := 5;
tOp_mod := 3; tkName( tOp_mod ) := "Op_mod"; tkPrec( tOp_mod ) := 5;
tOp_add := 4; tkName( tOp_add ) := "Op_add"; tkPrec( tOp_add ) := 4;
tOp_subtract := 5; tkName( tOp_subtract ) := "Op_subtract"; tkPrec( tOp_subtract ) := 4;
tOp_negate := 6; tkName( tOp_negate ) := "Op_negate"; tkPrec( tOp_negate ) := -1;
tOp_less := 7; tkName( tOp_less ) := "Op_less"; tkPrec( tOp_less ) := 3;
tOp_lessequal := 8; tkName( tOp_lessequal ) := "Op_lessequal"; tkPrec( tOp_lessequal ) := 3;
tOp_greater := 9; tkName( tOp_greater ) := "Op_greater"; tkPrec( tOp_greater ) := 3;
tOp_greaterequal := 10; tkName( tOp_greaterequal ) := "Op_greaterequal"; tkPrec( tOp_greaterequal ) := 3;
tOp_equal := 11; tkName( tOp_equal ) := "Op_equal"; tkPrec( tOp_equal ) := 2;
tOp_notequal := 12; tkName( tOp_notequal ) := "Op_notequal"; tkPrec( tOp_notequal ) := 2;
tOp_not := 13; tkName( tOp_not ) := "Op_not"; tkPrec( tOp_not ) := -1;
tOp_assign := 14; tkName( tOp_assign ) := "Op_assign"; tkPrec( tOp_assign ) := -1;
tOp_and := 15; tkName( tOp_and ) := "Op_and"; tkPrec( tOp_and ) := 1;
tOp_or := 16; tkName( tOp_or ) := "Op_or"; tkPrec( tOp_or ) := 0;
tLeftParen := 17; tkName( tLeftParen ) := "LeftParen"; tkPrec( tLeftParen ) := -1;
tRightParen := 18; tkName( tRightParen ) := "RightParen"; tkPrec( tRightParen ) := -1;
tLeftBrace := 19; tkName( tLeftBrace ) := "LeftBrace"; tkPrec( tLeftBrace ) := -1;
tRightBrace := 20; tkName( tRightBrace ) := "RightBrace"; tkPrec( tRightBrace ) := -1;
tSemicolon := 21; tkName( tSemicolon ) := "Semicolon"; tkPrec( tSemicolon ) := -1;
tComma := 22; tkName( tComma ) := "Comma"; tkPrec( tComma ) := -1;
tKeyword_if := 23; tkName( tKeyword_if ) := "Keyword_if"; tkPrec( tKeyword_if ) := -1;
tKeyword_else := 24; tkName( tKeyword_else ) := "Keyword_else"; tkPrec( tKeyword_else ) := -1;
tKeyword_while := 25; tkName( tKeyword_while ) := "Keyword_while"; tkPrec( tKeyword_while ) := -1;
tKeyword_print := 26; tkName( tKeyword_print ) := "Keyword_print"; tkPrec( tKeyword_print ) := -1;
tKeyword_putc := 27; tkName( tKeyword_putc ) := "Keyword_putc"; tkPrec( tKeyword_putc ) := -1;
tIdentifier := 28; tkName( tIdentifier ) := "Identifier"; tkPrec( tIdentifier ) := -1;
tInteger := 29; tkName( tInteger ) := "Integer"; tkPrec( tInteger ) := -1;
tString := 30; tkName( tString ) := "String"; tkPrec( tString ) := -1;
tEnd_of_input := 31; tkName( tEnd_of_input ) := "End_of_input"; tkPrec( tEnd_of_input ) := -1;
MAX_TOKEN_TYPE := 31; TEXT_MAX := 4095; textNext := 0; PRIMARY_PREC := 6;
for tkPos := 1 until MAX_TOKEN_TYPE do tkNode( tkPos ) := - tkPos;
tkNode( tOp_multiply ) := nMultiply; tkNode( tOp_divide ) := nDivide; tkNode( tOp_mod ) := nMod;
tkNode( tOp_add ) := nAdd; tkNode( tOp_subtract ) := nSubtract; tkNode( tOp_less ) := nLess;
tkNode( tOp_lessequal ) := nLessEqual; tkNode( tOp_greater ) := nGreater;
tkNode( tOp_greaterequal ) := nGreaterEqual;
tkNode( tOp_equal ) := nEqual; tkNode( tOp_notequal ) := nNotEqual; tkNode( tOp_not ) := nNot;
tkNode( tOp_and ) := nAnd; tkNode( tOp_or ) := nOr;
stList := idList := null;
% parse the output from the lexical analyser and output the linearised parse tree %
readToken;
writeNode( parseStatementList( tEnd_of_input ) )
end.
- Output:
Output from parsing the Prime Numbers example program.
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier count
Integer 1
Assign
Identifier n
Integer 1
Assign
Identifier limit
Integer 100
While
Less
Identifier n
Identifier limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier k
Integer 3
Assign
Identifier p
Integer 1
Assign
Identifier n
Add
Identifier n
Integer 2
While
And
LessEqual
Multiply
Identifier k
Identifier k
Identifier n
Identifier p
Sequence
Sequence
;
Assign
Identifier p
NotEqual
Multiply
Divide
Identifier n
Identifier k
Identifier k
Identifier n
Assign
Identifier k
Add
Identifier k
Integer 2
If
Identifier p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier n
;
Prts
String " is prime\n"
;
Assign
Identifier count
Add
Identifier count
Integer 1
;
Sequence
Sequence
Sequence
;
Prts
String "Total primes found: "
;
Prti
Identifier count
;
Prts
String "\n"
;
ATS
(********************************************************************)
(* Usage: parse [INPUTFILE [OUTPUTFILE]]
If INPUTFILE or OUTPUTFILE is "-" or missing, then standard input
or standard output is used, respectively. *)
#include "share/atspre_staload.hats"
staload UN = "prelude/SATS/unsafe.sats"
#define NIL list_nil ()
#define :: list_cons
%{^
/* alloca(3) is needed for ATS exceptions. */
#include <alloca.h>
%}
(********************************************************************)
#define NUM_TOKENS 31
#define TOKEN_ELSE 0
#define TOKEN_IF 1
#define TOKEN_PRINT 2
#define TOKEN_PUTC 3
#define TOKEN_WHILE 4
#define TOKEN_MULTIPLY 5
#define TOKEN_DIVIDE 6
#define TOKEN_MOD 7
#define TOKEN_ADD 8
#define TOKEN_SUBTRACT 9
#define TOKEN_NEGATE 10
#define TOKEN_LESS 11
#define TOKEN_LESSEQUAL 12
#define TOKEN_GREATER 13
#define TOKEN_GREATEREQUAL 14
#define TOKEN_EQUAL 15
#define TOKEN_NOTEQUAL 16
#define TOKEN_NOT 17
#define TOKEN_ASSIGN 18
#define TOKEN_AND 19
#define TOKEN_OR 20
#define TOKEN_LEFTPAREN 21
#define TOKEN_RIGHTPAREN 22
#define TOKEN_LEFTBRACE 23
#define TOKEN_RIGHTBRACE 24
#define TOKEN_SEMICOLON 25
#define TOKEN_COMMA 26
#define TOKEN_IDENTIFIER 27
#define TOKEN_INTEGER 28
#define TOKEN_STRING 29
#define TOKEN_END_OF_INPUT 30
typedef token_t =
[i : int | TOKEN_ELSE <= i; i <= TOKEN_END_OF_INPUT]
int i
typedef tokentuple_t = (token_t, String, ullint, ullint)
fn
token_text (tok : token_t) : String =
case+ tok of
| TOKEN_ELSE => "else"
| TOKEN_IF => "if"
| TOKEN_PRINT => "print"
| TOKEN_PUTC => "putc"
| TOKEN_WHILE => "while"
| TOKEN_MULTIPLY => "*"
| TOKEN_DIVIDE => "/"
| TOKEN_MOD => "%"
| TOKEN_ADD => "+"
| TOKEN_SUBTRACT => "-"
| TOKEN_NEGATE => "-"
| TOKEN_LESS => "<"
| TOKEN_LESSEQUAL => "<="
| TOKEN_GREATER => ">"
| TOKEN_GREATEREQUAL => ">="
| TOKEN_EQUAL => "=="
| TOKEN_NOTEQUAL => "!="
| TOKEN_NOT => "!"
| TOKEN_ASSIGN => "="
| TOKEN_AND => "&&"
| TOKEN_OR => "||"
| TOKEN_LEFTPAREN => "("
| TOKEN_RIGHTPAREN => ")"
| TOKEN_LEFTBRACE => "{"
| TOKEN_RIGHTBRACE => "}"
| TOKEN_SEMICOLON => ";"
| TOKEN_COMMA => ","
| TOKEN_IDENTIFIER => "Ident"
| TOKEN_INTEGER => "Integer literal"
| TOKEN_STRING => "String literal"
| TOKEN_END_OF_INPUT => "EOI"
(********************************************************************)
(* A perfect hash for the lexical token names.
This hash was generated by GNU gperf and then translated to
reasonable ATS by hand. Note, though, that one could have embedded
the generated C code directly and used it. *)
#define MIN_WORD_LENGTH 5
#define MAX_WORD_LENGTH 15
#define MIN_HASH_VALUE 5
#define MAX_HASH_VALUE 64
#define HASH_TABLE_SIZE 65
local
extern castfn u : {n : nat | n < 256} int n -<> uint8 n
in
vtypedef asso_values_vt = @[[n : nat | n < 256] uint8 n][256]
var asso_values =
@[[n : nat | n < 256] uint8 n][256]
(u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 10, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 0, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 0, u 65, u 25,
u 5, u 5, u 0, u 15, u 65, u 0, u 65, u 65, u 10, u 65,
u 30, u 0, u 65, u 5, u 10, u 10, u 0, u 15, u 65, u 65,
u 65, u 5, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65, u 65,
u 65, u 65, u 65, u 65, u 65, u 65)
end
fn
get_asso_value {i : nat | i < 256}
(i : uint i) :<>
[n : nat | n < 256] uint n =
let
extern castfn u8ui : {n : nat} uint8 n -<> uint n
extern castfn mk_asso_values :<>
{p : addr} ptr p -<> (asso_values_vt @ p | ptr p)
val asso_values_tup = mk_asso_values (addr@ asso_values)
macdef asso_values = !(asso_values_tup.1)
val retval = asso_values[i]
val _ = $UN.castvwtp0{void} asso_values_tup
in
u8ui retval
end
fn
hash {n : int | MIN_WORD_LENGTH <= n; n <= MAX_WORD_LENGTH}
(str : string n,
len : size_t n) :<>
[key : nat] uint key =
let
extern castfn uc2ui : {n : nat} uchar n -<> uint n
val c1 = uc2ui (c2uc str[4])
val c2 = uc2ui (c2uc str[pred len])
in
sz2u len + get_asso_value c1 + get_asso_value c2
end
typedef wordlist_vt = @[(String, token_t)][HASH_TABLE_SIZE]
var wordlist =
@[(String, token_t)][HASH_TABLE_SIZE]
(("", 0), ("", 0), ("", 0), ("", 0), ("", 0),
("Comma", 26),
("Op_not", 17),
("", 0), ("", 0), ("", 0),
("Keyword_if", 1),
("Op_mod", 7),
("End_of_input", 30),
("Keyword_print", 2),
("Op_divide", 6),
("RightBrace", 24),
("Op_add", 8),
("Keyword_else", 0),
("Keyword_while", 4),
("Op_negate", 10),
("Identifier", 27),
("Op_notequal", 16),
("Op_less", 11),
("Op_equal", 15),
("LeftBrace", 23),
("Op_or", 20),
("Op_subtract", 9),
("Op_lessequal", 12),
("", 0), ("", 0),
("Op_greater", 13),
("Op_multiply", 5 ),
("Integer", 28),
("", 0), ("", 0),
("Op_greaterequal", 14),
("", 0),
("Keyword_putc", 3),
("", 0),
("LeftParen", 21),
("RightParen", 22),
("Op_and", 19),
("", 0), ("", 0), ("", 0), ("", 0), ("", 0), ("", 0), ("", 0),
("Op_assign", 18),
("", 0),
("String", 29),
("", 0), ("", 0), ("", 0), ("", 0), ("", 0), ("", 0), ("", 0),
("", 0), ("", 0), ("", 0), ("", 0), ("", 0),
("Semicolon", 25))
fn
get_wordlist_entry
{n : nat | n <= MAX_HASH_VALUE}
(key : uint n) :<> (String, token_t) =
let
extern castfn mk_wordlist_tup :<>
{p : addr} ptr p -<> (wordlist_vt @ p | ptr p)
val wordlist_tup = mk_wordlist_tup (addr@ wordlist)
macdef wordlist = !(wordlist_tup.1)
val retval = wordlist[key]
val _ = $UN.castvwtp0{void} wordlist_tup
in
retval
end
fn
string2token_t_opt
{n : int}
(str : string n) :<>
Option token_t =
let
val len = string_length str
in
if len < i2sz MIN_WORD_LENGTH then
None ()
else if i2sz MAX_WORD_LENGTH < len then
None ()
else
let
val key = hash (str, len)
in
if i2u MAX_HASH_VALUE < key then
None ()
else
let
val (s, tok) = get_wordlist_entry (key)
in
if str <> s then
None ()
else
Some tok
end
end
end
(********************************************************************)
exception bad_lex_integer of (String)
exception bad_lex_token_name of (String)
exception bad_string_literal of (String)
extern fun {}
skip_something$pred : char -<> bool
fn {}
skip_something {n : nat}
{i : nat | i <= n}
(s : string n,
n : size_t n,
i : size_t i) :<>
[j : nat | i <= j; j <= n]
size_t j =
let
fun
loop {k : nat | i <= k; k <= n} .<n - k>.
(k : size_t k) :<>
[j : nat | i <= j; j <= n]
size_t j =
if k = n then
k
else if ~(skip_something$pred<> s[k]) then
k
else
loop (succ k)
in
loop i
end
fn
skip_space {n : nat}
{i : nat | i <= n}
(s : string n,
n : size_t n,
i : size_t i) :<>
[j : nat | i <= j; j <= n]
size_t j =
let
implement skip_something$pred<> (c) = isspace c
in
skip_something (s, n, i)
end
fn
skip_nonspace {n : nat}
{i : nat | i <= n}
(s : string n,
n : size_t n,
i : size_t i) :<>
[j : nat | i <= j; j <= n]
size_t j =
let
implement skip_something$pred<> (c) = ~isspace c
in
skip_something (s, n, i)
end
fn
skip_nonquote {n : nat}
{i : nat | i <= n}
(s : string n,
n : size_t n,
i : size_t i) :<>
[j : nat | i <= j; j <= n]
size_t j =
let
implement skip_something$pred<> (c) = c <> '"'
in
skip_something (s, n, i)
end
fn
skip_string_literal
{n : nat}
{i : nat | i <= n}
(s : string n,
n : size_t n,
i : size_t i) :<>
[j : nat | i <= j; j <= n]
size_t j =
if i = n then
i
else if s[i] <> '"' then
i
else
let
val j = skip_nonquote (s, n, succ i)
in
if j = n then
i
else
succ j
end
fn
get_substr {n, i, j : nat | i <= j; j <= n}
(s : string n,
i : size_t i,
j : size_t j) :
[m : int | m == j - i] string m =
let
val s = string_make_substring (s, i, j - i)
in
strnptr2string s
end
fn
string2ullint
{n : nat}
(s : string n) : ullint =
let
val n = string_length s
in
if n = i2sz 0 then
$raise bad_lex_integer ("")
else
let
extern castfn u2ull : uint -<> ullint
fun
evaluate {k : nat | k <= n} .<n - k>.
(k : size_t k,
v : ullint) : ullint =
if k = n then
v
else if ~isdigit s[k] then
$raise bad_lex_integer (s)
else
let
val d = char2ui s[k] - char2ui '0'
in
evaluate (succ k, (10ULL * v) + u2ull d)
end
in
evaluate (i2sz 0, 0ULL)
end
end
fn
string2token {n : int}
(str : string n) : token_t =
case+ string2token_t_opt str of
| None () => $raise bad_lex_token_name (str)
| Some tok => tok
fn
read_lex_file (inpf : FILEref) : List0 tokentuple_t =
(* Convert the output of "lex" to a list of tokens. *)
(* This routine could stand to do more validation of the input. *)
let
fun
loop (lst : List0 tokentuple_t) : List0 tokentuple_t =
if fileref_is_eof inpf then
lst
else
let
val s = strptr2string (fileref_get_line_string inpf)
val n = string_length s
prval _ = lemma_g1uint_param n
val i0_line_no = skip_space (s, n, i2sz 0)
in
if i0_line_no = n then
(* Skip any blank lines, including end of file. *)
loop lst
else
let
val i1_line_no = skip_nonspace (s, n, i0_line_no)
val s_line_no = get_substr (s, i0_line_no, i1_line_no)
val line_no = string2ullint s_line_no
val i0_column_no = skip_space (s, n, i1_line_no)
val i1_column_no = skip_nonspace (s, n, i0_column_no)
val s_column_no = get_substr (s, i0_column_no,
i1_column_no)
val column_no = string2ullint s_column_no
val i0_tokname = skip_space (s, n, i1_column_no)
val i1_tokname = skip_nonspace (s, n, i0_tokname)
val tokname = get_substr (s, i0_tokname, i1_tokname)
val tok = string2token tokname
in
case+ tok of
| TOKEN_INTEGER =>
let
val i0 = skip_space (s, n, i1_tokname)
val i1 = skip_nonspace (s, n, i0)
val arg = get_substr (s, i0, i1)
val toktup = (tok, arg, line_no, column_no)
in
loop (toktup :: lst)
end
| TOKEN_IDENTIFIER =>
let
val i0 = skip_space (s, n, i1_tokname)
val i1 = skip_nonspace (s, n, i0)
val arg = get_substr (s, i0, i1)
val toktup = (tok, arg, line_no, column_no)
in
loop (toktup :: lst)
end
| TOKEN_STRING =>
let
val i0 = skip_space (s, n, i1_tokname)
val i1 = skip_string_literal (s, n, i0)
val arg = get_substr (s, i0, i1)
val toktup = (tok, arg, line_no, column_no)
in
loop (toktup :: lst)
end
| _ =>
let
val toktup = (tok, "", line_no, column_no)
in
loop (toktup :: lst)
end
end
end
in
list_vt2t (list_reverse (loop NIL))
end
(********************************************************************)
exception truncated_lexical of ()
exception unexpected_token of (tokentuple_t, token_t)
exception unexpected_primary of (tokentuple_t)
exception unterminated_statement_block of (ullint, ullint)
exception expected_a_statement of (tokentuple_t)
datatype node_t =
| node_t_nil of ()
| node_t_leaf of (String, String)
| node_t_cons of (String, node_t, node_t)
fn
right_assoc (tok : token_t) : bool =
(* None of the currently supported operators is right
associative. *)
false
fn
binary_op (tok : token_t) : bool =
case+ tok of
| TOKEN_ADD => true
| TOKEN_SUBTRACT => true
| TOKEN_MULTIPLY => true
| TOKEN_DIVIDE => true
| TOKEN_MOD => true
| TOKEN_LESS => true
| TOKEN_LESSEQUAL => true
| TOKEN_GREATER => true
| TOKEN_GREATEREQUAL => true
| TOKEN_EQUAL => true
| TOKEN_NOTEQUAL => true
| TOKEN_AND => true
| TOKEN_OR => true
| _ => false
fn
precedence (tok : token_t) : int =
case+ tok of
| TOKEN_MULTIPLY => 13
| TOKEN_DIVIDE => 13
| TOKEN_MOD => 13
| TOKEN_ADD => 12
| TOKEN_SUBTRACT => 12
| TOKEN_NEGATE => 14
| TOKEN_NOT => 14
| TOKEN_LESS => 10
| TOKEN_LESSEQUAL => 10
| TOKEN_GREATER => 10
| TOKEN_GREATEREQUAL => 10
| TOKEN_EQUAL => 9
| TOKEN_NOTEQUAL => 9
| TOKEN_AND => 5
| TOKEN_OR => 4
| _ => ~1
fn
opname (tok : token_t) : String =
case- tok of
| TOKEN_MULTIPLY => "Multiply"
| TOKEN_DIVIDE => "Divide"
| TOKEN_MOD => "Mod"
| TOKEN_ADD => "Add"
| TOKEN_SUBTRACT => "Subtract"
| TOKEN_NEGATE => "Negate"
| TOKEN_NOT => "Not"
| TOKEN_LESS => "Less"
| TOKEN_LESSEQUAL => "LessEqual"
| TOKEN_GREATER => "Greater"
| TOKEN_GREATEREQUAL => "GreaterEqual"
| TOKEN_EQUAL => "Equal"
| TOKEN_NOTEQUAL => "NotEqual"
| TOKEN_AND => "And"
| TOKEN_OR => "Or"
fn
parse (lex : List0 tokentuple_t) : node_t =
let
typedef toktups_t (n : int) = list (tokentuple_t, n)
typedef toktups_t = [n : nat] toktups_t n
fn
expect (expected : token_t,
lex : toktups_t) : toktups_t =
case+ lex of
| NIL => $raise truncated_lexical ()
| toktup :: tail =>
if toktup.0 = expected then
tail
else
$raise unexpected_token (toktup, expected)
fn
peek {n : int} (lex : toktups_t n) : [1 <= n] token_t =
case+ lex of
| NIL => $raise truncated_lexical ()
| (tok, _, _, _) :: _ => tok
fun
stmt (lex : toktups_t) : (node_t, toktups_t) =
case+ lex of
| NIL => $raise truncated_lexical ()
| (TOKEN_IF, _, _, _) :: lex =>
let
val (e, lex) = paren_expr lex
val (s, lex) = stmt lex
in
case+ lex of
| (TOKEN_ELSE, _, _, _) :: lex =>
let
val (t, lex) = stmt lex
in
(node_t_cons ("If", e, node_t_cons ("If", s, t)), lex)
end
| _ =>
let
(* There is no 'else' clause. *)
val t = node_t_nil ()
in
(node_t_cons ("If", e, node_t_cons ("If", s, t)), lex)
end
end
| (TOKEN_PUTC, _, _, _) :: lex =>
let
val (subtree, lex) = paren_expr lex
val subtree = node_t_cons ("Prtc", subtree, node_t_nil ())
val lex = expect (TOKEN_SEMICOLON, lex)
in
(subtree, lex)
end
| (TOKEN_PRINT, _, _, _) :: lex =>
let
val lex = expect (TOKEN_LEFTPAREN, lex)
fun
loop_over_args (subtree : node_t,
lex : toktups_t) : (node_t, toktups_t) =
case+ lex of
| (TOKEN_STRING, arg, _, _) ::
(TOKEN_COMMA, _, _, _) :: lex =>
let
val leaf = node_t_leaf ("String", arg)
val e = node_t_cons ("Prts", leaf, node_t_nil ())
in
loop_over_args
(node_t_cons ("Sequence", subtree, e), lex)
end
| (TOKEN_STRING, arg, _, _) :: lex =>
let
val lex = expect (TOKEN_RIGHTPAREN, lex)
val lex = expect (TOKEN_SEMICOLON, lex)
val leaf = node_t_leaf ("String", arg)
val e = node_t_cons ("Prts", leaf, node_t_nil ())
in
(node_t_cons ("Sequence", subtree, e), lex)
end
| _ :: _ =>
let
val (x, lex) = expr (0, lex)
val e = node_t_cons ("Prti", x, node_t_nil ())
val subtree = node_t_cons ("Sequence", subtree, e)
in
case+ peek lex of
| TOKEN_COMMA =>
let
val lex = expect (TOKEN_COMMA, lex)
in
loop_over_args (subtree, lex)
end
| _ =>
let
val lex = expect (TOKEN_RIGHTPAREN, lex)
val lex = expect (TOKEN_SEMICOLON, lex)
in
(subtree, lex)
end
end
| NIL => $raise truncated_lexical ()
in
loop_over_args (node_t_nil (), lex)
end
| (TOKEN_SEMICOLON, _, _, _) :: lex => (node_t_nil (), lex)
| (TOKEN_IDENTIFIER, arg, _, _) :: lex =>
let
val v = node_t_leaf ("Identifier", arg)
val lex = expect (TOKEN_ASSIGN, lex)
val (subtree, lex) = expr (0, lex)
val t = node_t_cons ("Assign", v, subtree)
val lex = expect (TOKEN_SEMICOLON, lex)
in
(t, lex)
end
| (TOKEN_WHILE, _, _, _) :: lex =>
let
val (e, lex) = paren_expr lex
val (t, lex) = stmt lex
in
(node_t_cons ("While", e, t), lex)
end
| (TOKEN_LEFTBRACE, _, _, _) :: lex =>
let
fun
loop_over_stmts (subtree : node_t,
lex : toktups_t) :
(node_t, toktups_t) =
case+ lex of
| (TOKEN_RIGHTBRACE, _, _, _) :: lex => (subtree, lex)
| (TOKEN_END_OF_INPUT, _, line_no, column_no) :: _ =>
$raise unterminated_statement_block (line_no, column_no)
| _ =>
let
val (e, lex) = stmt lex
in
loop_over_stmts
(node_t_cons ("Sequence", subtree, e), lex)
end
in
loop_over_stmts (node_t_nil (), lex)
end
| (TOKEN_END_OF_INPUT, _, _, _) :: lex => (node_t_nil (), lex)
| toktup :: _ => $raise expected_a_statement (toktup)
and
expr (prec : int,
lex : toktups_t) : (node_t, toktups_t) =
case+ lex of
| (TOKEN_LEFTPAREN, _, _, _) :: _ =>
(* '(' expr ')' *)
let
val (subtree, lex) = paren_expr lex
in
prec_climb (prec, subtree, lex)
end
| (TOKEN_ADD, _, _, _) :: lex =>
(* '+' expr *)
let
val (subtree, lex) = expr (precedence TOKEN_ADD, lex)
in
prec_climb (prec, subtree, lex)
end
| (TOKEN_SUBTRACT, _, _, _) :: lex =>
(* '-' expr *)
let
val (subtree, lex) = expr (precedence TOKEN_NEGATE, lex)
val subtree = node_t_cons ("Negate", subtree, node_t_nil ())
in
prec_climb (prec, subtree, lex)
end
| (TOKEN_NOT, _, _, _) :: lex =>
(* '!' expr *)
let
val (subtree, lex) = expr (precedence TOKEN_NOT, lex)
val subtree = node_t_cons ("Not", subtree, node_t_nil ())
in
prec_climb (prec, subtree, lex)
end
| (TOKEN_IDENTIFIER, arg, _, _) :: lex =>
let
val leaf = node_t_leaf ("Identifier", arg)
in
prec_climb (prec, leaf, lex)
end
| (TOKEN_INTEGER, arg, _, _) :: lex =>
let
val leaf = node_t_leaf ("Integer", arg)
in
prec_climb (prec, leaf, lex)
end
| toktup :: lex =>
$raise unexpected_primary (toktup)
| NIL =>
$raise truncated_lexical ()
and
prec_climb (prec : int,
subtree : node_t,
lex : toktups_t) : (node_t, toktups_t) =
case+ peek lex of
| tokval =>
if ~binary_op tokval then
(subtree, lex)
else if precedence tokval < prec then
(subtree, lex)
else
case+ lex of
| toktup :: lex =>
let
val q =
if right_assoc (toktup.0) then
precedence tokval
else
succ (precedence tokval)
val (e, lex) = expr (q, lex)
val subtree1 =
node_t_cons (opname (toktup.0), subtree, e)
in
prec_climb (prec, subtree1, lex)
end
and
paren_expr (lex : toktups_t) : (node_t, toktups_t) =
(* '(' expr ')' *)
let
val lex = expect (TOKEN_LEFTPAREN, lex)
val (subtree, lex) = expr (0, lex)
val lex = expect (TOKEN_RIGHTPAREN, lex)
in
(subtree, lex)
end
fun
main_loop (subtree : node_t,
lex : toktups_t) : node_t =
case+ peek lex of
| TOKEN_END_OF_INPUT => subtree
| _ =>
let
val (x, lex) = stmt lex
in
main_loop (node_t_cons ("Sequence", subtree, x), lex)
end
in
main_loop (node_t_nil (), lex)
end
fn
print_ast (outf : FILEref,
ast : node_t) : void =
let
fun
traverse (ast : node_t) : void =
case+ ast of
| node_t_nil () => fprintln! (outf, ";")
| node_t_leaf (str, arg) => fprintln! (outf, str, " ", arg)
| node_t_cons (str, left, right) =>
begin
fprintln! (outf, str);
traverse left;
traverse right
end
in
traverse ast
end
(********************************************************************)
fn
main_program (inpf : FILEref,
outf : FILEref) : int =
let
val toklst = read_lex_file inpf
val ast = parse toklst
val () = print_ast (outf, ast)
in
0
end
fn
error_start (line_no : ullint,
column_no : ullint) : void =
print! ("(", line_no, ", ", column_no, ") error: ")
implement
main (argc, argv) =
let
val inpfname =
if 2 <= argc then
$UN.cast{string} argv[1]
else
"-"
val outfname =
if 3 <= argc then
$UN.cast{string} argv[2]
else
"-"
in
try
let
val inpf =
if (inpfname : string) = "-" then
stdin_ref
else
fileref_open_exn (inpfname, file_mode_r)
val outf =
if (outfname : string) = "-" then
stdout_ref
else
fileref_open_exn (outfname, file_mode_w)
in
main_program (inpf, outf)
end
with
| ~ unexpected_primary @(tok, _, line_no, column_no) =>
begin
error_start (line_no, column_no);
println! ("Expecting a primary, found: ", token_text tok);
1
end
| ~ unexpected_token (@(tok, _, line_no, column_no), expected) =>
begin
error_start (line_no, column_no);
println! ("Expecting '", token_text expected,
"', found '", token_text tok, "'");
1
end
| ~ expected_a_statement @(tok, _, line_no, column_no) =>
begin
error_start (line_no, column_no);
println! ("expecting start of statement, found '",
token_text tok, "'");
1
end
| ~ unterminated_statement_block (line_no, column_no) =>
begin
error_start (line_no, column_no);
println! ("unterminated statement block");
1
end
| ~ truncated_lexical () =>
begin
println! ("truncated input token stream");
2
end
| ~ bad_lex_integer (s) =>
begin
println! ("bad integer literal in the token stream: '",
s, "'");
2
end
| ~ bad_string_literal (s) =>
begin
println! ("bad string literal in the token stream: '",
s, "'");
2
end
| ~ bad_lex_token_name (s) =>
begin
println! ("bad token name in the token stream: '",
s, "'");
2
end
end
(********************************************************************)
Output:
$ patscc -O2 -DATS_MEMALLOC_GCBDW -o parse parse-in-ATS.dats -latslib -lgc && ./lex compiler-tests/primes.t | ./parse
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier count
Integer 1
Assign
Identifier n
Integer 1
Assign
Identifier limit
Integer 100
While
Less
Identifier n
Identifier limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier k
Integer 3
Assign
Identifier p
Integer 1
Assign
Identifier n
Add
Identifier n
Integer 2
While
And
LessEqual
Multiply
Identifier k
Identifier k
Identifier n
Identifier p
Sequence
Sequence
;
Assign
Identifier p
NotEqual
Multiply
Divide
Identifier n
Identifier k
Identifier k
Identifier n
Assign
Identifier k
Add
Identifier k
Integer 2
If
Identifier p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier n
;
Prts
String " is prime\n"
;
Assign
Identifier count
Add
Identifier count
Integer 1
;
Sequence
Sequence
Sequence
;
Prts
String "Total primes found: "
;
Prti
Identifier count
;
Prts
String "\n"
;
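The expression parsers in this and the following entries all use the same precedence-climbing loop: parse a primary, then, while the next token is a binary operator whose precedence is at least the current minimum, consume it and parse its right operand at a tighter minimum (one level tighter for left-associative operators). A minimal sketch of that loop in Python, which evaluates instead of building an AST to stay short; the token tuples and the operator subset here are illustrative, not taken from any entry:

```python
import operator

# Binary operators of the tiny language, with the precedences used by the
# entries on this page (higher binds tighter).  Illustrative subset only.
OPS = {
    '*':  (13, operator.mul),
    '/':  (13, operator.floordiv),
    '%':  (13, operator.mod),
    '+':  (12, operator.add),
    '-':  (12, operator.sub),
    '<':  (10, lambda a, b: int(a < b)),
    '==': (9,  lambda a, b: int(a == b)),
    '&&': (5,  lambda a, b: int(bool(a) and bool(b))),
    '||': (4,  lambda a, b: int(bool(a) or bool(b))),
}
RIGHT_ASSOC = set()   # none of the language's operators is right-associative

def expr(toks, pos, min_prec=0):
    """Precedence-climbing parse of toks[pos:], evaluating as it goes.

    Tokens are ('int', value) or ('op', symbol); returns (value, new_pos).
    """
    kind, val = toks[pos]
    assert kind == 'int', 'this sketch handles only integer primaries'
    lhs, pos = val, pos + 1
    while (pos < len(toks) and toks[pos][0] == 'op'
           and OPS[toks[pos][1]][0] >= min_prec):
        op = toks[pos][1]
        prec, fn = OPS[op]
        # a left-associative operator parses its right side one level tighter
        nxt = prec if op in RIGHT_ASSOC else prec + 1
        rhs, pos = expr(toks, pos + 1, nxt)
        lhs = fn(lhs, rhs)
    return lhs, pos

# 2 + 3 * 4 groups as 2 + (3 * 4)
toks = [('int', 2), ('op', '+'), ('int', 3), ('op', '*'), ('int', 4)]
print(expr(toks, 0)[0])
```

The same loop appears below as `prec_climb` in the ATS entry and as the `while` at the end of `expr` in the AWK and C entries; only the AST-node construction replaces the arithmetic.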
AWK
Tested with gawk 4.1.1 and mawk 1.3.4.
function Token_assign(tk, attr, attr_array, n, i) {
n=split(attr, attr_array)
for(i=1; i<=n; i++)
Tokens[tk,i-1] = attr_array[i]
}
#*** show error and exit
function error(msg) {
printf("(%s, %s) %s\n", err_line, err_col, msg)
exit(1)
}
function gettok( line, n, i) {
getline line
if (line == "")
error("empty line")
n=split(line, line_list)
# line col Ident var_name
# 1 2 3 4
err_line = line_list[1]
err_col = line_list[2]
tok_text = line_list[3]
tok = all_syms[tok_text]
for (i=5; i<=n; i++)
line_list[4] = line_list[4] " " line_list[i]
if (tok == "")
error("Unknown token " tok_text)
tok_other = ""
if (tok == "tk_Integer" || tok == "tk_Ident" || tok =="tk_String")
tok_other = line_list[4]
}
function make_node(oper, left, right, value) {
node_type [next_free_node_index] = oper
node_left [next_free_node_index] = left
node_right[next_free_node_index] = right
node_value[next_free_node_index] = value
return next_free_node_index ++
}
function make_leaf(oper, n) {
return make_node(oper, 0, 0, n)
}
function expect(msg, s) {
if (tok == s) {
gettok()
return
}
error(msg ": Expecting '" Tokens[s,TK_NAME] "', found '" Tokens[tok,TK_NAME] "'")
}
function expr(p, x, op, node, q) {
x = 0
if (tok == "tk_Lparen") {
x = paren_expr()
} else if (tok == "tk_Sub" || tok == "tk_Add") {
if (tok == "tk_Sub")
op = "tk_Negate"
else
op = "tk_Add"
gettok()
node = expr(Tokens["tk_Negate",TK_PRECEDENCE]+0)
if (op == "tk_Negate")
x = make_node("nd_Negate", node)
else
x = node
} else if (tok == "tk_Not") {
gettok()
x = make_node("nd_Not", expr(Tokens["tk_Not",TK_PRECEDENCE]+0))
} else if (tok == "tk_Ident") {
x = make_leaf("nd_Ident", tok_other)
gettok()
} else if (tok == "tk_Integer") {
x = make_leaf("nd_Integer", tok_other)
gettok()
} else {
error("Expecting a primary, found: " Tokens[tok,TK_NAME])
}
while (((Tokens[tok,TK_IS_BINARY]+0) > 0) && ((Tokens[tok,TK_PRECEDENCE]+0) >= p)) {
op = tok
gettok()
q = Tokens[op,TK_PRECEDENCE]+0
if (! (Tokens[op,TK_RIGHT_ASSOC]+0 > 0))
q += 1
node = expr(q)
x = make_node(Tokens[op,TK_NODE], x, node)
}
return x
}
function paren_expr( node) {
expect("paren_expr", "tk_Lparen")
node = expr(0)
expect("paren_expr", "tk_Rparen")
return node
}
function stmt( t, e, s, s2, v) {
t = 0
if (tok == "tk_If") {
gettok()
e = paren_expr()
s = stmt()
s2 = 0
if (tok == "tk_Else") {
gettok()
s2 = stmt()
}
t = make_node("nd_If", e, make_node("nd_If", s, s2))
} else if (tok == "tk_Putc") {
gettok()
e = paren_expr()
t = make_node("nd_Prtc", e)
expect("Putc", "tk_Semi")
} else if (tok == "tk_Print") {
gettok()
expect("Print", "tk_Lparen")
while (1) {
if (tok == "tk_String") {
e = make_node("nd_Prts", make_leaf("nd_String", tok_other))
gettok()
} else {
e = make_node("nd_Prti", expr(0))
}
t = make_node("nd_Sequence", t, e)
if (tok != "tk_Comma")
break
gettok()
}
expect("Print", "tk_Rparen")
expect("Print", "tk_Semi")
} else if (tok == "tk_Semi") {
gettok()
} else if (tok == "tk_Ident") {
v = make_leaf("nd_Ident", tok_other)
gettok()
expect("assign", "tk_Assign")
e = expr(0)
t = make_node("nd_Assign", v, e)
expect("assign", "tk_Semi")
} else if (tok == "tk_While") {
gettok()
e = paren_expr()
s = stmt()
t = make_node("nd_While", e, s)
} else if (tok == "tk_Lbrace") {
gettok()
while (tok != "tk_Rbrace" && tok != "tk_EOI")
t = make_node("nd_Sequence", t, stmt())
expect("Lbrace", "tk_Rbrace")
} else if (tok == "tk_EOI") {
} else {
error("Expecting start of statement, found: " Tokens[tok,TK_NAME])
}
return t
}
function parse( t) {
t = 0 # None
gettok()
while (1) {
t = make_node("nd_Sequence", t, stmt())
if (tok == "tk_EOI" || t == 0)
break
}
return t
}
function prt_ast(t) {
if (t == 0) {
print(";")
} else {
printf("%-14s", Display_nodes[node_type[t]])
if ((node_type[t] == "nd_Ident") || (node_type[t] == "nd_Integer"))
printf("%s\n", node_value[t])
else if (node_type[t] == "nd_String") {
printf("%s\n", node_value[t])
} else {
print("")
prt_ast(node_left[t])
prt_ast(node_right[t])
}
}
}
BEGIN {
all_syms["End_of_input" ] = "tk_EOI"
all_syms["Op_multiply" ] = "tk_Mul"
all_syms["Op_divide" ] = "tk_Div"
all_syms["Op_mod" ] = "tk_Mod"
all_syms["Op_add" ] = "tk_Add"
all_syms["Op_subtract" ] = "tk_Sub"
all_syms["Op_negate" ] = "tk_Negate"
all_syms["Op_not" ] = "tk_Not"
all_syms["Op_less" ] = "tk_Lss"
all_syms["Op_lessequal" ] = "tk_Leq"
all_syms["Op_greater" ] = "tk_Gtr"
all_syms["Op_greaterequal" ] = "tk_Geq"
all_syms["Op_equal" ] = "tk_Eq"
all_syms["Op_notequal" ] = "tk_Neq"
all_syms["Op_assign" ] = "tk_Assign"
all_syms["Op_and" ] = "tk_And"
all_syms["Op_or" ] = "tk_Or"
all_syms["Keyword_if" ] = "tk_If"
all_syms["Keyword_else" ] = "tk_Else"
all_syms["Keyword_while" ] = "tk_While"
all_syms["Keyword_print" ] = "tk_Print"
all_syms["Keyword_putc" ] = "tk_Putc"
all_syms["LeftParen" ] = "tk_Lparen"
all_syms["RightParen" ] = "tk_Rparen"
all_syms["LeftBrace" ] = "tk_Lbrace"
all_syms["RightBrace" ] = "tk_Rbrace"
all_syms["Semicolon" ] = "tk_Semi"
all_syms["Comma" ] = "tk_Comma"
all_syms["Identifier" ] = "tk_Ident"
all_syms["Integer" ] = "tk_Integer"
all_syms["String" ] = "tk_String"
Display_nodes["nd_Ident" ] = "Identifier"
Display_nodes["nd_String" ] = "String"
Display_nodes["nd_Integer" ] = "Integer"
Display_nodes["nd_Sequence"] = "Sequence"
Display_nodes["nd_If" ] = "If"
Display_nodes["nd_Prtc" ] = "Prtc"
Display_nodes["nd_Prts" ] = "Prts"
Display_nodes["nd_Prti" ] = "Prti"
Display_nodes["nd_While" ] = "While"
Display_nodes["nd_Assign" ] = "Assign"
Display_nodes["nd_Negate" ] = "Negate"
Display_nodes["nd_Not" ] = "Not"
Display_nodes["nd_Mul" ] = "Multiply"
Display_nodes["nd_Div" ] = "Divide"
Display_nodes["nd_Mod" ] = "Mod"
Display_nodes["nd_Add" ] = "Add"
Display_nodes["nd_Sub" ] = "Subtract"
Display_nodes["nd_Lss" ] = "Less"
Display_nodes["nd_Leq" ] = "LessEqual"
Display_nodes["nd_Gtr" ] = "Greater"
Display_nodes["nd_Geq" ] = "GreaterEqual"
Display_nodes["nd_Eql" ] = "Equal"
Display_nodes["nd_Neq" ] = "NotEqual"
Display_nodes["nd_And" ] = "And"
Display_nodes["nd_Or" ] = "Or"
TK_NAME = 0
TK_RIGHT_ASSOC = 1
TK_IS_BINARY = 2
TK_IS_UNARY = 3
TK_PRECEDENCE = 4
TK_NODE = 5
Token_assign("tk_EOI" , "EOI 0 0 0 -1 -1 ")
Token_assign("tk_Mul" , "* 0 1 0 13 nd_Mul ")
Token_assign("tk_Div" , "/ 0 1 0 13 nd_Div ")
Token_assign("tk_Mod" , "% 0 1 0 13 nd_Mod ")
Token_assign("tk_Add" , "+ 0 1 0 12 nd_Add ")
Token_assign("tk_Sub" , "- 0 1 0 12 nd_Sub ")
Token_assign("tk_Negate" , "- 0 0 1 14 nd_Negate ")
Token_assign("tk_Not" , "! 0 0 1 14 nd_Not ")
Token_assign("tk_Lss" , "< 0 1 0 10 nd_Lss ")
Token_assign("tk_Leq" , "<= 0 1 0 10 nd_Leq ")
Token_assign("tk_Gtr" , "> 0 1 0 10 nd_Gtr ")
Token_assign("tk_Geq" , ">= 0 1 0 10 nd_Geq ")
Token_assign("tk_Eql" , "== 0 1 0 9 nd_Eql ")
Token_assign("tk_Neq" , "!= 0 1 0 9 nd_Neq ")
Token_assign("tk_Assign" , "= 0 0 0 -1 nd_Assign ")
Token_assign("tk_And" , "&& 0 1 0 5 nd_And ")
Token_assign("tk_Or" , "|| 0 1 0 4 nd_Or ")
Token_assign("tk_If" , "if 0 0 0 -1 nd_If ")
Token_assign("tk_Else" , "else 0 0 0 -1 -1 ")
Token_assign("tk_While" , "while 0 0 0 -1 nd_While ")
Token_assign("tk_Print" , "print 0 0 0 -1 -1 ")
Token_assign("tk_Putc" , "putc 0 0 0 -1 -1 ")
Token_assign("tk_Lparen" , "( 0 0 0 -1 -1 ")
Token_assign("tk_Rparen" , ") 0 0 0 -1 -1 ")
Token_assign("tk_Lbrace" , "{ 0 0 0 -1 -1 ")
Token_assign("tk_Rbrace" , "} 0 0 0 -1 -1 ")
Token_assign("tk_Semi" , "; 0 0 0 -1 -1 ")
Token_assign("tk_Comma" , ", 0 0 0 -1 -1 ")
Token_assign("tk_Ident" , "Ident 0 0 0 -1 nd_Ident ")
Token_assign("tk_Integer", "Integer 0 0 0 -1 nd_Integer")
Token_assign("tk_String" , "String 0 0 0 -1 nd_String ")
input_file = "-"
err_line = 0
err_col = 0
tok = ""
tok_text = ""
next_free_node_index = 1
if (ARGC > 1)
input_file = ARGV[1]
t = parse()
prt_ast(t)
}
Output (count):
Sequence
Sequence
;
Assign
Identifier    count
Integer       1
While
Less
Identifier    count
Integer       10
Sequence
Sequence
;
Sequence
Sequence
Sequence
;
Prts
String        "count is: "
;
Prti
Identifier    count
;
Prts
String        "\n"
;
Assign
Identifier    count
Add
Identifier    count
Integer       1
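Every entry starts by re-reading the lexer's line-oriented output: line number, column number, token name, and an optional value, where the value of a String token may itself contain spaces (the AWK `gettok` rejoins fields 4..n for this reason, and the C `gettok` takes everything after the name). A minimal standalone reader for that format, sketched in Python; the function name and the sample lines are illustrative, not from any entry:

```python
def read_lex_line(line):
    """Split one line of lexer output into (line_no, col_no, name, value).

    The token name is a single word; everything after it (if anything) is
    the token's value, which for String tokens may contain spaces.
    """
    parts = line.rstrip('\n').split(None, 3)   # at most 4 fields
    line_no, col_no, name = int(parts[0]), int(parts[1]), parts[2]
    value = parts[3] if len(parts) > 3 else None
    return line_no, col_no, name, value

print(read_lex_line('4 15 String "count is: "'))
```

Splitting with a field limit keeps the embedded spaces of a string literal intact, which a naive whitespace split would destroy.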
C
Tested with gcc 4.8.1 and later; compiles warning-free with -Wall -Wextra.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <stdbool.h>
#include <ctype.h>
#define NELEMS(arr) (sizeof(arr) / sizeof(arr[0]))
typedef enum {
tk_EOI, tk_Mul, tk_Div, tk_Mod, tk_Add, tk_Sub, tk_Negate, tk_Not, tk_Lss, tk_Leq, tk_Gtr,
tk_Geq, tk_Eql, tk_Neq, tk_Assign, tk_And, tk_Or, tk_If, tk_Else, tk_While, tk_Print,
tk_Putc, tk_Lparen, tk_Rparen, tk_Lbrace, tk_Rbrace, tk_Semi, tk_Comma, tk_Ident,
tk_Integer, tk_String
} TokenType;
typedef enum {
nd_Ident, nd_String, nd_Integer, nd_Sequence, nd_If, nd_Prtc, nd_Prts, nd_Prti, nd_While,
nd_Assign, nd_Negate, nd_Not, nd_Mul, nd_Div, nd_Mod, nd_Add, nd_Sub, nd_Lss, nd_Leq,
nd_Gtr, nd_Geq, nd_Eql, nd_Neq, nd_And, nd_Or
} NodeType;
typedef struct {
TokenType tok;
int err_ln;
int err_col;
char *text; /* ident or string literal or integer value */
} tok_s;
typedef struct Tree {
NodeType node_type;
struct Tree *left;
struct Tree *right;
char *value;
} Tree;
// dependency: Ordered by tok, must remain in same order as TokenType enum
struct {
char *text, *enum_text;
TokenType tok;
bool right_associative, is_binary, is_unary;
int precedence;
NodeType node_type;
} atr[] = {
{"EOI", "End_of_input" , tk_EOI, false, false, false, -1, -1 },
{"*", "Op_multiply" , tk_Mul, false, true, false, 13, nd_Mul },
{"/", "Op_divide" , tk_Div, false, true, false, 13, nd_Div },
{"%", "Op_mod" , tk_Mod, false, true, false, 13, nd_Mod },
{"+", "Op_add" , tk_Add, false, true, false, 12, nd_Add },
{"-", "Op_subtract" , tk_Sub, false, true, false, 12, nd_Sub },
{"-", "Op_negate" , tk_Negate, false, false, true, 14, nd_Negate },
{"!", "Op_not" , tk_Not, false, false, true, 14, nd_Not },
{"<", "Op_less" , tk_Lss, false, true, false, 10, nd_Lss },
{"<=", "Op_lessequal" , tk_Leq, false, true, false, 10, nd_Leq },
{">", "Op_greater" , tk_Gtr, false, true, false, 10, nd_Gtr },
{">=", "Op_greaterequal", tk_Geq, false, true, false, 10, nd_Geq },
{"==", "Op_equal" , tk_Eql, false, true, false, 9, nd_Eql },
{"!=", "Op_notequal" , tk_Neq, false, true, false, 9, nd_Neq },
{"=", "Op_assign" , tk_Assign, false, false, false, -1, nd_Assign },
{"&&", "Op_and" , tk_And, false, true, false, 5, nd_And },
{"||", "Op_or" , tk_Or, false, true, false, 4, nd_Or },
{"if", "Keyword_if" , tk_If, false, false, false, -1, nd_If },
{"else", "Keyword_else" , tk_Else, false, false, false, -1, -1 },
{"while", "Keyword_while" , tk_While, false, false, false, -1, nd_While },
{"print", "Keyword_print" , tk_Print, false, false, false, -1, -1 },
{"putc", "Keyword_putc" , tk_Putc, false, false, false, -1, -1 },
{"(", "LeftParen" , tk_Lparen, false, false, false, -1, -1 },
{")", "RightParen" , tk_Rparen, false, false, false, -1, -1 },
{"{", "LeftBrace" , tk_Lbrace, false, false, false, -1, -1 },
{"}", "RightBrace" , tk_Rbrace, false, false, false, -1, -1 },
{";", "Semicolon" , tk_Semi, false, false, false, -1, -1 },
{",", "Comma" , tk_Comma, false, false, false, -1, -1 },
{"Ident", "Identifier" , tk_Ident, false, false, false, -1, nd_Ident },
{"Integer literal", "Integer" , tk_Integer, false, false, false, -1, nd_Integer},
{"String literal", "String" , tk_String, false, false, false, -1, nd_String },
};
char *Display_nodes[] = {"Identifier", "String", "Integer", "Sequence", "If", "Prtc",
"Prts", "Prti", "While", "Assign", "Negate", "Not", "Multiply", "Divide", "Mod",
"Add", "Subtract", "Less", "LessEqual", "Greater", "GreaterEqual", "Equal",
"NotEqual", "And", "Or"};
static tok_s tok;
static FILE *source_fp, *dest_fp;
Tree *paren_expr();
void error(int err_line, int err_col, const char *fmt, ... ) {
va_list ap;
char buf[1000];
va_start(ap, fmt);
vsprintf(buf, fmt, ap);
va_end(ap);
printf("(%d, %d) error: %s\n", err_line, err_col, buf);
exit(1);
}
char *read_line(int *len) {
static char *text = NULL;
static int textmax = 0;
for (*len = 0; ; (*len)++) {
int ch = fgetc(source_fp);
if (ch == EOF || ch == '\n') {
if (*len == 0)
return NULL;
break;
}
if (*len + 1 >= textmax) {
textmax = (textmax == 0 ? 128 : textmax * 2);
text = realloc(text, textmax);
}
text[*len] = ch;
}
text[*len] = '\0';
return text;
}
char *rtrim(char *text, int *len) { // remove trailing spaces
for (; *len > 0 && isspace(text[*len - 1]); --(*len))
;
text[*len] = '\0';
return text;
}
TokenType get_enum(const char *name) { // return internal version of name
for (size_t i = 0; i < NELEMS(atr); i++) {
if (strcmp(atr[i].enum_text, name) == 0)
return atr[i].tok;
}
error(0, 0, "Unknown token %s\n", name);
return 0;
}
tok_s gettok() {
int len;
tok_s tok = {0}; /* zero-initialize so tok.text is NULL when the token carries no value */
char *yytext = read_line(&len);
yytext = rtrim(yytext, &len);
// [ ]*{lineno}[ ]+{colno}[ ]+token[ ]+optional
// get line and column
tok.err_ln = atoi(strtok(yytext, " "));
tok.err_col = atoi(strtok(NULL, " "));
// get the token name
char *name = strtok(NULL, " ");
tok.tok = get_enum(name);
// if there is extra data, get it
char *p = name + strlen(name);
if (p != &yytext[len]) {
for (++p; isspace(*p); ++p)
;
tok.text = strdup(p);
}
return tok;
}
Tree *make_node(NodeType node_type, Tree *left, Tree *right) {
Tree *t = calloc(1, sizeof(Tree));
t->node_type = node_type;
t->left = left;
t->right = right;
return t;
}
Tree *make_leaf(NodeType node_type, char *value) {
Tree *t = calloc(1, sizeof(Tree));
t->node_type = node_type;
t->value = strdup(value);
return t;
}
void expect(const char msg[], TokenType s) {
if (tok.tok == s) {
tok = gettok();
return;
}
error(tok.err_ln, tok.err_col, "%s: Expecting '%s', found '%s'\n", msg, atr[s].text, atr[tok.tok].text);
}
Tree *expr(int p) {
Tree *x = NULL, *node;
TokenType op;
switch (tok.tok) {
case tk_Lparen:
x = paren_expr();
break;
case tk_Sub: case tk_Add:
op = tok.tok;
tok = gettok();
node = expr(atr[tk_Negate].precedence);
x = (op == tk_Sub) ? make_node(nd_Negate, node, NULL) : node;
break;
case tk_Not:
tok = gettok();
x = make_node(nd_Not, expr(atr[tk_Not].precedence), NULL);
break;
case tk_Ident:
x = make_leaf(nd_Ident, tok.text);
tok = gettok();
break;
case tk_Integer:
x = make_leaf(nd_Integer, tok.text);
tok = gettok();
break;
default:
error(tok.err_ln, tok.err_col, "Expecting a primary, found: %s\n", atr[tok.tok].text);
}
while (atr[tok.tok].is_binary && atr[tok.tok].precedence >= p) {
TokenType op = tok.tok;
tok = gettok();
int q = atr[op].precedence;
if (!atr[op].right_associative)
q++;
node = expr(q);
x = make_node(atr[op].node_type, x, node);
}
return x;
}
Tree *paren_expr() {
expect("paren_expr", tk_Lparen);
Tree *t = expr(0);
expect("paren_expr", tk_Rparen);
return t;
}
Tree *stmt() {
Tree *t = NULL, *v, *e, *s, *s2;
switch (tok.tok) {
case tk_If:
tok = gettok();
e = paren_expr();
s = stmt();
s2 = NULL;
if (tok.tok == tk_Else) {
tok = gettok();
s2 = stmt();
}
t = make_node(nd_If, e, make_node(nd_If, s, s2));
break;
case tk_Putc:
tok = gettok();
e = paren_expr();
t = make_node(nd_Prtc, e, NULL);
expect("Putc", tk_Semi);
break;
case tk_Print: /* print '(' expr {',' expr} ')' */
tok = gettok();
for (expect("Print", tk_Lparen); ; expect("Print", tk_Comma)) {
if (tok.tok == tk_String) {
e = make_node(nd_Prts, make_leaf(nd_String, tok.text), NULL);
tok = gettok();
} else
e = make_node(nd_Prti, expr(0), NULL);
t = make_node(nd_Sequence, t, e);
if (tok.tok != tk_Comma)
break;
}
expect("Print", tk_Rparen);
expect("Print", tk_Semi);
break;
case tk_Semi:
tok = gettok();
break;
case tk_Ident:
v = make_leaf(nd_Ident, tok.text);
tok = gettok();
expect("assign", tk_Assign);
e = expr(0);
t = make_node(nd_Assign, v, e);
expect("assign", tk_Semi);
break;
case tk_While:
tok = gettok();
e = paren_expr();
s = stmt();
t = make_node(nd_While, e, s);
break;
case tk_Lbrace: /* {stmt} */
for (expect("Lbrace", tk_Lbrace); tok.tok != tk_Rbrace && tok.tok != tk_EOI;)
t = make_node(nd_Sequence, t, stmt());
expect("Lbrace", tk_Rbrace);
break;
case tk_EOI:
break;
default: error(tok.err_ln, tok.err_col, "expecting start of statement, found '%s'\n", atr[tok.tok].text);
}
return t;
}
Tree *parse() {
Tree *t = NULL;
tok = gettok();
do {
t = make_node(nd_Sequence, t, stmt());
} while (t != NULL && tok.tok != tk_EOI);
return t;
}
void prt_ast(Tree *t) {
if (t == NULL)
printf(";\n");
else {
printf("%-14s ", Display_nodes[t->node_type]);
if (t->node_type == nd_Ident || t->node_type == nd_Integer || t->node_type == nd_String) {
printf("%s\n", t->value);
} else {
printf("\n");
prt_ast(t->left);
prt_ast(t->right);
}
}
}
void init_io(FILE **fp, FILE *std, const char mode[], const char fn[]) {
if (fn[0] == '\0')
*fp = std;
else if ((*fp = fopen(fn, mode)) == NULL)
error(0, 0, "Can't open %s\n", fn);
}
int main(int argc, char *argv[]) {
init_io(&source_fp, stdin, "r", argc > 1 ? argv[1] : "");
init_io(&dest_fp, stdout, "wb", argc > 2 ? argv[2] : "");
prt_ast(parse());
}
Output — prime numbers AST:
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier    count
Integer       1
Assign
Identifier    n
Integer       1
Assign
Identifier    limit
Integer       100
While
Less
Identifier    n
Identifier    limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier    k
Integer       3
Assign
Identifier    p
Integer       1
Assign
Identifier    n
Add
Identifier    n
Integer       2
While
And
LessEqual
Multiply
Identifier    k
Identifier    k
Identifier    n
Identifier    p
Sequence
Sequence
;
Assign
Identifier    p
NotEqual
Multiply
Divide
Identifier    n
Identifier    k
Identifier    k
Identifier    n
Assign
Identifier    k
Add
Identifier    k
Integer       2
If
Identifier    p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier    n
;
Prts
String        " is prime\n"
;
Assign
Identifier    count
Add
Identifier    count
Integer       1
;
Sequence
Sequence
Sequence
;
Prts
String        "Total primes found: "
;
Prti
Identifier    count
;
Prts
String        "\n"
;
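The C `expr()` above is a precedence-climbing parser: each binary operator has a precedence from the `atr` table, and because every operator in this language is left-associative, the recursive call for the right-hand side uses precedence + 1. A minimal sketch of that loop (in Python, over a hypothetical flat token list with integers as operands and operator strings between them — not the task's full grammar) might look like:

```python
# Precedence-climbing sketch; PREC mirrors the task's Add/Subtract vs
# Multiply/Divide levels. All operators here are left-associative.
PREC = {'+': 12, '-': 12, '*': 13, '/': 13}

def parse_expr(tokens, pos, min_prec=0):
    """Return (ast, next_pos); ast is a nested tuple like ('+', l, r)."""
    lhs, pos = tokens[pos], pos + 1          # primary: an integer literal
    while pos < len(tokens) and tokens[pos] in PREC and PREC[tokens[pos]] >= min_prec:
        op = tokens[pos]
        pos += 1
        # left-associative: the right-hand side may only absorb operators
        # of strictly higher precedence
        rhs, pos = parse_expr(tokens, pos, PREC[op] + 1)
        lhs = (op, lhs, rhs)
    return lhs, pos

ast, _ = parse_expr([2, '+', 3, '*', 4, '-', 5], 0)
# ast == ('-', ('+', 2, ('*', 3, 4)), 5)
```

The `PREC[op] + 1` bump is exactly what the C code does with `if (!atr[op].right_associative) q++;`.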
COBOL
Code by Steve Williams. Tested with GnuCOBOL 2.2.
>>SOURCE FORMAT IS FREE
identification division.
*> this code is dedicated to the public domain
*> (GnuCOBOL) 2.3-dev.0
*> for extra credit, generate this program directly from the EBNF
program-id. parser.
environment division.
configuration section.
repository. function all intrinsic.
input-output section.
file-control.
select input-file assign using input-name
status is input-status
organization is line sequential.
data division.
file section.
fd input-file global.
01 input-record global.
03 input-line pic zzzz9.
03 input-column pic zzzzzz9.
03 filler pic x(3).
03 input-token pic x(16).
03 input-value pic x(48).
working-storage section.
01 program-name pic x(32) value spaces global.
01 input-name pic x(32) value spaces global.
01 input-status pic xx global.
01 line-no pic 999 value 0.
01 col-no pic 99 value 0.
01 error-record global.
03 error-line-no pic zzzz9.
03 error-col-no pic zzzzzz9.
03 filler pic x value space.
03 error-message pic x(64) value spaces.
01 token global.
03 token-type pic x(16).
03 token-line pic 999.
03 token-column pic 99.
03 token-value pic x(48).
01 parse-stack global.
03 p pic 999 value 0.
03 p-lim pic 999 value 200.
03 p-zero pic 999 value 0.
03 parse-entry occurs 200.
05 parse-name pic x(24).
05 parse-token pic x(16).
05 parse-left pic 999.
05 parse-right pic 999.
05 parse-work pic 999.
05 parse-work1 pic 999.
01 abstract-syntax-tree global.
03 t pic 999 value 0.
03 t1 pic 999.
03 t-lim pic 999 value 998.
03 filler occurs 998.
05 leaf.
07 leaf-type pic x(14).
07 leaf-value pic x(48).
05 node redefines leaf.
07 node-type pic x(14).
07 node-left pic 999.
07 node-right pic 999.
01 indent pic x(200) value all '| ' global.
procedure division chaining program-name.
start-parser.
if program-name <> spaces
string program-name delimited by space '.lex' into input-name
open input input-file
if input-status <> '00'
string 'in parser ' trim(input-name) ' open status ' input-status
into error-message
call 'reporterror'
end-if
end-if
call 'gettoken'
call 'stmt_list'
if input-name <> spaces
close input-file
end-if
call 'printast' using t
>>d perform dump-ast
stop run
.
dump-ast.
display '==========' upon syserr
display 'ast:' upon syserr
display 't=' t upon syserr
perform varying t1 from 1 by 1 until t1 > t
if leaf-type(t1) = 'Identifier' or 'Integer' or 'String'
display t1 space trim(leaf-type(t1)) space trim(leaf-value(t1)) upon syserr
else
display t1 space node-left(t1) space node-right(t1) space trim(node-type(t1))
upon syserr
end-if
end-perform
.
identification division.
program-id. stmt_list common recursive.
data division.
procedure division.
start-stmt_list.
call 'push' using module-id
move p-zero to parse-left(p)
perform forever
call 'stmt'
move return-code to parse-right(p)
call 'makenode' using 'Sequence' parse-left(p) parse-right(p)
move return-code to parse-left(p)
if parse-right(p) = 0
or token-type = 'End_of_input'
exit perform
end-if
end-perform
call 'pop'
.
end program stmt_list.
identification division.
program-id. stmt common recursive.
procedure division.
start-stmt.
call 'push' using module-id
move p-zero to parse-left(p)
evaluate token-type
when 'Semicolon'
call 'gettoken'
when 'Identifier'
*>Identifier '=' expr ';'
call 'makeleaf' using 'Identifier' token-value
move return-code to parse-left(p)
call 'gettoken'
call 'expect' using 'Op_assign'
call 'expr'
move return-code to parse-right(p)
call 'expect' using 'Semicolon'
call 'makenode' using 'Assign' parse-left(p) parse-right(p)
move return-code to parse-left(p)
when 'Keyword_while'
*>'while' paren_expr stmt
call 'gettoken'
call 'paren_expr'
move return-code to parse-work(p)
call 'stmt'
move return-code to parse-right(p)
call 'makenode' using 'While' parse-work(p) parse-right(p)
move return-code to parse-left(p)
when 'Keyword_if'
*>'if' paren_expr stmt ['else' stmt]
call 'gettoken'
call 'paren_expr'
move return-code to parse-left(p)
call 'stmt'
move return-code to parse-work(p)
move p-zero to parse-work1(p)
if token-type = 'Keyword_else'
call 'gettoken'
call 'stmt'
move return-code to parse-work1(p)
end-if
call 'makenode' using 'If' parse-work(p) parse-work1(p)
move return-code to parse-right(p)
call 'makenode' using 'If' parse-left(p) parse-right(p)
move return-code to parse-left(p)
when 'Keyword_print'
*>'print' '(' prt_list ')' ';'
call 'gettoken'
call 'expect' using 'LeftParen'
call 'prt_list'
move return-code to parse-left(p)
call 'expect' using 'RightParen'
call 'expect' using 'Semicolon'
when 'Keyword_putc'
*>'putc' paren_expr ';'
call 'gettoken'
call 'paren_expr'
move return-code to parse-left(p)
call 'makenode' using 'Prtc' parse-left(p) p-zero
move return-code to parse-left(p)
call 'expect' using 'Semicolon'
when 'LeftBrace'
*>'{' stmt_list '}'
call 'gettoken'
move p-zero to parse-left(p)
perform until token-type = 'RightBrace' or 'End_of_input'
call 'stmt'
move return-code to parse-right(p)
call 'makenode' using 'Sequence' parse-left(p) parse-right(p)
move return-code to parse-left(p)
end-perform
if token-type <> 'End_of_input'
call 'gettoken'
end-if
when other
move 0 to parse-left(p)
end-evaluate
move parse-left(p) to return-code
call 'pop'
.
end program stmt.
identification division.
program-id. paren_expr common recursive.
procedure division.
start-paren_expr.
*>'(' expr ')' ;
call 'push' using module-id
call 'expect' using 'LeftParen'
call 'expr'
call 'expect' using 'RightParen'
call 'pop'
.
end program paren_expr.
identification division.
program-id. prt_list common.
procedure division.
start-prt_list.
*>(string | expr) {',' (String | expr)} ;
call 'push' using module-id
move p-zero to parse-work(p)
perform prt_entry
perform until token-type <> 'Comma'
call 'gettoken'
perform prt_entry
end-perform
call 'pop'
exit program
.
prt_entry.
if token-type = 'String'
call 'makeleaf' using token-type token-value
move return-code to parse-left(p)
call 'makenode' using 'Prts' parse-left(p) p-zero
call 'gettoken'
else
call 'expr'
move return-code to parse-left(p)
call 'makenode' using 'Prti' parse-left(p) p-zero
end-if
move return-code to parse-right(p)
call 'makenode' using 'Sequence' parse-work(p) parse-right(p)
move return-code to parse-work(p)
.
end program prt_list.
identification division.
program-id. expr common recursive.
procedure division.
start-expr.
*>and_expr {'||' and_expr} ;
call 'push' using module-id
call 'and_expr'
move return-code to parse-left(p)
perform forever
if token-type <> 'Op_or'
exit perform
end-if
call 'gettoken'
call 'and_expr'
move return-code to parse-right(p)
call 'makenode' using 'Or' parse-left(p) parse-right(p)
move return-code to parse-left(p)
end-perform
move parse-left(p) to return-code
call 'pop'
.
end program expr.
identification division.
program-id. and_expr common recursive.
procedure division.
start-and_expr.
*>equality_expr {'&&' equality_expr} ;
call 'push' using module-id
call 'equality_expr'
move return-code to parse-left(p)
perform forever
if token-type <> 'Op_and'
exit perform
end-if
call 'gettoken'
call 'equality_expr'
move return-code to parse-right(p)
call 'makenode' using 'And' parse-left(p) parse-right(p)
move return-code to parse-left(p)
end-perform
call 'pop'
.
end program and_expr.
identification division.
program-id. equality_expr common recursive.
procedure division.
start-equality_expr.
*>relational_expr [('==' | '!=') relational_expr] ;
call 'push' using module-id
call 'relational_expr'
move return-code to parse-left(p)
evaluate token-type
when 'Op_equal'
move 'Equal' to parse-token(p)
when 'Op_notequal'
move 'NotEqual' to parse-token(p)
end-evaluate
if parse-token(p) <> spaces
call 'gettoken'
call 'relational_expr'
move return-code to parse-right(p)
call 'makenode' using parse-token(p) parse-left(p) parse-right(p)
move return-code to parse-left(p)
end-if
call 'pop'
.
end program equality_expr.
identification division.
program-id. relational_expr common recursive.
procedure division.
start-relational_expr.
*>addition_expr [('<' | '<=' | '>' | '>=') addition_expr] ;
call 'push' using module-id
call 'addition_expr'
move return-code to parse-left(p)
evaluate token-type
when 'Op_less'
move 'Less' to parse-token(p)
when 'Op_lessequal'
move 'LessEqual' to parse-token(p)
when 'Op_greater'
move 'Greater' to parse-token(p)
when 'Op_greaterequal'
move 'GreaterEqual' to parse-token(p)
end-evaluate
if parse-token(p) <> spaces
call 'gettoken'
call 'addition_expr'
move return-code to parse-right(p)
call 'makenode' using parse-token(p) parse-left(p) parse-right(p)
move return-code to parse-left(p)
end-if
call 'pop'
.
end program relational_expr.
identification division.
program-id. addition_expr common recursive.
procedure division.
start-addition_expr.
*>multiplication_expr {('+' | '-') multiplication_expr} ;
call 'push' using module-id
call 'multiplication_expr'
move return-code to parse-left(p)
perform forever
evaluate token-type
when 'Op_add'
move 'Add' to parse-token(p)
when 'Op_subtract'
move 'Subtract' to parse-token(p)
when other
exit perform
end-evaluate
call 'gettoken'
call 'multiplication_expr'
move return-code to parse-right(p)
call 'makenode' using parse-token(p) parse-left(p) parse-right(p)
move return-code to parse-left(p)
end-perform
call 'pop'
.
end program addition_expr.
identification division.
program-id. multiplication_expr common recursive.
procedure division.
start-multiplication_expr.
*>primary {('*' | '/' | '%') primary } ;
call 'push' using module-id
call 'primary'
move return-code to parse-left(p)
perform forever
evaluate token-type
when 'Op_multiply'
move 'Multiply' to parse-token(p)
when 'Op_divide'
move 'Divide' to parse-token(p)
when 'Op_mod'
move 'Mod' to parse-token(p)
when other
exit perform
end-evaluate
call 'gettoken'
call 'primary'
move return-code to parse-right(p)
call 'makenode' using parse-token(p) parse-left(p) parse-right(p)
move return-code to parse-left(p)
end-perform
call 'pop'
.
end program multiplication_expr.
identification division.
program-id. primary common recursive.
procedure division.
start-primary.
*> Identifier
*>| Integer
*>| 'LeftParen' expr 'RightParen'
*>| ('+' | '-' | '!') primary
*>;
call 'push' using module-id
evaluate token-type
when 'Identifier'
call 'makeleaf' using 'Identifier' token-value
call 'gettoken'
when 'Integer'
call 'makeleaf' using 'Integer' token-value
call 'gettoken'
when 'LeftParen'
call 'gettoken'
call 'expr'
call 'expect' using 'RightParen'
move t to return-code
when 'Op_add'
call 'gettoken'
call 'primary'
when 'Op_subtract'
call 'gettoken'
call 'primary'
move return-code to parse-left(p)
call 'makenode' using 'Negate' parse-left(p) p-zero
when 'Op_not'
call 'gettoken'
call 'primary'
move return-code to parse-left(p)
call 'makenode' using 'Not' parse-left(p) p-zero
when other
move 0 to return-code
end-evaluate
call 'pop'
.
end program primary.
identification division.
program-id. reporterror common.
procedure division.
start-reporterror.
report-error.
move token-line to error-line-no
move token-column to error-col-no
display error-record upon syserr
stop run with error status -1
.
end program reporterror.
identification division.
program-id. gettoken common.
procedure division.
start-gettoken.
if program-name = spaces
move '00' to input-status
accept input-record on exception move '10' to input-status end-accept
else
read input-file
end-if
evaluate input-status
when '00'
move input-token to token-type
move input-value to token-value
move numval(input-line) to token-line
move numval(input-column) to token-column
>>d display indent(1:min(4 * p,length(indent))) 'new token: ' token-type upon syserr
when '10'
string 'in parser ' trim(input-name) ' unexpected end of input'
into error-message
call 'reporterror'
when other
string 'in parser ' trim(input-name) ' unexpected input-status ' input-status
into error-message
call 'reporterror'
end-evaluate
.
end program gettoken.
identification division.
program-id. expect common.
data division.
linkage section.
01 what any length.
procedure division using what.
start-expect.
if token-type <> what
string 'in parser expected ' what ' found ' token-type into error-message
call 'reporterror'
end-if
>>d display indent(1:min(4 * p,length(indent))) 'match: ' token-type upon syserr
call 'gettoken'
.
end program expect.
identification division.
program-id. push common.
data division.
linkage section.
01 what any length.
procedure division using what.
start-push.
>>d display indent(1:min(4 * p,length(indent))) 'push ' what upon syserr
if p >= p-lim
move 'in parser stack overflow' to error-message
call 'reporterror'
end-if
add 1 to p
initialize parse-entry(p)
move what to parse-name(p)
.
end program push.
identification division.
program-id. pop common.
procedure division.
start-pop.
if p < 1
move 'in parser stack underflow' to error-message
call 'reporterror'
end-if
>>d display indent(1:4 * p - 4) 'pop ' parse-name(p) upon syserr
subtract 1 from p
.
end program pop.
identification division.
program-id. makenode common.
data division.
linkage section.
01 parm-type any length.
01 parm-left pic 999.
01 parm-right pic 999.
procedure division using parm-type parm-left parm-right.
start-makenode.
if t >= t-lim
string 'in parser makenode tree index t exceeds ' t-lim into error-message
call 'reporterror'
end-if
add 1 to t
move parm-type to node-type(t)
move parm-left to node-left(t)
move parm-right to node-right(t)
move t to return-code
.
end program makenode.
identification division.
program-id. makeleaf common.
data division.
linkage section.
01 parm-type any length.
01 parm-value pic x(48).
procedure division using parm-type parm-value.
start-makeleaf.
if t >= t-lim
string 'in parser makeleaf tree index t exceeds ' t-lim into error-message
call 'reporterror'
end-if
add 1 to t
move parm-type to leaf-type(t)
move parm-value to leaf-value(t)
move t to return-code
.
end program makeleaf.
identification division.
program-id. printast recursive.
data division.
linkage section.
01 n pic 999.
procedure division using n.
start-printast.
if n = 0
display ';'
exit program
end-if
evaluate leaf-type(n)
when 'Identifier'
when 'Integer'
when 'String'
display leaf-type(n) trim(leaf-value(n))
when other
display node-type(n)
call 'printast' using node-left(n)
call 'printast' using node-right(n)
end-evaluate
.
end program printast.
end program parser.
Output — Primes:
prompt$ ./lexer <testcases/Primes | ./parser
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier    count
Integer       1
Assign
Identifier    n
Integer       1
Assign
Identifier    limit
Integer       100
While
Less
Identifier    n
Identifier    limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier    k
Integer       3
Assign
Identifier    p
Integer       1
Assign
Identifier    n
Add
Identifier    n
Integer       2
While
And
LessEqual
Multiply
Identifier    k
Identifier    k
Identifier    n
Identifier    p
Sequence
Sequence
;
Assign
Identifier    p
NotEqual
Multiply
Divide
Identifier    n
Identifier    k
Identifier    k
Identifier    n
Assign
Identifier    k
Add
Identifier    k
Integer       2
If
Identifier    p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier    n
;
Prts
String        " is prime\n"
;
Assign
Identifier    count
Add
Identifier    count
Integer       1
;
Sequence
Sequence
Sequence
;
Prts
String        "Total primes found: "
;
Prti
Identifier    count
;
Prts
String        "\n"
;
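Both solutions above emit the AST in the task's flattened pre-order form: one node per line, the leaf types (Identifier, Integer, String) followed by their value, and a lone `;` for an empty subtree. A small sketch of that walk (in Python, over hypothetical nested tuples — the real solutions also pad the type name into a fixed-width column) could be:

```python
# Hypothetical node shape: None for an empty subtree, (type, value) for
# leaves, (type, left, right) for interior nodes.
LEAVES = {'Identifier', 'Integer', 'String'}

def flatten(node, out):
    if node is None:
        out.append(';')                     # empty subtree marker
    elif node[0] in LEAVES:
        out.append(f'{node[0]} {node[1]}')  # leaf: type and value together
    else:                                   # pre-order: node, left, right
        out.append(node[0])
        flatten(node[1], out)
        flatten(node[2], out)

lines = []
flatten(('Assign', ('Identifier', 'count'), ('Integer', '1')), lines)
# lines == ['Assign', 'Identifier count', 'Integer 1']
```

Because the walk is strict pre-order and every interior node has exactly two children, the flat listing can be turned back into a tree by the code generator with a single recursive read.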
Common Lisp
#!/bin/sh
#|-*- mode:lisp -*-|#
#|
exec ros -Q -- $0 "$@"
|#
(progn ;;init forms
(ros:ensure-asdf)
#+quicklisp(ql:quickload '() :silent t))
(defpackage :ros.script.parse.3859374047
(:use :cl))
(in-package :ros.script.parse.3859374047)
;;;
;;; The Rosetta Code Tiny-Language Parser, in Common Lisp.
;;;
(require "cl-ppcre")
(require "trivia")
(defstruct tokstruc line-no column-no tok tokval)
(defconstant re-blank-line
(ppcre:create-scanner "^\\s*$"))
(defconstant re-token-1
(ppcre:create-scanner
"^\\s*(\\d+)\\s+(\\d+)\\s+(\\S+)\\s*$"))
(defconstant re-token-2
(ppcre:create-scanner
"^\\s*(\\d+)\\s+(\\d+)\\s+(\\S+)\\s+(\\S(.*\\S)?)\\s*$"))
(defun string-to-tok (s)
(trivia:match s
("Keyword_else" 'TOK-ELSE)
("Keyword_if" 'TOK-IF)
("Keyword_print" 'TOK-PRINT)
("Keyword_putc" 'TOK-PUTC)
("Keyword_while" 'TOK-WHILE)
("Op_multiply" 'TOK-MULTIPLY)
("Op_divide" 'TOK-DIVIDE)
("Op_mod" 'TOK-MOD)
("Op_add" 'TOK-ADD)
("Op_subtract" 'TOK-SUBTRACT)
("Op_negate" 'TOK-NEGATE)
("Op_less" 'TOK-LESS)
("Op_lessequal" 'TOK-LESSEQUAL)
("Op_greater" 'TOK-GREATER)
("Op_greaterequal" 'TOK-GREATEREQUAL)
("Op_equal" 'TOK-EQUAL)
("Op_notequal" 'TOK-NOTEQUAL)
("Op_not" 'TOK-NOT)
("Op_assign" 'TOK-ASSIGN)
("Op_and" 'TOK-AND)
("Op_or" 'TOK-OR)
("LeftParen" 'TOK-LEFTPAREN)
("RightParen" 'TOK-RIGHTPAREN)
("LeftBrace" 'TOK-LEFTBRACE)
("RightBrace" 'TOK-RIGHTBRACE)
("Semicolon" 'TOK-SEMICOLON)
("Comma" 'TOK-COMMA)
("Identifier" 'TOK-IDENTIFIER)
("Integer" 'TOK-INTEGER)
("String" 'TOK-STRING)
("End_of_input" 'TOK-END-OF-INPUT)
(_ (warn "unparseable token line")
(uiop:quit 1))))
(defun precedence (tok)
(case tok
(TOK-MULTIPLY 13)
(TOK-DIVIDE 13)
(TOK-MOD 13)
(TOK-ADD 12)
(TOK-SUBTRACT 12)
(TOK-NEGATE 14)
(TOK-NOT 14)
(TOK-LESS 10)
(TOK-LESSEQUAL 10)
(TOK-GREATER 10)
(TOK-GREATEREQUAL 10)
(TOK-EQUAL 9)
(TOK-NOTEQUAL 9)
(TOK-AND 5)
(TOK-OR 4)
(otherwise -1)))
(defun binary-p (tok)
(case tok
(TOK-ADD t)
(TOK-SUBTRACT t)
(TOK-MULTIPLY t)
(TOK-DIVIDE t)
(TOK-MOD t)
(TOK-LESS t)
(TOK-LESSEQUAL t)
(TOK-GREATER t)
(TOK-GREATEREQUAL t)
(TOK-EQUAL t)
(TOK-NOTEQUAL t)
(TOK-AND t)
(TOK-OR t)
(otherwise nil)))
(defun right-associative-p (tok)
(declare (ignorable tok))
nil) ; None of the current operators is right associative.
(defun tok-text (tok)
(ecase tok
(TOK-ELSE "else")
(TOK-IF "if")
(TOK-PRINT "print")
(TOK-PUTC "putc")
(TOK-WHILE "while")
(TOK-MULTIPLY "*")
(TOK-DIVIDE "/")
(TOK-MOD "%")
(TOK-ADD "+")
(TOK-SUBTRACT "-")
(TOK-NEGATE "-")
(TOK-LESS "<")
(TOK-LESSEQUAL "<=")
(TOK-GREATER ">")
(TOK-GREATEREQUAL ">=")
(TOK-EQUAL "==")
(TOK-NOTEQUAL "!=")
(TOK-NOT "!")
(TOK-ASSIGN "=")
(TOK-AND "&&")
(TOK-OR "||")
(TOK-LEFTPAREN "(")
(TOK-RIGHTPAREN ")")
(TOK-LEFTBRACE "{")
(TOK-RIGHTBRACE "}")
(TOK-SEMICOLON ";")
(TOK-COMMA ",")
(TOK-IDENTIFIER "Ident")
(TOK-INTEGER "Integer literal")
(TOK-STRING "String literal")
(TOK-END-OF-INPUT "EOI")))
(defun operator (tok)
(ecase tok
(TOK-MULTIPLY "Multiply")
(TOK-DIVIDE "Divide")
(TOK-MOD "Mod")
(TOK-ADD "Add")
(TOK-SUBTRACT "Subtract")
(TOK-NEGATE "Negate")
(TOK-NOT "Not")
(TOK-LESS "Less")
(TOK-LESSEQUAL "LessEqual")
(TOK-GREATER "Greater")
(TOK-GREATEREQUAL "GreaterEqual")
(TOK-EQUAL "Equal")
(TOK-NOTEQUAL "NotEqual")
(TOK-AND "And")
(TOK-OR "Or")))
(defun join (&rest args)
(apply #'concatenate 'string args))
(defun nxt (gettok)
(funcall gettok :nxt))
(defun curr (gettok)
(funcall gettok :curr))
(defun err (token msg)
(format t "(~A, ~A) error: ~A~%"
(tokstruc-line-no token)
(tokstruc-column-no token)
msg)
(uiop:quit 1))
(defun prt-ast (outf ast)
;;
;; For fun, let us do prt-ast *non*-recursively, with a stack and a
;; loop.
;;
(let ((stack `(,ast)))
(loop while stack
do (let ((x (car stack)))
(setf stack (cdr stack))
(cond ((not x) (format outf ";~%"))
((or (string= (car x) "Identifier")
(string= (car x) "Integer")
(string= (car x) "String"))
(format outf "~A ~A~%" (car x) (cadr x)))
(t (format outf "~A~%" (car x))
(setf stack (cons (caddr x) stack))
(setf stack (cons (cadr x) stack))))))))
(defun accept (gettok tok)
(if (eq (tokstruc-tok (curr gettok)) tok)
(nxt gettok)
nil))
(defun expect (gettok msg tok)
(let ((curr-tok (tokstruc-tok (curr gettok))))
(if (eq curr-tok tok)
(nxt gettok)
(err (curr gettok)
(join msg ": Expecting '"
(tok-text tok) "', found '"
(tok-text curr-tok) "'")))))
(defun parse (gettok)
(defun paren-expr (gettok)
(expect gettok "paren_expr" 'TOK-LEFTPAREN)
(let ((x (expr gettok 0)))
(expect gettok "paren_expr" 'TOK-RIGHTPAREN)
x))
(defun expr (gettok p)
(let* ((tok (curr gettok))
(x (case (tokstruc-tok tok)
(TOK-LEFTPAREN (paren-expr gettok))
(TOK-SUBTRACT
(nxt gettok)
(let ((y (expr gettok (precedence 'TOK-NEGATE))))
`("Negate" ,y ())))
(TOK-ADD
(nxt gettok)
(expr gettok (precedence 'TOK-NEGATE)))
(TOK-NOT
(nxt gettok)
(let ((y (expr gettok (precedence 'TOK-NOT))))
`("Not" ,y ())))
(TOK-IDENTIFIER
(let ((y `("Identifier" ,(tokstruc-tokval tok))))
(nxt gettok)
y))
(TOK-INTEGER
(let ((y `("Integer" ,(tokstruc-tokval tok))))
(nxt gettok)
y))
(otherwise
(err tok (join "Expecting a primary, found: "
(tok-text (tokstruc-tok tok))))))))
;;
;; Precedence climbing for binary operators.
;;
(loop for tok = (curr gettok)
for toktok = (tokstruc-tok tok)
while (and (binary-p toktok) (<= p (precedence toktok)))
do (progn (nxt gettok)
(let ((q (if (right-associative-p toktok)
(precedence toktok)
(1+ (precedence toktok)))))
(setf x `(,(operator toktok) ,x
,(expr gettok q))))))
x))
(defun stmt (gettok)
(cond ((accept gettok 'TOK-IF)
(let* ((e (paren-expr gettok))
(s (stmt gettok))
(x (if (accept gettok 'TOK-ELSE)
`("If" ,s ,(stmt gettok))
`("If" ,s ()))))
`("If" ,e ,x)))
((accept gettok 'TOK-PUTC)
(let ((x `("Prtc" ,(paren-expr gettok) ())))
(expect gettok "Putc" 'TOK-SEMICOLON)
x))
((accept gettok 'TOK-PRINT)
(expect gettok "Print" 'TOK-LEFTPAREN)
(let ((x '()))
(loop for tok = (curr gettok)
for toktok = (tokstruc-tok tok)
for e = (if (eq toktok 'TOK-STRING)
(let* ((tokval (tokstruc-tokval tok))
(leaf `("String" ,tokval))
(e `("Prts" ,leaf ())))
(nxt gettok)
e)
`("Prti" ,(expr gettok 0) ()))
do (setf x `("Sequence" ,x ,e))
while (accept gettok 'TOK-COMMA))
(expect gettok "Print" 'TOK-RIGHTPAREN)
(expect gettok "Print" 'TOK-SEMICOLON)
x))
((eq (tokstruc-tok (curr gettok)) 'TOK-SEMICOLON)
(nxt gettok))
((eq (tokstruc-tok (curr gettok)) 'TOK-IDENTIFIER)
(let ((v `("Identifier" ,(tokstruc-tokval (curr gettok)))))
(nxt gettok)
(expect gettok "assign" 'TOK-ASSIGN)
(let ((x `("Assign" ,v ,(expr gettok 0))))
(expect gettok "assign" 'TOK-SEMICOLON)
x)))
((accept gettok 'TOK-WHILE)
(let ((e (paren-expr gettok)))
`("While" ,e ,(stmt gettok))))
((accept gettok 'TOK-LEFTBRACE)
(let ((x '()))
(loop for tok = (curr gettok)
for toktok = (tokstruc-tok tok)
until (or (eq toktok 'TOK-RIGHTBRACE)
(eq toktok 'TOK-END-OF-INPUT))
do (setf x `("Sequence" ,x ,(stmt gettok))))
(expect gettok "Lbrace" 'TOK-RIGHTBRACE)
x))
((eq (tokstruc-tok (curr gettok)) 'TOK-END-OF-INPUT)
'())
(t (let* ((tok (curr gettok))
(toktok (tokstruc-tok tok)))
(err tok (join "expecting start of statement, found '"
(tok-text toktok) "'"))))))
;;
;; Parsing of the top-level statement sequence.
;;
(let ((x '()))
(nxt gettok)
(loop do (setf x `("Sequence" ,x ,(stmt gettok)))
until (eq (tokstruc-tok (curr gettok)) 'TOK-END-OF-INPUT))
x))
(defun string-to-tokstruc (s)
(let ((strings
(nth-value 1 (ppcre:scan-to-strings re-token-1 s))))
(if strings
(make-tokstruc :line-no (elt strings 0)
:column-no (elt strings 1)
:tok (string-to-tok (elt strings 2))
:tokval nil)
(let ((strings
(nth-value 1 (ppcre:scan-to-strings re-token-2 s))))
(if strings
(make-tokstruc :line-no (elt strings 0)
:column-no (elt strings 1)
:tok (string-to-tok (elt strings 2))
:tokval (elt strings 3))
(progn
(warn "unparseable token line")
(uiop:quit 1)))))))
(defun read-token-line (inpf)
(loop for line = (read-line inpf nil "End_of_input")
while (ppcre:scan re-blank-line line)
finally (return line)))
(defun open-inpf (inpf-filename)
(if (string= inpf-filename "-")
*standard-input*
(open inpf-filename :direction :input)))
(defun open-outf (outf-filename)
(if (string= outf-filename "-")
*standard-output*
(open outf-filename :direction :output
:if-exists :overwrite
:if-does-not-exist :create)))
(defun usage-error ()
(princ "Usage: parse [INPUTFILE [OUTPUTFILE]]" *standard-output*)
(terpri *standard-output*)
(princ "If either INPUTFILE or OUTPUTFILE is \"-\", the respective"
*standard-output*)
(princ " standard I/O is used." *standard-output*)
(terpri *standard-output*)
(uiop:quit 1))
(defun get-filenames (argv)
(trivia:match argv
((list) '("-" "-"))
((list inpf-filename) `(,inpf-filename "-"))
((list inpf-filename outf-filename) `(,inpf-filename
,outf-filename))
(_ (usage-error))))
(defun main (&rest argv)
(let* ((filenames (get-filenames argv))
(inpf-filename (car filenames))
(inpf (open-inpf inpf-filename))
(outf-filename (cadr filenames))
(outf (open-outf outf-filename)))
(let* ((current-token (list nil))
(gettok-curr (lambda () (elt current-token 0)))
(gettok-nxt (lambda ()
(let* ((s (read-token-line inpf))
(tok (string-to-tokstruc s)))
(setf (elt current-token 0) tok)
tok)))
(gettok (lambda (instruction)
(trivia:match instruction
(:curr (funcall gettok-curr))
(:nxt (funcall gettok-nxt)))))
(ast (parse gettok)))
(prt-ast outf ast))
(unless (string= inpf-filename "-")
(close inpf))
(unless (string= outf-filename "-")
(close outf))
(uiop:quit 0)))
;;; vim: set ft=lisp lisp:
Output:
$ ./parse.ros compiler-tests/primes.lex
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier    count
Integer       1
Assign
Identifier    n
Integer       1
Assign
Identifier    limit
Integer       100
While
Less
Identifier    n
Identifier    limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier    k
Integer       3
Assign
Identifier    p
Integer       1
Assign
Identifier    n
Add
Identifier    n
Integer       2
While
And
LessEqual
Multiply
Identifier    k
Identifier    k
Identifier    n
Identifier    p
Sequence
Sequence
;
Assign
Identifier    p
NotEqual
Multiply
Divide
Identifier    n
Identifier    k
Identifier    k
Identifier    n
Assign
Identifier    k
Add
Identifier    k
Integer       2
If
Identifier    p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier    n
;
Prts
String        " is prime\n"
;
Assign
Identifier    count
Add
Identifier    count
Integer       1
;
Sequence
Sequence
Sequence
;
Prts
String        "Total primes found: "
;
Prti
Identifier    count
;
Prts
String        "\n"
;
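Every solution starts by re-reading the lexical analyzer's four-column output: line number, column number, token name, and an optional value (which may itself contain spaces, as string literals do). A sketch of that step (Python; the function name is illustrative, not part of the task spec):

```python
def read_token(line):
    """Split one lexer output line into (line_no, col_no, name, value)."""
    # maxsplit=3 keeps the whole remainder as the value, so string
    # literals with embedded spaces survive intact.
    parts = line.split(maxsplit=3)
    ln, col, name = int(parts[0]), int(parts[1]), parts[2]
    value = parts[3] if len(parts) > 3 else None
    return ln, col, name, value

print(read_token('  4  1 Identifier count'))
```

This is the same job the C version does with `strtok` in `gettok()` and the COBOL version does with the fixed-width fields of `input-record`.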
Forth
Tested with Gforth 0.7.3.
CREATE BUF 0 , \ single-character look-ahead buffer
: PEEK BUF @ 0= IF KEY BUF ! THEN BUF @ ;
: GETC PEEK 0 BUF ! ;
: SPACE? DUP BL = SWAP 9 14 WITHIN OR ;
: >SPACE BEGIN PEEK SPACE? WHILE GETC DROP REPEAT ;
: DIGIT? 48 58 WITHIN ;
: GETINT >SPACE 0
BEGIN PEEK DIGIT?
WHILE GETC [CHAR] 0 - SWAP 10 * +
REPEAT ;
: GETNAM >SPACE PAD 1+
BEGIN PEEK SPACE? INVERT
WHILE GETC OVER C! CHAR+
REPEAT PAD TUCK - 1- PAD C! ;
: GETSTR >SPACE PAD 1+ GETC DROP \ skip leading "
BEGIN GETC DUP [CHAR] " <>
WHILE OVER C! CHAR+
REPEAT DROP PAD TUCK - 1- PAD C! ;
: INTERN HERE SWAP DUP C@ 1+ BOUNDS DO I C@ C, LOOP ALIGN ;
CREATE #TK 0 ,
: TK: CREATE #TK @ , 1 #TK +! DOES> @ ;
TK: End_of_input TK: Keyword_if TK: Keyword_else
TK: Keyword_while TK: Keyword_print TK: Keyword_putc
TK: String TK: Integer TK: Identifier
TK: LeftParen TK: RightParen
TK: LeftBrace TK: RightBrace
TK: Semicolon TK: Comma
TK: Op_assign TK: Op_not
: (BINARY?) [ #TK @ ] literal >= ;
TK: Op_subtract TK: Op_add
TK: Op_mod TK: Op_multiply TK: Op_divide
TK: Op_equal TK: Op_notequal
TK: Op_less TK: Op_lessequal
TK: Op_greater TK: Op_greaterequal
TK: Op_and TK: Op_or
CREATE TOKEN 0 , 0 , 0 , 0 ,
: TOKEN-TYPE TOKEN 2 CELLS + @ ;
: TOKEN-VALUE TOKEN 3 CELLS + @ ;
: GETTOK GETINT GETINT TOKEN 2!
GETNAM FIND DROP EXECUTE
DUP Integer = IF GETINT ELSE
DUP String = IF GETSTR INTERN ELSE
DUP Identifier = IF GETNAM INTERN ELSE
0 THEN THEN THEN
TOKEN 3 CELLS + ! TOKEN 2 CELLS + ! ;
: BINARY? TOKEN-TYPE (BINARY?) ;
CREATE PREC #TK @ CELLS ALLOT PREC #TK @ CELLS -1 FILL
: PREC! CELLS PREC + ! ;
14 Op_not PREC! 13 Op_multiply PREC!
13 Op_divide PREC! 13 Op_mod PREC!
12 Op_add PREC! 12 Op_subtract PREC!
10 Op_less PREC! 10 Op_greater PREC!
10 Op_lessequal PREC! 10 Op_greaterequal PREC!
9 Op_equal PREC! 9 Op_notequal PREC!
5 Op_and PREC! 4 Op_or PREC!
: PREC@ CELLS PREC + @ ;
\ Each AST Node is a sequence of cells in data space consisting
\ of the execution token of a printing word, followed by that
\ node's data. Each printing word receives the address of the
\ node's data, and is responsible for printing that data
\ appropriately.
DEFER .NODE
: .NULL DROP ." ;" CR ;
CREATE $NULL ' .NULL ,
: .IDENTIFIER ." Identifier " @ COUNT TYPE CR ;
: $IDENTIFIER ( a-addr --) HERE SWAP ['] .IDENTIFIER , , ;
: .INTEGER ." Integer " @ . CR ;
: $INTEGER ( n --) HERE SWAP ['] .INTEGER , , ;
: "TYPE" [CHAR] " EMIT TYPE [CHAR] " EMIT ;
: .STRING ." String " @ COUNT "TYPE" CR ;
: $STRING ( a-addr --) HERE SWAP ['] .STRING , , ;
: .LEAF DUP @ COUNT TYPE CR CELL+ @ .NODE 0 .NULL ;
: LEAF CREATE HERE CELL+ , BL WORD INTERN .
DOES> HERE >R ['] .LEAF , @ , , R> ;
LEAF $PRTC Prtc LEAF $PRTS Prts LEAF $PRTI Prti
LEAF $NOT Not LEAF $NEGATE Negate
: .BINARY DUP @ COUNT TYPE CR
CELL+ DUP @ .NODE CELL+ @ .NODE ;
: BINARY CREATE HERE CELL+ , BL WORD INTERN .
DOES> HERE >R ['] .BINARY , @ , SWAP 2, R> ;
BINARY $SEQUENCE Sequence BINARY $ASSIGN Assign
BINARY $WHILE While BINARY $IF If
BINARY $SUBTRACT Subtract BINARY $ADD Add
BINARY $MOD Mod BINARY $MULTIPLY Multiply
BINARY $DIVIDE Divide
BINARY $LESS Less BINARY $LESSEQUAL LessEqual
BINARY $GREATER Greater BINARY $GREATEREQUAL GreaterEqual
BINARY $EQUAL Equal BINARY $NOTEQUAL NotEqual
BINARY $AND And BINARY $OR Or
: TOK-CONS ( x* -- node-xt) TOKEN-TYPE CASE
Op_subtract OF ['] $SUBTRACT ENDOF
Op_add OF ['] $ADD ENDOF
Op_mod OF ['] $MOD ENDOF
Op_multiply OF ['] $MULTIPLY ENDOF
Op_divide OF ['] $DIVIDE ENDOF
Op_equal OF ['] $EQUAL ENDOF
Op_notequal OF ['] $NOTEQUAL ENDOF
Op_less OF ['] $LESS ENDOF
Op_lessequal OF ['] $LESSEQUAL ENDOF
Op_greater OF ['] $GREATER ENDOF
Op_greaterequal OF ['] $GREATEREQUAL ENDOF
Op_and OF ['] $AND ENDOF
Op_or OF ['] $OR ENDOF
ENDCASE ;
: (.NODE) DUP CELL+ SWAP @ EXECUTE ;
' (.NODE) IS .NODE
: .- ( n --) 0 <# #S #> TYPE ;
: EXPECT ( tk --) DUP TOKEN-TYPE <>
IF CR ." stdin:" TOKEN 2@ SWAP .- ." :" .-
." : unexpected token, expecting " . CR BYE
THEN DROP GETTOK ;
: '(' LeftParen EXPECT ;
: ')' RightParen EXPECT ;
: '}' RightBrace EXPECT ;
: ';' Semicolon EXPECT ;
: ',' Comma EXPECT ;
: '=' Op_assign EXPECT ;
DEFER *EXPR DEFER EXPR DEFER STMT
: PAREN-EXPR '(' EXPR ')' ;
: PRIMARY
TOKEN-TYPE LeftParen = IF PAREN-EXPR EXIT THEN
TOKEN-TYPE Op_add = IF GETTOK 12 *EXPR EXIT THEN
TOKEN-TYPE Op_subtract = IF GETTOK 14 *EXPR $NEGATE EXIT THEN
TOKEN-TYPE Op_not = IF GETTOK 14 *EXPR $NOT EXIT THEN
TOKEN-TYPE Identifier = IF TOKEN-VALUE $IDENTIFIER ELSE
TOKEN-TYPE Integer = IF TOKEN-VALUE $INTEGER THEN THEN
GETTOK ;
: (*EXPR) ( n -- node)
PRIMARY ( n node)
BEGIN OVER TOKEN-TYPE PREC@ SWAP OVER <= BINARY? AND
WHILE ( n node prec) 1+ TOK-CONS SWAP GETTOK *EXPR SWAP EXECUTE
REPEAT ( n node prec) DROP NIP ( node) ;
: (EXPR) 0 *EXPR ;
: -)? TOKEN-TYPE RightParen <> ;
: -}? TOKEN-TYPE RightBrace <> ;
: (STMT)
TOKEN-TYPE Semicolon = IF GETTOK STMT EXIT THEN
TOKEN-TYPE Keyword_while =
IF GETTOK PAREN-EXPR STMT $WHILE EXIT THEN
TOKEN-TYPE Keyword_if =
IF GETTOK PAREN-EXPR STMT
TOKEN-TYPE Keyword_else = IF GETTOK STMT ELSE $NULL THEN
$IF $IF EXIT
THEN
TOKEN-TYPE Keyword_putc =
IF GETTOK PAREN-EXPR ';' $PRTC EXIT THEN
TOKEN-TYPE Keyword_print =
IF GETTOK '(' $NULL
BEGIN TOKEN-TYPE String =
IF TOKEN-VALUE $STRING $PRTS GETTOK
ELSE EXPR $PRTI THEN $SEQUENCE -)?
WHILE ',' REPEAT ')' ';' EXIT THEN
TOKEN-TYPE Identifier =
IF TOKEN-VALUE $IDENTIFIER GETTOK '=' EXPR ';' $ASSIGN
EXIT THEN
TOKEN-TYPE LeftBrace =
IF $NULL GETTOK BEGIN -}? WHILE STMT $SEQUENCE REPEAT
'}' EXIT THEN
TOKEN-TYPE End_of_input = IF EXIT THEN EXPR ;
' (*EXPR) IS *EXPR ' (EXPR) IS EXPR ' (STMT) IS STMT
: -EOI? TOKEN-TYPE End_of_input <> ;
: PARSE $NULL GETTOK BEGIN -EOI? WHILE STMT $SEQUENCE REPEAT ;
PARSE .NODE
Output — Count AST:
Sequence
Sequence
;
Assign
Identifier count
Integer 1
While
Less
Identifier count
Integer 10
Sequence
Sequence
;
Sequence
Sequence
Sequence
;
Prts
String "count is: "
;
Prti
Identifier count
;
Prts
String "\n"
;
Assign
Identifier count
Add
Identifier count
Integer 1
Fortran
The following is Fortran 2008/2018 code with C preprocessing directives. If you name the source file ‘parse.F90’, with a capital ‘F’, gfortran will know to run the C preprocessor.
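A build invocation of the kind the paragraph above implies might look like this (the flags follow the comments in the source; treat the exact command, and the count.lex input file, as a sketch):

```shell
# The capital .F90 suffix makes gfortran run the C preprocessor.
# -fall-intrinsics keeps the FGETC intrinsic usable under -std=f2018.
gfortran -std=f2018 -fall-intrinsics -o parse parse.F90
./parse < count.lex
```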
!!!
!!! An implementation of the Rosetta Code parser task:
!!! https://rosettacode.org/wiki/Compiler/syntax_analyzer
!!!
!!! The implementation is based on the published pseudocode.
!!!
module compiler_type_kinds
use, intrinsic :: iso_fortran_env, only: int32
use, intrinsic :: iso_fortran_env, only: int64
implicit none
private
! Synonyms.
integer, parameter, public :: size_kind = int64
integer, parameter, public :: length_kind = size_kind
integer, parameter, public :: nk = size_kind
! Synonyms for character capable of storing a Unicode code point.
integer, parameter, public :: unicode_char_kind = selected_char_kind ('ISO_10646')
integer, parameter, public :: ck = unicode_char_kind
! Synonyms for integers capable of storing a Unicode code point.
integer, parameter, public :: unicode_ichar_kind = int32
integer, parameter, public :: ick = unicode_ichar_kind
end module compiler_type_kinds
module string_buffers
use, intrinsic :: iso_fortran_env, only: error_unit
use, intrinsic :: iso_fortran_env, only: int64
use, non_intrinsic :: compiler_type_kinds, only: nk, ck, ick
implicit none
private
public :: strbuf_t
type :: strbuf_t
integer(kind = nk), private :: len = 0
!
! ‘chars’ is made public for efficient access to the individual
! characters.
!
character(1, kind = ck), allocatable, public :: chars(:)
contains
procedure, pass, private :: ensure_storage => strbuf_t_ensure_storage
procedure, pass :: to_unicode_full_string => strbuf_t_to_unicode_full_string
procedure, pass :: to_unicode_substring => strbuf_t_to_unicode_substring
procedure, pass :: length => strbuf_t_length
procedure, pass :: set => strbuf_t_set
procedure, pass :: append => strbuf_t_append
generic :: to_unicode => to_unicode_full_string
generic :: to_unicode => to_unicode_substring
generic :: assignment(=) => set
end type strbuf_t
contains
function strbuf_t_to_unicode_full_string (strbuf) result (s)
class(strbuf_t), intent(in) :: strbuf
character(:, kind = ck), allocatable :: s
!
! This does not actually ensure that the string is valid Unicode;
! any 31-bit ‘character’ is supported.
!
integer(kind = nk) :: i
allocate (character(len = strbuf%len, kind = ck) :: s)
do i = 1, strbuf%len
s(i:i) = strbuf%chars(i)
end do
end function strbuf_t_to_unicode_full_string
function strbuf_t_to_unicode_substring (strbuf, i, j) result (s)
!
! ‘Extreme’ values of i and j are allowed, as shortcuts for ‘from
! the beginning’, ‘up to the end’, or ‘empty substring’.
!
class(strbuf_t), intent(in) :: strbuf
integer(kind = nk), intent(in) :: i, j
character(:, kind = ck), allocatable :: s
!
! This does not actually ensure that the string is valid Unicode;
! any 31-bit ‘character’ is supported.
!
integer(kind = nk) :: i1, j1
integer(kind = nk) :: n
integer(kind = nk) :: k
i1 = max (1_nk, i)
j1 = min (strbuf%len, j)
n = max (0_nk, (j1 - i1) + 1_nk)
allocate (character(n, kind = ck) :: s)
do k = 1, n
s(k:k) = strbuf%chars(i1 + (k - 1_nk))
end do
end function strbuf_t_to_unicode_substring
elemental function strbuf_t_length (strbuf) result (n)
class(strbuf_t), intent(in) :: strbuf
integer(kind = nk) :: n
n = strbuf%len
end function strbuf_t_length
elemental function next_power_of_two (x) result (y)
integer(kind = nk), intent(in) :: x
integer(kind = nk) :: y
!
! It is assumed that no more than 64 bits are used.
!
! The branch-free algorithm is that of
! https://archive.is/nKxAc#RoundUpPowerOf2
!
! Fill in bits until one less than the desired power of two is
! reached, and then add one.
!
y = x - 1
y = ior (y, ishft (y, -1))
y = ior (y, ishft (y, -2))
y = ior (y, ishft (y, -4))
y = ior (y, ishft (y, -8))
y = ior (y, ishft (y, -16))
y = ior (y, ishft (y, -32))
y = y + 1
end function next_power_of_two
elemental function new_storage_size (length_needed) result (size)
integer(kind = nk), intent(in) :: length_needed
integer(kind = nk) :: size
! Increase storage by orders of magnitude.
if (2_nk**32 < length_needed) then
size = huge (1_nk)
else
size = next_power_of_two (length_needed)
end if
end function new_storage_size
subroutine strbuf_t_ensure_storage (strbuf, length_needed)
class(strbuf_t), intent(inout) :: strbuf
integer(kind = nk), intent(in) :: length_needed
integer(kind = nk) :: new_size
type(strbuf_t) :: new_strbuf
if (.not. allocated (strbuf%chars)) then
! Initialize a new strbuf%chars array.
new_size = new_storage_size (length_needed)
allocate (strbuf%chars(1:new_size))
else if (ubound (strbuf%chars, 1) < length_needed) then
! Allocate a new strbuf%chars array, larger than the current
! one, but containing the same characters.
new_size = new_storage_size (length_needed)
allocate (new_strbuf%chars(1:new_size))
new_strbuf%chars(1:strbuf%len) = strbuf%chars(1:strbuf%len)
call move_alloc (new_strbuf%chars, strbuf%chars)
end if
end subroutine strbuf_t_ensure_storage
subroutine strbuf_t_set (dst, src)
class(strbuf_t), intent(inout) :: dst
class(*), intent(in) :: src
integer(kind = nk) :: n
integer(kind = nk) :: i
select type (src)
type is (character(*, kind = ck))
n = len (src, kind = nk)
call dst%ensure_storage(n)
do i = 1, n
dst%chars(i) = src(i:i)
end do
dst%len = n
type is (character(*))
n = len (src, kind = nk)
call dst%ensure_storage(n)
do i = 1, n
dst%chars(i) = src(i:i)
end do
dst%len = n
class is (strbuf_t)
n = src%len
call dst%ensure_storage(n)
dst%chars(1:n) = src%chars(1:n)
dst%len = n
class default
error stop
end select
end subroutine strbuf_t_set
subroutine strbuf_t_append (dst, src)
class(strbuf_t), intent(inout) :: dst
class(*), intent(in) :: src
integer(kind = nk) :: n_dst, n_src, n
integer(kind = nk) :: i
select type (src)
type is (character(*, kind = ck))
n_dst = dst%len
n_src = len (src, kind = nk)
n = n_dst + n_src
call dst%ensure_storage(n)
do i = 1, n_src
dst%chars(n_dst + i) = src(i:i)
end do
dst%len = n
type is (character(*))
n_dst = dst%len
n_src = len (src, kind = nk)
n = n_dst + n_src
call dst%ensure_storage(n)
do i = 1, n_src
dst%chars(n_dst + i) = src(i:i)
end do
dst%len = n
class is (strbuf_t)
n_dst = dst%len
n_src = src%len
n = n_dst + n_src
call dst%ensure_storage(n)
dst%chars((n_dst + 1):n) = src%chars(1:n_src)
dst%len = n
class default
error stop
end select
end subroutine strbuf_t_append
end module string_buffers
module reading_one_line_from_a_stream
use, intrinsic :: iso_fortran_env, only: input_unit
use, intrinsic :: iso_fortran_env, only: error_unit
use, non_intrinsic :: compiler_type_kinds, only: nk, ck, ick
use, non_intrinsic :: string_buffers
implicit none
private
! get_line_from_stream: read an entire input line from a stream into
! a strbuf_t.
public :: get_line_from_stream
character(1, kind = ck), parameter :: linefeed_char = char (10, kind = ck)
! The following is correct for Unix and its relatives.
character(1, kind = ck), parameter :: newline_char = linefeed_char
contains
subroutine get_line_from_stream (unit_no, eof, no_newline, strbuf)
integer, intent(in) :: unit_no
logical, intent(out) :: eof ! End of file?
logical, intent(out) :: no_newline ! There is a line but it has no
! newline? (Thus eof also must
! be .true.)
class(strbuf_t), intent(inout) :: strbuf
character(1, kind = ck) :: ch
strbuf = ''
call get_ch (unit_no, eof, ch)
do while (.not. eof .and. ch /= newline_char)
call strbuf%append (ch)
call get_ch (unit_no, eof, ch)
end do
no_newline = eof .and. (strbuf%length() /= 0)
end subroutine get_line_from_stream
subroutine get_ch (unit_no, eof, ch)
!
! Read a single code point from the stream.
!
! Currently this procedure simply inputs ‘ASCII’ bytes rather than
! Unicode code points.
!
integer, intent(in) :: unit_no
logical, intent(out) :: eof
character(1, kind = ck), intent(out) :: ch
integer :: stat
character(1) :: c = '*'
eof = .false.
if (unit_no == input_unit) then
call get_input_unit_char (c, stat)
else
read (unit = unit_no, iostat = stat) c
end if
if (stat < 0) then
ch = ck_'*'
eof = .true.
else if (0 < stat) then
write (error_unit, '("Input error with status code ", I0)') stat
stop 1
else
ch = char (ichar (c, kind = ick), kind = ck)
end if
end subroutine get_ch
!!!
!!! If you tell gfortran you want -std=f2008 or -std=f2018, you likely
!!! will need to add also -fall-intrinsics or -U__GFORTRAN__
!!!
!!! The first way, you get the FGETC intrinsic. The latter way, you
!!! get the C interface code that uses getchar(3).
!!!
#ifdef __GFORTRAN__
subroutine get_input_unit_char (c, stat)
!
! The following works if you are using gfortran.
!
! (FGETC is considered a feature for backwards compatibility with
! g77. However, I know of no way to reconfigure input_unit as a
! Fortran 2003 stream, for use with ordinary ‘read’.)
!
character, intent(inout) :: c
integer, intent(out) :: stat
call fgetc (input_unit, c, stat)
end subroutine get_input_unit_char
#else
subroutine get_input_unit_char (c, stat)
!
! An alternative implementation of get_input_unit_char. This
! actually reads input from the C standard input, which might not
! be the same as input_unit.
!
use, intrinsic :: iso_c_binding, only: c_int
character, intent(inout) :: c
integer, intent(out) :: stat
interface
!
! Use getchar(3) to read characters from standard input. This
! assumes there is actually such a function available, and that
! getchar(3) does not exist solely as a macro. (One could write
! one’s own getchar() if necessary, of course.)
!
function getchar () result (c) bind (c, name = 'getchar')
use, intrinsic :: iso_c_binding, only: c_int
integer(kind = c_int) :: c
end function getchar
end interface
integer(kind = c_int) :: i_char
i_char = getchar ()
!
! The C standard requires that EOF have a negative value. If the
! value returned by getchar(3) is not EOF, then it will be
! representable as an unsigned char. Therefore, to check for end
! of file, one need only test whether i_char is negative.
!
if (i_char < 0) then
stat = -1
else
stat = 0
c = char (i_char)
end if
end subroutine get_input_unit_char
#endif
end module reading_one_line_from_a_stream
module lexer_token_facts
implicit none
private
integer, parameter, public :: tk_EOI = 0
integer, parameter, public :: tk_Mul = 1
integer, parameter, public :: tk_Div = 2
integer, parameter, public :: tk_Mod = 3
integer, parameter, public :: tk_Add = 4
integer, parameter, public :: tk_Sub = 5
integer, parameter, public :: tk_Negate = 6
integer, parameter, public :: tk_Not = 7
integer, parameter, public :: tk_Lss = 8
integer, parameter, public :: tk_Leq = 9
integer, parameter, public :: tk_Gtr = 10
integer, parameter, public :: tk_Geq = 11
integer, parameter, public :: tk_Eq = 12
integer, parameter, public :: tk_Neq = 13
integer, parameter, public :: tk_Assign = 14
integer, parameter, public :: tk_And = 15
integer, parameter, public :: tk_Or = 16
integer, parameter, public :: tk_If = 17
integer, parameter, public :: tk_Else = 18
integer, parameter, public :: tk_While = 19
integer, parameter, public :: tk_Print = 20
integer, parameter, public :: tk_Putc = 21
integer, parameter, public :: tk_Lparen = 22
integer, parameter, public :: tk_Rparen = 23
integer, parameter, public :: tk_Lbrace = 24
integer, parameter, public :: tk_Rbrace = 25
integer, parameter, public :: tk_Semi = 26
integer, parameter, public :: tk_Comma = 27
integer, parameter, public :: tk_Ident = 28
integer, parameter, public :: tk_Integer = 29
integer, parameter, public :: tk_String = 30
integer, parameter, public :: tk_Positive = 31
character(16), parameter, public :: lexer_token_string(0:31) = &
(/ "EOI ", &
& "* ", &
& "/ ", &
& "% ", &
& "+ ", &
& "- ", &
& "- ", &
& "! ", &
& "< ", &
& "<= ", &
& "> ", &
& ">= ", &
& "== ", &
& "!= ", &
& "= ", &
& "&& ", &
& "|| ", &
& "if ", &
& "else ", &
& "while ", &
& "print ", &
& "putc ", &
& "( ", &
& ") ", &
& "{ ", &
& "} ", &
& "; ", &
& ", ", &
& "Ident ", &
& "Integer literal ", &
& "String literal ", &
& "+ " /)
integer, parameter, public :: lexer_token_arity(0:31) = &
& (/ -1, & ! EOI
& 2, 2, 2, 2, 2, & ! * / % + -
& 1, 1, & ! negate !
& 2, 2, 2, 2, 2, 2, & ! < <= > >= == !=
& -1, & ! =
& 2, 2, & ! && ||
& -1, -1, -1, -1, -1, & !
& -1, -1, -1, -1, -1, & !
& -1, -1, -1, -1, & !
& 1 /) ! positive
integer, parameter, public :: lexer_token_precedence(0:31) = &
& (/ -1, & ! EOI
& 13, 13, 13, & ! * / %
& 12, 12, & ! + -
& 14, 14, & ! negate !
& 10, 10, 10, 10, & ! < <= > >=
& 9, 9, & ! == !=
& -1, & ! =
& 5, & ! &&
& 4, & ! ||
& -1, -1, -1, -1, -1, & !
& -1, -1, -1, -1, -1, & !
& -1, -1, -1, -1, & !
& 14 /) ! positive
integer, parameter, public :: left_associative = 0
integer, parameter, public :: right_associative = 1
! All current operators are left associative. (The values in the
! array for things that are not operators are unimportant.)
integer, parameter, public :: lexer_token_associativity(0:31) = left_associative
end module lexer_token_facts
module reading_of_lexer_tokens
use, intrinsic :: iso_fortran_env, only: error_unit
use, non_intrinsic :: compiler_type_kinds, only: nk, ck, ick
use, non_intrinsic :: string_buffers
use, non_intrinsic :: reading_one_line_from_a_stream
use, non_intrinsic :: lexer_token_facts
implicit none
private
public :: lexer_token_t
public :: get_lexer_token
character(1, kind = ck), parameter :: horizontal_tab_char = char (9, kind = ck)
character(1, kind = ck), parameter :: linefeed_char = char (10, kind = ck)
character(1, kind = ck), parameter :: vertical_tab_char = char (11, kind = ck)
character(1, kind = ck), parameter :: formfeed_char = char (12, kind = ck)
character(1, kind = ck), parameter :: carriage_return_char = char (13, kind = ck)
character(1, kind = ck), parameter :: space_char = ck_' '
type :: lexer_token_t
integer :: token_no = -(huge (1))
character(:, kind = ck), allocatable :: val
integer(nk) :: line_no = -(huge (1_nk))
integer(nk) :: column_no = -(huge (1_nk))
end type lexer_token_t
contains
subroutine get_lexer_token (unit_no, lex_line_no, eof, token)
!
! Lines that are empty or contain only whitespace are tolerated.
!
! Also tolerated are comment lines, whose first character is a
! '!'. It is convenient for debugging to be able to comment out
! lines.
!
! A last line without a newline is *not* tolerated, unless it
! contains only whitespace.
!
! Letting there be some whitespace is partly for the sake of
! reading cut-and-paste from a browser display.
!
integer, intent(in) :: unit_no
integer(kind = nk), intent(inout) :: lex_line_no
logical, intent(out) :: eof
type(lexer_token_t), intent(out) :: token
type(strbuf_t) :: strbuf
logical :: no_newline
logical :: input_found
! Let a negative setting initialize the line number.
lex_line_no = max (0_nk, lex_line_no)
strbuf = ''
eof = .false.
input_found = .false.
do while (.not. eof .and. .not. input_found)
call get_line_from_stream (unit_no, eof, no_newline, strbuf)
if (eof) then
if (no_newline) then
lex_line_no = lex_line_no + 1
if (.not. strbuf_is_all_whitespace (strbuf)) then
call start_error_message (lex_line_no)
write (error_unit, '("lexer line ends without a newline")')
stop 1
end if
end if
else
lex_line_no = lex_line_no + 1
input_found = .true.
if (strbuf_is_all_whitespace (strbuf)) then
! A blank line.
input_found = .false.
else if (0 < strbuf%length()) then
if (strbuf%chars(1) == ck_'!') then
! A comment line.
input_found = .false.
end if
end if
end if
end do
token = lexer_token_t ()
if (.not. eof) then
token = strbuf_to_token (lex_line_no, strbuf)
end if
end subroutine get_lexer_token
function strbuf_to_token (lex_line_no, strbuf) result (token)
integer(kind = nk), intent(in) :: lex_line_no
class(strbuf_t), intent(in) :: strbuf
type(lexer_token_t) :: token
character(:, kind = ck), allocatable :: line_no
character(:, kind = ck), allocatable :: column_no
character(:, kind = ck), allocatable :: token_name
character(:, kind = ck), allocatable :: val_string
integer :: stat
integer(kind = nk) :: n
call split_line (lex_line_no, strbuf, line_no, column_no, token_name, val_string)
read (line_no, *, iostat = stat) token%line_no
if (stat /= 0) then
call start_error_message (lex_line_no)
write (error_unit, '("line number field is unreadable or too large")')
stop 1
end if
read (column_no, *, iostat = stat) token%column_no
if (stat /= 0) then
call start_error_message (lex_line_no)
write (error_unit, '("column number field is unreadable or too large")')
stop 1
end if
token%token_no = token_name_to_token_no (lex_line_no, token_name)
select case (token%token_no)
case (tk_Ident)
! I do no checking of identifier names.
allocate (token%val, source = val_string)
case (tk_Integer)
call check_is_all_digits (lex_line_no, val_string)
allocate (token%val, source = val_string)
case (tk_String)
n = len (val_string, kind = nk)
if (n < 2) then
call string_literal_missing_or_no_good
else if (val_string(1:1) /= ck_'"' .or. val_string(n:n) /= ck_'"') then
call string_literal_missing_or_no_good
else
allocate (token%val, source = val_string)
end if
case default
if (len (val_string, kind = nk) /= 0) then
call start_error_message (lex_line_no)
write (error_unit, '("token should not have a value")')
stop 1
end if
end select
contains
subroutine string_literal_missing_or_no_good
call start_error_message (lex_line_no)
write (error_unit, '("""String"" token requires a string literal")')
stop 1
end subroutine string_literal_missing_or_no_good
end function strbuf_to_token
subroutine split_line (lex_line_no, strbuf, line_no, column_no, token_name, val_string)
integer(kind = nk), intent(in) :: lex_line_no
class(strbuf_t), intent(in) :: strbuf
character(:, kind = ck), allocatable, intent(out) :: line_no
character(:, kind = ck), allocatable, intent(out) :: column_no
character(:, kind = ck), allocatable, intent(out) :: token_name
character(:, kind = ck), allocatable, intent(out) :: val_string
integer(kind = nk) :: i, j
i = skip_whitespace (strbuf, 1_nk)
j = skip_non_whitespace (strbuf, i)
line_no = strbuf%to_unicode(i, j - 1)
call check_is_all_digits (lex_line_no, line_no)
i = skip_whitespace (strbuf, j)
j = skip_non_whitespace (strbuf, i)
column_no = strbuf%to_unicode(i, j - 1)
call check_is_all_digits (lex_line_no, column_no)
i = skip_whitespace (strbuf, j)
j = skip_non_whitespace (strbuf, i)
token_name = strbuf%to_unicode(i, j - 1)
i = skip_whitespace (strbuf, j)
if (strbuf%length() < i) then
val_string = ck_''
else if (strbuf%chars(i) == ck_'"') then
j = skip_whitespace_backwards (strbuf, strbuf%length())
if (strbuf%chars(j) == ck_'"') then
val_string = strbuf%to_unicode(i, j)
else
call start_error_message (lex_line_no)
write (error_unit, '("string literal does not end in a double quote")')
stop 1
end if
else
j = skip_non_whitespace (strbuf, i)
val_string = strbuf%to_unicode(i, j - 1)
i = skip_whitespace (strbuf, j)
if (i <= strbuf%length()) then
call start_error_message (lex_line_no)
write (error_unit, '("token line contains unexpected text")')
stop 1
end if
end if
end subroutine split_line
function token_name_to_token_no (lex_line_no, token_name) result (token_no)
integer(kind = nk), intent(in) :: lex_line_no
character(*, kind = ck), intent(in) :: token_name
integer :: token_no
!!
!! This implementation is not optimized in any way, unless the
!! Fortran compiler can optimize the SELECT CASE.
!!
select case (token_name)
case (ck_"End_of_input")
token_no = tk_EOI
case (ck_"Op_multiply")
token_no = tk_Mul
case (ck_"Op_divide")
token_no = tk_Div
case (ck_"Op_mod")
token_no = tk_Mod
case (ck_"Op_add")
token_no = tk_Add
case (ck_"Op_subtract")
token_no = tk_Sub
case (ck_"Op_negate")
token_no = tk_Negate
case (ck_"Op_not")
token_no = tk_Not
case (ck_"Op_less")
token_no = tk_Lss
case (ck_"Op_lessequal")
token_no = tk_Leq
case (ck_"Op_greater")
token_no = tk_Gtr
case (ck_"Op_greaterequal")
token_no = tk_Geq
case (ck_"Op_equal")
token_no = tk_Eq
case (ck_"Op_notequal")
token_no = tk_Neq
case (ck_"Op_assign")
token_no = tk_Assign
case (ck_"Op_and")
token_no = tk_And
case (ck_"Op_or")
token_no = tk_Or
case (ck_"Keyword_if")
token_no = tk_If
case (ck_"Keyword_else")
token_no = tk_Else
case (ck_"Keyword_while")
token_no = tk_While
case (ck_"Keyword_print")
token_no = tk_Print
case (ck_"Keyword_putc")
token_no = tk_Putc
case (ck_"LeftParen")
token_no = tk_Lparen
case (ck_"RightParen")
token_no = tk_Rparen
case (ck_"LeftBrace")
token_no = tk_Lbrace
case (ck_"RightBrace")
token_no = tk_Rbrace
case (ck_"Semicolon")
token_no = tk_Semi
case (ck_"Comma")
token_no = tk_Comma
case (ck_"Identifier")
token_no = tk_Ident
case (ck_"Integer")
token_no = tk_Integer
case (ck_"String")
token_no = tk_String
case default
call start_error_message (lex_line_no)
write (error_unit, '("unrecognized token name: ", A)') token_name
stop 1
end select
end function token_name_to_token_no
function skip_whitespace (strbuf, i) result (j)
class(strbuf_t), intent(in) :: strbuf
integer(kind = nk), intent(in) :: i
integer(kind = nk) :: j
logical :: done
j = i
done = .false.
do while (.not. done)
if (at_end_of_line (strbuf, j)) then
done = .true.
else if (.not. isspace (strbuf%chars(j))) then
done = .true.
else
j = j + 1
end if
end do
end function skip_whitespace
function skip_non_whitespace (strbuf, i) result (j)
class(strbuf_t), intent(in) :: strbuf
integer(kind = nk), intent(in) :: i
integer(kind = nk) :: j
logical :: done
j = i
done = .false.
do while (.not. done)
if (at_end_of_line (strbuf, j)) then
done = .true.
else if (isspace (strbuf%chars(j))) then
done = .true.
else
j = j + 1
end if
end do
end function skip_non_whitespace
function skip_whitespace_backwards (strbuf, i) result (j)
class(strbuf_t), intent(in) :: strbuf
integer(kind = nk), intent(in) :: i
integer(kind = nk) :: j
logical :: done
j = i
done = .false.
do while (.not. done)
if (j == 0) then
done = .true.
else if (.not. isspace (strbuf%chars(j))) then
done = .true.
else
j = j - 1
end if
end do
end function skip_whitespace_backwards
function at_end_of_line (strbuf, i) result (bool)
class(strbuf_t), intent(in) :: strbuf
integer(kind = nk), intent(in) :: i
logical :: bool
bool = (strbuf%length() < i)
end function at_end_of_line
elemental function strbuf_is_all_whitespace (strbuf) result (bool)
class(strbuf_t), intent(in) :: strbuf
logical :: bool
integer(kind = nk) :: n
integer(kind = nk) :: i
n = strbuf%length()
if (n == 0) then
bool = .true.
else
i = 1
bool = .true.
do while (bool .and. i /= n + 1)
bool = isspace (strbuf%chars(i))
i = i + 1
end do
end if
end function strbuf_is_all_whitespace
elemental function isspace (ch) result (bool)
character(1, kind = ck), intent(in) :: ch
logical :: bool
bool = (ch == horizontal_tab_char) .or. &
& (ch == linefeed_char) .or. &
& (ch == vertical_tab_char) .or. &
& (ch == formfeed_char) .or. &
& (ch == carriage_return_char) .or. &
& (ch == space_char)
end function isspace
elemental function isdigit (ch) result (bool)
character(1, kind = ck), intent(in) :: ch
logical :: bool
integer(kind = ick), parameter :: zero = ichar (ck_'0', kind = ick)
integer(kind = ick), parameter :: nine = ichar (ck_'9', kind = ick)
integer(kind = ick) :: i_ch
i_ch = ichar (ch, kind = ick)
bool = (zero <= i_ch .and. i_ch <= nine)
end function isdigit
subroutine check_is_all_digits (lex_line_no, str)
integer(kind = nk), intent(in) :: lex_line_no
character(*, kind = ck), intent(in) :: str
integer(kind = nk) :: n
integer(kind = nk) :: i
n = len (str, kind = nk)
if (n == 0_nk) then
call start_error_message (lex_line_no)
write (error_unit, '("a required field is missing")')
stop 1
else
do i = 1, n
if (.not. isdigit (str(i:i))) then
call start_error_message (lex_line_no)
write (error_unit, '("a numeric field contains a non-digit")')
stop 1
end if
end do
end if
end subroutine check_is_all_digits
subroutine start_error_message (lex_line_no)
integer(kind = nk), intent(in) :: lex_line_no
write (error_unit, '("Token stream error at line ", I0, ": ")', advance = 'no') &
& lex_line_no
end subroutine start_error_message
end module reading_of_lexer_tokens
module syntactic_analysis
use, intrinsic :: iso_fortran_env, only: input_unit
use, intrinsic :: iso_fortran_env, only: output_unit
use, intrinsic :: iso_fortran_env, only: error_unit
use, non_intrinsic :: compiler_type_kinds, only: nk, ck, ick
use, non_intrinsic :: string_buffers
use, non_intrinsic :: lexer_token_facts
use, non_intrinsic :: reading_of_lexer_tokens
implicit none
private
public :: ast_node_t
public :: ast_t
public :: parse_token_stream
public :: output_ast_flattened
integer, parameter, public :: tk_start_of_statement = -1
integer, parameter, public :: tk_primary = -2
integer, parameter :: node_Identifier = 1
integer, parameter :: node_String = 2
integer, parameter :: node_Integer = 3
integer, parameter :: node_Sequence = 4
integer, parameter :: node_If = 5
integer, parameter :: node_Prtc = 6
integer, parameter :: node_Prts = 7
integer, parameter :: node_Prti = 8
integer, parameter :: node_While = 9
integer, parameter :: node_Assign = 10
integer, parameter :: node_Negate = 11
integer, parameter :: node_Not = 12
integer, parameter :: node_Multiply = 13
integer, parameter :: node_Divide = 14
integer, parameter :: node_Mod = 15
integer, parameter :: node_Add = 16
integer, parameter :: node_Subtract = 17
integer, parameter :: node_Less = 18
integer, parameter :: node_LessEqual = 19
integer, parameter :: node_Greater = 20
integer, parameter :: node_GreaterEqual = 21
integer, parameter :: node_Equal = 22
integer, parameter :: node_NotEqual = 23
integer, parameter :: node_And = 24
integer, parameter :: node_Or = 25
character(16), parameter :: node_variety_string(1:25) = &
(/ "Identifier ", &
& "String ", &
& "Integer ", &
& "Sequence ", &
& "If ", &
& "Prtc ", &
& "Prts ", &
& "Prti ", &
& "While ", &
& "Assign ", &
& "Negate ", &
& "Not ", &
& "Multiply ", &
& "Divide ", &
& "Mod ", &
& "Add ", &
& "Subtract ", &
& "Less ", &
& "LessEqual ", &
& "Greater ", &
& "GreaterEqual ", &
& "Equal ", &
& "NotEqual ", &
& "And ", &
& "Or " /)
type :: ast_node_t
integer :: node_variety
character(:, kind = ck), allocatable :: val
type(ast_node_t), pointer :: left => null ()
type(ast_node_t), pointer :: right => null ()
contains
procedure, pass :: assign => ast_node_t_assign
generic :: assignment(=) => assign
final :: ast_node_t_finalize
end type ast_node_t
! ast_t phases.
integer, parameter :: building = 1
integer, parameter :: completed = 2
type :: ast_t
!
! This type is used to build the subtrees, as well as for the
! completed AST. The difference is in the setting of ‘phase’.
!
type(ast_node_t), pointer :: node => null ()
integer, private :: phase = building
contains
procedure, pass :: assign => ast_t_assign
generic :: assignment(=) => assign
final :: ast_t_finalize
end type ast_t
type(ast_t), parameter :: ast_nil = ast_t (null ())
contains
recursive subroutine ast_node_t_assign (node, other)
class(ast_node_t), intent(out) :: node
class(*), intent(in) :: other
select type (other)
class is (ast_node_t)
node%node_variety = other%node_variety
if (allocated (other%val)) allocate (node%val, source = other%val)
if (associated (other%left)) allocate (node%left, source = other%left)
if (associated (other%right)) allocate (node%right, source = other%right)
class default
! This branch should never be reached.
error stop
end select
end subroutine ast_node_t_assign
recursive subroutine ast_node_t_finalize (node)
type(ast_node_t), intent(inout) :: node
if (associated (node%left)) deallocate (node%left)
if (associated (node%right)) deallocate (node%right)
end subroutine ast_node_t_finalize
recursive subroutine ast_t_assign (ast, other)
class(ast_t), intent(out) :: ast
class(*), intent(in) :: other
select type (other)
class is (ast_t)
if (associated (other%node)) allocate (ast%node, source = other%node)
!
! Whether it is better to set phase to ‘building’ or to set it
! to ‘other%phase’ is unclear to me. Probably ‘building’ is the
! better choice. Which variable controls memory recovery is
! clear and unchanging, in that case: it is the original,
! ‘other’, that does.
!
ast%phase = building
class default
! This should not happen.
error stop
end select
end subroutine ast_t_assign
subroutine ast_t_finalize (ast)
type(ast_t), intent(inout) :: ast
!
! When we are building the tree, the tree’s nodes should not be
! deallocated when the ast_t variable temporarily holding them
! goes out of scope.
!
! However, once the AST is completed, we do want the memory
! recovered when the variable goes out of scope.
!
! (Elsewhere I have written a primitive garbage collector for
! Fortran programs, but in this case it would be a lot of overhead
! for little gain. In fact, we could reasonably just let the
! memory leak, in this program.
!
! Fortran runtimes *are* allowed by the standard to have garbage
! collectors built in. To my knowledge, at the time of this
! writing, only NAG Fortran has a garbage collector option.)
!
if (ast%phase == completed) then
if (associated (ast%node)) deallocate (ast%node)
end if
end subroutine ast_t_finalize
function parse_token_stream (unit_no) result (ast)
integer, intent(in) :: unit_no
type(ast_t) :: ast
integer(kind = nk) :: lex_line_no
type(ast_t) :: statement
type(lexer_token_t) :: token
lex_line_no = -1_nk
call get_token (unit_no, lex_line_no, token)
call parse_statement (unit_no, lex_line_no, token, statement)
ast = make_internal_node (node_Sequence, ast, statement)
do while (token%token_no /= tk_EOI)
call parse_statement (unit_no, lex_line_no, token, statement)
ast = make_internal_node (node_Sequence, ast, statement)
end do
ast%phase = completed
end function parse_token_stream
recursive subroutine parse_statement (unit_no, lex_line_no, token, ast)
integer, intent(in) :: unit_no
integer(kind = nk), intent(inout) :: lex_line_no
type(lexer_token_t), intent(inout) :: token
type(ast_t), intent(out) :: ast
ast = ast_nil
select case (token%token_no)
case (tk_If)
call parse_ifelse_construct
case (tk_Putc)
call parse_putc
case (tk_Print)
call parse_print
case (tk_Semi)
call get_token (unit_no, lex_line_no, token)
case (tk_Ident)
call parse_identifier
case (tk_While)
call parse_while_construct
case (tk_Lbrace)
call parse_lbrace_construct
case (tk_EOI)
continue
case default
call syntax_error_message ("", tk_start_of_statement, token)
stop 1
end select
contains
recursive subroutine parse_ifelse_construct
type(ast_t) :: predicate
type(ast_t) :: statement_for_predicate_true
type(ast_t) :: statement_for_predicate_false
call expect_token ("If", tk_If, token)
call get_token (unit_no, lex_line_no, token)
call parse_parenthesized_expression (unit_no, lex_line_no, token, predicate)
call parse_statement (unit_no, lex_line_no, token, statement_for_predicate_true)
if (token%token_no == tk_Else) then
call get_token (unit_no, lex_line_no, token)
call parse_statement (unit_no, lex_line_no, token, statement_for_predicate_false)
ast = make_internal_node (node_If, statement_for_predicate_true, &
& statement_for_predicate_false)
else
ast = make_internal_node (node_If, statement_for_predicate_true, ast_nil)
end if
ast = make_internal_node (node_If, predicate, ast)
end subroutine parse_ifelse_construct
recursive subroutine parse_putc
type(ast_t) :: arguments
call expect_token ("Putc", tk_Putc, token)
call get_token (unit_no, lex_line_no, token)
call parse_parenthesized_expression (unit_no, lex_line_no, token, arguments)
ast = make_internal_node (node_Prtc, arguments, ast_nil)
call expect_token ("Putc", tk_Semi, token)
call get_token (unit_no, lex_line_no, token)
end subroutine parse_putc
recursive subroutine parse_print
logical :: done
type(ast_t) :: arg
type(ast_t) :: printer
call expect_token ("Print", tk_Print, token)
call get_token (unit_no, lex_line_no, token)
call expect_token ("Print", tk_Lparen, token)
done = .false.
do while (.not. done)
call get_token (unit_no, lex_line_no, token)
select case (token%token_no)
case (tk_String)
arg = make_leaf_node (node_String, token%val)
printer = make_internal_node (node_Prts, arg, ast_nil)
call get_token (unit_no, lex_line_no, token)
case default
call parse_expression (unit_no, 0, lex_line_no, token, arg)
printer = make_internal_node (node_Prti, arg, ast_nil)
end select
ast = make_internal_node (node_Sequence, ast, printer)
done = (token%token_no /= tk_Comma)
end do
call expect_token ("Print", tk_Rparen, token)
call get_token (unit_no, lex_line_no, token)
call expect_token ("Print", tk_Semi, token)
call get_token (unit_no, lex_line_no, token)
end subroutine parse_print
recursive subroutine parse_identifier
type(ast_t) :: left_side
type(ast_t) :: right_side
left_side = make_leaf_node (node_Identifier, token%val)
call get_token (unit_no, lex_line_no, token)
call expect_token ("assign", tk_Assign, token)
call get_token (unit_no, lex_line_no, token)
call parse_expression (unit_no, 0, lex_line_no, token, right_side)
ast = make_internal_node (node_Assign, left_side, right_side)
call expect_token ("assign", tk_Semi, token)
call get_token (unit_no, lex_line_no, token)
end subroutine parse_identifier
recursive subroutine parse_while_construct
type(ast_t) :: predicate
type(ast_t) :: statement_to_be_repeated
call expect_token ("While", tk_While, token)
call get_token (unit_no, lex_line_no, token)
call parse_parenthesized_expression (unit_no, lex_line_no, token, predicate)
call parse_statement (unit_no, lex_line_no, token, statement_to_be_repeated)
ast = make_internal_node (node_While, predicate, statement_to_be_repeated)
end subroutine parse_while_construct
recursive subroutine parse_lbrace_construct
type(ast_t) :: statement
call expect_token ("Lbrace", tk_Lbrace, token)
call get_token (unit_no, lex_line_no, token)
do while (token%token_no /= tk_Rbrace .and. token%token_no /= tk_EOI)
call parse_statement (unit_no, lex_line_no, token, statement)
ast = make_internal_node (node_Sequence, ast, statement)
end do
call expect_token ("Lbrace", tk_Rbrace, token)
call get_token (unit_no, lex_line_no, token)
end subroutine parse_lbrace_construct
end subroutine parse_statement
recursive subroutine parse_expression (unit_no, p, lex_line_no, token, ast)
integer, intent(in) :: unit_no
integer, intent(in) :: p
integer(kind = nk), intent(inout) :: lex_line_no
type(lexer_token_t), intent(inout) :: token
type(ast_t), intent(out) :: ast
integer :: precedence
type(ast_t) :: expression
select case (token%token_no)
case (tk_Lparen)
call parse_parenthesized_expression (unit_no, lex_line_no, token, ast)
case (tk_Sub)
token%token_no = tk_Negate
precedence = lexer_token_precedence(token%token_no)
call get_token (unit_no, lex_line_no, token)
call parse_expression (unit_no, precedence, lex_line_no, token, expression)
ast = make_internal_node (node_Negate, expression, ast_nil)
case (tk_Add)
token%token_no = tk_Positive
precedence = lexer_token_precedence(token%token_no)
call get_token (unit_no, lex_line_no, token)
call parse_expression (unit_no, precedence, lex_line_no, token, expression)
ast = expression
case (tk_Not)
precedence = lexer_token_precedence(token%token_no)
call get_token (unit_no, lex_line_no, token)
call parse_expression (unit_no, precedence, lex_line_no, token, expression)
ast = make_internal_node (node_Not, expression, ast_nil)
case (tk_Ident)
ast = make_leaf_node (node_Identifier, token%val)
call get_token (unit_no, lex_line_no, token)
case (tk_Integer)
ast = make_leaf_node (node_Integer, token%val)
call get_token (unit_no, lex_line_no, token)
case default
call syntax_error_message ("", tk_primary, token)
stop 1
end select
do while (lexer_token_arity(token%token_no) == 2 .and. &
& p <= lexer_token_precedence(token%token_no))
block
type(ast_t) :: right_expression
integer :: q
integer :: node_variety
if (lexer_token_associativity(token%token_no) == right_associative) then
q = lexer_token_precedence(token%token_no)
else
q = lexer_token_precedence(token%token_no) + 1
end if
node_variety = binary_operator_node_variety (token%token_no)
call get_token (unit_no, lex_line_no, token)
call parse_expression (unit_no, q, lex_line_no, token, right_expression)
ast = make_internal_node (node_variety, ast, right_expression)
end block
end do
end subroutine parse_expression
recursive subroutine parse_parenthesized_expression (unit_no, lex_line_no, token, ast)
integer, intent(in) :: unit_no
integer(kind = nk), intent(inout) :: lex_line_no
type(lexer_token_t), intent(inout) :: token
type(ast_t), intent(out) :: ast
call expect_token ("paren_expr", tk_Lparen, token)
call get_token (unit_no, lex_line_no, token)
call parse_expression (unit_no, 0, lex_line_no, token, ast)
call expect_token ("paren_expr", tk_Rparen, token)
call get_token (unit_no, lex_line_no, token)
end subroutine parse_parenthesized_expression
elemental function binary_operator_node_variety (token_no) result (node_variety)
integer, intent(in) :: token_no
integer :: node_variety
select case (token_no)
case (tk_Mul)
node_variety = node_Multiply
case (tk_Div)
node_variety = node_Divide
case (tk_Mod)
node_variety = node_Mod
case (tk_Add)
node_variety = node_Add
case (tk_Sub)
node_variety = node_Subtract
case (tk_Lss)
node_variety = node_Less
case (tk_Leq)
node_variety = node_LessEqual
case (tk_Gtr)
node_variety = node_Greater
case (tk_Geq)
node_variety = node_GreaterEqual
case (tk_Eq)
node_variety = node_Equal
case (tk_Neq)
node_variety = node_NotEqual
case (tk_And)
node_variety = node_And
case (tk_Or)
node_variety = node_Or
case default
! This branch should never be reached.
error stop
end select
end function binary_operator_node_variety
function make_internal_node (node_variety, left, right) result (ast)
integer, intent(in) :: node_variety
class(ast_t), intent(in) :: left, right
type(ast_t) :: ast
type(ast_node_t), pointer :: node
allocate (node)
node%node_variety = node_variety
node%left => left%node
node%right => right%node
ast%node => node
end function make_internal_node
function make_leaf_node (node_variety, val) result (ast)
integer, intent(in) :: node_variety
character(*, kind = ck), intent(in) :: val
type(ast_t) :: ast
type(ast_node_t), pointer :: node
allocate (node)
node%node_variety = node_variety
node%val = val
ast%node => node
end function make_leaf_node
subroutine get_token (unit_no, lex_line_no, token)
integer, intent(in) :: unit_no
integer(kind = nk), intent(inout) :: lex_line_no
type(lexer_token_t), intent(out) :: token
logical :: eof
call get_lexer_token (unit_no, lex_line_no, eof, token)
if (eof) then
write (error_unit, '("Parser error: the stream of input tokens is incomplete")')
stop 1
end if
end subroutine get_token
subroutine expect_token (message, token_no, token)
character(*), intent(in) :: message
integer, intent (in) :: token_no
class(lexer_token_t), intent(in) :: token
if (token%token_no /= token_no) then
call syntax_error_message (message, token_no, token)
stop 1
end if
end subroutine expect_token
subroutine syntax_error_message (message, expected_token_no, token)
character(*), intent(in) :: message
integer, intent(in) :: expected_token_no
class(lexer_token_t), intent(in) :: token
! Write a message to an output unit dedicated to printing
! errors. The message could, of course, be more detailed than what
! we are doing here.
write (error_unit, '("Syntax error at ", I0, ".", I0)') &
& token%line_no, token%column_no
!
! For the sake of the exercise, also write, to output_unit, a
! message in the style of the C reference program.
!
write (output_unit, '("(", I0, ", ", I0, ") error: ")', advance = 'no') &
& token%line_no, token%column_no
select case (expected_token_no)
case (tk_start_of_statement)
write (output_unit, '("expecting start of statement, found ''", 1A, "''")') &
& trim (lexer_token_string(token%token_no))
case (tk_primary)
write (output_unit, '("Expecting a primary, found ''", 1A, "''")') &
& trim (lexer_token_string(token%token_no))
case default
write (output_unit, '(1A, ": Expecting ''", 1A, "'', found ''", 1A, "''")') &
& trim (message), trim (lexer_token_string(expected_token_no)), &
& trim (lexer_token_string(token%token_no))
end select
end subroutine syntax_error_message
subroutine output_ast_flattened (unit_no, ast)
integer, intent(in) :: unit_no
type(ast_t), intent(in) :: ast
call output_ast_node_flattened (unit_no, ast%node)
end subroutine output_ast_flattened
recursive subroutine output_ast_node_flattened (unit_no, node)
integer, intent(in) :: unit_no
type(ast_node_t), pointer, intent(in) :: node
if (.not. associated (node)) then
write (unit_no, '(";")')
else
if (allocated (node%val)) then
write (unit_no, '(1A16, 2X, 1A)') &
& node_variety_string(node%node_variety), node%val
else
write (unit_no, '(1A)') &
& trim (node_variety_string(node%node_variety))
call output_ast_node_flattened (unit_no, node%left)
call output_ast_node_flattened (unit_no, node%right)
end if
end if
end subroutine output_ast_node_flattened
end module syntactic_analysis
program parse
use, intrinsic :: iso_fortran_env, only: input_unit
use, intrinsic :: iso_fortran_env, only: output_unit
use, intrinsic :: iso_fortran_env, only: error_unit
use, non_intrinsic :: syntactic_analysis
implicit none
integer, parameter :: inp_unit_no = 100
integer, parameter :: outp_unit_no = 101
integer :: arg_count
character(200) :: arg
integer :: inp
integer :: outp
arg_count = command_argument_count ()
if (3 <= arg_count) then
call print_usage
else
if (arg_count == 0) then
inp = input_unit
outp = output_unit
else if (arg_count == 1) then
call get_command_argument (1, arg)
inp = open_for_input (trim (arg))
outp = output_unit
else if (arg_count == 2) then
call get_command_argument (1, arg)
inp = open_for_input (trim (arg))
call get_command_argument (2, arg)
outp = open_for_output (trim (arg))
end if
block
type(ast_t) :: ast
ast = parse_token_stream (inp)
call output_ast_flattened (outp, ast)
end block
end if
contains
function open_for_input (filename) result (unit_no)
character(*), intent(in) :: filename
integer :: unit_no
integer :: stat
open (unit = inp_unit_no, file = filename, status = 'old', &
& action = 'read', access = 'stream', form = 'unformatted', &
& iostat = stat)
if (stat /= 0) then
write (error_unit, '("Error: failed to open ", 1A, " for input")') filename
stop 1
end if
unit_no = inp_unit_no
end function open_for_input
function open_for_output (filename) result (unit_no)
character(*), intent(in) :: filename
integer :: unit_no
integer :: stat
open (unit = outp_unit_no, file = filename, action = 'write', iostat = stat)
if (stat /= 0) then
write (error_unit, '("Error: failed to open ", 1A, " for output")') filename
stop 1
end if
unit_no = outp_unit_no
end function open_for_output
subroutine print_usage
character(200) :: progname
call get_command_argument (0, progname)
write (output_unit, '("Usage: ", 1A, " [INPUT_FILE [OUTPUT_FILE]]")') &
& trim (progname)
end subroutine print_usage
end program parse
- Output:
Prime numbers example:
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier count
Integer 1
Assign
Identifier n
Integer 1
Assign
Identifier limit
Integer 100
While
Less
Identifier n
Identifier limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier k
Integer 3
Assign
Identifier p
Integer 1
Assign
Identifier n
Add
Identifier n
Integer 2
While
And
LessEqual
Multiply
Identifier k
Identifier k
Identifier n
Identifier p
Sequence
Sequence
;
Assign
Identifier p
NotEqual
Multiply
Divide
Identifier n
Identifier k
Identifier k
Identifier n
Assign
Identifier k
Add
Identifier k
Integer 2
If
Identifier p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier n
;
Prts
String " is prime\n"
;
Assign
Identifier count
Add
Identifier count
Integer 1
;
Sequence
Sequence
Sequence
;
Prts
String "Total primes found: "
;
Prti
Identifier count
;
Prts
String "\n"
;
Go
package main
import (
"bufio"
"fmt"
"log"
"os"
"strconv"
"strings"
)
type TokenType int
const (
tkEOI TokenType = iota
tkMul
tkDiv
tkMod
tkAdd
tkSub
tkNegate
tkNot
tkLss
tkLeq
tkGtr
tkGeq
tkEql
tkNeq
tkAssign
tkAnd
tkOr
tkIf
tkElse
tkWhile
tkPrint
tkPutc
tkLparen
tkRparen
tkLbrace
tkRbrace
tkSemi
tkComma
tkIdent
tkInteger
tkString
)
type NodeType int
const (
ndIdent NodeType = iota
ndString
ndInteger
ndSequence
ndIf
ndPrtc
ndPrts
ndPrti
ndWhile
ndAssign
ndNegate
ndNot
ndMul
ndDiv
ndMod
ndAdd
ndSub
ndLss
ndLeq
ndGtr
ndGeq
ndEql
ndNeq
ndAnd
ndOr
)
type tokS struct {
tok TokenType
errLn int
errCol int
text string // ident or string literal or integer value
}
type Tree struct {
nodeType NodeType
left *Tree
right *Tree
value string
}
// dependency: Ordered by tok, must remain in same order as TokenType consts
type atr struct {
text string
enumText string
tok TokenType
rightAssociative bool
isBinary bool
isUnary bool
precedence int
nodeType NodeType
}
var atrs = []atr{
{"EOI", "End_of_input", tkEOI, false, false, false, -1, -1},
{"*", "Op_multiply", tkMul, false, true, false, 13, ndMul},
{"/", "Op_divide", tkDiv, false, true, false, 13, ndDiv},
{"%", "Op_mod", tkMod, false, true, false, 13, ndMod},
{"+", "Op_add", tkAdd, false, true, false, 12, ndAdd},
{"-", "Op_subtract", tkSub, false, true, false, 12, ndSub},
{"-", "Op_negate", tkNegate, false, false, true, 14, ndNegate},
{"!", "Op_not", tkNot, false, false, true, 14, ndNot},
{"<", "Op_less", tkLss, false, true, false, 10, ndLss},
{"<=", "Op_lessequal", tkLeq, false, true, false, 10, ndLeq},
{">", "Op_greater", tkGtr, false, true, false, 10, ndGtr},
{">=", "Op_greaterequal", tkGeq, false, true, false, 10, ndGeq},
{"==", "Op_equal", tkEql, false, true, false, 9, ndEql},
{"!=", "Op_notequal", tkNeq, false, true, false, 9, ndNeq},
{"=", "Op_assign", tkAssign, false, false, false, -1, ndAssign},
{"&&", "Op_and", tkAnd, false, true, false, 5, ndAnd},
{"||", "Op_or", tkOr, false, true, false, 4, ndOr},
{"if", "Keyword_if", tkIf, false, false, false, -1, ndIf},
{"else", "Keyword_else", tkElse, false, false, false, -1, -1},
{"while", "Keyword_while", tkWhile, false, false, false, -1, ndWhile},
{"print", "Keyword_print", tkPrint, false, false, false, -1, -1},
{"putc", "Keyword_putc", tkPutc, false, false, false, -1, -1},
{"(", "LeftParen", tkLparen, false, false, false, -1, -1},
{")", "RightParen", tkRparen, false, false, false, -1, -1},
{"{", "LeftBrace", tkLbrace, false, false, false, -1, -1},
{"}", "RightBrace", tkRbrace, false, false, false, -1, -1},
{";", "Semicolon", tkSemi, false, false, false, -1, -1},
{",", "Comma", tkComma, false, false, false, -1, -1},
{"Ident", "Identifier", tkIdent, false, false, false, -1, ndIdent},
{"Integer literal", "Integer", tkInteger, false, false, false, -1, ndInteger},
{"String literal", "String", tkString, false, false, false, -1, ndString},
}
var displayNodes = []string{
"Identifier", "String", "Integer", "Sequence", "If", "Prtc", "Prts", "Prti",
"While", "Assign", "Negate", "Not", "Multiply", "Divide", "Mod", "Add",
"Subtract", "Less", "LessEqual", "Greater", "GreaterEqual", "Equal",
"NotEqual", "And", "Or",
}
var (
err error
token tokS
scanner *bufio.Scanner
)
func reportError(errLine, errCol int, msg string) {
log.Fatalf("(%d, %d) error : %s\n", errLine, errCol, msg)
}
func check(err error) {
if err != nil {
log.Fatal(err)
}
}
func getEum(name string) TokenType { // return internal version of name
for _, atr := range atrs {
if atr.enumText == name {
return atr.tok
}
}
reportError(0, 0, fmt.Sprintf("Unknown token %s\n", name))
return tkEOI
}
func getTok() tokS {
tok := tokS{}
if scanner.Scan() {
line := strings.TrimRight(scanner.Text(), " \t")
fields := strings.Fields(line)
// [ ]*{lineno}[ ]+{colno}[ ]+token[ ]+optional
tok.errLn, err = strconv.Atoi(fields[0])
check(err)
tok.errCol, err = strconv.Atoi(fields[1])
check(err)
tok.tok = getEum(fields[2])
le := len(fields)
if le == 4 {
tok.text = fields[3]
} else if le > 4 {
idx := strings.Index(line, `"`)
tok.text = line[idx:]
}
}
check(scanner.Err())
return tok
}
func makeNode(nodeType NodeType, left *Tree, right *Tree) *Tree {
return &Tree{nodeType, left, right, ""}
}
func makeLeaf(nodeType NodeType, value string) *Tree {
return &Tree{nodeType, nil, nil, value}
}
func expect(msg string, s TokenType) {
if token.tok == s {
token = getTok()
return
}
reportError(token.errLn, token.errCol,
fmt.Sprintf("%s: Expecting '%s', found '%s'\n", msg, atrs[s].text, atrs[token.tok].text))
}
func expr(p int) *Tree {
var x, node *Tree
switch token.tok {
case tkLparen:
x = parenExpr()
case tkSub, tkAdd:
op := token.tok
token = getTok()
node = expr(atrs[tkNegate].precedence)
if op == tkSub {
x = makeNode(ndNegate, node, nil)
} else {
x = node
}
case tkNot:
token = getTok()
x = makeNode(ndNot, expr(atrs[tkNot].precedence), nil)
case tkIdent:
x = makeLeaf(ndIdent, token.text)
token = getTok()
case tkInteger:
x = makeLeaf(ndInteger, token.text)
token = getTok()
default:
reportError(token.errLn, token.errCol,
fmt.Sprintf("Expecting a primary, found: %s\n", atrs[token.tok].text))
}
for atrs[token.tok].isBinary && atrs[token.tok].precedence >= p {
op := token.tok
token = getTok()
q := atrs[op].precedence
if !atrs[op].rightAssociative {
q++
}
node = expr(q)
x = makeNode(atrs[op].nodeType, x, node)
}
return x
}
func parenExpr() *Tree {
expect("parenExpr", tkLparen)
t := expr(0)
expect("parenExpr", tkRparen)
return t
}
func stmt() *Tree {
var t, v, e, s, s2 *Tree
switch token.tok {
case tkIf:
token = getTok()
e = parenExpr()
s = stmt()
s2 = nil
if token.tok == tkElse {
token = getTok()
s2 = stmt()
}
t = makeNode(ndIf, e, makeNode(ndIf, s, s2))
case tkPutc:
token = getTok()
e = parenExpr()
t = makeNode(ndPrtc, e, nil)
expect("Putc", tkSemi)
case tkPrint: // print '(' expr {',' expr} ')'
token = getTok()
for expect("Print", tkLparen); ; expect("Print", tkComma) {
if token.tok == tkString {
e = makeNode(ndPrts, makeLeaf(ndString, token.text), nil)
token = getTok()
} else {
e = makeNode(ndPrti, expr(0), nil)
}
t = makeNode(ndSequence, t, e)
if token.tok != tkComma {
break
}
}
expect("Print", tkRparen)
expect("Print", tkSemi)
case tkSemi:
token = getTok()
case tkIdent:
v = makeLeaf(ndIdent, token.text)
token = getTok()
expect("assign", tkAssign)
e = expr(0)
t = makeNode(ndAssign, v, e)
expect("assign", tkSemi)
case tkWhile:
token = getTok()
e = parenExpr()
s = stmt()
t = makeNode(ndWhile, e, s)
case tkLbrace: // {stmt}
for expect("Lbrace", tkLbrace); token.tok != tkRbrace && token.tok != tkEOI; {
t = makeNode(ndSequence, t, stmt())
}
expect("Lbrace", tkRbrace)
case tkEOI:
// do nothing
default:
reportError(token.errLn, token.errCol,
fmt.Sprintf("expecting start of statement, found '%s'\n", atrs[token.tok].text))
}
return t
}
func parse() *Tree {
var t *Tree
token = getTok()
for {
t = makeNode(ndSequence, t, stmt())
if t == nil || token.tok == tkEOI {
break
}
}
return t
}
func prtAst(t *Tree) {
if t == nil {
fmt.Print(";\n")
} else {
fmt.Printf("%-14s ", displayNodes[t.nodeType])
if t.nodeType == ndIdent || t.nodeType == ndInteger || t.nodeType == ndString {
fmt.Printf("%s\n", t.value)
} else {
fmt.Println()
prtAst(t.left)
prtAst(t.right)
}
}
}
func main() {
source, err := os.Open("source.txt")
check(err)
defer source.Close()
scanner = bufio.NewScanner(source)
prtAst(parse())
}
- Output:
Prime numbers example:
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier count
Integer 1
Assign
Identifier n
Integer 1
Assign
Identifier limit
Integer 100
While
Less
Identifier n
Identifier limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier k
Integer 3
Assign
Identifier p
Integer 1
Assign
Identifier n
Add
Identifier n
Integer 2
While
And
LessEqual
Multiply
Identifier k
Identifier k
Identifier n
Identifier p
Sequence
Sequence
;
Assign
Identifier p
NotEqual
Multiply
Divide
Identifier n
Identifier k
Identifier k
Identifier n
Assign
Identifier k
Add
Identifier k
Integer 2
If
Identifier p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier n
;
Prts
String " is prime\n"
;
Assign
Identifier count
Add
Identifier count
Integer 1
;
Sequence
Sequence
Sequence
;
Prts
String "Total primes found: "
;
Prti
Identifier count
;
Prts
String "\n"
;
Icon
I use co-expressions in a way that could easily be done without them, but I prefer the co-expressions. (They can be sluggish or fast, depending on which Icon implementation you are running; in this case, the speed difference is of little concern.)
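Here the co-expression acts as a suspendable token producer that the parser resumes each time it wants the next token. Purely as an illustration, and not part of the Icon solution, the same producer/consumer shape can be sketched with a Python generator over the token-line format produced by the lexical analyzer task:

```python
def tokens(lines):
    """Yield (line, col, name, value) tuples on demand, much as an Icon
    co-expression suspends a token and is reactivated for the next one."""
    for line in lines:
        parts = line.split(None, 3)  # lineno, colno, token name, optional value
        ln, col, name = int(parts[0]), int(parts[1]), parts[2]
        value = parts[3] if len(parts) > 3 else None
        yield ln, col, name, value

# The parser would pull one token at a time with next(stream).
stream = tokens(["1 1 Identifier count", "1 7 Op_assign", "1 9 Integer 1"])
```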
- Output:
$ icont -s -u parse.icn && ./parse compiler-tests/primes.lex
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier count
Integer 1
Assign
Identifier n
Integer 1
Assign
Identifier limit
Integer 100
While
Less
Identifier n
Identifier limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier k
Integer 3
Assign
Identifier p
Integer 1
Assign
Identifier n
Add
Identifier n
Integer 2
While
And
LessEqual
Multiply
Identifier k
Identifier k
Identifier n
Identifier p
Sequence
Sequence
;
Assign
Identifier p
NotEqual
Multiply
Divide
Identifier n
Identifier k
Identifier k
Identifier n
Assign
Identifier k
Add
Identifier k
Integer 2
If
Identifier p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier n
;
Prts
String " is prime\n"
;
Assign
Identifier count
Add
Identifier count
Integer 1
;
Sequence
Sequence
Sequence
;
Prts
String "Total primes found: "
;
Prti
Identifier count
;
Prts
String "\n"
;
J
Implementation:
require'format/printf'
tkref=: tokenize 'End_of_input*/%+--<<=>>===!=!&&||print=print(if{else}while;,putc)a""0'
tkref,. (tknames)=: tknames=:;: {{)n
End_of_input Op_multiply Op_divide Op_mod Op_add Op_subtract Op_negate Op_less
Op_lessequal Op_greater Op_greaterequal Op_equal Op_notequal Op_not Op_and
Op_or Keyword_print Op_assign Keyword_print LeftParen Keyword_if LeftBrace Keyword_else RightBrace
Keyword_while Semicolon Comma Keyword_putc RightParen
Identifier String Integer
}}-.LF
tkV=: 2 (tkref i.tokenize '*/%+-<<=>>===!=&&||')} (#tktyp)#0
tkV=: 1 (1 0+tkref i.tokenize '-!')} tkV
tkPrec=: 13 13 13 12 12 10 10 10 10 9 9 5 4 (tkref i.tokenize'*/%+-<<=>>==!=&&||')} tkV<._1
tkPrec=: 14 (1 0+tkref i.tokenize'-!')} tkPrec
NB. proofread |:(<"1 tkV,.tkPrec),tkref,:tknames
tkref,.(ndDisp)=: ndDisp=:;:{{)n
Sequence Multiply Divide Mod Add Subtract Negate Less LessEqual Greater
GreaterEqual Equal NotEqual Not And Or Prts Assign Prti x If x x x While
x x Prtc x Identifier String Integer
}}-.LF
NB. proofread |:tkref,:ndDisp
gettoken=: {{
'tok_ln tok_col'=: (0;ndx){::x
'tok_name tok_value'=: (1;ndx){::x
if. 'Error'-:tok_name do.
error 'invalid word ',":tok_value
end.
ind=. tknames i.<tok_name
tok_text=: ind{::tkref
tok_valence=: ind{::tkV
tok_precedence=: ind{::tkPrec
ndx=:ndx+1
node_display=: ind{::ndDisp
}}
parse=: {{
ndx=: tok_ln=: tok_col=: 0
gettok=: y&gettoken
gettok''
t=.a:
whilst.-.(a:-:t)+.tok_name-:End_of_input do.
t=. Sequence make_node t stmt''
end.
}}
stmt=:{{)v
t=. a:
select.tok_name
case.Keyword_if do.
s=. stmt e=. paren_expr gettok''
if.Keyword_else-:tok_name
do. S=. stmt gettok''
else. S=. a: end.
t=. If make_node e If make_node s S
case.Keyword_putc do.
e=. paren_expr gettok''
t=. Prtc make_node e a:
Prtc expect Semicolon
case.Keyword_print do.gettok''
'Print' expect LeftParen
while.do.
if.String-:tok_name
do. gettok e=. Prts make_node (String make_leaf tok_value) a:
else. e=. Prti make_node (expr 0) a: end.
t=. Sequence make_node t e
if.Comma-:tok_name
do.Comma expect Comma
else.break.end.
end.
'Print' expect RightParen
'Print' expect Semicolon
case.Semicolon do.gettok''
case.Identifier do.
gettok v=. Identifier make_leaf tok_value
Assign expect Op_assign
t=. Assign make_node v e=. expr 0
Assign expect Semicolon
case.Keyword_while do.
t=. While make_node e s=. stmt e=. paren_expr gettok''
case.LeftBrace do.
'LeftBrace' expect LeftBrace
while.-.(<tok_name) e. RightBrace;End_of_input do.
t=. Sequence make_node t stmt''
end.
'LeftBrace' expect RightBrace
case.End_of_input do.
case.do. error 'Expecting start of statement, found %s'sprintf<tok_text
end.
t
}}
paren_expr=: {{
'paren_expr' expect LeftParen
t=. expr 0
'paren_expr' expect RightParen
t
}}
not_prec=: tkPrec{~tknames i.<Op_not
expr=: {{
select.tok_name
case.LeftParen do.e=. paren_expr''
case.Op_add do.gettok''
e=. expr not_prec
case.Op_subtract do.gettok''
e=. Negate make_node (expr not_prec) a:
case.Op_not do.gettok''
e=. Not make_node (expr not_prec) a:
case.Identifier do.
gettok e=. Identifier make_leaf tok_value
case.Integer do.
gettok e=. Integer make_leaf tok_value
case.do. error 'Expecting a primary, found %s'sprintf<tok_text
end.
while.(2=tok_valence)*tok_precedence>:y do.
q=. 1+tok_precedence [ op=. node_display NB. no right associative operators
gettok''
node=. expr q
e=. op make_node e node
end.
e
}}
expect=: {{
if.y-:tok_name do. gettok'' return.end.
error '%s: Expecting "%s", found "%s"'sprintf x;(tkref{::~tknames i.<y);tok_text
}}
make_leaf=: {{
x;y
}}
make_node=: {{
m;n;<y
}}
error=: {{
echo 'Error: line %d, column %d: %s\n'sprintf tok_ln;tok_col;y throw.
}}
syntax=: {{
;(flatAST parse y),each LF
}}
flatAST=: {{
assert.*L.y
select.#y
case.1 do.<';' assert.y-:a:
case.2 do.<;:inv ":each y
case.3 do.({.y),(flatAST 1{::y),flatAST 2{::y
case.do.assert.0
end.
}}
Some quirks worth noting:
(1) '+' appears in the productions for 'primary' and 'addition_expr' but has only one node type (because we do not represent its appearance in 'primary' with a node).
(2) '-' and 'print' do have two node types (which we sort out on the fly).
(3) In this implementation, we require a 1:1 mapping between the data structure representing token types and the data structure representing node types. This means two token entries for both - and print (the second instance of both gets ignored by the lexer).
(4) Because the data structure produced by the lexer is independent of any type system implementation, the parser can use the same type system as the lexer or a different one, and either way works (as long as the implementations are consistent with the spec).
(5) In this context parallel constant arrays represent token and node types.
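The table-driven scheme sketched in these notes (parallel arrays keyed by token type, giving each operator a precedence and arity) is the same precedence-climbing technique the other solutions on this page use in their expr routines. A minimal, hypothetical Python sketch of just the binary-operator loop, with invented token spellings and unary operators omitted:

```python
# Hypothetical sketch of table-driven precedence climbing.
# Precedences match the grammar used throughout this task.
PREC = {'*': 13, '/': 13, '%': 13, '+': 12, '-': 12,
        '<': 10, '<=': 10, '>': 10, '>=': 10,
        '==': 9, '!=': 9, '&&': 5, '||': 4}

def parse_expr(toks, p=0):
    """toks is a flat token list, consumed destructively from the front."""
    tok = toks.pop(0)
    if tok == '(':
        node = parse_expr(toks, 0)
        assert toks.pop(0) == ')'
    else:
        node = tok  # identifier or integer leaf
    # Binary loop: keep folding while the next operator binds at least as
    # tightly as p.  All operators in this grammar are left associative,
    # so the right operand is parsed at precedence + 1.
    while toks and toks[0] in PREC and PREC[toks[0]] >= p:
        op = toks.pop(0)
        right = parse_expr(toks, PREC[op] + 1)
        node = (op, node, right)
    return node
```

Because `/` and `*` share precedence 13 and associate left, `n / k * k != n` parses as `((n / k) * k) != n`, which is the shape the reference output expects.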
Task example:
primes=: {{)n
/*
Simple prime number generator
*/
count = 1;
n = 1;
limit = 100;
while (n < limit) {
k=3;
p=1;
n=n+2;
while ((k*k<=n) && (p)) {
p=n/k*k!=n;
k=k+2;
}
if (p) {
print(n, " is prime\n");
count = count + 1;
}
}
print("Total primes found: ", count, "\n");
}}
syntax lex primes
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier count
Integer 1
Assign
Identifier n
Integer 1
Assign
Identifier limit
Integer 100
While
Less
Identifier n
Identifier limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier k
Integer 3
Assign
Identifier p
Integer 1
Assign
Identifier n
Add
Identifier n
Integer 2
While
And
LessEqual
Multiply
Identifier k
Identifier k
Identifier n
Identifier p
Sequence
Sequence
;
Assign
Identifier p
NotEqual
Divide
Identifier n
Multiply
Identifier k
Identifier k
Identifier n
Assign
Identifier k
Add
Identifier k
Integer 2
If
Identifier p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier n
;
Prts
String " is prime\n"
;
Assign
Identifier count
Add
Identifier count
Integer 1
;
Sequence
Sequence
Sequence
;
Prts
String "Total primes found: "
;
Prti
Identifier count
;
Prts
String "\n"
;
Java
Usage: java Parser infile [>outfile]
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.StringTokenizer;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.util.HashMap;
class Parser {
private List<Token> source;
private Token token;
private int position;
static class Node {
public NodeType nt;
public Node left, right;
public String value;
Node() {
this.nt = null;
this.left = null;
this.right = null;
this.value = null;
}
Node(NodeType node_type, Node left, Node right, String value) {
this.nt = node_type;
this.left = left;
this.right = right;
this.value = value;
}
public static Node make_node(NodeType nodetype, Node left, Node right) {
return new Node(nodetype, left, right, "");
}
public static Node make_node(NodeType nodetype, Node left) {
return new Node(nodetype, left, null, "");
}
public static Node make_leaf(NodeType nodetype, String value) {
return new Node(nodetype, null, null, value);
}
}
static class Token {
public TokenType tokentype;
public String value;
public int line;
public int pos;
Token(TokenType token, String value, int line, int pos) {
this.tokentype = token; this.value = value; this.line = line; this.pos = pos;
}
@Override
public String toString() {
return String.format("%5d %5d %-15s %s", this.line, this.pos, this.tokentype, this.value);
}
}
static enum TokenType {
End_of_input(false, false, false, -1, NodeType.nd_None),
Op_multiply(false, true, false, 13, NodeType.nd_Mul),
Op_divide(false, true, false, 13, NodeType.nd_Div),
Op_mod(false, true, false, 13, NodeType.nd_Mod),
Op_add(false, true, false, 12, NodeType.nd_Add),
Op_subtract(false, true, false, 12, NodeType.nd_Sub),
Op_negate(false, false, true, 14, NodeType.nd_Negate),
Op_not(false, false, true, 14, NodeType.nd_Not),
Op_less(false, true, false, 10, NodeType.nd_Lss),
Op_lessequal(false, true, false, 10, NodeType.nd_Leq),
Op_greater(false, true, false, 10, NodeType.nd_Gtr),
Op_greaterequal(false, true, false, 10, NodeType.nd_Geq),
Op_equal(false, true, false, 9, NodeType.nd_Eql),
Op_notequal(false, true, false, 9, NodeType.nd_Neq),
Op_assign(false, false, false, -1, NodeType.nd_Assign),
Op_and(false, true, false, 5, NodeType.nd_And),
Op_or(false, true, false, 4, NodeType.nd_Or),
Keyword_if(false, false, false, -1, NodeType.nd_If),
Keyword_else(false, false, false, -1, NodeType.nd_None),
Keyword_while(false, false, false, -1, NodeType.nd_While),
Keyword_print(false, false, false, -1, NodeType.nd_None),
Keyword_putc(false, false, false, -1, NodeType.nd_None),
LeftParen(false, false, false, -1, NodeType.nd_None),
RightParen(false, false, false, -1, NodeType.nd_None),
LeftBrace(false, false, false, -1, NodeType.nd_None),
RightBrace(false, false, false, -1, NodeType.nd_None),
Semicolon(false, false, false, -1, NodeType.nd_None),
Comma(false, false, false, -1, NodeType.nd_None),
Identifier(false, false, false, -1, NodeType.nd_Ident),
Integer(false, false, false, -1, NodeType.nd_Integer),
String(false, false, false, -1, NodeType.nd_String);
private final int precedence;
private final boolean right_assoc;
private final boolean is_binary;
private final boolean is_unary;
private final NodeType node_type;
TokenType(boolean right_assoc, boolean is_binary, boolean is_unary, int precedence, NodeType node) {
this.right_assoc = right_assoc;
this.is_binary = is_binary;
this.is_unary = is_unary;
this.precedence = precedence;
this.node_type = node;
}
boolean isRightAssoc() { return this.right_assoc; }
boolean isBinary() { return this.is_binary; }
boolean isUnary() { return this.is_unary; }
int getPrecedence() { return this.precedence; }
NodeType getNodeType() { return this.node_type; }
}
static enum NodeType {
nd_None(""), nd_Ident("Identifier"), nd_String("String"), nd_Integer("Integer"), nd_Sequence("Sequence"), nd_If("If"),
nd_Prtc("Prtc"), nd_Prts("Prts"), nd_Prti("Prti"), nd_While("While"),
nd_Assign("Assign"), nd_Negate("Negate"), nd_Not("Not"), nd_Mul("Multiply"), nd_Div("Divide"), nd_Mod("Mod"), nd_Add("Add"),
nd_Sub("Subtract"), nd_Lss("Less"), nd_Leq("LessEqual"),
nd_Gtr("Greater"), nd_Geq("GreaterEqual"), nd_Eql("Equal"), nd_Neq("NotEqual"), nd_And("And"), nd_Or("Or");
private final String name;
NodeType(String name) {
this.name = name;
}
@Override
public String toString() { return this.name; }
}
static void error(int line, int pos, String msg) {
if (line > 0 && pos > 0) {
System.out.printf("%s in line %d, pos %d\n", msg, line, pos);
} else {
System.out.println(msg);
}
System.exit(1);
}
Parser(List<Token> source) {
this.source = source;
this.token = null;
this.position = 0;
}
Token getNextToken() {
this.token = this.source.get(this.position++);
return this.token;
}
Node expr(int p) {
Node result = null, node;
TokenType op;
int q;
if (this.token.tokentype == TokenType.LeftParen) {
result = paren_expr();
} else if (this.token.tokentype == TokenType.Op_add || this.token.tokentype == TokenType.Op_subtract) {
op = (this.token.tokentype == TokenType.Op_subtract) ? TokenType.Op_negate : TokenType.Op_add;
getNextToken();
node = expr(TokenType.Op_negate.getPrecedence());
result = (op == TokenType.Op_negate) ? Node.make_node(NodeType.nd_Negate, node) : node;
} else if (this.token.tokentype == TokenType.Op_not) {
getNextToken();
result = Node.make_node(NodeType.nd_Not, expr(TokenType.Op_not.getPrecedence()));
} else if (this.token.tokentype == TokenType.Identifier) {
result = Node.make_leaf(NodeType.nd_Ident, this.token.value);
getNextToken();
} else if (this.token.tokentype == TokenType.Integer) {
result = Node.make_leaf(NodeType.nd_Integer, this.token.value);
getNextToken();
} else {
error(this.token.line, this.token.pos, "Expecting a primary, found: " + this.token.tokentype);
}
while (this.token.tokentype.isBinary() && this.token.tokentype.getPrecedence() >= p) {
op = this.token.tokentype;
getNextToken();
q = op.getPrecedence();
if (!op.isRightAssoc()) {
q++;
}
node = expr(q);
result = Node.make_node(op.getNodeType(), result, node);
}
return result;
}
Node paren_expr() {
expect("paren_expr", TokenType.LeftParen);
Node node = expr(0);
expect("paren_expr", TokenType.RightParen);
return node;
}
void expect(String msg, TokenType s) {
if (this.token.tokentype == s) {
getNextToken();
return;
}
error(this.token.line, this.token.pos, msg + ": Expecting '" + s + "', found: '" + this.token.tokentype + "'");
}
Node stmt() {
Node s, s2, t = null, e, v;
if (this.token.tokentype == TokenType.Keyword_if) {
getNextToken();
e = paren_expr();
s = stmt();
s2 = null;
if (this.token.tokentype == TokenType.Keyword_else) {
getNextToken();
s2 = stmt();
}
t = Node.make_node(NodeType.nd_If, e, Node.make_node(NodeType.nd_If, s, s2));
} else if (this.token.tokentype == TokenType.Keyword_putc) {
getNextToken();
e = paren_expr();
t = Node.make_node(NodeType.nd_Prtc, e);
expect("Putc", TokenType.Semicolon);
} else if (this.token.tokentype == TokenType.Keyword_print) {
getNextToken();
expect("Print", TokenType.LeftParen);
while (true) {
if (this.token.tokentype == TokenType.String) {
e = Node.make_node(NodeType.nd_Prts, Node.make_leaf(NodeType.nd_String, this.token.value));
getNextToken();
} else {
e = Node.make_node(NodeType.nd_Prti, expr(0), null);
}
t = Node.make_node(NodeType.nd_Sequence, t, e);
if (this.token.tokentype != TokenType.Comma) {
break;
}
getNextToken();
}
expect("Print", TokenType.RightParen);
expect("Print", TokenType.Semicolon);
} else if (this.token.tokentype == TokenType.Semicolon) {
getNextToken();
} else if (this.token.tokentype == TokenType.Identifier) {
v = Node.make_leaf(NodeType.nd_Ident, this.token.value);
getNextToken();
expect("assign", TokenType.Op_assign);
e = expr(0);
t = Node.make_node(NodeType.nd_Assign, v, e);
expect("assign", TokenType.Semicolon);
} else if (this.token.tokentype == TokenType.Keyword_while) {
getNextToken();
e = paren_expr();
s = stmt();
t = Node.make_node(NodeType.nd_While, e, s);
} else if (this.token.tokentype == TokenType.LeftBrace) {
getNextToken();
while (this.token.tokentype != TokenType.RightBrace && this.token.tokentype != TokenType.End_of_input) {
t = Node.make_node(NodeType.nd_Sequence, t, stmt());
}
expect("LBrace", TokenType.RightBrace);
} else if (this.token.tokentype == TokenType.End_of_input) {
} else {
error(this.token.line, this.token.pos, "Expecting start of statement, found: " + this.token.tokentype);
}
return t;
}
Node parse() {
Node t = null;
getNextToken();
while (this.token.tokentype != TokenType.End_of_input) {
t = Node.make_node(NodeType.nd_Sequence, t, stmt());
}
return t;
}
void printAST(Node t) {
if (t == null) {
System.out.println(";");
} else {
System.out.printf("%-14s", t.nt);
if (t.nt == NodeType.nd_Ident || t.nt == NodeType.nd_Integer || t.nt == NodeType.nd_String) {
System.out.println(" " + t.value);
} else {
System.out.println();
printAST(t.left);
printAST(t.right);
}
}
}
public static void main(String[] args) {
if (args.length > 0) {
try {
String value, token;
int line, pos;
Token t;
boolean found;
List<Token> list = new ArrayList<>();
Map<String, TokenType> str_to_tokens = new HashMap<>();
str_to_tokens.put("End_of_input", TokenType.End_of_input);
str_to_tokens.put("Op_multiply", TokenType.Op_multiply);
str_to_tokens.put("Op_divide", TokenType.Op_divide);
str_to_tokens.put("Op_mod", TokenType.Op_mod);
str_to_tokens.put("Op_add", TokenType.Op_add);
str_to_tokens.put("Op_subtract", TokenType.Op_subtract);
str_to_tokens.put("Op_negate", TokenType.Op_negate);
str_to_tokens.put("Op_not", TokenType.Op_not);
str_to_tokens.put("Op_less", TokenType.Op_less);
str_to_tokens.put("Op_lessequal", TokenType.Op_lessequal);
str_to_tokens.put("Op_greater", TokenType.Op_greater);
str_to_tokens.put("Op_greaterequal", TokenType.Op_greaterequal);
str_to_tokens.put("Op_equal", TokenType.Op_equal);
str_to_tokens.put("Op_notequal", TokenType.Op_notequal);
str_to_tokens.put("Op_assign", TokenType.Op_assign);
str_to_tokens.put("Op_and", TokenType.Op_and);
str_to_tokens.put("Op_or", TokenType.Op_or);
str_to_tokens.put("Keyword_if", TokenType.Keyword_if);
str_to_tokens.put("Keyword_else", TokenType.Keyword_else);
str_to_tokens.put("Keyword_while", TokenType.Keyword_while);
str_to_tokens.put("Keyword_print", TokenType.Keyword_print);
str_to_tokens.put("Keyword_putc", TokenType.Keyword_putc);
str_to_tokens.put("LeftParen", TokenType.LeftParen);
str_to_tokens.put("RightParen", TokenType.RightParen);
str_to_tokens.put("LeftBrace", TokenType.LeftBrace);
str_to_tokens.put("RightBrace", TokenType.RightBrace);
str_to_tokens.put("Semicolon", TokenType.Semicolon);
str_to_tokens.put("Comma", TokenType.Comma);
str_to_tokens.put("Identifier", TokenType.Identifier);
str_to_tokens.put("Integer", TokenType.Integer);
str_to_tokens.put("String", TokenType.String);
Scanner s = new Scanner(new File(args[0]));
while (s.hasNext()) {
String str = s.nextLine();
StringTokenizer st = new StringTokenizer(str);
line = Integer.parseInt(st.nextToken());
pos = Integer.parseInt(st.nextToken());
token = st.nextToken();
value = "";
while (st.hasMoreTokens()) {
value += st.nextToken() + " ";
}
found = str_to_tokens.containsKey(token);
if (!found) {
throw new Exception("Token not found: '" + token + "'");
}
list.add(new Token(str_to_tokens.get(token), value, line, pos));
}
Parser p = new Parser(list);
p.printAST(p.parse());
} catch (FileNotFoundException e) {
error(-1, -1, "Exception: " + e.getMessage());
} catch (Exception e) {
error(-1, -1, "Exception: " + e.getMessage());
}
} else {
error(-1, -1, "No args");
}
}
}
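The `expr()` method in the Java solution above (and its counterparts in the other solutions) is a precedence-climbing parser: parse a primary, then keep binding binary operators whose precedence is at least the minimum `p`, recursing with `precedence + 1` for left-associative operators. A minimal Python sketch of that loop, with a simplified token stream and only unary minus (the token list and `BIN_PREC` table here are illustrative assumptions, not part of any solution on this page):

```python
# Binary-operator precedences matching the tiny-language grammar.
BIN_PREC = {'*': 13, '/': 13, '%': 13, '+': 12, '-': 12,
            '<': 10, '<=': 10, '>': 10, '>=': 10,
            '==': 9, '!=': 9, '&&': 5, '||': 4}

def parse_expr(tokens, pos, p=0):
    """Return (ast, next_pos); ast is a nested tuple."""
    tok = tokens[pos]
    if tok == '(':
        node, pos = parse_expr(tokens, pos + 1, 0)
        assert tokens[pos] == ')', "')' expected"
        pos += 1
    elif tok == '-':                      # unary minus, precedence 14
        operand, pos = parse_expr(tokens, pos + 1, 14)
        node = ('Negate', operand)
    else:                                 # integer or identifier leaf
        node, pos = tok, pos + 1
    # Climb: keep absorbing binary operators whose precedence is >= p.
    while pos < len(tokens) and tokens[pos] in BIN_PREC and BIN_PREC[tokens[pos]] >= p:
        op = tokens[pos]
        # All operators in this grammar are left-associative,
        # so the right operand is parsed with precedence + 1.
        rhs, pos = parse_expr(tokens, pos + 1, BIN_PREC[op] + 1)
        node = (op, node, rhs)
    return node, pos
```

For example, `parse_expr(['2', '+', '3', '*', '4'], 0)` binds `*` tighter than `+`, yielding `('+', '2', ('*', '3', '4'))`.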
Julia
Julia tends to discourage large numbers of global variables, so this direct port from the Python reference implementation moves the globals into a function wrapper.
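The pattern described above — moving a parser's mutable globals (current token, input position) into an enclosing function so the helpers close over them — can be sketched in a few lines of Python (the names here are illustrative, not from the Julia code):

```python
# Hypothetical sketch: parser state lives in the wrapper function,
# and the nested helpers close over it instead of using globals.
def make_parser(tokens):
    state = {'pos': 0}            # shared mutable state, not a global

    def next_token():
        tok = tokens[state['pos']]
        state['pos'] += 1
        return tok

    def remaining():
        return tokens[state['pos']:]

    return next_token, remaining
```

Each call to `make_parser` gets its own independent state, which is the point of the wrapper: two parsers can run without interfering.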
struct ASTnode
nodetype::Int
left::Union{Nothing, ASTnode}
right::Union{Nothing, ASTnode}
value::Union{Nothing, Int, String}
end
function syntaxanalyzer(inputfile)
tkEOI, tkMul, tkDiv, tkMod, tkAdd, tkSub, tkNegate, tkNot, tkLss, tkLeq, tkGtr, tkGeq,
tkEql, tkNeq, tkAssign, tkAnd, tkOr, tkIf, tkElse, tkWhile, tkPrint, tkPutc, tkLparen, tkRparen,
tkLbrace, tkRbrace, tkSemi, tkComma, tkIdent, tkInteger, tkString = collect(1:31)
ndIdent, ndString, ndInteger, ndSequence, ndIf, ndPrtc, ndPrts, ndPrti, ndWhile,
ndAssign, ndNegate, ndNot, ndMul, ndDiv, ndMod, ndAdd, ndSub, ndLss, ndLeq,
ndGtr, ndGeq, ndEql, ndNeq, ndAnd, ndOr = collect(1:25)
TK_NAME, TK_RIGHT_ASSOC, TK_IS_BINARY, TK_IS_UNARY, TK_PRECEDENCE, TK_NODE = collect(1:6) # label Token columns
Tokens = [
["EOI" , false, false, false, -1, -1 ],
["*" , false, true, false, 13, ndMul ],
["/" , false, true, false, 13, ndDiv ],
["%" , false, true, false, 13, ndMod ],
["+" , false, true, false, 12, ndAdd ],
["-" , false, true, false, 12, ndSub ],
["-" , false, false, true, 14, ndNegate ],
["!" , false, false, true, 14, ndNot ],
["<" , false, true, false, 10, ndLss ],
["<=" , false, true, false, 10, ndLeq ],
[">" , false, true, false, 10, ndGtr ],
[">=" , false, true, false, 10, ndGeq ],
["==" , false, true, false, 9, ndEql ],
["!=" , false, true, false, 9, ndNeq ],
["=" , false, false, false, -1, ndAssign ],
["&&" , false, true, false, 5, ndAnd ],
["||" , false, true, false, 4, ndOr ],
["if" , false, false, false, -1, ndIf ],
["else" , false, false, false, -1, -1 ],
["while" , false, false, false, -1, ndWhile ],
["print" , false, false, false, -1, -1 ],
["putc" , false, false, false, -1, -1 ],
["(" , false, false, false, -1, -1 ],
[")" , false, false, false, -1, -1 ],
["{" , false, false, false, -1, -1 ],
["}" , false, false, false, -1, -1 ],
[";" , false, false, false, -1, -1 ],
["," , false, false, false, -1, -1 ],
["Ident" , false, false, false, -1, ndIdent ],
["Integer literal" , false, false, false, -1, ndInteger],
["String literal" , false, false, false, -1, ndString ]]
allsyms = Dict(
"End_of_input" => tkEOI, "Op_multiply" => tkMul, "Op_divide" => tkDiv,
"Op_mod" => tkMod, "Op_add" => tkAdd, "Op_subtract" => tkSub,
"Op_negate" => tkNegate, "Op_not" => tkNot, "Op_less" => tkLss,
"Op_lessequal" => tkLeq, "Op_greater" => tkGtr, "Op_greaterequal" => tkGeq,
"Op_equal" => tkEql, "Op_notequal" => tkNeq, "Op_assign" => tkAssign,
"Op_and" => tkAnd, "Op_or" => tkOr, "Keyword_if" => tkIf, "Keyword_else" => tkElse,
"Keyword_while" => tkWhile, "Keyword_print" => tkPrint, "Keyword_putc" => tkPutc,
"LeftParen" => tkLparen, "RightParen" => tkRparen, "LeftBrace" => tkLbrace,
"RightBrace" => tkRbrace, "Semicolon" => tkSemi, "Comma" => tkComma,
"Identifier" => tkIdent, "Integer" => tkInteger, "String" => tkString)
displaynodes = ["Identifier", "String", "Integer", "Sequence", "If", "Prtc", "Prts", "Prti", "While",
"Assign", "Negate", "Not", "Multiply", "Divide", "Mod", "Add", "Subtract", "Less",
"LessEqual", "Greater", "GreaterEqual", "Equal", "NotEqual", "And", "Or"]
errline, errcol, tok, toktext = fill("", 4)
error(msg) = throw("Error in syntax: $msg.")
nilnode = ASTnode(0, nothing, nothing, nothing)
tokother = ""
function gettok()
s = readline(inputfile)
if length(s) == 0
error("empty line")
end
linelist = split(strip(s), r"\s+", limit = 4)
# line col Ident varname
# [1] [2] [3] [4] (1-based in Julia)
errline, errcol, toktext = linelist[1:3]
if !haskey(allsyms, toktext)
error("Unknown token $toktext")
end
tok = allsyms[toktext]
tokother = (tok in [tkInteger, tkIdent, tkString]) ? linelist[4] : ""
end
makenode(oper, left, right = nilnode) = ASTnode(oper, left, right, nothing)
makeleaf(oper, n::Int) = ASTnode(oper, nothing, nothing, n)
makeleaf(oper, n) = ASTnode(oper, nothing, nothing, string(n))
expect(msg, s) = if tok != s error("$msg: Expecting $(Tokens[s][TK_NAME]), found $(Tokens[tok][TK_NAME])") else gettok() end
function expr(p)
x = nilnode
if tok == tkLparen
x = parenexpr()
elseif tok in [tkSub, tkAdd]
op = tok == tkSub ? tkNegate : tkAdd
gettok()
node = expr(Tokens[tkNegate][TK_PRECEDENCE])
x = (op == tkNegate) ? makenode(ndNegate, node) : node
elseif tok == tkNot
gettok()
x = makenode(ndNot, expr(Tokens[tkNot][TK_PRECEDENCE]))
elseif tok == tkIdent
x = makeleaf(ndIdent, tokother)
gettok()
elseif tok == tkInteger
x = makeleaf(ndInteger, tokother)
gettok()
else
error("Expecting a primary, found: $(Tokens[tok][TK_NAME])")
end
while Tokens[tok][TK_IS_BINARY] && (Tokens[tok][TK_PRECEDENCE] >= p)
op = tok
gettok()
q = Tokens[op][TK_PRECEDENCE]
if !Tokens[op][TK_RIGHT_ASSOC]
q += 1
end
node = expr(q)
x = makenode(Tokens[op][TK_NODE], x, node)
end
x
end
parenexpr() = (expect("paren_expr", tkLparen); node = expr(0); expect("paren_expr", tkRparen); node)
function stmt()
t = nilnode
if tok == tkIf
gettok()
e = parenexpr()
s = stmt()
s2 = nilnode
if tok == tkElse
gettok()
s2 = stmt()
end
t = makenode(ndIf, e, makenode(ndIf, s, s2))
elseif tok == tkPutc
gettok()
e = parenexpr()
t = makenode(ndPrtc, e)
expect("Putc", tkSemi)
elseif tok == tkPrint
gettok()
expect("Print", tkLparen)
while true
if tok == tkString
e = makenode(ndPrts, makeleaf(ndString, tokother))
gettok()
else
e = makenode(ndPrti, expr(0))
end
t = makenode(ndSequence, t, e)
if tok != tkComma
break
end
gettok()
end
expect("Print", tkRparen)
expect("Print", tkSemi)
elseif tok == tkSemi
gettok()
elseif tok == tkIdent
v = makeleaf(ndIdent, tokother)
gettok()
expect("assign", tkAssign)
e = expr(0)
t = makenode(ndAssign, v, e)
expect("assign", tkSemi)
elseif tok == tkWhile
gettok()
e = parenexpr()
s = stmt()
t = makenode(ndWhile, e, s)
elseif tok == tkLbrace
gettok()
while (tok != tkRbrace) && (tok != tkEOI)
t = makenode(ndSequence, t, stmt())
end
expect("Lbrace", tkRbrace)
elseif tok != tkEOI
error("Expecting start of statement, found: $(Tokens[tok][TK_NAME])")
end
return t
end
function parse()
t = nilnode
gettok()
while true
t = makenode(ndSequence, t, stmt())
if (tok == tkEOI) || (t == nilnode)
break
end
end
t
end
function prtASTnode(t)
if t == nothing
return
elseif t == nilnode
println(";")
elseif t.nodetype in [ndIdent, ndInteger, ndString]
println(rpad(displaynodes[t.nodetype], 14), t.value)
else
println(rpad(displaynodes[t.nodetype], 14))
end
prtASTnode(t.left)
prtASTnode(t.right)
end
# runs the function
prtASTnode(parse())
end
testtxt = """
1 1 Identifier count
1 7 Op_assign
1 9 Integer 1
1 10 Semicolon
2 1 Keyword_while
2 7 LeftParen
2 8 Identifier count
2 14 Op_less
2 16 Integer 10
2 18 RightParen
2 20 LeftBrace
3 5 Keyword_print
3 10 LeftParen
3 11 String \"count is: \"
3 23 Comma
3 25 Identifier count
3 30 Comma
3 32 String \"\\n\"
3 36 RightParen
3 37 Semicolon
4 5 Identifier count
4 11 Op_assign
4 13 Identifier count
4 19 Op_add
4 21 Integer 1
4 22 Semicolon
5 1 RightBrace
6 1 End_of_input """
syntaxanalyzer(IOBuffer(testtxt)) # for isolated testing
# syntaxanalyzer(length(ARGS) > 1 ? ARGS[1] : stdin) # for use as in the Python code
M2000 Interpreter
This program is written without functions. Subs use the interpreter's current stack of values to return arrays. Subs run in the same scope as the module or function that called them. We use Local to make local variables (erased at return). Sub prt_ast() is called the first time without passing a parameter, because the parameter already exists on the stack of values. When the interpreter calls a module, function, or subroutine it always passes values on the stack of values. Functions called in an expression always have their own stack of values. Modules call other modules passing the same stack of values. Threads are parts of modules, with the same scope as the module they belong to, but they have their own stack and static variables, and they run at time intervals.
A (1,2,3) is an auto array, or tuple. We can assign a tuple to a variable, or to an item in another tuple. A tuple is a reference type, but here we don't use a second pointer (a reference is a variable which refers to another variable, of reference or value type; a pointer is the reference which holds an object alive). We can read the 2nd item (expected to be a string) from alfa, a pointer to an array, using array$(alfa,1) or alfa#val$(1). The second variation can be chained when a tuple contains another tuple: alfa#val(2)#val$(1) returns a string from the 2nd item of the tuple expected at the 3rd item. The equivalent with the first variation is array$(array(alfa,2),1).
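The tuple representation described above — an interior node is ("Name", left, right), a leaf is a one-item tuple, and an empty tuple stands in for the nil node (M2000's (,)) — drives the prt_ast() sub below. A hedged Python sketch of the same preorder walk (the sample `ast` is an illustrative assumption):

```python
# Sketch of prt_ast()'s traversal over tuple-encoded AST nodes.
def flatten(node):
    """Return the node's lines in preorder, like prt_ast() prints them."""
    if len(node) < 1:            # nil node prints as ';'
        return [';']
    if len(node) == 3:           # interior node: name, then both children
        name, left, right = node
        return [name] + flatten(left) + flatten(right)
    return [node[0]]             # leaf: just its text

# Hypothetical tree for: count = count + 1;
ast = ('Assign', ('Identifier count',),
       ('Add', ('Identifier count',), ('Integer 1',)))
```

`flatten(ast)` yields the node names one per line, which is exactly the flattened AST format the task asks for.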
Module syntax_analyzer(b$){
enum tokens {
Op_add, Op_subtract, Op_not=5, Op_multiply=10, Op_divide, Op_mod,
Op_negate, Op_less, Op_lessequal, Op_greater, Op_greaterequal,
Op_equal, Op_notequal, Op_and, Op_or, Op_assign=100, Keyword_if=110,
Keyword_else, Keyword_while, Keyword_print, Keyword_putc, LeftParen, RightParen,
LeftBrace, RightBrace, Semicolon, Comma, Identifier, Integer, String, End_of_input
}
Inventory precedence=Op_multiply:=13, Op_divide:=13, Op_mod:=13, Op_add:=12, Op_subtract:=12
Append precedence, Op_negate:=14, Op_not:=14, Op_less:=10, Op_lessequal:=10, Op_greater:=10
Append precedence, Op_greaterequal:=10, Op_equal:=9, Op_notequal:=9, Op_assign:=-1, Op_and:=5
Append precedence, Op_or:=4
Inventory symbols=Op_multiply:="Multiply", Op_divide:="Divide", Op_mod:="Mod", Op_add:="Add"
Append symbols, Op_negate:="Negate", Op_not:="Not", Op_less:="Less", Op_subtract:="Subtract"
Append symbols, Op_lessequal:="LessEqual", Op_greater:="Greater", Op_greaterequal:="GreaterEqual"
Append symbols, Op_equal:="Equal", Op_notequal:="NotEqual", Op_and:="And", Op_or:="Or"
def lineNo, ColumnNo, m, line$, a, lim, cur=-1
const nl$=chr$(13)+chr$(10), Ansi=3
Dim lex$()
lex$()=piece$(b$,chr$(13)+chr$(10))
lim=dimension(lex$(),1)-1
op=End_of_input
flush
k=0
Try {
push (,) ' Null
getone(&op)
repeat
stmt(&op)
shift 2 ' swap two top items
push ("Sequence", array, array)
k++
until op=End_of_Input
}
er$=error$
if er$<>"" then print er$ : flush: break
Print "Ast"
Document Output$
prt_ast()
clipboard Output$
Save.Doc Output$, "parse.t", Ansi
document parse$
Load.Doc parse$,"parse.t", Ansi
Report parse$
sub prt_ast(t)
if len(t)<1 then
Output$=";"+nl$
else.if len(t)=3 then
Output$=t#val$(0) +nl$
prt_ast(t#val(1)) : prt_ast(t#val(2))
else
Output$=t#val$(0) +nl$
end if
end sub
sub expr(p) ' only a number
local x=(,), prev=op
if op>=Identifier then
x=(line$,)
getone(&op)
else.if op=LeftParen then
paren_exp()
x=array
else.if op<10 then
getone(&op)
expr(precedence(int(Op_negate)))
read local y
if prev=Op_add then
x=y
else
if prev=Op_subtract then prev=Op_negate
x=(symbols(prev), y,(,))
End if
else
{error "??? "+eval$(op)}
end if
local prec
while exist(precedence, int(op))
prev=op : prec=eval(precedence)
if prec<14 and prec>=p else exit
getone(&op)
expr(prec+1) ' all operators are left associative (use prec for right a.)
x=(symbols(int(prev)), x, array)
End While
Push x
end sub
sub paren_exp()
expected(LeftParen)
getone(&op)
expr(0)
expected(RightParen)
getone(&op)
end sub
sub stmt(&op)
local t=(,)
if op=Identifier then
t=(line$)
getone(&op)
expected(Op_assign)
getone(&op)
expr(0)
read local rightnode
Push ("Assign",t,rightnode)
expected(Semicolon)
getone(&op)
else.if op=Semicolon then
getone(&op)
Push (";",)
else.if op=Keyword_print then
getone(&op)
expected(LeftParen)
repeat
getone(&op)
if op=String then
Push ("Prts",(line$,),(,))
getone(&op)
else
expr(0)
Push ("Prti", array,(,))
end if
t=("Sequence", t, array)
until op<>Comma
expected(RightParen)
getone(&op)
expected(Semicolon)
getone(&op)
push t
else.if op=Keyword_while then
getone(&op)
paren_exp()
stmt(&op)
shift 2
Push ("While",array, array)
else.if op=Keyword_if then
getone(&op)
paren_exp()
stmt(&op)
local s2=(,)
if op=Keyword_else then
getone(&op)
stmt(&op)
read s2
end if
shift 2
Push ("If",array ,("If",array,s2))
else.if op=Keyword_putc then
getone(&op)
paren_exp()
Push ("Prtc",array,t)
expected(Semicolon)
getone(&op)
else.if op=LeftBrace then
Brace()
else
error "Unknown Op"
end if
end sub
Sub Brace()
getone(&op)
while op<>RightBrace and op<>End_of_input
stmt(&op)
t=("Sequence", t, array)
end while
expected(RightBrace)
getone(&op)
push t
End Sub
Sub expected(what)
if not op=what then {Error "Expected "+eval$(what)+" at "+str$(LineNo)+","+Str$(ColumnNo)}
End Sub
sub getone(&op)
op=End_of_input
while cur<lim
cur++
line$=trim$(lex$(cur))
if line$<>"" then exit
end while
if cur=lim then exit sub
LineNo=Val(line$,"int",m)
line$=mid$(line$, m)
ColumnNo=Val(line$,"int",m)
line$=trim$(mid$(line$, m))
Rem : Print LineNo, ColumnNo
m=instr(line$," ")
if m>0 then op=Eval("."+leftpart$(line$, " ")) else op=Eval("."+line$)
end sub
}
syntax_analyzer {
1 1 LeftBrace
5 5 Identifier left_edge
5 17 Op_assign
5 19 Op_subtract
5 20 Integer 420
5 23 Semicolon
6 5 Identifier right_edge
6 17 Op_assign
6 20 Integer 300
6 23 Semicolon
7 5 Identifier top_edge
7 17 Op_assign
7 20 Integer 300
7 23 Semicolon
8 5 Identifier bottom_edge
8 17 Op_assign
8 19 Op_subtract
8 20 Integer 300
8 23 Semicolon
9 5 Identifier x_step
9 17 Op_assign
9 22 Integer 7
9 23 Semicolon
10 5 Identifier y_step
10 17 Op_assign
10 21 Integer 15
10 23 Semicolon
12 5 Identifier max_iter
12 17 Op_assign
12 20 Integer 200
12 23 Semicolon
14 5 Identifier y0
14 8 Op_assign
14 10 Identifier top_edge
14 18 Semicolon
15 5 Keyword_while
15 11 LeftParen
15 12 Identifier y0
15 15 Op_greater
15 17 Identifier bottom_edge
15 28 RightParen
15 30 LeftBrace
16 9 Identifier x0
16 12 Op_assign
16 14 Identifier left_edge
16 23 Semicolon
17 9 Keyword_while
17 15 LeftParen
17 16 Identifier x0
17 19 Op_less
17 21 Identifier right_edge
17 31 RightParen
17 33 LeftBrace
18 13 Identifier y
18 15 Op_assign
18 17 Integer 0
18 18 Semicolon
19 13 Identifier x
19 15 Op_assign
19 17 Integer 0
19 18 Semicolon
20 13 Identifier the_char
20 22 Op_assign
20 24 Integer 32
20 27 Semicolon
21 13 Identifier i
21 15 Op_assign
21 17 Integer 0
21 18 Semicolon
22 13 Keyword_while
22 19 LeftParen
22 20 Identifier i
22 22 Op_less
22 24 Identifier max_iter
22 32 RightParen
22 34 LeftBrace
23 17 Identifier x_x
23 21 Op_assign
23 23 LeftParen
23 24 Identifier x
23 26 Op_multiply
23 28 Identifier x
23 29 RightParen
23 31 Op_divide
23 33 Integer 200
23 36 Semicolon
24 17 Identifier y_y
24 21 Op_assign
24 23 LeftParen
24 24 Identifier y
24 26 Op_multiply
24 28 Identifier y
24 29 RightParen
24 31 Op_divide
24 33 Integer 200
24 36 Semicolon
25 17 Keyword_if
25 20 LeftParen
25 21 Identifier x_x
25 25 Op_add
25 27 Identifier y_y
25 31 Op_greater
25 33 Integer 800
25 37 RightParen
25 39 LeftBrace
26 21 Identifier the_char
26 30 Op_assign
26 32 Integer 48
26 36 Op_add
26 38 Identifier i
26 39 Semicolon
27 21 Keyword_if
27 24 LeftParen
27 25 Identifier i
27 27 Op_greater
27 29 Integer 9
27 30 RightParen
27 32 LeftBrace
28 25 Identifier the_char
28 34 Op_assign
28 36 Integer 64
28 39 Semicolon
29 21 RightBrace
30 21 Identifier i
30 23 Op_assign
30 25 Identifier max_iter
30 33 Semicolon
31 17 RightBrace
32 17 Identifier y
32 19 Op_assign
32 21 Identifier x
32 23 Op_multiply
32 25 Identifier y
32 27 Op_divide
32 29 Integer 100
32 33 Op_add
32 35 Identifier y0
32 37 Semicolon
33 17 Identifier x
33 19 Op_assign
33 21 Identifier x_x
33 25 Op_subtract
33 27 Identifier y_y
33 31 Op_add
33 33 Identifier x0
33 35 Semicolon
34 17 Identifier i
34 19 Op_assign
34 21 Identifier i
34 23 Op_add
34 25 Integer 1
34 26 Semicolon
35 13 RightBrace
36 13 Keyword_putc
36 17 LeftParen
36 18 Identifier the_char
36 26 RightParen
36 27 Semicolon
37 13 Identifier x0
37 16 Op_assign
37 18 Identifier x0
37 21 Op_add
37 23 Identifier x_step
37 29 Semicolon
38 9 RightBrace
39 9 Keyword_putc
39 13 LeftParen
39 14 Integer 10
39 18 RightParen
39 19 Semicolon
40 9 Identifier y0
40 12 Op_assign
40 14 Identifier y0
40 17 Op_subtract
40 19 Identifier y_step
40 25 Semicolon
41 5 RightBrace
42 1 RightBrace
43 1 End_of_Input
}
- Output:
Sequence
;
Sequence
Sequence
Sequence
Sequence
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier left_edge
Negate
Integer 420
;
Assign
Identifier right_edge
Integer 300
Assign
Identifier top_edge
Integer 300
Assign
Identifier bottom_edge
Negate
Integer 300
;
Assign
Identifier x_step
Integer 7
Assign
Identifier y_step
Integer 15
Assign
Identifier max_iter
Integer 200
Assign
Identifier y0
Identifier top_edge
While
Greater
Identifier y0
Identifier bottom_edge
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier x0
Identifier left_edge
While
Less
Identifier x0
Identifier right_edge
Sequence
Sequence
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier y
Integer 0
Assign
Identifier x
Integer 0
Assign
Identifier the_char
Integer 32
Assign
Identifier i
Integer 0
While
Less
Identifier i
Identifier max_iter
Sequence
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier x_x
Divide
Multiply
Identifier x
Identifier x
Integer 200
Assign
Identifier y_y
Divide
Multiply
Identifier y
Identifier y
Integer 200
If
Greater
Add
Identifier x_x
Identifier y_y
Integer 800
if
Sequence
Sequence
Sequence
;
Assign
Identifier the_char
Add
Integer 48
Identifier i
If
Greater
Identifier i
Integer 9
if
Sequence
;
Assign
Identifier the_char
Integer 64
;
Assign
Identifier i
Identifier max_iter
;
Assign
Identifier y
Add
Divide
Multiply
Identifier x
Identifier y
Integer 100
Identifier y0
Assign
Identifier x
Add
Subtract
Identifier x_x
Identifier y_y
Identifier x0
Assign
Identifier i
Add
Identifier i
Integer 1
Putc
Identifier the_char
;
Assign
Identifier x0
Add
Identifier x0
Identifier x_step
Putc
Integer 10
;
Assign
Identifier y0
Subtract
Identifier y0
Identifier y_step
Nim
Using the third version of the Nim lexer.
import ast_lexer
type NodeKind* = enum
nIdentifier = "Identifier"
nString = "String"
nInteger = "Integer"
nSequence = "Sequence"
nIf = "If"
nPrtc = "Prtc"
nPrts = "Prts"
nPrti = "Prti"
nWhile = "While"
nAssign = "Assign"
nNegate = "Negate"
nNot = "Not"
nMultiply = "Multiply"
nDivide = "Divide"
nMod = "Mod"
nAdd = "Add"
nSubtract = "Subtract"
nLess = "Less"
nLessEqual = "LessEqual"
nGreater = "Greater"
nGreaterEqual = "GreaterEqual"
nEqual = "Equal"
nNotEqual = "NotEqual"
nAnd = "And"
nOr = "Or"
type Node* = ref object
left*: Node
right*: Node
case kind*: NodeKind
of nString: stringVal*: string
of nInteger: intVal*: int
of nIdentifier: name*: string
else: nil
type Operator = range[tokMult..tokOr]
const
Precedences: array[Operator, int] = [13, # tokMult
13, # tokDiv
13, # tokMod
12, # tokAdd
12, # tokSub
10, # tokLess
10, # tokLessEq
10, # tokGreater
10, # tokGreaterEq
9, # tokEq
9, # tokNotEq
14, # tokNot
-1, # tokAssign
5, # tokAnd
4] # tokOr
UnaryPrecedence = 14
BinaryOperators = {tokMult, tokDiv, tokMod, tokAdd, tokSub, tokLess, tokLessEq,
tokGreater, tokGreaterEq, tokEq, tokNotEq, tokAnd, tokOr}
# Mapping of operators from TokenKind to NodeKind.
NodeKinds: array[Operator, NodeKind] = [nMultiply, nDivide, nMod, nAdd, nSubtract,
nLess, nLessEqual, nGreater, nGreaterEqual,
nEqual, nNotEqual, nNot, nAssign, nAnd, nOr]
type SyntaxError* = object of CatchableError
####################################################################################################
template expect(token: Token; expected: TokenKind; errmsg: string) =
## Check if a token is of the expected kind.
## Raise a SyntaxError if this is not the case.
if token.kind != expected:
raise newException(SyntaxError, "line " & $token.ln & ": " & errmsg)
token = lexer.next()
#---------------------------------------------------------------------------------------------------
proc newNode*(kind: NodeKind; left: Node; right: Node = nil): Node =
## Create a new node with given left and right children.
result = Node(kind: kind, left: left, right: right)
#---------------------------------------------------------------------------------------------------
# Forward reference.
proc parExpr(lexer: var Lexer; token: var Token): Node
#---------------------------------------------------------------------------------------------------
proc expr(lexer: var Lexer; token: var Token; p: int): Node =
## Parse an expression.
case token.kind
of tokLPar:
result = parExpr(lexer, token)
of tokAdd, tokSub, tokNot:
# Unary operators.
let savedToken = token
token = lexer.next()
let e = expr(lexer, token, UnaryPrecedence)
if savedToken.kind == tokAdd:
result = e
else:
result = newNode(if savedToken.kind == tokSub: nNegate else: nNot, e)
of tokIdent:
result = Node(kind: nIdentifier, name: token.ident)
token = lexer.next()
of tokInt:
result = Node(kind:nInteger, intVal: token.intVal)
token = lexer.next()
of tokChar:
result = Node(kind:nInteger, intVal: ord(token.charVal))
token = lexer.next()
else:
raise newException(SyntaxError, "Unexpected symbol at line " & $token.ln)
# Process the binary operators in the expression.
while token.kind in BinaryOperators and Precedences[token.kind] >= p:
let savedToken = token
token = lexer.next()
let q = Precedences[savedToken.kind] + 1 # No operator is right associative.
result = newNode(NodeKinds[savedToken.kind], result, expr(lexer, token, q))
#---------------------------------------------------------------------------------------------------
proc parExpr(lexer: var Lexer; token: var Token): Node =
## Parse a parenthetized expression.
token.expect(tokLPar, "'(' expected")
result = expr(lexer, token, 0)
token.expect(tokRPar, "')' expected")
#---------------------------------------------------------------------------------------------------
proc stmt(lexer: var Lexer; token: var Token): Node =
## Parse a statement.
case token.kind:
of tokIf:
token = lexer.next()
let e = parExpr(lexer, token)
let thenNode = stmt(lexer, token)
var elseNode: Node = nil
if token.kind == tokElse:
token = lexer.next()
elseNode = stmt(lexer, token)
result = newNode(nIf, e, newNode(nIf, thenNode, elseNode))
of tokPutc:
token = lexer.next()
result = newNode(nPrtc, parExpr(lexer, token))
token.expect(tokSemi, "';' expected")
of tokPrint:
token = lexer.next()
token.expect(tokLPar, "'(' expected")
while true:
var e: Node
if token.kind == tokString:
e = newNode(nPrts, Node(kind: nString, stringVal: token.stringVal))
token = lexer.next()
else:
e = newNode(nPrti, expr(lexer, token, 0))
result = newNode(nSequence, result, e)
if token.kind == tokComma:
token = lexer.next()
else:
break
token.expect(tokRPar, "')' expected")
token.expect(tokSemi, "';' expected")
of tokSemi:
token = lexer.next()
of tokIdent:
let v = Node(kind: nIdentifier, name: token.ident)
token = lexer.next()
token.expect(tokAssign, "'=' expected")
result = newNode(nAssign, v, expr(lexer, token, 0))
token.expect(tokSemi, "';' expected")
of tokWhile:
token = lexer.next()
let e = parExpr(lexer, token)
result = newNode(nWhile, e, stmt(lexer, token))
of tokLBrace:
token = lexer.next()
while token.kind notin {tokRBrace, tokEnd}:
result = newNode(nSequence, result, stmt(lexer, token))
token.expect(tokRBrace, "'}' expected")
of tokEnd:
discard
else:
raise newException(SyntaxError, "Unexpected symbol at line " & $token.ln)
#---------------------------------------------------------------------------------------------------
proc parse*(code: string): Node =
## Parse the code provided.
var lexer = initLexer(code)
var token = lexer.next()
while true:
result = newNode(nSequence, result, stmt(lexer, token))
if token.kind == tokEnd:
break
#———————————————————————————————————————————————————————————————————————————————————————————————————
when isMainModule:
import os, strformat, strutils
proc printAst(node: Node) =
    ## Print the AST in linear form.
if node.isNil:
echo ';'
else:
stdout.write &"{$node.kind:<14}"
case node.kind
of nIdentifier:
echo node.name
of nInteger:
echo node.intVal
of nString:
      # Need to escape the string and to replace hexadecimal \x0A with \n.
echo escape(node.stringVal).replace("\\x0A", "\\n")
else:
echo ""
node.left.printAst()
node.right.printAst()
let code = if paramCount() < 1: stdin.readAll() else: paramStr(1).readFile()
let tree = parse(code)
tree.printAst()
- Output:
Prime number program AST.
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier    count
Integer       1
Assign
Identifier    n
Integer       1
Assign
Identifier    limit
Integer       100
While
Less
Identifier    n
Identifier    limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier    k
Integer       3
Assign
Identifier    p
Integer       1
Assign
Identifier    n
Add
Identifier    n
Integer       2
While
And
LessEqual
Multiply
Identifier    k
Identifier    k
Identifier    n
Identifier    p
Sequence
Sequence
;
Assign
Identifier    p
NotEqual
Multiply
Divide
Identifier    n
Identifier    k
Identifier    k
Identifier    n
Assign
Identifier    k
Add
Identifier    k
Integer       2
If
Identifier    p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier    n
;
Prts
String        " is prime\n"
;
Assign
Identifier    count
Add
Identifier    count
Integer       1
;
Sequence
Sequence
Sequence
;
Prts
String        "Total primes found: "
;
Prti
Identifier    count
;
Prts
String        "\n"
;
ObjectIcon
There are very few differences from the plain Icon implementation, although neither compiler can compile the other's implementation.
In Object Icon, the co-expressions should be fast.
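For readers more familiar with mainstream languages: the nxt/curr pair of co-expressions below behaves much like a generator combined with a one-slot "current token" cell. A rough Python analogue follows; the names (TokenGetter, advance, current) are invented here for illustration and appear in neither Icon implementation.

```python
# Sketch of the Object Icon nxt/curr token-getter pattern using a
# Python generator expression plus a remembered "current" slot.
class TokenGetter:
    def __init__(self, lines):
        # Each non-blank lexer line splits into line, column, token, value.
        self._gen = (line.split(None, 3) for line in lines if line.strip())
        self._current = None

    def advance(self):
        # Like @gettok.nxt: pull the next token and remember it.
        self._current = next(self._gen, ["0", "0", "End_of_input"])
        return self._current

    def current(self):
        # Like @gettok.curr: re-deliver the most recently pulled token.
        return self._current

g = TokenGetter(["1 1 Keyword_print", "1 7 LeftParen"])
g.advance()
print(g.current()[2])   # -> Keyword_print
g.advance()
print(g.current()[2])   # -> LeftParen
```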
# -*- ObjectIcon -*-
#
# The Rosetta Code Tiny-Language Parser, in Object Icon.
#
# This implementation is based closely on the pseudocode and the C
# reference implementation.
#
import io
record token_record (line_no, column_no, tok, tokval)
record token_getter (nxt, curr)
procedure main (args)
local inpf_name, outf_name
local inpf, outf
local nexttok, currtok, current_token, gettok
local ast
inpf_name := "-"
outf_name := "-"
if 1 <= *args then inpf_name := args[1]
if 2 <= *args then outf_name := args[2]
inpf :=
if inpf_name == "-" then
FileStream.stdin
else
(FileStream(inpf_name, FileOpt.RDONLY) | stop(&why))
outf :=
if outf_name == "-" then
FileStream.stdout
else
(FileStream(outf_name, ior(FileOpt.WRONLY,
FileOpt.TRUNC,
FileOpt.CREAT)) | stop(&why))
current_token := [&null]
nexttok := create generate_tokens(inpf, current_token)
currtok := create get_current_token (current_token)
gettok := token_getter(nexttok, currtok)
ast := parse(gettok)
prt_ast(outf, ast)
close(inpf)
close(outf)
end
procedure prt_ast (outf, ast)
if *ast = 0 then {
write(outf, ";")
} else {
writes(outf, ast[1])
if ast[1] == ("Identifier" | "Integer" | "String") then {
write(outf, " ", ast[2])
} else {
write(outf)
prt_ast(outf, ast[2])
prt_ast(outf, ast[3])
}
}
end
procedure generate_tokens (inpf, current_token)
local s
while s := read(inpf) do {
if trim(s) ~== "" then {
current_token[1] := string_to_token_record(s)
suspend current_token[1]
}
}
end
procedure get_current_token (current_token)
repeat (suspend current_token[1])
end
procedure string_to_token_record (s)
local line_no, column_no, tok, tokval
static spaces
initial {
spaces := ' \t\f\v\r\n'
}
trim(s) ? {
tab(many(spaces))
line_no := integer(tab(many(&digits)))
tab(many(spaces))
column_no := integer(tab(many(&digits)))
tab(many(spaces))
tok := tab(many(&letters ++ '_'))
tab(many(spaces))
tokval := tab(0)
}
return token_record(line_no, column_no, tok, tokval)
end
procedure parse (gettok)
local tok
local t
t := []
@gettok.nxt
tok := "Not End_of_input"
while tok ~== "End_of_input" do {
t := ["Sequence", t, stmt(gettok)]
tok := (@gettok.curr).tok
}
return t
end
procedure stmt (gettok)
local e, s, t, v
local tok
local done
t := []
if accept(gettok, "Keyword_if") then {
e := paren_expr(gettok)
s := stmt(gettok)
t := ["If", e, ["If", s,
if accept(gettok, "Keyword_else")
then stmt(gettok) else []]]
} else if accept(gettok, "Keyword_putc") then {
t := ["Prtc", paren_expr(gettok), []]
expect(gettok, "Putc", "Semicolon")
} else if accept(gettok, "Keyword_print") then {
expect(gettok, "Print", "LeftParen")
done := &no
while /done do {
tok := @gettok.curr
if tok.tok == "String" then {
e := ["Prts", ["String", tok.tokval], []]
@gettok.nxt
} else {
e := ["Prti", expr(gettok, 0), []]
}
t := ["Sequence", t, e]
accept(gettok, "Comma") | (done := &yes)
}
expect(gettok, "Print", "RightParen")
expect(gettok, "Print", "Semicolon")
} else if (@gettok.curr).tok == "Semicolon" then {
@gettok.nxt
} else if (@gettok.curr).tok == "Identifier" then {
v := ["Identifier", (@gettok.curr).tokval]
@gettok.nxt
expect(gettok, "assign", "Op_assign")
t := ["Assign", v, expr(gettok, 0)]
expect(gettok, "assign", "Semicolon")
} else if accept(gettok, "Keyword_while") then {
e := paren_expr(gettok)
t := ["While", e, stmt(gettok)]
} else if accept(gettok, "LeftBrace") then {
until (@gettok.curr).tok == ("RightBrace" | "End_of_input") do {
t := ["Sequence", t, stmt(gettok)]
}
expect(gettok, "Lbrace", "RightBrace")
} else if (@gettok.curr).tok ~== "End_of_input" then {
tok := @gettok.curr
error(tok, ("expecting start of statement, found '" ||
tok_text(tok.tok) || "'"))
}
return t
end
procedure paren_expr (gettok)
local x
expect(gettok, "paren_expr", "LeftParen");
x := expr(gettok, 0);
expect(gettok, "paren_expr", "RightParen");
return x
end
procedure expr (gettok, p)
local tok, save_tok
local x, y
local q
tok := @gettok.curr
case tok.tok of {
"LeftParen" : {
x := paren_expr(gettok)
}
"Op_subtract" : {
@gettok.nxt
y := expr(gettok, precedence("Op_negate"))
x := ["Negate", y, []]
}
"Op_add" : {
@gettok.nxt
x := expr(gettok, precedence("Op_negate"))
}
"Op_not" : {
@gettok.nxt
y := expr(gettok, precedence("Op_not"))
x := ["Not", y, []]
}
"Identifier" : {
x := ["Identifier", tok.tokval]
@gettok.nxt
}
"Integer" : {
x := ["Integer", tok.tokval]
@gettok.nxt
}
default : {
error(tok, "Expecting a primary, found: " || tok_text(tok.tok))
}
}
while (tok := @gettok.curr &
is_binary(tok.tok) &
p <= precedence(tok.tok)) do
{
save_tok := tok
@gettok.nxt
q := precedence(save_tok.tok)
if not is_right_associative(save_tok.tok) then q +:= 1
x := [operator(save_tok.tok), x, expr(gettok, q)]
}
return x
end
procedure accept (gettok, tok)
local nxt
if (@gettok.curr).tok == tok then nxt := @gettok.nxt else fail
return nxt
end
procedure expect (gettok, msg, tok)
if (@gettok.curr).tok ~== tok then {
error(@gettok.curr,
msg || ": Expecting '" || tok_text(tok) || "', found '" ||
tok_text((@gettok.curr).tok) || "'")
}
return @gettok.nxt
end
procedure error (token, msg)
write("(", token.line_no, ", ", token.column_no, ") error: ", msg)
exit(1)
end
procedure precedence (tok)
local p
case tok of {
"Op_multiply" : p := 13
"Op_divide" : p := 13
"Op_mod" : p := 13
"Op_add" : p := 12
"Op_subtract" : p := 12
"Op_negate" : p := 14
"Op_not" : p := 14
"Op_less" : p := 10
"Op_lessequal" : p := 10
"Op_greater" : p := 10
"Op_greaterequal" : p := 10
"Op_equal" : p := 9
"Op_notequal" : p := 9
"Op_and" : p := 5
"Op_or" : p := 4
default : p := -1
}
return p
end
procedure is_binary (tok)
return ("Op_add" |
"Op_subtract" |
"Op_multiply" |
"Op_divide" |
"Op_mod" |
"Op_less" |
"Op_lessequal" |
"Op_greater" |
"Op_greaterequal" |
"Op_equal" |
"Op_notequal" |
"Op_and" |
"Op_or") == tok
fail
end
procedure is_right_associative (tok)
# None of the current operators is right associative.
fail
end
procedure operator (tok)
local s
case tok of {
"Op_multiply" : s := "Multiply"
"Op_divide" : s := "Divide"
"Op_mod" : s := "Mod"
"Op_add" : s := "Add"
"Op_subtract" : s := "Subtract"
"Op_negate" : s := "Negate"
"Op_not" : s := "Not"
"Op_less" : s := "Less"
"Op_lessequal" : s := "LessEqual"
"Op_greater" : s := "Greater"
"Op_greaterequal" : s := "GreaterEqual"
"Op_equal" : s := "Equal"
"Op_notequal" : s := "NotEqual"
"Op_and" : s := "And"
"Op_or" : s := "Or"
}
return s
end
procedure tok_text (tok)
local s
case tok of {
"Keyword_else" : s := "else"
"Keyword_if" : s := "if"
"Keyword_print" : s := "print"
"Keyword_putc" : s := "putc"
"Keyword_while" : s := "while"
"Op_multiply" : s := "*"
"Op_divide" : s := "/"
"Op_mod" : s := "%"
"Op_add" : s := "+"
"Op_subtract" : s := "-"
"Op_negate" : s := "-"
"Op_less" : s := "<"
"Op_lessequal" : s := "<="
"Op_greater" : s := ">"
"Op_greaterequal" : s := ">="
"Op_equal" : s := "=="
"Op_notequal" : s := "!="
"Op_not" : s := "!"
"Op_assign" : s := "="
"Op_and" : s := "&&"
"Op_or" : s := "||"
"LeftParen" : s := "("
"RightParen" : s := ")"
"LeftBrace" : s := "{"
"RightBrace" : s := "}"
"Semicolon" : s := ";"
"Comma" : s := ","
"Identifier" : s := "Ident"
"Integer" : s := "Integer literal"
"String" : s := "String literal"
"End_of_input" : s := "EOI"
}
return s
end
- Output:
$ oit -s parse_in_OI.icn && ./parse_in_OI compiler-tests/primes.lex
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier count
Integer 1
Assign
Identifier n
Integer 1
Assign
Identifier limit
Integer 100
While
Less
Identifier n
Identifier limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier k
Integer 3
Assign
Identifier p
Integer 1
Assign
Identifier n
Add
Identifier n
Integer 2
While
And
LessEqual
Multiply
Identifier k
Identifier k
Identifier n
Identifier p
Sequence
Sequence
;
Assign
Identifier p
NotEqual
Multiply
Divide
Identifier n
Identifier k
Identifier k
Identifier n
Assign
Identifier k
Add
Identifier k
Integer 2
If
Identifier p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier n
;
Prts
String " is prime\n"
;
Assign
Identifier count
Add
Identifier count
Integer 1
;
Sequence
Sequence
Sequence
;
Prts
String "Total primes found: "
;
Prti
Identifier count
;
Prts
String "\n"
;
Perl
Tested on perl v5.26.1
#!/usr/bin/perl
use strict; # parse.pl - inputs lex, outputs flattened ast
use warnings; # http://www.rosettacode.org/wiki/Compiler/syntax_analyzer
my $h = qr/\G\s*\d+\s+\d+\s+/; # header of each line
sub error { die "*** Expected @_ at " . (/\G(.*\n)/ ?
$1 =~ s/^\s*(\d+)\s+(\d+)\s+/line $1 character $2 got /r : "EOF\n") }
sub want { /$h \Q$_[1]\E.*\n/gcx ? shift : error "'$_[1]'" }
local $_ = join '', <>;
print want stmtlist(), 'End_of_input';
sub stmtlist
{
/(?=$h (RightBrace|End_of_input))/gcx and return ";\n";
my ($stmt, $stmtlist) = (stmt(), stmtlist());
$stmtlist eq ";\n" ? $stmt : "Sequence\n$stmt$stmtlist";
}
sub stmt
{
/$h Semicolon\n/gcx ? ";\n" :
/$h Identifier \s+ (\w+) \n/gcx ? want("Assign\nIdentifier\t$1\n",
'Op_assign') . want expr(0), 'Semicolon' :
/$h Keyword_while \n/gcx ? "While\n" . parenexp() . stmt() :
/$h Keyword_if \n/gcx ? "If\n" . parenexp() . "If\n" . stmt() .
(/$h Keyword_else \n/gcx ? stmt() : ";\n") :
/$h Keyword_print \n/gcx ? want('', 'LeftParen') .
want want(printlist(), 'RightParen'), 'Semicolon' :
/$h Keyword_putc \n/gcx ? want "Prtc\n" . parenexp() . ";\n", 'Semicolon' :
/$h LeftBrace \n/gcx ? want stmtlist(), 'RightBrace' :
error 'A STMT';
}
sub parenexp { want('', 'LeftParen') . want expr(0), 'RightParen' } # (expr)
sub printlist
{
my $ast = /$h String \s+ (".*") \n/gcx ?
"Prts\nString\t\t$1\n;\n" : "Prti\n" . expr(0) . ";\n";
/$h Comma \n/gcx ? "Sequence\n$ast" . printlist() : $ast;
}
sub expr # (sort of EBNF) expr = operand { operator expr }
{
my $ast = # operand
/$h Integer \s+ (\d+) \n/gcx ? "Integer\t\t$1\n" :
/$h Identifier \s+ (\w+) \n/gcx ? "Identifier\t$1\n" :
/$h LeftParen \n/gcx ? want expr(0), 'RightParen' :
/$h Op_(negate|subtract) \n/gcx ? "Negate\n" . expr(8) . ";\n" :
/$h Op_not \n/gcx ? "Not\n" . expr(8) . ";\n" :
/$h Op_add \n/gcx ? expr(8) :
error "A PRIMARY";
$ast = # { operator expr }
$_[0] <= 7 && /$h Op_multiply \n/gcx ? "Multiply\n$ast" . expr(8) :
$_[0] <= 7 && /$h Op_divide \n/gcx ? "Divide\n$ast" . expr(8) :
$_[0] <= 7 && /$h Op_mod \n/gcx ? "Mod\n$ast" . expr(8) :
$_[0] <= 6 && /$h Op_add \n/gcx ? "Add\n$ast" . expr(7) :
$_[0] <= 6 && /$h Op_subtract \n/gcx ? "Subtract\n$ast" . expr(7) :
$_[0] == 5 && /(?=$h Op_(less|greater)(equal)? \n)/gcx ? error 'NO ASSOC' :
$_[0] <= 5 && /$h Op_lessequal \n/gcx ? "LessEqual\n$ast" . expr(5) :
$_[0] <= 5 && /$h Op_less \n/gcx ? "Less\n$ast" . expr(5) :
$_[0] <= 5 && /$h Op_greater \n/gcx ? "Greater\n$ast" . expr(5) :
$_[0] <= 5 && /$h Op_greaterequal \n/gcx ? "GreaterEqual\n$ast" . expr(5) :
$_[0] == 3 && /(?=$h Op_(not)?equal \n)/gcx ? error 'NO ASSOC' :
$_[0] <= 3 && /$h Op_equal \n/gcx ? "Equal\n$ast" . expr(3) :
$_[0] <= 3 && /$h Op_notequal \n/gcx ? "NotEqual\n$ast" . expr(3) :
$_[0] <= 1 && /$h Op_and \n/gcx ? "And\n$ast" . expr(2) :
$_[0] <= 0 && /$h Op_or \n/gcx ? "Or\n$ast" . expr(1) :
return $ast while 1;
}
- Output — Count AST:
Sequence
Assign
Identifier      count
Integer         1
While
Less
Identifier      count
Integer         10
Sequence
Sequence
Prts
String          "count is: "
;
Sequence
Prti
Identifier      count
;
Prts
String          "\n"
;
Assign
Identifier      count
Add
Identifier      count
Integer         1
Phix
Reusing lex.e (and core.e) from the Lexical Analyzer task, and again written as a reusable module.
--
-- demo\rosetta\Compiler\parse.e
-- =============================
--
--  The reusable part of parse.exw
--
with javascript_semantics
include lex.e

sequence tok

procedure errd(sequence msg, sequence args={})
    {tok_line,tok_col} = tok
    error(msg,args)
end procedure

global sequence toks
integer next_tok = 1

function get_tok()
    sequence tok = toks[next_tok]
    next_tok += 1
    return tok
end function

procedure expect(string msg, integer s)
    integer tk = tok[3]
    if tk!=s then
        errd("%s: Expecting '%s', found '%s'\n",
             {msg, tkNames[s], tkNames[tk]})
    end if
    tok = get_tok()
end procedure

function expr(integer p)
    object x = NULL, node
    integer op = tok[3]
    switch op do
        case tk_LeftParen:
            tok = get_tok()
            x = expr(0)
            expect("expr",tk_RightParen)
        case tk_sub:
        case tk_add:
            tok = get_tok()
            node = expr(precedences[tk_neg]);
            x = iff(op==tk_sub?{tk_neg, node, NULL}:node)
        case tk_not:
            tok = get_tok();
            x = {tk_not, expr(precedences[tk_not]), NULL}
        case tk_Identifier:
            x = {tk_Identifier, tok[4]}
            tok = get_tok();
        case tk_Integer:
            x = {tk_Integer, tok[4]}
            tok = get_tok();
        default:
            errd("Expecting a primary, found: %s\n", tkNames[op])
    end switch
    op = tok[3]
    while narys[op]=BINARY and precedences[op]>=p do
        tok = get_tok()
        x = {op, x, expr(precedences[op]+1)}
        op = tok[3]
    end while
    return x;
end function

function paren_expr(string msg)
    expect(msg, tk_LeftParen);
    object t = expr(0)
    expect(msg, tk_RightParen);
    return t
end function

function stmt()
    object t = NULL, e, s
    switch tok[3] do
        case tk_if:
            tok = get_tok();
            object condition = paren_expr("If-cond");
            object ifblock = stmt();
            object elseblock = NULL;
            if tok[3] == tk_else then
                tok = get_tok();
                elseblock = stmt();
            end if
            t = {tk_if, condition, {tk_if, ifblock, elseblock}}
        case tk_putc:
            tok = get_tok();
            e = paren_expr("Prtc")
            t = {tk_putc, e, NULL}
            expect("Putc", tk_Semicolon);
        case tk_print:
            tok = get_tok();
            expect("Print",tk_LeftParen)
            while 1 do
                if tok[3] == tk_String then
                    e = {tk_Prints, {tk_String, tok[4]}, NULL}
                    tok = get_tok();
                else
                    e = {tk_Printi, expr(0), NULL}
                end if
                t = {tk_Sequence, t, e}
                if tok[3]!=tk_Comma then exit end if
                expect("Print", tk_Comma)
            end while
            expect("Print", tk_RightParen);
            expect("Print", tk_Semicolon);
        case tk_Semicolon:
            tok = get_tok();
        case tk_Identifier:
            object v
            v = {tk_Identifier, tok[4]}
            tok = get_tok();
            expect("assign", tk_assign);
            e = expr(0);
            t = {tk_assign, v, e}
            expect("assign", tk_Semicolon);
        case tk_while:
            tok = get_tok();
            e = paren_expr("while");
            s = stmt();
            t = {tk_while, e, s}
        case tk_LeftBrace:  /* {stmt} */
            expect("LeftBrace", tk_LeftBrace)
            while not find(tok[3],{tk_RightBrace,tk_EOI}) do
                t = {tk_Sequence, t, stmt()}
            end while
            expect("LeftBrace", tk_RightBrace);
            break;
        case tk_EOI:
            break;
        default:
            errd("expecting start of statement, found '%s'\n", tkNames[tok[3]]);
    end switch
    return t
end function

global function parse()
    object t = NULL
    tok = get_tok()
    while 1 do
        object s = stmt()
        if s=NULL then exit end if
        t = {tk_Sequence, t, s}
    end while
    return t
end function
And a simple test driver for the specific task:
--
-- demo\rosetta\Compiler\parse.exw
-- ===============================
--
with javascript_semantics
include parse.e

procedure print_ast(object t)
    if t == NULL then
        printf(output_file,";\n")
    else
        integer ttype = t[1]
        printf(output_file,tkNames[ttype])
        if ttype=tk_Identifier then
            printf(output_file," %s\n",t[2])
        elsif ttype=tk_Integer then
            printf(output_file," %d\n",t[2])
        elsif ttype=tk_String then
            printf(output_file," %s\n",enquote(t[2]))
        else
            printf(output_file,"\n")
            print_ast(t[2])
            print_ast(t[3])
        end if
    end if
end procedure

function ptree(object t)
    if sequence(t) then
        integer t1 = t[1]
        t = deep_copy(t)
        t[1] = tkNames[t1]
        if not find(t1,{tk_Identifier,tk_String}) then
            for i=2 to length(t) do
                if t1=tk_Sequence and t[i]=NULL then
                    t[i] = "NULL"
                else
                    t[i] = ptree(t[i])
                end if
            end for
        end if
    end if
    return t
end function

procedure main(sequence cl)
    open_files(cl)
    toks = lex()
    object t = parse()
    print_ast(t)
    pp(ptree(t),{pp_Nest,10,pp_Pause,0,pp_IntCh,false})
    close_files()
end procedure

--main(command_line())
main({0,0,"test3.c"})       -- not parseable!
--main({0,0,"primes.c"})    -- as Algol, C, Python (apart from spacing)
--main({0,0,"count.c"})     -- as AWK ( "" )
- Output:
Line 5 column 40: Print: Expecting 'LeftParen', found 'Op_subtract'
Python
Tested with Python 2.7 and 3.x
from __future__ import print_function
import sys, shlex, operator
tk_EOI, tk_Mul, tk_Div, tk_Mod, tk_Add, tk_Sub, tk_Negate, tk_Not, tk_Lss, tk_Leq, tk_Gtr, \
tk_Geq, tk_Eql, tk_Neq, tk_Assign, tk_And, tk_Or, tk_If, tk_Else, tk_While, tk_Print, \
tk_Putc, tk_Lparen, tk_Rparen, tk_Lbrace, tk_Rbrace, tk_Semi, tk_Comma, tk_Ident, \
tk_Integer, tk_String = range(31)
nd_Ident, nd_String, nd_Integer, nd_Sequence, nd_If, nd_Prtc, nd_Prts, nd_Prti, nd_While, \
nd_Assign, nd_Negate, nd_Not, nd_Mul, nd_Div, nd_Mod, nd_Add, nd_Sub, nd_Lss, nd_Leq, \
nd_Gtr, nd_Geq, nd_Eql, nd_Neq, nd_And, nd_Or = range(25)
# must have same order as above
Tokens = [
["EOI" , False, False, False, -1, -1 ],
["*" , False, True, False, 13, nd_Mul ],
["/" , False, True, False, 13, nd_Div ],
["%" , False, True, False, 13, nd_Mod ],
["+" , False, True, False, 12, nd_Add ],
["-" , False, True, False, 12, nd_Sub ],
["-" , False, False, True, 14, nd_Negate ],
["!" , False, False, True, 14, nd_Not ],
["<" , False, True, False, 10, nd_Lss ],
["<=" , False, True, False, 10, nd_Leq ],
[">" , False, True, False, 10, nd_Gtr ],
[">=" , False, True, False, 10, nd_Geq ],
["==" , False, True, False, 9, nd_Eql ],
["!=" , False, True, False, 9, nd_Neq ],
["=" , False, False, False, -1, nd_Assign ],
["&&" , False, True, False, 5, nd_And ],
["||" , False, True, False, 4, nd_Or ],
["if" , False, False, False, -1, nd_If ],
["else" , False, False, False, -1, -1 ],
["while" , False, False, False, -1, nd_While ],
["print" , False, False, False, -1, -1 ],
["putc" , False, False, False, -1, -1 ],
["(" , False, False, False, -1, -1 ],
[")" , False, False, False, -1, -1 ],
["{" , False, False, False, -1, -1 ],
["}" , False, False, False, -1, -1 ],
[";" , False, False, False, -1, -1 ],
["," , False, False, False, -1, -1 ],
["Ident" , False, False, False, -1, nd_Ident ],
["Integer literal" , False, False, False, -1, nd_Integer],
["String literal" , False, False, False, -1, nd_String ]
]
all_syms = {"End_of_input" : tk_EOI, "Op_multiply" : tk_Mul,
"Op_divide" : tk_Div, "Op_mod" : tk_Mod,
"Op_add" : tk_Add, "Op_subtract" : tk_Sub,
"Op_negate" : tk_Negate, "Op_not" : tk_Not,
"Op_less" : tk_Lss, "Op_lessequal" : tk_Leq,
"Op_greater" : tk_Gtr, "Op_greaterequal": tk_Geq,
"Op_equal" : tk_Eql, "Op_notequal" : tk_Neq,
"Op_assign" : tk_Assign, "Op_and" : tk_And,
"Op_or" : tk_Or, "Keyword_if" : tk_If,
"Keyword_else" : tk_Else, "Keyword_while" : tk_While,
"Keyword_print" : tk_Print, "Keyword_putc" : tk_Putc,
"LeftParen" : tk_Lparen, "RightParen" : tk_Rparen,
"LeftBrace" : tk_Lbrace, "RightBrace" : tk_Rbrace,
"Semicolon" : tk_Semi, "Comma" : tk_Comma,
"Identifier" : tk_Ident, "Integer" : tk_Integer,
"String" : tk_String}
Display_nodes = ["Identifier", "String", "Integer", "Sequence", "If", "Prtc", "Prts",
"Prti", "While", "Assign", "Negate", "Not", "Multiply", "Divide", "Mod", "Add",
"Subtract", "Less", "LessEqual", "Greater", "GreaterEqual", "Equal", "NotEqual",
"And", "Or"]
TK_NAME = 0
TK_RIGHT_ASSOC = 1
TK_IS_BINARY = 2
TK_IS_UNARY = 3
TK_PRECEDENCE = 4
TK_NODE = 5
input_file = None
err_line = None
err_col = None
tok = None
tok_text = None
#*** show error and exit
def error(msg):
print("(%d, %d) %s" % (int(err_line), int(err_col), msg))
exit(1)
#***
def gettok():
global err_line, err_col, tok, tok_text, tok_other
line = input_file.readline()
if len(line) == 0:
error("empty line")
line_list = shlex.split(line, False, False)
# line col Ident var_name
# 0 1 2 3
err_line = line_list[0]
err_col = line_list[1]
tok_text = line_list[2]
tok = all_syms.get(tok_text)
if tok == None:
error("Unknown token %s" % (tok_text))
tok_other = None
if tok in [tk_Integer, tk_Ident, tk_String]:
tok_other = line_list[3]
class Node:
def __init__(self, node_type, left = None, right = None, value = None):
self.node_type = node_type
self.left = left
self.right = right
self.value = value
#***
def make_node(oper, left, right = None):
return Node(oper, left, right)
#***
def make_leaf(oper, n):
return Node(oper, value = n)
#***
def expect(msg, s):
if tok == s:
gettok()
return
error("%s: Expecting '%s', found '%s'" % (msg, Tokens[s][TK_NAME], Tokens[tok][TK_NAME]))
#***
def expr(p):
x = None
if tok == tk_Lparen:
x = paren_expr()
elif tok in [tk_Sub, tk_Add]:
op = (tk_Negate if tok == tk_Sub else tk_Add)
gettok()
node = expr(Tokens[tk_Negate][TK_PRECEDENCE])
x = (make_node(nd_Negate, node) if op == tk_Negate else node)
elif tok == tk_Not:
gettok()
x = make_node(nd_Not, expr(Tokens[tk_Not][TK_PRECEDENCE]))
elif tok == tk_Ident:
x = make_leaf(nd_Ident, tok_other)
gettok()
elif tok == tk_Integer:
x = make_leaf(nd_Integer, tok_other)
gettok()
else:
error("Expecting a primary, found: %s" % (Tokens[tok][TK_NAME]))
while Tokens[tok][TK_IS_BINARY] and Tokens[tok][TK_PRECEDENCE] >= p:
op = tok
gettok()
q = Tokens[op][TK_PRECEDENCE]
if not Tokens[op][TK_RIGHT_ASSOC]:
q += 1
node = expr(q)
x = make_node(Tokens[op][TK_NODE], x, node)
return x
#***
def paren_expr():
expect("paren_expr", tk_Lparen)
node = expr(0)
expect("paren_expr", tk_Rparen)
return node
#***
def stmt():
t = None
if tok == tk_If:
gettok()
e = paren_expr()
s = stmt()
s2 = None
if tok == tk_Else:
gettok()
s2 = stmt()
t = make_node(nd_If, e, make_node(nd_If, s, s2))
elif tok == tk_Putc:
gettok()
e = paren_expr()
t = make_node(nd_Prtc, e)
expect("Putc", tk_Semi)
elif tok == tk_Print:
gettok()
expect("Print", tk_Lparen)
while True:
if tok == tk_String:
e = make_node(nd_Prts, make_leaf(nd_String, tok_other))
gettok()
else:
e = make_node(nd_Prti, expr(0))
t = make_node(nd_Sequence, t, e)
if tok != tk_Comma:
break
gettok()
expect("Print", tk_Rparen)
expect("Print", tk_Semi)
elif tok == tk_Semi:
gettok()
elif tok == tk_Ident:
v = make_leaf(nd_Ident, tok_other)
gettok()
expect("assign", tk_Assign)
e = expr(0)
t = make_node(nd_Assign, v, e)
expect("assign", tk_Semi)
elif tok == tk_While:
gettok()
e = paren_expr()
s = stmt()
t = make_node(nd_While, e, s)
elif tok == tk_Lbrace:
gettok()
while tok != tk_Rbrace and tok != tk_EOI:
t = make_node(nd_Sequence, t, stmt())
expect("Lbrace", tk_Rbrace)
elif tok == tk_EOI:
pass
else:
error("Expecting start of statement, found: %s" % (Tokens[tok][TK_NAME]))
return t
#***
def parse():
t = None
gettok()
while True:
t = make_node(nd_Sequence, t, stmt())
if tok == tk_EOI or t == None:
break
return t
def prt_ast(t):
if t == None:
print(";")
else:
print("%-14s" % (Display_nodes[t.node_type]), end='')
if t.node_type in [nd_Ident, nd_Integer]:
print("%s" % (t.value))
elif t.node_type == nd_String:
print("%s" %(t.value))
else:
print("")
prt_ast(t.left)
prt_ast(t.right)
#*** main driver
input_file = sys.stdin
if len(sys.argv) > 1:
try:
input_file = open(sys.argv[1], "r", 4096)
except IOError as e:
        error("Can't open %s" % sys.argv[1])
t = parse()
prt_ast(t)
- Output — prime numbers AST:
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier    count
Integer       1
Assign
Identifier    n
Integer       1
Assign
Identifier    limit
Integer       100
While
Less
Identifier    n
Identifier    limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier    k
Integer       3
Assign
Identifier    p
Integer       1
Assign
Identifier    n
Add
Identifier    n
Integer       2
While
And
LessEqual
Multiply
Identifier    k
Identifier    k
Identifier    n
Identifier    p
Sequence
Sequence
;
Assign
Identifier    p
NotEqual
Multiply
Divide
Identifier    n
Identifier    k
Identifier    k
Identifier    n
Assign
Identifier    k
Add
Identifier    k
Integer       2
If
Identifier    p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier    n
;
Prts
String        " is prime\n"
;
Assign
Identifier    count
Add
Identifier    count
Integer       1
;
Sequence
Sequence
Sequence
;
Prts
String        "Total primes found: "
;
Prti
Identifier    count
;
Prts
String        "\n"
;
RATFOR
FORTRAN 77 is a non-recursive language, in the specific sense that it does not support recursive calls. What is missing is simple: there is no way to specify that a value should go onto a call stack. Local variables were all treated by compilers more or less as what C programmers would call "static", and subprogram parameters were all passed by reference, rather than by value as in C.
However, it is perfectly possible to implement a recursive language in FORTRAN 77 and do the programming in that.
Which is what I do here: the recursive algorithm of the parser pseudocode is implemented in a tiny, FORTH-like "language" specific to the task. The parser code, that is, is not written directly in Ratfor; it is written in that tiny "language" and interpreted by a Ratfor subroutine.
Printing the abstract syntax tree is done with a quite ordinary non-recursive tree traversal written directly in Ratfor.
There is no paradox in the notion of doing recursive programming within a Ratfor program by the method described above. All the recursion is at a higher level of abstraction than the Ratfor programming itself. If you examine the Ratfor code as Ratfor code, there is not a single recursive call.
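The explicit-stack idea can be sketched in a few lines of Python: a preorder walk of an AST carried out with a hand-managed stack instead of host-language recursion. This is a minimal illustration of the technique, not a translation of the FORTH-like interpreter; the node representation (["Name", left, right] lists, [] for nil) is invented for the sketch.

```python
# Flatten a ["Node", left, right] AST in preorder without using the
# host language's call stack: pending subtrees live on an explicit
# list, mimicking what the Ratfor interpreter does with its own stacks.
def flatten(ast):
    out = []
    stack = [ast]          # explicit stack replaces recursive calls
    while stack:
        node = stack.pop()
        if not node:       # empty list stands for a nil child
            out.append(";")
        elif node[0] in ("Identifier", "Integer", "String"):
            out.append(f"{node[0]} {node[1]}")
        else:
            out.append(node[0])
            # push right first so the left child is processed first
            stack.append(node[2])
            stack.append(node[1])
    return out

ast = ["Assign", ["Identifier", "count"], ["Integer", 1]]
print(flatten(ast))   # -> ['Assign', 'Identifier count', 'Integer 1']
```

The same transformation works for any pair of mutually recursive procedures such as stmt and expr: each "call" becomes a push, each "return" a pop, which is exactly the shape of the interpreter's execution stack below.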
######################################################################
#
# The Rosetta Code parser in Ratfor 77.
#
#
# Ratfor 77 is a preprocessor for FORTRAN 77; therefore we do not have
# recursive calls available. For printing the flattened tree, I use an
# ordinary non-recursive implementation of the tree traversal. The
# mutually recursive parser itself is more difficult to handle; for
# that, I implement a tiny, FORTH-like token processor that supports
# recursive calls.
#
# How to deal with input is another problem. I use formatted input,
# treating each line as a (regrettably fixed length) array of type
# CHARACTER. It is a very simple method, and leaves the input in a
# form convenient for the necessary processing (given that the input
# is not formatted in columns).
#
#
# On a POSIX platform, the program can be compiled with f2c and run
# somewhat as follows:
#
# ratfor77 parse-in-ratfor.r > parse-in-ratfor.f
# f2c -C -Nc40 parse-in-ratfor.f
# cc parse-in-ratfor.c -lf2c
# ./a.out < compiler-tests/primes.lex
#
# With gfortran, a little differently:
#
# ratfor77 parse-in-ratfor.r > parse-in-ratfor.f
# gfortran -fcheck=all -std=legacy parse-in-ratfor.f
# ./a.out < compiler-tests/primes.lex
#
#
# I/O is strictly from default input and to default output, which, on
# POSIX systems, usually correspond respectively to standard input and
# standard output.
#
#---------------------------------------------------------------------
# Parameters that you can adjust.
define(LINESZ, 256) # Size of an input line.
define(STRNSZ, 4096) # Size of the string pool.
define(NODSSZ, 4096) # Size of the nodes pool.
define(DSTKSZ, 4096) # Size of the data stack.
define(PSTKSZ, 4096) # Size of the precedence stack.
define(XSTKSZ, 4096) # Size of the execution stack.
#---------------------------------------------------------------------
define(TOKSZ, 5) # Size of a lexical token, in integers.
define(ILN, 1) # Index for line number.
define(ICN, 2) # Index for column number.
define(ITK, 3) # Index for token number.
define(ITV, 4) # Index for the string pool index of the token value.
define(ITN, 5) # Index for the length of the token value.
define(NODESZ, 3)
define(NTAG, 1) # Index for the tag.
# For an internal node --
define(NLEFT, 2) # Index for the left node.
define(NRIGHT, 3) # Index for the right node.
# For a leaf node --
define(NITV, 2) # Index for the string pool index.
define(NITN, 3) # Length of the value.
define(NIL, -1) # Nil node.
define(TKELSE, 0)
define(TKIF, 1)
define(TKPRNT, 2)
define(TKPUTC, 3)
define(TKWHIL, 4)
define(TKMUL, 5)
define(TKDIV, 6)
define(TKMOD, 7)
define(TKADD, 8)
define(TKSUB, 9)
define(TKNEG, 10)
define(TKLT, 11)
define(TKLE, 12)
define(TKGT, 13)
define(TKGE, 14)
define(TKEQ, 15)
define(TKNE, 16)
define(TKNOT, 17)
define(TKASGN, 18)
define(TKAND, 19)
define(TKOR, 20)
define(TKLPAR, 21)
define(TKRPAR, 22)
define(TKLBRC, 23)
define(TKRBRC, 24)
define(TKSEMI, 25)
define(TKCMMA, 26)
define(TKID, 27)
define(TKINT, 28)
define(TKSTR, 29)
define(TKEOI, 30)
define(NDID, 0)
define(NDSTR, 1)
define(NDINT, 2)
define(NDSEQ, 3)
define(NDIF, 4)
define(NDPRTC, 5)
define(NDPRTS, 6)
define(NDPRTI, 7)
define(NDWHIL, 8)
define(NDASGN, 9)
define(NDNEG, 10)
define(NDNOT, 11)
define(NDMUL, 12)
define(NDDIV, 13)
define(NDMOD, 14)
define(NDADD, 15)
define(NDSUB, 16)
define(NDLT, 17)
define(NDLE, 18)
define(NDGT, 19)
define(NDGE, 20)
define(NDEQ, 21)
define(NDNE, 22)
define(NDAND, 23)
define(NDOR, 24)
subroutine string (src, isrc, nsrc, strngs, istrng, i, n)
# Store a string in the string pool.
implicit none
character src(*) # Source string.
integer isrc, nsrc # Index and length of the source substring.
character strngs(STRNSZ) # The string pool.
integer istrng # The string pool's next slot.
integer i, n # Index and length within the string pool.
integer j
if (STRNSZ < istrng + nsrc)
{
write (*, '(''string pool exhausted'')')
stop
}
for (j = 0; j < nsrc; j = j + 1)
strngs(istrng + j) = src(isrc + j)
i = istrng
n = nsrc
istrng = istrng + nsrc
end
subroutine astnod (node, nodes, inodes, i)
# Store a node in the nodes pool.
implicit none
integer node(NODESZ)
integer nodes(NODESZ, NODSSZ)
integer inodes
integer i
integer j
if (NODSSZ < inodes + 1)
{
write (*, '(''node pool exhausted'')')
stop
}
i = inodes
inodes = inodes + 1
for (j = 1; j <= NODESZ; j = j + 1)
nodes(j, i) = node(j)
end
function issp (c)
# Is a character a space character?
implicit none
character c
logical issp
integer ic
ic = ichar (c)
issp = (ic == 32 || (9 <= ic && ic <= 13))
end
function skipsp (str, i, imax)
# Skip past spaces in a string.
implicit none
character str(*)
integer i
integer imax
integer skipsp
logical issp
logical done
skipsp = i
done = .false.
while (!done)
{
if (imax <= skipsp)
done = .true.
else if (!issp (str(skipsp)))
done = .true.
else
skipsp = skipsp + 1
}
end
function skipns (str, i, imax)
# Skip past non-spaces in a string.
implicit none
character str(*)
integer i
integer imax
integer skipns
logical issp
logical done
skipns = i
done = .false.
while (!done)
{
if (imax <= skipns)
done = .true.
else if (issp (str(skipns)))
done = .true.
else
skipns = skipns + 1
}
end
function trimrt (str, n)
# Find the length of a string, if one ignores trailing spaces.
implicit none
character str(*)
integer n
integer trimrt
logical issp
logical done
trimrt = n
done = .false.
while (!done)
{
if (trimrt == 0)
done = .true.
else if (!issp (str(trimrt)))
done = .true.
else
trimrt = trimrt - 1
}
end
function mktok (str, i, n)
# Convert a substring to a token integer.
implicit none
character str(*)
integer i
integer n
integer mktok
character*16 tokstr(0:30)
character*16 test
integer j
logical done
data tokstr / 'Keyword_else ', _
'Keyword_if ', _
'Keyword_print ', _
'Keyword_putc ', _
'Keyword_while ', _
'Op_multiply ', _
'Op_divide ', _
'Op_mod ', _
'Op_add ', _
'Op_subtract ', _
'Op_negate ', _
'Op_less ', _
'Op_lessequal ', _
'Op_greater ', _
'Op_greaterequal ', _
'Op_equal ', _
'Op_notequal ', _
'Op_not ', _
'Op_assign ', _
'Op_and ', _
'Op_or ', _
'LeftParen ', _
'RightParen ', _
'LeftBrace ', _
'RightBrace ', _
'Semicolon ', _
'Comma ', _
'Identifier ', _
'Integer ', _
'String ', _
'End_of_input ' /
test = ' '
for (j = 0; j < n; j = j + 1)
test(j + 1 : j + 1) = str(i + j)
j = 0
done = .false.
while (!done)
{
if (TKEOI < j)
{
write (*, '(''unrecognized token'')')
stop
}
else if (test == tokstr(j))
done = .true.
else
j = j + 1
}
mktok = j
end
function mkint (str, i, n)
# Convert an unsigned integer substring to an integer.
implicit none
character str(*)
integer i
integer n
integer mkint
integer j
mkint = 0
for (j = 0; j < n; j = j + 1)
mkint = (10 * mkint) + (ichar (str(i + j)) - 48)
end
subroutine rdtok (strngs, istrng, blank, linno, colno, tokno, _
itkval, ntkval)
# Read a token from the input.
implicit none
character strngs(STRNSZ) # The string pool.
integer istrng # The string pool's next slot.
logical blank # Is the line blank?
integer linno # The line number.
integer colno # The column number.
integer tokno # The token number.
integer itkval, ntkval # Token value as a string.
integer skipsp, skipns, trimrt
integer mkint, mktok
character line(LINESZ)
character*20 fmt
integer n, i, j
# Read a line of text as an array of characters.
write (fmt, '(''('', I10, ''A1)'')') LINESZ
read (*, fmt) line
n = trimrt (line, LINESZ)
blank = (n == 0)
if (!blank)
{
i = skipsp (line, 1, n + 1)
j = skipns (line, i, n + 1)
linno = mkint (line, i, j - i)
i = skipsp (line, j, n + 1)
j = skipns (line, i, n + 1)
colno = mkint (line, i, j - i)
i = skipsp (line, j, n + 1)
j = skipns (line, i, n + 1)
tokno = mktok (line, i, j - i)
i = skipsp (line, j, n + 1)
j = n + 1
call string (line, i, j - i, strngs, istrng, itkval, ntkval)
}
end
subroutine gettok (strngs, istrng, tok)
# Get the next token.
implicit none
character strngs(STRNSZ) # The string pool.
integer istrng # The string pool's next slot.
integer tok(TOKSZ)
integer linno, colno, tokno, itkval, ntkval
logical blank
blank = .true.
while (blank)
call rdtok (strngs, istrng, blank, linno, colno, tokno, _
itkval, ntkval)
tok(ILN) = linno
tok(ICN) = colno
tok(ITK) = tokno
tok(ITV) = itkval
tok(ITN) = ntkval
end
function accept (strngs, istrng, curtok, tokno)
implicit none
character strngs(STRNSZ) # The string pool.
integer istrng # The string pool's next slot.
integer curtok(TOKSZ)
integer tokno
logical accept
accept = (curtok(ITK) == tokno)
if (accept)
call gettok (strngs, istrng, curtok)
end
subroutine expect (strngs, istrng, curtok, tokno)
implicit none
character strngs(STRNSZ) # The string pool.
integer istrng # The string pool's next slot.
integer curtok(TOKSZ)
integer tokno
logical accept
if (!accept (strngs, istrng, curtok, tokno))
{
# This is not the same message as printed by the reference C
# implementation. You can change that, if you wish.
write (*, 100) curtok(ILN), curtok(ICN)
100 format ('unexpected token at line ', I5, ', column ', I5)
stop
}
end
function prec (tokno)
# Precedence.
implicit none
integer tokno
integer prec
if (tokno == TKMUL || tokno == TKDIV || tokno == TKMOD)
prec = 13
else if (tokno == TKADD || tokno == TKSUB)
prec = 12
else if (tokno == TKNEG || tokno == TKNOT)
prec = 14
else if (tokno == TKLT || tokno == TKLE || _
tokno == TKGT || tokno == TKGE)
prec = 10
else if (tokno == TKEQ || tokno == TKNE)
prec = 9
else if (tokno == TKAND)
prec = 5
else if (tokno == TKOR)
prec = 4
else
prec = -1
end
function isbin (tokno)
# Is an operation binary?
implicit none
integer tokno
logical isbin
isbin = (tokno == TKADD || _
tokno == TKSUB || _
tokno == TKMUL || _
tokno == TKDIV || _
tokno == TKMOD || _
tokno == TKLT || _
tokno == TKLE || _
tokno == TKGT || _
tokno == TKGE || _
tokno == TKEQ || _
tokno == TKNE || _
tokno == TKAND || _
tokno == TKOR)
end
function rtassc (tokno)
# Is an operation right associative?
implicit none
integer tokno
logical rtassc
# None of the current operators is right associative.
rtassc = .false.
end
function opernt (tokno)
# Return the node tag for a binary operator.
implicit none
integer tokno
integer opernt
if (tokno == TKMUL)
opernt = NDMUL
else if (tokno == TKDIV)
opernt = NDDIV
else if (tokno == TKMOD)
opernt = NDMOD
else if (tokno == TKADD)
opernt = NDADD
else if (tokno == TKSUB)
opernt = NDSUB
else if (tokno == TKNEG)
opernt = NDNEG
else if (tokno == TKNOT)
opernt = NDNOT
else if (tokno == TKLT)
opernt = NDLT
else if (tokno == TKLE)
opernt = NDLE
else if (tokno == TKGT)
opernt = NDGT
else if (tokno == TKGE)
opernt = NDGE
else if (tokno == TKEQ)
opernt = NDEQ
else if (tokno == TKNE)
opernt = NDNE
else if (tokno == TKAND)
opernt = NDAND
else if (tokno == TKOR)
opernt = NDOR
else
{
write (*, '(''unrecognized binary operator'')')
stop
}
end
#---------------------------------------------------------------------
subroutine prtast (strngs, nodes, i, dstack)
# Print a tree in flattened format.
implicit none
character strngs(*)
integer nodes(NODESZ, NODSSZ)
integer i
integer dstack(DSTKSZ)
integer j
integer k
integer n
integer q, r
integer tag
character*80 fmt
dstack(1) = i
j = 2
while (j != 1)
{
j = j - 1
k = dstack(j)
if (k < 1)
write (*, '('';'')')
else
{
tag = nodes(NTAG, k)
if (tag == NDID)
{
n = nodes(NITN, k)
write (fmt, '(''("Identifier ", '', I5, ''A)'')') n
q = nodes(NITV, k)
write (*, fmt) (strngs(r), r = q, q + n - 1)
}
else if (tag == NDINT)
{
n = nodes(NITN, k)
write (fmt, '(''("Integer ", '', I5, ''A)'')') n
q = nodes(NITV, k)
write (*, fmt) (strngs(r), r = q, q + n - 1)
}
else if (tag == NDSTR)
{
n = nodes(NITN, k)
write (fmt, '(''("String ", '', I5, ''A)'')') n
q = nodes(NITV, k)
write (*, fmt) (strngs(r), r = q, q + n - 1)
}
else
{
if (tag == NDSEQ)
write (*, '(''Sequence'')')
else if (tag == NDIF)
write (*, '(''If'')')
else if (tag == NDPRTC)
write (*, '(''Prtc'')')
else if (tag == NDPRTS)
write (*, '(''Prts'')')
else if (tag == NDPRTI)
write (*, '(''Prti'')')
else if (tag == NDWHIL)
write (*, '(''While'')')
else if (tag == NDASGN)
write (*, '(''Assign'')')
else if (tag == NDNEG)
write (*, '(''Negate'')')
else if (tag == NDNOT)
write (*, '(''Not'')')
else if (tag == NDMUL)
write (*, '(''Multiply'')')
else if (tag == NDDIV)
write (*, '(''Divide'')')
else if (tag == NDMOD)
write (*, '(''Mod'')')
else if (tag == NDADD)
write (*, '(''Add'')')
else if (tag == NDSUB)
write (*, '(''Subtract'')')
else if (tag == NDLT)
write (*, '(''Less'')')
else if (tag == NDLE)
write (*, '(''LessEqual'')')
else if (tag == NDGT)
write (*, '(''Greater'')')
else if (tag == NDGE)
write (*, '(''GreaterEqual'')')
else if (tag == NDEQ)
write (*, '(''Equal'')')
else if (tag == NDNE)
write (*, '(''NotEqual'')')
else if (tag == NDAND)
write (*, '(''And'')')
else if (tag == NDOR)
write (*, '(''Or'')')
else
{
write (*, '(''unrecognized node type'')')
stop
}
if (DSTKSZ - 2 < j)
{
write (*, '(''node stack overflow'')')
stop
}
dstack(j) = nodes(NRIGHT, k)
dstack(j + 1) = nodes(NLEFT, k)
j = j + 2
}
}
}
end
#---------------------------------------------------------------------
# A tiny recursive language. Each instruction is two integers,
# although the second integer may be XPAD. XLOCs are named by
# integers.
define(XPAD, 0) # "Padding"
define(XLOC, 10) # "Jump or call location"
define(XJUMP, 20) # "Jump to a place"
define(XJUMPT, 30) # "Jump to a place, if true"
define(XJUMPF, 40) # "Jump to a place, if false"
define(XCALL, 50) # "Call a subprogram"
define(XRET, 60) # "Return from a subprogram"
define(XPUSH, 110) # "Push an immediate value"
define(XSWAP, 120) # "Swap top two stack entries"
define(XLT, 200) # "Less than?"
define(XADDI, 210) # "Add immediate."
define(XPPUSH, 610) # "Push top to precedence stack"
define(XPCOPY, 620) # "Copy top of prec stack to top"
define(XPDROP, 630) # "Drop top of precedence stack"
define(XGETTK, 710) # "Get the next token"
define(XTOKEQ, 720) # "Token equals the argument?"
define(XEXPCT, 730) # "Expect token"
define(XACCPT, 740) # "Accept token"
define(XTOK, 810) # "Push the token number"
define(XBINOP, 820) # "Is top a binary operator?"
define(XRASSC, 830) # "Is top a right associative op?"
define(XPREC, 840) # "Precedence of token no. on top"
define(XOPER, 850) # "Operator for token no. on top"
define(XINTND, 970) # "Make internal node"
define(XOPND, 975) # "Make internal node for operator"
define(XLEFND, 980) # "Make leaf node"
define(XNILND, 985) # "Make nil node"
define(XERROR, 1010) # "Error"
define(XRWARN, 1020) # "Unused right associative branch"
define(XPING, 2010) # Print a ping message (for debugging)
define(XPRTND, 2020) # Print node at stack top (for debugging)
define(XPRTTP, 2030) # Print stack top as integer (for debugging)
define(XPRTTK, 2040) # Print the current token (for debugging)
define(XPRTP, 2050) # Print the current precedence (for debugging)
define(XPRTST, 2060) # Print the whole data stack (for debugging)
# Call and jump locations in our program:
define(CSTMT, 1000) # stmt
define(STMT01, 1010)
define(STMT02, 1020)
define(STMT03, 1030)
define(STMT04, 1040)
define(STMT05, 1050)
define(STMT06, 1060)
define(STMT07, 1070)
define(STMT08, 1080)
define(STMT09, 1090)
define(STMT10, 1100)
define(STMT11, 1110)
define(STMT12, 1120)
define(STMT13, 1130)
define(STMT14, 1140)
define(STMT15, 1150)
define(CPEXPR, 2000) # paren_expr
define(CEXPR, 3000) # expr
define(EXPR01, 3010)
define(EXPR02, 3020)
define(EXPR03, 3030)
define(EXPR04, 3040)
define(EXPR05, 3050)
define(EXPR06, 3060)
define(EXPR10, 3100)
define(EXPR11, 3110)
define(EXPR12, 3120)
define(EXPR13, 3130)
define(PARS01, 4010) # parse
# Error numbers.
define(EXSTMT, 100) # "expecting start of statement"
define(EXPRIM, 200) # "expecting a primary"
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
subroutine ld(code, i, instr1, instr2)
implicit none
integer code(*)
integer i
integer instr1, instr2
code(i) = instr1
code(i + 1) = instr2
i = i + 2
end
subroutine ldcode (code)
# Load the code that is in the recursive language. The array
# allocated to hold the code must be large enough, but we do not
# check.
implicit none
integer code(*)
integer i
i = 1
#--------------------------------------------------
# The main loop.
call ld (code, i, XNILND, XPAD) # Nil node for start of sequence.
call ld (code, i, XGETTK, XPAD)
call ld (code, i, XLOC, PARS01) # Top of loop
call ld (code, i, XCALL, CSTMT)
call ld (code, i, XINTND, NDSEQ)
call ld (code, i, XTOKEQ, TKEOI) # End_of_input
call ld (code, i, XJUMPF, PARS01) # Loop unless end of input.
call ld (code, i, XRET, XPAD)
#--------------------------------------------------
call ld (code, i, XLOC, CEXPR) # Start of "expr"
call ld (code, i, XPPUSH, XPAD) # Push the precedence argument.
call ld (code, i, XTOKEQ, TKLPAR) # LeftParen
call ld (code, i, XJUMPF, EXPR01)
# "( ... )"
call ld (code, i, XCALL, CPEXPR)
call ld (code, i, XJUMP, EXPR10)
call ld (code, i, XLOC, EXPR01)
call ld (code, i, XACCPT, TKSUB) # Op_subtract
call ld (code, i, XJUMPF, EXPR02)
# Unary minus
call ld (code, i, XPUSH, TKNEG)
call ld (code, i, XPREC, XPAD)
call ld (code, i, XCALL, CEXPR) # expr <--
call ld (code, i, XNILND, XPAD) # expr nil <--
call ld (code, i, XINTND, NDNEG)
call ld (code, i, XJUMP, EXPR10)
call ld (code, i, XLOC, EXPR02)
call ld (code, i, XACCPT, TKADD) # Op_add
call ld (code, i, XJUMPF, EXPR03)
# Unary plus
call ld (code, i, XPUSH, TKNEG)
call ld (code, i, XPREC, XPAD)
call ld (code, i, XCALL, CEXPR) # expr <--
call ld (code, i, XJUMP, EXPR10)
call ld (code, i, XLOC, EXPR03)
call ld (code, i, XACCPT, TKNOT) # Op_not
call ld (code, i, XJUMPF, EXPR04)
# "!"
call ld (code, i, XPUSH, TKNOT)
call ld (code, i, XPREC, XPAD)
call ld (code, i, XCALL, CEXPR) # expr <--
call ld (code, i, XNILND, XPAD) # expr nil <--
call ld (code, i, XINTND, NDNOT)
call ld (code, i, XJUMP, EXPR10)
call ld (code, i, XLOC, EXPR04)
call ld (code, i, XTOKEQ, TKID) # Identifier
call ld (code, i, XJUMPF, EXPR05)
# Identifier
call ld (code, i, XLEFND, NDID)
call ld (code, i, XGETTK, XPAD)
call ld (code, i, XJUMP, EXPR10)
call ld (code, i, XLOC, EXPR05)
call ld (code, i, XTOKEQ, TKINT) # Integer
call ld (code, i, XJUMPF, EXPR06)
# Integer.
call ld (code, i, XLEFND, NDINT)
call ld (code, i, XGETTK, XPAD)
call ld (code, i, XJUMP, EXPR10)
call ld (code, i, XLOC, EXPR06)
call ld (code, i, XERROR, EXPRIM)
call ld (code, i, XLOC, EXPR10) # Top of precedence climbing loop
call ld (code, i, XTOK, XPAD)
call ld (code, i, XBINOP, XPAD)
call ld (code, i, XJUMPF, EXPR11) # Exit loop, if not a binary op.
call ld (code, i, XTOK, XPAD)
call ld (code, i, XPREC, XPAD) # curtok_prec <--
call ld (code, i, XPCOPY, XPAD) # curtok_prec p <--
call ld (code, i, XLT, XPAD) # (curtok_prec < p)? <--
call ld (code, i, XJUMPT, EXPR11) # Exit loop if true.
call ld (code, i, XTOK, XPAD)
call ld (code, i, XOPER, XPAD) # x op <--
call ld (code, i, XSWAP, XPAD) # op x <--
call ld (code, i, XTOK, XPAD)
call ld (code, i, XRASSC, XPAD)
call ld (code, i, XJUMPT, EXPR12)
# Left associative.
call ld (code, i, XTOK, XPAD)
call ld (code, i, XPREC, XPAD)
call ld (code, i, XADDI, 1) # op x q:=(q + 1) <--
call ld (code, i, XJUMP, EXPR13)
call ld (code, i, XLOC, EXPR12)
# Right associative. (Currently an unused branch.)
call ld (code, i, XRWARN, XPAD) # Warn about unused branch.
call ld (code, i, XTOK, XPAD)
call ld (code, i, XPREC, XPAD) # op x q <--
call ld (code, i, XLOC, EXPR13)
call ld (code, i, XGETTK, XPAD)
call ld (code, i, XCALL, CEXPR) # op x expr(q) <--
call ld (code, i, XOPND, XPAD) # new_x <--
call ld (code, i, XJUMP, EXPR10) # Continue looping.
call ld (code, i, XLOC, EXPR11) # Loop exit.
call ld (code, i, XPDROP, XPAD) # Drop the precedence argument.
call ld (code, i, XRET, XPAD) # End of "expr"
#--------------------------------------------------
call ld (code, i, XLOC, CPEXPR) # Start of "paren_expr"
call ld (code, i, XEXPCT, TKLPAR)
call ld (code, i, XPUSH, 0)
call ld (code, i, XCALL, CEXPR)
call ld (code, i, XEXPCT, TKRPAR)
call ld (code, i, XRET, XPAD)
#--------------------------------------------------
call ld (code, i, XLOC, CSTMT) # Start of "stmt"
call ld (code, i, XACCPT, TKIF) # Keyword_if
call ld (code, i, XJUMPF, STMT01)
# "if (...) then ... else ..."
call ld (code, i, XCALL, CPEXPR) # Get the paren expr ("if (...)").
call ld (code, i, XCALL, CSTMT) # Get the "then" clause.
call ld (code, i, XACCPT, TKELSE) # Keyword_else
call ld (code, i, XJUMPF, STMT02)
call ld (code, i, XCALL, CSTMT) # Get the "else" clause.
call ld (code, i, XJUMP, STMT03)
call ld (code, i, XLOC, STMT02)
call ld (code, i, XNILND, XPAD) # The "else" statement is nil.
call ld (code, i, XLOC, STMT03)
call ld (code, i, XINTND, NDIF) # ("If" pred ("If" then else))
call ld (code, i, XINTND, NDIF)
call ld (code, i, XRET, XPAD)
call ld (code, i, XLOC, STMT01)
call ld (code, i, XACCPT, TKPUTC) # Keyword_putc
call ld (code, i, XJUMPF, STMT04)
# "putc (...);"
call ld (code, i, XCALL, CPEXPR) # Get the paren expr.
call ld (code, i, XNILND, XPAD)
call ld (code, i, XINTND, NDPRTC) # ("Prtc" expr nil)
call ld (code, i, XEXPCT, TKSEMI) # Expect ";"
call ld (code, i, XRET, XPAD)
call ld (code, i, XLOC, STMT04)
call ld (code, i, XACCPT, TKPRNT) # Keyword_print
call ld (code, i, XJUMPF, STMT05)
# "print(... , ... , ...);"
call ld (code, i, XEXPCT, TKLPAR) # Expect "("
call ld (code, i, XNILND, XPAD) # nil for start of sequence
call ld (code, i, XLOC, STMT08) # Top of loop
call ld (code, i, XTOKEQ, TKSTR)
call ld (code, i, XJUMPT, STMT06)
call ld (code, i, XPUSH, 0)
call ld (code, i, XCALL, CEXPR)
call ld (code, i, XNILND, XPAD)
call ld (code, i, XINTND, NDPRTI) # ("Prti" expr nil)
call ld (code, i, XJUMP, STMT07)
call ld (code, i, XLOC, STMT06)
call ld (code, i, XLEFND, NDSTR)
call ld (code, i, XNILND, XPAD)
call ld (code, i, XINTND, NDPRTS) # ("Prts" ("String" ...) nil)
call ld (code, i, XGETTK, XPAD)
call ld (code, i, XLOC, STMT07)
call ld (code, i, XINTND, NDSEQ) # ("Sequence" ... ...)
call ld (code, i, XACCPT, TKCMMA) # Comma
call ld (code, i, XJUMPT, STMT08) # Loop if comma.
call ld (code, i, XEXPCT, TKRPAR) # Expect ")"
call ld (code, i, XEXPCT, TKSEMI) # Expect ";"
call ld (code, i, XRET, XPAD)
call ld (code, i, XLOC, STMT05)
call ld (code, i, XACCPT, TKSEMI) # Semicolon
call ld (code, i, XJUMPF, STMT09)
# Accept a lone ";".
call ld (code, i, XRET, XPAD)
call ld (code, i, XLOC, STMT09)
call ld (code, i, XTOKEQ, TKID) # Identifier
call ld (code, i, XJUMPF, STMT10)
# "identifier = expr;"
call ld (code, i, XLEFND, NDID) # ("Identifier" ...)
call ld (code, i, XGETTK, XPAD)
call ld (code, i, XEXPCT, TKASGN)
call ld (code, i, XPUSH, 0)
call ld (code, i, XCALL, CEXPR)
call ld (code, i, XINTND, NDASGN) # ("Assign" ("Identifier" ...) expr)
call ld (code, i, XEXPCT, TKSEMI)
call ld (code, i, XRET, XPAD)
call ld (code, i, XLOC, STMT10)
call ld (code, i, XACCPT, TKWHIL) # While
call ld (code, i, XJUMPF, STMT11)
# "while (...) ..."
call ld (code, i, XCALL, CPEXPR)
call ld (code, i, XCALL, CSTMT)
call ld (code, i, XINTND, NDWHIL) # ("While" pred stmt)
call ld (code, i, XRET, XPAD)
call ld (code, i, XLOC, STMT11)
call ld (code, i, XACCPT, TKLBRC) # LeftBrace
call ld (code, i, XJUMPF, STMT12)
# "{ ... }"
call ld (code, i, XNILND, XPAD) # nil for start of sequence
call ld (code, i, XLOC, STMT13) # Top of loop
call ld (code, i, XTOKEQ, TKEOI)
call ld (code, i, XJUMPT, STMT14)
call ld (code, i, XTOKEQ, TKRBRC)
call ld (code, i, XJUMPT, STMT14)
call ld (code, i, XCALL, CSTMT)
call ld (code, i, XINTND, NDSEQ) # ("Sequence" ... ...)
call ld (code, i, XJUMP, STMT13) # Loop again.
call ld (code, i, XLOC, STMT14) # Loop exit
call ld (code, i, XEXPCT, TKRBRC) # Expect "}".
call ld (code, i, XRET, XPAD)
call ld (code, i, XLOC, STMT12)
call ld (code, i, XTOKEQ, TKEOI) # End_of_input
call ld (code, i, XJUMPF, STMT15)
call ld (code, i, XRET, XPAD)
call ld (code, i, XLOC, STMT15)
call ld (code, i, XERROR, EXSTMT) # "expecting start of stmt"
#--------------------------------------------------
end
subroutine dtpush (dstack, idstck, x)
# Push to the data stack.
implicit none
integer dstack(DSTKSZ)
integer idstck
integer x
if (DSTKSZ < idstck)
{
write (*, '(''node stack exhausted'')')
stop
}
dstack(idstck) = x
idstck = idstck + 1
end
function dtpop (dstack, idstck)
# Pop from the data stack.
implicit none
integer dstack(DSTKSZ)
integer idstck
integer dtpop
if (idstck == 1)
{
write (*, '(''node stack underflow'')')
stop
}
idstck = idstck - 1
dtpop = dstack(idstck)
end
subroutine ppush (pstack, ipstck, x)
# Push to the precedence stack.
implicit none
integer pstack(PSTKSZ)
integer ipstck
integer x
if (PSTKSZ < ipstck)
{
write (*, '(''precedence stack exhausted'')')
stop
}
pstack(ipstck) = x
ipstck = ipstck + 1
end
function ppop (pstack, ipstck)
# Pop from the precedence stack.
implicit none
integer pstack(PSTKSZ)
integer ipstck
integer ppop
if (ipstck == 1)
{
write (*, '(''precedence stack underflow'')')
stop
}
ipstck = ipstck - 1
ppop = pstack(ipstck)
end
function ipfind (code, loc)
# Find a location.
implicit none
integer code(*)
integer loc
integer ipfind
integer i
i = 1
while (code(i) != XLOC || code(i + 1) != loc)
i = i + 2
ipfind = i
end
subroutine ippush (xstack, ixstck, ip)
# Push the instruction pointer.
implicit none
integer xstack(XSTKSZ)
integer ixstck
integer ip
if (XSTKSZ < ixstck)
{
write (*, '(''recursive call stack exhausted'')')
stop
}
xstack(ixstck) = ip
ixstck = ixstck + 1
end
function ippop (xstack, ixstck)
# Pop an instruction pointer value.
implicit none
integer xstack(XSTKSZ)
integer ixstck
integer ippop
if (ixstck == 1)
{
write (*, '(''recursive call stack underflow'')')
stop
}
ixstck = ixstck - 1
ippop = xstack(ixstck)
end
function logl2i (u)
# Convert LOGICAL to INTEGER.
implicit none
logical u
integer logl2i
if (u)
logl2i = 1
else
logl2i = 0
end
subroutine recurs (strngs, istrng, _
nodes, inodes, _
dstack, idstck, _
pstack, ipstck, _
xstack, ixstck, _
code, ip)
implicit none
character strngs(STRNSZ) # String pool.
integer istrng # String pool's next slot.
integer nodes(NODESZ, NODSSZ) # Node pool
integer inodes # Node pool's next slot.
integer dstack(DSTKSZ) # Data stack.
integer idstck # Data stack pointer.
integer pstack(PSTKSZ) # Precedence stack.
integer ipstck # Precedence stack pointer.
integer xstack(XSTKSZ) # Execution stack.
integer ixstck # Execution stack pointer.
integer code(*) # Code in the recursive language.
integer ip # Instruction pointer.
integer prec
integer opernt
integer logl2i
integer dtpop
integer ppop
integer ippop
integer ipfind
logical accept
logical isbin
logical rtassc
integer curtok(TOKSZ)
integer node(NODESZ)
integer curprc # Current precedence value.
integer i, j
logical done
curprc = 0
done = .false.
while (.not. done)
{
if (code(ip) == XLOC)
{
ip = ip + 2
}
else if (code(ip) == XJUMP)
{
ip = ipfind (code, code(ip + 1))
}
else if (code(ip) == XJUMPT)
{
i = dtpop (dstack, idstck)
if (i != 0)
ip = ipfind (code, code(ip + 1))
else
ip = ip + 2
}
else if (code(ip) == XJUMPF)
{
i = dtpop (dstack, idstck)
if (i == 0)
ip = ipfind (code, code(ip + 1))
else
ip = ip + 2
}
else if (code(ip) == XCALL)
{
call ippush (xstack, ixstck, ip + 2)
ip = ipfind (code, code(ip + 1))
}
else if (code(ip) == XRET)
{
if (ixstck == 1)
done = .true.
else
ip = ippop (xstack, ixstck)
}
else if (code(ip) == XINTND)
{
node(NRIGHT) = dtpop (dstack, idstck)
node(NLEFT) = dtpop (dstack, idstck)
node(NTAG) = code(ip + 1)
call astnod (node, nodes, inodes, i)
call dtpush (dstack, idstck, i)
ip = ip + 2
}
else if (code(ip) == XOPND)
{
node(NRIGHT) = dtpop (dstack, idstck)
node(NLEFT) = dtpop (dstack, idstck)
node(NTAG) = dtpop (dstack, idstck)
call astnod (node, nodes, inodes, i)
call dtpush (dstack, idstck, i)
ip = ip + 2
}
else if (code(ip) == XLEFND)
{
node(NITV) = curtok(ITV)
node(NITN) = curtok(ITN)
node(NTAG) = code(ip + 1)
call astnod (node, nodes, inodes, i)
call dtpush (dstack, idstck, i)
ip = ip + 2
}
else if (code(ip) == XNILND)
{
call dtpush (dstack, idstck, NIL)
ip = ip + 2
}
else if (code(ip) == XGETTK)
{
call gettok (strngs, istrng, curtok)
ip = ip + 2
}
else if (code(ip) == XTOKEQ)
{
i = logl2i (curtok(ITK) == code(ip + 1))
call dtpush (dstack, idstck, i)
ip = ip + 2
}
else if (code(ip) == XEXPCT)
{
call expect (strngs, istrng, curtok, code(ip + 1))
ip = ip + 2
}
else if (code(ip) == XACCPT)
{
i = logl2i (accept (strngs, istrng, curtok, code(ip + 1)))
call dtpush (dstack, idstck, i)
ip = ip + 2
}
else if (code(ip) == XSWAP)
{
i = dtpop (dstack, idstck)
j = dtpop (dstack, idstck)
call dtpush (dstack, idstck, i)
call dtpush (dstack, idstck, j)
ip = ip + 2
}
else if (code(ip) == XLT)
{
j = dtpop (dstack, idstck)
i = dtpop (dstack, idstck)
call dtpush (dstack, idstck, logl2i (i < j))
ip = ip + 2
}
else if (code(ip) == XADDI)
{
i = dtpop (dstack, idstck)
call dtpush (dstack, idstck, i + code(ip + 1))
ip = ip + 2
}
else if (code(ip) == XPPUSH)
{
i = dtpop (dstack, idstck)
call ppush (pstack, ipstck, i)
ip = ip + 2
}
else if (code(ip) == XPCOPY)
{
i = ppop (pstack, ipstck)
call ppush (pstack, ipstck, i)
call dtpush (dstack, idstck, i)
ip = ip + 2
}
else if (code(ip) == XPDROP)
{
i = ppop (pstack, ipstck)
ip = ip + 2
}
else if (code(ip) == XBINOP)
{
i = dtpop (dstack, idstck)
i = logl2i (isbin (i))
call dtpush (dstack, idstck, i)
ip = ip + 2
}
else if (code(ip) == XRASSC)
{
i = dtpop (dstack, idstck)
i = logl2i (rtassc (i))
call dtpush (dstack, idstck, i)
ip = ip + 2
}
else if (code(ip) == XPREC)
{
i = dtpop (dstack, idstck)
call dtpush (dstack, idstck, prec (i))
ip = ip + 2
}
else if (code(ip) == XOPER)
{
i = dtpop (dstack, idstck)
call dtpush (dstack, idstck, opernt (i))
ip = ip + 2
}
else if (code(ip) == XTOK)
{
call dtpush (dstack, idstck, curtok(ITK))
ip = ip + 2
}
else if (code(ip) == XPUSH)
{
call dtpush (dstack, idstck, code(ip + 1))
ip = ip + 2
}
else if (code(ip) == XERROR)
{
if (code(ip + 1) == EXSTMT)
{
write (*, 1000) curtok(ILN), curtok(ICN)
1000 format ('expected start of statement at line ', _
I5, ', column ', I5)
}
else if (code(ip + 1) == EXPRIM)
{
write (*, 1010) curtok(ILN), curtok(ICN)
1010 format ('expected a primary at line ', _
I5, ', column ', I5)
}
else
{
write (*, 2000) curtok(ILN), curtok(ICN)
2000 format ('syntax error at line ', _
I5, ', column ', I5)
}
stop
}
else if (code(ip) == XRWARN)
{
write (*, 3000)
3000 format ('executing supposedly unused ', _
'"right associative" operator branch')
ip = ip + 2
}
else if (code(ip) == XPING)
{
write (*, '(''ping'')')
ip = ip + 2
}
else if (code(ip) == XPRTND)
{
i = dtpop (dstack, idstck)
call dtpush (dstack, idstck, i)
call prtast (strngs, nodes, i, dstack)
ip = ip + 2
}
else if (code(ip) == XPRTTP)
{
i = dtpop (dstack, idstck)
call dtpush (dstack, idstck, i)
write (*, '(''top = '', I20)') i
ip = ip + 2
}
else if (code(ip) == XPRTTK)
{
write (*, '(''curtok ='', 5(1X, I5))') curtok
ip = ip + 2
}
else if (code(ip) == XPRTP)
{
write (*, '(''curprc = '', I2)') curprc
ip = ip + 2
}
else if (code(ip) == XPRTST)
{
write (*, '(''dstack ='', 100000(1X, I5))') _
(dstack(i), i = 1, idstck - 1)
ip = ip + 2
}
else
{
write (*, '(''illegal instruction'')')
stop
}
}
end
#---------------------------------------------------------------------
program parse
implicit none
character strngs(STRNSZ) # String pool.
integer istrng # String pool's next slot.
integer nodes(NODESZ, NODSSZ) # Node pool
integer inodes # Node pool's next slot.
integer dstack(DSTKSZ) # Node stack.
integer idstck # Node stack pointer.
integer pstack(PSTKSZ) # Precedence stack.
integer ipstck # Precedence stack pointer.
integer xstack(XSTKSZ) # Execution stack.
integer ixstck # Execution stack pointer.
integer code(1000) # Recursive code.
integer ip # Instruction pointer.
integer i
integer dtpop
istrng = 1
inodes = 1
idstck = 1
ipstck = 1
ixstck = 1
call ldcode (code)
ip = 1
call recurs (strngs, istrng, nodes, inodes, _
dstack, idstck, pstack, ipstck, _
xstack, ixstck, code, ip)
i = dtpop (dstack, idstck)
call prtast (strngs, nodes, i, dstack)
end
######################################################################
Output:
To compile and run with gfortran on a POSIX system:
$ ratfor77 parse-in-ratfor.r > parse-in-ratfor.f && gfortran -O2 -fcheck=all -std=legacy parse-in-ratfor.f && ./a.out < compiler-tests/primes.lex
To use f2c instead of gfortran:
$ ratfor77 parse-in-ratfor.r > parse-in-ratfor.f && f2c -C -Nc40 parse-in-ratfor.f && cc -O parse-in-ratfor.c -lf2c && ./a.out < compiler-tests/primes.lex
The output should be:
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier count
Integer 1
Assign
Identifier n
Integer 1
Assign
Identifier limit
Integer 100
While
Less
Identifier n
Identifier limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier k
Integer 3
Assign
Identifier p
Integer 1
Assign
Identifier n
Add
Identifier n
Integer 2
While
And
LessEqual
Multiply
Identifier k
Identifier k
Identifier n
Identifier p
Sequence
Sequence
;
Assign
Identifier p
NotEqual
Multiply
Divide
Identifier n
Identifier k
Identifier k
Identifier n
Assign
Identifier k
Add
Identifier k
Integer 2
If
Identifier p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier n
;
Prts
String " is prime\n"
;
Assign
Identifier count
Add
Identifier count
Integer 1
;
Sequence
Sequence
Sequence
;
Prts
String "Total primes found: "
;
Prti
Identifier count
;
Prts
String "\n"
;
Scala
The complete implementation for the compiler tasks can be found in a GitHub repository at github.com/edadma/rosettacodeCompiler, which includes full unit tests for the samples given in Compiler/Sample programs.
The following code implements a precedence-climbing parser for the output of the lexer, configurable from a symbol map provided as a parameter. The recursive descent statement parser is closely based on the pseudocode given in the task description.
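The core of precedence climbing is a loop that consumes binary operators whose precedence is at least a minimum bound, recursing on the right operand with a raised bound for left-associative operators. A minimal sketch of that loop (in Python, with a hypothetical token format and operator table, not the lexer's format used by the Scala code):

```python
# Hypothetical operator table: operator -> (precedence, right_associative).
PRECEDENCE = {"+": (12, False), "-": (12, False),
              "*": (13, False), "/": (13, False)}

def parse_expr(tokens, pos, min_prec=0):
    """Precedence climbing over a flat token list; returns (ast, next_pos).
    Primaries are bare operand strings; the AST is nested (op, lhs, rhs)."""
    node, pos = tokens[pos], pos + 1            # primary: a bare operand
    while pos < len(tokens) and tokens[pos] in PRECEDENCE:
        op = tokens[pos]
        prec, right = PRECEDENCE[op]
        if prec < min_prec:
            break                               # binds too loosely; stop here
        # Left-associative operators raise the bound so equal-precedence
        # operators are not absorbed into the right-hand side.
        next_min = prec if right else prec + 1
        rhs, pos = parse_expr(tokens, pos + 1, next_min)
        node = (op, node, rhs)
    return node, pos

ast, _ = parse_expr(["1", "+", "2", "*", "3"], 0)
print(ast)  # → ('+', '1', ('*', '2', '3'))
```

The Scala `expression` method below follows the same shape, with the operator table supplied as the symbol map.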
package xyz.hyperreal.rosettacodeCompiler
import scala.io.Source
object SyntaxAnalyzer {
val symbols =
Map[String, (PrefixOperator, InfixOperator)](
"Op_or" -> (null, InfixOperator(10, LeftAssoc, BranchNode("Or", _, _))),
"Op_and" -> (null, InfixOperator(20, LeftAssoc, BranchNode("And", _, _))),
"Op_equal" -> (null, InfixOperator(30, LeftAssoc, BranchNode("Equal", _, _))),
"Op_notequal" -> (null, InfixOperator(30, LeftAssoc, BranchNode("NotEqual", _, _))),
"Op_less" -> (null, InfixOperator(40, LeftAssoc, BranchNode("Less", _, _))),
"Op_lessequal" -> (null, InfixOperator(40, LeftAssoc, BranchNode("LessEqual", _, _))),
"Op_greater" -> (null, InfixOperator(40, LeftAssoc, BranchNode("Greater", _, _))),
"Op_greaterequal" -> (null, InfixOperator(40, LeftAssoc, BranchNode("GreaterEqual", _, _))),
"Op_add" -> (PrefixOperator(30, identity), InfixOperator(50, LeftAssoc, BranchNode("Add", _, _))),
"Op_minus" -> (PrefixOperator(70, BranchNode("Negate", _, TerminalNode)), InfixOperator(
50,
LeftAssoc,
BranchNode("Subtract", _, _))),
"Op_multiply" -> (null, InfixOperator(60, LeftAssoc, BranchNode("Multiply", _, _))),
"Op_divide" -> (null, InfixOperator(60, LeftAssoc, BranchNode("Divide", _, _))),
"Op_mod" -> (null, InfixOperator(60, RightAssoc, BranchNode("Mod", _, _))),
"Op_not" -> (PrefixOperator(70, BranchNode("Not", _)), null),
"LeftParen" -> null,
"RightParen" -> null
)
def apply = new SyntaxAnalyzer(symbols)
abstract class Node
case class LeafNode(name: String, value: String) extends Node
case class BranchNode(name: String, left: Node, right: Node = TerminalNode) extends Node
case object TerminalNode extends Node
abstract class Assoc
case object LeftAssoc extends Assoc
case object RightAssoc extends Assoc
abstract class Operator
case class InfixOperator(prec: Int, assoc: Assoc, compute: (Node, Node) => Node) extends Operator
case class PrefixOperator(prec: Int, compute: Node => Node) extends Operator
}
class SyntaxAnalyzer(symbols: Map[String, (SyntaxAnalyzer.PrefixOperator, SyntaxAnalyzer.InfixOperator)]) {
import SyntaxAnalyzer.{BranchNode, InfixOperator, LeafNode, LeftAssoc, Node, PrefixOperator, TerminalNode}
def fromStdin = fromSource(Source.stdin)
def fromString(src: String) = fromSource(Source.fromString(src))
def fromSource(s: Source) = {
val tokens = ((s.getLines map (_.trim.split(" +", 4)) map {
case Array(line, col, name) =>
symbols get name match {
case None | Some(null) => SimpleToken(line.toInt, col.toInt, name)
case Some(operators) => OperatorToken(line.toInt, col.toInt, name, operators)
}
case Array(line, col, name, value) => ValueToken(line.toInt, col.toInt, name, value)
}) toStream)
flatten(parse(tokens))
}
def flatten(n: Node): Unit =
n match {
case TerminalNode => println(";")
case LeafNode(name, value) => println(s"$name $value")
case BranchNode(name, left, right) =>
println(name)
flatten(left)
flatten(right)
}
def parse(toks: Stream[Token]) = {
var cur = toks
def next = cur = cur.tail
def token = cur.head
def consume = {
val res = token
next
res
}
def accept(name: String) =
if (token.name == name) {
next
true
} else
false
def expect(name: String, error: String = null) =
if (token.name != name)
sys.error(if (error eq null) s"expected $name, found ${token.name}" else s"$error: $token")
else
next
def expression(minPrec: Int): Node = {
def infixOperator = token.asInstanceOf[OperatorToken].operators._2
def isInfix = token.isInstanceOf[OperatorToken] && infixOperator != null
var result =
consume match {
case SimpleToken(_, _, "LeftParen") =>
val result = expression(0)
expect("RightParen", "expected closing parenthesis")
result
case ValueToken(_, _, name, value) => LeafNode(name, value)
case OperatorToken(_, _, _, (prefix, _)) if prefix ne null => prefix.compute(expression(prefix.prec))
case OperatorToken(_, _, _, (_, infix)) if infix ne null =>
sys.error(s"expected a primitive expression, not an infix operator: $token")
}
while (isInfix && infixOperator.prec >= minPrec) {
val InfixOperator(prec, assoc, compute) = infixOperator
val nextMinPrec = if (assoc == LeftAssoc) prec + 1 else prec
next
result = compute(result, expression(nextMinPrec))
}
result
}
def parenExpression = {
expect("LeftParen")
val e = expression(0)
expect("RightParen")
e
}
def statement: Node = {
var stmt: Node = TerminalNode
if (accept("Keyword_if"))
stmt = BranchNode("If",
parenExpression,
BranchNode("If", statement, if (accept("Keyword_else")) statement else TerminalNode))
else if (accept("Keyword_putc")) {
stmt = BranchNode("Prtc", parenExpression)
expect("Semicolon")
} else if (accept("Keyword_print")) {
expect("LeftParen")
do {
val e =
if (token.name == "String")
BranchNode("Prts", LeafNode("String", consume.asInstanceOf[ValueToken].value))
else
BranchNode("Prti", expression(0))
stmt = BranchNode("Sequence", stmt, e)
} while (accept("Comma"))
expect("RightParen")
expect("Semicolon")
} else if (token.name == "Semicolon")
next
else if (token.name == "Identifier") {
val ident = LeafNode("Identifier", consume.asInstanceOf[ValueToken].value)
expect("Op_assign")
stmt = BranchNode("Assign", ident, expression(0))
expect("Semicolon")
} else if (accept("Keyword_while"))
stmt = BranchNode("While", parenExpression, statement)
else if (accept("LeftBrace")) {
while (token.name != "RightBrace" && token.name != "End_of_input") {
stmt = BranchNode("Sequence", stmt, statement)
}
expect("RightBrace")
} else
sys.error(s"syntax error: $token")
stmt
}
var tree: Node = TerminalNode
do {
tree = BranchNode("Sequence", tree, statement)
} while (token.name != "End_of_input")
expect("End_of_input")
tree
}
abstract class Token {
val line: Int
val col: Int
val name: String
}
case class SimpleToken(line: Int, col: Int, name: String) extends Token
case class ValueToken(line: Int, col: Int, name: String, value: String) extends Token
case class OperatorToken(line: Int, col: Int, name: String, operators: (PrefixOperator, InfixOperator)) extends Token
}
Scheme
The code implements a recursive-descent parser based on the given grammar. It has been tested against all programs in Compiler/Sample programs.
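As a reminder of the input format: the parser reads the lexical analyzer's output, one token per line, as line, column, token name, and an optional value. A small sketch (the line and column numbers here are illustrative, not taken from a sample program):

```
    1      1 Identifier      count
    1      9 Op_assign
    1     11 Integer         1
    1     12 Semicolon
    2      1 End_of_input
```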
(import (scheme base)
(scheme process-context)
(scheme write))
(define *names* (list (cons 'Op_add 'Add)
(cons 'Op_subtract 'Subtract)
(cons 'Op_multiply 'Multiply)
(cons 'Op_divide 'Divide)
(cons 'Op_mod 'Mod)
(cons 'Op_not 'Not)
(cons 'Op_equal 'Equal)
(cons 'Op_notequal 'NotEqual)
(cons 'Op_or 'Or)
(cons 'Op_and 'And)
(cons 'Op_less 'Less)
(cons 'Op_lessequal 'LessEqual)
(cons 'Op_greater 'Greater)
(cons 'Op_greaterequal 'GreaterEqual)))
(define (retrieve-name type)
(let ((res (assq type *names*)))
(if res
(cdr res)
(error "Unknown type name"))))
;; takes a vector of tokens
(define (parse tokens) ; read statements, until hit end of tokens
(define posn 0)
(define (peek-token)
(vector-ref tokens posn))
(define (get-token)
(set! posn (+ 1 posn))
(vector-ref tokens (- posn 1)))
(define (match type)
(if (eq? (car (vector-ref tokens posn)) type)
(set! posn (+ 1 posn))
(error "Could not match token type" type)))
; make it easier to read token parts
(define type car)
(define value cadr)
;
;; left associative read of one or more items with given separators
(define (read-one-or-more reader separators)
(let loop ((lft (reader)))
(let ((next (peek-token)))
(if (memq (type next) separators)
(begin (match (type next))
(loop (list (retrieve-name (type next)) lft (reader))))
lft))))
;
;; read one or two items with given separator
(define (read-one-or-two reader separators)
(let* ((lft (reader))
(next (peek-token)))
(if (memq (type next) separators)
(begin (match (type next))
(list (retrieve-name (type next)) lft (reader)))
lft)))
;
(define (read-primary)
(let ((next (get-token)))
(case (type next)
((Identifier Integer)
next)
((LeftParen)
(let ((v (read-expr)))
(match 'RightParen)
v))
((Op_add) ; + sign is ignored
(read-primary))
((Op_not)
(list 'Not (read-primary) '()))
((Op_subtract)
(list 'Negate (read-primary) '()))
(else
(error "Unknown primary type")))))
;
(define (read-multiplication-expr) ; *
(read-one-or-more read-primary '(Op_multiply Op_divide Op_mod)))
;
(define (read-addition-expr) ; *
(read-one-or-more read-multiplication-expr '(Op_add Op_subtract)))
;
(define (read-relational-expr) ; ?
(read-one-or-two read-addition-expr
'(Op_less Op_lessequal Op_greater Op_greaterequal)))
;
(define (read-equality-expr) ; ?
(read-one-or-two read-relational-expr '(Op_equal Op_notequal)))
;
(define (read-and-expr) ; *
(read-one-or-more read-equality-expr '(Op_and)))
;
(define (read-expr) ; *
(read-one-or-more read-and-expr '(Op_or)))
;
(define (read-prt-list)
(define (read-print-part)
(if (eq? (type (peek-token)) 'String)
(list 'Prts (get-token) '())
(list 'Prti (read-expr) '())))
;
(do ((tok (read-print-part) (read-print-part))
(rec '() (list 'Sequence rec tok)))
((not (eq? (type (peek-token)) 'Comma))
(list 'Sequence rec tok))
(match 'Comma)))
;
(define (read-paren-expr)
(match 'LeftParen)
(let ((v (read-expr)))
(match 'RightParen)
v))
;
(define (read-stmt)
(case (type (peek-token))
((Semicolon) ; empty statement: consume the token
(match 'Semicolon)
'())
((Identifier)
(let ((id (get-token)))
(match 'Op_assign)
(let ((ex (read-expr)))
(match 'Semicolon)
(list 'Assign id ex))))
((Keyword_while)
(match 'Keyword_while)
(let* ((expr (read-paren-expr))
(stmt (read-stmt)))
(list 'While expr stmt)))
((Keyword_if)
(match 'Keyword_if)
(let* ((expr (read-paren-expr))
(then-part (read-stmt))
(else-part (if (eq? (type (peek-token)) 'Keyword_else)
(begin (match 'Keyword_else)
(read-stmt))
'())))
(list 'If expr (list 'If then-part else-part))))
((Keyword_print)
(match 'Keyword_print)
(match 'LeftParen)
(let ((v (read-prt-list)))
(match 'RightParen)
(match 'Semicolon)
v))
((Keyword_putc)
(match 'Keyword_putc)
(let ((v (read-paren-expr)))
(match 'Semicolon)
(list 'Putc v '())))
((LeftBrace)
(match 'LeftBrace)
(let ((v (read-stmts)))
(match 'RightBrace)
v))
(else
(error "Unknown token type for statement" (type (peek-token))))))
;
(define (read-stmts)
(do ((sequence (list 'Sequence '() (read-stmt))
(list 'Sequence sequence (read-stmt))))
((memq (type (peek-token)) '(End_of_input RightBrace))
sequence)))
;
(let ((res (read-stmts)))
(match 'End_of_input)
res))
;; reads tokens from file, parses and returns the AST
(define (parse-file filename)
(define (tokenise line)
(let ((port (open-input-string line)))
(read port) ; discard line
(read port) ; discard col
(let* ((type (read port)) ; read type as symbol
(val (read port))) ; check for optional value
(if (eof-object? val)
(list type)
(list type val)))))
;
(with-input-from-file
filename
(lambda ()
(do ((line (read-line) (read-line))
(toks '() (cons (tokenise line) toks)))
((eof-object? line)
(parse (list->vector (reverse toks))))))))
;; Output the AST in flattened format
(define (display-ast ast)
(cond ((null? ast)
(display ";\n"))
((= 2 (length ast))
(display (car ast))
(display #\tab)
(write (cadr ast)) ; use write to preserve " " on String
(newline))
(else
(display (car ast)) (newline)
(display-ast (cadr ast))
(display-ast (cadr (cdr ast))))))
;; read from filename passed on command line
(if (= 2 (length (command-line)))
(display-ast (parse-file (cadr (command-line))))
(display "Error: provide program filename\n"))
Wren
import "./dynamic" for Enum, Struct, Tuple
import "./fmt" for Fmt
import "./ioutil" for FileUtil
var tokens = [
"EOI",
"Mul",
"Div",
"Mod",
"Add",
"Sub",
"Negate",
"Not",
"Lss",
"Leq",
"Gtr",
"Geq",
"Eql",
"Neq",
"Assign",
"And",
"Or",
"If",
"Else",
"While",
"Print",
"Putc",
"Lparen",
"Rparen",
"Lbrace",
"Rbrace",
"Semi",
"Comma",
"Ident",
"Integer",
"String"
]
var Token = Enum.create("Token", tokens)
var nodes = [
"Ident",
"String",
"Integer",
"Sequence",
"If",
"Prtc",
"Prts",
"Prti",
"While",
"Assign",
"Negate",
"Not",
"Mul",
"Div",
"Mod",
"Add",
"Sub",
"Lss",
"Leq",
"Gtr",
"Geq",
"Eql",
"Neq",
"And",
"Or"
]
var Node = Enum.create("Node", nodes)
// 'text' field represents ident or string literal or integer value
var TokS = Struct.create("TokS", ["tok", "errLn", "errCol", "text"])
var Tree = Struct.create("Tree", ["nodeType", "left", "right", "value"])
// dependency: Ordered by tok, must remain in same order as Token enum constants
var Atr = Tuple.create("Atr", ["text", "enumText", "tok", "rightAssociative", "isBinary",
"isUnary", "precedence", "nodeType"])
var atrs = [
Atr.new("EOI", "End_of_input", Token.EOI, false, false, false, -1, -1),
Atr.new("*", "Op_multiply", Token.Mul, false, true, false, 13, Node.Mul),
Atr.new("/", "Op_divide", Token.Div, false, true, false, 13, Node.Div),
Atr.new("\%", "Op_mod", Token.Mod, false, true, false, 13, Node.Mod),
Atr.new("+", "Op_add", Token.Add, false, true, false, 12, Node.Add),
Atr.new("-", "Op_subtract", Token.Sub, false, true, false, 12, Node.Sub),
Atr.new("-", "Op_negate", Token.Negate, false, false, true, 14, Node.Negate),
Atr.new("!", "Op_not", Token.Not, false, false, true, 14, Node.Not),
Atr.new("<", "Op_less", Token.Lss, false, true, false, 10, Node.Lss),
Atr.new("<=", "Op_lessequal", Token.Leq, false, true, false, 10, Node.Leq),
Atr.new(">", "Op_greater", Token.Gtr, false, true, false, 10, Node.Gtr),
Atr.new(">=", "Op_greaterequal", Token.Geq, false, true, false, 10, Node.Geq),
Atr.new("==", "Op_equal", Token.Eql, false, true, false, 9, Node.Eql),
Atr.new("!=", "Op_notequal", Token.Neq, false, true, false, 9, Node.Neq),
Atr.new("=", "Op_assign", Token.Assign, false, false, false, -1, Node.Assign),
Atr.new("&&", "Op_and", Token.And, false, true, false, 5, Node.And),
Atr.new("||", "Op_or", Token.Or, false, true, false, 4, Node.Or),
Atr.new("if", "Keyword_if", Token.If, false, false, false, -1, Node.If),
Atr.new("else", "Keyword_else", Token.Else, false, false, false, -1, -1),
Atr.new("while", "Keyword_while", Token.While, false, false, false, -1, Node.While),
Atr.new("print", "Keyword_print", Token.Print, false, false, false, -1, -1),
Atr.new("putc", "Keyword_putc", Token.Putc, false, false, false, -1, -1),
Atr.new("(", "LeftParen", Token.Lparen, false, false, false, -1, -1),
Atr.new(")", "RightParen", Token.Rparen, false, false, false, -1, -1),
Atr.new("{", "LeftBrace", Token.Lbrace, false, false, false, -1, -1),
Atr.new("}", "RightBrace", Token.Rbrace, false, false, false, -1, -1),
Atr.new(";", "Semicolon", Token.Semi, false, false, false, -1, -1),
Atr.new(",", "Comma", Token.Comma, false, false, false, -1, -1),
Atr.new("Ident", "Identifier", Token.Ident, false, false, false, -1, Node.Ident),
Atr.new("Integer literal", "Integer", Token.Integer, false, false, false, -1, Node.Integer),
Atr.new("String literal", "String", Token.String, false, false, false, -1, Node.String),
]
var displayNodes = [
"Identifier", "String", "Integer", "Sequence", "If", "Prtc", "Prts", "Prti",
"While", "Assign", "Negate", "Not", "Multiply", "Divide", "Mod", "Add",
"Subtract", "Less", "LessEqual", "Greater", "GreaterEqual", "Equal",
"NotEqual", "And", "Or"
]
var token = TokS.new(0, 0, 0, "")
var reportError = Fn.new { |eline, ecol, msg| Fiber.abort("(%(eline):%(ecol)) error : %(msg)") }
// return internal version of name
var getEnum = Fn.new { |name|
for (atr in atrs) {
if (atr.enumText == name) return atr.tok
}
reportError.call(0, 0, "Unknown token %(name)")
}
var lines = []
var lineCount = 0
var lineNum = 0
var getTok = Fn.new {
var tok = TokS.new(0, 0, 0, "")
if (lineNum < lineCount) {
var line = lines[lineNum].trimEnd(" \t")
lineNum = lineNum + 1
var fields = line.split(" ").where { |s| s != "" }.toList
// [ ]*{lineno}[ ]+{colno}[ ]+token[ ]+optional
tok.errLn = Num.fromString(fields[0])
tok.errCol = Num.fromString(fields[1])
tok.tok = getEnum.call(fields[2])
var le = fields.count
if (le == 4) {
tok.text = fields[3]
} else if (le > 4) {
var idx = line.indexOf("\"")
tok.text = line[idx..-1]
}
}
return tok
}
var makeNode = Fn.new { |nodeType, left, right| Tree.new(nodeType, left, right, "") }
var makeLeaf = Fn.new { |nodeType, value| Tree.new(nodeType, null, null, value) }
var expect = Fn.new { |msg, s|
if (token.tok == s) {
token = getTok.call()
return
}
reportError.call(token.errLn, token.errCol,
Fmt.swrite("$s: Expecting '$s', found '$s'", msg, atrs[s].text, atrs[token.tok].text))
}
var parenExpr // forward reference
var expr // recursive function
expr = Fn.new { |p|
var x
var node
var t = token.tok
if (t == Token.Lparen) {
x = parenExpr.call()
} else if (t == Token.Sub || t == Token.Add) {
var op = t
token = getTok.call()
node = expr.call(atrs[Token.Negate].precedence)
if (op == Token.Sub) {
x = makeNode.call(Node.Negate, node, null)
} else {
x = node
}
} else if (t == Token.Not) {
token = getTok.call()
x = makeNode.call(Node.Not, expr.call(atrs[Token.Not].precedence), null)
} else if (t == Token.Ident) {
x = makeLeaf.call(Node.Ident, token.text)
token = getTok.call()
} else if (t == Token.Integer) {
x = makeLeaf.call(Node.Integer, token.text)
token = getTok.call()
} else {
reportError.call(token.errLn, token.errCol,
Fmt.swrite("Expecting a primary, found: $s", atrs[token.tok].text))
}
while (atrs[token.tok].isBinary && atrs[token.tok].precedence >= p) {
var op = token.tok
token = getTok.call()
var q = atrs[op].precedence
if (!atrs[op].rightAssociative) q = q + 1
node = expr.call(q)
x = makeNode.call(atrs[op].nodeType, x, node)
}
return x
}
parenExpr = Fn.new {
expect.call("parenExpr", Token.Lparen)
var t = expr.call(0)
expect.call("parenExpr", Token.Rparen)
return t
}
var stmt // recursive function
stmt = Fn.new {
var t
var v
var e
var s
var s2
var tt = token.tok
if (tt == Token.If) {
token = getTok.call()
e = parenExpr.call()
s = stmt.call()
s2 = null
if (token.tok == Token.Else) {
token = getTok.call()
s2 = stmt.call()
}
t = makeNode.call(Node.If, e, makeNode.call(Node.If, s, s2))
} else if (tt == Token.Putc) {
token = getTok.call()
e = parenExpr.call()
t = makeNode.call(Node.Prtc, e, null)
expect.call("Putc", Token.Semi)
} else if (tt == Token.Print) { // print '(' expr {',' expr} ')'
token = getTok.call()
expect.call("Print", Token.Lparen)
while (true) {
if (token.tok == Token.String) {
e = makeNode.call(Node.Prts, makeLeaf.call(Node.String, token.text), null)
token = getTok.call()
} else {
e = makeNode.call(Node.Prti, expr.call(0), null)
}
t = makeNode.call(Node.Sequence, t, e)
if (token.tok != Token.Comma) break
expect.call("Print", Token.Comma)
}
expect.call("Print", Token.Rparen)
expect.call("Print", Token.Semi)
} else if (tt == Token.Semi) {
token = getTok.call()
} else if (tt == Token.Ident) {
v = makeLeaf.call(Node.Ident, token.text)
token = getTok.call()
expect.call("assign", Token.Assign)
e = expr.call(0)
t = makeNode.call(Node.Assign, v, e)
expect.call("assign", Token.Semi)
} else if (tt == Token.While) {
token = getTok.call()
e = parenExpr.call()
s = stmt.call()
t = makeNode.call(Node.While, e, s)
} else if (tt == Token.Lbrace) { // {stmt}
expect.call("Lbrace", Token.Lbrace)
while (token.tok != Token.Rbrace && token.tok != Token.EOI) {
t = makeNode.call(Node.Sequence, t, stmt.call())
}
expect.call("Lbrace", Token.Rbrace)
} else if (tt == Token.EOI) {
// do nothing
} else {
reportError.call(token.errLn, token.errCol,
Fmt.swrite("expecting start of statement, found '$s'", atrs[token.tok].text))
}
return t
}
var parse = Fn.new {
var t
token = getTok.call()
while (true) {
t = makeNode.call(Node.Sequence, t, stmt.call())
if (!t || token.tok == Token.EOI) break
}
return t
}
var prtAst // recursive function
prtAst = Fn.new { |t|
if (!t) {
System.print(";")
} else {
Fmt.write("$-14s ", displayNodes[t.nodeType])
if (t.nodeType == Node.Ident || t.nodeType == Node.Integer || t.nodeType == Node.String) {
System.print(t.value)
} else {
System.print()
prtAst.call(t.left)
prtAst.call(t.right)
}
}
}
lines = FileUtil.readLines("source.txt")
lineCount = lines.count
prtAst.call(parse.call())
- Output:
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier     count
Integer        1
Assign
Identifier     n
Integer        1
Assign
Identifier     limit
Integer        100
While
Less
Identifier     n
Identifier     limit
Sequence
Sequence
Sequence
Sequence
Sequence
;
Assign
Identifier     k
Integer        3
Assign
Identifier     p
Integer        1
Assign
Identifier     n
Add
Identifier     n
Integer        2
While
And
LessEqual
Multiply
Identifier     k
Identifier     k
Identifier     n
Identifier     p
Sequence
Sequence
;
Assign
Identifier     p
NotEqual
Multiply
Divide
Identifier     n
Identifier     k
Identifier     k
Identifier     n
Assign
Identifier     k
Add
Identifier     k
Integer        2
If
Identifier     p
If
Sequence
Sequence
;
Sequence
Sequence
;
Prti
Identifier     n
;
Prts
String         " is prime\n"
;
Assign
Identifier     count
Add
Identifier     count
Integer        1
;
Sequence
Sequence
Sequence
;
Prts
String         "Total primes found: "
;
Prti
Identifier     count
;
Prts
String         "\n"
;
Zig
const std = @import("std");
pub const NodeValue = union(enum) {
integer: i32,
string: []const u8,
fn fromToken(token: Token) ?NodeValue {
if (token.value) |value| {
switch (value) {
.integer => |int| return NodeValue{ .integer = int },
.string => |str| return NodeValue{ .string = str },
}
} else {
return null;
}
}
};
pub const Tree = struct {
left: ?*Tree,
right: ?*Tree,
typ: NodeType,
value: ?NodeValue = null,
};
pub const ParserError = error{
OutOfMemory,
ExpectedNotFound,
} || std.fmt.ParseIntError;
pub const Parser = struct {
token_it: LexerOutputTokenizer,
curr: Token,
allocator: std.mem.Allocator,
const Self = @This();
pub fn init(allocator: std.mem.Allocator, str: []const u8) Self {
return Self{
.token_it = LexerOutputTokenizer.init(str),
.curr = Token{ .line = 0, .col = 0, .typ = .unknown },
.allocator = allocator,
};
}
fn makeNode(self: *Self, typ: NodeType, left: ?*Tree, right: ?*Tree) !*Tree {
const result = try self.allocator.create(Tree);
result.* = Tree{ .left = left, .right = right, .typ = typ };
return result;
}
fn makeLeaf(self: *Self, typ: NodeType, value: ?NodeValue) !*Tree {
const result = try self.allocator.create(Tree);
result.* = Tree{ .left = null, .right = null, .typ = typ, .value = value };
return result;
}
pub fn parse(self: *Self) ParserError!?*Tree {
try self.next();
var result: ?*Tree = null;
while (true) {
const stmt = try self.parseStmt();
result = try self.makeNode(.sequence, result, stmt);
if (self.curr.typ == .eof) break;
}
return result;
}
/// Classic "Recursive descent" statement parser.
fn parseStmt(self: *Self) ParserError!?*Tree {
var result: ?*Tree = null;
switch (self.curr.typ) {
.kw_print => {
try self.next();
try self.expect(.left_paren);
// Parse each print's argument as an expression delimited by commas until we reach
// a closing parens.
while (true) {
var expr: ?*Tree = null;
if (self.curr.typ == .string) {
expr = try self.makeNode(
.prts,
try self.makeLeaf(.string, NodeValue.fromToken(self.curr)),
null,
);
try self.next();
} else {
expr = try self.makeNode(.prti, try self.parseExpr(0), null);
}
result = try self.makeNode(.sequence, result, expr);
if (self.curr.typ != .comma) break;
try self.next();
}
try self.expect(.right_paren);
try self.expect(.semicolon);
},
.kw_putc => {
try self.next();
result = try self.makeNode(.prtc, try self.parseParenExpr(), null);
try self.expect(.semicolon);
},
.kw_while => {
try self.next();
const expr = try self.parseParenExpr();
result = try self.makeNode(.kw_while, expr, try self.parseStmt());
},
.kw_if => {
try self.next();
const expr = try self.parseParenExpr();
const if_stmt = try self.parseStmt();
const else_stmt = blk: {
if (self.curr.typ == .kw_else) {
try self.next();
break :blk try self.parseStmt();
} else {
break :blk null;
}
};
const stmt_node = try self.makeNode(.kw_if, if_stmt, else_stmt);
// If-statement uses `.kw_if` node for both first node with `expr` on the left
// and statements on the right and also `.kw_if` node which goes to the right
// and contains both if-branch and else-branch.
result = try self.makeNode(.kw_if, expr, stmt_node);
},
.left_brace => {
try self.next();
while (self.curr.typ != .right_brace and self.curr.typ != .eof) {
result = try self.makeNode(.sequence, result, try self.parseStmt());
}
try self.expect(.right_brace);
},
.identifier => {
const identifier = try self.makeLeaf(.identifier, NodeValue.fromToken(self.curr));
try self.next();
try self.expect(.assign);
const expr = try self.parseExpr(0);
result = try self.makeNode(.assign, identifier, expr);
try self.expect(.semicolon);
},
.semicolon => try self.next(),
else => {
std.debug.print("\nSTMT: UNKNOWN {}\n", .{self.curr});
std.os.exit(1);
},
}
return result;
}
/// "Precedence climbing" expression parser.
fn parseExpr(self: *Self, precedence: i8) ParserError!?*Tree {
var result: ?*Tree = null;
switch (self.curr.typ) {
.left_paren => {
result = try self.parseParenExpr();
},
.subtract => {
try self.next();
const metadata = NodeMetadata.find(.negate);
const expr = try self.parseExpr(metadata.precedence);
result = try self.makeNode(.negate, expr, null);
},
.not => {
try self.next();
const metadata = NodeMetadata.find(.not);
const expr = try self.parseExpr(metadata.precedence);
result = try self.makeNode(.not, expr, null);
},
.add => {
try self.next();
result = try self.parseExpr(precedence);
},
.integer, .identifier => {
const node_type = NodeMetadata.find(self.curr.typ).node_type;
result = try self.makeLeaf(node_type, NodeValue.fromToken(self.curr));
try self.next();
},
else => {
std.debug.print("\nEXPR: UNKNOWN {}\n", .{self.curr});
std.os.exit(1);
},
}
var curr_metadata = NodeMetadata.find(self.curr.typ);
while (curr_metadata.binary and curr_metadata.precedence >= precedence) {
const new_precedence =
if (curr_metadata.right_associative)
curr_metadata.precedence
else
curr_metadata.precedence + 1;
try self.next();
const sub_expr = try self.parseExpr(new_precedence);
result = try self.makeNode(curr_metadata.node_type, result, sub_expr);
curr_metadata = NodeMetadata.find(self.curr.typ);
}
return result;
}
fn parseParenExpr(self: *Self) ParserError!?*Tree {
try self.expect(.left_paren);
const result = try self.parseExpr(0);
try self.expect(.right_paren);
return result;
}
fn next(self: *Self) ParserError!void {
const token = try self.token_it.next();
if (token) |tok| {
self.curr = tok;
} else {
self.curr = Token{ .line = 0, .col = 0, .typ = .unknown };
}
}
fn expect(self: *Self, token_type: TokenType) ParserError!void {
if (self.curr.typ != token_type) {
const expected_str = NodeMetadata.find(token_type).token_str;
const found_str = NodeMetadata.find(self.curr.typ).token_str;
std.debug.print(
"({d}, {d}) error: Expecting '{s}', found '{s}'\n",
.{ self.curr.line, self.curr.col, expected_str, found_str },
);
return ParserError.ExpectedNotFound;
}
try self.next();
}
};
pub fn parse(allocator: std.mem.Allocator, str: []const u8) !?*Tree {
var parser = Parser.init(allocator, str);
return try parser.parse();
}
pub fn main() !void {
var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
defer arena.deinit();
const allocator = arena.allocator();
var arg_it = std.process.args();
_ = try arg_it.next(allocator) orelse unreachable; // program name
const file_name = arg_it.next(allocator);
// We accept both files and standard input.
var file_handle = blk: {
if (file_name) |file_name_delimited| {
const fname: []const u8 = try file_name_delimited;
break :blk try std.fs.cwd().openFile(fname, .{});
} else {
break :blk std.io.getStdIn();
}
};
defer file_handle.close();
const input_content = try file_handle.readToEndAlloc(allocator, std.math.maxInt(usize));
const result: ?*Tree = try parse(allocator, input_content);
const result_str = try astToFlattenedString(allocator, result);
_ = try std.io.getStdOut().write(result_str);
}
const NodeMetadata = struct {
token_type: TokenType,
right_associative: bool,
binary: bool,
unary: bool,
precedence: i8,
node_type: NodeType,
token_str: []const u8,
const self = [_]NodeMetadata{
.{ .token_type = .multiply, .right_associative = false, .binary = true, .unary = false, .precedence = 13, .node_type = .multiply, .token_str = "*" },
.{ .token_type = .divide, .right_associative = false, .binary = true, .unary = false, .precedence = 13, .node_type = .divide, .token_str = "/" },
.{ .token_type = .mod, .right_associative = false, .binary = true, .unary = false, .precedence = 13, .node_type = .mod, .token_str = "%" },
.{ .token_type = .add, .right_associative = false, .binary = true, .unary = false, .precedence = 12, .node_type = .add, .token_str = "+" },
.{ .token_type = .subtract, .right_associative = false, .binary = true, .unary = false, .precedence = 12, .node_type = .subtract, .token_str = "-" },
.{ .token_type = .negate, .right_associative = false, .binary = false, .unary = true, .precedence = 14, .node_type = .negate, .token_str = "-" },
.{ .token_type = .less, .right_associative = false, .binary = true, .unary = false, .precedence = 10, .node_type = .less, .token_str = "<" },
.{ .token_type = .less_equal, .right_associative = false, .binary = true, .unary = false, .precedence = 10, .node_type = .less_equal, .token_str = "<=" },
.{ .token_type = .greater, .right_associative = false, .binary = true, .unary = false, .precedence = 10, .node_type = .greater, .token_str = ">" },
.{ .token_type = .greater_equal, .right_associative = false, .binary = true, .unary = false, .precedence = 10, .node_type = .greater_equal, .token_str = ">=" },
.{ .token_type = .equal, .right_associative = false, .binary = true, .unary = false, .precedence = 9, .node_type = .equal, .token_str = "==" },
.{ .token_type = .not_equal, .right_associative = false, .binary = true, .unary = false, .precedence = 9, .node_type = .not_equal, .token_str = "!=" },
.{ .token_type = .not, .right_associative = false, .binary = false, .unary = true, .precedence = 14, .node_type = .not, .token_str = "!" },
.{ .token_type = .assign, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .assign, .token_str = "=" },
.{ .token_type = .bool_and, .right_associative = false, .binary = true, .unary = false, .precedence = 5, .node_type = .bool_and, .token_str = "&&" },
.{ .token_type = .bool_or, .right_associative = false, .binary = true, .unary = false, .precedence = 4, .node_type = .bool_or, .token_str = "||" },
.{ .token_type = .left_paren, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .unknown, .token_str = "(" },
.{ .token_type = .right_paren, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .unknown, .token_str = ")" },
.{ .token_type = .left_brace, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .unknown, .token_str = "{" },
.{ .token_type = .right_brace, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .unknown, .token_str = "}" },
.{ .token_type = .semicolon, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .unknown, .token_str = ";" },
.{ .token_type = .comma, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .unknown, .token_str = "," },
.{ .token_type = .kw_if, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .kw_if, .token_str = "if" },
.{ .token_type = .kw_else, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .unknown, .token_str = "else" },
.{ .token_type = .kw_while, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .kw_while, .token_str = "while" },
.{ .token_type = .kw_print, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .unknown, .token_str = "print" },
.{ .token_type = .kw_putc, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .unknown, .token_str = "putc" },
.{ .token_type = .identifier, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .identifier, .token_str = "Identifier" },
.{ .token_type = .integer, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .integer, .token_str = "Integer literal" },
.{ .token_type = .string, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .string, .token_str = "String literal" },
.{ .token_type = .eof, .right_associative = false, .binary = false, .unary = false, .precedence = -1, .node_type = .unknown, .token_str = "End of input" },
};
pub fn find(token_type: TokenType) NodeMetadata {
for (self) |metadata| {
if (metadata.token_type == token_type) return metadata;
} else {
unreachable;
}
}
};
pub const NodeType = enum {
unknown,
identifier,
string,
integer,
sequence,
kw_if,
prtc,
prts,
prti,
kw_while,
assign,
negate,
not,
multiply,
divide,
mod,
add,
subtract,
less,
less_equal,
greater,
greater_equal,
equal,
not_equal,
bool_and,
bool_or,
pub fn toString(self: NodeType) []const u8 {
return switch (self) {
.unknown => "UNKNOWN",
.identifier => "Identifier",
.string => "String",
.integer => "Integer",
.sequence => "Sequence",
.kw_if => "If",
.prtc => "Prtc",
.prts => "Prts",
.prti => "Prti",
.kw_while => "While",
.assign => "Assign",
.negate => "Negate",
.not => "Not",
.multiply => "Multiply",
.divide => "Divide",
.mod => "Mod",
.add => "Add",
.subtract => "Subtract",
.less => "Less",
.less_equal => "LessEqual",
.greater => "Greater",
.greater_equal => "GreaterEqual",
.equal => "Equal",
.not_equal => "NotEqual",
.bool_and => "And",
.bool_or => "Or",
};
}
};
fn astToFlattenedString(allocator: std.mem.Allocator, tree: ?*Tree) ![]const u8 {
var result = std.ArrayList(u8).init(allocator);
var writer = result.writer();
try treeToString(allocator, writer, tree);
return result.items;
}
pub const TokenType = enum {
    unknown,
    multiply,
    divide,
    mod,
    add,
    subtract,
    negate,
    less,
    less_equal,
    greater,
    greater_equal,
    equal,
    not_equal,
    not,
    assign,
    bool_and,
    bool_or,
    left_paren,
    right_paren,
    left_brace,
    right_brace,
    semicolon,
    comma,
    kw_if,
    kw_else,
    kw_while,
    kw_print,
    kw_putc,
    identifier,
    integer,
    string,
    eof,

    const from_string_map = std.ComptimeStringMap(TokenType, .{
        .{ "Op_multiply", .multiply },
        .{ "Op_divide", .divide },
        .{ "Op_mod", .mod },
        .{ "Op_add", .add },
        .{ "Op_subtract", .subtract },
        .{ "Op_negate", .negate },
        .{ "Op_less", .less },
        .{ "Op_lessequal", .less_equal },
        .{ "Op_greater", .greater },
        .{ "Op_greaterequal", .greater_equal },
        .{ "Op_equal", .equal },
        .{ "Op_notequal", .not_equal },
        .{ "Op_not", .not },
        .{ "Op_assign", .assign },
        .{ "Op_and", .bool_and },
        .{ "Op_or", .bool_or },
        .{ "LeftParen", .left_paren },
        .{ "RightParen", .right_paren },
        .{ "LeftBrace", .left_brace },
        .{ "RightBrace", .right_brace },
        .{ "Semicolon", .semicolon },
        .{ "Comma", .comma },
        .{ "Keyword_if", .kw_if },
        .{ "Keyword_else", .kw_else },
        .{ "Keyword_while", .kw_while },
        .{ "Keyword_print", .kw_print },
        .{ "Keyword_putc", .kw_putc },
        .{ "Identifier", .identifier },
        .{ "Integer", .integer },
        .{ "String", .string },
        .{ "End_of_input", .eof },
    });

    /// Maps a token name as printed by the lexical analyzer (e.g. "Op_add")
    /// to its TokenType. Panics on an unrecognized name.
    pub fn fromString(str: []const u8) TokenType {
        return from_string_map.get(str).?;
    }
};
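// Sanity check (not part of the original entry): a hedged, container-level
// test block exercising the lexer-name -> TokenType mapping above; it runs
// only under `zig test` and assumes `std` is imported at the top of the file.
test "TokenType.fromString maps lexer token names" {
    try std.testing.expectEqual(TokenType.add, TokenType.fromString("Op_add"));
    try std.testing.expectEqual(TokenType.eof, TokenType.fromString("End_of_input"));
}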
pub const TokenValue = union(enum) {
    integer: i32,
    string: []const u8,
};

pub const Token = struct {
    line: usize,
    col: usize,
    typ: TokenType = .unknown,
    value: ?TokenValue = null,
};
const TreeToStringError = error{OutOfMemory};

/// Recursively writes `tree` in the flattened AST format: one node type per
/// line, with identifier/string/integer payloads on the same line, and a
/// lone ";" line for an empty (null) subtree.
fn treeToString(
    allocator: std.mem.Allocator,
    writer: std.ArrayList(u8).Writer,
    tree: ?*Tree,
) TreeToStringError!void {
    if (tree) |t| {
        _ = try writer.write(try std.fmt.allocPrint(
            allocator,
            "{s}",
            .{t.typ.toString()},
        ));
        switch (t.typ) {
            .string, .identifier => _ = try writer.write(try std.fmt.allocPrint(
                allocator,
                " {s}\n",
                .{t.value.?.string},
            )),
            .integer => _ = try writer.write(try std.fmt.allocPrint(
                allocator,
                " {d}\n",
                .{t.value.?.integer},
            )),
            else => {
                _ = try writer.write(try std.fmt.allocPrint(
                    allocator,
                    "\n",
                    .{},
                ));
                try treeToString(allocator, writer, t.left);
                try treeToString(allocator, writer, t.right);
            },
        }
    } else {
        _ = try writer.write(try std.fmt.allocPrint(
            allocator,
            ";\n",
            .{},
        ));
    }
}
/// Iterates over the lexical analyzer's textual output, yielding one Token
/// per line. Each line has the form: line col TokenName [value].
pub const LexerOutputTokenizer = struct {
    it: std.mem.SplitIterator(u8),

    const Self = @This();

    pub fn init(str: []const u8) Self {
        return Self{ .it = std.mem.split(u8, str, "\n") };
    }

    pub fn next(self: *Self) std.fmt.ParseIntError!?Token {
        if (self.it.next()) |line| {
            if (line.len == 0) return null;
            var tokens_it = std.mem.tokenize(u8, line, " ");
            const lineNumber = try std.fmt.parseInt(usize, tokens_it.next().?, 10);
            const colNumber = try std.fmt.parseInt(usize, tokens_it.next().?, 10);
            const typ_text = tokens_it.next().?;
            const typ = TokenType.fromString(typ_text);
            const pre_value_index = tokens_it.index;
            const value = tokens_it.next();
            var token = Token{ .line = lineNumber, .col = colNumber, .typ = typ };
            if (value) |val| {
                const token_value = blk: {
                    switch (typ) {
                        .string, .identifier => {
                            // Rewind so the value keeps any embedded spaces.
                            tokens_it.index = pre_value_index;
                            break :blk TokenValue{ .string = tokens_it.rest() };
                        },
                        .integer => break :blk TokenValue{ .integer = try std.fmt.parseInt(i32, val, 10) },
                        else => unreachable,
                    }
                };
                token.value = token_value;
            }
            return token;
        } else {
            return null;
        }
    }
};
fn stringToTokenList(allocator: std.mem.Allocator, str: []const u8) !std.ArrayList(Token) {
    var result = std.ArrayList(Token).init(allocator);
    var lexer_output_it = LexerOutputTokenizer.init(str);
    while (try lexer_output_it.next()) |token| {
        try result.append(token);
    }
    return result;
}
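// Sanity check (not part of the original entry): a hedged test of the whole
// lexer-output parsing path, using two hand-written lines in the format the
// lexical analyzer task emits. Runs only under `zig test`.
test "stringToTokenList parses lexer output lines" {
    const src = "1 1 Integer 42\n1 4 End_of_input\n";
    var tokens = try stringToTokenList(std.testing.allocator, src);
    defer tokens.deinit();
    try std.testing.expectEqual(@as(usize, 2), tokens.items.len);
    try std.testing.expectEqual(TokenType.integer, tokens.items[0].typ);
    try std.testing.expectEqual(@as(i32, 42), tokens.items[0].value.?.integer);
    try std.testing.expectEqual(TokenType.eof, tokens.items[1].typ);
}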