Compiler/syntax analyzer: Difference between revisions

Line 3,304:

Implementation:

<lang J>require'format/printf'

require'format/printf'

tkref=: tokenize 'End_of_input*/%+--<<=>>===!=!&&||print=print(if{else}while;~~print~~,putc)a""0'

tkref=: tokenize 'End_of_input*/%+--<<=>>===!=!&&||print=print(if{else}while;,putc)a""0'

tkref,. (tknames)=: tknames=:;: {{)n

End_of_input Op_multiply Op_divide Op_mod Op_add Op_subtract Op_negate Op_less

Op_lessequal Op_greater Op_greaterequal Op_equal Op_notequal Op_not Op_and

Op_or Keyword_print Op_assign Keyword_print LeftParen Keyword_if LeftBrace Keyword_else RightBrace

Keyword_while Semicolon ~~Keyword_print~~ Comma Keyword_putc RightParen

Keyword_while Semicolon Comma Keyword_putc RightParen

Identifier String Integer

}}-.LF

Line 3,319:

Line 3,318:

tkEOI tkMul tkDiv tkMod tkAdd tkSub tkNeg tkLss tkLeq tkGtr tkGeq tkEql

tkNeq tkNot tkAnd tkOr tkPrint tkAssign tkPrint tkLpar tkIf tkLbra tkElse

tkRbra tkWhile tkSemi ~~tkPrint~~ tkComma tkPutc tkRpar tkId tkString tkInt

tkRbra tkWhile tkSemi tkComma tkPutc tkRpar tkId tkString tkInt

}}-.LF

Line 3,332:

Line 3,331:

Sequence Multiply Divide Mod Add Subtract Negate Less LessEqual Greater

GreaterEqual Equal NotEqual Not And Or Prts Assign Prti x If x x x While

x ~~Prtc~~ x Putc x Identifier String Integer

x x Putc x Identifier String Integer

}}-.LF

NB. proofread |:ndRef,:ndDisp

Line 3,385:

Line 3,384:

case.tkPutc do.

e=. paren_expr gettok''

t=. ~~Prtc~~ make_node e a:

t=. Putc make_node e a:

Putc expect tkSemi

case.tkPrint do.gettok''

Line 3,467:

Line 3,466:

echo 'Error: line %d, column %d: %s\n'sprintf tok_ln;tok_col;y throw.

}}

syntax=: {{

Line 3,481:

end.

}}

</lang>

Some quirks worth noting:

(1) '+' appears in the productions for 'primary' and 'addition_expr' but has only one node type (because we do not represent its appearance in 'primary' with a node.

(2) '-' and 'print' do have two node types (which we sort out on the fly).

(3) In this implementation, we require a 1:1 mapping between the data structure representing token types and the data structure representing node types. This means two token entries for both - and print (the second instance of both gets ignored by the lexer).

(4) Because the data structure produced by the lexer is independent of any type system implementation, we can use the same type system for the lexer or a different type system for the lexer and either way works (as long as the implementations are consistent with the spec).

Task example: