Compiler/syntax analyzer
[https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_Form Extended Backus-Naur Form (EBNF)]:
 
<syntaxhighlight lang="ebnf">
<lang EBNF>
stmt_list = {stmt} ;
 
| '(' expr ')'
| ('+' | '-' | '!') primary
;</syntaxhighlight>
 
The resulting AST should be formulated as a Binary Tree.
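 
A minimal illustrative sketch of such a binary-tree node, in Python. The field names are assumptions chosen to echo the solutions below (for instance the Julia ASTnode struct); real solutions add whatever bookkeeping they need:
 
<syntaxhighlight lang="python"># Illustrative only: one AST node with at most two children.
class Node:
    def __init__(self, node_type, left=None, right=None, value=None):
        self.node_type = node_type   # e.g. 'Sequence', 'Assign', 'Identifier'
        self.left = left             # left subtree, or None
        self.right = right           # right subtree, or None
        self.value = value           # leaf payload: identifier name, integer or string</syntaxhighlight>
 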
|-
| style="vertical-align:top" |
<langsyntaxhighlight lang="c">count = 1;
while (count < 10) {
print("count is: ", count, "\n");
count = count + 1;
}</syntaxhighlight>
 
| style="vertical-align:top" |
[https://en.wikipedia.org/wiki/Recursive_descent_parser Recursive Descent] for statement parsing. The AST is also built:
 
<langsyntaxhighlight lang="python">def expr(p)
if tok is "("
x = paren_expr()
t = make_node(Sequence, t, stmt())
until tok is end-of-file
return t</syntaxhighlight>
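 
The pseudo code above assumes two small constructors, make_node and make_leaf. A minimal sketch of one possible shape for them, using plain tuples (this layout is an assumption for illustration; the solutions below each define their own node type):
 
<syntaxhighlight lang="python"># Assumed layout: (node_type, left, right, value).
def make_node(node_type, left, right=None):
    # interior node: carries children, no value
    return (node_type, left, right, None)

def make_leaf(node_type, value):
    # leaf node (Identifier, Integer, String): carries only a value
    return (node_type, None, None, value)</syntaxhighlight>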
 
;Once the AST is built, it should be output in a [[Flatten_a_list|flattened format]]. This can be as simple as the following:
 
<langsyntaxhighlight lang="python">def prt_ast(t)
if t == NULL
print(";\n")
print("\n")
prt_ast(t.left)
prt_ast(t.right)</syntaxhighlight>
 
;If the AST is correctly built, loading it into a subsequent program should be as simple as:
 
<langsyntaxhighlight lang="python">def load_ast()
line = readline()
# Each line has at least one token
left = load_ast()
right = load_ast()
return make_node(node_type, left, right)</syntaxhighlight>
 
Finally, the AST can also be tested by running it against one of the AST Interpreter [[Compiler/AST_interpreter|solutions]].
|-
| style="vertical-align:top" |
<langsyntaxhighlight lang="c">/*
Simple prime number generator
*/
}
}
print("Total primes found: ", count, "\n");</langsyntaxhighlight>
 
| style="vertical-align:top" |
 
=={{header|ALGOL W}}==
<langsyntaxhighlight lang="algolw">begin % syntax analyser %
% parse tree nodes %
record node( integer type
readToken;
writeNode( parseStatementList( tEnd_of_input ) )
end.</syntaxhighlight>
{{out}}
Output from parsing the Prime Numbers example program.
=={{header|ATS}}==
 
<langsyntaxhighlight ATSlang="ats">(********************************************************************)
(* Usage: parse [INPUTFILE [OUTPUTFILE]]
If INPUTFILE or OUTPUTFILE is "-" or missing, then standard input
end
 
(********************************************************************)</syntaxhighlight>
 
 
=={{header|AWK}}==
Tested with gawk 4.1.1 and mawk 1.3.4.
<syntaxhighlight lang="awk">
<lang AWK>
function Token_assign(tk, attr, attr_array, n, i) {
n=split(attr, attr_array)
prt_ast(t)
}
</syntaxhighlight>
 
{{out|case=count}}
=={{header|C}}==
Tested with gcc 4.8.1 and later; compiles warning-free with -Wall -Wextra.
<langsyntaxhighlight Clang="c">#include <stdio.h>
#include <stdlib.h>
#include <string.h>
init_io(&dest_fp, stdout, "wb", argc > 2 ? argv[2] : "");
prt_ast(parse());
}</syntaxhighlight>
 
{{out|case=prime numbers AST}}
Code by Steve Williams. Tested with GnuCOBOL 2.2.
 
<langsyntaxhighlight lang="cobol"> >>SOURCE FORMAT IS FREE
identification division.
*> this code is dedicated to the public domain
.
end program printast.
end program parser.</syntaxhighlight>
 
{{out|case=Primes}}
 
 
<langsyntaxhighlight lang="lisp">#!/bin/sh
#|-*- mode:lisp -*-|#
#|
(uiop:quit 0)))
 
;;; vim: set ft=lisp lisp:</syntaxhighlight>
 
{{out}}
=={{header|Forth}}==
Tested with Gforth 0.7.3.
<langsyntaxhighlight Forthlang="forth">CREATE BUF 0 , \ single-character look-ahead buffer
: PEEK BUF @ 0= IF KEY BUF ! THEN BUF @ ;
: GETC PEEK 0 BUF ! ;
: -EOI? TOKEN-TYPE End_of_input <> ;
: PARSE $NULL GETTOK BEGIN -EOI? WHILE STMT $SEQUENCE REPEAT ;
PARSE .NODE</syntaxhighlight>
 
{{out|case=Count AST}}
{{works with|gfortran|11.2.1}}
The following is Fortran 2008/2018 code with C preprocessing directives. If you call the program source ‘parse.F90’, with a capital ‘F’, then gfortran will know to run the C preprocessor.
<langsyntaxhighlight lang="fortran">!!!
!!! An implementation of the Rosetta Code parser task:
!!! https://rosettacode.org/wiki/Compiler/syntax_analyzer
end subroutine print_usage
end program parse</syntaxhighlight>
 
{{out}}
=={{header|Go}}==
{{trans|C}}
<langsyntaxhighlight lang="go">package main
 
import (
scanner = bufio.NewScanner(source)
prtAst(parse())
}</syntaxhighlight>
 
{{out}}
 
 
<syntaxhighlight lang="icon">#
<lang Icon>#
# The Rosetta Code Tiny-Language Parser, in Icon.
#
}
return s
end</syntaxhighlight>
 
{{out}}
Implementation:
 
<langsyntaxhighlight Jlang="j">require'format/printf'
tkref=: tokenize 'End_of_input*/%+--<<=>>===!=!&&||print=print(if{else}while;,putc)a""0'
end.
}}
</syntaxhighlight>
 
Some quirks worth noting:
Task example:
 
<syntaxhighlight lang="j">
<lang J>
primes=: {{)n
/*
String "\n"
;
</syntaxhighlight>
 
=={{header|Java}}==
Usage: java Parser infile [>outfile]
{{trans|Python}}
<langsyntaxhighlight lang="java">
import java.io.File;
import java.io.FileNotFoundException;
}
}
</syntaxhighlight>
 
=={{header|Julia}}==
Julia tends to discourage large numbers of global variables, so this direct port from the Python reference implementation moves the globals into a function wrapper.
{{trans|Python}}
<langsyntaxhighlight lang="julia">struct ASTnode
nodetype::Int
left::Union{Nothing, ASTnode}
 
# syntaxanalyzer(length(ARGS) > 1 ? ARGS[1] : stdin) # for use as in the Python code
</syntaxhighlight>
 
=={{header|M2000 Interpreter}}==
 
 
<syntaxhighlight lang="m2000 interpreter">
<lang M2000 Interpreter>
Module syntax_analyzer(b$){
enum tokens {
43 1 End_of_Input
}
</syntaxhighlight>
 
{{out}}
Using the third version of the Nim lexer.
 
<langsyntaxhighlight Nimlang="nim">import ast_lexer
 
type NodeKind* = enum
let code = if paramCount() < 1: stdin.readAll() else: paramStr(1).readFile()
let tree = parse(code)
tree.printAst()</syntaxhighlight>
 
{{out}}
 
 
<langsyntaxhighlight ObjectIconlang="objecticon"># -*- ObjectIcon -*-
#
# The Rosetta Code Tiny-Language Parser, in Object Icon.
}
return s
end</syntaxhighlight>
 
{{out}}
=={{header|Perl}}==
Tested on perl v5.26.1
<langsyntaxhighlight Perllang="perl">#!/usr/bin/perl
 
use strict; # parse.pl - inputs lex, outputs flattened ast
$_[0] <= 0 && /$h Op_or \n/gcx ? "Or\n$ast" . expr(1) :
return $ast while 1;
}</syntaxhighlight>
 
{{out|case=Count AST}}
=={{header|Phix}}==
Reusing lex.e (and core.e) from the [[Compiler/lexical_analyzer#Phix|Lexical Analyzer task]], and again written as a reusable module.
<!--<langsyntaxhighlight Phixlang="phix">(phixonline)-->
<span style="color: #000080;font-style:italic;">--
-- demo\rosetta\Compiler\parse.e
<span style="color: #008080;">return</span> <span style="color: #000000;">t</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
<!--</syntaxhighlight>-->
And a simple test driver for the specific task:
<!--<langsyntaxhighlight Phixlang="phix">(phixonline)-->
<span style="color: #000080;font-style:italic;">--
-- demo\rosetta\Compiler\parse.exw
--main({0,0,"primes.c"}) -- as Algol, C, Python (apart from spacing)
--main({0,0,"count.c"}) -- as AWK ( "" )</span>
<!--</syntaxhighlight>-->
{{out}}
<pre>
=={{header|Python}}==
Tested with Python 2.7 and 3.x
<langsyntaxhighlight Pythonlang="python">from __future__ import print_function
import sys, shlex, operator
 
error(0, 0, "Can't open %s" % sys.argv[1])
t = parse()
prt_ast(t)</syntaxhighlight>
 
{{out|case=prime numbers AST}}
 
 
<langsyntaxhighlight lang="ratfor">######################################################################
#
# The Rosetta Code parser in Ratfor 77.
end
 
######################################################################</syntaxhighlight>
 
 
The following code implements a Precedence Climbing parser for expressions, configurable from a symbol map provided as a parameter, operating on the output of the [http://rosettacode.org/wiki/Compiler/lexical_analyzer#Scala lexer]. The recursive descent language parser closely follows the pseudo code given in the task description; a brief language-agnostic sketch of the precedence-climbing loop follows the Scala code.
 
<langsyntaxhighlight lang="scala">
package xyz.hyperreal.rosettacodeCompiler
 
 
}
</syntaxhighlight>
 
=={{header|Scheme}}==
The code implements a recursive descent parser based on the given grammar. Tested against all programs in [[Compiler/Sample programs]].
 
<langsyntaxhighlight lang="scheme">
(import (scheme base)
(scheme process-context)
(display-ast (parse-file (cadr (command-line))))
(display "Error: provide program filename\n"))
</syntaxhighlight>
 
=={{header|Wren}}==
{{libheader|Wren-fmt}}
{{libheader|wren-ioutil}}
<langsyntaxhighlight lang="ecmascript">import "/dynamic" for Enum, Struct, Tuple
import "/fmt" for Fmt
import "/ioutil" for FileUtil
lines = FileUtil.readLines("source.txt")
lineCount = lines.count
prtAst.call(parse.call())</syntaxhighlight>
 
{{out}}
 
=={{header|Zig}}==
<langsyntaxhighlight lang="zig">
const std = @import("std");
 
return result;
}
</syntaxhighlight>