User:Ed Davis: Difference between revisions

No edit summary
No edit summary
Line 10: Line 10:
or scanner (though "scanner" is also used to refer to the first stage of a lexer).
or scanner (though "scanner" is also used to refer to the first stage of a lexer).


;The Task
==The Task==


Create a lexical analyzer for the Tiny programming language.
Create a lexical analyzer for the Tiny programming language. The
program should read input from a file and/or stdin, and write
output to a file and/or stdout.


;Specification
==Specification==


===Operators===
operators:


{| class="wikitable"
{| class="wikitable"
Line 43: Line 45:
|}
|}


===Symbols===
symbols:


{| class="wikitable"
{| class="wikitable"
Line 61: Line 63:
| ',' || comma || Comma
| ',' || comma || Comma
|}
|}

===Keywords===

{| class="wikitable"
|-
! Characters !! Name
|-
| "if" || If
|-
| "while" || While
|-
| "print" || Print
|-
| "putc" || Putc
|}
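
A minimal sketch of how the keyword table above might be applied once a word has been scanned. The names <code>KEYWORDS</code> and <code>classify_word</code> are hypothetical, and treating keywords as case-sensitive is an assumption rather than something the specification states.

```python
# Hypothetical lookup table built from the Keywords table above.
KEYWORDS = {
    "if": "If",
    "while": "While",
    "print": "Print",
    "putc": "Putc",
}


def classify_word(text):
    """Return the token name for an already-scanned word.

    Words found in KEYWORDS get their keyword token name; anything
    else is reported as Ident. Matching is case-sensitive (assumption).
    """
    return KEYWORDS.get(text, "Ident")


if __name__ == "__main__":
    for word in ("if", "putc", "count"):
        print(word, classify_word(word))
```

Scanning the whole word first and only then consulting the table keeps a word such as <code>iffy</code> from being split into a keyword followed by an identifier.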

===Other entities===


{| class="wikitable"
{| class="wikitable"
Line 74: Line 93:
| string literal || ".*" || String
| string literal || ".*" || String
|}
|}



Notes: For char literals, '\n' is supported as a new line
Notes: For char literals, '\n' is supported as a new line
Line 81: Line 99:
supported.
supported.


'''Comments''' /* ... */ (multi-line)
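
One possible way to skip a multi-line comment is sketched here. The function name <code>skip_comment</code>, the use of a whole-input string, and the choice of <code>SyntaxError</code> are illustrative assumptions, not part of the task.

```python
def skip_comment(text, pos):
    """Skip a /* ... */ comment that starts at text[pos].

    Returns the index of the first character after the closing '*/'.
    Raises SyntaxError if the input ends before the comment is closed.
    """
    # Search from pos + 2 so the '*' of the opener cannot also be
    # used as the '*' of the closer (the input "/*/" is unterminated).
    end = text.find("*/", pos + 2)
    if end == -1:
        raise SyntaxError("End-of-file in comment")
    return end + 2


if __name__ == "__main__":
    source = "/* first line\n   second line */ print"
    rest = source[skip_comment(source, 0):]
    print(repr(rest))
```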


====Complete list of token names====

'''EOI, Print, Putc, If, While, Lbrace, Rbrace, Lparen, Rparen,
Uminus, Mul, Div, Add, Sub, Lss, Gtr, Leq, Neq, And, Semi, Comma,
Assign, Integerk, Stringk, Ident'''

keywords:

{| class="wikitable"
|-
! Characters !! Name
|-
| "if" || If
|-
| "while" || While
|-
| "print" || Print
|-
| "putc" || Putc
|}


===Program output===
comments: /* ... */ (multi-line)

Complete list of token names:

EOI, Print, Putc, If, While, Lbrace, Rbrace, Lparen, Rparen, Uminus, Mul, Div, Add,
Sub, Lss, Gtr, Leq, Neq, And, Semi, Comma, Assign, Integerk, Stringk, Ident


Output of the program should be the line and column where the
Output of the program should be the line and column where the
Line 109: Line 114:
should follow.
should follow.


===Test Cases===

Test Cases
----------


<lang c>
<lang c>
Line 120: Line 123:
</lang>
</lang>


Output
===Output===
------


{| class="wikitable"
{| class="wikitable"
Line 146: Line 148:
</lang>
</lang>


Output
===Output===
------


{| class="wikitable"
{| class="wikitable"
Line 176: Line 177:
|}
|}


Diagnostics:
===Diagnostics===
------------
The following error conditions should be caught:
The following error conditions should be caught:


Empty character constant. Example: ''
* Empty character constant. Example: <nowiki>''</nowiki>
Unknown escape sequence. Example: '\r'
* Unknown escape sequence. Example: '\r'
Multi-character constant. Example: 'xx'
* Multi-character constant. Example: 'xx'
End-of-file in comment. Closing comment characters not found.
* End-of-file in comment. Closing comment characters not found.
End-of-file while scanning string literal. Closing string character not found.
* End-of-file while scanning string literal. Closing string character not found.
End-of-line while scanning string literal. Closing string character not found before end-of-line.
* End-of-line while scanning string literal. Closing string character not found before end-of-line.
Unrecognized character. Example: |
* Unrecognized character. Example: |
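
The three character-constant diagnostics in the list above could be detected along these lines. This is a sketch, not the reference implementation: <code>LexError</code> and <code>scan_char_literal</code> are hypothetical names, accepting <code>'\\'</code> as a backslash is an assumption, and so is the extra end-of-input check.

```python
class LexError(Exception):
    """Hypothetical exception type for lexer diagnostics."""


def scan_char_literal(text, pos):
    """Scan a character constant whose opening quote is at text[pos].

    Returns (integer value, index just past the closing quote).
    """
    i = pos + 1
    if i >= len(text):
        # Assumption: this case is not in the list of diagnostics above.
        raise LexError("End-of-file in character constant")
    if text[i] == "'":
        raise LexError("Empty character constant")
    if text[i] == "\\":
        esc = text[i + 1:i + 2]
        if esc == "n":
            value = 10  # newline
        elif esc == "\\":
            value = 92  # backslash (assumed to be supported)
        else:
            raise LexError("Unknown escape sequence \\" + esc)
        i += 2
    else:
        value = ord(text[i])
        i += 1
    if text[i:i + 1] != "'":
        raise LexError("Multi-character constant")
    return value, i + 1


if __name__ == "__main__":
    for sample in ("'a'", "'\\n'", "''", "'\\r'", "'xx'"):
        try:
            print(sample, scan_char_literal(sample, 0))
        except LexError as err:
            print(sample, "error:", err)
```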


Refer additional questions to the C and Python implementations.
Refer additional questions to the C and Python implementations.