Compiler/lexical analyzer: Difference between revisions
(use shorter example in the whitespace section) |
(template "introheader" was renamed to "task heading") |
||
Line 5: | Line 5: | ||
: ''Lexical analysis is the process of converting a sequence of characters (such as in a computer program or web page) into a sequence of tokens (strings with an identified "meaning"). A program that performs lexical analysis may be called a lexer, tokenizer, or scanner (though "scanner" is also used to refer to the first stage of a lexer).'' |
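As a rough illustration of the definition above (characters in, tokens out), here is a minimal Python sketch. The token names and patterns in `TOKEN_SPEC` are hypothetical and chosen only for illustration; they are not the token set this task specifies.

```python
import re

# Illustrative token patterns only (an assumption for this sketch,
# not the token set defined by the task).
TOKEN_SPEC = [
    ("INTEGER",    r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OP",         r"[+\-*/=]"),
    ("SKIP",       r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Convert a string into a list of (token_name, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unrecognized character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, `tokenize("x = 42 + y")` yields an identifier, an operator, an integer, an operator, and an identifier, each paired with the matched text.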
||
{{task heading}} |
|||
{{introheader|The Task}} |
|||
Create a lexical analyzer for the simple programming language specified below. The |
||
Line 12: | Line 12: | ||
if two versions of the solution are provided: One without the lexer module, and one with. |
||
{{ |
{{task heading|Input Specification}} |
||
The simple programming language to be analyzed is more or less a subset of [[C]]. It supports the following tokens: |
||
Line 162: | Line 162: | ||
</pre> |
</pre> |
||
{{ |
{{task heading|Output Format}} |
||
The program output should be a sequence of lines, each consisting of the following whitespace-separated fields: |
||
Line 171: | Line 171: | ||
# the token value (only for <tt>IDENTIFIER</tt>, <tt>INTEGER</tt>, and <tt>STRING</tt> tokens) |
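The whitespace-separated output line described here could be produced along the following lines. This is a minimal Python sketch under stated assumptions: only the token-value field is visible in this hunk, so the leading line number, column number, and token name fields, the single-space separator, and the helper name `format_token` are all assumptions made for illustration.

```python
# Token kinds that carry a value, as stated in the list item above.
VALUED_TOKENS = {"IDENTIFIER", "INTEGER", "STRING"}


def format_token(line, column, name, value=None):
    """Return one output line of whitespace-separated fields.

    Assumption: the fields preceding the value are the line number,
    the column number, and the token name, in that order.
    """
    fields = [str(line), str(column), name]
    # The value field is emitted only for tokens that carry one.
    if name in VALUED_TOKENS and value is not None:
        fields.append(str(value))
    return " ".join(fields)
```

Calling `format_token(1, 1, "IDENTIFIER", "count")` gives a four-field line, while a token kind outside `VALUED_TOKENS` gives three fields even if a value is passed.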
||
{{ |
{{task heading|Diagnostics}} |
||
The following error conditions should be caught: |
||
Line 199: | Line 199: | ||
|} |
|} |
||
{{ |
{{task heading|Test Cases}} |
||
{| class="wikitable" |
{| class="wikitable" |
||
Line 307: | Line 307: | ||
|} |
|} |
||
{{ |
{{task heading|Reference}} |
||
The Flex, C, Python and Euphoria versions can be considered reference implementations. |