Compiler/lexical analyzer: Difference between revisions

update the token names in the Output Format section of the task description
Document END_OF_INPUT and comments more explicitly
Line 82:
|}
 
;Identifiers and literals
;Other entities
 
These differ from the previous tokens in that each occurrence of them has a value associated with it.
Line 124:
** Char literals cannot represent a single quote character (value 39).
** String literals cannot represent strings containing double quote characters.
 
;Zero-width tokens
 
{| class="wikitable"
|-
! Name || Location
|-
| <tt>END_OF_INPUT</tt> || when the end of the input stream is reached
|}
 
;White space
 
* Zero or more whitespace characters, or comments enclosed in <code>/* ... */</code>, are allowed between any two tokens, with the exceptions noted below.
* "Longest token matching" is used to resolve conflicts (e.g., in order to match '''<=''' as a single token rather than the two tokens '''<''' and '''=''').
* Whitespace is ''required'' between two tokens that have an alphanumeric character or underscore at the edge.
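
The white space rules above can be illustrated with a minimal Python sketch. It is not a reference solution for the task. It handles only identifiers, integers, and a few operators, and has no keywords and no char or string literals. The function name <code>tokenize</code>, the <code>OPERATORS</code> table, and the operator token names in it are assumptions made for the example; only <tt>END_OF_INPUT</tt> and the <code>/* ... */</code> comment syntax come from the text above.

```python
import re

# Hypothetical subset of operators; the token names are assumptions for this sketch.
OPERATORS = {
    "<=": "Op_lessequal",
    "<": "Op_less",
    "=": "Op_assign",
    "+": "Op_add",
}

# Zero or more runs of whitespace or /* ... */ comments (comments may span lines).
SKIP = re.compile(r"(?:\s+|/\*.*?\*/)*", re.DOTALL)
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
NUMBER = re.compile(r"[0-9]+")


def tokenize(text):
    """Return a list of (name, value) pairs ending with END_OF_INPUT."""
    tokens = []
    pos = 0
    while True:
        # Whitespace and comments are allowed between any two tokens.
        pos = SKIP.match(text, pos).end()
        if pos == len(text):
            # Zero-width token emitted when the input is exhausted.
            tokens.append(("END_OF_INPUT", None))
            return tokens

        m = WORD.match(text, pos)
        if m:
            tokens.append(("Identifier", m.group()))
            pos = m.end()
            continue

        m = NUMBER.match(text, pos)
        if m:
            # Whitespace is required between tokens with alphanumeric edges,
            # so a letter or underscore directly after digits is an error.
            nxt = text[m.end():m.end() + 1]
            if nxt.isalnum() or nxt == "_":
                raise ValueError(f"invalid number at offset {pos}")
            tokens.append(("Integer", int(m.group())))
            pos = m.end()
            continue

        # Longest token matching: try longer operators before shorter ones.
        for op in sorted(OPERATORS, key=len, reverse=True):
            if text.startswith(op, pos):
                tokens.append((OPERATORS[op], None))
                pos += len(op)
                break
        else:
            raise ValueError(f"unrecognized character at offset {pos}")
```

With this sketch, <code>tokenize("a<=/* c */ 1")</code> yields an identifier, a single <code><=</code> token, an integer, and <tt>END_OF_INPUT</tt>, whereas <code>tokenize("a < = b")</code> yields separate <code><</code> and <code>=</code> tokens because the whitespace splits them. An unterminated comment is not skipped and is reported as an unrecognized character.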
Line 163 ⟶ 172:
 
They should produce the same token stream, except for the line and column positions.
 
Comments enclosed in <code>/* ... */</code> are also treated as whitespace outside of strings.
 
;Complete list of token names