
Compiler/lexical analyzer: Difference between revisions

Document END_OF_INPUT and comments more explicitly
(update the token names in the Output Format section of the task description)
Line 82:
|}
 
;Identifiers and literals
;Other entities
 
These differ from the previous tokens in that each occurrence of them has a value associated with it.
Line 124:
** Char literals cannot represent a single quote character (value 39).
** String literals cannot represent strings containing double quote characters.
 
;Zero-width tokens
 
{| class="wikitable"
|-
! Name || Location
|-
| <tt>END_OF_INPUT</tt> || when the end of the input stream is reached
|}
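
As an illustration of what "zero-width" means here, the following minimal sketch computes where an <tt>END_OF_INPUT</tt> token would sit: just past the last character, consuming nothing. The function name <code>end_of_input_token</code>, the <code>(line, column, name)</code> tuple shape, and the 1-based positions are assumptions made for this example, not part of the task text.

```python
def end_of_input_token(source: str) -> tuple[int, int, str]:
    """Return a hypothetical (line, column, name) triple for END_OF_INPUT.

    Assumes 1-based line and column numbers and '\n' as the only line
    terminator. The token covers no characters: its position is the
    slot immediately after the last character of the input.
    """
    line = source.count("\n") + 1
    last_newline = source.rfind("\n")  # -1 when the input has no newline
    column = len(source) - last_newline
    return (line, column, "END_OF_INPUT")
```

For an empty input this places the token at line 1, column 1, so even a file with no other tokens still yields exactly one token.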
 
;White space
 
* Zero or more whitespace characters, or comments enclosed in <code>/* ... */</code>, are allowed between any two tokens, with the exceptions noted below.
* "Longest token matching" is used to resolve conflicts (e.g., in order to match '''<=''' as a single token rather than the two tokens '''<''' and '''=''').
* Whitespace is ''required'' between two tokens that have an alphanumeric character or underscore at the edge.
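
The first two rules above can be sketched as follows. This is a minimal illustration restricted to a hypothetical three-operator subset ('''<=''', '''<''', '''='''); the names <code>OPERATORS</code>, <code>skip_blank</code>, <code>match_operator</code> and <code>split_operators</code> are invented for the example, and a full lexer would also need identifiers, literals and the remaining operators.

```python
OPERATORS = ["<=", "<", "="]  # hypothetical subset, for illustration only


def skip_blank(src: str, pos: int) -> int:
    """Advance past any run of whitespace and /* ... */ comments."""
    while pos < len(src):
        if src[pos].isspace():
            pos += 1
        elif src.startswith("/*", pos):
            end = src.find("*/", pos + 2)
            if end == -1:
                raise SyntaxError("unterminated comment")
            pos = end + 2
        else:
            break
    return pos


def match_operator(src: str, pos: int):
    """Longest token matching: try the longer candidates first."""
    for op in sorted(OPERATORS, key=len, reverse=True):
        if src.startswith(op, pos):
            return op
    return None


def split_operators(src: str) -> list[str]:
    """Split a string made only of the sample operators and blanks."""
    tokens = []
    pos = skip_blank(src, 0)
    while pos < len(src):
        op = match_operator(src, pos)
        if op is None:
            raise SyntaxError(f"unrecognized character at offset {pos}")
        tokens.append(op)
        pos = skip_blank(src, pos + len(op))
    return tokens
```

With this sketch, <code>split_operators("<=")</code> gives one token, while <code>split_operators("< =")</code> and <code>split_operators("</* c */=")</code> each give two, since whitespace or a comment separates the characters.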
Line 163 ⟶ 172:
 
They should produce the same token stream, except for the line and column positions.
 
Comments enclosed in <code>/* ... */</code> are also treated as whitespace outside of strings.
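
One way to picture "outside of strings" is the sketch below, which replaces each comment with a single space while copying quoted literals through untouched. It relies on the restrictions stated earlier (string literals cannot contain a double quote, char literals cannot represent a single quote), so a literal is assumed to end at the next matching quote character. The function name <code>blank_out_comments</code> is hypothetical, and a real lexer would more likely skip comments in place so that line and column positions are preserved.

```python
def blank_out_comments(src: str) -> str:
    """Replace each /* ... */ comment outside quoted literals with a space.

    Assumption: a literal opened by " or ' ends at the next occurrence
    of the same quote character. An unterminated literal is copied
    through unchanged and left for the lexer proper to report.
    """
    out = []
    pos = 0
    while pos < len(src):
        ch = src[pos]
        if ch in "\"'":
            end = src.find(ch, pos + 1)
            if end == -1:
                end = len(src) - 1
            out.append(src[pos:end + 1])  # copy the literal verbatim
            pos = end + 1
        elif src.startswith("/*", pos):
            end = src.find("*/", pos + 2)
            if end == -1:
                raise SyntaxError("unterminated comment")
            out.append(" ")  # the comment acts as whitespace
            pos = end + 2
        else:
            out.append(ch)
            pos += 1
    return "".join(out)
```

Here <code>blank_out_comments('a/* x */b')</code> yields <code>'a b'</code>, whereas a string literal such as <code>"/* no */"</code> passes through unchanged.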
 
;Complete list of token names