Compiler/lexical analyzer: Difference between revisions

some more rewording in the task description
(Re-order and rename columns of the token tables, and add more details. (This doesn't affect the task content.))
(some more rewording in the task description)
Line 84:
;Other entities
 
These differ from the previous tokens in that each occurrence of them has a value associated with it.
 
{| class="wikitable"
Line 119:
|}
 
* For char and string literals, the <code>\n</code> escape sequence is supported to represent a new-line character.
* For char literals, to represent a backslash, use <code>\\</code>.
* For char literals, an embedded single quote character is not supported.
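The escape rules for char literals can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name <code>decode_char_literal</code> is hypothetical, the lexeme is assumed to arrive with its surrounding single quotes, and returning the decoded character (rather than some other representation of the value) is a choice made for the example, not something the task prescribes.

```python
def decode_char_literal(lexeme: str) -> str:
    """Decode a char literal lexeme (quotes included) into the character it denotes.

    Hypothetical helper for illustration; only the rules listed above are handled.
    """
    if len(lexeme) < 3 or lexeme[0] != "'" or lexeme[-1] != "'":
        raise ValueError(f"malformed char literal: {lexeme}")
    body = lexeme[1:-1]
    if body == "\\n":       # backslash followed by n: new-line character
        return "\n"
    if body == "\\\\":      # two backslashes: a single backslash
        return "\\"
    # Any other single character stands for itself, except an embedded
    # single quote (not supported) or a lone backslash (incomplete escape).
    if len(body) == 1 and body not in ("'", "\\"):
        return body
    raise ValueError(f"unsupported char literal: {lexeme}")
```

For instance, <code>decode_char_literal("'a'")</code> yields the letter <code>a</code>, while a literal consisting of three single quotes is rejected.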
Line 128:
 
* Zero or more whitespace characters or comments are allowed between any two tokens, with the exceptions noted below.
* "Longest token matching" is used to resolve conflicts (e.g., in order to match '''<=''' as a single token rather than the two tokens '''<''' and '''=''').
* Whitespace is ''required'' between two tokens that have an alphanumeric character or underscore at the edge.
** This means: keywords, identifiers, and integer literals.
** e.g. <code>ifprint</code> is recognized as an identifier, instead of the keywords <tt>if</tt> and <tt>print</tt>.
** e.g. <code>42fred</code> is invalid, and recognized as neither a number nor an identifier.
* Whitespace is ''not allowed'' inside of tokens (except for chars and strings where they are part of the value).
** e.g. <code>& &</code> is invalid, and not interpreted as the <tt>&&</tt> operator; multi-character operators such as '''&&''' and '''<=''' are recognized only without embedded whitespace.
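
As a rough illustration of the whitespace and longest-match rules, here is a minimal Python sketch. It is not a solution to the task: the names <code>tokenize</code>, <code>KEYWORDS</code> and <code>TOKEN_RE</code> are hypothetical, only the two keywords and a handful of operators mentioned in this description are covered, and comments, char literals and string literals are left out.

```python
import re
import string

# Illustrative subset only: the two keywords named in the examples above.
KEYWORDS = {"if", "print"}

# Characters that may not directly follow an integer literal.
WORD_START = string.ascii_letters + "_"

# Python's regex alternation is ordered, not longest-first, so the longer
# operator "<=" is listed before "<" and "=" to get longest token matching.
TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)
  | (?P<word>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<int>[0-9]+)
  | (?P<op><=|&&|<|=)
""", re.VERBOSE)


def tokenize(text):
    """Split text into (kind, lexeme) pairs, raising ValueError on invalid input."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unrecognized character at offset {pos}: {text[pos]!r}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "ws":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "int" and pos < len(text) and text[pos] in WORD_START:
            # e.g. 42fred: neither a number nor an identifier
            raise ValueError(f"invalid number: {lexeme}{text[pos]}...")
        if kind == "word":
            # A whole word is matched first, so ifprint stays one identifier.
            kind = "keyword" if lexeme in KEYWORDS else "identifier"
        tokens.append((kind, lexeme))
    return tokens
```

In this sketch <code>tokenize("a<=b")</code> produces a single <code><=</code> operator token between the two identifiers, <code>tokenize("ifprint")</code> produces one identifier, and both <code>tokenize("42fred")</code> and <code>tokenize("& &")</code> raise <code>ValueError</code> (the latter because a lone <code>&</code> is not among the operators assumed here).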
 
The following programs are equivalent:
Anonymous user