Strip block comments: Difference between revisions

If the delimiters were single characters only, as with text literals, the task would be easy: scan the text character by character and change state accordingly, though there would be complications if, say, two quote characters in a row were to signify a single internal quote. A DO-loop would do for the scan. But with multi-character delimiters the scan has to lurch over each match, and fiddling the index variable of a DO-loop is frowned upon. So instead, slog it out. And be annoyed afresh by the indeterminacy of boolean expression evaluation of the form A '''and''' B or A '''or''' B in the context where the test is ''safe'' '''and''' ''test'': Fortran does not promise to skip ''test'' when ''safe'' is false, and the test might provoke an out-of-bounds fault if not safely within bounds. For instance, THIS is being tested against the text in ACARD, but the comparison must not run beyond the end of the text in ACARD.
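The ''safe'' '''and''' ''test'' difficulty can be sidestepped by making the bounds check a statement of its own, so that the substring comparison is reached only when it is known to be within bounds. The following is a minimal sketch of that idea, not the routine under discussion: the function name <code>matches_at</code> and the sample text are invented for illustration.

```fortran
program safe_match
  implicit none
  ! Sample card, 17 characters long; "/*" starts at 8 and "*/" at 16.
  character(len=*), parameter :: acard = "a = b; /* note */"

  print *, matches_at(acard,  8, "/*")   ! T
  print *, matches_at(acard, 16, "*/")   ! T
  print *, matches_at(acard, 17, "*/")   ! F, and no out-of-bounds substring
contains
  ! Hypothetical helper: is THIS present in TEXT starting at position I?
  logical function matches_at(text, i, this)
    character(len=*), intent(in) :: text, this
    integer, intent(in) :: i
    matches_at = .false.
    ! Both operands here are harmless to evaluate, so .or. is safe.
    if (i < 1 .or. i + len(this) - 1 > len(text)) return
    ! Only reached when the substring lies wholly within TEXT.
    matches_at = text(i:i + len(this) - 1) == this
  end function matches_at
end program safe_match
```

Because the two sides of the final comparison have the same length, no blank padding is involved in it.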
 
The removal of delimited text is taken literally: an incoming card's content might be entirely within a block comment and so be entirely rejected; if so, a null line results in ALINE, and it is ''not'' written to the output file. In other words, lines of block comment are not preserved as blank lines, nor as null lines: they are simply not there. Only if a line contains text outside of a block comment will it survive. Outside delimited block comments, spaces are just as valid as any other symbol, and are preserved. So, for example, it is <code>a =  b + c ;</code> not <code>a = b + c ;</code> - two spaces after the = sign. Similarly, trailing spaces on a line survive - though I have UltraEdit set to trim trailing spaces and it is not clear whether the example source is to be regarded as containing them or not.
 
The presence of the delimiters is determined without context, for instance irrespective of whether or not they are inside quoted strings. A delimiter split across lines is not recognised, even though in Fortran itself such splitting is permissible. If this process is applied to its own source file, then the only change is to produce <code>CALL UNBLOCK("")</code>, which could be avoided if the statement were to be <code>CALL UNBLOCK("/"//"*","*/")</code> so that the starting delimiter would not be self-identifying. The ending delimiter could be treated in the same way if there were a fear that a block comment might have been started earlier in the source file.
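The behaviour described above (comment state carried from card to card, spaces preserved, delimiters recognised without regard to context, and wholly-commented cards dropped) might be sketched as follows. This is an illustration under assumptions, not the UNBLOCK routine itself: the names <code>strip</code>, <code>matches_at</code>, <code>inside</code> and the sample cards are all invented, and the cards are trimmed before processing only because the demonstration array pads them to a common length.

```fortran
program strip_demo
  implicit none
  character(len=*), parameter :: cards(4) = [ character(len=24) :: &
       "a =  b + c ; /* start",   &
       "   entirely inside",      &
       "end */ d = e ;",          &
       "f = g ; /* x */ h = i ;" ]
  character(len=24) :: aline
  logical :: inside
  integer :: k, n

  inside = .false.                      ! not in a comment at the start
  do k = 1, size(cards)
    call strip(trim(cards(k)), "/"//"*", "*/", inside, aline, n)
    ! A card that yields nothing is dropped, not written as a blank line.
    if (n > 0) print "(a)", "[" // aline(1:n) // "]"
  end do
contains
  ! Copy to ALINE(1:N) the parts of ACARD outside block comments.
  ! INSIDE carries the comment state from one card to the next.
  subroutine strip(acard, opener, closer, inside, aline, n)
    character(len=*), intent(in)  :: acard, opener, closer
    logical, intent(inout)        :: inside
    character(len=*), intent(out) :: aline   ! assumed at least len(acard)
    integer, intent(out)          :: n
    integer :: i
    aline = ""
    n = 0
    i = 1
    do while (i <= len(acard))          ! index advanced by hand, not by DO
      if (inside) then
        if (matches_at(acard, i, closer)) then
          inside = .false.
          i = i + len(closer)           ! lurch over the ending delimiter
        else
          i = i + 1                     ! comment text is discarded
        end if
      else if (matches_at(acard, i, opener)) then
        inside = .true.
        i = i + len(opener)             ! lurch over the starting delimiter
      else
        n = n + 1
        aline(n:n) = acard(i:i)         ! spaces are copied like anything else
        i = i + 1
      end if
    end do
  end subroutine strip

  ! Bounds are checked first, in a separate statement.
  logical function matches_at(text, i, this)
    character(len=*), intent(in) :: text, this
    integer, intent(in) :: i
    matches_at = .false.
    if (i < 1 .or. i + len(this) - 1 > len(text)) return
    matches_at = text(i:i + len(this) - 1) == this
  end function matches_at
end program strip_demo
```

With these sample cards the sketch writes three lines, <code>[a =  b + c ; ]</code>, <code>[ d = e ;]</code> and <code>[f = g ;  h = i ;]</code>; the second card lies wholly within the comment and produces no output. The starting delimiter is written as <code>"/"//"*"</code> so that the sketch, fed to itself, would not see a comment opening in its own CALL statement.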
 
A feature of Fortran's character comparison is that trailing spaces are ignored, so that "x  " and "x " and "x" are all deemed equal. Unfortunate choices of starting and ending delimiter texts can be made if they contain characters in common.
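The padding rule is easily demonstrated. In this small sketch (the delimiter texts are chosen only for illustration) the shorter operand of each comparison is extended with blanks before comparing, so texts differing only in trailing spaces compare equal even though their lengths differ.

```fortran
program trailing_spaces
  implicit none
  character(len=3) :: a = "x"            ! stored as "x" plus two blanks

  print *, "x" == "x  "                  ! T: shorter side is blank-padded
  print *, a == "x "                     ! T
  print *, len("x") == len("x  ")        ! F: the lengths still differ
  print *, "*/" == "*/ "                 ! T: a trailing space in a delimiter
                                         !    does not affect the comparison
  print *, len_trim("*/ ")               ! 2
end program trailing_spaces
```

One consequence is that a delimiter text supplied with trailing spaces cannot be distinguished by comparison from the same text without them, unless the lengths are checked as well.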