String comparison: Difference between revisions

→‎{{header|Fortran}}: Ah yes, masking...
Exact matching is problematic, because trailing spaces are disregarded (the shorter item is treated as if padded with spaces to the length of the longer) so that <code>"blah" .EQ. "blah "</code> yields ''true''. Thus, the quality of equality is strained. To test for "exact" equality the lengths would have to be compared also, as in <code>TEXT1.EQ.TEXT2 .AND. LEN(TEXT1).EQ.LEN(TEXT2)</code>.
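A minimal sketch of the difference between the padded comparison and the length-checked one. The program name EXACTEQ and the two variable lengths are chosen purely for illustration.

```fortran
PROGRAM EXACTEQ
  IMPLICIT NONE
  CHARACTER(LEN=4) :: TEXT1     ! holds "blah"
  CHARACTER(LEN=5) :: TEXT2     ! holds "blah" plus one trailing space
  TEXT1 = "blah"
  TEXT2 = "blah "
  ! .EQ. treats the shorter item as padded with spaces, so this is true.
  PRINT *, "Padded comparison: ", TEXT1 .EQ. TEXT2
  ! Comparing the lengths as well exposes the difference, so this is false.
  PRINT *, "Exact comparison:  ", TEXT1 .EQ. TEXT2 .AND. LEN(TEXT1) .EQ. LEN(TEXT2)
END PROGRAM EXACTEQ
```

Note that LEN reports the declared length of a variable, not the length of the text last assigned to it, so the length test is only meaningful when the variables (or dummy arguments) are sized to their content.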
 
All character comparisons are literal: case counts. There is no facility for case insensitive comparison (though in principle a compiler could offer to do so via non-standard mnemonics) and there often are no library routines for case conversion. The usual procedure is to copy both items to scratch variables (with all the annoyance of "how big?"), convert both to upper case (or both to lower case) and then compare. Or, rather than copying the strings, one might code a character-by-character comparison and handle case differences one character at a time. With single-character comparison one can use ICHAR(''c'') to obtain the numerical value of the character code, and this enables the use of the three-way test of the arithmetical-IF as in <code>IF (ICHAR(TEXT1(L:L)) - ICHAR(TEXT2(L:L))) ''negative'',''equal'',''positive''</code>, where ''negative'' would be the statement label jumped to should character L of TEXT2 be greater than character L of TEXT1. With equality, one would increment L and, after checking that TEXT1 and TEXT2 both had another character L available, test afresh.
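The copy-and-convert procedure might be sketched as follows. This is an illustration only: the names UPPER and UPCASE are hypothetical, the code assumes a character set in which the lower case letters are contiguous (as in ASCII but not EBCDIC), and it relies on Fortran 90 or later so that the function result can be sized by LEN(TEXT), which sidesteps the "how big?" question.

```fortran
MODULE UPPER
  IMPLICIT NONE
CONTAINS
  ! Return a copy of TEXT with lower case letters converted to upper case.
  FUNCTION UPCASE(TEXT) RESULT(OUT)
    CHARACTER(LEN=*), INTENT(IN) :: TEXT
    CHARACTER(LEN=LEN(TEXT)) :: OUT   ! scratch copy, same length as TEXT
    INTEGER :: L, C
    OUT = TEXT
    DO L = 1, LEN(TEXT)
      C = ICHAR(TEXT(L:L))
      ! Assumes "a".."z" occupy consecutive character codes.
      IF (C .GE. ICHAR("a") .AND. C .LE. ICHAR("z")) THEN
        OUT(L:L) = CHAR(C - ICHAR("a") + ICHAR("A"))
      END IF
    END DO
  END FUNCTION UPCASE
END MODULE UPPER

PROGRAM DEMO
  USE UPPER
  IMPLICIT NONE
  ! Both items are converted, then compared with the ordinary operators.
  PRINT *, UPCASE("Blah") .EQ. UPCASE("bLAH")
  PRINT *, UPCASE("Blah") .LT. UPCASE("bleat")
END PROGRAM DEMO
```

Under the ASCII assumption both lines should print true: the first pair differ only in case, and in the second "BLAH" precedes "BLEAT" because "A" precedes "E".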
 
To accommodate case insensitivity, one could use an AND operation to mask off the bits distinguishing a lower case letter from an upper case letter, but should non-letter characters be involved, this may mask other differences as well. Instead, prepare an array of 256 integers, say UC, where UC(''i'') = ''i'' except for those indices corresponding to the character code values of the lower case letters, for which the array has instead the value of the corresponding upper case letter. Then the comparison might be <code>IF (UC(ICHAR(TEXT1(L:L))) - UC(ICHAR(TEXT2(L:L)))) ''negative'',''equal'',''positive''</code> so that case insensitivity is achieved without the annoyance of multiple testing or case conversion but at the cost of array access. And by re-arranging values in array UC, a custom ordering of the character codes could be achieved at no extra cost.
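The table approach, and the hazard of masking, might be sketched as below. The names CASELESS, PREPUC and CMPNC are hypothetical, and several assumptions are made: character codes lie in 0 to 255, the lower case letters are contiguous as in ASCII, and when one item is a prefix of the other the shorter is deemed the lesser (the text above leaves that choice open). An IF block stands in for the arithmetic IF, which recent standards no longer include.

```fortran
MODULE CASELESS
  IMPLICIT NONE
  INTEGER :: UC(0:255)          ! translation table indexed by character code
CONTAINS
  ! Fill UC so that UC(i) = i, except that lower case maps to upper case.
  SUBROUTINE PREPUC
    INTEGER :: I
    DO I = 0, 255
      UC(I) = I
    END DO
    DO I = ICHAR("a"), ICHAR("z")   ! assumes contiguous letters (ASCII)
      UC(I) = I - ICHAR("a") + ICHAR("A")
    END DO
  END SUBROUTINE PREPUC

  ! Three-way caseless comparison: -1, 0 or +1.
  INTEGER FUNCTION CMPNC(TEXT1, TEXT2)
    CHARACTER(LEN=*), INTENT(IN) :: TEXT1, TEXT2
    INTEGER :: L, D
    DO L = 1, MIN(LEN(TEXT1), LEN(TEXT2))
      D = UC(ICHAR(TEXT1(L:L))) - UC(ICHAR(TEXT2(L:L)))
      IF (D .LT. 0) THEN
        CMPNC = -1
        RETURN
      ELSE IF (D .GT. 0) THEN
        CMPNC = +1
        RETURN
      END IF
    END DO
    ! All shared positions match: assumed tie-break on length.
    CMPNC = 0
    IF (LEN(TEXT1) .LT. LEN(TEXT2)) CMPNC = -1
    IF (LEN(TEXT1) .GT. LEN(TEXT2)) CMPNC = +1
  END FUNCTION CMPNC
END MODULE CASELESS

PROGRAM DEMO
  USE CASELESS
  IMPLICIT NONE
  CALL PREPUC
  PRINT *, CMPNC("Blah", "bLAH")
  PRINT *, CMPNC("apple", "Banana")
  ! Masking with 223 clears the bit of value 32, which merges "{" with "[".
  PRINT *, IAND(ICHAR("{"), 223) .EQ. IAND(ICHAR("["), 223)
  ! The table leaves non-letters alone, so they remain distinct.
  PRINT *, UC(ICHAR("{")) .EQ. UC(ICHAR("["))
END PROGRAM DEMO
```

With ASCII codes the two comparisons should give 0 and -1, the masked test reports "{" and "[" as equal, and the table test reports them as different. Filling UC with some other arrangement of values would give a custom collating order with no change to CMPNC.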
 
=={{header|FreeBASIC}}==