String comparison: Difference between revisions

→‎{{header|Fortran}}: Character comparisons can be done via arithmetic.
Fortran 77 introduced the CHARACTER*''n'' TEXT type, whereby a variable is declared to have a fixed amount of space, of ''n'' characters, so that trailing spaces are usual. There is no "length" attribute: LEN(TEXT) reports not the current length of the string but its size, which remains ''n''. F90, however, introduced facilities whereby a character variable could be redefined to have the required size each time a value is assigned to it, and this scheme became part of the language with F2003.
 
Comparisons, on the other hand, have been available from the start, as mnemonics "stropped" by periods: <code>.LT.</code> <code>.LE.</code> <code>.EQ.</code> <code>.NE.</code> <code>.GE.</code> <code>.GT.</code>. More flexible compilers (F90 and later) also recognise, respectively, <code><</code> <code><=</code> <code>==</code> (a single = being committed to representing assignment) <code>/=</code> (most ASCII keyboards lacking the ¬ character that is present on IBM keyboards for EBCDIC) <code>>=</code> <code>></code>. Character string comparison is therefore straightforward when both entities are character, and the usage has the same form as when both are numeric. There is no facility for comparing a numeric value such as 12345 to a character sequence "12345", because these are of incompatible types with very different bit patterns. You would have to convert the number to a character sequence, or the character sequence to a number; the latter is complicated by the possibility that the character sequence does not present a valid number, as in "12three45". Thus the comparison operators are polymorphic in application (to characters or to numbers) but unbending in use, as a comparison can only be made between entities of the same type. Similarly, although the existence of > ''etc.'' implies the operation of subtraction, this is not allowed for character variables, and so the three-way choice of result via the arithmetic-IF is unavailable.
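
As a rough illustration of the points above, the following sketch (F90 or later) uses both spellings of the operators and shows one possible way of comparing a number with a character sequence, by converting the text with an internal READ and inspecting IOSTAT. The program and variable names (COMPARE_OPS, A, B, TEXT, N, IOS) are invented for the example, and the results noted for the first three PRINT statements assume an ASCII collating sequence.

<syntaxhighlight lang="fortran">
program compare_ops
  implicit none
  character(len=5) :: a, b
  character(len=9) :: text
  integer :: n, ios

  a = "apple"
  b = "berry"

  ! The stropped mnemonics and the F90 symbols are interchangeable.
  print *, a .lt. b, a < b      ! T T
  print *, a .eq. b, a == b     ! F F
  print *, a .ne. b, a /= b     ! T T

  ! A number cannot be compared with text directly: convert first.
  ! Here the text is read as an integer via an internal READ.
  text = "12345"
  read (text, *, iostat=ios) n
  if (ios == 0) then
    print *, "valid number, equal to 12345: ", n == 12345
  else
    print *, "not a valid number: ", text
  end if

  ! An invalid sequence is expected to be reported through a
  ! nonzero IOSTAT rather than by stopping the program.
  text = "12three45"
  read (text, *, iostat=ios) n
  if (ios == 0) then
    print *, "valid number: ", n
  else
    print *, "not a valid number: ", text
  end if
end program compare_ops
</syntaxhighlight>

Converting in the other direction (an internal WRITE of the number into a character variable, followed by a character comparison) would also serve, at the cost of having to agree on the layout of the digits.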
 
Exact matching is problematic because trailing spaces are disregarded: the shorter operand is treated as if it were extended with spaces to the length of the longer, so that <code>"blah" .EQ. "blah "</code> yields ''true''. Thus, the quality of equality is strained. To test for "exact" equality the lengths must be compared as well, as in <code>TEXT1.EQ.TEXT2 .AND. LEN(TEXT1).EQ.LEN(TEXT2)</code>.
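
The test just given can be wrapped in a small function, as in this minimal sketch. The name EXACT is invented here and is not an intrinsic; the dummy arguments are declared with assumed length so that LEN reports the size of whatever is actually passed.

<syntaxhighlight lang="fortran">
program exact_match
  implicit none

  print *, "blah" .eq. "blah "      ! T: trailing spaces are disregarded
  print *, exact("blah", "blah ")   ! F: the sizes differ (4 and 5)
  print *, exact("blah", "blah")    ! T

contains

  ! Hypothetical helper: equal content and equal size.
  logical function exact(text1, text2)
    character(len=*), intent(in) :: text1, text2
    exact = text1 .eq. text2 .and. len(text1) .eq. len(text2)
  end function exact

end program exact_match
</syntaxhighlight>

Note that for fixed-size variables LEN gives the declared size, so two CHARACTER*''n'' variables of the same ''n'' will always pass the length test whatever their trailing spaces.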
 
All character comparisons are literal: case counts. There is no facility for case-insensitive comparison (though in principle a compiler could offer one via non-standard mnemonics) and there are often no library routines for case conversion. The usual procedure is to copy both items to scratch variables (with all the annoyance of "how big?"), convert both to upper case (or both to lower case) and then compare. Or, rather than copying the strings, one might code a character-by-character comparison that handles case differences one character at a time. With single-character comparison one can use ICHAR(''c'') to obtain the numerical value of the character code, and this enables the use of the three-way test of the arithmetic-IF, as in <code>IF (ICHAR(TEXT1(L:L)) - ICHAR(TEXT2(L:L))) ''negative'',''equal'',''positive''</code>, where ''negative'' would be the statement label jumped to should character L of TEXT2 be greater than character L of TEXT1. On equality, one would increment L and, after checking that both TEXT1 and TEXT2 have a character L available, test afresh. Further, to accommodate case insensitivity, one could prepare an array of 256 integers, say UC, where UC(''i'') = ''i'' except for the indices corresponding to the character codes of the lower case letters, which instead hold the codes of the corresponding upper case letters. Then the comparison might be <code>IF (UC(ICHAR(TEXT1(L:L))) - UC(ICHAR(TEXT2(L:L)))) ''negative'',''equal'',''positive''</code>, whereby case insensitivity is achieved without the annoyance of multiple testing or case conversion, but at the cost of array access.
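
The table-driven scheme might be sketched as follows. This is an illustration under stated assumptions rather than a definitive routine: the function name COMPARE is invented, the character codes are assumed to lie in 0 to 255, and the loop that fills UC assumes that the lower case letters occupy consecutive codes, as in ASCII (they do not in EBCDIC). A block IF stands in for the arithmetic-IF, and a string that runs out of characters first is ranked as the lesser, which is a choice made for this example and differs from the space-padding of the built-in operators.

<syntaxhighlight lang="fortran">
program caseless
  implicit none
  integer :: uc(0:255), i

  ! UC(i) = i, except that lower case codes map to upper case codes.
  do i = 0, 255
    uc(i) = i
  end do
  do i = ichar("a"), ichar("z")      ! assumes contiguous letters (ASCII)
    uc(i) = i - ichar("a") + ichar("A")
  end do

  print *, compare("Hello", "HELLO")    !  0
  print *, compare("apple", "Banana")   ! -1
  print *, compare("pear", "Pea")       !  1

contains

  ! Hypothetical three-way comparison: -1, 0 or +1.
  integer function compare(text1, text2)
    character(len=*), intent(in) :: text1, text2
    integer :: l, d

    compare = 0
    ! Step along while both strings have a character L available.
    do l = 1, min(len(text1), len(text2))
      d = uc(ichar(text1(l:l))) - uc(ichar(text2(l:l)))
      if (d /= 0) then
        compare = sign(1, d)
        return
      end if
    end do
    ! Common part equal: the string with characters left over is greater.
    d = len(text1) - len(text2)
    if (d /= 0) compare = sign(1, d)
  end function compare

end program caseless
</syntaxhighlight>

Because UC is filled once and then only read, the cost per character is two array accesses and a subtraction, with no copying of either string.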
 
=={{header|F_Sharp|F#}}==