String matching: Difference between revisions

(→‎{{header|Forth}}: Add Fortran.)
Line 1,118:
tuck 2>r negate over + 0 max /string 2r> compare 0= ;
\ use SEARCH ( a l a2 l2 -- a3 l3 ? ) for contains</lang>
 
=={{header|Fortran}}==
Fortran does not offer a string type, but since F77 it has been possible to use a CHARACTER variable of some specified size, whose size may be obtained via the LEN function. When such a variable is passed as a parameter, a secret additional parameter specifies its size, and so string-like usage is possible. Character comparison is case sensitive, and the shorter operand is treated as if padded with trailing spaces, so that "xx" and "xx " are deemed equal. The function INDEX(text,target) returns the first index in ''text'' where ''target'' matches, or zero if there is no such match. Unfortunately, the F77 function does not allow a starting position to be specified, as is needed to find any second and further matches. One must instead write something like <code>INDEX(text(5:),target)</code> to start at position five, and then add the offset (here, four) to a non-zero result to obtain the position within ''text''. Fortran 90 added an optional BACK argument to search backwards from the end, but a starting-position argument is offered by some compilers only as an extension and is not guaranteed.
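
The offset bookkeeping for repeated searches can be sketched as follows. This is only an illustration and not part of the task solution below: the name FINDALL and its variables are invented here, and F90 features (DO WHILE, EXIT) are assumed. Each search is made on the tail <code>TEXT(P:)</code>, and the result is converted back to a position within the whole of TEXT.
<lang Fortran>
      SUBROUTINE FINDALL(TEXT,TARGET) !Hypothetical: lists every match.
       CHARACTER*(*) TEXT,TARGET
       INTEGER P,L   !P = start of the part still to be searched.
        P = 1
        DO WHILE (P + LEN(TARGET) - 1 .LE. LEN(TEXT)) !Room remains?
          L = INDEX(TEXT(P:),TARGET)   !Offset is relative to P.
          IF (L.LE.0) EXIT             !No further match.
          WRITE (6,*) ">",TARGET,"< found at",P + L - 1
          P = P + L          !Resume one past the start of this match.
        END DO
      END SUBROUTINE FINDALL

      CALL FINDALL("Bananas","an")
      CALL FINDALL("aaaa","aa")
      END
</lang>
For "Bananas" and "an" this should report positions 2 and 4. Because the search resumes one character past the start of each match rather than past its end, overlapping matches are reported too, as with "aaaa" and "aa" (positions 1, 2 and 3). Use <code>P = P + L + LEN(TARGET) - 1</code> instead if overlaps are not wanted.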
 
A second problem is that Fortran does not promise short-circuit evaluation of a logical expression such as <code>L.LT.0 .OR. ''etc.''</code>: depending on the compiler and the construction, the ''etc.'' may be evaluated even though L < 0 is ''true'' and the result is thereby already determined. In this case, evaluating the ''etc.'' would cause trouble because the substring indexing would be out of bounds. To be safe, therefore, a rather lame two-stage test is required, though optimising compilers might well shift code around anyway.
 
In the case of STARTS, these annoyances can be left to the INDEX function rather than comparing the start of A against B, at the cost of its searching the whole of A. Otherwise, STARTS would be the mirror of ENDS.
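
For comparison, a mirror-of-ENDS version might look like the following sketch. The name STARTS2 and the "too short" message are invented for illustration. It uses the same two-stage test, so the substring <code>A(1:LEN(B))</code> is only formed once B is known to be no longer than A, and nothing beyond the head of A is inspected.
<lang Fortran>
      SUBROUTINE STARTS2(A,B) !Hypothetical: A starts with B? No search.
       CHARACTER*(*) A,B
        IF (LEN(B).GT.LEN(A)) THEN   !Tested first, on its own.
          WRITE (6,*) ">",A,"< is too short to start with >",B,"<"
        ELSE IF (A(1:LEN(B)).NE.B) THEN !Now safe to take the head of A.
          WRITE (6,*) ">",A,"< does not start with >",B,"<"
        ELSE
          WRITE (6,*) ">",A,"< starts with >",B,"<"
        END IF
      END SUBROUTINE STARTS2

      CALL STARTS2("This","is")
      CALL STARTS2("Theory","The")
      CALL STARTS2("Brief","Much longer")
      END
</lang>
The trade-off is a little more code in exchange for a comparison of at most LEN(B) characters, where INDEX may scan all of A before reporting that the first match is not at position one.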
 
<lang Fortran>
SUBROUTINE STARTS(A,B) !Text A starts with text B?
CHARACTER*(*) A,B
IF (INDEX(A,B).EQ.1) THEN !Searches A to find B.
WRITE (6,*) ">",A,"< starts with >",B,"<"
ELSE
WRITE (6,*) ">",A,"< does not start with >",B,"<"
END IF
END SUBROUTINE STARTS
 
SUBROUTINE HAS(A,B) !Text B appears somewhere in text A?
CHARACTER*(*) A,B
INTEGER L
L = INDEX(A,B) !The first position in A where B matches.
IF (L.LE.0) THEN
WRITE (6,*) ">",A,"< does not contain >",B,"<"
ELSE
WRITE (6,*) ">",A,"< contains a >",B,"<, offset",L
END IF
END SUBROUTINE HAS
 
SUBROUTINE ENDS(A,B) !Text A ends with text B.
CHARACTER*(*) A,B
INTEGER L
L = LEN(A) - LEN(B) !Find the tail end of A that B might match.
IF (L.LT.0) THEN !Dare not use an OR, because of full evaluation risks.
WRITE (6,*) ">",A,"< is too short to end with >",B,"<" !Might as well have a special message.
ELSE IF (A(L + 1:L + LEN(B)).NE.B) THEN !Otherwise, it is safe to look.
WRITE (6,*) ">",A,"< does not end with >",B,"<"
ELSE
WRITE (6,*) ">",A,"< ends with >",B,"<"
END IF
END SUBROUTINE ENDS
 
CALL STARTS("This","is")
CALL STARTS("Theory","The")
CALL HAS("Bananas","an")
CALL ENDS("Banana","an")
CALL ENDS("Banana","na")
CALL ENDS("Brief","Much longer")
END
</lang>
Output: text strings are bounded by >''etc.''< in case of leading or trailing spaces.
<pre>
>This< does not start with >is<
>Theory< starts with >The<
>Bananas< contains a >an<, offset 2
>Banana< does not end with >an<
>Banana< ends with >na<
>Brief< is too short to end with >Much longer<
</pre>
 
=={{header|GML}}==