String matching

From Rosetta Code
You are encouraged to solve this task according to the task description, using any language you may know.

Given two strings, demonstrate the following three types of matching:

  1. Determining if the first string starts with the second string
  2. Determining if the first string contains the second string at any location
  3. Determining if the first string ends with the second string

Optional requirements:

  1. Print the location of the match for part 2.
  2. Handle multiple occurrences of the second string for part 2.

Ada

<lang Ada>with Ada.Strings.Fixed;  use Ada.Strings.Fixed;
with Ada.Text_IO;        use Ada.Text_IO;

procedure Match_Strings is
   S1 : constant String := "abcd";
   S2 : constant String := "abab";
   S3 : constant String := "ab";
begin
   -- Prefix test: compare the leading slice of S1 with S3
   if S1'Length >= S3'Length and then S1 (S1'First..S1'First + S3'Length - 1) = S3 then
      Put_Line ("'" & S1 & "' starts with '" & S3 & "'");
   end if;
   -- Suffix test: compare the trailing slice of S2 with S3
   if S2'Length >= S3'Length and then S2 (S2'Last - S3'Length + 1..S2'Last) = S3 then
      Put_Line ("'" & S2 & "' ends with '" & S3 & "'");
   end if;
   Put_Line ("'" & S3 & "' first appears in '" & S1 & "' at" & Integer'Image (Index (S1, S3)));
   Put_Line
   (  "'" & S3 & "' appears in '" & S2 & "'" &
      Integer'Image (Ada.Strings.Fixed.Count (S2, S3)) & " times"
   );
end Match_Strings;</lang>
Sample output:

'abcd' starts with 'ab'
'abab' ends with 'ab'
'ab' first appears in 'abcd' at 1
'ab' appears in 'abab' 2 times

ALGOL 68

Translation of: python
Works with: ALGOL 68 version Revision 1 - no extensions to language used
Works with: ALGOL 68G version Any - tested with release 1.18.0-9h.tiny

<lang algol68># define some appropriate OPerators #
PRIO STARTSWITH = 5, ENDSWITH = 5;
OP STARTSWITH = (STRING str, prefix)BOOL: # assuming LWB = 1 #
  IF UPB str < UPB prefix THEN FALSE ELSE str[:UPB prefix]=prefix FI;
OP ENDSWITH = (STRING str, suffix)BOOL: # assuming LWB = 1 #
  IF UPB str < UPB suffix THEN FALSE ELSE str[UPB str-UPB suffix+1:]=suffix FI;

INT loc, loc2;

print((
  "abcd" STARTSWITH "ab", # returns TRUE #
  "abcd" ENDSWITH "zn", # returns FALSE #
  string in string("bb",loc,"abab"), # returns FALSE #
  string in string("ab",loc,"abab"), # returns TRUE #
  (string in string("bb",loc,"abab")|loc|-1), # returns -1 #
  (string in string("ab",loc,"abab")|loc|-1), # returns +1 #
  (string in string("ab",loc2,"abab"[loc+1:])|loc+loc2|-1) # returns +3 #
))</lang>
Output:

TFFT         -1         +1         +3

C#

Works with: Mono version 2.6

<lang csharp>class Program
{
    public static void Main (string[] args)
    {
        var value = "abcd".StartsWith("ab"); //returns true
        value = "abcd".EndsWith("zn"); //returns false
        value = "abab".Contains("bb"); //returns false
        value = "abab".Contains("ab"); //returns true
        int loc = "abab".IndexOf("bb"); //returns -1
        loc = "abab".IndexOf("ab"); //returns 0
        loc = "abab".IndexOf("ab",loc+1); //returns 2
    }
}</lang>

C++

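<lang cpp>#include <string>
using namespace std;

int main()
{
    string s1="abcd";
    string s2="abab";
    string s3="ab";
    //Beginning
    bool starts = s1.compare(0,s3.size(),s3)==0; //true
    //End (the size check avoids an out-of-range position when s3 is longer than s1)
    bool ends = s1.size()>=s3.size() && s1.compare(s1.size()-s3.size(),s3.size(),s3)==0; //false
    //Anywhere
    string::size_type loc=s1.find(s2); //returns string::npos
    loc=s2.find(s3); //returns 0
    loc=s2.find(s3,loc+1); //returns 2
    return 0;
}</lang>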

Forth

<lang forth>: starts-with ( a l a2 l2 -- ? )
  tuck 2>r min 2r> compare 0= ;
: ends-with ( a l a2 l2 -- ? )
  tuck 2>r negate over + 0 max /string 2r> compare 0= ;
\ use SEARCH ( a l a2 l2 -- a3 l3 ? ) for contains</lang>

J

<lang j>startswith=: ] -: ({.~ #)
contains=: +./@:E.~
endswith=: ] -: ({.~ -@#)</lang>

Example use:

<lang j>   'abcd' startswith 'ab'
1
   'abcd' startswith 'cd'
0
   'abcd' endswith 'ab'
0
   'abcd' endswith 'cd'
1
   'abcd' contains 'bb'
0
   'abcd' contains 'ab'
1
   'abcd' contains 'bc'
1
   'abab' contains 'ab'
1
   'abab' I.@E.~ 'ab'       NB. find starting indices
0 2</lang>

Note that these verbs also apply to arrays of types other than character, so:
<lang j>   0 1 2 3 startswith 0 1               NB. integer
1
   4.2 5.1 1.3 9 3 contains 1.3 4.2     NB. floating point
0
   4.2 5.1 1.3 4.2 9 3 contains 1.3 4.2
1</lang>

Java

<lang java>"abcd".startsWith("ab"); //returns true
"abcd".endsWith("zn"); //returns false
"abab".contains("bb"); //returns false
"abab".contains("ab"); //returns true
int loc = "abab".indexOf("bb"); //returns -1
loc = "abab".indexOf("ab"); //returns 0
loc = "abab".indexOf("ab",loc+1); //returns 2</lang>

Logo

<lang logo>to starts.with? :sub :thing
  if empty? :sub [output "true]
  if empty? :thing [output "false]
  if not equal? first :sub first :thing [output "false]
  output starts.with? butfirst :sub butfirst :thing
end

to ends.with? :sub :thing
  if empty? :sub [output "true]
  if empty? :thing [output "false]
  if not equal? last :sub last :thing [output "false]
  output ends.with? butlast :sub butlast :thing
end

show starts.with? "dog "doghouse  ; true
show ends.with? "house "doghouse  ; true
show substring? "gho "doghouse    ; true (built-in)</lang>

Objective-C

<lang objc>[@"abcd" hasPrefix:@"ab"]; //returns YES
[@"abcd" hasSuffix:@"zn"]; //returns NO
NSUInteger loc = [@"abab" rangeOfString:@"bb"].location; //returns NSNotFound
loc = [@"abab" rangeOfString:@"ab"].location; //returns 0
loc = [@"abab" rangeOfString:@"ab" options:0 range:NSMakeRange(loc+1, [@"abab" length]-(loc+1))].location; //returns 2</lang>

Perl

<lang perl># the first four examples use regular expressions, so make sure to escape
# any special regex characters in the substring
"abcd" =~ /^ab/; #returns true
"abcd" =~ /zn$/; #returns false
"abab" =~ /bb/; #returns false
"abab" =~ /ab/; #returns true
my $loc = index("abab", "bb"); #returns -1
$loc = index("abab", "ab"); #returns 0
$loc = index("abab", "ab", $loc+1); #returns 2</lang>

PicoLisp

<lang PicoLisp>: (pre? "ab" "abcd")
-> "abcd"
: (pre? "xy" "abcd")
-> NIL

: (sub? "bc" "abcd")
-> "abcd"
: (sub? "xy" "abcd")
-> NIL

: (tail (chop "cd") (chop "abcd"))
-> ("c" "d")
: (tail (chop "xy") (chop "abcd"))
-> NIL

(de positions (Pat Str)
   (setq Pat (chop Pat))
   (make
      (for ((I . L) (chop Str) L (cdr L))
         (and (head Pat L) (link I)) ) ) )

: (positions "bc" "abcdabcd")
-> (2 6)</lang>

PureBasic

<lang PureBasic>Procedure StartsWith(String1$, String2$)
  Protected Result
  If FindString(String1$, String2$, 1) =1 ; E.g. found at position 1
    Result =CountString(String1$, String2$)
  EndIf
  ProcedureReturn Result
EndProcedure

Procedure EndsWith(String1$, String2$)
  Protected Result, dl=Len(String1$)-Len(String2$)
  If dl>=0 And Right(String1$, Len(String2$))=String2$
    Result =CountString(String1$, String2$)
  EndIf
  ProcedureReturn Result
EndProcedure</lang>
And a verification:
<lang PureBasic>Debug StartsWith("Rosettacode", "Rosetta")           ; = 1
Debug StartsWith("Rosettacode", "code")              ; = 0
Debug StartsWith("eleutherodactylus cruralis", "e")  ; = 3
Debug EndsWith  ("Rosettacode", "Rosetta")           ; = 0
Debug EndsWith  ("Rosettacode", "code")              ; = 1
Debug EndsWith  ("Rosettacode", "e")                 ; = 2</lang>

Python

<lang python>"abcd".startswith("ab") #returns True
"abcd".endswith("zn") #returns False
"bb" in "abab" #returns False
"ab" in "abab" #returns True
loc = "abab".find("bb") #returns -1
loc = "abab".find("ab") #returns 0
loc = "abab".find("ab",loc+1) #returns 2</lang>
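
The optional requirements (reporting the match location and handling multiple occurrences) could be sketched as follows. The helper name find_all is hypothetical, not a standard library function; it simply calls str.find repeatedly, restarting one character past each hit so that overlapping occurrences are reported as well.

```python
def find_all(haystack, needle):
    """Return the start index of every occurrence of needle in haystack.

    find_all is an illustrative helper name, not a built-in.
    """
    positions = []
    loc = haystack.find(needle)
    while loc != -1:
        positions.append(loc)
        # Resume one character past the last hit so overlapping matches are kept.
        loc = haystack.find(needle, loc + 1)
    return positions


for haystack, needle in [("abcd", "ab"), ("abab", "ab"), ("abab", "bb")]:
    print(haystack, needle,
          haystack.startswith(needle),
          haystack.endswith(needle),
          find_all(haystack, needle))
```

An empty list means the second string does not occur at all, which avoids having to test for the -1 sentinel at each call site.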

Ruby

Works with: Ruby version 1.8.7

<lang ruby>'abcd'.start_with?('ab') #returns true
'abcd'.end_with?('zn') #returns false
'abab'.include?('bb') #returns false
'abab'.include?('ab') #returns true
'abab'.index('bb') #returns nil
'abab'.index('ab') #returns 0
'abab'.index('ab', 1) #returns 2</lang>

Retro

<lang Retro>with strings'
: startsWith? ( $1 $2 - f )
  dup getLength [ swap ] dip 0 swap getSubset compare ;

"abcdefghijkl" "abcde" startsWith?
"abcdefghijkl" "bcd" startsWith?

"abcdefghijkl" "bcd" search
"abcdefghijkl" "zmq" search

: endsWith? ( $1 $2 - f )
  swap dup getLength + over getLength - compare ;

"abcdefghijkl" "ijkl" endsWith?
"abcdefghijkl" "abc" endsWith?</lang>

Scala

<lang scala>"abcd".startsWith("ab") //returns true
"abcd".endsWith("zn") //returns false
"abab".contains("bb") //returns false
"abab".contains("ab") //returns true

var loc="abab".indexOf("bb") //returns -1
loc = "abab".indexOf("ab") //returns 0
loc = "abab".indexOf("ab", loc+1) //returns 2</lang>

Tcl

The following code looks, in various ways, for the string held in the variable needle within the string held in the variable haystack.
<lang tcl>set isPrefix    [string equal -length [string length $needle] $haystack $needle]
set isContained [expr {[string first $needle $haystack] >= 0}]
set isSuffix    [string equal $needle [string range $haystack end-[expr {[string length $needle]-1}] end]]</lang>

Of course, when the needle is a glob-safe string (i.e., it contains none of the characters “*?[\”), this can be written far more conveniently:
<lang tcl>set isPrefix    [string match $needle* $haystack]
set isContained [string match *$needle* $haystack]
set isSuffix    [string match *$needle $haystack]</lang>

Another powerful technique is to use the regular expression engine in literal string mode:
<lang tcl>set isContained [regexp ***=$needle $haystack]</lang>
This can be extended by getting the regexp to return the locations of the matches, enabling the other forms of match to be done:
<lang tcl>set matchLocations [regexp -indices -all -inline ***=$needle $haystack]
# Each match location is a pair, being the index into the string where the needle started
# to match and the index where the needle finished matching

set isContained [expr {[llength $matchLocations] > 0}]
set isPrefix [expr {[lindex $matchLocations 0 0] == 0}]
set isSuffix [expr {[lindex $matchLocations end 1] == [string length $haystack]-1}]
set firstMatchStart [lindex $matchLocations 0 0]
puts "Found \"$needle\" in \"$haystack\" at $firstMatchStart"
foreach location $matchLocations {
    puts "needle matched at index [lindex $location 0]"
}</lang>