Regular expressions

The goal of this task is

to match a string against a regular expression
to substitute part of a string using a regular expression

Ada

There is no Regular Expression library in the Ada Standard, so I am using one of the libraries provided by gnat/gcc. <lang ada>with Ada.Text_IO; with Gnat.Regpat; use Ada.Text_IO;

procedure Regex is

  package Pat renames Gnat.Regpat;

  procedure Search_For_Pattern(Compiled_Expression: Pat.Pattern_Matcher;
                               Search_In: String;
                               First, Last: out Positive;
                               Found: out Boolean) is
     Result: Pat.Match_Array (0 .. 1);
  begin
     Pat.Match(Compiled_Expression, Search_In, Result);
     Found := not Pat."="(Result(1), Pat.No_Match);
     if Found then
        First := Result(1).First;
        Last := Result(1).Last;
     end if;
  end Search_For_Pattern;

  Word_Pattern: constant String := "([a-zA-Z]+)";

  Str:           String:= "I love PATTERN matching!";
  Current_First: Positive := Str'First;
  First, Last:   Positive;
  Found:         Boolean;

begin

  -- first, find all the words in Str
  loop
     Search_For_Pattern(Pat.Compile(Word_Pattern),
                        Str(Current_First .. Str'Last),
                        First, Last, Found);
  exit when not Found;
     Put_Line("<" & Str(First .. Last) & ">");
     Current_First := Last+1;
  end loop;

  -- second, replace "PATTERN" in Str by "pattern"
  Search_For_Pattern(Pat.Compile("(PATTERN)"), Str, First, Last, Found);
  Str := Str(Str'First .. First-1) & "pattern" & Str(Last+1 .. Str'Last);
  Put_Line(Str);

end Regex;</lang>

Output:

<I>
<love>
<PATTERN>
<matching>
I love pattern matching!

ALGOL 68

The routines grep in strings and sub in string are not part of ALGOL 68's standard prelude.

Works with: ALGOL 68G version Any - tested with release mk15-0.8b.fc9.i386

<lang algol68>INT match=0, no match=1, out of memory error=2, other error=3;

STRING str := "i am a string";

# Match: #

STRING m := "string$";
INT start, end;
IF grep in string(m, str, start, end) = match THEN printf(($"Ends with """g""""l$, str[start:end])) FI;

# Replace: #

IF sub in string(" a ", " another ",str) = match THEN printf(($gl$, str)) FI;</lang> Output:

Ends with "string"
i am another string

Standard ALGOL 68 does have a primordial form of pattern matching called a format. It is designed to extract values from input data, but it can also be used for outputting (and transputting) the original data.

Works with: ALGOL 68 version Standard - But declaring book as flex[]flex[]string

Works with: ALGOL 68G version Any - tested with release mk15-0.8b.fc9.i386

For example:<lang algol68>FORMAT pattern = $ddd" "c("cats","dogs")$; FILE file; STRING book; associate(file, book); on value error(file, (REF FILE f)BOOL: stop); on format error(file, (REF FILE f)BOOL: stop);

book := "100 dogs"; STRUCT(INT count, type) dalmatians;

getf(file, (pattern, dalmatians)); print(("Dalmatians: ", dalmatians, new line)); count OF dalmatians +:=1; printf(($"Gives: "$, pattern, dalmatians, $l$))</lang> Output:

Dalmatians:        +100         +2
Gives 101 dogs

Argile

<lang Argile>use std, regex

(: matching :)
if "some matchable string" =~ /^some" "+[a-z]*" "+string$/
  echo string matches
else
  echo string "doesn't" match

(: replacing :)
let t = strdup "some allocated string"
t =~ s/a/"4"/g
t =~ s/e/"3"/g
t =~ s/i/"1"/g
t =~ s/o/"0"/g
t =~ s/s/$/g
print t
free t

(: flushing regex allocations :)
uninit regex

check mem leak; use dbg (:optional:)</lang>

(note that it needs to be compiled with argrt library)

Output:

string matches
$0m3 4ll0c4t3d $tr1ng

AutoHotkey

<lang AutoHotkey>MsgBox % foundpos := RegExMatch("Hello World", "World$")
MsgBox % replaced := RegExReplace("Hello World", "World$", "yourself")</lang>

AWK

AWK supports regular expressions, which are typically delimited by slashes, together with the "~" match operator: <lang awk>$ awk '{if($0~/[A-Z]/)print "uppercase detected"}'
abc
ABC
uppercase detected</lang>
As shorthand, a regular expression in the condition part fires if it matches an input line: <lang awk>$ awk '/[A-Z]/{print "uppercase detected"}'
def
DeF
uppercase detected</lang>
For substitution, the first argument can be a regular expression, while the replacement string is constant (except that '&' in it stands for the matched text): <lang awk>$ awk '{gsub(/[A-Z]/,"*");print}'
abCDefG
ab**ef*
$ awk '{gsub(/[A-Z]/,"(&)");print}'
abCDefGH
ab(C)(D)ef(G)(H)</lang>
This variant matches one or more uppercase letters in one round: <lang awk>$ awk '{gsub(/[A-Z]+/,"(&)");print}'
abCDefGH
ab(CD)ef(GH)</lang>
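The role of '&' can be cross-checked in another language. The following Python sketch is not part of the AWK solution, and the helper name `wrap_matches` is made up for illustration. It uses `\g<0>`, Python's counterpart of '&', to rebuild the two parenthesised gsub results.

```python
import re

def wrap_matches(pattern, text):
    """Wrap every match of pattern in parentheses, like gsub(/pattern/, "(&)")."""
    # \g<0> stands for the whole match, as '&' does in AWK's replacement string
    return re.sub(pattern, r"(\g<0>)", text)

print(wrap_matches(r"[A-Z]", "abCDefGH"))   # one letter per match
print(wrap_matches(r"[A-Z]+", "abCDefGH"))  # one run of letters per match
```

As in AWK, the only difference between the two calls is the `+` quantifier, which decides whether adjacent capitals are wrapped separately or together.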

BBC BASIC

Works with: BBC BASIC for Windows

Uses the gnu_regex library. <lang bbcbasic> SYS "LoadLibrary", "gnu_regex.dll" TO gnu_regex%

     IF gnu_regex% = 0 ERROR 100, "Cannot load gnu_regex.dll"
     SYS "GetProcAddress", gnu_regex%, "regcomp" TO regcomp
     SYS "GetProcAddress", gnu_regex%, "regexec" TO regexec
     
     DIM regmatch{start%, finish%}, buffer% 256
     
     REM Find all 'words' in a string:
     teststr$ = "I love PATTERN matching!"
     pattern$ = "([a-zA-Z]+)"
     
     SYS regcomp, buffer%, pattern$, 1 TO result%
     IF result% ERROR 101, "Failed to compile regular expression"
     
     first% = 1
     REPEAT
       SYS regexec, buffer%, MID$(teststr$, first%), 1, regmatch{}, 0 TO result%
       IF result% = 0 THEN
         s% = regmatch.start%
         f% = regmatch.finish%
         PRINT "<" MID$(teststr$, first%+s%, f%-s%) ">"
         first% += f%
       ENDIF
     UNTIL result%
     
     REM Replace 'PATTERN' with 'pattern':
     teststr$ = "I love PATTERN matching!"
     pattern$ = "(PATTERN)"
     
     SYS regcomp, buffer%, pattern$, 1 TO result%
     IF result% ERROR 101, "Failed to compile regular expression"
     SYS regexec, buffer%, teststr$, 1, regmatch{}, 0 TO result%
     IF result% = 0 THEN
       s% = regmatch.start%
       f% = regmatch.finish%
       MID$(teststr$, s%+1, f%-s%) = "pattern"
       PRINT teststr$
     ENDIF

     SYS "FreeLibrary", gnu_regex%</lang>

Output:

<I>
<love>
<PATTERN>
<matching>
I love pattern matching!

Bracmat

Pattern matching in Bracmat is inspired by pattern matching in Snobol. It also is quite different from regular expressions:

Patterns in Bracmat are not greedy
It is not possible to replace substrings, because values can never be changed
Patterns always must match all of the subject
Strings as well as complex data can be subjected to pattern matching

List all rational numbers smaller than 7 hidden in the string "fgsakg789/35768685432fkgha" <lang bracmat>@("fesylk789/35768poq2art":? (#<7:?n & out$!n & ~) ?)</lang> Output:

After the last number, the match expression fails.
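For readers more used to regular expressions, the first difference (greediness) can be illustrated outside Bracmat. The Python sketch below is only an analogy, assuming Python's `re` module as a stand-in for a typical regex engine; it does not reproduce Bracmat's matching rules.

```python
import re

subject = "aXbXc"

# Typical regex quantifiers are greedy: .* takes as much as it can
# while still allowing the rest of the pattern to match.
greedy = re.match(r"(.*)X", subject).group(1)

# A lazy quantifier stops at the first opportunity, which is closer
# to the non-greedy behaviour described for Bracmat patterns.
lazy = re.match(r"(.*?)X", subject).group(1)

print(greedy)  # text before the last X
print(lazy)    # text before the first X
```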

C

Works with: POSIX

As far as I can see, POSIX defines functions for regex matching, but nothing for substitution, so the substitution has to be done by hand. The complex-looking code could be turned into a function.

<lang c>#include <stdio.h>

#include <stdlib.h>
#include <sys/types.h>
#include <regex.h>
#include <string.h>

int main() {

  regex_t preg;
  regmatch_t substmatch[1];
  const char *tp = "string$";
  const char *t1 = "this is a matching string";
  const char *t2 = "this is not a matching string!";
  const char *ss = "istyfied";
  
  regcomp(&preg, "string$", REG_EXTENDED);
  printf("'%s' %smatched with '%s'\n", t1,
                                       (regexec(&preg, t1, 0, NULL, 0)==0) ? "" : "did not ", tp);
  printf("'%s' %smatched with '%s'\n", t2,
                                       (regexec(&preg, t2, 0, NULL, 0)==0) ? "" : "did not ", tp);
  regfree(&preg);
  /* change "a[a-z]+" into "istyfied" */
  regcomp(&preg, "a[a-z]+", REG_EXTENDED);
  if ( regexec(&preg, t1, 1, substmatch, 0) == 0 )
  {
     //fprintf(stderr, "%d, %d\n", substmatch[0].rm_so, substmatch[0].rm_eo);
     char *ns = malloc(substmatch[0].rm_so + 1 + strlen(ss) +
                       (strlen(t1) - substmatch[0].rm_eo) + 2);
     memcpy(ns, t1, substmatch[0].rm_so+1);
     memcpy(&ns[substmatch[0].rm_so], ss, strlen(ss));
     memcpy(&ns[substmatch[0].rm_so+strlen(ss)], &t1[substmatch[0].rm_eo],
               strlen(&t1[substmatch[0].rm_eo]));
     ns[ substmatch[0].rm_so + strlen(ss) +
         strlen(&t1[substmatch[0].rm_eo]) ] = 0;
     printf("mod string: '%s'\n", ns);
     free(ns); 
  } else {
     printf("the string '%s' is the same: no matching!\n", t1);
  }
  regfree(&preg);
  
  return 0;

}</lang>
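The manual substitution above boils down to splicing the replacement between the text before the match and the text after it. As a compact illustration of that idea (in Python rather than C, with a hypothetical helper name `replace_first`), the same steps can be sketched as:

```python
import re

def replace_first(pattern, replacement, text):
    """Replace the first match by splicing around its offsets (illustrative helper)."""
    m = re.search(pattern, text)
    if m is None:
        return text  # no match: the string is returned unchanged
    # m.start() and m.end() play the role of rm_so and rm_eo in the C code
    return text[:m.start()] + replacement + text[m.end():]

print(replace_first(r"a[a-z]+", "istyfied", "this is a matching string"))
```

This is only a sketch of the technique; the C version additionally has to size and allocate the result buffer itself.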

C++

Works with: g++ version 4.0.2

Library: Boost

<lang cpp>#include <iostream>

#include <string>
#include <iterator>
#include <boost/regex.hpp>

int main() {

 boost::regex re(".* string$");
 std::string s = "Hi, I am a string";

 // match the complete string
 if (boost::regex_match(s, re))
   std::cout << "The string matches.\n";
 else
   std::cout << "Oops - not found?\n";

 // match a substring
 boost::regex re2(" a.*a");
 boost::smatch match;
 if (boost::regex_search(s, match, re2))
 {
   std::cout << "Matched " << match.length()
             << " characters starting at " << match.position() << ".\n";
   std::cout << "Matched character sequence: \""
             << match.str() << "\"\n";
 }
 else
 {
   std::cout << "Oops - not found?\n";
 }

 // replace a substring
 std::string dest_string;
 boost::regex_replace(std::back_inserter(dest_string),
                      s.begin(), s.end(),
                      re2,
                      "'m now a changed");
 std::cout << dest_string << std::endl;

}</lang>

C#

<lang csharp>using System; using System.Text.RegularExpressions;

class Program {

   static void Main(string[] args) {
       string str = "I am a string";

       if (new Regex("string$").IsMatch(str)) {
           Console.WriteLine("Ends with string.");
       }

       str = new Regex(" a ").Replace(str, " another ");
       Console.WriteLine(str);
   }

}</lang>

Clojure

<lang clojure>(let [s "I am a string"]

 ;; match
 (when (re-find #"string$" s)
   (println "Ends with 'string'."))
 (when-not (re-find #"^You" s)
   (println "Does not start with 'You'."))

 ;; substitute
 (println (clojure.string/replace s " a " " another "))

)</lang>

Common Lisp

Translation of: Perl

Uses CL-PPCRE - Portable Perl-compatible regular expressions for Common Lisp.

<lang lisp>(let ((string "I am a string"))

 (when (cl-ppcre:scan "string$" string)
   (write-line "Ends with string"))
 (unless (cl-ppcre:scan "^You" string )
   (write-line "Does not start with 'You'")))</lang>

Substitute

<lang lisp>(let* ((string "I am a string")

      (string (cl-ppcre:regex-replace " a " string " another ")))
 (write-line string))</lang>

Test and Substitute

<lang lisp>(let ((string "I am a string"))

 (multiple-value-bind (string matchp)
     (cl-ppcre:regex-replace "\\bam\\b" string "was")
   (when matchp
     (write-line "I was able to find and replace 'am' with 'was'."))))</lang>

CLISP regexp engine

Works with: CLISP

Clisp comes with built-in regexp matcher. On a Clisp prompt: <lang lisp>[1]> (regexp:match "fox" "quick fox jumps")

#S(REGEXP:MATCH :START 6 :END 9)</lang>

To find all matches, loop with different :start keyword.
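The idea of restarting the search from a moving start position is not specific to CLISP. A minimal Python sketch of the same loop follows; the function name `all_matches` is an assumption made for this example, and Python's `pos` argument stands in for the :start keyword.

```python
import re

def all_matches(pattern, text):
    """Collect (start, end) of every match by restarting the search after each hit."""
    compiled = re.compile(pattern)
    spans, pos = [], 0
    while pos <= len(text):
        m = compiled.search(text, pos)  # pos plays the role of the :start keyword
        if m is None:
            break
        spans.append(m.span())
        # continue after the match; step by one on an empty match to avoid looping forever
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return spans

print(all_matches("fox", "quick fox jumps over the fox"))
```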

Replacing text can be done with the help of REGEXP:REGEXP-SPLIT function: <lang lisp>[2]> (defun regexp-replace (pat repl string)

 (reduce #'(lambda (x y) (string-concat x repl y))
         (regexp:regexp-split pat string)))

REGEXP-REPLACE
[3]> (regexp-replace "x\\b" "-X-" "quick foxx jumps")
"quick fox-X- jumps"</lang>

D

<lang d>import std.stdio, std.regex;

void main() {

   immutable string s = "I am a string";

   // Test:
   if (!match(s, r"string$").empty)
       writeln("Ends with 'string'.");

   // Substitute:
   replace(s, regex(" a "), " another ").writeln();

}</lang>

Output:

Ends with 'string'.
I am another string

In std.string there are string functions to perform the same operations more efficiently.

Erlang

<lang erlang>match() ->
	String = "This is a string",
	case re:run(String, "string$") of
		{match,_} -> io:format("Ends with 'string'~n");
		_ -> ok
	end.

substitute() ->
	String = "This is a string",
	NewString = re:replace(String, " a ", " another ", [{return, list}]),
	io:format("~s~n",[NewString]).</lang>

Forth

Library: Forth Foundation Library

Test/Match <lang forth>include ffl/rgx.fs

\ Create a regular expression variable 'exp' in the dictionary

rgx-create exp

\ Compile an expression

s" Hello (World)" exp rgx-compile [IF]

 .( Regular expression successful compiled.) cr

[THEN]

\ (Case sensitive) match a string with the expression

s" Hello World" exp rgx-cmatch? [IF]

 .( String matches with the expression.) cr

[ELSE]

 .( No match.) cr

[THEN]</lang>

Frink

Pattern matching: <lang frink> line = "My name is Inigo Montoya."

for [first, last] = line =~ %r/my name is (\w+) (\w+)/ig {

  println["First name is: $first"]
  println["Last name is: $last"]

} </lang>

Replacement: (Replaces in the variable line) <lang frink> line =~ %s/Frank/Frink/g </lang>

Go

<lang go>package main

import "fmt"
import "regexp"

func main() {

 str := "I am the original string"

 // Test
 matched, _ := regexp.MatchString(".*string$", str)
 if matched { fmt.Println("ends with 'string'") }

 // Substitute
 pattern := regexp.MustCompile("original")
 result := pattern.ReplaceAllString(str, "modified")
 fmt.Println(result)

}</lang>

Groovy

"Matching" Solution (it's complicated): <lang groovy>import java.util.regex.*;

def woodchuck = "How much wood would a woodchuck chuck if a woodchuck could chuck wood?"
def pepper = "Peter Piper picked a peck of pickled peppers"

println "=== Regular-expression String syntax (/string/) ==="
def woodRE = /[Ww]o\w+d/
def piperRE = /[Pp]\w+r/
assert woodRE instanceof String && piperRE instanceof String
assert (/[Ww]o\w+d/ == "[Ww]o\\w+d") && (/[Pp]\w+r/ == "[Pp]\\w+r")
println ([woodRE: woodRE, piperRE: piperRE])
println ()

println "=== Pattern (~) operator ==="
def woodPat = ~/[Ww]o\w+d/
def piperPat = ~piperRE
assert woodPat instanceof Pattern && piperPat instanceof Pattern

def woodList = woodchuck.split().grep(woodPat)
println ([exactTokenMatches: woodList])
println ([exactTokenMatches: pepper.split().grep(piperPat)])
println ()

println "=== Matcher (=~) operator ==="
def wwMatcher = (woodchuck =~ woodRE)
def ppMatcher = (pepper =~ /[Pp]\w+r/)
def wpMatcher = (woodchuck =~ /[Pp]\w+r/)
assert wwMatcher instanceof Matcher && ppMatcher instanceof Matcher
assert wwMatcher.toString() == woodPat.matcher(woodchuck).toString()
assert ppMatcher.toString() == piperPat.matcher(pepper).toString()
assert wpMatcher.toString() == piperPat.matcher(woodchuck).toString()

println ([ substringMatches: wwMatcher.collect { it }])
println ([ substringMatches: ppMatcher.collect { it }])
println ([ substringMatches: wpMatcher.collect { it }])
println ()

println "=== Exact Match (==~) operator ==="
def containsWoodRE = /.*/ + woodRE + /.*/
def containsPiperRE = /.*/ + piperRE + /.*/
def wwMatches = (woodchuck ==~ containsWoodRE)
assert wwMatches instanceof Boolean
def wwNotMatches = ! (woodchuck ==~ woodRE)
def ppMatches = (pepper ==~ containsPiperRE)
def pwNotMatches = ! (pepper ==~ containsWoodRE)
def wpNotMatches = ! (woodchuck ==~ containsPiperRE)
assert wwMatches && wwNotMatches && ppMatches && pwNotMatches && wpNotMatches

println ("'${woodchuck}' ${wwNotMatches ? 'does not' : 'does'} match '${woodRE}' exactly")
println ("'${woodchuck}' ${wwMatches ? 'does' : 'does not'} match '${containsWoodRE}' exactly")</lang>

Output:

=== Regular-expression String syntax (/string/) ===
[woodRE:[Ww]o\w+d, piperRE:[Pp]\w+r]

=== Pattern (~) operator ===
[exactTokenMatches:[wood, would]]
[exactTokenMatches:[Peter, Piper]]

=== Matcher (=~) operator ===
[substringMatches:[wood, would, wood, wood, wood]]
[substringMatches:[Peter, Piper, pepper]]
[substringMatches:[]]

=== Exact Match (==~) operator ===
'How much wood would a woodchuck chuck if a woodchuck could chuck wood?' does not match '[Ww]o\w+d' exactly
'How much wood would a woodchuck chuck if a woodchuck could chuck wood?' does match '.*[Ww]o\w+d.*' exactly

Replacement Solution (String.replaceAll()): <lang groovy>println woodchuck.replaceAll(/c\w+k/, "CHUCK")</lang>

Output:

How much wood would a woodCHUCK CHUCK if a woodCHUCK could CHUCK wood?

Reusable Replacement Solution (Matcher.replaceAll()): <lang groovy>def ck = (woodchuck =~ /c\w+k/) println (ck.replaceAll("CHUCK")) println (ck.replaceAll("wind")) println (ck.replaceAll("pile")) println (ck.replaceAll("craft")) println (ck.replaceAll("block")) println (ck.replaceAll("row")) println (ck.replaceAll("shed")) println (ck.replaceAll("man")) println (ck.replaceAll("work")) println (ck.replaceAll("pickle"))</lang>

Output:

How much wood would a woodCHUCK CHUCK if a woodCHUCK could CHUCK wood?
How much wood would a woodwind wind if a woodwind could wind wood?
How much wood would a woodpile pile if a woodpile could pile wood?
How much wood would a woodcraft craft if a woodcraft could craft wood?
How much wood would a woodblock block if a woodblock could block wood?
How much wood would a woodrow row if a woodrow could row wood?
How much wood would a woodshed shed if a woodshed could shed wood?
How much wood would a woodman man if a woodman could man wood?
How much wood would a woodwork work if a woodwork could work wood?
How much wood would a woodpickle pickle if a woodpickle could pickle wood?

Haskell

Test <lang haskell>import Text.Regex

str = "I am a string"

case matchRegex (mkRegex ".*string$") str of

 Just _  -> putStrLn $ "ends with 'string'"
 Nothing -> return ()</lang>

Substitute <lang haskell>import Text.Regex

orig = "I am the original string"
result = subRegex (mkRegex "original") orig "modified"
putStrLn $ result</lang>

HicEst

<lang hicest>CHARACTER string*100/ "The quick brown fox jumps over the lazy dog" /
REAL, PARAMETER :: Regex=128, Count=256

characters_a_m = INDEX(string, "[a-m]", Regex+Count) ! counts 16

vocals_changed = EDIT(Text=string, Option=Regex, Right="[aeiou]", RePLaceby='**', DO=LEN(string) ) ! changes 11
WRITE(ClipBoard) string ! Th** q****ck br**wn f**x j**mps **v**r th** l**zy d**g</lang>

Icon and Unicon

Regex includes procedures to provide access to regular expressions within native string scanning and matching expressions. 'ReFind' and 'ReMatch' respectively generate the sequence of beginning and ending positions matched by a regular expression. Additionally, there is a regular expression pattern compiler 'RePat' and other supporting functions and variables.

<lang Icon>procedure main()

s := "A simple string"
p := "string$"                  # regular expression

s ? write(image(s),if ReFind(p) then " matches " else " doesn't match ",image(p))

s[j := ReFind(p,s):ReMatch(p,s,j)] := "replacement"
write(image(s))
end

link regexp # link to IPL regexp </lang>

Library: Icon Programming Library

See regexp.

Sample output:

"A simple string" matches "string$"
"A simple replacement"

Inform 7

Inform's regex support is similar to Perl's but with some limitations: angle brackets are used instead of square brackets, there is no multiline mode, several control characters and character classes are omitted, and backtracking is slightly less powerful.

<lang inform7>let T be indexed text; let T be "A simple string"; if T matches the regular expression ".*string$", say "ends with string."; replace the regular expression "simple" in T with "replacement";</lang>

J

J's regex support is built on top of PCRE.

<lang j>load'regex'              NB.  Load regex library
str =: 'I am a string'   NB.  String used in examples.</lang>

Matching:

<lang j>   '.*string$' rxeq str     NB.  1 is true, 0 is false
1</lang>

Substitution:

<lang j>   ('am';'am still') rxrplc str
I am still a string</lang>

Note: use <lang J>   open'regex'</lang> to read the source code for the library. The comments list 6 main definitions and a dozen utility definitions.

Java

Works with: Java version 1.4+

Test

<lang java>String str = "I am a string"; if (str.matches(".*string")) { // note: matches() tests if the entire string is a match

 System.out.println("ends with 'string'");

}</lang>
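The whole-string behaviour of matches() is a common stumbling block. For comparison only, Python separates the two behaviours into different functions; the sketch below assumes `re.fullmatch` as the counterpart of String.matches() and `re.search` as the counterpart of a Matcher's find().

```python
import re

s = "I am a string"

# fullmatch, like String.matches(), requires the pattern to span the whole input
print(re.fullmatch(r"string", s) is not None)    # False
print(re.fullmatch(r".*string", s) is not None)  # True

# search looks for the pattern anywhere, so no leading ".*" is needed
print(re.search(r"string$", s) is not None)      # True
```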

To match part of a string, or to process matches: <lang java>import java.util.regex.*; Pattern p = Pattern.compile("a*b"); Matcher m = p.matcher(str); while (m.find()) {

 // use m.group() to extract matches

}</lang>

Substitute

<lang java>String orig = "I am the original string"; String result = orig.replaceAll("original", "modified"); // result is now "I am the modified string"</lang>

JavaScript

Test/Match <lang javascript>var subject = "Hello world!";

// Two different ways to create the RegExp object
// Both examples use the exact same pattern... matching "hello"
var re_PatternToMatch = /Hello (World)/i; // creates a RegExp literal with case-insensitivity
var re_PatternToMatch2 = new RegExp("Hello (World)", "i");

// Test for a match - return a bool
var isMatch = re_PatternToMatch.test(subject);

// Get the match details
// Returns an array with the match's details
// matches[0] == "Hello world"
// matches[1] == "world"
var matches = re_PatternToMatch2.exec(subject);</lang>

Substitute <lang javascript>var subject = "Hello world!";

// Perform a string replacement
// newSubject == "Replaced!"
var newSubject = subject.replace(re_PatternToMatch, "Replaced");</lang>

Lua

<lang lua>str1 = "This is a string!" str2 = "string"

print( str1:match( str2 ) ) erg = str1:gsub( "a", "another" ); print( erg )</lang>

M4

<lang M4>regexp(`GNUs not Unix', `\<[a-z]\w+')
regexp(`GNUs not Unix', `\<[a-z]\(\w+\)', `a \& b \1 c')</lang>

Output:

5
a not b ot c

Mathematica

<lang Mathematica>StringCases["I am a string with the number 18374 in me",RegularExpression["[0-9]+"]]
StringReplace["I am a string",RegularExpression["I\\sam"] -> "I'm"]</lang>
The in-notebook output, in order:

{18374}
I'm a string

MIRC Scripting Language

<lang mirc>alias regular_expressions {

 var %string = This is a string
 var %re = string$
 if ($regex(%string,%re) > 0) {
   echo -a Ends with string.
 }
 %re = \ba\b
 if ($regsub(%string,%re,another,%string) > 0) {
   echo -a Result 1: %string
 }
 %re = \b(another)\b
 echo -a Result 2: $regsubex(%string,%re,yet \1)

}</lang>

Output:

Ends with string.
Result 1: This is another string
Result 2: This is yet another string

MUMPS

MUMPS doesn't have a replacement functionality when using the pattern matching operator, ?. We can mimic it with $PIECE, but $PIECE doesn't work with regular expressions as an operand.

<lang MUMPS>REGEXP

NEW HI,W,PATTERN,BOOLEAN
SET HI="Hello, world!",W="world"
SET PATTERN=".E1"""_W_""".E"
SET BOOLEAN=HI?@PATTERN
WRITE "Source string - '"_HI_"'",!
WRITE "Partial string - '"_W_"'",!
WRITE "Pattern string created is - '"_PATTERN_"'",!
WRITE "Match? ",$SELECT(BOOLEAN:"YES",'BOOLEAN:"No"),!
;
SET BOOLEAN=$FIND(HI,W)
IF BOOLEAN>0 WRITE $PIECE(HI,W,1)_"string"_$PIECE(HI,W,2)
QUIT</lang>

Usage:

USER>D REGEXP^ROSETTA
Source string - 'Hello, world!'
Partial string - 'world'
Pattern string created is - '.E1"world".E'
Match? YES
Hello, string!

NetRexx

<lang NetRexx>/* NetRexx */
options replace format comments java crossref symbols nobinary

import java.util.regex.

st1 = 'Fee, fie, foe, fum, I smell the blood of an Englishman'
rx1 = 'f.e.*?'
sbx = 'foo'

rx1ef = '(?i)'rx1 -- use embedded flag expression == Pattern.CASE_INSENSITIVE

-- using String's matches & replaceAll
mcm = (String st1).matches(rx1ef)
say 'String "'st1'"' 'matches pattern "'rx1ef'":' Boolean(mcm)
say
say 'Replace all occurences of regex pattern "'rx1ef'" with "'sbx'"'
stx = Rexx
stx = (String st1).replaceAll(rx1ef, sbx)
say 'Input string:  "'st1'"'
say 'Result string: "'stx'"'
say

-- using java.util.regex classes
pt1 = Pattern.compile(rx1, Pattern.CASE_INSENSITIVE)
mc1 = pt1.matcher(st1)
mcm = mc1.matches()
say 'String "'st1'"' 'matches pattern "'pt1.toString()'":' Boolean(mcm)
mc1 = pt1.matcher(st1)
say
say 'Replace all occurences of regex pattern "'rx1'" with "'sbx'"'
sx1 = Rexx
sx1 = mc1.replaceAll(sbx)
say 'Input string:  "'st1'"'
say 'Result string: "'sx1'"'
say

return
</lang>
Output:

String "Fee, fie, foe, fum, I smell the blood of an Englishman" matches pattern "(?i)f.e.*?": true

Replace all occurences of regex pattern "(?i)f.e.*?" with "foo"
Input string:  "Fee, fie, foe, fum, I smell the blood of an Englishman"
Result string: "foo, foo, foo, fum, I smell the blood of an Englishman"

String "Fee, fie, foe, fum, I smell the blood of an Englishman" matches pattern "f.e.*?": true

Replace all occurences of regex pattern "f.e.*?" with "foo"
Input string:  "Fee, fie, foe, fum, I smell the blood of an Englishman"
Result string: "foo, foo, foo, fum, I smell the blood of an Englishman"

NewLISP

<lang NewLISP >(regex "[bB]+" "AbBBbABbBAAAA") -> ("bBBb" 1 4)</lang>

Objeck

<lang objeck> use RegEx;

bundle Default {

 class RegExTest {
   function : Main(args : String[]) ~ Nil {
     string := "I am a string";
     # exact match
     regex := RegEx->New(".*string");
     if(regex->MatchExact(string)) {
       "ends with 'string'"->PrintLine();
     };
     # replace all
     regex := RegEx->New(" a ");
     regex->ReplaceAll(string, " another ")->PrintLine();
   }
 }

} </lang>

Objective-C

Test

Works with: Mac OS X version 10.4+

Works with: iOS version 3.0+

<lang objc>NSString *str = @"I am a string";
NSString *regex = @".*string$";

// Note: the MATCHES operator matches the entire string, necessitating the ".*"
NSPredicate *pred = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regex];

if ([pred evaluateWithObject:str]) {

   NSLog(@"ends with 'string'");

}</lang> Unfortunately this method cannot find the location of the match or do substitution.

NSRegularExpressionSearch

Test

Works with: Mac OS X version 10.7+

Works with: iOS version 3.2+

<lang objc>NSString *str = @"I am a string"; if ([str rangeOfString:@"string$" options:NSRegularExpressionSearch].location != NSNotFound) {

   NSLog(@"Ends with 'string'");

}</lang>

Substitute

Works with: Mac OS X version 10.7+

Works with: iOS version 4.0+

undocumented

<lang objc>NSString *orig = @"I am the original string"; NSString *result = [orig stringByReplacingOccurrencesOfString:@"original"

                                                  withString:@"modified"
                                                     options:NSRegularExpressionSearch
                                                       range:NSMakeRange(0, [orig length])];

NSLog(@"%@", result);</lang>

NSRegularExpression

Works with: Mac OS X version 10.7+

Works with: iOS version 4.0+

Test <lang objc>NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"string$"

                                                                      options:0
                                                                        error:NULL];

NSString *str = @"I am a string"; if ([regex rangeOfFirstMatchInString:str

                            options:0
                              range:NSMakeRange(0, [str length])
    ].location != NSNotFound) {
   NSLog(@"Ends with 'string'");

}</lang>

Loop through matches <lang objc>for (NSTextCheckingResult *match in [regex matchesInString:str

                                                  options:0
                                                    range:NSMakeRange(0, [str length])
                                    ]) {
   // match.range gives the range of the whole match
   // [match rangeAtIndex:i] gives the range of the i'th capture group (starting from 1)

}</lang>

Substitute <lang objc>NSString *orig = @"I am the original string"; NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"original"

                                                                      options:0
                                                                        error:NULL];

NSString *result = [regex stringByReplacingMatchesInString:orig

                                                  options:0
                                                    range:NSMakeRange(0, [orig length])
                                             withTemplate:@"modified"];

NSLog(@"%@", result);</lang>

OCaml

With the standard library

Test <lang ocaml>#load "str.cma";; let str = "I am a string";; try

 ignore(Str.search_forward (Str.regexp ".*string$") str 0);
 print_endline "ends with 'string'"

with Not_found -> ()</lang>

Substitute <lang ocaml>#load "str.cma";; let orig = "I am the original string";; let result = Str.global_replace (Str.regexp "original") "modified" orig;; (* result is now "I am the modified string" *)</lang>

Using Pcre

Library: ocaml-pcre

<lang ocaml>let matched pat str =

 try ignore(Pcre.exec ~pat str); (true)
 with Not_found -> (false)

let () =

 Printf.printf "matched = %b\n" (matched "string$" "I am a string");
 Printf.printf "Substitute: %s\n"
   (Pcre.replace ~pat:"original" ~templ:"modified" "I am the original string")</lang>

ooRexx

<lang ooRexx>/* Rexx */
/* Using the RxRegExp Regular Expression built-in utility class */

st1 = 'Fee, fie, foe, fum, I smell the blood of an Englishman'
rx1 = '[Ff]?e' -- unlike most regex engines, RxRegExp uses '?' instead of '.' to match any single character
sbx = 'foo'

myRE = .RegularExpression~new()
myRE~parse(rx1, MINIMAL)

mcm = myRE~pos(st1)
say 'String "'st1'"' 'matches pattern "'rx1'":' bool2string(mcm > 0)
say

-- The RxRegExp package doesn't provide a replace capability so you must roll your own
st0 = st1
loop label GREP forever

 mcp = myRE~pos(st1)
 if mcp > 0 then do
   mpp = myRE~position
   fnd = st1~substr(mcp, mpp - mcp + 1)
   stx = st1~changestr(fnd, sbx, 1)
   end
 else leave GREP
 st1 = stx
 end GREP

say 'Input string:  "'st0'"'
say 'Result string: "'stx'"'
return
exit

bool2string:
procedure
do
  parse arg bv .
  if bv then bx = 'true'
        else bx = 'false'
  return bx
end
exit

::requires "rxregexp.cls"
</lang>
Output:

String "Fee, fie, foe, fum, I smell the blood of an Englishman" matches pattern "[Ff]?e": true

Input string:  "Fee, fie, foe, fum, I smell the blood of an Englishman"
Result string: "foo, foo, foo, fum, I smell the blood of an Englishman"

Oxygene

<lang oxygene>
// Match and Replace part of a string using a Regular Expression
//
// Nigel Galloway - April 15th., 2012
//
namespace re;

interface

type

 re = class
 public
   class method Main; 
 end;

implementation

class method re.Main; const

 myString = 'I think that I am Nigel';

var

 r: System.Text.RegularExpressions.Regex;
 myResult : String;

begin

 r := new System.Text.RegularExpressions.Regex('(I am)|(you are)');
 Console.WriteLine("{0} contains {1}", myString, r.Match(myString));
 myResult := r.Replace(myString, "you are");
 Console.WriteLine("{0} contains {1}", myResult, r.Match(myResult));

end;

end. </lang> Produces:

I think that I am Nigel contains I am
I think that you are Nigel contains you are

Oz

<lang oz>declare

 [Regex] = {Module.link ['x-oz://contrib/regex']}
 String = "This is a string"

in

 if {Regex.search "string$" String} \= false then
    {System.showInfo "Ends with string."}
 end
 {System.showInfo {Regex.replace String " a " fun {$ _ _} " another " end}}</lang>

Pascal

<lang pascal>
// Match and Replace part of a string using a Regular Expression
//
// Nigel Galloway - April 11th., 2012
//
program RegularExpr;

uses

 RegExpr;

const

 myString = 'I think that I am Nigel';
 myMatch = '(I am)|(you are)';

var

 r : TRegExpr;
 myResult : String;

begin

 r := TRegExpr.Create;
 r.Expression := myMatch;
 write(myString);
 if r.Exec(myString) then writeln(' contains ' + r.Match[0]);
 myResult := r.Replace(myString, 'you are', False);
 write(myResult);
 if r.Exec(myResult) then writeln(' contains ' + r.Match[0]);

end. </lang> Produces:

>RegularExpr
I think that I am Nigel contains I am
I think that you are Nigel contains you are

Perl

Works with: Perl version 5.8.8

Test <lang perl>$string = "I am a string"; if ($string =~ /string$/) {

  print "Ends with 'string'\n";

}

if ($string !~ /^You/) {

  print "Does not start with 'You'\n";

}</lang>

Substitute <lang perl>$string = "I am a string";
$string =~ s/ a / another /; # makes "I am a string" into "I am another string"
print $string;</lang>

In Perl 5.14+, you can return a new substituted string without altering the original string: <lang perl>$string = "I am a string";
$string2 = $string =~ s/ a / another /r; # $string2 == "I am another string", $string is unaltered
print $string2;</lang>

Test and Substitute <lang perl>$string = "I am a string";
if ($string =~ s/\bam\b/was/) { # \b is a word boundary

  print "I was able to find and replace 'am' with 'was'\n";

}</lang>

Options <lang perl># add the following just after the last / for additional control
# g = globally (match as many as possible)
# i = case-insensitive
# s = treat all of $string as a single line (. also matches a newline)
# m = multi-line (^ and $ match at the start and end of each line)

$string =~ s/i/u/ig; # would change "I am a string" into "u am a strung"</lang>
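For comparison, the four modifiers listed above can be sketched in Python's `re` module. This is only an illustration of what each option does, not part of the Perl entry; the variable names are made up for the example.

```python
import re

text = "I am a string"

# g: re.sub replaces every match by default; count=1 limits it to the first.
# i: re.IGNORECASE makes the match case-insensitive.
all_replaced = re.sub("i", "u", text, flags=re.IGNORECASE)
first_only = re.sub("i", "u", text, count=1, flags=re.IGNORECASE)

two_lines = "first line\nsecond line"

# s: re.DOTALL lets "." match a newline as well.
dot_spans_newline = re.search("line.second", two_lines, flags=re.DOTALL) is not None
dot_stops_at_newline = re.search("line.second", two_lines) is None

# m: re.MULTILINE lets ^ and $ match at every line boundary.
line_starts = re.findall(r"^\w+", two_lines, flags=re.MULTILINE)

print(all_replaced)   # u am a strung
print(first_only)     # u am a string
print(line_starts)    # ['first', 'second']
```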

Omission of the regular expression binding operators

If regular expression matches are being made against the topic variable, it is possible to omit the regular expression binding operators:

<lang perl>$_ = "I like banana milkshake.";
if (/banana/) { # The regular expression binding operator is omitted

 print "Match found\n";

}</lang>

Perl 6

<lang perl6>use v6;
if 'a long string' ~~ /string$/ {

   say "It ends with 'string'";

}

# substitution has a few nifty features

$_ = 'The quick Brown fox';
s:g:samecase/\w+/xxx/;
.say;

# output:
# Xxx xxx Xxx xxx
</lang>

PHP

Works with: PHP version 5.2.0

<lang php>$string = 'I am a string';

# Test

if (preg_match('/string$/', $string)) {

    echo "Ends with 'string'\n";

}

# Replace

$string = preg_replace('/\ba\b/', 'another', $string);
echo "Found 'a' and replaced it with 'another', resulting in this string: $string\n";</lang>

Output:

Ends with 'string'
Found 'a' and replaced it with 'another', resulting in this string: I am another string

PicoLisp

Calling the C library

PicoLisp doesn't have built-in regex functionality. It is easy to call the native C library. <lang PicoLisp>(let (Pat "a[0-9]z" String "a7z")

  (use Preg
     (native "@" "regcomp" 'I '(Preg (64 B . 64)) Pat 1)  # Compile regex
     (when (=0 (native "@" "regexec" 'I (cons NIL (64) Preg) String 0 0 0))
        (prinl "String \"" String "\" matches regex \"" Pat "\"") ) ) )</lang>

Output:

String "a7z" matches regex "a[0-9]z"

Using Pattern Matching

Regular expressions are static and inflexible. Another possibility is dynamic pattern matching, where arbitrary conditions can be programmed. <lang PicoLisp>(let String "The number <7> is incremented"

  (use (@A @N @Z)
     (and
        (match '(@A "<" @N ">"  @Z) (chop String))
        (format @N)
        (prinl @A "<" (inc @) ">" @Z) ) ) )</lang>

Output:

The number <8> is incremented
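Where a regex library accepts a function as the replacement, a similar "match, then compute" effect can be approximated. The sketch below uses Python's `re.sub` with a callback; it is a comparison only, not PicoLisp's `match` mechanism, and the helper name `increment` is an assumption made for the example.

```python
import re

text = "The number <7> is incremented"

def increment(match):
    # Arbitrary logic can run here; this one adds 1 to the captured digits.
    return "<" + str(int(match.group(1)) + 1) + ">"

result = re.sub(r"<(\d+)>", increment, text)
print(result)  # The number <8> is incremented
```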

PowerShell

<lang powershell>"I am a string" -match '\bstr'       # true
"I am a string" -replace 'a\b','no'  # I am no string</lang>
By default both the -match and -replace operators are case-insensitive. They can be made case-sensitive by using the -cmatch and -creplace operators.

PureBasic

<lang PureBasic>String$ = "<tag>some text consisting of Roman letters spaces and numbers like 12</tag>"
regex$ = "<([a-z]*)>[a-z,A-Z,0-9, ]*</\1>"
regex_replace$ = "letters[a-z,A-Z,0-9, ]*numbers[a-z,A-Z,0-9, ]*"
If CreateRegularExpression(1, regex$) And CreateRegularExpression(2, regex_replace$)

 If MatchRegularExpression(1, String$)
   Debug "Tags correct, and only alphanumeric or space characters between them"
 EndIf
 Debug ReplaceRegularExpression(2, String$, "char stuff")

EndIf</lang>

Python

<lang python>import re

string = "This is a string"

if re.search('string$', string):
    print("Ends with string.")

string = re.sub(" a ", " another ", string)
print(string)</lang>

R

First, define some strings. <lang R>pattern <- "string"
text1 <- "this is a matching string"
text2 <- "this does not match"</lang>
Matching with grep. The indices of the texts containing matches are returned. <lang R>grep(pattern, c(text1, text2))  # 1</lang>
Matching with regexpr. The positions of the starts of the matches are returned, along with the lengths of the matches. <lang R>regexpr(pattern, c(text1, text2))</lang>

[1] 20 -1
attr(,"match.length")
[1]  6 -1
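To make the meaning of those two vectors concrete, the same start/length pairs can be computed in Python. This is a sketch for comparison only; the helper `first_match` is hypothetical, and it adds 1 to the start index because R counts positions from 1 while Python counts from 0.

```python
import re

pattern = "string"
texts = ["this is a matching string", "this does not match"]

def first_match(pattern, text):
    """Return (1-based start, length) of the first match, or (-1, -1)."""
    m = re.search(pattern, text)
    if m is None:
        return (-1, -1)
    return (m.start() + 1, m.end() - m.start())

results = [first_match(pattern, t) for t in texts]
print(results)  # [(20, 6), (-1, -1)]
```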

Replacement <lang R>gsub(pattern, "pair of socks", c(text1, text2))</lang>

[1] "this is a matching pair of socks" "this does not match"

Raven

<lang raven>'i am a string' as str</lang>

Match:

<lang raven>str m/string$/ if "Ends with 'string'\n" print</lang>

Replace:

<lang raven>str r/ a / another / print</lang>

REBOL

<lang REBOL>REBOL [
    Title: "Regular Expression Matching"
    Author: oofoe
    Date: 2009-12-06
    URL: http://rosettacode.org/wiki/Regular_expression_matching
]

string: "This is a string."

; REBOL doesn't use a conventional Perl-compatible regular expression
; syntax. Instead, it uses a variant Parsing Expression Grammar with
; the 'parse' function. It's also not limited to just strings. You can
; define complex grammars that actually parse and execute program
; files.

; Here, I provide a rule to 'parse' that specifies searching through
; the string until "string." is found, then the end of the string. If
; the subject string satisfies the rule, the expression will be true.

if parse string [thru "string." end] [ print "Subject ends with 'string.'"]

; For replacement, I take advantage of the ability to call arbitrary
; code when a pattern is matched -- everything in the parens will be
; executed when 'to " a "' is satisfied. This marks the current string
; location, then removes the offending word and inserts the replacement.

parse string [
    to " a "  ; Jump to target.
    mark: (
        remove/part mark 3  ; Remove target.
        mark: insert mark " another "  ; Insert replacement.
    )
    :mark  ; Pick up where I left off.
]
print [crlf "Parse replacement:" string]

; For what it's worth, the above operation is more conveniently done
; with the 'replace' function:

replace string " another " " a "  ; Change string back.
print [crlf "Replacement:" string]</lang>

Output:

Subject ends with 'string.'

Parse replacement: This is another string.

Replacement: This is a string.

REXX

REXX does not directly support regular expressions as part of the language.
However, some REXX interpreters offer support for regular expressions via external function libraries or
through implementation-specific extensions.

It is also possible to emulate regular expressions through appropriate coding techniques.

All of the following REXX examples are modeled after the Perl examples.
testing <lang rexx>/*REXX program demonstrates testing (modeled after Perl example).*/
$string = "I am a string"

                                   say 'The string is:'  $string
     x = "string"

if right($string,length(x))=x then say 'It ends with:' x

     y = "You"

if left($string,length(y))\=y then say 'It does not start with:' y

     z = "ring"

if pos(z,$string)\==0 then say 'It contains the string:' z

     z = "ring"

if wordpos(z,$string)==0 then say 'It does not contain the word:' z

                                      /*stick a fork in it, we're done.*/</lang>

output

The string is: I am a string
It ends with: string
It does not start with: You
It contains the string: ring
It does not contain the word: ring

substitution (destructive) <lang rexx>/*REXX program demonstrates substitution (modeled after Perl example).*/
$string = "I am a string"

    old = " a "
    new = " another "

say 'The original string is:' $string
say 'old  word  is:' old
say 'new  word  is:' new
$string = changestr(old,$string,new)
say 'The  changed string is:' $string

                                      /*stick a fork in it, we're done.*/</lang>

output

The original string is: I am a string
old  word  is:  a
new  word  is:  another
The  changed string is: I am another string

substitution (non-destructive) <lang rexx>/*REXX program shows non-destructive sub. (modeled after Perl example).*/
$string = "I am a string"

    old = " a "
    new = " another "

say 'The original string is:' $string
say 'old  word  is:' old
say 'new  word  is:' new
$string2 = changestr(old,$string,new)
say 'The original string is:' $string
say 'The  changed string is:' $string2

                                      /*stick a fork in it, we're done.*/</lang>

output

The original string is: I am a string
old  word  is:  a
new  word  is:  another
The original string is: I am a string
The  changed string is: I am another string

test and substitute <lang rexx>/*REXX program shows test and substitute (modeled after Perl example).*/

$string = "I am a string"
    old = " am "
    new = " was "

say 'The original string is:' $string
say 'old  word  is:' old
say 'new  word  is:' new

if wordpos(old,$string)\==0 then

          do
          $string = changestr(old,$string,new)
          say 'I was able to find and replace ' old " with " new
          end
                                      /*stick a fork in it, we're done.*/

</lang> output

The original string is: I am a string
old  word  is:  am
new  word  is:  was
I was able to find and replace   am   with   was

Ruby

Test <lang ruby>string="I am a string"
puts "Ends with 'string'" if string[/string$/]
puts "Does not start with 'You'" if !string[/^You/]</lang>

Substitute <lang ruby>puts string.gsub(/ a /,' another ')
#or
string[/ a /]=' another '
puts string</lang>

Substitute using block <lang ruby>puts(string.gsub(/\bam\b/) do |match|

      puts "I found #{match}"
      #place "was" instead of the match
      "was"
    end)</lang>

Run BASIC

<lang runbasic>string$ = "I am a string"
if right$(string$,6) = "string" then print "'";string$;"' ends with 'string'"
i = instr(string$,"am")
string$ = left$(string$,i - 1) + "was" + mid$(string$,i + 2)
print "replace 'am' with 'was' = ";string$
</lang>Output:

'I am a string' ends with 'string'
replace 'am' with 'was' = I was a string

Sather

Sather understands POSIX regular expressions.

<lang sather>class MAIN is

 -- we need to implement the substitution
 regex_subst(re:REGEXP, s, sb:STR):STR is
   from, to:INT;
   re.match(s, out from, out to);
   if from = -1 then return s; end;
   return s.head(from) + sb + s.tail(s.size - to);
 end;

 main is
   s ::= "I am a string";
   re ::= REGEXP::regexp("string$", true);
   if re.match(s) then
     #OUT + "'" + s + "'" + " ends with 'string'\n";
   end;
   if ~REGEXP::regexp("^You", false).match(s) then
     #OUT + "'" + s + "'" + " does not begin with 'You'\n";
   end;
   #OUT + regex_subst(re, s, "integer") + "\n";
   #OUT + regex_subst(REGEXP::regexp("am +a +st", true), s, "get the ") + "\n";
 end;

end;</lang>

Scala

Define <lang scala>val Bottles1 = "(\\d+) bottles of beer".r                                            // syntactic sugar
val Bottles2 = """(\d+) bottles of beer""".r                                          // using triple-quotes to preserve backslashes
val Bottles3 = new scala.util.matching.Regex("(\\d+) bottles of beer")                // standard
val Bottles4 = new scala.util.matching.Regex("""(\d+) bottles of beer""", "bottles")  // with named groups</lang>

Search and replace with string methods: <lang scala>"99 bottles of beer" matches "(\\d+) bottles of beer" // the full string must match
"99 bottles of beer" replace ("99", "98") // literal replacement (no regex), all occurrences
"99 bottles of beer" replaceAll ("b", "B") // regex replacement, all occurrences</lang>

Search with regex methods: <lang scala>"\\d+".r findFirstIn "99 bottles of beer" // returns first partial match, or None
"\\w+".r findAllIn "99 bottles of beer" // returns all partial matches as an iterator
"\\s+".r findPrefixOf "99 bottles of beer" // returns a matching prefix, or None
Bottles4 findFirstMatchIn "99 bottles of beer" // returns a "Match" object, or None
Bottles4 findPrefixMatchOf "99 bottles of beer" // same thing, for prefixes
val bottles = (Bottles4 findFirstMatchIn "99 bottles of beer").get.group("bottles") // Getting a group by name
val Bottles4(bottles) = "99 bottles of beer" // syntactic sugar (not using group name)</lang>
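The difference between a partial match, a prefix match and a full-string match, and the retrieval of a group by name, can also be sketched in Python for comparison. The names below (`bottles_re`, `line`) are assumptions made for the example and are unrelated to the Scala values above.

```python
import re

# (?P<bottles>...) is Python's syntax for a named group.
bottles_re = re.compile(r"(?P<bottles>\d+) bottles of beer")
line = "99 bottles of beer on the wall"

first = bottles_re.search(line)      # partial match anywhere in the string
prefix = bottles_re.match(line)      # match anchored at the start only
whole = bottles_re.fullmatch(line)   # the entire string must match

count = first.group("bottles")       # group looked up by name
words = re.findall(r"\w+", "99 bottles of beer")

print(count)          # 99
print(whole is None)  # True: " on the wall" is left over
print(words)          # ['99', 'bottles', 'of', 'beer']
```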

Using pattern matching with regex: <lang scala>val Some(bottles) = Bottles4 findPrefixOf "99 bottles of beer" // throws an exception if the matching fails; full string must match
for {

 line <- """|99 bottles of beer on the wall
            |99 bottles of beer
            |Take one down, pass it around
            |98 bottles of beer on the wall""".stripMargin.lines

} line match {

 case Bottles1(bottles) => println("There are still "+bottles+" bottles.") // full string must match, so this will match only once
 case _ =>

}
for {

 matched <- "(\\w+)".r findAllIn "99 bottles of beer" matchData // matchData converts to an Iterator of Match

} println("Matched from "+matched.start+" to "+matched.end)</lang>

Replacing with regex: <lang scala>Bottles2 replaceFirstIn ("99 bottles of beer", "98 bottles of beer")
Bottles3 replaceAllIn ("99 bottles of beer", "98 bottles of beer")</lang>

Shiny

<lang shiny>str: 'I am a string'</lang>

Match text: <lang shiny>if str.match ~string$~

   say "Ends with 'string'"

end</lang>

Replace text: <lang shiny>say str.alter ~ a ~ 'another'</lang>

Slate

This library is still in its early stages. There isn't currently a feature to replace a substring.

<lang slate>'http://slatelanguage.org/test/page?query' =~ '^(([^:/?#]+)\\:)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?'.

" ==> {'http:'. 'http'. '//slatelanguage.org'. 'slatelanguage.org'. '/test/page'. '?query'. 'query'. Nil} " </lang>

Smalltalk

<lang smalltalk>|re s s1|
re := Regex fromString: '[a-z]+ing'.
s := 'this is a matching string'.
s1 := 'this does not match'.

(s =~ re) ifMatched: [ :b |

  b match displayNl

]. (s1 =~ re) ifMatched: [ :b |

  'Strangely matched!' displayNl

] ifNotMatched: [

  'no match!' displayNl

].

(s replacingRegex: re with: 'modified') displayNl.</lang>

SNOBOL4

In SNOBOL4, patterns are based not on regular expressions, but are a native datatype which can be constructed, manipulated, concatenated, used in pattern expressions, stored into variables, and so forth. Patterns can be constructed ahead of time and saved in variables, and those preconstructed patterns can also reference additional pattern and data items which won't be known until actual pattern match time. Patterns can define calls to functions which will be called during actual pattern matching, and whose outcome can affect how the pattern match continues, which tentative matches will and won't be accepted, and so forth.

SNOBOL4 pattern matching is thus hugely more capable than traditional regular expressions are. An example of a pattern matching problem that would be prohibitively difficult to create as a regular expression would be to "create a pattern which matches a complete name and international postal mailing address."

SNOBOL4's "raison d'etre" is pattern matching and string manipulation (although it is also strong in data structures). The basic statement syntax in SNOBOL4 is:

<lang snobol4>label subject pattern = object :(goto)</lang>

The basic operation is to evaluate the subject, evaluate the pattern, find the pattern in the subject, evaluate the object, and then replace the portion of the subject matched by the pattern with the evaluated object. The statement as a whole either succeeds or fails (it fails if any of those steps fails), and execution then continues at the label named in the goto field, if one applies.

The goto can be unconditional, or can be based on whether the statement succeeded or failed (and that is the basis for all explicit transfers of control in SNOBOL4). This example finds the string "SNOBOL4" in string variable string1, and replaces it with "new SPITBOL" (SPITBOL is an implementation of SNOBOL4, basically SPITBOL is to SNOBOL4 what Turbo Pascal is to Pascal):

<lang snobol4> string1 = "The SNOBOL4 language is designed for string manipulation."

    string1 "SNOBOL4" = "new SPITBOL"                   :s(changed)f(nochange)</lang>

The following example replaces "diameter is " and a numeric value by "circumference is " and the circumference instead (it also shows creation of a pattern which matches integer or real numeric values, and storing that pattern into a variable... and then using that pattern variable later in a slightly more complicated pattern expression):

<lang snobol4> pi = 3.1415926

    dd = "0123456789"
    string1 = "For the first circle, the diameter is 2.5 inches."
    numpat = span(dd) (("." span(dd)) | null)
    string1 "diameter is " numpat . diam = "circumference is " diam * pi</lang>
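For readers more familiar with regular expressions, the diameter-to-circumference replacement can be approximated with a regex plus a replacement function. The Python sketch below is a comparison under that assumption, not a translation of SNOBOL4 pattern semantics; `numpat` and `to_circumference` are names chosen for the example.

```python
import re

PI = 3.1415926
string1 = "For the first circle, the diameter is 2.5 inches."

# Integer or real number: digits, optionally followed by "." and more digits.
numpat = r"\d+(?:\.\d+)?"

def to_circumference(match):
    # The captured number plays the role of the SNOBOL4 variable diam.
    diam = float(match.group(1))
    return "circumference is " + str(diam * PI)

# count=1 replaces only the first occurrence, like a single SNOBOL4 statement.
string1 = re.sub(r"diameter is (" + numpat + ")", to_circumference, string1, count=1)
print(string1)
```

The computed value is a floating-point product, so the exact digits printed may vary slightly from the hand-computed 7.8539815.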

Relatively trivial pattern matching and replacements can be attacked very effectively using regular expressions, but regular expressions (while ubiquitous) are a crippling limitation for more complicated pattern matching problems.

Standard ML

There is no regex support in the Basis Library; however, various implementations have their own support.

Works with: SML/NJ

Test <lang sml>CM.make "$/regexp-lib.cm";
structure RE = RegExpFn (

     structure P = AwkSyntax
     structure E = BackTrackEngine);

val re = RE.compileString "string$";
val string = "I am a string";
case StringCvt.scanString (RE.find re) string

of NONE => print "match failed\n"
 | SOME match =>
     let
       val {pos, len} = MatchTree.root match
     in
       print ("matched at position " ^ Int.toString pos ^ "\n")
     end;</lang>

Tcl

Test using regexp: <lang tcl>set theString "I am a string"
if {[regexp -- {string$} $theString]} {

   puts "Ends with 'string'"

}

if {![regexp -- {^You} $theString]} {

   puts "Does not start with 'You'"

}</lang>

Extract substring using regexp <lang tcl>set theString "This string has >123< a number in it"
if {[regexp -- {>(\d+)<} $theString -> number]} {

   puts "Contains the number $number"

}</lang>

Substitute using regsub <lang tcl>set theString "I am a string"
puts [regsub -- { +a +} $theString { another }]</lang>

Toka

Toka's regular expression library allows for matching, but does not yet provide for replacing elements within strings.

<lang toka>#! Include the regex library
needs regex

#! The two test strings

" This is a string" is-data test.1
" Another string" is-data test.2

#! Create a new regex named 'expression' which tries
#! to match strings beginning with 'This'.

" ^This" regex: expression

#! An array to store the results of the match
#! (Element 0 = starting offset, Element 1 = ending offset of match)

2 cells is-array match

#! Try both test strings against the expression.
#! try-regex will return a flag. -1 is TRUE, 0 is FALSE

expression test.1 2 match try-regex .
expression test.2 2 match try-regex .</lang>

TXR

Search and replace: simple

TXR is not designed for sed-like filtering, but here is how to do sed -e 's/dog/cat/g':

<lang txr>@(collect)
@(coll :gap 0)@mismatch@{match /dog/}@(end)@suffix
@(output)
@(rep)@{mismatch}cat@(end)@suffix
@(end)
@(end)</lang>

How it works is that the body of the coll uses a double-variable match: an unbound variable followed by a regex-match variable. The meaning of this combination is, "Search for the regular expression, and if successful, then bind all the characters which were skipped over by the search to the first variable, and the matching text to the second variable." So we collect pairs: pieces of mismatching text, and pieces of text which match the regex dog. At the end, there is usually going to be a piece of text which does not match the body, because it has no match for the regex. Because :gap 0 is specified, the coll construct will terminate when faced with this nonmatching text, rather than skipping it in a vain search for a match, which allows @suffix to take on this trailing text.

To output the substitution, we simply spit out the mismatching texts followed by the replacement text, and then add the suffix.
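The collect-pairs-then-reassemble idea described above can be sketched in Python. This is an illustration of the technique only, not of TXR itself; the function name `replace_all` is hypothetical.

```python
import re

def replace_all(line, pattern, replacement):
    """Rebuild line from (mismatching text, match) pairs plus a trailing suffix."""
    pieces = []
    pos = 0
    for m in re.finditer(pattern, line):
        pieces.append(line[pos:m.start()])  # text skipped over before this match
        pieces.append(replacement)          # emitted in place of the matched text
        pos = m.end()
    suffix = line[pos:]                     # trailing text with no further match
    return "".join(pieces) + suffix

print(replace_all("hotdog and dogma", "dog", "cat"))  # hotcat and catma
```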

Search and replace: strip comments from C source

Based on the technique of the previous example, here is a query for stripping C comments from a source file, replacing them by a space. Here, the "non-greedy" version of the regex Kleene operator is used, denoted by %. This allows for a very simple, straightforward regex which correctly matches C comments. The freeform operator allows the entire input stream to be treated as one big line, so this works across multi-line comments.

<lang txr>@(freeform)
@(coll :gap 0)@notcomment@{comment /[/][*].%[*][/]/}@(end)@tail
@(output)
@(rep)@notcomment @(end)@tail
@(end)</lang>
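The role of the non-greedy operator can be seen by comparing it with its greedy counterpart. The Python sketch below is a comparison only: `.*?` is Python's non-greedy form, and `re.DOTALL` stands in for treating the input as one big line. Like the query above, it does not account for comment markers inside string literals.

```python
import re

source = "int a; /* first\n   comment */ int b; /* second */ int c;"

# Non-greedy: each comment ends at the nearest "*/".
stripped = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)

# Greedy: one match runs from the first "/*" to the last "*/",
# swallowing the code between the two comments.
greedy = re.sub(r"/\*.*\*/", " ", source, flags=re.DOTALL)

print(stripped)
print(greedy)
```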

Vala

<lang vala> void main(){

   string sentence = "This is a sample sentence.";

   Regex a = new Regex("s[ai]mple"); // if using \n type expressions, use triple " for string literals as easy method to escape them

   if (a.match(sentence)){
       stdout.printf("\"%s\" is in \"%s\"!\n", a.get_pattern(), sentence);
   }

   string sentence_replacement = "cat";
   sentence = a.replace(sentence, sentence.length, 0, sentence_replacement);
   stdout.printf("Replaced sentence is: %s\n", sentence);

} </lang>

Output:

"s[ai]mple" is in "This is a sample sentence."!
Replaced sentence is: This is a cat sentence.

Vedit macro language

Vedit can perform searches and matching with either regular expressions, pattern matching codes or plain text. These examples use regular expressions.

Match text at cursor location: <lang vedit>if (Match(".* string$", REGEXP)==0) {

   Statline_Message("This line ends with 'string'")

}</lang>

Search for a pattern: <lang vedit>if (Search("string$", REGEXP+NOERR)) {

   Statline_Message("'string' at end of line found")

}</lang>

Replace: <lang vedit>Replace(" a ", " another ", REGEXP+NOERR)</lang>

Web 68

<lang web68>@1Introduction. Web 68 has access to a regular expression module which can compile regular expressions, use them for matching strings, and replace strings with the matched string.

@a@<Compiler prelude@> BEGIN @<Declarations@> @<Logic at the top level@> END @<Compiler postlude@>

@ The local compiler requires a special prelude.

@<Compiler prel...@>= PROGRAM rosettacode regex CONTEXT VOID USE regex,standard

@ And a special postlude.

@<Compiler post...@>= FINISH

@1Regular expressions. Compile a regular expression and match a string using it.

@<Decl...@>= STRING regexp="string$"; REF REGEX rx=rx compile(regexp);

@ Declare a string for the regular expression to match.

@<Decl...@>= STRING to match = "This is a string";

@ Define a routine to print the result of matching.

@<Decl...@>= OP MATCH = (REF REGEX rx,STRING match)STRING: IF rx match(rx,match,LOC SUBEXP) THEN "matches" ELSE "doesn't match" FI;

@ Check whether the regular expression matches the string.

@<Logic...@>= print(("String """,regexp,""" ",rx MATCH to match,

      " string """,to match,"""",newline))

@ The end. This program is processed by tang to produce Algol 68 code which has to be compiled by the a68toc compiler. Its output is then compiled by gcc to produce a binary program. The script 'ca' provided with the Debian package algol68toc requires the following command to process this program.

 ca -l mod rosettacoderegex.w68

That's it. The resulting binary will print 'String "string$" matches string "This is a string"'</lang>