Regular expressions

From Rosetta Code
You are encouraged to solve this task according to the task description, using any language you may know.

The goal of this task is

  • to match a string against a regular expression
  • to substitute part of a string using a regular expression
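
As a language-neutral reference point for the two operations above, here is a minimal sketch in Python using its standard `re` module. The sample string and variable names are illustrative assumptions and are not part of the task description. It also shows the difference between searching for a match anywhere in the string and requiring the whole string to match, a distinction several of the solutions below have to deal with.

```python
import re

text = "I am a string"

# Match: re.search looks for the pattern anywhere in the string.
ends_with_string = re.search(r"string$", text) is not None

# re.fullmatch succeeds only if the entire string matches,
# so the same test needs a leading ".*".
whole_string_matches = re.fullmatch(r".*string", text) is not None

# Substitute: re.sub returns a new string and leaves the original unchanged;
# count=1 limits the replacement to the first match.
replaced = re.sub(r" a ", " another ", text, count=1)

print(ends_with_string)
print(whole_string_matches)
print(replaced)
```

Languages whose only matching primitive is a whole-string match need the `.*` prefix shown in the second test; those with a search primitive can use the bare `string$` pattern.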

Ada

There is no Regular Expression library in the Ada Standard, so I am using one of the libraries provided by gnat/gcc.

with Ada.Text_IO; with Gnat.Regpat; use Ada.Text_IO;
 
procedure Regex is
 
package Pat renames Gnat.Regpat;
 
procedure Search_For_Pattern(Compiled_Expression: Pat.Pattern_Matcher;
Search_In: String;
First, Last: out Positive;
Found: out Boolean) is
Result: Pat.Match_Array (0 .. 1);
begin
Pat.Match(Compiled_Expression, Search_In, Result);
Found := not Pat."="(Result(1), Pat.No_Match);
if Found then
First := Result(1).First;
Last := Result(1).Last;
end if;
end Search_For_Pattern;
 
Word_Pattern: constant String := "([a-zA-Z]+)";
 
Str: String:= "I love PATTERN matching!";
Current_First: Positive := Str'First;
First, Last: Positive;
Found: Boolean;
 
begin
-- first, find all the words in Str
loop
Search_For_Pattern(Pat.Compile(Word_Pattern),
Str(Current_First .. Str'Last),
First, Last, Found);
exit when not Found;
Put_Line("<" & Str(First .. Last) & ">");
Current_First := Last+1;
end loop;
 
-- second, replace "PATTERN" in Str by "pattern"
Search_For_Pattern(Pat.Compile("(PATTERN)"), Str, First, Last, Found);
if Found then
-- "pattern" is as long as "PATTERN", so the result still fits in Str
Str := Str(Str'First .. First-1) & "pattern" & Str(Last+1 .. Str'Last);
end if;
Put_Line(Str);
end Regex;


Output:

<I>
<love>
<PATTERN>
<matching>
I love pattern matching!

AppleScript

Library: Satimage.osax
try
find text ".*string$" in "I am a string" with regexp
on error message
return message
end try
 
try
change "original" into "modified" in "I am the original string" with regexp
on error message
return message
end try

Output:



ALGOL 68

The routines grep in string and sub in string are not part of ALGOL 68's standard prelude.

Works with: ALGOL 68G version Any - tested with release mk15-0.8b.fc9.i386
INT match=0, no match=1, out of memory error=2, other error=3;
 
STRING str := "i am a string";
 
# Match: #
 
STRING m := "string$";
INT start, end;
IF grep in string(m, str, start, end) = match THEN printf(($"Ends with """g""""l$, str[start:end])) FI;
 
# Replace: #
 
IF sub in string(" a ", " another ",str) = match THEN printf(($gl$, str)) FI;

Output:

Ends with "string"
i am another string

Standard ALGOL 68 does have a primitive form of pattern matching called a format. It is designed to extract values from input data, but it can also be used for outputting (and transputting) the original data.

Works with: ALGOL 68 version Standard - But declaring book as flex[]flex[]string
Works with: ALGOL 68G version Any - tested with release mk15-0.8b.fc9.i386
For example:
FORMAT pattern = $ddd" "c("cats","dogs")$;
FILE file; STRING book; associate(file, book);
on value error(file, (REF FILE f)BOOL: stop);
on format error(file, (REF FILE f)BOOL: stop);
 
book := "100 dogs";
STRUCT(INT count, type) dalmatians;
 
getf(file, (pattern, dalmatians));
print(("Dalmatians: ", dalmatians, new line));
count OF dalmatians +:=1;
printf(($"Gives: "$, pattern, dalmatians, $l$))

Output:

Dalmatians:        +100         +2
Gives: 101 dogs

Argile

use std, regex
 
(: matching :)
if "some matchable string" =~ /^some" "+[a-z]*" "+string$/
echo string matches
else
echo string "doesn't" match
 
(: replacing :)
let t = strdup "some allocated string"
t =~ s/a/"4"/g
t =~ s/e/"3"/g
t =~ s/i/"1"/g
t =~ s/o/"0"/g
t =~ s/s/$/g
print t
free t
 
(: flushing regex allocations :)
uninit regex
 
check mem leak; use dbg (:optional:)

(note that it needs to be compiled with argrt library)

Output:

string matches
$0m3 4ll0c4t3d $tr1ng

AutoHotkey

MsgBox % foundpos := RegExMatch("Hello World", "World$")  
MsgBox % replaced := RegExReplace("Hello World", "World$", "yourself")

AWK

AWK supports regular expressions, which are typically enclosed in slashes, and the tilde (~) operator, which tests a string against a regular expression:

$ awk '{if($0~/[A-Z]/)print "uppercase detected"}'
abc
ABC
uppercase detected

As shorthand, a regular expression in the condition part fires if it matches an input line:

$ awk '/[A-Z]/{print "uppercase detected"}'
def
DeF
uppercase detected

For substitution, the first argument can be a regular expression, while the replacement is a constant string in which '&' stands for the matched text:

$ awk '{gsub(/[A-Z]/,"*");print}'
abCDefG
ab**ef*
$ awk '{gsub(/[A-Z]/,"(&)");print}'
abCDefGH
ab(C)(D)ef(G)(H)

This variant matches one or more uppercase letters in one round:

$ awk '{gsub(/[A-Z]+/,"(&)");print}'
abCDefGH
ab(CD)ef(GH)

Regular expression negation can be achieved by combining the regular expression binding operator with a logical not operator, as follows:

if (text !~ /strawberry/) {
 print "Match not found"
}

BBC BASIC

Uses the gnu_regex library.

      SYS "LoadLibrary", "gnu_regex.dll" TO gnu_regex%
IF gnu_regex% = 0 ERROR 100, "Cannot load gnu_regex.dll"
SYS "GetProcAddress", gnu_regex%, "regcomp" TO regcomp
SYS "GetProcAddress", gnu_regex%, "regexec" TO regexec
 
DIM regmatch{start%, finish%}, buffer% 256
 
REM Find all 'words' in a string:
teststr$ = "I love PATTERN matching!"
pattern$ = "([a-zA-Z]+)"
 
SYS regcomp, buffer%, pattern$, 1 TO result%
IF result% ERROR 101, "Failed to compile regular expression"
 
first% = 1
REPEAT
SYS regexec, buffer%, MID$(teststr$, first%), 1, regmatch{}, 0 TO result%
IF result% = 0 THEN
s% = regmatch.start%
f% = regmatch.finish%
PRINT "<" MID$(teststr$, first%+s%, f%-s%) ">"
first% += f%
ENDIF
UNTIL result%
 
REM Replace 'PATTERN' with 'pattern':
teststr$ = "I love PATTERN matching!"
pattern$ = "(PATTERN)"
 
SYS regcomp, buffer%, pattern$, 1 TO result%
IF result% ERROR 101, "Failed to compile regular expression"
SYS regexec, buffer%, teststr$, 1, regmatch{}, 0 TO result%
IF result% = 0 THEN
s% = regmatch.start%
f% = regmatch.finish%
MID$(teststr$, s%+1, f%-s%) = "pattern"
PRINT teststr$
ENDIF
 
SYS "FreeLibrary", gnu_regex%

Output:

<I>
<love>
<PATTERN>
<matching>
I love pattern matching!

Bracmat

Pattern matching in Bracmat is inspired by pattern matching in Snobol. It also is quite different from regular expressions:
  • Patterns in Bracmat are not greedy
  • It is not possible to replace substrings, because values can never be changed
  • Patterns always must match all of the subject
  • Strings as well as complex data can be subjected to pattern matching

List all rational numbers smaller than 7 hidden in the string "fesylk789/35768poq2art":

@("fesylk789/35768poq2art":? (#<7:?n & out$!n & ~) ?)

Output:

789/357
789/3576
789/35768
89/35
89/357
89/3576
89/35768
9/3
9/35
9/357
9/3576
9/35768
3
5
6
2

After the last number, the match expression fails.

Brat

Test

str = "I am a string"
 
true? str.match(/string$/)
{ p "Ends with 'string'" }
 
false? str.match(/^You/)
{ p "Does not start with 'You'" }
 

Substitute

# Substitute in copy
 
str2 = str.sub(/ a /, " another ")
 
p str # original unchanged
p str2 # prints "I am another string"
 
# Substitute in place
 
str.sub!(/ a /, " another ")
 
p str # prints "I am another string"
 
# Substitute with a block
 
str.sub! /a/
{ match | match.upcase }
 
p str # prints "I Am Another string"
 

C

Works with: POSIX

As far as I can see, POSIX defines functions for regex matching but nothing for substitution, so the substitution has to be done by hand. The complex-looking code could be wrapped in a function.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <regex.h>
#include <string.h>
 
int main()
{
regex_t preg;
regmatch_t substmatch[1];
const char *tp = "string$";
const char *t1 = "this is a matching string";
const char *t2 = "this is not a matching string!";
const char *ss = "istyfied";
 
regcomp(&preg, "string$", REG_EXTENDED);
printf("'%s' %smatched with '%s'\n", t1,
(regexec(&preg, t1, 0, NULL, 0)==0) ? "" : "did not ", tp);
printf("'%s' %smatched with '%s'\n", t2,
(regexec(&preg, t2, 0, NULL, 0)==0) ? "" : "did not ", tp);
regfree(&preg);
/* replace the first match of "a[a-z]+" with "istyfied" */
regcomp(&preg, "a[a-z]+", REG_EXTENDED);
if ( regexec(&preg, t1, 1, substmatch, 0) == 0 )
{
//fprintf(stderr, "%d, %d\n", substmatch[0].rm_so, substmatch[0].rm_eo);
char *ns = malloc(substmatch[0].rm_so + 1 + strlen(ss) +
(strlen(t1) - substmatch[0].rm_eo) + 2);
memcpy(ns, t1, substmatch[0].rm_so+1);
memcpy(&ns[substmatch[0].rm_so], ss, strlen(ss));
memcpy(&ns[substmatch[0].rm_so+strlen(ss)], &t1[substmatch[0].rm_eo],
strlen(&t1[substmatch[0].rm_eo]));
ns[ substmatch[0].rm_so + strlen(ss) +
strlen(&t1[substmatch[0].rm_eo]) ] = 0;
printf("mod string: '%s'\n", ns);
free(ns);
} else {
printf("the string '%s' is the same: no matching!\n", t1);
}
regfree(&preg);
 
return 0;
}

C++

Works with: g++ version 4.0.2
Library: Boost
#include <iostream>
#include <string>
#include <iterator>
#include <boost/regex.hpp>
 
int main()
{
boost::regex re(".* string$");
std::string s = "Hi, I am a string";
 
// match the complete string
if (boost::regex_match(s, re))
std::cout << "The string matches.\n";
else
std::cout << "Oops - not found?\n";
 
// match a substring
boost::regex re2(" a.*a");
boost::smatch match;
if (boost::regex_search(s, match, re2))
{
std::cout << "Matched " << match.length()
<< " characters starting at " << match.position() << ".\n";
std::cout << "Matched character sequence: \""
<< match.str() << "\"\n";
}
else
{
std::cout << "Oops - not found?\n";
}
 
// replace a substring
std::string dest_string;
boost::regex_replace(std::back_inserter(dest_string),
s.begin(), s.end(),
re2,
"'m now a changed");
std::cout << dest_string << std::endl;
}

C#

using System;
using System.Text.RegularExpressions;
 
class Program {
static void Main(string[] args) {
string str = "I am a string";
 
if (new Regex("string$").IsMatch(str)) {
Console.WriteLine("Ends with string.");
}
 
str = new Regex(" a ").Replace(str, " another ");
Console.WriteLine(str);
}
}

Clojure

(let [s "I am a string"]
;; match
(when (re-find #"string$" s)
(println "Ends with 'string'."))
(when-not (re-find #"^You" s)
(println "Does not start with 'You'."))
 
;; substitute
(println (clojure.string/replace s #" a " " another "))
)

Common Lisp

Translation of: Perl

Uses CL-PPCRE - Portable Perl-compatible regular expressions for Common Lisp.

(let ((string "I am a string"))
(when (cl-ppcre:scan "string$" string)
(write-line "Ends with string"))
(unless (cl-ppcre:scan "^You" string )
(write-line "Does not start with 'You'")))

Substitute

(let* ((string "I am a string")
(string (cl-ppcre:regex-replace " a " string " another ")))
(write-line string))

Test and Substitute

(let ((string "I am a string"))
(multiple-value-bind (string matchp)
(cl-ppcre:regex-replace "\\bam\\b" string "was")
(when matchp
(write-line "I was able to find and replace 'am' with 'was'."))))

CLISP regexp engine

Works with: CLISP

CLISP comes with a built-in regexp matcher. At the CLISP prompt:

[1]> (regexp:match "fox" "quick fox jumps")
#S(REGEXP:MATCH :START 6 :END 9)

To find all matches, call it in a loop, passing a different :start keyword each time.

Replacing text can be done with the help of REGEXP:REGEXP-SPLIT function:

[2]> (defun regexp-replace (pat repl string)
(reduce #'(lambda (x y) (string-concat x repl y))
(regexp:regexp-split pat string)))
REGEXP-REPLACE
[3]> (regexp-replace "x\\b" "-X-" "quick foxx jumps")
"quick fox-X- jumps"

D

void main() {
import std.stdio, std.regex;
 
immutable s = "I am a string";
 
// Test.
if (!s.match(r"string$").empty)
"Ends with 'string'.".writeln;
 
// Substitute.
s.replace(" a ".regex, " another ").writeln;
}
Output:
Ends with 'string'.
I am another string

In std.string there are string functions to perform the same operations more efficiently.

Dart

RegExp regexp = new RegExp(r'\w+\!');
 
String capitalize(Match m) => '${m[0].substring(0, m[0].length-1).toUpperCase()}';
 
void main(){
String hello = 'hello hello! world world!';
String hellomodified = hello.replaceAllMapped(regexp, capitalize);
print(hello);
print(hellomodified);
}
Output:
hello hello! world world!
hello HELLO world WORLD

Erlang

match() ->
String = "This is a string",
case re:run(String, "string$") of
{match,_} -> io:format("Ends with 'string'~n");
_ -> ok
end.
 
substitute() ->
String = "This is a string",
NewString = re:replace(String, " a ", " another ", [{return, list}]),
io:format("~s~n",[NewString]).


F#

Translation of: C#
open System
open System.Text.RegularExpressions
 
[<EntryPoint>]
let main argv =
let str = "I am a string"
if Regex("string$").IsMatch(str) then Console.WriteLine("Ends with string.")
 
let rstr = Regex(" a ").Replace(str, " another ")
Console.WriteLine(rstr)
0

Forth

Test/Match

include ffl/rgx.fs
 
\ Create a regular expression variable 'exp' in the dictionary
 
rgx-create exp
 
\ Compile an expression
 
s" Hello (World)" exp rgx-compile [IF]
.( Regular expression successfully compiled.) cr
[THEN]
 
\ (Case sensitive) match a string with the expression
 
s" Hello World" exp rgx-cmatch? [IF]
.( String matches with the expression.) cr
[ELSE]
.( No match.) cr
[THEN]


Frink

Pattern matching:

 
line = "My name is Inigo Montoya."
 
for [first, last] = line =~ %r/my name is (\w+) (\w+)/ig
{
println["First name is: $first"]
println["Last name is: $last"]
}
 

Replacement: (Replaces in the variable line)

 
line =~ %s/Frank/Frink/g
 

GeneXus

Interesting link: http://wiki.gxtechnical.com/commwiki/servlet/hwiki?Regular+Expressions+%28RegEx%29,

Replacement:

&string = &string.ReplaceRegEx("^\s+|\s+$", "") // it's a trim!
&string = &string.ReplaceRegEx("Another (Match)", "Replacing $1") // Using replace groups

Check match:

If (&string.IsMatch("regex$"))
// The string ends with "regex"
EndIf

Split RegEx:

&stringCollection = &string.SplitRegEx("^\d{2,4}")

Matches:

&RegExMatchCollection = &string.Matches("(pa)tt(ern)")
For &RegExMatch In &RegExMatchCollection
&FullMatch = &RegExMatch.Value // &FullMatch contains the full pattern match: "pattern"
For &matchVarchar In &RegExMatch.Groups
// &matchVarchar contains group matches: "pa", "ern"
EndFor
EndFor

Flags:

  • s - dot matches all (including newline)
  • m - multiline
  • i - ignore case

Flag syntax: (?flags)pattern

Example:

&string = &string.ReplaceRegEx("(?si)IgnoreCase.+$", "") // Flags s and i

Error Handling:

&string = "abc"
&RegExMatchCollection = &string.Matches("[z-a]") // invalid pattern: z-a
&errCode = RegEx.GetLastErrCode() // returns 0 if no error and 1 if an error has occurred
&errDsc = RegEx.GetLastErrDescription()

Go

package main
import "fmt"
import "regexp"
 
func main() {
str := "I am the original string"
 
// Test
matched, _ := regexp.MatchString(".*string$", str)
if matched { fmt.Println("ends with 'string'") }
 
// Substitute
pattern := regexp.MustCompile("original")
result := pattern.ReplaceAllString(str, "modified")
fmt.Println(result)
}

Groovy

"Matching" Solution (it's complicated):

import java.util.regex.*;
 
def woodchuck = "How much wood would a woodchuck chuck if a woodchuck could chuck wood?"
def pepper = "Peter Piper picked a peck of pickled peppers"
 
 
println "=== Regular-expression String syntax (/string/) ==="
def woodRE = /[Ww]o\w+d/
def piperRE = /[Pp]\w+r/
assert woodRE instanceof String && piperRE instanceof String
assert (/[Ww]o\w+d/ == "[Ww]o\\w+d") && (/[Pp]\w+r/ == "[Pp]\\w+r")
println ([woodRE: woodRE, piperRE: piperRE])
println ()
 
 
println "=== Pattern (~) operator ==="
def woodPat = ~/[Ww]o\w+d/
def piperPat = ~piperRE
assert woodPat instanceof Pattern && piperPat instanceof Pattern
 
def woodList = woodchuck.split().grep(woodPat)
println ([exactTokenMatches: woodList])
println ([exactTokenMatches: pepper.split().grep(piperPat)])
println ()
 
 
println "=== Matcher (=~) operator ==="
def wwMatcher = (woodchuck =~ woodRE)
def ppMatcher = (pepper =~ /[Pp]\w+r/)
def wpMatcher = (woodchuck =~ /[Pp]\w+r/)
assert wwMatcher instanceof Matcher && ppMatcher instanceof Matcher
assert wwMatcher.toString() == woodPat.matcher(woodchuck).toString()
assert ppMatcher.toString() == piperPat.matcher(pepper).toString()
assert wpMatcher.toString() == piperPat.matcher(woodchuck).toString()
 
println ([ substringMatches: wwMatcher.collect { it }])
println ([ substringMatches: ppMatcher.collect { it }])
println ([ substringMatches: wpMatcher.collect { it }])
println ()
 
 
println "=== Exact Match (==~) operator ==="
def containsWoodRE = /.*/ + woodRE + /.*/
def containsPiperRE = /.*/ + piperRE + /.*/
def wwMatches = (woodchuck ==~ containsWoodRE)
assert wwMatches instanceof Boolean
def wwNotMatches = ! (woodchuck ==~ woodRE)
def ppMatches = (pepper ==~ containsPiperRE)
def pwNotMatches = ! (pepper ==~ containsWoodRE)
def wpNotMatches = ! (woodchuck ==~ containsPiperRE)
assert wwMatches && wwNotMatches && ppMatches && pwNotMatches && wpNotMatches
 
println ("'${woodchuck}' ${wwNotMatches ? 'does not' : 'does'} match '${woodRE}' exactly")
println ("'${woodchuck}' ${wwMatches ? 'does' : 'does not'} match '${containsWoodRE}' exactly")

Output:

=== Regular-expression String syntax (/string/) ===
[woodRE:[Ww]o\w+d, piperRE:[Pp]\w+r]

=== Pattern (~) operator ===
[exactTokenMatches:[wood, would]]
[exactTokenMatches:[Peter, Piper]]

=== Matcher (=~) operator ===
[substringMatches:[wood, would, wood, wood, wood]]
[substringMatches:[Peter, Piper, pepper]]
[substringMatches:[]]

=== Exact Match (==~) operator ===
'How much wood would a woodchuck chuck if a woodchuck could chuck wood?' does not match '[Ww]o\w+d' exactly
'How much wood would a woodchuck chuck if a woodchuck could chuck wood?' does match '.*[Ww]o\w+d.*' exactly

Replacement Solution (String.replaceAll()):

println woodchuck.replaceAll(/c\w+k/, "CHUCK")

Output:

How much wood would a woodCHUCK CHUCK if a woodCHUCK could CHUCK wood?

Reusable Replacement Solution (Matcher.replaceAll()):

def ck = (woodchuck =~ /c\w+k/)
println (ck.replaceAll("CHUCK"))
println (ck.replaceAll("wind"))
println (ck.replaceAll("pile"))
println (ck.replaceAll("craft"))
println (ck.replaceAll("block"))
println (ck.replaceAll("row"))
println (ck.replaceAll("shed"))
println (ck.replaceAll("man"))
println (ck.replaceAll("work"))
println (ck.replaceAll("pickle"))

Output:

How much wood would a woodCHUCK CHUCK if a woodCHUCK could CHUCK wood?
How much wood would a woodwind wind if a woodwind could wind wood?
How much wood would a woodpile pile if a woodpile could pile wood?
How much wood would a woodcraft craft if a woodcraft could craft wood?
How much wood would a woodblock block if a woodblock could block wood?
How much wood would a woodrow row if a woodrow could row wood?
How much wood would a woodshed shed if a woodshed could shed wood?
How much wood would a woodman man if a woodman could man wood?
How much wood would a woodwork work if a woodwork could work wood?
How much wood would a woodpickle pickle if a woodpickle could pickle wood?

Haskell

Test

import Text.Regex
 
str = "I am a string"
 
main = case matchRegex (mkRegex ".*string$") str of
  Just _  -> putStrLn "ends with 'string'"
  Nothing -> return ()

Substitute

import Text.Regex
 
orig = "I am the original string"
result = subRegex (mkRegex "original") orig "modified"
main = putStrLn result

HicEst

CHARACTER string*100/ "The quick brown fox jumps over the lazy dog" /
REAL, PARAMETER :: Regex=128, Count=256
 
characters_a_m = INDEX(string, "[a-m]", Regex+Count) ! counts 16
 
vocals_changed = EDIT(Text=string, Option=Regex, Right="[aeiou]", RePLaceby='**', DO=LEN(string) ) ! changes 11
WRITE(ClipBoard) string ! Th** q****ck br**wn f**x j**mps **v**r th** l**zy d**g

Icon and Unicon

Regex includes procedures to provide access to regular expressions within native string scanning and matching expressions. 'ReFind' and 'ReMatch' respectively generate the sequence of beginning and ending positions matched by a regular expression. Additionally, there is a regular expression pattern compiler 'RePat' and other supporting functions and variables.

procedure main()
 
s := "A simple string"
p := "string$" # regular expression
 
s ? write(image(s),if ReFind(p) then " matches " else " doesn't match ",image(p))
 
s[j := ReFind(p,s):ReMatch(p,s,j)] := "replacement"
write(image(s))
end
 
link regexp # link to IPL regexp

See regexp.

Sample output:
"A simple string" matches "string$"
"A simple replacement"

Inform 7

Inform's regex support is similar to Perl's but with some limitations: angle brackets are used instead of square brackets, there is no multiline mode, several control characters and character classes are omitted, and backtracking is slightly less powerful.

let T be indexed text;
let T be "A simple string";
if T matches the regular expression ".*string$", say "ends with string.";
replace the regular expression "simple" in T with "replacement";

J

J's regex support is built on top of PCRE.

load'regex'               NB.  Load regex library
str =: 'I am a string' NB. String used in examples.

Matching:

   '.*string$' rxeq str     NB.  1 is true, 0 is false
1

Substitution:

   ('am';'am still') rxrplc str
I am still a string
Note: use
   open'regex'
to read the source code for the library. The comments list 6 main definitions and a dozen utility definitions.

Java

Works with: Java version 1.4+

Test

String str = "I am a string";
if (str.matches(".*string")) { // note: matches() tests if the entire string is a match
System.out.println("ends with 'string'");
}

To match part of a string, or to process matches:

import java.util.regex.*;
Pattern p = Pattern.compile("a*b");
Matcher m = p.matcher(str);
while (m.find()) {
// use m.group() to extract matches
}

Substitute

String orig = "I am the original string";
String result = orig.replaceAll("original", "modified");
// result is now "I am the modified string"

JavaScript

Test/Match

var subject = "Hello world!";
 
// Two different ways to create the RegExp object
// Both forms use the same pattern, which matches "Hello World" case-insensitively
var re_PatternToMatch = /Hello (World)/i; // creates a RegExp literal with case-insensitivity
var re_PatternToMatch2 = new RegExp("Hello (World)", "i");
 
// Test for a match - return a bool
var isMatch = re_PatternToMatch.test(subject);
 
// Get the match details
// Returns an array with the match's details
// matches[0] == "Hello world"
// matches[1] == "world"
var matches = re_PatternToMatch2.exec(subject);

Substitute

var subject = "Hello world!";
 
// Perform a string replacement
// newSubject == "Replaced!"
var newSubject = subject.replace(re_PatternToMatch, "Replaced");

Julia

Translation of: Perl

Julia implements Perl-compatible regular expressions (via the built-in PCRE library). To test for a match:

s = "I am a string"
if ismatch(r"string$", s)
println("'$s' ends with 'string'")
end

To perform replacements:

s = "I am a string"
s = replace(s, r" (a|an) ", " another ")

There are many other features of Julia's regular-expression support, too numerous to list here.


Lasso

Lasso has built in support for regular expressions using ICU regexps.

local(mytext = 'My name is: Stone, Rosetta
My name is: Hippo, Campus
')
 
local(regexp = regexp(
-find = `(?m)^My name is: (.*?), (.*?)$`,
-input = #mytext,
-replace = `Hello! I am $2 $1`,
-ignorecase
))
 
 
while(#regexp -> find) => {^
#regexp -> groupcount > 1 ? (#regexp -> matchString(2) -> trim&) + '<br />'
^}
 
#regexp -> reset(-input = #mytext)
#regexp -> findall
 
#regexp -> reset(-input = #mytext)
'<br />'
#regexp -> replaceall

Output:

Rosetta
Campus
array(My name is: Stone, Rosetta, My name is: Hippo, Campus)
Hello! I am Rosetta Stone Hello! I am Campus Hippo

Lua

str1 = "This is a string!"
str2 = "string"
 
print( str1:match( str2 ) )
erg = str1:gsub( "a", "another" ); print( erg )

M4

regexp(`GNUs not Unix', `\<[a-z]\w+')
regexp(`GNUs not Unix', `\<[a-z]\(\w+\)', `a \& b \1 c')

Output:

5
a not b ot c

Mathematica

 
StringCases["I am a string with the number 18374 in me",RegularExpression["[0-9]+"]]
StringReplace["I am a string",RegularExpression["I\\sam"] -> "I'm"]
 

The in-notebook output, in order:

{18374}
I'm a string

MIRC Scripting Language

alias regular_expressions {
var %string = This is a string
var %re = string$
if ($regex(%string,%re) > 0) {
echo -a Ends with string.
}
%re = \ba\b
if ($regsub(%string,%re,another,%string) > 0) {
echo -a Result 1: %string
}
%re = \b(another)\b
echo -a Result 2: $regsubex(%string,%re,yet \1)
}

Output:

Ends with string.
Result 1: This is another string
Result 2: This is yet another string

MUMPS

MUMPS has no replacement facility that works with its pattern-match operator, ?. Replacement can be mimicked with $PIECE, but $PIECE does not accept a regular expression as an operand.

REGEXP
NEW HI,W,PATTERN,BOOLEAN
SET HI="Hello, world!",W="world"
SET PATTERN=".E1"""_W_""".E"
SET BOOLEAN=HI?@PATTERN
WRITE "Source string - '"_HI_"'",!
WRITE "Partial string - '"_W_"'",!
WRITE "Pattern string created is - '"_PATTERN_"'",!
WRITE "Match? ",$SELECT(BOOLEAN:"YES",'BOOLEAN:"No"),!
 ;
SET BOOLEAN=$FIND(HI,W)
IF BOOLEAN>0 WRITE $PIECE(HI,W,1)_"string"_$PIECE(HI,W,2)
QUIT
Usage:
USER>D REGEXP^ROSETTA
Source string - 'Hello, world!'
Partial string - 'world'
Pattern string created is - '.E1"world".E'
Match? YES
Hello, string!

NetRexx

/* NetRexx */
options replace format comments java crossref symbols nobinary
 
import java.util.regex.
 
st1 = 'Fee, fie, foe, fum, I smell the blood of an Englishman'
rx1 = 'f.e.*?'
sbx = 'foo'
 
rx1ef = '(?i)'rx1 -- use embedded flag expression == Pattern.CASE_INSENSITIVE
 
-- using String's matches & replaceAll
mcm = (String st1).matches(rx1ef)
say 'String "'st1'"' 'matches pattern "'rx1ef'":' Boolean(mcm)
say
say 'Replace all occurrences of regex pattern "'rx1ef'" with "'sbx'"'
stx = Rexx
stx = (String st1).replaceAll(rx1ef, sbx)
say 'Input string: "'st1'"'
say 'Result string: "'stx'"'
say
 
-- using java.util.regex classes
pt1 = Pattern.compile(rx1, Pattern.CASE_INSENSITIVE)
mc1 = pt1.matcher(st1)
mcm = mc1.matches()
say 'String "'st1'"' 'matches pattern "'pt1.toString()'":' Boolean(mcm)
mc1 = pt1.matcher(st1)
say
say 'Replace all occurrences of regex pattern "'rx1'" with "'sbx'"'
sx1 = Rexx
sx1 = mc1.replaceAll(sbx)
say 'Input string: "'st1'"'
say 'Result string: "'sx1'"'
say
 
return
 

Output:

String "Fee, fie, foe, fum, I smell the blood of an Englishman" matches pattern "(?i)f.e.*?": true

Replace all occurrences of regex pattern "(?i)f.e.*?" with "foo"
Input string:  "Fee, fie, foe, fum, I smell the blood of an Englishman"
Result string: "foo, foo, foo, fum, I smell the blood of an Englishman"

String "Fee, fie, foe, fum, I smell the blood of an Englishman" matches pattern "f.e.*?": true

Replace all occurrences of regex pattern "f.e.*?" with "foo"
Input string:  "Fee, fie, foe, fum, I smell the blood of an Englishman"
Result string: "foo, foo, foo, fum, I smell the blood of an Englishman"

NewLISP

(regex "[bB]+" "AbBBbABbBAAAA") -> ("bBBb" 1 4)


Objeck

 
use RegEx;
 
bundle Default {
class RegExTest {
function : Main(args : String[]) ~ Nil {
string := "I am a string";
# exact match
regex := RegEx->New(".*string");
if(regex->MatchExact(string)) {
"ends with 'string'"->PrintLine();
};
# replace all
regex := RegEx->New(" a ");
regex->ReplaceAll(string, " another ")->PrintLine();
}
}
}
 

Objective-C

Test

Works with: Mac OS X version 10.4+
Works with: iOS version 3.0+
NSString *str = @"I am a string";
NSString *regex = @".*string$";
 
// Note: the MATCHES operator matches the entire string, necessitating the ".*"
NSPredicate *pred = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regex];
 
if ([pred evaluateWithObject:str]) {
NSLog(@"ends with 'string'");
}

Unfortunately this method cannot find the location of the match or do substitution.

NSRegularExpressionSearch

Test

Works with: Mac OS X version 10.7+
Works with: iOS version 3.2+
NSString *str = @"I am a string";
if ([str rangeOfString:@"string$" options:NSRegularExpressionSearch].location != NSNotFound) {
NSLog(@"Ends with 'string'");
}

Substitute

Works with: Mac OS X version 10.7+
Works with: iOS version 4.0+ (undocumented)
NSString *orig = @"I am the original string";
NSString *result = [orig stringByReplacingOccurrencesOfString:@"original"
withString:@"modified"
options:NSRegularExpressionSearch
range:NSMakeRange(0, [orig length])];
NSLog(@"%@", result);

NSRegularExpression

Works with: Mac OS X version 10.7+
Works with: iOS version 4.0+

Test

NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"string$"
options:0
error:NULL];
NSString *str = @"I am a string";
if ([regex rangeOfFirstMatchInString:str
options:0
range:NSMakeRange(0, [str length])
].location != NSNotFound) {
NSLog(@"Ends with 'string'");
}

Loop through matches

for (NSTextCheckingResult *match in [regex matchesInString:str
options:0
range:NSMakeRange(0, [str length])
]) {
// match.range gives the range of the whole match
// [match rangeAtIndex:i] gives the range of the i'th capture group (starting from 1)
}

Substitute

NSString *orig = @"I am the original string";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"original"
options:0
error:NULL];
NSString *result = [regex stringByReplacingMatchesInString:orig
options:0
range:NSMakeRange(0, [orig length])
withTemplate:@"modified"];
NSLog(@"%@", result);

OCaml

With the standard library

Test

#load "str.cma";;
let str = "I am a string";;
try
ignore(Str.search_forward (Str.regexp ".*string$") str 0);
print_endline "ends with 'string'"
with Not_found -> ()
;;

Substitute

#load "str.cma";;
let orig = "I am the original string";;
let result = Str.global_replace (Str.regexp "original") "modified" orig;;
(* result is now "I am the modified string" *)

Using Pcre

Library: ocaml-pcre

let matched pat str =
try ignore(Pcre.exec ~pat str); (true)
with Not_found -> (false)
;;
 
let () =
Printf.printf "matched = %b\n" (matched "string$" "I am a string");
Printf.printf "Substitute: %s\n"
(Pcre.replace ~pat:"original" ~templ:"modified" "I am the original string")
;;

ooRexx

/* Rexx */
/* Using the RxRegExp Regular Expression built-in utility class */
 
st1 = 'Fee, fie, foe, fum, I smell the blood of an Englishman'
rx1 = '[Ff]?e' -- unlike most regex engines, RxRegExp uses '?' instead of '.' to match any single character
sbx = 'foo'
 
myRE = .RegularExpression~new()
myRE~parse(rx1, MINIMAL)
 
mcm = myRE~pos(st1)
say 'String "'st1'"' 'matches pattern "'rx1'":' bool2string(mcm > 0)
say
 
-- The RxRegExp package doesn't provide a replace capability so you must roll your own
st0 = st1
loop label GREP forever
mcp = myRE~pos(st1)
if mcp > 0 then do
mpp = myRE~position
fnd = st1~substr(mcp, mpp - mcp + 1)
stx = st1~changestr(fnd, sbx, 1)
end
else leave GREP
st1 = stx
end GREP
say 'Input string: "'st0'"'
say 'Result string: "'stx'"'
return
exit
 
bool2string:
procedure
do
parse arg bv .
if bv then bx = 'true'
else bx = 'false'
return bx
end
exit
 
::requires "rxregexp.cls"
 

Output:

String "Fee, fie, foe, fum, I smell the blood of an Englishman" matches pattern "[Ff]?e": true

Input string:  "Fee, fie, foe, fum, I smell the blood of an Englishman"
Result string: "foo, foo, foo, fum, I smell the blood of an Englishman"

Oxygene

 
// Match and Replace part of a string using a Regular Expression
//
// Nigel Galloway - April 15th., 2012
//
namespace re;
 
interface
 
type
re = class
public
class method Main;
end;
 
implementation
 
class method re.Main;
const
myString = 'I think that I am Nigel';
var
r: System.Text.RegularExpressions.Regex;
myResult : String;
begin
r := new System.Text.RegularExpressions.Regex('(I am)|(you are)');
Console.WriteLine("{0} contains {1}", myString, r.Match(myString));
myResult := r.Replace(myString, "you are");
Console.WriteLine("{0} contains {1}", myResult, r.Match(myResult));
end;
 
end.
 

Produces:

I think that I am Nigel contains I am
I think that you are Nigel contains you are

[edit] Oz

declare
[Regex] = {Module.link ['x-oz://contrib/regex']}
String = "This is a string"
in
if {Regex.search "string$" String} \= false then
{System.showInfo "Ends with string."}
end
{System.showInfo {Regex.replace String " a " fun {$ _ _} " another " end}}

[edit] Pascal

 
// Match and Replace part of a string using a Regular Expression
//
// Nigel Galloway - April 11th., 2012
//
program RegularExpr;
 
uses
RegExpr;
 
const
myString = 'I think that I am Nigel';
myMatch = '(I am)|(you are)';
var
r : TRegExpr;
myResult : String;
 
begin
r := TRegExpr.Create;
r.Expression := myMatch;
write(myString);
if r.Exec(myString) then writeln(' contains ' + r.Match[0]);
myResult := r.Replace(myString, 'you are', False);
write(myResult);
if r.Exec(myResult) then writeln(' contains ' + r.Match[0]);
end.
 

Produces:

>RegularExpr
I think that I am Nigel contains I am
I think that you are Nigel contains you are

[edit] Perl

Works with: Perl version 5.8.8

Test

$string = "I am a string";
if ($string =~ /string$/) {
print "Ends with 'string'\n";
}
 
if ($string !~ /^You/) {
print "Does not start with 'You'\n";
}


Substitute

$string = "I am a string";
$string =~ s/ a / another /; # makes "I am a string" into "I am another string"
print $string;

In Perl 5.14+, you can return a new substituted string without altering the original string:

$string = "I am a string";
$string2 = $string =~ s/ a / another /r; # $string2 == "I am another string", $string is unaltered
print $string2;


Test and Substitute

$string = "I am a string";
if ($string =~ s/\bam\b/was/) { # \b is a word boundary
print "I was able to find and replace 'am' with 'was'\n";
}


Options

# add the following just after the last / for additional control
# g = globally (match as many as possible)
# i = case-insensitive
# s = single-line mode: "." also matches a newline (useful when $string contains line breaks)
# m = multi-line mode: ^ and $ match at the start and end of every line, not only of the whole string
 
$string =~ s/i/u/ig; # would change "I am a string" into "u am a strung"
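
To make the effect of the i, s and m options concrete, here is a small sketch using the corresponding flags of Python's re module (re.IGNORECASE, re.DOTALL, re.MULTILINE). The two-line test string is an assumption added for illustration; there is no separate g flag because re.sub replaces every match by default.

```python
import re

text = "I am a string\nYou are a line"

# i (plus g): re.sub already replaces every match, so only IGNORECASE is needed
swapped = re.sub("i", "u", "I am a string", flags=re.IGNORECASE)

# s: with DOTALL, "." also matches the newline between the two lines
dot_default = re.search("string.You", text) is not None
dot_all = re.search("string.You", text, flags=re.DOTALL) is not None

# m: with MULTILINE, "^" matches at the start of every line
first_words_default = re.findall(r"^\w+", text)
first_words_multiline = re.findall(r"^\w+", text, flags=re.MULTILINE)

print(swapped)                # u am a strung
print(dot_default, dot_all)   # False True
print(first_words_default)    # ['I']
print(first_words_multiline)  # ['I', 'You']
```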

Omission of the regular expression binding operators

If regular expression matches are being made against the topic variable, it is possible to omit the regular expression binding operators:

$_ = "I like banana milkshake.";
if (/banana/) { # The regular expression binding operator is omitted
print "Match found\n";
}

[edit] Perl 6

use v6;
if 'a long string' ~~ /string$/ {
say "It ends with 'string'";
}
 
# substitution has a few nifty features
 
$_ = 'The quick Brown fox';
s:g:samecase/\w+/xxx/;
.say;
# output:
# Xxx xxx Xxx xxx
 

[edit] PHP

Works with: PHP version 5.2.0
$string = 'I am a string';
# Test
if (preg_match('/string$/', $string))
{
echo "Ends with 'string'\n";
}
# Replace
$string = preg_replace('/\ba\b/', 'another', $string);
echo "Found 'a' and replaced it with 'another', resulting in this string: $string\n";

Output:

Ends with 'string'
Found 'a' and replaced it with 'another', resulting in this string: I am another string

[edit] PicoLisp

[edit] Calling the C library

PicoLisp doesn't have built-in regex functionality. It is easy to call the native C library.

(let (Pat "a[0-9]z"  String "a7z")
(use Preg
(native "@" "regcomp" 'I '(Preg (64 B . 64)) Pat 1) # Compile regex
(when (=0 (native "@" "regexec" 'I (cons NIL (64) Preg) String 0 0 0))
(prinl "String \"" String "\" matches regex \"" Pat "\"") ) ) )

Output:

String "a7z" matches regex "a[0-9]z"

[edit] Using Pattern Matching

Regular expressions are static and inflexible. Another possibility is dynamic pattern matching, where arbitrary conditions can be programmed.

(let String "The number <7> is incremented"
(use (@A @N @Z)
(and
(match '(@A "<" @N ">" @Z) (chop String))
(format @N)
(prinl @A "<" (inc @) ">" @Z) ) ) )

Output:

The number <8> is incremented
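
For comparison, a regex engine that accepts a replacement function can express the same computed substitution. The following is a minimal Python sketch; the helper name increment_bracketed is hypothetical, and it assumes the value between the angle brackets is a non-negative integer.

```python
import re

def increment_bracketed(text):
    """Add one to the first integer written between '<' and '>'."""
    def bump(match):
        # group(1) holds the digits captured between the brackets
        return "<%d>" % (int(match.group(1)) + 1)
    return re.sub(r"<(\d+)>", bump, text, count=1)

print(increment_bracketed("The number <7> is incremented"))
# The number <8> is incremented
```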

[edit] PowerShell

"I am a string" -match '\bstr'       # true
"I am a string" -replace 'a\b','no' # I am no string

By default both the -match and -replace operators are case-insensitive. They can be made case-sensitive by using the -cmatch and -creplace operators.

[edit] PureBasic

String$        = "<tag>some text consisting of Roman letters spaces and numbers like 12</tag>"
regex$ = "<([a-z]*)>[a-z,A-Z,0-9, ]*</\1>"
regex_replace$ = "letters[a-z,A-Z,0-9, ]*numbers[a-z,A-Z,0-9, ]*"
If CreateRegularExpression(1, regex$) And CreateRegularExpression(2, regex_replace$)
If MatchRegularExpression(1, String$)
Debug "Tags correct, and only alphanumeric or space characters between them"
EndIf
Debug ReplaceRegularExpression(2, String$, "char stuff")
EndIf

[edit] Python

import re
 
string = "This is a string"
 
if re.search('string$', string):
    print("Ends with string.")

string = re.sub(" a ", " another ", string)
print(string)

[edit] R

First, define some strings.

pattern <- "string"
text1 <- "this is a matching string"
text2 <- "this does not match"

Matching with grep. The indices of the texts containing matches are returned.

grep(pattern, c(text1, text2))  # 1

Matching with regexpr. The positions of the starts of the matches are returned, along with the lengths of the matches.

regexpr(pattern, c(text1, text2))
[1] 20 -1
attr(,"match.length")
[1]  6 -1

Replacement

gsub(pattern, "pair of socks", c(text1, text2))
[1] "this is a matching pair of socks" "this does not match"

[edit] Racket

 
#lang racket
 
(define s "I am a string")
 
(when (regexp-match? #rx"string$" s)
(displayln "Ends with 'string'."))
 
(unless (regexp-match? #rx"^You" s)
(displayln "Does not start with 'You'."))
 
(displayln (regexp-replace " a " s " another "))
 

[edit] Raven

'i am a string' as str

Match:

str m/string$/
if "Ends with 'string'\n" print

Replace once:

str r/ a / another /  print
str r/ /_/  print

Replace all:

str r/ /_/g  print

Replace case insensitive:

str r/ A / another /i  print

Splitting:

str s/ /

[edit] REBOL

rebol [
Title: "Regular Expression Matching"
Author: oofoe
Date: 2009-12-06
URL: http://rosettacode.org/wiki/Regular_expression_matching
]

 
string: "This is a string."
 
; REBOL doesn't use a conventional Perl-compatible regular expression
; syntax. Instead, it uses a variant Parsing Expression Grammar with
; the 'parse' function. It's also not limited to just strings. You can
; define complex grammars that actually parse and execute program
; files.
 
; Here, I provide a rule to 'parse' that specifies searching through
; the string until "string." is found, then the end of the string. If
; the subject string satisfies the rule, the expression will be true.
 
if parse string [thru "string." end] [
print "Subject ends with 'string.'"]
 
; For replacement, I take advantage of the ability to call arbitrary
; code when a pattern is matched -- everything in the parens will be
; executed when 'to " a "' is satisfied. This marks the current string
; location, then removes the offending word and inserts the replacement.
 
parse string [
to " a " ; Jump to target.
mark: (
remove/part mark 3 ; Remove target.
mark: insert mark " another " ; Insert replacement.
)
:mark ; Pick up where I left off.
]
print [crlf "Parse replacement:" string]
 
; For what it's worth, the above operation is more conveniently done
; with the 'replace' function:
 
replace string " another " " a " ; Change string back.
print [crlf "Replacement:" string]

Output:

Subject ends with 'string.'

Parse replacement: This is another string.

Replacement: This is a string.

[edit] REXX

Rexx does not directly support the use of regular expressions as part of the language.
However, some rexx interpreters offer support for regular expressions via external function libraries or
through implementation specific extensions.

It is also possible to emulate regular expressions through appropriate coding techniques.

All of the following REXX examples are modeled after the Perl examples.
testing

/*REXX program demonstrates   testing      (modeled after Perl example).*/
$string = "I am a string"
say 'The string is:' $string
x = "string"
if right($string,length(x))=x then say 'It ends with:' x
 
y = "You"
if left($string,length(y))\=y then say 'It does not start with:' y
 
z = "ring"
if pos(z,$string)\==0 then say 'It contains the string:' z
 
z = "ring"
if wordpos(z,$string)==0 then say 'It does not contain the word:' z
/*stick a fork in it, we're done.*/

output

The string is: I am a string
It ends with: string
It does not start with: You
It contains the string: ring
It does not contain the word: ring

substitution   (destructive)

/*REXX program demonstrates  substitution  (modeled after Perl example).*/
$string = "I am a string"
old = " a "
new = " another "
say 'The original string is:' $string
say 'old word is:' old
say 'new word is:' new
$string = changestr(old,$string,new)
say 'The changed string is:' $string
/*stick a fork in it, we're done.*/

output

The original string is: I am a string
old  word  is:  a
new  word  is:  another
The  changed string is: I am another string

substitution   (non-destructive)

/*REXX program shows  non-destructive sub. (modeled after Perl example).*/
$string = "I am a string"
old = " a "
new = " another "
say 'The original string is:' $string
say 'old word is:' old
say 'new word is:' new
$string2 = changestr(old,$string,new)
say 'The original string is:' $string
say 'The changed string is:' $string2
/*stick a fork in it, we're done.*/

output

The original string is: I am a string
old  word  is:  a
new  word  is:  another
The original string is: I am a string
The  changed string is: I am another string

test and substitute

/*REXX program shows  test and substitute  (modeled after Perl example).*/
$string = "I am a string"
old = " am "
new = " was "
say 'The original string is:' $string
say 'old word is:' old
say 'new word is:' new
 
if wordpos(old,$string)\==0 then
do
$string = changestr(old,$string,new)
say 'I was able to find and replace ' old " with " new
end
/*stick a fork in it, we're done.*/

output

The original string is: I am a string
old  word  is:  am
new  word  is:  was
I was able to find and replace   am   with   was

Some older REXXes don't have a changestr BIF (built-in function), so one is included here: CHANGESTR.REX.

[edit] Ruby

Test

str = "I am a string"
p "Ends with 'string'" if str =~ /string$/
p "Does not start with 'You'" unless str =~ /^You/

Substitute

str.sub(/ a /, ' another ') #=> "I am another string"
# Or:
str[/ a /] = ' another ' #=> "another"
str #=> "I am another string"

Substitute using block

str.gsub(/\bam\b/) { |match| match.upcase } #=> "I AM a string"

[edit] Run BASIC

string$ = "I am a string"
if right$(string$,6) = "string" then print "'";string$;"' ends with 'string'"
i = instr(string$,"am")
string$ = left$(string$,i - 1) + "was" + mid$(string$,i + 2)
print "replace 'am' with 'was' = ";string$
 
Output:
'I am a string' ends with 'string'
replace 'am' with 'was' = I was a string

[edit] Sather

Sather understands POSIX regular expressions.

class MAIN is
-- we need to implement the substitution
regex_subst(re:REGEXP, s, sb:STR):STR is
from, to:INT;
re.match(s, out from, out to);
if from = -1 then return s; end;
return s.head(from) + sb + s.tail(s.size - to);
end;
 
main is
s ::= "I am a string";
re ::= REGEXP::regexp("string$", true);
if re.match(s) then
#OUT + "'" + s + "'" + " ends with 'string'\n";
end;
if ~REGEXP::regexp("^You", false).match(s) then
#OUT + "'" + s + "'" + " does not begin with 'You'\n";
end;
#OUT + regex_subst(re, s, "integer") + "\n";
#OUT + regex_subst(REGEXP::regexp("am +a +st", true), s, "get the ") + "\n";
end;
end;

[edit] Scala

Library: Scala

Define

val Bottles1 = "(\\d+) bottles of beer".r                                            // syntactic sugar
val Bottles2 = """(\d+) bottles of beer""".r // using triple-quotes to preserve backslashes
val Bottles3 = new scala.util.matching.Regex("(\\d+) bottles of beer") // standard
val Bottles4 = new scala.util.matching.Regex("""(\d+) bottles of beer""", "bottles") // with named groups

Search and replace with string methods:

"99 bottles of beer" matches "(\\d+) bottles of beer" // the full string must match
"99 bottles of beer" replace ("99", "98") // literal (non-regex) replacement of every occurrence
"99 bottles of beer" replaceAll ("b", "B") // regex replacement of every match

Search with regex methods:

"\\d+".r findFirstIn "99 bottles of beer" // returns first partial match, or None
"\\w+".r findAllIn "99 bottles of beer" // returns all partial matches as an iterator
"\\s+".r findPrefixOf "99 bottles of beer" // returns a matching prefix, or None
Bottles4 findFirstMatchIn "99 bottles of beer" // returns a "Match" object, or None
Bottles4 findPrefixMatchOf "99 bottles of beer" // same thing, for prefixes
val bottles = (Bottles4 findFirstMatchIn "99 bottles of beer").get.group("bottles") // Getting a group by name

Using pattern matching with regex:

val Some(bottles) = Bottles4 findPrefixOf "99 bottles of beer" // throws a MatchError if the match fails; only a prefix of the string needs to match
for {
line <- """|99 bottles of beer on the wall
|99 bottles of beer
|Take one down, pass it around
|98 bottles of beer on the wall""".stripMargin.lines
} line match {
case Bottles1(bottles) => println("There are still "+bottles+" bottles.") // full string must match, so this will match only once
case _ =>
}
for {
matched <- "(\\w+)".r findAllIn "99 bottles of beer" matchData // matchData converts to an Iterator of Match
} println("Matched from "+matched.start+" to "+matched.end)

Replacing with regex:

Bottles2 replaceFirstIn ("99 bottles of beer", "98 bottles of beer")
Bottles3 replaceAllIn ("99 bottles of beer", "98 bottles of beer")

[edit] Shiny

str: 'I am a string'

Match text:

if str.match ~string$~
say "Ends with 'string'"
end

Replace text:

say str.alter ~ a ~ 'another'

[edit] Slate

This library is still in its early stages. There isn't currently a feature to replace a substring.

 
'http://slatelanguage.org/test/page?query' =~ '^(([^:/?#]+)\\:)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?'.
 
" ==> {'http:'. 'http'. '//slatelanguage.org'. 'slatelanguage.org'. '/test/page'. '?query'. 'query'. Nil} "
 

[edit] Smalltalk

|re s s1|
re := Regex fromString: '[a-z]+ing'.
s := 'this is a matching string'.
s1 := 'this does not match'.
 
(s =~ re)
ifMatched: [ :b |
b match displayNl
].
(s1 =~ re)
ifMatched: [ :b |
'Strangely matched!' displayNl
]
ifNotMatched: [
'no match!' displayNl
].
 
(s replacingRegex: re with: 'modified') displayNl.


[edit] SNOBOL4

In SNOBOL4, patterns are based not on regular expressions, but are a native datatype which can be constructed, manipulated, concatenated, used in pattern expressions, stored into variables, and so forth. Patterns can be constructed ahead of time and saved in variables, and those preconstructed patterns can also reference additional pattern and data items which won't be known until actual pattern match time. Patterns can define calls to functions which will be called during actual pattern matching, and whose outcome can affect how the pattern match continues, which tentative matches will and won't be accepted, and so forth.

SNOBOL4 pattern matching is thus hugely more capable than traditional regular expressions are. An example of a pattern matching problem that would be prohibitively difficult to create as a regular expression would be to "create a pattern which matches a complete name and international postal mailing address."

SNOBOL4's "raison d'etre" is pattern matching and string manipulation (although it is also strong in data structures). The basic statement syntax in SNOBOL4 is:

label   subject  pattern  =  object          :(goto)

The basic operation is to evaluate the subject, evaluate the pattern, find the pattern in the subject, evaluate the object, and then replace the portion of the subject matched by the pattern with the evaluated object. If any of those steps fails (i.e. does not succeed) then execution continues with the goto, as appropriate.

The goto can be unconditional, or can be based on whether the statement succeeded or failed (and that is the basis for all explicit transfers of control in SNOBOL4). This example finds the string "SNOBOL4" in string variable string1, and replaces it with "new SPITBOL" (SPITBOL is an implementation of SNOBOL4, basically SPITBOL is to SNOBOL4 what Turbo Pascal is to Pascal):

     string1 = "The SNOBOL4 language is designed for string manipulation."
     string1 "SNOBOL4" = "new SPITBOL"  :s(changed)f(nochange)

The following example replaces "diameter is " and a numeric value by "circumference is " and the circumference instead (it also shows creation of a pattern which matches integer or real numeric values, and storing that pattern into a variable... and then using that pattern variable later in a slightly more complicated pattern expression):

     pi = 3.1415926
     dd = "0123456789"
     string1 = "For the first circle, the diameter is 2.5 inches."
     numpat = span(dd) (("." span(dd)) | null)
     string1 "diameter is " numpat . diam = "circumference is " diam * pi
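
For readers more used to regular expressions, the same diameter-to-circumference replacement can be approximated in Python with a replacement function. This is a rough sketch rather than an equivalent of the SNOBOL4 pattern: the function name is hypothetical, and the result is formatted to four decimal places (an assumption made here so that the printed value is predictable).

```python
import re

PI = 3.1415926  # same approximation as in the SNOBOL4 example

# Integer or real number: digits, optionally followed by "." and more digits
NUMBER = r"\d+(?:\.\d+)?"

def diameter_to_circumference(text):
    """Replace 'diameter is N' by 'circumference is N*pi' (first occurrence only)."""
    def repl(match):
        diam = float(match.group(1))
        return "circumference is {:.4f}".format(diam * PI)
    return re.sub(r"diameter is (" + NUMBER + r")", repl, text, count=1)

print(diameter_to_circumference("For the first circle, the diameter is 2.5 inches."))
# For the first circle, the circumference is 7.8540 inches.
```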

Relatively trivial pattern matching and replacements can be attacked very effectively using regular expressions, but regular expressions (while ubiquitous) are a crippling limitation for more complicated pattern matching problems.

[edit] Standard ML

There is no regex support in the Basis Library; however, various implementations have their own support.

Works with: SML/NJ

Test

CM.make "$/regexp-lib.cm";
structure RE = RegExpFn (
structure P = AwkSyntax
structure E = BackTrackEngine);
val re = RE.compileString "string$";
val string = "I am a string";
case StringCvt.scanString (RE.find re) string
of NONE => print "match failed\n"
| SOME match =>
let
val {pos, len} = MatchTree.root match
in
print ("matched at position " ^ Int.toString pos ^ "\n")
end;

[edit] Tcl

Test using regexp:

set theString "I am a string"
if {[regexp -- {string$} $theString]} {
puts "Ends with 'string'"
}
 
if {![regexp -- {^You} $theString]} {
puts "Does not start with 'You'"
}

Extract substring using regexp

set theString "This string has >123< a number in it"
if {[regexp -- {>(\d+)<} $theString -> number]} {
puts "Contains the number $number"
}

Substitute using regsub

set theString "I am   a   string"
puts [regsub -- { +a +} $theString { another }]

[edit] Toka

Toka's regular expression library allows for matching, but does not yet provide for replacing elements within strings.

#! Include the regex library
needs regex
 
#! The two test strings
" This is a string" is-data test.1
" Another string" is-data test.2
 
#! Create a new regex named 'expression' which tries
#! to match strings beginning with 'This'.
" ^This" regex: expression
 
#! An array to store the results of the match
#! (Element 0 = starting offset, Element 1 = ending offset of match)
2 cells is-array match
 
#! Try both test strings against the expression.
#! try-regex will return a flag. -1 is TRUE, 0 is FALSE
expression test.1 2 match try-regex .
expression test.2 2 match try-regex .

[edit] TXR

[edit] Search and replace: simple

TXR is not designed for sed-like filtering, but here is how to do sed -e 's/dog/cat/g':

@(collect)
@(coll :gap 0)@mismatch@{match /dog/}@(end)@suffix
@(output)
@(rep)@{mismatch}cat@(end)@suffix
@(end)
@(end)

How it works is that the body of the coll uses a double-variable match: an unbound variable followed by a regex-match variable. The meaning of this combination is, "Search for the regular expression, and if successful, then bind all the characters which were skipped over by the search to the first variable, and the matching text to the second variable." So we collect pairs: pieces of mismatching text, and pieces of text which match the regex dog. At the end, there is usually going to be a piece of text which does not match the body, because it has no match for the regex. Because :gap 0 is specified, the coll construct will terminate when faced with this nonmatching text, rather than skipping it in a vain search for a match, which allows @suffix to take on this trailing text.

To output the substitution, we simply spit out the mismatching texts followed by the replacement text, and then add the suffix.
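
The collect-pairs-then-output strategy described above can be mimicked in Python to make the data flow explicit. This is only an illustrative sketch of the idea, not of TXR's semantics; the function names collect_pairs and substitute and the sample line are hypothetical.

```python
import re

def collect_pairs(pattern, line):
    """Return ([(skipped_text, matched_text), ...], suffix) for one line."""
    pairs = []
    pos = 0
    for m in re.finditer(pattern, line):
        # Text skipped over by the search, then the text that matched
        pairs.append((line[pos:m.start()], m.group(0)))
        pos = m.end()
    # Trailing text that contains no further match
    return pairs, line[pos:]

def substitute(pattern, replacement, line):
    """Emit each skipped piece followed by the replacement, then the suffix."""
    pairs, suffix = collect_pairs(pattern, line)
    return "".join(skipped + replacement for skipped, _ in pairs) + suffix

print(collect_pairs("dog", "hot dog, dogma, bulldog!"))
# ([('hot ', 'dog'), (', ', 'dog'), ('ma, bull', 'dog')], '!')
print(substitute("dog", "cat", "hot dog, dogma, bulldog!"))
# hot cat, catma, bullcat!
```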

[edit] Search and replace: strip comments from C source

Based on the technique of the previous example, here is a query for stripping C comments from a source file, replacing them by a space. Here, the "non-greedy" version of the regex Kleene operator is used, denoted by %. This allows for a very simple, straightforward regex which correctly matches C comments. The freeform operator allows the entire input stream to be treated as one big line, so this works across multi-line comments.

@(freeform)
@(coll :gap 0)@notcomment@{comment /[/][*].%[*][/]/}@(end)@tail
@(output)
@(rep)@notcomment @(end)@tail
@(end)
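
The same non-greedy idea can be sketched with Python's re module, where `.*?` is the non-greedy form of `.*` and re.DOTALL plays the role of treating the input as one big line. Like the query above, this sketch assumes comment delimiters never appear inside string literals; the function name strip_c_comments is hypothetical.

```python
import re

# "/*", then as few characters as possible (newlines included), then "*/"
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

def strip_c_comments(source):
    """Replace every C block comment by a single space."""
    return COMMENT.sub(" ", source)

source = "int x; /* first\n comment */ int y; /* second */\n"
print(repr(strip_c_comments(source)))
# 'int x;   int y;  \n'
```

A greedy `.*` would instead match from the first `/*` to the last `*/` and swallow the declaration of y along with both comments.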

[edit] Regexes in TXR Lisp

Parse regex at run time to abstract syntax:

$ txr -p '(regex-parse "a.*b")'
(compound #\a (0+ wild) #\b)

Dynamically compile regex abstract syntax to regex object:

$ txr -p "(regex-compile '(compound #\a (0+ wild) #\b))"
#<sys:regex: 9c746d0>

Search replace with regsub.

$ txr -p '(regsub #/a+/ "-" "baaaaaad")'
"b-d"

[edit] Vala

 
void main(){
string sentence = "This is a sample sentence.";
 
Regex a = new Regex("s[ai]mple"); // if using \n type expressions, use triple " for string literals as easy method to escape them
 
if (a.match(sentence)){
stdout.printf("\"%s\" is in \"%s\"!\n", a.get_pattern(), sentence);
}
 
string sentence_replacement = "cat";
sentence = a.replace(sentence, sentence.length, 0, sentence_replacement);
stdout.printf("Replaced sentence is: %s\n", sentence);
}
 

Output:

"s[ai]mple" is in "This is a sample sentence."!
Replaced sentence is: This is a cat sentence.

[edit] Vedit macro language

Vedit can perform searches and matching with either regular expressions, pattern matching codes or plain text. These examples use regular expressions.

Match text at cursor location:

if (Match(".* string$", REGEXP)==0) {
Statline_Message("This line ends with 'string'")
}

Search for a pattern:

if (Search("string$", REGEXP+NOERR)) {
Statline_Message("'string' at end of line found")
}

Replace:

Replace(" a ", " another ", REGEXP+NOERR)

[edit] Web 68

@1Introduction.
Web 68 has access to a regular expression module which can compile regular expressions,
use them for matching strings, and replace strings with the matched string.
 
@a@<Compiler prelude@>
BEGIN
@<Declarations@>
@<Logic at the top level@>
END
@<Compiler postlude@>
 
@ The local compiler requires a special prelude.
 
@<Compiler prel...@>=
PROGRAM rosettacode regex CONTEXT VOID
USE regex,standard
 
@ And a special postlude.
 
@<Compiler post...@>=
FINISH
 
@1Regular expressions.
Compile a regular expression and match a string using it.
 
@<Decl...@>=
STRING regexp="string$";
REF REGEX rx=rx compile(regexp);
 
@ Declare a string for the regular expression to match.
 
@<Decl...@>=
STRING to match = "This is a string";
 
@ Define a routine to print the result of matching.
 
@<Decl...@>=
OP MATCH = (REF REGEX rx,STRING match)STRING:
IF rx match(rx,match,LOC SUBEXP)
THEN "matches"
ELSE "doesn't match"
FI;
 
@ Check whether the regular expression matches the string.
 
@<Logic...@>=
print(("String """,regexp,""" ",rx MATCH to match,
" string """,to match,"""",newline))
 
@ The end.
This program is processed by tang to produce Algol 68 code which has to be compiled by the a68toc compiler.
Its output is then compiled by gcc to produce a binary program. The script 'ca' provided with the Debian
package algol68toc requires the following command to process this program.
ca -l mod rosettacoderegex.w68
That's it. The resulting binary will print
'String "string$" matches string "This is a string"'

[edit] zkl

Strings are immutable so replacement is creation

var re=RegExp(".*string$");
re.matches("I am a string") //-->True
var s="I am a string thing"
re=RegExp("(string)")
re.search(s,True) //-->True
p,n:=re.matched[0] //--> L(L(7,6),"string")
String(s[0,p],"FOO",s[p+n,*]) //-->"I am a FOO thing"