Regular expression matching
From Rosetta Code
Programming Task
This is a programming task. It lays out a problem which Rosetta Code users are encouraged to solve, using languages they know.
The goal of this task is
- to match a string against a regular expression
- to substitute part of a string using a regular expression
AppleScript
Library: Satimage.osax
try
    find text ".*string$" in "I am a string" with regexp
on error message
    return message
end try
try
    change "original" into "modified" in "I am the original string" with regexp
on error message
    return message
end try
C++
Works with: g++ version 4.0.2
Library: Boost
#include <iostream>
#include <string>
#include <iterator>
#include <boost/regex.hpp>
int main()
{
    boost::regex re(".* string$");
    std::string s = "Hi, I am a string";
    // match the complete string
    if (boost::regex_match(s, re))
        std::cout << "The string matches.\n";
    else
        std::cout << "Oops - not found?\n";
    // match a substring
    boost::regex re2(" a.*a");
    boost::smatch match;
    if (boost::regex_search(s, match, re2))
    {
        std::cout << "Matched " << match.length()
                  << " characters starting at " << match.position() << ".\n";
        std::cout << "Matched character sequence: \""
                  << match.str() << "\"\n";
    }
    else
    {
        std::cout << "Oops - not found?\n";
    }
    // replace a substring
    std::string dest_string;
    boost::regex_replace(std::back_inserter(dest_string),
                         s.begin(), s.end(),
                         re2,
                         "'m now a changed");
    std::cout << dest_string << std::endl;
}
C#
Works with: .NET version 2.0+
using System;
using System.Text.RegularExpressions;
string str = "I am a clever string";
string pattern = ".*clever.*";
// Test with a stored Regex object
Regex regex = new Regex(pattern);
if (regex.IsMatch(str)) {
    Console.WriteLine("The string contains clever");
}
// Test with the static helper, without storing a Regex object
if (Regex.IsMatch(str, pattern)) {
    Console.WriteLine("A more clever way to detect that the string contains clever");
}
D
import std.stdio, std.regexp;
void main() {
    string s = "I am a string";
    // Test:
    if (search(s, r"string$"))
        writefln("Ends with 'string'");
    // Test, storing the regular expression:
    auto re1 = RegExp(r"string$");
    if (re1.search(s).test)
        writefln("Ends with 'string'");
    // Substitute:
    writefln(sub(s, " a ", " another "));
    // Substitute, storing the regular expression:
    auto re2 = RegExp(" a ");
    writefln(re2.replace(s, " another "));
}
Note that std.string provides plain string functions that perform these particular operations faster than a regular expression does.
Haskell
Test
import Text.Regex
str = "I am a string"
case matchRegex (mkRegex ".*string$") str of
  Just _  -> putStrLn $ "ends with 'string'"
  Nothing -> return ()
Substitute
import Text.Regex
orig = "I am the original string"
result = subRegex (mkRegex "original") orig "modified"
putStrLn $ result
J
J's regex support is built on top of PCRE.
load'regex'             NB. Load regex library
str =: 'I am a string'  NB. String used in examples.
Matching:
'.*string$' rxeq str    NB. 1 is true, 0 is false
1
Substitution:
('am';'am still') rxrplc str
I am still a string
Java
Works with: Java version 1.5+
Test
String str = "I am a string";
if (str.matches(".*string$")) {
    System.out.println("ends with 'string'");
}
Substitute
String orig = "I am the original string";
String result = orig.replaceAll("original", "modified");
// result is now "I am the modified string"
JavaScript
Test/Match
var subject = "Hello world!";
// Two different ways to create the RegExp object
// Both examples use the exact same pattern... matching "hello"
var re_PatternToMatch = /Hello (World)/i; // creates a RegExp literal with case-insensitivity
var re_PatternToMatch2 = new RegExp("Hello (World)", "i");
// Test for a match - return a bool
var isMatch = re_PatternToMatch.test(subject);
// Get the match details
// Returns an array with the match's details
// matches[0] == "Hello world"
// matches[1] == "world"
var matches = re_PatternToMatch2.exec(subject);
Substitute
var subject = "Hello world!";
// Perform a string replacement
// newSubject == "Replaced!"
var newSubject = subject.replace(re_PatternToMatch, "Replaced");
OCaml
Test
#load "str.cma";;
let str = "I am a string";;
try
  ignore(Str.search_forward (Str.regexp ".*string$") str 0);
  print_endline "ends with 'string'"
with Not_found -> ()
;;
Substitute
#load "str.cma";;
let orig = "I am the original string";;
let result = Str.global_replace (Str.regexp "original") "modified" orig;;
(* result is now "I am the modified string" *)
Perl
Works with: Perl version 5.8.8
Test
$string = "I am a string";
if ($string =~ /string$/) {
    print "Ends with 'string'\n";
}
if ($string !~ /^You/) {
    print "Does not start with 'You'\n";
}
Substitute
$string = "I am a string";
$string =~ s/ a / another /; # makes "I am a string" into "I am another string"
print $string;
Test and Substitute
$string = "I am a string";
if ($string =~ s/\bam\b/was/) { # \b is a word boundary
    print "I was able to find and replace 'am' with 'was'\n";
}
Options
# add the following just after the last / for additional control
# g = globally (replace every match, not just the first)
# i = case-insensitive
# s = single-line mode (. also matches newlines, in case you have line breaks in the content)
# m = multi-line mode (^ and $ match at the start and end of each line)
$string =~ s/i/u/ig; # would change "I am a string" into "u am a strung"
PHP
Works with: PHP version 5.2.0
$string = 'I am a string';
Test
if (preg_match('/string$/', $string))
{
    echo "Ends with 'string'\n";
}
Replace
$string = preg_replace('/\ba\b/', 'another', $string);
echo "Found 'a' and replaced it with 'another', resulting in this string: $string\n";
Python
Works with: Python version 2.5
Setup
import re
str = 'I am a string'
Test
if re.search(r'string$', str):
    print "Ends with 'string'"
Test, storing the compiled regular expression in a variable
regex = re.compile(r'string$')
if regex.search(str):
    print "Ends with 'string'"
To find all matches rather than just the first match, use re.findall rather than re.search.
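The difference can be sketched as follows. The sample string `text` is a hypothetical one, chosen so that the pattern occurs twice; it is not part of the example above. Each print call takes a single parenthesised argument so that the sketch should also run on newer Python versions.

```python
import re

# Hypothetical sample text in which the pattern occurs twice.
text = 'a string, another string'

# re.search stops at the first match and returns a match object (or None).
first = re.search(r'string', text)
first_start = first.start()

# re.findall returns a list of every non-overlapping match.
all_matches = re.findall(r'string', text)

print(first_start)   # 2
print(all_matches)   # ['string', 'string']
```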
Substitute
str = re.sub(r' a ', ' another ', str)
All instances of the specified pattern are replaced. To limit the number of instances replaced, specify the fourth argument to sub, the maximum number of replacements. To make a case-insensitive replacement, place (?i) at the beginning of the regular expression.
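A minimal sketch of these three behaviours, assuming a hypothetical sample string `text` that contains the word "a" in both upper and lower case:

```python
import re

# Hypothetical sample text with three "a" words in mixed case.
text = 'A cat, a dog, a bird'

# Default: every match is replaced. The pattern is case-sensitive,
# so the leading "A" is left alone.
replaced_all = re.sub(r'\ba\b', 'one', text)

# Fourth argument (count): replace at most one match.
replaced_once = re.sub(r'\ba\b', 'one', text, 1)

# (?i) at the start of the pattern makes the match case-insensitive.
replaced_nocase = re.sub(r'(?i)\ba\b', 'one', text)

print(replaced_all)     # A cat, one dog, one bird
print(replaced_once)    # A cat, one dog, a bird
print(replaced_nocase)  # one cat, one dog, one bird
```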
Substitute, storing the compiled regular expression in a variable
regex = re.compile(r' a ')
str = regex.sub(' another ', str)
Note: re.match() and regex.match() imply a "^" at the beginning of the regular expression. re.search() and regex.search() do not.
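As a short illustration of that note, the sketch below applies both functions to the same string. The variable names are chosen for this illustration only.

```python
import re

s = 'I am a string'

# re.match only succeeds if the pattern matches at position 0.
match_at_start = re.match(r'I am', s)     # match object
match_later = re.match(r'string', s)      # None: 'string' is not at the start

# re.search scans the whole string for the first match.
search_later = re.search(r'string', s)

print(match_later is None)      # True
print(search_later.start())     # 7
```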
Raven
'i am a string' as str
Match:
str m/string$/ if "Ends with 'string'\n" print
Replace:
str r/ a / another / print
Ruby
Test
string="I am a string"
puts "Ends with 'string'" if string[/string$/]
puts "Does not start with 'You'" if !string[/^You/]
Substitute
puts string.gsub(/ a /,' another ')
#or
string[/ a /]=' another '
puts string
Substitute using block
puts(string.gsub(/\bam\b/) do |match|
  puts "I found #{match}"
  #place "was" instead of the match
  "was"
end)
Tcl
Test
set theString "I am a string"
if {[regexp -- {string$} $theString]} {
    puts "Ends with 'string'\n"
}
if {![regexp -- {^You} $theString]} {
    puts "Does not start with 'You'\n"
}
Substitute
set theString "I am a string"
puts [regsub -- { a } $theString { another }]
Toka
Toka's regular expression library allows for matching, but does not yet provide for replacing elements within strings.
#! Include the regex library
needs regex
#! The two test strings
" This is a string" is-data test.1
" Another string" is-data test.2
#! Create a new regex named 'expression' which tries
#! to match strings beginning with 'This'.
" ^This" regex: expression
#! An array to store the results of the match
#! (Element 0 = starting offset, Element 1 = ending offset of match)
2 cells is-array match
#! Try both test strings against the expression.
#! try-regex will return a flag. -1 is TRUE, 0 is FALSE
expression test.1 2 match try-regex .
expression test.2 2 match try-regex .

