String comparison
Basic Data Operation
This is a basic data operation. It represents a fundamental action on a basic data type.
The task is to demonstrate how to compare two strings from within the language and how to achieve a lexical comparison. The task should demonstrate:
- Comparing two strings for exact equality
- Comparing two strings for inequality (i.e., the inverse of exact equality)
- Comparing two strings to see if one is lexically lower than the other
- Comparing two strings to see if one is lexically higher than the other
- How to achieve both case sensitive comparisons and case insensitive comparisons within the language
- How the language handles comparison of numeric strings if these are not treated lexically
- Demonstrate the other kinds of string comparisons that the language provides. For example, demonstrate the difference between generic comparison and coercive comparison if your language supports such a distinction.
AWK
<lang awk>BEGIN {
  a = "BALL"
  b = "BELL"

  # awk keywords are case sensitive, so "if" must be lowercase
  if (a == b) { print "The strings are equal" }
  if (a != b) { print "The strings are not equal" }
  if (a  > b) { print "The first string is lexically higher than the second" }
  if (a  < b) { print "The first string is lexically lower than the second" }
  if (a >= b) { print "The first string is not lexically lower than the second" }
  if (a <= b) { print "The first string is not lexically higher than the second" }
}</lang>
BASIC
<lang basic>10 LET A$="BELL"
20 LET B$="BELT"
30 IF A$ = B$ THEN PRINT "THE STRINGS ARE EQUAL": REM TEST FOR EQUALITY
40 IF A$ <> B$ THEN PRINT "THE STRINGS ARE NOT EQUAL": REM TEST FOR INEQUALITY
50 IF A$ > B$ THEN PRINT A$;" IS LEXICALLY HIGHER THAN ";B$: REM TEST FOR LEXICALLY HIGHER
60 IF A$ < B$ THEN PRINT A$;" IS LEXICALLY LOWER THAN ";B$: REM TEST FOR LEXICALLY LOWER
70 IF A$ <= B$ THEN PRINT A$;" IS NOT LEXICALLY HIGHER THAN ";B$
80 IF A$ >= B$ THEN PRINT A$;" IS NOT LEXICALLY LOWER THAN ";B$
90 END</lang>
Burlesque
<lang burlesque>blsq ) "abc""abc"==
1
blsq ) "abc""abc"!=
0
blsq ) "abc""Abc"cm
1
blsq ) "ABC""Abc"cm
-1</lang>
cm performs a three-way comparison that returns 1, 0 or -1, in the manner of C's strcmp. == tests for equality and != tests for inequality.
J
Solution:
The primitive -: can be used to determine whether two strings are equivalent, but J doesn't have other built-in lexical comparison operators. They can be defined as follows:
<lang j>eq=: -: NB. equal
ne=: -.@-: NB. not equal
gt=: {.@/:@,&boxopen *. ne NB. lexically greater than
lt=: -.@{.@/:@,&boxopen *. ne NB. lexically less than
ge=: {.@/:@,&boxopen +. eq NB. lexically greater than or equal to
le=: -.@{.@/:@,&boxopen NB. lexically less than or equal to</lang>
Usage:
<lang j>   'ball' (eq , ne , gt , lt , ge , le) 'bell'
0 1 0 1 0 1
   'ball' (eq , ne , gt , lt , ge , le) 'ball'
1 0 0 0 1 1
   'YUP' (eq , ne , gt , lt , ge , le) 'YEP'
0 1 1 0 1 0</lang>
Perl 6
Perl 6 uses strong typing dynamically (and gradual typing statically), but normal string and numeric comparisons are coercive. (You may use generic comparison operators if you want polymorphic comparison, but usually you don't. :)
<lang perl6>sub compare($a,$b) {
    # Label each value with its type name for the messages below
    my $A = "{$a.WHAT.^name} '$a'";
    my $B = "{$b.WHAT.^name} '$b'";

    # Coercive string comparisons
    if $a eq $b { say "$A and $B are lexically equal" }
    if $a ne $b { say "$A and $B are not lexically equal" }

    if $a gt $b { say "$A is lexically after $B" }
    if $a lt $b { say "$A is lexically before $B" }

    if $a ge $b { say "$A is not lexically before $B" }
    if $a le $b { say "$A is not lexically after $B" }

    # Identity and generic (type-aware) comparisons
    if $a === $b { say "$A and $B are identical objects" }
    if $a !=== $b { say "$A and $B are not identical objects" }

    if $a eqv $b { say "$A and $B are generically equal" }
    if $a !eqv $b { say "$A and $B are not generically equal" }

    if $a before $b { say "$A is generically before $B" }
    if $a after $b { say "$A is generically after $B" }

    if $a !after $b { say "$A is not generically after $B" }
    if $a !before $b { say "$A is not generically before $B" }

    # Three-way comparisons
    say "The lexical relationship of $A and $B is { $a leg $b }" if $a ~~ Stringy;
    say "The generic relationship of $A and $B is { $a cmp $b }";
    say "The numeric relationship of $A and $B is { $a <=> $b }" if $a ~~ Numeric;
    say "";
}

compare 'YUP', 'YUP';
compare 'BALL', 'BELL';
compare 24, 123;
compare 5.1, 5;
compare 5.1e0, 5 + 1/10;</lang>
- Output:
Str 'YUP' and Str 'YUP' are lexically equal
Str 'YUP' is not lexically before Str 'YUP'
Str 'YUP' is not lexically after Str 'YUP'
Str 'YUP' and Str 'YUP' are identical objects
Str 'YUP' and Str 'YUP' are generically equal
Str 'YUP' is not generically after Str 'YUP'
Str 'YUP' is not generically before Str 'YUP'
The lexical relationship of Str 'YUP' and Str 'YUP' is Same
The generic relationship of Str 'YUP' and Str 'YUP' is Same

Str 'BALL' and Str 'BELL' are not lexically equal
Str 'BALL' is lexically before Str 'BELL'
Str 'BALL' is not lexically after Str 'BELL'
Str 'BALL' and Str 'BELL' are not identical objects
Str 'BALL' and Str 'BELL' are not generically equal
Str 'BALL' is generically before Str 'BELL'
Str 'BALL' is not generically after Str 'BELL'
The lexical relationship of Str 'BALL' and Str 'BELL' is Increase
The generic relationship of Str 'BALL' and Str 'BELL' is Increase

Int '24' and Int '123' are not lexically equal
Int '24' is lexically after Int '123'
Int '24' is not lexically before Int '123'
Int '24' and Int '123' are not identical objects
Int '24' and Int '123' are not generically equal
Int '24' is generically before Int '123'
Int '24' is not generically after Int '123'
The generic relationship of Int '24' and Int '123' is Increase
The numeric relationship of Int '24' and Int '123' is Increase

Rat '5.1' and Int '5' are not lexically equal
Rat '5.1' is lexically after Int '5'
Rat '5.1' is not lexically before Int '5'
Rat '5.1' and Int '5' are not identical objects
Rat '5.1' and Int '5' are not generically equal
Rat '5.1' is generically after Int '5'
Rat '5.1' is not generically before Int '5'
The generic relationship of Rat '5.1' and Int '5' is Decrease
The numeric relationship of Rat '5.1' and Int '5' is Decrease

Num '5.1' and Rat '5.1' are lexically equal
Num '5.1' is not lexically before Rat '5.1'
Num '5.1' is not lexically after Rat '5.1'
Num '5.1' and Rat '5.1' are not identical objects
Num '5.1' and Rat '5.1' are not generically equal
Num '5.1' is not generically after Rat '5.1'
Num '5.1' is not generically before Rat '5.1'
The generic relationship of Num '5.1' and Rat '5.1' is Same
The numeric relationship of Num '5.1' and Rat '5.1' is Same
Python
Note that Python is strongly typed. The string '24' is never coerced to a number (or vice versa).
<lang python>def compare(a, b):
    print("\n%r is of type %r and %r is of type %r"
          % (a, type(a), b, type(b)))
    if a < b:
        print('%r is strictly less than %r' % (a, b))
    if a <= b:
        print('%r is less than or equal to %r' % (a, b))
    if a > b:
        print('%r is strictly greater than %r' % (a, b))
    if a >= b:
        print('%r is greater than or equal to %r' % (a, b))
    if a == b:
        print('%r is equal to %r' % (a, b))
    if a != b:
        print('%r is not equal to %r' % (a, b))
    if a is b:
        print('%r has object identity with %r' % (a, b))
    if a is not b:
        print('%r has negated object identity with %r' % (a, b))

compare('YUP', 'YUP')
compare('BALL', 'BELL')
compare('24', '123')
compare(24, 123)
compare(5.0, 5)</lang>
- Output:
'YUP' is of type <class 'str'> and 'YUP' is of type <class 'str'>
'YUP' is less than or equal to 'YUP'
'YUP' is greater than or equal to 'YUP'
'YUP' is equal to 'YUP'
'YUP' has object identity with 'YUP'

'BALL' is of type <class 'str'> and 'BELL' is of type <class 'str'>
'BALL' is strictly less than 'BELL'
'BALL' is less than or equal to 'BELL'
'BALL' is not equal to 'BELL'
'BALL' has negated object identity with 'BELL'

'24' is of type <class 'str'> and '123' is of type <class 'str'>
'24' is strictly greater than '123'
'24' is greater than or equal to '123'
'24' is not equal to '123'
'24' has negated object identity with '123'

24 is of type <class 'int'> and 123 is of type <class 'int'>
24 is strictly less than 123
24 is less than or equal to 123
24 is not equal to 123
24 has negated object identity with 123

5.0 is of type <class 'float'> and 5 is of type <class 'int'>
5.0 is less than or equal to 5
5.0 is greater than or equal to 5
5.0 is equal to 5
5.0 has negated object identity with 5
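The comparisons above are all case sensitive and lexical. The remaining task items (case-insensitive comparison, numeric strings, and a three-way result) might be sketched as follows. This is a minimal sketch, assuming that str.casefold() is an acceptable notion of case-insensitivity and that the numeric strings hold plain integers. The helper names three_way, equal_ignoring_case and numeric_less are illustrative only and are not part of Python's standard library.

```python
def three_way(a, b):
    """Return -1, 0 or 1 depending on how a orders against b.

    Booleans are integers in Python, so the subtraction yields an int.
    """
    return (a > b) - (a < b)


def equal_ignoring_case(a, b):
    """Case-insensitive equality: both strings are case-folded first."""
    return a.casefold() == b.casefold()


def numeric_less(a, b):
    """Compare two integer strings by value rather than lexically.

    The conversion is explicit; int() raises ValueError for text
    that is not an integer literal.
    """
    return int(a) < int(b)


if __name__ == '__main__':
    print('Ball' == 'BALL')                     # case sensitive
    print(equal_ignoring_case('Ball', 'BALL'))  # case insensitive
    print('24' < '123')                         # lexical: '2' sorts after '1'
    print(numeric_less('24', '123'))            # numeric, after explicit conversion
    print(three_way('BALL', 'BELL'))            # three-way result
```

Case folding is used rather than lower() because it also handles characters such as 'ß', which folds to 'ss'. Locale-aware collation is a separate concern and is not covered by this sketch.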
REXX
<lang rexx>animal = 'dog'
if animal = 'cat' then
  say animal "is lexically equal to cat"
if animal \= 'cat' then
  say animal "is not lexically equal to cat"
if animal > 'cat' then
  say animal "is lexically higher than cat"
if animal < 'cat' then
  say animal "is lexically lower than cat"
if animal >= 'cat' then
  say animal "is not lexically lower than cat"
if animal <= 'cat' then
  say animal "is not lexically higher than cat"
/* The above comparative operators do not consider
   leading and trailing whitespace when making comparisons. */
if ' cat ' = 'cat' then
  say "this will print because whitespace is stripped"
/* To consider all whitespace in a comparison
   we need to use strict comparative operators */
if ' cat ' == 'cat' then
  say "this will not print because comparison is strict"</lang>
Here is a list of the strict comparative operators and their meaning:
- == Strictly Equal To
- \== Strictly Not Equal To
- << Strictly Less Than
- >> Strictly Greater Than
- <<= Strictly Less Than or Equal To
- >>= Strictly Greater Than or Equal To
- \<< Strictly Not Less Than
- \>> Strictly Not Greater Than
Run BASIC
<lang runbasic>a$ = "dog"
b$ = "cat"
if a$ = b$ then print "the strings are equal" ' test for equality
if a$ <> b$ then print "the strings are not equal" ' test for inequality
if a$ > b$ then print a$;" is lexically higher than ";b$ ' test for lexically higher
if a$ < b$ then print a$;" is lexically lower than ";b$ ' test for lexically lower
if a$ <= b$ then print a$;" is not lexically higher than ";b$
if a$ >= b$ then print a$;" is not lexically lower than ";b$
end</lang>
Tcl
The best way to compare two strings in Tcl for equality is with the eq and ne expression operators:
<lang tcl>if {$a eq $b} {
    puts "the strings are equal"
}
if {$a ne $b} {
    puts "the strings are not equal"
}</lang>
The numeric == and != operators also mostly work, but can give somewhat unexpected results when both values look numeric. The string equal command is equally suited to equality-testing (and generates the same bytecode).
For ordering, the < and > operators may be used, but again they are principally numeric operators. For guaranteed string ordering, the result of the string compare command should be used instead (which uses the Unicode codepoints of the string):
<lang tcl>if {[string compare $a $b] < 0} {
    puts "first string lower than second"
}
if {[string compare $a $b] > 0} {
    puts "first string higher than second"
}</lang>
Greater-or-equal and less-or-equal operations can be done by changing what exact comparison is used on the result of string compare.
Tcl also can do a prefix-equal (approximately the same as strncmp() in C) through the use of the -length option:
<lang tcl>if {[string equal -length 3 $x "abc123"]} {
puts "first three characters are equal"
}</lang>
And case-insensitive equality is (orthogonally) enabled through the -nocase option. These options are supported by both string equal and string compare, but not by the expression operators.
UNIX Shell
<lang sh>#!/bin/sh

A=Bell
B=Ball

# Traditional test command implementations test for equality and inequality
# but do not have a lexical comparison facility
if [ "$A" = "$B" ] ; then
  echo 'The strings are equal'
fi
if [ "$A" != "$B" ] ; then
  echo 'The strings are not equal'
fi</lang>