Levenshtein distance
You are encouraged to solve this task according to the task description, using any language you may know.
This page uses content from Wikipedia. The original article was at Levenshtein distance. The list of authors can be seen in the page history. As with Rosetta Code, the text of Wikipedia is available under the GNU FDL. (See links for details on variance) |
In information theory and computer science, the Levenshtein distance is a metric for measuring the amount of difference between two sequences (i.e. an edit distance). The Levenshtein distance between two strings is defined as the minimum number of edits needed to transform one string into the other, with the allowable edit operations being insertion, deletion, or substitution of a single character.
- Example
The Levenshtein distance between "kitten" and "sitting" is 3, since the following three edits change one into the other, and there isn't a way to do it with fewer than three edits:
- kitten sitten (substitution of 'k' with 's')
- sitten sittin (substitution of 'e' with 'i')
- sittin sitting (insert 'g' at the end).
The Levenshtein distance between "rosettacode", "raisethysword" is 8.
The distance between two strings is same as that when both strings are reversed.
- Task
Implements a Levenshtein distance function, or uses a library function, to show the Levenshtein distance between "kitten" and "sitting".
- Related task
- Metrics
- Counting
- Word frequency
- Letter frequency
- Jewels and stones
- I before E except after C
- Bioinformatics/base count
- Count occurrences of a substring
- Count how many vowels and consonants occur in a string
- Remove/replace
- XXXX redacted
- Conjugate a Latin verb
- Remove vowels from a string
- String interpolation (included)
- Strip block comments
- Strip comments from a string
- Strip a set of characters from a string
- Strip whitespace from a string -- top and tail
- Strip control codes and extended characters from a string
- Anagrams/Derangements/shuffling
- Word wheel
- ABC problem
- Sattolo cycle
- Knuth shuffle
- Ordered words
- Superpermutation minimisation
- Textonyms (using a phone text pad)
- Anagrams
- Anagrams/Deranged anagrams
- Permutations/Derangements
- Find/Search/Determine
- ABC words
- Odd words
- Word ladder
- Semordnilap
- Word search
- Wordiff (game)
- String matching
- Tea cup rim text
- Alternade words
- Changeable words
- State name puzzle
- String comparison
- Unique characters
- Unique characters in each string
- Extract file extension
- Levenshtein distance
- Palindrome detection
- Common list elements
- Longest common suffix
- Longest common prefix
- Compare a list of strings
- Longest common substring
- Find common directory path
- Words from neighbour ones
- Change e letters to i in words
- Non-continuous subsequences
- Longest common subsequence
- Longest palindromic substrings
- Longest increasing subsequence
- Words containing "the" substring
- Sum of the digits of n is substring of n
- Determine if a string is numeric
- Determine if a string is collapsible
- Determine if a string is squeezable
- Determine if a string has all unique characters
- Determine if a string has all the same characters
- Longest substrings without repeating characters
- Find words which contains all the vowels
- Find words which contain the most consonants
- Find words which contains more than 3 vowels
- Find words whose first and last three letters are equal
- Find words with alternating vowels and consonants
- Formatting
- Substring
- Rep-string
- Word wrap
- String case
- Align columns
- Literals/String
- Repeat a string
- Brace expansion
- Brace expansion using ranges
- Reverse a string
- Phrase reversals
- Comma quibbling
- Special characters
- String concatenation
- Substring/Top and tail
- Commatizing numbers
- Reverse words in a string
- Suffixation of decimal numbers
- Long literals, with continuations
- Numerical and alphabetical suffixes
- Abbreviations, easy
- Abbreviations, simple
- Abbreviations, automatic
- Song lyrics/poems/Mad Libs/phrases
- Mad Libs
- Magic 8-ball
- 99 bottles of beer
- The Name Game (a song)
- The Old lady swallowed a fly
- The Twelve Days of Christmas
- Tokenize
- Text between
- Tokenize a string
- Word break problem
- Tokenize a string with escaping
- Split a character string based on change of character
- Sequences
11l
F minimumEditDistance(=s1, =s2)
I s1.len > s2.len
(s1, s2) = (s2, s1)
V distances = Array(0 .. s1.len)
L(char2) s2
V newDistances = [L.index + 1]
L(char1) s1
I char1 == char2
newDistances.append(distances[L.index])
E
newDistances.append(1 + min((distances[L.index], distances[L.index + 1], newDistances.last)))
distances = newDistances
R distances.last
print(minimumEditDistance(‘kitten’, ‘sitting’))
print(minimumEditDistance(‘rosettacode’, ‘raisethysword’))
- Output:
3 8
360 Assembly
* Levenshtein distance - 22/04/2020
LEVENS CSECT
USING LEVENS,R13 base register
B 72(R15) skip savearea
DC 17F'0' savearea
SAVE (14,12) save previous context
ST R13,4(R15) link backward
ST R15,8(R13) link forward
LR R13,R15 set addressability
LA R8,1 ii=1
DO WHILE=(C,R8,LE,NN) do ii=1 to nn
LR R1,R8 ii
SLA R1,5 *32
LA R4,SS-32(R1) @ss(ii,1)
MVC S1,0(R4) s1=ss(ii,1)
LR R1,R8 ii
SLA R1,1 *2
LA R1,1(R1) +1
SLA R1,4 *16
LA R4,SS-32(R1) @ss(ii,2)
MVC S2,0(R4) s2=ss(ii,2)
LA R1,S1 @s1
BAL R14,LENGTHST call length
AH R0,=H'1' +1
ST R0,N1 n1=length(s1)+1
LA R1,S2 @s2
BAL R14,LENGTHST call length
AH R0,=H'1' +1
ST R0,N2 n2=length(s2)+1
L R4,N1 n1
IF C,R4,EQ,=F'1' THEN if n1=1 then
L R3,N2 n2
BCTR R3,0 lev=n2-1
B RET goto ret
ENDIF , endif
L R4,N2 n2
IF C,R4,EQ,=F'1' THEN if n2=1 then
L R3,N1 n1
BCTR R3,0 lev=n1-1
B RET goto ret
ENDIF , endif
LA R6,1 i=1
DO WHILE=(C,R6,LE,N1) do i=1 to n1
LR R2,R6 i
BCTR R2,0 -1
LR R1,R6 i
SLA R1,6 *64
ST R2,D-64(R1) d(i,1)=i-1
LA R6,1(R6) i++
ENDDO , enddo i
LA R7,1 j=1
DO WHILE=(C,R7,LE,N2) do j=1 to n2
LR R2,R7 j
BCTR R2,0 j-1
LR R1,R7 j
SLA R1,2 *4
ST R2,D-4(R1) d(1,j)=j-1
LA R7,1(R7) j++
ENDDO , enddo j
LA R6,2 i=2
DO WHILE=(C,R6,LE,N1) do i=2 to n1
LA R4,S1-2 @s1-2
AR R4,R6 +i
MVC C1(1),0(R4) c1=substr(s1,i-1,1)
LA R7,2 j=2
DO WHILE=(C,R7,LE,N2) do j=2 to n2
LA R4,S2-2 @s2-2
AR R4,R7 +j
MVC C2(1),0(R4) c2=substr(s2,j-1,1)
LR R1,R6 i
SH R1,=H'2' -2
SLA R1,4 *16
AR R1,R7 +j
SLA R1,2 *4
L R2,D-4(R1) d(i-1,j)
LA R2,1(R2) +1
ST R2,D1 d1=d(i-1,j)+1
LR R2,R7 j
BCTR R2,0 -1
LR R1,R6 i
BCTR R1,0 -1
SLA R1,4 *16
AR R1,R2 +(j-1)
SLA R1,2 *4
L R2,D-4(R1) d(i,j-1)
LA R2,1(R2) +1
ST R2,D2 d2=d(i,j-1)+1
IF CLC,C1,NE,C2 THEN if c1<>c2 then
LA R9,1 x=1
ELSE , else
SR R9,R9 x=0
ENDIF , endif
LR R1,R6 i
SH R1,=H'2' i-2
LR R2,R7 j
BCTR R2,0 j-1
SLA R1,4 *16
AR R1,R2 +(j-1)
SLA R1,2 *4
L R2,D-4(R1) d(i-1,j-1)
AR R2,R9 +x
ST R2,D3 d3=d(i-1,j-1)+x
L R4,D1 d1
IF C,R4,LT,D2 THEN if d1<d2 then
IF C,R4,LT,D3 THEN if d1<d3 then
L R10,D1 dd=d1
ELSE , else
L R10,D3 dd=d3
ENDIF , endif
ELSE , else
L R4,D2 d2
IF C,R4,LT,D3 THEN if d2<d3 then
L R10,D2 dd=d2
ELSE , else
L R10,D3 dd=d3
ENDIF , endif
ENDIF , endif
LR R1,R6 i
BCTR R1,0 i-1
SLA R1,4 *16
AR R1,R7 +j
SLA R1,2 *4
ST R10,D-4(R1) d(i,j)=dd
LA R7,1(R7) j++
ENDDO , enddo j
LA R6,1(R6) i++
ENDDO , enddo i
L R1,N1 n1
BCTR R1,0 -1
SLA R1,4 *16
A R1,N2 +n2
SLA R1,2 *4
L R3,D-4(R1) lev=d(n1,n2)
RET MVC PG,=CL80' ' clear buffer
MVC PG(16),S1 output s1
MVC PG+17(16),S2 output s2
XDECO R3,XDEC edit lev
MVC PG+34(5),XDEC+7 output lev
XPRNT PG,L'PG print bufffer
LA R8,1(R8) ii++
ENDDO , enddo ii
L R13,4(0,R13) restore previous savearea pointer
RETURN (14,12),RC=0 restore registers from calling save
LENGTHST CNOP 0,4 function lengthst(c)
LA R0,16 i=16
LA R1,15(R1) @ci
LENGTHS1 LTR R0,R0 while i>0
BZ LENGTHS2 ..
CLI 0(R1),C' ' if Mid(c,i,1)=" "
BNE LENGTHS2 then exit while
BCTR R1,0 @ci--
BCTR R0,0 i--
B LENGTHS1 endwhile
LENGTHS2 BR R14 return to caller
SS DC CL16'kitten',CL16'sitting'
DC CL16'rosettacode',CL16'raisethysword'
DC CL16'saturday',CL16'sunday'
DC CL16'sleep',CL16'fleeting'
NN DC A((NN-SS)/(2*L'SS))
N1 DS F n1
N2 DS F n2
S1 DS CL16 s1
S2 DS CL16 s2
D DS 256F d(16,16)
D1 DS F d1
D2 DS F d2
D3 DS F d3
C1 DS CL1 c1
C2 DS CL1 c2
PG DS CL80 buffer
XDEC DS CL12 temp fo xdeco
REGEQU
END LEVENS
- Output:
kitten sitting 3 rosettacode raisethysword 8 saturday sunday 3 sleep fleeting 5
Action!
DEFINE STRING="CHAR ARRAY" ; sys.act
DEFINE width="15" ; max characters 14
DEFINE MatrixSize="225" ; 15*15
PROC Set2Dm(INT ARRAY matrix, INT x,y, val)
matrix(x+y*width)=val
RETURN
INT FUNC Get2Dm(INT ARRAY matrix, INT x,y)
INT res
res=matrix(x+y*width)
RETURN(res)
INT FUNC DamerauLevenshteinDistance(STRING str1, str2)
INT ARRAY matrix(MatrixSize)
BYTE Result, tmp, Min, K, L, I, J, M, N
Result=0
M=str1(0)
N=str2(0)
FOR I=0 TO MatrixSize-1 DO matrix(I)=0 OD
FOR I=0 TO M DO Set2Dm(matrix, I,0, I) OD
FOR J=0 TO N DO Set2Dm(matrix, 0,J, J) OD
FOR J=1 TO N DO
FOR I=1 TO M DO
IF str1(I) = str2(J) THEN
tmp=Get2Dm(matrix, I-1, J-1)
Set2Dm(matrix, I,J, tmp) ; no operation required
ELSE
Min = Get2Dm(matrix, I-1,J)+1 ; REM delete
K = Get2Dm(matrix, I,J-1)+1 ; REM insert
L = Get2Dm(matrix, I-1, J-1)+1 ; REM substitution
IF K < Min THEN Min=K FI
IF L < Min THEN Min=L FI
Set2Dm(matrix, I,J, Min)
;transposition for Damerau Levenshtein Distance
;IF I>1 AND J>1 THEN
; IF str1(I) = str2(J-1) AND str1(I-1) = str2(J) THEN
; Min=Get2Dm(matrix, I,J)
; tmp=Get2Dm(matrix, I-2,J-2)+1
; IF Min>tmp THEN Min=tmp FI
; Set2Dm(matrix, I,J, Min) ; REM transposition
; FI
;FI
FI
OD
OD
Result=Get2Dm(matrix, M,N)
RETURN(Result)
PROC MAIN()
INT result
STRING Word_1(15), Word_2(15)
PUT(125)
PUTE()
SCopy(Word_1,"kitten") SCopy(Word_2,"sitting")
PrintF("%S - %S%E",Word_1,Word_2)
result=DamerauLevenshteinDistance(Word_1,Word_2)
PrintF("Levenshtein Distance=%U%E%E",result)
;PrintF("Damerau Levenshtein Distance=%U%E%E",result)
SCopy(Word_1,"rosettacode") SCopy(Word_2,"raisethysword")
PrintF("%S - %S%E",Word_1,Word_2)
result=DamerauLevenshteinDistance(Word_1,Word_2)
PrintF("Levenshtein Distance=%U%E%E",result)
;PrintF("Damerau Levenshtein Distance=%U%E%E",result)
SCopy(Word_1,"qwerty") SCopy(Word_2,"qweryt")
PrintF("%S - %S%E",Word_1,Word_2)
result=DamerauLevenshteinDistance(Word_1,Word_2)
PrintF("Levenshtein Distance=%U%E%E",result)
;PrintF("Damerau Levenshtein Distance=%U%E%E",result)
RETURN
- Output:
kitten, sitting: 3 rosettacode, raisethysword: 8
Ada
with Ada.Text_IO;
procedure Main is
function Levenshtein_Distance (S, T : String) return Natural is
D : array (0 .. S'Length, 0 .. T'Length) of Natural;
begin
for I in D'Range (1) loop
D (I, 0) := I;
end loop;
for I in D'Range (2) loop
D (0, I) := I;
end loop;
for J in T'Range loop
for I in S'Range loop
if S (I) = T (J) then
D (I, J) := D (I - 1, J - 1);
else
D (I, J) :=
Natural'Min
(Natural'Min (D (I - 1, J) + 1, D (I, J - 1) + 1),
D (I - 1, J - 1) + 1);
end if;
end loop;
end loop;
return D (S'Length, T'Length);
end Levenshtein_Distance;
begin
Ada.Text_IO.Put_Line
("kitten -> sitting:" &
Integer'Image (Levenshtein_Distance ("kitten", "sitting")));
Ada.Text_IO.Put_Line
("rosettacode -> raisethysword:" &
Integer'Image (Levenshtein_Distance ("rosettacode", "raisethysword")));
end Main;
- Output:
kitten -> sitting: 3 rosettacode -> raisethysword: 8
Aime
integer
dist(data s, t, integer i, j, list d)
{
integer x;
x = d[i * (~t + 1) + j];
if (x == -1) {
if (i == ~s) {
x = ~t - j;
} elif (j == ~t) {
x = ~s - i;
} elif (s[i] == t[j]) {
x = dist(s, t, i + 1, j + 1, d);
} else {
x = dist(s, t, i + 1, j + 1, d)
.min(dist(s, t, i, j + 1, d))
.min(dist(s, t, i + 1, j, d));
x += 1;
}
d[i * (~t + 1) + j] = x;
}
x;
}
levenshtein(data s, t)
{
list d;
d.pn_integer(0, (~s + 1) * (~t + 1), -1);
dist(s, t, 0, 0, d);
}
main(void)
{
text s1, s2;
o_form("`~' to `~' is ~\n", s1 = "rosettacode", s2 = "raisethysword",
levenshtein(s1, s2));
o_form("`~' to `~' is ~\n", s1 = "kitten", s2 = "sitting",
levenshtein(s1, s2));
0;
}
- Output:
`rosettacode' to `raisethysword' is 8 `kitten' to `sitting' is 3
ALGOL 68
...largely to highlight the syntactic similarities.
Non-recursive algorithm - although Algol 68 supports recursion, Action! doesn't.
# Calculate Levenshtein distance between strings - translated from the Action! sample #
BEGIN
PROC levenshtein distance = (STRING str1, str2)INT:
BEGIN
INT m=UPB str1;
INT n=UPB str2;
(0:m,0:n)INT matrix;
FOR i FROM 0 TO m DO FOR j FROM 0 TO n DO matrix(i,j):=0 OD OD;
FOR i TO m DO matrix(i,1):=i OD;
FOR j TO n DO matrix(1,j):=j OD;
FOR j FROM 1 TO n DO
FOR i FROM 1 TO m DO
IF str1(i) = str2(j) THEN
matrix(i,j):=matrix(i-1, j-1) # no operation required #
ELSE
INT min := matrix(i-1,j)+1 ; # deletion #
INT k = matrix(i,j-1)+1 ; # insertion #
INT l = matrix(i-1, j-1)+1 ; # substitution #
IF k < min THEN min:=k FI;
IF l < min THEN min:=l FI;
matrix(i,j):=min
FI
OD
OD;
matrix(m,n)
END;
STRING word 1, word 2;
word 1 :="kitten"; word 2 := "sitting";
print((word 1," -> ",word 2,": "));
print(("Levenshtein Distance: ",whole(levenshtein distance(word 1,word 2),0),newline));
word 1 := "rosettacode"; word 2 := "raisethysword";
print((word 1," -> ",word 2,": "));
print(("Levenshtein Distance: ",whole(levenshtein distance(word 1,word 2),0),newline));
word 1 := "qwerty"; word 2 := "qweryt";
print((word 1," -> ",word 2,": "));
print(("Levenshtein Distance: ",whole(levenshtein distance(word 1,word 2),0),newline));
word 1 := "Action!"; word 2 := "Algol 68";
print((word 1," -> ",word 2,": "));
print(("Levenshtein Distance: ",whole(levenshtein distance(word 1,word 2),0),newline))
END
- Output:
kitten -> sitting: Levenshtein Distance: 3 rosettacode -> raisethysword: Levenshtein Distance: 8 qwerty -> qweryt: Levenshtein Distance: 2 Action! -> Algol 68: Levenshtein Distance: 7
AppleScript
Iteration
Translation of the "fast" C-version
set dist to findLevenshteinDistance for "sunday" against "saturday"
to findLevenshteinDistance for s1 against s2
script o
property l : s1
property m : s2
end script
if s1 = s2 then return 0
set ll to length of s1
set lm to length of s2
if ll = 0 then return lm
if lm = 0 then return ll
set v0 to {}
repeat with i from 1 to (lm + 1)
set end of v0 to (i - 1)
end repeat
set item -1 of v0 to 0
copy v0 to v1
repeat with i from 1 to ll
-- calculate v1 (current row distances) from the previous row v0
-- first element of v1 is A[i+1][0]
-- edit distance is delete (i+1) chars from s to match empty t
set item 1 of v1 to i
-- use formula to fill in the rest of the row
repeat with j from 1 to lm
if item i of o's l = item j of o's m then
set cost to 0
else
set cost to 1
end if
set item (j + 1) of v1 to min3 for ((item j of v1) + 1) against ((item (j + 1) of v0) + 1) by ((item j of v0) + cost)
end repeat
copy v1 to v0
end repeat
return item (lm + 1) of v1
end findLevenshteinDistance
to min3 for anInt against anOther by theThird
if anInt < anOther then
if theThird < anInt then
return theThird
else
return anInt
end if
else
if theThird < anOther then
return theThird
else
return anOther
end if
end if
end min3
Composition of generic functions
(ES6 version)
In the ancient tradition of "Use library functions whenever feasible." (for better productivity), and also in the even older functional tradition of composing values (for better reliability) rather than sequencing actions:
-- levenshtein :: String -> String -> Int
on levenshtein(sa, sb)
set {s1, s2} to {characters of sa, characters of sb}
script
on |λ|(ns, c)
script minPath
on |λ|(z, c1xy)
set {c1, x, y} to c1xy
minimum({y + 1, z + 1, x + fromEnum(c1 is not c)})
end |λ|
end script
set {n, ns1} to {item 1 of ns, rest of ns}
scanl(minPath, n + 1, zip3(s1, ns, ns1))
end |λ|
end script
last item of foldl(result, enumFromTo(0, length of s1), s2)
end levenshtein
--------------------------- TEST ---------------------------
on run
script test
on |λ|(tuple)
set {sa, sb} to tuple
levenshtein(sa, sb)
end |λ|
end script
map(test, [["kitten", "sitting"], ["sitting", "kitten"], ¬
["rosettacode", "raisethysword"], ["raisethysword", "rosettacode"]])
--> {3, 3, 8, 8}
end run
-------------------- GENERIC FUNCTIONS ---------------------
-- enumFromTo :: Int -> Int -> [Int]
on enumFromTo(m, n)
if m ≤ n then
set lst to {}
repeat with i from m to n
set end of lst to i
end repeat
lst
else
{}
end if
end enumFromTo
-- fromEnum :: Enum a => a -> Int
on fromEnum(x)
set c to class of x
if c is boolean then
if x then
1
else
0
end if
else if c is text then
if x ≠ "" then
id of x
else
missing value
end if
else
x as integer
end if
end fromEnum
-- foldl :: (a -> b -> a) -> a -> [b] -> a
on foldl(f, startValue, xs)
tell mReturn(f)
set v to startValue
set lng to length of xs
repeat with i from 1 to lng
set v to |λ|(v, item i of xs, i, xs)
end repeat
return v
end tell
end foldl
-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
-- The list obtained by applying f
-- to each element of xs.
tell mReturn(f)
set lng to length of xs
set lst to {}
repeat with i from 1 to lng
set end of lst to |λ|(item i of xs, i, xs)
end repeat
return lst
end tell
end map
-- minimum :: Ord a => [a] -> a
on minimum(xs)
set lng to length of xs
if lng < 1 then return missing value
set m to item 1 of xs
repeat with x in xs
set v to contents of x
if v < m then set m to v
end repeat
return m
end minimum
-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
-- 2nd class handler function lifted into 1st class script wrapper.
if script is class of f then
f
else
script
property |λ| : f
end script
end if
end mReturn
-- scanl :: (b -> a -> b) -> b -> [a] -> [b]
on scanl(f, startValue, xs)
tell mReturn(f)
set v to startValue
set lng to length of xs
set lst to {startValue}
repeat with i from 1 to lng
set v to |λ|(v, item i of xs, i, xs)
set end of lst to v
end repeat
return lst
end tell
end scanl
-- zip3 :: [a] -> [b] -> [c] -> [(a, b, c)]
on zip3(xs, ys, zs)
script
on |λ|(x, i)
{x, item i of ys, item i of zs}
end |λ|
end script
map(result, items 1 thru ¬
minimum({length of xs, length of ys, length of zs}) of xs)
end zip3
- Output:
{3, 3, 8, 8}
Arc
Waterhouse Arc
O(n * m) time, linear space, using lists instead of vectors
(def levenshtein (str1 str2)
(withs l1 len.str1
l2 len.str2
row range0:inc.l1
(times j l2
(let next list.j
(times i l1
(push
(inc:min
car.next
((if (is str1.i str2.j) dec id) car.row)
(car:zap cdr row))
next))
(= row nrev.next)))
row.l1))
Arturo
print levenshtein "kitten" "sitting"
- Output:
3
AutoHotkey
levenshtein(s, t){
If s =
return StrLen(t)
If t =
return strLen(s)
If SubStr(s, 1, 1) = SubStr(t, 1, 1)
return levenshtein(SubStr(s, 2), SubStr(t, 2))
a := Levenshtein(SubStr(s, 2), SubStr(t, 2))
b := Levenshtein(s, SubStr(t, 2))
c := Levenshtein(SubStr(s, 2), t )
If (a > b)
a := b
if (a > c)
a := c
return a + 1
}
s1 := "kitten"
s2 := "sitting"
MsgBox % "distance between " s1 " and " s2 ": " levenshtein(s1, s2)
It correctly outputs '3'
AWK
Slavishly copied from the very clear AutoHotKey example.
#!/usr/bin/awk -f
BEGIN {
a = "kitten";
b = "sitting";
d = levenshteinDistance(a, b);
p = d == 1 ? "" : "s";
printf("%s -> %s after %d edit%s\n", a, b, d, p);
exit;
}
function levenshteinDistance(s1, s2,
s1First, s2First, s1Rest, s2Rest,
distA, distB, distC, minDist) {
# If either string is empty,
# then distance is insertion of the other's characters.
if (length(s1) == 0) return length(s2);
if (length(s2) == 0) return length(s1);
# Rest of process uses first characters
# and remainder of each string.
s1First = substr(s1, 1, 1);
s2First = substr(s2, 1, 1);
s1Rest = substr(s1, 2, length(s1));
s2Rest = substr(s2, 2, length(s2));
# If leading characters are the same,
# then distance is that between the rest of the strings.
if (s1First == s2First) {
return levenshteinDistance(s1Rest, s2Rest);
}
# Find the distances between sub strings.
distA = levenshteinDistance(s1Rest, s2);
distB = levenshteinDistance(s1, s2Rest);
distC = levenshteinDistance(s1Rest, s2Rest);
# Return the minimum distance between substrings.
minDist = distA;
if (distB < minDist) minDist = distB;
if (distC < minDist) minDist = distC;
return minDist + 1; # Include change for the first character.
}
Example output:
kitten -> sitting after 3 edits
Alternative, much faster but also less readable lazy-evaluation version from http://awk.freeshell.org/LevenshteinEditDistance
(where the above takes e.g. 0m44.904s in gawk 4.1.3 for 5 edits (length 10 and 14 strings), this takes user 0m0.004s):
#!/usr/bin/awk -f
function levdist(str1, str2, l1, l2, tog, arr, i, j, a, b, c) {
if (str1 == str2) {
return 0
} else if (str1 == "" || str2 == "") {
return length(str1 str2)
} else if (substr(str1, 1, 1) == substr(str2, 1, 1)) {
a = 2
while (substr(str1, a, 1) == substr(str2, a, 1)) a++
return levdist(substr(str1, a), substr(str2, a))
} else if (substr(str1, l1=length(str1), 1) == substr(str2, l2=length(str2), 1)) {
b = 1
while (substr(str1, l1-b, 1) == substr(str2, l2-b, 1)) b++
return levdist(substr(str1, 1, l1-b), substr(str2, 1, l2-b))
}
for (i = 0; i <= l2; i++) arr[0, i] = i
for (i = 1; i <= l1; i++) {
arr[tog = ! tog, 0] = i
for (j = 1; j <= l2; j++) {
a = arr[! tog, j ] + 1
b = arr[ tog, j-1] + 1
c = arr[! tog, j-1] + (substr(str1, i, 1) != substr(str2, j, 1))
arr[tog, j] = (((a<=b)&&(a<=c)) ? a : ((b<=a)&&(b<=c)) ? b : c)
}
}
return arr[tog, j-1]
}
BBC BASIC
PRINT "'kitten' -> 'sitting' has distance " ;
PRINT ; FNlevenshtein("kitten", "sitting")
PRINT "'rosettacode' -> 'raisethysword' has distance " ;
PRINT ; FNlevenshtein("rosettacode", "raisethysword")
END
DEF FNlevenshtein(s$, t$)
LOCAL i%, j%, m%, d%()
DIM d%(LENs$, LENt$)
FOR i% = 0 TO DIM(d%(),1)
d%(i%,0) = i%
NEXT
FOR j% = 0 TO DIM(d%(),2)
d%(0,j%) = j%
NEXT
FOR j% = 1 TO DIM(d%(),2)
FOR i% = 1 TO DIM(d%(),1)
IF MID$(s$,i%,1) = MID$(t$,j%,1) THEN
d%(i%,j%) = d%(i%-1,j%-1)
ELSE
m% = d%(i%-1,j%-1)
IF d%(i%,j%-1) < m% m% = d%(i%,j%-1)
IF d%(i%-1,j%) < m% m% = d%(i%-1,j%)
d%(i%,j%) = m% + 1
ENDIF
NEXT
NEXT j%
= d%(i%-1,j%-1)
Output:
'kitten' -> 'sitting' has distance 3 'rosettacode' -> 'raisethysword' has distance 8
BQN
Recursive slow version:
Levenshtein ← {
𝕨 𝕊"": ≠𝕨;
""𝕊 𝕩: ≠𝕩;
𝕨 𝕊 𝕩:
Tail←1⊸↓
𝕨 =○⊑◶⟨1+·⌊´ 𝕊○Tail ∾ Tail⊸𝕊 ∾ 𝕊⟜Tail, 𝕊○Tail⟩ 𝕩
}
Fast version:
Levenshtein ← @⊸∾⊸{¯1⊑(↕≠𝕨)𝕨{(1⊸+⊸⌊`1⊸+⌊⊑⊸»+𝕨≠𝕗˙)𝕩}´⌽𝕩}
- Example use:
"kitten" Levenshtein "sitting"
3
"Saturday" Levenshtein "Sunday"
3
"rosettacode" Levenshtein "raisethysword"
8
Bracmat
Recursive method, but with memoization.
(levenshtein=
lev cache
. ( lev
= s s0 s1 t t0 t1 L a b c val key
. (cache..find)$(str$!arg:?key):(?.?val)
& !val
| !arg:(?s,?t)
& ( !s:&@(!t:? [?L)
| !t:&@(!s:? [?L)
)
& (cache..insert)$(!key.!L)
& !L
| !arg:(@(?:%?s0 ?s1),@(?:%?t0 ?t1))
& !s0:!t0
& lev$(!s1,!t1)
| lev$(!s1,!t1):?a
& lev$(!s,!t1):?b
& lev$(!s1,!t):?c
& (!b:<!a:?a|)
& (!c:<!a:?a|)
& (cache..insert)$(!key.1+!a)
& 1+!a
)
& new$hash:?cache
& lev$!arg);
- Demonstrating:
levenshtein$(kitten,sitting) 3 levenshtein$(rosettacode,raisethysword) 8
Bruijn
Recursive and non-memoized method:
:import std/Combinator .
:import std/Char C
:import std/List .
:import std/Math .
levenshtein y [[[∅?1 ∀0 (∅?0 ∀1 (0 (1 [[[[go]]]])))]]]
go (C.eq? 3 1) (6 2 0) ++(lmin ((6 2 0) : ((6 5 0) : {}(6 2 4))))
:test ((levenshtein "rosettacode" "raisethysword") =? (+8)) ([[1]])
:test ((levenshtein "kitten" "sitting") =? (+3)) ([[1]])
C
Recursive method. Deliberately left in an inefficient state to show the recursive nature of the algorithm; notice how it would have become the Wikipedia algorithm if we memoized the function against parameters ls
and lt
.
#include <stdio.h>
#include <string.h>
/* s, t: two strings; ls, lt: their respective length */
int levenshtein(const char *s, int ls, const char *t, int lt)
{
int a, b, c;
/* if either string is empty, difference is inserting all chars
* from the other
*/
if (!ls) return lt;
if (!lt) return ls;
/* if last letters are the same, the difference is whatever is
* required to edit the rest of the strings
*/
if (s[ls - 1] == t[lt - 1])
return levenshtein(s, ls - 1, t, lt - 1);
/* else try:
* changing last letter of s to that of t; or
* remove last letter of s; or
* remove last letter of t,
* any of which is 1 edit plus editing the rest of the strings
*/
a = levenshtein(s, ls - 1, t, lt - 1);
b = levenshtein(s, ls, t, lt - 1);
c = levenshtein(s, ls - 1, t, lt );
if (a > b) a = b;
if (a > c) a = c;
return a + 1;
}
int main()
{
const char *s1 = "rosettacode";
const char *s2 = "raisethysword";
printf("distance between `%s' and `%s': %d\n", s1, s2,
levenshtein(s1, strlen(s1), s2, strlen(s2)));
return 0;
}
Take the above and add caching, we get (in C99):
#include <stdio.h>
#include <string.h>
int levenshtein(const char *s, const char *t)
{
int ls = strlen(s), lt = strlen(t);
int d[ls + 1][lt + 1];
for (int i = 0; i <= ls; i++)
for (int j = 0; j <= lt; j++)
d[i][j] = -1;
int dist(int i, int j) {
if (d[i][j] >= 0) return d[i][j];
int x;
if (i == ls)
x = lt - j;
else if (j == lt)
x = ls - i;
else if (s[i] == t[j])
x = dist(i + 1, j + 1);
else {
x = dist(i + 1, j + 1);
int y;
if ((y = dist(i, j + 1)) < x) x = y;
if ((y = dist(i + 1, j)) < x) x = y;
x++;
}
return d[i][j] = x;
}
return dist(0, 0);
}
int main(void)
{
const char *s1 = "rosettacode";
const char *s2 = "raisethysword";
printf("distance between `%s' and `%s': %d\n", s1, s2,
levenshtein(s1, s2));
return 0;
}
C#
This is a straightforward translation of the Wikipedia pseudocode.
using System;
namespace LevenshteinDistance
{
class Program
{
static int LevenshteinDistance(string s, string t)
{
int n = s.Length;
int m = t.Length;
int[,] d = new int[n + 1, m + 1];
if (n == 0)
{
return m;
}
if (m == 0)
{
return n;
}
for (int i = 0; i <= n; i++)
d[i, 0] = i;
for (int j = 0; j <= m; j++)
d[0, j] = j;
for (int j = 1; j <= m; j++)
for (int i = 1; i <= n; i++)
if (s[i - 1] == t[j - 1])
d[i, j] = d[i - 1, j - 1]; //no operation
else
d[i, j] = Math.Min(Math.Min(
d[i - 1, j] + 1, //a deletion
d[i, j - 1] + 1), //an insertion
d[i - 1, j - 1] + 1 //a substitution
);
return d[n, m];
}
static void Main(string[] args)
{
if (args.Length == 2)
Console.WriteLine("{0} -> {1} = {2}",
args[0], args[1], LevenshteinDistance(args[0], args[1]));
else
Console.WriteLine("Usage:-\n\nLevenshteinDistance <string1> <string2>");
}
}
}
- Example output:
> LevenshteinDistance kitten sitting kitten -> sitting = 3 > LevenshteinDistance rosettacode raisethysword rosettacode -> raisethysword = 8
C++
#include <string>
#include <iostream>
using namespace std;
// Compute Levenshtein Distance
// Martin Ettl, 2012-10-05
size_t uiLevenshteinDistance(const string &s1, const string &s2)
{
const size_t
m(s1.size()),
n(s2.size());
if( m==0 ) return n;
if( n==0 ) return m;
// allocation below is not ISO-compliant,
// it won't work with -pedantic-errors.
size_t costs[n + 1];
for( size_t k=0; k<=n; k++ ) costs[k] = k;
size_t i { 0 };
for (char const &c1 : s1)
{
costs[0] = i+1;
size_t corner { i },
j { 0 };
for (char const &c2 : s2)
{
size_t upper { costs[j+1] };
if( c1 == c2 ) costs[j+1] = corner;
else {
size_t t(upper<corner? upper: corner);
costs[j+1] = (costs[j]<t?costs[j]:t)+1;
}
corner = upper;
j++;
}
i++;
}
return costs[n];
}
int main()
{
string s0 { "rosettacode" },
s1 { "raisethysword" };
cout << "distance between " << s0 << " and " << s1 << " : "
<< uiLevenshteinDistance(s0,s1) << endl;
return 0;
}
- Example output:
$ ./a.out rosettacode raisethysword distance between rosettacode and raisethysword : 8
Generic ISO C++ version
#include <algorithm>
#include <iostream>
#include <numeric>
#include <string>
#include <vector>
template <typename StringType>
size_t levenshtein_distance(const StringType& s1, const StringType& s2) {
const size_t m = s1.size();
const size_t n = s2.size();
if (m == 0)
return n;
if (n == 0)
return m;
std::vector<size_t> costs(n + 1);
std::iota(costs.begin(), costs.end(), 0);
size_t i = 0;
for (auto c1 : s1) {
costs[0] = i + 1;
size_t corner = i;
size_t j = 0;
for (auto c2 : s2) {
size_t upper = costs[j + 1];
costs[j + 1] = (c1 == c2) ? corner
: 1 + std::min(std::min(upper, corner), costs[j]);
corner = upper;
++j;
}
++i;
}
return costs[n];
}
int main() {
std::wstring s0 = L"rosettacode";
std::wstring s1 = L"raisethysword";
std::wcout << L"distance between " << s0 << L" and " << s1 << L": "
<< levenshtein_distance(s0, s1) << std::endl;
return 0;
}
- Output:
distance between rosettacode and raisethysword: 8
Clojure
Recursive Version
(defn levenshtein [str1 str2]
(let [len1 (count str1)
len2 (count str2)]
(cond (zero? len1) len2
(zero? len2) len1
:else
(let [cost (if (= (first str1) (first str2)) 0 1)]
(min (inc (levenshtein (rest str1) str2))
(inc (levenshtein str1 (rest str2)))
(+ cost
(levenshtein (rest str1) (rest str2))))))))
(println (levenshtein "rosettacode" "raisethysword"))
- Output:
8
Iterative version
(defn levenshtein [w1 w2]
(letfn [(cell-value [same-char? prev-row cur-row col-idx]
(min (inc (nth prev-row col-idx))
(inc (last cur-row))
(+ (nth prev-row (dec col-idx)) (if same-char?
0
1))))]
(loop [row-idx 1
max-rows (inc (count w2))
prev-row (range (inc (count w1)))]
(if (= row-idx max-rows)
(last prev-row)
(let [ch2 (nth w2 (dec row-idx))
next-prev-row (reduce (fn [cur-row i]
(let [same-char? (= (nth w1 (dec i)) ch2)]
(conj cur-row (cell-value same-char?
prev-row
cur-row
i))))
[row-idx] (range 1 (count prev-row)))]
(recur (inc row-idx) max-rows next-prev-row))))))
CLU
min = proc [T: type] (a, b: T) returns (T)
where T has lt: proctype (T,T) returns (bool)
if a<b
then return(a)
else return(b)
end
end min
levenshtein = proc (s, t: string) returns (int)
ai = array[int]
aai = array[array[int]]
m: int := string$size(s)
n: int := string$size(t)
d: aai := aai$fill_copy(0, m+1, ai$fill(0, n+1, 0))
for i: int in int$from_to(1, m) do d[i][0] := i end
for j: int in int$from_to(1, n) do d[0][j] := j end
for j: int in int$from_to(1, n) do
for i: int in int$from_to(1, m) do
cost: int
if s[i] = t[j]
then cost := 0
else cost := 1
end
d[i][j] := min[int]( d[i-1][j]+1,
min[int]( d[i][j-1]+1,
d[i-1][j-1]+cost ))
end
end
return (d[m][n])
end levenshtein
show = proc (s, t: string)
po: stream := stream$primary_output()
stream$putl(po, s || " => " || t || ": " || int$unparse(levenshtein(s,t)))
end show
start_up = proc ()
show("kitten", "sitting")
show("rosettacode", "raisethysword")
end start_up
- Output:
kitten => sitting: 3 rosettacode => raisethysword: 8
COBOL
GnuCobol 2.2
identification division.
program-id. Levenshtein.
environment division.
configuration section.
repository.
function all intrinsic.
data division.
working-storage section.
77 string-a pic x(255).
77 string-b pic x(255).
77 length-a pic 9(3).
77 length-b pic 9(3).
77 distance pic z(3).
77 i pic 9(3).
77 j pic 9(3).
01 tab.
05 filler occurs 256.
10 filler occurs 256.
15 costs pic 9(3).
procedure division.
main.
move "kitten" to string-a
move "sitting" to string-b
perform levenshtein-dist
move "rosettacode" to string-a
move "raisethysword" to string-b
perform levenshtein-dist
stop run
.
levenshtein-dist.
move length(trim(string-a)) to length-a
move length(trim(string-b)) to length-b
initialize tab
perform varying i from 0 by 1 until i > length-a
move i to costs(i + 1, 1)
end-perform
perform varying j from 0 by 1 until j > length-b
move j to costs(1, j + 1)
end-perform
perform with test after varying i from 2 by 1 until i > length-a
perform with test after varying j from 2 by 1 until j > length-b
if string-a(i - 1:1) = string-b(j - 1:1)
move costs(i - 1, j - 1) to costs(i, j)
else
move min(min(costs(i - 1, j) + 1, *> a deletion
costs(i, j - 1) + 1), *> an insertion
costs(i - 1, j - 1) + 1) *> a substitution
to costs(i, j)
end-if
end-perform
end-perform
move costs(length-a + 1, length-b + 1) to distance
display trim(string-a) " -> " trim(string-b) " = " trim(distance)
.
- Output:
> ./Levenshtein kitten -> sitting = 3 rosettacode -> raisethysword = 8
CoffeeScript
levenshtein = (str1, str2) ->
# more of less ported simple algorithm from JS
m = str1.length
n = str2.length
d = []
return n unless m
return m unless n
d[i] = [i] for i in [0..m]
d[0][j] = j for j in [1..n]
for i in [1..m]
for j in [1..n]
if str1[i-1] is str2[j-1]
d[i][j] = d[i-1][j-1]
else
d[i][j] = Math.min(
d[i-1][j]
d[i][j-1]
d[i-1][j-1]
) + 1
d[m][n]
console.log levenshtein("kitten", "sitting")
console.log levenshtein("rosettacode", "raisethysword")
console.log levenshtein("stop", "tops")
console.log levenshtein("yo", "")
console.log levenshtein("", "yo")
Common Lisp
(defun levenshtein (a b)
(let* ((la (length a))
(lb (length b))
(rec (make-array (list (1+ la) (1+ lb)) :initial-element nil)))
(labels ((leven (x y)
(cond
((zerop x) y)
((zerop y) x)
((aref rec x y) (aref rec x y))
(t (setf (aref rec x y)
(min (+ (leven (1- x) y) 1)
(+ (leven x (1- y)) 1)
(+ (leven (1- x) (1- y)) (if (char= (char a (- la x)) (char b (- lb y))) 0 1))))))))
(leven la lb))))
(print (levenshtein "rosettacode" "raisethysword"))
- Output:
8
Crystal
The standard library includes levenshtein module
require "levenshtein"
puts Levenshtein.distance("kitten", "sitting")
puts Levenshtein.distance("rosettacode", "raisethysword")
- Output:
3 8
module Levenshtein
def self.distance(a, b)
a, b = a.downcase, b.downcase
costs = (0..b.size).to_a
(1..a.size).each do |i|
costs[0], nw = i, i - 1 # j == 0; nw is lev(i-1, j)
(1..b.size).each do |j|
costs[j], nw = [costs[j] + 1, costs[j-1] + 1, a[i-1] == b[j-1] ? nw : nw + 1].min, costs[j]
end
end
costs[b.size]
end
def self.test
%w{kitten sitting saturday sunday rosettacode raisethysword}.each_slice(2) do |(a, b)| #or do |pair| a, b = pair
puts "distance(#{a}, #{b}) = #{distance(a, b)}"
end
end
end
Levenshtein.test
- Output:
distance(kitten, sitting) = 3 distance(saturday, sunday) = 3 distance(rosettacode, raisethysword) = 8
def levenshtein_distance(str1, str2)
n, m = str1.size, str2.size
max = n / 2
return 0 if n == 0 || m == 0
return n if (n - m).abs > max
d = (0..m).to_a
x = 0
str1.each_char_with_index do |char1, i|
e = i + 1
str2.each_char_with_index do |char2, j|
cost = (char1 == char2) ? 0 : 1
x = [ d[j+1] + 1, # insertion
e + 1, # deletion
d[j] + cost # substitution
].min
d[j] = e
e = x
end
d[m] = x
end
x
end
%w{kitten sitting saturday sunday rosettacode raisethysword}.each_slice(2) do |(a, b)| #or do |pair| a, b = pair
puts "distance(#{a}, #{b}) = #{levenshtein_distance(a, b)}"
end
- Output:
distance(kitten, sitting) = 3 distance(saturday, sunday) = 3 distance(rosettacode, raisethysword) = 8
D
Standard Version
The standard library std.algorithm module includes a Levenshtein distance function:
void main() {
import std.stdio, std.algorithm;
levenshteinDistance("kitten", "sitting").writeln;
}
- Output:
3
Iterative Version
import std.stdio, std.algorithm;
int distance(in string s1, in string s2) pure nothrow {
auto costs = new int[s2.length + 1];
foreach (immutable i; 0 .. s1.length + 1) {
int lastValue = i;
foreach (immutable j; 0 .. s2.length + 1) {
if (i == 0)
costs[j] = j;
else {
if (j > 0) {
int newValue = costs[j - 1];
if (s1[i - 1] != s2[j - 1])
newValue = min(newValue, lastValue, costs[j]) + 1;
costs[j - 1] = lastValue;
lastValue = newValue;
}
}
}
if (i > 0)
costs[$ - 1] = lastValue;
}
return costs[$ - 1];
}
void main() {
foreach (p; [["kitten", "sitting"], ["rosettacode", "raisethysword"]])
writefln("distance(%s, %s): %d", p[0], p[1], distance(p[0], p[1]));
}
Memoized Recursive Version
import std.stdio, std.array, std.algorithm, std.functional;
uint lDist(T)(in const(T)[] s, in const(T)[] t) nothrow {
alias mlDist = memoize!lDist;
if (s.empty || t.empty) return max(t.length, s.length);
if (s[0] == t[0]) return mlDist(s[1 .. $], t[1 .. $]);
return min(mlDist(s, t[1 .. $]),
mlDist(s[1 .. $], t),
mlDist(s[1 .. $], t[1 .. $])) + 1;
}
void main() {
lDist("kitten", "sitting").writeln;
lDist("rosettacode", "raisethysword").writeln;
}
Delphi
program Levenshtein_distanceTest;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils,
Math;
function Levenshtein_distance(s, t: string): integer;
var
d: array of array of integer;
i, j, cost: integer;
begin
SetLength(d, Length(s) + 1);
for i := Low(d) to High(d) do
begin
SetLength(d[i], Length(t) + 1);
end;
for i := Low(d) to High(d) do
begin
d[i, 0] := i;
for j := Low(d[i]) to High(d[i]) do
begin
d[0, j] := j;
end;
end;
for i := Low(d) + 1 to High(d) do
begin
for j := Low(d[i]) + 1 to High(d[i]) do
begin
if s[i] = t[j] then
begin
cost := 0;
end
else
begin
cost := 1;
end;
d[i, j] := Min(Min(d[i - 1, j] + 1, //deletion
d[i, j - 1] + 1), //insertion
d[i - 1, j - 1] + cost //substitution
);
end;
end;
Result := d[Length(s), Length(t)];
end;
procedure Println(s, t: string);
begin
Writeln('> LevenshteinDistance "', s, '" "', t, '"');
Writeln(s, ' -> ', t, ' = ', Levenshtein_distance(s, t), #10);
end;
begin
Println('kitten', 'sitting');
Println('rosettacode', 'raisethysword');
readln;
end.
DWScript
Based on Wikipedia version
function LevenshteinDistance(s, t : String) : Integer;
var
i, j : Integer;
begin
var d:=new Integer[Length(s)+1, Length(t)+1];
for i:=0 to Length(s) do
d[i, 0] := i;
for j:=0 to Length(t) do
d[0, j] := j;
for j:=1 to Length(t) do
for i:=1 to Length(s) do
if s[i]=t[j] then
d[i, j] := d[i-1, j-1] // no operation
else d[i,j]:=MinInt(MinInt(
d[i-1, j] +1 , // a deletion
d[i, j-1] +1 ), // an insertion
d[i- 1,j-1] +1 // a substitution
);
Result:=d[Length(s), Length(t)];
end;
PrintLn(LevenshteinDistance('kitten', 'sitting'));
Dyalect
func levenshtein(s, t) {
var n = s.Length()
var m = t.Length()
var d = Array.Empty(n + 1, () => Array.Empty(m + 1))
if n == 0 {
return m
}
if (m == 0) {
return n
}
for i in 0..n {
d[i][0] = i
}
for j in 0..m {
d[0][j] = j
}
for j in 1..m {
for i in 1..n {
if s[i - 1] == t[j - 1] {
d[i][j] = d[i - 1][j - 1] //no operation
}
else {
d[i][j] = min(min(
d[i - 1][j] + 1, //a deletion
d[i][j - 1] + 1), //an insertion
d[i - 1][j - 1] + 1 //a substitution
)
}
}
}
d[n][m]
}
func run(x, y) {
print("\(x) -> \(y) = \(levenshtein(x, y))")
}
run("rosettacode", "raisethysword")
- Output:
rosettacode -> raisethysword = 8
EasyLang
func dist s1$ s2$ .
if len s1$ = 0
return len s2$
.
if len s2$ = 0
return len s1$
.
c1$ = substr s1$ 1 1
c2$ = substr s2$ 1 1
s1rest$ = substr s1$ 2 len s1$
s2rest$ = substr s2$ 2 len s2$
#
if c1$ = c2$
return dist s1rest$ s2rest$
.
min = lower dist s1rest$ s2rest$ dist s1$ s2rest$
min = lower min dist s1rest$ s2rest$
return min + 1
.
print dist "kitten" "sitting"
print dist "rosettacode" "raisethysword"
EchoLisp
;; Recursive version adapted from Clojure
;; Added necessary memoization
(define (levenshtein str1 str2 (cost 0) (rest1 0) (rest2 0) (key null))
(set! key (string-append str1 "|" str2))
(if (get 'mem key) ;; memoized ?
(get 'mem key)
;; else memoize
(putprop 'mem
(let [(len1 (string-length str1))
(len2 (string-length str2))]
(cond ((zero? len1) len2)
((zero? len2) len1)
(else
(set! cost (if (= (string-first str1) (string-first str2)) 0 1))
(set! rest1 (string-rest str1))
(set! rest2 (string-rest str2))
(min (1+ (levenshtein rest1 str2))
(1+ (levenshtein str1 rest2))
(+ cost
(levenshtein rest1 rest2 ))))))
key)))
;; 😛 127 calls with memoization
;; 😰 29737 calls without memoization
(levenshtein "kitten" "sitting") → 3
(levenshtein "rosettacode" "raisethysword") → 8
Ela
This code is translated from Haskell version.
open list
levenshtein s1 s2 = last <| foldl transform [0 .. length s1] s2
where transform (n::ns')@ns c = scanl calc (n+1) <| zip3 s1 ns ns'
where calc z (c', x, y) = minimum [y+1, z+1, x + toInt (c' <> c)]
Executing:
(levenshtein "kitten" "sitting", levenshtein "rosettacode" "raisethysword")
- Output:
(3, 8)
Elixir
defmodule Levenshtein do
def distance(a, b) do
ta = String.downcase(a) |> to_charlist |> List.to_tuple
tb = String.downcase(b) |> to_charlist |> List.to_tuple
m = tuple_size(ta)
n = tuple_size(tb)
costs = Enum.reduce(0..m, %{}, fn i,acc -> Map.put(acc, {i,0}, i) end)
costs = Enum.reduce(0..n, costs, fn j,acc -> Map.put(acc, {0,j}, j) end)
Enum.reduce(0..n-1, costs, fn j, acc ->
Enum.reduce(0..m-1, acc, fn i, map ->
d = if elem(ta, i) == elem(tb, j) do
map[ {i,j} ]
else
Enum.min([ map[ {i , j+1} ] + 1, # deletion
map[ {i+1, j } ] + 1, # insertion
map[ {i , j } ] + 1 ]) # substitution
end
Map.put(map, {i+1, j+1}, d)
end)
end)
|> Map.get({m,n})
end
end
words = ~w(kitten sitting saturday sunday rosettacode raisethysword)
Enum.each(Enum.chunk(words, 2), fn [a,b] ->
IO.puts "distance(#{a}, #{b}) = #{Levenshtein.distance(a,b)}"
end)
- Output:
distance(kitten, sitting) = 3 distance(saturday, sunday) = 3 distance(rosettacode, raisethysword) = 8
Erlang
Here are two implementations. The first is the naive version, the second is a memoized version using Erlang's dictionary datatype.
-module(levenshtein).
-compile(export_all).
distance_cached(S,T) ->
{L,_} = ld(S,T,dict:new()),
L.
distance(S,T) ->
ld(S,T).
ld([],T) ->
length(T);
ld(S,[]) ->
length(S);
ld([X|S],[X|T]) ->
ld(S,T);
ld([_SH|ST]=S,[_TH|TT]=T) ->
1 + lists:min([ld(S,TT),ld(ST,T),ld(ST,TT)]).
ld([]=S,T,Cache) ->
{length(T),dict:store({S,T},length(T),Cache)};
ld(S,[]=T,Cache) ->
{length(S),dict:store({S,T},length(S),Cache)};
ld([X|S],[X|T],Cache) ->
ld(S,T,Cache);
ld([_SH|ST]=S,[_TH|TT]=T,Cache) ->
case dict:is_key({S,T},Cache) of
true -> {dict:fetch({S,T},Cache),Cache};
false ->
{L1,C1} = ld(S,TT,Cache),
{L2,C2} = ld(ST,T,C1),
{L3,C3} = ld(ST,TT,C2),
L = 1+lists:min([L1,L2,L3]),
{L,dict:store({S,T},L,C3)}
end.
Below is a comparison of the runtimes, measured in microseconds, between the two implementations.
68> timer:tc(levenshtein,distance,["rosettacode","raisethysword"]).
{774799,8} % {Time, Result}
69> timer:tc(levenshtein,distance_cached,["rosettacode","raisethysword"]).
{659,8}
70> timer:tc(levenshtein,distance,["kitten","sitting"]).
{216,3}
71> timer:tc(levenshtein,distance_cached,["kitten","sitting"]).
{213,3}
ERRE
PROGRAM LEVENSHTEIN
!$DYNAMIC
DIM D%[0,0]
PROCEDURE LEVENSHTEIN(S$,T$->RES%)
LOCAL I%,J%,M%
FOR I%=0 TO LEN(S$) DO
D%[I%,0]=I%
END FOR
FOR J%=0 TO LEN(T$) DO
D%[0,J%]=J%
END FOR
FOR J%=1 TO LEN(T$) DO
FOR I%=1 TO LEN(S$) DO
IF MID$(S$,I%,1)=MID$(T$,J%,1) THEN
D%[I%,J%]=D%[I%-1,J%-1]
ELSE
M%=D%[I%-1,J%-1]
IF D%[I%,J%-1]<M% THEN M%=D%[I%,J%-1] END IF
IF D%[I%-1,J%]<M% THEN M%=D%[I%-1,J%] END IF
D%[I%,J%]=M%+1
END IF
END FOR
END FOR
RES%=D%[I%-1,J%-1]
END PROCEDURE
BEGIN
S$="kitten" T$="sitting"
PRINT("'";S$;"' -> '";T$;"' has distance ";)
!$DIM D%[LEN(S$),LEN(T$)]
LEVENSHTEIN(S$,T$->RES%)
PRINT(RES%)
!$ERASE D%
S$="rosettacode" T$="raisethysword"
PRINT("'";S$;"' -> '";T$;"' has distance ";)
!$DIM D%[LEN(S$),LEN(T$)]
LEVENSHTEIN(S$,T$->RES%)
PRINT(RES%)
!$ERASE D%
END PROGRAM
- Output:
'kitten' -> 'sitting' has distance 3 'rosettacode' -> 'raisethysword' has distance 8
Euphoria
Code by Jeremy Cowgar from the Euphoria File Archive.
function min(sequence s)
atom m
m = s[1]
for i = 2 to length(s) do
if s[i] < m then
m = s[i]
end if
end for
return m
end function
function levenshtein(sequence s1, sequence s2)
integer n, m
sequence d
n = length(s1) + 1
m = length(s2) + 1
if n = 1 then
return m-1
elsif m = 1 then
return n-1
end if
d = repeat(repeat(0, m), n)
for i = 1 to n do
d[i][1] = i-1
end for
for j = 1 to m do
d[1][j] = j-1
end for
for i = 2 to n do
for j = 2 to m do
d[i][j] = min({
d[i-1][j] + 1,
d[i][j-1] + 1,
d[i-1][j-1] + (s1[i-1] != s2[j-1])
})
end for
end for
return d[n][m]
end function
? levenshtein("kitten", "sitting")
? levenshtein("rosettacode", "raisethysword")
- Output:
3 8
F#
Standard version
open System
let getInput (name : string) =
Console.Write ("String {0}: ", name)
Console.ReadLine ()
let levDist (strOne : string) (strTwo : string) =
let strOne = strOne.ToCharArray ()
let strTwo = strTwo.ToCharArray ()
let (distArray : int[,]) = Array2D.zeroCreate (strOne.Length + 1) (strTwo.Length + 1)
for i = 0 to strOne.Length do distArray.[i, 0] <- i
for j = 0 to strTwo.Length do distArray.[0, j] <- j
for j = 1 to strTwo.Length do
for i = 1 to strOne.Length do
if strOne.[i - 1] = strTwo.[j - 1] then distArray.[i, j] <- distArray.[i - 1, j - 1]
else
distArray.[i, j] <- List.min (
[distArray.[i-1, j] + 1;
distArray.[i, j-1] + 1;
distArray.[i-1, j-1] + 1]
)
distArray.[strOne.Length, strTwo.Length]
let stringOne = getInput "One"
let stringTwo = getInput "Two"
printf "%A" (levDist stringOne stringTwo)
Console.ReadKey () |> ignore
Recursive Version
open System
let levenshtein (s1:string) (s2:string) : int =
let s1' = s1.ToCharArray()
let s2' = s2.ToCharArray()
let rec dist l1 l2 =
match (l1,l2) with
| (l1, 0) -> l1
| (0, l2) -> l2
| (l1, l2) ->
if s1'.[l1-1] = s2'.[l2-1] then dist (l1-1) (l2-1)
else
let d1 = dist (l1 - 1) l2
let d2 = dist l1 (l2 - 1)
let d3 = dist (l1 - 1) (l2 - 1)
1 + Math.Min(d1,(Math.Min(d2,d3)))
dist s1.Length s2.Length
printfn "dist(kitten, sitting) = %i" (levenshtein "kitten" "sitting")
Factor
USING: lcs prettyprint ;
"kitten" "sitting" levenshtein .
- Output:
3
Forth
: levenshtein ( a1 n1 a2 n2 -- n3)
dup \ if either string is empty, difference
if \ is inserting all chars from the other
2>r dup
if
2dup 1- chars + c@ 2r@ 1- chars + c@ =
if
1- 2r> 1- recurse exit
else \ else try:
2dup 1- 2r@ 1- recurse -rot \ changing first letter of s to t;
2dup 2r@ 1- recurse -rot \ remove first letter of s;
1- 2r> recurse min min 1+ \ remove first letter of t,
then \ any of which is 1 edit plus
else \ editing the rest of the strings
2drop 2r> nip
then
else
2drop nip
then
;
s" kitten" s" sitting" levenshtein . cr
s" rosettacode" s" raisethysword" levenshtein . cr
Fortran
program demo_edit_distance
character(len=:),allocatable :: sources(:),targets(:)
integer,allocatable :: answers(:),expected(:)
sources=[character(len=20) :: "kitten", "rosettacode", "Saturday", "sleep", "qwerty", "Fortran" ]
targets=[character(len=20) :: "sitting", "raisethysword", "Sunday", "fleeting", "qweryt", "Fortran" ]
expected=[ 3, 8, 3, 5, 2, 0 ]
! calculate answers
answers=edit_distance(sources,targets)
! print inputs, answers, and confirmation
do i=1, size(sources)
write(*,'(*(g0,1x))') sources(i), targets(i), answers(i), answers(i) == expected(i)
enddo
! a longer test
write(*,*)edit_distance("here's a bunch of words", "to wring out this code")==18
contains
pure elemental integer function edit_distance (source,target)
!! The Levenshtein distance function returns how many edits (deletions,
!! insertions, transposition) are required to turn one string into another.
character(len=*), intent(in) :: source, target
integer :: len_source, len_target, i, j, cost
integer :: matrix(0:len_trim(source), 0:len_trim(target))
len_source = len_trim(source)
len_target = len_trim(target)
matrix(:,0) = [(i,i=0,len_source)]
matrix(0,:) = [(j,j=0,len_target)]
do i = 1, len_source
do j = 1, len_target
cost=merge(0,1,source(i:i)==target(j:j))
matrix(i,j) = min(matrix(i-1,j)+1, matrix(i,j-1)+1, matrix(i-1,j-1)+cost)
enddo
enddo
edit_distance = matrix(len_source,len_target)
end function edit_distance
end program demo_edit_distance
- Output:
kitten sitting 3 T rosettacode raisethysword 8 T Saturday Sunday 3 T sleep fleeting 5 T qwerty qweryt 2 T Fortran Fortran 0 T T
FreeBASIC
' FB 1.05.0 Win64
' Uses the "iterative with two matrix rows" algorithm
' referred to in the Wikipedia article.
Function min(x As Integer, y As Integer) As Integer
Return IIf(x < y, x, y)
End Function
Function levenshtein(s As String, t As String) As Integer
' degenerate cases
If s = t Then Return 0
If s = "" Then Return Len(t)
If t = "" Then Return Len(s)
' create two integer arrays of distances
Dim v0(0 To Len(t)) As Integer '' previous
Dim v1(0 To Len(t)) As Integer '' current
' initialize v0
For i As Integer = 0 To Len(t)
v0(i) = i
Next
Dim cost As Integer
For i As Integer = 0 To Len(s) - 1
' calculate v1 from v0
v1(0) = i + 1
For j As Integer = 0 To Len(t) - 1
cost = IIf(s[i] = t[j], 0, 1)
v1(j + 1) = min(v1(j) + 1, min(v0(j + 1) + 1, v0(j) + cost))
Next j
' copy v1 to v0 for next iteration
For j As Integer = 0 To Len(t)
v0(j) = v1(j)
Next j
Next i
Return v1(Len(t))
End Function
Print "'kitten' to 'sitting' => "; levenshtein("kitten", "sitting")
Print "'rosettacode' to 'raisethysword' => "; levenshtein("rosettacode", "raisethysword")
Print "'sleep' to 'fleeting' => "; levenshtein("sleep", "fleeting")
Print
Print "Press any key to quit"
Sleep
- Output:
'kitten' to 'sitting' => 3 'rosettacode' to 'raisethysword' => 8 'sleep' to 'fleeting' => 5
Frink
Frink has a built-in function to calculate the Levenshtein edit distance between two strings:
println[editDistance["kitten","sitting"]]
It also has a function to calculate the Levenshtein-Damerau edit distance, editDistanceDamerau[str1,str2]
. This is similar to the editDistance
function but also allows swaps between two adjoining characters, which count as an edit distance of 1. This may make distances between some strings shorter, by say, treating transposition errors in a word as a less expensive operation than in the pure Levenshtein algorithm, and is generally more useful in more circumstances.
FutureBasic
Recursive version.
local fn LevenshteinDistance( s1 as CFStringRef, s2 as CFStringRef ) as NSInteger
NSInteger result
// If strings are equal, Levenshtein distance is 0
if ( fn StringIsEqual( s1, s2 ) ) then result = 0 : exit fn
// If either string is empty, then distance is the length of the other string.
if ( len(s1) == 0) then result = len(s2) : exit fn
if ( len(s2) == 0) then result = len(s1) : exit fn
// The remaining recursive process uses first characters and remainder of each string.
CFStringRef s1First = fn StringSubstringToIndex( s1, 1 )
CFStringRef s2First = fn StringSubstringToIndex( s2, 1 )
CFStringRef s1Rest = mid( s1, 1, len(s1) -1 )
CFStringRef s2Rest = mid( s2, 1, len(s2) -1 )
// If leading characters are the same, then distance is that between the rest of the strings.
if fn StringIsEqual( s1First, s2First ) then result = fn LevenshteinDistance( s1Rest, s2Rest ) : exit fn
// Find the distances between sub strings.
NSInteger distA = fn LevenshteinDistance( s1Rest, s2 )
NSInteger distB = fn LevenshteinDistance( s1, s2Rest )
NSInteger distC = fn LevenshteinDistance( s1Rest, s2Rest )
// Return the minimum distance between substrings.
NSInteger minDist = distA
if ( distB < minDist ) then minDist = distB
if ( distC < minDist ) then minDist = distC
result = minDist + 1 // Include change for the first character.
end fn = result
NSInteger i
CFStringRef testStr( 6, 2 )
testStr( 0, 0 ) = @"kitten" : testStr( 0, 1 ) = @"sitting"
testStr( 1, 0 ) = @"rosettacode" : testStr( 1, 1 ) = @"raisethysword"
testStr( 2, 0 ) = @"Saturday" : testStr( 2, 1 ) = @"Sunday"
testStr( 3, 0 ) = @"FutureBasic" : testStr( 3, 1 ) = @"FutureBasic"
testStr( 4, 0 ) = @"rave" : testStr( 4, 1 ) = @"ravel"
testStr( 5, 0 ) = @"black" : testStr( 5, 1 ) = @"slack"
testStr( 6, 0 ) = @"rave" : testStr( 6, 1 ) = @"grave"
for i = 0 to 6
print @"1st string = "; testStr( i, 0 )
print @"2nd string = "; testStr( i, 1 )
print @"Levenshtein distance = "; fn LevenshteinDistance( testStr( i, 0 ), testStr( i, 1 ) )
print
next
HandleEvents
- Output:
1st string = kitten 2nd string = sitting Levenshtein distance = 3 1st string = rosettacode 2nd string = raisethysword Levenshtein distance = 8 1st string = Saturday 2nd string = Sunday Levenshtein distance = 3 1st string = FutureBasic 2nd string = FutureBasic Levenshtein distance = 0 1st string = rave 2nd string = ravel Levenshtein distance = 1 1st string = black 2nd string = slack Levenshtein distance = 1 1st string = rave 2nd string = grave Levenshtein distance = 1
Go
WP algorithm:
package main
import "fmt"
func main() {
fmt.Println(ld("kitten", "sitting"))
}
func ld(s, t string) int {
d := make([][]int, len(s)+1)
for i := range d {
d[i] = make([]int, len(t)+1)
}
for i := range d {
d[i][0] = i
}
for j := range d[0] {
d[0][j] = j
}
for j := 1; j <= len(t); j++ {
for i := 1; i <= len(s); i++ {
if s[i-1] == t[j-1] {
d[i][j] = d[i-1][j-1]
} else {
min := d[i-1][j]
if d[i][j-1] < min {
min = d[i][j-1]
}
if d[i-1][j-1] < min {
min = d[i-1][j-1]
}
d[i][j] = min + 1
}
}
}
return d[len(s)][len(t)]
}
- Output:
3
package main
import "fmt"
func levenshtein(s, t string) int {
if s == "" { return len(t) }
if t == "" { return len(s) }
if s[0] == t[0] {
return levenshtein(s[1:], t[1:])
}
a := levenshtein(s[1:], t[1:])
b := levenshtein(s, t[1:])
c := levenshtein(s[1:], t)
if a > b { a = b }
if a > c { a = c }
return a + 1
}
func main() {
s1 := "rosettacode"
s2 := "raisethysword"
fmt.Printf("distance between %s and %s: %d\n", s1, s2,
levenshtein(s1, s2))
}
- Output:
distance between rosettacode and raisethysword: 8
Groovy
def distance(String str1, String str2) {
def dist = new int[str1.size() + 1][str2.size() + 1]
(0..str1.size()).each { dist[it][0] = it }
(0..str2.size()).each { dist[0][it] = it }
(1..str1.size()).each { i ->
(1..str2.size()).each { j ->
dist[i][j] = [dist[i - 1][j] + 1, dist[i][j - 1] + 1, dist[i - 1][j - 1] + ((str1[i - 1] == str2[j - 1]) ? 0 : 1)].min()
}
}
return dist[str1.size()][str2.size()]
}
[ ['kitten', 'sitting']: 3,
['rosettacode', 'raisethysword']: 8,
['edocattesor', 'drowsyhtesiar']: 8 ].each { key, dist ->
println "Checking distance(${key[0]}, ${key[1]}) == $dist"
assert distance(key[0], key[1]) == dist
}
- Output:
Checking distance(kitten, sitting) == 3 Checking distance(rosettacode, raisethysword) == 8 Checking distance(edocattesor, drowsyhtesiar) == 8
Haskell
Implementation 1
levenshtein :: Eq a => [a] -> [a] -> Int
levenshtein s1 s2 = last $ foldl transform [0 .. length s1] s2
where
transform ns@(n:ns1) c = scanl calc (n + 1) $ zip3 s1 ns ns1
where
calc z (c1, x, y) = minimum [y + 1, z + 1, x + fromEnum (c1 /= c)]
main :: IO ()
main = print (levenshtein "kitten" "sitting")
- Output:
3
Implementation 2
levenshtein :: Eq a => [a] -> [a] -> Int
levenshtein s1 [] = length s1
levenshtein [] s2 = length s2
levenshtein s1@(s1Head:s1Tail) s2@(s2Head:s2Tail)
| s1Head == s2Head = levenshtein s1Tail s2Tail
| otherwise = 1 + minimum [score1, score2, score3]
where score1 = levenshtein s1Tail s2Tail
score2 = levenshtein s1 s2Tail
score3 = levenshtein s1Tail s2
main :: IO ()
main = print (levenshtein "kitten" "sitting")
- Output:
3
Icon and Unicon
procedure main()
every process(!&input)
end
procedure process(s)
s ? (s1 := tab(upto(' \t')), s2 := (tab(many(' \t')), tab(0))) | fail
write("'",s1,"' -> '",s2,"' = ", levenshtein(s1,s2))
end
procedure levenshtein(s, t)
if (n := *s+1) = 1 then return *t
if (m := *t+1) = 1 then return *s
every !(d := list(n,0)) := list(m, 0)
every i := 1 to max(n,m) do d[i,1] := d[1,i] := i
every d[1(i := 2 to n, s_i := s[iM1 := i-1]), j := 2 to m] :=
min(d[iM1,j], d[i,jM1:=j-1],
d[iM1,jM1] + if s_i == t[jM1] then -1 else 0) + 1
return d[n,m]-1
end
- Example:
->leven kitten sitting 'kitten' -> 'sitting' = 3 ->
Io
A levenshtein
method is available on strings when the standard Range
addon is loaded.
Io 20110905
Io> Range ; "kitten" levenshtein("sitting")
==> 3
Io> "rosettacode" levenshtein("raisethysword")
==> 8
Io>
IS-BASIC
100 PROGRAM "Levensht.bas"
110 LET S1$="kitten":LET S2$="sitting"
120 PRINT "The Levenshtein distance between """;S1$;""" and """;S2$;""" is:";LEVDIST(S1$,S2$)
130 DEF LEVDIST(S$,T$)
140 LET N=LEN(T$):LET M=LEN(S$)
150 NUMERIC D(0 TO M,0 TO N)
160 FOR I=0 TO M
170 LET D(I,0)=I
180 NEXT
190 FOR J=0 TO N
200 LET D(0,J)=J
210 NEXT
220 FOR J=1 TO N
230 FOR I=1 TO M
240 IF S$(I)=T$(J) THEN
250 LET D(I,J)=D(I-1,J-1)
260 ELSE
270 LET D(I,J)=MIN(D(I-1,J)+1,MIN(D(I,J-1)+1,D(I-1,J-1)+1))
280 END IF
290 NEXT
300 NEXT
310 LET LEVDIST=D(M,N)
320 END DEF
J
One approach would be a literal transcription of the wikipedia implementation:
levenshtein=:4 :0
D=. x +/&i.&>:&# y
for_i.1+i.#x do.
for_j.1+i.#y do.
if. ((<:i){x)=(<:j){y do.
D=.(D {~<<:i,j) (<i,j)} D
else.
min=. 1+<./D{~(i,j) <@:-"1#:1 2 3
D=. min (<i,j)} D
end.
end.
end.
{:{:D
)
First, we setup up an matrix of costs, with 0 or 1 for unexplored cells (1 being where the character pair corresponding to that cell position has two different characters). Note that the "cost to reach the empty string" cells go on the bottom and the right, instead of the top and the left, because this works better with J's "insert" operation (which we will call "reduce" in the next paragraph here. It could also be thought of as a right fold which has been constrained such the initial value is the identity value for the operation. Or, just think of it as inserting its operation between each item of its argument...).
Then we reduce the rows of that matrix using an operation that treats those two rows as columns and reduces the rows of this derived matrix with an operation that gives the (unexplored cell + the minumum of the other cells) followed by (the cell adjacent to the previously unexplored cell.
However, this is a rather slow and bulky approach.
We can also do the usual optimization where we only represent one row of the distance matrix at a time:
levdist =:4 :0
'a b'=. (x;y) /: (#x),(#y)
D=. >: iz =. i.#b
for_j. a do.
D=. <./\&.(-&iz) (>: D) <. (j ~: b) + |.!.j_index D
end.
{:D
)
- Example use:
'kitten' levenshtein 'sitting'
3
'kitten' levdist 'sitting'
3
Time and space use:
timespacex '''kitten'' levenshtein ''sitting'''
0.000113 6016
timespacex '''kitten'' levdist ''sitting'''
1.5e_5 4416
So the levdist implementation is about 7 times faster for this example (which approximately corresponds to the length of the shortest string) See the Levenshtein distance essay on the Jwiki for additional solutions.
Java
Based on the definition for Levenshtein distance given in the Wikipedia article:
public class Levenshtein {
public static int distance(String a, String b) {
a = a.toLowerCase();
b = b.toLowerCase();
// i == 0
int [] costs = new int [b.length() + 1];
for (int j = 0; j < costs.length; j++)
costs[j] = j;
for (int i = 1; i <= a.length(); i++) {
// j == 0; nw = lev(i - 1, j)
costs[0] = i;
int nw = i - 1;
for (int j = 1; j <= b.length(); j++) {
int cj = Math.min(1 + Math.min(costs[j], costs[j - 1]), a.charAt(i - 1) == b.charAt(j - 1) ? nw : nw + 1);
nw = costs[j];
costs[j] = cj;
}
}
return costs[b.length()];
}
public static void main(String [] args) {
String [] data = { "kitten", "sitting", "saturday", "sunday", "rosettacode", "raisethysword" };
for (int i = 0; i < data.length; i += 2)
System.out.println("distance(" + data[i] + ", " + data[i+1] + ") = " + distance(data[i], data[i+1]));
}
}
- Output:
distance(kitten, sitting) = 3 distance(saturday, sunday) = 3 distance(rosettacode, raisethysword) = 8
public class Levenshtein{
public static int levenshtein(String s, String t){
/* if either string is empty, difference is inserting all chars
* from the other
*/
if(s.length() == 0) return t.length();
if(t.length() == 0) return s.length();
/* if first letters are the same, the difference is whatever is
* required to edit the rest of the strings
*/
if(s.charAt(0) == t.charAt(0))
return levenshtein(s.substring(1), t.substring(1));
/* else try:
* changing first letter of s to that of t,
* remove first letter of s, or
* remove first letter of t
*/
int a = levenshtein(s.substring(1), t.substring(1));
int b = levenshtein(s, t.substring(1));
int c = levenshtein(s.substring(1), t);
if(a > b) a = b;
if(a > c) a = c;
//any of which is 1 edit plus editing the rest of the strings
return a + 1;
}
public static void main(String[] args) {
String s1 = "kitten";
String s2 = "sitting";
System.out.println("distance between '" + s1 + "' and '"
+ s2 + "': " + levenshtein(s1, s2));
s1 = "rosettacode";
s2 = "raisethysword";
System.out.println("distance between '" + s1 + "' and '"
+ s2 + "': " + levenshtein(s1, s2));
StringBuilder sb1 = new StringBuilder(s1);
StringBuilder sb2 = new StringBuilder(s2);
System.out.println("distance between '" + sb1.reverse() + "' and '"
+ sb2.reverse() + "': "
+ levenshtein(sb1.reverse().toString(), sb2.reverse().toString()));
}
}
- Output:
distance between 'kitten' and 'sitting': 3 distance between 'rosettacode' and 'raisethysword': 8 distance between 'edocattesor' and 'drowsyhtesiar': 8
Iterative space optimized (even bounded)
import static java.lang.Math.abs;
import static java.lang.Math.max;
public class Levenshtein {
public static int ld(String a, String b) {
return distance(a, b, -1);
}
public static boolean ld(String a, String b, int max) {
return distance(a, b, max) <= max;
}
private static int distance(String a, String b, int max) {
if (a == b) return 0;
int la = a.length();
int lb = b.length();
if (max >= 0 && abs(la - lb) > max) return max+1;
if (la == 0) return lb;
if (lb == 0) return la;
if (la < lb) {
int tl = la; la = lb; lb = tl;
String ts = a; a = b; b = ts;
}
int[] cost = new int[lb+1];
for (int i=0; i<=lb; i+=1) {
cost[i] = i;
}
for (int i=1; i<=la; i+=1) {
cost[0] = i;
int prv = i-1;
int min = prv;
for (int j=1; j<=lb; j+=1) {
int act = prv + (a.charAt(i-1) == b.charAt(j-1) ? 0 : 1);
cost[j] = min(1+(prv=cost[j]), 1+cost[j-1], act);
if (prv < min) min = prv;
}
if (max >= 0 && min > max) return max+1;
}
if (max >= 0 && cost[lb] > max) return max+1;
return cost[lb];
}
private static int min(int ... a) {
int min = Integer.MAX_VALUE;
for (int i: a) if (i<min) min = i;
return min;
}
public static void main(String[] args) {
System.out.println(
ld("kitten","kitten") + " " + // 0
ld("kitten","sitten") + " " + // 1
ld("kitten","sittes") + " " + // 2
ld("kitten","sityteng") + " " + // 3
ld("kitten","sittYing") + " " + // 4
ld("rosettacode","raisethysword") + " " + // 8
ld("kitten","kittenaaaaaaaaaaaaaaaaa") + " " + // 17
ld("kittenaaaaaaaaaaaaaaaaa","kitten") // 17
);
System.out.println(
ld("kitten","kitten", 3) + " " + // true
ld("kitten","sitten", 3) + " " + // true
ld("kitten","sittes", 3) + " " + // true
ld("kitten","sityteng", 3) + " " + // true
ld("kitten","sittYing", 3) + " " + // false
ld("rosettacode","raisethysword", 3) + " " + // false
ld("kitten","kittenaaaaaaaaaaaaaaaaa", 3) + " " + // false
ld("kittenaaaaaaaaaaaaaaaaa","kitten", 3) // false
);
}
}
- Output:
0 1 2 3 4 8 17 17 true true true true false false false false
JavaScript
ES5
Iterative:
function levenshtein(a, b) {
var t = [], u, i, j, m = a.length, n = b.length;
if (!m) { return n; }
if (!n) { return m; }
for (j = 0; j <= n; j++) { t[j] = j; }
for (i = 1; i <= m; i++) {
for (u = [i], j = 1; j <= n; j++) {
u[j] = a[i - 1] === b[j - 1] ? t[j - 1] : Math.min(t[j - 1], t[j], u[j - 1]) + 1;
} t = u;
} return u[n];
}
// tests
[ ['', '', 0],
['yo', '', 2],
['', 'yo', 2],
['yo', 'yo', 0],
['tier', 'tor', 2],
['saturday', 'sunday', 3],
['mist', 'dist', 1],
['tier', 'tor', 2],
['kitten', 'sitting', 3],
['stop', 'tops', 2],
['rosettacode', 'raisethysword', 8],
['mississippi', 'swiss miss', 8]
].forEach(function(v) {
var a = v[0], b = v[1], t = v[2], d = levenshtein(a, b);
if (d !== t) {
console.log('levenstein("' + a + '","' + b + '") was ' + d + ' should be ' + t);
}
});
ES6
By composition of generic functions:
(() => {
"use strict";
// ------------ LEVENSHTEIN EDIT DISTANCE ------------
// levenshtein :: String -> String -> Int
const levenshtein = sa =>
// The Levenshtein edit distance
// between two given strings.
sb => {
const cs = [...sa];
const go = (ns, c) => {
const calc = z => tpl => {
const [c1, x, y] = Array.from(tpl);
return Math.min(
1 + y,
1 + z,
x + (
c1 === c
? 0
: 1
)
);
};
const [n, ...ns1] = ns;
return scanl(calc)(1 + n)(
zip3(cs)(ns)(ns1)
);
};
return last(
[...sb].reduce(
go,
enumFromTo(0)(cs.length)
)
);
};
// ---------------------- TEST -----------------------
const main = () => [
["kitten", "sitting"],
["sitting", "kitten"],
["rosettacode", "raisethysword"],
["raisethysword", "rosettacode"]
].map(uncurry(levenshtein));
// ---------------- GENERIC FUNCTIONS ----------------
// enumFromTo :: Int -> Int -> [Int]
const enumFromTo = m =>
n => Array.from({
length: 1 + n - m
}, (_, i) => m + i);
// last :: [a] -> a
const last = xs => {
// The last item of a list.
const n = xs.length;
return 0 < n
? xs[n - 1]
: null;
};
// scanl :: (b -> a -> b) -> b -> [a] -> [b]
const scanl = f =>
// The series of interim values arising
// from a catamorphism. Parallel to foldl.
startValue => xs =>
xs.reduce(
(a, x) => {
const v = f(a[0])(x);
return [v, a[1].concat(v)];
}, [startValue, [startValue]]
)[1];
// uncurry :: (a -> b -> c) -> ((a, b) -> c)
const uncurry = f =>
// A function over a pair, derived
// from a curried function.
(...args) => {
const
[x, y] = Boolean(args.length % 2)
? args[0]
: args;
return f(x)(y);
};
// zip3 :: [a] -> [b] -> [c] -> [(a, b, c)]
const zip3 = xs =>
ys => zs => xs.slice(
0,
Math.min(...[xs, ys, zs].map(x => x.length))
)
.map((x, i) => [x, ys[i], zs[i]]);
// MAIN ---
return JSON.stringify(main());
})();
- Output:
[3, 3, 8, 8]
jq
Works with gojq, the Go implementation of jq
The main point of interest about the following implementation is that it shows how the naive recursive algorithm can be tweaked within a completely functional framework to yield an efficient implementation.
Performance: Here is a breakdown of the run-times on a 2.53GHz machine:
9ms overhead (invoking jq and compiling the program) 17ms for kitten/sitting 67ms for rosettacode/raisethysword 71ms for edocattesor/drowsyhtesiar
# lookup the distance between s and t in the nested cache,
# which uses basic properties of the Levenshtein distance to save space:
def lookup(s;t):
if (s == t) then 0
elif (s|length) == 0 then (t|length)
elif (t|length) == 0 then (s|length)
elif (s|length) > (t|length) then
.[t] as $t | if $t then $t[s] else null end
else .[s] as $s | if $s then $s[t] else null end
end ;
# output is the updated cache;
# basic properties of the Levenshtein distance are used to save space:
def store(s;t;value):
if (s == t) then .
else (s|length) as $s | (t|length) as $t
| if $s == 0 or $t == 0 then .
elif $s < $t then .[s][t] = value
elif $t < $s then .[t][s] = value
else (.[s][t] = value) | (.[t][s] = value)
end
end ;
# Input is a cache of nested objects; output is [distance, cache]
def ld(s1; s2):
# emit [distance, cache]
# input: cache
def cached_ld(s;t):
lookup(s;t) as $check
| if $check then [ $check, . ]
else ld(s;t)
end
;
# If either string is empty,
# then distance is insertion of the other's characters.
if (s1|length) == 0 then [(s2|length), .]
elif (s2|length) == 0 then [(s1|length), .]
elif (s1[0:1] == s2[0:1]) then
cached_ld(s1[1:]; s2[1:])
else
cached_ld(s1[1:]; s2) as $a
| ($a[1] | cached_ld(s1; s2[1:])) as $b
| ($b[1] | cached_ld(s1[1:]; s2[1:])) as $c
| [$a[0], $b[0], $c[0]] | (min + 1) as $d
| [$d, ($c[1] | store(s1;s2;$d)) ]
end ;
def levenshteinDistance(s;t):
s as $s | t as $t | {} | ld($s;$t) | .[0];
Task
def demo:
"levenshteinDistance between \(.[0]) and \(.[1]) is \( levenshteinDistance(.[0]; .[1]) )";
(["kitten", "sitting"] | demo),
(["rosettacode","raisethysword"] | demo),
(["edocattesor", "drowsyhtesiar"] | demo),
(["this_algorithm_is_similar_to",
"Damerau-Levenshtein_distance"] | demo)
- Output:
levenshteinDistance between kitten and sitting is 3 levenshteinDistance between rosettacode and raisethysword is 8 levenshteinDistance between edocattesor and drowsyhtesiar is 8 levenshteinDistance between this_algorithm_is_similar_to and Damerau-Levenshtein_distance is 24
Jsish
From Javascript, ES5 entry.
/* Levenshtein Distance, in Jsish */
function levenshtein(a, b) {
var t = [], u, i, j, m = a.length, n = b.length;
if (!m) { return n; }
if (!n) { return m; }
for (j = 0; j <= n; j++) { t[j] = j; }
for (i = 1; i <= m; i++) {
for (u = [i], j = 1; j <= n; j++) {
u[j] = a[i - 1] === b[j - 1] ? t[j - 1] : Math.min(t[j - 1], t[j], u[j - 1]) + 1;
} t = u;
} return u[n];
}
provide('levenshtein', 1);
;levenshtein('', '');
;levenshtein('yo', '');
;levenshtein('', 'yo');
;levenshtein('yo', 'yo');
;levenshtein('tier', 'tor');
;levenshtein('saturday', 'sunday');
;levenshtein('mist', 'dist');
;levenshtein('tier', 'tor');
;levenshtein('kitten', 'sitting');
;levenshtein('stop', 'tops');
;levenshtein('rosettacode', 'raisethysword');
;levenshtein('mississippi', 'swiss miss');
/*
=!EXPECTSTART!=
levenshtein('', '') ==> 0
levenshtein('yo', '') ==> 2
levenshtein('', 'yo') ==> 2
levenshtein('yo', 'yo') ==> 0
levenshtein('tier', 'tor') ==> 2
levenshtein('saturday', 'sunday') ==> 3
levenshtein('mist', 'dist') ==> 1
levenshtein('tier', 'tor') ==> 2
levenshtein('kitten', 'sitting') ==> 3
levenshtein('stop', 'tops') ==> 2
levenshtein('rosettacode', 'raisethysword') ==> 8
levenshtein('mississippi', 'swiss miss') ==> 8
=!EXPECTEND!=
*/
- Output:
prompt$ jsish -u levenshtein.jsi [PASS] levenshtein.jsi
Julia
Recursive:
function levendist(s::AbstractString, t::AbstractString)
ls, lt = length.((s, t))
ls == 0 && return lt
lt == 0 && return ls
s₁, t₁ = s[2:end], t[2:end]
ld₁ = levendist(s₁, t₁)
s[1] == t[1] ? ld₁ : 1 + min(ld₁, levendist(s, t₁), levendist(s₁, t))
end
@show levendist("kitten", "sitting") # 3
@show levendist("rosettacode", "raisethysword") # 8
Iterative:
function levendist1(s::AbstractString, t::AbstractString)
ls, lt = length(s), length(t)
if ls > lt
s, t = t, s
ls, lt = lt, ls
end
dist = collect(0:ls)
for (ind2, chr2) in enumerate(t)
newdist = Vector{Int}(ls+1)
newdist[1] = ind2
for (ind1, chr1) in enumerate(s)
if chr1 == chr2
newdist[ind1+1] = dist[ind1]
else
newdist[ind1+1] = 1 + min(dist[ind1], dist[ind1+1], newdist[end])
end
end
dist = newdist
end
return dist[end]
end
Let's see some benchmark:
using BenchmarkTools
println("\n# levendist(kitten, sitting)")
s, t = "kitten", "sitting"
println(" - Recursive:")
@btime levendist(s, t)
println(" - Iterative:")
@btime levendist1(s, t)
println("\n# levendist(rosettacode, raisethysword)")
s, t = "rosettacode", "raisethysword"
println(" - Recursive:")
@btime levendist(s, t)
println(" - Iterative:")
@btime levendist1(s, t)
- Output:
# levendist(kitten, sitting) - Recursive: 78.788 μs (1103 allocations: 34.47 KiB) - Iterative: 494.376 ns (9 allocations: 1.16 KiB) # levendist(rosettacode, raisethysword) - Recursive: 317.817 ms (3468524 allocations: 105.85 MiB) - Iterative: 1.168 μs (15 allocations: 2.44 KiB)
Kotlin
Standard Version
// version 1.0.6
// Uses the "iterative with two matrix rows" algorithm referred to in the Wikipedia article.
fun levenshtein(s: String, t: String): Int {
// degenerate cases
if (s == t) return 0
if (s == "") return t.length
if (t == "") return s.length
// create two integer arrays of distances and initialize the first one
val v0 = IntArray(t.length + 1) { it } // previous
val v1 = IntArray(t.length + 1) // current
var cost: Int
for (i in 0 until s.length) {
// calculate v1 from v0
v1[0] = i + 1
for (j in 0 until t.length) {
cost = if (s[i] == t[j]) 0 else 1
v1[j + 1] = Math.min(v1[j] + 1, Math.min(v0[j + 1] + 1, v0[j] + cost))
}
// copy v1 to v0 for next iteration
for (j in 0 .. t.length) v0[j] = v1[j]
}
return v1[t.length]
}
fun main(args: Array<String>) {
println("'kitten' to 'sitting' => ${levenshtein("kitten", "sitting")}")
println("'rosettacode' to 'raisethysword' => ${levenshtein("rosettacode", "raisethysword")}")
println("'sleep' to 'fleeting' => ${levenshtein("sleep", "fleeting")}")
}
- Output:
'kitten' to 'sitting' => 3 'rosettacode' to 'raisethysword' => 8 'sleep' to 'fleeting' => 5
Functional/Folding Version
fun levenshtein(s: String, t: String,
charScore : (Char, Char) -> Int = { c1, c2 -> if (c1 == c2) 0 else 1}) : Int {
// Special cases
if (s == t) return 0
if (s == "") return t.length
if (t == "") return s.length
val initialRow : List<Int> = (0 until t.length + 1).map { it }.toList()
return (0 until s.length).fold(initialRow, { previous, u ->
(0 until t.length).fold( mutableListOf(u+1), {
row, v -> row.add(minOf(row.last() + 1,
previous[v+1] + 1,
previous[v] + charScore(s[u],t[v])))
row
})
}).last()
}
- Output:
'kitten' to 'sitting' => 3 'rosettacode' to 'raisethysword' => 8 'sleep' to 'fleeting' => 5
LFE
Simple Implementation
Suitable for short strings:
(defun levenshtein-simple
(('() str)
(length str))
((str '())
(length str))
(((cons a str1) (cons b str2)) (when (== a b))
(levenshtein-simple str1 str2))
(((= (cons _ str1-tail) str1) (= (cons _ str2-tail) str2))
(+ 1 (lists:min
(list
(levenshtein-simple str1 str2-tail)
(levenshtein-simple str1-tail str2)
(levenshtein-simple str1-tail str2-tail))))))
You can copy and paste that function into an LFE REPL and run it like so:
> (levenshtein-simple "a" "a")
0
> (levenshtein-simple "a" "")
1
> (levenshtein-simple "" "a")
1
> (levenshtein-simple "kitten" "sitting")
3
It is not recommended to test strings longer than the last example using this implementation, as performance quickly degrades.
Cached Implementation
(defun levenshtein-distance (str1 str2)
(let (((tuple distance _) (levenshtein-distance
str1 str2 (dict:new))))
distance))
(defun levenshtein-distance
(((= '() str1) str2 cache)
(tuple (length str2)
(dict:store (tuple str1 str2)
(length str2)
cache)))
((str1 (= '() str2) cache)
(tuple (length str1)
(dict:store (tuple str1 str2)
(length str1)
cache)))
(((cons a str1) (cons b str2) cache) (when (== a b))
(levenshtein-distance str1 str2 cache))
(((= (cons _ str1-tail) str1) (= (cons _ str2-tail) str2) cache)
(case (dict:is_key (tuple str1 str2) cache)
('true (tuple (dict:fetch (tuple str1 str2) cache) cache))
('false (let* (((tuple l1 c1) (levenshtein-distance str1 str2-tail cache))
((tuple l2 c2) (levenshtein-distance str1-tail str2 c1))
((tuple l3 c3) (levenshtein-distance str1-tail str2-tail c2))
(len (+ 1 (lists:min (list l1 l2 l3)))))
(tuple len (dict:store (tuple str1 str2) len c3)))))))
As before, here's some usage in the REPL. Note that longer strings are now possible to compare without incurring long execution times:
> (levenshtein-distance "a" "a")
0
> (levenshtein-distance "a" "")
1
> (levenshtein-distance "" "a")
1
> (levenshtein-distance "kitten" "sitting")
3
> (levenshtein-distance "rosettacode" "raisethysword")
8
Liberty BASIC
'Levenshtein Distance translated by Brandon Parker
'08/19/10
'from http://www.merriampark.com/ld.htm#VB
'No credit was given to the Visual Basic Author on the site :-(
Print LevenshteinDistance("kitten", "sitting")
End
'Get the minum of three values
Function Minimum(a, b, c)
Minimum = Min(a, Min(b, c))
End Function
'Compute the Levenshtein Distance
Function LevenshteinDistance(string1$, string2$)
n = Len(string1$)
m = Len(string2$)
If n = 0 Then
LevenshteinDistance = m
Exit Function
End If
If m = 0 Then
LevenshteinDistance = n
Exit Function
End If
Dim d(n, m)
For i = 0 To n
d(i, 0) = i
Next i
For i = 0 To m
d(0, i) = i
Next i
For i = 1 To n
si$ = Mid$(string1$, i, 1)
For ii = 1 To m
tj$ = Mid$(string2$, ii, 1)
If si$ = tj$ Then
cost = 0
Else
cost = 1
End If
d(i, ii) = Minimum((d(i - 1, ii) + 1), (d(i, ii - 1) + 1), (d(i - 1, ii - 1) + cost))
Next ii
Next i
LevenshteinDistance = d(n, m)
End Function
Limbo
implement Levenshtein;
include "sys.m"; sys: Sys;
print: import sys;
include "draw.m";
Levenshtein: module {
init: fn(nil: ref Draw->Context, args: list of string);
# Export distance so that this module can be used as either a
# standalone program or as a library:
distance: fn(s, t: string): int;
};
init(nil: ref Draw->Context, args: list of string)
{
sys = load Sys Sys->PATH;
if(!(len args % 2)) {
sys->fprint(sys->fildes(2), "Provide an even number of arguments!\n");
raise "fail:usage";
}
args = tl args;
while(args != nil) {
(s, t) := (hd args, hd tl args);
args = tl tl args;
print("%s <-> %s => %d\n", s, t, distance(s, t));
}
}
distance(s, t: string): int
{
if(s == "")
return len t;
if(t == "")
return len s;
if(s[0] == t[0])
return distance(s[1:], t[1:]);
a := distance(s[1:], t);
b := distance(s, t[1:]);
c := distance(s[1:], t[1:]);
if(a > b)
a = b;
if(a > c)
a = c;
return a + 1;
}
- Output:
% levenshtein kitten sitting rosettacode raisethysword kitten <-> sitting => 3 rosettacode <-> raisethysword => 8
LiveCode
//Code By Neurox66
function Levenshtein pString1 pString2
put 0 into tPosChar1
repeat for each char tChar1 in pString1
add 1 to tPosChar1
put tPosChar1 into tDistance[tPosChar1][0]
put 0 into tPosChar2
repeat for each char tChar2 in pString2
add 1 to tPosChar2
put tPosChar2 into tDistance[0][tPosChar2]
put 1 into tCost
if tChar1 = tChar2 then
put 0 into tCost
end if
put min((tDistance[tPosChar1-1][tPosChar2] + 1),(tDistance[tPosChar1][tPosChar2-1] + 1),(tDistance[tPosChar1-1][tPosChar2-1] + tCost)) into tDistance[tPosChar1][tPosChar2]
end repeat
end repeat
return tDistance[tPosChar1][tPosChar2]
end Levenshtein
put Levenshtein("kitten","sitting")
put Levenshtein("rosettacode","raisethysword")
- Output:
3 8
Lobster
def levenshtein(s: string, t: string) -> int:
def makeNxM(n: int, m: int, v: int) -> [[int]]:
let d:[[int]] = vector_reserve(typeof return, n)
for(n) i:
push(d, vector_reserve(typeof return, m))
for(m) j:
push(d[i], v)
return d
let ls = s.length
let lt = t.length
let d = makeNxM(ls + 1, lt + 1, -1)
def dist (i: int, j: int) -> int:
if d[i][j] >= 0:
return d[i][j]
var x = 0
if i == ls:
x = lt - j
else: if j == lt:
x = ls - i
else: if s[i] == t[j]:
x = dist(i + 1, j + 1)
else:
x = dist(i + 1, j + 1)
x = min(x, dist(i, j + 1))
x = min(x, dist(i + 1, j))
x += 1
d[i][j] = x
return x
return dist(0,0)
assert 3 == levenshtein("kitten", "sitting")
assert 8 == levenshtein("rosettacode", "raisethysword")
Lua
function leven(s,t)
if s == '' then return t:len() end
if t == '' then return s:len() end
local s1 = s:sub(2, -1)
local t1 = t:sub(2, -1)
if s:sub(0, 1) == t:sub(0, 1) then
return leven(s1, t1)
end
return 1 + math.min(
leven(s1, t1),
leven(s, t1),
leven(s1, t )
)
end
print(leven("kitten", "sitting"))
print(leven("rosettacode", "raisethysword"))
- Output:
3 8
M2000 Interpreter
Module Checkit {
\\ Iterative with two matrix rows
function LevenshteinDistance(s$,t$) {
if len(s$)<len(t$) then swap s$, t$
n=len(t$)
m=len(s$)
dim base 0, v0(n+1), v1(n+1)
Rem dim sw() ' we can use stack of values to make the swap.
for i=0 to n : v0(i)=i:next
for i=0 to m-1
v1(0)=i+1
for j=0 to n-1
deletioncost=v0(j+1)+1
insertioncost=v1(j)+1
if mid$(s$,i+1,1)=mid$(t$,j+1,1) then
substitutionCost=v0(j)
else
substitutionCost=v0(j)+1
end if
v1(j+1)=min.data(deletionCost, insertionCost, substitutionCost)
next
Rem sw()=v0():v0()=v1():v1()=sw()
\\ when we push arrays, we only push a pointer to
\\ when we read array (identifier with parenthesis) then we get a copy
\\ between Push and Read any change on arrays included in copies
Push v0(),v1(): Read v0(),v1()
next
=v0(n)
}
Print LevenshteinDistance("kitten","sitting")=3 ' true
Print LevenshteinDistance("Sunday","Saturday")=3 ' true
Print LevenshteinDistance("rosettacode","raisethysword")=8 ' true
}
Checkit
Module Checkit2 {
\\ Iterative with two matrix rows, using pointers to arrays
function LevenshteinDistance(s$,t$) {
if len(s$)<len(t$) then swap s$, t$
n=len(t$)
m=len(s$)
dim base 0, v0(n+1), v1(n+1)
v0=v0() ' v0 is pointer to v0()
v1=v1() ' v1 is pointer to v1()
for i=0 to n : v0(i)=i:next
for i=0 to m-1
return v1, 0:=i+1
for j=0 to n-1
deletioncost=Array(v0,j+1)+1
insertioncost=Array(v1,j)+1
if mid$(s$,i+1,1)=mid$(t$,j+1,1) then
substitutionCost=Array(v0,j)
else
substitutionCost=Array(v0,j)+1
end if
return v1, j+1:=min.data(deletionCost, insertionCost, substitutionCost)
next
swap v0, v1 ' just swap pointers
next
=Array(v0,n)
}
Print LevenshteinDistance("kitten","sitting")=3
Print LevenshteinDistance("Sunday","Saturday")=3
Print LevenshteinDistance("rosettacode","raisethysword")=8
}
Checkit2
Maple
> with(StringTools):
> Levenshtein("kitten","sitting");
3
> Levenshtein("rosettacode","raisethysword");
8
Mathematica /Wolfram Language
EditDistance["kitten","sitting"]
EditDistance["rosettacode","raisethysword"]
- Output:
3 7
MATLAB
function score = levenshtein(s1, s2)
% score = levenshtein(s1, s2)
%
% Calculates the area under the ROC for a given set
% of posterior predictions and labels. Currently limited to two classes.
%
% s1: string
% s2: string
% score: levenshtein distance
%
% Author: Ben Hamner (ben@benhamner.com)
if length(s1) < length(s2)
score = levenshtein(s2, s1);
elseif isempty(s2)
score = length(s1);
else
previous_row = 0:length(s2);
for i=1:length(s1)
current_row = 0*previous_row;
current_row(1) = i;
for j=1:length(s2)
insertions = previous_row(j+1) + 1;
deletions = current_row(j) + 1;
substitutions = previous_row(j) + (s1(i) ~= s2(j));
current_row(j+1) = min([insertions, deletions, substitutions]);
end
previous_row = current_row;
end
score = current_row(end);
end
Source : [1]
MiniScript
In the Mini Micro environment, this function is part of the stringUtil library module, and can be used like so:
import "stringUtil"
print "kitten".editDistance("sitting")
In environments where the stringUtil module is not available, you'd have to define it yourself:
string.editDistance = function(s2)
n = self.len
m = s2.len
if n == 0 then return m
if m == 0 then return n
s1chars = self.split("")
s2chars = s2.split("")
d = range(0, m)
lastCost = 0
for i in range(1, n)
s1char = s1chars[i-1]
lastCost = i
jMinus1 = 0
for j in range(1, m)
if s1char == s2chars[jMinus1] then cost = 0 else cost = 1
// set nextCost to the minimum of the following three possibilities:
a = d[j] + 1
b = lastCost + 1
c = cost + d[jMinus1]
if a < b then
if c < a then nextCost = c else nextCost = a
else
if c < b then nextCost = c else nextCost = b
end if
d[jMinus1] = lastCost
lastCost = nextCost
jMinus1 = j
end for
d[m] = lastCost
end for
return nextCost
end function
print "kitten".editDistance("sitting")
- Output:
3
Modula-2
MODULE LevenshteinDistance;
FROM InOut IMPORT WriteString, WriteCard, WriteLn;
FROM Strings IMPORT Length;
PROCEDURE levenshtein(s, t: ARRAY OF CHAR): CARDINAL;
CONST MaxLen = 15;
VAR d: ARRAY [0..MaxLen],[0..MaxLen] OF CARDINAL;
lenS, lenT, i, j: CARDINAL;
PROCEDURE min(a, b: CARDINAL): CARDINAL;
BEGIN
IF a<b THEN RETURN a;
ELSE RETURN b;
END;
END min;
BEGIN
lenS := Length(s);
lenT := Length(t);
IF lenS = 0 THEN RETURN lenT;
ELSIF lenT = 0 THEN RETURN lenS;
ELSE
FOR i := 0 TO lenS DO d[i,0] := i; END;
FOR j := 0 TO lenT DO d[0,j] := j; END;
FOR i := 1 TO lenS DO
FOR j := 1 TO lenT DO
IF s[i-1] = t[j-1] THEN
d[i,j] := d[i-1,j-1];
ELSE
d[i,j] :=
min(d[i-1,j] + 1,
min(d[i,j-1] + 1, d[i-1,j-1]+1));
END;
END;
END;
RETURN d[lenS,lenT];
END;
END levenshtein;
PROCEDURE ShowDistance(s, t: ARRAY OF CHAR);
BEGIN
WriteString(s);
WriteString(" -> ");
WriteString(t);
WriteString(": ");
WriteCard(levenshtein(s, t), 0);
WriteLn();
END ShowDistance;
BEGIN
ShowDistance("kitten", "sitting");
ShowDistance("rosettacode", "raisethysword");
END LevenshteinDistance.
- Output:
kitten -> sitting: 3 rosettacode -> raisethysword: 8
NetRexx
/* NetRexx */
options replace format comments java crossref symbols nobinary
n = 0
w = ''
n = n + 1; w[0] = n; w[n] = "kitten sitting"
n = n + 1; w[0] = n; w[n] = "rosettacode raisethysword"
loop n = 1 to w[0]
say w[n].word(1) "->" w[n].word(2)":" levenshteinDistance(w[n].word(1), w[n].word(2))
end n
return
method levenshteinDistance(s, t) private static
s = s.lower
t = t.lower
m = s.length
n = t.length
-- for all i and j, d[i,j] will hold the Levenshtein distance between
-- the first i characters of s and the first j characters of t;
-- note that d has (m+1)x(n+1) values
d = 0
-- source prefixes can be transformed into empty string by
-- dropping all characters (Note, ooRexx arrays are 1-based)
loop i = 2 to m + 1
d[i, 1] = 1
end i
-- target prefixes can be reached from empty source prefix
-- by inserting every characters
loop j = 2 to n + 1
d[1, j] = 1
end j
loop j = 2 to n + 1
loop i = 2 to m + 1
if s.substr(i - 1, 1) == t.substr(j - 1, 1) then do
d[i, j] = d[i - 1, j - 1] -- no operation required
end
else do
d[i, j] = -
(d[i - 1, j] + 1).min( - -- a deletion
(d[i, j - 1] + 1)).min( - -- an insertion
(d[i - 1, j - 1] + 1)) -- a substitution
end
end i
end j
return d[m + 1, n + 1]
Output:
kitten -> sitting: 3 rosettacode -> raisethysword: 8
Nim
Nim provides a function in module "std/editdistance" to compute the Levenshtein distance between two strings containing ASCII characters only or containing UTF-8 encoded Unicode runes.
import std/editdistance
echo editDistanceAscii("kitten", "sitting")
echo editDistanceAscii("rosettacode", "raisethysword")
- Output:
3 8
Here is a translation of the Python version.
import sequtils
func min(a, b, c: int): int {.inline.} = min(a, min(b, c))
proc levenshteinDistance(s1, s2: string): int =
var (s1, s2) = (s1, s2)
if s1.len > s2.len:
swap s1, s2
var distances = toSeq(0..s1.len)
for i2, c2 in s2:
var newDistances = @[i2+1]
for i1, c1 in s1:
if c1 == c2:
newDistances.add(distances[i1])
else:
newDistances.add(1 + min(distances[i1], distances[i1+1], newDistances[newDistances.high]))
distances = newDistances
result = distances[distances.high]
echo levenshteinDistance("kitten","sitting")
echo levenshteinDistance("rosettacode","raisethysword")
Nu
'kitten' | str distance 'sitting'
- Output:
3
Oberon-2
MODULE LevesteinDistance;
IMPORT Out,Strings;
PROCEDURE Levestein(s,t:ARRAY OF CHAR):LONGINT;
CONST
maxlen = 15;
VAR
d:ARRAY maxlen,maxlen OF LONGINT;
lens,lent,i,j:LONGINT;
PROCEDURE Min(a,b:LONGINT):LONGINT;
BEGIN
IF a < b THEN RETURN a ELSE RETURN b END
END Min;
BEGIN
lens := Strings.Length(s);
lent := Strings.Length(t);
IF lens = 0 THEN RETURN lent
ELSIF lent = 0 THEN RETURN lens
ELSE
FOR i := 0 TO lens DO d[i,0] := i END;
FOR j := 0 TO lent DO d[0,j] := j END;
FOR i := 1 TO lens DO
FOR j := 1 TO lent DO
IF s[i-1] = t[j-1] THEN
d[i,j] := d[i-1,j-1]
ELSE
d[i,j] := Min((d[i-1,j] + 1),
Min(d[i,j-1] + 1, d[i-1,j-1] + 1));
END
END
END
END;
RETURN d[lens,lent];
END Levestein;
PROCEDURE ShowDistance(s,t:ARRAY OF CHAR);
BEGIN
Out.String(s);
Out.String(" -> ");
Out.String(t);
Out.String(": ");
Out.Int(Levestein(s,t),0);
Out.Ln
END ShowDistance;
BEGIN
ShowDistance("kitten", "sitting");
ShowDistance("rosettacode", "raisethysword");
END LevesteinDistance.
- Output:
kitten -> sitting: 3 rosettacode -> raisethysword: 8
Objeck
class Levenshtein {
function : Main(args : String[]) ~ Nil {
if(args->Size() = 2) {
s := args[0]; t := args[1]; d := Distance(s,t);
"{$s} -> {$t} = {$d}"->PrintLine();
};
}
function : native : Distance(s : String,t : String) ~ Int {
d := Int->New[s->Size() + 1, t->Size() + 1];
for(i := 0; i <= s->Size(); i += 1;) {
d[i,0] := i;
};
for(j := 0; j <= t->Size(); j += 1;) {
d[0,j] := j;
};
for(j := 1; j <= t->Size(); j += 1;) {
for(i := 1; i <= s->Size(); i += 1;) {
if(s->Get(i - 1) = t->Get(j - 1)) {
d[i,j] := d[i - 1, j - 1];
}
else {
d[i,j] := (d[i - 1, j] + 1)
->Min(d[i, j - 1] + 1)
->Min(d[i - 1, j - 1] + 1);
};
};
};
return d[s->Size(), t->Size()];
}
}
Objective-C
Translation of the C# code into a NSString category
@interface NSString (levenshteinDistance)
- (NSUInteger)levenshteinDistanceToString:(NSString *)string;
@end
@implementation NSString (levenshteinDistance)
- (NSUInteger)levenshteinDistanceToString:(NSString *)string {
NSUInteger sl = [self length];
NSUInteger tl = [string length];
NSUInteger *d = calloc(sizeof(*d), (sl+1) * (tl+1));
#define d(i, j) d[((j) * sl) + (i)]
for (NSUInteger i = 0; i <= sl; i++) {
d(i, 0) = i;
}
for (NSUInteger j = 0; j <= tl; j++) {
d(0, j) = j;
}
for (NSUInteger j = 1; j <= tl; j++) {
for (NSUInteger i = 1; i <= sl; i++) {
if ([self characterAtIndex:i-1] == [string characterAtIndex:j-1]) {
d(i, j) = d(i-1, j-1);
} else {
d(i, j) = MIN(d(i-1, j), MIN(d(i, j-1), d(i-1, j-1))) + 1;
}
}
}
NSUInteger r = d(sl, tl);
#undef d
free(d);
return r;
}
@end
OCaml
Translation of the pseudo-code of the Wikipedia article:
let minimum a b c =
min a (min b c)
let levenshtein_distance s t =
let m = String.length s
and n = String.length t in
(* for all i and j, d.(i).(j) will hold the Levenshtein distance between
the first i characters of s and the first j characters of t *)
let d = Array.make_matrix (m+1) (n+1) 0 in
for i = 0 to m do
d.(i).(0) <- i (* the distance of any first string to an empty second string *)
done;
for j = 0 to n do
d.(0).(j) <- j (* the distance of any second string to an empty first string *)
done;
for j = 1 to n do
for i = 1 to m do
if s.[i-1] = t.[j-1] then
d.(i).(j) <- d.(i-1).(j-1) (* no operation required *)
else
d.(i).(j) <- minimum
(d.(i-1).(j) + 1) (* a deletion *)
(d.(i).(j-1) + 1) (* an insertion *)
(d.(i-1).(j-1) + 1) (* a substitution *)
done;
done;
d.(m).(n)
;;
let test s t =
Printf.printf " %s -> %s = %d\n" s t (levenshtein_distance s t);
;;
let () =
test "kitten" "sitting";
test "rosettacode" "raisethysword";
;;
A recursive functional version
This could be made faster with memoization
let levenshtein s t =
let rec dist i j = match (i,j) with
| (i,0) -> i
| (0,j) -> j
| (i,j) ->
if s.[i-1] = t.[j-1] then dist (i-1) (j-1)
else let d1, d2, d3 = dist (i-1) j, dist i (j-1), dist (i-1) (j-1) in
1 + min d1 (min d2 d3)
in
dist (String.length s) (String.length t)
let test s t =
Printf.printf " %s -> %s = %d\n" s t (levenshtein s t)
let () =
test "kitten" "sitting";
test "rosettacode" "raisethysword";
- Output:
kitten -> sitting = 3 rosettacode -> raisethysword = 8
ooRexx
say "kitten -> sitting:" levenshteinDistance("kitten", "sitting")
say "rosettacode -> raisethysword:" levenshteinDistance("rosettacode", "raisethysword")
::routine levenshteinDistance
use arg s, t
s = s~lower
t = t~lower
m = s~length
n = t~length
-- for all i and j, d[i,j] will hold the Levenshtein distance between
-- the first i characters of s and the first j characters of t;
-- note that d has (m+1)x(n+1) values
d = .array~new(m + 1, n + 1)
-- clear all elements in d
loop i = 1 to d~dimension(1)
loop j = 1 to d~dimension(2)
d[i, j] = 0
end
end
-- source prefixes can be transformed into empty string by
-- dropping all characters (Note, ooRexx arrays are 1-based)
loop i = 2 to m + 1
d[i, 1] = 1
end
-- target prefixes can be reached from empty source prefix
-- by inserting every characters
loop j = 2 to n + 1
d[1, j] = 1
end
loop j = 2 to n + 1
loop i = 2 to m + 1
if s~subchar(i - 1) == t~subchar(j - 1) then
d[i, j] = d[i - 1, j - 1] -- no operation required
else d[i, j] = min(d[i - 1, j] + 1, - -- a deletion
d[i, j-1] + 1, - -- an insertion
d[i - 1, j - 1] + 1) -- a substitution
end
end
return d[m + 1, n + 1 ]
Output:
kitten -> sitting: 3 rosettacode -> raisethysword: 8
PARI/GP
\\ Levenshtein distance between two words
\\ 6/21/16 aev
levensDist(s1,s2)={
my(n1=#s1,n2=#s2,v1=Vecsmall(s1),v2=Vecsmall(s2),c,
n11=n1+1,n21=n2+1,t=vector(n21,z,z-1),u0=vector(n21),u=u0);
if(s1==s2, return(0)); if(!n1, return(n2)); if(!n2, return(n1));
for(i=2,n11, u=u0; u[1]=i-1;
for(j=2,n21,
if(v1[i-1]==v2[j-1], c=t[j-1], c=vecmin([t[j-1],t[j],u[j-1]])+1);
u[j]=c;
);\\fend j
t=u;
);\\fend i
print(" *** Levenshtein distance = ",t[n21]," for strings: ",s1,", ",s2);
return(t[n21]);
}
{ \\ Testing:
levensDist("kitten","sitting");
levensDist("rosettacode","raisethysword");
levensDist("Saturday","Sunday");
levensDist("oX","X");
levensDist("X","oX");
}
- Output:
*** Levenshtein distance = 3 for strings: kitten, sitting *** Levenshtein distance = 8 for strings: rosettacode, raisethysword *** Levenshtein distance = 3 for strings: Saturday, Sunday *** Levenshtein distance = 1 for strings: oX, X *** Levenshtein distance = 1 for strings: X, oX
Pascal
A fairly direct translation of the wikipedia pseudo code:
Program LevenshteinDistanceDemo(output);
uses
Math;
function LevenshteinDistance(s, t: string): longint;
var
d: array of array of integer;
i, j, n, m: integer;
begin
n := length(t);
m := length(s);
setlength(d, m+1, n+1);
for i := 0 to m do
d[i,0] := i;
for j := 0 to n do
d[0,j] := j;
for j := 1 to n do
for i := 1 to m do
if s[i] = t[j] then
d[i,j] := d[i-1,j-1]
else
d[i,j] := min(d[i-1,j] + 1, min(d[i,j-1] + 1, d[i-1,j-1] + 1));
LevenshteinDistance := d[m,n];
end;
var
s1, s2: string;
begin
s1 := 'kitten';
s2 := 'sitting';
writeln('The Levenshtein distance between "', s1, '" and "', s2, '" is: ', LevenshteinDistance(s1, s2));
s1 := 'rosettacode';
s2 := 'raisethysword';
writeln('The Levenshtein distance between "', s1, '" and "', s2, '" is: ', LevenshteinDistance(s1, s2));
end.
- Output:
The Levenshtein distance between "kitten" and "sitting" is: 3 The Levenshtein distance between "rosettacode" and "raisethysword" is: 8
Perl
Recursive algorithm, as in the C sample. You are invited to comment out the line where it says so, and see the speed difference. By the way, there's the Memoize
standard module, but it requires setting quite a few parameters to work right for this example, so I'm just showing the simple minded caching scheme here.
use List::Util qw(min);
my %cache;
sub leven {
my ($s, $t) = @_;
return length($t) if $s eq '';
return length($s) if $t eq '';
$cache{$s}{$t} //= # try commenting out this line
do {
my ($s1, $t1) = (substr($s, 1), substr($t, 1));
(substr($s, 0, 1) eq substr($t, 0, 1))
? leven($s1, $t1)
: 1 + min(
leven($s1, $t1),
leven($s, $t1),
leven($s1, $t ),
);
};
}
print leven('rosettacode', 'raisethysword'), "\n";
Iterative solution:
use List::Util qw(min);
sub leven {
my ($s, $t) = @_;
my $tl = length($t);
my $sl = length($s);
my @d = ([0 .. $tl], map { [$_] } 1 .. $sl);
foreach my $i (0 .. $sl - 1) {
foreach my $j (0 .. $tl - 1) {
$d[$i + 1][$j + 1] =
substr($s, $i, 1) eq substr($t, $j, 1)
? $d[$i][$j]
: 1 + min($d[$i][$j + 1], $d[$i + 1][$j], $d[$i][$j]);
}
}
$d[-1][-1];
}
print leven('rosettacode', 'raisethysword'), "\n";
Phix
with javascript_semantics requires("0.8.2") -- (just the use of papply() at the end) function levenshtein(string a, b) integer n = length(a), m = length(b) if n=0 then return m end if if m=0 then return n end if sequence d = repeat(repeat(0, m+1), n+1) for i=1 to n do d[i+1][1] = i for j=1 to m do d[1][j+1] = j -- next := min({ prev +substitution, deletion, or insertion }): d[i+1][j+1] = min({d[i][j]+(a[i]!=b[j]), d[i][j+1]+1, d[i+1][j]+1}) end for end for return d[$][$] end function procedure test(string s1, s2, integer expected) integer actual = levenshtein(s1,s2) if actual!=expected or actual!=levenshtein(s2,s1) or actual!=levenshtein(reverse(s1),reverse(s2)) or actual!=levenshtein(reverse(s2),reverse(s1)) then crash("oh dear...") end if printf(1,"%14q <== %d ==> %q\n",{s1,actual,s2}) end procedure constant tests = {{"kitten", "sitting", 3}, {"rosettacode", "raisethysword", 8}, {"abcdefgh", "defghabc", 6}, {"saturday", "sunday", 3}, {"sleep", "fleeting", 5}, {"", "four", 4}, {"", "", 0}} papply(false,test,tests)
- Output:
"kitten" <== 3 ==> "sitting" "rosettacode" <== 8 ==> "raisethysword" "abcdefgh" <== 6 ==> "defghabc" "saturday" <== 3 ==> "sunday" "sleep" <== 5 ==> "fleeting" "" <== 4 ==> "four" "" <== 0 ==> ""
alternative
Modelled after the Processing code, uses a single/smaller array, passes the same tests as above.
function levenshtein(string a, b) sequence costs = tagset(length(b)+1,0) for i=1 to length(a) do costs[1] = i integer newcost = i-1, pj = i, cj for j=1 to length(b) do cj = costs[j+1] pj = min({pj+1, cj+1, newcost+(a[i]!=b[j])}) costs[j+1] = pj newcost = cj end for end for return costs[$-1] end function
PHP
echo levenshtein('kitten','sitting');
echo levenshtein('rosettacode', 'raisethysword');
- Output:
3 8
Picat
Iterative
Based on the iterative algorithm at Wikipedia. Picat is 1-based so some adjustments are needed.
levenshtein(S,T) = Dist =>
M = 1+S.length,
N = 1+T.length,
D = new_array(M,N),
foreach(I in 1..M)
D[I,1] := I-1
end,
foreach(J in 1..N)
D[1,J] := J-1
end,
foreach(J in 2..N, I in 2..M)
if S[I-1] == T[J-1] then
D[I,J] := D[I-1,J-1] % no operation required
else
D[I,J] := min([D[I-1,J ] + 1, % a deletion
D[I ,J-1] + 1, % an insertion
D[I-1,J-1] + 1] % a substitution
)
end
end,
Dist = D[M,N].
Tabled recursive version
table
levenshtein_rec(S,T) = Dist =>
Dist1 = 0,
if S.length = 0 then Dist1 := T.length
elseif T.length = 0 then Dist1 := S.length
elseif S[1] = T[1] then Dist1 := levenshtein_rec(S.tail(), T.tail())
else
A = levenshtein_rec(S.tail(), T.tail()),
B = levenshtein_rec(S , T.tail()),
C = levenshtein_rec(S.tail(), T),
if A > B then
A := B
elseif A > C then
A := C
end,
Dist1 := A + 1
end,
Dist = Dist1.
Mode-directed tabling
levenshtein_mode(S,T) = Dist =>
lev(S, T, Dist).
table (+,+,min)
lev([], [], 0).
lev([X|L], [X|R], D) :- !, lev(L, R, D).
lev([_|L], [_|R], D) :- lev(L, R, H), D is H+1.
lev([_|L], R, D) :- lev(L, R, H), D is H+1.
lev(L, [_|R], D) :- lev(L, R, H), D is H+1.
Test
go =>
S = [
["kitten","sitting"],
["rosettacode","raisethysword"],
["levenshtein","levenstein"],
["saturday", "sunday"],
["stop", "tops"],
["saturday", "sunday"]
],
foreach([W1,W2] in S)
% clear the table cache
initialize_table,
println(iter=[W1,W2,levenshtein(W1,W2)]),
println(recu=[W1,W2,levenshtein_rec(W1,W2)]),
println(mode=[W1,W2,levenshtein_mode(W1,W2)]),
nl
end,
nl.
- Output:
goal = go iter = [kitten,sitting,3] recu = [kitten,sitting,3] mode = [kitten,sitting,3] iter = [rosettacode,raisethysword,8] recu = [rosettacode,raisethysword,8] mode = [rosettacode,raisethysword,8] iter = [levenshtein,levenstein,1] recu = [levenshtein,levenstein,1] mode = [levenshtein,levenstein,1] iter = [saturday,sunday,3] recu = [saturday,sunday,3] mode = [saturday,sunday,3] iter = [stop,tops,2] recu = [stop,tops,2] mode = [stop,tops,2] iter = [saturday,sunday,3] recu = [saturday,sunday,3] mode = [saturday,sunday,3]
Benchmark on larger strings
Benchmarking the methods with larger strings of random lengths (between 1 and 2000).
go2 =>
_ = random2(),
Len = 2000,
S1 = generate_string(Len),
S2 = generate_string(Len),
println(len_s1=S1.len),
println(len_s2=S2.len),
nl,
println("iter:"),
time(L1 = levenshtein(S1,S2)),
println("rec:"),
time(L2 = levenshtein_rec(S1,S2)),
println("mode:"),
time(L3 = levenshtein_mode(S1,S2)),
println(distances=[iter=L1,rec=L2,mode=L3]),
nl.
%
% Generate a random string of max length MaxLen
%
generate_string(MaxLen) = S =>
Alpha = "abcdefghijklmnopqrstuvxyz",
Len = Alpha.length,
S := [Alpha[random(1,Len)] : _ in 1..random(1,MaxLen)].
Here is sample run. The version using mode-directed tabling is clearly the fastest.
- Output:
len_s1 = 1968 len_s2 = 1529 iter: CPU time 9.039 seconds. rec: CPU time 10.439 seconds. mode: CPU time 1.402 seconds. distances=[iter = 1607,rec = 1607,mode = 1607]
PicoLisp
Translation of the pseudo-code in the Wikipedia article:
(de levenshtein (A B)
(let D
(cons
(range 0 (length A))
(mapcar
'((I) (cons I (copy A)))
(range 1 (length B)) ) )
(for (J . Y) B
(for (I . X) A
(set
(nth D (inc J) (inc I))
(if (= X Y)
(get D J I)
(inc
(min
(get D J (inc I))
(get D (inc J) I)
(get D J I) ) ) ) ) ) ) ) )
or, using 'map' to avoid list indexing:
(de levenshtein (A B)
(let D
(cons
(range 0 (length A))
(mapcar
'((I) (cons I (copy A)))
(range 1 (length B)) ) )
(map
'((B Y)
(map
'((A X P)
(set (cdr P)
(if (= (car A) (car B))
(car X)
(inc (min (cadr X) (car P) (car X))) ) ) )
A
(car Y)
(cadr Y) ) )
B
D ) ) )
- Output (both cases):
: (levenshtein (chop "kitten") (chop "sitting")) -> 3
PL/I
version 1
*process source xref attributes or(!);
lsht: Proc Options(main);
Call test('kitten' ,'sitting');
Call test('rosettacode' ,'raisethysword');
Call test('Sunday' ,'Saturday');
Call test('Vladimir_Levenshtein[1965]',
'Vladimir_Levenshtein[1965]');
Call test('this_algorithm_is_similar_to',
'Damerau-Levenshtein_distance');
Call test('','abc');
test: Proc(s,t);
Dcl (s,t) Char(*) Var;
Put Edit(' 1st string = >'!!s!!'<')(Skip,a);
Put Edit(' 2nd string = >'!!t!!'<')(Skip,a);
Put Edit('Levenshtein distance =',LevenshteinDistance(s,t))
(Skip,a,f(3));
Put Edit('')(Skip,a);
End;
LevenshteinDistance: Proc(s,t) Returns(Bin Fixed(31));
Dcl (s,t) Char(*) Var;
Dcl (sl,tl) Bin Fixed(31);
Dcl ld Bin Fixed(31);
/* for all i and j, d[i,j] will hold the Levenshtein distance between
* the first i characters of s and the first j characters of t;
* note that d has (m+1)*(n+1) values */
sl=length(s);
tl=length(t);
Begin;
Dcl d(0:sl,0:tl) Bin Fixed(31);
Dcl (i,j,ii,jj) Bin Fixed(31);
d=0;
Do i=1 To sl; /* source prefixes can be transformed into */
d(i,0)=i; /* empty string by dropping all characters */
End;
Do j=1 To tl; /* target prefixes can be reached from */
d(0,j)=j; /* empty source prefix by inserting every character*/
End;
Do j=1 To tl;
jj=j-1;
Do i=1 To sl;
ii=i-1;
If substr(s,i,1)=substr(t,j,1) Then
d(i,j)=d(ii,jj); /* no operation required */
Else
d(i,j)=1+min(d(ii,j), /* a deletion */
d(i,jj), /* an insertion */
d(ii,jj)); /* a substitution */
End;
End;
ld=d(sl,tl);
End;
Return(ld);
End;
End;
- Output:
1st string = >kitten< 2nd string = >sitting< Levenshtein distance = 3 1st string = >rosettacode< 2nd string = >raisethysword< Levenshtein distance = 8 1st string = >Sunday< 2nd string = >Saturday< Levenshtein distance = 3 1st string = >Vladimir_Levenshtein[1965]< 2nd string = >Vladimir_Levenshtein[1965]< Levenshtein distance = 0 1st string = >this_algorithm_is_similar_to< 2nd string = >Damerau-Levenshtein_distance< Levenshtein distance = 24 1st string = >< 2nd string = >abc< Levenshtein distance = 3
version 2 recursive with memoization
*process source attributes xref or(!);
ld3: Proc Options(main);
Dcl ld(0:30,0:30) Bin Fixed(31);
call test('kitten' ,'sitting');
call test('rosettacode' ,'raisethysword');
call test('Sunday' ,'Saturday');
call test('Vladimir_Levenshtein[1965]',
'Vladimir_Levenshtein[1965]');
call test('this_algorithm_is_similar_to',
'Damerau-Levenshtein_distance');
call test('','abc');
test: Proc(s,t);
Dcl (s,t) Char(*);
ld=-1;
Put Edit(' 1st string = >'!!s!!'<')(Skip,a);
Put Edit(' 2nd string = >'!!t!!'<')(Skip,a);
Put Edit('Levenshtein distance =',
LevenshteinDistance(s,length(s),t,length(t)))
(Skip,a,f(3));
Put Edit('')(Skip,a);
End;
LevenshteinDistance: Proc(s,sl,t,tl) Recursive Returns(Bin Fixed(31));
Dcl (s,t) Char(*);
Dcl (sl,tl) Bin Fixed(31);
Dcl cost Bin Fixed(31);
If ld(sl,tl)^=-1 Then
Return(ld(sl,tl));
Select;
When(sl=0) ld(sl,tl)=tl;
When(tl=0) ld(sl,tl)=sl;
Otherwise Do;
/* test if last characters of the strings match */
cost=(substr(s,sl,1)^=substr(t,tl,1));
/* return minimum of delete char from s, delete char from t,
and delete char from both */
ld(sl,tl)=min(LevenshteinDistance(s,sl-1,t,tl )+1,
LevenshteinDistance(s,sl ,t,tl-1)+1,
LevenshteinDistance(s,sl-1,t,tl-1)+cost));
End;
End;
Return(ld(sl,tl));
End;
End;
Output is the same as for version 1.
PL/M
... under CP/M (or an emulator)
100H: /* CALCULATE THE LEVENSHTEIN DISTANCE BETWEEN STRINGS */
/* TRANS:ATED FROM THE ACTION! SAMPLE */
/* CP/M BDOS SYSTEM CALL, IGNORE THE RETURN VALUE */
BDOS: PROCEDURE( FN, ARG ); DECLARE FN BYTE, ARG ADDRESS; GOTO 5; END;
PR$CHAR: PROCEDURE( C ); DECLARE C BYTE; CALL BDOS( 2, C ); END;
PR$STRING: PROCEDURE( S ); DECLARE S ADDRESS; CALL BDOS( 9, S ); END;
PR$NL: PROCEDURE; CALL PR$CHAR( 0DH ); CALL PR$CHAR( 0AH ); END;
PR$NUMBER: PROCEDURE( N ); /* PRINTS A NUMBER IN THE MINIMUN FIELD WIDTH */
DECLARE N ADDRESS;
DECLARE V ADDRESS, N$STR ( 6 )BYTE, W BYTE;
V = N;
W = LAST( N$STR );
N$STR( W ) = '$';
N$STR( W := W - 1 ) = '0' + ( V MOD 10 );
DO WHILE( ( V := V / 10 ) > 0 );
N$STR( W := W - 1 ) = '0' + ( V MOD 10 );
END;
CALL PR$STRING( .N$STR( W ) );
END PR$NUMBER;
DECLARE WIDTH LITERALLY '17'; /* ALLOW STRINGS UP TO 16 CHARACTERS */
DECLARE MATRIX$SIZE LITERALLY '289'; /* 17*17 ELEMENTS */
DECLARE STRING LITERALLY '( WIDTH )BYTE';
SCOPY: PROCEDURE( WORD, STR ); /* CONVEET PL/M STYLE STRING TO ACTION! */
DECLARE ( WORD, STR ) ADDRESS;
DECLARE ( W BASED WORD, S BASED STR ) STRING;
DECLARE ( I, C ) BYTE;
I = 0;
DO WHILE( ( C := S( I ) ) <> '$' );
W( I := I + 1 ) = C;
END;
W( 0 ) = I;
END SCOPY;
SET2DM: PROCEDURE( MATRIX, X, Y, VAL );
DECLARE ( MATRIX, X, Y, VAL ) ADDRESS;
DECLARE M BASED MATRIX ( MATRIX$SIZE )ADDRESS;
M( X + ( Y * WIDTH ) ) = VAL;
END SET2DM;
GET2DM: PROCEDURE( MATRIX, X, Y )ADDRESS;
DECLARE ( MATRIX, X, Y, VAL ) ADDRESS;
DECLARE M BASED MATRIX ( MATRIX$SIZE )ADDRESS;
RETURN M( X + ( Y * WIDTH ) );
END GET2DM;
LEVENSHTEIN$DISTANCE: PROCEDURE( S1, S2 )ADDRESS;
DECLARE ( S1, S2 ) ADDRESS;
DECLARE STR1 BASED S1 STRING, STR2 BASED S2 STRING;
DECLARE MATRIX ( MATRIX$SIZE ) ADDRESS;
DECLARE ( MIN, K, L, I, J, M, N ) BYTE;
M = STR1( 0 );
N = STR2( 0 );
DO I = 0 TO MATRIX$SIZE - 1; MATRIX( I ) = 0; END;
DO I = 0 TO M; CALL SET2DM( .MATRIX, I, 1, I ); END;
DO J = 0 TO N; CALL SET2DM( .MATRIX, 1, J, J ); END;
DO J = 1 TO N;
DO I = 1 TO M;
IF STR1( I ) = STR2( J ) THEN DO;
CALL SET2DM( .MATRIX, I, J, GET2DM( .MATRIX, I - 1, J - 1 ) );
END;
ELSE DO;
MIN = GET2DM( .MATRIX, I - 1, J ) + 1; /* DELETION */
K = GET2DM( .MATRIX, I, J - 1 ) + 1; /* INSERTION */
L = GET2DM( .MATRIX, I - 1, J - 1 ) + 1; /* SUBSTITUTION */
IF K < MIN THEN MIN = K;
IF L < MIN THEN MIN = L;
CALL SET2DM( .MATRIX, I, J, MIN );
END;
END;
END;
RETURN GET2DM( .MATRIX, M, N );
END LEVENSHTEIN$DISTANCE;
TEST: PROCEDURE( W1, W2 );
DECLARE ( W1, W2 ) ADDRESS;
DECLARE ( WORD$1, WORD$2 ) STRING;
CALL SCOPY( .WORD$1, W1 );
CALL SCOPY( .WORD$2, W2 );
CALL PR$STRING( W1 ); CALL PR$STRING( .' -> $' ); CALL PR$STRING( W2 );
CALL PR$STRING( .', LEVENSHTEIN DISTANCE: $' );
CALL PR$NUMBER( LEVENSHTEIN$DISTANCE( .WORD$1, .WORD$2 ) );
CALL PR$NL;
END TEST;
/* TEST CASES */
CALL TEST( .'KITTEN$', .'SITTING$' );
CALL TEST( .'ROSETTACODE$', .'RAISETHYSWORD$' );
CALL TEST( .'QWERTY$', .'QWERYT$' );
CALL TEST( .( 'ACTION', 33, '$' ), .'PL/M$' );
EOF
- Output:
KITTEN -> SITTING, LEVENSHTEIN DISTANCE: 3 ROSETTACODE -> RAISETHYSWORD, LEVENSHTEIN DISTANCE: 8 QWERTY -> QWERYT, LEVENSHTEIN DISTANCE: 2 ACTION! -> PL/M, LEVENSHTEIN DISTANCE: 4
PowerShell
This version does not allow empty strings.
function Get-LevenshteinDistance
{
[CmdletBinding()]
[OutputType([PSCustomObject])]
Param
(
[Parameter(Mandatory=$true, Position=0)]
[ValidateNotNullOrEmpty()]
[Alias("s")]
[string]
$ReferenceObject,
[Parameter(Mandatory=$true, Position=1)]
[ValidateNotNullOrEmpty()]
[Alias("t")]
[string]
$DifferenceObject
)
[int]$n = $ReferenceObject.Length
[int]$m = $DifferenceObject.Length
$d = New-Object -TypeName 'System.Object[,]' -ArgumentList ($n + 1),($m + 1)
$outputObject = [PSCustomObject]@{
ReferenceObject = $ReferenceObject
DifferenceObject = $DifferenceObject
Distance = $null
}
for ($i = 0; $i -le $n; $i++)
{
$d[$i, 0] = $i
}
for ($i = 0; $i -le $m; $i++)
{
$d[0, $i] = $i
}
for ($i = 1; $i -le $m; $i++)
{
for ($j = 1; $j -le $n; $j++)
{
if ($ReferenceObject[$j - 1] -eq $DifferenceObject[$i - 1])
{
$d[$j, $i] = $d[($j - 1), ($i - 1)]
}
else
{
$d[$j, $i] = [Math]::Min([Math]::Min(($d[($j - 1), $i] + 1), ($d[$j, ($i - 1)] + 1)), ($d[($j - 1), ($i - 1)] + 1))
}
}
}
$outputObject.Distance = $d[$n, $m]
$outputObject
}
Get-LevenshteinDistance "kitten" "sitting"
Get-LevenshteinDistance rosettacode raisethysword
- Output:
ReferenceObject DifferenceObject Distance --------------- ---------------- -------- kitten sitting 3 rosettacode raisethysword 8
Processing
void setup() {
println(distance("kitten", "sitting"));
}
int distance(String a, String b) {
int [] costs = new int [b.length() + 1];
for (int j = 0; j < costs.length; j++)
costs[j] = j;
for (int i = 1; i <= a.length(); i++) {
costs[0] = i;
int nw = i - 1;
for (int j = 1; j <= b.length(); j++) {
int cj = min(1 + min(costs[j], costs[j - 1]), a.charAt(i - 1) == b.charAt(j - 1) ? nw : nw + 1);
nw = costs[j];
costs[j] = cj;
}
}
return costs[b.length()];
}
Processing Python mode
def setup():
println(distance("kitten", "sitting"))
def distance(a, b):
costs = []
for j in range(len(b) + 1):
costs.append(j)
for i in range(1, len(a) + 1):
costs[0] = i
nw = i - 1
for j in range(1, len(b) + 1):
cj = min(1 + min(costs[j], costs[j - 1]),
nw if a[i - 1] == b[j - 1] else nw + 1)
nw = costs[j]
costs[j] = cj
return costs[len(b)]
Prolog
Without Tabling
Works with SWI-Prolog.
Based on Wikipedia's pseudocode.
levenshtein(S, T, R) :-
length(S, M),
M1 is M+1,
length(T, N),
N1 is N+1,
length(Lev, N1),
maplist(init(M1), Lev),
numlist(0, N, LN),
maplist(init_n, LN, Lev),
nth0(0, Lev, Lev0),
numlist(0, M, Lev0),
% compute_levenshtein distance
numlist(1, N, LN1),
maplist(work_on_T(Lev, S), LN1, T),
last(Lev, LevLast),
last(LevLast, R).
work_on_T(Lev, S, J, TJ) :-
length(S, M),
numlist(1, M, LM),
maplist(work_on_S(Lev, J, TJ), LM, S).
work_on_S(Lev, J, C, I, C) :-
% same char
!,
I1 is I-1, J1 is J-1,
nth0(J1, Lev, LevJ1),
nth0(I1, LevJ1, V),
nth0(J, Lev, LevJ),
nth0(I, LevJ, V).
work_on_S(Lev, J, _C1, I, _C2) :-
I1 is I-1, J1 is J - 1,
% compute the value for deletion
nth0(J, Lev, LevJ),
nth0(I1, LevJ, VD0),
VD is VD0 + 1,
% compute the value for insertion
nth0(J1, Lev, LevJ1),
nth0(I, LevJ1, VI0),
VI is VI0 + 1,
% compute the value for substitution
nth0(I1, LevJ1, VS0),
VS is VS0 + 1,
% set the minimum value to cell(I,J)
sort([VD, VI, VS], [V|_]),
nth0(I, LevJ, V).
init(Len, C) :-
length(C, Len).
init_n(N, L) :-
nth0(0, L, N).
- Output examples:
?- levenshtein("kitten", "sitting", R). R = 3. ?- levenshtein("saturday", "sunday", R). R = 3. ?- levenshtein("rosettacode", "raisethysword", R). R = 8.
With Tabling
Using so called mode directed tabling:
:- table lev(_,_,min). lev([], [], 0). lev([X|L], [X|R], D) :- !, lev(L, R, D). lev([_|L], [_|R], D) :- lev(L, R, H), D is H+1. lev([_|L], R, D) :- lev(L, R, H), D is H+1. lev(L, [_|R], D) :- lev(L, R, H), D is H+1. ?- set_prolog_flag(double_quotes, codes). true. ?- lev("kitten", "sitting", R). R = 3.
PureBasic
Based on Wikipedia's pseudocode.
Procedure LevenshteinDistance(A_string$, B_String$)
Protected m, n, i, j, min, k, l
m = Len(A_string$)
n = Len(B_String$)
Dim D(m, n)
For i=0 To m: D(i,0)=i: Next
For j=0 To n: D(0,j)=j: Next
For j=1 To n
For i=1 To m
If Mid(A_string$,i,1) = Mid(B_String$,j,1)
D(i,j) = D(i-1, j-1); no operation required
Else
min = D(i-1, j)+1 ; a deletion
k = D(i, j-1)+1 ; an insertion
l = D(i-1, j-1)+1 ; a substitution
If k < min: min=k: EndIf
If l < min: min=l: EndIf
D(i,j) = min
EndIf
Next
Next
ProcedureReturn D(m,n)
EndProcedure
;- Testing
n = LevenshteinDistance("kitten", "sitting")
MessageRequester("Info","Levenshtein Distance= "+Str(n))
Python
Iterative 1
Faithful implementation of "Iterative with full matrix" from Wikipedia
def levenshteinDistance(str1, str2):
m = len(str1)
n = len(str2)
d = [[i] for i in range(1, m + 1)] # d matrix rows
d.insert(0, list(range(0, n + 1))) # d matrix columns
for j in range(1, n + 1):
for i in range(1, m + 1):
if str1[i - 1] == str2[j - 1]: # Python (string) is 0-based
substitutionCost = 0
else:
substitutionCost = 1
d[i].insert(j, min(d[i - 1][j] + 1,
d[i][j - 1] + 1,
d[i - 1][j - 1] + substitutionCost))
return d[-1][-1]
print(levenshteinDistance("kitten","sitting"))
print(levenshteinDistance("rosettacode","raisethysword"))
- Output:
3 8
Iterative 2
Implementation of the Wikipedia algorithm, optimized for memory
def minimumEditDistance(s1,s2):
if len(s1) > len(s2):
s1,s2 = s2,s1
distances = range(len(s1) + 1)
for index2,char2 in enumerate(s2):
newDistances = [index2+1]
for index1,char1 in enumerate(s1):
if char1 == char2:
newDistances.append(distances[index1])
else:
newDistances.append(1 + min((distances[index1],
distances[index1+1],
newDistances[-1])))
distances = newDistances
return distances[-1]
print(minimumEditDistance("kitten","sitting"))
print(minimumEditDistance("rosettacode","raisethysword"))
- Output:
3 8
Iterative 3
Iterative space optimized (even bounded)
def ld(a, b, mx=-1):
def result(d): return d if mx < 0 else False if d > mx else True
if a == b: return result(0)
la, lb = len(a), len(b)
if mx >= 0 and abs(la - lb) > mx: return result(mx+1)
if la == 0: return result(lb)
if lb == 0: return result(la)
if lb > la: a, b, la, lb = b, a, lb, la
cost = array('i', range(lb + 1))
for i in range(1, la + 1):
cost[0] = i; ls = i-1; mn = ls
for j in range(1, lb + 1):
ls, act = cost[j], ls + int(a[i-1] != b[j-1])
cost[j] = min(ls+1, cost[j-1]+1, act)
if (ls < mn): mn = ls
if mx >= 0 and mn > mx: return result(mx+1)
if mx >= 0 and cost[lb] > mx: return result(mx+1)
return result(cost[lb])
print(
ld('kitten','kitten'), # 0
ld('kitten','sitten'), # 1
ld('kitten','sittes'), # 2
ld('kitten','sityteng'), # 3
ld('kitten','sittYing'), # 4
ld('rosettacode','raisethysword'), # 8
ld('kitten','kittenaaaaaaaaaaaaaaaaa'), # 17
ld('kittenaaaaaaaaaaaaaaaaa','kitten') # 17
)
print(
ld('kitten','kitten',3), # True
ld('kitten','sitten',3), # True
ld('kitten','sittes',3), # True
ld('kitten','sityteng',3), # True
ld('kitten','sittYing',3), # False
ld('rosettacode','raisethysword',3), # False
ld('kitten','kittenaaaaaaaaaaaaaaaaa',3), # False
ld('kittenaaaaaaaaaaaaaaaaa','kitten',3) # False
)
- Output:
0 1 2 3 4 8 17 17 True True True True False False False False
Functional
Memoized recursion
(Uses this cache from the standard library).
>>> from functools import lru_cache
>>> @lru_cache(maxsize=4095)
def ld(s, t):
if not s: return len(t)
if not t: return len(s)
if s[0] == t[0]: return ld(s[1:], t[1:])
l1 = ld(s, t[1:])
l2 = ld(s[1:], t)
l3 = ld(s[1:], t[1:])
return 1 + min(l1, l2, l3)
>>> print( ld("kitten","sitting"),ld("rosettacode","raisethysword") )
3 8
Non-recursive: reduce and scanl
'''Levenshtein distance'''
from itertools import (accumulate, chain, islice)
from functools import reduce
# levenshtein :: String -> String -> Int
def levenshtein(sa):
'''Levenshtein distance between
two strings.'''
s1 = list(sa)
# go :: [Int] -> Char -> [Int]
def go(ns, c):
n, ns1 = ns[0], ns[1:]
# gap :: Int -> (Char, Int, Int) -> Int
def gap(z, c1xy):
c1, x, y = c1xy
return min(
succ(y),
succ(z),
succ(x) if c != c1 else x
)
return scanl(gap)(succ(n))(
zip(s1, ns, ns1)
)
return lambda sb: reduce(
go, list(sb), enumFromTo(0)(len(s1))
)[-1]
# TEST ----------------------------------------------------
# main :: IO ()
def main():
'''Tests'''
pairs = [
('rosettacode', 'raisethysword'),
('kitten', 'sitting'),
('saturday', 'sunday')
]
print(
tabulated(
'Levenshtein minimum edit distances:\n'
)(str)(str)(
uncurry(levenshtein)
)(concat(map(
list,
zip(pairs, map(swap, pairs))
)))
)
# GENERIC -------------------------------------------------
# compose (<<<) :: (b -> c) -> (a -> b) -> a -> c
def compose(g):
'''Right to left function composition.'''
return lambda f: lambda x: g(f(x))
# concat :: [[a]] -> [a]
# concat :: [String] -> String
def concat(xxs):
'''The concatenation of all the elements in a list.'''
xs = list(chain.from_iterable(xxs))
unit = '' if isinstance(xs, str) else []
return unit if not xs else (
''.join(xs) if isinstance(xs[0], str) else xs
)
# enumFromTo :: (Int, Int) -> [Int]
def enumFromTo(m):
'''Integer enumeration from m to n.'''
return lambda n: list(range(m, 1 + n))
# scanl :: (b -> a -> b) -> b -> [a] -> [b]
def scanl(f):
'''scanl is like reduce, but returns a succession of
intermediate values, building from the left.'''
return lambda a: lambda xs: (
list(accumulate(chain([a], xs), f))
)
# swap :: (a, b) -> (b, a)
def swap(tpl):
'''The swapped components of a pair.'''
return (tpl[1], tpl[0])
# succ :: Int => Int -> Int
def succ(x):
'''The successor of a value.
For numeric types, (1 +).'''
return 1 + x
# tabulated :: String -> (a -> String) ->
# (b -> String) ->
# (a -> b) -> [a] -> String
def tabulated(s):
'''Heading -> x display function ->
fx display function ->
f -> value list -> tabular string.'''
def go(xShow, fxShow, f, xs):
w = max(map(compose(len)(xShow), xs))
return s + '\n' + '\n'.join([
xShow(x).rjust(w, ' ') + ' -> ' + fxShow(f(x))
for x in xs
])
return lambda xShow: lambda fxShow: (
lambda f: lambda xs: go(
xShow, fxShow, f, xs
)
)
# take :: Int -> [a] -> [a]
# take :: Int -> String -> String
def take(n):
'''The prefix of xs of length n,
or xs itself if n > length xs.'''
return lambda xs: (
xs[0:n]
if isinstance(xs, list)
else list(islice(xs, n))
)
# uncurry :: (a -> b -> c) -> ((a, b) -> c)
def uncurry(f):
'''A function over a tuple
derived from a curried function.'''
return lambda xy: f(xy[0])(
xy[1]
)
# MAIN ---
if __name__ == '__main__':
main()
- Output:
Levenshtein minimum edit distances: ('rosettacode', 'raisethysword') -> 8 ('raisethysword', 'rosettacode') -> 8 ('kitten', 'sitting') -> 3 ('sitting', 'kitten') -> 3 ('saturday', 'sunday') -> 3 ('sunday', 'saturday') -> 3
Racket
A memoized recursive implementation.
#lang racket
(define (levenshtein a b)
(define (ls0 a-index b-index)
(cond [(or (= a-index -1) (= b-index -1)) (abs (- a-index b-index))]
[else
(define a-char (string-ref a a-index))
(define b-char (string-ref b b-index))
(if (equal? a-char b-char)
(ls (sub1 a-index) (sub1 b-index))
(min (add1 (ls (sub1 a-index) b-index))
(add1 (ls a-index (sub1 b-index)))
(add1 (ls (sub1 a-index) (sub1 b-index)))))]))
(define memo (make-hash))
(define (ls a-i b-i)
(hash-ref! memo (cons a-i b-i) (λ() (ls0 a-i b-i))))
(ls (sub1 (string-length a)) (sub1 (string-length b))))
(levenshtein "kitten" "sitting")
(levenshtein "rosettacode" "raisethysword")
- Output:
3 8
Raku
(formerly Perl 6)
Implementation of the Wikipedia algorithm. Since column 0 and row 0 are used for base distances, the original algorithm would require us to compare "@s[$i-1] eq @t[$j-1]", and reference $m and $n separately. Prepending an unused value (undef) onto @s and @t makes their indices align with the $i,$j numbering of @d, and lets us use .end instead of $m,$n.
sub levenshtein-distance ( Str $s, Str $t --> Int ) {
my @s = *, |$s.comb;
my @t = *, |$t.comb;
my @d;
@d[$_; 0] = $_ for ^@s.end;
@d[ 0; $_] = $_ for ^@t.end;
for 1..@s.end X 1..@t.end -> ($i, $j) {
@d[$i; $j] = @s[$i] eq @t[$j]
?? @d[$i-1; $j-1] # No operation required when eq
!! ( @d[$i-1; $j ], # Deletion
@d[$i ; $j-1], # Insertion
@d[$i-1; $j-1], # Substitution
).min + 1;
}
@d[*-1][*-1];
}
for <kitten sitting>, <saturday sunday>, <rosettacode raisethysword> -> ($s, $t) {
say "Levenshtein distance('$s', '$t') == ", levenshtein-distance($s, $t)
}
- Output:
Levenshtein distance('kitten', 'sitting') == 3 Levenshtein distance('saturday', 'sunday') == 3 Levenshtein distance('rosettacode', 'raisethysword') == 8
Refal
$ENTRY Go {
= <Show ('kitten') ('sitting')>
<Show ('rosettacode') ('raisethysword')>;
};
Show {
(e.A) (e.B) = <Prout e.A ' -> ' e.B ': ' <Lev (e.A) (e.B)>>;
};
Lev {
(e.A) (), <Lenw e.A>: s.L e.A = s.L;
() (e.B), <Lenw e.B>: s.L e.B = s.L;
(s.C e.A) (s.C e.B) = <Lev (e.A) (e.B)>;
(e.A) (e.B), e.A: s.HA e.LA, e.B: s.HB e.LB =
<+ 1 <Min <Lev (e.LA) (e.B)>
<Lev (e.A) (e.LB)>
<Lev (e.LA) (e.LB)>>>;
}
Min {
s.N = s.N;
s.M s.N e.X, <Compare s.M s.N>: {
'-' = <Min s.M e.X>;
s.X = <Min s.N e.X>;
};
};
- Output:
kitten -> sitting: 3 rosettacode -> raisethysword: 8
REXX
version 1
As per the task's requirements, this version includes a driver to display the results.
/*REXX program calculates and displays the Levenshtein distance between two strings. */
call Levenshtein 'kitten' , "sitting"
call Levenshtein 'rosettacode' , "raisethysword"
call Levenshtein 'Sunday' , "Saturday"
call Levenshtein 'Vladimir Levenshtein[1965]' , "Vladimir Levenshtein[1965]"
call Levenshtein 'this algorithm is similar to' , "Damerau─Levenshtein distance"
exit /*stick a fork in it, we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
Levenshtein: procedure; parse arg o,t; oL= length(o); tL= length(t); @.= 0
say ' original string = ' o /*show old string*/
say ' target string = ' t /* " target " */
do #=1 for tL; @.0.#= #; end /*#*/ /*the drop array.*/
do #=1 for oL; @.#.0= #; end /*#*/ /* " insert " */
do j=1 for tL; jm= j-1; q= substr(t, j, 1) /*obtain character. */
do k=1 for oL; km= k-1
if q==substr(o, k, 1) then @.k.j= @.km.jm /*use previous char.*/
else @.k.j= 1 + min(@.km.j, @.k.jm, @.km.jm)
end /*k*/
end /*j*/ /* [↑] best choice.*/
say ' Levenshtein distance = ' @.oL.tL; say; return
- output when using the internal default inputs:
original string = kitten target string = sitting Levenshtein distance = 3 original string = rosettacode target string = raisethysword Levenshtein distance = 8 original string = Sunday target string = Saturday Levenshtein distance = 3 original string = Vladimir Levenshtein[1965] target string = Vladimir Levenshtein[1965] Levenshtein distance = 0 original string = this algorithm is similar to target string = Damerau─Levenshtein distance Levenshtein distance = 24
version 2
same as Similar to version 1 (but does not include a driver for testing), reformatted and commented
/*rexx*/
call test 'kitten' ,'sitting'
call test 'rosettacode' ,'raisethysword'
call test 'Sunday' ,'Saturday'
call test 'Vladimir_Levenshtein[1965]',,
'Vladimir_Levenshtein[1965]'
call test 'this_algorithm_is_similar_to',,
'Damerau-Levenshtein_distance'
call test '','abc'
exit 0
test: Procedure
Parse Arg s,t
ld.=''
Say ' 1st string = >'s'<'
Say ' 2nd string = >'t'<'
Say 'Levenshtein distance =' Levenshtein(s,length(s),t,length(t))
Say ''
Return
Levenshtein: Procedure
Parse Arg s,t
/* for all i and j, d[i,j] will hold the Levenshtein distance between */
/* the first i characters of s and the first j characters of t; */
/* note that d has (m+1)*(n+1) values */
m=length(s)
n=length(t)
d.=0
Do i=1 To m /* source prefixes can be transformed into empty string by */
d.i.0=i /* dropping all characters */
End
Do j=1 To n /* target prefixes can be reached from empty source prefix */
d.0.j=j /* by inserting every character */
End
Do j=1 To n
jj=j-1
Do i=1 To m
ii=i-1
If substr(s,i,1)=substr(t,j,1) Then
d.i.j=d.ii.jj /* no operation required */
else
d.i.j=min(d.ii.j+1,, /* a deletion */
d.i.jj+1,, /* an insertion */
d.ii.jj+1) /* a substitution */
End
End
Say ' 1st string = ' s
Say ' 2nd string = ' t
say 'Levenshtein distance = ' d.m.n; say ''
Return d.m.n
- Output:
1st string = >kitten< 2nd string = >sitting< 1st string = kitten 2nd string = 6 Levenshtein distance = 6 Levenshtein distance = 6 1st string = >rosettacode< 2nd string = >raisethysword< 1st string = rosettacode 2nd string = 11 Levenshtein distance = 11 Levenshtein distance = 11 1st string = >Sunday< 2nd string = >Saturday< 1st string = Sunday 2nd string = 6 Levenshtein distance = 6 Levenshtein distance = 6 1st string = >Vladimir_Levenshtein[1965]< 2nd string = >Vladimir_Levenshtein[1965]< 1st string = Vladimir_Levenshtein[1965] 2nd string = 26 Levenshtein distance = 25 Levenshtein distance = 25 1st string = >this_algorithm_is_similar_to< 2nd string = >Damerau-Levenshtein_distance< 1st string = this_algorithm_is_similar_to 2nd string = 28 Levenshtein distance = 28 Levenshtein distance = 28 1st string = >< 2nd string = >abc< 1st string = 2nd string = 0 Levenshtein distance = 1 Levenshtein distance = 1
version 3
Alternate algorithm from Wikipedia (but does not include a driver for testing).
/*rexx*/
call test 'kitten' ,'sitting'
call test 'rosettacode' ,'raisethysword'
call test 'Sunday' ,'Saturday'
call test 'Vladimir_Levenshtein[1965]',,
'Vladimir_Levenshtein[1965]'
call test 'this_algorithm_is_similar_to',,
'Damerau-Levenshtein_distance'
call test '','abc'
exit 0
test: Procedure
Parse Arg s,t
ld.=''
Say ' 1st string = >'s'<'
Say ' 2nd string = >'t'<'
Say 'Levenshtein distance =' LevenshteinDistance(s,length(s),t,length(t))
Say ''
Return
LevenshteinDistance: Procedure
Parse Arg s,t
If s==t Then Return 0;
sl=length(s)
tl=length(t)
If sl=0 Then Return tl
If tl=0 Then Return sl
Do i=0 To tl
v0.i=i
End
Do i=0 To sl-1
v1.0=i+1
Do j=0 To tl-1
jj=j+1
cost=substr(s,i+1,1)<>substr(t,j+1,1)
v1.jj=min(v1.j+1,v0.jj+1,v0.j+cost)
End
Do j=0 to tl-1
v0.j=v1.j
End
End
return v1.tl
- Output:
1st string = >kitten< 2nd string = >sitting< Levenshtein distance = 2 1st string = >rosettacode< 2nd string = >raisethysword< Levenshtein distance = 3 1st string = >Sunday< 2nd string = >Saturday< Levenshtein distance = 2 1st string = >Vladimir_Levenshtein[1965]< 2nd string = >Vladimir_Levenshtein[1965]< Levenshtein distance = 3 1st string = >this_algorithm_is_similar_to< 2nd string = >Damerau-Levenshtein_distance< Levenshtein distance = 3 1st string = >< 2nd string = >abc< Levenshtein distance = 1
version 4 (recursive)
Recursive algorithm from Wikipedia with memoization
/*rexx*/
call test 'kitten' ,'sitting'
call test 'rosettacode' ,'raisethysword'
call test 'Sunday' ,'Saturday'
call test 'Vladimir_Levenshtein[1965]',,
'Vladimir_Levenshtein[1965]'
call test 'this_algorithm_is_similar_to',,
'Damerau-Levenshtein_distance'
call test '','abc'
Exit
test: Procedure
Parse Arg s,t
ld.=''
Say ' 1st string = >'s'<'
Say ' 2nd string = >'t'<'
Say 'Levenshtein distance =' LevenshteinDistance(s,length(s),t,length(t))
Say ''
Return
LevenshteinDistance: Procedure Expose ld.
/* sl and tl are the number of characters in string s and t respectively */
Parse Arg s,sl,t,tl
If ld.sl.tl<>'' Then
Return ld.sl.tl
Select
When sl=0 Then ld.sl.tl=tl
When tl=0 Then ld.sl.tl=sl
Otherwise Do
/* test if last characters of the strings match */
cost=substr(s,sl,1)<>substr(t,tl,1)
/* return minimum of delete char from s, delete char from t,
and delete char from both */
ld.sl.tl=min(LevenshteinDistance(s,sl-1,t,tl )+1,,
LevenshteinDistance(s,sl ,t,tl-1)+1,,
LevenshteinDistance(s,sl-1,t,tl-1)+cost)
End
End
Return ld.sl.tl
- Output:
1st string = >kitten< 2nd string = >sitting< Levenshtein distance = 3 1st string = >rosettacode< 2nd string = >raisethysword< Levenshtein distance = 8 1st string = >Sunday< 2nd string = >Saturday< Levenshtein distance = 3 1st string = >Vladimir_Levenshtein[1965]< 2nd string = >Vladimir_Levenshtein[1965]< Levenshtein distance = 0 1st string = >this_algorithm_is_similar_to< 2nd string = >Damerau-Levenshtein_distance< Levenshtein distance = 24 1st string = >< 2nd string = >abc< Levenshtein distance = 3
Ring
# Project : Levenshtein distance
load "stdlib.ring"
see "" + "distance(kitten, sitting) = " + levenshteindistance("kitten", "sitting") + nl
see "" + "distance(saturday, sunday) = " + levenshteindistance("saturday", "sunday") + nl
see "" + "distance(rosettacode, raisethysword) = " + levenshteindistance("rosettacode", "raisethysword") + nl
func levenshteindistance(s1, s2)
n = len(s1)
m = len(s2)
if n = 0
levenshteindistance = m
return
ok
if m = 0
levenshteindistance = n
return
ok
d = newlist(n, m)
for i = 1 to n
d[i][1] = i
next i
for i = 1 to m
d[1][i] = i
next
for i = 2 to n
si = substr(s1, i, 1)
for j = 2 to m
tj = substr(s2, j, 1)
if si = tj
cost = 0
else
cost = 1
ok
d[i][ j] = min((d[i - 1][ j]), min((d[i][j - 1] + 1), (d[i - 1][j - 1] + cost)))
next
next
levenshteindistance = d[n][m]
return levenshteindistance
Output:
distance(kitten, sitting) = 3 distance(saturday, sunday) = 3 distance(rosettacode, raisethysword) = 8
RPL
RPL code | Comment |
---|---|
≪ DUP2 SIZE SWAP SIZE → a b lb la ≪ IF la lb * NOT THEN la lb + ELSE a 2 la SUB b 2 lb SUB DUP2 LEV IF a 1 1 SUB b 1 1 SUB == THEN ROT ROT DROP2 ELSE a ROT LEV ROT b LEV MIN MIN 1 + END END ≫ ≫ 'LEV' STO |
LEV ( "a" "b" → distance )
if |a|=0 or [b|=0 then return resp. [b| or |a|
else
put tail(a), tail(b) and lev(tail(a),tail(b)) in stack
if a[1]=b[1} then
clean stack and return lev(tail(a),tail(b))
else
put lev(a,tail(b)) and lev(tail(a),b) in stack
return min of the 3 values in stack and add 1
|
"kitten" "sitting" LEV "Summer" "Winter" LEV "Levenshtein" "Levenshtein" LEV
- Output:
3: 3 2: 4 1: 0
Iterative implementation (Wagner-Fischer algorithm)
index of arrays and strings start with 1 in RPL, so the main trick in translating the algorithm given by Wikipedia was to manage the offsets properly. The distance between "rosettacode" and "raisethysword" is within the reach of a calculator (emulated or not), unlike the above recursive approach.
RPL code | Comment |
---|---|
≪
DUP2 { } + + ≪ SIZE ≫ DOLIST 1 ADD 0 CON → a b d
≪ 1 a SIZE FOR h
'd' h 1 + 1 2 →LIST h PUT NEXT
1 b SIZE FOR j
'd' 1 j 1 + 2 →LIST j PUT NEXT
1 b SIZE FOR j
1 a SIZE FOR h
a h DUP SUB b j DUP SUB ≠
'd' h j 2 →LIST GET +
'd' h j 1 + 2 →LIST GET 1 +
'd' h 1 + j 2 →LIST GET 1 +
MIN MIN 'd' h 1 + j 1 + 2 →LIST ROT PUT
NEXT NEXT
'd' DUP SIZE GET
≫ ≫ 'LEV2' STO
|
LEV2 ( "a" "b" → distance ) declare int d[0..m, 0..n] and set each element in d to zero for h from 1 to m: // h replaces i, which is √-1 in RPL d[h, 0] := i // RPL array indexes start with 1 for j from 1 to n: d[0, j] := j for j from 1 to n: for h from 1 to m: substitutionCost := ( s[h] <> t[j] ) d[h, j] := minimum(d[h-1, j-1] + substitutionCost, d[h-1, j] + 1, d[h, j-1] + 1) return d[m, n] |
"rosettacode" "raisethysword" LEV2
- Output:
1: 8
Ruby
Implementation of the wikipedia algorithm. Invariant is that for current loop indices i
and j
, costs[k]
for k < j
contains lev(i, k)
and for k >= j
contains lev(i-1, k). The inner loop body restores the invariant for the
new value of j
.
module Levenshtein
def self.distance(a, b)
a, b = a.downcase, b.downcase
costs = Array(0..b.length) # i == 0
(1..a.length).each do |i|
costs[0], nw = i, i - 1 # j == 0; nw is lev(i-1, j)
(1..b.length).each do |j|
costs[j], nw = [costs[j] + 1, costs[j-1] + 1, a[i-1] == b[j-1] ? nw : nw + 1].min, costs[j]
end
end
costs[b.length]
end
def self.test
%w{kitten sitting saturday sunday rosettacode raisethysword}.each_slice(2) do |a, b|
puts "distance(#{a}, #{b}) = #{distance(a, b)}"
end
end
end
Levenshtein.test
- Output:
distance(kitten, sitting) = 3 distance(saturday, sunday) = 3 distance(rosettacode, raisethysword) = 8
A variant can be found used in Rubygems [2]
def levenshtein_distance(str1, str2)
n = str1.length
m = str2.length
max = n/2
return m if 0 == n
return n if 0 == m
return n if (n - m).abs > max
d = (0..m).to_a
x = nil
str1.each_char.with_index do |char1,i|
e = i+1
str2.each_char.with_index do |char2,j|
cost = (char1 == char2) ? 0 : 1
x = [ d[j+1] + 1, # insertion
e + 1, # deletion
d[j] + cost # substitution
].min
d[j] = e
e = x
end
d[m] = x
end
x
end
%w{kitten sitting saturday sunday rosettacode raisethysword}.each_slice(2) do |a, b|
puts "distance(#{a}, #{b}) = #{levenshtein_distance(a, b)}"
end
same output
Run BASIC
print levenshteinDistance("kitten", "sitting")
print levenshteinDistance("rosettacode", "raisethysword")
end
function levenshteinDistance(s1$, s2$)
n = len(s1$)
m = len(s2$)
if n = 0 then
levenshteinDistance = m
goto [ex]
end if
if m = 0 then
levenshteinDistance = n
goto [ex]
end if
dim d(n, m)
for i = 0 to n
d(i, 0) = i
next i
for i = 0 to m
d(0, i) = i
next i
for i = 1 to n
si$ = mid$(s1$, i, 1)
for j = 1 to m
tj$ = mid$(s2$, j, 1)
if si$ = tj$ then cost = 0 else cost = 1
d(i, j) = min((d(i - 1, j) + 1),min((d(i, j - 1) + 1),(d(i - 1, j - 1) + cost)))
next j
next i
levenshteinDistance = d(n, m)
[ex]
end function
Output:
3 8
Rust
Implementation of the wikipedia algorithm.
fn main() {
println!("{}", levenshtein_distance("kitten", "sitting"));
println!("{}", levenshtein_distance("saturday", "sunday"));
println!("{}", levenshtein_distance("rosettacode", "raisethysword"));
}
fn levenshtein_distance(word1: &str, word2: &str) -> usize {
let w1 = word1.chars().collect::<Vec<_>>();
let w2 = word2.chars().collect::<Vec<_>>();
let word1_length = w1.len() + 1;
let word2_length = w2.len() + 1;
let mut matrix = vec![vec![0; word1_length]; word2_length];
for i in 1..word1_length { matrix[0][i] = i; }
for j in 1..word2_length { matrix[j][0] = j; }
for j in 1..word2_length {
for i in 1..word1_length {
let x: usize = if w1[i-1] == w2[j-1] {
matrix[j-1][i-1]
} else {
1 + std::cmp::min(
std::cmp::min(matrix[j][i-1], matrix[j-1][i])
, matrix[j-1][i-1])
};
matrix[j][i] = x;
}
}
matrix[word2_length-1][word1_length-1]
}
- Output:
3 3 8
Scala
Translated Wikipedia algorithm.
object Levenshtein0 extends App {
def distance(s1: String, s2: String): Int = {
val dist = Array.tabulate(s2.length + 1, s1.length + 1) { (j, i) => if (j == 0) i else if (i == 0) j else 0 }
@inline
def minimum(i: Int*): Int = i.min
for {j <- dist.indices.tail
i <- dist(0).indices.tail} dist(j)(i) =
if (s2(j - 1) == s1(i - 1)) dist(j - 1)(i - 1)
else minimum(dist(j - 1)(i) + 1, dist(j)(i - 1) + 1, dist(j - 1)(i - 1) + 1)
dist(s2.length)(s1.length)
}
def printDistance(s1: String, s2: String) {
println("%s -> %s : %d".format(s1, s2, distance(s1, s2)))
}
printDistance("kitten", "sitting")
printDistance("rosettacode", "raisethysword")
}
- Output:
kitten -> sitting : 3 rosettacode -> raisethysword : 8
Functional programmed, memoized
- Output:
Best seen running in your browser either by (ES aka JavaScript, non JVM) or Scastie (remote JVM).
import scala.collection.mutable
import scala.collection.parallel.ParSeq
object Levenshtein extends App {
def printDistance(s1: String, s2: String) =
println(f"$s1%s -> $s2%s : ${levenshtein(s1, s2)(s1.length, s2.length)}%d")
def levenshtein(s1: String, s2: String): mutable.Map[(Int, Int), Int] = {
val memoizedCosts = mutable.Map[(Int, Int), Int]()
def lev: ((Int, Int)) => Int = {
case (k1, k2) =>
memoizedCosts.getOrElseUpdate((k1, k2), (k1, k2) match {
case (i, 0) => i
case (0, j) => j
case (i, j) =>
ParSeq(1 + lev((i - 1, j)),
1 + lev((i, j - 1)),
lev((i - 1, j - 1))
+ (if (s1(i - 1) != s2(j - 1)) 1 else 0)).min
})
}
lev((s1.length, s2.length))
memoizedCosts
}
printDistance("kitten", "sitting")
printDistance("rosettacode", "raisethysword")
printDistance("Here's a bunch of words", "to wring out this code")
printDistance("sleep", "fleeting")
}
Scheme
Recursive version from wikipedia article.
(define (levenshtein s t)
(define (%levenshtein s sl t tl)
(cond ((zero? sl) tl)
((zero? tl) sl)
(else
(min (+ (%levenshtein (cdr s) (- sl 1) t tl) 1)
(+ (%levenshtein s sl (cdr t) (- tl 1)) 1)
(+ (%levenshtein (cdr s) (- sl 1) (cdr t) (- tl 1))
(if (char=? (car s) (car t)) 0 1))))))
(%levenshtein (string->list s)
(string-length s)
(string->list t)
(string-length t)))
- Output:
> (levenshtein "kitten" "sitting") 3 > (levenshtein "rosettacode" "raisethysword") 8
Seed7
$ include "seed7_05.s7i";
const func integer: levenshteinDistance (in string: s, in string: t) is func
result
var integer: distance is 0;
local
var array array integer: d is 0 times 0 times 0;
var integer: i is 0;
var integer: j is 0;
begin
d := [0 .. length(s)] times [0 .. length(t)] times 0;
for key i range s do
d[i][0] := i;
end for;
for key j range t do
d[0][j] := j;
for key i range s do
if s[i] = t[j] then
d[i][j] := d[pred(i)][pred(j)];
else
d[i][j] := min(min(succ(d[pred(i)][j]), succ(d[i][pred(j)])), succ(d[pred(i)][pred(j)]));
end if;
end for;
end for;
distance := d[length(s)][length(t)];
end func;
const proc: main is func
begin
writeln("kitten -> sitting: " <& levenshteinDistance("kitten", "sitting"));
writeln("rosettacode -> raisethysword: " <& levenshteinDistance("rosettacode", "raisethysword"));
end func;
- Output:
kitten -> sitting: 3 rosettacode -> raisethysword: 8
SenseTalk
SenseTalk has a built-in TextDifference function for this.
put textDifference("kitten", "sitting") // --> 3
put textDifference("rosettacode", "raisethysword") // --> 8
SequenceL
This implementation is based on the "Iterative with two matrix rows" version on Wikipedia.
import <Utilities/Sequence.sl>;
import <Utilities/Math.sl>;
main(args(2)) := LenenshteinDistance(args[1], args[2]);
LenenshteinDistance(s(1), t(1)) :=
0 when equalList(s,t) else
size(t) when size(s) = 0 else
size(s) when size(t) = 0 else
LenenshteinDistanceIterative(s, t, 0 ... size(t), duplicate(0, size(t) + 1), 1);
LenenshteinDistanceIterative(s(1), t(1), v0(1), v1(1), n) :=
v0[size(t) + 1] when n > size(s) else
LenenshteinDistanceIterative(s, t, iterate(s[n], t, v0, setElementAt(v1, 1, n + 0), 1), v0, n + 1);
iterate(s, t(1), v0(1), v1(1), n) :=
v1 when n > size(t) else
iterate(s, t, v0,
setElementAt(v1, n + 1,
min(min(v1[n] + 1, v0[n + 1] + 1), v0[n] + (0 when s = t[n] else 1))),
n + 1);
SETL
program levenshtein_distance;
tests := {['kitten', 'sitting'], ['rosettacode', 'raisethysword']};
loop for dest = tests(src) do
print(src + ' -> ' + dest + ': ' + str levenshtein(src, dest));
end loop;
proc levenshtein(s, t);
d := {};
loop for i in [1..#s] do
d(i,0) := i;
end loop;
loop for j in [1..#t] do
d(0,j) := j;
end loop;
loop for j in [1..#t] do
loop for i in [1..#s] do
d(i,j) := min/[
(d(i-1,j) ? 0) + 1,
(d(i,j-1) ? 0) + 1,
(d(i-1,j-1) ? 0) + if s(i) = t(j) then 0 else 1 end
];
end loop;
end loop;
return d(#s, #t);
end proc;
end program;
- Output:
kitten -> sitting: 3 rosettacode -> raisethysword: 8
Sidef
Recursive
func lev(s, t) is cached {
s || return t.len
t || return s.len
var s1 = s.slice(1)
var t1 = t.slice(1)
s[0] == t[0] ? __FUNC__(s1, t1)
: 1+Math.min(
__FUNC__(s1, t1),
__FUNC__(s, t1),
__FUNC__(s1, t )
)
}
Iterative
func lev(s, t) {
var d = [@(0 .. t.len), s.len.of {[_]}...]
for i,j in (^s ~X ^t) {
d[i+1][j+1] = (
s[i] == t[j]
? d[i][j]
: 1+Math.min(d[i][j+1], d[i+1][j], d[i][j])
)
}
d[-1][-1]
}
Calling the function:
say lev(%c'kitten', %c'sitting'); # prints: 3
say lev(%c'rosettacode', %c'raisethysword'); # prints: 8
Simula
BEGIN
INTEGER PROCEDURE LEVENSHTEINDISTANCE(S1, S2); TEXT S1, S2;
BEGIN
INTEGER N, M;
N := S1.LENGTH;
M := S2.LENGTH;
IF N = 0 THEN LEVENSHTEINDISTANCE := M ELSE
IF M = 0 THEN LEVENSHTEINDISTANCE := N ELSE
BEGIN
INTEGER ARRAY D(0:N, 0:M);
INTEGER I, J;
FOR I := 0 STEP 1 UNTIL N DO D(I, 0) := I;
FOR I := 0 STEP 1 UNTIL M DO D(0, I) := I;
S1.SETPOS(1);
FOR I := 1 STEP 1 UNTIL N DO
BEGIN
CHARACTER SI, TJ;
SI := S1.GETCHAR;
S2.SETPOS(1);
FOR J := 1 STEP 1 UNTIL M DO
BEGIN
INTEGER COST;
TJ := S2.GETCHAR;
COST := IF SI = TJ THEN 0 ELSE 1;
D(I, J) := MIN(D(I - 1, J) + 1, MIN(D(I, J - 1) + 1, D(I - 1, J - 1) + COST));
END;
END;
LEVENSHTEINDISTANCE := D(N, M);
END;
END LEVENSHTEINDISTANCE;
OUTINT(LEVENSHTEINDISTANCE("kitten", "sitting"), 0); OUTIMAGE;
OUTINT(LEVENSHTEINDISTANCE("rosettacode", "raisethysword"), 0); OUTIMAGE;
END
- Output:
3 8
Smalltalk
ST/X provides a customizable levenshtein method in the String class (weights for individual operations can be passed in):
'kitten' levenshteinTo: 'sitting' s:1 k:1 c:1 i:1 d:1 -> 3
'rosettacode' levenshteinTo: 'raisethysword' s:1 k:1 c:1 i:1 d:1 -> 8
Swift
Version using entire matrix:
func levDis(w1: String, w2: String) -> Int {
let (t, s) = (w1.characters, w2.characters)
let empty = Repeat(count: s.count, repeatedValue: 0)
var mat = [[Int](0...s.count)] + (1...t.count).map{[$0] + empty}
for (i, tLett) in t.enumerate() {
for (j, sLett) in s.enumerate() {
mat[i + 1][j + 1] = tLett == sLett ?
mat[i][j] : min(mat[i][j], mat[i][j + 1], mat[i + 1][j]).successor()
}
}
return mat.last!.last!
}
Version using only two rows at a time:
func levDis(w1: String, w2: String) -> Int {
let (t, s) = (w1.characters, w2.characters)
let empty = Repeat(count: s.count, repeatedValue: 0)
var last = [Int](0...s.count)
for (i, tLett) in t.enumerate() {
var cur = [i + 1] + empty
for (j, sLett) in s.enumerate() {
cur[j + 1] = tLett == sLett ? last[j] : min(last[j], last[j + 1], cur[j]).successor()
}
last = cur
}
return last.last!
}
Single array version
func levenshteinDistance(string1: String, string2: String) -> Int {
let m = string1.count
let n = string2.count
if m == 0 {
return n
}
if n == 0 {
return m
}
var costs = Array(0...n)
for (i, c1) in string1.enumerated() {
costs[0] = i + 1
var corner = i
for (j, c2) in string2.enumerated() {
let upper = costs[j + 1]
if c1 == c2 {
costs[j + 1] = corner
} else {
costs[j + 1] = 1 + min(costs[j], upper, corner)
}
corner = upper
}
}
return costs[n]
}
print(levenshteinDistance(string1: "rosettacode", string2: "raisethysword"))
- Output:
8
Tcl
proc levenshteinDistance {s t} {
# Edge cases
if {![set n [string length $t]]} {
return [string length $s]
} elseif {![set m [string length $s]]} {
return $n
}
# Fastest way to initialize
for {set i 0} {$i <= $m} {incr i} {
lappend d 0
lappend p $i
}
# Loop, computing the distance table (well, a moving section)
for {set j 0} {$j < $n} {} {
set tj [string index $t $j]
lset d 0 [incr j]
for {set i 0} {$i < $m} {} {
set a [expr {[lindex $d $i]+1}]
set b [expr {[lindex $p $i]+([string index $s $i] ne $tj)}]
set c [expr {[lindex $p [incr i]]+1}]
# Faster than min($a,$b,$c)
lset d $i [expr {$a<$b ? $c<$a ? $c : $a : $c<$b ? $c : $b}]
}
# Swap
set nd $p; set p $d; set d $nd
}
# The score is at the end of the last-computed row
return [lindex $p end]
}
- Usage:
puts [levenshteinDistance "kitten" "sitting"]; # Prints 3
TSE SAL
// library: math: get: damerau: levenshtein <description></description> <version>1.0.0.0.23</version> <version control></version control> (filenamemacro=getmadle.s) [kn, ri, th, 08-09-2011 23:04:55]
INTEGER PROC FNMathGetDamerauLevenshteinDistanceI( STRING s1, STRING s2 )
INTEGER L1 = Length( s1 )
INTEGER L2 = Length( s2 )
INTEGER substitutionCostI = 0
STRING h1[255] = ""
STRING h2[255] = ""
IF ( ( L1 == 0 ) OR ( L2 == 0 ) )
// Trivial case: one string is 0-length
RETURN( Max( L1, L2 ) )
ELSE
// The cost of substituting the last character
IF ( ( s1[ L1 ] ) == ( s2[ L2 ] ) )
substitutionCostI = 0
ELSE
substitutionCostI = 1
ENDIF
// h1 and h2 are s1 and s2 with the last character chopped off
h1 = SubStr( s1, 1, L1 - 1 )
h2 = SubStr( s2, 1, L2 - 1 )
IF ( ( L1 > 1 ) AND ( L2 > 1 ) AND ( s1[ L1 - 0 ] == s2[ L2 - 1 ] ) AND ( s1[ L1 - 1 ] == s2[ L2 - 0 ] ) )
RETURN( Min( Min( FNMathGetDamerauLevenshteinDistanceI( h1, s2 ) + 1, FNMathGetDamerauLevenshteinDistanceI( s1, h2 ) + 1 ), Min( FNMathGetDamerauLevenshteinDistanceI( h1 , h2 ) + substitutionCostI, FNMathGetDamerauLevenshteinDistanceI( SubStr( s1, 1, L1 - 2 ), SubStr( s2, 1, L2 - 2 ) ) + 1 ) ) )
ENDIF
RETURN( Min( Min( FNMathGetDamerauLevenshteinDistanceI( h1, s2 ) + 1, FNMathGetDamerauLevenshteinDistanceI( s1, h2 ) + 1 ), FNMathGetDamerauLevenshteinDistanceI( h1 , h2 ) + substitutionCostI ) )
ENDIF
END
PROC Main()
STRING s1[255] = "arcain"
STRING s2[255] = "arcane"
Warn( "Minimum amount of steps to convert ", s1, " to ", s2, " = ", FNMathGetDamerauLevenshteinDistanceI( s1, s2 ) ) // gives e.g. 2
s1 = "algorithm"
s2 = "altruistic"
Warn( "Minimum amount of steps to convert ", s1, " to ", s2, " = ", FNMathGetDamerauLevenshteinDistanceI( s1, s2 ) ) // gives e.g. 6
END
Turbo-Basic XL
10 DIM Word_1$(20), Word_2$(20), DLDm(21, 21)
11 CLS
20 Word_1$="kitten" : Word_2$="sitting" : ? Word_1$;" - ";Word_2$ : EXEC _DLD_ : ?
30 Word_1$="rosettacode" : Word_2$="raisethysword" : ? Word_1$;" - ";Word_2$ : EXEC _DLD_ : ?
11000 END
11600 REM DamerauLevenshteinDistance INPUT(Word_1$, Word_2$, DLDm[]) USE(I, J, K, L, M, N, Min) RETURN(INT Result)
11600 PROC _DLD_
11610 Result=0 : M=LEN(Word_1$) : N=LEN(Word_2$)
11620 FOR I=0 TO M : DLDm(I,0)=I : NEXT I
11630 FOR J=0 TO N : DLDm(0,J)=J : NEXT J
11640 FOR J=1 TO N
11650 FOR I=1 TO M
11660 IF Word_1$(I,I) = Word_2$(J,J)
11670 DLDm(I,J) = DLDm(I-1, J-1) : REM no operation required
11680 ELSE
11690 Min = DLDm(I-1, J)+1 : REM delete
11700 K = DLDm(I, J-1)+1 : REM insert
11710 L = DLDm(I-1, J-1)+1 : REM substitution
11720 IF K < Min THEN Min=K
11730 IF L < Min THEN Min=L
11740 DLDm(I,J) = Min
11750 REM IF I>1 AND J>1
11760 REM IF Word_1$(I,I) = Word_2$(J-1,J-1) AND Word_1$(I-1,I-1) = Word_2$(J,J)
11770 REM Min=DLDm(I,J) : IF Min>(DLDm(I-2,J-2)+1) THEN Min=(DLDm(I-2,J-2)+1)
11780 REM DLDm(I,J) = Min : REM transposition
11790 REM ENDIF
11800 REM ENDIF
11810 ENDIF
11820 NEXT I
11830 NEXT J
11840 Result=DLDm(M,N)
11845 REM ? "Damerau Levenshtein Distance=";Result
11846 ? "Damerau Distance=";Result
11850 ENDPROC
- Output:
kitten, sitting: 3 rosettacode, raisethysword: 8
TUSCRIPT
$$ MODE TUSCRIPT
distance=DISTANCE ("kitten", "sitting")
PRINT distance
Output:
3
TypeScript
function levenshtein(a: string, b: string): number {
const m: number = a.length,
n: number = b.length;
let t: number[] = [...Array(n + 1).keys()],
u: number[] = [];
for (let i: number = 0; i < m; i++) {
u = [i + 1];
for (let j: number = 0; j < n; j++) {
u[j + 1] = a[i] === b[j] ? t[j] : Math.min(t[j], t[j + 1], u[j]) + 1;
}
t = u;
}
return u[n];
}
uBasic/4tH
' Uses the "iterative with two matrix rows" algorithm
' referred to in the Wikipedia article.
' create two integer arrays of distances
Dim @u(128) ' previous
Dim @v(128) ' current
Print "'kitten' to 'sitting' => ";
Print FUNC(_Levenshtein ("kitten", "sitting"))
Print "'rosettacode' to 'raisethysword' => ";
Print FUNC(_Levenshtein ("rosettacode", "raisethysword"))
Print "'sleep' to 'fleeting' => ";
Print FUNC(_Levenshtein ("sleep", "fleeting"))
End
_Levenshtein
Param (2)
Local (3)
' degenerate cases
If Comp(a@, b@) = 0 Then Return (0)
If Len(a@) = 0 Then Return (Len(b@))
If Len(b@) = 0 Then Return (Len(a@))
' initialize v0
For c@ = 0 To Len(b@)
@u(c@) = c@
Next
For c@ = 0 To Len(a@) - 1
' calculate @v() from @u()
@v(0) = c@ + 1
For d@ = 0 To Len(b@) - 1
e@ = IIf(Peek (a@, c@) = Peek (b@, d@), 0, 1)
@v(d@ + 1) = min(@v(d@) + 1, Min(@u(d@ + 1) + 1, @u(d@) + e@))
Next
' copy @v() to @u() for next iteration
For d@ = 0 To Len(b@)
@u(d@) = @v(d@)
Next
Next
Return (@v(Len(b@)))
Uiua
Lev ← |1 memo(
±/↧≡◇⧻.
⨬(
/↥≡◇⧻ # If either is zero, return length of other
| /≍≡◇⊢. # Both start with same letter?
⨬(
# NO: 1 + min(Lev of (tail a, tail b), (tail a, b), (a, tail b))
+1/↧≡Lev[⊃(⍚↘1|⍜⊢⍚↘1|⍜(⊡1|⍚↘1))]
|
Lev ⍚↘1 # YES: = Lev of (tail a, tail b)
)
)
)
Lev {"kitten" "sitting"}
Lev {"rosettacode" "raisethysword"}
- Output:
3 8
Vala
class LevenshteinDistance : Object {
public static int compute (owned string s, owned string t, bool case_sensitive = false) {
var n = s.length;
var m = t.length;
var d = new int[n + 1, m + 1];
if (case_sensitive == false) {
s = s.down ();
t = t.down ();
}
if (n == 0) {
return m;
}
if (m == 0) {
return n;
}
for (var i = 0; i <= n; d[i, 0] = i++) {}
for (var j = 0; j <= m; d[0, j] = j++) {}
for (var i = 1; i <= n; i++) {
for (var j = 1; j <= m; j++) {
var cost = (t[j - 1] == s[i - 1]) ? 0 : 1;
d[i, j] = int.min (int.min (d[i - 1, j] + 1, d[i, j - 1] + 1), d[i - 1, j - 1] + cost);
}
}
return d[n, m];
}
}
VBA
Option Base 1
Function levenshtein(s1 As String, s2 As String) As Integer
Dim n As Integer: n = Len(s1) + 1
Dim m As Integer: m = Len(s2) + 1
Dim d() As Integer, i As Integer, j As Integer
ReDim d(n, m)
If n = 1 Then
levenshtein = m - 1
Exit Function
Else
If m = 1 Then
levenshtein = n - 1
Exit Function
End If
End If
For i = 1 To n
d(i, 1) = i - 1
Next i
For j = 1 To m
d(1, j) = j - 1
Next j
For i = 2 To n
For j = 2 To m
d(i, j) = WorksheetFunction.Min( _
d(i - 1, j) + 1, _
d(i, j - 1) + 1, _
(d(i - 1, j - 1) - (Mid(s1, i - 1, 1) <> Mid(s2, j - 1, 1))) _
)
Next j
Next i
levenshtein = d(n, m)
End Function
Public Sub main()
Debug.Print levenshtein("kitten", "sitting")
Debug.Print levenshtein("rosettacode", "raisethysword")
End Sub
- Output:
3 8
VBScript
' Levenshtein distance - 23/04/2020
Function Min(a,b)
If a < b then Min = a : Else Min = b
End Function 'Min
Function Levenshtein(s1, s2)
Dim d(), i, j, n1, n2, d1, d2, d3
n1 = Len(s1) + 1
n2 = Len(s2) + 1
ReDim d(n1, n2)
If n1 = 1 Then
Levenshtein = n2 - 1
Exit Function
End If
If n2 = 1 Then
Levenshtein = n1 - 1
Exit Function
End If
For i = 1 To n1
d(i, 1) = i - 1
Next
For j = 1 To n2
d(1, j) = j - 1
Next
For i = 2 To n1
For j = 2 To n2
d1 = d(i - 1, j ) + 1
d2 = d(i, j - 1) + 1
d3 = d(i - 1, j - 1) + Abs(Mid(s1, i - 1, 1) <> Mid(s2, j - 1, 1))
d(i, j) = Min(d1, Min(d2, d3))
Next
Next
Levenshtein = d(n1, n2)
End Function 'Levenshtein
Sub PrintLevenshtein(c1, c2)
WScript.StdOut.WriteLine c1&" "& c2&" "& Levenshtein(c1, c2)
End Sub 'PrintLevenshtein
PrintLevenshtein "kitten", "sitting"
PrintLevenshtein "rosettacode", "raisethysword"
PrintLevenshtein "saturday", "sunday"
PrintLevenshtein "sleep", "fleeting"
- Output:
kitten sitting 3 rosettacode raisethysword 8 saturday sunday 3 sleep fleeting 5
Visual Basic
Function min(x As Integer, y As Integer) As Integer
If x < y Then
min = x
Else
min = y
End If
End Function
Function levenshtein(s As String, t As String) As Integer
Dim ls As Integer, lt As Integer
Dim i As Integer, j As Integer, cost As Integer
' degenerate cases
ls = Len(s)
lt = Len(t)
If ls = lt Then
If s = t Then
Exit Function ' return 0
End If
ElseIf ls = 0 Then
levenshtein = lt
Exit Function
ElseIf lt = 0 Then
levenshtein = ls
Exit Function
End If
' create two integer arrays of distances
ReDim v0(0 To lt) As Integer '' previous
ReDim v1(0 To lt) As Integer '' current
' initialize v0
For i = 0 To lt
v0(i) = i
Next i
For i = 0 To ls - 1
' calculate v1 from v0
v1(0) = i + 1
For j = 0 To lt - 1
cost = Abs(CInt(Mid$(s, i + 1, 1) <> Mid$(t, j + 1, 1)))
v1(j + 1) = min(v1(j) + 1, min(v0(j + 1) + 1, v0(j) + cost))
Next j
' copy v1 to v0 for next iteration
For j = 0 To lt
v0(j) = v1(j)
Next j
Next i
levenshtein = v1(lt)
End Function
Sub Main()
' tests
Debug.Print "'kitten' to 'sitting' => "; levenshtein("kitten", "sitting")
Debug.Print "'sitting' to 'kitten' => "; levenshtein("sitting", "kitten")
Debug.Print "'rosettacode' to 'raisethysword' => "; levenshtein("rosettacode", "raisethysword")
Debug.Print "'sleep' to 'fleeting' => "; levenshtein("sleep", "fleeting")
End Sub
- Output:
'kitten' to 'sitting' => 3 'sitting' to 'kitten' => 3 'rosettacode' to 'raisethysword' => 8 'sleep' to 'fleeting' => 5
Visual Basic .NET
Function LevenshteinDistance(ByVal String1 As String, ByVal String2 As String) As Integer
Dim Matrix(String1.Length, String2.Length) As Integer
Dim Key As Integer
For Key = 0 To String1.Length
Matrix(Key, 0) = Key
Next
For Key = 0 To String2.Length
Matrix(0, Key) = Key
Next
For Key1 As Integer = 1 To String2.Length
For Key2 As Integer = 1 To String1.Length
If String1(Key2 - 1) = String2(Key1 - 1) Then
Matrix(Key2, Key1) = Matrix(Key2 - 1, Key1 - 1)
Else
Matrix(Key2, Key1) = Math.Min(Matrix(Key2 - 1, Key1) + 1, Math.Min(Matrix(Key2, Key1 - 1) + 1, Matrix(Key2 - 1, Key1 - 1) + 1))
End If
Next
Next
Return Matrix(String1.Length - 1, String2.Length - 1)
End Function
V (Vlang)
import strings
fn main() {
println(strings.levenshtein_distance("kitten", "sitting"))
println(strings.levenshtein_distance("rosettacode", "raisethysword"))
}
- Output:
3 8
Wren
var levenshtein = Fn.new { |s, t|
var ls = s.count
var lt = t.count
var d = List.filled(ls + 1, null)
for (i in 0..ls) {
d[i] = List.filled(lt + 1, 0)
d[i][0] = i
}
for (j in 0..lt) d[0][j] = j
for (j in 1..lt) {
for (i in 1..ls) {
if (s[i-1] == t[j-1]) {
d[i][j] = d[i-1][j-1]
} else {
var min = d[i-1][j]
if (d[i][j-1] < min) min = d[i][j-1]
if (d[i-1][j-1] < min) min = d[i-1][j-1]
d[i][j] = min + 1
}
}
}
return d[-1][-1]
}
System.print(levenshtein.call("kitten", "sitting"))
- Output:
3
Zig
Works with: 0.11.x, 0.12.0-dev.1381+61861ef39
For 0.10.x, replace @min(a, b, c) with std.math.min3(a, b, c).
Recursive method without memoization.
const std = @import("std");
fn levenshtein(s: []const u8, t: []const u8) usize {
// If either string is empty, difference is inserting all chars
// from the other
if (s.len == 0) return t.len;
if (t.len == 0) return s.len;
// If last letters are the same, the difference is whatever is
// required to edit the rest of the strings
if (s[s.len - 1] == t[t.len - 1])
return levenshtein(s[0 .. s.len - 1], t[0 .. t.len - 1]);
// Else try:
// changing last letter of s to that of t; or
// remove last letter of s; or
// remove last letter of t,
// any of which is 1 edit plus editing the rest of the strings
const a = levenshtein(s[0 .. s.len - 1], t[0 .. t.len - 1]);
const b = levenshtein(s, t[0 .. t.len - 1]);
const c = levenshtein(s[0 .. s.len - 1], t);
return @min(a, b, c) + 1;
}
pub fn main() std.fs.File.WriteError!void {
const stdout = std.io.getStdOut();
const stdout_w = stdout.writer();
const s1 = "rosettacode";
const s2 = "raisethysword";
try stdout_w.print("distance between '{s}' and '{s}': {d}\n", .{ s1, s2, levenshtein(s1, s2) });
return;
}
zkl
fcn levenshtein(s1,s2){
sz2,costs:=s2.len() + 1, List.createLong(sz2,0); // -->zero filled List
foreach i in (s1.len() + 1){
lastValue:=i;
foreach j in (sz2){
if (i==0) costs[j]=j;
else if (j>0){
newValue:=costs[j-1];
if (s1[i-1]!=s2[j-1])
newValue=newValue.min(lastValue, costs[j]) + 1;
costs[j-1]=lastValue;
lastValue =newValue;
}
}
if (i>0) costs[-1]=lastValue;
}
costs[-1]
}
foreach a,b in (T(T("kitten","sitting"), T("rosettacode","raisethysword"),
T("yo",""), T("","yo"), T("abc","abc")) ){
println(a," --> ",b,": ",levenshtein(a,b));
}
- Output:
kitten --> sitting: 3 rosettacode --> raisethysword: 8 yo --> : 2 --> yo: 2 abc --> abc: 0
ZX Spectrum Basic
10 REM ZX Spectrum Basic - Levenshtein distance
20 INPUT "first word:",n$
30 INPUT "second word:",m$
40 LET n=LEN n$:LET m=LEN m$:DIM d(m+1,n+1)
50 FOR i=1 TO m:LET d(i+1,1)=i:NEXT i
60 FOR j=1 TO n:LET d(1,j+1)=j:NEXT j
70 FOR j=1 TO n
80 FOR i=1 TO m
90 LET r=d(i,j)-(n$(j)=m$(i)):REM substitution
100 IF r>d(i,j+1) THEN LET r=r-1:REM insertion
110 LET d(i+1,j+1)=r+(r<=d(i+1,j)):REM deletion
120 NEXT i
130 NEXT j
140 PRINT "The Levenshtein distance between """;n$;""", """;m$;""" is ";d(m+1,n+1);"."
- Output:
The Levenshtein distance between "rosettacode", "raisethysword" is 8.
- Programming Tasks
- Solutions by Programming Task
- WikipediaSourced
- 11l
- 360 Assembly
- Action!
- Examples needing attention
- Ada
- Aime
- ALGOL 68
- AppleScript
- Arc
- Arturo
- AutoHotkey
- AWK
- BBC BASIC
- BQN
- Bracmat
- Bruijn
- C
- C sharp
- C++
- Clojure
- CLU
- COBOL
- CoffeeScript
- Common Lisp
- Crystal
- D
- Delphi
- DWScript
- Dyalect
- EasyLang
- EchoLisp
- Ela
- Elixir
- Erlang
- ERRE
- Euphoria
- F Sharp
- Factor
- Forth
- Fortran
- FreeBASIC
- Frink
- FutureBasic
- Go
- Groovy
- Haskell
- Icon
- Unicon
- Io
- IS-BASIC
- J
- Java
- JavaScript
- Jq
- Jsish
- Julia
- Kotlin
- LFE
- Liberty BASIC
- Limbo
- LiveCode
- Lobster
- Lua
- M2000 Interpreter
- Maple
- Mathematica
- Wolfram Language
- MATLAB
- MiniScript
- Modula-2
- NetRexx
- Nim
- Nu
- Oberon-2
- Objeck
- Objective-C
- OCaml
- OoRexx
- PARI/GP
- Pascal
- Perl
- Phix
- PHP
- Picat
- PicoLisp
- PL/I
- PL/M
- PowerShell
- Processing
- Processing Python mode
- Prolog
- PureBasic
- Python
- Racket
- Raku
- Refal
- REXX
- Ring
- RPL
- Ruby
- Run BASIC
- Rust
- Scala
- Scheme
- Seed7
- SenseTalk
- SequenceL
- SETL
- Sidef
- Simula
- Smalltalk
- Swift
- Tcl
- TSE SAL
- Turbo-Basic XL
- TUSCRIPT
- TypeScript
- UBasic/4tH
- Uiua
- Vala
- VBA
- VBScript
- Visual Basic
- Visual Basic .NET
- V (Vlang)
- Wren
- Zig
- Zkl
- ZX Spectrum Basic
- Pages with too many expensive parser function calls