Levenshtein distance/Alignment
The Levenshtein distance algorithm returns the number of atomic operations (insertion, deletion or edition) that must be performed on a string in order to obtain an other one, but it does not say anything about the actual operations used or their order.
An alignment is a notation used to describe the operations used to turn a string into an other. At some point in the strings, the minus character ('-') is placed in order to signify that a character must be added at this very place. For instance, an alignment between the words 'place' and 'palace' is:
P-LACE PALACE
For this task, write a function that shows the alignment of two strings for the corresponding levenshtein distance. As an example, use the words "rosettacode" and "raisethysword"
Perl
<lang perl>use List::Util qw(min);
sub levenshtein_distance_alignment {
my @s = ('^', split //, shift); my @t = ('^', split //, shift);
my @d; $d[$_][0] = $_ for 0 .. @s-1; $d[0][$_] = $_ for 0 .. @t-1;
my (@AS, @AT); $AS[$_][0] = join , @s[1 .. $_] for 0 .. @s-1; $AS[0][$_] = '-' x $_ for 0 .. @t-1; $AT[0][$_] = join , @t[1 .. $_] for 0 .. @t-1; $AT[$_][0] = '-' x $_ for 0 .. @s-1;
for my $i (1 .. @s-1) { J: for my $j (1 .. @t-1) { if ($s[$i] eq $t[$j]) { die "$i $j" if not defined $AS[$i-1][$j-1]; $AS[$i][$j] = $AS[$i-1][$j-1] . $s[$i]; $AT[$i][$j] = $AT[$i-1][$j-1] . $t[$j]; $d[$i][$j] = $d[$i-1][$j-1]; next J; } $d[$i][$j] = 1 + ( my $min = min $d[$i-1][$j], $d[$i][$j-1], $d[$i-1][$j-1] ); if ($d[$i-1][$j] == $min) { $AS[$i][$j] = $AS[$i-1][$j] . $s[$i]; $AT[$i][$j] = $AT[$i-1][$j] . '-'; } elsif ($d[$i][$j-1] == $min) { $AS[$i][$j] = $AS[$i][$j-1] . '-'; $AT[$i][$j] = $AT[$i][$j-1] . $t[$j]; } else { $AS[$i][$j] = $AS[$i-1][$j-1] . $s[$i]; $AT[$i][$j] = $AT[$i-1][$j-1] . $t[$j]; } } } return $AS[-1][-1], $AT[-1][-1];
}
print join "\n", levenshtein_distance_alignment "rosettacode", "raisethysword"; </lang>
- Output:
ro-settac-o-de raisethysword-