Talk:Levenshtein distance: Difference between revisions

J: follow bug fix on main page
(J: follow bug fix on main page)
Line 4:
 
<lang j>levD=: i.@-@>:@#@] ,~ >:@i.@-@#@[ ,.~ ~:/
lev=: [: {. {."1@((1&{ ,~ (1 1,{. +) <./@:+ }.)@,/\.)@,./@levD</lang>
 
''First, we setup up an matrix of costs, with 0 or 1 for unexplored cells (1 being where the character pair corresponding to that cell position has two different characters). Note that the "cost to reach the empty string" cells go on the bottom and the right, instead of the top and the left, because this works better with J's "reduce" operator.''
Line 25:
The J operator / inserts its verb between each item in an array, so <code>'kitten' lev 'sitting'</code> is equivalent to:
 
<lang j>{. 1 1 1 1 1 1 1 6 {."1@((1&{ ,~ (1 1,{. +) <./@:+ }.)@,/\.)@,. ... 1 1 1 1 1 0 1 1 {."1@((1&{ ,~ (1 1,{. +) <./@:+ }.)@,/\.)@,. 7 6 5 4 3 2 1 0</lang>
 
Or, breaking it down and avoiding problems with long lines:
 
<lang j>d=. 7 6 5 4 3 2 1 0
d=. 1 1 1 1 1 0 1 1 {."1@((1&{ ,~ (1 1,{. +) <./@:+ }.)@,/\.)@,. t
d=. 1 1 1 1 1 1 1 2 {."1@((1&{ ,~ (1 1,{. +) <./@:+ }.)@,/\.)@,. t
d=. 1 1 0 0 1 1 1 3 {."1@((1&{ ,~ (1 1,{. +) <./@:+ }.)@,/\.)@,. t
d=. 1 1 0 0 1 1 1 4 {."1@((1&{ ,~ (1 1,{. +) <./@:+ }.)@,/\.)@,. t
d=. 1 0 1 1 0 1 1 5 {."1@((1&{ ,~ (1 1,{. +) <./@:+ }.)@,/\.)@,. t
d=. 1 1 1 1 1 1 1 6 {."1@((1&{ ,~ (1 1,{. +) <./@:+ }.)@,/\.)@,. t
{.d NB. the first number in the final instance of d is our result
</lang>
Line 42:
 
So, let's take a specific example:
<lang j>1 1 1 1 1 0 1 1 {."1@((1&{ ,~ (1 1,{. +) <./@:+ }.)@,/\.)@,. 7 6 5 4 3 2 1 0</lang>
 
Examining this sentence, bottom up, the first thing we do is join together these two lists, as columns:
Line 60:
That leaves us with a sentence like:
 
<lang j>1 7 (1&{ ,~ (1 1,{. +) <./@:+ }.)@, ... 1 1 (1&{ ,~ (1 1,{. +) <./@:+ }.)@, 1 0</lang>
 
Or, breaking it down, again:
 
<lang j>dd7=.1 0
dd6=.1 1 (1&{ ,~ (1 1,{. +) <./@:+ }.)@, dd7
dd5=.0 2 (1&{ ,~ (1 1,{. +) <./@:+ }.)@, dd6
dd4=.1 3 (1&{ ,~ (1 1,{. +) <./@:+ }.)@, dd5
dd3=.1 4 (1&{ ,~ (1 1,{. +) <./@:+ }.)@, dd4
dd2=.1 5 (1&{ ,~ (1 1,{. +) <./@:+ }.)@, dd3
dd1=.1 6 (1&{ ,~ (1 1,{. +) <./@:+ }.)@, dd2
dd0=.1 7 (1&{ ,~ (1 1,{. +) <./@:+ }.)@, dd1</lang>
 
So, for example, finding the value for dd4 is 2 3 given that we know that dd5 is 1 2
Line 94:
The downside, of these kinds of dynamic algorithms is that while the individual operations can be easy or trivial to picture, the cascading dependencies make comprehending the data as a whole rather slippery. But perhaps viewing the data as a whole, with significant intermediate results, can help:
 
<lang j> 'kitten' {."1@((1&{ ,~ (1 1,{. +) <./@:+ }.)@,/\.)@,./\.@levD 'sitting'
3 3 4 4 5 6 6 6
3 2 3 3 4 5 5 5
6,962

edits