Talk:Levenshtein distance: Difference between revisions

Content added Content deleted
(J: follow bug fix on main page)
Line 4: Line 4:


<lang j>levD=: i.@-@>:@#@] ,~ >:@i.@-@#@[ ,.~ ~:/
<lang j>levD=: i.@-@>:@#@] ,~ >:@i.@-@#@[ ,.~ ~:/
lev=: [: {. {."1@((1&{ ,~ {. + <./@}.)@,/\.)@,./@levD</lang>
lev=: [: {. {."1@((1&{ ,~ (1 1,{.) <./@:+ }.)@,/\.)@,./@levD</lang>


''First, we setup up an matrix of costs, with 0 or 1 for unexplored cells (1 being where the character pair corresponding to that cell position has two different characters). Note that the "cost to reach the empty string" cells go on the bottom and the right, instead of the top and the left, because this works better with J's "reduce" operator.''
''First, we setup up an matrix of costs, with 0 or 1 for unexplored cells (1 being where the character pair corresponding to that cell position has two different characters). Note that the "cost to reach the empty string" cells go on the bottom and the right, instead of the top and the left, because this works better with J's "reduce" operator.''
Line 25: Line 25:
The J operator / inserts its verb between each item in an array, so <code>'kitten' lev 'sitting'</code> is equivalent to:
The J operator / inserts its verb between each item in an array, so <code>'kitten' lev 'sitting'</code> is equivalent to:


<lang j>{. 1 1 1 1 1 1 1 6 {."1@((1&{ ,~ {. + <./@}.)@,/\.)@,. ... 1 1 1 1 1 0 1 1 {."1@((1&{ ,~ {. + <./@}.)@,/\.)@,. 7 6 5 4 3 2 1 0</lang>
<lang j>{. 1 1 1 1 1 1 1 6 {."1@((1&{ ,~ (1 1,{.) <./@:+ }.)@,/\.)@,. ... 1 1 1 1 1 0 1 1 {."1@((1&{ ,~ (1 1,{.) <./@:+ }.)@,/\.)@,. 7 6 5 4 3 2 1 0</lang>


Or, breaking it down and avoiding problems with long lines:
Or, breaking it down and avoiding problems with long lines:


<lang j>d=. 7 6 5 4 3 2 1 0
<lang j>d=. 7 6 5 4 3 2 1 0
d=. 1 1 1 1 1 0 1 1 {."1@((1&{ ,~ {. + <./@}.)@,/\.)@,. t
d=. 1 1 1 1 1 0 1 1 {."1@((1&{ ,~ (1 1,{.) <./@:+ }.)@,/\.)@,. t
d=. 1 1 1 1 1 1 1 2 {."1@((1&{ ,~ {. + <./@}.)@,/\.)@,. t
d=. 1 1 1 1 1 1 1 2 {."1@((1&{ ,~ (1 1,{.) <./@:+ }.)@,/\.)@,. t
d=. 1 1 0 0 1 1 1 3 {."1@((1&{ ,~ {. + <./@}.)@,/\.)@,. t
d=. 1 1 0 0 1 1 1 3 {."1@((1&{ ,~ (1 1,{.) <./@:+ }.)@,/\.)@,. t
d=. 1 1 0 0 1 1 1 4 {."1@((1&{ ,~ {. + <./@}.)@,/\.)@,. t
d=. 1 1 0 0 1 1 1 4 {."1@((1&{ ,~ (1 1,{.) <./@:+ }.)@,/\.)@,. t
d=. 1 0 1 1 0 1 1 5 {."1@((1&{ ,~ {. + <./@}.)@,/\.)@,. t
d=. 1 0 1 1 0 1 1 5 {."1@((1&{ ,~ (1 1,{.) <./@:+ }.)@,/\.)@,. t
d=. 1 1 1 1 1 1 1 6 {."1@((1&{ ,~ {. + <./@}.)@,/\.)@,. t
d=. 1 1 1 1 1 1 1 6 {."1@((1&{ ,~ (1 1,{.) <./@:+ }.)@,/\.)@,. t
{.d NB. the first number in the final instance of d is our result
{.d NB. the first number in the final instance of d is our result
</lang>
</lang>
Line 42: Line 42:


So, let's take a specific example:
So, let's take a specific example:
<lang j>1 1 1 1 1 0 1 1 {."1@((1&{ ,~ {. + <./@}.)@,/\.)@,. 7 6 5 4 3 2 1 0</lang>
<lang j>1 1 1 1 1 0 1 1 {."1@((1&{ ,~ (1 1,{.) <./@:+ }.)@,/\.)@,. 7 6 5 4 3 2 1 0</lang>


Examining this sentence, bottom up, the first thing we do is join together these two lists, as columns:
Examining this sentence, bottom up, the first thing we do is join together these two lists, as columns:
Line 60: Line 60:
That leaves us with a sentence like:
That leaves us with a sentence like:


<lang j>1 7 (1&{ ,~ {. + <./@}.)@, ... 1 1 (1&{ ,~ {. + <./@}.)@, 1 0</lang>
<lang j>1 7 (1&{ ,~ (1 1,{.) <./@:+ }.)@, ... 1 1 (1&{ ,~ (1 1,{.) <./@:+ }.)@, 1 0</lang>


Or, breaking it down, again:
Or, breaking it down, again:


<lang j>dd7=.1 0
<lang j>dd7=.1 0
dd6=.1 1 (1&{ ,~ {. + <./@}.)@, dd7
dd6=.1 1 (1&{ ,~ (1 1,{.) <./@:+ }.)@, dd7
dd5=.0 2 (1&{ ,~ {. + <./@}.)@, dd6
dd5=.0 2 (1&{ ,~ (1 1,{.) <./@:+ }.)@, dd6
dd4=.1 3 (1&{ ,~ {. + <./@}.)@, dd5
dd4=.1 3 (1&{ ,~ (1 1,{.) <./@:+ }.)@, dd5
dd3=.1 4 (1&{ ,~ {. + <./@}.)@, dd4
dd3=.1 4 (1&{ ,~ (1 1,{.) <./@:+ }.)@, dd4
dd2=.1 5 (1&{ ,~ {. + <./@}.)@, dd3
dd2=.1 5 (1&{ ,~ (1 1,{.) <./@:+ }.)@, dd3
dd1=.1 6 (1&{ ,~ {. + <./@}.)@, dd2
dd1=.1 6 (1&{ ,~ (1 1,{.) <./@:+ }.)@, dd2
dd0=.1 7 (1&{ ,~ {. + <./@}.)@, dd1</lang>
dd0=.1 7 (1&{ ,~ (1 1,{.) <./@:+ }.)@, dd1</lang>


So, for example, finding the value for dd4 is 2 3 given that we know that dd5 is 1 2
So, for example, finding the value for dd4 is 2 3 given that we know that dd5 is 1 2
Line 94: Line 94:
The downside, of these kinds of dynamic algorithms is that while the individual operations can be easy or trivial to picture, the cascading dependencies make comprehending the data as a whole rather slippery. But perhaps viewing the data as a whole, with significant intermediate results, can help:
The downside, of these kinds of dynamic algorithms is that while the individual operations can be easy or trivial to picture, the cascading dependencies make comprehending the data as a whole rather slippery. But perhaps viewing the data as a whole, with significant intermediate results, can help:


<lang j> 'kitten' {."1@((1&{ ,~ {. + <./@}.)@,/\.)@,./\.@levD 'sitting'
<lang j> 'kitten' {."1@((1&{ ,~ (1 1,{.) <./@:+ }.)@,/\.)@,./\.@levD 'sitting'
3 3 4 4 5 6 6 6
3 3 4 4 5 6 6 6
3 2 3 3 4 5 5 5
3 2 3 3 4 5 5 5