Longest common substring: Difference between revisions

J
No edit summary
(J)
Line 99:
test
</pre>
 
=={{header|J}}==
 
This algorithm starts by comparing each character in the one string to each character in the other, and then iterates on this result until it finds the length of the longest common substring. So if Lx is the length of one argument string, Ly is the length of the other argument string, and Lr is the length of the result string, this algorithm uses space on the order of Lx*Ly and time on the order of Lx*Ly*Lr.
 
In other words: this can be suitable for small problems, but you might want something better if you're comparing gigabyte length strings with high commonality.
 
<lang J>lcstr=:4 :0
C=. ({.~ 1+$) x=/y
M=. >./ (* * * >. * + (_1&|.)@:|:^:2)^:_ C
N=. >./ M
y {~ (M i. N)-i.-N
)</lang>
 
Intermedate results:
 
C shows which characters are in common between the two strings.
M marks the length of the longest common substring ending at that position in the right argument
N is the length of the longest common substring
 
Example use:
 
<lang J> 'thisisatest' lcs 'testing123testing'
test</lang>
6,962

edits