Longest common subsequence: Difference between revisions
m I thought the bold wasn't clear enough so I underlined stuff too |
|||
Line 61: | Line 61: | ||
"tsitest" |
"tsitest" |
||
</pre> |
</pre> |
||
=={{header|J}}== |
|||
lcs=: [: >./@, [: (* * 1 + >./\@:(_1&|.)@:|:^:2)^:_ [: ,&0@:,.&0 =/ |
|||
Program written by Ewart Shaw. Copied, with permission, from the [http://www.jsoftware.com/jwiki/Essays/Longest_Common_Subsequence J wiki essay] on this topic. |
|||
=={{header|Java}}== |
=={{header|Java}}== |
Revision as of 18:16, 11 May 2008
The longest common subsequence (or LCS) of groups A and B is the longest group of elements from A and B that are common between the two groups and in the same order in each group. For example, the sequences "1234" and "1224533324" have an LCS of "1234":
1234 1224533324
For a string example, consider the sequences "thisisatest" and "testing123testing". An LCS would be "tsitest":
thisisatest testing123testing
In this puzzle, your code only needs to deal with strings. Write a function which returns an LCS of two strings (case-sensitive). You don't need to show multiple LCS's.
BASIC
<qbasic>FUNCTION lcs$ (a$, b$)
IF LEN(a$) = 0 OR LEN(b$) = 0 THEN
lcs$ = ""
ELSEIF RIGHT$(a$, 1) = RIGHT$(b$, 1) THEN
lcs$ = lcs$(LEFT$(a$, LEN(a$) - 1), LEFT$(b$, LEN(b$) - 1)) + RIGHT$(a$, 1)
ELSE
x$ = lcs$(a$, LEFT$(b$, LEN(b$) - 1)) y$ = lcs$(LEFT$(a$, LEN(a$) - 1), b$) IF LEN(x$) > LEN(y$) THEN lcs$ = x$ ELSE lcs$ = y$ END IF
END IF
END FUNCTION</qbasic>
Haskell
The Wikipedia solution translates directly into Haskell, with the only difference that equal characters are added in front:
longest xs ys = if length xs > length ys then xs else ys lcs [] _ = [] lcs _ [] = [] lcs (x:xs) (y:ys) | x == y = x : lcs xs ys | otherwise = longest (lcs (x:xs) ys) (lcs xs (y:ys))
Memoization (aka dynamic programming) of that uses zip to make both the index and the character available:
import Data.Array lcs xs ys = a!(0,0) where n = length xs m = length ys a = array ((0,0),(n,m)) $ l1 ++ l2 ++ l3 l1 = [((i,m),[]) | i <- [0..n]] l2 = [((n,j),[]) | j <- [0..m]] l3 = [((i,j), f x y i j) | (x,i) <- zip xs [0..], (y,j) <- zip ys [0..]] f x y i j | x == y = x : a!(i+1,j+1) | otherwise = longest (a!(i,j+1)) (a!(i+1,j))
Both solutions work of course not only with strings, but also with any other list. Example:
*Main> lcs "thisisatest" "testing123testing" "tsitest"
Java
Recursion
This is not a particularly fast algorithm, but it gets the job done eventually. The speed is a result of many recursive functions calls.
<java>public static String lcs(String a, String b){
if(a.length() == 0 || b.length() == 0){ return ""; }else if(a.charAt(a.length()-1) == b.charAt(b.length()-1)){ return lcs(a.substring(0,a.length()-1),b.substring(0,b.length()-1)) + a.charAt(a.length()-1); }else{ String x = lcs(a, b.substring(0,b.length()-1)); String y = lcs(a.substring(0,a.length()-1), b); return (x.length() > y.length()) ? x : y; }
}</java>
Dynamic Programming
<java>public static String lcs(String a, String b) {
int[][] lengths = new int[a.length()+1][b.length()+1];
// row 0 and column 0 are initialized to 0 already
for (int i = 0; i < a.length(); i++) for (int j = 0; j < b.length(); j++) if (a.charAt(i) == b.charAt(j)) lengths[i+1][j+1] = lengths[i][j] + 1; else lengths[i+1][j+1] = Math.max(lengths[i+1][j], lengths[i][j+1]);
// read the substring out from the matrix StringBuffer sb = new StringBuffer(); for (int x = a.length(), y = b.length(); x != 0 && y != 0; ) { if (lengths[x][y] == lengths[x-1][y]) x--; else if (lengths[x][y] == lengths[x][y-1]) y--; else { assert a.charAt(x-1) == b.charAt(y-1); sb.append(a.charAt(x-1)); x--; y--; } }
return sb.reverse().toString();
}</java>