Longest common subsequence: Difference between revisions

Content deleted Content added
CNHume (talk | contribs)
m Adjusted References, emphasizing the priority of Claus Rick.
CNHume (talk | contribs)
Clarified statement of the Longest Common Subsequence Problem
Line 1:
{{task}}[[Category:Recursion]][[Category:Memoization]]
Define a subsequence to be any string obtained by deleting zero or more symbols from an input string.
The '''longest common subsequence''' (or [http://en.wikipedia.org/wiki/Longest_common_subsequence_problem '''LCS''']) of groups A and B is the longest group of elements from A and B that are common between the two groups and in the same order in each group. For example, the sequences "1234" and "1224533324" have an LCS of "1234":
 
The '''Longest Common Subsequence''' (or [http://en.wikipedia.org/wiki/Longest_common_subsequence_problem '''LCS''']) is a subsequence of maximum length common to two (or more) strings.
 
Let x = x[0]... x[m-1] and y = y[0]... y[n-1], m <= n be strings drawn from an alphabet Sigma of size s, containing every distinct symbol in x + y.
 
An ordered pair (i, j) will be called a match if x[i] == y[j], where 0 <= i < m and 0 <= j < n.
 
Define a strict Cartesian product-order over these ordered pairs, such that (i1, j1) < (i2, j2) iff i1 < j1 and i2 < j2.
 
Given such a product-order < over a set of matches M, a chain C is any subset of M where either p1 < p2 or p2 < p1, for distinct pairs p1 and p2 in C.
 
Finding the LCS can then be viewed as the problem of finding a chain of maximum cardinality over the set of matches M.
 
This set of matches can be represented as an m*n bit matrix, where each bit M[i, j] is True iff there is a match at the corresponding positions of strings x and y.
 
A chain can then be visualized as any monotonically increasing curve through those match bits which are set to True.
 
For example, the sequences "1234" and "1224533324" have an LCS of "1234":
'''<u>1234</u>'''
'''<u>12</u>'''245'''<u>3</u>'''332'''<u>4</u>'''