Jump to content

Longest common subsequence: Difference between revisions

Simplified discussion of the product-order, defining it to be non-strict in keeping with wider convention. Corrected legend.
m (Revised interpretation of M as a relation.)
(Simplified discussion of the product-order, defining it to be non-strict in keeping with wider convention. Corrected legend.)
Line 6:
The [http://en.wikipedia.org/wiki/Longest_common_subsequence_problem '''Longest Common Subsequence'''] ('''LCS''') is a subsequence of maximum length common to two or more strings.
 
Let ''A'' =≡ ''A''[0]… ''A''[m - 1] and ''B'' =≡ ''B''[0]… ''B''[n - 1], m &lelt; n be strings drawn from an alphabet Σ of size s, containing every distinct symbol in A + B.
 
An ordered pair (i, j) will be calledreferred to as a match if ''A''[i] == ''B''[j], where 0 &lelt; i <&le; m and 0 &lelt; j <&le; n.
 
Define the strict Cartesian [https://en.wikipedia.org/wiki/Product_order product-order] (<&le;) over matchesordered pairs, such that (i1, j1) <&le; (i2, j2) &hArr; i1 <&le; i2 and j1 <&le; j2. Defining (>&ge;) similarly, we can write m2 <&le; m1 as m1 >&ge; m2.
 
We writesay that m1 <>, m2 toare mean''comparable'' thatif either m1 <&le; m2 or m1 >&ge; m2 holds, ''i.e.'', If i1 &lt; i2 and j2 &lt; j1 (or i2 &lt; i1 and j1 &lt; j2) then neither m1 &le; m2 nor m1 &ge; m2 are possible; and we say that m1, m2 are ''comparableincomparable''.
 
Given a product-order over the set of matches '''M''', a chain '''C''' is any subset of '''M''' wherein m1 <> m2 forwhich every pair of distinct elements m1 and m2 ofare '''C'''comparable. Similarly, an antichain '''D''' is any subset of '''M''' where m1in # m2 forwhich every pair of distinct elements m1 and m2 ofare '''D'''incomparable.
If i1 &le; i2 and j2 &le; j1 (or i2 &le; i1 and j1 &le; j2) then neither m1 < m2 nor m1 > m2 are possible; and m1, m2 are ''incomparable''.
 
The set '''M''' represents a relation over match pairs: (i, j) &isin; '''M'''[i, j] &hArr; (i, j) &isin; '''M'''[i, j]. Any chain '''C''' can be visualized as a curve which strictly increasingincreases curveas whichit passes through each match pair in the m*n coordinate space.
Defining (#) to denote this case, we write m1 # m2. Because the underlying product-order is strict, m1 == m2 (''i.e.'', i1 == i2 and j1 == j2) implies m1 # m2. m1 <> m2 implies m1 &ne; m2, ''i.e.'', that the two tuples differ in some component. Thus, the (<>) operator is the inverse of (#).
 
Finding an LCS can then be restated as the problem of finding a chain of maximum cardinality p over the set of matches '''M'''.
Given a product-order over the set of matches '''M''', a chain '''C''' is any subset of '''M''' where m1 <> m2 for every pair of distinct elements m1 and m2 of '''C'''. Similarly, an antichain '''D''' is any subset of '''M''' where m1 # m2 for every pair of distinct elements m1 and m2 of '''D'''.
 
Finding an LCS can then be restated as the problem of finding a chain of maximum cardinality p over the set of matches '''M'''.
 
The set '''M''' represents a relation over match pairs: (i, j) &isin; '''M''' &hArr; '''M'''[i, j]. Any chain '''C''' can be visualized as a strictly increasing curve which passes through each match pair in the m*n coordinate space.
 
According to [Dilworth 1950], this cardinality p equals the minimum number of disjoint antichains into which '''M''' can be decomposed. Note that such a decomposition into the minimal number p of disjoint antichains may not be unique.
Line 46 ⟶ 42:
A, B are input strings of lengths m, n respectively
p is the length of the LCS
M is the set of match pairs (i, j) such that xA[i] == yB[j]
r is the magnitude of M
s is the magnitude of the alphabet Σ of distinct symbols in xA + yB
 
'''References'''
159

edits

Cookies help us deliver our services. By using our services, you agree to our use of cookies.