Longest common subsequence: Difference between revisions

m
Replaced the term "monotonically" with "strictly".
m (Leveraged <> operator in the description of a chain.)
m (Replaced the term "monotonically" with "strictly".)
Line 12:
Define the strict Cartesian product-order (<) over matches, such that (i1, j1) < (i2, j2) iff i1 < j1 and i2 < j2. Defining (>) similarly, we can write m2 < m1 as m1 > m2.
 
If i1 <= j1 and i2 >= j2 (or if i1 >= i2 and i2 <= j2) then ''neither'' m1 < m2 ''nor'' m1 > m2 are possible; and m1, m2 are ''incomparable''. Defining (#) to denote this case, we write m1 # m2. '''Note:''' This includes the possibility that m1 == m2, ''i.e.'', that i1 == i2 and j1 == j2.
 
We write m1 <> m2 to mean that ''either'' m1 < m2 ''or'' m1 > m2 holds. Thus, the (<>) operator is the inverse of (#). In this case, m1 != m2, ''i.e.'', theeach twoof tuples differ in some component.
 
Given a product-order over the set of matches '''M''', a chain '''C''' is any subset of '''M''' where m1 <> m2 for every pair of distinct elements m1 and m2 of '''C'''.
Line 22:
The set of matches '''M''' can be visualized as an m*n bit matrix, where each bit '''M[i, j]''' is True iff there is a match at the corresponding positions of strings A and B.
 
Then any chain '''C''' can be visualized as a monotonicallystrictly increasing curve through those match bits which are set to True.
 
'''Background'''
Line 36:
In this case, a binary search optimization due to Hunt and Szymanski can be applied to the basic Dynamic Programming approach which results in expected performance of O(n log m). In the worst case, performance may degrade to O(m*n log m) time if the number of matches, and the space required to represent them, should grow to O(m*n).
 
''Note:'' Claus Rick has also described a linear-space algorithm with a time bound of O(n*s + min(p*m, p*(n-p))).
 
'''References'''
159

edits