Longest common subsequence: Difference between revisions

Revision as of 07:25, 7 February 2022

275 bytes removed

Simplified discussion of the product-order, defining it to be non-strict in keeping with wider convention. Corrected legend.

CNHume

Older revision: 05:46, 6 February 2022, CNHume (minor edit: Revised interpretation of M as a relation.)

Newer revision: 07:25, 7 February 2022, CNHume

Line 6:

The [http://en.wikipedia.org/wiki/Longest_common_subsequence_problem '''Longest Common Subsequence'''] ('''LCS''') is a subsequence of maximum length common to two or more strings.

Let ''A'' ≡ ''A''[0]… ''A''[m - 1] and ''B'' ≡ ''B''[0]… ''B''[n - 1], m ≤ n be strings drawn from an alphabet Σ of size s, containing every distinct symbol in A + B.

An ordered pair (i, j) will be referred to as a match if ''A''[i] == ''B''[j], where 0 ≤ i < m and 0 ≤ j < n.

Define the [https://en.wikipedia.org/wiki/Product_order product-order] (≤) over ordered pairs, such that (i1, j1) ≤ (i2, j2) ⇔ i1 ≤ i2 and j1 ≤ j2. Defining (≥) similarly, we can write m2 ≤ m1 as m1 ≥ m2.

We say that m1, m2 are ''comparable'' if either m1 ≤ m2 or m1 ≥ m2 holds. If i1 < i2 and j2 < j1 (or i2 < i1 and j1 < j2) then neither m1 ≤ m2 nor m1 ≥ m2 can hold, and we say that m1, m2 are ''incomparable''.

Given a product-order over the set of matches '''M''', a chain '''C''' is any subset of '''M''' in which every pair of distinct elements m1 and m2 is comparable. Similarly, an antichain '''D''' is any subset of '''M''' in which every pair of distinct elements m1 and m2 is incomparable.

The set '''M''' represents a relation over match pairs: '''M'''[i, j] ⇔ (i, j) ∈ '''M'''. Any chain '''C''' can be visualized as a curve which strictly increases as it passes through each match pair in the m × n coordinate space.

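The definitions of match, product-order, chain and antichain can be made concrete with a minimal Python sketch. All function names here (<code>matches</code>, <code>le</code>, <code>comparable</code>, <code>is_chain</code>, <code>is_antichain</code>, <code>strictly_increasing</code>) are hypothetical and chosen only for illustration; they are not taken from the task's solutions. Note that under the non-strict order two matches sharing a row or a column are still comparable, so the sketch adds a separate check for the strictly increasing curve described in the text.

```python
from itertools import combinations

def matches(a, b):
    """Return the set M of index pairs (i, j) with a[i] == b[j]."""
    return {(i, j) for i, x in enumerate(a) for j, y in enumerate(b) if x == y}

def le(p, q):
    """Non-strict product-order: p <= q iff both components are <=."""
    return p[0] <= q[0] and p[1] <= q[1]

def comparable(p, q):
    """Two pairs are comparable if p <= q or q <= p holds."""
    return le(p, q) or le(q, p)

def is_chain(pairs):
    """True if every pair of distinct elements is comparable."""
    return all(comparable(p, q) for p, q in combinations(pairs, 2))

def is_antichain(pairs):
    """True if every pair of distinct elements is incomparable."""
    return all(not comparable(p, q) for p, q in combinations(pairs, 2))

def strictly_increasing(pairs):
    """Assumed extra check: both coordinates strictly increase along the chain."""
    ordered = sorted(pairs)
    return all(p[0] < q[0] and p[1] < q[1] for p, q in zip(ordered, ordered[1:]))

# Example with A = "ab" and B = "aba" (m = 2, n = 3).
M = matches("ab", "aba")
print(sorted(M))
print(is_chain({(0, 0), (1, 1)}), is_antichain({(0, 2), (1, 1)}))
```

For A = "ab" and B = "aba" the match set is {(0, 0), (0, 2), (1, 1)}. The pairs (0, 0) and (0, 2) are comparable under (≤) but share the index i = 0, so only a subset such as {(0, 0), (1, 1)} passes <code>strictly_increasing</code> and corresponds to a common subsequence.
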
Finding an LCS can be restated as the problem of finding a chain of maximum cardinality p over the set of matches '''M'''.

According to [Dilworth 1950], this cardinality p equals the minimum number of disjoint antichains into which '''M''' can be decomposed. Note that such a decomposition into the minimal number p of disjoint antichains may not be unique.

Removed from the older revision, which defined the product-order (<) as strict: We write m1 <> m2 to mean that either m1 < m2 or m1 > m2 holds, ''i.e.'', that m1, m2 are ''comparable''. If i1 ≤ i2 and j2 ≤ j1 (or i2 ≤ i1 and j1 ≤ j2) then neither m1 < m2 nor m1 > m2 can hold, and m1, m2 are ''incomparable''. Defining (#) to denote this case, we write m1 # m2. Because the underlying product-order is strict, m1 == m2 (''i.e.'', i1 == i2 and j1 == j2) implies m1 # m2. m1 <> m2 implies m1 ≠ m2, ''i.e.'', that the two tuples differ in some component. Thus, the (<>) operator is the inverse of (#).

Line 46 ⟶ 42:

A, B are input strings of lengths m, n respectively

p is the length of the LCS

M is the set of match pairs (i, j) such that A[i] == B[j]

r is the magnitude of M

s is the magnitude of the alphabet Σ of distinct symbols in A + B

'''References'''