Non-continuous subsequences: Difference between revisions

Content added Content deleted

Inline

Revision as of 13:32, 29 March 2008

Consider some sequence of elements.

A subsequence contains some, but not all elements of this sequence, in the same order.

A continuous subsequence is missing elements only before its beginning and after its end.

Note: Subsequences are defined structurally, not by their contents. So a sequence a,b,c,d will always have the same subsequences and continous subsequences, no matter which values are substituted; it may be even the same value.

Task: Find all non-continuous subsequences for a given sequence. Example: For the sequence 1,2,3,4, there are five non-continuous subsequences, namely 1,3; 1,4; 2,4; 1,3,4 and 1,2,4.

Goal: There are different ways to calculate those subsequences. Demonstrate algorithm(s) that are natural for the language.

D

The list of non-continuous subsequences is constructed in the following way:

Select some continuous subsequences of length at least 3 from the original sequence;
From each such continuous subsequences, remove arbitrary elements from the middle, but always keep the first and last element.

That cannot generate duplicate sequences, because first and last elements determine the corresponding continous sequence uniquely. Also, sequences of length less than three cannot contain non-continuous subsequences by definition.

<d>module ncsub ; import std.stdio ;

struct Combi {

 private int n_, m_ ;
 alias int delegate(inout int[]) dg_type ;
 int opApply(dg_type dg) { return combinate([], 0, m_, dg) ; }
 int combinate(int[] fixed, int next, int left, dg_type dg) {
   if (left <= 0) dg(fixed) ;
   else
     for(int i = next ; i < n_ - left + 1; i++)
       combinate(fixed ~ [i+1], i + 1, left - 1, dg) ;
   return 0 ;
 }

}

T[] takesome(T)(T[] seq, int[] rmIndexs) {

 T[] res = seq.dup ;
 foreach(idx ; rmIndexs.reverse) 
   res = res[0..idx] ~ res[idx+1..$] ;
 return res ;

}

T[][] ncsub(T)(T[] seq) {

 if(seq.length < 3) return [] ;
 T[][] ncset ;
 for(int head = 0 ; head < seq.length - 2 ; head++)
   for(int tail = head + 2 ; tail < seq.length ; tail++) {
     T[] contseq = seq[head..tail+1] ;
     for(int takeNum = 1 ; takeNum < contseq.length - 1 ; takeNum++)
       foreach(removeIndexs ; Combi(contseq.length - 2, takeNum))
         ncset ~= takesome(contseq, removeIndexs) ;
   }
 return ncset ;

}

void main() {

 writefln(ncsub([1,2,3])) ;
 writefln(ncsub([1,2,3,4])) ;
 writefln(ncsub([1,2,3,4,5])) ;  // results match Haskell examples
 writefln(ncsub([1,1,1,1,1])) ;

}</d>

Haskell

Generalized monadic filter

action p x = if p x then succ x else x

fenceM p q s []     = guard (q s) >> return []
fenceM p q s (x:xs) = do
  (f,g) <- p 
  ys <- fenceM p q (g s) xs
  return $ f x ys

ncsubseq = fenceM [((:), action even), (flip const, action odd)] (>= 3) 0

Output:

*Main> ncsubseq [1..3]
[[1,3]]
*Main> ncsubseq [1..4]
[[1,2,4],[1,3,4],[1,3],[1,4],[2,4]]
*Main> ncsubseq [1..5]
[[1,2,3,5],[1,2,4,5],[1,2,4],[1,2,5],[1,3,4,5],[1,3,4],[1,3,5],[1,3],[1,4,5],[1,4],[1,5],[2,3,5],[2,4,5],[2,4],[2,5],[3,5]]

Filtered templates

This implementation works by computing templates of all possible subsequences of the given length of sequence, discarding the continuous ones, then applying the remaining templates to the input list.

continuous = null . dropWhile not . dropWhile id . dropWhile not
ncs xs = map (map fst . filter snd . zip xs) $
           filter (not . continuous) $
             mapM (const [True,False]) xs

J

Here, solution sequences are calculated abstractly by ncs, then used by ncs_of to draw items from the input list. The algorithm is filtered templates. As marked by sections, ncs (a) makes all possible sub-sequences of the given length, (b) retains those that contain an internal gap, then (c) returns a list of their index-lists.

   NB.   ======= c =======  ----b----  ========== a ==========
   ncs=: (#&.> <@:i.)~ <"1@: (#~ gap) @:(([ $ 2:) #: i.@(2^]))
   gap=: +./@:((1 i.~ 1 0 E. ])<(1 i:~ 0 1 E. ]))"1 1 @: ((##0:),.])
   ncs_of=: # (ncs@[ {&.> ]) <

Examples:

   ncs 4
+---+---+---+-----+-----+
|1 3|0 3|0 2|0 2 3|0 1 3|
+---+---+---+-----+-----+
   ncs_of 9 8 7 6
+---+---+---+-----+-----+
|8 6|9 6|9 7|9 7 6|9 8 6|
+---+---+---+-----+-----+
   ncs_of 'aeiou'
+--+--+--+---+---+--+--+---+--+---+---+----+---+---+----+----+
|iu|eu|eo|eou|eiu|au|ao|aou|ai|aiu|aio|aiou|aeu|aeo|aeou|aeiu|
+--+--+--+---+---+--+--+---+--+---+---+----+---+---+----+----+

@@ Line 6: / Line 6: @@
 A ''continuous'' subsequence is missing elements only before its beginning and after its end.
+Note: Subsequences are defined ''structurally'', not by their contents. So a sequence ''a,b,c,d'' will always have the same subsequences and continous subsequences, no matter which values are substituted; it may be even the same value.
 '''Task''': Find all non-continuous subsequences for a given sequence. Example: For the sequence ''1,2,3,4'', there are five non-continuous subsequences, namely ''1,3''; ''1,4''; ''2,4''; ''1,3,4'' and ''1,2,4''.
@@ Line 13: / Line 15: @@
 =={{header|D}}==
+The list of non-continuous subsequences is constructed in the following way:
-The task requirements didn't change, but this seems to be the only fitting template:
+# Select some continuous subsequences of length at least 3 from the original sequence;
-{{Example-needs-review}}
+# From each such continuous subsequences, remove arbitrary elements from the middle, but always keep the first and last element.
+That cannot generate duplicate sequences, because first and last elements determine the corresponding continous sequence uniquely. Also, sequences of length less than three cannot contain non-continuous subsequences by definition.
-(I may be misunderstood...) The list of non-continuous subsequences is constructed from some what different from task description, but should be equivalent way:
-#from the original sequence, select some continuous subsequences of length >= 3,
-#from each continuous subsequences in (1), remove some in-between elements beside first and last element,
-#each subsequences some elements removed in (2) is a non-continuous subsequences.
-'''Question''': How do you make sure you're not calculating the same subsequence twice?
-Sequences of length < 3 cannot contain non continuous subsequences by definition.
 <d>module ncsub ;
@@ Line 65: / Line 61: @@
   writefln(ncsub([1,2,3,4])) ;
   writefln(ncsub([1,2,3,4,5])) ;  // results match Haskell examples
-  writefln(ncsub([1,1,1,1,1])) ;  // should this return [] ? that is, no non-cont subseq.
+  writefln(ncsub([1,1,1,1,1])) ;
 }</d>
-'''Remark''': Subsequences are defined ''structurally'', not by their contents. So [1,1,1,1,1] should have exactly as many non-continuous-subsequences as [1,2,3,4,5] (they all just have values 1, so one cannot tell them apart). If your program returns different results for them, it doesn't implement the task correctly.
 =={{header|Haskell}}==