Longest common substring

Revision as of 14:24, 20 February 2015 by Nigel Galloway (talk | contribs)

Write a program that prompts the user to enter two different strings, and then output their longest common substring. Note that substrings are consecutive characters within a string. This distinguishes them from subsequences, which is any sequence of characters within a string, even if there are extraneous characters in between them. Hence, the longest common subsequence between "thisisatest" and "testing123testing" is "tsitest", whereas the longest common substring is just "test".

Longest common substring is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

References:

C#

<lang csharp>using System; namespace LongestCommonSubstring {

   class Program
   {
       static void Main(string[] args)
       {
           Console.Write("Enter word 1: ");
           string word1 = Console.ReadLine();
           Console.Write("Enter word 2: ");
           string word2 = Console.ReadLine();
           Console.WriteLine(lcs(word1, word2));
           Console.ReadKey(true);
       }
       //This algorithm for finding the longest commmon substring uses dynamic programming.
       public static string lcs(string a, string b)
       {
           var lengths = new int[a.Length, b.Length];
           int greatestLength = 0;
           string output = "";
           //Compare each consecutive character in string a with each consecutive character in string b.
           for (int i = 0; i < a.Length; i++)
           {
               for (int j = 0; j < b.Length; j++)
               {
                   //Keep a running count of the number of consecutive characters in each that are the same.
                   if (a[i] == b[j])
                   {
                       if (i == 0 || j == 0)
                       {
                           lengths[i, j] = 1;
                       }
                       else
                       {
                           lengths[i, j] = lengths[i - 1, j - 1] + 1;
                       }
                       //When you find the greatest count so far, note the substring of which it is. 
                       if (lengths[i, j] > greatestLength)
                       {
                           greatestLength = lengths[i, j];
                           output = a.Substring(i - greatestLength + 1, greatestLength);
                       }
                   }
                   else
                   {
                       lengths[i, j] = 0;
                   }
               }
           }
           //Output the substring you've noted.
           return output;
       }
   }

}</lang>

Go

Translation of: C#

<lang go>package main

import "fmt"

func lcs(a, b string) (output string) {

   lengths := make([]int, len(a)*len(b))
   greatestLength := 0
   for i, x := range a {
       for j, y := range b {
           if x == y {
               if i == 0 || j == 0 {
                   lengths[i*len(b)+j] = 1
               } else {
                   lengths[i*len(b)+j] = lengths[(i-1)*len(b)+j-1] + 1
               }
               if lengths[i*len(b)+j] > greatestLength {
                   greatestLength = lengths[i*len(b)+j]
                   output = a[i-greatestLength+1 : i+1]
               }
           }
       }
   }
   return

}

func main() {

   fmt.Println(lcs("thisisatest", "testing123testing"))

}</lang>

Output:
test