Longest Common Substring: recursive solution?
Asked Answered
I

14

13

The common substring algorithm :

LCS(x,y) = 1+ LCS(x[0...xi-1],y[0...yj-1] if x[xi]==y[yj]
           else 0

Now the Dynamic Programming solution is well understood. However I am unable to figure out the recursive solution. If there are more than one substrings then the above algorithm seems to fail.

Eg:

x = "LABFQDB" and y = "LABDB"

Applying the above algo

1+ (x=  "LABFQD" and y = "LABD")
1+ (x=  "LABFQ" and y = "LAB")
return 0 since 'Q'!='B'

The value returned would be 2 where i should have been 3?

Can someone specify a recursive solution?

Irrefrangible answered 3/7, 2014 at 6:46 Comment(0)
B
22

Try to avoid any confusion, what you're asking is longest common substring, not longest common subsequence, they're quite similar but have differences.

The recursive method for finding longest common substring is:
Given A and B as two strings, let m as the last index for A, n as the last index for B.

    if A[m] == B[n] increase the result by 1.
    if A[m] != B[n] :
      compare with A[m -1] and B[n] or
      compare with A[m] and B[n -1] 
    with result reset to 0.

The following is the code without applying memoization for better illustrating the algorithm.

   public int lcs(int[] A, int[] B, int m, int n, int res) {
        if (m == -1 || n == -1) {
            return res;
        }
        if (A[m] == B[n]) {
            res = lcs(A, B, m - 1, n - 1, res + 1);
        }
        return max(res, max(lcs(A, B, m, n - 1, 0), lcs(A, B, m - 1, n, 0)));
    }

    public int longestCommonSubString(int[] A, int[] B) {
        return lcs(A, B, A.length - 1, B.length - 1, 0);
    }
Brut answered 2/11, 2019 at 4:6 Comment(5)
How can we add memorization to it.Accustomed
In all the solutions people start from the end of the both the strings?? Why can we not start iteration from the starting of the strings ?Jaguar
@Jaguar You can but the base case will be different.Stakeout
This solution belongs to the Longest Common Subsequence problem not the Longest Common Substring. Original question is also related to the Longest Common Subsequence.Is
@Is This is not Longest Common Subsequence. This is in fact the right solution for Longest Common Substring.Vansickle
F
3

Here is the recursive code for LONGEST COMMON SUBSTRING:

int LCS(string str1, string str2, int n, int m, int count)
{
    if (n==0 || m==0)
        return count;
    if (str1[n-1] == str2[m-1])
        return LCS(str1, str2, n-1, m-1, count+1);
    return max(count, max(LCS(str1, str2, n-1, m, 0), LCS(str1, str2, n, m-1, 0)));
}
Feathers answered 28/6, 2020 at 15:36 Comment(0)
N
1

package algo.dynamic;

public class LongestCommonSubstring {

public static void main(String[] args) {
    String a = "LABFQDB";
    String b = "LABDB";
    int maxLcs = lcs(a.toCharArray(), b.toCharArray(), a.length(), b.length(), 0);
    System.out.println(maxLcs);
}

private static int lcs(char[] a, char[] b, int i, int j, int count) {
    if (i == 0 || j == 0)
        return count;
    if (a[i - 1] == b[j - 1]) {
        count = lcs(a, b, i - 1, j - 1, count + 1);
    }
    count = Math.max(count, Math.max(lcs(a, b, i, j - 1, 0), lcs(a, b, i - 1, j, 0)));
    return count;
}

}

Neuralgia answered 2/8, 2018 at 13:12 Comment(0)
T
1

Here is my recursive solution to the longest common substring problem.

ans = 0
def solve(i,j,s1,s2,n,m,dp):
    global ans

    # i is the starting index for s1
    # j is the starting index for s2

    if(i >= n or j >= m):
        return 0

    if(dp[i][j] != -1):
        return dp[i][j]
    

    if(s1[i] == s2[j]):
        dp[i][j] = 1 + solve(i+1,j+1,s1,s2,n,m,dp)

    else:
        dp[i][j] = 0

    solve(i,j+1,s1,s2,n,m,dp)
    solve(i+1,j,s1,s2,n,m,dp)
    ans = max(ans,dp[i][j]) # ans is storing maximum till now

    return dp[i][j]

def longestCommonSubstr(s1, s2, n, m):
    global ans
    # n is the length of s1
    # m is the length s2
    dp = [[-1 for i in range(m)] for j in range(n)]
    ans= 0
    solve(0,0,s1,s2,n,m,dp)
    return ans
Tavy answered 18/1, 2022 at 8:9 Comment(0)
E
0
long max_sub(int i, int j)
{

    if(i<0 or j<0)
        return 0;

    if(s[i]==p[j])
    {
        if(dp[i][j]==-1)
          dp[i][j]=1+max_sub(i-1,j-1);
    }
    else
    {
        dp[i][j] = 0;
    }
    if(i-1>=0 and dp[i-1][j]==-1)
        max_sub(i-1, j);
    if(j-1>=0 and dp[i][j-1]==-1)
        max_sub(i, j-1);
    return dp[i][j];
}

The problem with your code seems like you are not trying all the n^2 possibilities.

Eldoneldora answered 10/10, 2017 at 20:57 Comment(0)
F
0

I designed a recursive solution for this in c++. In my approach i am taking a particular i,j and then if they are equal i am adding 1 and calling the function for i+1, j+1, while if they aren`t equal i am storing a zero at the corresponding i, j in the 2D array that i have created. After its execution i am printing the 2D array and it seems to be okay. As i am just filling the 2D array i think the time complexity must be O(mn) where m is the length of one array and n is of another array.

//Finding longestcommonsubword using top down approach [memoization]
#include<iostream>
using namespace std;

int findlength(char A[], char B[], int i, int j, int st[][5], int r, int c){

if(r <= i)
  return 0;
else if(c <= j)
  return 0;
else{
    if(A[i] == B[j]){

        if(st[i][j] == -1)
        st[i][j] = 1+findlength(A, B, i+1, j+1, st, r, c);
    }else{
        st[i][j] = 0;
        int a = findlength(A, B, i, j+1, st, r, c);
        int b = findlength(A, B, i+1, j, st, r, c);
    }
}    

return st[i][j];
}

int main(){
int n,m;
cin>>n>>m;
char A[n],B[m];

for(int i = 0;i < n;i++)
  cin>>A[i];

for(int j = 0;j < m;j++)
  cin>>B[j];

int st[n][5];


for(int k = 0; k < n;k++){
    for(int l = 0; l< 5;l++){
       st[k][l] = -1;   
    }
}
findlength(A, B, 0, 0, st, n, 5);

for(int k = 0; k < n;k++){
    for(int l = 0; l< 5;l++){
      cout<<st[k][l]<<" ";
    }
    cout<<endl;
}

return 0;
}
Fay answered 7/9, 2018 at 5:48 Comment(0)
C
0
     int max; //This gloabal variable stores max substring length
     int[][]dp; // 2D Array for Memoization

     void main(){
     //--------Main method just intended to demonstrate initialization---------

     dp = new int[m+1][n+1] //m and n are string length
     lcs(String a,String b,int n,int m)
     }
     
     //---++++++++-----Recrsive Memoized Function------++++++++++++-------
     static int lcs(String a,String b,int n,int m){
     
     if(dp[n][m]!=-1)return dp[n][m];
    
     if(n==0||m==0)return dp[n][m]=0;
     
     
     if(a.charAt(n-1)==b.charAt(m-1))
     {
        int res=0;int i=n-1,j=m-1;
        while((i>=0&&j>=0)&&a.charAt(i)==b.charAt(j)){
            res++;
            if(i==0||j==0)return dp[n][m] = Math.max(res,max);
            i--;j--;
        }
        max=Math.max(res,max);
        
        return dp[n][m]=Math.max(max,Math.max(lcs(a,b,n-res,m),lcs(a,b,n,m-res)));
         
     }
     
     return dp[n][m]=Math.max(lcs(a,b,n-1,m),lcs(a,b,n,m-1));
     
     
     
 }
Chidester answered 26/1, 2021 at 17:44 Comment(0)
O
0

Below is the recursive approach for calculating the longest common string:

public int lcsLength(String x, String y)
{
    char[] xc = x.toCharArray();
    char[] yc = y.toCharArray();

    return lcsLength(xc, xc.length - 1, yc, yc.length - 1, 0);
}

private int lcsLength(char[] xc, int xn, char[] yc, int yn, int currentCsLength)
{
    if (xn < 0 || yn < 0) {
        return currentCsLength;
    }
    if (xc[xn] == yc[yn]) {
        return lcsLength(xc, xn - 1, yc, yn - 1, currentCsLength + 1);
    }
    else {
        return max(currentCsLength,
                max(
                        lcsLength(xc, xn - 1, yc, yn, 0),
                        lcsLength(xc, xn, yc, yn - 1, 0)));
    }
}

The downside of using this solution is that it is suffering from recalculating several times the common string for the same substrings of x and y.

This solution is using memoization technique to avoid calculating several times the longest common string in the recursion.

public int lcsLength(String x, String y)
{
    char[] xc = x.toCharArray();
    char[] yc = y.toCharArray();

    Integer[][] memoization = new Integer[xc.length][yc.length];

    return lcsLength(xc, xc.length - 1, yc, yc.length - 1, memoization);
}

private int lcsLength(char[] xc, int xn, char[] yc, int yn, Integer[][] memoization)
{
    if (xn < 0 || yn < 0) {
        return 0;
    }
    if (memoization[xn][yn] == null) {
        if (xc[xn] == yc[yn]) {
            // find out how long this common subsequence is
            int i = xn - 1, j = yn - 1, length = 1;
            while (i >= 0 && j >= 0 && xc[i] == yc[j]) {
                i--;
                j--;
                length++;
            }
            memoization[xn][yn] = Math.max(length, lcsLength(xc, xn - length, yc, yn - length, memoization));
        }
        else {
            memoization[xn][yn] = max(
                    lcsLength(xc, xn - 1, yc, yn, memoization),
                    lcsLength(xc, xn, yc, yn - 1, memoization));
        }
    }

    return memoization[xn][yn];
}
Opening answered 6/8, 2021 at 21:8 Comment(0)
S
0

Hope this might help,even though there are bunch of answers!

 public static int LongestCommonSubString(String x, String y, int m, int n, int curr_max) {
        if (m == 0 || n == 0) return curr_max;

        if (x.charAt(m - 1) == y.charAt(n - 1)) return LongestCommonSubString(x, y, m - 1, n - 1, curr_max + 1);
        
        return Math.max(LongestCommonSubString(x, y, m - 1, n, 0), LongestCommonSubString(x, y, m, n - 1, 0));
    }
Soothe answered 17/9, 2021 at 14:20 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Wolbrom
L
0

Recursive + Memoization solution; All possible combinations are discovered

class Solution{
    int[][] t;
    int longestCommonSubstr(String s1, String s2, int n, int m){
        // code here
        t = new int[n+1][m+1];
        for(int i = 0; i <=n; i++){
            for(int j = 0; j <=m; j++){
                t[i][j] = -1;
            }
        }
        for(int i = 0; i<=n; i++){
            t[i][0] = 0;
        }
        for(int j = 0; j<=m; j++){
            t[0][j] = 0;
        }
        solve(s1,s2,n,m);
        // for(int i = 0; i <=n; i++){
        //     for(int j = 0; j <=m; j++){
        //         System.out.print(t[i][j]+" ");
        //     }
        //     System.out.println();
        // }
        int ans = Integer.MIN_VALUE;
        for(int i = 0; i <= n; i++){
            for(int j = 0; j <=m; j++){
                ans = Math.max(ans,t[i][j]);
            }
        }
        return ans;
    }
    
    private int solve(String s1, String s2, int m, int n){
        if(m == 0 || n == 0) return 0;
        int ans = 0;
        if(s1.charAt(m-1) == s2.charAt(n-1)){
            if(t[m][n] == -1){
                t[m][n] = 1 + solve(s1,s2,m-1,n-1);
            }
            ans = t[m][n];
        }
        if(t[m-1][n] == -1)
        t[m-1][n] = solve(s1,s2,m-1,n);
        if(t[m][n-1] == -1)
        t[m][n-1] = solve(s1,s2,m,n-1);
        return ans;

    }
}
Levantine answered 30/1, 2022 at 20:0 Comment(0)
A
0

Recursive and memorization solution for longest common substring.

i think this what you are looking for

class Solution {
  int ans = 0;
  public int findLength(int[] A, int[] B) {
    int dp[][] = new int[A.length + 1][B.length + 1];
    for (int i = 0; i < A.length + 1; i++)
      Arrays.fill(dp[i], -1);
    commonSub(A, B, A.length, B.length, dp);
    return ans;
  }
  public int commonSub(int[] x, int[] y, int n, int m, int[][] dp) {
    //Base conditon
    if (n == 0 || m == 0) return 0;
    //Memorisation
    if (dp[n][m] != -1) return dp[n][m];
    // for traversing the whole string n*m
    commonSub(x, y, n, m - 1, dp);
    commonSub(x, y, n - 1, m, dp);
    if (x[n - 1] == y[m - 1]) {
      dp[n][m] = commonSub(x, y, n - 1, m - 1, dp) + 1;
      ans = Math.max(ans, dp[n][m]);
      return dp[n][m];
    }
    return dp[n][m] = 0;
  }
}

3 variable memorization index1 and index2 and count (cause we carrying count also)

import java.util.*;
import java.io.*;
public class Solution {
  public static int lcs(String str1, String str2) {
    int l1 = str1.length(), l2 = str2.length(), lmin = Math.min(l1, l2);
    int[][][] dp = new int[l1][l2][lmin];

    // filling 3d matrix dp with -1 for not visted
    for (int[][] mat: dp) {
      for (int[] row: mat)
        Arrays.fill(row, -1);
    }

    // max unit array just for storing max value in traversal
    int[] max = {0};
    lcsUtil(dp, max, str1, str2, l1 - 1, l2 - 1, 0);
    //returning maximum we caputured in the traversal
    return max[0];
  }

  static int lcsUtil(int[][][] dp, int[] max, String s1, String s2, int l1, int l2, int cnt) {
    if (l1 < 0 || l2 < 0)
      return cnt;
    if (dp[l1][l2][cnt] != -1)
      return dp[l1][l2][cnt];
    int x1 = 0, x2 = 0, x3 = 0;
    x1 = lcsUtil(dp, max, s1, s2, l1 - 1, l2, 0);
    x2 = lcsUtil(dp, max, s1, s2, l1, l2 - 1, 0);
    if (s1.charAt(l1) == s2.charAt(l2)) {
      x3 = lcsUtil(dp, max, s1, s2, l1 - 1, l2 - 1, cnt + 1);
      max[0] = Math.max(max[0], cnt + 1);
    }
    return dp[l1][l2][cnt] = Math.max(x3, Math.max(x1, x2));
  }
}
Assisi answered 21/9, 2022 at 8:42 Comment(0)
S
0

I tried some of the code here, but was still confused. With this input

hellothelo heisahello

the algorithm, first checks for "lo" and then proceeds to check longest subword from index 8. The final answer for this input was 2, but expected one is 5 for "hello".

Am I doing something wrong here?

tried with this, looks to be working

int lcw(string a, string b, int i, int j, int count){
    if (i>=a.size() || j>=b.size()) return count;

    if (a[i] == b[j]){ 
        return max(max(lcw(a, b, i, j+1, 0), lcw(a, b, i+1, j, 0)), lcw(a, b, i+1, j+1, count+1));
    }
    else {
        return max(count, max(lcw(a, b, i, j+1, 0), lcw(a, b, i+1, j, 0)));
    }
}

I feel the mistake for below algo

int lcw(string str1, string str2, int n, int m, int count)
{
    if (n==0 || m==0)
        return count;
    if (str1[n-1] == str2[m-1])
        return lcw(str1, str2, n-1, m-1, count+1);
    return max(count, max(lcw(str1, str2, n-1, m, 0), lcw(str1, str2, n, m-1, 0)));
}

Is that once we find something equal, we move on to next character in both string, but miss out on the possibility that there might be bigger string down the line.

Stone answered 7/4, 2023 at 7:32 Comment(0)
G
0

Below is the code but i guess it isn't O(mn)

    class Solution:
    def __init__(self):
        self.memo={}
        
    def longestCommonSubstr(self, S1, S2, n, m):
        
        def f(i,j):
            if (i,j) in self.memo:
                return self.memo[(i,j)]
            if i<0 or  j<0:
                return 0
                
            if S1[i]==S2[j]:
                t=1+f(i-1,j-1)
                self.memo[(i,j)]=t
                return t
            else:
                self.memo[(i,j)]=0
                return 0
        maxm=0
        for i in range(n-1,-1,-1):
            for j in range(m-1,-1,-1):
                if (i,j) in self.memo:
                    maxm=max(maxm,self.memo[(i,j)])
                else:
                    maxm=max(maxm,f(i,j))
        return maxm
Genuine answered 20/7, 2023 at 5:56 Comment(0)
M
-2
import sun.jvm.hotspot.types.CIntegerType;

class RespObject {
    public int len;
    public boolean isSubString;
    int maxLen;

    public RespObject(int len, boolean isSubString, int maxLen) {
        this.maxLen = maxLen;
        this.len = len;
        this.isSubString = isSubString;
    }
}

public class LongestCommonSubstring {
    public static void longestCommonSubstring(String str1, String str2, int i, int j, RespObject resp) {
        if ((j == str2.length()) || (i == str1.length())) {
            resp.isSubString = false;
            resp.len = 0;
            return;
        }
        int currLen = 0;
        longestCommonSubstring(str1, str2, i + 1, j, resp);
        RespObject respObject1 = resp;
        longestCommonSubstring(str1, str2, i, j + 1, resp);
        RespObject respObject2 = resp;
        if (str1.charAt(i) == str2.charAt(j)) {
            longestCommonSubstring(str1, str2, i + 1, j + 1, resp);
            resp.len += 1;
            currLen = resp.len;
            resp.isSubString = true;
        } else {
            resp.len = 0;
            resp.isSubString = false;
        }
        resp.maxLen = Integer.max(Integer.max(respObject1.maxLen, respObject2.maxLen), currLen);
    }

    public static void main(String[] args) {
        RespObject respObject = new RespObject(0, false, Integer.MIN_VALUE);
        longestCommonSubstring("dSite:Geeksf", "wSite:GeeksQ", 0, 0, respObject);
        System.out.println(respObject.len + "   " + String.valueOf(respObject.isSubString) + "  " + respObject.maxLen);
    }
}
Melmon answered 25/8, 2019 at 15:21 Comment(3)
Could you further describe why this solution fixes the problem?Whitewing
Welcome to Stack Overflow. Answering questions here is good but a lump of code with no explanation is not very useful. Please edit the answer to include an explanation of what it does, how it works and how it answers the original question.Oarsman
Please explain what you are doing.Horde

© 2022 - 2024 — McMap. All rights reserved.