Is there any way of finding the longest common subsequence of two sequences in O(NlogN) time?
I read somewhere that there is a way to achieve this using binary search.
I know the dp approach that takes O(N2) time.
Is there any way of finding the longest common subsequence of two sequences in O(NlogN) time?
I read somewhere that there is a way to achieve this using binary search.
I know the dp approach that takes O(N2) time.
For the general case, the O(N^2) dynamic programming algorithm is the best you can do. However, there exist better algorithms in some special cases.
This is a very common situation. Sequences consisting of letters from some alphabet (e.g. English) lie in this category. For this case, the O(N*M) algorithm can be optimised to get an O(N^2/logN) with method of four Russians. I don't know how exactly, you can search for it.
An example problem is "Given two permutations of numbers from 1 to N, find their LCS". This one can be solved in O(N*logN).
Let the sequences be A and B.
Define a sequence C. C[i] is the index of B[i] in A. (A[C[i]] = B[i])
LCS of A and B is the longest increasing subsequence of C.
The dynamic programming approach, which is O(n2) for general case. For certain other cases, there are lower-complexity algorithms:
For a fixed alphabet size (which doesn't grow with n), there's the Method of Four Russians which brings the time down to O(n2/log n) (see here).
See here another further optimized case.
diff
. –
Immoral Assuming Exponential Time Hypothesis (which is stricter than P is not equal to NP, but is still widely believed to be true), it is not possible to do it in time O(N^{2 - eps}) for any positive constant eps, see "Quadratic Conditional Lower Bounds for String Problems and Dynamic Time Warping" by Karl Bringmann and Marvin Kunnemann (pre-print on arXiv is available).
Roughly speaking, it means that the general case of this problem cannot be solved in time better than something like O(N^2/log N), so if you want faster algorithms you have to consider additional constraints (some properties of the strings) or look for approximate solution.
The longest common subsequence between two sequences is essentially n-squared.
Masek and Patterson (1980) made a minor improvement to n-squared / log n using the so-called "Four Russians" technique.
In most cases the additional complexity introduced by such convoluted approaches is not justified by the small gains. For practical purposes you can consider the n-squared approach as the reasonable optimum in typical applications.
diff
. –
Immoral vector <int> LIS;
int LongestIncreasingSubsequence(int n){
if(!n) return 0;
LIS.emplace_back(arr[0]);
for(int i = 1; i < n; i++){
if(arr[i] > LIS.back()) LIS.emplace_back(arr[i]);
else *lower_bound(LIS.begin(), LIS.end(), arr[i]) = arr[i];
}
return LIS.size();
}
The following is a O(NLOGM) approach of the LCS problem that uses binary search:
//If the strings are not the same length, take the longest one
//Map each letter with a ArrayList representing the occurences on that letter (indexes)
//The ArrayLists are thus sorted. This will enable binary search later on
Map<Character, List<Integer>> map = new HashMap<>();
for (int i = 0; i < first.length(); i++) {
char c = first.charAt(i);
if (!map.containsKey(c)) {
map.put(c, new ArrayList<>());
}
map.get(c).add(i);
}
//Loop on the second string and increment an index to check that any matching character is a valid match
//For sequence match we need to maintain the same order as in the initial string so we use an increasing index for that
int index = 0;
StringBuilder lcs = new StringBuilder();
for (int j = 0; j < second.length(); j++) {
char c = second.charAt(j);
if (map.containsKey(c)) {
List<Integer> indexes = map.get(c);
//The following method uses Binary Search to find the first bigger or equal index
//This is possible since the indexes list is sorted by definition
int firstBigger = firstBiggerOrEqualIndex(indexes, index);
if (firstBigger != -1) {
index = firstBigger+1;
lcs.append(c);
}
}
}
return lcs.toString();
The method firstBiggerOrEqualIndex uses Binary Search like this:
private static int firstBiggerOrEqualIndex(List<Integer> indexes, int index) {
int start = 0;
int end = indexes.size()-1;
int res = -1;
while(start <= end) {
int mid = (start+end)/2;
int value = indexes.get(mid);
if (value > index) {
res = value;
end = mid-1;
}
else if (value == index) {
res = value;
break;
}
else {
start = mid + 1;
}
}
return res;
}
© 2022 - 2024 — McMap. All rights reserved.