Algorithm - find all permutations of string a in string b
Asked Answered
K

7

6

Say we have

string a = "abc"
string b = "abcdcabaabccbaa"

Find location of all permutations of a in b. I am trying to find an effective algorithm for this.

Pseudo code:

sort string a // O(a loga)

for windows of length a in b  // O(b)?
   sort that window of b      // O(~a loga)?
   compare to a
   if equal
      save the index

So would this be a correct algorithm? Run time would be around O(aloga + ba loga) ~= O(a loga b)? How efficient would this be? Possibly way to reduce to O(a*b) or better?

Kr answered 6/1, 2017 at 22:3 Comment(3)
since when is sorting O(n)?Intermigration
My apologies, I corrected it. I think. I am not well at Big O notations. @IntermigrationKr
@Leeor: A string belongs usually to a small alphabet, so sorting it with counting sort is O(n+s)...Canonical
I
13

sorting is very expensive, and doesn't use the fact you move along b with a sliding window.

I would use a comparison method that is location agnostic (since any permutation is valid) - assign each letter a prime number, and each string will be the multiplication of its letter values.

this way, as you go over b, each step requires just dividing by the letter you remove from he left, and multiplying with the next letter.

You also need to convince yourself that this indeed matches uniquely for each string and covers all permutations - this comes from the uniqueness of prime decomposition. Also note that on larger strings the numbers get big so you may need some library for large numbers

Intermigration answered 6/1, 2017 at 22:17 Comment(8)
Hmm, that is a very interesting approach.Kr
Clever (so +1) but probably slower than just keeping counts.Heartbroken
Say on a smaller scale, this would run as a O(a*b)? Considering it would be O(1) to assign a prime number to the letters (assume we know up to 26 prime numbers), and we divide by what the product of a is, dividing each letter in b window?Kr
Wouldn't it just be O(b), using the sliding window approach? Whenever you shift your window to the right, divide by the char you remove and multiply by the char you add. Or are you not reducing multiplication/division to O(1)?Ottie
yes, it's just O(b), but as a grows in size you may have to increase the comparison time (max prime is determined by the alphabet, but if you allow repetitions you can get multiple copies of each prime)Intermigration
Ah, I was treating the multiplication/division as O(a) considering I have to loop thru each letter to do the division.Kr
Using prime numbers makes it really inefficient as the size of a grows: For example if size of a is 20, you need to first calculate the first 20 prime numbers, and also be worried about overflow of the product of these numbers.Petra
overflow shouldn't be a real issue: you can simply compute the sum of the logs of the prime numbers. If the prime number decomposition is unique, so is the log. Then all the large number become small and it's just additions and subtractions.Food
C
2

There is no need to hash, you can just count frequencies on your sliding window, and check if it matches. Assuming the size of your alphabet is s, you get a very simple O(s(n + m)) algorithm.

// a = [1 .. m] and b = [1 .. n] are the input
cnta = [1 .. s] array initialized to 0
cntb = [1 .. s] array initialized to 0
// nb_matches = the number of i s.t. cnta[i] = cntb[i]
// thus the current subword = a iff. nb_matches = s
nb_matches = s

for i = 1 to m:
    if cntb[a[i]] = 0: nb_matches -= 1
    cntb[a[i]] += 1

ans = 0
for i = 1 to n:
    if cntb[b[i]] = cnta[b[i]]: nb_matches -= 1
    cntb[b[i]] += 1
    if nb_matches = s: ans += 1
    if cntb[b[i]] = cnta[b[i]]: nb_matches += 1
    if i - m + 1 >= 1:
        if cntb[b[i - m + 1]] = cnta[b[i - m + 1]]: nb_matches -= 1
        cntb[b[i - m + 1]] += 1
        if cntb[b[i - m + 1]] = cnta[b[i - m + 1]]: nb_matches += 1
        cntb[b[i - m + 1]] -= 1
return ans
Canonical answered 6/1, 2017 at 23:23 Comment(2)
What is this notation?Leucoderma
O(s(n + m)) what is "s" in Big O? Is time or space?Sherlene
S
2

This is almost solution but will help you to count occurrences of permutations of small strings into larger string

made for only lower case chars

This solution having --

Time Complexity - O(L) where L is length of large input provided to problem, the exact would be to include 26 too for every char present in Large array but by ignoring constant terms, I will solely stand for this.

Space Complexity - O(1) because 26 is also constant and independent of how large input would be.

int findAllPermutations(string small, string larger) {
    int freqSmall[26] = {0};
    //window size
    int n = small.length();

    //to return
    int finalAns = 0;

    for (char a : small) {
        freqSmall[a - 97]++;
    }

    int freqlarger[26]={0};
    int count = 0;
    int j = 0;

    for (int i = 0; larger[i] != '\0'; i++) {
        freqlarger[larger[i] - 97]++;
        count++;

        if (count == n) {
            count = 0;
            int i;
            for (i = 0; i < 26; i++) {
                if (freqlarger[i] != freqSmall[i]) {
                    break;
                }
            }
            if (i == 26) {
                finalAns++;
            }
            freqlarger[larger[j] - 97]--;
             j++;
        }

    }
    return finalAns;
}

int main() {
    string s, t;
    cin >> s >> t;
    cout << findAllPermutations(s, t) << endl;
    return 0;
}
Sloan answered 3/3, 2021 at 8:34 Comment(1)
Thank you for this code snippet, which might provide some limited short-term help. A proper explanation would greatly improve its long-term value by showing why this is a good solution to the problem, and would make it more useful to future readers with other, similar questions. Please edit your answer to add some explanation, including the assumptions you've made.Volatilize
H
0

Write a function strcount() to count the number of occurrences of character ch in a string or sub-sring str.

Then just pass through the search string.

  for(i=0;i<haystacklenN-NeedleN+1;i++)
  {
    for(j=0;j<needleN;j++)
       if(strcount(haystack + i, Nneedle, needle[j]) != strcount(needles, needlesN, needle[j])
        break
  }
  if(j == needleN)
         /* found a permuatation */
Hatteras answered 6/1, 2017 at 22:19 Comment(0)
P
0

Below is my solution. The space complexity is just O(a + b), and the running time (if I can calculate correctly..) is O(b*a), as for each character in b, we may do a recursion a levels deep.

md5's answer is a good one and will be faster!!

public class FindPermutations {
public static void main(String[] args) {

    System.out.println(numPerms(new String("xacxzaa"),
            new String("fxaazxacaaxzoecazxaxaz")));
    System.out.println(numPerms(new String("ABCD"),
            new String("BACDGABCDA")));
    System.out.println(numPerms(new String("AABA"),
            new String("AAABABAA")));

    // prints 4, then 3, then 3
}

public static int numPerms(final String a, final String b) {
    int sum = 0;
    for (int i = 0; i < b.length(); i++) {
        if (permPresent(a, b.substring(i))) {
            sum++;
        }
    }
    return sum;
}

// is a permutation of a present at the start of b?
public static boolean permPresent(final String a, final String b) {
    if (a.isEmpty()) {
        return true;
    }

    if (b.isEmpty()) {
        return false;
    }

    final char first = b.charAt(0);
    if (a.contains(b.substring(0, 1))) {
        // super ugly, but removes first from a
        return permPresent(a.substring(0, a.indexOf(first)) + a.substring(a.indexOf(first)+1, a.length()),
                b.substring(1));
    }
    return false;
}
}

For searchability's sake, I arrive on this page afer looking for other solutions to compare mine to, with the problem originating from watching this clip: https://www.hackerrank.com/domains/tutorials/cracking-the-coding-interview. The original problem statement was something like 'find all permutations of s in b'.

Pyrogen answered 17/10, 2017 at 1:17 Comment(0)
P
0

Use 2 hash tables and with a sliding window of size = length of smaller string:

int premutations_of_B_in_A(string large, string small) {
    unordered_map<char, int> characters_in_large;
    unordered_map<char, int> characters_in_small;
    int ans = 0;

    for (char c : small) {
        characters_in_small[c]++;
    }
    for (int i = 0; i < small.length(); i++) {
        characters_in_large[large[i]]++;
        ans += (characters_in_small == characters_in_large);
    }
    for (int i = small.length(); i < large.length(); i++) {
        characters_in_large[large[i]]++;
        if (characters_in_large[large[i - small.length()]]-- == 1)
            characters_in_large.erase(large[i - small.length()]);

        ans += (characters_in_small == characters_in_large);
    }
    return ans;
}
Prominence answered 4/2, 2021 at 5:44 Comment(0)
P
0

Another approach involves using a sliding window with character count dictionaries to match permutations of S in B.

  • TC - O(N * M)
  • SC - O(1)
from collections import Counter


def find_permutations_in_string(S, B):
    f1 = Counter(S)
    len_s = len(S)
    len_b = len(B)
    res = []

    if f1 == Counter(B[:len_s]):
        res.append(0)

    l, r = 1, len_s

    while r < len_b:

        f2 = Counter(B[l : r + 1])

        if f1 == f2:
            res.append(l)

        l += 1
        r += 1

    return res


S = "abc"
B = "cbabcacab"
result = find_permutations_in_string(S, B)
print(result)  # Output: [0, 2, 3, 6]

S = "abc"
B = "defghicba"
result = find_permutations_in_string(S, B)
print(result)  # Output: [6]

S = "a"
B = "aaaaa"
result = find_permutations_in_string(S, B)
print(result)  # Output: [0, 1, 2, 3, 4]

S = "abc"
B = "aabbcc"
result = find_permutations_in_string(S, B)
print(result)  # Output: []
Prostration answered 6/6 at 6:57 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.