Select lexicographical smallest string after duplicates removed
Asked Answered
O

4

0

Remove all duplicates from a string and select the lexicographical smallest string possible. For example, the string cbacdcbc would return acdb, not adcb.

So this has a relatively simple solution if we don't have to select the string that's lexicographical smallest, but considering that fact, I'm not sure how to come to an efficient solution. Here's what I have so far:

    string removeDuplicateLetters(string s)
    {
        vector<bool> v(26,0);
        for(int i = 0; i < s.size(); i++) {
            v[s[i]-'a'] = 1;
        }

        string ss = "";
        for(int i = 0; i < s.size(); i++) {
            if(v[s[i]-'a']) {
                ss += s[i];
                v[s[i]-'a'] = 0;
            }
        }

        return ss;
    }
Oehsen answered 27/12, 2015 at 1:32 Comment(3)
This problem sounds like homework.Reiner
@HannoBinder actually, it's an interview question.Oehsen
Why can't abcd be the string?Gibun
I
3

Algorithm

  1. Check which letters are present in the input string: a,b,c,d.
  2. Find the first a that has all of b,c,d after it.
    Or if that's not possible, find the first b that has all of a,c,d after it.
    Or if that's not possible, find the first c that has all of a,b,d after it.
    Or if that's not possible, find the first d.
  3. Discard the start of the input string up to the selected character.
  4. Repeat from step 2 with the remaining characters to be found.

Code example

(in Javascript; my C++ is rusty). It creates a bit pattern chars to store which characters are still to be found, and an array after of bit patterns to store which characters are still available after each position.

function smallestString(input) {
    var chars = 0, after = [];
    for (var i = input.length - 1; i >= 0; i--) {
        chars |= 1 << (input.charCodeAt(i) - 97);
        after[i] = chars;
    }
    var result = "", start = 0, pos;
    while (chars) {
        for (var i = 0; i < 26; i++) {
            if (chars & (1 << i)) {
                pos = input.indexOf(String.fromCharCode(97 + i), start);
                if (chars == (chars & after[pos])) {
                    result += String.fromCharCode(97 + i);
                    chars -= 1 << i;
                    break;
                }
            }
        }
        start = pos + 1;
    }
    return result;
}

document.write(smallestString("cbacdcbc") + "<BR>");
document.write(smallestString("thequickbrownfoxjumpsoverthelazydog"));
Illusory answered 27/12, 2015 at 2:49 Comment(0)
G
1

m69's javascript in c++:

string smallestString(string input) {
    int chars = 0;
    int after[sizeof(input)];
    for (int i = input.length() - 1; i >= 0; i--) {
        chars |= 1 << (input[i] - 97);
        after[i] = chars;
    }
    string result = "";
    int start = 0, pos;
    while (chars) {
        for (int i = 0; i < 26; i++) {
            if (chars & (1 << i)) {
                pos = input.find('a' + i, start);
                if (chars == (chars & after[pos])) {
                    result += 'a' + i;
                    chars -= 1 << i;
                    break;
                }
            }
        }
        start = pos + 1;
    }
    return result;
}
Gibun answered 27/12, 2015 at 4:33 Comment(1)
Thanks; I originally learned to program with books about C and then C++, but I haven't used it in years, so I hesitate to post code in it..Pili
N
0

Sketch of an algorithm.

  1. Pass over the string, build a map of how many times each character appears, and the position of the rightmost (and possibly the only) occurrence of every character.

  2. Find the smallest character that can appear in the first position. To do this, go left to right, noting the smallest character encountered; stop when you hit the rightmost occurrence of any character. Remove all characters that precede the smallest one, and all other copies of the smallest one; update the map accordingly.

  3. Repeat starting from the character that immediately follows the smallest one from step #2.

Can terminate early once all counters in the map reach 1. Removal of other copies can be combined with normal iteration (just mark characters to be removed with 0 in the map-of-counters, skip them over in the normal search, remove them when removing the prefix).

This algorithm is quadratic in the worst case, at least in the size of the alphabet (the worst case is abc...zabc...; the algorithm examines half the string for each character, only to decide to keep it). I think this can be fixed by keeping track not just of the smallest, but also second smallest and third smallest and so on, in a kind of priority queue structure (details are left as n exercise for the reader).

Nurmi answered 27/12, 2015 at 2:49 Comment(0)
I
0

I find this approach simple.

Firstly find the count of each character.

input: s

vector<int> cnt(26);
int n=s.size();
for(int i=0;i<n;i++) cnt[s[i]-'a']++;

Have a visited vector, vector<bool> visit(26);

string ans="";
for(int i=0;i<n;i++){
    int t=s[i]-'a';
    cnt[t]--;
    if(visit[t]) continue;
    while(ans.size()>0 && s[i]<ans.back() && cnt[ans.back()-'a']>0){
        visit[ans.back()-'a']=false;
        ans.pop_back();
    }
    ans.push_back(s[i]);
    visit[t]=true;
}
return ans;

The time complexity is O(n)

Incriminate answered 27/6, 2019 at 9:57 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.