lexicographically smallest string after rotation
Asked Answered
L

2

9

I am trying to solve this problem in spoj

I need to find the number of rotations of a given string that will make it lexicographically smallest among all the rotations.

For example:

Original: ama

First rotation: maa

Second rotation: aam This is the lexicographically smallest rotation so the answer is 2.

Here's my code:

string s,tmp;
    char ss[100002];
    scanf("%s",ss);
    s=ss;
    tmp=s;
    int i,len=s.size(),ans=0,t=0;
    for(i=0;i<len;i++)
    {
        string x=s.substr(i,len-i)+s.substr(0,i);
        if(x<tmp)
        {
            tmp=x;
            t=ans;
        }
        ans++;
    }

    cout<<t<<endl;

I am getting "Time Limit Exceeded" for this solution. I don't understand what optimizations can be made. How can I increase the speed of my solution?

Liard answered 21/2, 2013 at 12:59 Comment(7)
Please don't use regional abbreviations like 'no.', and 'plz'. StackOverflow has a global audience, many of whom are not native English speakers. Also, what is TLE?Dulcinea
"Other" optimizations? Other than what?Finley
You are answering the wrong question. The linked question is how many rotations, not what is the lexicographically smallest answer.Banquer
@Robᵩ TLE is an acronym used in spoj, meaning Time Limit Exceeded. It looks like OP needs to speed up his solution.Tithable
@Watusimoto:yes..i need the no of rotations,but for that,i need to decide what is the lexicographically smallest stringLiard
@Robᵩ:i appreciate your advice..thanksLiard
See my answer for a detailed explanation with working code.Luckey
S
3

You can use a modified suffix array. I mean modified because you must not stop on word end.

Here is the code for a similar problem I solved (SA is the suffix array):

//719
//Glass Beads
//Misc;String Matching;Suffix Array;Circular
#include <iostream>
#include <iomanip>
#include <cstring>
#include <string>
#include <cmath>
#define MAX 10050
using namespace std;

int RA[MAX], tempRA[MAX];
int SA[MAX], tempSA[MAX];
int C[MAX];                

void suffix_sort(int n, int k) {
    memset(C, 0, sizeof C);        

    for (int i = 0; i < n; i++)        
        C[RA[(i + k)%n]]++;

    int sum = 0;
    for (int i = 0; i < max(256, n); i++) {                     
        int t = C[i]; 
        C[i] = sum; 
        sum += t;
    }

    for (int i = 0; i < n; i++)        
        tempSA[C[RA[(SA[i] + k)%n]]++] = SA[i];

    memcpy(SA, tempSA, n*sizeof(int));
}

void suffix_array(string &s) {             
    int n = s.size();

    for (int i = 0; i < n; i++) 
        RA[i] = s[i];              

    for (int i = 0; i < n; i++) 
        SA[i] = i;

    for (int k = 1; k < n; k *= 2) {     
        suffix_sort(n, k);
        suffix_sort(n, 0);

        int r = tempRA[SA[0]] = 0;
        for (int i = 1; i < n; i++) {
            int s1 = SA[i], s2 = SA[i-1];
            bool equal = true;
            equal &= RA[s1] == RA[s2];
            equal &= RA[(s1+k)%n] == RA[(s2+k)%n];

            tempRA[SA[i]] = equal ? r : ++r;     
        }

        memcpy(RA, tempRA, n*sizeof(int));
    } 
}

int main() {
    int tt; cin >> tt;
    while(tt--) {
        string s; cin >> s;
        suffix_array(s);
        cout << SA[0]+1 << endl;
   }
}

I took this implementation mostly from this book. There is an easier to write O(n log²n) version, but may not be efficient enough for your case (n=10^5). This version is O(n log n), and it's not the most efficient algorithm. The wikipedia article lists some O(n) algorithms, but I find most of them too complex to write during a programming contest. This O(n log n) is usually enough for most problems.

You can find some slides explaining suffix array concept (from the author of the book I mentioned) here.

Schuler answered 21/2, 2013 at 14:54 Comment(1)
I request you to explain the construction of suffix array or give me some link that explains construction of it.Liard
H
1

I know this comes very late but I stumbled across this from google on my search for an even faster variant of this algorithm. Turns out a good implementation is found at github: https://gist.github.com/MaskRay/8803371

It uses the lyndon factorization. That means it repeatly splits the string into lexicographically decreasing lyndon words. Lyndon word are strings that are (one of) the minimal rotations of themselves. Doing this in a circular way yields the lms of the string as the last found lyndon word.

int lyndon_word(const char *a, int n)
{
  int i = 0, j = 1, k;
  while (j < n) {
    // Invariant: i < j and indices in [0,j) \ i cannot be the first optimum
    for (k = 0; k < n && a[(i+k)%n] == a[(j+k)%n]; k++);
    if (a[(i+k)%n] <= a[(j+k)%n]) {
      // if k < n
      //   foreach p in [j,j+k], s_p > s_{p-(j-i)}
      //   => [j,j+k] are all suboptimal
      //   => indices in [0,j+k+1) \ i are suboptimal
      // else
      //   None of [j,j+k] is the first optimum
      j += k+1;
    } else {
      // foreach p in [i,i+k], s_p > s_{p+(j-i)}
      // => [i,i+k] are all suboptimal
      // => [0,j) and [0,i+k+1) are suboptimal
      // if i+k+1 < j
      //   j < j+1 and indices in [0,j+1) \ j are suboptimal
      // else
      //   i+k+1 < i+k+2 and indices in [0,i+k+2) \ (i+k+1) are suboptimal
      i += k+1;
      if (i < j)
        i = j++;
      else
        j = i+1;
    }
  }
  // j >= n => [0,n) \ i cannot be the first optimum
  return i;
}
Huntsman answered 16/5, 2016 at 22:30 Comment(3)
Please explain the gist of the faster implementation here, instead of just linking to it, so that the answer remains an answer if the link dies. This is prefered on SO.Hinder
@m69 Sure. I hope this version is better? Still quite a short explanation though.Huntsman
Info: This algorithm is the same as the phase 2 of "Shiloach's Fast Canonization Algorithm" (en.wikipedia.org/wiki/…).Cletacleti

© 2022 - 2024 — McMap. All rights reserved.