Calculating Nth permutation step?
Asked Answered
P

4

8

I have a char[26] of the letters a-z and via nested for statements I'm producing a list of sequences like:

aaa, aaz... aba, abb, abz, ... zzy, zzz.

Currently, the software is written to generate the list of all possible values from aaa-zzz and then maintains an index, and goes through each of them performing an operation on them.

The list is obviously large, it's not ridiculously large, but it's gotten to the point where the memory footprint is too large (there are also other areas being looked at, but this is one that I've got).

I'm trying to produce a formula where I can keep the index, but do away with the list of sequences and calculate the current sequence based on the current index (as the time between operations between sequences is long).

Eg:

char[] characters = {a, b, c... z};
int currentIndex = 29; // abd

public string CurrentSequence(int currentIndex)
{
    int ndx1 = getIndex1(currentIndex); // = 0
    int ndx2 = getIndex2(currentIndex); // = 1
    int ndx3 = getIndex3(currentIndex); // = 3

    return string.Format(
        "{0}{1}{2}", 
        characters[ndx1], 
        characters[ndx2], 
        characters[ndx3]); // abd
}

I've tried working out a small example using a subset (abc) and trying to index into that using modulo division, but I'm having trouble thinking today and I'm stumped.

I'm not asking for an answer, just any kind of help. Maybe a kick in the right direction?

Primula answered 24/8, 2010 at 21:21 Comment(6)
char[25] is not enough to hold a..z. You might want to check for buffer overflows or something.Apical
what exactly are you trying to achieve?Porcia
It looks like you're trying to calculate a single, particular step of a permutation sequence instead of indexing into the entire permutations result?Pyonephritis
I know the answer's been accepted, but it's a great question with a set of clever answers. I'll leave it up to the author but I suggest a better title, something like: "How to calculate Nth permutation step?" "Indexing into non-existent collection" doesn't really convey the issues, IMO.Pyonephritis
@Paul Sasik: agreed. Changed. As an aside, I did use variations on the ideas supplied by @LBushkin and @Martin Liversage to solve a problem very much associated with this one.Primula
@Paul - I don't think "permutation" is right. In the simplest form, at least, a permutation only allows each distinct item to occur once in each sequence in the list. You could argue a permutation from a bag/multiset, but IMO that's cheating ;-) LBushkin says the problem can be solved with a Cartesian product. Pedantically, I don't think that's right either - I think the list of sequences is a Cartesian product.Bolen
C
14

Hint: Think of how you would print a number in base 26 instead of base 10, and with letters instead of digits. What's the general algorithm for displaying a number in an arbitrary base?

Spoiler: (scroll right to view)

                                                                                      int ndx1 = currentIndex / 26 / 26 % 26;
                                                                                      int ndx2 = currentIndex / 26 % 26;
                                                                                      int ndx3 = currentIndex % 26;
Carmelocarmen answered 24/8, 2010 at 21:25 Comment(0)
A
6

Something like this ought to work, assuming 26 characters:

public string CurrentSequence(int currentIndex) {
    return characters[currentIndex / (26 * 26)] 
        + characters[(currentIndex / 26) % 26]
        + characters[currentIndex % 26];
}
Apical answered 24/8, 2010 at 21:25 Comment(0)
G
5

Wow, two questions in one day that can be solved via Cartesian products. Amazing.

You can use Eric Lippert's LINQ snippet to generate all combinations of the index values. This approach results in a streaming set of values, so they don't require storage in memory. This approach nicely separates the logic of generating the codes from maintaining state or performing computation with the code.

Eric's code for all combinations:

static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences)  
{  
  IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };  
  return sequences.Aggregate(  
    emptyProduct,  
    (accumulator, sequence) =>   
      from accseq in accumulator   
      from item in sequence   
      select accseq.Concat(new[] {item}));                 
} 

You can now write:

public static IEnumerable<string> AllCodes()
{
  char[] characters = {a, b, c... z}; 
  IEnumerable<char[]> codeSets = new[] { characters, characters, characters };

  foreach( var codeValues in codeSets.CartesianProduct() )
  {
    yield return 
       string.Format( "{0}{1}{2}", codeValues[0], codeValues[1], codeValues[2]);
  }
}

The code above generates a streaming sequence of all code strings from aaa to zzz. You can now use this elsewhere where you perform your processing:

foreach( var code in AllCodes() )
{
    // use the code value somehow...
}
Gobi answered 24/8, 2010 at 21:27 Comment(4)
That solution doesn't allow you to lookup an index efficiently, which is the use case.Apical
@recursive. Perhaps. Excepy the OP did indicate that the software works its way through all of the indexes anyways. So this may still be helpful.Gobi
recursive is correct but it indeed is quite useful. I had seen Eric's post about that, but had forgotten it. +1Primula
I wouldn't say looking up an index is the use case. I read it as the OPs attempt to solve the big-list-in-memory problem, and I think using a lazy enumerator is a preferable solution, given the requirements such as they are defined in the question.Invariable
C
4

There are multiple ways to solve your problem, but an option is to generate the sequence on the fly instead of storing it in a list:

IEnumerable<String> Sequence() {
  for (var c1 = 'a'; c1 <= 'z'; ++c1)
    for (var c2 = 'a'; c2 <= 'z'; ++c2)
      for (var c3 = 'a'; c3 <= 'z'; ++c3)
        yield return String.Format("{0}{1}{2}", c1, c2, c3);
}

You can then enumerate all the strings:

foreach (var s in Sequence())
  Console.WriteLine(s);

This code doesn't use indices at all but it allows you to create a loop around the sequence of strings using simple code and without storing the strings.

Cremator answered 24/8, 2010 at 21:30 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.