Generating all possible permutations for a given base and number of digits

Asked 23/4, 2012 at 8:47 Answered 8/6, 2021 at 20:37

I'm sure this is pretty simple, but I'm stumped for a way to do this. Essentially if I have an array with P collumns and V^P rows, how can I fill in all the combinations, that is, essentially, all possible numbers in base V of P digits. For example, for P=3 and V=2:

Keep in mind that this is an 2 dimensional array, not an array of ints.

For P=4 and V=3.

Having this array generated, the rest of work for what I'm trying to devolop is trivial. So having some code/tips on how to do this would be greatly appreciated. Thanks.

Vassaux answered 23/4, 2012 at 8:47 Comment(0)

Taking your example with P=3 and V=2, in the first column you need this sequence of numbers:

0, 0, 0, 0, 1, 1, 1, 1

So you essentially want four 0's followed by four 1's.

In the second column you need:

0, 0, 1, 1, 0, 0, 1, 1

So you want two 0's followed by two 1's, followed by the same again.

In general, in column number n, you need V^(P-n) of each digit, repeated V^(n-1) times.

Example when P=3 and V=2:

Column 1: We need V^(P-n) = 2^(3-1) = 4 of each digit, repeated V^(n-1) = 2^0 = 1 times:

[0, 0, 0, 0, 1, 1, 1, 1]

Column 2: We need V^(P-n) = 2^(3-2) = 2 of each digit, repeated V^(n-1) = 2^1 = 2 times:

[0, 0, 1, 1], [0, 0, 1, 1]

Column 3: We need V^(P-n) = 2^(3-3) = 1 of each digit, repeated V^(n-1) = 2^2 = 4 times:

[0, 1], [0, 1], [0, 1], [0, 1]

Some Python code that generates this sequence:

def sequence(v, p, column):
    subsequence = []
    for i in range(v):
        subsequence += [i] * v**(p - column)
    return subsequence * v**(column - 1)

Suomi answered 3/6, 2012 at 13:38 Comment(0)

Basically this is making a list of v^p numbers from 0 to the largest number of digit width p in base v. numpy.base_repr can be used to do this in Python:

from numpy import base_repr

def base_of_size(base, size):
    for i in range(base ** size):
        yield base_repr(i, base).rjust(size, "0")

Additionally, itertools.product(range(v), repeat=p) is another Python builtin that does the job (it turns out most efficiently--see benchmark below).

Here's the algorithm from numpy.base_repr translated to C# (Convert.ToString() is very selective about bases):

using System;
using System.Collections.Generic;

class Converter
{
    public static IEnumerable<string> BaseOfSize(int baseN, int size) 
    {
        for (int i = 0; i < Math.Pow(baseN, size); i++) 
        {
              yield return BaseRepr(i, baseN).PadLeft(size, '0');
        }
    }

    public static string BaseRepr(int n, int baseN)
    {
        string digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
        var res = new List<char>();

        for (int num = Math.Abs(n); num > 0; num /= baseN) 
        {
            res.Add(digits[num%baseN]);  
        }

        if (n < 0) res.Add('-');

        res.Reverse();
        return string.Join("", res);
    }

    public static void Main(string[] args) 
    {
        foreach (var n in BaseOfSize(2, 3)) 
        {
            Console.WriteLine(n);
        }

        Console.WriteLine();

        foreach (var n in BaseOfSize(3, 4)) 
        {
            Console.WriteLine(n);
        }
    }
}

Output:

Although the numpy version is simple to use and iterative, it's also slow. Using a recursive DFS approach means we don't have to compute each number from scratch, but can simply increment the previous number until we reach a new leaf. These versions don't use generators, but it's an easy adjustment:

Python:

def base_of_size(base, size):
    def recurse(res, row, i=0):
        if i >= size:
            res.append(row[:])
        else:
            for j in range(base):
                row[i] = j
                recurse(res, row, i + 1)

        return res

    return recurse([], [None] * size)

C#:

using System;
using System.Collections.Generic;

class Converter
{
    public static List<List<int>> BaseOfSize(int v, int p) 
    {
        var res = new List<List<int>>();
        BaseOfSize(v, p, 0, new List<int>(new int[p]), res);
        return res;
    }

    private static void BaseOfSize(int v, int p, int i, List<int> row, List<List<int>> res)
    {
        if (i >= p) 
        {
            res.Add(new List<int>(row));
        }
        else 
        {
            for (int j = 0; j < v; j++) 
            { 
                row[i] = j;
                BaseOfSize(v, p, i + 1, row, res);
            }
        }
    }
}

Quick benchmark (with generators):

from itertools import product
from time import time
from numpy import base_repr

def base_of_size(base, size):
    def recurse(res, row, i=0):
        if i >= size:
            yield row[:]
        else:
            for j in range(base):
                row[i] = j
                yield from recurse(res, row, i + 1)

        return res

    yield from recurse([], [None] * size)

def base_of_size2(base, size):
    for i in range(base ** size):
        yield base_repr(i, base).rjust(size, "0")

if __name__ == "__main__":
    start = time()
    list(base_of_size(10, 6))
    end = time()
    print("dfs:", end - start)
    start = time()
    list(base_of_size2(10, 6))
    end = time()
    print("base_repr:", end - start)
    start = time()
    list(product(range(10), repeat=6))
    end = time()
    print("product:", end - start)

Output:

dfs: 4.616123676300049
base_repr: 9.795292377471924
product: 0.5925478935241699

itertools.product wins by a long shot.

Kistner answered 9/8, 2019 at 16:18 Comment(4)

This works, but I get a stack overflow exception if the length is too great. Is there a way to implement it without recursion? – Kielty 9/8, 2019 at 17:20

OK, added a re-write. How does this version work for you? It's slower but iterative. Still thinking about the problem. – Kistner 9/8, 2019 at 22:37

Yeah, the more I look at it, the more I don't see how it's possible to overflow the stack on the recursive versions. The depth is p, which is the exponent, so even just 2^30 is going to give you a result of 1073741824 numbers, and 30 is nowhere near the stack limit. I'm not sure I can write a stack version iteratively without lots of string concatenation or array copying nor should it be necessary. If it's a memory problem, maybe a generator version will work. – Kistner 9/8, 2019 at 23:13

Don't worry about it. I messed up and passed in a value that was way too big for p. It works fine. – Kielty 12/8, 2019 at 11:38

If there is varying number of options in each "digit", this code can be used.

Maybe in some optimization tool there exist algorithm to do this, because it could be useful for brute force method. Code below adds a column which shows "the value of biggest digit", it can be ignored:

import numpy as np
val=np.arange(15)
options=[2,2,3]
print(val)
print(options)
opt = options + [1] # Assumes options to be a list
opt_cp = np.flip(np.cumprod(np.flip(np.array(opt))))
ret = np.floor_divide(val[:,np.newaxis], opt_cp[np.newaxis,:])
ret[:,1:] = np.remainder(ret[:,1:], np.array(opt[:-1])[np.newaxis,:])
inds = ret[:,1:]
print(inds)

Contraposition answered 12/8, 2019 at 6:37 Comment(0)

You could also use numpy's N-dimensional mesh grid functions.

E.g.

np.mgrid[0:2,0:2,0:2].reshape((3, 8)).T

array([[0, 0, 0],
       [0, 0, 1],
       [0, 1, 0],
       [0, 1, 1],
       [1, 0, 0],
       [1, 0, 1],
       [1, 1, 0],
       [1, 1, 1]])

np.stack(np.meshgrid(range(2), range(2), range(2), indexing='ij')).reshape(3, -1).T

or in general for any P, V:

np.mgrid[[slice(0, V)]*P].reshape((P, -1)).T

np.stack(np.meshgrid(*[range(V)]*P, indexing='ij')).reshape((P, -1)).T

There must be a more obvious way but I can't think what.

Sandusky answered 8/6, 2021 at 20:37 Comment(0)

Recommended topics

Hot tags