Cartesian product in Numba
Asked Answered
R

2

8

My code uses a cartesian product of a list like this:

import itertools

cartesian_product = itertools.product(list('ABCDEF'), repeat=n)

n can be any value between 0 and 4.

numba currently does not support the itertools.product. I've been unable so far to come up with a working alternative. Any suggestions are welcome!

Repugnant answered 20/7, 2019 at 22:14 Comment(0)
A
2

Here's one way to solve your problem. There are two numba functions: one for the regular cartesian product (a list of arrays), and one using the repeat argument for a simple array. This is an adaptation of some code from another one of my SO answers.

@nb.njit(nb.int32[:,:](nb.int32[:]))
def cproduct_idx(sizes: np.ndarray):
    """Generates ids tuples for a cartesian product"""
    assert len(sizes) >= 2
    tuples_count  = np.prod(sizes)
    tuples = np.zeros((tuples_count, len(sizes)), dtype=np.int32)
    tuple_idx = 0
    # stores the current combination
    current_tuple = np.zeros(len(sizes))
    while tuple_idx < tuples_count:
        tuples[tuple_idx] = current_tuple
        current_tuple[0] += 1
        # using a condition here instead of including this in the inner loop
        # to gain a bit of speed: this is going to be tested each iteration,
        # and starting a loop to have it end right away is a bit silly
        if current_tuple[0] == sizes[0]:
            # the reset to 0 and subsequent increment amount to carrying
            # the number to the higher "power"
            current_tuple[0] = 0
            current_tuple[1] += 1
            for i in range(1, len(sizes) - 1):
                if current_tuple[i] == sizes[i]:
                    # same as before, but in a loop, since this is going
                    # to get run less often
                    current_tuple[i + 1] += 1
                    current_tuple[i] = 0
                else:
                    break
        tuple_idx += 1
    return tuples

@nb.njit
def cartesian_product(*arrays):
    sizes = [len(a) for a in arrays]
    sizes = np.asarray(sizes, dtype=np.int32)
    tuples_count  = np.prod(sizes)
    array_ids = cproduct_idx(sizes)
    tuples = np.zeros((tuples_count, len(sizes)))
    for i in range(len(arrays)):
        tuples[:, i] = arrays[i][array_ids[:, i]]
    return tuples

@nb.njit
def cartesian_product_repeat(array, repeat):
    sizes = [len(array) for _ in range(repeat)]
    sizes = np.asarray(sizes, dtype=np.int32)
    tuples_count  = np.prod(sizes)
    array_ids = cproduct_idx(sizes)
    tuples = np.zeros((tuples_count, len(sizes)))
    for i in range(repeat):
        tuples[:, i] = array[array_ids[:, i]]
    return tuples

And here's an exemple of the last function's execution:

>>> cartesian_product_repeat(np.arange(2), 3)

array([[0., 0., 0.],
       [1., 0., 0.],
       [0., 1., 0.],
       [1., 1., 0.],
       [0., 0., 1.],
       [1., 0., 1.],
       [0., 1., 1.],
       [1., 1., 1.]])

Anguilliform answered 6/10, 2020 at 21:45 Comment(0)
N
1

The cartesian product can be seen as a base 6 number of length 1 to 5 (0 to 4?) like 331 or 40302.

The number of possible combinations is 6^5, from 00000(base 6) to 55555(base 6) or AAAAA to FFFFF.

So if you really need them as list, generate the numbers from 0 to 6^5-1 which is decimal 7775, and map every number to the according code.

For example for the decimal number 127, the modulo 6 is 1. 127 / 6 => 21, 21 % 6 => 3 21 / 6 => 3 (3 < 6, we're done).

Our base-6 number therefore is 331, 331 is, for a mapping A:0, B:1, ... F:5 therefore DDB.

Depending on your use case, building the whole list once will be more efficient. For very big cartesian products using just a mapping from number to element and knowing the maximum index of the list (m^n) might be much more efficient.

Niles answered 22/7, 2019 at 12:49 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.