I have some code which loops through a large set of itertools.combinations
,
which is now a performance bottleneck. I'm trying to turn to numba
's @jit(nopython=True)
to speed it up, but I'm running into some issues.
First, it seems numba can't handle itertools.combinations
itself, per this small example:
import itertools
import numpy as np
from numba import jit
arr = [1, 2, 3]
c = 2
@jit(nopython=True)
def using_it(arr, c):
return itertools.combinations(arr, c)
for i in using_it(arr, c):
print(i)
throw error: numba.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Unknown attribute 'combinations' of type Module(<module 'itertools' (built-in)>)
After some googling, I found this github issue where the questioner proposed this numba-safe function for calculating permutations:
@jit(nopython=True)
def permutations(A, k):
r = [[i for i in range(0)]]
for i in range(k):
r = [[a] + b for a in A for b in r if (a in b)==False]
return r
Leveraging that, I can then easily filter down to combinations:
@jit(nopython=True)
def combinations(A, k):
return [item for item in permutations(A, k) if sorted(item) == item]
Now I can run that combinations
function without errors and get the correct result. However, this is now dramatically slower with the @jit(nopython=True)
than without it. Running this timing test:
A = list(range(20)) # numba throws 'cannot determine numba type of range' w/o list
k = 2
start = pd.Timestamp.utcnow()
print(combinations(A, k))
print(f"took {pd.Timestamp.utcnow() - start}")
clocks in at 2.6 seconds with the numba @jit(nopython=True)
decorators, and under 1/000 of a second with them commented out. So that's not really a workable solution for me either.
numba
will only shave constant factors off your computation, and not really an effective way to improve your runtime. Using the python version isn't going to be very efficient, requiring you to materialize the whole thing, also killing your performance – Microphotographitertools.combinations/permutations
are already written inC
and are very efficient. I would like to see how thenumba
homemade permutation function compares toitertools
. – Pedolist(itertools.permutations(list(range(10)), 8))
ran in0.3338
seconds. Thenumba
homemade version took about3.5
seconds. As @Microphotograph says, efficiency in generating combinations is not your problem here. – Pedoswap_options = itertools.combinations(100,2)
, then for eachswap_option
(consisting of two node indexes) I see if swapping them improves the tour length. So I'm not actually exhausting mycombinations
(or I only do once, when I hit a local optimum). But I am constantly looping through them and recalculating them – Supplementtwo_opt
function which callsitertools.combinations
, and I would like to numba@jit()
the whole function but can't without a numba-safe itertools.combinations alternative – Supplementc=2
can be computed very efficiently even though the number of combination is huge. – Vinnievinnitsa