I know this question is old, but here is how it may be done efficiently.
All code currently in Python which I'm sure will be easy enough to translate to C++.
- You will probably want to move from using an integer for the characteristic vector to using a bit array, since the integers used will need 500 bits (not a problem in Python)
- Feel free to update to C++ anyone.
- Distribute to the nodes their range of combinations (
start
number and length
to process), the set of items
from which combinations are to be chosen, and the number, k
, of items to choose.
- Initialise each node by having it find its starting combination directly from
start
and items
.
- Run each node by having it do the work for the first combination then iterate through the rest of its combinations, and do the associated work.
To perform 1 do as you suggest find n-choose-k and divide it into ranges - in your case 500-Choose-5 is, as you said, 255244687600 so, for node=0 to 9 you distribute:
(start=node*25524468760, length=25524468760, items=items, k=5)
To perform 2 you can find the starting combination directly (without iteration) using the combinatorial number system and find the integer representation of the combination's characteristic vector (to be used for the iteration in 3) at the same time:
def getCombination(index, items, k):
'''Returns (combination, characteristicVector)
combination - The single combination, of k elements of items, that would be at
the provided index if all possible combinations had each been sorted in
descending order (as defined by the order of items) and then placed in a
sorted list.
characteristicVector - an integer with chosen item's bits set.
'''
combination = []
characteristicVector = 0
n = len(items)
nCk = 1
for nMinusI, iPlus1 in zip(range(n, n - k, -1), range(1, k + 1)):
nCk *= nMinusI
nCk //= iPlus1
curIndex = nCk
for k in range(k, 0, -1):
nCk *= k
nCk //= n
while curIndex - nCk > index:
curIndex -= nCk
nCk *= (n - k)
nCk -= nCk % k
n -= 1
nCk //= n
n -= 1
combination .append(items[n])
characteristicVector += 1 << n
return combination, characteristicVector
The integer representation of the characteristic vector has k bits set in the positions of the items that make up the combination.
To perform 3 you can use Gosper's hack to iterate to the next characteristic vector for the combination in the same number system (the next combination that would appear in a sorted list of reverse sorted combinations relative to items
) and, at the same time, create the combination:
def nextCombination(items, characteristicVector):
'''Returns the next (combination, characteristicVector).
combination - The next combination of items that would appear after the
combination defined by the provided characteristic vector if all possible
combinations had each been sorted in descending order (as defined by the order
of items) and then placed in a sorted list.
characteristicVector - an integer with chosen item's bits set.
'''
u = characteristicVector & -characteristicVector
v = u + characteristicVector
if v <= 0:
raise OverflowError("Ran out of integers") # <- ready for C++
characteristicVector = v + (((v ^ characteristicVector) // u) >> 2)
combination = []
copiedVector = characteristicVector
index = len(items) - 1
while copiedVector > 0:
present, copiedVector = divmod(copiedVector, 1 << index)
if present:
combination.append(items[index])
index -= 1
return combination, characteristicVector
Repeat this length-1
times (since you already found the first one directly).
For example:
Five nodes processing 7-choose-3 letters:
>>> items = ('A','B','C','D','E','F','G')
>>> k = 3
>>> nodes = 5
>>> n = len(items)
>>> for nmip1, i in zip(range(n - 1, n - k, -1), range(2, k + 1)):
... n = n * nmip1 // i
...
>>> for node in range(nodes):
... length = n // nodes
... start = node * length
... print("Node {0} initialised".format(node))
... combination, cv = getCombination(start, items, k)
... doWork(combination)
... for i in range(length-1):
... combination, cv = nextCombination(items, cv)
... doWork(combination)
...
Node 0 initialised
Doing work with: C B A
Doing work with: D B A
Doing work with: D C A
Doing work with: D C B
Doing work with: E B A
Doing work with: E C A
Doing work with: E C B
Node 1 initialised
Doing work with: E D A
Doing work with: E D B
Doing work with: E D C
Doing work with: F B A
Doing work with: F C A
Doing work with: F C B
Doing work with: F D A
Node 2 initialised
Doing work with: F D B
Doing work with: F D C
Doing work with: F E A
Doing work with: F E B
Doing work with: F E C
Doing work with: F E D
Doing work with: G B A
Node 3 initialised
Doing work with: G C A
Doing work with: G C B
Doing work with: G D A
Doing work with: G D B
Doing work with: G D C
Doing work with: G E A
Doing work with: G E B
Node 4 initialised
Doing work with: G E C
Doing work with: G E D
Doing work with: G F A
Doing work with: G F B
Doing work with: G F C
Doing work with: G F D
Doing work with: G F E
>>>