Unique combinations of a list of tuples
Asked Answered
S

5

11

Given a list of 3-tuples, for example:[(1,2,3), (4,5,6), (7,8,9)] how would you compute all possible combinations and combinations of subsets?

In this case the result should look like this:

[
(1), (1,4), (1,5), (1,6), (1,7), (1,8), (1,9), (1,4,7), (1,4,8), (1,4,9), (1,5,7), (1,5,8), (1,5,9), (1,6,7), (1,6,8), (1,6,9),
(2), ...,
(3), ...,
(4), (4,7), (4,8), (4,9), 
(5), (5,7), (5,8), (5,9), 
(6), (6,7), (6,8), (6,9), 
(7), (8), (9)
]
  • all tuples with identical elements are regarded the same
  • combinations which derive from the same tuples are not allowed (e.g. these shouldn't be in the solution: (1,2), (4,6) or (7,8,9))
Spiniferous answered 10/8, 2019 at 18:22 Comment(3)
But wait, why (1) to (9) are part of the soultion if (1,2) is not allowed given the second rule ?Gendarmerie
It looks like there are three sets of tuples: 1) [(x,) for x in the_list[0]], 2) [(x,y) for x in the_list[0] for y in the_list[1]], and 3) [(x,y,z) for x in the_list[0] for y in the_list[1] for z in the_list[2]].Yockey
Possible duplicate of Picking unordered combinations from pools with overlapInterest
K
10

You can use recursion with a generator:

data = [(1,2,3), (4,5,6), (7,8,9)]
def combos(d, c = []):
   if len(c) == len(d):
     yield c
   else:
     for i in d:
        if i not in c:
           yield from combos(d, c+[i])

def product(d, c = []):
  if c:
    yield tuple(c)
  if d:
    for i in d[0]:
      yield from product(d[1:], c+[i])

result = sorted({i for b in combos(data) for i in product(b)})
final_result = [a for i, a in enumerate(result) if all(len(c) != len(a) or len(set(c)&set(a)) != len(a) for c in result[:i])]

Output:

[(1,), (1, 4), (1, 4, 7), (1, 4, 8), (1, 4, 9), (1, 5), (1, 5, 7), (1, 5, 8), (1, 5, 9), (1, 6), (1, 6, 7), (1, 6, 8), (1, 6, 9), (1, 7), (1, 8), (1, 9), (2,), (2, 4), (2, 4, 7), (2, 4, 8), (2, 4, 9), (2, 5), (2, 5, 7), (2, 5, 8), (2, 5, 9), (2, 6), (2, 6, 7), (2, 6, 8), (2, 6, 9), (2, 7), (2, 8), (2, 9), (3,), (3, 4), (3, 4, 7), (3, 4, 8), (3, 4, 9), (3, 5), (3, 5, 7), (3, 5, 8), (3, 5, 9), (3, 6), (3, 6, 7), (3, 6, 8), (3, 6, 9), (3, 7), (3, 8), (3, 9), (4,), (4, 7), (4, 8), (4, 9), (5,), (5, 7), (5, 8), (5, 9), (6,), (6, 7), (6, 8), (6, 9), (7,), (8,), (9,)]
Ketene answered 10/8, 2019 at 18:27 Comment(0)
F
6

Here is a non-recursive solution with a simple for loop. Uniqueness is enforced by applying set to the list of the outputted tuples.

lsts = [(1,2,3), (4,5,6), (7,8,9)]

res = [[]]
for lst in lsts:
    res += [(*r, x) for r in res for x in lst]

# print({tuple(lst) for lst in res[1:]})
# {(5, 9), (4, 7), (6, 9), (1, 4, 7), (2, 6, 9), (4, 8), (3, 4, 7), (2,
# 8), (2, 6, 8), (9,), (2, 5, 8), (1, 6), (3, 6, 8), (2, 5, 9), (3, 5,
# 9), (3, 7), (2, 5), (3, 6, 9), (5, 8), (1, 6, 8), (3, 5, 8), (2, 6,
# 7), (4, 9), (6, 7), (1,), (2, 9), (1, 6, 9), (3,), (1, 5), (5,), (3,
# 6), (7,), (3, 6, 7), (1, 5, 9), (2, 6), (2, 4, 7), (1, 5, 8), (3, 4,
# 8), (8,), (3, 4, 9), (1, 4), (1, 6, 7), (3, 9), (1, 9), (2, 5, 7), (3,
# 5), (2, 7), (2, 4, 9), (6, 8), (1, 5, 7), (2,), (2, 4, 8), (5, 7), (1,
# 4, 8), (3, 5, 7), (4,), (3, 8), (1, 8), (1, 4, 9), (6,), (1, 7), (3,
# 4), (2, 4)}
Freewheel answered 10/8, 2019 at 20:18 Comment(2)
Great solution! Only three lines of pure Python code, no imports necessary, by far the fastest solution and currently the only one that also includes the empty set (which should be part of the solution, IMHO).Elexa
BTW: You mention that uniqueness is enforced by applying set, but I think the list is already correct (i.e. contains only unique values, no duplicates).Elexa
E
3

Using itertools:

import itertools as it

def all_combinations(groups):
    result = set()
    for prod in it.product(*groups):
        for length in range(1, len(groups) + 1): 
            result.update(it.combinations(prod, length))
    return result

all_combinations([(1,2,3), (4,5,6), (7,8,9)])
Enthusiastic answered 10/8, 2019 at 18:35 Comment(0)
S
1

Another version:

from itertools import product, combinations

lst = [(1,2,3), (4,5,6), (7,8,9)]

def generate(lst):
    for idx in range(len(lst)):
        for val in lst[idx]:
            yield (val,)
            for j in range(1, len(lst)):
                for c in combinations(lst[idx+1:], j):
                    yield from tuple((val,) + i for i in product(*c))

l = [*generate(lst)]
print(l)

Prints:

[(1,), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (1, 4, 7), (1, 4, 8), (1, 4, 9), (1, 5, 7), (1, 5, 8), (1, 5, 9), (1, 6, 7), (1, 6, 8), (1, 6, 9), (2,), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8), (2, 9), (2, 4, 7), (2, 4, 8), (2, 4, 9), (2, 5, 7), (2, 5, 8), (2, 5, 9), (2, 6, 7), (2, 6, 8), (2, 6, 9), (3,), (3, 4), (3, 5), (3, 6), (3, 7), (3, 8), (3, 9), (3, 4, 7), (3, 4, 8), (3, 4, 9), (3, 5, 7), (3, 5, 8), (3, 5, 9), (3, 6, 7), (3, 6, 8), (3, 6, 9), (4,), (4, 7), (4, 8), (4, 9), (5,), (5, 7), (5, 8), (5, 9), (6,), (6, 7), (6, 8), (6, 9), (7,), (8,), (9,)]
Subscribe answered 10/8, 2019 at 18:48 Comment(0)
G
1

Thank to @wovano for clarifying rule 2. This makes the solution even more shorter:

from itertools import chain, combinations, product

blubb = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

set_combos = chain.from_iterable(combinations(blubb, i) for i in range(len(blubb) + 1))
result_func = list(chain.from_iterable(map(lambda x: product(*x), set_combos)))

And as a bonus a speed comparison. @hilberts_drinking_problem's solution is awesome but there is an overhead.

def pure_python(list_of_tuples):
    res = [tuple()]
    for lst in list_of_tuples:
        res += [(*r, x) for r in res for x in lst]
    return res


def with_itertools(list_of_tuples):
    set_combos = chain.from_iterable(combinations(list_of_tuples, i) for i in range(len(list_of_tuples) + 1))
    return list(chain.from_iterable(map(lambda x: product(*x), set_combos)))


assert sorted(with_itertools(blubb), key=str) == sorted(pure_python(blubb), key=str)

Both computes the same stuff, but...

%timeit with_itertools(blubb)
7.18 µs ± 11.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit pure_python(blubb)
10.5 µs ± 46 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

although for small samples itertools are little bit faster there is a large gap when the set grows:

import toolz
large_blubb = list(toolz.partition_all(3, range(3*10)))
assert sorted(with_itertools(large_blubb), key=str) == sorted(pure_python(large_blubb), key=str)

%timeit with_itertools(large_blubb)
106 ms ± 307 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit pure_python(large_blubb)
262 ms ± 1.85 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

which makes itertools's solution 2,5 times faster.


Edit: corrected according to rule 2. Plus speed comparison Edit: fix yet another mistake - now the speed comparison is realistic

Gendarmerie answered 10/8, 2019 at 19:32 Comment(6)
It seems you didn't understand the problem completely. Double-check the last requirement: "combinations which derive from the same tuples are not allowed". So it means you should use 0 or 1 element from (1,2,3) plus 0 or 1 element from (4,5,6) and 0 or 1 element from (7,8,9). See the output of the other answers as reference. Your solution returns invalid combinations such as (1,2,4) as well.Elexa
Thank you @Elexa for the explanation! I edited the answer. Now it should run according to the rules.Gendarmerie
Nice improvement!! I would change the 3th line to set_combos = chain.from_iterable(combinations(blubb, i) for i in range(len(blubb) + 1)), so that it works for arbitrary input length instead of fixed length of 3. It'll also add the empty list, so you don't have to add it manually (e.g. remove [[]] + ) on the next line. You're right that your solution is much faster for large input sets. Well done :-)Elexa
Thanks for the suggestions. Instead of len(blubb)+1 I used len(blubb[0])+1 if I understand correctly.Gendarmerie
No, it really must be len(blubb) and not len(blubb[0]), because you are creating combinations of sets (sets of numbers, not Python sets), so you have to specify the number of sets, which is len(blubb). NB: Why the name blubb?? A more descriptive name might be helpful ;-)Elexa
Oh, yes. Indeed there was a mistate in with_itertools. It is still faster but not orders of magnitudeGendarmerie

© 2022 - 2024 — McMap. All rights reserved.