Generate all subsets of size k (containing k elements) in Python
Asked Answered
P

4

24

I have a set of values and would like to create list of all subsets containing 2 elements.

For example, a source set ([1,2,3]) has the following 2-element subsets:

set([1,2]), set([1,3]), set([2,3])

Is there a way to do this in python?

Pseudohermaphrodite answered 11/9, 2011 at 12:30 Comment(0)
T
43

Seems like you want itertools.combinations:

>>> list(itertools.combinations((1, 2, 3), 2))
[(1, 2), (1, 3), (2, 3)]

If you want sets you'll have to convert them explicitly. If you don't mind an iterable instead of a list, and you're using Python 3, you can use map:

>>> s = set((1, 2, 3))
>>> map(set, itertools.combinations(s, 2))
<map object at 0x10cdc26d8>

To view all the results at once, you can pass the output of map to list. (In Python 2, the output of map is automatically a list.)

>>> list(map(set, itertools.combinations(s, 2)))
[{1, 2}, {1, 3}, {2, 3}]

However, if you know you'll need a list, a list comprehension is marginally better (h/t Jacob Bowyer):

>>> [set(i) for i in itertools.combinations(s, 2)]
[{1, 2}, {1, 3}, {2, 3}]
Triiodomethane answered 11/9, 2011 at 12:54 Comment(1)
Damn it!, by the way your map can be done with a list comp [set(i) for i in itertools.combinations(s, 2))]Deduction
S
3

This is a subset of the power set of {1, 2, 3} (or whatever set) containing all two-element sets.

See the Python itertools documentation and search on the term "powerset" for a general answer to this problem.

Sachsen answered 11/9, 2011 at 12:59 Comment(0)
J
1

Just to give another perspective, I looked for a way to iterate all subset of size 2 of {1.....N}, so I put itertools.combinations into test:

import itertools
from time import time


N = 7000
lst = [i for i in xrange(N)]

st = time()
c1 = 0
for x in itertools.combinations(lst, 2):
    c1 += 1
print "combinations: %f" % (time()-st)

st = time()
c2=0
for x in xrange(N):
    for y in xrange(x):
        c2 += 1
print "double loop: %f" % (time()-st)
print "c1=%d,c2=%d" % (c1,c2)

# prints:
#combinations: 4.247000
#double loop: 3.479000
# c1=24496500,c2=24496500

So I guess you should not always turn into the general solution.... If you know in advance the size of the subset you want, it should be more efficient to iterate using for loops.

Also note that you should not iterate over list(itertools.combinations(lst, 2)) since this move creates the list (and much slower than using the generator itself).

Julejulee answered 19/3, 2016 at 13:26 Comment(2)
These two tests don't do the same thing. itertools.combinations actually creates a tuple; your nested loop doesn't create a tuple.Triiodomethane
I did a quick test. If you actually need to create tuples inside the nested loop, it's slower by 50%. Furthermore, if you don't need to use a for loop to process the output of itertools.combinations, you can get a substantial speedup with this generator expression: c3 = sum(1 for pair in itertools.combinations(lst, 2)). That runs about 40% faster than the fastest nested loop. There are always many subtleties to consider when optimizing this kind of code!Triiodomethane
A
-1

Simple Python 3 solution (permutations in given size array):

def combinations(arr, n,k): 
    for i in range(n):
        for j in range(i+k-1,n):
            temp = arr[i:i+k-1]
            temp.append(arr[j])
            print(temp)

arr = [1,2,3,4,5,6]
k = 3
# All combinations subset with size k
print(combinations(arr,len(arr),k))

Output:

[1, 2, 3]
[1, 2, 4]
[1, 2, 5]
[1, 2, 6]
[2, 3, 4]
[2, 3, 5]
[2, 3, 6]
[3, 4, 5]
[3, 4, 6]
[4, 5, 6]
Antineutrino answered 3/9, 2022 at 8:6 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.