Pythonic way to create union of all values contained in multiple lists
G

7

122

I have a list of lists:

lists = [[1,4,3,2,4], [4,5]]

I want to flatten this list and remove all duplicates; or, in other words, apply a set union operation:

desired_result = [1, 2, 3, 4, 5]

What's the easiest way to do this?

Glaser answered 28/1, 2010 at 0:44 Comment(0)
S
210

set.union does what you want:

>>> results_list = [[1,2,3], [1,2,4]]
>>> results_union = set().union(*results_list)
>>> print(results_union)
set([1, 2, 3, 4])

You can also do this with more than two lists.
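As a minimal sketch using the input from the question (the helper name `union_of_lists` is hypothetical, not part of the answer): a set has no guaranteed order, so the result is passed through `sorted` to get a list like the one the question asks for. This assumes the values are mutually comparable and hashable.

```python
def union_of_lists(lists):
    """Return the sorted union of all values in a list of lists."""
    # Starting from an empty set means an empty input simply yields [].
    return sorted(set().union(*lists))


lists = [[1, 4, 3, 2, 4], [4, 5]]
desired_result = union_of_lists(lists)
print(desired_result)  # [1, 2, 3, 4, 5]
```

If the order of the result does not matter, `list(set().union(*lists))` is enough.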

Syncope answered 28/1, 2010 at 0:54 Comment(10)
@sth, thanks for the example, but when I run it I get an error: Traceback (most recent call last): File "so_example.py", line 33, in ? results_union=set().union(*result_lists) TypeError: union() takes exactly one argument (3 given) – Glaser
@AJ: According to the documentation (docs.python.org/library/stdtypes.html#set.union), union() only supports multiple arguments in Python 2.6 or higher. You seem to be using an earlier version, so you probably have to use an explicit loop: total = set(); for x in results_list: total.update(x) (s/;/\n/) – Syncope
You can also save creating an empty set by changing the 2nd line to results_union = set.union(*(set(el) for el in results_list)) – Uncoil
In this case it's less neat. But if the inputs on the first line were sets too... – Uncoil
I'm just adding this as I found it useful to know how to union a bunch of sets at once. – Uncoil
@Jean-FrançoisFabre Wrong. – Lancashire
@Jean-FrançoisFabre TypeError: descriptor 'union' requires a 'set' object but received a 'list' in Python 3.6 at least. – Boult
If you use set.union(*results_list) you're binding the method descriptor manually, i.e. sending in the first element of results_list as "self". This imposes some odd restrictions: 1. it doesn't duck-type properly (the first element must now be a set or an instance of a set subclass), and 2. the union of an empty results_list raises an error (incorrect result: it should return an empty set). – Lancashire
@Jean-FrançoisFabre Which version of Python did you test it on successfully? I tried both 2.7 and 3.6 and got errors on both. – Pipage
Sorry, I actually had a list of sets as input, which explains why it worked for me: the first argument was used as self. Sorry for the confusion. – Prole
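The difference between the two call forms discussed in these comments can be checked with a short sketch. It assumes Python 3, and `raises_type_error` is a hypothetical helper written only for this illustration.

```python
def raises_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


results_list = [[1, 2, 3], [1, 2, 4]]

# Bound call on an empty set: accepts any iterables, including none at all.
print(set().union(*results_list) == {1, 2, 3, 4})  # True
print(set().union(*[]) == set())                   # True

# Unbound call: the first argument is used as `self`, so it must be a set.
print(raises_type_error(set.union, *results_list))  # True (first item is a list)
print(raises_type_error(set.union))                 # True (empty input, no `self`)

# With a set in first position the unbound form works.
print(set.union({1, 2, 3}, [1, 2, 4]) == {1, 2, 3, 4})  # True
```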
M
18

Since you seem to be using Python 2.5 (it would be nice to mention in your Q if you need an A for versions != 2.6, the current production one, by the way;-) and want a list rather than a set as the result, I recommend:

import itertools

...

return list(set(itertools.chain(*result_list)))

itertools is generally a great way to work with iterators (and so with many kinds of sequences or collections) and I heartily recommend you become familiar with it. itertools.chain, in particular, is documented here.
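A related sketch, assuming Python 2.6 or later (so not the 2.5 setup this answer targets): `itertools.chain.from_iterable` takes the outer list directly instead of unpacking it with `*`. The function name `unique_values` is hypothetical.

```python
import itertools


def unique_values(list_of_lists):
    """Flatten a list of lists and drop duplicates (result order is arbitrary)."""
    # chain.from_iterable walks the inner lists lazily, without unpacking
    # the outer list into separate arguments the way chain(*...) does.
    return list(set(itertools.chain.from_iterable(list_of_lists)))


result_list = [[1, 4, 3, 2, 4], [4, 5]]
print(sorted(unique_values(result_list)))  # [1, 2, 3, 4, 5]
```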

Meyer answered 28/1, 2010 at 3:38 Comment(3)
+1 A perfect example of a good time to dip into the wonderful itertools package. – Attestation
@Alex thanks...edited my question to specify version and remove blame from myself for being so behind in versions :) I'll make it a point to look into itertools, appreciate the suggestion. – Glaser
@AJ, no blame, we all can suffer under such constraints after all (but please do remember to specify in future Qs!-); itertools.chain works fine in Python 2.4 as well, by the way. – Meyer
G
3

You can also use the set operators directly:

In [12]: a = ['Orange and Banana', 'Orange Banana']
In [13]: b = ['Grapes', 'Orange Banana']
In [14]: c = ['Foobanana', 'Orange and Banana']

In [20]: list(set(a) | set(b) | set(c))
Out[20]: ['Orange and Banana', 'Foobanana', 'Orange Banana', 'Grapes']

In [21]: list(set(a) & set(b) | set(c))
Out[21]: ['Orange and Banana', 'Foobanana', 'Orange Banana']    
Guttery answered 1/3, 2016 at 12:13 Comment(0)
J
3

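As a comprehension (Python 3.5+ for the unpacking syntax):

[*{j for i in lists for j in i}]

or, with functools.reduce (this raises TypeError if lists is empty):

import functools

[*functools.reduce(lambda x, y: {*x, *y}, lists)]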
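A runnable sketch of both forms, using the input from the question. Passing `set()` as the initializer to `reduce` is an addition here, not part of the answer, so that an empty `lists` gives an empty result instead of a `TypeError`.

```python
import functools

lists = [[1, 4, 3, 2, 4], [4, 5]]

# Set comprehension: flattens and deduplicates in one pass.
from_comprehension = [*{j for i in lists for j in i}]

# reduce with an explicit empty-set initializer, so an empty `lists`
# yields [] rather than raising TypeError.
from_reduce = [*functools.reduce(lambda x, y: {*x, *y}, lists, set())]

print(sorted(from_comprehension))  # [1, 2, 3, 4, 5]
print(sorted(from_reduce))         # [1, 2, 3, 4, 5]
```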
Jinni answered 21/6, 2020 at 7:2 Comment(0)
D
2

Unions are not supported by lists, which are ordered, but are supported by sets. Check out set.union.

Deaconry answered 28/1, 2010 at 0:53 Comment(0)
R
0

I used the following to do intersections, which avoids the need for sets.

a, b = [[1, 2, 3], [1, 2]]
s = filter(lambda x: x in b, a)

or,

s = [x for x in b if x in a]
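For comparison, a small sketch of the same intersection with and without a set. On a list, each `x in a` is a linear scan; building a set once makes each membership test a hash lookup on average. The variable names `with_lists`, `a_set`, and `with_sets` are illustrative only.

```python
a, b = [[1, 2, 3], [1, 2]]

# List-only version: every `x in a` scans the list from the start.
with_lists = [x for x in b if x in a]

# Set-based version: build the set once, then test membership by hash.
a_set = set(a)
with_sets = [x for x in b if x in a_set]

print(with_lists)  # [1, 2]
print(with_sets)   # [1, 2]
```

Both versions keep the order of `b`; `set(a) & set(b)` gives the same elements as a set, without any ordering.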
Rubellite answered 28/1, 2010 at 0:58 Comment(3)
Why would you even want to "avoid the need for sets"? They're faster, and clearer, for this purpose. And your "x in a" does a linear, brute-force search through the list each time you execute it. Yuck. – Colander
sets require type casting, and linear speed isn't bad unless you are dealing with a large N. – Rubellite
"Type casting"? In Python? Since when? Sets are basically dicts with only the keys, and they use hash and equality comparisons. Using "x in a" on a list does an equality comparison too. What's all this about type casting? – Colander
C
-2
desired_result = [x for y in lists for x in y]
Coin answered 3/8, 2019 at 9:41 Comment(0)
