Python set comprehension for 2.6

Asked 7/7, 2015 at 14:13 Answered 7/7, 2015 at 14:28

I was trying to emulate a set comprehension in Python 2.6 and came across the following two approaches. I expected the first method to be faster than the second, but timeit suggested otherwise. Why is the second method faster, even though it instantiates an extra list before instantiating the set?

Method 1:

In [16]: %timeit set(node[0] for node in pwnodes if node[1].get('pm'))
1000000 loops, best of 3: 568 ns per loop

Method 2:

In [17]: %timeit set([node[0] for node in pwnodes if node[1].get('pm')]) 
1000000 loops, best of 3: 469 ns per loop

where pwnodes = [('e1', dict(pm=1, wired=1)), ('e2', dict(pm=1, wired=1))].

Gust answered 7/7, 2015 at 14:13 Comment(5)

I guess it's because the first option is a generator comprehension? – Leotie 7/7, 2015 at 14:23

related: #48289 – Leotie 7/7, 2015 at 14:28

I don't have 2.6 on my machine but Python 3.4 dis gives the same number of instructions for each method. – Adulteress 7/7, 2015 at 14:29

@IanAuld: It's not faster on 3.x (see Martijn's answer for why). – Digamy 7/7, 2015 at 14:31

There are no set comprehensions in python 2.6, you're using a generator expression, which is slightly slower. – Salamanca 7/7, 2015 at 14:36

Iteration is simply faster when using a list comprehension:

In [23]: from collections import deque

In [24]: %timeit deque((node[0] for node in pwnodes if node[1].get('pm')), maxlen=0)
1000 loops, best of 3: 305 µs per loop

In [25]: %timeit deque([node[0] for node in pwnodes if node[1].get('pm')], maxlen=0)
1000 loops, best of 3: 246 µs per loop

The deque is used to illustrate iteration speed; a deque with maxlen set to 0 discards all elements taken from the iterable so there are no memory allocation differences to skew the results.
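For anyone who wants to reproduce this outside IPython, here is a minimal sketch using the standard timeit module. The function names and the loop count are my own choices for illustration, and the absolute numbers will vary by machine and Python version.

```python
import timeit
from collections import deque

pwnodes = [('e1', dict(pm=1, wired=1)), ('e2', dict(pm=1, wired=1))]

def consume_genexp():
    # maxlen=0 makes the deque discard every item it receives,
    # so the cost measured is iteration, not storage.
    return deque((node[0] for node in pwnodes if node[1].get('pm')), maxlen=0)

def consume_listcomp():
    # Same consumer, but the items come from a fully built list.
    return deque([node[0] for node in pwnodes if node[1].get('pm')], maxlen=0)

# Hypothetical loop count; raise it for more stable numbers.
genexp_time = timeit.timeit(consume_genexp, number=100000)
listcomp_time = timeit.timeit(consume_listcomp, number=100000)

print("generator expression: %.4f s" % genexp_time)
print("list comprehension:   %.4f s" % listcomp_time)
```

Both functions return an empty deque, so any difference in the timings should come from how the items are produced.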

That's because in Python 2, list comprehensions don't use a separate namespace, while a generator expression does (it has to, by necessity). That extra namespace requires a new frame on the stack, and this is expensive. The major advantage of generator expressions is their low memory footprint, not their speed.

In Python 3, list comprehensions have a separate namespace as well, so list comprehension and generator expression iteration speeds are comparable. Python 3 also has set comprehensions, which are faster still; they are available in Python 2.7 too, but not in 2.6.
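The namespace difference can be observed directly by checking whether the loop variable leaks into the enclosing function. This is only a sketch of the scoping behaviour, with function names I made up for illustration; it does not measure the cost of the extra frame.

```python
import sys

def listcomp_leaks():
    # On Python 2 the loop variable of a list comprehension is bound
    # in the enclosing function's namespace; on Python 3 it is not.
    [item for item in range(3)]
    return 'item' in locals()

def genexp_leaks():
    # A generator expression keeps its loop variable in its own
    # namespace on both Python 2 and Python 3.
    list(item for item in range(3))
    return 'item' in locals()

print("Python %d.%d" % sys.version_info[:2])
print("list comprehension leaks its variable: %s" % listcomp_leaks())
print("generator expression leaks its variable: %s" % genexp_leaks())
```

On Python 2 the first check should report True and the second False; on Python 3 both should report False.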

Tiptoe answered 7/7, 2015 at 14:28 Comment(3)

Thanks Martijn, that explains it. How do you use a set comprehension on 2.6? – Gust 7/7, 2015 at 14:36

@Anandan: you don't; you use a list comprehension or generator expression and the set() callable just like you did. If memory is not an issue but speed is, use the list comprehension. – Tiptoe 7/7, 2015 at 14:38

A set comprehension is a new syntax added to Python 2.7 at the level of the grammar. It simply is not available in Python 2.6, and cannot be added. – Flabby 7/7, 2015 at 14:54
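To make the 2.6 workaround concrete, here is a small sketch using the pwnodes data from the question. The variable names are mine; the point is only that both spellings build the same set, and that the brace syntax has to stay in a comment if the file must still parse on 2.6.

```python
pwnodes = [('e1', dict(pm=1, wired=1)), ('e2', dict(pm=1, wired=1))]

# Works on 2.6: build a list first, then hand it to set().
names_from_list = set([node[0] for node in pwnodes if node[1].get('pm')])

# Also works on 2.6: pass a generator expression to set().
names_from_genexp = set(node[0] for node in pwnodes if node[1].get('pm'))

# Python 2.7 and 3.x only (a SyntaxError on 2.6):
#     {node[0] for node in pwnodes if node[1].get('pm')}

print(names_from_list == names_from_genexp)
```

If memory is not a concern, the list comprehension form is the one the timings above favour on Python 2.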

My guess is that it's because the first one involves a generator and the second one doesn't. Iterating over a generator is generally slower than iterating over the equivalent list, provided that list fits in memory.

In [4]: timeit for i in [i for i in range(1000)]: pass
10000 loops, best of 3: 47.2 µs per loop

In [5]: timeit for i in (i for i in range(1000)): pass
10000 loops, best of 3: 57.8 µs per loop
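The same comparison can be run as a plain script with timeit.repeat, taking the best of three runs much as IPython's timeit does. The repeat and number values here are arbitrary choices on my part, so treat the output as indicative only.

```python
import timeit

LIST_STMT = "for i in [i for i in range(1000)]: pass"
GEN_STMT = "for i in (i for i in range(1000)): pass"

# Take the minimum of three runs to reduce noise from other processes.
list_time = min(timeit.repeat(LIST_STMT, repeat=3, number=1000))
gen_time = min(timeit.repeat(GEN_STMT, repeat=3, number=1000))

# Convert total seconds for 1000 loops into microseconds per loop.
print("list comprehension:   %.1f us per loop" % (list_time / 1000 * 1e6))
print("generator expression: %.1f us per loop" % (gen_time / 1000 * 1e6))
```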

Leotie answered 7/7, 2015 at 14:27 Comment(0)
