Why is this generator expression function slower than the loop version?
Asked Answered
N

1

11

I have been operating under the theory that generator expressions tend to be more efficient than normal loops. But then I ran into the following example: write a function which given a number, N, and some factors, ps, returns the sum of all the numbers under N that are a multiple of at least one factor.

Here is a loop version and a shorter generator expression version:

def loops(N, ps):
    total_sum = 0 
    for i in xrange(N):
        for p in ps: 
            if i%p == 0:
                total_sum += i
                break
    return total_sum

def genexp(N, ps):
    return sum(i for i in xrange(N)
               if any(i%p == 0 for p in ps))

I'd expect the two to perform roughly equal, with maybe the comprehension version a little faster, but what I didn't expect was this:

for func in ('loops', 'genexp'):
    print func, timeit.timeit('%s(100000, [3,5,7])' % func, 
                              number=100, 
                              setup='from __main__ import %s' % func)


loops 2.82878184319
genexp 10.1663100719

4x slower isn't even close! Why? What am I misunderstanding?

Nix answered 3/9, 2015 at 17:44 Comment(2)
You have generator expressions, not list comprehensions.Trolly
@MartijnPieters Thanks! Clearly I'm not a python guy :)Nix
T
11

First of all: generator expressions are memory efficient, not necessarily speed efficient.

Your compact genexp() version is slower for two reasons:

  • Generator expressions are implemented using a new scope (like a new function). You are producing N new scopes, one for for each any() test. Creating a new scope and tearing it down again is relatively expensive, certainly when done in a loop and then compared with code that doesn't do this.

  • The sum() and any() names are additional globals to be looked up. In the case of any(), that's an additional N global lookups per test. Globals must be looked up in a dictionary, versus locals which are looked up by index in a C-array (which is very fast).

The latter is but a small component, most of the cost lies in creating and destroying frames (scopes); if you create a version where _any and _sum are locals to the function you get but a small improvement in performance:

>>> def genexp_locals(N, ps, _any=any, _sum=sum):
...     return _sum(i for i in xrange(N)
...                if _any(i%p == 0 for p in ps))
... 
>>> for func in ('loops', 'genexp', 'genexp_locals'):
...     print func, timeit.timeit('%s(100000, [3,5,7])' % func, 
...                               number=100, 
...                               setup='from __main__ import %s' % func)
... 
loops 2.00835800171
genexp 6.45241594315
genexp_locals 6.23843789101

I didn't create a local for xrange to keep that aspect the same. Technically speaking, the _any name is looked up as a closure, not a local, by the generator expression code object, which are not as slow as global lookups but not quite as speedy as a local lookup either.

Trolly answered 3/9, 2015 at 17:47 Comment(4)
I think "N new scopes for each any() test" should be "N new scopes, one for each any() test". And I think there's another cost that you don't mention, the cost of the back and forth between the generator objects and their consumers.Laminated
@Manuel: that overhead is somewhat negligible compared to all the other overheads. Thanks for the suggestion about the wording, I've folded that in.Trolly
Do you mean it's negligible in general, or just in this particular case (as these generators here barely produce anything)?Laminated
@Manuel: in this specific case.Trolly

© 2022 - 2024 — McMap. All rights reserved.