Passing generator expressions to any() and all()

I was just messing around in the Python interpreter and I came across some unexpected behavior.

>>> bools = (True, True, True, False)
>>> all(bools)
False
>>> any(bools)
True

Ok, so far nothing out of the ordinary...

>>> bools = (b for b in (True, True, True, False))
>>> all(bools)
False
>>> any(bools)
False

Here's where things start getting spooky. I figure this happens because the all function iterates over the generator expression, calling its __next__ method and using up the values until it encounters one that is False. Here's some evidence to back that theory up:

>>> bools = (b for b in (True, False, True, True))
>>> all(bools)
False
>>> any(bools)
True

I think the result is different because the False is not at the end, so there are still some unused values left in the generator. If you type

>>> bools = (b for b in (True, False, True, True))
>>> all(bools)
False
>>> list(bools)
[True, True]

It seems like there are only 2 remaining values.

So, why exactly does this really happen? I'm sure there are many details that I'm missing.

Read answered 15/7, 2019 at 18:26 Comment(4)
What version of Python are you using? - Shorthand
Just to elaborate on the correct answer: generators exhaust. Once all values have been generated, a generator won't produce any more, so using the same generator twice won't work as you might expect at first. - Holocrine
Worth reading: #232267 - Travistravus
This is actually a very good question. I'd never thought about this, and now I've run through some tests to understand it! - Spent

The behaviour of all() and any() is documented in the official documentation.

From the pseudo-code:

def all(iterable):
    for element in iterable:
        if not element:
            return False
    return True

all() consumes elements up to and including the first one that evaluates to False, then terminates. If there is no such element, it consumes the whole iterable.

def any(iterable):
    for element in iterable:
        if element:
            return True
    return False

any() consumes elements up to and including the first one that evaluates to True, then terminates. If there is no such element, it consumes the whole iterable.

Note that generators are not reset to their initial position when passed around. They stay at their current position unless more items are consumed. Hence,

>>> bools = (b for b in (True, False, True, True))

The following will consume the first two items. Since the second item is False, the iteration stops after that. This leaves the generator at a position after the second element.

>>> all(bools)
False

At this point the generator has (True, True) as the remaining values. You point that out correctly in your question. The following only consumes a single element.

>>> any(bools)
True

Note that there is still another True value obtainable from the generator after calling any().

And of course, if you call list() on a generator, all items from the generator are consumed and the generator will not yield any more items (it is "empty").
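
To watch the generator's position move, here is a small sketch. The traced() helper and the consumed list are names made up for this illustration; they are not part of Python or of the documentation quoted above.

```python
def traced(values, log):
    """Yield each value, appending it to `log` at the moment it is consumed."""
    for value in values:
        log.append(value)
        yield value

consumed = []
bools = traced((True, False, True, True), consumed)

all_result = all(bools)      # stops right after pulling the first False
after_all = list(consumed)   # snapshot: [True, False]

any_result = any(bools)      # resumes where all() stopped, pulls one True
after_any = list(consumed)   # snapshot: [True, False, True]

remaining = list(bools)      # drains what is left: [True]
```

Each call picks up exactly where the previous one left off, which is why the order of the values decides what a later any() or list() call sees.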

Gewirtz answered 15/7, 2019 at 18:34 Comment(2)
Thanks for the great answer @dhke. I like how you explain what is going on inside the functions. - Read
@sam-pyt: Note that the pseudo code isn't mine, these are the actual explanations given in the Python documentation. - Gewirtz

The problem is that you are reusing a generator after an earlier call has already consumed some or all of its values.

You can verify this by running the following code:

>>> bools = (b for b in (True, False, True, True))
>>> all(bools) # all() stops consuming as soon as it finds the False
False
>>> next(bools) # next value after False which is True
True
>>> next(bools) # next value after True which is True
True
>>> next(bools)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

This will work:

>>> bools = (b for b in (True, False, True, True))
>>> all(bools)
False
>>> bools = (b for b in (True, False, True, True))
>>> any(bools)
True
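
If retyping the generator expression feels clumsy, there are a few other ways to give each call its own iterator. This is only a sketch of some options; make_bools() is a hypothetical helper named for this example, and which option fits depends on how large or expensive the underlying data is.

```python
from itertools import tee

values = (True, True, True, False)

# Option 1: keep the data in a container; each call iterates from the start.
all_from_tuple = all(values)
any_from_tuple = any(values)

# Option 2: build a fresh generator for every call (hypothetical helper).
def make_bools():
    return (b for b in values)

all_fresh = all(make_bools())
any_fresh = any(make_bools())

# Option 3: split one generator into two independent iterators.
first, second = tee(b for b in values)
all_tee = all(first)
any_tee = any(second)
```

In all three cases all() gives False and any() gives True, matching the tuple example from the question. Note that tee() buffers items that one iterator has seen and the other has not, so it can use a lot of memory on long streams.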
Shorthand answered 15/7, 2019 at 18:31 Comment(4)
Nice catch, that can be really hard to see. - Sarcoma
The first example doesn't produce an error, it prints True. I think if you change it to bools = (b for b in (True, True, True, False)), it will produce the given error. - Read
This answer and the one by @dhke are both valid, each for a different arrangement of the values fed to the generator. - Spent
@samp-pyt, that is correct, I updated my answer just now. - Shorthand

A couple things are at play here.

The first thing is that a generator yields each of its elements exactly once. Unlike lists, tuples, or other containers that store their contents, a generator only knows how to produce its next value. When you call next(generator), you get that value, the generator advances, and it keeps no record of the value it just handed out. In essence, a generator can't be iterated over more than once.
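
A quick way to see the difference between a container and a generator is to ask each for an iterator. The following is a minimal sketch with variable names chosen just for this illustration:

```python
data = (True, False, True, True)
gen = (b for b in data)

# A tuple hands out a brand-new iterator every time it is asked.
tuple_iter_a = iter(data)
tuple_iter_b = iter(data)
fresh_each_time = tuple_iter_a is not tuple_iter_b

# A generator is its own iterator, so every consumer shares one position.
gen_is_own_iterator = iter(gen) is gen

first_pass = list(gen)    # consumes everything
second_pass = list(gen)   # nothing left to consume
```

Here first_pass is [True, False, True, True] and second_pass is [], while iterating over data twice would give the full contents both times.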

The second thing is how all(), any(), and list() work internally, especially vis-a-vis generators. all()'s implementation looks something like this, only more complicated:

def all(iterable):
    for element in iterable:
        if bool(element) is False:
            return False
    return True

That is, the all() function short-circuits when it first finds a non-truthy element (and any() does the same thing in reverse, stopping at the first truthy element). This is to save on processing time: why process the rest of the iterable if one element has already decided the result? In the case of a generator (e.g. your last example), this means all() consumes every element up to and including the first False. The generator still has elements left, but since it has already yielded those first two, it will behave in the future as though they never existed.
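
The short-circuit depends on truthiness, not on the values being literal True or False. A small sketch with made-up sample data shows the same leftover effect with numbers and strings:

```python
numbers = (n for n in (3, 0, 5, 7))
all_numbers = all(numbers)        # 0 is falsy, so all() stops there
leftover_numbers = list(numbers)  # the values all() never looked at

words = (w for w in ("", "", "spam", "eggs"))
any_words = any(words)            # "spam" is the first truthy value
leftover_words = list(words)      # the values any() never looked at
```

After this runs, leftover_numbers is [5, 7] and leftover_words is ["eggs"].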

list() is simpler, and just calls next(generator) until the generator stops producing values. This drains whatever values the generator has not yet yielded.

So the explanation for your last example is that

  1. You create a generator that will spit out the elements True, False, True, True in order
  2. You call all() on that generator, and it consumes the first two elements of the generator before it terminates, having found a falsey value.
  3. You call list() on that generator, and it consumes all remaining elements of the generator (that is, the last two) to create a list. It produces [True, True].
Helfrich answered 15/7, 2019 at 18:37 Comment(1)
Ok, so it is what I suspected. Good to know, thank you for the answer. - Read
