In practice, what are the main uses for the "yield from" syntax in Python 3.3?

I'm having a hard time wrapping my brain around PEP 380.

  1. What are the situations where yield from is useful?
  2. What is the classic use case?
  3. Why is it compared to micro-threads?

So far I have used generators, but never really used coroutines (introduced by PEP-342). Despite some similarities, generators and coroutines are basically two different concepts. Understanding coroutines (not only generators) is the key to understanding the new syntax.

IMHO coroutines are the most obscure Python feature; most books make them look useless and uninteresting.


Thanks for the great answers, but special thanks to agf and his comment linking to David Beazley's presentations.

Nikitanikki answered 14/3, 2012 at 19:33 Comment(2)
dabeaz.com/coroutines (Gibe)
Video of David Beazley's dabeaz.com/coroutines presentation: youtube.com/watch?v=Z_OAlIhXziw (Threaten)
1061

Let's get one thing out of the way first. The explanation that yield from g is equivalent to for v in g: yield v does not even begin to do justice to what yield from is all about. Because, let's face it, if all yield from did was expand the for loop, then it would not warrant adding yield from to the language and precluding a whole bunch of new features from being implemented in Python 2.x.

What yield from does is establish a transparent, bidirectional connection between the caller and the sub-generator:

  • The connection is "transparent" in the sense that it will propagate everything correctly, not just the elements being generated (e.g. exceptions are propagated).

  • The connection is "bidirectional" in the sense that data can be both sent from and to a generator.

(If we were talking about TCP, yield from g might mean "now temporarily disconnect my client's socket and reconnect it to this other server socket".)

BTW, if you are not sure what sending data to a generator even means, you need to drop everything and read about coroutines first—they're very useful (contrast them with subroutines), but unfortunately lesser-known in Python. Dave Beazley's Curious Course on Coroutines is an excellent start. Read slides 24-33 for a quick primer.
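
If you want the one-minute version of what sending means, here is a minimal sketch (my example, not from the slides):

def echo():
    """A minimal coroutine: it consumes values sent to it instead of producing them."""
    while True:
        received = (yield)
        print('got:', received)

e = echo()
next(e)           # advance to the first yield; a coroutine must be "primed" first
e.send('hello')   # prints: got: hello
e.send(42)        # prints: got: 42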

Reading data from a generator using yield from

def reader():
    """A generator that fakes a read from a file, socket, etc."""
    for i in range(4):
        yield '<< %s' % i

def reader_wrapper(g):
    # Manually iterate over data produced by reader
    for v in g:
        yield v

wrap = reader_wrapper(reader())
for i in wrap:
    print(i)

# Result
<< 0
<< 1
<< 2
<< 3

Instead of manually iterating over reader(), we can just yield from it.

def reader_wrapper(g):
    yield from g

That works, and we eliminated one line of code. And probably the intent is a little bit clearer (or not). But nothing life-changing.

Sending data to a generator (coroutine) using yield from - Part 1

Now let's do something more interesting. Let's create a coroutine called writer that accepts data sent to it and writes to a socket, fd, etc.

def writer():
    """A coroutine that writes data *sent* to it to fd, socket, etc."""
    while True:
        w = (yield)
        print('>> ', w)

Now the question is, how should the wrapper function handle sending data to the writer, so that any data that is sent to the wrapper is transparently sent to the writer()?

def writer_wrapper(coro):
    # TBD
    pass

w = writer()
wrap = writer_wrapper(w)
wrap.send(None)  # "prime" the coroutine
for i in range(4):
    wrap.send(i)

# Expected result
>>  0
>>  1
>>  2
>>  3

The wrapper needs to accept the data that is sent to it (obviously) and should also handle the StopIteration when the underlying coroutine is exhausted. Evidently just doing for x in coro: yield x won't do, because plain iteration only pulls values out of coro; it gives us no way to forward values sent to the wrapper. Here is a version that works.

def writer_wrapper(coro):
    coro.send(None)  # prime the coro
    while True:
        try:
            x = (yield)  # Capture the value that's sent
            coro.send(x)  # and pass it to the writer
        except StopIteration:
            pass

Or, we could do this.

def writer_wrapper(coro):
    yield from coro

That saves six lines of code, makes it much, much more readable, and it just works. Magic!

Sending data to a generator (coroutine) using yield from - Part 2 - Exception handling

Let's make it more complicated. What if our writer needs to handle exceptions? Let's say the writer handles a SpamException and it prints *** if it encounters one.

class SpamException(Exception):
    pass

def writer():
    while True:
        try:
            w = (yield)
        except SpamException:
            print('***')
        else:
            print('>> ', w)

What if we don't change writer_wrapper? Does it work? Let's try

# writer_wrapper same as above

w = writer()
wrap = writer_wrapper(w)
wrap.send(None)  # "prime" the coroutine
for i in [0, 1, 2, 'spam', 4]:
    if i == 'spam':
        wrap.throw(SpamException)
    else:
        wrap.send(i)

# Expected Result
>>  0
>>  1
>>  2
***
>>  4

# Actual Result
>>  0
>>  1
>>  2
Traceback (most recent call last):
  ... redacted ...
  File ... in writer_wrapper
    x = (yield)
__main__.SpamException

Um, it's not working because x = (yield) just raises the exception and everything comes to a crashing halt. Let's make it work by manually handling exceptions and sending or throwing them into the sub-generator (writer).

def writer_wrapper(coro):
    """Works. Manually catches exceptions and throws them"""
    coro.send(None)  # prime the coro
    while True:
        try:
            try:
                x = (yield)
            except Exception as e:   # This catches the SpamException
                coro.throw(e)
            else:
                coro.send(x)
        except StopIteration:
            pass

This works.

# Result
>>  0
>>  1
>>  2
***
>>  4

But so does this!

def writer_wrapper(coro):
    yield from coro

The yield from transparently handles sending the values or throwing values into the sub-generator.

This still does not cover all the corner cases, though. What happens if the outer generator is closed? What about the case where the sub-generator returns a value (yes, in Python 3.3+, generators can return values)? How should that return value be propagated? yield from transparently handles all of these corner cases, which is really impressive.
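
To make the return-value case concrete, here is a small sketch (my example): the sub-generator's return value becomes the value of the yield from expression.

def subgen():
    yield 1
    yield 2
    return 'done'   # in Python 3.3+, this becomes StopIteration('done')

def delegator():
    result = yield from subgen()   # yield from evaluates to subgen's return value
    print('subgen returned:', result)

for v in delegator():
    print(v)
# 1
# 2
# subgen returned: done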

I personally feel yield from is a poor keyword choice because it does not make the two-way nature apparent. There were other keywords proposed (like delegate), but they were rejected because adding a new keyword to the language is much more difficult than combining existing ones.

In summary, it's best to think of yield from as a transparent two-way channel between the caller and the sub-generator.

References:

  1. PEP 380 - Syntax for delegating to a sub-generator (Ewing) [v3.3, 2009-02-13]
  2. PEP 342 - Coroutines via Enhanced Generators (GvR, Eby) [v2.5, 2005-05-10]
Venal answered 29/9, 2014 at 21:22 Comment(11)
@PraveenGollakota, in the second part of your question, Sending data to a generator (coroutine) using yield from - Part 1, what if you have more than one coroutine to forward the received item to? Like a broadcaster or subscriber scenario where you provide multiple coroutines to the wrapper in your example and items should be sent to all or a subset of them?Heathen
doing except StopIteration: pass INSIDE the while True: loop is not an accurate representation of yield from coro - which is not an infinite loop and after coro is exhausted (i.e. raises StopIteration), writer_wrapper will execute the next statement. After the last statement it will itself auto-raise StopIteration as any exhausted generator...Blossom
...so if writer contained for _ in range(4) instead of while True, then after printing >> 3 it would ALSO auto-raise StopIteration and this would be auto-handled by yield from, and then writer_wrapper would auto-raise its own StopIteration. And because wrap.send(i) is not inside a try block, it would actually be raised at this point (i.e. the traceback will only report the line with wrap.send(i), not anything from inside the generator)Blossom
It astounds me that they didn't go with yield as instead of yield from. The semantics become far clearer: for the duration of this statement, basically behave as the coroutine being called, as if the user were calling it directly. (And it took me this answer to realize that, precisely because the meaning suggested by yield from is so unintuitively connected to what this answer explains so clearly.)Downe
What is the primary purpose of the generator wrapper?Actinal
And is it wrong to use the yield from in a simple generator: def gen(): yield from [1,2,3...] ?Actinal
I don't understand the final writer_wrapper example. As I understand it, you simply pass through. You can't intercept the data being sent and manipulate it. What's the point of this wrapped example? I don't see a way to catch the sent data, modify it, and send it to the next generator in the chain using "yield from".Corene
(If we were talking about TCP, yield from g might mean "now temporarily disconnect my client's socket and reconnect it to this other server socket".) Am I correct to understand that in the case of a recursive generator function, yield_from recursive_call() is effectively optimized tail-recursion?Andromada
@JonnyWaffles I don't think you can do that. I think the usefulness of yield from is that hooking up all sends, errors and iterations to an inner generator is something that you would often want to do, so it deserves a shorthand.Prevailing
@JamesLin: A wrapper that only does yield from g is not useful. However, the wrapper could do other things before or after the yield from, like opening a file, opening and closing a socket, or any other stuff. Or it could do multiple yield from operations sequentially. Imagine a generator to concatenate generators: def gcat(generators): for g in generators: yield from g.Berfield
@Berfield Right ok, perhaps if you edit your answer to provide some pseudo code in the wrapper so other people can understand on the spot?Actinal
138

What are the situations where "yield from" is useful?

Every situation where you have a loop like this:

for x in subgenerator:
  yield x

As the PEP describes, this is a rather naive attempt at using the subgenerator; it's missing several aspects, especially the proper handling of the .throw()/.send()/.close() mechanisms introduced by PEP 342. To do this properly, rather complicated code is necessary.

What is the classic use case?

Consider that you want to extract information from a recursive data structure. Let's say we want to get all leaf nodes in a tree:

def traverse_tree(node):
  if not node.children:
    yield node
  for child in node.children:
    yield from traverse_tree(child)
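
The answer assumes some node type with a children attribute; to try it out, a hypothetical minimal Node (my addition, not part of the answer) works:

from dataclasses import dataclass, field

@dataclass
class Node:
    # Hypothetical node type, for illustration only
    name: str
    children: list = field(default_factory=list)

tree = Node('root', [Node('a', [Node('a1'), Node('a2')]), Node('b')])
print([leaf.name for leaf in traverse_tree(tree)])   # ['a1', 'a2', 'b']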

Even more important is the fact that until yield from was introduced, there was no simple way to refactor generator code. Suppose you have a (senseless) generator like this:

def get_list_values(lst):
  for item in lst:
    yield int(item)
  for item in lst:
    yield str(item)
  for item in lst:
    yield float(item)

Now you decide to factor out these loops into separate generators. Without yield from, this is ugly, up to the point where you will think twice whether you actually want to do it. With yield from, it's actually nice to look at:

def get_list_values(lst):
  for sub in [get_list_values_as_int, 
              get_list_values_as_str, 
              get_list_values_as_float]:
    yield from sub(lst)
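
(The three sub-generators aren't shown in the answer; presumably they look something like this:)

def get_list_values_as_int(lst):
    for item in lst:
        yield int(item)

def get_list_values_as_str(lst):
    for item in lst:
        yield str(item)

def get_list_values_as_float(lst):
    for item in lst:
        yield float(item)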

Why is it compared to micro-threads?

I think what this section in the PEP is talking about is that every generator does have its own isolated execution context. Together with the fact that execution is switched between the generator-iterator and the caller using yield and __next__(), respectively, this is similar to threads, where the operating system switches the executing thread from time to time, along with the execution context (stack, registers, ...).

The effect of this is also comparable: Both the generator-iterator and the caller progress in their execution state at the same time, their executions are interleaved. For example, if the generator does some kind of computation and the caller prints out the results, you'll see the results as soon as they're available. This is a form of concurrency.

That analogy isn't anything specific to yield from, though - it's rather a general property of generators in Python.
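
To see the interleaving for yourself, here is a toy sketch (mine, not from the PEP) that schedules two generators round-robin, which is exactly the cooperative flavor of "micro-threading":

def worker(name, steps):
    for i in range(steps):
        print(name, 'step', i)
        yield   # a voluntary switch point; OS threads are preempted at random points instead

# A toy round-robin "scheduler": keep cycling through tasks until each is exhausted.
tasks = [worker('A', 2), worker('B', 2)]
while tasks:
    task = tasks.pop(0)
    try:
        next(task)
        tasks.append(task)
    except StopIteration:
        pass

# Output: A step 0, B step 0, A step 1, B step 1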

Lindemann answered 14/3, 2012 at 19:48 Comment(5)
Refactoring generators is painful today.Consolidate
I tend to use itertools a lot for refactoring generators (stuff like itertools.chain); it's not that big a deal. I like yield from, but I still fail to see how revolutionary it is. It probably is, since Guido is all crazy about it, but I must be missing the big picture. I guess it's great for send(), since that is hard to refactor, but I don't use it that often.Lurleen
I suppose those get_list_values_as_xxx are simple generators with a single line for x in input_param: yield int(x) and the other two respectively with str and floatConspire
@NiklasB. re "extract information from a recursive data structure." I'm just getting into Py for data. Could you take a stab at this Q?Hybridism
Why doesn't map work? e.g. yield from map(traverse_tree, node.children)Skyjack
43

A short example will help you understand one of yield from's use cases: getting values from another generator.

def flatten(sequence):
    """flatten a multi level list or something
    >>> list(flatten([1, [2], 3]))
    [1, 2, 3]
    >>> list(flatten([1, [2], [3, [4]]]))
    [1, 2, 3, 4]
    """
    for element in sequence:
        if hasattr(element, '__iter__'):
            yield from flatten(element)
        else:
            yield element

print(list(flatten([1, [2], [3, [4]]])))
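
One caveat the answer doesn't mention: strings also have __iter__, so a string element would recurse forever. A guarded variant (my sketch):

def flatten_safe(sequence):
    """Like flatten(), but treats strings as atoms to avoid infinite recursion."""
    for element in sequence:
        if hasattr(element, '__iter__') and not isinstance(element, str):
            yield from flatten_safe(element)
        else:
            yield element

print(list(flatten_safe(['ab', [1, [2]]])))   # ['ab', 1, 2]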
Utopianism answered 27/12, 2016 at 15:47 Comment(1)
Just wanted to suggest that the print at the end would look a bit nicer without the conversion to a list - print(*flatten([1, [2], [3, [4]]]))Surber
40

Wherever you invoke a generator from within a generator you need a "pump" to re-yield the values: for v in inner_generator: yield v. As the PEP points out there are subtle complexities to this which most people ignore. Non-local flow-control like throw() is one example given in the PEP. The new syntax yield from inner_generator is used wherever you would have written the explicit for loop before. It's not merely syntactic sugar, though: It handles all of the corner cases that are ignored by the for loop. Being "sugary" encourages people to use it and thus get the right behaviors.

This message in the discussion thread talks about these complexities:

With the additional generator features introduced by PEP 342, that is no longer the case: as described in Greg's PEP, simple iteration doesn't support send() and throw() correctly. The gymnastics needed to support send() and throw() actually aren't that complex when you break them down, but they aren't trivial either.

I can't speak to a comparison with micro-threads, other than to observe that generators are a type of parallelism. You can consider the suspended generator to be a thread which sends values via yield to a consumer thread. The actual implementation may be nothing like this (and the actual implementation is obviously of great interest to the Python developers) but this does not concern the users.

The new yield from syntax does not add any additional capability to the language in terms of threading; it just makes it easier to use existing features correctly. Or, more precisely, it makes it easier for a novice consumer of a complex inner generator written by an expert to pass through that generator without breaking any of its complex features.
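
As a comment below notes, a wrapper that only does yield from adds nothing by itself; the value is that it can do work around the delegation while send()/throw()/close() still reach the inner generator. A hedged sketch (names mine):

def with_logging(gen):
    # Setup before delegating; send()/throw()/close() pass through transparently.
    print('starting')
    try:
        result = yield from gen
    finally:
        print('finished')
    return result   # propagate the inner generator's return value, if any

# list(with_logging(iter([1, 2]))) prints 'starting', yields 1 and 2, then prints 'finished'.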

Lys answered 14/3, 2012 at 19:58 Comment(0)
13

yield yields a single value.

yield from yields every value from another iterable, flattening it into the generator's output.

Check this example:

def yieldOnly():
    yield "A"
    yield "B"
    yield "C"

def yieldFrom():
    for i in [1, 2, 3]:
        yield from yieldOnly()

test = yieldFrom()
for i in test:
    print(i)

In the console you will see:

A
B
C
A
B
C
A
B
C
Ankerite answered 6/3, 2021 at 3:26 Comment(0)
12

yield from basically chains iterators in an efficient way:

# chain from itertools:
def chain(*iters):
    for it in iters:
        for item in it:
            yield item

# with the new keyword
def chain(*iters):
    for it in iters:
        yield from it

As you can see, it removes one pure Python loop. That's pretty much all it does, but chaining iterators is a pretty common pattern in Python.
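
For example (in practice you would just use itertools.chain, which already exists, as a comment below points out):

print(list(chain([1, 2], 'ab', range(3))))   # [1, 2, 'a', 'b', 0, 1, 2]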

Threads are basically a feature that allows you to jump out of a function at a completely random point and jump back into the state of another function. The thread supervisor does this very often, so the program appears to run all these functions at the same time. The problem is that the points are random, so you need to use locking to prevent the supervisor from stopping a function at a problematic point.

Generators are pretty similar to threads in this sense: They allow you to specify specific points (whenever they yield) where you can jump in and out. When used this way, generators are called coroutines.

Read this excellent tutorial about coroutines in Python for more details.

Catholic answered 14/3, 2012 at 20:2 Comment(3)
This answer is misleading because it elides the salient feature of "yield from", as mentioned above: send() and throw() support.Conger
Are you disputing Ben Jackson's answer above? My reading of your answer is that it is essentially syntactic sugar which follows the code transformation you provided. Ben Jackson's answer specifically refutes that claim.Conger
@JochenRitzel You never need to write your own chain function because itertools.chain already exists. Use yield from itertools.chain(*iters).Paco
11

In asynchronous IO usage, yield from behaves similarly to await in a coroutine function: both are used to suspend the execution of a coroutine.

For asyncio, if there's no need to support older Python versions (i.e. you target 3.5+), async def/await is the recommended syntax for defining a coroutine, so yield from is no longer needed there.

But outside of asyncio, yield from <sub-generator> still has its uses in iterating over a sub-generator, as mentioned in the earlier answers.
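
For the record, a sketch of the two spellings side by side (note that @asyncio.coroutine was deprecated in Python 3.8 and removed in 3.11, so the first form only runs on older interpreters):

import asyncio

@asyncio.coroutine            # pre-3.5 style coroutine
def old_style():
    yield from asyncio.sleep(1)

async def new_style():        # the modern equivalent
    await asyncio.sleep(1)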

Pinafore answered 26/8, 2018 at 21:5 Comment(0)
4

This code defines a function fixed_sum_digits returning a generator that enumerates all numbers with a given count of digits whose digits sum to a given total; fixed_sum_digits(6, 20), for example, enumerates all six-digit numbers whose digits sum to 20.

def iter_fun(sum, deepness, myString, Total):
    if deepness == 0:
        if sum == Total:
            yield myString
    else:
        for i in range(min(10, Total - sum + 1)):
            yield from iter_fun(sum + i, deepness - 1, myString + str(i), Total)

def fixed_sum_digits(digits, Tot):
    return iter_fun(0, digits, "", Tot)

Try to write it without yield from. If you find an effective way to do it, let me know.

I think that for cases like this one (visiting trees), yield from makes the code simpler and cleaner.
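
For instance (my example; note the generator yields digit strings, so leading zeros appear):

print(list(fixed_sum_digits(3, 5))[:5])   # ['005', '014', '023', '032', '041']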

Abhor answered 4/11, 2019 at 15:53 Comment(0)
3

Simply put, yield from provides something like tail recursion for iterator functions: the outer generator delegates directly to the inner one instead of re-yielding each value in a loop.

Juni answered 25/3, 2020 at 17:30 Comment(1)
That's neat! Can you provide an example showing how yield from facilitates tail recursion? I understand tail recursion and yield, but I don't see how to make it work in python.Essequibo
2

yield from yields from a generator until that generator is exhausted, and then continues executing the following lines of code.

e.g.

def gen(sequence):
    for i in sequence:
        yield i


def merge_batch(sub_seq):
    yield {"data": sub_seq}

def modified_gen(g, batch_size):
    stream = []
    for i in g:
        stream.append(i)
        if len(stream) == batch_size:
            yield from merge_batch(stream)
            print("batch ends")
            stream = []

running this gives you:

In [17]: g = gen([1,2,3,4,5,6,7,8,9,10])
In [18]: mg = modified_gen(g, 2)
In [19]: next(mg)
Out[19]: {'data': [1, 2]}

In [20]: next(mg)
batch ends
Out[20]: {'data': [3, 4]}

In [21]: next(mg)
batch ends
Out[21]: {'data': [5, 6]}

In [22]: next(mg)
batch ends
Out[22]: {'data': [7, 8]}

In [23]: next(mg)
batch ends
Out[23]: {'data': [9, 10]}

In [24]: next(mg)
batch ends
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
Input In [24], in <cell line: 1>()
----> 1 next(mg)

StopIteration: 

So, yield from can take the output of another generator, do some modification, and then feed its own output to others as a generator itself.

That, in my humble opinion, is one of the main use cases of yield from.

Palila answered 11/8, 2022 at 10:11 Comment(0)
0

I think the first lines of PEP 380 / the corresponding release notes explain it quite well:

PEP 380 adds the yield from expression, allowing a generator to delegate part of its operations to another generator. This allows a section of code containing yield to be factored out and placed in another generator.

Without yield from, it is quite hard to factor out parts of your coroutines.
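
A minimal sketch of that factoring (my example, using nothing beyond the PEP's description):

def header():
    yield '<html>'
    yield '<body>'

def page():
    yield from header()   # the factored-out section
    yield 'Hello'
    yield '</body></html>'

print(list(page()))   # ['<html>', '<body>', 'Hello', '</body></html>']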

Anthesis answered 29/6, 2023 at 8:2 Comment(0)
0

Simple example:

def some():
    yield [1, 2, 3]

for i in some():
    print(i)
# [1, 2, 3]


def some():
    yield from [1, 2, 3]

for i in some():
    print(i)
# 1
# 2
# 3
Speakeasy answered 18/4 at 23:7 Comment(0)
