Return or yield from a function that calls a generator?
N

5

32

I have a generator, generator, and a convenience function that wraps it, generate_all.

def generator(some_list):
  for i in some_list:
    yield do_something(i)

def generate_all():
  some_list = get_the_list()
  return generator(some_list) # <-- Is this supposed to be return or yield?

Should generate_all return or yield? I want callers to be able to use both functions the same way, i.e.

for x in generate_all():
    ...

should be equivalent to

some_list = get_the_list()
for x in generator(some_list):
    ...
Nananne answered 5/1, 2020 at 1:16 Comment(2)
There's a reason to use either. For this example, return is more efficientCollotype
This reminds me of a similar question I once posed: “yield from iterable” vs “return iter(iterable)”. While not specifically about generators it is basically the same as generators and iterators are quite similar in python. Also the strategy of comparing the bytecode as proposed by the answer may be of use here.Radiochemical
L
14

Generators are evaluated lazily, so return and yield behave differently when you're debugging your code or when an exception is thrown.

With return, an exception raised inside generator carries no trace of generate_all, because by the time generator actually runs you have already left the generate_all function. With yield from, generate_all shows up in the traceback.

def generator(some_list):
    for i in some_list:
        raise Exception('exception happened :-)')
        yield i

def generate_all():
    some_list = [1,2,3]
    return generator(some_list)

for item in generate_all():
    ...
Exception                                 Traceback (most recent call last)
<ipython-input-3-b19085eab3e1> in <module>
      8     return generator(some_list)
      9 
---> 10 for item in generate_all():
     11     ...

<ipython-input-3-b19085eab3e1> in generator(some_list)
      1 def generator(some_list):
      2     for i in some_list:
----> 3         raise Exception('exception happened :-)')
      4         yield i
      5 

Exception: exception happened :-)

And if it's using yield from:

def generate_all():
    some_list = [1,2,3]
    yield from generator(some_list)

for item in generate_all():
    ...
Exception                                 Traceback (most recent call last)
<ipython-input-4-be322887df35> in <module>
      8     yield from generator(some_list)
      9 
---> 10 for item in generate_all():
     11     ...

<ipython-input-4-be322887df35> in generate_all()
      6 def generate_all():
      7     some_list = [1,2,3]
----> 8     yield from generator(some_list)
      9 
     10 for item in generate_all():

<ipython-input-4-be322887df35> in generator(some_list)
      1 def generator(some_list):
      2     for i in some_list:
----> 3         raise Exception('exception happened :-)')
      4         yield i
      5 

Exception: exception happened :-)

However, this comes at a performance cost. The additional generator layer adds some overhead, so return will generally be a bit faster than yield from ... (or for item in ...: yield item). In most cases this won't matter much, because whatever you do in the generator typically dominates the run time and the additional layer won't be noticeable.
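If you want to check the overhead on your own machine, a rough timing sketch along these lines should do. The names generate_all_return and generate_all_yield_from are made up for this comparison, and the actual numbers depend on your Python version and hardware, so no timings are claimed here.

```python
import timeit

def generator(some_list):
    for i in some_list:
        yield i

def generate_all_return(some_list):
    # Hands the inner generator straight back to the caller.
    return generator(some_list)

def generate_all_yield_from(some_list):
    # Wraps the inner generator in a second generator layer.
    yield from generator(some_list)

data = list(range(1_000))

# Both variants produce the same items; only the layering differs.
same_items = list(generate_all_return(data)) == list(generate_all_yield_from(data))

# Time consuming each variant completely, 200 times over.
t_return = timeit.timeit(lambda: list(generate_all_return(data)), number=200)
t_yield_from = timeit.timeit(lambda: list(generate_all_yield_from(data)), number=200)

print(f"return:     {t_return:.4f} s")
print(f"yield from: {t_yield_from:.4f} s")
```

The generator body here does almost nothing, which makes the wrapper's share of the time as large as it can get. With real work per item the gap should shrink accordingly.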

yield also has an additional advantage: you aren't restricted to a single iterable, and you can easily yield additional items:

def generator(some_list):
    for i in some_list:
        yield i

def generate_all():
    some_list = [1,2,3]
    yield 'start'
    yield from generator(some_list)
    yield 'end'

for item in generate_all():
    print(item)
start
1
2
3
end

In your case the operations are quite simple, and I'm not sure multiple functions are even necessary. You could just use the built-in map or a generator expression instead:

map(do_something, get_the_list())          # map
(do_something(i) for i in get_the_list())  # generator expression

Both should behave identically in use (apart from some differences when exceptions happen). And if they need a more descriptive name, you could still wrap them in one function.

Several helpers for very common operations on iterables are built in, and more can be found in the standard-library itertools module. In simple cases like this I would just use those, and only write my own generators for non-trivial cases.
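As a small sketch of what that can look like: itertools.chain can reproduce the 'start' / items / 'end' pattern from above without a hand-written generator. The bodies of do_something and get_the_list below are hypothetical stand-ins, since the question doesn't show them.

```python
from itertools import chain, islice

def do_something(i):
    # Hypothetical stand-in for the question's do_something.
    return i * 2

def get_the_list():
    # Hypothetical stand-in for the question's get_the_list.
    return [1, 2, 3]

# chain() lazily concatenates iterables, here a marker, the mapped items,
# and a closing marker.
wrapped = chain(['start'], map(do_something, get_the_list()), ['end'])
result = list(wrapped)
print(result)  # ['start', 2, 4, 6, 'end']

# islice() takes a lazy slice, here only the first two mapped items.
first_two = list(islice(map(do_something, get_the_list()), 2))
print(first_two)  # [2, 4]
```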

I assume your real code is more complicated, so this may not be applicable, but the answer wouldn't be complete without mentioning the alternatives.

Larry answered 5/1, 2020 at 14:5 Comment(0)
M
19

You're probably looking for generator delegation (PEP 380).

For simple iterators, yield from iterable is essentially just a shortened form of for item in iterable: yield item.

def generator(iterable):
  for i in iterable:
    yield do_something(i)

def generate_all():
  yield from generator(get_the_list())

It's pretty concise and also has a number of other advantages, such as being able to chain arbitrary/different iterables!
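To illustrate the chaining point, here is a minimal sketch in which one generator delegates to several different kinds of iterables in turn. The name generate_mixed and its sample values are invented for this example.

```python
def generate_mixed():
    # yield from accepts any iterable, not only generators.
    yield from [1, 2]                     # a list
    yield from "ab"                       # a string, one character at a time
    yield from (n * n for n in range(3))  # a generator expression

result = list(generate_mixed())
print(result)  # [1, 2, 'a', 'b', 0, 1, 4]
```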

Magdala answered 5/1, 2020 at 1:19 Comment(4)
Oh you mean the naming of list? It's a bad example, not real code pasted in the question, I should probably edit it.Nananne
Yeah - never fear, I'm quite guilty of example code that won't even run at first ask..Magdala
First one can be a one-liner too :). yield from map(do_something, iterable) or even yield from (do_something(x) for x in iterable)Collotype
You only need delegation if you are, yourself, doing something other than just returning the new generator. If you just return the new generator, no delegation is needed. So yield from is pointless unless your wrapper does something else generator-y.Frederick
O
16

return generator(list) does what you want. But note that

yield from generator(list)

would be equivalent, but with the opportunity to yield more values after generator is exhausted. For example:

def generator_all_and_then_some():
    list = get_the_list()
    yield from generator(list)
    yield "one last thing"
Osi answered 5/1, 2020 at 1:19 Comment(1)
I believe there's a subtle difference between yield from and return when the consumer of the generator throws an exception inside of it - and with other operations that are influenced by the stack trace.Calyx
C
10

The following two statements will appear to be functionally equivalent in this particular case:

return generator(list)

and

yield from generator(list)

The latter is approximately the same as

for i in generator(list):
    yield i

The return statement returns the generator you are looking for. A yield from or yield statement turns your whole function into something that returns a generator, which passes through the one you are looking for.

From a user's point of view, plain iteration behaves the same either way. Internally, however, return is arguably more efficient, since it does not wrap generator(list) in a superfluous pass-through generator. If you plan on doing any processing on the elements of the wrapped generator, use some form of yield, of course.
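One place where the two forms can be told apart is timing. With return, the body of the wrapper runs as soon as it is called; with yield from, nothing in the body runs until the first item is requested. The sketch below records this with a call log. The calls list, the get_the_list body, and the two wrapper names are assumptions made for the demonstration.

```python
calls = []

def get_the_list():
    # Hypothetical stand-in that records when it is invoked.
    calls.append("get_the_list")
    return [1, 2, 3]

def generator(some_list):
    for i in some_list:
        yield i

def generate_all_return():
    # Ordinary function: its body runs immediately when called.
    return generator(get_the_list())

def generate_all_yield_from():
    # Generator function: its body runs only once iteration starts.
    yield from generator(get_the_list())

g1 = generate_all_return()
calls_after_return_call = list(calls)       # get_the_list already ran

calls.clear()
g2 = generate_all_yield_from()
calls_after_yield_from_call = list(calls)   # nothing has run yet
first_item = next(g2)
calls_after_first_next = list(calls)        # get_the_list ran on first next()

print(calls_after_return_call)      # ['get_the_list']
print(calls_after_yield_from_call)  # []
print(calls_after_first_next)       # ['get_the_list']
```

This matters mostly when get_the_list is expensive or has side effects, or when it can raise: with return the error surfaces at the call to the wrapper, with yield from at the first iteration step.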

Collotype answered 5/1, 2020 at 1:20 Comment(0)
S
4

You would return it.

yielding* would cause generate_all() to evaluate to a generator itself, and calling next on that outer generator would return the inner generator returned by the first function, which isn't what you'd want.

* Not including yield from
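A quick sketch of what that looks like in practice, using a made-up wrapper name generate_all_plain_yield: the first item the caller receives is the inner generator object, not its elements.

```python
import types

def generator(some_list):
    for i in some_list:
        yield i

def generate_all_plain_yield():
    # Plain yield hands out the generator object itself as a single item.
    yield generator([1, 2, 3])

outer = generate_all_plain_yield()
first = next(outer)

is_generator = isinstance(first, types.GeneratorType)
inner_items = list(first)

print(is_generator)  # True
print(inner_items)   # [1, 2, 3]
```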

Snitch answered 5/1, 2020 at 1:20 Comment(0)
