Generators and files

When I write:

lines = (line.strip() for line in open('a_file'))

Is the file opened immediately or is the file system only accessed when I start to consume the generator expression?

Vest answered 18/8, 2017 at 13:54 Comment(3)
If you do open = print first, then your code does print a_file. - Lorenalorene
@StefanPochmann It took me a while, but at least I understood your comment... Thank you very much. - Vest
@MSeifert Very nice edit! - Vest

open() is called immediately upon the construction of the generator, irrespective of when or whether you consume from it.

The relevant spec is PEP 289:

Early Binding versus Late Binding

After much discussion, it was decided that the first (outermost) for-expression should be evaluated immediately and that the remaining expressions be evaluated when the generator is executed.

Asked to summarize the reasoning for binding the first expression, Guido offered [5]:

Consider sum(x for x in foo()). Now suppose there's a bug in foo() that raises an exception, and a bug in sum() that raises an exception before it starts iterating over its argument. Which exception would you expect to see? I'd be surprised if the one in sum() was raised rather the one in foo(), since the call to foo() is part of the argument to sum(), and I expect arguments to be processed before the function is called.

OTOH, in sum(bar(x) for x in foo()), where sum() and foo() are bugfree, but bar() raises an exception, we have no choice but to delay the call to bar() until sum() starts iterating -- that's part of the contract of generators. (They do nothing until their next() method is first called.)

See the rest of that section for further discussion.
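A minimal sketch of that rule, using hypothetical helper names (outer, inner, transform) that do not come from the PEP: only the iterable in the first for clause should be evaluated when the generator expression is created, while the second for clause and the element expression wait until the generator is consumed.

```python
calls = []  # records the order in which the helpers run

def outer():
    calls.append("outer")
    return [1, 2]

def inner(x):
    calls.append(f"inner({x})")
    return [x, x * 10]

def transform(y):
    calls.append(f"transform({y})")
    return y + 1

# Building the generator evaluates outer() right away, and nothing else.
gen = (transform(y) for x in outer() for y in inner(x))
calls_at_creation = list(calls)
print(calls_at_creation)  # ['outer']

# inner() and transform() only run once the generator is consumed.
result = list(gen)
print(result)  # [2, 11, 3, 21]
print(calls)
```

In the question's terms, open('a_file') plays the role of outer() here, and line.strip() plays the role of transform().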

Damar answered 18/8, 2017 at 14:5 Comment(0)

It is opened immediately. You can verify this by using a filename that does not exist: Python raises an exception (FileNotFoundError in Python 3) as soon as the generator expression is created, which shows that it tried to open the file immediately.
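A small sketch of that check, assuming Python 3. The path is built inside a fresh temporary directory so that it is guaranteed not to exist; the file name no_such_file.txt is just a placeholder.

```python
import os
import tempfile

# A path inside a brand-new temporary directory, so the file cannot exist.
missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")

try:
    # Nothing consumes the generator here, yet open() is still called.
    lines = (line.strip() for line in open(missing))
except FileNotFoundError as exc:
    raised_at_creation = True
    print("raised while building the generator:", exc)
else:
    raised_at_creation = False

print(raised_at_creation)  # True
```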

You can also use a function that gives more feedback to see that the call is executed even before the generator is iterated over:

def somefunction(filename):
    print(filename)
    return open(filename)

lines = (line.strip() for line in somefunction('a_file'))  # prints

However, if you use a generator function instead of a generator expression, the file is only opened when you iterate over it:

def somefunction(filename):
    print(filename)
    for line in open(filename):
        yield line.strip()

lines = somefunction('a_file')  # no print!

list(lines)                     # prints because list() iterates over the generator.
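
The same deferral can be seen with a missing file. This is a sketch assuming Python 3; stripped_lines is a hypothetical name for a generator function like the one above, and the path is a placeholder that is guaranteed not to exist. Creating the generator should not raise, because the function body (including open()) only starts running on the first next() call.

```python
import os
import tempfile

def stripped_lines(filename):
    # The body does not start executing until the first next() call.
    for line in open(filename):
        yield line.strip()

# A path inside a brand-new temporary directory, so the file cannot exist.
missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")

lines = stripped_lines(missing)  # no error yet: open() has not been called

try:
    next(lines)  # the body runs now, so open() is called and fails here
except FileNotFoundError:
    raised_on_first_next = True
else:
    raised_on_first_next = False

print(raised_on_first_next)  # True
```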
Suspensive answered 18/8, 2017 at 13:58 Comment(1)
I appreciate your observation about the difference between generator expressions and generator functions. I haven't accepted your answer because I prefer the one that references the original decision process, but yours is a very good one! - Vest

It is opened immediately.

Example:

def func():
    print('x')
    return [1, 2, 3]

g = (x for x in func())

Output:

x

The function needs to return an iterable object. open() returns an open file object that is iterable. Therefore, the file will be opened when you define the generator expression.

Microclimate answered 18/8, 2017 at 13:58 Comment(0)
