Read/Write Python Closures
D

8

39

Closures are an incredibly useful language feature. They let us do clever things that would otherwise take a lot of code, and often enable us to write code that is more elegant and clearer. In Python 2.x, the variable names captured by a closure cannot be rebound; that is, a function defined inside another lexical scope cannot do something like some_var = 'changed!' for variables outside of its local scope. Can someone explain why that is? There have been situations in which I would like to create a closure that rebinds variables in the outer scope, but it wasn't possible. I realize that in almost all cases (if not all of them), this behavior can be achieved with classes, but it is often not as clean or as elegant. Why can't I do it with a closure?

Here is an example of a rebinding closure:

def counter():
    count = 0
    def c():
        count += 1
        return count
    return c

This is the current behavior when you call it:

>>> c = counter()
>>> c()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in c
UnboundLocalError: local variable 'count' referenced before assignment

What I'd like it to do instead is this:

>>> c()
1
>>> c()
2
>>> c()
3
Dispersal answered 5/1, 2010 at 21:57 Comment(6)
Well, in this case, itertools.count(1).next. In my experience it's not easy to find actual cases where idiomatic Python is "not as clean or as elegant" as the Perl/JS/Scheme using closures.Goran
what you are actually looking for is called "generators"Cutcheon
I chose this code example because it was simple, not because it was what I actually wanted to use in production. I'll try to remember a good production example and edit it in. Clearly my choice of example code has thrown people off.Dispersal
@fuzzy-lollipop That is patently false. I know what generators are, and I use them frequently. Generators are fantastic. Generators are not, however, read/write closures.Dispersal
Coroutines seem like they might be.Wadleigh
Ethan's recent edits, while technically more precise than the original wording, make this question considerably less accessible to the average programmer. I'd really like to roll them back, but I'd like a second opinion.Dispersal
S
32

To expand on Ignacio's answer:

def counter():
    count = 0
    def c():
        nonlocal count
        count += 1
        return count
    return c

x = counter()
print([x(),x(),x()])

This gives [1, 2, 3] in Python 3, and separate invocations of counter() give independent counters. Other solutions, especially those using itertools or yield, are more idiomatic.
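To illustrate the point about independent counters, here is a small sketch (Python 3 only; the names a and b are just for illustration). Each call to counter() creates a fresh cell for count, and that cell can be inspected through the closure's __closure__ attribute:

```python
def counter():
    count = 0
    def c():
        nonlocal count  # rebind the enclosing function's variable
        count += 1
        return count
    return c

a = counter()
b = counter()
results = [a(), a(), b()]
print(results)  # a and b keep separate state

# The captured variable lives in a cell object attached to each closure.
print(a.__closure__[0].cell_contents, b.__closure__[0].cell_contents)
```

Calling a twice does not affect b, because the two closures hold different cells.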

Supernova answered 5/1, 2010 at 22:34 Comment(2)
I realize that this isn't the best example I could have used, it just seemed like the simplest example to write. This is exactly what I was looking for (along with a reason they didn't add 'nonlocal' years ago). The counter example was just a piece of throwaway code to get the point across. Thank you for making this explicit.Dispersal
@northtree The nonlocal statement merely affects scoping, it does not have an impact on thread-safety. The statement count += 1 will lead to incorrect results if multiple threads are executing it at once; it needs to be executed in a lock. However, this is orthogonal to the question. It depends on the use case whether locking should be present inside c or is a responsibility of a caller.Supernova
C
23

You could do this and it would work more or less the same way:

class counter(object):
    def __init__(self, count=0):
        self.count = count
    def __call__(self):
        self.count += 1
        return self.count    

Or, a bit of a hack:

def counter():
    count = [0]
    def incr(n):
        n[0] += 1
        return n[0]
    return lambda: incr(count)

I'd go with the first solution.
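A sketch of why the hack works: n[0] += 1 mutates the list object, it never rebinds the name that refers to the list, so the restriction on rebinding outer names does not apply. A slightly shorter variant of the same idea (the name box is hypothetical) runs on both 2.x and 3.x:

```python
def counter():
    box = [0]  # one-element list used as a mutable cell
    def c():
        box[0] += 1  # item assignment mutates the list; the name 'box' is only read
        return box[0]
    return c

x = counter()
values = [x(), x(), x()]
print(values)
```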

EDIT: That's what I get for not reading the big blob of text.

Anyway, the reason Python closures are rather limited is "because Guido felt like it." Python was designed in the early 90s, in the heyday of OO. Closures were rather low on the list of language features people wanted. As functional ideas like first-class functions, closures, and other things make their way into mainstream popularity, languages like Python have had to tack them on, so their use may be a bit awkward, because that's not what the language was designed for.

<rant on="Python scoping">

Also, Python (2.x) has rather odd (in my opinion) ideas about scoping that interfere with a sane implementation of closures, among other things. It always bothers me that this:

new = [x for x in old]

leaves us with the name x defined in the scope we used it in, even though the comprehension is (in my opinion) a conceptually smaller scope. (Though Python gets points for consistency, as doing the same thing with a for loop has the same behavior. The only way to avoid this is to use map.)
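For comparison, a sketch of how this looks in Python 3 (an observation about 3.x, not about the 2.x behavior described above; the function name demo is made up for illustration). There the loop variable of a list comprehension is no longer visible afterwards, while a plain for loop still leaves its variable bound:

```python
def demo():
    old = [1, 2, 3]
    new = [x for x in old]
    try:
        x  # Python 3: the comprehension variable is not visible here
        comprehension_leaks = True
    except NameError:
        comprehension_leaks = False

    for y in old:
        pass
    # A plain for loop still leaves its variable bound after the loop ends.
    return comprehension_leaks, y

result = demo()
print(result)
```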

Anyway, </rant>

Cupboard answered 5/1, 2010 at 22:9 Comment(2)
All very good info, and I appreciate the insight. I also agree with your assessment of python's scoping. Still, the question was "why can't I?", not "how do I?". I want to know why the language was designed like this.Dispersal
Great hack. Used it in python 2.7 and love it (though I don't love that I have to do it).Romany
F
17

nonlocal in 3.x should remedy this.

Fordo answered 5/1, 2010 at 22:0 Comment(3)
That's fantastic news. I will look into it. Does that mean that "nonlocal" signals the interpreter to create a thunk for this function? Do you know how this works?Dispersal
I don't know all the details, but nonlocal should indicate to the compiler that it will need to walk the scopes in order to find the name.Fordo
Right. AFAIK, the only way for that to work is to essentially package up the whole scope of the encompassing function before it gets GC'd in a thunk.Dispersal
K
14

I would use a generator:

>>> def counter():
...     count = 0
...     while True:
...         count += 1
...         yield count
...
>>> c = counter()
>>> next(c)  # next() works on 2.6+ and 3.x; c.next() is 2.x only
1
>>> next(c)
2
>>> next(c)
3

EDIT: I believe the ultimate answer to your question is PEP-3104:

In most languages that support nested scopes, code can refer to or rebind (assign to) any name in the nearest enclosing scope. Currently, Python code can refer to a name in any enclosing scope, but it can only rebind names in two scopes: the local scope (by simple assignment) or the module-global scope (using a global declaration).

This limitation has been raised many times on the Python-Dev mailing list and elsewhere, and has led to extended discussion and many proposals for ways to remove this limitation. This PEP summarizes the various alternatives that have been suggested, together with advantages and disadvantages that have been mentioned for each.

Before version 2.1, Python's treatment of scopes resembled that of standard C: within a file there were only two levels of scope, global and local. In C, this is a natural consequence of the fact that function definitions cannot be nested. But in Python, though functions are usually defined at the top level, a function definition can be executed anywhere. This gave Python the syntactic appearance of nested scoping without the semantics, and yielded inconsistencies that were surprising to some programmers -- for example, a recursive function that worked at the top level would cease to work when moved inside another function, because the recursive function's own name would no longer be visible in its body's scope. This violates the intuition that a function should behave consistently when placed in different contexts.

Kinsman answered 5/1, 2010 at 22:19 Comment(4)
+1 I was too focused on the desired usage syntax that I forgot about generators. Always good.Cupboard
That's a good answer, but it's not an answer to the questions I asked. I want to know why python is designed like this, not how to work around it. I could do it with a class, with generators (which just instantiates a class, so it amounts to the same thing), or probably other ways. But I don't care about a counter instance, I want to know why the language was designed this way.Dispersal
Python 2.x thinks you are declaring a new variable if it is not in the current scope. That's why Python 3.0 introduced the nonlocal keyword, to work around this issue.Kinsman
Again, "python thinks you're declaring a new variable" is the what, not the why. I got that much from the error message. :-) Still, I appreciate the insight.Dispersal
D
7

Functions can also have attributes, so this would work, too:

def counter():
    def c():
        while True:
            yield c.count
            c.count += 1
    c.count = 0
    return c
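For comparison, a sketch of the same idea without the generator (an illustrative variant, not a replacement for the code above). The state lives in an attribute on the inner function, so each call to counter() gets its own count, and calling the returned function gives the next value directly:

```python
def counter():
    def c():
        c.count += 1  # attribute assignment on the function object; 'c' is only read
        return c.count
    c.count = 0  # each call to counter() creates a new c with its own attribute
    return c

a = counter()
b = counter()
values = [a(), a(), b()]
print(values)
```

No name in an outer scope is rebound here, which is why this works in 2.x as well.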

However, in this specific example, I'd use a generator as suggested by jbochi.

As for why, I can't say for sure, but I imagine it's not an explicit design choice, but rather a remnant of Python's sometimes-odd scoping rules (and especially the somewhat-odd evolution of its scoping rules).

Diapause answered 5/1, 2010 at 22:24 Comment(4)
+1 for a way to do it I didn't know before (and for agreeing on Python's inane scoping rules).Cupboard
That's an interesting idea, but it's effectively just a class declaration. In fact, it's disturbingly similar to javascript style classes. But thanks for the insight; I didn't know you could add arbitrary attributes to functions.Dispersal
Yeah, it's basically a class. And I've found that when you have an urge to use function attributes, you should either (a) use a generator (as in this example), or (b) use a class. But I thought it was worth pointing out that function attributes do exist and could be used as a solution to your problem (even though I agree that they're kind of ugly).Diapause
you have a semantic error in your program: the count attribute should apply to the inner function c rather than the outer function (which returns separate instances of inner functions)Mathers
E
6

This behavior is quite thoroughly explained in the official Python tutorial as well as in the Python execution model. In particular, from the tutorial:

A special quirk of Python is that – if no global statement is in effect – assignments to names always go into the innermost scope.

However, this does not say anything about why it behaves in this way.

Some more information comes from PEP 3104, which tries to tackle this situation for Python 3.0.
There, you can see that it is this way because, at a certain point in time, it was seen as the best solution instead of introducing classic static nested scopes (see Re: Scoping (was Re: Lambda binding solved?)).

That said, I also have my own interpretation.
Python implements namespaces as dictionaries; when a lookup for a variable fails in the inner namespace, it tries the outer one, and so on, until it reaches the builtins.
However, binding a variable is a completely different matter, because you need to specify a particular namespace, and that is always the innermost one (unless you set the "global" flag, in which case it is always the global namespace).
In the end, the different algorithms used for looking up and binding variables are the reason closures are read-only in Python.
But, again, this is just my speculation :-)
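The difference between looking a name up and binding it can be observed on compiled code objects. A minimal sketch, assuming CPython (the function names reader and writer are made up for illustration): whether a name is local is decided per function when it is compiled, and any assignment to the name makes it local unless a global (or, in 3.x, nonlocal) statement says otherwise.

```python
def outer():
    count = 0
    def reader():
        return count  # only read: resolved in the enclosing scope
    def writer():
        count = 1  # assigned: compiled as a new local, the outer count is untouched
        return count
    return reader, writer

reader, writer = outer()
print(reader.__code__.co_freevars)  # names taken from the enclosing scope
print(writer.__code__.co_varnames)  # names local to writer
writer()
print(reader())  # still the outer value
```

In the question's example, count += 1 is both a read and an assignment, so count is compiled as a local and the read fails with UnboundLocalError.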

Eumenides answered 5/1, 2010 at 23:17 Comment(4)
Interesting ideas. From my perspective the most logical thing to do when you have an lvalue that is being assigned to is to first look it up in the symbol table. If it exists, use it, if not, create it at the innermost scope. But maybe that's just me...?Dispersal
I fully agree with you, and I have been bitten several times by a similar assumption in Python. For this reason, I found it very useful to reason in terms of dictionaries: as soon as you see the memory model that way, things become much clearer.Eumenides
I know it is late, but I wanted to add that the pep 3104 tackles this idea, and why not use it. Also, I did fall into this SO question because I was reading the PEP and wanted a definition/explanation for classic static nested scope any hints?Rossierossing
+1 :) This was the only answer I found trying to explain the lack of full first class closures in Python 2.xEmmettemmey
L
1

It is not so much that they are read-only as that the scope is stricter than you realize. If you can't use nonlocal (available in Python 3+), then you can at least use explicit scoping. Python 2.6.1, with explicit scoping at the module level:

>>> import sys
>>> def counter():
...     sys.modules[__name__].count = 0
...     def c():
...         sys.modules[__name__].count += 1
...         return sys.modules[__name__].count
...     sys.modules[__name__].c = c
...     
>>> counter()
>>> c()
1
>>> c()
2
>>> c()
3

A little more work is required to have a more restricted scope for the count variable, instead of using a pseudo-global module variable (still Python 2.6.1):

>>> def counter():
...     class c():
...         def __init__(self):
...             self.count = 0
...     cinstance = c()
...     def iter():
...         cinstance.count += 1
...         return cinstance.count
...     return iter
... 
>>> c = counter()
>>> c()
1
>>> c()
2
>>> c()
3
>>> d = counter()
>>> d()
1
>>> c()
4
>>> d()
2
Livvy answered 6/1, 2010 at 7:28 Comment(3)
It turns out the inability to affect things in non-local non-global scope is the same as being read-only. Explicit scoping at the module level just gives you another kind of global, it doesn't give you a new instance for each execution of the counter() function.Dispersal
Conceded. Please see my new additional example above. I am trying to get out what you said you would like to achieve in your original question.Livvy
It occurs to me now that my new example, using the internal cinstance to hold the state of count is almost identical to your first example in the question, except that count is now maintained inside an instance. There is a separate instance for each invocation of counter(), which is what you wanted originally. What puzzles me is why the internal instance variable cinstance is handled different with respect to scoping than the original count in your first example; as you can test yourself, the code above does not produce UnboundLocalError on cinstance.Livvy
N
0

To expand on sdcvvc's answer, here is a version that passes a parameter to the closure:

def counter():
    count = 0
    def c(delta=1):
        nonlocal count
        count += delta
        return count
    return c

x = counter()
print([x(), x(100), x(-99)])

Thread-safe version:

import threading

def counter():
    count = 0
    _lock = threading.Lock()
    def c(delta=1):
        nonlocal count
        with _lock:
            count += delta
            return count
    return c
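A usage sketch for the thread-safe version (Python 3; the worker function and the thread and iteration counts are arbitrary choices for illustration). Several threads increment the same counter, and the lock keeps each read-modify-write step atomic:

```python
import threading

def counter():
    count = 0
    _lock = threading.Lock()
    def c(delta=1):
        nonlocal count
        with _lock:  # the read-modify-write happens while holding the lock
            count += delta
            return count
    return c

c = counter()

def worker():
    for _ in range(1000):
        c()

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = c(0)  # delta=0 reads the current value without changing it
print(total)
```

With 8 threads making 1000 calls each, the final value is 8000 regardless of how the threads interleave.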
Northerly answered 20/3, 2020 at 0:18 Comment(0)
