Creating a class within a function and access a function defined in the containing function's scope [duplicate]
Asked Answered
J

2

94

Edit:

See my full answer at the bottom of this question.

tl;dr answer: Python has statically nested scopes. The static aspect can interact with the implicit variable declarations, yielding non-obvious results.

(This can be especially surprising because of the language's generally dynamic nature).

I thought I had a pretty good handle on Python's scoping rules, but this problem has me thoroughly stymied, and my google-fu has failed me (not that I'm surprised - look at the question title ;)

I'm going to start with a few examples that work as expected, but feel free to skip to example 4 for the juicy part.

Example 1.

>>> x = 3
>>> class MyClass(object):
...     x = x
... 
>>> MyClass.x
3

Straightforward enough: during class definition we're able to access the variables defined in the outer (in this case global) scope.

Example 2.

>>> def mymethod(self):
...     return self.x
... 
>>> x = 3
>>> class MyClass(object):
...     x = x
...     mymethod = mymethod
...
>>> MyClass().mymethod()
3

Again (ignoring for the moment why one might want to do this), there's nothing unexpected here: we can access functions in the outer scope.

Note: as Frédéric pointed out below, this function doesn't seem to work. See Example 5 (and beyond) instead.

Example 3.

>>> def myfunc():
...     x = 3
...     class MyClass(object):
...         x = x
...     return MyClass
... 
>>> myfunc().x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in myfunc
  File "<stdin>", line 4, in MyClass
NameError: name 'x' is not defined

This is essentially the same as example 1: we're accessing the outer scope from within the class definition, just this time that scope isn't global, thanks to myfunc().

Edit 5: As @user3022222 pointed out below, I botched this example in my original posting. I believe this fails because only functions (not other code blocks, like this class definition) can access variables in the enclosing scope. For non-function code blocks, only local, global and built-in variables are accessible. A more thorough explanation is available in this question

One more:

Example 4.

>>> def my_defining_func():
...     def mymethod(self):
...         return self.y
...     class MyClass(object):
...         mymethod = mymethod
...         y = 3
...     return MyClass
... 
>>> my_defining_func()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in my_defining_func
  File "<stdin>", line 5, in MyClass
NameError: name 'mymethod' is not defined

Um...excuse me?

What makes this any different from example 2?

I'm completely befuddled. Please sort me out. Thanks!

P.S. on the off-chance that this isn't just a problem with my understanding, I've tried this on Python 2.5.2 and Python 2.6.2. Unfortunately those are all I have access to at the moment, but they both exhibit the same behaviour.

Edit According to http://docs.python.org/tutorial/classes.html#python-scopes-and-namespaces: at any time during execution, there are at least three nested scopes whose namespaces are directly accessible:

  • the innermost scope, which is searched first, contains the local names
  • the scopes of any enclosing functions, which are searched starting with the nearest enclosing scope, contains non-local, but also non-global names
  • the next-to-last scope contains the current module’s global names
  • the outermost scope (searched last) is the namespace containing built-in names

#4. seems to be a counter-example to the second of these.

Edit 2

Example 5.

>>> def fun1():
...     x = 3
...     def fun2():
...         print x
...     return fun2
... 
>>> fun1()()
3

Edit 3

As @Frédéric pointed out the assignment of to a variable of the same name as it has in the outer scope seems to "mask" the outer variable, preventing the assignment from functioning.

So this modified version of Example 4 works:

def my_defining_func():
    def mymethod_outer(self):
        return self.y
    class MyClass(object):
        mymethod = mymethod_outer
        y = 3
    return MyClass

my_defining_func()

However this doesn't:

def my_defining_func():
    def mymethod(self):
        return self.y
    class MyClass(object):
        mymethod_temp = mymethod
        mymethod = mymethod_temp
        y = 3
    return MyClass

my_defining_func()

I still don't fully understand why this masking occurs: shouldn't the name binding occur when the assignment happens?

This example at least provides some hint (and a more useful error message):

>>> def my_defining_func():
...     x = 3
...     def my_inner_func():
...         x = x
...         return x
...     return my_inner_func
... 
>>> my_defining_func()()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in my_inner_func
UnboundLocalError: local variable 'x' referenced before assignment
>>> my_defining_func()
<function my_inner_func at 0xb755e6f4>

So it appears that the local variable is defined at function creation (which succeeds), resulting in the local name being "reserved" and thus masking the outer-scope name when the function is called.

Interesting.

Thanks Frédéric for the answer(s)!

For reference, from the python docs:

It is important to realize that scopes are determined textually: the global scope of a function defined in a module is that module’s namespace, no matter from where or by what alias the function is called. On the other hand, the actual search for names is done dynamically, at run time — however, the language definition is evolving towards static name resolution, at “compile” time, so don’t rely on dynamic name resolution! (In fact, local variables are already determined statically.)

Edit 4

The Real Answer

This seemingly confusing behaviour is caused by Python's statically nested scopes as defined in PEP 227. It actually has nothing to do with PEP 3104.

From PEP 227:

The name resolution rules are typical for statically scoped languages [...] [except] variables are not declared. If a name binding operation occurs anywhere in a function, then that name is treated as local to the function and all references refer to the local binding. If a reference occurs before the name is bound, a NameError is raised.

[...]

An example from Tim Peters demonstrates the potential pitfalls of nested scopes in the absence of declarations:

i = 6
def f(x):
    def g():
        print i
    # ...
    # skip to the next page
    # ...
    for i in x:  # ah, i *is* local to f, so this is what g sees
        pass
    g()

The call to g() will refer to the variable i bound in f() by the for loop. If g() is called before the loop is executed, a NameError will be raised.

Lets run two simpler versions of Tim's example:

>>> i = 6
>>> def f(x):
...     def g():
...             print i
...     # ...
...     # later
...     # ...
...     i = x
...     g()
... 
>>> f(3)
3

when g() doesn't find i in its inner scope, it dynamically searches outwards, finding the i in f's scope, which has been bound to 3 through the i = x assignment.

But changing the order the final two statements in f causes an error:

>>> i = 6
>>> def f(x):
...     def g():
...             print i
...     # ...
...     # later
...     # ...
...     g()
...     i = x  # Note: I've swapped places
... 
>>> f(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in f
  File "<stdin>", line 3, in g
NameError: free variable 'i' referenced before assignment in enclosing scope

Remembering that PEP 227 said "The name resolution rules are typical for statically scoped languages", lets look at the (semi-)equivalent C version offer:

// nested.c
#include <stdio.h>

int i = 6;
void f(int x){
    int i;  // <--- implicit in the python code above
    void g(){
        printf("%d\n",i);
    }
    g();
    i = x;
    g();
}

int main(void){
    f(3);
}

compile and run:

$ gcc nested.c -o nested
$ ./nested 
134520820
3

So while C will happily use an unbound variable (using whatever happens to have been stored there before: 134520820, in this case), Python (thankfully) refuses.

As an interesting side-note, statically nested scopes enable what Alex Martelli has called "the single most important optimization the Python compiler does: a function's local variables are not kept in a dict, they're in a tight vector of values, and each local variable access uses the index in that vector, not a name lookup."

Jaine answered 28/11, 2010 at 12:14 Comment(0)
E
26

That's an artifact of Python's name resolution rules: you only have access to the global and the local scopes, but not to the scopes in-between, e.g. not to your immediate outer scope.

EDIT: The above was poorly worded, you do have access to the variables defined in outer scopes, but by doing x = x or mymethod = mymethod from a non-global namespace, you're actually masking the outer variable with the one you're defining locally.

In example 2, your immediate outer scope is the global scope, so MyClass can see mymethod, but in example 4 your immediate outer scope is my_defining_func(), so it can't, because the outer definition of mymethod is already masked by its local definition.

See PEP 3104 for more details about nonlocal name resolution.

Also note that, for the reasons explained above, I can't get example 3 to run under either Python 2.6.5 or 3.1.2:

>>> def myfunc():
...     x = 3
...     class MyClass(object):
...         x = x
...     return MyClass
... 
>>> myfunc().x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in myfunc
  File "<stdin>", line 4, in MyClass
NameError: name 'x' is not defined

But the following would work:

>>> def myfunc():
...     x = 3
...     class MyClass(object):
...         y = x
...     return MyClass
... 
>>> myfunc().y
3
Estrellaestrellita answered 28/11, 2010 at 12:28 Comment(9)
PEP 3104 only applies to Python 3.Sweatshop
@Chris, yes, and nonlocal does not exist in Python 2, so questioner can't use that. That was actually my point ;)Shout
According to docs.python.org/tutorial/…: at any time during execution, there are at least three nested scopes whose namespaces are directly accessible: • the innermost scope, which is searched first, contains the local names • the scopes of any enclosing functions, which are searched starting with the nearest enclosing scope, contains non-local, but also non-global names • the next-to-last scope contains the current module’s global names • the outermost scope (searched last) is the namespace containing built-in namesJaine
@Gabriel, example 3 raises NameError under both 2.6.5 and 3.1.2.Shout
@Gabriel, in the tutorial, the scopes of any enclosing functions probably refers to the scope in which those functions are defined and not to the functions themselves.Shout
I'm not sure I follow...what about example 5 added to the question? Does that run for you? My understanding was that nested scopes have read access to all their containing scopes (in order from inside to outside), but only write access to the inner-most and outer-most. Indeed, this seems to be what PEP 3104 says: "Currently, Python code can refer to a name in any enclosing scope, but it can only rebind names in two scopes: the local scope (by simple assignment) or the module-global scope (using a global declaration)."Jaine
@Gabriel, example 5 works because it doesn't do x == x. See my updated answer.Shout
@Frédéric ok, I think I'm all clear: name binding (and thus scope resolution happen dynamically), but name resolution happens statically. See the explanation at the bottom of my question for details. Thanks a lot!Jaine
you can also say "MyClass = myfunc()" and then MyClass is available in the rest of the script instead of having to call myfunc() every timeTuba
C
11

This post is a few years old, but it is among the rare ones to discuss the important problem of scope and static binding in Python. However, there is an important misunderstanding of the author for example 3 that might confuse readers. (do not take as granted that the other ones are all correct, it is just that I only looked at the issues raised by example 3 in details). Let me clarify what happened.

In example 3

def myfunc():
    x = 3
    class MyClass(object):
        x = x
    return MyClass

>>> myfunc().x

must return an error, unlike what the author of the post said. I believe that he missed the error because in example 1 x was assigned to 3 in the global scope. Thus a wrong understanding of what happened.

The explanation is extensively described in this post How references to variables are resolved in Python

Canaan answered 27/11, 2013 at 11:5 Comment(1)
Thanks for pointing this out: I've updated the explanation to fix the mistake you foundJaine

© 2022 - 2024 — McMap. All rights reserved.