Confusion about Python descriptors and the Descriptor HowTo Guide

Recently I read the official HOW-TO about Python descriptors, which derives from an essay written by Raymond Hettinger a long time ago. After reading it several times, I am still confused about some parts of it. I will quote some paragraphs, each followed by my questions.

If an instance’s dictionary has an entry with the same name as a data descriptor, the data descriptor takes precedence. If an instance’s dictionary has an entry with the same name as a non-data descriptor, the dictionary entry takes precedence.

Is this the same for classes? What is the precedence chain for a class if its dictionary has an entry with the same name as a data/non-data descriptor?

For objects, the machinery is in object.__getattribute__() which transforms b.x into type(b).__dict__['x'].__get__(b, type(b)). The implementation works through a precedence chain that gives data descriptors priority over instance variables, instance variables priority over non-data descriptors, and assigns lowest priority to __getattr__() if provided.

For classes, the machinery is in type.__getattribute__() which transforms B.x into B.__dict__['x'].__get__(None, B).

The two paragraphs above describe how a descriptor is invoked automatically on attribute access, and how access through an instance (b.x) differs from access through a class (B.x). However, here are my confusions:
- If the attribute of a class or an instance is not a descriptor, will the transformation (i.e., b.x into type(b).__dict__['x'].__get__(b, type(b)) and B.x into B.__dict__['x'].__get__(None, B)) still take place? Wouldn't it be simpler to return the attribute directly from the class's or instance's dict?
- If an instance's dictionary has an entry with the same name as a non-data descriptor, the precedence rule in the first quote says the dictionary entry takes precedence. In that case, will the transformation still take place, or is the value in the instance's dict simply returned?

Non-data descriptors provide a simple mechanism for variations on the usual patterns of binding functions into methods.

Are non-data descriptors chosen because functions/methods can only be gotten, but cannot be set?
What's the underlying mechanism for binding functions into methods? Since class dictionaries store methods as functions, if we call the same method using a class and its instance respectively, how can the underlying function tell whether its first argument should be self or not?

Functions have a __get__() method so that they can be converted to a method when accessed as attributes. The non-data descriptor transforms an obj.f(*args) call into f(obj, *args). Calling klass.f(*args) becomes f(*args).

How can a non-data descriptor transform an obj.f(*args) call into f(obj, *args)?
How can a non-data descriptor transform a klass.f(*args) call into f(*args)?
What's the underlying mechanism of the above two transformations? Why do class and instance access differ?
What role does the __get__() method play in these cases?

Is this the same for classes? What is the precedence chain for a class if its dictionary has an entry with the same name as a data/non-data descriptor?

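No. The lookup walks the class's MRO and stops at the first class whose __dict__ contains the name, so if an attribute is defined in both a superclass and a subclass, the superclass value is completely ignored, whether or not either one is a descriptor.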
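As a minimal sketch of this shadowing rule (the class names Base and Sub are made up for illustration), a plain attribute in a subclass hides even a data descriptor defined in the superclass:

```python
class Base:
    @property
    def x(self):              # a data descriptor defined in the superclass
        return "from Base property"


class Sub(Base):
    x = 10                    # a plain attribute with the same name


# Sub.__dict__ is searched before Base.__dict__, so the property is never seen.
lookup_on_class = Sub.x       # 10
lookup_on_instance = Sub().x  # 10

# On Base itself the property is found, and property.__get__(None, Base)
# returns the property object.
base_lookup = Base.x
```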

If the attribute of a class or an instance is not a descriptor, will the transformation (i.e., b.x into type(b).__dict__['x'].__get__(b, type(b)) and B.x into B.__dict__['x'].__get__(None, B)) still take place?

No, it directly returns the object found in the class's __dict__. Or equivalently, yes, if you pretend that every object has by default a __get__() method that ignores its arguments and returns self.
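A small sketch of the difference, using a hypothetical Tracer descriptor that just reports the arguments __get__ receives:

```python
class Tracer:
    """Hypothetical non-data descriptor that records how __get__ was called."""

    def __get__(self, instance, owner=None):
        return ("__get__ called", instance, owner)


class B:
    plain = 42          # an int has no __get__, so it is returned unchanged
    traced = Tracer()   # has __get__, so attribute access calls it


b = B()

plain_has_get = hasattr(B.__dict__["plain"], "__get__")  # False
via_instance = b.traced  # ("__get__ called", b, B)
via_class = B.traced     # ("__get__ called", None, B)
```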

If an instance's dictionary has an entry with the same name as a non-data descriptor, according to the precedence rule in the first quote, the dictionary entry takes precedence, at this time will the transformation still proceed? Or just return the value in its dict?

What is not clear in the paragraph that you quoted (maybe it's written down elsewhere) is that when b.x decides to return b.__dict__['x'], no __get__ is invoked in any case. The __get__ is invoked exactly when the syntax b.x or B.x decides to return an object that lives in a class dict.
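To make this concrete, here is a sketch with two made-up descriptor classes, NonData and Data. It writes into b.__dict__ directly so that no __set__ gets in the way:

```python
class NonData:
    def __get__(self, instance, owner=None):
        return "non-data __get__"


class Data:
    def __get__(self, instance, owner=None):
        return "data __get__"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class B:
    nd = NonData()
    d = Data()


b = B()
b.__dict__["nd"] = "instance value"
b.__dict__["d"] = "instance value"

nd_result = b.nd   # "instance value": the instance dict beats a non-data descriptor
d_result = b.d     # "data __get__": a data descriptor beats the instance dict

# Even a descriptor object stored in the instance dict is returned as is;
# its __get__ is not invoked because it does not live in a class dict.
stored = NonData()
b.__dict__["nd"] = stored
raw_result = b.nd  # the NonData object itself
```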

Are non-data descriptors chosen because functions/methods can only be gotten, but cannot be set?

Yes: they are a generalization of the "old-style" class model in Python, in which you can say B.f = 42 even if f is a function object living in the class B. This lets the function object be overridden with an unrelated object. Data descriptors, on the other hand, follow a different logic in order to support property.
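A short sketch of that contrast, with an illustrative class B holding one function and one read-only property. The instance-level assignments are where the two kinds of descriptor behave differently:

```python
class B:
    def f(self):
        return "method"

    @property
    def p(self):
        return "property"


original_f = B.__dict__["f"]
b = B()

# A function defines __get__ only, so assignment just fills the instance dict,
# and that entry then shadows the function on lookup.
b.f = 42
f_after = b.f   # 42

# A property also defines __set__, so assignment is intercepted; with no
# setter defined, it raises AttributeError.
try:
    b.p = 42
except AttributeError:
    p_blocked = True
else:
    p_blocked = False

# Rebinding on the class replaces the entry in B.__dict__.
B.f = 42
```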

What's the underlying mechanism for binding functions into methods? Since class dictionaries store methods as functions, if we call the same method using a class and its instance respectively, how can the underlying function tell whether its first argument should be self or not?

To understand this, you need to have "method objects" in mind. The syntax b.f(*args) is equivalent to (b.f)(*args), which is two steps. The first step calls f.__get__(b); this returns a method object that stores both b and f. The second step calls the method object, which in turn calls the original f, passing b as an extra first argument. None of this happens for B.f, simply because B.f, i.e. f.__get__(None, B), is just f (in Python 3). That is how the special method __get__ is designed on function objects.
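The two steps can be spelled out by hand. This is only a sketch on Python 3, with an illustrative class B:

```python
class B:
    def f(self, arg):
        return (self, arg)


b = B()
func = B.__dict__["f"]      # the plain function stored in the class dict

# Step 1: what b.f does under the hood.
bound = func.__get__(b, B)  # a method object remembering both b and func
same_self = bound.__self__ is b
same_func = bound.__func__ is func

# Step 2: calling the method object inserts b as the first argument.
result_bound = bound(1)
result_plain = func(b, 1)
result_attr = b.f(1)

# Through the class, __get__(None, B) hands back the function unchanged.
class_access_is_func = B.f is func
```

Each access to b.f builds a fresh method object, so b.f == bound holds while b.f is bound does not.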
