How does Python's "super" do the right thing?

I'm running Python 2.5, so this question may not apply to Python 3. When you make a diamond class hierarchy using multiple inheritance and create an object of the derived-most class, Python does the Right Thing (TM). It calls the constructor for the derived-most class, then its parent classes as listed from left to right, then the grandparent. I'm familiar with Python's MRO; that's not my question. I'm curious how the object returned from super actually manages to communicate the correct order to the calls of super in the parent classes. Consider this example code:

#!/usr/bin/python

class A(object):
    def __init__(self): print "A init"

class B(A):
    def __init__(self):
        print "B init"
        super(B, self).__init__()

class C(A):
    def __init__(self):
        print "C init"
        super(C, self).__init__()

class D(B, C):
    def __init__(self):
        print "D init"
        super(D, self).__init__()

x = D()

The code does the intuitive thing; it prints:

D init
B init
C init
A init

However, if you comment out the call to super in B's init function, neither A nor C's init function is called. This means B's call to super is somehow aware of C's existence in the overall class hierarchy. I know that super returns a proxy object with an overloaded get operator, but how does the object returned by super in D's init definition communicate the existence of C to the object returned by super in B's init definition? Is the information that subsequent calls of super use stored on the object itself? If so, why isn't super instead self.super?

Edit: Jekke quite rightly pointed out that it's not self.super because super is an attribute of the class, not an instance of the class. Conceptually this makes sense, but in practice super isn't an attribute of the class either! You can test this in the interpreter by making two classes A and B, where B inherits from A, and calling dir(B). It has no super or __super__ attributes.
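The point in the edit above can be checked directly. This is only a small sketch and it assumes Python 3, where the module holding the builtins is named builtins (on Python 2 it is __builtin__); the variable names are made up for illustration.

```python
import builtins  # assumption: Python 3 (named __builtin__ on Python 2)


class A(object):
    pass


class B(A):
    pass


# super is a name in the builtins module, and it is itself a class.
super_is_builtin = builtins.super is super
super_is_type = isinstance(super, type)

# Neither the class nor an instance carries a "super" attribute.
class_has_super = hasattr(B, "super") or hasattr(B, "__super__")
instance_has_super = hasattr(B(), "super")

print(super_is_builtin, super_is_type, class_has_super, instance_has_super)
```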

Premillennial answered 3/3, 2009 at 16:45 Comment(6)
Super isn't self.super because super is a property of the class, not the instance. (I don't really understand the rest of the question.)Hygrostat
Irrelevant, maybe; but I have been advised by several people not to use Python 2.5 anymore, as Python 3 introduces so many new features / fixes so many bugs.Thilda
Edit to 'how does the object returned by super in D's init definition communicate the existence of C to the object returned by super in B's init definition'? I believe that is what you're asking - it might make more sense :)Leslie
@ adam: There are a lot of libraries that haven't been ported to 3.0 yet. (PyQt, to name an important one.)Spelter
EDIT: Reading Jacob Gabrielson's link to Super Harmful has exposed my Java background - super() definitely behaves differently in Python. I'll leave the rest of my answer for reference below: Posting as an answer, because it's too big for a comment: Now that I look at it again, and keep in mind I don't have an interpreter handy, wouldn't you expect to see: D init B init C init A init anyway, because C will be calling super() regardless? At any rate, I believe the behaviour can probably be explained by the fact that everyone should be aware that D iLeslie
@adam: You might mean Python 2.6 but not Python 3. #172806Hoban
I
16

I have provided a bunch of links below, that answer your question in more detail and more precisely than I can ever hope to. I will however give an answer to your question in my own words as well, to save you some time. I'll put it in points -

  1. super is a builtin function, not an attribute.
  2. Every type (class) in Python has an __mro__ attribute that stores the method resolution order of that particular class.
  3. Each call to super is of the form super(type[, object-or-type]). Let us assume that the second argument is an object for the moment.
  4. At the starting point of super calls, the object is of the type of the Derived class (say DC).
  5. super looks for methods that match (in your case __init__) in the classes in the MRO, after the class specified as the first argument (in this case classes after DC).
  6. When the matching method is found (say in class BC1), it is called.
    (This method should use super, so I am assuming it does - See Python's super is nifty but can't be used - link below) That method then causes a search in the object's class' MRO for the next method, to the right of BC1.
  7. Rinse and repeat until all methods are found and called.
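The steps above can be sketched in plain Python. This is not how super is actually implemented (the real thing is written in C and handles more cases); next_method is a hypothetical helper that only mimics the lookup for the simple instance-method case.

```python
def next_method(cls, obj, name):
    """Hypothetical stand-in for getattr(super(cls, obj), name)."""
    mro = type(obj).__mro__            # the MRO of the object's real class
    start = mro.index(cls) + 1         # begin right after cls
    for klass in mro[start:]:
        if name in klass.__dict__:
            # Bind the function to obj through the descriptor protocol.
            return klass.__dict__[name].__get__(obj, type(obj))
    raise AttributeError(name)


class A(object):
    def hello(self):
        return ["A"]


class B(A):
    def hello(self):
        return ["B"] + next_method(B, self, "hello")()


class C(A):
    def hello(self):
        return ["C"] + next_method(C, self, "hello")()


class D(B, C):
    def hello(self):
        return ["D"] + next_method(D, self, "hello")()


d = D()
order = d.hello()
print(order)  # ['D', 'B', 'C', 'A']

# The helper and the real super pick the same function after B.
same_as_super = super(B, d).hello.__func__ is next_method(B, d, "hello").__func__
print(same_as_super)  # True
```

Note that the helper never looks at cls's own base classes; it only uses the position of cls inside the MRO of type(obj).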

Explanation for your example

 MRO: D,B,C,A,object  
  1. super(D, self).__init__() is called. isinstance(self, D) => True
  2. Search for next method in the MRO in classes to the right of D.

    B.__init__ found and called

  3. B.__init__ calls super(B, self).__init__().

    type(self) is B => False (self is still the D instance, although isinstance(self, B) is True)
    type(self) is D => True

  4. Thus, the MRO is the same, but the search continues to the right of B, i.e. C, A, object are searched one by one. The next __init__ found is called.

  5. And so on and so forth.
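To see that the class after B depends on the object and not on B itself, here is a small sketch using the classes from the question. The trail attribute is made up for illustration, and the code is written so that it also runs on Python 3.

```python
class A(object):
    def __init__(self):
        self.__dict__.setdefault("trail", []).append("A")


class B(A):
    def __init__(self):
        self.__dict__.setdefault("trail", []).append("B")
        super(B, self).__init__()


class C(A):
    def __init__(self):
        self.__dict__.setdefault("trail", []).append("C")
        super(C, self).__init__()


class D(B, C):
    def __init__(self):
        self.__dict__.setdefault("trail", []).append("D")
        super(D, self).__init__()


plain_b = B()   # MRO of type(self) is B, A, object
d = D()         # MRO of type(self) is D, B, C, A, object

print(plain_b.trail)  # ['B', 'A']
print(d.trail)        # ['D', 'B', 'C', 'A']
```

The very same line in B.__init__ reaches A for a plain B instance and C for a D instance, because the search always runs over the MRO of type(self).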

An explanation of super
http://www.python.org/download/releases/2.2.3/descrintro/#cooperation
Things to watch for when using super
http://fuhm.net/super-harmful/
Python's MRO algorithm:
http://www.python.org/download/releases/2.3/mro/
super's docs:
http://docs.python.org/library/functions.html
The bottom of this page has a nice section on super:
http://docstore.mik.ua/orelly/other/python/0596001886_pythonian-chp-5-sect-2.html

I hope this helps clear it up.

Inharmonious answered 3/3, 2009 at 17:58 Comment(1)
in your explanation, I don't understand the 3rd point "isinstance(self, B) => False", then why the search continue?Sula
C
36

Change your code to this and I think it'll explain things (presumably super is looking at where, say, B is in the __mro__?):

class A(object):
    def __init__(self):
        print "A init"
        print self.__class__.__mro__

class B(A):
    def __init__(self):
        print "B init"
        print self.__class__.__mro__
        super(B, self).__init__()

class C(A):
    def __init__(self):
        print "C init"
        print self.__class__.__mro__
        super(C, self).__init__()

class D(B, C):
    def __init__(self):
        print "D init"
        print self.__class__.__mro__
        super(D, self).__init__()

x = D()

If you run it you'll see:

D init
(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <type 'object'>)
B init
(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <type 'object'>)
C init
(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <type 'object'>)
A init
(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <type 'object'>)

Also it's worth checking out Python's Super is nifty, but you can't use it.

Cadel answered 3/3, 2009 at 17:17 Comment(3)
just as i guessed: self is class D all the time.Spratt
Interesting, I would have guessed that if mro showed up when doing dir(class), but it doesn't. But if you do dir(class.__class__) then it is visible! Any idea why the discrepancy? class.__mro__ and class.__class__.__mro__ both workPremillennial
The reason is that mro is defined by the metaclass, not the class, so it doesn't show up in dir (see mail.python.org/pipermail/python-dev/2008-March/077604.html).Cadel
S
6

just guessing:

self in all four methods refers to the same object, that is, an instance of class D. So, in B.__init__(), the call to super(B, self) knows the whole diamond ancestry of self and has to fetch the method from 'after' B. In this case, that's the C class.

Spratt answered 3/3, 2009 at 17:14 Comment(2)
self should refer to the instance of the containing class, and not another class down the line, no?Leslie
no, if you call a self.method() it would be the most specific implementation, no matter how up you are. that's the essence of polymorphismSpratt
C
3

super() knows the full class hierarchy. This is what happens inside B's init:

>>> super(B, self)
<super: <class 'B'>, <D object>>

This resolves the central question,

how does the object returned by super in D's init definition communicate the existence of C to the object returned by super in B's init definition?

Namely, in B's init definition, self is an instance of D, and thus communicates the existence of C. For example C can be found in type(self).__mro__.
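The super object itself carries everything it needs, which you can inspect. A minimal sketch, with empty stand-in classes in place of the ones from the question:

```python
class A(object):
    pass


class B(A):
    pass


class C(A):
    pass


class D(B, C):
    pass


d = D()
proxy = super(B, d)  # what B.__init__ would create when self is a D

this_class_is_b = proxy.__thisclass__ is B     # first argument to super
bound_to_d = proxy.__self__ is d               # second argument to super
self_class_is_d = proxy.__self_class__ is D    # type of the second argument

# The classes that will be searched: everything after B in D's MRO.
mro = proxy.__self_class__.__mro__
after_b = [k.__name__ for k in mro[mro.index(proxy.__thisclass__) + 1:]]

print(this_class_is_b, bound_to_d, self_class_is_d)
print(after_b)  # ['C', 'A', 'object']
```

So nothing is passed from one super call to the next; each call recomputes its starting point from its own two arguments.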

Centerpiece answered 18/12, 2012 at 20:57 Comment(0)
G
3

Jacob's answer shows how to understand the problem, while batbrat's shows the details and hrr's goes straight to the point.

One thing they do not cover (at least not explicitly) from your question is this point:

However, if you comment out the call to super in B's init function, neither A nor C's init function is called.

To understand that, change Jacob's code to print the stack in A's init, as below:

import traceback

class A(object):
    def __init__(self):
        print "A init"
        print self.__class__.__mro__
        traceback.print_stack()

class B(A):
    def __init__(self):
        print "B init"
        print self.__class__.__mro__
        super(B, self).__init__()

class C(A):
    def __init__(self):
        print "C init"
        print self.__class__.__mro__
        super(C, self).__init__()

class D(B, C):
    def __init__(self):
        print "D init"
        print self.__class__.__mro__
        super(D, self).__init__()

x = D()

It is a bit surprising to see that B's line super(B, self).__init__() is actually calling C.__init__(), as C is not a base class of B.

D init
(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <type 'object'>)
B init
(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <type 'object'>)
C init
(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <type 'object'>)
A init
(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <type 'object'>)
  File "/tmp/jacobs.py", line 31, in <module>
    x = D()
  File "/tmp/jacobs.py", line 29, in __init__
    super(D, self).__init__()
  File "/tmp/jacobs.py", line 17, in __init__
    super(B, self).__init__()
  File "/tmp/jacobs.py", line 23, in __init__
    super(C, self).__init__()
  File "/tmp/jacobs.py", line 11, in __init__
    traceback.print_stack()

This happens because super(B, self) is not 'calling B's base class version of __init__'. Instead, it is 'calling __init__ on the first class to the right of B that is present in the __mro__ of self's class and that has such an attribute'.

So, if you comment out the call to super in B's init function, the call chain will stop at B.__init__, and will never reach C or A.
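A short sketch of that case, with the super call in B left out on purpose. The calls list is only there to record which constructors ran; it is not part of the original code.

```python
calls = []  # hypothetical recorder for the constructors that run


class A(object):
    def __init__(self):
        calls.append("A")


class B(A):
    def __init__(self):
        calls.append("B")
        # super(B, self).__init__()  # deliberately left out


class C(A):
    def __init__(self):
        calls.append("C")
        super(C, self).__init__()


class D(B, C):
    def __init__(self):
        calls.append("D")
        super(D, self).__init__()


D()
print(calls)  # ['D', 'B']
```

C and A are skipped because each class only hands control to the next one in the MRO; once one link does not call super, nothing after it runs.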

To summarize:

  • Regardless of which class's method is running, self is always a reference to the same instance, so its __class__ (and that class's __mro__) remain constant.
  • super() finds the method by looking at the classes to the right of the current one in the __mro__. As the __mro__ remains constant, it is searched as a list, not as a tree or a graph.

On that last point, note that the full name of the MRO's algorithm is C3 superclass linearization. That is, it flattens that structure into a list. When the different super() calls happen, they are effectively iterating over that list.
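For the curious, the flattening can be sketched in a few lines. c3_linearize below is a hypothetical, simplified re-implementation for illustration only (the interpreter computes __mro__ itself, and this sketch ignores metaclass hooks); it is only checked here against the diamond from the question.

```python
def c3_linearize(cls):
    """Simplified C3 merge over cls.__bases__ (illustration only)."""
    bases = list(cls.__bases__)
    if not bases:
        return [cls]
    # Linearizations of each base, plus the list of bases itself.
    sequences = [c3_linearize(base) for base in bases] + [bases]
    result = [cls]
    while True:
        sequences = [seq for seq in sequences if seq]
        if not sequences:
            return result
        for seq in sequences:
            head = seq[0]
            # A head is acceptable if it is not in the tail of any sequence.
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("inconsistent hierarchy")
        result.append(head)
        sequences = [[k for k in seq if k is not head] for seq in sequences]


class A(object):
    pass


class B(A):
    pass


class C(A):
    pass


class D(B, C):
    pass


flat = c3_linearize(D)
print([k.__name__ for k in flat])  # ['D', 'B', 'C', 'A', 'object']
print(flat == list(D.__mro__))     # True
```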

Glutinous answered 9/7, 2017 at 21:48 Comment(0)
