How dangerous is setting self.__class__ to something else?

8

41

Say I have a class, which has a number of subclasses.

I can instantiate the class. I can then set its __class__ attribute to one of the subclasses. I have effectively changed the class type to the type of its subclass, on a live object. I can call methods on it which invoke the subclass's version of those methods.
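To make the scenario concrete, here is a minimal sketch of what is being described. The class names `Animal` and `Dog` are made up for illustration and are not from my real code.

```python
class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} makes a sound"


class Dog(Animal):
    def speak(self):
        return f"{self.name} barks"


pet = Animal("Rex")
before = pet.speak()

# Rebind the class of the live object. Note that Dog.__init__ is NOT called.
pet.__class__ = Dog
after = pet.speak()

print(before)  # Rex makes a sound
print(after)   # Rex barks
print(type(pet) is Dog)  # True
```

The instance keeps its existing attributes (`name` here); only method lookup changes.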

So, how dangerous is doing this? It seems weird, but is it wrong to do such a thing? Despite the ability to change type at run-time, is this a feature of the language that should completely be avoided? Why or why not?

(Depending on responses, I'll post a more-specific question about what I would like to do, and if there are better alternatives).

Geranial answered 8/11, 2012 at 0:31 Comment(8)
I'd say it's wrong mostly because no client code will expect that the type of an object can change randomly. Of course, if you change the type to one that is compatible with the previous one, you wouldn't be introducing bugs. But there are probably better OO approaches to that (delegating to a strategy, etc.). So my gut feeling is: Dangerous? No. Confusing? Yes. Useful enough to warrant the confusion? Unlikely.Idio
We usually think of an instance as having fixed methods and variable data. I'd love to see a use case with fixed data and variable methods. I think it would be fun! Please show us your use case.Lehrer
This is terrible design; look at using the Factory design pattern instead...Circumflex
Please post the specific question. It's hard to imagine a case where this would be the best solution—but just because I can't imagine one doesn't mean it doesn't exist. :)Clemons
@Lehrer Functions can be data, so "variable methods" is more appropriately modelled by attributes holding function objects.Cellobiose
@Ben: Suppose you were modeling cellular automata. Suppose each cell could be in one of say 5 Stages. You could define 5 classes Stage1, Stage2, etc. Suppose each Stage class has multiple methods. If you allow changing __class__ you could instantly give a cell all the methods of a new stage (same names, but different behavior). If you refuse to change __class__, then you might have to include a stage attribute, and use a lot of if statements, or reassign a lot of attributes pointing to different stage's functions.Lehrer
@Lehrer Or just have a single dict of named methods and reassign just that one attribute. Or use a higher-level cellular automata library so you don't care how state methods are built under the covers. Or make the lookup dynamic (e.g., via __getattr__). Or, the most obvious way, have a "current_stage" member that holds a Stage1, and replace that with a Stage2. (If you want to forward methods to self.current_stage, see the "dynamic lookup" bit again.)Clemons
@Lehrer You could, but that seems way more appropriate to have each stage object be (optionally) initialised from an object in a different stage. That gives you more freedom to have different data between each stage. Alternatively you could pack up the functions in an object, and have that object be an attribute of your cell (same way as you do any logical sub-collection of data in an object). Yes, maybe assigning to __class__ would be a reasonable implementation in some cases. My point was just that Python already has a much more straightforward mechanism for handling "variable methods".Cellobiose
C
33

Here's a list of things I can think of that make this dangerous, in rough order from worst to least bad:

  • It's likely to be confusing to someone reading or debugging your code.
  • You won't have gotten the right __init__ method, so you probably won't have all of the instance variables initialized properly (or even at all).
  • The differences between 2.x and 3.x are significant enough that it may be painful to port.
  • There are some edge cases with classmethods, hand-coded descriptors, hooks to the method resolution order, etc., and they're different between classic and new-style classes (and, again, between 2.x and 3.x).
  • If you use __slots__, all of the classes must have identical slots. (And if you have the compatible but different slots, it may appear to work at first but do horrible things…)
  • Special method definitions in new-style classes may not change. (In fact, this will work in practice with all current Python implementations, but it's not documented to work, so…)
  • If you use __new__, things will not work the way you naively expected.
  • If the classes have different metaclasses, things will get even more confusing.
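As an illustration of the `__slots__` point in the list above, here is a small sketch, assuming CPython 3 and hypothetical class names. It shows the "nice error" case for incompatible layouts and the case where identical slots let the assignment through.

```python
class Plain:
    pass


class Slotted:
    __slots__ = ("x",)


obj = Plain()
layout_error = None
try:
    # Plain instances carry a __dict__, Slotted instances do not,
    # so the two object layouts differ and the assignment is rejected.
    obj.__class__ = Slotted
except TypeError as exc:
    layout_error = exc


class PointA:
    __slots__ = ("x", "y")


class PointB:
    __slots__ = ("x", "y")

    def norm1(self):
        return abs(self.x) + abs(self.y)


p = PointA()
p.x, p.y = 3, -4
p.__class__ = PointB  # allowed: both classes declare the same slots
print(p.norm1())  # 7
```

The second half is the risky one: the assignment succeeds silently, so nothing warns you if the two classes attach different meanings to the same slot names.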

Meanwhile, in many cases where you'd think this is necessary, there are better options:

  • Use a factory to create an instance of the appropriate class dynamically, instead of creating a base instance and then munging it into a derived one.
  • Use __new__ or other mechanisms to hook the construction.
  • Redesign things so you have a single class with some data-driven behavior, instead of abusing inheritance.

As a very common specific case of the last one, just put all of the "variable methods" into classes whose instances are kept as a data member of the "parent", rather than into subclasses. Instead of assigning self.__class__ = OtherSubclass, just do self.member = OtherSubclass(self). If you really need methods to magically change, automatic forwarding (e.g., via __getattr__) is a much more common and pythonic idiom than changing classes on the fly.
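A minimal sketch of that member-plus-forwarding idea follows. The names `Creature`, `Walking`, `Flying`, and `mode` are hypothetical, chosen only for this example.

```python
class Walking:
    def __init__(self, owner):
        self.owner = owner  # back-reference to the "parent" object

    def move(self):
        return f"{self.owner.name} walks"

    def toggle(self):
        self.owner.mode = Flying(self.owner)


class Flying:
    def __init__(self, owner):
        self.owner = owner

    def move(self):
        return f"{self.owner.name} flies"

    def toggle(self):
        self.owner.mode = Walking(self.owner)


class Creature:
    def __init__(self, name):
        self.name = name
        self.mode = Walking(self)

    def __getattr__(self, attr):
        # Only called when normal lookup fails. The guard avoids infinite
        # recursion if 'mode' itself has not been set yet.
        if attr == "mode":
            raise AttributeError(attr)
        return getattr(self.mode, attr)


bat = Creature("Bat")
first = bat.move()   # forwarded to Walking.move
bat.toggle()         # swaps the member, not the class
second = bat.move()  # forwarded to Flying.move
print(first, "/", second)  # Bat walks / Bat flies
```

`type(bat)` stays `Creature` throughout, so client code that checks the type is never surprised.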

Clemons answered 8/11, 2012 at 0:46 Comment(5)
Could you elaborate on the 2-3 porting issues?Sweepstakes
@agf: For example, I've got a @staticmethod called sm. I do m = myObj.sm; myObj.__class__ = Derived; print(m == myObj.sm). In 2.x, this is True; in 3.x, it's False.Clemons
Related to the slots issue: if the storage structure of the two classes is incompatible (because of slots or otherwise), assigning to __class__ fails. This affects things like list and tuple, where you cannot use __slots__ in subclasses.Housewifery
@TokenMacGuy: Thanks for the clarification. That's the "incompatible" case I was talking about. But there you at least get a nice error (like TypeError: __class__ assignment: 'B' object layout differs from 'A'). If you have two classes with compatible slots holding different descriptor types (which you probably shouldn't, but if you're the kind of person who reassigns __class__…), it can be much less clear.Clemons
@abarnert, thank you for this detailed answer. The most convincing factors for me are the potential future incompatibilities and how it's not actually documented to be able to do this and therefore is probably not a good road to head down. I had the init issue covered in another way; not using slots; not using metaclasses. This was just the answer I was looking for. Thank you.Geranial
Y
18

Assigning the __class__ attribute is useful if you have a long-running application and you need to replace an old version of some object with a newer version of the same class without loss of data, e.g. after some reload(mymodule) and without reloading unchanged modules. Another example is if you implement persistence, something similar to pickle.load.

All other usage is discouraged, especially if you can write the complete code before starting the application.
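Here is a rough sketch of the reload scenario. To stay self-contained it simulates the reload by defining a hypothetical `Counter` class twice in one file; in a real application the second definition would come from `importlib.reload(mymodule)`.

```python
class Counter:  # "old" version of the class
    def __init__(self):
        self.count = 0

    def bump(self):
        self.count += 1
        return self.count


live = Counter()
live.bump()  # the object accumulates state we do not want to lose


class Counter:  # "new" version, as if the module had just been reloaded
    def __init__(self):
        self.count = 0

    def bump(self):
        self.count += 1
        return self.count

    def reset(self):
        self.count = 0


# The existing object still points at the old class object.
was_stale = not isinstance(live, Counter)

# Upgrade it in place, keeping its data.
live.__class__ = Counter
after_upgrade = live.bump()
print(was_stale, after_upgrade)  # True 2
live.reset()  # method that only exists in the new version
```

This only stays safe while the new version expects the same instance attributes as the old one.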

Yesteryear answered 8/11, 2012 at 1:28 Comment(1)
+1 for giving a reasonable use for assigning __class__. At the very least this should head off someone asking, "But if this is so bad, why does Python let me do it?"Clemons
C
6

On arbitrary classes, this is extremely unlikely to work, and is very fragile even if it does. It's basically the same thing as pulling the underlying function objects out of the methods of one class, and calling them on objects which are not instances of the original class. Whether or not that will work depends on internal implementation details, and is a form of very tight coupling.
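A small sketch of that equivalence, using two unrelated hypothetical classes (`Circle` and `Square`), shows that both routes fail in the same way when the instance lacks what the method expects:

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Square:
    def __init__(self, side):
        self.side = side


sq = Square(2)

# Route 1: call the function pulled out of Circle on a Square instance.
# In Python 3, Circle.area is a plain function, so the call itself is allowed.
direct_error = None
try:
    Circle.area(sq)
except AttributeError as exc:
    direct_error = exc

# Route 2: reassign __class__ and call the method normally.
sq.__class__ = Circle
reassigned_error = None
try:
    sq.area()
except AttributeError as exc:
    reassigned_error = exc

print(type(direct_error).__name__, type(reassigned_error).__name__)
# AttributeError AttributeError
```

In both cases the method body reads `self.radius`, which a `Square`-initialized object never received.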

That said, changing the __class__ of objects amongst a set of classes that were specifically designed to be used this way could be perfectly fine. I've been aware that you can do this for a long time, but I've never yet found a use for this technique where a better solution didn't spring to mind at the same time. So if you think you have a use case, go for it. Just be clear in your comments/documentation about what is going on. In particular, it means that the implementation of each class involved has to respect the invariants, assumptions, etc. of all the others, rather than being able to consider each class in isolation, so you'd want to make sure that anyone who works on any of the code involved is aware of this!

Cellobiose answered 8/11, 2012 at 1:14 Comment(4)
No, it's actually pretty likely to work. Most classes do not have __slots__, custom metaclasses, @staticmethods, or the other things that can/will break your code. It's more likely that it will work, but 6 months later it'll break because of some seemingly unrelated change, or just confuse whoever's maintaining your code into breaking things…Clemons
@Clemons I don't know about you, but when I write code instances of MyFancyClass are unlikely to have the attributes expected by methods of MyCompletelyUnrelatedClass. __slots__, metaclasses, etc don't even come into it (although as far as I'm aware metaclasses and staticmethods respect changes to __class__ just fine). Instances of SomeClasseInAHeirarchy and SomeOtherClassInAHeirarchy are a little more likely to be able to cope with switching __class__ between them, but still not overly likely.Cellobiose
Yes, but the OP isn't talking about MyCompletelyUnrelatedClass, he's talking about MyTightlyCoupledSubclass.Clemons
@Clemons Yes, but (a) when I said "this is extremely unlikely to work" it was immediately after "on arbitrary classes" and (b) it's still not at all unlikely for two tightly coupled members of the same class hierarchy to have different attributes. Changing __class__ is not "pretty likely to work" unless you have explicitly designed for it to work.Cellobiose
N
3

Well, I'm not discounting the problems cautioned about at the start, but this can be useful in certain cases.

First of all, the reason I looked this post up is that I did just this and __slots__ doesn't like it. (Yes, my code is a valid use case for slots; it is a pure memory optimization.) I was trying to get around the slots issue.

I first saw this in Alex Martelli's Python Cookbook (1st ed). In the 3rd ed, it's recipe 8.19 "Implementing Stateful Objects or State Machines". A fairly knowledgeable source, Python-wise.

Suppose you have an ActiveEnemy object that has different behavior from an InactiveEnemy and you need to switch back and forth quickly between them. Maybe even a DeadEnemy.

If InactiveEnemy was a subclass or a sibling, you could switch the __class__ attribute. More precisely, the exact ancestry matters less than the methods and attributes being consistent for the code calling them. Think Java interface; or, as several people have mentioned, your classes need to be designed with this use in mind.

Now, you still have to manage state transition rules and all sorts of other things. And, yes, if your client code is not expecting this behavior and your instances switch behavior, things will hit the fan.

But I've used this quite successfully on Python 2.x and never had any unusual problems with it. Best done with a common parent and small behavioral differences on subclasses with the same method signatures.
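As a rough sketch of what I mean (this is my own illustrative code, not the Cookbook's recipe; the method names and the `hp` attribute are made up), a common parent with small behavioral subclasses looks like this:

```python
class Enemy:
    def __init__(self, name, hp=10):
        self.name = name
        self.hp = hp

    def hit(self, damage):
        self.hp -= damage
        if self.hp <= 0:
            self.__class__ = DeadEnemy  # state transition


class InactiveEnemy(Enemy):
    def act(self):
        return f"{self.name} sleeps"

    def wake(self):
        self.__class__ = ActiveEnemy


class ActiveEnemy(Enemy):
    def act(self):
        return f"{self.name} attacks"

    def sleep(self):
        self.__class__ = InactiveEnemy


class DeadEnemy(Enemy):
    def act(self):
        return f"{self.name} is dead"

    def hit(self, damage):
        pass  # further hits have no effect


orc = InactiveEnemy("Orc")
log = [orc.act()]
orc.wake()
log.append(orc.act())
orc.hit(10)
log.append(orc.act())
print(log)  # ['Orc sleeps', 'Orc attacks', 'Orc is dead']
```

All state classes share the same `__init__` and the same instance attributes, which is what keeps the switch safe here.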

No problems, until my __slots__ issue that's blocking it just now. But slots are a pain in the neck in general.

I would not do this to patch live code. I would also prefer using a factory method to create instances.

But to manage very specific conditions known in advance? Like a state machine that the clients are expected to understand thoroughly? Then it is pretty darn close to magic, with all the risk that comes with it. It's quite elegant.

Python 3 concerns? Test it to see if it works but the Cookbook uses Python 3 print(x) syntax in its example, FWIW.

Neologism answered 28/6, 2014 at 4:42 Comment(0)
M
2

The other answers have done a good job of discussing the question of why just changing __class__ is likely not an optimal decision.

Below is one example of a way to avoid changing __class__ after instance creation, using __new__. I'm not recommending it, just showing how it could be done, for the sake of completeness. However, it is probably best to do this using a boring old factory rather than shoe-horning inheritance into a job for which it was not intended.

class ChildDispatcher:
    _subclasses = dict()
    def __new__(cls, *args, dispatch_arg, **kwargs):
        # dispatch to a registered child class
        subcls = cls.getsubcls(dispatch_arg)
        return super(ChildDispatcher, subcls).__new__(subcls)
    def __init_subclass__(subcls, **kwargs):
        super(ChildDispatcher, subcls).__init_subclass__(**kwargs)
        # add __new__ constructor to child class based on default first dispatch argument
        def __new__(cls, *args, dispatch_arg = subcls.__qualname__, **kwargs):
            return super(ChildDispatcher,cls).__new__(cls, *args, **kwargs)
        subcls.__new__ = __new__
        ChildDispatcher.register_subclass(subcls)

    @classmethod
    def getsubcls(cls, key):
        name = cls.__qualname__
        if cls is not ChildDispatcher:
            raise AttributeError(f"type object {name!r} has no attribute 'getsubcls'")
        try:
            return ChildDispatcher._subclasses[key]
        except KeyError:
            raise KeyError(f"No child class key {key!r} in the "
                           f"{cls.__qualname__} subclasses registry")

    @classmethod
    def register_subclass(cls, subcls):
        name = subcls.__qualname__
        if cls is not ChildDispatcher:
            raise AttributeError(f"type object {name!r} has no attribute "
                                 f"'register_subclass'")
        if name not in ChildDispatcher._subclasses:
            ChildDispatcher._subclasses[name] = subcls
        else:
            raise KeyError(f"{name} subclass already exists")

class Child(ChildDispatcher): pass

c1 = ChildDispatcher(dispatch_arg = "Child")
assert isinstance(c1, Child)
c2 = Child()
assert isinstance(c2, Child)
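
For comparison, the "boring old factory" could be as small as the following sketch. The `Handler` classes, the registry dict, and the `make_handler` function are hypothetical names used only for illustration.

```python
class Handler:
    def __init__(self, source):
        self.source = source


class JsonHandler(Handler):
    kind = "json"


class CsvHandler(Handler):
    kind = "csv"


# Explicit registry mapping a key to the class to instantiate.
_HANDLERS = {cls.kind: cls for cls in (JsonHandler, CsvHandler)}


def make_handler(kind, source):
    """Create an instance of the right subclass directly."""
    try:
        cls = _HANDLERS[kind]
    except KeyError:
        raise ValueError(f"unknown handler kind: {kind!r}") from None
    return cls(source)


h = make_handler("csv", "data.csv")
print(type(h).__name__)  # CsvHandler
```

The object has the right class from the moment it exists, so the subclass's `__init__` runs normally and nothing needs to be changed afterwards.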
Mimir answered 15/5, 2018 at 21:37 Comment(0)
M
0

How "dangerous" it is depends primarily on what the subclass would have done when initializing the object. It's entirely possible that it would not be properly initialized, having only run the base class's __init__(), and something would fail later because of, say, an uninitialized instance attribute.
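A short sketch of that failure mode, with hypothetical `Account` classes:

```python
class Account:
    def __init__(self, owner):
        self.owner = owner


class AuditedAccount(Account):
    def __init__(self, owner):
        super().__init__(owner)
        self.log = []  # only set up by the subclass initializer

    def deposit(self, amount):
        self.log.append(amount)
        return len(self.log)


acct = Account("ann")
acct.__class__ = AuditedAccount  # AuditedAccount.__init__ never runs

failure = None
try:
    acct.deposit(5)
except AttributeError as exc:
    failure = exc  # the instance has no 'log' attribute

# Instantiating the desired class in the first place avoids the problem.
good = AuditedAccount("bob")
entries = good.deposit(5)
print(type(failure).__name__, entries)  # AttributeError 1
```

The error only appears when `deposit` is eventually called, which may be far from the line that reassigned `__class__`.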

Even without that, it seems like bad practice for most use cases. Easier to just instantiate the desired class in the first place.

Mangonel answered 8/11, 2012 at 0:46 Comment(0)
C
-1

Here's an example of one way you could do the same thing without changing __class__. Quoting @unutbu in the comments to the question:

Suppose you were modeling cellular automata. Suppose each cell could be in one of say 5 Stages. You could define 5 classes Stage1, Stage2, etc. Suppose each Stage class has multiple methods.

class Stage1(object):
  …

class Stage2(object):
  …

…

class Cell(object):
  def __init__(self):
    self.current_stage = Stage1()
  def goToStage2(self):
    self.current_stage = Stage2()
  def __getattr__(self, attr):
    return getattr(self.current_stage, attr)

If you allow changing __class__ you could instantly give a cell all the methods of a new stage (same names, but different behavior).

Same for changing current_stage, but this is a perfectly normal and pythonic thing to do, that won't confuse anyone.

Plus, it allows you to not change certain special methods you don't want changed, just by overriding them in Cell.

Plus, it works for data members, class methods, static methods, etc., in ways every intermediate Python programmer already understands.

If you refuse to change __class__, then you might have to include a stage attribute, and use a lot of if statements, or reassign a lot of attributes pointing to different stage's functions

Yes, I've used a stage attribute, but that's not a downside—it's the obvious visible way to keep track of what the current stage is, better for debugging and for readability.

And there's not a single if statement or any attribute reassignment except for the stage attribute.

And this is just one of multiple different ways of doing this without changing __class__.

Clemons answered 8/11, 2012 at 1:15 Comment(0)
L
-1

In the comments I proposed modeling cellular automata as a possible use case for dynamic __class__s. Let's try to flesh out the idea a bit:


Using dynamic __class__:

class Stage(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Stage1(Stage):
    def step(self):
        if ...:
            self.__class__ = Stage2

class Stage2(Stage):
    def step(self):
        if ...:
            self.__class__ = Stage3

cells = [Stage1(x,y) for x in range(rows) for y in range(cols)]

def step(cells):
    for cell in cells:
        cell.step()
    yield cells

For lack of a better term, I'm going to call this

The traditional way: (mainly abarnert's code)

class Stage1(object):
    def step(self, cell):
        ...
        if ...:
            cell.goToStage2()

class Stage2(object):
    def step(self, cell):
        ...
        if ...:        
            cell.goToStage3()

class Cell(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.current_stage = Stage1()
    def goToStage2(self):
        self.current_stage = Stage2()
    def __getattr__(self, attr):
        return getattr(self.current_stage, attr)

cells = [Cell(x,y) for x in range(rows) for y in range(cols)]

def step(cells):
    for cell in cells:
        cell.step(cell)
    yield cells

Comparison:

  • The traditional way creates a list of Cell instances each with a current stage attribute.

    The dynamic __class__ way creates a list of instances which are subclasses of Stage. There is no need for a current stage attribute since __class__ already serves this purpose.

  • The traditional way uses goToStage2, goToStage3, ... methods to switch stages.

    The dynamic __class__ way requires no such methods. You just reassign __class__.

  • The traditional way uses the special method __getattr__ to delegate some method calls to the appropriate stage instance held in the self.current_stage attribute.

    The dynamic __class__ way does not require any such delegation. The instances in cells are already the objects you want.

  • The traditional way needs to pass the cell as an argument to Stage.step. This is so cell.goToStageN can be called.

    The dynamic __class__ way does not need to pass anything. The object we are dealing with has everything we need.


Conclusion:

Both ways can be made to work. To the extent that I can envision how these two implementations would pan out, it seems to me the dynamic __class__ implementation will be

  • simpler (no Cell class),

  • more elegant (no ugly goToStage2 methods, no brain-teasers like why you need to write cell.step(cell) instead of cell.step()),

  • and easier to understand (no __getattr__, no additional level of indirection)

Lehrer answered 8/11, 2012 at 2:40 Comment(2)
Your comparisons are silly. For example, "The traditional way uses goToStage2, goToStage3, ... methods to switch stages. The dynamic __class__ way requires no such methods. You just reassign __class__." The traditional way requires no such methods either; they're trivial one-liners, and I just wrapped them in functions to give them names and make the example more readable. Also, you don't need to pass cell as an argument to every method; you store it as a member of the stage. The "no Cell class" is just because you renamed the Cell class to Stage. And so on.Clemons
If you want, I can edit the "traditional" part of your answer to show how it should actually be written, and then you can try the comparisons again, instead of doing comparisons against a badly-constructed straw man.Clemons
