How do I correctly clean up a Python object?
import os

class Package:
    def __init__(self):
        self.files = []

    # ...

    def __del__(self):
        for file in self.files:
            os.unlink(file)

__del__(self) above fails with an AttributeError exception. I understand Python doesn't guarantee the existence of "global variables" (member data in this context?) when __del__() is invoked. If that is the case and this is the reason for the exception, how do I make sure the object destructs properly?

Baliol answered 14/5, 2009 at 19:4 Comment(9)
Reading what you linked, global variables going away doesn't seem to apply here unless you're talking about when your program is exiting, during which, according to what you linked, it might be POSSIBLE that the os module itself is already gone. Otherwise, I don't think it applies to member variables in a __del__() method.Beamends
The exception is thrown long before my program exits. The AttributeError exception I get is Python saying it doesn't recognize self.files as being an attribute of Package. I may be getting this wrong, but if by "globals" they don't mean variables global to methods (but possibly local to class) then I don't know what causes this exception. Google hints Python reserves the right to clean up member data before __del__(self) is called.Baliol
The code as posted seems to work for me (with Python 2.5). Can you post the actual code that is failing - or a simplified (the simpler the better) version that still causes the error?Splenectomy
@wilhelmtell can you give a more concrete example? In all my tests, the __del__ destructor works perfectly.Seiber
The class is not very different from what I posted, but it's used in a much larger chunk of code that is too large for these margins to contain... I'll try and find the smallest code that triggers this.Baliol
wilhelmtell check my latest updateSeiber
If anyone wants to know: This article elaborates why __del__ should not be used as the counterpart of __init__. (I.e., it is not a "destructor" in the sense that __init__ is a constructor.)Sprat
BOTTOM LINE: Python does not support RAII like C++ and other languages (aka LIFO object destruction). REFERENCES: Python docs, PEP 343, wikibooks for Python: "Python does not support RAII".Wedlock
@Sprat The only argument the article makes is to make sure that __init__ succeeds. If that's not the case, I would consider it a bug. I don't see a big argument there, to be honest.Globulin

I'd recommend using Python's with statement for managing resources that need to be cleaned up. The problem with relying on an explicit close() call is that you have to worry about people forgetting to call it at all, or forgetting to place it in a finally block to prevent a resource leak when an exception occurs.

To use the with statement, create a class with the following methods:

def __enter__(self)
def __exit__(self, exc_type, exc_value, traceback)

In your example above, you'd use

import os

class Package:
    def __init__(self):
        self.files = []

    def __enter__(self):
        return self

    # ...

    def __exit__(self, exc_type, exc_value, traceback):
        for file in self.files:
            os.unlink(file)

Then, when someone wanted to use your class, they'd do the following:

with Package() as package_obj:
    # use package_obj

The variable package_obj will be an instance of type Package (it's the value returned by the __enter__ method). Its __exit__ method will automatically be called, regardless of whether or not an exception occurs.
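For example, here is a small check (using a hypothetical temp-file path, purely for illustration) that the cleanup in __exit__ still runs when the body of the with block raises:

import os

path = "/tmp/package_demo.txt"   # hypothetical file created just for this demo
open(path, "w").close()

try:
    with Package() as package_obj:
        package_obj.files.append(path)
        raise RuntimeError("boom")   # simulate a failure inside the block
except RuntimeError:
    pass

print(os.path.exists(path))          # False: __exit__ unlinked the file anyway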

You could even take this approach a step further. In the example above, someone could still instantiate Package using its constructor without using the with clause. You don't want that to happen. You can fix this by creating a PackageResource class that defines the __enter__ and __exit__ methods. Then, the Package class would be defined strictly inside the __enter__ method and returned. That way, the caller could never instantiate the Package class without using a with statement:

class PackageResource:
    def __enter__(self):
        class Package:
            ...
        self.package_obj = Package()
        return self.package_obj

    def __exit__(self, exc_type, exc_value, traceback):
        self.package_obj.cleanup()

You'd use this as follows:

with PackageResource() as package_obj:
    # use package_obj
Binocular answered 14/5, 2009 at 19:39 Comment(14)
Technically speaking, one could call PackageResource().__enter__() explicitly and thus create a Package that would never be finalized... but they'd really have to be trying to break the code. Probably not something to worry about.Havre
By the way, if you're using Python 2.5, you'll need to do from __future__ import with_statement to be able to use the with statement.Binocular
I found an article which helps to show why __del__() acts the way it does and give credence to using a context manager solution: andy-pearce.com/blog/posts/2013/Apr/python-destructor-drawbacksBreakdown
How to use that nice and clean construct if you want to pass parameters? I would like to be able to do with Resource(param1, param2) as r: # ...Gib
@Gib you could give the Resource an __init__ method that stores *args and **kwargs in self, and then passes them on to the inner class in the __enter__ method (see the sketch after these comments). When using the with statement, __init__ is called before __enter__Glyco
@BrianSchlenker That's almost what I ended up doing. Coming from a compiled-programming background, I only preferred giving explicit parameters to the __init__ method to make my "interface" clearer. Is it more "Pythonic" or are there advantages I don't see in using *args and **kwargs? I understand it will only fail at interpretation time anyway, but perhaps having a clearer __init__ signature for the resource makes it easier to use for programmers with a nice IDE?Gib
@Gib It really depends on what you want to do, I don't think one way is more pythonic than the other. Let's say you were wrapping a function that took many parameters and you didn't want to have to replicate all those parameters in your wrapper. That's when I would use *args and **kwargs, to simplify my programming process. In your case I would probably prefer to be explicit with the parameters as well.Glyco
Does this answer mean that every time we have to deal with a resource in Python (files, db, etc.) we have to create a wrapper class instead of just a destructor?Marsden
How would this pattern look if Package is an (abstract) base class? Is there a way to inherit from the inner class?Sybille
neat :) PackageResource can be simplified using contextlib, see my answerHypophysis
...or even better, if the cleanup is named close, simply use contextlib.closingHypophysis
What's the difference between __exit__ and __del__?Scrim
It's difficult to use with when the lifetime of some object isn't confined to a syntactical frame but is instead tied to some other event occurring.Ornithopter
It is also difficult to use with when you have to use several objects with parameters in your code. The layered structure of with is good when there is 1 layer, but not 7!Salgado
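Following up on the parameter question in the comments above, here is one possible sketch (the parameter names are made up for illustration): the resource class stores its constructor arguments and forwards them to the inner class in __enter__.

import os

class PackageResource:
    def __init__(self, *args, **kwargs):
        # remember the arguments; __init__ runs before __enter__
        self._args = args
        self._kwargs = kwargs

    def __enter__(self):
        class Package:
            def __init__(self, name, keep_backups=False):   # hypothetical parameters
                self.name = name
                self.keep_backups = keep_backups
                self.files = []

            def cleanup(self):
                for file in self.files:
                    os.unlink(file)

        self.package_obj = Package(*self._args, **self._kwargs)
        return self.package_obj

    def __exit__(self, exc_type, exc_value, traceback):
        self.package_obj.cleanup()

with PackageResource("example", keep_backups=True) as package_obj:
    ...  # use package_obj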

The standard way is to use atexit.register:

# package.py
import atexit
import os

class Package:
    def __init__(self):
        self.files = []
        atexit.register(self.cleanup)

    def cleanup(self):
        print("Running cleanup...")
        for file in self.files:
            print("Unlinking file: {}".format(file))
            # os.unlink(file)

But you should keep in mind that this will keep all created instances of Package alive until Python is terminated.

Demo using the code above saved as package.py:

$ python
>>> from package import *
>>> p = Package()
>>> q = Package()
>>> q.files = ['a', 'b', 'c']
>>> quit()
Running cleanup...
Unlinking file: a
Unlinking file: b
Unlinking file: c
Running cleanup...
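If keeping every instance alive until shutdown is a problem, one possible variation (Python 3 only, building on the code above) is to have an explicit cleanup() call unregister the exit hook, so the cleanup runs at most once and the instance can be released earlier:

import atexit
import os

class Package:
    def __init__(self):
        self.files = []
        atexit.register(self.cleanup)

    def cleanup(self):
        for file in self.files:
            os.unlink(file)
        self.files = []
        # drop the exit hook so cleanup does not run a second time at shutdown
        atexit.unregister(self.cleanup)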
Covarrubias answered 13/1, 2017 at 3:48 Comment(3)
The nice thing about the atexit.register approach is you don't have to worry about what the user of the class does (did they use with? did they explicitly call __enter__?) The downside is of course if you need the cleanup to happen before python exits, it won't work. In my case, I don't care if it is when the object goes out of scope or if it isn't until python exits. :)Ceres
Can I use __enter__ and __exit__ and also add atexit.register(self.__exit__)?Anurous
@Anurous I don't see how that would be useful? Can't you perform all cleanup logic inside __exit__, and use a contextmanager? Also, __exit__ takes additional arguments (i.e. __exit__(self, type, value, traceback)), so you would need to account for those. Either way, it sounds like you should post a separate question on SO, because your use-case appears unusual?Covarrubias

A better alternative is to use weakref.finalize. See the examples at Finalizer Objects and Comparing finalizers with __del__() methods.
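A minimal sketch along the lines of those examples, applied to the Package class from the question (the helper name is made up here):

import os
import weakref

class Package:
    def __init__(self):
        self.files = []
        # the callback must not reference self, or the object could never be
        # collected, so the files list is handed to a static method instead
        self._finalizer = weakref.finalize(self, Package._remove_files, self.files)

    @staticmethod
    def _remove_files(files):
        for file in files:
            os.unlink(file)

    def cleanup(self):
        # optional explicit cleanup; the finalizer runs at most once
        self._finalizer()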

Chitarrone answered 20/3, 2017 at 15:37 Comment(2)
Used this today and it works flawlessly, better than other solutions. I have a multiprocessing-based communicator class which opens a serial port, and then I have a stop() method to close the ports and join() the processes. However, if the program exits unexpectedly, stop() is not called - I solved that with a finalizer. But in any case I call _finalizer.detach() in the stop method to prevent calling it twice (manually and later again by the finalizer).Congo
IMO, this is really the best answer. It combines the possibility of cleaning up at garbage collection with the possibility of cleaning up at exit. The caveat is that python 2.7 doesn't have weakref.finalize.Ceres

As an appendix to Clint's answer, you can simplify PackageResource using contextlib.contextmanager:

import contextlib

@contextlib.contextmanager
def packageResource():
    class Package:
        ...
    package = Package()
    try:
        yield package
    finally:
        # ensure cleanup also runs if the with block raises
        package.cleanup()

Alternatively, though probably not as Pythonic, you can override Package.__new__:

class Package(object):
    def __new__(cls, *args, **kwargs):
        @contextlib.contextmanager
        def packageResource():
            # adapt arguments if superclass takes some!
            package = super(Package, cls).__new__(cls)
            package.__init__(*args, **kwargs)
            try:
                yield package
            finally:
                package.cleanup()
        return packageResource()

    def __init__(self, *args, **kwargs):
        ...

and simply use with Package(...) as package.

To get things shorter, name your cleanup function close and use contextlib.closing, in which case you can either use the unmodified Package class via with contextlib.closing(Package(...)) or override its __new__ to the simpler

class Package(object):
    def __new__(cls, *args, **kwargs):
        package = super(Package, cls).__new__(cls)
        package.__init__(*args, **kwargs)
        return contextlib.closing(package)

And this constructor is inherited, so you can simply inherit, e.g.

class SubPackage(Package):
    def close(self):
        pass
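Usage would then look roughly like this (assuming close() performs the real cleanup):

with SubPackage() as package:
    ...  # package is the SubPackage instance; close() is called on exit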
Hypophysis answered 20/5, 2015 at 12:11 Comment(3)
This is awesome. I particularly like the last example. It's unfortunate that we can't avoid the four-line boilerplate of the Package.__new__() method, however. Or maybe we can. We could probably define either a class decorator or metaclass genericizing that boilerplate for us. Food for Pythonic thought.Duodenary
@CecilCurry Thanks, and good point. Any class inheriting from Package should also do this (though I haven't tested that yet), so no metaclass should be required. Though I have found some pretty curious ways to use metaclasses in the past...Hypophysis
@CecilCurry Actually, the constructor is inherited, so you can use Package (or better a class named Closing) as your class parent instead of object. But don't ask me how multiple inheritance messes up with this...Hypophysis

Here is a minimal working skeleton:

class SkeletonFixture:

    def __init__(self):
        pass

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        pass

    def method(self):
        pass


with SkeletonFixture() as fixture:
    fixture.method()

Important: return self


If you're like me, and overlook the return self part (of Clint Miller's correct answer), you will be staring at this nonsense:

Traceback (most recent call last):
  File "tests/simplestpossible.py", line 17, in <module>                                                                                                                                                          
    fixture.method()                                                                                                                                                                                              
AttributeError: 'NoneType' object has no attribute 'method'
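For comparison, the failing case looks roughly like this: an __enter__ without the return self, so the as target is bound to None:

class BrokenFixture:
    def __enter__(self):
        pass   # forgot "return self", so the with target becomes None

    def __exit__(self, exc_type, exc_value, traceback):
        pass

with BrokenFixture() as fixture:
    fixture.method()   # AttributeError: 'NoneType' object has no attribute 'method'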

Hope it helps the next person.

Intelligent answered 16/4, 2018 at 14:10 Comment(1)
On return self , a discussion is here #38282353Tonkin

I don't think that it's possible for instance members to be removed before __del__ is called. My guess would be that the reason for your particular AttributeError is somewhere else (maybe you mistakenly remove self.files elsewhere).

However, as the others pointed out, you should avoid using __del__. The main reason for this is that instances with __del__ will not be garbage collected (they will only be freed when their refcount reaches 0). Therefore, if your instances are involved in circular references, they will live in memory for as long as the application runs. (I may be mistaken about all this though; I'd have to read the gc docs again, but I'm rather sure it works like this.)

Thilde answered 14/5, 2009 at 19:51 Comment(2)
Objects with __del__ can be garbage collected if their reference count from other objects with __del__ is zero and they are unreachable. This means if you have a reference cycle between objects with __del__, none of those will be collected. Any other case, however, should be resolved as expected.Hint
"Starting with Python 3.4, __del__() methods no longer prevent reference cycles from being garbage collected, and module globals are no longer forced to None during interpreter shutdown. So this code should work without any issues on CPython." - docs.python.org/3.6/library/…Trucker

I think the problem could be in __init__ if there is more code than shown?

__del__ will be called even when __init__ has not been executed properly or threw an exception.

Source

Meit answered 29/11, 2012 at 8:22 Comment(1)
Sounds very likely. The best way to avoid this problem when using __del__ is to explicitly declare all members at class-level, ensuring that they always exist, even if __init__ fails. In the given example, files = () would work, though mostly you'd just assign None; in either case, you still need to assign the real value in __init__.Enchanter
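A sketch of what that comment suggests, applied to the class from the question:

import os

class Package:
    files = ()   # class-level default, so __del__ never hits an AttributeError

    def __init__(self):
        self.files = []   # the real value is still assigned here

    def __del__(self):
        for file in self.files:
            os.unlink(file)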

A good idea is to combine both approaches.

Implement a context manager for explicit life-cycle handling, and handle automatic cleanup in case the user forgets it or it is not convenient to do so, such as in an interactive interpreter. Guaranteed cleanup at destruction is best done with weakref.finalize.

This is how many libraries actually do it. And some issue a warning if the context manager is not used, to encourage people to use it.

A good thing about weakref.finalize is that, once set up, its logic is guaranteed to be called exactly once, so it can be used to trigger the cleanup in all cases.

import os
from typing import List
import weakref

class Package:
    def __init__(self):
        self.files = []
        self._finalizer = weakref.finalize(self, self._cleanup_files, self.files)

    @staticmethod
    def _cleanup_files(files: List):
        for file in files:
            os.unlink(file)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._finalizer()

weakref.finalize returns a callable finalizer object which will be called when obj is garbage collected. Unlike an ordinary weak reference, a finalizer will always survive until the reference object is collected.

One downside is that the cleanup logic has to be implemented in a staticmethod, because by the time it runs the object is not alive anymore.

A benefit over atexit.register is that the object is not held in memory until the interpreter is shut down. The cleanup happens when the object gets garbage collected.

And unlike object.__del__, weakref.finalize is guaranteed to be called at interpreter shutdown, as the interpreter keeps special track of them.
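Usage then looks roughly like this (illustrative only):

# preferred: explicit, deterministic cleanup via the context manager
with Package() as package:
    ...  # use package

# fallback: no with statement; the finalizer still runs when the object is
# garbage collected or, at the latest, at interpreter shutdown
package = Package()
...  # use package
del package   # cleanup fires here if no other references remain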

Elastic answered 24/6, 2022 at 3:57 Comment(0)

Just wrap your destructor with a try/except statement and it will not throw an exception if your globals are already disposed of.

Edit

Try this:

from weakref import proxy

class MyList(list): pass

class Package:
    def __init__(self):
        # im_func (and the print statement below) is Python 2 syntax;
        # the list is stored as an attribute of the underlying __del__ function
        self.__del__.im_func.files = MyList([1,2,3,4])
        self.files = proxy(self.__del__.im_func.files)

    def __del__(self):
        print self.__del__.im_func.files

It stuffs the file list into the __del__ function itself, which is guaranteed to exist at the time of the call. The weakref proxy is there to prevent Python, or you yourself, from somehow deleting the self.files variable (if it is deleted, it will not affect the original file list). If it is not the case that self.files is being deleted even though there are more references to it, then you can remove the proxy encapsulation.

Seiber answered 14/5, 2009 at 19:8 Comment(1)
The problem is that if the member data is gone it's too late for me. I need that data. See my code above: I need the filenames to know which files to remove. I simplified my code though, there are other data I need to clean up myself (i.e. the interpreter won't know how to clean).Baliol

It seems that the idiomatic way to do this is to provide a close() method (or similar), and call it explicitly.
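A minimal sketch of that approach, applied to the class from the question:

import os

class Package:
    def __init__(self):
        self.files = []

    def close(self):
        for file in self.files:
            os.unlink(file)

# the caller is responsible for calling close(), typically from a finally block
package = Package()
try:
    ...  # use package
finally:
    package.close()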

Dudek answered 14/5, 2009 at 19:9 Comment(2)
This is the approach I used before, but I ran into other problems with it. With exceptions thrown all over the place by other libraries, I need Python's help in cleaning up the mess in the case of an error. Specifically, I need Python to call the destructor for me, because otherwise the code becomes quickly unmanageable, and I will surely forget an exit-point where a call to .close() should be.Baliol
Some resources/objects should be finalized in a particular order, thus better explicit than implicit.Art

atexit.register is the standard way as has already been mentioned in ostrakach's answer.

However, it must be noted that the order in which objects might get deleted cannot be relied upon, as shown in the example below.

import atexit

class A(object):

    def __init__(self, val):
        self.val = val
        atexit.register(self.hello)

    def hello(self):
        print(self.val)


def hello2():
    a = A(10)

hello2()    
a = A(20)

Here, the order seems legitimate (the reverse of the order in which the objects were created), as the program gives this output:

20
10

However, in a larger program, when Python's garbage collection kicks in, whichever object has fallen out of its lifetime gets destructed first.

Tonkin answered 19/9, 2020 at 15:59 Comment(0)
