Right way to clean up a temporary folder in Python class
G

6

46

I am creating a class in which I want to generate a temporary workspace of folders that will persist for the life of the object and then be removed. I am using tempfile.mkdtemp() in __init__ to create the space, but I have read that I can't rely on __del__ being called.

I want something like this:

import os
import shutil
import tempfile

class MyClass:
  def __init__(self):
    self.tempfolder = tempfile.mkdtemp()

  # ... other stuff ...

  def __del__(self):
    if os.path.exists(self.tempfolder):
      shutil.rmtree(self.tempfolder)

Is there another/better way to handle this clean up? I was reading about with, but it appears to only be helpful within a function.

Gammadion answered 14/11, 2012 at 13:29 Comment(4)
I think the only reliable solutions will always involve doing the clean-up explicitly. I don't think the reliable automatic clean-up solution you envisage in your question is possible.Sunderance
@PedroRomano false: Python's context managers are exactly for this purpose.Joker
@katrielalex: even context managers need to be explicitly entered.Sunderance
@PedroRomano yes -- but you always need to e.g. open a file. The point of a context manager is that it handles cleaning up.Joker
J
64

Caveat: you can never guarantee that the temp folder will be deleted, because the user could always hard kill your process and then it can't run anything else.

That said, do

import shutil
import tempfile

temp_dir = tempfile.mkdtemp()
try:
    ...  # some code
finally:
    shutil.rmtree(temp_dir)

Since this is a very common operation, Python has a special way to encapsulate "do something, execute code, clean up": a context manager. You can write your own as follows:

import contextlib
import shutil
import tempfile

@contextlib.contextmanager
def make_temp_directory():
    temp_dir = tempfile.mkdtemp()
    try:
        yield temp_dir
    finally:
        shutil.rmtree(temp_dir)

and use it as

with make_temp_directory() as temp_dir:
    ...  # some code

(Note that this uses the @contextlib.contextmanager shortcut to make a context manager. If you want to implement one the original way, you need to write a custom class with __enter__ and __exit__ methods; __enter__ would create and return the temp directory, and __exit__ would delete it.)
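
For illustration, the class-based form described above might look like the following minimal sketch. The class name TempDirectory is hypothetical (it is not part of the standard library); on Python 3.2+ tempfile.TemporaryDirectory already covers this use case.

```python
import os
import shutil
import tempfile


class TempDirectory:
    """Hypothetical class-based equivalent of make_temp_directory()."""

    def __enter__(self):
        # Create the directory on entry and hand its path to the with block.
        self.path = tempfile.mkdtemp()
        return self.path

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs both on normal exit and when the with block raises.
        shutil.rmtree(self.path)
        # Returning False lets any exception from the block propagate.
        return False


with TempDirectory() as temp_dir:
    print(os.path.isdir(temp_dir))  # True while inside the block
print(os.path.exists(temp_dir))     # False once the block has exited
```

Because __exit__ is called however the block ends, no explicit try/finally is needed in this form.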

Joker answered 14/11, 2012 at 13:42 Comment(6)
Can the context manager persist my temporary folder for the life of the object? It looks like it would remove the folder once I leave the 'with' statement. I guess I could pass the folder into my object as a parameter, but I was hoping to encapsulate the temporary folder within the class.Gammadion
The definition of the with statement is that it closes the object when you leave it. If you want your temp directory to close somewhere else, you need to close it explicitly there and handle exceptions yourself.Joker
I like how this works now that I have refactored my code to use the contextmanager.Gammadion
The context manager works fine for me, but I had to add a try: finally: around the yield or otherwise it wouldn't clean up if the code using the temp dir raises an exception. Might be documented somewhere or might be different between python versions, but I thought I'd mention it.Sideling
That behavior is documented: "If an unhandled exception occurs in the block, it is reraised inside the generator at the point where the yield occurred. Thus, you can use a try...except...finally statement to trap the error (if any), or ensure that some cleanup takes place."Sentiment
Note that Python 3.2 added tempfile.TemporaryDirectory() which does the same thing that this answer shows. For 3.2 and newer I would use the library module code; for older Python versions I would use this code, but I'd rename the function to TemporaryDirectory() and document that it is similar to the 3.2 code. docs.python.org/3/library/tempfile.htmlFolacin
S
31

A nice way to deal with temporary files and directories is via a context manager. This is how you can use tempfile.TemporaryFile or tempfile.NamedTemporaryFile: once you've exited the with statement (via normal exit, return, exception, or anything else), the file/directory and its contents will be removed from the filesystem.

For Python 3.2+, this is built in as tempfile.TemporaryDirectory:

import tempfile

with tempfile.TemporaryDirectory() as temp_dir:
    ...  # do stuff

For earlier Python versions you can easily create your own context manager to do exactly the same thing. The differences here from @katrielalex's answer are the passing of args to mkdtemp() and the try/finally block to make sure the directory gets cleaned up if an exception is raised.

import contextlib
import shutil
import tempfile

@contextlib.contextmanager
def temporary_directory(*args, **kwargs):
    d = tempfile.mkdtemp(*args, **kwargs)
    try:
        yield d
    finally:
        shutil.rmtree(d)


# use it
with temporary_directory() as temp_dir:
    ...  # do stuff

Note that if your process is hard-killed (e.g. kill -9), the directories won't get cleaned up.

Spit answered 21/2, 2014 at 0:7 Comment(0)
H
13

I experimented with this a bit, and I am quite confident that, if you cannot use a context manager, the best solution as of this posting is:

import tempfile

class MyClass(object):
    def __init__(self):
        self.tempfolder = tempfile.TemporaryDirectory()

    # …

    def __del__(self):
        self.tempfolder.cleanup()

(A conditional in __del__ may be reasonable if you cannot ensure that __init__ has run to completion, since self.tempfolder may then not exist.)

Now, apart from using the newer TemporaryDirectory instead of mkdtemp, this is not much different from what you were doing. Why do I still think this is the best you can do? Well, I tested several scenarios of program exit and similar (all on Linux) and:

  • I could not find a scenario where the temporary folder was left behind once Python could reasonably determine that the respective instance of MyClass was no longer needed. Automatic deletion happens as early as Python’s garbage-collection heuristics allow.

  • You can “help” the garbage collector with del myinstance and gc.collect(). Mind that del only decreases the reference count, so this does not ensure that garbage collection can happen and __del__ is called.

  • If you really want to ensure deletion (of the temporary directory), you can explicitly call myinstance.__del__(). If you can do this, you can probably also make MyClass itself a context manager.

  • The only case where the temporary folder persisted was when I hard-killed Python from the operating system – in which case I do not see how any solution within Python would work.

  • atexit (as suggested, e.g., by this answer) does not improve the situation: Either deletion happens without atexit anyway, or it does not happen even with atexit.
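
As the third bullet suggests, if you are able to trigger cleanup explicitly, MyClass can be made a context manager itself. A minimal sketch, assuming Python 3.2+; the close() method name and the private _tempdir attribute are illustrative choices, and the garbage-collection behavior of TemporaryDirectory stays in place as a fallback when close() is never called.

```python
import os
import tempfile


class MyClass:
    def __init__(self):
        # TemporaryDirectory removes the folder when it is cleaned up
        # explicitly or when the object is garbage-collected.
        self._tempdir = tempfile.TemporaryDirectory()
        self.tempfolder = self._tempdir.name

    def close(self):
        # Explicit, deterministic cleanup.
        self._tempdir.cleanup()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions from the with block


with MyClass() as obj:
    folder = obj.tempfolder
    print(os.path.isdir(folder))  # True
print(os.path.exists(folder))     # False
```

This keeps the folder encapsulated in the class while avoiding a direct call to __del__.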

Hypercatalectic answered 19/3, 2022 at 11:59 Comment(1)
this is the correct answer. OP stipulates he is making a class with a temp workspace that will persist for the life of the object. this is what __del__ was made for.Charitacharitable
I
6

Another alternative using contextlib is to make your object closable, and use the closing context manager.

import os
import shutil
import tempfile

class MyClass:
    def __init__(self):
        self.tempfolder = tempfile.mkdtemp()

    def do_stuff(self):
        pass

    def close(self):
        if os.path.exists(self.tempfolder):
            shutil.rmtree(self.tempfolder)

Then with the context manager:

from contextlib import closing

with closing(MyClass()) as my_object:
    my_object.do_stuff()
Information answered 24/10, 2014 at 7:13 Comment(2)
I know a lot changed on context managers since this answer was posted, but wouldn’t it be easier to just add __enter__ and __exit__ methods to MyClass to make it useable as a context manager itself?Hypercatalectic
@Hypercatalectic as I say, it's an alternative. If you implement __enter__ and __exit__ you have to implement both. As the __init__ serves the same purpose, __enter__ will be empty. Using the close paradigm is simpler.Information
S
4

Other answers have noted that you can use a context manager or require your users to explicitly call some type of cleanup function. These are great to do if you can. However, sometimes there is nowhere to hook up this cleanup because you are inside a large application, nested multiple layers down, and no one above you has cleanup methods or context managers.

In that case, you can use atexit: https://docs.python.org/2/library/atexit.html

import atexit
import shutil
import tempfile

class MyClass:
  def __init__(self):
    self.tempfolder = tempfile.mkdtemp()
    # Remove the folder when the interpreter exits normally.
    atexit.register(shutil.rmtree, self.tempfolder)

  # ... other stuff ...
Sherrard answered 21/2, 2019 at 22:36 Comment(2)
If atexit gets to be called, then __del__ as in the question should be too. At least in comparison to tempfile.TemporaryDirectory, I failed to find a scenario where atexit improves the situation (also see my answer).Hypercatalectic
On top of that, this solution prevents the garbage collector from deleting the temporary directory early, and every way of also including directory deletion in __del__ leads to one of the following problems as soon as there are multiple instances of MyClass: 1) If rmtree is not unregistered from atexit, an error is raised upon exiting if one instance was cleaned up early. 2) If it is unregistered, atexit cannot delete the directory of a second instance upon exiting. 3) If a method is registered, the garbage collector cannot clean up early because atexit still references the instance.Hypercatalectic
G
2

As stated by Bluewind, you have to wrap the yield portion of the context manager in a try/finally statement; otherwise, an exception raised in the with block will prevent the cleanup code after the yield from running.

From the Python 2.7 docs:

At the point where the generator yields, the block nested in the with statement is executed. The generator is then resumed after the block is exited. If an unhandled exception occurs in the block, it is reraised inside the generator at the point where the yield occurred. Thus, you can use a try...except...finally statement to trap the error (if any), or ensure that some cleanup takes place. If an exception is trapped merely in order to log it or to perform some action (rather than to suppress it entirely), the generator must reraise that exception. Otherwise the generator context manager will indicate to the with statement that the exception has been handled, and execution will resume with the statement immediately following the with statement.
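
The following sketch illustrates the documented behavior by comparing one generator-based context manager that uses try/finally with one that does not. All function names are made up for this comparison. When the with block raises, only the first one removes its directory.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def temp_dir_with_finally():
    d = tempfile.mkdtemp()
    try:
        yield d
    finally:
        shutil.rmtree(d)  # runs even if the with block raised


@contextlib.contextmanager
def temp_dir_without_finally():
    d = tempfile.mkdtemp()
    yield d
    shutil.rmtree(d)  # skipped when the exception is re-raised at the yield


def run_and_fail(manager):
    """Raise inside the with block and return the directory path."""
    try:
        with manager() as d:
            raise ValueError("simulated failure")
    except ValueError:
        pass
    return d


cleaned = run_and_fail(temp_dir_with_finally)
leaked = run_and_fail(temp_dir_without_finally)
print(os.path.exists(cleaned))  # False: the finally clause removed it
print(os.path.exists(leaked))   # True: cleanup never ran
shutil.rmtree(leaked)           # tidy up the leaked directory by hand
```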

Also, if you are using Python 3.2+, you should check out this little gem, which has all of the above wrapped up nicely for you:

tempfile.TemporaryDirectory(suffix='', prefix='tmp', dir=None)

This function creates a temporary directory using mkdtemp() (the supplied arguments are passed directly to the underlying function). The resulting object can be used as a context manager (see With Statement Context Managers). On completion of the context (or destruction of the temporary directory object), the newly created temporary directory and all its contents are removed from the filesystem.

The directory name can be retrieved from the name attribute of the returned object.

The directory can be explicitly cleaned up by calling the cleanup() method.

New in version 3.2.
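
A short sketch of the name attribute and the cleanup() method described in the quoted documentation, used without a with statement (Python 3.2+ assumed; the file name scratch.txt is arbitrary).

```python
import os
import tempfile

tmp = tempfile.TemporaryDirectory()  # the object, not the path
path = tmp.name                      # the directory path as a string

# Use the directory like any other while the object is alive.
with open(os.path.join(path, "scratch.txt"), "w") as f:
    f.write("temporary data")

print(os.path.isdir(path))   # True
tmp.cleanup()                # removes the directory and its contents
print(os.path.exists(path))  # False
```

When the same object is used in a with statement, the name bound by "as" is the path string, so keep a reference to the object itself if you need to call cleanup() on it.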

Greathouse answered 29/10, 2013 at 11:30 Comment(1)
Python 3, cleanup() inside with: be aware that cleanup() cannot be called on the variable bound by the with statement. It is a method of the tempfile.TemporaryDirectory object, but when that object is used as a context manager, the with statement binds a string (the directory path), not the object itself. On top of that, tempfile.TemporaryDirectory.cleanup() does not just "clean" the directory's contents, it also removes the directory itself!Antecedents
