Delete directory and all symlinks recursively

I tried to use shutil to delete a directory and all contained files, as follows:

import shutil
from os.path import exists
if exists(path_dir):
    shutil.rmtree(path_dir)

Unfortunately, my solution does not work, throwing the following error:

FileNotFoundError: [Errno 2] No such file or directory: '._image1.jpg'

A quick search showed that I'm not alone in having this problem. In my understanding, the rmtree function is equivalent to the rm -Rf $DIR shell command - but this doesn't seem to be the case.

P.S. To reproduce the setup, create a symbolic link inside the directory, for example with ln -s /path/to/original /path/to/link
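The setup can also be built from Python. This is only a minimal sketch, assuming a POSIX system where creating symlinks needs no special privileges; the helper name and the file names (original.txt, to_delete, link.txt) are made up for illustration. On an ordinary local filesystem this is expected to succeed, which suggests that the symlink alone does not trigger the error.

```python
import os
import shutil
import tempfile


def build_and_remove():
    """Create a directory containing a symlink, remove it with rmtree,
    and report (directory still exists, link target still exists)."""
    base = tempfile.mkdtemp()
    original = os.path.join(base, "original.txt")   # hypothetical link target
    target_dir = os.path.join(base, "to_delete")    # hypothetical directory to delete
    os.mkdir(target_dir)
    with open(original, "w") as fh:
        fh.write("data")
    # Python equivalent of: ln -s /path/to/original /path/to/link
    os.symlink(original, os.path.join(target_dir, "link.txt"))

    shutil.rmtree(target_dir)  # removes the link itself, not the file it points to

    result = (os.path.exists(target_dir), os.path.exists(original))
    shutil.rmtree(base)        # clean up the scratch area
    return result
```

If rmtree behaves as described, the directory is gone afterwards while the file the link pointed to is untouched.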

Loretaloretta answered 6/12, 2021 at 13:53 Comment(2)
Is path_dir a path to a symbolic link? - Maximamaximal
No, path_dir is a path to a directory containing various files and folders. - Burgundy

That is strange. I have no issues with shutil.rmtree(), with or without a symlink under the folder to be deleted, on either Windows 10 or Ubuntu 20.04.2 LTS.

Anyhow, try the following code. I tested it on Windows 10 and Ubuntu.

from pathlib import Path
import shutil


def delete_dir_recursion(p):
    """
    Delete folder, sub-folders and files.
    """
    for f in p.glob('**/*'):
        if f.is_symlink():
            f.unlink(missing_ok=True)  # missing_ok is added in python 3.8
            print(f'symlink {f.name} from path {f} was deleted')
        elif f.is_file():
            f.unlink()
            print(f'file: {f.name} from path {f} was deleted')
        elif f.is_dir():
            try:
                f.rmdir()  # delete empty sub-folder
                print(f'folder: {f.name} from path {f} was deleted')
            except OSError:  # sub-folder is not empty
                delete_dir_recursion(f)  # recurse the current sub-folder
            except Exception as exception:  # capture other exception
                print(f'exception name: {exception.__class__.__name__}')
                print(f'exception msg: {exception}')

    try:
        p.rmdir()  # time to delete an empty folder
        print(f'folder: {p.name} from path {p} was deleted')
    except NotADirectoryError:
        p.unlink()  # delete folder even if it is a symlink, linux
        print(f'symlink folder: {p.name} from path {p} was deleted')
    except Exception as exception:
        print(f'exception name: {exception.__class__.__name__}')
        print(f'exception msg: {exception}')


def delete_dir(folder):
    p = Path(folder)

    if not p.exists():
        print(f'The path {p} does not exist!')
        return

    # Attempt to delete the whole folder at once.
    try:
        shutil.rmtree(p)
    except Exception as exception:
        print(f'exception name: {exception.__class__.__name__}')
        print(f'exception msg: {exception}')
        # continue parsing the folder
    else:  # else if no issues on rmtree()
        if not p.exists():  # verify
            print(f'folder {p} was successfully deleted by shutil.rmtree!')
            return

    print(f'Parse the folder {folder} ...')
    delete_dir_recursion(p)

    if not p.exists():  # verify
        print(f'folder {p} was successfully deleted!')

# start
folder_to_delete = '/home/zz/tmp/sample/b'  # delete folder b
delete_dir(folder_to_delete)

Sample output:

We are going to delete the folder b.

.
├── 1.txt
├── a
├── b
│   ├── 1
│   ├── 1.txt -> ../1.txt
│   ├── 2
│   │   └── 21
│   │       └── 21.txt
│   ├── 3
│   │   └── 31
│   ├── 4
│   │   └── c -> ../../c
│   ├── a -> ../a
│   └── b.txt
├── c

Parse the folder /home/zz/tmp/sample/b ...
symlink a from path /home/zz/tmp/sample/b/a was deleted
symlink c from path /home/zz/tmp/sample/b/4/c was deleted
folder: 4 from path /home/zz/tmp/sample/b/4 was deleted
symlink 1.txt from path /home/zz/tmp/sample/b/1.txt was deleted
file: b.txt from path /home/zz/tmp/sample/b/b.txt was deleted
file: 21.txt from path /home/zz/tmp/sample/b/2/21/21.txt was deleted
folder: 21 from path /home/zz/tmp/sample/b/2/21 was deleted
folder: 2 from path /home/zz/tmp/sample/b/2 was deleted
folder: 1 from path /home/zz/tmp/sample/b/1 was deleted
folder: 31 from path /home/zz/tmp/sample/b/3/31 was deleted
folder: 3 from path /home/zz/tmp/sample/b/3 was deleted
folder: b from path /home/zz/tmp/sample/b was deleted
folder /home/zz/tmp/sample/b was successfully deleted!
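For comparison, the same bottom-up idea can be sketched with os.walk(topdown=False), which yields the deepest directories first, so every folder is already empty by the time it is removed. This is an alternative sketch rather than the code above; the function name remove_tree_bottom_up is made up, and the reasoning assumes POSIX symlink semantics.

```python
import os


def remove_tree_bottom_up(top):
    """Delete `top` and everything below it without following symlinks."""
    # topdown=False visits children before parents; followlinks defaults to
    # False, so symlinked directories are listed but never descended into.
    for root, dirs, files in os.walk(top, topdown=False):
        for name in files:
            # regular files, symlinks to files, and broken symlinks
            os.unlink(os.path.join(root, name))
        for name in dirs:
            path = os.path.join(root, name)
            if os.path.islink(path):
                os.unlink(path)  # symlink to a directory: remove only the link
            else:
                os.rmdir(path)   # real sub-folder, already emptied
    os.rmdir(top)
```

Because parents are handled after their children, nested empty folders need no explicit recursion.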
Stymie answered 9/12, 2021 at 6:6 Comment(5)
I think you are quite close. The function deletes all the files inside all folders, but does not remove the folders after they are empty. - Burgundy
@JürgenK. What could be the issue? I expect the empty folder to be handled by p.rmdir() # time to delete an empty folder. Are there any messages? Are you testing on Linux or Windows? - Stymie
No issue; rmdir is probably not recursive, so if there are empty folders containing other empty folders, rmdir won't work. Testing on Mac. - Burgundy
Thanks for the info, I will recheck it. - Stymie
@JürgenK. Got the fix; now it can recursively delete files/subfolders. - Stymie

UPDATE: The underlying Python bug has been fixed by cpython#14064, which will be part of Python 3.13 (mitigation code for earlier versions is below).

You are probably on Mac OSX and your directory is at least partially on a non-Mac filesystem (i.e. not HFS+). On those, the Mac filesystem drivers automatically create binary companion files prefixed with ._ to record so-called extended attributes (explained in https://apple.stackexchange.com/questions/14980/why-are-dot-underscore-files-created-and-how-can-i-avoid-them, but also illustrated below).

On systems where os.scandir does not support file descriptors (like Mac OSX), rmtree unsafely creates a list of entries first and then unlinks them one by one, which is a known race condition: https://github.com/python/cpython/blob/908fd691f96403a3c30d85c17dd74ed1f26a60fd/Lib/shutil.py#L592-L621. Unfortunately, two separate behaviours trigger this race every time:

  1. the original file is always listed before the extended attributes one, and
  2. when the original file is unlinked (test.txt) the meta file (._test.txt) is removed simultaneously.

Thus, the extended attributes file is already missing when its turn comes, and rmtree throws the FileNotFoundError you are experiencing.
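The sequence can be imitated on any filesystem. The following is only a simulation sketch, not the actual rmtree code: unlink_like_macos is a made-up stand-in for the driver behaviour (removing a file also removes its ._ companion), and the entry order is hard-coded to mimic the listing order described above.

```python
import os
import tempfile


def unlink_like_macos(path):
    """Remove `path` and, imitating the driver, its '._' companion if present."""
    os.unlink(path)
    head, tail = os.path.split(path)
    companion = os.path.join(head, "._" + tail)
    if os.path.exists(companion):
        os.unlink(companion)


def simulate_race():
    """Return the name of the entry whose removal fails, or None."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("test.txt", "._test.txt"):
            open(os.path.join(root, name), "w").close()
        # Snapshot taken up front, original listed before its companion
        # (order hard-coded here as an assumption).
        entries = ["test.txt", "._test.txt"]
        for name in entries:
            try:
                unlink_like_macos(os.path.join(root, name))
            except FileNotFoundError as exc:
                return os.path.basename(exc.filename)
    return None
```

The stale snapshot still names the companion file after it has vanished, so the second unlink is the one that fails.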

I think this bug would be best addressed by cpython#14064, which aims at ignoring FileNotFoundErrors in rmtree generally.

Mitigation

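In the meantime, you could ignore unlinking errors on those meta files with onerror:

import os
import shutil

def ignore_extended_attributes(func, filename, exc_info):
    # Swallow only failed unlinks of "._" companion files.
    # rmtree calls this handler while handling the error, so a bare raise re-raises it.
    is_meta_file = os.path.basename(filename).startswith("._")
    if not (func is os.unlink and is_meta_file):
        raise

shutil.rmtree(path_dir, onerror=ignore_extended_attributes)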

Showcase of Mac's extended attributes

To illustrate, you can create a small ExFAT disk image and mount it at /Volumes/Untitled with the following commands:

hdiutil create -size 5m -fs exfat test.dmg
hdiutil attach test.dmg            # mounts at /Volumes/Untitled
cd /Volumes/Untitled

mkdir test                         # create a directory to remove
cd test
touch test.txt
open test.txt                      # open the test.txt file in the standard editor 

Just opening the file in the standard text editor creates an extended attributes file ._test.txt and records the last access time in it:

/Volumes/Untitled/test $ ls -a
.          ..         ._test.txt test.txt
/Volumes/Untitled/test $ xattr test.txt
com.apple.lastuseddate#PS

The problem is that unlinking the original file automatically also unlinks the companion file.

/Volumes/Untitled/test $ rm test.txt
/Volumes/Untitled/test $ ls -a
.          ..
Armitage answered 14/12, 2021 at 20:58 Comment(1)
Great answer. Thank you for the mitigation code. - Acreage

From How to remove a directory including all its files in python?

# Function that deletes all files in a folder and then the folder itself
# (it does not descend into sub-folders)

import os

def del_folder(dir_name):
    dir_path = os.path.join(os.getcwd(), dir_name)  # portable, unlike a hard-coded "\"
    try:
        os.rmdir(dir_path)  # succeeds only if the folder is already empty
        print("folder is deleted")
        return
    except OSError:
        pass  # folder is not empty (or missing); try deleting its files first
    try:
        # delete the files one by one, then the now-empty folder
        for filename in os.listdir(dir_path):
            os.remove(os.path.join(dir_path, filename))
        os.rmdir(dir_path)
        print("folder is deleted")
    except FileNotFoundError:
        print("folder is not there")

You can also just use the ignore_errors flag with shutil.rmtree().

shutil.rmtree('/folder_name', ignore_errors=True)

That should remove a directory together with its file contents.
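One caveat is that ignore_errors=True also hides real failures, so it can be worth checking afterwards whether the directory is actually gone. A minimal sketch, with a made-up helper name:

```python
import os
import shutil


def remove_tree_quietly(path):
    """Remove `path` while ignoring errors; return True if nothing is left."""
    shutil.rmtree(path, ignore_errors=True)  # never raises on removal errors
    return not os.path.exists(path)
```

A False return value means something could not be removed, even though no exception was raised.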

Arni answered 8/12, 2021 at 15:18 Comment(0)
