Python securely remove file
Asked Answered
P

8

9

How can I securely remove a file using python? The function os.remove(path) only removes the directory entry, but I want to securely remove the file, similar to the apple feature called "Secure Empty Trash" that randomly overwrites the file.

What function securely removes a file using this method?

Pitchy answered 3/7, 2013 at 18:10 Comment(3)
this is not a feature of a programming language. this is a feature of the file system/ operating system / storage device.Check
IIRC, what Secure Erase Trash actually does is to unlink all the files, then do a single-pass random erasure immediately, then kick off a standard 35-pass erasure in the background.Viscoid
From what I know you can only overwrite file on HDD, not on SSD, due to the way SSD (flash mem) is working.Cracker
S
14

You can use srm to securely remove files. You can use Python's os.system() function to call srm.

Sweettempered answered 3/7, 2013 at 18:13 Comment(1)
I'd use subprocess.check_call rather than os.system, for all the usual reasons. There's no need for the performance hit, hijacking potential, etc. in spawning a shell, and it's better to automatically check that the call succeeded than to forget to do it manually and assume you've secure-erased files when you really haven't.Viscoid
D
7

You can very easily write a function in Python to overwrite a file with random data, even repeatedly, then delete it. Something like this:

import os

def secure_delete(path, passes=1):
    with open(path, "ba+") as delfile:
        length = delfile.tell()
    with open(path, "br+") as delfile:
        for i in range(passes):
            delfile.seek(0)
            delfile.write(os.urandom(length))
    os.remove(path)

Shelling out to srm is likely to be faster, however.

Decembrist answered 3/7, 2013 at 18:21 Comment(8)
That is a good idea, but is there an advantage to using random.seed() instead of os.urandom(n)Pitchy
os.urandom will probably be (a lot) faster since you can get more than one byte at a time. You'll want to generate the random data in chunks (maybe 256K to 1MB at a time) to avoid needing to hold all the random data in memory. That will probably be about as fast as srm.Decembrist
This won't be nearly as secure as using srm, and it may not be nearly as fast either. The Gutman algorithm has been standardized for decades for a good reason. And srm on some platforms will take advantage of the built-in "Secure Erase" on some hard drives.Viscoid
srm is, however, a solution only on platforms that have srm. My point is, there is no reason you couldn't implement whatever secure erasure algorithm you want in Python. My example wasn't meant to be canonical or anything, I didn't even test it.Decembrist
Nice, but pylint complains: "ba+" is not a valid mode for open. (bad-open-mode)Proliferate
This code can overwrite the contents of the file and hence the HD where the file is located multiple times. But, does it actually delete the file name too? At the end of the code, the file is simply deleted using os.remove. Can an investigator actually discover that the deleted file used to exist even though the content may not be retrievable?Trueblood
As mentioned in @phealy3330 comment this opens the file in append mode so all writes are appended to the end of the file and data is not overwrite. Is there a better way to implement this using r+b @kindall? It looks like delfile.seek(0) does not have an effect as specified here.Navada
yeah you probably want r+ rather than a+Decembrist
S
6

You can use srm, sure, you can always easily implement it in Python. Refer to wikipedia for the data to overwrite the file content with. Observe that depending on actual storage technology, data patterns may be quite different. Furthermore, if you file is located on a log-structured file system or even on a file system with copy-on-write optimisation, like btrfs, your goal may be unachievable from user space.

After you are done mashing up the disk area that was used to store the file, remove the file handle with os.remove().

If you also want to erase any trace of the file name, you can try to allocate and reallocate a whole bunch of randomly named files in the same directory, though depending on directory inode structure (linear, btree, hash, etc.) it may very tough to guarantee you actually overwrote the old file name.

Startling answered 3/7, 2013 at 18:40 Comment(2)
+1. But note that there are at some platforms/filesystems where you can do a secure erase from user space, but only by using some special API provided by the kernel/libc/fs. Which means using srm will work, but nothing you write in Python (unless you ctypes the special API) will.Viscoid
Meanwhile, it's probably worth looking at the srm for your platform (or, on a platform that doesn't have it, at least at some srm). For example, the source from OS X 10.8 is pretty simple if you know C at all, and understand fts (which is like Python's os.walk); there's almost nothing else tricky there.Viscoid
D
1

So at least in Python 3 using @kindall's solution I only got it to append. Meaning the entire contents of the file were still intact and every pass just added to the overall size of the file. So it ended up being [Original Contents][Random Data of that Size][Random Data of that Size][Random Data of that Size] which is not the desired effect obviously.

This trickery worked for me though. I open the file in append to find the length, then reopen in r+ so that I can seek to the beginning (in append mode it seems like what caused the undesired effect is that it was not actually possible to seek to 0)

So check this out:

def secure_delete(path, passes=3):
with open(path, "ba+", buffering=0) as delfile:
    length = delfile.tell()
delfile.close()
with open(path, "br+", buffering=0) as delfile:
    #print("Length of file:%s" % length)
    for i in range(passes):
        delfile.seek(0,0)
        delfile.write(os.urandom(length))
        #wait = input("Pass %s Complete" % i)
    #wait = input("All %s Passes Complete" % passes)
    delfile.seek(0)
    for x in range(length):
        delfile.write(b'\x00')
    #wait = input("Final Zero Pass Complete")
os.remove(path) #So note here that the TRUE shred actually renames to file to all zeros with the length of the filename considered to thwart metadata filename collection, here I didn't really care to implement

Un-comment the prompts to check the file after each pass, this looked good when I tested it with the caveat that the filename is not shredded like the real shred -zu does

Draff answered 26/2, 2019 at 16:38 Comment(0)
N
0

The answers implementing a manual solution did not work for me. My solution is as follows, it seems to work okay.

import os

def secure_delete(path, passes=1):
    length = os.path.getsize(path)
    with open(path, "br+", buffering=-1) as f:
        for i in range(passes):
            f.seek(0)
            f.write(os.urandom(length))
        f.close()
Navada answered 20/7, 2021 at 7:28 Comment(0)
H
0

If you are looking for performance, shred is faster.
srm does 38 passes, when 3 passes is considered secure (to some standards)
The Python implementation is much slower.
On my tests (time in seconds for 50 files, on a slow WSL machine):
shred: 2.66
pure Python: 22.70
SRM: 34.54

subprocess.check_call(['shred' ,'-n', '2', '-z', '-u', file_path])

Hotbed answered 13/12, 2023 at 13:7 Comment(0)
T
0

This secure delete function uses un-buffered file access and writes random data using fixed size chunks to prevent memory errors with very large files.

import os

def secure_delete(
    filepath: str,
    passes: int = 1,
    chunk_size: int = 1 << 18,  # 256KB.
) -> None:
    with open(filepath, "ba+", buffering=0) as fh:
        path_size = fh.tell()
    with open(filepath, "br+", buffering=0) as fh:
        for _ in range(passes):
            fh.seek(0)
            offset_next = 0
            while (offset := offset_next) < path_size:
                offset_next = min(offset + chunk_size, path_size)
                fh.write(os.urandom(offset_next - offset))
            assert offset == path_size
    os.remove(filepath)

Tommyetommyrot answered 12/5, 2024 at 3:2 Comment(0)
Q
0
import os
import random
import time

def secure_delete(file_path, overwrite_method='random', passes=random.randint(100, 10000)):
    try:
        # Get the current timestamp
        timestamp = int(time.time())
        # Set newname var 
        newname = ""
        
        # Open the original file in binary read mode
        with open(file_path, 'rb') as orig_file:
            # Read the file content
            content = orig_file.read()
        
        # Close the original file
        orig_file.close()
        
        # Perform the overwrite operation
        with open(file_path, 'wb') as file_to_overwrite:
            if overwrite_method == 'zeros':
                for i in range(passes):
                    file_to_overwrite.write((len(content) * "0").encode())
            elif overwrite_method == 'random':
                file_to_overwrite.write(bytes(random.randint(0, 255) for _ in range(len(content))))
            else:
                raise ValueError("Invalid overwrite method. Choose 'zeros' or 'random'.")
        
        
        # rename and delete
        
        for i in range(len(file_path)):
            newname = newname + f"{random.randint(0, 9)}"
        os.rename(file_path, newname)
        # Delete the original file
        #os.remove(newname)
        
        print(f"File {file_path} has been securely deleted.")
    
    except Exception as e:
        print(f"An error occurred during secure deletion: {e}")

# Example usage
if __name__ == "__main__":
    file_to_delete = input("Enter the path to the file you want to delete:\n>>> ")
    overwrite_method = input("Choose the overwrite method (random / zeros):\n>>> ").lower()
    while overwrite_method not in ['random', 'zeros']:
        overwrite_method = input("Invalid input. Please choose 'random' or 'zeros':\n>>> ").lower()
    passes = int(input("Enter the number of overwrite passes:\n>>> "))
    
    secure_delete(file_to_delete, overwrite_method=overwrite_method, passes=passes)
Quinone answered 8/9, 2024 at 12:19 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.