Google Cloud Storage: How to Delete a folder (recursively) in Python
I am trying to delete a folder in GCS and all of its contents (including subdirectories) with the Python client library. I understand that GCS doesn't really have folders (only object-name prefixes), so how can I do this?

I tested this code:

from google.cloud import storage

def delete_blob(bucket_name, blob_name):
    """Deletes a blob from the bucket."""
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(blob_name)

    blob.delete()

delete_blob('mybucket', 'top_folder/sub_folder/test.txt')
delete_blob('mybucket', 'top_folder/sub_folder/')

The first call to delete_blob worked, but the second one did not. How can I delete a folder recursively?

Cathiecathleen answered 13/10, 2018 at 5:5 Comment(0)

To delete everything starting with a certain prefix (for example, a directory name), you can list the matching objects and delete each one:

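from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)
# The trailing slash keeps the prefix from also matching e.g. 'some/directory2/...'
blobs = bucket.list_blobs(prefix='some/directory/')
for blob in blobs:
    blob.delete()  # one DELETE request per object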

Note that for very large buckets with millions or billions of objects, this may not be a very fast process. For that, you'll want to do something more complex, such as deleting in multiple threads or using lifecycle configuration rules to arrange for the objects to be deleted.
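As a rough illustration of the multi-threaded option mentioned above, here is a minimal sketch. The function name delete_prefix_threaded and the max_workers default are made up for this example, and it assumes the client library tolerates being called from several threads at once, which is worth verifying for your version before relying on it.

```python
from concurrent.futures import ThreadPoolExecutor


def delete_prefix_threaded(bucket, prefix, max_workers=8):
    """Delete every object under `prefix`, issuing deletes from a thread pool.

    `bucket` is expected to be a google.cloud.storage Bucket; anything with a
    compatible list_blobs(prefix=...) method works. The name and the
    max_workers default are hypothetical, chosen for this sketch.
    Returns the number of delete calls that completed.
    """
    blobs = bucket.list_blobs(prefix=prefix)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the iterator, so an exception raised by any
        # blob.delete() call is re-raised here instead of being lost.
        results = list(pool.map(lambda blob: blob.delete(), blobs))
    return len(results)
```

With a real bucket this would be called as delete_prefix_threaded(storage.Client().get_bucket('mybucket'), 'top_folder/sub_folder/'). Threads only overlap the network latency of the individual requests; each object is still deleted by its own call.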

Josefina answered 13/10, 2018 at 5:41 Comment(3)
My initial hunch was to list everything and then iterate, deleting the objects that match the given path, but I'm glad to know this option exists in the API (the java-storage API in my case). Thank you :) – Eyecatching
You could also use the bucket.delete_blobs method to delete all listed files with one network round trip, O(1), instead of O(n) separate delete calls. – Delubrum
bucket.delete_blobs deletes one by one. googleapis.dev/python/storage/latest/… @Delubrum – Mckinley
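If the goal is fewer network round trips, one possibility is the client's batch context manager, which defers the calls made inside it and sends them together. The sketch below is an assumption-laden example, not something taken from the thread: delete_prefix_batched is a made-up name, the default of 100 reflects the per-batch call limit of the JSON API as I understand it, and the bucket must have been obtained from the same client that opens the batch.

```python
from itertools import islice


def delete_prefix_batched(client, bucket, prefix, batch_size=100):
    """Delete objects under `prefix`, grouping the deletes into batch requests.

    `client` is expected to be a google.cloud.storage Client and `bucket` a
    Bucket obtained from that same client. The function name and batch_size
    default are hypothetical. Returns the number of deletes submitted.
    """
    blobs = iter(bucket.list_blobs(prefix=prefix))
    submitted = 0
    while True:
        # Pull the next chunk before opening the batch, so the listing
        # request itself is never issued inside the batch context.
        chunk = list(islice(blobs, batch_size))
        if not chunk:
            break
        # Calls made inside the context manager are deferred and sent
        # together when the block exits.
        with client.batch():
            for blob in chunk:
                blob.delete()
        submitted += len(chunk)
    return submitted
```

Each object is still deleted individually on the server side; batching only reduces the number of HTTP requests, so the operation remains non-atomic.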

Now it can be done by:

# Assumes this is a classmethod on a class that holds a storage_client attribute.
def delete_folder(cls, bucket_name, folder_name):
    """Delete every object under the given folder prefix."""
    bucket = cls.storage_client.get_bucket(bucket_name)
    blobs = list(bucket.list_blobs(prefix=folder_name))
    bucket.delete_blobs(blobs)
    print(f"Folder {folder_name} deleted.")
Irreparable answered 10/12, 2021 at 21:56 Comment(0)

deleteFiles (Node.js) might be what you are looking for, or delete_blobs in Python. Assuming they behave the same way, the Node docs do a better job of describing the behavior, namely:

This is not an atomic request. A delete attempt will be made for each file individually. Any one can fail, in which case only a portion of the files you intended to be deleted would have.

Operations are performed in parallel, up to 10 at once.
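Because the deletion is not atomic, it can help to record which objects failed so that only those are retried. The following is a minimal sketch in Python under that assumption. The helper name delete_prefix_report_failures is hypothetical, and catching Exception broadly is a simplification; real code would probably catch the client library's specific exception types.

```python
def delete_prefix_report_failures(bucket, prefix):
    """Attempt to delete every object under `prefix`, one at a time.

    `bucket` is expected to be a google.cloud.storage Bucket. The function
    name is hypothetical. Returns (deleted, failed), where `deleted` is a
    list of object names and `failed` maps object name to the exception.
    """
    deleted, failed = [], {}
    for blob in bucket.list_blobs(prefix=prefix):
        try:
            blob.delete()
        except Exception as exc:  # broad on purpose, for illustration only
            # Keep going: one failed delete should not stop the others.
            failed[blob.name] = exc
        else:
            deleted.append(blob.name)
    return deleted, failed
```

A caller can then inspect failed and decide whether to retry those names or surface the errors.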

Wellordered answered 9/5, 2022 at 18:56 Comment(0)
