Fastest way to delete files in Amazon S3

With boto3, one can delete files in a bucket as below

import boto3  # the snippet assumes an existing bucket handle
bucket = boto3.resource('s3').Bucket('mybucket')  # example bucket name
for obj in bucket.objects.all():
    if 'xyz' in obj.key:
        obj.delete()  # one DeleteObject request per matching key

This sends one REST API call per file. If you have a large number of files, this can take a long time.

Is there a faster way to do this?

Coriolanus answered 29/12, 2017 at 16:15 Comment(0)

The easiest way to delete files is by using Amazon S3 Lifecycle Rules. Simply specify the prefix and an age (e.g. 1 day after creation) and S3 will delete the files for you!

However, this is not necessarily the fastest way to delete them -- it might take 24 hours until the rule is executed.
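
If a lifecycle rule fits your case, here's a minimal sketch of setting one up with boto3, assuming the bucket is called 'mybucket' and the objects to expire share the prefix 'xyz' (lifecycle rules filter by prefix, not by substring):

import boto3

s3 = boto3.client('s3')
s3.put_bucket_lifecycle_configuration(
    Bucket='mybucket',                    # example bucket name
    LifecycleConfiguration={
        'Rules': [{
            'ID': 'expire-xyz',
            'Filter': {'Prefix': 'xyz'},  # lifecycle rules match on key prefix
            'Status': 'Enabled',
            'Expiration': {'Days': 1},    # delete 1 day after creation
        }]
    },
)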

If you really want to delete the objects yourself, use delete_objects() instead of delete_object(). It can accept up to 1000 keys per call, which will be faster than deleting each object individually.
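
As a rough sketch of that approach, assuming the same example bucket name, you could list the matching keys and then delete them in batches of 1000:

import boto3

s3 = boto3.client('s3')
bucket_name = 'mybucket'  # example bucket name

# Collect the keys that match, paging through the bucket listing
keys = []
for page in s3.get_paginator('list_objects_v2').paginate(Bucket=bucket_name):
    for obj in page.get('Contents', []):
        if 'xyz' in obj['Key']:
            keys.append({'Key': obj['Key']})

# delete_objects() accepts at most 1000 keys per request
for i in range(0, len(keys), 1000):
    s3.delete_objects(
        Bucket=bucket_name,
        Delete={'Objects': keys[i:i + 1000], 'Quiet': True},
    )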

Bloc answered 29/12, 2017 at 22:1 Comment(0)

The legacy boto library (boto 2, the predecessor of boto3) also supports MultiDelete. Here's an example of how you would use it:

import boto.s3  # legacy boto 2, not boto3
conn = boto.s3.connect_to_region('us-east-1')  # or whatever region you want
bucket = conn.get_bucket('mybucket')
keys_to_delete = ['mykey1', 'mykey2', 'mykey3', 'mykey4']
result = bucket.delete_keys(keys_to_delete)  # uses S3's multi-object delete API
Xymenes answered 15/6, 2020 at 14:37 Comment(0)

The AWS console now has an option to select an S3 bucket and click the "Empty" button. This deletes objects 1000 at a time (probably using the delete_objects() API call behind the scenes) without the need to script it or call the API yourself. The only caveat is that you can't navigate away from the page until the process completes, or it will halt. It works well if the console is an option and the bucket in question has fewer than 2 million objects; I've noticed it tends to hang after the 2 million deleted objects mark.

Leasehold answered 29/3, 2023 at 20:48 Comment(0)
