How do I get all versions of an S3 key and undelete using boto?

I have had an S3 bucket for a while but have only now turned versioning on. I am experimenting with it to figure out what sort of deletion protection I get with versioning alone, without activating the "MFA delete" option.

I uploaded a test file, deleted it, and then re-uploaded it twice. Now, using the S3 browser tool, I see 4 versions of the file: #1, #2 (deleted), #3 and #4 (current). If I use boto to get the latest version, I can extract its version_id:

import boto

c = boto.connect_s3()
b = c.get_bucket('my-bucket')
# get_key returns the current (latest) version of the key
k = b.get_key('test2/dw.txt')
print(k.version_id)

But how do I get a full list of version_ids for a given key? And if I want to retrieve version #1 of the key (deleted), do I first need to do something with the version #2 id to "undelete" it?

Finally, does this deletion protection (creation of a delete marker) work with legacy files that had been uploaded before versioning was turned on?

Thx

Brose answered 30/1, 2015 at 19:32 Comment(0)

You can get a list of all available versions by using the list_versions method of the bucket object.

import boto
c = boto.connect_s3()
bucket = c.get_bucket('my-bucket')
for version in bucket.list_versions():
    print(version)

This returns an iterable of Key objects, each with a specific version_id associated with it (delete markers show up as DeleteMarker objects). You can retrieve any of the versions by using the normal methods on the Key object. If you want to make an older version the current version, you have to re-upload it or copy it on the server.
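As a rough illustration of the two operations described above, the sketch below reads one specific version and then promotes it to current with a server-side copy. It assumes the boto 2 Bucket methods get_key(..., version_id=...) and copy_key(..., src_version_id=...); the helper names read_version and promote_version are made up for this example, and `bucket` is expected to be the object returned by c.get_bucket(...).

```python
def read_version(bucket, key_name, version_id):
    """Return the contents of one specific version of a key."""
    key = bucket.get_key(key_name, version_id=version_id)
    if key is None:
        raise KeyError("no version %s of %s" % (version_id, key_name))
    # Pass version_id explicitly so the download targets that exact version.
    return key.get_contents_as_string(version_id=version_id)


def promote_version(bucket, key_name, version_id):
    """Copy an old version onto the same key, which makes it the newest version."""
    return bucket.copy_key(key_name, bucket.name, key_name,
                           src_version_id=version_id)
```

The copy writes a new version on top of the existing ones, so the older versions remain available afterwards.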

Once you enable versioning on a bucket, all delete operations after that point in time on any object in the bucket will result in a delete marker being written to the bucket rather than actually deleting the object.
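Because a delete only writes a marker, removing the newest delete marker for a key makes the previous version current again. A minimal sketch of that with the boto 2 API, assuming list_versions yields Key and DeleteMarker entries that carry name, version_id and is_latest, and that delete_key accepts a version_id; the function name undelete is hypothetical.

```python
def undelete(bucket, key_name):
    """Remove the delete marker that currently hides key_name, if there is one.

    Returns the version id of the removed marker, or None when the newest
    entry for the key is not a delete marker.
    """
    for entry in bucket.list_versions(prefix=key_name):
        # prefix is not an exact match, so skip other keys sharing the prefix.
        if entry.name != key_name:
            continue
        # Class-name check keeps this sketch import-free (an assumption made
        # for illustration); with boto installed,
        # isinstance(entry, boto.s3.deletemarker.DeleteMarker) is the more
        # conventional test.
        if type(entry).__name__ == "DeleteMarker" and entry.is_latest:
            bucket.delete_key(key_name, version_id=entry.version_id)
            return entry.version_id
    return None
```

Deleting a delete marker that is not the latest entry would not bring the object back, which is why the sketch checks is_latest.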

Branks answered 30/1, 2015 at 19:49 Comment(2)
So does it mean that the only way to get a list of versions for a key is to get the complete list for the whole bucket using bucket.list_versions() and then loop over all the keys the way you show and test for a match of version.name and the name of the key I am interested in? That seems like a strange design decision on the part of AWS/boto folks, although maybe there is something that justifies it that I don't understand... - Brose
No, the list_versions method takes a parameter called prefix that can be used to limit the results returned. If the key you are interested in is called foobar, then list_versions(prefix="foobar") should limit the results to just the versions for that key. - Branks
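A small sketch of the prefix approach, assuming the boto 2 entries returned by list_versions expose a name attribute. Since a prefix also matches longer key names that start with the same string, the sketch keeps only exact matches; versions_of_key is a hypothetical helper name.

```python
def versions_of_key(bucket, key_name):
    """Return every version entry (delete markers included) for exactly key_name."""
    return [v for v in bucket.list_versions(prefix=key_name)
            if v.name == key_name]
```

With the bucket from the answer above, `for v in versions_of_key(bucket, 'test2/dw.txt'): print(v.version_id)` would print one id per version of that key.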

You can get a list of all versions using the following method:

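import boto3

# Assumes aws_access_key_id and aws_secret_access_key are already defined;
# boto3.Session() with no arguments uses the default credential chain instead.
session = boto3.Session(aws_access_key_id, aws_secret_access_key)
s3 = session.client('s3')

bucket_name = 'bucketname'

# Prefix matches every key that starts with this string.
versions = s3.list_object_versions(Bucket=bucket_name, Prefix='Key')
print(versions.get('Versions'))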

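This prints a list of the versions whose keys start with the given prefix, along with other information such as key, storage class, and size. Delete markers are returned separately under the 'DeleteMarkers' entry of the response.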

Centistere answered 5/5, 2018 at 10:39 Comment(2)
N.b. that with boto3, you must manually handle pagination yourself, which this simple example does not do. - Tchao
@Tchao or use boto3's built-in pagination helpers: boto3.amazonaws.com/v1/documentation/api/latest/reference/… Another gotcha that people seem to gloss over is prefix is just that: a prefix. It's not an exact match. If you have the files file, file2, and file3, then prefix="file" will match all of those keys. - Durian
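Both points from these comments can be combined in one helper. The sketch below uses the boto3 client's get_paginator('list_object_versions') and filters each page down to exact key matches; the function name list_key_versions is made up for illustration, and `client` is assumed to be something like boto3.client('s3').

```python
def list_key_versions(client, bucket, key):
    """Collect all versions and delete markers for exactly `key`.

    Walks every page of results, so it also works when there are more
    entries than fit in a single response.
    """
    paginator = client.get_paginator("list_object_versions")
    versions, markers = [], []
    for page in paginator.paginate(Bucket=bucket, Prefix=key):
        # "Versions" and "DeleteMarkers" are absent from a page when empty.
        versions.extend(v for v in page.get("Versions", []) if v["Key"] == key)
        markers.extend(m for m in page.get("DeleteMarkers", []) if m["Key"] == key)
    return versions, markers
```

Each returned entry is the plain dictionary from the response, so fields such as VersionId and IsLatest can be read from it directly.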

I didn't see an answer that also removes the delete marker, so here is a script that I use to undelete one specific object. You can drop the ENDPOINT setting if you use AWS S3.

  • This version uses the pagination helpers in case there are more versions of the object than fit in one response (1000 by default).
  • I create an s3.ObjectVersion using the returned VersionId and then delete() that to restore the object.
import boto3
import sys

ENDPOINT='10.62.64.200'

if len(sys.argv) != 3:
    print("Usage: {} bucketname key".format(sys.argv[0]))
    sys.exit(1)

bucketname = sys.argv[1]
key = sys.argv[2]

s3 = boto3.resource('s3', endpoint_url='http://' + ENDPOINT)
kwargs = {'Bucket' : bucketname, 'Prefix' : key}

pageresponse = s3.meta.client.get_paginator('list_object_versions').paginate(**kwargs)

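for pageobject in pageresponse:
    # The prefix can match other keys, so check every marker for an exact match.
    for marker in pageobject.get('DeleteMarkers', []):
        # Only removing the latest delete marker makes the object visible again.
        if marker['Key'] == key and marker['IsLatest']:
            print("Undeleting s3://{}/{}".format(bucketname, key))
            s3.ObjectVersion(bucketname, key, marker['VersionId']).delete()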
Kingsley answered 12/8, 2021 at 14:54 Comment(0)
