How can I delete files older than seven days in Amazon S3?

I need to delete files in Amazon S3 that are older than seven days. I was looking for a shell script to do this, but had no luck with a Google search. I found the URL below:

http://shout.setfive.com/2011/12/05/deleting-files-older-than-specified-time-with-s3cmd-and-bash/

It was not helpful to us. What would a script look like that deletes all files older than seven days?

Donavon answered 22/5, 2018 at 12:27 Comment(0)

We have modified the code a little bit and it is working fine.

    aws s3 ls s3://BUCKETNAME/ | while read -r line; do
        # The first two columns of `aws s3 ls` output are the date and time
        createDate=$(echo "$line" | awk '{print $1" "$2}')
        createDate=$(date -d "$createDate" +%s)
        olderThan=$(date --date "7 days ago" +%s)
        if [[ $createDate -lt $olderThan ]]; then
            # The fourth column is the object key
            fileName=$(echo "$line" | awk '{print $4}')
            if [[ $fileName != "" ]]; then
                aws s3 rm "s3://BUCKETNAME/$fileName"
            fi
        fi
    done
Donavon answered 30/5, 2018 at 7:47 Comment(4)
add s3:// at the beginning of the rm command: aws s3 rm s3://BUCKETNAME/$fileName – Lacking
Syntax error: "done" unexpected (expecting "then") – Longo
@mba3gar here is the solution that worked for me: https://mcmap.net/q/841828/-how-can-i-delete-files-older-than-seven-days-in-amazon-s3 – Longo
This is cool, thanks! I just needed to add --recursive to the s3 command (aws s3 ls BUCKETNAME/ --recursive) for the script to work with paths that contain slash characters like BUCKETNAME/V1/V2/ – Godevil

The easiest method is to define Object Lifecycle Management on the Amazon S3 bucket.

You can specify that objects older than a certain number of days should be expired (deleted). The best part is that this happens automatically on a regular basis and you don't need to run your own script.
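
For example, a minimal expiration rule can be applied from the AWS CLI (a sketch; BUCKETNAME and the rule ID are placeholders):

    # Delete every object in the bucket 7 days after it was created
    aws s3api put-bucket-lifecycle-configuration \
        --bucket BUCKETNAME \
        --lifecycle-configuration '{
            "Rules": [{
                "ID": "expire-after-7-days",
                "Filter": {"Prefix": ""},
                "Status": "Enabled",
                "Expiration": {"Days": 7}
            }]
        }'

S3 evaluates lifecycle rules roughly once a day, so objects may linger slightly past the seven-day mark before they are actually removed.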

If you wanted to do it yourself, the best approach would be to write a script (e.g. in Python) to retrieve the list of files and delete the ones older than a certain date.

Example: GitHub - jordansissel/s3cleaner: Amazon S3 file cleaner - delete things older than a certain age, matching a pattern, etc.

It's somewhat messier to do as a shell script.

Axiomatic answered 23/5, 2018 at 2:22 Comment(1)
This should be the accepted answer. There's no need to use messy scripts when we can configure OLM on the Amazon S3 bucket accordingly. – Oversell

I was looking for an s3cmd command to delete files older than N days, and here is what worked for me:

s3cmd ls s3://your-address-here/ | awk -v dys="2" 'BEGIN { depoch=(dys*86400);cepoch=(systime()-depoch) } { gsub("-"," ",$1);gsub(":"," ",$2 );if (mktime($1" "$2" 00")<=cepoch) { print "s3cmd del "$4 } }' | bash
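
For readability, here is the same logic split over multiple lines (a sketch; your-address-here is a placeholder, the time functions are gawk extensions, and like the original it assumes keys without spaces):

    s3cmd ls s3://your-address-here/ | awk -v dys="2" '
        BEGIN { cutoff = systime() - dys*86400 }      # epoch timestamp N days ago
        {
            gsub("-", " ", $1); gsub(":", " ", $2)    # reformat date/time for mktime()
            if (mktime($1" "$2" 00") <= cutoff)
                print "s3cmd del " $4                 # emit one delete command per old file
        }' | bash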
Longo answered 1/3, 2021 at 13:35 Comment(1)
This command doesn't work for me unless I use --recursive or -r: s3cmd ls -r s3://your-address-here/ | awk -v dys="2" 'BEGIN { depoch=(dys*86400);cepoch=(systime()-depoch) } { gsub("-"," ",$1);gsub(":"," ",$2 );if (mktime($1" "$2" 00")<=cepoch) { print "s3cmd del "$4 } }' | bash – Harmonyharmotome

Easiest way as of Oct 2023

I had a similar situation to deal with in a multi-tenant setup where a single user could create N buckets, each with a different retention period in days. Since I had all the bucket-related config available at the DB level, simple use of the s3cmd expire command did the job.

Set expiry:

s3cmd expire s3://BUCKET_PATH --expiry-days 7 --access_key=ACCESS_KEY --secret_key=SECRET_KEY

List life cycle:

s3cmd getlifecycle s3://BUCKET_PATH --access_key=ACCESS_KEY --secret_key=SECRET_KEY
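
Under the hood, s3cmd expire simply sets an S3 lifecycle expiration rule on the bucket, the same mechanism described in the Object Lifecycle Management answer above.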
Sussman answered 26/10, 2023 at 4:31 Comment(0)

Here is a simple script that I wrote for my environment.

The files in my S3 bucket are named in the FULL_BACKUP_2020-06-25.tar.gz format.

#!/bin/bash

# Threshold: the date three days ago, as YYYYMMDD (e.g. 20200626)
ThreeDaysOldDate=$(date -d '-3 days' +%Y-%m-%d | tr -d '-')
# Extract the YYYYMMDD part of each FULL_BACKUP_YYYY-MM-DD.tar.gz key
# (the first line of the listing is skipped)
Obj=$(/usr/local/bin/aws s3 ls s3://bucket_name/folder/ | sed -n '2,$p' | awk '{print $4}' | cut -b 13-22 | tr -d '-')

# Compare each file's date against the threshold and remove older files from S3.
for i in $Obj
do
    if [ "$i" -lt "$ThreeDaysOldDate" ]; then
        year=$(echo "$i" | cut -c 1-4)
        mon=$(echo "$i" | cut -c 5-6)
        day=$(echo "$i" | cut -c 7-8)
        DATE="FULL_BACKUP_$year-$mon-$day.tar.gz"
        /usr/local/bin/aws s3 rm "s3://bucket_name/folder/$DATE" > /dev/null 2>&1
    fi
done
Eurhythmics answered 29/6, 2020 at 14:31 Comment(0)

This will recursively delete files older than 159 days from the S3 bucket; change the number of days as per your requirement. It also handles filenames containing spaces, which the scripts above didn't.

Note: the existing directory structure may get deleted. If you don't need to preserve the directory structure, you can use this as-is.

If you want to preserve the directory structure, give the full path of the last child directory, and modify it on each execution to protect the parent directory structure.

example:

s3://BucketName/dir1/dir2/dir3/

s3://BucketName/dir1/dir2/dir4/

s3://BucketName/dir1/dir2/dir5/

vim s3_file_delete.sh

s3bucket="s3://BucketName"
s3dirpath="s3://BucketName/WithOrWithoutDirectoryPath/"
aws s3 ls "$s3dirpath" --recursive | while read -r line; do
    createDate=$(echo "$line" | awk '{print $1" "$2}')
    createDate=$(date -d "$createDate" +%s)
    olderThan=$(date --date "159 days ago" +%s)
    if [[ $createDate -lt $olderThan ]]; then
        # Rebuild the key from columns 4..NF so filenames with spaces survive,
        # then strip the leading whitespace
        fileName=$(echo "$line" | awk '{a="";for (i=4;i<=NF;i++){a=a" "$i} print a}' | awk '{ sub(/^[ \t]+/, ""); print }')
        if [[ $fileName != "" ]]; then
            #echo "$s3bucket/$fileName"
            aws s3 rm "$s3bucket/$fileName"
        fi
    fi
done
Scalf answered 6/11, 2020 at 13:20 Comment(0)

I slightly modified the one from Prabhu R so that the shell script runs on Mac OS X (I tested with Mac OS X v10.13 (High Sierra)). gdate is GNU date, which Homebrew installs as part of the coreutils package:

BUCKETNAME=s3://BucketName/WithOrWithoutDirectoryPath/
aws s3 ls $BUCKETNAME | while read -r line;
do
  createDate=`echo $line|awk {'print $1" "$2'}`
  createDate=`gdate -d"$createDate" +%s`
  olderThan=`gdate '+%s' -d '1 week ago'`
  if [[ $createDate -lt $olderThan ]]
    then
      fileName=`echo $line|awk {'print $4'}`
      if [[ $fileName != "" ]]
        then
          echo "deleting " $BUCKETNAME$fileName
          aws s3 rm $BUCKETNAME$fileName
      fi
  fi
done;
Bromate answered 6/1, 2021 at 14:16 Comment(0)

I created the script below and have it running with cron. One backup file is generated per day; as per my requirement, the script deletes the oldest file once eight have accumulated, keeping only seven days of backups.

#!/bin/bash
# Purpose: 7-day backup retention policy (one backup file is generated per day)

# Count the objects in the bucket
count=$(/usr/bin/sudo /usr/local/bin/aws s3 ls bucketname | nl | tail -n1 | awk '{print $1}')
if [[ "$count" == 8 ]]
then
    # The oldest backup sorts first (assuming date-stamped filenames); delete it
    filename=$(/usr/bin/sudo /usr/local/bin/aws s3 ls bucketname | awk '{print $NF}' | head -n1)
    /usr/bin/sudo /usr/local/bin/aws s3 rm "s3://bucketname/$filename"
fi
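
For example, a daily crontab entry could look like this (the script path here is hypothetical):

    # Run the retention check every day at 02:00
    0 2 * * * /home/ec2-user/s3_retention.sh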
Redon answered 26/10, 2021 at 6:44 Comment(0)

Based on the solution suggested by @Prabhu R, I patched the code and added variables.

So if you save the below to cleanup.sh, you can run:

./cleanup.sh <bucket_name> <days_beyond_which_files_are_removed | number>

#!/bin/bash

aws s3 ls "$1"/ --recursive | while read -r line; do
    createDate=$(echo "$line" | awk '{print $1" "$2}')
    createDate=$(date -d "$createDate" +%s)
    olderThan=$(date --date "$2 days ago" +%s)
    if [[ $createDate -lt $olderThan ]]; then
        fileName=$(echo "$line" | awk '{print $4}')
        if [[ $fileName != "" ]]; then
            aws s3 rm "s3://$1/$fileName"
        fi
    fi
done
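
For example, to remove everything older than seven days from a hypothetical bucket named my-backup-bucket:

    chmod +x cleanup.sh
    ./cleanup.sh my-backup-bucket 7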
Scott answered 16/12, 2022 at 11:1 Comment(0)
