How to change permissions recursively on a folder with AWS s3 or AWS s3api
Asked Answered

9

28

I am trying to grant permissions to an existing account in s3.

The bucket is owned by the account, but the data was copied from another account's bucket.

When I try to grant permissions with the command:

aws s3api put-object-acl --bucket <bucket_name> --key <folder_name> --profile <original_account_profile> --grant-full-control emailaddress=<destination_account_email>

I receive the error:

An error occurred (NoSuchKey) when calling the PutObjectAcl operation: The specified key does not exist.

whereas if I run it on a single file, the command succeeds.

How can I make it work for a full folder?

Carmencarmena answered 4/10, 2017 at 19:29 Comment(1)
Object ACLs only apply to objects and buckets, not folders, so you cannot define an ACL for a folder. The simplest solution is to define the permissions at the bucket level. Example: "Resource": "arn:aws:s3:::BUCKET_NAME/*" Cruce
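
To illustrate the bucket-level approach, here is a minimal sketch of a bucket policy granting another account read access to every object in the bucket, applied with aws s3api put-bucket-policy (the account ID, action, and policy file name are placeholder assumptions, not from the question):

# Hypothetical example: the account ID 111122223333 and s3:GetObject action are placeholders.
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::BUCKET_NAME/*"
  }]
}
EOF
aws s3api put-bucket-policy --bucket BUCKET_NAME --policy file://policy.json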
16

You will need to run the command individually for every object.

You might be able to short-cut the process by using:

aws s3 cp --acl bucket-owner-full-control --metadata Key=Value --profile <original_account_profile> s3://bucket/path s3://bucket/path

That is, you copy the files to themselves, but with the added ACL that grants permissions to the bucket owner.

If you have sub-directories, then add --recursive.
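
For example, a recursive run over the same placeholder path would look like this (a sketch; the dummy metadata key/value is arbitrary and only there to force the in-place copy):

aws s3 cp --recursive --acl bucket-owner-full-control --metadata Key=Value --profile <original_account_profile> s3://bucket/path s3://bucket/path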

Astatine answered 5/10, 2017 at 3:44 Comment(2)
Copying a file to itself will fail with "This copy request is illegal because it is trying to copy an object to itself without changing the object's metadata, storage class, website redirect location or encryption attributes"; but add another dummy change like setting the storage class, and you are good to go: aws s3 cp --recursive --acl bucket-owner-full-control s3://bucket/path s3://bucket/path --storage-class STANDARD Visional
@Visional has the most concise answer and worked for me. exactly right about the --storage-class as dummy attribute too. thx!Glynas
37

The other answers are ok, but the FASTEST way to do this is to use the aws s3 cp command with the option --metadata-directive REPLACE, like this:

aws s3 cp --recursive --acl bucket-owner-full-control s3://bucket/folder s3://bucket/folder --metadata-directive REPLACE

This gives speeds of between 50 MiB/s and 80 MiB/s.

The answer in the comments from John R suggests using a 'dummy' option, like --storage-class STANDARD. Whilst this works, it only gave me copy speeds between 5 MiB/s and 11 MiB/s.

The inspiration for trying this came from AWS's support article on the subject: https://aws.amazon.com/premiumsupport/knowledge-center/s3-object-change-anonymous-ownership/

NOTE: If you encounter 'access denied' for some of your objects, it is likely because you are using AWS credentials for the bucket-owning account, whereas you need to use credentials for the account the files were copied from.
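
If that is your situation, a sketch of the same command run under a profile for the source account (placeholder profile name, as used in the question) would be:

aws s3 cp --recursive --acl bucket-owner-full-control --metadata-directive REPLACE --profile <original_account_profile> s3://bucket/folder s3://bucket/folder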

Hasty answered 9/9, 2020 at 4:10 Comment(2)
This also works for DigitalOcean Spaces. You just need to add --endpoint=https://[location].digitaloceanspaces.com Description
This sets the content type of all files to be binary/octet-stream. Some files won't straight up work. Or won't be downloadable. This could lead to serious damage!!Dhaulagiri
33

This can only be achieved using pipes. Try:

aws s3 ls s3://bucket/path/ --recursive | awk '{cmd="aws s3api put-object-acl --acl bucket-owner-full-control --bucket bucket --key "$4; system(cmd)}'
Glooming answered 30/5, 2018 at 17:27 Comment(2)
Keep in mind that if you have a large number of files it might take a while. I didn't time it, but it took me over an hour for circa 1.6k files Dennard
I agree with @sskular. If you have a lot of objects and need it done ASAP, as often happens, then prefer the force-copy command from https://mcmap.net/q/218603/-how-to-change-permission-recursively-to-folder-with-aws-s3-or-aws-s3api Stillborn
3

Use Python to set the permissions recursively:

#!/usr/bin/env python
import boto3
import sys

client = boto3.client('s3')
BUCKET = 'enter-bucket-name'

def process_s3_objects(prefix):
    """Set the ACL on every key under the given prefix, page by page."""
    kwargs = {'Bucket': BUCKET, 'Prefix': prefix}
    failures = []
    while True:
        resp = client.list_objects_v2(**kwargs)
        for obj in resp.get('Contents', []):
            try:
                print(obj['Key'])
                set_acl(obj['Key'])
            except Exception:
                # Remember keys that could not be updated and keep going.
                failures.append(obj['Key'])
                continue
        if resp.get('IsTruncated'):
            # Fetch the next page of results.
            kwargs['ContinuationToken'] = resp['NextContinuationToken']
        else:
            break
    print("failures:", failures)

def set_acl(key):
    # Grant full control to this account's canonical user ID.
    client.put_object_acl(
        GrantFullControl="id=%s" % get_account_canonical_id(),
        Bucket=BUCKET,
        Key=key
    )

def get_account_canonical_id():
    return client.list_buckets()["Owner"]["ID"]


process_s3_objects(sys.argv[1])
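
Assuming the script is saved as set_acls.py (a hypothetical file name), you would invoke it with the key prefix as the only argument:

python set_acls.py "folder_name/"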
Orthochromatic answered 29/7, 2018 at 13:57 Comment(1)
This helped me but I had to adapt it. One of my main changes was to replace GrantFullControl="id=%s" % get_account_canonical_id with ACL='bucket-owner-full-control'. This was needed because I wanted to change ACLs for objects in a bucket in a different AWS account.Hexamethylenetetramine
2

One thing you can do to get around the need for setting the ACL for every single object is disabling ACLs for the bucket. All objects in the bucket will then be owned by the bucket owner, and you can use policies for access control instead of ACLs.

You do this by setting the "object ownership" setting to "bucket owner enforced". As per the AWS documentation, this is in fact the recommended setting:

For the majority of modern use cases in S3, we recommend that you disable ACLs by choosing the bucket owner enforced setting and use your bucket policy to share data with users outside of your account as needed. This approach simplifies permissions management and auditing.

You can set this in the web console by going to the "Permissions" tab for the bucket, and clicking the "Edit" button in the "Object Ownership" section. You can then select the "ACLs disabled" radio button.

You can also use the AWS CLI. An example from the documentation:

aws s3api put-bucket-ownership-controls --bucket DOC-EXAMPLE-BUCKET --ownership-controls Rules=[{ObjectOwnership=BucketOwnerEnforced}]
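
To confirm the change afterwards, you can read the setting back (a sketch using the same placeholder bucket name):

aws s3api get-bucket-ownership-controls --bucket DOC-EXAMPLE-BUCKET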
Weld answered 6/4, 2022 at 17:34 Comment(0)
1

This was my PowerShell-only solution. It prints one put-object-acl command per key; you can then run the generated commands (for example by piping the output to Invoke-Expression).

aws s3 ls s3://BUCKET/ --recursive | %{ "aws s3api put-object-acl --bucket BUCKET --key "+$_.ToString().substring(30)+" --acl bucket-owner-full-control" }

Subsonic answered 31/7, 2019 at 20:56 Comment(0)
1

I had a similar issue with taking ownership of log objects in quite a large bucket: 3,290,956 objects, 1.4 TB in total.

The solutions I was able to find were far too sluggish for that number of objects. I ended up writing some code that could do the job several times faster than

aws s3 cp

You will need to install requirements:

pip install pathos boto3 click

#!/usr/bin/env python3
import logging
import os
import sys
import boto3
import botocore
import click
from time import time
from botocore.config import Config
from pathos.pools import ThreadPool as Pool

logger = logging.getLogger(__name__)

streamformater = logging.Formatter("[*] %(levelname)s: %(asctime)s: %(message)s")
logstreamhandler = logging.StreamHandler()
logstreamhandler.setFormatter(streamformater)


def _set_log_level(ctx, param, value):
    if value:
        ctx.ensure_object(dict)
        ctx.obj["log_level"] = value
        logger.setLevel(value)
        if value <= 20:
            logger.info(f"Logger set to {logging.getLevelName(logger.getEffectiveLevel())}")
    return value


@click.group(chain=False)
@click.version_option(version='0.1.0')
@click.pass_context
def cli(ctx):
    """
        Take object ownership of S3 bucket objects.
    """
    ctx.ensure_object(dict)
    ctx.obj["aws_config"] = Config(
        retries={
            'max_attempts': 10,
            'mode': 'standard'
        }
    )


@cli.command("own")
@click.argument("bucket", type=click.STRING)
@click.argument("prefix", type=click.STRING, default="/")
@click.option("--profile", type=click.STRING, default="default", envvar="AWS_DEFAULT_PROFILE", help="Configuration profile from ~/.aws/{credentials,config}")
@click.option("--region", type=click.STRING, default="us-east-1", envvar="AWS_DEFAULT_REGION", help="AWS region")
@click.option("--threads", "-t", type=click.INT, default=40, help="Threads to use")
@click.option("--loglevel", "log_level", hidden=True, flag_value=logging.INFO, callback=_set_log_level, expose_value=False, is_eager=True, default=True)
@click.option("--verbose", "-v", "log_level", flag_value=logging.DEBUG, callback=_set_log_level, expose_value=False, is_eager=True, help="Increase log_level")
@click.pass_context
def command_own(ctx, *args, **kwargs):
    ctx.obj.update(kwargs)
    profile_name = ctx.obj.get("profile")
    region = ctx.obj.get("region")
    bucket = ctx.obj.get("bucket")
    prefix = ctx.obj.get("prefix").lstrip("/")
    threads = ctx.obj.get("threads")
    pool = Pool(nodes=threads)
    logger.addHandler(logstreamhandler)
    logger.info(f"Getting ownership of all objects in s3://{bucket}/{prefix}")
    start = time()

    try:
        SESSION: boto3.Session = boto3.session.Session(profile_name=profile_name)
    except botocore.exceptions.ProfileNotFound as e:
        logger.warning(f"Profile {profile_name} was not found.")
        logger.warning(f"Falling back to environment variables for AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN")
        AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID", "")
        AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY", "")
        AWS_SESSION_TOKEN = os.environ.get("AWS_SESSION_TOKEN", "")
        if AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY:
            if AWS_SESSION_TOKEN:
                SESSION: boto3.Session = boto3.session.Session(aws_access_key_id=AWS_ACCESS_KEY_ID, aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
                                                               aws_session_token=AWS_SESSION_TOKEN)
            else:
                SESSION: boto3.Session = boto3.session.Session(aws_access_key_id=AWS_ACCESS_KEY_ID, aws_secret_access_key=AWS_SECRET_ACCESS_KEY)
        else:
            logger.error("Unable to find AWS credentials.")
            sys.exit(1)

    s3c = SESSION.client('s3', config=ctx.obj["aws_config"])

    def bucket_keys(Bucket, Prefix='', StartAfter='', Delimiter='/'):
        Prefix = Prefix[1:] if Prefix.startswith(Delimiter) else Prefix
        if not StartAfter:
            del StartAfter
            if Prefix.endswith(Delimiter):
                StartAfter = Prefix
        del Delimiter
        for page in s3c.get_paginator('list_objects_v2').paginate(Bucket=Bucket, Prefix=Prefix):
            for content in page.get('Contents', ()):
                yield content['Key']

    def worker(key):
        logger.info(f"Processing: {key}")
        s3c.copy_object(Bucket=bucket, Key=key,
                        CopySource={'Bucket': bucket, 'Key': key},
                        ACL='bucket-owner-full-control',
                        StorageClass="STANDARD"
                        )

    object_keys = bucket_keys(bucket, prefix)
    pool.map(worker, object_keys)
    end = time()
    logger.info(f"Completed for {end - start:.2f} seconds.")


if __name__ == '__main__':
    cli()

Usage:

get_object_ownership.py own -v my-big-aws-logs-bucket /prefix

The bucket mentioned above was processed in ~7 hours using 40 threads.

[*] INFO: 2021-08-05 19:53:55,542: Completed for 25320.45 seconds.

Some more speed comparison using AWS cli vs this tool on the same subset of data:

aws s3 cp --recursive --acl bucket-owner-full-control --metadata-directive 53.59s user 7.24s system 20% cpu 5:02.42 total

vs

[*] INFO: 2021-08-06 09:07:43,506: Completed for 49.09 seconds.

Demonstrative answered 5/8, 2021 at 19:13 Comment(0)
0

I used this Linux Bash shell oneliner to change ACLs recursively:

aws s3 ls s3://bucket --recursive | cut -c 32- | xargs -n 1 -d '\n' -- aws s3api put-object-acl --acl public-read --bucket bucket --key

It works even if file names contain () characters.

Curacy answered 21/4, 2021 at 12:54 Comment(0)
-2

The Python code is more efficient this way; otherwise it takes a lot longer.

import boto3
import sys

client = boto3.client('s3')
BUCKET = 'mybucket'

def process_s3_objects(prefix):
    """Set the ACL on every key under the given prefix."""
    kwargs = {'Bucket': BUCKET, 'Prefix': prefix}
    failures = []
    while True:
        resp = client.list_objects_v2(**kwargs)
        for obj in resp.get('Contents', []):
            try:
                set_acl(obj['Key'])
            except Exception:
                failures.append(obj['Key'])
                continue
        # The continuation token is updated once per page, not once per object.
        if resp.get('IsTruncated'):
            kwargs['ContinuationToken'] = resp['NextContinuationToken']
        else:
            break
    print("failures:", failures)

def set_acl(key):
    print(key)
    client.put_object_acl(
        ACL='bucket-owner-full-control',
        Bucket=BUCKET,
        Key=key
    )

def get_account_canonical_id():
    return client.list_buckets()["Owner"]["ID"]


process_s3_objects(sys.argv[1])
Hunnish answered 15/5, 2019 at 16:11 Comment(2)
Can you elaborate on how and why this code is more efficient and compared to what?Deferment
Can you also elaborate on why you've just copied a previous answer?Allynallys
