Retain owner and file permissions info when syncing to AWS S3 Bucket from Linux

I am syncing a directory to AWS S3 from a Linux server for backup.

rsync -a --exclude 'cache' /path/live /path/backup
aws s3 sync /path/backup s3://myBucket/backup --delete

However, I noticed that when I want to restore a backup like so:

aws s3 sync s3://myBucket/backup /path/live/ --delete

The file owner and permissions are different. Is there anything I can do, or change in these commands, to retain the files' original Linux ownership and permissions?

Thanks!

Gynaeco asked 25/12, 2016 at 16:48 Comment(3)
S3 isn't a Linux file system. It won't retain any Linux permissions because they don't apply to S3. You could try creating a tar file and copying that to S3, which would retain permission information, but that wouldn't be an incremental sync anymore. – Mitchelmitchell
Stack Overflow is a site for programming and development questions. This question appears to be off-topic because it is not about programming or development. See What topics can I ask about here in the Help Center. Perhaps Super User or Unix & Linux Stack Exchange would be a better place to ask. Also see Where do I post questions about Dev Ops? – Labannah
Thanks all for your suggestions. I thought there was a specific parameter I could pass to make this work, but I guess this is just an S3 limitation and would require a workaround. Compressing everything into a tar and doing a regular backup instead of a sync (sketched below) seems to be the only way. – Gynaeco
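
For reference, a minimal sketch of the tar-based workaround mentioned in these comments, using the paths from the question; the archive name and the /tmp staging location are just placeholders:

# Create a compressed archive; tar records ownership and permissions (uid/gid/mode) inside the archive itself
tar -czpf /tmp/live-backup.tar.gz --exclude='cache' -C /path live

# Copy the archive to the bucket (a full copy each time, not an incremental sync)
aws s3 cp /tmp/live-backup.tar.gz s3://myBucket/backup/live-backup.tar.gz

# Restore: fetch the archive and extract it as root so the original ownership can be re-applied
aws s3 cp s3://myBucket/backup/live-backup.tar.gz /tmp/live-backup.tar.gz
sudo tar -xzpf /tmp/live-backup.tar.gz -C /path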

I stumbled on this question while looking for something else and figured you (or someone else) might like to know that there are other tools which can preserve the original (Linux) ownership information. There must be others, but I know s3cmd can keep the ownership information (it is stored in the object's metadata in the bucket) and restore it when you sync back to a Linux box.
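
If you want to check what is actually stored, you can inspect the object's user metadata; in the s3cmd versions I have used, the attributes (uid, gid, mode, mtime, ...) end up under a key along the lines of x-amz-meta-s3cmd-attrs, though the exact key and value layout may differ between versions. The bucket and key below are placeholders:

# Show the object's metadata with s3cmd itself
s3cmd info s3://mybucket/path/somefile
# or with the AWS CLI
aws s3api head-object --bucket mybucket --key path/somefile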

The syntax for syncing is as follows:

/usr/bin/s3cmd --recursive --preserve sync /path/ s3://mybucket/path/

And you can sync it back with the same command, just reversing the from/to.
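
Using the same placeholder paths as above, the restore direction would look like this (run it as root if the original owners need to be re-applied locally):

/usr/bin/s3cmd --recursive --preserve sync s3://mybucket/path/ /path/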

But, as you might know (if you have done a little research on S3 cost optimisation), depending on the situation it could be wiser to use a compressed file. It saves space and should take fewer requests, so you could end up with some savings at the end of the month.

Also, s3cmd is not the fastest tool for synchronising with S3, as it does not use multi-threading (and is not planning to) like other tools do, so you might want to look for other tools that preserve ownership and also benefit from multi-threading, if that is still what you are looking for. To speed up data transfer with s3cmd, you can run multiple s3cmd processes with different --exclude/--include statements.

For example

/usr/bin/s3cmd --recursive --preserve --exclude="*" --include="a*" sync /path/ s3://mybucket/path/ &
/usr/bin/s3cmd --recursive --preserve --exclude="*" --include="b*" sync /path/ s3://mybucket/path/ &
/usr/bin/s3cmd --recursive --preserve --exclude="*" --include="c*" sync /path/ s3://mybucket/path/
Blench answered 31/1, 2017 at 4:41 Comment(1)
Thanks, this sounds very interesting, I will definitely try it out. You're right about compressing, but in my case it won't be saving space because I'm trying to do incremental backups. – Gynaeco
