FTP/SFTP access to an Amazon S3 Bucket [closed]

163

Is there a way to connect to an Amazon S3 bucket with FTP or SFTP rather than the built-in Amazon file transfer interface in the AWS console? Seems odd that this isn't a readily available option.

Donee answered 29/5, 2014 at 17:20 Comment(1)
In Nov 2018, AWS released a fully managed SFTP service that enables the transfer of files directly into and out of Amazon S3: AWS Transfer for SFTP.Anstus
131

There are three options.

  • You can use the native Amazon managed SFTP service (aka AWS Transfer for SFTP), which is the easiest to set up.
  • Or you can mount the bucket to a file system on a Linux server and access the files over SFTP like any other files on the server (which gives you greater control).
  • Or you can just use a (GUI) client that natively supports the S3 protocol (which is free).

Managed SFTP Service

  • In your Amazon AWS Console, go to AWS Transfer for SFTP and create a new server.

  • On the SFTP server page, add a new SFTP user (or users).

    • Permissions of users are governed by an associated IAM role (for a quick start, you can use the AmazonS3FullAccess policy).

    • The role must have a trust relationship to transfer.amazonaws.com.

For details, see my guide Setting up an SFTP access to Amazon S3.
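
If you prefer the command line to the console, the same setup can be scripted with the aws-cli. This is only a minimal sketch; the server ID, role ARN, bucket, user name and key file below are placeholders you would replace with your own values:

    # Create a service-managed SFTP server (AWS Transfer for SFTP)
    aws transfer create-server --identity-provider-type SERVICE_MANAGED

    # Add an SFTP user backed by the bucket; the role must grant S3 access
    # and have a trust relationship to transfer.amazonaws.com
    aws transfer create-user \
        --server-id s-1234567890abcdef0 \
        --user-name myuser \
        --role arn:aws:iam::123456789012:role/my-transfer-role \
        --home-directory /my-bucket/myuser \
        --ssh-public-key-body "$(cat ~/.ssh/id_rsa.pub)"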


Mounting Bucket to Linux Server

Just mount the bucket using the s3fs file system (or similar) on a Linux server (e.g. Amazon EC2) and use the server's built-in SFTP server to access the bucket.

  • Install s3fs.

  • Add your security credentials in the form access-key-id:secret-access-key to /etc/passwd-s3fs.

  • Add a bucket mounting entry to fstab:

    <bucket> /mnt/<bucket> fuse.s3fs rw,nosuid,nodev,allow_other 0 0
    

For details, see my guide Setting up an SFTP access to Amazon S3.
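
Roughly, the setup might look like this (a sketch assuming an Ubuntu/Debian EC2 instance; the bucket name and keys are placeholders):

    # Install s3fs (the package name may differ on other distributions)
    sudo apt-get install -y s3fs

    # Store the credentials s3fs will use, in the form access-key-id:secret-access-key
    echo 'AKIAXXXXXXXXXXXX:your-secret-access-key' | sudo tee /etc/passwd-s3fs
    sudo chmod 600 /etc/passwd-s3fs

    # Create the mount point and mount the bucket (or use the fstab entry above)
    sudo mkdir -p /mnt/my-bucket
    sudo s3fs my-bucket /mnt/my-bucket -o allow_other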


Use S3 Client

Or use any free "FTP/SFTP client" that's also an "S3 client", and you do not have to set up anything on the server side. For example, my WinSCP or Cyberduck.

WinSCP even has scripting and a .NET/PowerShell interface, if you need to automate the transfers.
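
For instance, a scripted transfer might look something like this (a rough sketch using WinSCP command-line scripting; the keys, bucket and paths are placeholders, and a secret key containing special characters would need URL-encoding):

    winscp.com /command ^
        "open s3://ACCESS_KEY_ID:SECRET_ACCESS_KEY@s3.amazonaws.com/" ^
        "put C:\data\report.csv /my-bucket/reports/" ^
        "exit"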

Zosima answered 3/9, 2015 at 14:17 Comment(2)
Having the bucket mounted as root later gives permission-denied problems when connecting as ec2-user via SFTP. The /mnt/<bucket> folder is owned by root and has the group root as well.Nutcracker
@Nutcracker /others - Mount as the ftp user (using the uid/gid options) and make sure it's mounted with allow_other (or -o allow_other if mounting from the s3fs command line). Works for me. It's also a good idea to write the files with read-only permissions (-o default_acl=public-read), in my case on a private bucket.Haase
69

Update

AWS now offers a fully-managed SFTP service for S3 (AWS Transfer for SFTP) that integrates with IAM and can be administered using the aws-cli.


There are theoretical and practical reasons why this isn't a perfect solution, but it does work...

You can install an FTP/SFTP service (such as proftpd) on a Linux server, either in EC2 or in your own data center... then mount a bucket into the filesystem where the FTP server is configured to chroot, using s3fs.

I have a client that serves content out of S3, and the content is provided to them by a 3rd party who only supports ftp pushes... so, with some hesitation (due to the impedance mismatch between S3 and an actual filesystem) but lacking the time to write a proper FTP/S3 gateway server software package (which I still intend to do one of these days), I proposed and deployed this solution for them several months ago and they have not reported any problems with the system.

As a bonus, since proftpd can chroot each user into their own home directory and "pretend" (as far as the user can tell) that files owned by the proftpd user are actually owned by the logged in user, this segregates each ftp user into a "subdirectory" of the bucket, and makes the other users' files inaccessible.
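
The chroot-related part of the ProFTPd configuration is small. A minimal sketch of what it might look like in proftpd.conf, assuming local system users whose home directories sit under the s3fs mount:

    # Jail each user into their home directory (e.g. under the s3fs mount)
    DefaultRoot ~
    # Allow accounts whose shell isn't listed in /etc/shells to log in
    RequireValidShell off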


There is a problem with the default configuration, however.

Once you start to get a few tens or hundreds of files, the problem will manifest itself when you pull a directory listing, because ProFTPd will attempt to read the .ftpaccess files over, and over, and over again, and for each file in the directory, .ftpaccess is checked to see if the user should be allowed to view it.

You can disable this behavior in ProFTPd, but I would suggest that the most correct fix is to configure the additional options -o enable_noobj_cache -o stat_cache_expire=30 in s3fs:

-o stat_cache_expire (default is no expire)

specify expire time(seconds) for entries in the stat cache

Without this option, you'll make fewer requests to S3, but you also will not always reliably discover changes made to objects if external processes or other instances of s3fs are also modifying the objects in the bucket. The value "30" in my system was selected somewhat arbitrarily.

-o enable_noobj_cache (default is disable)

enable cache entries for the object which does not exist. s3fs always has to check whether file(or sub directory) exists under object(path) when s3fs does some command, since s3fs has recognized a directory which does not exist and has files or subdirectories under itself. It increases ListBucket request and makes performance bad. You can specify this option for performance, s3fs memorizes in stat cache that the object (file or directory) does not exist.

This option allows s3fs to remember that .ftpaccess wasn't there.


Unrelated to the performance issues that can arise with ProFTPd, which are resolved by the above changes, you also need to enable -o enable_content_md5 in s3fs.

-o enable_content_md5 (default is disable)

verifying uploaded data without multipart by content-md5 header. Enable to send "Content-MD5" header when uploading a object without multipart posting. If this option is enabled, it has some influences on a performance of s3fs when uploading small object. Because s3fs always checks MD5 when uploading large object, this option does not affect on large object.

This is an option which never should have been an option -- it should always be enabled, because not doing this bypasses a critical integrity check for only a negligible performance benefit. When an object is uploaded to S3 with a Content-MD5: header, S3 will validate the checksum and reject the object if it's corrupted in transit. However unlikely that might be, it seems short-sighted to disable this safety check.

Quotes are from the man page of s3fs. Grammatical errors are in the original text.
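
Put together, a mount command with the options discussed above might look something like this (a sketch; the bucket name and mount point are placeholders):

    s3fs my-bucket /mnt/my-bucket \
        -o allow_other \
        -o enable_noobj_cache \
        -o stat_cache_expire=30 \
        -o enable_content_md5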

Dendrochronology answered 30/5, 2014 at 3:10 Comment(12)
could you elaborate on the reasons why this solution isn't ideal?Romy
I did this thing. Do you know why ProFTP always goes into timeout while listing my bucket folder? From command line I can do ls without issuesCartel
@MarcoMarsala, yes, I do know why. I eventually encountered the same problem as directory sizes started to grow. I will research the details of the solution I worked out, and update this answer.Dendrochronology
@MarcoMarsala the fixes for large directories have been added to the answer.Dendrochronology
@Michael-sqlbot have you tried to use "AllowOverride off" directive in ProFTPd config to make it stop trying to read ".ftpaccess" files completely?Sonata
Will this setup work with multiple users? I setup S3FS and SFTP via OpenSSH and users can't write to their mounted S3 bucket folders because I can't control the users/permisions on their individual folders with S3FS.Trish
@T.BrianJones why can't you control the permissions on the folders with S3FS? You should be able to... but, with proFTPd, you don't have to -- you can leave everything owned by a single user, and proFTPd will make it appear as though each user owns their own files (and can be configured to chroot them to their individual home directories).Dendrochronology
I've tried everything and can only set user:group / permissions at the folder level where the S3 bucket is mounted. Then those permissions propagate down to every folder on S3. I've tried many things including many variations on this S3FS command sudo s3fs bucket-name /local-mount-folder-name/ -o iam_role=sftp-server -o allow_other -o umask=022 -o uid=501 -o gid=501 - I can't change any permissions on the folders in the Mounted S3 folder once it's created.Trish
@Michael-sqlbot - Could you recommend a good tutorial or notes on setting up ProFTP with chrooted users. I'm familiar doing this with OpenSSH, but it's proving more difficult to find information that seems reliable/straight forward for ProFTPd.Trish
Have you had issues with upload speeds? I can get 1.5mb/sec uploading directly to the server's hard drive, but only ~90kb/s when uploading to the s3fs mounted folder. My server is an EC2 in the same region as the S3 bucket and know my connection there is faster than my internet connection.Trish
is there a better solution to this problem now?Ellingson
@Ellingson I still use this solution in production. It doesn't give me any problems.Dendrochronology
25

Answer from 2014 for the people who are down-voting me:

Well, S3 isn't FTP. There are lots and lots of clients that support S3, however.

Pretty much every notable FTP client on OS X has support, including Transmit and Cyberduck.

If you're on Windows, take a look at Cyberduck or CloudBerry.

Updated answer for 2019:

AWS has recently released the AWS Transfer for SFTP service, which may do what you're looking for.

Coolth answered 29/5, 2014 at 19:45 Comment(2)
Cyberduck works fantastically easy if you're a server newbie like myself. Just clicked on Open Connection, selected S3 from the dropdown, and input my credentials. Much easier than some of the options mentioned above!Luz
I think it is important to mention that if one uses the AWS Transfer Family, they could incur significant costs. SFTP enabled on your endpoint: At $0.30 hourly rate, your monthly charge for SFTP is: $0.30 * 24 hours * 30 days = $216 SFTP data upload and download: At $0.04/GB, your monthly charge for data uploads and downloads over SFTP is: $0.04 * 1 GB * 30 days = $1.20 Adding the charges above, your total monthly bill for the AWS Transfer Family would be: $216 + $1.20 = $217.20.Baggott
7

Or spin up a Linux instance running SFTP Gateway in your AWS infrastructure, which saves uploaded files to your Amazon S3 bucket.

Supported by Thorntech

Anstus answered 21/6, 2017 at 14:39 Comment(1)
We've been using the SFTP Gateway in production for large projects for several years. We've found it to be more reliable than s3fsEquivalency
4

Amazon has released an SFTP service for S3, but it only does SFTP (not FTP or FTPES) and it can be cost-prohibitive depending on your circumstances.

I'm the Founder of DocEvent.io, and we provide FTP/S Gateways for your S3 bucket without having to spin up servers or worry about infrastructure.

There are also other companies that provide a standalone FTP server that you pay for by the month and that can connect to an S3 bucket through the software configuration, for example brickftp.com.

Lastly, there are also some AWS Marketplace apps that can help; here is a search link. Many of these spin up instances in your own infrastructure - this means you'll have to manage and upgrade the instances yourself, which can be difficult to maintain and configure over time.

Crowell answered 18/2, 2019 at 3:20 Comment(1)
DocEvent looks good but has too many restrictions on the free plan... I could not even try the service...Beatup
3

WinSCP now supports the S3 protocol

First, make sure your AWS user with S3 access permissions has an “Access key ID” created. You also have to know the “Secret access key”. Access keys are created and managed on the Users page of the IAM Management Console.

Make sure the New site node is selected.

On the New site node, select the Amazon S3 protocol.

Enter your AWS user Access key ID and Secret access key.

Save your site settings using the Save button.

Login using the Login button.

Rid answered 21/2, 2019 at 10:55 Comment(0)
2

Filezilla just released a Pro version of their FTP client. It connects to S3 buckets in a streamlined, FTP-like experience. I use it myself (no affiliation whatsoever) and it works great.

Atticism answered 30/7, 2017 at 16:14 Comment(0)
1

As other posters have pointed out, there are some limitations with the AWS Transfer for SFTP service. You need to align it closely with your requirements. For example, there are no quotas, whitelists/blacklists, or file-type limits, and non-key-based access requires external services. There is also a certain overhead relating to user management and IAM, which can become a pain at scale.

We have been running an SFTP S3 proxy gateway for about 5 years now for our customers. The core solution is wrapped in a collection of Docker services and deployed in whatever context is needed, even on-premise or local development servers. The use case for us is a little different, as our solution is focused on data processing and pipelines rather than a file share. In a Salesforce example, a customer will use SFTP as the transport method, sending email, purchase...data to an SFTP/S3 endpoint. This is mapped to an object key on S3. Upon arrival, the data is picked up, processed, routed and loaded to a warehouse. We also have fairly significant auditing requirements for each transfer, something the CloudWatch logs for AWS do not directly provide.

As others have mentioned, rolling your own is an option too. Using AWS Lightsail you can set up a cluster of, say, four $10 2GB instances, using either Route 53 or an ELB.

In general, it is great to see AWS offer this service and I expect it to mature over time. However, depending on your use case, alternative solutions may be a better fit.

Parenthood answered 22/2, 2019 at 17:1 Comment(0)
