Downloading a file from an S3 bucket to the USER'S computer

Goal

Download a file from an S3 bucket to the user's computer.

Context

I am working on a Python/Flask API for a React app. When the user clicks the Download button on the front end, I want to download the appropriate file to their machine.

What I've tried

import boto3

s3 = boto3.resource('s3')
s3.Bucket('mybucket').download_file('hello.txt', '/tmp/hello.txt')

I am currently using some code that finds the path of the downloads folder and then passes that path to download_file() as the second parameter, along with the key of the file on the bucket that the user is trying to download.

This worked locally, and tests ran fine, but I ran into a problem once it was deployed: the code finds the downloads path of the SERVER and downloads the file there.

Question

What is the best way to approach this? I have researched this and cannot find a good solution for downloading a file from the S3 bucket to the user's downloads folder. Any help/advice is greatly appreciated.

Thorlie answered 4/4, 2017 at 19:20 Comment(1)
It all depends on how your user is connecting to the server. If it's through a browser, then you should make a new endpoint to download the file, and provide a link to the endpoint in your app. If you're writing a native app, then you'll need to set up some sort of RPC to get the file from the server. - Moran

You should not need to save the file to the server. You can just download the file into memory, and then build a Response object containing the file.

from flask import Flask, Response
from boto3 import client

app = Flask(__name__)


def get_client():
    return client(
        's3',
        'us-east-1',
        aws_access_key_id='id',
        aws_secret_access_key='key'
    )


@app.route('/blah', methods=['GET'])
def index():
    s3 = get_client()
    file = s3.get_object(Bucket='blah-test1', Key='blah.txt')
    return Response(
        file['Body'].read(),
        mimetype='text/plain',
        headers={"Content-Disposition": "attachment;filename=test.txt"}
    )


if __name__ == '__main__':
    app.run(debug=True, port=8800)
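
The endpoint above hardcodes mimetype='text/plain' and the filename test.txt, which would mislabel other file types such as a zip archive. A minimal sketch of deriving both from the object key with the standard library's mimetypes module follows. The helper name download_headers is hypothetical and not part of the original answer, and it assumes keys contain no double quotes:

```python
import mimetypes
import posixpath


def download_headers(key):
    """Return (mimetype, headers) for serving an S3 key as a download.

    Hypothetical helper: guesses the MIME type from the key's extension and
    falls back to a generic binary type when the extension is unknown.
    """
    # S3 keys use '/' as a separator, so take the last path component.
    filename = posixpath.basename(key)
    mimetype, _encoding = mimetypes.guess_type(filename)
    if mimetype is None:
        mimetype = 'application/octet-stream'
    headers = {
        'Content-Disposition': 'attachment; filename="{}"'.format(filename)
    }
    return mimetype, headers
```

The two return values could then be passed to Response as its mimetype and headers arguments in place of the hardcoded ones.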

This is fine for small files, where there won't be any meaningful wait time for the user. With larger files, however, this will affect UX: the file has to be downloaded completely to the server before it is sent on to the user. To fix this, use the Range keyword argument of the get_object method:

from flask import Flask, Response
from boto3 import client

app = Flask(__name__)


def get_client():
    return client(
        's3',
        'us-east-1',
        aws_access_key_id='id',
        aws_secret_access_key='key'
    )


def get_total_bytes(s3):
    # head_object returns the object's metadata, including its size,
    # without fetching the body or listing the whole bucket.
    return s3.head_object(Bucket='blah-test1', Key='blah.txt')['ContentLength']


def get_object(s3, total_bytes):
    if total_bytes > 1000000:
        return get_object_range(s3, total_bytes)
    return s3.get_object(Bucket='blah-test1', Key='blah.txt')['Body'].read()


def get_object_range(s3, total_bytes):
    offset = 0
    while total_bytes > 0:
        end = offset + 999999 if total_bytes > 1000000 else ""
        total_bytes -= 1000000
        byte_range = 'bytes={offset}-{end}'.format(offset=offset, end=end)
        offset = end + 1 if not isinstance(end, str) else None
        yield s3.get_object(Bucket='blah-test1', Key='blah.txt', Range=byte_range)['Body'].read()


@app.route('/blah', methods=['GET'])
def index():
    s3 = get_client()
    total_bytes = get_total_bytes(s3)

    return Response(
        get_object(s3, total_bytes),
        mimetype='text/plain',
        headers={"Content-Disposition": "attachment;filename=test.txt"}
    )


if __name__ == '__main__':
    app.run(debug=True, port=8800)

This will download the file in 1MB chunks and send them to the user as they are downloaded. Both of these have been tested with a 40MB .txt file.
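
The chunking logic can also be separated from the S3 call, which makes it easier to test. This is a sketch under the same assumptions as the code above (a boto3 S3 client whose get_object accepts a Range argument); the names byte_ranges and iter_object are hypothetical:

```python
def byte_ranges(total_bytes, chunk_size=1000000):
    """Yield HTTP Range header values that cover total_bytes in order."""
    for start in range(0, total_bytes, chunk_size):
        # Range ends are inclusive, hence the -1.
        end = min(start + chunk_size, total_bytes) - 1
        yield 'bytes={}-{}'.format(start, end)


def iter_object(s3, bucket, key, total_bytes, chunk_size=1000000):
    """Yield the object's bytes, one ranged get_object call at a time.

    `s3` is assumed to be a boto3 S3 client (or any object with the same
    get_object signature).
    """
    for byte_range in byte_ranges(total_bytes, chunk_size):
        response = s3.get_object(Bucket=bucket, Key=key, Range=byte_range)
        yield response['Body'].read()
```

For a 2,500,000-byte object, byte_ranges produces 'bytes=0-999999', 'bytes=1000000-1999999' and 'bytes=2000000-2499999'. Passing the iter_object generator to Response streams each chunk to the user as it arrives.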

Westland answered 4/4, 2017 at 22:26 Comment(7)
thank you so much for this detailed answer! this has been extremely helpful and i was able to solve my problem using this code, with a few slight modifications :) - Thorlie
What happens when the client cancels the download? - Carlstrom
I haven't tried any of this myself, but check out this answer - Westland
@AllieFitter what is basestring in the get_object_range function? - Seamaid
A Python 2 relic, so it should use str instead. I haven't run this code in three years, so I'm not sure what else in it isn't compatible with Python 3. - Westland
This won't work for other file types, say zip - Moshe
@AllieFitter Can you please explain why to use get_client() function not just make one instance in globals and use it by all functions? - Logger

A better way to solve this problem is to create a presigned URL. This gives you a temporary URL that is valid for a limited amount of time. It also removes your Flask server as a proxy between the user and the S3 bucket, which reduces download time for the user.

import boto3


def get_attachment_url():
    bucket = 'BUCKET_NAME'
    key = 'FILE_KEY'

    # YOUR_AWS_ACCESS_KEY and YOUR_AWS_SECRET_KEY are placeholders for your credentials.
    client = boto3.client(
        's3',
        aws_access_key_id=YOUR_AWS_ACCESS_KEY,
        aws_secret_access_key=YOUR_AWS_SECRET_KEY
    )

    # The URL stops working after ExpiresIn seconds.
    return client.generate_presigned_url(
        'get_object',
        Params={'Bucket': bucket, 'Key': key},
        ExpiresIn=60
    )
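
With a presigned URL, the browser can still be told to save the file rather than display it by adding ResponseContentDisposition to Params, which S3's get_object accepts as a response-header override. A sketch, assuming client is a boto3 S3 client; the function name get_download_url and its filename parameter are illustrative, not part of the answer above:

```python
def get_download_url(client, bucket, key, filename, expires_in=60):
    """Return a presigned GET URL that asks the browser to download `filename`.

    `client` is assumed to be a boto3 S3 client; `filename` is assumed to
    contain no double quotes.
    """
    disposition = 'attachment; filename="{}"'.format(filename)
    return client.generate_presigned_url(
        'get_object',
        Params={
            'Bucket': bucket,
            'Key': key,
            # Overrides the Content-Disposition header on S3's response.
            'ResponseContentDisposition': disposition,
        },
        ExpiresIn=expires_in,
    )
```

The Flask endpoint could return this URL to the React app, or redirect to it, so the download goes straight from S3 to the user's browser.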
Fete answered 15/4, 2020 at 8:33 Comment(1)
This is a great answer. I've upvoted it. However, one thing to note here is if you've used SSE-C (server-side encryption using customer provided key), then it would not work. - Jock
