Streaming Amazon S3 Objects From a Web Server Using Laravel

In my web application built with Laravel 5.1, users can upload sensitive files that I store in Amazon S3. Later, I want users WITH PERMISSION to download these files. Since I want this auth check in place, I cannot serve the files the traditional way by giving users a direct link to the file in S3.

My approach:

  1. When a user requests a download, my server downloads the file locally and then streams it to the user. Issue: this takes a long time because the files are sometimes very large.

  2. Give the user a pre-signed URL to download directly from S3. The URL is only valid for 5 minutes. Issue: if that URL is shared, anyone can download the file within those 5 minutes.

  3. According to this article, stream the data directly from S3 to the client. This looks promising, but I don't know how to implement it.

According to this article, I need to:

  1. Register the stream wrapper - which is my first problem, because I don't know how to get hold of the S3Client object: Laravel uses Flysystem, and I don't know what methods to call to get this object (one possible route is sketched after this list). Maybe I need to include the S3 package separately in my composer.json?
  2. Disable output buffering - do I need to do this in Laravel, or has Laravel already taken care of it?
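
A minimal sketch of one route I am considering for the first point, assuming the league/flysystem-aws-s3-v3 adapter that the s3 driver uses in Laravel 5.1 (I have not verified this works):

    use Illuminate\Support\Facades\Storage;

    // Laravel's FilesystemAdapter wraps a Flysystem Filesystem, whose
    // AwsS3Adapter exposes the raw S3Client via getClient().
    $disk = Storage::disk('s3');
    $client = $disk->getDriver()->getAdapter()->getClient(); // Aws\S3\S3Client
    $client->registerStreamWrapper(); // enables the s3:// stream wrapper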

I am sure other developers have run into this before, and I would appreciate some pointers. If anybody has already streamed directly from S3 to the client using Laravel's Response::download($pathToFile, $name, $headers), I would love to hear your method.

Bosnia answered 28/7, 2015 at 19:19 Comment(9)
That doesn't stream from S3->user, it streams S3->Laravel->user. Your server is still in the loop, so you're incurring bandwidth and missing most of the benefits of serving straight off S3.Refutative
@Refutative I understand, but what choice do I have? I cannot give direct access to the files in S3. Plus, I would have to make the files publicly visible for others to download them.Bosnia
You might be able to cobble something together with temporary IAM credentials that limits use of the signed URL to a particular IP.Refutative
@Refutative Thanks for the link. But if my users were to be granted IAM roles, do they have to be part of AWS/Amazon or have to sign in?Bosnia
My understanding is that you'd have an IAM user called something like "temporary", with absolutely minimal permissions. You'd dole out temporary access tokens to it to your users, which they'd be able to use only to access particular signed URLs with particular IPs.Refutative
That said, it sounds like a bit of a nightmare to manage. I'd just make signed URLs with like ~10 second expirations - plenty of time to follow a redirect, but not much time to share it around. If they really want to share it around they're going to put it on The Pirate Bay anyways.Refutative
@Refutative Let's say I go with signed URLs with an expiration of 10 seconds. Does that mean the user has to "finish" downloading within 10 seconds? Will this method work for, let's say, a 2 GB file?Bosnia
As long as the request starts, it will be permitted to complete. BTW, if you put CloudFront in front of S3, your signed URLs can be limited to IP addresses. See docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/… - "IpAddress":{"AWS:SourceIp":"optional IP address"} - which will add some additional security to your scheme.Refutative
@Refutative This seems like a viable solution knowing the fact that "started requests will be allowed to finish". Thanks. :)Bosnia

From the discussion in the comments, I have arrived at some key points that I would like to share.

Pre-Signed URLs

As @ceejayoz pointed out, pre-signed URLs are not a bad idea (a short sketch follows this list) because:

  1. I can keep the expiry as low as 10 seconds, which is enough to follow any redirects and start the download, but not enough for the link to be shared.
  2. My previous understanding was that the download had to finish within the given time, so if the link expired in 10 seconds, the download would have to complete before that. But @ceejayoz pointed out that is not the case: a download that has already started is allowed to finish.
  3. With CloudFront, I can also restrict by IP address for added security.
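
A minimal sketch of generating such a short-lived pre-signed URL with the AWS SDK for PHP v3 (the bucket, key, region and credential setup here are placeholders, not from my actual app):

    use Aws\S3\S3Client;

    // Build an S3 client; region and credentials below are placeholders.
    $client = new S3Client([
        'version' => 'latest',
        'region'  => 'us-east-1',
    ]);

    // Create a GetObject command for the protected file...
    $command = $client->getCommand('GetObject', [
        'Bucket' => 'mybucket',
        'Key'    => 'path/to/sensitive-file.pdf',
    ]);

    // ...and sign it with a very short expiry. The link stops working after
    // 10 seconds, but a download that has already started may run to completion.
    $request = $client->createPresignedRequest($command, '+10 seconds');
    $signedUrl = (string) $request->getUri();

    // In Laravel, redirect the authorised user straight to S3.
    return redirect()->away($signedUrl);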


IAM Roles

He also pointed out another, not-so-great method - creating temporary IAM users. This is a maintenance nightmare if not done correctly, so only do it if you know what you are doing.
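
For completeness, a rough sketch of what that might look like with the Security Token Service; I have not used this approach, and the names and policy below are purely illustrative:

    use Aws\Sts\StsClient;

    // Mint short-lived credentials scoped to a single object.
    $sts = new StsClient([
        'version' => 'latest',
        'region'  => 'us-east-1',
    ]);

    $result = $sts->getFederationToken([
        'Name'            => 'temp-download-user',   // hypothetical name
        'DurationSeconds' => 900,                     // minimum allowed duration
        'Policy'          => json_encode([
            'Version'   => '2012-10-17',
            'Statement' => [[
                'Effect'   => 'Allow',
                'Action'   => 's3:GetObject',
                'Resource' => 'arn:aws:s3:::mybucket/path/to/sensitive-file.pdf',
            ]],
        ]),
    ]);

    // $result['Credentials'] now holds an AccessKeyId / SecretAccessKey /
    // SessionToken pair that can only read that one object until it expires.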


Stream From S3

This is the method that I have chosen for now. Maybe later I will move to the first method.

Warning: If you stream, your server is still the middleman and all the data goes through it. So if your server fails or is slow, the download will be slow.

My first question was how to register the stream wrapper:

Since I am using Laravel, and Laravel uses Flysystem for S3 management, there was no easy way for me to get the S3Client. Hence I added the additional package AWS SDK for Laravel to my composer.json:

"aws/aws-sdk-php-laravel" : "~3.0"

Then I wrote my code as follows:

use Aws\Laravel\AwsFacade as AWS;
use Aws\S3\Exception\S3Exception;
use Illuminate\Contracts\Bus\SelfHandling;

class FileDelivery extends Command implements SelfHandling
{
    private $client;
    private $remoteFile;
    private $bucket;

    public function __construct($remoteFile)
    {
        // Create a plain S3Client through the AWS facade (aws-sdk-php-laravel)
        // and register the s3:// stream wrapper so readfile() can read from S3.
        $this->client = AWS::createClient('s3');
        $this->client->registerStreamWrapper();
        $this->bucket = 'mybucket';
        $this->remoteFile = $remoteFile;
    }

    public function handle()
    {
        try
        {
            // First get the meta-data of the object.
            // headObject() throws an S3Exception if the object does not exist
            // or is not accessible, so no manual status-code check is needed.
            $headers = $this->client->headObject(array(
                'Bucket' => $this->bucket,
                'Key' => $this->remoteFile
            ));

            $headers = $headers['@metadata'];
        }
        catch(S3Exception $e)
        {
            return 404;
        }

        // Send the appropriate headers before the stream starts.
        http_response_code($headers['statusCode']);
        header("Last-Modified: {$headers['headers']['last-modified']}");
        header("ETag: {$headers['headers']['etag']}");
        header("Content-Type: {$headers['headers']['content-type']}");
        header("Content-Length: {$headers['headers']['content-length']}");

        $filename = basename($this->remoteFile);
        header("Content-Disposition: attachment; filename=\"{$filename}\"");

        // Since file sizes can be very large, output buffers cannot hold
        // the whole response in memory. Disable buffering before the stream
        // starts and flush anything still pending.
        if(ob_get_level())
        {
            ob_end_flush();
        }
        flush();

        // Start the stream: readfile() pulls the object through the s3://
        // wrapper in chunks and writes it straight to the output.
        readfile("s3://{$this->bucket}/{$this->remoteFile}");
    }
}
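
For completeness, here is roughly how I invoke it from a controller after the permission check. The route, model and ownership check below are illustrative placeholders, not from my actual app:

    use Illuminate\Routing\Controller;
    use Illuminate\Foundation\Bus\DispatchesJobs;
    use Illuminate\Support\Facades\Auth;

    class DownloadController extends Controller
    {
        use DispatchesJobs;

        public function download($fileId)
        {
            // Hypothetical lookup + permission check; replace with your own logic.
            $file = UserFile::findOrFail($fileId);
            if ($file->user_id !== Auth::id()) {
                abort(403);
            }

            // Dispatch the self-handling command; it writes the headers and
            // streams the S3 object straight to the client.
            return $this->dispatch(new FileDelivery($file->s3_key));
        }
    }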

My second question was: do I need to disable output buffering in Laravel?

The answer IMHO is yes. Disabling output buffering lets the data be flushed to the client immediately instead of accumulating in a buffer, resulting in lower memory consumption. Since we are not using any Laravel function to send the data to the client, Laravel does not do this for us, so it has to be done by us.
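
One small refinement I would consider (not in my code above): PHP can have several nested output buffers open at once (for example when a templating layer has started its own), so ending a single level may not be enough. A loop handles that:

    // Flush and close every nested output buffer before streaming.
    while (ob_get_level() > 0) {
        ob_end_flush();
    }
    flush();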

Bosnia answered 28/7, 2015 at 23:32 Comment(0)
