Cloud Storage and secure download strategy on App Engine: GCS ACL or Blobstore?

My App Engine app creates Cloud Storage files. The files will be downloaded by a third party, and they contain personal medical information.

What would be the preferred way of downloading:

  1. Using a direct GCS download link with a user READER ACL.
  2. Or using a Blobstore download handler in an App Engine app.

Both solutions require the third party to log in (Google login). Performance is not an issue; privacy and avoiding security errors and mistakes are.

Downloading an encrypted zip file is also an option, but then I have to store the password in the project. Or should I e-mail a random password?

Update: the App Engine code I used to create a signed download URL:

import time
import urllib
from datetime import datetime, timedelta
from google.appengine.api import app_identity
import os
import base64

API_ACCESS_ENDPOINT = 'https://storage.googleapis.com'

# Use the default bucket in the cloud and not the local SDK one from app_identity
default_bucket = '%s.appspot.com' % os.environ['APPLICATION_ID'].split('~', 1)[1]
google_access_id = app_identity.get_service_account_name()


def sign_url(bucket_object, expires_after_seconds=60):
    """ cloudstorage signed url to download cloudstorage object without login
        Docs : https://cloud.google.com/storage/docs/access-control?hl=bg#Signed-URLs
        API : https://cloud.google.com/storage/docs/reference-methods?hl=bg#getobject
    """

    method = 'GET'
    gcs_filename = '/%s/%s' % (default_bucket, bucket_object)
    content_md5, content_type = None, None

    # Expiration as seconds since the Unix epoch (UTC). time.time() avoids the
    # local-timezone skew that time.mktime() would add to a UTC timetuple.
    expiration = int(time.time()) + expires_after_seconds

    # Generate the string to sign.
    signature_string = '\n'.join([
        method,
        content_md5 or '',
        content_type or '',
        str(expiration),
        gcs_filename])

    _, signature_bytes = app_identity.sign_blob(signature_string)
    signature = base64.b64encode(signature_bytes)

    # Set the right query parameters.
    query_params = {'GoogleAccessId': google_access_id,
                    'Expires': str(expiration),
                    'Signature': signature}

    # Return the download URL.
    return '{endpoint}{resource}?{querystring}'.format(endpoint=API_ACCESS_ENDPOINT,
                                                       resource=gcs_filename,
                                                       querystring=urllib.urlencode(query_params))
Hobson answered 24/4, 2015 at 12:38 Comment(0)

If a small number of users have access to all the files in the bucket, then solution #1 would be sufficient, as managing the ACL would not be too much of a pain.

However, if you have many different users who each require different access to the different files in the bucket, then solution #1 is impractical.

I'd avoid solution #2 as well, as you'd be paying for unnecessary incoming/outgoing GAE bandwidth.

A third solution to consider would be to have App Engine handle authentication, and to write logic that determines which users have access to which files. Then, when a file is requested for download, you create a Signed URL so the data is downloaded directly from GCS. You can set the expiration parameter to a value that works for you, which invalidates the URL after a set amount of time.
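
A minimal sketch of that flow, assuming a hypothetical in-memory access table (`ACCESS`) and a hypothetical helper name (`download_response`). The `sign_url` argument stands for any function that turns an object name and a lifetime in seconds into a signed URL; nothing here is an App Engine API, it only illustrates the authorize-then-sign order:

```python
# Hypothetical access table: object name -> set of e-mail addresses allowed to read it.
# In a real app this would live in the datastore, not in a module-level dict.
ACCESS = {
    'reports/patient-42.pdf': {'doctor@example.com'},
}


def download_response(user_email, bucket_object, sign_url, access=ACCESS,
                      expires_after_seconds=10):
    """Return (status, location) for a download request.

    sign_url is any callable (bucket_object, expires_after_seconds) -> URL.
    """
    if user_email is None:
        return 401, None  # not logged in
    if user_email not in access.get(bucket_object, set()):
        return 403, None  # logged in, but not allowed to read this object
    # Short-lived link: just long enough for the browser to follow the redirect.
    return 302, sign_url(bucket_object, expires_after_seconds)
```

In a request handler you would take `user_email` from the logged-in user, and on a 302 redirect the browser to the returned location. The signed URL is only created after the access check passes, so no GCS ACL has to be maintained per user.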

Intervocalic answered 24/4, 2015 at 13:47 Comment(4)
Thank you for the third solution. The first part of your answer is already in place. The signed URL with expiration is a good idea: no ACL administration and a direct download link. But anyone who knows the URL can access the resource for a limited time. - Hobson
The URL is obscure, and you could set the expiry to a few seconds: enough time to redirect your user and start the download. - Intervocalic
Yes. This makes it virtually impossible to hijack the link. - Hobson
I have updated my question with the Python code. In App Engine it is very easy, using app_identity to sign the signature string. - Hobson
