Store Photos in Blobstore or as Blobs in Datastore - Which is better/more efficient/cheaper?
I have an app where each DataStore Entity of a specific kind can have a number of photos associated with it. (Imagine a car sales website - one Car has multiple photos)

Originally, since all the data is sourced from another site, I was limited to storing the photos as DataStore Blobs. Now that it's possible to write BlobStore items programmatically, I'm wondering if I should change my design and store the photos as BlobStore items instead.

So, the question is:
Is it 'better' to store the photos in the Blobstore, or as Blobs in the Datastore? Both are possible solutions, but which would be the better/cheaper/most efficient approach, and why?

Oath answered 20/2, 2012 at 13:34 Comment(0)

Images served from BlobStore have several advantages over Datastore:

  1. Images are served directly from BlobStore, so the request does not go through a GAE frontend instance. This saves frontend instance time and hence cost. (But see Update 2 below.)

  2. BlobStore storage costs roughly half as much as Datastore storage ($0.13 vs $0.24 per GB per month). With Datastore you'd additionally pay for each get() or query().

  3. BlobStore automatically uses Google's cache service, so the only cost is the cost of bandwidth ($0.12/GB). You can get the same effect on a frontend instance by setting cache control headers, but BlobStore does it automatically.

  4. Images in BlobStore can be served via ImageService and can be transformed on the fly, e.g. creating thumbnails. Transformed images are also automatically cached.

  5. Binary blobs in Datastore are limited to 1 MB in size.
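
To put the storage price difference from point 2 into numbers, here is a minimal sketch. It uses the two per-GB prices quoted above and assumes they are monthly rates. The class name, method name and the 50 GB collection size are made up for illustration, and the sketch ignores Datastore operation costs and bandwidth.

```java
import java.math.BigDecimal;

class StorageCost {
    // Per-GB storage prices quoted in the answer (2012 figures),
    // assumed here to be monthly rates.
    static final BigDecimal BLOBSTORE_PER_GB = new BigDecimal("0.13");
    static final BigDecimal DATASTORE_PER_GB = new BigDecimal("0.24");

    // Storage cost only: price per GB multiplied by the number of GB stored.
    static BigDecimal monthlyCost(BigDecimal pricePerGb, long gigabytes) {
        return pricePerGb.multiply(BigDecimal.valueOf(gigabytes));
    }

    public static void main(String[] args) {
        long gb = 50; // hypothetical size of the photo collection
        System.out.println("Blobstore: $" + monthlyCost(BLOBSTORE_PER_GB, gb));
        System.out.println("Datastore: $" + monthlyCost(DATASTORE_PER_GB, gb));
    }
}
```

BigDecimal is used instead of double so that the currency amounts stay exact.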

One downside of BlobStore is that it has no access controls: anybody with a URL to a blob can download it. If you need an ACL (Access Control List), take a look at Google Cloud Storage.

Update:

Cost-wise, the biggest saving will come from properly caching the images:

  1. Every image should have a permanent URL.
  2. Every image URL should be served with proper cache control HTTP headers:

    // 32M seconds is a bit more than one year 
    Cache-Control: max-age=32000000, must-revalidate
    

You can do this in Java via:

    httpResponse.setHeader("Cache-Control", "max-age=32000000, must-revalidate");
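
If you would rather not hard-code the number of seconds, the header value can be derived from a duration. The following is only a sketch: the class and method names are hypothetical, and it just builds the string that you would then pass to setHeader as shown above.

```java
import java.time.Duration;

class CacheHeaders {
    // Builds a Cache-Control value in the same form as the header above.
    static String longLived(Duration maxAge) {
        return "max-age=" + maxAge.getSeconds() + ", must-revalidate";
    }

    public static void main(String[] args) {
        // One year of 365 days expressed in seconds.
        System.out.println(longLived(Duration.ofDays(365)));
        // Sanity check on the figure used above: 32,000,000 seconds in whole days.
        System.out.println(Duration.ofSeconds(32_000_000).toDays() + " days");
    }
}
```

The second line of output confirms the comment above: 32,000,000 seconds is 370 whole days, slightly more than a year.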

Update 2:

As Dan correctly points out in the comments, BlobStore data is served via a frontend instance, so access controls can be implemented by user code.
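
The idea can be sketched without the App Engine SDK. The app only adds the blob key header (X-AppEngine-BlobKey, as described in the comments below) when the user is allowed to see the image; otherwise it never reveals the key. Everything else here is hypothetical: the class name, the method, and the simple set of allowed users stand in for whatever access check your app really uses. In a real servlet you would normally let the SDK's BlobstoreService.serve(blobKey, response) set the header for you rather than doing it by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class BlobAccess {
    // Header name taken from the comment discussion; App Engine intercepts
    // the response and replaces the body with the blob's content.
    static final String BLOB_KEY_HEADER = "X-AppEngine-BlobKey";

    // Returns the headers the app would add to its response.
    // An empty map means "do not serve" (the caller would send e.g. a 403).
    static Map<String, String> responseHeaders(String user, String blobKey,
                                               Set<String> allowedUsers) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (user != null && allowedUsers.contains(user)) {
            headers.put(BLOB_KEY_HEADER, blobKey);
        }
        return headers;
    }

    public static void main(String[] args) {
        Set<String> allowed = Set.of("alice"); // hypothetical access list
        System.out.println(responseHeaders("alice", "some-blob-key", allowed));
        System.out.println(responseHeaders("mallory", "some-blob-key", allowed));
    }
}
```

Because the decision happens in app code before the header is set, any access rule can be plugged in at that point.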

Renege answered 20/2, 2012 at 14:6 Comment(11)
Thanks - a great answer! I think I'll look at my costs and see whether the reduced cost of using BlobStore justifies the programming effort of moving images out of the DataStore. - Oath
Since you already have images in Datastore, you might consider just adding proper cache control headers. App Engine uses its own edge caches, so requests for properly cached content never reach your instance. - Renege
That'll save me a load of code! Additional header it is then :) - Oath
Good answer Peter, but please reconsider the access control caveat. Blobstore values are served at application-managed URLs, not at their own public URLs. The user request goes to the app, and the app serves the value by adding the Blobstore key to the response header, which is intercepted by the frontend. So the app has every opportunity to decide not to serve the value to a given user. - Highway
@Dan: True, one can read data from BlobStore and serve it via a frontend instance - this would give the app full control over serving images, but would come at a cost. - Renege
@Peter: Uh, no, actually, the way you serve a value out of the Blobstore is to accept a request to the app, then respond with the X-AppEngine-BlobKey header with the key. App Engine intercepts the outgoing response and replaces the body with the Blobstore value streamed directly from the service. Because app logic sets the header in the first place, the app can implement any access control it wants. There is no default URL that serves values directly out of the Blobstore without app intervention. Maybe you're thinking of something else? - Highway
@Dan, you are correct. Blobstore requests go via a frontend instance. I mixed it up with ImageService, which produces a redirect; the final URL is on Google's CDN and can be accessed without a frontend instance. - Renege
Hey, great info here, thanks. Related question: why might my Blobstore files not get cached (the server returns 200 instead of a 304 response)? I have added the Cache-Control: max-age=32000000, must-revalidate header, and when analyzing the response it is there! (I'm serving .unity3d files with the application/vnd.unity Content-Type.) - Festination
@Manuel: If you change the same resource (URL) with new content, then you must either use max-age (and respect that on the client) or use ETag (which indicates the version of the content and also needs to be handled properly on the client). See: en.wikipedia.org/wiki/HTTP_ETag - Renege
@DanSanderson I have asked a follow-up question at #15552296. Will you please offer some advice? - Twinkling
Peter, perhaps it's worth taking out the "Images are served directly from BlobStore, so request does not go through GAE frontend instance" point, given that the request does indeed first go to the app, as confirmed by the discussion with Dan. - Pointsman
