Should I persist images on EBS or S3?

I am migrating my Java, Tomcat, and MySQL server to AWS EC2.

I have already attached an EBS volume for storing MySQL data. In my web application, people may upload images, so I need to persist them. I have two alternatives in mind:

  1. Save uploaded images to EBS volume.
  2. Use the S3 service.

The following are my notes. Please be skeptical about them, as my expertise is in software development, not servers.

  • EBS plus: S3 storage is more expensive ($0.15/GB > $0.10/GB).

  • S3 plus: Serving static files from EBS may negatively affect my web server's performance. Is this true? Does serving images affect server performance notably? With S3, my server will not be responsible for serving static files.

  • S3 plus: Serving static files from EBS may incur I/O costs, though they will probably be minor.

  • EBS plus: People say EBS is faster.

  • S3 plus: People say S3 is safer for persistence.

  • EBS plus: No need to learn an API; it is straightforward to save the images to an EBS volume.

In short, I cannot decide, and I would be happy for any guidance.

Thanks

Areopagite answered 18/2, 2010 at 12:5 Comment(1)
S3 is now down to $0.14 per GB-month according to aws.amazon.com/s3/pricing - Lesterlesya
E
50

I'm currently using S3 for a project and it's working extremely well.

EBS means you need to manage a volume + machines to attach it to. You need to add space as it's filling up and perform backups (not saying you shouldn't back up your S3 data, just that it's not as critical).

It also makes it harder to scale: when you want to add more machines, you either need to move the images to a separate machine or clone them across all machines. This also adds a bottleneck: you'll have to manage your own upload process, which will either upload to every machine or have a single machine handle all uploads.

I recommend S3: it's set and forget. Any number of machines can be performing uploads in parallel and you don't really need to notify other machines about the upload.

In addition, you can use Amazon CloudFront as a cheap CDN in front of the images instead of serving them directly from S3.
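To illustrate why parallel uploads need no coordination, here is a minimal sketch in Java. Everything in it is hypothetical: `ImageStore` and `InMemoryImageStore` are made-up names, and the in-memory map only stands in for a shared object store such as S3 (a real implementation would call the AWS SDK, which is not shown here). The idea is that each upload gets its own random key, so any machine can write without notifying the others, and the application only has to remember the returned key.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction: the web app depends on this interface only,
// so the backing store (local disk, S3, ...) can be swapped later.
interface ImageStore {
    /** Stores the bytes and returns the key to save alongside the image's metadata. */
    String save(byte[] data, String extension);

    /** Returns the stored bytes, or null if the key is unknown. */
    byte[] load(String key);
}

// In-memory stand-in for a shared object store; for illustration only.
final class InMemoryImageStore implements ImageStore {
    private final Map<String, byte[]> objects = new ConcurrentHashMap<>();

    @Override
    public String save(byte[] data, String extension) {
        // A random UUID per object means concurrent uploaders never
        // have to agree on names or tell each other about new files.
        String key = "images/" + UUID.randomUUID() + "." + extension;
        objects.put(key, data.clone());
        return key;
    }

    @Override
    public byte[] load(String key) {
        byte[] data = objects.get(key);
        return data == null ? null : data.clone();
    }
}
```

With this shape, switching from an EBS-backed directory to S3 later would mean writing one more implementation of the interface rather than touching the upload code.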

Ecotype answered 18/2, 2010 at 14:3 Comment(1)
+1 for S3+CloudFront. I'm using it for serving Flash movies for one of our properties and it works very well. - Shitty
A
54

The price comparison is not quite right: S3 charges are $0.14 per GB USED, whereas EBS charges are $0.10 per GB PROVISIONED (the size of your EBS volume), whether you use it or not. As a result, S3 may or may not be cheaper than EBS.
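The comparison can be worked through with a small sketch. The two prices are the ones quoted in this thread ($0.14 per GB used for S3, $0.10 per GB provisioned for EBS) and will be out of date; the class and method names are made up for illustration, and request, transfer, and I/O charges are ignored.

```java
// Rough monthly storage cost comparison using the prices quoted in this thread.
// Assumption: only storage is billed; requests, transfer, and I/O are ignored.
final class StorageCost {
    static final double S3_PER_GB_USED = 0.14;         // USD per GB-month actually stored
    static final double EBS_PER_GB_PROVISIONED = 0.10; // USD per GB-month of volume size

    static double s3Monthly(double usedGb) {
        return usedGb * S3_PER_GB_USED;
    }

    static double ebsMonthly(double provisionedGb) {
        return provisionedGb * EBS_PER_GB_PROVISIONED;
    }

    /** True when storing usedGb in S3 costs less than an EBS volume of provisionedGb. */
    static boolean s3IsCheaper(double usedGb, double provisionedGb) {
        return s3Monthly(usedGb) < ebsMonthly(provisionedGb);
    }

    public static void main(String[] args) {
        // Example: 50 GB of images sitting on a 100 GB volume.
        System.out.printf("S3: $%.2f, EBS: $%.2f%n", s3Monthly(50), ebsMonthly(100));
    }
}
```

At these prices the break-even point is 0.10 / 0.14, or about 71% utilization: an EBS volume that is less than roughly 71% full costs more than keeping the same data in S3.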

Adkinson answered 1/2, 2011 at 18:58 Comment(3)
Very good point. Probably should be a comment, not an answer, though. :) +1 anyway. - Habsburg
Seems a valid answer. - Gauhati
I prefer this as an answer (not as a comment), since the question is about deciding between S3 and EBS. Users will need to consider this point as a priority when going through the benefits of both. - Carbaugh
S
13

I have architected solutions on AWS for stock photography sites that store millions of images spanning terabytes of data. I would like to share some AWS best practices for your requirement:

P1) Store the original image file in S3 using the Standard storage option.

P2) Store reproducible images, such as thumbnails, in S3 using the Reduced Redundancy Storage (RRS) option to save costs.

P3) Metadata about images, including the S3 URL, can be stored in Amazon RDS or Amazon DynamoDB, depending on the query complexity. Query the entries from Amazon RDS. If your queries are complex, it is also common practice to store the metadata in Amazon CloudSearch or Apache Solr.

P4) Deliver your thumbnails to users with low latency using Amazon CloudFront.

P5) Queue your image conversions through either SQS or RabbitMQ on Amazon EC2.

P6) If you are planning to use EBS, note that EBS volumes do not scale with your EC2 instances. Ideally, you can use GlusterFS as a common storage pool for all your images. Multiple Amazon EC2 instances in an Auto Scaling group can still connect to it and read/write images.
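As a rough illustration of P1, P2, and P5, here is a small Java sketch. All names are hypothetical, the `StorageTier` enum only mirrors the two S3 options mentioned above, and the in-process `BlockingQueue` merely stands in for SQS or RabbitMQ; nothing here calls AWS.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical labels for the two S3 storage options mentioned in P1 and P2.
enum StorageTier { STANDARD, REDUCED_REDUNDANCY }

final class ImagePipeline {
    // Local stand-in for SQS/RabbitMQ: holds keys of originals awaiting conversion.
    private final BlockingQueue<String> conversionQueue = new LinkedBlockingQueue<>();

    /** Originals cannot be regenerated, so they get the more durable tier. */
    static StorageTier tierFor(boolean reproducible) {
        return reproducible ? StorageTier.REDUCED_REDUNDANCY : StorageTier.STANDARD;
    }

    /** Called by the web tier after an original has been stored. */
    void onOriginalUploaded(String originalKey) {
        conversionQueue.add(originalKey);
    }

    /**
     * Called by a worker: takes the next original and returns the key its
     * thumbnail would be stored under, or null if nothing is queued.
     * The actual resizing is omitted.
     */
    String convertNext() {
        String key = conversionQueue.poll();
        return key == null ? null : "thumbs/" + key;
    }
}
```

The point of the queue is that uploads return quickly while thumbnail generation happens on separate workers, which can be scaled independently of the web servers.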

Sundry answered 29/5, 2013 at 7:14 Comment(0)
S
8

You already outlined the advantages and disadvantages of both.

If you are planning to store terabytes of images, with storage requirements increasing day after day, S3 will probably be your best bet as it is built especially for these kinds of situations. You get unlimited storage space, without having to worry about sharding your data over many EBS volumes.

The recurring cost of S3 is about 50% higher than that of EBS. You will also have to learn the API and integrate it into your application, but that is a one-off expense which I think you should be able to absorb very quickly.

Shortstop answered 18/2, 2010 at 12:30 Comment(2)
Yes, learning the API is not a problem. What I want to clarify is whether serving static files from outside the server (namely, letting S3 serve them) will have a positive impact on my RAM usage. I predict that memory (RAM) will be the bottleneck for my server. - Areopagite
Yes. Serving from S3 will free your EC2 instance from this responsibility, so it will absolutely save some CPU and RAM resources. How much depends on the traffic you expect. You may be interested in checking out the following Coding Horror blog post on this topic: codinghorror.com/blog/2007/03/… - Shortstop
D
6

Do you expect the images to last indefinitely?

The Amazon EBS FAQ is pretty clear; the annual failure rate is not "essentially zero"; they quote 0.1% to 0.5%. It's better than the disk under your desk, but it would need some kind of backup.
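To put those percentages in perspective, here is a back-of-the-envelope sketch in Java. It assumes a single volume, a constant annual failure rate, independence from year to year, and no snapshots or backups; these are simplifications of mine, not something the FAQ states. The class and method names are made up.

```java
// Chance that a single volume fails at least once over a number of years,
// assuming a constant, independent annual failure rate (a simplification).
final class VolumeDurability {
    static double lossProbability(double annualFailureRate, int years) {
        if (annualFailureRate < 0 || annualFailureRate > 1 || years < 0) {
            throw new IllegalArgumentException("rate must be in [0, 1] and years >= 0");
        }
        // P(at least one failure) = 1 - P(no failure in any year)
        return 1.0 - Math.pow(1.0 - annualFailureRate, years);
    }

    public static void main(String[] args) {
        // The two ends of the range quoted above: 0.1% and 0.5% per year.
        for (double afr : new double[] {0.001, 0.005}) {
            System.out.printf("AFR %.1f%%: 5 years -> %.2f%%, 10 years -> %.2f%%%n",
                    afr * 100,
                    lossProbability(afr, 5) * 100,
                    lossProbability(afr, 10) * 100);
        }
    }
}
```

Under these assumptions, a volume at the 0.5% end of the range has roughly a 2.5% chance of failing at some point over five years, which is why images meant to last indefinitely need a backup if they live on EBS.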

Dalmatia answered 8/3, 2011 at 22:31 Comment(0)
