Benefits of EBS vs. instance-store (and vice-versa) [closed]

10

388

I'm unclear as to what benefits I get from EBS vs. instance-store for my instances on Amazon EC2. If anything, it seems that EBS is way more useful (stop, start, persist + better speed) at relatively little difference in cost...? Also, is there any metric as to whether more people are using EBS now that it's available, considering it is still relatively new?

Blackcap answered 2/9, 2010 at 19:29 Comment(6)
alestic.com/2012/01/ec2-ebs-boot-recommendedJamilla
also "micro" is only available if you are using EBS backed instances.Coriss
Instance Store volumes are much faster and are not network-based storage!Parol
I personally use the instance-store for dumping my up and running MongoDB collection into it and putting it on S3 for two reasons. First it's separated and it won't take down write speed on my 10-volume EBS RAID. Second is that it's way faster than EBS and since it comes with my instance there is no point for me to create extra EBS volumes to do the dumping and destroy them after putting them on S3. hope it helps and not constructive my a..Spill
I'm halfway through the AWS User Guide (700 pages). I have read carefully about EBS and Instance Storage. I still cannot understand why there are such differences. And I'm even more puzzled as to why Instance store is equivalent to S3 but is named differently. The question must be re-opened to receive more contributions and useful answers.Functionalism
@Functionalism Instance Store is the local disk(s) of the physical server and not the same thing as S3 or EBS. Instance Store is ephemeral, so anything in it is lost after the instance is restarted.Kalevala
299

The bottom line is you should almost always use EBS backed instances.

Here's why

  • EBS backed instances can be set so that they cannot be (accidentally) terminated through the API.
  • EBS backed instances can be stopped when you're not using them and resumed when you need them again (like pausing a Virtual PC), at least with my usage patterns saving much more money than I spend on a few dozen GB of EBS storage.
  • EBS backed instances don't lose their instance storage when they crash (not a requirement for all users, but makes recovery much faster)
  • You can dynamically resize EBS instance storage.
  • You can transfer the EBS instance storage to a brand new instance (useful if the hardware at Amazon you were running on gets flaky or dies, which does happen from time to time)
  • It is faster to launch an EBS backed instance because the image does not have to be fetched from S3.
  • If the hardware your EBS-backed instance runs on is scheduled for maintenance, stopping and starting the instance automatically migrates it to new hardware. I was also able to move an EBS-backed instance on failed hardware by force-stopping the instance and launching it again (your mileage may vary on failed hardware).
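
To put rough numbers on the stop/start savings mentioned above, here is a minimal sketch. The prices are made-up placeholders (not actual AWS rates) and `monthly_cost` is a hypothetical helper; it only assumes that compute is billed per running hour while EBS storage is billed for the whole month.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def monthly_cost(hourly_rate, hours_running, ebs_gb=0, ebs_rate_per_gb_month=0.0):
    """Estimate one month's bill: compute hours plus EBS storage.

    EBS storage is charged for the whole month even while the instance
    is stopped; compute is charged only for the hours it runs.
    """
    compute = hourly_rate * hours_running
    storage = ebs_gb * ebs_rate_per_gb_month
    return compute + storage


# Placeholder prices (assumptions for illustration, not real AWS rates).
HOURLY = 0.10        # USD per instance-hour
EBS_GB_MONTH = 0.10  # USD per GB-month

# Instance-store instance: it cannot be stopped, so it runs all month.
always_on = monthly_cost(HOURLY, HOURS_PER_MONTH)

# EBS-backed instance with a 30 GB volume, run 8 hours a day on 22 workdays.
part_time = monthly_cost(HOURLY, 8 * 22, ebs_gb=30, ebs_rate_per_gb_month=EBS_GB_MONTH)

print(f"always on: ${always_on:.2f}")
print(f"part time: ${part_time:.2f}")
```

With these placeholder numbers the storage charge is small next to the compute hours saved, but the balance depends entirely on your own rates and usage pattern.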

I'm a heavy user of Amazon and switched all of my instances to EBS backed storage as soon as the technology came out of beta. I've been very happy with the result.

EBS can still fail - not a silver bullet

Keep in mind that any piece of cloud-based infrastructure can fail at any time. Plan your infrastructure accordingly. While EBS-backed instances provide a certain level of durability compared to ephemeral storage instances, they can and do fail. Have an AMI from which you can launch new instances as needed in any availability zone, back up your important data (e.g. databases), and if your budget allows it, run multiple instances of servers for load balancing and redundancy (ideally in multiple availability zones).

When Not To

At some points in time, it may be cheaper to achieve faster IO on Instance Store instances. There was a time when it was certainly true. Now there are many options for EBS storage, catering to many needs. The options and their pricing evolve constantly as technology changes. If you have a significant amount of instances that are truly disposable (they don't affect your business much if they just go away), do the math on cost vs. performance. EBS-backed instances can also die at any point in time, but my practical experience is that EBS is more durable.

Axial answered 2/9, 2010 at 19:55 Comment(16)
Yes, the above were my thoughts as well... Hopefully someone here writes about their preferences for instance-store as a comparison...Blackcap
@HelloWorldy: The comparison is really "An instance store can't do..." and list the things an EBS store can. There's no real benefit other than possibly a small cost savings (that can be offset by the convenience of stopping/starting EBS backed instances).Axial
Instance store backed EC2 can also be set to not accidentally terminate.Lectionary
I'm actually switching most of my EBS backed EC2 instances to using instance stores. It really depends on what you want to achieve. I'm switching because of better IO and because I view each EC2 instance as disposable at all moments, or: it will break down any minute and I will lose everything that's on such an instance. Architecting that way helps to get a real HA system. See also stu.mp/2011/04/the-cloud-is-not-a-silver-bullet.htmlLectionary
@Jim: At least when I wrote the answer a year ago, you got much better IO by striping a number of EBS instances into a software RAID configuration than using instance storage. It's also much faster to launch a replacement instance from EBS backing than from S3 backing (instance storage is loaded from S3, which can be slow). I have not done much on AWS the last 6 months or so; things may have changed.Axial
If all your servers are using EBS, then they will all fail simultaneously if the EBS system itself fails, which has happened multiple times. EBS is highly convenient, but presents a systemic reliability risk.Dirge
@Leopd: Not true. The EBS system in a given availability zone is composed of numerous independent subsystems. A total failure of the entire EBS system is about as likely as a total failure of the availability zone it is in (which happens, and is a reason to mirror services across multiple zones). Case in point: The EBS outage just a few days ago. Of the 20 or so EBS volumes I have running in that zone, 2 were affected by the problem. Note though that the entire zone was impaired because AWS metered API calls to assist recovery. The zone was "lost" for a bit, but not every EBS volume.Axial
@EricJ Your point is good that depending on a single reliability zone is folly. And it's true that EBS is designed to avoid complete outages. But one would also be foolish to pretend that EBS drive failures are independent and uncorrelated the way they are for physical drives. Also remember that EBS is proprietary Amazon technology under constant improvement, and thus only as reliable as their software engineering practices (which history gives me reason to criticize). Other critical EC2 systems like routers and the VM's host OS are far better battle-tested.Dirge
@Leopd: Sure, but how does that invalidate my answer in any way? Instance store is far more likely to fail in an unrecoverable manner than an EBS backed instance. Historically, even when the EBS system has been impaired, the data has been recoverable after the system is back online. The same is not true for the loss of instance storage.Axial
@EricJ I disagree with your summary advice that EBS is almost always the right choice. If your architecture follows one of many common patterns which protect against random hardware failure, then you don't need EBS's increased robustness and are exposed to a new set of risks from systematic failure. Your answer implies there are no downsides to EBS, which is simply incorrect. There is always a tradeoff.Dirge
@Leopd: The vast majority of applications need to persist data in a manner that can survive an instance failure. Such applications are using something for that persistence. That something is almost always EBS. If the EBS persistent storage fails, very few use cases would be benefited by having the instance itself not be impacted. Sure, some apps can persist their state to S3, or a web service, or something else that does not depend on EBS. Very, very few apps do that in practice. Name a use case where EBS is not the right choice, all things considered, in your opinion.Axial
Saying that you can "dynamically" resize EBS storage is a stretch. This answer originally said such - I'm editing it to remove the word "dynamically". You have to go through a bunch of rigamarole with creating snapshots, making new bigger EBS volumes from the snapshots, yada yada yada. It's possible but i would not call it "dynamic." tomotvos.ca/cloud/how-to-resize-an-aws-volumeOrison
@DanPritts: All of that can be scripted. Indeed, we created a library of scripts to automate common AWS tasks. Still, it cannot be done online, so I agree with removing "dynamic" from the answer.Axial
Would you recommend using EBS backed instances, instead of instance store instances, when the instance is a state-less web server serving css, js and html filesNickelplate
Seems a little lop-sided -- though it's possible to run EBS-Backed instances and maintain a heavy emphasis on recyclability, I think that having newcomers looking at this post and subsequently creating EBS-Backed instances is dangerous because they likely will not maintain that same emphasis on recyclability, which is perhaps the most crucial component of any cloud infrastructure. And the good majority of people looking at this are sure to be new to this stuffFamily
@Accipheran: Good point. I have expanded my answer to stress that. Feel free to edit my edit if there are any additional points that you think are worth considering.Axial
70

99% of our AWS setup is recyclable. So for me it doesn't really matter if I terminate an instance -- nothing is lost ever. E.g. my application is automatically deployed on an instance from SVN, our logs are written to a central syslog server.

The only benefit of instance storage that I see are cost-savings. Otherwise EBS-backed instances win. Eric mentioned all the advantages.


[2012-07-16] I would phrase this answer a lot differently today.

I haven't had any good experience with EBS-backed instances in the past year or so. The last downtimes on AWS pretty much wrecked EBS as well.

I am guessing that a service like RDS uses some kind of EBS as well and that seems to work for the most part. On the instances we manage ourselves, we have got rid of EBS where possible.

We got rid of it to the extent that we moved a database cluster back to iron (= real hardware). The only remaining piece in our infrastructure is a DB server where we stripe multiple EBS volumes into a software RAID and back up twice a day. Whatever would be lost in between backups, we can live with.
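
For readers unfamiliar with striping, here is a minimal sketch of how consecutive chunks are spread across volumes so that I/O hits several EBS volumes in parallel. It assumes a plain RAID 0 layout with a fixed chunk size (the RAID level and chunk size we actually used are not stated above), and `locate_block` is a hypothetical name.

```python
def locate_block(logical_block, num_volumes, chunk_blocks):
    """Map a logical block number to (volume index, block offset on that volume)
    for a RAID 0 layout with `chunk_blocks` blocks per chunk."""
    chunk_index, offset_in_chunk = divmod(logical_block, chunk_blocks)
    stripe_index, volume = divmod(chunk_index, num_volumes)
    return volume, stripe_index * chunk_blocks + offset_in_chunk


# Four volumes with chunks of 2 blocks: blocks 0-1 land on volume 0,
# blocks 2-3 on volume 1, and so on, wrapping back to volume 0 at block 8.
layout = [locate_block(b, num_volumes=4, chunk_blocks=2) for b in range(10)]
for block, (volume, offset) in enumerate(layout):
    print(f"logical block {block} -> volume {volume}, offset {offset}")
```

Plain striping adds no redundancy: losing any one volume loses the whole array, which is why the regular backups matter.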

EBS is a somewhat flakey technology since it's essentially a network volume: a volume attached to your server from remote. I am not negating the work done with it – it is an amazing product since essentially unlimited persistent storage is just an API call away. But it's hardly fit for scenarios where I/O performance is key.

And in addition to how network storage behaves, all network is shared on EC2 instances. The smaller an instance (e.g. t1.micro, m1.small) the worse it gets because your network interfaces on the actual host system are shared among multiple VMs (= your EC2 instance) which run on top of it.

The larger instance you get, the better it gets of course. Better here means within reason.

When persistence is required, I would always advise people to use something like S3 to centralize between instances. S3 is a very stable service. Then automate your instance setup to a point where you can boot a new server and it gets ready by itself. Then there is no need to have network storage which lives longer than the instance.

So all in all, I see no benefit to EBS-backed instances whatsoever. I would rather add a minute to bootstrap than run with a potential SPOF.

Stinkstone answered 21/10, 2010 at 14:55 Comment(5)
Is there any significant improvement of IO performance with EBS IOPS-kind of volumes compared to standard? Supposing, the above said holds for EBS IOPS volumes, as well.Hutch
Both technologies evolve. I'm writing this comment in 2014, when I have "Provisioned IOPS" EBS, but - the "instance store" is now SSD, which is even faster than before!! Ephemeral storage will always win in terms of speed. So I use both - keep the "persistent" stuff on EBS, having all the temp files, logs, "TempDB" database, swap-file and other stuff on Instance-store. BENEFIT FROM BOTH!Demona
What if you needed a distributed database which needs to store its data in a distributed and persistent manner. Wouldn't you need EBS because instance storage is not persistent?Meta
@Meta Of course you do. There are a lot of options these days, e.g. AWS started offering SSD-based storage. I would look into those and re-do the analysis (single vs. RAID, etc.). I would also look into getting the biggest instances possible because of network throughput. EBS is still an issue on instances like t1.micro.Stinkstone
The part of this answer about network performance is fairly obsolete - for quite a while now, there have existed a variety of instances that can be "EBS-optimized" at a small extra cost, and some that are such by default (with no surcharge), which have dedicated network interfaces towards EBS, cf. docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.htmlOnwards
42

We like instance-store. It forces us to make our instances completely recyclable, and we can easily automate the process of building a server from scratch on a given AMI. This also means we can easily swap out AMIs. Also, EBS still has performance problems from time to time.

Mcgannon answered 24/11, 2011 at 14:39 Comment(2)
Netflix makes the same recommendations as well.Sweatt
So where do you store your block based persistent files?Meta
18

Eric pretty much nailed it. We (Bitnami) are a popular provider of free AMIs for popular applications and development frameworks (PHP, Joomla, Drupal, you get the idea). I can tell you that EBS-backed AMIs are significantly more popular than S3-backed. In general I think S3-backed instances are used for distributed, time-limited jobs (for example, large scale processing of data) where if one machine fails, another one is simply spun up. EBS-backed AMIs tend to be used for 'traditional' server tasks, such as web or database servers that keep state locally and thus require the data to be available in the case of crashing.

One aspect I did not see mentioned is the fact that you can take snapshots of an EBS-backed instance while running, effectively allowing you to have very cost-effective backups of your infrastructure (the snapshots are block-based and incremental)
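
To illustrate why block-based, incremental snapshots are cheap, here is a toy model. It is not how EBS works internally; it only shows the idea that each snapshot stores just the blocks that changed since the previous one. All names are hypothetical.

```python
def take_snapshot(volume, previous_full_state):
    """Return (delta, new_full_state); delta holds only the blocks that
    differ from the state captured by the previous snapshot."""
    delta = {
        index: data
        for index, data in enumerate(volume)
        if previous_full_state.get(index) != data
    }
    new_state = dict(previous_full_state)
    new_state.update(delta)
    return delta, new_state


def restore(deltas):
    """Rebuild the volume by replaying snapshot deltas, oldest first."""
    state = {}
    for delta in deltas:
        state.update(delta)
    return [state[i] for i in sorted(state)]


volume = ["os", "app", "db-v1", "logs"]
snap1, state = take_snapshot(volume, {})

volume[2] = "db-v2"  # only one block changes between the two snapshots
snap2, state = take_snapshot(volume, state)

print(len(snap1), "blocks stored in the first snapshot")
print(len(snap2), "block stored in the second snapshot")
print(restore([snap1, snap2]))
```

The first snapshot copies every block, while the second stores only the one that changed, yet replaying both reproduces the full volume.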

Genius answered 26/6, 2011 at 20:38 Comment(2)
S3 has in-built redundancy. EBS has none, so you'll need to deploy redundancy software on top of it.Didst
@Didst That is incorrect, per official documentation at docs.aws.amazon.com/AWSEC2/latest/UserGuide/raid-config.htmlOnwards
16

I've had the exact same experience as Eric at my last position. Now in my new job, I'm going through the same process I performed at my last job... rebuilding all their AMIs for EBS backed instances - and possibly as 32bit machines (cheaper - but can't use same AMI on 32 and 64 machines).

EBS backed instances launch quickly enough that you can begin to make use of the Amazon AutoScaling API which lets you use CloudWatch metrics to trigger the launch of additional instances and register them to the ELB (Elastic Load Balancer), and also to shut them down when no longer required.

This kind of dynamic autoscaling is what AWS is all about - where the real savings in IT infrastructure can come into play. It's pretty much impossible to do autoscaling right with the old s3 "InstanceStore"-backed instances.
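
The scaling rule itself boils down to comparing a metric against thresholds. Here is a rough sketch, assuming average CPU utilization as the CloudWatch metric and arbitrary threshold values. It models only the decision and does not call the Auto Scaling API; `desired_capacity` is a hypothetical helper.

```python
def desired_capacity(current, avg_cpu_percent, scale_out_at=70.0, scale_in_at=30.0,
                     min_size=1, max_size=10):
    """Return the new instance count for a simple threshold policy."""
    if avg_cpu_percent > scale_out_at:
        target = current + 1   # busy: launch one more instance
    elif avg_cpu_percent < scale_in_at:
        target = current - 1   # idle: shut one instance down
    else:
        target = current       # within the comfortable band: no change
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, target))


for cpu in (85.0, 50.0, 10.0):
    print(f"cpu {cpu:.0f}% with 3 instances -> {desired_capacity(3, cpu)}")
```

The faster a replacement instance boots, the tighter you can set thresholds like these, which is where the quick launch of EBS-backed instances helps.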

Inspan answered 31/12, 2010 at 0:25 Comment(0)
13

I'm just starting to use EC2 myself so not an expert, but Amazon's own documentation says:

we recommend that you use the local instance store for temporary data and, for data requiring a higher level of durability, we recommend using Amazon EBS volumes or backing up the data to Amazon S3.

Emphasis mine.

I do more data analysis than web hosting, so persistence doesn't matter as much to me as it might for a web site. Given the distinction made by Amazon itself, I wouldn't assume that EBS is right for everyone.

I'll try to remember to weigh in again after I've used both.

Generator answered 7/8, 2012 at 6:9 Comment(0)
10

EBS is like the virtual disk of a VM:

  • Durable, instances backed by EBS can be freely started and stopped (saving money)
  • Can be snapshotted at any point in time, to get point-in-time backups
  • AMIs can be created from EBS snapshots, so the EBS volume becomes a template for new systems

Instance storage is:

  • Local, so generally faster
  • Non-networked, in normal cases EBS I/O comes at the cost of network bandwidth (except for EBS-optimized instances, which have separate EBS bandwidth)
  • Has limited I/O operations per second (IOPS). Even provisioned I/O maxes out at a few thousand IOPS
  • Fragile. As soon as the instance is stopped, you lose everything in instance storage.

Here's where to use each:

  • Use EBS for the backing OS partition and permanent storage (DB data, critical logs, application config)
  • Use instance storage for in-process data, noncritical logs, and transient application state. Example: external sort storage, tempfiles, etc.
  • Instance storage can also be used for performance-critical data, when there's replication between instances (NoSQL DBs, distributed queue/message systems, and DBs with replication)
  • Use S3 for data shared between systems: input dataset and processed results, or for static data used by each system when launched.
  • Use AMIs for prebaked, launchable servers
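
The guidelines above can be condensed into a small decision helper. This is only a sketch of the rules of thumb in this answer; the function and parameter names are hypothetical.

```python
def choose_storage(must_survive_stop, shared_between_systems=False,
                   replicated_across_instances=False):
    """Pick a storage type following the rules of thumb listed above."""
    if shared_between_systems:
        return "S3"
    if replicated_across_instances:
        # Replication covers the loss of any single instance's local disk.
        return "instance store"
    if must_survive_stop:
        return "EBS"
    return "instance store"


examples = {
    "database files": choose_storage(must_survive_stop=True),
    "sort temp files": choose_storage(must_survive_stop=False),
    "input dataset": choose_storage(must_survive_stop=True, shared_between_systems=True),
    "replicated NoSQL data": choose_storage(must_survive_stop=True,
                                            replicated_across_instances=True),
}
for name, storage in examples.items():
    print(f"{name}: {storage}")
```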
Breadfruit answered 8/4, 2016 at 18:13 Comment(0)
5

Most people choose to use EBS-backed instances because they are stateful. It is safer because everything you have running and installed inside will survive a stop/start or any instance failure.

Instance store is stateless: you lose it, with all the data inside, in case of any instance failure. However, it is free and faster because the instance volume is tied to the physical server where the VM is running.

Antemundane answered 11/12, 2015 at 21:1 Comment(0)
3

For someone new to all this and if accidentally landed here

As of now all AMI's in quickstart section are EBS backed


Also there's a good explanation at official doc for difference between EBS and Instance store

& this image pretty much sums it up (image not preserved)

Heehaw answered 18/4, 2016 at 8:16 Comment(0)
0

If you run multiple instances and scheduling your AWS instances is one of your priorities for avoiding unexpected charges, I would recommend not using the instance store.

As explained in the documentation on EBS volumes and in the answers from j2d3 and Siddharth Sharma, an instance-store instance can run for as long as you want, but it cannot be stopped. This means that it cannot be scheduled with an automatic start/stop or instance recovery.

Moreover, for this kind of scheme there is also no benefit to using EBS-backed instances on Elastic Beanstalk, as it is designed to ensure that all the resources you need keep running. It will automatically relaunch any services that you stop. Reviewing all the rest, out of the total charges for using the VPC, EBS and ELB added to EC2-Classic, EC2-VPC with ELB is mostly the best choice because, unlike on EC2-Classic, a stopped instance retains its associated Elastic IP addresses and the EBS volume is stored automatically.

In conclusion, taking the main part of your question:

it seems that EBS is way more useful (stop, start, persist + better speed) at relatively little difference in cost...?

The answer is yes, with one caveat. If your instance is EBS-backed, it can be stopped. It will remain in your account and you will not be charged for the instance itself. You will be charged only for the volume, but EBS is charged hourly. You may also consider that, among all available types, you have the flexibility to resize the EBS volume.

Besides the benefits already listed by Eric, be aware that in terms of cost S3 may or may not be cheaper than EBS. I agree that there is relatively little difference in cost if you keep both types of instance running all the time on the same platform and application architecture.

However, suppose you run the application on a lower-cost service, pull all unhandled tasks and roll them over to the VPC/EBS via a pipeline or Lambda for a short time, say <1 hour a day, which is impossible to do when you use an instance store. Then it is a different story.

Saddlebacked answered 6/7, 2016 at 22:46 Comment(0)
