Amazon ECS - Persistent data between instances
E

3

9

How would you best handle persistent data between instances with a load-balanced service in Amazon ECS? Data-only containers will not work, and neither will the volumes you can specify in the task definition: both persist only on the instance itself. I have been trying to read up on attaching an EBS volume at instance creation with User Data in the Launch Configuration, but I had no luck there.

Embodiment answered 26/1, 2016 at 23:34 Comment(5)
How much data? Is it read only?Potluck
I need to store the MySQL database + user-uploaded content. No huge amounts of data, but it needs to be R+W. I use a Linux environment.Embodiment
Amazon ECS data volumes are what you are looking for: docs.aws.amazon.com/AmazonECS/latest/developerguide/…Lissie
@Lissie That page says that data volumes do not sync between instances, which is of little use with autoscaling, since any instance can be terminated when it is no longer needed: "Amazon ECS does not sync your data volumes across container instances. Tasks that use persistent data volumes can be placed on any container instance in your cluster that has available capacity. If your tasks require persistent data volumes after stopping and restarting, you should always specify the same container instance at task launch time with the AWS CLI start-task command."Embodiment
@Embodiment Sorry, I misunderstood your question. What you want is actually persistent storage for a Docker cluster (like Swarm). I would suggest looking at RDS (Aurora or MySQL) for the database, plus S3 for user-uploaded content. Also check out Kubernetes, which runs smoothly on normal EC2 instances.Lissie
P
2

Depending on data needs you have two options I can think of:

Mapping S3 bucket as a local drive

You can share an S3 bucket and limit access to any number of instances. We use a drive mapping solution on Windows that mounts an S3 bucket as a local drive. Similar drivers exist for Linux. Each instance gets the same mapped drive and shares that persistent data. The data is read/write, so if we scale in or out, each instance sees the same S3 data.
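For user-uploaded files specifically, an alternative to a mounted drive is to write objects to the bucket through the S3 API, so every instance reads and writes the same keys. The following is only a minimal sketch: the `uploads/<user_id>/<filename>` key layout and both function names are assumptions made for illustration, and `s3_client` is expected to be a boto3 S3 client (only its `put_object` call is used).

```python
from pathlib import PurePosixPath


def upload_key(user_id: str, filename: str) -> str:
    """Build the object key for an upload (hypothetical layout)."""
    # Keep only the final path component so "../" segments cannot escape
    # the per-user prefix.
    safe_name = PurePosixPath(filename.replace("\\", "/")).name
    if not safe_name:
        raise ValueError("filename has no usable name component")
    return f"uploads/{user_id}/{safe_name}"


def store_upload(s3_client, bucket: str, user_id: str,
                 filename: str, data: bytes) -> str:
    """Write one upload to the shared bucket and return its key.

    s3_client is assumed to be created elsewhere, for example with
    boto3.client("s3") on an instance whose IAM role allows s3:PutObject.
    """
    key = upload_key(user_id, filename)
    s3_client.put_object(Bucket=bucket, Key=key, Body=data)
    return key
```

Because the client is passed in, the function can be exercised with a stub object in tests and with a real boto3 client on the instances.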

Mount a volume from a Snapshot

As you suggest, if it is read-only data that you need access to, you can use a User Data script to mount a volume created from a snapshot at launch time. You just need a script and credentials (an IAM role) to run the appropriate commands at launch time.
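A rough sketch of what such a launch script could look like, generated from Python so the snapshot ID can be filled in per environment. Everything here is an assumption for illustration: the function name, the default device `/dev/xvdf`, the mount point `/mnt/data`, and the use of the unauthenticated instance metadata endpoint (instances that enforce IMDSv2 need a session token first). The instance role must be allowed to create and attach volumes.

```python
import shlex


def render_user_data(snapshot_id: str,
                     device: str = "/dev/xvdf",
                     mount_point: str = "/mnt/data") -> str:
    """Return a bash User Data script that mounts a snapshot read-only."""
    snap = shlex.quote(snapshot_id)
    dev = shlex.quote(device)
    mnt = shlex.quote(mount_point)
    lines = [
        "#!/bin/bash",
        "set -euo pipefail",
        # Look up this instance's ID and Availability Zone.
        "MD=http://169.254.169.254/latest/meta-data",
        'INSTANCE_ID=$(curl -s "$MD/instance-id")',
        'AZ=$(curl -s "$MD/placement/availability-zone")',
        # The region is the AZ name without its trailing letter.
        'REGION="${AZ%?}"',
        # Create a volume from the snapshot in the same AZ as the instance.
        f"VOLUME_ID=$(aws ec2 create-volume --snapshot-id {snap} "
        '--availability-zone "$AZ" --region "$REGION" '
        "--query VolumeId --output text)",
        'aws ec2 wait volume-available --volume-ids "$VOLUME_ID" --region "$REGION"',
        'aws ec2 attach-volume --volume-id "$VOLUME_ID" '
        f'--instance-id "$INSTANCE_ID" --device {dev} --region "$REGION"',
        'aws ec2 wait volume-in-use --volume-ids "$VOLUME_ID" --region "$REGION"',
        # Caveat: the block device can take a moment to appear, and on some
        # instance types it shows up under a different name (e.g. NVMe).
        f"mkdir -p {mnt}",
        f"mount -o ro {dev} {mnt}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder snapshot ID; paste the output into the launch configuration.
    print(render_user_data("snap-0123456789abcdef0"))
```

Each instance gets its own copy of the data this way, which is why the approach only suits read-only content: writes made on one instance never reach the others.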

Potluck answered 27/1, 2016 at 0:7 Comment(8)
Thanks for the input! I will try to find an S3 mapper for Linux!Embodiment
I added a reference to a Linux S3 mapper.Potluck
@Embodiment but absolutely do not try to run a MySQL database over an S3 mapper. S3 is an object store, not a filesystem, and lacks the necessary consistency guarantees for this to work properly.Honegger
@Michael-sqlbot Thanks for the comment, we will try to go for RDS for the MySQL database instead.Embodiment
@RodrigoM Thanks for the link, I was looking into s3fs. I will mark this answer as accepted :)Embodiment
I should follow up that this is not a limitation in s3fs itself. I use it, but not on the front-end, and not for databases. It's a clever way of doing things but you have to understand that there's an impedance gap between proper filesystems and object stores that cannot be fully bridged. Additionally, MySQL doesn't work over any kind of shared volumes. For other things, though, once it is available in your regions, Elastic File System is quite nice.Honegger
I feel like it's kinda strange that Amazon doesn't have a more standardised way of sharing persistent data between instances. Maybe EFS is the answer, though?Embodiment
Amazon EFS is probably the answer moving forward: aws.amazon.com/blogs/compute/…Aeciospore
L
11

You can use Amazon EFS to share a filesystem across ECS containers and instances. EFS is based on NFS, so it can be mounted on multiple host instances at the same time. This allows cluster scheduling and scaling to work as intended. See this tutorial for persisting MySQL data this way:

https://aws.amazon.com/blogs/compute/using-amazon-efs-to-persist-data-from-amazon-ecs-containers/
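As a rough illustration of the idea (not the tutorial's exact steps): each container instance mounts the EFS filesystem at the same host path, and the task definition maps that path into the container as a host volume. The helper names, the volume name `efs-data`, the mount point, the memory value, and the filesystem ID below are placeholders; the DNS name follows the `<file-system-id>.efs.<region>.amazonaws.com` pattern.

```python
import json

# NFS mount options commonly recommended for EFS.
NFS_OPTS = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2"


def efs_mount_command(file_system_id: str, region: str,
                      mount_point: str = "/mnt/efs") -> str:
    """Shell command to run on every container instance (e.g. in User Data)."""
    dns_name = f"{file_system_id}.efs.{region}.amazonaws.com"
    return f"mount -t nfs4 -o {NFS_OPTS} {dns_name}:/ {mount_point}"


def task_definition(family: str, image: str,
                    host_path: str, container_path: str) -> dict:
    """ECS task definition that bind-mounts the EFS-backed host path."""
    return {
        "family": family,
        "containerDefinitions": [{
            "name": family,
            "image": image,
            "memory": 512,  # placeholder value
            "essential": True,
            "mountPoints": [{
                "sourceVolume": "efs-data",
                "containerPath": container_path,
            }],
        }],
        # A host volume: the same path exists on every instance because
        # each one mounted the same EFS filesystem there.
        "volumes": [{"name": "efs-data", "host": {"sourcePath": host_path}}],
    }


if __name__ == "__main__":
    print(efs_mount_command("fs-12345678", "us-east-1"))
    print(json.dumps(
        task_definition("uploads-app", "nginx:latest",
                        "/mnt/efs/uploads", "/usr/share/nginx/html/uploads"),
        indent=2))
```

Since the task no longer depends on any one instance's local disk, the scheduler can place it on whichever instance has capacity.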

Loreleilorelie answered 6/12, 2016 at 10:39 Comment(0)
J
4

I suggest using Amazon EFS (https://aws.amazon.com/blogs/compute/using-amazon-efs-to-persist-data-from-amazon-ecs-containers/).

One limitation: at the time of writing, only four regions support EFS:

EU (Ireland)

US East (N. Virginia)

US East (Ohio)

US West (Oregon)

If your region is not supported, you can set up your own NFS server to share a persistent folder between EC2 instances. S3FS looks cool, but it was buggy when I tested it two years ago; things may have changed since.

Jubbulpore answered 20/1, 2017 at 12:53 Comment(0)
