Should I use EBS or EFS for a database?

For the database directories of highly available MongoDB, Cassandra or Elasticsearch clusters, should I use EBS or EFS? MongoDB, Cassandra and Elasticsearch clusters take care of replicating data across nodes when they are configured with a replication factor > 1, so I guess the EFS replication feature may not be needed.

Poltroonery answered 15/12, 2018 at 13:14 Comment(1)
FYI, AWS has DocumentDB, which is MongoDB-compatible and fully managed. - Eleaseeleatic

EFS is for multiple servers having access to the same set of files. Cassandra has replication built in, so it has no use for that feature. You would not want multiple Cassandra nodes accessing the same files anyway, as each node manages its own SSTables.

Not to mention that Cassandra is disk-intensive and does not tolerate I/O latency well. Cassandra connections time out really easily. So using an NFS mount (EFS) instead of a “local” disk is just a bad idea.
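If you want to see the latency difference for yourself, a rough probe like the one below can help. It is only a sketch using the Python standard library: it times small write-plus-fsync cycles in a directory you pass in, which loosely resembles commit-log style writes but is not a substitute for a real benchmark. The function names `fsync_latencies_ms` and `summarize` are made up for this illustration.

```python
import math
import os
import statistics
import tempfile
import time


def fsync_latencies_ms(directory, writes=50, size=4096):
    """Time small write+fsync cycles in `directory`; return latencies in ms."""
    payload = os.urandom(size)
    latencies = []
    # Create the probe file inside the directory under test so the writes
    # go through that mount (EBS volume, instance store, or an NFS/EFS mount).
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync_probe_")
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to flush the data before the timer stops
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        # Always clean up the probe file, even if a write fails.
        os.close(fd)
        os.remove(path)
    return latencies


def summarize(latencies):
    """Return median, nearest-rank p99 and max of a list of latencies."""
    ordered = sorted(latencies)
    p99_index = max(0, math.ceil(0.99 * len(ordered)) - 1)
    return {
        "median_ms": statistics.median(ordered),
        "p99_ms": ordered[p99_index],
        "max_ms": ordered[-1],
    }
```

For example, you could compare `summarize(fsync_latencies_ms("/var/lib/cassandra/data"))` against the same call on an EFS mount point such as `/mnt/efs` (both paths are assumptions about your setup). The tail values (p99, max) are usually more telling than the median for a database that times out easily.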

Read this if you haven’t already: https://aws.amazon.com/blogs/big-data/best-practices-for-running-apache-cassandra-on-amazon-ec2/

(Can’t speak for other databases like MongoDB.)

Leroy answered 15/12, 2018 at 13:48 Comment(1)
Thanks for the answer, it really clarified my concerns. The same applies to MongoDB: a MongoDB replica set (a set of servers with server count > 1) has built-in replication. It is also the same for Elasticsearch indexes with replication factor > 0. - Poltroonery

EBS - for databases

EFS - for file sharing across applications, VMs, etc.

Here is a good article that explains the differences between the storage types:

https://dzone.com/articles/confused-by-aws-storage-options-s3-ebs-amp-efs-explained
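To check which of the two a given data directory actually sits on, something like the following sketch may be useful on Linux. It assumes that EFS shows up as an `nfs4` (or `nfs`) entry in `/proc/mounts`, while an EBS volume appears as a regular block-device filesystem such as `ext4` or `xfs`. The helper names are hypothetical, and mount points containing spaces (escaped as `\040` in `/proc/mounts`) and symlinked paths are not handled.

```python
import os

# Filesystem types treated as network mounts (EFS is mounted over NFS).
NETWORK_FS_TYPES = {"nfs", "nfs4"}


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def filesystem_type(path, mounts_text):
    """Return the fs type of the longest mount point that contains `path`."""
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for mount_point, fs_type in parse_mounts(mounts_text):
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # ">=" lets a later entry win when two mounts share a mount point.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_network_filesystem(path, mounts_text):
    """True if `path` lives on an NFS-type mount according to `mounts_text`."""
    return filesystem_type(path, mounts_text) in NETWORK_FS_TYPES
```

On a real host you would pass in the contents of `/proc/mounts`, for example `is_on_network_filesystem("/var/lib/cassandra/data", open("/proc/mounts").read())`. If that returns `True` for a database data directory, the directory is on EFS/NFS rather than on an EBS or instance-store volume.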

Fribble answered 15/12, 2018 at 14:2 Comment(0)
