Persisting data in a Docker swarm with GlusterFS

I have a Docker swarm with a lot of containers, in particular:

  • mysql
  • mongodb
  • fluentd
  • elasticsearch

My problem is that when a node fails, the manager discards the running container and creates a new one on another node, so every time this happens I lose the data stored in that particular container, even when using Docker volumes.

So I would like to create four distributed GlusterFS volumes across my cluster and mount them into my containers as Docker volumes.
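To illustrate, here is roughly what I have in mind for one of them (just a sketch; the volume and host names are placeholders, and it assumes the GlusterFS client package is installed on every swarm node):

version: "3.3"
services:
  mysql:
    image: mysql:5.7
    volumes:
      - mysql-data:/var/lib/mysql
volumes:
  mysql-data:
    # the local driver calls mount(8), so with the GlusterFS client
    # installed this is roughly equivalent to:
    #   mount -t glusterfs gluster-node1:/gv-mysql <volume mountpoint>
    driver: local
    driver_opts:
      type: "glusterfs"
      device: "gluster-node1:/gv-mysql"
      o: "backup-volfile-servers=gluster-node2:gluster-node3"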

Is this a correct way to resolve my problem?

If it is, what type of filesystem should I use for my GlusterFS volumes?

Are there performance problems with this approach?

Elevate answered 18/1, 2018 at 21:55

GlusterFS would not be the correct way to resolve this for all of your containers since Gluster does not support "structured data", as stated in the GlusterFS Install Guide:

Gluster does not support so called “structured data”, meaning live, SQL databases. Of course, using Gluster to backup and restore the database would be fine - Gluster is traditionally better when using file sizes of at least 16KB (with a sweet spot around 128KB or so).

One solution to this would be master-slave replication of the data in your databases. MySQL and MongoDB both support this (as described here and here), as do most common DBMSs.

Master-slave replication basically means that, of two or more copies of your database, one is the master and the rest are slaves. All write operations happen on the master, and all read operations happen on the slaves. Any data written to the master is replicated across the slaves by the master itself. Some DBMSs also provide a way to detect when the master goes down and to elect a new master if that happens, but I don't think all DBMSs do this.
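As a very rough sketch of what the swarm side of this could look like for MySQL (service names, credentials and versions are placeholders, and this is not complete: you would still have to create a replication user and run CHANGE MASTER TO / START SLAVE on the slave yourself):

version: "3.3"
services:
  mysql-master:
    image: mysql:5.7
    # replication needs a unique server-id and the binary log enabled
    command: --server-id=1 --log-bin=mysql-bin
    environment:
      MYSQL_ROOT_PASSWORD: example
  mysql-slave:
    image: mysql:5.7
    # the slave gets its own server-id and a relay log
    command: --server-id=2 --relay-log=mysql-relay
    environment:
      MYSQL_ROOT_PASSWORD: example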

You could alternatively set up a Galera Cluster, but as far as I'm aware this only supports MySQL.

I would have thought you could use GlusterFS for Fluentd and Elasticsearch, but I'm not familiar with either of those so I couldn't say for certain. I imagine it would depend on how they store any data they collect (if they collect any at all).

Necklace answered 3/3, 2018 at 17:16

You might want to take a look at Flocker (a volume data manager), which has integrations with several container cluster managers, including Docker Swarm.

You will have to create a volume using the Flocker driver for each application, as shown in the tutorial:

...
volumes:
  mysql:
    driver: "flocker"
    driver_opts:
      size: "10GiB"
      profile: "bronze"
...
Matteroffact answered 3/3, 2018 at 17:24 Comment(1)

Seems the company behind Flocker (ClusterHQ) went out of business in 2016. The Flocker code is still available on GitHub (but hasn't been modified since) and the links given in the tutorial are dead. – Conclude
