How to set up a ZooKeeper cluster on Docker Swarm

Environment: a 6-server Docker Swarm cluster (2 managers & 4 workers)

Requirement: We need to set up a ZooKeeper cluster on an existing Docker Swarm.

Blocked on: To set up a ZooKeeper ensemble, we need to list every ZooKeeper server in each node's config file (zoo.cfg) and give each node a unique ID in its myid file.

Question: When we create replicas of ZooKeeper in Docker Swarm, how can we provide a unique ID for each replica? Also, how can we update the zoo.cfg config file with the ID of each ZooKeeper container?
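
For reference, this is the shape of configuration each node needs (the hostnames below are only examples): every node carries the same server list in zoo.cfg, while the myid file must differ per node.

# zoo.cfg — identical on every node
tickTime=2000
initLimit=5
syncLimit=2
dataDir=/data
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888

# /data/myid — unique per node: 1 on zk1, 2 on zk2, 3 on zk3
echo "1" > /data/myid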

Glittery answered 6/2, 2017 at 7:42 Comment(0)

This is not currently an easy ask. Fully scalable stateful application clusters are tricky to achieve when each cluster member needs a unique identity and storage volume.

On Docker Swarm, today, you are perhaps best advised to run each cluster member as a separate service in your compose file (see 31z4/zookeeper-docker):

version: '2'
services:
    zoo1:
        image: 31z4/zookeeper
        restart: always
        ports:
            - 2181:2181
        environment:
            ZOO_MY_ID: 1
            ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888

    zoo2:
        image: 31z4/zookeeper
        restart: always
        ports:
            - 2182:2181
        environment:
            ZOO_MY_ID: 2
            ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
..
..
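
Note that a version: '2' file like this targets docker-compose; to deploy it onto a swarm with docker stack deploy you would bump it to version: '3' and move restart: always into a deploy.restart_policy, as the later compose files below do.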

For a state-of-the-art (but still evolving) solution, I recommend checking out Kubernetes:

The new concept of StatefulSets offers much promise. I expect Docker Swarm will grow a similar capability in time, where each container instance is assigned a unique and "sticky" hostname that can be used as the basis for a unique identifier.
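
To make that concrete, here is a minimal sketch of the StatefulSet idea (the names, image, and startup command are illustrative assumptions, not a production manifest): each pod gets a stable ordinal hostname (zk-0, zk-1, ...), from which a unique ZooKeeper ID can be derived at startup.

apiVersion: apps/v1
kind: StatefulSet
metadata:
    name: zk
spec:
    serviceName: zk-headless    # stable per-pod DNS: zk-0.zk-headless, zk-1.zk-headless, ...
    replicas: 3
    selector:
        matchLabels:
            app: zk
    template:
        metadata:
            labels:
                app: zk
        spec:
            containers:
                - name: zookeeper
                  image: zookeeper:3.4.12
                  command: ["bash", "-c"]
                  args:
                      # Derive myid from the pod ordinal: zk-0 -> ID 1, zk-1 -> ID 2, ...
                      - >
                        export ZOO_MY_ID=$((${HOSTNAME##*-} + 1)) &&
                        export ZOO_SERVERS="server.1=zk-0.zk-headless:2888:3888 server.2=zk-1.zk-headless:2888:3888 server.3=zk-2.zk-headless:2888:3888" &&
                        exec /docker-entrypoint.sh zkServer.sh start-foreground
    # volumeClaimTemplates for /data omitted for brevity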

Topology answered 6/2, 2017 at 22:47 Comment(0)

We have created a Docker image, extending the official one, which does exactly that. The entrypoint.sh has been modified so that on startup of each container, it auto-discovers the rest of the ZooKeeper nodes and configures the current node appropriately.

You can find the image in the Docker Store and in our GitHub.

Note: Currently it does not handle cases such as the re-creation of a container because of a failure.
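
For a rough idea of how such auto-discovery can work (a hedged sketch only, not the actual script from our image): swarm's built-in DNS resolves tasks.<service-name> to the IPs of all running replicas, so an entrypoint can enumerate the peers, order them, and derive its own ID from its position.

#!/bin/bash
# Sketch: assumes the swarm service is named "zookeeper" and that all
# replicas are already resolvable; the real image handles more edge cases.
set -e

MY_IP=$(hostname -i | awk '{print $1}')

# Swarm DNS: tasks.<service> returns one A record per running task.
PEER_IPS=$(getent hosts tasks.zookeeper | awk '{print $1}' | sort)

ID=0
ZOO_SERVERS=""
i=1
for ip in $PEER_IPS; do
    if [ "$ip" = "$MY_IP" ]; then
        ID=$i
        ZOO_SERVERS="$ZOO_SERVERS server.$i=0.0.0.0:2888:3888"
    else
        ZOO_SERVERS="$ZOO_SERVERS server.$i=$ip:2888:3888"
    fi
    i=$((i + 1))
done

echo "$ID" > "$ZOO_DATA_DIR/myid"
# ...then append the server.N entries to zoo.cfg and exec the server.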

EDIT (6/11/2018)

The latest image supports re-configuration of the ZooKeeper cluster in cases like:

  • Scaling up the docker service (adding more containers)
  • Scaling down the docker service (removing containers)
  • A container being re-scheduled by docker swarm because of a failure (a new IP is assigned)
Propagandize answered 4/6, 2018 at 15:0 Comment(3)
I wrote that same script three weeks ago :/ (the only difference is that I use ip route get ${NODE_IP} | grep -c "dev lo" for the own-IP check, which seemed more robust to me) and I also got stuck on the re-creation case. Have you given ZooKeeper 3.5's new reconfiguration features some thought? – Ciliate
@Caesar, we are working to apply enhancements using the new reconfiguration API. It will be auto-configured after scale up/down and in case of failed containers. – Propagandize
I'm assuming you pulled the thing with the reconfiguration off? Might want to update your answer.Ciliate

I have been trying out deploying a ZooKeeper cluster in docker swarm mode.

I have 3 machines connected to a docker swarm network. My requirement is to run 3 ZooKeeper instances, one on each of those nodes, forming an ensemble. I have gone through this thread and got a few insights on how to deploy ZooKeeper in docker swarm.

As @junius suggested, I have created a docker compose file. I have removed the constraints, since docker swarm ignores them; see https://forums.docker.com/t/docker-swarm-constraints-being-ignored/31555

My ZooKeeper docker compose file looks like this:

version: '3.3'

services:
    zoo1:
        image: zookeeper:3.4.12
        hostname: zoo1
        ports:
            - target: 2181
              published: 2181
              protocol: tcp
              mode: host
            - target: 2888
              published: 2888
              protocol: tcp
              mode: host
            - target: 3888
              published: 3888
              protocol: tcp
              mode: host
        networks:
            - net
        deploy:
            restart_policy:
                condition: on-failure
        environment:
            ZOO_MY_ID: 1
            ZOO_SERVERS: server.1=0.0.0.0:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
        volumes:
            - /home/zk/data:/data
            - /home/zk/datalog:/datalog
            - /etc/localtime:/etc/localtime:ro
    zoo2:
        image: zookeeper:3.4.12
        hostname: zoo2
        ports:
            - target: 2181
              published: 2181
              protocol: tcp
              mode: host
            - target: 2888
              published: 2888
              protocol: tcp
              mode: host
            - target: 3888
              published: 3888
              protocol: tcp
              mode: host
        networks:
            - net
        deploy:
            restart_policy:
                condition: on-failure
        environment:
            ZOO_MY_ID: 2
            ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=0.0.0.0:2888:3888 server.3=zoo3:2888:3888
        volumes:
            - /home/zk/data:/data
            - /home/zk/datalog:/datalog
            - /etc/localtime:/etc/localtime:ro
    zoo3:
        image: zookeeper:3.4.12
        hostname: zoo3
        ports:
            - target: 2181
              published: 2181
              protocol: tcp
              mode: host
            - target: 2888
              published: 2888
              protocol: tcp
              mode: host
            - target: 3888
              published: 3888
              protocol: tcp
              mode: host
        networks:
            - net
        deploy:
            restart_policy:
                condition: on-failure
        environment:
            ZOO_MY_ID: 3
            ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=0.0.0.0:2888:3888
        volumes:
            - /home/zk/data:/data
            - /home/zk/datalog:/datalog
            - /etc/localtime:/etc/localtime:ro
networks:
    net:

Deployed using the docker stack command:

docker stack deploy -c zoo3.yml zk
Creating network zk_net
Creating service zk_zoo3
Creating service zk_zoo1
Creating service zk_zoo2

The ZooKeeper services come up fine, one per node, without any issues:

docker stack services zk
ID            NAME     MODE        REPLICAS  IMAGE             PORTS
rn7t5f3tu0r4  zk_zoo1  replicated  1/1       zookeeper:3.4.12  0.0.0.0:2181->2181/tcp, 0.0.0.0:2888->2888/tcp, 0.0.0.0:3888->3888/tcp
u51r7bjwwm03  zk_zoo2  replicated  1/1       zookeeper:3.4.12  0.0.0.0:2181->2181/tcp, 0.0.0.0:2888->2888/tcp, 0.0.0.0:3888->3888/tcp
zlbcocid57xz  zk_zoo3  replicated  1/1       zookeeper:3.4.12  0.0.0.0:2181->2181/tcp, 0.0.0.0:2888->2888/tcp, 0.0.0.0:3888->3888/tcp

I reproduced the issue discussed here when I stopped and started the ZooKeeper stack again:

docker stack rm zk
docker stack deploy -c zoo3.yml zk

This time the ZooKeeper cluster doesn't form. The docker instances logged the following:

ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
2018-11-02 15:24:41,531 [myid:2] - WARN  [WorkerSender[myid=2]:QuorumCnxManager@584] - Cannot open channel to 1 at election address zoo1/10.0.0.4:3888
java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:534)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:454)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:435)
        at java.lang.Thread.run(Thread.java:748)
2018-11-02 15:24:41,538 [myid:2] - WARN  [WorkerSender[myid=2]:QuorumCnxManager@584] - Cannot open channel to 3 at election address zoo3/10.0.0.2:3888
java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:534)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:454)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:435)
        at java.lang.Thread.run(Thread.java:748)
2018-11-02 15:38:19,146 [myid:2] - WARN  [QuorumPeer[myid=2]/0.0.0.0:2181:Learner@237] - Unexpected exception, tries=1, connecting to /0.0.0.0:2888
java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:229)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:72)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:981)
2018-11-02 15:38:20,147 [myid:2] - WARN  [QuorumPeer[myid=2]/0.0.0.0:2181:Learner@237] - Unexpected exception, tries=2, connecting to /0.0.0.0:2888
java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:229)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:72)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:981)

On closer observation I found that the first time I deployed this stack, the ZooKeeper instance with ID 2 was running on node 1, and it created a myid file with the value 2:

cat /home/zk/data/myid
2

When I stopped and started the stack again, this time the ZooKeeper instance with ID 3 was running on node 1:

docker ps
CONTAINER ID  IMAGE             COMMAND                 CREATED        STATUS        PORTS                                                                    NAMES
566b68c11c8b  zookeeper:3.4.12  "/docker-entrypoin..."  6 minutes ago  Up 6 minutes  0.0.0.0:2181->2181/tcp, 0.0.0.0:2888->2888/tcp, 0.0.0.0:3888->3888/tcp  zk_zoo3.1.7m0hq684pkmyrm09zmictc5bm

But the myid file still had the value 2, which was set by the earlier instance.

Because of this, the log shows [myid:2], and the instance tries to connect to the instances with IDs 1 and 3 and fails.

On further debugging I found that the docker-entrypoint.sh file contains the following code:

# Write myid only if it doesn't exist
if [[ ! -f "$ZOO_DATA_DIR/myid" ]]; then
    echo "${ZOO_MY_ID:-1}" > "$ZOO_DATA_DIR/myid"
fi

This was causing the issue for me. I edited docker-entrypoint.sh with the following:

if [[ -f "$ZOO_DATA_DIR/myid" ]]; then
    rm "$ZOO_DATA_DIR/myid"
fi

echo "${ZOO_MY_ID:-1}" > "$ZOO_DATA_DIR/myid"

And mounted the modified docker-entrypoint.sh into the containers in my compose file.

With this fix, I am able to stop and start my stack multiple times, and every time my ZooKeeper cluster forms an ensemble without hitting the connection issue.

My docker-entrypoint.sh file is as follows:

#!/bin/bash

set -e

# Allow the container to be started with `--user`
if [[ "$1" = 'zkServer.sh' && "$(id -u)" = '0' ]]; then
    chown -R "$ZOO_USER" "$ZOO_DATA_DIR" "$ZOO_DATA_LOG_DIR"
    exec su-exec "$ZOO_USER" "$0" "$@"
fi

# Generate the config only if it doesn't exist
if [[ ! -f "$ZOO_CONF_DIR/zoo.cfg" ]]; then
    CONFIG="$ZOO_CONF_DIR/zoo.cfg"

    echo "clientPort=$ZOO_PORT" >> "$CONFIG"
    echo "dataDir=$ZOO_DATA_DIR" >> "$CONFIG"
    echo "dataLogDir=$ZOO_DATA_LOG_DIR" >> "$CONFIG"

    echo "tickTime=$ZOO_TICK_TIME" >> "$CONFIG"
    echo "initLimit=$ZOO_INIT_LIMIT" >> "$CONFIG"
    echo "syncLimit=$ZOO_SYNC_LIMIT" >> "$CONFIG"

    echo "maxClientCnxns=$ZOO_MAX_CLIENT_CNXNS" >> "$CONFIG"

    for server in $ZOO_SERVERS; do
        echo "$server" >> "$CONFIG"
    done
fi

# Always rewrite myid from ZOO_MY_ID so that a container re-scheduled onto a
# node with leftover data does not inherit a stale ID from a previous instance
if [[ -f "$ZOO_DATA_DIR/myid" ]]; then
    rm "$ZOO_DATA_DIR/myid"
fi

echo "${ZOO_MY_ID:-1}" > "$ZOO_DATA_DIR/myid"

exec "$@"
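
One pitfall to watch for (an assumption about your host setup, but a common one): because docker-entrypoint.sh is bind-mounted from the host, it must be executable on the host, for example:

chmod +x /home/zk/docker-entrypoint.sh

Otherwise the containers will fail to start with a permission error.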

My docker compose file is as follows:

version: '3.3'

services:
    zoo1:
        image: zookeeper:3.4.12
        hostname: zoo1
        ports:
            - target: 2181
              published: 2181
              protocol: tcp
              mode: host
            - target: 2888
              published: 2888
              protocol: tcp
              mode: host
            - target: 3888
              published: 3888
              protocol: tcp
              mode: host
        networks:
            - net
        deploy:
            restart_policy:
                condition: on-failure
        environment:
            ZOO_MY_ID: 1
            ZOO_SERVERS: server.1=0.0.0.0:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
        volumes:
            - /home/zk/data:/data
            - /home/zk/datalog:/datalog
            - /home/zk/docker-entrypoint.sh:/docker-entrypoint.sh
            - /etc/localtime:/etc/localtime:ro
    zoo2:
        image: zookeeper:3.4.12
        hostname: zoo2
        ports:
            - target: 2181
              published: 2181
              protocol: tcp
              mode: host
            - target: 2888
              published: 2888
              protocol: tcp
              mode: host
            - target: 3888
              published: 3888
              protocol: tcp
              mode: host
        networks:
            - net
        deploy:
            restart_policy:
                condition: on-failure
        environment:
            ZOO_MY_ID: 2
            ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=0.0.0.0:2888:3888 server.3=zoo3:2888:3888
        volumes:
            - /home/zk/data:/data
            - /home/zk/datalog:/datalog
            - /home/zk/docker-entrypoint.sh:/docker-entrypoint.sh
            - /etc/localtime:/etc/localtime:ro
    zoo3:
        image: zookeeper:3.4.12
        hostname: zoo3
        ports:
            - target: 2181
              published: 2181
              protocol: tcp
              mode: host
            - target: 2888
              published: 2888
              protocol: tcp
              mode: host
            - target: 3888
              published: 3888
              protocol: tcp
              mode: host
        networks:
            - net
        deploy:
            restart_policy:
                condition: on-failure
        environment:
            ZOO_MY_ID: 3
            ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=0.0.0.0:2888:3888
        volumes:
            - /home/zk/data:/data
            - /home/zk/datalog:/datalog
            - /home/zk/docker-entrypoint.sh:/docker-entrypoint.sh
            - /etc/localtime:/etc/localtime:ro
networks:
    net:

With this I am able to get ZooKeeper instances up and running in docker swarm mode, without hard-coding any node hostnames in the compose file. If one of my nodes goes down, the services are started on any available node in the swarm without any issues.

Thanks

Batten answered 2/11, 2018 at 16:18 Comment(3)
So this is actually a very incorrect approach. The problem you see is because you are using the same directory for all of the ZooKeepers; what is the purpose of that? In that case you could use a single ZooKeeper instance that restarts every time it crashes. The correct fix would be to use different directories for every instance. That way ZooKeeper actually works the way it is supposed to, keeping the data log clean through the instances communicating with each other. – Sailor
Sorry, I am not using the same directory for all ZooKeeper instances. This is a docker swarm YAML. I have a 3-node swarm cluster, and when I load this compose file, 3 instances of ZooKeeper run on the 3 nodes, not on a single node; they don't share directories. – Batten
But you don't impose any restrictions on where they run. When you deploy them, docker swarm decides where they should run, so it can happen that 2 instances run on the same machine. Even if they all run on 3 different nodes (by luck), when you restart the services they could be running on a different node than before. So the ZooKeeper with ID 1 starts up on node 2, where it encounters the myid file with ID 2. – Sailor
