Adding new ZooKeeper node in Kafka cluster?

I am running an Apache Kafka cluster of five nodes, backed by an Apache ZooKeeper cluster of three nodes.

In zookeeper.properties file:

server.1=zNode01:2888:3888
server.2=zNode02:2888:3888
server.3=zNode03:2888:3888

And in server.properties file:

zookeeper.connect=zNode01:2181,zNode02:2181,zNode03:2181

I want to add a new ZooKeeper node:

  1. Do I need to add the new ZooKeeper node's address to the existing zookeeper.properties files and restart each ZooKeeper server, or is there another way to do it?

  2. Do I need to add the new ZooKeeper node's address to Kafka's server.properties file and restart the brokers, or is there another way to do it?

Wiliness answered 25/8, 2018 at 14:0

It's more involved than what @cricket_007 described. This is a good read before you attempt to add a new member to an existing ZooKeeper cluster:

https://zookeeper.apache.org/doc/r3.5.3-beta/zookeeperReconfig.html

Focus specifically on the "Modifying the current dynamic configuration" section.

Basically, these are the high-level steps:

a) The new server has to be introduced to the leader. This is done by giving the joiner a zookeeper.properties file that lists the joiner itself plus enough cluster information for it to connect to the existing leader. The configuration doesn't need to be completely up to date, just fresh enough to reach the current leader. In practice you can copy the zookeeper.properties file from one of the nodes in the cluster, append the joiner's own server entry to it, and start the ZooKeeper server on the joiner node.

b) Note that being able to talk to the leader does not automatically make the joiner part of the cluster. The ZooKeeper ensemble has to vote on adding the new node. At this point the joiner is a non-voting follower, and if you look at the current configuration of the ensemble (via zkCli's "config" command), you will not see the new node listed.

c) Now use zkCli's "reconfig" command to add the new node to the cluster, either as a voting participant or as an observer. Voting participants take part in all consensus decisions (e.g. who the new leader is, whether to commit a write); observers do not. Observers are added primarily to increase the read throughput of the ensemble without the extra overhead of involving them in the two-phase commit for each write operation. The reconfig command itself goes through the same two-phase commit: the leader gathers votes from the voting participants on whether the new node should be added. If a quorum of the existing participants agrees, the new node is added to the cluster.

d) Executing zkCli's "config" command now shows the new node as part of the cluster, either as a voting participant or as an observer.

e) Lastly, update Kafka's server.properties file to close the loop. Even though this change might not be needed immediately, it tells the Kafka brokers (which are ZooKeeper clients) that the new member is available, so that they can fall back to the newly added node in failure scenarios.
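
To make steps a), c) and e) a bit more concrete, here is a minimal Python sketch that only builds the text you would put in the files or type at the zkCli prompt; it does not talk to ZooKeeper. The host zNode04 and server id 4 are hypothetical, the ports are taken from the question, and the server-line layout (with the role and the client port after the semicolon) assumes the ZooKeeper 3.5 dynamic configuration format described in the linked document. Also note that from 3.5.3 onwards reconfig is disabled by default and has to be switched on with reconfigEnabled=true. Check all of this against your own version before relying on it.

```python
# Sketch only: zNode04 / id 4 are hypothetical, ports follow the question.
# The joiner also needs a myid file in its data directory containing its id.

EXISTING_SERVERS = {1: "zNode01", 2: "zNode02", 3: "zNode03"}


def server_line(server_id, host, role="participant", client_port=2181):
    """Format one server entry, assuming the 3.5 dynamic-config layout."""
    return f"server.{server_id}={host}:2888:3888:{role};{client_port}"


def joiner_config(existing, new_id, new_host, role="participant"):
    """Step a: the existing server list plus the joiner's own entry."""
    lines = [server_line(i, h) for i, h in sorted(existing.items())]
    lines.append(server_line(new_id, new_host, role))
    return "\n".join(lines)


def reconfig_add_command(new_id, new_host, role="participant"):
    """Step c: the text to enter at the zkCli prompt on an existing member."""
    return "reconfig -add " + server_line(new_id, new_host, role)


def kafka_connect(hosts, client_port=2181):
    """Step e: the zookeeper.connect line for Kafka's server.properties."""
    return "zookeeper.connect=" + ",".join(f"{h}:{client_port}" for h in hosts)


if __name__ == "__main__":
    print(joiner_config(EXISTING_SERVERS, 4, "zNode04"))
    print(reconfig_add_command(4, "zNode04"))
    print(kafka_connect(["zNode01", "zNode02", "zNode03", "zNode04"]))
```

Passing role="observer" instead of the default would produce the entry for adding the node as an observer rather than a voting participant.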

Hope this helps in understanding how dynamically adding a new node to a ZooKeeper cluster works.

Andrea answered 31/8, 2018 at 0:3

Note: I've not tried expanding a ZK cluster myself, but I would try the following.

Start the new nodes with all servers defined in their property file. They should join fine.

Then add the new servers to the property files of the other ZKs, and perform a rolling restart on them.

Kafka doesn't need the extra entries in zookeeper.connect unless you expect to lose more than one ZK at any given time.
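
As background for the "lose more than one ZK" remark: a ZooKeeper ensemble stays available only while a majority of its voting members is up. The small Python sketch below just does that arithmetic; the function names are mine and nothing here calls ZooKeeper.

```python
def quorum_size(voters):
    """Smallest majority of an ensemble with this many voting members."""
    return voters // 2 + 1


def tolerated_failures(voters):
    """How many voting members can be down while a majority is still up."""
    return voters - quorum_size(voters)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, "voters: quorum", quorum_size(n),
              "tolerates", tolerated_failures(n), "down")
```

By this count, going from three to four voters still only tolerates one failure, while five tolerates two.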

Mennonite answered 25/8, 2018 at 16:33
