ZooKeeper: what happens if the client connection string lists only some of the nodes in the cluster (ensemble)?

I have a ZooKeeper cluster consisting of N nodes (which know about each other). What if I pass only M < N of the nodes' addresses in the ZooKeeper client's connection string? How will the cluster behave?

More specifically, what if I pass the host address of only 1 ZooKeeper node from the cluster? Can the client then connect to other hosts in the cluster? What if this one host is down? Will the client be able to connect to other ZooKeeper nodes in the ensemble?

The other question is: is it possible to limit the client to using only specific nodes from the ensemble?

Alyssaalyssum asked 16/1, 2017 at 18:1

"What if I pass only M < N of the nodes' addresses in zk client connection string? What will be the cluster's behavior?"

ZooKeeper clients will connect only to the M nodes specified in the connection string. The ensemble's back-end interactions (leader election and processing write transaction proposals) will continue to involve all N nodes in the cluster. Any of the N nodes could still become the ensemble leader. If a ZooKeeper server receives a write request and is not the current leader, it forwards the request to the current leader.
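To make the forwarding concrete, here is a toy model in plain Python. It is not the real ZooKeeper code, and the `Server` class and its fields are hypothetical names invented for illustration. It also leaves out the quorum agreement among followers and only shows that a write sent to a listed node ends up at the leader, even when the leader is not in the connection string.

```python
class Server:
    """Hypothetical stand-in for one ZooKeeper server (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.leader = None    # set once the ensemble is wired together
        self.proposals = []   # writes this server turned into proposals as leader

    def handle_write(self, data):
        """Accept a client write; forward it if this server is not the leader."""
        if self.leader is not self:
            return self.leader.handle_write(data)
        self.proposals.append(data)
        return self.name      # name of the server that processed the write


# N = 5 servers; assume zk5 currently holds leadership.
ensemble = [Server(f"zk{i}") for i in range(1, 6)]
leader = ensemble[4]
for server in ensemble:
    server.leader = leader

# M = 2: the client's connection string lists only zk1 and zk2.
client_visible = ensemble[:2]
processed_by = client_visible[0].handle_write("create /demo")
print(processed_by)  # zk5
```

The client only ever talks to zk1 here, yet the write is handled by zk5, which the client never listed.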

"In a more specific case, what if I pass host address of only 1 zk from the cluster? Is it possible then for the zk client to connect to other hosts from the cluster? What if this one host is down? Will the client be able to connect to other zookeeper nodes in an ensemble?"

No, the client would only be able to connect to the single address specified in the connection string. That address effectively becomes a single point of failure for the application, because if the server goes down, the client will not have any other options for establishing a connection.
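As a rough sketch of why this is a single point of failure, the snippet below models how a client might choose a server. It is a simplified illustration in plain Python, not the real client: `parse_connect_string`, `pick_server`, and the `reachable` set are hypothetical names, and chroot suffixes such as `/app` are not handled. The connection string format (`host:port` entries separated by commas, default port 2181) and the shuffling of candidates do mirror the real client.

```python
import random

DEFAULT_PORT = 2181  # ZooKeeper's default client port


def parse_connect_string(connect_string):
    """Split 'host1:2181,host2:2181' into a list of (host, port) tuples."""
    servers = []
    for entry in connect_string.split(","):
        host, sep, port = entry.strip().rpartition(":")
        if not sep:  # no port given, fall back to the default
            host, port = entry.strip(), DEFAULT_PORT
        servers.append((host, int(port)))
    return servers


def pick_server(connect_string, reachable, rng=random):
    """Return a listed server that is reachable, or None.

    Candidates come only from the connection string; servers that belong
    to the ensemble but are not listed are never tried.
    """
    candidates = parse_connect_string(connect_string)
    rng.shuffle(candidates)  # try the listed servers in random order
    for server in candidates:
        if server in reachable:
            return server
    return None


# Hypothetical 3-node ensemble in which zk1 is down.
reachable = {("zk2", 2181), ("zk3", 2181)}

print(pick_server("zk1:2181", reachable))           # None
print(pick_server("zk1:2181,zk2:2181", reachable))  # ('zk2', 2181)
```

With only `zk1:2181` in the string there is nothing left to try once zk1 is down, even though zk2 and zk3 are healthy. Listing a second node is enough for the client to fail over.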

"The other question is, is it possible to limit client to use only specific nodes from the ensemble?"

Yes, you can limit the nodes that the client considers for establishing a connection by listing only those nodes in the client's connection string. However, keep in mind that any of the N nodes in the cluster could still become the leader, and then all client write requests will get forwarded to that leader. In that sense, the client is using the other nodes indirectly, but the client is not establishing a direct socket connection to those nodes.

The ZooKeeper Overview page in the Apache documentation has further discussion of client and server behavior in a ZooKeeper cluster. For example, there is a relevant quote in the Implementation section:

"As part of the agreement protocol all write requests from clients are forwarded to a single server, called the leader. The rest of the ZooKeeper servers, called followers, receive message proposals from the leader and agree upon message delivery. The messaging layer takes care of replacing leaders on failures and syncing followers with leaders."

Peart answered 16/1, 2017 at 18:15 Comment(2)
Thanks for a comprehensive answer. I have an additional question: what could happen if we pass addresses of ZooKeeper nodes from 2 different clusters to a single client? - Alyssaalyssum
@rideronthestorm, thanks, I'm glad it was helpful! Passing addresses from different clusters to a single ZooKeeper client will not work correctly. The client logic keeps track of the last seen transaction ID ("zxid") from the cluster. Zxids are not synchronized between different clusters, so the client will behave unpredictably. The ZooKeeper Administrator's Guide warns against this near the bottom of the page. A single process can, however, start 2 separate ZooKeeper clients, each connected to a different cluster. - Peart
