ZooKeeper reliability - three versus five nodes

2

31

From the ZooKeeper FAQ:

Reliability:

A single ZooKeeper server (standalone) is essentially a coordinator with
no reliability (a single serving node failure brings down the ZK service).

A 3 server ensemble (you need to jump to 3 and not 2 because ZK works
based on simple majority voting) allows for a single server to fail and
the service will still be available.

So if you want reliability go with at least 3. We typically recommend
having 5 servers in "online" production serving environments. This allows
you to take 1 server out of service (say planned maintenance) and still
be able to sustain an unexpected outage of one of the remaining servers
w/o interruption of the service.

With a 3-server ensemble, if one server is taken out of rotation and one server has an unexpected outage, then there is still one remaining server that should ensure no interruption of service. Then why the need for 5 servers? Or is it more than just interruption of service that is being considered?

Update:

Thanks to @sbridges for pointing out that it has to do with maintaining a quorum. And the way that ZK defines a quorum is ceil(N/2) where N is the original number in the ensemble (and not just the currently available set).

Now, a Google search for ZK quorum finds this in the HBase book chapter on ZK:

In ZooKeeper, an even number of peers is supported, but it is normally not used because an even sized ensemble requires, proportionally, more peers to form a quorum than an odd sized ensemble requires. For example, an ensemble with 4 peers requires 3 to form a quorum, while an ensemble with 5 also requires 3 to form a quorum. Thus, an ensemble of 5 allows 2 peers to fail and still maintain quorum, and thus is more fault tolerant than the ensemble of 4, which allows only 1 down peer.

And this paraphrasing of Wikipedia in Edward J. Yoon's blog:

Ordinarily, this is a majority of the people expected to be there, although many bodies may have a lower or higher quorum.

Corticosterone answered 23/10, 2012 at 1:11 Comment(1)
What would be an example of "planned maintenance"? - Nebiim
I
29

ZooKeeper requires that you have a quorum of servers up, where quorum is ceil(N/2). For a 3 server ensemble, that means 2 servers must be up at any time; for a 5 server ensemble, 3 servers must be up at any time.
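To make the arithmetic concrete, here is a minimal Python sketch. The function names `quorum_size` and `has_quorum` are hypothetical and not part of any ZooKeeper API. It assumes a quorum is a strict majority of the configured ensemble, floor(N/2) + 1, which equals ceil(N/2) for odd N and is one higher for even N. It checks the scenario from the question: one server out for maintenance plus one unexpected outage.

```python
def quorum_size(ensemble_size: int) -> int:
    """Strict majority of the configured ensemble (assumed definition)."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size: int, servers_up: int) -> bool:
    """True if the running servers form a majority of the full ensemble."""
    return servers_up >= quorum_size(ensemble_size)


# One server in maintenance plus one unexpected outage means 2 servers down.
for n in (3, 5):
    up = n - 2
    print(f"{n}-server ensemble, {up} up: "
          f"quorum={quorum_size(n)}, available={has_quorum(n, up)}")
```

Under that assumption, the 3-server ensemble is left with 1 server against a quorum of 2 and is unavailable, while the 5-server ensemble still has 3 of 3 required servers.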

Inconsiderable answered 23/10, 2012 at 3:41 Comment(4)
Why isn't a 4 node cluster recommended? - Cilicia
I agree with @Pangea here. This would imply that a 5 node cluster could only support 2 failures (a 3rd failure would drop below the quorum of 3) and a 4 node cluster could also support 2 failures (a 3rd failure would drop below the quorum of 2). - Intranuclear
@Pangea, see zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html: "... As long as a majority of the ensemble are up, the service will be available. Because Zookeeper requires a majority, it is best to use an odd number of machines. ..." Also note that with an even number of nodes you run the risk of a split brain. Say you had 8 nodes and the network partitioned into 2 parts with 4 nodes on each side: neither 4-node side would be able to continue, as neither has quorum. - Ghostwrite
Looks like it should be ceil((N+1) / 2). The quorum for an ensemble of 4 should be 3, not 2 as noted in the HBase book. - Corticosterone
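The 8-node partition example from the comments can be checked with a small Python sketch. It assumes quorum is a strict majority of the full ensemble, as in ceil((N+1) / 2); `partition_status` is a hypothetical helper, and the 9-node case is added only for contrast and does not come from the thread.

```python
def majority(ensemble_size: int) -> int:
    """Strict majority of the configured ensemble (assumed definition)."""
    return ensemble_size // 2 + 1


def partition_status(ensemble_size: int, side_a: int) -> tuple[bool, bool]:
    """For a two-way network split, report whether each side has quorum."""
    side_b = ensemble_size - side_a
    needed = majority(ensemble_size)
    return side_a >= needed, side_b >= needed


print(partition_status(8, 4))  # even 4/4 split: neither side has quorum
print(partition_status(9, 5))  # odd 5/4 split: only the larger side does
```

With 8 nodes split 4/4, both sides fall short of the 5 servers required, so the whole service stops. With 9 nodes, any two-way split leaves exactly one side able to continue.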
S
5

Basically, ZooKeeper works fine as long as the active servers are in the majority compared to the failed servers. With an even ensemble size (2, 4, 6, etc.), the number of failed servers can equal the number of active servers, which leaves no majority; that is why even sizes are not recommended.

Ensembles of 3 and 4 both tolerate only 1 failure, so why would we want to use 4 ZooKeeper servers instead of 3?
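The same comparison can be tabulated with a short Python sketch. It assumes a strict-majority quorum; the helper names and column labels are illustrative choices, not ZooKeeper terminology.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of servers that is more than half the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority is still running."""
    return ensemble_size - majority(ensemble_size)


print(f"{'Ensemble':>8} {'Majority':>8} {'Tolerated failures':>18}")
for n in range(1, 8):
    print(f"{n:>8} {majority(n):>8} {tolerated_failures(n):>18}")
```

Each even size tolerates the same number of failures as the odd size just below it (4 matches 3, 6 matches 5), so the extra server adds cost without adding fault tolerance.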

[Image not available: a table that includes a "Majority" column]

Scaffolding answered 12/9, 2018 at 11:1 Comment(1)
This seems like a very interesting observation. Can you please elaborate on how to interpret the "Majority" column? - Minstrel
