ZooKeeper on the same node as Kafka?

2

8

I am setting up a Kafka + ZooKeeper cluster with, say, 3 Kafka brokers. Can I set up 3 machines with Kafka on them and run the ZooKeeper cluster on the same nodes? Each machine would then host one Kafka broker and one ZooKeeper node, instead of having 3 machines for Kafka and 3 machines for ZooKeeper (6 in total).

What are the advantages and disadvantages? These machines will most probably be dedicated to running Kafka and ZooKeeper. I would like to reduce costs a bit without sacrificing performance.

Jehias answered 24/6, 2017 at 9:52 Comment(0)
11

We have been running ZooKeeper and Kafka brokers on the same nodes in production for years without any problems. The cluster handles very high QPS and I/O traffic, so I dare say our experience applies to most scenarios.

The advantage is simple: it saves machines. Kafka brokers are I/O-intensive, while ZooKeeper nodes use relatively little disk I/O and CPU, so in most cases they do not interfere with each other.

But remember to keep watching your CPU and I/O usage (network as well as disk), and increase cluster capacity before either becomes a bottleneck.
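As a rough illustration of "increase capacity before you hit the bottleneck", here is a minimal Python sketch. The 70% threshold, the resource names and the sample values are all assumptions made up for the example, not figures from our cluster or recommendations from Kafka or ZooKeeper. In practice the samples would come from whatever monitoring system you already use.

```python
# Hypothetical threshold: plan more capacity once average utilisation
# passes 70%. This number is an assumption chosen for illustration.
HEADROOM_THRESHOLD = 0.70


def needs_more_capacity(samples, threshold=HEADROOM_THRESHOLD):
    """Return the sorted names of resources whose average utilisation
    exceeds the threshold.

    samples maps a resource name to a list of utilisation fractions
    between 0.0 and 1.0. Resources with no samples are ignored.
    """
    hot = []
    for name, values in samples.items():
        if values and sum(values) / len(values) > threshold:
            hot.append(name)
    return sorted(hot)


# Made-up utilisation samples for one colocated Kafka + ZooKeeper machine.
usage = {
    "cpu": [0.35, 0.40, 0.38],
    "disk_io": [0.72, 0.81, 0.77],
    "network": [0.55, 0.60, 0.58],
}

print(needs_more_capacity(usage))  # prints ['disk_io']
```

Averaging is the simplest possible rule. Percentiles or sustained-peak checks may suit bursty traffic better.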

I don't see any disadvantages because we have very good cluster capacity planning.

Nonconformist answered 24/6, 2017 at 13:51 Comment(4)
Thank you for your reply :) If I may ask, could you give a rough estimate of how powerful the machines running your Kafka + ZooKeeper are? Jehias
@Weibo Li Do you have any problems when a server goes down, which would mean a Kafka broker and a ZooKeeper node go down together? We have a similar setup, and I was wondering whether that could be contributing to us getting this issue #52368325 Standush
In my case, ZooKeeper and Kafka going down together is screwing up the whole cluster. Have you ever encountered that issue? Alderney
@Weibo Li Can you take a look at this issue? https://mcmap.net/q/1326602/-whole-cluster-failing-if-one-kafka-node-goes-down/12953672 Alderney
1

It makes sense to colocate them when the Kafka cluster is small (3 to 5 nodes). But keep in mind that you are colocating two applications that are both sensitive to disk I/O. The workloads, and how chatty they are with the local ZooKeeper nodes, also play an important role here, especially from the perspective of page cache memory usage.

Once the Kafka cluster grows to a dozen or more nodes, running a ZooKeeper node on every broker machine creates quorum overhead (slower writes, more nodes involved in quorum checks), so a separate ZooKeeper cluster has to be put in place.

Overall, if Kafka cluster usage is low from the start and you want to save some costs, it is reasonable to start with them colocated. But have a migration strategy for setting up a separate ZooKeeper cluster, so that you are not caught off guard once the Kafka cluster has to be scaled horizontally.
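To make the quorum point concrete, here is a small Python sketch of the majority arithmetic. ZooKeeper needs a strict majority of the ensemble to acknowledge a write, so the number of required acknowledgements grows with every node you add. The function names are my own, purely for illustration.

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of an ensemble with this many nodes."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 5, 12):
    print(f"{n} nodes: quorum of {quorum_size(n)}, "
          f"survives {tolerated_failures(n)} failure(s)")
```

A 3-node colocated setup needs 2 acknowledgements per write and survives the loss of one machine. With a ZooKeeper node on each of 12 brokers, every write needs 7 acknowledgements, yet the ensemble tolerates no more failures than an 11-node one would.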

Bremerhaven answered 21/9, 2019 at 20:3 Comment(0)
