How to start Zookeeper and then Kafka?
Asked Answered
R

2

5

I'm getting started with Confluent Platform which requires to run Zookeeper (zookeeper-server-start /etc/kafka/zookeeper.properties) and then Kafka (kafka-server-start /etc/kafka/server.properties). I am writing an Upstart script that should run both Kafka and Zookeeper. The issue is that Kafka should block until Zookeeper is ready (because it depends on it) but I can't find a reliable way to know when Zookeeper is ready. Here are some attempts in pseudo-code after running the Zookeeper server start:

  1. Use a hardcoded block

    sleep 5   
    

    Does not work reliably on slower computers and/or waits longer than needed.

  2. Check when something (hopefully Zookeeper) is running on port 2181

    wait until $(echo stat | nc localhost ${port}) is not none
    

    This did not seem to work as it doesn't wait long enough for Zookeeper to accept a Kafka connection.

  3. Check the logs

     wait until specific string in zookeeper log is found
    

    This is sketchy and there isn't even a string that cannot also be found on error too (e.g. "binding to port [...]").

Is there a reliable way to know when Zookeeper is ready to accept a Kafka connection? Otherwise, I will have to resort to a combination of 1 and 2.

Rewire answered 19/1, 2017 at 19:56 Comment(3)
I would have expected technique #2 to be sufficient. Can you please add more details about how startup fails when trying technique #2?Quintuplicate
@ChrisNauroth The exact error I'm getting in Kafka for technique #2 is the following: "FATAL [Kafka Server 0], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer) java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/0. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering." -- it's fine if I add a delay after this though.Rewire
no one really answer the question "how to start zookeeper and then kafka" ...Pebbly
Q
3

The Kafka error message from your comment is definitely relevant:

FATAL [Kafka Server 0], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer) java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/0. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.

This indicates that ZooKeeper is up and running, and Kafka was able to connect to it. As I would have expected, technique #2 was sufficient for verifying that ZooKeeper is ready to accept connections.

Instead, the problem appears to be on the Kafka side. It has registered a ZooKeeper ephemeral node to represent the starting Kafka broker. An ephemeral node is deleted automatically when the client's ZooKeeper session expires (e.g. the process terminates so it stops heartbeating to ZooKeeper). However, this is based on timeouts. If the Kafka broker restarts rapidly, then after restart, it sees that a znode representing that broker already exists. To the new process start, this looks like there is already a broker started and registered at that path. Since brokers are expected to have unique IDs, it aborts.

Waiting for a period of time past the ZooKeeper session expiration is an appropriate response to this problem. If necessary, you could potentially tune the session expiration to happen faster as discussed in the ZooKeeper Administrator's Guide. (See discussion of tickTime, minSessionTimeout and maxSessionTimeout.) However, tuning session expiration to something too rapid could cause clients to experience spurious session expirations during normal operations.

I have less knowledge on Kafka, but perhaps there is also something that can be done on the Kafka side. I know that some management tools like Apache Ambari take steps to guarantee assignment of a unique ID to each broker on provisioning.

Quintuplicate answered 19/1, 2017 at 21:21 Comment(1)
Kafka itself provisions a unique ID to its brokers. The feature was contributed by an Hortonworks employee, so perhaps Ambari is using this feature?Holofernes
M
5

I found that using a timer is not reliable. the second option (waiting for the port) worked for me:

bin/zookeeper-server-start.sh -daemon config/zookeeper.properties && \
while ! nc -z localhost 2181; do sleep 0.1; done && \
bin/kafka-server-start.sh -daemon config/server.properties
Marlette answered 1/5, 2019 at 23:38 Comment(0)
Q
3

The Kafka error message from your comment is definitely relevant:

FATAL [Kafka Server 0], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer) java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/0. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.

This indicates that ZooKeeper is up and running, and Kafka was able to connect to it. As I would have expected, technique #2 was sufficient for verifying that ZooKeeper is ready to accept connections.

Instead, the problem appears to be on the Kafka side. It has registered a ZooKeeper ephemeral node to represent the starting Kafka broker. An ephemeral node is deleted automatically when the client's ZooKeeper session expires (e.g. the process terminates so it stops heartbeating to ZooKeeper). However, this is based on timeouts. If the Kafka broker restarts rapidly, then after restart, it sees that a znode representing that broker already exists. To the new process start, this looks like there is already a broker started and registered at that path. Since brokers are expected to have unique IDs, it aborts.

Waiting for a period of time past the ZooKeeper session expiration is an appropriate response to this problem. If necessary, you could potentially tune the session expiration to happen faster as discussed in the ZooKeeper Administrator's Guide. (See discussion of tickTime, minSessionTimeout and maxSessionTimeout.) However, tuning session expiration to something too rapid could cause clients to experience spurious session expirations during normal operations.

I have less knowledge on Kafka, but perhaps there is also something that can be done on the Kafka side. I know that some management tools like Apache Ambari take steps to guarantee assignment of a unique ID to each broker on provisioning.

Quintuplicate answered 19/1, 2017 at 21:21 Comment(1)
Kafka itself provisions a unique ID to its brokers. The feature was contributed by an Hortonworks employee, so perhaps Ambari is using this feature?Holofernes

© 2022 - 2024 — McMap. All rights reserved.