What can cause a ZooKeeper "Client session timed out" error?
L

5

13

I deployed a long-running Storm topology. After several hours, the whole topology went down. I checked the worker logs and found the entries below. They show that the ZooKeeper client session timed out, which triggered a reconnect. I suspect this is related to the topology failure, so I am trying to find out what can cause the client to time out.

2016-02-29T10:34:12.386+0800 o.a.s.z.ClientCnxn [INFO] Client session timed out, have not heard from server in 23789ms for sessionid 0x252f862028c0083, closing socket connection and attempting reconnect
2016-02-29T10:34:12.986+0800 o.a.s.c.f.s.ConnectionStateManager [INFO] State change: SUSPENDED
2016-02-29T10:34:13.059+0800 b.s.cluster [WARN] Received event :disconnected::none: with disconnected Zookeeper.
2016-02-29T10:34:13.197+0800 o.a.s.z.ClientCnxn [INFO] Opening socket connection to server zk-3.cloud.mos/172.16.13.147:2181. Will not attempt to authenticate using SASL (unknown error)
2016-02-29T10:34:13.241+0800 o.a.s.z.ClientCnxn [WARN] Session 0x252f862028c0083 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_31]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716) ~[na:1.8.0_31]
    at org.apache.storm.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[storm-core-0.9.6.jar:0.9.6]
    at org.apache.storm.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) ~[storm-core-0.9.6.jar:0.9.6]
Lysin answered 1/3, 2016 at 7:6 Comment(0)
B
7

Your client can no longer talk to the ZooKeeper server. The first thing that happened was there was no answer to the heartbeats within the negotiated session timeout:

2016-02-29T10:34:12.386+0800 o.a.s.z.ClientCnxn [INFO] Client session timed out, have not heard from server in 23789ms for sessionid 0x252f862028c0083, closing socket connection and attempting reconnect

Then when it tried to reconnect, it got a connection refused:

2016-02-29T10:34:13.241+0800 o.a.s.z.ClientCnxn [WARN] Session 0x252f862028c0083 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused

This means either your ZooKeeper server:

  • Is not reachable (network connection down)
  • Is dead (so nothing is listening on the socket)
  • Is GCing itself to death and cannot communicate (although that would more likely have produced a connection timeout error than a refused connection, I'm not sure)
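
To narrow down which of these cases applies, a quick TCP check from the worker host can help. The following is only a minimal sketch: the function name `probe` and its return labels are made up for illustration, and it only tests whether something accepts connections on the port, not whether ZooKeeper itself is healthy.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a single TCP connection attempt to host:port.

    Returns "open", "refused", "timeout", or "error: ..." (hypothetical labels).
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within `timeout` seconds (network down, firewall, stalled host).
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar errors end up here.
        return f"error: {exc}"
```

For the server in the logs above, the call would be `probe("zk-3.cloud.mos", 2181)`. A result of "refused" points towards a dead server process, while "timeout" points towards a network problem or a stalled host.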

To learn more, you will need to check the ZooKeeper server logs on your (Hadoop?) cluster.
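
As background for the first log line, here is a rough sketch of how the "negotiated session timeout" relates to the point at which the client gives up. It rests on assumptions that are not confirmed by the logs: that the server clamps the requested timeout to between 2 and 20 times its tickTime (unless minSessionTimeout or maxSessionTimeout are set explicitly), and that the Java client stops waiting after about two thirds of the negotiated value. The function names, the tickTime of 2000 ms, and the 20000 ms request are illustrative only.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms=2000, min_ms=None, max_ms=None):
    """Clamp the client's requested session timeout the way the server is assumed to."""
    lower = 2 * tick_time_ms if min_ms is None else min_ms    # assumed default minimum
    upper = 20 * tick_time_ms if max_ms is None else max_ms   # assumed default maximum
    return max(lower, min(requested_ms, upper))


def client_read_timeout(negotiated_ms):
    """Silence threshold after which the client is assumed to drop the connection."""
    return negotiated_ms * 2 // 3


# Hypothetical example: a client requesting 20000 ms from a server with tickTime=2000.
session_ms = negotiated_session_timeout(20000)
read_ms = client_read_timeout(session_ms)
print(session_ms, read_ms)
```

Under these assumptions, a gap such as the 23789 ms in the log is well past the threshold, so the server was silent (or the client itself was stalled) for a long time, not just briefly.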

Both answered 1/3, 2016 at 21:59 Comment(1)
Please, I'm facing the same problem. If I have a GC problem, how can I solve it? - Ruination
F
1

It worked for me after increasing the connection timeout in Kafka's server.properties:

zookeeper.connection.timeout.ms=60000
Fimble answered 8/10, 2022 at 14:21 Comment(0)
O
0

One way this can happen is if you start ZooKeeper, then interrupt it in the terminal, and then try to start Kafka.

To use Kafka, you really should use 3 terminal windows (or 3 PuTTY sessions if you are SSHing into your instance from Windows):

  • First session: the ZooKeeper server
  • Second session: the Kafka server
  • Third session: running Kafka commands, for example to create topics

Opine answered 29/4, 2021 at 18:14 Comment(0)
H
-2

I started Kafka in cluster mode with 3 ZooKeeper servers and 3 Kafka servers. All ZooKeeper servers started successfully, but each Kafka server disconnected during startup with "fatal error during Kafka server startup. prepare to shutdown (kafka.server.kafkaserver)". While investigating, I found that the Kafka server disconnected after 18 seconds every time, which matches zookeeper.connection.timeout.ms = 18000 (the default value), so I changed that setting and the issue was resolved.

Hance answered 25/11, 2020 at 5:39 Comment(1)
Please provide some context and explanation to your answer. Even if this solves the problem, it doesn't help the reader understand what went wrong. - Plerre
R
-4

Always use 2181 as the port number for ZooKeeper connections unless you have configured ZooKeeper to use a different port.

Rudelson answered 24/8, 2017 at 10:13 Comment(0)
