ZooKeeper keeps getting EndOfStreamException, causing a crash

My ZooKeeper instance manages several queues for different jobs, holding the relevant job data in each node until the computer is ready to process it. If I stop the overall service so that no jobs can be started, ZooKeeper runs just fine after a restart. However, some of these jobs seem to cause ZooKeeper to crash with the following message in the ZooKeeper log:

WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x15677f740ad002a, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /127.0.0.1:46998 which had sessionid 0x15677f740ad002a

My ZooKeeper knowledge is very limited, as I am taking over from the guy that set it up originally.

I have tried deleting a lot of nodes with rmr [path] in the ZooKeeper shell, which seemed to have some effect (I deleted 50k+ nodes that were left over and of no use), but it has kept crashing daily, and last night I couldn't get it to run for more than a couple of minutes before the same error and crash occurred.

How do I find out what is causing this?

I am pretty sure it is some general problem with the data that is received, or with the stored data/nodes. The disk is only 92% full. I also found this post: Zookeeper keeps getting the WARN: "caught end of stream exception", but the solution doesn't make much sense to me. I am also pretty sure that none of the messages kept in my znodes are larger than 1 MB, but I am unsure how to confirm this.
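One possible way to confirm the znode sizes is to walk the tree and measure each node's payload. The sketch below is only an illustration: it assumes a client object shaped like kazoo's `KazooClient` (with `get_children(path)` and `get(path)`), and the function name `find_large_znodes` and the 1 MB default threshold are hypothetical choices, picked because ZooKeeper's default `jute.maxbuffer` limit is just under 1 MB.

```python
from collections import deque

ONE_MB = 1024 * 1024  # assumption: roughly ZooKeeper's default jute.maxbuffer


def find_large_znodes(client, root="/", threshold=ONE_MB):
    """Walk the tree under `root` and return (path, size) for large znodes.

    `client` is assumed to behave like kazoo's KazooClient:
    get_children(path) -> list of child names, get(path) -> (data, stat).
    """
    large = []
    pending = deque([root])
    while pending:
        path = pending.popleft()
        data, _stat = client.get(path)
        size = len(data or b"")  # empty znodes may come back as None
        if size >= threshold:
            large.append((path, size))
        for child in client.get_children(path):
            pending.append(path.rstrip("/") + "/" + child)
    # Largest nodes first
    return sorted(large, key=lambda item: item[1], reverse=True)


# Hypothetical usage, assuming the kazoo package is installed and the
# server listens on 127.0.0.1:2181 as in the log above:
#
#   from kazoo.client import KazooClient
#   zk = KazooClient(hosts="127.0.0.1:2181")
#   zk.start()
#   for path, size in find_large_znodes(zk, threshold=512 * 1024):
#       print(size, path)
#   zk.stop()
```

The walk does not handle nodes that are deleted while it runs, so it is probably best run while the job service is stopped.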

Is there some way I can change the ZooKeeper log so that I can print additional information, such as the content/name of the znode it is operating on before it crashes?

Micropyle answered 11/8, 2016 at 5:34 Comment(0)

I was able to solve the problem by deleting all ZooKeeper snapshots and log files from the server running ZooKeeper. I don't know why this made a difference, but it has been running fine for the last 22 hours.

Micropyle answered 12/8, 2016 at 5:27 Comment(2)
Is your server still running fine after deleting the ZooKeeper snapshots and logs? Or do you have to do that from time to time? - Invocate
It continues to crash once in a while. The problem was not permanently fixed. Deleting logs and snapshots seems to help each time, though I now also try to delete all nodes in ZooKeeper. - Micropyle

This exception indicates the end of a session's data stream. It usually occurs when a connection to ZooKeeper is closed. It does not signal a defect on the ZooKeeper side; it shows that the client's connection was reset or closed. So please ignore the warning.

2020-08-17 09:05:05 WARN NIOServerCnxn:368 - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x373fb86e57b0018, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
        at java.lang.Thread.run(Thread.java:748)
2020-08-17 09:05:05 INFO NIOServerCnxn:1044 - Closed socket connection for client /xx.xx.xx.xx:55380 which had sessionid 0x373fb86e57b0018
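Since the warning points at the client side, it may help to see which sessions trigger it most often. The following is a minimal sketch, assuming the log lines look like the excerpts in this thread; the function name `count_end_of_stream` and the log file path in the usage comment are hypothetical.

```python
import re
from collections import Counter

# Matches the session id on lines such as:
#   EndOfStreamException: Unable to read additional data from client
#   sessionid 0x15677f740ad002a, likely client has closed socket
SESSION_RE = re.compile(r"EndOfStreamException: .*?sessionid (0x[0-9a-fA-F]+)")


def count_end_of_stream(lines):
    """Return a Counter mapping session id -> number of EndOfStreamException lines."""
    counts = Counter()
    for line in lines:
        match = SESSION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical usage (the log path is an assumption, adjust to your setup):
#
#   with open("zookeeper.out") as log:
#       for session, n in count_end_of_stream(log).most_common(10):
#           print(n, session)
```

A session id that shows up repeatedly can then be matched against the "Closed socket connection for client" lines to find the client address behind it.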

Contribution answered 11/8, 2016 at 5:34 Comment(0)
