ZeroMQ Client Loses Connection

I have a client (PULL) that connects to a server (PUSH). At first they work just fine, but later the connection breaks and the client-side ZeroMQ doesn't try to reconnect to the server.

One mysterious thing: if I run netstat on both the client and the server, the client side shows the connection as still ESTABLISHED, while the server side has no corresponding entry. I suppose this is why the client side doesn't reconnect.

PS: the client and server are in different IDCs, and there is a bandwidth limit between them. But when the disconnection happens, our monitoring shows that the limit has not been hit.

Also, when I run netstat on the server side (while the connection is fine), the Send-Q column is sometimes very large and then drops back to 0.

That's all the information I have. If you need more details please tell me.

Periwig answered 8/10, 2012 at 8:54 Comment(3)
What language? Any code examples? If you're destroying your context (for example, it goes out of scope), the sockets won't reconnect. Normally, as long as everything still exists, zmq handles connection drops and the like without much trouble. - Moskva
Did you find any more information on this? I am using a C# binding (clrzmq) and have experienced something similar. I use a SUB socket connected to multiple endpoints, and all of a sudden all incoming data is lost. If I call disconnect and then connect again (in the client), all is good again. I will check netstat if I can get it to happen again. - Starbuck
I might add that I have implemented brute-force disconnect detection (using heartbeats and timeouts), which forces the ZeroMQ socket to disconnect and reconnect endpoints that go silent for a while. Unfortunately, this has the drawback that all queues on both sides are dropped for the "reset" connections. - Starbuck
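For anyone wanting to try the heartbeat-and-timeout approach from the last comment, here is a minimal sketch of the bookkeeping side in Python. It is not the commenter's code: `LivenessTracker`, `reset_silent`, and the `reconnect` callback are hypothetical names made up for illustration, and the sketch assumes the peer sends some message (data or a heartbeat) at least once per timeout window.

```python
import time


class LivenessTracker:
    """Remember when each endpoint was last heard from (hypothetical helper)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}

    def watch(self, endpoint):
        # Start (or restart) the silence timer for an endpoint.
        self.last_seen[endpoint] = self.clock()

    def heard_from(self, endpoint):
        # Call this for every message or heartbeat received from the endpoint.
        self.last_seen[endpoint] = self.clock()

    def silent_endpoints(self):
        # Endpoints that have been quiet for longer than the timeout.
        now = self.clock()
        return [ep for ep, seen in self.last_seen.items()
                if now - seen > self.timeout_s]


def reset_silent(tracker, reconnect):
    """Call reconnect(endpoint) for each silent endpoint and restart its timer."""
    silent = tracker.silent_endpoints()
    for ep in silent:
        reconnect(ep)
        tracker.watch(ep)
    return silent
```

With pyzmq, the `reconnect` callback could be something like `lambda ep: (sock.disconnect(ep), sock.connect(ep))`, called from the same thread that owns the socket. As the comment notes, anything queued for a reset connection may be lost.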

I realize this is a very old question, but I ran into almost the exact same issue and found this while trying to find a fix. I believe I have fixed my issue so hopefully this helps someone at some point.

I had the same scenario but with ROUTER -> ROUTER. Everything worked great at first, but after ~15 minutes without sending any messages, messages would no longer get through. Then I found http://api.zeromq.org/3-2:zmq-setsockopt. The three socket options that worked for me were (using pyzmq):

# self.client is my socket here
self.client.setsockopt(zmq.TCP_KEEPALIVE, 1)          # turn TCP keepalive on
self.client.setsockopt(zmq.TCP_KEEPALIVE_IDLE, 300)   # seconds idle before the first probe
self.client.setsockopt(zmq.TCP_KEEPALIVE_INTVL, 300)  # seconds between probes

These override the OS defaults, and I'm no longer seeing the connection time out or drop.
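Applied to the PULL client from the question, the same options might look like the sketch below. This is only an illustration under assumptions: `make_pull_socket` is a hypothetical helper, the endpoint is a placeholder, and the 300-second values are simply carried over from above. The one detail worth calling out is ordering: libzmq applies socket options to connections made after they are set, so the options are set before `connect()`.

```python
import zmq


def make_pull_socket(endpoint, idle_s=300, interval_s=300):
    """Create a PULL socket with TCP keepalive enabled, then connect it."""
    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.PULL)
    # Set keepalive options first; they apply to connections made afterwards.
    sock.setsockopt(zmq.TCP_KEEPALIVE, 1)
    sock.setsockopt(zmq.TCP_KEEPALIVE_IDLE, idle_s)       # seconds
    sock.setsockopt(zmq.TCP_KEEPALIVE_INTVL, interval_s)  # seconds
    sock.connect(endpoint)
    return sock


if __name__ == "__main__":
    # Placeholder endpoint; replace with the real server address.
    client = make_pull_socket("tcp://127.0.0.1:5555")
    print(client.getsockopt(zmq.TCP_KEEPALIVE_IDLE))
    client.close(linger=0)
```

Keeping the idle time well under the ~15 minutes after which my connection went dead is what seemed to matter; the exact values are a judgment call for your network.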

Huffish answered 14/7, 2017 at 18:19 Comment(0)
