TCP keep-alive gets involved after TCP zero-window and closes the connection erroneously

We're seeing this pattern happen a lot between two RHEL 6 boxes that are transferring data over a TCP connection. The client sends a segment that Wireshark flags as TCP Window Full; 0.2 s later the client starts sending TCP Keep-Alives, to which the server responds with what look like correctly shaped replies. The client is not satisfied by this, however, and keeps sending TCP Keep-Alives until it finally closes the connection with an RST nearly 9 s later.

This is despite the RHEL boxes having the default TCP Keep-Alive configuration:

net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_intvl = 75

...which says that keep-alive should only kick in after two hours of silence on the connection. Am I reading my PCAP wrong (relevant packets available on request)?
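
For what it's worth, my understanding is that those sysctls only come into play at all if the application has enabled keep-alive on its socket, roughly like the following (a minimal sketch with illustrative values, not our application's actual code):

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Keep-alive is off by default; the sysctls above are only consulted
# once SO_KEEPALIVE has been enabled on the socket.
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
# Optional per-socket overrides of the system-wide defaults (Linux-specific).
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 7200)  # tcp_keepalive_time
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 75)   # tcp_keepalive_intvl
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 9)      # tcp_keepalive_probes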

Below is a Wireshark screenshot of the pattern, with my own packet notes in the middle.

Wireshark screenshot

Ledoux answered 9/11, 2015 at 22:31 Comment(3)
These are really window probes. Screenshot is illegible as always. – Spool
Can you expand on your assertion please? Two clicks on the screenshot render it legible. PCAP extract available if required. – Ledoux
Two clicks didn't make it legible for me. I don't see any need to expand further. – Spool

Actually, these "keep-alive" packets are not used for TCP keep-alive at all! They are window probes, used to detect updates to the peer's window size.

Wireshark labels them as keep-alive packets only because they are shaped like keep-alive packets.

A TCP keep-alive packet is simply an ACK with the sequence number set to one less than the current sequence number for the connection.

(We assume that IP 10.120.67.113 refers to host A and 10.120.67.132 refers to host B.) In packet No. 249511, B acks seq 24507484. In the next packet (No. 249512), A sends seq 24507483 (24507484 - 1).

Wireshark screenshot of packets No. 249511 and No. 249512

Why are there so many "keep-alive" packets, and what are they used for?

A sends data to B, and B replies with a zero window size to tell A that it temporarily can't receive any more data. So that A finds out when B can receive data again, A keeps sending these "keep-alive" probes to B on a persist timer, and B replies with its current window size (in our case, B's window size stays at zero).

The normal TCP exponential backoff is used when calculating the persist timer, so we see A send its first "keep-alive" probe after 0.2 s, its second after 0.4 s, the third after 0.8 s, the fourth after 1.6 s, and so on.
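
If you want to watch this yourself, here is a rough way to reproduce it on one machine (a sketch only: the port and buffer sizes are arbitrary, and the exact probe timings depend on the kernel). Run it, capture on the loopback interface, and once the receiver's window hits zero you should see probes that Wireshark labels as keep-alives, spaced with the backoff described above:

import socket
import threading
import time

PORT = 5555  # arbitrary demo port

def receiver():
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", PORT))
    srv.listen(1)
    conn, _ = srv.accept()
    # Never read from conn: its receive buffer fills up and the
    # advertised window shrinks to zero.
    time.sleep(60)
    conn.close()

threading.Thread(target=receiver, daemon=True).start()
time.sleep(0.5)  # let the listener come up

snd = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
snd.connect(("127.0.0.1", PORT))
snd.settimeout(30)
try:
    while True:
        # Fill the receiver's window and our own send buffer; after that
        # the kernel starts sending zero-window probes on the persist timer.
        snd.sendall(b"x" * 65536)
except socket.timeout:
    print("no progress for 30 s, peer window is still zero")
finally:
    snd.close()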

This phenomenon is related to TCP flow control.

Discant answered 19/1, 2019 at 19:33 Comment(0)

The source and destination IP addresses in the packets originating from the client do not match the destination and source IP addresses in the response packets, which indicates that there is some device between the boxes doing NAT. It is also important to understand where the packets were captured; a packet capture on the client itself would probably help in understanding the issue.

Note that the client generates a TCP keepalive if it does not receive a data packet for two hours or more. As per RFC 1122, the client retries the keepalive if it does not receive a keepalive response from the peer, and it eventually disconnects after repeated failures.

NAT devices typically implement connection caches to maintain the state of ongoing connections. If the connection cache reaches its size limit, the NAT device drops old connections in order to service new ones. This could also lead to such a scenario.

The given packet capture indicates a high probability that packets are not reaching the client, so it would be helpful to capture packets on the client machine.
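
For example, something along these lines on the client (the interface name and peer address are placeholders):

tcpdump -i eth0 -s 0 -w client_side.pcap host <peer-ip>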

Chiles answered 19/11, 2015 at 5:54 Comment(0)

I read the trace slightly differently: the sender sends more data than the receiver can handle and gets a zero-window response. The sender then sends window probes (not keep-alives; it is way too soon for that), and the application gives up after 10 seconds with no progress and closes the connection; the reset indicates there is data pending in the TCP send buffer. If the application uses a large block size when writing to the socket, it may have seen no progress for even longer than the 10 seconds visible in the tcpdump.

If this is a straight connection (no proxies etc.), the most likely reason is that the receiving app stopped reading (or is reading more slowly than the sender is transmitting).
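
One way an application can produce exactly this on-the-wire pattern, an abrupt RST after giving up on a stalled send, is to abort the connection rather than close it gracefully. Whether the client application here actually does this is only a guess; a minimal self-contained sketch of the abort itself:

import socket
import struct

# Minimal peer so the example is self-contained.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(srv.getsockname())
# SO_LINGER with a zero timeout makes close() discard anything still
# queued in the send buffer and emit an RST instead of a FIN.
s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
s.close()
srv.close()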

Radiocommunication answered 22/3, 2016 at 15:38 Comment(0)

It looks to me like packet number 249522 provoked the application on 10.120.67.113 to abort the connection. All the window probes get a zero-window response from .132 (with no payload), and then .132 sends the (unsolicited) packet 249522 with 63 bytes of payload (and still advertising a zero window). The PSH flag suggests that these 63 bytes are the entire message written by the app on .132. Then .113 responds with an RST in the same millisecond. I can't think of any reason why the TCP stack itself would send an RST immediately after receiving in-sequence data, so in my view it is almost certain that the app on .113 decided to give up based on the 63-byte message sent by .132.

Whirlpool answered 9/1, 2017 at 6:24 Comment(0)
