WebSockets ping/pong, why not TCP keepalive?

Asked 23/4, 2014 at 8:0 Answered 22/8, 2023 at 15:24

113

WebSockets have the option of sending pings to the other end, where the other end is supposed to respond with a pong.

Upon receipt of a Ping frame, an endpoint MUST send a Pong frame in response, unless it already received a Close frame. It SHOULD respond with Pong frame as soon as is practical.

TCP offers something similar in the form of keepalive:

[Y]ou send your peer a keepalive probe packet with no data in it and the ACK flag turned on. You can do this because of the TCP/IP specifications, as a sort of duplicate ACK, and the remote endpoint will have no arguments, as TCP is a stream-oriented protocol. On the other hand, you will receive a reply from the remote host (which doesn't need to support keepalive at all, just TCP/IP), with no data and the ACK set.

I would think that TCP keepalive is more efficient, because it can be handled within the kernel without the need to transfer data up to user space, parse a websocket frame, craft a response frame, and hand that back to the kernel for transmission. It's also less network traffic.
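To make that user-space work concrete, here is a minimal Python sketch of answering a Ping at the frame level. It assumes the RFC 6455 framing (opcode 0x9 for Ping, 0xA for Pong, control payloads of at most 125 bytes, client-to-server frames masked). The function names are hypothetical and not taken from any library, and it only handles a single, complete control frame.

```python
import os
from typing import Tuple

OP_PING, OP_PONG = 0x9, 0xA  # control-frame opcodes from RFC 6455


def build_control_frame(opcode: int, payload: bytes = b"", mask: bool = False) -> bytes:
    """Build one unfragmented control frame (hypothetical helper)."""
    if len(payload) > 125:
        raise ValueError("control frame payload must be at most 125 bytes")
    header = bytes([0x80 | opcode])  # FIN bit set, reserved bits clear
    if mask:
        # Clients must mask their frames with a random 4-byte key.
        key = os.urandom(4)
        masked = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
        return header + bytes([0x80 | len(payload)]) + key + masked
    return header + bytes([len(payload)]) + payload


def parse_control_frame(frame: bytes) -> Tuple[int, bytes]:
    """Return (opcode, unmasked payload) for one short control frame."""
    opcode = frame[0] & 0x0F
    is_masked = bool(frame[1] & 0x80)
    length = frame[1] & 0x7F
    if length > 125:
        raise ValueError("not a valid control frame length")
    if is_masked:
        key = frame[2:6]
        data = frame[6:6 + length]
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(data))
    else:
        payload = frame[2:2 + length]
    return opcode, payload


def pong_for(ping_frame: bytes) -> bytes:
    """Craft the unmasked (server-side) Pong that echoes a Ping's payload."""
    opcode, payload = parse_control_frame(ping_frame)
    if opcode != OP_PING:
        raise ValueError("expected a Ping frame")
    return build_control_frame(OP_PONG, payload)
```

None of this happens for a TCP keepalive probe, which the kernel answers without waking the application.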

Furthermore, WebSockets are explicitly specified to always run over TCP; they're not transport-layer agnostic, so TCP keepalive is always available:

The WebSocket Protocol is an independent TCP-based protocol.

So why would one ever want to use WebSocket ping/pong instead of TCP keepalive?

David answered 23/4, 2014 at 8:0 Comment(4)

Actually one never uses WebSocket ping/pong because no API was created. And one never uses TCP keepalive either, for the reasons noted in the answers. This is a great example of how layering introduces complexity without solving problems: every layer has to implement the same feature, but each is useless for its own reason. So the application still has to implement its own keepalive on top of all the other layers. – Puberulent 6/8, 2021 at 21:56

Exactly, I am using STOMP, which is like a new layer, and STOMP has its own ping/pong. – Duckpin 16/6, 2022 at 1:38

Also, maybe they're trying to leave the door half-open for other transports (in the future / just in case of), e.g. an agnostic (/wholeheartedly controllable) Ping-Pong mechanism. – Quadrilateral 1/9, 2023 at 17:6

[ Adding a new comment, as I ran out of time to edit my previous comment : ] (But Ping-Pong is/was not so fun when a library I use had an 'IsAlive' property/check, that I tried to display the value of within my Ping handler - alongside other properties that I also wanted to assess the value of, only to find that 'IsAlive' was calling the '(Send)Ping()' method/function. D'Oh! Client & Server never-ending Ping-Pong Wars!! ;P) – Quadrilateral 1/9, 2023 at 17:14

111

The problems with TCP keepalive are:

It is off by default.
It operates at two-hour intervals by default, instead of on-demand as the Ping/Pong protocol provides.
It operates between proxies rather than end to end.
As pointed out by @DavidSchwartz, it operates between TCP stacks, not between the applications, so it doesn't tell us whether the application is alive or not.

The comparison with WebSockets ping/pong isn't meaningful. TCP keepalive is automatic, and timed, when enabled, whereas WebSocket ping/pong is executed as required by the application.
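For reference, a minimal Python sketch of turning keepalive on for a single socket and shortening its timers. The `enable_keepalive` name and the default values are made up for illustration. The three tuning constants are only exposed by Python's `socket` module on some platforms (Linux among them), so the sketch checks for each one before using it.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 60,
                     interval: int = 10, count: int = 5) -> None:
    """Enable TCP keepalive on one socket (hypothetical helper).

    idle:     seconds of silence before the first probe
    interval: seconds between unanswered probes
    count:    unanswered probes before the connection is dropped
    """
    # Keepalive is off by default, so it has to be switched on explicitly.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-connection timers: only set the ones this platform exposes.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
```

Even with the timers tuned, the probes are still answered by the peer's TCP stack (or by the nearest proxy), not by the remote application.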

Estellestella answered 23/4, 2014 at 9:50 Comment(9)

You can change both of these on a per-connection basis using setsockopt(2). – David 23/4, 2014 at 11:50

@David You can change the interval on some platforms. Not Windows, FreeBSD, Solaris, ... – Estellestella 18/11, 2015 at 9:21

Also, it does the wrong thing. What we want to know is whether the application on the other end is alive. Since, as you mentioned, TCP keepalives are handled in the kernel, they don't tell us whether the application on the other end is alive. – Subeditor 26/5, 2017 at 23:44

@DavidSchwartz I believe this is the most important reason. Perhaps turn this into an answer? – Ramrod 22/5, 2019 at 7:9

In Ubuntu, I see keepalive acks coming from clients about once every 30 seconds or so, with default settings. It works (I can put a 40-second timeout in the remote non-Ubuntu server TCP, and the connection still stays alive). But your point about excessively layered clients, proxies, etc. is a good reason to prefer WebSocket ping. – Puberulent 21/9, 2020 at 23:34

@David Layering should not be used to solve bugs in the kernel or the application (though, sadly, I have often seen this). The TCP keepalives are supposed to stop when the socket closes. The kernel is supposed to close the socket when the application exits. Applications are supposed to exit when they stop working (for example, they can include a watchdog thread). If any of these does not happen, it's a bug in the kernel or in the application. Fix the bug; don't layer on top of it. – Puberulent 6/8, 2021 at 22:2

@Estellestella you can change the TCP keep-alive interval on Windows, via WSAIoctl(SIO_KEEPALIVE_VALS) instead of setsockopt() – Fonville 24/8, 2023 at 19:17

@RemyLebeau Yes, that's why I said 'by default'. – Estellestella 28/8, 2023 at 23:41

@Estellestella you said: "You can change the interval on some platforms. Not Windows, FreeBSD, Solaris, ..." which is not true. Hence my comment – Fonville 29/8, 2023 at 2:50

Besides EJP's answer, I think it might also be related to HTTP proxy mechanisms. WebSocket connections can also run through an (HTTP) proxy server. In such cases TCP keepalive would only check the connection up to the proxy, not the end-to-end connection.

Drews answered 23/4, 2014 at 9:54 Comment(4)

I'm torn between accepting this or @vtortola's answer, which are both great reasons, but yours has one more upvote so here goes :) – David 23/4, 2014 at 12:43

@David I find it curious that you accepted an answer which is based on another more complete answer. There are several answers here that repeat this information, and several that are more complete. – Estellestella 1/2, 2016 at 3:3

Yours is... now. The proxy thing was the main point I was missing, and that was not in your original answer. – David 1/2, 2016 at 10:27

@David I am talking about several better answers, not just one. – Estellestella 7/6, 2016 at 1:9

http://www.whatwg.org/specs/web-apps/current-work/multipage/network.html#ping-and-pong-frames

.3.4 Ping and Pong frames

The WebSocket protocol specification defines Ping and Pong frames that can be used for keep-alive, heart-beats, network status probing, latency instrumentation, and so forth. These are not currently exposed in the API.

User agents may send ping and unsolicited pong frames as desired, for example in an attempt to maintain local network NAT mappings, to detect failed connections, or to display latency metrics to the user. User agents must not use pings or unsolicited pongs to aid the server; it is assumed that servers will solicit pongs whenever appropriate for the server's needs.

WebSockets have been developed with RTC in mind, so when I look at the ping/pong functionality, I see a way of measuring latency as well. The fact that the pong must return the same payload as the ping makes it very convenient to send a timestamp and then calculate latency from client to server or vice versa.
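A rough Python sketch of the timestamp idea, independent of any particular WebSocket library (the function names are hypothetical). The sender packs its own clock reading into the ping payload, and when the pong echoes it back, the difference from the current clock is the round-trip time. Note this gives round trip, not one-way latency, and it assumes the sender's clock is the only one involved.

```python
import struct
import time
from typing import Optional


def make_ping_payload(now: Optional[float] = None) -> bytes:
    """Pack the sender's clock reading into an 8-byte ping payload."""
    if now is None:
        now = time.monotonic()
    return struct.pack("!d", now)  # network byte order, 64-bit float


def rtt_from_pong(payload: bytes, now: Optional[float] = None) -> float:
    """Compute round-trip time in seconds from an echoed pong payload."""
    if now is None:
        now = time.monotonic()
    (sent_at,) = struct.unpack("!d", payload)
    return now - sent_at
```

The 8-byte payload fits comfortably within the 125-byte limit on control frames, and since only the sender ever reads the timestamp, the two machines' clocks do not need to be synchronized.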

Yseulta answered 23/4, 2014 at 9:55 Comment(0)

TCP keepalive doesn't get passed through a web proxy, whereas WebSocket ping/pong frames are forwarded through web proxies. TCP keepalive is designed to supervise a connection between TCP endpoints, and WebSocket endpoints are not the same as TCP endpoints: a WebSocket connection can use several TCP connections between two WebSocket endpoints.

Beeler answered 15/10, 2014 at 10:24 Comment(0)

Update: while there are implementations of HTTP/1.1 over UDP, the WebSocket protocol does NOT work over UDP (it requires a reliable transport underneath).

original (incorrect) answer follows:

In addition to all the answers that point out valid problems with TCP keepalive (mainly that it doesn't pass through proxies), bear in mind that HTTP (and WebSocket by extension) was designed to work also over UDP.

Malay answered 22/8, 2023 at 15:24 Comment(6)

No, WebSocket only works over TCP, as I mentioned in the question. – David 23/8, 2023 at 16:46

The HTTP/1.1 RFC 2616 does not specify a transport protocol, but does say that "HTTP only presumes a reliable transport; any protocol that provides such guarantees can be used", so UDP is out of the question. – David 23/8, 2023 at 16:48

@David you are correct. My memory must be playing tricks on me because I could swear I checked it earlier and UDP was also an option... huh ;-] – Malay 23/8, 2023 at 20:12

...and I even remember thinking "that's why websocket requires terminating message, because closing a UDP socket does nothing really"... Maybe it was some proposal that I thought was already a standard... ;-] – Malay 23/8, 2023 at 20:15

en.wikipedia.org/wiki/Universal_Plug_and_Play#Protocol maybe :) – David 24/8, 2023 at 18:17

Ever hear of "Reliable UDP"? It's a thing. Also, HTTP/3 RFC 9114 runs over QUIC instead of TCP, and QUIC uses UDP. – Fonville 24/8, 2023 at 19:15
