Why does TCP/IP on Windows 7 take 500 sends to warm up? (Windows 10 and 8 do not show the problem)

We are seeing a bizarre and unexplained phenomenon with ZeroMQ on Windows 7 when sending messages over TCP (or over inproc, since ZeroMQ uses TCP internally for signalling on Windows).

The phenomenon is that the first 500 messages arrive slower and slower, with latency rising steadily. Then latency drops and messages arrive consistently rapidly, except for spikes caused by CPU/network contention.

The issue is described here: https://github.com/zeromq/libzmq/issues/1608

It is consistently 500 messages. If we send without a delay, then messages are batched so we see the phenomenon stretch over several thousand sends. If we delay between sends, we see the graph more clearly. Even delaying as much as 50-100 msec between sends does not change things.

Message size is also irrelevant. I've tested with 10-byte messages and 10K messages, with the same results.

The maximum latency is always 2 msec (2,000 usec).

On Linux boxes we do not see this phenomenon.

What we'd like to do is eliminate this initial curve, so messages leave on a fresh connection with their normal low latency (around 20-100 usec).
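For reference, a minimal probe along these lines looks roughly like the sketch below. It is an illustration of the measurement, not the exact test from the github issue: one process pushes small messages through a local TCP PAIR socket pair and prints the per-message latency. The endpoint, message count, payload and the 50 msec pause are arbitrary choices.

```c
/* Minimal latency probe (sketch only; not the test from the github issue).
   One process, two PAIR sockets over local TCP, one-way latency per message. */
#include <zmq.h>
#include <stdio.h>

int main (void)
{
    void *ctx = zmq_ctx_new ();
    void *tx  = zmq_socket (ctx, ZMQ_PAIR);
    void *rx  = zmq_socket (ctx, ZMQ_PAIR);
    zmq_bind    (rx, "tcp://127.0.0.1:5555");   /* endpoint is arbitrary */
    zmq_connect (tx, "tcp://127.0.0.1:5555");

    char buf [10] = { 0 };                      /* 10-byte payload, as in the tests above */
    for (int i = 0; i < 1000; i++) {
        void *watch = zmq_stopwatch_start ();   /* microsecond stopwatch from the libzmq utility API */
        zmq_send (tx, buf, sizeof buf, 0);
        zmq_recv (rx, buf, sizeof buf, 0);
        unsigned long usec = zmq_stopwatch_stop (watch);
        printf ("%d %lu\n", i, usec);           /* message number, latency in usec */
        zmq_poll (NULL, 0, 50);                 /* ~50 msec pause between sends */
    }
    zmq_close (tx);
    zmq_close (rx);
    zmq_ctx_term (ctx);
    return 0;
}
```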


Update: the issue does not show up on Windows 10 or Windows 8. It seems to happen only on Windows 7.

Nationwide asked 29/10, 2015 at 8:58. Comments (2):
Wild hunch: I wonder if this could be related to TCP auto-tuning in Win7: sevenforums.com/network-sharing/… (Everything)
Great hunch, we're checking it out. It turns out the phenomenon does not seem to happen on Windows 10 or on Windows 8, so it could indeed be a Windows 7 auto-tuning feature. (Nationwide)

We've found the cause and a workaround. This is a general issue with all TCP activity on Windows 7 (at least), caused by buffering at the receiver side. You can find some hints online under "TCP slow start."

On a new connection, or if the connection has been idle for (I think) 150 msec or more, the receiver buffers incoming packets and does not deliver them to the application until the receive buffer is full and/or some timeout expires (it's unclear which).

Our workaround in ZeroMQ, where we are using TCP sockets for interthread signalling, is to send a dummy chunk of data on new signal pairs. This forces the TCP stack to work "normally" and we then see consistent latencies of around 100-150 usec.
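At the application level the same idea looks roughly like the sketch below. It assumes you control both ends of a freshly connected socket pair; in libzmq itself the equivalent dummy write happens inside the internal signaler, not in user code, and the function name and 1-byte payload here are illustrative.

```c
/* Sketch: push one throwaway message through a freshly connected pair and
   discard it on the receiving side, so the first real message does not pay
   the Windows 7 buffering penalty. */
#include <zmq.h>

static void warm_up_pair (void *tx, void *rx)
{
    char dummy = 0;
    zmq_send (tx, &dummy, 1, 0);
    zmq_recv (rx, &dummy, 1, 0);    /* swallow it; the TCP path is now "warm" */
}
```

Call it once right after the bind/connect of a new signal pair, before any real traffic.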

I'm not sure whether this is generally useful; for most applications it's profitable to wait a little on reception, so the TCP stack can deliver more to the calling application.

However for apps that send many small messages, this workaround may be helpful.

Note that if the connection goes idle, the slow start happens again, so connections should send a heartbeat every 100 msec or so if latency is critical.
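One way to arrange that, sketched below under the assumption that all traffic for the connection flows through a single forwarding thread: relay real messages as they arrive and emit a tiny heartbeat frame whenever the line has been quiet for about 100 msec. The socket roles, buffer size and 100 msec budget are illustrative, and the receiving side must be prepared to ignore the 1-byte frames.

```c
/* Sketch of a keep-warm loop: relay application messages from an inproc
   socket to the TCP socket, and send a 1-byte heartbeat whenever nothing
   real has gone out for ~100 msec. */
#include <zmq.h>

static void keep_warm_loop (void *from_app, void *tcp_out)
{
    zmq_pollitem_t items [] = { { from_app, 0, ZMQ_POLLIN, 0 } };
    char buf [256];                                  /* illustrative maximum size */
    while (1) {
        int rc = zmq_poll (items, 1, 100);           /* 100 msec idle budget */
        if (rc > 0) {
            int n = zmq_recv (from_app, buf, sizeof buf, 0);
            if (n > (int) sizeof buf)
                n = sizeof buf;                      /* truncate oversize frames */
            if (n >= 0)
                zmq_send (tcp_out, buf, n, 0);       /* relay real traffic */
        }
        else if (rc == 0) {
            char beat = 0;
            zmq_send (tcp_out, &beat, 1, 0);         /* keep the path warm */
        }
    }
}
```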

Nationwide answered 30/10, 2015 at 11:45. Comments (2):
Could you provide a link? The behavior you describe doesn't have anything to do with TCP slow-start. And there must surely be a better solution than sending more data. (Ezarra)
@PieterHintjens One minor question, Pieter: why does your test code on github use the low-resolution s_clock (and cite it as a reason for the poorer time resolution of the experiment), while ZeroMQ has a wonderful ~25 ns-stepping Stopwatch.start() / Stopwatch.stop() utility? (Dilatometer)
