Implications of keeping linger.ms at 0

We are using Kafka 0.10.2.1. The documentation says that a buffer is available to send even if it isn't full:

By default a buffer is available to send immediately even if there is additional unused space in the buffer. However if you want to reduce the number of requests you can set linger.ms to something greater than 0.

However, it also says that the producer will attempt to batch requests even if the linger time is set to 0 ms:

Note that records that arrive close together in time will generally batch together even with linger.ms=0 so under heavy load batching will occur regardless of the linger configuration; however setting this to something larger than 0 can lead to fewer, more efficient requests when not under maximal load at the cost of a small amount of latency.

Intuitively, it seems that any kind of batching would require some linger time, and that the only way to achieve a linger time of 0 would be to make the broker call synchronous. Keeping the linger time at 0 clearly doesn't harm performance as much as blocking on the send call, but it still seems to have some impact. Can someone clarify what the docs are saying above?

Appendage answered 16/3, 2018 at 7:6

The docs are saying that even with linger.ms set to 0, you may still get some batching under load, because records are added to the buffer faster than the sender thread can dispatch them. A setting of 0 optimizes for minimal latency. If the measure of performance you really care about is throughput, you would increase the linger time a bit to batch more, and that is what the docs are getting at. It has little to do with synchronous sends in this case.
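To make the "batching under load even with linger.ms=0" point concrete, here is a rough toy model in Python. It is not the Kafka implementation: it assumes a single partition, one in-flight request at a time, and a fixed per-request cost, and the names `simulate_batches`, `request_cost_ms` and `max_batch_records` are made up for this sketch.

```python
def simulate_batches(arrivals_ms, linger_ms, request_cost_ms, max_batch_records=100):
    """Toy model of a producer's sender loop; returns the size of each batch sent.

    Assumptions (not taken from Kafka's source): one partition, one request
    in flight at a time, and every request costs request_cost_ms.
    arrivals_ms must be sorted in ascending order.
    """
    batches = []
    sender_free_at = 0
    i, n = 0, len(arrivals_ms)
    while i < n:
        first = arrivals_ms[i]
        # A batch becomes sendable when it is full or when its oldest
        # record has waited linger_ms, whichever happens first.
        full_idx = i + max_batch_records - 1
        full_at = arrivals_ms[full_idx] if full_idx < n else float("inf")
        ready_at = min(first + linger_ms, full_at)
        # The sender cannot start before it finishes the previous request.
        send_at = max(ready_at, sender_free_at)
        # Everything that arrived while waiting rides along in the same batch.
        j = i
        while j < n and j - i < max_batch_records and arrivals_ms[j] <= send_at:
            j += 1
        batches.append(j - i)
        sender_free_at = send_at + request_cost_ms
        i = j
    return batches


# Heavy load: one record per ms, each request takes 5 ms, linger.ms = 0.
print(simulate_batches(list(range(20)), linger_ms=0, request_cost_ms=5))
# -> [1, 5, 5, 5, 4]

# Light load: one record every 10 ms, each request takes 2 ms.
light = [0, 10, 20, 30, 40]
print(simulate_batches(light, linger_ms=0, request_cost_ms=2))   # -> [1, 1, 1, 1, 1]
print(simulate_batches(light, linger_ms=25, request_cost_ms=2))  # -> [3, 2]
```

In this model, a linger of 0 still yields multi-record batches once records arrive faster than requests complete, while under light load only a positive linger produces batching, at the cost of added latency.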

Aylmar answered 17/3, 2018 at 2:41

With linger.ms=0, each record is sent as soon as possible, and with many requests this can hurt performance. Forcing a short wait by increasing linger.ms under moderate to high load makes better use of each batch and increases throughput. It also depends on record size: the larger the records, the fewer fit in a batch (batch.size defaults to 16 KB).
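As a back-of-the-envelope illustration of the record-size point, the sketch below estimates how many records fit in one batch. It deliberately ignores batch and record header overhead and compression, so real numbers will differ; `records_per_batch` is a hypothetical helper, not a Kafka API.

```python
DEFAULT_BATCH_SIZE_BYTES = 16384  # the 16 KB batch.size default mentioned above


def records_per_batch(record_size_bytes, batch_size_bytes=DEFAULT_BATCH_SIZE_BYTES):
    """Rough upper bound on records per batch, ignoring headers and compression."""
    if record_size_bytes <= 0:
        raise ValueError("record_size_bytes must be positive")
    # Assumption for this sketch: a record larger than batch.size still
    # goes out, alone in its own batch, so the result is never below 1.
    return max(1, batch_size_bytes // record_size_bytes)


print(records_per_batch(200))    # -> 81
print(records_per_batch(1024))   # -> 16
print(records_per_batch(20000))  # -> 1
```

With 1 KB records, at most about 16 fit per batch, so a longer linger quickly stops helping once batches fill up; with small records there is far more room to gain from waiting.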

Basically it is a trade-off between latency and the number of requests (and therefore throughput), and it really depends on your scenario. However, sending immediately does not take full advantage of batching and compression (if enabled), so I suggest measuring with different values of linger.ms, such as 0, 5, 10, 50, and 200.

In general I would suggest setting linger.ms > 0.

Landlordism answered 29/3, 2018 at 12:21

I am by no means a Kafka expert, but these things should be explained more simply; otherwise the metrics you read are not going to be understood.

The first thing to note is that a Sender thread, which is not the thread you call producer::send on, sends batches of messages to the cluster. If the current batch has a single message in it, that does not break the rule: the producer still sends batches, it just happens that the current batch contains only one message. There are metrics that let you see how full, on average, a batch was when it was sent.

If many of the batches the sender sends are more empty than full, that is not a good thing. The work it has to do to actually deliver a message is much more expensive than the message itself, and that is why batching exists to begin with.

In such cases linger.ms might help, because it allows a batch to stay a little longer in the RecordAccumulator, so more batching will happen.
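For what it's worth, here is a small sketch of how one might turn an average-batch-size reading into a "how full are my batches" number. I am assuming the reading comes from the producer's `batch-size-avg` metric (average bytes per batch) and that batch.size is at its 16384-byte default; the helper name and the 0.5 threshold are arbitrary choices for illustration.

```python
def batch_fill_ratio(batch_size_avg_bytes, batch_size_bytes=16384):
    """Fraction of the configured batch.size that an average batch uses."""
    if batch_size_bytes <= 0:
        raise ValueError("batch_size_bytes must be positive")
    # Cap at 1.0: a single oversized record can exceed the configured size.
    return min(1.0, batch_size_avg_bytes / batch_size_bytes)


def linger_might_help(batch_size_avg_bytes, batch_size_bytes=16384, threshold=0.5):
    """Heuristic only: mostly-empty batches suggest trying a larger linger.ms."""
    return batch_fill_ratio(batch_size_avg_bytes, batch_size_bytes) < threshold


print(batch_fill_ratio(4096))    # -> 0.25
print(linger_might_help(4096))   # -> True
print(linger_might_help(15000))  # -> False
```

A low ratio only says there is room to batch more; whether the extra latency is acceptable is still something to measure for your own workload.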

Isolecithal answered 18/9, 2022 at 12:17