RabbitMQ: throttling fast producer against large queues with slow consumer

We're currently using RabbitMQ, where a continuously super-fast producer is paired with a consumer constrained by a limited resource (e.g. slow-ish MySQL inserts).

We don't like declaring a queue with x-max-length, since all messages will be dropped or dead-lettered once the limit is reached, and we don't want to lose messages.

Adding more consumers is easy, but they'll all be limited by the one shared resource, so that won't work. The problem still remains: How to slow down the producer?

Sure, we could put a flow-control flag in Redis, memcached, MySQL or something else that the producer reads, as pointed out in an answer to a similar question. Or, perhaps better, the producer could periodically check the queue length and throttle itself. But these seem like hacks to me.
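For concreteness, the self-throttling variant would look roughly like this (a sketch assuming a Python producer using pika and an already-declared queue named 'events'; the names and numbers are placeholders, not our real setup):

# Hypothetical self-throttling producer (the "hack" described above).
import time
import pika

MAX_BACKLOG = 10000  # placeholder threshold

connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
channel = connection.channel()

def backlog():
    # A passive declare doesn't modify the queue; it just reports its current depth.
    # Assumes the 'events' queue already exists.
    ok = channel.queue_declare(queue='events', passive=True)
    return ok.method.message_count

def publish(body):
    # Poll the queue depth and wait while the consumer catches up.
    while backlog() > MAX_BACKLOG:
        time.sleep(1)
    channel.basic_publish(exchange='', routing_key='events', body=body)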

I'm mostly questioning whether I have a fundamental misunderstanding. I had expected this to be a common scenario, and so I'm wondering:

What is best practice for throttling producers? How is this done with RabbitMQ? Or do you do this in a completely different way?

Background

Assume the producer actually knows how to slow itself down given the right input, e.g. a hardware sensor or hardware random number generator that can generate as many events as needed.

In our particular real case, we have an API that users can use to add messages. Instead of devouring and discarding messages, we'd like to apply back-pressure by having our API return an error if the queue is "full", so the caller/user knows to back off, or have the API block until the consumer catches up. We don't control our user, so regardless of how fast the consumer is, I can create a producer that is faster.

I was hoping for something like the API for a TCP socket, where a write() can block and a select() can be used to determine if a handle is writable. So either have the RabbitMQ API block, or have it return an error if the queue is full.

Gamesome answered 20/1, 2015 at 9:37 Comment(3)
Just out of curiosity, did you ever solve the problem? We want to implement a similar feature where we want the producer to take some action if the consumer is overloaded. – Kevyn
No, I'm sorry. No solution so far. – Quarter
Thanks Peter! I will let you know if we get to any solution. – Kevyn

For the x-max-length property, you said you don't want messages to be dropped or dead-lettered. There has since been an update that adds more capabilities here. As specified in the documentation:

"Use the overflow setting to configure queue overflow behaviour. If overflow is set to reject-publish, the most recently published messages will be discarded. In addition, if publisher confirms are enabled, the publisher will be informed of the reject via a basic.nack message"

So, as I understand it, you can use the queue limit to reject new messages from publishers, thereby pushing some back-pressure upstream.
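A minimal sketch of the publisher side, assuming a Python client using pika, an 'events' queue and a 10k limit (all placeholders, not from the question):

# Sketch only: length-limited queue with reject-publish overflow,
# publisher confirms enabled, and a nack treated as back-pressure.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
channel = connection.channel()

channel.queue_declare(
    queue='events',
    durable=True,
    arguments={'x-max-length': 10000,          # assumed limit
               'x-overflow': 'reject-publish'})

# With publisher confirms on, basic_publish blocks until the broker acks or nacks.
channel.confirm_delivery()

def try_publish(body):
    try:
        channel.basic_publish(exchange='', routing_key='events', body=body)
        return True
    except pika.exceptions.NackError:
        # The queue is full and the broker nacked the publish: surface this as
        # back-pressure, e.g. return an error from your API instead of dropping data.
        return False

If the publish is nacked, the queue is full, and that is the point at which the API from the question could return an error to its caller instead of accepting the message.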

Branle answered 31/7, 2019 at 8:46 Comment(0)

I don't think this is in any way RabbitMQ-specific. Basically, you have a scenario where two systems have different processing capabilities, and this mismatch will either risk overflowing the queue (whatever it is) or, in the case of a constant mismatch between producer and consumer, simply create an ever-growing lag between event creation and its handling.

I used to deal with this kind of scenario, and unfortunately there is no magic bullet. You either have to speed up event handling (better hardware, more suitable software?) or throttle event creation (which has nothing to do with MQ, really).

Now, I would ask what the goal is and how the events are produced. Are the events produced constantly, at either an unlimited or just a very high rate (for example, readings from sensors: the more, the better), or are they created in batches/spikes (for example, user requests in specific time periods, batch loads from a CRM system)? I assume the goal is to process everything, since you mention you don't want to lose any queued message.

If the output is constant, then some limiter (either an internal counter if the producer is the only producer, or an external queue-length check if the queue can be filled by some other system) is definitely in order:

IF eventsInTimePeriod / timePeriod > estimatedConsumerBandwidth
THEN LowerRate()
ELSE RaiseRate()
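
As a rough illustration only, that loop could look something like this in Python (the bandwidth estimate, bounds and adjustment factors are assumptions you'd have to measure and tune yourself):

import time

estimated_consumer_bandwidth = 500.0  # msgs/sec the consumer can sustain (assumed)
rate = 100.0                          # current producer rate, msgs/sec (assumed start)
period = 10.0                         # seconds between adjustments

def adjust_rate(events_in_period):
    # Compare the observed production rate with the consumer's estimated bandwidth.
    global rate
    if events_in_period / period > estimated_consumer_bandwidth:
        rate = max(rate * 0.8, 1.0)       # LowerRate()
    else:
        rate = min(rate * 1.2, 10000.0)   # RaiseRate()

def produce_one(send):
    send()                   # publish a single event (placeholder callable)
    time.sleep(1.0 / rate)   # pace the producer at the current rate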

In real-world scenarios we used to simply limit the output manually to the estimated values, and we set alerts for queue length, time from queue entry to queue exit, etc. Where such limiters were omitted (mostly by mistake), we would later find tasks that were supposed to be handled within a few hours still waiting for their turn three months later.

I'm afraid it's hard to answer "How to slow down the producer?" if we know nothing about it, but some ideas are: the aforementioned rate check, or perhaps a blocking AddMessage method:

AddMessage(message)
    WHILE(getQueueLength() > maxAllowedQueueLength)
        spin(1000); // or sleep or whatever
    mqAdapter.AddMessage(message)

I'd say it all depends on the specifics of the producer application and your architecture in general.

Avellaneda answered 20/1, 2015 at 10:25 Comment(3)
Thanks for your thoughtful reply. Perhaps my question was not clear; I've clarified it accordingly. – Quarter
Unfortunately I won't be able to help with RabbitMQ specifics; I've never used that particular tech. Maybe someone else will have some good insight. I have a feeling, seeing as the actual source of events is multiple users, that your case is fairly complicated. Simple blocking until the queue drops below the limit, for example, would be dangerous, as some clients might be unlucky and remain locked forever (because as soon as the length drops, another client pushes it back above the limit). Possibly just returning an error when the queue is above a certain length could work, but again... – Avellaneda
...but again it has some pitfalls to consider (like some clients never hitting the error, and others hitting it constantly, due to sheer chance). – Avellaneda
