I think you have several issues with your initial understanding. Frankly, I'm a bit surprised to see the following: "both need 5 threads to handle the volume". How did you arrive at that exact number? Do you have any guarantee that 5 threads will be enough?
RabbitMQ is tuned and time-tested, so it is all about proper design and efficient message processing.
Let's review the problem and find a proper solution. BTW, a message queue by itself does not guarantee that you have a good solution. You have to understand what you are doing and also do some additional testing.
As you probably know, many layouts are possible.
I will use layout B as the simplest way to illustrate the 1 producer, N consumers problem, since you are so worried about throughput. BTW, as you might expect, RabbitMQ behaves quite well (source). Pay attention to prefetchCount; I'll address it later.
So the message processing logic is the right place to make sure you have enough throughput. Naturally, you can spawn a new thread every time you need to process a message, but eventually that approach will kill your system. Basically, past a certain point, the more threads you have, the bigger the latency you'll get, because context switching and contention grow while the speedup flattens out (you can check Amdahl's law if you want).
(see Amdahl’s law illustrated)
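To make the Amdahl's law reference concrete, here is a minimal sketch of the formula S(N) = 1 / ((1 - p) + p / N), where p is the fraction of the work that can run in parallel and N is the number of threads. The class name AmdahlDemo and the sample value p = 0.9 are assumptions chosen for illustration, not measurements of any real consumer.

```java
final class AmdahlDemo {
    /** Theoretical speedup for n threads when a fraction p of the work is parallelizable. */
    static double speedup(double p, int n) {
        if (p < 0.0 || p > 1.0 || n < 1) {
            throw new IllegalArgumentException("need 0 <= p <= 1 and n >= 1");
        }
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        double p = 0.9; // assumed parallel fraction, for illustration only
        for (int n : new int[] {1, 2, 5, 10, 100}) {
            System.out.printf("threads=%d speedup=%.2f%n", n, speedup(p, n));
        }
    }
}
```

With p = 0.9 the speedup can never exceed 1 / (1 - p) = 10, no matter how many threads you add, which is why throwing threads at the problem stops paying off quickly.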
Tip #1: Be careful with threads, use ThreadPools (details)
A thread pool can be described as a collection of Runnable objects
(work queue) and a collection of running threads. These threads are
constantly running and are checking the work queue for new work. If
there is new work to be done they execute this Runnable. The Executor
interface itself provides a method, execute(Runnable r), to add a new
Runnable object to the work queue.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class Main {
    private static final int NTHREADS = 10;

    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(NTHREADS);
        for (int i = 0; i < 500; i++) {
            // MyRunnable is your own Runnable implementation (not shown here)
            Runnable worker = new MyRunnable(10000000L + i);
            executor.execute(worker);
        }
        // Stop accepting new tasks; tasks already submitted still run
        executor.shutdown();
        // Wait until all submitted tasks have finished
        executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        System.out.println("Finished all tasks");
    }
}
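One thing to keep in mind: the work queue behind newFixedThreadPool is unbounded, so if messages arrive faster than they are processed it can grow without limit. A minimal sketch of one alternative, assuming a bounded queue with the standard CallerRunsPolicy as back-pressure. The class name BoundedPool and the sizes (4 threads, 16 queue slots) are illustrative assumptions, not tuned values.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedPool {
    /** Fixed-size pool with a bounded queue; when it is full, the submitting thread runs the task itself. */
    static ExecutorService create(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Submits the given number of trivial tasks and returns how many completed. */
    static int runDemo(int tasks) {
        ExecutorService pool = create(4, 16); // sizes are assumptions for the demo
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(done::incrementAndGet);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runDemo(500));
    }
}
```

Because the caller runs the overflow tasks itself, a slow pool automatically slows down whoever is submitting work instead of piling tasks up in memory.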
Tip #2: Be careful with message processing overhead
I would say this is an obvious optimization technique. You will most likely be sending small, easy-to-process messages. The whole approach is about small messages being continuously sent and processed. Big messages will eventually play a bad joke on you, so it is better to avoid them.
So it is better to send tiny pieces of information, but what about processing? There is an overhead every time you submit a job. Batch processing can be very helpful when the incoming message rate is high.
For example, let's say we have simple message processing logic and we do not want to pay thread-specific overhead every time a message is processed. To optimize that, a very simple CompositeRunnable can be introduced:
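import java.util.LinkedList;
import java.util.Queue;

class CompositeRunnable implements Runnable {
    protected Queue<Runnable> queue = new LinkedList<>();

    // Collect tasks first; add() is not thread-safe, so fill the batch before submitting it
    public void add(Runnable a) {
        queue.add(a);
    }

    @Override
    public void run() {
        // Run the whole batch inside a single submitted job
        for (Runnable r : queue) {
            r.run();
        }
    }
}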
Or do the same in a slightly different way, by collecting messages to be processed:
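import java.util.LinkedList;
import java.util.Queue;

class CompositeMessageWorker<T> implements Runnable {
    protected Queue<T> queue = new LinkedList<>();

    // Collect messages first; add() is not thread-safe, so fill the batch before submitting it
    public void add(T message) {
        queue.add(message);
    }

    @Override
    public void run() {
        for (T message : queue) {
            // process a message
        }
    }
}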
This way you can process messages more efficiently.
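A minimal usage sketch of the batching idea, assuming integer messages and a hypothetical batch size of 100. BatchWorker follows the same pattern as CompositeMessageWorker but takes the per-message logic as a Consumer; both class names and the helper submitInBatches are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Processes a whole batch of messages inside one submitted task. */
final class BatchWorker<T> implements Runnable {
    private final List<T> batch;
    private final Consumer<T> handler;

    BatchWorker(List<T> batch, Consumer<T> handler) {
        this.batch = batch;
        this.handler = handler;
    }

    @Override
    public void run() {
        for (T message : batch) {
            handler.accept(message);
        }
    }
}

final class BatchingDemo {
    /** Splits messages into batches, submits one task per batch, and returns the number of tasks. */
    static <T> int submitInBatches(List<T> messages, int batchSize,
                                   ExecutorService executor, Consumer<T> handler) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        int tasks = 0;
        for (int from = 0; from < messages.size(); from += batchSize) {
            int to = Math.min(from + batchSize, messages.size());
            // Copy the slice so each task owns its own batch
            List<T> batch = new ArrayList<>(messages.subList(from, to));
            executor.execute(new BatchWorker<>(batch, handler));
            tasks++;
        }
        return tasks;
    }

    /** Runs the demo and returns {tasksSubmitted, messagesProcessed}. */
    static int[] runDemo(int messageCount, int batchSize) {
        List<Integer> messages = new ArrayList<>();
        for (int i = 0; i < messageCount; i++) {
            messages.add(i);
        }
        AtomicInteger processed = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(4);
        int tasks = submitInBatches(messages, batchSize, executor, m -> processed.incrementAndGet());
        executor.shutdown();
        try {
            executor.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {tasks, processed.get()};
    }

    public static void main(String[] args) {
        int[] result = runDemo(250, 100);
        System.out.println(result[0] + " tasks, " + result[1] + " messages");
    }
}
```

Here 250 messages cost only 3 submissions instead of 250. The right batch size depends on how expensive a single message is, so treat it as something to measure rather than guess.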
Tip #3: Optimize message processing
Even though you now know how to process messages in parallel (Tip #1) and reduce processing overhead (Tip #2), you still have to do everything fast. Redundant processing steps, heavy loops and so on can affect performance a lot. Please see this interesting case study:
Improving Message Queue Throughput tenfold by choosing the right XML Parser
Tip #4: Connection and Channel Management
- Starting a new channel on an existing connection involves one network
round trip - starting a new connection takes several.
- Each connection uses a file descriptor on the server. Channels don't.
- Publishing a large message on one channel will block a connection
while it goes out. Other than that, the multiplexing is fairly transparent.
- Connections which are publishing can get blocked if the server is
overloaded - it's a good idea to separate publishing and consuming
connections
- Be prepared to handle message bursts
(source)
Please note that all the tips work well together. Feel free to let me know if you need additional details.
Complete consumer example (source)
Please note the following:
- channel.basicQos(prefetch) - As you saw earlier, prefetchCount might be very useful:
This command allows a consumer to choose a prefetch window that
specifies the amount of unacknowledged messages it is prepared to
receive. By setting the prefetch count to a non-zero value, the broker
will not deliver any messages to the consumer that would breach that
limit. To move the window forwards, the consumer has to acknowledge
the receipt of a message (or a group of messages).
- ExecutorService threadExecutor - you can specify properly configured executor service.
Example:
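import java.io.IOException;
import java.util.concurrent.ExecutorService;

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.DefaultConsumer;
import com.rabbitmq.client.Envelope;

static class Worker extends DefaultConsumer {
    String name;
    Channel channel;
    String queue;
    int processed;
    ExecutorService executorService;

    public Worker(int prefetch, ExecutorService threadExecutor,
                  Channel c, String q) throws Exception {
        super(c);
        channel = c;
        queue = q;
        // Limit the number of unacknowledged messages delivered to this consumer
        channel.basicQos(prefetch);
        // autoAck = false: messages have to be acknowledged explicitly
        channel.basicConsume(queue, false, this);
        executorService = threadExecutor;
    }

    @Override
    public void handleDelivery(String consumerTag,
                               Envelope envelope,
                               AMQP.BasicProperties properties,
                               byte[] body) throws IOException {
        // Hand the message over to the pool; VariableLengthTask comes from the linked source
        Runnable task = new VariableLengthTask(this,
                                               envelope.getDeliveryTag(),
                                               channel);
        executorService.submit(task);
    }
}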
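The prefetch window itself can be illustrated without a broker. The sketch below is only a simulation and does not use the RabbitMQ client at all: a Semaphore stands in for the window, acquire() plays the role of a delivery and release() the role of an acknowledgement. The class name PrefetchWindowSim and all parameter values are assumptions made for the illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PrefetchWindowSim {
    /** Delivers simulated messages and returns the highest number that were unacknowledged at once. */
    static int maxUnacked(int prefetch, int messages, int workers) {
        Semaphore window = new Semaphore(prefetch);   // stand-in for the prefetch window
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            for (int i = 0; i < messages; i++) {
                window.acquire();                     // "broker" waits for room in the window
                int now = inFlight.incrementAndGet(); // delivered, not yet acknowledged
                maxSeen.accumulateAndGet(now, Math::max);
                pool.execute(() -> {
                    // ... process the message here ...
                    inFlight.decrementAndGet();
                    window.release();                 // "ack" moves the window forward
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        System.out.println("max unacknowledged: " + maxUnacked(5, 1000, 8));
    }
}
```

However many worker threads you configure, the number of messages in flight never exceeds the prefetch value, so prefetch and pool size are best chosen together.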
You can also check the following: