MentaQueue provides a single-producer, single-consumer queue based on the same ideas (http://mentaqueue.soliveirajr.com/Page.mtw). You could examine the code, though I've never used it myself.
Disruptor out of the box provides two techniques here - I won't go into code yet but can do that if you need.
It gives you a way to sequence event handlers, and you can configure it so that the handlers run in parallel and each one processes every request; that is, each request is handled by every handler.
It also has a worker pool implementation, where a pool of worker threads shares the requests; that is, each request is handled exactly once, by one thread from the pool.
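To make the difference between the two concrete, here is a minimal sketch using plain java.util.concurrent. This is not Disruptor code and says nothing about how the Disruptor implements either technique; the class and method names (DispatchSketch, broadcast, workerPool) are made up for illustration, and the "work" is just a counter increment.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: plain JDK threads, not the Disruptor API.
public class DispatchSketch {

    // Technique 1: every handler processes every request, handlers run in parallel.
    // Returns the total number of handler invocations.
    public static int broadcast(List<String> requests, int handlerCount) {
        AtomicInteger handled = new AtomicInteger();
        ExecutorService handlers = Executors.newFixedThreadPool(handlerCount);
        for (int h = 0; h < handlerCount; h++) {
            // Each handler walks the full request list, in order.
            handlers.execute(() -> {
                for (String request : requests) {
                    handled.incrementAndGet(); // stand-in for real work on 'request'
                }
            });
        }
        awaitQuietly(handlers);
        return handled.get();
    }

    // Technique 2: each request is handled exactly once, by whichever worker is free.
    // Returns the total number of handler invocations.
    public static int workerPool(List<String> requests, int workerCount) {
        AtomicInteger handled = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(workerCount);
        for (String request : requests) {
            workers.execute(() -> handled.incrementAndGet()); // one task per request
        }
        awaitQuietly(workers);
        return handled.get();
    }

    // Stops accepting tasks and waits for the submitted ones to finish.
    private static void awaitQuietly(ExecutorService pool) {
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
    }

    public static void main(String[] args) {
        List<String> requests = Arrays.asList("a", "b", "c", "d");
        System.out.println("broadcast:   " + broadcast(requests, 3));  // 4 requests x 3 handlers = 12
        System.out.println("worker pool: " + workerPool(requests, 3)); // 4 requests, each once = 4
    }
}
```

With 4 requests and 3 threads, the first variant does 12 units of work and the second does 4, which is the distinction that matters when you pick between the two.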
If you've identified that the queueing is taking a long time, or that you are losing significant time to contention (locks / synchronisation), then I would definitely look at the Disruptor. You'll get the most benefit by looking at whether tweaks to your architecture might lead to a clean use of the Disruptor.
Yes, reducing transaction latency should help throughput, so it could make sense, but it depends on what is holding up your throughput. This is a very general comment, but you should first identify which area of your application is holding throughput back.
Indicators that would lead me to use the Disruptor would be - lots of short-running tasks handled in a similar way, contention on memory, a sequencing requirement, streaming or heavy IO (that could benefit from batching).
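On the batching point, the idea is that a consumer which has fallen behind takes everything that is waiting and pays the IO cost once per batch instead of once per request. A rough sketch of that idea with a plain BlockingQueue follows. Again this is not the Disruptor API; BatchingSketch and drainInBatches are hypothetical names, and a StringBuilder stands in for the real IO target.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration only: batching with a JDK queue, not the Disruptor API.
public class BatchingSketch {

    // Drains whatever is queued in batches of at most maxBatch requests and
    // performs one "write" per batch. Returns the number of writes performed.
    public static int drainInBatches(BlockingQueue<String> queue, int maxBatch, StringBuilder sink) {
        List<String> batch = new ArrayList<>(maxBatch);
        int writes = 0;
        // drainTo does not block; it returns 0 once the queue is empty.
        while (queue.drainTo(batch, maxBatch) > 0) {
            sink.append(String.join(",", batch)).append('\n'); // stand-in for a single IO call
            writes++;
            batch.clear();
        }
        return writes;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 10; i++) {
            queue.add("r" + i);
        }
        StringBuilder sink = new StringBuilder();
        int writes = drainInBatches(queue, 4, sink);
        System.out.println(writes + " writes for 10 requests"); // batches of 4, 4 and 2
        System.out.print(sink);
    }
}
```

Ten queued requests with a batch limit of 4 end up as 3 writes rather than 10. Whether that helps depends on how expensive each individual write is for you.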