Why is the Apache Event MPM Performing Poorly?

The Event MPM is not exactly the same design as Nginx, but it was clearly designed to make keepalives more tenable and to make serving static files faster. My understanding is that the Event MPM is a bit of a misnomer because:

  1. although the connection is passed to kqueue/epoll,
  2. certain very important modules, such as mod_gzip and mod_ssl, will block/consume a thread until the response is done,
  3. and that is an issue for large files, but probably not for PHP-generated HTML documents, etc.

Unfortunately, Apache keeps losing market share, and most benchmarks are damning for the Event MPM. Are the benchmarks flawed, or does the Event MPM really do so poorly against Nginx? Even with these limitations, under normal (non-malicious) traffic and smaller files, it should be somewhat competitive with Nginx. For example, it should be competitive serving PHP-generated documents via php-fpm on slow connections, because the document will be buffered (even while being SSL'd and gzip'd) and sent asynchronously. Both SSL and non-SSL connections, with or without compression, should not work meaningfully differently than they would in Nginx on such a workload.
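To make the comparison concrete, the setup I have in mind looks roughly like the following. This is only a sketch; the socket path and module filenames are assumptions, not something I've benchmarked, and the SetHandler-to-UNIX-socket syntax needs httpd 2.4.10 or later. The point is that the event MPM hands PHP off to php-fpm via mod_proxy_fcgi, so a thread is only tied up while PHP is actually generating the response:

    # Sketch only: event MPM plus the FastCGI proxy modules
    LoadModule mpm_event_module   modules/mod_mpm_event.so
    LoadModule proxy_module       modules/mod_proxy.so
    LoadModule proxy_fcgi_module  modules/mod_proxy_fcgi.so

    # Hand *.php to php-fpm over a local UNIX socket (assumed path)
    <FilesMatch "\.php$">
        SetHandler "proxy:unix:/run/php-fpm.sock|fcgi://localhost/"
    </FilesMatch>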

So why does it not shine in various benchmarks? What's wrong with it? Or what's wrong with the benchmarks? Is there a major site using it that could serve as an appeal to authority that it can perform?

Wigwam answered 9/1, 2015 at 8:7 Comment(4)
This would be a much better question if you had cited the benchmarks you refer to. If you simply want a fast installation, then pre-fork Apache + mod_php will consistently outperform Nginx on low to medium loads; there is a massive difference under heavy load. But if you want capacity as well as performance, then you should be looking at a different architecture: ATS/nginx/varnish in front of your webservers.Eucalyptus
@Eucalyptus Probably true. I just have yet to find one (except the one published by Apache) that looks good. I also think you're overgeneralizing. ATS or Varnish will do very little (or be harmful) on dynamic content that cannot be cached.Wigwam
Au contraire. Running an event-based server in front of a pre-fork webserver will provide protection against slowloris attacks, and offload the static content serving, saving memory and increasing capacity. The additional latency on dynamic content should be of the order of a couple of milliseconds - hardly earth-shattering. Indeed, if you can move the reverse proxy closer to the clients, you should see a significant improvement in performance.Eucalyptus
@Eucalyptus I was only referring to dynamic content. See "... on dynamic content that cannot be cached." It may do something about slowloris and slow reads over mod_reqtimeout or mod_security, but at the expense of another copy over a pipe. I guess you could say that Nginx + Apache+mod_php is technically not so different from Nginx + php-fpm. But why would anyone use php-fpm, then? Are php-fpm and FastCGI to localhost so much faster than HTTP over localhost? In any case, as another answer states, the Event MPM is roughly equivalent to how you describe Nginx -- an event-based server in front.Wigwam

It is slower than nginx because Apache with the event MPM is (very) roughly equivalent to an event-driven HTTP proxy (nginx, varnish, haproxy) in front of Apache with the worker MPM. Event is worker, but rather than handing each new connection to a thread for its lifetime, the event MPM's threads hand the connection to a secondary thread which pushes it onto a queue or closes it if keep-alive is off or has expired.

The real benefit of event over worker is the resource usage. If you need to sustain 1,000 concurrent connections, the worker MPM needs 1,000 threads, while the event MPM may get by with 100 active threads and 900 idle connections managed in the event queue. The event MPM will use a fraction of the resources of the worker MPM in that hypothetical, but the downside is still there: each of those requests is handled by a separate thread which must be scheduled by the kernel and as such will incur the cost of switching context.
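As a rough illustration of that sizing (the numbers simply mirror the hypothetical above; real tuning depends on the workload), an mpm_event configuration along these lines caps the server at 100 worker threads while allowing on the order of 1,000 concurrent connections:

    # Sketch: 4 processes x 25 threads = 100 worker threads total
    <IfModule mpm_event_module>
        ServerLimit               4
        StartServers              2
        ThreadsPerChild          25
        MaxRequestWorkers       100
        MinSpareThreads          25
        MaxSpareThreads          75
        # Connection ceiling is roughly
        # (AsyncRequestWorkerFactor + 1) * MaxRequestWorkers = ~1,000;
        # idle keep-alive connections wait in the event queue, not on threads
        AsyncRequestWorkerFactor  9
    </IfModule>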

On the other hand, we have nginx, which uses the event model itself as its scheduler. Nginx simply processes as much work on each connection as it can before moving on to the next one. No extra context switching required.

The one use case where the event MPM really shines is a setup where you have a heavy application running in Apache and, to conserve the threads that sit idle during keep-alive, you would deploy a proxy (such as nginx) in front of Apache. If your front end served no other purpose (e.g., serving static content, proxying to other servers, etc.), the event MPM handles that use case beautifully and eliminates the need for a proxy, as in the sketch below.
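A minimal sketch of that consolidated deployment (the backend address, paths, and timeout are placeholders, and mod_proxy/mod_proxy_http are assumed loaded): the same event-MPM httpd parks keep-alive connections in its event queue, serves static files itself, and reverse-proxies only the heavy application:

    # Idle keep-alive connections are held by the event queue, not a thread
    KeepAlive          On
    KeepAliveTimeout   15

    # Static content served directly
    DocumentRoot "/var/www/static"

    # Heavy application proxied to a backend app server (placeholder address)
    ProxyPass        "/app/" "http://127.0.0.1:8080/app/"
    ProxyPassReverse "/app/" "http://127.0.0.1:8080/app/"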

Tintoretto answered 14/1, 2015 at 21:40 Comment(7)
This is exactly why I'd expect it to do well if you use Apache event MPM + PHP-fpm. It shouldn't really do much or any worse than Nginx + PHP-fpm. But every benchmark says it doesn't. Time to do my own benchmark. They really could have applied this "keep alive optimization" to every MPM by the way -- including prefork. Nothing would stop them from using a master process to track idle kept-alive connections and only handing them off to a process in the pool when active. They just only did it to worker.Wigwam
The comparison I was making was Nginx + Apache-worker + PHP-fpm ~~ Apache-event + PHP-fpm. Nginx + PHP-fpm will always perform significantly better than Apache-event + PHP-fpm because each FastCGI connection under Apache has to be handled by a separate thread. If php were thread-safe you could get much closer to Nginx's performance by using Apache-event+mod_php, but alas, it is not...Tintoretto
Maybe I'm misunderstanding, but the thread's lifetime will be very brief because most connections to php-fpm will be brief. The response will be small (an HTML document is 10-100k typically), get buffered, and sent in an event-based fashion. I guess Nginx can do it entirely without a thread per connection with a state machine talking to php-fpm. It would probably also buffer it because otherwise it would be recreating the problem that prefork has with slow connections and just push it back to a php-fpm process instead of an httpd process. I wouldn't think the difference would be so huge.Wigwam
As I understand it, if responses were buffered and delivered through the event queue, modules like mod_gzip would not tie up the thread for the duration of the response. The thread running the event queue exists to hold onto connections between HTTP requests. So in your scenario, when the next request comes in, the event loop hands off to a thread to process it, that thread sends a request to php-fcgi, php-fcgi processes it and returns, the Apache thread sends the response and hands the connection off to the event loop. Nginx can skip the handoffs and talk to fcgi in its event loop.Tintoretto
There are still CPU-bound things to do in Nginx, though -- like SSL and compression. Is that why it's typically configured to have NUM_CPUs+1 or so processes, not one like Lighttpd? I guess Event may be an improvement, but it really looks like Apache would wither if anyone added .htaccess support to Nginx and made it play nice with cPanel.Wigwam
Yep, but SSL and compression are implemented in a manner that's compatible with the event loop, being broken into manageable chunks. The extra processes allow you to take advantage of every CPU on your server; I'm not really sure why Lighttpd doesn't implement that. But yeah, Apache is a fine workhorse for multiuser environments. I'm not sure if nginx will ever gain that sort of functionality.Tintoretto
That's what I said in my question -- but an Apache contributor ("covener") said it's not the biggest problem in the other answer -- I brought up the SSL and gzip filters specifically because I figured Nginx would do it a smarter way, block-by-block, whereas Apache 2.x's elaborate filter system blocks by design. Litespeed did all this in one package as a POC. But it's not free. I really wish they'd just totally open-source Litespeed instead of the whole OpenLiteSpeed thing without .htaccess -- deliberately crippling it. That move could change the webserver landscape overnight. It would be a coup.Wigwam

To me, the dominating operative differences are that in event:

  • handlers (plugins responsible for generating the response) are synchronous -- if they are performing either computation or I/O they will tie up a thread
  • the core must use cross-thread locks to protect key data structures because it is multi-threaded to support so many of these synchronous requests

That's why, at very high volumes, servers like nginx (or Apache Traffic Server, or any modern commercial/high-performance proxy) usually come out ahead.

IMO, the bullets in your question are a bit off the mark. SSL and deflate are not really contributing much to the differences here, as they are both filters that don't really contribute to scalability problems or even tie httpd to its traditional API guarantees about the lifecycle of a request or connection. Filters like these (vs. handlers, or the core filter responsible for the low-level I/O) are probably the least of the things tied to the processing model.

But I also don't think it performs so poorly by comparison for all but the most extreme workloads or extremely constrained systems. Most of the benchmarks I've seen are of extremely poor quality, for one reason or another.

I think people largely want what they call a webserver today to be a proxy to a more sophisticated application server (Java EE, PHP, etc.), and a server designed to move I/O around most efficiently, without API baggage, is going to have the edge.

Cobaltite answered 13/1, 2015 at 1:46 Comment(2)
I should add that I am as biased as they come, as an Apache httpd contributor.Cobaltite
Thx! I admit I was guessing from a quick audit of the code & from the documentation. I was right about the handlers but wrong about specifics, and I missed the locking. The reason I brought up those modules is that I could see it being a big problem on large static files. Google wants us to move to everything-SSL in 2015. I can even see SSL moving to the kernel, because we can no longer just map the file & send. But at least in Nginx it won't rob us of threads full-time. It can use ssl/gzip+epoll on a few threads block-by-block, perhaps with some affinity. Am I wrong? Gzip raises similar issues.Wigwam
