We've all read the benchmarks and know the facts - event-based asynchronous network servers are faster than their threaded counterparts. Think lighttpd or Zeus vs. Apache or IIS. Why is that?
I think event based vs thread based is not the question - it is a nonblocking Multiplexed I/O, Selectable sockets, solution vs thread pool solution.
In the first case you are handling all input that comes in regardless of what is using it- so there is no blocking on the reads- a single 'listener'. The single listener thread passes data to what can be worker threads of different types- rather than one for each connection. Again, no blocking on writing any of the data- so the data handler can just run with it separately. Because this solution is mostly IO reads/writes it doesn't occupy much CPU time- thus your application can take that to do whatever it wants.
In a thread pool solution you have individual threads handling each connection, so they have to share time to context switch in and out- each one 'listening'. In this solution the CPU + IO ops are in the same thread- which gets a time slice- so you end up waiting on IO ops to complete per thread (blocking) which could traditionally be done without using CPU time.
Google for non-blocking IO for more detail- and you can prob find some comparisons vs. thread pools too.
(if anyone can clarify these points, feel free)
Event-driven applications are not inherently faster.
From Why Events Are a Bad Idea (for High-Concurrency Servers):
We examine the claimed strengths of events over threads and show that the
weaknesses of threads are artifacts of specific threading implementations
and not inherent to the threading paradigm. As evidence, we present a
user-level thread package that scales to 100,000 threads and achieves
excellent performance in a web server.
This was in 2003. Surely the state of threading on modern OSs has improved since then.
Writing the core of an event-based server means re-inventing cooperative multitasking (Windows 3.1 style) in your code, most likely on an OS that already supports proper pre-emptive multitasking, and without the benefit of transparent context switching. This means that you have to manage state on the heap that would normally be implied by the instruction pointer or stored in a stack variable. (If your language has them, closures ease this pain significantly. Trying to do this in C is a lot less fun.)
This also means you gain all of the caveats cooperative multitasking implies. If one of your event handlers takes a while to run for any reason, it stalls that event thread. Totally unrelated requests lag. Even lengthy CPU-invensive operations have to be sent somewhere else to avoid this. When you're talking about the core of a high-concurrency server, 'lengthy operation' is a relative term, on the order of microseconds for a server expected to handle 100,000 requests per second. I hope the virtual memory system never has to pull pages from disk for you!
Getting good performance from an event-based architecture can be tricky, especially when you consider latency and not just throughput. (Of course, there are plenty of mistakes you can make with threads as well. Concurrency is still hard.)
A couple important questions for the author of a new server application:
- How do threads perform on the platforms you intend to support today? Are they going to be your bottleneck?
- If you're still stuck with a bad thread implementation: why is nobody fixing this?
It really depends what you're doing; event-based programming is certainly tricky for nontrivial applications. Being a web server is really a very trivial well understood problem and both event-driven and threaded models work pretty well on modern OSs.
Correctly developing more complex server applications in an event model is generally pretty tricky - threaded applications are much easier to write. This may be the deciding factor rather than performance.
It isn't about the threads really. It is about the way the threads are used to service requests. For something like lighttpd you have a single thread that services multiple connections via events. For older versions of apache you had a process per connection and the process woke up on incoming data so you ended up with a very large number when there were lots of requests. Now however with MPM apache is event based as well see apache MPM event.
© 2022 - 2024 — McMap. All rights reserved.