nginx: It's multithreaded but uses multiple processes?

I'm trying to understand what makes Nginx so fast, and I have a few questions.

As I understand it, Apache either spawns a new process to serve each request or spawns a new thread to serve each request. Since each new thread shares the virtual address space, memory usage keeps climbing when a number of concurrent requests come in.

Nginx solves this by having just one listening (master) process with a single execution thread, plus 2 or 3 (the number is configurable) worker processes. This master process/thread runs an event loop, effectively waiting for any incoming request. When a request comes in, it gives that request to one of the worker processes.

Please correct me if my understanding above is not correct.

If the above is correct, then I have a few questions:

  1. Isn't the worker process going to spawn multiple threads and run into the same problem as Apache?

  2. Or is nginx fast because its event-based architecture uses non-blocking I/O underneath it all? Maybe the worker process spawns threads which do only non-blocking I/O; is that it?

  3. What exactly is an "event-based architecture"? Can someone really simplify it for someone like me to understand? Does it just pertain to non-blocking I/O, or to something else as well?

I got a reference to the C10K problem, and I am trying to go through it, but I don't think it's about event-based architecture; it seems to be more about non-blocking I/O.

Thread answered 21/1, 2011 at 22:59 Comment(3)
Nonblocking IO requires an event-based architecture.Lubricous
FYI - Just in case you are interested in digging deeper - I've blogged the answer along with other materials + videos over here : planetunknown.blogspot.com/2011/02/…Thread
Apache is not slow because of how many threads it creates, but because of the context switches between them, which consume far more CPU time than executing the instructions does.Brachiate

It's not very complicated from a conceptual point of view. I'll try to be clear, but I have to do some simplification.

Event-based servers (like nginx and lighttpd) use a wrapper around an event monitoring system. For example, lighttpd uses libevent to abstract the more advanced high-speed event monitoring systems (see also libev).

The server keeps track of all the non-blocking connections it has (both writing and reading) using a simple state machine for each connection. The event monitoring system notifies the server process when there is new data available or when it can write more data. It's like select() on steroids, if you know socket programming. The server process then simply sends the requested file using some advanced function like sendfile() where possible, or hands the request to a CGI process over a socket used for communication (this socket will be monitored by the event monitoring system like the other network connections).
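
To make the loop concrete, here is a minimal C sketch built on plain select() (the call the comparison above is named after), with the per-connection work reduced to an echo so the multiplexing stays visible. It illustrates the pattern only, not nginx's or lighttpd's actual code; the port, buffer sizes, and omitted error handling are all arbitrary simplifications.

    /* Minimal single-process event loop: one select() call watches every
       connection, so no call ever blocks the process on a single client.
       Error handling is mostly omitted to keep the sketch short.
       Build: cc -o evloop evloop.c   Try it: nc localhost 8080 */
    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void) {
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        int on = 1;
        setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);

        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);              /* arbitrary port */
        bind(listener, (struct sockaddr *)&addr, sizeof addr);
        listen(listener, 64);
        fcntl(listener, F_SETFL, O_NONBLOCK);

        fd_set watched;                           /* which fds we are tracking */
        FD_ZERO(&watched);
        FD_SET(listener, &watched);
        int maxfd = listener;

        for (;;) {
            fd_set readable = watched;            /* select() overwrites its argument */
            select(maxfd + 1, &readable, NULL, NULL, NULL);  /* sleep until an event */

            for (int fd = 0; fd <= maxfd; fd++) {
                if (!FD_ISSET(fd, &readable))
                    continue;
                if (fd == listener) {             /* event: a new connection arrived */
                    int client = accept(listener, NULL, NULL);
                    if (client < 0) continue;
                    fcntl(client, F_SETFL, O_NONBLOCK);
                    FD_SET(client, &watched);
                    if (client > maxfd) maxfd = client;
                } else {                          /* event: a client became readable */
                    char buf[4096];
                    ssize_t n = read(fd, buf, sizeof buf);
                    if (n <= 0) {                 /* closed or error: forget the fd */
                        close(fd);
                        FD_CLR(fd, &watched);
                    } else {
                        write(fd, buf, n);        /* echo back; a real server would
                                                     parse the request / sendfile() here */
                    }
                }
            }
        }
    }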

This link has a lot of great information about the internals of nginx, just in case. I hope it helps.

Haematosis answered 21/1, 2011 at 23:41 Comment(3)
Thanks Ass3mbler. I think that helped in understanding it.Thread
@PlanetUnknow I'm really glad it helped. If you need more info, just ask me, I've worked many times to modify the source of lighttpd. If it's ok could you accept my answer please? Thank you!Haematosis
Thanks Ass3mbler. I think that helped in understanding it. So the master process is listening to incoming traffic and the worker processes are running event loops (registering events and responding when one occurs). In summary, is that correct?Thread

Apache uses multiple threads to provide each request with its own thread of execution. This is necessary to avoid blocking when using synchronous I/O.

Nginx uses only asynchronous I/O, which makes blocking a non-issue. The only reason nginx uses multiple processes is to make full use of multi-core, multi-CPU and hyper-threaded systems. Even with SMP support, the kernel cannot schedule a single thread of execution over multiple CPUs. It requires at least one process or thread per logical CPU.

So the difference is, nginx requires only enough worker processes to get the full benefit of SMP, whereas Apache's architecture necessitates creating a new thread (each with its own stack of around 8 MB) per request. Obviously, at high concurrency, Apache will use much more memory and suffer greater overhead from maintaining large numbers of threads.
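
As a hedged illustration of the "one process per logical CPU" point, here is a small C sketch of the pre-fork pattern: the parent opens the listening socket once, asks the kernel how many logical CPUs are online (sysconf(_SC_NPROCESSORS_ONLN), available on Linux and most Unix-likes), and forks one worker per CPU, each of which would then run its own event loop over the inherited socket. The worker body is stubbed out; this shows the general pattern, not nginx's actual startup code.

    /* Pre-fork sketch: one worker per logical CPU, all inheriting the same
       listening socket from the parent. Each worker would run an event loop
       (epoll/kqueue/select); here that loop is stubbed out. Illustrative only. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    static void worker_loop(int listener) {
        /* A real worker registers `listener` with its event backend and then
           multiplexes thousands of connections; pause() stands in for that. */
        printf("worker %d ready, sharing listening fd %d\n", (int)getpid(), listener);
        pause();
    }

    int main(void) {
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);              /* arbitrary port */
        bind(listener, (struct sockaddr *)&addr, sizeof addr);
        listen(listener, 128);

        long ncpu = sysconf(_SC_NPROCESSORS_ONLN);  /* logical CPUs online */
        if (ncpu < 1) ncpu = 1;

        for (long i = 0; i < ncpu; i++) {
            if (fork() == 0) {                    /* child: becomes a worker */
                worker_loop(listener);
                _exit(0);
            }
        }
        /* master: a real server would also watch for dead workers and respawn them */
        while (wait(NULL) > 0)
            ;
        return 0;
    }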

Duotone answered 25/4, 2012 at 11:10 Comment(2)
@spockwang Indeed, they use one worker per core. "The NGINX configuration recommended in most cases – running one worker process per CPU core – makes the most efficient use of hardware resources," from the NGINX docs.Kajdan
Apparently they have since introduced multi-threaded reads in an effort to reduce "blocking" on the main threads, but the answer is right I think. nginx.com/blog/thread-pools-boost-performance-9xAnaya

Apache doesn't spawn a new thread for each request. It maintains a cache of threads or a group of pre-forked processes which it farms out requests to. The number of concurrent requests is limited by the number of children/threads, yes, but Apache is not spawning a new thread/child for every request, which would be ridiculously slow (even with threads, creation and teardown for every request would be way too slow).
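
For contrast, here is a rough C sketch of the pre-forked model described above, under the assumption of plain blocking I/O: a fixed pool of children is created up front, and each child blocks in accept() and serves exactly one connection at a time. It only demonstrates the "one connection per worker at a time" property, not Apache's actual MPM code; the pool size and port are arbitrary.

    /* Pre-forked blocking pool: a fixed number of children created up front,
       each serving one connection at a time with blocking reads. Concurrency
       is capped by the pool size. Sketch only, not Apache's MPM. */
    #include <unistd.h>
    #include <sys/wait.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    #define POOL_SIZE 4                           /* size of the fixed worker pool */

    static void child_loop(int listener) {
        for (;;) {
            int client = accept(listener, NULL, NULL);       /* blocks until a client */
            if (client < 0) continue;
            char buf[4096];
            ssize_t n;
            while ((n = read(client, buf, sizeof buf)) > 0)  /* blocks on this client */
                write(client, buf, n);            /* echo; stand-in for request handling */
            close(client);                        /* only now can this child serve another */
        }
    }

    int main(void) {
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8081);              /* arbitrary port */
        bind(listener, (struct sockaddr *)&addr, sizeof addr);
        listen(listener, 128);

        for (int i = 0; i < POOL_SIZE; i++)
            if (fork() == 0) {
                child_loop(listener);
                _exit(0);
            }
        while (wait(NULL) > 0)                    /* parent just waits on the pool */
            ;
        return 0;
    }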

Nginx uses a master-worker model. The master process deals with loading the configuration and creating/destroying/maintaining workers. Like Apache, it starts out with a number of pre-forked processes already running, each of which is a worker (plus one "master" process). Each worker process shares a set of listening sockets. Each worker accepts connections and processes them, but each worker can handle thousands of connections at once, unlike Apache, which can only handle one connection per worker.

The way nginx achieves this is through "multiplexing". It doesn't use libevent; it uses a custom event loop that was designed specifically for nginx and grew alongside the development of the nginx software. Multiplexing works by using a loop that steps through the program chunk by chunk, operating on one piece of data/new connection/whatever per connection/object on each loop iteration. It is all based on backends like epoll(), kqueue(), and select(), which you should read up on.
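
Since epoll() is named as one of those backends, here is a hedged C sketch of that kind of multiplexing loop on epoll (Linux-only; kqueue is the BSD/macOS analogue). It is level-triggered and uses an echo as a stand-in for real request processing; nginx's own event module is far more elaborate, and the port and buffer sizes here are arbitrary.

    /* The multiplexing loop on the epoll backend (Linux). One epoll_wait()
       call delivers all ready fds; nothing blocks on any single connection.
       Illustrative sketch only. */
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void) {
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8082);              /* arbitrary port */
        bind(listener, (struct sockaddr *)&addr, sizeof addr);
        listen(listener, 128);
        fcntl(listener, F_SETFL, O_NONBLOCK);

        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listener };
        epoll_ctl(ep, EPOLL_CTL_ADD, listener, &ev);

        struct epoll_event ready[64];
        for (;;) {
            int n = epoll_wait(ep, ready, 64, -1);           /* sleep until events arrive */
            for (int i = 0; i < n; i++) {
                int fd = ready[i].data.fd;
                if (fd == listener) {                        /* listener readable = new client */
                    int client = accept(listener, NULL, NULL);
                    if (client < 0) continue;
                    fcntl(client, F_SETFL, O_NONBLOCK);
                    struct epoll_event cev = { .events = EPOLLIN, .data.fd = client };
                    epoll_ctl(ep, EPOLL_CTL_ADD, client, &cev);
                } else {                                     /* client readable = data waiting */
                    char buf[4096];
                    ssize_t r = read(fd, buf, sizeof buf);
                    if (r <= 0) {                            /* peer closed or error */
                        epoll_ctl(ep, EPOLL_CTL_DEL, fd, NULL);
                        close(fd);
                    } else {
                        write(fd, buf, r);                   /* echo; stand-in for real work */
                    }
                }
            }
        }
    }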

Estren answered 4/7, 2014 at 7:0 Comment(0)
