What exactly is a pre-fork web server model?

I want to know what exactly it means when a web server describes itself as a pre-fork web server. Examples I have come across are Unicorn for Ruby and Gunicorn for Python.

More specifically, these are the questions:

  • What problem does this model solve?
  • What happens when a pre-fork web server is initially started?
  • How does it handle requests?

Also, a more specific question for unicorn/gunicorn:

Let's say that I have a webapp that I want to run with (g)unicorn. On startup, the webapp does some initialization work (e.g. it fills in additional database entries). If I configure (g)unicorn with multiple workers, will that initialization run multiple times?

Buprestid answered 14/9, 2014 at 14:31 Comment(0)

Pre-forking means that a master process forks a set of worker processes, and those workers handle the incoming requests. Each forked worker is a completely separate *nix process.

Update, as per the comments below: the "pre" in pre-fork means that these processes are forked before a request comes in. The number of processes can usually be increased or decreased as the load goes up and down.
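To make the idea concrete, here is a minimal sketch of a pre-fork server in Python. It is not how Gunicorn or Unicorn are implemented; the function names (`start_prefork_server`, `worker_loop`, `fetch`, `stop_workers`) and the fixed worker count are assumptions made for illustration. The master opens one listening socket, forks the workers before any request arrives, and each worker accepts connections on the inherited socket.

```python
import os
import signal
import socket


def handle(conn):
    """Answer one connection with the PID of the worker that served it."""
    with conn:
        conn.recv(1024)  # read (and ignore) the request
        body = f"served by worker {os.getpid()}\n".encode()
        head = b"HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)
        conn.sendall(head + body)


def worker_loop(listener, max_requests):
    """Worker process: accept connections on the socket inherited from the master."""
    for _ in range(max_requests):
        conn, _addr = listener.accept()
        handle(conn)


def start_prefork_server(num_workers=2, max_requests=5, host="127.0.0.1", port=0):
    """Master: open the listening socket, then fork workers before any request arrives."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(16)
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:  # child process: become a worker and never return
            try:
                worker_loop(listener, max_requests)
            finally:
                os._exit(0)
        pids.append(pid)  # parent process: remember the worker
    return listener, pids


def fetch(port, host="127.0.0.1"):
    """Tiny client used to try the server out."""
    with socket.create_connection((host, port)) as client:
        client.sendall(b"GET / HTTP/1.0\r\n\r\n")
        chunks = []
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


def stop_workers(pids):
    """Terminate and reap the workers."""
    for pid in pids:
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass
        os.waitpid(pid, 0)
```

Calling `start_prefork_server()` and then `fetch(listener.getsockname()[1])` a few times should show replies coming from worker PIDs rather than from the master. A real server would add what this sketch leaves out: scaling the number of workers with load, restarting dead workers, and graceful shutdown.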

Pre-forking can be used when you have libraries that are NOT thread safe. It also means that a problem triggered by one request only affects the process handling that request, not the entire server.

Whether the initialisation runs multiple times depends on what you are deploying and how it is loaded. Usually, however, things like connection pools exist once per process.
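Since each worker may run the startup code itself, one defensive option is to make the initialization idempotent, so that running it several times leaves the same result as running it once. The sketch below uses SQLite from the standard library; the table name `settings` and the seeded row are hypothetical stand-ins for whatever entries your app creates.

```python
import sqlite3


def init_db(conn):
    """Idempotent setup: safe to call from every worker."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value TEXT)"
    )
    # INSERT OR IGNORE skips the row if the primary key already exists.
    conn.execute(
        "INSERT OR IGNORE INTO settings (key, value) VALUES (?, ?)",
        ("schema_version", "1"),
    )
    conn.commit()


def count_settings(conn):
    """Return how many rows the settings table holds."""
    return conn.execute("SELECT COUNT(*) FROM settings").fetchone()[0]
```

Other databases have their own equivalents of "insert if missing"; the point is only that repeated calls do not create duplicate entries.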

In a threading model, the master would create lighter-weight threads to dispatch requests to. But if a thread causes serious problems, that can have repercussions for the master process.

Tools such as Nginx, Apache 2.4's Event MPM, or gevent (which can be used with Gunicorn) are asynchronous, meaning that a single process can handle hundreds of requests concurrently without blocking.

Habitable answered 17/9, 2014 at 15:34 Comment(5)
I had the same doubt about the meaning of "prefork". I supposed it meant forking of some kind, naturally, but the "pre" part was confusing me. I found here abbreviations.com/prefork that the "pre" part actually means that worker processes are created in advance, so that time is not wasted forking only when a worker is needed. Makes a lot of sense to me :) - Diffusive
// , @ElNinjaTrepador, why not add a separate answer? That was a lot more intelligible to me, at least, and it may help others more if that comment gets a more prominent place. - Newborn
I've updated the answer to add a bit more information on the pre in pre-fork. @ElNinjaTrepador thanks for pointing this out I didn't realize that wasn't well known. - Habitable
@NathanBasanese will do sir ;) - Diffusive
@JoeDoherty maybe it is well known, it just wasn't known by me until that time :D I was going to add another answer (as per Nathan's request), but since you've added what I said to yours it's not necessary anymore :) - Diffusive

How does a "pre-fork worker model" work?

  • Master Process: There is a master process that spawns and kills workers, depending on the load and the capacity of the hardware. More incoming requests cause the master to spawn more workers, up to the point where the "hardware limit" (e.g. all CPUs saturated) is reached, at which point queuing sets in.
  • Workers: A worker can be understood as an instance of your application/server. So if there are 4 workers, your server is booted 4 times. That means it occupies roughly 4 times the "base RAM" of a single worker, unless you do shared memory wizardry.
  • Initialization: Your initialization logic needs to be robust enough to cope with being run by multiple servers. For example, if you write DB entries, check whether they already exist, or run a setup job before your app server starts.
  • Pre-fork: The "pre" in prefork means that the master always provides a bit more capacity than currently required, so that if the load goes up the system is "already ready". In other words, it preemptively spawns some workers. For example, in Apache's prefork MPM you control this with the MinSpareServers directive.
  • Requests: Incoming requests (TCP connections) are distributed from the master's listening socket to the children. In Gunicorn and Unicorn, for example, the master opens the listening socket and the forked workers inherit it and accept connections from it directly.
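For Gunicorn specifically, the points above might translate into a config file along these lines. This is a sketch with hypothetical values, not a recommended setup: `bind`, `workers` and `preload_app` are Gunicorn settings, and `on_starting` and `post_fork` are Gunicorn server hooks. With `preload_app = True` the application code is loaded once in the master before the workers are forked; without it, each worker loads the application itself.

```python
# gunicorn.conf.py (illustrative values, adjust to your deployment)
import multiprocessing

bind = "127.0.0.1:8000"                         # address the master listens on
workers = multiprocessing.cpu_count() * 2 + 1   # a common starting point, not a rule
preload_app = True                              # load the app in the master, then fork


def on_starting(server):
    """Runs once in the master, before any worker exists."""
    server.log.info("master starting: one-time setup could go here")


def post_fork(server, worker):
    """Runs in each worker right after it has been forked."""
    server.log.info("worker forked (pid: %s)", worker.pid)
```

One-time work such as seeding database entries fits the `on_starting` hook or a separate setup job, while per-process resources such as connection pools are safer to create after the fork.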

What problem do pre-fork servers solve?

  • Multiprocessing: If you have a program that can only target one CPU core, you potentially waste some of your hardware's capacity by only spawning one server. The forked workers tackle this problem.
  • Stability: When one worker crashes, the master process isn't affected. It can just spawn a new worker.
  • Thread safety: Since it's really as if your server were booted multiple times, in separate processes, you don't need to worry about thread safety (the workers are processes, not threads, and share no memory by default). This makes it an appropriate model when you have non-thread-safe code or use non-thread-safe libraries.
  • Speed: Since the child processes aren't forked (spawned) right when needed, but pre-emptively, the server can always respond fast.
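The stability point can be sketched in a few lines: the master waits for its worker, and if the worker dies it simply forks a replacement. The function name `run_with_respawn` and the simulated crash via exit code 1 are assumptions for illustration; a real master would supervise several workers at once and react to signals.

```python
import os


def run_with_respawn(crashes=2):
    """Fork a worker; whenever it exits abnormally, fork a replacement.

    The worker "crashes" (exit code 1) on the first `crashes` attempts
    and exits cleanly (exit code 0) on the last one.
    Returns the list of exit codes the master observed.
    """
    exit_codes = []
    for attempt in range(crashes + 1):
        pid = os.fork()
        if pid == 0:
            # Worker process: simulate a crash, or finish cleanly on the last attempt.
            os._exit(1 if attempt < crashes else 0)
        # Master process: unaffected by the crash, just reaps the worker.
        _, status = os.waitpid(pid, 0)
        exit_codes.append(os.WEXITSTATUS(status))
    return exit_codes
```

With the default argument the master sees two failed workers and one clean exit, and it keeps running throughout.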

Alternatives and Sidenotes

  • Container orchestration: If you're familiar with containerization and container orchestration tools such as Kubernetes, you'll notice that they solve many of the same problems. Kubernetes runs multiple pods to get multiprocessing, offers the same (or better) stability, and has features such as "horizontal pod autoscalers" that also spawn and kill workers.
  • Threading: A server may spawn a thread for each incoming request, which allows many requests to be handled "simultaneously". This is the default for most Java-based web servers, since Java natively has good support for threads, meaning that the threads run truly in parallel on different CPU cores. Python's threads, on the other hand, cannot truly parallelize (that is, spread work across multiple cores) due to the GIL (Global Interpreter Lock); they only provide a means for context switching. That's why "pre-forkers" like gunicorn are so popular for Python servers, and why people coming from Java might never have heard of such a thing.
  • Async / non-blocking processing: If your servers spend a lot of time "waiting", for example on disk I/O, HTTP requests to external services or database requests, then multiprocessing might not be what you want. Instead, consider making your code "non-blocking", so that a single process can handle many requests concurrently. Coroutine- or event-loop-based systems such as FastAPI (an ASGI framework) in Python, Go, or Node.js use this mechanism, so that even one server process can serve many requests at once.
  • CPU-bound tasks: If you have CPU-bound tasks, the non-blocking processing mentioned above won't help much. You then need some form of multiprocessing to distribute the load across your CPU cores, using one of the solutions mentioned above: container orchestration, threading (on systems that allow true parallelization) or... pre-forked workers.
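As a rough illustration of the async / non-blocking point, the sketch below lets a single process "serve" many requests that each spend their time waiting. `asyncio.sleep` stands in for a database or HTTP call, and the names `handle_request`, `serve_many` and `run_demo` are made up for this example.

```python
import asyncio
import time


async def handle_request(request_id, wait=0.1):
    """Pretend to wait on I/O (database, external HTTP call) and then answer."""
    await asyncio.sleep(wait)  # yields control while "waiting"
    return request_id


async def serve_many(n=50, wait=0.1):
    """Handle n waiting requests concurrently in one process and one thread."""
    start = time.perf_counter()
    results = await asyncio.gather(*(handle_request(i, wait) for i in range(n)))
    return results, time.perf_counter() - start


def run_demo(n=50, wait=0.1):
    """Synchronous entry point; returns (results, elapsed_seconds)."""
    return asyncio.run(serve_many(n, wait))
```

Handled one after another, 50 requests of 0.1 s each would take about 5 s; run concurrently they should finish in roughly the time of a single wait. This only helps while the requests are waiting, which is why CPU-bound work still needs multiple processes or truly parallel threads.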

Least answered 7/1, 2022 at 18:14 Comment(1)
Very helpful answer bersling, glad I read past the fold on this question! :) - Concerto
