What does it mean to say Apache spawns a thread per request, but node.js does not?

I have read about node.js and other servers such as Apache, where the threading is different. I simply do not understand what the threading means.

If I have a webpage that runs SQL to hit a database, say three different databases in the one server side page, what does that mean for threading in node.js? Apache? What does "thread" mean here?

Or as an article I saw, "start a new thread to handle each request."

What does it mean to say Apache spawns a thread per request, but node.js does not?

EDIT: I am hoping for an example that I can grasp. I'm used to having a server side page that hits a database(s). Several connections inside that file.

Gigue answered 11/10, 2013 at 17:56 Comment(1)
I posted a couple of diagrams in this question that show how the different threads behave in an event-driven model (like Node.js) versus a traditional threaded model: #14189996. (From the comments I got on that question, the diagrams seem to be very accurate.)Hardeman
V
69

A thread is a context of program execution. Programs that are single-threaded can only do one thing at once, where multi-threaded programs can do many things at once.

Think of it like a kitchen at a restaurant. A single chef can really only do one task at a time, be that chopping onions or putting something in an oven. If an order comes in that requires lots of work from the chef (such as making salads vs. putting stuff in the oven and waiting) some meals may get delayed because that chef is busy. On the other hand, if that chef just has to bake a bunch of stuff, there isn't much work for him to do and he can make other meals while waiting for the food in the oven to be done.

With multiple chefs, many of these tasks can be done simultaneously. Many meals can be prepared simultaneously.

Apache's threading model is like hiring a fixed number of chefs (regardless of how many customers your restaurant has that night), where each chef can only work on one meal at a time. That means that if a meal order comes in, a dedicated chef is assigned to that meal. There will be times when that chef is busy chopping up ingredients and mixing cake batter, but there will also be times when he's just standing around waiting for the potatoes to boil. At any given time, you could have most of your chefs sitting idle, waiting on potatoes to boil and cake to bake, and no more orders will be worked on, since each chef is dedicated to one order at a time.

To make matters worse, your kitchen is only as big as you can afford to make it. Each chef takes up space and resources, and you may have a situation where a bunch of chefs standing around holding the only spoons available are preventing other chefs from getting their food made.

Nginx is another web server (often used as a proxy) that you didn't ask about, but I'm including it to explain another threading model. It also hires a fixed number of chefs, but it hires fewer of them. Each chef can work on multiple meals at a time. So, if they're waiting on potatoes to boil while an order comes in for a chopped salad, they can go work on that salad instead of standing around idle. You can have a smaller kitchen (relative to the size of restaurant/number of customers) and get the same number of meals out, or more. It's a tight crew that is effective at not wasting time and resources.

Node.js is a bit different. It is single-threaded from a JavaScript perspective, but other tasks like disk and network IO are handled on separate threads automatically. It's like having a kitchen with only one chef, but that makes sense in some cases. If your kitchen has a lot of busy work for that chef, perhaps it makes sense to hire more chefs to do work. (To do this in Node.js, you can only spawn more processes, which is effectively like building a bunch of small kitchens right next to each other. You can have one guy standing out front coordinating the orders for all those kitchens.) However, if you're just a bakery (mainly just IO, with little busy-work for the chef), maybe you only need one chef.
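For the "three databases in one page" case from the question, here is a minimal sketch of what the single chef does. It assumes a made-up `fakeQuery` helper that stands in for a real database driver by resolving after a timer (the names `fakeQuery`, `handleRequest`, and the three database names are hypothetical, not part of any library). The point is only that one JavaScript thread can start all three waits and be free in the meantime, so the total is roughly the slowest query rather than the sum.

```javascript
// Hypothetical stand-in for a database call: resolves after `ms` milliseconds.
// A real driver would hand the network wait to the OS in much the same way.
function fakeQuery(name, ms) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`${name} rows`), ms);
  });
}

// One "page" that needs three databases. All three waits are started
// before any of them finishes; the single JS thread is idle while they run.
async function handleRequest() {
  const started = Date.now();
  const results = await Promise.all([
    fakeQuery('users', 100),
    fakeQuery('orders', 100),
    fakeQuery('inventory', 100),
  ]);
  return { results, elapsedMs: Date.now() - started };
}

handleRequest().then((outcome) => console.log(outcome));
```

If the three queries were awaited one after another instead of through `Promise.all`, the elapsed time would be closer to the sum of the three delays, but the thread would still be free to serve other requests while each one was pending.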

To sum all this up, different threading models are used to divide work and process it effectively. Which threading model makes sense depends on your needs, and the other characteristics of the server you are choosing.

Valise answered 11/10, 2013 at 18:13 Comment(11)
Great answer. I'm going to use the kitchen metaphor in the future.Erdah
Awesome, this should be awarded a bounty :)Piliferous
This is very good. Thank you. I understood everything, but I wasn't sure how you were saying that Node.js was different. I will keep thinking on it though.Gigue
@Gigue Can you elaborate on what parts are still confusing? Node.js runs your code in a single thread. If you load a file from disk asynchronously, you basically tell Node, "go get that file and let me know when you're done". Then, your code in your single thread can go do other things until Node.js itself tells you that the file is ready. The underlying IO work is done in other threads that your code cannot access and doesn't need to know about. It's an important distinction, because processing a request will block other requests, but simply pushing data around won't.Valise
@Valise I think I get it. Node.js is like you said, taking a request from say http request, hits the back end code. The V8 takes the js code and creates multiple threads, but you still have only spawned one thread. V8 sends data back as it gets it but the initial request still loads and adds data as it gets it. Maybe. But how is that different from the Apache spawning threads? It takes a request and spawns threads inside its "Apache engine." There are still lots of threads by all of them, it seems.Gigue
I like the chefs, btw.Gigue
@Gigue V8 does not, and cannot, create multiple threads for running JavaScript. Your JavaScript always runs in a single thread. Suppose I have some JavaScript that calls fs.readFile(). There is some additional JavaScript in the standard library, but eventually some bit of JavaScript tells V8 to invoke native compiled code originally written in C or something. That native code can do whatever it wants at that point, and Node.js' library does create a thread pool for things in native code. When it comes time to send data back to JavaScript, it ends up back on the main loop, in the single thread.Valise
@Valise is there a Java server that doesn't use threads and can serve requests asynchronously like node.js? Is it possible to implement it in Java?Carlyncarlynn
@POrekhov Sure, you can write whatever you like in almost any language/stack.Valise
@Coder-Man, I think it is what you are asking: thebackendguy.com/netty-simple-tcp-serverDistributee
Cooks in a kitchen are like CPUs in a computer. The cooks/CPUs make things actually happen. The orders in the kitchen are like the threads in a software system: each one has a state that progresses mostly independently of the others. In order to achieve the best efficiency, you want the cooks/CPUs to be able to "switch context" from one order/thread that does not need their attention at any given moment to another order/thread that does need it. What you described, with each cook sticking with just one order until it's done, is the culinary equivalent of a system with lots of polling and spin locks.Erogenous
E
1

Node.js is single-threaded in that it can only do one thing at once. You can run multiple instances of the node process on pretty much all cloud service providers, though. The Apache process, by contrast, can multitask across threads.

If the node process hangs for some reason, nothing else can happen. That's why it's important to write Node code in an asynchronous way, so that if a database query hangs, node can still take requests.
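A rough way to see the "nothing else can happen" part is to block the thread on purpose and measure how late a timer fires. This is only a sketch: `busyWait` and `measureTimerDelay` are hypothetical helper names, and the busy loop stands in for any synchronous, CPU-bound work inside a request handler.

```javascript
// Hypothetical helper: spin on the CPU for `ms` milliseconds.
// While this loop runs, the single JS thread cannot do anything else.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // intentionally empty: this simulates synchronous, CPU-bound work
  }
}

// Schedule a 0 ms timer, then block the thread for `blockMs`.
// Resolves with how long the timer actually had to wait before it ran.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 0);
    busyWait(blockMs);
  });
}

measureTimerDelay(200).then((delayMs) => {
  console.log(`a 0 ms timer fired after about ${delayMs} ms`);
});
```

The timer was asked to fire immediately, but its callback cannot run until the blocking loop gives the thread back. Any incoming request would be stuck behind that loop in the same way.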

Without getting too technical, a thread can be thought of as a lane in the highway of the program. It's a specific channel of execution. In the lifetime of a request, a lot of things have to happen. All of those things are in one box.

Node doesn't give your JavaScript multiple threads! You can think of it like a one-lane road. But the way node is deployed, you get many instances of that one-lane road. They don't share anything, though. If a value gets added to an array in one, it's not in the other. Anything that needs to be shared has to be shared in a cache or database layer.

Erdah answered 11/10, 2013 at 17:58 Comment(1)
What does "thread" mean? What does it mean to say "the node process"? How can I understand this in light of my example? Thanks.Gigue
T
1

What people tend to confuse are threads, processes, and asynchronous, non-blocking I/O.

Threads are child-level 'runnables' of a process. A full execution environment is set up for each thread: everything from the stack to addressable memory locations is allocated to it. If a child thread has to communicate back to the main process thread, it has to use safe messaging or notification models. There are multiple ways to do this, depending on the language.

Node.js is single-threaded and, obviously, single-process. It's not meant for CPU-intensive blocking calls. If you still want to use it for them, you could consider Node clustering: instead of creating threads, it creates multiple processes that work like threads.

Async: code that takes a callback function is not necessarily async in this sense. Literally speaking, it is asynchronous, because it doesn't block the call.

But in the Node.js context, when someone says Node is async, that is tied entirely to how it interfaces with the OS. Node's capability depends on the non-blocking I/O capabilities of the underlying OS. Whatever objects the OS supports non-blocking I/O for (for example sockets, files, pipes), Node uses them to the maximum.

And by the way, when you talk about Apache, you should ideally be comparing it with Nginx, not Node.js. Node.js is not meant to serve as a web server. It's basically a process that makes effective use of async I/O.

Triglyceride answered 27/5, 2020 at 14:22 Comment(0)
