Why is Node.js single threaded? [closed]

300

In PHP (or Java/ASP.NET/Ruby) based web servers, every client request is handled on its own thread. But in Node.js all clients run on the same thread (they can even share the same variables!). I understand that I/O operations are event-based, so they don't block the main thread loop.

What I don't understand is WHY the author of Node chose to make it single-threaded. It makes things difficult. For example, I can't run a CPU-intensive function because it blocks the main thread (and new client requests are blocked), so I need to spawn a process (which means I need to create a separate JavaScript file and execute another node process on it). However, in PHP, CPU-intensive tasks do not block other clients because, as I mentioned, each client is on a different thread. What are its advantages compared to multi-threaded web servers?
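
To make the blocking concrete, here is a minimal sketch (not from the original post) that schedules a 0 ms timer and then spins in a synchronous loop. The helper names busyWait and measureTimerDelay are made up for illustration; the only point is that the timer callback cannot run until the loop returns.

```javascript
// Spin synchronously for roughly `ms` milliseconds (stand-in for CPU-heavy work).
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // burn CPU on the main thread
  }
}

// Schedule a 0 ms timer, then block; resolve with how late the timer fired.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 0);
    busyWait(blockMs); // the timer callback has to wait for this to finish
  });
}

measureTimerDelay(200).then((delay) => {
  // The delay is roughly the blocking time, not 0 ms.
  console.log(`timer fired after ${delay} ms`);
});
```

Any request arriving during the loop would be delayed in the same way as the timer, which is the behaviour the question describes.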

Note: I've used clustering to get around this, but it's not pretty.

Twitt answered 31/7, 2013 at 0:25 Comment(14)
I recently watched a good video (29 mins) explaining some of the theory behind Node. I even think the guy talks about CPU-intensive tasks and briefly how to handle them: youtube.com/watch?v=L0pjVcIsU6A - Hydrolysis
You may know this, but to be clear, Node.js isn't single-threaded. Your JavaScript code runs single-threaded, but I/O operations and other things that plugins can do run out of a thread pool. Node.js gives you much of the benefit of multithreading without having to deal with multithreaded code. Also, Node.js contributors didn't choose the single-threaded nature of JavaScript; the authors of JavaScript did. I can't think of a way JS could work in a multithreaded context, but even if there were one, V8, which Node.js uses as its JavaScript engine, isn't written that way. - Seif
V8 was just the interpreter; they could have threaded it, since they're just using it via C++. Also, yes, I know events are threaded, but I am talking about the main loop. Further, with standard web servers (like Java) you don't have to deal with multithreading; the web servers do all that for you. - Twitt
If a thread is "CPU intensive" you can only run one of those per physical CPU core. In my experience that's usually 32 or fewer. You can easily run 32 processing nodes using the cluster module. In Apache/PHP, if you have number-of-cores requests using full CPU, all other requests are actually queued and waiting. So running many CPU-intensive processes in parallel is actually just an illusion. - Georgeanngeorgeanna
PHP is more single-threaded than JavaScript. You are probably thinking of server modules like FastCGI or mod_php. So you're in fact comparing Node.js with Apache, Nginx or IIS, not with PHP, Java or Ruby. - Satrap
Of course the question is about the stack, not a single tool like PHP. No one is handling HTTP calls in pure PHP, or Java, for that matter. And by the way, Java has excellent support for event-based request handling; people simply don't want to learn how to use it. - Corporeal
Node is not single-threaded. It's a popular misconception. Even a simple node -e 'setTimeout(()=>{},1000);' & ps -T h $! | wc -l; kill $! displays five threads on my system. The main event loop is single-threaded (it wouldn't make much sense if it wasn't) but Node is heavily multi-threaded and you can write multi-threaded single-process applications if you want. I would love to write a comprehensive answer about it but some people decided to close your question so I can't. I'm voting to reopen it. If it gets more votes and gets reopened then please mention me in the comment. - Inshrine
@Inshrine thanks for your comment, but I meant the main thread, not the I/O-related ones. If you're doing something CPU-related, like a big for loop that does something, then the server stops processing connections, meaning the server is unusable during that time. So we're left using hacks like clusters just to do something so simple, instead of it inherently threading every connection like most servers do. jxcore.com tried to address this, but it makes one use special/modified node plugins, which essentially makes it unusable to me. - Twitt
@foreyez If you mean the http server in Node then it itself is async and event-based, so in order to use it you will need to use callbacks that are nonblocking. But your callbacks can themselves call e.g. an extension that is heavily threaded and has blocking code, as long as the main event loop thread is not blocked. You can write a threaded Node app using C++, which may be a good idea anyway for CPU-intensive code. For a JavaScript-only solution you can take a look at the webworker-threads Node module. There are also a few other ways. - Inshrine
Oh, I've tried it all, including webworker-threads. Everything is convoluted and a disaster to work with. Oh well. One day hopefully someone will see the benefit and fork node in a proper way. - Twitt
@foreyez There are also fibers and generators... ;) Seriously, I can feel your pain :) If you're used to blocking code then I admit that Node can sometimes be frustrating. But what I expect in your case is that you may not even need to use C++, webworkers, fibers or generators if it is just something simple that happens to block the thread too much, like a long-running loop or something like that. Maybe if you post it as a new question and post a link here in the comments then I'll be able to help, if you include sample code that you would like to run, like a long-running loop or something. - Inshrine
Duplicate of #7018593. @Inshrine please write a comprehensive answer there. - Depose
@Inshrine after 4+ years it is better to create a new post and give your answer. - Atypical
Watch my easy illustrated answer here: https://mcmap.net/q/37391/-why-is-node-js-called-single-threaded-when-it-maintains-threads-in-thread-pool - Uncritical
343

Node.js was created explicitly as an experiment in async processing. The theory was that doing async processing on a single thread could provide more performance and scalability under typical web loads than the typical thread-based implementation.

And you know what? In my opinion that theory's been borne out. A node.js app that isn't doing CPU intensive stuff can run thousands more concurrent connections than Apache or IIS or other thread-based servers.

The single threaded, async nature does make things complicated. But do you honestly think it's more complicated than threading? One race condition can ruin your entire month! Or empty out your thread pool due to some setting somewhere and watch your response time slow to a crawl! Not to mention deadlocks, priority inversions, and all the other gyrations that go with multithreading.

In the end, I don't think it's universally better or worse; it's different, and sometimes it's better and sometimes it's not. Use the right tool for the job.

Aesthetically answered 31/7, 2013 at 0:36 Comment(20)
But web servers typically do ALOT of CPU-intensive stuff; it's not JUST database fetching. We need to process what we fetch and do a lot of business logic a lot of the time before serving it up to the client. - Twitt
So just spawn workers! That's the whole deal with Node.js. Heavy stuff can run in another process, and you process its results in a lightweight callback. - Streamlet
The problem with that is that there's an OS-level process running per worker. You'll see them using the "ps" command. So that potentially means thousands of processes running on the machine at once. That's nuts! - Twitt
By the way, as far as race conditions go, server developers rarely deal with that; you're talking about people who have actually programmed Tomcat or IIS or something. - Twitt
@foreyez, You don't need a process per user. You have a choice in how you split up the load. Also, not everyone is doing a ton of CPU-intensive stuff. Node is a tool for a job... maybe not your job, but many kinds of jobs. - Seif
Actually, I'd like @foreyez to back up that statement that "web servers typically to ALOT(sic) of cpu intensive stuff". In my experience, they don't. Or maybe my definition of 'cpu intensive' differs from his. Converting product data into a UI is not CPU-intensive, nor is calculating orders or the like. Most of the web is pretty transactional. CPU-intensive stuff is things like converting videos, converting image formats, etc. Much of that is due to file I/O which, actually, node does pretty well. And it makes it easy to offload to another process that's dedicated to the converting. - Innards
Paul is right; I doubt applying business logic will be CPU-intensive enough that it would need multiple workers. If you do CPU-intensive stuff frequently, then you should redesign your server architecture to do that on another app server, not on your web server, which is node. - Kashakashden
These are just excuses. You're saying that a web server doesn't do CPU-intensive things apart from converting video? How about reading from a service and translating the JSON into a different format? This is basic stuff, guys. I got around my problem using the cluster module. But this is a lame solution for faulty architecture. Just make the darn thing spawn another thread when a user connects; this isn't rocket science. I see other projects like jxcore.com did just that. - Twitt
Typically format conversion is an I/O-bound operation. If you've got a lot of CPU-intensive stuff as part of that operation it's a bit unusual compared to the typical web workload. Again, use the proper tool for the job; it sounds like node didn't fit your problem space. Glad you were able to get something that worked for you though. - Aesthetically
@foreyez I mean, how much JSON are you translating? Unless it's a massive amount of JSON, this can't take much processing. Also, at least from what I've seen, if HTTP requests take a fair amount of time, you should probably have some tier for creating jobs and checking statuses of running jobs, rather than doing everything in one shot on an open HTTP call and blocking that worker from handling other requests. - Blaspheme
It's worth noting that Node.js isn't single-threaded. It's only userland (i.e. request handling) that's restricted to a single thread. I realize that sounds pedantic but I had a hard time understanding Node until I understood that. Also, there are modules for threads that you can use when appropriate. - Mathematician
As the commenter above me said, Node.js is multithreaded internally. All I/O and database operations are multithreaded. You just can't access that functionality directly. quora.com/… - Eyeless
JavaScript does not provide ways of synchronization, so you can't have an implementation that uses multiple threads explicitly. - Autoroute
@ChrisTavares, it does not necessarily need to be CPU-intensive. We're talking about delays here. For example, making a request to a third-party web server for each page request takes no CPU but still has a delay. - Colbycolbye
@MaiaVictor, re "so just spawn workers": dude, any time you need to resort to that, you have already defeated the purpose of nodejs' "single threaded model". So saying that you can "just spawn workers" is not providing an argument for the model, but an argument against it. - Colbycolbye
@Colbycolbye I think you're missing the point. It doesn't defeat the model, it is the model: the main event loop in a single, light thread, everything else separate. The main difference is you don't spawn a thread for every new request. - Streamlet
@Paul, @ChrisTavares, CPU-bound examples here: #3492311 - Colbycolbye
@MaiaVictor, modern PHP web stacks don't spawn a new thread for every page load. That's just an uber myth, which is true for old Apache models, but newer web servers maintain a thread pool. - Colbycolbye
I didn't say that. I haven't touched Apache in a long time... - Streamlet
I get that the async model is better than thread-per-request, but I can't begin to fathom why anyone would want just one thread to do all the work; except, of course, that it's what was available in V8 and there was no other way (= faulty architecture IMO). One of anything is a bottleneck and a single point of failure. What's wrong with having a thread pool run async tasks in parallel? - Beefeater
76

The issue with the "one thread per request" model for a server is that it doesn't scale well in several scenarios compared to the event loop thread model.

Typically, in I/O-intensive scenarios the requests spend most of their time waiting for I/O to complete. During this time, in the "one thread per request" model, the resources tied to the thread (such as memory) sit idle, and memory becomes the limiting factor. In the event loop model, the loop thread selects the next event (I/O finished) to handle. So the thread is always busy (if you program it correctly of course).
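
As a rough illustration of the event loop side of this, the sketch below fakes an I/O wait with a timer and starts many "requests" at once on the single JavaScript thread. The names fakeIo and handleRequests are hypothetical, and a timer is only a stand-in for real network or disk I/O.

```javascript
// Simulate an I/O wait with a timer; no thread is parked while it is pending.
function fakeIo(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Start `count` simulated requests at once and report the total elapsed time.
async function handleRequests(count, ioMs) {
  const started = Date.now();
  // Each "request" only registers a callback; the loop handles completions as they arrive.
  await Promise.all(Array.from({ length: count }, () => fakeIo(ioMs)));
  return Date.now() - started;
}

handleRequests(1000, 100).then((elapsed) => {
  console.log(`1000 simulated requests finished in ${elapsed} ms`);
});
```

The total time should stay close to a single wait rather than the sum of all waits, because waiting costs the thread nothing. A thread-per-request server would need one parked thread per pending wait to get the same overlap.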

The event loop model, like all new things, seems shiny and like the solution to all issues, but which model to use depends on the scenario you need to tackle. If you have an I/O-intensive scenario (like a proxy), the event-based model will rule, whereas a CPU-intensive scenario with a low number of concurrent processes will work best with the thread-based model.

In the real world most scenarios will be somewhere in the middle. You will need to balance the real need for scalability against the development complexity to find the correct architecture (e.g. have an event-based front end that delegates to the back end for the CPU-intensive tasks; the front end will use few resources while waiting for the task result). As with any distributed system, it requires some effort to make it work.
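
One way to sketch that split inside a single process is the worker_threads module, which was added to Node well after this thread was written, so treat it as an assumption about your Node version. The helper sumInWorker and the inline worker source are made up for illustration; a real back end could just as well be a separate service.

```javascript
const { Worker } = require('worker_threads');

// Worker source kept inline (run with eval: true) so the sketch is a single file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.n; i++) sum += i; // CPU-bound part
  parentPort.postMessage(sum);
`;

// Run the CPU-heavy loop on another thread; the event loop stays free meanwhile.
function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

sumInWorker(1e7).then((sum) => console.log('sum from worker:', sum));
```

The main thread only waits for a message, so it can keep accepting requests while the loop runs elsewhere.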

If you are looking for the silver bullet that will fit with any scenario without any effort, you will end up with a bullet in your foot.

Frenchy answered 31/7, 2013 at 8:27 Comment(5)
Node.js is restricted to event-only processing due to the lack of V8 multithreading support. Well, the JavaScript language itself lacks the needed features, so any implementation will end up being tricky. That's the main culprit of Node.js, in my opinion. In other languages you can choose what you want, or some hybrid of both models, like Java NIO. - Windproof
@Kazaag, Modern web servers do maintain a thread pool. They don't just dumbly spawn a new thread per page load. Those are the older web servers. - Colbycolbye
@Colbycolbye I never said that a new thread is spawned, but each thread is allocated to one request until the request is finished. - Frenchy
@Frenchy It is definitely not a general rule that "each thread is allocated to one request until the request is finished". For example, in .NET (including when processing HTTP requests) one can and should use async (Task-based) programming, and this will release threads while waiting for I/O and other async operations to complete. This applies to high-level programming as well, e.g. MVC/API controllers. So in practice there could be 20 HTTP requests pending but only one active thread. - Fastigium
"silver bullet in your foot" Opening my shoe: Python xD - Bucaramanga
31

Long story short, node draws from V8, which is internally single-threaded. There are ways to work around the constraints for CPU-intensive tasks.
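
One such workaround that stays in plain JavaScript is to split a long computation into slices and yield to the event loop between them with setImmediate. This is only a sketch; sumInChunks and its chunk size are hypothetical, and it does not make the work faster, it only keeps other callbacks from starving.

```javascript
// Sum 1..n in slices, yielding to the event loop between slices.
function sumInChunks(n, chunkSize = 1e5) {
  return new Promise((resolve) => {
    let i = 1;
    let sum = 0;
    function step() {
      const end = Math.min(i + chunkSize - 1, n);
      for (; i <= end; i++) sum += i; // one bounded slice of work
      if (i > n) {
        resolve(sum);
      } else {
        setImmediate(step); // let other pending callbacks run before the next slice
      }
    }
    step();
  });
}

sumInChunks(1e6).then((sum) => console.log('sum:', sum));
```

Smaller slices keep the server more responsive at the cost of more scheduling overhead, so the chunk size is something to tune for the workload.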

At one point (0.7) the authors tried to introduce isolates as a way of implementing multiple threads of computation, but they were ultimately removed: https://groups.google.com/forum/#!msg/nodejs/zLzuo292hX0/F7gqfUiKi2sJ

Zhang answered 31/7, 2013 at 0:45 Comment(2)
Do you have more information regarding this "isolate"? - Colbycolbye
V8 is not single-threaded. - Uncritical