Is Node.js considered multithreading with worker threads?

2

19

My entire life, I thought Node.js and JavaScript were single-threaded. Node.js is not good for CPU-intensive tasks but is lightweight because of its single-threaded nature. Multithreading is good for CPU-intensive tasks because you can delegate tasks to different threads, but it opens the door to race conditions, which can get complicated.

Then along come worker threads, telling me Node can now spawn threads named "worker threads" to hand off CPU-intensive tasks so they don't block the JavaScript stack. Why do people call JavaScript single-threaded, as if that were a permanent definition, if with the power of worker threads it actually can be multithreaded? Or is JavaScript indeed permanently single-threaded, but with the power of worker threads, a process is able to have multiple threads of JavaScript, which each still behave as a single thread?

Node.js uses two kinds of threads: a main thread handled by event loop and several auxiliary threads in the worker pool.

Additionally, an article I read made the statement above. This makes it sound like JavaScript was actually using multiple different threads the entire time. Why are people calling JavaScript single threaded?

Justinejustinian answered 3/8, 2020 at 5:52 Comment(4)
"Why are people calling Javascript single threaded?" because they've been wrong the entire time. The statement was never really true. You could say that code runs on a single thread but that doesn't mean that the language is single threaded. The browser could and would run multiple threads with JS processing, for example. However, code loaded as part of the page would run on the UI thread. Again, that's one instance of single threading, not applicable to the whole language.Circumfluous
From a practical perspective Node.js is still essentially single-threaded. The way it operates and queues the event loop via callbacks can give the illusion of multi-threading, but at its core only one bit of the code written by the developer is being executed at a time. As you point out there are worker threads, but these do not share the same state: data is cloned between them, and updating a variable via a worker thread does not behave the way it would in multi-threading (you'll need messages to process these which are, you guessed it, queued and ordered by the primary thread).Ire
JavaScript code does execute on a single thread/stack, but Node.js itself has never been single threaded.Circumspect
@MarkTaylor You can use SharedArrayBuffer for sharing the same state across multiple worker threads.Basidiomycete
44

This makes it sound like JavaScript was actually using multiple different threads the entire time. Why are people calling JavaScript single threaded?

The programming model in Node.js is a single threaded event loop with access to asynchronous operations that use native code to implement asynchronous behavior for some operations (disk I/O, networking, timers, some crypto operations, etc...).

Also, keep in mind that this programming model is not a product of JavaScript the language itself. It's a product of how JavaScript is deployed in popular environments like Node.js and browsers as an event-driven implementation.

The fact that internally there is a native code thread pool that is used for the implementation of some asynchronous operations such as file I/O or some crypto operations does not change the fact that the programming model is a single threaded event loop. The thread pool is just how the implementation of a time consuming task is made to have an asynchronous interface through JavaScript. It's an implementation detail that doesn't change the JavaScript programming model from a single threaded, event loop model.
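As a minimal sketch of that point (the helper names `deriveKey` and `demo`, the password, the salt, and the iteration count are all made up for illustration), here is a crypto operation that Node.js hands to its native thread pool while the JavaScript thread carries on. From the JavaScript side you only ever see an ordinary asynchronous callback:

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: wraps the callback-style crypto.pbkdf2 in a promise.
// The hashing work runs on a thread-pool thread, not on the JavaScript thread.
function deriveKey(password, salt) {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2(password, salt, 100000, 32, 'sha256', (err, key) => {
      if (err) reject(err);
      else resolve(key.toString('hex'));
    });
  });
}

// Records the order in which things happen on the JavaScript thread.
async function demo() {
  const order = [];
  const pending = deriveKey('secret', 'salt').then(() => order.push('key derived'));
  // This line runs before the key is ready, because nothing here blocks.
  order.push('main thread keeps running');
  await pending;
  return order;
}

demo().then((order) => console.log(order));
```

The thread pool never shows up in the code you write; the only visible effect is that the callback arrives later, through the event loop.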

Similarly, the fact that you can now actually create WorkerThreads does not really change the primary programming model either because the WorkerThreads run in a separate JavaScript VM with a separate event loop and do not share regular variables. So, whether you're using WorkerThreads or not, you still pretty much design your code for an event-driven, non-blocking system.

WorkerThreads do allow you to off-load some time-consuming tasks to get them out of the main event loop to keep that main event loop more responsive, and this is a very good and useful option to have in some cases. But the overall model does not change. For example, all networking is still event-driven, non-blocking, and asynchronous. So, just because we have WorkerThreads, that doesn't mean that you can now program networking in JavaScript like you sometimes do in Java with a separate thread for every new incoming request. That part of JavaScript's model doesn't change at all. If you have an HTTP server in Node.js, its JavaScript is still handling one incoming request at a time and won't start processing the next incoming request until the handler for the prior request returns control back to the event loop.
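To illustrate what "returns control back to the event loop" means in practice, here is a small sketch that uses a timer instead of an HTTP server so it stays self-contained. `blockFor` and `measureTimerDelay` are hypothetical helpers, and the 200 ms figure is arbitrary. The timer stands in for "the next incoming request": it cannot be handled until the synchronous code in front of it finishes.

```javascript
// Hypothetical helper: busy-waits on the JavaScript thread for roughly `ms` milliseconds.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Spin: nothing else can run on this thread in the meantime.
  }
}

// Hypothetical helper: resolves with how long a 0 ms timer actually took to fire.
function measureTimerDelay() {
  return new Promise((resolve) => {
    const start = Date.now();
    // Ask for a callback as soon as possible...
    setTimeout(() => resolve(Date.now() - start), 0);
    // ...then hog the thread. The callback has to wait until this returns.
    blockFor(200);
  });
}

measureTimerDelay().then((delay) => console.log(`timer fired after about ${delay} ms`));
```

A request handler that does CPU-heavy work synchronously has the same effect on every other pending request, which is exactly the kind of work you would consider moving to a WorkerThread.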

Also, you should be aware that the current implementation of WorkerThreads in Node.js is fairly heavyweight. The creation of a WorkerThread fires up a new JavaScript VM, initializes a new global context, sets up a new heap, starts up a new garbage collector, allocates some memory, etc... While useful in some cases, these WorkerThreads are much, much more heavyweight than an OS level thread. I think of them as if they're almost like mini child processes, but with the advantage that they can use SharedMemory between WorkerThreads or between the main thread and WorkerThreads which you can't do with actual child processes.

Or is JavaScript indeed permanently single-threaded, but with the power of worker threads, a process is able to have multiple threads of JavaScript, which each still behave as a single thread?

First off, there's nothing inherent in the JavaScript language specification that requires single-threaded execution. The single-threaded programming model is a product of how the JavaScript language is implemented in the popular programming environments such as Node.js and the browser. So, when speaking about single-threadedness, you should speak about the programming environment (such as Node.js), not about the language itself.

In Node.js, a process is able to have multiple threads of JavaScript now (using WorkerThreads). These run independently so you can get true parallelization of running JavaScript in multiple threads concurrently. To avoid many of the pitfalls of thread synchronization, WorkerThreads run in a separate VM and do not share access to variables of other WorkerThreads or the main thread except with very carefully allocated and controlled SharedMemory buffers. WorkerThreads would typically communicate with the main thread using message passing which runs through the event loop (so a level of synchronization is forced on all the JavaScript threads that way). Messages are not passed between threads in a pre-emptive way - these communication messages flow through the event loop and have to wait their turn to be processed just like any other asynchronous operation in Node.js.
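Here is a minimal sketch of the "no shared variables, message passing only" behavior. The worker source is passed as a string with `eval: true` purely to keep the example in one file, and `runWorker` and the `counter` object are hypothetical names. The object given to the worker as `workerData` is copied, so the worker's change never shows up in the main thread:

```javascript
const { Worker } = require('node:worker_threads');

// Worker source kept inline (eval: true) so the sketch fits in a single file.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // workerData is a structured clone, so this mutation stays inside the worker.
  workerData.counter += 1;
  parentPort.postMessage({ counterInWorker: workerData.counter });
`;

// Hypothetical helper: starts one worker and reports both views of the counter.
function runWorker() {
  return new Promise((resolve, reject) => {
    const state = { counter: 0 };
    const worker = new Worker(workerSource, { eval: true, workerData: state });
    // The message is delivered through the main thread's event loop.
    worker.once('message', (msg) => {
      resolve({ counterInMain: state.counter, counterInWorker: msg.counterInWorker });
    });
    worker.once('error', reject);
  });
}

runWorker().then((result) => console.log(result));
```

The main thread still sees `counter` as 0 while the worker reports 1, and the only way the main thread learns about the worker's value is the posted message.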

Here's an example implementation using WorkerThreads. I was writing a test program whose job was to run a simulation of an activity several billion times and record statistics on all the results to see how random the results were. Some parts of the simulation involved some crypto operations that were pretty time-consuming on the CPU. In my first generation of the code, I was running a smaller number of iterations for testing, but it was clear that the desired several billion iterations were going to take many hours to run.

Through testing and measurement, I was able to find out which parts of the code were using the most CPU and then I created a WorkerThread pool (8 worker threads) that I could pass the more time consuming jobs to and they could work on them in parallel. This reduced the overall time of running the simulation by a factor of 7.
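This is not the code from that simulation, just a rough sketch of the general shape under some simplifying assumptions: the CPU-heavy job is a stand-in (counting primes), the names `countInWorker` and `countPrimesParallel` are made up, and it starts one short-lived worker per chunk rather than keeping a reusable pool around.

```javascript
const { Worker } = require('node:worker_threads');

// Worker source: counts primes in [from, to) and posts the count back.
// The prime counting is only a placeholder for some CPU-heavy job.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  const isPrime = (n) => {
    if (n < 2) return false;
    for (let d = 2; d * d <= n; d++) if (n % d === 0) return false;
    return true;
  };
  let count = 0;
  for (let n = workerData.from; n < workerData.to; n++) if (isPrime(n)) count++;
  parentPort.postMessage(count);
`;

// Hypothetical helper: runs one chunk of the job in its own worker.
function countInWorker(from, to) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { from, to } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Splits [0, limit) into `workers` chunks, runs them in parallel, sums the results.
async function countPrimesParallel(limit, workers) {
  const chunk = Math.ceil(limit / workers);
  const jobs = [];
  for (let i = 0; i < workers; i++) {
    jobs.push(countInWorker(i * chunk, Math.min((i + 1) * chunk, limit)));
  }
  const partial = await Promise.all(jobs);
  return partial.reduce((a, b) => a + b, 0);
}

countPrimesParallel(100000, 4).then((total) => console.log(total));
```

For a long-running job you would want to create the workers once and feed them jobs by message, since starting a worker is expensive, as described above.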

Now, I could have also used child processes for this, but they would have been less efficient because I needed to pass large buffers of data between the main thread and a workerThread (the workerThread was processing data in that buffer) and it was a lot more efficient to do that using a SharedArrayBuffer than it would be to pass the data between parent and child processes (which would have involved copying the data rather than sharing the data).
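A minimal sketch of that kind of sharing, assuming a tiny four-element buffer and a hypothetical `incrementInWorker` helper (my real buffers were much larger and the processing was different). The worker modifies the memory in place and the main thread sees the changes without any data being copied back:

```javascript
const { Worker } = require('node:worker_threads');

// Worker source kept inline (eval: true) so the sketch fits in a single file.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // A view over the same memory the main thread allocated; nothing is copied.
  const view = new Int32Array(workerData.shared);
  for (let i = 0; i < view.length; i++) Atomics.add(view, i, 1);
  parentPort.postMessage('done');
`;

// Hypothetical helper: lets a worker increment every element of a shared buffer.
function incrementInWorker() {
  return new Promise((resolve, reject) => {
    const shared = new SharedArrayBuffer(4 * Int32Array.BYTES_PER_ELEMENT);
    const view = new Int32Array(shared);
    view.set([10, 20, 30, 40]);
    const worker = new Worker(workerSource, { eval: true, workerData: { shared } });
    // Only a small 'done' message crosses threads; the data itself stays put.
    worker.once('message', () => resolve(Array.from(view)));
    worker.once('error', reject);
  });
}

incrementInWorker().then((values) => console.log(values));
```

Unlike the plain object in a regular message, a SharedArrayBuffer is the one case where both threads really do touch the same memory, which is why `Atomics` operations are used for the writes.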

Orella answered 3/8, 2020 at 7:4 Comment(0)
2

It is called single-threaded because, by default, only a single thread of JS runs on the CPU. It sounds weird with respect to concurrency, but it's good since a minimal amount of resources is used. NodeJs is designed to perform non-blocking operations, which means that no time-consuming or CPU-intensive job could block/hang the main application. For this reason, when there is a time-consuming operation like calling a DB, writing files, fetching data from another server, etc., NodeJs opens a new thread for that specific task. By doing so the main thread remains available to listen to new events, while CPU-intensive and time-consuming tasks are performed in the background. When that task is finished, that thread is destroyed. From this, I can infer that

NodeJs is single-threaded, but to prevent that thread from being blocked, NodeJs opens new threads to perform time-consuming/CPU-intensive jobs. By doing so, new threads are opened whenever there is a need and destroyed when the need is fulfilled.

This whole process optimizes CPU resource management.

Note that NodeJs is not considered an ideal choice for building CPU-intensive applications. I think the reason is that it could open a lot of new threads and the CPU may run out of them.

Faeroese answered 3/8, 2020 at 6:24 Comment(3)
Which other language doesn't use only a single thread by default?Corrinnecorrival
"NodeJs opens a new thread for time-consuming operation like calling DB, writing files, fetching data from another server" - actually, no, all of these have natively asynchronous APIs in the operating system, node doesn't spawn threads for them.Corrinnecorrival
@Corrinnecorrival - nodejs does use a thread pool for some disk I/O and some crypto operations.Orella
