When a workerThread is created in nodejs, does it utilize the same core in which nodejs process is running?

Let's assume I have a nodejs server program with one API that does some manipulation of a video file sent via an HTTP request.

const saveVideoFile=(req,res)=>{
  processAndSaveVideoFile(); // can run for minimum of 10 minutes
  res.send({status: "video is being processed"})
}

I decided to make use of a workerThread to do this processing, as my machine has 3 cores (core1, core2, core3) and hyperthreading is not enabled.

Assume that my nodejs program is running on core1. When I fire up a single workerThread, will the workerThread run on core2/core3 or on core1?

I read that a workerThread is not the same as a childProcess. A childProcess forks a new process, which lets the child be scheduled on any of the available free cores (core2 or core3).

I read that a workerThread shares memory with the main thread. Let's assume that I create 2 workerThreads (wt1, wt2). Will my nodejs program, wt1 and wt2 all run on the same core, i.e. core1?

Also, in nodejs we have the event loop (main thread) and other threads doing the background operations, i.e. I/O. Is it correct to assume that all of these use the resources available in a single core (core1)? If this is the case, is creating and using additional workerThreads overkill on a nodejs server?

Below is an excerpt from this blog:

"We can run things in parallel in Node.js. However, we need not create threads. The operating system and the virtual machine collectively run the I/O in parallel, and the JS code then runs in a single thread when it is time to send the data back to the JavaScript code."

I keep reading this same information about nodejs in many articles and video presentations. But what I do not understand is this:

"The operating system and the virtual machine collectively run the I/O in parallel"

How can the operating system run the I/O requests from a nodejs program in parallel without using any childProcess or threads spawned from nodejs? If those I/O requests from the nodejs program are running in parallel, does it mean that all 3 cores (core1, core2, core3) will be utilized?

There is a lot of content on nodejs, but it doesn't clear up the doubts behind my questions above. If you know how these things actually work, please share the details.

Gader answered 16/5, 2020 at 3:27 Comment(0)

A worker thread in node.js is an actual OS thread running in a different instance of V8. As such, it's totally up to the operating system to decide how to allocate it among available CPU cores. If there are cores with available time, then it will not generally be run on the same core as the main nodejs thread when that thread is busy because the OS will allocate busy threads across the various cores.

But, again this is entirely up to the OS and is not something that nodejs controls and the exact strategy for which cores are used will vary by OS. But, in all modern operating systems, the design goal is that available cores are used for threads that are currently executing. Now, if there are more threads active at once than there are cores, the threads will be time-sliced and all the cores will be active.

"Also, in nodejs we have the event loop (main thread) and other threads doing the background operations, i.e. I/O. Is it correct to assume that all of these use the resources available in a single core (core1)? If this is the case, is creating and using additional workerThreads overkill on a nodejs server?"

No, it is not correct to assume those threads all use the same core.

A workerThread in nodejs has its own event loop. For the most part, it does not share memory. In fact, if you want to share memory, you have to very specifically allocate a SharedArrayBuffer and pass that to the workerThread.

Is it overkill? Well, it depends upon what you're doing. There are very useful things to do with workerThreads and there are things that they would not be necessary for.

"The operating system and the virtual machine collectively run the I/O in parallel"

I/O in node.js is either asynchronous at the OS level (such as networking) or run in separate threads (such as disk I/O). That means it runs separately from the main thread in node.js that runs your Javascript and can run in parallel with it, synchronizing only at the completion of an event. "Parallel" in this case means that both make progress at the same time. If there are multiple cores, then they can truly be running at exactly the same time. If there is only one core, then the OS will timeslice between the various threads and they will both make progress (in an interleaved fashion that will seem to be parallel, but really they are taking turns).

"How can the operating system run the I/O requests from a nodejs program in parallel without using any childProcess or threads spawned from nodejs? If those I/O requests from the nodejs program are running in parallel, does it mean that all 3 cores (core1, core2, core3) will be utilized?"

The OS has its own threads for managing things like a network interface or a disk interface. The job of those threads is to interface with the hardware and bring data to an appropriate application or take data from the application and send it to the hardware. These are OS-level threads that exist independently of node.js. Yes, other cores can be used by those OS-level threads. It is important to realize that many operations such as networking are inherently non-blocking. Thus, if you're waiting for some data to arrive on a network interface, you don't need to have a thread doing something the whole time.


I want to add that it appears you've combined questions about several different things. Mentioned in your questions are:

  1. Worker Threads
  2. Internal node.js threads
  3. Operating system threads

These are all different things.

A worker thread is a new thread you can start to run specific pieces of Javascript in another thread so you can have more than one Javascript thread running at the same time. In node.js, this is done by creating a whole new instance of V8, setting up a whole new global environment and loaded modules environment and using almost entirely separate memory.

Internal node.js threads are used by node.js as part of implementing its event loop and its standard library. Specifically, disk I/O and some crypto operations are run in internal native threads and they communicate with your Javascript via events/callbacks through the event loop.

Operating system threads are threads that the OS uses to implement its own system APIs. Since the OS is responsible for lots of things, these threads can have many different uses. Depending upon native implementations, they may be used to facilitate things like disk I/O or networking I/O. These threads are the responsibility of the OS to create and use and are not directly controlled by node.js.


Some additional questions asked in comments:

"What is the difference between the workerThread and childProcess concepts in nodejs? Is childProcess = workerThread without sharedMemory?"

A child process can be any type of program - it does not have to be a node.js program. A worker thread is node.js code.

A worker thread can share memory if a SharedArrayBuffer is specifically allocated and shared with the worker thread and if access to it is carefully managed for concurrency issues.

It is more efficient to copy memory back and forth between a worker thread and the main thread than between a child process and the main program.

If the main program exits, worker threads will exit. If the main program exits, a child process can be configured to exit or to continue.

If a worker thread calls process.exit(), only that worker thread stops; the main thread and the rest of the process keep running. If a child program exits, it cannot cause the main program to exit without the main program's cooperation.

"How is nodejs able to magically interact with the OS-level threads without nodejs itself creating any threads? I need additional details on this; your explanation is the common one present in most places, including the blog I shared."

nodejs just calls an OS API. It's the OS API that manages communicating with its own threads (if threads are needed for that specific OS API). How it does that communication internally is implementation dependent and will vary by OS. It will even vary by OS which OS APIs use threads and which don't.

Jacquline answered 16/5, 2020 at 4:15 Comment(15)
"A workerThread in nodejs has its own event loop. For the most part, it does not share memory." If this is the case, then what is the difference between the workerThread and childProcess concepts in nodejs? Is childProcess = workerThread without sharedMemory? @Jacquline - Gader
"The OS has its own threads for managing things like a network interface or a disk interface. The job of those threads is to interface with the hardware and bring data to an appropriate application or take data from the application and send it to the hardware. These are OS-level threads that exists independent of node.js..." How is nodejs able to magically interact with the OS-level threads without nodejs itself creating any threads? I need additional details on this; your explanation is the common one present in most places, including the blog I shared. - Gader
@Gader - Some additional information relative to your comments has been added to the end of my answer. - Jacquline
Thank you for your response. I can understand what you have mentioned, but I am not sure if I can believe it. Could you take a look at this blog? The detail present in the image seems to contradict what you have mentioned. Please share your thoughts on that. - Gader
@Gader - What exactly do you think is a contradiction? I don't see any issues. The diagram doesn't really contain much detail about how things actually work. - Jacquline
You mentioned "nodejs just calls an OS API". I responded to that by asking "does nodejs call the OS API by nodejs itself creating internal threads?". You said "no, nodejs does not create internal threads, but nodejs uses the OS API directly and the OS API does the thread creation work". But this blog says nodejs uses internal threads for what appears to be async I/O in nodejs. Do you understand what I meant by "contradiction"? - Gader
@Gader - libuv (inside of node.js) has a thread pool that it uses for blocking operations like disk access and certain crypto operations. It does not use or need to use the thread pool for networking because there are native asynchronous OS APIs for that. So, sometimes a thread pool is used and sometimes not, depending upon whether the OS offers a native asynchronous API or not. If the OS only offers a blocking, synchronous API for a certain function, then libuv (inside of node.js) uses a thread pool to simulate a non-blocking, asynchronous interface so it doesn't block the JS interpreter. - Jacquline
@Gader - BTW, my answer already mentions the difference between how node.js handles disk I/O and networking because of the difference in OS asynchronous support. FYI, the diagram you're looking at represents the node.js application only (not any threads used in the OS) in case that's confusing you. - Jacquline
Thank you for your patience. This will be my final question. First question: assume there are 2 cores (core1, core2). nodejs, which runs on core1, has internal threads (libuv, cpp library). Would the internal threads run on the same core in which nodejs is running, or could the internal threads make use of core2 (if core2 is free)? Second question: is it valid to understand that any thread created within a process can be scheduled to run on any available core? - Gader
@Gader - Any thread can run on any core. It's a function of the OS to allocate a thread that wants to run to some core that isn't busy. When all the cores are busy, it will then start timeslicing threads so they all share all the cores. This allocation among all the cores isn't always perfectly even because threads sometimes run for very short periods of time before blocking on something so the OS then scrambles to assign the next thread that wants to run to that core and so on... But, the general idea is to spread the threads among all the cores as evenly as possible. - Jacquline
@Gader - As an example, I had a very CPU intensive node.js app (that did a lot of heavy crypto). I have 8 cores in my CPU so I created 8 WorkerThreads in my one node.js app and that fully loaded all 8 cores in my CPU as shown in the Windows Task Manager. - Jacquline
This clears up a lot for me on the nodejs side, thank you. But on the threading side, in general, I still have some questions about how certain servers create a huge number of threads, directly proportional to the huge number of HTTP requests, on let's say an OS with 2 cores. Does this scenario also follow the same process of thread scheduling? You can respond to it if you'd like to, but I understand that would be a different question. If you have found some interesting articles on threads/OS scheduling/cores, please share. - Gader
@Gader - As soon as there are more threads wanting to run (i.e. not currently blocked or waiting for something) than there are cores, then the OS has to "schedule" them onto a core. That is the OS thread scheduler. It is generally not efficient to have tons more active threads than you have cores because the process of timeslicing between them (letting each thread use a core for a few ms before stopping it and giving another thread a few ms) is fair, but inefficient. This is one of the reasons that nodejs does not use a lot of threads and one of the reasons its architecture can be efficient. - Jacquline
@Gader - Further discussion of OS thread scheduling would probably belong in a new question. - Jacquline
One of the best explanations I've ever read, considering how misleading most of the stuff written about nodejs worker threads out there is. - Grippe
