Hidden threads in Javascript/Node that never execute user code: is it possible, and if so could it lead to an arcane possibility for a race condition?

See bottom of question for an update, based on comments/answers: This question is really about the possibility of hidden threads that do not execute callbacks.


I have a question about a potential arcane scenario involving the Node Request module in which:

  • A complete HTTP request is constructed and executed over the network (taking however many ms or even seconds)

  • ... before a single function is executed at runtime on the local machine (typically in the nanoseconds?) - see below for details

I am posting this mostly as a sanity check just to make sure I am not misunderstanding something about Node / JS / Request module code.

The following is taken from the examples in the Request module documentation (see the SECOND example in that section):

// Copied-and-pasted from the second example in the 
// Node Request library documentation, here:
// https://www.npmjs.com/package/request#examples

// ... My ARCANE SCENARIO is injected in the middle

var request = require('request')
  request(
    { method: 'GET'
    , uri: 'http://www.google.com'
    , gzip: true
    }
  , function (error, response, body) {
      // body is the decompressed response body 
      console.log('server encoded the data as: ' + (response.headers['content-encoding'] || 'identity'))
      console.log('the decoded data is: ' + body)
    }
  )

    // **************************************************** //
    // Is the following scenario possible?
    //
    // <-- HANG HANG HANG HANG HANG HANG HANG HANG HANG -->
    //
    // Let us pretend that the current thread HANGS here,
    // but that the request had time to be sent,
    // and the response is pending being received by the thread
    //
    // <-- HANG HANG HANG HANG HANG HANG HANG HANG HANG -->
    // **************************************************** //

.on('data', function(data) {
    // decompressed data as it is received 
    console.log('decoded chunk: ' + data)
  })
  .on('response', function(response) {
    // unmodified http.IncomingMessage object 
    response.on('data', function(data) {
      // compressed data as it is received 
      console.log('received ' + data.length + ' bytes of compressed data')
    })
  })

I have indicated my arcane scenario in the code snippet.

Suppose the Node process hangs at the point indicated, but that Node internally (in a hidden thread, invisible to Javascript, and therefore not calling any callbacks) WAS able to construct the request, and send it over the network; suppose the hang continues until a response (in two chunks, say) is received and waiting to be processed by Node. (This is the scenario that is certainly arcane, and that I'm not sure is even theoretically possible.)

Then suppose that the hang ends, and the Node thread above wakes up. Further, suppose that (somehow) Node was able to process the response all the way to the point of executing the callback function in the code above (yet without moving past the 'hanged' point in the code in the original code path -- again, if this is even theoretically possible).

Is the above arcane scenario theoretically possible? If so, wouldn't the data packets be received over the network and combined, ready to be passed to the callback function, before the 'data' event was scheduled on the object? In this case, if it's possible, I would imagine that the 'data' event would be missed.

Again, I understand that this is an arcane scenario - perhaps it's not even theoretically possible, given the internal mechanisms and coding involved.

That is my question - is the above arcane scenario, with its extremely unlikely race condition, nonetheless theoretically possible?

I ask just to make sure I'm not missing some key point. Thanks.


UPDATE: From comments & answers: I now have clarified my question. The 'arcane scenario' would require that there is a HIDDEN thread (which therefore CANNOT execute any USER code, including CALLBACKS) that constructs the request, sends it over the network, and receives the response - WITHOUT having any callbacks to trigger, including the 'data' callback - and stops short just at the point that the 'response' callback is ready to be called, waiting for the (single) visible JS thread to wake up.

Baber answered 18/6, 2015 at 4:1 Comment(9)
You might want to read developer.mozilla.org/en-US/docs/Web/JavaScript/EventLoop Adolphus
@FelixKling Thank you. I just want to be certain that there is no 'hidden' thread that processes the request and sends it over the network just up to, but not including, the point of the callback (which I understand is in the single JS thread).Baber
I don't really understand how you think that a data event could be missed. You mean the packets are received before the handler is installed?Dowson
@Dowson - Exactly. I trust this can't actually occur; I just want to be sure I understand why. Because though all USER code must occur in a single thread (from the other comments & answer), and all CALLBACKS must execute in that same thread, where does it say that there cannot be a hidden (invisible, internal) thread that does processing up to (but not including) the point at which a callback needs to be called?Baber
What does "thread hangs here" mean? node.js doesn't just "hang". There could be an infinite loop, but execution doesn't just hang.Reinsure
@Reinsure ... it wouldn't be Node that hangs. It would be some arcane condition in the operating system that prevents the thread from continuing.Baber
Your hang still doesn't make any sense to me. If your hang is in some other system thread somewhere, the nodejs code continues to run, because the nodejs code does not wait for a response; it just registers an event handler to be notified. If the hang is in a thread that is processing the outstanding request, then the response just doesn't come back until that thread is free again, so the nodejs event handlers just don't get any events about the response. But it is my understanding that nodejs does not use internal threading for networking anyway, though it does for file I/O.Reinsure
@Reinsure A hanging thread is an arcane possibility. The OS is not required to execute any given thread, even though it always will, if it can. By the way - I've updated my question now that your comment, and others, have clarified that I'm really asking about the possibility of hidden threads inside Node/JS, despite the fact that there is only ONE visible thread that executes all Javascript code.Baber
@Dowson - I updated the title, and added an addendum, to clarify that my question is really about the possibility of hidden internal threads.Baber

No, this cannot happen.

Yes, there are indeed "hidden" background threads that do the work for asynchronous methods, but those don't call callbacks. All execution of JavaScript happens on the same thread, synchronously and sequentially. The data event callback will always be executed asynchronously, that is, after the current script/function has run to completion.

While packets could indeed arrive from the network before the callback is created and attached to the event emitter, the callback that listens for packets on the lowest level is always created before the request is sent - it is an argument to the native "makeRequest" method, and is available to be called right from the beginning. So when a packet arrives before the current script (still occupied with constructing event emitters and attaching handlers) has finished, the event is queued, and the callback will only be executed once the event loop is ready, on the next turn. By then, the data event callback has certainly been created and attached.
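The ordering described above can be sketched as follows. This is only an illustration under stated assumptions: `EventEmitter` and a zero-delay timer stand in for the network layer (it is not the Request module's internal code), the busy-wait loop plays the role of the "hang" from the question, and the names `fakeRequest` and `seen` are made up for the example.

```javascript
const { EventEmitter } = require('events');

const seen = [];
const fakeRequest = new EventEmitter(); // hypothetical stand-in for the request object

// Stand-in for the low-level completion: it becomes due almost immediately,
// but its callback can only run on a later turn of the event loop.
setTimeout(() => fakeRequest.emit('data', 'chunk-1'), 0);

// The "hang": block the only JavaScript thread for about 50 ms.
const hangStart = Date.now();
while (Date.now() - hangStart < 50) { /* busy-wait */ }

// The timer is long overdue by now, yet nothing has been emitted,
// so attaching the handler this late still catches the event.
seen.push('attaching handler');
fakeRequest.on('data', (chunk) => seen.push('handler got ' + chunk));
seen.push('script finished');
```

When the synchronous part ends, `seen` holds only the two entries pushed by the script itself; the handler's entry is appended on a later turn of the event loop, after the handler has been attached.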

Dowson answered 18/6, 2015 at 4:45 Comment(4)
Great! This answers my question - in particular your comment: 'the callback that listens for packets on the lowest level is always created before the request is sent - it is an argument to the native "makeRequest" method'. Because if, indeed, there is an internal Javascript callback guaranteed to be in place that needs to be triggered on the 'data' event, even if it hasn't been set on the Request event in the sample code above, it would be guaranteed to stall any possible 'hidden' threads. In case you have a moment and it's easy - do you have a link to a reference to this function?Baber
A bit misleading. When it comes to network I/O there are no additional threads.Nursery
You can dig yourself through the source of the http module :-) You'll find lines such as process.binding('http_parser') (here) that is exported by the native code here.Dowson
@slebetman: True, it's asynchronous and evented even on the lower level. I just tried to simplify and threw it in the same pot as file IO etc.Dowson

nodejs JavaScript execution is single threaded and event driven. That means that everything runs through an event queue. A thread of JavaScript execution runs until it's done and then the system checks the event queue to see if there's anything else to do (timers waiting to fire, async callbacks waiting to be called, etc...).

nodejs does use some internal threads in some of its implementation (such as file I/O), but it is my understanding that it does not use threads in networking. But, it's immaterial whether there are some internal threads or not because all communication between sub-systems like networking and the main nodejs JS thread is via the event queue.

A nodejs thread of execution is never interrupted to do something else. It finishes and runs to completion, then the JS engine checks the event queue to see if there's something else waiting to be executed.
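A small sketch of the two points above, assuming the documented behavior that Node's asynchronous `fs` calls are serviced by libuv's thread pool. The variable names are illustrative only. The file-system work may complete on a background thread while the main thread is busy, but its callback still waits for the current JavaScript to run to completion.

```javascript
const fs = require('fs');

const order = [];

// The stat itself is carried out off the main thread (libuv thread pool),
// but the callback is delivered through the event loop on the main thread.
fs.stat('.', (err) => {
  order.push(err ? 'stat callback (error)' : 'stat callback (ok)');
});

// Keep the main thread busy for about 50 ms, far longer than the stat
// is likely to take. The callback cannot interrupt this loop.
const blockedSince = Date.now();
while (Date.now() - blockedSince < 50) { /* busy-wait */ }

order.push('synchronous code finished');
```

At the end of the synchronous code, `order` contains only `'synchronous code finished'`; the stat callback's entry is added afterwards, on a later turn of the event loop.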

When there's incoming data available on a socket, an event is placed in the event queue. The nodejs JavaScript that is currently executing finishes what it's doing, then the JS engine sees there's an event in the event queue and fires that event. If there's a function callback or event handler associated with that event (there usually is), then that gets called to handle the event.

If there's a mishap in the internals of some infrastructure such as networking, then all that happens to the nodejs code is that some networking event just doesn't occur. The nodejs code has its event handlers in place and just doesn't receive the event they are waiting for until the infrastructure gets unwedged and creates the event. This doesn't create any sort of hang in the nodejs code.

So, in your update:

From comments & answers: I now have clarified my question. The 'arcane scenario' would require that there is a HIDDEN thread (which therefore CANNOT execute any USER code, including CALLBACKS) that constructs the request, sends it over the network, and receives the response - WITHOUT having any callbacks to trigger, including the 'data' callback - and stops short just at the point that the 'response' callback is ready to be called, waiting for the (single) visible JS thread to wake up.

The nodejs thread runs to completion, then the JS engine waits for a new event to occur (e.g. get put in the event queue). When that event occurs, the JS engine runs the code that corresponds to that event (event handlers, callbacks, etc...). You make it sound like the single visible JS thread is asleep waiting to wake up and it can get stuck there because some other sub-system gets hung. That is not the case. The only thing that can happen is that some event that the single JS thread has an event handler for just never occurs. This would be no different than a situation where you send a message to a server and you have an event handler to see a response, but the server never sends the response. Your nodejs code continues processing other events (timers, other networking, other I/O), but this particular event just never occurs because the other server just never sent the data that would trigger that event. Nothing hangs.

This is "evented I/O" which is how nodejs describes itself.

Reinsure answered 18/6, 2015 at 4:42 Comment(2)
Thanks. But consider "If there's a function callback or event handler associated with that event (there usually is) ..." -- What if there isn't, though? Is it possible there could be an optimization in place such that in THIS case a hidden, background thread will process the event - not the main JS thread - because no Javascript code needs to execute so there is no violation of the "single visible thread" rule (i.e., no callbacks are pending and no callbacks execute)?Baber
@DanNissenbaum - what do you mean "process the event"? If there's no nodejs code to handle the event, then there's nothing to do when the event occurs - it will just get thrown away. This can occur. You could register an event listener for some event that will occur in the future and then you could remove the event listener before the event occurs. The event will happen and there will just be no JS code registered to do anything when it occurs. It matters not whether the event actually goes into the event queue or not in this case - same outcome either way.Reinsure
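The "registered, then removed" case from the last comment can be sketched with Node's `EventEmitter`, whose `emit` returns `true` if the event had listeners and `false` otherwise. This only models the JavaScript-level emitter; it says nothing about whether the underlying I/O event passes through an internal queue. The names `source`, `onData` and `handled` are illustrative.

```javascript
const { EventEmitter } = require('events');

const source = new EventEmitter();
const handled = [];
const onData = (chunk) => handled.push(chunk);

// With a listener registered, the event is delivered to it.
source.on('data', onData);
const deliveredFirst = source.emit('data', 'chunk-1');  // true

// After the listener is removed, the same event is simply dropped:
// no JavaScript runs, and emit reports that nobody was listening.
source.removeListener('data', onData);
const deliveredSecond = source.emit('data', 'chunk-2'); // false
```

Only `'chunk-1'` ends up in `handled`; the second emit has no observable effect on the program.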

There’s only one thread involved in Node.js; an event loop is used to process tasks that run asynchronously, and nothing queued will ever interrupt anything already running. So no, there is no race condition there.

Azobenzene answered 18/6, 2015 at 4:5 Comment(10)
Internally to the Node implementation, I think there must be additional threads - at least so that some thread can block on the operating system's event (or condition, etc.) object. These threads would not be visible to the Javascript/Node code. However - there is just one thread that executes the Javascript code, correct? If so - is it true that the construction of the HTTP request occurs inside this Javascript thread, or could it occur inside a (hidden) thread (in the Node internals)?Baber
There are other threads internal to node.js to make certain things work. In fact, async file I/O uses threads. But, it is my understanding that networking does not use threads, it uses events.Reinsure
@DanNissenbaum: When it comes to network I/O there are no additional threads, visible or otherwise. It's all a single thread that executes the JavaScript AND handles network requests - they take turns. The only exception is disk I/O, which the node developers have chosen to implement with threads to simplify cross-platform support of async disk I/ONursery
@DanNissenbaum: See this for an explanation of how node works: #29884025Nursery
@Nursery Not sure that's true. On Windows, for example, the low-level OS SDK socket function is recv (or similar) - msdn.microsoft.com/en-us/library/windows/desktop/…. This function blocks. Networking applications must be constructed, at the C or OS SDK level, to create a thread that will block on socket read functions - and the same is true for Linux as well.Baber
@Nursery Thanks for the excellent link. A single thread could wait for multiple read operations via select, but that thread would block, so there must, I imagine, be another thread processing the Javascript callbacks.Baber
@DanNissenbaum: It's true. On Windows you can use select() instead of recv(), which will block on ALL network sockets instead of a single network socket. This is how async I/O works basically.Nursery
@DanNissenbaum: See the link. While waiting, there is absolutely zero reason for the interpreter to execute javascript code. So it doesn't execute javascript code at all. Only when there are events will the select() function return and then the interpreter can check if there's any need to execute any javascript code. It goes execute->wait->execute->wait->execute... forever until there are no more event handlers and node.js exits (or in browser environment, the page becomes a static page)Nursery
@Nursery In Javascript, then, am I right that there can be - at the very lowest level - ONLY the following types of callbacks: File, Network, Timer? And if so, these can all be managed via a single select?Baber
@DanNissenbaum: Not only those, there can also be UI events (browser) and thread/job completion events (web workers). But all of them can be implemented using file/socket I/O as the communications medium so the core of the system will still work.Nursery
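The execute -> wait -> execute cycle described in the comments above can be modelled with a toy loop. This is a conceptual sketch only, not Node's or libuv's actual implementation: `runToyLoop`, its `api` object, and the array of external events (standing in for what a `select()` call would report) are all hypothetical.

```javascript
// Toy model of "execute -> wait -> execute". Only the control flow is modelled.
function runToyLoop(mainScript, externalEvents) {
  const handlers = new Map(); // event name -> callback registered by JS code
  const trace = [];
  const api = {
    on: (name, fn) => handlers.set(name, fn),
    off: (name) => handlers.delete(name),
    log: (msg) => trace.push(msg),
  };

  // Execute: the main script runs to completion before any event is looked at.
  mainScript(api);

  // Wait, then execute: take one event at a time (stand-in for select()
  // returning) and run its handler to completion. Stop when no handlers
  // remain, which mirrors the process exiting.
  const pending = externalEvents.slice();
  while (handlers.size > 0 && pending.length > 0) {
    const event = pending.shift();
    const handler = handlers.get(event.name);
    if (handler) handler(event.payload);
    else trace.push('dropped ' + event.name);
  }
  return trace;
}

const trace = runToyLoop((api) => {
  api.log('main start');
  api.on('data', (payload) => api.log('data: ' + payload));
  api.on('end', () => { api.log('end'); api.off('data'); api.off('end'); });
  api.log('main end');
}, [
  { name: 'data', payload: 'chunk-1' },
  { name: 'ping', payload: null }, // no handler registered for this one
  { name: 'end', payload: null },
]);
```

In this model the events exist before the handlers are registered, yet none is missed: they are only examined after the main script has finished, and an event without a handler is dropped without any JavaScript running.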
