How to send huge amounts of data from child process to parent process in a non-blocking way in Node.js?

I'm trying to send a huge JSON string from a child process to the parent process. My initial approach was the following:

child: process.stdout.write(myHugeJsonString);

parent: child.stdout.on('data', function(data) { ...

But now I read that process.stdout is blocking:

process.stderr and process.stdout are unlike other streams in Node in that writes to them are usually blocking.

  • They are blocking in the case that they refer to regular files or TTY file descriptors.
  • In the case they refer to pipes:
    • They are blocking in Linux/Unix.
    • They are non-blocking like other streams in Windows.

The documentation for child_process.spawn says I can create a pipe between the child process and the parent process using the pipe option. But isn't piping my stdout blocking in Linux/Unix (according to the docs cited above)?

Ok, what about the Stream object option? Hmmmm, it seems I can share a readable or writable stream that refers to a socket with the child process. Would this be non-blocking? How would I implement that?

So the question stands: how do I send huge amounts of data from a child process to the parent process in a non-blocking way in Node.js? A cross-platform solution would be really neat; examples with explanations are very much appreciated.

Bearable asked 27/6, 2014 at 11:03. Comments (3):
Did you try using the stdio: 'ipc' option and sending/receiving messages that way? – Haywood
@Haywood Not yet. Would this result in a non-blocking data transfer? child.send seems blocking to me as well. The docs say: "Please note that the send() method on both the parent and child are synchronous - sending large chunks of data is not advised (pipes can be used instead, see child_process.spawn)." I am totally confused. – Bearable
Did you find a solution to this? I'm trying to stream from a .spawn() using additional stdio pipes (fd >= 4) but cannot find a way for my child process to open the fd as a stream. – Matronna

One neat trick I used on *nix for this is FIFO pipes (http://linux.about.com/library/cmd/blcmdl4_fifo.htm). They allow the child to write to a file-like thing and the parent to read from the same. The file is not really on the filesystem, so you don't get any I/O problems; all access is handled by the kernel itself. But if you want it cross-platform, that won't work: there is no such thing on Windows (as far as I know).

Just note that the pipe has a fixed size, and if what the child writes into it is not read out by the parent, the child will block once the pipe is full. This does not block the Node processes themselves; they see the pipe as a normal file stream.
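
For illustration, here is a minimal sketch of that approach. It assumes a POSIX system where the mkfifo(1) command is available; the FIFO path and the file names (parent.js, child.js) are hypothetical.

    // parent.js: create a FIFO, spawn the child, and stream the JSON out of it.
    const { execSync, spawn } = require('child_process');
    const fs = require('fs');

    const fifoPath = '/tmp/huge-json.fifo';        // hypothetical path
    execSync(`mkfifo ${fifoPath}`);                // create the named pipe

    const child = spawn('node', ['child.js', fifoPath], { stdio: 'inherit' });

    // Node opens and reads the FIFO on the libuv thread pool, so the
    // parent's event loop is not blocked while it waits for data.
    let json = '';
    const reader = fs.createReadStream(fifoPath, 'utf8');
    reader.on('data', chunk => { json += chunk; });
    reader.on('end', () => {
      const hugeObject = JSON.parse(json);         // ... use the parsed object here
      fs.unlinkSync(fifoPath);                     // remove the FIFO node
    });

    // child.js: stream the JSON into the FIFO; if the parent reads slowly,
    // the write blocks in the kernel once the pipe buffer is full.
    const fs = require('fs');
    const out = fs.createWriteStream(process.argv[2]);
    out.end(JSON.stringify({ some: 'huge', object: 'here' }));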

Verduzco answered 27/6, 2014 at 11:11. Comments (1):
Thanks. Though I'm searching for a cross-platform solution (I updated my question with this requirement). Hope your answer helps someone else. – Bearable

I had a similar problem, and I think I have a good solution: set up a pipe when spawning the child process and use the resulting file descriptor to duplex data to the client's end.

How to transfer/stream big data from/to child processes in node.js without using the blocking stdio?

Apparently you can use fs to stream to/from file descriptors:

How to stream to/from a file descriptor in node?
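
For what it's worth, here is a rough sketch of that setup, assuming a single extra 'pipe' entry at fd 3 (the file names are hypothetical):

    // parent.js: ask spawn for an extra pipe at fd 3 and read the JSON from it,
    // leaving stdout/stderr free for normal logging.
    const { spawn } = require('child_process');

    const child = spawn('node', ['child.js'], {
      stdio: ['inherit', 'inherit', 'inherit', 'pipe']  // fds 0-2, plus our pipe
    });

    let json = '';
    child.stdio[3].setEncoding('utf8');
    child.stdio[3].on('data', chunk => { json += chunk; });
    child.stdio[3].on('end', () => {
      const hugeObject = JSON.parse(json);              // ... use the parsed object here
    });

    // child.js: wrap the inherited fd 3 in a writable stream and send the JSON.
    const fs = require('fs');
    const out = fs.createWriteStream(null, { fd: 3 });
    out.end(JSON.stringify({ some: 'huge', object: 'here' }));

The child side is exactly the fs-to-file-descriptor streaming from the second link: fs.createWriteStream(null, { fd: 3 }) is one way the child can open the extra fd as a stream.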

Matronna answered 5/7, 2014 at 6:24. Comments (4):
Thanks Bartvds, but isn't setting up a pipe blocking on Linux (according to the docs)? Can you somehow prove that your approach leads to non-blocking data transfer between child and parent processes? – Bearable
The docs say nothing about blocking on pipes. They do say this: "Please note that the send() method on both the parent and child are synchronous - sending large chunks of data is not advised (pipes can be used instead, see child_process.spawn)." – Matronna
Here the docs say something about the process.stdout stream being blocking when it refers to pipes. Exactly this part of the docs (which seems to contradict the part you cited in your comment) is what confuses me. If you could clarify this, it would be really great! – Bearable
I read that, but IIRC these pipes are not the special case described for stdout. It is a bit confusing, and I have no way to prove it, so I posted another question specifically about this. – Matronna

The documentation for child_process.spawn says I can create a pipe between the child process and the parent process using the pipe option. But isn't piping my stdout blocking in Linux/Unix (according to the docs cited above)?

  1. No. The docs above say stdout/stderr, and in no way do they say "all pipes".

  2. It won't matter that stdout/stderr are blocking. For a pipe to block, it needs to fill up, which takes a lot of data. For it to fill up, the reader at the other end has to be reading more slowly than you are writing. But you are the other end: you wrote the parent process. So, as long as your parent process is functioning, it should be reading from the pipes.

Generally, blocking of the child is a good thing. If it is producing data faster than the parent can handle, there are ultimately only two possibilities:

  1. It blocks, and so stops producing data until the parent catches up.

  2. It produces more data than the parent can consume, buffers that data in local memory until it hits the V8 memory limit, and the process aborts.

You can use stdout to send your JSON if you want (1).

You can use a new 'pipe' to send your JSON if you want (2).
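
For completeness, here is a minimal sketch of option (1), with hypothetical file names (parent.js, child.js):

    // parent.js: collect the child's stdout. If the parent reads slowly, the
    // pipe fills up and the child blocks: exactly possibility (1) above.
    const { spawn } = require('child_process');

    const child = spawn('node', ['child.js']);
    child.stdout.setEncoding('utf8');

    let json = '';
    child.stdout.on('data', chunk => { json += chunk; });
    child.stdout.on('end', () => {
      const hugeObject = JSON.parse(json);  // ... use the parsed object here
    });

    // child.js: just write the JSON to stdout.
    process.stdout.write(JSON.stringify({ some: 'huge', object: 'here' }));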

Sampson answered 21/10, 2016 at 13:49.
