NodeJS - How to stream request body without buffering

In the code below I can't figure out why req.pipe(res) doesn't work, yet doesn't throw an error either. A hunch tells me it's due to Node.js's asynchronous behavior, but this is a very simple case without a callback.

What am I missing?

http.createServer(function (req, res) {

  res.writeHead(200, { 'Content-Type': 'text/plain' });

  res.write('Echo service: \nUrl:  ' + req.url);
  res.write('\nHeaders:\n' + JSON.stringify(req.headers, null, 2));

  res.write('\nBody:\n'); 

  req.pipe(res); // does not work

  res.end();

}).listen(8000);

Here's the curl:

curl -v -X POST --data "test.payload" --header "Cookie:  token=12345678" --header "Content-Type:text/plain" localhost:9002

Here's the debug output (see that body was uploaded):

  About to connect() to localhost port 9002 (#0)
  Trying 127.0.0.1...
    connected
    Connected to localhost (127.0.0.1) port 9002 (#0)
  POST / HTTP/1.1
  User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8x zlib/1.2.5
  Host: localhost:9002
  Accept: */*
  Cookie:  token=12345678
  Content-Type:text/plain
  Content-Length: 243360
  Expect: 100-continue

  HTTP/1.1 100 Continue
  HTTP/1.1 200 OK
  Content-Type: text/plain
  Date: Sun, 04 Aug 2013 17:12:39 GMT
  Connection: keep-alive
  Transfer-Encoding: chunked

And the service responds without echoing the request body:

Echo service: 
Url:  /
Headers:
{
  "user-agent": "curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8x zlib/1.2.5",
  "host": "localhost:9002",
  "accept": "*/*",
  "cookie": "token=12345678",
  "content-type": "text/plain",
  "content-length": "243360",
  "expect": "100-continue"
}

... and final curl debug is

Body:
 Connection #0 to host localhost left intact
 Closing connection #0

Additionally, when I stress test with large request body, I get an EPIPE error. How can I avoid this?

-- EDIT: Through trial and error I did get this to work, and it still points to a timing issue. It is still strange, though: adding a timeout causes the payload to be returned, but the timeout duration doesn't matter. Whether I set the timeout to 5 seconds or 500 seconds, the payload is properly piped back into the response and the connection is terminated.

Here's the edit:

http.createServer(function (req, res) {

    try {
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.write('Echo service: ' + req.url + '\n' + JSON.stringify(req.headers, null, 2));
      res.write('\nBody:\n');
      req.pipe(res);
    } catch(ex) {
      console.log(ex);
      // how to change response code to error here?  since headers have already been written?
    } finally {
      setTimeout((function() {
        res.end();
      }), 500000);
    }

}).listen(TARGET_SERVER.port);

Nan answered 31/7, 2013 at 1:58 Comment(1)
Note that the request is made to port 9002. This is a reverse proxy (a simple node-http-proxy to 8000, the target). Hitting the target directly yields the same results. - Nan
C
8

Pipe req to res. req is a readable stream and res is a writable stream, so this should work:

   var http = require('http');

   http.createServer(function (req, res) {

       res.writeHead(200, { 'Content-Type': 'text/plain' });
       res.write('Echo service: ' + req.url + '\n' + JSON.stringify(req.headers, null, 2));

       // pipe request body directly into the response body;
       // pipe() ends res when req ends, so there is no res.end() here
       req.pipe(res);

   }).listen(9002);
Complex answered 31/7, 2013 at 3:48 Comment(5)
This will work... sometimes... there is something asynchronous about the pipe call. The faster the machine, the more often this doesn't work at all. I am able to get it to work by waiting before calling res.end(). The bounty is for whoever can explain why I was able to fix this issue by adding a sleep (see my edit at the bottom of the question). - Nan
The pipe call takes care of calling res.end() when the req stream emits end. There is no need to call res.end() again after piping. Did you try the above code without res.end()? - Complex
The reason your code works after adding the sleep (not the right terminology in Node.js :-) ) is that the pipe actually gets time to close the response stream after the req stream ends. The res.end() in setTimeout merely ends an already-ended stream, with no side effect. - Complex
Accepting your answer but including part of Wyatt's answer, as he went into more detail as to the reason: since I/O is asynchronous in Node, when you call .pipe(), control returns immediately to the current context while the pipe works in the background. When you then call res.end(), you close the stream, preventing any more data from being written. The solution here is to let .pipe end the stream itself, which is the default. - Nan
Why does req.pipe(res) work? How does it know to only send req.body and not all the headers, etc.? - Acarus
A
6

So first, it looks like your curl command is off: the filename of the posted data should be preceded by an @, as shown here. Otherwise you'd just be posting the filename itself.

Aside from that, Chandu is correct in saying that the call to res.end() is the problem here.

Since I/O is asynchronous in Node, when you call .pipe(), control returns immediately to the current context while the pipe works in the background. When you then call res.end(), you close the stream, preventing any more data from being written.
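
To see this ordering without a network involved, here is a minimal sketch using plain streams from Node's built-in stream module. It is only an illustration: source and sink are hypothetical stand-ins for req and res, and pipeInto is a made-up helper, not part of any API. It collects whatever the sink actually accepts, once with an early end() and once without.

```javascript
const { PassThrough, Writable, finished } = require('stream');

// Pipes a simulated request body into a collecting sink and resolves with
// everything the sink actually accepted. `source` and `sink` are stand-ins
// for `req` and `res`; the names are illustrative only.
function pipeInto(endEarly) {
  return new Promise((resolve) => {
    const accepted = [];
    const source = new PassThrough();
    const sink = new Writable({
      write(chunk, encoding, callback) {
        accepted.push(chunk.toString());
        callback();
      },
    });

    // Swallow any late-write error so it cannot crash the process.
    sink.on('error', () => {});
    // Fires on 'finish', 'error' or premature close, whichever comes first.
    finished(sink, () => resolve(accepted.join('')));

    sink.write('Body:\n');
    source.pipe(sink);         // returns immediately; no data has moved yet
    if (endEarly) sink.end();  // the mistake: ends the sink before the body arrives

    // The body shows up on a later turn of the event loop, like network data.
    setImmediate(() => source.end('test.payload'));
  });
}

pipeInto(true).then((out) => console.log('end() right after pipe():', JSON.stringify(out)));
pipeInto(false).then((out) => console.log('pipe() ends the sink:', JSON.stringify(out)));
```

With the early end(), the sink should only ever receive the "Body:" line; the payload arrives after the sink has been ended and is dropped. Whether that late write surfaces as an 'error' event seems to depend on the Node version, which is why the sketch registers a no-op error handler rather than relying on it.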

The solution here is to let .pipe end the stream itself, which is the default.
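
As a rough end-to-end sketch of that fix, the handler below writes its preamble and then leaves ending the response to pipe(). The echoOnce helper is hypothetical and exists only to exercise the handler: it starts a server on a free port, POSTs a payload, and closes the server again. It passes agent: false so the client connection is not kept alive after the response.

```javascript
const http = require('http');

// Echo handler: writes a preamble, then streams the request body back.
// There is no res.end() here: pipe() ends `res` when `req` emits 'end'.
function echoHandler(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.write('Echo service:\nUrl: ' + req.url + '\nBody:\n');
  req.pipe(res);
}

// Hypothetical helper for trying the handler out: starts a server on a
// free port, POSTs `payload` to it, and resolves with the response text.
function echoOnce(payload) {
  return new Promise((resolve, reject) => {
    const server = http.createServer(echoHandler);
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      const request = http.request(
        { host: '127.0.0.1', port, method: 'POST', path: '/', agent: false },
        (response) => {
          let text = '';
          response.setEncoding('utf8');
          response.on('data', (chunk) => { text += chunk; });
          response.on('end', () => { server.close(); resolve(text); });
        }
      );
      request.on('error', (err) => { server.close(); reject(err); });
      request.end(payload);
    });
  });
}

echoOnce('test.payload').then((text) => console.log(text));
```

If something does need to be written after the body, pipe() accepts an options object: req.pipe(res, { end: false }) leaves res open, and you can then call res.end() yourself from a req 'end' listener.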

I'd imagine timing came into play because, depending on the machine and the size of the data, the asynchronous I/O could in theory finish (fast I/O on a small dataset) before the end of the writable stream is fully processed.

I'd recommend this blog post for some more context.

Albumenize answered 6/8, 2013 at 18:31 Comment(4)
That makes perfect sense. Thanks for the explanation. - Nan
Re the curl post: that was intentional, as I was doing both. I took off the @ to test just a few characters versus the relatively large payload contained in the file. - Nan
So, to make sense of the "funny" timeout behavior: pipe takes, say, 500 ms, and then closes the connection. The timeout still fires (even if set 500 seconds into the future), but simply doesn't do anything when res.end() is called, since res is already closed. I would expect some sort of error to be reported. - Nan
Raising an error would certainly be useful, though there might be something else at work here. I just tried adding some res.write() calls after the call to end, and they also failed silently. According to the docs they should throw errors. Interesting! - Albumenize
