Sending large image data over HTTP in Node.js

In my development environment I have two servers. One sends an image to the other over a POST HTTP request.

Client server does this:

    fs.readFile(rawFile.path, 'binary', function (err, file) {
        restler.post("http://0.0.0.0:5000", {
            data: file,
            headers: {
                "Content-Type": rawFile.type
            }
        }).on('complete', function (data, response) {
            console.log(data);
            res.send("file went through");
        });
    });

The server that receives the request does this:

    server.post('/',function(req,res,next){
        fs.writeFileSync("test.png",req.body,"binary",function(err){
            if(err) throw err;
            res.send("OK")
        })
    })

If I send a small image, it works fine. However, if I send a large image, the file is saved but only the upper portion of the image is displayed; the rest is black. The image size is correct.

I guess only the first chunk of the image is being written to the file. I've tried creating a readStream and a writeStream, but it doesn't seem to work:

req.body.pipe(fs.createWriteStream('test.png'))

Can I stream directly from the binary data and pipe it into the file? From what I've seen, a readStream is usually used to stream from files, not from raw binary data.

I've read a few posts, but nothing seems to work for me.

I'm using the restler module on the client server and restify on the other.

Thanks!

Moron answered 21/2, 2013 at 12:4 Comment(0)

Sorry to be blunt, but there's a lot wrong here.

readFile reads the entire contents of a file into memory before invoking the callback, at which point you begin uploading the file.

This is bad, especially when dealing with large files like images, because there's really no reason to read the file into memory. It's wasteful, and under load you'll find that your server runs out of memory and crashes.

Instead, you want to get a stream, which emits chunks of data as they're read from disk. All you have to do is pass those chunks along to your upload stream (pipe), and then discard the data from memory. In this way, you never use more than a small amount of buffer memory.

(A readable stream's default behavior is to deal in raw binary data; it's only if you pass an encoding that it deals in text.)

The request module makes this especially easy:

fs.createReadStream('test.png').pipe(request.post('http://0.0.0.0:5000/'));
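
For reference, a slightly fuller client-side sketch might look like the following; the file path and URL are just the ones from the question, and the event handling shown here is one reasonable way to wire it up, not the only one:

    var fs = require('fs');
    var request = require('request');

    // Stream the file from disk straight into the outgoing POST body;
    // only a small buffer's worth of data is in memory at any time.
    fs.createReadStream('test.png')
        .pipe(request.post('http://0.0.0.0:5000/'))
        .on('response', function (response) {
            console.log('upload finished with status', response.statusCode);
        })
        .on('error', function (err) {
            console.error('upload failed:', err);
        });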

On the server, you have a larger problem. Never use *Sync methods in a server: they block it from doing anything else (like responding to other requests) until the entire file is flushed to disk, which can take seconds.
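
Purely to illustrate the blocking vs. non-blocking distinction (the streaming approach described next is still the better fix here), compare the synchronous call with its callback-based counterpart; imageBuffer is just a placeholder for data you already hold in memory:

    // Blocks the entire process until the whole buffer is flushed to disk:
    fs.writeFileSync('test.png', imageBuffer);

    // Returns immediately; the callback fires when the write completes,
    // so the server keeps handling other requests in the meantime:
    fs.writeFile('test.png', imageBuffer, function (err) {
        if (err) throw err;
        res.send('OK');
    });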

So instead, we want to take the incoming data stream and pipe it to a filesystem stream. You were on the right track originally; the reason that req.body.pipe(fs.createWriteStream('test.png')) didn't work is that body is not a stream.

body is generated by the bodyParser middleware. In restify, that middleware acts much like readFile in that it buffers the entire incoming request-entity in memory. In this case, we don't want that. Disable the body parser middleware.

So where is the incoming data stream? It is the req object itself. restify's Request inherits from node's http.IncomingMessage, which is a readable stream. So:

req.pipe(fs.createWriteStream('test.png'));
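
Put together, a rough sketch of the receiving server might look like this; there is no body parser, and the file name and error handling are illustrative rather than prescriptive:

    var fs = require('fs');
    var restify = require('restify');

    var server = restify.createServer();
    // Note: no server.use(restify.bodyParser()); we want the raw request stream.

    server.post('/', function (req, res, next) {
        var out = fs.createWriteStream('test.png');

        // req is a readable stream, so pipe it straight to disk.
        req.pipe(out);

        out.on('finish', function () {
            res.send('OK');
            next();
        });

        out.on('error', function (err) {
            next(err);
        });
    });

    server.listen(5000);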

I should also mention that this all works so simply because there's no form-parsing overhead involved. request just sends the file with no multipart/form-data wrappers:

POST / HTTP/1.1
host: localhost:5000
content-type: application/octet-stream
Connection: keep-alive
Transfer-Encoding: chunked

<image data>...

This means that a browser could not post a file to this URL. If you need that, look into formidable, which does streaming parsing of request entities.
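
Should browser uploads become a requirement later, a minimal formidable-based handler might look roughly like this; the route, upload directory, and field handling are assumptions for the sake of the sketch, not part of the original answer:

    var formidable = require('formidable');

    server.post('/upload', function (req, res, next) {
        var form = new formidable.IncomingForm();
        form.uploadDir = '/tmp'; // assumed temporary directory

        // formidable streams each multipart part to disk as it arrives,
        // so the whole file is never buffered in memory.
        form.parse(req, function (err, fields, files) {
            if (err) return next(err);
            // Each entry in `files` has a .path pointing at the stored upload.
            res.send('received ' + Object.keys(files).length + ' file(s)');
            next();
        });
    });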

Caruncle answered 21/2, 2013 at 16:58 Comment(12)
Great answer! Thanks for putting me back on track :) I didn't explain in the question what I needed to accomplish:Moron
Images are uploaded to the server, resized into 4 different sizes, and later uploaded to a 3rd-party service. I'm resizing images with imagemagick and I'm not sure if I can pass in a stream; I'll have to read the docs more carefully. Otherwise I'll have to read the file (?) I understand that sync methods are not good, it was just a way of debugging. Would everything explained above work for all of this?Moron
Thanks for helping a front-end newbie... :)Moron
@Maroshii: Depends on how you use imagemagick. If you're invoking the CLI tools, they run completely externally, and you just give it a path to a file. So you'd just invoke IM when the upload completes: req.pipe(fs.createWriteStream('test.png')).on('finish', function() { /* run imagemagick on test.png */ }); There's also an imagemagick-native module that allows you to operate on data in-memory (thus requiring no temporary files), but that's a lot more complex.Caruncle
@Caruncle how could I do that without using the request module? Can you please help me out with my question here: #38023926Despoliation
What's the best practice if I'm sending a file (image) from a client to Node.js?Heredity
formidable parses browser uploads and streams them to disk.Caruncle
@Caruncle I cannot rely on non-built-in modules, so I cannot use request, only http; so it's fs.createWriteStream + http only in my case.Silkaline
Is the Multer middleware necessary for reading an incoming data stream in Node?Tessatessellate
It depends on what that incoming data looks like. If it's just a stream of raw data uploaded from a node client, then no parser is necessary; you can just pipe the request to a file. If you're dealing with a browser upload, then you do need something like formidable or multer to parse the multipart/form-data upload.Caruncle
Thanks for the answer, Josh. If I'm understanding correctly, this means that if I want to add any additional params or information to the request, I'd have to switch over to using multipart/form-data and use a middleware like multer to parse it?Tessatessellate
Have to? No, there are plenty of ugly ways to avoid using a multipart upload. You could stuff data into query parameters or headers, but that's not particularly RESTful. Your life will most likely be made much, much easier if you use a multipart parser and upload files and additional parameters in the standard fashion.Caruncle

I don't know much about restler, but posting an image is a multipart request.

restler.post("http://0.0.0.0:5000",{
    data: restler.file(path, filename, fileSize, encoding, contentType),
    multipart: true
})
Dumyat answered 21/2, 2013 at 12:15 Comment(1)
Use the rest.file() method instead of fs.readFile. I'll update my answer.Dumyat

I tried the solution above, and if you're just moving uploaded files or something similar, the following works much better:

fs.rename(path, newPath, function (err) { /* handle err */ });

I was uploading files over 200MB and would encounter errors using streams, sync or async.

Isogamete answered 6/9, 2013 at 18:38 Comment(0)
