How to make the client download a very large file that is genereted on the fly
Asked Answered
Y

4

11

I have an export function that read the entire database and create a .xls file with all the records. Then the file is sent to the client.

Of course, the time of export the full database requires a lot of time and the request will soon end in a timeout error.

What is the best solution to handle this case?

I heard something about making a queue with Redis for example but this will require two requests: one for starting the job that will generate the file and the second to download the generated file.

Is this possible with a single request from the client?

Yester answered 25/4, 2017 at 9:57 Comment(0)
M
15

Excel Export:

Use Streams. Following is a rough idea of what might be done:

  1. Use exceljs module. Because it has a streaming API aimed towards this exact problem.

    var Excel = require('exceljs')
    
  2. Since we are trying to initiate a download. Write appropriate headers to response.

    res.status(200);
    res.setHeader('Content-disposition', 'attachment; filename=db_dump.xls');
    res.setHeader('Content-type', 'application/vnd.ms-excel');
    
  3. Create a workbook backed by Streaming Excel writer. The stream given to writer is server response.

    var options = {
        stream: res, // write to server response
        useStyles: false,
        useSharedStrings: false
    };
    
    var workbook = new Excel.stream.xlsx.WorkbookWriter(options);
    
  4. Now, the output streaming flow is all set up. for the input streaming, prefer a DB driver that gives query results/cursor as a stream.

  5. Define an async function that dumps 1 table to 1 worksheet.

    var tableToSheet = function (name, done) {
        var str = dbDriver.query('SELECT * FROM ' + name).stream();
        var sheet = workbook.addWorksheet(name);
    
        str.on('data', function (d) {
            sheet.addRow(d).commit(); // format object if required
        });
    
        str.on('end', function () {
            sheet.commit();
            done();
        });
    
        str.on('error', function (err) {
            done(err);
        });
    }
    
  6. Now, lets export some db tables, using async module's mapSeries:

    async.mapSeries(['cars','planes','trucks'],tableToSheet,function(err){
       if(err){
         // log error
       }
       res.end();
    })
    

CSV Export:

For CSV export of a single table/collection module fast-csv can be used:

// response headers as usual
res.status(200);
res.setHeader('Content-disposition', 'attachment; filename=mytable_dump.csv');
res.setHeader('Content-type', 'text/csv');

// create csv stream
var csv = require('fast-csv');
var csvStr = csv.createWriteStream({headers: true});

// open database stream
var dbStr = dbDriver.query('SELECT * from mytable').stream();

// connect the streams
dbStr.pipe(csvStr).pipe(res);

You are now streaming data from DB to HTTP response, converting it into xls/csv format on the fly. No need to buffer or store the entire data in memory or in a file.

Morentz answered 28/4, 2017 at 8:36 Comment(5)
@S.D. Thanks! What about if the file can be written line by line. As for example a .CSV file?Yester
@S.D I am using similar code and i need only one sheet in xlsx but it seems that it sends chunk when a worksheet is added(committed) completely not when a row is added(committed) completely. It would be better if I start getting data as soon as any row starts adding into the worksheet.Detroit
@RaghuChahar That has to be supported by the exceljs module's implementation. Looks like you've already asked there.Morentz
@S.D. I went to open that issue after not getting any reply here, Can you help me in any way, please!Detroit
@S.D. do you have any workaround for this with sequelize? As sequelize does not work with streamsTortuga
J
0

You do not have to send the whole file once, you can send this file by chunks (line by line for example), just use res.write(chunk) and res.end() at finish to mark it as completed.

Joslyn answered 27/4, 2017 at 13:57 Comment(2)
In this way can I send to the client, for example, a .csv file? And will be a file or just a browser page with the text inside?Yester
Right, xls is little bit problematic.. The idea is - to send something small to client each 25-30 seconds, then when the file is ready we actually send it, in this case TIMEOUT will not happen. But XLS you can't generate by chunks :(. You should separate your download to 2 parts. 1) Client request file, you return some task number to client and begin the job. 2) Client asks if task number is completed. If yes you return http url for a static file if not client should ask again in X seconds.Joslyn
P
0

You can either send the file information as a stream, sending each individual chunk as it gets created via res.write(chunk), or, if sending the file chunk by chunk is not an option, and you have to wait for the entire file before sending any information, you can always keep the connection open by setting the timeout duration to Infinity or any value you think will be high enough to allow the file to be created. Then set up a function that creates the .xls file and either:

1) Accepts a callback that receives the data output as an argument once ready, sends that data, and then closes the connection, or;

2) Returns a promise that resolves with the data output once its ready, allowing you to send the resolved value and close the connection just like with the callback version.

It would look something like this:

function xlsRouteHandler(req, res){
  res.setTimeout(Infinity) || res.socket.setTimeout(Infinity)

  //callback version
  createXLSFile(...fileCreationArguments, function(finishedFile){
    res.end(finishedFile)
  })

  //promise version
  createXLSFile(...fileCreationArguments)
    .then(finishedFile => res.end(finishedFile))
}

If you still find yourself concerned about timing out, you can always set an interval timer to dispatch an occasional res.write() message to prevent a timeout on the server connection and then cancel that interval once the final file content is ready to be sent.

Phraseogram answered 27/4, 2017 at 19:49 Comment(0)
R
0

Refer to this link which uses jedis (redis java client) The key to this is the LPOPRPUSH command

https://blog.logentries.com/2016/05/queuing-tasks-with-redis/

Relax answered 2/5, 2017 at 19:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.