How to pipe multiple readable streams, from multiple api requests, to a single writeable stream?

Asked 23/7, 2019 at 6:5 Answered 4/8, 2019 at 13:1

node.js express ibm-watson fs node-streams

- Desired Behaviour
- Actual Behaviour
- What I've Tried
- Steps To Reproduce
- Research

Desired Behaviour

Pipe multiple readable streams, received from multiple api requests, to a single writeable stream.

The api responses are from ibm-watson's textToSpeech.synthesize() method.

Multiple requests are required because the service has a 5KB limit on text input.

Therefore a string of 18KB, for example, requires four requests to complete.
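As a rough illustration of that chunking step, here is a minimal sketch. It assumes the limit is measured in UTF-8 bytes and that no single word is larger than the limit. The name splitTextByBytes is hypothetical and not part of the Watson SDK.

```javascript
// Hypothetical helper: split text into chunks whose UTF-8 size stays at or
// below maxBytes, breaking on whitespace so words are not cut in half.
// Assumption: no single word is larger than maxBytes.
function splitTextByBytes(text, maxBytes = 5 * 1024) {
    const chunks = [];
    let current = '';
    // The capture group keeps the whitespace tokens, so joining the chunks
    // reproduces the original text exactly.
    for (const token of text.split(/(\s+)/)) {
        const wouldOverflow = Buffer.byteLength(current + token, 'utf8') > maxBytes;
        if (wouldOverflow && current !== '') {
            chunks.push(current);
            current = '';
        }
        current += token;
    }
    if (current !== '') {
        chunks.push(current);
    }
    return chunks;
}
```

Each element of the returned array could then be sent as the text parameter of one synthesize() request.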

Actual Behaviour

The writeable stream file is incomplete and garbled.

The application seems to 'hang'.

When I try to open the incomplete .mp3 file in an audio player, it says it is corrupted.

The process of opening and closing the file seems to increase its file size - like opening the file somehow prompts more data to flow in to it.

Undesirable behaviour is more apparent with larger inputs, eg four strings of 4000 bytes or less.

What I've Tried

I've tried several methods to pipe the readable streams to either a single writeable stream or multiple writeable streams using the npm packages combined-stream, combined-stream2, multistream and archiver and they all result in incomplete files. My last attempt doesn't use any packages and is shown in the Steps To Reproduce section below.

I am therefore questioning each part of my application logic:

01. What is the response type of a watson text to speech api request?

The text to speech docs say the api response type is:

Response type: NodeJS.ReadableStream|FileObject|Buffer

I am confused that the response type is one of three possible things.

In all my attempts, I have been assuming it is a readable stream.

02. Can I make multiple api requests in a map function?

03. Can I wrap each request within a promise() and resolve the response?

04. Can I assign the resulting array to a promises variable?

05. Can I declare var audio_files = await Promise.all(promises)?

06. After this declaration, are all responses 'finished'?

07. How do I correctly pipe each response to a writable stream?

08. How do I detect when all pipes have finished, so I can send file back to client?

For questions 2 - 6, I am assuming the answer is 'YES'.

I think my failures relate to question 7 and 8.

Steps To Reproduce

You can test this code with an array of four randomly generated text strings with a respective byte size of 3975, 3863, 3974 and 3629 bytes - here is a pastebin of that array.

// route handler
app.route("/api/:api_version/tts")
    .get(api_tts_get);

// route handler middleware
const api_tts_get = async (req, res) => {

    var query_parameters = req.query;

    var file_name = query_parameters.file_name;
    var text_string_array = text_string_array; // eg: https://pastebin.com/raw/JkK8ehwV

    var absolute_path = path.join(__dirname, "/src/temp_audio/", file_name);
    var relative_path = path.join("./src/temp_audio/", file_name); // path relative to server root

    // for each string in an array, send it to the watson api  
    var promises = text_string_array.map(text_string => {

        return new Promise((resolve, reject) => {

            // credentials
            var textToSpeech = new TextToSpeechV1({
                iam_apikey: iam_apikey,
                url: tts_service_url
            });

            // params  
            var synthesizeParams = {
                text: text_string,
                accept: 'audio/mp3',
                voice: 'en-US_AllisonV3Voice'
            };

            // make request  
            textToSpeech.synthesize(synthesizeParams, (err, audio) => {
                if (err) {
                    console.log("synthesize - an error occurred: ");
                    return reject(err);
                }
                resolve(audio);
            });

        });
    });

    try {
        // wait for all responses
        var audio_files = await Promise.all(promises);
        var audio_files_length = audio_files.length;

        var write_stream = fs.createWriteStream(`${relative_path}.mp3`);

        audio_files.forEach((audio, index) => {

            // if this is the last value in the array, 
            // pipe it to write_stream, 
            // when finished, the readable stream will emit 'end' 
            // then the .end() method will be called on write_stream  
            // which will trigger the 'finished' event on the write_stream    
            if (index == audio_files_length - 1) {
                audio.pipe(write_stream);
            }
            // if not the last value in the array, 
            // pipe to write_stream and leave open 
            else {
                audio.pipe(write_stream, { end: false });
            }

        });

        write_stream.on('finish', function() {

            // download the file (using absolute_path)  
            res.download(`${absolute_path}.mp3`, (err) => {
                if (err) {
                    console.log(err);
                }
                // delete the file (using relative_path)  
                fs.unlink(`${relative_path}.mp3`, (err) => {
                    if (err) {
                        console.log(err);
                    }
                });
            });

        });


    } catch (err) {
        console.log("there was an error getting tts");
        console.log(err);
    }

}

The official example shows:

textToSpeech.synthesize(synthesizeParams)
  .then(audio => {
    audio.pipe(fs.createWriteStream('hello_world.mp3'));
  })
  .catch(err => {
    console.log('error:', err);
  });

which seems to work fine for single requests, but not for multiple requests, as far as I can tell.

Research

concerning readable and writeable streams, readable stream modes (flowing and paused), 'data', 'end', 'drain' and 'finish' events, pipe(), fs.createReadStream() and fs.createWriteStream()

Almost all Node.js applications, no matter how simple, use streams in some manner...

const server = http.createServer((req, res) => {
    // `req` is an http.IncomingMessage, which is a Readable Stream
    // `res` is an http.ServerResponse, which is a Writable Stream

    let body = '';
    // get the data as utf8 strings.
    // if an encoding is not set, Buffer objects will be received.
    req.setEncoding('utf8');

    // readable streams emit 'data' events once a listener is added
    req.on('data', (chunk) => {
        body += chunk;
    });

    // the 'end' event indicates that the entire body has been received
    req.on('end', () => {
        try {
            const data = JSON.parse(body);
            // write back something interesting to the user:
            res.write(typeof data);
            res.end();
        } catch (er) {
            // uh oh! bad json!
            res.statusCode = 400;
            return res.end(`error: ${er.message}`);
        }
    });
});

https://nodejs.org/api/stream.html#stream_api_for_stream_consumers

Readable streams have two main modes that affect the way we can consume them...they can be either in the paused mode or in the flowing mode. All readable streams start in the paused mode by default but they can be easily switched to flowing and back to paused when needed...just adding a data event handler switches a paused stream into flowing mode and removing the data event handler switches the stream back to paused mode.

https://www.freecodecamp.org/news/node-js-streams-everything-you-need-to-know-c9141306be93
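The mode switch described in that quote can be observed through the readableFlowing property. This is a small sketch, not part of the quoted article. The function name flowingStates is made up for illustration. Note that, as far as the Node.js documentation goes, removing a 'data' handler does not by itself pause the stream, so the sketch calls pause() explicitly.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: records readableFlowing at three points in a stream's life.
function flowingStates() {
    const readable = Readable.from([Buffer.from('chunk')]);
    const states = [];

    states.push(readable.readableFlowing); // no consumer attached yet

    readable.on('data', () => {});         // attaching a 'data' handler starts the flow
    states.push(readable.readableFlowing);

    readable.pause();                      // an explicit pause stops 'data' events
    states.push(readable.readableFlowing);

    return states;
}
```

Calling flowingStates() returns the three recorded values in order: null, true, false.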

Here’s a list of the important events and functions that can be used with readable and writable streams

The most important events on a readable stream are:

The data event, which is emitted whenever the stream passes a chunk of data to the consumer
The end event, which is emitted when there is no more data to be consumed from the stream.

The most important events on a writable stream are:

The drain event, which is a signal that the writable stream can receive more data.
The finish event, which is emitted when all data has been flushed to the underlying system.

https://www.freecodecamp.org/news/node-js-streams-everything-you-need-to-know-c9141306be93

.pipe() takes care of listening for 'data' and 'end' events from the fs.createReadStream().

https://github.com/substack/stream-handbook#why-you-should-use-streams

.pipe() is just a function that takes a readable source stream src and hooks the output to a destination writable stream dst

https://github.com/substack/stream-handbook#pipe

The return value of the pipe() method is the destination stream

https://flaviocopes.com/nodejs-streams/#pipe

By default, stream.end() is called on the destination Writable stream when the source Readable stream emits 'end', so that the destination is no longer writable. To disable this default behavior, the end option can be passed as false, causing the destination stream to remain open:

https://nodejs.org/api/stream.html#stream_readable_pipe_destination_options
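To see the effect of that option in isolation, the following sketch pipes one source with end: false and reports whether the destination was ended once the source finished. The helper name pipeKeepingOpen is an assumption for illustration only.

```javascript
// Hypothetical helper: pipe `source` into `destination` without ending it,
// and resolve with destination.writableEnded once the source emits 'end'.
function pipeKeepingOpen(source, destination) {
    return new Promise((resolve, reject) => {
        source.once('error', reject);
        source.once('end', () => resolve(destination.writableEnded));
        source.pipe(destination, { end: false });
    });
}
```

With end: false the promise should resolve with false, meaning the caller is still responsible for calling destination.end() when no more sources remain.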

The 'finish' event is emitted after the stream.end() method has been called, and all data has been flushed to the underlying system.

const writer = getWritableStreamSomehow();
for (let i = 0; i < 100; i++) {
  writer.write(`hello, #${i}!\n`);
}
writer.end('This is the end\n');
writer.on('finish', () => {
  console.log('All writes are now complete.');
});

https://nodejs.org/api/stream.html#stream_event_finish

If you're trying to read multiple files and pipe them to a writable stream, you have to pipe each one to the writable stream and pass end: false when doing it, because by default, a readable stream ends the writable stream when there's no more data to be read. Here's an example:

var ws = fs.createWriteStream('output.pdf');

fs.createReadStream('pdf-sample1.pdf').pipe(ws, { end: false });
fs.createReadStream('pdf-sample2.pdf').pipe(ws, { end: false });
fs.createReadStream('pdf-sample3.pdf').pipe(ws);

https://stackoverflow.com/a/30916248

You want to add the second read into an eventlistener for the first read to finish...

var a = fs.createReadStream('a');
var b = fs.createReadStream('b');
var c = fs.createWriteStream('c');
a.pipe(c, {end:false});
a.on('end', function() {
  b.pipe(c);
});

https://stackoverflow.com/a/28033554

A Brief History of Node Streams - part one and two.

Related Google search:

how to pipe multiple readable streams to a single writable stream? nodejs

Questions covering the same or similar topic, without authoritative answers (or might be 'outdated'):

How to pipe multiple ReadableStreams to a single WriteStream?

Piping to same Writable stream twice via different Readable stream

Pipe multiple files to one response

Creating a Node.js stream from two piped streams

Effectuate answered 23/7, 2019 at 6:5 Comment(4)

I don't think that you can simply concatenate multiple audio streams in the way you are attempting. Each stream will have its own header information defining each segment. You will have these headers interspersed in the final file, and the first simply will not describe the content. You need to find a library that will allow you to join audio files. – Rebut 23/7, 2019 at 11:14

can you please confirm what the return response type is, ie NodeJS.ReadableStream|FileObject|Buffer? then i think i will have a better idea how to join them and write to file. thank you. – Effectuate 24/7, 2019 at 6:45

You are using node.js, so type is fluid, but if you check through the SDK - github.com/watson-developer-cloud/node-sdk/blob/master/… and github.com/IBM/node-sdk-core/blob/master/lib/requestwrapper.ts, then it's a stream, which you can pipe to a write stream audio.pipe(fs.createWriteStream('hello_world.wav')); – Rebut 24/7, 2019 at 10:9

@Rebut - are you suggesting piping each readable stream to its own mp3 file and then, when all those pipes have finished, joining audio? that method has since been suggested in an answer that unfortunately is producing errors. i think something is going awry with the piping to write streams in the first place. not sure if relevant, but tested single requests to api with input around 4000 bytes in Postman - resulting audio had repeating blocks of sound at the end of the file, also the original 200 OK response came back quickly, but file took about 2 mins to be completed and ready to save. – Effectuate 31/7, 2019 at 22:18

The core problem to solve here is asynchronicity. You almost had it: the problem with the code you posted is that you are piping all source streams into the target stream in parallel and in no particular order. This means data chunks from the different audio streams will be interleaved at random. In addition, the one pipe that is allowed to end the target stream can finish before the pipes that use end: false, closing the target stream too early, which might explain why the file size increases after you re-open it.

What you want is to pipe them sequentially - you even posted the solution when you quoted

You want to add the second read into an eventlistener for the first read to finish...

or as code:

a.pipe(c, { end:false });
a.on('end', function() {
  b.pipe(c);
});

This will pipe the source streams in sequential order into the target stream.

In your code, this means replacing the audio_files.forEach loop with:

await Bluebird.mapSeries(audio_files, async (audio, index) => {  
    const isLastIndex = index == audio_files_length - 1;
    audio.pipe(write_stream, { end: isLastIndex });
    return new Promise(resolve => audio.on('end', resolve));
});

Note the usage of bluebird.js mapSeries here.
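For comparison, the same sequencing can be sketched without Bluebird by awaiting each source's 'end' event inside a for...of loop. This is only a sketch: pipeSequentially is a hypothetical name, and it does not address whether concatenated audio data forms a valid audio file.

```javascript
// Hypothetical helper: pipe each source into `destination` one at a time,
// in array order, and end the destination after the last source.
async function pipeSequentially(sources, destination) {
    for (const source of sources) {
        await new Promise((resolve, reject) => {
            source.once('error', reject);
            source.once('end', resolve); // this source has delivered all its data
            source.pipe(destination, { end: false });
        });
    }
    destination.end(); // nothing left to write; allows 'finish' to fire
}
```

In the question's handler this would be called as pipeSequentially(audio_files, write_stream), with the existing write_stream.on('finish', ...) listener still deciding when to send the file.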

Further advice regarding your code:

you should consider using lodash.js
you should use const & let instead of var and consider using camelCase
when you notice "it works with one event, but fails with multiple" always think: asynchronicity, permutations, race conditions.

Further reading, limitations of combining native node streams: https://github.com/nodejs/node/issues/93

Condolent answered 31/7, 2019 at 23:2 Comment(7)

thanks for this, my attempts to implement have produced various issues, some questions that might help resolve them are: 1) does this solution work with the text_string_array value provided in OP - to test behaviour with large input? 2) i see a new promise is returned for each iteration of the map function - 2a) does that mean a result value is returned for each readable stream? 2b) what is the result value of each promise? 3) how do i detect when the output file is ready to send back? here is my pastebin of last attempt to implement solution: pastebin.com/PY8GWPmq – Effectuate 1/8, 2019 at 7:28

Note that this solution might not address every issue with your approach: for example I am not sure if audio streams can be joined to yield another valid audio stream. Most data formats do not allow for that! Thus the other approach / solution to join audio files might be more valuable to you and does sound more stable! – Condolent 1/8, 2019 at 10:48

Trying to answer your questions: 1) Not all issues might be solved, but at least those I discovered in your code so that logic wise it may work 2a) The promises are only returned so that mapSeries waits for each stream to end before calling pipe on the next one. The promise results are not used. 2b) The result is the return value of resolve (=undefined) - it is not used. 3)As done in your code: write_stream.on('finish', …. If you replace the audio_files.forEach loop with the mapSeries(audio_files loop you should be closer to a solution, if the data format allows this approach – Condolent 1/8, 2019 at 22:13

Your last code example does not look bad - did you try it with the whole first code example, thus with everything before the audio_files.forEach loop (the text_string_array and so on)? The whole purpose of my proposal is to bring the streams into sequential order, each finishing all its data events before the next one starts writing. One correction for 2b): the results will be the return value of audio.on (still should be undefined). – Condolent 1/8, 2019 at 22:23

yes, just replaced the audio_files.forEach block with the Bluebird.mapSeries block, as shown in the pastebin i linked to in first comment. current behaviour is that the first promise takes so long to 'complete' that the application just seems to start resending the requests, starting from the beginning, at which point i just have to Ctrl C the application. – Effectuate 2/8, 2019 at 5:2

If the first readable stream never ends - that would seem an issue of the Text-to-Speech library which might keep the streams open? You could try a minimal example just to see if audio.on('end' ever happens on the readable streams. Also you could try adding a .catch(...) to Promise(resolve => audio.on('end', resolve)) with code to log errors. – Condolent 2/8, 2019 at 7:12

for reference, the 'resending of the request' issue was caused by a timeout in node that can be resolved by answers mentioned here: github.com/expressjs/express/issues/2512 - basically using something like: req.connection.setTimeout( 1000 * 60 * 10 ); // ten minutes. am making progress by combining answers here and plan to post results and insights tomorrow after further testing. – Effectuate 3/8, 2019 at 13:35

I'll give my two cents here, since I looked at a similar question recently! From what I have tested and researched, you can combine the two .mp3 / .wav streams into one. This results in a file that has noticeable issues, as you've mentioned, such as truncation, glitches etc.

The only way I believe you can combine the Audio streams correctly will be with a module that is designed to concatenate sound files/data.

The best result I have obtained is to synthesize the audio into separate files, then combine like so:

function combineMp3Files(files, outputFile) {
    const ffmpeg = require("fluent-ffmpeg");
    const combiner = ffmpeg().on("error", err => {
        console.error("An error occurred: " + err.message);
    })
    .on("end", () => {
        console.log('Merge complete');
    });

    // Add in each .mp3 file.
    files.forEach(file => {
        combiner.input(file)
    });

    combiner.mergeToFile(outputFile); 
}

This uses the node-fluent-ffmpeg library, which requires installing ffmpeg.

Other than that I'd suggest you ask IBM support (because as you say the docs don't seem to indicate this) how API callers should combine the synthesized audio, since your use case will be very common.

To create the audio files, I do the following:

// Switching to audio/webm and the V3 voices.. much better output 
function synthesizeText(text) {
    const synthesizeParams = {
        text: text,
        accept: 'audio/webm',
        voice: 'en-US_LisaV3Voice'
    };
    return textToSpeech.synthesize(synthesizeParams);
}


async function synthesizeTextChunksSeparateFiles(text_chunks) {
    const audioArray = await Promise.all(text_chunks.map(synthesizeText));
    console.log(`synthesizeTextChunks: Received ${audioArray.length} result(s), writing to separate files...`);
    audioArray.forEach((audio, index) => {
        audio.pipe(fs.createWriteStream(`audio-${index}.mp3`));
    });
}

And then combine like so:

combineMp3Files(['audio-0.mp3', 'audio-1.mp3', 'audio-2.mp3', 'audio-3.mp3', 'audio-4.mp3'], 'combined.mp3');

I should point out that I'm doing this in two separate steps (waiting a few hundred milliseconds would also work), but it should be easy enough to wait for the individual files to be written, then combine them.

Here's a function that will do this:

async function synthesizeTextChunksThenCombine(text_chunks, outputFile) {
    const audioArray = await Promise.all(text_chunks.map(synthesizeText));
    console.log(`synthesizeTextChunks: Received ${audioArray.length} result(s), writing to separate files...`);
    let writePromises = audioArray.map((audio, index) => {
        return new Promise((resolve, reject) => {
            audio.pipe(fs.createWriteStream(`audio-${index}.mp3`).on('close', () => {   
                resolve(`audio-${index}.mp3`);
            }));
        })
    });
    let files = await Promise.all(writePromises);
    console.log('synthesizeTextChunksThenCombine: Separate files: ', files);
    combineMp3Files(files, outputFile);
}
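One gap in the function above is that the promises never reject if a stream fails. A possible variation, assuming Node.js 15 or later where the stream/promises module is available, uses pipeline() so that an error on either side rejects. The names writeStreamsToFiles and makePath are hypothetical.

```javascript
const fs = require('fs');
const { pipeline } = require('stream/promises'); // assumes Node.js 15+

// Hypothetical helper: write each readable stream to its own file and
// resolve with the file paths once every file has been fully written.
async function writeStreamsToFiles(streams, makePath) {
    const paths = streams.map((_, index) => makePath(index));
    await Promise.all(
        streams.map((stream, index) =>
            // pipeline() rejects if the source or the file stream errors
            pipeline(stream, fs.createWriteStream(paths[index]))
        )
    );
    return paths;
}
```

The resolved array could then be passed to combineMp3Files, for example with makePath set to index => `audio-${index}.mp3`.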

Annates answered 24/7, 2019 at 10:38 Comment(12)

my files variable is declared like this: var files = await Promise.all(promises), so it is an array of the readable streams returned from the api. where you have outputFile, i have put ${relative_path_to_file}.mp3. i am getting error:

Error: Only one input stream is supported at FfmpegCommand.proto.mergeAdd.proto.addInput.proto.input (C:\Users\Me\Documents\my_repo\node_modules\fluent-ffmpeg\lib\options\inputs.js:42:15) at files.forEach (C:\Users\Me\Documents\my_repo\app.js:1196:22)

. – Effectuate 24/7, 2019 at 11:57

if, rather than iterating over the readable streams, and adding them as arguments to the input() method, you are actually iterating over files that have already been written, how did you detect when all the readable streams had finished piping into the writeable streams? (so that the files were ready to be passed to the input() method) – Effectuate 24/7, 2019 at 12:1

I'm doing it in two steps so I'm not currently detecting that the streams have completed, but you could create an array of Promises that resolve on the stream.end callback on each one and then do an await all for this. – Annates 24/7, 2019 at 12:3

I've added a function that creates the temporary audio files, then combines them once they are written.. – Annates 24/7, 2019 at 12:13

Will test tomorrow, need to refactor and falling asleep, thanks! – Effectuate 24/7, 2019 at 12:48

Cool, one more thing, you could probably use dedicated tmp files for the audio files (if you end up using that approach), this is probably better practice, plus they will get cleaned up eventually.. – Annates 24/7, 2019 at 12:51

with a test of 4 text strings equal to or less than 4000 bytes, after synthesizeTextChunks: Received 4 result(s), writing to separate files..., it took maybe 30 seconds or more before synthesizeTextChunksThenCombine: Separate files: [ 'audio-0.mp3', 'audio-1.mp3', 'audio-2.mp3', 'audio-3.mp3' ] and Merge complete was displayed, then i opened the final outputFile and the audio had some glitches and was not complete, closing the file and re-opening it caused the files' displayed file size to increase and the previous two messages were logged again. so it seems something is going awry. – Effectuate 25/7, 2019 at 9:5

I can confirm that again.. I'm seeing these glitches in the audio-n.mp3 files as well, so it seems to me that the original synthesis or the persistence of the resulting stream to file is causing the problem. I might play around with the API to see if I can improve this.. – Annates 25/7, 2019 at 9:43

Using the synthesis settings of: { accept: 'audio/webm', voice: 'en-US_LisaV3Voice' } gives us better output. I think this is definitely part of the issue. – Annates 25/7, 2019 at 9:59

Also using accept: 'audio/ogg;codecs=opus;rate=48000' gives a pretty good result. The V3 voices are definitely a lot better. – Annates 25/7, 2019 at 10:15

i am bit confused, in my mind the problem is not the audio being produced by watson, but rather not knowing the correct way to join the multiple readable streams returned from watson into one file. that is why i added research about streams to the question - to try and find out what i was missing. so i am hoping someone will be able to say 'here is the correct way to create a single file from multiple readable streams' which will produce the desired result. – Effectuate 30/7, 2019 at 5:10

Oh yes, absolutely! I just wanted to ensure the audio being produced by Watson was the best possible! – Annates 30/7, 2019 at 5:15

WebRTC would be a good option for the above problem, because once your file has been generated, you can give it to the client to listen to.

https://www.npmjs.com/package/simple-peer

Ondine answered 31/7, 2019 at 5:56 Comment(0)

Here are two solutions.

Solution 01

uses Bluebird.mapSeries
writes individual responses to temporary files
puts them in a zip file (using archiver)
sends zip file back to client to save
deletes temporary files

It utilises Bluebird.mapSeries from BM's answer but instead of just mapping over the responses, requests and responses are handled within the map function. Also, it resolves promises on the writeable stream finish event, rather than the readable stream end event. Bluebird is helpful in that it pauses iteration within a map function until a response has been received and handled, and then moves on to the next iteration.

Given that the Bluebird map function produces clean audio files, rather than zipping the files, you could use a solution like in Terry Lennox's answer to combine multiple audio files into one audio file. My first attempt of that solution, using Bluebird and fluent-ffmpeg, produced a single file, but it was slightly lower quality - no doubt this could be tweaked in ffmpeg settings, but i didn't have time to do that.

// route handler
app.route("/api/:api_version/tts")
    .get(api_tts_get);

// route handler middleware
const api_tts_get = async (req, res) => {

    var query_parameters = req.query;

    var file_name = query_parameters.file_name;
    var text_chunk_array = text_string_array; // eg: https://pastebin.com/raw/JkK8ehwV

    var absolute_path = path.join(__dirname, "/src/temp_audio/", file_name);
    var relative_path = path.join("./src/temp_audio/", file_name); // path relative to server root

    // set up archiver
    var archive = archiver('zip', {
        zlib: { level: 9 } // sets the compression level  
    });
    var zip_write_stream = fs.createWriteStream(`${relative_path}.zip`);
    archive.pipe(zip_write_stream);

    await Bluebird.mapSeries(text_chunk_array, async function(text_chunk, index) {

        // check if last value of array  
        const isLastIndex = index === text_chunk_array.length - 1;

        return new Promise((resolve, reject) => {

            var textToSpeech = new TextToSpeechV1({
                iam_apikey: iam_apikey,
                url: tts_service_url
            });

            var synthesizeParams = {
                text: text_chunk,
                accept: 'audio/mp3',
                voice: 'en-US_AllisonV3Voice'
            };

            textToSpeech.synthesize(synthesizeParams, (err, audio) => {
                if (err) {
                    console.log("synthesize - an error occurred: ");
                    return reject(err);
                }

                // write individual files to disk  
                var file_name = `${relative_path}_${index}.mp3`;
                var write_stream = fs.createWriteStream(`${file_name}`);
                audio.pipe(write_stream);

                // on finish event of individual file write  
                write_stream.on('finish', function() {

                    // add file to archive  
                    archive.file(file_name, { name: `audio_${index}.mp3` });

                    // if not the last value of the array
                    if (isLastIndex === false) {
                        resolve();
                    } 
                    // if the last value of the array 
                    else if (isLastIndex === true) {
                        resolve();

                        // when zip file has finished writing,
                        // send it back to client, and delete temp files from server 
                        zip_write_stream.on('close', function() {

                            // download the zip file (using absolute_path)  
                            res.download(`${absolute_path}.zip`, (err) => {
                                if (err) {
                                    console.log(err);
                                }

                                // delete each audio file (using relative_path) 
                                for (let i = 0; i < text_chunk_array.length; i++) {
                                    fs.unlink(`${relative_path}_${i}.mp3`, (err) => {
                                        if (err) {
                                            console.log(err);
                                        }
                                        console.log(`AUDIO FILE ${i} REMOVED!`);
                                    });
                                }

                                // delete the zip file
                                fs.unlink(`${relative_path}.zip`, (err) => {
                                    if (err) {
                                        console.log(err);
                                    }
                                    console.log(`ZIP FILE REMOVED!`);
                                });

                            });


                        });

                        // from archiver readme examples  
                        archive.on('warning', function(err) {
                            if (err.code === 'ENOENT') {
                                // log warning
                            } else {
                                // throw error
                                throw err;
                            }
                        });

                        // from archiver readme examples  
                        archive.on('error', function(err) {
                            throw err;
                        });

                        // from archiver readme examples 
                        archive.finalize();
                    }
                });
            });

        });

    });

}

Solution 02

I was keen to find a solution that didn't use a library to "pause" within the map() iteration, so I:

swapped the map() function for a for of loop
used await before the api call, rather than wrapping it in a promise, and
instead of using return new Promise() to contain the response handling, I used await new Promise() (gleaned from this answer)

This last change, magically, paused the loop until the archive.file() and audio.pipe(writestream) operations were completed - i'd like to better understand how that works.
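The pause is not specific to streams or to archiver: await suspends the enclosing async function until the awaited promise settles, so the next loop iteration cannot begin before resolve() has been called. A return new Promise() inside a plain map() callback has no such effect, because map() does not wait for what its callback returns. The sketch below uses timers as a stand-in for file writes. The names fakeWrite and runInSeries are made up for illustration.

```javascript
// Hypothetical stand-in for "the file finished writing": resolves after `ms` ms.
function fakeWrite(label, ms, log) {
    return new Promise((resolve) => {
        setTimeout(() => {
            log.push(`done ${label}`);
            resolve();
        }, ms);
    });
}

// Each iteration starts only after the previous promise has resolved.
async function runInSeries(labels, log) {
    for (const label of labels) {
        log.push(`start ${label}`);
        await fakeWrite(label, 10, log); // the function is suspended here
    }
    return log;
}
```

Running runInSeries(['a', 'b'], []) resolves with the entries in strict order: start a, done a, start b, done b.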

// route handler
app.route("/api/:api_version/tts")
    .get(api_tts_get);

// route handler middleware
const api_tts_get = async (req, res) => {

    var query_parameters = req.query;

    var file_name = query_parameters.file_name;
    var text_chunk_array = text_string_array; // eg: https://pastebin.com/raw/JkK8ehwV

    var absolute_path = path.join(__dirname, "/src/temp_audio/", file_name);
    var relative_path = path.join("./src/temp_audio/", file_name); // path relative to server root

    // set up archiver
    var archive = archiver('zip', {
        zlib: { level: 9 } // sets the compression level  
    });
    var zip_write_stream = fs.createWriteStream(`${relative_path}.zip`);
    archive.pipe(zip_write_stream);

    for (const [index, text_chunk] of text_chunk_array.entries()) {

        // check if last value of array 
        const isLastIndex = index === text_chunk_array.length - 1;

        var textToSpeech = new TextToSpeechV1({
            iam_apikey: iam_apikey,
            url: tts_service_url
        });

        var synthesizeParams = {
            text: text_chunk,
            accept: 'audio/mp3',
            voice: 'en-US_AllisonV3Voice'
        };

        try {

            var audio_readable_stream = await textToSpeech.synthesize(synthesizeParams);

            await new Promise(function(resolve, reject) {

                // write individual files to disk 
                var file_name = `${relative_path}_${index}.mp3`;
                var write_stream = fs.createWriteStream(`${file_name}`);
                audio_readable_stream.pipe(write_stream);

                // on finish event of individual file write
                write_stream.on('finish', function() {

                    // add file to archive
                    archive.file(file_name, { name: `audio_${index}.mp3` });

                    // if not the last value of the array
                    if (isLastIndex === false) {
                        resolve();
                    } 
                    // if the last value of the array 
                    else if (isLastIndex === true) {
                        resolve();

                        // when zip file has finished writing,
                        // send it back to client, and delete temp files from server
                        zip_write_stream.on('close', function() {

                            // download the zip file (using absolute_path)  
                            res.download(`${absolute_path}.zip`, (err) => {
                                if (err) {
                                    console.log(err);
                                }

                                // delete each audio file (using relative_path)
                                for (let i = 0; i < text_chunk_array.length; i++) {
                                    fs.unlink(`${relative_path}_${i}.mp3`, (err) => {
                                        if (err) {
                                            console.log(err);
                                        }
                                        console.log(`AUDIO FILE ${i} REMOVED!`);
                                    });
                                }

                                // delete the zip file
                                fs.unlink(`${relative_path}.zip`, (err) => {
                                    if (err) {
                                        console.log(err);
                                    }
                                    console.log(`ZIP FILE REMOVED!`);
                                });

                            });


                        });

                        // from archiver readme examples  
                        archive.on('warning', function(err) {
                            if (err.code === 'ENOENT') {
                                // log warning
                            } else {
                                // throw error
                                throw err;
                            }
                        });

                        // from archiver readme examples  
                        archive.on('error', function(err) {
                            throw err;
                        });

                        // from archiver readme examples   
                        archive.finalize();
                    }
                });

            });

        } catch (err) {
            console.log("oh dear, there was an error: ");
            console.log(err);
        }
    }

}

Learning Experiences

Other issues that came up during this process are documented below:

Long requests time out when using node (and resend the request)...

// solution  
req.connection.setTimeout( 1000 * 60 * 10 ); // ten minutes

See: https://github.com/expressjs/express/issues/2512

400 errors caused by node max header size of 8KB (query string is included in header size)...

// solution (although probably not recommended - better to get text_string_array from server, rather than client) 
node --max-http-header-size 80000 app.js

See: https://github.com/nodejs/node/issues/24692

Effectuate answered 4/8, 2019 at 13:1 Comment(0)
