How can I download and save a file using the Fetch API? (Node.js)
I have the url to a possibly large (100+ Mb) file, how do I save it in a local directory using fetch?

I looked around but there don't seem to be a lot of resources/tutorials on how to do this.

Ephraim answered 3/6, 2016 at 12:40
Node.js doesn't have Fetch integrated. – Sharie
Why fetch? Node has http support? – Pyrrha
I'm creating an Electron app, and fetch is supported there. Why fetch instead of plain http? Because it's a lot easier to use (or so it has seemed so far). – Ephraim
If someone is looking for a way to save a file using the Fetch API in the browser (and came across this answer), please take a look here: https://mcmap.net/q/48226/-how-can-i-download-a-file-using-window-fetch – Alasdair
See below for an example that uses the native Node.js http / https libraries. Note that I don't have to deal with 301/302, so it is straightforward. – Updo
@BenFortune it does as of 18+ – Dotdotage
Updated solution on Node 18:

const fs = require("fs");
const { mkdir } = require("fs/promises");
const { Readable } = require("stream");
const { finished } = require("stream/promises");
const path = require("path");

const downloadFile = async (url, fileName) => {
  const res = await fetch(url);
  if (!fs.existsSync("downloads")) await mkdir("downloads"); // optional if you already have a downloads directory
  const destination = path.resolve("./downloads", fileName);
  const fileStream = fs.createWriteStream(destination, { flags: "wx" });
  await finished(Readable.fromWeb(res.body).pipe(fileStream));
};

await downloadFile("<url_to_fetch>", "<fileName>")

Old answer (works up to Node 16):

Using the Fetch API, you can write a function that downloads a file from a URL like this:

You will need node-fetch@2; install it with npm i node-fetch@2.

const fetch = require("node-fetch");
const fs = require("fs");

const downloadFile = async (url, path) => {
  const res = await fetch(url);
  const fileStream = fs.createWriteStream(path);
  await new Promise((resolve, reject) => {
    res.body.pipe(fileStream);
    res.body.on("error", reject);
    fileStream.on("error", reject); // also surface write errors
    fileStream.on("finish", resolve);
  });
};
Horsepowerhour answered 12/7, 2018 at 9:51
You could even make it a little shorter by writing res.body.on('error', reject); and fileStream.on('finish', resolve);. – Centrality
This gives an error: res.body.pipe is not a function. Node.js v18 – Singsong
The function which calls downloadFile does not wait for it to resolve the promise. I'm calling it like this: await downloadFile(URL, path). Would you mind correcting me? – Germinate
@Singsong try importing and using 'node-fetch' instead of the built-in fetch – Lur
Just a style preference, but especially for short example code I much prefer the explicit async function downloadFile style over const somevar = – Zoroastrian
Shouldn't an async function always return a Promise, rather than awaiting one? – Vespertilionine
@serverpunk downloadFile will still return an empty promise because of the async keyword, but it won't resolve until the inner anonymous promise has been awaited. – Firmament
@Firmament This might be a minor detail, but it seems like downloadFile should return the new Promise and the outer code should call await downloadFile(), if I'm understanding the expected behavior of async functions correctly. – Vespertilionine
@serverpunk That entirely depends on what one wants the behavior of downloadFile to be! What I think you're describing would be effectively the same as the current answer. – Firmament
@Zoroastrian en.wiktionary.org/wiki/bikeshedding – Bradney
@AhmedFasih I thought that was implied by my saying "just style preferences", but I'm happy that you were still able to identify what I meant, good for you! :) – Zoroastrian
This does not work on Node v18. I think stackoverflow.com/a/74722818 is a better solution in 2023. – Pretense
The Node 16 answer works on Node 18; the Node 18 answer leads to a 0-byte file. – Stokowski
It should be noted that fromWeb is flagged as experimental. – Discreet
Older answers here involve node-fetch, but since Node.js v18.x this can be done with no extra dependencies.

The body of a fetch response is a web stream. It can be converted to a Node.js Readable stream using Readable.fromWeb, which can then be piped into a write stream created by fs.createWriteStream. If desired, the resulting stream can be turned into a Promise using the promise version of stream.finished.

const fs = require('fs');
const { Readable } = require('stream');
const { finished } = require('stream/promises');

const stream = fs.createWriteStream('output.txt');
const { body } = await fetch('https://example.com');
await finished(Readable.fromWeb(body).pipe(stream));
Polypary answered 7/12, 2022 at 21:4
That can also be nicely compacted into one line: const download = async (url, path) => Readable.fromWeb((await fetch(url)).body).pipe(fs.createWriteStream(path)) – Bromide
Does this download the entire file (await fetch(...)) before starting the write stream? – Overfeed
@Overfeed await fetch(...) finishes after the response headers are fully received, but before the response body is received. The body is streamed into the file while it arrives. The second await can be omitted to perform other tasks while the body stream is still in progress. – Polypary
Argument of type 'ReadableStream<Uint8Array>' is not assignable to parameter of type 'ReadableStream<any>'. Type 'ReadableStream<Uint8Array>' is missing the following properties from type 'ReadableStream<any>': values, [Symbol.asyncIterator] ts(2345) – Dividivi
@Dividivi Unfortunately it looks like there are two different ReadableStream definitions, as per #63630614. You should be able to cast body to the correct ReadableStream from 'stream/web'; i.e. import { ReadableStream } from 'stream/web'; and body as ReadableStream<any>. – Polypary
Ends in a 0-byte file for me. – Stokowski
Could probably be rewritten with a proper import, as Node supports that easily. – Keffer
Why is Readable.fromWeb() even necessary if body is already a ReadableStream? – Mcloughlin
If you want to avoid explicitly constructing a Promise as in the other very fine answer, and are OK with buffering the entire 100+ MB file in memory, you can do something simpler:

const fetch = require('node-fetch');
const {writeFile} = require('fs/promises');

function downloadFile(url, outputPath) {
  return fetch(url)
      .then(x => x.arrayBuffer())
      .then(x => writeFile(outputPath, Buffer.from(x)));
}

But the other answer will be more memory-efficient since it's piping the received data stream directly into a file without accumulating all of it in a Buffer.

Bradney answered 27/12, 2018 at 3:56
I have tried this code but got an error: [Error: EISDIR: illegal operation on a directory, open 'D:\Work\repo\'] { errno: -4068, code: 'EISDIR', syscall: 'open', path: 'D:\\Work\\repo\\' } – Croissant
@ScottJones EISDIR means "Error: Is Directory": you're giving Node a directory where it expects a file. Use d:\work\repo\file.txt, for example. – Bradney
This is now easy using modern Node.js APIs. It does not read the entire file into memory at once, so it can be used with huge files and is great for performance.

import { writeFile } from 'node:fs/promises'
import { Readable } from 'node:stream'

const response = await fetch('https://example.com/pdf')
const body = Readable.fromWeb(response.body)
await writeFile('document.pdf', body)
Intaglio answered 1/10, 2023 at 11:10
I believe that stores the intermediate result in memory, which might be undesirable, especially for larger files. – Hyps
Updated the answer. It no longer loads the file into memory. – Intaglio
const {createWriteStream} = require('fs');
const {pipeline} = require('stream/promises');
const fetch = require('node-fetch');

const downloadFile = async (url, path) => pipeline(
    (await fetch(url)).body,
    createWriteStream(path)
);
Calif answered 19/8, 2020 at 16:48
I get the error TypeError: Cannot read property 'on' of undefined at destroyer (internal/streams/pipeline.js:23:10) – Garv
import { existsSync } from "fs";
import { mkdir, writeFile } from "fs/promises";
import { join } from "path";

export const download = async (url: string, ...folders: string[]) => {
    const fileName = url.split("/").pop()!;

    const path = join("./downloads", ...folders);

    // recursive: true creates nested subfolders in one call
    if (!existsSync(path)) await mkdir(path, { recursive: true });

    const filePath = join(path, fileName);

    const response = await fetch(url);

    const blob = await response.blob();

    // const bos = Buffer.from(await blob.arrayBuffer())
    const bos = blob.stream();

    await writeFile(filePath, bos);

    return { path, fileName, filePath };
};

// call like that ↓
await download("file-url", "subfolder-1", "subfolder-2", ...)
Symphonia answered 5/8, 2022 at 22:30
Your answer could be improved by adding more information on what the code does and how it helps the OP. – Mccay
This will store the whole 100 MB file in memory before writing it, which might work, but you probably want to avoid that if possible. – Massorete
I was looking for a similar use case: I wanted to fetch a bunch of API endpoints and save the JSON responses to static files, so I came up with my own solution. Hope it helps.

const fetch = require('node-fetch'),
    fs = require('fs'),
    VERSIOINS_FILE_PATH = './static/data/versions.json',
    endpoints = [
        {
            name: 'example1',
            type: 'exampleType1',
            url: 'https://example.com/api/url/1',
            filePath: './static/data/exampleResult1.json',
            updateFrequency: 7 // days
        },
        {
            name: 'example2',
            type: 'exampleType1',
            url: 'https://example.com/api/url/2',
            filePath: './static/data/exampleResult2.json',
            updateFrequency: 7
        },
        {
            name: 'example3',
            type: 'exampleType2',
            url: 'https://example.com/api/url/3',
            filePath: './static/data/exampleResult3.json',
            updateFrequency: 30
        },
        {
            name: 'example4',
            type: 'exampleType2',
            url: 'https://example.com/api/url/4',
            filePath: './static/data/exampleResult4.json',
            updateFrequency: 30
        },
    ],
    checkOrCreateFolder = () => {
        var dir = './static/data/';
        if (!fs.existsSync(dir)) {
            fs.mkdirSync(dir);
        }
    },
    syncStaticData = () => {
        checkOrCreateFolder();
        let fetchList = [],
            versions = [];
        endpoints.forEach(endpoint => {
            if (requiresUpdate(endpoint)) {
                console.log(`Updating ${endpoint.name} data... : `, endpoint.filePath);
                fetchList.push(endpoint)
            } else {
                console.log(`Using cached ${endpoint.name} data... : `, endpoint.filePath);
                let endpointVersion = JSON.parse(fs.readFileSync(endpoint.filePath, 'utf8')).lastUpdate;
                versions.push({
                    name: endpoint.name + "Data",
                    version: endpointVersion
                });
            }
        })
        if (fetchList.length > 0) {
            Promise.all(fetchList.map(endpoint => fetch(endpoint.url, { "method": "GET" })))
                .then(responses => Promise.all(responses.map(response => response.json())))
                .then(results => {
                    results.forEach((endpointData, index) => {
                        let endpoint = fetchList[index]
                        let processedData = processData(endpoint.type, endpointData.data)
                        let fileData = {
                            data: processedData,
                            lastUpdate: Date.now() // unix timestamp
                        }
                        versions.push({
                            name: endpoint.name + "Data",
                            version: fileData.lastUpdate
                        })
                        fs.writeFileSync(endpoint.filePath, JSON.stringify(fileData));
                        console.log('updated data: ', endpoint.filePath);
                    })
                })
                .catch(err => console.log(err));
        }
        fs.writeFileSync(VERSIOINS_FILE_PATH, JSON.stringify(versions));
        console.log('updated versions: ', VERSIOINS_FILE_PATH);
    },
    recursiveRemoveKey = (object, keyname) => {
        object.forEach((item) => {
            if (item.items) { //items is the nesting key, if it exists, recurse , change as required
                recursiveRemoveKey(item.items, keyname)
            }
            delete item[keyname];
        })
    },
    processData = (type, data) => {
        //any thing you want to do with the data before it is written to the file
        let processedData = type === 'exampleType1' ? processType1Data(data) : processType2Data(data);
        return processedData;
    },
    processType1Data = data => {
        let fetchedData = [...data]
        recursiveRemoveKey(fetchedData, 'count')
        return fetchedData
    },
    processType2Data = data => {
        let fetchedData = [...data]
        recursiveRemoveKey(fetchedData, 'keywords')
        return fetchedData
    },
    requiresUpdate = endpoint => {
        if (fs.existsSync(endpoint.filePath)) {
            let fileData = JSON.parse(fs.readFileSync(endpoint.filePath));
            let lastUpdate = fileData.lastUpdate;
            let now = new Date();
            let diff = now - lastUpdate;
            let diffDays = Math.ceil(diff / (1000 * 60 * 60 * 24));
            if (diffDays >= endpoint.updateFrequency) {
                return true;
            } else {
                return false;
            }
        }
        return true
    };

syncStaticData();

link to github gist

Antibes answered 10/8, 2022 at 5:49
If you don't need to deal with 301/302 responses (when things have been moved), you can actually just do it in one line with the Node.js native libraries http and/or https.

You can run this example one-liner in the Node shell. It uses the https module to download a GNU zip file of some source code to the directory where you started the Node shell. (You start a Node shell by typing node at the command line on any OS where Node.js is installed.)

require('https').get("https://codeload.github.com/angstyloop/js-utils/tar.gz/refs/heads/develop", it => it.pipe(require('fs').createWriteStream("develop.tar.gz")));

If you don't need/want HTTPS use this instead:

require('http').get("http://codeload.github.com/angstyloop/js-utils/tar.gz/refs/heads/develop", it => it.pipe(require('fs').createWriteStream("develop.tar.gz")));

Updo answered 4/10, 2022 at 16:10
This got the job done for me on Node 18 and presumably 16. Its only dependencies are fs and node-fetch (it probably works with other fetch libraries too).

const fs = require('fs');
const fetch = require("node-fetch");

async function downloadImage(imageUrl){
    // imageUrl e.g. https://example.com/uploads/image.jpg
    const fileName = imageUrl.split('/').pop(); // image.jpg
    const res = await fetch(imageUrl);
    const fileStream = fs.createWriteStream(`./folder/${fileName}`);
    await new Promise((resolve, reject) => {
        res.body.pipe(fileStream);
        res.body.on("error", reject);
        fileStream.on("finish", resolve);
    });
};

The previous top answer by @code_wrangler was split into Node 16 and Node 18 solutions (this is like the Node 16 solution), but on Node 18 the Node 18 solution created a 0-byte file for me and cost me some time.

Stokowski answered 27/6, 2023 at 19:59

© 2022 - 2024 — McMap. All rights reserved.