Download a file using Nightmare

I am using Nightmare to create an automated downloader for today's newspaper. I managed to log in and go to the specified page. However, I could not find out how to download a file with Nightmare.

var Nightmare = require('nightmare');
new Nightmare()
  .goto('https://login.nrc.nl/login?service=http://digitaleeditie.nrc.nl/welkom')
    .type('input[name="username"]', 'Username')
    .type('input[name="password"]','Password')
    .click('button[type="submit"]')
    .wait()
    .goto('http://digitaleeditie.nrc.nl/digitaleeditie/NH/2014/10/20141124___/downloads.html')
    .wait()
    .click('a[href="/digitaleeditie/helekrant/epub/nrc_20141124.epub"]')
    .wait()

    .url(function(url) {
        console.log(url)
    })
    .run(function (err, nightmare) {
      if (err) return console.log(err);
      console.log('Done!');
    });

I tried to download the file by clicking on the download link, but this does not seem to work.

Stilliform answered 24/11, 2014 at 16:32 Comment(0)

PhantomJS (and CasperJS and Nightmare) don't trigger a download (dialog) when you click on something that should be downloaded. So, it is necessary to download it yourself. If you can find out the URL of the file, then it can be easily downloaded using an XMLHttpRequest from the page context.

So you need to exchange

.click('a[href="/digitaleeditie/helekrant/epub/nrc_20141124.epub"]')

for

.evaluate(function ev(){
    var el = document.querySelector("[href*='nrc_20141124.epub']");
    var xhr = new XMLHttpRequest();
    xhr.open("GET", el.href, false);
    xhr.overrideMimeType("text/plain; charset=x-user-defined");
    xhr.send();
    return xhr.responseText;
}, function cb(data){
    var fs = require("fs");
    fs.writeFileSync("book.epub", data, "binary");
})
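
Why this works: with the `x-user-defined` charset, every byte of the response is passed through as one character. Bytes 0x80 to 0xFF land in the private-use range U+F780 to U+F7FF, so the low 8 bits of each character code are the original byte. Below is a minimal sketch of that decoding step, runnable in plain Node. The helper names `binaryStringToBuffer` and `simulateXUserDefined` are hypothetical, and the second one only imitates what the browser does.

```javascript
// Hypothetical helper: recover raw bytes from a string that was
// decoded with "text/plain; charset=x-user-defined".
function binaryStringToBuffer(text) {
  var bytes = Buffer.alloc(text.length);
  for (var i = 0; i < text.length; i++) {
    // Keep only the low byte; the high byte (0xF7) is an artifact of the charset.
    bytes[i] = text.charCodeAt(i) & 0xff;
  }
  return bytes;
}

// Imitates the browser-side decoding, for demonstration only.
function simulateXUserDefined(bytes) {
  var out = "";
  for (var i = 0; i < bytes.length; i++) {
    var b = bytes[i];
    out += String.fromCharCode(b < 0x80 ? b : 0xf700 + b);
  }
  return out;
}

// An EPUB is a ZIP archive, so a real file starts with the bytes "PK\x03\x04".
var original = Buffer.from([0x50, 0x4b, 0x03, 0x04, 0x80, 0xff]);
var restored = binaryStringToBuffer(simulateXUserDefined(original));
console.log(restored.equals(original));
```

If the saved file does not start with `PK`, the response was probably not the archive itself (for example an error or login page).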

You can also use the newer way of requesting binary data.

.evaluate(function ev(){
    var el = document.querySelector("[href*='.pdf']");
    var xhr = new XMLHttpRequest();
    xhr.open("GET", el.href, false);
    xhr.responseType = "arraybuffer";
    xhr.send();

    var bytes = [];
    var array = new Uint8Array(xhr.response);
    for (var i = 0; i < array.length; i++) {
        bytes[i] = array[i];
    }
    return bytes;
}, function cb(data){
    var fs = require("fs");
    // Buffer.from replaces the deprecated new Buffer(); no encoding is needed for a Buffer.
    fs.writeFileSync("book.epub", Buffer.from(data));
})
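
The byte-by-byte copy into a plain array is there because, as far as I can tell, the return value of `evaluate` has to survive serialization between the page and Node, and a typed array does not come through as an array. Here is a small sketch of that round trip, plus a guard for an empty result, which is what triggers a `Cannot read property 'length' of null` error in the callback. `toBuffer` is a hypothetical helper name, and `JSON` merely stands in for whatever transport Nightmare actually uses.

```javascript
// Hypothetical helper for the Node-side callback: validate, then wrap in a Buffer.
function toBuffer(data) {
  if (!Array.isArray(data)) {
    throw new TypeError("evaluate() returned no byte array; the request probably failed");
  }
  return Buffer.from(data);
}

var bytes = new Uint8Array([80, 75, 3, 4]);

// A typed array serializes as an object keyed by index, not as an array...
console.log(JSON.stringify(bytes));

// ...whereas a plain array of numbers round-trips unchanged.
var plain = JSON.parse(JSON.stringify(Array.from(bytes)));
console.log(toBuffer(plain));
```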

Both approaches are described on MDN. Here is a sample script which shows a proof of concept.

Faustena answered 24/11, 2014 at 21:17 Comment(10)
I tried to implement this. However, it only downloads a 4k file with the same name. It does not download the whole file. - Stilliform
4k is a bit arbitrary. What is the content? Maybe this is an error page. - Faustena
It is an epub file of size 4k. If opened in a text editor, it only contains null. - Stilliform
You can try the only other way and see if the page supports it. I updated my answer. - Faustena
The second option throws an error:
stream.js:94
      throw er; // Unhandled stream error in pipe.
            ^
TypeError: Cannot read property 'length' of null
    at new Buffer (buffer.js:184:31)
    at cb (/home/nrclogin.js:32:35)
    at wrapped (/home/node_modules/nightmare/lib/actions.js:324:14)
    at Proto.apply (/home/node_modules/nightmare/node_modules/phantom/node_modules/dnode/node_modules/dnode-protocol/index.js:123:13)
Not really sure what to do. - Stilliform
I too am trying to download a file that is not at a URL, but rather gets triggered via JavaScript. How can I download such a file? - Melgar
@Melgar I presume my answer didn't help you. I don't know how else it may work, but have you tried the newest Nightmare version? It works with Electron and I would think that there is something like this included. - Faustena
@ArtjomB. I stumbled on this question looking for the same thing as the OP. The problem cannot be resolved with a simple XMLHttpRequest like yours. We can clearly see that Nightmare first logs in before trying to fetch the file. A basic (or 'nude') GET request would not be able to obtain the file... I think one should take all cookies from Nightmare and use them correctly. - Halpern
@Halpern That may be true, but I don't know of any site where I can properly test this (login+download). Maybe I will decide to create an account on the OP's site. Also, are you sure that this is still an issue with Nightmare 2, since it uses Electron now? - Faustena
@ArtjomB. I don't know, but I am still trying to figure it out for myself... Maybe the docs are just too sparse. - Halpern
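
One way to act on the cookie idea from the comments above is to read the session cookies from Nightmare and replay them in an ordinary Node request. This is a rough sketch under several assumptions: it relies on the `.cookies.get()` action of Nightmare 2 and later, which yields objects with `name` and `value` fields; `cookieHeader` and `downloadWithCookies` are hypothetical helper names; and it has not been tested against the site in the question.

```javascript
var fs = require("fs");
var http = require("http");
var https = require("https");

// Hypothetical helper: turn a list of {name, value} cookies into a Cookie header value.
function cookieHeader(cookies) {
  return cookies.map(function (c) {
    return c.name + "=" + c.value;
  }).join("; ");
}

// Hypothetical helper: stream `uri` to `filename`, sending the given cookies along.
// Redirects are not followed in this sketch.
function downloadWithCookies(uri, cookies, filename, callback) {
  var client = uri.indexOf("https:") === 0 ? https : http;
  client.get(uri, { headers: { Cookie: cookieHeader(cookies) } }, function (res) {
    if (res.statusCode !== 200) {
      res.resume(); // discard the body
      return callback(new Error("Unexpected status " + res.statusCode));
    }
    res.pipe(fs.createWriteStream(filename))
      .on("finish", function () { callback(null); })
      .on("error", callback);
  }).on("error", callback);
}

// Example cookie list in the shape the sketch expects (values are made up).
console.log(cookieHeader([
  { name: "session", value: "abc123" },
  { name: "lang", value: "nl" }
]));
```

After logging in, this could be wired up roughly as `nightmare.cookies.get().then(function (cookies) { downloadWithCookies(fileUrl, cookies, "book.epub", done); })`, where `fileUrl` and `done` are placeholders for the download URL and a completion callback.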

There is a download plugin for Nightmare, nightmare-download-manager. With it, you can download the file using the code below:

var Nightmare = require('nightmare');
require('nightmare-download-manager')(Nightmare);
var nightmare = Nightmare();
nightmare.on('download', function(state, downloadItem){
  if(state == 'started'){
    nightmare.emit('download', '/some/path/file.zip', downloadItem);
  }
});

nightmare
  .downloadManager()
  .goto('https://github.com/segmentio/nightmare')
  .click('a[href="/segmentio/nightmare/archive/master.zip"]')
  .waitDownloadsComplete()
  .then(() => {
    console.log('done');
  });
Merovingian answered 20/12, 2016 at 16:6 Comment(0)

I got my downloads super easy using the request module, as described here.

var Nightmare = require('nightmare');
var fs = require('fs');
var request = require('request');

new Nightmare()
  .goto('https://login.nrc.nl/login?service=http://digitaleeditie.nrc.nl/welkom')
  .insert('input[name="username"]', 'Username')
  .insert('input[name="password"]','Password')
  .click('button[type="submit"]')
  .wait()
  .goto('http://digitaleeditie.nrc.nl/digitaleeditie/NH/2014/10/20141124___/downloads.html')
  .wait()
  .then(function () {
    download('http://digitaleeditie.nrc.nl/digitaleeditie/helekrant/epub/nrc_20141124.epub', 'myBook.epub', function () {
      console.log('done');
    });
  })
  .catch(function (err) {
    console.log(err);
  })

function download(uri, filename, callback) {
  request.head(uri, function () {
    request(uri).pipe(fs.createWriteStream(filename)).on('close', callback);
  });
}

Run npm i request in order to use request.

Walliw answered 9/9, 2016 at 5:26 Comment(1)
Error in code: missing ',' between args when invoking download(). - Ridicule

Nightmare will download it properly if you click on the download link.

const Nightmare         = require('nightmare');
// Default to an empty string so the script does not throw when no argument is passed.
const show              = (process.argv[2] || "").includes("true");
const nightmare         = Nightmare( { show: show } );

nightmare
    .goto("https://github.com/segmentio/nightmare")
    .click('a[href="/segmentio/nightmare/archive/master.zip"]')
    .end(() => "Done!")
    .then((value) => console.log(value));
Succinct answered 19/3, 2017 at 22:11 Comment(0)
