Puppeteer TimeoutError: Navigation timeout of 30000 ms exceeded
R

2

18

I'm developing a web screen-capture app with Node.js and Google Puppeteer. I have to capture 38,000 pages. Most of it works fine, but some pages fail with errors and I don't know where the errors come from.

I have two hypotheses. First, I used the headless option to investigate and found that some pages contain a lot of GIF files, so they take too long to load and the timeout error appears. Second, the website itself sometimes fails to load, which also produces the error.

Here's my full code

const puppeteer = require("puppeteer");
const fs = require('fs');

let galleryName = "frozen"; // Enter gallery name

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Adjustments particular to this page to ensure we hit desktop breakpoint.
  await page.setViewport({
    width: 1000,
    height: 10000000,
    deviceScaleFactor: 1
  });

  fs.readFile('db.txt', async function (err, data) {
    if (err) throw err;
    let array = data.toString().split("\n");
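    for (const i in array) {
      console.log(`Now Processing : ${array[i]} | ${array.length - i -1} items left`);
      // The options object must be the second argument of goto();
      // placed after the closing parenthesis it is silently ignored.
      await page.goto(`https://gall.dcinside.com/${galleryName}/${array[i]}`, {
        waitUntil: "networkidle2",
        // timeout: 0
      });
      // waitForSelector() has no waitUntil option, so none is passed.
      await page.waitForSelector(".view_content_wrap");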
      /* ScreenShot Functions */
      async function screenshotDOMElement(opts = {}) {
        const padding = "padding" in opts ? opts.padding : 0;
        const path = "path" in opts ? opts.path : null;
        const selector = opts.selector;

        if (!selector) throw Error("Please provide a selector.");

        const rect = await page.evaluate(selector => {
          const element = document.querySelector(selector);
          if (!element) return null;
          const {
            x,
            y,
            width,
            height
          } = element.getBoundingClientRect();
          return {
            left: x,
            top: y,
            width,
            height,
            id: element.id
          };
        }, selector);

        if (!rect)
          throw Error(
            `Could not find element that matches selector: ${selector}.`
          );

        return await page.screenshot({
          path,
          clip: {
            x: rect.left - padding,
            y: rect.top - padding,
            width: rect.width,
            height: rect.height + padding * 2
          }
        });
      }

      await screenshotDOMElement({
        path: `./result/${array[i]}.png`,
        selector: ".view_content_wrap",
        padding: 10
      });
    }
  });
  //   // await browser.close();
})();
Raquelraquela answered 4/2, 2020 at 6:46 Comment(2)
Not directly relevant, but if you want to scale an application like this, try using the mariadb connector. I have very similar code in one of my applications, in particular the part that reads data from a text file. You could do something as simple as dumping that data into an SQL table, which lets you extend your dataset (add new columns) relatively easily. You will also be able to handle more than a few 100k records easily. You can keep part of this code to populate the DB automatically from your text file. - Afton
This is missing a minimal reproducible example. Every site is different, so there's no way to say why it failed. Some sites load slowly, in which case a slight timeout increase can get you over the hump. But usually, sites are blocking you outright. Try the advice in "Why does headless need to be false for Puppeteer to work?", like adding a user agent. Also, try using page.goto(url, {waitUntil: "domcontentloaded"}), which resolves the goto() as fast as possible, without waiting for the default "load" event. - Gaitskell
M
27

Before await page.goto, try:

await page.setDefaultNavigationTimeout(0)

Or put { waitUntil: 'load', timeout: 0 } inside .goto options.
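
A minimal sketch of both forms, assuming `page` is a Puppeteer Page. `openPage` is a hypothetical helper name, not part of Puppeteer. The point it illustrates is placement: the options object goes inside the goto() call as its second argument.

```javascript
// Hypothetical helper; `page` is assumed to be a Puppeteer Page.
// A timeout of 0 disables the navigation timeout entirely.
async function openPage(page, url, timeoutMs = 0) {
  // Form 1: change the default for every later navigation on this page.
  page.setDefaultNavigationTimeout(timeoutMs);
  // Form 2: per-call options, passed as the second argument to goto().
  return page.goto(url, { waitUntil: 'load', timeout: timeoutMs });
}
```

Either form alone should be enough; both are shown here only so they can be compared side by side.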

Marie answered 4/2, 2020 at 7:34 Comment(2)
I got this error on CircleCI because I hadn't set up the proxy setting in the Puppeteer launch args, so the test URL was unreachable. Here is my solution: const browser = await puppeteer.launch({ args: ['--proxy-server=http://your-proxy-url:port'], }); Reference: https://mcmap.net/q/740786/-puppeteer-keeps-getting-timeouterror-navigation-timeout-of-80000-ms-exceeded - Satirist
This is not good advice. Setting the navigation timeout to 0 makes it infinite, so the script hangs forever on sites that suppress the "load" event. A better solution would be page-specific, likely running headfully, adding a user agent string, or using "domcontentloaded" to handle sites that suppress the "load" event. If you must increase the timeout, make it 3 minutes. - Gaitskell
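
One way the bounded-timeout advice could look in a long batch job, as a sketch only. `tryGoto`, `timeoutMs` and `retries` are hypothetical names. It assumes `page` is a Puppeteer Page and that a navigation timeout rejects with an error whose name is "TimeoutError", as in the message in the title.

```javascript
// Hypothetical helper; `page` is assumed to be a Puppeteer Page.
// Navigates with a bounded timeout and returns false on repeated timeouts
// instead of throwing, so the caller can log the URL and move on.
async function tryGoto(page, url, { timeoutMs = 180000, retries = 1 } = {}) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      await page.goto(url, { waitUntil: 'domcontentloaded', timeout: timeoutMs });
      return true;
    } catch (err) {
      // Only swallow timeouts; anything else is a real failure.
      if (err.name !== 'TimeoutError') throw err;
      console.warn(`Timeout on ${url} (attempt ${attempt + 1} of ${retries + 1})`);
    }
  }
  return false;
}
```

In the question's loop this could be used as `if (!(await tryGoto(page, url))) continue;` so that one slow page does not stop the remaining captures.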
D
0

As ggorlen describes, the accepted answer is not advisable. If the page keeps a socket open, the call will never finish. A better approach is to use waitForSelector. That way you can also make sure that dynamically rendered content has been displayed. If you control the page, add an element once rendering is done and pass its id to waitForSelector.
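
A rough sketch of that idea, assuming `page` is a Puppeteer Page. `waitForContent` is a hypothetical helper name, and the `.view_content_wrap` selector in the usage note is taken from the question.

```javascript
// Hypothetical helper; `page` is assumed to be a Puppeteer Page.
// Waits for an element to appear, with a bounded timeout, and reports
// whether it showed up instead of throwing on a timeout.
async function waitForContent(page, selector, timeoutMs = 30000) {
  try {
    await page.waitForSelector(selector, { timeout: timeoutMs });
    return true;
  } catch (err) {
    // Only a timeout means "element never appeared"; rethrow anything else.
    if (err.name !== 'TimeoutError') throw err;
    return false;
  }
}
```

For the question's pages, `await waitForContent(page, '.view_content_wrap')` would return false for a page whose content never renders, and the caller can skip the screenshot for that page.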

Destinee answered 3/6 at 20:55 Comment(0)
