Setting puppeteer page.goto retry limit

I've written some code that loads a web form, submits it, processes the response, and returns it. I've run into an issue where the "page.goto" that initially loads the form sometimes gets stuck. Maybe my internet connection isn't very stable, or maybe the site isn't. I added a "retry", and usually it retries a few times and then works. However, sometimes it retries forever, or it just gets stuck without ever printing "retrying...". So I tried adding "retrycount" to stop after a certain number of tries. However, when that limit is triggered, the program just hangs. I would have expected it to return "[]". I am somewhat new to Node's style of programming, so this might not be the most elegant code.

My issue is specifically with trying to set a limit on retries of the "page.goto" when initially loading the form. Based on the code below, I would expect the "try/catch" below to fail, get the "Browser Error" message and then the function to just return. Instead it hangs.

processURL is called once for each URL in a list.

async function processURL(url) {
    const browser = await puppeteer.launch(...);
    const page = await browser.newPage();
    let results = null;
    let retrycount = 0;
    const retry = (fn, ms, retrycount) => new Promise(resolve => {
        fn()
            .then(resolve)
            .catch(() => {
                if (retrycount++ < 5) {
                    setTimeout(() => {
                        console.log(new Date().toLocaleTimeString('en-GB'), 'retrying...');
                        retry(fn, ms).then(resolve);
                    }, ms);
                } else {
                    console.log(new Date().toLocaleTimeString('en-GB'), 'retry failed');
                }

            });
    });
    //Load url form
    await retry(() => page.goto(url, {
        waitUntil: 'networkidle0'
    }), 500, retrycount);
    page.on('response', async (msg) => {
        //process response from form submission
        results = process_response(msg);
    });
    try {
        //Fill out form and submit
        await page.evaluate(() => {
            document.querySelector("#select").click();
        });
        await page.click('#button');
        //Wait for response, timeout after 60sec
        retries = 0;
        while (results === null && retries < 120) {
            await timeout(500);
            retries++;
        }
    } catch (error) {
        console.log(new Date().toLocaleTimeString('en-GB'), 'Browser error', error)
    };
    // HANG?  await page.close();
    await browser.close();
    if (results === null) {
        return [];
    }
    return results;
}
Bihar answered 9/4, 2023 at 18:11 Comment(4)
Does this answer your question? How do I return the response from an asynchronous call? - Schoenfelder
Try page.waitForResponse. Using a while loop to wait for a callback is very hacky. It's hard to help beyond that since this isn't a minimal reproducible example. - Schoenfelder
Maybe I posted too much code. The "page.goto" loads a web form. I submit the form, then process the response with "page.on". Because it is async, I have the while loop. I agree it is hacky, so I should fix that. But my question is about the "page.goto" that loads the initial form. Sometimes it doesn't load and times out, so I retry, but I am having trouble setting a limit on the retries. The problem has nothing to do with the later code (processing the response). - Bihar
That looks pretty hacky too. Promises are supposed to be easy to use and almost never need to be combined with callbacks. A simple for loop with an await setTimeout(someDelay), imported from node:timers/promises, is enough to retry with a wait time between the retries. Another tip: never combine await and then. - Schoenfelder
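
The last comment's suggestion might look something like the sketch below. It is not the asker's code or an official Puppeteer pattern. The helper name retry and its options attempts and delayMs are made up for illustration, and it assumes page.goto rejects when loading fails, as the ".catch" in the question implies. Two things in the question's version probably explain the hang: the "retry failed" branch logs but never calls resolve (and the promise has no reject path), so the await never settles, and the recursive call retry(fn, ms) drops the retrycount argument. A for loop avoids both, because the function either returns a value or throws.

```javascript
// Promise-based setTimeout from Node's standard library, renamed to avoid
// shadowing the global callback-style setTimeout.
const { setTimeout: sleep } = require('node:timers/promises');

// Hypothetical helper: call fn up to `attempts` times, waiting `delayMs`
// milliseconds after each failed attempt except the last one.
async function retry(fn, { attempts = 5, delayMs = 500 } = {}) {
    let lastError;
    for (let attempt = 1; attempt <= attempts; attempt++) {
        try {
            return await fn(); // success: hand the result back to the caller
        } catch (error) {
            lastError = error;
            if (attempt < attempts) {
                console.log(new Date().toLocaleTimeString('en-GB'), 'retrying...');
                await sleep(delayMs);
            }
        }
    }
    // Throwing makes the returned promise reject, so the caller's try/catch
    // runs instead of waiting forever on a promise that never settles.
    throw lastError;
}

// Possible use inside processURL (page and url come from the question's code):
//
//     try {
//         await retry(() => page.goto(url, { waitUntil: 'networkidle0' }));
//     } catch (error) {
//         console.log(new Date().toLocaleTimeString('en-GB'), 'Browser error', error);
//         await browser.close();
//         return [];
//     }
```

Because the helper rethrows the last error once the attempts run out, the retry limit shows up to the caller as an ordinary rejected promise, which is what lets the function return "[]" as the question expects.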
