How to mock remote website responses to unit test Puppeteer code with Jest?

I'm implementing a web-scraping script that collects some useful information from a website.

The script is implemented using the Puppeteer library and is basically a set of instructions like:

  1. Start headless Chrome
  2. Open a new tab/page
  3. Get some pagination links from the HTML
  4. Open every pagination link found and scrape some information from the HTML

I'm looking for a way to test this functionality. Ideally, I want to pre-save the real HTML responses in a test folder, mock the external website's responses so that they are always the same, and then assert that the collected information is correct.

I'm familiar with several tools that can mock endpoints for the fetch function in the browser. I'm looking for something similar for Puppeteer.

So far, the only solution I can think of is to pass the browser instance into my script as a dependency and then mock the browser's newPage method to return a page with a custom interceptor. That looks like a lot of work, though.

Is there any other solution?

Nevernever answered 23/12, 2020 at 1:14 Comment(1)
Why do you want to mock the transport layer? Why not make the starting page URL an input, so you can point your scraper at a local server for testing? (Synecious)

A simple solution is to store the HTML page you want to test (or parts of it) locally and open it in Puppeteer, which can load local HTML files. The result can then be tested with a JavaScript test framework like Mocha or Jest.
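
As a minimal sketch of opening a saved page, assuming the fixtures live in a `__fixtures__` folder next to the test file: `fixtureUrl`, `collectPaginationLinks`, and the `a.page-link` selector are hypothetical names chosen for illustration, and only `page.goto` and `page.$$eval` are Puppeteer calls.

```javascript
const path = require('node:path');
const { pathToFileURL } = require('node:url');

// Hypothetical helper: turn a saved HTML file into a file:// URL
// that can be passed to page.goto().
function fixtureUrl(name, dir = path.join(__dirname, '__fixtures__')) {
  return pathToFileURL(path.join(dir, name)).href;
}

// Hypothetical scraper step: the start URL is a parameter, so a test can
// pass a fixture URL while production code passes the real site.
// The selector 'a.page-link' is an assumption about the page markup.
async function collectPaginationLinks(page, startUrl) {
  await page.goto(startUrl);
  return page.$$eval('a.page-link', anchors => anchors.map(a => a.href));
}
```

In a Jest test you would launch a browser, call `collectPaginationLinks(page, fixtureUrl('list.html'))`, and assert on the returned array. Relative links in the saved HTML resolve against the `file://` location, so the expected values will be `file://` URLs as well.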

If you need a real web server for the tests, another option is to use a library like Express to serve local HTML pages as a mock for a web server response. You can find an example in this search engine scraper which contains tests for scraping various search engines.
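
The same idea can be sketched without Express by using Node's built-in `http` module; this is a substitute for the Express setup mentioned above, not the linked project's code. `makeFixtureHandler` and `startFixtureServer` are hypothetical names, and the pages are passed in as an in-memory map from URL path to HTML.

```javascript
const http = require('node:http');

// Hypothetical helper: build a request handler that serves canned HTML,
// keyed by URL path (the query string is ignored).
function makeFixtureHandler(pages) {
  return (req, res) => {
    const pathname = new URL(req.url, 'http://localhost').pathname;
    const html = pages[pathname];
    if (html === undefined) {
      res.writeHead(404, { 'Content-Type': 'text/plain' });
      res.end('not found');
      return;
    }
    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    res.end(html);
  };
}

// Start the server on a random free port and resolve with its base URL
// plus a close() function for test teardown.
function startFixtureServer(pages) {
  return new Promise((resolve, reject) => {
    const server = http.createServer(makeFixtureHandler(pages));
    server.once('error', reject);
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      resolve({
        baseUrl: `http://127.0.0.1:${port}`,
        close: () => new Promise(done => server.close(done)),
      });
    });
  });
}
```

A test could call `startFixtureServer` in `beforeAll`, point the scraper at `baseUrl`, and call `close()` in `afterAll`. Unlike `file://` URLs, this keeps the page on an `http://` origin, which may matter if the scraped page relies on cookies or XHR.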

It is also possible to mock and stub Puppeteer functions like launch, goto and $eval. This approach requires a lot of stubbed methods.
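
A rough sketch of the stubbing approach, assuming the scraper receives the browser as an argument rather than launching it itself. `scrapeTitles`, `createStubBrowser`, and the `h2.title` selector are hypothetical; the stub implements only the methods this scraper calls (`newPage`, `goto`, `$$eval`, `close`).

```javascript
// Hypothetical scraper under test: the browser is injected, not launched here.
async function scrapeTitles(browser, url) {
  const page = await browser.newPage();
  await page.goto(url);
  const titles = await page.$$eval('h2.title', nodes =>
    nodes.map(node => node.textContent.trim())
  );
  await page.close();
  return titles;
}

// Hand-written stub: returns canned $$eval results per visited URL and
// records which URLs were opened. It ignores the selector and callback.
function createStubBrowser(resultsByUrl) {
  const visited = [];
  return {
    visited,
    async newPage() {
      let current;
      return {
        async goto(url) {
          current = url;
          visited.push(url);
        },
        async $$eval() {
          return resultsByUrl[current] ?? [];
        },
        async close() {},
      };
    },
  };
}
```

Because the stub never evaluates selectors against real HTML, a test built on it checks only the control flow (which pages are opened, how results are combined), not whether the selectors match the site. That is one reason this approach tends to need many stubs for comparatively little confidence.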

Pustulant answered 16/2, 2021 at 14:0 Comment(0)

This is something I am playing around with at the moment.

I got this working by enabling request interception with page.setRequestInterception(true):

const puppeteer = require('puppeteer');

it('responds', async () => {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();

    // Route every request the page makes through the handler below.
    await page.setRequestInterception(true);

    page.on('request', request => {
      // TODO: match request.url()

      // Answer with a canned JSON body instead of hitting the network.
      request.respond({
        contentType: 'application/json',
        headers: {'Access-Control-Allow-Origin': '*'},
        body: JSON.stringify({foo: 'bar'})
      });
    });

    const res = await page.goto('https://example.com');
    const json = await res.json();

    expect(json).toStrictEqual({foo: 'bar'});
  } finally {
    // Close the browser even if an assertion fails.
    await browser.close();
  }
});
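
One possible way to fill in the URL-matching TODO, sketched under the assumption that the saved responses are held in a `Map` from full URL to HTML: `createFixtureInterceptor` is a hypothetical helper, while `request.url()`, `request.respond()`, and `request.abort()` are the Puppeteer request methods it relies on.

```javascript
// Hypothetical helper: build a 'request' event handler from a
// Map of full URL -> saved HTML body.
function createFixtureInterceptor(fixtures) {
  return request => {
    const html = fixtures.get(request.url());
    if (html === undefined) {
      // No saved fixture: abort so the test never touches the real network.
      request.abort();
      return;
    }
    request.respond({
      status: 200,
      contentType: 'text/html; charset=utf-8',
      body: html,
    });
  };
}
```

It would be registered with `page.on('request', createFixtureInterceptor(fixtures))` after enabling interception, with the map filled from files in the test folder (for example via `fs.readFileSync`). Note that this sketch also aborts images, stylesheets, and scripts that have no fixture, which is usually fine for scraping tests but may need loosening if the page depends on them.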

This also looks like a useful tool: https://github.com/jefflau/jest-fetch-mock. It might be handy for matching requests, etc.

Also see: Best way to intercept XHR request on page with Puppeteer and return mock response

Consumedly answered 4/9, 2021 at 3:4 Comment(0)

Late to the question, but I published mock-puppeteer-goto a few years ago for this. It works with Playwright too.

https://www.npmjs.com/package/mock-puppeteer-goto

Adult answered 6/4 at 17:22 Comment(0)
