How can I obtain the original encoded response size when intercepting requests with Puppeteer?
H

2

6

I'm using this code to log the encoded response size when loading a page in Chrome:

const puppeteer = require("puppeteer");

(async function() {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  page._client.on("Network.loadingFinished", data => {
    console.log("finished", { encodedDataLength: data.encodedDataLength });
  });

  // await page.setRequestInterception(true);
  // page.on("request", async request => {
  //   request.continue();
  // });

  await page.goto("http://example.com");
  await browser.close();
})();

This is the output:

finished { encodedDataLength: 967 }

However, if I uncomment the four lines in the code snippet the output changes to:

finished { encodedDataLength: 0 }

This does make some sense, since the intercepted request could have been modified in some way by the client, and it would not have been gzipped again afterwards.

However, is there a way to access the original gzipped response size?


The Chrome trace also doesn't include the gzipped size:

"encodedDataLength": 0, "decodedBodyLength": 1270,

Hyetograph answered 16/10, 2018 at 8:37 Comment(1)
If the answer worked for you, please mark it as accepted so the question doesn't show as unanswered. :) - Frig
F
2

We can use the Content-Length header value in this case.

The good folks at Google decided they won't fix some weird bugs closely related to encodedDataLength.

The code and results below show the difference.

page.on("request", async request => {
  request.continue();
});

// Monitor using _client
page._client.on("Network.responseReceived", ({ response }) => {
  console.log("responseReceived", [
    response.headers["Content-Length"],
    response.encodedDataLength
  ]);
});

page._client.on("Network.loadingFinished", data => {
  console.log("loadingFinished", [data.encodedDataLength]);
});

// Monitor using CDP
const devToolsResponses = new Map();
const devTools = await page.target().createCDPSession();
await devTools.send("Network.enable");

devTools.on("Network.responseReceived", event => {
  devToolsResponses.set(event.requestId, event.response);
});

devTools.on("Network.loadingFinished", event => {
  const response = devToolsResponses.get(event.requestId);
  const encodedBodyLength =
    event.encodedDataLength - response.headersText.length;
  console.log(`${encodedBodyLength} bytes for ${response.url}`);
});

Result without setRequestInterception:

responseReceived [ '606', 361 ]
loadingFinished [ 967 ]
606 bytes for http://example.com/

Result with setRequestInterception:

responseReceived [ '606', 0 ]
loadingFinished [ 0 ]
-361 bytes for http://example.com/

Tested with multiple gzip tools; the result was the same everywhere.

The Content-Length header is far more reliable here, whenever the server sends it.
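The two readings above can be combined into a small fallback. The following is a minimal sketch, not part of Puppeteer or the DevTools protocol: `estimateEncodedBodySize` is a hypothetical helper, and its arguments are assumed to be the `response.headers` object, the `encodedDataLength` from `Network.loadingFinished`, and the length of `response.headersText`. It prefers Content-Length and only falls back to subtracting the header bytes when the result is plausible.

```javascript
// Hypothetical helper, not part of Puppeteer or the DevTools protocol.
//   headers:           response.headers from Network.responseReceived
//   encodedDataLength: value reported by Network.loadingFinished
//   headerBytes:       response.headersText.length (0 if unavailable)
function estimateEncodedBodySize(headers, encodedDataLength, headerBytes) {
  // Header names are case-insensitive, so do not assume a particular casing.
  const entry = Object.entries(headers || {}).find(
    ([name]) => name.toLowerCase() === "content-length"
  );
  if (entry && /^\d+$/.test(String(entry[1]).trim())) {
    return { bytes: Number(entry[1]), source: "content-length" };
  }

  // Fall back to total bytes minus header bytes, but only when the total
  // is non-zero and the difference is not negative (with interception
  // enabled the total was reported as 0 in the results above).
  const total = Number(encodedDataLength) || 0;
  const head = Number(headerBytes) || 0;
  if (total > 0 && total >= head) {
    return { bytes: total - head, source: "encodedDataLength" };
  }

  return null; // no trustworthy value available
}

// Values taken from the example.com results above.
console.log(estimateEncodedBodySize({ "Content-Length": "606" }, 0, 361));
// { bytes: 606, source: 'content-length' }
console.log(estimateEncodedBodySize({}, 967, 361));
// { bytes: 606, source: 'encodedDataLength' }
console.log(estimateEncodedBodySize({}, 0, 361));
// null
```

Returning `null` instead of a negative number makes the "no data" case explicit, so callers can decide whether to skip the request or measure it some other way.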

Frig answered 18/10, 2018 at 11:25 Comment(7)
Thanks for your answer, but a lot of responses don't include a Content-Length header. - Hyetograph
Can you share one example URL? - Frig
This file, for example: fonts.googleapis.com/css?family=YT%20Sans%3A300%2C500%2C700 - Hyetograph
Are you sure it's gzipped? - Frig
Oops, realized that's not the problem. Silly me. - Frig
Content-Length is not used when the Transfer-Encoding response header is present (usually set to "chunked"), or, if the server is HTTP/1.0-compliant, when the "Connection: close" header is given (the body is then supposed to be read until the connection is closed). These are the three message framing mechanisms used in HTTP. tools.ietf.org/html/rfc7230#section-3.3 - Steeple
What should we do if we want to measure it when there is no Content-Length in the headers? - Femur
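To make the comment about message framing concrete, here is a rough sketch of how a response's headers determine whether Content-Length is usable at all under HTTP/1.x. `messageFraming` is a hypothetical name, and the logic is a simplified reading of RFC 7230 section 3.3.3: it ignores responses that never carry a body, such as replies to HEAD requests and 1xx, 204 and 304 statuses.

```javascript
// Hypothetical helper: classifies how an HTTP/1.x response body is framed.
// Simplified from RFC 7230 section 3.3.3; bodiless responses are not handled.
function messageFraming(headers) {
  // Normalize header names, since they are case-insensitive.
  const lower = {};
  for (const [name, value] of Object.entries(headers || {})) {
    lower[name.toLowerCase()] = String(value);
  }

  // Transfer-Encoding takes precedence over Content-Length.
  const transferEncoding = lower["transfer-encoding"];
  if (transferEncoding !== undefined) {
    const codings = transferEncoding
      .split(",")
      .map((coding) => coding.trim().toLowerCase());
    // The body is chunk-framed only when "chunked" is the final coding;
    // otherwise a response body is read until the connection closes.
    return codings[codings.length - 1] === "chunked"
      ? "chunked"
      : "close-delimited";
  }

  // A valid Content-Length gives the exact encoded body size.
  if (/^\d+$/.test((lower["content-length"] || "").trim())) {
    return "content-length";
  }

  // Neither header: the body ends when the server closes the connection.
  return "close-delimited";
}

console.log(messageFraming({ "Transfer-Encoding": "chunked" })); // chunked
console.log(messageFraming({ "Content-Length": "606" }));        // content-length
console.log(messageFraming({ Connection: "close" }));            // close-delimited
```

Only the "content-length" case lets you read the encoded body size straight from the headers; in the other two cases the size has to come from counting bytes as they arrive.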
F
0

If you want to get the encoded response size (transferSize) of each request you could use Google Lighthouse:

You can use the CLI:

npx lighthouse http://example.com --output json --output-path ./results.json

or programmatically with Node.js:

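// Run as an ES module (top-level await is used below).
import lighthouse from 'lighthouse';
import { launch } from 'chrome-launcher';

// Start a headless Chrome instance for Lighthouse to connect to.
const chrome = await launch({
  chromeFlags: ['--headless']
});

// Run Lighthouse against the page through Chrome's debugging port.
const runnerResult = await lighthouse('https://example.com', {
  port: chrome.port
});

// The network-requests audit lists transferSize for every request.
console.log('Network requests audit:', runnerResult.lhr.audits["network-requests"]);

chrome.kill();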

In both results, you get a detailed view of each request. Here is an example with two entries:

{
  "audits": {
    "network-requests": {
      "details": {
        "items": [
          {
            "url": "https://example.com/",
            "sessionTargetType": "page",
            "protocol": "h2",
            "rendererStartTime": 0,
            "networkRequestTime": 1.2999999970197678,
            "networkEndTime": 3682.6830000057817,
            "finished": true,
            "transferSize": 1075,
            "resourceSize": 800,
            "statusCode": 200,
            "mimeType": "text/html",
            "resourceType": "Document",
            "priority": "VeryHigh",
            "experimentalFromMainFrame": true,
            "entity": "/example.com"
          },
          {
            "url": "https://kit.fontawesome.com/XXXXXXXX.js",
            "sessionTargetType": "page",
            "protocol": "h2",
            "rendererStartTime": 3681.929000005126,
            "networkRequestTime": 3683.8050000071526,
            "networkEndTime": 4694.142000004649,
            "finished": true,
            "transferSize": 4855,
            "resourceSize": 11890,
            "statusCode": 200,
            "mimeType": "text/javascript",
            "resourceType": "Script",
            "priority": "High",
            "experimentalFromMainFrame": true,
            "entity": "FontAwesome CDN"
          }
        ]
      }
    }
  }
}
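
As a rough illustration of reading these values back out, the sketch below pulls `transferSize` for each request from a result shaped like the one above and adds them up. `summarizeTransferSizes` is a hypothetical helper, and the sample object is a trimmed copy of the two entries shown; in practice you would presumably pass `runnerResult.lhr` or the parsed `results.json` instead.

```javascript
// Hypothetical helper: collects per-request sizes from a Lighthouse result
// object shaped like the JSON above.
function summarizeTransferSizes(lhr) {
  // Tolerate a missing audit by falling back to an empty list.
  const items = lhr?.audits?.["network-requests"]?.details?.items ?? [];

  const requests = items.map(({ url, transferSize, resourceSize }) => ({
    url,
    transferSize, // bytes on the wire (encoded)
    resourceSize  // bytes after decoding
  }));

  const totalTransferSize = requests.reduce(
    (sum, request) => sum + (request.transferSize || 0),
    0
  );

  return { requests, totalTransferSize };
}

// Trimmed copy of the two entries from the example result above.
const sampleResult = {
  audits: {
    "network-requests": {
      details: {
        items: [
          { url: "https://example.com/", transferSize: 1075, resourceSize: 800 },
          { url: "https://kit.fontawesome.com/XXXXXXXX.js", transferSize: 4855, resourceSize: 11890 }
        ]
      }
    }
  }
};

const summary = summarizeTransferSizes(sampleResult);
for (const { url, transferSize } of summary.requests) {
  console.log(`${transferSize} bytes for ${url}`);
}
console.log(`total: ${summary.totalTransferSize} bytes`); // total: 5930 bytes
```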
Femur answered 28/5, 2024 at 20:35 Comment(0)
