Headless Puppeteer - Avoid being detected by Akamai

Hi, I'm trying to scrape a website that uses Akamai for bot protection. I can't get past the login page because Akamai blocks my login request.

First, I'd like to say that yes, there are a lot of guides on how to avoid being detected by services like Akamai, but most of them are now outdated, because companies like Akamai keep getting better at detecting new bots through their use of AI.

So let me tell you the basics of what my script is running:

  • Puppeteer (Headless Mode)
  • puppeteer-extra-plugin-stealth

For the Chrome Flags:

var chromeFlags = [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-accelerated-2d-canvas',
    '--no-zygote',
    '--renderer-process-limit=1',
    '--no-first-run',
    '--ignore-certificate-errors',
    '--ignore-certificate-errors-spki-list',
    '--disable-dev-shm-usage',
    '--disable-infobars',
    '--lang=en-US,en',
    // Chrome expects width,height separated by a comma, not an "x"
    '--window-size=1920,1080',
    '--disable-extensions'
];
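For reference, here is a minimal sketch of how flags like these might be passed to a puppeteer-extra launch with the stealth plugin registered. The helper names buildLaunchOptions and launchStealthBrowser are hypothetical, the flag list is shortened, and this only illustrates the wiring; it is not claimed to get past Akamai.

```javascript
// Sketch only: buildLaunchOptions and launchStealthBrowser are hypothetical
// helper names, not part of Puppeteer or puppeteer-extra.
const exampleFlags = [
  '--no-sandbox',
  '--disable-setuid-sandbox',
  '--window-size=1920,1080', // width,height separated by a comma
];

// Pure helper: builds the options object handed to puppeteer.launch().
function buildLaunchOptions(flags, headless = true) {
  return { headless, args: [...flags] };
}

async function launchStealthBrowser(flags) {
  // Required lazily so the helper above works without the packages installed.
  const puppeteer = require('puppeteer-extra');
  const StealthPlugin = require('puppeteer-extra-plugin-stealth');
  puppeteer.use(StealthPlugin());
  return puppeteer.launch(buildLaunchOptions(flags));
}
```

Keeping the option building separate from the launch makes it easy to print and compare the exact options used for a headless and a headful run.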

I've also spoofed the timezone and viewport:

await page.emulateTimezone("Asia/Singapore");
await page.setViewport({width: (width/2)-21, height: height-111});
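The snippet above does not show where width and height come from. A minimal sketch, assuming they are the screen dimensions of the machine being imitated (1920x1080 here is an assumed value) and that deriveViewport and applyEmulation are hypothetical helper names:

```javascript
// Assumption: the screen size the headless browser is supposed to imitate.
const assumedScreen = { width: 1920, height: 1080 };

// Same arithmetic as in the snippet above: half the screen width minus 21,
// full height minus 111. Viewport sizes are whole pixels, so round down.
function deriveViewport({ width, height }) {
  return {
    width: Math.floor(width / 2) - 21,
    height: Math.floor(height) - 111,
  };
}

// Applies timezone and viewport emulation to a Puppeteer page.
async function applyEmulation(page, screenSize, timezone = 'Asia/Singapore') {
  await page.emulateTimezone(timezone);
  await page.setViewport(deriveViewport(screenSize));
}
```

With the assumed 1920x1080 screen, deriveViewport returns a 939x969 viewport.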

From what I've heard, Akamai is known to specifically scrutinize window and screen sizes. I've done everything I think is necessary to make headless mode mimic an actual browser, but to no avail.

There's a website, bot.sannysoft.com, that shows you your browser fingerprint. I'm currently using it to check whether headless Puppeteer mimics an actual headful browser, and so far it looks like a legitimate browser. Here's the result I got from that website with my headless Puppeteer: (screenshot not included)

I hope someone can tell me what I should try spoofing next to increase my chances of not being detected by Akamai, or point out where I went wrong.
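The headless/headful comparison can also be done programmatically instead of by eye. A minimal sketch, assuming a small hand-picked set of properties (not the set that bot.sannysoft.com or Akamai actually checks); collectFingerprint and diffFingerprints are hypothetical helper names:

```javascript
// Runs inside the page, e.g. via: await page.evaluate(collectFingerprint)
// The property list is an illustrative assumption, not a complete fingerprint.
function collectFingerprint() {
  return {
    userAgent: navigator.userAgent,
    webdriver: navigator.webdriver,
    languages: navigator.languages.join(','),
    pluginCount: navigator.plugins.length,
    screenSize: `${screen.width}x${screen.height}`,
    innerSize: `${window.innerWidth}x${window.innerHeight}`,
  };
}

// Returns only the keys whose values differ between the two runs.
function diffFingerprints(headful, headless) {
  const keys = new Set([...Object.keys(headful), ...Object.keys(headless)]);
  const diffs = {};
  for (const key of keys) {
    if (headful[key] !== headless[key]) {
      diffs[key] = { headful: headful[key], headless: headless[key] };
    }
  }
  return diffs;
}
```

Run collectFingerprint once in a headful browser and once in headless mode, then pass both results to diffFingerprints; an empty object means the sampled properties match, which says nothing about properties that were not sampled.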

Thanks everyone!

Blinnie answered 5/3, 2021 at 18:1 Comment(6)
Were you able to figure this out? I'm stuck as well. - Mira
@Mira Not quite, but would you like to work on this together? I'm sure we can both solve it if we try, haha. - Blinnie
Sure, we can work together on this. I was able to get it to work. Do you have Slack or Skype? - Mira
Were you guys able to overcome Akamai? - Equalize
@Mira Please share the resolution. - Chemisette
Any news on this topic, guys? I dug quite a bit but was unfortunately unable to find a solution. There is a very important _abck cookie that is created after some sensor_data is sent, and also a pixel endpoint that seems to analyse the browser configuration. Those two steps will give you away as being headless or not. I tried comparing headless and headful runs and intercepting and modifying requests accordingly, but no luck so far. Akamai is getting good at detecting headless browsers, it seems. - Detrital
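To observe the two steps mentioned in the last comment, a rough diagnostic sketch. It assumes the cookie is named _abck and that the sensor payload is a POST body containing the string sensor_data, as described there; the helper names are hypothetical, and the code only watches traffic without modifying it.

```javascript
// Picks the _abck cookie out of a cookie list such as the one
// returned by page.cookies(); returns null if it is not present.
function findAbckCookie(cookies) {
  return cookies.find((cookie) => cookie.name === '_abck') || null;
}

// True for POST requests whose body mentions sensor_data.
function looksLikeSensorPost(method, postData) {
  return method === 'POST'
    && typeof postData === 'string'
    && postData.includes('sensor_data');
}

// Logs sensor_data posts as they happen on a Puppeteer page.
function watchSensorPosts(page) {
  page.on('request', (request) => {
    if (looksLikeSensorPost(request.method(), request.postData())) {
      console.log('sensor_data POST to', request.url());
    }
  });
}

// Logs the current _abck value so headless and headful runs can be compared.
async function logAbckCookie(page) {
  const cookie = findAbckCookie(await page.cookies());
  console.log('_abck:', cookie ? cookie.value : '(not set)');
}
```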

Try adding ignoreHTTPSErrors: true to the options when launching Puppeteer. Also, check out puppeteer-extra and its stealth plugin here:

https://www.npmjs.com/package/puppeteer-extra-plugin-stealth
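A minimal sketch of the first suggestion, assuming a Puppeteer version whose launch options accept ignoreHTTPSErrors (newer releases may name this option differently, so check the docs for your version). withIgnoredHttpsErrors and launchIgnoringHttpsErrors are hypothetical helper names. The option only controls whether certificate errors abort navigation.

```javascript
// Returns a copy of the launch options with ignoreHTTPSErrors switched on.
function withIgnoredHttpsErrors(options = {}) {
  return { ...options, ignoreHTTPSErrors: true };
}

async function launchIgnoringHttpsErrors(options) {
  // Required lazily so the helper above works without puppeteer installed.
  const puppeteer = require('puppeteer');
  return puppeteer.launch(withIgnoredHttpsErrors(options));
}
```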

You can also add me on Slack or Skype to work on this further!

Brindled answered 24/4, 2021 at 20:55 Comment(0)
