Hi, I'm trying to scrape a website that uses Akamai for bot protection. I can't get past the login page because Akamai blocks my login request.
Firstly, I'd like to say that yes, there are a lot of guides on how to avoid being detected by things like Akamai, but those are no longer relevant, as companies like Akamai keep getting better at detecting new bots through their use of AI.
So let me tell you the basics of what my script is running:
- Puppeteer (Headless Mode)
- puppeteer-extra-plugin-stealth
For the Chrome Flags:
var chromeFlags = [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-accelerated-2d-canvas',
    '--no-zygote',
    '--renderer-process-limit=1',
    '--no-first-run',
    '--ignore-certificate-errors',
    '--ignore-certificate-errors-spki-list',
    '--disable-dev-shm-usage',
    '--disable-infobars',
    '--lang=en-US,en',
    // Chrome expects "width,height" here, not "widthxheight"
    '--window-size=1920,1080',
    '--disable-extensions'
];
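For reference, here is a minimal sketch of how a flag list like this is typically wired into puppeteer-extra with the stealth plugin. It assumes the `puppeteer-extra` and `puppeteer-extra-plugin-stealth` packages are installed; the helper names `buildLaunchOptions` and `launchStealthBrowser` are made up for illustration and are not part of the original script.

```javascript
// Hypothetical helper: builds the options object passed to puppeteer.launch().
function buildLaunchOptions(flags, headless = true) {
    return {
        headless,
        args: [...flags], // copy so the caller's array is not mutated
    };
}

// Hypothetical helper: launches a browser with the stealth plugin enabled.
// The modules are required lazily so this file still loads on a machine
// where Puppeteer is not installed.
async function launchStealthBrowser(flags) {
    const puppeteer = require('puppeteer-extra');
    const StealthPlugin = require('puppeteer-extra-plugin-stealth');
    puppeteer.use(StealthPlugin());
    return puppeteer.launch(buildLaunchOptions(flags));
}
```

Usage would look like `const browser = await launchStealthBrowser(chromeFlags);`. Passing `false` as the second argument to `buildLaunchOptions` gives a headful configuration with the same flags, which is handy for side-by-side comparisons.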
I've also spoofed the timezone and viewport:
await page.emulateTimezone("Asia/Singapore");
await page.setViewport({width: (width/2)-21, height: height-111});
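As a rough sketch, the viewport arithmetic above can be pulled into a small helper so the window size and the viewport are derived from the same numbers. The helper names are hypothetical, `width` and `height` are assumed to be the screen dimensions in pixels, and the meaning of the `-21` and `-111` offsets (presumably scrollbar and browser chrome) is a guess.

```javascript
// Hypothetical helper: derives an integer viewport from a screen size.
// The -21 and -111 offsets are copied from the snippet above; what they
// represent is an assumption.
function viewportFromScreen(width, height) {
    return {
        // Round down so the width stays a whole number for odd screen widths.
        width: Math.floor(width / 2) - 21,
        height: height - 111,
    };
}

// Hypothetical helper: builds the matching Chrome flag.
// Chrome expects the size as "width,height" (comma-separated).
function windowSizeFlag(width, height) {
    return `--window-size=${width},${height}`;
}

// Applies the same timezone and viewport settings as the snippet above.
async function applyEmulation(page, width, height) {
    await page.emulateTimezone('Asia/Singapore');
    await page.setViewport(viewportFromScreen(width, height));
}
```

Keeping both values in one place makes it harder for the reported window size and the viewport to drift apart.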
From what I've heard, Akamai is specifically known to scrutinize window/screen sizes. I've done everything I think is necessary to make headless mode mimic an actual browser, but to no avail.
There's a website that shows you your browser fingerprint: bot.sannysoft.com. I'm currently using it to check whether headless Puppeteer mimics an actual headful browser, and so far it looks like a legitimate browser. Here's the result I got from that website with my headless Puppeteer.
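The headless/headful comparison can also be done programmatically instead of by eye. The sketch below collects a handful of properties inside the page and diffs two such snapshots. The function names are hypothetical, and the chosen properties are only illustrative: they are not a list of what Akamai actually inspects.

```javascript
// Hypothetical helper: meant to run inside the page via page.evaluate().
// The selection of properties is an assumption made for illustration.
function collectFingerprint() {
    return {
        userAgent: navigator.userAgent,
        webdriver: navigator.webdriver,
        languages: navigator.languages.join(','),
        pluginCount: navigator.plugins.length,
        screen: `${screen.width}x${screen.height}`,
        outer: `${window.outerWidth}x${window.outerHeight}`,
        inner: `${window.innerWidth}x${window.innerHeight}`,
    };
}

// Returns only the keys whose values differ between the two snapshots.
function diffFingerprints(headful, headless) {
    const keys = new Set([...Object.keys(headful), ...Object.keys(headless)]);
    const diff = {};
    for (const key of keys) {
        if (headful[key] !== headless[key]) {
            diff[key] = { headful: headful[key], headless: headless[key] };
        }
    }
    return diff;
}
```

A snapshot can be taken with `await page.evaluate(collectFingerprint)` in both a headful and a headless session on the same page, and the two results passed to `diffFingerprints`. An empty result only means these particular properties match, not that the two browsers are indistinguishable.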
I hope someone can tell me if there's anything I should try spoofing next to improve my chances of not being detected by Akamai, or point out where I went wrong.
Thanks everyone!
_abck cookie which will be created after sending some sensor_data and also some endpoint pixel that seems to analyse the browser configuration... Those 2 steps will give you away as being headless or not. Tried to compare headless/headful, intercept and modify requests accordingly, but no luck so far. Akamai is getting good at detecting headless browsers it seems – Detrital