I am scraping some websites that seem to have pretty good protection against it. The only way I can get it to work is to use Selenium to load the page and then scrape stuff from that.
Currently this works on my local computer (a firefox windows opens and closed when I access my page and it's HTML is processed further in my script). However, I need my scraper to be accessible on the web. The scraper is embedded within a Flask app on Heroku. Is there a way to make the Selenium browser work on Heroku servers? Or are there any hosting providers where it can work?
heroku buildpacks:add https://github.com/heroku/heroku-buildpack-google-chrome
instead ofxvfb
for headless mode. – Dodgem