Evade detection of Selenium automation
To test my skills, I am writing a Python program that should go to the web page https://www.solebox.com/de_DE, select a product, and save its name, tag and price in a .txt file (and possibly become a shoe bot in the future) using the Selenium library. The problem is that the site detects that I am using automated software and does not let me access the products. I've already tried the undetected_chromedriver library, but it didn't work. Does anyone know a working method? Thank you.

More info: OS: Windows 10, Chrome version: 88.0.4324.150 (64-bit), Python version: 3.9.1, Editor: Visual Studio Code

Adenectomy answered 14/2, 2021 at 19:7 Comment(0)

There are multiple ways to evade detection of Selenium automation.


Using --disable-blink-features=AutomationControlled

Code Block:

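from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
# Turn off the Blink feature that marks the browser as automation-controlled
options.add_argument('--disable-blink-features=AutomationControlled')

# executable_path is the Selenium 3 style; Selenium 4 passes the driver path through a Service object instead
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get('https://www.solebox.com/de_DE')
print(driver.page_source)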

Console Output:

<!-- =============== This snippet of JavaScript handles fetching the dynamic recommendations from the remote recommendations server
and then makes a call to render the configured template with the returned recommended products: ================= -->

<script>
(function(){
// window.CQuotient is provided on the page by the Analytics code:
var cq = window.CQuotient;
if (cq && ('function' == typeof cq.getCQUserId)
&& ('function' == typeof cq.getCQCookieId)
&& ('function' == typeof cq.getCQHashedEmail)
&& ('function' == typeof cq.getCQHashedLogin)) {
var recommender = '[[&quot;Homepage_Topseller&quot;]]';
// cleaning up the leading/trailing brackets and quotes:
recommender=recommender.slice(8, recommender.length-8);
var separator = '|||';
.
</script>
<script type="text/javascript">//<!--
/* <![CDATA[ (viewProduct-active_data.js) */
dw.ac._capture({id: "01900289", type: "recommendation"});
/* ]]> */
// -->
</script>
.
<script type="text/javascript" id="" src="//static.criteo.net/js/ld/ld.js"></script>
<script type="text/javascript" id="">window.criteo_q=window.criteo_q||[];window.criteo_q.push({event:"setAccount",account:google_tag_manager["GTM-M9TMD24"].macro(24)},{event:"setEmail",email:""},{event:"setSiteType",type:"d"},{event:"viewHome"});</script><div id="criteo-tags-div" style="display: none;"><iframe src="https://gum.criteo.com/syncframe?topUrl=www.solebox.com#{&quot;bundle&quot;:{&quot;origin&quot;:0,&quot;value&quot;:null},&quot;cw&quot;:true,&quot;lwid&quot;:{&quot;origin&quot;:0,&quot;value&quot;:null},&quot;optout&quot;:{&quot;origin&quot;:0,&quot;value&quot;:null},&quot;origin&quot;:&quot;onetag&quot;,&quot;pm&quot;:0,&quot;sid&quot;:{&quot;origin&quot;:0,&quot;value&quot;:null},&quot;tld&quot;:&quot;solebox.com&quot;,&quot;topUrl&quot;:&quot;www.solebox.com&quot;,&quot;uid&quot;:null,&quot;version&quot;:&quot;5_6_2&quot;}" id="criteo-syncframe" width="0" height="0" frameborder="0" style="border-width:0px; margin:0px; display:none" title="Criteo GUM iframe"></iframe></div></body></html>

You can find a relevant detailed discussion in "Selenium can't open a second page".


Using undetected_chromedriver

Code Block:

import undetected_chromedriver as uc
from selenium import webdriver

options = webdriver.ChromeOptions() 
options.add_argument("start-maximized")
driver = uc.Chrome(options=options)
driver.get("https://www.solebox.com/de_DE")
print(driver.page_source)

Console Output:

.
.
<script type="text/javascript" id="">!function(b,e,f,g,a,c,d){b.fbq||(a=b.fbq=function(){a.callMethod?a.callMethod.apply(a,arguments):a.queue.push(arguments)},b._fbq||(b._fbq=a),a.push=a,a.loaded=!0,a.version="2.0",a.queue=[],c=e.createElement(f),c.async=!0,c.src=g,d=e.getElementsByTagName(f)[0],d.parentNode.insertBefore(c,d))}(window,document,"script","https://connect.facebook.net/en_US/fbevents.js");fbq("init",google_tag_manager["GTM-M9TMD24"].macro(19));fbq("track","PageView");</script>
<noscript><img height="1" width="1" style="display:none" src="https://www.facebook.com/tr?id=238536633211197&amp;ev=PageView&amp;noscript=1"></noscript>

<script type="text/javascript" id="" src="//static.criteo.net/js/ld/ld.js"></script></body></html>

You can find a relevant detailed discussion in "Undetected Chromedriver not loading correctly".

Bedridden answered 14/2, 2021 at 21:47 Comment(6)
I tried but still can't access the page. You can find the error message in the comments of PDHide's answer. - Adenectomy
@Adenectomy There are 2 approaches mentioned within this answer. Which approach did you try? What error do you see? Update the question with your code trials and recent errors. - Bedridden
I've tried both methods many times and they haven't worked. I tried the undetected_chromedriver method again and strangely it works now. I think I have solved the problem; if it stops working again I will notify you. Thanks for the reply. (PS: the undetected_chromedriver method seems to be unstable, it works about 70% of the time. If there are any problems I will notify you.) - Adenectomy
@Adenectomy Sure, let me know if you are stuck anywhere. - Bedridden
Here we go again, now the problem occurs when the software automatically clicks on something. - Adenectomy
@undetectedSelenium How does the option --disable-blink-features=AutomationControlled work internally? I mean, what is it doing internally? Can you share some source info, etc.? - Macfarlane
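from selenium import webdriver

options = webdriver.ChromeOptions()
# Drop the enable-automation switch that ChromeDriver passes to Chrome by default
options.add_experimental_option("excludeSwitches", ["enable-automation"])

driver = webdriver.Chrome(options=options)
driver.get("https://www.solebox.com/de_DE")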

Just exclude the enable-automation switch, which is intended to stop Chrome from reporting navigator.webdriver as true:

https://developer.mozilla.org/en-US/docs/Web/API/Navigator/webdriver
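
One way to check whether a given option actually hides this flag is to ask the browser directly. The following is a minimal sketch, assuming `driver` is an already started Selenium WebDriver; the helper names `webdriver_flag` and `looks_automated` are made up for illustration and are not part of Selenium.

```python
def webdriver_flag(driver):
    """Return the value the page sees for navigator.webdriver.

    `driver` is assumed to be a running Selenium WebDriver (or any object
    exposing an execute_script method). The helper name is hypothetical.
    """
    return driver.execute_script("return navigator.webdriver")


def looks_automated(driver):
    """Return True only if the browser reports navigator.webdriver as true."""
    return webdriver_flag(driver) is True
```

Calling `looks_automated(driver)` right after `driver.get(...)` shows whether this one signal is still exposed. A result of False does not mean the site cannot detect automation by other means.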

On further investigation, the website also appears to flag navigation that happens too quickly and then shows the automation error. You can verify this manually by clicking a product, going back, and repeating this continuously.
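
If overly fast navigation is part of the problem, one option is to pause for a random interval between steps. This is only a sketch: the helper name `human_pause` and the default bounds are assumptions chosen for illustration, not values known to satisfy this site, and the asker reports in the comments that sleeping alone did not resolve the block.

```python
import random
import time


def human_pause(min_seconds=2.0, max_seconds=5.0, sleep=time.sleep):
    """Sleep for a random duration in [min_seconds, max_seconds] and return it.

    The bounds are illustrative guesses. `sleep` is injectable so the helper
    can be exercised without actually waiting.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay
```

In a script, `human_pause()` would go between `driver.get(...)`, clicks, and back/forward navigation.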

Frontier answered 14/2, 2021 at 19:19 Comment(9)
I have already tried this method and it didn't work. I also tried changing the screen resolution, as many tutorials suggest, but no luck. - Adenectomy
I tried that and it works. What error are you getting? - Frontier
Try adding options.add_argument('--disable-blink-features=AutomationControlled') - Frontier
It still doesn't work. I don't get an error from the console, but I still can't access the web page. The message is as follows: "Access to this page has been denied because we believe you are using automation tools to browse the website. This may happen as a result of the following: ● Javascript is disabled or blocked by an extension (ad blockers for example) ● Your browser does not support cookies. Please make sure that Javascript and cookies are enabled on your browser and that you are not blocking them from loading. Reference ID: 1c2b8b30-6f73-11eb-aae2-8f373c0a0244" - Adenectomy
Did you try what I mentioned? Even if you go back on the page and then click on any element continuously, you will get the error. - Frontier
So add a sleep between steps. - Frontier
Just click the browser's back and forward buttons continuously. - Frontier
If you confirm that sleep indeed solves your problem, I can help you further. - Frontier
No way. I tried sleep, changing the resolution, using undetected_chromedriver, refreshing the page, and all the other methods you recommended. - Adenectomy
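
Since the block shows up as a page rather than a console error, a script can look for the denial message in the HTML it gets back. A minimal sketch, using phrases from the message quoted in the comments above; it assumes that text appears in `driver.page_source`, which may not hold if the site renders the message in a frame or changes its wording. `BLOCK_MARKERS` and `is_blocked` are hypothetical names.

```python
# Phrases taken from the denial message quoted in the comments; the site may change them.
BLOCK_MARKERS = (
    "Access to this page has been denied",
    "you are using automation tools",
)


def is_blocked(page_source):
    """Return True if the HTML contains one of the known block-page phrases."""
    text = page_source.lower()
    return any(marker.lower() in text for marker in BLOCK_MARKERS)
```

Checking `is_blocked(driver.page_source)` after each navigation makes it easier to tell which step triggers the block, for example the first page load versus a later click.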
