"Eager" Page Load Strategy workaround for Chromedriver Selenium in Python

I want to speed up page loads in Selenium because I need nothing beyond the HTML (I am scraping all the links with BeautifulSoup). With PageLoadStrategy.NONE I can't reliably scrape all the links, and Chrome no longer supports PageLoadStrategy.EAGER. Does anyone know of a workaround that gives the behavior of PageLoadStrategy.EAGER in Python?

Latter answered 28/6, 2018 at 16:41 Comment(3)
Have you tried urllib? - Deutero
Yeah, why bother with Selenium at all? - Ehling
Does urllib load faster? I don't know all that much about the different parsers. - Latter
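
For reference, the urllib route suggested in the comments could look roughly like the sketch below. It fetches the raw HTML without starting a browser and collects the href of every anchor tag. To stay dependency-free it uses the standard library's html.parser as a stand-in for BeautifulSoup; the names LinkCollector, extract_links and fetch_links are made up for this illustration. Because no JavaScript runs, links that a page injects with scripts will not appear in the result.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Turn relative links into absolute ones.
                self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return all anchor targets found in an HTML string (hypothetical helper)."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def fetch_links(url, timeout=10):
    """Download a page over plain HTTP and return its links (hypothetical helper)."""
    # No browser is involved, so nothing here waits for scripts or images.
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_links(html, url)
```

Whether this is faster than Selenium for a given site depends on the site, but it skips browser startup and rendering entirely, which is usually the bulk of the cost when only the HTML is needed.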

ChromeDriver is the standalone server which implements WebDriver's wire protocol for Chromium. Chrome and Chromium are still in the process of implementing and moving to the W3C standard. Currently ChromeDriver is available for Chrome on Android and Chrome on Desktop (Mac, Linux, Windows and ChromeOS).

As per the current WebDriver W3C Editor's Draft, the following table of page load strategies links the pageLoadStrategy capability keyword to a page loading strategy state and shows the corresponding document readiness state:

  Keyword     Page load strategy state    Document readiness state
  "none"      none                        (none)
  "eager"     eager                       "interactive"
  "normal"    normal                      "complete"

However, if you look at the current implementation of ChromeDriver, the script it sends through Chrome DevTools takes the following document.readyState values into account:

  • document.readyState == 'complete'
  • document.readyState == 'interactive'

Here is a sample relevant log:

[1517231304.270][DEBUG]: DEVTOOLS COMMAND Runtime.evaluate (id=11) {
   "expression": "var isLoaded = document.readyState == 'complete' ||    document.readyState == 'interactive';if (isLoaded) {  var frame = document.createElement('iframe');  frame.name = 'chromedriver dummy frame'; ..."
}

The WebDriver Status page lists all WebDriver commands and their current support in ChromeDriver, based on what is in the WebDriver Specification. Once the implementation is complete in all respects, PageLoadStrategy.EAGER is bound to be functionally available in ChromeDriver.

Dallasdalli answered 29/6, 2018 at 8:26

You can only use normal or none as the pageLoadStrategy in ChromeDriver. So either choose none and handle the waiting yourself, or wait for the page to load as it normally does.
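
A minimal sketch of the "choose none and handle everything yourself" option: start the navigation without waiting, then poll document.readyState until it reaches 'interactive' (or 'complete'), which approximates what an eager strategy would do. This assumes Selenium 4, where the options object exposes a page_load_strategy attribute; the helper names wait_for_ready_state and get_html_eagerly are invented for this example. The previous_url check is a heuristic to avoid reading the readiness state of the page that was loaded before navigation began, and it would not work when reloading the same URL. If your ChromeDriver version accepts "eager" directly, setting that is simpler than any of this.

```python
import time


def wait_for_ready_state(driver, previous_url=None,
                         states=("interactive", "complete"),
                         timeout=10.0, poll=0.1):
    """Poll document.readyState until it is in `states`; return the state reached.

    If `previous_url` is given, also require that the browser has left it.
    Raises TimeoutError when the condition is not met within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = driver.execute_script("return document.readyState")
        navigated = previous_url is None or driver.current_url != previous_url
        if navigated and state in states:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"readyState still {state!r} after {timeout}s")
        time.sleep(poll)


def get_html_eagerly(url, timeout=10.0):
    """Load `url` in Chrome and return the HTML once the DOM is parsed."""
    # Assumption: Selenium 4 with a local Chrome and ChromeDriver available.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.page_load_strategy = "none"  # get() returns without waiting
    driver = webdriver.Chrome(options=options)
    try:
        previous_url = driver.current_url  # the blank start page
        driver.get(url)
        wait_for_ready_state(driver, previous_url=previous_url, timeout=timeout)
        return driver.page_source
    finally:
        driver.quit()
```

The returned page source can then be handed to BeautifulSoup as usual. Stopping at 'interactive' means images, stylesheets and late-running scripts may not have finished, so links added by such scripts can still be missing.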

Pluviometer answered 28/6, 2018 at 17:52
