How can I scroll a web page using selenium webdriver in python?

27

278

I am currently using Selenium WebDriver to parse a Facebook user's friends page and extract all the IDs from the AJAX script, but I need to scroll down to load all the friends. How can I scroll down in Selenium? I am using Python.

Upswell answered 8/1, 2014 at 3:44 Comment(3)
possible duplicate of How to scroll page with seleniumCarpospore
driver.execute_script(f"window.scrollTo(0, {2**127});")Bitterling
If in your case that there is a list of items, so you can follow this method https://mcmap.net/q/103234/-how-to-scroll-down-some-part-using-selenium-in-pythonFaria
G
483

You can use

driver.execute_script("window.scrollTo(0, Y)")

where Y is the vertical offset in pixels to scroll to (for example, 1080 is one screen height on a Full HD monitor). (Thanks to @lukeis)

You can also use

driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

to scroll to the bottom of the page.

If you want to scroll down a page with infinite loading, such as social network sites like Facebook (thanks to @Cuong Tran):

import time

SCROLL_PAUSE_TIME = 0.5

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
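
If it helps, the loop above can be wrapped in a function with a cap on the number of rounds, so a page that never stops loading cannot hang the script. This is only a sketch: `scroll_to_end` and `max_rounds` are names made up for illustration (they are not in the original loop), and `driver` is assumed to be an already-created WebDriver.

```python
import time


def scroll_to_end(driver, pause=0.5, max_rounds=50):
    """Scroll until document.body.scrollHeight stops growing.

    `driver` is any object with Selenium's execute_script() method.
    `max_rounds` is a hypothetical safety cap, not part of the original loop.
    Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        # Jump to the current bottom of the page
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Give the page time to load more content
        time.sleep(pause)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Height unchanged: assume the end of the page was reached
            return rounds
        last_height = new_height
    return max_rounds
```

Call it as `scroll_to_end(driver, pause=2)` and scrape once it returns, for example from `driver.page_source`, rather than scraping inside the loop.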

Another method (thanks to Juanse) is to select an element and send it a PAGE_DOWN key press:

label.send_keys(Keys.PAGE_DOWN)  # Keys comes from selenium.webdriver.common.keys
Glasswork answered 3/1, 2015 at 22:13 Comment(5)
Excellent, can you explain a little bit on scrollHeight, what does it mean and how does it work in general?Haphazard
How would you then use the variable "last_height"? I have something similar in my code and the browser is scrolling down. However, when I look at the data I'm scraping it only scrapes the data from the first page k times with "k" being the number of times the browser scrolls down.Stebbins
@JasonGoal hope this will help: https://mcmap.net/q/56941/-what-is-offsetheight-clientheight-scrollheightWilma
driver.execute_script can be combined with smooth scrolling (developer.mozilla.org/en-US/docs/Web/API/Window/scrollTo) to imitate more human-like behavior!Austral
Goodness this is pure gold!Lubet
G
113

If you want to scroll down to bottom of infinite page (like linkedin.com), you can use this code:

import time

SCROLL_PAUSE_TIME = 0.5

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

Reference: https://mcmap.net/q/103235/-scroll-down-to-bottom-of-infinite-page-with-phantomjs-in-python

Geographical answered 8/4, 2017 at 19:32 Comment(3)
This is great. For anyone who is trying to use this on instagram, you may need to first tab to the "Load more" button using ActionChains, then apply Cuong Tran's solution... at least that's what worked for me.Beamends
Thanks for the answer! What I would like to do is scroll for instance in instagram to the bottom of the page, then grab the entire html of the page. Is there a function in selenium where I could give last_height as input and get the entire page html, after I have scrolled to the bottom?Mousseline
The SCROLL_PAUSE_TIME varies, it takes around 2 seconds for me.Respect
P
75

You can use send_keys to simulate an END (or PAGE_DOWN) key press (which normally scrolls the page):

from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
html = driver.find_element(By.TAG_NAME, 'html')
html.send_keys(Keys.END)
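
If the key has to be chosen at run time (say, from a function argument), one option is to look the name up on the `Keys` class with `getattr`. A rough sketch follows; `resolve_key` and `press` are hypothetical helper names, and the `keys` parameter exists only so the lookup can be tried without a browser.

```python
def resolve_key(name, keys=None):
    """Return the key constant called `name` (e.g. "END" or "page_down").

    `keys` defaults to Selenium's Keys class; passing another namespace is
    only meant for trying the lookup without Selenium installed.
    """
    if keys is None:
        # Imported lazily so the helper can be loaded without Selenium
        from selenium.webdriver.common.keys import Keys as keys
    try:
        return getattr(keys, name.upper())
    except AttributeError:
        raise ValueError(f"unknown key name: {name!r}") from None


def press(element, name, times=1, keys=None):
    """Send the named key to `element` `times` times via send_keys()."""
    key = resolve_key(name, keys)
    for _ in range(times):
        element.send_keys(key)
```

With the `html` element from above, `press(html, "end")` should behave like `html.send_keys(Keys.END)`.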
Plication answered 15/7, 2018 at 5:34 Comment(3)
Tried PAGE_DOWN on a loop and did not behave as expected, END worked as expected for w/e reasonAdenocarcinoma
Guys, one thing I was trying to do.... send the keys to press like a value from a function.... ex: my_func(value):....html.send_keys(Keys.<value>). any help is welcome... thanksLorimer
find_element_by_tag_name is no longer supported in Selenium. Also, keys will not work on pages like Facebook.Fronton
F
29

same method as shown here:

in python you can just use

driver.execute_script("window.scrollTo(0, Y)")

(Y is the vertical position you want to scroll to)

Faille answered 8/1, 2014 at 4:4 Comment(0)
A
26
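from selenium.webdriver.common.by import By
element = driver.find_element(By.XPATH, "xpath of the li you are trying to access")

element.location_once_scrolled_into_view  # a property: reading it scrolls the element into view

This helped when I was trying to access an 'li' that was not visible.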

Actually answered 7/6, 2016 at 22:54 Comment(2)
'find_element_by_xpath' is a driver function or what, the '.location_once_scrolled_into_view' returns error NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="timeline-medley"]/div/div[2]/div[1]"}Ibidem
Just one more thing. The reason why location_once_scrolled_into_view should be called without () is that location_once_scrolled_into_view is a Python property. see the source code here: selenium/webelement.py at d3b6ad006bd7dbee59f8539d81cee4f06bd81d64 · SeleniumHQ/seleniumMassey
S
22

For my purpose, I wanted to scroll down further while keeping the window's current position in mind. My solution was similar and used window.scrollY:

driver.execute_script("window.scrollTo(0, window.scrollY + 200)")

which scrolls to the current vertical scroll position plus 200 pixels.
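
A small sketch of the same idea as a reusable helper, using `window.scrollBy` for the relative move and reporting whether the page actually moved. The name `scroll_by` is made up for illustration, and `driver` is assumed to be an existing WebDriver.

```python
def scroll_by(driver, pixels=200):
    """Scroll vertically by `pixels` relative to the current position.

    Returns True if the page moved, False if window.scrollY did not change
    (for example because the page was already at the bottom).
    """
    before = driver.execute_script("return window.scrollY")
    # Extra arguments to execute_script are exposed to the script as arguments[...]
    driver.execute_script("window.scrollBy(0, arguments[0]);", pixels)
    after = driver.execute_script("return window.scrollY")
    return after != before
```

Looping with `while scroll_by(driver, 200): ...` is one way to step down a page until it stops moving.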

Spiller answered 2/8, 2018 at 16:59 Comment(0)
G
13

This is how you scroll down the webpage:

driver.execute_script("window.scrollTo(0, 1000);")
Gabelle answered 28/11, 2018 at 7:14 Comment(0)
B
10

The easiest way I found to solve that problem was to select a label and then send it a PAGE_DOWN key press:

label.send_keys(Keys.PAGE_DOWN)

Hope it works!

Benefactress answered 16/4, 2018 at 18:21 Comment(1)
Can you please show us example code to select a label?Hive
P
9

None of these answers worked for me, at least not for scrolling down a Facebook search results page, but after a lot of testing I found this solution:

from selenium.webdriver.common.by import By

while driver.find_element(By.TAG_NAME, 'div'):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    divs = driver.find_element(By.TAG_NAME, 'div').text
    # Stop once the page reports that there are no more results
    if 'End of Results' in divs:
        print('end')
        break
Paid answered 9/11, 2017 at 12:37 Comment(1)
It works, but very slow (for me at least). I found that if you set SCROLL_PAUSE_TIME in https://mcmap.net/q/102438/-how-can-i-scroll-a-web-page-using-selenium-webdriver-in-python to 2, it works just fine and you scroll down a 100x faster.Quandary
B
9

When working with YouTube, the floating elements make document.body.scrollHeight return 0, so instead of "return document.body.scrollHeight" use "return document.documentElement.scrollHeight". Adjust the scroll pause time to your internet speed; otherwise the loop may run only once and then break.

import time

SCROLL_PAUSE_TIME = 1

# Get scroll height.
# "return document.body.scrollHeight" doesn't work here because of the
# floating web elements on YouTube.
last_height = driver.execute_script("return document.documentElement.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0,document.documentElement.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.documentElement.scrollHeight")
    if new_height == last_height:
        print("break")
        break
    last_height = new_height
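
If you don't know in advance which of the two heights a site reports correctly, one possible approach is to ask for both and take the larger. This is a sketch under that assumption; `PAGE_HEIGHT_JS` and `page_height` are illustrative names, not Selenium API.

```python
# JavaScript that returns the larger of the two scroll heights
PAGE_HEIGHT_JS = (
    "return Math.max("
    "document.body ? document.body.scrollHeight : 0, "
    "document.documentElement.scrollHeight);"
)


def page_height(driver):
    """Return the larger of the body and root-element scroll heights."""
    return int(driver.execute_script(PAGE_HEIGHT_JS))
```

`page_height(driver)` can then stand in for the `execute_script("return ...scrollHeight")` calls in the loop above.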
Bivalve answered 13/3, 2019 at 4:35 Comment(0)
C
9

Scrolling pages that load content as you scroll (for example Medium, Quora, etc.):

import time

last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight-1000);")
    # Wait for the page to load. implicitly_wait() only sets the element-lookup
    # timeout and does not pause, so sleep instead (2 seconds is an assumed value).
    time.sleep(2)
    new_height = driver.execute_script("return document.body.scrollHeight")

    if new_height == last_height:
        break
    last_height = new_height
driver.quit()
Commonable answered 22/4, 2019 at 12:54 Comment(2)
should driver.quit() be outside the while block or not? and also the last implicit wait is not required.. someone pls confirm. @CommonableEgypt
No, if driver.quit() was inside the while loop, the driver would be closed each iteration of the loop. Once there is no more length to the page, then it will quit. The last wait could be there to give the page time to load?Shechem
E
8

Here's an example Selenium code snippet that you could use for this type of purpose. It opens the YouTube search results for 'Enumerate python tutorial', finds the video titled 'Enumerate python tutorial(2020).' and scrolls it into view.

from selenium.webdriver.common.by import By

driver.get('https://www.youtube.com/results?search_query=enumerate+python')
target = driver.find_element(By.LINK_TEXT, 'Enumerate python tutorial(2020).')
target.location_once_scrolled_into_view
Edmondson answered 7/8, 2020 at 11:56 Comment(0)
T
7

I was looking for a way of scrolling through a dynamic webpage, and automatically stopping once the end of the page is reached, and found this thread.

The post by @Cuong Tran, with one main modification, was the answer that I was looking for. I thought that others might find the modification helpful (it has a pronounced effect on how the code works), hence this post.

The modification is to move the statement that captures the last page height inside the loop (so that each check is comparing to the previous page height).

So, the code below:

Continuously scrolls down a dynamic webpage (.scrollTo()), only stopping when, for one iteration, the page height stays the same.

(There is another modification, where the break statement is inside another condition (in case the page 'sticks') which can be removed).

import time

SCROLL_PAUSE_TIME = 0.5


while True:

    # Get scroll height
    ### This is the difference. Moving this *inside* the loop
    ### means that it checks if scrollTo is still scrolling
    last_height = driver.execute_script("return document.body.scrollHeight")

    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:

        # try again (can be removed)
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        # Wait to load page
        time.sleep(SCROLL_PAUSE_TIME)

        # Calculate new scroll height and compare with last scroll height
        new_height = driver.execute_script("return document.body.scrollHeight")

        # check if the page height has remained the same
        if new_height == last_height:
            # if so, you are done
            break
        # if not, move on to the next loop
        else:
            last_height = new_height
            continue
Teddytedeschi answered 3/9, 2018 at 18:21 Comment(0)
B
7

This code scrolls to the bottom without a fixed wait after each scroll. It scrolls continually and stops at the bottom (or when it times out):

from selenium import webdriver
import time

# executable_path was removed in newer Selenium 4 releases; the driver is located automatically
driver = webdriver.Chrome()
driver.get('https://example.com')

pre_scroll_height = driver.execute_script('return document.body.scrollHeight;')
run_time, max_run_time = 0, 1
while True:
    iteration_start = time.time()
    # Scroll webpage, the 100 allows for a more 'aggressive' scroll
    driver.execute_script('window.scrollTo(0, 100*document.body.scrollHeight);')

    post_scroll_height = driver.execute_script('return document.body.scrollHeight;')

    scrolled = post_scroll_height != pre_scroll_height
    timed_out = run_time >= max_run_time

    if scrolled:
        run_time = 0
        pre_scroll_height = post_scroll_height
    elif not scrolled and not timed_out:
        run_time += time.time() - iteration_start
    elif not scrolled and timed_out:
        break

# closing the driver is optional 
driver.close()

This is much faster than waiting 0.5-3 seconds each time for a response, when that response could take 0.1 seconds
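
The same idea can be sketched as a function that keeps scrolling and gives up once the height has been stable for a set time. Treat it as an illustration only: `scroll_until_idle`, `idle_timeout` and `poll` are invented names, the short `poll` sleep is an addition to avoid a busy loop, and the `clock` and `sleep` parameters exist only so the timing can be faked when trying it out.

```python
import time


def scroll_until_idle(driver, idle_timeout=1.0, poll=0.05,
                      clock=time.monotonic, sleep=time.sleep):
    """Scroll to the bottom repeatedly until the height stops changing.

    Stops once document.body.scrollHeight has stayed the same for at least
    `idle_timeout` seconds and returns the final height.
    """
    height = driver.execute_script("return document.body.scrollHeight;")
    last_change = clock()
    while True:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        new_height = driver.execute_script("return document.body.scrollHeight;")
        if new_height != height:
            # New content arrived: remember it and restart the idle timer
            height = new_height
            last_change = clock()
        elif clock() - last_change >= idle_timeout:
            return height
        # Short pause so the loop does not spin at full speed
        sleep(poll)
```

A fast page costs little more than `idle_timeout` at the end, while a slow page gets up to `idle_timeout` after every scroll before the loop gives up.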

Breaking answered 11/7, 2019 at 1:20 Comment(1)
Doesn't work for me.Newcomb
D
6

You can use send_keys to simulate a PAGE_DOWN key press (which normally scrolls the page):

from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
html = driver.find_element(By.TAG_NAME, 'html')
html.send_keys(Keys.PAGE_DOWN)
Dictionary answered 8/9, 2020 at 14:19 Comment(3)
That's exactly this answer, simply more vaguePacifist
This is the only code that works for me on the specific page I work on, but I must first click on the slider with the mouse for it to work. I don't know why that is needed, and I am trying to find another solution for my problem.Dictionary
The page I work on: contacts.google.com/u/0/directoryDictionary
B
3

If you want to scroll within a particular view/frame (WebElement), all you need to do is replace "body" with the element you intend to scroll within. I get that element via "getElementById" in the example below:

self.driver.execute_script('window.scrollTo(0, document.getElementById("page-manager").scrollHeight);')

This is the case on YouTube, for example.

Babism answered 13/1, 2020 at 10:1 Comment(0)
M
3

The scrollTo() function didn't work in my case. This is what I used instead, and it worked fine:

driver.execute_script("document.getElementById('mydiv').scrollIntoView();")
Magnesium answered 18/3, 2020 at 10:9 Comment(2)
Only this method worked in my case, not other worked. Thanks.Calle
Worked for me too. If you're calling scrollIntoView multiple times, be sure to add a delay (for example with setTimeout()) so the page can load the new content, or it won't find the new element. On a side note, to find an element by href you can do: driver.execute_script("document.querySelector(\"a[href='your_href_link']\").scrollIntoView();")Trellis
E
3

According to the docs, the class ActionChains does the job:

from selenium import webdriver
from selenium.webdriver import ActionChains

driver = webdriver.Firefox()
action_chains = ActionChains(driver)
# Scroll the viewport down by 500 pixels (scroll_by_amount needs Selenium 4.2 or newer)
action_chains.scroll_by_amount(0, 500).perform()
Estreat answered 14/5, 2022 at 13:21 Comment(0)
A
2

Insert this line, which scrolls down 925 pixels from the current position:

driver.execute_script("window.scrollBy(0, 925)")

Alix answered 15/1, 2021 at 6:14 Comment(1)
While this code may answer the question, including an explanation of how or why this solves the problem would really help to improve the quality of your post. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply.Granese
A
1

A loop using the send_keys method of scrolling the page:

import time
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

pre_scroll_height = driver.execute_script('return document.body.scrollHeight;')
while True:
    driver.find_element(By.TAG_NAME, 'body').send_keys(Keys.END)
    time.sleep(5)
    post_scroll_height = driver.execute_script('return document.body.scrollHeight;')

    print(pre_scroll_height, post_scroll_height)
    if pre_scroll_height == post_scroll_height:
        break
    pre_scroll_height=post_scroll_height
Auklet answered 16/3, 2022 at 5:35 Comment(0)
C
1

Here is a method I wrote to slowly scroll down to a target element.

You can pass it either the element's Y position or its CSS selector.

It scrolls much like a mouse wheel does.

Once this method has been called, you can call it again with the same driver object and a new target element; it will then scroll up or down to wherever that element is.

import random
import time

def slow_scroll_to_element(self, driver, element_selector=None, target_yth_location=None):
    current_scroll_position = int(driver.execute_script("return window.scrollY"))
    
    if element_selector:
        target_yth_location = int(driver.execute_script("return document.querySelector('{}').getBoundingClientRect()['top'] + window.scrollY".format(element_selector)))
    
    scrollSpeed = 100 if target_yth_location-current_scroll_position > 0 else -100

    def chunks(a, n):
        k, m = divmod(len(a), n)
        return (a[i*k+min(i, m):(i+1)*k+min(i+1, m)] for i in range(n))
    
    for l in list(chunks(list(range(current_scroll_position, target_yth_location, scrollSpeed)) + list([target_yth_location+(-scrollSpeed if scrollSpeed > 0 else scrollSpeed)]), 3)):
        for pos in l:
            driver.execute_script("window.scrollTo(0, "+str(pos)+");")
            time.sleep(0.1)
        time.sleep(random.randint(1,3))
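
For comparison, here is a simplified sketch of the stepping logic on its own. It is not the method above: it lands exactly on the target and leaves out the random pauses, and `scroll_positions` and `slow_scroll` are names chosen for illustration.

```python
import time


def scroll_positions(start, target, step=100):
    """Return the Y offsets to visit when moving from `start` to `target`.

    Works in both directions; the last entry is always exactly `target`.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if start == target:
        return []
    signed = step if target > start else -step
    positions = list(range(start + signed, target, signed))
    positions.append(target)
    return positions


def slow_scroll(driver, target, step=100, pause=0.1):
    """Move the window to Y offset `target` in small steps."""
    start = int(driver.execute_script("return window.scrollY"))
    for y in scroll_positions(start, target, step):
        driver.execute_script("window.scrollTo(0, arguments[0]);", y)
        time.sleep(pause)
```

Keeping the position list in a separate function makes it easy to check the step sizes without opening a browser.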
Changeling answered 26/6, 2022 at 13:17 Comment(0)
I
1

Scroll to an element: Find the element and scroll using this code.

scroll_element = driver.find_element(By.XPATH, "your element xpath")
driver.execute_script("arguments[0].scrollIntoView();", scroll_element)
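
If the element ends up hidden behind a sticky header, passing an options object to `scrollIntoView` may help, since the standard `block` option controls the vertical alignment. A small sketch, where `scroll_into_view` is a hypothetical wrapper name:

```python
VALID_BLOCKS = ("start", "center", "end", "nearest")


def scroll_into_view(driver, element, block="center"):
    """Scroll `element` into the viewport.

    `block` is passed to the DOM scrollIntoView() options and sets where the
    element is aligned vertically: 'start', 'center', 'end' or 'nearest'.
    """
    if block not in VALID_BLOCKS:
        raise ValueError(f"block must be one of {VALID_BLOCKS}")
    driver.execute_script(
        "arguments[0].scrollIntoView({block: arguments[1]});", element, block
    )
```

With the element found above, `scroll_into_view(driver, scroll_element)` would centre it vertically instead of aligning it to the top edge.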
Isomagnetic answered 3/2, 2023 at 4:3 Comment(0)
I
1

Would you consider using an extension of Selenium so you don't have to code everything yourself? In full disclosure, I'm the author of the Browserist package. Browserist is a lightweight, less verbose extension of the Selenium web driver that makes browser automation even easier. Simply install the package with pip install browserist.

Browserist has several options for scrolling. Whether it's scrolling to specific elements, a few pixels down or up, a whole page down or up, or to the end or top of the page, only a few lines of code are needed. Examples:

from browserist import Browser

browser = Browser()
browser.open.url("https://stackoverflow.com")
browser.scroll.into_view("/html/body/div[3]/div[2]/div[1]/div[3]/div/div/div[6]")
browser.scroll.page.to_end()
browser.scroll.page.to_top()
browser.scroll.page.down()
browser.scroll.down_by(100)
browser.scroll.up_by(50)

Here's what I get (slowed down, as Browserist finishes the job quickly). I hope this is helpful. Let me know if you have questions.

Example of scrolling with Browserist

Intrinsic answered 11/5, 2023 at 17:59 Comment(0)
P
1

There are several ways to do this, but all of them have a limitation if you are using them on a site with infinite loading.

The limitation is the waiting time after each scroll, which is a problem because we cannot know other users' internet speed. If I find a solution for this, I'll update this post.

1st solution

import time

loading_waiting_time = 1

# Get actual page height
previous_page_height = driver.execute_script("return document.body.scrollHeight")

# Run an infinite loop and stop it when new_page_height is equal to previous_page_height
while True:
    # Scroll to the end of page
    driver.execute_script('window.scrollTo(0, document.body.scrollHeight);')

    # Waiting until new images loaded
    time.sleep(loading_waiting_time)

    # Get new page height
    new_page_height = driver.execute_script("return document.body.scrollHeight")
    if new_page_height == previous_page_height:
        break
    previous_page_height = new_page_height

2nd solution

This solution is good for a non-fixed footer.

import time
from selenium.webdriver.common.by import By

loading_waiting_time = 1

# Get actual page height
previous_page_height = driver.execute_script("return document.body.scrollHeight")

# Run an infinite loop and stop it when new_page_height is equal to previous_page_height
while True:
    # Scroll to `footer` using JS
    footer_element = driver.find_element(By.TAG_NAME, 'footer')
    driver.execute_script('arguments[0].scrollIntoView(true)', footer_element)
    
    # Waiting until new images loaded
    time.sleep(loading_waiting_time)

    # Get new page height
    new_page_height = driver.execute_script("return document.body.scrollHeight")
    if new_page_height == previous_page_height:
        break
    previous_page_height = new_page_height

3rd solution

This solution is good for a non-fixed footer.

import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

loading_waiting_time = 1

# Get actual page height
previous_page_height = driver.execute_script("return document.body.scrollHeight")

# Run an infinite loop and stop it when new_page_height is equal to previous_page_height
while True:
    # Wait up to 10 seconds until `footer` is visible
    WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.TAG_NAME, 'footer')))

    # Waiting until new images loaded
    time.sleep(loading_waiting_time)

    # Get new page height
    new_page_height = driver.execute_script("return document.body.scrollHeight")
    if new_page_height == previous_page_height:
        break
    previous_page_height = new_page_height
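
One possible way to soften the fixed-wait limitation is to poll the page height and continue as soon as it grows, with a timeout as the upper bound. This is only a sketch: `wait_for_growth`, `timeout` and `poll` are names made up for illustration, and it assumes new content always increases document.body.scrollHeight.

```python
import time


def wait_for_growth(driver, old_height, timeout=10.0, poll=0.2):
    """Wait until document.body.scrollHeight exceeds `old_height`.

    Returns the new height as soon as it is seen, or None if the height has
    not grown by the time `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        height = driver.execute_script("return document.body.scrollHeight")
        if height > old_height:
            return height
        if time.monotonic() >= deadline:
            return None
        # Check again shortly instead of sleeping for the whole timeout
        time.sleep(poll)
```

In the loops above, this could replace `time.sleep(loading_waiting_time)` plus the height comparison: scroll, call `wait_for_growth(driver, previous_page_height)`, and break when it returns None.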
Payola answered 15/8, 2023 at 13:19 Comment(0)
G
0
driver.execute_script("document.getElementById('your ID Element').scrollIntoView();")

it's working for my case.

Gilgamesh answered 11/6, 2020 at 13:44 Comment(0)
C
0

Just a small variation of the solutions provided so far: sometimes in scraping you have to meet the following requirements:

  • Keep scrolling step by step. Otherwise if you always jump to the bottom some elements are loaded only as containers/divs but their content is not loaded because they were never visible (because you jumped straight to the bottom);
  • Allow enough time for content to be loaded;
  • It's not an infinite scroll page, there is an end and you have to identify when the end is reached;

Here is a simple implementation:

from time import sleep
def keep_scrolling_to_the_bottom():
    while True:
        previous_scrollY = my_web_driver.execute_script( 'return window.scrollY' )
        my_web_driver.execute_script( 'window.scrollBy( 0, 230 )' )
        sleep( 0.4 )
        if previous_scrollY == my_web_driver.execute_script( 'return window.scrollY' ):
            print( 'job done, reached the bottom!' )
            break

Tested and working on Windows 7 x64, Python 3.8.0, selenium 4.1.3, Google Chrome 107.0.5304.107, website for property rent.

Cornellcornelle answered 20/11, 2022 at 12:51 Comment(0)
L
-1

Scroll to a specific element, position or end of the page:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com")

# Find the target element you want to scroll to
element = driver.find_element(By.ID, "target-element-id")

# Scroll to the target element
driver.execute_script("arguments[0].scrollIntoView();", element)

# Scroll to a specific position (x, y coordinates)
driver.execute_script("window.scrollTo(0, 500)")

# Scroll to the end of the page
driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
Leigha answered 30/5, 2023 at 16:8 Comment(1)
element = driver.find_element_by_id("'target-element-id") AttributeError: 'WebDriver' object has no attribute 'find_element_by_id'Vinasse
