Python Selenium: Finds h1 element but returns empty text string

I am trying to get the text in the header on this page:

iShares FTSE MIB UCITS ETF EUR (Dist)

The tag looks like this:

<h1 class="product-title" title="iShares FTSE MIB UCITS ETF EUR (Dist)"> iShares FTSE MIB UCITS ETF EUR (Dist) </h1>

I am using this xPath:

xp_name = ".//*[@class[contains(normalize-space(.), 'product-title')]]"

Retrieving via .text in Selenium WebDriver for Python:

new_name = driver.find_element_by_xpath(xp_name).text

The driver finds the element, but when I print new_name, the macOS Terminal shows only an empty string: ""

What could be the reason for this?

Note: I also tried some other xpath alternatives, getting the same result, for example with:

xp_name = ".//*[@id='fundHeader']//h1"
Catheycathi answered 15/4, 2017 at 18:33 Comment(0)

The problem is that there are two h1 elements with exactly the same outer HTML: the first is hidden, the second is not. You can check this by counting the matches:

print(len(driver.find_elements_by_xpath('//h1[@class="product-title "]')))

The text property returns text only from visible elements, while the textContent attribute also returns the text of hidden ones.

Try to replace

new_name = driver.find_element_by_xpath(xp_name).text

with

new_name = driver.find_element_by_xpath(xp_name).get_attribute('textContent')

or simply select the second (visible) header:

driver.find_elements_by_xpath('//h1[@class="product-title "]')[1].text
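
A minimal sketch that combines the two options above into one helper, under a few assumptions: `visible_text` is a hypothetical name, and it accepts any objects exposing the WebElement members `is_displayed()`, `get_attribute()` and `.text`. It prefers the first displayed match and falls back to `textContent` when every match is hidden. `FakeElement` is only a stand-in so the snippet runs without a browser.

```python
def visible_text(elements):
    """Return the text of the first displayed element.

    Falls back to the stripped textContent of the first element when
    every match is hidden; returns "" if there are no matches at all.
    """
    for element in elements:
        if element.is_displayed():
            return element.text.strip()
    if elements:
        # .text is empty for hidden elements, so read the DOM property instead.
        return (elements[0].get_attribute("textContent") or "").strip()
    return ""


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, content, displayed):
        self._content = content
        self._displayed = displayed

    @property
    def text(self):
        # Mimics the behavior described above: hidden elements report "".
        return self._content.strip() if self._displayed else ""

    def is_displayed(self):
        return self._displayed

    def get_attribute(self, name):
        return self._content if name == "textContent" else None


headers = [
    FakeElement(" iShares FTSE MIB UCITS ETF EUR (Dist) ", displayed=False),
    FakeElement(" iShares FTSE MIB UCITS ETF EUR (Dist) ", displayed=True),
]
print(visible_text(headers))
```

With a real driver, the list would come from `driver.find_elements(By.XPATH, '//h1[@class="product-title "]')`, with `By` imported from `selenium.webdriver.common.by`. Recent Selenium 4 releases no longer ship the `find_elements_by_xpath` shortcut used above.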
Bombe answered 15/4, 2017 at 19:2 Comment(2)
Might or might not be relevant: for me the problem was that the element I was trying to fetch just wasn't loaded yet, so a 'time.sleep(1)' fixed it. The way the website was set up, it wouldn't throw an error, though. - Northern
Just want to say thank you for this answer. I spent a couple of hours trying to figure out why I got " " in my scraping results; only after I added .get_attribute('textContent') did I get the desired output. - Grimaldo

As @ahmad-moussa mentioned, the solution for me was:

import time

(...)

time.sleep(1)
# before 
<webelement>.text
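
A fixed `time.sleep(1)` only helps if the page happens to finish loading within that second. One hedged alternative is to poll until the text is non-empty. The sketch below uses only the standard library; `wait_for_text`, its parameters and the `slow_header` stub are hypothetical names chosen for illustration. Selenium also ships its own explicit wait, `WebDriverWait`, which serves the same purpose.

```python
import time


def wait_for_text(read_text, timeout=10.0, poll=0.5):
    """Call read_text() until it returns a non-blank string or timeout expires.

    read_text is any zero-argument callable, e.g. lambda: element.text.
    Returns the stripped text, or "" if nothing showed up in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        text = (read_text() or "").strip()
        if text:
            return text
        if time.monotonic() >= deadline:
            return ""
        time.sleep(poll)


# Stub that simulates a header whose text only appears on the third read.
calls = {"count": 0}


def slow_header():
    calls["count"] += 1
    return "" if calls["count"] < 3 else " iShares FTSE MIB UCITS ETF EUR (Dist) "


print(wait_for_text(slow_header, timeout=2.0, poll=0.01))
```

With a real element, the callable could be something like `lambda: driver.find_element(By.XPATH, xp_name).text`, so the lookup is retried on every poll.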
Render answered 24/1, 2021 at 21:13 Comment(0)
