web-scraping

10

Python requests-html is not downloading Chromium

import requests from bs4 import BeautifulSoup from requests_html import HTMLSession url="https://dmarket.com/ingame-items/item-list/csgo-skins?title=recoil%20case" sesion = HTMLSession() ...

python web-scraping python-requests-html

Elliellicott asked 19/2 at 21:4

4

Solved

Scraping Wikipedia tables with Python selectively

I am having trouble sorting a wiki table and hope someone who has done it before can give me advice. From the List_of_current_heads_of_state_and_government page I need the countries (works with the code below) ...

python-3.x web-scraping beautifulsoup wikipedia

Heulandite asked 15/5, 2018 at 16:56

3

Selenium Python: How to Scroll Down in a Pop Up window

I am working on a LinkedIn web scraping project. I am trying to get the list of companies that interest someone (note that I am not using the API). It is a dynamic website, so I would need to scroll d...

javascript python selenium web-scraping linkedin-api

Reparation asked 30/8, 2017 at 12:58

12

Issue with Selenium and webdriver

I'm trying to use the webdriver and Selenium. It was working fine a couple of days ago, but I now receive this error: [Errno 8] Exec format error: '/Users/[USER]/.wdm/...

selenium-webdriver web-scraping webdriver

Dartmouth asked 24/7 at 17:43

4

Solved

THIRD_PARTY_NOTICES.chromedriver - Exec format error - undetected_chromedriver

undetected_chromedriver with webdriver_manager was working well a few days ago for scraping websites, but out of nowhere it started throwing the error: OSError: [Errno 8] Exec format error: '/Users/p...

python selenium-webdriver web-scraping undetected-chromedriver

Cull asked 29/7 at 11:27

3

How to spoof location so the Google autocomplete API will provide local results, ideally with R

Google has an API for downloading search suggestions: https://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/xml_reference/query_suggestion.html Unfortunately, as far as I...

r geolocation web-scraping vpn spoofing

Elias asked 14/6, 2015 at 15:23

3

Solved

Why does headless need to be false for Puppeteer to work?

I'm creating a web API that scrapes a given URL and sends the result back. I am using Puppeteer to do this. I asked this question: Puppeteer not behaving like in Developer Console and received an answer ...

javascript web-scraping puppeteer

Resinate asked 9/9, 2020 at 20:8

11

How can I download images on a page using puppeteer?

I'm new to web scraping and want to download all images on a webpage using puppeteer: const puppeteer = require('puppeteer'); let scrape = async () => { // Actual Scraping goes Here... const...

javascript web-scraping puppeteer google-chrome-headless

Vasomotor asked 27/9, 2018 at 17:21

3

Web Scraping Risk Factors from 10-K EDGAR

Has anyone tried to extract the individual risk factors from the Risk Factors section (i.e., Item 1A) of a company's EDGAR 10-K filings using BeautifulSoup or any other web scrap...

html python-3.x regex web-scraping beautifulsoup

Philosophy asked 17/6, 2020 at 13:31

6

AttributeError: module 'OpenSSL.SSL' has no attribute 'SSLv3_METHOD'

After running the scrapy shell with the defined URL, I get the following error: AttributeError: module 'OpenSSL.SSL' has no attribute 'SSLv3_METHOD' scrapy shell ...

python python-3.x web-scraping scrapy

Ineffectual asked 26/9, 2022 at 19:49

4

Solved

Convert date locale in Google Sheets from Gregorian calendar to Jalali calendar

I'm wondering whether it's possible in Google Sheets to convert a Gregorian calendar date to Jalali using a function. I have a date such as: February 20, 2021 4:30 AM. I need to display this date in...

function date google-sheets web-scraping google-sheets-formula

Straightforward asked 22/2, 2021 at 12:41

9

Solved

Puppeteer - Protocol error (Page.navigate): Target closed

As you can see in the sample code below, I'm using Puppeteer with a cluster of workers in Node to run multiple website screenshot requests for a given URL: const cluster = require('cluster')...

node.js web-scraping puppeteer google-chrome-headless node-cluster

Vizzone asked 1/8, 2018 at 8:54

3

Solved

PhantomJS hanging when called from CLI or Web

I'm trying to use PhantomJS to capture a screenshot of a URL. However, when I call PhantomJS (from either the command line or a web app) it hangs and seems to never execute the "exit()" call. I can't ...

javascript web-scraping phantomjs

Ganymede asked 20/5, 2013 at 20:24

9

Solved

Selenium-Debugging: Element is not clickable at point (X,Y)

I am trying to scrape this site with Selenium. I want to click the "Next Page" button; for this I do: driver.find_element_by_class_name('pagination-r').click() It works for many pages but not for...

python selenium-webdriver web-scraping selenium-firefoxdriver

Naquin asked 17/6, 2016 at 10:18

4

Solved

How to upload crawled data from Scrapy to Amazon S3 as csv or json?

What are the steps to upload the crawled data from Scrapy to Amazon S3 as a csv/jsonl/json file? All I could find on the internet was how to upload scraped images to the S3 bucket. I'm currently...

python json amazon-s3 web-scraping scrapy

Hettiehetty asked 5/8, 2016 at 11:24

2

Solved

XPath Search in Chrome JS Console ("$x(...)") outputs arrays (jQuery Objects?) rather than sections of HTML text (DOM elements?)

I'm relatively new to using Chrome developer tools/doing XPath searches/this kind of programming in general, so please excuse any incorrect terminology or vague-sounding descriptions. I think the s...

javascript html xpath web-scraping google-chrome-devtools

Decalescence asked 7/12, 2016 at 20:12

1

Solved

Data scraping fails: Seeking assistance

I attempted the following: Utilize the German stock exchange's API (https://api.boerse-frankfurt.de/v1/search/equity_search) to retrieve index values. The API can be accessed externally using para...

python web-scraping python-requests http-headers stock

Occidentalize asked 10/4 at 20:59

3

Solved

Retrieving Amazon order history programmatically using Java

I want to log into my Amazon account and retrieve my purchase history programmatically in Java. I did a lot of research and came across screen scraping. Is this the only way or does Amazon provide api...

java amazon-web-services web-scraping

Bedard asked 2/4, 2013 at 7:22

6

Python Requests get returns response code 401 for NSE India website

I use this program to get the JSON data from https://www.nseindia.com/api/option-chain-indices?symbol=NIFTY but since this morning it's not working, as it returns <Response [401]>. The link lo...

python python-3.x web-scraping python-requests

Professorship asked 20/9, 2020 at 16:43

4

Solved

Get href link using python playwright

I am trying to extract the link inside an href, but all I am finding is the text inside the element. The website code is the following: <div class="item-info-container "> <a hre...

python web-scraping xpath playwright playwright-python

Syverson asked 6/7, 2023 at 1:10

1

Cannot find context with specified id in puppeteer

I have a page with cards. I click on the title of each, which takes me to a page; then I click on a link on that page, which takes me to another page where I extract the data that I want. After that I navigate back...

javascript node.js web-scraping puppeteer

Oversight asked 7/3 at 13:39

5

Get Image src with specific class in puppeteer

I have the following code, where I store every src in an array. I would like to store only img elements with the class name xyz: const imgs = await page.$$eval('img[src]', imgs => imgs.map(img => img.getAtt...

javascript node.js web-scraping puppeteer

Immobility asked 11/3, 2019 at 6:43

1

option.add_argument("--headless=new") doesn't work for Chrome 122 when scraping Kayak

Chrome 122.0.6261.95 Chrome driver 122.0.6261.94 Python 3.8.3 If I comment out option.add_argument("--headless=new"), print(len(elements)) prints 2. Otherwise, it cannot print a...

python selenium-webdriver web-scraping selenium-chromedriver

Punke asked 2/3 at 17:7

7

Solved

Scraping free proxy listing website

I am trying to scrape one of the free proxy listing websites, but I haven't been able to scrape the proxies. Below is my code: import requests import re url = 'https://free-proxy-list.net/' ...

python web-scraping

Birdhouse asked 24/1, 2018 at 15:58

3

Solved

How to fix '$(...).click is not a function' in Node/Cheerio

I am writing an application in node.js that will navigate to a website, click a button on the website, and then extract certain pieces of data from the website. All is going well except for the but...

javascript node.js web-scraping request cheerio

Airhead asked 19/6, 2019 at 20:26
