web-scraping Questions

10

import requests from bs4 import BeautifulSoup from requests_html import HTMLSession url="https://dmarket.com/ingame-items/item-list/csgo-skins?title=recoil%20case" sesion = HTMLSession() ...
Elliellicott asked 19/2 at 21:4

4

Solved

I am having trouble sorting a wiki table and hope someone who has done it before can give me advice. From the List_of_current_heads_of_state_and_government I need countries (works with the code below) ...
Heulandite asked 15/5, 2018 at 16:56

3

I am working on a LinkedIn web scraping project. I am trying to get the list of companies that interest someone (note that I am not using the API). It is a dynamic website, so I would need to scroll d...
Reparation asked 30/8, 2017 at 12:58

12

I'm trying to use the webdriver and Selenium. It was working fine a couple of days ago, but I'm now getting this error: [Errno 8] Exec format error: '/Users/[USER]/.wdm/...
Dartmouth asked 24/7 at 17:43

4

Solved

undetected_chromedriver with webdriver_manager was working well a few days ago for scraping websites, but out of nowhere it started throwing the error: OSError: [Errno 8] Exec format error: '/Users/p...

3

Google has an API for downloading search suggestions: https://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/xml_reference/query_suggestion.html Unfortunately, as far as I...
Elias asked 14/6, 2015 at 15:23

3

Solved

I'm creating a web API that scrapes a given URL and sends the result back. I am using Puppeteer to do this. I asked this question: Puppeteer not behaving like in Developer Console and received an answer ...
Resinate asked 9/9, 2020 at 20:8

11

I'm new to web scraping and want to download all images on a webpage using puppeteer: const puppeteer = require('puppeteer'); let scrape = async () => { // Actual Scraping goes Here... const...
Vasomotor asked 27/9, 2018 at 17:21

3

Has anyone tried to extract the individual risk factors from the Risk Factors section (i.e., Item 1A) of a company's EDGAR 10-K filings using BeautifulSoup or any other web scrap...
Philosophy asked 17/6, 2020 at 13:31

6

After running the scrapy shell with the defined URL, I get the following attribute error: AttributeError: module 'OpenSSL.SSL' has no attribute 'SSLv3_METHOD' scrapy shell ...
Ineffectual asked 26/9, 2022 at 19:49

4

Solved

I'm wondering whether it's possible in Google Sheets to convert a Gregorian calendar date to Jalali using a function. I have dates such as: February 20, 2021 4:30 AM. I need to display this date in...
Straightforward asked 22/2, 2021 at 12:41

9

Solved

As you can see in the sample code below, I'm using Puppeteer with a cluster of workers in Node to run multiple website screenshot requests for a given URL: const cluster = require('cluster')...

3

Solved

I'm trying to use PhantomJS to capture a screenshot of a URL; however, when I call PhantomJS (from either the command line or a web app) it hangs and seems to never execute the "exit()" call. I can't ...
Ganymede asked 20/5, 2013 at 20:24

9

Solved

I am trying to scrape this site with Selenium. I want to click the "Next Page" button, and for this I use: driver.find_element_by_class_name('pagination-r').click() It works for many pages but not for...

4

Solved

What are the steps to upload the crawled data from Scrapy to Amazon S3 as a csv/jsonl/json file? All I could find on the internet was how to upload scraped images to the S3 bucket. I'm currently...
Hettiehetty asked 5/8, 2016 at 11:24

2

Solved

I'm relatively new to using Chrome developer tools/doing XPath searches/this kind of programming in general, so please excuse any incorrect terminology or vague-sounding descriptions. I think the s...
Decalescence asked 7/12, 2016 at 20:12

1

Solved

I attempted the following: Utilize the German stock exchange's API (https://api.boerse-frankfurt.de/v1/search/equity_search) to retrieve index values. The API can be accessed externally using para...
Occidentalize asked 10/4 at 20:59

3

Solved

I want to log into my Amazon account and retrieve my purchase history programmatically in Java. I did a lot of research and came across screen scraping. Is this the only way, or does Amazon provide api...
Bedard asked 2/4, 2013 at 7:22

6

I use this program to get the JSON data from https://www.nseindia.com/api/option-chain-indices?symbol=NIFTY but since this morning it's not working, as it returns <Response [401]>. The link lo...
Professorship asked 20/9, 2020 at 16:43

4

Solved

I am trying to extract the link inside an href, but all I am getting is the text inside the element. The website code is the following: <div class="item-info-container "> <a hre...
Syverson asked 6/7, 2023 at 1:10

1

I have a page with cards. I click on the title of each card, which takes me to a page; then I click on a link on that page, which takes me to another page where I extract the data that I want. After that I navigate back...
Oversight asked 7/3 at 13:39

5

I have the following code, which stores every src in an array. I would like to store only the src of img elements with class name xyz: const imgs = await page.$$eval('img[src]', imgs => imgs.map(img => img.getAtt...
Immobility asked 11/3, 2019 at 6:43

1

Chrome 122.0.6261.95, Chrome driver 122.0.6261.94, Python 3.8.3. If I comment out option.add_argument("--headless=new"), print(len(elements)) prints 2. Otherwise, it cannot print a...

7

Solved

I am trying to scrape a free proxy listing website, but I haven't been able to scrape the proxies. Below is my code: import requests import re url = 'https://free-proxy-list.net/' ...
Birdhouse asked 24/1, 2018 at 15:58

3

Solved

I am writing an application in Node.js that will navigate to a website, click a button on the website, and then extract certain pieces of data from the website. All is going well except for the but...
Airhead asked 19/6, 2019 at 20:26
