scrapy Questions
2
Solved
Recently I have started to use Scrapy on a regular basis to analyze sites which demand the latest browser (user agent) for their content to show up.
Now, this may seem like an old time problem, yet...
Daubery asked 21/6, 2021 at 10:22
4
Solved
I'm trying to collect a few pieces of information about a bunch of different web sites. I want to produce one Item per site that summarizes the information I found across that site, regardless of w...
10
Solved
I installed Scrapy in my Python 2.7 environment on Windows 7, but when I try to start a new Scrapy project using scrapy startproject newProject, the command prompt shows this message:
'scrapy' is n...
Jojo asked 14/9, 2016 at 9:28
5
Solved
In my previous question, I wasn't very specific about my problem (scraping with an authenticated session with Scrapy), in the hope of being able to deduce the solution from a more general answer. I...
6
After running scrapy shell with the defined URL, I get the following attribute error:
AttributeError: module 'OpenSSL.SSL' has no attribute 'SSLv3_METHOD'
scrapy shell ...
Ineffectual asked 26/9, 2022 at 19:49
5
Is there a way to stop crawling when a specific condition is true (like scrap_item_id == predefine_value)? My problem is similar to Scrapy - how to identify already scraped urls but I want to ...
4
Is it possible to delay the retry of a particular Scrapy Request? I have a middleware which needs to defer the request of a page until a later time. I know how to do the basic deferral (end of queue...
4
Solved
What are the steps to upload the crawled data from Scrapy to Amazon S3 as a csv/jsonl/json file? All I could find on the internet was how to upload scraped images to an S3 bucket.
I'm currently...
Hettiehetty asked 5/8, 2016 at 11:24
5
Solved
At dawn my code was working perfectly, but when I woke up today it was no longer working. I didn't change a single line of code. I also checked whether Firefox had updated, and it hadn't, and I have no id...
Impact asked 15/4, 2022 at 15:26
8
Solved
I have Visual Studio Code on a Windows machine, on which I am making a new Scrapy crawler. The crawler is working fine, but I want to debug the code, for which I am adding this in my launch.json fil...
Nieman asked 9/3, 2018 at 20:47
6
Solved
I am scraping a soccer site and the spider (a single spider) gets several kinds of items from the site's pages: Team, Match, Club etc.
I am trying to use the CSVItemExporter to store these items in...
11
I get a twisted.internet.error.ReactorNotRestartable error when I execute the following code:
from time import sleep
from scrapy import signals
from scrapy.crawler import CrawlerProcess
from scrapy.util...
Hickory asked 9/10, 2016 at 17:47
27
Solved
I'm practicing the code from 'Web Scraping with Python', and I keep having this certificate problem:
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
pages = set...
Popular asked 8/5, 2018 at 14:32
2
Solved
I need the values in a dict, but Item uses some abstraction on top of it. How do I get the fields as a dict from an item?
I know Scrapy now allows a dict to be returned in place of an Item. But I alread...
Upbraid asked 6/8, 2015 at 12:20
4
I am a newbie to Python. I am running the 32-bit version of Python 2.7.3 on a 64-bit OS. (I tried 64-bit but it didn't work out.)
I followed the tutorial and installed Scrapy on my machine. I have created o...
Salado asked 12/4, 2012 at 11:58
7
I just started programming in Python. I want to use Scrapy to create a bot, and it showed
TypeError: Object of type 'bytes' is not JSON serializable when I ran the project.
import json
import codecs...
Spam asked 21/6, 2017 at 16:54
24
Solved
This is Windows 7 with Python 2.7.
I have a Scrapy project in a directory called caps (this is where scrapy.cfg is).
My spider is located in caps\caps\spiders\campSpider.py
I cd into the scrapy pr...
25
Solved
2
I am trying to crawl data from a list of URLs. I did this with the code below, and it ran yesterday without any error.
But today, when I came back and ran the code again, there was an er...
Befriend asked 28/8, 2023 at 19:32
1
Solved
I'm learning Python scraping with Scrapy. I did exactly what the tutorial teaches.
But I got an error. Please help!
My Python code:
import scrapy
class BookSpider(scrapy.Spider):
nam...
Sibbie asked 29/8, 2023 at 18:38
5
Solved
This is not working anymore; Scrapy's API has changed.
Now the documentation features a way to "Run Scrapy from a script", but I get the ReactorNotRestartable error.
My task:
from celery import Ta...
4
Solved
I'm building a data extract using Scrapy and want to normalize a raw string pulled out of an HTML document. Here's an example string:
Sapphire RX460 OC 2/4GB
Notice two groups of two whitespace...
3
Solved
I'm trying to create a custom Scrapy Item Exporter based off JsonLinesItemExporter so I can slightly alter the structure it produces.
I have read the documentation here http://doc.scrapy.org/en/la...
5
I have disabled the default Scrapy cookie option, so I have to set cookies manually.
COOKIES_ENABLED = False
COOKIES_DEBUG = True
Now I need to set a cookie with the value that is received as th...
Paleo asked 6/4, 2016 at 6:32
7
I'm new to Scrapy and recently started using it on an M1 MacBook Air. I've encountered an issue.
For example, when I try to do something like this:
scrapy shell bbc.com
It returns: Memor...
Bazemore asked 16/5, 2021 at 12:46
© 2022 - 2025 — McMap. All rights reserved.