scrapinghub Questions

1

Solved

Background - TLDR: I have a memory leak in my project. I spent a few days looking through Scrapy's memory leak docs and can't find the problem. I'm developing a medium-sized Scrapy project, ~40k...
Upu asked 17/9, 2020 at 11:8

4

Solved

I am trying to programmatically call a spider through a script. I am unable to override the settings through the constructor using CrawlerProcess. Let me illustrate this with the default spider for ...
Kezer asked 28/2, 2017 at 14:48

0

I'm trying to use pygsheets in a script on ScrapingHub. The pygsheets part of the script begins with: google_client = pygsheets.authorize(service_file=CREDENTIALS_FILENAME, no_cache=True) spreadsh...

1

Solved

I have a problem running/deploying a custom script with shub-image. setup.py from setuptools import setup, find_packages setup( name = 'EU-Crawler', version = '1.0', packages = find_packages(...
Carnes asked 7/12, 2017 at 16:24

1

Below is the spider code: import scrapy class MyntraSpider(scrapy.Spider): custom_settings = { 'HTTPCACHE_ENABLED': False, 'dont_redirect': True, #'handle_httpstatus_list' : [302,307], #'CRAW...
Monopolist asked 16/12, 2017 at 6:32