How to pass parameters to a Scrapy crawler from Scrapyd?
I can run a spider in Scrapy with a simple command:

scrapy crawl custom_spider -a input_val=5 -a input_val2=6

where input_val and input_val2 are the values I'm passing to the spider, and this works fine.

However, when I schedule the spider with Scrapyd by running

curl http://localhost:6800/schedule.json -d project=crawler -d input_val=5 -d input_val2=6 -d spider=custom_spider

it throws an error:

    spider = cls(*args, **kwargs)
exceptions.TypeError: __init__() got an unexpected keyword argument '_job'

How do I get this to work?

Edit: This is inside my initializer:

def __init__(self, input_val=None, input_val2=None, *args, **kwargs):
    self.input_val = input_val
    self.input_val2 = input_val2
    super(CustomSpider, self).__init__(*args, **kwargs)

Asked by Anglaangle on 26/8/2015 at 10:20

Be sure to support arbitrary keyword arguments in your spider and to call __init__ with super(), as shown in the docs for spider arguments:

import scrapy

class MySpider(scrapy.Spider):
    name = 'myspider'

    def __init__(self, category=None, *args, **kwargs):
        super(MySpider, self).__init__(*args, **kwargs)  # <- important
        self.category = category

Scrapyd supplies the job ID to the spider as an extra _job keyword argument (see the Scrapyd source code), which is exactly the argument your __init__ was rejecting.
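
Applied to the spider from the question, the pattern might look roughly like this (a minimal sketch, not the asker's actual project; the start_urls value is a placeholder, and _job only exists when the spider is launched through Scrapyd):

import scrapy

class CustomSpider(scrapy.Spider):
    name = 'custom_spider'
    start_urls = ['http://example.com']  # placeholder URL, not from the question

    def __init__(self, input_val=None, input_val2=None, *args, **kwargs):
        # Hand everything else (including Scrapyd's _job) to scrapy.Spider,
        # whose __init__ copies unknown keyword arguments onto the instance,
        # so the job ID ends up available as self._job.
        super(CustomSpider, self).__init__(*args, **kwargs)
        self.input_val = input_val
        self.input_val2 = input_val2

    def parse(self, response):
        # Arguments passed with -a (scrapy crawl) or -d (Scrapyd) arrive as strings.
        self.logger.info('input_val=%s, input_val2=%s, job=%s',
                         self.input_val, self.input_val2,
                         getattr(self, '_job', None))

Scheduled with the curl command from the question, input_val and input_val2 then come through as the strings '5' and '6', and the extra _job argument is absorbed by **kwargs instead of triggering the TypeError.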

Answered by Hernandez on 26/8/2015 at 12:56

Comments:

Anglaangle: Thank you, I've edited the question with the constructor; the error is still there. Any idea what is wrong?

Hernandez: @Anglaangle Hm, what versions of Scrapy and Scrapyd are you using? Could you provide a small self-contained sample project that reproduces the problem when deployed to Scrapyd?

Hernandez: @Anglaangle I was not able to reproduce the problem with the code you provided.

Anglaangle: Yes, I will put together that self-contained sample. The spider works with your code suggestions, but in some cases it doesn't (probably some misconfiguration on my end). Will let you know, thanks.
