Running multiple spiders using scrapyd
I have multiple spiders in my project, so I decided to run them by uploading the project to a scrapyd server. I deployed the project successfully, and I can see all the spiders when I run the command

curl http://localhost:6800/listspiders.json?project=myproject

When I run the following command

curl http://localhost:6800/schedule.json -d project=myproject -d spider=spider2

only one spider runs, because only one spider is given. But I want to run multiple spiders, so is the following command the right way to run multiple spiders in scrapyd?

curl http://localhost:6800/schedule.json -d project=myproject -d spider=spider1,spider2,spider3........

Later I will run this command using a cron job, i.e. I will schedule it to run frequently.

Hibachi answered 9/7, 2012 at 7:45 Comment(0)

If you want to run multiple spiders using scrapyd, schedule them one by one. scrapyd will run them in the order they were scheduled, but not at the same time.
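
For example, a minimal sketch of doing that with a small bash loop, assuming the project name myproject and the example spider names spider1, spider2, spider3 from the question:

#!/bin/bash
# Schedule each spider with its own schedule.json request.
# scrapyd queues the jobs and runs them one after another, not concurrently.
for spider in spider1 spider2 spider3
do
    curl http://localhost:6800/schedule.json -d project=myproject -d spider="$spider"
done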

See also: Scrapy's Scrapyd too slow with scheduling spiders

Frasch answered 9/7, 2012 at 7:58 Comment(6)
Yes, I mean to run all the spiders with one command, not all concurrently. After deploying a project with multiple spiders, how can I schedule them using scrapyd? Is the above command useful? – Hibachi
Your command is invalid. doc.scrapy.org/en/latest/topics/scrapyd.html#schedule-json says that the spider argument should contain a single spider name, but you provided a list of spider names delimited by commas. Instead of http://localhost:6800/schedule.json -d project=myproject -d spider=spider1,spider2, do http://localhost:6800/schedule.json -d project=myproject -d spider=spider1, then http://localhost:6800/schedule.json -d project=myproject -d spider=spider2, and so on. – Frasch
If we do so, I expect this will be the same as the "scrapy crawl spider_name" command, so why did we upload this to the scrapyd server? And if I want to run all of these through cron jobs, I need to write all the commands on more than one line, right? – Hibachi
Actually, when scrapyd runs a spider, it uses almost the same command as "scrapy crawl spider_name". – Frasch
Oh, thanks warwaruk, but how can we run the spiders one by one then? Right now I am trying to run all the spiders (for example, 4 spiders) through cron jobs. Is there any way to run all the spiders one by one and schedule them to run every 2 or more hours? – Hibachi
Make a bash script which issues the scheduling commands for all your spiders, and put that script in cron. Alternatively, it can be a Python script which does the same without calling curl. An example is here. – Frasch
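
A rough sketch of the cron setup suggested in the last comment, assuming the loop shown in the answer has been saved as an executable script; the path /path/to/schedule_spiders.sh and the 2-hour interval are only placeholders:

# Make the scheduling script executable once:
chmod +x /path/to/schedule_spiders.sh

# Illustrative crontab entry (add it with "crontab -e"): run the script every 2 hours.
0 */2 * * * /path/to/schedule_spiders.sh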
