Scrapyd: is it possible to return ERROR status for a job?
I have an application which schedules scrapy crawl jobs via scrapyd. Items flow nicely into the DB, and I can monitor the job status via the listjobs.json endpoint. So far so good, and I can even tell when jobs have finished.

However, sometimes jobs fail, perhaps because of an HTTP error or bad credentials. I would like to access the status of finished jobs, preferably through the scrapyd API. Extending what listjobs.json gives me today, I would love a result that looks like:

{
    "status": "ok",
    "error": [{"id": "78391cc0fcaf11e1b0090800272a6d06", "spider": "spider1"}],
    "running": [{"id": "422e608f9f28cef127b3d5ef93fe9399", "spider": "spider2", "start_time": "2012-09-12 10:14:03.594664"}],
    "finished": [{"id": "2f16646cfcaf11e1b0090800272a6d06", "spider": "spider3", "start_time": "2012-09-12 10:14:03.594664", "end_time": "2012-09-12 10:24:03.594664"}]
}
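For context, the real listjobs.json endpoint only exposes pending/running/finished buckets, with no error information. A small sketch of polling and classifying that response (the URL and the fetch_jobs helper are assumptions for illustration; classify only separates what the API actually returns):

```python
import json
from urllib.request import urlopen

SCRAPYD_URL = "http://localhost:6800"  # scrapyd's default port (assumed local)

def fetch_jobs(project):
    # GET /listjobs.json?project=<name> is a real scrapyd endpoint; it returns
    # {"status": "ok", "pending": [...], "running": [...], "finished": [...]}
    with urlopen(f"{SCRAPYD_URL}/listjobs.json?project={project}") as resp:
        return json.load(resp)

def classify(jobs):
    """Split a listjobs.json payload into running/finished job ids.

    Note: scrapyd lumps failed jobs into "finished" too, which is exactly
    the limitation the question is about.
    """
    running = {j["id"] for j in jobs.get("running", [])}
    finished = {j["id"] for j in jobs.get("finished", [])}
    return running, finished

# Example with a payload shaped like the one in the question:
sample = {
    "status": "ok",
    "running": [{"id": "422e608f", "spider": "spider2"}],
    "finished": [{"id": "2f16646c", "spider": "spider3"}],
}
running, finished = classify(sample)
print(running, finished)  # {'422e608f'} {'2f16646c'}
```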

Of course, I could have the jobs themselves update some DB or file and check that from the app, but I was wondering if there's a cleaner way.
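The DB workaround mentioned above can be sketched with the stdlib: each spider records its own outcome (e.g. from a spider_closed handler), and the app queries the table alongside listjobs.json. The table and function names here are hypothetical, not part of scrapyd:

```python
import sqlite3

def init_store(conn):
    # One row per scrapyd job id; spiders upsert their outcome here.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS job_status (
               job_id TEXT PRIMARY KEY,
               spider TEXT,
               status TEXT,   -- 'ok' or 'error'
               error  TEXT    -- message when status = 'error'
           )"""
    )

def record_status(conn, job_id, spider, status, error=None):
    conn.execute(
        "INSERT OR REPLACE INTO job_status VALUES (?, ?, ?, ?)",
        (job_id, spider, status, error),
    )

def failed_jobs(conn):
    # The "error" bucket that listjobs.json doesn't provide.
    return conn.execute(
        "SELECT job_id, spider, error FROM job_status WHERE status = 'error'"
    ).fetchall()

conn = sqlite3.connect(":memory:")
init_store(conn)
record_status(conn, "78391cc0", "spider1", "error", "HTTP 403")
record_status(conn, "2f16646c", "spider3", "ok")
print(failed_jobs(conn))  # [('78391cc0', 'spider1', 'HTTP 403')]
```

In a real spider this would typically hang off Scrapy's spider_closed signal, whose `reason` argument distinguishes a normal finish from a shutdown.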

Doublespace answered 3/3, 2016 at 16:44 Comment(1)
Ended up implementing it via a database. – Doublespace

© 2022 - 2024 — McMap. All rights reserved.