Laravel run multiple scheduled tasks
I currently have a scheduled console command that runs every 5 minutes without overlap like this:

 $schedule->command('crawler')
             ->everyFiveMinutes()
             ->withoutOverlapping()
             ->sendOutputTo('../_laravel/storage/logs/scheduler-log.txt');

So it works great, but I currently have about 220 pages that take about 3 hours to finish in 5-minute increments, because I force it to crawl only 10 pages at each interval since each page takes 20-30 seconds to crawl due to various factors. Each page is a record in the database. If I end up having 10,000 pages to crawl, this method would not work because it would take more than 24 hours, and each page is supposed to be re-crawled once a day.

So my vendor allows up to 10 concurrent requests (or more with higher plans), so what's the best way to run it concurrently? If I just duplicate the scheduler code, does it run the same command twice, or 10 times if I duplicated it 10 times? Would that cause any issues?

And then I need to pass parameters to the console command, such as 1, 2, 3, etc., which I could use to determine which pages to crawl, i.e. 1 would be records 1-10, 2 would be the next records 11-20, and so on.

Using this StackOverflow answer, I think I know how to pass it along, like this:

 $schedule->command('crawler --sequence=1')

But how do I read that parameter within my Command class? Does it just become a regular PHP variable, i.e. $sequence?
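For reference, here is a minimal sketch of what I'm imagining; the `Crawler` class name, the `Page` model, and the chunk size of 10 are illustrative assumptions, and it assumes a Laravel version that supports the `$signature` property:

```php
// app/Console/Commands/Crawler.php (names are assumptions for illustration)
class Crawler extends Command
{
    // Declare the option so Artisan accepts --sequence; default is 1
    protected $signature = 'crawler {--sequence=1}';

    public function handle()
    {
        // Not a plain PHP variable: read it through the option() helper
        $sequence = (int) $this->option('sequence');

        // e.g. sequence 1 => records 1-10, sequence 2 => records 11-20, ...
        $pages = Page::orderBy('id')
            ->skip(($sequence - 1) * 10)
            ->take(10)
            ->get();
    }
}
```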

Diatonic answered 10/3, 2016 at 2:57 Comment(3)
Is it possible for you to post your command class? What class does it extend from?Brilliant
It's like 500 lines long so probably shouldn't paste the whole thing here. It extends Command class.Diatonic
What if you were to run multiple queue listeners and chunk your pages across jobs?Whim
  1. Better to use a queue for job processing
  2. On cron, add all jobs to the queue
  3. Run multiple queue workers, which will process jobs in parallel

Tip: this happened to us. A job added previously may not be complete yet when the cron adds the same task to the queue again, since queues work sequentially. To protect yourself from this situation, you should mark in the database when a task last completed, so you know when to execute the job again (and whether it was seriously delayed).
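A rough sketch of steps 1-3 above; the `Page` model, `crawled_at` column, and `CrawlPage` job class are assumed names, and the `dispatch` helper may differ depending on your Laravel version:

```php
// app/Console/Kernel.php: the scheduler only finds stale pages and queues them;
// the queue workers do the actual crawling in parallel.
protected function schedule(Schedule $schedule)
{
    $schedule->call(function () {
        Page::whereNull('crawled_at')
            ->orWhere('crawled_at', '<', Carbon::now()->subDay())
            ->each(function ($page) {
                dispatch(new CrawlPage($page)); // queued, not run inline
            });
    })->everyFiveMinutes();
}
```

Then start several workers (e.g. 5-10 `php artisan queue:work` processes, typically kept alive by Supervisor) so the queued jobs are crawled in parallel.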

Boulware answered 12/3, 2016 at 11:13 Comment(7)
But how would I have repeating queues? The reason I'm using the task scheduler is because I can have it run every 5 minutes and check if any of the pages need to be re-crawled. Sometimes there will be none and other times might be 200.Diatonic
I suggested using cron to identify which pages need to be crawled, then adding those pages as jobs to the queue. The running queue workers will pick jobs from the queue and crawl them one by one. (If you have 5 workers running, 5 pages will be crawled at a time.) What do you mean by "repeating queue"?Boulware
I see. The task scheduler in Laravel is basically a cron itself. So you're saying I should continue using task scheduler except only to check which pages need to be crawled and then pass them to the queue to handle the actual crawling process? I will read more into the queues to make sure they can do what I need...Diatonic
@Diatonic yes that is exactly what Shyam is saying - as long as the 'checking' part of the process is quick and easy (and you could even check these future 10000 pages in a few seconds) then the scheduled command should do the check every 5 minutes and add any pages that need 'refreshing' to the queue. Then your 5 or 10 queue workers will be able to run these refresh actions in parallel. When there's nothing on the queue they will 'sleep' so there's little overhead, and if there are lots of things on the queue they will process them one at a time (multiplied by the number of workers you have).Sealey
Thanks @alexrussell. How are queue workers started? I know I can start it via console/terminal, but I want it to be an automated process so once I add crawls to a queue, they run when the next "worker" is available.Diatonic
For this you need to look into utilities that you can use to run commands and keep them alive if/when they die. The current go-to tool for this is Supervisor. FWIW I only recently used queues in Laravel for the first time and the experience wasn't great, as there are multiple ways to run them. In the end I found that a queue worker (set to daemon mode, which is not the same as a real Linux daemon I may add) kept 'alive' using Supervisor was the best for my case.Sealey
Thanks @alexrussell. Supervisor ended up being pretty good and easy to setup on my Mac, I'm hoping the experience will be just as smooth on my server too.Diatonic
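For reference, a minimal Supervisor program entry for the setup described in the comments above; the file path, artisan path, and worker count are assumptions to adapt:

```ini
; /etc/supervisor/conf.d/laravel-worker.conf (path is an assumption)
[program:laravel-worker]
command=php /path/to/_laravel/artisan queue:work --sleep=3 --tries=3
process_name=%(program_name)s_%(process_num)02d
numprocs=5          ; run 5 workers in parallel
autostart=true
autorestart=true
```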

I found this on the documentation, I hope this is what you're looking for:

  • Retrieving Input

While your command is executing, you will obviously need to access the values for the arguments and options accepted by your application. To do so, you may use the argument and option methods:

  • Retrieving The Value Of A Command Argument

$value = $this->argument('name');

  • Retrieving All Arguments

$arguments = $this->argument();

  • Retrieving The Value Of A Command Option

$value = $this->option('name');

  • Retrieving All Options

$options = $this->option();

source
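Tying this back to the question: the `argument` and `option` methods only work for parameters declared in the command's definition. A small sketch, with hypothetical names:

```php
// Declared in the command class; "batch" and "sequence" are example names
protected $signature = 'crawler {batch} {--sequence=1}';

public function handle()
{
    $batch = $this->argument('batch');      // required argument
    $sequence = $this->option('sequence');  // option, defaults to "1"
}
```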

Extract answered 17/3, 2016 at 16:28 Comment(1)
Thanks. I think I'll give the queues a shot since that seems like a better option but I'll keep this in my back pocket just in case.Diatonic
