delayed_jobs vs resque vs beanstalkd?

3

64

Here are my needs:

  • Enqueue_in(10.hours, ... ) (DJ syntax is perfect.)
  • Multiple workers, running concurrently. (Resque or beanstalkd are good for this, but not DJ)
  • Must handle push and pop of 100 jobs a second. (I will need to run a test to make sure, but I think DJ can't handle this many jobs)

Resque and beanstalkd don't do the enqueue_in.

There is a plugin (resque_scheduler) that does it, but I'm not sure how stable it is.

Our environment is on Amazon, and they rolled out beanstalkd for free for anyone who has Amazon instances. That is a plus for us, but I'm still not sure what the best option is.

We run Rails 2.3, but we will be upgrading to Rails 3.0.3 soon.

But what is my best choice here? Am I missing another gem that does this job better?

I feel my only option that actually works now is the resque_scheduler.

Edit:

Sidekiq (https://github.com/mperham/sidekiq) is another option that you should check out.

Langevin answered 26/1, 2011 at 18:24 Comment(0)
136

For my projects I feel very comfortable with collectiveidea/delayed_job on Rails 2 and 3. I don't know beanstalkd, but I will try it soon :-). I have followed the suggestions in the Resque documentation, which I reproduce here.

Resque vs DelayedJob

How does Resque compare to DelayedJob, and why would you choose one over the other?

  • Resque supports multiple queues
  • DelayedJob supports finer grained priorities
  • Resque workers are resilient to memory leaks / bloat
  • DelayedJob workers are extremely simple and easy to modify
  • Resque requires Redis
  • DelayedJob requires ActiveRecord
  • Resque can only place JSONable Ruby objects on a queue as arguments
  • DelayedJob can place any Ruby object on its queue as arguments
  • Resque includes a Sinatra app for monitoring what's going on
  • DelayedJob can be queried from within your Rails app if you want to add an interface
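
The two "arguments" bullets above can be illustrated with a minimal sketch that uses only Ruby's standard library. It does not call Resque or DelayedJob. It only assumes that Resque-style arguments go through a JSON round trip and that DelayedJob-style payloads go through a YAML round trip. The EmailJob class is hypothetical.

```ruby
require "json"
require "yaml"

# Hypothetical job object, used only for illustration.
class EmailJob
  attr_reader :to, :mode

  def initialize(to, mode)
    @to = to
    @mode = mode
  end
end

# JSON round trip (the Resque-style constraint): only plain data survives.
# Symbols come back as strings, and a custom object would have to be
# reduced to primitives (ids, strings, hashes) by hand.
json_args = JSON.parse(JSON.generate(["a@example.com", { "mode" => :fast }]))

# YAML round trip (the DelayedJob-style approach): the object comes back
# as an instance of its class. Newer Psych versions restrict YAML.load,
# so use unsafe_load where it exists and fall back to load otherwise.
yaml_text = YAML.dump(EmailJob.new("a@example.com", :fast))
restored =
  if YAML.respond_to?(:unsafe_load)
    YAML.unsafe_load(yaml_text)
  else
    YAML.load(yaml_text)
  end

p json_args       # plain array with string values
p restored.class  # the original class
p restored.mode   # the symbol is preserved
```

In practice this is why Resque jobs usually receive record ids and look the record up again inside the job, while DelayedJob can be handed the object itself.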

If you're doing Rails development, you already have a database and ActiveRecord. DelayedJob is super easy to setup and works great. GitHub used it for many months to process almost 200 million jobs.

Choose Resque if:

  • You need multiple queues
  • You don't care / dislike numeric priorities
  • You don't need to persist every Ruby object ever
  • You have potentially huge queues
  • You want to see what's going on
  • You expect a lot of failure / chaos
  • You can setup Redis
  • You're not running short on RAM

Choose DelayedJob if:

  • You like numeric priorities
  • You're not doing a gigantic amount of jobs each day
  • Your queue stays small and nimble
  • There is not a lot failure / chaos
  • You want to easily throw anything on the queue
  • You don't want to setup Redis

Choose Beanstalkd if:

  • You like numeric priorities
  • You want an extremely fast queue
  • You don't want to waste your RAM
  • You want to serve a high number of jobs
  • You're fine with JSONable Ruby objects on a queue as arguments
  • You need multiple queues

In no way is Resque a "better" DelayedJob, so make sure you pick the tool that's best for your app.

A nice comparison of queueing backend speed:

            |    enqueue           work
------------+------------------------------------
delayed job |   200 jobs/sec     120 jobs/sec
resque      |  3800 jobs/sec     300 jobs/sec
rabbitmq    |  2500 jobs/sec    1300 jobs/sec
beanstalk   |  9000 jobs/sec    5200 jobs/sec
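
As a rough check against the 100 jobs/sec target from the question, the figures in the table can be turned into a headroom factor. This is only arithmetic on the numbers above, which were not re-measured here; the table does not state the hardware or the number of workers, so treat the result as an indication only. The names BENCHMARK, REQUIRED_RATE, and headroom are made up for this sketch.

```ruby
# Figures copied from the table above (jobs per second); not re-measured.
BENCHMARK = {
  "delayed job" => { enqueue: 200,  work: 120 },
  "resque"      => { enqueue: 3800, work: 300 },
  "rabbitmq"    => { enqueue: 2500, work: 1300 },
  "beanstalk"   => { enqueue: 9000, work: 5200 }
}.freeze

REQUIRED_RATE = 100 # jobs per second, the target stated in the question

# Headroom is how many times the required rate a backend sustains on its
# slower side (throughput is limited by the lower of the two rates).
def headroom(rates, required)
  [rates[:enqueue], rates[:work]].min.to_f / required
end

BENCHMARK.each do |name, rates|
  puts format("%-12s %5.1fx", name, headroom(rates, REQUIRED_RATE))
end
```

On these numbers every backend clears 100 jobs/sec, but delayed job has the least margin (1.2x), which is why running your own test before committing is worthwhile.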

Have a nice day!

P.S. There are RailsCasts about Resque, Delayed Job (revised version) and Beanstalkd. Have a look!

P.P.S. My favourite choice is now Sidekiq (very simple, fast and efficient for simple jobs); have a look at this page for comparison.

Reste answered 26/1, 2011 at 22:22 Comment(3)
Delayed job supports queues now (and has for a while). Great answer though. - Oculist
Does Heroku support Beanstalkd? - Perithecium
Outdated answer: DJ no longer handles "any" Ruby object, as the YAML parser now fails on exporting anonymous modules and file streams, so for example it is almost impossible to dump request.env safely #15790624 - Freespoken
9

Amazon Beanstalk isn't Beanstalkd.

Beanstalkd (the queue) does support delayed jobs: a delayed job can't be reserved from the queue until the given number of seconds has passed. If that is what Enqueue_in(10.hours, ... ) means, then it's just syntactic sugar that calculates the number of seconds and keeps the job unavailable until then.
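
To make the "syntactic sugar" point concrete, here is a minimal in-memory sketch in plain Ruby. It does not talk to beanstalkd: DelayQueue, put, reserve, and enqueue_in are hypothetical names that only mimic the idea of a per-job delay in seconds, and the clock is passed in explicitly so the behaviour is easy to follow.

```ruby
# Hypothetical in-memory stand-in for a queue with per-job delays.
class DelayQueue
  Job = Struct.new(:body, :ready_at)

  def initialize
    @jobs = []
  end

  # Store a job that becomes ready `delay` seconds after `now`.
  def put(body, delay:, now:)
    @jobs << Job.new(body, now + delay)
  end

  # Remove and return the body of the earliest job that is ready at `now`,
  # or return nil if every job is still delayed.
  def reserve(now:)
    index = @jobs.each_index
                 .select { |i| @jobs[i].ready_at <= now }
                 .min_by { |i| @jobs[i].ready_at }
    index && @jobs.delete_at(index).body
  end
end

SECONDS_PER_HOUR = 3600

# enqueue_in is only sugar: convert the interval to seconds and pass it
# along as the delay.
def enqueue_in(queue, hours, body, now:)
  queue.put(body, delay: hours * SECONDS_PER_HOUR, now: now)
end

queue = DelayQueue.new
start = 0 # seconds; an arbitrary starting point for the simulated clock
enqueue_in(queue, 10, "send_report", now: start)

p queue.reserve(now: start + 9 * SECONDS_PER_HOUR)   # job is still delayed
p queue.reserve(now: start + 10 * SECONDS_PER_HOUR)  # job is now ready
```

With a real beanstalkd client the same conversion would happen once, at put time, and the server would hold the job back until the delay has elapsed.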

Haematogenous answered 26/1, 2011 at 22:25 Comment(1)
This is correct. beanstalkd's put operation accepts a <delay>, which is the number of seconds to wait before the job becomes ready to be reserved. It also supports priorities. - Laevorotation
8

Just a small note: delayed_job 3.0+ supports named queues

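# Run a single method call on the 'tracking' queue
object.delay(:queue => 'tracking').method
# Enqueue a job object on the 'tracking' queue
Delayed::Job.enqueue job, :queue => 'tracking'
# Make every call to tweet_later asynchronous, on the 'tweets' queue
handle_asynchronously :tweet_later, :queue => 'tweets'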
Justicz answered 26/1, 2012 at 13:15 Comment(5)
Do these queues run in parallel? - Millner
Yes, and you can also specify the number of workers per queue: github.com/collectiveidea/delayed_job - Justicz
Given this, what reason would there be to use Resque? Just trying to suss it all out. - Oculist
@Oculist - Resque still has better reporting, and isn't in a relational database, so it can handle a lot more things in the queue at once. It just depends on how much chaos you imagine having. Three 10,000-entry queues can jam up the works pretty good in delayed_job. Three 10,000-entry queues in resque/redis are standard and easy to deal with. - Acoustics
I found delayed_job ran fine with 500k - 700k or so jobs, using Oracle 11g. I think we added one or two indexes in addition to what the default DJ generator creates. Delayed_job_web is a simple way to keep track of things. - Discomfit
