I've read a lot of material around the web about thread safety and performance in different versions of Ruby and Rails, and I think I understand those topics fairly well at this point.
What seems to be oddly missing from the discussions is how to actually deploy an asynchronous Rails app. When talking about threads and concurrency in an app, there are two things people want to optimize:
- utilizing all CPU cores with minimal RAM usage
- being able to serve new requests while previous requests are waiting on IO
Point 1 is where people get (rightly) excited about JRuby. For this question I am only trying to optimize point 2.
Say this is the only controller in my app:
    class TheController < ActionController::Base
      def fast
        render :text => "hello"
      end

      def slow
        render :text => User.count.to_s
      end
    end
fast has no IO and can serve hundreds or thousands of requests per second; slow has to send a request over the network, wait for the work to be done, then receive the answer over the network, and is therefore much slower than fast.
So an ideal deployment would allow hundreds of requests to fast to be fulfilled while a request to slow is waiting on IO.
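To make that concrete, here is a minimal sketch in plain Ruby (no Rails): the sleep is my own stand-in for the database round trip in slow, and the timings are illustrative, not measured from a real deployment.

```ruby
# One slow "request" (0.5s of simulated IO) plus three fast ones,
# each handled in its own thread. MRI releases the GVL while a
# thread sleeps or waits on IO, so the fast requests finish
# immediately instead of queuing behind the slow one.
def handle(kind)
  sleep 0.5 if kind == :slow  # stand-in for User.count's network round trip
  kind.to_s
end

start   = Time.now
results = []
mutex   = Mutex.new

threads = [:slow, :fast, :fast, :fast].map do |kind|
  Thread.new do
    body = handle(kind)
    mutex.synchronize { results << body }
  end
end
threads.each(&:join)

elapsed = Time.now - start
# Total wall time is roughly the 0.5s of the single slow request,
# not the sum of all four, because the IO waits overlap.
puts format("served %d requests in %.2fs", results.size, elapsed)
```

If each request instead had to wait its turn in a single-threaded server, the three fast requests would be stuck behind the slow one.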
What seems to be missing from the discussions around the web is which layer of the stack is responsible for enabling this concurrency. thin has a --threaded flag, which will "Call the Rack application in threads [experimental]" -- does that start a new thread for each incoming request? Or does it spin up persistent Rack app instances in threads that wait for incoming requests?
Is thin the only way, or are there others? Does the Ruby runtime matter for optimizing point 2?
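For what it's worth, here is my mental model of what "call the application in threads" could mean at the server layer: a toy thread-per-connection accept loop. This is only a sketch of the concept, not thin's actual implementation, and the /slow sleep is again a stand-in for a database call.

```ruby
require "socket"
require "net/http"

# Toy app: /slow simulates a database round trip, anything else is fast.
app = lambda do |path|
  if path == "/slow"
    sleep 0.3  # stand-in for waiting on the database
    "slow"
  else
    "hello"
  end
end

server = TCPServer.new("127.0.0.1", 0)  # port 0 = pick any free port
port   = server.addr[1]

accept_thread = Thread.new do
  loop do
    client = server.accept
    # Hand each connection to its own thread, so a slow request
    # never blocks the accept loop or other in-flight requests.
    Thread.new(client) do |c|
      request_line = c.gets.to_s
      while (line = c.gets) && line != "\r\n"; end  # drain request headers
      path = request_line.split(" ")[1] || "/"
      body = app.call(path)
      c.write "HTTP/1.1 200 OK\r\nContent-Length: #{body.bytesize}\r\n" \
              "Connection: close\r\n\r\n#{body}"
      c.close
    end
  end
end

# Start a slow request, then a fast one; the fast one completes first
# even though it arrived later.
order = []
mutex = Mutex.new
t1 = Thread.new do
  Net::HTTP.get(URI("http://127.0.0.1:#{port}/slow"))
  mutex.synchronize { order << :slow }
end
sleep 0.05
t2 = Thread.new do
  Net::HTTP.get(URI("http://127.0.0.1:#{port}/fast"))
  mutex.synchronize { order << :fast }
end
[t1, t2].each(&:join)
accept_thread.kill
server.close
puts order.inspect
```

The alternative model -- a fixed pool of persistent worker threads pulling connections off a shared queue -- would behave the same from the client's point of view, which is exactly why I'm unclear on what thin actually does.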