Is RAILS_MAX_THREADS something that Puma will set and scale during build time, or should I set it?

I know Rails 5 ships with Puma (which we're using) and will look for RAILS_MAX_THREADS as an environment variable or default to 5 threads, but I'm receiving timeout errors with the default value. I looked at my database and found its max connections is a few thousand.

It may be silly, but is this something Puma will set automatically and scale for, depending on its settings, or do I need to explicitly set this in the environment variables? If it needs to be manually set, what would be a good value for RAILS_MAX_THREADS?

I've found the following helpful, but I'm not fully grasping the scalability part:

https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server
https://devcenter.heroku.com/articles/concurrency-and-database-connections

Salzhauer answered 26/4, 2017 at 16:55 Comment(2)
What kind of a timeout error do you get and under what circumstances? Does it happen in development or in production? – Prorate
Thanks @NickShebanov, it happens in production when we have high spurts of traffic. We've since increased this value to 25, but still occasionally see it during spikes. We have allayed this by spinning up a new server as needed. I now know RAILS_MAX_THREADS doesn't automatically scale, but would setting this to ~100 be absurd? – Salzhauer

Puma actually has two parameters: the number of threads and the number of workers. If we slightly change the default puma.rb, it looks like this:

# WORKERS_NUM is not a default env variable name
workers Integer(ENV['WORKERS_NUM'] || 1)
max_threads_count = Integer(ENV['RAILS_MAX_THREADS'] || 1)
min_threads_count = max_threads_count
threads min_threads_count, max_threads_count

The number of workers is the number of separate processes that Puma spawns for you. Usually, it is a good idea to set it equal to the number of processor cores you have on your server. You could spawn more of them to allow more requests to be processed simultaneously, but workers create additional memory overhead – each worker spins up a copy of your Rails app, so usually you would use threads to achieve higher throughput.
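
As a rough sketch of the "one worker per core" rule of thumb, the worker count could fall back to the machine's core count when the variable is unset. `WORKERS_NUM` is the non-standard name used in the snippet above, and `default_workers` is a hypothetical helper for illustration, not part of Puma.

```ruby
require 'etc'

# Hypothetical helper: pick a worker count from an ENV-like hash,
# falling back to the number of logical CPU cores Ruby reports.
def default_workers(env = ENV)
  Integer(env['WORKERS_NUM'] || Etc.nprocessors)
end

puts default_workers({ 'WORKERS_NUM' => '3' }) # explicit value wins
puts default_workers({})                       # falls back to the core count
```

In puma.rb this would be used as `workers default_workers`. On containers or shared hosts the reported core count may not match what you are actually allotted, so treat it as a starting point.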

RAILS_MAX_THREADS is a way to set the number of threads each of your workers will use under the hood. In the example above, the min_threads_count is equal to the max_threads_count, so the number of threads is constant. If you set them to be different, it is going to scale from the min to the max, but I haven't seen it in the wild.
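
A minimal sketch of how a separate minimum could be wired in, assuming a `RAILS_MIN_THREADS` variable (an assumed name here) and the default of 5 mentioned in the question. `thread_bounds` is a hypothetical helper, not a Puma API.

```ruby
# Hypothetical helper: derive [min, max] thread counts from an ENV-like hash.
# RAILS_MIN_THREADS is an assumed variable name; when it is absent the pool
# is fixed-size (min == max), as in the snippet above.
def thread_bounds(env = ENV)
  max = Integer(env['RAILS_MAX_THREADS'] || 5)
  min = Integer(env['RAILS_MIN_THREADS'] || max)
  raise ArgumentError, 'min threads must not exceed max threads' if min > max

  [min, max]
end

p thread_bounds({ 'RAILS_MAX_THREADS' => '16' })
p thread_bounds({ 'RAILS_MIN_THREADS' => '2', 'RAILS_MAX_THREADS' => '16' })
```

In puma.rb the result could be passed along as `threads(*thread_bounds)`. With min below max, Puma starts with the smaller number of threads per worker and grows toward the maximum under load.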

There are several reasons to limit the number of threads:

  1. If you use MRI, your threads are limited by the GIL, so they don't run Ruby code in parallel. MRI imitates parallel execution by context switching. A large number of threads allows many more simultaneous connections, but the average response time will increase because of the GIL.
  2. Platform limits: e.g. Heroku has thread number limits https://devcenter.heroku.com/articles/dynos#process-thread-limits, while Linux limits only the number of processes (see "Maximum number of threads per process in Linux?").
  3. When the code isn't thread-safe, there is a chance that using more than one thread will result in unpredictable problems. That's actually my case, so I didn't experiment much with the number of threads.
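
To illustrate the thread-safety point, here is a small sketch of shared state guarded by a `Mutex`. `SafeCounter` is a made-up class for illustration; real thread-safety problems in a Rails app usually hide in class-level variables, memoized globals, or non-thread-safe gems.

```ruby
# A shared counter guarded by a Mutex so concurrent increments are not lost.
class SafeCounter
  attr_reader :value

  def initialize
    @value = 0
    @lock = Mutex.new
  end

  def increment
    # Only one thread at a time may run the read-modify-write below.
    @lock.synchronize { @value += 1 }
  end
end

counter = SafeCounter.new
threads = Array.new(8) { Thread.new { 1_000.times { counter.increment } } }
threads.each(&:join)
puts counter.value # prints 8000
```

Without the lock, `@value += 1` is a read followed by a write, and a thread switch between the two can drop updates.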

There was also an argument that slow IO (e.g. calls to external services, or generating large files on the fly) blocks the Ruby process and doesn't allow context switching, but it turns out not to be true: http://yehudakatz.com/2010/08/14/threads-in-ruby-enough-already/. Still, optimizing your architecture to do as much work in the background as possible is always a good idea.
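
A quick way to see that blocking calls don't stall other threads is to time a few of them running concurrently. This is only a sketch: `sleep` stands in for a slow external call, and `run_blocking_calls` is a hypothetical helper.

```ruby
# Simulate several blocking calls (e.g. to an external service) with sleep,
# which lets other Ruby threads run while it waits, as blocking IO does.
def run_blocking_calls(count, seconds)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  Array.new(count) { Thread.new { sleep(seconds) } }.each(&:join)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

elapsed = run_blocking_calls(4, 0.2)
puts format('4 x 0.2s blocking calls took %.2fs', elapsed)
```

The total should land close to 0.2 seconds rather than 0.8, because the waits overlap. CPU-bound work would not overlap this way on MRI.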

This answer will help you to find out a perfect combination of the number of threads vs the number of workers given your hardware.

This shows how the benchmarking could be done to find the exact numbers.

To sum up: WORKERS_NUM multiplied by RAILS_MAX_THREADS gives you a maximum number of simultaneous connections that can be processed by Puma. If that number is too low, your users will see timeouts during load spikes. To get the best performance on MRI, set WORKERS_NUM to the number of cores and find the optimal RAILS_MAX_THREADS from the average response time measured in performance tests.
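
The arithmetic can be sketched as below. `puma_capacity` is a hypothetical helper; it also separates request concurrency from the per-process database pool size (the `pool` setting in database.yml), a distinction raised in the comments.

```ruby
# Hypothetical helper summarising the arithmetic for one server.
def puma_capacity(workers:, threads_per_worker:)
  {
    # Requests Puma can work on at the same moment across all workers.
    max_concurrent_requests: workers * threads_per_worker,
    # Each worker is its own process with its own pool, so the pool only
    # needs to cover that worker's threads.
    db_pool_per_worker: threads_per_worker,
    # Upper bound on connections this server may open to the database.
    max_db_connections: workers * threads_per_worker
  }
end

p puma_capacity(workers: 2, threads_per_worker: 5)
```

With 2 workers and 5 threads each, that is 10 concurrent requests, a pool of 5 per worker, and at most 10 database connections from this server. Multiply the last figure by the number of servers when checking against the database's connection limit.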

Prorate answered 18/5, 2017 at 13:33 Comment(6)
You are, without a doubt, a godsend – Salzhauer
@Salzhauer you're welcome, but keep in mind this is a very biased answer; if you dig deeper, some details can be different from what I said. I just tried to fit it to your particular case. Also, I've made a couple of mistakes (fixed): 1. context switches increase average response time, not decrease it, of course; 2. blocking IO still allows the thread to be switched. – Prorate
I think each worker gets a new connection pool, so if you have 2 workers each with 5 threads, then you should only need a connection pool of 5 (not 10). Someone correct me if I'm wrong? – Heloise
@Heloise yes, that's correct. Is anything wrong with the text above? – Prorate
I just read your last paragraph that says "workers_num multiplied by rails_max_threads gives you a maximum number of simultaneous connections that can be processed by puma." While that is true, I can see how one might look at that as the way to calculate your connection pool size, when in actuality you should just use your "RAILS_MAX_THREADS" number as the pool size. – Heloise
By "your users will see timeouts during load spikes", what exactly is responding with the timeout? Is it a normal 408 or 504? – Copro
