Would it be possible to have multiple database connection pools in rails to switch between?
Asked Answered
G

3

13

A little background

I have been using the Apartment gem for running a multi-tenancy app for years. Now recently the need to scale the database out into separate hosts has arrived, the db server simply can't keep up any more (both reads and writes are getting too too much) - and yes, I scaled the hardware to the max (dedicated hardware, 64 cores, 12 Nvm-e drives in raid 10, 384Gb ram etc.).

I was considering doing this per-tenant (1 tenant = 1 database connection config / pool) as that would be a "simple" and efficient way to get up to number-of-tenants-times more capacity without doing loads of application code changes.

Now, I am running rails 4.2 atm., soon upgrading to 5.2. I can see that rails 6 adds support for a per-model connection definitions, however that is not really what I need, as I have a completely mirrored database schema for each of my 20 tenants. Typically I switch "database" per request (in middleware) or per background job (sidekiq middleware), however this is currently trivial and handled ny the Apartment gem, as it just sets the search_path in Postgresql and does not really change the actual connection. When switching to a per-tenant hosting strategy I will need to switch the entire connection per request.

Questions:

  1. I understand that I could do an ActiveRecord::Base.establish_connection(config) per request / background job - however, as I also understand, that triggers an entirely new database connection handshake to be made and a new db pool to spawn in rails - right? I guess that would be a performance suicide to make that kind of overhead on every single request to my application.
  2. I am therefore wondering if anyone can see the option with rails of e.g. pre-establishing multiple (total of 20) database connections/pools from the beginning (e.g. on boot of the application), and then just switch between those pools per request? So that he db connections are already made and ready to be used.
  3. Is all this just a poor poor idea, and should I instead look for a different approach? E.g. 1 app instance = one specific connection to one specific tenant. Or something else.
Groundsill answered 26/2, 2020 at 19:28 Comment(3)
guides.rubyonrails.org/active_record_multiple_databases.html I think it might help youHouppelande
You might be interested in this PR in the Rails' GitHub repository that recently added exactly the feature you need to the current Rails master branch. Would running Rails Egde be an option or back-prting that feature to your current Rails version?Template
@Template ActiveRecord::Base.connected_to(shard: :shard_one) do ... end means that the pool will be (re-)used, instead of creating a whole new connection everytime ?Endue
A
8

As I understand, there are 4 pattern for multi-tenancy app:

1. Dedicated model/Multiple Production Environments

Each instance or database instance entirely host different tenant application and nothing is shared among tenants.

This is 1 instance app and 1 database for 1 tenant. The development would be easy as if you serve 1 tenant only. But will be nightmare for devops if you have, say, 100 tenants.

2. Physical Segregation of Tenants

1 instance app for all tenant but 1 database for 1 tenant. This is what you are searching for. You can use ActiveRecord::Base.establish_connection(config), or using gems, or update to Rails 6 as other suggests. See the answer for (2) below.

3. Isolated schema model/Logical Segregations

In an Isolated Schema, the tenant tables or database components are group under a logical schema or name-space and separated from other tenant schemas, however the schema are hosted in the same database instance.

1 instance app and 1 database for all tenant, like you do with apartment gem.

4. Partially Isolated Component

In this model, components that have common functionalities are shared among tenants while components with unique or unrelated functions are isolated. At the data layer, common data such as data that identify tenants are grouped or kept in single table while tenant specific data are isolated at table or instance layer.


As for (1), ActiveRecord::Base.establish_connection(config) not handshaking to db per request if you use it correctly. You can check here and read all the comment here.

As for (2), If you don't want to use establish_connection, you can use gem multiverse (it works for rails 4.2), or other gems. Or, as other suggest, you can update to Rails 6.

Edit: Multiverse gem is using establish_connection. It will append the database.yml, and create a base class so that each subclass shares the same connection/pool. Basically, it reducing our effort to use establish_connection.

As for (3), the answer:

If you don't have so many tenants, and your application is pretty complex, I suggest you use Dedicated Model pattern. So you go for 1 app instance = one specific connection to one specific tenant. You don't have to make your apps more complex by adding multiple database connections.

But if you have many tenants, I suggest you use Physical Segregation of Tenants or Partially Isolated Component depends on your business process.

Either way, you have to update/rewrite your application to comply with the new architecture.

Autocade answered 3/3, 2020 at 5:6 Comment(7)
Hi thanks for the answer. I will need a little time to actually test out the suggestion before I can reward one of the answers the bounty if they are good solutions.Groundsill
I have a couple of questions in regards to 1 and 2. 1: I am not sure I understand your references. Is what your saying, that I can call .establish_connection(config) without doing db handshake / recreating the db poll? In that case, I am not sure how the two links explains that? 2: For multiverse, isn't that a per-model database switching rather than an entire db switch for the entire app? I feel their documentation is pretty vagueGroundsill
I think I have misunderstanding. Do you mind to elaborate these sentences? I understand that I could do an ActiveRecord::Base.establish_connection(config) per request / background job - however, as I also understand, that triggers an entirely new database connection handshake to be made and a new db pool to spawn in rails It suggest that one request create one db pool?Autocade
I mean: (1) I am worried about performance/network overhead when having to call ActiveRecord::Base.establish_connection(config) on every request, just to switch between the different databases/countriesGroundsill
You don't have to worry about the overhead. Now, if you use single DB, you have one connection pool (you can check the link about connection in the answer of (1) above). If you use establish_connection in the model like this one: class SecondTenantUser < ActiveRecord::Base; establish_connection(DB_SECOND_TENANT); end, and say you have 5 model, you create 5 connection pool to the DB_SECOND_TENANT. And each pool is treated equally. So, you don't create a pool per request, but per establish_connection.Autocade
That way, you have to create one pool for one DB, so you make a class like this: module SecondTenant; class Base < ActiveRecord::Base; self.abstract_class = true; establish_connection(DB_SECOND_TENANT); end; end. And your model: class SecondUser < SecondTenant::Base; end. And you have one pool for one DB, you dont have to worry about the overhead. I am sorry if that is not clear. My english is not that good, and explaining this concept in comments is pretty hard.Autocade
Edit: I made some serious 'typo'. Not one pool for one DB. You set the maximum pool per db. Say you have 2 db, you set max 5 pools per db, then you can have 10 pools at a time: 5 pools for DB A, 5 pools for DB B. But you can't have 6 pool for DB A, and 4 pool for DB B.Autocade
T
4

Just a couple of days ago horizontal sharding was added to Ruby on Rails' master branch on GitHub. Currently, this feature is not officially released but depending on your application's Rails version you might want to consider using Rails master by adding this to your Gemfile:

gem "rails", github: "rails/rails", branch: "master"

With this new feature, you can take advantage of Rails' database connection pool and switch the database based on conditions.

I haven't used this new feature, but it seems pretty straight-forward:

# in your config/database.yml
production:
  primary:
    database: my_database
    # other config: user, password, etc
  primary_tenant_1:
    database: tenant_1_database
    # other config: user, password, etc

# in your controller for example when updating a tenant
ActiveRecord::Base.connected_to(shard: "primary_tenant_#{tenant.database_shard_number}") do
  tenant.save
end

You didn't add much detail about how you determine the tenant number or how authorization is done in your application. But I would try to determine the tenant number as soon as possible in the application_controller in an around_action. Something like this might be a starting point:

around_filter :determine_database_connection

private

def determine_database_connection
  # assuming you have a method to determine the current_tenant and that tenant
  # has a method that returns the number of the shard to use or even the 
  # full shard identifier
  shard = current_tenant.database_shard # returns for example `:primary_tenant_1` 

  ActiveRecord::Base.connected_to(shard: shard) do
    yield
  end
end
Template answered 4/3, 2020 at 16:40 Comment(5)
Would that makes the same sense to switch back to the default connection in that case too ? github.com/influitive/apartment#middleware-considerationsEndue
Once you leave the ActiveRecord::Base.connected_to ... do block it would use the default connection again.Template
@Template i was reading ab this gem, Is't not only for rails6 ?Putto
@Putto It is included in the current Rails master branch.Template
Hi thanks for the answer. I will need a little time to actually test out the suggestion before I can reward one of the answers the bounty if they are good solutions.Groundsill
D
3

From what I understand, (2) should be possible with manual connection switching in Rails 6.

Duggan answered 27/2, 2020 at 16:52 Comment(1)
Thanks however this seems pretty far from my use case. It would imply rewriting the entire app for using this procedure everywhere.Groundsill

© 2022 - 2024 — McMap. All rights reserved.