How to structure model that is broken into database as 36 tables?
Asked Answered
B

3

11

I have over 1 billion domain name records which, instead of putting them all in a single table, I decided to break them up into 36 tables (same db structure for each table).

There is a table based on the first character of the domain name (ex tables: domains_a... domains_z).

How can I create a single Domain model in rails that automatically switches between these tables seamlessly based on the character specified?

Biggs answered 2/7, 2017 at 6:58 Comment(6)
How about using ymal?Stanwin
Why on earth did you break up the data? This is a mistake unless you have an excellent reason. A billion records is not so big for e.g. postresql. It allows 32 TB tables. So as long as your records are 30Kb or less you're good to go. MySQL allows much, much bigger tables.Fistula
Please specify what database you are using -- any solution is going to heavily involve DB-specific features.Lindsy
@Fistula Because mass inserts gets painfully slow due to indexes. Same problem with certain reads. Partitioning the table helped reduce the latency a lot.Biggs
@AdamLassek PostgresBiggs
@Biggs in that case you should definitely be doing the partitioning in the DB layer.Lindsy
J
2

You can't: you have to write your own logic to deal with that. Rails would need to know your business logic and analyze the SQL query to find out which table to pick and can't do that by default, you need to write that code by yourself.

However there is a trick that will make it extremely easier for you. What about handling this on the database level? I've checked and all major databases support updatable views.

So, create a new view, name it domains and make sure it creates a union of all your domain tables (from a to z), then createa model:

class Domain
  self.table_name = "your_view_name"
end

This would do the trick for read side. Now based on the database you are using, you might be able to solve also the write problem in this way (with triggers and similar DB functionalities), otherwise, you need to write your own code for the write part, which will probably need to run raw queries.

As an alternative, you can deal with this at Ruby level by creating all the models (DomainA, DomainB, etc.) manually or with a generator and then creating a common class that acts as an interface. Or, you can create those models with some metaprogramming and again have a common class which work as an interface.

Jehad answered 4/7, 2017 at 22:46 Comment(0)
L
3

Generally, this sort of table partitioning is handled at the database level. You should specify what database you are using, because that will be extremely relevant here.

For instance, PostgreSQL has basic table partition support. You would point the Rails model at the master table, and the partitioning would be transparent to the Ruby layer.

Lindsy answered 4/7, 2017 at 22:56 Comment(1)
Is there is a way in Postgres table partitioning to automatically move rows to the appropriate partition if the range value is updated?Biggs
J
2

You can't: you have to write your own logic to deal with that. Rails would need to know your business logic and analyze the SQL query to find out which table to pick and can't do that by default, you need to write that code by yourself.

However there is a trick that will make it extremely easier for you. What about handling this on the database level? I've checked and all major databases support updatable views.

So, create a new view, name it domains and make sure it creates a union of all your domain tables (from a to z), then createa model:

class Domain
  self.table_name = "your_view_name"
end

This would do the trick for read side. Now based on the database you are using, you might be able to solve also the write problem in this way (with triggers and similar DB functionalities), otherwise, you need to write your own code for the write part, which will probably need to run raw queries.

As an alternative, you can deal with this at Ruby level by creating all the models (DomainA, DomainB, etc.) manually or with a generator and then creating a common class that acts as an interface. Or, you can create those models with some metaprogramming and again have a common class which work as an interface.

Jehad answered 4/7, 2017 at 22:46 Comment(0)
T
0

Table partitioning is the way to go. Do not create all those identical table structure.

What table partitioning will give you

  1. You will have single table which is logically partitioned by the database.
  2. In your applications view, you are querying a single table just like any other database table.
  3. In the database perspective, it stores data by partition which is defined by a partition type and partition logic. In mysql, you can refer to https://dev.mysql.com/doc/refman/5.7/en/partitioning-types.html
  4. Performance benefits if defined correctly. It will avoid scanning the 1 billion rows but instead scan the related partition when doing queries.

Table partition can be very database specific.

A simple example from mysql.

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT NOT NULL,
    store_id INT NOT NULL
)
PARTITION BY RANGE (store_id) (
    PARTITION p0 VALUES LESS THAN (6),
    PARTITION p1 VALUES LESS THAN (11),
    PARTITION p2 VALUES LESS THAN (16),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);

The employee is stored to a specific partition to either p0, p1, p2 or p3 depending on which store (store_id) the employee belongs.

You are still accessing it through a single table but the data is stored logically by partition depending on the store_id.

SELECT * FROM employee WHERE store_id = 10

The database will simply look at partition p1 and does not scan other partition (p0, p2 and p3) because simply that query will never find data in those partition.

Toratorah answered 11/7, 2017 at 16:4 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.