How to migrate an existing Postgres Table to partitioned table as transparently as possible?
Asked Answered
F

2

47

I have an existing table in a postgres-DB. For the sake of demonstration, this is how it looks like:

create table myTable(
    forDate date not null,
    key2 int not null,
    value int not null,
    primary key (forDate, key2)
);

insert into myTable (forDate, key2, value) values
    ('2000-01-01', 1, 1),
    ('2000-01-01', 2, 1),
    ('2000-01-15', 1, 3),
    ('2000-03-02', 1, 19),
    ('2000-03-30', 15, 8),
    ('2011-12-15', 1, 11);

However in contrast to these few values, myTable is actually HUGE and it is growing continuously. I am generating various reports from this table, but currently 98% of my reports work with a single month and the remaining queries work with an even shorter timeframe. Oftentimes my queries cause Postgres to do table scans over this huge table and I am looking for ways to reduce the problem. Table partitioning seems to fit my problem perfectly. I could just partition my table into months. But how do I turn my existing table into a partitioned table? The manual explicitly states:

It is not possible to turn a regular table into a partitioned table or vice versa

So I need to develop my own migration script, which will analyze the current table and migrate it. The needs are as follows:

  • At design time the time frame which myTable covers is unknown.
  • Each partition should cover one month from the first day of that month to the last day of that month.
  • The table will grow indefinitely, so I have no sane "stop value" for how many tables to generate
  • The result should be as transparent as possible, meaning that I want to touch as little as possible of my existing code. In best case this feels like a normal table which I can insert to and select from without any specials.
  • A database downtime for migration is acceptable
  • Getting along with pure Postgres without any plugins or other things that need to be installed on the server is highly preferred.
  • Database is PostgreSQL 10, upgrading to a newer version will happen sooner or later anyway, so this is an option if it helps

How can I migrate my table to be partitioned?

Footway answered 3/12, 2018 at 18:58 Comment(0)
F
69

In Postgres 10 "Declarative Partitioning" was introduced, which can relieve you of a good deal of work such as generating triggers or rules with huge if/else statements redirecting to the correct table. Postgres can do this automatically now. Let's start with the migration:

  1. Rename the old table and create a new partitioned table

    alter table myTable rename to myTable_old;
    
    create table myTable_master(
        forDate date not null,
        key2 int not null,
        value int not null
    ) partition by range (forDate);
    

This should hardly require any explanation. The old table is renamed (after data migration we'll delete it) and we get a master table for our partition which is basically the same as our original table, but without indexes)

  1. Create a function that can generate new partitions as we need them:

    create function createPartitionIfNotExists(forDate date) returns void
    as $body$
    declare monthStart date := date_trunc('month', forDate);
        declare monthEndExclusive date := monthStart + interval '1 month';
        -- We infer the name of the table from the date that it should contain
        -- E.g. a date in June 2005 should be int the table mytable_200506:
        declare tableName text := 'mytable_' || to_char(forDate, 'YYYYmm');
    begin
        -- Check if the table we need for the supplied date exists.
        -- If it does not exist...:
        if to_regclass(tableName) is null then
            -- Generate a new table that acts as a partition for mytable:
            execute format('create table %I partition of myTable_master for values from (%L) to (%L)', tableName, monthStart, monthEndExclusive);
            -- Unfortunatelly Postgres forces us to define index for each table individually:
            execute format('create unique index on %I (forDate, key2)', tableName);
        end if;
    end;
    $body$ language plpgsql;
    

This will come in handy later.

  1. Create a view that basically just delegates to our master table:

    create or replace view myTable as select * from myTable_master;
    
  2. Create rule so that when we insert into the rule, we'll not just update out partitioned table, but also create a new partition if needed:

    create or replace rule autoCall_createPartitionIfNotExists as on insert
        to myTable
        do instead (
            select createPartitionIfNotExists(NEW.forDate);
            insert into myTable_master (forDate, key2, value) values (NEW.forDate, NEW.key2, NEW.value)
        );
    

Of course, if you also need update and delete, you also need a rule for those which should be straight forward.

  1. Actually migrate the old table:

    -- Finally copy the data to our new partitioned table
    insert into myTable (forDate, key2, value) select * from myTable_old;
    
    -- And get rid of the old table
    drop table myTable_old;
    

Now migration of the table is complete without that there was any need to know how many partitions are needed and also the view myTable will be absolutely transparent. You can simple insert and select from that table as before, but you might get the performance benefit from partitioning.

Note that the view is only needed, because a partitioned table cannot have row triggers. If you can get along with calling createPartitionIfNotExists manually whenever needed from your code, you do not need the view and all it's rules. In this case you need to add the partitions als manually during migration:

do
$$
declare rec record;
begin
    -- Loop through all months that exist so far...
    for rec in select distinct date_trunc('month', forDate)::date yearmonth from myTable_old loop
        -- ... and create a partition for them
        perform createPartitionIfNotExists(rec.yearmonth);
    end loop;
end
$$;
Footway answered 3/12, 2018 at 18:58 Comment(9)
This seems a very good answer, but what about the case that the old table is referenced by other tables through foreign key relationship? I guess that fk's must be updated to point to the new table - is this enough?Medium
@babis21: Yes, of course. However FK support is limited: Postgre 10: foreign keys referencing partitioned tables are not supported, nor are foreign key references from a partitioned table to some other table. For Postgre 11 this improved somewhat: foreign keys referencing partitioned tables are not supported. (Foreign key references from a partitioned table to some other table are supported.)Footway
Thanks for the informative reply, I was trying to implement your solution and I keep on getting an error where it says on the insert partition already exists. I'm using PostgreSQL 13.3, not sure if that makes a differenceAla
@bnns: From your comment is very hard to find out what went wrong in your case. I think you need more text for that. I suggest that you create a minimal reproducible example and start your own question.Footway
Thanks, correct, in my case it was getting a naming collision because the old table still existed when migrating in my case I hadn't removed previous partitions which contained the same name. so a silly mistake on my part. This answer helped me a lot thanks for that.Ala
Why create a view called myTable instead of just calling the new partitioned table myTable? Not only is there no need, your insert in step 3 will fail, since you can't insert records into a view...Carmancarmarthen
@Cerin: Please read the last paragraph in my answer: "Note that the view is only needed, because a partitioned table cannot have row triggers[..]". Also the insert won't fail, because it is handled by the previously created row triggers.Footway
@Footway Interesting. Thanks for the clarification.Carmancarmarthen
If you need high performance on inserts, don’t use rules that create partitions on demand (they will slow down inserts significantly) but create the partitions „manually“. E.g. create a cron job that runs once per month to create the partitions for next month (if you schedule that job for the 1st of each month, you’ll have a whole month time to fix if anything is broken in the process creating the partitions)Shively
P
0

A suggestion could be, use a view for you main table access, do the steps mentioned above, where you create a new partition table. once finished, point the view to the new partitioned table, and then do the migration, finally deprecate the old table.

Phosphorate answered 8/7, 2020 at 4:46 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.