Postgresql - how to run a query on multiple tables with same schema
Asked Answered
G

2

9

I have a postgres database that has several tables (a few hundreds). A subset Foo of the tables in the database have the same schema.

Ideally, I would like to create a stored procedure which can run a query against a single table, or against all tables in subset Foo.

Pseudocode:

CREATE TABLE tbl_a (id INTEGER, name VARCHAR(32), weight double, age INTEGER);
CREATE TABLE tbl_b (id INTEGER, name VARCHAR(32), weight double, age INTEGER);
CREATE TABLE tbl_c (id INTEGER, name VARCHAR(32), weight double, age INTEGER);
CREATE TABLE tbl_d (id INTEGER, name VARCHAR(32), weight double, age INTEGER);

CREATE TYPE person_info AS (id INTEGER, name VARCHAR(32), weight double, age INTEGER);

CREATE FUNCTION generic_func(ARRAY one_or_more_table_names)
    RETURNS person_info 
    -- Run query on table or all specified tables
    AS $$  $$
    LANGUAGE SQL; 

How could I implement this requirement in Postgresql 9.x ?

Grapeshot answered 5/1, 2017 at 9:17 Comment(1)
Unrelated, but: "Postgres 9.x" covers 7 different major version (with two of them no longer supported). You need to be more specific with the Postgres version.Butlery
R
11

You should have a look at table inheritance in PostgreSQL, they allow exactly what you speak about.

For example, you could create a table parent_tbl:

CREATE TABLE parent_tbl (id INTEGER, name VARCHAR(32), weight numeric, age INTEGER);

Then link your tables to this parent table:

ALTER TABLE tbl_a INHERIT parent_tbl;
ALTER TABLE tbl_b INHERIT parent_tbl;
ALTER TABLE tbl_c INHERIT parent_tbl;
ALTER TABLE tbl_d INHERIT parent_tbl;

Then a SELECT query over parent_tbl will query all of tbl_x tables, while a query on tbl_x will query only this particular table.

INSERT INTO tbl_a VALUES (1, 'coucou', 42, 42);

SELECT * FROM tbl_a;
 id |  name  | weight | age 
----+--------+--------+-----
  1 | coucou |     42 |  42
(1 row)

SELECT * FROM parent_tbl;
 id |  name  | weight | age 
----+--------+--------+-----
  1 | coucou |     42 |  42
(1 row)

SELECT * FROM tbl_b;
 id |  name  | weight | age 
----+--------+--------+-----
(0 rows)

It is also possible to filter data from given children tables. For example, if you are interested in data coming from tables tbl_a and tbl_b, you can do

select id, name, weight, age
from parent_tbl
left join pg_class on oid = parent_tbl.tableoid
where relname in ('tbl_a', 'tbl_b');

EDIT : I put numeric for weight instead of double as this type is not supported on my server.

Raeannraeburn answered 5/1, 2017 at 9:34 Comment(3)
+1 Neat solution for the use case where I need to run queries on all the tables, but in the instances where I need to run on a specific table, I need to repeat the same code for some 300+ tables! I have since found out that passing the name of a table to a query is bad form. Is there a way where I could perhaps introduce another table and use JOINS to restrict the query run on the parent table, to only the data in say, table_a?Grapeshot
Sure! You can use the tableoid column. I'll update my answer, just leave me 5 minutesRaeannraeburn
Thanks!, I bow humbly before your Postgresqlfu ! ;)Grapeshot
C
6

To create select query dynamically using items(table name) in an array you can use following select statement

SELECT string_agg(q, ' union all ')
FROM  (
    SELECT 'select * from ' || unnest(array ['tble_a','tble_b']) AS q
    ) t

Result:

string_agg                                          
--------------------------------------------------- 
select * from tble_a union all select * from tble_b 

You can create the function that returns table with columns

 id INTEGER
,name VARCHAR(32)
,weight numeric
,age INTEGER

P.S: I am avoiding TYPE person_info

function:

CREATE
    OR REPLACE FUNCTION generic_func (tbl varchar [])
RETURNS TABLE (         -- To store the output
        id INTEGER
        ,name VARCHAR(32)
        ,weight numeric
        ,age INTEGER
        ) AS $BODY$

DECLARE qry text;

BEGIN
    SELECT string_agg(q, ' union all ')  --To create select query dynamically
    INTO qry
    FROM (
        SELECT 'select * from ' || unnest(tbl) AS q
        ) t;

    RAISE NOTICE 'qry %',qry; --optional

    RETURN query --Executes the query to the defined table

    EXECUTE qry;
END;$BODY$

LANGUAGE plpgsql VOLATILE

Usage:

select * from generic_func(array['tbl_a','tbl_b','tbl_c','tbl_d'])

Result:

id name weight age 
-- ---- ------ --- 
2  ABC  11     112 
2  CBC  11     112 
2  BBC  11     112 
2  DBC  11     112 

and

select * from generic_func(array['tbl_a'])

Result:
id name weight age 
-- ---- ------ --- 
2  ABC  11     112 
Conjectural answered 5/1, 2017 at 10:28 Comment(1)
+1 I like this as it allows me to specify the name of the table. However, I have since found that this is considered an Anti-Pattern (#10706116). I'm therefore currently, leaning more in favour of Fabian's answer - which rather elegantly, satisfies the requirement of "multi-table queries" - but seems unwieldy (requiring each table to have it's own "version" of a query, where the tablename is hardcoded in the query/SP.Grapeshot

© 2022 - 2024 — McMap. All rights reserved.