Dynamically generate columns in PostgreSQL
Asked Answered
P

1

8

I have seen that there are quit a few similar questions like this one, but I havent understood how to code it myself. Please have in mind that I am just a beginner in this field.

Basically I want to pivot the table like this:

zoom |    day     | point         zoom | 2015-10-01 |  2015-10-02 | ......
------+-----------+-------  ---> ------+------------+-------------+
   1 | 2015-10-01 |   201            1 |    201     |     685     |
   2 | 2015-10-01 |    43            2 |     43     |     346     | 
   3 | 2015-10-01 |    80            3 |     80     |     534     | 
   4 | 2015-10-01 |   324            4 |    324     |     786     | 
   5 | 2015-10-01 |    25            5 |     25     |     685     |
   1 | 2015-10-02 |   685 
   2 | 2015-10-02 |   346 
   3 | 2015-10-02 |   534 
   4 | 2015-10-02 |   555 
   5 | 2015-10-02 |   786
   :
   :
   :

Time can vary.

Results on left I get with:

SELECT 
zoom,
to_char(date_trunc('day', time), 'YYYY-MM-DD') AS day,
count(*) as point
FROM province
WHERE time >= '2015-05-01' AND time < '2015-06-01'
GROUP BY to_char(date_trunc('day', time), 'YYYY-MM-DD'), zoom;

I have read that there are some issues if I use count and also that it would be better if I use CASE and GROUP BY, however I have no idea how to CASE this.

Crosstab itself doesnt support dynamic creation of column names, but that can be achieved with crosstab_hash, if I understood it correctly.

This might be probably nice solution: http://okbob.blogspot.ca/2008/08/using-cursors-for-generating-cross.html however I am stucked with it trying to program it myself.

I have to use this kind of pivoting quite often, so I would appriciate any kind of help and additional explanation behind it.

Edit1

I am trying to figure out how crosstab works with dates, currently without returning dynamic names of columns. Later on I will explain why. It is realted to the main question. For this example I am using only period of 2 dates.

Based on @Erwin Brandstetter answer:

SELECT * FROM crosstab(
       'SELECT zoom, day, point
        FROM   province
        ORDER  BY 1, 2'
      , $$VALUES ('2015-10-01'::date), ('2015-10-02')$$)
AS ct (zoom text, day1 int, day2 int);

returned results are:

zoom |    day1    |    day2     | 
-----+------------+-------------+
   1 |    201     |     685     |
   2 |     43     |     346     | 
   3 |     80     |     534     | 
   4 |    324     |     786     | 

I am trying to get this

zoom | 2015-10-01 |  2015-10-02 | 
-----+------------+-------------+
   1 |    201     |     685     |
   2 |     43     |     346     | 
   3 |     80     |     534     | 
   4 |    324     |     786     | 

but my query doesnt work:

SELECT *
FROM crosstab(
      'SELECT *
       FROM province
       ORDER  BY 1,2')
AS ct (zoom text, "2015-10-01" date, "2015-10-02" date);

ERROR:  return and sql tuple descriptions are incompatible

Edit1, Q1. Why does this doesnt work and how can I return results like that?

I have read links that @Erwin Brandstetter provided me, especially this one: Execute a dynamic crosstab query. I have copied/pasted his function:

CREATE OR REPLACE FUNCTION pivottab(_tbl regclass, 
                                    _row text, _cat text, 
                                    _expr text,
                                    _type regtype)  
RETURNS text AS
$func$
DECLARE
   _cat_list text;
   _col_list text;
BEGIN
-- generate categories for xtab param and col definition list    
EXECUTE format(
 $$SELECT string_agg(quote_literal(x.cat), '), (')
        , string_agg(quote_ident  (x.cat), %L)
   FROM  (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
 , ' ' || _type || ', ', _cat, _tbl)
INTO  _cat_list, _col_list;

-- generate query string
RETURN format(
'SELECT * FROM crosstab(
   $q$SELECT %I, %I, %s
      FROM   %I
      GROUP  BY 1, 2
      ORDER  BY 1, 2$q$
 , $c$VALUES (%5$s)$c$
   ) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr, _tbl, _cat_list, _col_list, _type
);

END
$func$ LANGUAGE plpgsql;

and call it with query

SELECT pivottab('province','zoom','day','point','date');

Function returned me:

                         pivottab                         
----------------------------------------------------------
 SELECT * FROM crosstab(                                 +
    $q$SELECT zoom, day, point                           +
       FROM   province                                   +
       GROUP  BY 1, 2                                    +
       ORDER  BY 1, 2$q$                                 +
  , $c$VALUES ('2015-10-01'), ('2015-10-02')$c$          +
    ) ct(zoom text, "2015-10-01" date, "2015-10-02" date)
(1 row)

So when I edited the query and added ; (it would be nice that ; is already there) I got:

ERROR:  column "province.point" must appear in the GROUP BY clause or be used in an aggregate function

Edit1, Q2. Any ideas how to solove this?

Edit1, Q3. I guess next question will be how to execute function automaticlly, which is also mentioned on the same link, but got stucked on previous steps.

Postrider answered 29/9, 2016 at 20:47 Comment(3)
Build up your dynamic SQL with looping over a record and appending to the string that hold up your query. This can be achieved using plpgsql. Then return a table from the function and you are good to go :-)Polygenesis
What's supposed to be dynamic? Number of result columns? Names or data types of result columns?Kinata
@ErwinBrandstetter Sorry, I just added dots to the table. So I need to add dynamically names of the columns...it would be like 2015-10-01, 2015-10-02, 2015-10-03, 2015-10-04 ... depending on a WHERE part in first query. That should be generated without listing them manually, because sometimes I might have for example interval of 75 daysPostrider
K
4

The basic crosstab query for your example is simple:

SELECT * FROM crosstab(
       'SELECT zoom, day, point
        FROM   province
        ORDER  BY 1, 2'

     , $$VALUES ('2015-10-01'::date), ('2015-10-02')$$)
AS ct (zoom text, day1 int, day2 int);

But not with dynamic column names or a dynamic number of columns. As a compromise, you can have a fixed number of columns and only fill the leading ones. Basics:

Dynamic?

crosstab_hash is not going to help you with dynamic column names. It's for repeated use without typing a column definition list, but not for dynamic column names. Examples:

For truly dynamic column names, you need two round trips to the server. Whether you retrieve the column names with a first query to build a second query, or you create a cursor or a temporary table or a prepared statement. Whatever you try, you need two round trips. SQL wants to know the return type at call time.

The closest I could get to a "dynamic" call is with my custom crosstab_n() function defined in this related answer:


Or you give up the idea of a completely dynamic crosstab query (because, you know, it's impossible) and use a two-step workflow, like mentioned above.

  1. Let a function generate the crosstab query text. You can use the function provided here (and adapt it to your needs!):

    In particular, remove GROUP BY 1, 2, since you do not aggregate rows before the cross tabulation.

  2. Execute the generated function.


For completeness, there is also the new \crosstabview metacommand in psql in Postgres 9.6 (just released) - with similar functionality, and it can display dynamic column names (attaching dynamic names happens in the psql client, not in the Postgres server).

Kinata answered 30/9, 2016 at 1:54 Comment(8)
Thanks, I will go through all this links thoroughly tomorrow. So basically you are saying that its hard to return dynamically columns like that and I should look for some other solution?Postrider
Completely dynamic columns are strictly impossible in SQL. Your way around this limitation depends on your requirements. I added one more alternative above: use a two-step workflow instead ...Kinata
Thanks for your fast response. I got stucked on baby steps. Please take a look at my edited question under Edit1!Postrider
@newbie_girl: Adapt the function to your needs. I added some hints above. You need a basic understanding of what you are doing. Please ask any new questions as new question.Kinata
Sorry, thought it would be inapproriate to open a new subject. thanks for additional instruction. but I guess I will have to ask new questions.Postrider
@newbie_girl: This is a Q/A site. Well formulated questions are always welcomeKinata
crosstab function i am not able to use it. Please let me know why I am getting error while using it.Melanimelania
I have a requirement of creating a table based on cursor column values as columns in that create table. Is it possible to achieve it ?Melanimelania

© 2022 - 2024 — McMap. All rights reserved.