Transpose latest rows per user to columns
Asked Answered
F

1

69

I have the following table, which gives multiple email addresses for each user.

table sample

I need to flatten this out to columns on a user query to give me the "newest" 3 email addresses based on the creation date. So the expected output looks like:

user_name user_id email1 email2 email3
Mary 123 [email protected] [email protected] [email protected]
Joe 345 [email protected] [NULL] [NULL]
Funds answered 14/4, 2014 at 12:35 Comment(0)
I
86

Use crosstab() from the tablefunc module.

SELECT * FROM crosstab(
   $$SELECT user_id, user_name, rn, email_address
     FROM  (
        SELECT u.user_id, u.user_name, e.email_address
             , row_number() OVER (PARTITION BY u.user_id
                            ORDER BY e.creation_date DESC NULLS LAST) AS rn
        FROM   usr u
        LEFT   JOIN email_tbl e USING (user_id)
        ) sub
     WHERE  rn < 4
     ORDER  BY user_id
   $$
  , 'VALUES (1),(2),(3)'
   ) AS t (user_id int, user_name text, email1 text, email2 text, email3 text);

I used dollar-quoting for the first parameter, which has no special meaning. It's just convenient to escape single quotes in the query string, which is a common case:

Detailed explanation and instructions:

And in particular, for "extra columns":

The special difficulties here are:

  • The lack of key values.
    → We substitute with row_number() in a subquery.

  • The varying number of emails.
    → We limit to a max. of three in the outer SELECT and use crosstab() with two parameters, providing a list of possible keys.

Pay attention to NULLS LAST in the ORDER BY.

Imprimis answered 14/4, 2014 at 13:42 Comment(6)
perfect, thankyou. Does exactly what I need. This query would have been a bit beyond me, but has helped me learn and understand much.Funds
For those few wondering. No, this does not work on AWS Redshift.Marvin
@MarcelloGrechiLins: This Q & A are for Postgres, not Redshift.Imprimis
@ErwinBrandstetter I know, but since Redshift uses "Postgre Compliant" drivers, most Redshift users looking for answers to their questions search for "PostgreSQL" instead, since the community is bigger and the chances if finding the answer for your Redshift issue, on a Postgre post is higher :) Hopefully that makes senseMarvin
@MarcelloGrechiLins: Redshift is a fork based on Postgres 8.0.2 and the list of unsupported Postgres features has grown very long. BTW, it's PostgreSQL or Postgres, never "Postgre". That's like saying "Redshif".Imprimis
Hi! Landed on this here 8 years later & it's still helpful! Thank youSolano

© 2022 - 2024 — McMap. All rights reserved.