Using a filter condition on a 'PARTITION BY'
Asked Answered
S

1

6

I have two tables. One is a Reference table, used to sort priority, and one is a Customer table. The Reference table is used to give priority to each column in the Customer table, to give a different order for the individual columns for a single customer.

Reference Table:

---------------------------------------
| Priority |   Attribute |  sourceID  |
---------------------------------------
|   1      |     EMAIL   |      1     |
|   2      |     EMAIL   |      2     |
|   3      |     EMAIL   |      3     |
|   2      |     NAME    |      1     |
|   1      |     NAME    |      2     |
|   3      |     NAME    |      3     |
---------------------------------------

Customer table:

-----------------------------------------------------------------------
| CustomerID |  Name   |       Email        |  SourceID |     Date    |
-----------------------------------------------------------------------
|    1       |  John   |       NULL         |     1     |  03/01/2017 |
|    1       |  NULL   |   [email protected]   |     3     |  01/01/2017 |
|    1       |   J     |  [email protected] |     2     |  02/01/2017 |
-----------------------------------------------------------------------

Result:

---------------------------------------------
| CustomerID   |  Name  |       Email       |
---------------------------------------------
|      1       |  John  | [email protected] |
---------------------------------------------

At the moment I'm using the following query to do this:

SELECT DISTINCT
       FIRST_VALUE(c.Name IGNORE NULLS) 
           OVER (PARTITION BY p.customerID 
                 ORDER BY r.PRIORITY, c.DATE 
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS EMAIL,
      FIRST_VALUE(c.Email IGNORE NULLS) 
           OVER (PARTITION BY c.customerID 
                 ORDER BY r.PRIORITY, c.DATE 
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS EMAIL
FROM Customer c
  JOIN reference r ON c.sourceID = r.sourceID;

However, this does take the different attributes for each column into consideration. I need to add a filter of some sort to each of the partitions by parts.

Can anyone assist with how I can go about doing this?

Suspensor answered 21/7, 2017 at 9:56 Comment(0)
B
5

One method is to put the attributes for a customer in a column and then recombine them:

SELECT DISTINCT customerId
       first_value(CASE WHEN ca.attribute = 'NAME' THEN ca.val end) OVER
           (PARTITION BY ca.customerId, attribute ORDER BY r.priority, ca.date) AS name,
       first_value(CASE WHEN ca.attribute = 'EMAIL' THEN ca.val END) OVER
           (PARTITION BY ca.customerId, attribute ORDER BY r.priority, ca.date) AS email
FROM ((SELECT customerId, 'NAME' AS attribute, name AS val, sourceId, date
       FROM customer c
      ) UNION ALL
      (SELECT customerId, 'EMAIL' AS attribute, email AS val, sourceId, date
       FROM customer c
      )
     ) ca JOIN
     reference r
     ON r.sourceId = ca.sourceId AND r.attribute = ca.attribute;

Note that this uses SELECT DISTINCT instead of GROUP BY. I don't think that Netezza has a first_value() aggregation function, so this construct gets around that problem.

Barbarbarbara answered 21/7, 2017 at 10:44 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.