How to unnest multiple columns in presto, outputting into corresponding rows
Asked Answered
B

1

5

I'm trying to unnest some code

I have a a couple of columns that have arrays, both columns using | as a deliminator

The data would be stored looking like this, with extra values to the side which show the current currency

I want to output it like this I tried doing another unnest column, like this

SELECT c.campaign, c.country, a.product_name, u.price--, u.price -- add price to this split. handy for QBR

FROM c, UNNEST(split(price, '|')) u(price), UNNEST(split(product_name, '|')) a(product_name)

group by 1,2, 3, 4

but this duplicated several rows, so I'm not sure if unnesting the two columns doesn't quite work

Thanks

Buckthorn answered 23/4, 2019 at 16:13 Comment(0)
S
18

The issue with your query is that the clause FROM c, UNNEST(...), UNNEST(...) is effectively computing the cross join between each row of c and the rows produced by each of the derived tables resulting from the UNNEST calls.

You can solve it by unnesting all your arrays in a single call to UNNEST, thus, producing a single derived table. When used in that manner, the UNNEST produces a table with one column for each array and one row for each element in the arrays. If the arrays have a different length, it will produce rows up to the number of elements in the largest array and fill in with NULL for the column of the smaller array.

To illustrate, for your case, this is what you want:

WITH data(a, b, c) AS (
    VALUES
        ('a|b|c', '1|2|3', 'CAD'),
        ('d|e|f', '4|5|6', 'USD')
)
SELECT t.a, t.b, data.c
FROM data, UNNEST(split(a, '|'), split(b, '|')) t(a, b)

which produces:

 a | b |  c
---+---+-----
 a | 1 | CAD
 b | 2 | CAD
 c | 3 | CAD
 d | 4 | USD
 e | 5 | USD
 f | 6 | USD
(6 rows)
Submarine answered 23/4, 2019 at 16:47 Comment(1)
thanks for sharing this useful technique! Im curious what if I have many rows of record for the data array column (which I would like to unnest) - I can't possibly label ` VALUES ('a|b|c', '1|2|3', 'CAD'), ('d|e|f', '4|5|6', 'USD') ` - what should I do then?Baccate

© 2022 - 2024 — McMap. All rights reserved.