SQL Unnest - how to use correctly?
Asked Answered
L

1

7

Say I have some data in a table, t.

id, arr
--, ---
1,  [1,2,3]
2,  [4,5,6]

SQL

SELECT AVG(n) FROM UNNEST(
    SELECT arr FROM t AS n) AS avg_arr 

This returns the error, 'Mismatched input 'SELECT'. Expecting <expression>. What is the correct way to unnest an array and aggregate the unnested values?

Leningrad answered 4/3, 2022 at 22:22 Comment(0)
A
9

unnest is normally used with a cross join and will expand the array into relation (i.e. for every element of array an row will be introduced). To calculate average you will need to group values back:

-- sample data
WITH dataset (id, arr) AS (
    VALUES (1,  array[1,2,3]),
        (2,  array[4,5,6])
) 

--query
select id, avg(n)
from dataset
cross join unnest (arr) t(n)
group by id

Output:

id _col1
1 2.0
2 5.0

But you also can use array functions. Depended on presto version either array_average:

select id, array_average(n)
from dataset

Or for older versions more cumbersome approach with manual aggregation via reduce:

select id, reduce(arr, 0.0, (s, x) -> s + x, s -> s) / cardinality(arr)
from dataset
Aneroid answered 4/3, 2022 at 22:38 Comment(2)
In your first code block, what does t(n) do? It looks like it's referencing the table, t, and creating some index variable?Leningrad
@Leningrad t(n) is an alias for unnest(arr) result. Since arr is array it will be expanded into a single column (n)Aneroid

© 2022 - 2024 — McMap. All rights reserved.