How can I get the corresponding value of a column based on an aggregate of another column?
Asked Answered
H

1

6

I am migrating some PostgreSQL queries to Snowflake and I am struggling to replicate the following query:

WITH test_table(col1, col2, col3, col4) AS (
    VALUES 
        (1, 1, 7, 5),
        (1, 1, 6, 4),
        (1, 2, 1, 4),
        (1, 2, 2, 5),
        (1, 2, 3, 6)
)

SELECT
    col1,
    col2,
    (max(ARRAY[col3, col4]))[2]
FROM test_table
GROUP BY 1, 2

This query manages to get the value of col4 that corresponds to the maximum value of col3 grouped by col1, col2. However, I do not find any elegant solution to replicate this on Snowflake.

Hereford answered 26/1, 2021 at 17:50 Comment(1)
Please add some example source data to allow everyone to easily replicate your results.Structure
D
4

You can aggregate an array by any order. So you can group col3 ordered by col4:

with test_table(col1, col2, col3, col4) 
as (
    select * from (values 
        (1, 1, 20, 1), -- added 
        (1, 1, 7, 5),
        (1, 1, 6, 4),
        (1, 2, 1, 4),
        (1, 2, 2, 5),
        (1, 2, 3, 6))
)

SELECT
    col1,
    col2,
    array_agg(col4) within group(order by col3 desc) c4_by_max_c3
FROM test_table
GROUP BY 1, 2

enter image description here

Instead of getting this array, you can just pick the top number with [0]:

SELECT
    col1,
    col2,
    array_agg(col4) within group(order by col3 desc)[0] c4_by_max_c3
FROM test_table
GROUP BY 1, 2

enter image description here

Hopefully Snowflake will also support max_by in the future, but it's not yet implemented. Meanwhile this will work.


2023 Update: MAX_BY() is supported

https://docs.snowflake.com/en/sql-reference/functions/min_by.html

Departmentalize answered 26/1, 2021 at 19:33 Comment(2)
MAX_BY is now supported in SnowflakeGuff
Good call! docs.snowflake.com/en/sql-reference/functions/min_by.htmlDepartmentalize

© 2022 - 2024 — McMap. All rights reserved.