How to compute a median in PrestoSQL?
Asked Answered
A

2

15

It seems like there is no native function for that purpose in Presto SQL. Do you know any way to efficiently aggregate a group and return its median?

Arbutus answered 23/9, 2020 at 14:41 Comment(0)
B
27

approx_percentile() should be a reasonable approach. Assuming a table like mytable(id, val), that you want to aggregate by id:

select id, approx_percentile(val, 0.5) median_val
from mytable
group by id
Beeves answered 23/9, 2020 at 14:47 Comment(1)
Note that this is an approximate solution. Upcoming Presto release will provide more accuracy in general case, follow github.com/prestosql/presto/pull/5158 for details.Staysail
C
2

Well in my case, I am using Presto and the most-voted solution didn't work, so I had to do it in 2 CTEs:

  1. I used the row_number() to number all the rows over my sequence of values in the order desired
  2. I discovered the row in the middle by dividing the max(row_number) by 2 (in my case there was no need to treat odd or even numbers)
  3. Inner join to my original data, where I just select the records with the same row_number as the CTE 2

(it might not be the best logic, but I did it in a rush and worked fine)

Corollary answered 11/1 at 18:21 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.