I want to calculate the cumulative product across rows in Snowflake.
Basically I have monthly rates that multiplied accumulate across time.
(Some databases have the product()
SQL function for that).
I want to calculate the cumulative product across rows in Snowflake.
Basically I have monthly rates that multiplied accumulate across time.
(Some databases have the product()
SQL function for that).
A trick suggested by Sterling Paramore: Add logs, and then exponentiate it:
with data as (select $1 x from values (1),(2),(3),(4),(5))
select x
, sum(x) over(order by x) sum
, exp(sum(ln(x)) over(order by x)) mult
from data
ln(a*b) = ln(a) + ln(b)
the following is true exp(ln(a) + ln(b)) = a*b
, since exp(ln(x)) = x
for any x>=0
. –
Spatula If a built-in function doesn't exist, it's usually possible to roll something custom using a User-Defined Table Function.
In this case:
CREATE OR REPLACE FUNCTION CUMULATIVE_PRODUCT(VALUE double)
RETURNS TABLE (PRODUCT double)
LANGUAGE JAVASCRIPT
AS '{
initialize: function(argumentInfo, context) {
this.cumulativeProduct = 1;
},
processRow: function f(row, rowWriter, context){
this.cumulativeProduct = this.cumulativeProduct*row.VALUE;
rowWriter.writeRow({PRODUCT: this.cumulativeProduct});
}
}';
Example table:
create temp table sample_numbers as (
select 1 as index, 5.1::double as current_value
union all
select 2 as index, 4.3::double as current_value
union all
select 3 as index, 3.7::double as current_value
union all
select 4 as index, 3.9::double as current_value
)
invoking the UDTF:
select index,current_value,PRODUCT as cumulative_product
from sample_numbers,table(CUMULATIVE_PRODUCT(current_value) over ())
Note the empty over() clause which forces Snowflake to do a single sequential run over the data instead of splitting it into parallel chunks
© 2022 - 2024 — McMap. All rights reserved.