Why is there no PRODUCT aggregate function in SQL?
Asked Answered
F

10

60

Im looking for something like SELECT PRODUCT(table.price) FROM table GROUP BY table.sale similar to how SUM works.

Have I missed something on the documentation, or is there really no PRODUCT function?

If so, why not?

Note: I looked for the function in postgres, mysql and mssql and found none so I assumed all sql does not support it.

Fieldsman answered 12/10, 2010 at 6:42 Comment(9)
I guess it's just far rarer to want to compute the product on a set of numbers than a sum. Even in your example, how frequently do you want to compute the product of a set of prices?Kelm
What is the value of product(table.price) supposed to answer? Reading up on "product aggregate function", I get that it is to return the product of all table.price values found in the result set: Row1.Price * Row2.Price * ... * RowN.Price. But for the life of me, I can't get my head around what that value "means", what information it is supposed to convey? What is the practical application of this, for prices or any other type of value? Please enlighten me.Welbie
for my case its not actually for prices, but for getting a product of yields (qty_out / qty_in ).Fieldsman
@MarjanVenema Use case that brought me here: Tables risk and risk_prevention eevry risk has a damage_value representing the ammount of money which is at risk. Every risk_prevention has a risk_multiplier >0 and <1. The relationship between the two tables is 1 to n. The expected damage is damage_value * all risk_prevention.risk_multiplier. This logic is not my id. It is what the customer is using and what the customer wants in the software. (sorry about the bad pseudo code)Ceaseless
@OliverA. Even so, a risk mulitplier would not simply multiply all values it receives (that would reduce the risk as it would be multiplying "chances" ie values between 0 and 1). I understand the need for a "multiply" function, even one that allows more than two input values, but I cannot see the use for a generic "multiply all values in table column X" function. It simply doesn't seem to have a general application...Welbie
@MarjanVenema I know this is an old article but I have a set of returns of a financial instrument (1%, 3% etc) and want to calculate Product(1+return) -1 calculate the compounded return across groups.Dimension
@Dimension We don't mind adding info to old articles on SO :). Percentages... Interesting! So something like InitialValue*(1+return1) ==> VR1, and then VR1 * (1+return2) ==> VR1+2. Which boils down to InitialValue * (1+return1) * (1+return2) * ... And you could do that as InitialValue * PRODUCT(1+table.ReturnPercentage).Welbie
The exp(sum(log(var1)) is the product(var1) when var is always positive.Dustin
I see a lot of people giving valid use cases for this, so never discount the use for something just because you don't see the use yourself. In my case, I want to calculate the overall probability of a set of conditions in a data model. I join to get the conditions that match, then aggregate the product of the individual probabilities to get the overall probability.Guard
L
30

There is no PRODUCT set function in the SQL Standard. It would appear to be a worthy candidate, though (unlike, say, a CONCATENATE set function: it's not a good fit for SQL e.g. the resulting data type would involve multivalues and pose a problem as regards first normal form).

The SQL Standards aim to consolidate functionality across SQL products circa 1990 and to provide 'thought leadership' on future development. In short, they document what SQL does and what SQL should do. The absence of PRODUCT set function suggests that in 1990 no vendor though it worthy of inclusion and there has been no academic interest in introducing it into the Standard.

Of course, vendors always have sought to add their own functionality, these days usually as extentions to Standards rather than tangentally. I don't recall seeing a PRODUCT set function (or even demand for one) in any of the SQL products I've used.

In any case, the work around is fairly simple using log and exp scalar functions (and logic to handle negatives) with the SUM set function; see @gbn's answer for some sample code. I've never needed to do this in a business application, though.

In conclusion, my best guess is that there is no demand from SQL end users for a PRODUCT set function; further, that anyone with an academic interest would probably find the workaround acceptable (i.e. would not value the syntactic sugar a PRODUCT set function would provide).

Out of interest, there is indeed demand in SQL Server Land for new set functions but for those of the window function variety (and Standard SQL, too). For more details, including how to get involved in further driving demand, see Itzik Ben-Gan's blog.

Luthanen answered 12/10, 2010 at 8:15 Comment(4)
+1 "There is no PRODUCT set function in the SQL Standard. It would appear to be a worthy candidate, though" - so would a geometric mean function.Ermaermanno
Unfortunately this answer is quite wrong. exp(sum(log(column))) works great for positive numbers or see the better answers below.Dustin
Mark, the geo mean is this: exp(avg(log(x)))Dustin
The allusion is this: clamoring gets muted when there is already a way to do something. At any rate, sql's traditional focus is accounting not data science.Dustin
B
59

For MSSQL you can use this. It can be adopted for other platforms: it's just maths and aggregates on logarithms.

SELECT
    GrpID,
    CASE
       WHEN MinVal = 0 THEN 0
       WHEN Neg % 2 = 1 THEN -1 * EXP(ABSMult)
       ELSE EXP(ABSMult)
    END
FROM
    (
    SELECT
       GrpID, 
       --log of +ve row values
       SUM(LOG(ABS(NULLIF(Value, 0)))) AS ABSMult,
       --count of -ve values. Even = +ve result.
       SUM(SIGN(CASE WHEN Value < 0 THEN 1 ELSE 0 END)) AS Neg,
       --anything * zero = zero
       MIN(ABS(Value)) AS MinVal
    FROM
       Mytable
    GROUP BY
       GrpID
    ) foo

Taken from my answer here: SQL Server Query - groupwise multiplication

Badillo answered 12/10, 2010 at 6:52 Comment(5)
"it's just maths and aggregates on logarithms" :) log(a*b*c...*n)=log(a)+log(b)+log(c)...+log(n)Luthanen
Quick question. For this to work, you must use the same base for the log and power functions, right? But LOG and EXP do not operate on the same base: LOG is base 10 while EXP is base e. So the correct answer would be to use LN instead of LOG or 10^ABSMult instead of EXP(ABSMult). Right?Gnarled
In SQL Server there is LOG (base e) and LOG10 (base 10). The bare log function now has an optional base parameter. learn.microsoft.com/en-us/sql/t-sql/functions/log-transact-sql and learn.microsoft.com/en-us/sql/t-sql/functions/…Badillo
Confirmed! Our problem was that we used your code in SQL Server, and later ported it to Teradata, where LOG() is base 10. Obviously, our calculations were off all of sudden for no apparent reason. I'm pretty sure I've seen languages (not necessarily SQL) in which LOG is base 2. Since this question is tagged as general SQL, the take home message for us is to check the behavior of the LOG function in the target implementation if one is not getting the expected results. Brilliant idea and implementation, by the way!Gnarled
This is a good answer for the general case, however for the more limited but very common case of percent returns, where value will never be less than -1 (-100% is a total loss), EXP(SUM(LOG(1+value)))-1 will do you fine. Just make sure to also divide value by 100 and multiply the result by 100 if you store values as percentages.Prismatoid
S
39

I don't know why there isn't one, but (take more care over negative numbers) you can use logs and exponents to do:-

select exp (sum (ln (table.price))) from table ...
Singleton answered 12/10, 2010 at 6:50 Comment(4)
Add round() if you are calculating the product of values of an integer column. Sometimes the .9999... is returned instead of a whole int.Favrot
this is very cleverSudhir
In T-SQL, select exp (sum (log (table.price)))Chantilly
ln(0) gives an error though, so add a case when 0 give 0?Vernice
L
30

There is no PRODUCT set function in the SQL Standard. It would appear to be a worthy candidate, though (unlike, say, a CONCATENATE set function: it's not a good fit for SQL e.g. the resulting data type would involve multivalues and pose a problem as regards first normal form).

The SQL Standards aim to consolidate functionality across SQL products circa 1990 and to provide 'thought leadership' on future development. In short, they document what SQL does and what SQL should do. The absence of PRODUCT set function suggests that in 1990 no vendor though it worthy of inclusion and there has been no academic interest in introducing it into the Standard.

Of course, vendors always have sought to add their own functionality, these days usually as extentions to Standards rather than tangentally. I don't recall seeing a PRODUCT set function (or even demand for one) in any of the SQL products I've used.

In any case, the work around is fairly simple using log and exp scalar functions (and logic to handle negatives) with the SUM set function; see @gbn's answer for some sample code. I've never needed to do this in a business application, though.

In conclusion, my best guess is that there is no demand from SQL end users for a PRODUCT set function; further, that anyone with an academic interest would probably find the workaround acceptable (i.e. would not value the syntactic sugar a PRODUCT set function would provide).

Out of interest, there is indeed demand in SQL Server Land for new set functions but for those of the window function variety (and Standard SQL, too). For more details, including how to get involved in further driving demand, see Itzik Ben-Gan's blog.

Luthanen answered 12/10, 2010 at 8:15 Comment(4)
+1 "There is no PRODUCT set function in the SQL Standard. It would appear to be a worthy candidate, though" - so would a geometric mean function.Ermaermanno
Unfortunately this answer is quite wrong. exp(sum(log(column))) works great for positive numbers or see the better answers below.Dustin
Mark, the geo mean is this: exp(avg(log(x)))Dustin
The allusion is this: clamoring gets muted when there is already a way to do something. At any rate, sql's traditional focus is accounting not data science.Dustin
J
9

You can perform a product aggregate function, but you have to do the maths yourself, like this...

SELECT
    Exp(Sum(IIf(Abs([Num])=0,0,Log(Abs([Num])))))*IIf(Min(Abs([Num]))=0,0,1)*(1-2*(Sum(IIf([Num]>=0,0,1)) Mod 2)) AS P
FROM
   Table1

Source: http://productfunctionsql.codeplex.com/

Joleenjolene answered 12/10, 2010 at 6:47 Comment(0)
C
7

There is a neat trick in T-SQL (not sure if it's ANSI) that allows to concatenate string values from a set of rows into one variable. It looks like it works for multiplying as well:

declare @Floats as table (value float)
insert into @Floats values (0.9)
insert into @Floats values (0.9)
insert into @Floats values (0.9)

declare @multiplier float = null

select 
    @multiplier = isnull(@multiplier, '1') * value
from @Floats

select @multiplier

This can potentially be more numerically stable than the log/exp solution.

Coltoncoltsfoot answered 20/6, 2013 at 15:27 Comment(0)
S
3

I think that is because no numbering system is able to accommodate many products. As databases are designed for large number of records, a product of 1000 numbers would be super massive and in case of floating point numbers, the propagated error would be huge.

Also note that using log can be a dangerous solution. Although mathematically log(a*b) = log(a)*log(b), it might not be in computers as we are not dealing with real numbers. If you calculate 2^(log(a)+log(b)) instead of a*b, you may get unexpected results. For example:

SELECT 9999999999*99999999974482, EXP(LOG(9999999999)+LOG(99999999974482))

in Sql Server returns

999999999644820000025518, 9.99999999644812E+23

So my point is when you are trying to do the product do it carefully and test is heavily.

Spurtle answered 18/1, 2012 at 2:49 Comment(0)
A
1

One way to deal with this problem (if you are working in a scripting language) is to use the group_concat function. For example, SELECT group_concat(table.price) FROM table GROUP BY table.sale

This will return a string with all prices for the same sale value, separated by a comma. Then with a parser you can get each price, and do a multiplication. (In php you can even use the array_reduce function, in fact in the php.net manual you get a suitable example).

Cheers

Antione answered 17/2, 2012 at 18:56 Comment(0)
S
1

The problem can be solved using modern SQL features such as window functions and CTEs. Everything is standard SQL and - unlike logarithm-based solutions - does not require switching from integer world to floating point world nor handling nonpositive numbers. Just number rows and evaluate product in recursive query until no row remain:

   with recursive t(c) as (
     select unnest(array[2,5,7,8])
   ), r(c,n) as (
     select t.c, row_number() over () from t
   ), p(c,n) as (
     select c, n from r where n = 1
     union all
     select r.c * p.c, r.n from p join r on p.n + 1 = r.n
   )
   select c from p where n = (select max(n) from p);

As your question involves grouping by sale column, things got little bit complicated but it's still solvable:

   with recursive t(sale,price) as (
     select 'multiplication', 2 union
     select 'multiplication', 5 union
     select 'multiplication', 7 union
     select 'multiplication', 8 union
     select 'trivial', 1 union
     select 'trivial', 8 union
     select 'negatives work', -2 union
     select 'negatives work', -3 union
     select 'negatives work', -5 union
     select 'look ma, zero works too!', 1 union
     select 'look ma, zero works too!', 0 union
     select 'look ma, zero works too!', 2
   ), r(sale,price,n,maxn) as (
     select t.sale, t.price, row_number() over (partition by sale), count(1) over (partition by sale)
     from t
   ), p(sale,price,n,maxn) as (
     select sale, price, n, maxn
     from r where n = 1
     union all
     select p.sale, r.price * p.price, r.n, r.maxn
     from p
     join r on p.sale = r.sale and p.n + 1 = r.n
   )
   select sale, price
   from p
   where n = maxn
   order by sale;

Result:

sale,price
"look ma, zero works too!",0
multiplication,560
negatives work,-30
trivial,8

Tested on Postgres.

Seclude answered 24/4, 2019 at 10:43 Comment(1)
Connor McDonald mentions this answer in his great article connor-mcdonald.com/2020/11/18/… . There are mentioned also another solutions (XML, model clause) and analyzed from the performance view. I fully agree with Connor the most effective way is custom aggregate function in the Oracle environment.Flibbertigibbet
S
1

Another approach based on fact that the cardinality of cartesian product is product of cardinalities of particular sets ;-)

⚠ WARNING: This example is just for fun and is rather academic, don't use it in production! (apart from the fact it's just for positive and practically small integers)⚠

with recursive t(c) as (
  select unnest(array[2,5,7,8])
), p(a) as (
  select array_agg(c) from t
  union all
  select p.a[2:]
  from p
  cross join generate_series(1, p.a[1])
)
select count(*) from p where cardinality(a) = 0;
Seclude answered 26/4, 2019 at 12:55 Comment(0)
A
0

Here is an oracle solution for anyone who needs it

with data(id, val) as(
select 1,1.0 from dual union all
select 2,-2.0 from dual union all
select 3,1.0 from dual union all
select 4,2.0 from dual 
),
neg(val , modifier) as(
select exp(sum(ln(abs(val)))), case when mod(count(*),2) = 0 then 1 Else -1 end
from data
where val <0
)
,
pos(val) as (
select exp(sum(ln(val)))
from data
where val >=0
)
select (select val*modifier from neg)*(select val from pos) product from dual
Assurbanipal answered 7/12, 2021 at 10:18 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.