Django aggregate queries with expressions
Asked Answered
S

4

7

I have a model XYZ and I need to get the max value for fields a, b, and expression x/y for a given queryset.

It works beautifully for fields. Something like:

>>> XYZ.all().aggregate(Max('a'))

... {'a__max': 10}

However, I can't find a way to do it for expressions. Trying something like:

>>> XYZ.all().aggregate(Max('x/y'))

Gives an error:

*** FieldError: Cannot resolve keyword 'x/y' into field. Choices are: a, b, x, y, id

Trying something like:

>>> XYZ.all().aggregate(Max(F('x')/F('y')))

Gives an error:

*** AttributeError: 'ExpressionNode' object has no attribute 'split'

And even something like:

XYZ.all().extra(select={'z':'x/y'}).aggregate(Max('z'))

Also doesn't work and gives the same error as above:

FieldError: Cannot resolve keyword 'z' into field. Choices are: a, b, x, y, id

The one hack I found to do it is:

XYZ.all().extra(select={'z':'MAX(x/y)'})[0].z

Which actually works because it generates the right SQL, but it's confusing because I do get the right value at the z atttribute, but not the right instance, the one with that max value.

Of course, I could also use raw queries or tricks with extra() and order_by(), but it really doesn't make sense to me that Django goes all the way to support aggregate queries in a nice way, but can't support expressions even with its own F expressions.

Is there any way to do it?

Siobhansion answered 19/4, 2012 at 3:52 Comment(1)
You may be interested to know that the ability to use F() objects in aggregates is part of the upcoming Django 1.8 release.Belgravia
G
6

In SQL, what you want is actually

SELECT x/y, * FROM XYZ ORDER BY x/y DESC LIMIT 1;
# Or more verbose version of the #1
SELECT x/y, id, a, b, x, y FROM XYZ GROUP BY x/y, id, a, b, x, y ORDER BY x/y DESC LIMIT 1;
# Or
SELECT * FROM XYZ WHERE x/y = (SELECT MAX(x/y) FROM XYZ) LIMIT 1;

Thus in Django ORM:

XYZ.objects.extra(select={'z':'x/y'}).order_by('-z')[0]
# Or
XYZ.objects.extra(select={'z':'x/y'}).annotate().order_by('-z')[0]
# Or x/y=z => x=y*z
XYZ.objects.filter(x=models.F('y') * XYZ.objects.extra(select={'z':'MAX(x/y)'})[0].z)[0]

The version

XYZ.all().extra(select={'z':'MAX(x/y)'})[0].z

does not have correct x,y and instance because the MAX function is evaluated among all rows, when there is no GROUP BY, thus all instances in the returned QuerySet will have same value of z as MAX(x/y).

Galenism answered 19/4, 2012 at 14:31 Comment(2)
Right, but the intent is to get the max value itself, as the return from XYZ.all().aggregate(Max('a')) does, not the instance containing it. The version with extra-select is the closest to it. Not returning the right instance is a confusing side effect, but it returns the right value. As I said in the opening message, I am aware of the solutions with extra and order_by, but these are not acceptable since they require a full sort of the table, not a single pass. It doesn't make much sense for Django to support Max aggregate with single fields but not expressions.Siobhansion
@pjwerneck The reason of the confusing side effect you called, is described in the last paragraph of my answer. If you only want max value, XYZ.objects.extra(select={'z':'MAX(x/y)'})[0].z is enough, there is no order_by. Or even directly cursor.execute('SELECT MAX(x/y) from XYZ'). I agree w/ you that Django does not provide aggregate w/ expressions, because it could be dramatically more difficult than supporting single field IMO.Galenism
K
3

Your example that uses F() objects should work fine since Django 1.8:

XYZ.all().aggregate(Max(F('x')/F('y')))

There's a snippet that demonstrates aggregation with Sum() and F() objects in the Django aggregation cheat sheet:

Book.objects.all().aggregate(price_per_page=Sum(F('price')/F('pages'))
Katydid answered 5/1, 2016 at 23:0 Comment(1)
Good to know. Thanks!Siobhansion
R
0

For versions lower than 1.8 you can achieve the same in this (undocumented) way.

Book.objects.all().aggregate(price_per_page=Sum('price_per_page', 
                                                field='book_price/book_pages'))

This works for Postgres, I don't know about MySQL.

Source: Django Aggregation: Summation of Multiplication of two fields

Raving answered 3/6, 2016 at 14:4 Comment(0)
H
-3

I think you should get the Maximum values separately

result = XYZ.aggregate(Max('x'), Max('y'))

And then divide the two fields

result['x__max'] \ result['y__max']
Hartmann answered 19/4, 2012 at 5:30 Comment(1)
That doesn't make any sense. Even if it returns the row with the max x and y pair, that isn't necessarily the max x/y. For instance, Max(x)/Max(y) row is 69/16=4, while max(x/y) is 8/1=8Siobhansion

© 2022 - 2024 — McMap. All rights reserved.