How to do SELECT COUNT(*) GROUP BY and ORDER BY in Django?
Asked Answered
W

4

157

I'm using a transaction model to keep track all the events going through the system

class Transaction(models.Model):
    actor = models.ForeignKey(User, related_name="actor")
    acted = models.ForeignKey(User, related_name="acted", null=True, blank=True)
    action_id = models.IntegerField() 
    ......

how do I get the top 5 actors in my system?

In sql it will basically be

SELECT actor, COUNT(*) as total 
FROM Transaction 
GROUP BY actor 
ORDER BY total DESC
Walton answered 30/9, 2013 at 18:54 Comment(1)
Please read: docs.djangoproject.com/en/dev/topics/db/aggregationDusk
D
292

According to the documentation, you should use:

from django.db.models import Count
Transaction.objects.all().values('actor').annotate(total=Count('actor')).order_by('total')

values() : specifies which columns are going to be used to "group by"

Django docs:

"When a values() clause is used to constrain the columns that are returned in the result set, the method for evaluating annotations is slightly different. Instead of returning an annotated result for each result in the original QuerySet, the original results are grouped according to the unique combinations of the fields specified in the values() clause"

annotate() : specifies an operation over the grouped values

Django docs:

The second way to generate summary values is to generate an independent summary for each object in a QuerySet. For example, if you are retrieving a list of books, you may want to know how many authors contributed to each book. Each Book has a many-to-many relationship with the Author; we want to summarize this relationship for each book in the QuerySet.

Per-object summaries can be generated using the annotate() clause. When an annotate() clause is specified, each object in the QuerySet will be annotated with the specified values.

The order by clause is self explanatory.

To summarize: you group by, generating a queryset of authors, add the annotation (this will add an extra field to the returned values) and finally, you order them by this value

Refer to https://docs.djangoproject.com/en/dev/topics/db/aggregation/ for more insight

Good to note: if using Count, the value passed to Count does not affect the aggregation, just the name given to the final value. The aggregator groups by unique combinations of the values (as mentioned above), not by the value passed to Count. The following queries are the same:

Transaction.objects.all().values('actor').annotate(total=Count('actor')).order_by('total')
Transaction.objects.all().values('actor').annotate(total=Count('id')).order_by('total')
Dihybrid answered 30/9, 2013 at 19:44 Comment(8)
For me it worked as Transaction.objects.all().values('actor').annotate(total=Count('actor')).order_by('total'), don't forget to import Count from django.db.models. Thank youLeahleahey
Good to note: if using Count (and maybe other aggregators), the value passed to Count does not effect the aggregation, just the name given to the final value. The aggregator groups by unique combinations of the values (as mentioned above), not by the value passed to Count.Slipstream
You can even use this for postgres search result query sets to get faceting!Mauldon
@Slipstream It does affect it, nowadays at least (I'm using Django 2.1.4). In the example, total is the name given and the count used in sql is COUNT('actor') which in this case doesn't matter, but if e.g. values('x', 'y').annotate(count=Count('x')), you'll get COUNT(x), not COUNT(*) or COUNT(x, y), just tried it in ./manage.py shellLeprechaun
many years since this answer but since nobody pointed out: the .order_by('total') should be .order_by('-total') since OP asked for DESC order byGamopetalous
shouldn't the .all() be at the end? if you call it as first method in the chain you will load from db all the recordsForayer
Are you sure about "Good to note"? If I use Count(1) I will get group with Null value properly, but if I use Count("<fieldname>") the result for the Null value group is =0.Gifted
@Gifted sadly this answer is almost 10 years old and I haven't used django in a long time so I don't know / remember. It could have changed as well.Dihybrid
K
60

Just like @Alvaro has answered the Django's direct equivalent for GROUP BY statement:

SELECT actor, COUNT(*) AS total 
FROM Transaction 
GROUP BY actor

is through the use of values() and annotate() methods as follows:

Transaction.objects.values('actor').annotate(total=Count('actor')).order_by()

However one more thing must be pointed out:

If the model has a default ordering defined in class Meta, the .order_by() clause is obligatory for proper results. You just cannot skip it even when no ordering is intended.

Further, for a high quality code it is advised to always put a .order_by() clause after annotate(), even when there is no class Meta: ordering. Such approach will make the statement future-proof: it will work just as intended, regardless of any future changes to class Meta: ordering.


Let me provide you with an example. If the model had:

class Transaction(models.Model):
    actor = models.ForeignKey(User, related_name="actor")
    acted = models.ForeignKey(User, related_name="acted", null=True, blank=True)
    action_id = models.IntegerField()

    class Meta:
        ordering = ['id']

Then such approach WOULDN'T work:

Transaction.objects.values('actor').annotate(total=Count('actor'))

That's because Django performs additional GROUP BY on every field in class Meta: ordering

If you would print the query:

>>> print Transaction.objects.values('actor').annotate(total=Count('actor')).query
  SELECT "Transaction"."actor_id", COUNT("Transaction"."actor_id") AS "total"
  FROM "Transaction"
  GROUP BY "Transaction"."actor_id", "Transaction"."id"

It will be clear that the aggregation would NOT work as intended and therefore the .order_by() clause must be used to clear this behaviour and get proper aggregation results.

See: Interaction with default ordering or order_by() in official Django documentation.

Kilburn answered 21/3, 2017 at 21:9 Comment(2)
.order_by() saved me from ordering in Meta.Closure
"If the model has a default ordering defined in class Meta, the .order_by() clause is obligatory for proper results." Thank you. This wasn't working, and adding the order_by fixed it. Requiring an order_by to make a group by work is very intuitive, and an important quirk of Django's ORM to understand.Healall
M
4

If you want reverse (bigger value to smaller value) order just use - minus.

from django.db.models import Count
Transaction.objects.all().values('actor').annotate(total=Count('actor')).order_by('-total')
Madra answered 30/8, 2021 at 19:33 Comment(0)
M
0

actually i am not getting to use values() and annotations combine.

The code written below didn't work when i had tried

from django.db.models import Count

Transaction.objects.all().values('actor').annotate(total=Count('actor')).order_by('-total')

so, i prefer the below type code for grouping.

get_posts(user_id):
user = Post.objects.all().\
    values('posted_by').\
    annotate(count=Count("post_id"))
return user
Meijer answered 21/4, 2023 at 10:36 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.