Annotate with value of latest related in Django 1.8 using conditional annotation
Asked Answered
B

4

29

I have the following models:

class City(models.Model):
    ...

class Census(models.Model):
    city = models.ForeignKey(City)
    date = models.DateTimeField()
    value = models.BigIntegerField()

Now I'd like to annotate a City-queryset with the value of the latest Census. How do I achieve that?

I have tried:

City.objects.annotate(population=Max('census__date'))
# --> annotates date and not value

City.objects.annotate(population=Max('census__value'))
# --> annotates highest value, not latest

City.objects.annotate(population=
    Case(
        When(
            census__date=Max('census__date'),
            then='census__value')
        )
    )

# --> annotates 'None'

City.objects.annotate(population=
    Case(
        When(
            census__date=Max('census__date'),
            then='census__value')
        ), output_field=BigIntegerField()
    )

# --> takes forever (not sure what happens at the end, after some minutes I stopped waiting)

Any help greatly appreciated!

Bezoar answered 5/7, 2015 at 20:50 Comment(4)
1) Performance matters or is just an offline sporadic report ? 2) Should be database brand agnostic?Armoury
1) Yes, performance matters, some performance compromise for beauty is acceptable though. 2) Yes, should at least work with MySQL and PostgreSQL.Bezoar
Newer Django versions (>1.8) solution here: #43775602Madi
For Django 3.2+ see: https://mcmap.net/q/243813/-in-django-is-there-a-way-to-directly-annotate-a-query-with-a-related-object-in-single-queryVanhook
D
11

I've also been having an issue where I need a max value object of a related set, but I require the entire object. I have not been able to figure out a solution using annotation and Case. Until I do, I use this prefetching solution. If each city does not have a large amount of census objects, or if your application is not performance bound, this may work for you.

inner_qs = Census.objects.order_by('-date')
cities = City.objects.prefetch_related(Prefetch("census_set", queryset=inner_qs, to_attr="census_list"))

class City(models.Model):
    @property
    def latest_census(self):
        if hasattr(self, 'census_list') and len(self.census_list) > 0:
            return self.census_list[0]
        return None

If this does not work for you, consider some of the suggestions found here: http://blog.roseman.org.uk/2010/08/14/getting-related-item-aggregate/

Doiron answered 6/7, 2015 at 2:30 Comment(2)
Thanks Mark, that looks like a good workaround. Unfortunately I do have a lot of census objects per city, but maybe I can limit the Prefetch qs like this: inner_qs = Census.objects.order_by('-date')[:1] to still make it efficient. I can't test it right now, but will try it as soon as possible.Bezoar
Unfortunately, limiting like this doesn't work (fails with "Cannot filter a query once a slice has been taken.") - I will start a bounty and see if someone comes up with a better solution.Bezoar
A
5

At this moment, they are not a django query expression to annotate a not aggregated field from a related 1:N model based on a sql having expression.

You can accomplish it with several workarounds like split query and work with data in memory ( itertools groupby f.e. ) or through raw queries. But this will not match your requirements performance and database agnostic.

I explain here what I will do if this was my app. For developers is hard to have redundancy in database. In your scenario, last census by city is a calculated field ... but, in this case, take in consideration to materialize last_census:

The dirty work ...

class City(models.Model):
    last_census = models.ForeignKey('Census', null=True, 
                                     blank=True, editable=False)
    ...

For easy maintenance, you can overrite save and delete methods on Census to keep last_census up to date.

class Census(models.Model):
    ...

    #overriding save
    def save(self, *args, **kwargs):
        super(Census, self).save(*args, **kwargs)
        #updating last_census on referenced city
        max_census = Census.objects.filter( city = self.city ).latest('date')
        self.city.last_census = max_census.city if max_census else None
        self.city.save()

    #overriding delete
    def delete(self, *args, **kwargs):
        #updating last_census on referenced city
        max_census = ( Census.objects
                      .filter( city = self.city )
                      .exclude( id = self.id )
                      .latest('date') )
        self.city.last_census = max_census.city if max_census else None
        self.city.save()
        super(Census, self).delete(*args, **kwargs)

Notice: if you are more comfortable, you can code it with signals ( pre_delete, post_save, ... ) instead overriding methods.

The best ...

Your query now:

City.objects.select_related('last_census__value').all()
Armoury answered 9/7, 2015 at 9:9 Comment(4)
Is there any way to avoid race conditions with this solution?Doiron
What you means? Explain. Be free to illustrate with a sample.Armoury
My application will have many users adding and removing 'Census' objects at the same time. I need to make sure this redundant field is accurate, as the users will notice if it is not.Doiron
You should learn about database transactions, isolation levels and django transaction control management.Armoury
P
2

Something like this may work for you:

I have this to show last reservation for restaurants

Reservation.objects.filter(
        restaurant__city_id__gt=0
        ).distinct().annotate(city_count=Count(
        'restaurant_id')
        ).order_by('-reservationdate').filter(city_count__gte=1)[:20]

in your case it may be something like:

city = Census.objects.filter(
        city_id__gt=0
        ).distinct().annotate(city_count=Count(
        'city_id')
        ).order_by('-date').filter(city_count__gte=1)[:20]

and your html

{% for city in city %}
{{ city.city.name }} {{ city.date }}<br>
{% endfor %}
Progenitive answered 8/7, 2015 at 0:32 Comment(4)
I like the idea, but it has to be a City QuerySet.Bezoar
Thats the trick, you dont actually query the city at least not in the Django ORM eyes. You query the Census and display the City name in the results with {{ census.city.name }}. The city is actually inside your census tables as city_id. So you query that and display the city name as census.city.nameProgenitive
Sorry, I wasn't clear. Yes, I understood that, but I actually need it to be a City-QuerySet.Bezoar
What results are you actually looking for? Just asking because Django ORM does the join for you and you can display all and any data from the City table through the census QuerySet. There is no difference ...Progenitive
C
2

It's late and I'm getting hungry so I see won't this all the way through*, but the following code will return the latest census value in a valueset for a given city. A queryset might be possible, but again I am hungry!

*I wasn't sure if you needed to retrieve all of the latest values for all cities or a specific city. The latter is pretty easy, the former is a bit harder. You could easily put this as a method on your model and call it on every city in a loop inside of a template/view.

Hope this helps!

from app.models import *
from django.db.models import F

    City.objects.annotate(
        values=F("census__value"),
        date=F("census__date"))\
        .values('values', 'name').filter(name="Atlanta")\
        .latest('date')
Chongchoo answered 9/7, 2015 at 5:1 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.