How to annotate a queryset with number of days since creation
Asked Answered
S

1

3

I'm trying to do some complex ordering based on promotions: an article is promoted every 7th day since creation (the articles expire after 30 days).

My approach is to annotate the queryset with the number of days since it was created, but the value of the annotated field (days_since_creation), in my code, is always 0.

from datetime import timedelta
from django.test import TestCase
from django.db.models import ExpressionWrapper, F
from django.db.models.fields import IntegerField
from django.utils import timezone
from .models import Article
    # class Article(models.Model):
    #     title = models.CharField(max_length=150)
    #     creation_date = models.DateTimeField()
    # 
    #     def __str__(self):
    #         return self.title


class ArticleTestCase(TestCase):

    def test_days_since_creation(self):
        now = timezone.now()
        objects_data = [
            Article(
                title='Alrp',
                creation_date=(now - timedelta(days=5)) # 5 days ago
            ),
            Article(
                title='Bopp',
                creation_date=(now - timedelta(days=7)) # 7 days ago
            ),
            Article(
                title='Crkp',
                creation_date=(now - timedelta(days=14)) # 14 days ago
            ),
        ]
        Article.objects.bulk_create(objects_data)
        article_set = Article.objects\
            .annotate(
                days_since_creation=ExpressionWrapper(
                    now - F('creation_date'),
                    output_field=IntegerField()
                )
            )
        for article in article_set:
            print(article.days_since_creation)
        # all 3 objects print "0"

I expected the value for each object to be 5, 7, 14 repectively. I even tried DurationField but that just printed 0:00:00.

After that I would annotate the queryset again with an order field that has the value 0 if the value of days_since_past is in [0, 7, 14, 21, 28,] or 1 otherwise, and then order_by('order').

Synchronism answered 6/5, 2019 at 4:11 Comment(2)
I found answer related to your problem: Django How to mock datetime.now for unit testingTellford
@Tellford How exactly does it relate? I dont think the problem is with the now function. I printed out the datetimes to check their value and everything is correct.Synchronism
K
6

It's a little bit hacky, but works. You need to convert now to DateTimeField. Result will be in DurationField - standard for timedelta. For order we are going to use Case to check whether or not we are in selected duration and then we sort it.

I specifically change initial order of Articles to test ordering.

from datetime import timedelta
from django.test import TestCase
from django.db.models import ExpressionWrapper, F, Value, Case, When
from django.db.models.fields import DateTimeField, DurationField, BooleanField
from django.utils import timezone
from .models import Article


class ArticleTestCase(TestCase):
    def test_days_since_creation(self):
        now = timezone.now()

        order_in_variants = [
            timedelta(days=0),
            timedelta(days=7),
            timedelta(days=14),
            timedelta(days=21),
            timedelta(days=28),
        ]

        objects_data = [
            Article(
                title='Crkp',
                creation_date=(now - timedelta(days=14))  # 14 days ago
            ),
            Article(
                title='Alrp',
                creation_date=(now - timedelta(days=5))  # 5 days ago
            ),
            Article(
                title='Bopp',
                creation_date=(now - timedelta(days=7))  # 7 days ago
            ),
        ]
        Article.objects.bulk_create(objects_data)
        article_set = Article.objects.all().annotate(
            days_since_creation=ExpressionWrapper(
                Value(now, DateTimeField()) - F('creation_date'),
                output_field=DurationField()
            )
        ).annotate(
            order=Case(
                When(days_since_creation__in=order_in_variants, then=Value(False)),
                default=Value(True),
                output_field=BooleanField(),
            )
        ).order_by('order')
        for article in article_set:
            print(article.days_since_creation.days, article.order)
        # 14 False
        # 7 False
        # 5 True

You also might want to check that days_since_creation in range instead of being exactly 7 days difference. Makes it even uglier, but still:

order_in_variants = [
            (timedelta(days=0), timedelta(days=0, hours=23, minutes=59, seconds=59, microseconds=999999)),
            (timedelta(days=7), timedelta(days=7, hours=23, minutes=59, seconds=59, microseconds=999999)),
            (timedelta(days=14), timedelta(days=14, hours=23, minutes=59, seconds=59, microseconds=999999)),
            (timedelta(days=21), timedelta(days=21, hours=23, minutes=59, seconds=59, microseconds=999999)),
            (timedelta(days=28), timedelta(days=28, hours=23, minutes=59, seconds=59, microseconds=999999)),
        ]
# ...
order=Case(
                When(days_since_creation__range=order_in_variants[0], then=Value(False)),
                When(days_since_creation__range=order_in_variants[1], then=Value(False)),
                When(days_since_creation__range=order_in_variants[2], then=Value(False)),
                When(days_since_creation__range=order_in_variants[3], then=Value(False)),
                When(days_since_creation__range=order_in_variants[4], then=Value(False)),
                default=Value(True),
                output_field=BooleanField(),
            )
Klemm answered 6/5, 2019 at 5:25 Comment(9)
Works great! Why do you say using Value is hacky?Synchronism
Value isn't hacky, it feels overly complicated and it would be probably more efficient doing this in database.Klemm
You mean that the resulting query string from the article_set is inefficient? Or just that the code is complicated? Now that you mention range checks, wouldn't it be cleaner ignoring the time part? Cast the fields to DateField instead of DateTimeField? Changing it to Value(now.date(), DateField()) - F('creation_date__date'), doesn't work, the value of days_since_creation is 0 again.Synchronism
Scrap that, it's pretty fine sql-wise. If it looks cleaner and works - than probably it's better. I couldn't make it work though.Klemm
You mean scrap the cast to date logic? Also multiple When clauses are great for fine-tuning the queryset, using IntegerField instead of BooleanField for the values of the order field.Synchronism
Scrap my comments on it being not good - I thought that resulted sql would be inefficient, but I was wrong. Yep, you can use that to find to which exact range it refers. If this answer solved your problem, consider marking it as correct. You can refer to SO docsKlemm
I've implemented the logic in production and it works great. I realized the importance of checking the range for the days_since_creation field in the order annotate. The code in the example works only because the same now variable is used throughout the code. A milisecond difference between creation and query and it won't work.Synchronism
@Synchronism I'm trying to follow the discussion because I have a similar issue. DId you end up using Gasanov's answer, or did you use date instead of datetime?Roehm
I'm using Gasanov's answer. I'm pretty sure the entire approach is bad on big tables.Synchronism

© 2022 - 2025 — McMap. All rights reserved.