Aggregate difference between DateTime fields in Django

I have a table containing a series of entries which relate to time periods (specifically, time worked for a client):

task_time:
id     |    start_time    |    end_time       |    client (fk)
1        08/12/2011 14:48   08/12/2011 14:50     2

I am trying to aggregate all the time worked for a given client, from my Django app:

time_worked_aggregate = models.TaskTime.objects.\
                        filter(client = some_client_id).\
                        extra(select = {'elapsed': 'SUM(task_time.end_time - task_time.start_time)'}).\
                        values('elapsed')

if len(time_worked_aggregate) > 0:
    time_worked = time_worked_aggregate[0]['elapsed'].total_seconds()
else:
    time_worked = 0

This seems inelegant, but it does work. Or at least so I thought: it turns out that it works fine on a PostgreSQL database, but when I move over to SQLite, everything dies.

A bit of digging suggests that the reason for this is that DateTimes aren't first-class data in SQLite. The following raw SQLite query will do my job:

SELECT SUM(strftime('%s', end_time) - strftime('%s', start_time)) FROM task_time WHERE ...;
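To see what that query computes, here is a minimal sketch using Python's built-in sqlite3 module against an in-memory database. The table layout mirrors task_time above; the sample rows, the ISO-8601 text storage of the timestamps, and the helper name seconds_worked are assumptions made for illustration.

```python
import sqlite3

# In-memory database with a table shaped like task_time (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_time ("
    "id INTEGER PRIMARY KEY, start_time TEXT, end_time TEXT, client INTEGER)"
)
conn.executemany(
    "INSERT INTO task_time (start_time, end_time, client) VALUES (?, ?, ?)",
    [
        ("2011-12-08 14:48:00", "2011-12-08 14:50:00", 2),  # 2 minutes
        ("2011-12-08 15:00:00", "2011-12-08 15:30:00", 2),  # 30 minutes
        ("2011-12-08 09:00:00", "2011-12-08 10:00:00", 3),  # a different client
    ],
)


def seconds_worked(connection, client_id):
    """Total seconds worked for one client, or 0 if the client has no rows."""
    row = connection.execute(
        "SELECT SUM(strftime('%s', end_time) - strftime('%s', start_time)) "
        "FROM task_time WHERE client = ?",
        (client_id,),
    ).fetchone()
    # SUM over zero rows yields NULL, which sqlite3 returns as None.
    return row[0] or 0


total_for_client_2 = seconds_worked(conn, 2)
total_for_unknown = seconds_worked(conn, 99)
```

Because SUM returns NULL when no rows match, the helper falls back to 0, which plays the same role as the len() check in the ORM version above.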

My question is as follows:

  • The Python sample above seems roundabout. Can we do this more elegantly?
  • More importantly at this stage, can we do it in a way that will work on both Postgres and SQLite? Ideally, I'd like not to be writing raw SQL queries and switching on the database backend that happens to be in place; in general, Django is extremely good at protecting us from this. Does Django have a reasonable abstraction for this operation? If not, what's a sensible way for me to do a conditional switch on the backend?
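If a switch on the backend does turn out to be necessary, one possible shape is a small lookup keyed by backend name. This is only a sketch: the helper elapsed_seconds_sql and the dictionary are hypothetical names, the column names are taken from the table above, and the PostgreSQL fragment assumes the two columns are timestamp columns. In a Django project the backend name is available as django.db.connection.vendor.

```python
# SQL fragments that sum elapsed seconds, keyed by backend name.
# The keys match the values Django exposes as django.db.connection.vendor.
ELAPSED_SECONDS_SQL = {
    "postgresql": "SUM(EXTRACT(EPOCH FROM (end_time - start_time)))",
    "sqlite": "SUM(strftime('%s', end_time) - strftime('%s', start_time))",
}


def elapsed_seconds_sql(vendor):
    """Return the SUM expression for the given backend (hypothetical helper)."""
    try:
        return ELAPSED_SECONDS_SQL[vendor]
    except KeyError:
        raise NotImplementedError("no elapsed-time SQL for backend %r" % vendor)


# In a Django project the vendor would come from the active connection, e.g.:
#   from django.db import connection
#   fragment = elapsed_seconds_sql(connection.vendor)
sqlite_fragment = elapsed_seconds_sql("sqlite")
```

Both fragments yield a number of seconds rather than an interval, so the calling code does not need to branch again when reading the result.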

I should mention for context that the dataset is many thousands of entries; the following is not really practical:

import datetime
sum((task_time.end_time - task_time.start_time for task_time in models.TaskTime.objects.filter(...)), datetime.timedelta())
Karlise answered 8/12, 2011 at 14:58 Comment(0)
A
11

I think since Django 1.8 we can do better:

I will only show the annotation step; the aggregation that follows should be straightforward:

from django.db.models import F, Func
SomeModel.objects.annotate(
    duration=Func(F('end_date'), F('start_date'), function='age')
)

[More about the PostgreSQL age function here: http://www.postgresql.org/docs/8.4/static/functions-datetime.html ]

Each instance of SomeModel will be annotated with a duration field containing the time difference, which in Python will be a datetime.timedelta object [more about timedelta here: https://docs.python.org/2/library/datetime.html#timedelta-objects ]

Adriatic answered 1/3, 2016 at 11:15 Comment(1)
Very awesome. Thank you. - Inventory
B
14

Almost the same solution as @andri proposed; the final result contains the same data. ExpressionWrapper is new in Django 1.8.

from datetime import timedelta
from django.db.models import ExpressionWrapper, F, fields
from app.models import MyModel

duration = ExpressionWrapper(F('closed_at') - F('opened_at'), output_field=fields.DurationField())
objects = MyModel.objects.closed().annotate(duration=duration).filter(duration__gt=timedelta(seconds=2))

for obj in objects:
    print(obj.id, obj.duration, obj.duration.seconds)

# sample output
# 807 0:00:57.114017 57
# 800 0:01:23.879478 83
# 804 3:40:06.797188 13206
# 801 0:02:06.786300 126
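One detail worth illustrating about the loop above: timedelta.seconds is only the seconds component of the duration, so it stops matching the real elapsed time once a duration reaches a full day. The sketch below uses a duration modelled on the third sample row (microseconds dropped) and a made-up variant that is one day longer.

```python
from datetime import timedelta

# A duration like the third sample row, and the same duration plus one day.
short = timedelta(hours=3, minutes=40, seconds=6)
long = timedelta(days=1, hours=3, minutes=40, seconds=6)

# .seconds holds only the seconds component (0 to 86399); days are stored separately.
short_component = short.seconds
long_component = long.seconds

# .total_seconds() covers the whole duration, days included.
short_total = short.total_seconds()
long_total = long.total_seconds()
```

For durations that may exceed 24 hours, total_seconds() is the safer choice; it is also what the question's original code calls on the aggregated value.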
Bystreet answered 3/2, 2017 at 13:4 Comment(1)
I think your answer is not accurate: a time delta calculation does not involve different types, and ExpressionWrapper is only necessary when doing arithmetic on F() expressions of different types. docs.djangoproject.com/en/1.8/ref/models/expressions/… - Adriatic
A
0

I will do it step by step:

  1. Annotate each row with the timedelta.
  2. Group by client and sum the timedeltas.

The code looks like this:

from django.db.models import Count, Sum, F

times_obj_list = models.TaskTime.objects.annotate(times=F("end_time")-F("start_time"))

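# Sum() must reference the annotation by name; despite the "seconds" alias, the result is a timedelta.
groupby_obj_list = times_obj_list.values("client").annotate(cnt=Count("id"), seconds=Sum("times")).order_by()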
Artisan answered 20/8, 2021 at 4:4 Comment(0)
W
-2

Django currently only supports aggregates for Min, Max, Avg and Count, so using raw SQL is the only way to achieve what you want. When you use raw SQL, database-independence is out the window, so unfortunately, you're out of luck. You'll have to just detect the database and alter the SQL appropriately.

Wheelwork answered 8/12, 2011 at 15:28 Comment(1)
No longer a valid response. - Toxicity
