Efficient pagination and database querying in Django
B

3

24

I used some Django pagination code examples a while back. I may be wrong, but looking over the code, it seems to waste a lot of memory. I am looking for a better solution. Here is the code:

# in views.py
from django.core.paginator import Paginator, EmptyPage, PageNotAnInteger

... 
...    

def someView(request):
    models = Model.objects.order_by('-timestamp')
    paginator = Paginator(models, 7)
    pageNumber = request.GET.get('page')

    try:
        models = paginator.page(pageNumber)
    except PageNotAnInteger:
        # missing or non-numeric page: show the first page
        models = paginator.page(1)
    except EmptyPage:
        # page out of range: show the last page
        models = paginator.page(paginator.num_pages)

    return render_to_resp ( ..... models ....)

I'm not sure of the subtleties of this code, but from what I can tell, the first line retrieves every single model from the database and pushes it into memory. The result is then passed to Paginator, which chunks it up based on which page the user requested through the HTTP GET parameter. Is Paginator somehow making this acceptable, or is this totally memory inefficient? If it is inefficient, how can it be improved?

Also, a related topic. If someone does:

   Model.objects.all()[:40]

Does this mean that all models are pushed into memory and we then slice out 40 of them (which is bad)? Or does it mean that we query and load only 40 objects into memory, period?

Thank you for your help!

Blare answered 23/4, 2013 at 5:36
C
33

mymodel.objects.all() yields a queryset, not a list. Querysets are lazy: no query is issued and nothing is done until you actually try to use them. Also, slicing a queryset does not load the whole damn thing into memory only to get a subset; it adds LIMIT and OFFSET to the SQL query before hitting the database.
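
To make the laziness concrete, here is a toy sketch that uses only sqlite3 from the standard library. LazyQuery is a hypothetical stand-in and not Django's QuerySet; the table name and the log attribute are also made up for illustration. It only mimics the two behaviours described above: no SQL runs until the results are consumed, and a slice turns into LIMIT/OFFSET.

```python
import sqlite3


class LazyQuery:
    """Toy stand-in for a queryset (hypothetical, NOT Django's implementation)."""

    def __init__(self, conn, table, limit=None, offset=0, log=None):
        self.conn = conn
        self.table = table
        self.limit = limit
        self.offset = offset
        # Shared list recording every SQL statement that was actually executed.
        self.log = log if log is not None else []

    def __getitem__(self, item):
        # Slicing returns a new lazy object; nothing is executed here.
        if not isinstance(item, slice) or item.step is not None:
            raise TypeError("this sketch only supports plain [start:stop] slices")
        if self.limit is not None or self.offset:
            raise TypeError("re-slicing is not supported in this sketch")
        start = item.start or 0
        if start < 0 or (item.stop is not None and item.stop < start):
            raise ValueError("negative or reversed slices are not supported")
        limit = None if item.stop is None else item.stop - start
        return LazyQuery(self.conn, self.table, limit, start, self.log)

    def sql(self):
        query = f"SELECT id FROM {self.table} ORDER BY id"
        if self.limit is not None or self.offset:
            # SQLite treats LIMIT -1 as "no upper bound".
            limit = -1 if self.limit is None else self.limit
            query += f" LIMIT {limit} OFFSET {self.offset}"
        return query

    def __iter__(self):
        # The database is only hit when the results are consumed.
        query = self.sql()
        self.log.append(query)
        return iter(self.conn.execute(query).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO model (id) VALUES (?)", [(i,) for i in range(1, 1001)])

everything = LazyQuery(conn, "model")   # comparable to Model.objects.all(): no SQL yet
first_40 = everything[:40]              # still no SQL, the limit is only recorded
queries_before = len(everything.log)    # number of statements run so far
rows = list(first_40)                   # the single query runs here
print(queries_before, len(rows), everything.log)
```

In a real Django project you can check the same thing by printing str(queryset.query) or by inspecting django.db.connection.queries with DEBUG enabled.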

Crosshatch answered 23/4, 2013 at 7:11
S
2

There is nothing memory inefficient about using Paginator. Querysets are evaluated lazily. In your call Paginator(models, 7), models is a queryset that has not been evaluated at this point. So the database has not been hit yet, and no list containing all the instances of the model is in memory.

When you request a page, i.e. at paginatedPage = paginator.page(pageNumber), the queryset is sliced. The slice is translated into LIMIT and OFFSET, so when the database is hit it returns only the rows that belong on that page. Only those objects end up in a list in memory. Say you want to show 10 objects per page: only these 10 objects will be in memory.
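
What the page lookup boils down to at the SQL level can be sketched with sqlite3 from the standard library. fetch_page is a hypothetical helper and not Django's Paginator; it assumes 1-based page numbers and a made-up model table, and it only roughly mirrors the fallback to the first or last page used in the question.

```python
import math
import sqlite3


def fetch_page(conn, page, per_page):
    """Return (rows, num_pages) for a 1-based page number.

    Hypothetical helper for illustration; it is not Django's Paginator.
    """
    # One COUNT query to learn how many pages exist.
    total = conn.execute("SELECT COUNT(*) FROM model").fetchone()[0]
    num_pages = max(1, math.ceil(total / per_page))

    # Bad or missing input falls back to the first page.
    try:
        page = int(page)
    except (TypeError, ValueError):
        page = 1
    # Simplification: values below 1 go to page 1, values too high to the last page.
    page = min(max(page, 1), num_pages)

    # Only the rows of the requested page are read from the database.
    offset = (page - 1) * per_page
    rows = conn.execute(
        "SELECT id, timestamp FROM model ORDER BY timestamp DESC LIMIT ? OFFSET ?",
        (per_page, offset),
    ).fetchall()
    return rows, num_pages


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model (id INTEGER PRIMARY KEY, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO model (id, timestamp) VALUES (?, ?)",
    [(i, i) for i in range(1, 24)],  # 23 rows; the newest has the largest timestamp
)

page_rows, num_pages = fetch_page(conn, "2", per_page=7)
print(num_pages, [row[0] for row in page_rows])
```

The point of the sketch is that memory use depends on per_page, not on the size of the table: the full table is counted by the database but never loaded.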

When someone does:

Model.objects.all()[:40]

Slicing an unevaluated queryset is not the same as slicing a list: it returns a new queryset with the limit applied. When that queryset is evaluated, only 40 rows are fetched, so a list with only 40 elements is created and stored in memory. No list containing all the instances of Model ever exists in memory.

Sensualist answered 23/4, 2013 at 5:58 Comment(5)
Yeah but .. models is all of the model objects in the entire database, its loaded right there in the first line. Are u saying paginator like kills it or something? For ur second answer, yes a new list is created but the old one is kinda still there right? Or does it just poof disappear? - Blare
For the second answer, since reference count for old list will be 0, it will be garbage collected when garbage collection occurs as there is no way to reach the old list. - Sensualist
For first answer, yes, models is all the model objects in the database and it will be in memory for the entire period of your view. And you want it otherwise how is django supposed to know what objects to show on a particular page? Django can only do it if you provide it all the objects and then tell it which page you want and then django will do the slicing and return you a new list. - Sensualist
@akshar raaj please read the manual and learn how querysets work. You are plain wrong on all points. - Crosshatch
@brunodesthuilliers Can you check it now please? - Sensualist
C
0

Using the above information, I came up with a view function decorator. json_list_objects (not shown here) converts Django objects into JSON-ready Python dicts covering the known relationship fields of the Django objects, and returns the jsonified list as {count: results: }.

Others may find it useful.

from functools import wraps

def with_paging(fn):
  """
  Decorator providing paging behavior.  It is for decorating a function that 
  takes a request and other arguments and returns the appropriate query
  doing select and filter operations.  The decorator adds paging by examining
  the QueryParams of the request for page_size (default 2000) and 
  page_num (default 0).  The query supplied is used to return the appropriate
  slice. 
  """
  @wraps(fn)
  def inner(request, *args, **kwargs):
    page_size = int(request.GET.get('page_size', 2000))
    page_num = int(request.GET.get('page_num', 0))
    query = fn(request, *args, **kwargs)
    start = page_num * page_size
    end = start + page_size
    # slicing the queryset adds LIMIT/OFFSET, so only this page is fetched
    data = query[start:end]
    # count() runs a separate SELECT COUNT(*) on the unsliced query
    total_size = query.count()
    return json_list_objects(data, overall_count=total_size)
  return inner
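
One possible hardening, offered as an assumption rather than as part of the decorator above: int() raises ValueError on a non-numeric page_size or page_num, and a negative value would produce a negative slice index, which Django querysets reject. The hypothetical helper parse_paging below clamps both values; the name, the max_page_size cap of 2000, and the fallback rules are my own illustrative choices.

```python
def parse_paging(params, default_size=2000, max_page_size=2000):
    """Turn query parameters into a (start, end) slice for a queryset.

    `params` is any mapping with .get(), such as request.GET.
    Hypothetical helper; names and limits are illustrative assumptions.
    """
    def to_int(value, fallback):
        try:
            return int(value)
        except (TypeError, ValueError):
            return fallback

    page_size = to_int(params.get("page_size"), default_size)
    page_num = to_int(params.get("page_num"), 0)

    # Keep the page size within 1..max_page_size and the page number at 0 or more.
    page_size = min(max(page_size, 1), max_page_size)
    page_num = max(page_num, 0)

    start = page_num * page_size
    return start, start + page_size


print(parse_paging({"page_size": "50", "page_num": "3"}))    # (150, 200)
print(parse_paging({"page_size": "abc", "page_num": "-1"}))  # (0, 2000)
```

Inside the decorator, the two int(...) lines and the start/end arithmetic could then be replaced by start, end = parse_paging(request.GET).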
Choli answered 21/6, 2017 at 22:44
