Optimizing database queries in Django REST framework
Asked Answered
T

4

23

I have the following models:

class User(models.Model):
    name = models.Charfield()
    email = models.EmailField()

class Friendship(models.Model):
    from_friend = models.ForeignKey(User)
    to_friend = models.ForeignKey(User)

And those models are used in the following view and serializer:

class GetAllUsers(generics.ListAPIView):
    authentication_classes = (SessionAuthentication, TokenAuthentication)
    permission_classes = (permissions.IsAuthenticated,)
    serializer_class = GetAllUsersSerializer
    model = User

    def get_queryset(self):
        return User.objects.all()

class GetAllUsersSerializer(serializers.ModelSerializer):

    is_friend_already = serializers.SerializerMethodField('get_is_friend_already')

    class Meta:
        model = User
        fields = ('id', 'name', 'email', 'is_friend_already',)

    def get_is_friend_already(self, obj):
        request = self.context.get('request', None)

        if request.user != obj and Friendship.objects.filter(from_friend = user):
            return True
        else:
            return False

So basically, for each user returned by the GetAllUsers view, I want to print out whether the user is a friend with the requester (actually I should check both from_ and to_friend, but does not matter for the question in point)

What I see is that for N users in database, there is 1 query for getting all N users, and then 1xN queries in the serializer's get_is_friend_already

Is there a way to avoid this in the rest-framework way? Maybe something like passing a select_related included query to the serializer that has the relevant Friendship rows?

Tactic answered 27/10, 2014 at 17:30 Comment(0)
B
34

Django REST Framework cannot automatically optimize queries for you, in the same way that Django itself won't. There are places you can look at for tips, including the Django documentation. It has been mentioned that Django REST Framework should automatically, though there are some challenges associated with that.

This question is very specific to your case, where you are using a custom SerializerMethodField that makes a request for each object that is returned. Because you are making a new request (using the Friends.objects manager), it is very difficult to optimize the query.

You can make the problem better though, by not creating a new queryset and instead getting the friend count from other places. This will require a backwards relation to be created on the Friendship model, most likely through the related_name parameter on the field, so you can prefetch all of the Friendship objects. But this is only useful if you need the full objects, and not just a count of the objects.

This would result in a view and serializer similar to the following:

class Friendship(models.Model):
    from_friend = models.ForeignKey(User, related_name="friends")
    to_friend = models.ForeignKey(User)

class GetAllUsers(generics.ListAPIView):
    ...

    def get_queryset(self):
        return User.objects.all().prefetch_related("friends")

class GetAllUsersSerializer(serializers.ModelSerializer):
    ...

    def get_is_friend_already(self, obj):
        request = self.context.get('request', None)

        friends = set(friend.from_friend_id for friend in obj.friends)

        if request.user != obj and request.user.id in friends:
            return True
        else:
            return False

If you just need a count of the objects (similar to using queryset.count() or queryset.exists()), you can include annotate the rows in the queryset with the counts of reverse relationships. This would be done in your get_queryset method, by adding .annotate(friends_count=Count("friends")) to the end (if the related_name was friends), which will set the friends_count attribute on each object to the number of friends.

This would result in a view and serializer similar to the following:

class Friendship(models.Model):
    from_friend = models.ForeignKey(User, related_name="friends")
    to_friend = models.ForeignKey(User)

class GetAllUsers(generics.ListAPIView):
    ...

    def get_queryset(self):
        from django.db.models import Count

        return User.objects.all().annotate(friends_count=Count("friends"))

class GetAllUsersSerializer(serializers.ModelSerializer):
    ...

    def get_is_friend_already(self, obj):
        request = self.context.get('request', None)

        if request.user != obj and obj.friends_count > 0:
            return True
        else:
            return False

Both of these solutions will avoid N+1 queries, but the one you pick depends on what you are trying to achieve.

Bedwell answered 28/10, 2014 at 0:5 Comment(3)
Great answer Kevin. Thanks a lot. The only little ammend is that instead of for friend in obj.friends, I needed to call : for friend in obj.friends.all() .. the corresponding thread is here : #6315341Tactic
The first approach with "prefetch_related" would be cumbersome if the user had thousands of friends. In that case would be better to just make n queries for each userEvangelista
Kevin, i do the prefetch_related method on my views but when i call .all() on the object inside the serializer, it still makes calls to the DB. based on my sql logs.Lauter
U
17

Described N+1 problem is a number one issue during Django REST Framework performance optimization, so from various opinions, it requires more solid approach then direct prefetch_related() or select_related() in get_queryset() view method.

Based on collected information, here's a robust solution that eliminates N+1 (using OP's code as an example). It's based on decorators and slightly less coupled for larger applications.

Serializer:

class GetAllUsersSerializer(serializers.ModelSerializer):
    friends = FriendSerializer(read_only=True, many=True)

    # ...

    @staticmethod
    def setup_eager_loading(queryset):
        queryset = queryset.prefetch_related("friends")

        return queryset

Here we use static class method to build the specific queryset.

Decorator:

def setup_eager_loading(get_queryset):
    def decorator(self):
        queryset = get_queryset(self)
        queryset = self.get_serializer_class().setup_eager_loading(queryset)
        return queryset

    return decorator

This function modifies returned queryset in order to fetch related records for a model as defined in setup_eager_loading serializer method.

View:

class GetAllUsers(generics.ListAPIView):
    serializer_class = GetAllUsersSerializer

    @setup_eager_loading
    def get_queryset(self):
        return User.objects.all()

This pattern may look like an overkill, but it's certainly more DRY and has advantage over direct queryset modification inside views, as it allows more control over related entities and eliminates unnecessary nesting of related objects.

Unscramble answered 22/12, 2016 at 16:15 Comment(1)
does this method also work for POST results? I've gotten setup_eager_loading to work for GETs, but when a client POSTs, and the resulting instance is returned as the response to the POST, none of the prefetch_related clauses seem to be applied.Taryn
D
1

Using this metaclass DRF optimize ModelViewSet MetaClass

from django.utils import six

@six.add_metaclass(OptimizeRelatedModelViewSetMetaclass)
class MyModelViewSet(viewsets.ModelViewSet):
    queryset = MyModel.objects.all()
    serializer_class = MyModelSerializer
Data answered 26/2, 2018 at 20:42 Comment(1)
It throws a TypeError: metaclass conflict exception in django-cookiter drf 3.13.1Twelvetone
P
0

You can split the view into two query.
First, only get the Users list (without is_friend_already field). This only require one query.
Second, get the friends list of request.user.
Third, modify the results depending on if the user is in the request.user's friend list.

class GetAllUsersSerializer(serializers.ModelSerializer):
    ... 


class UserListView(ListView):
    def get(self, request):
        friends = request.user.friends
        data = []
        for user in self.get_queryset():
            user_data = GetAllUsersSerializer(user).data
            if user in friends:
                user_data['is_friend_already'] = True
            else:
                user_data['is_friend_already'] = False
            data.append(user_data)
        return Response(status=200, data=data)
Pember answered 6/4, 2017 at 9:53 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.