Django: ManyToMany filter matching on ALL items in a list
Asked Answered
P

4

22

I have such a Book model:

class Book(models.Model):
    authors = models.ManyToManyField(Author, ...)
    ...

In short:

I'd like to retrieve the books whose authors are strictly equal to a given set of authors. I'm not sure if there is a single query that does it, but any suggestions will be helpful.

In long:

Here is what I tried, (that failed to run getting an AttributeError)

# A sample set of authors
target_authors = set((author_1, author_2))

# To reduce the search space, 
# first retrieve those books with just 2 authors.
candidate_books = Book.objects.annotate(c=Count('authors')).filter(c=len(target_authors))

final_books = QuerySet()
for author in target_authors:
    temp_books = candidate_books.filter(authors__in=[author])
    final_books = final_books and temp_books

... and here is what I got:

AttributeError: 'NoneType' object has no attribute '_meta'

In general, how should I query a model with the constraint that its ManyToMany field contains a set of given objects as in my case?

ps: I found some relevant SO questions but couldn't get a clear answer. Any good pointer will be helpful as well. Thanks.

Preselector answered 7/11, 2012 at 13:12 Comment(1)
Almost there. See the answer in this question: #8618568Broadcast
P
18

Similar to @goliney's approach, I found a solution. However, I think the efficiency could be improved.

# A sample set of authors
target_authors = set((author_1, author_2))

# To reduce the search space, first retrieve those books with just 2 authors.
candidate_books = Book.objects.annotate(c=Count('authors')).filter(c=len(target_authors))

# In each iteration, we filter out those books which don't contain one of the 
# required authors - the instance on the iteration.
for author in target_authors:
    candidate_books = candidate_books.filter(authors=author)

final_books = candidate_books
Preselector answered 7/11, 2012 at 17:34 Comment(2)
Your solutions do the same. Filter kwargs are "AND"edHarbot
You're right, they look the same. However, I think there is an execution difference. As far as I understood, in your approach, the author in the authors field which matches author_1 is also expected to match with author_2. On the other hand, iterative filtering doesn't enforce such a constraint. If I'm wrong, please correct me. I'm here to learn. Thanks again!Preselector
H
5

You can use complex lookups with Q objects

from django.db.models import Q
...
target_authors = set((author_1, author_2))
q = Q()
for author in target_authors:
    q &= Q(authors=author)
Books.objects.annotate(c=Count('authors')).filter(c=len(target_authors)).filter(q)
Harbot answered 7/11, 2012 at 13:27 Comment(3)
Thanks @goliney, although the approach is neat and inspiring, I guess it's not doing what I'm looking for. If there is just one author, it works fine, but when there are multiple authors, the AND process is probably leading to an impossible constraint like (where a=x AND a=y).Preselector
As it turned out, there is something unclear in Q() behaviour for me. According to this relative question I find on SO, Q() & Q() is not equal to .filter().filter(). Thank you for your questionHarbot
Thanks for the link and sharing your time.Preselector
M
2

Q() & Q() is not equal to .filter().filter(). Their raw SQLs are different where by using Q with &, its SQL just add a condition like WHERE "book"."author" = "author_1" and "book"."author" = "author_2". it should return empty result.

The only solution is just by chaining filter to form a SQL with inner join on same table: ... ON ("author"."id" = "author_book"."author_id") INNER JOIN "author_book" T4 ON ("author"."id" = T4."author_id") WHERE ("author_book"."author_id" = "author_1" AND T4."author_id" = "author_1")

Minna answered 8/1, 2019 at 21:51 Comment(0)
S
0

I came across the same problem and came to the same conclusion as iuysal, untill i had to do a medium sized search (with 1000 records with 150 filters my request would time out).

In my particular case the search would result in no records since the chance that a single record will align with ALL 150 filters is very rare, you can get around the performance issues by verifying that there are records in the QuerySet before applying more filters to save time.

# In each iteration, we filter out those books which don't contain one of the 
# required authors - the instance on the iteration.
for author in target_authors:
   if candidate_books.count() > 0:
      candidate_books = candidate_books.filter(authors=author)

For some reason Django applies filters to empty QuerySets. But if optimization is to be applied correctly however, using a prepared QuerySet and correctly applied indexes are necessary.

Seiden answered 8/12, 2016 at 8:41 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.