How to check if a pymongo cursor has query results

Asked 24/10, 2014 at 14:18 Answered 12/1 at 11:20

Solved python mongodb pymongo mongodb-query

I need to check if a find statement returns a non-empty query.

What I was doing was the following:

query = collection.find({"string": field})
if not query: #do something

Then I realized that my if statement was never executed because find returns a cursor, either the query is empty or not.

Therefore I checked the documentation and I find two methods that can help me:

count(with_limit_and_skip=False) which (from the description):

Returns the number of documents in the results set for this query.

It seems a good way to check, but this means that I need to count all the results in cursor to know if it is zero or not, right? A little bit expensive?
retrieved which (from the description):

The number of documents retrieved so far.

I tested it on an empty query set and it returns zero, but it's not clear what it does and I don't know if it's right for me.

So, which is the best way (best practice) to check if a find() query returns an empty set or not? Is one of the methods described above right for this purpose? And what about performance? Are there other ways to do it?

Just to be clear: I need to know if the query is empty and I'd like to find the best way with the cursor with respect to performance and being pythonic.

Longfaced answered 24/10, 2014 at 14:18 Comment(7)

That depends on what you wanna do with it – Convulsant 24/10, 2014 at 15:24

@VincentBeltman I just need to know if the query returns something using that cursor. – Longfaced 24/10, 2014 at 15:30

You're already performing the query in your collection.find() method anyway. The "count" method is just returning the number of documents that the find method returned to the cursor. You would just do "if query.count > 0:" – Churchyard 24/10, 2014 at 16:19

@Churchyard I understood that, but how about performance? Does count need to rescan all results found to give me a number? – Longfaced 24/10, 2014 at 16:23

@boh the cursor is a pointer to the result set of the query. So a rescan would be unnecessary. – Churchyard 24/10, 2014 at 16:47

@Churchyard So the cursor "knows" how many objects have been found by the query? – Longfaced 24/10, 2014 at 21:45

If your further workflow supposes using of a list, you can extract cursor data with: lst = list(query) and then check if not lst: #do something – Valdez 13/10, 2016 at 4:54

EDIT: While this was true in 2014, modern versions of pymongo and MongoDB have changed this behaviour. Buyer beware:

.count() is the correct way to find the number of results that are returned in the query. The count() method does not exhaust the iterator for your cursor, so you can safely do a .count() check before iterating over the items in the result set.

Performance of the count method was greatly improved in MongoDB 2.4. The only thing that could slow down your count is if the query has an index set on it, or not. To find out if you have an index on the query, you can do something like

query = collection.find({"string": field})
print query.explain()

If you see BasicCursor in the result, you need an index on your string field for this query.

EDIT: as @alvapan pointed out, pymongo deprecated this method in pymongo 3.7+ and now prefers you to use count_documents in a separate query.

item_count = collection.count_documents({"string": field})

The right way to count the number of items you've returned on a query is to check the .retreived counter on the query after you iterate over it, or to enumeratethe query in the first place:

# Using .retrieved
query = collection.find({"string": field})
for item in query:
    print(item)

print('Located {0:,} item(s)'.format(query.retrieved))

Or, another way:

# Using the built-in enumerate
query = collection.find({"string": field})
for index, item in enumerate(query):
    print(item)

print('Located {0:,} item(s)'.format(index+1))

Dactylic answered 25/10, 2014 at 1:6 Comment(3)

Thanks for bringing this up mate. .count() is the perfect function to check non-empty query. I was having similar problem and was resolved with this function. – Dolomites 15/2, 2018 at 8:0

@VooDooNOFX, PyMongo 3.7 has deprecated cursor.count(), and ask us to use collection.count_documents() instead. Then I don't know what will be the new corrrect way to find the number of results that are returned in the cursor. Do we need to collection.count_documents(query_string) first, check the number, then collection.find(query_string) to get the real results? But the new way requires us to do query twice, while the old way does only once, which seems better. – Receiver 8/7, 2018 at 2:3

@Receiver edited the response, Thanks. Personally, I prefer to just perform the query, then count the documents afterwards. Not many scalable solutions should be using a count method to know ahead of time how many documents they're querying, except maybe a total document count in the collection use-case. – Dactylic 9/7, 2018 at 5:5

How about just using find_one instead of find ? Then you can just check whether you got a result or None. And if "string" is indexed, you can pass fields = {"string":1, "_id" :0}, and thus make it an index-only query, which is even faster.

Amour answered 4/12, 2014 at 8:9 Comment(6)

The field string is indexed, but I can't use findOne because it is not guaranteed that there is only one result (actually it's unlikely that there's one). – Longfaced 4/12, 2014 at 10:11

In my understanding, all you want to know is whether you have any results for the query or not, is that correct ? If that is the case, find_one is enough for your case: if it returns a value, it means that find() would have found at least one result, and if it returns None it means that find() would return 0 results. – Amour 4/12, 2014 at 11:43

Yes, but if there are no results, I'll do something, else I'll do something with results, tipycally showing all of them. Do you say that is more efficient doing a query with findOne on an indexed field and if it's not empty perform again the query with find? – Longfaced 4/12, 2014 at 11:48

Oh, I see, I didn't understand from your question that you need the results. The answer really depends of a few factors, like whether you expect most of your queries to return results or not (if not - find_one can work well), and whether your query uses an index (and then count() is fast). – Amour 4/12, 2014 at 12:16

Another option: just iterate over the results and do your thing. If there are no results, it will be an empty loop, and you can detect it using a flag in your code (set the flag to True inside the loop). – Amour 4/12, 2014 at 12:17

seems line find is a lot faster than find_one. see: blog.serverdensity.com/… – Fingerling 19/10, 2017 at 8:12

Another solution is converting cursor to list, if the cursor doesn't have any data then empty list else list contains all data.

 doc_list = collection.find({}); #find all data
 have_list = True if len(list(doc_list)) else False;

Rambouillet answered 31/5, 2018 at 10:39 Comment(2)

I prefer this also, but disadvantage of this way is that list() function will exhaust cursor fully. Better to use it, when you are sure that your cursor contains relatively small amount of documents. – Complect 21/12, 2018 at 12:13

list does exhaust, consider doc_list = list(collection.find({})); where no results returns an empty list [] and works in Python with if not doc_list: and len(doc_list) – Valuation 4/4, 2019 at 21:35

From my tests, the quickest way is

if query.first():
    # do something

In [51]: %timeit query = MyMongoDoc.objects(); query.first()
100 loops, best of 3: 2.12 ms per loop

In [52]: %timeit query = MyMongoDoc.objects(); query.count()
100 loops, best of 3: 4.28 ms per loop

(Using MongoDB 2.6.7, 2015-03-26)

Beau answered 26/3, 2015 at 9:44 Comment(0)

I end up using simple counter, since i don't want calling server twice for no reason:

cursor = someCollection.find( query )

ct = 0

for doc in cursor:

    ct += 1

    # some code

if ct == 0:
    
    # cursor was empty

Aldosterone answered 25/7, 2023 at 17:20 Comment(1)

There is no need for this. cursor.retrieved gives you the same information. – Identical 12/1 at 11:13

Use Cursor.retrieved. However, as the name suggests this only works after you have iterated the cursor, not before: if you call cursor.retrieved before iterating the cursor, it will return 0 even if the query has matching documents.

So retrieved needs to be used according to the following general pattern:

cursor = collection.find(…)

for doc in cursor:
    process(doc)

if cursor.retrieved == 0:
    handle("no matching documents found")

Alternative suggestions in other answers have one of three disadvantages:

Either they convert the cursor into a list, causing the entire result set to be copied into memory.
Or they peek at the first result document, which means having to duplicate the processing logic.
Or they perform a redundant query to the database via count_documents or similar.

Identical answered 12/1 at 11:20 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags