How can I check MongoDB query performance without cache

I have a query that takes too long to finish. I'd like to run some performance tests, but after the first run (which currently takes ~30 seconds) the query starts running a lot faster (< 1 second). I assume this has to do with MongoDB caching. Is there a way to disable caching in MongoDB, or another way I can check the performance?

I'm using MongoDB hosted on MongoHQ, from a Ruby on Rails 3 application. Here is the explain output:

{"cursor"=>"BtreeCursor start_-1", "nscanned"=>5276, "nscannedObjects"=>5276, "n"=>25, "millis"=>3264, "nYields"=>0, "nChunkSkips"=>0, "isMultiKey"=>false, "indexOnly"=>false, "indexBounds"=>{"start"=>[[{"$maxElement"=>1}, {"$minElement"=>1}]]}, "allPlans"=>[{"cursor"=>"BtreeCursor attendees_count_-1", "indexBounds"=>{"attendees_count"=>[[1.7976931348623157e+308, 1]]}}, {"cursor"=>"BtreeCursor images_count_-1", "indexBounds"=>{"images_count"=>[[1.7976931348623157e+308, 2]]}}, {"cursor"=>"BtreeCursor start_-1", "indexBounds"=>{"start"=>[[{"$maxElement"=>1}, {"$minElement"=>1}]]}}, {"cursor"=>"BtreeCursor start_-1_end_-1", "indexBounds"=>{"start"=>[[{"$maxElement"=>1}, {"$minElement"=>1}]], "end"=>[[{"$maxElement"=>1}, {"$minElement"=>1}]]}}, {"cursor"=>"BtreeCursor attendees._id_1 multi", "indexBounds"=>{"attendees._id"=>[[BSON::ObjectId('4f0b621e94bb688563000007'),BSON::ObjectId('4f0b621e94bb688563000007')], [BSON::ObjectId('4f0b647d5a8c00acde05236f'), BSON::ObjectId('4f0b647d5a8c00acde05236f')], [BSON::ObjectId('4f0b647d5a8c00acde052370'), BSON::ObjectId('4f0b647d5a8c00acde052370')], [BSON::ObjectId('4f0b647d5a8c00acde052371'), BSON::ObjectId('4f0b647d5a8c00acde052371')], [BSON::ObjectId('4f0b647d5a8c00acde052372'), BSON::ObjectId('4f0b647d5a8c00acde052372')], [BSON::ObjectId('4f0b647d5a8c00acde052373') ... (lots of Object IDs)]]}}}

Notice that I had to truncate part of the explain output because it was too long. Where it says "(lots of Object IDs)" there were lots of Object IDs (~400).

Thanks all

Janejanean answered 20/2, 2012 at 14:35 Comment(0)

I guess you cannot stop Mongo from preparing the query plan (that's how Mongo works). So, before gathering query stats:

  1. Clear the plan cache of the collection:

>db.collection_name.getPlanCache().clear()

  2. Record the slow queries with the profiler (see the MongoDB docs),
  3. work on the query to optimize it,
  4. clear the plan cache again, and check the query performance again.
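In a Rails app these steps can be scripted through the Ruby driver, since `getPlanCache().clear()` and the profiler are both plain database commands (`planCacheClear` and `profile`). A minimal sketch, assuming a `Mongo::Client` from the mongo gem and a collection named `events` (both hypothetical, adjust to your app):

```ruby
# Sketch only: `client` is assumed to be a Mongo::Client from the mongo gem,
# and "events" is a hypothetical collection name.
def reset_for_benchmark(client, collection: 'events', slowms: 100)
  # Step 1: drop all cached query plans for the collection.
  client.database.command(planCacheClear: collection)
  # Step 2: enable the profiler for operations slower than `slowms` ms.
  client.database.command(profile: 1, slowms: slowms)
end
```

After optimizing the query, call the same method again so the next timing run starts from a cleared plan cache.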
Barker answered 28/8, 2018 at 9:7 Comment(0)

When you first run the query, the data set is memory-mapped but has not yet been paged into physical memory; see Caching on the MongoDB site. So the OS has to page that data set into memory before the query can run and return a result.

Because the initial run pages data in from disk (slow) into RAM (fast), it is slow; after that, unless you are under memory pressure, the data stays in RAM and all of your subsequent queries on that data set will be fast.

This is how MongoDB is designed to function; the process of loading your data set into memory is often called "warming up" the database, and it is only after that warm-up (in your case, the first query) that you see the true performance.
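The warm-up effect can be reproduced without MongoDB at all: the first read of a data set pays the paging cost, while repeats are usually served from RAM by the OS page cache. A small Ruby sketch (a ~10 MB temp file stands in for the data set; the measured timings are illustrative only, since the OS may already have the freshly written file cached):

```ruby
require 'benchmark'
require 'tempfile'

# A ~10 MB temp file stands in for a memory-mapped data set.
file = Tempfile.new('dataset')
file.write('x' * 10_000_000)
file.flush

# "Cold" run: may have to pull the data in from disk.
cold = Benchmark.realtime { File.read(file.path) }

# "Warm" run: the OS page cache usually serves this from RAM.
warm = Benchmark.realtime { File.read(file.path) }

puts format('cold: %.4fs, warm: %.4fs', cold, warm)
file.close!
```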

It is worth noting that your initial query still seems to take a very long time to return, so you should make sure it is using indexes effectively. The best place to start that investigation is the explain() page.
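One quick sanity check on explain output is the ratio of documents scanned to documents returned. A sketch using the figures from the question's explain (legacy explain field names):

```ruby
# Key metrics copied from the question's explain output (legacy format).
explain = { 'nscanned' => 5276, 'n' => 25, 'millis' => 3264 }

# Documents scanned per document returned: close to 1.0 means the index
# fits the query well; large values mean a lot of wasted scanning.
ratio = explain['nscanned'].to_f / explain['n']

puts format('scanned %d docs to return %d (%.0fx overhead) in %d ms',
            explain['nscanned'], explain['n'], ratio, explain['millis'])
# → scanned 5276 docs to return 25 (211x overhead) in 3264 ms
```

A 211x scan-to-return overhead is a strong hint that none of the candidate indexes matches the query's selectivity well.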

Crampon answered 20/2, 2012 at 15:8 Comment(7)
I still want my initial query to run faster, because I can't load all documents into memory. I already used explain and saw that it scans ~5000 documents to satisfy the query, and I think 30 seconds is too long for 5000 documents. That's why I want to re-run the same query without the cache, so I can test what is wrong.Janejanean
Did the query use an index? Feel free to add the explain to the question and I can take a look at the output.Crampon
Added in the original message.Janejanean
It is using the single key index "start_-1" - i.e. a reverse ordered index on the start field for the query. Does that sound optimal to you? I suspect not, though I can't really say without knowing more about the data and the query. The brute force method would be to try hinting and using the other indexes to determine which one is actually the fastest (MongoDB will cache the current value for a while before retrying the query plans).Crampon
Sometimes it uses this index and sometimes it uses an index on the field attendees._id (an ObjectId field in an array of hashes), so I guess it optimizes. I checked and saw that if I hint it to always use the attendees._id index, it scans more documents (~9000). I thought about another angle: for each document it scans, it checks whether attendees._id is in a list of 400 IDs, sometimes even more than 1000. Is there a more optimal way to do that?Janejanean
This all depends on the data and the nature of the query - cardinality, locality, and appropriate indexes all play a part - there is no one-size-fits-all method of query/index optimization, otherwise DBAs would be out of work pretty quickly :) By the way, the explain indicated that this was executing in 3.2 seconds rather than 30 - still a long time, but not as bad as 30+ seconds would imply. In the end, if you are going to have to do a lot of disk reads, you may need faster disks and more shards to scale this up.Crampon
It was executing in 3.2 seconds because of the cache. That's why I asked if there is a way to run it without a cache.Janejanean

You can use db.collection.getPlanCache().clear() to remove all cached query plans for a collection.

https://docs.mongodb.com/v3.2/reference/method/PlanCache.clear/

Mcmann answered 19/12, 2017 at 16:17 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.