Alternatives to MongoDB cursor.toArray() in node.js

I am currently using MongoDB cursor's toArray() function to convert the database results into an array:

var run = true;
var count = 0;
var start = process.hrtime();

db.collection.find({}, {limit: 2000}).toArray(function(err, docs){
  var diff = process.hrtime(start);
  run = false;
  if (err) return console.log(err); // check the error before using docs
  socket.emit('result', {
    result: docs,
    time: diff[0] * 1000 + diff[1] / 1000000, // hrtime difference converted to milliseconds
    ticks: count
  });
});

This operation takes about 7 ms on my computer. If I remove the .toArray() call, the operation takes about 0.15 ms. Of course that won't work, because I need to forward the data, but I'm wondering what the function is doing that takes so long. Each document in the database simply consists of 4 numbers.

In the end I'm hoping to run this on a much smaller processor, like a Raspberry Pi, and there the operation of fetching 500 documents from the database and converting them to an array takes about 230 ms. That seems like a lot to me. Or am I just expecting too much?

Are there any alternative ways to get data from the database without using toArray()?

Another thing I noticed is that the entire Node application slows down remarkably while getting the database results. I created a simple interval function that should increment the count value every 1 ms:

setInterval(function(){
  if (run) count++; // should tick roughly once per millisecond while the query runs
}, 1);

I would then expect the count value to be almost the same as the measured time, but for a time of 16 ms on my computer the count value was only 3 or 4. On the Raspberry Pi the count value was never incremented. What is using so much CPU? The system monitor showed my computer at 27% CPU, and the Raspberry Pi at 92% CPU and 11% RAM, when asked to run the database query repeatedly.

I know that was a lot of questions. Any help or explanations are much appreciated. I'm still new to Node and MongoDB.

Coacher answered 16/4, 2015 at 12:43

db.collection.find() returns a cursor, not results, and opening a cursor is pretty fast.

Once you start reading the cursor (with .toArray(), or by traversing it using .each() or .next()), the actual documents are transferred from the database to your client. That transfer is what takes up most of the time.
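
To make that split visible, here is a small sketch (using the same db, collection, and callback-style driver API as in the question) that times opening the cursor separately from reading it:

var start = process.hrtime();

// Opening the cursor does not fetch any documents, so this returns almost instantly.
var cursor = db.collection.find({}, {limit: 2000});
console.log('cursor opened after', process.hrtime(start));

// Reading the cursor is where the documents actually travel over the wire.
cursor.toArray(function(err, docs){
  if (err) return console.log(err);
  console.log('documents read after', process.hrtime(start));
});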

I doubt that using .each()/.next() (instead of .toArray(), which under the hood uses one of those two) will improve the performance much, but you could always try (who knows). Since .toArray() reads everything into memory, it may be worthwhile, although it doesn't sound like your data set is that large.
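
For example, a sketch of the .each() variant (assuming the callback-style cursor API of that driver generation, where the callback receives a null document once the cursor is exhausted):

var docs = [];
var start = process.hrtime();

db.collection.find({}, {limit: 2000}).each(function(err, doc){
  if (err) return console.log(err);
  if (doc !== null) {
    // One document per callback invocation.
    docs.push(doc);
  } else {
    // A null document signals that the cursor is exhausted.
    var diff = process.hrtime(start);
    socket.emit('result', {
      result: docs,
      time: diff[0] * 1000 + diff[1] / 1000000
    });
  }
});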

I really think that MongoDB on a Raspberry Pi (especially a Model 1) is not going to work well. If you don't depend too much on MongoDB's query features, you should consider using an alternative data store, perhaps even in-memory storage (500 documents of 4 numbers each doesn't sound like it requires much RAM).
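
As a rough illustration of the in-memory idea, a minimal sketch (a hypothetical plain-JavaScript store, not a drop-in MongoDB replacement):

// Hypothetical in-memory store: just an array of records.
var store = [];

function insert(record) {
  store.push(record); // e.g. { a: 1, b: 2, c: 3, d: 4 }
}

function findAll(limit) {
  // Synchronous: no network round trip and no BSON parsing.
  return store.slice(0, limit);
}

// Forwarding 500 records then costs little more than an array copy.
socket.emit('result', { result: findAll(500) });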

Ebb answered 16/4, 2015 at 12:55 (2 comments)
Thanks for your quick answer. It made a lot of things clear to me. I tried using .each(), but it was actually a little slower. Is it expected that the Node application slows down while data is being transferred from the database? – Coacher
@Coacher For small document sizes, even large-ish data sets shouldn't significantly slow down your app. In my experience, most of the slowdown occurs in BSON parsing (which is a synchronous operation). On an underpowered platform, that may become a bottleneck. – Ebb
