Node.JS + mongo: .find().each() stopping after first batch

This has me stumped.

I have a standalone (command-line executed) node script, whose purpose is to iterate through all the documents in a large collection (several hundred thousand of them), and for each document, perform a few calculations, run a little additional JS code, and then update the document with some new values.

Per the documentation for cursor.each(), once I've got my cursor from collection.find(), the .each(cb) method should execute cb(item) on each item in the entire collection.

Example code:

myDb.collection('bigcollection').find().each(function(err, doc) {
    if (err) {
        console.log("Error: " + err);
    } else {
        if (doc != null) {
            process.stdout.write(".");
        } else {
            process.stdout.write("X");
        }
    }
});

What I'd expect this to do is print out several hundred thousand .'s and then print an X at the end, as cursor.each() is supposed to "Iterate over all the documents for this cursor," and per the example code, "If the item is null then the cursor is exhausted/empty and closed."

But what it actually does is print out precisely 101 .'s, without an X at the end.

If I adjust the batch size (.find().batchSize(10).each(...)), it goes through exactly that number of documents before bailing.
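
To make the symptom concrete, here is a toy model of a batched cursor. It is a sketch only, not the mongodb driver's implementation: makeToyCursor, trace, and the getMoreWorks flag are hypothetical names invented for illustration. It assumes the cursor fetches documents in batches and that a stalled follow-up fetch delivers neither more documents nor the final null. Under that assumption the output is exactly one batch of dots with no X, which matches what is described above (101 is MongoDB's default size for the first batch).

```javascript
// Toy model of a batched cursor. NOT the mongodb driver's code:
// makeToyCursor, trace and getMoreWorks are hypothetical names.
function makeToyCursor(totalDocs, batchSize, getMoreWorks) {
    var sent = 0;            // documents handed out so far
    var batchesFetched = 0;  // simulated round trips to the server

    // Simulates one round trip: returns up to batchSize documents,
    // an empty array when exhausted, or null when the fetch stalls.
    function fetchBatch() {
        if (batchesFetched > 0 && !getMoreWorks) {
            return null; // assumed failure mode: follow-up fetch never arrives
        }
        batchesFetched += 1;
        var batch = [];
        while (batch.length < batchSize && sent < totalDocs) {
            batch.push({ _id: sent });
            sent += 1;
        }
        return batch;
    }

    return {
        // Same callback shape as in the question: cb(err, doc),
        // with doc === null once the cursor is exhausted.
        each: function (cb) {
            for (;;) {
                var batch = fetchBatch();
                if (batch === null) {
                    return;          // stalled: no more docs, no final null
                }
                if (batch.length === 0) {
                    cb(null, null);  // exhausted: signal the end
                    return;
                }
                batch.forEach(function (doc) { cb(null, doc); });
            }
        }
    };
}

// Runs the question's callback against the toy cursor and
// returns what it would have written to stdout.
function trace(totalDocs, batchSize, getMoreWorks) {
    var out = '';
    makeToyCursor(totalDocs, batchSize, getMoreWorks).each(function (err, doc) {
        out += (doc != null) ? '.' : 'X';
    });
    return out;
}

console.log(trace(25, 10, true));   // healthy cursor: 25 dots, then X
console.log(trace(25, 10, false));  // stalled after first batch: 10 dots, no X
```

If this model is roughly right, the missing X is the useful clue: the cursor is not ending early, it is simply never receiving its second batch.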

So, why is it only processing the first batch? Am I somehow misreading the documentation for .each()? Does it have to do with the fact that this is a command-line script, and somehow the whole script is exiting before the second batch of results comes back, or something? If so, how do I make sure it actually processes all the results?

As a side note, I've tried using .stream() and .forEach(), and in both of those cases it also ditches after the first batch.

UPDATE: Well, this is interesting. Just tried connecting to my production server instead of my mongo instance on localhost, and voila, it runs through the entire collection like it should. The server is running mongodb 3.0.6, my local instance is 3.2.3. My version of the node mongodb driver is 2.0.43.

Nicolais answered 4/4, 2016 at 16:8 Comment(4)
I don't think the last item will be null. The last item is the last item... period. So it is a valid document and it will print ., not x. Maybe you need to count the collection first and use a counter to print x after the last one. Makes sense? - Breeder
This code works fine when I try it. Are you perhaps closing myDb before the iteration completes? - Thermionics
@Thermionics I'm not actually closing myDb in the code at all. I was wondering whether, since it's an asynchronous call, node was maybe just running straight through to the end and then quitting before sending out the request for the second batch, or something, but that seems like it would be... odd. - Nicolais
Just had the same issue with mongodb driver v2.0.45; "npm install mongodb" updated it to 2.1.21 and suddenly my stream finishes as it should... - Edifice

I have 200 documents in my collection and the following code works fine. In other words, I couldn't reproduce the problem. As you can see, I have reduced the batch size to 10.

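// Assumes the official driver is installed: npm install mongodb
var MongoClient = require('mongodb').MongoClient;

var url = 'mongodb://localhost:27017/test';
MongoClient.connect(url, function(err, db) {
    if (err) {
        console.log(err);
    }
    else {
        var counter = 0;
        // A small batch size forces several round trips for 200 documents.
        db.collection('collection').find({}).batchSize(10).each(function(e, r){
            if(e){
                // Check the callback's own error (e), not the outer connect error (err).
                console.log("E: " +  e);
                db.close();
            }
            else{
                if(r ==  null){
                    // A null document means the cursor is exhausted.
                    db.close();
                }
                else{
                    counter += 1;
                    console.log("X: " +  counter);
                }
            }
        });
    }
});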

If you are still facing the same issue, I'd suggest updating the MongoDB driver to the latest version. Since drivers are actively being developed, bugs sometimes sneak into a released version and cause strange behavior.
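
Before upgrading, it can help to confirm which driver version the script actually loads. The sketch below is one way to do that. isAtLeast is a hypothetical helper, not part of the driver, and the 2.0.55 threshold is only the version the asker reports as working in the comments, not a confirmed fix version. Reading mongodb/package.json is also an assumption about how the package is laid out, so the lookup is wrapped in try/catch.

```javascript
// Hypothetical helper: compares dotted numeric versions such as
// "2.0.43" and "2.0.55". Non-numeric parts are treated as 0.
function isAtLeast(installed, required) {
    var a = installed.split('.').map(Number);
    var b = required.split('.').map(Number);
    for (var i = 0; i < Math.max(a.length, b.length); i++) {
        var x = a[i] || 0;
        var y = b[i] || 0;
        if (x !== y) {
            return x > y;
        }
    }
    return true; // all parts equal
}

// Assumption: the installed package exposes its package.json to require().
var installed = null;
try {
    installed = require('mongodb/package.json').version;
} catch (e) {
    installed = null; // driver missing, or package.json not reachable
}

if (installed === null) {
    console.log('Could not determine the installed mongodb driver version');
} else if (isAtLeast(installed, '2.0.55')) {
    console.log(installed + ' is at or above 2.0.55');
} else {
    console.log(installed + ' is older than 2.0.55; consider running "npm install mongodb"');
}
```

Running "npm ls mongodb" in the project directory should report the same version without any code.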

Wroughtup answered 4/4, 2016 at 16:29 Comment(4)
OK, so I just ran your exact code (subbing in my connection URL and collection name for yours), and as before, it ran through the first batch of 10 and then exited. So there must be something else going on here. Any idea what could cause it to exit like that? - Nicolais
Well, this is really strange. What version of the driver do you have? Try updating to the latest if possible. Also, please mention your MongoDB version too. - Wroughtup
Hmm, interesting. Try removing and reinstalling the driver and repairing the database. - Wroughtup
OK, there you have it -- updating the mongodb node driver to 2.0.55 fixed it. Man, I don't know how many hours I spent on this... thanks for suggesting that! Please stick that comment into the answer; I'll mark this one as correct. - Nicolais
