Mongoose (mongodb) batch insert?
S

8

133

Does Mongoose v3.6+ support batch inserts now? I've searched for a few minutes but anything matching this query is a couple of years old and the answer was an unequivocal no.

Edit:

For future reference, the answer is to use Model.create(). create() accepts an array as its first argument, so you can pass your documents to be inserted as an array.

See Model.create() documentation
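As a minimal sketch of the array form described above: the helper name createMany is hypothetical, and Model stands for any compiled Mongoose model passed in by the caller. It assumes a Mongoose version in which create() returns a promise.

```javascript
// Hypothetical helper (not part of Mongoose): insert an array of plain
// objects through Model.create().
async function createMany(Model, docs) {
  // With an array argument, create() resolves to an array of saved documents.
  const saved = await Model.create(docs);
  return saved;
}

// Possible usage, assuming a Potato model exists:
//   createMany(Potato, [{ name: 'a' }, { name: 'b' }]).then((saved) => { /* ... */ });
```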

Shantung answered 24/5, 2013 at 1:14 Comment(7)
See this answer to a previous question.Chelseychelsie
Thanks. That's what I ended up finding after posting.Shantung
@Shantung please add your edit as an answer and accept it to resolve your question.Whisky
groups.google.com/forum/#!topic/mongoose-orm/IkPmvcd0kdsPostremogeniture
Model.create() is slow and if you're considering inserting a huge number of documents, it's better to take this approach instead.Chipboard
Check this #38509795Struggle
mongoose now supports Model.bulkWrite() and Model.insertMany().Alarmist
C
186

Model.create() vs Model.collection.insert(): a faster approach

Model.create() is a poor choice for very large bulk inserts: it is very slow and, depending on the size of the batch, it can even crash. I tried it with a million documents without success. In that case you should use Model.collection.insert, which performs much better: the same insert took just a few seconds.

Model.collection.insert(docs, options, callback)
  • docs is the array of documents to be inserted;
  • options is an optional configuration object - see the docs
  • callback(err, docs) will be called after all documents get saved or an error occurs. On success, docs is the array of persisted documents.

As Mongoose's author points out here, this method will bypass any validation procedures and access the Mongo driver directly. It's a trade-off you have to make when handling a large amount of data; otherwise you wouldn't be able to insert it into your database at all (remember we're talking about hundreds of thousands of documents here).

A simple example

var Potato = mongoose.model('Potato', PotatoSchema);

var potatoBag = [/* a humongous amount of potato objects */];

Potato.collection.insert(potatoBag, onInsert);

function onInsert(err, docs) {
    if (err) {
        // TODO: handle error
    } else {
        console.info('%d potatoes were successfully stored.', docs.length);
    }
}

Update 2019-06-22: although insert() can still be used just fine, it's been deprecated in favor of insertMany(). The parameters are exactly the same, so you can just use it as a drop-in replacement and everything should work just fine (well, the return value is a bit different, but you're probably not using it anyway).
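A minimal sketch of the insertMany() replacement mentioned in the update, using a promise instead of a callback. The helper name rawInsertMany is hypothetical. It assumes Model.collection exposes the native driver's insertMany(), whose result reports an insertedCount rather than handing back the documents.

```javascript
// Hypothetical helper: bulk-insert plain objects through the native driver.
// Like insert(), this skips Mongoose validation, defaults, and middleware.
async function rawInsertMany(Model, docs) {
  const result = await Model.collection.insertMany(docs);
  // The driver result carries a count, not the array of documents.
  return result.insertedCount;
}

// Possible usage, assuming the Potato model and potatoBag from the example:
//   rawInsertMany(Potato, potatoBag).then((n) => console.info('%d potatoes stored.', n));
```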

Reference

Chipboard answered 20/7, 2014 at 6:54 Comment(14)
groups.google.com/forum/#!topic/mongoose-orm/IkPmvcd0kds Says it all really.Postremogeniture
Please give example with Mongoose.Hideous
Since Model.collection goes directly through the Mongo driver, you lose all the neat mongoose stuff including validation and hooks. Just something to keep in mind. Model.create loses the hooks, but still goes through validation. If you want it all, you must iterate and new MyModel()Neale
@Pier-LucGendreau You are absolutely right, but it's a tradeoff you have to make once you start dealing with a humongous amount of data.Chipboard
@SirBenBenji Example given :-)Chipboard
Thanks @arcseldon. Added your link as a reference.Chipboard
Be careful to new readers: "Changed in version 2.6: The insert() returns an object that contains the status of the operation". No more docs.Grazing
By addressing the .collection property you are bypassing Mongoose (validation, 'pre' methods ...). See the other answer, which uses the new native Mongoose method (https://mcmap.net/q/167642/-mongoose-mongodb-batch-insert).Orlantha
Probably the best way to add tons of data :DTripura
insert is now deprecated, use insertMany with the same function signature as insertPert
@EatatJoes can you point us to the documentation where it says so? I just checked 4.0 docs and insert is not marked as deprecated. Also checked the upcoming 4.2 and nothing there either. insertMany was added in 3.2, so I think I will keep using insert here since people may still be using older versions. If you find the reference where it says it's been deprecated, please send it to me and I'll add it to the answer.Chipboard
@LucioPaiva I had a look at History.md and it looks like Mongoose started using mongodb-native 3.2.0 since version 5.5.0, here is the link, mongodb.github.io/node-mongodb-native/3.2/api/… Since .collection.insert{Many} is just an interface to the core mongo driver you will see the deprecation warning.Pert
Thanks for the link @EatatJoes - added it to the answer.Chipboard
As of mongoose 5.7.3, insertMany is 3-4 times slower than collection.insert, probably because it returns all the documents back to the client. collection.insert is as fast as pymongo insert_many - #58226891.Liebman
O
127

Mongoose 4.4.0 now supports bulk insert

Mongoose 4.4.0 introduces bulk insert with the model method .insertMany(). It is much faster than looping over .create() or passing it an array.

Usage:

var rawDocuments = [/* ... */];

Book.insertMany(rawDocuments)
    .then(function(mongooseDocuments) {
         /* ... */
    })
    .catch(function(err) {
        /* Error handling */
    });

Or

Book.insertMany(rawDocuments, function (err, mongooseDocuments) { /* Your callback function... */ });
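The same call can be sketched with async/await. The helper name insertBooks is hypothetical, and the ordered option is an assumption about the Mongoose version in use: later releases accept it, while a comment on this answer notes that 4.4.0 did not support options at all.

```javascript
// Hypothetical helper: insert raw documents with Model.insertMany().
async function insertBooks(Book, rawDocuments) {
  // ordered: false asks MongoDB to keep inserting after a failed document
  // instead of stopping at the first error.
  const inserted = await Book.insertMany(rawDocuments, { ordered: false });
  return inserted.length;
}
```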

You can track it on:

Orlantha answered 3/2, 2016 at 10:38 Comment(7)
At this time, this method does not support options.Textbook
Thank you for the answer. Any idea what parsing of the rawDocuments should be in place? I've tried it with an array of Json objects and all it has inserted was just their IDs. :(Quinquagesima
How is this different to bulkWrite? See here: #38742975Quinquagesima
insertMany doesn't work for me. I got a fatal error allocation failed. But if I use collection.insert It works perfectly.Tripura
Would this work with the extra stuff that mongoose schema provides? ex will this add the data if no date exists dateCreated : { type: Date, default: Date.now },Angry
This is the right answer. I had the same problem and this solved it and its FASTRiggle
As of mongoose 5.7.3, insertMany is 3-4 times slower than collection.insert, probably because it returns all the documents back to the client. collection.insert is as fast as pymongo insert_many - #58226891.Liebman
M
23

Indeed, you can use Mongoose's create() method; it accepts several documents (as separate arguments or as an array), as in this example:

Candy.create({ candy: 'jelly bean' }, { candy: 'snickers' }, function (err, jellybean, snickers) {
});

The callback receives the inserted documents. You do not always know how many items will be inserted (the example above has a fixed number of arguments), so you can loop through them:

var insertedDocs = [];
for (var i=1; i<arguments.length; ++i) {
    insertedDocs.push(arguments[i]);
}
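The loop above can also be sketched with rest parameters, which gather everything after err into a real array. The name collectDocs is hypothetical, and the sketch assumes a Mongoose version that still accepts callbacks, as the example in this answer does.

```javascript
// Hypothetical wrapper: turns a variadic (err, doc1, doc2, ...) callback
// into one that receives (err, arrayOfDocs).
function collectDocs(done) {
  return function (err, ...insertedDocs) {
    done(err, insertedDocs);
  };
}

// Possible usage with the Candy model from the example:
//   Candy.create({ candy: 'jelly bean' }, { candy: 'snickers' },
//     collectDocs(function (err, docs) { /* docs is an array */ }));
```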

Update: A better solution

A better solution would be to use Candy.collection.insert() instead of Candy.create(), which is used in the example above, because it's faster (create() calls Model.save() on each item, so it's slower).

See the Mongo documentation for more information: http://docs.mongodb.org/manual/reference/method/db.collection.insert/

(thanks to arcseldon for pointing this out)

Melt answered 29/3, 2014 at 20:30 Comment(5)
groups.google.com/forum/#!topic/mongoose-orm/IkPmvcd0kds - Depending on what you want, the link has a better option.Postremogeniture
Don't you mean {type:'jellybean'} instead of {type:'jelly bean'}? Btw. what strange types are those? Are they part of Mongoose API?Hideous
Well that's a bad naming choice then, since type is usually reserved in Mongoose for denominating the ADT of a database object.Hideous
@sirbenbenji I changed it, but it was an example also present in the official documentation. It was not necessary to downvote for this I think.Melt
By addressing .collection property you are bypassing Mongoose (validation, 'pre' methods ...)Orlantha
B
6

Here are two ways of saving data: with insertMany and with save.

1) Mongoose save array of documents with insertMany in bulk

/* write mongoose schema model and export this */
var Potato = mongoose.model('Potato', PotatoSchema);

/* write this api in routes directory  */
router.post('/addDocuments', function (req, res) {
    const data = [/* array of object which data need to save in db */];

    Potato.insertMany(data)  
    .then((result) => {
            console.log("result ", result);
            res.status(200).json({'success': 'new documents added!', 'data': result});
    })
    .catch(err => {
            console.error("error ", err);
            res.status(400).json({err});
    });
})

2) Mongoose save array of documents with .save()

These documents will be saved in parallel.

/* write mongoose schema model and export this */
var Potato = mongoose.model('Potato', PotatoSchema);

/* write this api in routes directory  */
router.post('/addDocuments', function (req, res) {
    const saveData = []
    const data = [/* array of object which data need to save in db */];
    data.forEach((item) => {
        // item is the array element itself, not an index
        var potato = new Potato(item)
        potato.save()
        .then((result) => {
            console.log(result)
            saveData.push(result)
            if (saveData.length === data.length) {
                res.status(200).json({'success': 'new documents added!', 'data': saveData});
            }
        })
        .catch((err) => {
            console.error(err)
            // respond only once, even if several saves fail
            if (!res.headersSent) {
                res.status(500).json({err});
            }
        })
    })
})
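The parallel save() approach can be sketched more compactly with Promise.all, which removes the manual counting of finished saves. The helper name saveAll is hypothetical. Model is any compiled Mongoose model.

```javascript
// Hypothetical helper: save every item in parallel through Model#save().
function saveAll(Model, items) {
  // Every save starts immediately. Promise.all resolves with the results in
  // input order once all succeed, or rejects on the first failure.
  return Promise.all(items.map((item) => new Model(item).save()));
}

// Possible usage inside the route handler:
//   saveAll(Potato, data)
//     .then((saved) => res.status(200).json({ data: saved }))
//     .catch((err) => res.status(500).json({ err }));
```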
Brion answered 16/5, 2018 at 17:36 Comment(0)
S
5

It seems that, with mongoose, inserting more than 1000 documents in a single call hits a limit when using

Potato.collection.insert(potatoBag, onInsert);

You can use:

var async = require('async');
var bulk = Model.collection.initializeOrderedBulkOp();

async.each(users, function (user, callback) {
    bulk.insert(user);   // queue this document on the bulk operation
    callback();          // tell async.each this item is done
}, function (err) {
    var bulkStart = Date.now();
    bulk.execute(function(err, res){
        if (err) return console.log (" gameResult.js > err " , err);
        console.log (" gameResult.js > BULK TIME  " , Date.now() - bulkStart );
        console.log (" gameResult.js > BULK INSERT " , res.nInserted)
    });
});

But the following chunked approach was almost twice as fast when tested with 10000 documents:

function fastInsert(arrOfResults) {
    var startTime = Date.now();
    var count = 0;
    var chunkSize = 999;
    // number of chunks needed so that no document is left out
    var c = Math.ceil(arrOfResults.length / chunkSize);

    var fakeArr = new Array(c).fill(null);
    var docsSaved = 0

    async.each(fakeArr, function (item, callback) {

            var sliced = arrOfResults.slice(count, count + chunkSize);
            count = count + chunkSize;
            if(sliced.length != 0 ){
                    GameResultModel.collection.insert(sliced, function (err, result) {
                            if (err) return callback(err);
                            docsSaved += result.insertedCount
                            callback();
                    });
            }else {
                    callback()
            }
    }, function (err) {
            if (err) console.log (" gameResult.js > err " , err);
            console.log (" gameResult.js > BULK INSERT AMOUNT: ", arrOfResults.length, "docsSaved  " , docsSaved, " DIFF TIME:",Date.now() - startTime);
    });
}
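The chunking idea can also be sketched without the async library, using Model.insertMany() and one batch at a time. The names chunk and insertInChunks are hypothetical, and the default batch size of 1000 is only an assumption taken from the limit observed above.

```javascript
// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Hypothetical helper: insert the batches sequentially so that only one
// batch is in flight at any time.
async function insertInChunks(Model, docs, size = 1000) {
  let inserted = 0;
  for (const batch of chunk(docs, size)) {
    const saved = await Model.insertMany(batch);
    inserted += saved.length;
  }
  return inserted;
}
```

Sequential batches trade some speed for a bounded memory footprint. Running the batches concurrently, as the answer does, is faster but keeps more data in flight.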
Senzer answered 2/8, 2015 at 18:37 Comment(1)
By addressing .collection property you are bypassing Mongoose (validation, 'pre' methods ...)Orlantha
D
4

You can perform a bulk insert in the MongoDB shell by passing the values in an array.

db.collection.insert([{values},{values},{values},{values}]);
Delisle answered 8/8, 2014 at 9:3 Comment(4)
is there a way in mongoose for bulk insert?Delisle
YourModel.collection.insert()Periderm
By addressing .collection property you are bypassing Mongoose (validation, 'pre' methods ...)Orlantha
This is not mongoose, and the raw collection.insert answer was given a few weeks before this answer, and explained in much more detail.Alarmist
P
4

You can perform a bulk insert using mongoose, as the highest-scored answer shows. But its example does not work as written; it should be:

/* a humongous amount of potatoes */
var potatoBag = [{name:'potato1'}, {name:'potato2'}];

var Potato = mongoose.model('Potato', PotatoSchema);
Potato.collection.insert(potatoBag, onInsert);

function onInsert(err, docs) {
    if (err) {
        // TODO: handle error
    } else {
        console.info('%d potatoes were successfully stored.', docs.length);
    }
}

Don't use schema instances for the bulk insert; use plain objects instead.
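If the data already exists as Mongoose documents, one way to get plain objects is toObject(). The helper name toPlainObjects below is hypothetical; it leaves values that are already plain objects untouched.

```javascript
// Hypothetical helper: convert Mongoose documents to plain objects before
// handing them to Model.collection.insert().
function toPlainObjects(docs) {
  return docs.map((doc) =>
    doc && typeof doc.toObject === 'function' ? doc.toObject() : doc
  );
}

// Possible usage with the names from the example:
//   Potato.collection.insert(toPlainObjects(potatoBag), onInsert);
```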

Premiership answered 16/3, 2015 at 3:35 Comment(2)
The first answer is not wrong, it just has validationSchizothymia
By addressing .collection property you are bypassing Mongoose (validation, 'pre' methods ...)Orlantha
E
0

Sharing working and relevant code from our project:

//documentsArray is the list of sampleCollection objects
sampleCollection.insertMany(documentsArray)  
    .then((res) => {
        console.log("insert sampleCollection result ", res);
    })
    .catch(err => {
        console.log("bulk insert sampleCollection error ", err);
    });
Ephesus answered 26/12, 2017 at 17:1 Comment(1)
The .insertMany solution was already given (and explained) in this 2016 answer.Alarmist
