Synchronize mongo databases on different servers

Asked 21/5, 2013 at 13:5 Answered 16/5, 2017 at 7:17

I have the following situation. I have two MongoDB instances on different servers. For example:

Mongodb instance on server "one" (host1:27017) with database: "test1"
Mongodb instance on server "two" (host2:27017) with database: "test2"

Now I need to synchronize the "test1" database on "host1:27017" with "test2" on "host2:27017".

By "synchronize" I mean the following:

If a collection from the "test1" database doesn't exist in "test2", it should be fully copied to the "test2" database.
If a record from a collection doesn't exist in the "test2" database, it must be added; otherwise it must be updated. If a record doesn't exist in collection A in "test1" but exists in collection A in "test2", it must be deleted from "test2".
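
To make these rules concrete, here is a minimal sketch of the per-collection logic in plain JavaScript. The function name buildSyncOps and the sample arrays are hypothetical; the sketch assumes both collections fit in memory and that documents are matched on _id only. It produces operations in the shape that bulkWrite accepts (replaceOne with upsert, deleteOne), but it does not talk to a server itself.

```javascript
// Build the bulk operations that would make `targetDocs` match `sourceDocs`.
// Both arguments are plain arrays of documents that each carry an `_id`.
function buildSyncOps(sourceDocs, targetDocs) {
  const sourceIds = new Set(sourceDocs.map((doc) => String(doc._id)));

  // Add-or-update rule: every source document is inserted or overwritten.
  const upserts = sourceDocs.map((doc) => ({
    replaceOne: { filter: { _id: doc._id }, replacement: doc, upsert: true },
  }));

  // Delete rule: documents that exist only in the target are removed.
  const deletes = targetDocs
    .filter((doc) => !sourceIds.has(String(doc._id)))
    .map((doc) => ({ deleteOne: { filter: { _id: doc._id } } }));

  return upserts.concat(deletes);
}

// Hypothetical sample data: collection "A" as seen on each host.
const test1A = [{ _id: "1", name: "some name" }];
const test2A = [
  { _id: "1", name: "some name" },
  { _id: "2", name: "some name2" },
];

const ops = buildSyncOps(test1A, test2A);
console.log(JSON.stringify(ops, null, 2));
```

With the Node.js driver, the resulting array could presumably be passed to collection.bulkWrite(ops) on the "test2" side. A collection that is missing in "test2" is covered by the same logic, since every source document then becomes an upsert.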

Here is the problem. For example, the "test1" database has collection "A" with the following documents:

{
 _id: "1",
 name: "some name"
}

"test2" database has collection "A" with the following documents:

{
 _id: "1",
 name: "some name"
}

{
 _id: "2",
 name: "some name2"
}

If I run db.copyDatabase('test1', 'test2', "host2:27017") I get this error:

"errmsg" : "exception: E11000 duplicate key error index: test1.A.$id dup key: { : \"1\" }"

The same happens with the cloneDatabase command. How can I resolve it?

In general, what are the ways to synchronize databases? I know that the simplest way is to just copy the files from one server to the other, but maybe there are better ways.

Please help. I'm a newcomer to MongoDB. Thanks.

Commendation answered 21/5, 2013 at 13:5 Comment(8)

There are a fair number of questions about this subject alone on here if you search around a little, e.g.: #16172113 – Caveman 21/5, 2013 at 13:56

Why don't you dump the database on host1 (mongodump), tar+scp the data to the other host and do a mongorestore? Or set up a replica set (remember to include an arbiter) and let MongoDB take care of replication? – Ieper 21/5, 2013 at 13:59

How are things being updated in MongoDB? Are writes going to test1, with test2 being just a copy? – Talc 21/5, 2013 at 14:10

I just want to sync data between different instances. In my case test1 acts as the master and test2 as the slave; I want test1 and test2 to hold equivalent data. – Commendation 21/5, 2013 at 14:29

Sounds like you want a replica set when you say master and slave – Caveman 21/5, 2013 at 14:58

I can't use a replica set. I need to sync the data manually. – Commendation 21/5, 2013 at 15:4

Well, you're about to get into a world of pain doing that. I listed a question which states some of the best manual ways. – Caveman 21/5, 2013 at 15:13

I'm interested in this topic as well. For a simple system I just want to sync data automatically between a master and a slave machine, and I don't need the complexity of replica sets with arbiters, though that might be the ultimate solution. A simple master-slave replication scheme is sufficient for now. – Adenaadenauer 6/7, 2013 at 15:47

I haven't tried this, but the current MongoDB documentation describes a replica set configuration equivalent to master-slave replication:

Deploy Master-Slave Equivalent using Replica Sets

If you want a replication configuration that resembles master-slave replication, using replica sets, consider the following replica configuration document. In this deployment hosts <master> and <slave> provide replication that is roughly equivalent to a two-instance master-slave deployment:
{
   _id : 'setName',
   members : [
              { _id : 0, host : "<master>", priority : 1 },
              { _id : 1, host : "<slave>", priority : 0, votes : 0 }
  ]
}
See Replica Set Configuration for more information about replica set configurations.
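
As a sketch (equally untested against a live deployment), the same configuration document can be built with a small JavaScript helper. The function name buildMasterSlaveConfig, the set name "rs0", and the host strings are assumptions chosen for illustration; only the field names come from the configuration quoted above.

```javascript
// Build a two-member replica set configuration document.
// setName, primaryHost and secondaryHost are placeholders supplied by the caller.
function buildMasterSlaveConfig(setName, primaryHost, secondaryHost) {
  return {
    _id: setName,
    members: [
      // The only member that is eligible to become primary.
      { _id: 0, host: primaryHost, priority: 1 },
      // priority 0: never elected primary; votes 0: takes no part in elections.
      { _id: 1, host: secondaryHost, priority: 0, votes: 0 },
    ],
  };
}

const config = buildMasterSlaveConfig("rs0", "host1:27017", "host2:27017");
console.log(JSON.stringify(config, null, 2));
```

In the mongo shell, an object like this would be passed to rs.initiate(config) on the intended primary, assuming both mongod processes were started with the same replica set name.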

Adenaadenauer answered 6/7, 2013 at 16:17 Comment(0)

Use _id instead of id. There is no need to declare it in your model.

if you have plenty of servers

On each server I use a small pre-save hook which creates a controlled, unique _id. The ObjectId that Mongoose generates has a well-defined layout (https://docs.mongodb.com/manual/reference/method/ObjectId/#ObjectIDs-BSONObjectIDSpecification). The code below overwrites four of its hex digits (positions 6 to 9) with a per-server identifier, because I have multiple servers and I want to make sure there is no collision. If you have just a few servers, skipping this is probably not a risk. Even in my case I think it is too paranoid.

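# ObjectId constructor from Mongoose (the import was missing in the snippet)
ObjectId = require('mongoose').Types.ObjectId

# True when a 4-character instance id is configured in the environment
exports.useProcessId = ->
  return process.env.INSTANCE_PROCESS_ID? && process.env.INSTANCE_PROCESS_ID.length == 4

# Overwrite hex digits 6..9 of the id with the instance id
exports.manipulateMongooseId = (id) ->
  id = id.toString()
  newId = new ObjectId(id.slice(0,6) + process.env.INSTANCE_PROCESS_ID + id.slice(10,24))
  return newId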
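
For readers who do not use CoffeeScript, the string manipulation can be sketched in plain JavaScript without Mongoose. The name withInstanceId and the sample values are hypothetical, and the hex validation is an added assumption (the snippet above only checks the length), included because an ObjectId string has to consist of 24 hex characters.

```javascript
// Replace hex digits 6..9 of a 24-character ObjectId string
// with a 4-character instance id, leaving the rest untouched.
function withInstanceId(hexId, instanceId) {
  if (!/^[0-9a-f]{24}$/i.test(hexId)) {
    throw new Error("expected a 24-character hex ObjectId string");
  }
  if (!/^[0-9a-f]{4}$/i.test(instanceId)) {
    throw new Error("instance id must be exactly 4 hex characters");
  }
  return hexId.slice(0, 6) + instanceId + hexId.slice(10, 24);
}

const original = "507f1f77bcf86cd799439011";
const patched = withInstanceId(original, "abcd");
console.log(patched); // 507f1fabcdf86cd799439011
```

The result is still 24 hex characters, so it can be passed to the ObjectId constructor in the same way as the original string.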

schema

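# myModelValidator, processIdConfig and instanceType are defined elsewhere (not shown)
myModelSchema.pre('save', (next) ->
  data = @
  async.parallel
    myModel: (done) ->
      myModelValidator.base(data, done)
    changeMongooseId: (done) ->
      # changeMongooseId is not shown here; presumably it wraps manipulateMongooseId
      if processIdConfig.useProcessId() && instanceType == 'manager'
        processIdConfig.changeMongooseId(data, done)
      else
        done()
  , (err) ->
    # Abort the save on the first error, otherwise continue
    return next new Error(err) if err?
    return next()
)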

Skantze answered 16/5, 2017 at 7:17 Comment(0)
