We are running a 3-member MongoDB replica set in a production environment.
We need to maintain a clone of that replset, which we call the "mirror," for internal analytics. The mirror does not need to be real-time, but the more up-to-date it is the better (it may lag by 1 day at most).
What would be the most appropriate method to maintain such a mirrored database? (Note that the mirror can be either a 1-member replset or a standalone instance.)
FYI, we have tried 2 options, but neither was fast enough:
- Replaying the oplog. This took too long (~40 hours to replay the oplog from the replset's primary).
- Periodically restoring a snapshot of the production replset. The new volume (created from the snapshot) was very slow because it had not been warmed up (we are using AWS EBS, and warming it up took ~12 hours).
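For context, by "warming up" we mean reading every block of the restored volume once, so that later reads no longer pay the first-access penalty. The following is only a minimal sketch of that idea, not the exact tool we ran; the function name `warm_up` and the 1 MiB chunk size are assumptions made for illustration.

```python
CHUNK_SIZE = 1024 * 1024  # 1 MiB per read (an arbitrary choice for this sketch)


def warm_up(path, chunk_size=CHUNK_SIZE):
    """Read `path` sequentially from start to end, discarding the data.

    Returns the total number of bytes read. `path` can be a block device
    or a regular file; reading a block device usually requires root.
    """
    total = 0
    # buffering=0 disables Python's own buffer, so every read() call
    # is passed straight to the operating system.
    with open(path, "rb", buffering=0) as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:  # empty bytes object means end of device/file
                break
            total += len(chunk)
    return total
```

Calling `warm_up("/dev/xvdf")` (the device path is hypothetical) touches the whole volume once; on a large volume this full sequential read is what accounts for the hours of warm-up time.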
Update #1: We also tried making the mirror a member of the replset, but we want the mirror to be separate from the replset, so this option does not satisfy the requirements.
Update #2: The reason we do not want this mirror to be a replset member: we ran heavy queries on the mirror, which exhausted its resource credits (disk I/O, network I/O, CPU), and the instance became temporarily unavailable. This changed the whole replset structure (because it lost one node). When the instance became available again, the replset structure changed again (one node was added back). These changes badly affected the replset.
Thank you.