Severe performance drop with MongoDB Change Streams

I want to get real-time updates about MongoDB database changes in Node.js.

A single MongoDB change stream sends update notifications almost instantly. But when I open multiple (10+) streams, there are massive delays (up to several minutes) between database writes and notification arrival.

This is how I set up a change stream:

let cursor = collection.watch([
  // Only deliver events for documents belonging to this room.
  {$match: {"fullDocument.room": roomId}},
]);
cursor.stream().on("data", doc => {...});

I tried an alternative way to set up a stream, but it's just as slow:

let cursor = collection.aggregate([
  // Explicit $changeStream stage instead of collection.watch().
  {$changeStream: {}},
  {$match: {"fullDocument.room": roomId}},
]);
cursor.forEach(doc => {...});

An automated process inserts tiny documents into the collection while collecting performance data; a minimal sketch of the writer follows the list below.

Some additional details:

  • Open change stream cursors: 50
  • Write speed: 100 docs/second (batches of 10 using insertMany)
  • Runtime: 100 seconds
  • Average delay: 7.1 seconds
  • Largest delay: 205 seconds (not a typo, over three minutes)
  • MongoDB version: 3.6.2
  • Cluster setup #1: MongoDB Atlas M10 (3-node replica set)
  • Cluster setup #2: DigitalOcean Ubuntu box + single-instance MongoDB in Docker
  • Node.js CPU usage: <1%
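
For reference, the writer side of the test looks roughly like this (a minimal sketch; 'collection' and 'roomId' are the same as in the snippets above, and the delay measurement is omitted):

// Insert a batch of 10 tiny documents every 100 ms, i.e. ~100 docs/second.
let interval = setInterval(() => {
  let batch = [];
  for (let i = 0; i < 10; i++) {
    batch.push({room: roomId, ts: new Date()});
  }
  collection.insertMany(batch).catch(console.error);
}, 100);

// Stop after the 100-second run.
setTimeout(() => clearInterval(interval), 100 * 1000);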

Both setups produce the same issue. What could be going on here?

Shelton answered 23/1, 2018 at 22:30 · Comments (10)
Did you check that you have all the needed indexes? I.e., I guess "fullDocument.room" needs an index. – Snowmobile
No, I don't have any indexes. I don't really see how indexes would help sort out newly inserted items. But I'll give it a try. – Shelton
Update: added an index on room; nothing changed. – Shelton
Did you find any hint on this? – Snowmobile
Unfortunately, no. :( I'll start a bounty. – Shelton
RAM size on these machines? It’s estimated that after 1000 streams you will start to see very measurable performance drops. Why there is not a global change stream option to avoid having so many cursors floating around is not clear. I think it’s something that should be looked at for future versions of this feature. Up to now, many use cases of MongoDB, specifically in the multi-tenant world, might have > 1000 namespaces on a system. This would make the performance drop problematic. percona.com/blog/2017/11/22/… – Neural
My dev machine has 16G, and I see a considerable performance drop even with just 10 open streams. The DO machine I tested on has 4G as I recall. Both memory and CPU usage were pretty low, though. – Shelton
1) How are you running the processes? 2) Have you measured the network latency between your DigitalOcean box and the Atlas cluster? 3) Have you tried replicating with all nodes in a local network? – Kirbee
@WanBachtiar: 1) I run a single Node.js script that creates 50 change stream cursors and then writes to the collection at 100 docs/sec. 2) I don't have exact numbers, but the delay is the same on my local computer without any internet traffic. The Atlas servers are within milliseconds of ping, and change streams are several orders of magnitude slower than that. The DO test case runs both Mongo and the client on the same machine, yet the issue persists. I highly doubt it's a connection issue. 3) Yes, I did; the result is the same. – Shelton
I filed a bug with MongoDB if anyone's interested: jira.mongodb.org/browse/SERVER-32946 – Shelton

The default connection pool size in the Node.js client for MongoDB is 5. Since each change stream cursor opens a new connection, the pool needs to be at least as large as the number of open cursors; otherwise cursors have to wait for a free connection and notifications back up, which shows up as the delays described above.

In version 3.x of the Node.js MongoDB driver, use 'poolSize':

const mongoConnection = await MongoClient.connect(URL, {poolSize: 100});

In version 4.x of the Node.js MongoDB driver, use 'minPoolSize' and 'maxPoolSize':

const mongoConnection = await MongoClient.connect(URL, {minPoolSize: 100, maxPoolSize: 1000});

(Thanks to MongoDB Inc. for investigating this issue.)
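
Putting it together, a rough sketch for driver 4.x (the connection string, database, collection, and room IDs below are placeholders, not from the question):

const {MongoClient} = require("mongodb");

async function main() {
  // Size the pool so every change stream cursor can hold its own connection.
  const client = await MongoClient.connect("mongodb://localhost:27017", {
    minPoolSize: 100,
    maxPoolSize: 1000,
  });
  const collection = client.db("test").collection("messages");

  // One change stream per room; each cursor occupies one pooled connection.
  for (const roomId of ["room1", "room2"]) {
    collection
      .watch([{$match: {"fullDocument.room": roomId}}])
      .on("change", change => console.log(roomId, change.operationType));
  }
}

main().catch(console.error);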

Shelton answered 30/1, 2018 at 20:32 · Comments (7)
Thanks for this. So if I am watching 5 collections, will each of these open its own connection to the database? – Exterminate
@Exterminate Yes, that's my understanding. Worth mentioning, even if you watch the same collection using 5 different streams, it opens 5 connections. No idea why. – Shelton
And the annoying fact is, if poolSize is equal to db.serverStatus().connections.current, the app will slow down drastically. – Exterminate
This is very disappointing, but it is worth noting that at least on 4.2 and later you can open a change stream on the database and use the pipeline to select which collections you want. That way you can have a single stream for all collections in a database; you just have to demultiplex the events in your own code (see the sketch after these comments). – Vigen
Hi @taxilian, is there a maximum number of change streams that can be open? – Trotyl
If each change stream cursor opens a new connection, then the maximum would be related to the maximum number of connections and/or cursors the DB server (and/or client) can handle. – Vigen
@Exterminate I am experiencing this exact issue: when current connections equal the pool size, the app slows down. How do I solve this? – Frey
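
Following up on the database-level stream suggested in the comments above, a minimal sketch of the demultiplexing approach (assuming 'db' is a connected Db instance; the collection names and handlers are illustrative):

// One database-level change stream instead of one cursor per collection.
const handlers = {
  messages: change => { /* handle changes to 'messages' */ },
  rooms: change => { /* handle changes to 'rooms' */ },
};

db.watch([
  // Filter server-side down to the collections we care about.
  {$match: {"ns.coll": {$in: Object.keys(handlers)}}},
]).on("change", change => {
  // Demultiplex by namespace in application code.
  handlers[change.ns.coll](change);
});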
