Limits on the number of collections in databases

Can anyone say whether there are any practical limits on the number of collections in MongoDB? The documentation at https://docs.mongodb.com/manual/core/data-model-operations/#large-number-of-collections says:

Generally, having a large number of collections has no significant performance penalty, and results in very good performance.

But for some reason MongoDB sets a limit of 24000 on the number of namespaces in a database. It looks like this can be increased, but I wonder why there is a limit in the default configuration at all if having many collections in a database doesn't cause any performance penalty.

Does that mean it is viable to have a practically unlimited number of collections in one database, for example one collection per account in a multi-tenant application, resulting in hundreds of thousands of collections in the database? And if a very large number of collections (one per tenant) is viable, what are its benefits compared with, say, keeping the documents of every tenant in one shared collection? Thank you very much for your answers.

Moffett answered 25/3, 2012 at 6:44 Comment(2)
The answers are informative, but since it's the future I'll add that the documentation now states that WiredTiger - which is the default as of 3.2 - is not subject to this limitation.Veery
Although WiredTiger has no such limit, I've discovered that having thousands of collections causes problems starting the server. It uses huge amounts of RAM and opens a pointer to every *.wt file on disk. Resource consumption eventually settles down to normal operating usage, but my restarts take 10 minutes and I have to lease servers with many times the RAM I would otherwise need. It's now advised that a massive number of collections is an anti-pattern. So much for "humongous".Swartz

This answer is late, however the other answers seem a bit weak in terms of reliability and factual information, so I will attempt to remedy that a little.

But for some reason mongodb set limit 24000 for the number of namespaces in the database,

That is merely the default setting. Yes, there is a default setting.

The limits page does state that 24000 is the limit ( http://docs.mongodb.org/manual/reference/limits/#Number%20of%20Namespaces ), as though there were no way to expand it, but there is.

However, there is a maximum size for the namespace file ( http://docs.mongodb.org/manual/reference/limits/#Size%20of%20Namespace%20File ), which is 2GB. That gives you roughly 3 million namespaces to play with in most cases, which is quite impressive, and I am not sure many people will hit that limit quickly.

You can modify the default value to go higher than 16MB by using the nssize parameter, either in the configuration file ( http://docs.mongodb.org/manual/reference/configuration-options/#nssize ) or at runtime via the command used to start MongoDB ( http://docs.mongodb.org/manual/reference/mongod/#cmdoption-mongod--nssize ).
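
To illustrate, raising it might look something like this (the dbpath and the value 64 are just placeholder examples, not recommendations):

    # at startup
    mongod --dbpath /data/db --nssize 64

    # or in the (pre-YAML) configuration file
    nssize = 64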

As far as I know there is no documented reason why MongoDB uses 16MB as the default nssize. I have never heard the "don't bother the user with every single detail" explanation from the developers themselves, so I don't buy that one.

In my opinion, the main reason MongoDB keeps this quiet is that, even though the documentation states:

Distinct collections are very important for high-throughput batch processing.

Using multiple collections as a means to scale vertically, rather than horizontally through a cluster as MongoDB is designed to do, is quite often considered bad practice for large-scale sites; as such, 12K collections is normally considered a number that people will never, and should never, reach.

Trying answered 28/2, 2013 at 12:51 Comment(7)
Is it possible to get records from several collections at once? For example, if I have user.primary and user.secondary, can I select a user by name from both collections in a single query?Reinke
@Reinke No, MongoDB commands operate on a single collection only; there are also no server-side JOINs.Trying
Are you sure it's 3 million max namespaces? By my calculations it's only 307 Thousand.Bassist
@Bassist tbh that was quoted from a 10gen engineer; I have never actually calculated it.Trying
@Bassist the actual calc is: 2147483648/684 (bytes) = 3.1mTrying
@Bassist np common mistake I do it all the time :)Trying
There is no collection number limit if you use WiredTiger. WiredTiger is the default engine now so it's safe to say that there is no collection number limit anymore.Azores

No More Limits!

As other answers have stated, this is determined by the size of the namespace file. This was previously an issue because it had a default limit of 16MB and a maximum of 2GB. However, with the release of MongoDB 3.0 and the WiredTiger storage engine, it looks like this limit has been removed. WiredTiger seems to be better in almost every way, so I see little reason for anyone to use the old engine except for legacy support. From the site:

For the MMAPv1 storage engine, namespace files can be no larger than 2047 megabytes.

By default namespace files are 16 megabytes. You can configure the size using the nsSize option.

The WiredTiger storage engine is not subject to this limitation.

http://docs.mongodb.org/manual/reference/limits/
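
As a rough illustration (the dbpath is only a placeholder), on 3.0 you could opt into the engine explicitly; from 3.2 onwards it is the default for new deployments:

    mongod --storageEngine wiredTiger --dbpath /data/db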

Schwenk answered 7/10, 2015 at 8:51 Comment(0)

A little background:

Every time Mongo creates a database, it creates a namespace (<dbname>.ns) file for it. The namespace file holds the metadata about the database's collections. By default the namespace file is 16MB in size, though you can increase the size manually. The metadata for each namespace entry is 648 bytes plus some overhead. Divide 16MB by that and you get approximately 24000 namespaces per database. You can start Mongo with a larger namespace file, and that will let you create more collections per database.
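
As a rough back-of-the-envelope check (assuming about 700 bytes per entry once the overhead is included; the exact figure varies):

    // plain arithmetic, purely illustrative
    16 * 1024 * 1024 / 700      // ~24000 namespaces with the default 16MB file
    2047 * 1024 * 1024 / 700    // ~3 million with the 2047MB maximum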

The idea behind any default configuration is to not bother the user with every single detail (and configurable knob) and to choose a value that works for most people. Also, viability goes hand in hand with good design practices. As Chris said, consider the shape of your data and decide accordingly.

Lianna answered 25/3, 2012 at 20:53 Comment(4)
The question is why they have this limit at all. Do they assume that beyond some number of namespaces we could have performance problems?Moffett
I am not sure why you are taking it as a limit. It's just a default value, chosen with the idea that it fits the use case of most people. Nothing is stopping you from creating a larger namespace file if your use case needs a larger number of collections.Lianna
It seems to me that if the number of namespaces could truly be unlimited, there wouldn't be any parameter for it at all. Even if I increase the limit, the problem of possibly reaching it would still exist at some point, right?Moffett
I know I'm really late to the discussion, but it seems to me that an open-ended version like you suggest, @Oleg, needs additional functionality, i.e. the namespace file would have to be grown automatically on demand. If you want to keep things simple there, you go instead with a fixed limit, hoping that by the time people need to go beyond it they will have progressed far enough to know which wheel to turn.Nashom

As others mention, the default namespace file size is 16MB, which gives you about 24000 namespace entries. In fact, my 64-bit instance on Ubuntu topped out at 23684 using the default 16MB namespace file.

One important thing that isn't mentioned in the FAQ is that indexes also use namespace slots.

You can count the namespace entries with:

db.system.namespaces.count()

And it's also interesting to actually take a look at what's in there:

db.system.namespaces.find()
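
To see the indexes showing up, a minimal MMAPv1-era shell session along these lines should do (the collection name tenant_1234 is made up):

    db.system.namespaces.count()             // note the current count
    db.tenant_1234.insert({ name: "x" })     // first insert creates the collection: one slot
                                             // for the collection, one for its _id index
    db.tenant_1234.ensureIndex({ name: 1 })  // each additional index takes another slot
    db.system.namespaces.count()             // the count has grown by three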

Set your limit higher than what you think you need because once a database is created, the namespace file cannot be extended (as far as I understand - if there is a way, please tell me!!!).

Kaolack answered 9/7, 2013 at 6:16 Comment(2)
docs.mongodb.org/manual/reference/limits/… By default namespace files are 16 megabytes. You can configure the size using the nssize option.Demography
You can only configure the size of the namespace file before creating your database. To change it, you need to do a mongodump, destroy the database, reconfigure mongod, restart, and mongorestore. Alternatively, additional replicas can be added with a larger nssize option, and then you can switch out the old one with the smaller nssize.Kaolack
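
Roughly, that dump-and-restore route might look like this (database name and paths are placeholders):

    mongodump --db mydb --out /backup
    # stop mongod, remove the old database files, then restart with a larger namespace file
    mongod --dbpath /data/db --nssize 128
    mongorestore --db mydb /backup/mydb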

Practically speaking, I have never run into a maximum, but then I've also never gone anywhere near the 24,000 collection limit. I'm pretty sure I've never had more than 200, other than when I was performance testing. I have to admit, it sounds like an awful lot of chaos to have that many collections in a single database, rather than grouping like data into its own collections.

Consider the shape of your data and your business rules. If your data must be separated into different logical groupings for your multi-tenant app, then you should probably consider other data stores. Because while Mongo is great, the fact that they put a limit on the number of collections at all tells me that they know there is some theoretical point where performance is affected.

Perhaps you should consider a store that matches the data shape? Riak, for example, has an unlimited number of 'buckets' (no theoretical maximum) that you can have in your application. One bucket per account is perfectly doable, but you sacrifice some queryability by going that direction.

Otherwise, you may want to follow a more relational model of grouping like with like. In my view, Mongo sits half-way between relational databases and key-value stores, which means it is easier to conceptualize coming from a relational database world.
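
To make the trade-off concrete, here is a minimal sketch of the two layouts being discussed (the collection names, fields and index are made up for illustration):

    // one collection per tenant, as the question proposes
    db.newsletters_tenant_42.find({ sentAt: { $gt: ISODate("2012-01-01") } })

    // one shared collection, with the tenant identified by a field and covered by an index
    db.newsletters.ensureIndex({ tenantId: 1, sentAt: 1 })
    db.newsletters.find({ tenantId: 42, sentAt: { $gt: ISODate("2012-01-01") } })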

Gratin answered 25/3, 2012 at 15:14 Comment(7)
Each tenant has entities with fields common to all tenants, plus fields customized per tenant.Moffett
Could you please explain what you mean here: "it sounds like an awful lot of chaos to have that many collections in a single database, rather than grouping like data into its own collections"? What do you mean by "grouping like data into its own collections"? Do you think that having one collection per tenant can cause performance issues at some point? What about keeping no more than N collections per database and storing other tenants in other databases?Moffett
One of the problems with Riak buckets is that they have a size limit, something like 64MB, unlike tables in an RDBMS or collections in MongoDB, which can store any amount of data.Moffett
When I say "grouping like data" I mean grouping things that have similar fields. Let's say you have a collection of newsletters. You would group all of those newsletters into the same collection because they are all newsletters, even if they don't 'go together' because they belong to different tenants of your application. If you're familiar with relational modeling, think in that direction.Gratin
As for Riak buckets, where did you hear that? Each entry in a bucket has a maximum of 100MB (or something like that), but not each bucket. Mongo has a hard maximum of 16MB (currently) per entry. Technically Riak can hold more per "document" stored in a Riak bucket than Mongo can per document stored in a collection.Gratin
Do you mean it's better to store every entity (for example newsletters) of each tenant in one big collection than to store each tenant's entities in a separate collection? If so, what's the advantage? For example, if each tenant's newsletters live in a separate collection, there is no need to store a tenant ID on every document, and theoretically it's simpler to manage a small collection per tenant rather than maintain one huge index over a large collection holding every tenant's entities; but maybe a huge number of collections will cause performance issues in MongoDB.Moffett
Do you mean that there is no limit on bucket size in Riak, only a limit on the size of each object stored under a key in a bucket?Moffett

There seems to be a massive overhead to maintaining collections. I've just reduced a database that had around 1.5 million documents in 11,000 collections to one with the same number of documents in around 300 collections; this reduced the size of the database from 8GB to 1GB. I'm not familiar with the inner workings of MongoDB, so this may be obvious, but I thought it might be worth noting in this context.

Nashom answered 22/1, 2013 at 14:43 Comment(1)
It depends on which Mongo version you are using; switch to the latest version and then test the same.Intersect
