Should I use a single or multiple database setup for a multi-client application? [closed]

9

66

I am working on a PHP application that intends to ease company workflow and project management, let's say something like Basecamp and GoPlan.

I am not sure what the best approach is, database-wise. Should I use a single database and add client-specific columns to each of the tables, or should I create a database for each new client? An important factor is automation: I want it to be dead simple to create a new client (and perhaps open up the possibility of clients signing up themselves).

Possible cons I can think of using one database:

  • Lack of extensibility
  • Security problems (although bugs shouldn't be there in the first place)

What are your thoughts on this? Do you have any ideas what solution the above companies are most likely to have chosen?

Protozoology answered 1/11, 2008 at 7:25 Comment(3)
I had the same question. Here are some of the answers I got. #69628 Check out the slides on LinkedIn's architecture. - Metallic
Did speed ever come into consideration as well? A database search with 1 million records will perform significantly better than one with a billion. I'm curious how you fared on this. - Starve
Possible duplicate of What are the advantages of using a single database for EACH client? - Heliotropism
S
38

I usually add ClientID to all tables and go with one database. But since a single database is usually hard to scale, I also make it possible to run on different database instances for some or all clients.

That way you can have a bunch of small clients in one database and the big ones on separate servers.

A key factor for maintainability, though, is keeping the schema identical in all databases. Managing schema versions is enough of a headache without introducing client-specific schemas.
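A minimal sketch of that layout, using Python's built-in sqlite3 module as a stand-in for the real database servers. The directory `CLIENT_DATABASE`, the database names, and the `projects` table are hypothetical, chosen only for illustration: small clients fall through to a shared database, one large client is routed to its own, and every database gets the same schema.

```python
import sqlite3

# Hypothetical directory: which database each client lives on.
# Clients not listed here share the default database.
DEFAULT_DATABASE = "shared"
CLIENT_DATABASE = {7: "big_client_7"}

# One in-memory SQLite database stands in for each server in this sketch.
connections = {
    name: sqlite3.connect(":memory:") for name in ("shared", "big_client_7")
}

SCHEMA = (
    "CREATE TABLE projects ("
    "id INTEGER PRIMARY KEY, client_id INTEGER NOT NULL, name TEXT NOT NULL)"
)
for conn in connections.values():
    conn.execute(SCHEMA)  # identical schema in every database


def connection_for(client_id):
    """Return the connection that holds this client's rows."""
    return connections[CLIENT_DATABASE.get(client_id, DEFAULT_DATABASE)]


def add_project(client_id, name):
    connection_for(client_id).execute(
        "INSERT INTO projects (client_id, name) VALUES (?, ?)", (client_id, name)
    )


def list_projects(client_id):
    rows = connection_for(client_id).execute(
        "SELECT name FROM projects WHERE client_id = ? ORDER BY id", (client_id,)
    )
    return [name for (name,) in rows]


add_project(1, "Website redesign")
add_project(2, "Payroll")
add_project(7, "Data warehouse")
```

Because the ClientID filter is kept even on the dedicated database, moving a client between databases is a matter of copying rows and updating the directory, not changing queries.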

Sifuentes answered 1/11, 2008 at 8:21 Comment(1)
Yeah, classic example of sharding. You can also move clients to a different database for maintenance, etc. The key is to build the tools to move data around and an API to find which server an account is on. Once that's done, the sky is the limit. - Monochloride
P
36

Listen to the Stack Overflow podcast where Joel and Jeff talk about this very question. Joel talks about their experience offering a hosted version of their software. He points out that adding client IDs all over your DB complicates the design and code (are you sure you didn't accidentally forget to add it to some WHERE clause?) and complicates hosting features, such as client-specific backups.

It was in episode #20 or #21 (check the transcripts for details).
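One common way to reduce the forgotten-WHERE-clause risk in a shared database is to route every query through a small wrapper that adds the client filter itself. The sketch below is only an illustration of that idea, not something described in the podcast; the `ClientScope` class, the `tasks` table, and the `client_id` column name are assumptions.

```python
import sqlite3


class ClientScope:
    """Runs queries for exactly one client; callers never write the filter."""

    def __init__(self, conn, client_id):
        self._conn = conn
        self._client_id = client_id

    def select(self, table, columns):
        # Table and column names must come from application code, never from
        # user input, because they are interpolated into the SQL text.
        sql = f"SELECT {', '.join(columns)} FROM {table} WHERE client_id = ?"
        return self._conn.execute(sql, (self._client_id,)).fetchall()

    def insert(self, table, values):
        # The client_id column is always filled in from the scope.
        columns = ["client_id", *values]
        placeholders = ", ".join("?" for _ in columns)
        sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
        self._conn.execute(sql, (self._client_id, *values.values()))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks ("
    "id INTEGER PRIMARY KEY, client_id INTEGER NOT NULL, title TEXT NOT NULL)"
)

acme = ClientScope(conn, 1)
rival = ClientScope(conn, 2)
acme.insert("tasks", {"title": "Ship v2"})
rival.insert("tasks", {"title": "Hire designer"})
```

The wrapper does not remove the extra design complexity Joel describes; it only concentrates the filter in one place so it can be reviewed and tested once.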

Peppel answered 1/11, 2008 at 16:19 Comment(1)
It's episode #19 @ [50:45] => stackoverflow.fogbugz.com/default.asp?W24218 - Steven
A
24

In my view, it will depend on your likely customer base. If you could get into a situation where arch-rivals are both using your system, then you would be better off with separate databases. It also depends on how multiple databases get implemented by your DBMS. If each database has a separate copy of the infrastructure, then that suggests a single database (or a change of DBMS). If multiple databases can be served by a single copy of the infrastructure, then I'd go for separate databases.

Think of database backup. Customer A says "Please send me a copy of my data". Much, much easier in a separate database setup than if a single database is shared. Think of removing a customer; again, much easier with separate databases.
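To illustrate why these operations are simpler with separate databases, here is a rough sketch in which one SQLite file stands in for each customer's database. The directory layout, function names, and `invoices` table are hypothetical; a MySQL setup would use its own dump and drop tools rather than file copies.

```python
import os
import shutil
import sqlite3
import tempfile

data_dir = tempfile.mkdtemp()  # stands in for the database server's storage


def database_path(customer):
    return os.path.join(data_dir, f"{customer}.db")


def create_customer(customer):
    conn = sqlite3.connect(database_path(customer))
    conn.execute(
        "CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)"
    )
    conn.commit()
    conn.close()


def export_customer(customer, destination):
    # "Please send me a copy of my data": copy one database, no row filtering.
    shutil.copyfile(database_path(customer), destination)


def remove_customer(customer):
    # Removing a customer is removing their database, nothing else is touched.
    os.remove(database_path(customer))


create_customer("customer_a")
create_customer("customer_b")
export_path = os.path.join(data_dir, "customer_a_export.db")
export_customer("customer_a", export_path)
remove_customer("customer_b")
```

In a shared database, the same two operations would need a filtered export and a delete against every table that holds customer rows.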

(The 'infrastructure' part is mealy-mouthed because there are major differences between different DBMS about what constitutes a 'database' versus a 'server instance', for example. Add: The question is tagged 'mysql', so maybe those thoughts aren't completely relevant.)

Add: One more issue - with multiple customers in a single database, every SQL query is going to need to ensure that the data for the correct customer is chosen. That means that the SQL is going to be harder to write, and read, and the DBMS is going to have to work harder on processing the data, and indexes will be bigger, and ... I really would go with a separate database per customer for many purposes.

Clearly, StackOverflow (as an example) does not have a separate database per user; we all use the same database. But if you were running accounting systems for different companies, I don't think it would be acceptable (to the companies, and possibly not to the legal people) to share databases.

Anglesite answered 1/11, 2008 at 15:58 Comment(0)
S
15
  • DEVELOPMENT For rapid development, use a database per customer. Think how easy it will be to back up, restore, or delete a customer's data, or to measure, monitor, and bill usage. You won't need to write code to do it yourself; just use your database primitives.

  • PERFORMANCE For performance, use a database for all. Think about connection pooling, shared memory, caching, etc.

  • BUSINESS If your business plan is to have lots of small customers (think Hotmail), you should probably work on a single DB and have all administrative tasks, such as registration, deletion, and data migration, fully automated and exposed in a friendly interface. If you plan to have dozens or up to a few hundred big customers, then you can work with one DB per customer and have system administration scripts in place that can be operated by your customer support staff.
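For the one-DB-per-customer route, the kind of administration script mentioned above might look like the following sketch. It assumes a small catalog database that records which database belongs to which customer; the catalog, the `sign_up` function, the file naming, and the schema are all hypothetical, and SQLite files stand in for real databases.

```python
import os
import sqlite3
import tempfile

# The schema every customer database is created from (illustrative only).
SCHEMA = """
CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tasks (
    id INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES projects(id),
    title TEXT NOT NULL
);
"""

data_dir = tempfile.mkdtemp()
catalog = sqlite3.connect(os.path.join(data_dir, "catalog.db"))
catalog.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, db_file TEXT NOT NULL)"
)


def sign_up(name):
    """Register a customer and create their database in one step."""
    cursor = catalog.execute(
        "INSERT INTO customers (name, db_file) VALUES (?, '')", (name,)
    )
    customer_id = cursor.lastrowid
    db_file = os.path.join(data_dir, f"customer_{customer_id}.db")

    customer_db = sqlite3.connect(db_file)
    customer_db.executescript(SCHEMA)  # same schema for every customer
    customer_db.close()

    catalog.execute(
        "UPDATE customers SET db_file = ? WHERE id = ?", (db_file, customer_id)
    )
    catalog.commit()
    return customer_id


first = sign_up("Acme")
second = sign_up("Globex")
```

With a script like this in place, self-service sign-up is a matter of calling the same function from the registration form.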

Soissons answered 15/2, 2009 at 12:54 Comment(0)
H
11

For multitenancy, performance will typically increase the more resources you manage to share across tenants, see

http://en.wikipedia.org/wiki/Multitenancy

So if you can, go with the single database. I agree that security problems would only occur due to bugs, as you can implement all access control in the application. In some databases, you can still use the database access control by careful use of views (so that each authenticated user gets a different view).

There are ways to provide extensibility also. For example, you could create a single table with extension attributes (keyed by tenant, base record, and extension attribute id). Or you can create per-tenant extension tables, so that each tenant has his own extension schema.
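The first option, a single extension table, could look roughly like this. The table and column names and the attribute IDs are made up for the example; the key is the (tenant, base record, extension attribute id) triple described above.

```python
import sqlite3

# Hypothetical attribute IDs; a real system would keep these in a lookup table.
COST_CENTER = 1
PRIORITY = 2

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (
    id INTEGER PRIMARY KEY,
    tenant_id INTEGER NOT NULL,
    name TEXT NOT NULL
);
CREATE TABLE project_extensions (
    tenant_id INTEGER NOT NULL,
    project_id INTEGER NOT NULL REFERENCES projects(id),
    attribute_id INTEGER NOT NULL,
    value TEXT,
    PRIMARY KEY (tenant_id, project_id, attribute_id)
);
""")


def set_extension(tenant_id, project_id, attribute_id, value):
    # Overwrites any existing value for the same (tenant, record, attribute).
    conn.execute(
        "INSERT OR REPLACE INTO project_extensions "
        "(tenant_id, project_id, attribute_id, value) VALUES (?, ?, ?, ?)",
        (tenant_id, project_id, attribute_id, value),
    )


def get_extensions(tenant_id, project_id):
    rows = conn.execute(
        "SELECT attribute_id, value FROM project_extensions "
        "WHERE tenant_id = ? AND project_id = ?",
        (tenant_id, project_id),
    )
    return dict(rows.fetchall())


conn.execute("INSERT INTO projects (id, tenant_id, name) VALUES (1, 1, 'Office move')")
conn.execute("INSERT INTO projects (id, tenant_id, name) VALUES (2, 2, 'Audit')")
set_extension(1, 1, COST_CENTER, "CC-100")
set_extension(1, 1, COST_CENTER, "CC-204")  # replaces the earlier value
set_extension(2, 2, PRIORITY, "high")
```

The base `projects` table stays identical for all tenants; only the rows in the extension table differ from tenant to tenant.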

Horne answered 1/11, 2008 at 8:4 Comment(0)
W
7

When you're designing a multi-tenant database, you generally have three options:

  1. Have one database per tenant
  2. Have one schema per tenant
  3. Have all tenants share the same table(s)

The option you pick has implications on scalability, extensibility and isolation. These implications have been widely discussed across different StackOverflow questions and database articles.

In practice, each of the three design options can, with enough effort, address questions around scale, data that varies across tenants, and isolation. The decision depends on the primary dimension you’re building for. The summary:

  • If you're building for scale: Have all tenants share the same table(s)
  • If you're building for isolation: Create one database per tenant

For example, Google and Salesforce follow the first pattern (building for scale) and have their tenants share the same tables. Stack Overflow, on the other hand, follows the second pattern (building for isolation) and keeps one database per tenant. The second approach is also more commonplace in regulated industries, such as healthcare.

The decision comes down to the primary dimension you're optimizing your database design for. This article on designing your SaaS database for scale talks about the trade-offs and provides a summary in the context of PostgreSQL.

Whitefly answered 8/10, 2016 at 18:38 Comment(0)
R
5

Another point to consider is that you may have a legal obligation to keep one company's data separate from another's.

Rickettsia answered 1/11, 2008 at 20:52 Comment(0)
G
4

Having a database per client generally does not scale well. MySQL (and probably other databases) holds resources open per table, which does not lend itself well to 10k+ tables on one instance, as would happen in a large-scale multitenancy situation.

Of course, if you have some other issue which causes other problems before you get to this level, this may not be relevant.

Additionally, "sharding" a multi-tenant application is likely† to be the right thing to do eventually as your application gets bigger and bigger.

Sharding does not, however, mean one database (or instance) per tenant, but one per shard or set of shards, each of which may hold several tenants. You will need to discover the right tuning parameters for yourself, probably in production (hence it probably needs to be pretty tunable from the outset).

† I can't guarantee it.
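The difference between a shard and a tenant can be sketched with a small directory that packs several tenants onto each shard. The `ShardDirectory` class, the shard names, and the first-fit placement rule are assumptions for illustration; `tenants_per_shard` plays the role of the tuning parameter mentioned above.

```python
from collections import Counter


class ShardDirectory:
    """Maps tenants to shards; each shard may hold several tenants."""

    def __init__(self, shard_names, tenants_per_shard):
        self.shard_names = list(shard_names)
        self.tenants_per_shard = tenants_per_shard  # tunable at runtime
        self.assignments = {}

    def assign(self, tenant_id):
        """Place a tenant on the first shard with spare capacity."""
        if tenant_id in self.assignments:
            return self.assignments[tenant_id]
        load = Counter(self.assignments.values())
        for shard in self.shard_names:
            if load[shard] < self.tenants_per_shard:
                self.assignments[tenant_id] = shard
                return shard
        raise RuntimeError("all shards full: add a shard or raise the limit")

    def shard_for(self, tenant_id):
        return self.assignments[tenant_id]


directory = ShardDirectory(["shard_a", "shard_b"], tenants_per_shard=2)
placements = [directory.assign(tenant_id) for tenant_id in (1, 2, 3)]
```

Existing tenants keep their placement when the limit changes, so the parameter can be adjusted in production without moving data.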

Gynaeceum answered 1/11, 2008 at 21:7 Comment(0)
P
0

You can start with a single database and partition it as the application grows. If you do this, there are a few things I would recommend:

1) Design the database in a way that it can be easily partitioned. For example, if customers are going to share data, make sure that data is easily replicated across each database.

2) When you have only one database, make sure it is being backed up to another physical server. In the event of a failure, you can redirect traffic to this other server and still have your data intact.

Pops answered 2/1, 2009 at 23:15 Comment(1)
What do you mean in 1) by 'if customers are going to share data'? I am facing a case where data has to be shared across customers so that a governing entity can access it. How would you design it then? - Disclaimer
