Database efficiency - table per user vs. table of users
Asked Answered
D

6

41

For a website having users. Each user having the ability to create any amount of, we'll call it "posts":

Efficiency-wise - is it better to create one table for all of the posts, saving the user-id of the user which created the post, for each post - OR creating a different separate table for each user and putting there just the posts created by that user?

Duala answered 25/9, 2011 at 8:52 Comment(2)
Search this site with the terms [database] "table for each" for a long list of different takes on this questions.Northey
if the issue is getting the posts for a given user quickly, use an index on posts(user_id) or the like. if you have an schema and some sample queries that you need to run quickly, it would be better to tell us that schema and those queries and ask us what indexes should exist.Pentavalent
V
50

The database layout should not change when you add more data to it, so the user data should definitely be in one table.

Also:

  • Having multiple tables means that you have to create queries dynamically.

  • The cached query plan for one table won't be used for any other of the tables.

  • Having a lot of data in one table doesn't affect performance much, but having a lot of tables does.

  • If you want to add an index to the table to make queries faster, it's a lot easier to do on a single table.

Vimineous answered 25/9, 2011 at 9:8 Comment(0)
D
13

Well to answer the specific question: In terms of efficiency of querying, it will always be better to have small tables, hence a table per user is likely to be the most efficient.

However, unless you have a lot of posts and users, this is not likely to matter. Even with millions of rows, you will get good performance with a well-placed index.

I would strongly advise against the table-per-user strategy, because it adds a lot of complexity to your solution. How would you query when you need to find, say, users that have posted on a subject within the year ?

Optimize when you need to. Not because you think/are afraid something will be slow. (And even if you need to optimize, there will be easier options than table-per-user)

Durango answered 25/9, 2011 at 9:1 Comment(5)
I disagree with "always" - give me a count of all posts per user. Writing a UNION query that will aggregate that for you will not be fun, nor will it be efficient.Scutcheon
You could use a View for thisMaurya
"Optimize when you need to. Not because you think/are afraid something will be slow." loved it!Intermolecular
Is there a such thing as over optimizing? I'd think optimizing when things are small would be a good idea, instead of waiting for things to get large and then decide you need to optimize and have to refactor a lot of code/queries. I think starting out with an optimized approach would be a better idea.Inopportune
@c0dezer019, there is such a thing as premature optimization. Most people don't need a shed for their bicycles, and they won't get the remodeling of their house done if they're busy bike-shedding. Always measure first and go after proven bottlenecks.Murry
N
8

Schemas with a varying number of tables are, generally, bad. Use one single table for your posts.

Northey answered 25/9, 2011 at 8:55 Comment(0)
A
5

If performance is a concern, you should learn about database indexes. While indexes is not part of the SQL standard, nearly all databases support them to help improve performance.

I recommend that you create a single table for all users' posts and then add an indexes to this table to improve the performance of searching. For example you can add an index on the user column so that you can quickly find all posts for a given user. You may also want to consider adding other indexes, depending on your application's requirements.

Atween answered 25/9, 2011 at 9:0 Comment(0)
K
4

Your first proposal of having a single user and a single post table is the standard approach to take.

At the moment posts may be the only user-specific feature on your site, but imagine that it might need to grow in the future to support users having messages, preferences, etc. Now your separate table-per-user approach leads to an explosion in the number of tables you'd need to create.

Krystinakrystle answered 25/9, 2011 at 9:12 Comment(0)
D
0

I have a similar but different issue with your answer because both @guffa and @driis are assuming that the "posts" need to be shared among users.

In my particular situation: not a single user datapoint can be shared for privacy reason with any other user not even for analytics.

We plan on using mysql or postgres and here are the three options our team is warring about:

N schema and 5 tables - some of our devs feel that this is the best direction to make to keep the data completely segregated. Pros - less complexity if you think of schema as a folder and tables as files. We'll have one schema per user Cons - most ORMs do connection pooling per schema

1 schema and nx5 tables - some devs like this because it allows for connection pooling but appears to make the issue more complex. Pros - connection pooling in the ORM is possible Cons - cannot find an ORM where Models are set up for this

1 schema and 5 tables - some devs like this because they think we benefit from caching.

Pros: ORMs are happy because this is what they are designed to do Cons: every query requires the username table

I, personally, land in camp 1: n schemas. My lead dev lands in camp 3: 1 schema 5 tables.

Caching: If data is always 1:1, I cannot see how caching will ever help regardless of the solution we use because each user will be searching for different info.

Any thoughts?

Disinfest answered 22/11, 2020 at 19:38 Comment(1)
@guffa - I cannot apply your answer logic to my issue...so, can you provide other reasons?Disinfest

© 2022 - 2025 — McMap. All rights reserved.