SQL one to one relationship vs. single table
Asked Answered
A

6

24

Consider a data structure such as the below where the user has a small number of fixed settings.

User

[Id] INT IDENTITY NOT NULL,
[Name] NVARCHAR(MAX) NOT NULL,
[Email] VNARCHAR(2034) NOT NULL

UserSettings

[SettingA],
[SettingB],
[SettingC]

Is it considered correct to move the user's settings into a separate table, thereby creating a one-to-one relationship with the users table? Does this offer any real advantage over storing it in the same row as the user (the obvious disadvantage being performance).

Andean answered 27/6, 2014 at 10:29 Comment(4)
Does a given Id always have a SettingA, a SettingB and a SettingC? Or can it have just some of them? When an Id has a certain setting, is it always known that it is specifically one of those? Or could it be put under any of them? Were you going to make them nullable? (Your use of "normalize" is dubious (see below). I don't want to make any assumptions from your use of "1:1".)Disbranch
Yes, all users have exactly the same settings.Andean
So I guess you mean: yes, no to the or, yes, no to the or, and no. In which respect, then, SettingA, a SettingB and a SettingC are each just like Name and Email.Disbranch
Somewhat related to your question: Why use a 1-to-1 relationship in database design?.Curve
M
29

You would normally split tables into two or more 1:1 related tables when the table gets very wide (i.e. has many columns). It is hard for programmers to have to deal with tables with too many columns. For big companies such tables can easily have more than 100 columns.

So imagine a product table. There is a selling price and maybe another price which was used for calculation and estimation only. Wouldn't it be good to have two tables, one for the real values and one for the planning phase? So a programmer would never confuse the two prices. Or take logistic settings for the product. You want to insert into the products table, but with all these logistic attributes in it, do you need to set some of these? If it were two tables, you would insert into the product table, and another programmer responsible for logistics data would care about the logistic table. No more confusion.

Another thing with many-column tables is that a full table scan is of course slower for a table with 150 columns than for a table with just half of this or less.

A last point is access rights. With separate tables you can grant different rights on the product's main table and the product's logistic table.

So all in all, it is rather rare to see 1:1 relations, but they can give a clearer view on data and even help with performance issues and data access.

EDIT: I'm taking Mike Sherrill's advice and (hopefully) clarify the thing about normalization.

Normalization is mainly about avoiding redundancy and relateded lack of consistence. The decision whether to hold data in only one table or more 1:1 related tables has nothing to do with this. You can decide to split a user table in one table for personal information like first and last name and another for his school, graduation and job. Both tables would stay in the normal form as the original table, because there is no data more or less redundant than before. The only column used twice would be the user id, but this is not redundant, because it is needed in both tables to identify a record.

So asking "Is it considered correct to normalize the settings into a separate table?" is not a valid question, because you don't normalize anything by putting data into a 1:1 related separate table.

Misprision answered 27/6, 2014 at 12:20 Comment(3)
You are mostly right, but I would respectfully disagree with your point about making things simpler for the programmer - that's usually done by hiding the "raw" tables behind some sort of API (e.g. views and stored procedures). Having to worry about properly connecting rows from two tables will, if anything, complicate things for the programmer.Curve
"You would normally split tables into two or more 1:1 related tables when the table gets very wide . . ." Maybe. But normalization doesn't tell you to do this.Invective
@Mike Sherrill 'Cat Recall': You are right, normalization doesn't tell me to do this. It doesn't say anything to this. This is nothing to do with normalization. The normal form stays the same when splitting a table. Is this what you are saying? That the request is tagged and titled incorrectly, because this is no normalization topic?Misprision
S
5

You're all wrong :) Just kidding.

On a high load, high volume, heavily updated system splitting a table by 1:1 helps optimize I/O.

For example, this way you can place heavily read columns onto separate physical hard-drives to speed-up parallel reads (the 1-1 tables have to be in different "filegroups" for this). Or you can optimize table-level locks. Etc. Etc.

But this type of optimization usually does not happen until you have millions of rows and huge read/write concurrency

Symmetry answered 16/10, 2019 at 18:0 Comment(0)
U
4

Creating a new table with 1-1 relationships is not a reasonable solution. You might need to do it sometimes, but there would typically be no reason to have two tables where the user id is the primary key.

On the other hand, splitting the settings into a separate table with one row per user/setting combination might be a very good idea. This would be a three-table solution. One for users, one for all possible settings, and one for the junction table between them.

The junction table can be quite useful. For instance, it might contain the effective and end dates of the setting.

However, this assumes that the settings are "similar" to each other, in a SQL sense. If the settings are different such as:

  • Preferred location as latitude/longitude
  • Preferred time of day to receive an email
  • Flag to be excluded from certain contacts

Then you have a data-type problem when storing them in a table. So, the answer is "it depends". A lot of the answer depends on what the settings look like, how they will be used, and the type of constraints on them.

Unpaged answered 27/6, 2014 at 11:7 Comment(0)
H
0

It makes more sense that your settings are not only in a separate table, but also use a on-to-many relationship between the ID and Settings. This way, you could potentially have a as many (or as few) setting as required.

UserSettings

[Settings_ID]
[User_ID]
[Settings]

In fact, one could make the same argument for the [Email] field.

Hindustani answered 27/6, 2014 at 11:1 Comment(2)
That is the first step toward the EAV antipattern. And the downside of this first step is you lose the typechecking on the [Settings] column/attribute.Shena
Oh, I should have been clearer. I mean that each row be a group of settings, rather than a single attribute. So you could have multiple groups of same-typed settings. Mind you, it still does indeed depend on the nature of the OP's settings in the first place.Hindustani
S
0

Splitting tables into distinct tables with 1:1 relationships between them is usually not practiced, because :

If the relationship is really 1:1, then integrity enforcement boils down to "inserts being done in all concerned tables, or none at all". Achieving this on the server side requires systems that support deferred constraint checking, and AFAIK that's a feature of the rather high-end systems. So in many cases the 1:1 enforcement is pushed over to the application side, and that approach has its own obvious downsides.

A case when splitting tables is nonetheless advisable, is when there are security perspectives, i.e. when not all columns can be updated by one user. But note that by definition, in such cases the relationship between the tables can never be strictly 1:1 .

(I also suggest you read carefully the discussion between Thorsten/Mike. You used the word 'normalization' but normalization has very little to do with your scenario - except if you were considering 6NF, which I think is rather unlikely.)

Shena answered 28/6, 2014 at 15:39 Comment(0)
M
0

In addition to the arguments given in the other answers, it should be mentioned that: A one-to-one relationship is a natural consequence of tph, tpt, tpc with entity splitting and other omr strategies:

https://learn.microsoft.com/en-us/ef/core/modeling/inheritance

https://learn.microsoft.com/en-us/ef/core/modeling/table-splitting

https://www.codeproject.com/Articles/100109/Mixing-Table-Per-Hierarchy-and-Entity-Splitting

These strategies are used with a generic repository.

https://dotnettutorials.net/lesson/generic-repository-pattern-csharp-mvc/

You can read about the pros and cons of these patterns and strategies in many other answers and blogs:

Entity Framework 5: Inheritance and Polymorphic Associations

Repository Pattern pros and cons of each implementation

https://goodjava.org/blog?rec_id=17 https://www.c-sharpcorner.com/article/the-downsides-of-using-the-repository-pattern/

To sum up all the pros and cons, there are more advantages to using a one-to-one strategy for the context in question.

The one-to-one strategy is a natural consequence of the tpt, tph, tpc strategies and using generic repository patterns when column length increase.

Depending on the database engine, exceeding the critical mass of the number of columns results in significant performance losses:

https://dba.stackexchange.com/questions/155874/insert-performance-degradation-in-sql-server-after-a-certain-number-of-columns

mysql - how many columns is too many?

Does a fat table/more columns affect performance in sql

etc.

Splitting into multiple tables reduces the likelihood of tables locking during an update.

Mouton answered 28/6, 2024 at 7:35 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.