Relational database design question - Surrogate-key or Natural-key?
Asked Answered
R

10

16

Which one is the best practice and Why?

a) Type Table, Surrogate/Artificial Key

Foreign key is from user.type to type.id: alt text

b) Type Table, Natural Key

Foreign key is from user.type to type.typeName: alt text

Richelle answered 19/9, 2010 at 22:27 Comment(1)
Thanks about this clear and beautiful question. I have used it on 'Introducció a les bases de dades' exercices with your attribution.Dorfman
A
20

I believe that in practice, using a natural key is rarely the best option. I would probably go for the surrogate key approach as in your first example.

The following are the main disadvantages of the natural key approach:

  • You might have an incorrect type name, or you may simply want to rename the type. To edit it, you would have to update all the tables that would be using it as a foreign key.

  • An index on an int field will be much more compact than one on a varchar field.

  • In some cases, it might be difficult to have a unique natural key, and this is necessary since it will be used as a primary key. This might not apply in your case.

Alexandra answered 19/9, 2010 at 22:34 Comment(1)
Generally, the best solution is both - have the surrogate key for referential integrity, and the natural key (generally, it's compound) to enforce the tuple's uniqueness. So in the example above, a unique constraint/index would exist on type.typename and on some combination of columns in user.Locke
B
9

The first one is more future proof, because it allows you to change the string representing the type without updating the whole user table. In other words you use a surrogate key, an additional immutable identifier introduced for the sake of flexibility.

Bubo answered 19/9, 2010 at 22:32 Comment(0)
V
4

A good reason to use a surrogate key (instead of a natural key like name) is when the natural key isn't really a good choice in terms of uniqueness. In my lifetime i've known no fewer than 4 "Chris Smith"s. Person names are not unique.

Velvetvelveteen answered 19/9, 2010 at 22:32 Comment(0)
D
4

I prefer to use the surrogate key. It is often people will identity and use the natural key which will be fine for a while, until they decide they want to change the value. Then problems start.

Dodecahedron answered 19/9, 2010 at 22:38 Comment(1)
in databases that properly support foreign key semantics, changing this is only tricky from the point of getting the schema right in the first place (update cascades in the right places). If your database doesn't support this well, then you are right in avoiding natural keys.Velvetvelveteen
A
3

You should probably always use an ID number (that way if you change the type name, you don't need to update the user table) it also allows you to keep your datasize down, as a table full of INTs is much smaller than one full of 45 character varchars.

Annmarie answered 19/9, 2010 at 22:30 Comment(0)
M
2

If typeName is a natural key, then it's probably the preferable option, because it won't require a join to get the value.

You should only really use a surrogate key (id) when the name is likely to change.

Messeigneurs answered 19/9, 2010 at 22:29 Comment(0)
P
1

Surrogate key for me too, please.

The other might be easier when you need to bang out some code, but it will eventually be harder. Back in the day, my tech boss decided using an email addr as a primary key was a good idea. Needless to say, when people wanted to change their addresses it really sucked.

Phiz answered 19/9, 2010 at 23:14 Comment(0)
C
1

Use natural keys whenever they work. Names usually don't work. They are too mutable.

If you are inventing your own data, you might as well invent a syntheic key. If you are building a database of data provided by other people or their software, analyze the source data to see how they identify things that need identification.

If they are managing data at all well, they will have natural keys that work for the important stuff. For the unimportant stuff, suit yourself.

Corpulence answered 22/9, 2010 at 10:31 Comment(0)
A
1

well i think surrgote key is helpful when you don't have any uniquely identified key whose value is related and meaningful as is to be its primary key... moreover surrgote key is easier to implement and less overhead to maintain.

but on the other hand surrgote key is sometimes make extra cost by joining tables. think about 'User' ... I have

UserId varchar(20), ID int, Name varchar(200)

as the table structure.

now consider that i want to take a track on many tables as who is inserting records... if i use Id as a primary key, then [1,2,3,4,5..] etc will be in foreign tables and whenever i need to know who is inserting data i've to join User Table with it because 1,2,3,4,5,6 is meaningless. but if i use UserId as a primary key which is uniquely identified then on other foreign tables [john, annie, nadia, linda123] etc will be saved which is sometimes easily distinguishable and meaningful . so i need not to join user table everytime when i do query.

but mind it, it takes some extra physical space as varchar is saved in foreign tables which takes extra bytes.. and ofcourse indexing has a significant performance issue where int performs better rather than varchar

Alunite answered 14/2, 2011 at 5:55 Comment(0)
A
-1

Surrogate key is a substitution for the natural primary key. It is just a unique identifier or number for each row that can be used for the primary key to the table. The only requirement for a surrogate primary key is that it is unique for each row in the table.

It is useful because the natural primary key (i.e. Customer Number in Customer table) can change and this makes updates more difficult.

Anisotropic answered 16/8, 2016 at 1:2 Comment(1)
5 years after question was asked and does not add anything useful.Mika

© 2022 - 2024 — McMap. All rights reserved.