Are foreign keys really necessary in a database design?
S

24

139

As far as I know, foreign keys (FK) are used to aid the programmer to manipulate data in the correct way. Suppose a programmer is actually doing this in the right manner already, then do we really need the concept of foreign keys?

Are there any other uses for foreign keys? Am I missing something here?

Scrope answered 20/8, 2008 at 20:18 Comment(15)
This comes up often around here. I blame Joel Spolsky:-). There are many good answers here; rather than retype mine, I'll just give you a link: https://mcmap.net/q/75628/-what-39-s-wrong-with-foreign-keysClerc
"Suppose a programmer is actually doing this in the right manner already" - I can't even imagine such a scenario.Bacciform
"Foreign Key" is an idea, not a technology. It's a relational rule. Your question is really about whether you should attempt to enforce the rule in your code or let the database help you. When concurrency is involved, it's best to let the database engine enforce the rule, since it's aware of EVERYTHING that happens in the database, while your code can not possibly be aware.Jessjessa
@Triynko, the concept of foreign keys is not a relational rule.Zaporozhye
@Jessjessa - To expand on what has been said, foreign keys are not a relational rule, they are a relationship ruleGreco
@lubos & cdeszaq. Actually, it IS a relational rule... it's a subset of rule 10 of Codd's "Twelve Commandments"... "Integrity Independence", which basically says that the RDBMS's relational integrity must be maintained independently of any application that accesses it, which is exactly what I was explaining in an easy-to-understand way. This rule is implemented by, among other things, foreign key constraints. So yes, the idea of a foreign key is "a" relational rule.Jessjessa
@Triynko, it's up to me whether I want to have referential integrity between two relations or not. Nowhere is it written that if I have relations Customers and Invoices, I must also have and maintain referential integrity between them. I think we both agree that it's a good idea to enforce referential integrity where applicable, but it's just a good idea, not a requirement of the relational model.Zaporozhye
@lubos: To clarify, you're talking about whether or not you're going to use a particular feature, but I'm talking about whether that feature's presence is necessary to have a complete, fully functional RDBMS. Referential constraints, if and when you choose to use them, are something that should be enforced within the RDBMS (rather than the application), so it's a feature that should be there, and in that sense it is a requirement of the relational model if you're going to develop a complete RDBMS.Jessjessa
A foreign key is like a remote primary key. When you have two related tables, a foreign key links the data from a separate table, that would otherwise be included (redundantly) in the original table. The relational model serves to eliminate redundancy by separating data out into related tables, so foreign keys are fundamental to the relational model. IMO, if a relation exists, then it SHOULD be enforced. If you choose not to enforce such a relation, then IMO your database sucks :), does not maintain a complete record of events, and may cripple your application with null values eventually.Jessjessa
@Triynko, please don't say relation if you mean relationship. It's confusing because, in the relational model, a relation is a set of tuples that share the same type, not a relationship. Also, when you talk about eliminating redundancy, you are not talking about the relational model; you are talking about database normalization or 3NF. A relational model must adhere to the minimal set of rules defined in 1NF. Referential integrity, foreign keys, and redundancy elimination are not rules of 1NF, and therefore they are not requirements of the relational model. Hopefully it makes sense now.Zaporozhye
@lubos: A relationship and a relation are the same thing. What you need to realize is that a relationship constrains two disjoint relations (tables) into a single logical relation. In other words, a relationship involves two relations, which when constrained properly are logically equivalent to a single relation. When you write a JOIN query, the result is a SINGLE relation (table). What you're implying is that the 'relational model' describes a basic spreadsheet, and I completely disagree.Jessjessa
From Wikipedia "Relational Model": "The relational model of data permits the database designer to create a consistent, logical representation of information. Consistency is achieved by including declared constraints in the database design, which is usually referred to as the logical schema. The theory includes a process of database normalization whereby a design with certain desirable properties can be selected from a set of logically equivalent alternatives."Jessjessa
...continued: "The consistency of a relational database is enforced, not by rules built into the applications that use it, but rather by constraints, declared as part of the logical schema and enforced by the DBMS for all applications. In general, constraints are expressed using relational comparison operators, of which just one, "is subset of" (⊆), is theoretically sufficient. In practice, several useful shorthands are expected to be available, of which the most important are candidate key (really, superkey) and foreign key constraints."Jessjessa
Let me rephrase "What you're implying is..." to the following (since the editing seems to be disabled): "What you're implying is.. that the 'relational model' does not encompass ideas dealing with maintaining the integrity of the relation when it happens to be physically disjoint for normalization purposes, but I think it does encompass that."Jessjessa
Note that Codd's "rules" do not require referential integrity constraints to be supported by a RDBMS. Rule 10 simply specifies that integrity constraints should be independently enforced - it doesn't say what type of constraints they should be. In any case, the "rules" are not and never were intended to be a definition of the relational model. It's clear that a database can be relational without having a single RI constraint in it and a RDBMS might even not support RI (it's just that it probably wouldn't have too many willing customers).Resinoid
U
111

Foreign keys help enforce referential integrity at the data level. They also improve performance because they're normally indexed by default.

Unfriendly answered 20/8, 2008 at 20:19 Comment(4)
If you need an index, create one; this should not be a primary reason for FKs. (In fact, in certain circumstances, such as more inserts than selects, maintaining an FK might be slower.)Shalna
That's a horrible answer. FKs generally add extra overhead rather than improve performance.Audwen
In SQL-Server, they are not indexed by default on either the referee or the referrer. sqlskills.com/blogs/kimberly/…Eponymous
Nor in Oracle; you have to create indexes (on the FK columns) yourself.Kass
R
66

Foreign keys can also help the programmer write less code using things like ON DELETE CASCADE. This means that if you have one table containing users and another containing orders or something, then deleting a user could automatically delete all orders that point to that user.
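
To make the cascade behaviour concrete, here is a minimal sketch. It assumes SQLite driven from Python's built-in sqlite3 module purely as a convenient stand-in engine; the users/orders tables and their columns are hypothetical names chosen to match the example above. Note that SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on for the connection.

```python
import sqlite3


def cascade_delete_demo():
    """Delete a user and return the orders that are left afterwards."""
    conn = sqlite3.connect(":memory:")
    # SQLite ignores FK constraints unless this is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        """
        CREATE TABLE orders (
            id      INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
            item    TEXT NOT NULL
        )
        """
    )
    conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob')")
    conn.execute(
        "INSERT INTO orders (user_id, item) VALUES (1, 'book'), (1, 'pen'), (2, 'lamp')"
    )
    # No explicit DELETE on orders: the constraint removes alice's rows for us.
    conn.execute("DELETE FROM users WHERE id = 1")
    remaining = conn.execute("SELECT user_id, item FROM orders ORDER BY id").fetchall()
    conn.close()
    return remaining


print(cascade_delete_demo())
```

Only the order belonging to the surviving user remains. Whether silently removing child rows is what you want is a separate design question.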

Rossman answered 20/8, 2008 at 20:22 Comment(5)
@Greg Hewgill This could potentially lead to a lot of problems. You should be very careful with things like DELETE CASCADE, as in many cases you would want to keep the orders created by a user when deleting the user.Ioannina
Although this should probably be handled in the business logic layer. Deciding whether or not to keep related child records is not quite the same as ensuring that no values violate foreign key relationships.Mexican
The other issue is auditing: if auditing is not done at the db level, cascading updates or deletes will invalidate your audit trail.Intramolecular
@Codewerks: Business logic can be in the DB.Stearic
@Ioannina I disagree. If you want to keep the orders of client X, then it makes no sense to delete client X to begin with, since it contains important data relevant to the order. You would only delete client X if you intend to also delete everything that references it, in which case ON DELETE CASCADE does exactly that. If you don't want a cascade, then you shouldn't delete at all.Flanigan
H
49

I can't imagine designing a database without foreign keys. Without them, eventually you are bound to make a mistake and corrupt the integrity of your data.

They are not required, strictly speaking, but the benefits are huge.

I'm fairly certain that FogBugz does not have foreign key constraints in the database. I would be interested to hear how the Fog Creek Software team structures their code to guarantee that they will never introduce an inconsistency.

Humoral answered 20/8, 2008 at 20:24 Comment(10)
Joel: "So far we've never had a problem." So far, I've never driven into a lamp-post. But I still think it's a good idea to wear seat belts ;-)Immure
Maybe you have never SEEN the problem, but maybe it's there... Most databases use a convention like id_xxx, which is exactly the same as ixXXXProust
@Joel: Naming conventions in place of enforcement of rules? Might as well do away with type while you're at it.Colza
@Eric: You're holding Fog Creek up as some sort of avatar of software development here. If you said "A company in New York City does not have foreign keys in their db ..." we'd all say "And?"Colza
@jcollum, I made that comment during the beta of Stack Overflow, when pretty much everybody here knew who Jeff and Joel were, and most were probably listening to the Podcast, so Fog Creek was on everybody's radar.Humoral
@jcollum: some would say "And?" while others would say "WTF?" (I just did), but I guess there's more than one way to skin a tooth. Nice use of "avatar", by the way. :)Papal
Eric: FogBugz uses a naming convention for foreign keys. For example ixBug is understood to be an index into the primary key of the table Bug. So far we've never had a problem. -- Joel SpolskyOccult
"So far we've never had a problem." -- Correction: You've never had a problem YOU KNOW OF--which is another great reason to let your tools help you.Reggy
To respond to Eric's actual question, my understanding is that Fog Creek software is hosted on servers controlled by Fog Creek, not shrink-wrap software released into the wild. This means they can ensure that their database is manipulated only via their applications software. In this context I surmise that there is an object model that enforces constraints.Immunize
Many forum DBs, like XenForo and phpBB, or even WordPress, don't use foreign keys. Using FKs causes performance issues, so many of them choose to handle orphan rows manually.Nepheline
O
47

A database schema without FK constraints is like driving without a seat belt.

One day, you'll regret it. Not spending that little extra time on the design fundamentals and data integrity is a surefire way of ensuring headaches later.

Would you accept code in your application that was that sloppy? Code that accessed the member objects and modified the data structures directly?

Why do you think this has been made hard and even unacceptable within modern languages?

Oliguria answered 20/8, 2008 at 22:29 Comment(3)
+1 for a good analogy between encapsulation and FK/PK relationships.Colza
@jumping_monkey Your edit was not a clarification of what the author said, it added something new, which is inappropriate. I rolled it back. It should be a comment suggestion to the author. Also it was not grammatically correct & it left out a space & it had unnecessary boldface, it was a bad edit for that content. Help centerInappetence
Hey @philipxy, thanks for checking that. I would argue that it is, as referential integrity(which is the whole point of foreign keys) is a subset of data integrity. That's the reason i added it, with the link to read more about it. Anyway, you can remove it, your call. I like Guy's answer, that's what is more important. Cheers.Improvement
U
23

Yes.

  1. They keep you honest
  2. They keep new developers honest
  3. You can do ON DELETE CASCADE
  4. They help you to generate nice diagrams that explain the links between tables on their own
Unriddle answered 20/8, 2008 at 20:37 Comment(2)
what do you mean by honesty?Flickinger
Honest about the design, I guess. They prevent you from cheating with the data by doing quick and lame programming.Parfitt
D
16

"Suppose a programmer is actually doing this in the right manner already"

Making such a supposition seems to me to be an extremely bad idea; in general software is phenomenally buggy.

And that's the point, really. Developers can't get things right, so ensuring the database can't be filled with bad data is a Good Thing.

Although in an ideal world, natural joins would use relationships (i.e. FK constraints) rather than matching column names. This would make FKs even more useful.

Decury answered 20/8, 2008 at 20:35 Comment(1)
Good point, it would be nice to join two tables with "ON [Relationship]" or some other keyword and let the db figure out what columns are involved. Seems pretty reasonable really.Colza
D
15

Personally, I am in favor of foreign keys because they formalize the relationship between the tables. I realize that your question presupposes that the programmer is not introducing data that would violate referential integrity, but I have seen way too many instances where referential integrity is violated, despite best intentions!

Before foreign key constraints (aka declarative referential integrity, or DRI), lots of time was spent implementing these relationships using triggers. The fact that we can formalize the relationship by a declarative constraint is very powerful.

@John - Other databases may automatically create indexes for foreign keys, but SQL Server does not. In SQL Server, foreign key relationships are only constraints. You must define your index on foreign keys separately (which can be of benefit).
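
The "constraint first, index separately" pattern can be sketched as follows. This is an illustration under assumptions: it uses SQLite through Python's sqlite3 module rather than SQL Server, because SQLite likewise creates no index on the referencing column by itself. The parent/child tables and the index name idx_child_parent_id are hypothetical.

```python
import sqlite3


def fk_index_demo():
    """Return the index names on the child table before and after adding one."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        """
        CREATE TABLE child (
            id        INTEGER PRIMARY KEY,
            parent_id INTEGER NOT NULL REFERENCES parent(id)
        )
        """
    )
    # PRAGMA index_list rows are (seq, name, unique, origin, partial).
    before = [row[1] for row in conn.execute("PRAGMA index_list('child')")]
    # Declaring the FK did not create an index; we add one explicitly.
    conn.execute("CREATE INDEX idx_child_parent_id ON child (parent_id)")
    after = [row[1] for row in conn.execute("PRAGMA index_list('child')")]
    conn.close()
    return before, after


print(fk_index_demo())
```

The first list is empty and the second contains only the index created by hand, which is the situation described for SQL Server above. Other engines may behave differently, so check yours.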

Edit: I'd like to add that, IMO, the use of foreign keys in support of ON DELETE or ON UPDATE CASCADE is not necessarily a good thing. In practice, I have found that cascade on delete should be carefully considered based on the relationship of the data -- e.g. do you have a natural parent-child relationship where this may be OK, or is the related table a set of lookup values? Using cascaded updates implies you are allowing the primary key of one table to be modified. In that case, I have a general philosophical disagreement in that the primary key of a table should not change. Keys should be inherently constant.

Dexamethasone answered 20/8, 2008 at 20:35 Comment(0)
B
10

Without a foreign key how do you tell that two records in different tables are related?

I think what you are referring to is referential integrity, where the child record is not allowed to be created without an existing parent record etc. These are often known as foreign key constraints - but are not to be confused with the existence of foreign keys in the first place.

Biddable answered 20/8, 2008 at 20:24 Comment(0)
S
10

I suppose you are talking about foreign key constraints enforced by the database. You probably already are using foreign keys, you just haven't told the database about it.

"Suppose a programmer is actually doing this in the right manner already, then do we really need the concept of foreign keys?"

Theoretically, no. However, there has never been a piece of software without bugs.

Bugs in application code are typically not that dangerous - you identify the bug and fix it, and after that the application runs smoothly again. But if a bug allows corrupt data to enter the database, then you are stuck with it! It's very hard to recover from corrupt data in the database.

Consider if a subtle bug in FogBugz allowed a corrupt foreign key to be written in the database. It might be easy to fix the bug and quickly push the fix to customers in a bugfix release. However, how should the corrupt data in dozens of databases be fixed? Correct code might now suddenly break because the assumptions about the integrity of foreign keys don't hold anymore.

In web applications you typically only have one program speaking to the database, so there is only one place where bugs can corrupt the data. In an enterprise application there might be several independent applications speaking to the same database (not to mention people working directly with the database shell). There is no way to be sure that all applications follow the same assumptions without bugs, always and forever.

If constraints are encoded in the database, then the worst that can happen with bugs is that the user is shown an ugly error message about some SQL constraint not satisfied. This is much preferable to letting corrupt data into your enterprise database, where it in turn will break all your applications or just lead to all kinds of wrong or misleading output.
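
A minimal sketch of that "ugly error instead of corrupt data" outcome, assuming SQLite via Python's sqlite3 module as a stand-in engine and hypothetical customers/invoices tables. The buggy application code here tries to write an invoice for a customer that does not exist.

```python
import sqlite3


def try_orphan_insert():
    """Attempt to insert a child row with no parent; report what happened."""
    conn = sqlite3.connect(":memory:")
    # SQLite ignores FK constraints unless this is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
    conn.execute(
        """
        CREATE TABLE invoices (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id)
        )
        """
    )
    conn.execute("INSERT INTO customers (id) VALUES (1)")
    try:
        # Customer 999 does not exist, so the engine refuses the row.
        conn.execute("INSERT INTO invoices (customer_id) VALUES (999)")
        outcome = "inserted"
    except sqlite3.IntegrityError as exc:
        outcome = f"rejected: {exc}"
    stored = conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
    conn.close()
    return outcome, stored


print(try_orphan_insert())
```

The insert fails loudly and nothing is stored, so the bug surfaces at the moment it happens rather than as bad data discovered months later.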

Oh, and foreign key constraints can also improve performance on engines that index them by default (not all do; SQL Server, for example, does not). I can't think of any reason not to use foreign key constraints.

Sheers answered 17/9, 2008 at 13:24 Comment(0)
M
8

Is there a benefit to not having foreign keys? Unless you are using a crappy database, FKs aren't that hard to set up. So why would you have a policy of avoiding them? It's one thing to have a naming convention that says a column references another, it's another to know the database is actually verifying that relationship for you.

Marva answered 22/8, 2008 at 15:40 Comment(3)
The benefit is performance. I'm not saying you should not have FKs, just strictly answering your question. Suppose you have a huge (100GB) table with an FK to another table. If you delete a record from "another table", the engine will scan the entire 100GB table to make sure you're not deleting anything useful, unless you have that FK column indexed (FKs are not indexed by default in SQL Server).Bowker
I'm not a db expert but I don't think that should be how you address performance problems. Like you said, you can index the FK column (which you'll realize pretty quickly that SQL Server doesn't do by default) and you do want the database to enforce that the record you're deleting isn't in use in your 100GB table.Marva
I (mostly) agree. Just wanted to mention that when you're managing databases tens of terabytes in size, dropping FKs is an unspoken common practice among DBAs. In essence, at this scale you're moving to "NoSQL land", where you have to drop one of the "A", "C", "I" or "D" out of the "ACID" principle.Bowker
M
8

FKs are very important and should always exist in your schema, unless you are eBay.

Macguiness answered 19/4, 2009 at 14:53 Comment(1)
that link is actually extremely fascinating... i'd truly like to know more details and i'm somewhat scared to use ebay now. for other people: click on the 4th question to see what he says about their db structure. the whole interview is worth watching, though. also... unibrowBlayne
H
6

I think some single thing at some point must be responsible for ensuring valid relationships.

For example, Ruby on Rails does not use foreign keys, but it validates all the relationships itself. If you only ever access your database from that Ruby on Rails application, this is fine.

However, if you have other clients which are writing to the database, then without foreign keys they need to implement their own validation. You then have two copies of the validation code which are most likely different, which any programmer should be able to tell is a cardinal sin.

At that point, foreign keys really are necessary, as they allow you to move the responsibility to a single point again.

Haematoid answered 20/8, 2008 at 20:28 Comment(1)
It's like an onion. FKs are the last layer of defense. Unless it's an embedded local database, apps trying to do referential integrity is always a bad idea.Khajeh
L
6

Foreign keys allow someone who has not seen your database before to determine the relationship between tables.

Everything may be fine now, but think what will happen when your programmer leaves and someone else has to take over.

Foreign keys will allow them to understand the database structure without trawling through thousands of lines of code.

Latarsha answered 17/9, 2008 at 4:44 Comment(0)
I
5

"As far as I know, foreign keys are used to aid the programmer to manipulate data in the correct way."

FKs allow the DBA to protect data integrity from the fumbling of users when the programmer fails to do so, and sometimes to protect against the fumbling of programmers.

"Suppose a programmer is actually doing this in the right manner already, then do we really need the concept of foreign keys?"

Programmers are mortal and fallible. FKs are declarative which makes them harder to screw up.

"Are there any other uses for foreign keys? Am I missing something here?"

Although this is not why they were created, FKs provide strong reliable hinting to diagramming tools and to query builders. This is passed on to end users, who desperately need strong reliable hints.

Immunize answered 6/10, 2008 at 21:26 Comment(0)
N
4

They are not strictly necessary, in the way that seatbelts are not strictly necessary. But they can really save you from doing something stupid that messes up your database.

It's so much nicer to debug a FK constraint error than have to reconstruct a delete that broke your application.

Northwest answered 20/8, 2008 at 20:57 Comment(0)
P
4

They are important, because your application is not the only way data can be manipulated in the database. Your application may handle referential integrity as honestly as it wants, but all it takes is one bozo with the right privileges to come along and issue an insert, delete or update command at the database level, and all your application referential integrity enforcement is bypassed. Putting FK constraints in at the database level means that, barring this bozo choosing to disable the FK constraint before issuing their command, the FK constraint will cause a bad insert/update/delete statement to fail with a referential integrity violation.

Postlude answered 17/9, 2008 at 19:9 Comment(0)
C
3

I think about it in terms of cost/benefit... In MySQL, adding a constraint is a single additional line of DDL. It's just a handful of key words and a couple of seconds of thought. That's the only "cost" in my opinion...

Tools love foreign keys. Foreign keys prevent bad data (that is, orphaned rows) that may not affect business logic or functionality and therefore goes unnoticed and builds up. They also prevent developers who are unfamiliar with the schema from implementing entire chunks of work without realizing they're missing a relationship. Perhaps everything is great within the scope of your current application, but if you missed something and someday something unexpected is added (think fancy reporting), you might be in a spot where you have to manually clean up bad data that's been accumulating since the inception of the schema without a database-enforced check.
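
To illustrate what that manual cleanup tends to start with, here is a sketch of an orphan hunt. It assumes SQLite via Python's sqlite3 module and a hypothetical customers/orders schema that deliberately declares no foreign key, so orphaned rows can slip in unnoticed.

```python
import sqlite3


def find_orphans():
    """Return (order id, customer id) pairs whose customer does not exist."""
    conn = sqlite3.connect(":memory:")
    # Deliberately no REFERENCES clause: nothing checks customer_id.
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'alice')")
    # Orders 11 and 12 point at customers that were never created.
    conn.execute("INSERT INTO orders (id, customer_id) VALUES (10, 1), (11, 2), (12, 7)")
    orphans = conn.execute(
        """
        SELECT o.id, o.customer_id
        FROM orders AS o
        LEFT JOIN customers AS c ON c.id = o.customer_id
        WHERE c.id IS NULL
        ORDER BY o.id
        """
    ).fetchall()
    conn.close()
    return orphans


print(find_orphans())
```

The LEFT JOIN with an IS NULL filter lists every child row without a parent. With a declared constraint, the two bad inserts would have been rejected and this query would have nothing to find.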

The little time it takes to codify what's already in your head when you're putting things together could save you or someone else a bunch of grief months or years down the road.

The question:

"Are there any other uses for foreign keys? Am I missing something here?"

It is a bit loaded. Insert comments, indentation or variable naming in place of "foreign keys"... If you already understand the thing in question perfectly, it's "no use" to you.

Counterplot answered 21/8, 2008 at 3:30 Comment(0)
I
2

Entropy reduction. Reduce the potential for chaotic scenarios to occur in the database. We have a hard time as it is considering all the possibilities so, in my opinion, entropy reduction is key to the maintenance of any system.

When we make an assumption, for example that each order has a customer, that assumption should be enforced by something. In databases, that "something" is foreign keys.

I think this is worth the tradeoff in development speed. Sure, you can code quicker with them off and this is probably why some people don't use them. Personally I have killed a number of hours with NHibernate and some foreign key constraint that gets angry when I perform some operation. HOWEVER, I know what the problem is so it's less of a problem. I'm using normal tools and there are resources to help me work around this, possibly even people to help!

The alternative is to allow a bug to creep into the system (and given enough time, it will) where a foreign key isn't set and your data becomes inconsistent. Then, you get an unusual bug report, investigate and "OH". The database is screwed. Now how long is that going to take to fix?

Incisive answered 1/12, 2009 at 23:22 Comment(0)
M
1

You can view foreign keys as constraints that:

  • Help maintain data integrity
  • Show how data is related to each other (which can help in enforcing business logic and rules)
  • If used correctly, can help increase the efficiency with which the data is fetched from the tables.
Mvd answered 20/8, 2008 at 20:26 Comment(0)
H
1

We don't currently use foreign keys. And for the most part we don't regret it.

That said - we're likely to start using them a lot more in the near future for two reasons, both of them similar:

  1. Diagramming. It's so much easier to produce a diagram of a database if there are foreign key relationships correctly used.

  2. Tool support. It's a lot easier to build data models using Visual Studio 2008 that can be used for LINQ to SQL if there are proper foreign key relationships.

So I guess my point is that we've found that if we're doing a lot of manual SQL work (construct query, run query, blahblahblah) foreign keys aren't necessarily essential. Once you start getting into using tools, though, they become a lot more useful.

Howse answered 20/8, 2008 at 20:55 Comment(2)
I work on systems that don't use them. And I regret it regularly. I have seen more instances than I can count of nonsensical data that would have been prevented by proper constraints.Bacciform
And having been working with foreign keys on our current project for nearly six months, I totally agree with this comment.Howse
M
1

The best thing about foreign key constraints (and constraints in general, really) is that you can rely on them when writing your queries. A lot of queries can become a lot more complicated if you can't rely on the data model holding "true".

In code, we'll generally just get an exception thrown somewhere - but in SQL, we'll generally just get the "wrong" answers.
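
A small sketch of such a silently "wrong" answer, assuming SQLite via Python's sqlite3 module and a hypothetical customers/orders schema with no foreign key declared. One order references a customer that does not exist, and two perfectly reasonable revenue queries disagree without any error being raised.

```python
import sqlite3


def revenue_report():
    """Return total revenue computed two ways over data containing an orphan."""
    conn = sqlite3.connect(":memory:")
    # No foreign key is declared, so the orphaned order below is accepted.
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        """
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL,
            amount      INTEGER NOT NULL
        )
        """
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'alice')")
    conn.execute(
        "INSERT INTO orders (id, customer_id, amount) VALUES (1, 1, 100), (2, 1, 50), (3, 99, 25)"
    )
    # Query A: sum straight from the orders table.
    total_all = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
    # Query B: the same figure via a join, which quietly drops the orphan.
    total_joined = conn.execute(
        """
        SELECT SUM(o.amount)
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        """
    ).fetchone()[0]
    conn.close()
    return total_all, total_joined


print(revenue_report())
```

Neither query fails; they simply return different totals. With an enforced constraint the orphan could not exist, and both forms of the query could be trusted to agree.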

In theory, SQL Server could use constraints as part of a query plan - but except for check constraints for partitioning, I can't say that I've ever actually witnessed that.

Mccaleb answered 20/8, 2008 at 22:10 Comment(1)
Uniqueness constraints indicate high cardinality which is used by the optimiser in selecting a join mechanism.Immunize
G
1

Foreign keys were never explicitly declared (FOREIGN KEY REFERENCES table(column)) in the projects (business applications and social networking websites) that I worked on.

But there was always a naming convention for columns that were foreign keys.

It's like with database normalization -- you have to know what you are doing and what the consequences are (mainly for performance).

I am aware of the advantages of foreign keys (data integrity, an index for the foreign key column, tools that are aware of the database schema), but I am also wary of using foreign keys as a general rule.

Also, various database engines could handle foreign keys differently, which could lead to subtle bugs during migration.

Removing all orders and invoices of a deleted client with ON DELETE CASCADE is the perfect example of a nice-looking but wrongly designed database schema.

Grier answered 21/8, 2008 at 9:6 Comment(0)
O
0

Yes. The ON DELETE [RESTRICT|CASCADE] keeps developers from stranding data, keeping the data clean. I recently joined a team of Rails developers who did not focus on database constraints such as foreign keys.

Luckily, I found these: http://www.redhillonrails.org/foreign_key_associations.html -- RedHill on Ruby on Rails plug-ins generate foreign keys using the convention over configuration style. A migration with product_id will create a foreign key to the id in the products table.

Check out the other great plug-ins at RedHill, including migrations wrapped in transactions.
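
For the RESTRICT side of ON DELETE [RESTRICT|CASCADE] mentioned above, here is a minimal sketch. It assumes SQLite via Python's sqlite3 module rather than a Rails migration, and the products/line_items tables are hypothetical names following the product_id convention.

```python
import sqlite3


def restrict_delete_demo():
    """Try to delete a parent row that is still referenced by a child row."""
    conn = sqlite3.connect(":memory:")
    # SQLite ignores FK constraints unless this is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        """
        CREATE TABLE line_items (
            id         INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES products(id) ON DELETE RESTRICT
        )
        """
    )
    conn.execute("INSERT INTO products (id, name) VALUES (1, 'widget')")
    conn.execute("INSERT INTO line_items (product_id) VALUES (1)")
    try:
        # The line item still points at product 1, so this must fail.
        conn.execute("DELETE FROM products WHERE id = 1")
        blocked = False
    except sqlite3.IntegrityError:
        blocked = True
    products_left = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
    conn.close()
    return blocked, products_left


print(restrict_delete_demo())
```

The delete is refused and the product row survives, so the line item is never stranded.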

Oculist answered 17/9, 2008 at 4:30 Comment(0)
J
0

If you plan on generating your data access code, e.g. with Entity Framework or any other ORM, you entirely lose the ability to generate a hierarchical model without foreign keys.

Justinejustinian answered 5/2, 2020 at 17:16 Comment(0)
