Database constraints - keep or ignore?
Asked Answered
R

10

19

When I was learning in university, they taught us the database fundamentals, basics and rules, and one of the most important rules is the constraints (primary key, foreign key), and how to make 1-m, 1-1, m-n relationships.

Now when I move to real business environment they tell me: you should forget all you have been taught; no constraints, all those relationships are logical, no primary keys, no foreign keys, you can make your constraints through the code.

I don't know who is right: what I learned in my academic life or what I will learn in my new real business life. What do you think?

Regimen answered 13/9, 2010 at 13:37 Comment(2)
This is an example of how job experience isn't always the best experience.Ferromanganese
That's not a "real business environment", you've landed in a den of Spaghetti coding Hackers... get out while you can. Seriously. Find another job, you will be unhappy. Either that or you're being hazed.Howes
C
6

I think the constraints help you to have clean data. Performance is sometimes improved. In some cases, the performance can get affected by having the constraints. However, the answer to that is not removing the constraints. You have something called "denormalization" to help you deal with the performance issues (provided that your queries are already optimized). You can always create denormalized summary tables in such scenarios.

Did the guys who told you to "forget what you learnt" also tell you that they have forgotten the traffic rules they learnt at the driving classes?

Courtnay answered 13/9, 2010 at 13:49 Comment(1)
the constraints keep the other monkeys from trashing the databaseJandy
K
29

If somebody told me to ignore keys and constraints on my databases, I would promptly ignore them and go about my business.

Primary keys, foreign keys, and constraints are there for a reason. Use them. They'll make your life easier and your database easier to understand (and, quite often, more performant).

Kiakiah answered 13/9, 2010 at 13:41 Comment(5)
-1 I like the answer, but it is not an explanation nor a concrete answer to the question. are there for a reason is the same as does not work in a question.Tarkany
Exactly what do you do with databases? Not all databases are the same.Hannibal
@David Thornley - Also a good question. I thought about that after I posted. It may be the case where the OP is not creating a relational database but an OLAP or some other type of db.Kiakiah
OLAP databases don't have primary keys?Howes
@Caspar, that's silly. "Does not work" is too generic because we know that there are specific things it actually does do. Raises an error, hangs, blue screens, gives an unexpected result... but there is no one specific reason to use relational theory and explaining it here is a waste of time and space.Howes
J
12

The longer I work with databases, the more I appreciate constraints. In the long run, they save me a lot of time. Only trusted constraints ensure 100% validity of data.

I wrote a chapter on usage of constraints vs. other ways of ensuring data integrity, available as free download here

Jugal answered 13/9, 2010 at 14:56 Comment(0)
S
11

If you think that constraints are merely about primary keys and foreign keys, then in fact you haven't been taught much of the "fundamentals" to begin with.

I suggest you take a look at "An Introduction to Relational Database Theory" by Hugh Darwen, which is available freely online. And at least you get a genuine education about the "fundamentals" from that one.

Stratopause answered 13/9, 2010 at 13:48 Comment(2)
In an environment where PK and FK constraints are derided, there may not be much point in going for the other constraints too. I think the OP should be cut some slack. That's not to say that the other constraints aren't also important, but getting the most basic of basics right first is more important.Deeplaid
i know there are other constraints ,but the talking is mainly about P.K and F.K constraints.thanks alotRegimen
C
6

I think the constraints help you to have clean data. Performance is sometimes improved. In some cases, the performance can get affected by having the constraints. However, the answer to that is not removing the constraints. You have something called "denormalization" to help you deal with the performance issues (provided that your queries are already optimized). You can always create denormalized summary tables in such scenarios.

Did the guys who told you to "forget what you learnt" also tell you that they have forgotten the traffic rules they learnt at the driving classes?

Courtnay answered 13/9, 2010 at 13:49 Comment(1)
the constraints keep the other monkeys from trashing the databaseJandy
J
5

Database constraints are a great idea when you have users accessing the database whom may corrupt your data(i.e. any user). I tend to keep Foreign key constraints in place to ensure that anyone editing data in my database is aware that other tables are relying on the data in the current table.

Justinn answered 13/9, 2010 at 13:41 Comment(0)
D
4

Well it's certainly true that there are very few ultimate truths out there, and it's also true that many commerically successful products eschew declared referential integrity (DRI).

On the other hand, if you care about the data in your database, there is almost no better way to safeguard the integrity of that data than through DRI.

If you leave it all up to the code on top, you're kind of banking on the hope that no one will ever access the database through any other means. If they do there will be nothing stopping data corruption (orphaned rows, inconsistent and illogical data).

Draftee answered 13/9, 2010 at 13:41 Comment(3)
The primary reason for a commercial product to eschew DRI is that different back-ends perform differently with DRI, and some back-ends don't effectively support it at all. (And business types think the cost of porting between back-ends approaches zero.) Further, if the customer diddles the data outside the app, well, caveat emptor, and good luck getting help from support.Tinge
Oracle, MS SQL, MySQL, PostgreSQL, even MS Access supports FK declaration. I wouldn't drop the benefits of DRI because of the theoretical possibility that "some back-ends" don't effectively support it. So what if they don't?Draftee
Customer "diddling" is a cynical way to look at the problem. Suppose your product enjoys great success and the OFT forces you to open your database to third parties? This is not unrealistic, it happened to software I've worked on. When that day comes you'll be glad of the protection afforded by DRI.Draftee
E
4

What you learned isn't just academic stuff. But yes it's like Plato's Utopia at times. It's the perfect condition your database can be in, the ideal design. But that ideal design isn't always possible.

Constraints should be as close to DB data as possible. Think about it this way. What if you wrote your constraints in code and later you wanted to migrate to a different language/platform and an error cropped up in one of your constraints? It'd be disastrous. Things like PK, FK, constraints etc. are used widely. They've been used for more than 30 years now. So, they're not junk but in certain scenarios they're just not manageable. For example, if you're Google, you can't just rely on a relational model to give answers in milliseconds.

So based on requirements like speed and stability, we sometimes duplicate data too, we don't use PKs, or we don't establish relationships etc. But only when we're looking for something specific AND when we know what we'll lose by doing it that way.

In the end, relational model is still just a model. It's a way of representing things. A very successful way but it's not a godsend so in some cases it has to be compromised.

Elba answered 13/9, 2010 at 13:41 Comment(8)
Not sure why any bugger would berate this. Heights!!! At least communicate the reason for the criticism.Elba
-1 but LOL for Plato's Utopia. Can you explain a bit more on Plato's version?Tarkany
Agreed... this is the only answer with a dose of reality. We all shoot for 3NF but know we'll violate it somewhere and we also need to know WHY we are doing it and what it means to do so. And down votes should REQUIRE a comment.Howes
I gave a +1 for this simple comment alone, "Constraints should be as close to DB data as possible. "Howes
Caspar, Can you explain a little more why you feel this should be down modded?Howes
Plato's Utopia is a bit like Thomas More's Animal Farm, a terrible book to refer to.Tarkany
Heh, heh! It seems like some of us are more concerned with Utopia than DBs :) Anyway, what I meant was that Utopia is an ideal world thought by a philosopher, which howsoever appealing hasn't as yet been modelled. The relational database as we read it is like that. An ideal but circumstances sometimes force us away from it. And that happens quite often. I haven't read Animal Farm so won't comment on that :pElba
@Casper It'd be better if we talk technology here. I'm not here to mention your favorite books.Elba
E
4

I've spent a lot of time fix ng the crap data produced by incompetents who think the constraints belong in the application code. If a database is designed without these required things, it will have bad data. Do a quick check of any system like this that has been running for several years and you will find orphaned records and missing required information etc.

Eugenieeugenio answered 13/9, 2010 at 18:16 Comment(0)
H
-3

When I was in class, we accessed tables with raw SQL, and there is a lot of raw SQL or the equivalent out there. In these cases, constraints are generally good.

However, there are systems that use databases as back ends, and these databases are only accessed by that particular software system. In this case, the software should keep track of the necessary relations and constraints, and the database serves as a redundant check that doesn't provide good feedback and lowers performance.

The database constraints are redundant because the attached system needs to maintain the constraints itself. The feedback is that a certain database constraint was violated. If a program is capable of dealing with such feedback, it's capable of doing its own checks. The performance cost should be obvious.

The constraints can still be useful on development or test systems, but when the system goes into production about all they can do is crash the system if something goes wrong, and that's usually exactly what you don't want to happen in a large system like that.

Hannibal answered 13/9, 2010 at 18:10 Comment(6)
Actually most of those systems you think are only accessed byt the application in my 30 years experince are not. They are accessed in import, by other applications and by scripts that run inthe query window.Eugenieeugenio
That's exactly what YOU DO WANT TO HAVE HAPPEN. It means you're trying to insert bad data. It means you're about to break every query. It means you're about to screw up extracts and reports with bad data. It means your code is crap.Howes
This is a mentality that I've seen common to many object oriented developers. OO developers want every thing in code, and don't often appreciate the advantages of letting a database do what it was designed to do. It is a trade-off that they make to make it easier for them to code, and to make their code more easily ported to differing database back-ends.Conglomerate
HLGEM, Exactly... if you have a database that has exactly one system accessing the database then either it is some uber-central system that runs the whole (and small) company. or it's rather an unimportant off-shoot from the company's main data.Howes
@Stephanie: This isn't a joke, but rather an observation. The systems I'm talking about are uber-systems that do control large pieces of large enterprises. They have their own import processing, do not let applications outside the system access their database, and are very controlling about what gets to touch the data. These aren't your standard business systems, but come from large outside vendors like SAP and Oracle. If you're going to develop a system in-house, you're not going to get the same effect, so you probably do want to keep constraints.Hannibal
Wow. That's taking a very specific case and drawing conclusions from it then passing it around as general advice for most dbs. Oracle Financials, Lodestar, SAP. Massive COTS apps are a COOOMMMPLLEEETELY different breed than anything usually discussed here. Providing advice to a generic question based on those systems isn't useful. In fact, it's counter-productive. You usually don't have a choice about whether to use constraints or not when implementing those packages... at least not without violating your warrantee, so why would you think that's the basis of the question?Howes
G
-4

Primary keys are important. You need a way to uniquely ID rows.

But, its been my experience that if you properly encapsulate your database access within classes (ie, reading/writing objects to/from the db), constraints arent generally necessary. Yeah, I might use them if 50 different apps in 10 different languages were using the same database. But if its one app, or a common suite of apps sharing a source code base, I'd rather have all the database manipulation logic in one place to make the app more maintainable. Same goes for stored procedures, but they have the additional issue of portablity between db systems if you write code meant to handle a wide variety of databases.

Growing answered 13/9, 2010 at 17:59 Comment(12)
You can uniquely ID rows with ROWID... so Heap table don't need a PK?Howes
The database optimizers need those constraints to produce good plans. If you move the constraints out of the database and into your code, you're begging the optimizer to produce suboptimal plans. If you write code to run on more than one RDBMS then you're already intending for suboptimal database usage so it really won't matter then.Howes
@stephanie, yes, ROWID works if thats supported on your database. As for database optimizers, I've simply never encountered that as a problem. Quite the contrary, handcoded optimization by someone experienced with databases almost always trumps anything the server might do on its own.Growing
Building the plan for every single SQL you write doesn't scale well. I'd rather provide some simple metadata and let software do that work for me. Yes there are times when optimizers fail for specific reasons but what was 10% of SQL in the 90's became 5% of SQL in the 2000's and is now closer to 1% today. At least in Oracle that's the case, maybe your preferred RDBMS is worse, I don't know.Howes
@Growing you are partially right. Hand coded optimizations can out preform some built in plans. However, by not providing any 'metadata' about your data interacts with itself, you've relegated yourself to optimizing everything. Part of the reason you purchase databases such as MS SQL & Oracle is because they can self optimize when you give them enough information to do so. Step 1 should be to create a well designed data model with relationships and constraints. There are times when this isn't the best approach, but those are rare.Conglomerate
@chillitemp, alas, I dont buy MSSQL or Oracle, so it doesnt really apply to my projects. I've worked with both though, as well as most others (everything from DB2 to SQLite), and it just hasnt been an issue, so long as the developer has a good feel for writing queries.Growing
@stephanie, I wasnt saying to define a plan for each query. Rather, that someone who is very familiar with SQL doesnt need to rely on that to make fast performing queries. If people want to use constraints, they can go nuts with it. But I have not found them to add much value over what a skilled database programmer can do in code, and over the long run, I find that where the logic is located on one place, and one place only, applications are more maintainable. I know I'm 'blaspheming' against whats taught in schools as objective truth, but my experience has simply been as I've described.Growing
"I find that where the logic is located on one place, and one place only, applications are more maintainable." We have no disagreement here. And you might be forced to put it in the app because you sell your product and must support multiple database platforms. But that's the exceptional case. Most questions here are from developers working on apps/dbs that aren't for sale/internal. In the common case, the database is the one place for that code. It's the only efficient place to handle concurrency.Howes
"so long as the developer has a good feel for writing queries". THE developer? I've always worked on large systems with more than one development team is accessing the database. There's simply no way to make sure that they all have a "good feel". Constraints ensure that my data is consistent no matter who is attacking it today.Howes
by 'THE developer' = 'THE developer who happens to be working on the code'. I've worked in teams both large and small, though almost universally in the commercial realm (ie, software for sale, both mass market and vertical market). BUT, if you are talking about larger organizations, please re-read my original response. I stated that if multiple apps/tools are acccessing the same data, that then constraints arent a bad idea (because the logic is then spread out across multiple apps, and its easy for that to get out of sync and for the different apps to start handling data differently).Growing
So you could substitute ANY for THE and mean the same as what you meant. And if ANY developer has to be proficient then EVERY developer has to be proficient and that is not my experience.Howes
Well, I guess if you are in a situation where the developers arent particularly proficient, then maybe the best bet is to protect them from themselves via enforced constraints. For the people I work with, its simply not been an issue. To each their own I guess.Growing

© 2022 - 2024 — McMap. All rights reserved.