Dilemma: Cascade delete or Join delete
P

4

8

It's not a specific question, more a general wondering.

When you have to delete from multiple tables in a 1:M relationship, is it better to define an FK constraint with a cascade delete, or to join the tables in the delete statement?

I had an old project that used separate delete statements for related tables, and a few times some of the statements were not executed and data integrity was compromised. I had to decide between the two approaches, so I have been thinking about which would be the better solution.

There is also an option to make a stored procedure or a transaction.

So I am looking for an opinion or advice...?

Portamento answered 19/10, 2011 at 12:38 Comment(0)
C
4

If your database has proper referential integrity (RI) defined, then there shouldn't be any case of compromised data integrity. All of your related tables should have declarative RI, which means that you can't delete a parent while it still has children.
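To illustrate that point, here is a minimal sketch using Python's built-in `sqlite3` module. The `customer` and `orders` tables are hypothetical, and other database engines report the violation with a different error, but the idea is the same: a plain foreign key (no cascade) rejects the parent delete while children exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    # No ON DELETE clause: the default action blocks deleting a referenced parent.
    " customer_id INTEGER NOT NULL REFERENCES customer(id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ann')")
conn.execute("INSERT INTO orders VALUES (10, 1)")

try:
    conn.execute("DELETE FROM customer WHERE id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    # The parent still has a child row, so the delete is refused.
    delete_blocked = True

customers_left = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(delete_blocked, customers_left)
```

The failed delete leaves the parent row in place, so a forgotten child delete shows up as an error rather than as orphaned data.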

Also, if you have code that is only deleting some of the rows at times then that is poor coding and poor testing. These kinds of actions should be a single transaction. Your suggestion of using a stored procedure is a great approach for solving that problem and is pretty standard.
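As a rough sketch of the single-transaction idea: SQLite has no stored procedures, so the hypothetical helper `delete_customer` below stands in for one. The table names are assumptions as well. Both deletes either commit together or roll back together.

```python
import sqlite3

def delete_customer(conn, customer_id):
    """Delete a customer and its orders as one unit (hypothetical helper)."""
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("DELETE FROM orders WHERE customer_id = ?", (customer_id,))
        conn.execute("DELETE FROM customer WHERE id = ?", (customer_id,))

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id))"
)
with conn:
    conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 1), (12, 2)])

delete_customer(conn, 1)

customers_left = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
orders_left = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(customers_left, orders_left)
```

In an engine that supports stored procedures, the same two statements would sit inside the procedure body, wrapped in a transaction.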

As has already been mentioned, cascading deletes carry the danger of deleting rows that someone did not intend to delete. Consider that people might sometimes access your data from outside of your application, especially when fixing data issues. When someone accidentally tries to delete the wrong parent and gets an RI error, that's good. When they accidentally try to delete the wrong parent and it deletes not only that parent but also 20 children in 5 other tables, that's bad.

Also, cascading deletes are very hidden. If a developer is coding a delete for the parent then they should know that they have to use the delete stored procedure to take care of children. It's much preferable to have a developer not code against that, get an error, and fix his code (or realize that he doesn't really want to do all of that deleting) than it is to have a developer throw in a delete and have no one realize that it's killing off children until the code has gone live.

I prefer to have my developers knowledgeable about the application rather than make it easier for them to remain ignorant of it.

Crichton answered 31/10, 2011 at 17:58 Comment(0)
B
7

I'd say it's safer to use a cascade delete. If you decide to use joins, you have to remember to use them every time you delete anything from the parent table; and even if you're disciplined enough to do that, you can't be sure about your coworkers or the people who will support your software in the future. Also, encoding such knowledge about table relationships more than once violates the DRY principle.

If you use a cascade delete though, nobody has to remember anything, and child rows will always be deleted as needed.
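For reference, a minimal sketch of what that looks like, again with Python's `sqlite3` module and hypothetical `customer` and `orders` tables. The relationship is declared once, in the foreign key, and a delete on the parent removes its children.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (including cascades) with this pragma on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL"
    "   REFERENCES customer(id) ON DELETE CASCADE)"
)
with conn:
    conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 1), (12, 2)])

with conn:
    # Only the parent is named; the two child rows for customer 1 go with it.
    conn.execute("DELETE FROM customer WHERE id = 1")

orders_left = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(orders_left)
```

Only the order belonging to the other customer remains afterwards.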

Badge answered 19/10, 2011 at 12:49 Comment(2)
Totally. Great answer and well put. Data integrity has always gotten too little attention; the problems usually start popping up a few years down the line and often end up strangling companies, which can no longer innovate or change because of bad decisions made long ago about data collection. -- Goglet
There seems to be some stigma associated with cascade delete because of the possible accidents when deleting data manually. Personally, I have had a few bad experiences with it, and I still try to avoid using it, but I agree that it is a safer way to maintain strict data integrity. -- Portamento
R
4

Cascade delete causes lots of issues and thus is extremely dangerous. I would not recommend its use. In the first place, suppose I need to delete a record that has millions of child records. That could lock up the database and make it unusable for hours. I know of very few DBAs who will permit cascade delete to be used in their databases.

Next, it does not help with data integrity if you have already defined the FKs. A delete with child records still in existence will fail, which is a good thing. I want the customer delete to fail if the customer has existing orders, for instance. Cascade delete used thoughtlessly (as it usually is, in my experience) can cause things to be deleted that you really don't want to delete.

Rozalie answered 19/10, 2011 at 13:39 Comment(0)
D
0

Use both!

"Joined" manual deletes are usually better for avoiding deadlocks and other contention problems, as you can break up the deletes into smaller units of work. If you do have contention, it's definitely easier to find the cause of the conflict.
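One way to read "smaller units of work" is to delete the children in batches, each in its own short transaction, and delete the parent last. The sketch below assumes this interpretation; the helper `delete_children_in_batches`, the table names, and the batch size are all hypothetical.

```python
import sqlite3

def delete_children_in_batches(conn, customer_id, batch_size=1000):
    """Delete a customer's orders in small transactions; returns (rows, batches)."""
    total, batches = 0, 0
    while True:
        with conn:  # each batch commits on its own, keeping locks short-lived
            cur = conn.execute(
                "DELETE FROM orders WHERE id IN ("
                " SELECT id FROM orders WHERE customer_id = ? LIMIT ?)",
                (customer_id, batch_size),
            )
        if cur.rowcount == 0:
            break
        total += cur.rowcount
        batches += 1
    return total, batches

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id))"
)
with conn:
    conn.execute("INSERT INTO customer VALUES (1, 'Ann')")
    conn.executemany("INSERT INTO orders (customer_id) VALUES (?)", [(1,)] * 2500)

deleted, batches = delete_children_in_batches(conn, 1)
with conn:
    # No children remain, so the plain foreign key allows the parent delete.
    conn.execute("DELETE FROM customer WHERE id = 1")

customers_left = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(deleted, batches, customers_left)
```

The trade-off is that the whole operation is no longer atomic: if it stops partway, some children are already gone while the parent is still there.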

As stated, "Delete Cascade" will absolutely guarantee referential integrity.

So use both -- do explicit deletes of the "children" in joined SQL statements to avoid deadlocks and performance problems, but leave "CASCADE DELETE" enabled to catch anything you missed. As there should be no children left when you come to delete the parent, this won't cost you anything unless you made a mistake with your deletes, in which case the cost is worth it to maintain your referential integrity.

Dessertspoon answered 1/11, 2011 at 1:57 Comment(1)
Any foreign key guarantees referential integrity. -- Crichton
