How can I break referential integrity briefly, within a transaction, without disabling the foreign key constraint?
Asked Answered
I

6

17

I have a table with 3 columns:

ID, PARENT_ID, NAME

PARENT_ID has a foreign key relationship with ID in the same table. This table is modeling a hierarchy.

Sometimes the ID of a record will change. I want to be able to update a record's ID, then update the dependent records' PARENT_ID to point to the new ID.

The problem is, when I attempt to update the ID of a record it breaks the integrity and fails immediately.

I realize I could insert a new record with the new ID, then update the children, then delete the old record, but we have a lot of triggers in place that would get screwed up if I did that.

Is there any way to temporarily update the parent with the promise of updating the children (obviously it would fail on commit) without disabling the foreign key briefly?

Ibnsina answered 23/6, 2010 at 21:34 Comment(1)
"Sometimes the ID of a record will change" - you may want to consider using a en.wikipedia.org/wiki/Surrogate_key so that this doesn't happen...Baxter
M
20

What you want is a 'deferred constraint'.

You can pick between the two types of deferrable constraints, 'INITIALLY IMMEDIATE' and 'INITIALLY DEFERRED' to drive default behavior - whether the database should default to check the constraint after every statement, or if it should default to only checking constraints at the end of the transaction.

Marquita answered 23/6, 2010 at 21:46 Comment(14)
sorry, link should be fixed nowMarquita
Links works now, but is incredibly slow to load (on my highspeed connection). Only concern is what versions of Oracle this applies to - can't tell from the link.Gaius
The Oracle 9.2 version of that documentation describes deferrable constraints as well, so it should be available at least from 9.2 - download.oracle.com/docs/cd/B10501_01/server.920/a96524/…Marquita
+1: For answering the question, but the data model should be constructed so hacks like these aren't necessary.Gaius
This doesn't seem like a hack to me - it seems like a reasonable expectation to support. Either you want some lee-way in your ACID, or you don't. It would really be up to the user to decide.Colchicum
@Chris Kaminski: No DBA would ever let you get away with using deferred constraints. What I suggested in my answer about using a surrogate key that the user can alter at will, without impacting referencial integrity, would've stopped this need from occurring in the first place.Gaius
@Marquita This solves exactly the problem I was facing. I'm getting a bit scared based on what others are saying about it though, so perhaps I need to consider using a surrogate key and then just accepting that the requirement will be on my code to enforce this integrity in my insert/update/delete procedures and/or triggers. I wish there was a less messy, "correct" way to do this.Ibnsina
One caveat about deferred constraints: if the constraint is not valid when you try to commit, the transaction will automatically rollback. This is one of the few places where the database takes transactional control out of the user's (or application's) hands.Tipster
@OMG Ponies: then the DBAs are idiots. As long as I complete my transaction with referential integrity intact, it shouldn't matter how I got there. And your surrogate keys is a horrible hack to get around DBAs with more ego than sense, IMHO, or software that just isn't robust enough to manage it (MySql?).Colchicum
@OMG Ponies: I'll agree that hacks like this "shouldn't" be needed, but I have encountered situations where, due to legacy design issues (i.e. "What moron came up with this POS?!?"), deferred constraints were the only reasonable way to allow the data to be properly constrained without having to write some really awful code. YMMV.Pelham
@Bob Jarvis: Such situations remind me of this poster: despair.com/mis24x30prin.htmlGaius
@Chris Kaminski: Good luck fighting against DBAs, it's not their fault the situation existed in the first place.Gaius
@OMG Ponies: Concur with the fighting against DBAs bit... :)Colchicum
@Chris Kaminski what you don't seem to understand is that the DBA's job is to protect the data not make things more convenient for the programmers. That doesn't make them idiots.Devoid
B
10

Answered slower than Chi, but felt it would be nice to include code sample, so that the answer could be found on SO.

As Chi answered, deferrable constraints make this possible.

SQL> drop table t;

Table dropped.

SQL> create table T (ID number
  2      , parent_ID number null
  3      , name varchar2(40) not null
  4      , constraint T_PK primary key (ID)
  5      , constraint T_HIREARCHY_FK foreign key (parent_ID)
  6          references T(ID) deferrable initially immediate);

Table created.

SQL> insert into T values (1, null, 'Big Boss');

1 row created.

SQL> insert into T values (2, 1, 'Worker Bee');

1 row created.

SQL> commit;

Commit complete.

SQL> -- Since initially immediate, the following statement will fail:
SQL> update T
  2  set ID = 1000
  3  where ID = 1;
update T
*
ERROR at line 1:
ORA-02292: integrity constraint (S.T_HIREARCHY_FK) violated - child record found


SQL> set constraints all deferred;

Constraint set.

SQL> update T
  2  set ID = 1000
  3  where ID = 1;

1 row updated.

SQL> update T
  2  set parent_ID = 1000
  3  where parent_ID = 1;

1 row updated.

SQL> commit;

Commit complete.

SQL> select * from T;

        ID  PARENT_ID NAME
---------- ---------- ----------------------------------------
      1000            Big Boss
         2       1000 Worker Bee

SQL> -- set constraints all deferred during that transaction
SQL> -- and the transaction has commited, the next
SQL> -- statement will fail
SQL> update T
  2  set ID = 1
  3  where ID = 1000;
update T
*
ERROR at line 1:
ORA-02292: integrity constraint S.T_HIREARCHY_FK) violated - child record found

I believe, but could not find the reference, that deferrability is defined at constraint creation time and can not be modified later. The default is non-deferrable. To change to deferrable constraints you'll need to do a one time drop and add constraint. (Properly scheduled, controlled, etc.)

SQL> drop table t;

Table dropped.

SQL> create table T (ID number
  2      , parent_ID number null
  3      , name varchar2(40) not null
  4      , constraint T_PK primary key (ID)
  5      , constraint T_HIREARCHY_FK foreign key (parent_ID)
  6          references T(ID));

Table created.

SQL> alter table T drop constraint T_HIREARCHY_FK;

Table altered.

SQL> alter table T add constraint T_HIREARCHY_FK foreign key (parent_ID)
  2      references T(ID) deferrable initially deferred;

Table altered.
Bently answered 23/6, 2010 at 22:28 Comment(0)
C
7

The common advice with scenarios like this is to employ deferrable constraints. However, I think these situations are almost always a failure of application logic or data model. For instance, inserting a child record and a parent record in the same transaction can be a problem if we execute it as two statements:

My test data:

SQL> select * from t23 order by id, parent_id
  2  /

        ID  PARENT_ID NAME
---------- ---------- ------------------------------
       110            parent 1
       111            parent 2
       210        110 child 0
       220        111 child 1
       221        111 child 2
       222        111 child 3

6 rows selected.

SQL>

The wrong way to do things:

SQL> insert into t23 (id, parent_id, name) values (444, 333, 'new child')
  2  /
insert into t23 (id, parent_id, name) values (444, 333, 'new child')
*
ERROR at line 1:
ORA-02291: integrity constraint (APC.T23_T23_FK) violated - parent key not
found


SQL> insert into t23 (id, parent_id, name) values (333, null, 'new parent')
  2  /

1 row created.

SQL>

However, Oracle supports a multi-table INSERT synatx which allows us to insert the parent and child records in the same statement, thus obviating the need for deferrable constraints:

SQL> rollback
  2  /

Rollback complete.

SQL> insert all
  2      into t23 (id, parent_id, name)
  3          values (child_id, parent_id, child_name)
  4      into t23 (id, name)
  5          values (parent_id, parent_name)
  6  select  333 as parent_id
  7          , 'new parent' as parent_name
  8          , 444 as child_id
  9          , 'new child' as child_name
 10  from dual
 11  /

2 rows created.

SQL>

The situation you are in is similar: you want to update the primary key of the parent record but can't because of the existence of the child records: And you can't update the child records because there is no parent key. Catch-22:

SQL> update t23
  2      set id = 555
  3  where id = 111
  4  /
update t23
*
ERROR at line 1:
ORA-02292: integrity constraint (APC.T23_T23_FK) violated - child record found


SQL> update t23
  2      set parent_id = 555
  3  where parent_id = 111
  4  /
update t23
*
ERROR at line 1:
ORA-02291: integrity constraint (APC.T23_T23_FK) violated - parent key not
found


SQL>

Once again the solution is to do it in a single statement:

SQL> update t23
  2      set id = decode(id, 111, 555, id)
  3          , parent_id = decode(parent_id, 111, 555, parent_id)
  4  where id = 111
  5     or parent_id = 111
  6  /

4 rows updated.

SQL> select * from t23 order by id, parent_id
  2  /

        ID  PARENT_ID NAME
---------- ---------- ------------------------------
       110            parent 1
       210        110 child 0
       220        555 child 1
       221        555 child 2
       222        555 child 3
       333            new parent
       444        333 new child
       555            parent 2

8 rows selected.

SQL>

The syntax in the UPDATE statement is a bit clunky but kludges usually are. The point being that we should not have to update primary key columns very often. Indeed, as immutability is one of the characteristics of "primary key-ness" we shouldn't really have to update them at all. Needing to do so is a failure of the data model. One way of avoiding such failures is to use a synthetic (surrogate) primary key, and simply enforce the uniqueness of the natural (aka business) key with a unique constraint.

So why does Oracle offer deferrable constraints? They are useful when we undertake data migrations or bulk data uploads. They permit us to cleanse data in the database without staging tables. We really shouldn't need them for regular application tasks.

Corrinacorrine answered 24/6, 2010 at 11:39 Comment(1)
Thank you for posting this. I had tested the single update to both parent and children, and it had failed. I thought it wouldn't work. Evidently there was an error in my test. Yours works fine.Bently
P
3

Recommendations to use a surrogate key are excellent, IMO.

More generally, the problem with this table is that it lacks a primary key. Recall that a primary key must be three things:

  1. Unique
  2. Non-null
  3. Unchanging

Databases I'm familiar with enforce (1) and (2), but I don't believe they enforce (3), which is unfortunate. And that's what's kicking you in the butt - if you change your "primary key" you have to chase down all the references to that key field and make equivalent alterations if you don't want to break integrity. The solution, as others have said, is to have a true primary key - one that is unique, non-null, and which doesn't change.

There's reasons for all these little rules. This is a great opportunity to understand the "unchanging" part of the primary key rules.

Share and enjoy.

Psychosis answered 24/6, 2010 at 15:18 Comment(1)
Yes, and composite and natural keys are impossible, and the Easter Bunny ate my homework. Sometimes you inherit a flawed data model, yes, it's preferable that PK is unchanging but it's not a rule except in an academic situation.Majoriemajority
B
1

If this were any other database besides Oracle, you could declare the foreign key with ON UPDATE CASCADE. Then if you change a parent's id, it would propagate the change atomically to the child's parent_id.

Unfortunately, Oracle implements cascading deletes but not cascading updates.

(This answer is for information purposes only, since it doesn't actually solve your problem.)

Bernard answered 23/6, 2010 at 21:35 Comment(6)
Nice addition. Although I don't agree with your word "Unfortunately" as I think it should be discouraged. Primary keys should be immutable.Methodology
@Rob van Wijk: I'd agree that pseudokeys should be immutable. What about natural primary keys?Bernard
The only acceptable natural primary key to me, is one that will never ever change. That's because I don't want to be bothered updating half of the entire datamodel, with the associated downtime. That's hard to explain to the business. And, immutable natural keys are rare. Even some natural keys that seem immutable at first sight, change. Just think about country codes (remember Yugoslavia,Zaire), currency codes (Euro), bank account numbers (changing from 9 digits to 10) and so on. An example of an acceptable natural key I can think of: the date column of a DW time table.Methodology
@RobVanWijk - " An example of an acceptable natural key I can think of: the date column of a DW time table. " Until the next Pope Gregory XIII ...Corrinacorrine
@APC: Lol. That would certainly cause protests and riots worldwide by IT companies. "You can't do that, because we have the date as a natural primary key". That's what I call evolution of the human race :-)Methodology
And we will not have ON UPDATE CASCADE while Tom Kyte lives ;)Majoriemajority
G
1

You need to use a deferrable constraint (see Chi's answer).
Otherwise, in order to add a value that will fail the foreign key constraint, you have to either disable or drop & re-create the foreign key constraint.

Situations like these employ a surrogate key that can be altered by users as necessary, without impacting referential integrity. To expand on this idea, currently the setup is:

  • ID (pk)
  • PARENT_ID (foreign key, references ID column -- making it self referential)

..and the business rules are that ID can change. Which is fundamentally bad from a design perspective - primary key are immutable, unique, and can't be null. So the solution to the situation when you're building your data model is to use:

  • ID (pk)
  • PARENT_ID (foreign key, references ID column -- making it self referential)
  • SURROGATE_KEY (unique constraint)

The SURROGATE_KEY is the column that supports change without affecting referential integrity - the parent & child relationship is intact. This means that a user can tweak the surrogate key to their hearts delight without needing deferred constraints, enable/disable or drop/recreate foreign key constraints, ON UPDATE CASCADE...

As a rule, in data modeling you NEVER display primary key values to the user because of situations like these. For example, I have a client who wants their jobs number to change on the start of the year, with the year at the start of the number (IE: 201000001 would be the first job created in 2010). What happens when the client sells the company, and the new owner needs a different scheme for their accounting? Or, what if the numbering can't be maintained while transitioning to a different database vendor?

Gaius answered 23/6, 2010 at 21:35 Comment(11)
I considered adding a surrogate key but then the integrity of my table relies on having a series of triggers to ensure nobody broke anything, rather than a single foreign key constraint... it seems like a frustrating trade-off.Ibnsina
@Renderln: I loath it too, but telling users "NO!" is what stops me from being a business analyst =)Gaius
@OMG Ponies: you have the most memorable name on all of stackoverflow. Every time I see it, I giggle and feel better about my day.Ibnsina
@OMG - actually, you can. See @Chi's answer.Walcott
@DCookie: Worst part is, I remember hearing about deferrable constraints before :/Gaius
@OMG, at least you can remember ;-)Walcott
@OMG Ponies: Introducing a surrogate key is indeed the best option by far. Probably I'm reading it wrong, but it looks like you are suggesting in your second paragraph that the surrogate key can be altered by users? The purpose of introducing a surrogate key, is that the surrogate key is used for referential integrity and is immutable, while the real/business key can be altered.Methodology
@OMG Ponies: I'd love to see an example here, like @Shannon Severance did elsewhere - I'm arguing with you on the method elsewhere, maybe you could convince me to eat my words AND help educate others? ;-)Colchicum
@Rob van Wijk: I prefer natural and artificial key for referential integrity terminology. For this situation, surrogate is accurate - the surrogate is a substitute reference for the user, in this case that they can change. If the surrogate is also immutable, then the OP is back to where they started but with an additional column ;)Gaius
@Chris Kaminski: Your mission, should you choose to accept it, is to proofread my updated answer.Gaius
Digesting - I appreciate you taking the time to add this. :-)Colchicum

© 2022 - 2024 — McMap. All rights reserved.