Checking foreign key constraint "online"
Asked Answered
O

1

8

If we have a giant fact table and want to add a new dimension, we can do it like this:

BEGIN TRANSACTION

ALTER TABLE [GiantFactTable]
ADD NewDimValueId INT NOT NULL
CONSTRAINT [temp_DF_NewDimValueId] DEFAULT (-1)
WITH VALUES -- table is not actually rebuilt!

ALTER TABLE [GiantFactTable]
WITH NOCHECK
ADD CONSTRAINT [FK_GiantFactTable_NewDimValue]
FOREIGN KEY ([NewDimValueId])
REFERENCES [NewDimValue] ([Id])

-- drop the default constraint, new INSERTs will specify a value for NewDimValueId column
ALTER TABLE [GiantFactTable]
DROP CONSTRAINT [temp_DF_NewDimValueId]

COMMIT TRANSACTION

NB: all of the above only manipulate table metadata and should be fast regardless of table size. Then we can run a job to backfill GiantFactTable.NewDimValueId in small transactions, such that the FK is not violated. (At this point any INSERTs/UPDATEs - e.g. backfill operation - are verified by the FK since it's enabled, but not "trusted")

After the backfill we know the data is consistent, my question is how can SQL engine become enlightened too? Without taking the table offline.

This command will make the FK trusted but it requires a schema modification (Sch-M) lock and likely take hours (days?) taking the table offline:

ALTER TABLE [GiantFactTable]
WITH CHECK CHECK CONSTRAINT [FK_GiantFactTable_NewDimValue]

About the workload: Table has a few hundred partitions (fixed number), data is appended to one partition at a time (in a round-robin fashion), never deleted. There is also a constant read workload that uses the clustering key to get a (relatively small) range of rows from one partition at a time. Checking one partition at a time, taking it offline, would be acceptable. But I can't find any syntax to do this. Any other ideas?

Overalls answered 23/10, 2013 at 18:37 Comment(1)
Came upon this connect.microsoft.com/SQLServer/feedback/details/757706Overalls
H
1

A few ideas come to mind but they aren't pretty:

Redirect workloads and run check constraint offline

  1. Create a new table with the same structure.
  2. Change the "insert" workload to insert into the new table
  3. Copy the data from the partition used by the "read" workload to the new table (or a third table with the same structure)
  4. Change the "read" workload to use the new table
  5. Run alter table to check the constraint and let it take as long as it needs
  6. Change the both workloads back to the main table.
  7. Insert the new rows back into the main table
  8. Drop new table(s)

A variation on the above is to switch the relevant partition to the new table in step 3. That should be faster than copying the data but I think you will have to copy (and not just switch) the data back after the constraint has been checked.

Insert all the data into a new table

  1. Create a new table with the same structure and constraint enabled
  2. Change the "insert" workload to the new table
  3. Copy all the data from old to new table in batches and wait as long as it takes to complete
  4. Change the "read" workload to the new table. If step 3 takes too long and the "read" workload needs rows that have only been inserted into the new table, you will have to manage this changeover manually.
  5. Drop old table

Use index to speed up constraint check?

I have no idea if this works but you can try to create a non-clustered index on the foreign key column. Also make sure there's an index on the relevant unique key on the table referenced by the foreign key. The alter table command might be able to use them to speed up the check (at least by minimizing IO compared to doing a full table scan). The indexes, of course, can be created online to avoid any disruption.

Hullabaloo answered 4/11, 2013 at 1:53 Comment(2)
The second idea could certainly work, it also certainly helps that in this case all readers use a view for this table (it joins the fact table with its dimensions and helps with partition elimination). If the view is updated to something like SELECT * FROM GiantFactTable_old UNION ALL SELECT * FROM GiantFactTable the readers will be none the wiser.Overalls
Also of significance, is that the second suggestion doesn't require additional temporary storage to implement. (this is a giant table after all!)Overalls

© 2022 - 2024 — McMap. All rights reserved.