In SQL Server 2008+, we'd like to enable tracking of historical changes to a "Customers" table in an operational database.
It's a new table and our app controls all writing to the database, so we don't need evil hacks like triggers. Instead we will build the change tracking into our business object layer, but we need to figure out the right database schema to use.
The number of rows will be under 100,000 and number of changes per record will average 1.5 per year.
There are at least two ways we've been looking at modelling this:
As a Type 2 Slowly Changing Dimension table called
CustomersHistory
, with columns forEffectiveStartDate
,EffectiveEndDate
(set toNULL
for the current version of the customer), and auditing columns likeChangeReason
andChangedByUsername
. Then we'd build aCustomers
view over that table which is filtered toEffectiveEndDate=NULL
. Most parts of our app would query using that view, and only parts that need to be history-aware would query the underlying table. For performance, we could materialize the view and/or add a filtered index on EffectiveEndDate=NULL.With a separate audit table. Every change to a
Customer
record writes once to theCustomer
table and again to aCustomerHistory
audit table.
From a quick review of StackOverflow questions, #2 seems to be much more popular. But is this because most DB apps have to deal with legacy and rogue writers?
Given that we're starting from a blank slate, what are pros and cons of either approach? Which would you recommend?
CustomersCurrentView WHERE Customer = 'John Doe' JOIN CustomersHistory JOIN Transactions
. My suggestion is - if historical data isn't used often, keep in a set of separate audit table; consider SCD 2 only if history-aware components constitute an important piece of the application. +1 for a very interesting question! – Consentaneous