In data warehousing, we have the concept of slowly changing dimensions. I am just wondering why there is no jargon for 'slowly/rapidly changing FACTs' because the same Type1, Type 2 measures can be used to track changes in the FACT table.
According to the DW gods there are 3 types of FACT tables
- Transaction: your basic measurements with dim references. measurements not rolled up or summarized, lots of DIM relations
- Periodic: Rolled up summaries of transaction fact tables over a defined period of time.
- Accumulating Snapshot: measurements associated with a 2+ defined time periods
From these we have at least 2 options that will result in something pretty similar to a slowly changing fact table. It all depends on how your source system is set up.
Option 1: Transactional based Source System
If your source system tracks changes to measurements via a series of transactions (ie, initial value + change in value + change value etc) then each of these transactions ends up in the transactional fact. This is then used by the periodic fact table to reflect the 'as of period' measures.
For example, if your source system tracks money in and out of an account you would probably have a transaction fact table that pretty much mirrored the source money in/out table. You would also have a periodic fact table that would be updated every period (in this case month) to reflect the total value of the account for that period
The periodic fact table is your Slowly Changing Fact table.
Source DW_Tansaction_Fact DW_Periodic_Fact --------------- -> ------------------- -> -------------------- Acnt1 Jan +10$ Acnt1 Jan +10$ Acnt1 Jan 10$ Acnt1 Feb -1 $ Acnt1 Feb -1 $ Acnt1 Feb 9$ Acnt1 Apr +2 $ Acnt1 Apr +2 $ Acnt1 Mar 9$ Acnt1 Apr 11$
Option 2: CRUD/Overwriting Source System
Its more likely you have a source system that lets users directly update/replace the business measurements. At any point in time, according to the source system, there was and is only one value for each measure. You can make this transaction by some clever trickery in your ETL process but your only ever going to get a transaction window limited by your ETL schedule.
In this case you could go either with a Periodic Fact table OR an Accumulating fact table.
Lets stick with our account example, but instead of transactions the table just stores an amount value against each account. This is updated as required in the source system so that for Acnt1, in January it was 10$, February 9$ and April 11$
Sticking the the transaction and period fact tables we would end up with this data (As at end of April). Again, The periodic fact table is your Slowly Changing Fact table.
DW_Tansaction_Fact DW_Periodic_Fact ------------------- -> -------------------- Acnt1 11$ Acnt1-Jan-10$ Acnt1-Feb-09$ Acnt1-Mar-09$ Acnt1-Apr-11$
But we could also go with with an Accumulating Fact table which could contain all month values for a given year.
DW_Accumlative_Fact_CrossTab Year Acnt Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec 2001 Acnt1 10 9 9 11 - - - - - - - -
Or a more type3-ish version
DW_Accumlative_Fact_CrossTab Acnt Year YearStartVal CurrentVal Acnt1 2001 10 9
Kindof relevant
In my experience, this sort of question comes up when this common business scenario:
- There is a Core Business System with a DATABASE.
- Business Periodically Issues Reports that summaries values by time periods from Core Business System
- Core Business System allows retrospective updating of data - This is handled by overwriting values.
- Business demands to know why the January figures in the same report run in June no longer match the January figures from the report run in February.
Note that you are now dealing with FOUR sets of time (Initial period of report, measurement at date of initial period, current report period, measurement at current period) which will be hard enough for you to explain let alone your end users to understand.
Try to step back, explain to your end users which business measures change over time, listen to what results they want and build your facts accordingly. Note that you may end up with multiple fact tables for the same measure, that is OK and good.
Reference:
© 2022 - 2024 — McMap. All rights reserved.
orders
fact table withdate
andproduct
dimension andsales_quantity
,sales_amount
measure. The source transaction table has a flag column indicates the status of an order (paid
,cancelled
,refund
, etc) could change by time, an order of yesterday when we do ETL it was paid, and today it's cancelled. How could we handle this kind of data? I think this is a kind of fact that changes. – Mender