How to design a shrunken dimension table for dates in dimensional warehouse and use in SSAS?
Asked Answered
D

1

6

I'm working in a situation where we're moving from having a bunch of transactional fact tables to a more complicated picture with aggregates, snapshots, etc. In the past, there were a few cases where data needed to be aggregated by month, but previous developers had just put the key for the first day of the month it belonged to into a column in the fact table, and pointed it to the usual date dimension. This seems to work OK, we have day/month/year hierarchies in the cubes for each date dimension, and users are doing fine when they need to look at things by month.

When I read around - mostly Kimball's work, but other guides as well - the suggestion is that we should be using a "shrunken dimension" in these cases. The Kimball Group even specifically mention it in regards to a Month dimension. But I'm really not finding a whole lot of information about implementing them past that article, and brief write-ups that seem to be re-phrasing parts of it.


One of my particular concerns is that at the moment, people using our cubes are used to having one date dimension for each different type of date with year-month-day hierarchies, and they just only go down to month level when that's what they need. If this is going to result in a separate dimension with a year-month hierarchy, then it seems like it might be unwelcome clutter. But is this the intention?

The last two paragraphs in the linked article are the only thing I've found tackling how this should work in the presentation layer, and I just don't get what they're trying to describe. It feels short a couple of examples to flesh out how this should appear in a cube. Normally, I'd just trial-and-error this, but timescales are very tight. So...

  • If I do this, what is the intended display in the cube? Would I have two separate date dimensions, with one that only goes down to month?
  • If the above is correct, is there really much point in this, seeing as people can currently happily query things at a monthly level without it? I feel like I'm missing what the benefits really are. I can see it's more semantically correct (we're at month level, so holding first day of the month is hacky and will show unrelated attributes), but with users who are already used to this, I'm not convinced that's enough reason to spend more time on this just now. I can see it could perform better given it'd be a smaller dimension, but we're not having performance issues as-is. Am I missing something?
  • If I do go ahead with the changes, any tips on getting the shrunken dimension working in the cube? Normally I can dig around online until I narrow something down to the best couple of options, but there's really not much around, and I'd appreciate hearing from someone who's done this before. Not looking for anything huge, but something either written a little more technically than that article or a mini-example would probably leave me feeling a lot clearer about what needs doing and why. The Kimball article especially confused me when discussing needing to join the base dimension to the shrunken dimension in order to see attributes.

First two points are the big ones, because I'd know whether I need to make any data warehouse changes, and get them done if so - I would be very glad for an answer on those, even if you can't cover the third point.

Drees answered 7/6, 2015 at 20:11 Comment(5)
I went to Kimball training and asked Joy about this. She basically said that they assume a good analytics tool has drill-across features in addition to drill-down features. The MSBI stack doesn't provide this functionality natively with SSAS. So mini dimensions won't work the way they are suggesting they should. I tend not to use them with SSAS for this reason unless it can still provide good usability. The problem with relating your fact with your existing date dimension is there is probably a level of granularity (day) that isn't applicable for a fact at the month level.Damselfish
You are correct, you would end up with a full date dimension and then a month dimension. You would need to update their names to appropriately reflect their contents. An alternative to the mini dimension is to scope the measures so they blank out when you drill to an inappropriate level. (This is my suggestion, not Joy's.)Damselfish
@Damselfish - Thanks, that's great information. Probably explains why people had avoided using them here, and why I'm struggling to find information! What you've said seems like very valuable information even if there are other possible ways forward - any chance of writing it up as an answer? I might not accept straight away just to see if we get any other thoughts, but will definitely up-vote as it's already been helpful.Drees
@Damselfish I've added a bounty on this question which is about to expire and no one else has answered. If you feel like writing up your comments as an answer, I'll be happy to accept it and award it the bounty as it has been useful information! In the mean time, still very much open to any other answers.Drees
I think you probably concluded it wasn't worth the effort to change, and that's probably right. But I think you could implement it an SSAS-friendly way by including the surrogate MonthKey in the Date dimension, so your SSAS measure group at month level joins to the Date Dimension on MonthKey. You wouldn't be using the Month dimension itself at all in SSAS but at least it would be consistent/correct. It might be the way I'd go about it in a new system but I wouldnt' change it if I found it, the benefits are too small.Matamoros
M
4

This is not an answer nor it is a Cognos fanboi reposonse. For comparison I want to highlight how multi grain facts are modelled in other tools.

http://www-01.ibm.com/support/knowledgecenter/SSWGNW_8.0.0/com.ibm.swg.im.cognos.ug_best.8.4.0.doc/ug_best_id1339multi-factmulti-grainquery.html%23multi-factmulti-grainquery

http://www.cognoise.com/index.php?topic=17992.0

In the first link:

  • The monthly table has a month key and is joined to the month in the calendar table
  • The daily table has a day key and is joined to the same calendar table
  • What the link doesn't show is that you define the hierarchy levels behind the scenes so that the tool automatically knows not to double count the monthly level data
  • The result is the tool automatically knows how to roll up the facts

I'm not a SSAS expert but it appears it does not support this kind of functionality.

If that is the case then it seems to me there is no point in modelling the data 'correctly'. By correctly I mean assigning a particular month to a fact that is only defined at a monthly level.

So far I see no problem with modelling this by assigning a particular day in the month. If the fact table is all at the same level (monthly) then we know that a date in the table represents a month. At the least you might want to put on a check constraint that ensures it's the first of the month so there is no ambiguity.

The result is when you observe monthly and daily facts at a monthly level, everything is consistent. When you observe monthly and daily facts at a daily level you see a big chunk at the start of the month. If you could use SSAS to hide the measure at this level.. problem solved.

Merat answered 17/6, 2015 at 0:16 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.