I'm working in a situation where we're moving from having a bunch of transactional fact tables to a more complicated picture with aggregates, snapshots, etc. In the past, there were a few cases where data needed to be aggregated by month, but previous developers had just put the key for the first day of the month it belonged to into a column in the fact table, and pointed it to the usual date dimension. This seems to work OK, we have day/month/year hierarchies in the cubes for each date dimension, and users are doing fine when they need to look at things by month.
When I read around - mostly Kimball's work, but other guides as well - the suggestion is that we should be using a "shrunken dimension" in these cases. The Kimball Group even specifically mention it in regards to a Month dimension. But I'm really not finding a whole lot of information about implementing them past that article, and brief write-ups that seem to be re-phrasing parts of it.
One of my particular concerns is that at the moment, people using our cubes are used to having one date dimension for each different type of date with year-month-day hierarchies, and they just only go down to month level when that's what they need. If this is going to result in a separate dimension with a year-month hierarchy, then it seems like it might be unwelcome clutter. But is this the intention?
The last two paragraphs in the linked article are the only thing I've found tackling how this should work in the presentation layer, and I just don't get what they're trying to describe. It feels short a couple of examples to flesh out how this should appear in a cube. Normally, I'd just trial-and-error this, but timescales are very tight. So...
- If I do this, what is the intended display in the cube? Would I have two separate date dimensions, with one that only goes down to month?
- If the above is correct, is there really much point in this, seeing as people can currently happily query things at a monthly level without it? I feel like I'm missing what the benefits really are. I can see it's more semantically correct (we're at month level, so holding first day of the month is hacky and will show unrelated attributes), but with users who are already used to this, I'm not convinced that's enough reason to spend more time on this just now. I can see it could perform better given it'd be a smaller dimension, but we're not having performance issues as-is. Am I missing something?
- If I do go ahead with the changes, any tips on getting the shrunken dimension working in the cube? Normally I can dig around online until I narrow something down to the best couple of options, but there's really not much around, and I'd appreciate hearing from someone who's done this before. Not looking for anything huge, but something either written a little more technically than that article or a mini-example would probably leave me feeling a lot clearer about what needs doing and why. The Kimball article especially confused me when discussing needing to join the base dimension to the shrunken dimension in order to see attributes.
First two points are the big ones, because I'd know whether I need to make any data warehouse changes, and get them done if so - I would be very glad for an answer on those, even if you can't cover the third point.