Dimensional Modelling - ambiguous relations

Asked 10/8, 2016 at 9:59 Answered 18/8, 2016 at 11:54

I've been trying to solve an issue, and to date I haven't been able to reach what I'd say is an optimal solution. I have a dimension (Features) which needs to be referenced in 2 other dimensions (Actions and Sessions), which in turn are referenced from the same Fact table (UserAction). This creates ambiguity and I can't complete the schema:

(note: snip of the model, not the whole thing) (included the bridge tables to show some of the added complexity in the model with many-to-many relationships)

I think the issue might be with Dim_Features technically having different meaning between both dimensions, but I'm still trying to use it as the same? It means both:

An Action belongs to this Feature / Feature Area
A Session had this Feature / Feature Area available (owned)

What I need to accomplish is being able to filter/slice Fact_UserActions by Sessions where certain features are available / unavailable, to then analyse things like:

Which Features are used when Feature 'A' is owned (as in, correlations between certain features being ownes, and others being used)?
How many users who own a Feature have not used it?
How often is a Feature used? (constrained by population of sessions that own it, ie. where it could actually be used)

Any ideas on what I might be doing wrong, or how I might improve the model?

EDIT: In case it helps, the sort of thing we'd want to get out of this is a table such as:

Where we can see the impact a feature has on the population as a whole, and within the population that owns it.

Vilma answered 10/8, 2016 at 9:59 Comment(2)

I don't know if business logic is reasonable or not but in a quick glance I see a relationship problem. There is a many-to-many relationship between Fact-UserAction and Bridge_SeasonToFeaturesOwned (-to-1 and 1-to- leads to -to-) which is not supported. – Moneymaking 11/8, 2016 at 22:1

Isn't this exactly what Bridge tables are for? To create many-to-many relations? In this case mapping [actions] to [features areas it belongs to] and [sessions] to [features it owns]. – Vilma 12/8, 2016 at 14:13

I think your problem is that you're working at the wrong grain. The standard Kimball advice for star schemas is to always find the absolute lowest grain, because you'll always be able to aggregate up.

Have a look at all of the questions you want to be able to answer - they're all about use of Features, yet the Fact table you're using to analyse Features isn't at Feature usage level. The Bridge tables exist to try to work around this.

It's important to remember that the vast majority of the time, your Dimensions should only be related indirectly, though Fact tables. Sometimes you need a Bridge table, but relatively rarely.

It's hard to come up with a suggested schema here without knowing how it fits in to the rest of the model, but consider the following:

Replace Fact_UserAction with something like Fact_FeatureUsage.
Have action_id, session_id, and feature_id in Fact_FeatureUsage.
Get rid of your Bridge tables.

Frore answered 18/8, 2016 at 11:54 Comment(4)

Interesting point on the grain of the fact table. We are however interested in Action usage, and an action happens to fall within one or more 'Feature' (really these represent feature groups or areas, categorizing actions). eg. Line.Plotted action will fall within Lines and Charts 'Features'. Our current solution to the problem involves 2 fact tables (one being SessionLogged, which includes feature ownership within the session, and then we indirectly compute Measures across the tables.. This feels wrong to me, as we're not really slicing the ActionUsage facts by feature owned.. – Vilma 18/8, 2016 at 13:27

Keep in mind that this this answer is purely based on answering the question you've asked here, which is very much about feature use reporting. It might be that you need another fact if you have some measures that really cannot be split down to feature use level. – Frore 18/8, 2016 at 15:0

Ok, let's ignore the lower grain of the actions for a second. Even if we have a Fact_FeatureUsed which had a feature_id in it (fact= feature used), how do you then relate that fact to the same (or another) Feature being owned in that session? In other words, how would I be able to slice the Feature usage fact table by 'Feature ABC is owned', so that I can see % of users of any feature in the context of owners of ABC? Is the only way to have a separate Feature<Owned> dimension? It makes it hard to generalize the '% Owners who have used' question without lots of hand-crafted visuals – Vilma 18/8, 2016 at 15:13

eg. I couldn't set up a report and with a page filter on 'Feature = ABC' get a display of '% Owners of ABC who have used it = 45%'.. As I'd have to make some form of individual mapping between the 2 identical concepts. Maybe that's the answer. have some form of extra fact table that links together Feature<Used> and Dim_Feature<Owned> in a way that makes this specific type of question easy to dynamically answer? (sorry, I'm brainstorming with myself, feel free to join in :) ) – Vilma 18/8, 2016 at 15:18

I would remove the relationships to Dim_Features and then hide it.

Then I would create two New Tables (while on the Report or Data view, go to the Modeling ribbon and click New Table). The DAX expressions for each would be something like:

Features (Actions) = 'Dim_Features'

Features (Sessions) = 'Dim_Features'

Now you have 2 independent copies of the Dimension table, and you can create the relationships to each of those in the Relationships window.

Hereafter answered 15/8, 2016 at 3:22 Comment(7)

This was one of my suggestion, handling Features as 2 separate dimensions. This means they are not directly related to each other, and makes clearly reporting things like "% of owners who have used" for each feature quite difficult (if possible at all - unless I'm missing something). Is this really the only way to go? Is there no smarter way to organise my dimensions I might be missing? – Vilma 16/8, 2016 at 8:29

Take a look at "Adjusting cross filter direction for a complex set of tables of relationships" section in this link – Guillen 16/8, 2016 at 8:38

Given your schema above, I don't understand how you think your output table can work - a single Fact row can be assigned to 2 different Features. – Hereafter 16/8, 2016 at 23:4

In the table above, the Fact is a user action. A user action (such as drawing a line on a chart) can be considered to belong to multiple features in a hierarchy - Lines (interactions with lines), Chart Annotations (any interaction with lines or other shapes), Charts (any interaction with charts), etc. This is why we have this many-to-many relationship between an action and a feature. – Vilma 18/8, 2016 at 15:41

Imagine you have Fact_UserAction and Dim_Features on a different axes of the same visual. Which relationship route should PowerBI pick, through Dim_Actions, or through Dim_Sessions? These kind of ambiguities are not allowed in PowerBI even though this would be perfectly fine for a database schema. Please search for "ambiguous" and "ambiguity" in the link I posted. – Guillen 19/8, 2016 at 8:25

@Guillen yes, I understand that there is ambiguity, that's what I'm trying to solve here. Mike Honey, I just got what you meant. we are trying to use the UserAction facts to measure usage of actions + features, which have different grains! We can filter UserAction by the feature, but we shouldn't be measuring feature usage there, which points towards a new Fact_FeatureUsed table. It can also reference Session (+ owned features), but Feature(Used) is definitely a separate concept. Now I need solve out how to map them together so I can easily report Usage by Owners of the 'same' feature. – Vilma 19/8, 2016 at 9:14

@Vilma for that I would try to build an "Owner Bridge" table to relate each Fact records to it's Owner(s). The columns would be just a Fact Key (you could concatenate action_id and session_id) and the Owner attributes (if there are a lot, leave them in an Owner table and just use the Owner Key on the Bridge table). I'd probably tackle that in the Edit Queries layer with a Append then a Group By to generate the rows (1 or 2 per fact row, from Actions then Features). – Hereafter 19/8, 2016 at 23:8

Recommended topics

Hot tags