This is the general order of operations for building a successful report in Power BI desktop. In this series, I want to discuss relationships in Power BI and why it's important for you to have a basic understanding of the underlying Power BI data model.
In It's Complicated: Relationships in Power BI Part 1 (Fact vs Dimension) we discussed how to identify Fact and Dimension tables in a typical star schema:
- Fact table - usually one per report (though it's possible to have multiple fact tables in one data model) that is the central table of transactions (sales, events, logs, requests, audits, etc).
- Dimension tables - as many as needed to store commonalities, demographic or dimension data about the transactions (customers, locations, dates, leave type, department, etc)
Cardinality
- 1 to 1
- 1 to many / many to 1
many to many
FactTrainingAttendance
EmpID | CourseID | Date | Result |
1 | PB1 | 24-Jun-2020 | 95 |
2 | EX1 | 20-Jun-2020 | 70 |
1 | PB2 | 30-Jun-2020 | 80 |
3 | EX1 | 20-Jun-2020 | 80 |
DimCourse
CourseID | Course Name | Course Category | Level |
PB1 | Power BI Essentials | Power BI | Beginner |
PB2 | Power BI Next Level | Power BI | Intermediate |
EX1 | Excel Essentials | Excel | Beginner |
We can see that CourseID exists in both tables. In order to be able to use this as a key relationship column, we need to check a few things:
- Does the data type match? Yes! Both are text.
- Are they in the same format? Yes! Both have two letter abbreviation followed by number with no spaces.
- Is there a one to one or one to many cardinality? Hmm, what does that mean again?
Remember we defined cardinality as the number of items in a set. I said you need to think of your set as a single unique value for the column. We have three sets to look at here; PB1, PB2 and EX1. In order to see if this relationship follows the cardinality we need, I must count the number of times each item appears in each table.
DimCourse
CourseID | Number of Times in FactTrainingAttendance | Number of Times in DimCourse | Unique in Dim table? |
PB1 | 1 | 1 | Yes |
PB2 | 1 | 1 | Yes |
EX1 | 2 (many) | 1 | Yes |
Note: You can use the Column Distribution option in Power Query Editor (tick the box in the View tab i the ribbon to turn on Column Distribution) to see the cardinality of each item for the first 1000 rows of your data.
FactTrainingAttendance: CourseID Cardinality = many
In the image above, you can see that CourseID has more distinct values than unique values. This means that not every CourseID is unique in the Fact table, and that some CourseIDs appear multiple times in the FactTrainingAttendance table. The cardinality for CourseID in the fact table is many.DimCourse: CourseID Cardinality = 1
In the DimCourse table, we can see that the number of distinct values is equal to the number of unique values. This means that every CourseID in the DimCourse table is unique and only appears one time in the Dimension table. The cardinality for CourseID in the dimension table is 1.
DimCourse to FactTrainingAttendance: Cardinality = 1 to many
- Does the data type match? Yes! Both are text.
- Are they in the same format? Yes! Both have two letter abbreviation followed by number with no spaces.
- Is there a one to one or one to many cardinality? Yes! Cardinality is 1 to many.
BONUS: Choosing the right key
What if we change our data slightly and add a new column to the fact table. Imagine our data now looks like this:
FactTrainingAttendance
EmpID | Course Number | Course Category | Date | Result |
1 | 1 | PB | 24-Jun-2020 | 95 |
2 | 1 | EX | 20-Jun-2020 | 70 |
1 | 2 | PB | 30-Jun-2020 | 80 |
3 | 1 | EX | 20-Jun-2020 | 80 |
DimCourse
Course Number | Course Name | Course Category | Level |
1 | Power BI Essentials | PB | Beginner |
2 | Power BI Next Level | PB | Intermediate |
1 | Excel Essentials | EX | Beginner |
Could we use Course Category and Course Category as our key columns for the relationship between these two tables? Let's review our checklist again:
- Does the data type match? Yes! Both are text.
- Are they in the same format? Yes! Both spell out the entire category.
- Is there a one to one or one to many cardinality? No! Cardinality is many to many.
Power BI category (PB) appears many times in both the Fact and the Dimension table. The key column in the dimension table must ALWAYS have a cardinality of one, meaning that every value in the key column of the dimension table must be unique.
In this example, if we try to use Course Category as the key column for the dimension table, I would not know which Level to assign to the course on 24 Jun - I know that it's in the Power BI (PB) category, but when I check the DimCourse table for Course Category = PB, I have two choices; Beginner and Intermediate. Therefore, my results would be ambiguous. I have Power BI Essentials and Power BI Next Level to choose from.
If we use the CourseID, I know that the course on 24 Jun is Course Number 1. Again, Course Number 1 appears multiple times in the DimCourse table - again my results would be ambiguous. I have Power BI Essentials and Excel Essentials to choose from.
How do I know which course to choose?
Power BI and Excel only allow us to use ONE column to create relationships between our tables, so I must create a single column unique CourseID. Looking at the above options:
Course Category as key gives - Power BI Essentials, Power BI Next Level
Course Number as key gives - Power BI Essentials, Excel Essentials
Power BI Essentials is the only course in both lists, so this must be the right choice. However, I can only know this if I look at BOTH Course Category AND Course Number.
So let's merge these into a single column in both tables: CourseID
FactTrainingAttendance
EmpID | Course Number | Course Category | Date | Result | CourseID |
1 | 1 | PB | 24-Jun-2020 | 95 | PB1 |
2 | 1 | EX | 20-Jun-2020 | 70 | EX1 |
1 | 2 | PB | 30-Jun-2020 | 80 | PB2 |
3 | 1 | EX | 20-Jun-2020 | 80 | EX1 |
DimCourse
Course Number | Course Name | Course Category | Level | CourseID |
1 | Power BI Essentials | PB | Beginner | PB1 |
2 | Power BI Next Level | PB | Intermediate | PB2 |
1 | Excel Essentials | EX | Beginner | EX1 |
If we use the new CourseID column as key, I know that the course on 24 Jun is CourseID PB1. PB1 is only in the DimCourse table once, so I know that it falls into Beginner Level.
Conclusion
Make sure you understand your data well enough to verify that the relationships created in Power BI and Excel data models are correct. Fact tables should have cardinality of many. Dimension tables should have cardinality of one. Not all data models are as straightforward as this example, so reach out if you have any questions.