Search

Search this blog:

Search This Blog

It's Complicated: Relationships in Power BI Part 1 (Fact vs Dimension)


Get Data > Transform and Clean Data > Check Relationships > Visualize Data

This is the general order of operations for building a successful report in Power BI desktop. In this series, I want to discuss relationships in Power BI and why it's important for you to have a basic understanding of the underlying Power BI data model. 

UPDATE January 2023: Microsoft Learn have recently updated their explanation of star schema in Power BI with a much more useful explanation that can be found here: 
https://learn.microsoft.com/en-us/power-bi/guidance/star-schema

The most common data model is a star schema, and consists of: 
  • Fact table - usually one per report (though it's possible to have multiple fact tables in one data model) that is the central table of transactions (sales, events, logs, requests, audits, etc). 
  • Dimension tables - as many as needed to store commonalities, demographic or dimension data about the transactions (customers, locations, dates, leave type, department, etc)

Note that Dimension tables have nothing to do with dimensions of objects (as pictured), except that in both cases you must ensure you have as many dimensions as needed to paint a clear picture of your desired result.

In a typical star schema, we create a relationship between the Fact table and each Dimension table based on key identifiers, or columns that both tables have in common. It is not always straightforward which table is the Fact table and which tables you need to use for Dimension tables, so let's start with a few examples. 

Example 1: Employee Training

We will start by using employee training as an example, where you want to track the number and type of training that each employee has attended in the past month. 

Fact Table: Training Attendance

Your Fact table might look something like this: 

FactTrainingAttendance

 EmpIDCourseID Date Result 
 1PB1 24-Jun-2020 95
 2EX1 20-Jun-2020 70
 1PB2 30-Jun-2020 80 
 3EX120-Jun-202080 

Note that we have one row for each transaction; in this case a transaction is every attendee at a training event. This means that each event is listed multiple times in the Fact table, as we had multiple employees in attendance. It also means that each employee can be listed multiple times in the Fact table, as an employee can attend multiple training sessions. 

I can see at a glance that 3 employees have attended three different courses, but I do not have any specific information about those employees or the courses. 

Dimension Tables: DimEmployee, DimCourse

What if I want to know the gender, job title or age group of those employees? What if I want to know the category of the course? In order to capture and report on this information, I will add two new Dimension tables to my model. The reason for using Dimension tables here is that the information in these tables stays static across many transactions: 
  • An employee does not change their birth date or gender each time they attend a course. 
  • A course does not change categories each time it is run (an Excel course is always in the Excel category, a SharePoint course will always be SharePoint category). 
The dimension tables we need to answer the questions about attendance by gender or course category therefore look like this: 

DimEmployee

EmpIDName    Date of BirthGender
 1John24-Jun-1960 M
 2Paul20-Aug-1988 M
 3Mary14-Feb-1975F

DimCourse

CourseIDCourse NameCourse CategoryLevel
 PB1Power BI EssentialsPower BIBeginner
 PB2Power BI Next LevelPower BI Intermediate
 EX1Excel EssentialsExcelBeginner

Related Columns

Once you have identified the Fact and Dimension tables, you must identify the key identifying columns that relate the two tables. In Power BI we can only select one column from each table to make this connection. In this example, how can we relate the FactTrainingAttendance table to the DimEmployee table? How can we relate the FactTrainingAttendance table to the DimCourse table? 

FactTrainingAttendance[EmpID] relates to DimEmployee[EmpID]
FactTrainingAttendance[CourseID] relates to DimCourse[CourseID]


Example 2: Incident Reporting

Let's use incident reporting for our second example describing Fact and Dimension tables, since it differs a bit from the typical sales examples already posted on the internet.

Fact table: Incident Reports

If you are reporting on incidents that have been reported over the past month, you might see a FactIncidentReports table with columns such as:
  • IncidentID
  • IncidentDate
  • EmployeeID
  • ReportID

Dimension tables: DimDate, DimEmployee, DimIncident

From that small amount of information, I can use the Dimensional (demographic/static) data that I already have, to find:
  • Month
  • Employee name, gender, age
  • Incident type, severity
This information can be pulled from the Dimension tables: 

DimDate

  • Date
  • Year
  • Quarter
  • Month
  • Month Number
  • Day of Week

DimEmployee

  • EmployeeID
  • Employee Name
  • Gender
  • Date of Birth

DimIncident

  • IncidentID
  • Incident Name
  • Incident Type
  • Severity

Related Columns

If we focus on the DimEmployee table, I only need to pull in data for each employee ONCE, to save space in my data model. We can use the EmployeeID column to connect the DimEmployee table, and all the data in it, to every incident that employee reports in the FactIncidentReports table, which may be MANY over the course of a month or year. 

FactIncidentReports[IncidentID] relates to DimIncident[IncidentID]
FactIncidentReports[IncidentDate] relates to DimDate[Date]
FactIncidentReports[EmployeeID] relates to DimEmployee[EmployeeID]

Star Schema

Now that we've seen a few examples, you can understand the common identifying features of a Star Schema data model: 
  • a Fact table - one row per transaction, few columns to avoid repeating information
  • a Dimension table - holds static details that relate to a key identifier in the Fact table, saves space in data model by avoiding duplicate information or double data entry
Once you have identified your Fact and Dimension tables, you are now ready for the next step - creating relationships between them. In the next post we will look at how to create these relationships and something called Cardinality. 

Custom Visual Review: Charticulator

This is not your ordinary custom visual - this is EVERY custom visual. Charticulator puts the power to design and develop custom visuals to ...