The first of the Paralympic Games medals have been earned already - I'll be posting a blog soon on how to use .pbit files and update data sources using parameters.
COVID 19: New Zealand is now in Level 4
Today though I'm going to digress slightly from the celebratory mood of the Olympics and return to the COVID 19 data analysis. I've had a few requests from friends and family to dig up my old report and visualize what's going on in the NZ community (not including MIQ facilities). View the full COVID 19 NZ report in Power BI.
Question: What do the NZ Community COVID cases look?
The Ministry of Health website graphs all NZ cases, including those in Managed Isolation and Quarantine. It's difficult to get a good picture of how many community outbreak cases we're dealing with, and how that compares to the other outbreaks we've seen in New Zealand.
So what does the COVID 19 data look like over time if you remove the Managed Isolation & Quarantine cases?
Unfortunately, the Ministry of Health website has changed quite a bit since I first created the COVID report, and there are no longer any web based tables available to use as data source.
Challenge: Auto Refresh a Dynamic Data Source
Instead, all the data I need is uploaded daily in a CSV file which includes the name of the date in the filename. The web link to access this CSV file is:
https://www.health.govt.nz/system/files/documents/pages/covid_cases_2021-08-25.csv
In order to make the refresh dynamic, I'm going to use a few Power Query tricks.
Step 1: Identify Data Source parts
The CSV file seems to be uploaded between 1-4pm each day for that day, so if I schedule the refresh for 4pm daily I can use today's date as part of the data source URL.
Recall the URL is:
https://www.health.govt.nz/system/files/documents/pages/covid_cases_2021-08-25.csv
We're going to break this into:
- Root website: https://www.health.govt.nz
- Relative path part 1: /system/files/documents/pages/covid_cases_
- Relative path part 2: 2021-08-25
- Relative path part 3: .csv
Note the Relative path part 2 matches today's date in the format: yyyy-MM-dd
Step 2: Generate Dynamic Relative path part
I can't use a normal parameter for the Relative path part 2 because I need it to update automatically on refresh. So, I'm going to create a new blank query instead. This Query is really simple, just has one step that combines a few M code date functions:
letSource = Date.ToText(Date.From(DateTime.FixedLocalNow()), "yyyy-MM-dd")inSource
I have named this query: DateCSV
Step 3: Use relative path to get Dynamic Data
Normally we'd be able to split the URL into parts, but with a dynamic data source that will give an error.
INCORRECT Code
letSource = Web.Contents("https://www.health.govt.nz/system/files/documents/pages/covid_cases_"&DateCSV&".csv")inSource
Unfortunately, if you use the above Source code and Publish to PowerBI.com, you'll get the following error explaining that dynamic data sources cannot be refreshed in the Power BI service:
This error is not very helpful - it doesn't even specify the correct name of the query, instead just referring to 'Query1'.
When you follow the link to Learn more from the error message, you will find a short message that doesn't quite provide you with enough information to resolve the data error:
In most cases, Power BI datasets that use dynamic data sources cannot be refreshed in the Power BI service. There are a few exceptions in which dynamic data sources can be refreshed in the Power BI service, such as when using the RelativePath and Query options with the Web.Contents M function. Queries that reference Power Query parameters can also be refreshed.
Picking up on the keywords, a quick web search for "RelativePath and Query options with the Web.Contents M function" returns Chris Webb's super helpful blog.CrossJoin post Using The RelativePath And Query Options With Web.Contents() In Power Query And Power BI M Code, to which I have referred many times.
Finally, we're getting somewhere and you may finally be able to see why I split the URL into a Root path and four Relative paths in Step 1.
CORRECT Code
letSource = Web.Contents("https://www.health.govt.nz", [RelativePath = "/system/files/documents/pages/covid_cases_"&DateCSV&".csv"])inSource
This correct code uses a static URL for the Web.Contents() function. You'll find this process easier if the root URL is an actual functioning website, which is why my Relative Path part 1 is still static.
Now we can refer to the dynamic DateCSV query within the RelativePath portion of the Web.Contents() function and overcome the dynamic data refresh limitations in Power BI service.
Voila! Now we can schedule a refresh on this dataset in Power BI service.