Auckland: Living City
Today's post is inspired by the Auckland Museum Living City exhibit, which projects larger-than-life data visualizations onto the floors and walls of Auckland Museum. Check it out if you're in New Zealand; overseas readers can sneak a peek at the Auckland Museum Living City Instagram post.
Sankey Custom Visual
Scores
Data Clarity: 6/10
Versatility: 3/10
Fun Factor: 6/10
PBI Certified: Yes
Description
The Sankey Chart is a specialized type of visual designed to display the flow of data from source to destination. For that reason, it can be difficult to find the right data to use with this visual or you may need to perform many transformations on your data model to get it into a format ready to use with a Sankey visual.
I have used it to display the breakdown of complex hierarchies, turning them into a flowchart of the parent-child relationships in the dataset. It's commonly used in the energy industry, but as Auckland Museum have shown, it works well with anything that moves when you want to show the relationships from start to finish.
How To
Format the visual
The Sankey visual enables the end user to manipulate the data by dragging and dropping the nodes to move the links around and change the overall flow of the visual.
The Sankey visual has a few basic formatting options:
Data labels: This will turn on the labels for the nodes. You have the typical options for font size, color, etc. There is also an option here to turn on Force display, which will ensure that all nodes are labeled without the user needing to hover over the visual. This is great for small datasets, but can make larger datasets look very cluttered. Unfortunately, there is no option to specify which nodes to force display for; it's either default, all, or nothing.
Data link labels: This will turn on the labels for ALL the data links. You'll see in the video below that this can be time consuming and cluttered!
Links: This lets you choose a color for the links. There is no single setting for all of them; you must set the color individually for each unique link combination.
Nodes: This lets you choose a color for each of the nodes. Again, this must be done individually for each node.
Scale settings: This is a useful option that lets you change the scale. The default is set to 'Provide min optimal height of node', which can make it difficult to see which node has a larger population/weight. Turning this off displays the nodes accurately to scale, but can make some nodes too small to see. There is also an option for logarithmic scales here.
Cycles displaying: Not applicable for this dataset, but useful for displaying the flow of items that can return to their source.
All the other formatting options are the same as for standard visuals: title, size, background, etc.
This is definitely a time consuming visual to set up, but the rewards can be great if you put in the effort and have the right data to display.
Model the Data
The most challenging part of using the Sankey visual might just be getting your data formatted properly. If you want to use the Sankey visual for a dataset that isn't already structured as Source and Destination (such as one showing multi-level hierarchical parent-child relationships), you may need to transform the data to make it all visible in the Sankey.
In the video below I show one method for turning a flat table into a Source > Destination table for Sankey visualization. Here you'll see how to create a series of queries that reference and group your raw data, and then append all those queries into one long table for the Sankey visual.
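If you'd like to see the shape of that transformation outside of Power BI, here is a rough Python sketch of the same group-then-append idea. It is not the Power Query steps from the video. The sample rows, the column names (Region, City, Suburb), the Weight column, and the flat_to_links function are all made up for illustration.

```python
from collections import Counter

# Hypothetical flat table: one row per item, one column per hierarchy level.
rows = [
    {"Region": "North", "City": "Auckland", "Suburb": "Parnell"},
    {"Region": "North", "City": "Auckland", "Suburb": "Newmarket"},
    {"Region": "North", "City": "Hamilton", "Suburb": "Frankton"},
    {"Region": "South", "City": "Dunedin", "Suburb": "Roslyn"},
]


def flat_to_links(rows, levels):
    """Build Source/Destination/Weight rows from adjacent hierarchy levels."""
    links = []
    # One pass per pair of adjacent levels, e.g. Region > City, then City > Suburb.
    for source_col, dest_col in zip(levels, levels[1:]):
        # Group by the (source, destination) pair and count the rows in each group.
        counts = Counter((row[source_col], row[dest_col]) for row in rows)
        # Append this level's grouped rows to the one long table.
        for (source, dest), weight in counts.items():
            links.append({"Source": source, "Destination": dest, "Weight": weight})
    return links


links = flat_to_links(rows, ["Region", "City", "Suburb"])
for link in links:
    print(link["Source"], ">", link["Destination"], link["Weight"])
```

Each pass of the loop plays the role of one grouped query, and the links list is the long appended table. I've used a simple row count as the weight here; depending on your data you might sum a value column instead.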