VizWiz

Data Viz Done Right

April 13, 2021

How to Calculate a Z-Score

A Z-score is a numerical measurement that describes a value's relationship to the mean of a group of values. Z-score is measured in terms of standard deviations from the mean. 

  • If a Z-score is 0, it indicates that the data point's score is identical to the mean score. 
  • A Z-score of 1.0 would indicate a value that is one standard deviation above the mean. 
  • Z-scores may be positive or negative, with a positive value indicating the score is above the mean and a negative score indicating it is below the mean. 

The calculation you need in Tableau is: 

( SUM([Profit]) - WINDOW_AVG(SUM([Profit])) ) 
 / 
WINDOW_STDEV(SUM([Profit])) 

Replace the SUM([Profit]) with whichever measure you'd like to use at the aggregation that makes sense in your data. 
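
If you want to sanity-check the numbers outside Tableau, here is a minimal Python sketch of the same arithmetic. It assumes the table calc runs over the whole list of values, and the profit figures are made up for illustration. `statistics.stdev` is the sample standard deviation, which is what WINDOW_STDEV returns (WINDOW_STDEVP is the population version).

```python
from statistics import mean, stdev

def z_scores(values):
    """Return the z-score of each value relative to the whole list."""
    avg = mean(values)       # plays the role of WINDOW_AVG(SUM([Profit]))
    spread = stdev(values)   # sample standard deviation, like WINDOW_STDEV
    return [(v - avg) / spread for v in values]

# Made-up profit values, one per mark in the view (illustrative only).
profits = [120.0, 80.0, 100.0, 140.0, 60.0]
scores = z_scores(profits)

for profit, score in zip(profits, scores):
    print(f"{profit:>6.1f} -> {score:+.2f}")
```

The value equal to the mean (100) gets a Z-score of 0, values above it come out positive, and values below it come out negative.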

Get the data used in the video here - https://data.world/vizwiz/car-sales-mock-data

April 12, 2021

#MakeoverMonday 2021 Week 15 - Fouls Called by NBA Referees

The original viz for this week was so good that I struggled to come up with something different. In the end, I wanted to learn by recreating the original. Check out #WatchMeViz and interact with the viz below.



April 8, 2021

How to Create a Ternary Graph / Triangular Chart

Ternary graphs visualise the ratios between three variables. A ternary graph requires three metrics, plotted as a triangle, where the three variables sum to a constant. You can think of it as a three dimensional scatterplot. 

Each variable is plotted on a scale of 0%-100%, relative to the largest value within that variable. 

A value plotted near the top would indicate a weighting towards the variable at the top. Likewise, a value plotted near the bottom right would indicate a weighting towards the variable at the bottom right. A value in the middle indicates that it is balanced across all three variables.
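
To make the positioning concrete, here is a rough Python sketch of the standard way to turn three values into an x/y position inside an equilateral triangle. This is the textbook barycentric mapping, not necessarily the exact calculation used in the video. The function name and the corner layout (left corner at (0, 0), right corner at (1, 0), top corner at (0.5, √3/2)) are assumptions for illustration.

```python
import math

def ternary_to_xy(top, right, left):
    """Map three non-negative values to x/y inside an equilateral triangle.

    Assumed layout: left corner (0, 0), right corner (1, 0),
    top corner (0.5, sqrt(3) / 2).
    """
    total = top + right + left
    if total <= 0:
        raise ValueError("the three values must sum to a positive number")
    t = top / total      # share of the top variable
    r = right / total    # share of the bottom-right variable
    x = r + t / 2
    y = t * math.sqrt(3) / 2
    return x, y

print(ternary_to_xy(1, 1, 1))   # balanced: lands in the middle of the triangle
print(ternary_to_xy(8, 1, 1))   # weighted towards the top variable
```

Because each value is divided by the total first, only the ratios between the three variables matter, which is the whole point of the chart.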

Download the Car Sales mock data set here. Download the triangle I used as the background image on the chart here.

April 5, 2021

#MakeoverMonday 2021 Week 14 - Multiclass Classification of Dry Beans

March 29, 2021

#MakeoverMonday Week 13 - UK Trade With the EU Since the Brexit Referendum

I must admit I got a bit stuck with this week's dataset. It was very straightforward and I didn't like how anything turned out. With some ideas from the live audience on YouTube, I started comparing exports, imports and the trade balance. I then thought about a trick I had taught the Data School about using a timeline to filter.

Great, that's it! Wrong! The layout I had in my head was all wrong. I had to move some things around and then getting everything to line up took ages! Why oh why isn't formatting easier???

Then, after creating the BANs, someone suggested that it would be good to show a zoomed in version of the line chart for the dates selected. I added those as sparklines with each BAN and it turned out so much better.

Check out and download the viz here.



Watch the video here

March 26, 2021

#WOW2021 Week 5 - Predicting HBCU Future Enrollment

Workout Wednesday 2021 week 5 required you to become familiar with the statistical functions in Tableau as well as being able to create predictions based on those stats. I hadn't done either of these before, so I knew it would be a good learning opportunity.

View the dashboard here

Building the chart itself was simple. Unlike Candra, I chose NOT to truncate the axis, because truncating the axis of an area chart distorts the magnitude of change across time.

To create the Gaussian process regression, I found information on Tableau's website about the calculation and how to configure it. For Gaussian regression, the help says to use this formula:

MODEL_PERCENTILE(
"model=gp",
AVG([Days to Ship Actual]),
ATTR(DATETRUNC('month',([Order Date])))
)

However, when I did so, the chart and values were not the same as Candra's. So I used the MODEL_QUANTILE function instead. As always, the help within the calculation window was immensely useful.


Great, I now had the line chart. But I couldn't figure out how to get the prediction to extend another five years. A Google search for "predicting the future Tableau" sent me to this link.

Step 4: Extend the date range and densify the data was exactly what I needed. There's an Extend Date Range option in the dropdown for the Year dimension that I'd never seen before.



Sweet! Once in the Custom window, the configuration is very intuitive.


Some formatting, a few calcs to get the tooltips and title correct, and a sheet to trigger the change of the measure with a parameter action and done! Check out my solution here.

March 23, 2021

How to Create a Parallel Coordinates Plot Over Time

Typically a parallel coordinates plot compares multivariate, numerical data. However, you may want to create a parallel coordinates chart for a single variable over time. In this tip, I show you how to create that chart.

March 22, 2021

#MakeoverMonday Week 12 - How much do Americans spend on cereals?

Time really flew by on today's #WatchMeViz. Before I knew it, an hour had passed, I'd built lots of things, and I hadn't yet decided on my "final" visualization. So instead, I have three this week!

Watch the video here to learn how I built these charts.



Viz 1 - Year over Year Change in Consumption of Food and Beverages in America




Viz 2 - Parallel Coordinates - How much do Americans spend on cereals relative to other products?




Viz 3 - Bump Chart - #MakeoverMonday 2021 Week 12 - How Does Cereal Rank in American Food Spending?


#WorkoutWednesday 2021 Week 11 - Gapminder: Income vs. Life Expectancy

As Lorna mentions in the week 11 challenge, the key is in the data prep. Once you have that, the visualization is really simple.

I did not use the new relationships model; I stuck with the traditional method of unions and a join as that's the most straightforward way to ensure you get the data in the correct shape.

First, you want to union together the three CSV files: life expectancy, population, and income. When you do that, you'll get this strange looking view that is super wide and doesn't have headers that mean anything. 


What you should see, though, is that the headers are in the first row. To fix that, click on the drop down triangle next to the unioned data sources and choose Field names are in first row.


The years are nicely in the headers now. The next step is to select all of the columns with the years and pivot the data. Be sure to ONLY select the years.

I then renamed Pivot Field Names to "Year" and changed the data type to Number (whole) and also renamed Pivot Field Values to "Values".

Next, add the data source with the list of countries and drag it into the data prep area to create a join. You want to join "country" to "name". And now everything should look good. That's it for the data prep.
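
If it helps to see the reshaping outside Tableau, here is a small Python sketch of the union and pivot. It is only an illustration of the shape change, not what Tableau does internally. The `measure` field is a hypothetical stand-in for whatever identifies the source file after the union, and the country and values are made up.

```python
def pivot_years(rows, measure):
    """Turn one wide row per country into one long row per country and year."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == "country":
                continue  # pivot ONLY the year columns
            long_rows.append({
                "country": row["country"],
                "measure": measure,     # hypothetical: which file the row came from
                "Year": int(column),    # year header becomes a whole number
                "Values": value,
            })
    return long_rows

# Made-up stand-ins for two of the wide CSV files (one column per year).
life = [{"country": "A", "2019": 70.1, "2020": 70.4}]
income = [{"country": "A", "2019": 1500, "2020": 1550}]

# Union the files, then pivot: one row per country, measure, and year.
long_data = pivot_years(life, "life expectancy") + pivot_years(income, "income")

for row in long_data:
    print(row)
```

After the pivot, every row carries a single value, so something has to say which file that value came from. That is why a separate calculation per measure is needed later.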


Now that the data is pivoted, you need to create a calculated field for each measure in order to build the view: life expectancy, population, and income.



All three calculations are the same. All you need to do is swap out the name of the csv. Lastly, build the view.


Note that the x-axis uses a logarithmic scale and that neither axis is forced to start at 0. That's it! I hope you found this helpful.

March 19, 2021

#WorkoutWednesday 2021 - Week 2: Customer Lifetime Value (CLTV) Matrix

If you like a table calc challenge, this Workout Wednesday is for you. Get Ann's requirements here. On the surface it seems pretty simple:

  1. Get the first order date for each customer.
  2. Determine the number of quarters that elapsed since then.
  3. Calculate the cumulative value of each cohort.

Steps 1 & 2 are pretty simple with an LOD and a calculated field. Step 3 is an aggregate calculation, for which Ann gives a big hint, that is then made cumulative across the view.

The tricky part comes when you try to get rid of any future quarters. The cumulative calc forces each cell to be filled in. The requirements say that you can't show any quarters after the cohort's latest quarter.
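
Here is a rough Python sketch of the three steps plus the "no future quarters" rule, just to show the logic outside Tableau. It is not Ann's solution or my Tableau calcs. It keeps a plain cumulative sales total per cohort rather than a per-customer average, it assumes "future" means any quarter after the latest quarter in the data, and the function names and orders are made up.

```python
from collections import defaultdict
from datetime import date

def quarter_index(d):
    """Number quarters consecutively so they can be subtracted."""
    return d.year * 4 + (d.month - 1) // 3

def cohort_matrix(orders):
    """orders: list of (customer, order_date, sales) tuples."""
    # Step 1: first order quarter per customer (the cohort).
    first = {}
    for customer, order_date, _ in orders:
        q = quarter_index(order_date)
        first[customer] = min(first.get(customer, q), q)

    # Step 2: quarters elapsed since the first order, summed per cohort.
    totals = defaultdict(float)
    for customer, order_date, sales in orders:
        cohort = first[customer]
        totals[(cohort, quarter_index(order_date) - cohort)] += sales

    # Step 3: running total per cohort, stopping at the latest quarter
    # in the data so no future quarters are filled in.
    latest = max(quarter_index(d) for _, d, _ in orders)
    matrix = {}
    for cohort in sorted(set(first.values())):
        running = 0.0
        for elapsed in range(latest - cohort + 1):
            running += totals.get((cohort, elapsed), 0.0)
            matrix[(cohort, elapsed)] = running
    return matrix

# Made-up orders for illustration.
orders = [
    ("ann", date(2020, 1, 15), 100.0),
    ("ann", date(2020, 7, 1), 50.0),
    ("bob", date(2020, 5, 3), 30.0),
]
matrix = cohort_matrix(orders)
for key in sorted(matrix):
    print(key, matrix[key])
```

Note how the running total carries forward through quarters with no orders, which is exactly why the cells fill in and why the later cohort needs to be cut off earlier than the first one.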

This is where the table calc magic happens. When I create complex calculations, I nearly always split them into multiple calcs because (1) they're easier to debug and (2) I can see my progress along the way and see where I am going wrong.

First, calculate the average lifetime value.




Next, calculate each cohort's cumulative lifetime value.


You should now have a view like this, with the marks filled in across the whole table.




We need to figure out how to get rid of the marks when they start to repeat after each cohort's last quarter since birth. This is where the complex table calc comes into play.



Add this calculation to the Filters shelf, choose true and you're done! Click on the image below to view my version on Tableau Public.

March 16, 2021

Mastering Containers in Tableau (Part 2) - Sales Performance Dashboard

Containers...you either love them or hate them. And I want you to learn to love them. 

In part one of this series, I showed you how to build a simple KPI dashboard with three cards. In the video below, I show you how to build a more complex dashboard that requires 7 containers:

Click for the interactive version

This video shows you how to organize the containers, how to use padding, and how to build an engaging dashboard for your stakeholders. Enjoy the video!

March 15, 2021

#MakeoverMonday Week 11: The World's Largest Cash Crops

What a fun dataset! Thank you for your ideas during #WatchMeViz. Here are the video and final visualization.


March 9, 2021

How to Create a Control Chart

Control charts are used to monitor the stability and control of measurements over a period of time. An effective control chart has four elements. 

  1. A time series graph of the measurements.
  2. A line across the time series that represents the mean of all of the measurements in the graph. 
  3. Upper and lower control limits (UCL and LCL) that are displayed as a reference band across the view at a specified number of standard deviations from the mean. 
  4. Indicators to show which measurements are "out of control". 

Typically, any measurements that are more than three standard deviations from the mean are considered unlikely and therefore outside the control limits. However, it is also common to set the limits at one or two standard deviations from the mean for a stricter analysis. 

In this video, I show you how to create a control chart that allows the user to specify the number of standard deviations at which to plot the upper and lower control limits.
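
For reference, the arithmetic behind the chart can be sketched in a few lines of Python. This is an illustration under assumptions, not the calcs from the video: it uses the sample standard deviation, `num_sd` stands in for the user-controlled parameter, and the measurements are made up.

```python
from statistics import mean, stdev

def control_limits(values, num_sd=3.0):
    """Return the mean, LCL, UCL, and an out-of-control flag per value."""
    center = mean(values)
    spread = stdev(values)            # assumption: sample standard deviation
    lcl = center - num_sd * spread    # lower control limit
    ucl = center + num_sd * spread    # upper control limit
    flags = [v < lcl or v > ucl for v in values]
    return center, lcl, ucl, flags

# Made-up measurements over time; the last one is a spike.
measurements = [10, 11, 9, 10, 12, 10, 9, 11, 10, 25]
center, lcl, ucl, flags = control_limits(measurements, num_sd=2)

for value, out in zip(measurements, flags):
    print(value, "out of control" if out else "ok")
```

With limits at 2 standard deviations the spike is flagged. Widen the limits to 3 and the same point falls back inside the band, which is why letting the user pick the number is useful.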

March 8, 2021

#MakeoverMonday Week 10: Female Participation in the Summer Olympics

First, thank you to Tommaso Ferri for moderating Watch Me Viz. I enjoyed working with this data set and was able to build nine vizzes in about 40 minutes, then took another 40-50 minutes working through formatting and some pesky table calcs. I got there in the end!

After the live stream ended, I created one more version that uses containers. I think I like this one best. Below you'll find the live stream recording as well as the two final visualizations.

Thanks for watching!