Data Viz Done Right

January 23, 2017

The Pitfalls of Averages of Averages


As soon as I started exploring the data for Makeover Monday week 4, I had a suspicion that people wouldn’t pick up on a few things:

  1. There was a region named “Total (all TLAs)” that represented the total but was mixed in with all of the other regions.
  2. The data was an index, which is calculated as a weighted number for each region, month, and visitor type. 
  3. When using indexes in the dataset, using an average aggregation is appropriate as long as you only use it at the individual region, month, and visitor type level. You can’t use an average of the average to represent the total.

I saw a few people make the mistake of using an average of the average very early on (I won’t shame them publicly), so I thought it would be appropriate to explain averages of averages, why they don’t calculate “accurately”, and how people should be using them.
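To make the problem concrete, here is a minimal Python sketch with made-up numbers. The region names, index values and weights are all hypothetical; the real weights behind the tourism index are not part of this post. It simply shows that a plain average of regional indexes can differ a lot from a total that respects each region's weight.

```python
# Hypothetical figures: (region, index value, weight such as visitor spend).
regions = [
    ("Region A", 120.0, 1000.0),
    ("Region B", 90.0, 9000.0),
]


def average_of_averages(rows):
    """Plain mean of the regional indexes, ignoring how big each region is."""
    return sum(index for _, index, _ in rows) / len(rows)


def weighted_average(rows):
    """Mean of the regional indexes, weighted by each region's size."""
    total_weight = sum(weight for _, _, weight in rows)
    return sum(index * weight for _, index, weight in rows) / total_weight


naive = average_of_averages(regions)
weighted = weighted_average(regions)
print(f"average of averages: {naive:.1f}")
print(f"weighted average:    {weighted:.1f}")
```

In this toy case the small region drags the naive figure above 100 even though the weighted total sits below it, which is why the published total row is the safer number to use.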

Have Increases in Healthcare Spending Led to an Increase In Life Expectancy in America?

I was looking at an article on Visual Capitalist last night and saw another story that caught my eye. It was a connected scatterplot of life expectancy vs. health care expenditure for OECD countries. Everything on it is well documented including the data source. So off I went to get the data for Life Expectancy and Health Care Expenditure from the World Bank. I wanted to rebuild this chart in Tableau.

One interesting note is that the original visualisation had data back to 1970, yet that data is nowhere to be found on the World Bank website nor is it documented where it came from. That’s not very good. I limited my re-creation to the data I knew was accurate, 1995-2013.

This didn’t take too awfully long to create in Tableau, with most of the time being spent fiddling with the country labels. I know they overlap the lines in a lot of places, but I think it still works. When you click on a line, you’ll see drop lines that reveal the values for that point and also highlight the whole line for that country.

As always, rebuilding a chart I like has been a good learning experience. This was the first time I used drop lines and it was fun figuring out how to get only certain points labeled and colouring everything just right.

Click on the image for the interactive version.

Makeover Monday: Domestic & International Tourism Spend in New Zealand


Already four weeks into another fun year of Makeover Monday. As it’s an even numbered week, it’s Eva’s week to pick the topic. I’m really enjoying her helping out with picking topics because it allows me to participate like everyone else. I don’t see anything ahead of time and have to read the article, interpret the data and build a viz all at once. When I pick the topics myself, I tend to think weeks ahead about what I want to create.

This week, Eva chose two charts that are basically the same. There’s one for international tourism spend and one for domestic. Here’s the international spend chart:

What works well?

  • Title and subtitle make it easy to know what I’m viewing
  • Source is clearly documented
  • Colors are distinct enough that it’s clear we’re supposed to look at them individually
  • Legend is out of the way
  • Axes are clearly labeled
  • Minimal gridlines that aren’t distracting
  • Removing the vertical axis line
  • Nice crisp font (Founders Grotesk)

What could be improved?

  • Packed bars make it harder than necessary to follow a year across its months
  • 100 is the baseline (see the subtitle), so why isn’t the chart the difference from the baseline?
  • Overall too cluttered
  • Are there more years that could be used for comparison?
  • What was spending like before 2008? How do 2013-15 compare?
  • Are we trying to understand the change within each month or the change across the years?

I must admit that I got sucked in by all of the detail Eva provided in the data set. I started by creating a view of the change in spending by year compared to the 2008 average. After all, this is the baseline, so we should be comparing to that. Anything above 100 is an increase compared to 2008; anything below 100 is a decline versus 2008. I wonder how many people are going to catch that subtlety in the data. I understood it after reading the notes that accompanied the chart.

This is what I came up with first. It shows the change versus 2008 for each region. Click on the image for the interactive version.

But was I really doing a makeover of the original? No, I wasn’t. The original was looking at the country total, so back to the drawing board I went. I really liked the look of the lines I had created so it wasn’t like I was completely starting over. What I wanted to show was:

  1. How did the overall spending change compared to 2008 for each visitor type?
  2. What was the change by year within each season? This would help me understand any seasons where tourism spending has decreased.

Overall, a really fun data set to play with and a lesson learned to not get sucked into the data too quickly. I need to think more holistically, think about the story I want to tell, search for ideas, do some sketching, and move away from Tableau when I start. Basically, practice what I preach.

January 20, 2017

London Crimes: Exploring the Trends, Locations and Crime Types


Sasha Pasulka has been a good friend of mine for many years and is moving to London from the US soon in her new role with Tableau. Like anyone else who is moving to a new country, she finds the entire process ridiculously overwhelming. Trying to find apartments from thousands of miles away, not knowing anything about neighbourhoods, can be an incredibly daunting task. To help her, I decided to build a viz in Tableau.

I started by using the London Crimes web data connector from Tableau Junkie, only to realise that it’s somehow not returning all of the data. It looks like it returns the data for a radius rather than for an entire postcode.

No worries though. Instead, I downloaded all crimes from the Metropolitan Police Service. This returned a separate CSV for each month and it also wasn’t limited to just the London area. On the London Datastore I was able to find a list of all LSOAs in the London area. Great! All I needed to do was union all of the CSVs then join them to the London LSOAs. I love that I can do all of this straight inside Tableau now.
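I did the union and join inside Tableau, but the same idea can be sketched in a few lines of Python for anyone who wants to prepare the data beforehand. Everything here is illustrative: the column names (`crime_type`, `lsoa_code`, `borough`), the LSOA codes and the rows are made up and won’t match the real files.

```python
import csv
import io

# Two hypothetical monthly extracts with identical columns (made-up rows).
january = "crime_type,lsoa_code\nBurglary,E01000001\nRobbery,E99999999\n"
february = "crime_type,lsoa_code\nVehicle crime,E01000002\n"

# Hypothetical lookup of the LSOAs that sit inside London.
london_lsoas = (
    "lsoa_code,borough\n"
    "E01000001,City of London\n"
    "E01000002,City of London\n"
)


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


# Union: stack the monthly files on top of each other.
crimes = read_rows(january) + read_rows(february)

# Inner join: keep only crimes whose LSOA appears in the London list.
borough_by_lsoa = {
    row["lsoa_code"]: row["borough"] for row in read_rows(london_lsoas)
}
london_crimes = [
    {**crime, "borough": borough_by_lsoa[crime["lsoa_code"]]}
    for crime in crimes
    if crime["lsoa_code"] in borough_by_lsoa
]

print(len(crimes), "crimes in total,", len(london_crimes), "in London")
```

The inner join is what drops the crimes recorded outside the London area.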

From there, it was a matter of building a simple visualisation that allows Sasha to pick boroughs and see the crimes in those areas. Note that I set the map to only display when there are 3 or fewer boroughs selected. I did this because the map was simply too slow to draw the dots. Hopefully she likes it and it makes it easier for her to settle in.

January 18, 2017

Workout Wednesday: The State of U.S. Jobs


Last week on my Data Viz Done Right site, I wrote about a great small multiples visualisation created by Matt Stiles that shows U.S. unemployment compared to the national average for every State. I recreated Matt’s work in Tableau, and your challenge for this week is to do the same. Personally, I find trying to rebuild visualisations a great way to learn.

Below you’ll see an image of what I created. Click on it for the interactive version. I’ve prepared the data for you here (it comes from the Bureau of Labor Statistics). Some things to keep in mind:

  1. My viz is 875x2150. Yours doesn’t have to be this size, but I thought I’d provide it for guidance.
  2. I’m using the Source Sans Pro font, which you can download from Google Fonts. This is the font that Matt used in his version.
  3. The years should be displayed every 10 years.
  4. The axis line for the year should be more distinct than the gridlines.
  5. The ends of each line should be colour-coded by whether it was an increase or decrease compared to the national average.
  6. You will need to calculate the national average. The national average needs to include the District of Columbia, but D.C. should not have its own chart.
  7. Values above the national average should start with a + and values below should start with a -
  8. The national average is the line you see at 0%.
  9. This is NOT a trellis chart, but if you think you can make it work as a trellis chart, go for it!
  10. The area above zero should be shaded. The hex code to use is #F7E6E2 and it should be shaded from 0% to +10%.
  11. Pay attention to the gaps between each State. I like how this gives it some breathing room.
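If it helps to reason about requirements 6 and 7 before opening Tableau, here is a rough Python sketch of the arithmetic. The rates are made up, and treating the national average as a simple mean of the State rates for one year is my assumption for illustration; check your numbers against the image.

```python
# Hypothetical unemployment rates (%) for a single year.
rates = {
    "Alabama": 6.0,
    "Alaska": 7.0,
    "District of Columbia": 8.0,
}

# Assumption: the national average is a simple mean that includes D.C.
national_average = sum(rates.values()) / len(rates)

# D.C. counts towards the average but does not get its own chart.
deviations = {
    state: rate - national_average
    for state, rate in rates.items()
    if state != "District of Columbia"
}


def signed_label(difference):
    """Format a difference with an explicit + or - sign, e.g. +1.5%."""
    return f"{difference:+.1f}%"


for state, difference in deviations.items():
    print(state, signed_label(difference))
```

Note that the `+` format flag also puts a plus sign on an exact zero, so decide how you want a State that sits right on the average to be labelled.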

This will be a very tedious exercise. To provide some context, this took me 2-3 hours to create. Don’t get discouraged and don’t feel like you have to do it all in one sitting. Basically, try to make yours look identical to mine.

Post an image on Twitter when you’re done and hashtag it with #WorkoutWednesday and please tag me so I see it. You can also comment below with a link to your workbook or with any questions you have. Good luck!

January 17, 2017

Tableau Tip Tuesday: How to Create Waffle Charts


I love waffles! So why not learn how to make waffle charts? This week, I show you a very simple way to create them. Inspiration for this post came from this post by Russell Christopher. The difference between his post and mine is that I don’t require any table calculations; that’s nearly always a win for me!

I created a template for the grid layout (the primary data source that I use in the video), which you can find here. I’d love to see more examples that you build with this. Good luck!

January 16, 2017

Makeover Monday: The Tweeting Habits of President-Elect Trump

On January 20th, we’ll have a new President of the United States. With Inauguration Day just a few days away, it seemed appropriate to look at the tweets of President-Elect Trump for Makeover Monday week 3. BuzzFeed News analyzed all the accounts Donald Trump retweeted during his presidential campaign and created this visualisation to accompany it (Note: The visualisation was trimmed for this blog):

What works well?

  • The bubbles are ordered from largest to smallest from left to right in a Z-pattern, making it pretty simple to see who he retweets the most.
  • Colouring inactive accounts so they are easier to identify

What could be improved?

  • Bubble charts are notoriously hard to use for comparison; a simple bar chart would be so much easier to read.
  • The viz lacks insight or a story.
  • The article has some in-depth writing; would have been great to include that in the visualisation.

I must admit that I struggled a bit this week. I was trying too many charts and had trouble focusing my story. On top of that, I messed up the TDE that I created for everyone. Note that when you use the DATEPARSE function hh means hour in am/pm (1~12) whilst HH means hour in day (0~23). I had used hh initially which made it look like Trump takes lunch off from Twitter.
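The same 12-hour versus 24-hour trap exists in most date libraries. As a rough analogy only (Python uses different format symbols from Tableau’s DATEPARSE), `%I` is the 12-hour directive and `%H` is the 24-hour one. The timestamp below is made up.

```python
from datetime import datetime

# A hypothetical tweet sent at 1:30 in the afternoon.
afternoon = datetime(2017, 1, 16, 13, 30)
morning = datetime(2017, 1, 16, 1, 30)

# 12-hour clock: without an AM/PM marker, 13:30 and 01:30 look identical.
twelve_hour = afternoon.strftime("%I")
same_bucket = afternoon.strftime("%I") == morning.strftime("%I")

# 24-hour clock: the afternoon keeps its own hour.
twenty_four_hour = afternoon.strftime("%H")

print("12-hour:", twelve_hour)
print("24-hour:", twenty_four_hour)
print("morning and afternoon collide on the 12-hour clock:", same_bucket)
```

If an hour-of-day chart shows an odd gap in the middle of the day, the format string is one of the first things worth checking.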

I really liked Eva’s idea of using the Montserrat font that’s on Trump’s website, so I’ve nicked that idea. I also wanted to stick to Twitter’s official color palette and try as much as possible to quantify and simplify the number of tweets that Trump produces. With these in mind, here’s my Makeover Monday week 3 visualisation:

January 12, 2017

Workout Wednesday: Showing Nothing When 'All' is Selected


Really fun challenge from Emma Whyte for this week’s Workout Wednesday that helped me learn loads! I’ll highlight some of those below, but first, here’s the challenge:

  1. The dashboard title should change to say 'All Departments' when you select 'All' in the filter
  2. The department logo should change when you select a department, but disappear when you select 'All'
  3. The line chart title should update when you change the filter
  4. All the charts and tooltips should look the same as Emma's

And here’s my final version:

Here are some things that helped me along the way. First, when I need to add a dot onto the end of a line, I usually create a table calc or LOD calc and create a dual axis chart, but that’s totally unnecessary. I never knew that when you turn on mark labels you then have the option to show only the Most Recent. The great thing about this option is it makes the mark a dot. Perfect for this exercise!

To get the dashboard title to update dynamically, I created a sheet instead of using the dashboard title. This gives me more flexibility to use calculations. I created this LOD calc to first count the number of categories in the view and then return a string based on that result.

I then placed this new field on the Detail shelf and updated the title of the sheet.

For the department logos, I knew I wanted to only show a shape when “All” is NOT selected, so I created a new sheet and then created this LOD calc that simply checks whether the number of product categories is 1.

I then placed the field on the Filter shelf and chose to Exclude False (keeping True only works too). From there, I placed the Product Category field on the Shape shelf and assigned the shapes for each Department.
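The logic behind the title and logo calcs can be sketched outside Tableau. This Python version is only an analogy for the idea described above (count the distinct categories in the view, then branch on the result); it is not Tableau syntax, and the function names and category names are made up.

```python
def dashboard_title(categories_in_view):
    """Return the single category name, or 'All Departments' for several."""
    distinct = set(categories_in_view)
    if len(distinct) == 1:
        return next(iter(distinct))
    return "All Departments"


def show_logo(categories_in_view):
    """A logo is only shown when exactly one category is in the view."""
    return len(set(categories_in_view)) == 1


# Hypothetical rows left in the view after the filter is applied.
all_selected = ["Furniture", "Technology", "Furniture", "Office Supplies"]
one_selected = ["Technology", "Technology"]

print(dashboard_title(all_selected), show_logo(all_selected))
print(dashboard_title(one_selected), show_logo(one_selected))
```

Both pieces hinge on the same count of distinct categories, which is why one approach covers the title and the logo.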

To get the month into the title of the line chart I created this LOD calc to return the latest month.

I placed this field on the Detail shelf and changed it to continuous Month and updated the title. The title is also supposed to show sales for the most recent month. I did this with a table calculation.

That’s about it. The rest was formatting and dashboard layout. Emma floated some things in her dashboard whereas I chose to tile everything. Both work just fine.

Overall, I learned quite a bit and that’s what Workout Wednesday is all about!

January 10, 2017

Tableau Tip Tuesday: How to Sort via a Cross-Database Calculation


With Tableau 10 came the ability to create cross-database joins. In this video, I show you how to:

  1. Create a cross-database join
  2. Create a calculation across the data sources
  3. Use the calculation to sort the view
  4. Use the calculation as a boolean

January 9, 2017

Makeover Monday: Have Apple Lost Their Edge With iPhone?

The iPhone…what would I do without mine? We’ve become so dependent on our mobile phones and this is primarily due to the revolution that Steve Jobs launched back in 2007. This week for Makeover Monday, we look at this chart from DazeInfo:

What works well?

  • Title captures your attention
  • Nice highlighting of the declining bar in 2016
  • Simple colors that work well
  • Pretty simple, easy to understand viz

What could be improved?

  • Beveled bars are unnecessary; just make them flat
  • What does the * mean next to 2016?
  • Is the axis needed if the bars are labeled?
  • Inconsistent label formatting; make them all the same number of decimals
  • Remove the gridlines

For my version, I wanted to focus on the story that was in the article itself. I was particularly struck by the percentage of revenue that the iPhone now accounts for in Apple’s overall revenue.

January 4, 2017

Workout Wednesday: Comparing Year over Year Purchase Frequencies


What is Workout Wednesday? It’s a set of weekly challenges from Emma Whyte and me designed to test your knowledge of Tableau and help you kick on in your development. We will alternate weekly challenges: I will take the odd weeks and post them on my blog; Emma will post challenges on the even weeks on her blog.

The idea is to replicate the challenge that we pose as closely as possible. When you think you have it, leave a comment with a link to your visualisation and post a pic on Twitter for others to enjoy.

Ok, so for week 1, I challenge you to recreate the visualisation below that helps identify the year over year purchase frequencies of Superstore (get the data here). The workbook is downloadable, but don’t cheat. Give it a real go before you download it to see if you got it right.

Here are the requirements:

  1. Y-axis is the cumulative % of total orders
  2. X-axis is the day of the year
  3. One line for each year, with the latest year highlighted
  4. Include a reference line for the target, which should adjust based on what the user enters
  5. Single select filter for Product Sub-Category
  6. Multi-select filters for Year, Region and Customer Segment
  7. Include a dot on each line at the first day when the cumulative % of orders crosses the target
  8. Make the tooltips match mine
  9. Title should update based on the target entered by the user
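If you want to sanity-check your numbers, the core calculation behind requirements 1 and 7 can be sketched in Python. The order counts are made up, and treating "crosses" as "reaches or exceeds" the target is my assumption for this sketch.

```python
from itertools import accumulate


def cumulative_share(orders_by_day):
    """Return (day of year, cumulative share of the year's orders) pairs."""
    days = sorted(orders_by_day)
    total = sum(orders_by_day.values())
    running = accumulate(orders_by_day[day] for day in days)
    return [(day, count / total) for day, count in zip(days, running)]


def first_day_reaching(shares, target):
    """Return the first day whose cumulative share meets the target."""
    for day, share in shares:
        if share >= target:
            return day
    return None  # the target is never reached


# Hypothetical order counts keyed by day of the year, for a single year.
orders_2016 = {1: 10, 50: 30, 200: 40, 365: 20}

shares = cumulative_share(orders_2016)
target_day = first_day_reaching(shares, 0.5)
print(shares)
print("first day at or above 50%:", target_day)
```

Run it once per year to get one line and one dot per year, which is the shape of the chart you are rebuilding.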

January 3, 2017

Makeover Monday: Australia’s Income Gender Gap


2017 is here and with it another 52 weeks of Makeover Monday. ICYMI, this year Eva Murray is joining me on the project. We’ll be rotating each week, which I’m looking forward to as it’ll help challenge me more.

For week 1, we’re reviewing this article from Women’s Agenda about the massive pay gap that exists in Australia’s 50 highest paying jobs. Interestingly, we’re not making over a chart this week; instead it’s two ordered lists.

What works well?

  • An ordered list is great for showing ranking
  • Splitting the lists between men and women makes it easy to see which jobs pay the most for each gender

What doesn’t work well?

  • It’s basically impossible to compare men and women in the same jobs, which was the whole purpose of the article.
  • Within a gender, you have to do the math in your head to compare jobs. A simple bar chart would make it easy to compare at a glance.
  • There’s no “story” to the data. What’s the call to action?
  • The lists only show the top 50 for each gender distinctly, making it really hard to find an overlap in the lists.
  • There’s no sense for the “overall” gender pay gap when limiting the list.

For my version, I started with a Google image search to get some inspiration. I pulled various parts and pieces from different infographics that resonated with me to put together this infographic. I had a few objectives:

  1. Use an impactful title
  2. Break the infographic into several parts by adding divider lines
  3. Start with a high-level summary of the gender gap for all jobs in the data set and just the top 50 jobs
  4. Quantify the pay gap for the reader to improve the context
  5. Show the wage gap in the top 50 jobs via a slope chart and highlight the jobs when women earn more than men (sadly only 2 jobs)
  6. Create a mobile version that allows for scrolling through the story

With these goals in mind, here’s my first Makeover Monday of 2017.

December 31, 2016

2016 | VizWiz Year in Review

This has been a pretty epic year for me both for this blog and for “work”. This year, I made an effort to blog more, record videos of tips to help grow the Tableau Community and share as much as I possibly could. Some notable stats and facts I’ve been tracking:

  1. Three cohorts and 24 consultants passed through the Data School
  2. Wrote 175 blog posts on this site, 25 more than my previous best year (2015)
  3. Started the Tableau Tip Tuesday video series and published 45 videos to my YouTube channel
  4. Published 22 blog posts for the Data School, up from 9 in 2015
  5. Wrote 18 blog posts for Data Viz Done Right, down from 54 in 2015
  6. Presented 86 Tips in 50 Minutes alongside Jeffrey Shaffer at the Tableau Conference
  7. Completed 52 weeks of the #MakeoverMonday project with Andy Cotgreave
  8. Honourable Mention - Best Data Viz Project (Makeover Monday) | Kantar Information is Beautiful Awards

I started this blog in August 2009 as a way for me to record what I learned. As I see more and more readers, I'm inspired to write more, to share more, to help more. Thank you for reading!

Here are the most popular VizWiz posts in 2016:

  1. 12 Books Every Great Data Analyst Should Read
  2. Tableau Tip Tuesday: How to Create Diverging Bar Charts
  3. Tableau Tip Tuesday: How to Display KPIs Next to Bars
  4. Fix It Friday: Ten Alternatives Methods for Presenting Alcohol Consumption in OECD Countries
  5. Tableau Tip Tuesday: How to Create Small Multiple Line Charts
  6. Tableau Tip Tuesday: 12 Use Cases for Parameters
  7. Tableau Tip Tuesday: Layout Tips for Long Form Dashboards
  8. Makeover Monday: Travel Agents Are a Relic of the Past and Hotels Could Be Next
  9. Tableau Tip Tuesday: How to Create Monthly Radar Charts
  10. Tableau Tip: How To Create a 100 Mark Unit Chart inside of a Tooltip

So what’s in store for 2017? Well, much of the same and a bit more.

  • Makeover Monday will continue and I’ll be teaming up with Eva Murray in 2017.
  • Given that you clearly like tips, I will continue Tableau Tip Tuesday, hopefully every week, but I know I won’t have time to do a video every week. I promise to create as many as I can.
  • To help those that participated in Makeover Monday in 2016 take their development to the next level, I’ll be teaming up with Emma Whyte for Workout Wednesday. Basically, we’ll trade Data School Gym challenges with each other every Wednesday through the year. You can take the challenge yourself and try to rebuild our challenges each week.
  • I want to post on Data Viz Done Right more often. Ideally, I’ll post every Thursday at a minimum.

Happy New Year to everyone in the Tableau and data visualisation communities! You’ve made 2016 an amazing year and I look forward to seeing what 2017 brings.

December 26, 2016

Makeover Monday: How Much Has the Cost of Christmas Dinner Really Risen?

Well, this is it! 52 weeks of Makeover Monday under our belts. Thank you for an incredible year and for participating in this amazing project.

In case you missed it, Makeover Monday will continue in 2017. Eva Murray and I have big plans in store for everyone. If you ever have questions, comments or feedback, give us a tweet.

Ok, so onto week 52. The BBC published an article early in December about the cost of Christmas dinner in the UK with this blaring headline:

Christmas dinner costs 'rise 14%'

Inside the article were these basic charts:

Fortunately, Andy C was able to locate the data on BBC Github. Let’s review the charts.

What works well?

  • A line chart is a good choice for showing change over time.
  • The line chart uses about a 3x2 ratio which doesn’t distort the shape too much given the axis is zoomed in.
  • The bar charts work fine for comparing the two years.

What doesn’t work well?

  • The line chart has smoothed curves, which lead you to believe there were changes throughout the year, but the data is only at the yearly level. This could be misleading.
  • What foods are included? The data on GitHub doesn’t match the data in the chart. How are we to know what they consider to be the Christmas foods?
  • There’s no logical sort to the bar chart. It’s neither sorted by 2015 nor 2016. I can’t make out how it’s sorted.
  • The bar chart doesn’t include all foods they have data for, so why did they only show these foods? Are these the foods that comprise the line chart? We have no way of knowing.
  • Were the Christmas themed colors in the bar chart intentional? If so, I get why they did it, but it won’t work for their colour-blind readers.
  • The title of the bar chart indicates it’s about the price change, but they make the reader do the change maths in their head. Why not just show the change?
  • Why are they comparing 2016 to 2015 and not a different year? That seems a bit shortsighted and sensationalist to me.

For my version, I thought it was important to:

  1. Indicate when the peak price occurred
  2. Show the entire time period to get a better sense for the rate of change
  3. Indicate whether the 2016 price has increased or decreased compared to 2006
  4. Sort the products by the 2016 price
  5. Show the rate of change and the latest price
  6. Optimise the view for mobile 

Given these ideas above, here is my last Makeover Monday for 2016. I’m looking forward to continuing this project with Eva in 2017.

December 18, 2016

Makeover Monday: Historical Performance of the Washington Metro

Two weeks to go in one of the best projects I’ve ever been a part of. In fact, Andy Cotgreave and I are hosting a Twitter chat on Monday at 4pm GMT. Get all of the details here and join us in discussing the impact this amazing project has had on all of us.

For week 51, we’re looking at the Washington DC Metro Scorecard. This is basically a KPI dashboard of quality and safety for the Metro. Let’s have a look at their scorecard:

What works well?

  • Neatly laid out and organized
  • Thumbs up and down give the status at a glance
  • Despite the rings, it’s visually appealing and draws you in

What could be improved?

  • While the rings make the comparison to goal easy, they wouldn’t work if something has to go past a full circle of the ring.
  • Inconsistent colors
  • No sense for performance over time
  • Can’t see all of the metrics in a single view; scrolling is generally a problem with KPI dashboards
  • You can’t tell how far each metric is from the target

They didn’t make all of the data available for us to use in the makeover. For example, there is no data available for the entire people and assets section. So what you’ll see this week from everyone will be nine of the metrics. For my makeover, I wanted to create small multiples of each metric over time. I liked their blue and red colors, so I used similar colors. I also wanted this to look more like a newspaper, so I went with the American Typewriter font. Lastly, if you look at this on a phone, you’ll see it as a long, scrolling dashboard. I really like how this turned out.

December 16, 2016

How have humans grown in the last 100 years?

This morning I was catching up on my backlog on feedly and read two amazing blog posts by Rody Zakovich and Alexander Mou about creating sigmoid curves. I have never created a chart like this before and I love the visual appearance so I thought I’d give it a go.

I went back to my Data School Gym challenge from a few months ago because I knew it was simply comparing one point in time to another like Alexander covers in his blog. In this case, I was looking at the change in human height by country from 1896 to 1996. Alex’s blog is so well written and easy to follow. All I had to do was essentially copy/paste his calculations and I was off.
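For anyone curious about what sits behind a curve like this, here is a rough Python sketch of one common way to draw an S-shaped path between a start value and an end value using the logistic function. It is not necessarily the calculation from Rody’s or Alexander’s posts, and the heights, `steps` and `steepness` values are made up for illustration.

```python
import math


def sigmoid_path(start_value, end_value, steps=25, steepness=6.0):
    """Return (t, value) points that ease from start_value to end_value.

    t runs from 0 to 1 across the chart; the logistic function supplies
    the S shape. With a finite steepness the ends approach, but do not
    exactly hit, the start and end values.
    """
    points = []
    for i in range(steps):
        t = i / (steps - 1)
        x = (t - 0.5) * 2 * steepness  # spread t over -steepness..+steepness
        s = 1 / (1 + math.exp(-x))     # logistic value between 0 and 1
        points.append((t, start_value + (end_value - start_value) * s))
    return points


# Hypothetical average heights in cm for one country in 1896 and 1996.
path = sigmoid_path(160.0, 170.0)
print(path[0], path[len(path) // 2], path[-1])
```

A larger `steepness` flattens the ends and sharpens the middle of the S; a smaller one makes the path closer to a straight line.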

I’m asked very often how to get better at Tableau. For me, there’s no better way than practicing. And when you think you’ve practiced enough, practice more. You will never ever know everything. Just keep learning, like I’ve done this morning. With that, here’s my first Sigmoid curve.

December 13, 2016

Tableau Tip Tuesday: How to Create Ranked Dot Plots


In this week’s tip, I show you how I built my Makeover Monday week 50 visualisation. I’m not exactly certain what you would call this chart type, so I’ll call it a dot plot. The biggest difference is that I use a Gantt bar that the dots then sit on top of. I learned quite a bit building this and I hope you will too.

December 12, 2016

Makeover Monday: Where are America's Best Drivers?

For week 50 of Makeover Monday, we reviewed a series of three maps from USA Today:

What works well?

  • Grouping the States into sets of 10
  • Gradual color scaling with black being the most negative
  • Calling out the smaller States separately
  • Labelling is clear and prominent
  • Good color choices
  • Simple title

What could be improved?

  • A filled map doesn’t add much value since there isn’t any apparent regional trend. Just because you have States, that doesn’t mean you MUST have a map.
  • They have three separate maps, which makes it hard to compare them.
  • Could have a more impactful title
  • For me, it’s counterintuitive to have 1 as the worst and 51 as the best.
  • It’s a very negative message. Wouldn’t something a bit more positive be better during the holiday season?

I found myself iterating A LOT this week. Starting with filled maps, then hex maps, then dot plots, then a series of dot plots. Here’s a GIF of my process:

For my view, I’ve included all of the metrics in a single view, allowed the user to sort by the value they are most interested in, and also reversed the axis, so the best is on the left. I liked their color palette, so I’ve re-used that. My titles also update dynamically based on the metric selected since the purpose of the view changes.

December 6, 2016

Tableau Tip Tuesday: How to Create Barcode Charts


In this week’s Tableau Tip Tuesday, I show you how to create barcode charts. I’ve used these charts several times recently:

I also have another video with five use cases for strip plots. I use the term strip plot interchangeably with barcode chart. The difference in this video is that I show you two methods for creating them and the visual differences between the two methods.

December 5, 2016

A History of the North London Derby in the Premier League


I really liked the learnings that I got last Friday from Rhiannon Fox and the cricket viz I ended up creating. It got me to thinking about other data that I could use with a similar visual display to continue practicing what I learned. Arsenal was preparing to play London rival West Ham on Saturday while I was watching my son skateboarding, so I thought I’d create something comparing Arsenal with their biggest rival, Tottenham Hotspur.

I was able to quickly get all of the results from Wikipedia, which I then imported into Google Sheets and then connected to Tableau. I showed a couple of iterations to Gwilym this morning at the Data School to get his feedback and we agreed that displaying goal difference was the most effective display. In addition, I added dots to indicate the winner. The goal difference display helped show the ebbs and flows of the rivalry way better than showing the goals scored by each team in each match as a diverging bar chart.

Click on the image for the interactive version. Does this display work for you? What might you do differently? Leave a comment and let me know or, better yet, download the workbook, iterate on my design and leave a comment with a link to your version. Enjoy!