Data Viz Done Right

June 19, 2018

Tableau Tip Tuesday: How to sort first by the most positive values, then by the most negative values in a single chart

It was day 1 of Tableau training for DS9 at The Data School yesterday and we were practicing different sorting methods, including sorting by a discrete measure. I was then asked how we could sort positive values from highest to lowest followed by the negative values from lowest to highest. This helps emphasize the best and worst performers. The trick was doing this in a single chart.
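
Outside Tableau, the same ordering can be sketched in a few lines of Python. This only illustrates the sort logic and is not the Tableau calculation from the tip. The function name and the sample numbers are made up, and treating zero as a positive value is an assumption.

```python
def best_then_worst(values):
    """Positives from highest to lowest, then negatives from lowest to highest."""
    # The key is a tuple: the first element picks the group (0 = non-negative,
    # 1 = negative) and the second element orders the values inside that group.
    return sorted(values, key=lambda v: (0, -v) if v >= 0 else (1, v))


# Made-up profit figures for illustration.
profits = [120, -45, 30, -300, 75, -10]
ordered = best_then_worst(profits)
print(ordered)
```

The best performers land at the top and the worst performer sits directly below the last positive value, which is what puts both extremes in a single view.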


June 18, 2018

Makeover Monday: U.S. Influenza Surveillance Report

I've seen lots of headlines in the US since March about how bad the flu outbreak was this year, so I did a bit of googling and found this week's Makeover Monday topic. The chart we are making over comes from the CDC.

What works well?

  • Clearly marking the x-axis so that it's evident that the weeks don't start at the beginning of the year
  • Including the national baseline for context
  • Chart dimensions scaled properly
  • Using red for the most recent season so that it stands out more

What could be improved?

  • The colors are too bright and are competing for attention.
  • The symbols on the 2017-18 line are unnecessary.
  • The start of the 2009-10 season is wrong, according to the data that can be downloaded.
  • The national baseline should be weekly, not flat across the time period.

What I did

  • I liked the idea behind the original chart, so I kept that but made it look nicer and more focused.
  • I included a summary to set the context for the line chart.
  • I included the national average by week for context.
  • I used a stepped chart to make the weekly change easier to see.
  • I focused the lines on the two outlier periods.

June 15, 2018

Tableau Prep Tip: Returning the First and Second Purchase Dates

If you don't participate in Workout Wednesday, you're missing a great opportunity to learn. For week 24 2018, Ann Jackson challenged us to also use Tableau Prep. The toughest part of the data prep was getting the second purchase date for a customer. First is easy. The trickiest part is that you can't sort data in Prep, so you have to do some workarounds to get what you need.

In this video, I show you how I approached returning the first and second purchase dates for a customer, including some summary measures, then bringing them both back together into a single table for visualizing in Tableau.
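
To show the idea of getting a second date without sorting, here is a minimal Python sketch. It is not the Prep flow from the video. It assumes "second purchase" means the second distinct purchase day, and the function name and sample dates are hypothetical.

```python
from datetime import date


def first_and_second(purchase_dates):
    """Return (first, second) purchase dates using only min(), no sorting."""
    first = min(purchase_dates)
    # Drop everything on the first date, then take the minimum of what is left.
    later = [d for d in purchase_dates if d > first]
    second = min(later) if later else None
    return first, second


# Hypothetical purchase dates for one customer, deliberately out of order.
dates = [date(2018, 3, 2), date(2018, 1, 5), date(2018, 6, 9), date(2018, 1, 5)]
print(first_and_second(dates))
```

A customer with a single purchase day gets `None` for the second date.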

June 13, 2018

Workout Wednesday: Do Customers Spend More on Their First or Second Purchase?

Ann is back for Workout Wednesday week 24. The full list of requirements can be found here. Personally, I like how this week's challenge included Tableau Prep. It's great to have an excuse to practice!

The high-level requirements:

  1. Create a data set in Tableau Prep that returns the first and second order for each customer along with the sales, number of categories and number of products sold on that day. 
  2. The data must be wide rather than tall. That is, you must have nine columns in total: Customer ID, two dates, two sales totals, two category counts, two product counts.
  3. Create a dashboard with a scatterplot and two strip plots.
  4. Float everything...YUCK!
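
The wide, nine-column shape from requirement 2 can be sketched in plain Python. This is an illustration of the target table only, not Ann's solution or my Prep flow. The column names and order lines are hypothetical, and "second order" is assumed to mean the second distinct order day.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order lines: (customer_id, order_date, sales, category, product)
LINES = [
    ("C1", date(2018, 1, 5), 20.0, "Office Supplies", "Paper"),
    ("C1", date(2018, 1, 5), 80.0, "Technology", "Phone"),
    ("C1", date(2018, 3, 2), 15.0, "Office Supplies", "Binder"),
    ("C1", date(2018, 6, 9), 40.0, "Furniture", "Chair"),
    ("C2", date(2018, 2, 1), 60.0, "Furniture", "Table"),
]


def wide_rows(lines):
    # Step 1: aggregate to one record per customer per day.
    daily = defaultdict(lambda: {"sales": 0.0, "categories": set(), "products": set()})
    for cust, day, sales, category, product in lines:
        rec = daily[(cust, day)]
        rec["sales"] += sales
        rec["categories"].add(category)
        rec["products"].add(product)

    # Step 2: collect the purchase days for each customer.
    days_by_customer = defaultdict(list)
    for cust, day in daily:
        days_by_customer[cust].append(day)

    # Step 3: build one wide row per customer with first_* and second_* columns.
    rows = []
    for cust in sorted(days_by_customer):
        days = days_by_customer[cust]
        first = min(days)
        later = [d for d in days if d > first]
        picks = {"first": first, "second": min(later) if later else None}
        row = {"customer_id": cust}
        for label, day in picks.items():
            rec = daily[(cust, day)] if day is not None else None
            row[f"{label}_date"] = day
            row[f"{label}_sales"] = rec["sales"] if rec is not None else None
            row[f"{label}_categories"] = len(rec["categories"]) if rec is not None else None
            row[f"{label}_products"] = len(rec["products"]) if rec is not None else None
        rows.append(row)
    return rows


ROWS = wide_rows(LINES)
for row in ROWS:
    print(row)
```

Each row carries the customer ID plus four fields for each of the two orders, which gives the nine columns.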

Here's what my flow looks like from Prep:
I intentionally did NOT rename my tools so that you wouldn't know exactly what I did. You can see my final output below the flow.

Building the viz was pretty simple. Scaling the axes the same is something I do a lot, but I do expect that to trip up people. Creating the 45º line is something I've written a tip on before, but Ann's has a twist as it has to be behind the dots. Sneaky!

The sucky part was floating everything. I started by tiling everything, literally writing down the position and size for each element. Then I floated them one by one and entered what I wrote down to put them back in their proper position. From there, it was a little bit of tweaking to move the axes closer together.

Nice challenge. It wasn't overly complex and required me to reach back into the memory bank.

June 10, 2018

Makeover Monday: Tourism Density Index

For week 24, Eva presented us with something called the tourism density index, which basically means how many tourists come into a country compared to that country's population. Here's the original viz:

What works well?

  • Really good explanations for how they define overtourism and undertourism and examples for each
  • Providing the exact figures for each country
  • Colors are easy enough to distinguish
  • Sorting the countries from lowest to highest
  • Splitting the view between the highest 9 and the lowest 9

What could be improved?

  • Circles are inherently difficult for comparisons. Are they measured by area or diameter? Either way, the circle in a circle is overkill.
  • Why does the size of the light green circle change once the dark green circle is a larger value? That makes no sense at all.
  • If the exact numbers were not included, it would be impossible to compare countries.
  • Why show the top 9? That seems like an unusual way to select the countries.

My Goals

  • Focus on either the raw values or the percentages. I'll figure this out once I explore the data.
  • Make it easier to compare countries.

June 7, 2018

Workout Wednesday: How does sales compare in the Current Period to the Previous?

It's been eight weeks since I've done Workout Wednesday. Sometimes you have to reprioritize things to get other things done. For me, WW was something I could cut out to free up more time for finishing the Makeover Monday book (pre-order here).

But I'm back and this week Rody gave us this challenge. Read all of the requirements here.

I had an idea straight away how to do this and in all it took about 30 minutes. The date offsetting took some tinkering, but the rest was pretty easy. I'm glad Rody is back from his hiatus too. His challenges aren't as brutal as Ann's.
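
For anyone curious what date offsetting can look like, here is a minimal Python sketch. It is not Rody's solution or my Tableau calculation. It assumes the previous period is the same number of days immediately before the current one, and the function name and dates are hypothetical.

```python
from datetime import date, timedelta


def previous_period(start, end):
    """Shift an inclusive date range back by its own length."""
    length = (end - start) + timedelta(days=1)  # inclusive day count
    return start - length, end - length


current = (date(2018, 6, 1), date(2018, 6, 7))
print(previous_period(*current))
```

With a seven-day current period, the previous period is the seven days that end the day before the current period starts.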

Click on the image for the interactive version.

June 5, 2018

Tableau Tip Tuesday: Split, Pivoting and Union with Tableau Prep

Now that the manuscript of the Makeover Monday book (pre-order here) has been sent to the publisher, I'm going to do my best to get back into a rhythm with weekly tips. This week, I'm going to show you how I used Tableau Prep to prepare the data for my Makeover Monday viz this week. It involves splitting eight columns into two sets of four, renaming fields, pivoting the data, and then unioning it all back together.

You can download the flow here.

June 4, 2018

Makeover Monday: The UK Gender Pay Gap Across Salary Bands

This week's data comes from the UK government and more specifically the Valuation Office Agency. I was alerted to this data set by Aisling Roberts, who had written a great article on LinkedIn that questions whether people will actually take any actions based on the data.

Let's start with this viz from the official report:

What works well?

  • The symbols make it clear this is about females and males.
  • The BAN in the middle tells us what the bonus pay gap is.

What could be improved?

  • Both icons are filled to the same level, making it look like there is no bonus pay gap. These should be filled to the actual values for each gender.
  • The icons don't add much value.
  • The title could tell us a whole lot more.
  • There's no source listed and no timeframe.
  • The gridlines aren't evenly spaced between 0% and 50%.

I must admit, this is a tough data set. Hopefully the explanations I wrote provide sufficient context. I found it most useful to look at a specific company and look those values up in the data provided to ensure I understood what they mean. Given that I found the data overwhelming, I decided to focus on the pay bands since that's what Aisling focused on in her article.

From there, I started to build lots of charts, but found the number of companies overwhelming. Therefore, I decided to limit the data to those companies located in the City of London (i.e., those with a postcode that starts with EC). I also knew I needed to do some data prep so that I could compare females and males in each pay band more easily. I turned to Tableau Prep for this.

The flow works like this:

  1. Remove columns that aren't needed
  2. Split the data into two streams, one for the female columns and one for the male columns
  3. Pivot the data so that the pay bands are listed down instead of across
  4. Add a column for the gender
  5. Union the data back together
  6. Export to an extract
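
The split, pivot and union steps above can be sketched in plain Python. This only illustrates the reshaping and is not the Prep flow itself. The column names, the employer and the percentages are made up for illustration.

```python
# Hypothetical pay band names and column headers, made up for illustration.
BANDS = ["lower", "lower_middle", "upper_middle", "top"]

WIDE = [
    {
        "employer": "Example Ltd",
        "female_lower": 60.0, "female_lower_middle": 50.0,
        "female_upper_middle": 40.0, "female_top": 30.0,
        "male_lower": 40.0, "male_lower_middle": 50.0,
        "male_upper_middle": 60.0, "male_top": 70.0,
    },
]


def pivot_stream(rows, gender):
    """Keep one gender's columns, pivot the pay bands down, add a gender column."""
    tall = []
    for row in rows:
        for band in BANDS:
            tall.append({
                "employer": row["employer"],
                "gender": gender,
                "pay_band": band,
                "proportion": row[f"{gender}_{band}"],
            })
    return tall


def prepare(rows):
    # Union the female stream and the male stream back together.
    return pivot_stream(rows, "female") + pivot_stream(rows, "male")


TALL = prepare(WIDE)
for record in TALL:
    print(record)
```

One wide row with eight measure columns becomes eight tall rows, so gender and pay band are ordinary fields that can go on rows, columns or color.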

Pretty straightforward, and the short amount of time spent prepping the data made the gender comparison significantly easier. I first wanted to understand how the median proportion of females and males in each pay band varies by the size of the company, with a City of London total (NOTE: the total only represents companies that reported).

Click on the image for the interactive version.

This simple view makes it incredibly evident that the proportion of females declines as the pay band increases. Males would be the inverse. It's particularly stark in the largest organizations. In the City of London, there are only three employers in that range (British Telecom, Royal Mail, and Sainsbury's Supermarket).

The heat map helped give me an overview of the data and I felt ready to create something more detailed. This time I wanted to look at all companies together by gender by pay band compared to the overall median for each gender. I also wanted to provide the user with the option to choose a specific company. When they do, that company gets highlighted.

Click on the image for the interactive version.

What first struck me in this view is the clear, overwhelming pattern down and to the right for women. This gave me a strong impression of how big the gender pay gap problem is.

The gender pay gap is not a myth. These are facts, facts that show women are underrepresented at higher salary levels. Don't let this discussion get lost. Check out your own company. How are they performing? Ask them to share the data within your organization. Transparency is a key to fixing this discrepancy.

May 27, 2018

Makeover Monday: Europe Is Moving Up the Ranks of the World’s Wealthiest Regions

#MakeoverMonday HQ has been into small data lately and this week continued the trend: 2 columns x 20 rows. I love small data!

The viz to makeover this week comes from the World Economic Forum.

What works well?

  • Good title and subtitle
  • The color scale, from dark to light, makes it relatively easy to see which cities are the most and least expensive.
  • It's easy to see that the bottom-left city is the most expensive and the upper-right is the least expensive (of the list).
  • Including footnotes for the data
  • Including the source

What could be improved?

  • A treemap is used to represent parts-to-whole relationships. This is only a selected set of cities.
  • Displaying square meters as parts-to-whole makes no sense at all.
  • The sorting of the treemap is strange. It seems the whole chart should be flipped on the y-axis.

My Goals

  • After reading the article that the source refers to, I wanted to look at cities by their regions and focus on Europe.
  • Make the ranking of the cities much easier to understand
  • Use highlighting to focus the analysis
  • Make everything in a single chart (except the mobile version since I can't change the title and caption font size)

With that, here's my simple bar chart showing how European cities are mostly ranked near the top of the list of the World's wealthiest cities.

May 21, 2018

Makeover Monday: How well did The Guardian predict the Premier League table?

Back to sports again this week. With the Premier League season just finishing, we're looking at how well The Guardian predicted the EPL table at the start of the season.

What works well?

  • Sorting the teams by prediction makes sense since this is an evaluation of their performance against their prediction.
  • Including the logos so people can find their favorite team
  • Including the numbers for the table position so that the reader doesn't have to count as they go
  • Shading every other row helps break up the view

What could be improved?

  • If you don't know the team logos, it can be hard to track a team across the table.
  • It's hard to see which team did better and worse than expected.
  • There's no scale for how "well" The Guardian predicted the table.

My Goals

  • Focus on the difference between the predicted and actual results
  • Try to create some sort of unit chart (I didn't have time to figure out the calcs, so I cheated with distribution bands)
  • Make it easier to see if a team finished above or below the predictions
  • Finish in under an hour because we did MM live at the Data School and had to present to Eva at the end of the hour 
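
The difference between predicted and actual position is simple arithmetic, sketched here in Python. The team names and positions are placeholders, not the real table, and the sign convention (positive means better than predicted) is an assumption.

```python
def position_delta(predicted, actual):
    """Positive: finished higher than predicted (a smaller position number is better)."""
    return predicted - actual


# Placeholder teams with (predicted position, actual position).
TABLE = {"Team A": (1, 3), "Team B": (10, 7), "Team C": (5, 5)}

DELTAS = {team: position_delta(p, a) for team, (p, a) in TABLE.items()}
print(DELTAS)
```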

May 14, 2018

Makeover Monday: Which European commuters spend the most time in traffic jams?

It's crazy busy times at the Kriebel household, so this week I had to get something done quickly. Eva chose a data set about time people spend in traffic congestion in European cities.

What works well?

  • Bars are ranked in descending order
  • Simple, clear title
  • Axis title tells us what the bars represent
  • Nice tooltips
  • Footnotes that qualify the data

What could be improved?

  • The alternating bar colors add no meaning.
  • The title has a weird shape to it. 

My Goals

  • Change the metric to percent of time spent in congestion during peak hours, which required me to go to the source to get the additional data.
  • I took inspiration from Eva's viz, but wanted to show the congestion as a percentage rather than a raw number. I feel this gives more context to the numbers and lets the audience know their likelihood of being stuck in traffic in these cities.
  • Create the viz as a single worksheet.

May 6, 2018

Makeover Monday: Toughest Sport by Skill

I've had this data set in the queue for probably two years now and it's finally time that I got to post it. I know Eva loves sports data sets (cue eye roll), so this week we're looking at the toughest sport by skill according to a group of sports science experts surveyed by ESPN.

What works well?

  • The sports are ordered by rank by default, making it easy to see how they compare to other sports.
  • You can sort by any of the column headers.
  • All of the definitions are provided and thoroughly explained.
  • If you need to look up a value, a table works perfectly.

What could be improved?

  • It needs to be easier to see the relative difference between sports.
  • Comparing more than one metric at a time across two sports takes too much brain power (for me at least).
  • Filtering would help make the list more digestible.

My Goals

  • Allow the user to compare two sports, rather than all sports at once.
  • Make the difference between the sports across all of the skills easier to understand.
  • Show which skill is harder/easier when comparing two sports.

April 30, 2018

Makeover Monday: Annual Change in American Bee Colonies

This week Makeover Monday is a collaboration with #VizForSocialGood to analyze bee colonies in America. The data and visualization come from Bee Informed.

What works well?

  • When you hover over a state, the value is indicated on the color legend.
  • Informative tooltips
  • Good filtering capabilities to customize the view
  • Very responsive tooltips
  • Displaying the states and territories outside the continental US separately

What could be improved?

  • A diverging color scale is typically used to indicate positive and negative values. In this case, all values are positive.
  • If you do want to use a diverging palette, the colors should merge at the median.
  • A filled map makes smaller states difficult to compare to larger states.
  • There's no sense of the change over time. Are colonies increasing or decreasing?
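
To make the point about merging at the median concrete, here is a minimal Python sketch that centers a diverging scale on the median. It is an illustration only, not how the original map or my viz was built. The function name and the sample values are hypothetical.

```python
from statistics import median


def diverging_position(value, values):
    """Map a value to [-1, 1], where 0 is the median of all values."""
    mid = median(values)
    # Scale each side of the median separately so both ends reach -1 and 1.
    span = (max(values) - mid) if value >= mid else (mid - min(values))
    return (value - mid) / span if span else 0.0


# Made-up values for illustration.
losses = [10, 20, 30, 60, 90]
print([diverging_position(v, losses) for v in losses])
```

A position of 0 would get the neutral midpoint color, with the two hues growing stronger toward -1 and 1.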

My Goals

  • Create small multiple maps that show each year
  • Make the story about the change, rather than the specific values
  • Use highlight actions to make it easy to see a state across all maps
  • Incorporate the total annual loss into the tooltip

April 23, 2018

Makeover Monday: Biocapacity vs. Ecological Footprint

Makeover Monday week 17 is a collaboration with the Global Footprint Network for Earth Day 2018. They've been fantastic to work with throughout the planning. Here's the viz they volunteered for us to makeover:

What works well?

  • Simple title
  • Nice framing of the legend
  • Clicking on categories to add/remove them from the view
  • Super responsive tooltips

What could be improved?

  • Everything is very compact making it impossible to read
  • Rotate the chart and make it tall vs. wide
  • Reduce the number of categories to reduce the number of colors
  • Provide a total option

For my viz, I wanted to recreate what was in the tooltip of one of their maps. I didn't have much time so I had to get it done quickly and I really like the BANs and how coloring between the lines helps emphasize the difference.

April 16, 2018

Makeover Monday: The Seasonality of Confirmed Malaria Cases in Zambia Southern Province

For Makeover Monday week 16, Eva and I are hosting a #MakeoverMonday Live at Tableau HQ in London. The data and viz this week were provided by Jonathan Drummey and the Visualize No Malaria project.

What works well?

  • The colors are distinct from each other.
  • The seasonality is very evident.
  • The title is simple and tells us what the viz is about.

What could be improved?

  • Are the colors stacked or is one behind the other?
  • The overall decline is harder to see than necessary.
  • What happened at the spikes? Adding some annotations would be helpful.
  • Why is the data split between health facilities and health workers?

My Goals

  • Can I show the overall decline more effectively?
  • What does the viz look like when I combine the health facilities and health workers?
  • Are there colors that will work more effectively?
  • How can I make the seasonality more evident?

With those goals in mind, here is my Makeover Monday week 16. If this looks somewhat familiar, I created a very similar viz with a very similar data set for Makeover Monday week 34 2016.

April 12, 2018

Workout Wednesday Part 2: Total Products by Sub-Category

Ok, part 2 of this week's challenge, the Jedi version, really sucked (requirements here). It took me FOREVER! I used table calcs for all of the calculations and getting them just right took a really long time and a lot of experimenting. Surely Tableau can make this easier for us.

Some thoughts:

  1. Getting the subcategories to lay out correctly in a trellis plot was easy.
  2. Getting the labels above each grid was easy.
  3. Getting 10 dots across each pane was easy.
  4. Getting the stacking of the dots in rows was a pain!
  5. Luke has an evil side.
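
The arithmetic behind placing 10 dots across and stacking the rest in rows can be sketched in Python. This is not the set of table calcs I used in Tableau, just the underlying index math. The function name and the 0-based indexing are assumptions.

```python
def dot_position(index, per_row=10):
    """Turn a 0-based dot index into a (column, row) grid position."""
    # The remainder gives the position across; integer division gives the row.
    return index % per_row, index // per_row


# The first 25 dots fill two full rows of 10 and half of a third row.
positions = [dot_position(i) for i in range(25)]
print(positions[:12])
```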

But I absolutely loved the challenge. I'm really enjoying these! My advice for everyone is to keep at it until you get it. Even if you're stuck, don't cheat and download the solution. That doesn't help your learning. If it's too hard, then consider skipping it; it might not be the right level for you yet. You can always come back and do it later.

April 11, 2018

Workout Wednesday Part 1: Top 5 Subcategories with the Most Products

Luke is back and he has posted two challenges this week. First, create a pictogram of the top 5 subcategories that have sold the most products. They should be represented in 100 dot sections and colored depending on the section they are in. Get the requirements here.

There were two requirements that I didn't need to use to make it work:

  1. Set the minimum and maximum values on the columns axis (x-axis) to -3 and 12, respectively.
  2. Set the minimum and maximum values on the rows axis (y-axis) to -1 and 32, respectively.

I'm not sure what the purpose of these would be, but I suspect it's some sort of spacing. I didn't need them, so I ignored these requirements. Here's my version and now I'll get to work on part 2.

April 9, 2018

Makeover Monday: Arctic Sea Ice is Disappearing Fastest in Summer Months

I'm writing this having just finished a bike tour of Rome with my family in an absolute monsoon. Global warming is proven to cause unusual volatility in the weather, including hotter summers, extreme winter storms, and changing warm water patterns around the earth. This warming is most evident near the Arctic, where ice levels are at all-time lows and the cycle of melting is accelerating year upon year.

So when I found this visualization by the National Snow & Ice Data Center, it seemed an appropriate topic for Makeover Monday. One of the most fun elements of this data set is that it includes only two columns: date and sea ice extent.

What works well?

  • Without even trying, it tells a compelling story.
  • The interactivity is fabulous. I really like being able to simply click on an item on the legend to have it added or removed as a highlighted line.
  • Including the 1981-2010 median along with the IQR and IDR provides great context.
  • Defaulting the view to show 2012 (the previous worst year for arctic ice) to 2018 helps show how 2018 is looking to surpass 2012 (in a bad way) by a lot.
  • Subtitle explains what sea ice extent means
  • Good use of simple colors
  • Great example of using highlighting for context

What could be improved?

  • The x-axis could be simpler by only showing the month names and removing the word "Date" from the axis title.
  • Make the title more impactful

My Goals

  • First, I wanted to rebuild the original and see if I could make it any better. I couldn't.
  • Second, build a spiral diagram that shows the months around the outside, but this only worked well when it was animated.
  • Finally, I settled on a different take on the metric that swaps the months and year on the original. That is, put the year on the x-axis and month on each line. This gave me only 12 lines which looked less busy and helped me see patterns for each month.
  • Next, I included a line that is the average of each year (black line).
  • I then decided to look at how each year of each month changed compared to 1979. I went with a percent change because I think that provides more context.
  • Lastly, I included a highlighter for the months and included some BANs of the actual values for comparison.
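
The percent change against 1979 is a simple baseline calculation, sketched here in Python. The extent values below are made up for illustration and are not the NSIDC figures. The function name is hypothetical.

```python
def percent_change_from_baseline(series, baseline_year=1979):
    """Percent change of every year's value relative to the baseline year."""
    base = series[baseline_year]
    return {year: (value - base) / base * 100 for year, value in series.items()}


# Made-up sea ice extent values for a single month, keyed by year.
september = {1979: 8.0, 2000: 7.0, 2017: 6.0}
CHANGES = percent_change_from_baseline(september)
print(CHANGES)
```

Running this once per month puts every line on the same scale, starting at 0% in 1979.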

Click on the image for the interactive version.

April 4, 2018

Workout Wednesday: Frequency Matrix

This week Ann Jackson stepped in to provide everyone the challenge. I think she's been seeking some revenge and she sure dished it out. Find all of the requirements here.

Easiest Parts

  • Use sub-categories (HINT: Since it's a frequency matrix, there's a trick here.)
  • Dashboard size is 1000 x 900; tiled; 1 sheet
  • Distinctly count the number of orders that have purchases from both sub-categories
  • Sort the categories from highest to lowest frequency
  • White out when the sub-category matches and include the number of orders
  • Match formatting & tooltips – special emphasis on tooltip verbiage
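
The distinct count of orders containing both sub-categories can be sketched in plain Python. This is not Ann's solution or my Tableau workbook, only the counting logic. The order IDs and order lines are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Hypothetical order lines: (order_id, sub_category)
LINES = [
    ("O1", "Phones"), ("O1", "Tables"), ("O1", "Phones"),
    ("O2", "Phones"), ("O2", "Paper"),
    ("O3", "Tables"),
]


def frequency_matrix(lines):
    """Count distinct orders for every pair of sub-categories."""
    # A set per order removes duplicate lines for the same sub-category.
    subcats_by_order = defaultdict(set)
    for order_id, sub_category in lines:
        subcats_by_order[order_id].add(sub_category)

    # Pair every sub-category in an order with every sub-category in that order.
    counts = defaultdict(int)
    for subcats in subcats_by_order.values():
        for a, b in product(subcats, repeat=2):
            counts[(a, b)] += 1
    return dict(counts)


MATRIX = frequency_matrix(LINES)
print(MATRIX.get(("Phones", "Tables"), 0))
```

The diagonal, where a sub-category is paired with itself, holds the number of orders that contain that sub-category.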

Hardest Parts

  • Calculate the average sales per order for each sub-category
  • Identify in the tooltip the highest average spend per sub-category (see Phones & Tables)
  • If it’s the highest average spend for both sub-categories, identify with a dot in the square

I messed around with the average sales per order calc for quite a while. I have every number the same except for a couple in paper and binders. I have had a look at Ann's solution and I think mine is more accurate, so I'm sticking with it. :-)

Excellent challenge this week! Give it a try, but give it a REAL try before looking at someone's solution; you will absolutely learn more. Click on the image for my interactive version.

April 1, 2018

Makeover Monday: World Wine Production

This week Eva provided a simple data set and a simple viz from the International Organisation of Vine and Wine.

What works well?

  • It's a simple line chart, which makes it easy to understand.
  • The red line stands out well against the white background without being too bright.
  • The units on the axis are labeled.
  • The title tells us what the line represents.

What could be improved?

  • The subtitle could be moved to a caption below the image.
  • The axis has a strange scale. Why does it start at 180?
  • Adding the drop lines makes it look like the length of those lines is important, but if you compare the length of the lines, then that could be misleading due to the axis starting at 180. I'd remove those lines.
  • The year labels are diagonal.
  • Each year doesn't need a label.
  • Why doesn't the source document contain data for all of the years?

My Goals

  • It's Easter and I have basically no time to work on this, so do something quick.
  • Mimic what we created for Workout Wednesday week 33.
  • Focus on the relative change from a chosen period instead of the absolute change. For me, this is more meaningful if you want to see how much a country has changed and it normalizes all of the countries.

That's it. All done!