Launch, grow, and unlock your career in data

May 30, 2016

Makeover Monday: The History of Famous Wrestlers


Confession: I worked on this on Wednesday, several days before the data went live to all of you. Sorry! But I’m on a family holiday to Paris. Yes, Paris. Lucky us!

Anyway, the visualisation and data this week come from Horizontal History. Andy C did the data prep this week, so if there are problems, blame him, though he won’t respond because he’s camping somewhere in France. I’ve been dreading this data set for a while. I didn’t find it particularly interesting, but Andy insisted since I’ve been blowing this one off for 2 months.

Let’s start by looking at the original visualisation. You should definitely take the time to read through the original article because the author went through painstaking efforts to gather the data and gives an amazing overview. I’m going to focus on the last chart because it’s the most complete.

Click the image for the full size version

What works well?

  • This is a TON of information to display in one view. Given that constraint, the author organizes it probably as well as possible.
  • The tall, thin layout works well and gives you a good sense for the length of the timeline.
  • I like the dashed lines that signify a new century. It breaks up the viz and makes it easier to follow along.

What doesn’t work well?

  • There are too many colors that are too bold. It’d be tough to come up with a color palette that works well for this many categories. Softer tones would work better.
  • I don’t like turning my head sideways to read something.  Rotate the whole viz and make it horizontal (which the author later did here).
  • While I get that this is supposed to be an extensive list, it’s overwhelming. I’d love to see some interactivity like filtering and highlighting so that I can find my own story.
  • If it were interactive, it would be good to include more information about each person as you click on them.
  • Simplify the visualization. It’s just too busy, but granted, that was kind of the point the author was making. Again, read the article and you’ll understand his approach.

I played with this data set, begrudgingly, for about 30 minutes and was getting a bit irritated. For me, it was overwhelming. I’m hoping others don’t feel the same way. I think my unconscious bias came into play this week against this data set. Finally I stumbled upon the Domain field and saw there was a sports category. Great! I love sports. This is the point where I decided to focus on a simpler subset of the data.

Within sports I first looked at footballers before finding the Holy Grail…professional wrestlers! I was a MASSIVE wrestling fan as a kid. The Hulk Hogan / Andre The Giant match from Wrestlemania III totally changed wrestling. I’m watching it as I type this. Hulkamania shot through the roof! And in early 2015, I even got to meet the Hulkster in person!

So the heck with a makeover of the entire original viz, I wanted to create something about WRESTLING and the superstars I watched through the years. Enjoy!

May 27, 2016

Fix it Friday: Early Leavers from Education and Training in Europe

On the train to work this morning I was reading through the blogs I follow and ran across this amazing visualisation from Stephanie Evergreen:

I love small multiples and I love slope charts, and this is an amazing combination of the two. Shortly thereafter, I ran across this chart from the Financial Times:

To me, this chart is screaming out for a slope chart. Also, I don’t understand why they didn’t include all countries in Europe. I downloaded the data from Eurostat and created this small multiples slope chart in Tableau.

I also was able to include an option that allows you to pick a gender or the overall. Notice how the title changes color to match the lines in the slope graph. Do you know how I did that?

Which one do you think tells the story better? Does the bar chart or the slope chart make comparing the years easier?

May 24, 2016

Tableau Tip Tuesday: Five Use Cases for Strip Plots

In last week’s Makeover Monday about global warming I included a strip plot at the bottom of my final visualisation. You may hear these also called barcode charts or frequency charts, but whatever their “official” name, they are very useful for:

  1. Seeing a lot of data at a glance
  2. Understanding concentration of the data
  3. Spotting seasonal trends

In this week’s video tip, I walk you through five use cases for strip plots varying from global warming to the frequency of fires to distribution of sales to deprivation in Scotland.

May 23, 2016

Makeover Monday: The Militarization of the Middle East in a Post-9/11 World


After an epic week 20 for Makeover Monday, I had great expectations for this week. Another great data set, this time looking at global arms imports and exports. But dang it was tough! I really struggled this week making something I was happy with. In the end, time is up and I learned a lot.

Let’s start by looking back at the original visualisation.

What works well?

  • The colors clearly distinguish imports and exports.
  • The labels provide the needed context.
  • Nice small line charts for Europe and the Middle East along with an indicator for the rate of change.

What could be improved?

  • The title of the article doesn’t match the chart.
  • It’s hard to compare countries.
  • Why were the countries that are shown selected? Are they the top N?
  • Why is the timeframe 2011-2015? That seems a bit arbitrary.
  • Why are there only sparklines for Europe and the Middle East?
  • The lower section with the flags has nothing to do with the map.
  • In the lower section, why don’t UAE and China show awaiting delivery? It should be consistent.
  • Is there a better story that can be told? The data goes back to 1950 after all.

I decided to focus on the title of the article: “The Militarization of the Middle East”. And I focused even further by looking at the post-9/11 era from 2002-2015. America initiated a war with the Middle East. I wanted to know how that impacted the import of arms to the region.

Once again this was a week of iterations. I started with this small multiple map view, but didn’t think it showed the change through the years very well.

Click the image for the interactive version

I then looked at a slope graph comparing the % of total arms imported in the region by country in 2002 compared to 2015. This definitely shows the rate of change better, but I lose the context of the years in between.

Click the image for the interactive version

Maybe a DNA chart will work better than the slope graph? Not really, it just flattens it out.

Click the image for the interactive version

I was getting frustrated by this point, so I decided to take the opportunity to learn a new technique. I recently read Matt Chambers’ blog post on how to build ranked bump charts and thought this would make a great use case for this type of chart. In this view, I can see how a country moves year by year in the ranking of arms imported into the Middle East. I really like being able to click on a country and see it highlighted.
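For readers who want to see what sits underneath a ranked bump chart, here is a minimal Python sketch of the per-year ranking. It is not the Tableau method from the blog post mentioned above. The function name `rank_by_year`, the tie-break rule (alphabetical by country), and the sample numbers are all assumptions made up for illustration, not the real arms import data.

```python
from collections import defaultdict


def rank_by_year(rows):
    """Rank countries within each year, 1 = largest value.

    rows: iterable of (year, country, value) tuples.
    Returns {year: {country: rank}}.
    Ties are broken alphabetically by country (an arbitrary choice).
    """
    by_year = defaultdict(list)
    for year, country, value in rows:
        by_year[year].append((country, value))

    ranks = {}
    for year, items in by_year.items():
        # Sort by value descending, then by country name for stable ties.
        ordered = sorted(items, key=lambda cv: (-cv[1], cv[0]))
        ranks[year] = {country: i + 1 for i, (country, _) in enumerate(ordered)}
    return ranks


# Made-up values for illustration only.
sample = [
    (2002, "A", 50), (2002, "B", 30), (2002, "C", 20),
    (2003, "A", 10), (2003, "B", 40), (2003, "C", 25),
]
ranks = rank_by_year(sample)
```

Plotting year on the x-axis and rank on the y-axis (reversed, so rank 1 is at the top) gives one line per country, which is the bump chart shape.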

What the bump chart loses, though, is the context of the overall value of the arms imported. So to take care of that, I included the sparkline, which also updates when you click or tap on a country. In the end, I’m satisfied and I learned something new. That’s a bit of what Makeover Monday is about.

May 19, 2016

Rain Patterns at Mount Diablo: What 60 years of rain data tells us about the Northern California drought


I’ve been getting deep into Alberto Cairo’s latest book “The Truthful Art” and was particularly fascinated by the Rain Patterns in Hong Kong visualisation created by The South China Morning Post.

Immediately I began to think back to our time living in Northern California and the historic drought conditions. I decided to use Mount Diablo as a representative weather station because it was one of the most complete and oldest in the Bay Area.

I decided to use this visualisation as inspiration for a version of my own. Some interesting patterns reveal themselves:

  • It hardly ever rains between the end of May and early October.
  • The highest single-day rain total in the last 60 years was 5 inches on 21-Jan-1967.
  • Out of the 21,404 days in the data set, only 3,892 had any measurable rain (18.2%).
  • We lived in Pleasanton for 1,070 days. During that time, there were only 145 days of measurable rain (13.6%) and only 57.3 inches of rain in total (less than 1/2 inch per day that it rained).
  • Over 60 years, there’s an average of 66 days of rain per year.
  • During our time living there, we saw an average of 50 days of rain per year.
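The percentages and per-year averages above can be rechecked with a few lines of Python. This is only a sketch of the arithmetic: the day counts and rain totals are the ones quoted in the list, while the 365.25 days-per-year conversion is my assumption, since the post does not say how years were counted.

```python
# Figures quoted in the list above.
total_days = 21_404
rain_days = 3_892
pleasanton_days = 1_070
pleasanton_rain_days = 145
pleasanton_inches = 57.3

# Assumption: convert days to years using the average year length.
DAYS_PER_YEAR = 365.25

share_rain = rain_days / total_days
share_rain_pleasanton = pleasanton_rain_days / pleasanton_days
inches_per_rain_day = pleasanton_inches / pleasanton_rain_days
rain_days_per_year = rain_days / (total_days / DAYS_PER_YEAR)
pleasanton_rain_days_per_year = pleasanton_rain_days / (pleasanton_days / DAYS_PER_YEAR)

print(f"Rainy days, full record: {share_rain:.1%}")
print(f"Rainy days, Pleasanton:  {share_rain_pleasanton:.1%}")
print(f"Inches per rainy day:    {inches_per_rain_day:.2f}")
print(f"Rainy days per year:     {rain_days_per_year:.1f}")
print(f"Pleasanton, per year:    {pleasanton_rain_days_per_year:.1f}")
```

With this conversion the Pleasanton average lands just under the 50 days per year quoted above, so that figure was probably rounded up or based on a slightly different way of counting years.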

My version was made with Tableau 10, so I can’t publish it to Tableau Public yet. You can download the workbook here and the data set I used here. The data was sourced from NOAA.

Finally, a special thanks to Data Schooler Nisa Mara for her feedback during the process.

12 Books Every Great Data Analyst Should Read

I had the pleasure of speaking today to the Virtual Government Tableau User Group. I mentioned that great data analysts have an appetite for learning and included a list of books that every great analyst should read. Here are 12 books that I highly recommend (obviously this isn’t an exhaustive list):

May 18, 2016

The Data School Gym - Marimekko Alternative


Former Data Schooler Nicco Cirone posted a challenge to the team at The Information Lab on Monday to create this alternative to a Marimekko chart.

Click the image for a larger version

Nicco got the idea from this post by Jon Peltier about the problems with Marimekko charts. Give it a shot. It’s a bit tricky and one you will surely learn something from. There are several very subtle tricks in here. Try to match mine pixel for pixel.

If you want to use the same version of Superstore Sales that I used, you can download it here. If you think you have it and want to check to see if you got it right, you can view the final solution on my Tableau Public profile here.

Good luck!

May 17, 2016

Tableau Tip Tuesday: How to Create Monthly Radar Charts


In this week’s Tableau Tip Tuesday, I show you how to create radar charts that are based on monthly data. That is, the months go around an imaginary circle like a clock. Thanks to Jonathan Trajkovic for this great explanation and to Ed Hawkins for the inspiration.
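The geometry behind "months around a circle like a clock" is simple trigonometry. Here is a minimal Python sketch of it, not the calculations from the video: the function name `month_to_xy` and the choice to put January at 12 o'clock and move clockwise are assumptions for illustration.

```python
import math


def month_to_xy(month, value):
    """Place a monthly value on a circle.

    month: 1..12 (1 = January, drawn at 12 o'clock, moving clockwise).
    value: the distance from the centre (the radius for that month).
    Returns (x, y) coordinates.
    """
    # Each month covers 1/12 of the full circle (2 * pi radians).
    angle = 2 * math.pi * (month - 1) / 12
    # sin for x and cos for y makes angle 0 point straight up.
    return value * math.sin(angle), value * math.cos(angle)


# One point per month at a constant radius traces a regular 12-sided shape.
points = [month_to_xy(m, 1.0) for m in range(1, 13)]
```

If the measure can be negative, as temperature differences can, it needs a constant offset added before being used as a radius so that every point stays on the outward side of the centre.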

May 15, 2016

Makeover Monday: How warm is Earth becoming?


There was a lot of chatter on Twitter last week about this terrific visualisation by Ed Hawkins:

The beauty of this visualisation is in the animation. However, without the animation, it kind of fails to tell the story. Let’s dig a bit deeper.

What works well?

  • There is a clear title.
  • The background circles provide helpful context.
  • Including the month labels makes it easier to understand what you’re seeing.
  • The year in the middle helps tell the story.
  • The animation is compelling.
  • It has a nice color scheme that works well on a black background.

What could be improved?

  • While the title is clear, it could be more eye-catching, like a news headline.
  • If you see this as a static image, you lose the sense of change.
  • You can’t compare any time periods. All you know is 2016 is the warmest.
  • There’s no explanation about what the numbers represent. Though I do see in the Twitter post a link to additional information.
  • The color scale has nothing to do with the temperature change, which I assumed it did until I read the additional information. The colors actually represent the years. That doesn’t add much value. I think coloring by the temperature change would be more impactful.

So, this data set is actually incredibly simple. All we have is one record per month, the temperature, and the confidence intervals.

The first thing I wanted to do was rebuild the radial chart. This wasn’t nearly as easy as I thought. This post by Jonathan Trajkovic was very helpful, but it wasn’t designed for months. I’ll record how I made it for a future Tableau Tip Tuesday.

Click the image for the interactive version

This radial chart is basically the same as the original; however, I can’t make it “play” on Tableau Public, and I also changed the color to be the median temperature difference. Really, I only built this to see if I could. It’s not any more useful than the original.

Next, I took the radial chart and flattened it out.

Click the image for the interactive version

This doesn’t make the understanding all that much easier because I can’t tell which years are which. Maybe I should switch the color legend back to years?

Click the image for the interactive version

Oh wow! What a difference! Now I can easily see the distinction between the older and more recent years. I think this is much, much better than the original, especially in static format. I wanted to keep iterating though.

Whenever I’m working with time-based data, I like to build either calendar heatmaps or heatmaps by year and month. Here’s what this data set looks like as a heatmap:

Click the image for the interactive version

The heatmap makes the series of lines even easier to understand. It’s super easy to see the gradual temperature change over time. This is pretty compelling, yet I wanted to keep going. Was there a better way to tell the story?

Next I looked at the 10-year average, that is, a 120 month moving average of the median temperature change. I then overlaid the confidence intervals.

Click the image for the interactive version
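A 120-month moving average like the one described above is easy to sketch in plain Python. This is an illustration, not the Tableau table calculation behind the view: the function name `trailing_moving_average` is made up, and returning nothing until a full window is available is my assumption, since the post doesn’t say how the first ten years were handled.

```python
from collections import deque


def trailing_moving_average(values, window=120):
    """Trailing moving average over the last `window` values.

    Returns a list the same length as `values`; entries are None until
    a full window is available (an assumption, see the note above).
    """
    averages = []
    buffer = deque(maxlen=window)
    running_total = 0.0
    for value in values:
        if len(buffer) == window:
            # The oldest value is about to be pushed out of the window.
            running_total -= buffer[0]
        buffer.append(value)
        running_total += value
        averages.append(running_total / window if len(buffer) == window else None)
    return averages


# Tiny example with a 3-value window so the result is easy to follow.
example = trailing_moving_average([1, 2, 3, 4, 5], window=3)
```

With monthly data sorted by date, calling it with the default window of 120 gives the 10-year average for each month from the tenth year onward.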

Lastly, I took the 10-year moving average view and replaced the monthly confidence intervals with the monthly values while keeping the overall 10-year average. This is my submission for Makeover Monday. In this view, I like how I can see the drastic monthly fluctuations but still have the overall context. Including a reference line at zero helps emphasize the dramatic change since about 1984.

I also included a strip plot under the graph that shows the average median temperature difference for the entire year. This brings back a bit of the heatmap view above.

In the end, another fun week with a simple data set that provides lots and lots of options. Which one do you like best?

UPDATE: This week has been a fascinating exercise in iterating. That’s the beauty of Tableau. I can get another idea and build it quickly. After seeing some of the submissions for this week, I thought a jitter plot might work well. Thoughts?

Click the image for the interactive version

May 12, 2016

The Data School Gym - Timeline Pareto Chart


Another Data School Gym challenge for you. Today Jonathan MacDonald reached out to me with a question. I’m glad he reached out because I’m constantly asking him questions. He asked:

How can we create a timeline Pareto chart? That is, we need to calculate the cumulative sales for a dimension from its first sale until its most recent, but the time has to be expressed as a % of cumulative days.

So, here’s the challenge for you. Create this chart below. It’s based on Superstore Sales and there are no big tricks. Just see if you can do it. Some hints:

  1. You will need an LOD calc for the x-axis.
  2. Include a single select parameter for the dimensions. Each line on the chart represents one element of the selected dimension.
  3. Include an option to highlight a specific element within the dimension selected.
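To make the definition concrete without giving away the Tableau solution, here is a minimal Python sketch of the calculation behind a timeline Pareto. Several things in it are assumptions rather than part of the challenge: the function name `timeline_pareto`, the input shape, plotting the y-axis as a cumulative share of each element’s total sales, and every element having positive total sales. Finding each element’s first and last sale date plays roughly the role that the LOD calc plays for the x-axis.

```python
from collections import defaultdict
from datetime import date


def timeline_pareto(rows):
    """Compute timeline Pareto points per element.

    rows: iterable of (element, order_date, sales) tuples.
    Returns {element: [(share_of_days_elapsed, cumulative_share_of_sales), ...]}
    with points ordered by date. Assumes total sales per element are positive.
    """
    # Sum sales per element per day.
    daily = defaultdict(lambda: defaultdict(float))
    for element, day, sales in rows:
        daily[element][day] += sales

    result = {}
    for element, per_day in daily.items():
        days = sorted(per_day)
        first, last = days[0], days[-1]
        # Avoid dividing by zero when an element only sold on one day.
        span = (last - first).days or 1
        total = sum(per_day.values())

        running = 0.0
        points = []
        for day in days:
            running += per_day[day]
            points.append(((day - first).days / span, running / total))
        result[element] = points
    return result


# Made-up rows for illustration only; not Superstore data.
rows = [
    ("Furniture", date(2016, 1, 1), 100.0),
    ("Furniture", date(2016, 1, 6), 300.0),
    ("Furniture", date(2016, 1, 11), 100.0),
]
pareto = timeline_pareto(rows)
```

Each element’s line runs from 0% of its days to 100%, so dimension members with very different lifespans can be compared on the same axis.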

I’ve disabled the download option on the workbook so you can’t cheat. Tweet me when you think you’ve figured it out.

May 10, 2016

Tableau Tip Tuesday: How to Create Directional Lollipops


In this week’s tip, I show you how to create directional lollipops, an alternative view to time series jittering. In the video, I look at the incredible season by Stephen Curry and his shot results minute-by-minute, game-by-game.


May 8, 2016

Makeover Monday: How Many Hours Do Women Work in OECD Countries?


Since Sunday is Mother’s Day in the States, this week’s Makeover Monday topic is about how many hours women work in various OECD countries. Let’s start by reviewing the original chart by Business Insider:

What works well?

  • The stacked bar chart is relatively easy to understand since it only has four colors and there aren’t that many countries to compare.
  • The chart is sorted by the smallest percentage of women working 40+ hours per week, which makes it easy to compare that category.
  • The colors are easily distinguishable.
  • Easy to read headers

What doesn’t work well?

  • I have no idea what year this data is from. The data goes back to 1976. I assumed it was for 2016, since that’s when the article was written, but after finding the data myself, it looks like it’s from 2014.
  • The title of the article “American women work way more than their European counterparts” isn’t entirely true. The chart doesn’t show all of the countries in Europe from OECD. The U.S. would rank 9th if you compare European countries and the U.S. from 2014.
  • The chart title is useless.
  • Japan isn’t in Europe, so why is that included?
  • Why is the OECD average included if this is supposed to be the U.S. compared to Europe?
  • There’s no rationale to the countries they chose to include. Is the author being deceitful on purpose? I hope it’s merely an oversight.
  • While I don’t think this stacked bar chart is terrible, it does make it very hard to compare any of the other categories of hours worked.

The first thing I did was rebuild the chart including all of the OECD countries and reversing the sort to be by the highest rate of women working 40+ hours.

Click to interact

I included several filtering and sorting options to allow the user to find their own story. The user can scroll through all of the years and see how the story unfolds. This view solves the problem of not being able to sort by any of the other categories of hours worked.

I didn’t love this though, so I created a slightly different version that shrinks the bars and adds dots. Think of it as a stacked dot chart.

Click to interact

This is the beauty of Tableau. I can quickly iterate on ideas and see which one I like best. At first, I thought adding the dots would make it easier to understand. I think it looks pretty neat, but actually, I think I made it harder to understand.

The problem in both of these stacked charts is that I can’t see all of the years in one view. I was really curious as to the patterns. Has the % of women working 40+ hours per week in the U.S. grown? How does that compare to the OECD average? How do other countries compare?

With those thoughts in mind, I created this series of line charts across the different work hours ranges.

Click to interact

I love these types of charts. I created one last week as well. What I like about them is they include lots of context. In this particular example, I can clearly see that the U.S. is higher than the OECD average in the 40+ hours worked per week section. Yet I can also see that there are quite a few OECD countries that are higher than the U.S. I can easily compare Europe to North America. Or only look at the top 10 countries according to U.S. News and World Report. I can zoom into a specific working hours category with a simple tap on the filter.

I almost stopped here, because I think this already is much better than the original. However, I wanted to see if there was a better way to compare the different work hours within a single country. To address that, I thought a small multiples view might work well.

Click to interact

I chose to sort the countries by the highest % of women working 40+ hours per week in 2014. Then you read it in a z-pattern. So this view lets you see where a country ranks amongst the others and you can also compare the hours worked within a single country.

Then it hit me. I quickly went to Andy Cotgreave’s blog and found this viz he created a few weeks ago:

Yes! This is it! It even matches the colors I was using. I duplicated the previous viz and changed it to an area chart. I then added some of the filtering options back.

NOTE: If you’re viewing this on a phone, you’ll see a long skinny version with less filtering and that also has the sorting option removed.

It took me five iterations, but I got there in the end. I’m not sure how I could have done this quicker with any tool other than Tableau. I love how I can fail fast! Which version do you like best?

May 5, 2016

Makeover Monday: The Rising Cost of Tuition in the United States - Highcharts Edition

Makeover Monday has been an incredible learning experience for me. I’ve become particularly fond of designing for mobile devices. Doing so has led me to learn a lot about things Tableau can do to make the mobile experience better. The great news is they listen. I’m having a call with the mobile developers to talk to them about my experience.

However, until these problems are fixed, I need to find another way to create visualisations that work well on mobile. Enter Highcharts. It’s a JavaScript-based charting tool that has proven super easy to learn. I’ve never coded in JS before, yet I was able to reproduce my entire Makeover Monday viz from this week without too much fuss. Yes, it takes me longer than Tableau, but the extra time is worth the control I have over the visualisation.

Give it a play, especially on your phone. I think you’ll find it a much better experience than the Tableau version.

May 4, 2016

Data+Women: Women are Underrepresented on Tech Boards


I was listening to the latest Tableau Wannabe Podcast about Women in Data Month and Emily mentioned how Tableau has no females on its Board of Directors. I’m also preparing to speak at the first Data+Women London meetup tomorrow, so I wanted to educate myself a bit and also verify Emily's comment.

I looked on Google Finance at Tableau to get a list of comparable companies. I then included some more big tech companies from Silicon Valley for comparison purposes. The data is shocking!

Of the 17 companies I selected:

  1. Only seven (7) have boards with at least 25% female composition
  2. 0% of the companies have 50% representation of females
  3. Tableau and MicroStrategy have exactly zero (0) female members on their boards

This is sad, truly sad. My message to the leaders of these companies: “Lean in!”

May 3, 2016

Tableau Tip Tuesday: 12 Use Cases for Parameters


Parameters are one of Tableau’s most powerful features. I remember when they were first introduced and how they completely changed the Tableau paradigm. This week’s video takes you through 12 simple use cases for parameters. These 12 barely scratch the surface of what’s possible.

NOTE: This video is 50 minutes long.