Launch, grow, and unlock your career in data

April 30, 2016

Dear Data Two | Week 52: Goodbye

Goodbye - it’s something we say all the time. Sometimes it means “see you soon”. Sometimes it signifies the end. Week 52 of Dear Data Two was all about goodbyes. When do I say them? Who do I say them to? How emotional am I? How do I say goodbye?

The timing for this week couldn’t have been any better. Not only was this the final week for Dear Data Two, it was also the last week for the second cohort of the Data School. I wanted to track my emotions throughout the week to see if anyone or anything triggered me to be more emotional. Unfortunately I never found this out; I rarely got too emotional saying goodbye.

What’s exciting for me is that I know the Data School is ready for their placements. Saying goodbye to them wasn’t about being sad; it was about being proud. Proud of who they’ve become, what they represent, and the impact they’re going to have on the world.

For Dear Data Two, it’s more bittersweet. I’m very excited it’s over. Tracking data about yourself on a different topic for 52 weeks is not easy. Taking that data and trying to understand it is even harder. Then I still had to create a postcard, which was BY FAR the hardest part. I’ve grown, and I’ve learned an amazing amount about myself, about Tableau, about Jeff. That’s what I’ll miss.

What I take most from this project is the approach: drawing visualisations makes you think differently. There’s no interactivity, no computer screen, little direction. There’s no way you cannot become better at data visualisation through a project like this.

I hope you all have enjoyed following this project. I hope you’ve learned something about data visualisation. I hope you’ve picked up some Tableau tricks along the way. I appreciate all of the feedback and comments.

I’d also like to thank Steve Wexler for his great blog post on visualising 4-scale survey data. The post was immensely helpful this week and I gladly stole his color palette.


April 28, 2016

Guy Fawkes & Fires: Learning about the UK with Tableau

Today I attended an expo at the Devon & Somerset Fire Services with Tom Brown and Robin Kennedy. In preparation for the event I built a few dashboards for us to showcase. A friend of ours over at Surrey County Council pointed me to a dataset about fire incidents in the Greater Manchester area for 2012/13. The beauty of this dataset was that it included the locations of the fires and lots and lots of details about the fires.

Tom asked me to build something, so I started exploring by building lots and lots of visualisations to see if a story stuck out. And one did! When I created a heatmap of fires by month and day, November 4th and 5th jumped right out. That seemed a bit strange to me. I think of those dates as near elections, but here in the UK it’s the celebration of Guy Fawkes Day and there are lots and lots of fireworks. And, duh, fireworks = fires.

I also noticed some outliers at the beginning of March, which I still haven’t been able to explain. April Fools’ Day had a lot of false alarms (shocker!), and August 5th and October 27th were near bank holidays, which probably means fires from BBQs.

Yet another fun couple of hours exploring data with Tableau. It was a lot of fun showing these off to potential customers today and also seeing a bit more about the sales side of the company. Great learning indeed!

April 26, 2016

Tableau Tip Tuesday: Create Alerts with ASCII Symbols

Quick tip this week on using ASCII symbols in calculations as indicators. This is useful for creating alerts similar to many of Stephen Few’s examples. This could easily be extended to arrows or triangles or any other ASCII shape you like.
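For readers who think in code, here is a minimal sketch of the idea in Python rather than as a Tableau calculation. The threshold, the symbols and the sample figures are all assumptions for illustration; the point is only that an alert is a calculation that returns a symbol when a condition is met and an empty string otherwise.

```python
# Hypothetical sketch: return a symbol as an alert indicator.
# The 0.0 threshold, the symbols and the sample data are assumptions.

def alert(value, threshold=0.0, symbol="●"):
    """Return the symbol when value falls below threshold, else an empty string."""
    return symbol if value < threshold else ""

def trend_arrow(change):
    """Return an up or down triangle depending on the sign of the change."""
    if change > 0:
        return "▲"
    if change < 0:
        return "▼"
    return ""

profits = {"Binders": 120.0, "Tables": -45.5, "Paper": 0.0}
for name, profit in profits.items():
    # Pad the indicator to one character so the names line up.
    print(f"{alert(profit):1} {name}: {profit}")
```

Swapping the symbol for an arrow or triangle, as in trend_arrow, changes nothing about the logic; only the returned character differs.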

April 25, 2016

Makeover Monday: Victims of the 21st-century Slave Trade

This week for Makeover Monday we are looking at this Viz of the Day from CNBC as seen on Tableau Public. For me, this is a really poorly done viz that really doesn’t tell us anything. When I look at this and say to myself “so what?”, I can’t answer that question.

Let’s start by looking at what works well:

  • The table provides a nice summary of how many countries in each region fall into each tier.
  • Clicking on a region and/or tier filters the map
  • Nice “About” option

Really, that’s about all I can come up with that works well. There are so many things that could be done better:

  • The most major problem is that there’s no context. What is good and what is bad? Is the situation getting better or worse? Context is so critical in visualisation and this viz totally lacks context.
  • The title is misleading; was that intentional? These countries aren’t necessarily trafficking the children. Yes, it’s unbelievably terrible how the children are treated, but the title needs to be more accurate.
  • The bubbles in the table add no value.
  • Poor color choices
  • Tooltips are useless
  • There are no trends
  • There’s no call to action. What am I supposed to do with this information? Again, so what?

Keep in mind that Makeover Monday is NOT about criticizing the author, it’s about critiquing the visualization and trying to create a more compelling story and more interesting visualizations.

What I wanted to try to focus on this week is whether or not childhood slave trade is getting better or worse. The data set that I created contains data back to 2007. Since we have geographical data (i.e., countries), I first wanted to see if viewing the countries by their tier status as a series of small multiple maps would reveal anything.

Click on the image to interact

Unfortunately seeing trends across multiple maps is extremely difficult. Even if I drill into a specific region, Africa for example, it’s still really hard to see patterns.

Click on the image to interact

Ok, so maybe filtering even further by tier status would help. Let’s first look at all tier 3 countries.

Click on the image to interact

Maybe you now have a case for making sense of the small multiple maps, but it takes too much work to see the rate of change. Enough of the maps, clearly they won’t work for displaying change. Fortunately Tableau lets us fail fast!

An alternative to the small multiple maps would be a heat map. In this view, I placed Country in the Rows and Year in the Columns and added a filter for Region.
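Outside of Tableau, the same layout is just a pivot: one row per country, one column per year, the tier in each cell, and a region filter applied first. A minimal Python sketch of that reshaping follows. The country names and tier values are invented placeholders, not rows from the actual data set.

```python
# Hypothetical records: (region, country, year, tier). The values are made up
# for illustration and are not taken from the actual data set.
records = [
    ("Africa", "Country A", 2007, "Tier 2"),
    ("Africa", "Country A", 2015, "Tier 3"),
    ("Africa", "Country B", 2007, "Tier 1"),
    ("Africa", "Country B", 2015, "Tier 2"),
    ("Europe", "Country C", 2007, "Tier 1"),
    ("Europe", "Country C", 2015, "Tier 1"),
]

def heatmap_grid(rows, region):
    """Pivot to {country: {year: tier}} for a single region."""
    grid = {}
    for rec_region, country, year, tier in rows:
        if rec_region == region:  # the Region filter
            grid.setdefault(country, {})[year] = tier
    return grid

def print_grid(grid):
    """Print countries as rows and years as columns."""
    years = sorted({y for tiers in grid.values() for y in tiers})
    print("Country".ljust(12) + "".join(str(y).ljust(10) for y in years))
    for country in sorted(grid):
        cells = "".join(grid[country].get(y, "").ljust(10) for y in years)
        print(country.ljust(12) + cells)

print_grid(heatmap_grid(records, "Africa"))
```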

Click on the image to interact

Now we’re getting somewhere. I really like this view because it shows concentrations well. At a glance I can see that there were fewer Tier 2 countries in 2015 than in 2007 and more countries have moved towards Tier 3. It’s also clear that there are no more Tier 1 countries in Africa.

Contrast that to Europe.

Click on the image to interact

Nearly every country in Europe is Tier 1 or Tier 2 and has been consistently. The transition in Russia from a Tier 2 Watch List to Tier 3 pops out as well.

So far, the heat map is my favourite. In Tableau though, since it’s so easy to create lots of views, I iterated some more. Next, I created this unit chart which shows each country in each region by year, simply color-coded by their tier status.

Click on the image to interact

I love the look of this unit chart. Notice that I intentionally made each section 5 circles wide. I did that so they’re easier to count. Two things in particular jump out to me in this view:

  1. The number of Tier 2 Watch List and Tier 3 countries in Africa is growing.
  2. Asia had an increase in Tier 2 Watch List countries early on and is now slowly progressing more countries towards Tier 2 status.

To get an even better look at whether regions are progressing, I can filter the unit chart to only Tier 2 Watch List and Tier 3 countries. Filtering the data makes my two impressions above even more evident.

Click on the image to interact

Lastly, I wanted to look at a time series view. In this case, I need to aggregate the data up to the Region level and count the number of countries in each Tier. The rationale behind this is to simply see how regions are doing over time. Are they getting better or worse? Is the number of Tier 1 and Tier 2 countries growing or is the number of Tier 2 Watch List and Tier 3 countries increasing?
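The aggregation described here amounts to counting countries for every combination of region, tier and year. A minimal Python sketch is below; the rows are invented placeholders that only mimic the shape of the data.

```python
from collections import Counter

# Hypothetical rows: (region, country, year, tier); values are invented.
rows = [
    ("Africa", "Country A", 2007, "Tier 2"),
    ("Africa", "Country B", 2007, "Tier 2"),
    ("Africa", "Country A", 2008, "Tier 3"),
    ("Africa", "Country B", 2008, "Tier 2"),
    ("Europe", "Country C", 2007, "Tier 1"),
    ("Europe", "Country C", 2008, "Tier 1"),
]

def countries_per_tier(data):
    """Count countries for each (region, tier, year) combination."""
    return Counter((region, tier, year) for region, _country, year, tier in data)

counts = countries_per_tier(rows)
for (region, tier, year), n in sorted(counts.items()):
    print(region, tier, year, n)
```

Each (region, tier) pair then becomes one line in the time series, with year on the x-axis and the count on the y-axis.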

Click on the image to interact

It makes complete sense that the time-series view would reveal patterns best. That’s exactly the best use case for a line graph. I like the clean look about this chart. It’s easy to compare within a region and tier or across regions or across tiers or across regions and tiers.

But does this give me the BEST context? Is there a better way to show how the situation has changed since 2007? Yes, by comparing each year to the first year. In this view, I’m now showing the % change in the number of countries in each region in each tier since 2007. I’m using % change because it’s more contextual than simply the change.
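The calculation behind this view is simple: for each year, subtract the count in the first year and divide by that same first-year count. A minimal Python sketch, using invented counts for a single region and tier, looks like this:

```python
def pct_change_from_first(series):
    """series maps year -> count; returns year -> % change vs. the earliest year."""
    first_year = min(series)
    base = series[first_year]
    if base == 0:
        # % change is undefined when the baseline is zero.
        return {year: None for year in series}
    return {year: (value - base) / base * 100 for year, value in series.items()}

# Hypothetical counts of countries in one region and tier, by year.
tier3_counts = {2007: 4, 2008: 5, 2009: 6, 2015: 8}
print(pct_change_from_first(tier3_counts))
```

The zero-baseline case matters in practice: a region that had no countries in a tier in the first year has no meaningful % change for that tier, so the sketch returns None rather than dividing by zero.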

Click on the image to interact

In this post, I’ve shown you many ways to iterate through the data to find a compelling story, but this doesn’t bring it all together quite yet. In order to do so, I need to combine some of these views into a dashboard. I’m going long-form again this week as there’s simply too much information that needs to be shown to be effective in a single screen.

I’ve created both desktop and mobile versions, with the design a bit different in each.

April 22, 2016

Analyzing 63 Months of Crimes in London with Alteryx & Tableau


I’m beginning to think Tom Brown might actually think I know what I’m doing. Last week he threw a challenge at me for fire data. Today, he told me about a data set he had of 63 months of crime data in the UK. His requirement was simple: build a viz. Easy enough, until he showed me the data. It was split up into 2899 CSV files in multiple directories.

This is where our good friend Alteryx comes in. In Alteryx all I had to do was use one input tool and tell it to scan all subdirectories. After that I had a tiny bit of cleanup and I was done. Seriously, about 10 minutes of work.
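If you don’t have Alteryx to hand, the same "scan every subdirectory and stack the files" step can be sketched in a few lines of Python. This is not the Alteryx workflow itself, just an equivalent idea; the folder name is a hypothetical placeholder and the sketch assumes every file has a header row.

```python
import csv
from pathlib import Path

def read_all_csvs(root):
    """Yield every row (as a dict) from every .csv file under root, recursively."""
    for path in sorted(Path(root).rglob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["source_file"] = str(path)  # remember which file the row came from
                yield row

if __name__ == "__main__":
    # "crime_data" is a hypothetical folder name, not the real one.
    rows = list(read_all_csvs("crime_data"))
    print(f"Loaded {len(rows)} rows")
```

Because the function is a generator, rows are produced file by file; for tens of millions of rows you would likely write them straight to a single output file rather than build a list in memory.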

Awesome! I now had a data set of almost 32 million crimes across the UK. But I needed it to be fast and I needed to be able to publish it to Tableau Public. This meant limiting the data set. So I created a group for the London boroughs and filtered my extract to just those. Cool, now only about 5.5 million crimes.

From there I simply played with lots of vizzes to see if we could find any stories. The end result is the viz below. It’s amazing what you can accomplish in under 2 hours with Alteryx and Tableau. I love it!

My Tableau Journey

I’m often asked how I got started with Tableau. How did I end up at Facebook? Why did I move to the UK to start the Data School? I gave a presentation at Surrey County Council yesterday about those exact questions, so I thought I’d share it with everyone. I hope you enjoy it!

April 20, 2016

Tableau Tip & Remake: Women are Underrepresented in State Legislatures

Sonja Kuijpers has been participating in Makeover Monday and creating some stunning visualisations with Illustrator. You can view all of Sonja’s Makeover Monday work here. This week, for the women in State Legislatures data set, Sonja created this beautiful viz:

I love how clearly displaying the data this way makes the problem of underrepresentation stand out. On the left of each section is the % of population by gender and on the right is the representation by gender in each State Legislature. Absolutely lovely work!

I wanted to rebuild this in Tableau and I got pretty close. Some things I wasn’t able to replicate:

  1. Shading the area between the lines
  2. Setting color ranges for male and female (I used static colors for each gender)
  3. Calling out the smallest and largest disproportions with boxes (Yes, I can annotate an area, but it’s a royal pain to get it just right in Tableau and not worth the effort. This needs to be easier.)
  4. I didn’t include a legend for how to read the chart.

I was able to create both desktop and mobile versions though, so if you’re looking at this on your computer you will see a grid that is 10x5, whereas if you’re on your phone, you’ll see a 3x17 layout.

Another super fun exercise and a view I would have never thought of creating or even trying to create if it hadn’t been for the community that’s developing around the Makeover Monday project. You can interact with the viz below, and below that is a video showing how I created this visualization.

April 19, 2016

Tableau Tip Tuesday: Using LOD Calcs to Filter the Latest Month and View Sales for the Latest Day

I’ve written before about making the ends of sparklines actionable here and here and created a video here. Each of these uses table calcs to add the dot on the end of the lines. The problem, though, is the method falls apart when the end date on all of the lines isn’t the same. Yes, there are other workarounds with table calcs, but they are overly complicated.

I’ve been wanting to look at this again with level of detail expressions, thinking there had to be a simpler way to create and maintain them, and a way to overcome the drawback listed above. The video below walks through two level of detail expressions:

  1. Create a filter using a LOD calc to dynamically return just the latest month in the data set
  2. Return the value of the end of EACH line of a series of sparklines

This technique could easily be parameterized, but for this example, I stick to a single dimension. Enjoy!
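To make the two steps above concrete without reproducing the exact Tableau calculations from the video, here is a minimal Python sketch of the logic. The category names, dates and sales figures are invented, and the sketch assumes one row per category per day; with several rows per day you would sum them first.

```python
from datetime import date

# Hypothetical rows: (category, order_date, sales); values are invented.
rows = [
    ("Furniture", date(2016, 2, 27), 100.0),
    ("Furniture", date(2016, 3, 30), 250.0),
    ("Technology", date(2016, 3, 12), 80.0),
    ("Technology", date(2016, 3, 28), 120.0),
    ("Office Supplies", date(2016, 1, 15), 60.0),
]

def latest_month_rows(data):
    """Keep only rows in the latest (year, month) found anywhere in the data."""
    latest = max((d.year, d.month) for _cat, d, _sales in data)
    return [r for r in data if (r[1].year, r[1].month) == latest]

def last_value_per_series(data):
    """For each category, return (date, sales) at that category's own last date."""
    last = {}
    for cat, d, sales in data:
        if cat not in last or d > last[cat][0]:
            last[cat] = (d, sales)
    return last

print(latest_month_rows(rows))
print(last_value_per_series(rows))
```

The second function is the part that addresses the sparkline problem: each series gets its own end date, so a line that stops in January still gets its dot.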

April 18, 2016

Dear Data Two | Week 51: Privacy

We are 51 weeks into Dear Data Two and there’s surely so much Jeff has already learned about me. Yet we all have secrets. We all have things we’d like to keep to ourselves. We all do things we’d be embarrassed about.

For week 51, I tracked everything that I would have been embarrassed about. Was it something I did, said or thought? How embarrassed would I be if someone knew? Is there anyone I would share it with? These are the moments I’d want very few people to know about. And it was a fun week. I’ve never really thought about all of the things that I keep to myself because I’m essentially afraid to share them.

One thing I wanted to be sure of was to NOT note the specific things I was embarrassed about. I mean, I’m all for open data, but there are some things that need to stay locked away in this little brain of mine. What are some things I found out about myself this week?

  1. I tend to not be embarrassed by little things. Go big or go home I guess! I know this because I tracked how embarrassed I would be if someone found out and rated these on a scale of 1 to 5. My average embarrassment level is a high 3.7.
  2. It’s not my thoughts that embarrass me most, it’s my behaviour. Go figure!
  3. I’m very close to my brother. In fact, he’s clearly the most likely person I would tell if I didn’t keep it to myself.

For the postcard, I wanted to incorporate all of the dimensions that I tracked. There are no measures in this data set (I consider level of embarrassment a discrete measure), so that posed another fun challenge…visualising data with no measures. Overall, another interesting week and only one week to go!

April 17, 2016

Makeover Monday: Where Do Women Have the Most and Least Political Representation in the U.S.?


Yes, it’s Sunday, but there are weeks when I simply don’t have the time for Makeover Monday on Monday. This is one of those weeks. For Makeover Monday this week, we are reviewing a visualisation created by the National Conference of State Legislatures. According to the NCSL:

"Approximately 1,809 women serve in the 50 state legislatures at the beginning of the 2016 legislative session. Women make up 24.5 percent of all state legislators nationwide."

Wow! Only 24.5%…that’s sad, America. We need to do something about it. Those figures are accompanied by this map that shows the percentage of women in state legislatures in 2016.

This viz is definitely not terrible. There are several things that this map does well:

  • Using four distinct bins for the colouring makes it easy to see concentrations like all of the yellow in the middle of the U.S. and all of the green on the west coast and northeast
  • Making the extra squares for the small states that would be tough to click on
  • Nice interactivity on hover
  • Good tool tips

But I feel like there is so much more to this story. So when I created the data, I included population rates by state. This way I could look at the disparity between the female population in each U.S. state and the percentage of women in state legislatures.
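One simple way to express that disparity is as a percentage-point gap: the share of the state’s population that is female minus the share of state legislators who are women. The Python sketch below shows that arithmetic and a ranking by gap. The state names and figures are invented placeholders, and the gap definition is one reasonable choice, not necessarily the one used in the final viz.

```python
def representation_gap(pct_female_population, pct_female_legislators):
    """Percentage-point gap; positive means women are under-represented."""
    return pct_female_population - pct_female_legislators

# Hypothetical state figures in percent: (female population, female legislators).
states = {"State A": (50.5, 41.0), "State B": (50.9, 13.5)}

gaps = {name: representation_gap(pop, leg) for name, (pop, leg) in states.items()}

# Rank from the largest gap (worst) to the smallest (best).
for name, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {gap:.1f} points")
```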

What would I do differently?

  • Get rid of the filled map; while they included the extra callouts for the small states, it adds more to the viz than is needed
  • There’s no sense of how under-represented women are. Are we supposed to just assume each state is 50% female then do the math in our head? I think that’s too much work for the reader.
  • No mobile version
  • Not enough distinction in the colours. Why not use a continuous colour scale?
  • Need a more impactful title
  • Need a better way to make comparisons
  • There’s no ranking of the states. How do I identify the best and the worst?

Given these recommendations, I’ve created the viz below. I also made this device responsive, so you will see a different viz depending on whether you’re on a computer, tablet or phone. I did this because I want to learn to design for mobile first, something John Burn-Murdoch has recommended to us when he’s come to chat with us at the Data School.

April 15, 2016

Fire Response in the UK: How is your local authority performing?

This morning, Tom Brown and I were meeting with a customer and talking about data that’s available from the UK government about the public sector. I found a great data set about fires and the response time by fire department in each of the local authorities. A bit of Excel cleanup, unioning the files in Tableau 9.3 and 30 minutes later I had this dashboard that allows people in the UK to see how well their local fire department is responding.

What’s interesting is that for nearly every authority, the number of fires is decreasing across all fire types, yet response times are increasing. I don’t know why this is, but that’s the power of data visualisation; it leads to more questions, more discussion and hopefully better outcomes.

April 14, 2016

The Importance of Data Visualization in Analytics

Today I had the pleasure of speaking at the Barclays Enabling Modern Analytics symposium. Nandu Govindankutty of Barclays did an amazing job organizing this event. He runs a charity called MADTA (Making A Difference Through Analytics) and focused this event on senior leaders from Charities and Social Enterprises.

Naturally I wanted to speak about the amazing work that The Data School did on the Connect2Help 211 project for the Tableau Foundation. Nandu also asked me to speak about my experience in data visualization and why I think data visualization is important in analytics.

I was able to record the presentation. You can watch the presentation below, view the slides and some of the visualizations I built during the talk.

Tableau Tip: Cross-Database Joins in Tableau 10

Tableau 10 brings us cross-database joins. In this video, I walk you through a simple example and talk about the importance of visualising data versus traditional tables.

You can download the workbook I created here. It requires Tableau 10 (obviously).

April 12, 2016

Dear Data Two | Week 50: My Phone

Click the image for the full-size version

Week 50 was clearly one of the easiest from a data collection perspective as there was really nothing to track. It was a nice break from the daily effort to keep up with each topic. For this week, I wanted to look at the apps that sit on my phone’s home screen.

  • Why did I choose these apps?
  • How do I organize them?
  • What’s the most common colour?

Scroll through the story points below to see a bit more about my phone and get a bit of insight into how my brain works. The last two story points show the postcard I drew. It’s really hard to draw app icons! This took hours!

Tableau Tip Tuesday: How to Use LOD Calcs to Compare Avg Sales to Avg Sales per Customer

For this week’s Tableau Tip Tuesday, I continue the series of Level of Detail Expressions by looking at the difference between average sales and average sales per customer. The idea here is you might want to answer the question:

For each state (or product category or whatever), what is the average sales per customer?

This is a very different question than simply the average sales per state. I walk through a couple of examples that hopefully help clarify the technique.
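The difference is easiest to see with a tiny worked example. The Python sketch below computes both numbers for invented order lines: the plain average treats every order line equally, while the per-customer version totals each customer’s sales first and then averages those totals within the state. Reading "average sales per customer" as an average of customer totals is an assumption here; the video may define it slightly differently.

```python
from collections import defaultdict

# Hypothetical order lines: (state, customer, sales); values are invented.
orders = [
    ("Texas", "Ann", 100.0),
    ("Texas", "Ann", 300.0),
    ("Texas", "Bob", 200.0),
    ("Ohio", "Cy", 50.0),
]

def avg_sales(data):
    """Average of individual order lines per state."""
    by_state = defaultdict(list)
    for state, _customer, sales in data:
        by_state[state].append(sales)
    return {s: sum(v) / len(v) for s, v in by_state.items()}

def avg_sales_per_customer(data):
    """Total sales per customer first, then average those totals per state."""
    totals = defaultdict(float)
    for state, customer, sales in data:
        totals[(state, customer)] += sales
    by_state = defaultdict(list)
    for (state, _customer), total in totals.items():
        by_state[state].append(total)
    return {s: sum(v) / len(v) for s, v in by_state.items()}

print(avg_sales(orders))
print(avg_sales_per_customer(orders))
```

For Texas the two answers differ (200 per order line versus 300 per customer) because Ann placed two orders; for Ohio, with a single order from a single customer, they are the same.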

April 11, 2016

Makeover Monday: From Millions to Billions

Today is my 9 year anniversary using Tableau. On April 11, 2007, I downloaded a trial of Tableau 3.0 and my life has never been the same. What better way to celebrate than working on Makeover Monday and creating my first mobile dashboard.

This week’s Makeover Monday was quite tough. How do you take a viz that’s already great and make it better? Or how do you tell the story differently? I wrote about this viz on my other site a couple weeks ago. Here’s the viz again:

What works well?

  • Title catches your attention
  • Subtitle makes you more interested and sets up the story
  • Very well organized
  • Clear labeling of the headers
  • Pictures of the faces make it feel more real
  • Consistent use of colors makes it easy to understand and follow
  • Nice totals on the right for context and they’re out of the way so as not to distract
  • Dark background works well and makes the bars pop out
  • Great tooltips!
  • Fantastic sorting interactivity
  • Great spacing; there’s a lot of information yet it still feels simple

What could be improved?

  • Include a filter for gender
  • Include more women
  • Change the mark type for the ends

So given how good this is already and how little I saw that I could do to make it better, I thought I would use this week to practice making a mobile version, formatted for iPhone 6. I rarely create visualisations specifically for mobile and the only way to improve that ability is to practice.

Adding the filter for gender was easy. Including more women was out of scope.

Notice that I changed the ends of each bar to dots. I did this because I think the vertical bars are a bit tough to see; I wanted to make it easier for the viewer.

One challenge I had was that I wanted the sorting to be “touchable” like the gender filter is, but I couldn’t get it to work. I didn’t want to spend a ton of time fiddling with it, so I resorted to a parameter for the sorting.

I’d like to hear what you think about this mobile version. Does it work well? What could I do to make it a better mobile experience?

April 10, 2016

Dear Data Two | Week 49: Data

At the start of this week, I attempted to follow the lead set by Stefanie and Giorgia during their week 49 and track every time I heard, said or wrote the word “data”. Tracking swear words was difficult enough and I quickly realized there was no way I would be able to keep up with “data”; there’s simply too much of it.

It was Tuesday and I couldn’t really start a new topic because I’d be missing a day. Instead, I decided to look at the data I had created through the first 48 weeks of Dear Data Two. There are four main types of data I create every week: raw data (usually in Excel), Tableau extracts, Tableau workbooks, and scanned images. As you click through the story below, you’ll see I’ve created nearly 1GB of data through 48 weeks, 95% from pictures.

This means that the data sets I had been working with have mostly been small. And this revealed a problem I hadn’t known about: my Tableau extracts were actually BIGGER than the raw data. I always had it in my head that Tableau extracts would be smaller than the source data, but in 42/45 weeks, that wasn’t the case.

Some summary stats:

  • 357 pictures = 926.4 MB
  • Raw data = 3.5 MB
  • TDEs created = 3.9 MB
  • TDEs were on average 10% bigger than the raw data
  • 45 Tableau workbooks = 36.2 MB

Overall, it was fun to analyse the data of Dear Data Two. Flip through the story below to see my take.

April 8, 2016

Abortion Regulations - State by State | Hexmaps in Tableau

Click the image to interact

Hexmaps…I’ve seen so many great examples of them created in Tableau (and other tools) like this one from Matt Chambers, this one from Ben Moss or this one from Kris Ericsson. Then I saw an article on the Washington Post about abortion regulations that included this series of tile maps...

Click the image to go to the original

…and thought this would look much better as a series of hexmaps. In addition, I’d never created one before and I wanted to learn how. Fortunately Matt Chambers wrote a blog post for how to create them. I downloaded his workbook, copied the data into Excel, and added the columns I needed. I’ve saved an Excel template you can download here and all you need to do is add the columns you want to use on the color shelf in Tableau.

For my version, I wanted to create a small multiples view with a matrix that shows all states and their regulations in a table you can click to highlight. Special thanks to Robin Kennedy and Matt Chambers for their feedback!!

Two vizzes in one day, two new techniques learned. I’ll chalk it up as a fabulous day!

Brexit: How will it affect your favourite Premier League team?

Click the image for the interactive version

Brexit is a hot topic in the news these days over here in the UK. The ramifications are far and wide. Leaving the EU would mean a LOT of lost jobs, impacting even the Data School and the Premier League.

Back in September, The Guardian wrote an article and created this visualisation to show which teams in the Premier League would be impacted the most.

Click the image to view the interactive version

I like the layout of this, but what it lacks is a simple way to see where the players that are impacted come from. I also wanted an excuse to practice the new Union feature in Tableau 9.3. It took me a couple of hours to recreate the data set and then another hour or so to create the interactive viz below.

Find your team and see how many players they could lose if Brexit happens!

April 5, 2016

Tableau Tip Tuesday: Compare One Dimension Member to All Others With LOD Calcs


This week’s Tableau Tip Tuesday continues the series on LOD expressions. This week I show you how to compare one member of a dimension to all others in the same dimension. For example, you might want to see how the profitability of Binders is doing compared to all of your other product sub-categories.
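The comparison itself boils down to splitting one measure into two buckets: the selected member and everything else. A minimal Python sketch of that split is below. It is not the LOD expression from the video, and the sub-category names and profit figures are invented for illustration.

```python
# Hypothetical (sub_category, profit) rows; values are invented.
rows = [
    ("Binders", 30.0),
    ("Binders", 10.0),
    ("Paper", 5.0),
    ("Tables", -20.0),
    ("Chairs", 25.0),
]

def selected_vs_others(data, selected):
    """Return (total for the selected member, total for all other members)."""
    selected_total = sum(p for name, p in data if name == selected)
    others_total = sum(p for name, p in data if name != selected)
    return selected_total, others_total

binders, others = selected_vs_others(rows, "Binders")
print(f"Binders: {binders}, all others: {others}")
```

Passing the selected member in as an argument mirrors how you might drive the comparison from a parameter, so the same view works for any sub-category.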

If there are other videos you’d like to see, please leave a comment. I’m always looking for new ideas.

April 4, 2016

Makeover Monday: America's Most Diverse Cities Are Often The Most Segregated

Click on the image for an interactive version

Diversity…always a hot topic in the States, particularly around election time. This week for Makeover Monday, we look at this visualisation from FiveThirtyEight.

I chose this chart for two main reasons:

  1. It’s already good. How will everyone approach a chart that is already good vs. one that has many flaws?
  2. I felt like there was more to the story than this chart reveals. The chart takes quite a while to comprehend, so surely there’s a better way to communicate the message.

Let’s start by considering what works well and what doesn’t. What works well?

  • As with all FiveThirtyEight charts, it conforms to their principle of simplicity in design.
  • The additional red trend line helps to clearly show what is above and below “normal”.
  • Highlighting specific cities
  • Additional descriptions through text aid in understanding

Ok, but when we look at those same reasons for being effective, we can also find some areas of weakness:

  • Why are those particular cities highlighted? Some explanation is given, yet the design leads you to believe those are the most important cities.
  • No information is provided for how the red trend line is drawn or calculated. Why not? Is the model not sound enough to promote sharing of it? It makes me skeptical.
  • What does the location of a city on the scatterplot really mean? You have to read the complicated descriptions below the chart to gain that understanding.
  • We have location information (i.e., cities), which immediately makes me wonder if there are geographic implications. HINT: there are!

I iterated through many different charts looking for a story. To me, the measure that is the most impactful is the Integration/Segregation Index. This is a measure of how far a city is from being fully integrated. The worse the number, the more segregated a city. This is the easiest metric to understand and the most telling. I created a slope graph and a bar chart that I threw away. In the end, I combined several views into a single dashboard.

  • All of my charts use the integration/segregation metric for color
  • I broke the chart into four quadrants and labeled each quadrant to aid in understanding.
  • I included jitterplots of the city and neighborhood diversity indices to show concentrations.
  • I included a map so that you can more easily see that cities east of the Mississippi River tend to be more segregated, which is emblematic of the history of the United States. Eastern states have their roots in slavery, and it doesn’t appear all that much has changed from a segregation perspective.
  • I included a filter so that the user can look at just the quadrant they are interested in.

I feel like my design gives the reader much more information about the problem, the story, and even a bit of historical context. While all of the data is the same in each chart, this is an example where including multiple displays of the same data in a single view can aid the understanding of the reader. When we’re creating data visualisations, it’s important to make them as simple to understand as possible. If you step away from your work and it’s not completely clear, then you should keep iterating.

You may also notice that I did something a bit different with the map. This is a Tableau map, yet you only see borders around the US states. I’ll create a video sometime for how I did that. If you’re curious in the meantime, download the workbook and see if you can reverse engineer my technique.