VizWiz

Data Viz Done Right

July 18, 2018

Financial Times Visual Vocabulary: Tableau Edition

We're all in the never-ending search for resources that will help us pick the "best" chart for the situation. The Financial Times Graphics team created the Visual Vocabulary to help all of us make better chart choices.


Over the past month, I've been building all of these charts in Tableau so that everyone in the Tableau Community would have examples they could use and learn from. This has been quite the labor of love and I would like to thank the (best) team at The Information Lab for their support, reviews and feedback along the way.

There are 72 charts in total, most of which I built myself or with the help of tutorials from the community. To build the violin plot, equalized cartogram, and heat map examples, I prepared the data in Alteryx and output it as shape files. The scaled cartogram was built using Tilegrams by Pitch Interactive based on this tutorial from Ken Flerlage.

While the people listed below may not have been the original creators of the charts, they are the resources I used to create the charts in my workbook.

Chart                         Person            Link
Diverging Stacked Bar         Steve Wexler      Data Revelations
Surplus/Deficit Filled Line   Jeffrey Shaffer   Data + Science
Violin Plot                   Ben Moss          YouTube / Alteryx App
Sunburst Chart                Leonid Golub      Super Data Science
Arc Chart                     Ken Flerlage      KenFlerlage.com
Venn Diagram                  Leonid Golub      Super Data Science
Radar Chart                   Adam McCann       Dueling Data
Scaled Cartogram              Ken Flerlage      KenFlerlage.com
Sankey Diagram                Leonid Golub      Super Data Science
Chord Diagram                 Noah Salvaterra   DataBlick

How to use this workbook

  1. Start on the Visual Vocabulary tab.
  2. Click on the text in any section to get to the chart types associated with that topic.
  3. To go back to the beginning, click on the Visual Vocabulary tab. (NOTE: I'll add dashboard navigation buttons once Tableau releases that feature.)
  4. Download the workbook to see how the charts are built. You should be able to swap your data out for any chart type fairly easily.
  5. Give credit to the creator of the chart as appropriate.

This has taken up a tremendous amount of my time, so I would appreciate it not being downloaded and then re-posted as if it's your own work. If you see someone has done that, please tweet me with a link to the person/page that has done so and I'll take it from there. Feel free to show these charts to customers and prospects to show the capabilities of Tableau. 

Notes

  • This is NOT meant to be an exhaustive list of charts that can be built with Tableau. This is based on the charts created by the Financial Times for the Visual Vocabulary.
  • Actions are quite slow to respond on Tableau Public. If you download the workbook, it's much more responsive. 
  • There's a mobile version as well.
  • Images of each set of charts can be found on Google Photos.

If you find what I've created useful, please share a link to this blog post with others. Any feedback you have is very much appreciated. Click on the gif below for the interactive version. Enjoy!


July 16, 2018

Makeover Monday: NBA Team Salaries Against the Cap

Shortly after I finished this week's Makeover Monday, I reached out to Eva to give her some ideas for how to approach the data, given her disdain for sports data. I told her to basically think of the data as actuals (team salaries) vs. budget (salary cap). Then it struck me: this is pretty much how I approached the Visual Profit & Loss Statement I created last summer.

With that in mind, here's a second Makeover Monday from this week, a scorecard of NBA team salaries vs. the salary cap. Click on the image below for the interactive version. From there, click or lasso any set of years and the bullet chart and BANs will update accordingly.

July 15, 2018

Makeover Monday: Historical NBA Team Salaries Against the Cap in the Salary Cap Era

I've been complaining to Eva the past few weeks about her choices of data sets. She basically tells me to suck it up and moves on. Tough love! So what better way for me to respond than to give her some sports data to play with. (I've given her some hints on how to approach it.)

Let's start by reviewing the original viz from What's the Cap?:


What works well?

  • Title and subtitle clearly explain what the chart is about
  • Good labelling of the y-axis
  • Using colors that are easy to distinguish from each other
  • Quick interactivity on the tooltips
  • Using lines for three of the metrics works well for time series data

What could be improved?

  • Every other season is labeled, which isn't hard to figure out, but it looks messy
  • Season labels are on a diagonal; make them horizontal
  • Make the salary cap a line as well for consistency
  • The legend could use some work. Why are they boxes?
  • There's no option to pick a team. What if I want to know my favorite team's salary vs. the salary cap?

What I did

  • I wanted to show all teams so that they could be compared. I settled on a dot plot for each season.
  • I created a calculation to get the starting year for each season so that the x-axis labels would look nicer and could be displayed horizontally.
  • I made the focus on the variance to the salary cap. I had no idea so many teams were over the salary cap.
  • I included a line that displays the NBA average of the variance to the cap for the team selected (via a parameter).
  • Since teams have moved to other cities and changed names, I created a calculation to make them franchises.
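
For anyone curious how the season and variance calculations might look outside Tableau, here's a minimal Python sketch. It is not the calculation from my workbook. It assumes season labels look like "2017-18", and it expresses the variance as a fraction of the cap, which is just one reasonable choice. The function names are made up for illustration.

```python
def season_start_year(season: str) -> int:
    """Return the starting year of a season label such as '2017-18'."""
    # Assumption: the label begins with a four-digit year followed by a hyphen.
    return int(season.split("-")[0])


def variance_to_cap(team_salary: float, salary_cap: float) -> float:
    """Return how far a payroll sits above (+) or below (-) the cap.

    The result is a fraction of the cap, so 0.10 means 10% over the cap.
    """
    return (team_salary - salary_cap) / salary_cap
```

A plain integer year makes a tidy horizontal axis label, and a variance centered on zero makes it obvious which teams are over the cap.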


To understand how many outliers there were, I used box plots and hid the marks behind the boxes.


The problem I saw with this, though, was that I didn't feel like I had much context for the distribution of the teams, even though that is the point of a box plot. I decided to scrap the box plot and created this version in the end.

July 9, 2018

Makeover Monday: Have volcanoes nearest to a tectonic plate erupted more recently?

Week 28 focused on volcanoes around the world and when they last erupted. Here's the original visualization:

What works well?

  • Including a description for how to interpret the chart
  • Ordering the volcanoes from front to back according to elevation above sea level
  • Coloring by the number of eruptions since 1893
  • Excellent tooltips

What could be improved?

  • Where is sea level? Some of the volcanoes are below sea level. You can't really size by negative feet below sea level.
  • There's no explanation for why some of the volcanoes have labels.
  • Make it more clear where the base of the volcano starts. I assume it's at the bottom of the viz.
  • Include reasoning for why 1893 is when the counting of the eruptions starts.

What I did

I started by creating an Alteryx workflow that took the volcanic eruptions data and plotted the volcanoes onto a 250-mile grid of the world.



I then created a custom Mapbox map that included the tectonic plates. I got the plate boundaries as a shapefile from sciencebase.gov, imported it into Alteryx, created points, extracted the lat/lon, and exported a CSV that I could import as a layer in Mapbox. Here's what I ended up with, which was fun to create, but not insightful at all.



I had to start all over, so this time I decided to look at how far each volcano was from the boundary of the nearest tectonic plate. Again, Alteryx to the rescue!
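
If you don't have Alteryx, the distance step can be approximated in a few lines of Python. This is only a sketch and not my workflow: it assumes the plate boundaries have already been turned into a list of (lat, lon) points and measures the great-circle distance to the closest one, so the answer is only as precise as the spacing of those points. The function names and the Earth radius constant are my own choices.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def distance_to_nearest_boundary(volcano, boundary_points):
    """Smallest distance from one (lat, lon) volcano to any boundary point."""
    lat, lon = volcano
    return min(
        haversine_miles(lat, lon, b_lat, b_lon)
        for b_lat, b_lon in boundary_points
    )
```

Checking every volcano against every boundary point is brute force, but it is easy to reason about and fine for a data set of this size.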


Once I had the data I needed, I created a few calculations to help me create a simple quadrant chart that clearly shows that the nearer a volcano is to a boundary of a tectonic plate, the more recently it erupted. All of that totally makes sense given what we learned about geology in school.

July 1, 2018

Makeover Monday: Where are New York's rats?

We begin this second half of Makeover Monday 2018 with data about rat sightings in New York City. I've been aware of this data set for quite a while and thought it would be funny to look at rats this week after analyzing London's bikes last week. It's kind of the same data, just replace the city and the subject. I'm curious to see how different the vizzes are this week from everyone.

The original article by Jowanza Joseph contains several fantastic visualizations, most of which look like they were created in R. For this week's makeover, we need to try to make this visualization better:


What works well?

  • Simple title that tells us what the data is about and the time period
  • Axes are clearly labeled
  • Including light gridlines for context that aren't distracting
  • Including every sighting as a dot for context; it's interesting how these show cyclical patterns
  • Including an average line with confidence bands to show the overall pattern
  • Excellent color choices; the purple really works well against the grey background

What could be improved?

  • Include an explanation of what the line represents
  • Include the data source and author's name
  • Remove the word "Date" from the x-axis. That's implied by the title and the year labels.

What I did

  • We were doing Alteryx spatial training at The Data School this week, so I wanted to do something using the locations of the rats, but not a simple dot of where each sighting occurred.
  • I wanted to use Alteryx as I'm working to improve my skills in that area.
  • I created a tile grid map using Alteryx for London crimes last year and wanted to do that again, as I need to practice techniques several times to reinforce them.
  • Create the tile grid map with a tile every 1/2 mile and have the tiles cut off at each borough's edge
  • Create a simple, minimalist map and line chart in Tableau
  • Use the Magma color palette as I really like how it works as a heat map

Alteryx Workflow


The workflow is pretty simple. It takes the individual sightings, converts them to spatial points, assigns them to a 1/2 mile grid based on shape files available for each borough, then exports the result as a shape file.
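
To give a feel for the grid assignment, here's a rough Python sketch. It is not what Alteryx does internally. It assumes a flat-earth approximation around a reference latitude (about 40.7 for New York City, my assumption) and it skips the clipping of tiles at borough edges entirely. The names and constants are hypothetical.

```python
from collections import Counter
from math import cos, floor, radians

MILES_PER_DEGREE_LAT = 69.0  # rough average; good enough for a sketch


def grid_cell(lat, lon, cell_miles=0.5, ref_lat=40.7):
    """Return the (row, col) of the square grid cell containing a point.

    ref_lat is an assumed reference latitude used to convert degrees of
    longitude into miles.
    """
    miles_per_degree_lon = MILES_PER_DEGREE_LAT * cos(radians(ref_lat))
    row = floor(lat * MILES_PER_DEGREE_LAT / cell_miles)
    col = floor(lon * miles_per_degree_lon / cell_miles)
    return row, col


def sightings_per_cell(points, cell_miles=0.5):
    """Count (lat, lon) sightings in each grid cell."""
    return Counter(grid_cell(lat, lon, cell_miles) for lat, lon in points)
```

The count per cell is the number you would put on color in the heat map.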

Tableau Visualization

In Tableau, it's simply a matter of double clicking on the spatial object, adding the borough and grid ID to give it the right level of detail, adding color by number of sightings, creating a line chart, adding a borough filter, and cleaning up the tooltips. 

Because all of the heavy work was done in Alteryx, it takes about 10 minutes to create the visualization in Tableau, most of that time being formatting. With that, here's my Makeover Monday week 27 about rat sightings in New York City.

June 26, 2018

Makeover Monday: Where are London's happiest bike pickup zones?

While I was running to work today along the Thames towpath, I had an idea! I wanted to create a hex map or grid map or some type of grouping of stations to represent the stations in clusters rather than showing every station on a map.

I created a viz last year about American happiness, so I decided to use a similar theme. What I did was group stations together based on their location. It takes two calculations:


You then make these continuous dimensions and place them on the appropriate shelves (Round Lon on Columns and Round Lat on Rows).

I then created a calculation that ranks each "zone" by the number of cycle hires and then places them into percentiles. I then take the percentiles and break them up into happiness quartiles.
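
The same idea can be sketched in Python for anyone who wants to see the logic end to end. These are not my Tableau calculations. The number of decimal places used for rounding is an assumption, and so are the function names. Quartile 1 holds the zones with the fewest hires and quartile 4 the most.

```python
from collections import defaultdict
from math import ceil


def zone_for(lat, lon, decimals=2):
    """Group a station into a zone by rounding its coordinates."""
    return round(lat, decimals), round(lon, decimals)


def hires_by_zone(stations, decimals=2):
    """Sum hires per zone; stations is an iterable of (lat, lon, hires)."""
    totals = defaultdict(int)
    for lat, lon, hires in stations:
        totals[zone_for(lat, lon, decimals)] += hires
    return dict(totals)


def happiness_quartiles(totals):
    """Map each zone to a quartile from 1 (fewest hires) to 4 (most hires)."""
    ranked = sorted(totals, key=totals.get)  # ascending by hires
    n = len(ranked)
    quartiles = {}
    for rank, zone in enumerate(ranked, start=1):
        # rank / n is the zone's percentile; scaling by 4 and rounding up
        # turns it into a quartile number.
        quartiles[zone] = ceil(4 * rank / n)
    return quartiles
```

Fewer decimal places give bigger zones, so that one setting controls how coarse the map looks.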


I set the Location Happiness to discrete, placed it on the Shapes shelf and applied my emoticon shapes. I then duplicated the Round Lat field on the Rows shelf and moved the Location Happiness field to color, changed the shape to circle, moved the marks to the back and assigned colors.

Simple! I like how this turned out.

June 25, 2018

Makeover Monday: When are bicycles hired in London?

The weather has been wonderful here in London lately, with really warm temperatures expected later this week. I take advantage of the weather by cycling to work whenever I can. It not only helps the environment, it's good for my health, it's good for my mind, and commuting by bike saves me time and money.

I asked Eva to use data from Transport for London's open API about their cycle hire scheme. Data is available back to 2012 and I offered to prep it for her and upload it to Exasol...all 50M+ bike hires worth. I love the weeks when we get to use Exasol because I can ask and answer questions on massive data sets without any performance constraints.

The visualization to makeover this week comes from Sophie Sparks:


What works well?

  • The small multiple layout works great for showing cyclical patterns (see what I did there?).
  • The diverging color scale helps accentuate the peak periods.
  • The shading under the lines makes the viz feel more full and complete.
  • Shading the weekends helps separate them from the rest of the weekdays.
  • Putting the word When in red in the title to match the peak period.

What could be improved?

  • I would remove the section at the top that says "Boris Bikes" and the image.
  • Include some sort of insight as a subtitle.
  • There's no indication of what the y-axis means. I assume it's the number of bikes hired, but it could just as easily be something else.

What I did

  • First, I rebuilt Sophie's viz because I like it.
  • I wanted to focus on the weekday and hourly patterns in the data.
  • Use the TFL blue as a single color for the viz.
  • Provide some interactivity so that people could see when the peaks and troughs in the data are for a specific year or month.

Click on the image for the interactive version.

June 19, 2018

Tableau Tip Tuesday: How to sort first by the most positive values, then by the most negative values in a single chart

It was day 1 of Tableau training for DS9 at The Data School yesterday and we were practicing different sorting methods, including sorting by a discrete measure. I was then asked how we could sort positive values from highest to lowest followed by the negative values from lowest to highest. This helps emphasize the best and worst performers. The trick was doing this in a single chart.
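
The ordering itself is easy to describe in code. This Python sketch is not the Tableau technique. It only shows the sort logic: positives first from highest to lowest, then negatives starting with the most negative. The function name and the (label, value) input shape are made up for illustration, and zero is treated as non-negative.

```python
def best_then_worst(items):
    """Sort (label, value) pairs: positives descending, then negatives ascending."""
    # Key part 1: False (non-negative) sorts before True (negative).
    # Key part 2: larger magnitudes come first within each group.
    return sorted(items, key=lambda item: (item[1] < 0, -abs(item[1])))
```

Sorting on a two-part key like this is the same trick as building a single sort field that keeps the two groups apart and then orders each group by magnitude.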

Enjoy!

June 18, 2018

Makeover Monday: U.S. Influenza Surveillance Report

I've seen lots of headlines in the US since March about how bad the flu outbreak was this year, so I did a bit of googling and found this week's Makeover Monday topic. The chart we are to makeover comes from the CDC.



What works well?


  • Clearly marking the x-axis so that it's evident that the weeks don't start at the beginning of the year
  • Including the national baseline for context
  • Chart dimensions scaled properly
  • Using red for the most recent season so that it stands out more

What could be improved?

  • The colors are too bright and are competing for attention.
  • The symbols on the 2017-18 line are unnecessary.
  • The start of the 2009-10 season is wrong, according to the data that can be downloaded.
  • The national baseline should be weekly, not flat across the time period.

What I did

  • I liked the idea behind the original chart, so I kept that but made it look nicer and more focused.
  • I included a summary to set the context for the line chart.
  • I included the national average by week for context.
  • Use a stepped chart to make the weekly change easier to see.
  • Focus the lines on the two outlier periods.

June 15, 2018

Tableau Prep Tip: Returning the First and Second Purchase Dates

If you don't participate in Workout Wednesday, you're missing a great opportunity to learn. For week 24 2018, Ann Jackson challenged us to also use Tableau Prep. The toughest part of the data prep was getting the second purchase date for a customer. First is easy. The trickiest part is that you can't sort data in Prep, so you have to do some workarounds to get what you need.

In this video, I show you how I approached returning the first and second purchase dates for a customer, including some summary measures, then bringing them both back together into a single table for visualizing in Tableau.
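
If you'd like to see the no-sorting idea in code, here's a small Python sketch. It isn't the Prep flow from the video. It shows one way to get both dates using only minimums: find the earliest date per customer, then find the earliest date that is later than that one. The function name and input shape are my own assumptions.

```python
def first_and_second_purchase(orders):
    """Return {customer: (first_date, second_date or None)}.

    orders is a list of (customer_id, order_date) pairs; the dates can be
    any comparable values, such as datetime.date objects.
    """
    # Pass 1: the first purchase is the minimum order date per customer.
    first = {}
    for customer, order_date in orders:
        if customer not in first or order_date < first[customer]:
            first[customer] = order_date

    # Pass 2: the second purchase is the minimum date after the first one.
    second = {}
    for customer, order_date in orders:
        if order_date > first[customer]:
            if customer not in second or order_date < second[customer]:
                second[customer] = order_date

    return {c: (first[c], second.get(c)) for c in first}
```

Customers with only one purchase date come back with None for the second date, so they are easy to filter out before building the wide table.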

June 13, 2018

Workout Wednesday: Do Customers Spend More on Their First or Second Purchase?

Ann is back for Workout Wednesday week 24. The full list of requirements can be found here. Personally, I like how this week's challenge included Tableau Prep. It's great to have an excuse to practice!

The high-level requirements:

  1. Create a data set in Tableau Prep that returns the first and second order for each customer along with the sales, number of categories and number of products sold on that day. 
  2. The data must be wide rather than tall. That is, you must have nine columns in total: Customer ID, two dates, two sales totals, two category counts, two product counts.
  3. Create a dashboard with a scatterplot and two strip plots.
  4. Float everything...YUCK!

Here's what my flow looks like from Prep:
I intentionally did NOT rename my tools so that you wouldn't know exactly what I did. You can see my final output below the flow.

Building the viz was pretty simple. Scaling the axes the same is something I do a lot, but I do expect that to trip up people. Creating the 45º line is something I've written a tip on before, but Ann's has a twist as it has to be behind the dots. Sneaky!

The sucky part was floating everything. I started by tiling everything, literally writing down the position and size for each element. Then I floated them one by one and entered what I wrote down to put them back in their proper position. From there, it was a little bit of tweaking to move the axes closer together.

Nice challenge. It wasn't overly complex and required me to reach back into the memory bank.

June 10, 2018

Makeover Monday: Tourism Density Index

For week 24, Eva presented us with something called the tourism density index, which basically means how many tourists come into a country compared to that country's population. Here's the original viz:


What works well?

  • Really good explanations for how they define overtourism and undertourism and examples for each
  • Providing the exact figures for each country
  • Colors are easy enough to distinguish
  • Sorting the countries from lowest to highest
  • Splitting the view between the highest 9 and the lowest 9

What could be improved?

  • Circles are inherently difficult for comparisons. Are they measured by area or diameter? Either way, the circle in a circle is overkill.
  • Why does the size of the light green circle change once the dark green circle is a larger value? That makes no sense at all.
  • If the exact numbers were not included, it would be impossible to compare countries.
  • Why show the top 9? That seems like an unusual way to select the countries.

My Goals

  • Focus on either the raw values or the percentages. I'll figure this out once I explore the data.
  • Make it easier to compare countries.


June 7, 2018

Workout Wednesday: How does sales compare in the Current Period to the Previous?

It's been eight weeks since I've done Workout Wednesday. Sometimes you have to reprioritize things to get other things done. For me, WW was something I could cut out to free up more time for finishing the Makeover Monday book (pre-order here).

But I'm back and this week Rody gave us this challenge. Read all of the requirements here.

I had an idea straight away how to do this and in all it took about 30 minutes. The date offsetting took some tinkering, but the rest was pretty easy. I'm glad Rody is back from his hiatus too. His challenges aren't as brutal as Ann's.

Click on the image for the interactive version.

June 5, 2018

Tableau Tip Tuesday: Split, Pivoting and Union with Tableau Prep

Now that the manuscript of the Makeover Monday book (pre-order here) has been sent to the publisher, I'm going to do my best to get back into a rhythm with weekly tips. This week, I'm going to show you how I used Tableau Prep to prepare the data for my Makeover Monday viz this week. It involves splitting eight columns into two sets of four, renaming fields, pivoting the data, and then unioning it all back together.

You can download the flow here.

June 4, 2018

Makeover Monday: The UK Gender Pay Gap Across Salary Bands

This week's data comes from the UK government and more specifically the Valuation Office Agency. I was alerted to this data set by Aisling Roberts, who had written a great article on LinkedIn that questions whether people will actually take any actions based on the data.

Let's start with this viz from the official report:


What works well?

  • The symbols make it clear this is about females and males.
  • The BAN in the middle tells us what the bonus pay gap is.

What could be improved?

  • Both icons are filled to the same level, making it look like there is no bonus pay gap. These should be filled to the actual values for each gender.
  • The icons don't add much value.
  • The title could tell us a whole lot more.
  • There's no source listed and no timeframe.
  • The gridlines aren't evenly spaced between 0% and 50%.

I must admit, this is a tough data set. Hopefully the explanations I wrote on data.world provide sufficient context. I found it most useful to look at a specific company and look those values up in the data provided to ensure I understood what it means. Given that I found the data overwhelming, I decided to focus on the pay bands since that's what Aisling focused on in her article. 

From there, I started to build lots of charts, but found the number of companies overwhelming. Therefore, I decided to limit the data to those companies located in the City of London (i.e., those with a postcode that starts with EC). I also knew I needed to do some data prep so that I could compare females and males in each pay band more easily. I turned to Tableau Prep for this.

The flow works like this:

  1. Remove columns that aren't needed
  2. Split the data up into two streams, one for the female columns and one for the male columns.
  3. Pivot the data so that the pay bands are listed down instead of across
  4. Add a column for the gender
  5. Union the data back together
  6. Export to an extract
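
For those who think better in code, the split, pivot and union steps above can be sketched in Python. This is an illustration and not an export of my Prep flow. The column names in the usage example are hypothetical, and the export step is left out.

```python
def pivot_stream(rows, id_field, band_columns, gender):
    """Turn one wide row per company into one tall row per pay band.

    band_columns maps a source column name to the pay band label it holds.
    """
    tall = []
    for row in rows:
        for column, band in band_columns.items():
            tall.append({
                id_field: row[id_field],
                "Gender": gender,          # step 4: add a gender column
                "Pay Band": band,          # step 3: bands listed down
                "Proportion": row[column],
            })
    return tall


def reshape(rows, id_field, female_columns, male_columns):
    """Split into female and male streams, pivot each, then union them."""
    female = pivot_stream(rows, id_field, female_columns, "Female")
    male = pivot_stream(rows, id_field, male_columns, "Male")
    return female + male  # step 5: union the streams back together
```

A call might look like `reshape(rows, "Employer", {"FemaleLower": "Lower"}, {"MaleLower": "Lower"})`, where the employer field and the column names are placeholders for whatever your data uses. Because only the mapped columns are read, anything else in the row is dropped, which covers step 1.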

Pretty straightforward, and this short amount of time spent prepping the data made the gender comparison significantly easier. I first wanted to understand how the median proportion of females and males in each pay band varies by the size of the company, with a City of London total (NOTE: the total only represents companies that reported).

Click on the image for the interactive version.


This simple view makes it incredibly evident that the proportion of females declines as the pay band increases. Males would be the inverse. It's particularly stark in the largest organizations. In the City of London, there are only three employers in that range (British Telecom, Royal Mail, and Sainsbury's Supermarket).

The heat map helped give me an overview of the data and I felt ready to create something more detailed. This time I wanted to look at all companies together by gender by pay band compared to the overall median for each gender. I also wanted to provide the user with the option to choose a specific company. When they do, that company gets highlighted.

Click on the image for the interactive version.

What first struck me in this view is the clear, overwhelming pattern down and to the right for women. This gave me a great impression of how big the gender pay gap problem is.

The gender pay gap is not a myth. These are facts, facts that show women are underrepresented at higher salary levels. Don't let this discussion get lost. Check out your own company. How are they performing? Ask them to share the data within your organization. Transparency is a key to fixing this discrepancy.

May 27, 2018

Makeover Monday: Europe Is Moving Up the Ranks of the World’s Wealthiest Regions

#MakeoverMonday HQ has been into small data lately and this week continued the trend: 2 columns x 20 rows. I love small data!

The viz to makeover this week comes from the World Economic Forum.


What works well?

  • Good title and subtitle
  • The color scale, from dark to light, makes it relatively easy to see which cities are the most and least expensive.
  • It's easy to see that the bottom-left city is the most expensive and the upper-right is the least expensive (of the list).
  • Including footnotes for the data
  • Including the source

What could be improved?

  • A treemap is used to represent parts-to-whole relationships. This is only a selected set of cities.
  • Displaying square meters as parts-to-whole makes no sense at all.
  • The sorting of the treemap is strange. It seems the whole chart should be flipped on the y-axis.

My Goals

  • After reading the article that the source refers to, I wanted to look at cities by their regions and focus on Europe.
  • Make the ranking of the cities much easier to understand
  • Use highlighting to focus the analysis
  • Make everything in a single chart (except the mobile version since I can't change the title and caption font size)

With that, here's my simple bar chart showing how European cities are mostly ranked near the top of the list of the World's wealthiest cities.

May 21, 2018

Makeover Monday: How well did The Guardian predict the Premier League table?

Back to sports again this week. With the Premier League season just finishing, we're looking at how well The Guardian predicted the EPL table at the start of the season.


What works well?

  • Sorting the teams by prediction makes sense since this is an evaluation of their performance against their prediction.
  • Including the logos so people can find their favorite team
  • Including the numbers for the table position so that the reader doesn't have to count as they go
  • Shading every other row helps break up the view

What could be improved?

  • If you don't know the team logos, it can be hard to track a team across the table.
  • It's hard to see which team did better and worse than expected.
  • There's no scale for how "well" The Guardian predicted the table.

My Goals

  • Focus on the difference between the predicted and actual results
  • Try to create some sort of unit chart (I didn't have time to figure out the calcs, so I cheated with distribution bands)
  • Make it easier to see if a team finished above or below the predictions
  • Finish in under an hour because we did MM live at the Data School and had to present to Eva at the end of the hour 

May 14, 2018

Makeover Monday: Which European commuters spend the most time in traffic jams?

It's crazy busy times at the Kriebel household, so this week I had to get something done quickly. Eva chose a data set about time people spend in traffic congestion in European cities.


What works well?

  • Bars are ranked in descending order
  • Simple, clear title
  • Axis title tells us what the bars represent
  • Nice tooltips
  • Footnotes that qualify the data

What could be improved?

  • The alternating bar colors add no meaning.
  • The title has a weird shape to it. 

My Goals

  • Change the metric to percent of time spent in congestion during peak hours, which required me to go to the source to get the additional data.
  • I took inspiration from Eva's viz, but wanted to show the congestion as a percentage rather than a raw number. I feel this gives more context to the numbers and lets the audience know their likelihood of being stuck in traffic in these cities.
  • Create the viz as a single worksheet.