Launch, grow, and unlock your career in data

March 29, 2017

Workout Wednesday: Benford's Law

From Wikipedia:

Benford's law, also called the first-digit law, is an observation about the frequency distribution of leading digits in many real-life sets of numerical data. The law states that in many naturally occurring collections of numbers, the leading significant digit is likely to be small. For example, in sets which obey the law, the number 1 appears as the most significant digit about 30% of the time, while 9 appears as the most significant digit less than 5% of the time. By contrast, if the digits were distributed uniformly, they would each occur about 11.1% of the time.
This leads us to the Workout Wednesday week 13 challenge. Let's build a chart about Benford's law based on the World Indicators datasource that comes with Tableau. You can download it here.

Here are the guidelines:

  • Data should reflect 2010 only
  • User should be able to pick between the eight metrics you see listed
  • Zeros and nulls should be excluded
  • Match my title and subtitle exactly
  • Match my colors throughout
  • Each bar represents the % of countries whose value for the selected metric starts with that digit
  • No tooltips
  • Labels for the bars should be on the inside-top of each bar
  • The lighter blue bars behind the red bars indicate the "expected" outcome of Benford's law. Notice those don't change even when the user changes the metric.
  • Dashboard layout is 800x600
  • You cannot use LOD expressions
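
For anyone who wants to sanity-check the numbers outside Tableau, here is a minimal Python sketch of the two pieces behind the chart. The expected share for each leading digit d is what Benford's law gives, log10(1 + 1/d). The observed share is simply the leading digit of each value, counted after zeros and nulls are dropped. The function names and sample values are made up for illustration; this is not how the workbook itself is built.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(value):
    """First significant digit of a non-zero number, e.g. 0.045 -> 4."""
    # Scientific notation puts the first significant digit in position 0.
    return int(f"{abs(value):.15e}"[0])


def leading_digit_shares(values):
    """Observed share of each leading digit, ignoring zeros and nulls."""
    usable = [v for v in values if v is not None and v != 0]
    if not usable:
        return {d: 0.0 for d in range(1, 10)}
    counts = Counter(leading_digit(v) for v in usable)
    return {d: counts.get(d, 0) / len(usable) for d in range(1, 10)}


if __name__ == "__main__":
    sample = [123, 0.045, 0, None, 19, 9000, 1.5]  # hypothetical values
    observed = leading_digit_shares(sample)
    for d in range(1, 10):
        print(d, round(benford_expected(d), 3), round(observed[d], 3))
```

The expected shares depend only on the digit, which is why the lighter bars stay put when the metric changes.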

This shouldn't be too terribly challenging. For me, it was fun to learn about what Benford's law is and see it in action. Good luck!

March 28, 2017

Tableau Tip Tuesday: How to Use One Parameter With Unlimited Number Formats

I've written about this week's tip before, yet never created a video for it. People have been telling me they find the videos useful, so occasionally I'm going back and creating videos for older tips. This is one of those weeks.

In this week's video, I show you how to use a single parameter with unlimited number formats. This is particularly useful when you have a single chart with a single measure and you want your user to be able to swap out the measure, yet retain the number formatting for the measure (e.g., currency vs. percentages).

March 27, 2017

Makeover Monday: The Secret of Success

For Makeover Monday week 13, we looked at a visualisation that I use in the Data School's first data viz training class, where we look at critically evaluating visualisations.

Andy Cotgreave messaged Eva and me about this particular viz and found a great article that does an excellent in-depth review of this particular piece. Reading the article reassured me about the thoughts that I've had about this chart over the years.

What works well?
  • Engaging design
  • Clear labels on the values
  • Legend matches the shape of each amoeba
  • Everything is scaled in proportion
  • Good color choices
  • Nice big title

What doesn't work well?
  • The subtitle is very offensive. It sure seems this survey was conducted by some extreme right-wing group.
  • The overlapping colors make it hard to follow the pattern.
  • Radar charts are very hard for making comparisons across categories.
  • The radars are curved to make the segments appear to take up more space.
  • Unnecessary icons

I wanted to be able to compare all of the segments in one view, meaning both across social strata and across reasons. I did a quick bit of Google image searching to get some ideas for survey visualisations to help gather my thoughts.

In the end, I went with a simple small multiple, waffle chart type design (mind you, they are circles instead of squares). I chose to highlight the largest response rate for each social stratum and call out the key findings in the title.

Click on the image for the interactive version.

March 22, 2017

Workout Wednesday: Highlight a Treemap

Emma was at it again this week with her little tricks! This week's workout was to rebuild this treemap. You can see all of the requirements on Emma's blog.

The requirements are pretty straightforward:

  1. The treemap should be segmented by Product Category and Product Sub-Category with Sales as the size of each part. 
  2. The Sales value should also colour each segment of the treemap as a gradient (she used blue with the highest sale being the lightest blue). 
  3. You should be able to highlight a segment of the treemap with a different colour (in orange). The part highlighted should be able to be changed by the end user.

I must admit that I don't know how to build treemaps in Tableau, so I took the easy way out and used Show Me for, I think, the first time in quite a while. That was easy!

I had an idea straight away for how to create the necessary calculation for coloring the sub-categories. This, however, resulted in my colors being shaded from lightest to darkest, that is, the sub-category with the largest sales was darkest. Emma's has the darkest color as the least sales, so I knew I wasn't done. This is what you see in the method 1 tab below. Method 1 uses a single calculation, but it won't work if any sub-category has negative sales. Though the treemap would crap out too.

Method 2 matches Emma's exactly. This requires two calculations and understanding when you need to make a field an attribute and when you need to make a field discrete. This method colors the sales from darkest to lightest, with the sub-category with the largest sales being the lightest color.

Method 3 is similar to method 2 in that it requires two fields on the color shelf. However, this colors sales from lightest to darkest, with the sub-category with the largest sales being the darkest color.

Good challenge Mrs. Emma!

March 21, 2017

Tableau Tip Tuesday: How to Create a Full Year Heatmap Calendar with Month Labels

This week's video is an extension of the great blog post by Kevin Taylor on the Tableau blog about how to create full year heatmap calendars. This video shows you how to add month labels to the visualisation.
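
One way to think about the layout, sketched in Python rather than Tableau: every date gets a column (its week of the year) and a row (its day of the week), and each month label sits over the column that contains the first day of that month. The sketch assumes weeks start on Sunday and uses made-up function names; it is not taken from Kevin's post or from the video.

```python
from datetime import date, timedelta


def calendar_cells(year):
    """Return (date, week_column, weekday_row) for every day of `year`.

    Assumes weeks start on Sunday, so Sunday is row 0 and Saturday is row 6.
    """
    jan1 = date(year, 1, 1)
    # date.weekday() has Monday=0 ... Sunday=6; shift so Sunday=0.
    offset = (jan1.weekday() + 1) % 7
    cells = []
    day = jan1
    while day.year == year:
        index = (day - jan1).days + offset
        cells.append((day, index // 7, index % 7))
        day += timedelta(days=1)
    return cells


def month_label_columns(year):
    """Map each month number to the week column holding its first day."""
    columns = {}
    for day, column, _row in calendar_cells(year):
        columns.setdefault(day.month, column)  # keep the first column seen
    return columns


if __name__ == "__main__":
    for month, column in month_label_columns(2017).items():
        print(month, column)
```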

March 20, 2017

Makeover Monday: The Giant Killers of the NCAA Tournament

It's March Madness time and Eva found a great viz to makeover from Business Insider. By great, I mean terrible, as in great for Makeover Monday.

What works well?

  • Years are in order
  • Nothing else

What doesn't work well?
  • It makes no sense whatsoever to add up the seeds.
  • There is no consistency in the coloring of the bars.
  • The text call out in the middle has nothing to do with the chart.
  • It's way too busy.
  • I have to rotate my head sideways to read the years.

For my view, I wanted to focus on the upsets because that's what makes the NCAA Tournament so exciting! Cinderella can make it to the ball and often does. I went with American Typewriter for the font to make it look more like a newspaper headline. The regions are ordered geographically (thanks for the feedback Eva).

March 19, 2017

Where is Planned Parenthood the only clinic that offers women the full range of contraceptive services?

I was looking through Twitter Friday and saw this tweet from Emily Crockett.

Emily was responding to the ignorance of HHS Secretary Tom Price. Here's the text from the Vox article:

During a CNN Health Care Town Hall on Wednesday, co-host Dana Bash asked Health and Human Services Secretary Tom Price about what would happen to the women who rely on Planned Parenthood for health care if the organization were to be defunded.
Specifically, she asked about those who live in the 105 counties where Planned Parenthood is the only clinic that offers women the full range of contraceptive services.
“Well, I’d be interested in the list you have,” Price replied.
Emily provides a simple list of the counties that are impacted. After a bit more reading, specifically on the National Campaign to Prevent Teen and Unplanned Pregnancy website, I was able to find the data behind the counties Emily listed. A few hours on a Friday evening later, and some great feedback from my colleagues, and I present to you this map of the counties. I think using visuals in this case tells a much more powerful story.

Basically it says, get your act together and your facts straight Secretary Price. Do what's right for women in America!

Click on the image for the interactive version.

March 15, 2017

Workout Wednesday: Full Year Calendar with Month Labels

Last week, Kevin Taylor of Tableau wrote a great blog post on the use cases and the how-to for heatmap calendars. Kevin ends the blog post with this question/challenge:

I’ve still yet to come across a really good, scalable solution for adding in the month names.
Kevin sent this to me several weeks ago and I came up with this solution, which is your challenge for this week. Create a heatmap calendar that includes month names. Download the data for this challenge here. You must match everything: the titles, the colors, the fonts, the filter, all of it. The final dashboard size is 600x800.

TIP: You could start by downloading Kevin's workbook. This will give you the LOD that he used for the weeks. Good luck!

March 14, 2017

Tableau Tip Tuesday: Using LODs to View the Latest, Previous and Prior Months

This week's tip comes from a question on The Information Lab's collaboration platform about returning the latest, previous and prior month values.

I'm using LOD expressions in this video along with the DATEDIFF and DATETRUNC functions. Below this video, you will see the calculations that I used if you want to copy/paste them for your own use.

Latest Month Sales

IF DATEDIFF('month',DATETRUNC('month',[Order Date]),{MAX(DATETRUNC('month',[Order Date]))})=0
THEN [Sales]
END

Previous Month Sales

IF DATEDIFF('month',DATETRUNC('month',[Order Date]),{MAX(DATETRUNC('month',[Order Date]))})=1
THEN [Sales]
END

Prior Month Sales

IF DATEDIFF('month',DATETRUNC('month',[Order Date]),{MAX(DATETRUNC('month',[Order Date]))})=2
THEN [Sales]
END

If you want the months to be labeled…

Latest 3 Months

DATEDIFF('month',
DATETRUNC('month',[Order Date]),
{MAX(DATETRUNC('month',[Order Date]))}) <= 2

Latest N Months

DATEDIFF('month',
DATETRUNC('month',[Order Date]),
{MAX(DATETRUNC('month',[Order Date]))}) < [How many months?]
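
If it helps to see the logic outside Tableau, here is a rough Python sketch of what these calculations do: find the latest month in the data, count how many months each order sits before it, and keep the rows whose offset matches. The helper names and the sample rows are hypothetical. The month difference is computed from year and month only, which is my understanding of how DATEDIFF('month', ...) counts.

```python
from datetime import date


def month_number(day):
    """Months since year 0, so differences count calendar-month boundaries."""
    return day.year * 12 + day.month


def months_before_latest(rows):
    """Pair each (order_date, sales) row with its offset from the latest month."""
    latest = max(month_number(order_date) for order_date, _ in rows)
    return [(latest - month_number(d), d, sales) for d, sales in rows]


def sales_for_offset(rows, offset):
    """Total sales for the month `offset` months before the latest month.

    offset 0 = latest month, 1 = previous month, 2 = prior month.
    """
    return sum(s for off, _, s in months_before_latest(rows) if off == offset)


def latest_n_months(rows, n):
    """Rows that fall in the latest `n` months (offset < n)."""
    return [(d, s) for off, d, s in months_before_latest(rows) if off < n]


if __name__ == "__main__":
    rows = [  # hypothetical orders
        (date(2016, 12, 31), 10.0),
        (date(2017, 1, 15), 100.0),
        (date(2017, 2, 3), 50.0),
        (date(2017, 3, 20), 70.0),
        (date(2017, 3, 28), 30.0),
    ]
    print(sales_for_offset(rows, 0), sales_for_offset(rows, 1))
```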

March 13, 2017

Makeover Monday: Who Has the Best Orgasm Frequency?

Interesting topic this week for Makeover Monday...orgasms! I mean, who doesn't like a good orgasm? Well according to the data, women aren't having them frequently enough. But isn't that the mystery men have been trying to solve since the beginning of time?

Let's take a quick look at the original viz by Anna Vital, an information designer based in San Francisco.

What works well?
  • Orange text on the dark purple background
  • Title captures your attention
  • Using icons for the relationship type
  • Nice big numbers
  • Bed icon that looks like it has shooting stars coming out of it
  • Metrics are sorted
  • Including references to the data source
  • Simple, organized layout

What doesn't work well?
  • Bed icons are partially shaded, which makes it tough to know the exact amount each bed is shaded. However, including the large numbers helps offset this weakness.
  • Light purple is really hard to read
  • Could use a better title; this one captures our attention because it's about sex
  • Should the icons that are shaded as out of the range still have the fireworks coming out of them?
  • Sorting from worst to best; I would sort the other way around to emphasize the positive

To understand the data better, I read the abstract from the original study. This helped me understand the points I wanted to highlight. I thought about creating a waffle chart; however, I wanted the viz to look more like an infographic, so I switched to 100 bed shapes. These are then highlighted based on the orgasm rate for the group.

I liked Anna's big numbers, so I've included those as well. Instead of using male/female icons in different collections, I used icons from Lastly, I included some of the text highlights from the study extract to provide additional context.

March 8, 2017

Workout Wednesday: UK House Prices Hexbin Maps

Another great learning experience today for me thanks to Emma! This week required us to create hexbin maps in Tableau. I'd never done these before, so I got stuck in, knowing I'd come out of it having learned something.

Let's start with Emma's requirements:

  1. Create the hexbin map based on the District 
  2. Change the size of the hexagon using a parameter
  3. Match the sunrise-sunset diverging colour palette on the hexagons
  4. Match the tooltips and titles
  5. Set the dashboard size to 650 * 700 pixels

Fortunately Emma linked to a blog post by Matt Chambers for how to use the HEXBINX and HEXBINY functions. This made the exercise immensely easier because all I really needed to do was swap out [Ratio] in the blog post with my parameter.

I couldn't get the total in my title to match, though I knew I had done it right. It turns out Emma used an LOD that returns the average for both years, not just the year selected. I decided to use the year selected because that reflects what's in the map. I also created this value with a table calc since all of the dimensions I needed were already in the view.

Also, requirement #2 to change the size is actually a requirement to change the density, i.e., how many hexbins are created. So, I decided to also size the hexagons based on how many are displayed in the view. If you enter a smaller number, the hexagons are bigger; enter a bigger number and the hexagons are smaller. I also limited the parameter to values between 20 and 200.
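
To see why a bigger parameter means more, smaller bins, here is a small Python sketch. It assumes the parameter acts as a scale factor applied to the coordinates before binning, as the [Ratio] swap suggests. It uses a plain square grid as a stand-in for Tableau's HEXBINX and HEXBINY, so it illustrates the density idea only and does not reproduce the hexagon geometry. All names and sample points are made up.

```python
def clamp_density(value, low=20, high=200):
    """Keep the density parameter inside the allowed range."""
    return max(low, min(high, value))


def bin_key(x, y, density):
    """Snap a point to a grid cell after scaling by `density`.

    Square cells stand in for hexagons here; a larger density makes
    each cell cover a smaller area.
    """
    return (round(x * density), round(y * density))


def count_bins(points, density):
    """Number of distinct cells the points fall into at this density."""
    return len({bin_key(x, y, density) for x, y in points})


def relative_mark_size(density):
    """Shrink the mark as the density grows so the cells do not overlap."""
    return 1 / density


if __name__ == "__main__":
    points = [(0.12, 51.3), (0.18, 51.32), (-1.9, 52.48), (-1.95, 52.41)]
    for density in (1, 20, 200):
        print(density, count_bins(points, density), relative_mark_size(density))
```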

Lastly, I floated everything since the color legend had to be floated above the hexmap. I find that Tableau works best when you go either 100% float or 100% tiled.

Nice fun challenge! Learned a bunch!

March 7, 2017

Tableau Tip Tuesday: How to Export a CSV from a TDE

Super simple, yet very useful tip this week. Here's the scenario:

  1. Someone sends you a TDE
  2. You need the raw data, but the file wasn't provided

Getting the data out is much simpler than you think.

Step 1 - Connect to the TDE

Step 2 - Add Number of Records to the Text shelf

Step 3 - Right-click on the text in the worksheet and choose View Data...

Step 4 - In the View Data window, click on the Full Data tab at the bottom.

Step 5 - Click the Export All button on the upper right

This will save all of the data as a CSV. That's it! I told you it was simple.

March 6, 2017

Makeover Monday: Who are YouTube's Biggest Gaming Stars?

This week Eva chose this simple table from SocialBlade of the top YouTube gaming channels for the Makeover Monday topic.

What works well?
  • Tables are great for looking up specific values. Want to know how many views and/or subscribers the 73rd highest channel has? A table makes finding that really easy.
  • Tables are precise; they provide exact values.
  • The table is sorted by the rank, making it easy to process who is most popular.

What doesn't work?
  • There's no real story to the data.
  • Tables aren't engaging.
  • Comparisons are difficult - How do the top 20% compare to the rest?

Really, there's not a whole lot more to say. I started this week with a Google search for some inspiration. I was looking for graphics that caught my eye, interesting statistics, good colors, etc. In the end, I chose to use the colors from this visualisation for my palette because I thought they fit well with YouTube's primary red.

I wanted to answer two simple questions with my visualisation:
  1. How do the top 20% of channels compare to the rest? Do they fit the Pareto principle?
  2. How many views do the top 10 channels have and how can I best compare them?
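
For the first question, the calculation itself is simple enough to sketch in Python: rank the channels by views, take the top 20%, and see what share of all views they hold. The Pareto principle would suggest something near 80%. The function name and the sample numbers are hypothetical, not the SocialBlade data.

```python
import math


def top_share(views, top_fraction=0.2):
    """Share of total views held by the top `top_fraction` of channels."""
    ranked = sorted(views, reverse=True)
    total = sum(ranked)
    if not ranked or total == 0:
        return 0.0
    # Always include at least one channel in the top group.
    top_n = max(1, math.ceil(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / total


if __name__ == "__main__":
    sample_views = [50, 20, 10, 5, 5, 4, 3, 1, 1, 1]  # hypothetical, in billions
    print(f"Top 20% hold {top_share(sample_views):.0%} of views")
```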

Originally I created a long dashboard, but found it too hard to see the entire story in one view, so I've switched to a wider view. Click on it for the interactive version.

March 1, 2017

Workout Wednesday: World Series Game 7 - Pitch-By-Pitch

When I set out to do this week's workout, I really wanted to recreate this amazing World Cup infographic:

I love this graphic! So much information packed in a compact space. But I couldn't find the data anywhere. What I decided to do instead was look at game 7 of the 2016 World Series. It's talked about as one of the greatest games of all time, so I thought I'd create something similar, but on a pitch-by-pitch basis.

I was able to find the data on Brooks Baseball. I then imported it into Google Sheets for each pitcher and then unioned them all in Tableau. I'd recommend you just use the TDE I've created this week as I've removed all of the extra columns you won't need. You can download it here.

Here are the requirements:

  1. Each inning should be an individual row
  2. Within each inning, show every pitch from left to right
  3. The home team (Cleveland Indians) pitched first, so their bars should point up, followed by the visiting team (Chicago Cubs), whose bars should point down.
  4. Each pitch is color coded based on the outcome - Ball, Strike, or In Play
  5. The final outcome of each batter should be displayed as a shape and color coded. See the subtitle in my viz. Note that the open circle is filled in the middle with white so that the bar can't be seen through it.
  6. Match my tooltips
  7. Include the data source at the bottom
  8. Match my title and subtitle
  9. Viz must be a single worksheet
  10. Viz should be 450x800
  11. Optional: Match my font, Rubik in this case.
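
If the layout in requirements 1 to 3 is hard to picture, here is a rough Python sketch of the idea: number the pitches from left to right within each inning, and give each bar a direction based on which team is pitching. The field names and the sample pitches are made up for illustration; they are not the columns in the TDE or real pitches from the game.

```python
from collections import defaultdict

HOME_TEAM = "Cleveland Indians"


def layout_pitches(pitches):
    """Place pitches left to right per inning, with bar direction by team.

    `pitches` is an ordered list of (inning, pitching_team, outcome).
    Bars for the home team's pitches point up (+1); the visitors' point down (-1).
    """
    next_position = defaultdict(int)  # running pitch count per inning
    placed = []
    for inning, pitching_team, outcome in pitches:
        next_position[inning] += 1
        placed.append({
            "inning": inning,              # one row per inning
            "x": next_position[inning],    # position within the row
            "bar": 1 if pitching_team == HOME_TEAM else -1,
            "outcome": outcome,            # Ball, Strike, or In Play
        })
    return placed


if __name__ == "__main__":
    sample = [  # hypothetical pitches
        (1, "Cleveland Indians", "Ball"),
        (1, "Cleveland Indians", "Strike"),
        (1, "Chicago Cubs", "In Play"),
        (2, "Cleveland Indians", "Strike"),
    ]
    for row in layout_pitches(sample):
        print(row)
```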

That's it! This shouldn't be as challenging as some of the past challenges I've posted. For me, this was more about trying to replicate a viz I liked. If you don't understand the fields or get stuck, ask for help. Good luck!