Data Viz Done Right

December 23, 2013

Tableau Tip: Using actions to “reset” a chart to the most recent date

This is a guest post from Richie Fanti, who joined our team at Facebook at the beginning of December.  This is a great example of the type of talent that we’re looking to bring onto our team.  If you’re interested in joining our Analytics team as a Data Visualization Engineer, contact me via any of the links on the right side of this blog.  Check out the job description here.

A great Tableau question came up at Facebook the other day. Our users are often especially interested in the most recent daily data, but like to have the freedom to view historical details on demand as well. Trend lines are generally good at telling the historical story, but don’t always provide all of the details that users want. Wouldn’t it be cool to use a trend line to filter the specific components of a dashboard to any historical data point, but then be able to immediately revert to the most recent date on demand?

Suppose you have a simple dashboard composed of two sheets:
  1. A trend line that shows daily sales by product category.
  2. A horizontal bar chart that shows the top-selling products.
Here’s what it looks like using the Superstore Sales Excel data:


Now say we want the end user to select a mark on the trend line and have the bar chart update to a list of products within that respective product category and show sales for only that day. Tableau makes this very straightforward through the use of a filter action. However, what if the user deselects that mark on the trend line and we want the bar chart to revert to showing sales for only the most recent date in the data? Without some extra manipulation and a couple of table calculations, Tableau’s default behavior would be to clear all actions and show product sales for all dates in the partition, rather than only the most recent one. In other words, there are generally two things we want here:
  1. The bar chart should only reflect the most recent day of data by default (in this case June 16th).
  2. When a mark on the trend line is selected, the bar chart should dynamically filter to that respective day of data. However, if that selection is cleared, or if that mark on the trend line is unselected, we want the bar chart to revert back to only reflecting sales data for the most recent date (June 16th).
If we ignore requirement #2, Tableau makes everything about #1 pretty straightforward. #2 is where things get tricky, especially the latter half; getting the bar chart to revert back to only reflecting sales for the most recent date of data once the action is cleared in the dashboard requires some extra manipulation. There are a few steps and table calculations used to achieve this. Let’s start with first building out the trend line.

Using the Superstore Sales Excel data, drag Sales to the Rows Shelf, Product Category to the Color Shelf, and right-click drag and drop Order Date to the Columns Shelf, choosing continuous days.


Also, we’ll need a date filter so that Tableau only shows the most recent two weeks in the view to simplify things. Drag Order Date to the Filters Shelf, choose “Individual dates,” go to the “Top” tab, and choose Top 14 by Minimum Order Date. This filters the partition to the most recent two weeks of data.
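As a rough illustration of what that Top 14 filter keeps, here is a minimal Python sketch (not Tableau syntax). The function name and the sample dates, including the year, are made up for the example. Note that it keeps the 14 most recent distinct dates that appear in the data, which matches two calendar weeks only when every day has at least one order.

```python
from datetime import date, timedelta


def most_recent_dates(order_dates, n=14):
    """Return the n most recent distinct dates, newest first."""
    return sorted(set(order_dates), reverse=True)[:n]


# Hypothetical sample: 20 consecutive order dates ending on June 16th.
sample_dates = [date(2013, 6, 16) - timedelta(days=i) for i in range(20)]
kept = most_recent_dates(sample_dates)
print(kept[0], kept[-1], len(kept))
```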


Let’s make this a global filter by right-clicking on the date field on the Filters Shelf and choosing “Apply to Worksheets,” then “All Using this Data Source.” Now every viz we make using this data source will only reflect the most recent two weeks of data.

Cool, so far so good. Nothing tricky here. Now on to the bar chart. Drag Sales to the Columns Shelf, Product Name to the Rows Shelf, and sort descending by the sum of Sales. I like to add labels on the end of each bar by selecting the “Abc” icon on the toolbar as well. We can also add Product Category to the Color Shelf so that it matches the color scheme in the trend line:


Note that the view shows the top products by cumulative sales over the last two weeks. To satisfy #1 of our requirements, our users want to see data for only the most recent day by default, rather than the most recent two weeks as a whole. This can be done in the dashboard by creating a filter action and selecting a mark on the trend line for the most recent day, but clearing the selection would just show cumulative two-week sales again, which is not what we’re looking for. To get around this, we can add discrete days to the view and use the INDEX() function to rank products within each discrete day of data. This will essentially allow Tableau to segment a list of top products within each day.
NOTE: Since this is being done in version 8.0.x, I’m using the INDEX() function to get what we need here. Version 8.1 introduced some enhanced ranking functionality with the RANK() function, but the general framework behind this between v8.0 and v8.1 should be very similar.
Right-click drag and drop Order Date to the left of Product Name on the Rows Shelf and choose the discrete MDY option. Right-click on this field and choose Sort, then choose Descending in data source order. This places the most recent day of data on the top of the list in the view. You should have something like this:


Now we need to use a table calculation to sort products within each day, rather than on the aggregate over all days. Create a calculated field, name it Product Rank, and enter INDEX() in the Formula box.


Right-click on this newly created calculated field in the measures list in the data window and choose “Convert to Discrete.” Then, drag and drop it between Order Date and Product Name on the Rows Shelf. Things will look somewhat funky, but we’ll fix this by right-clicking on the Product Rank field on the Rows Shelf and choosing “Edit Table Calculation…” Choose Compute using > Advanced. We want the table calculation to be addressed along Month, Day, Year of Order Date, Product Name, and Product Category in that order, and sorted descending by the sum of Sales. Then select At the level: > Product Name and Restarting Every: > Month, Day, Year of Order Date. Here’s the snapshot of the dialog boxes:


The sheet should now have products correctly sorted in descending order by the sum of sales within each discrete day.
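To see what the per-day ranking amounts to, here is a small Python sketch of the same idea outside Tableau: total sales per product within each day, then number the products 1, 2, 3 and so on, restarting for every day. The product names, dates, and figures are invented, and the function only imitates what INDEX() produces with the settings above; it is not how Tableau computes it internally.

```python
from collections import defaultdict
from datetime import date


def rank_products_by_day(rows):
    """rows: (order_date, product_name, sales) tuples.

    Returns {order_date: [(rank, product_name, total_sales), ...]},
    with rank restarting at 1 for each day, highest sales first.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for order_date, product, sales in rows:
        totals[order_date][product] += sales

    ranked = {}
    for order_date, per_product in totals.items():
        ordered = sorted(per_product.items(), key=lambda kv: kv[1], reverse=True)
        ranked[order_date] = [
            (rank, name, total) for rank, (name, total) in enumerate(ordered, start=1)
        ]
    return ranked


# Hypothetical rows; two of them are the same product on the same day.
rows = [
    (date(2013, 6, 16), "Chair", 100.0),
    (date(2013, 6, 16), "Desk", 250.0),
    (date(2013, 6, 16), "Chair", 200.0),
    (date(2013, 6, 15), "Lamp", 40.0),
    (date(2013, 6, 15), "Desk", 30.0),
]
product_ranks = rank_products_by_day(rows)
for day in sorted(product_ranks, reverse=True):
    print(day, product_ranks[day])
```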


Finally, we need to do this one more time for dates. However, we’ll use the index function differently this time. Since Order Date is already sorted, we don’t necessarily need the INDEX() function strictly to sort stuff. We can instead leverage the more formal ranking to dynamically filter only the top-ranked date. This is different from directly filtering the date field because that would not allow this sheet to interact with other sheets in the dashboard based on Order Date in the dynamic sense that we’re looking for. The INDEX() ranking allows Tableau to make two passes on the partition, therefore telling Tableau to aggregate sales for the past 14 days, but still giving us the ability to only view details for one date at a time dynamically and interactively.

To create the second calculated field, right-click on Product Rank from the measures list and choose “Duplicate.” Rename it “Date Rank.” Drag this to the left of MDY (Order Date) on the Rows Shelf. We want to choose some advanced computation settings for this as well, so go to “Edit Table Calculation,” and edit the “Compute Using” settings so that the table calculation addresses Order Date, Product Name, and Product Category respectively. Then we want the results to be computed along Order Date so we can assign a rank to each discrete day in the view.


This should give us a list of products ranked in descending order by sales within each day, which is also assigned a rank in descending data source order.


Now all we need to do is filter the viz so that it only shows the #1 ranked date, which is also the most recent day of data by default. Double-click on the Date Rank field on the Rows Shelf, and edit the selection so that only #1 is chosen. This is super important because, by filtering the Date Rank, we’ve just completed both #1 and #2 of our requirements for this dashboard. Not only is the most recent date now always going to be displayed by default, but we can also filter to other dates and clear the filter to revert back to the most recent.
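The reason this works comes down to ordering: as far as I understand Tableau’s order of operations, a filter action on Order Date is applied like a regular dimension filter before table calculations run, so Date Rank is recomputed over whatever dates remain. The Python sketch below imitates that sequence. The function name and the dates (including the year) are hypothetical, and this is a simplified model rather than Tableau’s actual pipeline.

```python
from datetime import date


def visible_dates(dates_in_window, selected_date=None):
    """Return the date(s) the bar chart would show."""
    # Step 1: the filter action (if any) narrows the dates first.
    remaining = [
        d for d in dates_in_window if selected_date is None or d == selected_date
    ]
    # Step 2: rank the remaining dates, newest first (the Date Rank analogue).
    ranked = list(enumerate(sorted(set(remaining), reverse=True), start=1))
    # Step 3: the Date Rank filter keeps only rank 1.
    return [d for rank, d in ranked if rank == 1]


# Hypothetical 14-day window ending June 16th.
window = [date(2013, 6, day) for day in range(3, 17)]
print(visible_dates(window))                    # no selection
print(visible_dates(window, date(2013, 6, 10)))  # a mark is selected
```

With no selection, the newest date in the window gets rank 1. With a selection, only one date survives the action, so that date becomes rank 1 instead.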

All that’s needed now is some clean-up and actually building the dashboard. To hide some of the redundant stuff, right-click on the Date Rank field on the Rows Shelf and uncheck Show Header, then do the same for the Product Rank field. We’ll leave the date field included to ensure things are working correctly.

To make the dashboard, arrange the sheets to look something like this:


Finally, all we have to do is create the filter action and we’ll be good to go.


We now have a dashboard that only reflects product sales for the most recent day of data in the bar chart, and after interacting with the trend line, the bar chart will always default back to this day as well. This is a pretty cool functionality to have because often our users only care about what is happening today, but like to have the ability to look back at a snapshot in time to view more specific data that otherwise couldn’t be seen in the trend line.

Check it out. Play with it yourself below. Notice how the bar chart defaults to the last date. Click on a dot on the line chart, for example June 10th. Notice how the bar chart updates to reflect June 10th. Now deselect that point and the bar chart goes back to the latest date.

December 4, 2013

Tableau tip: Don’t waste the ends of your sparklines…make them actionable!

I’ve been meaning to write about how I use sparklines for a while now and the post Two Tips for Meaningful Sparklines in Tableau by Jim Wahl finally gave me the push I needed to get this written.

Sparklines are one of my favorite chart types to include in dashboards, yet I see many people using them without providing enough context.  Some people like to add bandlines, some like to add sets of dots, some like to add text, all in an effort to add meaning to sparklines.  These are perfectly fine, but I think there’s a better way to make sparklines actionable.  Is this the best way? Maybe not, but it is an alternative worth considering.

Sparklines were first introduced by Edward Tufte in his book Beautiful Evidence.  Tufte says: “A sparkline is a small intense, simple, word-sized graphic with typographic resolution.”  Stephen Few expands Tufte’s definition in his book Information Dashboard Design: “Their whole purpose is to provide a quick sense of historical context to enrich the meaning of the measure.  This is exactly what’s required in a dashboard.”

When someone is creating a dashboard, they should provide as much information and meaning as possible to make the information actionable.  I don’t see any examples from Tufte, Few or Jim Wahl that provide much meaningful context to the end of a sparkline.

Tufte provides some examples:

He might add a red dot to the end of the line along with some text to highlight the latest value.


While it’s a bit tough to see in this next example, Tufte has used red dots for the beginning and ends of the lines and blue dots to indicate the highest and lowest values.


It’s important to also note how Tufte always includes the values associated with all of the highlighted dots.

There are tons and tons of examples of how Stephen Few uses sparklines.  Consider this example from his whitepaper Dashboard Design for Real-Time Situation Awareness:


Few says: “Meaningful context has been added to these metrics in the form of sparklines, which provide a quick sense of the history that has led up to the present.”  This small section of a dashboard is a classic Few design.  You’ll often see him use (1) sparklines, (2) a visual indicator of health (the red dots in this case), and (3) bullet charts closely together.

When I use sparklines, I like to combine all of the elements of Tufte and Few designs.  Let’s look at an example.


On the left you see the sparklines, but notice that I use the dot on the end of the line as an indicator to take action.  Tufte uses the dot on the end to indicate you’re at the end.  Does that make it actionable?  Not necessarily.  Few separates the indicator into its own space and does not mark the end of the sparkline.  My version saves space, increases the data-to-ink ratio, and provides a visual indicator to the reader in one chart.

The table to the right summarizes the sparkline, pulling from Tufte’s practices.  In this example, I’m concerned with comparing the last two 7-day periods.  Notice how I used conditional formatting so that the dot on the end of the line is the same color as the text in the WoW and WoW% columns.  I don’t use bullet graphs because I feel that the text itself is sufficient; I don’t want to add a graph for the sake of having a graph for everything.

Simple, concise, actionable…all things you want in a dashboard.  Keep reading to see how I built these sparklines in Tableau.

Step 1: The date calculations I use in the example below are simple and efficient when you include a Max Date field in your data source.  Creating Max Date as a calculated field directly in Tableau won’t always work since you need the Max Date at the row level.  In this example, I’ve switched the Superstore Sales data source to Custom SQL and added a subquery to include the Max Date at the row level.
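To make “Max Date at the row level” concrete, here is a minimal Python sketch of what the subquery accomplishes: every row carries the same overall maximum order date as an extra column. The field names and sample rows are assumptions for illustration; the actual Custom SQL is in the screenshot and workbook, not reproduced here.

```python
from datetime import date


def add_max_date(rows):
    """Return copies of the rows, each with the overall max order date attached."""
    max_date = max(row["order_date"] for row in rows)
    return [{**row, "max_date": max_date} for row in rows]


# Hypothetical rows standing in for the Superstore Sales data.
orders = [
    {"order_date": date(2013, 6, 14), "sales": 120.0},
    {"order_date": date(2013, 6, 16), "sales": 75.0},
    {"order_date": date(2013, 6, 15), "sales": 60.0},
]
orders_with_max = add_max_date(orders)
for row in orders_with_max:
    print(row)
```

Because the value lives on every row, row-level calculations can compare Order Date against it directly.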


TIP: If you have a large dataset, Tableau will run more efficient queries if you push the custom SQL into a view in your database.  Tableau wraps its own SQL around the custom SQL, which can get quite messy and inefficient.  Creating a view will simplify the query Tableau runs and improve performance.

Step 2: I like my sparklines to show the last 30 days, so I need to include a date filter.  I include my date filter as the first step so that my data set is smaller to work with from the outset.


A Boolean calculation works well here.  Notice how it leverages the Max Date field.  This wouldn’t work if the Max Date was a calculated field inside of Tableau.
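The exact formula is in the screenshot, so treat the following as a guess at its logic rather than the calculation itself: a row passes when its order date falls in the 30 days ending on Max Date. The sketch is in Python, the function name is made up, and the boundary (Max Date plus the 29 days before it) is an assumption modeled on the 7-day calculations later in this post.

```python
from datetime import date, timedelta


def in_last_30_days(order_date, max_date):
    """True when order_date is within the 30-day window ending on max_date."""
    # Assumed boundary: max_date itself plus the 29 days before it.
    return max_date - timedelta(days=29) <= order_date <= max_date


max_date = date(2013, 6, 16)  # hypothetical Max Date
first_day_in_window = max_date - timedelta(days=29)
print(first_day_in_window, in_last_30_days(first_day_in_window, max_date))
```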


To get my sparklines to look how I like them, the column and row shelves will need to look like this.  Let’s break the worksheet down into its pieces.


Step 3: Create a dummy header, place it on the Columns shelf, and hide the field labels for columns.


Step 4: Right-click-drag Order Date to the columns shelf and choose the first option, Order Date (Continuous).  Notice how it only shows the last 30 days.


Step 5: Right-click on the Order Date pill and uncheck Show Header.  This hides the date axis.

Step 6: Drag Category onto the Rows shelf and hide the headers.  The headers aren’t needed since they’re on the left side of the table; there’s no need to repeat them.

Step 7: Place Sales on the Rows shelf to the right of Category.  This gives us the lines.  Make them thinner, change the color to dark gray, and resize the chart to make them look like sparklines.

Step 8: Double-click on the Sales axis to bring up the axis options.  Uncheck Include Zero and choose Independent axis ranges for each row or column.  This gives us the view that best fits the space.  Few talks about the scaling options for sparklines in Chapter 10 of Information Dashboard Design.

Step 9: We need a calculated field to show a dot on the end of the line.  You might be tempted to simply turn on the line ends, but that won’t do the trick because you can’t color the line ends only.  The calculated field should only capture sales for the last day.  This is where our Max Date field comes in handy again.
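The calculated field itself is shown in the screenshot; the Python sketch below only illustrates the idea as I read it: return the sales value when the row’s date equals Max Date, and nothing otherwise, so that only one mark is drawn per line. The function name and sample values are hypothetical.

```python
from datetime import date


def last_day_sales(order_date, max_date, sales):
    """Sales for rows on the max date; None (no mark) for every other day."""
    return sales if order_date == max_date else None


max_date = date(2013, 6, 16)  # hypothetical Max Date
daily = [
    (date(2013, 6, 14), 120.0),
    (date(2013, 6, 15), 60.0),
    (date(2013, 6, 16), 75.0),
]
dots = [last_day_sales(d, max_date, s) for d, s in daily]
print(dots)
```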


Step 10: Drag the new Last Day Sales field onto the Rows shelf to the right of the Sales pill.  Right-click on the Last Day Sales pill and choose Dual Axis.  Right-click on the axis for Last Day Sales and choose Synchronize Axis.  Right-click on the Sales pill and uncheck Show Header.

We’re almost done.  All we need to do now is color the dot.

Step 11: I like my dots to be colored by the week over week change.  This requires me to create several calculated fields.  You could combine all of these calculated fields into a single calculation, but I like separating the parts of the calculation to make it easier to understand and so that each calculation is reusable.

Create all of these calculated fields in this order (special thanks to Joe Mako for helping me get these calculations working and showing me why they’re more efficient than what I had been doing):
  1. Last 7 Day Sales:
    IF [Order Date] >= DATEADD('day', -6, [Max Date]) THEN [Sales] END
  2. Prior 7 Day Sales:
    IF [Order Date] >= DATEADD('day', -13, [Max Date]) AND [Order Date] <= DATEADD('day', -7, [Max Date]) THEN [Sales] END
  3. Total Sales - Last 7 Days:
    IIF(LAST()=0, RUNNING_SUM(SUM([Last 7 Day Sales])), null)
  4. Total Sales - Prior 7 Days:
    IIF(LAST()=0, RUNNING_SUM(SUM([Prior 7 Day Sales])), null)
  5. WoW (week over week):
    [Total Sales - Last 7 Days] - [Total Sales - Prior 7 Days]
  6. WoW %:
    [WoW]/[Total Sales - Prior 7 Days]
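If it helps to check the arithmetic, here is a small Python sketch of what the six calculations add up to for a single category: sum sales for Max Date and the 6 days before it, sum sales for the 7 days before that, then take the difference and the ratio. The function name and sample numbers are invented. It skips the RUNNING_SUM and LAST()=0 mechanics, which in Tableau appear to exist so the total shows up only on the final mark of the line.

```python
from datetime import date, timedelta


def week_over_week(daily_sales, max_date):
    """daily_sales: (order_date, sales) pairs. Returns (last7, prior7, wow, wow_pct)."""
    last7 = prior7 = 0.0
    for order_date, sales in daily_sales:
        age = (max_date - order_date).days  # 0 means the max date itself
        if 0 <= age <= 6:
            last7 += sales
        elif 7 <= age <= 13:
            prior7 += sales
    wow = last7 - prior7
    # Guard against dividing by zero when the prior week has no sales.
    wow_pct = wow / prior7 if prior7 else None
    return last7, prior7, wow, wow_pct


# Hypothetical data: 120 per day in the last week, 100 per day the week before.
max_date = date(2013, 6, 16)
daily = [
    (max_date - timedelta(days=i), 120.0 if i < 7 else 100.0) for i in range(14)
]
result = week_over_week(daily, max_date)
print(result)
```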
Step 12: Click on the Last Day Sales pill on the rows shelf and then drag the WoW % field onto its color shelf.


Step 13: Right-click on the WoW % pill, go down to Compute Using and choose Order Date.

Step 14: Double-click on the color legend and change the settings to something like these:


You might need to do a bit more formatting to get your viz just the way you want it, but in the end, you’ll want it to look something like this:


Notice that I keep the row banding.  I like to include banding on both the sparkline chart and the table so that the reader’s eyes go across the dashboard.

This might seem like a lot of steps, but once you do it a couple of times, it’s pretty quick; you’ll be able to do this in only a couple of minutes.

Building the table is super simple now that you have all of the calculations (this is why I create all of them individually).  Download the workbook here to see how all of this was built.