Data Viz Done Right

July 27, 2012

Tableau Tip: Using the TOTAL function to summarize dimensions

8 comments
UPDATE (7 Sept 2016): There is now a much easier way to do this with LOD expressions. I have created a video tutorial an example here.



One of the great benefits of working with brilliant people is that you get challenging questions. When those questions are about Tableau, I’m nearly always convinced that what the person is trying to do can be done.

Wednesday was another great example.  Assume you have a table that measures sales by year by customer.  You’re only interested in customers that have more than $35,000 in sales per year.

image

And you don’t care who the customers are, just how many.  Restating the question: How many customers have bought over $35,000 of products in a year and what was the total value of those sales?  This is a simple question, but it wasn’t as easy to do in Tableau as I thought.

My initial thought was to simply change the Customers field to a distinct count, but that didn’t work because my individual customers are no longer listed.  Ok, so I need all of the customers, but I don’t want to see them.  Basically I’m looking for one row for each year.  Each row will list year, # of customers over $35K, and total sales for those customers.

Here’s how to do it.  I’m going to assume that you are starting with a view like this.

image

Step 1 – Create a calculated field to count all of the customers in the view (note that I’m working with an Excel data source, so COUNTD is only available if I make it an extract).

image

Step 2 – Drop the Customer Count measure onto the data area.  You will see the Customer Count field appear in the Measure Values shelf.

Step 3 – Right-click on the Customer Count measure in the Measure Values shelf and choose Edit Table Calculation.
  1. From the Compute Using list, choose Advanced…
  2. Move the Year of Order Date and Customer fields to the right and click OK.

    image
  3. On the Table Calculation window, change the Restarting every option to Year of Order Date and click OK.

    image

Your view should now look like this.  Notice how the Customer Count field shows the same total number of Customers in each row for each year.


image

Step 4 – Create a calculated field to total all of the sales in the view

image

Step 5 – Drop the Total Sales measure onto the data area. You will see the Total Sales field appear in the Measure Values shelf.

Step 6 – Right-click on the Total Sales measure in the Measure Values shelf and choose Edit Table Calculation.
  1. From the Compute Using list, choose Advanced…
  2. Move the Year of Order Date and Customer fields to the right and click OK.

    image
  3. On the Table Calculation window, change the Restarting every option to Year of Order Date and click OK.

    image

Your view should now look like this. Notice how the Total Sales field now shows to total sales by year on each row.

image

Step 7 – Drop the Number of Records measure onto the data area. You will see the Number of Records field appear in the Measure Values shelf.

Step 8 – Right-click on the Number of Records measure in the Measure Values shelf and change the aggregation from Sum to Average.

Step 9 – Right-click on the Number of Records measure in the Measure Values shelf and choose Add Table Calculation.
  1. Change the Calculation Type to Running Total.
  2. From the Running Along list, choose Advanced…
  3. Move the Year of Order Date and Customer fields to the right and click OK.

    image
  4. On the Table Calculation window, change the Restarting every option to Year of Order Date and click OK.

    image

Your view should now look like this. Notice the running total we’ve added and how it restarts every year.  What we’re doing here is simply counting the number of rows in each pane.

image

Step 10 – Cleanup time!
  1. Right-click on the Number of Records pill and choose Filter.  Change the filter to be At Most 1.  This will now keep only the first row in each year. 

    image

    Notice how our Customer Count and Total Sales fields didn’t change. This is because we still have the Customers dimension in the view.
  2. Drag the Number of Records pill off of the Measure Values shelf.  Be sure to keep it on the filter shelf though.

    Note: Steps 7 to 10.1 could all be combined into one step if we put the Number of Records field in the filter immediately.  I didn’t do that because I wanted to demonstrate how we’re leveraging the running totals calculation in this example.
  3. Move the Customers dimension from the Row shelf to the Level of Detail shelf.
  4. Remove Sum of Sales from the view.  It’s no longer relevant.
  5. Right click on the two headers for the measures and rename them (Edit Alias).
That’s it!

image

I have a sneaking suspicion that I’m overlooking a very simple way to do this, so if I am, please let me know.  Either way, this works.

Download the workbook here.

8 comments :

  1. Hi Andy,

    I've got a few ideas on how to simplify this:

    The Advanced Compute Using you set up does the same thing as just setting the Compute Using to Customer. The Compute Using determines the addressing (what combinations of non-aggregated discrete values to calculate results for) and then all the other non-aggregated discrete values in the view get used for partitioning (when to restart the calculations). So setting the Compute Using to Customer sets the partitioning to Year.

    I've always thought the UI (and documentation) are a little confusing, because they use a variety of terms to mean the same things: Compute using/compute along/addressing are all synonyms, and partitioning/restarting every are synonyms. The Restarting Every option is just overriding Tableau's automatic partitioning with your choice of partition.

    Also, instead of using a TOTAL(COUNTD()) which requires Tableau to hit the database, if you just want a count of records in the partition the SIZE() function does just that.

    Finally, to reduce the number of rows returned (to prevent overlapping text) the easiest technique is to use the FIRST()==0 test. This returns True for the first row in the partition and False for everything else. So you can put this on the Filter Shelf, filter for True, and only return one row. However, that's one extra calculation to have to remember to set the right Compute Using for, and there might be some performance issues because Tableau evaluates all table calculations before applying the table calc filters. Alternatively, you can embed the table calc filter in the calc, like this one for the customer count:

    IF FIRST()==0 THEN SIZE() END

    With the Compute Using set to Customer, even though the IF() statement is evaluated for every customer, the SIZE() calc is only run once and Null is returned for all other rows. For the total, I used a similar calc:

    IF FIRST()==0 THEN TOTAL(SUM([Sales])) END

    Here it is on Tableau Public: Using SIZE() and a built-in filter.

    ReplyDelete
  2. Andy, that was a very creative route you found. Sometimes, whatever works is good enough, because there is always more than one way to get the final result.

    Jonathan, I do not know of a more efficient route to achieve this result with table calcs, well done. Now the next challenge is to chart these values in a line chart. Year on the Columns, both table calc measure pills on the rows. Then once you get that, modify the data so Clay (the first in 2012) is also in 2010 with Sales over $35K.

    ReplyDelete
  3. Jonathan, I went through your tips and added them to my viz to be sure I understood how they worked. Thanks again!

    ReplyDelete
  4. Thanks a lot Andy and Jonathan for the tips. That's really insightful on how Tableau handles the calculation behind the scene.

    ReplyDelete
  5. Hi guys. I achieved the same results by simply adding the count (using SIZE()) and the Total Sales, then I:
    i) moved customers to the level of detail
    ii) added Measure Names on the Columns shelf and just dragged the column border so to hide it
    It worked just as well - am I missing anything?
    Thanks,

    ReplyDelete
  6. Hi Andy, my client wants to view the quarterly and yearly reports as (running product of (1 + monthly returns)-1). Could you please guide us in this regard? Thanks in advance

    ReplyDelete
  7. Hi Andy, love your blog! I had a question: Is it possible to turn this into a visual: a combined bar and line chart with dual axis? Tried doing it but since we have the Customers in the Detail view, it obviously breaks it down and removing Customers from the view is not an option. I would like to see a chart based on the above final view minus the Customers' information. Would it be possible? Many thanks in advance.

    ReplyDelete
    Replies
    1. I was able to find a much simpler solution with LOD expressions which then also allowed me to create a visual view instead of a table. I've updated the workbook on Public which you can get here:

      https://public.tableau.com/views/CustomerCountSummary/Sheet3?:embed=y&:display_count=yes&:showTabs=y

      Look at Sheet 3. I'll make a note to create a video for how to do this.

      Delete