Data Viz Done Right

# Using lines with unequal intervals can mislead: ESPN reached 50,000 episodes of SportsCenter

ESPN aired the 50,000th episode of SportsCenter last night, an incredible accomplishment for the self-proclaimed Worldwide Leader. That works out to roughly 1,500 episodes a year for the 33 years of ESPN's existence. But as we can see below, those 50,000 episodes have not been distributed evenly.

Followed by this chart:

I’ll give you a couple of minutes to figure out what’s wrong…

Give up?

Cork Gaines is using a line to connect uneven intervals. While it looks like a steady increase, was it really? Were there the same number of episodes every month between September 1979 and December 1998? I highly doubt it.

I recreated the data in Excel, and lo and behold, when I build a line chart with these six data points, it looks identical to Cork’s.

Clearly Cork used Excel to create the chart.  And clearly he didn’t know that he should not use a line to connect unequal intervals of time.

There are a few basic guidelines for line charts (Stephen Few):

1. Lines should only be used to connect values along an interval scale (with a couple of exceptions).
2. Intervals should be equal in size.
3. Lines should only directly connect values in adjacent intervals.

Cork’s chart breaks all three guidelines.
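The second guideline is easy to check before you build a chart. Here is a minimal sketch in Python, assuming the time stamps are available as `date` objects. The function names, the 31-day tolerance, and the sample dates are all my own illustrative choices, not the data behind Cork's chart.

```python
from datetime import date


def interval_lengths(dates):
    """Return the gap in days between each pair of consecutive dates."""
    ordered = sorted(dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def has_equal_intervals(dates, tolerance_days=31):
    """Return True if the longest and shortest gaps differ by at most tolerance_days."""
    gaps = interval_lengths(dates)
    if not gaps:
        # Zero or one date: there are no intervals to compare.
        return True
    return max(gaps) - min(gaps) <= tolerance_days


# Hypothetical milestone dates, for illustration only.
milestones = [date(1979, 9, 1), date(1988, 9, 1), date(1998, 12, 1), date(2002, 6, 1)]

print(interval_lengths(milestones))     # gaps in days between milestones
print(has_equal_intervals(milestones))  # False here, so a plain category-axis line would mislead
```

If the check fails, the dates should either be plotted on a true date axis or shown with a chart type that does not imply evenly spaced observations.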

The best way to represent time-series data with unequal intervals of time is with a bar chart.

1. Andy,

There's still a problem and that is that you are not making it clear that there is lots of missing data. On the chart, the gap between 1979 and 1988 is drawn the same width as the gap between 1998 and 2002.

Yes, we really need number of episodes for every year to do this right, but short of that I think you need to show there's a much wider time interval between the first three data elements.

Steve

2. Andy, you are correct that a line chart is not ideal for data with irregular intervals, but a bar chart is not the correct answer either.

If you have Stephen Few's book "Now You See It", check out Chapter 7, Time-Series Analysis. If you don't have the book, let me know and I'll gift you a copy.

There Stephen recommends a dot plot as an option to display the data without conveying information that is not known.

see http://public.tableausoftware.com/views/dotplotforandy/DotPlot for an example.

3. Great example! Thanks Joe!

Steve, great feedback as well...thanks!

4. Joe, I do have the book, thanks for the offer anyway. I was flipping through it looking for his discussion on this topic too, but couldn't find it. I'll dog-ear it when I get back to the office.

5. Ignoring the issue of unequal intervals for a moment, I think a line chart does a better job communicating the idea that we're looking at a running total.

At first glance, the column chart might suggest that we're looking at the number of episodes aired during each of these periods. Obviously ESPN can't air SportsCenter 137 times per day (though casual viewers of ESPN may consider this an under-estimate).
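The difference between a running total and a per-period count can be made concrete by converting cumulative milestones into an average daily rate for each period. This is a rough sketch in Python. The function name `episodes_per_day` and the sample numbers are hypothetical and are not the actual SportsCenter figures.

```python
from datetime import date


def episodes_per_day(milestones):
    """Convert (date, running_total) pairs into average episodes per day for each period."""
    ordered = sorted(milestones)
    rates = []
    for (start, total_at_start), (end, total_at_end) in zip(ordered, ordered[1:]):
        days = (end - start).days
        if days == 0:
            raise ValueError("two milestones share the same date")
        # Difference of running totals divided by elapsed days gives the period's rate.
        rates.append((start, end, (total_at_end - total_at_start) / days))
    return rates


# Hypothetical running totals, for illustration only.
sample = [
    (date(2000, 1, 1), 0),
    (date(2000, 1, 11), 20),
    (date(2000, 1, 31), 30),
]

for start, end, rate in episodes_per_day(sample):
    print(start, end, rate)
```

Plotting the rate instead of the running total makes it obvious that each bar describes a period rather than a cumulative count, and dividing by elapsed days keeps unequal periods comparable.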