November 25, 2010
The CDC publishes an annual report on Health in the United States and included in the report is a “Chartbook”. It’s 574 pages long, but you can skip to page 32 for the start of the charts. There are some quite horrendous charts, especially the pie charts, that you will get a kick out of.
You can download the data on CDC Wonder. Once you create your query, you get a spreadsheet of the results, a map, and a bar chart. The bar chart is particularly poor and only allows you to pick two dimensions.
I have downloaded the data and produced an interactive dashboard via Tableau Public. Within this dashboard you can filter by Gender, Age, State and Disease. In the end, I have included all of the views from CDC Wonder, plus much more.
- The infection rate for the total US has continued to climb for all diseases combined. This is largely due to Chlamydia.
- Syphilis infection rates declined from 1996 to 2001, but have climbed ever since. Particularly concerning is the rate in Washington, DC.
- In fact, Washington, DC has the highest infection rate for all three diseases.
- Alaska’s overall infection rate is twice the national average, with the Chlamydia rate 86% higher than the national average. This is definitely worth looking into.
- The overall infection rate for females is more than double that for males.
- Females between the ages of 15-24 are most likely to get infected, while males are most likely to get infected between the ages of 20-24.
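Comparisons like "86% higher than the national average" come down to one small piece of arithmetic. Here is a minimal sketch of it; the function name `pct_above` and the rates are hypothetical, picked only so the numbers are easy to follow, and are not taken from CDC Wonder.

```python
def pct_above(rate, baseline):
    """Percent by which `rate` exceeds `baseline` (negative if it is below)."""
    return (rate - baseline) * 100.0 / baseline


# Hypothetical infection rates per 100,000 people, for illustration only.
national_rate = 400.0
state_rate = 744.0

print(f"{pct_above(state_rate, national_rate):.0f}% above the national average")
```

A rate that is twice the national average shows up as 100% above it, so "twice the average" and "86% higher" are measured on the same scale here.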
There are many more observations and insights to be gleaned from this dashboard. It is considerably quicker to identify outliers and trends with a simple dashboard like this than with CDC Wonder. Imagine how much more useful the “Chartbook” would be if the CDC used Tableau.
What other observations can you make?
November 24, 2010
Simon Rogers from the Guardian created a visualization of failed banks in the US using Many Eyes. The article can be found here. Before I critique the visualization, take a minute to interact with the bubbles.
A quick word about data integrity. The article on the Guardian references the data going back to 1935, and in fact, the data compiled does go back to 1935. I made the assumption that the viz created by Simon went back that far as well, but then I couldn’t find a year filter. If you look at the data that built the bubble chart, it only covers 2008-2010 and all three years are combined. But 2010 isn’t even a complete year. Come on already! This can be extremely misleading and should be clearly noted, but it’s not!
The initial view above is assets in failed banks as dollars per person.
- The huge bubble for Nevada surely stands out, but why not a simple bar chart?
- Notice that Total is listed as a State; that doesn’t make any sense.
- Which state is #2? How about #5? It takes some work.
- Do we really care about all 50 States or maybe just the top 10?
- How much bigger is Nevada than the #2 state?
It’s so much easier to compare the size of the bars than the size of the bubbles. From the bar chart, you can easily see the rank and the relative size of each bar. It turns out that Nevada is 20x larger than Alabama. There’s absolutely no way you can identify that in the bubble chart.
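The questions above (who is #2, is the top 10 enough, how much bigger is #1) are answered by dropping the Total row, sorting, and drawing bars from zero. Here is a rough sketch of that in plain Python. The dollar figures are hypothetical and chosen only so that the ratio matches the 20x mentioned above, "State C" and "State D" are placeholders, and `rank_states` and `text_bar_chart` are names I made up for illustration.

```python
# Hypothetical (state, dollars per person) rows, including the aggregate row.
rows = [
    ("Total", 1500.0),
    ("State C", 150.0),
    ("Nevada", 4000.0),
    ("State D", 120.0),
    ("Alabama", 200.0),
]


def rank_states(rows, top_n=10):
    """Drop the aggregate row, sort largest first, and keep the top N."""
    states = [(name, value) for name, value in rows if name != "Total"]
    states.sort(key=lambda pair: pair[1], reverse=True)
    return states[:top_n]


def text_bar_chart(ranked, width=40):
    """Render zero-based horizontal bars, scaled to the largest value."""
    largest = ranked[0][1]
    lines = []
    for position, (name, value) in enumerate(ranked, start=1):
        bar = "#" * round(value / largest * width)
        lines.append(f"{position:>2}. {name:<8} {bar} {value:,.0f}")
    return lines


ranked = rank_states(rows)
print("\n".join(text_bar_chart(ranked)))

# How much bigger is #1 than #2?
ratio = ranked[0][1] / ranked[1][1]
print(f"{ranked[0][0]} is {ratio:.0f}x {ranked[1][0]}")
```

Because every bar starts at zero, the bar lengths carry the ratio directly, which is exactly what the bubbles fail to do.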
Change the Bubbles Size option to Number of failed banks. Holy smokes! What state is collapsing? Oh, it’s not a state; it’s that pesky Total again. The Total completely distorts the view and makes all other comparisons impossible. Again, a simple bar chart will suffice.
Finally, since there are two data points that are highlighted by the article (assets per person and number of failed banks), a scatter plot provides one of the best means of seeing the relationship between the two. In this view, you immediately see the five outliers I have labeled below.
Scroll through the other filters and you continue to see that including Total as a State completely wrecks any insight that could be gleaned from the viz.
The viz below was built with Tableau Public and it includes the data all the way back to 1935. However, I decided to only focus on the last 20 years; this time period represents the most volatility since the Great Depression.
NOTE: 2005 and 2006 are not included since there were not any bank failures listed on the FDIC website for those years. I also excluded 2010 since the year is not complete.
There are three visualizations included. The line chart (and size) represents the number of bank failures. The color indicates the estimated loss (adjusted to the value of the dollar as of 31 Dec 2009). When you choose a Year, you will get the corresponding map and bar charts. The map and bar chart are sized and colored in the same manner as the line chart.
Naturally, I went straight to 1989. Texas had 224 bank failures! Then I went to the surrounding years and Texas was at the top of the list again. It turns out that there was a banking collapse in Texas from the mid-1980s to the early 1990s.
According to the Dallas Morning News: “In the state's 1980s collapse, an energy bust and a subsequent real-estate wreck leveled hundreds of Texas banks, including longtime pillars of the economy.”
November 19, 2010
It doesn't really matter if we like you.
It matters if we like your work.
[Surprisingly, the converse of this rule also works].
Sometimes it seems as though people who are really concerned about one would be better off focusing on the other.
Great advice for a consultant, don't you think?
November 16, 2010
My daughter had a soccer game last weekend across town early in the morning and the weather was predicted to be quite cold. Naturally I went to weather.com to check the hourly forecast, but this time something struck me.
Notice the vertical scale. It’s not zero-based. Sure, it’s simply showing the changes in temperature, but as I scrolled through the pages the axis values changed; that is, the range did not stay consistent. I also noticed that 12am is repeated, which is kind of odd. Fusion Charts is their tool of choice.
I would have used Tableau to create a simpler chart. Unfortunately I lose the nice pictures across the top of each hour, which I really like, and the gentle shading (though why use gold for night hours…doesn’t gold mean sunny?), but I gain a zero-based scale and a line that I can color based on temperature, with the mid-point at 32 degrees. Below 32 = red, above 32 = green.
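The coloring rule is simple enough to sketch outside of Tableau. The snippet below is only an illustration in plain Python: the hourly temperatures are hypothetical, `color_for` is a name I made up, and treating exactly 32 degrees as a neutral gray is my assumption, since the rule above only covers below and above.

```python
FREEZING_F = 32  # mid-point of the color scale, in degrees Fahrenheit


def color_for(temp_f, midpoint=FREEZING_F):
    """Below the mid-point is red, above it is green."""
    if temp_f < midpoint:
        return "red"
    if temp_f > midpoint:
        return "green"
    return "gray"  # assumption: the mid-point itself gets a neutral color


# Hypothetical hourly forecast, for illustration only.
forecast = [("6am", 28), ("9am", 32), ("12pm", 41), ("3pm", 47), ("6pm", 38)]

for hour, temp in forecast:
    # Zero-based scale: one character per degree, measured from 0.
    # (Temperatures at or below zero would draw no bar in this sketch.)
    bar = "#" * temp
    print(f"{hour:>4} {bar} {temp}F ({color_for(temp)})")
```

Since each bar is measured from zero, the same scale applies to every page of the forecast, so the range cannot shift from one view to the next.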
In this view the variances in the temperatures are even easier to see. You can see the huge change from 6am to 3pm and then the dramatic drop as sunset approaches. Which view works best for you?
- 2:00 – Registration
- 2:30 – The End of BI as You Know it
- 3:00 – Tableau 6.0: Speed, Power and Style
- 4:00 – Wrap-up
- 4:30 – Reception: Cocktails, Networking and Hands-on Demos
November 13, 2010
The November Atlanta Tableau User Group (ATUG) meeting included over 30 people from industries including transportation, social media, consumer packaged goods, and data visualization consulting, just to name a few. Over half the group had downloaded Tableau 6.0 the same day as the meeting.
Our last three user group meetings have all included hands-on exercises and this time we challenged the groups to come up with a dashboard within 30 minutes based on 50 years of crime data. That might seem like a short amount of time, but that’s the point. We want the membership to realize the power of Tableau to let you gain rapid-fire insights. Almost half the group were new users, so having them work with Tableau and putting the power in their hands is the best way to sell the product.
We formed three teams of five (yes, I know that doesn’t add up to 30; there were people that left and others that hovered) and told the teams that the best viz would win a prize…t-shirts donated by Tableau.
First place went to Team 2 (as voted by their peers).
Team 1 came in a close second place with their dashboard that contains action filters on each sheet.
Well done to each team. I’m looking forward to our next meeting on January 20th. Remember to bring a friend.
November 12, 2010
In this case, their use of the pie chart is somewhat acceptable because:
- There are only three data points.
- The chart starts at the zero position.
- The largest slice is first (though it would be better if they were in descending order all the way around).
- The results can be easily discerned.
November 6, 2010
When I first saw this I thought "Wow! What a huge variance over the years!" But then I looked a bit closer and saw that:
- The years are backwards. A time-series line chart should nearly always start with the oldest time period on the left. I can't even think of a way to interpret time backwards. Maybe the DeLorean from Back to the Future could help.
- The Y axis does not start at zero. This creates a misleading variance. It appears there has been a 700% variance from highest to lowest, but really it's only 35%.
- The Y axis labels should be rounded to whole numbers; the decimals are unnecessary precision.
- I find myself having to refer back to the legend to remind myself which sex is represented by which color. They are way too close in hue. Why not use blue for men and pink for women?
- The Years on the X axis are at an angle and squished together. If you must show all of the years, then turn them a full 90 degrees. In the end though, I believe the purpose of the chart is to show a trend, so I don't need to see all of the years, just enough so that I know it's a regular interval.
- One more thing. It's very subtle. This is NOT a regular interval after all. Between 1890 and 1940, there is only one measure per decade. Only beginning in 1947 is there data for every year. I would only display 1947-2003.
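The gap between the apparent 700% and the real 35% comes entirely from where the axis starts. Here is a small worked example; the three values are hypothetical (they are not read off the chart) and were picked only because they reproduce those two percentages, and `pct_difference` is a name I made up.

```python
def pct_difference(high, low, baseline=0.0):
    """Percent by which `high` exceeds `low`, with heights measured from `baseline`."""
    return (high - low) * 100.0 / (low - baseline)


# Hypothetical values, for illustration only.
highest = 27.0
lowest = 20.0
axis_start = 19.0  # where the truncated Y axis begins

actual = pct_difference(highest, lowest)                 # measured from zero
apparent = pct_difference(highest, lowest, axis_start)   # measured from the axis start

print(f"actual: {actual:.0f}%, apparent: {apparent:.0f}%")
```

The closer the axis start sits to the lowest value, the more the visible difference is inflated, which is why a zero-based axis is the safer default for a chart like this.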
November 5, 2010
November 3, 2010
November 2, 2010
According to Transparency International:
- The 2010 Corruption Perceptions Index shows that nearly three quarters of the 178 countries in the index score below five, on a scale from 10 (highly clean) to 0 (highly corrupt). These results indicate a serious corruption problem.
To summarize the 2010 results:
- Denmark, New Zealand and Singapore are tied at the top of the list with a score of 9.3, followed closely by Finland and Sweden at 9.2.
- The most corrupt country is Somalia with a score of 1.1. Only slightly less corrupt are Myanmar and Afghanistan, with a score of 1.4, and Iraq at 1.5.
What is immediately obvious to me:
- The rankings haven't changed much over the past three years.
- You should avoid nearly all of Africa and Asia.
- Western Europe, particularly the Scandinavian countries, is relatively devoid of corruption.
If you want to see a horribly created bubble chart from which you cannot infer anything, go to Many Eyes.