April 29, 2013

Arctic Sea Ice Volume: A Radar Graph vs. Line Graphs

UPDATE (May 2, 2013): I have removed 2013 from all charts.

I posted an article on my Facebook page the other day asking whether this radar graph of Arctic sea ice volume works.  The comments were mixed.

I think this radar graph is merely ok.  Radar graphs, in general, are hard to read because relationships and trends are not easily discerned.  Some of the issues I see with this graph include:

  • It’s difficult to see the entire pattern over time.  I can see that the pattern is spiraling in, but how does it change year by year?  Ask yourself this: How does 1999 compare with 2004?  It takes a lot of work to answer.
  • It’s very hard to follow the ice volume labels around the chart.  I bet you found this as well in the example above.
  • The author only included September.  Why?  The data is available by day.  Get the data here.
  • Are there any seasonal patterns?  You can’t answer this question.

I created a couple of different views for your consideration in Tableau.  You can download the workbook here.

In my version, I’ve addressed all of the problems I mentioned above.

  • In the upper left graph, you can easily see that the Arctic sea ice volume is trending down.
  • The upper right chart allows you to see two things:
    1. The seasonal patterns
    2. Comparisons by year:  The comparisons across years can be seen because the years get darker as the data gets closer to 2013.  You can see the lines get darker as you look down the graph.
  • The Daily Volume graph makes the cyclical patterns much, much clearer.
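
If you want to try the same reshaping outside of Tableau, here is a minimal Python sketch of the idea behind these views: one line per year plotted by day of year (for the seasonal comparison) and one value per year for a single month (for the trend). The function names and the sample values are made up for illustration; they are not the real sea ice data and not how the workbook was built.

```python
from collections import defaultdict
from datetime import date


def by_year_and_day(readings):
    """Group (date, volume) readings into {year: [(day_of_year, volume), ...]}.

    Each year becomes one line, so seasonal patterns line up on the same axis.
    """
    lines = defaultdict(list)
    for day, volume in readings:
        lines[day.year].append((day.timetuple().tm_yday, volume))
    for points in lines.values():
        points.sort()  # order each line by day of year
    return dict(lines)


def monthly_mean_by_year(readings, month):
    """Average volume for one calendar month in each year (the trend view)."""
    buckets = defaultdict(list)
    for day, volume in readings:
        if day.month == month:
            buckets[day.year].append(volume)
    return {year: sum(v) / len(v) for year, v in sorted(buckets.items())}


# Hypothetical values for illustration only, NOT real measurements.
sample = [
    (date(2011, 3, 15), 22.0), (date(2011, 9, 15), 4.5),
    (date(2012, 3, 15), 21.5), (date(2012, 9, 15), 3.5),
    (date(2012, 9, 16), 3.0),
]

lines = by_year_and_day(sample)               # one series per year
september = monthly_mean_by_year(sample, 9)   # one September value per year
```

Keeping the daily data and deriving the September view from it means you can answer both questions (seasonal pattern and year-over-year trend) from one source.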

How would you visualize this data?  Do you agree that these line graphs work better than the radar graph?

April 23, 2013

Notes from the Visual Business Intelligence Workshop: Day 3 – Now you see it


Day 3 of the Visual BI workshop was the day I was looking forward to the most.  I was very interested in hearing Stephen’s approach for analyzing data, which he covers in his book Now you see it.  There was one common theme throughout: keep things simple and clear, but don’t dumb it down.

Here are my key takeaways/notes:

  • The word “see” in the title represents the analytical thought process: Search => Examine => Explain (SEE)
  • Things to look for in a skilled data analyst: interested in the data, curious, self-motivated, imaginative, open minded and flexible, skeptical, honest, has a sense of what’s worthwhile, attentive, methodical, analytical, synthetical, has an eye for patterns, knowledge of the data, knowledge of effective data analysis practices
  • The context we perceive is influenced by the surroundings.
  • Exceptions can be a result of:
    1. Erroneous data
    2. Extraordinary events
    3. Extraordinary entities
    4. Randomness
  • Highlight exceptions that are out of the range of “normal” or “standard”
  • Always ask “Compared to what?”
  • The tools we use need to make common interactions easy.  The tools should allow the train of thought to continue.
  • Cycle plots are useful for cyclical and linear patterns.
  • Linear trend lines on time series can be misleading; use with caution!  Consider moving averages as an alternative.
  • Log scales are useful for measuring rates of change.  Lines with similar slopes will have similar rates of change.
  • When looking for leading and lagging indicators, it can be useful to shift the time of one of the indicators.
  • Bump charts are a good way to see how rankings change across different dimensions or measures.  Learn how to build one in Tableau here.
  • The mean represents the quantitative center and is highly influenced by outliers.  If you want to look at dispersion around the mean, use standard deviation.
  • The median represents the ordinal center and is better than the mean for showing the “typical” value.  If you want to look at dispersion around the median, use percentiles.
  • It’s a good idea to start an analysis by looking at a distribution of all values.  This will help you quickly identify outliers and the overall shape of the data.
  • You shouldn’t remove outliers from an analysis until you understand why they are outliers.
  • This is a really cool analysis of pay ranges by level and gender.  You could easily include a strip plot on this.
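
To make the mean vs. median notes above concrete, here is a small Python sketch using only the standard library. The numbers are hypothetical and the `summarize` helper is my own naming, not something from the workshop; the point is simply to watch how one outlier moves each measure.

```python
import statistics


def summarize(values):
    """Return the center and spread measures mentioned in the notes."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),    # dispersion around the mean
        "median": statistics.median(values),
        "q1": q1,                             # dispersion around the median
        "q3": q3,
    }


# Hypothetical values for illustration only.
typical = [40, 42, 45, 47, 50]
with_outlier = typical + [500]

before = summarize(typical)
after = summarize(with_outlier)

# The mean jumps when the outlier is added; the median barely moves.
print(before["mean"], after["mean"])
print(before["median"], after["median"])
```

This is also why you should understand an outlier before removing it: the summary you pick decides how much that one value shapes the story.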

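The note above about considering moving averages instead of linear trend lines can be sketched in a few lines of Python. This is a plain trailing moving average under my own assumptions (the function name, the window size and the sample series are all hypothetical), not a method prescribed in the course.

```python
def moving_average(values, window):
    """Trailing moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest value, drop the oldest one.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Hypothetical series with one spike, for illustration only.
series = [10, 12, 11, 13, 40, 12, 14]
smoothed = moving_average(series, 3)
```

Unlike a single straight trend line, the smoothed series still bends where the data bends, so a spike or a change in direction stays visible instead of being averaged into one slope.
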

That’s it!  Three days of learning that I’ll never forget.  These courses were easily worth the money.  You’ll be able to apply so much immediately upon returning to your regular job.

April 22, 2013

Notes from the Visual Business Intelligence Workshop: Day 2 – Information Dashboard Design

Day two of Stephen Few’s three-day Visual Business Intelligence Workshop centered on his book Information Dashboard Design.  This class included quite a bit of critiquing of dashboards from “BI” vendors, a look at some of the better work, and a bit of hands-on work creating our own designs.

Like day one, these are the key points I wrote down (nowhere near the entire content of the course) that we should all reinforce in our work.

  • Well designed dashboards and well designed software allow for rapid visual monitoring, which has a three-phase analytical approach:
    1. Scan the big picture
    2. Zoom in on important details
    3. Link to supporting detail
  • The visual display of a dashboard needs to match the reader’s mental model.  If the reader does not have a mental model, then you should sit with them to develop one.  Avoid asking them “What do you want your dashboard to look like?”  Instead, get a sense of what questions the reader expects to be able to answer.
  • Be aware of the 13 common mistakes in dashboard design
  • A great way to convince people of how simple data visualization can be is through Stephen’s “Graph Design IQ Test”.  Answering the questions wrong is pretty funny.
  • There are four characteristics of a good dashboard design:
    1. Exceptional organization
    2. Data is condensed in summaries
    3. Data is specific to and customized for the task at hand
    4. Displays are concise and clear, and often use small display mechanisms
  • Never ask people what they want their dashboard to look like.
  • Common dashboard data consists of:
    1. Measures of what’s currently going on
    2. Each compared to something to provide context
    3. Each evaluated to declare its qualitative state
  • Don’t design a dashboard only to highlight problems and exceptions.  The dashboard should be meaningful even when all is well.
  • Objectives of visual design:
    1. Eliminate clutter and distraction
    2. Group data into logical sections
    3. Highlight what’s most important (Place what’s always important on the upper-left)
    4. Support meaningful comparisons / give your data context (this was a theme that came up over and over again)
    5. Design for aesthetic appeal (but don’t add fluff to add fluff)
      • Use soft, natural colors
      • Soften the background of the dashboard (Stephen likes to use a soft yellow)
      • Charts and text should be crisp and clear
      • Use good fonts (stick to Sans Serif on dashboards)
      • Only include one font style per screen
    6. Navigating to additional information needs to be easy and should support our train of thought.
      1. Scan the big picture
      2. Zoom in on important specifics
      3. Link to supporting details

April 18, 2013

Stephen Few’s Financial Statement Bullet Graph – Every CFO should have one of these!


Stephen showed this incredibly intuitive example of a financial statement built with bullet charts during the Information Dashboard Design course on Wednesday.  I love how it clearly walks you through the statement.  It almost reminds me of the NCAA tournament bracket.

With Stephen’s permission, here it is. Click on it to see a larger, crisper version.

Financial Statement Bullet Chart

April 17, 2013

Notes from the Visual Business Intelligence Workshop: Day 1 – Show Me the Numbers

Stephen Few is running his three-day Visual Business Intelligence Workshop this week in Austin, TX.  He and I had been emailing back and forth and he thought maybe this wouldn’t be a worthwhile course for me, but I told him that I strongly believe that seeing someone teach in person is way better than reading books and sitting on webinars.  You simply can’t get the same level of interaction and communication without attending or conducting training live, in person.  I see this every time that I run training classes; those classes that are in person are way better.

Day one of the course, based on Stephen’s book Show Me the Numbers, has exceeded my expectations.  I’ve read all of Stephen’s books, so yes, much of the material was a repeat, but the discussions in the class and hearing Stephen explain his beliefs in detail extended the content way beyond what the book could possibly cover.

Here are some of the key reinforcements and takeaways I noted from Tuesday’s class.  Many of these are “duh, of course” type of notes, yet good to always be reinforced.  This is a brain dump, so don’t expect any semblance of fluidity in my notes.

  • You should strive to include context and comparisons in every table or chart you create.
  • Line charts do NOT have to start at zero because you’re looking at patterns, unlike bars, which must start at zero because you’re comparing lengths. 

This was a big question I had before the course.  I typically create line charts that include zero, but now I understand better why that’s not always necessary.  In the end, the patterns of the lines are easier to see when you do not start at zero.  Be careful though: you may have to alert your audience that the axis does not start at zero if they’re unfamiliar with the data.

  • There are eight types of relationship graphs:
    1. Time series
    2. Ranking
    3. Part-to-whole or contribution
    4. Deviation
    5. Distribution
    6. Correlation
    7. Geospatial (note this is different than geographical)
    8. Nominal comparison
  • “Don’t bury the truth under a layer of beauty or abstraction.”
  • There are four primary methods for encoding values:
    1. Points
    2. Lines
    3. Bars
    4. Boxes (which represent ranges of low values to high values)
  • is a great resource for understanding color choices.
  • Adding data points to line charts is good for making comparisons between lines, but keep them light & small.  Don’t use separate shapes if you’ve colored the lines.
  • Bubble size is appropriate to use for data points if precise comparisons are not required.
  • Avoid dual-encoding.  I’m glad I asked about this, because it’s a practice I employ, but no longer will. 

As an example, if you’re looking at a map that has a circle for each state that is sized by sales, you should not also color the circle by sales.  If you want to use color, it should be another element that adds to the interpretation (like profit ratio).

  • This one shocked me – It’s ok to use pie charts on maps (assuming there are only two or three slices) because there’s no better way to subdivide bubbles on a map.  In other words, pie charts are your only choice in this case.
  • Three good examples of representing a single distribution are histograms, frequency polygons and strip plots.  I’ve never used the latter two, so I’m going to be looking into those more.
  • Colors:
    1. Use only soft, natural colors.  Tableau’s medium palette works well.
    2. Use fully saturated colors only for emphasis; otherwise they become visually exhausting.

Of course there was way more content than this; these were merely the key points that I wanted to ensure I reinforced to myself.  Look for summaries of the next two courses as the week progresses.

Our Gartner story: What CIOs can learn from Facebook's success with data discovery tools

Below is a reprint of an article written by Nicole Laskowski, Senior News Writer for SearchCIO.  Nicole did an incredible job summarizing the talk that Namit Raisurana and I gave at the Gartner BI Summit a few weeks ago.

Facebook Inc. tracks more than 1 billion active users on its site every month -- up from 600 million at the end of 2008. More users mean more data means more storage space. And for the Menlo Park, Calif.-based company, which started life in 2004, that growth also represents an ever-expanding data trove to mine for business and customer insights.

Up until a year ago, the world's largest social networking platform relied on a homegrown reporting business intelligence (BI) tool, as well as tools from MicroStrategy to dig through data. Both, however, required technical expertise, causing a bottleneck between BI specialists and would-be analysts. "MicroStrategy is a good tool, but [it created] a dependency on the developers," said Namit RaiSurana, BI engineer for Facebook at last month's Gartner BI Summit.

RaiSurana and his team searched for a solution that would close the gap, and landed on what is becoming a common cure for analysis paralysis: data discovery tools. By transforming data into rich, readily consumable visualizations, these tools allow business users to analyze complex data sets without having to be trained extensively in data science. But the business-friendly tools -- already vying to become the Excel spreadsheet of the big data age -- are a double-edged sword.

"Data discovery tools give business users more flexibility and give them control," said Rita Sallam, research vice president for Stamford, Conn.-based Gartner Inc. "That being said, now there's the possibility of business users creating calculations different from those sanctioned by the enterprise."

The tools tend to be easy to use and put power into the hands of the business user, but without support from IT, they can also create silos, Sallam said. The challenge for IT business leaders is this, she said: "How can you have that balance where you want to meet the business needs, but at the same time, have some sort of mechanism in place to achieve some sort of level of governance?"

For Facebook, the answer was investing in a community where users could ask questions, challenge themselves, build up their skill sets and increasingly become more independent from BI developers. Facebook selected tools from Seattle, Wash.-based Tableau, whose popular software stemmed from research done at Stanford University between 1997 and 2002. What came next for Facebook was a search for the experts who could help implement the tool. That led RaiSurana to several new hires, including Andy Kriebel, author of the blog VizWiz and a co-presenter at the Gartner summit. Before Facebook wooed him away, Kriebel worked for Coca-Cola, where he provided promotion analysis.

"Namit [RaiSurana] talked about all of this gigantic data that we have, but unless we teach people how to understand that data and use that data, it's really meaningless," said Kriebel, who's now in charge of data visualization at Facebook. "So, our goal … is to make everyone an analyst."

The company is on its way to realizing that goal. Today, Kriebel is one of four BI engineers who support an impressive 500 Tableau users per month at Facebook. Here are three practices that led to Facebook's success.

1. Empower the tribe. Fortunately for RaiSurana and Kriebel, their target user group knows a thing or two about networking and collaboration. One of the first things they did was establish a Tableau Facebook Group where users can pose questions. The open platform saves developers from having to address the same question repeatedly on an individual basis, and it helps build a kind of reference manual that evolves organically with the users. The Facebook Group also acts as a platform for collaboration, something that isn't built right into the business model of most companies. That kind of collaboration, according to Kriebel, helps take some of the pressure off the handful of Tableau experts who support hundreds of users.

"The greatest thing about that is we're building this incredible tribe and this incredible repository of knowledge," Kriebel said. "It's awesome to see questions posted and other people responding who you trained a couple of months ago. It makes us feel really good that we're contributing by not having to contribute."

Users can also share their work by publishing their initial findings or projects to an uncertified Facebook site, accessible by any analyst.

"There's no really holding back, because you can publish stuff on uncertified and start sharing with other people," RaiSurana said. "There's no lag or dependency on the BI team itself."

The process, however, has built-in controls. Before any project can move from the uncertified to the certified site, it undergoes a BI review process.

2. Provide beginning and advanced training. Every other week, Tableau users or would-be users can attend either a beginner's or an advanced training session. Beginner's sessions focus on getting a feel for the data discovery tools, learning about different chart types, and even building and publishing a dashboard. Advanced sessions delve into more advanced types of charts, such as scatterplots and lollipops.

As a member of the elite team that supports Tableau at Facebook, Kriebel finds inspiration for new lessons every time someone gets stuck and needs help.

Gartner's Sallam points to this as a best practice for any organization: "Training and change management become critically important," she said. "It has to be an ongoing program -- monthly or weekly -- to get the most out of the tool's capabilities."

3. Rev up competitive juices --  or gamify! Kriebel's enthusiasm for data visualization is undeniable. One way he shares that enthusiasm with others is through a "data visualization of the week" contest, where he finds and rewards (with swag or other prizes) a compelling example from the user group. "I think people appreciate it more for the recognition," he said.

Sharing examples of good work can boost morale and can help build interest in the tool. According to Sallam, data discovery tools "grow sideways" by word of mouth rather than by a top-down initiative.

Working with the tool can trigger recognition for an employee, but so can building onto the tool. With Mark Zuckerberg at its helm, it might not be surprising that Facebook embraces hacking, also known as "Facebook pushing the envelope." That includes hosting hackathons. Here's the innovative part: In order to participate, employees are required to tackle non-work projects. Hackathons can focus on general, out-of-the-box exploits or on targeted projects, such as improving the data discovery tool's functionality. Two examples: better email functionality and a richer metadata repository.

"Tableau is great the way we bought it, but there are some things we wanted to have to make it a better fit," RaiSurana said.

April 12, 2013

How in the World did I end up at Facebook?

I had a great time speaking yesterday at the Data Viz Summit in SF.  This was the first time I used Prezi and it worked out really well.  What fun would it be without taking some risks, including the two live demos I gave?  I would highly encourage you to give Prezi a try.  I don’t think I’ll use PowerPoint again.

I want to give a special shout out to Martin McGinn, Namit Raisurana, Heather Torres and George Lee for the great feedback during rehearsal.  It’s great to have such a supportive team and a culture that promotes openness and honest feedback.  It made a big difference.

There was a great speaker lineup.  The videos of the talks should be available shortly after the conference, and I will post my talk and the others then.  I will also be writing a brief summary of my favorite talks.

In the meantime, enjoy my Prezi.

April 2, 2013

Tableau time lapse video - Every photo from the International Space Station


Last week I wrote about how I created a map in Tableau of every photo taken from the ISS.  The map with all of the dots is interesting on its own, but a time lapse version is way cooler.

Since Tableau does not support animation via the Pages shelf on Tableau Server or Tableau Public, I created this video so you could see the map of all of the photos build over time.  The first ISS mission was on October 31, 2000 and Mission 34 was December 19, 2012.  A list of all missions can be found here.