April 10, 2016

Dear Data Two | Week 49: Data

At the start of this week, I attempted to follow the lead set by Stefanie and Giorgia during their week 49 and track every time I heard, said, or wrote the word “data”. Tracking swear words was difficult enough, and I quickly realized there was no way I would be able to keep up with “data”; there’s simply too much of it.

By then it was Tuesday, and I couldn’t really start a new topic because I’d be missing a day. Instead, I decided to look at the data I had created through the first 48 weeks of Dear Data Two. There are four main types of data I create every week: raw data (usually in Excel), Tableau extracts, Tableau workbooks, and scanned images. As you click through the story below, you’ll see I’ve created nearly 1 GB of data through 48 weeks, 95% of it from pictures.

This means that the data sets I had been working with have mostly been small. And this revealed a problem I hadn’t known about: my Tableau extracts were actually BIGGER than the raw data. I always had it in my head that Tableau extracts would be smaller than the source data, but in 42 of 45 weeks, that wasn’t the case.

Some summary stats:

  • 357 pictures = 926.4 MB
  • Raw data = 3.5 MB
  • TDEs created = 3.9 MB
  • TDEs were on average 10% bigger than the raw data
  • 45 Tableau workbooks = 36.2 MB
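
If you want to check the arithmetic behind these stats, here is a minimal Python sketch. The sizes are copied straight from the list above; the variable names are my own and purely illustrative. Note that it compares the TDE and raw data totals, which is not the same calculation as the per-week average quoted above, so the two percentages don't have to match.

```python
# Sizes in MB, copied from the summary stats above.
sizes_mb = {
    "pictures": 926.4,
    "raw data": 3.5,
    "TDEs": 3.9,
    "workbooks": 36.2,
}

# Total footprint across all four types of data.
total_mb = sum(sizes_mb.values())

# Share of the total that comes from scanned pictures.
picture_share = sizes_mb["pictures"] / total_mb

# How much bigger the extracts are than the raw data, comparing totals
# (not the average of the weekly ratios).
tde_vs_raw = sizes_mb["TDEs"] / sizes_mb["raw data"] - 1

print(f"Total: {total_mb:.1f} MB")
print(f"Pictures: {picture_share:.1%} of total")
print(f"TDEs vs raw data (totals): {tde_vs_raw:+.1%}")
```

The total comes to about 970 MB, which is where “nearly 1 GB” and “95% from pictures” come from.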

Overall, it was fun to analyse the data of Dear Data Two. Flip through the story below to see my take.