Category Archives: Data

Data Viz Reviewed

As we prepare for the next theme in our MOO-ish thing, I figured I’d condense and summarize some of the products from this run.

Yearbook Data
Gillian Lambert, a wonderful English teacher from Moody MS, opted to play along and visualized yearbook data for her school. Gillian documents the process and highlights how visualizing the information led her to new understandings. I think that’s one of the most neglected benefits of a process like this. Creating leads to additional understanding for both the audience and the author.


Debbie found that sometimes the web 2.0 tools aren’t up to par. Her critique of the infogr.am and piktochart services echoes some of the problems I’ve had with services like these over time. They are fast and relatively easy, but you give up some key elements of control, and some people won’t be happy about that. I think they have their place, but you have to go in knowing the limitations of the tool and the impact of those limitations on your ability to control the narrative.

Katie spent some time tracking happiness but, with great common sense, opted to move on to looking at Google Doc usage. She used mural.ly to lay out the graphs from GAFE in a way that lets you see the big picture and drill down to the details. The infinite canvas concept that you have in mural.ly and prezi has some fascinating possibilities.


Gaynell used her amazingly consistently entered Outlook calendar data to plot all of the yoga teaching she has done since the dawn of time. Her post does a great job breaking down how she got the data out of Outlook, into Excel, and then parsed in a way that allowed her to visualize it. Her post has encouraged me to look at how I’m categorizing my calendar information so that I might actually use it. My current method is not a method.

Screen Shot 2013-06-20 at 5.47.57 PM
William Berry visualized data about the Civil War and used Google’s new mapping tool to do it. He got enough of my interest that I signed up for the same class he’s taking on how to build in the new Google Maps/Earth. I particularly like how he’s framing data analysis as a necessary student lens on the world. As a result of this conversation, we’re going to try playing with this same data in Exhibit.[1]

Karen Richardson started some work mapping where food comes from in Exhibit as well but hasn’t submitted anything officially. I’m linking her in anyway. I’ll also be working in the near future to update my Exhibit tutorials. If you feel like time traveling, the old tutorials from around 2007 are here. iWeb anyone?

None of this was massive but I’m really happy that people have participated and have enjoyed things. The next round will likely be focused on the idea of remix (defining post coming soon). It’s a big topic and the Internet is a big place. You’re more than welcome to come join us.


[1] I may over-advocate for this tool, but it remains uniquely capable of visualizing information over time and space in a fluid, faceted, user-friendly environment.

Blog Post Stats

I wondered about my blogging patterns given my recent increase in posts. I didn’t bother pulling out Jim Coe’s posts from back when this was a joint blog, but the data is good enough for my purpose. Anyway, I started messing with it and am working towards a visual representation that makes sense to me.

Screen Shot 2013-06-09 at 9.38.33 PM
I’m totally unhappy with this graph. Totally. I messed with some color palettes etc. but it just didn’t do what I wanted at all.

Screen Shot 2013-06-09 at 9.38.58 PM
I then went to the opposite end of the spectrum and wanted to see what sparklines might show me. Sparklines are a favorite of Edward Tufte, who is on the super minimalist side of the data visualization spectrum.[1] At first I didn’t think there was enough data to make the sparklines work. I then tried compressing the horizontal axis, which improved things, but it’s still not what I want.
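For anyone who wants to try the sparkline idea without a charting tool, here is a minimal sketch in JavaScript. It is not how the images in this post were made; it just maps each value to one of eight Unicode block characters. The function name `sparkline` and the sample counts are made up for illustration.

```javascript
// Unicode block characters, lowest to highest.
const BARS = "▁▂▃▄▅▆▇█";

// Map each count to one of eight bar heights, scaled between min and max.
function sparkline(counts) {
  const min = Math.min(...counts);
  const max = Math.max(...counts);
  const span = max - min || 1; // avoid dividing by zero on flat data
  return counts
    .map((n) => BARS[Math.round(((n - min) / span) * (BARS.length - 1))])
    .join("");
}

// Hypothetical monthly post counts, not the real blog data.
console.log(sparkline([2, 5, 9, 4, 0, 7]));
```

Because the line is scaled to its own minimum and maximum, a series with very little data or very little variation will not show much, which is roughly the problem described above.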
Screen Shot 2013-06-09 at 10.14.30 PM

wp_posts
Here’s another stacked year graph that I might work on some more. I ended up wandering into Adobe Illustrator and found out there are some interesting tricks for making graphs in there. I will explore it more in the near future. I’m learning a lot of things.

Here’s a messy (deliberately) stack of the graphs above with the opacity set to 20% or so. It gives a modified version of a stacked bar chart that I kind of like. It’s not a complete picture but, coupled with the source graphs, it’s starting to look like what I want.
Screen Shot 2013-06-10 at 10.37.06 AM


[1] There’s probably a happier middle ground but he has a number of good points. If you’re in HCPS and interested in checking out some of his books let me know and I’ll bring them in.

More Storage Visualization

I have meant to play around more with the Google Chart API for a while, and I wasn’t happy with what I made earlier to visualize the network storage differences among the schools and users. I thought a treemap would be a more powerful way to show just how much space a few teachers used vs the masses. Knowing your options and picking the right one to help illustrate your point is an important element of data visualization. After all, we aren’t ignorant savages who believe “Isn’t this about visualizations, basically a form designed for those who won’t (or can’t) read? Kinda like remedial explanation for the 99%.”

You can see the Google example for this kind of graphic here. This is my first time messing with it so I started by copying their example into my text editor. Their example was pretty close to what I wanted in terms of the structure of the information. They had Location, Parent, Volume, Color as the main variables. I wanted something pretty similar.

Instead of ‘Global,’ ‘HCPS’ was my top category with the schools taking the place of the countries. Pretty simple, but I sure didn’t want to write all that data by hand. I already had the basic data in Excel; I just had to come up with the right formula. In this case:

="['"&C2&"',"&"'"&A2&"'"&","&D2&","&E2&"],"

It’s worth remembering how handy Excel is at doing stuff like this. Anything within double quotes is written as is and the rest is just plugging in the cell variables. From there I just needed to cut and paste the column in. Easy and quick.
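The same row-building step could also be scripted rather than done with a formula. This is a minimal sketch in JavaScript, assuming each spreadsheet row has already been read into an object. The field names (`name`, `parent`, `size`, `color`) and the sample values are hypothetical; they stand in for columns C, A, D and E in the formula above.

```javascript
// Hypothetical records standing in for spreadsheet rows (not real HCPS data).
const rows = [
  { name: "School A", parent: "HCPS", size: 120, color: 3 },
  { name: "School B", parent: "HCPS", size: 80, color: 1 },
];

// Build one treemap row literal, mirroring the Excel formula:
// quoted text for name and parent, bare numbers for size and color.
function toTreemapRow(row) {
  // JSON.stringify escapes quotes inside names, which the formula does not.
  return `[${JSON.stringify(row.name)},${JSON.stringify(row.parent)},${row.size},${row.color}],`;
}

// One line per row, ready to paste into the chart's data array.
const pasted = rows.map(toTreemapRow).join("\n");
console.log(pasted);
```

This version emits double-quoted strings where the formula emits single quotes; JavaScript treats both the same way inside the data array.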

Screen Shot 2013-06-09 at 4.27.34 PM

The only other small change I made was to the color scale. It was red/green, which tends to indicate pretty specific types of judgement. I wasn’t interested in that, so I made a small switch by changing the minColor/maxColor variables indicated below. They are hexadecimal color values, if you’re unfamiliar with them.

minColor: '#0033CC',
midColor: '#ddd',
maxColor: '#fff',
headerHeight: 15,
fontColor: 'black',
showScale: true});

I’m still not sure about a couple of things. For instance, I can’t figure out why Glen Allen is darker than Tucker and Godwin on the main view. That seems to be similar to what’s going on in the example but I’m not sure why. It’d also be nice if clicking on the parent piece after you drill down would take you back up a level. I think that’s doable.

You can see the full size example here if it amuses you. It’s crammed in below using an iframe which will let you put just about anything into an html page. The code used to embed it below is provided as an example.

Networked Storage Data

We have 668 high school teachers using at least 0.1 MB on shared network volumes we’ve collectively dubbed “Virtual Share.”

Those 668 high school teachers use 2019.7 GB or 2.02 terabytes of storage. What’s particularly interesting to me is the disproportionate usage between teachers.

The top user, a single person, uses 180 GB, or roughly 9% of the total.[1]

The top 10 users use 733.2 GB of storage.

The top 20 users use 993.6 GB of storage. In other words, almost 50% of the storage is used by roughly 3% of the users.[2]
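If you want to check shares like these yourself, the arithmetic is just part divided by whole. Here is a minimal sketch in JavaScript using the totals quoted above; the helper name `percentOf` is my own invention for illustration.

```javascript
// Totals quoted in the post.
const totalGB = 2019.7;
const totalUsers = 668;

// Percentage of the whole, rounded to one decimal place.
function percentOf(part, whole) {
  return Math.round((part / whole) * 1000) / 10;
}

console.log(percentOf(733.2, totalGB)); // share of storage used by the top 10
console.log(percentOf(993.6, totalGB)); // share of storage used by the top 20
console.log(percentOf(20, totalUsers)); // top 20 as a share of all users
```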

These are just embeds of the data from Google Spreadsheets. Nothing fancy, not much control but I think it does paint a decent picture of the extreme differences in resource usage. I do continue to have trouble with the interactive chart embeds outside of the spreadsheet. I do like the unintentional psychedelic effect on the pie chart.


[1] No judgements on quality of use, just amazement that they are so far out there.

[2] Makes me reconsider the whole 1% thing as even more screwed up.

Why I Talk This Way

I spent quite a lot of time with my wife and oldest son looking at the dialect survey map[1] and trying to figure out which one of us said a particular phrase or pronounced a word a certain way. About half the time I answered “all of the above” while my wife was tried and true Massachusetts for just about every one.[2]

I figure my wandering ways are to blame, so I took a shot at visualizing that. I did recall that Google Spreadsheets would let you visualize spreadsheet data on a map with no trouble at all. It’s an option under “insert chart.” All I needed was a location in the first column and the numerical value for the circle in the second column (years in this case). Said and done.[3] Too easy. Mine is immediately below and is followed by my wife’s map. Turns out it has a rough time with two different data sources from one document, even if they’re on different sheets. I could have made an additional spreadsheet but I don’t like this enough. Easy-ish but not much control. I’m going to look for some other options.
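If your history is a list of stints rather than a tidy total per place, the two-column layout (location, then years) can be built with a few lines of script. This is a minimal sketch in JavaScript; the places, dates, and the `yearsByPlace` helper are all made up for illustration and are not the data behind the maps here.

```javascript
// Hypothetical stints; the places and years are made up for illustration.
const stints = [
  { place: "Ohio", from: 1980, to: 1990 },
  { place: "Texas", from: 1990, to: 1998 },
  { place: "Ohio", from: 1998, to: 2001 },
];

// Sum the years spent in each place and return [location, years] pairs,
// matching the two-column layout the map chart expects.
function yearsByPlace(list) {
  const totals = new Map();
  for (const { place, from, to } of list) {
    totals.set(place, (totals.get(place) || 0) + (to - from));
  }
  return [...totals.entries()];
}

// One "location,years" line per place, ready to paste into a sheet.
console.log(yearsByPlace(stints).map((r) => r.join(",")).join("\n"));
```

Keeping the output to exactly two columns matters for the embed, so anything extra (dates, notes) is best left out of the sheet that feeds the chart.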

Image Version

Turns out I’m starting to hate these as there are more issues than they’re worth. I don’t know how to allow access to the interactive version as I’ve published everything I can.


[1] It was a good time and I’ll not apologize for it.

[2] For the record, Massachusetts says just about everything wrong. It’s really sad.

[3] Well, not quite. Turns out you can only have two columns of data or the whole thing errors out, even if you only had two columns chosen for the chart (which works within the spreadsheet but not on the embed).