Playing with Google Books’ Ngram Viewer


I just came across a link to a fascinating new TED Talk via Library Link of the Day.  “What we learned from 5 million books” is a 15-minute video of researchers Erez Lieberman Aiden and Jean-Baptiste Michel talking about what they’re learning about culture by charting the frequency of words over time in the books digitized so far as part of the Google Books digitization project.

They call the study “culturomics,” which they define as “the application of massive-scale data collection analysis to the study of human culture.”  What it lets them do is chart things like how frequently the words “God” or “aargh” appear in books over time.  They argue that this gives them a clear picture of what people are talking about at any particular point in time, and also lets them trace the importance of a concept over time.  Jump down to the comments posted below the talk and you’ll see that a lot of people feel this is a flawed approach because it ignores word context.  I’m not sure yet what I think, but I know I’m intrigued.
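For the curious, here is a rough sketch of the basic idea of charting a word’s frequency over time.  It is only my own toy illustration, not the researchers’ or Google’s actual method: the function name `relative_frequency`, the tiny `sample_books` list, and the choice to ignore capitalization are all assumptions I made to keep it short.

```python
import re
from collections import Counter


def relative_frequency(books, word):
    """Return {year: share of all words published that year that match `word`}.

    `books` is an iterable of (year, text) pairs. Matching ignores case,
    which is a simplification made for this sketch.
    """
    hits = Counter()    # occurrences of the target word, per year
    totals = Counter()  # all words seen, per year
    target = word.lower()
    for year, text in books:
        tokens = re.findall(r"[a-z']+", text.lower())
        totals[year] += len(tokens)
        hits[year] += sum(1 for token in tokens if token == target)
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}


# Hypothetical miniature "library", made up purely for illustration.
sample_books = [
    (1900, "God save the king"),
    (1900, "The king is here"),
    (2000, "Aargh the printer is broken aargh"),
]

freq = relative_frequency(sample_books, "aargh")
for year, share in freq.items():
    print(year, round(share, 3))
```

Dividing by the total number of words published each year matters, because far more books exist from recent decades than from earlier ones, so raw counts alone would make almost every word look like it is on the rise.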

The cool thing is that Google liked the tool Michel and Lieberman Aiden have been using for data analysis so much, they made a version that’s available to all of us.  So now you can go in and do your own analysis, for any word that you like.  And you can see a sample of the books the word appears in.

Try it out, and see what you think.  Just “nerdy fun” or a useful tool for looking at how culture develops and changes?

Just as a side note, it’s possible Michel’s graphs for “awesome vs. practical” are the best graphs I’ve ever seen, and the quickest visual summary of how realistic an idea is.


