
By funny coincidence, since I’ve been at ZSR I have attended meetings in my previous home town (LITA Forum in Columbus) and the home town before that (Midwinter in Seattle). ALA Annual in Chicago this summer will make the trifecta. Do let me know if there are good meetings coming up in Ann Arbor or Madison.

At 47°37′ N latitude, Seattle is farther north than Duluth, Minnesota: in January, if there’s any sunlight at all, it’s noticeable from about 8:30am to 4:30pm. But there usually isn’t any sunlight: clouds, rain, highs in the mid 40s, lows in the low 40s. Typical January in Seattle. They take seasonal affective disorder seriously there.

Highlights (unless you really want to know about the LITA publication committee, LITA’s committee of committee chairs, and the ALA committee of publication committee chairs… Oh, except that one of the newest LITA Guides to hit the stands is Cloud-Based Services for Your Library: A LITA Guide by Erik Mitchell, coming soon to library stacks near you.)

Big Data For Big Brother

If I’m a little slow writing up my Midwinter experience, it’s because I’ve been cowering in paranoid fear in my office since the first meeting I attended, OCLC’s member meeting – okay, not a harrowing experience – and keynote by Alistair Croll on “The implications and opportunities of Big Data.” The session benignly defined Big Data as datasets that are too large for traditional hardware and software tools to analyze. The term plays off the growth of Big Science: $10 billion to build a Large Hadron Collider, and then eleventy bajillion teraflops to analyze its output. Croll defines Big Data as the problem of analyzing data with a lot of Volume (a ton of data), Variety (many kinds of data), and Velocity (torrents of data).

Big Data has potential for good: medical data – including Google searches for symptoms – can predict disease outbreaks. Analyzing which farmers in developing countries benefit most from microloans helps target future loans where they will do the most good. Analyzing traffic data allows taxi services to have cars ready where people will want them. Likewise, when we’re driving, every one of our GPS-enabled phones or tablets contributes to real-time maps of traffic data so we can route around delays.

The darker side of Big Data becomes apparent when you realize the biggest dataset out there is our own increasingly trackable behavior both on- and offline. I won’t rehash too many of Croll’s points (I strongly recommend you watch the talk when you have some free time – link below), but some of the highlights:

  • Big Data gets used a lot to say “People in Group A tend to like Topic B and products like Item C, so we’ll put ads for C on pages about B.” We smile knowingly when Group A is Librarians, or people with Zip codes beginning 271xx. (Mac users may remember that Orbitz shows them ads for more expensive hotels than Windows users, the assumed connection being that if you have a Mac you’re either more affluent than most Windows users or more willing to shell out for a quality experience.) Things can get ethically and legally tricky when that group is defined by things like gender or race, and the product is, for example, low-rate mortgages.
  • There is so much data out there that we usually do not have the tools (or access to the data) to evaluate it, and we often accept that all competing explanations for something are equally well supported.
  • And humans aren’t very good at evaluating data anyway. We keep demonstrating that we will believe what seems right even when it’s demonstrably wrong. (Several long examples that can be summed up here and here.)

Croll concluded with the idea of Good Data: to Volume, Variety, and Velocity, he adds Veracity and Value: data that is true (and that can be checked), and data that has a useful context. A telling line from the conclusion: “Google can find more articles than any librarian, but any librarian can find better articles than Google.” There is a continuing need for human insight that can apply all of the data as something more than an algorithm.

Watch the presentation here:

Other meetings of note: I nearly missed LITA Happy Hour because I actually sat down with a colleague and discussed a research project for a couple of hours. It’s the sort of thing that makes it worthwhile to schlep cross country and attend a conference in person. I heard both a former Ohio colleague and a current WFU colleague (Roz) present on the experience of bringing Summon up at their libraries (we were a lot more laid back about it). And at the LITA Town Meeting, a few of us graybeards determined that the key to LITA’s future is piratically taking over RUSA, ALCTS, and LLAMA – because, hey, where would they be without technology? – and creating a unified Library Services Division (LSD). Having come up with the idea, we leave it to the youngsters to make it actually happen.

Then, with nothing to do until the 11pm red-eye home, I spent the afternoon in the Seattle Public Library:

Seattle Public Library, top floor reading room