"Aren't you supposed to be working on your new business? That ungluing ebooks thing? Instead you keep writing about library data, whatever that is. What's going on?"
No, really, it all fits together in the end. But to explain, I need to take you beyond the "Like Button".
Earlier this month, I attended a lecture at the New York Public Library. The topic was Linked Open Data, and the speaker was Jon Voss, who's been applying this technology to historical maps. It was striking to see how many people from many institutions turned out, and how enthusiastically Jon's talk was received. The interest in Linked Data was similarly high at the American Library Association Meeting in New Orleans, where my session (presented with Ross Singer of Talis) was only one of several Linked Data sessions that packed meeting rooms and forced attendees to listen from hallways.
I think it's important to convert this level of interest into action. The question is, what can be done now to get closer to the vision of ubiquitous interoperable data? My last three posts have explored what libraries might do to better position their presence in search engines and in social networks using schema.org vocabulary and Open Graph Protocol. In these applications, library data enables users to do very specific things on the web: find a library page in a search engine or "Like" a library page on Facebook. But there's so much more that could be done with the data.
I think that library data should be handled as if it were made of gold, not of diamond.
Perhaps the most amazing property of gold is its malleability. Gold can be pounded into a sheet so thin that it's transparent to light. An ounce of gold can be made into leaf that will cover 25 square meters.
There is a natural tendency to treat library data as a gem that needs skillful cutting and polishing. The resulting jewel will be so valuable that users will beat down library websites to get at the gems. Yeah.
The reality is that library data is much more valuable as a thin layer that covers huge swaths of material. When data is spread thinly, it has a better chance of connecting with data from other libraries and with other sorts of institutions: museums, archives, businesses, and communities. By contrast, deep data, the sort that focuses on a specific problem space, is unlikely to cross domains or applications without a lot of custom programming and data tweaking.
Here's the example that's driven my interest in opening up library linked data: At Gluejar, we're building a website that will ask people to go beyond "liking" books. We believe that books are so important to people that they will want to give them to the world; to do that we'll need to raise money. If lots of people join together around a book, it will be easy to raise the money we need, just as public radio stations find enough supporters to make the radio free to everyone.
We don't want our website to be a book discovery website, or a social network of readers, or a library catalog; other sites do that just fine. What we need is for users to click "support this book" buttons on all sorts of websites, including library catalogs. And our software needs to pull just a bit of data off of a webpage to allow us to figure out which book the user wants to support. It doesn't sound so difficult. But we can only support two or three different interfaces to that data. If library websites all put a little more structured data in their HTML, we could do some amazing things. But they don't, and we have to settle for "sort of works most of the time".
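To make "pull just a bit of data off of a webpage" concrete, here is a minimal Python sketch of what that could look like when a page does carry a thin layer of structured data. It is only an illustration, not Gluejar's actual code: the names `BookMetaParser`, `identify_book`, and `SAMPLE_PAGE` are made up for this example, the sample title and ISBN are placeholder values, and it assumes the page exposes Open Graph style `<meta property="og:title">` and `<meta property="book:isbn">` tags (or schema.org `itemprop="name"` and `itemprop="isbn"` on `<meta>` tags).

```python
from html.parser import HTMLParser


class BookMetaParser(HTMLParser):
    """Collect <meta> values keyed by their `property` or `itemprop` attribute."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("property") or attrs.get("itemprop")
        content = attrs.get("content")
        if key and content:
            # Keep the first value seen for each key.
            self.found.setdefault(key, content)


def identify_book(html):
    """Return the title and ISBN found in the page's metadata (None if absent)."""
    parser = BookMetaParser()
    parser.feed(html)
    meta = parser.found
    return {
        # Assumed property names: Open Graph first, schema.org as a fallback.
        "title": meta.get("og:title") or meta.get("name"),
        "isbn": meta.get("book:isbn") or meta.get("isbn"),
    }


# Hypothetical catalog page; the title and ISBN are placeholders.
SAMPLE_PAGE = """
<html><head>
<meta property="og:type" content="book">
<meta property="og:title" content="An Example Book">
<meta property="book:isbn" content="9781234567897">
</head><body>...</body></html>
"""

if __name__ == "__main__":
    print(identify_book(SAMPLE_PAGE))
```

With markup like this on every catalog page, one small parser would cover all of them. Without it, each site needs its own scraping rules, which is where "sort of works most of the time" comes from.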
Real books get used in all sorts of ways. People annotate them, they suggest them to friends, they give them away, they quote them, and they cite them. People make "TBR" piles next to their beds. Sometimes, they even read and remember them as long as they live. The ability to do these same things on the web would be pure gold.