In my post on privacy and Google Book Search, I alluded to technological solutions libraries could use to enhance patron privacy while also protecting against unauthorized access. I thought it would be useful to elaborate on this comment with some details. In general, there is no reason that privacy and security objectives cannot both be met in a properly engineered solution, other than the fact that it's hard to find someone willing to pay for the properly engineered solution. For example, I mentioned Shibboleth as a possible way to provide both security and privacy. Shibboleth is an open-source single-sign-on authentication system developed as part of the Internet2 project. It uses strong cryptographic techniques to delegate trust over a network, and in so doing, allows for significantly enhanced privacy.
Think about the situation where a company has licensed some content to a university. The licensor wants to make sure that only persons associated with the university are allowed to access the content. It doesn't need to know who the user is; it only needs to know that the user is properly entitled. The Shibboleth system allows the institutional user to sign in to an authentication point once using their institutional credentials, after which any licensed resource can check with the central authentication point that the user is accredited by virtue of institutional affiliation. Shibboleth also allows users and institutions to disclose attributes to providers of their choosing. Attributes might include their name, preferred language, subject areas of interest, subgroup membership, etc. Security is preserved because the institution still knows the identity of the users, and is enhanced because the Shibboleth system is designed to be much harder to defeat than competing solutions.
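To make the idea of selective attribute disclosure concrete, here is a minimal sketch in Python. It is not Shibboleth's actual API or configuration format. The `IdentityProvider` class, the `release_policy` mapping, and the example host names are all made up for illustration. The point is only that the institution can vouch for affiliation while releasing nothing else unless a policy says so.

```python
from dataclasses import dataclass


@dataclass
class IdentityProvider:
    """Hypothetical stand-in for the institution's authentication point."""
    directory: dict       # username -> dict of that user's attributes
    release_policy: dict  # provider name -> set of attribute names it may see

    def assertion_for(self, username, provider):
        """Return only what this provider is allowed to learn about the user."""
        attributes = self.directory[username]
        allowed = self.release_policy.get(provider, set())
        released = {k: v for k, v in attributes.items() if k in allowed}
        # Affiliation is always asserted; identity is not.
        released["affiliated"] = True
        return released


idp = IdentityProvider(
    directory={"alice": {"name": "Alice", "language": "en"}},
    release_policy={"journals.example.com": {"language"}},
)
print(idp.assertion_for("alice", "journals.example.com"))
print(idp.assertion_for("alice", "other.example.org"))
```

In this sketch the first provider learns the user's preferred language and nothing more, and the second learns only that the user is affiliated with the institution.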
As far as I understand, Shibboleth would
- When the user is authenticated by the institution, a session id would be sent to Google. The session id tracks the user, but only the institution knows the identity of the user.
- When the user views a page in a book, Google sends a message to the institution to increment a named counter associated with the user. The name of the counter identifies a book, but only Google knows which book is associated with the counter.
- When the user asks to view another page, Google asks the institution for the page count associated with the book and the user, and grants access accordingly.
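The three steps above can be sketched as a toy simulation. This is only my reading of how the split of knowledge might work, not anything Shibboleth or Google actually implements. The class names `Institution` and `BookService`, the random counter names, and the `page_limit` parameter are all assumptions made for the example.

```python
import secrets


class Institution:
    """Knows who each user is, but sees only opaque counter names."""

    def __init__(self):
        self._sessions = {}  # session id -> real user identity
        self._counters = {}  # (user identity, counter name) -> pages viewed

    def authenticate(self, user_identity):
        # Hand out a random session id; only the institution can map it back.
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = user_identity
        return session_id

    def increment(self, session_id, counter_name):
        user = self._sessions[session_id]
        key = (user, counter_name)
        self._counters[key] = self._counters.get(key, 0) + 1

    def page_count(self, session_id, counter_name):
        user = self._sessions[session_id]
        return self._counters.get((user, counter_name), 0)


class BookService:
    """Knows which book each counter stands for, but sees only session ids."""

    def __init__(self, institution, page_limit):
        self._institution = institution
        self._page_limit = page_limit  # hypothetical per-book viewing limit
        self._counter_names = {}       # book -> opaque counter name

    def _counter_for(self, book):
        if book not in self._counter_names:
            self._counter_names[book] = secrets.token_hex(8)
        return self._counter_names[book]

    def view_page(self, session_id, book):
        name = self._counter_for(book)
        # Ask the institution how many pages this user has already seen.
        if self._institution.page_count(session_id, name) >= self._page_limit:
            return False
        self._institution.increment(session_id, name)
        return True


institution = Institution()
service = BookService(institution, page_limit=2)
session = institution.authenticate("alice@example.edu")
print([service.view_page(session, "Moby-Dick") for _ in range(3)])
# prints [True, True, False]
```

Because the counters are keyed by the user rather than the session, the limit survives a new sign-in, yet the book service never learns the user's identity and the institution never learns the book's title.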
What is the likelihood that such a system can be created and adopted? On this score I am very skeptical. Who would pay for the enhanced privacy afforded by such a system? The success of a variety of Web 2.0 services seems to indicate that users are almost eager to give up privacy to gain the ability to communicate. As Randal Picker has discussed in a recent paper, consumers have significant incentives to give up their privacy to online advertising networks because doing so amounts to advertising by the consumer that results in a more efficient market. The history of Shibboleth can be used as an indicator of market behavior. Although it can provide enhanced privacy and strong security, these advantages have not been able to offset its higher implementation and usability costs compared to competing technologies, and Shibboleth has not been widely adopted. When Peter Brantley raised the specific question of using Shibboleth for Google Book Search, Google's Dan Clancy commented that "Some institutions use Shiboleth and we will support this although most institutions prefer IP authentication". Google is known for putting a very high priority on usability, which is an area of significant weakness for Shibboleth.
On second thought, maybe the Swiss banks are onto something. Maybe the best target market for ultimate privacy is ultra rich people. Sergey and Larry, Warren and Bill, might I sell you a bit of privacy?