Friday, July 24, 2009

If Elvis had an OpenID and the Mome Raths Outgrabe

If the space aliens that kidnapped Elvis decided to return him tomorrow, and he decided to use Twitter and a blog to communicate that fact, how would anyone know it was really him? Would the National Enquirer even bother to report the news? Would Elvis ever be able to reclaim his public or private identities? Would he be able to remember any of his passwords?

Password proliferation has been a problem for so long that innovators have solved it over and over again. The library world came up with a single-sign-on authentication/authorization system called Shibboleth, and then implemented EZProxy so they wouldn't have to deal with it. The UK developed the single-sign-on system called Athens. The dot-com bubble came up with a bunch of single-sign-on companies; some of them, including PassLogix and Imprivata, are still at it. I am still waiting for the announcement of the Single-Single-Sign-On system.

OpenID took a different approach, and is now somewhat usable for the purpose of allowing people to establish an identity with one provider that can be used on many websites. For example, I've used the OpenID identity "http://go-to-hellman.blogspot.com/" to register comments on the Semtech 2009 website and on Paul Miller's Cloud of Data blog. On my last post, comments were left by "nicomo" and "breizhlady", whose OpenIDs are http://nicomo.pip.verisignlabs.com/ and http://breizhlady.myopenid.com/ . Jodi Schneider used Blogger credentials to leave her comment. My OpenID can be used to determine with some degree of certainty that the Eric Hellman who left a comment on Cloud of Data is the same Eric Hellman who's writing on this blog. A bit of googling will tell you who nicomo and breizhlady are, if you really want to know. If Elvis had been issued an OpenID before he left, we would be able to tie his new blog to his old identity.

There are still the single-single problems with OpenID. The user experience for OpenID systems gets a bit clunky; my wife was frustrated when she tried to leave a comment on this blog. But overall there seems to be slow convergence and user acceptance of OpenID.

This brings me to the questions I wanted to raise today: What does an OpenID identify? Does http://go-to-hellman.blogspot.com/ identify me? Can these OpenIDs be used to make assertions about people to enter into the Linked Data Cloud? How should the Linked Data semantics for redirects be implemented for OpenID? Should 303 redirects be used to indicate that the "thing" being identified by OpenID is a real-world object?
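For readers who haven't run into the 303 convention, here is a minimal Python sketch of the rule of thumb a Linked Data client might apply when it dereferences a URI. The function name `classify_uri` and the wording of its return values are hypothetical, made up for illustration. It only encodes the convention as I understand it (a 2xx answer means a Web document, a 303 redirect or a hash URI leaves the question open) and says nothing about what any particular OpenID provider actually returns.

```python
from http import HTTPStatus

# Hypothetical labels, chosen for this sketch only.
DOCUMENT = "information resource (web document)"
ANYTHING = "could be anything, including a real-world object"
UNKNOWN = "no conclusion"


def classify_uri(uri: str, status: int) -> str:
    """Guess what kind of thing `uri` names, given the HTTP status
    that came back when the URI (minus any fragment) was requested."""
    if "#" in uri:
        # Hash URIs: the fragment is never sent to the server,
        # so the response says nothing about the thing named.
        return ANYTHING
    if 200 <= status < 300:
        # A direct 2xx answer: the URI names the document itself.
        return DOCUMENT
    if status == HTTPStatus.SEE_OTHER:
        # 303 See Other: "I can't send you the thing,
        # but here is a document about it."
        return ANYTHING
    return UNKNOWN
```

Under this rule of thumb, an OpenID URL that answers with a plain 200 would be read as naming a document, not a person, which is exactly why the question of redirects matters.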

To some extent, it's really the way identifiers are used that determines semantics- identification of any real-world object can never have perfect accuracy. The use of ISBN to identify a book is a good example. Although ISBN is frequently used to identify a book, ISBNs are managed in such a way that they most accurately identify items sold in a bookstore- toys and dolls often get ISBNs. Similarly, you might think that the US identifies people with Social Security Numbers (SSN), but if you think about it, the "thing" an SSN most accurately identifies is an account with the Internal Revenue Service. Similarly, I think it's pretty clear that an OpenID identifies a set of login credentials, although people might well use the OpenID to identify the person or persons behind it.

I have been guilty in the past of driving people to distraction by arguing that it can be almost impossible to decide whether something is an "information resource" (something whose essential characteristics can be conveyed in a message) or whether it is a "real-world object". It's pretty easy to blur the issue with an e-book, for example, but what about the SSN? It used to be that "an IRS account" was something on paper somewhere, but I'm pretty sure that my entire IRS account is digitized somewhere.

Section 3 of the W3C's Technical Recommendation "Cool URIs for the Semantic Web" assumes that it's easy to determine whether something is an information resource or whether it's a real-world object and that it's impossible to convey the essence of real-world objects in a stream of bits. I find this a bit unworldly. It even cites the unicorn as an example of a "real-world object". I guess that makes Elvis a real-world object, too. Conversely, even things that live completely on the internet are rarely "conveyed in a message" any more. A typical URI-addressable service today is constructed out of software, web services, content delivery networks, advertising delivery networks and clustered hardware so that the "essential characteristics" include the attributes of real-world objects like me.

I've recently become aware that lots of really smart people have thought and written about the theory of identifiers and about how the Semantic Web should handle them. I've particularly enjoyed an article called "In Defense of Ambiguity" by Patrick Hayes and Harry Halpin. But to answer my questions about the semantics of OpenID, there's no sage more useful than the one who said "When I use a word, it means just what I choose it to mean - neither more nor less." Semantics do not get determined by those who mint the identifiers, but rather by those who make use of them. It helps if they are also willing to pay the IDs a bit extra.

2 comments:

  1. Minor point... when you say "and then implemented EZProxy so they would have to deal with it" I think you mean "wouldn't" rather than "would"?

    On the "unicorn as an example of a real-world object" I certainly agree that there is the potential for confusion. Note that the document you cite, Cool URIs for the Semantic Web, uses 'Web document' rather than 'information resource' throughout, which I think is helpful, but, yes, Elvis was and still is a real-world object - even well after he left the building.
