Thursday, April 16, 2009

Optimism about OpenURL

Four weeks ago, there was a thread on the OpenURL listserv with the wonderful title "OpenURL listserv still not accepting my mail". I was mentioned by Herbert van de Sompel, so I thought I should reply. The problem was that I had been unsubscribed by virtue of having left my email address behind when I left OCLC. I figured, no problem, I should be able to get resubscribed. With a bit of help from Phil Norman, I eventually got resubscribed, only to find out that the OpenURL listserv was not accepting my mail either!

I've never had much urge to start blogging, but I've known for a while what the name of my blog would be!

The following is a horrible way to start a blog, but I fully intend for this to be a horrible blog. It will be incomprehensible, arcane, obscure, indirect and never poetic. Here's what I had to say that the OpenURL listserv won't publish:

It's a bit unkind to talk about dynamical OpenURL formats as "false fantasy". Having said that, I think most of the committee that worked on the OpenURL standard would plead guilty to being optimistic about the future. The present situation, however, is that if you use a metadata format that a resolver is unfamiliar with, there are no resolvers, either in production or in the lab, that will understand enough about the ContextObject to do anything other than validate it.

If you're a glass-half-full guy like me, you'll say, "Wow, you mean an OpenURL link resolver can actually validate a metadata format that it's never seen before???" and you'll admire the practicality of the group that worked on the standard.

If you're a glass-half-empty person, you'll say, "That's completely useless, the resolver has no hope of doing anything useful for a user unless somebody goes and does some work on the format" and you'll be muttering about the false fantasies and delusions of the group that worked on the standard.

As Herbert pointed out, the standard is written so that metadata formats that are not in the registry (and thus validated as being important in some way by real live human beings) must be described by either an XML schema or a matrix file. At the time we worked on the standard, the most that could be accomplished with this rule was that a resolver machine would be able to validate a ContextObject. We thought that was a realistic and sensible goal. There was always the hope that semantic web technology would advance to the point that self-describing metadata formats would also be possible. And in fact, some very interesting annotation technologies have since been developed that would make that possible, if you really needed to do it.
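To make "validate but not understand" concrete, here is a rough Python sketch. It is only an illustration under loud assumptions: the "recipe" format, its identifier, and its key list are made up, and a plain set of allowed keys stands in for a real matrix file, which carries more detail than this. The `rft_val_fmt` key and the `rft.` prefix follow the key/encoded-value style of OpenURL 1.0.

```python
from urllib.parse import parse_qsl

# Hypothetical stand-in for a matrix file: just the keys a made-up format allows.
RECIPE_FORMAT = "info:ofi/fmt:kev:mtx:recipe"  # hypothetical format identifier
ALLOWED_KEYS = {RECIPE_FORMAT: {"title", "cook", "cuisine"}}


def validate_referent(query):
    """Return a list of problems found in the referent's by-value metadata."""
    pairs = parse_qsl(query, keep_blank_values=True)
    fmt = dict(pairs).get("rft_val_fmt")
    if fmt is None:
        return ["no rft_val_fmt given"]
    allowed = ALLOWED_KEYS.get(fmt)
    if allowed is None:
        return ["no description available for " + fmt]
    problems = []
    for key, _value in pairs:
        # Only the referent's own metadata keys are checked against the format.
        if key.startswith("rft.") and key[4:] not in allowed:
            problems.append("unexpected key " + key)
    return problems
```

The function can say whether the keys are legal for the format, but nothing in it knows what a "cook" is or what to do with one. That is the whole gap between validating a ContextObject and doing something useful with it.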

The bottom line for the question that began the discussion four weeks ago is that registering the metadata formats that are thought to be important is a Good Thing. Those formats will not be self-describing in common use (because they are successful without self-description!)

(Thanks to Paul Moss who alerted me to the discussion while my listserv subscription had lapsed since leaving OCLC, and to Phil Norman, who helped resubscribe me.)

4 comments:

  1. Eric,

    I share your belief that there is something fundamentally true about OpenURL but for various reasons it hasn't lived up to its potential. In one of my Q6 blog entries, I actually said "the OpenURL domain model can be used to model every operation that happens today on the Internet". I still believe that.

    Where I think OpenURL went astray was to hedge on the Transport issue. In hindsight, HTTP should have been assumed rather than modeled as a Transport. Given the potential of SOAP at the time, this decision was understandable but unfortunate. In essence, OpenURL reinvented much of HTTP and then ended up tunneling this information in an HTTP message (entity body or query string), just in case somebody someday wants to Transport it via a SOAP message instead. Competing with the HTTP model was destined for failure, as SOAP found out.

    The good bit is that OpenURL remodeled HTTP based on an intuitive domain model: "who", "what", "where", "why", "when", and "how". The committee gave these classes funny names, but the intuitive sense was dead on and will remain relevant forever.

    For better or worse, HTTP has won. Nevertheless, OpenURL has a vital set of use cases. I believe that OpenURL can regain its mojo if we can re-envision these use cases in the context of Linked Data. I believe the future of OpenURL lies there.

    Jeff

  2. Jeff,

    Thanks for the comment (a historic first comment to my secret blog!). My view is that if OpenURL went astray it was in failing to KISS (Keep it simple, stupid). Abstracting transport was a logical consequence of addressing the question: "how do you redirect a POST?" and then failing to settle on a single answer. Worse could have happened.

    Eric

  3. Eric,

    Your explanation for why Transport was abstracted is intriguing. I was just guessing when I assumed it was to accommodate a possible SOAP future that never materialized. It seems that an HTTP solution could have been to return a 200 status with a full-text representation entity body and the Content-Location header set to the actual location rather than a redirect. Is this 20/20 hindsight or is there more to it?

    Jeff

  4. The redirection problem that came up was that of a linking hub/gateway such as OCLC's OpenURL gateway/Resolver registry. Of course, when we looked at the ways we could address this, SOAP naturally came up as a possible way to ship context data along a chain of resolvers.
