Wednesday, June 8, 2011

Our Metadata Overlords and That Microdata Thingy

On June 2, our Metadata Overlords spoke. They told us that they'll only listen when we tell them things using a specialized vocabulary they've now given us at the Schema.org website. Although we can still use our stone tablets if that's what we're using now, we're expected to migrate to a new Microdata Thingy, assuming that we really want them to pay attention to our website metadata supplications.

There are among us believers, who, led by druids enraptured by the power of stone tablets to carry truth, will shun the new thingy, but most of us will meekly comply with the edicts of the overlords. We're not able to distinguish the druidic language of the tablets from the new liturgy of the state church. Many things are difficult to articulate in the new vocabulary, but gosh, those tablets were heavy to carry around. And the new thingy doesn't seem so awful, although it's difficult to tell with the mumbled sermons and hymn singing and all.

I hope the overlords don't try to take our pagan rituals of Friending and Liking away from us, though. The incantations used to invoke and bless the Like ritual also use the druidic language, and the help scrolls tell us we might confuse the overlords if we use more than one language in our prayers.

My soul remains troubled, however, at the thought that the Overlords care not for truth and for justice. Sometimes it seems as though the overlords want only for our offerings of attention and seek only to feed our lust for food, drink, entertainment, debauchery and money. Yes, there are new words for our books and learning, but we can say so little about these in language that our wizards and mages will be mute if they ever choose to enter that realm.

I myself was present at a conclave of such mages and wizards dedicated to the entwinement of data from libraries, museums and archives in full openness. When tweet of the new order came, we endeavored to learn more of Schema.org and its thingy. We questioned whether the thingy was an abomination against openness, or whether we might exploit its Overlord endorsement to make our own spells more powerful. We agreed to teach each other our new thingy spells, even as our colleagues elsewhere figured out how to chisel the new vocabulary into stone. Word came from other lands that the new vessel would founder trying to cross the seas.

We then visited the temple of the archive and found the servers cool to the touch. We heard words from a past oracle, ate as they never ate in Rome, drank cool drafts, and returned home emboldened with an enlarged appreciation of intermingled bits.

So it was said, so shall we do.

  1. Google's blog post on adopting microdata was signed by R. V. Guha, who had a bit to do with the creation of RDF.
  2. It's not really a surprise that Google doesn't care about RDFa. In my article on RDFa from 2009, I pointed to mistakes that Google made in their RDFa documentation. They never fixed it.
  3. Schema.org can't even list all of its schemata: the web page, chock full of non-breaking spaces, is truncated!
  4. The current microdata spec is in an odd state where it's confused about how to define an itemtype. In fact, the mechanism for defining new itemtypes is gone! Here's what it says:
    The item type must be a type defined in an applicable specification.

    Except if otherwise specified by that specification, the URL given as the item type should not be automatically dereferenced.

    A specification could define that its item type can be dereferenced to provide the user with help information, for example. In fact, vocabulary authors are encouraged to provide useful information at the given URL.
    Apparently, stuff was removed for some sort of political reason; it's there in the WHAT-WG version. Note that Google links to the W3C version, which is not fully baked.
  5. The Schema.org terms of service are creepy when you get to the part about patents.
  6. The big selling point for RDFa was that Google, Yahoo and Bing supported it for Rich Snippets and the like. But Microdata's inability to easily support complex markup turned out to be a key feature for the search engines. The moral of the story for standards developers: your best customers are always righter than the others.
  7. In the video, Brewster Kahle reads from the last page of A Manual on Methods of Reproducing Research Material by Robert C. Binkley (1936). OCLC Number 14753642. Peter Binkley, a meeting participant, donated a copy of his grandfather's book to the Internet Archive, along with permission to make it free to the public.
  8. Henri Sivonen has written a very readable and informed discussion about Microdata, RDFa, and the process of making standards that you should read if you are interested in why things are the way they are in HTML5.
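
For readers who have not yet seen the thingy described in note 4, here is a minimal sketch of what an itemtype looks like and how a program might read it. The sample markup, the class name MicrodataLister and the function list_microdata are all made up for illustration, and the Book type with its name and author properties is only an assumed example. Note that the sketch treats the item type URL as an opaque label and never dereferences it, which matches the spec text quoted above.

```python
from html.parser import HTMLParser

# Hypothetical sample markup; the type URL and property names are illustrative.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Book">
  <span itemprop="name">A Manual on Methods of Reproducing Research Material</span>
  <span itemprop="author">Robert C. Binkley</span>
</div>
"""


class MicrodataLister(HTMLParser):
    """Collects itemtype URLs and simple text-valued itemprop pairs."""

    def __init__(self):
        super().__init__()
        self.item_types = []   # itemtype URLs, kept as opaque strings
        self.properties = []   # (itemprop name, text value) pairs
        self._pending = None   # itemprop still waiting for its text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # itemscope is a boolean attribute, so only its presence matters.
        if "itemscope" in attrs and "itemtype" in attrs:
            self.item_types.append(attrs["itemtype"])
        if "itemprop" in attrs:
            self._pending = attrs["itemprop"]

    def handle_data(self, data):
        # Pair the most recent itemprop with the first non-blank text after it.
        if self._pending is not None and data.strip():
            self.properties.append((self._pending, data.strip()))
            self._pending = None


def list_microdata(html):
    """Return (item types, properties) found in an HTML string."""
    parser = MicrodataLister()
    parser.feed(html)
    parser.close()
    return parser.item_types, parser.properties


if __name__ == "__main__":
    types, props = list_microdata(SAMPLE)
    print(types)
    print(props)
```

This handles only flat items with text values; nested items, link-valued properties and itemref are left out, so it is nowhere near a conforming microdata parser.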

1 comment:

  1. Minor correction to n.7: Brewster bought the book, and since my grandfather's copyright passed into the public domain last year on the 70th anniversary of his death, permission to digitize was purely symbolic.

    A 1984 metaphor seems apt for controlling discourse (and thereby thought) by imposing an impoverished language. RDFa double-plus bad!