Friday, May 15, 2009

Reification Part 1: RDF and the dry martini

A man walks into a bar. The bartender asks him what he wants. "Nothing," he says.
"So why did you come in here for nothing?" asks the bartender.
"Because nothing is better than a dry martini."

This joke is an example of reification. An abstract concept, "nothing", is linguistically twisted into a real object, resulting in a humorous absurdity. I first encountered the concept 10 years ago, when I learned RDF (Resource Description Framework), the data model that was designed to be the fundamental underpinning of the semantic web. At that time, I was sure that "reification" was a completely made-up word, a bit of jargon stolen from the knowledge representation community. It's only this week that I learned that "reification" is in fact a "macaronic calque" translation of a completely made-up German word used prominently by Karl Marx, "Verdinglichung". Somehow that doesn't make me feel much better about the word. If you learn nothing else from reading this, remember that you can use "reification" as a code word to gain admittance to any gathering of the Semantic Webnoscenti.

In RDF, reification is necessary so that stores of triples can avoid self-contradiction. Let me translate that into English. RDF is just a way to say things about other things so that machines can understand. The model is simple enough that machines can gather together large numbers of RDF statements, apply mathematical machinery to the lot of them, and then spit out new statements that make it seem as though the machines are reasoning. The problem is that machines are really stupid, so if you tell them that the sky is blue, and also that the sky is not blue, they can't resolve the contradiction and they start emitting greenhouse gases out the wazoo and millions of people in low-lying countries lose their homes to flooding. What you need to do instead is to "reify" the contradictory statements and tell the machine "Eric said 'the sky is blue'" and "Bruce said 'the sky is not blue'". RDF, as a system, can't talk about the assertions that it contains without taking the extra step of reifying them.

So let's see how the RDF model accomplishes this. (Remember, RDF represents assertions as a set of (subject, predicate, object) triples.) We start with:
Subject: The sky
Predicate: is colored
Object: blue
And after reification, we have:
Subject: statement x
Predicate: has Subject
Object: The sky

Subject: statement x
Predicate: has Predicate
Object: is colored

Subject: statement x
Predicate: has Object
Object: blue

Subject: Eric
Predicate: said
Object: statement x
So now the statement about the color of the sky has become a real thing within the RDF model, and I can do all sorts of things with it, such as compare it to a dry martini. The downside is that this comes at the cost of turning one triple into three triples, plus a fourth to record who said it.
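
For readers who think better in code, here is a minimal sketch of the same transformation in plain Python. It is an illustration under stated assumptions, not a real RDF library: the `Triple` type and the `reify` and `who_said` helpers are names I made up for this example, the predicate labels are the informal ones used above, and "is not colored" is a stand-in for Bruce's denial. Real RDF spells these properties rdf:subject, rdf:predicate and rdf:object, and also marks the statement's type as rdf:Statement.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A hypothetical stand-in for an RDF triple, using plain strings."""
    subject: str
    predicate: str
    obj: str


def reify(triple: Triple, statement_id: str) -> list[Triple]:
    """Describe `triple` as a thing called `statement_id` without asserting it."""
    return [
        Triple(statement_id, "has Subject", triple.subject),
        Triple(statement_id, "has Predicate", triple.predicate),
        Triple(statement_id, "has Object", triple.obj),
    ]


def who_said(store: list[Triple], triple: Triple) -> list[str]:
    """Return everyone credited with a reified statement matching `triple`."""
    facts = set(store)
    speakers = []
    for t in store:
        # For each "said" triple, check that its statement describes `triple`.
        if t.predicate == "said" and set(reify(triple, t.obj)) <= facts:
            speakers.append(t.subject)
    return speakers


sky_is_blue = Triple("The sky", "is colored", "blue")
# Assumption: the denial is modeled with an invented "is not colored" predicate.
sky_is_not_blue = Triple("The sky", "is not colored", "blue")

# The store holds only descriptions of the two claims plus who made them.
store: list[Triple] = []
store += reify(sky_is_blue, "statement x")
store.append(Triple("Eric", "said", "statement x"))
store += reify(sky_is_not_blue, "statement y")
store.append(Triple("Bruce", "said", "statement y"))

print(who_said(store, sky_is_blue))
print(sky_is_blue in store)
```

Notice that neither claim about the sky is ever asserted directly in the store. The store only records that the claims exist and who made them, which is why Eric and Bruce can disagree without the machine getting confused.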

Reification has analogs in other disciplines. Software developers familiar with object-oriented programming may want to think of reification as making the assertion into a first-class object. Physicists and people who just want their minds blown may want to compare reification to "second quantization". At this point, I'll don my ex-physicist hat (even though I never wore a hat while doing physics!) and tell you that second quantization is the mathematical machinery of field theory that allows field theory to treat bundles of waves as if they were real particles that can be created and annihilated.

Whether you're doing linked open data or quantum field theory, it's a good idea to focus on things that behave as if they were real. Otherwise, no dry martinis for you!

This is the first of three articles on reification. In Part 2, I'll show how reification is applied in a real example, using the newly trendy RDFa. In Part 3, I'll write about whether reification is a good idea.
