Wednesday, May 20, 2009

Reification Part 2: The Future of RDF, RDFa, and the Semantic Web is Behind Us

In Reification Part 1, I introduced the concept of reification and its role in RDF and the Semantic Web. In Part 3, I'll discuss the pros and cons of reification. Today, I'll show some RDFa examples.

I've spent the last couple of days catching up on lots of things that have happened over the last few years while the Semantic Web part of my brain was on vacation. I was hoping to give some examples of reification in RDFa using the vocabulary that Google announced it was supporting, but I can't, because the Google vocabulary is structured so that you can't do anything useful with reification. There are some useful lessons to draw from this little fact. First of all, you can usually design your domain model so that reification isn't needed, and you probably should if you can. In the Google vocabulary, a Review is a first-class object with a reviewer property. The assertion that a product has a rating of 3 stars is not made directly by a reviewer, but indirectly by a review created by a reviewer.

Let's take a look at the HTML snippet presented by Google on their help page for RDFa (it's permissible to skip past the code if you like):


<div xmlns:v="http://rdf.data-vocabulary.org/#"
typeof="v:Review">
<p><strong><span property="v:itemReviewed">
Blast 'Em Up</span>
Review</strong></p>
<p>by <span rel="v:reviewer">
<span typeof="v:Person">
<span property="v:name">Bob Smith</span>,
<span property="v:title">Senior
Editor</span> at ACME Reviews
</span>
</span></p>
<p><span property="v:description">This is a great
game. I enjoyed it from the opening battle to the final
showdown with the evil aliens.</span></p>

</div>

(Note that I've corrected a bunch of Google's sloppy mistakes here: the help page erroneously had "v:person", "v:itemreviewed" and "v:review" where "v:Person", "v:itemReviewed" and "v:Review" would have been correct according to their published documentation. I've also removed an affiliation assertion that is hard to fix for reasons that are not relevant to this discussion, and I've fixed the non-well-formedness of the Google example.)

The six RDF triples embedded here are:

subject: this block of html (call it "ThisReview")
predicate: is of type
object: google-blessed-type "Review"

subject: ThisReview
predicate: is reviewing the item
object: "Blast 'Em Up"

subject: ThisReview
predicate: has reviewer
object: a google-blessed-type "Person"

subject: a thing of google-blessed-type "Person"
(call it BobSmith)
predicate: is named
object: "Bob Smith"

subject: BobSmith
predicate: has title
object: "Senior Editor"

subject: ThisReview
predicate: gives description
object: "This is a great game. I enjoyed it from the
opening battle to the final showdown with the evil
aliens."
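
For readers who would rather see data than prose, here is a rough Python sketch of those six triples as plain tuples, with one small lookup over them. The labels `_:ThisReview` and `_:BobSmith` are names made up for the two unnamed nodes, the `v:` and `rdf:` strings are unresolved abbreviations, and the helper functions are hypothetical, not part of any RDF library.

```python
# The six triples from Google's example as (subject, predicate, object) tuples.
# "_:ThisReview" and "_:BobSmith" are made-up labels for the two unnamed nodes.
DESCRIPTION = ("This is a great game. I enjoyed it from the opening battle "
               "to the final showdown with the evil aliens.")

review_graph = {
    ("_:ThisReview", "rdf:type", "v:Review"),
    ("_:ThisReview", "v:itemReviewed", "Blast 'Em Up"),
    ("_:ThisReview", "v:reviewer", "_:BobSmith"),
    ("_:BobSmith", "v:name", "Bob Smith"),
    ("_:BobSmith", "v:title", "Senior Editor"),
    ("_:ThisReview", "v:description", DESCRIPTION),
}


def objects(graph, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in graph."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


def reviewer_names(graph, item):
    """Names of everyone who wrote a review of the item with this literal name."""
    names = set()
    for (review, p, o) in graph:
        if p == "v:itemReviewed" and o == item:
            for person in objects(graph, review, "v:reviewer"):
                names |= objects(graph, person, "v:name")
    return names
```

Note that the item is only a string here, so the lookup has to match on the literal "Blast 'Em Up" instead of following a link to a product node.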

Notice that in Google's favored vocabulary, Person and Review are first-class objects and the item being reviewed is not (though they defined a class that might be appropriate). An alternate design would be to make the item a first-class object and the review a predicate that could be applied to RDF statements. The seven triples for that would be:

subject: a thing of google-blessed-type "Product"
(call it BlastEmUp)
predicate: is named
object: "Blast 'Em Up"

subject: BobSmith
predicate: is named
object: "Bob Smith"

subject: BobSmith
predicate: has title
object: "Senior Editor"

subject: an RDF statement (call it TheReview)
predicate: has creator
object: BobSmith

subject: TheReview
predicate: has subject
object: BlastEmUp

subject: TheReview
predicate: has predicate
object: gives description

subject: TheReview
predicate: has object
object: "This is a great game. I enjoyed it from the
opening battle to the final showdown with the evil
aliens."
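
The same seven triples can be sketched in Python as plain tuples. This is only an illustration: the `_:` labels are made up, `dc:creator` stands in for "has creator", and `described_triples` is a hypothetical helper that reassembles the triple a statement node talks about.

```python
# The seven triples of the reified design as (subject, predicate, object) tuples.
# "_:TheReview" is a made-up label for the statement node; it describes a
# triple through rdf:subject / rdf:predicate / rdf:object.
DESCRIPTION = ("This is a great game. I enjoyed it from the opening battle "
               "to the final showdown with the evil aliens.")

reified_graph = {
    ("_:BlastEmUp", "v:name", "Blast 'Em Up"),
    ("_:BobSmith", "v:name", "Bob Smith"),
    ("_:BobSmith", "v:title", "Senior Editor"),
    ("_:TheReview", "dc:creator", "_:BobSmith"),
    ("_:TheReview", "rdf:subject", "_:BlastEmUp"),
    ("_:TheReview", "rdf:predicate", "v:description"),
    ("_:TheReview", "rdf:object", DESCRIPTION),
}

PARTS = ("rdf:subject", "rdf:predicate", "rdf:object")


def described_triples(graph):
    """Map each statement node to the (s, p, o) triple it describes."""
    found = {}
    for (node, p, o) in graph:
        if p in PARTS:
            found.setdefault(node, {})[p] = o
    return {
        node: tuple(parts[name] for name in PARTS)
        for node, parts in found.items()
        if len(parts) == len(PARTS)  # skip incomplete descriptions
    }
```

Calling `described_triples(reified_graph)` gives back the one triple that `_:TheReview` is about; that triple itself is not a member of `reified_graph`, which only contains the description of it.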

To put those triples in the same HTML, I do this:


<div xmlns:v="http://rdf.data-vocabulary.org/#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
typeof="rdf:Statement">
<span rel="dc:creator" resource="#BobSmith"></span>
<p><strong>
<span rel="rdf:subject">
<span typeof="v:Product">
<span property="v:name">Blast 'Em Up</span>
</span>
</span> Review</strong></p>
<p>by <span about="#BobSmith" typeof="v:Person" id="BobSmith">
<span property="v:name">Bob Smith</span>,
<span property="v:title">Senior Editor</span>
at ACME Reviews
</span></p>
<p><span rel="rdf:predicate"
resource="[v:description]"></span>
<span property="rdf:object">This is a great
game. I enjoyed it from the opening battle
to the final showdown with the evil
aliens.</span></p>
</div>

To do this, I've borrowed one extra term, "dc:creator", from the venerable Dublin Core vocabulary.

Some observations:
  1. Reification requires a bit of gymnastics even for something simple; if I wanted to reify more than one triple, it would start to look really ugly.
  2. By using a well-thought-out knowledge model, I can avoid the need for reification.
  3. The knowledge model has a huge impact on the way I embed the information.

This last point is worth thinking about further. It means that for you and me to exchange knowledge using RDFa or RDF, we need to share more than a vocabulary; we need to share a knowledge model. It reminds me of another story I heard on NPR, about the Aymara people of the Andean highlands, whose language expresses the future as being behind them, whereas in English and other Western languages the future is thought of as being in front of us. We can know the Aymara vocabulary for front and back, but because we don't share the same knowledge model, we wouldn't be able to speak successfully with an Aymara speaker about the past and the future.
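
To make the knowledge-model point concrete, here is a small Python sketch under made-up assumptions: two tiny graphs use the same `v:description` vocabulary term, but a lookup written for the Review-centered model finds nothing in the reified one. The node labels, the shortened description text, and the function name are all hypothetical.

```python
# Two tiny graphs that use the same vocabulary under different knowledge models.
TEXT = "This is a great game."  # shortened for the sketch

# Review-centered model: a review node points at the item by its name.
review_model = {
    ("_:ThisReview", "v:itemReviewed", "Blast 'Em Up"),
    ("_:ThisReview", "v:description", TEXT),
}

# Reified model: a statement node describes a triple about a product node.
reified_model = {
    ("_:BlastEmUp", "v:name", "Blast 'Em Up"),
    ("_:TheReview", "rdf:subject", "_:BlastEmUp"),
    ("_:TheReview", "rdf:predicate", "v:description"),
    ("_:TheReview", "rdf:object", TEXT),
}


def descriptions_review_model(graph, item):
    """Find descriptions, assuming reviews link to the item via v:itemReviewed."""
    reviews = {s for (s, p, o) in graph if p == "v:itemReviewed" and o == item}
    return {o for (s, p, o) in graph if s in reviews and p == "v:description"}


found = descriptions_review_model(review_model, "Blast 'Em Up")
missed = descriptions_review_model(reified_model, "Blast 'Em Up")
```

Here `found` holds the description and `missed` is empty, even though `v:description` appears in both graphs. Knowing the word is not enough; the consumer also has to know where in the model to look for it.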

1 comment:

  1. There is another problem with reification: Adding a reified triple allows you to say things about the triple, but it does not actually assert the triple. Thus, if you want your document to contain the statement

    subject: BlastEmUp
    predicate: description
    object: "This is a great game..."

    then you need to add that triple once more, in non-reified form. This makes reification almost useless in practice, IMO.
