It's interesting to compare the core data models for RDF and for Twitter. In RDF, the fundamental particles are, as I've said, subject-object-predicate triples. To recast that last sentence into the RDF model, we would proceed as follows:
Assertion:
subject: RDF
object: subject-object-predicate triples
predicate: has fundamental particles of type
That's probably too self-referential for most people to wrap their heads around, so instead I'll change the example:
Assertion:
subject: The United States
object: Barack Obama
predicate: has a president named
I usually have trouble remembering which is the predicate and which is the object. If you think about it, however, you can express the same particle of knowledge in ways that swap the roles of predicate and object, or even subject and predicate. For example:
Assertion:
subject: Barack Obama
object: President of the United States
predicate: has the office of
In your copious spare time, you can work out the other 4 permutations.
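As a rough illustration of the two Obama formulations above, here is a minimal Python sketch. The `Triple` type and the `as_sentence` helper are names made up for this example, not part of any RDF library; the field order simply follows the subject/object/predicate layout used in the examples.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One assertion, in the subject/object/predicate layout used above."""
    subject: str
    object: str
    predicate: str


# Two ways of slicing (nearly) the same particle of knowledge.
by_country = Triple(
    subject="The United States",
    object="Barack Obama",
    predicate="has a president named",
)
by_person = Triple(
    subject="Barack Obama",
    object="President of the United States",
    predicate="has the office of",
)


def as_sentence(t: Triple) -> str:
    """Read a triple back in subject, predicate, object order."""
    return f"{t.subject} {t.predicate} {t.object}"


print(as_sentence(by_country))
print(as_sentence(by_person))
```

Nothing in the structure itself says the two triples are related; that knowledge lives only in the reader's head, which is part of what makes the role-swapping so slippery.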
Now let's look at Twitter. The particle of information in Twitter, the tweet, seems also to be a triple:
Tweet:
tweeter: gluejar
message: going to bed now!
time: Wed, 29 Apr 2009 06:58:01 +0000
The tweeter in turn has associated with it sets of followed users and followers as well as profile information. There's a lot to talk about here, and in a previous post I pointed out that Twitter message content is becoming richer and more linguistically complex. But the point I'd like to make for now is that Twitter's point of view is that it doesn't care so much about what the message is saying as who is saying it and when it was said. The more we look at the RDF examples above, the more the subject-object-predicate representation of knowledge seems limiting. The assertion may be true or false depending on when it was said; assertions removed from the context of who is making the assertion are for the most part useless because machines have no way to know whether to trust the assertion.
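For concreteness, the tweet above can be sketched in Python as a who/what/when record. The `Tweet` type is a hypothetical name used only for illustration and has nothing to do with Twitter's actual API; the one real library call is `email.utils.parsedate_to_datetime`, which parses timestamps written in this RFC 2822 style.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import NamedTuple


class Tweet(NamedTuple):
    """The who/what/when triple described above (illustrative only)."""
    tweeter: str    # who
    message: str    # what
    time: datetime  # when


tweet = Tweet(
    tweeter="gluejar",
    message="going to bed now!",
    # Parse the timestamp string into a timezone-aware datetime.
    time=parsedate_to_datetime("Wed, 29 Apr 2009 06:58:01 +0000"),
)

print(tweet.tweeter, "said", repr(tweet.message), "at", tweet.time.isoformat())
```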
Friend-of-the-blog Jeff Young asserts that the OpenURL data model can be thought of as answering 6 questions: Who, What, Where, When, Why and How. Whatever success Twitter has achieved can be thought of as an argument that the most important of these are the Who, What and When.
Sanity Alert! The following may be mind-blowing to certain susceptible individuals: the data model that Twitter REALLY uses to propagate tweets is RSS and Atom. These formats are descended from what was originally called "Meta Content Format", which became "RDF Site Summary" (yes, the very same RDF!), which became "Really Simple Syndication" or maybe something else, I'm not sure for sure. Here's how Twitter REALLY feeds into the semantic web:
tweet:
title: gluejar: going to bed now!
description: gluejar: going to bed now!
pubDate: Wed, 29 Apr 2009 06:58:01 +0000
guid: http://twitter.com/gluejar/statuses/1649740567
link: http://twitter.com/gluejar/statuses/1649740567
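To show that the who/what/when triple is still recoverable from this RSS item, here is a minimal Python sketch. The `who_what_when` function is a hypothetical helper, and it assumes every title has the form "tweeter: message" as in this one example; real feeds may not be that tidy.

```python
from email.utils import parsedate_to_datetime

# The RSS item fields shown above, as a plain dictionary.
item = {
    "title": "gluejar: going to bed now!",
    "description": "gluejar: going to bed now!",
    "pubDate": "Wed, 29 Apr 2009 06:58:01 +0000",
    "guid": "http://twitter.com/gluejar/statuses/1649740567",
    "link": "http://twitter.com/gluejar/statuses/1649740567",
}


def who_what_when(rss_item):
    """Split an RSS item back into (tweeter, message, time).

    Assumes the title looks like "<tweeter>: <message>"; a title without
    that separator raises ValueError when unpacking.
    """
    tweeter, message = rss_item["title"].split(": ", 1)
    return tweeter, message, parsedate_to_datetime(rss_item["pubDate"])


print(who_what_when(item))
```

Note that the "who" has to be dug out of a free-text title: the feed format has no field that is used here for the tweeter.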
Exercise for the reader: how does this look in Atom?
Does anyone but me think that there's something weird going on here?
Apples and oranges. And fighter jets and goldfish. "...it doesn't care so much about what the message is saying as who is saying it and when it was said." So what is the point of comparing and contrasting two information technologies with such different goals? That's like saying that XML is better than relational databases because it's better at representing the relationship of inline markup to prose sentences, or that relational databases are better than XML because they scale so well when you want to track data that fits well into normalized tables.
The things that Twitter serialization formats do well (track who is saying something, a simple string to represent what they said, and when they said it) are trivial in RDF--many have done it in RDF (http://bit.ly/bR3lbs)--and the things that RDF does well have nothing to do with the goals of Twitter. So, again, what is the point of comparing and contrasting two information technologies with such different goals?
bobducharme: Given that the goals are so different, isn't it of interest that the data models are similar? Isn't it interesting to see how the ecosystems surrounding Twitter and the semantic web have evolved so differently?
Now that Twitter has announced annotations, we'll see increasing overlap of function and capability of the two; they are still very different beasts, but maybe they'll begin to compete for sustenance.
Twitter is a messaging service and a schema. RDF is a format for data representation. They can't compete, because they operate at completely different levels.
In fact, Twitter may use RDF.
And a tweet is obviously not a triple, but a collection of them:
Assertion:
subject: tweet
object: gluejar: going to bed now!
predicate: content
Assertion:
subject: tweet
object: Wed, 29 Apr 2009 06:58:01 +0000
predicate: date
Etc.
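The decomposition suggested in this comment can be sketched in Python as follows. The `Triple` type and `tweet_to_triples` function are hypothetical names, and using the status URL to identify the tweet is an assumption made for the example; the comment itself just writes "tweet" as the subject.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One assertion, in the subject/object/predicate layout used above."""
    subject: str
    object: str
    predicate: str


def tweet_to_triples(tweet_id, fields):
    """Emit one triple per field, all sharing the tweet as subject."""
    return [Triple(tweet_id, value, name) for name, value in fields.items()]


triples = tweet_to_triples(
    # Assumption: the status URL stands in for "tweet" as the subject.
    "http://twitter.com/gluejar/statuses/1649740567",
    {
        "content": "gluejar: going to bed now!",
        "date": "Wed, 29 Apr 2009 06:58:01 +0000",
    },
)
for t in triples:
    print(t)
```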
One could imagine using Twitter as a collaborative ontology workbench where anyone can propose RDF triples and others can confirm them (retweet), refute them, or make different proposals or comments (reply), etc. Does anybody know of someone who has tried that?