Leigh Dodds suggested yesterday that someone should do an "evolutionary study" on successful and failed conference hashtags. Since I've been interested in the way that vocabulary propagates on networks, I decided to take up the idea and do a bit of data collection.
If you're not on Twitter, or haven't been to a conference recently, you may not have encountered the practice of hashtagging a conference. The hashtag is just a string that allows search engines to group together posts on a single topic. It's become popular to use Twitter as a back-channel to discuss conference presentations while they happen (does anyone still use IRC for this?), or to report on things being discussed at a meeting. The hashtags can also be used on services such as Flickr. I'll be attending the Semantic Technology conference in San Jose next week, and there was a bit of back and forth about whether the right hashtag should be semtech09 or semtech2009. Someone connected to the conference asserted that the longer version was preferred, sparking Leigh's remark.
To collect data, I searched Twitter for "conference" AND "hashtag" and compiled all the results from 7 and 8 days ago. That gave me a list of 30 conferences, and I then searched for all tweets that included the hashtag for each of them. "MediaBistro Circus" (#mbcircus) was the most referenced of these, with 1500 tweets in all.
There was no evidence at all of anything resembling evolution. Almost all the hashtags appeared spontaneously and without controversy, each one apparently via the agency of an intelligent designer. I did not find a single example of multiple hashtags competing with each other in a "survival of the fittest" sort of way. I found two instances of dead-on-arrival hashtags, which were proposed once and never repeated. In only one case was there significant usage of an alternate hashtag: the America's Future Now conference appeared in 22 tweets as #afn09, compared to 949 tweets for #afn. Even in this case, there appeared to be no competition, as the #afn09 tweeters stuck with their hashtag.
One question posed by the initial query was whether "09" or "2009" was the correct convention. Although "09" was 2 to 3 times more popular than "2009" in hashtags, having no year indicator at all was twice as frequent as having a year. What was very clear, however, was that hashtag selectors preferred to avoid colliding with unrelated hashtags. The most interesting example of this was the pair "#cw2009" and "#cw09", used for the ComplianceWeek Conference and the CodeWorks Conference, respectively.
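For anyone who wants to repeat the year-suffix tally, here is a minimal sketch in Python of how it could be done. It is not the procedure I actually used (I compiled my list by hand), and the function names `year_style` and `tally_year_styles` are made up for illustration. It assumes a trailing "20xx" means a four-digit year and any other two trailing digits mean a two-digit year, which is only a heuristic: a tag like "#web20" would be miscounted.

```python
import re
from collections import Counter


def year_style(hashtag: str) -> str:
    """Classify a hashtag by the year suffix it carries (heuristic)."""
    tag = hashtag.lstrip("#").lower()
    # Check the four-digit form first, since "2009" also ends in two digits.
    if re.search(r"20\d{2}$", tag):
        return "four-digit"
    if re.search(r"\d{2}$", tag):
        return "two-digit"
    return "none"


def tally_year_styles(hashtags):
    """Count how many hashtags fall into each year-suffix style."""
    return Counter(year_style(h) for h in hashtags)


if __name__ == "__main__":
    # Only the hashtags named in this post, not the full list of 30.
    sample = ["#semtech09", "#semtech2009", "#afn", "#afn09",
              "#mbcircus", "#cw2009", "#cw09"]
    for style, count in tally_year_styles(sample).items():
        print(style, count)
```

The sample list here contains just the hashtags mentioned in this post, so its counts say nothing about the ratios I reported above.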
None of this is surprising in retrospect. It's quite easy to see if a hashtag is the right one: you just run a search to see what comes up before you put it in your tweet. If you can't find one that works as you want it to, most likely you will not use a hashtag at all. Conferences tend to be meetings of people who are connected to each other via common interests, and a small number of those people tend to update frequently and be followed by large numbers of others with the same interests. Conference goers also seem to be very motivated to propagate and adopt hashtags; hashtag announcements for conferences are quite frequently retweeted.
The behavior of people selecting hashtags is quite uniform. However, I've previously noted another meeting of semantic web folk that had trouble getting their hashtag straight. Perhaps people who develop taxonomies for a living are less likely to adopt other people's suggested vocabulary, based on a feeling that their way is the best way, and thus are more likely to silo themselves. I've seen this phenomenon before: the phones at Bell Labs never seemed to work very well, and my wife's experience with computers at IBM was not exactly problem-free. The librarian conferences I've been to seem to have horrible classification systems. Perhaps one way to improve vocabulary propagation on the semantic web is to get rid of the ontologists!