Sunday, October 31, 2010

The User-Generated "Rally to Restore Sanity"

I had an inkling that Jon Stewart's "Rally to Restore Sanity" was going to be big a few weeks ago when I tried to make a hotel reservation in Washington, DC and found that no hotel rooms in the city were available this weekend (I had decided to combine a trip to the rally with some other business).

On Saturday morning, I realized that the rally would indeed be something unprecedented; at 9 AM, the inbound Metro Red Line train was jam-packed with rallygoers at the third stop (Twinbrook), even though nothing was scheduled until noon. By the sixth stop, the train was having difficulty getting its doors closed, leading to delays.

The odd thing about the rally was that there were hardly any instructions as to what we were to do at the "million moderate march", other than to have some fun while engaging in respectful discourse with other people and being willing to listen. Rallygoers were to make of the event whatever they wished. For some, it was a massive Halloween party; for others it was a chance to express dismay at the rhetoric of Glenn Beck and Fox News. Whatever it was, there was a widespread feeling of wanting to participate in something historic.

On the train, collective coping behaviors emerged from the unexpected mass intimacy. When we got to stations where one or two riders needed to get off, other riders squeezed aside to make sure they could emerge from the train. Later, above ground, the crowd would make a special effort to part for rallygoers in wheelchairs and parents with children in strollers. Police cars and ambulances drove through streets packed with people because somehow the crowd recognized a necessity.

In the part of the crowd where I ended up, none of us could really hear or see the speeches or entertainment, but nobody seemed to care. We can watch Jon Stewart and Stephen Colbert on TV any evening, and we knew without watching what they might say. It was the rest of the crowd we were there to be with. Close to where I stood, a young man was trying to climb a tree to get a better view; there was already another fellow up in the tree. The crowd took notice, and began to cheer him on. Before long, there was a chant of "YES YOU CAN" that took hold of the crowd, and the fellow in the tree tried his best to help the climber up.

This morning, reading the press accounts, I was dismayed to see that much of the "mainstream media" seemed to miscomprehend the Rally for Sanity by focusing on the show rather than the audience. Whether the rally was 215,000 people or 420,000 people, the media's mistake is akin to reporting that YouTube is a site for stupid pet trick videos. The best way to understand the Sanity Rally's significance, in fact, is to make the analogy that YouTube is to Fox (or CBS, for that matter) as the Rally for Sanity is to almost any previous political rally. Finally, crowdsourcing has come to the crowd.

Nonetheless, when Jon Stewart began to speak at the end of the rally, the crowd suddenly became quiet and strained to hear. "These are hard times, not end times," we heard. There was something about the miracle of the Lincoln Tunnel, where cars strain to get through, merging from 20 lanes or so down to two, and it somehow works even though the people in one car might be totally different from the people in the next car. The traffic moves forward concession by sensible concession, and sometimes the light at the end of the tunnel isn't the promised land, it's New Jersey.

But we didn't need Jon Stewart to tell us that; we had discovered it by being pressed together in a massive crowd, and by learning how to get around anyway.

Friday, October 29, 2010

Business Idea #4: Ungluing eBooks

It's been a while since I blogged a business idea for Gluejar. There have been at least three reasons for this hiatus.
  1. Blogging has been fun. Every article generates a bunch of ideas for me to pursue and I just love exploring new ideas and learning about new things. Not to mention the dialogue with readers which has been very stimulating. I've learned a lot.
  2. Since January, I've been working with architects and contractors on an extensive home renovation, which takes up a lot of time and mind share. This work is almost done, so it's time to think about new projects.
  3. Previous ideas smoldered and the heat dissipated. Same with a few other opportunities I looked at.
On the other hand, my recent series of articles on acquiring ebooks into the commons has led me to seriously consider making this into a business. The more I consider the possibilities and the details, the more convinced I become that the job is both viable and important. It's been over 2 months, and the hallucination has not worn off.

For your reference, here are the relevant articles:
The business model for this venture is simple: build, run and promote an ebook bounty market, and charge 5%-ish of closed transactions. The hard part is winning over mass markets of consumers, both individual and institutional. The first step, perhaps the hardest step, is explaining and advocating the concept to people who are only just getting a feel for what ebooks are. "Stealth" is just not going to get that done, so I've decided to be as open as possible about what I'm doing.

So here's a first draft of a home-page pitch/explanation. Let me know what you think.
Unglue Your eBooks!
Have you noticed that your eBooks are stuck inside your proprietary reading device?
Have you noticed that the printed books on your shelves are stuck there, instead of being available on your nifty reading thing?
Have you noticed how hard it is to lend an ebook to someone or to give it to your friends?
Have you noticed that you can't get the ebooks you want through your library?
The problem is glue. Not the kind of glue that binds paper together, but the legal kind that sticks intellectual property to its owners and licensees. Authors need a paycheck so they can devote time and effort to writing books; publishers need income so they can refine those books and make them beautiful. Their problem these days is that the book publishing industry is stuck on an old model of selling individual copies.
Gluejar offers a new way for people who love books to support the people who create books without putting all sorts of licensing glue and restrictions glue on the digital copies. We bring large numbers of book lovers together to financially support the ungluing of the books that they love.
Here's how it works:
  1. Decide how much you want to spend on ungluing your books. The amount is totally up to you; but how about 10% of what you normally spend on books?
  2. Visit our partner sites to decide which books you want to support; you can pledge support for any number of books! Or browse our catalog of books on offer.
  3. You can visit your Gluejar Bookshelf at any time to see the current support level for each of your books.
  4. We won't charge your credit card until supporters have pledged enough money so that rights holders agree to unglue the ebooks you want.
  5. When an ebook is unglued it will appear automatically in your favorite ebook reader account. Once that happens, you can give the ebook to all of your friends and your library can make it available to all of its patrons. 100% legal, anywhere.
  6. Many ebooks come with bonuses for supporters; you might win a dinner with the author, a signed print copy, or maybe just a simple thank you note. We'll email you with details.
As you can see, I've tied the name of the service into the pre-existing Gluejar name for now; at least it's not horrible.

If you want to suggest an alternative pitch, put it in the comments. The best suggestion will win $100 to spend on the Gluejar ebook bounty market!

Wednesday, October 20, 2010

Attributor eBook Piracy Numbers Don't Add Up

In my article on "Consumer Demand for Pirated eBooks", I showed that Google Trends data tells a very different story from the one that anti-piracy services vendor Attributor derived from the very same data. I did not comment, however, on the headline that Attributor gave for its press release. The key finding of the report heralded by that release was that "Daily demand for pirated e-books can be estimated at 1.5-3 million people worldwide." This result has garnered some significant attention, because the number is quite large.

Extracting numbers using the tools used by Attributor is rather involved, and it's taken a while for me to carefully examine the available data. After doing this work, I've decided that when Attributor wrote "can be estimated at 1.5-3 million", they left out the word "blindly". As far as I can tell, Attributor is recklessly inflating the magnitude of ebook piracy; using the very same traffic measurement tools, I estimate the truth to be about 10% of the number they claim.

The Attributor numbers come from data generated by Google's AdWords service. AdWords is designed to help advertisers select advertising keywords and to manage budgets. For example, AdWords will tell you that the keyword "PDF" is used in approximately 101 million searches per month, worldwide, or 3.32 million searches per day. "PDF" is a keyword that a searcher might use in the course of a search for a pirated ebook, so you could reasonably assume that some percentage of these searches involve a consumer looking for a book they can avoid paying for. The trouble with this assumption is that most searches that include "PDF" have nothing to do with ebooks.

Another AdWords tool designed to assist Google advertisers is the keyword suggestion tool. In practice, you use this tool to refine keywords. Here is a table of the top ten refined searches for "PDF":
Keyword         Percent of "pdf" searches
filetype pdf    36.69%
doc to pdf      6.03%
pdf download    3.30%
pdf to swf      3.30%
pdf to xls      2.70%
free pdf        2.70%
pdf free        2.70%
pdf to word     2.21%
pdf to rtf      2.21%
php pdf         1.81%
Of these, it's reasonable to assume that some percentage of the "pdf free" and a smaller fraction of the "pdf download" searches are related to consumers trying to avoid paying for books. The other searches are clearly unrelated to books. We can further use the keyword suggestion tool to refine these estimates. My review of over 700 refined keywords indicates that at most 4% of PDF searches, or 132,000 per day, are looking for ebooks of any kind.
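The arithmetic behind these figures is easy to reproduce. Here is a minimal sketch in Python; the monthly search count and the 4% ceiling are the numbers quoted above, while the 365-day year used for the monthly-to-daily conversion and all of the variable names are my own assumptions for illustration.

```python
# Figures quoted in the text above.
MONTHLY_PDF_SEARCHES = 101_000_000  # AdWords estimate for the keyword "PDF"
EBOOK_SHARE_UPPER_BOUND = 0.04      # "at most 4%" of PDF searches


def per_day(monthly: float) -> float:
    """Convert a monthly count to a daily average (assumes a 365-day year)."""
    return monthly * 12 / 365


daily_pdf_searches = per_day(MONTHLY_PDF_SEARCHES)
daily_ebook_searches = daily_pdf_searches * EBOOK_SHARE_UPPER_BOUND

print(f"PDF searches per day: {daily_pdf_searches:,.0f}")
print(f"Upper bound on ebook-related PDF searches per day: {daily_ebook_searches:,.0f}")
```

The results land at roughly 3.32 million and about 133,000 per day, in line with the rounded figures used in this post.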

Rapidshare is a "file-locker" site, and might be expected to appear in search terms for illegally distributed files. A review of AdWords' suggested refinements for the term "rapidshare", however, reveals that searcher interest in ebooks is negligible compared to that for movies, TV, music and games. Of 743 suggested keywords, only one, accounting for 0.24% of "rapidshare" queries, or about 4,000 per day, is clearly related to ebooks:
Keyword                       Percent of "rapidshare" searches
files rapidshare              13.45%
rapidshare download           6.03%
download rapidshare           6.03%
download from rapidshare      6.03%
rapidshare megaupload         4.93%
free rapidshare               3.29%
rapidshare free download      2.70%
free rapidshare downloader    2.70%
free rapidshare download      2.70%
rapidshare download free      2.70%
free download rapidshare      2.70%
rapidshare free downloader    2.70%
download rapidshare free      2.70%
free rapidshare downloads     2.70%
download free rapidshare      2.70%
rapidshare searcher           2.19%
rapidshare search             1.80%
search on rapidshare          1.80%
dvdrip rapidshare             1.21%
rapidshare file               1.21%
rapidshare windows 7          1.21%
rapidshare mp3                1.21%
rapidshare dvd                0.99%
windows 7 rapidshare          0.81%
movie rapidshare              0.54%
rapidshare movie              0.54%
rapidshare upload             0.54%
upload rapidshare             0.54%
rapidshare downloader         0.44%
rapidshare file download      0.44%
rapidshare music              0.44%
music rapidshare              0.44%
download rapidshare files     0.36%
movies rapidshare             0.36%
rapidshare files download     0.36%
rapidshare windows xp         0.36%
720p rapidshare               0.36%
rapidshare premium accounts   0.30%
rapidshare password           0.30%
xbox 360 rapidshare           0.30%
game rapidshare               0.30%
password rapidshare           0.30%
rapidshare game               0.30%
rapidshare premium account    0.24%
premium account rapidshare    0.24%
rapidshare account premium    0.24%
premium rapidshare account    0.24%
rapidshare generator          0.24%
rapidshare engine             0.24%
rapidshare engine search      0.24%
up rapidshare                 0.24%
rapidshare software           0.24%
software rapidshare           0.24%
rapidshare ebook              0.24%
Harry Potter and the Twilight Saga make appearances farther down the list, but only the titles that exist as movies.

Although direct interest in ebook torrents is so small that AdWords can barely measure it (~1500 searches per day), torrent search sites can give us another way to estimate the magnitude of interest in pirated ebooks. According to "KickassTorrents", the torrents active recently had this composition:
movies 30.04%
music 27.62%
tv 16.22%
apps 13.76%
games 5.52%
anime 5.43%
ebooks 1.42%
About 1.4 million searches using the keyword "torrent" are made on Google daily, according to AdWords. If the distribution of searches mirrors the distribution of files, this would indicate that searches for ebook torrents comprise about 46,200 per day.

All in all, I estimate that about 210,000 searches made on Google per day represent possible interest in pirated ebooks. About 30,000 of these come from the US. The "real" number for all countries could be as high as 300,000 or as low as 100,000. The 1.5-3 million numbers reported by Attributor are not within the range of plausibility.

One difficulty with using Google AdWords to gain insight into piracy is that it measures only a "shadow cast by piracy", as expressed by a commenter on my previous post. Nonetheless, AdWords sheds considerable light on patterns of demand. For example, the tools show clearly that it's common for people to search for movies and TV shows and acquire them extralegally. Also, they indicate that most of the demand, about 82%, for pirated ebooks comes from outside of the US, UK and Canada. Publishers should plan antipiracy strategies accordingly, based on data that can be confirmed independently.

Update: I have a followup post.

Thursday, October 14, 2010

Bounty Markets for Open-Access eBooks

Mozart: The Piano Concertos
In January 1783, Wolfgang Amadeus Mozart placed advertisements asking patrons to "subscribe" to the three piano concertos he was writing. If he received enough support, the concertos would be finished by April, and subscribers would receive beautifully copied manuscripts. More importantly, they would have the pleasure of supporting the creation of a great work, which would be performed around the world. The resulting concertos, K413-415, are today considered important works, but it took quite a long time for Mozart to gather enough subscribers.

This old model for publishing was modernized with the addition of cryptographic assurance layers by cryptographers John Kelsey and Bruce Schneier, who started their examination of intellectual property business models with a deep pessimism about the long-term technical viability of digital rights management systems. Kelsey and Schneier dubbed their system "the street performer protocol" in tribute to a friend who had travelled Europe earning money with bagpipe performances. Presumably, the friend found that he could earn more money by passing the hat before or during a performance rather than after.

The Street Performer Protocol is a fundraising method designed to support the free release of a creative work. The creator agrees to release the work only after a threshold amount of money is pledged by supporters. For this reason, the method has been termed a "threshold pledge system". Kelsey and Schneier describe how a third party, whom they term the "publisher", can provide supporters with a layer of assurance that the creator will live up to his end of the bargain if the threshold is reached; otherwise, pledges are refunded to the supporters.
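The bookkeeping at the heart of a threshold pledge system is simple enough to sketch in a few lines of Python. This is only an illustration of the release-or-refund rule described above; the class name, method names and sample supporters are hypothetical, and the cryptographic assurance layer that Kelsey and Schneier describe is not modeled at all.

```python
from dataclasses import dataclass, field


@dataclass
class ThresholdPledge:
    """Hypothetical escrow held by the third-party "publisher"."""

    threshold: float
    pledges: dict[str, float] = field(default_factory=dict)
    released: bool = False

    def pledge(self, supporter: str, amount: float) -> None:
        """Record a pledge; repeated pledges from one supporter add up."""
        if amount <= 0:
            raise ValueError("pledge must be positive")
        self.pledges[supporter] = self.pledges.get(supporter, 0.0) + amount

    @property
    def total(self) -> float:
        return sum(self.pledges.values())

    def settle(self) -> dict[str, float]:
        """Close the campaign and return the refunds owed to supporters.

        If the threshold was met, the work is released and nothing is
        refunded; otherwise every pledge is returned in full.
        """
        if self.total >= self.threshold:
            self.released = True
            return {}
        return dict(self.pledges)


# A campaign that falls short: both pledges are refunded.
short = ThresholdPledge(threshold=100.0)
short.pledge("alice", 60.0)
short.pledge("bob", 30.0)
short_refunds = short.settle()

# A campaign that reaches its threshold: the work is released.
funded = ThresholdPledge(threshold=100.0)
funded.pledge("alice", 60.0)
funded.pledge("carol", 40.0)
funded_refunds = funded.settle()
```

The point of keeping the money with a third party is visible in `settle`: supporters never have to trust the creator with funds before the threshold is met.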

Another term that has been used for systems of this sort is "ransom publishing", which is particularly apt when an author gives away the first few chapters of a novel, but holds the rest of the chapters hostage until a suitable ransom is paid by readers who want to read the cliffhanger ending.

Somehow I doubt that a cryptographic assurance layer would have helped much with Mozart's concertos. Nor do I think that Mozart would have had much success giving out the first movement before asking for contributions.

What might have helped Mozart a lot, however, would be a market.

Markets function by bringing together many buyers, many products, and many sellers. If Mozart had been able to offer subscriptions to every music lover in the world, and those music lovers had easy access to all the composers in the world, Mozart would have been a very wealthy man.

Web sites that attempt to create markets for creative work around threshold pledge systems are definitely a trend. Kickstarter is perhaps the best example: it provides opportunities for creative people to solicit support for worthy projects. In many cases, the creators provide benefits for people who have supported their work. Another example is FashionStake, a website which lets ordinary people support fashion designers by pre-ordering designs before they hit stores. Supporters of successful projects thus get special access and special prices for the latest designs from the world’s top designers.

Kickstarter and FashionStake share several characteristics. The number of projects available for support is not huge; there are selection filters that have a side effect of preventing significant competition between projects. Also, there is an assumption that projects will not be executed if the threshold funding is not reached. The incentive to keep the threshold price low is that projects priced too high will not achieve their support threshold, and thus won't receive any funding at all.

I've been thinking about how to apply threshold pledge systems to the sponsorship of open access for ebooks. I believe that with some modest but essential innovations, a sort of threshold pledge market could become a powerful economic force in many segments of the ebook business.

The first innovation would be to create a market that covered ALL books. According to the Google Books team, there are over 100 million books that can be identified in the world; a relatively small fraction of them are out of copyright, and an even smaller fraction of them are available as open-access ebooks. Why not let people sponsor any and every book they cared about?

Note that these books have already been written and published, so the sponsors are not being patrons of artistic creation, as with Kickstarter, FashionStake, and Mozart's concerto subscribers; instead, they are posting a reward for conversion to open access. It's wrong to think of this as "ransom publishing": a parent doesn't kidnap their own child! A better way to think of it is posting a "bounty" for the delivery of the ebook into a Creative Commons compatible license.

A second innovation follows from the first. If you allow the posting of a bounty on any book, then a lot of books will get only minimal sponsorship. Many of these books may be "orphans", without known rightsholders for the public to purchase ebook rights from. The only way to keep sponsorship dollars from sitting unused is to allow them to be posted to many books at once. The first book to claim a bounty would take home the money.

Multiple commitment of sponsorship dollars has two interesting effects. First, it magnifies the impact of sponsorship dollars. A commitment ratio of 100 to one allows 10 million dollars of support to look like a billion dollars offered to rightsholders. Smart rightsholders who participate at the right time could walk away with sizeable rewards. Second, multiple commitment puts rightsholders in competition with each other for sponsorship dollars. If two books share many sponsors, the bounty for one book would go down significantly when the owners of the other book decide to accept their posted bounty. Rightsholders are thus discouraged from waiting too long for sponsorship dollars to build.
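To make the two effects concrete, here is a toy model in Python. It is a sketch under assumptions of my own, not a design for the real market: every sponsor commits an entire budget to each book on a list, and the first book claimed drains the budgets of all of its sponsors. The class, its methods, and the sample sponsors and titles are all hypothetical.

```python
class BountyMarket:
    """Toy model of multiple commitment of sponsorship dollars."""

    def __init__(self) -> None:
        self.budgets: dict[str, float] = {}       # sponsor -> remaining dollars
        self.interests: dict[str, set[str]] = {}  # sponsor -> books supported

    def sponsor(self, name: str, budget: float, books: list[str]) -> None:
        """Register a sponsor whose whole budget backs every listed book."""
        self.budgets[name] = budget
        self.interests[name] = set(books)

    def bounty(self, book: str) -> float:
        """Posted bounty: the remaining budgets of all the book's sponsors."""
        return sum(
            remaining
            for name, remaining in self.budgets.items()
            if book in self.interests[name]
        )

    def claim(self, book: str) -> float:
        """A rightsholder accepts the bounty, draining its sponsors' budgets."""
        payout = 0.0
        for name in self.budgets:
            if book in self.interests[name]:
                payout += self.budgets[name]
                self.budgets[name] = 0.0
        return payout


market = BountyMarket()
market.sponsor("ann", 10.0, ["Book A", "Book B"])
market.sponsor("ben", 5.0, ["Book A"])
market.sponsor("cat", 20.0, ["Book B"])

real_dollars = sum(market.budgets.values())
posted_dollars = market.bounty("Book A") + market.bounty("Book B")

bounty_b_before = market.bounty("Book B")
payout_a = market.claim("Book A")
bounty_b_after = market.bounty("Book B")
```

In this example the posted bounties add up to more than the real dollars behind them, and once Book A is claimed the bounty on Book B shrinks by the budget of the sponsor the two books shared, which is the pressure on rightsholders not to wait too long.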

None of this will work if sponsors don't get something for their money. It seems to me that the released ebooks should include some sort of recognition text, but maybe just loading the sponsor's devices automagically with released ebooks would be enough. Given some format validation, the released ebooks would slide easily into Google Books, OpenLibrary, Feedbooks, and other places (LibraryThing, GoodReads, WeRead, GetGlue) as well as devices and apps (Kindle, Kobo, Nook, iBooks, Ibis Reader). Perhaps most importantly, the released ebooks could be curated and preserved by libraries around the world, something that can't happen properly with today's copyright system. That alone would be enough to get me to participate.

Mozart would approve, I think.

Next week: who would create a market to help people post bounties for the release of ebooks?

Friday, October 8, 2010

Consumer Demand for Pirated eBooks Stopped Growing in 2010

Online piracy of ebooks has been a persistent worry for book publishers who look at the successes and failures of other media that have moved to digital forms. A surprising number and variety of ebooks are easily available on file sharing websites and peer-to-peer networks that use BitTorrent and similar protocols. The possibility that this availability will cut into sales of licensed ebooks and even print books is a scary one for an industry that has had many decades of relative stability. At Digital Book World in January, Brian Napack, President of Macmillan, "delivered a passionate call to arms for publishers to fight piracy in the ebook space or risk permanent damage to the underpinnings of publishing as a commercial enterprise".

Adding to the ebook piracy hysteria have been studies of the prevalence of ebook piracy produced by Attributor, a company that sells anti-piracy services. I've previously written critically about Attributor's report that purported to find evidence that "Online Book Piracy Costs U.S. Publishers Nearly $3 Billion".

In their most recent report, Attributor has taken a rather clever approach to the measurement of ebook piracy. Instead of trying to track downloads, Attributor has begun to use Google Trends to gain an understanding of consumer demand for ebooks. Although there are many potential difficulties in using Google for this purpose, Google Trends is a powerful and useful tool for gaining insight into the things that web users around the world are looking for.

Attributor presents their data along with an alarming narrative of growing and pervasive ebook piracy, and points to the iPad as a contributing factor to an increase in demand for pirated ebooks. After playing around with Google Trends for a while, I've come to the conclusion that Attributor has narrowly selected data to fit their narrative; taken as a whole, Google Trends data broadly supports a rather different narrative: that the growth of consumer interest in pirated ebooks slowed significantly in 2009 and stopped in early 2010.

To understand how Google Trends informs the debate about the prevalence of ebook piracy, it helps to understand what activity is being measured. Google Trends measures how frequently search terms are used. A consumer looking for a free copy of a particular work will typically search on the book title, adding terms like "free" or "download" or "pdf" to locate downloadable files. A more sophisticated strategy, one that is quickly learned, is to add the name of a preferred download site. If the user prefers peer-to-peer networks, the word "torrent" can be added to locate "seed" files for the item. The file sharing sites most commonly used for this purpose are currently RapidShare, Megaupload, 4shared, and Hotfile. To use Google Trends to measure the demand for a pirated ebook, you give it keywords that reproduce these searches. For example, demand for Stephenie Meyer's book Breaking Dawn can be assessed with a query such as this one.

To assess the overall state of ebook piracy, I used data from this query. Note that since the search is for ebooks generically, there's no telling for sure that the ebooks being searched for are really pirated; for the purposes of this study, I assumed that none of the ebooks being searched for are legally available on these sites. Calling them "pirated books" may be inaccurate, but I'll use that term anyway.

Some features of the data are immediately apparent. First of all, searches for pirated ebooks have increased a great deal over the past 5 years. It's worth noting however, that the most intense interest measured by Google occurs in India, the Philippines, Indonesia, Vietnam, Malaysia, Singapore, and eastern Europe. Less than half the search volume comes from the US. It's also easy to see seasonal peaks that obscure the shorter term trends. The peak periods for pirate ebook seeking are the December holidays and the beginning of September, presumably because of the start of school.
To eliminate seasonal variations, I computed the year-over-year growth of pirate ebook search activity. The resulting plot is quite smooth. After a few years of 100% per year growth, 2008 showed a clear slowing of growth. This slowing of growth continued up to the beginning of 2010, and then flat-lined. Since February of 2010, the growth of interest in pirated ebooks has stopped completely.
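For anyone who wants to repeat the calculation, here is a minimal sketch of it in Python. It assumes the search-interest series has already been reduced to one value per month; the function name and the made-up seasonal data are mine, and the numbers are purely illustrative rather than actual Google Trends output.

```python
def year_over_year_growth(monthly: list[float]) -> list[float]:
    """Growth of each month relative to the same month one year earlier.

    A value of 1.0 means 100% growth; 0.0 means no growth. Comparing like
    months cancels any seasonal pattern that repeats every year.
    """
    return [monthly[i] / monthly[i - 12] - 1.0 for i in range(12, len(monthly))]


# Hypothetical data: a repeating seasonal shape with peaks in September and
# December, scaled by a yearly level that doubles once and then stays flat.
seasonal_shape = [1.0, 0.9, 0.9, 0.8, 0.8, 0.7, 0.7, 0.8, 1.3, 1.0, 1.0, 1.4]
yearly_levels = [100.0, 200.0, 200.0]
series = [level * shape for level in yearly_levels for shape in seasonal_shape]

growth = year_over_year_growth(series)
second_year = growth[:12]  # close to 1.0 in every month (100% growth)
third_year = growth[12:]   # close to 0.0 in every month (growth has stopped)
```

Even though the raw series swings with the school calendar, the growth figures are flat within each year, which is why the resulting plot is so much smoother than the raw data.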

It should be noted that this stabilization has occurred during a period of strong sales of ebook reader devices, including Kindle, Nook, and the iPad. Indeed, the unveiling of the iPad was coincident with the stabilization of demand for pirate ebooks.

It's hard to know for sure what's happening, but one interpretation of these patterns is that a broad increase in consumer-friendly availability of properly licensed ebooks over the last 2 years has squelched the growth of demand for ebooks from illicit sources. In that light, the remaining demand can be interpreted as a sign of poor availability for appropriately priced ebooks on college campuses and in developing countries.

While this data has to be seen as an encouraging sign for the book publishing industry, it's too soon to know if it will last. It's entirely possible that too-high prices, cumbersome DRM, or new technologies could reinvigorate the demand for illicitly shared ebook files. For the moment at least, the book publishing industry can exhale.

Tuesday, October 5, 2010

Aggregating Deep Discount Readers of eBooks

The book publishing industry should be terrified of readers like me. Over the last year, I have purchased a grand total of one new book. Why only one? I have a huge stack of books, both print and digital, in my aspirational reading queue. I read plenty of books, but there are many more books that I would like to read; so many in fact that I see no reason to spend $30 on a book when there are plenty of minimal-outlay books already waiting to fill my hours.

The books on this stack come from a variety of sources. I read ebooks from my wife's Kindle account. The print books are almost all purchased at used book sales, typically for $1 or $2 each. The publisher's revenue from all of this book enjoyment is $0, and of course the author gets only a small fraction of that in royalties.

No matter how terrifying the idea is, discount readers like me represent a big opportunity for book publishers as they move their properties onto digital platforms. Discount readers come in many forms; it's safe to say that the billions of books lent by libraries went to people unwilling to pay full retail price for books. Libraries contribute modestly to the income streams of publishers and authors; used book sellers not at all. In their print businesses, publishers have learned to segment their markets by offering paperback versions as well as remaindered books, but they have largely neglected the deep discount end of the demand curve.

There's a huge amount of value to society in deep discount demand, and it's not just in the benefit to readers like me. Libraries include the preservation of our written culture in their mission; this activity wouldn't happen if the first-sale and fair use doctrines didn't limit the control that publishers and authors could exert over the use of their works. We need to think about how to do preservation as we translate the book business into a digital industry.

I've been thinking a lot about business models for ebook publishing, and a lot of my thinking has surrounded market segmentation methods. I've been looking for ways that discount readers like me can be aggregated into sustainable revenue streams to sustain institutions such as libraries.

One obvious model to serve the discount reader is to offer subscription packages. I've come to the conclusion that ebook subscription packages have many structural problems. Subscription packages inevitably cannibalize sales of the items they contain, and there's a lot of incentive for the package to exclude items that readers would really want.

In the course of studying academic publishing models, I think I've found a way for the book business to serve deep discount readers, to reinvigorate libraries, and to create a new, sustainable revenue stream for publishing: public acquisitions of ebook rights.

Here's how it might work for me. There are lots of books I aspire to read, many more than I have time for. There are also lots of books that I'd like to have on my reading devices, because I've read them once in print. I want to use these books in many ways, on many devices, at any time in the future. I want to be able to search them, and have others read them. I don't want to have to mess with DRM. And I want them to be preserved and available forever in public libraries.  I also like the one new book I've purchased (Clay Shirky's Cognitive Surplus: Creativity and Generosity in a Connected Age) enough that I would also pay something towards letting you read it too! Imagine that I could offer $1 for each title on my list to have this magic occur. (Cognitive Surplus is a thought-provoking meditation on the things that happen when the barriers to collective action are lowered, among other things. You should read it!)

OK, here's a stretch: imagine that millions of other people feel the same way!

If millions of people feel this way, there's absolutely no reason this magic can't happen. I have advocated that libraries should work together to collectively acquire ebook assets. The same mechanisms that would allow libraries to act collectively could be used by individuals to act collectively on behalf of books that they care about.

If a hundred thousand people offered a dollar to Clay Shirky (and Penguin, his publisher) for Cognitive Surplus to be released as a Creative Commons-licensed ebook, certainly at some point they would examine their prospects for future sales and figure out how to say "yes". Once a book is liberated in this way, all the magic just happens.

I'm not expecting J.K. Rowling to cash in her Harry Potter rights anytime soon, but I think there are many types of works and many types of authors who would find it financially advantageous to monetize their work in this way if it became popular. I also think that it's very common for readers to be passionate about the books they read in ways that transcend their narrow financial self-interest.

If you agree with me that mechanisms for public ebook acquisition by readers should be developed, I would very much like to hear from you, either privately or in the comments!

Saturday, October 2, 2010

A Web Developer's Approach to Bedbugs

I hate bugs. It used to be that if I became aware of a problem or vulnerability in the software I was responsible for, or if I noticed something in the application I didn't understand, I wouldn't be able to sleep until the culprit was isolated and squashed. The thought of users encountering a known bug gave me a terrible feeling, and the thought that a hacker on the other side of the globe might be exploiting my system was my worst nightmare.

In its second year of operation, my previous business had two of its servers hacked into. In retrospect, it was a good thing to happen to us. Our customers never noticed the problem, but we got security religion real fast. We became meticulous in patch application, kept our servers clean as a whistle, and monitored the logs regularly. Our servers stayed up for the next 7 years, through the terrorist attacks of 9/11 and the blackout of '03.

When you do web development and service deployment, you quickly learn that any device open to the internet is being constantly probed for vulnerabilities. You become familiar with all the ways your server is listening for connections and the sort of threats that are out there. You EXPECT that sooner or later, any vulnerability will find attackers, even if an attack would be completely pointless. It's as if bank robbers went door-to-door breaking and entering, just in case your house happened to be a bank.

Consider spam. Spammers don't care that you're an English speaker with no interest in penile implants. They'll still send you emails about them in Greek. They'll even do it with every submit button you put on the web, and they'll leave inane comments on your baby blog. If there's habitat that can support spam, spam will somehow evolve to fill that habitat.

Bedbugs are exactly the same.

When I opened my eyes one morning this summer to the sight of a flat brown bug crawling on my pillow, I didn't freak out; I applied lessons I had learned doing web development and deployment. But I had a lot to learn about bedbugs.

If you've been at all sentient this summer you will have read one or more news articles about the bug that is devouring New York City. I've read all these articles and they're mostly useless. The web was more informative, but less so than you might think. It takes at most 2 Google searches to find useful information about almost any web security threat based on its attack vector, but search for bedbug information, and most of what you get are people trying to sell you ineffective solutions to your bedbug problem. Some mainstream media outlets illustrated their bedbug articles with a picture of a shield bug. Thank goodness for Wikipedia! Their photos allowed me to identify my bug as a genuine adult bedbug.

Who's afraid of a little brown bug? There's never just one bug. I learned that bedbugs are extremely good at hiding, but you can detect their presence by things they leave around: bite marks in twos or threes, a few mm apart, and little balls of excrement that leave smeary black spots near their hiding places. The bites I had been wondering about, but the bedbug feces I hadn't noticed. Examination of my bed turned up some black spots on the fabric holding the slats together, and I was hot on the trail. I disassembled the bed and found a bunch of bedbugs in the recessed holes of two of the screw heads holding the bed together. To date, I've found about 30 bugs; I keep them in a double zip-lock plastic bag, the better to study them. Am I a nerd or what?

At a certain point in dealing with bedbugs, paranoia sets in. They are so clever in their ability to hide, you begin to assume they are everywhere. And then you start feeling bugs crawling over you at night. One very useful fact helped me deal with this. It turns out that bedbugs have evolved the ability to crawl on your skin so that you can't feel it. If you feel something crawling on your skin, you can be pretty sure it's not a bedbug. Unless it's a bedbug who's just tanked up and is heavy with a load of your blood, and then it's too late. Take a look at the long feet the big bedbug in my picture has; they're designed for stealthy crawling.

By the way, the pictures are taken with a cheap USB microscope that I got on Amazon. Best nerd toy ever! Works with Photobooth on my MacBook Pro. Hey, Christmas isn't that far off!

The USB microscope allowed me to confirm that the tiny things crawling around my kids' beds weren't bedbugs. In my paranoia, I started using an LED flashlight to highlight dust on the floor, and noticed that some of the dust was crawling around. I'm pretty sure these crawling bits of dust are "booklice". Booklice aren't really lice; they're tiny insects also called psocids. Booklice feed on mold that grows on damp plaster (yep, we got that) and glue such as that found on book bindings. (Another advantage of ebooks!)

Bedbugs are quite hardy. They last at least a month in my plastic zip-lock bag, and they are resistant to pesticides. DDT-resistant strains were observed even in the 1950s.

I'm currently living in an apartment while our house is being renovated. When we move back in, we'll be very careful to take only items that we're sure can be trusted. It will be just like rebuilding a hacked computer system. You have to start with something you know for sure is clean, something you can trust, because otherwise you can't rule out a rootkit. I'm now careful to clean out my vacuum cleaner after every use, and I stow it in a giant zip-lock bag.

For now, though, my study of bedbugs has led me to the conclusion that getting rid of the bugs is just a first step. If you have a server that gets hacked, it does you no good to rebuild the system unless you identify the weakness your attacker exploited. If you don't, the attack will happen again, and you'll be back where you started. Same with bedbugs. So I've started thinking of my bedroom and its maintenance as a system for sleeping that, for the foreseeable future, will be under continual attack from parasitic blood-sucking invaders. Knowledge of the invader's life cycle should help me to design maintainable defenses against the attack.

Most of the news media have focused attention on how to avoid "getting" bedbugs, as if they're a disease. But given how difficult it is for a layperson to detect a bedbug infestation, my guess is that for most people, low-level exposure to bedbugs will be almost unavoidable. Even if you stop sitting in the comfy couches at Starbucks, hanging out in the library, and watching movies in dark theaters, you won't avoid some amount of bedbug exposure.

Think of bedbugs the same way you think of computer viruses. A surefire way to avoid them is to stop using the internet and email, but that's not what most of us do. We deal with computer viruses and similar attacks by avoiding risky behavior and vulnerable software, and by keeping our systems clean and uncluttered. Or don't, as the case may be.

What are the measures that will keep the bedbugs from biting? Some knowledge of the bedbug's life cycle can provide insight into our sleeping system's vulnerabilities.

Let's start with the adult bedbug. It feeds at night, and after a full meal, it's sluggish and easy to catch (and to smush into a bloodstain on your sheet or wall). It can go weeks without a meal in its hiding place, and lays eggs where they can hatch and find a meal. The newborn bedbug nymph locates a host the same way a mosquito does, by sensing heat and CO2. Nymphs are hard to see, almost transparent, until they have a drink. Then they find a dark place to hide when the sun comes up. The bedbug grows through six stages of development, shedding a skin at each stage.

The ideal habitat for a bedbug is a dark hiding place with an easy commute to dinner. In retrospect, the flaw in my sleeping system was that it provided plenty of idyllic bedbug habitat. I have a platform bed that has a closed space under the bed slats. You can't clean under the bed without removing the mattress and unscrewing the slats. The slats and the frame have all sorts of crevices that must have been a cozy retirement home for great grandpa bedbugs. The highest concentration of resting bedbugs I found was in a screw hole, a short 18-inch crawl from my shoulder. Other locations reputed to be favorites of bedbugs include electrical outlets, cracks in the floor, and "popcorn" ceilings.

I've eliminated as much of this habitat as possible. The mattress is encased in a mite-proof cover, even though it showed no signs of infestation. I've filled cracks and I've eliminated clutter from near my bed. I've put a lot of stuff in sealable plastic bags.

A new weekly cleaning regimen has been instituted for my sleeping system. Every Saturday morning, I run the sheets through the wash and heat the quilt in the dryer. I take apart the bed frame and vacuum, dust, and mop around and under the bed. I snoop around with the flashlight for any signs of bedbugs or their feces. So far this seems to have worked well. I've seen only 2 bedbugs in the last two months.

When we move back into our house, the platform bed frame will get trashed. A new bed will have legs and feet, to better isolate my sleeping area from the floor and to make it easier to clean regularly. When more bedbugs try to hack into my sleeping system, they will not find any cozy hideouts.

And Mom was right. We all need to make our beds every day.

Some resources I found to be useful and/or interesting:

Locating and eliminating bedbugs takes expertise; if you can find a reliable and knowledgeable pest control professional, money you spend on them will not be wasted. But be aware that the explosion of concern about this nuisance has created the ideal "habitat" for scam artists and hucksters selling crap.