Monday, December 21, 2009

Copyright Enforcement for eBooks: Cultural Life Preserver or Orwellian Nightmare?

I'm a 7 MPH speeder. When I'm on an empty highway with a 65 MPH speed limit, I drive 72. This puts my car in roughly the 70th percentile of car speed. But when some idiot comes zooming past at 85, I cheer when I see him stopped by the cops 5 minutes later.

Last time I was in England, I was appalled to find that cameras had been installed along some of the motorways that would send you a speeding ticket automatically if you averaged more than the speed limit. I told the limo driver that Americans would elect a black president long before we'd tolerate speed cameras on the freeway. I was right.

I'm no legal theorist, but I know better than to think that human behavior is determined by laws; laws only work as far as they reflect a social consensus. It's true for driving and it's also true for reading, listening to music, and watching videos. As behaviors change due to the introduction of technology, society is forced to modify social norms for behavior.

The book publishing industry is at the beginning of a technology-driven change in the way that people read books, and the shape of the consensus that emerges will determine how creative production is sustained. (Same for news, but that's another story entirely!)

Social consensus has a lot of inertia because if people and institutions don't have to change, they won't. Think about how social consensus evolved when music became digital. It used to be that people expected to be able to listen for free, via radio, and expected to pay to "keep" the music. Once paid for, people expected to be able to share their records with friends in a variety of ways.

When music became digital with the advent of the compact disc, very little changed, at least for a while. The addition of internet distribution, however, allowed Napster to stretch the "sharing" behavior so as to cover free listening and threaten the buy-to-own behavior. The music industry responded with legal action, but its failure to provide convenient, authorized activities to cover accustomed behaviors gave Napster an effective monopoly on digitally distributed music. If not for the social habit of paying to keep music, the music industry might well have collapsed. With the takedown of Napster and the rise of authorized services like iTunes, Pandora and Spotify, the music industry has begun to successfully reshape user behavior forged by easy unauthorized file sharing, but its mistakes have clearly hurt.

The movie industry has had more luck with the onset of digital distribution. People still expect to watch TV for free, and to pay for premium entertainment at the movies. The internet bandwidth needed to easily move video files has become available at about the same time as distribution sites such as Hulu and Netflix, so pirates have never had much of a monopoly on digital movie distribution. YouTube offers a flood of free video content, and it works with rightsholders to identify and remove unauthorized uses of their work. A large amount of unauthorized distribution has occurred, but the movie industry has responded with both the carrot and the stick: it has provided an enhanced in-theater experience, inexpensive secondary distribution channels, deals with YouTube and specialized DVD content, while pursuing takedowns and ostentatiously prosecuting copyright infringements. Certainly the movie industry has made some missteps, but a blockbuster movie can still gross a billion dollars.

People have always expected to pay to own books, but once bought, the books could be freely borrowed from friends or libraries, and a vibrant used-books market makes older works available at very low cost. The biggest change brought about by digital distribution is the flood of free material available on a huge variety of websites, from blogs to wikis to traditional news.

It's not clear how book (including ebook) sales will be impacted by unauthorized distribution of digital copies. Although I've noted that it's relatively easy to find and identify unauthorized copies of works like Harry Potter and the Deathly Hallows, it's not likely that people will change their book buying behavior unless they have to. That's why I find it surprising that J. K. Rowling and her publishers are giving the pirates a near monopoly on the digital version of that particular book.

I've heard publishers say that they've learned from the example of the music industry that the threat of piracy makes DRM (digital rights management) a necessity for distribution of ebook content. In fact, almost the opposite is true. Publishers have been distributing books for hundreds of years without DRM. A potential pirate doesn't need to crack any encryption; they need only buy a single copy of the book and scan it. I wrote about the advent of cheap book scanners in October; Wired has a recent article.

Pirating a book is somewhat more difficult than pirating a song, but comparable to pirating a movie. The first step is to acquire a digital copy. Popular books are easy to obtain and a professional pirate would likely remove the binding with a saw and feed the pages into a high-speed copier/scanner. (Until the DVD comes out, a pirate typically sits in a theater and films the movie; the DRM on DVDs is trivial to crack.)

The digital file would then either be seeded onto a peer-to-peer network or uploaded to a file distribution or streaming site similar to Rapidshare. Studies by Arbor, Cisco, and Sandvine suggest that P2P networks are declining in popularity compared to the file distribution sites, especially in countries with high broadband penetration.

In a peer-to-peer network such as those using the BitTorrent protocol, the work is divided between tracker sites and the peers which provide the actual files. The use of many peers allows high-volume distribution without needing a high-bandwidth internet connection. Since the RIAA and others began filing lawsuits against people thought to be involved in providing files, the remaining networks have adopted social networking and encryption to make sure that they can no longer be easily monitored.

File distribution sites are being used more and more as broadband connections become widespread. These sites have many legitimate uses, and will respond to takedown notices when illicit content is identified on their sites (although in some countries, the takedowns are processed with the underwhelming speed of a bank's electronic funds transfer). The links and metadata for the illicit files mostly appear on third-party sites, which complicates any enforcement action. Ironically, sites such as Rapidshare have become so popular that to use them easily you really have to purchase a premium subscription!

Still, digital book piracy has already begun to appear in significant amounts. According to Brad Beautlich, Sales Director at DtecNet, textbooks, including law and medical textbooks, are now frequently appearing on the content distribution sites and torrent indexes favored by copyright infringers. These tend to be expensive items sold in cost-sensitive markets, which increases the incentives for unauthorized use. The sites appear to have very few books that have been cracked from digital versions; most of the book content currently available is clearly derived from scanned print.

The lack of pirated e-reader files (such as Kindle or EPUB files) is consistent with the profile of e-reader early adopters, who tend to be older and not particularly price sensitive. I assume it's because older users tend to have bad eyes and full shelves. They're unlikely to install P2P client software or to be attracted by the sort of advertising found on file index sites. Readers in developing countries may be in different situations.

DtecNet is a company that has been providing detection services to media companies. They offer to seek out, document and help to take down unauthorized content from web sites and file sharing networks. Their task can be difficult, as they need to scan and monitor indexing sites that may cloak the identity of a file ("NITM2" instead of Night in the Museum 2) and figure out from user comments in multiple languages whether a file is genuine or not.
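To make the cloaking problem concrete, here is a minimal sketch of one naive way a monitoring tool might map an abbreviated file name back to a catalog of titles. It is purely illustrative: the function names, the sample file name and the two-title catalog are my own assumptions, and nothing here describes how DtecNet actually works.

```python
import re

def initialism(title: str) -> str:
    """Collapse a title to its initials, keeping any numbers whole."""
    parts = re.findall(r"[A-Za-z]+|\d+", title)
    return "".join(p if p.isdigit() else p[0].upper() for p in parts)

def candidate_titles(filename: str, catalog: list[str]) -> list[str]:
    """Return catalog titles whose initialism appears as a token in the file name."""
    tokens = {t.upper() for t in re.split(r"[^A-Za-z0-9]+", filename) if t}
    return [title for title in catalog if initialism(title) in tokens]

# Hypothetical catalog and file name, for illustration only.
catalog = ["Night in the Museum 2", "Harry Potter and the Deathly Hallows"]
print(candidate_titles("NITM2.dvdrip.avi", catalog))
```

A match like this only produces a candidate. Deciding whether the file is genuine still takes the harder work described above, such as reading the user comments attached to it.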

Beautlich suggests that although monitoring from his company would be expensive ($4000-5000/month for a Harry Potterish project), an early investment in copyright enforcement by the book industry might be more effective than a strategy of waiting for a larger threat to arrive.

Another strategy to modify user behavior is being pursued by Audible Magic. Audible Magic has a rather different business model from DtecNet. Instead of working for rightsholders, Audible Magic provides content identification services to ISPs, educational institutions, and content distribution services, helping them minimize their liability for copyright infringement. In the US, the Higher Education Opportunity Act (HEOA) of 2008 requires colleges and universities to have "A plan to 'effectively combat' copyright abuse on the campus network using 'a variety of technology-based deterrents'."

Audible Magic provides an appliance that attaches to a router or gateway within the client's network. The appliance "listens" to network traffic, and when it recognizes copyrighted content being transferred in ways that connote unauthorized use, it either logs a report or attempts an intervention. According to Jay Friedman, Audible Magic's Vice President for Marketing, over 100 university campuses are using their systems. Pricing depends on the amount of bandwidth used by the university and can be as little as a few thousand dollars a year.

Interventions are positioned in a "graduated response" model. For example, a user's next webpage download might be replaced by a page suggesting that unauthorized activity may have occurred, along with a reminder of an institution's usage policies. Continued infractions might result in the user being put in a "timeout", followed by a human-mediated intervention.
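The escalation logic is simple enough to sketch in a few lines. The following is only an illustration of the idea, assuming one step per detected infraction; the step names, the class and the user ID are hypothetical and are not taken from Audible Magic's product.

```python
from collections import defaultdict

# Hypothetical escalation ladder, in the order described in the text.
STEPS = ["warning_page", "timeout", "human_review"]

class GraduatedResponse:
    """Track detected infractions per user and pick the next intervention."""

    def __init__(self):
        self.infractions = defaultdict(int)

    def record(self, user: str) -> str:
        """Log one detected infraction and return the intervention to apply."""
        self.infractions[user] += 1
        # Escalate one step per infraction, staying at the last step once reached.
        level = min(self.infractions[user], len(STEPS)) - 1
        return STEPS[level]

policy = GraduatedResponse()
print([policy.record("student42") for _ in range(4)])
```

A real deployment would presumably also let infractions expire after some period and would keep a record for the human reviewer; both are left out here.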

If you find it big-brotherish to have an "appliance" looking over your shoulder to see whether you're infringing copyrights, you wouldn't be alone. The Electronic Frontier Foundation has warned that Audible Magic's service offering is "no magic bullet", and is concerned that this type of content monitoring would be a threat to individual privacy rights. It's one thing for universities and corporations to be proactive in avoiding copyright infringement liability, but imagine what it would be like if this sort of monitoring were a legal requirement! Public Knowledge has published an excellent overview of the issues surrounding this sort of network monitoring.

In fact, international treaties and legislation requiring ISPs to adopt "three strikes" graduated response policies culminating in loss of internet connection are being considered in Europe and other parts of the world. While many book publishers would be horrified to buy into these sorts of copyright enforcement regimes, at the same time they are aghast at the prospect of having their content pirated and their livelihoods destroyed.

Think about the speed limit monitor in the accompanying photo. Based on my observation it is very effective at modifying the behavior of drivers. 7-MPH speeders like myself become 1-MPH speeders. I don't think anyone minds being monitored by this sign; there is confidence that it's not doing anything other than measuring and displaying our speed. In contrast, hidden speed traps seem evil: they don't slow people down unless they own radar detectors, and the egregious speeders are not the ones who get caught! Copyright enforcement for ebooks should be as much like the speed limit sign as possible. As Princeton's Ed Felten has observed, the ideal copyright enforcement system exhibits maximal compliance and minimal prosecution. Especially for books, monitoring systems should be as open as possible and visible to users to maximize compliance and to create confidence they are not also snooping on reading habits.

It's interesting to read about the experiences of a university that implemented monitoring of P2P networks to comply with HEOA. Illinois State's Digital Citizen Project's summary of "Escalated Response System Testing Utilizing Audible Magic Copysense" (pdf, 1.5 MB) is valuable reading. While it's hard to be sure that Illinois State's program was effective (you can't measure events that have evaded detection), I found it interesting that Illinois State students expressed minimal complaints or concern about the program.

A company with content identification technologies similar to those of Audible Magic is Nexicon. Both companies have agreements in place to work with YouTube to help to identify copyrighted material in uploaded videos, but Nexicon's business model aligns them with enforcement-oriented rightsholders. Here's how Nexicon President Sam Glines describes their flagship services:
"Through our GetAmnesty and PayArtists solutions, we share with the rights holders settlements collected via the DMCA notices sent to infringers. The copyright holder sets the dollar amount per infringement - in the case of PayArtists and for Frank Zappa, the settlement amount is $10 per infringement. Nexicon’s MARC platform is capable of sending 95 million DMCA notices each day. Nexicon’s MARC platform monitors billions of illegal downloads of copyrighted material on a daily basis."
Nexicon has recently been involved in controversial takedown notices which Prof. Mike Freedman of Princeton's Center for Information Technology Policy describes as "inaccurate enforcement". In addition to defending Frank Zappa's copyright interests, Nexicon, a public company, boasts about fighting child pornography. At the same time, it appears to be associated with a New Jersey company that represents pornography publishers in their battle against copyright pirates. It can be hard for a technology company to control how its customers employ technology, but I would like to see clearer and more coherent explanations of what happened to Freedman than Nexicon has provided to date.

Identification of ebooks is rather a different endeavor than identification of video or audio files. Copyrighted content in audio and video files can be identified in a number of ways, including watermarking, hashing and fingerprinting. As its name implies, Audible Magic's roots are in the audio fingerprinting area, and its huge library of 7 million song fingerprints is a significant asset, but they increasingly need to use textual clues such as those required for ebook identification and are interested in further developing book-related identification techniques. As I've written previously, textual fingerprints are surprisingly effective at identifying books, even using a single sentence.
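To give a feel for why a single sentence can be enough, here is a minimal sketch of one common approach to textual fingerprinting: hash every run of k consecutive words (a "shingle") and count how many hashes a sample shares with a reference text. The choice of five-word shingles, SHA-1 and the function names are my own assumptions for illustration; this is not a description of Audible Magic's or anyone else's production system.

```python
import hashlib
import re

def shingles(text: str, k: int = 5) -> set[str]:
    """Hash every run of k consecutive words, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {
        hashlib.sha1(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()[:16]
        for i in range(len(words) - k + 1)
    }

def overlap(sample: str, reference: str, k: int = 5) -> float:
    """Fraction of the sample's shingles that also occur in the reference."""
    sample_hashes = shingles(sample, k)
    if not sample_hashes:
        return 0.0
    return len(sample_hashes & shingles(reference, k)) / len(sample_hashes)

reference = ("It was the best of times, it was the worst of times, "
             "it was the age of wisdom, it was the age of foolishness")
# A sentence with different case and punctuation, as it might come out of a scan.
print(overlap("IT WAS THE AGE OF WISDOM; it was the age of foolishness", reference))
print(overlap("Call me Ishmael. Some years ago, never mind how long precisely", reference))
```

Because the words are normalized before hashing, differences in case and punctuation don't matter. OCR errors inside words would still break exact matches, so a scanned copy should be expected to score well below a clean one.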

Book publishers preparing to fight piracy need to first and foremost have their content ready to be identified. While metadata, EPUB files and the like will be useful in locating and identifying pirated content that includes OCRed text, scanned images of books are also likely to be useful for the development of content recognition systems. If book publishers don't at least have scan files of every book they own, now is the time for them to start scanning!

Enforcement is only one weapon in the fight against book piracy, and it is the one weapon that most quickly loses effectiveness, as the techniques of copyright evaders evolve. One potential weapon that should be avoided is the dirty trick. If book publishers are unable to learn from the Sony rootkit fiasco, they will get all the ill will and lawsuits they deserve.

The shaping of societal behavior is a hopeless endeavor if the stick is wielded without a corresponding carrot. Any psychologist will tell you that the most powerful tool in modifying human behavior is positive reinforcement. If ebooks are to succeed commercially, publishers must use every means possible to reward people who purchase ebooks. I hope to write more about this soon, but I believe that positive reinforcement is the best lens to look at DRM with. DRM will fail unless its users believe it is rewarding them with convenience and ease of use, and with sufficient reward, it is also unnecessary - that is the lesson of iTunes.

As the era of digital books dawns, book publishers should expect that business models will change. Their mission, if they choose to accept it, is not only to deal with unauthorized use, but also to lead users to a social consensus that benefits everyone.

Update: In this post, I managed to overlook Attributor. Here's a post about them.

