Thursday, April 28, 2011

Open Access eBooks, Part 1

I've been working on a book chapter for a book edited by No Shelf Required's Sue Polanka. My chapter covers "Open Access E-Books". Over the next week or two, I'll be posting drafts for the chapter on the blog. Many readers know things that I don't about this area, and I would be grateful for their feedback and corrections. Today, I'll post the introduction; subsequent posts will include sections on Types of Open Access E-Books, Business Models for Open Access E-Books, and Open Access E-Books in Libraries. Note that while the blog always uses "ebook" as one word, the book will use the hyphenated form, "e-book".

Open Access E-Books

As e-books emerge into the public consciousness, “Open Access”, a concept already familiar to scholarly publishers and academic libraries, will play an increasing role for all sorts of publishers and libraries. This chapter discusses what Open Access means in the context of e-books, how Open Access e-books can be supported, and the roles that Open Access e-books will play in libraries and in our society.

The Open Access “Movement”

Authors write and publish because they want to be read. Many authors also want to earn a living from their writing, but for some, income from publishing is not an important consideration. Some authors, particularly academics, publish because of the status, prestige, and professional advancement that accrue to authors of influential or groundbreaking works of scholarship. Academic publishers have historically taken advantage of these motivations to create journals and monographs consisting largely of works for which they pay minimal royalties, or more commonly, no royalties at all. In return, authors’ works receive professional review, editing, and formatting. Works that are accepted get placement in widely circulated journals and monograph catalogs.

In the late 1970s and 1980s, academic libraries became acutely aware that an expansion of research activity had resulted in growth in both the number of journals and the number of articles published in them. The combination of increased subscription prices and the number of journals needed to support research resulted in a so-called “serials crisis”. Libraries were forced to cancel subscriptions. The reduction in circulation forced publishers to raise subscription prices further to make ends meet, and the resulting cycle of cancellations and price increases led to a fear that the whole system would collapse. If few libraries could afford subscriptions, fewer scholars would be able to read the articles, diminishing the attractiveness of publishing.

The advent of web-based publications in the 1990s led many to believe that the solution to the serials crisis would be a shift of the scholarly publishing industry to so-called “Open Access” business models. Open Access publications are those that can be read at no cost to the reader or the reader’s institution. The traditional model of publishing supported by subscription fees was thus styled as “Toll-Access” publishing. It was hoped that the combined cost reductions from digital distribution and automation would stop the cycle of rising expenditures.

Perhaps the most successful implementation of Open Access has been ArXiv, a database of digital preprints and reprints (“e-prints”) originally focusing on the particle physics community. Started by Paul Ginsparg, a physicist at Los Alamos National Labs, ArXiv is now located at Cornell University and hosts more than 670,000 scientific articles in e-print form. Authors deposit articles they’ve written into the repository, and other scholars are free to search, browse, and download articles without needing any sort of subscription.

One reason for the success of Open Access archives has been that they have grown up in a parallel coexistence with the traditional academic journals, which have mostly shifted onto the web. In the so-called “Green” model for Open Access, many journals allow versions of accepted articles to be made available via repositories. Authors can thus submit their articles to high-prestige subscription-supported journals without worrying about colleagues’ access, because scholars that need to read their works can always access versions from free sources.

Meanwhile, the shift of traditional journals onto the web has allowed the rise of secondary distribution channels. Most academic libraries today enjoy access to a much broader range of journals compared to 20 years ago because of the availability of article databases that aggregate content from large numbers of journals.

The past decade has also seen the rise of “Gold” Open Access journals. These journals leverage low-cost Internet distribution to allow articles to be read universally with no subscription charges. Led by Biomed Central and PLoS, these journals cover expenses by charging publication fees to the submitting author. They build prestige and avoid becoming “vanity” presses by establishing rigorous review processes.

The success of Open Access journals and articles has for the most part not yet been duplicated in the world of books. There are a number of possible reasons for this. The first is the matter of cost. Publication fees for Open Access journal articles are in the range of $600-$3000; editing and production expenses for a book published by a university press are estimated to range from $10,000 for a book that’s mostly text to much more for a book with figures, photos, equations and cover art. Author-funded publication fees this large are unlikely to be practical, even with significant institutional subsidies.

Another factor holding back Open Access books may be a preference for print books over e-books. Books are much longer than journal articles, and many readers are uncomfortable reading a book on a computer screen. It’s only in the past two years that dedicated reader devices such as the Kindle and tablet computers such as the iPad have improved the e-book experience enough to gain wide consumer acceptance.

The business environment for book publishers is another possible factor. The university publisher loses money on much of its catalog, but compensates for this by having one or two titles that cross over to be successful outside the academic environment. has bolstered this pattern, by providing wide distribution for small print-run titles that would never have been available in bookstores before. In contrast, journal articles almost never cross over into non-professional markets.

Nonetheless, there have been a few notable attempts to publish Open Access e-books. I’ll cover these later in a section on business models for Open Access e-books, but it wouldn’t be right to omit mention of Project Gutenberg at this point. Project Gutenberg (PG) produced not only the first Open Access e-books, it produced the first e-books, period. Started by Michael Hart in 1971, PG aimed to take the text of public domain works and make them available via the Internet. To date, PG has put over 34,000 works into its collection, entirely through the efforts of volunteers.

Distribution of Open Access e-books can be thought of as an enterprise separate from their production, since the costs involved are of a different nature. The scaling laws of Internet distribution favor centralization, and as a result, organizations such as the Internet Archive are able to distribute appropriately licensed e-books on a vast scale; businesses such as Google are able to search and organize them; libraries, blogs, and portal sites are able to select and “curate” them. To some extent, this type of distribution depends on the self-contained nature of the book; it shouldn’t require the context of a specific website to retain and accumulate value.

Open Access for e-books provides many benefits in addition to allowing people to read for free. Access to the full text of books makes for more complete indexing. The utility of Google Books, and the effort Google has put into digitizing books from libraries, even when they are unable to make the books available because of copyright, is testament to the value of indexing the full text. Long-term preservation of our cultural heritage is another public benefit of Open Access to e-books.



  1. Comment from Jodi Schneider, who had trouble with the comment form:

    Thanks for sharing this draft, Eric, and soliciting comments!

    Here's my feedback:

    * I'm surprised that you're capitalizing "Open Access", and that
    "movement" is in quotes.

    * I'd add another subheading before "The success of Open Access
    journals and articles has for the most part not yet been duplicated in
    the word of books." and possibly also at "Nonetheless, there have been
    a few notable attempts to publish Open Access e-books."

    * The wording here is awkward: "range from $10,000 for a book that’s
    mostly text to much more for a book with figures, photos, equations
    and cover art." A citation would be really helpful for others trying
    to follow up on the economic angle.

    * To give context, add dates for ArXiv (when it was founded, how long
    it took to get wide adoption in physics). Also, in your formatting,
    add a return before the following paragraph.

    * " In contrast, journal articles almost never cross over into
    non-professional markets." -- I don't think this is true for
    health-related articles. There, open access is becoming ubiquitous.
    Maybe a mention of NIH's role in promoting open access would be

    * Some sense of the scope/market penetration of open access (e.g. what
    percentage of overall journals are in DOAJ? what percentage of "good"
    journals?) might be pertinent.

    I'm looking forward to seeing the next instalment of your article;
    thanks again for sharing this, and for soliciting feedback!

  2. The $10,000 and up estimate comes from feedback I got in response to this article. I'd love to get better citations.

    The crossover remark was pointed at the fact that revenue from crossover books can provide significant subsidy for the rest of a press's list. Even in the medical field, I don't know of any situations where a crossover article provides a similar subsidy. The closest thing I can think of is the case of reprints purchased by a company whose products are mentioned in an article and distributed as promotional material; the dynamics and ethics are very different.

    On capitalization, I defer to Wikipedia. Go edit there!

  3. I hope you are not buying the silly anti-DRM sentiment I see in some of my library colleagues.

    I did two surveys of eReader owners on MobileRead Forums. In the first survey (n=90), I asked eReader owners to click the two most common types of documents they load onto their devices. 37% clicked free public domain eBooks. 16% clicked eBook versions of new print bestsellers. In the second survey (n=75), I asked eReader owners to click the three types of documents they want to borrow from libraries. 1% clicked free public domain titles. 59% clicked eBook versions of new print bestsellers.

    Libraries' value to the community is in providing stuff that is not available elsewhere or is not available for free or cheap elsewhere. DRM protects the value of library eBook collections because people have to come to libraries to get them for free.

  4. Chris,

    Dismissing anti-DRM sentiment as "silly" is not a persuasive argument. For many of us, there is a strong sense that any e-book we "buy" that is DRM'd is not really ours in the sense that a physical paperback is -- that it could evaporate at any moment at the whim of the publisher or merchant, or that it may become unreadable when our current device expires.

    These are real and legitimate concerns, aren't they?

  5. Chris- wait till Part 4, I'll address the library angle.

    Chris and Mike- DRM imposes costs on both reader and publisher. The reader pays in convenience, the publisher pays for infrastructure. While there is a divergence of opinion on how big the costs are, especially on the reader side, it is clearly a benefit to both reader and publisher if these costs can be eliminated.

  6. Jean-Claude Guédon also had difficulty posting his comment (with an excellent point):

    Indeed, Gold publishing refers to OA journals, but OA journals are not all based on the so-called author-pay model. In fact, a majority among them are simply subsidized from A to Z. Most successful and visible is SciELO, initially launched from Brazil and now covering many countries in Latin America, Southern Europe and South Africa.

    This conflation of gold publishing with author-pay business models is a
    common error that must not be propagated further.

    Check DOAJ for a list of OA journals.

  7. This is a good overview and has some interesting speculations about why there has been less interest in Open Access ebooks.

    One thing that you seem to overlook is that Open Access still has not been as successful for academic ejournals as many people had hoped. It definitely has not solved the "serials crisis." Although it is true that most university students and scholars have access to more academic journal content than they used to have due to epublishing, many academic libraries still are spending a large and growing percentage of their collections budget on ejournals.

    What suffers, as a result, is their ability to buy scholarly monographs. It is kind of a vicious circle for scholarly books. 1) Academic libraries have less to spend on scholarly books b/c a growing percentage of their budget goes to serials 2) Academic publishers reduce their print runs b/c fewer libraries are buying their books, which reduces economies of scale and increases the cost of individual titles 3) Academic libraries buy even fewer titles because the costs go up.

    It would be interesting for you to speculate on how Open Access ebooks might be able to break this cycle

  8. I also want to say something about the comment that:

    "DRM imposes costs on both reader and publisher ... it is clearly a benefit to both reader and publisher if these costs can be eliminated."

    Of course, this is true, but we also pay a lot to protect our property rights in the physical world.

    For example, consider the costs that a public library imposes on both itself and its patrons to protect its physical collection. Users usually have to create accounts with information that will allow the library to track them down if they fail to return books. Thus, libraries have to manage databases of patron accounts. They have to install expensive RFID or tattle tape on their books and install systems to monitor them, etc., etc. ...

    All of this "imposes costs on both reader" and libraries. Wouldn't it be so much cheaper and easier for everyone to get rid of this whole technological infrastructure of monitoring library collections? Instead of using staff and technology to make sure that patrons follow the rules, libraries could just put up signs on their doors and little notes in the books explaining their policies. Maybe, something like this: "Each library user is allowed to keep up to 10 library books for up to three weeks. Please return this book to the shelf after your time is up."

    OK -- I am going on too long here, but the point is that most people would consider it Utopian and impractical to get rid of technological rights management for physical books. There always have been and always will be costs involved in protecting your property, and I don't understand why so many people seem to believe that digital networks will lead to some kind of Utopian world in which all of these costs will go away.

  9. In response to Veblen's comment,

    "For example, consider the costs that a public library imposes on both itself and its patrons to protect its physical collection ... Wouldn't it be so much cheaper and easier for everyone to get rid of this whole technological infrastructure of monitoring library collections?"

    Part of the concern with regular libraries is exactly that: they are _physical collections_. If I take out a book from a library, it's unavailable to other visitors until I return it.

    By contrast if I download an e-Book, I take a copy -- it's not taken away.

    The concerns for protection are therefore very different.

  10. veblen wrote: "There always have been and always will be costs involved in protecting your property, and I don't understand why so many people seem to believe that digital networks will lead to some kind of Utopian world in which all of these costs will go away."

    The FUNDAMENTAL difference between a physical book (journal, article, sound recording, whatever) and an electronic one is that you can make infinite perfect copies of the latter at zero cost. Users would like to do that; publishers who have invested years and millions into physical-media business models would like to prevent them. No solution is going to satisfy both.

    The existence of back-channels such as The Pirate Bay makes me think that in the end, users are going to win this war. It's so much easier to make and distribute 100 new copies than to stamp out 100 existing copies. I know this is not a popular message among publishers, but I suspect a lot of the argument against it doesn't amount to much more than LA LA LA CAN'T HEAR YOU.

    How can publishers remain in business when infinite perfect copies are free to make and distribute? When the question is couched in these terms, I think the answer is pretty obvious: by charging for actual services, such as copy-editing, review, formatting, design and publicity, rather than by holding increasingly useless "rights".

    And this of course is exactly what open-access publishers like PLoS do: they charge authors to cover their costs in providing those services -- services which authors are prepared to pay for as they do provide genuine value to the author.

    That works in academia, because academic authors don't expect to get paid. They live off grants, and writing papers (and paying to have them published) is a core part of what they do.

    But it doesn't translate into the rest of the writing world, where the author expects to get paid. What is the solution? I don't know. But ignoring the problem, or pretending we can make it go away, isn't going to help.

  11. Mike Taylor,

    You asked whether I agreed that concerns over DRM denying people ownership of eBooks they purchase are legitimate. I do agree.

    I need to clarify that I was talking specifically about librarians wanting no-DRM. In my opinion, since DRM protects the value of library collections, librarians should be asking for better DRM, not no DRM. One problem librarians have with DRM is the clunkiness of Overdrive's checkout process. By comparison, Amazon's one-click purchase is clunk-free.

    Chris Rippel

  12. OK, Chris, that makes much more sense. Thanks for clarifying.

  13. Kindle is not "clunk-free" by any means. Sure, the purchase process is easy, but the reading platform is limited.

    For the iPhone, Stanza is still my ereader of choice; the Kindle reader is starved. While Amazon bought Stanza, they have merged few of its features and affordances into the Kindle reader.

    DRM is not just about ease of use today; it's also about ease of keeping one's reading material in the future. That is another point where Kindle utterly fails in my view.