Thursday, December 27, 2018

Towards Impact-based OA Funding

Earlier this month, I was invited to a meeting sponsored by the Mellon Foundation about aggregating usage data for open-access (OA) ebooks, with a focus on scholarly monographs. The "problem" is that open licenses permit these ebooks to be liberated from hosting platforms and obtained in a variety of ways. A scholar might find the ebook via a search engine, on social media or on the publisher's web site; or perhaps in an index like the Directory of Open Access Books (DOAB), or in an aggregator service like JSTOR. The ebook file might be hosted by the publisher, by OAPEN, or on Internet Archive, Dropbox, or GitHub. Libraries might host files on institutional repositories, and scholars might distribute them by email or via ResearchGate or discipline-oriented sites such as Humanities Commons.

I haven't come to the "problem" yet. Open access publishers need ways to measure their impact. Since the whole point of removing toll-access barriers is to increase access to information, open access publishers look to their usage logs for validation of their efforts and mission. Unit sales and profits do not align very well with the goals of open-access publishing, but in the absence of sales revenue, download statistics and other measures of impact can be used to advocate for funding from institutions, from donors, and from libraries. Without evidence of impact, financial support for open access would be based more on faith than on data. (Not that there's anything inherently wrong with that.)

What is to be done? The "monograph usage" meeting was structured around a "provocation": that somehow a non-profit "Data Trust" would be formed to collect data from all the providers of open-access monographs, then channel it back to publishers and other stakeholders in privacy-preserving, value-affirming reports. There was broad support for this concept among the participants, but significant disagreements about the details of how a "Data Trust" might work, be governed, and be sustained.

Why would anyone trust a "Data Trust"? Who, exactly, would be paying to sustain a "Data Trust"? What is the product that the "Data Trust" will be providing to the folks paying to sustain it? Would a standardized usage data protocol stifle innovation in ebook distribution? We had so many questions, and there were so few answers.

I had trouble sleeping after the first day of the meeting. At 4 AM, my long-dormant physics brain, forged in countless all-nighters of problem sets in college, took over. It proposed a gedanken experiment:
What if there was open-access monograph usage data that everyone really trusted? How might it be used?
The answer is given away in the title of this post, but let's step back for a moment to provide some context.

For a long time, scholarly publishing was mostly funded by libraries that built great literature collections on behalf of their users - mostly scholars. This system incentivized the production of expensive must-have journals that expanded and multiplied so as to eat up all available funding from libraries. Monographs were economically squeezed in this process. Monographs, and the academic presses that published them, survived by becoming expensive, drastically reducing access for scholars.

With the advent of electronic publishing, it became feasible to flip the scholarly publishing model. Instead of charging libraries for access, access could be free for everyone, while authors paid a flat publication fee per article or monograph. In the journal world, the emergence of this system has erased access barriers. The publication fee system hasn't worked so well for monographs, however. The publication charge (much larger than an article charge) is often out of reach for many scholars, shutting them out of the open-access publishing process.

What if there was a funding channel for monographs that allocated support based on a measurement of impact, such as might be generated from data aggregated by a trusted "Data Trust"? (I'll call it the "OA Impact Trust", because I'd like to imagine that "impact" rather than a usage proxy such as "downloads" is what we care about.)

Here's how it might work:

  1. Libraries and institutions register with the OA Impact Trust, providing it with a way to identify usage and impact relevant to the library or institution.
  2. Aggregators and publishers deposit monograph metadata and usage/impact streams with the Trust.
  3. The Trust provides COUNTER reports (suitably adapted) for relevant OA monograph usage/impact to libraries and institutions. This allows them to compare OA and non-OA ebook usage side-by-side.
  4. Libraries and institutions allocate some funding to OA monographs.
  5. The Trust passes funding to monograph publishers and participating distributors.
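
Steps 4 and 5 can be pictured with a small sketch. Everything in it is an assumption made for illustration: the scheme above does not say how funds would be divided, so the sketch simply splits each library's allocation among publishers pro rata to the usage that library's community generated. The names (`allocate_funding`, "Library A", "Press X") and all the numbers are hypothetical.

```python
from collections import defaultdict


def allocate_funding(budgets, usage):
    """Split each library's OA budget across publishers, pro rata to usage.

    budgets: {library: amount allocated to OA monographs}   (step 4)
    usage:   {(library, publisher): usage count}             (steps 2 and 3)
    Returns  {publisher: total amount passed on by the Trust} (step 5)

    Pro rata division is an assumption of this sketch, not part of the proposal.
    """
    # Total usage recorded for each library's community.
    usage_per_library = defaultdict(float)
    for (library, _publisher), count in usage.items():
        usage_per_library[library] += count

    # Each publisher receives a share of each library's budget equal to
    # its share of that library's recorded usage.
    payouts = defaultdict(float)
    for (library, publisher), count in usage.items():
        total = usage_per_library[library]
        if library in budgets and total > 0:
            payouts[publisher] += budgets[library] * count / total
    return dict(payouts)


# Hypothetical figures, for illustration only.
budgets = {"Library A": 1000.0, "Library B": 500.0}
usage = {
    ("Library A", "Press X"): 30,
    ("Library A", "Press Y"): 10,
    ("Library B", "Press Y"): 5,
}
payouts = allocate_funding(budgets, usage)
print(payouts)
```

With these made-up numbers, Library A's budget is split 3:1 between the two presses, and all of Library B's budget goes to the only press its community used. A real Trust would have to decide whether raw counts are the right weight at all.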

The incentives built into such a system promote distribution and access. Publishers are encouraged to publish monographs that actually get used. Authors are encouraged to write in ways that promote reading and scholarship. Publishers are also encouraged to include their backlists in the system, and not just the dead ones, but the ones that scholars continue to use. Measured impact for OA publication rises, and libraries observe that more and more, their dollars are channeled to the material that their communities need.

Of course there are all sorts of problems with this gedanken OA funding scheme. If COUNTER statistics generate revenue, they will need to be secured against the inevitable gaming of the system and fraud. The system will have to make judgements about what sort of usage is valuable, and how to weigh the value of a work that goes viral against the value of a work used intensely by a very small community. Boundaries will need to be drawn. The machinery driving such a system will not be free, but it can be governed by the community of funders.

Do you think such a system can work? Do you think such a system would be fair, or at least fairer than other systems? Would it be Good, or would it be Evil?

  1. Details have been swept under a rug the size of Afghanistan. But this rug won't fly anywhere unless there's willingness to pay for a rug.
  2. The white paper draft which was the "provocation" for the meeting is posted here.
  3. I've been thinking about this for a while.


  1. Yeah, we have this problem with casebooks/legal textbooks. They are CC licensed and so are available from a variety of places. They are also shared via email, purchased from Amazon (for $0.99) and print versions at We are not super-dependent on download stats, but it sure does help. What we REALLY desire is "course adoption" stats. This shows up in our stats as large purchases by bookstores or appearance on syllabi. I wish we could put some code into the ebook that "phones home", but well, surveillance is scary already. ;-)

    1. Have you considered putting a survey link into the ebook? In our "Mapping the free ebook supply chain" project we had some nice results with that.

  2. is our book store BTW.

