Showing posts with label OCLC. Show all posts

Friday, May 18, 2018

The Shocking Truth About RA21: It's Made of People!

Useful Utilities
logo from 2004
When librarian (and programmer) Chris Zagar wrote a modest URL-rewriting program almost 20 years ago, he expected the little IP authentication utility would be useful to libraries for a few years and would be quickly obsoleted by more sophisticated and powerful access technologies like Shibboleth. He started selling his program to other libraries for a pittance, naming this business "Useful Utilities", fully expecting that it would not disrupt his chosen profession of librarianship.

He was wrong. IP address authentication and EZProxy, now owned and managed by OCLC, are still the access management mainstays for libraries in the age of the internet. IP authentication allows for seamless access to licensed resources on a campus, while EZProxy allows off-campus users to log in just once to get similar access. Meanwhile, Shibboleth, OpenAthens and similar solutions remain feature-rich systems with clunky UIs and little mainstream adoption outside big rich publishers, big rich universities and the UK, even as more distributed identity technologies such as OAuth and OpenID have become ubiquitous thanks to Google, Facebook, Twitter etc.

from My Book House, Vol. I: In the Nursery, p. 197.
So how long will the little engines that could keep chugging? Not long, if the folks at RA21 have their way. Here are some reasons why the EZProxy/IP authentication stack needs replacement:

  1. IP authentication imposes significant administrative burdens on both libraries and publishers. On the library side, EZProxy servers need a configuration file that knows about every publisher supplying the library. It contains details about the publisher's website that the publisher itself is often unaware of! On the publisher side, every customer's IP address range must be accounted for and updated whenever changes occur. Fortunately, this administrative burden scales with the size of the publisher and the library, so small publishers and small institutions can (and do) implement IP authentication with minimal cost. (For example, I wrote a Django module that does it.)
     
  2. IP Addresses are losing their grounding in physical locations. As IP address space fills up, access at institutions increasingly uses dynamic IP addresses in local, non-public networks. Cloud access points and VPN tunnels are now common. This has caused publishers to blame IP address authentication for unauthorized use of licensed resources, such as that by Sci-Hub. IP address authentication will most likely get leakier and leakier.
     
  3. Men (or rather, Monsters) in the middle are dangerous, and the web is becoming less tolerant of them. EZProxy acts as a "Man in the Middle" (or rather, a "Monitor in the Middle"), intercepting web traffic and inserting content (rewritten links) into the stream. This is what spies and hackers do, and unfortunately the threat environment has become increasingly hostile. In response, publishers that care about user privacy and security have implemented website encryption (HTTPS) so that users can be sure that the content they see is the content they were sent.

    In this environment, EZProxy represents an increasingly attractive target for hackers. A compromised EZProxy server could be a potent attack vector into the systems of every user of a library's resources. We've been lucky that (as far as is known) EZProxy is not widely used as a platform for system compromise, probably because other targets are softer.

    Looking into the future, it's important to note that new web browser APIs, such as service workers, are requiring secure channels. As publishers begin to make use of these APIs, it's likely that EZProxy's rewriting will irreparably break new features.
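To make the administrative burden in item 1 concrete, here is a minimal sketch of the publisher-side check behind IP authentication, using Python's standard `ipaddress` module. The customer table, the institution names and the function name are hypothetical, and the ranges are reserved documentation blocks rather than real library networks. It is not the Django module mentioned above.

```python
import ipaddress

# Hypothetical customer table: institution name -> licensed IP ranges.
# These are reserved documentation ranges, not real library networks.
CUSTOMER_RANGES = {
    "Example University": ["192.0.2.0/24", "2001:db8:1::/48"],
    "Example Public Library": ["198.51.100.16/28"],
}


def institution_for(client_ip):
    """Return the institution whose registered ranges contain client_ip, or None."""
    address = ipaddress.ip_address(client_ip)
    for name, ranges in CUSTOMER_RANGES.items():
        for cidr in ranges:
            network = ipaddress.ip_network(cidr)
            # Only compare like with like (IPv4 with IPv4, IPv6 with IPv6).
            if address.version == network.version and address in network:
                return name
    return None


if __name__ == "__main__":
    for ip in ["192.0.2.77", "198.51.100.20", "203.0.113.9"]:
        print(ip, "->", institution_for(ip))
```

Every entry in a table like this has to be kept current by hand whenever a customer's network changes, and the check says nothing about who is actually sitting behind a VPN or cloud address inside the range.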

So RA21 is an effort to replace IP authentication with something better. Unfortunately, the discussions around RA21 have been muddled because it's being approached as if RA21 is a product design, complete with use cases, technology pilots, and abstract specifications. But really, RA21 isn't a technology, or a product. It's a relationship that's being negotiated.

What does it mean that RA21 is a relationship? At its core, the authentication function is an expression of trust between publishers, libraries and users. Publishers need to trust libraries to "authenticate" the users for whom the content is licensed. Libraries need to trust users that the content won't be used in violation of their licenses. So for example, users are trusted to keep their passwords secret. Publishers also have obligations in the relationship, but the trust expressed by IP authentication flows entirely in one direction.

I believe that IP Authentication and EZProxy have hung around so long because they have accurately represented the bilateral, asymmetric relationships of trust between users, libraries, and publishers. Shibboleth and its kin imperfectly insert faceless "Federations" into this relationship while introducing considerable cost and inconvenience.

What's happening is that publishers are losing trust in libraries' ability to secure IP addresses. This is straining and changing the relationship between libraries and publishers. The erosion of trust is justified, if perhaps ill-informed. RA21 will succeed only if it creates and embodies a new trust relationship between libraries, publishers, and their users. Where RA21 fails, solutions from Google/Twitter/Facebook will succeed. Or, heaven help us, Snapchat.

Whatever RA21 turns out to be, it will add capability to the user authentication environment. IP authentication won't go away quickly - in fact the shortest path to RA21 adoption is to slide it in as a layer on top of EZProxy's IP authentication. But capability can be good or bad for parties in a relationship. An RA21 beholden to publishers alone will inevitably be used for their advantage. For libraries concerned with privacy, the scariest prospect is that publishers could require personal information as a condition for access. Libraries don't trust that publishers won't violate user privacy, nor should they, considering how most of their websites are rife with advertising trackers.

It needn't be that way. RA21 can succeed by aligning its mission with that of libraries and earning their trust. It can start by equalizing representation on its steering committee between libraries and publishers (currently there are 3 libraries, 9 publishers, and 5 other organizations represented; all three of the co-chairs represent STEM publishers). The current representation of libraries omits large swaths of libraries needing licensed resources. MIT, with its huge Class A IP address block, has little in common with my public library, the local hospital, or our community colleges. RA21 has no representation from Asia, Africa, or South America, even on the so-called "outreach" committee. The infrastructure that RA21 ushers in could exert a great deal of power; it will need to do so wisely for all to benefit.

Thanks to Lisa Hinchliffe and Andromeda Yelton for very helpful background.

Would you let your kids see an RA21 movie?
_______________

Update 5/17/2019: A year later, the situation is about the same

Tuesday, December 22, 2015

xISBN: RIP


When I joined OCLC in 2006 (via acquisition), one thing I was excited about was the opportunity to make innovative uses of OCLC's vast bibliographic database. And there was an existence proof that this could be done: a neat little API that had been prototyped in OCLC's Office of Research, called xISBN.

xISBN was an example of a microservice: it offered a small piece of functionality and it did it very fast. Throw it an ISBN, and it would give you back a set of related ISBNs. Ten years ago, microservices and mashups were all the rage. So I was delighted when my team was given the job of "productizing" the xISBN service, moving it out of research and into the marketplace.

Last week, I was sorry to hear about the imminent shutdown of xISBN. But it got me thinking about the limitations of services like xISBN and why no tears need be shed on its passing.

The main function of xISBN was to say "Here's a group of books that are sort of the same as the book you're asking about." That summary instantly tells you why xISBN had to die: any time a computer tells you something "sort of", it's a latent bug. Where you draw the line between something that's the same and something that's different is a matter of opinion and depends on the use you want to make of the distinction. For example, if you ask for A Study in Scarlet, you might be interested in a version in Chinese, or you might be interested in getting a paperback version, or you might want Sherlock Holmes compilations that include A Study in Scarlet. For each question you want a slightly different answer. If you are a developer needing answers to these questions, you would combine xISBN with other information services to get what you need.

Today we have better ways to approach this sort of problem. Serious developers don't want a microservice; they want rich "Linked Data". In 2015, most of us can afford our own data-crunching big-data-stores-in-the-cloud, and we don't need to trust algorithms we can't control. OCLC has been publishing rather nice Linked Data for this purpose. So, if you want all the editions of Cory Doctorow's Homeland, you can "follow your nose" and get all the data you need.

  1. First you look up the ISBN at http://www.worldcat.org/isbn/9780765333698
  2. which leads you to http://www.worldcat.org/oclc/795174333.jsonld (containing a few more ISBNs).
  3. You can then follow the associated "work" record: http://experiment.worldcat.org/entity/work/data/1172568223
  4. which yields a bunch more ISBNs.
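The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the property names (`isbn`, `exampleOfWork`, `workExample`) follow schema.org conventions and may not match exactly what the WorldCat records contain, the `example.org` URLs and placeholder ISBNs are invented sample data, and the `fetch` argument stands in for whatever HTTP client you would use to retrieve and parse the JSON-LD.

```python
def collect_isbns(node, found=None):
    """Recursively gather every value stored under an 'isbn' key."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "isbn" or key.endswith(":isbn"):
                values = value if isinstance(value, list) else [value]
                found.update(str(v) for v in values)
            else:
                collect_isbns(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_isbns(item, found)
    return found


def find_work_url(node):
    """Return the first 'exampleOfWork' link found in the document, or None."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "exampleOfWork" or key.endswith(":exampleOfWork"):
                return value["@id"] if isinstance(value, dict) else value
            result = find_work_url(value)
            if result:
                return result
    elif isinstance(node, list):
        for item in node:
            result = find_work_url(item)
            if result:
                return result
    return None


def follow_your_nose(edition_url, fetch):
    """Collect ISBNs from an edition record, then from its linked work record."""
    edition = fetch(edition_url)
    isbns = collect_isbns(edition)
    work_url = find_work_url(edition)
    if work_url:
        isbns |= collect_isbns(fetch(work_url))
    return isbns


# Invented sample documents standing in for the real responses.
SAMPLE_DOCS = {
    "https://example.org/edition/1.jsonld": {
        "@graph": [{
            "@id": "https://example.org/edition/1",
            "isbn": ["9780765333698", "PLACEHOLDER-ISBN-1"],
            "exampleOfWork": {"@id": "https://example.org/work/1"},
        }]
    },
    "https://example.org/work/1": {
        "@graph": [{
            "@id": "https://example.org/work/1",
            "workExample": [
                {"@id": "https://example.org/edition/1", "isbn": "9780765333698"},
                {"@id": "https://example.org/edition/2", "isbn": "PLACEHOLDER-ISBN-2"},
            ],
        }]
    },
}

if __name__ == "__main__":
    start = "https://example.org/edition/1.jsonld"
    print(sorted(follow_your_nose(start, SAMPLE_DOCS.__getitem__)))
```

What comes back is a raw, unfiltered set. Deciding which of those ISBNs count as "the same book" for your purposes is the application-specific cleanup discussed next.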

It's a lot messier than xISBN, but that's mostly because the real world is messy. Every application requires a different sort of cleaning up, and it's not all that hard.

If cleaning up the mess seems too intimidating, and you just want light-weight ISBN hints from a convenient microservice, there's always "thingISBN". ThingISBN is a data exhaust stream from the LibraryThing catalog. To be sustainable, microservices like xISBN need to be exhaust streams. The big cost to any data service is maintaining the data, so unless maintaining that data is in the engine block of your website, the added cost won't be worth it. But if you're doing it anyway, dressing the data up as a useful service costs you almost nothing and benefits the environment for everyone. Let's hope that OCLC's Linked Data services are of this sort.

In thinking about how I could make the data exhaust from Unglue.it more ecological, I realized that a microservice connecting ISBNs to free ebook files might be useful. So with a day of work, I added the "Free eBooks by ISBN" endpoint to the Unglue.it API.

xISBN, you lived a good micro-life. Thanks.

Wednesday, June 10, 2015

Protect Reader Privacy with Referrer Meta Tags

Back when the web was new, it was fun to watch a website monitor and see the hits come in. The IP address told you the location of the user, and if you turned on the referer header display, you could see what the user had been reading just before.  There was a group of scientists in Poland who'd be on my site regularly- I reported the latest news on nitride semiconductors, and my site was free. Every day around the same time, one of the Poles would check my site, and I could tell he had a bunch of sites he'd look at in order. My site came right after a Russian web site devoted to photographs of unclothed women.

The original idea behind the HTTP referer header (yes, that's how the header is spelled) was that webmasters like me needed it to help other webmasters fix hyperlinks. Or at least that was the rationalization. The real reason for sending the referer was to feed webmaster narcissism. We wanted to know who was linking to our site, because those links were our pats on the back. They told us about other sites that liked us. That was fun. (Still true today!)

The fact that my nitride semiconductor website ranked up there with naked Russian women amused me; reader privacy issues didn't bother me because the Polish scientist's habits were safe with me.


Twenty years later, the referer header seems like a complete privacy disaster. Modern web sites use resources from all over the web, and a referer header, including the complete URL of the referring web page, is sent with every request for those resources. The referer header can send your complete web browsing log to websites that you didn't know existed.

Privacy leakage via the referer header plagues even websites that ostensibly believe in protecting user privacy, such as those produced by or serving libraries. For example, a request to the WorldCat page for What you can expect when you're expecting results in the transmission of referer headers containing the user's request to the following hosts:
  • http://ajax.googleapis.com
  • http://www.google.com (with tracking cookies)
  • http://s7.addthis.com (with tracking cookies)
  • http://recommender.bibtip.de
None of the resources requested from these third parties actually need to know what page the user is viewing, but WorldCat causes that information to be sent anyway. In principle, this could allow advertising networks to begin marketing diapers to carefully targeted WorldCat users. (I've written about AddThis and how they sell data about you to advertising networks.)

It turns out there's an easy way to plug this privacy leak in HTML5. It's called the referrer meta tag. (Yes, that's also spelled correctly.)

The referrer meta tag is put in the head section of an HTML5 web page. It allows the web page to control the referer headers sent by the user's browser. It looks like this:

<meta name="referrer" content="origin" />

If this one line were used on WorldCat, only the fact that the user is looking at a WorldCat page would be sent to Google, AddThis, and BibTip. This is reasonable: library patrons typically don't expect their visits to a library to be private, but they do expect that what they read there should be private.

Because use of third party resources is often necessary, most library websites leak lots of privacy in referer headers. The meta referrer policy is a simple way to stop it. You may well ask why this isn't already standard practice. I think it's mostly lack of awareness. Until very recently, I had no idea that this worked so well. That's because it's taken a long time for browser vendors to add support. Although Chrome and Safari have supported the referrer meta tag for more than two years, Firefox only added it in January of 2015. Internet Explorer will support it with the Windows 10 release this summer. Privacy will still leak for users with older browser software, but this problem will gradually go away.

There are 4 options for the meta referrer tag, in addition to the "origin" policy. The origin policy sends only the origin of the originating page (its scheme, host name and port), not the path or query string.

For the strictest privacy, use

<meta name="referrer" content="no-referrer" />

If you use this setting, other websites won't know you're linking to them, which can be a disadvantage in some situations. If the web page links to resources that still use the archaic "referer authentication", they'll break.

The prevailing default policy for most browsers is equivalent to

<meta name="referrer" content="no-referrer-when-downgrade" />

"Downgrade" here refers to http links in https pages.

If you need the referer for your own website but don't want other sites to see it, you can use

<meta name="referrer" content="origin-when-cross-origin" />

Finally, if you want the user's browser to send the full referrer, no matter what, and experience the thrills of privacy brinksmanship, you can set

<meta name="referrer" content="unsafe-url" />

Widespread deployment of the referrer meta tag would be a big boost for reader privacy all over the web. It's easy to implement, has little downside, and is widely deployable. So let's get started!
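As a rough illustration of the five policies described above, here is a small Python sketch of the value a browser would put in the referer header for a given page and requested resource. It is a simplified model and not browser code: the function names and the `example.org` URLs are made up for the example, origins are compared by scheme and host:port only, and edge cases covered by the actual specification (default ports, credentials in URLs, non-HTTP schemes) are ignored.

```python
from urllib.parse import urlsplit


def origin_of(url):
    """Scheme and host (with port, if given) of a URL, e.g. 'https://example.org/'."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def referer_sent(policy, page_url, target_url):
    """Return the Referer value sent under the given policy, or None if none is sent."""
    page, target = urlsplit(page_url), urlsplit(target_url)
    same_origin = (page.scheme, page.netloc) == (target.scheme, target.netloc)
    downgrade = page.scheme == "https" and target.scheme != "https"
    full_url = page_url.split("#", 1)[0]  # the fragment is never sent

    if policy == "no-referrer":
        return None
    if policy == "no-referrer-when-downgrade":
        return None if downgrade else full_url
    if policy == "origin":
        return origin_of(page_url)
    if policy == "origin-when-cross-origin":
        return full_url if same_origin else origin_of(page_url)
    if policy == "unsafe-url":
        return full_url
    raise ValueError(f"unknown referrer policy: {policy}")


if __name__ == "__main__":
    page = "https://library.example.org/title/expecting?user=42#reviews"
    third_party = "https://cdn.example.com/script.js"
    for policy in ["no-referrer", "no-referrer-when-downgrade", "origin",
                   "origin-when-cross-origin", "unsafe-url"]:
        print(policy, "->", referer_sent(policy, page, third_party))
```

Running it against a third-party script URL shows the difference at a glance: under "origin" the third party learns only that the reader is somewhere on the library's site, while under the prevailing default it receives the full page URL.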


Tuesday, August 10, 2010

A Library Monopsony for Monographic eBook Acquisition?

In its antitrust lawsuit filed against OCLC on July 29, SkyRiver and Innovative Interfaces, Inc. (III) take the point of view that OCLC, a "purported member-based cooperative of libraries", is trying to monopolize the market for integrated library services.

If you're interested in the lawsuit, I recommend reading posts by two of my favorite Karens. Karen Coyle writes intelligently from an annoyed-at-OCLC viewpoint here, here and here, while Karen Schneider writes here with a deep familiarity with the library world's vendors.

What I find remarkable is the fact of the lawsuit itself. Here we have Innovative, one of the world's most successful library systems companies, claiming that OCLC, a creation of libraries themselves, is competing unfairly. It is no coincidence that the lawsuit was filed just two weeks after the announcement of a high profile launch of OCLC's "Web-Scale Management" system at the University of Tennessee at Chattanooga. OCLC's new service is clearly threatening to Innovative. While the lawsuit is ostensibly about OCLC's anti-competitive behavior in the cataloging market, it is motivated by OCLC's entrance as a potential competitor in Innovative's core library management system market.

I don't have much to say about the merits of the lawsuit. I am not a lawyer. I'm not even a librarian. But in general, I think it would be A Good Thing if libraries would find MORE ways to exert their market power. As far as I've seen, libraries usually act like marketplace doormats (enlightened readers of this blog excluded of course). The recent dust-up between Nature Publishing and the University of California would be a dog-bites-man story in almost any other industry.

One of my favorite words is "monopsony". It's one that every library director in the world should come to know, along with the related word "oligopsony". Everyone knows "monopoly", which is when a product or service is available from only one seller. In the SkyRiver lawsuit, it is alleged that OCLC has a monopoly on the provision of cataloging services to libraries. An oligopoly is when a monopoly is shared by a small number of sellers acting as if they were one. A monopsony is the converse of a monopoly, and occurs when a product or service has only one buyer. Sellers in such a market are at the mercy of the buyer. Although it's not often that purchasers amass such market power, it is at the heart of the success of large retailers and manufacturers such as Walmart and Dell. Their power as purchasers allows them to drive down supplier prices.

The ideal time to exert market power of any flavor is at a technological "tipping point". OCLC became a cataloging powerhouse in the 80's by taking advantage of the shift to computerized library catalogs, and its exertion of market power is so feared today because of the current technology shift towards cloud computing.

It's frustrating to a number of us in the library business that libraries are mostly sitting on the sidelines while technology is tipping towards ebooks. There is a very real possibility that the ability of libraries to lend books will not survive this transformation. The big publishers don't see libraries as a big part of their market; some publishers are openly hostile towards libraries.

That's not to say that libraries don't have significant market power in significant segments of the book market. In these segments, I believe libraries could exert their collective power and reshape markets to their enduring benefit.

Consider the business of publishing scholarly monographs. Although some of these books find their way to readers through Amazon, the fact is that most of these books are bought by academic libraries. If academic libraries flexed their purchasing muscles, they could ensure the existence of a library-friendly ebook sales channel for these materials.

Here's the problem. Publishers of scholarly monographs have to spend real money to produce high-quality books. Today, they fund this activity by selling the books to libraries. The shift to eBooks presents new possibilities. If the production of scholarly monographs was funded directly by libraries, then perhaps the funded monographs could be made available to everyone, not just libraries that have chosen to purchase a book. The benefits would be universal access to the scholarship in question and the elimination of expensive and cumbersome DRM platforms.

If you think this sounds suspiciously like open-access publishing, you are correct. Even before the creation of the World-Wide Web, many scientists used the internet to exchange technical articles, and many believed that the entire journal publishing industry would shift to an internet-enabled open-access business model. Yet, 20 years after the creation of the Web, the economics of scholarly journal publishing is roughly the same as it ever was. Is there a difference between ebook publishing and journal publishing? I believe there may be.

Let's set aside current reality for a moment and consider how libraries might accomplish the funding of ebook creation and distribution. I imagine the creation of an ebook acquisition collective. Libraries joining the collective would spend a specified fraction of their book acquisition budget through the group. The collective would offer to buy ebook rights from monograph publishers, with the understanding that the selected ebooks would be made available on an open-access basis. Collective members would decide which books to acquire. Access to this decision-making power would be a strong reason for libraries to maintain their membership; for example, the collective might favor works written by faculty members of participating institutions.

The incentives for publishers to offer books to the collective would be strong; they would get a financial payoff immediately instead of waiting years for the books to sell; a shift to demand-driven purchasing by libraries would have the opposite result. Many publishers would move timidly at first, offering only backlist titles or books that have poor commercial prospects. But publishers have to follow the money, and if the money spent by the collective grew to be large enough, publishers would have little choice but to participate.

The market for any individual scholarly monograph is not very large. For example, Princeton University Press publishes about 200 new books every year, at a cost of about $10 million. So as a very rough average, it needs about $50,000 to produce a book. University publishers that focus more closely on scholarly works produce books for significantly less. The University Press of Colorado, for example, produces 30-35 books a year at a cost of about $550,000, or less than $20,000 per book. At $20,000 per book, an acquisition collective could acquire a significant number of books.

The incentives for library participation are less certain. As libraries face budget pressure, many will be tempted to benefit from access to the acquired titles without contributing their share of funding. The most effective counter-incentive may be prestige. An effective ebook purchasing cooperative would try to maximize the prestige and publicity for the books it chooses to acquire.  This is a factor that has not worked so well for open-access journals, which are almost uniformly of lower prestige than traditionally published journals. A library can't fail to subscribe to a top journal because the faculty will insist on it; a well-marketed purchasing cooperative might provide the same sort of access to prestige as a top journal.

Given the current concern about monopolies in the library world, you might be wondering whether the sort of purchasing cooperative I describe here would be legal. While monopsonies can run into antitrust issues similar to those of monopolies, it's much rarer for this to occur. That's because it's quite easy for a purchasing cooperative to avoid antitrust violations. For example, any purchaser that represents 35% or less of a market can generally assume that its actions aren't impacted by antitrust law. If a library purchasing cooperative found itself getting too big, it could easily limit a member's contribution to 35% of its book acquisition budget. (For background on how antitrust affects group purchasing, see "Antitrust and Group Purchasing", by Michael A. Lindsay, in Antitrust 23 (3), Summer 2009. PDF, 220KB)

Libraries spend quite a bit of money on books. According to ARL Statistics, the 124 ARL libraries spent $330 million on monographs in 2008 at an average price of $55.44. If they spent 10% of this amount through an ebook cooperative, they could purchase ebook rights for 1,650 books at $20,000 each. These ebooks would really be owned by libraries, not merely rented, the way ebooks are handled in libraries today. A collective with monopsony power (or a group of collectives with oligopsony power) could force lower pricing and give strong incentives for cost-efficient publishers. Both monopolies and monopsonies can be perfectly legal if fairly obtained; copyrights and patents are good examples of monopolies created and rewarded by society.
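For anyone who wants to rerun the back-of-the-envelope numbers with their own assumptions, here is the arithmetic as a short Python sketch. The input figures are the rough ones quoted in this post; the function names are just for illustration.

```python
def cost_per_book(annual_cost, books_per_year):
    """Very rough average production cost per title."""
    return annual_cost / books_per_year


def titles_affordable(total_spend, share_percent, price_per_title):
    """Whole number of titles a collective could buy with a share of total spending."""
    pooled = total_spend * share_percent // 100
    return pooled // price_per_title


# Figures quoted above (all approximate).
princeton = cost_per_book(10_000_000, 200)     # about 200 books for about $10 million
colorado_high = cost_per_book(550_000, 30)     # 30 books a year
colorado_low = cost_per_book(550_000, 35)      # 35 books a year
arl_titles = titles_affordable(330_000_000, 10, 20_000)

if __name__ == "__main__":
    print(f"Princeton: ${princeton:,.0f} per book")
    print(f"Colorado: ${colorado_low:,.0f} to ${colorado_high:,.0f} per book")
    print(f"ARL collective at 10%: {arl_titles:,} titles")
```

Changing the share from 10% to some other fraction, or the price per title from $20,000 to the Princeton-style $50,000, shows quickly how sensitive the size of the collective's list is to those two assumptions.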

It's interesting to compare this vision for the future with the recently announced University Press eBook Consortium, which has won a modest planning grant from the Mellon Foundation. This group imagines building a university-press-branded delivery platform for its ebook offerings and expects to launch with "over 2000 new titles and 23,000 older titles in subject-area collections, as well as a complete collection offer." The subscription business model will presumably be imported from the scholarly journal business, which is to exert the monopoly power of copyright to optimize revenue. Libraries need to think seriously about the e-journal subscription model, whether it has worked out well for them, and whether they should try something different.

If you think that an ebook acquisition collective is a good idea for libraries, please leave a comment, and spread the word. Lenders of the world, unite!

Note: SkyRiver also alleges that OCLC used its "tax-free profits" to buy up library industry companies to extend its monopolies.  One of those companies was the one that I founded. You may feel better knowing that in my case at least, a large chunk of that "tax-free" money went straight to the Internal Revenue Service!

Thursday, March 18, 2010

After EBSCO Hooks NetLibrary Will Others Take the eBook Bait?

Don Linn's post on the business of book publishing, "Risk and Return in the Time of Cholera", did a good job of explaining the John Sargent "disastrous but stable" comment that I reported last week. The observation that I found most interesting in Linn's article was that technological change surrounding the transition to ebooks will necessitate significant expenditures of capital. "Given weak balance sheets and low ROI's, where will the capital come from to finance the rapid innovation and change required?"

Yesterday a major transaction was announced involving one of the leading ebook distributors in libraries. OCLC sold most of its NetLibrary Division to EBSCO Publishing. The transaction is surely a manifestation of the need for innovation capital identified by Don Linn.

First, I need to make disclaimers. For the most part, I've avoided writing here about OCLC, the world's largest library cooperative.  I worked there for three years, and I have continuing obligations regarding proprietary information, so writing about OCLC means I have to do extra work to make sure that things I know and write are public information. Although I didn't work with the NetLibrary division at all, you can't work somewhere for three years without developing bias, so consider yourself disclaimered.

NetLibrary was a bubble-era dot-com that was the first company to try to make a business of creating, aggregating and selling ebooks. eBook adoption was too slow for NetLibrary to generate the returns that investors had hoped for, and it burned through over $100 million of venture capital and crashed. OCLC was the white knight that rode in to rescue the company, picking it out of the bankruptcy dumpster for only $10 million. OCLC said that it did so to protect the investments its members had made in NetLibrary ebooks, but another way to understand its motivation is to note that libraries and the institutions that serve them have much longer horizons than venture capitalists, and OCLC could afford to wait for the day that ebooks would transition from curiosity to widely used medium.

As part of OCLC, NetLibrary's market presence grew steadily along with the library ebook market. Its content expanded to audiobooks and the FirstSearch article databases,  but its technology was designed long before Kindle and iPad came along. While you can listen to a NetLibrary audiobook on your iPod, you can't read a NetLibrary ebook on your iPhone. NetLibrary now uses the Adobe Content Server to allow its PDFs to be read on the nook from Barnes & Noble and on Sony Digital Readers. Clearly NetLibrary will need some significant investment to keep up with the rapidly changing ebook environment.

The sale of NetLibrary should be viewed primarily as a capital allocation decision by OCLC. eBooks and eReaders are not the only change happening in the library world, and NetLibrary is not the only major product at OCLC that would suck up significant capital. OCLC is making significant investments in cloud-based library management services based on WorldCat and WorldCat Local, and sensibly managed businesses, even non-profit ones, allocate capital according to the potential value created.

With capable ebook competitors such as Overdrive, ebrary, Myilibrary (part of Ingram Digital Group) and others, it's difficult to make the case that NetLibrary was providing unique value or substantial cost savings for OCLC member libraries. In contrast, WorldCat is a unique resource and the library management services being built on it promise a revolution in the way libraries work. According to OCLC VP Chip Nilges, quoted in an article worth reading in Library Journal, selling NetLibrary is "a strategic repositioning from hosting and reselling content to building WorldCat out as a platform that libraries can use to manage and provide access to their entire collection."

NetLibrary's presence in the ebook market may also have conflicted with OCLC's desire to catalogue, expose and link to every ebook held by libraries. To best do this, OCLC needs cooperation from ebook vendors other than NetLibrary. These competitors probably weren't happy that OCLC's library holdings database constituted valuable market intelligence: what they were selling and who they were selling it to.

If OCLC wasn't willing to finance rapid ebook innovation, why does EBSCO Publishing appear to be willing to do so?

EBSCO is one of the more unusual players in the library space. EBSCO started out selling magazine subscriptions. Elton B. Stephens, the company's founder and the EBS of EBSCO, noticed that his customers, which included the military, needed binders to put the magazines in and shelves to put them on, so he started selling binders and shelving. EBSCO grew into the largest subscription agency in the world, and provides libraries and corporations tools to create and manage their virtual magazine shelves. Somewhere along the way it also became the largest fishing lure manufacturer in the world.

The reason that a move into ebooks makes sense for EBSCO is that ebook purchases are really subscriptions. The print book production and distribution chain was built under the assumption that once the book was delivered to the customer, the transaction was done and could be forgotten. Magazine subscriptions, by contrast, are continuing relationships. Electronic magazines and journals require even more continuing support, and this is true for ebooks as well. A corporate infrastructure built to sell and support magazine subscriptions works well for supporting ebooks.

I think the answer to Don Linn's question is that the capital to support rapid innovation in ebooks will come (and has come) from incumbents in adjacent industries with expertise in products that are not print books. Amazon first developed eCommerce capability; Apple developed consumer devices and a content marketplace; Google sells ads and delivers search. Starbucks does storefronts.

I wouldn't be surprised if one or more of the NetLibrary competitors I named above are soon acquired by "adjacent industry incumbents". The comment thread is open for your speculatory pleasure.