Thursday, March 12, 2015

16 of the top 20 Research Journals Let Ad Networks Spy on Their Readers

A recent query to the "LibLicense" listserv asked:
Is there any kind of organization that has put together a website or list of database providers/publishers that indicate the extent to which they respect patron privacy?
The answer is "no", but I thought it would be useful to look at the top journal publishers to see if their websites are built with an orientation towards reader privacy.

I came up with a list of 20 top journals. I took the 10 journals with the most citations and the 10 journals with the most citations per published article, according to the SCImago journal rankings.

I used Ghostery to count the number of trackers present on the web page for an article in each journal. Each of these trackers gets a feed of each user's browsing behavior. I looked at the trackers to see if user browsing behavior was being sent to advertising networks. I also determined whether the journal supported secure connections. Based on these results, I assigned a letter grade for each journal.
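To give a rough sense of what a tool like Ghostery is counting, here is a minimal Python sketch. It is not Ghostery's actual implementation: the `KNOWN_TRACKERS` table, the `match_tracker` and `count_trackers` functions, and the example URLs are all illustrative assumptions. It takes the URLs a page requests, keeps the ones whose host matches a known tracker domain, and reports which of those belong to advertising networks.

```python
from urllib.parse import urlparse

# Illustrative table only (an assumption, not Ghostery's database):
# tracker domain -> (tracker name, is it an advertising network?)
KNOWN_TRACKERS = {
    "google-analytics.com": ("Google Analytics", False),
    "doubleclick.net": ("DoubleClick", True),
    "statcounter.com": ("Statcounter", False),
}


def match_tracker(host):
    """Return the (name, is_ad_network) entry whose domain matches host, or None."""
    host = host.lower().rstrip(".")
    for domain, entry in KNOWN_TRACKERS.items():
        # Match the domain itself or any subdomain of it.
        if host == domain or host.endswith("." + domain):
            return entry
    return None


def count_trackers(request_urls):
    """Return (distinct tracker names, distinct ad-network names), both sorted."""
    trackers, ad_networks = set(), set()
    for url in request_urls:
        host = urlparse(url).hostname
        if host is None:
            continue
        entry = match_tracker(host)
        if entry is None:
            continue
        name, is_ad = entry
        trackers.add(name)
        if is_ad:
            ad_networks.add(name)
    return sorted(trackers), sorted(ad_networks)


# Hypothetical requests made while loading an article page.
example_requests = [
    "http://journal.example.org/article/123",
    "http://www.google-analytics.com/ga.js",
    "http://ad.doubleclick.net/adj/example",
    "http://www.google-analytics.com/__utm.gif",
]
trackers, ad_networks = count_trackers(example_requests)
print(len(trackers), "trackers;", len(ad_networks), "advertising networks")
```

Repeated requests to the same tracker count once, which matches how the per-journal tracker counts in this post are reported.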

Passing, Grade A

None of the scholarly journals I looked at earned excellent grades for reader privacy.

Passing, Grade B

Two journals, both published by the American Physical Society, earned good grades for reader privacy. They use a social sharing widget that respects privacy.

Reviews of Modern Physics. Ranked #2 in citations/article. 1 Tracker (Google Analytics). No advertising networks. Supports HTTPS, but allows insecure connections.
Physical Review Letters. Ranked #9 in total citations, #393 in citations/article. 1 Tracker (Google Analytics). No advertising networks. Supports HTTPS, but allows insecure connections.

Passing, Grade C

Two journals, both published by Annual Reviews, earned acceptable grades for reader privacy.

Annual Review of Immunology. Ranked #3 in citations/article. 1 Tracker (Google Analytics). No advertising networks. Insecure connections only.
Annual Review of Biochemistry. Ranked #5 in citations/article. 1 Tracker (Google Analytics). No advertising networks. Insecure connections only.

Failing, Grade D

Failing grades are earned by publishers that allow their readers to be tracked by advertising networks. These networks get access to the full browsing history of a user and track them with cookies; it's difficult for users to maintain anonymity when most of their web browsing is exposed to tracking.

Science, published by AAAS. Ranked #5 in total citations, #49 in citations/article. 10 Trackers. Multiple advertising networks. Science gets a D rather than an F because it supports HTTPS, although it allows insecure connections.

Failing, Grade F

15 journals earned failing grades because their participation in advertising networks exposes their readers to tracking and spying. Some of the publishers are more flagrant about this than others. Maybe I should have given F+ to some and F- to others. All of these journals force insecure connections.


PLoS One, published by the Public Library of Science. #1 in total citations, #1776 in citations/article. 3 trackers. One advertising network.
Proceedings of the National Academy of Sciences of the United States, published by the National Academy of Sciences. #2 in total citations, #155 in citations/article. 3 trackers. One advertising network.
Journal of Biological Chemistry, published by the American Society for Biochemistry and Molecular Biology. #8 in total citations, #513 in citations/article. 3 trackers. One advertising network.
Quarterly Journal of Economics, published by Oxford Journals. #6 in citations/article. 4 trackers. One advertising network.
Chemical Communications, published by the Royal Society of Chemistry. #10 in total citations, #680 in citations/article. 6 trackers. Multiple advertising networks.
Journal of the American Chemical Society, published by the American Chemical Society. #4 in total citations, #185 in citations/article. 7 trackers. Multiple advertising networks.
Chemical Reviews, published by the American Chemical Society. #10 in citations/article. 8 trackers. Multiple advertising networks.
CA: A Cancer Journal for Clinicians, published by Wiley. #1 in citations/article. 9 trackers. Multiple advertising networks.
Cell, published by Elsevier. #4 in citations/article. 9 trackers. Multiple advertising networks.
Angewandte Chemie - International Edition, published by Wiley. #6 in total citations, #202 in citations/article. 11 trackers. Multiple advertising networks.
Nature Genetics, published by Nature Publishing Group. #7 in citations/article. 11 trackers. Multiple advertising networks.
Nature, published by Nature Publishing Group. #3 in total citations, #11 in citations/article. 11 trackers. One advertising network.
Nature Reviews Genetics, published by Nature Publishing Group. #8 in citations/article. 12 trackers. Multiple advertising networks.
Nature Reviews Molecular Cell Biology, published by Nature Publishing Group. #9 in citations/article. 13 trackers. Multiple advertising networks.
New England Journal of Medicine, published by the Massachusetts Medical Society. #7 in total citations, #41 in citations/article. 14 trackers. Multiple advertising networks.
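
The grading used above can be summarized as a small decision rule. The Python sketch below is my reading of the grade descriptions, not a formal scoring system: the function name `privacy_grade` and its parameters are made up for illustration, and because no journal earned an A, the condition for an A (no trackers at all and HTTPS-only access) is an assumption rather than something stated in the post.

```python
def privacy_grade(tracker_count, has_ad_networks, supports_https, https_only=False):
    """Assign a letter grade following the rubric implied by the grade sections."""
    if has_ad_networks:
        # Any advertising network is a failing grade; HTTPS support turns an F into a D.
        return "D" if supports_https else "F"
    if not supports_https:
        # No advertising networks, but insecure connections only.
        return "C"
    # Assumption: grade A is not defined in the post. Here it requires
    # no trackers at all and no fallback to insecure connections.
    if tracker_count == 0 and https_only:
        return "A"
    return "B"


# Inputs taken from the listings above.
print(privacy_grade(1, False, True))    # Reviews of Modern Physics
print(privacy_grade(1, False, False))   # Annual Review of Immunology
print(privacy_grade(10, True, True))    # Science
print(privacy_grade(14, True, False))   # New England Journal of Medicine
```

Note that under this rule the number of trackers only matters for the hypothetical A grade; the presence of advertising networks and HTTPS support decide everything else.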

Remarks

I'm particularly concerned about the medical journals that participate in advertising networks. Imagine that someone is researching clinical trials for a deadly disease. A smart insurance company could target such users with ads that mark them for higher premiums. A pharmaceutical company could use advertising targeting researchers at competing companies to find clues about their research directions. Most journal users (and probably most journal publishers) don't realize how easily online ads can be used to gain intelligence as well as to sell products.

In defense of the publishers, it should be noted that the web advertising business has developed very rapidly over the past few years due to intense competition. A few years ago, the attacks on user privacy enabled by the ad networks' massive data collection were mostly theoretical. But competition has led the networks to increase their targeting ability and scoop up more and more "demographic" data. What was theory a few years ago is today's reality. We still have time to prevent tomorrow's privacy disaster, but change will only happen if the institutions that purchase and fund these journals learn what's really going on and start to demand the privacy that readers deserve.

6 comments:

  1. It isn't just the journals. Less than three weeks ago, Brian Merchant at Motherboard posted "Looking Up Symptoms Online? These Companies Are Tracking You", pointing out that health information sites such as WebMD and, even less forgivably, the CDC, are festooned with trackers.

  2. Your own blog has six (6) trackers on it, Ghostery says.

    Replies
    1. I've blogged about the trackers on this site previously.

      My Ghostery counts 4 trackers on this page: DoubleClick, Statcounter, Facebook Connect, and Twitter Button. I'd be interested to find out what the other two you get are. DoubleClick is not one I've put there; I presume that Google puts it there since I don't pay for Blogger, as you don't.

      If Go To Hellman were a scholarly journal, I'd give it an F. On the other hand, if people were paying me for it, I would certainly not be selling their clickstream to advertising networks.

  3. I see DoubleClick, Facebook Connect, Google AdSense, Google+ Platform, Statcounter, and Twitter Button.

    There are a lot of problems with this "analysis" of yours. The vast majority of trackers do not sell information to ad networks or do things invasive of privacy, and you make no distinction between mainstream, tested, and innocuous trackers and potentially malicious ones (which aren't used by professional advertisers). Trackers simply help the ad systems not deliver redundant ads and help systems count clicks and response rates. One thing online advertisers have to guard against is bot traffic, which can deliver false impressions and burn through purchased inventory. If a site doesn't have verified true traffic, advertisers will not advertise. What is the true traffic rate for these sites? Do they meet IAB standards? Do their trackers? Tracking systems help screen out bot traffic, and allow sites to deliver true traffic to their advertisers. No private or personal information is shared, and your paranoid example of someone searching for a serious illness and having that information sold to somebody seeking to exploit that information is pure fantasy. The technologies don't work that way.

    The reason sites with more advertising have more trackers is that they have a broader array of customers. This leads to having to accommodate more different solutions to ad tracking. Advertisers can prefer one ad system over another, and if you have more advertisers, you will have more tracking systems across your publications.

    Tracking ads is not spying on users. That's a key distinction you're missing.

    Replies
    1. I agree there are benign trackers. I also agree that trackers support advertising so that web advertising without trackers is significantly less valuable than advertising with trackers. For example, Google Analytics at present is benign if you believe the assurances given in their privacy policy. But how do you really know?

      However, there are trackers that are evil. They respawn cookies. They use "canvas fingerprinting". They track users across devices. They ignore "do not track". They enable ad-stalking of users. I have myself been stalked by ads, despite enabling the "do-not-track" header.

      Did you know that some ad networks let advertisers execute JavaScript? Have you ever thought about how you might use an ad network to spy on users? Have you read the so-called privacy policies for the trackers that bother to have them? Do you agree that the ad trackers should be allowed to retain all that data, every click from every userid, forever? Are you OK with the ad trackers selling that data to other companies?

      It's true that the vast majority of ads and advertisers are benign. The vast majority of ads are not intrusive. But when a library spends $5,000 a year so their users can access NEJM, giving their entire clickstreams to 14 ad trackers is probably not what they think they're paying for.

    2. PS. Do you work at the Massachusetts Medical Society? My tracker says you might.
