Thursday, November 18, 2010

Real Research Gets Reproduced

It's not often that I'm identified as a physicist, as Richard Curtis did in his commentary on my followups of Attributor's piracy "demand" report. But it's true, I worked in materials physics research at Bell Labs from 1988-1998.

Crystal structure of YBCO
Those were great years to be in materials physics. In 1986, two guys at IBM Zurich discovered some amazing new superconducting materials. By the end of that year, a team in Japan had reproduced their results; a group I was a part of at Stanford did the same after talking with the Japan group in December. By March, so many groups around the world had made exciting discoveries that the American Physical Society meeting in New York became known as "the Woodstock of Physics".

A blue semiconductor laser.
A few years later, a guy in Japan reported that he had made a semiconductor light emitting diode (LED) glow blue. His work was a lot harder to reproduce: although he published many details, it took years of hard work for anyone to come close to what his team reported. I even sawed one of his LEDs in half to try to understand how it worked. Today, my kitchen (and the screen of my MacBook) is lit by white LEDs made from that same semiconductor.

Around that time, some chemists in Utah announced a truly amazing discovery: they saw fusion reactions occurring in palladium electrochemical cells. Since they were respected electrochemists, their results were taken seriously, and lots of people tried to reproduce the incredible results. The promise of a seemingly magical, unlimited power source seemed almost too good to be true. This time, however, nobody could reproduce the results. Some scientists saw odd things happen, but they were different in every lab. At Bell Labs, the scientists trying to reproduce so-called "cold fusion" became convinced that the guys in Utah were being led astray by their excitement.

In science, it's usual that a surprising result will only be accepted once it has been reproduced by someone else. My scientific training has sometimes gotten me in trouble in the world of libraries and publishing. When presented with something that seems surprising to me, I ask for the evidence. In cultures that are more comfortable assigning and recognizing authority, my questions have sometimes been seen as irritants.

It's been that way with my questions about the Attributor report. I was surprised at some of the findings, and I tried to reproduce them. I could not reproduce some of the key findings reported by Attributor. It would be nice to better understand the factor of a hundred difference between my results and those of Attributor; much might be learned from such an analysis. Attributor is a company that sells anti-piracy services; one would hope that the data they report is somehow rooted in fact, even though they benefit from overestimates of piracy.

In Richard Curtis' article, Jim Pitkow, Attributor's CEO, is quoted:
Our study’s rigorous methodology ensured highly accurate results that align with actual consumer behavior. We analyzed 89 titles, using multiple keyword permutations per title, across different days of the week, with very high bids to ensure placement – each of which is fundamental in guaranteeing accuracy and legitimacy. Each of these variables impact the findings, and analyzing all variables together produce highly accurate results. We stand by our research, and we’re confident that the study addresses an accurate portrayal of the consumer demand for pirated e-books.
If Attributor really stands by its research, it will make it easier for people like me to reproduce their results. In particular, they should publish the complete list of the "869 effective keyword terms" used as keywords for their Google AdWords experiment. There are mistakes they might have made in permuting and combining search terms; they might also have thought of a class of effective search terms that my study totally overlooked. As it stands, it's impossible to know.

I can understand why Attributor might not want to release their search term list. First of all, they should expect people to try to tear it to shreds. The marketing department isn't going to like that. But that's what happened to the superconductor guys, the blue LED guy, and the cold fusion guys. They stood behind their work, and let the scientific community look for weaknesses and make their own judgments.

Cold fusion didn't pan out, and Pons and Fleischmann, the Utah guys, tried for years to figure out what it was they measured. Bednorz and Müller, the guys in Zurich, won the Nobel Prize. Shuji Nakamura, the LED guy, won a Millennium Technology Prize and a lawsuit.

It may be easier to do a followup study without the worry of spurious searches for widely known terms. But at this point, Attributor customers and the book industry as a whole stand to learn a lot from understanding where the irreproducibility of Attributor's study is coming from. Publishers need that information to plan out a response to the threat of ebook piracy, and their needs should come first, no matter what the marketing department says.


  1. Early 20th c. French author Paul Valéry has, for me, the definitive quote on the subject:
    "Rappelez-vous tout simplement qu’entre les hommes il n’existe que deux relations: la logique ou la guerre. Demandez toujours des preuves, la preuve est la politesse élémentaire qu’on se doit. Si l’on refuse, souvenez-vous que vous êtes attaqué, et qu’on va vous faire obéir par tous les moyens."

    Which roughly translates as:
    "Remember that between people only two kinds of relationships exist: logic or war. Always ask for proofs; proof is the basic politeness one is owed. If it's refused, remember that you are being attacked, and that they'll make you obey by every possible means."

  2. Eric, I have nothing to add but: Hell, yeah! If a result can't be reproduced, it can't be trusted, period.

  3. Great post, Eric. It drives me bonkers when people make those kinds of statements without revealing methodology. It's a bigger issue in libraryland as a whole, but I digress.

  4. Your statements about cold fusion are incorrect. By 1991 over 100 major labs reproduced cold fusion, and by now over 200 have, in roughly 14,000 positive experimental runs, according to a tally at the Institute of High Energy Physics, Chinese Academy of Sciences.

    I have a collection of 1,200 peer-reviewed journal papers about cold fusion from the library at Los Alamos, and 2,500 others from various sources. I suggest you review this literature before commenting on this research. See: