I’m listening to a great Long Now Seminar by Nassim Taleb about probability in complex systems, and it reminded me of an idea. Nassim gives only what he calls “negative advice,” meaning advice about what not to do. He considers positive advice useless and laments that it’s so hard to find books called Ten Ways to Screw Up Your Life or How I Lost a Million Dollars, compared to stuff like Ten Steps to Success.

There is a related publishing problem in psychology, and perhaps other sciences: if your idea doesn’t work out, you can’t get it published. Journals do not want to publish failed experiments. They just aren’t sexy. The problem is, at a typical alpha of .05, one in twenty experiments testing an effect that isn’t really there will still come up significant, purely by chance: flukes, not reliable, not indicative of anything real going on. Even with a more rigorous alpha of .01, you will get a false positive in one out of every 100 such experiments, on average.
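
To see that arithmetic in action, here’s a rough sketch, purely for illustration (it assumes Python with numpy and scipy handy, and the group sizes and number of runs are made up). It draws two groups from the same population over and over and counts how often a t-test calls them “different” at alpha = .05. About one run in twenty comes up a fluke.

```python
# Hypothetical illustration: false positives when there is no real effect.
# Both groups come from the same distribution, so every "significant"
# result is a fluke. At alpha = .05 we expect roughly 5% of runs to be flukes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_experiments = 10_000
alpha = 0.05

false_positives = 0
for _ in range(n_experiments):
    group_a = rng.normal(loc=0.0, scale=1.0, size=30)  # no real effect
    group_b = rng.normal(loc=0.0, scale=1.0, size=30)  # same population
    _, p_value = stats.ttest_ind(group_a, group_b)
    if p_value < alpha:
        false_positives += 1

print(f"Fluke rate: {false_positives / n_experiments:.3f}")  # prints roughly 0.05
```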

Research psychologists know this. They get a lot of training in statistics. They do not feel certain about their own results until the results have been replicated in other labs. But they rely on what is published for their input. For my honors thesis, for example, I was interested in how the effects of having power over others compare to the effects of having power over yourself. So I read the literature on power and designed my experiment first to replicate the results of two experiments from a famous paper that showed evidence for social power inhibiting perspective taking, and then to extend that research a little by adding a “personal power” condition. Almost every paper on power mentions that social power inhibits perspective taking, and nearly all of them cite this famous paper to back it up. The author is prolific and well-respected, and rightfully so. He does really creative, interesting work.

Despite my considerable efforts to duplicate his methods, however, I replicated none of his results. “These are the flattest data I’ve ever seen,” said Sean, my advisor. That was a problem for my honors thesis, because the question I wanted to look at never came up—I had nothing to compare my personal power numbers to. I had a conversation with this famous psychologist later and found out that he had not been able to replicate his results either. Now flat data is not a problem for science; every researcher I’ve talked to about it has said something like, “Hmm! It didn’t replicate, huh? That’s really interesting!” The problem is, that information was already out there and I couldn’t get to it. This scientist knew about the problem, but I didn’t. Now I know about it, but no one outside of my lab will know, because no one will publish it. The next person who has my idea will make the same mistake, and the next.

The solution:

First, an idea either stolen from or adapted from my advisor: a high-quality psychology journal called Null Results in Psychology, with a mission to publish peer-reviewed failures. It might need to be an online-only journal, because it would need to be big. If such a thing had existed a year ago, I could have run a standard check and saved myself a lot of trouble.

Second, another journal called Replicated Results in Psychology, which would publish peer-reviewed, successful replications of previous research. Or perhaps these two could be combined into one. It doesn’t matter.

Third, both of these journals could be attached to a database that compiled and cross-referenced replications and failed replications. Ideally, the strength of a theory or a piece of evidence would rest on how well it predicts the future. In practice, however, this is only partly the case, and it turns out to be true only in the long run. The weight carried by a theory or evidence has at least as much to do with the fame of the scientist who produced it as with its predictive track record. Everyone waits for a famous scientist’s new work and reads it immediately. There is a database that records how often a paper is cited, but the number of citations tells you only the relative fame of that paper. It doesn’t say whether the citations are supportive or critical. And most citations are neither: they are simply used to support the citing author’s argument.
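
To make the database idea a little more concrete, here is a minimal sketch of what one record in it might look like. Everything in it (the field names, the outcome labels, the example entries) is hypothetical, just one way of capturing whether a follow-up study supported the original result or came up flat, which is exactly what a citation count can’t tell you.

```python
# Hypothetical sketch of a replication-tracking record. Field names and
# outcome labels are invented for illustration; a real database would also
# need effect sizes, methods details, and peer-review status.
from dataclasses import dataclass
from typing import Literal

@dataclass
class ReplicationRecord:
    original_study: str                                  # identifier of the original paper
    replicating_lab: str                                 # who ran the follow-up
    outcome: Literal["replicated", "null", "mixed"]      # what they found
    sample_size: int
    notes: str = ""

# With records like these, "how solid is this finding?" becomes a query,
# not a question of how often (or how admiringly) the paper is cited.
attempts = [
    ReplicationRecord("example-power-paper", "Lab A", "replicated", 60),
    ReplicationRecord("example-power-paper", "Lab B", "null", 85),
]
replicated = sum(1 for r in attempts if r.outcome == "replicated")
print(f"{replicated} of {len(attempts)} attempts replicated the original result")
```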

Easy access to null results, replicated results, and a database linking it all together could change the direction and the pace of progress in psychology. It could also make learning psychology more interesting. My professors were mostly very good about not just teaching theories. They presented (and had me memorize) the experimental methods and evidence that led to the formulation of the theories. Even so, I often wondered how soon and in what way these theories would come to seem quaint, like phlogiston or “the ether.” Early evidence supported those ideas, too, after all. I would have loved it if evidence could have been presented something like, “OK, we’re starting to feel pretty good about these results, because these variations have been tried by 30 different labs, and 25 of them found the same thing.” I can imagine the groans of my fellow students and the cheers of my professors, which makes me think it’s a good idea.