Wednesday 9 September 2015

The Reproducibility Problem

For some years now psychologists have been talking about the Reproducibility Problem. This is where a team of researchers publishes a paper showing some new and interesting result, but when another team uses the same method to try to reproduce it, they cannot detect the effect. This matters because one paper is never proof of anything; any single result can be an anomaly. The ability to faithfully reproduce an effect is one of the cornerstones of the scientific method, and it is how we know that what we see is a real effect.

An open access article in Science this week reports on a large collaborative effort to put some flesh on the bones of this problem. Nosek et al. attempted to replicate 100 wide-ranging psychological studies published in 2008 in three high-quality psychology journals. They followed the methods of the original studies as closely as possible and used the same materials where available.

They found that 97% of the original studies had a statistically significant result but that only 36% of the replications did. The mean effect size also more than halved in the replications. This all ties in with what I was talking about in a recent post on p-hacking, the practice, conscious or otherwise, of massaging your results into significance. Although this study focussed on psychology, I don't think it is unreasonable to assume that many fields of science suffer from a similar problem to some degree. It is a symptom of researchers being rewarded for new and innovative research but not for reliable reproduction of existing research. To reiterate the point from my previous post, I don't think that this means science is broken, but it does mean that it is inefficient.
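To see how rewarding only significant results can produce this pattern, here is a toy simulation. It is a minimal sketch of my own and not the analysis used in the Science paper. The true effect size (0.2), the sample size (30) and the number of studies (5000) are assumptions picked purely for illustration, and the function names are hypothetical. Every simulated study measures the same small, real effect, but only the ones that reach p < 0.05 get "published", and each published study is then replicated once.

```python
import math
import random
import statistics


def run_study(true_effect, n, rng):
    """Simulate one study: n observations drawn from N(true_effect, 1).

    Returns the estimated effect (the sample mean) and a two-sided
    p-value from a z-test with known standard deviation 1.
    """
    sample = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    estimate = statistics.fmean(sample)
    z = estimate * math.sqrt(n)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area
    return estimate, p_value


def simulate(true_effect=0.2, n=30, n_studies=5000, alpha=0.05, seed=1):
    """Publish only significant studies, then replicate each one once."""
    rng = random.Random(seed)
    originals = [run_study(true_effect, n, rng) for _ in range(n_studies)]

    # Assumed publication filter: only significant originals are published.
    published = [(est, p) for est, p in originals if p < alpha]

    # One replication per published study, same design, fresh data.
    replications = [run_study(true_effect, n, rng) for _ in published]

    n_significant = sum(1 for _, p in replications if p < alpha)
    return {
        "published_mean_effect": statistics.fmean([e for e, _ in published]),
        "replication_mean_effect": statistics.fmean([e for e, _ in replications]),
        "replication_significant_rate": n_significant / len(replications),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this toy world every published original is significant by construction, yet the replications come out significant far less often and with a smaller average effect, even though nobody has massaged anything. The selection step alone is enough to inflate the published effects.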

A large part of the blame lies with journal editors who aren't interested in publishing replications, but also with those in charge of allocating grant money and with Principal Investigators who want to up their research impact. I'm afraid that, at the end of the day, a lot of science is simply not new and sexy; it is meticulous, hard graft that won't set the world afire but will, slowly, push back the boundaries of our knowledge. Nosek sums up the situation well in the paper:

Reproducibility is not well understood because the incentives for individual scientists prioritize novelty over replication. Innovation is the engine of discovery and is vital for a productive, effective scientific enterprise. However, innovative ideas become old news fast. Journal reviewers and editors may dismiss a new test of a published idea as unoriginal. The claim that “we already know this” belies the uncertainty of scientific evidence. Innovation points out paths that are possible; replication points out paths that are likely; progress relies on both. Replication can increase certainty when findings are reproduced and promote innovation when they are not. This project provides accumulating evidence for many findings in psychological research and suggests that there is still more work to do to verify whether we know what we think we know.


