Are Most Published Research Findings False?

Eric Schuur

Many people are aware of John Ioannidis’s work analyzing research findings and the conclusions drawn from them. He described these ideas in a paper published in PLOS Medicine in 2005, which is apparently the most downloaded article from that journal.

I’ve had this article on my mental favorites list for some time now. I am finally putting a few words in print about it, mostly to put a stake in the ground on this issue, because I believe it is an important one in this era of high-volume research reporting. In short, I agree with the article’s main conclusion, although I might phrase it as “most published biomedical research conclusions are not true”. This is not to say I think there is some conspiracy, or that statistics are useless. To the contrary: statistics is an enormously useful field of applied mathematics. I also think a great deal of very good research is being done in labs and clinics around the world by dedicated and smart researchers.
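The core of Ioannidis’s argument can be seen in a simple positive predictive value calculation: even with conventional significance thresholds and reasonable power, a low prior probability that a tested hypothesis is true means many “significant” findings are false positives. This is a sketch of that arithmetic; the specific prior, power, and alpha values below are my own illustrative assumptions, not numbers from the paper.

```python
def ppv(prior, power, alpha):
    """Fraction of statistically significant findings that are actually true.

    prior: probability that a tested hypothesis is true
    power: probability of detecting a true effect
    alpha: false-positive rate for a null effect
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)

# Exploratory field: only 1 in 10 tested hypotheses is actually true.
print(round(ppv(prior=0.10, power=0.80, alpha=0.05), 2))  # 0.64
# With low power, typical of small studies, it gets much worse:
print(round(ppv(prior=0.10, power=0.20, alpha=0.05), 2))  # 0.31
```

Under these assumptions roughly a third of “p < 0.05” findings are false even before any bias or flexible analysis is considered, which is the quantitative heart of the “most findings are false” claim.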

My concern over the veracity of biomedical research results, and how they are reported, stems from the nature of statistical models and tests versus how they are interpreted and reported. Wrapped up in that discussion is another one about the unspoken assumptions underlying both our biological and statistical models.

Perhaps the stickiest issue for me is the use, or misuse, of p values in many published studies. Without getting too long-winded about it, far too often the p value is used all by itself and given the status of a “stamp of approval”. Using a p value in isolation (e.g., “p = 0.001, therefore I won!”) ignores a lot of important information. What type of test did you “win”? What null distribution did you assume for that test? Are your assumptions correct? Did you keep testing data until you found the p value you were hoping for?
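That last question, testing until a desired p value appears, is worth making concrete. Below is a small simulation of “optional stopping”: both groups are drawn from the same distribution, so every significant result is a false positive, yet peeking at the data after each batch pushes the false-positive rate well above the nominal 5%. This is my own illustrative sketch, not an analysis from any of the articles mentioned here.

```python
import random
import statistics

def t_stat(a, b):
    """Welch-style t statistic; |t| > 1.96 roughly corresponds to p < 0.05."""
    va, vb = statistics.variance(a), statistics.variance(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se

random.seed(0)
trials = 1000
false_positives = 0
for _ in range(trials):
    a, b = [], []
    for _batch in range(10):              # peek after each batch of 10 per group
        a += [random.gauss(0, 1) for _ in range(10)]
        b += [random.gauss(0, 1) for _ in range(10)]
        if abs(t_stat(a, b)) > 1.96:      # "significant" -- stop and publish
            false_positives += 1
            break

# No real effect exists, yet far more than 5% of trials find one.
print(false_positives / trials)
```

With ten chances to stop early, the realized false-positive rate is typically three to four times the nominal 5%, even though each individual test was performed “correctly”.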

Fortunately, I think the wider scientific community is waking up to the deficiencies in the most commonly used statistical analysis scenarios. This recent article from Genomeweb does a nice job describing the basic appropriate role for statistical analyses in biomedical research. An important distinction pointed out in their article is that statistical significance and biological (or clinical) significance are two different things. When we rely on statistics to identify important relationships within a vast ocean of information, it is all the more important to understand what these mathematical tools are telling us.

As the wise scientist once said, “Never assume anything other than a 4% mortgage.” I mentioned assumptions above in the context of statistical models; assumptions also come into play in experimental design. My sense is that these assumptions are usually underappreciated, or perhaps even ignored. The danger, of course, is that incorrect assumptions, statistical or experimental, can invalidate the results and conclusions of any research. These assumptions are often difficult to verify, which we might be able to cope with if we knew what they were.
Unfortunately, they are not part of the standard scientific reporting paradigm. This recent article in PLoS Computational Biology sheds some light on the issue of reporting experimental assumptions. Again, by bringing the issue to light there is hope that we can begin to change our science reporting procedures to incorporate some discussion of assumptions.

I find it reassuring that these discussions about accurate analysis and reporting of scientific research are surfacing. Opening up communication about these critical issues will greatly enhance our ability to navigate through the ocean of biomedical studies available to us.