Only 1 in 10 Medical Treatments Are Backed by High-Quality Evidence

As the pressure to 'publish or perish' rises, the quality of research has fallen—even as the volume has skyrocketed
September 10, 2020 Updated: September 10, 2020

When you visit your doctor, you might assume that the treatment they prescribe has solid evidence to back it up. But you’d be wrong. Only 1 in 10 medical treatments are supported by high-quality evidence, our latest research shows.

The analysis, published in the Journal of Clinical Epidemiology, included 154 Cochrane systematic reviews published between 2015 and 2019. Only 15 (9.9 percent) had high-quality evidence according to the gold-standard method for rating the quality of evidence.

That standard is called GRADE (grading of recommendations, assessment, development, and evaluation). Among these 15 treatments, only two had statistically significant results—meaning that the results were unlikely to have arisen due to random error—and were believed by the review authors to be useful in clinical practice. Using the same system, 37 percent of treatments had moderate-quality evidence, 31 percent had low-quality evidence, and 22 percent had very-low-quality evidence.

The GRADE system looks at things such as risk of bias. For example, studies that are “blinded” (in which patients don’t know whether they are getting the actual treatment or a placebo) offer higher-quality evidence than “unblinded” studies. Blinding is important because people who know what treatment they are getting can experience greater placebo effects than those who don’t.

Among other things, GRADE also considers whether the studies were imprecise because of differences in the way the treatment was used. An earlier review, conducted in 2016, found that 13.5 percent of systematic reviews (about 1 in 7) reported that treatments were supported by high-quality evidence. Lack of high-quality evidence, according to GRADE, means that future studies might overturn the results.

The 154 studies were chosen because they were updates of a previous review of 608 systematic reviews, conducted in 2016. This allowed us to check whether reviews that had been updated with new evidence had higher-quality evidence. They didn’t. In the 2016 study, 13.5 percent of reviews reported that treatments were supported by high-quality evidence, so there was a trend towards lower quality as more evidence was gathered.
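To make the comparison concrete, here is a minimal sketch of the arithmetic, using only the two percentages reported in this article. The function name `one_in_n` and the variable names are illustrative and do not come from the study.

```python
def one_in_n(percent: float) -> float:
    """Convert a percentage into the N of a '1 in N' statement."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    return 100.0 / percent


# Shares of reviews with high-quality evidence, as reported in the article.
high_quality_2016 = 13.5    # the 2016 review
high_quality_update = 9.9   # the updated reviews

for label, pct in [("2016 review", high_quality_2016),
                   ("updated reviews", high_quality_update)]:
    print(f"{label}: {pct}% is about 1 in {one_in_n(pct):.1f}")

# Difference in percentage points (negative means the share fell).
change = round(high_quality_update - high_quality_2016, 1)
print(f"change: {change} percentage points")
```

Rounded, the two figures correspond to the "1 in 7" and "1 in 10" statements used in the text, a drop of 3.6 percentage points.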

There were a few limitations to the study. First, the sample in the study may not have been representative, and other studies have found that more than 40 percent of medical treatments are likely to be effective. Also, the sample wasn’t large enough to check whether certain types of medical treatments (pharmacological, surgical, psychological) were better supported than others. It is also possible that the “gold standard” for ranking evidence (GRADE) is too strict.

Too Many Low-Quality Studies

Many poor-quality trials are being published, and our study merely reflected this. Because of the pressure to “publish or perish” to survive in academia, more and more studies are being done. In PubMed alone, a database of published medical papers, more than 12,000 new clinical trials are published every year. That’s more than 30 trials published every day. Systematic reviews were designed to synthesize these, but now there are too many of those, too: more than 2,000 per year published in PubMed alone.
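The daily rates follow from simple division. The sketch below treats the article's "more than" figures as lower bounds and assumes a 365-day year; the helper name `per_day` is illustrative.

```python
DAYS_PER_YEAR = 365  # assumption: ignore leap years


def per_day(per_year: int) -> float:
    """Average number of publications per day for a yearly total."""
    return per_year / DAYS_PER_YEAR


# Yearly lower bounds quoted in the article for PubMed.
trials_per_day = per_day(12_000)
reviews_per_day = per_day(2_000)

print(f"clinical trials: at least {trials_per_day:.1f} per day")
print(f"systematic reviews: at least {reviews_per_day:.1f} per day")
```

On these figures, PubMed alone adds roughly 33 new trials and more than 5 new systematic reviews each day.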

The evidence-based medicine movement has been banging a drum about the need to improve the quality of research for more than 30 years, but, paradoxically, there is no evidence that things have improved despite a proliferation of guidelines and guidance.

In 1994, Doug Altman, a professor of statistics in medicine at Oxford University, pleaded for less, but better, research. This would have been good, but the opposite has happened. Inevitably, the tsunami of trials published every year, combined with the need to publish in order to survive in academia, has led to a great deal of rubbish being published, and this hasn’t changed over time.

Poor-quality evidence is serious: Without good evidence, we simply can’t be sure that the treatments we use work.

GRADE System Too Harsh

A carpenter should only blame their tools as a last resort, so the excuse that GRADE doesn’t work should only be used cautiously. Yet it’s probably true that the GRADE system is too harsh for some contexts. For example, it is nearly impossible for any trial evaluating a particular exercise regime to be of high quality.

An exercise trial can’t be “blinded”: Anyone doing exercise will know they are in the exercise group, while those in the control group will know they aren’t doing exercise. Also, it’s hard to make large groups of people do exactly the same exercise, whereas it is easier to make everyone take the same pill. These inherent problems condemn exercise trials to being judged to be of lower quality, no matter how useful safe exercise is.

Also, our method was strict. Whereas the systematic reviews had many outcomes (each of which could be high quality), we focused on the primary outcomes. For example, the primary outcome in a review of painkillers would be a reduction in pain. Such a review might also measure a range of secondary outcomes, from anxiety reduction to patient satisfaction.

Focusing on the primary outcomes prevents spurious findings. If we look at many outcomes, there is a danger that one of them will be high quality just by chance. Still, to check whether this strictness affected our result, we also looked at any outcome, even if it wasn’t the primary outcome. We found that 1 in 5 treatments had high-quality evidence for any outcome.

On average, most of the medical treatments whose effectiveness has been tested in systematic reviews aren’t supported by high-quality evidence. We need less, but better, research to address uncertainties so that we can become more confident that the treatments we take work.

The author is the director of the Oxford Empathy Programme at the University of Oxford in the UK. This article was originally published on The Conversation.