Medical studies: Evidence you can trust?

Evidence-based Living is built around the idea that scientific study should guide our lives – in decisions we make for our families, in community initiatives, and of course in choosing medical treatments.

A new review this month in the Journal of Oncology raises important questions about the validity of medical studies. The report reviewed 164 trials of breast cancer treatments – including chemotherapy, radiation and surgery – conducted from 1995 to 2011.

It concluded that most of the studies were clouded by overemphasizing the benefits of the treatment or minimizing potential side effects.

For example, the authors reported on 92 trials that had a negative primary endpoint – which essentially means the treatment was not found to be effective for the main goal of the study. In 59 percent of those trials, a secondary endpoint – another goal – was used to suggest the experimental therapy was actually beneficial.

And only 32 percent of the studies reported severe or life-threatening side effects in the abstract – meaning medical professionals who are scanning the report might miss them. Studies that reported a positive primary endpoint – meaning the treatment was effective for the problem that researchers were targeting – were less likely to report serious side effects.

What does all of this mean?

Elaine Wethington, a medical sociologist at the College of Human Ecology, says the review reveals some important findings about medical studies.

“I would speculate that the findings are due to at least three processes,” she explained.

“First, trial results should be published even if the primary outcome findings are negative, but it can be difficult to find a journal that will publish negative findings,” she said. “As a result, there is a tendency to focus on other outcomes that are secondary in order to justify the work and effort.

“Second, presentation of findings can be influenced by a variety of conflicts of interest. There is a lot of published evidence – and controversy — that scientific data collection and analysis can be affected by the source of funding, private versus public.

“Third, this could also be explained as a problem in scientific peer review.  Reviewers and editors could insist that this type of bias in reporting be controlled,” Wethington said.

In short, she sees the publication of this review as an important step in improving the scientific review process.

Citizen scientists: The new research corps

More often than ever before, people from all walks of life –  from retired senior citizens to young families – are helping scientists collect data that support research projects. This movement of “citizen science” has flourished over the past decade as technology has advanced, allowing volunteers to share information with researchers quickly and accurately.

In fact, there are several interesting examples of “citizen science” here at Cornell University, including a survey of backyard birds and a project called Yardmap that encourages homeowners to map their yards so that researchers can better understand the habitat available to birds.

This month, a group of researchers from the United Kingdom has published a review that details exactly how “citizen science” is working, including summaries of projects across the globe, interviews with scientists who use these data, and a guide to best practices for conducting these types of projects. The review reached some interesting conclusions. Among them:

  • The motivation for citizen scientists varies greatly. Successful projects tend to take into account the interests and skill-sets of participants, and their expectations.
  • Getting feedback from volunteers is an important component of a successful project and is achieved through a wide variety of channels, including social media and face-to-face interactions.
  • Technologies such as GPS and smart phones have made it easier for citizens to share accurate data, but relying on these devices excludes those who don’t have access to them.

Cornell gerontologist Karl Pillemer is a proponent of “citizen science” for people in their 60s, 70s and 80s. He has conducted research that found that older adults who get involved in creating a sustainable society and conserving natural resources are not only helping the environment, they are also helping themselves.

“Research shows that citizen science activities provide a wonderful opportunity to achieve two goals at once: Adding to our knowledge about areas important to quality of life for people, while also providing opportunities for rewarding and meaningful activity,” he said. “And citizen science activities can be adapted for any life course stage, from elementary school students to retirees.”

In short, projects that use citizen volunteers to collect data are an important part of environmental research today, and understanding best practices helps these projects succeed.

What we know – and what we don’t – about Omega-3 fatty acids

Over the past four decades, there have been thousands of studies examining the health benefits of Omega-3 fatty acids – building blocks our bodies use to create cell membranes and maintain the connections between brain cells.

The medical community’s excitement over this nutrient began when observational studies of non-western diets – in Japan and among Eskimos in Greenland, for example – found significantly lower rates of heart disease and other chronic medical conditions.  (Humans can’t produce omega-3 fatty acids, so we must get them by eating fish, walnuts, flaxseed and green vegetables.)

Dozens of studies have identified these types of correlations. But earlier this year, a meta-analysis published in the Journal of the American Medical Association, which included 20 clinical trials involving nearly 70,000 people, concluded that omega-3 fatty acids did not prevent heart attacks, strokes or deaths from heart disease.

Proponents of omega-3s point out that the authors of the JAMA analysis used an especially strict standard to determine statistical significance. (Using the typical standard would have found a 9 percent reduction in cardiac deaths.)
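To see why the choice of threshold matters, here is a toy illustration in Python. (The z statistic below is invented for illustration, not the JAMA trial’s actual data.) The same result can clear the conventional p < 0.05 bar yet fail a stricter p < 0.01 bar:

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a z statistic, using the normal CDF."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical effect that sits 2.2 standard errors from "no effect"
p = two_sided_p(2.2)                  # roughly 0.028
significant_conventional = p < 0.05   # True: significant by the usual standard
significant_strict = p < 0.01         # False: not significant by a stricter one
```

The finding itself doesn’t change between the two lines; only the bar it is measured against does – which is exactly the dispute over the JAMA analysis.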

But other systematic reviews – like this one by the Cochrane Collaboration – found it unclear whether omega-3 supplements reduce the risk of cardiac deaths.

So, what’s the bottom line?  This is one case where the evidence is truly unclear. One challenge is that longitudinal diet studies are difficult to perform because there are so many variables in what people eat over long periods of time. And it can be tough to differentiate between omega-3s consumed as part of a diet and those taken as a supplement.  It is clear that foods like salmon, tuna and green vegetables are good for us – and including them in our diets is a step in the right direction. But we need more evidence to determine their exact effects, and to establish whether it’s worthwhile to take omega-3 supplements.

Slimming it down? New evidence on low-calorie diets

Over the past few years, you may have heard the buzz about the potential for a low-calorie diet to prolong life and prevent chronic medical conditions like heart disease and cancer.

While the concept of restricting calories has been around for decades, a longitudinal study of monkeys published in 2009 seemed to provide definitive evidence that eating less was good for you. The study by researchers at the University of Wisconsin found that a diet of moderate caloric restriction over 20 years lowered the incidence of aging-related deaths and reduced the incidence of diabetes, cancer, cardiovascular disease, and brain atrophy.

But last week, a new longitudinal study of a different group of monkeys raised questions about the idea of restricting calories to improve health. The study included 121 monkeys split into two groups. The experimental group was fed 30 percent fewer calories than the control group.

In the study published last week, which was sponsored by the National Institute on Aging, the monkeys on restricted diets did not live any longer than those with normal diets. Rates of cancer and heart disease were the same for monkeys on restricted diets and normal diets. While some groups of monkeys on restricted diets had lower levels of cholesterol, blood sugar and triglycerides, they still did not live longer than the monkeys who ate normally.

The study is interesting because it challenges the notion that restricting calories improves health. But it’s also a prime example of why it’s important to collect data from more than one study.

“This shows the importance of replication in science,” Steven Austad, interim director of the Barshop Institute for Longevity and Aging Studies at the University of Texas Health Science Center, told the New York Times. Austad, who was not involved in either study, also explained that the first study was not as conclusive as portrayed in the media.

The take home message: It’s important to collect evidence from multiple studies before drawing conclusions, even when the data seems extremely convincing.

A roadmap: How to use research to help people

The idea of translational research initially sprang out of the field of medicine, where doctors and scientists have teamed up to move laboratory discoveries more rapidly into clinical settings to help patients improve their health and recover from ailments.

Since its beginnings several decades ago, researchers working in other disciplines have latched onto the idea of translation. Now a new book offers models for social and behavioral scientists who want to transfer their findings into real-world settings.

The book – “Research for the Public Good: Applying the Methods of Translational Research to Improve Human Health and Well-Being” – includes chapters by experts in the fields of psychology, child development, public policy, sociology, gerontology, geriatrics and economics that offer road maps for translating research into policies and programs that improve the well-being of individuals and communities. It is co-edited by Cornell professors Elaine Wethington and Rachel Dunifon.

The book grew out of the second Biennial Urie Bronfenbrenner Conference on translational research, held at Cornell and attended by leading experts in the social sciences and medical fields.

“Translational research has gained prominence in biomedical research, where there’s an emphasis on speeding lab findings into practice,” Wethington told the Cornell Chronicle. “It also goes back to the work of Urie Bronfenbrenner and his colleagues, however, who were ahead of their time with an ecological approach to human development that brought together research, policy and practice. This book defines the term in that context and provides practical insights for doing translational research.”

Graduate students and early-career scientists unfamiliar with translational research methods should find the book valuable, Wethington said. “There is a surge of interest in the field right now, so the book should be a great resource,” she said.

Missing data: The Achilles heel of systematic reviews

If you’re a regular reader of EBL, you know we’re huge fans of systematic reviews – studies in which researchers use sophisticated methods to bring together and evaluate the dozens, hundreds, or even thousands of articles on a topic.

We value these analyses because they collect all of the information available and then look at why and how each study differs. By looking at so many studies, researchers can make general conclusions, even though participants and study settings might be different.
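Under the hood, many systematic reviews combine study results with inverse-variance weighting: bigger, more precise studies count for more. Here is a minimal sketch of that arithmetic in Python (the effect sizes and standard errors are invented for illustration, not taken from any review discussed here):

```python
def pool_fixed_effect(estimates, std_errors):
    """Fixed-effect inverse-variance pooling: each study is weighted by
    1/SE^2, so studies with smaller standard errors get more weight."""
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se

# Hypothetical: three studies estimating the same treatment effect
est, se = pool_fixed_effect([0.30, 0.10, 0.25], [0.10, 0.05, 0.20])
```

Notice that the pooled estimate lands closest to the middle study’s value – its standard error is smallest, so it dominates the weighted average. That is also why a single missing large trial can move a review’s conclusion.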

So we took a great interest this week in a series of studies in the British Medical Journal making the case that many medical studies are never published, and are therefore missing from systematic reviews and from the decision-making of doctors and patients.

One of the studies found that fewer than half of the clinical trials funded by the National Institutes of Health from 2005 to 2008 were published in peer-reviewed journals within 30 months of study completion, and only 68 percent were published at all.

Another examined trials registered at the federal web site during 2009. Of the 738 studies registered and subject to mandatory reporting guidelines (per the rules of the U.S. Food and Drug Administration), only 22 percent reported results within one year.  (It’s interesting to note that trials of medicines in the later stages of development and those funded by the drug industry were more likely to have results reported.)

A third study re-analyzed 41 systematic reviews of nine different medicines, this time including unpublished clinical trial data from the FDA in each analysis.  For 19 of the systematic reviews, the addition of unpublished data led to the conclusion that the drug was not as effective as originally shown. For 19 other reviews, the additional data led to the conclusion that the drug was more effective than originally shown.

Dr. Harlan Krumholz, a cardiologist at Yale and an internationally respected expert in outcomes research, summarized the issue in his Forbes magazine blog, including some of the reasons that data go unreported. (Among them: researchers may not be happy with the results or may shift focus to a new study. And medical journals may not be receptive to negative results.)

Whatever the reasons, the take-home message seems to be that researchers and publishers need to do a better job getting all of the information out in the public domain so that doctors and patients can truly make informed decisions.

A rise in food allergies: Fact or fiction?

I recently attended a children’s holiday party that ended with a group of parents discussing the treats they brought to share. One parent lamented that she could not bring her family’s favorite cookies (which contain peanut butter) for fear a child at the party was allergic to peanuts. The discussion eventually arrived at the question, are more children really suffering from allergies to food items like peanuts, dairy products and wheat?

Everyone at the party had an opinion, but no one quite knew the answer for sure.  Of course, I hurried home to do some research.

I wasn’t able to find a clear conclusion because the evidence on food allergies is limited. Two separate, large systematic reviews published in The Journal of Allergy and Clinical Immunology and the Journal of the American Medical Association found inconclusive evidence about the prevalence of food allergies, mainly because there are no uniform criteria for diagnosing and tracking food allergies.

The review in The Journal of Allergy and Clinical Immunology included 36 studies and a total of more than 250,000 children and adults. Only six studies included food challenge tests – the gold standard for allergy testing, in which a patient is served a suspected allergen without knowing it.  More importantly, the analysis found very little uniformity in study methods, making it difficult to compile data.

The review in the Journal of the American Medical Association came to the same conclusion – without a uniform method for studying food allergies, it’s difficult to draw conclusions about what’s going on. This review concluded that food allergy affects more than 1 percent of the population but less than 10 percent, and found it unclear if the prevalence of food allergies is increasing. The analysis also found that a common diagnostic process for food allergies called the elimination diet – where patients eliminate suspected allergens from their meals – has rarely been studied.

So, the jury is still out. But the good news is that the federal government has recognized the critical need for more research in this area and provided a steady stream of funding to the National Institute of Allergy and Infectious Diseases to address these questions. Their first step was to commission a review of the scientific and clinical literature that would eventually lead to the development of guidelines for diagnosing and managing food allergies.  (You can read it here.)

In the meantime, my son’s school will remain peanut-free. And I agree, it’s probably better that way.  If there are students who are seriously allergic to peanuts, it’s important to keep them safe. We’ll just have to make our own peanut butter cookies to enjoy at home.

Randomized, controlled designs: The “gold standard” for knowing what works

You’re having trouble sleeping one night, so you finally give up and turn on the TV. It’s 2 AM, so instead of actual programs, much of what you get are infomercials. As you flip through these slick “infotainment” shows, you hear enthusiastic claims about the effectiveness of diet pills, exercise equipment, and a multitude of other products.

You will soon see that almost every commercial uses case studies and testimony of individuals for whom the product has supposedly worked. “I lost 50 pounds,” exults a woman who looks like a swimsuit model. “I got ripped abs in 30 days,” crows a man who, well, also looks like a swimsuit model.

The problem is that this kind of case study and individual testimonial is essentially worthless for deciding if a product or program works, mainly because it’s very hard to disprove case study evidence. Look at the infomercials – the products seem to have worked for some people, but what about all the people for whom they failed? And how do we know that the people who lost weight, for example, wouldn’t have done so without buying the product?

So case studies and testimonials aren’t worth much because they don’t give us the kind of comparative information needed to rule out alternative explanations.

To the rescue come experiments using randomized, controlled designs (RCDs). Such experiments are rightly called the “gold standard” for knowing whether a treatment works. In an RCD, we create a test so that one explanation necessarily disconfirms the other. Think of it like a football game: both teams can’t win, and one eventually beats the other. It’s the same with science – our knowledge can only progress if one explanation can knock out another.

The main weapon in our search for truth is the control group design.  Using control groups, we test a product or program (called the “treatment”) against a group that doesn’t get whatever the treatment is.

Case studies simply don’t have the comparative information needed to prove that a particular treatment is better than another one, or better than doing nothing at all. And that’s important because of the “placebo effect.” It turns out that people tend to report that a treatment has helped them, whether or not any actual therapy is delivered. In medicine, placebo effects are very strong, and in some cases (like drugs for depression) placebos have occasionally been found to work more effectively than the drugs.

So what is a randomized, controlled design? There are four components of RCDs:

1. There is a treatment to be studied (like a program, a drug, or a medical procedure).

2. There is a control condition. Sometimes this is a group that doesn’t get any treatment at all. Often it is a group that gets some other treatment, of a different kind or in a smaller amount.

3. Now here’s the key point: The participants must be randomly assigned to treatment or control groups. It is critical that nobody – not the researchers, not the people in the experiment – can participate in the decision about which group people fall into. Some kind of randomization procedure is used to put people into groups – flipping a coin, using a computer, or some other method. This is the only way we can make sure that the people who get the intervention will be similar to those who do not.

4. There must be carefully defined outcome measures, and they must be measured before and after the treatment occurs.
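The heart of the design is step 3. Here is a minimal sketch of “using a computer” to randomize, written in Python (the participant names and the 50/50 split are hypothetical, just for illustration):

```python
import random

def randomize(participants, seed=None):
    """Randomly assign participants to a treatment or control group.

    Shuffling the whole list before splitting it means neither the
    researchers nor the participants have any say in who lands where.
    """
    rng = random.Random(seed)     # seed is optional; it makes the draw repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical example: six volunteers for a diet-program trial
groups = randomize(["Ana", "Ben", "Cai", "Dee", "Eli", "Fay"], seed=42)
```

With enough participants, chance alone balances the groups on everything – age, health, motivation – including factors nobody thought to measure, which is exactly what makes the comparison fair.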

Lots of the bogus claims you see on TV and elsewhere look only at people who used the product. Without the control group, however, we can’t know if the participants would have gotten better with no treatment at all, or with some other treatment.

Catherine Greeno, in an excellent article on this topic, sums up why we need to do RCDs if we want to understand if something really does or doesn’t work. She puts it this way:

  • We study a treatment compared to a control group because people may get better on their own.
  • We randomly assign to avoid the problem of giving worse off people the new treatment because we think they need it more.
  • We measure before and after the treatment so that we have measured change with certainty, instead of relying on impressions or memories.

So when you are wondering whether a therapy, treatment, exercise program, or product is likely to work, keep those three little words in mind: Randomized, Controlled Design!

New evidence on how the flu spreads

We’re deep into flu season in the U.S. The federal Centers for Disease Control, which tracks the flu virus nationally, found a significant increase in flu-related hospitalizations and deaths in January.

I should admit to you that I was one of those patients.  I came down with a cough mid-January that quickly turned into body aches and fever.  Being pregnant, I went into the doctor for a check-up.  They immediately sent me to the hospital where I tested positive for the flu.  I was treated with IV fluids, fever-reducers and an antiviral medicine that has been shown to reduce the duration of the flu. Thankfully, I was much better within a week.  While I was sick, I did find myself wondering more than once, “Where did I pick this virus up?”  (My husband and son never got sick.)

So when I came across a new study in the Proceedings of the National Academy of Sciences about how the flu spreads, I was personally intrigued.

Researchers studied an outbreak of the H1N1 flu virus at an elementary school in Pennsylvania in the spring of 2009. They collected data in real time while the epidemic was going on, a unique method for studying the flu. In total, they collected information on 370 students from 295 households. Nearly 35 percent of the students and 15 percent of their family members came down with flu.

The interesting aspect of the study is that researchers collected data on exactly who got sick and when, plus information from seating charts, activities and social networks at the school.  They then used statistical methods to trace the spread of the disease from one child to the next.

Their findings were surprising:

  • Sitting next to a classmate with the flu did not significantly increase the risk of infection, but the social networks and the structure of classes certainly did.
  • Transmission was 25 times as intensive among classmates as between children in different grades. Boys were more likely to catch the flu from other boys, and girls from other girls. From May 7 to 9, the illness spread mostly among boys. From May 10 to 13, it spread mostly among girls.
  • Administrators closed the school from May 14 to 18, but there was no indication that this slowed transmission.  
  • Only 1 in 5 adults caught the illness from their own children.

The researchers did point out some limitations of the study. Survey data was reported by the main caregiver in each household and focused on symptoms only. And the study did not take into account how the flu spread outside of the school environment, at gatherings like play dates or sports practice.

But the study does provide a unique snapshot of how a virus can spread, revealing definite patterns of what the researchers call “back-and-forth waves of transmission” between the school, the community, and households. It is one detailed piece in the complex puzzle of understanding how disease spreads.

How do I know if a program works? A “CAREful” approach

I was recently giving a talk on intervention research and I was asked: “How do I tell whether the evidence for a particular program is good or not?” I often talk with practitioners in various fields who are struggling with exactly what “evidence-based” means. They will read “evidence” about a program that relies only on whether participants liked it, or they will see an article in the media that recommends a treatment based on a single study. What should you look for when you are deciding: Is the evidence on this program good or not?

I came across a very helpful way of thinking about this issue in the work of educational psychologist Joel R. Levin. He developed the acronym “CAREful research,” which sums up what needs to be done when drawing conclusions from intervention research.

In Levin’s “CAREful” scheme, he identifies four basic components of sound intervention studies.

Comparison – choosing the right comparison group for the test of the intervention. Usually, there needs to be a group that does not receive the program being studied, so one can see if the program works relative to a group that does not receive it. A program description should explain how the comparison was done and why it is appropriate.

Again and again – The intervention program needs to be replicated across multiple studies; one positive finding isn’t enough.

Relationship – There has to be a relationship between the intervention and the outcome. That is, the intervention has to affect the outcome variables. That may seem simple, but it’s important; the program has to have a positive effect on important outcomes, or why should you use it?

Eliminate – The other possible explanations for an effect have to be eliminated, usually through random assignment to experimental and control groups and sound statistical analysis.

Levin and colleagues sum up the CAREful scheme:

“If an appropriate Comparison reveals Again and again evidence of a direct Relationship between an intervention and a specified outcome, while Eliminating all other competing explanations for the outcome, then the research yields scientifically convincing evidence of the intervention’s effectiveness.”

To see a good example of an evidence-based approach to intervention that reflects this kind of CAREful research, take a look at the PROSPER program, which takes a similar approach to youth development programs.

So when you are looking at intervention programs, “Be CAREful”: Applying these four criteria for good research can help you decide what works and what doesn’t.
