When you’re baking a cake, you follow a recipe that uses specific ingredients, added in a particular order, mixed in a specific way, and baked for a certain time at an exact temperature. But what if you made the cake twice? Or three times? Will it be the same each time? Will your cake be “reproducible”? What if your baking powder is old? The cake might not rise as much as normal. Or what if you buy a different brand of flour? What if you’re baking at your sister’s house in Oregon at a higher elevation than normal? Do you change how long the cake bakes, or at what temperature? By how much? And what if you’re using your grandmother’s handwritten recipe for her famous spice cake? It’s filled with phrases such as a “pinch” of cinnamon or “about” 3 cups of flour. Do you think it will taste the same as when your grandmother makes it? And what if two people try to make the same cake? On the Great British Baking Show, one of the challenges each week has every contestant follow the same recipe with the same ingredients to make the same baked item. The results NEVER come out the same, because there is variability built into every step, even with the same instructions, equipment, and ingredients.
Science is sometimes a lot like baking. Instead of a recipe, scientists follow a protocol (or standard operating procedure – shortened to “SOP” because scientists like acronyms). We purchase or make ingredients, typically called “reagents.” Often a single reagent can be bought from many different companies, or made in the lab by you or by a technician, or maybe you were in a rush and borrowed some from someone down the hall. In biology, biological materials like cells or enzymes are often involved. These could be new or old. You could have tested and “validated” your cells or antibodies, or you could have relied on someone telling you that they are okay. Once you have all of your reagents pulled together, you do the experiment.
Experiments are funny little things. You follow your SOP, but maybe one day the 5-minute incubation turns into 10 because you were in the middle of answering an email. Maybe another day you’re in a rush to get to a seminar, so you skip a step. Or maybe you’re training a new undergraduate to do the experiment and you let them do a few steps on their own.
This variability is part of the reason why scientists repeat their experiments multiple times. Three, as you may expect, is often the magic number. These data are then presented (for example, in a grant or a paper) either as a representative experiment, where only one of the three or more experiments is shown, or as an average of the experiments with “error bars” that show how much the data differed between experiments. This type of careful presentation gives other researchers more confidence that the result is real, as opposed to something that happened just because the new grad student messed something up.
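For the curious, here is a minimal sketch of how an “average with error bars” might be computed from three repeats. The numbers are made up for illustration, and the helper name `summarize` is my own. Whether error bars show the standard deviation or the standard error varies from paper to paper, so the sketch computes both.

```python
import statistics

# Hypothetical measurements from three independent repeats of one experiment
# (made-up numbers, for illustration only).
replicates = [4.0, 5.0, 6.0]


def summarize(values):
    """Return the mean, sample standard deviation, and standard error."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)    # sample SD (n - 1 in the denominator)
    sem = sd / len(values) ** 0.5    # standard error of the mean
    return mean, sd, sem


mean, sd, sem = summarize(replicates)
print(f"mean = {mean:.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```

The standard deviation describes how spread out the individual repeats are, while the standard error describes how uncertain the average itself is. That is why a figure legend should say which one the error bars represent.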
The Reproducibility “Problem”
However, even with all this careful planning, there is a lot of chatter these days about the failure of scientists to reproduce experiments. One of the earliest papers on this topic came from researchers at Amgen, who found that they couldn’t reproduce 47 out of 53 studies (nearly 90%) from cancer research labs. This has led to a snowball of studies and reports of the failure to reproduce data in various fields, including biology and psychology. The most recent is a Nature article surveying 1,500 scientists about their ability to reproduce their own and others’ results in their own labs. Fifty-two percent of these scientists felt that there was a reproducibility “crisis,” and words like “bleak” and “discomfiting” were thrown around to express the severity of the problem.
So is there really a problem? Lots of papers have discussed this already, but I figured why not add my own opinion to the mix. In part, yes, there likely is a bit of a problem. Some of it stems from using reagents you aren’t sure of. For example, imagine that you think you’re studying prostate cancer with a cell line from a prostate cancer patient, but you’re actually using a super common cervical cancer cell line. It happens all the time! An effort to make publishers and grantors enforce cell line authentication and other types of reagent confirmation before experiments begin is gaining steam. Not a bad idea, and not too expensive.
Other efforts are also underway, such as the Reproducibility Project, where a third party can be “hired” to authenticate results. This is expensive and time-consuming, and one might wonder where the value lies in having someone else repeat your experiments. The value lies in having confidence in the result…but as we’ve discussed here already, the minute you move your experiment to another lab with new reagents and new people, you add more variability. If the experiment fails, how do you know whether it’s because the result was wrong or because someone else did it wrong?
This is where the issue lies, and I think it all comes back to the central goal of science and scientists. Scientists want to uncover what’s really happening in nature. Every experiment is done to test a hypothesis, and the results lead to more experiments, and on and on. Even if an experiment doesn’t give identical results each time or can’t be reproduced in another lab, the fundamental question is whether the biological hypothesis is correct. No matter what, scientists should always do multiple different kinds of experiments, plus follow-up experiments, to confirm or refute their hypothesis. This all assumes that scientists are ethical and follow the scientific method – as opposed to folks who publish fabricated or modified data just to get a paper published (but that’s a topic for a whole other post!!).
So I guess the question may not be whether or not an experiment is reproducible, but whether or not the hypothesis is true. And if scientists focus on THAT as opposed to reproducibility, per se, then I think science is moving forward in a productive direction!