The scientific method, revisited

What really is the scientific method in research?

Maybe your high school biology teacher made you recite the scientific method. If you talk to most practicing researchers, they would probably say this method is “more a set of guidelines than an actual rule,” kind of like the pirate’s code. Publications disseminate research using the key terms of the scientific method: background, hypothesis, experimental design or methods, results and conclusion. However, this approach may neither reflect the complex reality of conducting research nor optimize productivity. This article evaluates the scientific method by investigating different ways to approach research, specifically the strong inference model.

Some may argue that without the backbone of the scientific method, the research process is convoluted and complex. But is complexity something to be avoided or embraced? Consider the ecologist, Eric Berlow, as he simplifies complexity in his TED talk. He supports the claim that complexity is not complicated but, rather, simplifies our understanding by encompassing the whole network. Sound like an oxymoron? This abstraction is distilled to a concrete example with a more direct interpretation of an infographic on U.S. strategy in Afghanistan (this infographic looks as overwhelming as food webs over the past ten years). Is the scientific method stripping science of its innate, remarkable complexity, or distilling it into a digestible brew?

Such complexity can arise through logic trees; imagine designing experiments using one! In 1964, John Platt presented the concept of strong inference, stating that the development of logic trees and alternative hypotheses is essential to the rapid advancement of scientific inquiry. Platt exposes the weakness of the classical scientific method’s reliance on a single hypothesis. Actively pursuing an answer to one specific hypothesis may introduce confirmation bias at each level of the scientific method, including background research on the question, the choice of experimental approach, data collection, and interpretation. Blind and double-blind experiments attempt to control for this, but if the methodology itself is biased, can the resulting data be free of bias? Platt suggests, with anecdotal evidence, that greater scientific progress is made through inductive reasoning (metaphorically, casting a fishing net into an ocean of unknowns) and conducting thoughtful experiments to eliminate alternative hypotheses and produce subsequent logical ones, instead of simply going to one fishing spot at a time through deductive reasoning, as outlined by the classical scientific method.

Thirty-seven years later, William O’Donohue and Jeffrey Buchanan pointed out the weaknesses of strong inference, with the aim of promoting normative usage of the scientific method. Below are the weaknesses of Platt’s essay quoted from their abstract, followed by my responses to them:

[Platt’s essay has:]

1) No demonstration of the central historical claim, that is, that the more successful sciences actually have utilized this method more frequently than the less successful sciences;

2) Poor historiography, in that more faithful explications of some of the historical case studies Platt cites as canonical examples of SI fail to support that SI was actually used in any of these cases;

3) Failure to identify other cases of important scientific progress that did not use SI and instead used distinct scientific methods;

Unless Platt did more sociological research by asking labs to categorize their research approach and success rates (and control for bias), the task of investigating this central historical claim with statistical significance is daunting and fraught with obstacles. This is similar to how null results are not incentivized the same way as statistically significant results, for it takes a shockingly high number of trials or samples to make even a modestly convincing case for the absence of a phenomenon. Karl Popper illustrates this dilemma in falsifiability with the example, “All swans are white.” Seeing many swans, and noting they are all white, is not evidence enough that all swans are white. Platt does not argue that all successful labs use strong inference, though his evidence for any labs seems mostly anecdotal.

4) Neglect of the importance and implications of background knowledge used in problem formulation;

Background knowledge seems intuitively necessary to form alternative hypotheses, so even if Platt included this observation, would it have strengthened his article?

5) The impossibility of enumerating a complete set of alternative hypotheses;

It is impossible to enumerate them all, so do not attempt to enumerate any alternative hypotheses–I am not convinced. Through the process of contemplating multiple outcomes, one may gain a better grasp of the subject matter and avoid the biases that form from attachment to a single hypothesis, as mentioned above. The only downside is that the time invested in these thought exercises is not directly incentivized (i.e., not to the extent that the actual results of an experiment are); however, I personally value publications that point out the limits of their studies and propose experiments to expand these boundaries (except…would reviewers of your article ask you why you did not do these proposed experiments?).

6) No acknowledgement of the Quine-Duhem thesis indicating that “critical” experiments are actually logically inconclusive;

The Quine-Duhem thesis emphasizes the importance of context in scientific investigations. Context is established through background research, available tools, and surrounding biases (e.g., how are researchers in your proximity shaping your opinions about the field?). Consider the evaluation by O’Donohue and Buchanan regarding Quine’s perspective (based on deductions from the history of science):

“While there may not be a clear choice of which belief to reject in the face of counter-evidence or an unexpected result…scientists assume, nevertheless, there are better or worse candidates for rejection and that certain pragmatic considerations can aid rational belief revision. Scientists generally will want to revise as few beliefs as possible, which will lead them to prefer those beliefs with the fewest implications and the least amount of support in their favor.”

Does this very argument suffer from the weaknesses that O’Donohue and Buchanan attribute to Platt’s “strong inference” model (i.e., the first three statements quoted previously)? What empirical evidence supports what they claim “scientists assume”?

7) The assumption that there is one scientific method;

Excellent point. There is indeed a classical scientific method, but its interpreters work in different intellectual and technological contexts, which again shows that the method is “more like guidelines.”

8) Significant ambiguity regarding exactly how each of the four steps of SI is to be faithfully implemented.

Platt presented the limitations imposed by the scientific method and his four steps for conducting research following the strong inference model*, and I wonder whether any amount of precise descriptive prose could enable someone to faithfully implement these steps. The mere suggestion is open to interpretation. Scientists can continue to hypothesize how to successfully hypothesize, and then, if their process is not successful**, develop alternative hypotheses.
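To make the ambiguity concrete, here is one possible reading of the four steps (quoted in the first footnote below) as a simple elimination loop. This is only a toy sketch under strong assumptions, not Platt’s or O’Donohue and Buchanan’s formulation: the hypotheses, experiments, predicted outcomes, and the function names `most_discriminating` and `strong_inference` are all hypothetical, and the sketch assumes every hypothesis makes a crisp prediction for every experiment and that results come back “clean.”

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical toy data: each hypothesis predicts an outcome for each experiment.
Predictions = Dict[str, Dict[str, str]]


def most_discriminating(remaining: Set[str], predictions: Predictions,
                        experiments: List[str]) -> Optional[str]:
    """Step 2: choose the experiment whose predicted outcomes differ most
    across the hypotheses still standing (None if nothing separates them)."""
    best, best_count = None, 1
    for exp in experiments:
        outcomes = {predictions[h][exp] for h in remaining}
        if len(outcomes) > best_count:
            best, best_count = exp, len(outcomes)
    return best


def strong_inference(predictions: Predictions,
                     observe: Callable[[str], str]
                     ) -> Tuple[Set[str], List[Tuple[str, str]]]:
    remaining = set(predictions)                       # Step 1: alternatives
    experiments = sorted({e for p in predictions.values() for e in p})
    log: List[Tuple[str, str]] = []
    while len(remaining) > 1:                          # Step 4: recycle
        exp = most_discriminating(remaining, predictions, experiments)
        if exp is None:                                # no crucial experiment left
            break
        result = observe(exp)                          # Step 3: run it
        # Exclude every hypothesis whose prediction the result contradicts.
        remaining = {h for h in remaining if predictions[h][exp] == result}
        experiments.remove(exp)
        log.append((exp, result))
    return remaining, log


# Made-up example: three hypotheses, two experiments.
predictions = {
    "H1": {"A": "up", "B": "up"},
    "H2": {"A": "up", "B": "down"},
    "H3": {"A": "down", "B": "down"},
}
truth = {"A": "up", "B": "down"}  # stand-in for what nature would show
survivors, log = strong_inference(predictions, lambda exp: truth[exp])
print(survivors, log)
```

Even this toy version has to make choices the four steps leave open, such as what counts as the “most crucial” experiment and what to do when no experiment separates the remaining hypotheses, which is roughly the ambiguity the eighth criticism describes.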

In conclusion, a universal approach to scientific research is important when determining what are (or are not) acceptable practices. However, it may be more effective to consider which research approach yields the most fruitful and meaningful results in your particular field. And besides, paradigm shifts do not happen overnight. Reflect on how you or your affiliates conduct your research, how you interpret the classical scientific method and the strong inference model, and whether these are the most effective approaches towards your end goal.


*summarised by O’Donohue and Buchanan as, “1) Devising alternative hypotheses, 2) Devising a crucial experiment (or several of them), with alternative possible outcomes, each of which will, as nearly as possible, exclude one or more of the hypotheses, 3) Carrying out the experiment so as to get “clean” results, and 4) Recycling the procedure, making sub-hypotheses or sequential hypotheses to refine the possibilities that remain.”

**“success” can be defined by current academic incentive structures (publications, job offers, awards, etc.) or by a personal sense of accomplishment, which is a bit more complicated to quantify


Check out Berkeley’s “Understanding Science: how science really works” website to see how the scientific method is explained and alternative processes are addressed informally (from which the image is borrowed!).


This article is inspired by some of the readings and discussions for the graduate seminar Psychology 210B, led by professors Linda Wilbrecht and Lance Kriegsfeld, which led (as most papers do) down a rabbit hole of related readings:

Ernst Mayr writes about Cause and Effect in Biology in 1961 (which is revisited in 2011 by Kevin Laland et al.), Paul Sherman describes The levels of analysis in 1988, and Scott MacDougall-Shackleton revisits these levels in 2011 with more current examples. A paper that may be familiar to you is John Platt’s paper on Strong Inference from 1964; opponents of the strong inference model wrote The Weaknesses of Strong Inference in 2001.




  1. Jake Lockley

    ALL evidence is anecdotal. Please see SOCIAL SCIENCE AS THEOLOGY by Neil Postman.

  2. Anonymous

    Dax Vivid
    Dax is a graduate student in the Integrative Biology department studying avian reproductive neuroendocrinology. When she’s not at the bench, she enjoys running, learning new languages, and figuring out ways to create a culture of wellness.

    Small spelling mistake in above paragraph: Bench? Do you mean beach?