The Scientific Method and Replicability

Overview

Reproducibility and replicability are frequently confused for one another, and for good reason. Some people consider the terms equivalent, some consider them different, and some define them in opposite ways. In the 2018 opinion paper "Reproducibility vs. Replicability: A Brief History of a Confused Terminology," published in Frontiers in Neuroinformatics, the author discusses the history and previously stated definitions of reproducibility and replicability. The article does an excellent job of describing the nuance in what many might consider 'basic' definitions in science.

Diagram illustrating reproducibility: three separate groups (A, B, and C) each use the same dataset (Dataset A) and the same workflow or code (Workflow/Code A) to generate results. Each group produces a result, and the diagram asks whether the results agree, emphasizing that reproducibility means obtaining consistent results when the same data and methods are used.

Our Definition

For our purposes (in this collection of modules), we define reproducibility simply as the ability of an individual or group to reproduce the results of some computational work with the same data and code/software as the original work. In the narrowest version of this definition, the "data" and "code" are exactly the same. However, any informaticist will recognize that hardware availability, operating systems, and software versions may make it impossible to exactly reproduce a computing environment. Sometimes it is more difficult to revert to the originally used version of a piece of software than to use the newer, updated version three years later. Therefore, this definition is meant as a guideline, not as an exact standard to uphold. It is left intentionally broad to reflect the nuance in terminology described above. Lastly, while this definition does not explicitly identify how to measure reproducibility, metrics for computational reproducibility do exist.
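To make the narrow version of this definition concrete, here is a minimal Python sketch of one way a group might check it: run the same code on the same data twice, record a few environment details, and compare a fingerprint of the results. Everything named here (`environment_summary`, `result_fingerprint`, `analysis`, `dataset_a`, and the rounding tolerance) is a hypothetical stand-in for illustration, not a standard reproducibility metric.

```python
import hashlib
import json
import platform
import sys


def environment_summary():
    """Collect a few environment details that often differ between runs."""
    return {
        "python": sys.version.split()[0],
        "os": platform.system(),
        "machine": platform.machine(),
    }


def result_fingerprint(result, ndigits=6):
    """Hash a dict of numeric results after rounding.

    Rounding to `ndigits` is an assumed tolerance so that most tiny
    floating-point differences do not change the fingerprint.
    """
    rounded = {key: round(value, ndigits) for key, value in result.items()}
    payload = json.dumps(rounded, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def analysis(data):
    """Hypothetical stand-in for 'Workflow/Code A'."""
    return {"n": len(data), "mean": sum(data) / len(data)}


# Made-up values standing in for "Dataset A".
dataset_a = [2.0, 4.0, 6.0, 8.0]

original = result_fingerprint(analysis(dataset_a))
rerun = result_fingerprint(analysis(dataset_a))

print(environment_summary())
print("reproduced:", original == rerun)
```

Storing the environment summary alongside the fingerprint gives a later group somewhere to start looking when the fingerprints do not match. Rounding before hashing is only a rough tolerance; values that fall near a rounding boundary can still produce different fingerprints.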

We also define replicability simply as the ability of an individual or group to reproduce the results of some computational work with either the same data and different code/software, or different data and the same code/software, as the original work. In brief, the purpose of replicating an experiment is to explore whether the original results stand even with slight modification of the computational approach. If they do not, further exploration of the processes involved (both human and computer) will likely be needed to explain the differences in results.
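One of the two cases in this definition (different data, same code) can be sketched in Python as follows. The `workflow` function, the three small datasets, and the 5% relative tolerance are all assumptions made up for illustration; what counts as "agreement" in a real study depends on the field and the question being asked.

```python
import math


def workflow(data):
    """Hypothetical shared workflow: estimate the mean of a sample."""
    return sum(data) / len(data)


def results_agree(results, rel_tol=0.05):
    """Return True if every result is close to the first one.

    Closeness is judged with math.isclose using the relative tolerance
    `rel_tol`, which is an assumed threshold.
    """
    reference = results[0]
    return all(math.isclose(reference, r, rel_tol=rel_tol) for r in results[1:])


# Made-up measurements standing in for Dataset A, Dataset B, and Dataset C.
datasets = {
    "A": [9.8, 10.1, 10.3, 9.9],
    "B": [10.0, 10.2, 9.7, 10.4],
    "C": [10.1, 9.9, 10.0, 10.2],
}

# Each group applies the same workflow to its own dataset.
results = {name: workflow(data) for name, data in datasets.items()}

print(results)
print("replicated:", results_agree(list(results.values())))
```

If `results_agree` returned False, that would be the cue for the additional exploration described above, for example checking how each dataset was collected or whether each group ran the workflow in the same way.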

Diagram illustrating replicability: three separate groups (A, B, and C) each use different datasets (Dataset A, Dataset B, and Dataset C) but apply the same workflow or methods. Each group produces a result, and the diagram emphasizes comparing results across multiple experiments and conditions using the same methods to determine consistency.

Additional Reading

As always, one page on one website is not sufficient to cover this topic in depth. For further reading, please refer to the 2019 report by the National Academies: "National Academies of Sciences, Medicine, Policy, Global Affairs, Board on Research Data, Information, ... & Replicability in Science. (2019). Reproducibility and replicability in science. National Academies Press", available at https://www.nationalacademies.org/read/25303/chapter/1