A Year of Reproducibility Initiatives: The Replication Revolution Forges Ahead
Adhering faithfully to the scientific method is at the very heart of psychological inquiry. It requires scientists to be passionately dispassionate, to be intensely interested in scientific questions but not wedded to the answers. It asks that scientists not personally identify with their past work or theories — even those that bear their names — so that science as a whole can inch ever closer to illuminating elusive truths.
Such compliance isn’t easy. But those who champion the so-called replication revolution in psychological science believe that it is possible — with the right structural reforms and personal incentives.
Emerging leaders in psychological reproducibility came together at the 2014 APS Annual Convention in San Francisco to discuss their current efforts to enact this cultural shift toward a more open and ultimately more reliable way of conducting scientific research.
Barbara A. Spellman, a professor of law at the University of Virginia and Editor of Perspectives on Psychological Science, has helped facilitate what has been called a replication revolution. The word “revolution” in a scientific context has paradigm-shifting connotations, but Spellman believes that it has developed as more of a political revolution, with emerging technological advances enabling the scientific community to finally address biases and methodological problems that have been around for years.
Spellman addressed some common criticisms of replication, from feelings of persecution and bullying to more hyperbolic charges of McCarthyism, which reveal both the personal terms in which scientists engage with their research and the perceptions that replication revolutionaries must challenge and overcome. A next step in “the evolution of the revolution,” as Spellman calls it, is therefore to change procedures so that they incentivize replications and reliable, generalizable results.
All the participants agreed that these changes are desperately needed in the current academic climate, which puts career success and scientific truth at odds. As APS Fellow Brian A. Nosek, University of Virginia associate professor of psychology, put it, “The incentives for my success as an acting scientist are focused on me getting it published, not on me getting it right.”
This thinking elevates the value of a study’s results over the quality of the science itself, said Chris Chambers, a senior research fellow in cognitive neuroscience at Cardiff University’s school of psychology in the United Kingdom. The focus on findings rather than methodology inexorably leads to problems such as publication bias, p-hacking, “HARKing” (Hypothesizing After the Results are Known, or presenting a post hoc hypothesis as if it were an a priori one), and other plagues of the publish-or-perish milieu. Chambers was unequivocal in his assessment of the consequences of this outlook.
“The minute we judge the quality of the science and of the individual scientists based on the results they produce,” he said, “we condemn ourselves to becoming a soft science, and we self-sabotage our ability to discover truth.”
To help shift scientists’ focus toward the quality of the questions asked and the methods used to generate answers, Chambers began editing Registered Reports for the journal Cortex in 2013, and several other journals are now publishing papers in a similar format. These include Perspectives on Psychological Science, which introduced its Registered Replication Report initiative in 2013.
In this new type of article, researchers lay out their hypotheses, experiments, and analyses before collecting any data. This research plan, including full Introduction and Methods sections, is then subject to peer review much like the traditional review of final manuscripts, with external reviewers determining whether the proposed study is well designed and adequately powered. If the review is positive, the authors receive an in-principle acceptance from the journal, virtually guaranteeing publication of the final manuscript regardless of the outcome of the study. After the authors collect the data, run the analyses, and write up the Results and Discussion sections, the manuscript undergoes a second round of peer review. Barring any problematic deviations from the protocol, it is then published, including any unsupported hypotheses, p values above 0.05, and other results not traditionally considered novel or impactful.
A renewed focus on process over results will not only address biases in publishing, it will also enable psychological science as a field to address serious methodological flaws that permeate today’s research environment. One of the most pervasive of these problems is the prevalence of statistically significant findings from underpowered studies, said Jelte M. Wicherts, an associate professor of methodology and statistics at Tilburg University in the Netherlands. A staggering 90% of the findings published in psychology journals support the authors’ hypotheses, yet effect sizes in psychological studies are often too small to be detected so reliably with the small samples these studies frequently use.
Wicherts blames poorly designed statistical analyses for this phenomenon: Fewer than 15% of articles report a formal power analysis, and surveys have shown that the majority of psychological scientists have employed one or more questionable research practices (QRPs) in pursuit of significant results. These include sequential testing and running multiple small studies instead of a single larger one, practices that increase the likelihood of obtaining at least one significant finding. Wicherts offered suggestions for redressing these problems for both individual researchers, who need to design higher-powered experiments, and journals, which need to improve reporting standards and take nonsignificant results seriously.
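To make the power problem concrete, here is a minimal sketch of the kind of formal power analysis Wicherts recommends, written in Python with the statsmodels package; the effect size (Cohen’s d = 0.3) and sample size (20 per group) are illustrative assumptions, not figures reported at the session.

```python
# Illustrative power analysis for a two-sample t test (assumed numbers,
# not data from the panel): how often would a study with 20 participants
# per group detect a small-to-medium effect (Cohen's d = 0.3), and how
# large would each group need to be to reach the conventional 80% power?
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power achieved with n = 20 per group at alpha = .05 (two-sided).
achieved_power = analysis.power(effect_size=0.3, nobs1=20, alpha=0.05)

# Sample size per group required to reach 80% power for the same effect.
required_n = analysis.solve_power(effect_size=0.3, power=0.80, alpha=0.05)

print(f"Power with n = 20 per group: {achieved_power:.2f}")   # roughly 0.15
print(f"n per group needed for 80% power: {required_n:.0f}")  # roughly 175
```

Under these illustrative assumptions, a 20-person-per-group study would detect the effect only about 15% of the time — exactly the mismatch between hypotheses and sample sizes that Wicherts described.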
An organic way to reduce biases and other QRPs, according to University of Amsterdam psychology professor Eric-Jan Wagenmakers, is through adversarial collaborations. In this approach — also advocated by APS William James Fellow and Nobel laureate Daniel Kahneman — a researcher who is unable to replicate a finding reaches out to the original authors or proponents, and the two groups work together on a plan for data collection, analyses, and rules for determining a “winner.” Once the terms are agreed upon and an impartial referee has been selected to resolve any disputes between the groups, the two labs conduct the experiment and write up the results in a joint article, with each group and the referee writing their own discussion sections.
Replication experiments conducted in this way eliminate hindsight bias, confirmation bias, and the file drawer effect, Wagenmakers asserted. And “it avoids the post hoc critique of the proponent that the skeptic didn’t do the experiment right,” a common sticking point in many direct replication projects, he said.
Such collaborations can address another effect of valuing publishable findings over accuracy: a sense that abiding by the guiding principles of good science puts individual scientists at a disadvantage. Nosek discussed a 2007 survey in which Melissa Anderson of the University of Minnesota and colleagues found that individual scientists largely endorse accepted behavioral norms in scientific research, such as communal sharing of data and methodologies, organized skepticism — even of one’s own prior work — and a primary focus on quality over quantity of research. When asked about their actual behavior, however, respondents reported falling short of these ideals, and they rated other scientists’ behavior as even worse. This suggests that a scientist who perceives the people around her behaving in ways that benefit their careers rather than science as a whole will view complying with the scientific method as “effectively, in [her] perception, committing career suicide,” said Nosek.
A key to changing the culture, he said, is to change the belief among scientists that no one else is playing by the rules. One way to do this is through so-called nudge incentives, including awarding badges in recognition of authors’ open science practices, as Psychological Science began doing for articles accepted in 2014. To receive a badge, authors must share their data, materials, and/or preregistered research plan on a publicly available repository. Nosek and his colleagues at the Center for Open Science have launched one such repository, the Open Science Framework (OSF), designed to encourage an open exchange of data and ideas.
Above all, the panelists expressed hope that these kinds of incentives, technological advances such as the OSF, publishing changes such as preregistered studies, and more sophisticated analytical methods will foster a culture that defines career success differently, one in which achievement is measured not by the number of publications but by the trustworthiness of the findings.
“I think we’re all just concerned with doing better science as a field,” said Spellman, “and we’re working out the best ways to make that happen.”