Improving psychological science: further thoughts, reflections and ways forward

Clinical

In the last 5-10 years, much has changed in how science is conducted, and specifically in how psychological science is performed. One of the key drivers of change was the publication of the Open Science Collaboration's (2015) paper estimating the reproducibility of psychological science. This international collaborative effort set out to replicate 100 experimental and correlational studies from three leading psychology journals. The findings were alarming: fewer than 40% of the studies could be replicated. Numerous factors, known as questionable research practices or "QRPs", have been suggested to explain these low levels of replication, including low statistical power, hypothesizing after the results are known (HARKing), p-hacking, the "garden of forking paths" and failure to control for biases (Gelman & Loken, 2013; Kerr, 1998; Munafò et al., 2017; Norris & O'Connor, 2019). Following this so-called Replication Crisis, it has been argued that psychological science has been undergoing a renaissance (O'Connor, 2021). Part of its "rebirth" has involved the development of numerous new tools and approaches to help improve replication and reproducibility and to reduce use of questionable research practices. At Cogent Psychology, we are keen to support these efforts in order to help increase openness, integrity and reproducibility in scientific research and ultimately improve the robustness of our evidence base.
To this end, in addition to standard article formats, Cogent Psychology now offers two innovative and novel publishing formats: Registered Reports and Brief Replication Reports. Registered Reports differ from conventional empirical articles by performing part of the review process before the researchers collect and analyse data. Unlike the more conventional scientific process where a full report of empirical research is submitted for peer review, Registered Reports are considered as proposals for empirical research, which are evaluated on their merit prior to the data being collected (see, Chambers & Tzavella, 2022; https://osf.io/rr/). Once the Stage 1 Registered Report has been accepted and has received In Principle Acceptance, data collection can begin. Importantly, following successful data collection and analysis, the full Registered Report will be accepted for publication irrespective of the significance of your findings. Crucially, it is hoped this will help reduce publication bias that favours statistically significant effects. Cogent Psychology also welcomes Brief Replication Reports. The main purpose of this format is to help facilitate and simplify the publication of replication studies whereby researchers can repeat research or present similar results to previously published research with the aim of reinforcing previous studies to determine their validity, elaborating on earlier findings, developing academic knowledge and directing future research (see Instructions to Authors for more details).
Cogent Psychology also encourages preregistration of all types of empirical studies (e.g., observational studies, randomised controlled trials and experimental studies), Brief Replication Studies, systematic reviews and meta-analyses. Preregistration of clinical trials, behaviour change interventions and systematic reviews and meta-analyses is commonplace on repositories (such as https://clinicaltrials.gov/; https://www.isrctn.com/; https://www.crd.york.ac.uk/PROSPERO/). However, preregistration of other observational and experimental studies is less common. Preregistration of study plans before conducting a study has been identified as an important tool to help increase the transparency of science and to improve the robustness of psychological research findings (Bosnjak et al., 2022). To this end, in addition to other existing options, a new template for the preregistration of quantitative empirical studies in psychology, known as the Psychological Research Preregistration-Quantitative (PRP-QUANT) Template, has recently been introduced (https://doi.org/10.23668/psycharchives.4584; for other helpful open research primers see https://www.ukrn.org/primers/).
As outlined above, there have been many excellent developments in improving how psychological science is conducted, many of which are being adopted by Cogent Psychology. However, there are a number of other important issues that psychological researchers should also consider as we continue to improve psychological science and related disciplines. This editorial turns to some of these next.

Need for greater transparency and openness
Whilst the estimated replication rate for cognitive studies in the seminal study of the Open Science Collaboration (2015) was better than the average rate across all studies (50% vs. 36%), there is clearly much room for improvement. At Cogent Psychology, we are ideally situated to make a considerable impact on improving attempts at replication, reproducibility, and the uptake of open research practices more generally. For example, although Cogent offers Brief Replication Reports that communicate attempts to replicate an already published finding, submissions to Cogent that report new experimental cognitive findings can be strengthened by including a direct and/or conceptual replication (Brandt et al., 2014) of the finding within the same submission. Such presentation has replication "built-in" by design, and as such helps provide the field with reassurance as to the robustness of new effects reported. Although not a prerequisite for acceptance, submissions that provide such reassurance will certainly be viewed as a priority for publication.
More broadly, experimental psychologists also have an important opportunity to maximise the openness of their research, including by sharing their experimental materials, their data, and their analysis code. Although sharing of data and analysis code is becoming common, the sharing of experimental materials is less so. As many methods in the field of experimental cognitive psychology are digital (e.g., digital stimuli and computerised experiment scripts) the barrier to openly sharing all of the code and stimuli associated with our experimental work is low. As such, we strongly recommend that all submissions make their experimental methods openly available where possible.
Beyond the openness and reproducibility of experimental findings, Cogent Psychology also places strong emphasis on the openness and reproducibility of theoretical work. For example, computational modelling significantly aids rigorous theory building in cognitive psychology (e.g., Guest & Martin, 2021; Oberauer & Lewandowsky, 2019), but it also allows for clearer theoretical communication between researchers (Farrell & Lewandowsky, 2010). In contrast to verbal theories, theories expressed computationally can be communicated to other researchers by sharing the computer code. It is therefore important that we make our code openly available and take steps to ensure our models are fully reproducible. Cogent Psychology also welcomes submissions reporting studies that assess how the many researcher degrees of freedom (Simmons et al., 2011) inherent in the modelling process affect the inferences made (see, for example, Dutilh et al., 2019). We also welcome tutorial papers that help lower the barrier to entry for colleagues wishing to implement computational modelling in their own research programmes.

Single-case studies in the context of replication and reproducibility
Many of the sections at Cogent Psychology are devoted to theoretical, experimental, and applied contributions that advance the understanding of cognitive and behavioural impairments in neurological conditions, their recovery and rehabilitation. Taken at face value, it may be assumed that an area such as neuropsychology has escaped the Reproducibility Crisis as it often deals with large effect sizes. However, replication problems might differ depending on the clinical disorder investigated and sample sizes can vary widely in neuropsychology.
Single-case studies (N = 1) are sometimes the only way to study rare neurological conditions, but replication attempts are rare (e.g., Rossetti et al., 2017; Rossit et al., 2018). In addition, in single-case studies it is hard to determine whether findings can generalise to other cases. Therefore, replication is crucial to establish the reliability and generalisability of neuropsychological findings, and at Cogent we welcome the submission of such replication attempts.
Data sharing holds significant promise to address the challenges of small samples as it allows testing the reliability and generalisability of findings across neurological cases, research groups, countries, languages, and cultures. Larger group studies can reveal important patterns of more prominent neurological disorders, but there is also an important need to understand how group data can inform us about the individual, both in terms of symptom presentation over time and intervention efficacy. Directly measuring within-subject variability in large cohort studies is critical to determine which failures to replicate are driven by a lack of single-subject analysis. For example, in neuropsychological rehabilitation, reproducibility is at least partially linked to how well group-level data represent individual responses to treatment.
A paradigm shift is needed in neuropsychology, focusing on the adoption of open research to accelerate the field and bring researchers and clinicians closer to important advancements in assessment, diagnosis, and interventions for people with neurological conditions and their families. This shift would help determine whether neuropsychological findings are robust and should be implemented in clinical practice. Moreover, as outlined earlier, open materials, code, and data can provide research teams with access to the methods and outcomes of all studies, which in turn will facilitate replication, the combination of multiple datasets and meta-analyses, further strengthening findings and their translation into clinical practice. In addition, open data can ultimately facilitate the investigation of population-level, sample-level and single-person-level effect sizes and even provide the foundations for testing a range of hypotheses (including the null) within a Bayesian framework.
At Cogent Psychology we are excited to fully support the open research paradigm shift in neuropsychology. We encourage authors to consider study preregistration and replication, and we welcome the submission of new studies as Registered Reports or Brief Replication Reports. In a field of psychology with such direct ramifications for the care of neurological patients, the adoption of open, reproducible, and robust research practices is too important to be delayed.

Large-scale datasets and secondary data analyses: opportunities and challenges for open research
The availability of survey data from national and international studies and from researchers who have posted their data to trusted repositories presents both opportunities and challenges for open research. Combined with open analysis code, open data allows the primary findings of major surveys to be reproduced and the findings of published studies to be verified. Ensuring results are computationally reproducible is crucial for the integrity of psychological research and journal editors have begun to call for stronger practice in this area (Aczel et al., 2020; Bauer, 2022). This is because providing open data and code allows journal reviewers and independent researchers to retrace how scientific findings are reached and better understand the potential role of researcher degrees of freedom (e.g., the choice of statistical tests, variable coding, and exclusion criteria). By exposing research findings to stronger collective scrutiny, open data and code should increase confidence in the findings of psychological science.
Another major benefit of open data is that independent researchers can test entirely new ideas using pre-existing data. Secondary analysis of large-scale national and international surveys is already commonplace and has helped increase the utilisation and impact of publicly funded research. Efforts to improve data transparency (e.g., the Transparency and Openness Promotion guidelines; Nosek et al., 2015) and open data mandates from funding bodies are set to further accelerate the availability of research data. While the potential benefits of open data are vast, the proliferation of easily accessible secondary data presents significant challenges. Most critically, if researchers access data and test a range of hypotheses in different ways without openly declaring this practice, this will result in a high rate of false-positive findings (Simmons et al., 2011).
An array of approaches to reduce the rate of false positives arising from analyses of secondary data has been proposed and these practices are welcomed across our sections. First, multiverse analysis or specification-curve analysis has been proposed as a way to understand the impact of flexibility in analysis on study estimates (Simonsohn, Simmons, & Nelson, 2020). Multiverse analysis involves testing and presenting all plausible statistical models and can be implemented using independently developed packages in R, Stata, and Python (e.g., http://urisohn.com/specification-curve/). Multiverse analysis can reduce bias by making explicit the impact of using different specifications or examining different outcomes within a given study. Similarly, "outcome-wide" designs, where all relevant available outcomes included in a dataset are examined, have been proposed as a way to reduce the practice of cherry-picking outcomes related to the exposure of interest when examining secondary data. This approach allows the overall link between a predictor and a range of outcomes to be estimated (VanderWeele, Mathur, & Chen, 2020). Taken together, these approaches have the potential to substantially reduce the number of false-positive findings arising from analysis of secondary data.
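As a rough illustration of the multiverse idea, the sketch below estimates the same regression slope under every combination of two analytic choices (an exclusion rule and an outcome transformation) and lists the estimates in order, as a specification curve would. Everything in it is an assumption made for illustration: the data are simulated, the variable names (`x`, `y`, `attentive`) and the helper functions `slope` and `multiverse` are hypothetical, and it does not use or reproduce any of the packages mentioned above.

```python
import itertools
import math
import random
import statistics


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def multiverse(rows, exclusions, transforms):
    """Estimate the slope under every combination of analytic choices."""
    results = []
    for (ex_name, keep), (tr_name, transform) in itertools.product(
        exclusions.items(), transforms.items()
    ):
        kept = [r for r in rows if keep(r)]
        xs = [r["x"] for r in kept]
        ys = [transform(r["y"]) for r in kept]
        results.append(
            {"exclusion": ex_name, "outcome": tr_name,
             "n": len(kept), "slope": slope(xs, ys)}
        )
    return results


# Simulated data (an assumption for illustration; not from any real survey).
rng = random.Random(1)
rows = []
for _ in range(500):
    x = rng.gauss(0, 1)
    rows.append(
        {
            "x": x,
            "y": math.exp(1 + 0.2 * x + rng.gauss(0, 0.5)),  # always positive
            "attentive": rng.random() > 0.1,
        }
    )

# Two hypothetical analytic choices, each with two defensible options.
exclusions = {"all cases": lambda r: True,
              "attentive only": lambda r: r["attentive"]}
transforms = {"raw": lambda y: y, "log": math.log}

results = multiverse(rows, exclusions, transforms)

# Report every specification, ordered by estimate, rather than a chosen one.
for spec in sorted(results, key=lambda s: s["slope"]):
    print(spec)
```

The point of the design is that no single specification is privileged: all four estimates are computed and reported together, so readers can see how far the conclusion depends on the analytic path taken.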
The practice of reverse engineering hypotheses based on observed relationships in the data (i.e., HARKing) is more difficult to address using multiverse or outcome-wide analyses. Instead, it is crucial that researchers acknowledge when analyses are exploratory or hypothesis generating and perform a confirmatory test of post-hoc hypotheses in replication samples or a preregistered study (cf. Bosnjak et al., 2022; O'Connor, 2021). A related approach involves the use of "hold-out" or "split-sample" strategies to control the false discovery rate. Exploratory analyses are performed on a publicly available fraction of the data and the hypotheses that emerge can be registered and tested on a portion of the data that was initially withheld by the data controller (Anderson & Magruder, 2017). Preregistering studies prior to applying for access to secondary data is another, perhaps more straightforward, approach to ensuring hypotheses are tested as planned. Of course, preregistration is less feasible when an application is not needed or the data has already been accessed by the research team. In this situation, detailed analysis protocols can be prepared and posted to a trusted repository in advance of beginning a new study drawing on the data, once again to make explicit exploratory and confirmatory tests. Therefore, at Cogent Psychology, we would welcome papers based upon large-scale datasets and secondary data analyses following the principles outlined above.
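The split-sample strategy described above can be sketched as a reproducible random partition of participant identifiers into an exploration set and a withheld confirmation set. This is a simplified illustration only: in the approach described by Anderson and Magruder (2017) the hold-out portion is withheld by the data controller, whereas here the split is done locally, and the function name `split_sample`, the 50% default and the seed value are all hypothetical choices.

```python
import random


def split_sample(ids, holdout_fraction=0.5, seed=2024):
    """Randomly partition ids into (exploration, confirmation) lists.

    The seed is fixed so the same split can be documented in a
    preregistration and reproduced later.
    """
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    shuffled = sorted(ids)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    confirmation = shuffled[:n_holdout]
    exploration = shuffled[n_holdout:]
    return exploration, confirmation


# Hypothetical participant identifiers 0..99, with 30% withheld.
exploration, confirmation = split_sample(range(100), holdout_fraction=0.3)
print(len(exploration), "cases for exploratory analysis")
print(len(confirmation), "cases withheld for the registered confirmatory test")
```

In practice, hypotheses generated on the exploration set would be registered before anyone on the research team analyses the confirmation set.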

Importance of greater collaborative working
As noted earlier, an additional critique of psychological science research has been the reliance on small sample sizes, bringing into question statistical power. This has been particularly pertinent in research with hard-to-reach populations (e.g., individuals with developmental disorders) or groups of participants that may require high levels of resources to engage in the research process (e.g., infants). Open research practices have aimed to address this issue by ensuring cross-laboratory collaboration through initiatives such as Many Babies (https://manybabies.github.io), which includes researchers from across 47 countries (but also see the Psychological Science Accelerator, a globally distributed network of psychological science laboratories; https://psysciacc.org/). Many Babies aims to replicate key findings in developmental science by pooling resources to address fundamental research questions. Interestingly, this approach not only addresses critical issues around sample size, transparency, and the sharing of advanced research methods, but also increases the diversity of both the study samples and the researchers involved. Given that most of the research in developmental science has been generated with participants from Western, Educated, Industrialised, Rich, Democratic (WEIRD) samples, these international collaborative efforts are critical to ensure that scientific findings and theory development incorporate human diversity.
Increasingly, data sharing forms an important cornerstone of open research practices. This is not just important to increase transparency of decision-making processes and analyses, but also to increase data pooling and collaborations. Of course, data sharing is not without its issues around upholding ethical standards and protecting the confidentiality of participants. However, initiatives such as https://nyu.databrary.org have successfully and safely generated a large resource for educational and developmental psychology research through the housing of video and transcript data for further exploration. Access to the database is restricted to recognised researchers whose identities are verified by their host institutions. Major funders have also supported the development and maintenance of large-scale databases to encourage data sharing and data combining to make substantial advances in our understanding of critical areas of research. A specific example is "LDbase" (https://ldbase.org), funded by the National Institutes of Health in the United States, which aims to support big data approaches to understanding reading difficulties. When encouraging the use of secondary data sources and the combining of datasets, it is also important to emphasise the preregistration of data analytic plans (including how variables will be selected, how missing data will be dealt with, etc.). This not only enables researchers to successfully plan out their approach to their research, but also increases transparency in the context of having "a garden of forking paths" of statistical analysis options (Gelman & Loken, 2013).
At Cogent Psychology we would particularly welcome research that embeds open research practices within its workflow, with a particular focus on multi-laboratory collaboration and the inclusion of diverse samples. Through these approaches, we will increase the chances of generating impactful research that aims to improve outcomes of children and young people.

Statistical power and sample size justification
As outlined earlier, it is now well accepted that most research fields in psychology have historically had low statistical power; they generally have a low probability of detecting the effect or effects they were interested in (e.g., Button et al., 2013; Cohen, 1962). It is also now common to see calculations estimating statistical power in papers, grant applications or preregistrations, at least in part as a response to mandates from journals or funding bodies. This sometimes backfires: researchers produce a calculation that will satisfy reviewers and funders rather than one that may be informative about the study they are planning. Common benchmarks and thresholds (e.g., 80% power to detect a medium or large effect) encourage misunderstandings and poor practice (Baguley, 2004; Giner-Sorolla et al., 2020).
A better and more transparent approach is to think about the range of factors that influence the statistical power of a study and what constraints exist on those factors. For example, if you collect data from a small school with 80 children, n is capped at 80. Any justification of sample size in terms of the effect size you are trying to detect is likely to be spurious. It would be better to justify the sample size on pragmatic grounds (e.g., see, Lakens, 2022). However, this sort of constraint does not mean that you should not think about and plan to maximize statistical power.
To understand why, it is important to realize that statistical power is not a number but a function or curve. For any particular study, the shape of the power curve depends on a range of parameters (representing the factors that influence your ability to detect an effect). Our aim in planning a study is not to predict a point on that curve (which for practical purposes will always be wrong). Rather, we want to understand how the shape of that curve is influenced by those parameters and (for the parameters that we can manipulate) pick values that give high statistical power across a range of plausible parameter values (sometimes termed a sensitivity analysis).
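To make the idea of power as a curve concrete, the sketch below tabulates approximate power across a grid of effect sizes and sample sizes instead of reporting one number. It is a minimal illustration under stated assumptions: a comparison of two independent group means with equal group sizes, a standardized effect size d, and a normal approximation to the t-test (which slightly overstates power in small samples). The function names `approx_power` and `power_curve` and the grid values are hypothetical.

```python
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05, two_sided=True):
    """Normal-approximation power for a two-group comparison of means.

    d is the standardized mean difference; n_per_group is the size of
    each of the two (equal) groups.
    """
    z = NormalDist()
    shift = d * (n_per_group / 2) ** 0.5  # expected value of the test statistic
    if two_sided:
        crit = z.inv_cdf(1 - alpha / 2)
        return z.cdf(shift - crit) + z.cdf(-shift - crit)
    crit = z.inv_cdf(1 - alpha)
    return z.cdf(shift - crit)


def power_curve(effect_sizes, n_per_group, alpha=0.05, two_sided=True):
    """Power at each effect size: a curve rather than a single point."""
    return {d: approx_power(d, n_per_group, alpha, two_sided) for d in effect_sizes}


# Sensitivity analysis over plausible effect sizes and feasible sample sizes.
effect_sizes = [0.2, 0.35, 0.5, 0.8]
for n in (20, 40, 80):
    curve = power_curve(effect_sizes, n)
    print(n, {d: round(p, 2) for d, p in curve.items()})
```

Reading across a row shows how quickly power falls as the assumed effect shrinks for a fixed n; reading down a column shows what additional participants buy for a given effect. Passing `two_sided=False` or a different `alpha` shows how those choices shift the whole curve.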
Furthermore, even in a simple study there are options to increase statistical power despite only having a few parameters to worry about: usually n, alpha and effect size. Often the focus is on n and this leads to neglect of other factors that influence statistical power such as the design of the study (Baguley, 2004). This should be informed by the minimum effect sizes we would like to detect; the SESOI or smallest effect size of interest (e.g., see, Giner-Sorolla et al., 2020). Researchers are reluctant to shift alpha from the conventional .05 level but there are good arguments for considering this (Lakens et al., 2018) or for adopting one-sided tests for preregistered hypotheses.
In a more complex study it may well be more important to focus on other factors that decrease statistical power, notably missing data or drop-out. Investing in preventing attrition will often have a greater practical impact on statistical power than increasing sample size (as well as being desirable for other reasons). It is still possible that after all this effort, power remains stubbornly low. Under these circumstances it may still be worth running the study, but focus should shift to making research synthesis easier. Open data, standardized protocols and (ideally) collaboration between research groups can facilitate this.
In summary, papers submitted to Cogent Psychology should include a clear justification for choice of sample size. This need not always involve consideration of statistical power (notably for single-case or qualitative studies). Where statistical power analysis is involved, it is better to focus on understanding the sensitivity of your sampling strategy or design, and to make plausible assumptions about the research context, rather than treating statistical power as a way to arrive at a single, fixed, correct answer.

Conclusion
In conclusion, it is clear that in the last decade or so the discipline of psychology has made huge advances in how psychological science is conducted (Chambers & Tzavella, 2022; Munafò et al., 2017; O'Connor, 2020, 2021). The development of new tools, approaches and publication formats has helped to reduce the use of questionable research practices, which will ultimately improve the robustness of our evidence base. We hope that the relaunch of Cogent Psychology can play a role in helping to further improve psychological science, and that you will want to contribute to this by submitting some of your work to us in the near future.