Abstract
Statistical errors in preclinical science are a barrier to reproducibility and translation. For instance, linear models (e.g., ANOVA, linear regression) may be misapplied to data that violate assumptions. In behavioral neuroscience and psychopharmacology, linear models are frequently applied to interdependent or compositional data, which includes behavioral assessments where animals concurrently choose between chambers, objects, outcomes, or types of behavior (e.g., forced swim, novel object, place/social preference). The current study simulated behavioral data for a task with four interdependent choices (i.e., increased choice of a given outcome decreases others) using Monte Carlo methods. 16,000 datasets were simulated (1000 each of 4 effect sizes by 4 sample sizes) and statistical approaches evaluated for accuracy. Linear regression and linear mixed effects regression (LMER) with a single random intercept resulted in high false positives (>60%). Elevated false positives were attenuated in an LMER with random effects for all choice-levels and a binomial logistic mixed effects regression. However, these models were underpowered to reliably detect effects at common preclinical sample sizes. A Bayesian method using prior knowledge for control subjects increased power by up to 30%. These results were confirmed in a second simulation (8000 datasets). These data suggest that statistical analyses may often be misapplied in preclinical paradigms, with common linear methods increasing false positives, but potential alternatives lacking power. Ultimately, using informed priors may balance statistical requirements with ethical imperatives to minimize the number of animals used. These findings highlight the importance of considering statistical assumptions and limitations when designing research studies.
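The false-positive inflation described above can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation code or their models. It assumes a hypothetical four-option task with Dirichlet-distributed subject preferences (concentration 2 per option), 10 subjects per group, and 100 trials per subject. It also swaps in a pooled two-proportion z-test as a simple stand-in for any analysis that treats dependent observations as independent, and a label-permutation test on per-subject proportions as a stand-in for an analysis that respects the subject as the unit. All function names and parameter values are illustrative assumptions.

```python
import random
from statistics import NormalDist

N_OPTIONS = 4  # four interdependent choices, as in the abstract


def simulate_subject(rng, alpha, n_trials):
    """Return one subject's choice counts; the counts always sum to n_trials."""
    # Dirichlet draw via normalized gamma variates (subject-specific preferences).
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    probs = [x / total for x in g]
    picks = rng.choices(range(N_OPTIONS), weights=probs, k=n_trials)
    return [picks.count(k) for k in range(N_OPTIONS)]


def pooled_z_test(counts_a, counts_b, n_trials, option=0):
    """Naive test: pools trials across subjects as if all were independent."""
    xa = sum(c[option] for c in counts_a)
    xb = sum(c[option] for c in counts_b)
    na, nb = len(counts_a) * n_trials, len(counts_b) * n_trials
    p = (xa + xb) / (na + nb)
    se = (p * (1 - p) * (1 / na + 1 / nb)) ** 0.5
    if se == 0:
        return 1.0
    z = (xa / na - xb / nb) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def permutation_test(counts_a, counts_b, n_trials, rng, n_perm=200, option=0):
    """Subject-level test: permutes group labels of per-subject proportions."""
    props = [c[option] / n_trials for c in counts_a + counts_b]
    n_a = len(counts_a)

    def abs_diff(values):
        return abs(sum(values[:n_a]) / n_a - sum(values[n_a:]) / (len(values) - n_a))

    observed = abs_diff(props)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(props)
        if abs_diff(props) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def false_positive_rates(n_sims=200, n_per_group=10, n_trials=100,
                         alpha_level=0.05, seed=1):
    """Estimate how often each test rejects when no group effect exists."""
    rng = random.Random(seed)
    concentration = [2.0] * N_OPTIONS  # identical in both groups: the null is true
    naive_hits = subject_hits = 0
    for _ in range(n_sims):
        a = [simulate_subject(rng, concentration, n_trials) for _ in range(n_per_group)]
        b = [simulate_subject(rng, concentration, n_trials) for _ in range(n_per_group)]
        naive_hits += pooled_z_test(a, b, n_trials) < alpha_level
        subject_hits += permutation_test(a, b, n_trials, rng) < alpha_level
    return naive_hits / n_sims, subject_hits / n_sims


naive_fpr, subject_fpr = false_positive_rates()
print(f"pooled z-test false-positive rate:       {naive_fpr:.2f}")
print(f"subject-level permutation test rate:     {subject_fpr:.2f}")
```

Under these assumed settings, the pooled test should reject a true null far more often than the nominal 5%, while the subject-level test should stay close to it. The sketch captures only one source of dependence (repeated trials within a subject) and does not reproduce the regression, mixed-effects, or Bayesian models evaluated in the study.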
Data availability
Code for this project will be shared on the corresponding author’s GitHub repository (https://github.com/VonderHaarLab) at the time of publication and is also included as Supplement S2. Underlying data for TBI simulations (control set) is publicly available at https://odc-tbi.org/, dataset identifier 703. Data for cue manipulation simulations (validation set) was obtained and used with permission from Dr. Catharine Winstanley at the University of British Columbia.
Acknowledgements
We would like to thank the researchers in the Vonder Haar laboratory and the Winstanley laboratory who helped to collect the original data which were used to support the simulations in this manuscript.
Funding
This work was supported by the NINDS (R01-NS110905; R01-NS110905-05S1) and Ohio State University.
Author information
Contributions
Conceptualization: MF, CV; Formal analysis: MF, PM, MY; Funding acquisition: CV; Writing – first draft: MF; Writing – review and editing: MF, PM, CV, MY.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Frankot, M., Mueller, P.M., Young, M.E. et al. Statistical power and false positive rates for interdependent outcomes are strongly influenced by test type: Implications for behavioral neuroscience. Neuropsychopharmacol. 48, 1612–1622 (2023). https://doi.org/10.1038/s41386-023-01592-6
DOI: https://doi.org/10.1038/s41386-023-01592-6