Monte Carlo confidence intervals for the indirect effect with missing data

Pesigan, Ivan Jacob Agaloos; Cheung, Shu Fai

doi:10.3758/s13428-023-02114-4

Published: 07 August 2023

Behavior Research Methods, Volume 56, pages 1678–1696 (2024)

Abstract

Missing data is a common occurrence in mediation analysis. As a result, the methods used to construct confidence intervals around the indirect effect should consider missing data. Previous research has demonstrated that, for the indirect effect in data with complete cases, the Monte Carlo method performs as well as nonparametric bootstrap confidence intervals (see MacKinnon et al., Multivariate Behavioral Research, 39(1), 99–128, 2004; Preacher & Selig, Communication Methods and Measures, 6(2), 77–98, 2012; Tofighi & MacKinnon, Structural Equation Modeling: A Multidisciplinary Journal, 23(2), 194–205, 2015). In this manuscript, we propose a simple, fast, and accurate two-step approach for generating confidence intervals for the indirect effect, in the presence of missing data, based on the Monte Carlo method. In the first step, an appropriate method, for example, full-information maximum likelihood or multiple imputation, is used to estimate the parameters and their corresponding sampling variance-covariance matrix in a mediation model. In the second step, the sampling distribution of the indirect effect is simulated using estimates from the first step. A confidence interval is constructed from the resulting sampling distribution. A simulation study with various conditions is presented. Implications of the results for applied research are discussed.
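
The second step described above can be illustrated with a minimal sketch. It assumes a simple mediation model in which the indirect effect is the product of two path coefficients, and it assumes that step 1 has already produced their estimates and sampling variance-covariance matrix. The function name `monte_carlo_ci`, the number of draws, and the numeric inputs are hypothetical choices for illustration and are not taken from the article.

```python
import numpy as np

def monte_carlo_ci(estimates, vcov, n_draws=20000, alpha=0.05, seed=0):
    """Percentile Monte Carlo CI for an indirect effect a*b.

    estimates : (a_hat, b_hat) from step 1 (e.g., FIML or pooled MI estimates)
    vcov      : 2 x 2 sampling variance-covariance matrix of (a_hat, b_hat)
    """
    rng = np.random.default_rng(seed)
    # Simulate the sampling distribution of the two path coefficients
    draws = rng.multivariate_normal(estimates, vcov, size=n_draws)
    # Each simulated indirect effect is the product of the simulated paths
    indirect = draws[:, 0] * draws[:, 1]
    lower, upper = np.quantile(indirect, [alpha / 2, 1 - alpha / 2])
    return lower, upper

# Hypothetical step-1 output (illustrative numbers only)
est = np.array([0.40, 0.30])
vcov = np.array([[0.010, 0.000],
                 [0.000, 0.008]])
lo, hi = monte_carlo_ci(est, vcov)
print(f"95% Monte Carlo CI: [{lo:.3f}, {hi:.3f}]")
```

Because the interval is read off the percentiles of the simulated products, it does not need to be symmetric around the point estimate, and the model is not refitted for each draw.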

Notes

While we use and recommend free and open-source statistical software, Mplus was used in the simulations because it is currently the fastest SEM software available. The speed was crucial for this study because it allowed us to investigate a wide range of simulation conditions (2,832) with adequate replication size (5,000). It also allowed us to use a large number of bootstrap samples (5,000) and imputations (100).
The path coefficient .376 for \({\alpha }\) and \({\beta }\) results in the indirect effect .141, the square of which times \(100\%\) is equal to \(2\%\). The path coefficient .600 for \({\alpha }\) and \({\beta }\) results in the indirect effect .361, the square of which times \(100\%\) is equal to \(13\%\). The path coefficient .714 for \({\alpha }\) and \({\beta }\) results in the indirect effect .509, the square of which times \(100\%\) is equal to \(26\%\).
The path coefficient .141 for \(\tau ^{\prime }\) squared times \(100\%\) is equal to \(2\%\). The path coefficient .361 for \(\tau ^{\prime }\) squared times \(100\%\) is equal to \(13\%\). The path coefficient .509 for \(\tau ^{\prime }\) squared times \(100\%\) is equal to \(26\%\).
The ampute function defines the proportion of missing cases as the proportion of the original cases that would be dropped had listwise deletion been employed. There are other ways of thinking about the proportion of missing data, for example, the number of cells in the data matrix with missing values over the total number of cells. For a more thorough presentation of the proportion of missing data used in the study, see https://jeksterslab.github.io/manMCMedMiss/articles/proportion-missing.html.
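
The two definitions of the proportion of missing data mentioned in this note can give different values for the same data set. The following is a small sketch with a hypothetical 5 x 3 data matrix (the values are made up for illustration and do not come from the study).

```python
import numpy as np

# Hypothetical data matrix; np.nan marks a missing cell
data = np.array([
    [1.0,    2.0,    3.0],
    [np.nan, 2.5,    3.1],
    [0.9,    np.nan, np.nan],
    [1.2,    2.2,    2.9],
    [1.1,    2.4,    3.3],
])
missing = np.isnan(data)

# Case-wise definition: share of rows that listwise deletion would drop
prop_cases = missing.any(axis=1).mean()

# Cell-wise definition: share of cells in the matrix that are missing
prop_cells = missing.mean()

print(prop_cases, prop_cells)
```

Here two of five cases are incomplete, while three of fifteen cells are missing, so the case-wise proportion is larger than the cell-wise proportion.
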
An alternative total variance was introduced by Li et al. (1991) and was used in the simulation. While results for this approach are available in the supplementary materials, they are omitted from the manuscript because this approach did not provide a significant improvement over the performance of Eq. 11.
The automated MI implementation in Mplus only pools the diagonal elements of the sampling variance-covariance matrix. The wrapper function FitModelMI fits the model using normal-theory maximum likelihood on each imputation in Mplus, extracts the entire sampling variance-covariance matrix from each of the fitted models, and pools these matrices.
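
To show why pooling the entire matrix matters, the sketch below pools estimates and full sampling variance-covariance matrices across imputations using the standard multivariate form of Rubin's (1987) rules. This is an illustrative assumption about the pooling step, not the code of FitModelMI, and whether it matches Eq. 11 term by term cannot be confirmed from the text here. The function name `pool_rubin` and the inputs are hypothetical.

```python
import numpy as np

def pool_rubin(estimates, vcovs):
    """Pool m sets of estimates and their full covariance matrices.

    estimates : (m, p) array, one row of parameter estimates per imputation
    vcovs     : (m, p, p) array, one sampling covariance matrix per imputation
    """
    estimates = np.asarray(estimates, dtype=float)
    vcovs = np.asarray(vcovs, dtype=float)
    m = estimates.shape[0]
    pooled_est = estimates.mean(axis=0)      # pooled point estimates
    within = vcovs.mean(axis=0)              # average within-imputation covariance
    dev = estimates - pooled_est
    between = dev.T @ dev / (m - 1)          # between-imputation covariance
    total = within + (1 + 1 / m) * between   # total covariance matrix
    return pooled_est, total

# Hypothetical results from m = 2 imputations of a two-parameter model
ests = [[0.4, 0.3], [0.6, 0.5]]
covs = [np.eye(2) * 0.01, np.eye(2) * 0.01]
pooled_est, total = pool_rubin(ests, covs)
print(pooled_est)
print(total)
```

In this toy example the within-imputation covariances are zero, yet the pooled off-diagonal elements are not, because the between-imputation component contributes covariance. Pooling only the diagonal would discard that information before the Monte Carlo step.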

References

Arbuckle, J. L. (1996). Full information estimation in the presence of incomplete data. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced structural equation modeling. Psychology Press. https://doi.org/10.4324/9781315827414
Arbuckle, J. L. (2021). Amos 28.0 User’s Guide. Chicago, IBM SPSS.
Aroian, L. A. (1947). The probability function of the product of two normally distributed variables. The Annals of Mathematical Statistics, 18(2), 265–271. https://doi.org/10.1214/aoms/1177730442
Asparouhov, T., & Muthén, B. O. (2022). Multiple imputation with Mplus (Tech. Rep.). http://www.statmodel.com/download/Imputations7.pdf
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173–1182. https://doi.org/10.1037/0022-3514.51.6.1173
Bauer, D. J., Preacher, K. J., & Gil, K. M. (2006). Conceptualizing and testing random indirect effects and moderated mediation in multilevel models: New procedures and recommendations. Psychological Methods, 11(2), 142–163. https://doi.org/10.1037/1082-989x.11.2.142
Biesanz, J. C., Falk, C. F., & Savalei, V. (2010). Assessing mediational models: Testing and interval estimation for indirect effects. Multivariate Behavioral Research, 45(4), 661–701. https://doi.org/10.1080/00273171.2010.498292
Blanca, M. J., Arnau, J., López-Montiel, D., Bono, R., & Bendayan, R. (2013). Skewness and kurtosis in real data samples. Methodology, 9(2), 78–84. https://doi.org/10.1027/1614-2241/a000057
Bollen, K. A., & Stine, R. (1990). Direct and indirect effects: Classical and bootstrap estimates of variability. Sociological Methodology, 20, 115. https://doi.org/10.2307/271084
Bradley, J. V. (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31(2), 144–152. https://doi.org/10.1111/j.2044-8317.1978.tb00581.x
Cheung, G. W., & Lau, R. S. (2007). Testing mediation and suppression effects of latent variables. Organizational Research Methods, 11(2), 296–325. https://doi.org/10.1177/1094428107300343
Cheung, M.W.-L. (2009a). Comparison of methods for constructing confidence intervals of standardized indirect effects. Behavior Research Methods, 41(2), 425–438. https://doi.org/10.3758/brm.41.2.425
Cheung, M.W.-L. (2009b). Constructing approximate confidence intervals for parameters with structural equation models. Structural Equation Modeling: A Multidisciplinary Journal, 16(2), 267–294. https://doi.org/10.1080/10705510902751291
Cheung, M.W.-L. (2021). Synthesizing indirect effects in mediation models with meta-analytic methods. Alcohol and Alcoholism, 57(1), 5–15. https://doi.org/10.1093/alcalc/agab044
Cochran, W. G. (1952). The \(\chi ^{2}\) test of goodness of fit. The Annals of Mathematical Statistics, 23(3), 315–345. https://doi.org/10.1214/aoms/1177729380
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed). Routledge. https://doi.org/10.4324/9780203771587
Craig, C. C. (1936). On the frequency function of \(xy\). The Annals of Mathematical Statistics, 7(1), 1–15. https://doi.org/10.1214/aoms/1177732541
Davison, A. C., & Hinkley, D. V. (1997). Bootstrap methods and their application. Cambridge University Press. https://doi.org/10.1017/CBO9780511802843
Efron, B. (1987). Better bootstrap confidence intervals. Journal of the American Statistical Association, 82(397), 171–185. https://doi.org/10.1080/01621459.1987.10478410
Efron, B. (1988). Bootstrap confidence intervals: Good or bad? Psychological Bulletin, 104(2), 293–296. https://doi.org/10.1037/0033-2909.104.2.293
Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. Chapman & Hall. https://doi.org/10.1201/9780429246593
Enders, C. K. (2010). Applied missing data analysis. Guilford Publications.
Fritz, M. S., & MacKinnon, D. P. (2007). Required sample size to detect the mediated effect. Psychological Science, 18(3), 233–239. https://doi.org/10.1111/j.1467-9280.2007.01882.x
Goodman, L. A. (1960). On the exact variance of products. Journal of the American Statistical Association, 55(292), 708–713. https://doi.org/10.1080/01621459.1960.10483369
Graham, J. W., Olchowski, A. E., & Gilreath, T. D. (2007). How many imputations are really needed? Some practical clarifications of multiple imputation theory. Prevention Science, 8(3), 206–213. https://doi.org/10.1007/s11121-007-0070-9
Hayes, A. F. (2022). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach (3rd ed.). Guilford Publications.
Hayes, A. F., & Scharkow, M. (2013). The relative trustworthiness of inferential tests of the indirect effect in statistical mediation analysis. Psychological Science, 24(10), 1918–1927. https://doi.org/10.1177/0956797613480187
Jorgensen, T. D., Pornprasertmanit, S., Schoemann, A. M., & Rosseel, Y. (2022). semTools: Useful tools for structural equation modeling. https://CRAN.R-project.org/package=semTools
Koopman, J., Howe, M., & Hollenbeck, J. R. (2014). Pulling the Sobel test up by its bootstraps. In More statistical and methodological myths and urban legends: Doctrine, verity and fable in organizational and social sciences (pp. 224–243). Routledge/Taylor & Francis Group. https://doi.org/10.4324/9780203775851
Koopman, J., Howe, M., Hollenbeck, J. R., & Sin, H.-P. (2015). Small sample mediation testing: Misplaced confidence in bootstrapped confidence intervals. Journal of Applied Psychology, 100(1), 194–202. https://doi.org/10.1037/a0036635
Li, K. H., Raghunathan, T. E., & Rubin, D. B. (1991). Large-sample significance levels from multiply imputed data using moment-based statistics and an F reference distribution. Journal of the American Statistical Association, 86(416), 1065–1073. https://doi.org/10.1080/01621459.1991.10475152
Little, R. J. A. & Rubin, D. B. (2019). Statistical analysis with missing data (3rd ed.). Wiley. https://doi.org/10.1002/9781119482260
MacKinnon, D. P. (2008). Introduction to statistical mediation analysis. Erlbaum Psych Press. https://doi.org/10.4324/9780203809556
MacKinnon, D. P., Fritz, M. S., Williams, J., & Lockwood, C. M. (2007). Distribution of the product confidence limits for the indirect effect: Program PRODCLIN. Behavior Research Methods, 39(3), 384–389. https://doi.org/10.3758/bf03193007
MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., & Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7(1), 83–104. https://doi.org/10.1037/1082-989x.7.1.83
MacKinnon, D. P., Lockwood, C. M., & Williams, J. (2004). Confidence limits for the indirect effect: Distribution of the product and resampling methods. Multivariate Behavioral Research, 39(1), 99–128. https://doi.org/10.1207/s15327906mbr3901_4
Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105(1), 156–166. https://doi.org/10.1037/0033-2909.105.1.156
Muthén, L. K. & Muthén, B. O. (2017). Mplus user’s guide (8th ed.). Muthén & Muthén.
Pawitan, Y. (2013). In all likelihood: Statistical modelling and inference using likelihood. Oxford University Press.
Pesigan, I. J. A., & Cheung, S. F. (2020). SEM-based methods to form confidence intervals for indirect effect: Still applicable given nonnormality, under certain conditions. Frontiers in Psychology, 11. https://doi.org/10.3389/fpsyg.2020.571928
Peugh, J. L., & Enders, C. K. (2004). Missing data in educational research: A review of reporting practices and suggestions for improvement. Review of Educational Research, 74(4), 525–556. https://doi.org/10.3102/00346543074004525
Preacher, K. J., & Hayes, A. F. (2008). Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, 40(3), 879–891. https://doi.org/10.3758/brm.40.3.879
Preacher, K. J., & Hayes, A. F. (2004). SPSS and SAS procedures for estimating indirect effects in simple mediation models. Behavior Research Methods, Instruments, & Computers, 36(4), 717–731. https://doi.org/10.3758/bf03206553
Preacher, K. J., & Selig, J. P. (2012). Advantages of Monte Carlo confidence intervals for indirect effects. Communication Methods and Measures, 6(2), 77–98. https://doi.org/10.1080/19312458.2012.679848
R Core Team. (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
Raghunathan, T. E., Lepkowski, J. M., Hoewyk, J. V., & Solenberger, P. (2001). A multivariate technique for multiply imputing missing values using a sequence of regression models. Survey Methodology, 27(1), 85–95.
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2). https://doi.org/10.18637/jss.v048.i02
Rubin, D. B. (1976). Inference and missing data. Biometrika, 63(3), 581–592. https://doi.org/10.1093/biomet/63.3.581
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. John Wiley & Sons, Inc. https://doi.org/10.1002/9780470316696
Savalei, V., & Rosseel, Y. (2021). Computational options for standard errors and test statistics with incomplete normal and nonnormal data in SEM. Structural Equation Modeling: A Multidisciplinary Journal, 29(2), 163–181. https://doi.org/10.1080/10705511.2021.1877548
Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data. Chapman and Hall/CRC. https://doi.org/10.1201/9780367803025
Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7(2), 147–177. https://doi.org/10.1037/1082-989x.7.2.147
Schouten, R. M., Lugtig, P., & Vink, G. (2018). Generating missing values for simulation purposes: A multivariate amputation procedure. Journal of Statistical Computation and Simulation, 88(15), 2909–2930. https://doi.org/10.1080/00949655.2018.1491577
Serlin, R. C. (2000). Testing for robustness in Monte Carlo studies. Psychological Methods, 5(2), 230–240. https://doi.org/10.1037/1082-989x.5.2.230
Shrout, P. E., & Bolger, N. (2002). Mediation in experimental and nonexperimental studies: New procedures and recommendations. Psychological Methods, 7(4), 422–445. https://doi.org/10.1037/1082-989x.7.4.422
Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. Sociological Methodology, 13, 290. https://doi.org/10.2307/270723
Sobel, M. E. (1987). Direct and indirect effects in linear structural equation models. Sociological Methods & Research, 16(1), 155–176. https://doi.org/10.1177/0049124187016001006
Sobel, M. E. (1986). Some new results on indirect effects and their standard errors in covariance structure models. Sociological Methodology, 16, 159. https://doi.org/10.2307/270922
Taylor, A. B., MacKinnon, D. P., & Tein, J.-Y. (2007). Tests of the three-path mediated effect. Organizational Research Methods, 11(2), 241–269. https://doi.org/10.1177/1094428107300344
Tofighi, D., & Kelley, K. (2020). Improved inference in mediation analysis: Introducing the model-based constrained optimization procedure. Psychological Methods, 25, 496–515. https://doi.org/10.1037/met0000259
Tofighi, D., & Kelley, K. (2019). Indirect effects in sequential mediation models: Evaluating methods for hypothesis testing and confidence interval formation. Multivariate Behavioral Research, 55(2), 188–210. https://doi.org/10.1080/00273171.2019.1618545
Tofighi, D., & MacKinnon, D. P. (2015). Monte Carlo confidence intervals for complex functions of indirect effects. Structural Equation Modeling: A Multidisciplinary Journal, 23(2), 194–205. https://doi.org/10.1080/10705511.2015.1057284
van Buuren, S. (2018). Flexible imputation of missing data (2nd ed.). Chapman and Hall/CRC. https://doi.org/10.1201/9780429492259
van Buuren, S., Brand, J. P. L., Groothuis-Oudshoorn, C. G. M., & Rubin, D. B. (2006). Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, 76(12), 1049–1064. https://doi.org/10.1080/10629360600810434
van Buuren, S. & Groothuis-Oudshoorn, K. (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3). https://doi.org/10.18637/jss.v045.i03
Venables, W. N., & Ripley, B. D. (2002). Modern applied statistics with S. Springer, New York. https://doi.org/10.1007/978-0-387-21706-2
Venzon, D. J., & Moolgavkar, S. H. (1988). A method for computing profile-likelihood-based confidence intervals. Applied Statistics, 37(1), 87. https://doi.org/10.2307/2347496
Wu, W., & Jia, F. (2013). A new procedure to test mediation with missing data through nonparametric bootstrapping and multiple imputation. Multivariate Behavioral Research, 48(5), 663–691. https://doi.org/10.1080/00273171.2013.816235
Yuan, K.-H., & Bentler, P. M. (2000). Three likelihood-based methods for mean and covariance structure analysis with nonnormal missing data. Sociological Methodology, 30(1), 165–200. https://doi.org/10.1111/0081-1750.00078
Yzerbyt, V., Muller, D., Batailler, C., & Judd, C. M. (2018). New recommendations for testing indirect effects in mediational models: The need to report and test component paths. Journal of Personality and Social Psychology, 115(6), 929–943. https://doi.org/10.1037/pspa0000132
Zhang, Z., & Wang, L. (2012). Methods for mediation analysis with missing data. Psychometrika, 78(1), 154–184. https://doi.org/10.1007/s11336-012-9301-5
Zhang, Z., Wang, L., & Tong, X. (2015). Mediation analysis with missing data through multiple imputation and bootstrap. In Quantitative Psychology Research (pp. 341–355). Springer International Publishing. https://doi.org/10.1007/978-3-319-19977-1_24

Acknowledgements

The simulation was performed in part at the High-Performance Computing Cluster (HPCC) which is supported by the Information and Communication Technology Office (ICTO) of the University of Macau.

Author information

Authors and Affiliations

Department of Psychology, Faculty of Social Sciences, University of Macau, Avenida da Universidade, Taipa, Macao SAR, China
Ivan Jacob Agaloos Pesigan & Shu Fai Cheung

Corresponding author

Correspondence to Ivan Jacob Agaloos Pesigan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pesigan, I.J.A., Cheung, S.F. Monte Carlo confidence intervals for the indirect effect with missing data. Behav Res 56, 1678–1696 (2024). https://doi.org/10.3758/s13428-023-02114-4

Accepted: 21 March 2023
Published: 07 August 2023
Issue Date: March 2024
DOI: https://doi.org/10.3758/s13428-023-02114-4
