Monte Carlo confidence intervals for the indirect effect with missing data

Behavior Research Methods

Abstract

Missing data is a common occurrence in mediation analysis. As a result, the methods used to construct confidence intervals around the indirect effect should consider missing data. Previous research has demonstrated that, for the indirect effect in data with complete cases, the Monte Carlo method performs as well as nonparametric bootstrap confidence intervals (see MacKinnon et al., Multivariate Behavioral Research, 39(1), 99–128, 2004; Preacher & Selig, Communication Methods and Measures, 6(2), 77–98, 2012; Tofighi & MacKinnon, Structural Equation Modeling: A Multidisciplinary Journal, 23(2), 194–205, 2015). In this manuscript, we propose a simple, fast, and accurate two-step approach for generating confidence intervals for the indirect effect, in the presence of missing data, based on the Monte Carlo method. In the first step, an appropriate method, for example, full-information maximum likelihood or multiple imputation, is used to estimate the parameters and their corresponding sampling variance-covariance matrix in a mediation model. In the second step, the sampling distribution of the indirect effect is simulated using estimates from the first step. A confidence interval is constructed from the resulting sampling distribution. A simulation study with various conditions is presented. Implications of the results for applied research are discussed.
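
The two-step procedure described in the abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation. It assumes that the first step (for example, full-information maximum likelihood or multiple imputation) has already produced estimates of the two paths, their sampling variances, and their sampling covariance. The function name `monte_carlo_ci`, its parameter names, and the numeric inputs are hypothetical.

```python
import math
import random


def monte_carlo_ci(a_hat, b_hat, var_a, var_b, cov_ab=0.0,
                   n_draws=20000, level=0.95, seed=42):
    """Percentile Monte Carlo confidence interval for the indirect effect a*b.

    a_hat, b_hat : path estimates from step one (hypothetical inputs)
    var_a, var_b : their sampling variances
    cov_ab       : their sampling covariance
    """
    rng = random.Random(seed)

    # Cholesky factor of the 2x2 sampling variance-covariance matrix,
    # used to draw correlated normal deviates for (a, b).
    l11 = math.sqrt(var_a)
    l21 = cov_ab / l11
    remainder = var_b - l21 ** 2
    if remainder <= 0:
        raise ValueError("covariance matrix is not positive definite")
    l22 = math.sqrt(remainder)

    # Step two: simulate the sampling distribution of the product a*b.
    draws = []
    for _ in range(n_draws):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        a = a_hat + l11 * z1
        b = b_hat + l21 * z1 + l22 * z2
        draws.append(a * b)

    # Percentile limits of the simulated distribution.
    draws.sort()
    tail = (1.0 - level) / 2.0
    lower = draws[math.floor(tail * (n_draws - 1))]
    upper = draws[math.ceil((1.0 - tail) * (n_draws - 1))]
    return lower, upper


# Hypothetical estimates: both paths equal to .376 with variance .01.
lower, upper = monte_carlo_ci(0.376, 0.376, 0.01, 0.01)
print(f"95% Monte Carlo CI: [{lower:.3f}, {upper:.3f}]")
```

Because the interval is read off the simulated distribution of the product, it does not have to be symmetric around the point estimate. The sketch draws only the two paths; a fuller version would draw all model parameters from the complete sampling variance-covariance matrix.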

Notes

  1. While we use and recommend free and open-source statistical software, Mplus was used in the simulations because it is currently the fastest SEM software available. The speed was crucial for this study because it allowed us to investigate a wide range of simulation conditions (2,832) with adequate replication size (5,000). It also allowed us to use a large number of bootstrap samples (5,000) and imputations (100).

  2. The path coefficient .376 for \(\alpha\) and \(\beta\) results in the indirect effect .141, the square of which times \(100\%\) is approximately \(2\%\). The path coefficient .600 for \(\alpha\) and \(\beta\) results in the indirect effect .361, the square of which times \(100\%\) is approximately \(13\%\). The path coefficient .714 for \(\alpha\) and \(\beta\) results in the indirect effect .509, the square of which times \(100\%\) is approximately \(26\%\).

  3. The path coefficient .141 for \(\tau ^{\prime }\) squared times \(100\%\) is equal to \(2\%\). The path coefficient .361 for \(\tau ^{\prime }\) squared times \(100\%\) is equal to \(13\%\). The path coefficient .509 for \(\tau ^{\prime }\) squared times \(100\%\) is equal to \(26\%\).

  4. The ampute function defines the proportion of missing cases as the proportion of the original cases that would be dropped had listwise deletion been employed. There are other ways of defining the proportion of missing data, for example, the number of cells in the data matrix with missing values divided by the total number of cells. For a more thorough presentation of the proportion of missing data used in the study, see https://jeksterslab.github.io/manMCMedMiss/articles/proportion-missing.html.

  5. An alternative total variance was introduced by Li et al. (1991) and was also used in the simulation. Results for this approach are available in the supplementary materials but are omitted from the manuscript because the approach did not significantly improve on the performance of Eq. 11.

  6. The automated MI implementation in Mplus only pools the diagonal elements of the sampling variance-covariance matrix. The wrapper function FitModelMI fits the model in Mplus using normal-theory maximum likelihood on each imputed data set, extracts the entire sampling variance-covariance matrix from each fitted model, and pools these matrices across imputations.
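
Note 6 turns on pooling the whole sampling variance-covariance matrix rather than only its diagonal. The sketch below illustrates one way to do that, assuming the standard multivariate form of Rubin's rules (total = within + (1 + 1/m) × between). It is not the code of FitModelMI, and the source does not show Eq. 11, so the correspondence to that equation is an assumption. The function name `pool_rubin` and the input values are hypothetical.

```python
def pool_rubin(estimates, covariances):
    """Pool m sets of estimates and full covariance matrices (Rubin's rules).

    estimates   : list of m parameter vectors, each of length p
    covariances : list of m p-by-p sampling variance-covariance matrices
    Returns the pooled estimates and the pooled p-by-p total covariance.
    """
    m = len(estimates)
    p = len(estimates[0])

    # Pooled point estimates: average over imputations.
    pooled = [sum(est[j] for est in estimates) / m for j in range(p)]

    # Within-imputation covariance: average of the m covariance matrices,
    # including the off-diagonal elements.
    within = [[sum(cov[i][j] for cov in covariances) / m
               for j in range(p)] for i in range(p)]

    # Between-imputation covariance of the estimates across imputations.
    between = [[sum((est[i] - pooled[i]) * (est[j] - pooled[j])
                    for est in estimates) / (m - 1)
                for j in range(p)] for i in range(p)]

    # Total covariance, assuming the standard Rubin's rules form.
    total = [[within[i][j] + (1 + 1 / m) * between[i][j]
              for j in range(p)] for i in range(p)]
    return pooled, total


# Hypothetical results from m = 3 imputations of a two-parameter model.
estimates = [[0.30, 0.40], [0.40, 0.50], [0.35, 0.45]]
covariances = [[[0.010, 0.001], [0.001, 0.020]]] * 3
pooled, total = pool_rubin(estimates, covariances)
print(pooled)
print(total)
```

The off-diagonal element of the pooled matrix is the sampling covariance between the two estimates. A Monte Carlo draw of correlated parameters needs that element, which is why pooling only the diagonal is not sufficient here.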

Acknowledgements

The simulation was performed in part at the High-Performance Computing Cluster (HPCC) which is supported by the Information and Communication Technology Office (ICTO) of the University of Macau.

Author information

Corresponding author

Correspondence to Ivan Jacob Agaloos Pesigan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pesigan, I.J.A., Cheung, S.F. Monte Carlo confidence intervals for the indirect effect with missing data. Behav Res 56, 1678–1696 (2024). https://doi.org/10.3758/s13428-023-02114-4

  • DOI: https://doi.org/10.3758/s13428-023-02114-4
