Abstract
Structural equation modeling is a multivariate method for establishing meaningful models to investigate the relationships among latent and manifest (observed) variables. In the past quarter of a century, it has drawn a great deal of attention in psychometrics and sociometrics, both in terms of theoretical developments and practical applications (see Bentler and Wu, 2002; Bollen, 1989; Jöreskog and Sörbom, 1996; Lee, 2007). Although not to the extent that they have been used in the behavioral, educational, and social sciences, structural equation models (SEMs) have been widely used in public health, biological, and medical research (see Bentler and Stein, 1992; Liu et al., 2005; Pugesek et al., 2003, and references therein). A review of the basic SEM with applications to environmental epidemiology has been given by Sanchez et al. (2005).
References
Ansari, A. and Jedidi, K. (2000). Bayesian factor analysis for multilevel binary observations. Psychometrika, 65, 475–498
Austin, P. C. and Escobar, M. D. (2005). Bayesian modeling of missing data in clinical research. Computational Statistics and Data Analysis, 48, 821–836
Bentler, P. M. and Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588–606
Bentler, P. M. and Stein, J. A. (1992). Structural equation models in medical research. Statistical Methods in Medical Research, 1, 159–181
Bentler, P. M. and Wu, E. J. C. (2002). EQS6 for Windows User Guide. Encino, CA: Multivariate Software, Inc.
Bollen, K. A. (1989). Structural Equations with Latent Variables. New York: Wiley
Chib, S. (1995). Marginal likelihood from the Gibbs output. Journal of the American Statistical Association, 90, 1313–1321
Chib, S. and Jeliazkov, I. (2001). Marginal likelihood from the Metropolis-Hastings output. Journal of the American Statistical Association, 96, 270–281
Congdon, P. (2005). Bayesian predictive model comparison via parallel sampling. Computational Statistics and Data Analysis, 48, 735–753
Diciccio, T. J., Kass, R. E., Raftery, A. and Wasserman, L. (1997). Computing Bayes factors by combining simulation and asymptotic approximations. Journal of the American Statistical Association, 92, 903–915
Dolan, C., van der Sluis, S. and Grasman, R. (2005). A note on normal theory power calculation in SEM with data missing completely at random. Structural Equation Modeling, 12, 245–262
Dunson, D. B. (2000). Bayesian latent variable models for clustered mixed outcomes. Journal of the Royal Statistical Society Series B, 62, 355–366
Dunson, D. B. (2005). Bayesian semiparametric isotonic regression for count data. Journal of the American Statistical Association, 100, 618–627
Dunson, D. B. and Herring, A. H. (2005). Bayesian latent variable models for mixed discrete outcomes. Biostatistics, 6, 11–25
Garcia-Donato, G. and Chen, M.-H. (2005). Calibrating Bayes factor under prior predictive distributions. Statistica Sinica, 15, 359–380
Gelfand, A. E. and Dey D. K. (1994). Bayesian model choice: asymptotic and exact calculations. Journal of the Royal Statistical Society, Series B, 56, 501–514
Gelman, A. and Meng, X. L. (1998). Simulating normalizing constants: from importance sampling to bridge sampling to path sampling. Statistical Science, 13, 163–185
Gelman, A., Meng, X. L. and Stern, H. (1996). Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 6, 733–807
Geman, S. and Geman, D. (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721–741
Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57, 97–109
Jöreskog, K. G. and Sörbom, D. (1996). LISREL 8: Structural Equation Modeling with the SIMPLIS Command Language. Hove and London: Scientific Software International
Kass, R. E. and Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773–795
Kim, K. H. (2005). The relation among fit indices, power, and sample size in structural equation modeling. Structural Equation Modeling, 12, 368–390
Lee, S. Y. (2007). Structural Equation Modeling: A Bayesian Approach. New York: Wiley
Lee, S. Y. and Song, X. Y. (2003). Model comparison of a nonlinear structural equation model with fixed covariates. Psychometrika, 68, 27–47
Lee, S. Y. and Song, X. Y. (2004a). Evaluation of the Bayesian and maximum likelihood approaches in analyzing structural equation models with small sample sizes. Multivariate Behavioral Research, 39, 653–686
Lee, S. Y. and Song, X. Y. (2004b). Maximum likelihood analysis of a general latent variable model with hierarchically mixed data. Biometrics, 60, 624–636
Lee, S. Y. and Xia, Y. M. (2006). Maximum likelihood methods in treating outliers and symmetrically heavy-tailed distributions for nonlinear structural equation models with missing data. Psychometrika, 71, 565–585
Levin, H. M. (1998). Accelerated schools: a decade of evolution. In A. Hargreaves et al. (Eds) International Handbook of Educational Change, Part Two (pp 809–810). New York: Kluwer
Liu, X., Wall, M. M. and Hodges, J. S. (2005). Generalized spatial structural equation models. Biostatistics, 6, 539–551
Meng, X. L. and Wong, H. W. (1996). Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Statistica Sinica, 6, 831–860
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. and Teller, E. (1953). Equation of state calculations by fast computing machines. Journal of Chemical Physics, 21, 1087–1091
Ogata, Y. (1989). A Monte Carlo method for high dimensional integration. Numerische Mathematik, 55, 137–157
Palomo, J., Dunson, D. B. and Bollen, K. (2007). Bayesian structural equation modeling. In S. Y. Lee (Ed) Handbook of Latent Variable and Related Models. Amsterdam: Elsevier
Pugesek, B. H., Tomer, A. and von Eye, A. (2003). Structural Equation Modeling Applications in Ecological and Evolutionary Biology. New York: Cambridge University Press
Raftery, A. E. (1993). Bayesian model selection in structural equation models. In K. A. Bollen and J. S. Long (Eds) Testing Structural Equation Models (pp 163–180). Thousand Oaks, CA: Sage Publications
Raftery, A. E. (1996). Hypothesis testing and model selection. In W. R. Gilks, S. Richardson and D. J. Spiegelhalter (Eds) Markov Chain Monte Carlo in Practice (pp 163–188). London: Chapman and Hall
Raykov, T. and Marcoulides, G. A. (2006). On multilevel model reliability estimation from the perspective of structural equation modeling. Structural Equation Modeling, 13, 130–141
Richardson, S. and Green, P. J. (1997). On Bayesian analysis of mixtures with an unknown number of components (with discussion). Journal of the Royal Statistical Society, Series B, 59, 731–792
Sanchez, B. N., Budtz-Jørgensen, E., Ryan, L. M. and Hu, H. (2005). Structural equation models: a review with applications to environmental epidemiology. Journal of the American Statistical Association, 100, 1443–1455
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461–464
Shi, J. Q. and Lee, S. Y. (2000). Latent variable models with mixed continuous and polytomous data. Journal of the Royal Statistical Society, Series B, 62, 77–87
Song, X. Y. and Lee, S. Y. (2004). Bayesian analysis of two-level nonlinear structural equation models with continuous and polytomous data. British Journal of Mathematical and Statistical Psychology, 57, 29–52
Song, X. Y. and Lee, S. Y. (2005). A multivariate probit latent variable model for analyzing dichotomous responses. Statistica Sinica, 15, 645–664
Song, X. Y. and Lee, S. Y. (2006a). Model comparison of generalized linear mixed models. Statistics in Medicine, 25, 1685–1698
Song, X. Y. and Lee, S. Y. (2006b). Bayesian analysis of latent variable models with non-ignorable missing outcomes from exponential family. Statistics in Medicine, 26, 681–693
Spiegelhalter, D. J., Best, N. G., Carlin, B. P. and van der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B, 64, 583–639
Spiegelhalter, D. J., Thomas, A., Best, N. G. and Lunn, D. (2003). WinBUGS User Manual. Version 1.4. Cambridge, England: MRC Biostatistics Unit
Tanner, M. A. and Wong, W. H. (1987). The calculation of posterior distributions by data augmentation (with discussion). Journal of the American Statistical Association, 82, 528–550
Acknowledgments
This research is fully supported by two grants (CUHK 404507 and 450607) from the Research Grant Council of the Hong Kong Special Administrative Region, and a direct grant from the Chinese University of Hong Kong (Project ID 2060278). The authors are indebted to Dr. John C. K. Lee, Faculty of Education, The Chinese University of Hong Kong, for providing the data in the application.
Appendix: Full Conditional Distributions
The conditional distributions required by the Gibbs sampler in the posterior simulation of the integrated model are presented in this appendix. We use p(· | ·) to denote a conditional distribution when the context is clear, and note that U = (Y_obs, W_obs, U_mis, O).
(i) p(V | θ, α, Y_obs, W_obs, U_mis, Ω1, Ω2, O) = p(V | θ, U, Ω1, Ω2): This conditional distribution is equal to a product of p(v_g | θ, U_g, Ω1g, ω2g) with g = 1, …, G. The gth term in this product is distributed as \(N\left[ \mu _g^*, \mathbf{\Sigma} _g^* \right]\), where

$$\mu _g^* = \mathbf{\Sigma} _g^* \left\{ \mathbf{\Psi} _1^{-1} \sum\limits_{i = 1}^{N_g} \left[ \mathbf{u}_{gi} - \mathbf{A}_1 \mathbf{c}_{ugi} - \mathbf{\Lambda} _1 \omega _{1gi} \right] + \mathbf{\Psi} _2^{-1} \left[ \mathbf{A}_2 \mathbf{c}_{vg} + \mathbf{\Lambda} _2 \omega _{2g} \right] \right\}, \quad \text{and} \quad \mathbf{\Sigma} _g^* = \left( N_g \mathbf{\Psi} _1^{-1} + \mathbf{\Psi} _2^{-1} \right)^{-1}.$$((31))
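As a concrete illustration of the normal full conditional in (31), the following Python sketch assembles μ_g* and Σ_g* for one group. The chapter provides no code, so the function and array names are illustrative assumptions (groups stacked row-wise, NumPy conventions):

```python
import numpy as np

def v_g_full_conditional(u, c_u, omega1, A1, Lam1, Psi1, A2, c_v, Lam2, omega2, Psi2):
    """Mean and covariance of the normal full conditional of v_g in (31).

    u: (N_g, p) first-level observations for group g.
    c_u: (N_g, r1) covariates; omega1: (N_g, q1) first-level latent vectors.
    Psi1, Psi2: (p, p) residual covariance matrices at the two levels.
    All names are illustrative, not taken from the chapter.
    """
    Psi1_inv = np.linalg.inv(Psi1)
    Psi2_inv = np.linalg.inv(Psi2)
    N_g = u.shape[0]
    # Sum over i of the first-level residuals u_gi - A1 c_ugi - Lam1 omega_1gi
    resid_sum = (u - c_u @ A1.T - omega1 @ Lam1.T).sum(axis=0)
    # Sigma_g* = (N_g Psi1^{-1} + Psi2^{-1})^{-1}
    Sigma_star = np.linalg.inv(N_g * Psi1_inv + Psi2_inv)
    # mu_g* = Sigma_g* { Psi1^{-1} sum_i [...] + Psi2^{-1} [A2 c_vg + Lam2 omega_2g] }
    mu_star = Sigma_star @ (Psi1_inv @ resid_sum + Psi2_inv @ (A2 @ c_v + Lam2 @ omega2))
    return mu_star, Sigma_star
```

A draw of v_g is then `np.random.default_rng().multivariate_normal(mu_star, Sigma_star)`.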
(ii) p(Ω1 | θ, α, Y_obs, W_obs, U_mis, Ω2, V, O) = p(Ω1 | θ, U, Ω2, V) \( = \mathop \prod \limits_{g = 1}^G \mathop \prod \limits_{i = 1}^{N_g} \) p(ω1gi | θ, v_g, ω2g, u_gi), where p(ω1gi | θ, v_g, ω2g, u_gi) is proportional to

$$\begin{aligned}\exp \Big[ -\frac{1}{2} \Big\{ \xi _{1gi}^T \mathbf{\Phi} _1^{-1} \xi _{1gi} &+ \left[ \mathbf{u}_{gi} - \mathbf{v}_g - \mathbf{A}_1 \mathbf{c}_{ugi} - \mathbf{\Lambda} _1 \omega _{1gi} \right]^T \mathbf{\Psi} _1^{-1} \left[ \mathbf{u}_{gi} - \mathbf{v}_g - \mathbf{A}_1 \mathbf{c}_{ugi} - \mathbf{\Lambda} _1 \omega _{1gi} \right] \\ &+ \left[ \eta _{1gi} - \mathbf{B}_1 \mathbf{c}_{1gi} - \mathbf{\Pi} _1 \eta _{1gi} - \mathbf{\Gamma} _1 \mathbf{F}_1 \left( \xi _{1gi} \right) \right]^T \mathbf{\Psi} _{1\delta}^{-1} \left[ \eta _{1gi} - \mathbf{B}_1 \mathbf{c}_{1gi} - \mathbf{\Pi} _1 \eta _{1gi} - \mathbf{\Gamma} _1 \mathbf{F}_1 \left( \xi _{1gi} \right) \right] \Big\} \Big].\end{aligned}$$((32))
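Because (32) is known only up to a normalizing constant, ω_1gi is typically updated with a Metropolis-Hastings step inside the Gibbs sampler. The sketch below evaluates the log of (32) up to an additive constant as the MH target; it assumes ω_1gi = (η_1gi^T, ξ_1gi^T)^T, and all function and argument names are illustrative (the chapter gives no code):

```python
import numpy as np

def log_p_omega1gi(eta, xi, u, v, A1, c_u, Lam1, Psi1_inv,
                   B1, c1, Pi1, Gam1, F1, Psi1d_inv, Phi1_inv):
    """Log of (32) up to an additive constant, for an MH-within-Gibbs step.

    eta, xi: the two parts of omega_1gi; F1: callable giving F_1(xi).
    The *_inv arguments are precomputed inverse covariance matrices.
    """
    omega1 = np.concatenate([eta, xi])              # omega_1gi = (eta^T, xi^T)^T
    r1 = u - v - A1 @ c_u - Lam1 @ omega1           # measurement-equation residual
    r2 = eta - B1 @ c1 - Pi1 @ eta - Gam1 @ F1(xi)  # structural-equation residual
    return -0.5 * (xi @ Phi1_inv @ xi + r1 @ Psi1_inv @ r1 + r2 @ Psi1d_inv @ r2)
```

A random-walk proposal accepted with probability min{1, exp(log p(new) − log p(old))} completes the step.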
(iii) p(Ω2 | θ, α, Y_obs, W_obs, U_mis, Ω1, V, O): This distribution has a form very similar to that of p(Ω1 | ·) in (32), and hence is not presented to save space.
(iv) p(α, Y_obs | θ, W_obs, U_mis, Ω1, Ω2, V, O): To deal with situations with little or no prior information about these parameters, the following noninformative prior distribution is used: p(αk) = p(αk,2, …, αk,bk−1) ∝ C, k = 1, …, s, where C is a constant. As (αk, Y_g) is independent of (αk, Y_h) for g ≠ h, and Ψ1 is diagonal, we have

$$p\left( \alpha, \mathbf{Y} \mid \cdot \right) = \mathop \prod \limits_{g = 1}^G p\left( \alpha, \mathbf{Y}_g \mid \cdot \right) = \mathop \prod \limits_{g = 1}^G \mathop \prod \limits_{k = 1}^s p\left( \alpha _k, \mathbf{Y}_{gk} \mid \cdot \right),$$((33))

where Y_gk = [y_gk1, …, y_gkNg]. Let ψ1k be the kth diagonal element of Ψ1, v_gk the kth element of v_g, Λ1k^T the kth row of Λ1, and I_A(y) an indicator function with value 1 if y is in A and zero otherwise. Then p(α, Y | ·) can be obtained from (33) and

$$p\left( \alpha _k, \mathbf{Y}_{gk} \mid \cdot \right) \propto \mathop \prod \limits_{i = 1}^{N_g} \Phi ^* \left\{ \psi _{1k}^{-1/2} \left[ y_{gki} - v_{gk} - \mathbf{\Lambda} _{1k}^T \omega _{1gi} \right] \right\} I_{(\alpha _{k, z_{gki}}, \alpha _{k, z_{gki} + 1}]} \left( y_{gki} \right).$$((34))

(v) For the dichotomous data, the full conditional of each underlying continuous measurement w_gik,obs is a truncated normal distribution:

$$p\left( w_{gik,\text{obs}} \mid \theta, \omega _{1gi}, d_{gik,\text{obs}} \right) \sim \left\{ \begin{array}{ll} N\left[ \mathbf{\Lambda} _{1k}^T \omega _{1gi}, \psi _{1k} \right] I_{\left( -\infty, 0 \right)} \left( w_{gik,\text{obs}} \right), & \text{if } d_{gik,\text{obs}} = 0, \\ N\left[ \mathbf{\Lambda} _{1k}^T \omega _{1gi}, \psi _{1k} \right] I_{\left( 0, \infty \right)} \left( w_{gik,\text{obs}} \right), & \text{if } d_{gik,\text{obs}} = 1, \end{array} \right.$$((35))

where n_g,k is the number of d_gki,obs in D_k,obs and D_k,obs is the kth row of D_obs.
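The truncated normal draws required by (35) can be generated with standard routines. A minimal Python sketch for the scalar case using `scipy.stats.truncnorm` follows; the function name and arguments are illustrative (here `mean` plays the role of Λ_1k^T ω_1gi and `psi` of ψ_1k):

```python
import numpy as np
from scipy.stats import truncnorm

def sample_w_obs(mean, psi, d, rng=None):
    """Draw w_gik,obs given the dichotomous indicator d_gik,obs as in (35):
    N[mean, psi] truncated to (-inf, 0) if d == 0, and to (0, inf) if d == 1.
    truncnorm expects bounds standardized by loc/scale.
    """
    sd = np.sqrt(psi)
    if d == 0:
        a, b = -np.inf, (0.0 - mean) / sd   # truncate to (-inf, 0)
    else:
        a, b = (0.0 - mean) / sd, np.inf    # truncate to (0, inf)
    return truncnorm.rvs(a, b, loc=mean, scale=sd, random_state=rng)
```

In a full sampler this draw is vectorized over i, g, and the dichotomous components k.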
(vi) For the missing data u_gi,mis,

$$\left[ \mathbf{u}_{gi,\text{mis}} \mid \theta, \omega _{1gi} \right] \mathop = \limits^D N\left[ \mathbf{v}_g + \mathbf{A}_{1i,\text{mis}} \mathbf{c}_{ugi} + \mathbf{\Lambda} _{1i,\text{mis}} \omega _{1gi}, \mathbf{\Psi} _{1i,\text{mis}} \right],$$((36))

where A_1i,mis and Λ_1i,mis are submatrices of A_1 and Λ_1 with the rows that correspond to observed components deleted, and Ψ_1i,mis is a submatrix of Ψ_1 with the appropriate rows and columns deleted.

(vii) p(θ | α, Y_obs, W_obs, U_mis, Ω1, Ω2, V, O) = p(θ | U, Ω1, Ω2): Let θ1 be the vector of unknown parameters in A_1, Λ_1, and Ψ_1; θ1ω the vector of unknown parameters in Π_1, Γ_1, Φ_1, and Ψ_1δ; θ2 the vector of unknown parameters in A_2, Λ_2, and Ψ_2; and θ2ω the vector of unknown parameters in Π_2, Γ_2, Φ_2, and Ψ_2δ.
For θ1, the following commonly used conjugate type prior distributions are used: for k = 1, …, s,

$$p\left( \psi _{1k}^{-1} \right) \mathop = \limits^D \text{Gamma}\left[ \alpha _{01\epsilon k}, \beta _{01\epsilon k} \right], \quad p\left( \mathbf{\Lambda} _{1k} \mid \psi _{1k} \right) \mathop = \limits^D N\left[ \mathbf{\Lambda} _{01k}, \psi _{1k} \mathbf{H}_{01yk} \right], \quad p\left( \mathbf{A}_{1k} \mid \psi _{1k} \right) \mathop = \limits^D N\left[ \mathbf{A}_{01k}, \psi _{1k} \mathbf{H}_{01k} \right],$$((37))

where A_1k^T and Λ_1k^T are the row vectors that contain the unknown parameters in the kth rows of A_1 and Λ_1, respectively; α_01εk, β_01εk, A_01k, Λ_01k, H_01k, and H_01yk are given hyperparameter values. For k ≠ h, it is assumed that (ψ1k, Λ1k) and (ψ1h, Λ1h) are independent. Let U* = {u_gi − v_g − A_1 c_ugi; i = 1, …, N_g, g = 1, …, G}, U*_k^T be the kth row of U*, and Ω1 = {ω1gi; i = 1, …, N_g, g = 1, …, G}. Further, let C_u = {c_ugi; i = 1, …, N_g, g = 1, …, G}, Ũ = {u_gi − v_g − Λ_1 ω1gi; i = 1, …, N_g, g = 1, …, G}, Ũ_k^T be the kth row of Ũ, \(\tilde{\mathbf{\Sigma}} _{1k} = \left( \mathbf{H}_{01k}^{-1} + \mathbf{C}_u \mathbf{C}_u^T \right)^{-1}\), and \(\tilde{\mathbf{m}}_{1k} = \tilde{\mathbf{\Sigma}} _{1k} \left( \mathbf{H}_{01k}^{-1} \mathbf{A}_{01k} + \mathbf{C}_u \tilde{\mathbf{U}}_k \right)\). The resulting full conditional distributions are

$$\left[ \psi _{1k}^{-1} \mid \cdot \right] \mathop = \limits^D \text{Gamma}\left[ \frac{N}{2} + \alpha _{01\epsilon k}, \beta _{1k} \right], \quad \left[ \mathbf{\Lambda} _{1k} \mid \psi _{1k}, \cdot \right] \mathop = \limits^D N\left[ \mathbf{m}_{1k}, \psi _{1k} \mathbf{\Sigma} _{1k} \right], \quad \left[ \mathbf{A}_{1k} \mid \psi _{1k}, \cdot \right] \mathop = \limits^D N\left[ \tilde{\mathbf{m}}_{1k}, \psi _{1k} \tilde{\mathbf{\Sigma}} _{1k} \right],$$((38))

where N = N_1 + ⋯ + N_G, \(\mathbf{\Sigma} _{1k} = \left( \mathbf{H}_{01yk}^{-1} + \mathbf{\Omega} _1 \mathbf{\Omega} _1^T \right)^{-1}\), \(\mathbf{m}_{1k} = \mathbf{\Sigma} _{1k} \left( \mathbf{H}_{01yk}^{-1} \mathbf{\Lambda} _{01k} + \mathbf{\Omega} _1 \mathbf{U}_k^* \right)\), and \(\beta _{1k} = \beta _{01\epsilon k} + \frac{1}{2} \left( \mathbf{U}_k^{*T} \mathbf{U}_k^* - \mathbf{m}_{1k}^T \mathbf{\Sigma} _{1k}^{-1} \mathbf{m}_{1k} + \mathbf{\Lambda} _{01k}^T \mathbf{H}_{01yk}^{-1} \mathbf{\Lambda} _{01k} \right)\).
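A single Gibbs step for one row of (Λ_1, Ψ_1) under the conjugate structure above can be sketched in Python. This is a generic normal-Gamma conjugate update of the form used here, not code from the chapter; names, shapes (Ω_1 stored as a q × N matrix), and the zero-based indexing are illustrative assumptions:

```python
import numpy as np

def sample_lambda_psi(Omega1, U_star_k, Lam0, H0y_inv, alpha0, beta0, rng):
    """One draw of (psi_1k, Lambda_1k) from a normal-Gamma conjugate update.

    Omega1: (q, N) matrix of latent vectors pooled over groups and subjects.
    U_star_k: (N,) kth row of the residual matrix U*.
    Lam0, H0y_inv, alpha0, beta0: prior hyperparameters (illustrative names).
    """
    # Posterior covariance factor and mean: Sigma = (H^{-1} + Omega Omega^T)^{-1}
    Sigma = np.linalg.inv(H0y_inv + Omega1 @ Omega1.T)
    m = Sigma @ (H0y_inv @ Lam0 + Omega1 @ U_star_k)
    N = U_star_k.shape[0]
    # Gamma rate from the completion-of-squares identity
    beta = beta0 + 0.5 * (U_star_k @ U_star_k
                          - m @ np.linalg.inv(Sigma) @ m
                          + Lam0 @ H0y_inv @ Lam0)
    psi_inv = rng.gamma(shape=N / 2 + alpha0, scale=1.0 / beta)
    psi = 1.0 / psi_inv
    Lam = rng.multivariate_normal(m, psi * Sigma)   # [Lambda_1k | psi_1k, .]
    return psi, Lam
```

The analogous update for A_1k swaps in C_u and Ũ_k for Ω_1 and U*_k.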
For θ1ω, it is assumed that Φ1 is independent of (Λ1ω, Ψ1δ), where Λ1ω = \(\left( \mathbf{B}_1^T, \mathbf{\Pi} _1^T, \mathbf{\Gamma} _1^T \right)^T\). Also, (Λ1ωk, ψ1δk) and (Λ1ωh, ψ1δh) are independent for k ≠ h, where Λ1ωk and ψ1δk are the kth row and kth diagonal element of Λ1ω and Ψ1δ, respectively. The associated prior distribution of Φ1 is \(p\left( \mathbf{\Phi} _1^{-1} \right) \mathop = \limits^D W\left[ \mathbf{R}_{01}, \rho _{01}, q_{12} \right]\), where W[∙, ∙, q12] denotes the q12-dimensional Wishart distribution, and ρ01 and the positive definite matrix R_01 are given hyperparameters. Moreover, the prior distributions of ψ1δk and Λ1ωk are

$$p\left( \psi _{1\delta k}^{-1} \right) \mathop = \limits^D \text{Gamma}\left[ \alpha _{01\delta k}, \beta _{01\delta k} \right], \quad p\left( \mathbf{\Lambda} _{1\omega k} \mid \psi _{1\delta k} \right) \mathop = \limits^D N\left[ \mathbf{\Lambda} _{01\omega k}, \psi _{1\delta k} \mathbf{H}_{01\omega k} \right],$$((39))

where α_01δk, β_01δk, Λ_01ωk, and H_01ωk are given hyperparameters. Let E_1 = {(η_1g1, …, η_1gNg); g = 1, …, G}, E_1k^T be the kth row of E_1, Ξ_1 = {(ξ_1g1, …, ξ_1gNg); g = 1, …, G}, and \(\mathbf{F}_1^* = \left\{ \left( \mathbf{F}_1^* \left( \xi _{1g1} \right), \ldots, \mathbf{F}_1^* \left( \xi _{1gN_g} \right) \right); g = 1, \ldots, G \right\}\), in which \(\mathbf{F}_1^* \left( \xi _{1gi} \right) = \left( \eta _{1gi}^T, \mathbf{F}_1 \left( \xi _{1gi} \right)^T \right)^T\), i = 1, …, N_g. It can be shown that, for k = 1, …, q11,

$$\left[ \psi _{1\delta k}^{-1} \mid \cdot \right] \mathop = \limits^D \text{Gamma}\left[ \frac{N}{2} + \alpha _{01\delta k}, \beta _{1\delta k} \right], \quad \left[ \mathbf{\Lambda} _{1\omega k} \mid \psi _{1\delta k}, \cdot \right] \mathop = \limits^D N\left[ \mathbf{m}_{1\omega k}, \psi _{1\delta k} \mathbf{\Sigma} _{1\omega k} \right],$$((40))

where \(\mathbf{\Sigma} _{1\omega k} = \left( \mathbf{H}_{01\omega k}^{-1} + \mathbf{F}_1^* \mathbf{F}_1^{*T} \right)^{-1}\), \(\mathbf{m}_{1\omega k} = \mathbf{\Sigma} _{1\omega k} \left( \mathbf{H}_{01\omega k}^{-1} \mathbf{\Lambda} _{01\omega k} + \mathbf{F}_1^* \mathbf{E}_{1k} \right)\), and \(\beta _{1\delta k} = \beta _{01\delta k} + \frac{1}{2} \left( \mathbf{E}_{1k}^T \mathbf{E}_{1k} - \mathbf{m}_{1\omega k}^T \mathbf{\Sigma} _{1\omega k}^{-1} \mathbf{m}_{1\omega k} + \mathbf{\Lambda} _{01\omega k}^T \mathbf{H}_{01\omega k}^{-1} \mathbf{\Lambda} _{01\omega k} \right)\). Letting IW[∙, ∙] denote the inverted Wishart distribution, the conditional distribution relating to Φ1 is given by

$$\left[ \mathbf{\Phi} _1 \mid \cdot \right] \mathop = \limits^D IW\left[ \sum\limits_{g = 1}^G \sum\limits_{i = 1}^{N_g} \xi _{1gi} \xi _{1gi}^T + \mathbf{R}_{01}^{-1}, N + \rho _{01} \right].$$((41))
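The inverted Wishart draw for Φ_1 is available directly in SciPy. The sketch below is a generic implementation of this standard conjugate update, with illustrative names; note that `scipy.stats.invwishart` is parameterized by degrees of freedom `df` and a `scale` matrix:

```python
import numpy as np
from scipy.stats import invwishart

def sample_phi1(Xi1, R01, rho01):
    """Draw Phi_1 from its inverse-Wishart full conditional.

    Xi1: (q12, N) matrix whose columns are the xi_1gi vectors pooled over
    all groups g and subjects i (an assumed storage convention).
    R01, rho01: prior hyperparameters of the Wishart prior on Phi_1^{-1}.
    """
    N = Xi1.shape[1]
    # Scale matrix: R_01^{-1} + sum over g, i of xi_1gi xi_1gi^T
    scale = np.linalg.inv(R01) + Xi1 @ Xi1.T
    return invwishart.rvs(df=rho01 + N, scale=scale)
```

The draw for Φ_2 in the between-group model has the same form with the group-level quantities substituted.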
Conditional distributions involved in θ2 are derived similarly, on the basis of independent conjugate type prior distributions for k = 1, …, p that parallel those for θ1, where A_2k^T and Λ_2k^T are the vectors that contain the unknown parameters in the kth rows of A_2 and Λ_2, respectively, and α_02εk, β_02k, A_02k, Λ_02k, H_02k, and H_02yk are given hyperparameters.
Similarly, conditional distributions involved in θ2ω are derived on the basis of conjugate type prior distributions for k = 1, …, q21 that parallel those for θ1ω, where Λ2ω = \(\left( \mathbf{B}_2^T, \mathbf{\Pi} _2^T, \mathbf{\Gamma} _2^T \right)^T\) and Λ2ωk is the vector that contains the unknown parameters in the kth row of Λ2ω. As these conditional distributions are similar to those in (38), (40), and (41), they are not presented to save space.
Copyright information
© 2008 Springer Science+Business Media, LLC
Cite this chapter
Lee, SY., Song, XY. (2008). Bayesian Model Comparison of Structural Equation Models. In: Dunson, D.B. (eds) Random Effect and Latent Variable Model Selection. Lecture Notes in Statistics, vol 192. Springer, New York, NY. https://doi.org/10.1007/978-0-387-76721-5_6
Print ISBN: 978-0-387-76720-8
Online ISBN: 978-0-387-76721-5