
Bayesian Selection on the Number of Factors in a Factor Analysis Model

Published in Behaviormetrika.

Abstract

This paper considers a Bayesian approach for selecting the number of factors in a factor analysis model with continuous and polytomous variables. A procedure for computing the key statistic in model selection, namely the Bayes factor, is developed via path sampling. The main computational effort lies in simulating observations from the appropriate posterior distribution. This task is accomplished by a hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm. Bayesian estimates of thresholds, factor loadings, unique variances, and latent factor scores, as well as their standard errors, can be produced as by-products. The empirical performance of the proposed procedure is illustrated by means of a simulation study and a real example.
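To illustrate the idea behind the path-sampling computation described above, the following is a minimal sketch, not the paper's procedure: it estimates a log ratio of normalizing constants (the quantity underlying a log Bayes factor) by thermodynamic integration on a Gaussian toy problem where the answer is known exactly. The bridging family, grid size, and the use of exact draws (in place of the paper's Gibbs/Metropolis-Hastings sampler) are all simplifying assumptions for illustration.

```python
import math
import random

# Path sampling (thermodynamic integration) toy example.
# Bridge q_t(theta) = exp(-0.5 * lam(t) * theta^2) links a N(0,1) kernel
# (t = 0) to a N(0, SIGMA^2) kernel (t = 1) along a geometric path.
# The exact log ratio of normalizing constants is log(SIGMA).
SIGMA = 2.0

def lam(t):
    # Precision of the bridging density q_t along the geometric path.
    return (1.0 - t) + t / SIGMA**2

def u_bar(t, n_draws, rng):
    # Monte Carlo estimate of E_t[ d/dt log q_t(theta) ].
    # Here d/dt log q_t(theta) = 0.5 * theta^2 * (1 - 1/SIGMA^2).
    # In the paper's setting, theta would be simulated by a hybrid
    # Gibbs / Metropolis-Hastings sampler; q_t is Gaussian here, so
    # we can draw from it exactly instead.
    sd = 1.0 / math.sqrt(lam(t))
    c = 0.5 * (1.0 - 1.0 / SIGMA**2)
    return sum(c * rng.gauss(0.0, sd) ** 2 for _ in range(n_draws)) / n_draws

def path_sampling_log_ratio(n_grid=21, n_draws=20000, seed=0):
    # Trapezoidal rule over a grid of t values in [0, 1]:
    # log(z1/z0) = integral over t of E_t[ d/dt log q_t(theta) ].
    rng = random.Random(seed)
    ts = [i / (n_grid - 1) for i in range(n_grid)]
    us = [u_bar(t, n_draws, rng) for t in ts]
    h = ts[1] - ts[0]
    return h * (0.5 * us[0] + sum(us[1:-1]) + 0.5 * us[-1])

log_ratio_est = path_sampling_log_ratio()
# The estimate should be close to log(SIGMA) = log(2), about 0.693.
```

The same scheme applies to Bayes-factor computation by taking the two endpoint densities to be the (unnormalized) posteriors of the competing models, with the MCMC sampler supplying the draws at each grid point.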


References

  • Akaike, H. (1970). Statistical predictor identification. Annals of the Institute of Statistical Mathematics, 22, 203–217.

  • Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52, 317–332.

  • Aitkin, M., Anderson, D., & Hinde, J. (1981). Statistical modelling of data on teaching styles (with discussion). Journal of the Royal Statistical Society, Series A, 144, 419–461.

  • Arminger, G. & Muthén, B. O. (1998). A Bayesian approach to nonlinear latent variable models using the Gibbs sampler and the Metropolis-Hastings algorithm. Psychometrika, 63, 271–300.

  • Berger, J. O. (1985). Statistical decision theory and Bayesian analysis. New York: Springer-Verlag.

  • Carlin, B. & Chib, S. (1995). Bayesian model choice via Markov chain Monte Carlo. Journal of the Royal Statistical Society, Series B, 57, 473–484.

  • Chib, S. (1995). Marginal likelihood from the Gibbs output. Journal of the American Statistical Association, 90, 1313–1321.

  • Cowles, M. K. (1996). Accelerating Monte Carlo Markov chain convergence for cumulative-link generalized linear models. Statistics and Computing, 6, 101–111.

  • Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B, 39, 1–38.

  • DiCiccio, T. J., Kass, R. E., Raftery, A., & Wasserman, L. (1997). Computing Bayes factors by combining simulation and asymptotic approximations. Journal of the American Statistical Association, 92, 903–915.

  • Gelfand, A. E. & Dey, D. K. (1994). Bayesian model choice: asymptotic and exact calculations. Journal of the Royal Statistical Society, Series B, 56, 501–514.

  • Gelman, A. & Meng, X. L. (1998). Simulating normalizing constants: From importance sampling to bridge sampling to path sampling. Statistical Science, 13, 163–185.

  • Gelman, A., Meng, X. L., & Stern, H. (1996). Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 6, 733–807.

  • Geman, S. & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721–741.

  • George, E. I. & McCulloch, R. E. (1993). Variable selection via Gibbs sampling. Journal of the American Statistical Association, 88, 881–889.

  • Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82, 711–732.

  • Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57, 97–109.

  • Kass, R. E. & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773–795.

  • Lawley, D. N. & Maxwell, A. E. (1971). Factor analysis as a statistical method (2nd ed.). London: Butterworths.

  • Lee, S. Y. (1981). A Bayesian approach to confirmatory factor analysis. Psychometrika, 46, 153–160.

  • Lee, S. Y., Poon, W. Y., & Bentler, P. M. (1995). A two-stage estimation of structural equation models with continuous and polytomous variables. British Journal of Mathematical and Statistical Psychology, 48, 339–358.

  • Lee, S. Y. & Shi, J. Q. (2000). Joint Bayesian analysis of factor scores and structural parameters in the factor analysis model. Annals of the Institute of Statistical Mathematics, 52, 722–736.

  • Lee, S. Y. & Zhu, H. T. (2000). Statistical analysis of nonlinear structural equation models with continuous and polytomous data. British Journal of Mathematical and Statistical Psychology, 53, 209–232.

  • Lindsay, B. G. & Basak, P. (1993). Multivariate normal mixtures: A fast consistent method of moments. Journal of the American Statistical Association, 88, 468–476.

  • Liu, J. S. (1994). The collapsed Gibbs sampler in Bayesian computation with applications to a gene regulation problem. Journal of the American Statistical Association, 89, 958–966.

  • Martin, J. K. & McDonald, R. P. (1975). Bayesian estimation in unrestricted factor analysis: A treatment for Heywood cases. Psychometrika, 40, 505–517.

  • Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., & Teller, E. (1953). Equation of state calculations by fast computing machines. Journal of Chemical Physics, 21, 1087–1092.

  • McLachlan, G. J. (1987). On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. Applied Statistics, 36, 318–324.

  • McLachlan, G. J. & Basford, K. E. (1988). Mixture models: Inference and applications to clustering. New York: Marcel Dekker.

  • Meng, X. L. (1994). Posterior predictive p-values. The Annals of Statistics, 22, 1142–1160.

  • Meng, X. L. & Wong, H. W. (1996). Simulating ratios of normalizing constants via a simple identity: A theoretical exploration. Statistica Sinica, 6, 831–860.

  • Newton, M. A. & Raftery, A. E. (1994). Approximate Bayesian inference by the weighted likelihood bootstrap (with discussion). Journal of the Royal Statistical Society, Series B, 56, 3–48.

  • Ogata, Y. (1989). A Monte Carlo method for high dimensional integration. Numerische Mathematik, 55, 137–157.

  • Raftery, A. E. (1993). Bayesian model selection in structural equation models. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 163–180). Beverly Hills, CA: Sage.

  • Reboussin, B. A. & Liang, K. Y. (1998). An estimating equations approach for the LISCOMP model. Psychometrika, 63, 165–182.

  • Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461–464.

  • Shi, J. Q. & Lee, S. Y. (1998). Bayesian sampling-based approach for factor analysis model with continuous and polytomous data. British Journal of Mathematical and Statistical Psychology, 51, 233–252.

  • Shi, J. Q. & Lee, S. Y. (2000). Latent variable models with mixed continuous and polytomous data. Journal of the Royal Statistical Society, Series B, 62, 77–87.

  • Tanner, M. A. & Wong, W. H. (1987). The calculation of posterior distributions by data augmentation (with discussion). Journal of the American Statistical Association, 82, 528–550.

  • Wei, G. C. G. & Tanner, M. A. (1990). A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithm. Journal of the American Statistical Association, 85, 699–704.

  • World Values Survey, 1981–1984 and 1990–1993. (1994). ICPSR version. Ann Arbor, MI: Institute for Social Research [producer]; Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor].

  • Zhu, H. T. & Lee, S. Y. (2001). A Bayesian analysis of finite mixtures in the LISREL model. Psychometrika, 66, 133–152.


Author information


Corresponding author

Correspondence to Sik-Yum Lee.

Additional information

The research is fully supported by a grant (CUHK 4346/01H) from the Research Grants Council of the Hong Kong Special Administrative Region. We are indebted to the Editor for valuable comments.

About this article

Cite this article

Lee, SY., Song, XY. Bayesian Selection on the Number of Factors in a Factor Analysis Model. Behaviormetrika 29, 23–39 (2002). https://doi.org/10.2333/bhmk.29.23


