Abstract
The general use of coefficient alpha to assess reliability should be discouraged on a number of grounds. The assumptions underlying coefficient alpha are unlikely to hold in practice, and violation of these assumptions can result in nontrivial negative or positive bias. Structural equation modeling is discussed as an informative process both to assess the assumptions underlying coefficient alpha and to estimate reliability.
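The direction of the bias mentioned above can be illustrated with a small population-level sketch. It is not taken from the article: the one-factor model with unit factor variance, the loadings, the error variances, the single error covariance, and all function names are assumptions chosen for illustration. Reliability is taken here to be the true-score variance of the summed score divided by its total variance, with any correlated error counted as error.

```python
def coefficient_alpha(cov):
    """Coefficient alpha from a k x k item covariance matrix (list of lists)."""
    k = len(cov)
    total_var = sum(sum(row) for row in cov)      # variance of the summed score
    item_var = sum(cov[i][i] for i in range(k))   # sum of the item variances
    return k / (k - 1) * (1 - item_var / total_var)


def one_factor_cov(loadings, error_vars, error_covs=None):
    """Covariance matrix implied by a one-factor model (factor variance 1).

    error_covs is a hypothetical input format: {(i, j): covariance}.
    """
    k = len(loadings)
    cov = [[loadings[i] * loadings[j] for j in range(k)] for i in range(k)]
    for i in range(k):
        cov[i][i] += error_vars[i]
    for (i, j), c in (error_covs or {}).items():
        cov[i][j] += c
        cov[j][i] += c
    return cov


def sum_score_reliability(loadings, cov):
    """True-score variance of the sum divided by its total variance."""
    total_var = sum(sum(row) for row in cov)
    return sum(loadings) ** 2 / total_var


def compare(loadings, error_vars, error_covs=None):
    """Return (alpha, reliability) for one assumed population model."""
    cov = one_factor_cov(loadings, error_vars, error_covs)
    return coefficient_alpha(cov), sum_score_reliability(loadings, cov)


# Equal loadings, uncorrelated errors: alpha matches reliability.
tau_equivalent = compare([0.7] * 4, [0.51] * 4)

# Unequal loadings (congeneric items): alpha falls below reliability.
congeneric = compare([0.9, 0.7, 0.5, 0.3], [0.19, 0.51, 0.75, 0.91])

# Equal loadings plus one positive error covariance: alpha exceeds reliability.
correlated_errors = compare([0.7] * 4, [0.51] * 4, {(0, 1): 0.2})

for label, (alpha, rel) in [
    ("tau-equivalent", tau_equivalent),
    ("congeneric", congeneric),
    ("correlated errors", correlated_errors),
]:
    print(f"{label}: alpha={alpha:.3f}, reliability={rel:.3f}")
```

Because the sketch works with population covariance matrices, the differences it shows are bias in the strict sense rather than sampling error. With sample data, the loadings and error terms would have to be estimated, for example with a fitted structural equation model.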
Cite this article
Green, S.B., Yang, Y. Commentary on Coefficient Alpha: A Cautionary Tale. Psychometrika 74, 121–135 (2009). https://doi.org/10.1007/s11336-008-9098-4