On tests and indices for evaluating structural models

https://doi.org/10.1016/j.paid.2006.09.024

Abstract

Eight recommendations are given for the improved reporting of research based on structural equation modeling. These recommendations differ substantially from those offered by Prof. Barrett in this issue, especially with regard to the virtues and limitations of current statistical methods.

Introduction

Professor Barrett makes many wise and perceptive observations in his discussion of model fit, and I agree with much of what he says, e.g., that investigators inappropriately ignore the test of model fit and that there are virtues to cross-validation. Yet I also disagree with certain points, e.g., his recommendation to ban all fit indices. I will give my own recommendations on how a structural equation model (SEM) should be submitted to, and reported in, a journal, and compare these to Professor Barrett’s. See also McDonald and Ho (2002).

Section snippets

My Recommendations vs. Barrett’s

  1. When submitting a manuscript (ms) with an SEM, an author should submit a separate statement verifying, for each major model, (a) that every parameter in the model is purely a priori or, if not, (b) giving details on all model modifications that were made. This material should be sent to reviewers along with the ms.

  2. Every ms should provide summary statistics, where these exist, for evaluating assumptions made in the statistical analysis. Example: if using a normal theory statistic for …
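The normality check behind recommendation 2 can be sketched in a few lines. The statistics below (per-variable skewness and excess kurtosis, plus Mardia's multivariate kurtosis coefficient) are standard summaries for judging whether a normal-theory test statistic is defensible; the function names are mine, not from the original article.

```python
import numpy as np

def univariate_moments(X):
    """Per-variable skewness and excess kurtosis (population formulas).

    Both are approximately zero for normal data; large values warn that
    a normal-theory chi-square statistic may be untrustworthy.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    skew = (Z ** 3).mean(axis=0)
    kurt = (Z ** 4).mean(axis=0) - 3.0  # excess kurtosis
    return skew, kurt

def mardia_kurtosis(X):
    """Mardia's multivariate kurtosis coefficient b_{2,p}.

    Under multivariate normality its expected value is p * (p + 2),
    where p is the number of variables.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    d = X - X.mean(axis=0)
    S = d.T @ d / n                      # biased sample covariance
    m = np.einsum("ij,jk,ik->i", d, np.linalg.inv(S), d)
    return (m ** 2).mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))           # normal data: values near 0 / near 15
skew, kurt = univariate_moments(X)
b2p = mardia_kurtosis(X)
```

Reporting these alongside the model gives reviewers exactly the kind of assumption-evaluating summary statistics the recommendation asks for.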

On the sources of caution re: test statistics

Barrett notes that the chosen probability level (e.g., reject the model if p < .05) on a χ2 test is arbitrary (true: why not .10, or .024?), but “once that alpha level is set subjectively, … it becomes ‘exact’.” I disagree. As dozens of simulations across decades have shown, test statistics are not necessarily trustworthy (e.g., Curran et al., 1996; Hu et al., 1992). Even early proponents Jöreskog and Sörbom (1982, p. 408) had reservations about their overall goodness of fit test: “… we emphasize …
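The χ2 test under discussion is the standard normal-theory likelihood-ratio test of exact fit: T = (n − 1)·F_ML referred to a chi-square distribution with the model's degrees of freedom. A minimal sketch, assuming a model-implied covariance matrix is already in hand (a real one would come from an SEM program such as EQS):

```python
import numpy as np
from scipy.stats import chi2

def ml_chi_square(S, Sigma_hat, n, df):
    """Normal-theory likelihood-ratio test of exact fit.

    S         : sample covariance matrix (p x p)
    Sigma_hat : model-implied covariance matrix (p x p)
    n         : sample size; df : model degrees of freedom
    Returns the test statistic T = (n - 1) * F_ML and its p-value.
    """
    p = S.shape[0]
    F_ml = (np.log(np.linalg.det(Sigma_hat)) - np.log(np.linalg.det(S))
            + np.trace(S @ np.linalg.inv(Sigma_hat)) - p)
    T = (n - 1) * F_ml
    return T, chi2.sf(T, df)

# Degenerate check: if the implied matrix reproduces S exactly,
# F_ML is zero, so T = 0 and the p-value is 1.
S = np.array([[1.0, 0.3],
              [0.3, 1.0]])
T, pval = ml_chi_square(S, S.copy(), n=200, df=1)
```

The simulation literature cited above concerns exactly this T: its chi-square reference distribution can fail badly under nonnormality or misspecification, which is the source of the caution.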

Approximate fit tests

If exact fit tests are not so exact, there may be a role for approximate fit. In discussing this, Barrett speaks negatively about recent work that tried to provide simulation-based guidance about the behavior of fit indices under null and non-null conditions. It seems to me helpful to know which indices are relatively insensitive to sample size, are sensitive to model misspecification, etc., even if the best recent research is not definitive. Perhaps SRMR needs little further research, since it …
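The SRMR mentioned above has a simple closed form: the root mean of the squared standardized residuals between the sample and model-implied covariances, averaged over the p(p + 1)/2 non-redundant elements. A sketch, with a hypothetical two-variable example:

```python
import numpy as np

def srmr(S, Sigma_hat):
    """Standardized root mean squared residual.

    Each residual (s_ij - sigma_ij) is standardized by
    sqrt(s_ii * s_jj), then squared residuals are averaged over the
    lower triangle (including the diagonal) and square-rooted.
    """
    p = S.shape[0]
    scale = np.sqrt(np.outer(np.diag(S), np.diag(S)))
    resid = (S - Sigma_hat) / scale
    idx = np.tril_indices(p)
    return np.sqrt(np.mean(resid[idx] ** 2))

# Correlation matrices, so standardization is trivial: the only
# nonzero residual is 0.5 - 0.4 = 0.1 among three unique elements.
S = np.array([[1.0, 0.5],
              [0.5, 1.0]])
Sigma_hat = np.array([[1.0, 0.4],
                      [0.4, 1.0]])
val = srmr(S, Sigma_hat)   # = 0.1 / sqrt(3)
```

Because it is just an average standardized residual, SRMR is directly interpretable on the correlation metric, which is part of why it arguably "needs little further research."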

References (21)

  • Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin.
  • Bentler, P. M. (1995). EQS structural equations program manual.
  • Bentler, P. M. (in press). EQS 6 structural equations program manual. Encino, CA: Multivariate...
  • Bentler, P. M., et al. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin.
  • Browne, M. W., et al. Alternative ways of assessing model fit.
  • Browne, M. W., et al. (2002). When fit indices and residuals are incompatible. Psychological Methods.
  • Curran, P. J., et al. (1996). The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis. Psychological Methods.
  • de Leeuw, J. Model selection in multinomial experiments.
  • Hu, L.-T., et al. Evaluating model fit.
  • Hu, L., et al. (1992). Can test statistics in covariance structure analysis be trusted? Psychological Bulletin.
There are more references available in the full text version of this article.

Cited by (662)


Supported in part by National Institute on Drug Abuse Grants DA00017 and DA01070.
