Exploring the structure and psychometric properties of the Warwick-Edinburgh Mental Well-Being Scale (WEMWBS) in a representative adult population sample

This article reports the psychometric properties of both the full and the abbreviated (Short) Warwick-Edinburgh Mental Well-being Scales (WEMWBS; SWEMWBS) in the Finnish general population. A large cross-sectional dataset (N = 5,335) was collected as part of the nationally representative FinHealth Study in 2017. Exploratory and confirmatory factor analyses evaluated one-, two-, three-, and bi-factor solutions with a split-half approach. McDonald's omega was used to assess internal consistency, and convergent validity was evaluated using four established mental health and well-being scales (BDI-6, GHQ-12, MHI-5, EUROHIS-QOL8). Contrary to previous findings, our results supported a three-factor model of the full scale with separate, yet highly correlated, mental, social, and eudaimonic well-being factors. For the SWEMWBS, the bi-factor model showed the best fit, with a strong general mental well-being factor and a weaker specific eudaimonic well-being factor. In this sampling context, the social aspect of mental well-being may be considered a separable construct from other mental well-being dimensions, and the shorter 7-item version might thus be a preferable option when assessing overall mental well-being.


Introduction
Mental well-being is a complex phenomenon covering hedonic (positive affect), eudaimonic (psychological functioning), and sometimes social (relationship) aspects of well-being (Keyes & Annas, 2009). The theoretical framework stems from earlier work by Diener et al. (1998) and Ryff and Keyes (1995), in which hedonic and eudaimonic well-being were considered separate constructs, although a more recent framework combines these two aspects (Keyes & Annas, 2009; Ryan & Deci, 2001). However, the role of social well-being is less clear. According to some theories, social well-being is either a part of eudaimonic well-being (Ryff & Keyes, 1995) or these two aspects are divided into intrapersonal and interpersonal well-being (Keyes, 2005). The current literature acknowledges the differences between hedonic, eudaimonic, and social well-being (Gallagher et al., 2009; Ryan & Deci, 2001), and the presence of all three is considered to embody a mental state that Keyes calls 'flourishing mental health' (Keyes, 2005). However, models encompassing either one (mental well-being), two (hedonic and eudaimonic), or three (hedonic, eudaimonic, and social well-being) mental well-being factors did not empirically differ greatly from each other, indicating that the various components of second-order constructs of well-being could be represented parsimoniously (Gallagher et al., 2009). To conclude, the exact roles and theoretical models of mental well-being (e.g., the role of social well-being) remain unclear, causing some disagreements and inconsistent results regarding the applied latent structure of well-being.
The Warwick-Edinburgh Mental Well-Being Scale (WEMWBS) was developed in 2005 in response to the urgent need for validated instruments assessing positive mental health at the population level (Tennant et al., 2007). Existing mental health questionnaires, such as the Depression-Happiness Scale (SDHS; Joseph et al., 2004) or the Scale of Psychological Well-Being (SPWB; Ryff & Keyes, 1995), were considered either too illness-centered or conceptualized mental well-being differently from updated theory (Tennant et al., 2007). The 14-item WEMWBS was designed to capture the fundamental aspects of mental well-being, including hedonic (e.g., 'feeling cheerful'), eudaimonic (e.g., 'been thinking clearly'), and some interpersonal relationship (e.g., 'feeling loved') items. The WEMWBS was based on the Affectometer 2 by Kammann and Flett (1983). Unlike the Affectometer 2, the WEMWBS is shorter and contains only positively worded items, and it does not suffer from ceiling or floor effects (Tennant et al., 2007). Later, an abbreviated 7-item version of the scale (SWEMWBS) was launched, containing predominantly eudaimonic items (Stewart-Brown et al., 2009). Both instruments have been validated in different populations and cohorts (e.g., Lang & Bachinger, 2017; Shannon et al., 2020; Smith et al., 2017), with most studies reporting a unidimensional factor structure, good face and content validity, high internal consistency, and excellent test-retest reliability (e.g., Clarke et al., 2011; Stewart-Brown et al., 2009; Tennant et al., 2007).
Despite the breadth of research with the (S)WEMWBS, some studies have employed inconsistent and limited statistical methods in exploring its structure (e.g., principal component analysis, PCA) or suffer from overfitting, narrow sampling, or small sample sizes (Clarke et al., 2011; Dong et al., 2019; Haver et al., 2015; Houghton et al., 2017; Mavali et al., 2020). In general, PCA tends to overestimate factor loadings, since it does not account for unique variance, and it is not advised as a precursor to CFA (Costello & Osborne, 2005). Similarly, some studies have allowed multiple error terms to correlate without reporting or justifying the correlations (e.g., Clarke et al., 2011; Dong et al., 2019; Haver et al., 2015), or have excluded some items in order to improve the model fit or deal with weak single-item factor loadings. Adding too many correlated error terms leads to overfitting and a skewed interpretation of the latent construct (Hermida, 2015). Items that measure social well-being (4, 9, and 12) have proven to be the most problematic, and in some studies item 4 was dropped due to model misfit (Houghton et al., 2017; Mavali et al., 2020).
Only a few studies have reported anything other than a single-factor structure for the (S)WEMWBS (Azcurra, 2015; Lang & Bachinger, 2017; López et al., 2013). Azcurra (2015) found a two-factor model with separate but correlated emotional and psychological factors among Argentinian elderly people. Some studies have suggested a two-factor structure with a separate social factor based on exploratory analyses, but decided to report a single-factor structure instead (Dong et al., 2019; López et al., 2013; Ringdal et al., 2018). Recently, a bi-factor structure demonstrated excellent model fit, with one strong general mental well-being factor and three specific factors: positive affect, functioning, and personal relationships (Lang & Bachinger, 2017; Shannon et al., 2020). The bi-factor model adapted from Lang and Bachinger (2017) outperformed the one- and two-factor models of Tennant et al. (2007) and Azcurra (2015), but the explained variances of the specific factors were relatively low (Shannon et al., 2020). The most parsimonious description of the structure of the (S)WEMWBS seems to be a dominant main factor with some additional specific aspects, but the role and the impact of these factors remain unclear. However, to our knowledge no one has explored the theory-driven possibility of a three-factor structure in the WEMWBS, nor applied any model other than a one-factor model to investigate the factor structure of the SWEMWBS. Therefore, testing alternative models is required to confirm the latent structure of the (S)WEMWBS in general population samples.
The aim of the current investigation is to verify the latent factor structure of the original 14-item (WEMWBS) and the short 7-item (SWEMWBS) scales and their feasibility for measuring mental well-being among the adult Finnish population. Several exploratory and confirmatory factor analysis (EFA; CFA) models are tested to obtain a better operationalisation of mental well-being. Convergent validity and reliability are estimated to ensure confidence in the scales. Since the (S)WEMWBS is normally distributed, it can detect subtle changes in individuals who may be at risk of developing a mental illness, and thus the scale has great potential for future mental health research and diagnostics.

Participants and survey
The data were extracted from the nationally representative 2017 FinHealth Study, which followed the ethical principles of the Declaration of Helsinki for medical research involving human subjects and was approved by the Coordinating Ethics Committee at the Hospital District of Helsinki and Uusimaa (Borodulin & Sääksjärvi, 2019). One- and two-stage stratified random sampling from the 15 largest cities and rural areas in mainland Finland was used to gather the health data from individuals aged 18 years or older.
The data collection entailed two phases: an invitation letter and a self-reported health questionnaire (Phase I), and health examination measures (e.g., urine and blood samples) and a second self-administered health questionnaire (Phase II). The original sample size was 10,305 but after deletion of 54 ineligible cases (i.e., deaths, living abroad, unknown address), the updated sample size was 10,247. Just over half (58.1%, n = 5,952) of the individuals participated in the health examination, but 617 did not complete the second questionnaire relevant to the present study, yielding a final sample size of 5,335. The participation rate was thus 52.1%. For more detailed information regarding the data collection and sampling, please see the publication by Borodulin and Sääksjärvi (2019). The flow chart of the study plan is presented in Fig. 1.
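The reported rates can be reproduced from the counts given above. The following is a minimal arithmetic check that uses only the figures stated in the text; the variable names are illustrative.

```python
# Counts as reported in the text.
eligible = 10_247                 # sample after removing ineligible cases
examined = 5_952                  # attended the health examination
no_second_questionnaire = 617     # did not complete the second questionnaire

final_n = examined - no_second_questionnaire


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


examination_rate = pct(examined, eligible)
participation_rate = pct(final_n, eligible)

print(final_n, examination_rate, participation_rate)
```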
A total of 5,335 participants (55.8% female) were included in the analyses. The dataset was then split into exploratory (n = 2,668) and validation (n = 2,667) datasets by block randomization (block size 2) to re-evaluate the best factor models from the EFAs, as recommended previously (Cabrera-Nguyen, 2010; Costello & Osborne, 2005). The number of cases with all items missing was 44 and 47 in the exploratory dataset, and 22 and 25 in the validation dataset, for WEMWBS and SWEMWBS, respectively. Among the 5,335 respondents with any data, the level of WEMWBS missingness was negligible, being 0.8% of responses (Tables S1-4). No statistically significant differences in socio-demographic variables between the datasets were found (Table S5).
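As an illustration of the split described above, the sketch below assigns respondents to two halves using permuted blocks of size 2. It is not the authors' code: the seed, the sequential respondent IDs, and the rule that an unpaired final respondent goes to the exploratory half are assumptions made only so that the example is reproducible.

```python
import random


def block_randomize(ids, seed=0):
    """Split ids into two halves using permuted blocks of size 2.

    Within each consecutive pair, one member is randomly assigned to the
    exploratory half and the other to the validation half. An unpaired
    final id is placed in the exploratory half (an assumption).
    """
    rng = random.Random(seed)
    exploratory, validation = [], []
    for start in range(0, len(ids), 2):
        block = ids[start:start + 2]
        rng.shuffle(block)
        exploratory.append(block[0])
        if len(block) == 2:
            validation.append(block[1])
    return exploratory, validation


# Hypothetical respondent ids 1..5335.
exploratory, validation = block_randomize(list(range(1, 5336)))
print(len(exploratory), len(validation))
```

With 5,335 respondents this yields halves of 2,668 and 2,667, the same sizes as those reported in the text.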

(S)WEMWBS
The 14-item WEMWBS and its shorter 7-item version, the SWEMWBS (items 1, 2, 3, 6, 7, 9, and 11), consist of positively worded items using a 5-point frequency rating scale (1 = none of the time to 5 = all of the time), with higher scores indicating higher levels of mental well-being (Stewart-Brown et al., 2009; Tennant et al., 2007). For more information on the copyright statement, registering for commercial and non-commercial use of the (S)WEMWBS, other language versions or developing new translations, and scoring and interpretation of the scores, please visit the Warwick University Medical School's WEMWBS website: https://warwick.ac.uk/fac/sci/med/research/platform/wemwbs or contact wemwbslicence@warwick.ac.uk.

Analysis
The study follows the STROBE guideline for cross-sectional studies (STROBE, 2023). However, as the guideline does not contain detailed information on reporting factor analysis or self-report scale validation, an additional guideline by Cabrera-Nguyen (2010) is used where applicable.

Prerequisites of factor analysis
The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy, Bartlett's test of sphericity, item-level descriptives and correlations, along with tests of normality are estimated with SPSS version 28.0 (IBM Corp, 2021) to assess the feasibility of factor analysis.

Factor structure
Mplus version 8.8 (Muthén & Muthén, 2017) is used to conduct all factor analyses with the standard settings of the Weighted Least Square Mean and Variance adjusted (WLSMV) estimator and oblique Geomin rotation. WLSMV provides standard fit indices and is appropriate for ordinal indicators, outperforming the maximum likelihood estimator when the latent distributions are approximately normal (Flora & Curran, 2004). The oblique Geomin rotation allows factor intercorrelations and provides excellent and robust recovery of simple factor structure in both empirical and simulated data (Costello & Osborne, 2005; Hattori et al., 2017), and is also the default in Mplus (Muthén & Muthén, 2017). Note that the employed WLSMV algorithm uses all available pairwise data in estimating the polychoric correlation matrix, so cases with missing data are not lost. All item parameters are freely estimated, while the mean and variance of the latent factors are fixed to 0 and 1, respectively (Costello & Osborne, 2005). If there are not enough items to construct a latent factor with a minimum of three items, the item with the next highest loading is added to the model. In the present study, we do not exclude any items, since we want to explore all factor models (one-, two-, three-, and bi-factor) based on the theoretical framework of mental well-being. All models are cross-validated with the validation dataset. Simultaneously, an existing bi-factor model from Lang and Bachinger (2017) is compared to the exploratory models. Model fit indices, factor loadings, and the interpretation of the models (i.e., correspondence with the theoretical framework) are considered to determine the best model. A model is discarded if it is unclear or uninterpretable (e.g., high cross-loadings, or inconsistent with any of the proposed latent structures of mental well-being).

Exploratory dataset.
Ordinal EFA models are estimated with up to three independent factors for WEMWBS and two for SWEMWBS, while bi-factor models contain one common mental well-being factor and either one (SWEMWBS) or up to three (WEMWBS) specific factors. Once the factor structure is established with EFA, these models are refined with CFA. To further improve model fit, the residual error covariances between the items are freed iteratively based on the highest modification indices, until the change in CFI is smaller than 0.005. Error terms are allowed to correlate only if the relationship between the items is justified based on the theoretical, conceptual, or pragmatic interpretation of the latent structure, for instance, if the items belong to the same latent factor or the wording of the items is similar (Cabrera-Nguyen, 2010).

Validation dataset.
The bi-factor structure of WEMWBS from Lang and Bachinger (2017) is compared, using CFA, with the models derived from the exploratory dataset.

Internal consistency and convergent validity
McDonald's omega is reported to quantify the internal consistency (reliability; Reise, 2012). Values greater than .7 or .8 indicate good reliability (Cheung et al., 2023). In multi-factor models, discriminant validity between factors is investigated by examining their intercorrelations, factor loadings, and cross-loadings. Factor correlations above .80 indicate poor discriminant validity, whereas strong factor loadings that do not cross-load between the items indicate good validity (Cabrera-Nguyen, 2010; Cheung et al., 2023).
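For readers unfamiliar with the coefficient, the sketch below computes omega for a single-factor model from standardized loadings, using the common textbook form (squared sum of loadings divided by that quantity plus the summed residual variances). It assumes uncorrelated residuals, and the loadings are hypothetical values rather than estimates from this study; the values reported here come from the fitted Mplus models.

```python
def mcdonald_omega(loadings):
    """Omega total for a one-factor model.

    Assumes standardized loadings and uncorrelated residuals, so each
    residual variance equals 1 - loading**2.
    """
    common = sum(loadings) ** 2
    unique = sum(1 - loading ** 2 for loading in loadings)
    return common / (common + unique)


# Hypothetical standardized loadings for a 7-item scale.
example_loadings = [0.7] * 7
omega = mcdonald_omega(example_loadings)
print(round(omega, 3))
```

When residuals are allowed to correlate, as in some of the models considered here, the residual covariances also enter the denominator, so this simplified form would no longer apply directly.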

Results
Socio-demographic information of the study participants is presented in Table 1.
The KMO value of .956 and the Bartlett's Test of Sphericity p < .001 were both within the acceptable range (KMO 0.8-1.0, Bartlett's Test of Sphericity p < .05), indicating that the sampling is adequate even for linear factor analysis (Shrestha, 2021). The item-level descriptives and histograms indicated that the (S)WEMWBS is normally distributed and only slightly negatively skewed, with some items (2, 4, 7, and 9-13) having over 15% of responses in the highest category. No significant multicollinearity (range .381-.693) and no ceiling or floor effects were detected. The results are displayed in the Supplementary Materials (Tables S6-9; Figures S1-3).
The optimal factor structures for WEMWBS and SWEMWBS are presented in Fig. 2. The results are discussed below.

Factor structure -WEMWBS
The analyses based on the exploratory dataset are presented in Supplementary Tables S10-14. All analyses yielded acceptable SRMR (< .08) and CFI (> .95) values, but none of the models reached acceptable RMSEA values (< .06), even though the error terms of the first two items (1: optimistic about the future; 2: feeling useful) in the one-, two-, and three-factor models, and of items 9 and 12 in the one-factor model, were allowed to correlate. The factor loadings and interpretation of the models were acceptable, except for the bi-factor models. The bi-factor models showed either weak loadings on the specific factor (e.g., none of the factor loadings reached the .3 threshold; Table S10, EFA Bi-Factor 3) or item 9 was overpowering (factor loading > 1; Table S14, CFA Bi-factor 1 and 2). In the two- and three-factor models, items 4 (interested in other people), 9 (close to other people), and 12 (feeling loved) loaded on a separate 'social well-being' factor, whereas items 6 (dealing with problems well), 7 (thinking clearly), and 11 (able to make up my own mind about things) formed a separate 'eudaimonic well-being' factor. Since these factors conform to the theoretical framework of mental well-being, models with one (mental well-being), two (mental and social well-being), and three (hedonic, eudaimonic, and social well-being) factors were selected for evaluation with the validation dataset, along with the adapted bi-factor model from Lang and Bachinger (2017). The results are presented in Table 2.
The adapted model from Lang and Bachinger (2017) had the best model fit. However, the RMSEA value did not reach the predetermined level, and most factor loadings were relatively weak or even non-significant (items 8 and 13); the specific factors explained only 6% or less of the total variance. The three-factor model had the second-best model fit with strong factor loadings, and thus it might be more suitable to describe the WEMWBS factor structure. However, the model fits of the one- and two-factor models were only slightly worse in comparison to the other models.
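To make the cut-offs used above concrete, the sketch below computes RMSEA and CFI from chi-square statistics using their standard textbook definitions and compares them with the thresholds applied in this study (RMSEA < .06, CFI > .95). It is only an illustration: the chi-square values are hypothetical, and software such as Mplus applies estimator-specific adjustments (for example, scaled test statistics under WLSMV) that are not reproduced here.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation (textbook form, N - 1)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


# Hypothetical values for a 14-item one-factor model (not study results).
n = 2667
fit_rmsea = rmsea(chi2=1000.0, df=77, n=n)
fit_cfi = cfi(chi2_model=1000.0, df_model=77, chi2_base=40000.0, df_base=91)

print(f"RMSEA = {fit_rmsea:.3f}, acceptable: {fit_rmsea < 0.06}")
print(f"CFI   = {fit_cfi:.3f}, acceptable: {fit_cfi > 0.95}")
```

The hypothetical numbers were chosen so that CFI passes while RMSEA does not, which mirrors the pattern described above.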

Factor structure -SWEMWBS
All the analyses using the exploratory dataset are presented in Supplementary Tables S15-18. EFA models showed acceptable SRMR and CFI values, but the one-factor model did not reach the acceptable RMSEA value, despite the strong factor loadings (Table S15). In the bi-factor model, item 7 had the highest, yet negative, factor loading of -.368, followed by item 1 (.343). However, item 1 (optimistic about the future) was not included as a part of the specific factor since it did not fit the theoretical model of eudaimonic well-being. Instead, items 6 and 11 were added to the specific factor since they fit the theoretical framework and had negative factor loadings. In the two-factor model, items 6, 7, 9, and 11 had factor loadings > .3. However, the factor loading of item 9 was stronger in factor 1 (.437 > .323) and it was therefore excluded from the second factor. In the CFAs, none of the RMSEA values reached the desirable level of < .06 (Tables S16-18). To improve the model fit, the error terms of the first two items were allowed to correlate in the one- and two-factor models. Additionally, the error term of item 7 was allowed to correlate with those of items 6 and 11 in the single-factor structure. Allowing the error terms to correlate improved model fit, with the one-factor model reaching an RMSEA value of .058. Despite the poor RMSEA values in the two- and bi-factor models, all the models fit the theoretical framework, had acceptable factor loadings (> .3) and adequate , and were therefore investigated in the validation dataset.
Fig. 2. Most optimal models for WEMWBS and SWEMWBS.
The results of the one-, two-, and bi-factor models in the validation dataset are presented in Table 3. The bi-factor model had the best model fit, with the specific eudaimonic well-being factor explaining 13% of the total variance. However, the model fit indices in all factor models are almost identical, with none of the RMSEA values reaching the acceptable .06 threshold.

Internal consistency and convergent validity
The internal consistency of the hierarchical models was good: above .90 in all models. The specific omegas (ωS) in both bi-factor models were relatively low, ranging from .02 to .16, indicating that the impact of the specific factors is minimal. Similarly, the correlations between the independent factors in the two- and three-factor models were high, indicating poor discriminant validity.
The linear correlations between the mental health questionnaires and (S)WEMWBS are presented in Table 4. As expected, (S)WEMWBS correlations were strongly negative with GHQ-12/BDI-6, and positive with MHI-5/EUROHIS-QOL8.
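The discriminant-validity rule from the Methods (factor correlations above .80) can be applied mechanically to the factor correlations reported for the WEMWBS validation models. The sketch below does this; the correlation values are taken from this article, while the dictionary labels and variable names are illustrative.

```python
THRESHOLD = 0.80  # cut-off for poor discriminant validity stated in the Methods

# Factor correlations reported for the WEMWBS validation models.
factor_correlations = {
    "two-factor: mental ~ social": 0.840,
    "three-factor: mental ~ social": 0.848,
    "three-factor: mental ~ eudaimonic": 0.905,
    "three-factor: social ~ eudaimonic": 0.754,
}

# Pairs whose correlation exceeds the cut-off.
flagged = {pair: r for pair, r in factor_correlations.items() if r > THRESHOLD}

for pair, r in factor_correlations.items():
    status = "above" if r > THRESHOLD else "below"
    print(f"{pair}: r = {r:.3f} ({status} {THRESHOLD:.2f})")
```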

Discussion
We examined the psychometric properties of the 14-item Warwick-Edinburgh Mental Well-being Scale and its shorter 7-item version using a nationally representative dataset of 5,335 adults in Finland. The large sample size allowed us to use a split-half method to explore and compare several factor structures in two datasets. Models were obtained with EFA and refined with CFA in one dataset and tested with CFA in the other. The models for validation were chosen based on strong factor loadings (> .3), model fit indices (RMSEA, CFI, SRMR), and resemblance to the theoretical hierarchical model of well-being, where hedonic, eudaimonic, and social well-being are seen as associated, yet separate constructs (Gallagher et al., 2009; Ryan & Deci, 2001). Furthermore, the previously successful bi-factor model of Lang and Bachinger (2017) was investigated in our validation data along with the other factor structures.
Note. WB = well-being, S = specific factor, n.s. = non-significant. All other factor loadings p < .001. a-c: added error terms for one-factor (items 1 with 2, 9 with 12), two-factor (items 1 with 2), three-factor (items 1 with 2), and bi-factor (items 1 with 2, 1 with 10, 9 with 12) models. Correlations between factors: two-factor model r = 0.840 (SE = 0.008); three-factor mental and social well-being r = 0.848 (SE = 0.008), mental and eudaimonic well-being r = 0.905 (SE = 0.007), social and eudaimonic well-being r = 0.754 (SE = 0.012).
Our findings suggested that a three-factor model of the WEMWBS with separate mental, social, and eudaimonic well-being factors, and a bi-factor solution with a general mental well-being factor and a specific eudaimonic well-being factor for the SWEMWBS, had the most optimal factor structures and the best interpretability (Fig. 2, Tables 2 and 3). However, the fit advantages over other models were minimal, and none of the RMSEA values reached the acceptable < .06 threshold. Furthermore, WEMWBS factors were highly correlated with each other (≥ .80), and model structures were almost identical, leaving room for discussion of the dimensionality of the (S)WEMWBS. In addition, both social and eudaimonic well-being items were associated with strong error covariances in the one-factor models and formed independent factors in the two-, three-, and bi-factor models. Likewise, the explained common variances of the (bi-factor) specific factors were small, showing that hedonic, eudaimonic, and social well-being are strongly related via a higher-order factor, general mental well-being.
Although none of the models had RMSEA values below the desirable threshold of .06, other model fit indices had acceptable values. The models also had strong factor loadings and fit the theoretical model of mental well-being. In general, fit index cut-offs are best used as rules of thumb rather than as exact values that define the entire validity of the model (Hermida, 2015). In fact, specific features of the model (e.g., estimation method, sample size) may influence the optimal cut-off criteria (Hermida, 2015; McNeish et al., 2018). For example, better measurement quality can result in poorer RMSEA values even with the same level of model misspecification (reliability paradox; McNeish et al., 2018). Therefore, indicators other than model fit indices should also be used when evaluating factor models, and fit cut-offs may be relaxed when loadings are high.
In the present study, the error terms between the first two items (1 and 2), items 9 and 12, as well as items 6, 7, and 11 were allowed to correlate. Even though this is a common procedure, the literature typically recommends against it; if used, correlated error terms should always be justified (Cabrera-Nguyen, 2010; Hermida, 2015). Our split-half procedure also offset some of the disadvantages of model tuning. Items 9 and 12 were both part of the independent social well-being factor, whereas items 6, 7, and 11 were all part of the eudaimonic well-being factor. Due to the conceptual value of retaining items, allowing these items to correlate beyond the overall factor structure is therefore meaningful. However, it is less clear why the first two items (1: feeling optimistic; 2: feeling useful) had a systematically high residual covariance in all models, since the two items are very different content-wise. Nevertheless, previous studies have also allowed them to correlate (Lang & Bachinger, 2017; Ringdal et al., 2018; Smith et al., 2017). Lang and Bachinger (2017) argued that the error covariance could be explained by the older age group or poor translation. In fact, the wording of the first three items is very similar in Finnish, which may explain the strong residual covariance between the items in our data. However, the scale may also suffer from order bias, which may impact the distributions and correlations between the items (Israel & Taylor, 1990). Thus, further investigation is warranted with targeted methodological approaches, such as differential item functioning or measurement invariance analysis, or randomizing the order of the items to test for changes in item correlations. Despite the theoretical framework of well-being, there is little evidence that eudaimonic well-being should be considered a separate factor in the (S)WEMWBS. Smith and colleagues (2017) reported a one-factor model with multiple error terms, including both social and eudaimonic well-being items, among Norwegian primary healthcare patients. Another study reported significant correlations between items 2 (feeling useful), 6 (dealing with problems well), and 11 (able to make up my own mind about things) among individuals with mental illnesses (Ng et al., 2014). Our adult sample had a high rate (65%) of existing health issues, which may partially explain the separate eudaimonic well-being factor in our findings. These results cautiously suggest that eudaimonic well-being may perform differently in clinical cohorts, which should be investigated further in the future.
Note. WB = mental well-being. Linear correlations between (S)WEMWBS factor scores and sum scales: r (SE); all correlations p < .001. WEMWBS three-factor: Mental well-being (items 1, 2, 3, 5, 8, 10, 13, and 14); Social well-being (items 4, 9, 12); Eudaimonic well-being (items 6, 7, 11); error terms item 1 with 2, r = .10. WEMWBS one-factor: error terms items 9 with 12 (r = .515) and 1 with 2 (r = .263). SWEMWBS one-factor: items (1, 2, 3, 6, 7, 9, 11); error terms items 1 with 2 (r = .094), 6 with 7 (r = .142), and 7 with 11 (r = .19).
As in previous studies, the social well-being items were the most problematic in our models. For example, item 9 (close to other people) showed strong cross-loading across two factors in EFA (SWEMWBS) and was overpowering the CFA results in bi-factor models (WEMWBS). One explanation is that the interpretation of the social items may vary within the population. For instance, items 9, 12, and 5 have been reported to interfere with the WEMWBS group mean when investigating interpersonal relationships in the Australian general population (Goh et al., 2018). Similarly, focus group interviews revealed that items 4, 9, and 12 were interpreted to mainly concern sexual and romantic relationships among high school students in the UK (Clarke et al., 2011).
In general, both scales were strongly, but not overly, correlated with various mental health scales, supporting the view that the (S)WEMWBS is a suitable instrument to measure mental well-being. In light of our findings, the WEMWBS can be treated as mostly unidimensional, provided that the social and eudaimonic well-being factors and their possible interference with the topic of interest are considered. Since the most problematic items (4, 12) are not part of the SWEMWBS, the shorter 7-item SWEMWBS might be the preferable option when assessing overall mental well-being, at least in a sampling context similar to ours.

Table 4
Convergent validity of the (S)WEMWBS.