Teacher job satisfaction across 38 countries and economies: An alignment optimization approach to a cross-cultural mean comparison

The purpose of this study is to compare latent means of job satisfaction across the countries participating in the 2018 Teaching and Learning International Survey. Mean comparisons of this nature are sparse in the literature because job satisfaction scales often lack cross-cultural construct validity. We applied an alignment approach that can optimize this construct validity to compare the latent mean of 153,682 teachers across 48 countries. We found that Austria, Chile, Spain, Canada, and Argentina are the top countries with highly job-satisfied teachers, while the least job-satisfied teachers are from Bulgaria, England, Portugal, Saudi Arabia, and Malta. Our findings provide potential cues to policymakers and education stakeholders on which country to emulate so that teacher job satisfaction can be improved.


Introduction
Teachers' evaluations of their teaching profession have become increasingly important in recent times. One reason for this interest in job satisfaction is its significant role in the retention of effective teachers. Policy makers, school principals and other education stakeholders are challenged with questions about how to identify, recruit and retain effective teachers so that the goals and objectives of schools are achieved. Teacher job satisfaction has been identified as one of the indicators of effective teachers and teaching (Klassen & Tze, 2014). It has been defined as "the sense of fulfilment and gratification that teachers experience through their work as a teacher" (Ainley & Carstens, 2018, p. 43). It includes both positive and negative self-evaluations of the teaching experience, which are both intrinsically and extrinsically motivated, and its effects translate to students' performance (Skaalvik & Skaalvik, 2011). Empirical evidence has been documented in different studies on the role of teacher job satisfaction in, for example, reducing the intention to quit and lowering emotional exhaustion, burnout and stress (e.g., Cameron & Lovett, 2015; Skaalvik & Skaalvik, 2015; Veldman, Admiraal, van Tartwijk, Mainhard, & Wubbels, 2016). For instance, Cameron and Lovett (2015) reported a nine-year longitudinal study involving 57 school teachers who were initially identified as having an enthusiastic determination to 'make a difference' in their teaching profession. They investigated the influence of job satisfaction in sustaining this commitment over the years. They found that teachers with a high sense of job satisfaction were able to sustain this commitment and preferred to be in schools with a serene environment that fosters their job satisfaction (Cameron & Lovett, 2015).
Also, the relations between teacher job satisfaction and several other factors such as self-efficacy, school leadership and teachers' emotions have been well documented (e.g., Karousiou, Hajisoteriou, & Angelides, 2018; Yin, 2015). For example, the contributions of teacher personal traits and school factors such as gender, age, status of employment (fixed-term or permanent involves fitting a measurement model to the data of one country (in their case it was England) and systematically comparing its latent mean with the latent mean of each other comparable country, one after the other. They established some conditions for comparable countries (trustworthy, reasonable, unreliable) based on the extent to which configural, metric and scalar invariance criteria are met. This multi-pairwise mean comparison is interesting. However, it could be very cumbersome when the pairwise comparisons involve many groups, such as the 48 participating countries and economies in the TALIS 2018 survey. Another limitation of this approach is that, even if the comparisons involve only 10 groups, it cannot be used to rank the groups or to determine which group has the highest or the lowest latent mean. This is because only one group is held constant at a time and compared with every other comparable group.
A relatively recent approach that has been proposed and effectively applied (mostly among methodologists) to compare latent means across many groups is alignment optimization (Asparouhov & Muthén, 2014; Muthén & Asparouhov, 2018). The basic idea of this approach is to identify the items that exhibit the most non-invariance and to iteratively optimize their contributions to the scalar non-invariance of the measurement model, so that a minimum of scalar non-invariance is achieved and mean comparisons can take place (Muthén & Asparouhov, 2018). This approach can be used to compare latent means across many groups. It can also be used to rank the groups and to determine the group with the highest or lowest mean in the construct under investigation. It has been applied in various fields for latent mean comparisons across many groups and has been found to be efficient (e.g., Flake & McCoach, 2017; Rice, Park, Hong, & Lee, 2019; Tay, Jayasuriya, Jayasuriya, & Silove, 2017). To the best of our knowledge, after an extensive search of the literature, no study has applied this method to latent mean comparisons of teacher job satisfaction. Thus, this article is geared towards latent mean comparisons of teacher job satisfaction with the teaching profession across the 48 participating countries and economies in the TALIS 2018 survey using the alignment optimization approach. This is envisaged to answer the following questions: Which country has the most job-satisfied lower secondary school teachers? Which country has the least job-satisfied lower secondary school teachers? And how does lower secondary school teacher job satisfaction with the teaching profession differ from one country to another?
The aim of the current study is not to determine the best country in terms of job satisfaction or otherwise. Rather, our goal is to report empirical evidence, based on what the data say, which could serve as a potential cue to policy makers and other education stakeholders on the right country to emulate in improving teacher job satisfaction. This introduction has covered the importance of job satisfaction, previous attempts at relating job satisfaction to other constructs, some challenges in cross-cultural mean comparison of job satisfaction and our approach to addressing some of these challenges. The remaining part of this article is organized as follows. The next section discusses issues related to methodology and methods such as data source, sample and measures, and elaborates on some technical terms used in the current section. This is followed by a section where results are presented and findings are discussed, including the strengths and limitations of the current study. Some concluding remarks are provided in the last section before the reference list. We try to avoid some technical terms and the theoretical mathematics behind the statistics used in the current article so that it can be read and understood by a wider audience beyond methodologists and researchers.

Data source and sample
The data used for the current study came from the third cycle of the international teacher survey conducted by the TALIS team in 2018. The survey was conducted in 48 countries and economies, in each of which 200 schools, and 20 teachers within each school, were randomly selected using a probabilistic sampling procedure. The focus of the current study is on the job satisfaction of lower secondary school teachers around the world. Lower secondary school teachers were chosen because they form the central target population of the TALIS exercise and were substantially covered in all the 48 participating countries, unlike primary or upper secondary school teachers. Thus, only the lower secondary school teacher data that relate to job satisfaction with the teaching profession were pooled from the TALIS 2018 data. These amount to a total of 153,682 teachers, of whom 106,123 (69.1 %) are female and 47,551 (30.9 %) are male; the remaining 8 teachers did not indicate their gender and are excluded from these percentages. The average number of years of working as a teacher is 16.5 years (a minimum of 0 years, reported by 3500 teachers, and a maximum of 58 years, reported by 6 teachers) across the 46 countries and economies that participated in the survey and have data available (the Iceland and Flemish community (Belgium) data were not released for lower secondary school teachers). Further, the largest country-specific sample comes from the United Arab Emirates with a total of 8648 teachers (60.64 % females) and the smallest country/economy-specific sample comes from Alberta in Canada with a total of 1077 teachers (63.14 % females).

Measures
The TALIS 2018 team operationalized and measured job satisfaction with the teaching profession (JSP) with four items using a 4-point Likert scale format. Teachers were requested to rate their level of agreement (1-strongly disagree, 2-disagree, 3-agree and 4-strongly agree) on the four items using the following stem question: "We would like to know how you generally feel about your job. How strongly do you agree or disagree with the following statements?" (OECD, 2019a, p. 295). Descriptive statistics of the pooled sample, including item labels and wordings, are presented in Table 1. Even though JSP was measured using an ordinal scale with four categories, Table 1 shows that the data contain neither excess kurtosis nor excess skewness: the absolute values of both statistics are less than 1 for every item except JSP03 (Muthén & Kaplan, 1992). This has an implication for the choice of estimator in the measurement model and for the type of reliability test conducted in the current study.
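The screening rule described above (absolute skewness and excess kurtosis below 1) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the responses are made-up values on the 4-point scale rather than TALIS data, and population (biased) moment formulas are used, whereas statistical packages may apply small-sample corrections.

```python
def skewness_and_excess_kurtosis(values):
    """Return population (biased) skewness and excess kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skew, excess_kurt


def within_limits(values, limit=1.0):
    """True when both |skewness| and |excess kurtosis| are below `limit`."""
    skew, kurt = skewness_and_excess_kurtosis(values)
    return abs(skew) < limit and abs(kurt) < limit


# Hypothetical responses on the 4-point agreement scale (not TALIS data).
responses = [1, 2, 2, 3, 3, 4]
skew, kurt = skewness_and_excess_kurtosis(responses)
ok = within_limits(responses)
```

For these symmetric hypothetical responses the skewness is zero and the excess kurtosis is slightly negative, so the item would pass the rule.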

Data screening
The data set for each country or economy was screened for the pattern and significance of the missing values using Little's missing completely at random (MCAR) test (Li, 2013). The missing values were found to be missing at random, with less than 10 % missing on each variable. Thus, the default full information maximum likelihood estimation using the expectation-maximization algorithm in Mplus software was used to handle the missing values in the analyses described in the next section (Cham, Reshetnyak, Rosenfeld, & Breitbart, 2017).

Measurement model and reliability
Teacher job satisfaction, being an unobserved (latent) construct, is hypothesized to manifest through four items (observed variables/indicators) as operationalized by the TALIS 2018 team. Fig. 1 presents a diagrammatic representation of the JSP measurement model used in the current study. The oval shape in Fig. 1 with the JSP label represents the latent construct of job satisfaction. It has a mean (α) and a variance (Ω) by default. The four rectangles with labels JSP01 to JSP04 are the items of the JSP scale designed to expose the unobserved construct of job satisfaction. Each item JSP0i is associated with a factor loading $\lambda_i$, an intercept $\tau_i$, and a residual variable or item unique variance $e_i$. The factor loadings measure the strength of the relationships between JSP and the items (typical of regression coefficients), and their squares (in standardized form) give the proportion of each item's variance explained by the latent factor. Similar to the interpretation in regression analysis, an intercept is the predicted value of an item when the latent construct, JSP, is zero. The unique variance is a measure of the error (measurement error variance or specific error variance) associated with each item in the scale. The double-headed arrow between $e_3$ and $e_4$ represents an error covariance between item JSP03 and item JSP04. This error covariance was included to account for any error that could stem from reading these two items because of their negative wordings as compared to the other items on the scale. The covariance is also recommended by the TALIS 2018 team in their technical report (OECD, 2019a), perhaps to account for the measurement errors that might result from the negative wordings of these items.
Note. Items with '*' are reverse coded before the analyses to reflect their negative wordings. Also, valid cases are obtained after deleting missing cases on each variable.
The first preliminary analysis was to check the internal consistency of the JSP scale items, since construct validity is taken care of as a step in the alignment optimization approach that will be described shortly. Thus, the measurement model presented in Fig. 1 was validated for each of the 46 participating countries, and the ensuing parameters, e.g. factor loadings and unique variances, were used to calculate the hierarchical coefficient omega, $\omega_h$ (Revelle & Zinbarg, 2009; Trizano-Hermosilla & Alvarado, 2016), for the scale reliability. Unlike the popular Cronbach alpha, $\omega_h$ is based on a factor analytic framework in which scale reliability is estimated from the factor loadings and unique variances of the scale items. It relaxes the assumptions of tau-equivalence (i.e. equal factor loadings for all items in the scale, which is a necessary assumption for computing Cronbach alpha) and of normally distributed data, it allows for error covariance between items, and it has been shown to perform better than Cronbach alpha in both simulation and real data studies (e.g., Dunn, Baguley, & Brunsden, 2014; Zinbarg, Revelle, Yovel, & Li, 2005). The coefficient omega is usually interpreted just like the Cronbach alpha: its values range from 0 to 1, with values close to 1 indicating higher reliability. The authors are not aware of any specific threshold value of $\omega_h$ for a reliable instrument. Thus, the instrument was judged reliable in a participating country or economy when the 95 % confidence interval of $\omega_h$ contains 0.70, a criterion commonly used in the literature (e.g., Trizano-Hermosilla & Alvarado, 2016; Zieger et al., 2019). Table 2 presents the values of $\omega_h$ for the countries and economies that meet the 0.70 criterion of a reliable instrument.
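To make the idea of a loading-based reliability coefficient concrete, the following Python sketch computes an omega-type coefficient for a one-factor model. It is a sketch under assumptions, not the authors' computation: it uses a common composite-reliability formula, (Σλ)²·φ / [(Σλ)²·φ + Σθ + 2·Σθ_ij], where φ is the factor variance, θ the unique variances and θ_ij the error covariances; the function name and all numerical values are hypothetical.

```python
def coefficient_omega(loadings, unique_variances,
                      error_covariances=(), factor_variance=1.0):
    """Omega-type reliability for a one-factor model (assumed formula).

    loadings          : factor loadings of the items
    unique_variances  : residual (unique) variances of the items
    error_covariances : residual covariances between item pairs, if any
    factor_variance   : variance of the latent factor (1 if standardized)
    """
    # Variance of the sum score explained by the common factor.
    common = factor_variance * sum(loadings) ** 2
    # Error variance of the sum score; each covariance counts twice.
    error = sum(unique_variances) + 2.0 * sum(error_covariances)
    return common / (common + error)


# Hypothetical standardized loadings for four items (not TALIS estimates).
loadings = [0.7, 0.8, 0.6, 0.5]
unique_variances = [1 - l ** 2 for l in loadings]
# One hypothetical residual covariance, e.g. between items 3 and 4.
omega = coefficient_omega(loadings, unique_variances, error_covariances=[0.1])
```

With these hypothetical values the coefficient is about 0.73. A positive error covariance enlarges the error term, so ignoring it would overstate reliability.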

Criteria for assessing model fit
A number of goodness of fit (GOF) indices were used to assess how well the model fits the data. These GOF indices are the comparative fit index (CFI), standardized root mean square residual (SRMR), Tucker-Lewis index (TLI), and root mean square error of approximation (RMSEA). These GOF indices were chosen because of their popularity in the education literature (e.g., Zakariya, Bjørkestøl, Nilsen, Goodchild, & Lorås, 2020) coupled with their satisfactory performance in both simulated and real data studies (e.g., Hu & Bentler, 1999). Chi-square statistics are reported and used only for model comparisons and not for assessing model fit because they tend to reject an appropriate model when the sample is large (Chen, 2007). CFI and TLI values greater than or equal to 0.95 (Hu & Bentler, 1999) and RMSEA and SRMR values less than or equal to 0.08 (Browne & Cudeck, 1992) are considered minimum thresholds for an appropriate model fit, as recommended in the literature.
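The thresholds above can be written as a small Python check. This is an illustrative sketch only: the function name is hypothetical, and the example values are the fit statistics reported for Korea in the scalar invariance section of this article.

```python
def check_fit(cfi, tli, rmsea, srmr):
    """Compare fit indices with the minimum thresholds stated in the text."""
    return {
        "CFI": cfi >= 0.95,     # Hu & Bentler (1999)
        "TLI": tli >= 0.95,     # Hu & Bentler (1999)
        "RMSEA": rmsea <= 0.08,  # Browne & Cudeck (1992)
        "SRMR": srmr <= 0.08,    # Browne & Cudeck (1992)
    }


# Fit statistics reported for Korea in this article.
korea = check_fit(cfi=0.99, tli=0.94, rmsea=0.094, srmr=0.013)
failed = [name for name, passed in korea.items() if not passed]
```

Applied literally, the thresholds flag both the RMSEA and the TLI for this model.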

Scalar invariance
As a first step in the data analysis, configural invariance (i.e. the same pattern of factor loadings) was investigated by performing a confirmatory factor analysis (CFA) to validate the measurement model presented in Fig. 1 for each country using the robust maximum likelihood (MLR) estimator. The decision to use the MLR estimator was informed by its satisfactory performance in the analysis of ordinal data (Suh, 2015). Also, alignment optimization is only compatible with MLR, unlike the weighted least square mean and variance adjusted (WLSMV) estimator, which is also a potential estimator (Muthén & Asparouhov, 2018). At this stage, Korea was removed from further analyses because its measurement model results (RMSEA = .094, CFI = .99, TLI = .94, and SRMR = .013) failed the criterion value of RMSEA. This left a total of 38 comparable countries and economies, as presented in Table 2. Further, we moved up to metric invariance (i.e. each factor loading $\lambda_i$, $i = 1, \dots, 4$, constrained to be equal across countries) and then to scalar invariance (i.e. each factor loading $\lambda_i$ and each intercept $\tau_i$, $i = 1, \dots, 4$, constrained to be equal across countries). The final nested model confirmed that scalar invariance does not hold across the 38 countries and economies, so we resorted to the alignment optimization approach for the latent mean comparisons.

Alignment optimization
Recall that the basic idea of the alignment approach is to identify the items that exhibit the highest non-invariance and to iteratively optimize their contributions to the scalar non-invariance of the measurement model, so that a minimum of scalar non-invariance is achieved and mean comparisons can take place. This idea involves three steps. (1) Configural invariance is validated by fitting, across the 38 countries and economies, a model with the same pattern of factor loadings, α fixed to 0 and Ω fixed to 1. (2) An iterative optimization process that is similar to factor rotation in exploratory factor analysis is then performed. This process identifies the items that contribute the most to the scalar non-invariance of the model and simultaneously derives the α's and Ω's that minimize the non-invariance without compromising the fit. (3) The α's and Ω's are finally ranked and compared across the groups (Asparouhov & Muthén, 2014; Muthén & Asparouhov, 2018). The results from these analyses are presented and discussed in the next section.
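The optimization in step (2) can be sketched in Python as the evaluation of a total loss over all pairs of groups. The sketch follows our reading of Asparouhov and Muthén (2014) and rests on assumptions: configural loadings and intercepts are rescaled by candidate group means and variances, differences between groups are penalized with the component loss f(x) = sqrt(sqrt(x² + ε)) with a small ε (0.01 here), and each pair is weighted by the square root of the product of the group sizes. All function names and numbers are hypothetical, and the actual estimation in Mplus involves a numerical optimizer that is not shown.

```python
import math
from itertools import combinations


def component_loss(x, eps=0.01):
    """Loss that favours a few large differences over many medium ones."""
    return math.sqrt(math.sqrt(x * x + eps))


def aligned_parameters(lam0, tau0, alpha, variance):
    """Rescale configural loadings/intercepts of one group for a candidate
    factor mean (alpha) and factor variance."""
    sd = math.sqrt(variance)
    lam1 = [l / sd for l in lam0]
    tau1 = [t - alpha * l for t, l in zip(tau0, lam1)]
    return lam1, tau1


def alignment_loss(lam0, tau0, alphas, variances, sizes):
    """Total weighted non-invariance over all pairs of groups."""
    aligned = [aligned_parameters(l, t, a, v)
               for l, t, a, v in zip(lam0, tau0, alphas, variances)]
    total = 0.0
    for g1, g2 in combinations(range(len(aligned)), 2):
        weight = math.sqrt(sizes[g1] * sizes[g2])
        # First the loadings of the two groups, then their intercepts.
        for params1, params2 in zip(aligned[g1], aligned[g2]):
            for p1, p2 in zip(params1, params2):
                total += weight * component_loss(p1 - p2)
    return total


# Hypothetical example: three groups sharing the same loadings and
# intercepts but differing in factor mean and variance.
lam = [0.8, 0.7, 0.6, 0.5]
tau = [3.0, 2.9, 2.5, 2.6]
true_alpha = [0.0, 0.5, -0.3]
true_var = [1.0, 1.5, 0.8]
sizes = [100, 100, 100]
# Configural estimates (factor mean 0, variance 1 in every group).
lam0 = [[l * math.sqrt(v) for l in lam] for v in true_var]
tau0 = [[t + a * l for t, l in zip(tau, lam)] for a in true_alpha]

loss_true = alignment_loss(lam0, tau0, true_alpha, true_var, sizes)
loss_naive = alignment_loss(lam0, tau0, [0.0] * 3, [1.0] * 3, sizes)
```

In this constructed example the loss is lowest at the generating means and variances, which is the behaviour the alignment step relies on when it searches for the α's and Ω's.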

Scalar invariance
The results from the multiple group analyses of the data from the 38 countries and economies are presented in Table 3. Model fit statistics for configural, metric and scalar invariance as well as for comparisons of these models are presented. The nested models were compared using chi-square difference test with Satorra-Bentler correction to cater for ordinal scale of the data vis-à-vis the estimator used (Satorra & Bentler, 2010).
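For readers unfamiliar with scaled difference testing, the following Python sketch shows the widely used scaled chi-square difference computation for nested models estimated with a robust estimator. It is an assumption that this resembles the computation behind Table 3: the correction cited above (Satorra & Bentler, 2010) is a refinement that differs in detail, and the function name and input values here are hypothetical.

```python
def scaled_chi_square_difference(t0, c0, df0, t1, c1, df1):
    """Scaled chi-square difference between two nested models.

    t0, c0, df0 : scaled chi-square, scaling correction factor and degrees
                  of freedom of the nested (more constrained) model
    t1, c1, df1 : the same quantities for the comparison (less constrained)
                  model
    """
    # Scaling correction for the difference test.
    cd = (df0 * c0 - df1 * c1) / (df0 - df1)
    # t * c recovers the uncorrected chi-square of each model.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, df0 - df1


# Hypothetical values (not taken from Table 3).
trd, delta_df = scaled_chi_square_difference(
    t0=150.0, c0=1.2, df0=20, t1=100.0, c1=1.1, df1=14)
```

The resulting statistic is referred to a chi-square distribution with `delta_df` degrees of freedom; a significant value indicates that the added equality constraints worsen the fit.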
The results presented in Table 3 show that the measurement model of the JSP scale has excellent fits at both configural and metric levels. This is evident from the CFI and TLI ≥ 0.95 and the SRMR and RMSEA ≤ .08 for both models. The excellent configural model fit can be interpreted as evidence of configural invariance of the JSP scale across the 38 participating countries and economies in TALIS 2018, which is an important condition for the alignment optimization method (Asparouhov & Muthén, 2014). That is, when there were no equality constraints on either the factor loadings or the intercepts, the JSP scale was an excellent measure of the latent factor across the 38 countries and economies. However, the JSP model does not satisfy metric invariance, even though its metric model exhibits an excellent fit. This is because, for metric invariance to hold, apart from the fit of the metric model, the comparison between the configural and metric models is expected to return a non-significant chi-square difference test (Brown, 2015), which is not the case (the difference is significant at p < .05; see Table 3). Further, Table 3 also reveals that the scalar model of the JSP scale shows a poor fit, since none of the GOF indices reached the cut-off for an acceptable fit. Likewise, the chi-square difference tests with Satorra-Bentler correction were significant for both the comparison between the scalar and configural models and the comparison between the scalar and metric models. Thus, the JSP scale does not satisfy the scalar invariance condition.
Note. Δχ² means change in MLR chi-square with Satorra-Bentler correction, Δdf means change in degrees of freedom and "vs." means versus. All the chi-square values are significant at p < .05.
Y.F. Zakariya, et al. International Journal of Educational Research 101 (2020) 101573

Alignment optimization
As a result of the failure of the scalar invariance test presented in the previous section, we proceeded to use the alignment optimization approach for latent mean comparison across the 38 countries and economies. The FIXED alignment estimation was used by fixing the latent mean of JSP in Latvia (country 27) to 0. This decision was informed by initially conducting the FREE alignment estimation to determine the country in which JSP has a mean close or equal to 0, which was found to be Latvia (mean = 0.001). The FREE alignment estimation returned a warning that recommends the use of the FIXED option to avoid model misspecification. This recommendation was followed in line with best practice in the literature (Muthén & Asparouhov, 2018). Consistent with the second step of the alignment optimization procedure, Table 4 presents the items with non-invariance in factor loadings and intercepts; the countries or economies in which these non-invariances occur are put in parentheses and in bold face. Table 4 reveals that items JSP01 and JSP04 have more significant non-invariance in factor loadings across the 38 countries and economies than items JSP02 and JSP03. Item JSP02 is invariant in factor loadings across all the countries and economies except for Australia (country 2), Bulgaria (country 7), England (United Kingdom, country 15), Portugal (country 34), and the United States (country 48), where it demonstrates significant non-invariance. A similar pattern of factor loading invariance can be observed for item JSP03 in Table 4. On the other hand, all the items demonstrate significant non-invariance in intercepts across the 38 countries and economies. After the identification of the items with significant non-invariance in factor loadings and intercepts and their contributions to the scalar non-invariance of the whole scale, the iterative optimization technique was then implemented.
This gives us the opportunity to proceed to the third step of the alignment optimization approach. The results of the latent mean comparison of JSP scale across the 38 countries and economies coupled with their ranking in descending order are presented in Table 5.
Table 5 reveals that Austria (country 3) has the highest ranking on the JSP scale with a mean of 1.580 and Malta (country 29) has the lowest ranking with a mean of -0.296 among the 38 countries and economies. This finding can be interpreted to mean that, on average, lower secondary school teachers in Austria are the most satisfied with the teaching profession among all the teachers worldwide that took part in TALIS 2018. At the other extreme, lower secondary school teachers in Malta are the least satisfied with the teaching profession among all the teachers worldwide that took part in TALIS 2018. Table 5 also shows that next to Austrian teachers in the ranking of job satisfaction with the teaching profession are teachers in Chile (country 9), Spain (country 43), Alberta (Canada, country 1), Ciudad Autónoma de Buenos Aires (Argentina, country 8) and Italy (country 23), in that order. These countries and economies are significantly different from other countries and economies in the levels of job satisfaction among their teachers. At the bottom of the ranking just before Malta are Saudi Arabia (country 37), Portugal (country 34), England (United Kingdom, country 15), Bulgaria (country 7) and Latvia (country 27), in that order. These countries have, on average, the least satisfied lower secondary school teachers with the teaching profession among all the teachers worldwide that took part in TALIS 2018. Further, some of these countries have negative means due to the positions of the intercepts. Moreover, at the middle of the ranking are teachers from Croatia (country 11), Israel (country 22), Estonia (country 16), Japan (country 24), New Zealand (country 32) and Brazil (country 6). It is important to remark that Cyprus (country 12) and Norway (country 33), ranked 15th and 16th respectively, have the same mean values up to three decimal places. However, Cyprus is ranked higher because the mean value of its JSP scale is higher than that of Norway when the values are extended beyond three decimal places. How lower secondary school teacher job satisfaction with the teaching profession differs from one country to another can also be read from Table 5. For instance, lower secondary school teachers in the United Arab Emirates (country 47) are less satisfied with the teaching profession than teachers in the first 25 countries or economies in the ranking. However, they are more satisfied with the teaching profession than lower secondary school teachers in Hungary (country 20), Shanghai (China, country 38), Slovak Republic (country 40), Turkey (country 46), Latvia (country 27), Bulgaria (country 7), England (United Kingdom, country 15), Portugal (country 34), Saudi Arabia (country 37) and Malta (country 29). Similar cross-country comparisons can be made for any other country in Table 5. Comparisons of this nature are envisaged to provide potential cues to policy makers and other education stakeholders on the question of which country to learn from so that teacher job satisfaction can be improved in their own country. What is Austria doing differently from other countries that makes its teachers stand out on the job satisfaction with the teaching profession scale? Empirical evidence has been provided in this article that prompts such inquiry.
Table 4 Sources of non-invariance of factor loadings and intercepts of each item of the JSP scale across 38 countries and economies.
Loadings JSP01 1 (2) (3) (4) 6 7 8 (9) 11 12 13 (14) (15) 16 (17)
(1) (2) (3) (4) (6) (7) (8) (9) (11) (12) (13)
(1) (2) 3 4 (6) 7 (8) 9 (11) 12 (13)
In the next section, the findings of the current study are compared with reports from previous studies.

Finding comparisons with prior studies
It was mentioned in the introductory section of the current article that studies on cross-country mean comparisons of teacher job satisfaction are sparse in the literature. This makes it difficult to find previous studies with which to compare our findings. However, some of our findings are comparable with findings reported by Zieger et al. (2019). In a large-scale study using TALIS 2013 data, Zieger et al. (2019) found that lower secondary teachers in England (United Kingdom) had lower job satisfaction with the teaching profession than those in each of the 17 comparable countries and economies that participated in the survey. Our finding as displayed in Table 5 corroborates this result to a large extent. England (United Kingdom, country 15) was ranked 35th out of the 38 countries and economies compared. This can be interpreted to mean that, over the span of five years after the second round of TALIS, there appears to be no significant improvement in the job satisfaction of lower secondary school teachers in England. It is our hope that this finding will be useful to policy makers and other education stakeholders in England and in other countries in general. In the next section, the strengths and limitations of the current study are presented.

Strengths and limitations of the study
The strengths of this study lie in the empirical evidence it provides for latent mean comparison of teacher job satisfaction across different countries. This type of evidence is usually hard to obtain because of the scalar invariance condition that is necessary for multiple group mean comparison to take place. However, by applying a relatively new and efficient alignment optimization approach that uses approximate scalar invariance, we are able to compare the means of the teacher job satisfaction scale across 38 countries and economies. Austria was identified as the country whose teachers are most satisfied with their profession, while teachers in Malta were found to be the least satisfied with the teaching profession. Empirical evidence was also provided for comparisons between one country and another. These types of findings appear to be lacking in the literature. However, there are some limitations of the current study that are necessary to acknowledge.
One of the limitations of the current study is our inability to combine the two subscales that measure teacher job satisfaction in the TALIS 2018 questionnaire. Some researchers have argued that teacher job satisfaction is a multidimensional construct and that it is preferable to treat it as such (e.g., Veldman, van Tartwijk, Brekelmans, & Wubbels, 2013). Meanwhile, it has been shown elsewhere (Zakariya, 2020a,b) that if teacher job satisfaction is treated as a multidimensional construct, an item must be allowed to cross-load on the two subscales. However, the alignment optimization approach cannot be used when there is an item cross-loading in the measurement model (Muthén & Asparouhov, 2018). Another limitation of this study stems from the fact that the alignment optimization approach is based on an assumption of the existence and attainment of a minimum measurement non-invariance of the model (Muthén & Asparouhov, 2018). To the best of our knowledge, there is no test to confirm whether this assumption holds. It is also acknowledged that there are some within- and across-country disparities that could stem from differences in regional and district factors that affect teacher satisfaction. The alignment approach lacks the power to eliminate these disparities completely. Instead, the disparities are optimized across all the countries to allow meaningful and reliable cross-country comparisons to take place.

Conclusions
Teacher job satisfaction has been identified as an important construct not only for sustaining effective teachers in their jobs but also for improving the learning experience of the students they teach (Skaalvik & Skaalvik, 2011). Job satisfaction is a self-evaluation of the teaching profession by the teachers themselves, which influences many other factors such as quitting intention, absenteeism, perceived distributed school leadership, burnout and overall teaching efficacy (Torres, 2019). However, cross-country mean comparison of teacher job satisfaction has been a challenge due to a lack of empirical evidence on its construct validity (in terms of the condition of scalar invariance) and reliability across different cultures. Thus, we applied a robust statistical technique that optimizes the amount of non-invariance in the measurement model to conduct a multiple group mean comparison of lower secondary school teacher job satisfaction across 38 participating countries and economies in TALIS 2018. Our findings show that Austria, Chile, Spain, Alberta (Canada), Ciudad Autónoma de Buenos Aires (Argentina) and Italy are, in descending order, the top six countries/economies with highly job-satisfied teachers. At the middle of the ranking are Croatia, Israel, Estonia, Japan, New Zealand and Brazil, which occupy the 17th to 22nd positions respectively. At the bottom of the ranking are Latvia, Bulgaria, England, Portugal, Saudi Arabia, and Malta. Apart from ranking the countries according to their means on the teacher job satisfaction scale, our findings also provide empirical evidence for comparing each country with others that have significantly lower or higher means. The cross-cultural mean comparisons reported in this article have several implications for policy makers and other education stakeholders on which country(ies) to emulate so that teacher job satisfaction can be improved.