Perceived Age Discrimination Across Age in Europe: From an Ageing Society to a Society for All Ages

Ageism is recognized as a significant obstacle to older people’s well-being, but age discrimination against younger people has attracted less attention. We investigate levels of perceived age discrimination across early to late adulthood, using data from the European Social Survey (ESS), collected in 29 countries (N = 56,272). We test for approximate measurement invariance across countries. We use local structural equation modeling as well as moderated nonlinear factor analysis to test for measurement invariance across age as a continuous variable. Using models that account for the moderate degree of noninvariance, we find that younger people report experiencing the highest levels of age discrimination. We also find that national context substantially affects levels of ageism experienced among older respondents. The evidence highlights that more research is needed to address ageism in youth and across the life span, not just old adulthood. It also highlights the need to consider factors that differently contribute to forms of ageism experienced by people at different life stages and ages.


Sample
The sample assessed in the European Social Survey (ESS), Round 4, included 56,751 respondents aged 15 to 105 years. Data were collected in 29 countries, and the figure below shows kernel density plots of age for each of the 29 countries. Although age densities varied across countries, many countries had comparable age distributions. Turkey (TR) differed notably from the remaining countries, with a predominantly young sample.

Measurements and Descriptive Statistics
The Three Age Discrimination Items
Perceived age discrimination was assessed with three items. Table A1 shows descriptive statistics for the items. The items were strongly skewed, indicating that they should be treated as categorical. There was little missingness in the data (about 1.5% for each item). The table also shows a comparison of indicated experiences of age discrimination in narrowly defined age groups. The proportion who reported age discrimination (at any level) against themselves was substantially higher in the youngest age group (15 to 29 years) than in any other age group. Middle-aged respondents had the lowest scores for perceived age discrimination, and older age groups had moderately higher scores. Analyses used recoded 3-point scales because there were few responses in the two highest categories. We note that the few responses indicating particularly frequent age discrimination were more common in the youngest age groups (close to 1% for prejudice because of age and lack of respect, lower for treated badly because of age) than in the oldest age groups (approximately 0.4% or lower for prejudice because of age and lack of respect, even lower for treated badly because of age).
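The recoding step can be illustrated with a minimal sketch. It assumes that the original items are scored 0 to 4 and that the three highest categories are merged; the exact category mapping is not stated in the text, so `RECODE_MAP` and `recode_item` are hypothetical names and values.

```python
# Hypothetical mapping: assumes original scores from 0 (never) to 4 (very often).
# The actual cut points used in the analysis are not given in the text.
RECODE_MAP = {0: 0, 1: 1, 2: 2, 3: 2, 4: 2}


def recode_item(responses):
    """Collapse 5-point responses to a 3-point ordinal scale.

    Missing values (None) are passed through unchanged.
    """
    return [None if r is None else RECODE_MAP[r] for r in responses]


example = [0, 4, None, 1, 3, 2]
print(recode_item(example))  # [0, 2, None, 1, 2, 2]
```

Collapsing sparse upper categories keeps the items ordinal while avoiding near-empty cells in the categorical models.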

Validity Test of the Three Age Discrimination Items
We used structural equation models to conduct a simple test of the convergent and discriminant validity of the three age discrimination items. It was theoretically possible that responses could indicate a general tendency to claim being discriminated against (not just based on age). For instance, emotional problems might increase the tendency to blame conflicts in social interactions on discrimination (Major, Kaiser, & McCoy, 2003). We compared fit for models with a factor representing perceived discrimination and regressed this factor on age and squared age (to reflect a non-linear association between age and perceived discrimination). The first model estimated a factor with the three age discrimination items and two similar items in the ESS assessing prejudice because of gender (predsex) and prejudice because of ethnicity (predetn); all indicators were recoded to 3-point ordinal variables. The second and third models used four items (adding either prejudice because of gender or prejudice because of ethnicity), and the fourth model used only the three age discrimination items to estimate the latent factor of perceived age discrimination.
Adding items on perceived discrimination because of gender and because of ethnicity, providing five indicators for the factor, resulted in a model with acceptable values for the comparative fit index (CFI) and the standardized root mean square residual (SRMR), but the root mean square error of approximation (RMSEA) was clearly too high for a well-fitting model: RMSEA = 0.10, even when running separate analyses of men and women. Dropping the item on discrimination because of ethnicity did not improve model fit (RMSEA = 0.10). Dropping the item on perceived discrimination because of gender and keeping the ethnicity item in addition to the three age discrimination items improved fit (RMSEA = 0.06), since most respondents (84%) did not experience discrimination because of ethnicity. However, a model using only the three age discrimination items as indicators of the factor (and keeping the two predictors as part of the model) gave a notably improved fit (RMSEA = .02, CFI = 1.00, SRMR = .001). These tests with several items on perceived discrimination (age, gender, ethnicity) were indicative of the discriminant and convergent validity of the three items for perceived age discrimination.

Analytical Strategy
We used three newly developed statistical methods to investigate measurement invariance: an alignment analysis to test for approximate measurement invariance across countries and age groups, and two methods to test for measurement invariance across age as a continuous variable, namely local structural equation modeling (LSEM) and moderated non-linear factor analysis (MNLFA).

Approximate Measurement Invariance
Studies of measurement invariance typically investigate three types of invariance using confirmatory factor analysis: configural, metric, and scalar invariance. Configural invariance simply means that the factor structure (a factor and its indicators) will be the same across groups. More interesting to us was metric invariance, which assumes invariant factor loadings across groups. A higher level of invariance is scalar invariance, adding invariant intercepts for factor indicators to the invariant factor loadings already tested in metric invariance.
If both intercepts and factor loadings for perceived age discrimination can be fixed to be invariant across groups (countries or age groups), then the latent factor means are on the same scale and it would be possible to compare levels of perceived age discrimination across countries or age groups. That is, the relationship between the estimated factor and the observed variables would not depend on which country or age group an individual belongs to. Thus, scalar invariance would allow for comparisons of factor means, making it possible to draw conclusions about different degrees of perceived discrimination across groups (see Vandenberg & Lance, 2000). In practice, strong measurement invariance (identical factor loadings and identical indicator intercepts) across groups is unlikely when many groups are involved, as in comparisons of countries in the ESS (Asparouhov & Muthén, 2014).
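The three levels of invariance can be written compactly. The notation below is a sketch under assumed symbols that do not appear in the text above: $x^{*}_{ijg}$ is the (latent) response of person $i$ in group $g$ to item $j$, $\nu_{jg}$ is an intercept (for ordinal items, thresholds take this role), $\lambda_{jg}$ is a factor loading, and $\eta_{ig}$ is the factor with group mean $\alpha_g$.

```latex
\begin{align*}
\text{configural:}\quad & x^{*}_{ijg} = \nu_{jg} + \lambda_{jg}\,\eta_{ig} + \varepsilon_{ijg} \\
\text{metric:}\quad & \lambda_{jg} = \lambda_{j} \quad \text{for all } g \\
\text{scalar:}\quad & \lambda_{jg} = \lambda_{j},\ \ \nu_{jg} = \nu_{j} \quad \text{for all } g \\
\text{so that}\quad & E\!\left(x^{*}_{ijg}\right) = \nu_{j} + \lambda_{j}\,\alpha_{g}
\end{align*}
```

Under scalar invariance, group differences in expected item responses arise only through $\alpha_g$, which is why factor means become comparable across groups.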
One alternative might be to use partial measurement invariance with an exploratory adaptation of the measurement model (Byrne, Shavelson, & Muthén, 1989; Steenkamp & Baumgartner, 1998), but this approach is unlikely to be very helpful when many groups are analysed (see Asparouhov & Muthén, 2014). A better solution can be to use the recently developed approach of approximate measurement invariance (Asparouhov & Muthén, 2014), which estimates approximately equal factor loadings and approximately equal indicator intercepts/thresholds across groups.
Approximate measurement invariance is "approximate" in the sense that it allows for statistically non-significant differences in factor loadings and intercepts across groups. By allowing some leeway in the parameters, approximate measurement invariance is more realistic than conventional scalar invariance, and achieving it would allow for comparisons of the level of perceived age discrimination across countries and across age groups. Asparouhov and Muthén (2014) refer to the computation of approximate measurement invariance in Mplus as an alignment method. The alignment is done automatically by the statistical software rather than depending on exploratory adaptation of the model by the researcher. The alignment uses the configural model as a starting point (no factor loadings or intercepts are fixed to be equal across groups) and then adds restrictions to the model, making factor loadings and intercepts approximately equal, provided these restrictions are supported by the data. Invariance is tested for all indicators.
The algorithm for the alignment method defines a measurement parameter as approximately invariant if it is not statistically significantly different from the default model for all groups. For each measurement parameter the algorithm searches for the largest set of invariant groups. The algorithm develops a solution "where for each group in the invariant set of groups the measurement parameter in that group is not statistically significant[ly different] from the average value for that parameter across all groups in the invariant set" (Asparouhov & Muthén, 2014, p. 5). Moreover, "the algorithm is based on multiple pairwise comparison; that is, multiple testing is done and to avoid false non-invariance discovery we use smaller p-values than the nominal .05" (Asparouhov & Muthén, 2014, p. 5).
The final model will fit the data as well as the original configural model. The combination of approximate measurement invariance and good fit with the data should allow for computation of group-specific factor means (Asparouhov & Muthén, 2014). The moderate differences across groups in factor loadings and intercepts should have little effect on the estimated factor mean. An important byproduct of the alignment analysis is that it will identify which groups cannot have their factor loadings or intercepts/thresholds fixed at approximately the same value as the other groups.
The alignment method in Mplus can estimate approximate measurement invariance freely or apply a fixed alignment, the latter requiring the user to fix the factor mean for a baseline group to zero, potentially easing the alignment analysis (Asparouhov & Muthén, 2014). We refer to Asparouhov and Muthén (2014) for details on approximate measurement invariance based on an alignment analysis.
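The quantity that an alignment optimization minimizes can be sketched as follows. This is a minimal illustration of the general form of the simplicity function described by Asparouhov and Muthén (2014), not the Mplus implementation; the function names, the data layout, and the default value of `eps` are assumptions made for the example.

```python
import math
from itertools import combinations


def component_loss(diff, eps=0.01):
    """Component loss f(x) = sqrt(sqrt(x^2 + eps)).

    eps is a small positive constant (illustrative default) that keeps
    the function differentiable at zero.
    """
    return math.sqrt(math.sqrt(diff * diff + eps))


def alignment_loss(loadings, intercepts, group_sizes, eps=0.01):
    """Total loss summed over all items and all pairs of groups.

    loadings[g][p] and intercepts[g][p] hold the parameters of item p
    in group g; each pair of groups is weighted by sqrt(N_g1 * N_g2).
    """
    total = 0.0
    n_items = len(loadings[0])
    for g1, g2 in combinations(range(len(loadings)), 2):
        weight = math.sqrt(group_sizes[g1] * group_sizes[g2])
        for p in range(n_items):
            total += weight * component_loss(loadings[g1][p] - loadings[g2][p], eps)
            total += weight * component_loss(intercepts[g1][p] - intercepts[g2][p], eps)
    return total


# Two hypothetical groups with three items each.
same = alignment_loss([[1.0, 0.8, 0.9]] * 2, [[0.0, 0.1, 0.2]] * 2, [100, 100])
differ = alignment_loss([[1.0, 0.8, 0.9], [1.0, 0.8, 1.4]],
                        [[0.0, 0.1, 0.2], [0.0, 0.1, 0.2]], [100, 100])
print(same < differ)  # True
```

In the full method, the group-specific factor means and variances are chosen so that the transformed loadings and intercepts minimize this total; the sketch only evaluates the loss for given parameter values.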

Local Structural Equation Modeling
In LSEM (see Hildebrandt, Wilhelm, & Robitzsch, 2009; Hildebrandt et al., 2016), the full sample is analysed repeatedly, but in each run individuals in the sample are weighted differently, depending on their value on the moderator (age in our case). Respondents with an age equal to the focal point received a weight of 1.
Following Hildebrandt et al., we developed a bandwidth for the weighting procedure using a Gaussian kernel function. The density function given by the weighting procedure implied no upper or lower limit, meaning that the whole sample was included in each model, but respondents much older (younger) than the focal point had a very low weight.
As Hildebrandt et al. point out, observations near the focal point are also informative about the value at the focal point: less so than observations exactly at the focal point, but more so than distal observations. Thus, the weighting has to be defined so that weights decrease the further away (the older or younger) individuals are from the focal point. With this approach, ages near the focal point contribute information to the estimation, whereas ages far from the focal point have negligible influence. Repeating this procedure across the scale of the moderator (age), moving the focal point slightly from model to model, we estimated a total of 401 models in the LSEM analysis.
We tested each factor loading for measurement invariance; the latent factor was identified by fixing its variance to 1. Age was centered, so that 0 corresponded to the average age of 47.5 years. Following Hildebrandt et al., we used focal points in the LSEM models ranging from two standard deviations below to two standard deviations above 0 on centered age, giving focal points that represented ages from 10.5 to 84.5 years. Because the youngest respondents were 15 years old, the first models estimated gave the largest weight to 15-year-olds. Respondents older than 84 were represented by their relatively high weights in models with focal points close to 2 SDs above the average.
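The grid of focal points can be sketched as follows. The mean age, the range, and the number of models come from the text; the standard deviation of 18.5 years is implied by the reported range (47.5 plus or minus 2 SD), and equal spacing between focal points is an assumption. The name `focal_points` is hypothetical.

```python
MEAN_AGE = 47.5  # average age reported in the text
SD_AGE = 18.5    # implied by focal points spanning 10.5 to 84.5 years
N_MODELS = 401   # number of LSEM models reported in the text


def focal_points(mean=MEAN_AGE, sd=SD_AGE, n=N_MODELS, width=2.0):
    """Return n equally spaced focal ages from mean - width*sd to mean + width*sd."""
    low = mean - width * sd
    step = 2.0 * width * sd / (n - 1)
    return [low + k * step for k in range(n)]


points = focal_points()
print(len(points), points[0])  # 401 10.5
```

With these values, consecutive focal points are 0.185 years apart, which matches the description of moving the focal point slightly from model to model.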
As described by Hildebrandt et al. (2016), the bandwidth (bw) around each focal point is defined by the following equation:

$$bw = h \cdot N^{-1/5} \cdot SD_M$$

The bandwidth is thus computed by using a density function that reflects the sample size ($N$) and the standard deviation of the moderator, $SD_M$, where $M$ in our case refers to the moderator age and $h$ is a constant bandwidth factor.
The difference $z$ for a respondent $i$ and the target value of $M$ (the focal point, $M_0$) is scaled according to the bandwidth:

$$z_i = \frac{M_i - M_0}{bw}$$

Weights ($K$) for each respondent are then calculated based on the distance $z_i$:

$$K(z_i) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z_i^2}{2}\right)$$

These weights are then rescaled to weights ($W$) that vary between 0 and 1:

$$W(z_i) = \frac{K(z_i)}{K(0)}$$
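The weighting scheme can be sketched in code. The sketch assumes the common LSEM formulation with bandwidth bw = h * N^(-1/5) * SD_M, a Gaussian kernel K, and rescaled weights W = K(z) / K(0); the bandwidth factor `h = 2.0` and the function names are assumptions, not values taken from the text.

```python
import math


def gaussian_kernel(z):
    """Standard normal density evaluated at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def lsem_weights(ages, focal_age, h=2.0):
    """Weight every respondent relative to one focal age.

    h is an assumed bandwidth factor. Weights equal 1 at the focal age
    and approach 0 for respondents far from it.
    """
    n = len(ages)
    mean = sum(ages) / n
    sd = math.sqrt(sum((a - mean) ** 2 for a in ages) / (n - 1))
    bw = h * n ** (-1.0 / 5.0) * sd        # bandwidth from N and SD of the moderator
    k0 = gaussian_kernel(0.0)              # kernel value at the focal point
    weights = []
    for a in ages:
        z = (a - focal_age) / bw           # scaled distance to the focal point
        weights.append(gaussian_kernel(z) / k0)
    return weights


# Hypothetical sample: one respondent per year of age from 15 to 90.
ages = list(range(15, 91))
w = lsem_weights(ages, focal_age=50.0)
print(w[ages.index(50)])  # 1.0
```

Because the Gaussian kernel has no finite support, every respondent enters every model, consistent with the description that distant respondents receive very low but non-zero weights.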

Moderated Non-Linear Factor Analysis
We used MNLFA (Bauer, 2016) as a second method to analyze measurement invariance across age. Bauer refers to moderation of an item's factor loading or threshold as differential item functioning (DIF). Following Bauer, we tested for DIF by comparing (a) models with DIF for a particular item and (b) a model with no DIF. These models were nested, and we used the scaled nested chi-square test (Satorra & Bentler, 2001) for model comparisons. We then kept DIF for the item resulting in the largest improvement in fit and added DIF for a second item, testing whether this improved fit. Finally, we used the model with the best fit to estimate factor scores for each respondent, accounting for measurement non-invariance.
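A nested comparison of this kind can be sketched as follows. The sketch uses the loglikelihood-based variant of the scaled difference test, which is an assumption about how the comparison would be computed when the models are estimated by robust maximum likelihood; the function name and the numbers in the example are hypothetical.

```python
def scaled_lr_difference(ll_nested, ll_full, p_nested, p_full, c_nested, c_full):
    """Scaled difference test computed from loglikelihoods.

    ll_*: loglikelihoods, p_*: numbers of free parameters,
    c_*: scaling correction factors of the nested (no DIF) and
    full (with DIF) model. Returns (test statistic, degrees of freedom).
    """
    df = p_full - p_nested
    # Scaling correction for the difference test
    cd = (p_nested * c_nested - p_full * c_full) / (p_nested - p_full)
    trd = -2.0 * (ll_nested - ll_full) / cd
    return trd, df


# Hypothetical values, not results from the analysis.
stat, df = scaled_lr_difference(
    ll_nested=-1000.0, ll_full=-990.0,
    p_nested=10, p_full=12,
    c_nested=1.2, c_full=1.25,
)
print(round(stat, 3), df)  # 13.333 2
```

The statistic is referred to a chi-square distribution with `df` degrees of freedom; a significant result favours the model with DIF for the item in question.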
We refer to Bauer (2016) for technical details of the MNLFA approach. The MNLFA code later in this supplemental material shows how we modeled DIF for items.
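In general terms, MNLFA lets model parameters depend on the moderator. The following is a sketch in assumed notation, with parameters written as linear (or log-linear) functions of age for item $j$ and respondent $i$; the specification actually used may differ, for example by including squared age.

```latex
\begin{align*}
\lambda_{ji} &= \lambda_{0j} + \lambda_{1j}\,\mathrm{age}_i
  && \text{(factor loading; DIF if } \lambda_{1j} \neq 0\text{)} \\
\nu_{ji} &= \nu_{0j} + \nu_{1j}\,\mathrm{age}_i
  && \text{(intercept or threshold; DIF if } \nu_{1j} \neq 0\text{)} \\
\alpha_{i} &= \alpha_{0} + \alpha_{1}\,\mathrm{age}_i
  && \text{(factor mean)} \\
\psi_{i} &= \psi_{0}\exp\!\left(\omega_{1}\,\mathrm{age}_i\right)
  && \text{(factor variance, kept positive)}
\end{align*}
```

A model with no DIF fixes $\lambda_{1j}$ and $\nu_{1j}$ to zero for all items, so that age affects responses only through the factor mean and variance.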

Measurement Invariance across Countries
We first tested for measurement invariance across countries. The statement usevariables = predj_r lkrsp_r trtbd_r country in the code chunk below lists the variables used in this part of the analysis: predj_r is the recoded 3-point version of the original ESS variable "predage" (prejudice because of age), lkrsp_r is the recoded 3-point version of the original variable "lkrspag" (lack of respect because of age), and trtbd_r is the recoded 3-point version of the original variable "trtbdag" (treated badly because of age). We first estimated traditional measurement invariance across all countries. The estimation was done with Mplus, using MplusAutomation (Hallquist & Wiley, 2016):

    library(MplusAutomation)

    # Multiple-group model with country as the known class variable
    mymodel <- mplusObject(
      VARIABLE = "
        usevariables = predj_r lkrsp_r trtbd_r country;
        ... (29);
        knownclass=c(country);",
      ANALYSIS = "
        model = configural metric scalar;
        estimator = mlf;
        algorithm = integration;
        type = mixture;",
      MODEL = "
        %overall%
        discrim BY predj_r lkrsp_r trtbd_r;",
      OUTPUT = "
        tech1 tech8 cinterval;",
      rdata = ESSdata)

    # Run mymodel
    myresults <- mplusModeler(mymodel, modelout = "CountriesMetricGroupAll.inp", run = 1L)

Given the negative findings for metric invariance (p < .001), we tested for approximate measurement invariance across countries. The analysis of the full sample indicated substantial non-invariance across countries, and in an exploratory step we formed two groups of countries based on tests of approximate measurement invariance, resulting in the following grouping:

Perceived Age Discrimination across Age
Tests of measurement invariance across age used three different approaches. The first two, LSEM and MNLFA, can assess measurement invariance across a continuous variable; the third, the alignment analysis, compares age groups.

Perceived Age Discrimination across Age in Single Countries
The final analysis estimated factor scores across age for each country separately, using MNLFA models with DIF for prejudice because of age and for treated badly because of age: