Reliability estimates for three factor score predictors

Estimates for the reliability of Thurstone's regression factor score predictor, Bartlett's factor score predictor, and McDonald's factor score predictor were proposed. As in Kuder-Richardson's formula, the reliability estimates are based on a hypothetical set of equivalent items. The reliability estimates were compared by means of simulation studies. Overall, the reliability estimates were largest for the regression score predictor, so that the reliability estimates for Bartlett's and McDonald's factor score predictors should be compared with the reliability of the regression score predictor whenever Bartlett's or McDonald's factor score predictor is to be computed. An R-script and an SPSS-script for the computation of the respective reliability estimates are presented.


Introduction
Factor score predictors may be computed whenever individual scores on the factors are of interest. If, for example, decisions are made on the individual level (e.g., in personnel selection), an individual score is needed. When factor score predictors are used in order to compute individual scores, it would be helpful to know whether they are valid and reliable. The coefficient of determinacy, i.e., the correlation of the factor score predictor with the factor [1], has been related to the validity of factor score predictors [2]. However, the reliability of factor score predictors has rarely been investigated. It should be noted that the reliability of factor score predictors should not be confounded with the reliability of the factors. An index for the reliability of a factor has been proposed [3,4]. This index has been found to represent the proportion of variance due to all common factors [5], and a specific form of this index has been proposed to represent the reliability of the general factor in hierarchical models. Of course, the reliability of the factors might be of interest whenever factor models are estimated. However, it is impossible to compute the individual scores on the factors because the number of common and unique factors exceeds the number of observed variables [6]. Therefore, factor score predictors have to be computed whenever individual scores representing the factors are needed. Accordingly, the reliability of the factor score predictors should be estimated.
Moreover, specific reliability estimates of factor score predictors may depend on the factor score predictors that are considered. For example, a reliability estimate for Harman's ideal variable factor score predictor [7] has been proposed [8]. However, Thurstone's regression factor score predictor [9], Bartlett's factor score predictor [10], and McDonald's correlation-preserving factor score predictor [11] are probably used more often than Harman's ideal variable factor score predictor. Therefore, the present paper aims at proposing reliability estimates for Thurstone's regression factor score predictor, Bartlett's factor score predictor, and McDonald's correlation-preserving factor score predictor.
Moreover, the effect of the size of loadings, the number of variables, the inter-correlation of the factors, and sampling error on the reliability estimates for the three factor score predictors will be investigated by means of a simulation study. It is, however, possible that the factor model does not perfectly hold in a given sample, which is typically referred to as model error [12,13]. Accordingly, the effect of model error on the reliability estimates of the factor score predictors is also investigated by means of a simulation study. Finally, an R script is presented that allows for the computation of the reliability estimates for the factor score predictors starting from the loading pattern, the factor inter-correlations, and the item covariances.

Definitions
In the population, the common factor model can be defined as
$$x = \Lambda f + e, \qquad (1)$$
where $x$ is the random vector of observations or items of order $p$, $f$ is the random vector of common factor scores of order $q$, $e$ is the random error vector or unique vector of order $p$, and $\Lambda$ is the factor pattern matrix of order $p$ by $q$. Thus, there are $p$ observed variables and $q$ common factors. The covariance matrix of the observed variables can be decomposed into
$$\Sigma = \Lambda \Phi \Lambda' + \Psi^2, \qquad (2)$$
where $\Phi$ represents the $q$ by $q$ factor correlation matrix and $\Psi^2$ is a $p$ by $p$ diagonal matrix representing the expected covariance of the error scores $e$ ($\mathrm{Cov}[e,e] = E[ee'] = \Psi^2$). It is assumed that the diagonal of $\Psi^2$ contains only positive values and that the expectation of the non-diagonal elements is zero.
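The decomposition of the item covariances into a common part and a diagonal unique part can be illustrated numerically. The following is a minimal sketch in plain Python, not the script from the appendices. It assumes standardized items (unit variances), so that the unique variances are one minus the communalities; the function name `model_covariance` and the small two-factor loading pattern are hypothetical and chosen only for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def model_covariance(loadings, phi):
    """Return (Sigma, psi2) with Sigma = Lambda Phi Lambda' + Psi^2.

    Assumption (not from the text): items are standardized, so each unique
    variance is 1 minus the communality of the item.
    """
    common = matmul(matmul(loadings, phi), transpose(loadings))
    psi2 = [1.0 - common[i][i] for i in range(len(common))]
    if min(psi2) <= 0.0:
        raise ValueError("communality must stay below 1 for every item")
    sigma = [row[:] for row in common]
    for i, unique_variance in enumerate(psi2):
        sigma[i][i] += unique_variance
    return sigma, psi2


# Hypothetical pattern: p = 6 items, q = 2 correlated factors.
loadings = [[0.7, 0.0], [0.6, 0.0], [0.5, 0.1],
            [0.0, 0.7], [0.0, 0.6], [0.1, 0.5]]
phi = [[1.0, 0.3], [0.3, 1.0]]
sigma, psi2 = model_covariance(loadings, phi)
```

The off-diagonal elements of `sigma` are determined entirely by the loadings and the factor inter-correlations, which is the property the reliability estimates rely on.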

Reliability of factor score predictors
The starting point is Kuder and Richardson's consideration that a sum of items might be correlated with a sum of hypothetical equivalent items in order to estimate the reliability of the item sum [14]. In the following, $x_1$ is an empirical set of $p$ items and $x_2$ is a hypothetical set of equivalent items.
According to Kuder and Richardson, equivalence means that the items in the hypothetical item set are interchangeable with the items in the empirical item set. Thus, the members of each item pair (comprising an empirical item and a hypothetical item) have the same difficulty and are correlated to the extent of their respective reliabilities. This implies that the inter-item correlations within each set of items need not be equal, i.e., that the items within each set need not be parallel items. It is, nevertheless, instructive to present the correlation between the unit-weighted scales resulting for two equivalent sets of parallel items with unit variance in matrix form.
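According to the equivalence of the two item sets, $\Lambda_1 = \Lambda_2$, $\Phi_1 = \Phi_2$, $\Psi_1 = \Psi_2$, and $E[e_1 e_2'] = 0$, so that $E[x_1 x_2'] = \Lambda_1 \Phi_1 \Lambda_1'$. In the following, $\mathrm{diag}(\cdot)$ denotes the diagonal matrix containing the main diagonal of its argument. Entering the regression factor score predictor, $\hat{f}_r = \Phi_1 \Lambda_1' \Sigma_1^{-1} x$, for both item sets yields the reliabilities as the diagonal of the correlation between the two predictors,
$$\mathrm{diag}\,\mathrm{Cor}(\hat{f}_{r1}, \hat{f}_{r2}) = \mathrm{diag}(\Phi_1 \Lambda_1' \Sigma_1^{-1} \Lambda_1 \Phi_1)^{-1}\, \mathrm{diag}(\Phi_1 \Lambda_1' \Sigma_1^{-1} \Lambda_1 \Phi_1 \Lambda_1' \Sigma_1^{-1} \Lambda_1 \Phi_1). \qquad (9)$$
Entering Bartlett's factor score predictor, $\hat{f}_B = (\Lambda_1' \Psi_1^{-2} \Lambda_1)^{-1} \Lambda_1' \Psi_1^{-2} x$, yields a covariance of $\Phi_1$ between the two predictors and a variance of $\Phi_1 + (\Lambda_1' \Psi_1^{-2} \Lambda_1)^{-1}$ within each item set. It follows from $\mathrm{diag}(\Phi_1) = I$ that the resulting reliabilities can be transformed into
$$\mathrm{diag}\,\mathrm{Cor}(\hat{f}_{B1}, \hat{f}_{B2}) = \mathrm{diag}(\Phi_1 + (\Lambda_1' \Psi_1^{-2} \Lambda_1)^{-1})^{-1}. \qquad (14)$$
Entering McDonald's correlation-preserving factor score predictor, $\hat{f}_M = B_1' x$ with $B_1' = N_1 (N_1' \Lambda_1' \Psi_1^{-2} \Sigma_1 \Psi_1^{-2} \Lambda_1 N_1)^{-1/2} N_1' \Lambda_1' \Psi_1^{-2}$ and $\Phi_1 = N_1 N_1'$, whose covariance matrix is $\Phi_1$, yields, after some transformation,
$$\mathrm{diag}\,\mathrm{Cor}(\hat{f}_{M1}, \hat{f}_{M2}) = \mathrm{diag}(B_1' \Lambda_1 \Phi_1 \Lambda_1' B_1). \qquad (15)$$
Thus, as for the Kuder-Richardson formula, only the parameters of the empirical items are necessary in order to calculate the reliabilities when the hypothetical item set is equivalent. The equivalence of the items implies that the parameters of the factor model will be identical for the two item sets.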
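The logic of correlating a factor score predictor with the same predictor computed from an equivalent item set can be sketched for any linear predictor $W'x$. Under the stated equivalence (identical model parameters, uncorrelated unique factors across item sets), the covariance between the two predictors is $W'\Lambda\Phi\Lambda'W$ and the variance of each is $W'\Sigma W$. The code below is a hedged illustration in plain Python, not the R- or SPSS-script of the appendices. The weight matrices $\Sigma^{-1}\Lambda\Phi$ (regression) and $\Psi^{-2}\Lambda(\Lambda'\Psi^{-2}\Lambda)^{-1}$ (Bartlett) are the usual textbook definitions and are an assumption here; all function names and the one-factor example are hypothetical, and items are assumed to be standardized.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(a)
    m = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        scale = m[c][c]
        m[c] = [v / scale for v in m[c]]
        for r in range(n):
            if r != c:
                factor = m[r][c]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]


def model_covariance(loadings, phi):
    """Sigma = Lambda Phi Lambda' + Psi^2 for standardized items (assumption)."""
    sigma = matmul(matmul(loadings, phi), transpose(loadings))
    psi2 = [1.0 - sigma[i][i] for i in range(len(sigma))]
    for i in range(len(sigma)):
        sigma[i][i] = 1.0
    return sigma, psi2


def regression_weights(loadings, phi, sigma):
    """Textbook regression score weights: Sigma^-1 Lambda Phi."""
    return matmul(matmul(inverse(sigma), loadings), phi)


def bartlett_weights(loadings, psi2):
    """Textbook Bartlett weights: Psi^-2 Lambda (Lambda' Psi^-2 Lambda)^-1."""
    scaled = [[v / u for v in row] for row, u in zip(loadings, psi2)]
    return matmul(scaled, inverse(matmul(transpose(loadings), scaled)))


def reliability(weights, loadings, phi, sigma):
    """Correlation of W'x1 with W'x2 for two equivalent item sets."""
    wt = transpose(weights)
    wl = matmul(wt, loadings)
    true_cov = matmul(matmul(wl, phi), transpose(wl))   # W' L Phi L' W
    total_cov = matmul(matmul(wt, sigma), weights)      # W' Sigma W
    return [true_cov[k][k] / total_cov[k][k] for k in range(len(phi))]


# Hypothetical one-factor example with four items.
loadings = [[0.7], [0.6], [0.5], [0.4]]
phi = [[1.0]]
sigma, psi2 = model_covariance(loadings, phi)
rel_regression = reliability(regression_weights(loadings, phi, sigma), loadings, phi, sigma)
rel_bartlett = reliability(bartlett_weights(loadings, psi2), loadings, phi, sigma)
```

For a one-factor model the two values coincide, which matches the statement that the predictors have the same reliability for q = 1. McDonald's predictor is omitted from the sketch because it requires a matrix square root.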

Comparing reliability estimates for different factor score predictors
It should be noted that the formulas for the reliability estimates of factor score predictors are based on the condition of equal factor models and, especially, on $\Lambda_1 = \Lambda_2$, $\Phi_1 = \Phi_2$, $\Psi_1 = \Psi_2$, and $E[e_1 e_2'] = 0$. This means that all true variance and all reliability comes from the amount of variance that is due to $f_1$. Thus, the factor score predictor with the highest correlation with $f_1$ should have the highest reliability. The regression score predictor has the highest correlation with the factor [16], so that its reliability should be at least as large as the reliabilities of Bartlett's and McDonald's factor score predictors. Although the regression score predictor has the same or a larger reliability than the other two factor score predictors, the conditions for having an equal reliability are also of interest. Theorem 1 shows that the reliabilities of the regression factor score predictor and the Bartlett factor score predictor are equal when $\Phi_1 = I$ and $\Lambda_1' \Sigma_1^{-1} \Lambda_1 = \mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1)$, which is the case when the factors are orthogonal and there is only one non-zero factor loading of each variable (perfect simple structure).
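Proof. For $\Phi_1 = I$, postmultiplication of $\Sigma_1 = \Lambda_1 \Lambda_1' + \Psi_1^2$ with $\Psi_1^{-2} \Lambda_1$ gives $\Sigma_1 \Psi_1^{-2} \Lambda_1 = \Lambda_1 (\Lambda_1' \Psi_1^{-2} \Lambda_1 + I)$, so that $\Sigma_1^{-1} \Lambda_1 = \Psi_1^{-2} \Lambda_1 (\Lambda_1' \Psi_1^{-2} \Lambda_1 + I)^{-1}$. Premultiplication with $\Lambda_1'$ and some transformation yields $(\Lambda_1' \Sigma_1^{-1} \Lambda_1)^{-1} = I + (\Lambda_1' \Psi_1^{-2} \Lambda_1)^{-1}$. The reliability of the Bartlett factor score predictor is therefore $\mathrm{diag}((\Lambda_1' \Sigma_1^{-1} \Lambda_1)^{-1})^{-1}$, whereas the reliability of the regression factor score predictor is $\mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1)^{-1} \mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1 \Lambda_1' \Sigma_1^{-1} \Lambda_1)$. According to the conditions of Theorem 1, $\Lambda_1' \Sigma_1^{-1} \Lambda_1$ is diagonal, so that both expressions reduce to $\mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1)$. This completes the proof.
Theorem 2 shows that the reliabilities of the regression factor score predictor and the McDonald factor score predictor are equal when $\Phi_1 = I$ and $\Lambda_1' \Sigma_1^{-1} \Lambda_1 = \mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1)$.
Proof. For $\Phi_1 = I$, McDonald's factor score predictor is $(\Lambda_1' \Psi_1^{-2} \Sigma_1 \Psi_1^{-2} \Lambda_1)^{-1/2} \Lambda_1' \Psi_1^{-2} x$ and has unit variances. With $M = \Lambda_1' \Psi_1^{-2} \Lambda_1$, it follows that $\Lambda_1' \Psi_1^{-2} \Sigma_1 \Psi_1^{-2} \Lambda_1 = M^2 + M$, so that the covariance between the predictors of the two item sets is $(M^2 + M)^{-1/2} M^2 (M^2 + M)^{-1/2} = M (M + I)^{-1} = \Lambda_1' \Sigma_1^{-1} \Lambda_1$. The reliability of McDonald's factor score predictor is therefore $\mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1)$, which equals the reliability of the regression factor score predictor when $\Lambda_1' \Sigma_1^{-1} \Lambda_1$ is diagonal. This completes the proof.
Thus, the three factor score predictors considered here have the same reliability for q = 1 and for orthogonal models with q > 1 and only one non-zero factor loading of each variable (perfect simple structure). However, these considerations do not allow for a quantification of the relative differences of the reliabilities of the factor score predictors. Therefore, simulation studies were performed in order to give an account of the reliabilities of the three factor score predictors under different conditions. First, a simulation study was performed at the level of the population for item sets for which the factor model holds in the population.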
Simulation Study 1. The first short simulation study describes the effects of different population parameters on the reliability estimates. The simulation study was performed with IBM SPSS Version 22 and gives an account of the reliability estimates for the three factor score predictors for q = 6, depending on the number of main loadings per factor p/q (5, 10), the size of main loadings l (.40, .50, .60, .70, .80), the size of secondary loadings sl (.00, .10), and the size of the factor inter-correlations r (.00, .30). This results in (2 levels of p/q × 5 levels of l × 2 levels of sl × 2 levels of r) = 40 population models, for which population correlation matrices of observed variables were generated according to Equation 2. The models with p/q = 5 were based on 30 observed variables and the models with p/q = 10 were based on 60 observed variables.
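The fully crossed design can be enumerated in a few lines. This is only a sketch of the design grid, not the SPSS code used for the study. The placement of the secondary loadings is not described in the text, so the sketch assumes, as a labeled simplification, that every item carries the secondary loading on all factors other than its own; the names `loading_pattern`, `factor_correlations`, and `design` are hypothetical.

```python
from itertools import product

Q = 6  # number of factors in all population models


def loading_pattern(items_per_factor, main, secondary, q=Q):
    """Build a p x q pattern: each item loads `main` on its own factor.

    Assumption (not from the text): `secondary` is placed on every other factor.
    """
    rows = []
    for factor in range(q):
        for _ in range(items_per_factor):
            rows.append([main if k == factor else secondary for k in range(q)])
    return rows


def factor_correlations(r, q=Q):
    """q x q factor correlation matrix with equal inter-correlations r."""
    return [[1.0 if j == k else r for k in range(q)] for j in range(q)]


# 2 x 5 x 2 x 2 fully crossed conditions.
design = [
    {"items_per_factor": ipf, "main": l, "secondary": sl, "r": r}
    for ipf, l, sl, r in product((5, 10), (0.4, 0.5, 0.6, 0.7, 0.8), (0.0, 0.1), (0.0, 0.3))
]

first = design[0]
pattern = loading_pattern(first["items_per_factor"], first["main"], first["secondary"])
phi = factor_correlations(first["r"])
```

Each entry of `design` can then be turned into a population correlation matrix by combining its loading pattern and factor correlation matrix according to the factor model.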
The reliability estimates for the factor score predictors were computed from the population parameters of the factor model ($\Lambda$, $\Phi$, $\Psi$) and the corresponding item covariances ($\Sigma$) by means of Equations 9, 14, and 15. The results are summarized in Figure 1. No pronounced reliability differences occurred when the secondary loadings (sl) were zero, especially when only reliabilities greater than .70 are considered. For sl = .10 and factor inter-correlations of .30, the regression score predictor had a notably larger reliability than Bartlett's factor score predictor and McDonald's factor score predictor. The differences between the reliability estimates for Bartlett's factor score predictor and McDonald's factor score predictor were very small.
Fig. 1. Reliability estimates for the regression factor score predictor, Bartlett's factor score predictor, and McDonald's factor score predictor for population models with q = 6. The horizontal line marks a reliability of .70 (Rtt = Reliability estimate, l = salient loadings, sl = secondary loadings, r = factor inter-correlations).
Simulation Study 2. The next simulation is based on samples that are drawn from populations with the same model parameters as in the previous simulation. The simulation study was again performed with IBM SPSS Version 22. For each of the 40 population models of the previous simulation study, 1,000 samples with n = 500 cases and 1,000 samples with n = 1,000 cases were drawn. Random numbers for the samples of factor scores were generated by means of the SPSS Mersenne Twister random number generator. The corresponding samples of observed variables were generated from the common and unique factor scores by means of Equation 2. Maximum-likelihood factor analysis with subsequent Varimax-rotation for orthogonal population factor models and with Promax-rotation (kappa = 4) for correlated factor models was performed in each sample of observed variables, and the corresponding factor score reliabilities were computed from Equations 9, 14, and 15. The results can be found in Figure 2.
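Drawing one sample of observed variables from common and unique factor scores can be sketched as follows. This is an illustrative plain-Python stand-in for the SPSS procedure, not a reproduction of it: Python's `random` module also uses a Mersenne Twister generator, but its random number streams will not match those of SPSS. Normally distributed scores, standardized items, and the names `cholesky` and `draw_sample` are assumptions made for the sketch; the subsequent maximum-likelihood factor analysis and rotation are not shown.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def draw_sample(loadings, phi, n, rng):
    """Generate n cases of x = Lambda f + e with Cov(f) = phi.

    Assumption: items are standardized, so the unique standard deviation of
    an item is sqrt(1 - communality).
    """
    q = len(phi)
    chol = cholesky(phi)
    unique_sd = []
    for row in loadings:
        communality = sum(row[j] * phi[j][k] * row[k] for j in range(q) for k in range(q))
        unique_sd.append(math.sqrt(1.0 - communality))
    sample = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(q)]
        # Correlated common factor scores: f = chol * z
        f = [sum(chol[j][k] * z[k] for k in range(j + 1)) for j in range(q)]
        sample.append([
            sum(l * score for l, score in zip(row, f)) + sd * rng.gauss(0.0, 1.0)
            for row, sd in zip(loadings, unique_sd)
        ])
    return sample


# Hypothetical small model: 6 items, 2 factors correlated .30, n = 500 cases.
loadings = [[0.7, 0.0], [0.6, 0.0], [0.5, 0.1],
            [0.0, 0.7], [0.0, 0.6], [0.1, 0.5]]
phi = [[1.0, 0.3], [0.3, 1.0]]
sample = draw_sample(loadings, phi, n=500, rng=random.Random(1))
```

Passing an explicitly seeded `random.Random` instance keeps each replication reproducible without touching the global generator.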
The results of the simulation study for the samples are essentially the same as the results for the population parameters, with the highest reliability for the regression factor score predictor. The main difference from the results of the simulation study for the population is that the Bartlett factor score predictor is substantially more reliable than the McDonald factor score predictor when the factor inter-correlations are substantial and when there are substantial secondary loadings.
Fig. 2. Reliability estimates for the regression factor score predictor, Bartlett's factor score predictor, and McDonald's factor score predictor for samples based on population models with q = 6. The horizontal line marks a reliability of .70 (Rtt = Reliability estimate, l = salient loadings, sl = secondary loadings, r = factor inter-correlations).
Simulation Study 3. The third simulation study was again based on the population parameters of the first and second simulation study. The only difference is that the simulation study was based on imperfect models, that is, on population models that do not exactly fit the population covariance matrix [12,13]. Imperfect models were generated as proposed by MacCallum and Tucker [13]. The population correlation matrices were generated from the loadings of the major factors corresponding to the factors in simulation studies 1 and 2 as well as from the loadings of 100 'minor factors' and from the corresponding uniquenesses. Minor factors have very small nonzero population loadings and represent the 'many minor influences' that are thought to affect the values of the observed scores in the real world. Again, maximum-likelihood factor analysis with subsequent Varimax-rotation for orthogonal population factor models and with Promax-rotation (kappa = 4) for correlated factor models was performed in each sample of observed variables, and the corresponding factor score reliabilities were computed. The results for the imperfect models were extremely similar to those presented in simulation study 2, so that an additional figure was not necessary. Thus, imperfect models did not affect the reliability estimates substantially.

Reliability of the regression score predictor and the coefficient of determinacy
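In the following, the reliability estimate for the regression score predictor is compared with the determinacy coefficient [1] in order to give an account of the relation between reliability and validity. The covariances of the regression factor score predictor with the corresponding common factors are the diagonal elements of $\Phi_1 \Lambda_1' \Sigma_1^{-1} \Lambda_1 \Phi_1$. The standard deviation of the factor is one and the standard deviation of the regression factor score predictor is $\mathrm{diag}(\Phi_1 \Lambda_1' \Sigma_1^{-1} \Lambda_1 \Phi_1)^{1/2}$. Accordingly, the factor score determinacy, i.e., the correlation of the regression score predictor with the corresponding common factors, is
$$\mathrm{diag}\,\mathrm{Cor}(\hat{f}_{r1}, f_1) = \mathrm{diag}(\Phi_1 \Lambda_1' \Sigma_1^{-1} \Lambda_1 \Phi_1)\, \mathrm{diag}(\Phi_1 \Lambda_1' \Sigma_1^{-1} \Lambda_1 \Phi_1)^{-1/2} = \mathrm{diag}(\Phi_1 \Lambda_1' \Sigma_1^{-1} \Lambda_1 \Phi_1)^{1/2}.$$
When the common variance of the factor and the regression factor score predictor is computed for the empirical factor models considered above, this yields $\mathrm{diag}(\Phi_1 \Lambda_1' \Sigma_1^{-1} \Lambda_1 \Phi_1)$. For orthogonal factor models with $\Phi_1 = I$ and $\Lambda_1' \Sigma_1^{-1} \Lambda_1 = \mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1)$, the reliability of the regression score predictor, $\mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1)^{-1} \mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1 \Lambda_1' \Sigma_1^{-1} \Lambda_1)$, also reduces to $\mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1)$. Thus, for orthogonal factor models with only one loading of each variable on one factor, the reliability estimate of the regression score predictor corresponds to the coefficient of determinacy in terms of common variance. Since it has been shown that the reliability estimates of the regression score predictor, Bartlett's factor score predictor, and McDonald's factor score predictor are equal under these conditions, it follows that the abovementioned reliability estimates of the factor score predictors are equal to the determinacy coefficient for $\Phi_1 = I$ and $\Lambda_1' \Sigma_1^{-1} \Lambda_1 = \mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1)$.
Theorem 3 describes the relation between the reliability estimate of the regression factor score predictor and factor score determinacy for orthogonal factor models that are identical across measurement occasions when $\Lambda_1' \Sigma_1^{-1} \Lambda_1 \neq \mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1)$: in this case, the common variance of the factor and the regression factor score predictor is a lower bound of the reliability estimate.
Proof. Let $D = \mathrm{diag}(\Lambda_1' \Sigma_1^{-1} \Lambda_1)$ and $H = \Lambda_1' \Sigma_1^{-1} \Lambda_1 - D$. Since the diagonal of $H$ is zero, $\mathrm{diag}(DH) = \mathrm{diag}(HD) = 0$, so that $\mathrm{diag}((D + H)(D + H)) = \mathrm{diag}(DD) + \mathrm{diag}(HH)$. Since these diagonal elements are sums of squared elements, it follows that $\mathrm{diag}(HH) \geq 0$ and $\mathrm{diag}(DD) \geq 0$. The reliability of the regression factor score predictor is therefore $D^{-1}(\mathrm{diag}(DD) + \mathrm{diag}(HH)) = D + D^{-1} \mathrm{diag}(HH) \geq D$. This completes the proof. The reliability is larger than the common variance of the factor and the regression factor score predictor whenever the corresponding row of $H$ contains non-zero elements, which can occur when there are non-zero secondary loadings.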
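The relation between determinacy and reliability for orthogonal models can be checked numerically. The sketch below is a hedged illustration in plain Python: it assumes the textbook regression score weights $\Sigma^{-1}\Lambda$ for $\Phi = I$, standardized items, and the equivalence conditions used throughout, so that the squared determinacy is the diagonal of $A = \Lambda'\Sigma^{-1}\Lambda$ and the reliability is the diagonal of $AA$ divided by the diagonal of $A$. The function name and both loading patterns are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(a)
    m = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        scale = m[c][c]
        m[c] = [v / scale for v in m[c]]
        for r in range(n):
            if r != c:
                factor = m[r][c]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]


def determinacy_and_reliability(loadings):
    """Squared determinacy and reliability of regression scores for Phi = I.

    Assumption: standardized items, so Sigma has a unit diagonal.
    """
    sigma = matmul(loadings, transpose(loadings))
    for i in range(len(sigma)):
        sigma[i][i] = 1.0
    a = matmul(matmul(transpose(loadings), inverse(sigma)), loadings)
    a_squared = matmul(a, a)
    q = len(a)
    squared_determinacy = [a[k][k] for k in range(q)]
    reliability = [a_squared[k][k] / a[k][k] for k in range(q)]
    return squared_determinacy, reliability


# Perfect simple structure versus a pattern with secondary loadings of .10.
simple = [[0.7, 0.0], [0.6, 0.0], [0.5, 0.0], [0.0, 0.7], [0.0, 0.6], [0.0, 0.5]]
complex_ = [[0.7, 0.1], [0.6, 0.1], [0.5, 0.1], [0.1, 0.7], [0.1, 0.6], [0.1, 0.5]]
det2_simple, rel_simple = determinacy_and_reliability(simple)
det2_complex, rel_complex = determinacy_and_reliability(complex_)
```

With perfect simple structure the two quantities agree for every factor, while the secondary loadings in the second pattern push the reliability above the squared determinacy.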

Discussion
Reliability estimates for Thurstone's regression factor score predictor, Bartlett's factor score predictor, and McDonald's factor score predictor were proposed. As in Kuder-Richardson's formula, the reliability estimates are based on a hypothetical set of equivalent items. The reliability estimates were, moreover, based on the assumption that the true variance of the items stems only from the common factors and that the error or unique variances of the items do not contribute to the reliability of the factor score predictors. Other assumptions might be possible, e.g., for hierarchical factor models, when the unique variance of a second-order factor analysis already represents some amount of true score variance. However, this is not the standard case and the aim of the present study was to propose reliability estimates for the common case. It was shown that the reliability estimates are equal for the three factor score predictors when they are based on a one-factor model or when there are orthogonal factors with only one non-zero loading of each item on a factor.
The reliability estimates of the three factor score predictors were compared by means of a simulation study for the population and by means of simulation studies for samples drawn from a population in which the factor model holds as well as for samples drawn from a population in which the factor model does not hold. It was found in the population-based simulation study that the reliability estimates were largest for the regression factor score predictor and that the differences between the reliability estimates for Bartlett's factor score predictor and McDonald's factor score predictor were small. Especially for models with correlated factors and substantial secondary loadings, the regression factor score predictor had substantially larger reliability estimates. In contrast, for orthogonal factors and when only substantial reliabilities (> .70) were considered, the differences between the reliability estimates for all three factor score predictors were small. The results of the simulation studies for the samples were very similar to the results of the population-based simulation study. The regression factor score predictor was most reliable across all conditions. At best, the Bartlett and McDonald factor score predictors were as reliable as the regression factor score predictor. The only relevant difference between the sample-based simulation studies and the population-based simulation study was that, in the sample-based simulation studies, the Bartlett factor score predictor was substantially more reliable than the McDonald factor score predictor when there were substantial factor inter-correlations and non-zero secondary loadings. Thus, computing McDonald's factor score predictor may result in larger losses of reliability than computing Bartlett's factor score predictor. Using imperfect factor models for the simulation study did not affect the results.
Overall, the results of the simulation studies indicate that whenever Bartlett's or McDonald's factor score predictor is to be computed, the resulting reliability estimates should be compared with the reliability of the regression factor score predictor. This is necessary in order to investigate whether a substantial amount of reliability is lost by computing Bartlett's or McDonald's factor score predictor instead of the regression factor score predictor. An R-script (Appendix A) as well as an SPSS-script (Appendix B) were presented that allow for the respective calculations of the reliability estimates from the loading pattern and factor inter-correlations.
Finally, it was shown that the reliability estimates for the regression factor score predictor are equal to the determinacy coefficient for the one-factor model or when there are orthogonal factors with only one non-zero loading of each item on a factor. For orthogonal factor models with more than one non-zero loading of the items on a factor, the determinacy coefficient is a lower-bound estimate of the reliability of the regression factor score predictor. This result was not unexpected since the determinacy coefficient is based on the correlation of the regression factor score predictor with the factor.