Structural Parameters under Partial Least Squares and Covariance-Based Structural Equation Modeling: A Comment on Yuan and Deng (2021)

Abstract In their article, Yuan and Deng argue that a structural parameter under partial least squares structural equation modeling (PLS-SEM) is zero if and only if the same structural parameter is zero under covariance-based structural equation modeling (CB-SEM). Yuan and Deng then conclude that statistical tests on individual structural parameters assessing the null hypothesis of no effect can achieve the same purpose in CB-SEM and PLS-SEM. Our response to their article highlights that the relationship they find between PLS-SEM and CB-SEM structural parameters is not universally valid, and that consequently, tests on individual parameters in CB-SEM and PLS-SEM generally do not fulfill the same purpose.


Motivation
In a recent article, Yuan and Deng (2021) contribute to a better understanding of different types of factor scores and their relations by showing the relationship between Bartlett factor scores (Bartlett, 1937), regression factor scores (Thomson, 1934; Thurstone, 1935), and scores obtained by partial least squares structural equation modeling (PLS-SEM; Wold, 1982). In doing so, they elaborate on the relationship between regression and Bartlett factor scores and remind us that Bartlett and regression weights are rescaled versions of each other (Bartlett, 1938; Lawley & Maxwell, 1962). Similarly, they elaborate on the findings in the PLS-SEM literature that scores obtained by Mode B weights are (a) asymptotically proportional to regression factor scores if separately computed for each latent variable (see, e.g., Dijkstra, 1985, p. 57, and Dijkstra, 2010, p. 33), and (b) asymptotically univocal, i.e., scores for one factor are asymptotically not contaminated by variance of other factors (Lohmöller, 1989, pp. 100-107; see also Harman, 1976, p. 387). Based on these results, they show that the correlation between regression factor scores and Bartlett factor scores, respectively, equals the canonical correlation between the first pair of canonical variates for cases in which there are only two blocks of indicators. Further, Yuan and Deng (2021) propose a transformation of PLS-SEM Mode A weights into PLS-SEM Mode B weights. Such a transformation can be beneficial since PLS-SEM Mode A weights are known to be more numerically stable than Mode B weights (Dijkstra & Henseler, 2015a), while, as Yuan and Deng (2021) show, factor scores obtained by PLS-SEM Mode B attain maximum reliability asymptotically.
Besides these valuable insights, Yuan and Deng (2021) claim that a structural parameter under PLS-SEM equals zero if and only if the corresponding structural parameter is zero under covariance-based structural equation modeling (CB-SEM). Further, they argue that in structural models containing latent variables the population parameters are arbitrary, and then conclude that "for the purpose of modeling the relationship among latent variables, the proxies still permit unbiased test for the significance of each parameter estimate" (p. 558). More specifically, they claim that a "statistical test on individual parameters […] under PLS-SEM can achieve the same purpose as that under CB-SEM given the same overall-model structure" (p. 562). This is surprising because it is at odds with the various criticisms raised in the literature regarding the use of PLS-SEM for structural models containing latent variables (e.g., Goodhue et al., 2017; Rönkkö et al., 2015; Rönkkö & Evermann, 2013; Schuberth, 2021) and contradicts recent PLS-SEM guidelines recommending bias correction if PLS-SEM is applied to latent variable models (e.g., Benitez et al., 2020; Evermann & Rönkkö, in press; Schuberth et al., in press).
In this comment on Yuan and Deng's paper, we shed more light on the relationship between CB-SEM and PLS-SEM parameters and we show it is generally not true that a structural parameter under PLS-SEM is equal to zero if and only if the corresponding structural parameter equals zero under CB-SEM. In doing so, we draw on the literature on error-in-variables models, i.e., the methodological research strand that focuses on the consequences of variables contaminated by error. In addition, we present a scenario analysis which further illustrates the problem. Furthermore, we explain that statistical tests on individual structural parameters in most cases do not achieve the same purpose under PLS-SEM and CB-SEM. In the last section we give a conclusion.

Relationship between PLS-SEM and CB-SEM Structural Parameters
In their article, Yuan and Deng (2021) argue that a structural parameter under PLS-SEM is equal to zero if and only if the corresponding parameter is zero under CB-SEM, assuming that the model is correctly specified (see also Equation (20) in Yuan and Deng, 2021). According to Yuan and Deng, this is because in PLS-SEM each predictor will be correlated with its response variable whenever the latent predictor variable is correlated with the latent response variable and the two correlation coefficients have the same sign.
As we will show in the following argument, their statement is not generally valid and only true under very specific conditions, such as in structural models with a single latent predictor variable or with orthogonal latent predictor variables, which are unlikely to hold in situations where SEM is typically applied. Like Yuan and Deng (2021), we limit our focus to linear recursive structural equation models with uncorrelated structural disturbance terms. Each latent variable $\eta_j$ is assumed to be standardized and measured reflectively by a set of indicators $x_j = \lambda_j \eta_j + \varepsilon_j$ with $j = 1, \ldots, J$. As is common, we assume that the random measurement errors $\varepsilon_j$ have a mean of zero and are mutually uncorrelated. Further, we assume that the random measurement errors are uncorrelated with the latent variables and the structural disturbance terms. Additionally, it is assumed that the structural disturbance terms are uncorrelated with the exogenous latent variables.
To demonstrate that Yuan and Deng's (2021) statement about the relationship between PLS-SEM and CB-SEM structural parameters is not generally true, we draw on the literature on error-in-variables models, i.e., the literature that studies problems associated with variables contaminated by errors.¹ As Yuan and Deng (2021) pointed out, PLS-SEM creates proxies for each latent variable $\eta_j$ as $\hat{\eta}_j = w_j' x_j = w_j' \lambda_j \eta_j + w_j' \varepsilon_j = q_j \eta_j + \delta_j$ to estimate the structural parameters. Obviously, these proxies are representations of the latent variables contaminated by random measurement error. As is common in PLS-SEM, the weights are scaled to ensure that each proxy has unit variance, i.e., $\operatorname{var}(\hat{\eta}_j) = w_j' \Sigma_{jj} w_j = 1$, where $\Sigma_{jj}$ denotes the variance-covariance matrix of the indicators associated with the latent variable $\eta_j$. Since all latent variables are standardized, the variance of the composed random measurement error term $\delta_j$ equals $1 - q_j^2$, where $q_j^2$ is the reliability of proxy $\hat{\eta}_j$ for latent variable $\eta_j$. Following Dijkstra and Schermelleh-Engel (2014, see Equations 10 and 11), in cases where Mode A weights are used to build the proxies, the reliability of proxy $\hat{\eta}_j$ is given as $q_j^2 = (w_j' \lambda_j)^2 = (\lambda_j' \lambda_j)^2 / (\lambda_j' \Sigma_{jj} \lambda_j)$ (see Yuan & Deng, 2021, p. 562, and also Dijkstra, 1985, Chapter 2). Similarly, if Mode B weights are used to create the proxies, their reliability is given as $q_j^2 = \lambda_j' \Sigma_{jj}^{-1} \lambda_j$ (Dijkstra, 2010, Equation 1.24). Under PLS-SEM, the structural parameters of each equation of the structural model are estimated by ordinary least squares (OLS). Consequently, its use for structural equation models with latent variables resembles applying the naive estimator, i.e., the OLS estimator, to models with additive random measurement error in all variables, i.e., dependent and independent variables (see, e.g., Carroll et al., 2006, Chapter 2).
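As a rough numerical illustration of the two reliability expressions above, the following Python sketch evaluates them for a single block of indicators. It assumes standardized indicators under a one-factor model with uncorrelated errors, so that the indicator covariance matrix equals the outer product of the loadings plus a diagonal matrix of error variances. The loadings 0.5, 0.8, and 0.7 are borrowed from the scenario analysis later in this comment, and the function name `proxy_reliability` is hypothetical.

```python
import numpy as np


def proxy_reliability(loadings, mode="A"):
    """Population reliability of a PLS-SEM proxy for one block of indicators.

    Assumes standardized indicators, a single standardized latent variable,
    and mutually uncorrelated measurement errors (illustrative assumptions).
    """
    lam = np.asarray(loadings, dtype=float)
    # Model-implied covariance matrix of the block: lambda lambda' + Theta
    sigma = np.outer(lam, lam) + np.diag(1.0 - lam**2)
    if mode == "A":
        # (lambda' lambda)^2 / (lambda' Sigma lambda)
        return (lam @ lam) ** 2 / (lam @ sigma @ lam)
    if mode == "B":
        # lambda' Sigma^{-1} lambda
        return lam @ np.linalg.solve(sigma, lam)
    raise ValueError("mode must be 'A' or 'B'")


rel_a = proxy_reliability([0.5, 0.8, 0.7], mode="A")
rel_b = proxy_reliability([0.5, 0.8, 0.7], mode="B")
print(f"Mode A: {rel_a:.3f}, Mode B: {rel_b:.3f}")
```

Under these assumptions the Mode B value (about 0.75) is slightly larger than the Mode A value (about 0.74), which is in line with Mode B weights attaining maximum reliability asymptotically.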
For this type of model, Buonaccorsi (2010, Equation 5.7) has derived the expected value of the estimates obtained by the naive estimator. Applying Equation 5.7 of Buonaccorsi (2010) to the PLS-SEM context, the probability limit of the PLS-SEM structural parameter estimates is given as follows:²

$$\operatorname{plim} \hat{\gamma}^{\mathrm{PLS}} = q_{\mathrm{dep}} \left( Q \Sigma Q + I - Q Q \right)^{-1} Q \Sigma \gamma, \tag{1}$$

where $\Sigma$ is the population correlation matrix of the independent latent variables. Moreover, the matrix $Q$ contains the square root of the reliabilities of the proxies for the independent latent variables on the diagonal and $I$ is a unit matrix of the same dimension as $Q$. Similarly, $q_{\mathrm{dep}}$ equals the square root of the reliability of the dependent latent variable's proxy. Note that since in PLS-SEM the proxies are standardized, the random measurement errors comprised in the proxy for the dependent latent variable additionally distort the parameter estimates (e.g., Bollen, 1989, Chapter 5). Further, $\gamma$ is the vector containing the true standardized structural parameters of that equation, i.e., the probability limit of the standardized structural parameter estimates under CB-SEM. The effects of random measurement errors on PLS-SEM parameter estimates have also been pointed out in prior PLS-SEM literature (see, e.g., Dijkstra, 1985, 2010; Dijkstra & Henseler, 2015a, 2015b). Given that $Q$ is a diagonal matrix, Equation (1) reveals that it is generally not true that the probability limit of a structural parameter estimate under PLS-SEM equals zero if and only if the population counterpart equals zero, as Yuan and Deng (2021) have claimed. In fact, this is only the case under one of the following conditions: (i) in simple regression models, i.e., if the model has only one independent latent variable, (ii) if all independent latent variables are uncorrelated, (iii) if all independent latent variables have no effect on the dependent latent variable, or (iv) if the latent variables' proxies show a reliability of zero. Note that in the case of a proxy's reliability being zero, the indicators forming that proxy would be uncorrelated with all other indicators in the model. Consequently, neither CB-SEM nor PLS-SEM estimates are identified. Similarly, we emphasize that in case of isolated latent variables, the PLS-SEM weights used to build the proxies for these latent variables are not defined (Dijkstra, 1985, p. 72).

1 Error-in-variables models have also been studied in other fields such as econometrics. See, e.g., Wooldridge (2012, pp. 320-323).

2 The original Equation 5.7 of Buonaccorsi (2010) reads as follows:

$$E(\hat{\beta}_{1,\mathrm{naive}}) \approx \left( \Sigma_{XX} + \Sigma_u \right)^{-1} \Sigma_{XX} \beta_1 = \kappa \beta_1,$$

where the vector $\hat{\beta}_{1,\mathrm{naive}}$ contains the OLS estimates of the independent variables' parameters in a multiple regression model in which both dependent and independent variables are contaminated by additive random measurement errors. Additionally, $\Sigma_{XX}$ and $\Sigma_u$ are the expected variance-covariance matrices of the true independent variables (i.e., the independent variables measured without error) and the random measurement errors comprised in the observed independent variables, respectively. Further, $\beta_1$ is the vector containing the expected value of the independent variables' parameters in cases where there is no random measurement error. The matrix $\kappa$ is also known as the reliability matrix, which determines the degree of bias due to attenuation. Since PLS-SEM relies on estimated weights to form the proxies, we consider the probability limit of the PLS-SEM estimates instead of their expected value. Furthermore, in PLS-SEM all variables, i.e., the latent variables and their proxies, are assumed to be standardized, which is not the case in the formula of Buonaccorsi (2010). Therefore, the reliability of the dependent variables also needs to be taken into account.
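The probability limit in Equation (1) can be evaluated numerically. The sketch below assumes the form $q_{\mathrm{dep}}(Q \Sigma Q + I - QQ)^{-1} Q \Sigma \gamma$, which follows from the definitions of $Q$, $I$, $q_{\mathrm{dep}}$, and $\gamma$ given here. The function name `plim_pls` and the illustrative values (a reliability of 0.8 for every proxy and a predictor correlation of 0.5) are assumptions made for this example; they are not taken from Yuan and Deng (2021) or from the tables.

```python
import numpy as np


def plim_pls(gamma, corr, rel_indep, rel_dep):
    """Probability limit of OLS estimates based on error-contaminated proxies.

    gamma     : true standardized structural parameters of one equation
    corr      : correlation matrix of the independent latent variables
    rel_indep : reliabilities of the proxies for the independent variables
    rel_dep   : reliability of the proxy for the dependent variable
    """
    gamma = np.asarray(gamma, dtype=float)
    corr = np.asarray(corr, dtype=float)
    Q = np.diag(np.sqrt(np.asarray(rel_indep, dtype=float)))
    q_dep = np.sqrt(rel_dep)
    # Covariance matrix of the standardized proxies: Q Sigma Q + I - Q Q
    proxy_cov = Q @ corr @ Q + np.eye(len(gamma)) - Q @ Q
    # Covariances between predictor proxies and the dependent proxy,
    # premultiplied by the inverse proxy covariance matrix
    return q_dep * np.linalg.solve(proxy_cov, Q @ corr @ gamma)


gamma = [0.0, 0.5]  # the first true parameter is zero
correlated = plim_pls(gamma, [[1.0, 0.5], [0.5, 1.0]], [0.8, 0.8], 0.8)
uncorrelated = plim_pls(gamma, np.eye(2), [0.8, 0.8], 0.8)
print("correlated predictors:  ", correlated)
print("uncorrelated predictors:", uncorrelated)
```

With correlated predictors the first element is nonzero although the true parameter is zero, whereas with uncorrelated predictors (condition ii) it stays at zero and only the second parameter is attenuated.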
To further highlight this point, we consider a latent variable model with two correlated latent predictor variables $\eta_1$ and $\eta_2$ and one latent response variable $\eta_3$. All latent variables are standardized, i.e., they have a mean of zero and unit variance. The structural model is given as follows:

$$\eta_3 = \gamma_1 \eta_1 + \gamma_2 \eta_2 + \zeta. \tag{2}$$

To ensure that the latent response variable has a unit variance, the variance of the structural disturbance term is chosen as $\operatorname{var}(\zeta) = 1 - \gamma_1^2 - 2 \gamma_1 \gamma_2 \phi_{12} - \gamma_2^2$, where the correlation between the two latent predictor variables is denoted by $\phi_{12}$. Considering Equation (1), the structural parameter estimate under PLS-SEM, $\hat{\gamma}_1^{\mathrm{PLS}}$, converges in probability to:

$$\operatorname{plim} \hat{\gamma}_1^{\mathrm{PLS}} = \frac{q_1 q_3 \left[ \gamma_1 \left( 1 - \phi_{12}^2 q_2^2 \right) + \gamma_2 \phi_{12} \left( 1 - q_2^2 \right) \right]}{1 - q_1^2 q_2^2 \phi_{12}^2}. \tag{3}$$

Assuming that the true structural parameter between the two latent variables $\eta_1$ and $\eta_3$ is equal to zero, i.e., $\gamma_1 = 0$, Equation (3) simplifies as follows:

$$\operatorname{plim} \hat{\gamma}_1^{\mathrm{PLS}} = \frac{q_1 q_3 \gamma_2 \phi_{12} \left( 1 - q_2^2 \right)}{1 - q_1^2 q_2^2 \phi_{12}^2}. \tag{4}$$

Considering Equation (4), $\hat{\gamma}_1^{\mathrm{PLS}}$ would be consistent, i.e., $\operatorname{plim} \hat{\gamma}_1^{\mathrm{PLS}} = \gamma_1 = 0$, if $\gamma_2$ were equal to zero or if the latent variables $\eta_1$ and $\eta_2$ were uncorrelated, i.e., $\phi_{12} = 0$. Similarly, $\hat{\gamma}_1^{\mathrm{PLS}}$ would be consistent if the reliabilities of the proxies $\hat{\eta}_1$ or $\hat{\eta}_3$ were equal to zero. This is arguably not the case in most empirical settings. Moreover, a completely unreliable proxy would require that the indicators making up that proxy be uncorrelated with all other indicators in the model. Consequently, neither the ML estimates nor the PLS-SEM estimates would be identified. Therefore, it can be concluded that a structural parameter of zero under CB-SEM does not generally imply a structural parameter of zero under PLS-SEM. In a similar way, Equation (4) shows that we cannot conclude that a structural parameter other than zero under PLS-SEM implies a structural parameter other than zero under CB-SEM. Further, one can directly show that a structural parameter of zero under PLS-SEM does not necessarily imply a structural parameter of zero under CB-SEM. Considering Equation (3), this is illustrated if $\gamma_1 = -\gamma_2 \phi_{12} (1 - q_2^2) / (1 - \phi_{12}^2 q_2^2)$, where it is assumed that $\gamma_2 \neq 0$. In this situation, the structural parameter estimate under PLS-SEM, $\hat{\gamma}_1^{\mathrm{PLS}}$, converges in probability to zero, while the CB-SEM counterpart converges in probability to a value other than zero. Note how this also shows that it is not true that a structural parameter other than zero under CB-SEM implies a structural parameter other than zero under PLS-SEM.

[Table: only the indicator labels x12, x13, x21, x22, x23, x31, x32 and the row label "Structural parameters under PLS-SEM" are recoverable. Note: Values are rounded to the second decimal.]
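For readers who wish to trace the step from Equation (1) to the two-predictor expression, the derivation can be sketched as follows. It assumes that Equation (1) has the form $q_{\mathrm{dep}}(Q \Sigma Q + I - QQ)^{-1} Q \Sigma \gamma$ with $Q = \operatorname{diag}(q_1, q_2)$ and $q_{\mathrm{dep}} = q_3$; the intermediate quantities are written out only for illustration.

```latex
\begin{align*}
Q \Sigma Q + I - QQ
  &= \begin{pmatrix} 1 & q_1 q_2 \phi_{12} \\ q_1 q_2 \phi_{12} & 1 \end{pmatrix},
\qquad
Q \Sigma \gamma
   = \begin{pmatrix} q_1 (\gamma_1 + \phi_{12} \gamma_2) \\
                     q_2 (\phi_{12} \gamma_1 + \gamma_2) \end{pmatrix}, \\[4pt]
\operatorname{plim} \hat{\gamma}_1^{\mathrm{PLS}}
  &= \frac{q_3 \left[ q_1 (\gamma_1 + \phi_{12} \gamma_2)
       - q_1 q_2^2 \phi_{12} (\phi_{12} \gamma_1 + \gamma_2) \right]}
          {1 - q_1^2 q_2^2 \phi_{12}^2} \\[4pt]
  &= \frac{q_1 q_3 \left[ \gamma_1 \left( 1 - \phi_{12}^2 q_2^2 \right)
       + \gamma_2 \phi_{12} \left( 1 - q_2^2 \right) \right]}
          {1 - q_1^2 q_2^2 \phi_{12}^2}.
\end{align*}
```

Setting the bracketed term to zero and solving for $\gamma_1$ gives $\gamma_1 = -\gamma_2 \phi_{12} (1 - q_2^2) / (1 - \phi_{12}^2 q_2^2)$, the condition under which the PLS-SEM probability limit vanishes although the CB-SEM parameter does not.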
To show this issue numerically, we focus on two scenarios illustrated in Tables 1 and 2. Scenario 1 assumes that the structural parameters of $\eta_1$ and $\eta_2$ on $\eta_3$ are 0 and 0.7, respectively. The correlation between the two latent variables $\eta_1$ and $\eta_2$ is assumed to be 0.6. Further, we measure each latent variable $\eta_j$ by three indicators with the factor loadings 0.5, 0.8, and 0.7. For simplicity, it is assumed that the indicators are standardized. To form the PLS-SEM proxies, we use Mode A weights; consequently, the reliability of the three proxies is given as $q_j^2 = (\lambda_j' \lambda_j)^2 / (\lambda_j' \Sigma_{jj} \lambda_j) \approx 0.74$. As the following equation shows, the structural parameter estimate under PLS-SEM, $\hat{\gamma}_1^{\mathrm{PLS}}$, does not converge in probability to zero, i.e., the true value of the structural parameter (which equals the probability limit of the structural parameter estimate under CB-SEM):

$$\operatorname{plim} \hat{\gamma}_1^{\mathrm{PLS}} = \frac{q_1 q_3 \gamma_2 \phi_{12} \left( 1 - q_2^2 \right)}{1 - q_1^2 q_2^2 \phi_{12}^2} \approx \frac{0.74 \cdot 0.7 \cdot 0.6 \cdot (1 - 0.74)}{1 - 0.74^2 \cdot 0.6^2} \approx 0.10. \tag{5}$$

Scenario 2 is very similar to Scenario 1, but here the structural parameters for the latent variables are set to $\gamma_1 = 0.22$ and $\gamma_2 = 0.82$. Additionally, the correlation between the two exogenous latent variables is set to $\phi_{12} = -0.68$. As is clear in Table 2, the structural parameter estimate of $\gamma_1$ under CB-SEM converges in probability to 0.22, while the PLS-SEM counterpart converges to 0.
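The two scenarios can be checked with a few lines of Python. The sketch below assumes the two-predictor probability limit $q_1 q_3 [\gamma_1 (1 - \phi_{12}^2 q_2^2) + \gamma_2 \phi_{12} (1 - q_2^2)] / (1 - q_1^2 q_2^2 \phi_{12}^2)$ and computes the Mode A reliability from the stated loadings under a one-factor model with standardized indicators. The function names are hypothetical; the parameter values are those given for Scenarios 1 and 2.

```python
import numpy as np


def mode_a_reliability(loadings):
    """Mode A proxy reliability: (lambda' lambda)^2 / (lambda' Sigma lambda)."""
    lam = np.asarray(loadings, dtype=float)
    # Implied covariance matrix of standardized indicators of one block
    sigma = np.outer(lam, lam) + np.diag(1.0 - lam**2)
    return (lam @ lam) ** 2 / (lam @ sigma @ lam)


def plim_gamma1_pls(gamma1, gamma2, phi12, rel1, rel2, rel3):
    """Probability limit of the first PLS-SEM path with two predictors.

    rel1, rel2, rel3 are the reliabilities (squared q) of the three proxies.
    """
    numerator = np.sqrt(rel1 * rel3) * (
        gamma1 * (1.0 - rel2 * phi12**2) + gamma2 * phi12 * (1.0 - rel2)
    )
    return numerator / (1.0 - rel1 * rel2 * phi12**2)


rel = mode_a_reliability([0.5, 0.8, 0.7])
scenario_1 = plim_gamma1_pls(0.0, 0.7, 0.6, rel, rel, rel)
scenario_2 = plim_gamma1_pls(0.22, 0.82, -0.68, rel, rel, rel)
print(f"reliability: {rel:.2f}")
print(f"Scenario 1 (true value 0.00): {scenario_1:.2f}")
print(f"Scenario 2 (true value 0.22): {abs(scenario_2):.2f}")
```

Rounded to the second decimal, the probability limit is about 0.10 in Scenario 1 and about 0.00 in Scenario 2, which matches the pattern described in the text.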

[Table: only the row label "Structural parameters under CB-SEM" is recoverable.]
The situation outlined above also shows that Yuan and Deng (2021, p. 562) are mistaken in stating that "statistical test[s] on individual parameters (i.e., $H_0\colon \gamma_{j0} = 0$) under PLS-SEM can achieve the same purpose as that under CB-SEM given the same overall-model structure." According to the discussion in the literature on error-in-variables models, it is well known that the validity of statistical tests regarding single regression coefficients is very limited in the presence of measurement error when the tests are based on OLS estimates (e.g., Carroll et al., 2006, Chapter 10). For instance, Brunner and Austin (2009) studied the behavior of the t test to test the null hypothesis that a single regression coefficient equals zero for a regression model with two independent variables that are both contaminated by random measurement error. In their study, they conclude that researchers relying on the OLS estimator and the t test to test $H_0\colon \gamma_j = 0$ will almost certainly make a type I error when there are additive random measurement errors in the independent variables. Westfall and Yarkoni (2016) reached the same conclusion when they showed that tests on individual parameters are more reliable in CB-SEM than in OLS regression if random measurement errors are present.
The same holds for PLS-SEM when it is used to estimate structural models containing latent variables. This has also been observed in the PLS-SEM literature where, for instance, Goodhue et al. (2017, p. 678) conclude that "[s]pecifically, in both regression and PLS[-SEM], excessive false positives are possible, and the incidence increases with measurement error, with the size of the correlation between predictor constructs and with sample size." Hence, we can conclude that statistical tests on individual parameters generally do not achieve the same purpose under PLS-SEM and CB-SEM.
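A small Monte Carlo sketch can illustrate the inflation of false positives. It is a deliberate simplification and not a PLS-SEM implementation: proxies are generated directly with a fixed reliability instead of being built from estimated weights, the large-sample critical value 1.96 replaces the exact t quantile, and the function name, sample size, number of replications, and seed are arbitrary choices made for this example. The population values mirror Scenario 1 (true parameters 0 and 0.7, predictor correlation 0.6).

```python
import numpy as np


def rejection_rate(reliability, n=1000, reps=500, seed=1):
    """Share of replications in which H0: gamma_1 = 0 is rejected by OLS.

    The true gamma_1 is zero, so every rejection is a false positive.
    """
    rng = np.random.default_rng(seed)
    q = np.sqrt(reliability)
    phi, g1, g2 = 0.6, 0.0, 0.7
    zeta_sd = np.sqrt(1.0 - g1**2 - 2.0 * g1 * g2 * phi - g2**2)
    rejections = 0
    for _ in range(reps):
        # Standardized latent variables with corr(eta1, eta2) = phi
        eta1 = rng.standard_normal(n)
        eta2 = phi * eta1 + np.sqrt(1.0 - phi**2) * rng.standard_normal(n)
        eta3 = g1 * eta1 + g2 * eta2 + zeta_sd * rng.standard_normal(n)
        eta = np.column_stack([eta1, eta2, eta3])
        # Unit-variance proxies: q * eta + error with variance 1 - q^2
        noise = rng.standard_normal((n, 3))
        proxies = q * eta + np.sqrt(1.0 - reliability) * noise
        X = np.column_stack([np.ones(n), proxies[:, :2]])
        y = proxies[:, 2]
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - X.shape[1])
        cov = sigma2 * np.linalg.inv(X.T @ X)
        t_stat = beta[1] / np.sqrt(cov[1, 1])
        rejections += abs(t_stat) > 1.96
    return rejections / reps


with_error = rejection_rate(reliability=0.74)
without_error = rejection_rate(reliability=1.0)
print(f"rejection rate with measurement error:    {with_error:.2f}")
print(f"rejection rate without measurement error: {without_error:.2f}")
```

Under these settings one would expect the rejection rate with error-contaminated proxies to lie far above the nominal 5% level, because the estimate converges to roughly 0.10 rather than 0, whereas the error-free regression should stay close to the nominal level.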

Conclusion
The study of Yuan and Deng (2021) contains various contributions including PLS-SEM Mode A weights transformed into Mode B weights. These transformed weights will enjoy the numerical stability of classical Mode A weights and preserve the asymptotic maximum reliability known from Mode B weights. Moreover, their study contributes to a better understanding of different types of factor scores, including those obtained by PLS-SEM, and their relationships. Although it is not new in the PLS-SEM literature that Mode B weights asymptotically attain maximum reliability and asymptotically produce scores that are univocal (Dijkstra, 1985; Lohmöller, 1989), Yuan and Deng further formalize these properties and link them to other more traditional ways of obtaining factor scores. However, both reliability and univocality of PLS-SEM factor scores have been studied before by means of simulation studies and the published results show that their finite sample behavior is not always as good as the asymptotic results suggest (Rönkkö, 2014; Rönkkö & Ylitalo, 2010). Particularly, in PLS-SEM, the weights depend strongly on the inter-block correlations, i.e., on correlations between indicators that belong to different latent variables. Consequently, in finite samples, particularly in small samples, PLS-SEM factor scores are contaminated by variance of other factors (i.e., univocality is sacrificed; Rönkkö & Ylitalo, 2010, refer to this as bias). Hence, the finite sample behavior of PLS-SEM factor scores more likely resembles the behavior of regression factor scores than that of Bartlett factor scores. Similarly, and as Yuan and Deng (2021) noted, only anecdotal evidence is available regarding the performance of PLS-SEM under misspecified models (Rönkkö et al., in press). Consequently, we need future research to shed more light on these issues.
Besides providing insight on the different types of factor scores, Yuan and Deng (2021) draw a connection between structural parameters under PLS-SEM and CB-SEM. As we have shown in our comment, the connection they make is only true under very special conditions such as structural equations with one independent latent variable or a situation in which all independent latent variables are uncorrelated. Therefore, it is generally not possible to draw conclusions from a CB-SEM structural parameter about the PLS-SEM counterpart and vice versa. Consequently, effect size measures and statistical tests on individual structural parameters can generally not achieve the same purpose under CB-SEM and PLS-SEM. This highlights that bias caused by random measurement errors, i.e., attenuation bias (Cohen et al., 1990), should be taken seriously even if one is only interested in whether a parameter equals zero in the population.
Notably, testing only for the existence of effects is not a recommended research practice. Researchers should additionally pay attention to the magnitude of the estimates (Wasserstein et al., 2019; Wasserstein & Lazar, 2016). Following Yuan and Deng (2021), this would be a fruitless endeavor in the context of latent variable models because the scales of latent variables, and thus the involved parameters, are arbitrary. However, the scale of a latent variable is no more arbitrary than the scale of any other measured variable, such as height. The choice to measure height in centimeters is arbitrary and the parameters of an analysis would be different if another metric (e.g., millimeters, inches, feet) were to be used. Once the metric of a measured variable has been set and fixed, the measured variable and the involved parameters are no longer arbitrary. It is true that any parameter involving a latent variable can be thought of as an algebraic transformation of an unobserved population quantity (Klopp & Klößner, 2021); however, the same applies to any parameter involving a measured variable because of the measurement scales' arbitrariness (Markus & Borsboom, 2013, Chapter 2). Fortunately, there are ways to deal with the arbitrariness of scales. In the case of latent variables studied in the SEM context, various methods have been proposed to choose the metric, such as the fixed marker scaling, the effects coding scaling, and the fixed factor scaling. For an explanation, see, e.g., Little et al. (2006). Similar to choosing the metric of height (e.g., centimeters, inches, feet), choosing a particular scaling method is arbitrary. Once the scaling method has been chosen, the scale of a latent variable is fixed and the arbitrariness of its scale disappears. Consequently, the parameters associated with the latent variables can be interpreted (see Table 4 of Klopp & Klößner, 2021) and estimation error, bias, and consistency can be assessed in a meaningful way.
3 To correct the relationship Yuan and Deng (2021) posit between a structural parameter under PLS-SEM and CB-SEM, PLS-SEM's inconsistency could be addressed by correcting for attenuation as has been done in consistent partial least squares (Dijkstra & Henseler, 2015a). A similar proposal has been made in the context of factor score regression to address the problem of random measurement error comprised in the factor scores (e.g., Devlieger et al., 2016). However, correcting for bias often increases an estimator's variability. To quantify the trade-off between an estimator's bias and its variability, criteria such as mean squared error can be used (Casella & Berger, 2001, Chapter 7). The error-in-variables model literature has shown that it depends on the sample size and the reliability of the independent variables whether the estimator corrected for attenuation bias outperforms the OLS estimator in terms of mean squared error (see, e.g., Fuller, 1987, Table 1.1.1). In the CB-SEM and PLS-SEM context, recent simulation studies suggest that the mean squared error of structural parameters estimated by CB-SEM (or other bias-corrected methods) is usually smaller than the mean squared error of these parameters estimated by methods such as uncorrected factor score regression or PLS-SEM that ignore the presence of measurement error in the scores (see, e.g., Devlieger et al., 2016; Yuan et al., 2020). In other words, the observed bias in methods that ignore random measurement error is often much larger than the added variability that is inherent to methods used in correcting for this bias. Possible exceptions would be in situations where researchers work with a very small sample size.
In the past decade, PLS-SEM has been heavily criticized because it produces inconsistent estimates for structural models that contain latent variables due to attenuation bias (e.g., Cadogan & Lee, in press; Henseler & Schuberth, in press; Rönkkö & Evermann, 2013; Rönkkö et al., 2015; Schuberth, 2021). However, these criticisms have rarely been echoed in PLS-SEM guidelines, which continue to recommend using PLS-SEM for structural models that contain latent variables, including assessment criteria developed under CB-SEM, such as average variance extracted, indicator reliability, and composite reliability (e.g., Hair et al., 2019, 2020, 2021). Against this background, our comment is a further call for future research on PLS-SEM to pay serious attention to random measurement errors.