Using confirmatory composite analysis to assess emergent variables in business research

Confirmatory composite analysis (CCA) was invented by Jörg Henseler and Theo K. Dijkstra in 2014 and elaborated by Schuberth et al. (2018b) as an innovative set of procedures for specifying and assessing composite models. Composite models consist of two or more interrelated constructs, all of which emerge as linear combinations of extant variables, hence the term ‘emergent variables’. In a recent JBR paper, Hair et al. (2020) mistook CCA for the measurement model evaluation step of partial least squares structural equation modeling. In order to clear up potential confusion among JBR readers, the paper at hand explains CCA as it was originally developed, including its key steps: model specification, identification, estimation, and assessment. Moreover, it illustrates the use of CCA by means of an empirical study on business value of information technology. A final discussion aims to help analysts in business research to decide which type of covariance structure analysis to use.


Introduction
Modeling and testing theories that contain abstract concepts is a core part of a large number of studies in business research. The family of statistical tools provided for this purpose, covariance structure analysis (CSA), plays a crucial role in this process. Methodological papers on CSA such as Bagozzi and Yi (1988) or Fornell and Larcker (1981) are among the most cited papers in marketing and related fields. An important task in CSA is the operationalization of these abstract concepts (Sajtos & Magyar, 2016). For a long time, constructs, i.e., the representations of concepts in the statistical model, have been equated with latent variables, which typically are common factors that fully explain the covariation among the observable variables (Borsboom, 2008). The preferred type of CSA to assess this type of latent variable model is confirmatory factor analysis (CFA).
Due largely to the poor test record of the common factor model (Henseler et al., 2014), researchers are looking for alternatives to latent variables. Rigdon (2012, p. 342) observed that "research in statistics and psychometrics challenges the factor-centric worldview," and Rhemtulla, van Bork, and Borsboom (2020) recommended the exploration of opportunities offered by composite methods. An alternative type of construct is found in emergent variables (Cole, Maxwell, Arvey, & Salas, 1993). An emergent variable is a composite of variables of which the correlations with other variables in a model are proportional to one another (Benitez, Henseler, Castillo, & Schuberth, 2020; Dijkstra, 2017). Alternative terms for emergent variables are 'composite constructs' (Benitez, Llorens, & Braojos, 2018), 'aggregate constructs' (Edwards, 2001), and 'formative constructs' (Petter, Straub, & Rai, 2007). Emergent variables have been used much less frequently than latent variables, possibly because of the lack of statistical tools to assess them. Empirical assessment is as important for emergent variables as it is for latent variables. As Rigdon (2012, p. 353) explained, "[r]ejecting the factor model does not mean rejecting rigor, but it does mean defining rigor in composite terms." Analysts who would like to employ emergent variables in their models therefore need a statistical method to assess them. This is where confirmatory composite analysis (CCA) enters the stage.
CCA, invented by Jörg Henseler and Theo K. Dijkstra (see the author note in Henseler et al., 2014), is an innovative set of procedures for specifying and assessing composite models. It is a fully developed method for confirmatory purposes that assesses composite models with the same rigor as CFA does for common factor models (Schuberth, Henseler, & Dijkstra, 2018b). In fact, CCA has been designed analogous to CFA. The only difference between the two is that whereas CFA helps to assess a latent variable structure of observable variables, CCA helps to assess an emergent variable structure. Similar to all types of CSA, CCA examines the discrepancy between the empirical and the model-implied variance-covariance matrix of observable variables, i.e., the model's goodness of fit. As Barrett (2007, p. 823) has emphasized, "model fit testing and assessment is paramount, indeed crucial, and cannot be fudged for the sake of 'convenience' or simple intellectual laziness on the part of the investigator." In a recent JBR paper, Hair, Howard, and Nitzl (2020) claimed to "introduce and explain the process of CCA" (p. 106). However, not only did they forget to mention at the beginning from whom they had borrowed the idea of CCA; they also confused CCA with the measurement model evaluation step in 'partial least squares structural equation modeling' (PLS-SEM), an approach promoted as a "silver bullet" by Hair, Ringle, and Sarstedt (2011), which is "very appealing to many researchers" (Hair, Risher, Sarstedt, & Ringle, 2019, p. 3). Due to this confusion, they made several errors regarding CCA as Schuberth et al. (2018b) proposed it. For instance, according to Hair et al. (2020, p. 108): (1) "CCA should always be considered as a technique when the focus of research is prediction," (2) "CCA can facilitate the assessment of reflective as well as formative measurement models," and (3) "goodness of fit is not a required metric." In all three aspects, Hair et al. (2020) obviously erred.
As its name indicates, CCA as proposed by Schuberth et al. (2018b) is dedicated to confirmation (actually disconfirmation), i.e., to assessing whether a proposed composite model fits the data, and not to predicting outcomes. Also, its focus is on composite models comprising emergent variables and not on reflective or formative measurement models consisting of latent variables. Finally, as a form of CSA, CCA centrally contains the concept of model fit assessment. Evidence of model fit is a necessary but not sufficient condition for a model to be true. For a more elaborate comparison of CCA proposed by Schuberth et al. (2018b) and Hair et al. (2020), we refer to Schuberth (in press).
Against the background given above, the study at hand expounds the confusion regarding CCA introduced by Hair et al. (2020), presents CCA as originally developed, explains its steps, illustrates its use, and discusses when it should be applied. The next section presents CCA as it was first introduced by Henseler et al. (2014) and Schuberth et al. (2018b), i.e., as a type of CSA to specify, estimate, and assess composite models. The following section illustrates how a CCA is performed using a case from the domain of information technology's business value and discusses potential limitations. The final section provides conclusions as well as guidelines for analysts to make well-motivated decisions on which models and methods to employ.

Confirmatory composite analysis
For a long time, CSA, particularly CFA and structural equation modeling (SEM), was mainly equated with latent variable modeling, i.e., abstract concepts are operationalized by means of latent variables. Rigdon (2012) wrote a groundbreaking article in which he questioned the factual monopoly of the common factor model and proposed the so-called concept proxy framework, which provides a fresh perspective on the gap between a concept and its corresponding observable variables. This framework opened the way for employing emergent variables to approximate abstract concepts. Emergent variables are defined by their indicators (Reise, 1999). As linear combinations of other variables, they fulfill the requirement of composite models to act along a single dimension (Dijkstra, 2017). They can be described as "conceptual conveniences- […] phenomena that exist at a higher level than their constituent elements for reasons that people find useful" (Coan, 2010, p. 278). Inspired by Rigdon's (2012) idea and following his postulate for composite-based SEM to show the same rigor as factor-based SEM, the developers of CCA designed their method as a new type of CSA that functions to assess composite models instead of common factor models.
CCA is analogous to CFA, but rather than a common factor model (also known as reflective measurement model), it comprises primarily a composite model. Hence, the abstract concept under investigation is represented by an emergent variable instead of a latent variable in the statistical model (Benitez et al., 2020; Chin, 2010; Cohen, Cohen, Teresi, Marchi, & Velez, 1990; Cole et al., 1993; Reise, 1999). Against this background, CCA is used to assess whether the constraints imposed by the composite model are consistent with the data, i.e., whether the information between two blocks of observable variables is fully conveyed by the emergent variables. Thus, by employing CCA, researchers can empirically assess their postulated theories expressed by statistical models containing emergent variables.
It took more than four years to develop CCA. While the initial idea of CCA was already sketched by Henseler et al. (2014), the scholarly publishing process took more time, until finally, Schuberth et al. (2018b) provided the first full description of the method and demonstrated CCA's efficacy by means of a Monte Carlo simulation. Interim developments on CCA were shared with the scientific community either in written form (Henseler, 2015c, 2017; Henseler, Hubona, & Ray, 2016) or in oral communication (Henseler, 2015a, 2015b, 2015d; Schuberth, Dijkstra, & Henseler, 2018a). A video presentation by Henseler (2015a) is available at http://tv.us.es/videoembed/?numberpost=30277.
As Fig. 1 illustrates, CCA consists of the same four steps as CFA and other forms of CSA: (1) model specification, (2) model identification, (3) model estimation, and (4) model assessment. Each of these steps is presented below in separate subsections.

Specifying composite models
In CCA, each emergent variable $\eta_j$ is assumed to be composed of a unique block of $K_j$ observable variables, $y_j = (y_{j1}, \ldots, y_{jK_j})'$, i.e., $\eta_j = w_j' y_j$. Thus, the composite model satisfies the principle that all information between the blocks of observable variables is conveyed solely by the emergent variables (Dijkstra, 2017). This principle entails that the covariance between two observable variables belonging to different blocks is constrained to be the product of three factors: the two covariances between the observable variables and their respective emergent variable, and the covariance between the two emergent variables. In general, constraints on the variance-covariance matrix of the emergent variables are possible, and the emergent variables can be embedded in a structural model similar to SEM with latent variables (Dijkstra, 2017). That means Dijkstra (2017) envisioned SEM with emergent variables. For simplicity, in what follows we assume that all emergent variables freely covary, i.e., all covariances among the emergent variables are model parameters that are freely estimated.
The weights of block j to form the emergent variable $\eta_j$ are captured in the vector $w_j$ of length $K_j$. Usually, each weight vector is scaled to ensure that the emergent variables have unit variance (see also the conditions for model identification as formulated in Section 2.2). Moreover, it is assumed that each observable variable is connected to only one emergent variable. The model-implied variance-covariance matrix of the observable variables can be expressed as a partitioned matrix, as Eq. 1 shows:

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} & \cdots & \Sigma_{1J} \\ \Sigma_{21} & \Sigma_{22} & \cdots & \Sigma_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ \Sigma_{J1} & \Sigma_{J2} & \cdots & \Sigma_{JJ} \end{pmatrix} \tag{1}$$

The intra-block variance-covariance matrix $\Sigma_{jj}$ of dimension $K_j \times K_j$ is typically unconstrained, and captures the covariation among the observable variables of block j; thus, the observable variables of one block can freely covary. Moreover, it can be shown that the variance-covariance matrix of the observable variables is positive-definite if and only if the following two conditions hold (Dijkstra, 2015, 2017): (i) all intra-block variance-covariance matrices are positive-definite, and (ii) the variance-covariance matrix of the emergent variables is positive-definite.
The covariances between the observable variables of blocks j and l are captured in the inter-block covariance matrix $\Sigma_{jl}$, with $j \neq l$, of dimension $K_j \times K_l$. However, in contrast to the intra-block variance-covariance matrix, the inter-block covariance matrix is constrained, since by assumption, the emergent variables carry all information between the blocks:

$$\Sigma_{jl} = \phi_{jl} \lambda_j \lambda_l', \tag{2}$$

where $\phi_{jl}$ represents the covariance between the emergent variables $\eta_j$ and $\eta_l$. The vector $\lambda_j = \Sigma_{jj} w_j$ of length $K_j$ contains the loadings, which are defined as the covariances between the emergent variable $\eta_j$ and its associated observable variables $y_j$. Eq. 2 is highly reminiscent of the corresponding equation in CFA, in which all concepts are modeled as latent variables instead of emergent variables. In the classical reflective measurement model, the vector $\lambda_j$ captures the factor loadings of the observable variables on their connected latent variable, and $\phi_{jl}$ represents the covariance between latent variables j and l. Hence, both models show the rank-one structure for the covariance matrices between two blocks of observable variables.
Although the intra-block variance-covariance matrices of the observable variables $\Sigma_{jj}$ are typically not constrained, importantly, the composite model is still a model from the CSA point of view. It assumes that all information between the observable variables of two different blocks is conveyed by the emergent variable(s), and therefore, it imposes rank-one restrictions on the inter-block covariance matrices of the observable variables (see Eq. 2). These restrictions can be exploited to assess the overall model fit (see Section 2.4). Notably, the weights $w_j$ producing these matrices are the same across all inter-block covariance matrices $\Sigma_{jl}$ with $l = 1, \ldots, J$ and $l \neq j$. To specify a composite model in CCA, the researcher has to decide on the covariances among the emergent variables and the observable variables forming these emergent variables. Fig. 2 shows an exemplary composite model consisting of three interrelated emergent variables, each of which is composed of three observable variables.
The composite model typically allows the observable variables of each emergent variable to be freely correlated, which is indicated by the double-headed arrows among the observable variables of one block. Similarly, the three emergent variables are allowed to freely correlate, which is highlighted by the double-headed arrows among the three emergent variables. Finally, the correlations between the observable variables of two different emergent variables are fully conveyed by the emergent variables.
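The specification described above can be made concrete with a small numerical sketch. The following Python snippet, which uses only the standard library, builds the inter-block covariance matrix of a two-block composite model from weights, intra-block correlations, and the correlation between the two emergent variables, and then checks the rank-one structure of Eq. 2. All numbers and helper names (`matvec`, `quad_form`, `outer`) are hypothetical and chosen purely for illustration; they are not taken from the empirical study.

```python
# Sketch: inter-block covariance matrix of a two-block composite model.
# All numbers are made up for illustration; they are not from the paper.

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def quad_form(v, m):
    """Compute the quadratic form v' m v."""
    return sum(a * b for a, b in zip(v, matvec(m, v)))

def scale_to_unit_variance(w, sigma_jj):
    """Rescale weights so that the emergent variable w'y has variance one."""
    sd = quad_form(w, sigma_jj) ** 0.5
    return [x / sd for x in w]

def outer(u, v, factor=1.0):
    """Return factor * u v' as a list of rows."""
    return [[factor * a * b for b in v] for a in u]

# Intra-block correlation matrices (left unconstrained by the composite model).
sigma_11 = [[1.0, 0.3], [0.3, 1.0]]
sigma_22 = [[1.0, 0.5], [0.5, 1.0]]

# Scaled weights and the correlation between the two emergent variables.
w1 = scale_to_unit_variance([0.6, 0.4], sigma_11)
w2 = scale_to_unit_variance([0.7, 0.5], sigma_22)
phi_12 = 0.5

# Loadings: covariances between each emergent variable and its indicators.
lambda_1 = matvec(sigma_11, w1)
lambda_2 = matvec(sigma_22, w2)

# Inter-block covariance matrix with the rank-one structure of Eq. 2.
sigma_12 = outer(lambda_1, lambda_2, phi_12)

# A 2 x 2 matrix has rank at most one exactly when its determinant is zero.
det_12 = sigma_12[0][0] * sigma_12[1][1] - sigma_12[0][1] * sigma_12[1][0]

# The weights recover the correlation between the emergent variables.
implied_phi = sum(w1[i] * sigma_12[i][j] * w2[j]
                  for i in range(2) for j in range(2))
print(det_12, implied_phi)
```

Because the weights are scaled to unit variance, applying them to the inter-block matrix returns the correlation between the emergent variables, which is one way to see that the emergent variables carry all information between the blocks.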

Identifying composite models
As in SEM and CFA, model identification plays a preponderant role in CCA. Since analysts can freely specify their models, it must be ensured that the model parameters have a unique solution (Bollen, 1989, Chap. 8). Model identification is also necessary to obtain consistent parameter estimates and to reliably interpret them (Marcoulides & Chin, 2013; Martín & Quintana, 2002). For composite models such as those studied in CCA, at least two conditions must be fulfilled to achieve identification (Dijkstra, 2017; Schuberth et al., 2018b). Each condition is necessary but not sufficient for identification.
First, a necessary condition for ensuring identification is to fix the scale of each emergent variable. This can be done by either fixing one weight per emergent variable or fixing the variance of each emergent variable. Typically, the weights are chosen to ensure that the variance of each emergent variable is one, i.e., $w_j' \Sigma_{jj} w_j = 1$. If this approach is applied, the sign of each weight vector of every block also needs to be determined, because the negative weight vector also leads to unit variance of the emergent variable. This is similar to a reflective measurement model if the variance of the latent variable is fixed as one.
Second, each emergent variable must be connected to at least one other variable not part of this emergent variable. In principle, the other variable can be observable, latent, or emergent. As a result, at least one inter-block covariance matrix $\Sigma_{jl}$, $l = 1, \ldots, J$ with $l \neq j$, satisfies the rank-one condition. In normalizing the weight vectors, all model parameters can be uniquely retrieved from the variance-covariance matrix of observable variables since there is a non-zero inter-block covariance matrix for every weight vector. Otherwise, if an emergent variable $\eta_j$ is isolated, all inter-block covariance matrices $\Sigma_{jl}$ with $l \neq j$ belonging to this emergent variable are of rank zero. Consequently, the weights forming this emergent variable cannot be uniquely retrieved, because an infinite number of weight sets exists for this emergent variable that satisfies the normalization condition.
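The first identification condition, including the sign indeterminacy it leaves open, can be illustrated with a minimal Python sketch. It assumes a single block with a hypothetical correlation matrix and hypothetical weights, scales the weights to unit variance, and fixes the sign so that the loading of a chosen indicator is positive. The function name `normalize_weights` and its argument `dominant` are illustrative choices, not part of any existing software.

```python
# Sketch: fixing the scale and the sign of a weight vector for one block.
# The correlation matrix and the weights are made up for illustration.

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def normalize_weights(w, sigma_jj, dominant=0):
    """Scale w so that w' sigma_jj w = 1 and fix its sign.

    The sign is chosen so that the loading of the indicator at position
    `dominant` (a hypothetical argument name) is positive.
    """
    loadings = matvec(sigma_jj, w)
    variance = sum(a * b for a, b in zip(w, loadings))
    scale = variance ** 0.5
    if loadings[dominant] < 0:
        scale = -scale
    return [x / scale for x in w]

sigma_jj = [[1.0, 0.4, 0.2],
            [0.4, 1.0, 0.3],
            [0.2, 0.3, 1.0]]
w = [0.5, 0.3, 0.4]

# Both w and -w yield unit variance; the sign rule maps them to one solution.
w_pos = normalize_weights(w, sigma_jj)
w_neg = normalize_weights([-x for x in w], sigma_jj)
unit_variance = sum(a * b for a, b in zip(w_pos, matvec(sigma_jj, w_pos)))
print(w_pos, w_neg, unit_variance)
```

Without the sign rule, the two starting vectors would lead to two different but equally valid normalized weight vectors, which is exactly the indeterminacy that has to be removed for identification.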

Estimating composite models
The existing literature provides various methods to construct emergent variables from blocks of observable variables. The most common among them are principal component analysis (PCA, Pearson, 1901), linear discriminant analysis (LDA, Fisher, 1936), generalized canonical correlation analysis (GCCA, Kettenring, 1971), and the iterative partial least squares algorithm (PLS, Wold, 1973). All these approaches seek emergent variables that 'best' explain the data and can be regarded as prescriptions for dimension reduction (Dijkstra & Henseler, 2011).
In their original article on CCA, Schuberth et al. (2018b) used MAXVAR, an approach to GCCA, to consistently estimate composite models. Although it is suggestive to apply a composite-based estimator such as the iterative PLS algorithm or generalized structured component analysis (GSCA, Hwang & Takane, 2004), there is no obvious reason not to use other estimators, such as maximum likelihood (ML) or generalized least squares. The decision to use a certain estimator should be based on the estimator's statistical properties and the validity of the assumptions under which these properties were derived, such as independent and identically distributed (i.i.d.) random variables. Typically, unbiased and/or consistent estimators are favored over biased and/or inconsistent ones. Purely practical aspects such as computation time and convergence behavior tend to play a minor role in the selection of estimators. Finally, although not further discussed here, Bayesian estimation is also conceivable (see, e.g., Choi & Hwang, 2020; Vidaurre, van Gerven, Bielza, Larrañaga, & Heskes, 2013).
In cases where researchers face a high degree of multicollinearity, it can be beneficial to avoid consistent estimators and choose an estimator that is robust against multicollinearity, but inconsistent. This also applies to the iterative PLS algorithm, where the weights obtained by PLS Mode B are consistent but affected by multicollinearity, while the weights obtained by PLS Mode A are usually inconsistent but unaffected by high correlations among the observable variables. In such a situation, employing Mode A could be advantageous even if the obtained weight estimates are generally not expected to be consistent. An analogous approach is customary in the context of regression analysis where ridge regression is employed in situations with a high multicollinearity among the independent variables (Mason & Brown, 1975).
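The contrast between the two modes can be illustrated with a single outer weight update for one block. The sketch below is not the full iterative PLS algorithm: it assumes that the covariances between two indicators and the inner proxy of their construct are already given (hypothetical numbers), takes Mode A weights as proportional to these covariances, and takes Mode B weights as the regression coefficients of the proxy on the indicators. The helper `solve_2x2` is an illustrative name.

```python
# Sketch: one outer weight update for a block of two collinear indicators.
# Numbers are made up; this is not the complete iterative PLS algorithm.

def solve_2x2(m, c):
    """Solve m x = c for a 2 x 2 matrix m by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(c[0] * m[1][1] - m[0][1] * c[1]) / det,
            (m[0][0] * c[1] - m[1][0] * c[0]) / det]

# Correlation between the two indicators (high collinearity).
sigma_jj = [[1.0, 0.9],
            [0.9, 1.0]]

# Hypothetical covariances between the indicators and the inner proxy.
cov_with_proxy = [0.6, 0.5]

# Mode A: weights proportional to the covariances with the proxy.
mode_a = list(cov_with_proxy)

# Mode B: regression coefficients of the proxy on the indicators.
mode_b = solve_2x2(sigma_jj, cov_with_proxy)

print("Mode A (unscaled):", mode_a)
print("Mode B (unscaled):", mode_b)
```

In this constructed case both indicators correlate positively with the proxy, yet the second Mode B weight turns negative because of the collinearity, whereas the Mode A weights keep the sign of the covariances.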

Assessing composite models
Model assessment is a pivotal step in CCA. In principle, it does not differ from assessing structural equation models generally. In CCA, it consists of the following two steps: (i) overall model fit assessment, and (ii) inspection of each emergent variable separately.
As in SEM and CFA, overall model fit assessment is crucial in CCA (Mulaik et al., 1989; Schuberth et al., 2018b; Yuan, 2005). At the heart of overall model fit assessment is the discrepancy between the empirical and the model-implied variance-covariance matrix of the observable variables. This discrepancy can be exploited to obtain empirical evidence against the specified model, i.e., we examine whether the constraints imposed by the model are justifiable. Note that "[i]f a model is consistent with reality, then the data should be consistent with the model. But if the data are consistent with the model, this does not imply that the model corresponds to reality" (Bollen, 1989, p. 68). Schuberth et al. (2018b) proposed two nonexclusive ways to assess composite models: (i) a bootstrap-based test for exact overall model fit (Beran & Srivastava, 1985) and (ii) fit indices. The goal of the bootstrap-based test for exact overall model fit is to assess the null hypothesis that the model-implied variance-covariance matrix based on the population parameters equals the population variance-covariance matrix of the observable variables: $H_0: \Sigma(\theta) = \Sigma$. The assessment relies on the bootstrap to obtain the reference distribution of the discrepancy between the empirical variance-covariance matrix of the observable variables and their estimated model-implied counterpart under the null hypothesis of exact model fit. Hence, the bootstrap-based test is nonparametric in nature and functions asymptotically well under rather mild assumptions such as i.i.d. random variables (Beran & Srivastava, 1985). To measure this discrepancy, various metrics can be employed. In the context of CCA, the standardized root mean square residual (SRMR), the squared Euclidean distance, and the geodesic distance have been proposed (Schuberth et al., 2018b).
To overcome the critiques of the test's stringency for exact overall model fit, fit indices such as the SRMR (Hu & Bentler, 1998) or the root mean square error of approximation (RMSEA, see Browne & Cudeck, 1993) have been proposed. [5] Instead of defining exact fit as the desirable objective, fit indices quantify the discrepancy between the model and the data along a continuous scale to examine how well the estimated model fits the collected data. To judge whether a model shows an acceptable fit, the value of a fit index is typically compared to threshold values derived by simulation studies.
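Two of the discrepancy measures and the decision rule of the bootstrap-based test can be sketched in a few lines of Python. The snippet assumes one common scaling of the squared Euclidean distance (half the sum of squared differences) and the SRMR computed over the lower triangle including the diagonal; other conventions exist. It does not implement the data transformation that generates the reference distribution under the null hypothesis; the bootstrap values below are placeholders, and all numbers and function names are hypothetical.

```python
# Sketch: discrepancy measures and the decision rule of the bootstrap test.
# Matrices and bootstrap values are made up for illustration only.
import statistics

def squared_euclidean_distance(s, sigma_hat):
    """d_L: half the sum of squared element-wise differences."""
    return 0.5 * sum((a - b) ** 2
                     for row_s, row_m in zip(s, sigma_hat)
                     for a, b in zip(row_s, row_m))

def srmr(s, sigma_hat):
    """SRMR over the lower triangle, including the diagonal."""
    k = len(s)
    total = 0.0
    for i in range(k):
        for j in range(i + 1):
            resid = (s[i][j] - sigma_hat[i][j]) / (s[i][i] * s[j][j]) ** 0.5
            total += resid ** 2
    return (total / (k * (k + 1) / 2)) ** 0.5

def critical_value(bootstrap_values):
    """95% quantile of the bootstrap reference distribution."""
    return statistics.quantiles(bootstrap_values, n=100, method="inclusive")[94]

def exact_fit_rejected(observed, bootstrap_values):
    """Reject exact fit if the observed discrepancy exceeds the 95% quantile."""
    return observed > critical_value(bootstrap_values)

# Empirical and model-implied correlation matrices (hypothetical).
s = [[1.0, 0.5], [0.5, 1.0]]
sigma_hat = [[1.0, 0.3], [0.3, 1.0]]
d_l = squared_euclidean_distance(s, sigma_hat)
fit_srmr = srmr(s, sigma_hat)

# Placeholder reference distribution: the values 1, 2, ..., 100.
reference = list(range(1, 101))
print(d_l, fit_srmr, exact_fit_rejected(90.0, reference))
```

The geodesic distance is omitted here because it requires an eigenvalue decomposition, which is more conveniently done with a numerical library.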
In empirical research, scientists are encouraged to examine alternative explanations for a phenomenon under investigation (Nuzzo, 2015). In the context of SEM, this inevitably leads to situations where researchers face various alternative models that are theoretically plausible, i.e., guided by existing theory. To cope with this issue, considering model selection criteria can be helpful to choose the "optimal" model among alternative models. In this context, 'optimal' refers to the trade-off between model fit and model parsimony (Huang, 2017). The most prominent model selection criteria are arguably Akaike's information criterion (AIC; Akaike, 1998) and the Bayesian information criterion (BIC; Schwarz, 1978); however, various extensions such as the consistent AIC (Bozdogan, 1987) have been developed. For an overview of model selection criteria, we refer to McQuarrie and Tsai (1998) and West, Taylor, and Wu (2012). Notably, model selection criteria such as the BIC and the AIC are based on model parameter estimates derived from the collected sample. Hence, similar to parameter estimates, model selection criteria are also subject to sampling variation (known as model selection uncertainty, see, e.g., Burnham & Anderson, 2002), which could result in different optimal models if a new sample from the same population is considered. To obtain more certainty on the optimality of a chosen model, replication of prior research is recommended (Preacher & Merkle, 2012).
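The trade-off between fit and parsimony can be shown with the textbook definitions of the AIC and the BIC. The sketch below assumes that a maximized log-likelihood is available for each candidate model, which presupposes a likelihood-based estimator; the log-likelihoods, parameter counts, and model labels are invented for illustration.

```python
# Sketch: comparing two candidate models by AIC and BIC.
# Log-likelihoods and parameter counts are made up for illustration.
import math

def aic(log_likelihood, n_params):
    """Akaike's information criterion: -2 logL + 2 k."""
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 logL + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

n_obs = 100
candidates = {
    "model_a": {"log_likelihood": -1200.0, "n_params": 30},
    "model_b": {"log_likelihood": -1195.0, "n_params": 38},
}

aic_values = {name: aic(m["log_likelihood"], m["n_params"])
              for name, m in candidates.items()}
bic_values = {name: bic(m["log_likelihood"], m["n_params"], n_obs)
              for name, m in candidates.items()}

# Smaller values indicate a better balance of fit and parsimony.
best_by_aic = min(aic_values, key=aic_values.get)
best_by_bic = min(bic_values, key=bic_values.get)
print(aic_values, bic_values, best_by_aic, best_by_bic)
```

In this constructed case the less parsimonious model fits slightly better, but the penalty terms of both criteria favor the model with fewer parameters.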
Once a researcher has decided on a certain model, he/she typically continues to assess each emergent variable and its relation to its observable variables. In doing so, the weight estimates, loading estimates, and their significance are examined and compared to the expectations implied by a researcher's theory. Moreover, if weight estimates potentially suffer from multicollinearity, e.g., if PLS using Mode B is employed, multicollinearity needs to be assessed as well. Finally, the researcher needs to investigate whether the correlations among the emergent variables are in line with his/her expectations. For a more detailed description on how to assess emergent variables locally in the context of PLS path modeling, we refer to Benitez et al. (2020) and Henseler et al. (2016).
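One customary way to assess multicollinearity within a block is the variance inflation factor (VIF), which for standardized indicators equals the corresponding diagonal element of the inverse of their correlation matrix. The following Python sketch computes VIFs with a small Gauss-Jordan inversion; the correlation matrix is hypothetical and the function names are illustrative.

```python
# Sketch: variance inflation factors from a block's correlation matrix.
# The correlation matrix is made up for illustration.

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def variance_inflation_factors(corr):
    """VIF of each indicator: diagonal of the inverse correlation matrix."""
    inverse = invert(corr)
    return [inverse[k][k] for k in range(len(corr))]

block_corr = [[1.0, 0.7, 0.6],
              [0.7, 1.0, 0.5],
              [0.6, 0.5, 1.0]]
vifs = variance_inflation_factors(block_corr)
print(vifs)
```

For two indicators the result reduces to 1 / (1 - r^2), where r is their correlation, which offers a quick plausibility check.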

A demonstration of CCA
To illustrate the application of CCA, we refer to aspects of a CCA that Benitez, Ray, and Henseler (2018) conducted and reported. [6] Their study in the domain of information technology's (IT's) business value investigated whether IT infrastructure flexibility has an impact on the success or failure of mergers and acquisitions. It followed a two-step approach which we also know from the SEM literature (e.g., Anderson & Gerbing, 1988). However, in contrast to SEM with latent variables, the first step conducts CCA, and subsequently, in the second step, the emergent variables are embedded in a structural model.
To collect the data, midsize firms in Spain were surveyed, using questionnaire items measured on a 5-point scale, to which the researchers received 100 valid responses. For the sake of ease and in line with Rhemtulla, Brosseau-Liard, and Savalei (2012) (see also Schuberth, Henseler, & Dijkstra, 2018c), we treated the variables as continuous. Table 1 lists the 16 observable variables and gives the wording of the respective questionnaire items. Further, it shows the concepts that are intended to be constructed based on the corresponding blocks of observable variables. For more details on the data collection and the concepts, we refer the reader to the original study.
Based on the theory they derived, Benitez, Ray, et al. (2018) specified the composite model as depicted in Fig. 3. [7] As illustrated, it assumed that each concept is composed of four observable variables. Additionally, all constructs were allowed to correlate freely, i.e., no constraints were imposed on the correlation matrix of the emergent variables. Finally, a CCA was conducted to examine whether their specified model was consistent with the collected data.
Considering the identification of the specified model, we note that no emergent variable was isolated in the model. Moreover, the weights were scaled to obtain emergent variables with unit variance. To ensure that the signs of the weights were uniquely determined, the first observable variable of each emergent variable, namely COMP1, CONN1, MOD1, and PSF1, was used as the dominant indicator, i.e., the weights were chosen in a way that ensured a positive correlation between these observable variables and their respective emergent variables. Consequently, the model parameters can, in theory, be uniquely retrieved from the variance-covariance matrix of the observable variables.
To estimate the specified model, we followed the original study of Benitez, Ray, et al. (2018) and applied the iterative PLS algorithm using Mode A to obtain the weight and the construct correlation estimates. Mode A was deliberately chosen for the estimation because of the degree of collinearity among the indicators and the relatively small sample size. [8] Additionally, we used the factorial scheme to calculate the inner weights in PLS. The whole analysis was conducted in the statistical programming environment R (R Core Team, 2020) using the cSEM package (Rademaker & Schuberth, 2020). We report the obtained weight estimates and the correlations between the observable variables and their construct, i.e., the loadings, in Table 2. Since there are no closed-form standard errors of the PLS parameter estimates, the statistical inference is based on the bootstrap, using 95% percentile confidence intervals based on 999 bootstrap runs. Moreover, the correlations among the emergent variables range from 0.39 to 0.59, and none of their 95% percentile confidence intervals covers zero.
Assessing the composite model involves two steps, i.e., the assessment of the overall model fit and the assessment of each emergent variable separately. To assess the overall model fit, we used the bootstrap-based test for exact overall model fit. The results displayed in Table 3 show that the values of the discrepancy measures, i.e., geodesic distance (d_G), SRMR, and squared Euclidean distance (d_L), are below the corresponding critical value, namely the 95% quantile of the corresponding reference distribution. The results lead us to conclude that the specified model adequately fits the collected data, i.e., the proposed model captures the available information in the data acceptably. [9]
As a second step of model assessment, each emergent variable is considered separately, i.e., we assessed the model locally. In line with the expectations of Benitez, Ray, et al. (2018) and as shown in Table 2, all observable variables significantly contribute to their emergent variable, i.e., the 95% percentile confidence intervals of the estimated weights do not cover zero. Similarly, all correlations between the observable variables and their construct, as well as the correlations between the emergent variables, are positive and significantly different from zero, i.e., the 95% percentile confidence intervals do not cover zero.

Table 1. Items used in the CCA. Taken from Benitez, Ray, et al. (2018, p. A2).
Concept | Variable name | Item wording
IT compatibility | COMP1 | Software applications can be easily transported and used across multiple platforms.
IT compatibility | COMP2 | Our firm provides multiple interfaces or entry points (e.g., web access) for external end users.
IT compatibility | COMP3 | Our firm establishes corporate rules and standards for hardware and operating systems to ensure platform compatibility.
IT compatibility | COMP4 | Data captured in one part of our organization are immediately available to everyone in the firm.
IT connectivity | CONN1 | Our organization has electronic links and connections throughout the entire firm.
IT connectivity | CONN2 | Our firm is linked to business partners through electronic channels (e.g., websites, e-mail, wireless devices, electronic data interchange).
IT connectivity | CONN3 | All remote, branch, and mobile offices are connected to the central office.
IT connectivity | CONN4 | There are very few identifiable communications bottlenecks within our firm.
Modularity | MOD1 | Our firm possesses a great speed in developing new business applications or modifying existing applications.
Modularity | MOD2 | Our corporate database is able to communicate in several different protocols.
Modularity | MOD3 | Reusable software modules are widely used in new systems development.
Modularity | MOD4 | IT personnel use object-oriented and prepackaged modular tools to create software applications.
IT personnel skills flexibility | PSF1 | Our IT personnel have the ability to work effectively in cross-functional teams.
IT personnel skills flexibility | PSF2 | Our IT personnel are able to interpret business problems and develop appropriate technical solutions.
IT personnel skills flexibility | PSF3 | Our IT personnel are self-directed and proactive.
IT personnel skills flexibility | PSF4 | Our IT personnel are knowledgeable about the key success factors in our firm.

[5] For an overview of various fit indices, we refer to Schermelleh-Engel, Moosbrugger, and Müller (2003). However, it is noted that research investigating the performance of fit indices in the context of composite models is scarce.
[6] We thank José Benitez and Gautam Ray for providing their data for the purpose of this demonstration.
[7] It is noted that the covariances among the indicators of each block are concealed to preserve clarity.
[8] If Mode B was used, the variance inflation factors would range from 1.73 to 3.08 and some of the obtained weights would show negative signs and not differ significantly from 0, although they are expected to be positive and significant.
Considering the results of the model assessment, we found no empirical evidence against the specified model, and thus the postulated theory of Benitez, Ray, et al. (2018) cannot be falsified. Notably, this lack of disconfirmation does not automatically imply a confirmation of the theory. As with all empirical studies, replicating the study is crucial to obtain more confidence in the model.
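The percentile confidence intervals used for inference in the demonstration follow a simple recipe: resample the observations with replacement, re-estimate the quantity of interest in each run, and take the 2.5% and 97.5% quantiles of the resulting estimates. The Python sketch below applies this recipe to a plain correlation between two simulated variables rather than to PLS estimates; the data, the seed, and the function names are assumptions made for illustration and do not reproduce the reported results.

```python
# Sketch: 95% percentile bootstrap confidence interval for a correlation.
# Data are simulated; this does not reproduce the results of the study.
import random
import statistics

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def percentile_ci(estimates, level=0.95):
    """Lower and upper percentile bounds of the bootstrap estimates."""
    cuts = statistics.quantiles(estimates, n=1000, method="inclusive")
    lower_idx = round((1 - level) / 2 * 1000) - 1
    upper_idx = round((1 + level) / 2 * 1000) - 1
    return cuts[lower_idx], cuts[upper_idx]

def bootstrap_correlation_ci(x, y, runs=999, seed=1):
    """Resample cases with replacement and collect the correlations."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(runs):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(pearson([x[i] for i in idx], [y[i] for i in idx]))
    return percentile_ci(estimates)

# Simulated data with a strong positive association.
data_rng = random.Random(42)
x = [data_rng.gauss(0.0, 1.0) for _ in range(50)]
y = [a + 0.5 * data_rng.gauss(0.0, 1.0) for a in x]

lower, upper = bootstrap_correlation_ci(x, y)
covers_zero = lower <= 0.0 <= upper
print(lower, upper, covers_zero)
```

An estimate is regarded as significantly different from zero at the 5% level when the interval does not cover zero, which mirrors the criterion applied to the weights, loadings, and construct correlations above.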

Discussion
The empirical example given above serves as a showcase to illustrate how a statistical method, namely CCA, is applied. Hence, it should be clear that an empirical study generally involves much more than simply undertaking the CCA steps presented in this paper. For example, as is common in quantitative research, researchers need to develop a theory and derive testable hypotheses prior to the data analysis. Moreover, researchers must decide how to collect the data for their studies and address associated issues such as common method variance (Podsakoff, MacKenzie, Lee, & Podsakoff, 2003). Although several statistical procedures to assess common method variance have been proposed in the context of common factor models (see, e.g., Williams, Hartman, & Cavazotte, 2010; Fuller, Simmering, Atinc, Atinc, & Babin, 2016), it is currently not clear how to address common method variance in the context of composite models. In general, researchers are advised to take various forms of uncertainty into account (Rigdon, Sarstedt, & Becker, 2020). Further, researchers should investigate a priori whether the collected sample is large enough to ensure that the employed statistical tests have sufficient power, e.g., by conducting Monte Carlo simulations (see, e.g., Wolf, Harrington, Clark, & Miller, 2013). However, in the context of composite models, to date no guidelines have been proposed for setting up such a simulation. Finally, researchers need to assess whether the employed statistical methods and their associated assumptions match the theory's implications and the characteristics of the collected data.
Properties of statistical methods, including estimators and statistical tests, are derived under assumptions that are often not met in empirical research, such as linearity or i.i.d. random variables. Moreover, properties such as consistency and convergence in distribution often refer to asymptotic behavior, i.e., behavior as the sample size tends to infinity, which is obviously never the case in empirical research. Hence, it is important to be aware of which assumptions are met and what limitations they imply for empirical research. It is also important to know what the consequences are, and how serious they are, if these assumptions are violated.
This highlights the importance of methodological research that explores the limitations of statistical methods, such as the behavior of consistent estimators in finite samples, and develops new methods that relax the original assumptions and take issues encountered in empirical research into account. As CCA has been introduced only recently, many of the methodological enhancements proposed in the context of SEM are not available yet. Examples are equivalence testing (Yuan, Chan, Marcoulides, & Bentler, 2015), which endorses the model under the null hypothesis rather than rejecting it, and the test of close fit based on the SRMR (Maydeu-Olivares, Shi, & Rosseel, 2017), which works well even for smaller sample sizes. Similarly, the current literature lacks proper guidelines on how to deal with sources of model misfit in the context of composite models. However, as CCA is still in its infancy, we are convinced that, as an analytic technique, it will experience enhancements similar to those of CFA and SEM, and thus overcome the limitations currently faced.

indicates that the data comprises more information than captured by the specified model (Jöreskog, 1969). As a consequence, researchers should investigate potential sources of the misfit. In doing so, they can follow existing guidelines known from SEM, e.g., Kline (2015, Chapter 12), and inspect the residual variance-covariance matrix, i.e., the difference between the sample variance-covariance matrix and its model-implied counterpart. Moreover, they can consult fit indices to investigate whether the approximate fit of the model is still acceptable. As CCA was developed only recently, it is up to future research to propose more sophisticated strategies to identify sources of misfit in composite models, e.g., the evaluation of local fit, which has recently been proposed in the context of latent variable models (Thoemmes, Rosseel, & Textor, 2018).
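To make the inspection of residuals concrete, the sketch below computes the residual variance-covariance matrix and one common variant of the SRMR. It is an illustration under stated assumptions: the two 3 × 3 matrices are invented and do not stem from the empirical example, the function names are hypothetical, and the SRMR variant shown standardizes the residuals by the sample standard deviations and averages over the lower triangle including the diagonal. Software packages may implement slightly different variants.

```python
import numpy as np

def residual_matrix(sample_cov, implied_cov):
    """Difference between the sample and the model-implied covariance matrix."""
    return sample_cov - implied_cov

def srmr(sample_cov, implied_cov):
    """Standardized root mean square residual (one common variant).

    Residuals are divided by the product of the sample standard deviations
    and averaged over the non-redundant (lower-triangular) elements.
    """
    sd = np.sqrt(np.diag(sample_cov))
    standardized = (sample_cov - implied_cov) / np.outer(sd, sd)
    rows, cols = np.tril_indices(sample_cov.shape[0])
    return np.sqrt(np.mean(standardized[rows, cols] ** 2))

# Invented example matrices (assumption): the model reproduces every element
# except the covariance between the first and the second variable.
S = np.array([[1.00, 0.50, 0.30],
              [0.50, 1.00, 0.40],
              [0.30, 0.40, 1.00]])
Sigma = np.array([[1.00, 0.45, 0.30],
                  [0.45, 1.00, 0.40],
                  [0.30, 0.40, 1.00]])

residuals = residual_matrix(S, Sigma)
srmr_value = srmr(S, Sigma)

# Position of the largest absolute residual, a natural starting point
# when searching for sources of misfit.
largest = np.unravel_index(np.argmax(np.abs(residuals)), residuals.shape)
print(residuals)
print(f"SRMR = {srmr_value:.4f}; largest residual at position {largest}")
```

Large individual residuals point to pairs of observable variables whose covariance the specified model fails to reproduce, whereas the SRMR summarizes the overall size of the discrepancy in a single number.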

Conclusion
Constructs are the core building blocks of statistical models in CSA. Two types of constructs can be distinguished: latent variables and emergent variables. Latent variables are the common cause underlying a set of indicators. They are particularly useful to model theoretical concepts of behavioral research, such as traits and attitudes. Typically, they are extracted as common factors. The preferred method to empirically assess them is CFA. In contrast, emergent variables are typically defined by a set of observable variables, i.e., they are composites of other variables, which are particularly useful to model human-made artifacts. For a long time, there was no method to empirically assess them.
In 2014, Jörg Henseler and Theo K. Dijkstra developed and proposed a statistical method that assesses the goodness of fit of models of interrelated composites, and, making use of the inventor's privilege to name the invention, they labeled it 'confirmatory composite analysis' (Henseler et al., 2014, see also the authors' note in that paper). The full presentation of the method appeared in Schuberth et al. (2018b), and it is now being introduced to business research by means of the current paper, which explains all of CCA's main steps. Not only does the name CCA emphasize the analogy to CFA, with which CCA shares everything except the specified model, but it also clearly designates what CCA actually does: it enables the analysis of composite models, and it is confirmatory in nature because it provides evidence about whether the collected data is consistent with a researcher's specified composite model. Thus, the name explicitly articulates a central aspect of the meaning of this analytic approach.
In their recent JBR paper, Hair et al. (2020) used the term 'confirmatory composite analysis' for something else: the measurement model evaluation step of PLS-SEM. In doing so, they appear to associate CCA with nothing more than a body of existing rules of thumb and cookbook procedures, which have for some time been associated with PLS path modeling, in particular. Briefly, according to Hair et al. (2020), CCA is little more than a new name for an old recipe. 10 This is unfortunate, because it could unnecessarily confuse business researchers.
In statistical modeling, it has become good practice to distinguish between the statistical model and its unknown parameters on the one hand, and the employed estimator as a function of the data on the other hand (Lehmann & Casella, 2003, p. 4). Researchers are thus confronted with two problems in decision making. First, they need to specify the statistical model, which includes for example the decision about the number of free and fixed parameters. Second, they must choose an appropriate estimator to obtain estimates for the free model parameters. Preferably, estimators should have certain desired (asymptotic) properties, such as being unbiased, consistent, and efficient. 11 In order to establish such properties, it is necessary that certain assumptions, which are estimator specific, are fulfilled. If these assumptions are violated, an estimator loses its properties. Consequently, its estimates turn out to be biased and inconsistent, as do entities that rely on the estimates, such as their standard errors. Moreover, statistical inference leads to wrong conclusions, which renders a study's findings questionable. In particular, an estimator must suit the statistical model. For instance, it is well known that using estimators assuming a composite model such as PLS and GSCA to estimate the parameters of common factor models/reflective measurement models leads to biased estimates (Dijkstra, 1983; Dijkstra & Henseler, 2015; Hwang, Takane, & Jung, 2017), which can inflate Type-I and Type-II errors (Goodhue, Lewis, & Thompson, 2017; Henseler, 2012). The opposite constellation is equally problematic: using consistent PLS (PLSc, Dijkstra & Henseler, 2015), which assumes a common factor model, to estimate the parameters of a composite model also leads to biased estimates (Sarstedt, Hair, Ringle, Thiele, & Gudergan, 2016).
In the light of this situation, our paper offers two pieces of advice. First, scholars should not confuse Hair et al.'s recommendations with the CCA defined in the literature. CCA is not the measurement model evaluation step of PLS-SEM, but an innovative set of procedures for specifying and assessing composite models. Second, analysts should always assess a model with a method that has been designed for precisely such a model. Concretely, CFA is the preferred method to assess common factor models (reflective measurement models), and CCA is the method of choice to assess composite models.