MODERATION ANALYSIS: ISSUES AND GUIDELINES

This editorial is dedicated to moderation analysis. Similar to what we did with the earlier editorial about mediation analysis, this editorial addresses seven key issues related to moderation and provides guidelines to justify the inclusion of moderator(s) and perform the analysis. Specifically, it discusses identification, conceptualization, usage, analysis, and reporting of moderating variables. Additionally, it also explains several approaches pertaining to moderation analysis and highlights the key differences between a simple moderation analysis and a multi-group analysis. We hope that this editorial will be useful to academics and research students to conduct moderation analysis with rigor.


INTRODUCTION
Our earlier editorials on methodological misconceptions (Memon, Ting, Ramayah, Chuah, & Cheah, 2017) and mediation analysis were well received. Since then, we have received numerous positive responses and feedback from the research community in Malaysia and abroad. Both articles were downloaded more than a thousand times from ResearchGate in a matter of weeks. Subsequently, we received requests for a similar contribution on moderation analysis. These requests became the impetus for this editorial, and we are glad that it has come to pass.
The moderating variable is at the heart of theory in business and social science (Andersson, Cuervo-Cazurra, & Nielsen, 2014; Cohen, Cohen, West, & Aiken, 2003). Its use signals the maturity and sophistication of a field of inquiry (Aguinis, Boik, & Pierce, 2001; Frazier, Tix, & Barron, 2004). A moderating variable refers to a variable that "influences the nature (e.g., magnitude and/or direction) of the effect of an antecedent on an outcome" (Aguinis, Edwards, & Bradley, 2017, p. 2). In statistical terms, moderation occurs when the relationship between an independent variable and a dependent variable changes according to the value of a moderator variable (Dawson, 2014). Additionally, moderating variables are essential for assessing whether two variables have the same relation across groups. On the whole, a moderating model addresses "when" or "for whom" a variable most strongly explains or causes an outcome variable (Frazier et al., 2004).
Motivated by the urge to contribute more to the research community, we use this editorial to address seven key issues related to moderation and to provide guidelines, with reference to existing literature and our own experience, for performing moderation analysis. These seven fundamental issues are: (1) how to identify potential moderators, (2) the difference between simple moderation analysis and multi-group analysis (MGA), (3) when to use a moderating variable, (4) how to conceptualize/hypothesize a moderating relationship, (5) approaches for moderation analysis, (6) pre-analysis guidelines, and (7) analyzing and reporting moderation effects. It is worth mentioning that we, like many others, are the beneficiaries of the gurus and experts in methods and statistical analysis. Hence, we hope this editorial will serve as a complementary reference for academics and research students who wish to perform moderation analysis rigorously and appropriately.

AN EXEMPLAR
Beginning with an exemplar of moderation analysis, we use a conceptual model (Figure 1) consisting of a dependent variable (Y), an independent variable (X), and a moderator (M). The moderating variable is connected to the dependent and independent variables by an arrow which points to the relationship between X and Y. The statistical model, however, differs from this graphical conceptualization because it includes an interaction term, the product X*M, denoted Z. Figure 2 shows the statistical model for moderation, in which the interaction term (Z) points to the dependent variable. A moderator can take different forms. It is a categorical variable when a nominal or ordinal scale is used (e.g., male and female; public universities and private universities) and a continuous variable when an interval scale is used (e.g., high level and low level of skepticism; high level and low level of organizational support). Discrete data is often treated as a categorical variable in statistical analysis. Note that it is incorrect to claim that moderation analysis only involves variables with categorical data.
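The statistical model described above can be written as a moderated regression equation. The following is a minimal sketch in standard notation; the coefficient symbols b_0 to b_3 and the error term are notational assumptions, not symbols used elsewhere in this editorial.

```latex
% Moderated regression: Z is the interaction term X * M
Y = b_0 + b_1 X + b_2 M + b_3 Z + \varepsilon, \qquad Z = X \times M

% The effect of X on Y is no longer a single number; it depends on M
\frac{\partial Y}{\partial X} = b_1 + b_3 M
```

Under this formulation, a moderating effect is present when b_3 differs from zero, because only then does the slope of Y on X change with the value of M.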

How to Identify Potential Moderators
The choice of moderators should be based on theoretical grounds with considerable literature support (Frazier et al., 2004) rather than driven by ambition or a requirement to make the study complex. A careful review of the relevant articles published in reputable journals, especially the sections about limitations and future directions of the study, can be a good starting point for identifying potential moderators. Inconsistent findings in past studies about the effect of the same antecedent (independent variable) on the outcome can also make a strong case for testing a moderator. Hence, systematic literature review and meta-analysis are usually utilized to achieve this purpose. Moreover, the use of a contextual factor from a different field with a constructive theoretical explanation (e.g., using generations from sociology in a marketing study) provides a strong basis for incorporating the said factor into the study as a moderator. Such investigation and subsequent findings mark a substantial contribution to the existing body of knowledge.
Furthermore, discussion with experts in the same field and key informants in the relevant industry can be another useful technique to brainstorm and identify potential moderators. Notably, the appropriate selection of experts and key informants is critically important. Similarly, attending the right conferences is pivotal to discussing with and learning from informed experts and delegates. Additionally, a qualitative inquiry is necessary to explore and propose a potential moderating effect when a contextual variable is found pertinent but has not been empirically tested in the same field of study. As such, data collection techniques such as focus groups, participant observation and personal interviews are recommended to identify contextual factors which affect the nature of the relationship between an antecedent and an outcome in a natural setting. We recommend that researchers read Andersson et al. (2014) and Frazier et al. (2004) for a better understanding of moderating variables.

Difference Between Simple Moderation and Multi-Group Analysis
Multi-group analysis (MGA) helps researchers to assess whether two or more variables have the same or different relations across groups (MacKinnon, 2011). Specifically, when the moderator variable is categorical, such as nationality or industry type, the preferred analytical technique is MGA if the moderation effect is on the entire model. In other words, MGA tests and compares the effect of every structural path across groups (Aguinis et al., 2017; Boyd, Haynes, Hitt, Bergh, & Ketchen, 2012; Ting, Fam, Hwa, Richard, & Xing, 2019) by comparing the parameters between two or more groups. MGA is altogether different from a t-test or ANOVA, as the latter are performed via univariate analysis.
Since the moderator is expected to exert its effect on all the structural paths of the model rather than a specific path in MGA, a measurement invariance test is mandatory. Its primary purpose is to ensure that measurement model assessments conducted under different conditions yield equivalent representations of the same constructs (Hair, Black, Babin, & Anderson, 2010). In a similar vein, Hult et al. (2008, p. 1028) pointed out that "failure to establish data equivalence is a potential source of measurement error (i.e., discrepancies of what is intended to be measured and what is actually measured), which accentuates the precision of estimators, reduces the power of statistical tests of hypotheses, and provides misleading results." In CB-SEM, configural invariance, metric invariance, and scalar invariance need to be assessed prior to MGA (see Hair et al., 2010). In PLS-SEM, however, configural invariance, compositional invariance, equal means, and equal variances need to be examined instead (see Henseler, Ringle, & Sarstedt, 2016). Notably, researchers need to establish at least partial invariance from the metric invariance (or compositional invariance) test in order to proceed to MGA (Hair et al., 2010; Henseler et al., 2016).
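Once invariance has been established, the core idea of comparing one path across two groups can be sketched as follows. This is an illustration only: it uses simulated data, a single regression slope standing in for a structural path, and a permutation test of the slope difference. The group sizes, the path values (0.8 and 0.2), the seed, and the helper names `slope` and `slope_difference` are all assumptions made for the example, and the sketch does not reproduce what any SEM package does internally.

```python
import random

def slope(x, y):
    """Ordinary least-squares slope of y on x (one path coefficient)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def slope_difference(x, y, group):
    """Slope in group 0 minus slope in group 1."""
    slopes = []
    for g in (0, 1):
        xs = [xi for xi, gi in zip(x, group) if gi == g]
        ys = [yi for yi, gi in zip(y, group) if gi == g]
        slopes.append(slope(xs, ys))
    return slopes[0] - slopes[1]

# Simulated data (assumption): the X -> Y path is 0.8 in group 0 and 0.2 in group 1.
rng = random.Random(2019)
n_per_group = 200
group = [0] * n_per_group + [1] * n_per_group
x = [rng.gauss(0, 1) for _ in group]
y = [(0.8 if g == 0 else 0.2) * xi + rng.gauss(0, 1) for xi, g in zip(x, group)]

observed = slope_difference(x, y, group)

# Permutation test: shuffle the group labels to build the null distribution
# of the slope difference when group membership is irrelevant.
n_permutations = 500
extreme = 0
labels = group[:]
for _ in range(n_permutations):
    rng.shuffle(labels)
    if abs(slope_difference(x, y, labels)) >= abs(observed):
        extreme += 1
p_value = (extreme + 1) / (n_permutations + 1)

print(f"slope difference = {observed:.3f}, permutation p = {p_value:.3f}")
```

In a full MGA the same comparison is repeated for every structural path, which is why equivalent measurement across groups has to be shown first.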
Simple moderation analysis, in turn, is appropriate when the moderator is expected to exert its effect on specific structural path(s), with the support of relevant theory. As discussed earlier, the moderator could be a continuous variable or a categorical variable. A simple moderation effect can be assessed by creating a moderated regression model that explains whether a moderator alters the strength and/or direction of the relationship between an antecedent (independent variable) and an outcome (Andersson et al., 2014; Baron & Kenny, 1986). Note that a moderator with continuous data should not be converted to categorical data when assessing its interaction (Dawson, 2014). This is because doing so reduces the statistical power of the test, thus making it more difficult to detect a significant effect (Cohen et al., 2003; Stone-Romero & Anderson, 1994). In addition, it raises concerns about the choice of dividing point (i.e., median or mean) used to run the analysis (Aguinis et al., 2017, p. 10).
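The cost of converting a continuous moderator into categories can be made concrete with a small simulation. The sketch below is a hedged illustration, not a power analysis: it fits the same moderated regression twice on simulated data, once with the continuous moderator and once with a median split, and compares the variance explained. The data-generating coefficients, the sample size, the seed, and the helper names `ols` and `r_squared` are assumptions made for the example.

```python
import random

def ols(columns, y):
    """OLS coefficients [intercept, b1, ...] via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    aug = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
           + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for i in range(k):
            if i != c:
                factor = aug[i][c]
                aug[i] = [v - factor * w for v, w in zip(aug[i], aug[c])]
    return [row[-1] for row in aug]

def r_squared(columns, y):
    """Share of the variance in y explained by the fitted model."""
    b = ols(columns, y)
    fitted = [b[0] + sum(bj * col[i] for bj, col in zip(b[1:], columns))
              for i in range(len(y))]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Simulated data (assumption): Y = 1 + 0.5 X + 0.2 M + 0.3 X*M + noise.
rng = random.Random(42)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
m = [rng.gauss(0, 1) for _ in range(n)]
y = [1.0 + 0.5 * xi + 0.2 * mi + 0.3 * xi * mi + rng.gauss(0, 0.3)
     for xi, mi in zip(x, m)]

# Continuous moderator: interaction term Z = X * M.
z = [xi * mi for xi, mi in zip(x, m)]
b_continuous = ols([x, m, z], y)
r2_continuous = r_squared([x, m, z], y)

# Median split: M is replaced by a 0/1 dummy before forming the interaction.
median_m = sorted(m)[n // 2]
d = [1.0 if mi >= median_m else 0.0 for mi in m]
zd = [xi * di for xi, di in zip(x, d)]
r2_split = r_squared([x, d, zd], y)

print(f"interaction coefficient (continuous M): {b_continuous[3]:.3f}")
print(f"R2 continuous = {r2_continuous:.3f}, R2 median split = {r2_split:.3f}")
```

The drop in explained variance after the split reflects the information discarded within each half of the moderator, which is the mechanism behind the loss of power noted above.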

When to Use a Moderating Variable
With reference to the earlier discussion about how to identify potential moderators, moderating variables are introduced when there is an unexpectedly weak or inconsistent relation between an antecedent (independent variable) and an outcome across studies (Baron & Kenny, 1986; Frazier et al., 2004). An inconsistent or inconclusive relation exists when "a relation holds in one setting but not in another, or for one subpopulation but not for another" (Baron & Kenny, 1986, p. 1178). In most cases, a moderator is either an antecedent (independent variable) tested in past studies or a contextual factor found relevant across different fields of study. Froese, Peltokorpi, Varma, and Hitotsuyanagi-Hansel (2018) provide a good example of such an approach, where the authors point to previous inconclusive findings as the basis for testing the moderating effects of employee demographic characteristics between merit-based rewards and job satisfaction.
Moreover, moderating variables can also be tested for the purpose of gaining new theoretical insights (Andersson et al., 2014). For instance, Hauff, Richter, and Tressin (2015) filled a research gap by investigating how national culture moderates the influence of different job characteristics on job satisfaction. In either case, strong theoretical support is required to justify the inclusion of a moderating variable in an existing or exploratory model. There must be theoretical arguments as to why the inclusion of a certain moderator will result in a better explanation of the phenomenon under investigation (Andersson et al., 2014). A moderator should not be included on a "trial and error" basis, nor should it be added to make the model complex on the assumption that complexity leads to a more significant contribution (be it for Ph.D. research or publication). In addition to the works of Andersson et al. (2014) and Frazier et al. (2004), we further recommend these references for an improved understanding of the use of moderating variables: Dawson (2014), Baron and Kenny (1986) and Aguinis et al. (2017).

How to Conceptualize/Hypothesize a Moderating Relationship
We strongly recommend the seven-step framework by Andersson et al. (2014) for conceptualizing or hypothesizing moderating relationships. This framework suggests that the researchers should: (1) Identify the theory that explains the direct and moderating effects, (2) apply the selected theory to the research question and explain the direct effect and the mechanisms behind it, (3) provide a theoretical justification for the choice of moderator variable (M), (4) explain the direct effect of the moderator variable (M) on the dependent variable (Y) to clarify how this direct effect differs from the moderating effect (Z), (5) explain how the moderating effect (Z) changes the mechanisms by strengthening or weakening the direct relationship, (6) theoretically rule out the reverse interaction in which the independent variable (X) is moderating the relationship between the moderating variable (M) and the dependent variable (Y), (7) return to theory when interpreting the results and explain them from a theoretical viewpoint. These steps can be adapted and modified depending on the specific research question and the nature of the study.
It is necessary to emphasize again that the inclusion of moderating effects must be justified by theory, rather than by the statistical significance of the moderating effect. Researchers should ensure that the explanation of the moderating effect (Z) differs from the explanation of the direct effect as well as from the explanation of the impact of the moderating variable (M) on the dependent variable (Y). Simply stating that "M moderates the relationship between X and Y" does not constitute a good hypothesis. Instead, authors should explicitly state the directionality of the interaction by postulating either a positive or an inverse relationship based on the literature (Aguinis et al., 2017). For example, "The impact of X on Y will increase when M is present" or "The relationship between X and Y will be stronger when M is lower". Gardner, Harris, Li, Kirkman, and Mathieu (2017) suggest three possible types of interaction effect that can be exerted by moderators. Specifically, a moderator can (1) strengthen a relationship, (2) weaken a relationship or (3) reverse or change a relationship. We strongly encourage researchers to read Andersson et al. (2014), Aguinis et al. (2017), Gardner et al. (2017) and Baron and Kenny (1986) to better understand how the moderating relationship is conceptualized.

Approaches for Moderation Analysis
When performing statistical analysis using structural equation modeling, specifically PLS-SEM, there are several approaches for moderation analysis, including the product-indicator, two-stage, and orthogonalizing approaches. The product-indicator approach (Chin, Marcolin, & Newsted, 2003) multiplies the indicators of the independent variable by the indicators of the moderator variable. This approach is recommended for reflective models. It can also be used for multi-group analysis when the moderator is categorical (with a continuous independent variable). However, it is not appropriate when the independent and/or moderator variable is measured formatively. One of the weaknesses of this approach is that it produces collinearity in the structural model (Fassott, Henseler, & Coelho, 2016).
In the two-stage approach (Henseler & Chin, 2010; Henseler & Fassott, 2010), the latent construct scores are first calculated and saved. The interaction term (Z) is subsequently built up as the element-wise product of the construct scores of X and M. This interaction term, together with the latent variable scores of X and M, is then used as an independent variable in a multiple regression on the latent variable scores of Y (Fassott et al., 2016, p. 1891). This approach is recommended when the independent (X) or the moderator (M) is a formative variable. Like the product-indicator approach, the two-stage approach may also induce collinearity as it involves an interaction term.
The orthogonalizing approach (Henseler & Chin, 2010) is an extension of the product-indicator approach. Unlike the product-indicator and the two-stage approaches, the orthogonalizing approach eliminates the issue of collinearity through residual centering. Additionally, it is superior in terms of parameter and prediction accuracy. However, it is only applicable when both the independent (X) and moderator (M) variables are reflective. Lower statistical power is considered one of the main disadvantages of this approach (Fassott et al., 2016). We also recommend Becker, Ringle, and Sarstedt (2018), Fassott et al. (2016), Henseler and Chin (2010), and Henseler and Fassott (2010) for a detailed discussion of these approaches. A detailed discussion on this matter is also provided in the 2nd edition of our PLS manual.
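The difference between building the interaction term as a plain product and building it through residual centering can be sketched in a few lines. This is a simplified illustration under stated assumptions: it works on simulated, uncentered scale scores rather than on PLS-estimated construct scores, and it applies residual centering to a single product of scores, whereas the orthogonalizing approach is normally applied to each product indicator. The helper names `ols` and `corr`, the seed, and the simulated values are hypothetical.

```python
import random

def ols(columns, y):
    """OLS coefficients [intercept, b1, ...] via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    aug = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
           + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for i in range(k):
            if i != c:
                factor = aug[i][c]
                aug[i] = [v - factor * w for v, w in zip(aug[i], aug[c])]
    return [row[-1] for row in aug]

def corr(a, b):
    """Pearson correlation between two equally long lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / (saa * sbb) ** 0.5

# Simulated scores on a scale centered near 4 (assumption for illustration).
rng = random.Random(7)
n = 300
x_scores = [4.0 + rng.gauss(0, 1) for _ in range(n)]
m_scores = [4.0 + 0.5 * (xi - 4.0) + rng.gauss(0, 1) for xi in x_scores]

# Two-stage style interaction term: element-wise product of the scores.
z_product = [xi * mi for xi, mi in zip(x_scores, m_scores)]

# Residual centering: regress the product on X and M and keep the residuals.
b = ols([x_scores, m_scores], z_product)
z_orth = [zi - (b[0] + b[1] * xi + b[2] * mi)
          for zi, xi, mi in zip(z_product, x_scores, m_scores)]

print(f"corr(product, X)  = {corr(z_product, x_scores):.3f}")
print(f"corr(residual, X) = {corr(z_orth, x_scores):.3f}")
print(f"corr(residual, M) = {corr(z_orth, m_scores):.3f}")
```

The residualized term is uncorrelated with X and M by construction, which is how residual centering removes the collinearity that the plain product carries.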

Pre-Analysis Guidelines
In addition to the 7-step framework by Andersson et al. (2014), we suggest seven important guidelines related to the research design which should be considered before data collection. The researchers should (1) use scales having sufficient scale points. Using a scale with insufficient scale points may result in possible information loss, thus preventing the detection of a moderating effect (Aguinis, 1995). Based on our experience and discussions with experts, we believe that a 7-point Likert scale works better for a moderating variable compared to those scales with fewer scale points in the Malaysian context due to collectivistic culture, (2) pretest the instrument before the main data collection. A thorough pretest session by means of protocol or debriefing techniques, rather than a pilot study for performing reliability test, with target participants can be an invaluable exercise and it minimizes the risk of a low-quality dataset and potentially unexpected results, (3) keep the important items (e.g., items related to the moderating variable) up in the order, especially when the questionnaire is relatively lengthy, (4) run power analysis (see Faul, Erdfelder, Buchner, & Lang, 2009;Kock & Hadaya, 2018). As far as power analysis is concerned, we recommend running it twice. First, we should run it before the data collection to determine the sample size required to achieve desirable power and effect size. Second, it should be conducted after data collection to ensure that the study has sufficient sample size required for a moderating effect to be detected. 
When there are several moderators in the model and the sample size is small, we recommend analyzing each moderator separately and justifying this choice, (5) screen out suspicious responses (e.g., straight-lining and zigzag patterns) and responses whose standard deviation is below 0.5, which indicates too little variation, (6) do "homework" first through reading in order to understand some of the fundamental concepts and approaches related to moderation analysis (it baffles us when Ph.D. candidates ask questions about moderation which they could have easily answered through some reading), (7) select an appropriate statistical package (e.g., SmartPLS, IBM SPSS AMOS, WarpPLS, IBM SPSS Statistics, and SPSS/SAS Macro) based on its characteristics and make sure the techniques are relevant to the research problems/objectives/questions/hypotheses. Although IBM SPSS AMOS is widely used for data analysis in social science and business research, SmartPLS 3.0 (Ringle, Wende, & Becker, 2015) has received good coverage recently due to its built-in features that run all moderating approaches (e.g., product-indicator, two-stage, and orthogonalizing) with a few simple clicks. SmartPLS 3.0 also makes MGA easier to execute than other frequently used packages (e.g., IBM SPSS AMOS). The same is true of WarpPLS 6.0 (Kock, 2017). SPSS Macro (Preacher & Hayes, 2004) can also be a good tool for moderation analysis, though it does not provide graphical output. We recommend several key references on the subject matter, including the seminal works by Aguinis et al. (2017), Aguinis (1995), and Andersson et al. (2014). Also, a brief explanation related to pre-analysis guidelines is given in Memon et al. (2017).
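The screening step in guideline (5) can be sketched as a simple per-respondent check. This is one possible reading, stated as an assumption: the 0.5 threshold is applied to the sample standard deviation of each respondent's answers across items, and a zigzag pattern is defined narrowly as answers alternating between exactly two values. The function names, respondent labels, and answers are hypothetical.

```python
from statistics import stdev

def is_low_variation(row, threshold=0.5):
    """True when a respondent's answers barely vary (includes straight-lining)."""
    return stdev(row) < threshold

def is_zigzag(row):
    """True when answers alternate between exactly two different values."""
    return row[0] != row[1] and all(row[i] == row[i - 2] for i in range(2, len(row)))

# Hypothetical answers to six 7-point items.
responses = {
    "R1": [4, 4, 4, 4, 4, 4],  # straight-lining
    "R2": [1, 7, 1, 7, 1, 7],  # zigzag
    "R3": [5, 6, 4, 5, 7, 3],  # ordinary variation
    "R4": [3, 3, 4, 3, 3, 3],  # almost no variation
}

flagged = {}
for respondent, row in responses.items():
    if is_low_variation(row):
        flagged[respondent] = "low variation"
    elif is_zigzag(row):
        flagged[respondent] = "zigzag"

for respondent, reason in flagged.items():
    print(f"{respondent}: {reason}")
```

A zigzag response has a large standard deviation, so the variation rule alone would not catch it; the two checks complement each other. Flagged cases are candidates for inspection rather than automatic deletion.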

Analyzing and Reporting Moderation Effect
The main objective of moderation analysis is to "measure and test the differential effect of the independent variable on the dependent variable as a function of the moderator" (Baron & Kenny, 1986, p. 1174). The steps involved in analyzing a moderating effect vary based on the statistical package and the approach used. Although it is beyond the scope of this editorial to explain the analytical steps of each approach using different packages, we recommend that the following general guidelines be considered while analyzing and reporting moderation analysis. Whichever statistical package is used, researchers must take care of three key points while performing a moderation analysis. (1) First, the researcher should focus on the significance of the moderating effect (Z). To clarify, a moderator variable (M) may or may not have an effect of its own on the dependent variable (Y). Thus, the decision as to whether there is any moderating effect should be made based on a significant relationship between the moderating effect (Z) and the dependent variable (Y). (2) Second, researchers must calculate and report the effect size (f²) and how much the moderating effect contributes to R². Only a few software packages (e.g., SmartPLS 3.0) calculate f² by default. For others, there are online spreadsheets which can be used to calculate effect size (see http://statwiki.kolobkreations.com).
(3) Lastly, researchers must execute and report a simple slope plot for the visual inspection of the direction and strength of the moderating effect. SmartPLS users can check out a simple slope plot under "Final Results" and "Simple Slope Analysis". As a final note, we reemphasize the seventh step suggested by Andersson et al. (2014) that the researchers should "return to theory when interpreting the results and explain them from a theoretical viewpoint". In other words, they should put more emphasis on the substantive meaning of such results in terms of the theoretical understanding of the phenomenon under investigation rather than the statistical significance. For a practical explanation and step-by-step guidelines, we strongly recommend the PLS Primer by Hair, Hult, Ringle, and Sarstedt (2017) and the PLS manual by  to assess moderation in SmartPLS 3.0.
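The effect size in point (2) and the simple slopes behind the plot in point (3) can both be computed by hand. The sketch below uses Cohen's formula, f² = (R² with the interaction term minus R² without it) divided by (1 minus R² with the interaction term), and the simple slope b_x + b_z * M. All numerical values and the function names are hypothetical and are not taken from any study.

```python
def f_squared(r2_included, r2_excluded):
    """Cohen's f2 for the interaction term: change in R2 over unexplained variance."""
    return (r2_included - r2_excluded) / (1.0 - r2_included)

def simple_slope(b_x, b_interaction, m_value):
    """Slope of Y on X at a given moderator value: b_x + b_interaction * M."""
    return b_x + b_interaction * m_value

# Hypothetical results: R2 of the model with and without the interaction term Z.
r2_with_z, r2_without_z = 0.30, 0.25
effect_size = f_squared(r2_with_z, r2_without_z)

# Hypothetical standardized coefficients of X and of the interaction term Z.
b_x, b_z = 0.50, 0.30
slopes = {label: simple_slope(b_x, b_z, m)
          for label, m in (("M at -1 SD", -1.0), ("M at mean", 0.0), ("M at +1 SD", 1.0))}

print(f"f2 = {effect_size:.3f}")
for label, value in slopes.items():
    print(f"{label}: slope of Y on X = {value:.2f}")
```

Plotting the three slopes against X gives the simple slope plot; with a positive interaction coefficient, as assumed here, the line becomes steeper as M increases.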

Emerging Issues to Consider
In this section, we wish to briefly discuss some of the issues raised by Aguinis et al. (2017) regarding moderation analysis. Seven issues require consideration: (1) lack of attention to measurement error, (2) variable distributions are assumed to include the full range of possible values, (3) unequal sample size across moderator-based categories, (4) insufficient statistical power, (5) artificial dichotomization of continuous moderators, (6) presumed effects of correlation between the product term and its components, and (7) interpreting first-order effects based on models excluding product terms. Aguinis et al. (2017) mentioned that 62.4% of the papers they reviewed in the Strategic Management Journal did not report measurement errors. This is consistent with Boyd et al. (2005), who found that most articles published in the Strategic Management Journal did not report reliability. They argued that if X has a reliability of 0.7 and M has a reliability of 0.7, then the product term X*M would have a reliability of 0.7*0.7 = 0.49, which would not be acceptable. They also argued that when independent and moderator variables are measured with error, the unstandardized coefficient estimates will be biased. They suggested that future research should, at a minimum, report reliability estimates for all predictors, including product components, as this may be useful for interpretation when the interaction effects are not significant.
The second issue they raised concerns collected data which do not represent the full range of scores on the variables under consideration that exist in the population. For example, only the top-performing companies in urban areas are selected and sampled from the population. They referred to this issue as "range restriction", since the companies with poor performance are not included in the sample. Aguinis and Stone-Romero (1997) discovered that when sample variance is less than population variance, even by what may be considered a small amount, the statistical power for detecting moderating effects is substantially diminished. They suggested that researchers should attempt to capture the full range of scores of all variables involved in the analysis. If that is not feasible, then when the moderating effects are small or not significant, researchers should report the population variance to rule out range restriction as a plausible alternative explanation for the results obtained.
The third issue they raised applies to situations where the moderators are categorical. In this kind of situation, researchers should strive, as far as possible, to balance the sample size across the categories of the moderator variable. For example, if gender is used as a moderator and female respondents make up 80% of the sample, the moderating effect will be underestimated. If the categories are unevenly distributed, oversampling from the smaller group is likely to increase statistical power, but at the cost of using a sample that might not be representative of the population.
The fourth issue they raised is that many studies lack sufficient power to detect moderating effects. As a result, many moderating effects can go undetected. As the norm in the social sciences is a power of 0.80, researchers are advised to conduct a power analysis prior to collecting data to ensure that they have sufficient power for their analysis. They further suggested that statistical power can be increased by using larger samples and conducting research in settings that control for extraneous variables (i.e., experimental or simulation-based research). It is further recommended that power be computed and reported in future studies to dispel the notion that the study is underpowered.
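The reliability arithmetic for the product term can be reproduced with a short function. The sketch below uses a more general expression that also allows the two components to be correlated, commonly attributed to Bohrnstedt and Marwell (1978) for mean-centered, normally distributed variables. That source is not cited in this editorial, so the formula and the function name should be read as assumptions. With uncorrelated components it reduces to the simple product of the two reliabilities.

```python
def product_term_reliability(rel_x, rel_m, r_xm=0.0):
    """Reliability of X*M for centered X and M with correlation r_xm (assumed formula)."""
    return (rel_x * rel_m + r_xm ** 2) / (1.0 + r_xm ** 2)

# Two components with reliability 0.7 each.
uncorrelated = product_term_reliability(0.7, 0.7)          # reduces to 0.7 * 0.7
correlated = product_term_reliability(0.7, 0.7, r_xm=0.3)  # hypothetical correlation

print(f"uncorrelated components: {uncorrelated:.2f}")
print(f"components correlated at 0.3: {correlated:.2f}")
```

In both cases the product term is clearly less reliable than either component, which is why reporting the reliability of every predictor helps readers interpret non-significant interactions.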
The fifth issue they raised is the artificial dichotomization of a continuous moderator variable in the analysis which is commonly done through IBM SPSS AMOS. This is because the interaction effect generation in IBM SPSS AMOS is relatively tedious. They argue such an issue will lead to loss of information. It not only undermines the interpretation of the moderator but also reduces the variance of the moderator variable. Consequently, the estimated moderating effects are biased downward (Aguinis, 1995). They have also argued that this practice of artificially categorizing continuous moderator variables discards information, reduces statistical power to detect moderating effects, and attenuates the size of moderating effects. Hence, this practice should be discontinued (Aguinis, 1995;Aguinis & Gottfredson, 2010).
The sixth issue relates to the correlation between the product term Z and its component variables X and M, which is commonly regarded as multicollinearity. Many researchers therefore apply a procedure called mean centering to try to reduce this effect. Aguinis et al. (2017) contradict this common belief: the multicollinearity created in this way does not actually cause any problem for moderation analysis as long as X, M, and X*M are all included when running the analysis. They recommended mean centering for the sole purpose of facilitating the interpretation of coefficients on lower-order terms in the presence of interactions. They also emphasized that the results regarding interaction effects would likely remain unchanged regardless of whether the predictors are centered.
The last issue raised is the interpretation of the lower-order effect (the direct effect of X on Y) without including the interaction effects. This is not recommended because when an interaction exists, the predictor involved in the interaction does not have a single unique effect. Instead, it has a range of effects that vary according to the level of the moderating variable and are referred to as simple slopes (Aiken & West, 1991). Aguinis (2004) explained that because simple slopes represent a range of effects in most cases, it is not meaningful to hypothesize or test a single effect for a predictor when it interacts with a moderating variable. In conclusion, they suggested that lower-order effects (X on Y) should not be assessed without the interaction effects.
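The point about mean centering can be checked numerically. The sketch below fits the same moderated regression on raw and on mean-centered predictors and compares the coefficients. The simulated data, the seed, and the helper name `ols` are assumptions made for the example.

```python
import random

def ols(columns, y):
    """OLS coefficients [intercept, b1, ...] via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    aug = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
           + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for i in range(k):
            if i != c:
                factor = aug[i][c]
                aug[i] = [v - factor * w for v, w in zip(aug[i], aug[c])]
    return [row[-1] for row in aug]

# Simulated uncentered data (assumption): Y = 1 + 0.5 X + 0.2 M + 0.3 X*M + noise.
rng = random.Random(11)
n = 300
x = [3.0 + rng.gauss(0, 1) for _ in range(n)]
m = [3.0 + rng.gauss(0, 1) for _ in range(n)]
y = [1.0 + 0.5 * xi + 0.2 * mi + 0.3 * xi * mi + rng.gauss(0, 0.5)
     for xi, mi in zip(x, m)]

# Fit on raw predictors.
raw = ols([x, m, [xi * mi for xi, mi in zip(x, m)]], y)

# Fit on mean-centered predictors.
mean_x, mean_m = sum(x) / n, sum(m) / n
xc = [xi - mean_x for xi in x]
mc = [mi - mean_m for mi in m]
centered = ols([xc, mc, [a * b for a, b in zip(xc, mc)]], y)

print(f"interaction coefficient: raw = {raw[3]:.4f}, centered = {centered[3]:.4f}")
print(f"coefficient of X:        raw = {raw[1]:.4f}, centered = {centered[1]:.4f}")
```

The interaction coefficient is the same in both fits. Only the lower-order coefficients change: after centering, the coefficient of X equals the raw coefficient plus the interaction coefficient times the mean of M, that is, the slope of X at the average level of the moderator, which is what makes it easier to interpret.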

A FINAL NOTE
We know there is much more to moderation analysis. We hope that our humble attempt in this editorial provides a bird's-eye view of the subject matter and, more importantly, stimulates the interest of researchers, be they academics or postgraduate students, to keep learning and making progress in our understanding and application of statistical analysis. We should never do research mechanistically, as if there were a template answer to every question. There is also no one formula that solves all problems. Performing moderation analysis for the sake of making the model complex, and finding the easiest way to get things done without fundamental understanding, can be likened to sailing an unseaworthy ship. When a "storm" hits, the ship will sink. All analyses in social science and business research, including moderation analysis, must not be done in isolation from the research problems and questions, and should be conducted with rigor and with a sound theoretical explanation. We strongly believe our commitment to learning, like a moderator, will dictate and determine our research progress and performance.