ANOVA on principal component as an alternative to MANOVA

Because of its strict assumptions, practitioners often find it difficult to apply Multivariate Analysis of Variance (MANOVA) in their work. Normality of each response taken separately does not guarantee that the responses are jointly multivariate normal. To tackle this problem, we propose a method that simply applies Analysis of Variance (ANOVA) to a principal component (PC). A PC is a linear combination of the responses. Once all the responses are normally distributed, the PCs will be as well. Accordingly, all we need is to ensure that each response is normally distributed on its own; multivariate normality is not required. The results of our data analysis indicate that the proposed method can be used as an alternative to MANOVA, especially when the multivariate normality assumption cannot be fully guaranteed.


Introduction
Multivariate Analysis of Variance (MANOVA) is employed in experimental design to investigate the effects of treatments on several responses simultaneously, instead of applying Analysis of Variance (ANOVA) to each response separately [1], [2], [6]. However, practitioners find it difficult to apply MANOVA in their work because of the assumptions required to use it properly. One of these is multivariate normality. Even when normality holds for each response separately, there is no assurance that the responses are jointly multivariate normal. This strict assumption is difficult to satisfy in practice, and when it fails, MANOVA is no longer suitable.
Like ANOVA, MANOVA mainly aims to test the significance of the treatments and of the interaction between treatments (if the model contains an interaction term). When the null hypothesis (that all treatments are the same) is rejected, further analysis is needed to determine which treatment means differ from the others. In the multivariate case, it is difficult to perform a post hoc test to identify an optimum treatment. Interpreting the interaction term when it is significant is another problem. To overcome these problems, we propose applying ANOVA to a principal component (PC) instead of applying MANOVA to the multivariate responses. This approach is worth considering because a principal component is a linear combination of the original variables, that is, of the responses. When the response variables are separately normally distributed, the principal components will be as well. Ensuring that each response variable is normally distributed therefore confirms that the principal components also follow a normal distribution. With PCs, we apply ANOVA to a single response variable only, namely the principal component score that contributes most to the total variation of the original response variables.

Principal Component Analysis (PCA)
PCA is a technique for transforming a set of $p$ original variables, usually correlated, say $X_1, X_2, \ldots, X_p$, into a new set of $r$ ($r \le p$) uncorrelated variables, say $Y_1, Y_2, \ldots, Y_r$. These new variables are called principal components and are linear combinations of the original variables. Geometrically, the linear combinations represent a new coordinate system, obtained by rotating the original one, with $Y_1, Y_2, \ldots, Y_r$ as the axes. These axes represent the directions of maximum variability in the covariance structure [1], [3], [4], [5]. Suppose the random vector $X' = [X_1, X_2, \ldots, X_p]$ has covariance matrix $\Sigma$ with characteristic roots $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$. Consider the following linear combinations.

$$
\begin{aligned}
Y_1 &= a_1'X = a_{11}X_1 + a_{12}X_2 + \cdots + a_{1p}X_p \\
Y_2 &= a_2'X = a_{21}X_1 + a_{22}X_2 + \cdots + a_{2p}X_p \\
&\;\;\vdots \\
Y_p &= a_p'X = a_{p1}X_1 + a_{p2}X_2 + \cdots + a_{pp}X_p
\end{aligned}
$$

$Y_1, Y_2, \ldots, Y_p$ are the principal components: linear combinations of the original variables that are uncorrelated with one another. The first principal component, which explains the largest share of the total variance, is obtained by maximizing $\mathrm{Var}(Y_1) = a_1'\Sigma a_1$ subject to $a_1'a_1 = 1$. The second principal component maximizes $\mathrm{Var}(Y_2) = a_2'\Sigma a_2$ subject to $a_2'a_2 = 1$ and $\mathrm{Cov}(Y_2, Y_1) = 0$. In general, the $i$th principal component maximizes $\mathrm{Var}(Y_i) = a_i'\Sigma a_i$ subject to $a_i'a_i = 1$ and $\mathrm{Cov}(Y_i, Y_j) = 0$ for all $j < i$.
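
The constrained maximization above has a standard solution: the coefficient vectors $a_i$ are the unit-length eigenvectors of the covariance matrix, ordered by decreasing eigenvalue, and $\mathrm{Var}(Y_i) = \lambda_i$. The following is a minimal sketch of that computation, assuming NumPy and using the sample covariance matrix in place of $\Sigma$. The function name `principal_components` and the simulated data are illustrative assumptions, not part of the original analysis.

```python
import numpy as np


def principal_components(X):
    """Eigen-decompose the sample covariance of X (rows = observations).

    Returns eigenvalues in descending order, the matching eigenvectors as
    columns, and the principal component scores of the centered data.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                # center each variable
    S = np.cov(Xc, rowvar=False)           # p x p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder to descending
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    scores = Xc @ eigvecs                  # column i holds the scores of Y_{i+1}
    return eigvals, eigvecs, scores


# Hypothetical data: three correlated responses driven by one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(80, 1))
X = latent @ np.array([[1.0, 0.8, 0.5]]) + 0.3 * rng.normal(size=(80, 3))

eigvals, eigvecs, scores = principal_components(X)
explained = eigvals / eigvals.sum()        # share of total variance per component
print("share of total variance:", np.round(explained, 3))
```

The first column of `scores` is the first principal component score, the single variable that would then be analyzed with ANOVA.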

ANOVA and MANOVA Model
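The MANOVA model is similar to the ANOVA model. The only difference is that it uses a multivariate response instead of a single response variable. In this paper we show only a two-factor model with interaction. For a two-factor ANOVA, the model with interaction can be written as

$$
Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk} \tag{1}
$$

where $\sum_i \alpha_i = \sum_j \beta_j = \sum_i \gamma_{ij} = \sum_j \gamma_{ij} = 0$ and the $\varepsilon_{ijk}$ are independently and normally distributed with zero mean and variance $\sigma^2$. In this model, $\mu$ is the general mean, $\alpha_i$ is the fixed effect of factor I, $\beta_j$ is the fixed effect of factor II, and $\gamma_{ij}$ is the interaction between factor I and factor II [1], [3]. As in the ANOVA model, the MANOVA model can be written as Equation (2), in which all terms are vectors:

$$
\mathbf{Y}_{ijk} = \boldsymbol{\mu} + \boldsymbol{\alpha}_i + \boldsymbol{\beta}_j + \boldsymbol{\gamma}_{ij} + \boldsymbol{\varepsilon}_{ijk} \tag{2}
$$

where $\sum_i \boldsymbol{\alpha}_i = \sum_j \boldsymbol{\beta}_j = \sum_i \boldsymbol{\gamma}_{ij} = \sum_j \boldsymbol{\gamma}_{ij} = \mathbf{0}$. All vectors are of order $p \times 1$ ($p$ is the number of response variables), and the $\boldsymbol{\varepsilon}_{ijk}$ are independent multivariate normal random vectors, $N_p(\mathbf{0}, \Sigma)$ [1].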
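
For a balanced design, the two-factor ANOVA model with interaction leads to the usual partition of the total sum of squares into factor I, factor II, interaction, and error components. The sketch below computes that partition and the F statistics for a single response, assuming NumPy and a response array of shape (levels of factor I, levels of factor II, replicates). The function name `two_way_anova`, the effect sizes, and the simulated data are hypothetical; the design size (4 × 4 levels, 5 replicates) is chosen only to match the layout described in this paper and is not its dataset.

```python
import numpy as np


def two_way_anova(y):
    """Balanced two-factor ANOVA with interaction for y of shape (a, b, n)."""
    y = np.asarray(y, dtype=float)
    a, b, n = y.shape
    grand = y.mean()
    mean_a = y.mean(axis=(1, 2))    # factor I level means
    mean_b = y.mean(axis=(0, 2))    # factor II level means
    mean_ab = y.mean(axis=2)        # cell means

    ss = {
        "A": b * n * np.sum((mean_a - grand) ** 2),
        "B": a * n * np.sum((mean_b - grand) ** 2),
        "AB": n * np.sum(
            (mean_ab - mean_a[:, None] - mean_b[None, :] + grand) ** 2
        ),
        "Error": np.sum((y - mean_ab[:, :, None]) ** 2),
    }
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1),
          "Error": a * b * (n - 1)}
    ms_error = ss["Error"] / df["Error"]
    # F statistic for each effect: its mean square over the error mean square
    f = {k: (ss[k] / df[k]) / ms_error for k in ("A", "B", "AB")}
    return ss, df, f


# Hypothetical data: main effects only, no interaction, unit error variance.
rng = np.random.default_rng(1)
a, b, n = 4, 4, 5
alpha = np.array([-1.0, 0.0, 0.5, 0.5])    # sums to zero
beta = np.array([0.3, -0.3, 0.0, 0.0])     # sums to zero
y = (10.0 + alpha[:, None, None] + beta[None, :, None]
     + rng.normal(size=(a, b, n)))

ss, df, f = two_way_anova(y)
for effect in ("A", "B", "AB"):
    print(effect, "df =", df[effect], "F =", round(f[effect], 3))
```

Each F statistic would be compared with the F distribution on the corresponding effect and error degrees of freedom. In the proposed approach, `y` would hold the scores of the first principal component rather than an original response.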

Results and Discussions
To show that our proposed method works well, we used an artificial dataset with three response variables, two factors with four levels each, and five replications. First, we present an ANOVA for each response variable separately to see whether the factors and their interaction have significant effects. Then, we present the results from MANOVA to compare the effects on the response variables simultaneously. Finally we would be performing the results