Modifying and Evaluating the Alexander-Govern Test Using Real Data

This study examines the comparison of two or more independent group means using a parametric method, the Alexander-Govern (AG) test. The Alexander-Govern test is used for comparing two or more groups and is a better alternative to the James test, the Welch test and the ANOVA. The test gives good control of Type I error rates and high power under variance heterogeneity for normal data, but it is not robust to non-normal data. As a result, the trimmed mean was applied to the test for non-normal data under the two-group condition, but that version could not control the Type I error rates when the number of groups exceeded two. The MOM estimator was then introduced as the central tendency measure of the test, since it is not influenced by the number of groups, but this estimator fails to give good control of Type I error rates under skewed heavy-tailed distributions. In this study, the Winsorized MOM (WMOM) estimator was applied in the Alexander-Govern test as its central tendency measure, giving the AGWMOM test. To evaluate the capacity of the test, real-life data were used. Descriptive statistics, tests of normality and boxplots were used to assess the normality of the independent groups. The results show that only the group middle is not normally distributed, owing to an extreme value in the data. The AGWMOM test produced a p-value of 0.0000002869, which is less than 0.05, whereas the AG test produced a p-value of 0.06982, which is greater than 0.05. Therefore, the AGWMOM test is significant at the 0.05 level, while the AG test is not.


Introduction
In comparing independent group means, the analysis of variance (ANOVA) is applicable in many fields, such as sociology, agriculture, economics and medicine, as explained by Pardo, Pardo, Vincente and Esteban (1997). Three main assumptions must be fulfilled before the ANOVA can perform effectively: (i) homogeneity of variance, (ii) normality of the data and (iii) independence of the observations. The ANOVA is a classical group test used for comparing three or more means, and it is very sensitive to the assumption of homogeneity of variance. A violation of the assumptions affects the validity of the test, and the p-value may become too conservative or too large (Brown & Forsythe, 1974; Wilcox, Charlin & Thompson, 1986). Welch (1951) proposed the Welch test to solve the problem of heterogeneity of variance. This test modifies the calculation of the degrees of freedom in the common F test. For unequal variances, the Welch test gives good control of Type I error rates, but it fails to control the Type I error rates when the group sizes increase (Wilcox, 1988). James (1951) introduced the James test. According to Lix et al. (1996), Oshima and Algina (1992) and Wilcox (1988), the James test weights the sample means and is a better alternative to the ANOVA under variance heterogeneity, but it fails to give good control of Type I error rates for small sample sizes. The Welch test and the James test are used in analyzing non-normal data under variance heterogeneity (Brunner, Dette, & Munk, 1997; Kohr & Games, 1974; Krishnamoorthy, Lu, & Matthew, 2007; Wilcox & Keselman, 2003).
The Alexander-Govern test was introduced by Alexander and Govern (1994) to handle the problem of variance heterogeneity for normal data, but the test is not robust to non-normal data. Schneider and Penfield (1997) and Myers (1998) compared the Alexander-Govern test with the James test and the Welch test and concluded that it is the better alternative. Myers (1998) noted that the Alexander-Govern test gives a good solution to the problem of variance heterogeneity. Although the Alexander-Govern test is a better alternative to the ANOVA under variance heterogeneity, it has some disadvantages. As stated by Myers (1998), the major weakness of the test is that it cannot handle any deviation from normality; it controls the Type I error well only for normal data.
It is well established that the mean is a very good estimator for normal data, but it is extremely sensitive to outliers. The Alexander-Govern test uses the mean as its central tendency measure, and so it fails to give high power and good control of Type I error rates for non-normal data, that is, data that are not normally distributed. Empirical investigation shows that the Alexander-Govern test performs remarkably well in terms of Type I error control and power under variance heterogeneity, compared to the ANOVA, for normal data (Alexander & Govern, 1994). In addition, Schneider and Penfield (1997) observed that, among the alternatives to the ANOVA under variance heterogeneity, the Alexander-Govern test is preferable to the James test and the Welch test, because it is simple to calculate and possesses good control of Type I error rates and high power for normally distributed data.
According to Myers (1998), the Alexander-Govern test is recommended only for normal data, not for non-normal data, with respect to the control of Type I error rates. One way of dealing with non-normal data is transformation, a technique for converting a non-normal data distribution with heterogeneous variances so that the transformed scores become approximately normal with equal variances. Although transformation can correct skewed data, it has some disadvantages. According to Wilcox (2002), simple transformations such as the square root or the logarithm cannot eliminate the influence of outliers in a real data set. Moreover, when the transformation required is complex, it may fail to normalize skewed data. A further alternative for handling non-normal data is the non-parametric approach. Marascuilo and McSweeney (1977) explained that a non-parametric test makes no definite assumption about one or more of the population parameters that describe the data distribution. It is used to analyze nominal and rank-ordered data and is also referred to as a distribution-free test. Non-parametric tests are less sensitive than parametric tests when the assumptions of the parametric test are met; larger differences are therefore needed before the null hypothesis is rejected. Also, non-parametric methods require a large sample size to prevent loss of information. In view of these weaknesses of the non-parametric technique, scholars have turned to robust estimators as a better alternative for handling non-normal data. The robust estimator most frequently used to improve independent group tests is the trimmed mean. The trimmed mean has been successfully used to improve the Alexander-Govern test under variance heterogeneity for non-normal data (Guo & Luh, 2000; Lix & Keselman, 1995; Luh, 1999). Lix and Keselman (1998) introduced the trimmed mean as a better alternative to the usual mean as the central tendency measure for non-normal data. The trimmed mean is calculated by averaging only the middle data values after removing a certain percentage of the largest and smallest values, while its variance is evaluated using the Winsorized variance. Trimming is defined as the process of eliminating a percentage of extreme values from both tails of the data distribution when analyzing the data.
Generally, the same percentage of trimming is applied irrespective of the type of distribution. It is a costly mistake to trim a data distribution in which no outliers are found, particularly a normal distribution, because doing so leads to loss of information. For a skewed distribution, trimming should not be done equally on both tails of the distribution. Another weakness of the trimmed mean is that it could not give good control of Type I error rates when the number of groups is more than two, especially when it was applied in the Alexander-Govern test (Lix & Keselman, 1995).
Another technique for handling the influence of outliers in a data distribution is Winsorization. Hastings, Mosteller, Tukey and Winsor (1947) describe Winsorization as the replacement of a detected outlier with the value closest to it. In the Winsorization process, the sample size of the data distribution remains the same, whereas in the trimming process the detected outliers are removed from both the upper and lower tails of the distribution. Trimming therefore results in loss of information, while Winsorization helps to preserve the data.
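The contrast between trimming and Winsorization can be illustrated with a minimal Python sketch. The fixed 20% proportion per tail and the sample values are assumptions chosen only for illustration; they are not taken from the study.

```python
import math

def trimmed_mean(values, proportion=0.2):
    """Mean of the data left after dropping a fixed share from each tail."""
    xs = sorted(values)
    g = math.floor(proportion * len(xs))  # observations cut from each tail
    kept = xs[g:len(xs) - g]
    return sum(kept) / len(kept)

def winsorize(values, proportion=0.2):
    """Replace the g lowest and g highest values with the nearest retained value."""
    xs = sorted(values)
    n = len(xs)
    g = math.floor(proportion * n)
    if g == 0:
        return xs
    low, high = xs[g], xs[n - g - 1]
    return [min(max(x, low), high) for x in xs]

# Hypothetical sample with one extreme value.
sample = [2, 3, 4, 5, 6, 7, 8, 9, 10, 50]
print("trimmed mean:", trimmed_mean(sample))
print("winsorized sample:", winsorize(sample))
print("sizes:", len(sample), len(winsorize(sample)))
```

Trimming shortens the sample, whereas the Winsorized sample keeps the original size, which is the property the text relies on.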
According to Abdullah, Yahaya and Othman (2007), a better alternative to the trimmed mean in the Alexander-Govern test is a highly robust estimator known as the modified one-step M-estimator (MOM). These researchers found that, when the distribution of the data is skewed, the MOM estimator gives good control of Type I error rates, and that it empirically trims only the extreme data values, depending on the shape of the distribution, whether normal or skewed. The MOM estimator gave excellent control of Type I error rates when it was applied in the Alexander-Govern test under normal or highly skewed distributions, but it fails to give good control of Type I error rates under skewed heavy-tailed distributions (Othman et al., 2004).
In this study, descriptive statistics, tests of normality, boxplots and the test statistic were employed to evaluate the capacity of the AG test and the AGWMOM test using real data. The results from the test statistic show that the AGWMOM test is significant, whereas the AG test is not.

The Alexander-Govern Test
The Alexander-Govern test, introduced by Alexander and Govern (1994), uses the mean as its measure of central tendency and is used for comparing two or more groups. The test gives good control of Type I error rates and provides high power under variance heterogeneity for normal data, but it is not robust to non-normal data. The test statistic for the Alexander-Govern test is obtained using the procedure below. Firstly, the mean of each group is calculated using:

$$\bar{X}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} X_{ij}$$

where $X_{ij}$ denotes the observed ordered random samples and $n_j$ represents the sample size of group $j$. The mean is used as the measure of central tendency in the Alexander-Govern (1994) method. After the mean is obtained, the usual unbiased estimate of the variance is obtained using:

$$S_j^2 = \frac{1}{n_j - 1} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

where $\bar{X}_j$ is used for estimating $\mu_j$ for population $j$. The standard error of the mean is obtained for each of the groups using:

$$S_{ej} = \sqrt{\frac{S_j^2}{n_j}}$$

The weight $w_j$ for each of the independent groups is obtained using the formula:

$$w_j = \frac{1 / S_{ej}^2}{\sum_{j=1}^{J} 1 / S_{ej}^2}$$

where $\sum_{j=1}^{J} w_j$ must be equal to 1 (Alexander & Govern, 1994).
The null hypothesis of the Alexander-Govern (1994) technique, for the equality of the means under variance heterogeneity, is defined as:

$$H_0: \mu_1 = \mu_2 = \dots = \mu_J \qquad H_A: \mu_j \neq \mu_k \ \text{for at least one pair } j \neq k, \quad j, k = 1, \dots, J$$

The alternative hypothesis negates the claim made by the null hypothesis. The variance-weighted estimate of the total mean for all the groups in the data distribution is obtained using:

$$\hat{\mu} = \sum_{j=1}^{J} w_j \bar{X}_j$$

where $w_j$ is the weight of each group in the data distribution and $\bar{X}_j$ is the mean of each of the groups in the ordered sample. The t statistic for each of the groups is obtained using:

$$t_j = \frac{\bar{X}_j - \hat{\mu}}{S_{ej}}$$

where $\bar{X}_j$ represents the mean of each independent group, $\hat{\mu}$ represents the grand mean of all the groups under analysis and $S_{ej}$ denotes the standard error of each independent group $j$. Each $t_j$ is distributed as a t variable with $\nu_j = n_j - 1$ degrees of freedom, where $\nu_j$ is the degrees of freedom of each independent group in the ordered data distribution. The t statistic obtained for each group is transformed to a standard normal deviate using Hill's (1970) normalization approximation in the Alexander-Govern (1994) technique. The AG formula is defined as:

$$z_j = c + \frac{c^3 + 3c}{b} - \frac{4c^7 + 33c^5 + 240c^3 + 855c}{10b^2 + 8bc^4 + 1000b}$$

where

$$c = \left[ a \ln\left( 1 + \frac{t_j^2}{\nu_j} \right) \right]^{1/2}$$

and where $\nu_j = n_j - 1$, $a = \nu_j - 0.5$ and $b = 48a^2$. The test statistic for the Alexander-Govern approach is expressed as:

$$A = \sum_{j=1}^{J} z_j^2$$

The test statistic of the Alexander-Govern (AG) test, evaluated at the $\alpha = 0.05$ level of significance, is denoted by $A$. The p-value of the test is obtained from the standard chi-square distribution table, with $J - 1$ degrees of freedom. If the p-value obtained for the AG test is greater than 0.05, the AG test is not significant; otherwise (when the p-value is less than 0.05), the test is regarded as significant.
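The procedure above can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the example data are hypothetical, every group is assumed to have a non-zero variance, and the chi-square tail probability is computed from the closed-form series that holds for integer degrees of freedom instead of being read from a table.

```python
import math

def hill_z(t, nu):
    """Hill's (1970) normalization of a t value with nu degrees of freedom.
    The sign of t is dropped because only z squared enters the statistic."""
    a = nu - 0.5
    b = 48.0 * a ** 2
    c = math.sqrt(a * math.log(1.0 + t ** 2 / nu))
    return (c + (c ** 3 + 3.0 * c) / b
            - (4.0 * c ** 7 + 33.0 * c ** 5 + 240.0 * c ** 3 + 855.0 * c)
            / (10.0 * b ** 2 + 8.0 * b * c ** 4 + 1000.0 * b))

def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df (closed-form series)."""
    y = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= y / i
            total += term
        return math.exp(-y) * total
    total = 0.0
    for i in range(1, (df - 1) // 2 + 1):
        total += y ** (i - 0.5) / math.gamma(i + 0.5)
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total

def alexander_govern(groups):
    """Return (A, p-value) for a list of samples, one list per group."""
    stats = []
    for g in groups:
        n = len(g)
        mean = sum(g) / n
        var = sum((x - mean) ** 2 for x in g) / (n - 1)  # unbiased variance
        stats.append((n, mean, var / n))                 # var / n = squared SE
    inv = [1.0 / se2 for _, _, se2 in stats]
    weights = [v / sum(inv) for v in inv]                # weights sum to 1
    grand = sum(w * m for w, (_, m, _) in zip(weights, stats))
    a_stat = 0.0
    for n, mean, se2 in stats:
        t = (mean - grand) / math.sqrt(se2)
        a_stat += hill_z(t, n - 1) ** 2
    return a_stat, chi2_sf(a_stat, len(groups) - 1)

# Hypothetical data for three groups (not the data used in the study).
groups = [[12, 15, 14, 10, 13], [18, 21, 17, 20, 19], [11, 30, 9, 25, 14]]
a_stat, p_value = alexander_govern(groups)
print(f"A = {a_stat:.4f}, p = {p_value:.4f}")
```

When all group means coincide, every $t_j$ is zero and $A$ is zero, so the p-value is 1; this gives a quick sanity check on any implementation.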

The Modified Alexander-Govern Test
Given an ordered data sample $X_1, X_2, \dots, X_n$ of size $n$ from group $j$, the median $M$ of the data set is first obtained by selecting the middle value of the observations. The MAD estimator is the median of the absolute differences between each score and the median, that is, the median of $|X_1 - M|, \dots, |X_n - M|$. Thereafter, the median absolute deviation about the median ($MAD_n$) estimator is obtained using:

$$MAD_n = \frac{MAD}{0.6745}$$

According to Wilcox and Keselman (2003), the constant 0.6745 rescales the MAD estimator so that $MAD_n$ estimates $\sigma$ when sampling from a normal distribution.
Outliers in a data distribution can be detected using the formula below:

$$\frac{|X_j - M|}{MAD_n} > K$$

where $X_j$ represents the observed ordered random sample, $M$ is the median of the ordered random samples and $MAD_n$ is the median absolute deviation about the median. The value of $K$ is 2.24. This value was proposed by Wilcox and Keselman (2003) for detecting the presence of outliers in a data set because it gives a very small standard error when sampling from a normal distribution.
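The outlier criterion can be sketched in a few lines of Python. The function names and the sample are hypothetical, and the sketch assumes the MAD is non-zero (it would divide by zero if more than half of the observations were identical).

```python
import statistics

K = 2.24  # cut-off proposed by Wilcox and Keselman (2003)

def mad_n(values):
    """Median absolute deviation about the median, rescaled by 0.6745."""
    m = statistics.median(values)
    mad = statistics.median(abs(x - m) for x in values)
    return mad / 0.6745

def flag_outliers(values, k=K):
    """Return one boolean per observation: True where |X - M| / MAD_n > k."""
    m = statistics.median(values)
    scale = mad_n(values)
    return [abs(x - m) / scale > k for x in values]

# Hypothetical sample with one extreme value.
sample = [2, 3, 4, 5, 6, 7, 8, 9, 10, 50]
print(flag_outliers(sample))
```

For this sample the median is 6.5 and the MAD is 2.5, so only the value 50 exceeds the cut-off.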
Equations (13) and (14) are together referred to as the MOM estimator, which is used for detecting the presence of outliers in a data distribution. The Winsorized MOM estimator is obtained when each detected outlier is replaced with the preceding value closest to where the outlier is located.
The WMOM estimator, which replaces the mean as the measure of central tendency, is calculated using the formula below:

$$\bar{X}_{WMOM_j} = \frac{1}{n_j} \sum_{i=1}^{n_j} X_{WMOM_{ij}}$$

where $\bar{X}_{WMOM_j}$ is the mean of the Winsorized data distribution, $X_{WMOM_{ij}}$ is the ordered sample data of the Winsorized data distribution and $n_j$ is the sample size of the Winsorized data distribution. The WMOM estimator replaces the usual mean as the measure of central tendency in the Alexander-Govern test for the following reasons:
1. To remove the presence of outliers from the data distribution.
2. To make the Alexander-Govern test robust to non-normal data.
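A minimal Python sketch of the Winsorized MOM computation follows. It rests on one reading of the replacement rule, namely that each flagged outlier is replaced by the nearest value that was not flagged; the function name and data are hypothetical, and a non-zero MAD is assumed.

```python
import statistics

def wmom(values, k=2.24):
    """Return (Winsorized MOM estimate, Winsorized sample).

    Assumption: a flagged outlier is replaced by the closest unflagged value,
    so low outliers take the smallest retained value and high outliers the
    largest retained value.
    """
    m = statistics.median(values)
    scale = statistics.median(abs(x - m) for x in values) / 0.6745  # MAD_n
    kept = [x for x in values if abs(x - m) / scale <= k]
    low, high = min(kept), max(kept)
    winsorized = [min(max(x, low), high) for x in values]
    return sum(winsorized) / len(winsorized), winsorized

# Hypothetical sample with one extreme value.
sample = [2, 3, 4, 5, 6, 7, 8, 9, 10, 50]
estimate, winsorized = wmom(sample)
print(estimate, winsorized)
```

The sample size is unchanged by the replacement, so the divisor in the estimator is still the original group size.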
The Winsorized sample variance is expressed as:

$$S_{WMOM_j}^2 = \frac{1}{n_j - 1} \sum_{i=1}^{n_j} (X_{WMOM_{ij}} - \bar{X}_{WMOM_j})^2$$

The standard error of the Winsorized MOM is obtained using the bootstrap algorithm for estimating standard errors, which is defined by the following procedure. Firstly, we select $B$ independent bootstrap samples, expressed as:

$$x^{*1}, x^{*2}, \dots, x^{*B}$$

where each sample consists of $n$ data values drawn with replacement from $x$, which is defined as:

$$x^* = (x_1^*, x_2^*, \dots, x_n^*), \qquad x = (x_1, x_2, \dots, x_n)$$

The symbol $(*)$ indicates that $x^*$ is not the actual data set $x$, but a randomized or resampled version of $x$. To estimate the standard error, the number of bootstrap samples $B$ should lie within the interval 25 to 200. As stated by Efron and Tibshirani (1998), 50 bootstrap samples are a reasonable number to give a sufficient estimate of the standard error. In this study, 50 bootstrap samples were used to give a reasonable estimate of the standard error of the MOM estimator.
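Secondly, we evaluate the bootstrap replication corresponding to each of the bootstrap samples, expressed as:

$$\hat{\theta}^*(b) = s(x^{*b}), \qquad b = 1, 2, \dots, B$$

Thirdly, we estimate the standard error of $\hat{\theta}$ from the sample standard deviation of the $B$ bootstrap replications, defined as:

$$\widehat{se}_B = \left\{ \frac{1}{B - 1} \sum_{b=1}^{B} \left[ \hat{\theta}^*(b) - \hat{\theta}^*(\cdot) \right]^2 \right\}^{1/2}, \qquad \hat{\theta}^*(\cdot) = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^*(b)$$

The weight $w_j$ for the Winsorized data distribution for each of the independent groups is obtained using:

$$w_j = \frac{1 / S_{eWMOM_j}^2}{\sum_{j=1}^{J} 1 / S_{eWMOM_j}^2}$$

where $S_{eWMOM_j}$ is the bootstrap standard error of the Winsorized MOM for group $j$ and $\sum_{j=1}^{J} 1 / S_{eWMOM_j}^2$ is the sum of the reciprocals of the squared standard errors of all the independent groups in the ordered data set from the real-life data. Note: The weights $w_j$ are defined such that $\sum_{j=1}^{J} w_j$ equals 1 (Alexander & Govern, 1994).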
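The three bootstrap steps can be sketched as follows. The function name, the fixed seed and the example data are assumptions made for illustration; the estimator is passed in as a callable, so any location estimator (the ordinary mean is used here) can take the place of $s(\cdot)$.

```python
import random
import statistics

def bootstrap_se(values, estimator, B=50, seed=0):
    """Bootstrap standard error of `estimator` from B resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    n = len(values)
    replications = []
    for _ in range(B):
        # Step 1: draw n values with replacement from the original sample.
        resample = [rng.choice(values) for _ in range(n)]
        # Step 2: evaluate the estimator on the bootstrap sample.
        replications.append(estimator(resample))
    # Step 3: sample standard deviation of the replications (divisor B - 1).
    return statistics.stdev(replications)

# Hypothetical sample; B = 50 follows the number used in the study.
sample = [2, 3, 4, 5, 6, 7, 8, 9, 10, 50]
se_mean = bootstrap_se(sample, statistics.mean, B=50)
print(se_mean)
```

Because the result depends on the random resamples, two runs with different seeds will give slightly different standard errors; the seed is fixed here only so that the sketch is reproducible.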
The variance-weighted estimate of the total mean of the Winsorized data distribution for all the independent groups is defined as:

$$\hat{\mu}_{WMOM} = \sum_{j=1}^{J} w_j \bar{X}_{WMOM_j}$$

The t statistic of the Winsorized data distribution for each of the independent groups is expressed as:

$$t_{WMOM_j} = \frac{\bar{X}_{WMOM_j} - \hat{\mu}_{WMOM}}{S_{eWMOM_j}}$$

where $S_{eWMOM_j}$ is the Winsorized sample standard error from the Winsorized data distribution for each of the independent groups $j$.


In the Alexander-Govern (1994) approach, the $t_j$ value is transformed to a standard normal deviate using Hill's (1970) normalization approximation, and the hypothesis test is carried out on the Winsorized data distribution. Therefore, the normalization approximation formula for the Alexander-Govern approach, using the Winsorized data distribution, is defined as:

$$z_{WMOM_j} = c + \frac{c^3 + 3c}{b} - \frac{4c^7 + 33c^5 + 240c^3 + 855c}{10b^2 + 8bc^4 + 1000b}$$

where

$$c = \left[ a \ln\left( 1 + \frac{t_{WMOM_j}^2}{\nu_j} \right) \right]^{1/2}$$

and where $\nu_j = n_j - 1$, $a = \nu_j - 0.5$ and $b = 48a^2$. The test statistic of the AGWMOM test for all the independent groups in the ordered sample data is defined as:

$$AG_{WMOM} = \sum_{j=1}^{J} z_{WMOM_j}^2$$

The test statistic of the AGWMOM test follows a chi-square distribution with $J - 1$ degrees of freedom and is evaluated at the $\alpha = 0.05$ level of significance. The p-value of the AGWMOM test is obtained from the standard chi-square distribution table. If the p-value is less than 0.05, the AGWMOM test is significant; otherwise it is not.
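Since the modified test changes only the location estimate and its standard error, the statistic can be sketched as a function of those per-group quantities. The function names and the numbers in the example are hypothetical; in practice the locations would be the WMOM estimates and the standard errors their bootstrap estimates.

```python
import math

def hill_z(t, nu):
    """Hill's (1970) normalization of a t value with nu degrees of freedom."""
    a = nu - 0.5
    b = 48.0 * a ** 2
    c = math.sqrt(a * math.log(1.0 + t ** 2 / nu))
    return (c + (c ** 3 + 3.0 * c) / b
            - (4.0 * c ** 7 + 33.0 * c ** 5 + 240.0 * c ** 3 + 855.0 * c)
            / (10.0 * b ** 2 + 8.0 * b * c ** 4 + 1000.0 * b))

def agwmom_statistic(locations, std_errors, sizes):
    """Sum of squared z values from per-group estimates and standard errors."""
    inv = [1.0 / se ** 2 for se in std_errors]
    weights = [v / sum(inv) for v in inv]          # weights sum to 1
    grand = sum(w * x for w, x in zip(weights, locations))
    total = 0.0
    for x, se, n in zip(locations, std_errors, sizes):
        t = (x - grand) / se
        total += hill_z(t, n - 1) ** 2
    return total

# Hypothetical per-group WMOM estimates, bootstrap standard errors and sizes.
stat = agwmom_statistic([20.1, 24.8, 22.3], [0.9, 1.4, 1.1], [20, 20, 20])
print(f"AGWMOM statistic = {stat:.4f} on {3 - 1} degrees of freedom")
```

A smaller standard error for a group increases both its weight and the size of its t value, which is why replacing the usual standard errors with smaller robust ones can raise the statistic substantially.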

Discussion and Conclusion
The Shapiro-Wilk test is most suitable for sample sizes of less than 50, although it can also handle sample sizes as large as 2000. For this reason, the Shapiro-Wilk test is used to test the normality of the three independent groups, namely young, middle and old. If the significance value for a group is greater than 0.05, the data are said to be normally distributed; if it is less than 0.05, the data distribution is said to be non-normal. The results in Table 5 show that the p-values for the groups young and old are greater than 0.05, hence both groups are normally distributed. The group middle has a p-value of 0.001, which is less than 0.05, and is considered to be non-normally distributed.
Figure 1 shows the boxplots of reaction time for the groups young, middle and old. The plots show no extreme value in the groups young and old, which is consistent with both groups being normally distributed. An extreme value is observed in the group middle, indicating that its data distribution is non-normal. In Table 6, the test statistic of the AG test has a value of 5.3237, with a p-value of 0.06982 at the α = 0.05 level of significance. The AG test is therefore not significant, since its p-value is greater than 0.05. The test statistic of the AGWMOM test, at 30.1280, is nearly six times (about 5.7 times) that of the original AG test, with a p-value of 0.0000002869 at the 0.05 level of significance. The AGWMOM test is therefore significant, since its p-value is less than 0.05, unlike the AG test. The standard errors used in the AGWMOM test for the groups young, middle and old, computed from the Winsorized real-life data, are far smaller than the standard errors of the AG test computed from the original real-life data. Therefore, the AGWMOM test has helped to reduce the error in the real-life data as much as possible compared to the AG test.
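The reported p-values can be checked directly from the reported test statistics. With three groups the reference chi-square distribution has J - 1 = 2 degrees of freedom, and for 2 degrees of freedom the upper-tail probability has the closed form exp(-A/2). The short Python check below uses only the two statistics quoted in the text; the function name is hypothetical.

```python
import math

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

ag_stat, agwmom_stat = 5.3237, 30.1280  # values reported in Table 6

p_ag = chi2_sf_df2(ag_stat)
p_agwmom = chi2_sf_df2(agwmom_stat)

print(f"AG:     p = {p_ag:.5f}")
print(f"AGWMOM: p = {p_agwmom:.4e}")
print(f"ratio of statistics = {agwmom_stat / ag_stat:.2f}")
```

Both computed p-values agree with the reported ones to the precision given, and the ratio of the two statistics is about 5.7.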
Note: The weights $w_j$ of the $J$ groups of the ordered sample data are defined such that $\sum_{j=1}^{J} w_j = 1$. In the Winsorized sample variance, $S_{WMOM_j}^2$ is the sample variance of the Winsorized data distribution of group $j$, computed from the observations about the Winsorized MOM estimator $\bar{X}_{WMOM_j}$, and $n_j$ is the sample size of the ordered data set.

Figure 1. Boxplots of reaction time versus group young, middle and old

Evaluate the Capacity of the Test Using Real Data

Table 1. Ordered data set with observations

Table 2. The Winsorized data distribution from the real-life data

Table 3. Descriptive statistics for the bootstrap sample (n = 50), for the AG test

The values of the skewness for the groups young, middle and old are all greater than zero; hence, the distribution of the data is right skewed. The kurtosis values for the groups young and old are less than three, which indicates that the data are mostly distributed around the mean. The kurtosis value for the group middle is greater than three, which shows that the distribution is peaked about the mean with thicker tails. As a result, there is a high probability of extreme values.
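The interpretation of skewness and kurtosis can be made concrete with a short Python sketch based on the plain moment definitions, in which a normal distribution has kurtosis three. This is an assumption about the convention: statistical packages often report bias-corrected skewness and excess kurtosis, so the values in the tables need not match this formula exactly. The function name and sample are hypothetical.

```python
def skewness_kurtosis(values):
    """Moment-based skewness and (non-excess) kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# Hypothetical sample with one extreme value in the upper tail.
sample = [2, 3, 4, 5, 6, 7, 8, 9, 10, 50]
skew, kurt = skewness_kurtosis(sample)
print(f"skewness = {skew:.3f}, kurtosis = {kurt:.3f}")
```

A single large value in the upper tail is enough to make the skewness positive and push the kurtosis above three, which mirrors the pattern described for the group middle.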

Table 4. Descriptive statistics for the bootstrap sample (n = 50), for the AGWMOM test

Table 5. Tests of normality for the real-life data

Table 6. The test statistic for the AG test and the AGWMOM test