Size and Power Properties of Some Test Statistics for Testing the Population Correlation Coefficient

Correlation measures the strength of association between two variables and plays an important role in various fields, such as health science, economics, finance, engineering and environmental science, among others. Several tests for the population correlation coefficient have been proposed in the literature by various researchers at different times. This paper evaluates the performance of some of the prominent test statistics for testing the population correlation coefficient based on the empirical size and power of the tests. Several bivariate distributions, namely normal, lognormal, gamma and chi-square, are considered to compare the performance of the test statistics. We believe that the findings of this paper will help practitioners select good test statistics for assessing the relationship between two variables.
Citation: Banik S, Golam Kibria BM (2017) Size and Power Properties of Some Test Statistics for Testing the Population Correlation Coefficient ρ. J Biom Biostat 8: 353. doi: 10.4172/2155-6180.1000353


Introduction
One of the most useful statistical tools for quantifying the relationship between two continuous variables is the coefficient of correlation, which was developed by Pearson [1] from a related idea introduced by Galton [2]. In statistics, Pearson's correlation coefficient is used to measure the linear relationship between two quantitative variables, say X and Y. It takes a value between -1 and +1 inclusive, where -1 indicates a perfect negative correlation, 0 indicates no correlation and +1 indicates a perfect positive correlation between X and Y. Since the population correlation coefficient ρ is usually unknown, it must be estimated by an estimator r computed from the observed sample. Although the sample correlation coefficient r is a biased estimator of ρ, the bias vanishes as the sample size increases. Whenever a parameter is estimated, the accuracy of the estimate, and hence its validity through hypothesis testing, is essential. Several researchers have considered confidence intervals for estimating ρ [3]. However, comparisons of test statistics for testing the population correlation coefficient are limited in the literature. In this paper, we consider several test statistics for testing the population correlation coefficient. Since a theoretical comparison among the test procedures is not possible, a simulation study is conducted to compare the performance of the test statistics based on the empirical size and power of the tests. We believe that the findings of this study will help practitioners choose appropriate test statistics for testing the population correlation coefficient.
The paper is organized as follows: One proposed and some existing methods for testing the population correlation coefficient are described in section 2. A Monte Carlo simulation study along with results is discussed in section 3. Finally, some concluding remarks are given in section 4.

Methods for Testing the Population Correlation Coefficient
Suppose we are interested in the linear relationship between two variables X and Y. The population correlation coefficient between X and Y is denoted by ρ and is defined by
$$\rho = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\sigma_X^2 \, \sigma_Y^2}}.$$
The corresponding sample correlation coefficient is defined by
$$r = \frac{\hat{\sigma}_{xy}}{\sqrt{\hat{\sigma}_x^2 \, \hat{\sigma}_y^2}},$$
where $\hat{\sigma}_{xy}$ is the sample covariance and $\hat{\sigma}_x^2$, $\hat{\sigma}_y^2$ are the sample variances. It can be shown that −1 ≤ ρ ≤ 1. A value of 1 implies that a linear equation describes the relationship between X and Y perfectly, with all data points lying on a line for which Y increases as X increases. A value of −1 implies that all data points lie on a line for which Y decreases as X increases. A value of 0 implies that there is no linear association between X and Y. Several methods for testing the population correlation coefficient, H0: ρ = 0 vs. H1: ρ ≠ 0, are given as follows.
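As a small illustration of the sample correlation coefficient defined above, the following Python sketch computes r from two equal-length samples. The function name `pearson_r` is ours and purely illustrative. The common divisor in the sample covariance and sample variances cancels, so only sums of centred products are needed.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx = sum(x) / n
    my = sum(y) / n
    # Sums of centred cross-products and squares; the 1/(n-1) factors cancel.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

For example, `pearson_r([1, 2, 3, 4], [1, 3, 2, 4])` gives 0.8, while perfectly increasing or decreasing linear data give +1 or -1 up to rounding.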

The classical test statistic
The sample correlation coefficient r is a point estimator of ρ. The distribution of r when ρ is zero was first studied by Student [4]. A common test is that of whether or not a linear relationship exists between two variables X and Y. The test statistic is defined as follows:
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}, \qquad (1)$$
where n is the sample size and (n-2) is the degrees of freedom (df). The critical value for this test statistic is obtained from the t-distribution with (n-2) degrees of freedom.
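A minimal sketch of the classical test in Python, assuming the usual form t = r·sqrt(n−2)/sqrt(1−r²). The function names are illustrative. The critical value is passed in by the caller (for example, from a t table with n−2 df), because the Python standard library has no t-distribution quantile function.

```python
import math

def t_statistic(r, n):
    """Classical t statistic for H0: rho = 0, given sample correlation r and size n."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def classical_test(r, n, t_crit):
    """Two-sided test: reject H0 when |t| exceeds the supplied critical value.

    t_crit is assumed to be the upper alpha/2 point of the t-distribution
    with (n - 2) degrees of freedom, looked up by the caller.
    """
    return abs(t_statistic(r, n)) > t_crit
```

With r = 0.6 and n = 10 the statistic is about 2.12, which does not exceed the tabulated two-sided 5% critical value of 2.306 for 8 df, so H0 is not rejected.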

Fisher's large sample test statistic
Since the sampling distribution of Pearson's r is not normal, Pearson's r is converted to Fisher's z, and the test statistic for testing H0: ρ = 0 vs. H1: ρ ≠ 0 is computed using Fisher's [5] transformation and is given as follows:
$$Z = (z - z_0)\sqrt{n-3},$$
where
$$z = \frac{1}{2}\ln\left(\frac{1+r}{1-r}\right)$$
and $z_0$ is the value of z under the null hypothesis. The distribution of Z is approximately standard normal.
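The Fisher test can be sketched as follows, assuming the standard large-sample form Z = (z − z0)·sqrt(n−3) with z = ½ ln((1+r)/(1−r)). The function names and the `rho0` parameter are illustrative. The normal quantile comes from `statistics.NormalDist` in the Python standard library.

```python
import math
from statistics import NormalDist

def fisher_z(r):
    """Fisher's z transformation, 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

def fisher_test(r, n, rho0=0.0, alpha=0.05):
    """Return (Z, reject) for the two-sided test of H0: rho = rho0."""
    z_stat = (fisher_z(r) - fisher_z(rho0)) * math.sqrt(n - 3)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    return z_stat, abs(z_stat) > z_crit
```

For r = 0.6 the transformed value is ln 2 ≈ 0.693, so with n = 10 the statistic is about 1.83 (not significant at the 5% level), whereas with n = 30 it is about 3.60 (significant).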

Gorsuch and Lehmann test statistics
To improve the performance of the classical statistic, Gorsuch and Lehmann [6] modified the classical statistic and the Fisher statistic and proposed four statistics (GL1 to GL4) for testing H0: ρ = 0 vs. H1: ρ ≠ 0, based on different standard errors of r. Modified classical statistics: the critical value of GL1 is assumed to be 2 (for details, see Gorsuch and Lehmann [6]), and GL2 follows a t-distribution with (n-1) df.

Modified Fisher statistics:
The critical value of GL3 is assumed to be 2 (for details, see Gorsuch and Lehmann [6]), and GL4 has a t-distribution with (n-1) degrees of freedom.

Proposed test statistic
We know that The distribution of has a t-distribution with (n-2) df, where the i-th random samples are denoted by x^(i) and y^(i) for i = 1, 2, …, B and B is the number of bootstrap samples [7]. The test statistic for testing H0: ρ = 0 vs. H1: ρ ≠ 0 is given by

Parametric bootstrap test statistic
The critical value of this statistic is $t^{*}_{(n-2),\,\alpha/2}$, which is the (α/2)-th sample quantile of the bootstrap distribution of the statistic.
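The general idea of a bootstrap critical value can be sketched in Python as below. This is only an illustration under stated assumptions, not the authors' exact procedure: we assume that bootstrap samples are drawn under H0 from independent normal distributions fitted to x and y, and that the lower and upper critical values are the (α/2) and (1 − α/2) sample quantiles of the B bootstrap t statistics. All function names are hypothetical.

```python
import math
import random
import statistics

def pearson_r(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_stat(r, n):
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def bootstrap_critical_values(x, y, B=1500, alpha=0.05, seed=0):
    """Lower and upper bootstrap critical values for the t statistic.

    Assumption: samples are simulated under H0 (rho = 0) from independent
    normals with the sample means and standard deviations of x and y.
    """
    rng = random.Random(seed)
    n = len(x)
    mx, sx = statistics.fmean(x), statistics.stdev(x)
    my, sy = statistics.fmean(y), statistics.stdev(y)
    boot = []
    for _ in range(B):
        xb = [rng.gauss(mx, sx) for _ in range(n)]
        yb = [rng.gauss(my, sy) for _ in range(n)]
        boot.append(t_stat(pearson_r(xb, yb), n))
    boot.sort()
    # Empirical (alpha/2) and (1 - alpha/2) quantiles of the bootstrap statistics.
    lower = boot[int((alpha / 2) * (B - 1))]
    upper = boot[int(math.ceil((1 - alpha / 2) * (B - 1)))]
    return lower, upper

def bootstrap_t_test(x, y, B=1500, alpha=0.05, seed=0):
    """Return (t_obs, reject) using the bootstrap critical values."""
    t_obs = t_stat(pearson_r(x, y), len(x))
    lower, upper = bootstrap_critical_values(x, y, B, alpha, seed)
    return t_obs, (t_obs < lower or t_obs > upper)
```

The default B = 1500 matches the number of bootstrap samples reported in the simulation study. For normal data the bootstrap critical values should be close to the tabulated t critical values with (n-2) df.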

Parametric bootstrap Fisher z test statistic
The test statistic for testing H0: ρ = 0 vs. H1: ρ ≠ 0 is computed using Fisher's z transformation [8]. The critical value of this statistic is $z^{*}_{\alpha/2}$, which is the (α/2)-th sample quantile of the bootstrap distribution of the statistic.

Parametric bootstrap version of proposed test statistic
The test statistic for testing H0: ρ = 0 vs. H1: ρ ≠ 0 is the bootstrap version of the proposed statistic. The critical value of this statistic is $t^{*}_{(n-2),\,\alpha/2}$, which is the (α/2)-th sample quantile of the bootstrap distribution of the statistic.

Bootstrap bias corrected acceleration test statistic
This method was introduced by Efron and Tibshirani [9]. The test statistic for testing H0: ρ = 0 vs. H1: ρ ≠ 0 is the t-statistic defined in eqn. (1), and the critical value is calculated from the following quantities: Φ is the standard normal cumulative distribution function; Φ^{-1}, used in the bias correction, is the inverse of the cumulative distribution function of the Z distribution; and the acceleration factor is
$$\hat{a} = \frac{\sum_{i=1}^{n}(r - r_i)^3}{6\left[\sum_{i=1}^{n}(r - r_i)^2\right]^{1.5}},$$
where r is the correlation between x and y and r_i is the correlation between x and y computed from the (n-1) observations without the i-th observation.
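The jackknife acceleration factor described above, together with the adjusted percentile level of the standard BCa method, can be sketched in Python as follows. The acceleration follows the description in the text, with leave-one-out correlations r_i compared with the full-sample r. The adjusted-level formula Φ(z0 + (z0 + z_α)/(1 − a(z0 + z_α))) is the usual textbook form from Efron and Tibshirani and is assumed here, since the exact expression used in the paper is not reproduced. Function names are illustrative.

```python
import math
from statistics import NormalDist

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def jackknife_correlations(x, y):
    """Correlation of the (n - 1) observations left after deleting each i in turn."""
    return [pearson_r(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(len(x))]

def acceleration(r_full, r_jack):
    """Acceleration factor: sum (r - r_i)^3 / (6 * [sum (r - r_i)^2]^1.5)."""
    num = sum((r_full - ri) ** 3 for ri in r_jack)
    den = 6.0 * sum((r_full - ri) ** 2 for ri in r_jack) ** 1.5
    return num / den

def bca_level(alpha, z0, a):
    """BCa-adjusted percentile level (standard form, assumed here).

    z0 is the bias correction and a is the acceleration factor.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(alpha)
    return nd.cdf(z0 + (z0 + z_alpha) / (1.0 - a * (z0 + z_alpha)))
```

When both the bias correction and the acceleration are zero, `bca_level(alpha, 0, 0)` returns `alpha` itself, so the method reduces to the ordinary percentile bootstrap.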

Simulation Study
The main goal of this paper is to evaluate, in terms of size and power, the performance of the test statistics for the population correlation coefficient discussed in section 2. Since a theoretical comparison among the tests is not possible, a simulation study has been conducted in this section. MATLAB (2015) was used to run the simulations and to produce the tables. The most common level of significance, α = 0.05, is considered, with random sample sizes n = 10, 30, 50, 80 and 100 and alternative values ρ1 = -0.5, -0.9, 0.3, 0.8 and 0.99. We used 2500 replications for the simulation experiments and 1500 bootstrap samples for each sample size. Random samples were generated from the following population distributions: bivariate normal, bivariate lognormal, bivariate gamma and bivariate chi-square.
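The logic of such a size and power experiment can be sketched in Python as follows. This is not the authors' MATLAB program: it covers only the bivariate normal case and only the Fisher z test, and all names are illustrative. The empirical size is the rejection rate when the data are generated with ρ = 0, and the empirical power is the rejection rate when they are generated with ρ ≠ 0.

```python
import math
import random
from statistics import NormalDist

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_rejects(x, y, alpha=0.05):
    """Two-sided Fisher z test of H0: rho = 0."""
    z_stat = math.atanh(pearson_r(x, y)) * math.sqrt(len(x) - 3)
    return abs(z_stat) > NormalDist().inv_cdf(1.0 - alpha / 2.0)

def bivariate_normal_sample(n, rho, rng):
    """n pairs from a standard bivariate normal with correlation rho."""
    x, y = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x.append(z1)
        y.append(rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x, y

def rejection_rate(n, rho, reps=2500, alpha=0.05, seed=1):
    """Proportion of replications in which H0: rho = 0 is rejected.

    With rho = 0 this estimates the size, otherwise the power.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x, y = bivariate_normal_sample(n, rho, rng)
        rejections += fisher_rejects(x, y, alpha)
    return rejections / reps
```

Calling `rejection_rate(30, 0.0)` estimates the size for n = 30, and `rejection_rate(30, 0.8)` estimates the power against ρ = 0.8. The other tests and distributions would be handled by swapping in a different test function or sample generator.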

Results and Discussion
We can see from Figure 1 and Table 1 that, for all sample sizes, all the considered test statistics except GL1 and SKBoot have empirical sizes close to the 5% nominal level.
The estimated sizes when data are generated from the bivariate lognormal distribution are presented in Table 2 and depicted for visual inspection in Figure 2. From Table 2 and Figure 2, we observe that for moderate to large sample sizes, tBoot, FBoot, SKBoot and BCABoot have sizes close to the nominal level, while the rest of the tests achieve the nominal level only when sample sizes are large.
In Tables 3 and 4, we report the estimated sizes when data are generated from the bivariate gamma and bivariate chi-square distributions, respectively. We find that all tests have correct sizes except GL1 and SKBoot: the GL1 test has sizes smaller than the nominal level, and SKBoot has sizes higher than the nominal level (Figures 3 and 4).
In Table 5, we present the estimated powers when data are generated from the bivariate normal distribution for various sample sizes and various values of ρ. We observed that for the small sample size n = 10 (Figure 5) [...] powers as compared to the other test statistics. For sample sizes of 50 or above (Figure 6 for n = 50), we found that all test statistics have good power except for ρ = 0.3. We noted that for weak positive correlation, SKBoot has the highest power compared to the rest of the test statistics.
In Figure 7, we present the estimated powers when data are generated from the bivariate lognormal distribution for n = 10. We observed that for strong negative correlation, the t, SK, tBoot and BCABoot test statistics have very poor power compared to the other test statistics. For positive correlation, we found that all tests have good power except tBoot and BCABoot, which have very low power compared to the other test statistics. We have also tabulated the estimated power of the various test statistics when data are generated from the bivariate gamma and bivariate chi-square distributions (see Figure 9 (n = 10) and Figure 10 (n = 30) for better understanding). The power properties of the selected tests are similar to those observed when data are generated from the bivariate normal or lognormal distribution.
In Figures 11 and 12, we plot the estimated powers for various values of n for ρ = -0.5 and ρ = 0.8 to check the effect of n on the selected tests. These graphs show that as n increases, power also increases for all selected tests. We noted that for small sample sizes, the Fisher, GL2, FBoot and SKBoot tests are more powerful than the other considered tests. It is also noted that our proposed bootstrap version, SKBoot, is more powerful than the other considered tests.
In Figures 13 and 14, we plot the estimated powers for selected values of n and two selected values of ρ. Here we observed the same pattern as in Figures 11 and 12: as the sample size increases, the estimated power also increases. Compared with Figure 11, we noted very low powers for n = 10 when data are generated from the bivariate lognormal distribution. We noted that the tBoot and BCABoot tests have very low power compared to the other tests.
Figures 15-18 present the estimated powers for various values of n and two selected values of ρ when data are generated from the bivariate gamma and bivariate chi-square distributions, respectively. The interpretation of these figures is similar to that for data generated from the bivariate normal distribution.

Conclusion
In this paper, we study the performance of several methods for testing the population correlation coefficient by means of a simulation study. Data were generated randomly from several bivariate distributions, namely bivariate normal, bivariate lognormal, bivariate gamma and bivariate chi-square, with a range of sample sizes. Overall, we found that the test statistics t, Fisher, GL2, GL3, GL4, SK and FBoot have sizes close to the 5% nominal level. Fisher, GL3, GL4, FBoot and SKBoot have good power compared to the other test statistics. It appears from the simulation study that the test statistics Fisher, GL3, GL4, FBoot and SKBoot can be recommended for practitioners because these test statistics have good sizes and powers compared to the rest of the selected test statistics.