The Correlation between Variate-Values and Ranks in Samples from Complete Fourth Power Exponential Distribution

In this paper, we derive the correlation between variate-values and ranks in a sample from the Complete Fourth Power Exponential (CFPE) distribution. A sample from the CFPE distribution could be misclassified as drawn from the normal distribution because of similarities between the two distributions. In practice, ranks are used instead of the real values (variate-values) when there is hardly any knowledge about the underlying distribution, which may lead to the loss of some of the information contained in the actual values. We find that the amount of information lost by using ranks instead of real data is larger when the sample is from the CFPE distribution than when it is from the normal distribution. However, there is still a relatively high correlation between variate-values and the corresponding ranks. We also compare the correlation between variate-values and ranks for the CFPE distribution with that for some other distributions.


Introduction
Statistical methods based on ranks have been heavily studied in the literature, specifically for situations in which there is hardly any knowledge of the underlying distribution of the data at hand. Such methods fall under the umbrella of nonparametric techniques. Ignoring the underlying distribution may lead to the loss of some of the information contained in the data. Nevertheless, when little is known about the underlying distribution, nonparametric techniques, including the methods based on ranks, can be useful and lead to robust inferences. For more details on these methods, we refer the reader to Lehmann and D'Abrera (1976).
It can be difficult to discriminate between the Complete Fourth Power Exponential (CFPE) distribution and the normal distribution because of similarities between them, e.g. in their shapes and other properties. Practitioners, especially non-statisticians, may mistakenly assume that their dataset comes from a normal distribution when in fact it comes from the CFPE distribution. Amira and Mazloum (1993) studied the CFPE distribution in detail, focusing on its geometric and statistical properties and comparing it to the normal distribution. The misclassification of a dataset from the CFPE distribution as normal was studied in an unpublished MSc thesis entitled "Study on the Error in Random Samples Classification between the Normal Distribution and the Complete Fourth Power Exponential" by Baeshen (2000), King Abdulaziz University, Jeddah, Saudi Arabia.
In practice, the ranks of the data are often used instead of the original data to make inferences. To quantify how much information may be lost by doing so, Stuart (1954) derived a formula for the correlation between variate-values and their ranks. He showed that for some distributions little information is lost when the original data are replaced by their corresponding ranks. Moreover, Stuart (1955) considered the situation in which the variance of the distribution does not exist and showed that, for a continuous distribution with no moments which is eventually monotone, the correlation between variate-values and their ranks is zero.
Later, O'Brien (1982) estimated via simulation the average correlation between variate-values and their ranks for small samples from different distributions. The term "the degree of distortion or error" is used to indicate the loss of information caused by replacing the variate-values by their ranks. O'Brien found that these correlations are generally high and that, as the sample size increases, they approach the limiting values presented by Stuart (1954, 1955). This finding supports the idea of using ranks instead of variate-values with a small degree of distortion or error, even for small samples.
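The kind of simulation described above can be sketched as follows. This is only an illustration, not O'Brien's actual procedure: the function names, the number of replications and the use of the standard normal distribution are all assumptions made here. It draws many small samples, computes the Pearson correlation between the values and their ranks within each sample, and averages the results.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(xs):
    """Ranks 1..n of the values (ties are not expected for continuous data)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def average_rank_correlation(sampler, n, n_reps=2000, seed=1):
    """Average within-sample correlation between values and ranks.

    sampler(rng) must return one draw from the distribution of interest;
    n_reps and seed are illustrative choices, not values from the paper.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        xs = [sampler(rng) for _ in range(n)]
        total += pearson(xs, ranks(xs))
    return total / n_reps
```

For example, `average_rank_correlation(lambda rng: rng.gauss(0.0, 1.0), n=10)` gives the average correlation for standard normal samples of size 10; repeating the call for increasing `n` shows how the average behaves as the sample size grows.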
In Section 2, we give an overview of the CFPE distribution and some of its properties. In Section 3, we use Stuart's formula to derive the correlation between variate-values and ranks in samples from the CFPE distribution. Finally, in Section 4, we provide the exact and approximate correlations between variate-values and ranks for some other distributions.

The Complete Fourth Power Exponential (CFPE) Distribution
In this section we briefly review the CFPE distribution and those of its summary measures that are needed in this paper. For more details we refer to Amira and Mazloum (1993).
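Let X be a random variable from the CFPE distribution with the density function given by

$$f(x) = \frac{2}{\beta\,\Gamma(1/4)} \exp\left\{-\left(\frac{x-\alpha}{\beta}\right)^{4}\right\}, \qquad -\infty < x < \infty, \qquad (1)$$

where α is the location parameter and β is the scale parameter. The characteristic function of this distribution is expressed in terms of [m/2], the greatest integer less than m/2; see Amira and Mazloum (1993). From this characteristic function, we can obtain the non-central and central moments as follows:

$$\mu'_1 = \alpha, \qquad \mu_2 = \frac{\Gamma(3/4)}{\Gamma(1/4)}\,\beta^{2}, \qquad \mu_3 = 0 \qquad \text{and} \qquad \mu_4 = \frac{1}{4}\,\beta^{4}.$$

For simplicity of notation, we work with the special case of the CFPE distribution with α = 0, since the data can easily be centred around the mean. Therefore

$$\mu = 0 \qquad \text{and} \qquad \sigma^{2} = \mu_2 = \frac{\Gamma(3/4)}{\Gamma(1/4)}\,\beta^{2}.$$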
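The moments quoted above can be checked numerically. The following is a minimal sketch which assumes that the CFPE density has the form f(x) = 2/(β Γ(1/4)) exp{-((x - α)/β)^4}; this form is an assumption here, chosen because it reproduces the fourth central moment β^4/4 stated in the text. The names `cfpe_pdf` and `central_moment` are hypothetical helpers, and the midpoint rule is just one convenient way to integrate.

```python
import math


def cfpe_pdf(x, alpha=0.0, beta=1.0):
    """Assumed CFPE density: 2 / (beta * Gamma(1/4)) * exp(-((x - alpha) / beta)^4)."""
    z = (x - alpha) / beta
    return 2.0 / (beta * math.gamma(0.25)) * math.exp(-(z ** 4))


def central_moment(k, alpha=0.0, beta=1.0, half_width=6.0, steps=20000):
    """Midpoint-rule approximation of E[(X - alpha)^k].

    The integration range alpha +/- half_width * beta is an illustrative
    choice; the density is negligible outside it.
    """
    a = alpha - half_width * beta
    h = 2.0 * half_width * beta / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += (x - alpha) ** k * cfpe_pdf(x, alpha, beta)
    return total * h
```

Under this assumption, `central_moment(0)` should be close to 1, `central_moment(3)` close to 0, and `central_moment(4, beta=2.0)` close to 2^4/4 = 4, in agreement with the moments given above.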

The correlation between variate-values and ranks in samples from the CFPE Distribution
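In this section, we use Stuart's (1954) formula to derive the correlation between variate-values and ranks in a sample from the CFPE distribution. From Stuart (1954), the correlation between variate-values X_i and their ranks R_i, ρ(X_i, R_i), is given by

$$\rho(X_i, R_i) = \left[ E[XF(X)] - \frac{\mu}{2} \right] \cdot \frac{1}{\sigma}\left[ \frac{12\,(n-1)}{n+1} \right]^{1/2}, \qquad E[XF(X)] = \int_{-\infty}^{\infty} x\,F(x)\,dF(x), \qquad (2)$$

where F is the distribution function, μ and σ are the mean and standard deviation of the distribution, and n is the sample size. The second factor on the right side of (2) tends to 2√3/σ as n tends to infinity. In the following theorem, we use (2) to derive the correlation between variate-values and ranks in a sample from the CFPE distribution.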
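As an illustration of how Stuart's formula is applied, the sketch below evaluates it for three distributions whose ingredients are available in closed form. It assumes the formula has the form [E[XF(X)] - μ/2] · [12(n - 1)/(n + 1)]^{1/2} / σ; the function names and the dictionary `CASES` are hypothetical, and the closed-form inputs are standard results rather than values taken from the tables of this paper.

```python
import math


def stuart_correlation(e_xf, mu, sigma, n):
    """Correlation between variate-values and ranks for sample size n.

    e_xf  : E[X F(X)], the integral of x F(x) dF(x)
    mu    : mean of the distribution
    sigma : standard deviation of the distribution
    """
    return (e_xf - mu / 2.0) * math.sqrt(12.0 * (n - 1) / (n + 1)) / sigma


def limiting_correlation(e_xf, mu, sigma):
    """Limit as n tends to infinity, where the second factor is 2*sqrt(3)/sigma."""
    return (e_xf - mu / 2.0) * 2.0 * math.sqrt(3.0) / sigma


# (E[X F(X)], mu, sigma) for three standard cases.
CASES = {
    "Uniform(0, 1)": (1.0 / 3.0, 0.5, 1.0 / math.sqrt(12.0)),
    "Exponential(1)": (0.75, 1.0, 1.0),
    "Normal(0, 1)": (1.0 / (2.0 * math.sqrt(math.pi)), 0.0, 1.0),
}

if __name__ == "__main__":
    for name, (e_xf, mu, sigma) in CASES.items():
        print(name, round(limiting_correlation(e_xf, mu, sigma), 3))
```

The limiting values are 1 for the uniform, √3/2 for the exponential and √(3/π) for the normal distribution; for finite n each is multiplied by [(n - 1)/(n + 1)]^{1/2}.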

Theorem:
The correlation between variate-values and ranks in a sample from the CFPE distribution when α = 0 is given by

$$\rho(X_i, R_i) \approx 0.728\,\sqrt{\frac{n-1}{n+1}}. \qquad (3)$$

Proof: For ease of notation, we consider the centered CFPE distribution, i.e. α = 0, so that μ = 0. The expectation E[XF(X)] is then evaluated using Γ(1/2) = √π, and substituting it into (2) gives (3).

Stuart (1954) provided, using the formula in (2), the correlation between variate-values and ranks in samples from the Uniform (0, 1), Exponential (λ = 1), and Normal (0, 1) distributions. From (3), the correlation between variate-values and ranks in a sample from the CFPE (0, β) distribution is equal to 0.728 when n is sufficiently large; see Table 1.
Moreover, by using simulation to estimate the quantity E[XF(X)] in (2), we obtained approximate results for the correlation between variate-values and ranks in samples from other distributions for which no explicit formula is available. These approximate results are given in Table 2. Among these distributions, the correlation for the Log-normal (0, 1) distribution is the closest to that of the CFPE distribution, although it is lower.
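A simulation estimate of this kind can be sketched as follows. This is not the authors' code: the function name, the number of draws and the seed are assumptions, and the sketch assumes that the distribution function F can be evaluated directly. It estimates E[XF(X)], μ and σ from simulated draws and plugs them into the limiting form of (2), [E[XF(X)] - μ/2] · 2√3/σ. The normal and exponential distributions are used only because their limiting values are known, which makes the estimate easy to check.

```python
import math
import random
from statistics import NormalDist


def mc_limiting_correlation(sampler, cdf, n_draws=200_000, seed=0):
    """Monte Carlo estimate of the limiting correlation (n -> infinity).

    sampler(rng) returns one draw; cdf(x) is the distribution function F.
    n_draws and seed are illustrative choices, not the paper's settings.
    """
    rng = random.Random(seed)
    xs = [sampler(rng) for _ in range(n_draws)]
    mean = sum(xs) / n_draws
    var = sum((x - mean) ** 2 for x in xs) / (n_draws - 1)
    e_xf = sum(x * cdf(x) for x in xs) / n_draws  # estimate of E[X F(X)]
    return (e_xf - mean / 2.0) * 2.0 * math.sqrt(3.0) / math.sqrt(var)


if __name__ == "__main__":
    normal = mc_limiting_correlation(lambda rng: rng.gauss(0.0, 1.0), NormalDist().cdf)
    expo = mc_limiting_correlation(
        lambda rng: rng.expovariate(1.0), lambda x: 1.0 - math.exp(-x)
    )
    print("Normal(0, 1):  ", round(normal, 3))
    print("Exponential(1):", round(expo, 3))
```

When F has no convenient closed form, one option is to replace `cdf(x)` by the empirical distribution function of the simulated draws, at the cost of some additional simulation error.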

Conclusion
In this paper, we derived the correlation between variate-values and ranks in a sample from the Complete Fourth Power Exponential (CFPE) distribution. We found that, when ranks are used instead of variate-values, more information is lost when the data come from the CFPE distribution than when they come from the normal distribution. This finding emphasizes the importance of distinguishing between these distributions. However, the correlation between variate-values and ranks in samples from the CFPE distribution is still relatively high, which allows ranks to be used instead of variate-values without losing a lot of information.

Table notes: The exact results for the Uniform, Exponential and Normal distributions are reported by Stuart (1954), while the logistic result is presented in O'Brien (1982). The approximate results were obtained from 100 000 simulated samples of size 100 (n = 100).