Improved Linear Combination of Two Estimators for a Function of the Parameter of Interest

In this paper, we consider the problem of improving the efficiency of a linear combination of two estimators when the population coefficient of variation is known. We generalize the discussion from the case of a parameter to the case of a function of the parameter of interest. We show that the two estimators obtained from an improved linear combination of two estimators and from a linear combination of two improved estimators are equivalent in terms of efficiency. We also show how a doubly-improved linear combination of two estimators can be constructed when the population coefficient of variation is known.


Introduction
In many practical inferential studies, some prior information such as the coefficient of variation (CV), kurtosis, or skewness of the population is available in advance. Using prior information to improve the efficiency of a given estimator has been considered repeatedly in the literature. Searls (1964) and Arnholt & Hebert (2001) proposed improved estimators for the population mean given the population CV. Wencheko & Wijekoon (2007) improved their results and obtained a shrunken estimator for the mean of one-parameter exponential families. Also, given the population CV, Khan (1968) constructed a convex combination of two uncorrelated and unbiased estimators of the population mean with minimum mean square error (MSE). Improved estimators for the population variance that utilize the population kurtosis have been discussed by many authors, notably Searls (1964), Kleffe (1985), Searls & Intarapanich (1990), Kanefuji & Iwase (1998), Wencheko & Chipoyera (2005) and Subhash & Cem (2013). In this regard, Laheetharan & Wijekoon (2010) proposed an improved estimator for the population variance and compared it with other estimators based on the scaled squared error loss function.

The problem of finding improved estimators given additional information has also been considered for situations in which the dimension of the sufficient statistic is greater than the dimension of the parameter of interest. Gleser & Healy (1976) considered the problem of minimizing the MSE of a non-convex combination of two uncorrelated and unbiased estimators given a known population coefficient of variation. Samuel-Cahn (1994) expanded their solution to the more general case of two correlated and unbiased estimators. Also, Arnholt & Hebert (1995) discussed non-convex combinations of two correlated and biased estimators of an unknown parameter when the CVs of both estimators are known. It should be noted that the process of finding an improved estimator usually leads to a biased estimator; therefore, the MSE criterion plays a main role in all results due to its emphasis on both the variance and the bias of estimators. Some important results related to improving biased estimators are given by Bibby (1972), Bibby & Toutenburg (1977) and Bibby & Toutenburg (1978). The following theorems provide some of the most important results related to the problem of finding improved estimators in the presence of prior information.
Theorem 1 (Arnholt & Hebert, 2001 and Laheetharan & Wijekoon, 2011). Let X = (X_1, ..., X_n) be a random sample from a population with distribution f(x|θ), and let T_1(X) and T_2(X) be estimators of θ, possibly correlated, with E(T_i(X)) = k_i θ, CVs ν_i, and correlation coefficient ρ, all known and free from θ. Then the estimator T*(X) = α*_1 T_1(X) + α*_2 T_2(X) uniformly has the minimum MSE among all estimators that are linear in T_1(X) and T_2(X), where

α*_1 = ν_2(ν_2 − ρν_1)/(k_1 Δ),  α*_2 = ν_1(ν_1 − ρν_2)/(k_2 Δ),  Δ = ν_1²ν_2²(1 − ρ²) + ν_1² + ν_2² − 2ρν_1ν_2.
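As a quick numerical companion to Theorem 1, the following sketch evaluates the reconstructed weight formulas above; the function name and the test values are ours, not the paper's.

```python
# Minimal sketch of the optimal weights in Theorem 1 (reconstructed formulas):
# E(Ti) = ki * theta, CV(Ti) = vi, Corr(T1, T2) = rho.
def optimal_weights(k1, k2, v1, v2, rho=0.0):
    delta = v1**2 * v2**2 * (1 - rho**2) + v1**2 + v2**2 - 2 * rho * v1 * v2
    a1 = v2 * (v2 - rho * v1) / (k1 * delta)
    a2 = v1 * (v1 - rho * v2) / (k2 * delta)
    return a1, a2

# Two unbiased (k1 = k2 = 1), uncorrelated estimators with CVs 0.2 and 0.4:
# the less variable estimator receives the larger weight.
print(optimal_weights(1.0, 1.0, 0.2, 0.4))
```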
Theorem 2 (Laheetharan & Wijekoon, 2010). Let X = (X_1, ..., X_n) be a random sample from a population with distribution f(x|θ), and let g(θ) be a real-valued function on the parameter space Θ. Let T_1(X) and T_2(X) be point estimators of g(θ) with E(T_i(X)) = k_i g(θ), where k_i ∈ ℝ, and with CVs ν_i known and free from θ. Then the estimators T*_i(X) = α*_i T_i(X), with α*_i = 1/(k_i(1 + ν_i²)), uniformly have the minimum MSE among all estimators of the form α T_i(X), i = 1, 2.

In this paper, we consider the problem of improving the efficiency of a linear combination of two estimators when the population CV is known. The rest of the paper is organized as follows: in Section 2, we briefly review the main results related to the improved linear combination of estimators. In Section 3, we generalize the discussion from the case of a parameter to a function of the parameter of interest by extending the results of Gleser & Healy (1976) and Arnholt & Hebert (2001). In Section 4, we show that the two estimators obtained from an improved linear combination of two estimators and from a linear combination of two improved estimators are the same in terms of efficiency. In Section 5, we show how a doubly-improved linear combination of two estimators can be constructed when the population CV is known. In Section 6, we provide some illustrative examples.

Improved Linear Combination of Two Estimators for a Parameter
In this section, we briefly review the main results related to an improved linear combination of estimators when some additional information is available.
Using some prior information may reduce the dimension of the parameter space. For example, when the coefficient of variation ν = σ/µ is known, the distribution N(µ, σ²), µ ≠ 0, can be written as N(µ, ν²µ²) due to the equation σ² = ν²µ². It can be seen that the dimension of the sufficient statistic, (X̄, S²), is greater than the dimension of the parameter of interest, µ. In this situation, using only a part of the sufficient statistic leads to a loss of some information about the parameter of interest. Therefore, the simultaneous use of two or more estimators is necessary to extract as much information as possible about the parameter of interest. One can use a combination of estimators to construct an efficient estimator.

Khan (1968) proposed the optimal combination of two independent and unbiased estimators of the population mean when the sampling distribution is normal and the population coefficient of variation, ν, is known. Consider T_1(X) = X̄_n and T_2(X) = c_n S, with c_n = (n^{1/2} Γ((n − 1)/2))/((2a)^{1/2} Γ(n/2)), as two unbiased and independent estimators of µ, where S is the sample standard deviation and a = ν². Then the shrinkage estimator T = α X̄_n + (1 − α) c_n S, with the optimal weight α* = Var(c_n S)/(Var(X̄_n) + Var(c_n S)), is the optimal convex combination of the estimators X̄_n and c_n S.

Of course, it is not necessary to restrict these combinations to be convex. Gleser & Healy (1976) considered the more general case T = α_1 T_1 + α_2 T_2, where the T_i are any independent and unbiased estimators of θ and α_1 + α_2 is not necessarily equal to 1. The only restriction is that the ratios ν_i² = Var(T_i)/θ², i = 1, 2, are free from θ, where ν_i denotes the CV of the estimator T_i. This restriction holds, for example, when the T_i, i = 1, 2, are unbiased and ν is known. Since the estimator T is not necessarily convex, it is not necessarily an unbiased estimator of θ. The authors showed that the optimal weights in this case are given by α*_i = ν_j²/(ν_1²ν_2² + ν_1² + ν_2²), i ≠ j.

Samuel-Cahn (1994) studied another generalized case, the optimization of a convex combination of two unbiased, dependent estimators with a known correlation coefficient ρ, and derived the optimal weight α* = (1 − ρλ)/(1 − 2ρλ + λ²), where λ² = Var(T_1)/Var(T_2) is assumed to be known and free from θ.
It should be noted that this restriction holds whenever the CVs of both estimators are known and free from θ.
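To make the review concrete, here is a small Monte Carlo sketch of Khan's setting, under the assumption that S denotes the usual (n − 1)-divisor sample standard deviation; the constant c_n is then chosen so that E(c_n S) = µ. All numerical values are illustrative.

```python
import numpy as np
from scipy.special import gammaln

# Monte Carlo sketch of Khan's (1968) combination, assuming S is the
# (n-1)-divisor sample standard deviation (the paper's c_n corresponds
# to an n-divisor variant; only the constant changes).
rng = np.random.default_rng(0)
mu, nu, n, reps = 10.0, 0.5, 20, 200_000
sigma = nu * mu

# c_n chosen so that E(c_n * S) = mu when nu = sigma/mu is known
cn = np.sqrt((n - 1) / 2) * np.exp(gammaln((n - 1) / 2) - gammaln(n / 2)) / nu

x = rng.normal(mu, sigma, size=(reps, n))
t1 = x.mean(axis=1)                  # X-bar, unbiased for mu
t2 = cn * x.std(axis=1, ddof=1)      # c_n * S, also unbiased for mu

# optimal convex weight for two independent unbiased estimators,
# using empirical plug-in variances
w = t2.var() / (t1.var() + t2.var())
comb = w * t1 + (1 - w) * t2
for name, est in [("T1", t1), ("T2", t2), ("combined", comb)]:
    print(name, "MSE:", np.mean((est - mu) ** 2))
```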

Improved Linear Combination of Two Estimators for a Function of a Parameter
In a population with distribution f(x|θ), there are different parameters of interest, such as the mean, variance, etc., which appear as different functions of θ; hence it is interesting to look for improved estimators of a function of a parameter. In recent years, some authors, notably Laheetharan & Wijekoon (2010), have considered the problem of finding improved estimators for a function of the parameter of interest, say g(θ). In this section, we derive an optimal shrinkage estimator for a function of a parameter under the assumption that the population CV is known. The following lemma, which is stated without proof, provides a preliminary fact needed for the next theorem.
Lemma 1. Let T(X) be an estimator of the parameter θ and g(·) be a real-valued function, where E(T(X)) = k g(θ). If the population CV is known, then the ratio ν_T² = Var(T(X))/[k g(θ)]², the squared CV of T(X), is known and free from θ.

Using Lemma 1, we extend the results of Gleser & Healy (1976) and Arnholt & Hebert (2001) to the estimation of a function of the parameter, g(θ), in the next theorem.
Theorem 3. Let X = (X_1, ..., X_n) be a random sample from a population with distribution f(x|θ), and let T_1(X) and T_2(X) be estimators of g(θ), possibly correlated, with E(T_i(X)) = k_i g(θ), i = 1, 2. Suppose the CVs ν_i of T_i(X) and the correlation coefficient ρ between T_1(X) and T_2(X) are known and free from θ. Under these conditions, the estimator T*_LC(X) = α*_1 T_1(X) + α*_2 T_2(X), with

α*_1 = ν_2(ν_2 − ρν_1)/(k_1 Δ),  α*_2 = ν_1(ν_1 − ρν_2)/(k_2 Δ),  Δ = ν_1²ν_2²(1 − ρ²) + ν_1² + ν_2² − 2ρν_1ν_2,   (1)

uniformly has the minimum MSE among all estimators in the class C_{T_1,T_2}(α_1, α_2) = {α_1 T_1(X) + α_2 T_2(X)}.
Proof. The MSE of T_LC(X) = α_1 T_1(X) + α_2 T_2(X) can be written as

MSE(T_LC(X)) = g(θ)²[α_1²k_1²ν_1² + α_2²k_2²ν_2² + 2ρα_1α_2k_1k_2ν_1ν_2 + (α_1k_1 + α_2k_2 − 1)²].   (2)

Differentiating (2) with respect to α_1 and α_2 and equating to zero leads to the following system of equations:

k_1(ν_1² + 1)α_1 + k_2(ρν_1ν_2 + 1)α_2 = 1,
k_1(ρν_1ν_2 + 1)α_1 + k_2(ν_2² + 1)α_2 = 1.   (3)

Solving (3), we have

α*_1 = ν_2(ν_2 − ρν_1)/(k_1 Δ).

Similarly, we have

α*_2 = ν_1(ν_1 − ρν_2)/(k_2 Δ).

The second-order partial derivatives of (2) with respect to α_1 and α_2 are given by

∂²MSE/∂α_1² = 2k_1²g(θ)²(ν_1² + 1) and ∂²MSE/∂α_2² = 2k_2²g(θ)²(ν_2² + 1),

which are both positive (and the determinant of the Hessian, 4k_1²k_2²g(θ)⁴Δ, is also positive); therefore α*_1 and α*_2 minimize the value of MSE(T_LC(X)), and the estimator T*_LC(X) = α*_1 T_1(X) + α*_2 T_2(X) is optimal in the class C_{T_1,T_2}(α_1, α_2). The resulting minimum is MSE(T*_LC(X)) = g(θ)²(1 − k_1α*_1 − k_2α*_2).

Note that the assumptions of Theorem 3 combine the assumptions required for Theorems 1 and 2, which were stated in the previous sections. The next corollary is an immediate consequence of Theorem 3.
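The first-order conditions above can also be checked numerically: the sketch below minimizes the MSE expression (2) directly and compares the result with the closed-form weights (1). All parameter values are arbitrary test inputs of ours.

```python
import numpy as np
from scipy.optimize import minimize

# Numerical check of Theorem 3 (reconstructed formulas): minimize the MSE
# expression (2) directly and compare with the closed-form weights (1).
g, k1, k2, v1, v2, rho = 2.0, 1.3, 0.8, 0.25, 0.6, 0.3

def mse(a):
    a1, a2 = a
    bias = (a1 * k1 + a2 * k2 - 1) * g
    var = (g**2) * (a1**2 * k1**2 * v1**2 + a2**2 * k2**2 * v2**2
                    + 2 * rho * a1 * a2 * k1 * k2 * v1 * v2)
    return var + bias**2

delta = v1**2 * v2**2 * (1 - rho**2) + v1**2 + v2**2 - 2 * rho * v1 * v2
closed = (v2 * (v2 - rho * v1) / (k1 * delta),
          v1 * (v1 - rho * v2) / (k2 * delta))
numeric = minimize(mse, x0=[0.5, 0.5]).x
print("closed form:", closed)
print("numeric    :", numeric)   # should agree to optimizer tolerance
```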

Linear Combination of Two Improved Estimators
One might intuitively expect that using two estimators with improved efficiency to construct an optimal linear combination leads to a more efficient estimator. In other words, it may be expected that improving the two estimators T_1(X) and T_2(X) by using Theorem 2, and then constructing an optimal combination of these improved estimators by Theorem 3, leads to a more efficient linear combination. The following theorem (Theorem 4) shows that this intuitive expectation is not true. In fact, it shows that the two estimators obtained from an improved linear combination of two estimators and from a linear combination of two improved estimators are equivalent in terms of efficiency.
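The equivalence can be seen directly from the reconstructed formulas: the Theorem 2 shrinkage rescales each T_i without changing its CV or their correlation, and the optimal combination weights absorb the rescaling. A short sketch (our notation, arbitrary test values):

```python
# Sketch of the Section 4 claim: shrinking each estimator via Theorem 2 and
# then combining optimally yields the same estimator (hence the same MSE)
# as the improved linear combination of Theorem 3.
k1, k2, v1, v2, rho = 1.3, 0.8, 0.25, 0.6, 0.3

def lc_weights(k1, k2, v1, v2, rho):
    d = v1**2 * v2**2 * (1 - rho**2) + v1**2 + v2**2 - 2 * rho * v1 * v2
    return (v2 * (v2 - rho * v1) / (k1 * d), v1 * (v1 - rho * v2) / (k2 * d))

# Theorem 2 shrinkage: Ti -> si * Ti with si = 1/(ki (1 + vi^2));
# the CVs and the correlation are unchanged by rescaling.
s1, s2 = 1 / (k1 * (1 + v1**2)), 1 / (k2 * (1 + v2**2))
kk1, kk2 = k1 * s1, k2 * s2      # bias constants of the improved estimators

a1, a2 = lc_weights(k1, k2, v1, v2, rho)      # combine the raw estimators
b1, b2 = lc_weights(kk1, kk2, v1, v2, rho)    # combine the improved estimators

# Net coefficients on the original T1, T2 coincide:
print(a1, b1 * s1)
print(a2, b2 * s2)
```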

A Doubly-Improved Linear Combination of Two Estimators
In this section, we show how a Doubly-Improved (DI) linear combination of two estimators of a parameter can be constructed when the population CV is known. In the next theorem, we attempt to further improve the improved linear combination estimator resulting from Theorem 3 by applying Theorem 2.
Theorem 5. Consider the assumptions of Theorem 3, and suppose T*_LC(X) = α*_1 T_1(X) + α*_2 T_2(X) is the optimal linear combination of the estimators T_1(X) and T_2(X), where α*_1 and α*_2 are given in equation (1). Then:

a) The doubly-improved estimator T*_DI(X) = α* T*_LC(X), with α* = 1/(k_LC(1 + ν_LC²)), where k_LC = k_1α*_1 + k_2α*_2 and ν_LC is the CV of T*_LC(X), uniformly has the minimum MSE among all estimators of g(θ) in the class C(α) = {α T*_LC(X) : 0 < α < ∞}.

b) The minimum value of MSE(T*_DI(X)) is given by MSE(T*_DI(X)) = g(θ)² ν_LC²/(1 + ν_LC²) = g(θ)²(1 − k_1α*_1 − k_2α*_2); hence, due to the system of equations (3), it can easily be shown that α* = 1, so that the estimator T*_DI(X) coincides with T*_LC(X) and no further shrinkage improvement is possible.
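A numerical check of this reading of Theorem 5: applying the Theorem 2 shrinkage factor to T*_LC returns α* = 1, so the doubly-improved estimator coincides with T*_LC. The formulas are our reconstructions and the test values are arbitrary.

```python
# Check that the Theorem 2 shrinkage applied to the already-optimal T*_LC
# gives alpha* = 1, i.e. T*_DI = T*_LC (reconstructed formulas).
g, k1, k2, v1, v2, rho = 2.0, 1.3, 0.8, 0.25, 0.6, 0.3

d = v1**2 * v2**2 * (1 - rho**2) + v1**2 + v2**2 - 2 * rho * v1 * v2
a1 = v2 * (v2 - rho * v1) / (k1 * d)
a2 = v1 * (v1 - rho * v2) / (k2 * d)

k_lc = a1 * k1 + a2 * k2                 # E(T*_LC) = k_lc * g(theta)
var_lc = (g**2) * (a1**2 * k1**2 * v1**2 + a2**2 * k2**2 * v2**2
                   + 2 * rho * a1 * a2 * k1 * k2 * v1 * v2)
v_lc2 = var_lc / (k_lc * g) ** 2         # squared CV of T*_LC

alpha_star = 1 / (k_lc * (1 + v_lc2))    # Theorem 2 applied to T*_LC
print(alpha_star)                        # -> 1.0 (up to rounding)
```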

Illustrative Examples
Using Theorem 2, it is possible to obtain optimal shrunken estimators for both the population mean, say T*_µ(X), and the population variance, say T*_{σ²}(X). Note that if the population CV, ν, is known, then one can easily use the mean-based estimator T*_{σν²}(X) = ν²[T*_µ(X)]² as another estimator of the population variance. Laheetharan & Wijekoon (2010) compared the MSEs of the estimators T*_{σ²}(X) and T*_{σν²}(X).
Suppose T_µ(X) and T_{σ²}(X) are estimators of the population mean and variance, respectively, with E(T_µ(X)) = k_1µ and E(T_{σ²}(X)) = k_2σ². Since the population CV is known, the estimator T_{σν²}(X) = ν²[T_µ(X)]² can be considered as another estimator of the population variance. Hence, if Var(T_µ(X)) = cσ², we have

E(T_{σν²}(X)) = ν²[Var(T_µ(X)) + k_1²µ²] = (k_1² + cν²)σ² = k_{σν²}σ²,

where k_1, k_2, k_{σν²} and c are known constants. Using the above information, and based on Theorems 4 and 5, we have the following theorem for estimating the population variance.

Theorem 6. Let X = (X_1, ..., X_n) be a random sample from a population with distribution f(x|θ), and let T_{σν²}(X) and T_{σ²}(X) be estimators of σ², possibly correlated, with E(T_{σν²}(X)) = k_{σν²}σ² and E(T_{σ²}(X)) = k_2σ².

i) If the CVs of T_{σν²}(X) and T_{σ²}(X) and their correlation coefficient are known and free from σ², then, by Theorem 3, the improved linear combination T*_LC(X) = α*_1 T_{σν²}(X) + α*_2 T_{σ²}(X), with the weights given in equation (1), uniformly has the minimum MSE among all estimators that are linear in T_{σν²}(X) and T_{σ²}(X).

ii) Since the CV of T*_LC(X) is known and free from σ², based on Theorem 5, the doubly-improved estimator T*_DI(X) = α* T*_LC(X) uniformly has the minimum MSE among all estimators of σ² in the class C(α) = {α T*_LC(X) : 0 < α < ∞}.

Example 1. Let X = (X_1, ..., X_n) be a random sample from a population with the location-scale exponential distribution E(θ, θ), given by f(x|θ) = (1/θ) exp{−(x − θ)/θ}, x > θ. Since T_1(X) = X_(1) and T_2(X) = Σ_{i=1}^{n}(X_i − X_(1)) are jointly sufficient statistics for g(θ) = θ, our motivation is to use a combination of these two estimators to estimate the parameter of interest. It is easy to show that the means and variances of T_1 and T_2 are given by

E(T_1) = ((n + 1)/n)θ, Var(T_1) = θ²/n², E(T_2) = (n − 1)θ, Var(T_2) = (n − 1)θ²,

respectively, and that T_1 and T_2 are independent. Hence, in the notation of Theorem 3, we have k_1 = (n + 1)/n, ν_1² = 1/(n + 1)², k_2 = n − 1, ν_2² = 1/(n − 1) and ρ = 0. Therefore, according to equation (1), the improved linear combination of the two estimators T_1 and T_2 is given by

T*_LC(X) = [n(n + 1) T_1(X) + T_2(X)]/(n² + 3n + 1).

This improved estimator uniformly has the minimum MSE among all estimators in the class C_{T_1,T_2}(α_1, α_2) = {α_1 T_1(X) + α_2 T_2(X) | 0 < α_1, α_2 < ∞}, with MSE(T*_LC(X)) = θ²/(n² + 3n + 1). The value of the MSE of the improved estimator has been computed for different sample sizes and plotted in Figure 1. The decrease in the MSE as the sample size increases indicates that the improved shrinkage estimator is consistent.
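The following Monte Carlo sketch reproduces the Example 1 computation under the derivation above; the weights and the predicted MSE θ²/(n² + 3n + 1) are our reconstruction from equation (1).

```python
import numpy as np

# Monte Carlo sketch for Example 1: location-scale exponential E(theta, theta),
# combining T1 = X_(1) and T2 = sum(Xi - X_(1)) with the derived weights
# a1 = n(n+1)/(n^2+3n+1), a2 = 1/(n^2+3n+1).
rng = np.random.default_rng(1)
theta, reps = 2.0, 200_000
for n in (5, 10, 20, 50):
    x = theta + rng.exponential(scale=theta, size=(reps, n))
    t1 = x.min(axis=1)                 # X_(1)
    t2 = x.sum(axis=1) - n * t1        # sum of (Xi - X_(1))
    est = (n * (n + 1) * t1 + t2) / (n**2 + 3 * n + 1)
    mse = np.mean((est - theta) ** 2)
    print(n, mse, "predicted:", theta**2 / (n**2 + 3 * n + 1))
```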
Example 2. Let X = (X_1, ..., X_n) be a random sample from a population with the normal distribution N(θ, θ²). This is a curved exponential family with a two-dimensional sufficient statistic, and the joint minimal sufficient statistic for g(θ) = θ is (X̄, S²). From the two components of this statistic, two estimators T_1 and T_2 of θ can be constructed, and their means and variances, given in equations (18) and (19), are known multiples of θ and θ². Using the mean of an inverse Gaussian distribution, the correlation coefficient between T_1 and T_2, given in equation (22), can be expressed in terms of a quantity h that is free from θ. Considering equations (18), (19) and (22), and based on the notation of Theorem 3, the constants k_1, k_2, ν_1, ν_2 and ρ are known and free from θ. Therefore, according to equation (1), the improved linear combination of the two estimators T_1 and T_2 is given by T*_LC(X) = α*_1 T_1(X) + α*_2 T_2(X).
This improved estimator uniformly has the minimum MSE among all estimators in the class C_{T_1,T_2}(α_1, α_2) = {α_1 T_1 + α_2 T_2}. The value of the MSE of the improved estimator has been computed for different sample sizes and plotted in Figure 3. The decrease in the MSE as the sample size increases indicates that the improved shrinkage estimator is consistent.
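Since the exact estimator pair of Example 2 is not fully recoverable from the source, the following sketch uses a stand-in pair (an assumption on our part): the two independent unbiased estimators X̄ and c_n S of θ in N(θ, θ²), combined with the Gleser & Healy weights (ρ = 0, k_1 = k_2 = 1).

```python
import numpy as np
from scipy.special import gammaln

# Monte Carlo sketch for N(theta, theta^2), where nu = sigma/mu = 1.
# Stand-in estimator pair (an assumption, not the paper's exact choice):
# T1 = X-bar and T2 = c_n * S, independent and unbiased for theta.
rng = np.random.default_rng(2)
theta, n, reps = 3.0, 20, 200_000

r2 = np.exp(2 * (gammaln(n / 2) - gammaln((n - 1) / 2)))  # = ((n-1)/2)*(E S / sigma)^2
cn = np.sqrt((n - 1) / (2 * r2))                          # so that E(c_n * S) = theta

x = rng.normal(theta, theta, size=(reps, n))
t1, t2 = x.mean(axis=1), cn * x.std(axis=1, ddof=1)

v1sq = 1.0 / n                      # squared CV of X-bar
v2sq = (n - 1) / (2 * r2) - 1       # squared CV of c_n * S
delta = v1sq * v2sq + v1sq + v2sq   # equation (1) with rho = 0, k1 = k2 = 1
a1, a2 = v2sq / delta, v1sq / delta

est = a1 * t1 + a2 * t2
print("MSE(T1)       :", np.mean((t1 - theta) ** 2))
print("MSE(improved) :", np.mean((est - theta) ** 2),
      "predicted:", theta**2 * v1sq * v2sq / delta)
```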

Discussion and Results
Sometimes the complete information about the parameter of interest is distributed across two or more different estimators. In these situations, using only one of the given estimators leads to a loss of the information carried by the others. Therefore, a combination of estimators must be employed to achieve a more efficient estimator. Moreover, it is interesting to look for improved estimators for a general function of the parameters of interest, say g(θ). In recent years, some authors, notably Laheetharan & Wijekoon (2010), have considered the problem of finding improved estimators for a function of the parameter of interest. In this context, we have presented an optimal shrinkage estimator for a general function of the parameter of interest under the assumption of a known population coefficient of variation. We have also shown that the two estimators obtained from the improved linear combination of two estimators and from the linear combination of two improved estimators are equivalent in terms of efficiency.

We think that using other coefficients of distributions as additional information, in order to achieve a more efficient linear combination of two or more estimators, is an interesting field of research, and future studies will need to address this problem. In our opinion, the coefficient of variation, as an informative coefficient of a distribution, will remain of particular interest in this direction. In fact, whenever prior information about the size of the coefficient of variation is available, the shrinkage procedure could be useful. The possible results for some distributions with particular properties may be even more interesting. For example, the one-parameter exponential family of distributions is quite interesting: in some members of this family, such as the normal, Poisson, gamma, binomial and negative binomial distributions, the variance is at most a quadratic function of the mean. Therefore, identifying the pertinent coefficients in the quadratic function is equivalent to determining the coefficient of variation. As is obvious from the theorems' assumptions, one can use any correlated or uncorrelated pair of estimators to construct the improved linear combination.