Improved Linear Regression Using Auxiliary Information

The present study proposes an improved linear regression estimator that uses auxiliary information, namely the coefficient of regression, the coefficient of skewness and the coefficient of variation, in order to achieve more precise estimates than the existing estimators. The properties of the proposed estimators are assessed through their bias and mean square error and compared with those of the existing estimators. A numerical illustration is given in support of the theoretical work.


INTRODUCTION
A regression-type estimator using a known coefficient of variation is considered and its properties are studied. There are several instances in the physical, biological and agricultural sciences where the mean is proportional to the standard deviation and, consequently, the coefficient of variation is known although the mean and standard deviation may not be known. Some such situations may be seen in Snedecor (1946), Hald (1952), Davies and Goldsmith (1976) and Gleser and Healy (1976). The well-known Weber's law of psychophysics (see Guilford (1975), Chapter II) provides instances where the coefficient of variation is known, and one such example is also given in Singh (1998). Sometimes simple a priori information in the form of the coefficient of variation is available to experimenters in fields such as biology, agriculture and psychophysics, through their long association with such experiments. This information concerning the coefficient of variation is frequently used to plan experiments and to estimate sample size, averages, totals, etc. (see also Searles (1964)). The coefficient of variation is discussed further in Cochran (1977, 3rd edition), Chapter 4, pages 77 and 79. A good description of knowledge about the coefficient of variation is also given in Sukhatme et al. (1984), page 42.
The objective of the paper is to propose modified estimators of the population mean based on improved linear regression, using auxiliary information in the form of the coefficient of regression and the coefficient of skewness of the auxiliary variables.

Further, let $y_i$ and $x_i$ ($i = 1, \dots, n$) denote the observations on $y$ and on the auxiliary variable $x$, respectively, for a simple random sample of size $n$.
For estimating the population mean using the regression method of estimation, the proposed estimator is defined in terms of a characterizing scalar, which is chosen suitably.
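As background for the regression method of estimation, the following minimal sketch computes the usual linear regression estimate $\bar{y} + b(\bar{X} - \bar{x})$ of a population mean, where $b$ is the sample regression coefficient and $\bar{X}$ is the known population mean of the auxiliary variable. It illustrates the classical estimator only, not the proposed one; the function name `regression_estimate` and the toy data are hypothetical.

```python
def regression_estimate(y, x, pop_mean_x):
    """Usual linear regression estimate of the population mean of y.

    y, x       : paired sample observations (simple random sample)
    pop_mean_x : known population mean of the auxiliary variable x
    """
    n = len(y)
    if n != len(x) or n < 2:
        raise ValueError("y and x must have the same length (at least 2)")
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    # Sample regression coefficient b = s_xy / s_x^2
    # (the common divisor n - 1 cancels in the ratio)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must not be constant")
    b = s_xy / s_xx
    # Adjust the sample mean of y by the discrepancy in the auxiliary mean
    return y_bar + b * (pop_mean_x - x_bar)


# Hypothetical toy data, for illustration only
x_sample = [1.0, 2.0, 3.0, 4.0, 5.0]
y_sample = [2.1, 3.9, 6.2, 7.8, 10.0]
print(regression_estimate(y_sample, x_sample, pop_mean_x=3.5))
```

When the sample mean of $x$ falls below the known population mean and $b$ is positive, the estimate is adjusted upwards from $\bar{y}$, which is the source of the gain in precision over the plain sample mean.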

Comparison of Bias and Mean Square Error of the Proposed Estimator vis-à-vis the Competing Estimator
For simplicity, it is assumed that the population size N is large enough compared with the sample size n that the finite population correction term may be ignored.
Replacing the optimum value of the characterizing scalar by its estimate, we get the estimator based on the estimated optimum value.
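For reference, comparisons of this kind are usually made against the first-order mean square error of the usual linear regression estimator. Under simple random sampling with the finite population correction ignored, the standard textbook expressions are shown below. The symbols $S_y^2$ (population variance of $y$), $\rho$ (population correlation coefficient between $y$ and $x$) and $\bar{y}_{lr}$ (usual linear regression estimator) are notation assumed here, and the expressions are not a derivation for the proposed estimator.

```latex
\operatorname{Var}(\bar{y}) = \frac{S_y^2}{n},
\qquad
\operatorname{MSE}(\bar{y}_{lr}) \approx \frac{S_y^2}{n}\,\bigl(1 - \rho^2\bigr)
```

Any estimator claimed to improve on the usual linear regression estimator must therefore have a first-order mean square error below $S_y^2 (1 - \rho^2)/n$.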

An Illustration
We observe that the conditions discussed in the introduction for a known coefficient of variation are satisfied for the data given in Walpole, R. E., Myers, R. H., Myers, S. L., and Ye, K. (2005, page 473), which deal with a measure of aerobic fitness, namely the oxygen consumption in volume per unit body weight per unit time. Thirty-one individuals were used in an experiment in order to model oxygen consumption (y) against the time to run one and a half miles (x). Computation of the required values has been done and we have the following.
For populations I and II we use the data sets of Singh and Chaudhary (1986), page 177, and for population III we use the data of Murthy (1967), page 228, in which the fixed capital is denoted by X (auxiliary variable) and the output of 80 factories is denoted by Y (study variable). For population IV, the data concern the cultivation and production of apples in district Baramulla of Kashmir (Jammu and Kashmir), in which the apple production (in tons) is denoted by Y (study variable) and the number of apple trees is denoted by X (auxiliary variable, 1 unit = 100 trees) in 117 villages of the Baramulla region of Jammu and Kashmir in 2010-2011 (Source: RCM project, pilot survey for estimation of cultivation and production of apple in district Baramulla, RCM approved project).
The data consist of 100 blocks in a large city. The variables of interest are y, the number of persons per block, and x, the number of rooms per block.
For these data, we have $\bar{y} = 101.1$, $\bar{x} = 58.8$, and a further reported value of 0.4385. We provide the percentage gain in efficiency of the proposed estimator with respect to its competitors in Table 1. The table clearly indicates the superiority of the proposed estimator over its competitors. From (4.4.3), we see that the proposed estimator based on the estimated optimum value is always more efficient than the usual linear regression estimator $\bar{y}_{lr} = \bar{y} + b(\bar{X} - \bar{x})$, in the sense of having a smaller mean square error.
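The percentage gain in efficiency reported in such comparisons can be computed from the mean square errors of the two estimators. The sketch below assumes the common definition, gain = (MSE of competitor / MSE of proposed estimator − 1) × 100; this definition, the function name and the numbers in the usage line are assumptions for illustration and are not values from Table 1.

```python
def percent_gain_in_efficiency(mse_competitor, mse_proposed):
    """Percentage gain in efficiency of a proposed estimator over a competitor.

    Assumed definition: (MSE_competitor / MSE_proposed - 1) * 100.
    A positive value means the proposed estimator has the smaller MSE.
    """
    if mse_competitor <= 0 or mse_proposed <= 0:
        raise ValueError("mean square errors must be positive")
    return (mse_competitor / mse_proposed - 1.0) * 100.0


# Hypothetical MSE values, for illustration only
print(percent_gain_in_efficiency(mse_competitor=1.5, mse_proposed=1.2))
```

A gain of zero means the two estimators are equally efficient, and a negative value means the competitor has the smaller mean square error.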
The use of the proposed estimator is limited to situations in which the coefficient of variation is known. However, when the coefficient of variation is unknown, its estimated value may be used after studying the performance (robustness) of the estimator against different values of the CV, for instance when the guess is in error by 5%, 10%, 15%, 20%, 25% or 50%. Further work is being done in this direction.
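A robustness study of this kind can be organized as in the minimal sketch below. It does not use the proposed estimator. As a simple stand-in, it uses the well-known shrinkage estimator of the mean with a known coefficient of variation, $\bar{y} / (1 + C_0^2/n)$, where $C_0$ is the guessed CV, and reports its efficiency relative to the sample mean when the guess is in error. The finite population correction is ignored; the true CV of 0.5 and the sample size of 10 are hypothetical.

```python
def relative_efficiency(true_cv, guessed_cv, n):
    """Efficiency of the shrinkage estimator y_bar / (1 + guessed_cv^2 / n)
    relative to the sample mean y_bar (values above 1 favour shrinkage).

    All mean square errors are expressed in units of mu^2, so the
    unknown population mean mu cancels out of the ratio.
    """
    lam = 1.0 / (1.0 + guessed_cv ** 2 / n)        # shrinkage factor
    variance_part = lam ** 2 * true_cv ** 2 / n    # variance of lam * y_bar
    bias_part = (1.0 - lam) ** 2                   # squared bias of lam * y_bar
    var_sample_mean = true_cv ** 2 / n             # variance of y_bar
    return var_sample_mean / (variance_part + bias_part)


# Hypothetical setting: true CV 0.5, sample size 10
true_cv, n = 0.5, 10
for error in (0.05, 0.10, 0.15, 0.20, 0.25, 0.50):
    for sign in (-1, 1):
        guess = true_cv * (1 + sign * error)
        eff = relative_efficiency(true_cv, guess, n)
        print(f"guess off by {sign * error:+.0%}: relative efficiency {eff:.4f}")
```

When the guess is exact, the relative efficiency equals $1 + C^2/n$. The loop shows how quickly that advantage erodes as the guess departs from the true value, and the same kind of table could be produced for any estimator whose mean square error is available in closed form.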