Introduction

The Poisson Regression Model (PRM) is one of the benchmark models for count data, in much the same way as the normal linear regression model is the benchmark for continuous data1. In the PRM, the response variable \(y_{i}\) follows a Poisson distribution with mean \(\mu_{i}\), and its probability function is defined as

$$f\left( {y_{i} } \right) = \frac{{e^{{ - \mu_{i} }} \mu_{i}^{{y_{i} }} }}{{y_{i} !}}, \quad i = 1,2, \ldots ,n, y_{i} = 0,1,2, \ldots$$
(1)

where \(\mu_{i}\) is expressed using the canonical log link function and a linear combination of explanatory variables as \(\mu_{i} = \exp \left( {x^{\prime}_{i} \beta } \right)\), where \(x^{\prime}_{i}\) is the ith row of X, an \(n \times \left( {p + 1} \right)\) data matrix with p explanatory variables, and \(\beta\) is a \(\left( {p + 1} \right) \times 1\) vector of coefficients.

The Maximum Likelihood method is the standard technique for estimating the model parameters in PRMs2. The log-likelihood function for the PRM is given as follows

$$l(\beta ) = \sum\limits_{i = 1}^{n} {y_{i} x^{\prime}_{i} \beta - \exp \left( {x^{\prime}_{i} \beta } \right) - \log \left( {y_{i} !} \right).}$$
(2)

The Maximum Likelihood Estimator (MLE) of \(\beta\) is obtained by maximizing the log-likelihood function, which yields the score equations

$$S(\beta ) = \frac{\partial l(\beta ;y)}{{\partial \beta }} = \sum\limits_{i = 1}^{n} {\left[ {y_{i} - \exp \left( {x^{\prime}_{i} \beta } \right)} \right]} x_{i} = 0.$$
(3)

Since Eq. (3) is a nonlinear function of the parameter \(\beta\), the solution of \(S\left( \beta \right) = 0\) is obtained using the following iteratively reweighted least squares (IRLS) algorithm

$$\hat{\beta }_{MLE} = \left( {X^{\prime}\hat{W}X} \right)^{ - 1} X^{\prime}\hat{W}Z,$$
(4)

where Z is an n-dimensional vector with the ith element \(z_{i} = \log \left( {\hat{\mu }_{i} } \right) + \frac{{y_{i} - \hat{\mu }_{i} }}{{\hat{\mu }_{i} }}\) and \(\hat{W} = {\text{diag}} \left[ {\hat{\mu }_{i} } \right]\)3. The iteration ends when the difference between the old and updated values is less than a given small value, which is usually \(10^{ - 8}\)4. The asymptotic variance–covariance matrix of \(\hat{\beta }_{MLE}\) is \(cov\left( {\hat{\beta }{}_{MLE}} \right) = \left( {X^{\prime}\hat{W}X} \right)^{ - 1} .\)
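As an illustration, a minimal R sketch of this IRLS scheme might look as follows (all variable names are ours, and the simulated data are purely illustrative):

```r
# Minimal IRLS sketch for the Poisson MLE of Eq. (4).
# X: n x (p+1) design matrix with an intercept column; y: count response.
set.seed(1)
n <- 100; p <- 2
X <- cbind(1, matrix(rnorm(n * p), n, p))
y <- rpois(n, exp(X %*% c(0.5, 0.3, -0.2)))

beta <- rep(0, ncol(X))                        # starting values
repeat {
  mu <- as.vector(exp(X %*% beta))             # current means mu-hat_i
  W  <- diag(mu)                               # W-hat = diag(mu-hat_i)
  z  <- log(mu) + (y - mu) / mu                # working response Z
  beta_new <- as.vector(solve(t(X) %*% W %*% X, t(X) %*% W %*% z))
  if (max(abs(beta_new - beta)) < 1e-8) break  # the usual 10^-8 tolerance
  beta <- beta_new
}
beta_mle <- beta_new
# Cross-check against the built-in fit:
# coef(glm(y ~ X - 1, family = poisson))
```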

Although the MLE is widely used, one of its major disadvantages is that the parameter estimates become unstable in the presence of multicollinearity5,6,7,8,9,10,11,12,13. The multicollinearity problem, which arises from an approximately linear relationship between the explanatory variables, affects the estimates of the model parameters in PRMs just as in linear regression models. One effect of multicollinearity between the explanatory variables is that the variance of the MLE becomes so large that the estimates of the model parameters become unstable14,15,16,17,18,19,20.

In order to reduce the undesirable effects of multicollinearity, biased estimators that serve as alternatives to the MLE have been generalized from their counterparts in the linear regression model. For example, Månsson and Shukur18 proposed the Poisson Ridge Estimator (PRE) as follows:

$$\hat{\beta }_{PRE} = \left( {X^{\prime}\hat{W}X + kI} \right)^{ - 1} X^{\prime}\hat{W}X\hat{\beta }_{MLE} {, }\quad k > 0,$$
(5)

where \(k\) is a biasing parameter. The PRE is the generalization of the Ridge estimator introduced by Hoerl and Kennard21 for the linear regression model.
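Continuing from the IRLS sketch above, the PRE can be computed in a few lines (the value of k here is arbitrary, chosen only for illustration):

```r
# Poisson Ridge Estimator of Eq. (5); reuses X and beta_mle from the
# IRLS sketch above.
k    <- 0.5                                    # biasing parameter, k > 0
W    <- diag(as.vector(exp(X %*% beta_mle)))   # W-hat evaluated at the MLE
XtWX <- t(X) %*% W %*% X
Ip   <- diag(ncol(X))
beta_pre <- as.vector(solve(XtWX + k * Ip, XtWX %*% beta_mle))
```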

Månsson et al.19, Amin et al.22 and Qasim et al.23 defined the Poisson Liu Estimator (PLE) as follows:

$$\hat{\beta }_{PLE} = \left( {X^{\prime}\hat{W}X + I} \right)^{ - 1} \left( {X^{\prime}\hat{W}X + dI} \right)\hat{\beta }_{MLE} ,$$
(6)

where \(0 < d < 1\) is a biasing parameter. The PLE is the generalization of the Liu estimator introduced by Liu24 for the linear regression model.
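The PLE is equally direct, reusing XtWX, Ip and beta_mle from the sketch above (again with an arbitrary illustrative d):

```r
# Poisson Liu Estimator of Eq. (6).
d <- 0.5                                       # biasing parameter, 0 < d < 1
beta_ple <- as.vector(solve(XtWX + Ip, (XtWX + d * Ip) %*% beta_mle))
```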

In recent years, estimators with two biasing parameters have been proposed as alternatives to the PRE and PLE. Such estimators, obtained by combining several existing estimators, aim to provide more suitable parameter estimates. In this context, Algamal25 defined the Poisson Liu-type estimator (PLTE) for the PRMs as follows:

$$\hat{\beta }_{PLTE} = \left( {X^{\prime}\hat{W}X + kI} \right)^{ - 1} \left( {X^{\prime}\hat{W}X - dI} \right)\hat{\beta }_{MLE} ,$$
(7)

where \(k{ > 0}\) and \(d \in R\) are the biasing parameters. The PLTE is a generalization of the Liu-type estimator, which was first introduced by Liu26, and is based on the two biasing parameters \(k\) and \(d\).

Moreover, Asar and Genç15 and Çetinkaya and Kaçıranlar16 adapted to PRMs another biased estimator with two biasing parameters, originally defined by Özkale and Kaçıranlar27 for linear regression models. The Poisson two-parameter Estimator (PTPE) is defined as:

$$\hat{\beta }_{PTPE} = \left( {X^{\prime}\hat{W}X + kI} \right)^{ - 1} \left( {X^{\prime}\hat{W}X + kdI} \right)\hat{\beta }_{MLE} ,$$
(8)

where \(k{ > 0}\) and \(0{ < }d{ < 1}\) are the biasing parameters.

As an alternative to the estimators introduced so far, Akay and Ertan5 proposed a general Improved Liu-type Estimator (ILTE), which includes the MLE, PRE, PLE, PLTE and PTPE as special cases:

$$\hat{\beta }_{ILTE} = \left( {X^{\prime}\hat{W}X + kI} \right)^{ - 1} \left( {X^{\prime}\hat{W}X + f\left( k \right)I} \right)\hat{\beta }^{*} , \quad k > 0,$$
(9)

where \(\hat{\beta }^{*}\) is any estimator of \(\beta\) and \(f\left( k \right)\) is a continuous function of the biasing parameter k. The estimator given in (9) is a generalization of the Liu-type estimator proposed by Kurnaz and Akay28 for linear regression models.
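Because (9) covers a family of estimators through the choices of \(\hat{\beta }^{*}\) and \(f\left( k \right)\), it is natural to code it once as a generic function; a hedged sketch, reusing XtWX and beta_mle from the sketches above:

```r
# Generic ILTE of Eq. (9): beta_star is any pilot estimator and f any
# continuous function of the biasing parameter k.
ilte <- function(XtWX, beta_star, k, f) {
  Ip <- diag(nrow(XtWX))
  as.vector(solve(XtWX + k * Ip, (XtWX + f(k) * Ip) %*% beta_star))
}
# f(k) = k returns beta_star unchanged, while f(k) = 0 with
# beta_star = beta_mle reproduces the PRE of Eq. (5):
ilte(XtWX, beta_mle, k = 0.5, f = function(k) 0)
```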

In the literature, many estimators proposed for linear regression models can be generalized to PRMs. For example, Yang and Chang29 proposed an estimator based on the Ridge estimator in linear regression models. This biased estimator was adapted to the PRMs by Asar and Genç15 and applied to Negative Binomial regression models by Huang and Yang30. Based on the PRE, the estimator given by Huang and Yang30 is as follows:

$$\hat{\beta }_{PHY} \left( {k,d} \right) = \left( {X^{\prime}\hat{W}X + I} \right)^{ - 1} \left( {X^{\prime}\hat{W}X + dI} \right)\left( {X^{\prime}\hat{W}X + kI} \right)^{ - 1} X^{\prime}\hat{W}X\hat{\beta }_{MLE} ,\quad k > 0, 0 < d < 1,$$
(10)

where k and d are two biasing parameters. Although the estimator given in (10) is based on the PRE, it is a general estimator that also includes the MLE, PRE, and PLE as special cases.

From this point of view, Sakallıoğlu and Kaçıranlar31 proposed another estimator based on the Ridge estimator in linear regression models, which is defined as:

$$\hat{\beta }_{SK} \left( {k,d} \right) = \left( {X^{\prime}X + I} \right)^{ - 1} \left( {X^{\prime}X + \left( {k + d} \right)I} \right)\hat{\beta }_{RE} { ,}\quad k > 0, - \infty < d < \infty ,$$
(11)

where k and d are two biasing parameters and \(\hat{\beta }_{RE} = \left( {X^{\prime}X + kI} \right)^{ - 1} X^{\prime}Y\). In this context, the estimator in (11) can be generalized to PRMs. Based on the PRE, we generalize the estimator of Sakallıoğlu and Kaçıranlar31 given in (11) as follows:

$$\hat{\beta }_{PSK} \left( {k,d} \right) = \left( {X^{\prime}\hat{W}X + I} \right)^{ - 1} \left( {X^{\prime}\hat{W}X + \left( {k + d} \right)I} \right)\hat{\beta }_{PRE} { ,}\quad k > 0, - \infty < d < \infty ,$$
(12)

where k and d are two biasing parameters. In this case, the estimator given in (12) is a general estimator which includes the MLE, PRE and PLE as special cases. To the best of our knowledge, no study has investigated the estimator in (12) for the PRMs.
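A sketch of the proposed estimator (12) in R makes its structure explicit: a Liu-type filter with parameter \(k + d\) applied to the PRE (reusing XtWX and beta_mle from above):

```r
# Proposed PSK estimator of Eq. (12).
psk <- function(XtWX, beta_mle, k, d) {
  Ip <- diag(nrow(XtWX))
  beta_pre <- solve(XtWX + k * Ip, XtWX %*% beta_mle)   # PRE of Eq. (5)
  as.vector(solve(XtWX + Ip, (XtWX + (k + d) * Ip) %*% beta_pre))
}
```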

In PRMs, it is known that the performance of biased estimators proposed as alternatives to the MLE is generally affected by the value of the biasing parameter. In general, the methods used for estimating the biasing parameters have been adapted from those used in linear regression models. The use of estimators with two biasing parameters has become increasingly widespread in recent years. However, one of the most important problems for estimators with two biasing parameters is that finding optimal estimates of both parameters is difficult. For this purpose, many iterative techniques have been proposed to estimate these biasing parameters. In these techniques, one of the biasing parameters is estimated depending on the other, or vice versa15,16,30. Thus, the idea arises that an unknown functional relationship may exist between these two biasing parameters.

Based on the considerations above, our aim in this article is to introduce a new general class of estimators that arises when there is a functional relationship between the biasing parameters. The proposed general estimator can be defined to specifically include the estimators given by (4), (5), (6), (10) and (12), and thus constitutes a general class of estimators like the one given in (9). It is a more efficient alternative to the estimator defined in (9) for overcoming multicollinearity in the PRMs. Another purpose of this article is to compare these two classes of estimators in a simulation study under various conditions.

The remainder of the article is organized as follows: In "A new general biased estimator", a new biased estimator is defined and some of its properties are given. The superiority of this estimator over the other biased estimators in the matrix mean square error sense is shown in "The superiority of the PRTE in PRMs". In "Determination of \(g\left( k \right)\) function", several rules are proposed to determine the relationship between the biasing parameters. Two separate Monte Carlo simulation studies are executed in "The Monte Carlo simulation studies". In "Numerical example: the aircraft damage data", a real numerical example is provided to evaluate the performances of the proposed biased estimators. Some concluding remarks are given in "Some concluding remarks".

A new general biased estimator

For PRMs, we can define a new general class of estimators, based on the PRE and including the estimators (4), (5), (6), (10) and (12), as follows:

$$\hat{\beta }_{PRTE} = \left( {X^{\prime}\hat{W}X + I} \right)^{ - 1} \left( {X^{\prime}\hat{W}X + g\left( k \right)I} \right)\hat{\beta }_{PRE} , k > 0,$$
(13)

where \(g\left( k \right)\) is a continuous function of the biasing parameter \(k\). When we select \(g\left( k \right)\) as a linear function of the biasing parameter k, such as \(g\left( k \right) = ak + b\) with \(a,b \in R\), the Poisson Ridge-type estimator (PRTE) is a general estimator which includes the other biased estimators as special cases (a numerical check follows the list):

\(\hat{\beta }_{PRTE} = \hat{\beta }_{MLE}\) for \(g\left( 0 \right) = 1\) where \(k = 0\) and \(b = 1\).

\(\hat{\beta }_{PRTE} = \hat{\beta }_{PRE}\) for \(g\left( k \right) = 1\) where \(a = 0\) and \(b = 1\).

\(\hat{\beta }_{PRTE} = \hat{\beta }_{PLE}\) for \(g\left( 0 \right) = b\) where \(a = 0\) and \(b\) corresponds to the biasing parameter d.

\(\hat{\beta }_{PRTE} = \hat{\beta }_{PHY} \left( {k,d} \right)\) for \(g\left( k \right) = b\) where b corresponds to the biasing parameter d.

\(\hat{\beta }_{PRTE} = \hat{\beta }_{PSK} \left( {k,d} \right)\) for \(g\left( k \right) = k + b\) where \(a = 1\) and b corresponds to the biasing parameter d.
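These special cases can be verified numerically with a small sketch, reusing XtWX and beta_mle from the earlier sketches:

```r
# PRTE of Eq. (13) with the linear choice g(k) = a*k + b.
prte <- function(XtWX, beta_mle, k, a, b) {
  Ip <- diag(nrow(XtWX))
  beta_pre <- solve(XtWX + k * Ip, XtWX %*% beta_mle)
  as.vector(solve(XtWX + Ip, (XtWX + (a * k + b) * Ip) %*% beta_pre))
}
# a = 0, b = 1, k = 0 -> MLE;  a = 0, b = 1 -> PRE;  a = 1, b = d -> PSK(k, d)
all.equal(prte(XtWX, beta_mle, k = 0, a = 0, b = 1), beta_mle)  # TRUE
```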

Note that the proposed estimator given in (13) is different from the biased estimator given in (9). That is, when we use \(\hat{\beta }_{PRE}\) instead of \(\hat{\beta }^{*}\) in (9), the resulting estimator \(\hat{\beta }_{ILTE(PRE)}\) is given as follows:

$$\hat{\beta }_{ILTE(PRE)} = \left( {X^{\prime}\hat{W}X + kI} \right)^{ - 1} \left( {X^{\prime}\hat{W}X + f\left( k \right)I} \right)\hat{\beta }_{PRE} , \quad k > 0,$$
(14)

where \(f\left( k \right)\) is a continuous function of the biasing parameter \(k\). Note that the estimator given in (14) does not exactly correspond to the estimators given by (10) and (12). To show that the estimators given in (13) and (14) are different, let us examine their asymptotic scalar mean square error (SMSE) and asymptotic matrix mean square error (MMSE).

In order to obtain the asymptotic SMSE and the asymptotic MMSE of an estimator, we denote \(\alpha = Q^{\prime}\beta\) and \(\Lambda { = }diag\left( {\lambda_{1} ,...,\lambda_{p + 1} } \right) = Q^{\prime}\left( {X^{\prime}\hat{W}X} \right)Q,\) where \(\lambda_{1} \ge \lambda_{2} \ge \cdots \ge \lambda_{p + 1} > 0\) are the ordered eigenvalues of \(X^{\prime}\hat{W}X\), \(Q\) is the orthogonal matrix whose columns are the eigenvectors of \(X^{\prime}\hat{W}X\), and the jth element of \(Q^{\prime}\beta\) is denoted \(\alpha_{j} , j = 1,2,...,p + 1.\)

The asymptotic SMSE and the asymptotic MMSE of an estimator \(\hat{\beta } = H\hat{\beta }_{MLE} ,\) where \(H\) is a \(\left( {p + 1} \right) \times \left( {p + 1} \right)\) matrix, are defined as:

$$\begin{aligned} & MMSE\left( {\hat{\beta }} \right) = E\left[ {\left( {\hat{\beta } - \beta } \right)\left( {\hat{\beta } - \beta } \right)^{\prime } } \right] = H\,cov\left( {\hat{\beta }_{MLE} } \right)H^{\prime} + \left( {H\beta - \beta } \right)\left( {H\beta - \beta } \right)^{\prime } \\ & SMSE\left( {\hat{\beta }} \right) = E\left[ {\left( {\hat{\beta } - \beta } \right)^{\prime } \left( {\hat{\beta } - \beta } \right)} \right] = tr\left( {H\,cov\left( {\hat{\beta }_{MLE} } \right)H^{\prime}} \right) + \left( {H\beta - \beta } \right)^{\prime } \left( {H\beta - \beta } \right). \\ \end{aligned}$$
(15)

Note that the relationship \(SMSE\left( {\hat{\beta }} \right) = tr\left( {MMSE\left( {\hat{\beta }} \right)} \right)\) holds between the MMSE and SMSE criteria. Because of the relation \(\alpha = Q^{\prime}\beta\), \(\hat{\beta }_{MLE} , \hat{\beta }_{PRE} , \hat{\beta }_{PLE} , \hat{\beta }_{PLTE} , \hat{\beta }_{ILTE}\) and \(\hat{\beta }_{PRTE}\) have the same SMSE values as \(\hat{\alpha }_{MLE} , \hat{\alpha }_{PRE} , \hat{\alpha }_{PLE} , \hat{\alpha }_{PLTE} , \hat{\alpha }_{ILTE}\) and \(\hat{\alpha }_{PRTE}\), respectively.

Using (9), (13) and (14), it is easily computed that

$$\begin{aligned} MMSE\left( {\hat{\beta }_{ILTE} } \right) & = Q\left( {\left( {\Lambda + kI} \right)^{ - 1} \left( {\Lambda + f\left( k \right)I} \right)\Lambda^{ - 1} \left( {\Lambda + f\left( k \right)I} \right)\left( {\Lambda + kI} \right)^{ - 1} } \right. \\ & \quad \left. { + \left( {f\left( k \right) - k} \right)^{2} \left( {\Lambda + kI} \right)^{ - 1} \alpha \alpha^{\prime}\left( {\Lambda + kI} \right)^{ - 1} } \right)Q^{\prime} \\ \end{aligned}$$
(16)
$$\begin{aligned} MMSE\left( {\hat{\beta }_{{ILTE(PRE)}} } \right) & = Q\left( {\left( {\Lambda + kI} \right)^{{ - 1}} \left( {\Lambda + f\left( k \right)I} \right)\left( {\Lambda + kI} \right)^{{ - 1}} \Lambda \left( {\Lambda + kI} \right)^{{ - 1}} \left( {\Lambda + f\left( k \right)I} \right)\left( {\Lambda + kI} \right)^{{ - 1}} } \right. \\ & \quad \left. { + \left( {\Lambda + kI} \right)^{{ - 1}} \left( {f\left( k \right)\Lambda - 2k\Lambda - k^{2} I} \right)\left( {\Lambda + kI} \right)^{{ - 1}} \alpha \alpha ^{\prime } \left( {\Lambda + kI} \right)^{{ - 1}} \left( {f\left( k \right)\Lambda - 2k\Lambda - k^{2} I} \right)\left( {\Lambda + kI} \right)^{{ - 1}} } \right)Q^{\prime } . \\ \end{aligned}$$
(17)
$$\begin{aligned} MMSE\left( {\hat{\beta }_{PRTE} } \right) & = Q\left( {\left( {\Lambda + I} \right)^{ - 1} \left( {\Lambda + g\left( k \right)I} \right)\left( {\Lambda + kI} \right)^{ - 1} \Lambda \left( {\Lambda + kI} \right)^{ - 1} \left( {\Lambda + g\left( k \right)I} \right)\left( {\Lambda + I} \right)^{ - 1} } \right. \\ & \quad \left. { + \left( {\left( {g\left( k \right) - k - 1} \right)\Lambda - kI} \right)\left( {\Lambda + I} \right)^{ - 1} \left( {\Lambda + kI} \right)^{ - 1} \alpha \alpha^{\prime}\left( {\Lambda + kI} \right)^{ - 1} \left( {\Lambda + I} \right)^{ - 1} \left( {\left( {g\left( k \right) - k - 1} \right)\Lambda - kI} \right)} \right)Q^{\prime}. \\ \end{aligned}$$
(18)

Moreover, we can give the SMSE functions of ILTE, ILTE (PRE) and PRTE as follows:

$$SMSE\left( {\hat{\beta }_{ILTE} } \right) = \sum\limits_{j = 1}^{p + 1} {\frac{{\left( {\lambda_{j} + f\left( k \right)} \right)^{2} }}{{\lambda_{j} \left( {\lambda_{j} + k} \right)^{2} }}} + \sum\limits_{j = 1}^{p + 1} {\frac{{\left( {f\left( k \right) - k} \right)^{2} \alpha_{j}^{2} }}{{\left( {\lambda_{j} + k} \right)^{2} }}}$$
(19)
$$SMSE\left( {\hat{\beta }_{ILTE(PRE)} } \right) = \sum\limits_{j = 1}^{p + 1} {\frac{{\left( {\lambda_{j} + f(k)} \right)^{2} \lambda_{j} }}{{\left( {\lambda_{j} + k} \right)^{4} }}} + \sum\limits_{j = 1}^{p + 1} {\frac{{\left( {f(k)\lambda_{j} - 2k\lambda_{j} - k^{2} } \right)^{2} \alpha_{j}^{2} }}{{\left( {\lambda_{j} + k} \right)^{4} }}}$$
(20)
$$SMSE\left( {\hat{\beta }_{PRTE} } \right) = \sum\limits_{j = 1}^{p + 1} {\frac{{\lambda_{j} \left( {\lambda_{j} + g\left( k \right)} \right)^{2} }}{{\left( {\lambda_{j} + 1} \right)^{2} \left( {\lambda_{j} + k} \right)^{2} }}} + \sum\limits_{j = 1}^{p + 1} {\frac{{\left( {\left( {g\left( k \right) - k - 1} \right)\lambda_{j} - k} \right)^{2} \alpha_{j}^{2} }}{{\left( {\lambda_{j} + 1} \right)^{2} \left( {\lambda_{j} + k} \right)^{2} }}}$$
(21)

where, in each expression, the first term is the asymptotic variance and the second term is the squared bias. It should be noted that the MMSE and SMSE functions of ILTE(PRE) and PRTE are different. Also, the MMSE and SMSE functions of the other existing estimators can be obtained by appropriate selection of \(f\left( k \right)\) and \(g\left( k \right)\).
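For instance, the SMSE of the PRTE in (21) can be evaluated directly from the spectral quantities; a minimal sketch assuming lambda holds the eigenvalues of \(X^{\prime}\hat{W}X\) and alpha the vector \(Q^{\prime}\beta\):

```r
# SMSE of the PRTE, Eq. (21): asymptotic variance plus squared bias.
smse_prte <- function(lambda, alpha, k, g) {
  gk   <- g(k)
  varr <- sum(lambda * (lambda + gk)^2 / ((lambda + 1)^2 * (lambda + k)^2))
  bias <- sum(((gk - k - 1) * lambda - k)^2 * alpha^2 /
              ((lambda + 1)^2 * (lambda + k)^2))
  varr + bias
}
```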

Let \(\hat{\beta }_{1}\) and \(\hat{\beta }_{2}\) be any two estimators of the parameter \(\beta\). Then, \(\hat{\beta }_{2}\) is superior to \(\hat{\beta }_{1}\) in the MMSE sense if and only if \(MMSE\left( {\hat{\beta }_{1} } \right) - MMSE\left( {\hat{\beta }_{2} } \right)\) is a positive definite (pd) matrix. If \(MMSE\left( {\hat{\beta }_{1} } \right) - MMSE\left( {\hat{\beta }_{2} } \right)\) is a non-negative definite matrix, then \(SMSE\left( {\hat{\beta }_{1} } \right) - SMSE\left( {\hat{\beta }_{2} } \right) \ge 0.\) However, the reverse is not always true32.

In order to compare the MMSEs of the above-mentioned biased estimators, we use the following theorem.

Theorem 2.1

Let \(A\) be a positive definite matrix, namely \(A > 0,\) and let \(c\) be a nonzero vector. Then, \(A - cc^{\prime}\) is a positive definite matrix if and only if \(c^{\prime}A^{ - 1} c < 1\)33.

The superiority of the PRTE in PRMs

In this section, we compare the PRTE with the ILTE according to the MMSE criterion. We give a general theorem covering different choices of the \(g\left( k \right)\) and \(f\left( k \right)\) functions, so that the estimators mentioned above can be compared in the MMSE sense.

The following theorem is given to show the superiority of PRTE over ILTE.

Theorem 3.1.

Let \(k > 0\) and \(- \lambda_{j} - \frac{{\left( {\lambda_{j} + 1} \right)\left( {\lambda_{j} + f\left( k \right)} \right)}}{{\lambda_{j} }} < g\left( k \right) < - \lambda_{j} + \frac{{\left( {\lambda_{j} + 1} \right)\left( {\lambda_{j} + f\left( k \right)} \right)}}{{\lambda_{j} }}\) for \(j = 1,2,...,p + 1\). Then \(MMSE\left( {\hat{\beta }_{ILTE} } \right) - MMSE\left( {\hat{\beta }_{PRTE} } \right) > 0\) iff

$$\begin{aligned} & bias\left( {\hat{\beta }_{PRTE} } \right)^{\prime } Q\left( {\left( {\Lambda + kI} \right)^{ - 1} \left( {\Lambda + f\left( k \right)I} \right)\Lambda^{ - 1} \left( {\Lambda + kI} \right)^{ - 1} \left( {\Lambda + f\left( k \right)I} \right)} \right. \\ & \quad \left. { - \left( {\Lambda + I} \right)^{ - 1} \left( {\Lambda + g\left( k \right)I} \right)\left( {\Lambda + kI} \right)^{ - 1} \Lambda \left( {\Lambda + kI} \right)^{ - 1} \left( {\Lambda + g\left( k \right)I} \right)\left( {\Lambda + I} \right)^{ - 1} } \right)^{ - 1} Q^{\prime}bias\left( {\hat{\beta }_{PRTE} } \right) < 1 \\ \end{aligned}$$
(22)

where \(bias\left( {\hat{\beta }_{PRTE} } \right) = \left( {\left( {g\left( k \right) - k - 1} \right)\Lambda - kI} \right)Q\left( {\Lambda + I} \right)^{ - 1} \left( {\Lambda + kI} \right)^{ - 1} \alpha\).

Proof

Using (16) and (18), we obtain

$$\begin{aligned} & MMSE\left( {\hat{\beta }_{ILTE} } \right) - MMSE\left( {\hat{\beta }_{PRTE} } \right) = Q\left( {\left( {\Lambda + kI} \right)^{ - 1} \left( {\Lambda + f\left( k \right)I} \right)\Lambda^{ - 1} \left( {\Lambda + kI} \right)^{ - 1} \left( {\Lambda + f\left( k \right)I} \right)} \right. \\ & \quad \left. { - \left( {\Lambda + I} \right)^{ - 1} \left( {\Lambda + g\left( k \right)I} \right)\left( {\Lambda + kI} \right)^{ - 1} \Lambda \left( {\Lambda + kI} \right)^{ - 1} \left( {\Lambda + g\left( k \right)I} \right)\left( {\Lambda + I} \right)^{ - 1} } \right)Q^{\prime} - bias\left( {\hat{\beta }_{PRTE} } \right)bias\left( {\hat{\beta }_{PRTE} } \right)^{\prime } \\ & \quad = Q\,diag\left\{ {\frac{{\left( {\lambda_{j} + f\left( k \right)} \right)^{2} }}{{\left( {\lambda_{j} + k} \right)^{2} \lambda_{j} }} - \frac{{\lambda_{j} \left( {\lambda_{j} + g\left( k \right)} \right)^{2} }}{{\left( {\lambda_{j} + 1} \right)^{2} \left( {\lambda_{j} + k} \right)^{2} }}} \right\}_{j = 1}^{p + 1} Q^{\prime} - bias\left( {\hat{\beta }_{PRTE} } \right)bias\left( {\hat{\beta }_{PRTE} } \right)^{\prime } . \\ \end{aligned}$$

The matrix \(D = \left( {\Lambda + kI} \right)^{ - 1} \left( {\Lambda + f\left( k \right)I} \right)\Lambda^{ - 1} \left( {\Lambda + kI} \right)^{ - 1} \left( {\Lambda + f\left( k \right)I} \right) - \left( {\Lambda + I} \right)^{ - 1} \left( {\Lambda + g\left( k \right)I} \right)\left( {\Lambda + kI} \right)^{ - 1} \Lambda \left( {\Lambda + kI} \right)^{ - 1} \left( {\Lambda + g\left( k \right)I} \right)\left( {\Lambda + I} \right)^{ - 1}\) is pd if \(\left( {\lambda_{j} + 1} \right)^{2} \left( {\lambda_{j} + f\left( k \right)} \right)^{2} - \lambda_{j}^{2} \left( {\lambda_{j} + g\left( k \right)} \right)^{2} > 0.\) Thus, D is pd if \(- \lambda_{j} - \frac{{\left( {\lambda_{j} + 1} \right)\left( {\lambda_{j} + f\left( k \right)} \right)}}{{\lambda_{j} }} < g\left( k \right) < - \lambda_{j} + \frac{{\left( {\lambda_{j} + 1} \right)\left( {\lambda_{j} + f\left( k \right)} \right)}}{{\lambda_{j} }}\) and \(k > 0\), where \(j = 1,2,...,p + 1\). By Theorem 2.1, the proof is completed.

Determination of \(g\left( k \right)\) function

Since the performance of biased estimators depends on the choice of the biasing parameters, finding optimal biasing parameters for the proposed estimators is an important problem. Different techniques for estimating the biasing parameters in the PRE, PLE, PLTE, PSK and PHY are generalized from linear regression models by exploiting the similarities between linear regression models and PRMs5,15,16,17,18,19,23,30,34. The performance of the PRTE depends on the function \(g\left( k \right)\), and therefore only on the biasing parameter \(k\). Appropriate choices of the \(g\left( k \right)\) function that recover different estimators were given in the introduction. We now give a method to find the optimal \(g\left( k \right)\) function that approximately minimizes \(SMSE\left( {\hat{\beta }_{PRTE} } \right)\) with respect to \(k\). Our aim is to determine \(k\) and \(g\left( k \right)\) together so that \(SMSE\left( {\hat{\beta }_{PRTE} } \right)\) is approximately minimized; in other words, to choose \(k\) and \(g\left( k \right)\) such that the decrease in the variance term exceeds the increase in the squared bias. Note that \(SMSE\left( {\hat{\beta }_{PRTE} } \right)\) is a nonlinear function of the biasing parameter \(k\). Writing \(h\left( k \right) = SMSE\left( {\hat{\beta }_{PRTE} } \right)\) and differentiating with respect to \(k\), we obtain

$$h^{\prime}\left( k \right) = \sum\limits_{j = 1}^{p + 1} {\frac{{2\lambda_{j} \left( {\lambda_{j} - g^{\prime}\left( k \right)\lambda_{j} - g^{\prime}\left( k \right)k + g\left( k \right)} \right)\left[ {\alpha_{j}^{2} \left( {\left( {k + 1 - g\left( k \right)} \right)\lambda_{j} + k} \right) - \left( {\lambda_{j} + g\left( k \right)} \right)} \right]}}{{\left( {\lambda_{j} + 1} \right)^{2} \left( {\lambda_{j} + k} \right)^{3} }}}.$$

Setting \(h^{\prime}\left( k \right) = 0\) leads to the following two facts:

Fact 1

The differential equation \(\lambda_{j} \left( {\lambda_{j} - g^{\prime}\left( k \right)\lambda_{j} - g^{\prime}\left( k \right)k + g\left( k \right)} \right) = 0\) is obtained. Solving this differential equation, we obtain

$$g\left( k \right) = ck + \left( {c - 1} \right)\lambda_{j} ,$$
(23)

where \(c\) is the constant of integration.

Fact 2

The equation \(\alpha_{j}^{2} \left( {\left( {k + 1 - g\left( k \right)} \right)\lambda_{j} + k} \right) - \left( {\lambda_{j} + g\left( k \right)} \right) = 0\) is obtained. Solving for \(g\left( k \right)\) gives

$$\begin{array}{*{20}c} {g\left( k \right) = \frac{{\alpha_{j}^{2} \left( {\lambda_{j} + 1} \right)}}{{1 + \lambda_{j} \alpha_{j}^{2} }}k + \frac{{\left( {\alpha_{j}^{2} - 1} \right)}}{{1 + \lambda_{j} \alpha_{j}^{2} }}\lambda_{j} } & {or} & {g\left( k \right) = \frac{{\alpha_{j}^{2} \left( {\lambda_{j} + 1} \right)}}{{1 + \lambda_{j} \alpha_{j}^{2} }}k + \left( {\frac{{\alpha_{j}^{2} \left( {\lambda_{j} + 1} \right)}}{{1 + \lambda_{j} \alpha_{j}^{2} }} - 1} \right)\lambda_{j} } \\ \end{array} .$$
(24)

According to these two facts, it is convenient to choose \(g\left( k \right)\) as a linear function of the biasing parameter k. Note that the \(g\left( k \right)\) obtained in Fact 2 is a solution of the differential equation obtained in Fact 1. Based on these results, we can propose the following generalizations. First, note that the function \(g\left( k \right)\) given in (23) and (24) makes \(SMSE\left( {\hat{\alpha }_{PRTE} } \right)\) approximately minimal for a given j. Thus, \(g\left( k \right)\) depends on the eigenvalues of \(X^{\prime}\hat{W}X\), the unknown parameter \(\alpha\) and the estimate of the biasing parameter k. In other words, many functions can be determined from the functional relationships given in (23) and (24). For example, the following functional relationships can be proposed for determining \(g\left( k \right)\):

$$g_{1} \left( k \right) = c_{1} k + \left( {c_{1} - 1} \right)\lambda_{\min } \,{\text{where}}\,c_{1} \in \left( {0,1} \right),$$
(25)
$$g_{2} \left( k \right) = \frac{{\alpha_{\min }^{2} \left( {1 + \lambda_{\min } } \right)}}{{1 + \lambda_{\max } \alpha_{\max }^{2} }}k + \left( {\frac{{\alpha_{\min }^{2} \left( {1 + \lambda_{\min } } \right)}}{{1 + \lambda_{\max } \alpha_{\max }^{2} }} - 1} \right)\lambda_{\min } ,$$
(26)
$$g_{3} \left( k \right) = \frac{{\min \left( {\alpha_{j}^{2} \left( {\lambda_{j} + 1} \right)} \right)}}{{n\max \left( {1 + \lambda_{j} \alpha_{j}^{2} } \right)}}k + \left( {\frac{{\min \left( {\alpha_{j}^{2} \left( {\lambda_{j} + 1} \right)} \right)}}{{n\max \left( {1 + \lambda_{j} \alpha_{j}^{2} } \right)}} - 1} \right)\lambda_{\min } ,$$
(27)

where \(\alpha_{\min }^{2}\) and \(\alpha_{\max }^{2}\) are defined as the minimum and maximum value of \(\alpha_{j}^{2} , j = 1,2,...,p + 1,\) respectively. Similarly, \(\lambda_{\min }\) and \(\lambda_{\max }\) indicate the minimum and maximum value of the eigenvalue of \(X^{\prime}\hat{W}X\), respectively.
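The three rules (25)-(27) are straightforward to implement; a sketch with lambda, alpha and n as defined above:

```r
# Candidate g(k) rules of Eqs. (25)-(27).
g1 <- function(k, lambda, c1 = 0.5) c1 * k + (c1 - 1) * min(lambda)  # c1 in (0,1)
g2 <- function(k, lambda, alpha) {
  a <- min(alpha^2) * (1 + min(lambda)) / (1 + max(lambda) * max(alpha^2))
  a * k + (a - 1) * min(lambda)
}
g3 <- function(k, lambda, alpha, n) {
  a <- min(alpha^2 * (lambda + 1)) / (n * max(1 + lambda * alpha^2))
  a * k + (a - 1) * min(lambda)
}
```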

In this study, we examined only the first-degree polynomial functions given in (25) to (27) for the \(g\left( k \right)\) function. Note that \(g\left( k \right)\) can be selected as any continuous function of the biasing parameter k. The proposed biased estimator therefore depends on a single biasing parameter k, and an appropriate estimate of k must be used to control the conditioning of the \(X^{\prime}\hat{W}X\) matrix. Since the proposed estimator depends on a single biasing parameter k, the suitable estimates of k given in Månsson and Shukur18, Kibria et al.17 and Algamal25 can be used. In addition to the previously proposed estimators of the biasing parameter, we can also use the following estimators to estimate k:

$$\hat{k}_{PRTE} = \frac{{p\left( {\lambda_{\max } - \lambda_{\min } } \right)}}{n},\hat{k}_{PRTE} = \frac{{\max \left( {\lambda_{j} \hat{\alpha }_{j}^{2} } \right)}}{{\sum\nolimits_{j = 1}^{p + 1} {\hat{\alpha }_{j}^{2} } }},\hat{k}_{PRTE} = \left( {\prod\limits_{j = 1}^{p + 1} {\sqrt {\frac{1}{{\hat{\alpha }_{j}^{2} }}} } } \right)^{{\frac{1}{p + 1}}}$$

where \(m_{j} = \sqrt {\frac{{\hat{\sigma }^{2} }}{{\hat{\alpha }_{j}^{2} }}} ,j = 1,2,...,p + 1\) and \(\hat{\sigma }^{2} = \frac{1}{n - p - 1}\sum\limits_{i = 1}^{n} {\left( {y_{i} - \hat{y}_{i} } \right)^{2} }\).
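The three estimators of k displayed above translate directly into R; a sketch where alpha_hat denotes \(Q^{\prime}\hat{\beta }_{MLE}\) and lambda the eigenvalues of \(X^{\prime}\hat{W}X\) (function names are ours):

```r
# The three proposed estimators of the biasing parameter k.
k_hat1 <- function(lambda, p, n) p * (max(lambda) - min(lambda)) / n
k_hat2 <- function(lambda, alpha_hat) max(lambda * alpha_hat^2) / sum(alpha_hat^2)
k_hat3 <- function(alpha_hat) prod(sqrt(1 / alpha_hat^2))^(1 / length(alpha_hat))
```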

The Monte Carlo simulation studies

In this section, we designed two simulation schemes to compare the performances of different biased estimators in the PRMs. In the first simulation scheme, we examined the effects of the sample size (n), the degree of collinearity \(\left( \rho \right)\) and the number of explanatory variables \(\left( p \right)\) on the performance of the PRE, PLE, PLTE, PSK and PHY estimators and the PRTE, based on the suggested best biasing parameter estimates. In the second simulation design, we examined the effect of the biasing parameter on the performances of the PRTE and ILTE for each set of values \(\left( {n,\rho ,p,\sigma^{2} } \right)\). For both simulation designs, we generated the explanatory variables following Månsson and Shukur18, Kibria et al.17 and Kibria and Lukman35 as

$$x_{ij} = \left( {1 - \rho^{2} } \right)^{1/2} w_{ij} + \rho w_{i,p + 1} , \quad i = 1,2,...,n, \; j = 1,2,...,p,$$
(28)

where \(w_{ij}\) are independent standard normal pseudo-random numbers and \(\rho\) is specified such that the correlation between any two explanatory variables is \(\rho^{2}\). Four different degrees of correlation are investigated, corresponding to \(\rho = 0.85, 0.9, 0.99\) and \(0.999\). The number of explanatory variables is set to \(p = 2, 4, 8\) and 12. For each set of explanatory variables, the parameter \(\beta\) is selected as the normalized eigenvector corresponding to the largest eigenvalue of \(X^{\prime}X\), so that \(\beta^{\prime}\beta = 1\). We used the glm function in the R stats package4 and set the intercept term equal to 0.
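A sketch of one simulated design under this scheme (seed and dimensions chosen only for illustration):

```r
# Collinear regressors via Eq. (28); rho^2 is the pairwise correlation.
set.seed(123)
n <- 100; p <- 4; rho <- 0.99
Wm <- matrix(rnorm(n * (p + 1)), n, p + 1)   # independent N(0,1) draws w_ij
X  <- sqrt(1 - rho^2) * Wm[, 1:p] + rho * Wm[, p + 1]
beta <- eigen(t(X) %*% X)$vectors[, 1]       # largest-eigenvalue eigenvector,
                                             # so beta'beta = 1
y  <- rpois(n, exp(X %*% beta))              # intercept fixed at 0
```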

In the simulation and application sections, the proposed best biasing parameter estimators for PRE, PLE, PLTE, PSK, and PHY estimators are used based on the works of Månsson and Shukur18, Månsson et al.19, Kibria et al.17, Asar and Genç15, Alanaz and Algamal34, Çetinkaya and Kaçıranlar16, Qasim et al.23, Huang and Yang30.

To estimate k in the PRE, we used the best estimator of k, \(\hat{k}_{PRE} = \max \left( {\frac{1}{{m_{j} }}} \right)\), where \(m_{j} = \sqrt {\frac{{\hat{\sigma }^{2} }}{{\hat{\alpha }_{j}^{2} }}} ,j = 1,2,...,p\) and \(\hat{\sigma }^{2} = \frac{1}{n - p - 1}\sum\nolimits_{i = 1}^{n} {\left( {y_{i} - \hat{\mu }_{i} } \right)^{2} }\), as recommended by Kibria et al.17.

According to the results given by Qasim et al.23, we used the best estimator of d in the PLE as \(\hat{d}_{PLE} = \max \left( {0,\min \left( {\frac{{\hat{\alpha }_{j}^{2} - 1}}{{\max \left( {\frac{1}{{\lambda_{j} }}} \right) + \hat{\alpha }_{\max }^{2} }}} \right)} \right).\)

For PLTE, the biasing parameters k and d are estimated by grouping them in three different ways as follows:

PLTE I: \(\hat{k}_{PLTE} = \max \left( {\frac{1}{{m_{j} }}} \right)\) where \(m_{j} = \sqrt {\frac{{\hat{\sigma }^{2} }}{{\hat{\alpha }_{j}^{2} }}} ,j = 1,2,...,p\) and \(\hat{d}_{PLTE} = \frac{{\sum\nolimits_{j = 1}^{p} {\frac{{1 - \hat{k}_{PLTE} \hat{\alpha }_{j}^{2} }}{{\left( {\lambda_{j} + \hat{k}_{PLTE} } \right)^{2} }}} }}{{\sum\nolimits_{j = 1}^{p} {\frac{{1 + \lambda_{j} \hat{\alpha }_{j}^{2} }}{{\lambda_{j} \left( {\lambda_{j} + \hat{k}_{PLTE} } \right)^{2} }}} }}\).

PLTE II: \(\hat{k}_{PLTE} = \frac{{\lambda_{1} - 100 \lambda_{p} }}{99}\) and \(\hat{d}_{PLTE} = \frac{{\sum\nolimits_{j = 1}^{p} {\frac{{1 - \hat{k}_{PLTE} \hat{\alpha }_{j}^{2} }}{{\left( {\lambda_{j} + \hat{k}_{PLTE} } \right)^{2} }}} }}{{\sum\nolimits_{j = 1}^{p} {\frac{{1 + \lambda_{j} \hat{\alpha }_{j}^{2} }}{{\lambda_{j} \left( {\lambda_{j} + \hat{k}_{PLTE} } \right)^{2} }}} }}\).

PLTE III: \(\hat{d}_{PLTE} = \frac{1}{2}\min \left\{ {\frac{{\lambda_{j} }}{{1 + \lambda_{j} \hat{\alpha }_{j}^{2} }}} \right\}, j = 1,2,...,p\) and \(\hat{k}_{PLTE} = \frac{1}{p}\sum\limits_{j = 1}^{p} {\frac{{\lambda_{j} - \hat{d}_{PLTE}^{*} \left( {1 + \lambda_{j} \hat{\alpha }_{j}^{2} } \right)}}{{\lambda_{j} \hat{\alpha }_{j}^{2} }}}\).

Sakallıoğlu and Kaçıranlar31 did not provide a specific technique for estimating the biasing parameters k and d for the SK estimator. Therefore, we used the following estimators for the biasing parameters k and d in the PSK:

PSK: \(\hat{k}_{PSK} = \max \left( {\frac{1}{{m_{j} }}} \right)\) where \(m_{j} = \sqrt {\frac{{\hat{\sigma }^{2} }}{{\hat{\alpha }_{j}^{2} }}} ,j = 1,2,...,p\) and \(\hat{d}_{PSK} = \frac{{\sum\nolimits_{j = 1}^{p} {\frac{{\lambda_{j} \left( {\hat{\alpha }_{j}^{2} - 1} \right)}}{{\left( {\lambda_{j} + 1} \right)^{2} \left( {\lambda_{j} + \hat{k}_{PSK} } \right)^{2} }}} }}{{\sum\nolimits_{j = 1}^{p} {\frac{{\lambda_{j} \left( {1 + \lambda_{j} \hat{\alpha }_{j}^{2} } \right)}}{{\left( {\lambda_{j} + 1} \right)^{2} \left( {\lambda_{j} + \hat{k}_{PSK} } \right)^{2} }}} }}\).

Moreover, we used the methods proposed by Huang and Yang30 to estimate the parameters of the PHY estimator. Huang and Yang30 proposed two methods, which we refer to as (K1, D1) and (K2, D2) (see Huang and Yang30 for details), adapting them to the PHY estimator in PRMs. The estimator obtained with (K1, D1) is denoted PHY I, and the estimator obtained with (K2, D2) PHY II.

We used the following \(g\left( k \right)\) functions together with the k estimator to determine the PRTE:

PRTE I: \(\hat{k}_{{PRTE {\text{I}}}} = \frac{1}{n}\left( {p\lambda_{\max } - \left( {p + 1} \right)\lambda_{\min } } \right)\) and \(g\left( k \right) = \frac{{\left( {1 + \lambda_{\min } } \right)\alpha_{\min }^{2} }}{{1 + \lambda_{\max } \alpha_{\max }^{2} }}k + \left( {\frac{{\left( {1 + \lambda_{\min } } \right)\alpha_{\min }^{2} }}{{1 + \lambda_{\max } \alpha_{\max }^{2} }} - 1} \right)\lambda_{\min }\).

PRTE II: \(\hat{k}_{{PRTE {\text{II}}}} = \frac{{p\lambda_{\max } \alpha_{{{\text{med}}}}^{2} }}{{n\alpha_{{{\text{mean}}}}^{2} }}\) and \(g\left( k \right) = \frac{{\left( {1 + \lambda_{\max } } \right)\alpha_{\min }^{2} }}{{p\left( {1 + \lambda_{\max } \alpha_{\max }^{2} } \right)}}k + \left( {\frac{{\left( {1 + \lambda_{\max } } \right)\alpha_{\min }^{2} }}{{p\left( {1 + \lambda_{\max } \alpha_{\max }^{2} } \right)}} - 1} \right)\lambda_{\min }\).

PRTE III: \(\hat{k}_{{PRTE {\text{III}}}} = \frac{p}{n}\left( {\lambda_{\max } - \lambda_{\min } } \right)\) and \(g\left( k \right) = \frac{{\min \left( {\left( {1 + \lambda_{j} } \right)\alpha_{j}^{2} } \right)}}{{n\max \left( {1 + \lambda_{j} \alpha_{j}^{2} } \right)}}k + \left( {\frac{{\min \left( {\left( {1 + \lambda_{j} } \right)\alpha_{j}^{2} } \right)}}{{n\max \left( {1 + \lambda_{j} \alpha_{j}^{2} } \right)}} - 1} \right)\lambda_{\min }\).

PRTE IV: \(\hat{k}_{{PRTE {\text{IV}}}} = \frac{{p\max \left( {\lambda_{j} \alpha_{j}^{2} } \right)}}{{n\alpha_{{{\text{mean}}}}^{2} }}\) and \(g\left( k \right) = \min \left( {\frac{{\left( {1 + \lambda_{j} } \right)\alpha_{j}^{2} }}{{n\left( {1 + \lambda_{j} \alpha_{j}^{2} } \right)}}} \right)k + \left( {\min \left( {\frac{{\left( {1 + \lambda_{j} } \right)\alpha_{j}^{2} }}{{n\left( {1 + \lambda_{j} \alpha_{j}^{2} } \right)}}} \right) - 1} \right)\lambda_{\min }\). Here \(\alpha_{{{\text{med}}}}^{2}\) and \(\alpha_{{\text{mean}}}^{2}\) denote the median and mean of \(\alpha_{j}^{2} , j = 1,2,...,p + 1,\) respectively.

The estimated MSE (EMSE) is used as the basis for comparing the proposed estimators; for an estimator \(\hat{\beta }\) of \(\beta\) it is calculated as

$$EMSE\left( {\hat{\beta }} \right) = \frac{1}{N}\sum\limits_{r = 1}^{N} {\left( {\hat{\beta }_{r} - \beta } \right)^{\prime } \left( {\hat{\beta }_{r} - \beta } \right)} ,$$
(29)

where \(\left( {\hat{\beta }_{r} - \beta } \right)\) is the difference between the estimated and true parameter vectors at the rth replication and N is the number of replications. For each combination of n, p and \(\rho\), the experiment was replicated 2000 times by generating new response variables. Our Monte Carlo simulation studies were conducted using the R programming language. The results are given in Tables 1, 2, 3 and 4 for \(p = 2, 4, 8\) and 12, respectively.
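Eq. (29) amounts to averaging squared Euclidean distances over replications; a one-line sketch where B_hat is an N x (p+1) matrix whose rth row is the estimate from replication r:

```r
# EMSE of Eq. (29): mean squared distance between estimates and the truth.
emse <- function(B_hat, beta) mean(rowSums(sweep(B_hat, 2, beta)^2))
```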

Table 1 The EMSE values of the estimators when \(p = 2.\)
Table 2 The EMSE values of the estimators when \(p = 4.\)
Table 3 The EMSE values of the estimators when \(p = 8.\)
Table 4 The EMSE values of the estimators when \(p = 12.\)

The bold numbers in the tables show the estimators with the smallest EMSE values, and in addition, the signs (*), (**), and (***) represent the first, second, and third smallest EMSE values in each row, respectively. The results from Tables 1, 2, 3 and 4 are listed below:

  1. According to the results from Tables 1, 2, 3 and 4, it can be seen that the degree of correlation \(\left( \rho \right),\) the number of explanatory variables \(\left( p \right)\) and the sample size \(\left( n \right)\) have different effects on all estimators in the simulation.

  2. The EMSE values of PRTE I, PRTE II, PRTE III and PRTE IV are smaller than those of the other existing biased estimators. Although our proposed estimators outperformed the existing estimators in all cases, which of the four performs best varies with the values of \(n, p\) and \(\rho\).

  3. When the number of variables p and \(\rho\) are kept constant, the number of observations in the model did not have a significant effect on PRTE I, PRTE II, PRTE III, and PRTE IV.

  4. Regardless of the n and p values, PRTE I, PRTE II, PRTE III, and PRTE IV tended to give low EMSE values at high correlation.

  5. When the number of observations \(\left( n \right)\) and correlation \(\left( \rho \right)\) in the model are kept constant, the EMSE values for PRTE I, PRTE II, PRTE III, and PRTE IV decrease as the number of explanatory variables \(\left( p \right)\) increases.

In summary, all our proposed estimators outperformed the other considered estimators in all scenarios, although they outperform one another in different scenarios owing to the different choices of k and the \(g\left( k \right)\) function. We also observe that the number of observations has a relatively small effect on the EMSE values compared to \(\rho\) and p. In other words, the PRTEs are robust with respect to the number of observations and therefore give very good results in the case of high collinearity.

In the second simulation scheme, we examined the effects of the biasing parameter k on the performances of the ILTEs and the PRTE when the sample size \(\left( n \right)\), degree of collinearity \(\left( \rho \right)\) and number of explanatory variables \(\left( p \right)\) are held constant. The purpose of this simulation is to examine the performances of the ILTE and PRTE at various values of the biasing parameter k according to the EMSE values given in (29). The biasing parameter k was not estimated in this scheme; instead, the EMSE values obtained by increasing k over the range \(\left[ {0, 2} \right]\) in steps of 0.1 were compared. Many \(f\left( k \right)\) and \(g\left( k \right)\) functions could be considered in evaluating these estimators; as an example, under several settings of n, p and \(\rho\), the ILTEs and PRTE determined by the following \(f\left( k \right)\) and \(g\left( k \right)\) functions are considered:

  • \(\hat{\beta }_{ILTE} = \left( {X^{\prime}\hat{W}X + kI} \right)^{ - 1} \left( {X^{\prime}\hat{W}X + f\left( k \right)I} \right)\hat{\beta }_{MLE}\) where \(f\left( k \right) = \frac{{\lambda_{\min } \alpha_{\min }^{2} }}{{1 + \lambda_{\max } \alpha_{\max }^{2} }}k + \left( {\frac{{\lambda_{\min } \alpha_{\min }^{2} }}{{1 + \lambda_{\max } \alpha_{\max }^{2} }} - 1} \right)\lambda_{\min }\)

  • \(\hat{\beta }_{ILTE(PRE)} = \left( {X^{\prime}\hat{W}X + kI} \right)^{ - 1} \left( {X^{\prime}\hat{W}X + f\left( k \right)I} \right)\hat{\beta }_{PRE}\) where \(f\left( k \right) = \frac{{\alpha_{\min }^{2} }}{{1 + \lambda_{\max } \alpha_{\max }^{2} }}\left( {k + \lambda_{\min } } \right)^{2} - \left( {k + \lambda_{\min } } \right)\)

  • \(\hat{\beta }_{PRTE} = \left( {X^{\prime}\hat{W}X + I} \right)^{ - 1} \left( {X^{\prime}\hat{W}X + g\left( k \right)I} \right)\hat{\beta }_{PRE}\) where \(g\left( k \right) = \frac{{\left( {\lambda_{\min } + 1} \right)\alpha_{\min }^{2} }}{{1 + \lambda_{\max } \alpha_{\max }^{2} }}k + \left( {\frac{{\left( {\lambda_{\min } + 1} \right)\alpha_{\min }^{2} }}{{1 + \lambda_{\max } \alpha_{\max }^{2} }} - 1} \right)\lambda_{\min }\)

Note that when we use \(\hat{\beta }_{PRE}\) instead of \(\hat{\beta }^{*}\) in \(\hat{\beta }_{ILTE}\), the resulting estimator is denoted \(\hat{\beta }_{ILTE(PRE)}\). The \(f\left( k \right)\) functions used in \(\hat{\beta }_{ILTE}\) and \(\hat{\beta }_{ILTE(PRE)}\) were determined according to the rules given by Akay and Ertan5. Note that when the method of Akay and Ertan5 is applied to \(\hat{\beta }_{ILTE(PRE)}\), the \(f\left( k \right)\) that minimizes the \(SMSE\left( {\hat{\beta }_{ILTE(PRE)} } \right)\) function is a quadratic function.

We considered the cases \(\rho = 0.9, 0.99, 0.999\), \(n = 50, 100, 500\), and \(p = 4, 8, 12\). For these n, \(\rho\) and p values, the explanatory variables are generated according to (28). The simulation is repeated 2000 times for each k value. The results are given graphically in Figs. 1, 2 and 3.

Figure 1

The EMSE values of ILTE, ILTE(PRE), PRTE as a function of k values where \(\rho = 0.9\).

Figure 2

The EMSE values of ILTE, ILTE(PRE), PRTE as a function of k values where \(\rho = 0.99\).

Figure 3

The EMSE values of ILTE, ILTE(PRE), PRTE as a function of k values where \(\rho = 0.999\).

According to Figs. 1, 2 and 3, we can obtain the following results depending on each set of the values \(\left( {n,\rho ,p} \right)\);

  1. At small values of the biasing parameter k, the PRTE outperforms both the ILTE and the ILTE(PRE). Although the PRTE and ILTE(PRE) both include the PRE, the performance of the ILTE(PRE) is quite poor compared to the PRTE at small values of the biasing parameter.

  2. When the collinearity between the explanatory variables is relatively low, i.e. \(\rho = 0.9\), the ILTE(PRE) exhibits quite different behavior from the ILTE and PRTE. As the correlation among the explanatory variables and the number of explanatory variables increase, the ILTE, ILTE(PRE) and PRTE show almost the same behavior. However, the PRTE behaves more consistently across varying values of the biasing parameter k.

As a result of the second simulation design, we recommend the PRTE to researchers. In general, the performance of these estimators depends on the \(f\left( k \right)\) and \(g\left( k \right)\) functions, respectively. In practice, these functions should be replaced with suitable functional relationships that can hold between the biasing parameters.

Numerical example: the aircraft damage data

In this section, the aircraft damage data are reanalyzed to demonstrate the benefits of the PRTE. This dataset consists of 30 observations with three explanatory variables. The first variable \(\left( {x_{1} } \right)\) is a dichotomous variable indicating the type of aircraft. The explanatory variables \(\left( {x_{2} } \right)\) and \(\left( {x_{3} } \right)\) are the bomb load in tons and the total months of aircrew experience, respectively. The count variable y is the number of locations where damage was inflicted on the aircraft3. This dataset is also used by Myers et al.3, Asar and Genç15, Amin et al.7, Lukman et al.36, and Akay and Ertan5.

Asar and Genç15, Amin et al.7 and Akay and Ertan5 considered the model \(\mu = \exp \left( {\beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \beta_{3} x_{3} } \right)\). Excluding the intercept term, the eigenvalues of \(X^{\prime}X\) are 208,522.5106, 374.8961 and 4.3333; thus, the condition number is 48,120.9495, indicating a severe multicollinearity problem among the explanatory variables. The variables are first standardized, and then the intercept term is added to the vector of variables. The eigenvalues of the matrix \(X^{\prime}\hat{W}X\) are \(\lambda_{1} = 47.5850\), \(\lambda_{2} = 2.2844\), \(\lambda_{3} = 1.4097\) and \(\lambda_{4} = 0.3681\). The condition number is 129.2719, considerably larger than 30, indicating that the MLE is still affected by multicollinearity. The numerical results comparing the PRTEs with the other existing estimators are given in Table 5.
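The condition-number check quoted above is a one-liner; a sketch assuming X holds the standardized design with intercept and W the diagonal weight matrix from the fitted MLE:

```r
# Condition number of X'W-hat X: ratio of extreme eigenvalues.
ev <- eigen(t(X) %*% W %*% X, only.values = TRUE)$values
max(ev) / min(ev)   # 47.5850 / 0.3681 = 129.2719 for the aircraft data
```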

Table 5 The estimated parameter values and the SMSE values of the estimators.

In addition, the bootstrap sampling method is used to calculate the SMSE values of the given biased estimators. For this purpose, 10,000 bootstrap samples were created, and for each sample the parameter estimates of the given biased estimators were calculated. The mean of the MLE estimates is taken as the true parameter vector. The resulting SMSE values are given in Table 5, from which it can be seen that the estimators with the best SMSE values are PRTE I and PRTE III.
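A sketch of this bootstrap scheme for the MLE (the other estimators are handled analogously), assuming X and y hold the standardized aircraft data; glm warnings on individual resamples are possible:

```r
# Bootstrap SMSE: resample rows, refit, and average squared distances
# to the mean MLE estimate, which plays the role of the true parameter.
B <- 10000
boot_mle <- replicate(B, {
  idx <- sample(nrow(X), replace = TRUE)
  coef(glm(y[idx] ~ X[idx, ] - 1, family = poisson))
})
beta_ref <- rowMeans(boot_mle)                     # reference "true" beta
smse_mle <- mean(colSums((boot_mle - beta_ref)^2))
```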

Now, we want to examine the performances of ILTE, ILTE(PRE), and PRTE, which were examined in the previous section. Figure 4 graphically shows the estimated variance values of these estimators based on the value of the biasing parameter k. Also, Fig. 5 shows the SMSE performance of \(\hat{\beta }_{ILTE}\), \(\hat{\beta }_{ILTE(PRE)}\) and \(\hat{\beta }_{PRTE}\) estimators according to the biasing parameter k.

Figure 4

The estimated variance values of ILTE, ILTE(PRE) and PRTE as a function of k.

Figure 5

The SMSE values of ILTE, ILTE(PRE) and PRTE as a function of k.

Figures 4 and 5 indicate that the proposed PRTE is a strong alternative to the other estimators at small values of the biasing parameter k. This result is also compatible with the second simulation results given in the previous section.

To compare the estimators in the MMSE sense, the parameter estimate obtained with the bootstrap sampling method is used in place of the unknown parameter \(\alpha\). R is used with tolerance \(10^{ - 12}\) to decide whether the MMSE differences are positive definite (pd): if any eigenvalue is less than or equal to the tolerance, the matrix is not pd; otherwise, it is pd.

Finally, our aim in this part is to compare the estimators obtained from various choices of the \(f\left( k \right)\) and \(g\left( k \right)\) functions, using the theorem given in "The superiority of the PRTE in PRMs". To illustrate Theorem 3.1, the functions \(f\left( k \right)\) and \(g\left( k \right)\) are taken as \(f\left( k \right) = 0.05k + 0.05\) and \(g\left( k \right) = 0.5k - 0.05\), respectively. In this case, \({\text{cov}} \left( {\hat{\beta }_{ILTE} } \right) - {\text{cov}} \left( {\hat{\beta }_{PRTE} } \right)\) is a pd matrix for \(0 < k \le 2.0057\). Also, the k values satisfying criterion (22) are \(0 < k < 2.0054\). Consequently, \(MMSE\left( {\hat{\beta }_{ILTE} } \right) - MMSE\left( {\hat{\beta }_{PRTE} } \right)\) is a pd matrix for \(0 < k < 2.0054\).

Some concluding remarks

In this article, we defined a new general class of estimators, named the PRTE, as an alternative to the MLE and the other existing biased estimators in the presence of multicollinearity in PRMs. The PRTE is a general estimator which includes other biased estimators, such as the PRE, PLE, PHY and PSK estimators, as special cases. We proposed several rules for determining the function \(g\left( k \right)\). Using Monte Carlo simulations, the performance of the proposed PRTE was compared with that of the existing estimators in the smaller-EMSE sense. The results show that the proposed PRTE outperforms the existing estimators in the case of high multicollinearity. In addition, the ILTEs and PRTE were compared in a general simulation study according to the values of the biasing parameter k, and the PRTE was observed to be superior at small values of k. Although the PRTE and ILTE(PRE) both depend on the PRE, the main advantage of the PRTE over the ILTE(PRE) is that it can minimize the SMSE function with the help of a linear function of the biasing parameter k. The estimators were also applied to a real dataset, and the results were consistent with the simulation study. Under the experimental conditions examined, the proposed biased estimator outperforms the other existing biased estimators. Therefore, based on the results of the simulations and the example, the PRTEs are recommended to practitioners when there is a multicollinearity problem in PRMs.