Performance of the Ridge and Liu Estimators in the Zero-Inflated Bell Regression Model

Department of Statistics and Informatics, University of Mosul, Mosul, Iraq
College of Engineering, University of Warith Al-Anbiyaa, Karbala, Iraq
Department of Epidemiology and Biostatistics, University of Medical Sciences, Ondo, Nigeria
Department of Applied Statistics and Econometrics, Faculty of Graduate Studies for Statistical Research, Cairo University, Giza, Egypt
Department of Quantitative Analysis, College of Business Administration, King Saud University, Riyadh, Saudi Arabia


Introduction
The generalized linear model (GLM) is employed when the response variable does not follow a Gaussian (normal) distribution [1][2][3][4][5][6][7][8][9][10][11]. Count data are common in modeling, especially in economics and medicine. In practice, the Poisson regression model is the most widely adopted model for count data because of its simplicity [12,13]. The Poisson distribution assumes that the mean equals the variance. A major drawback of the Poisson regression model is therefore that it cannot accommodate overdispersion (variance larger than the mean), which usually occurs in count data. Castellares et al. [14] introduced an alternative discrete model, the Bell regression model, for count data with overdispersion. Recently, Amin et al. [15] and Majid et al. [16] introduced the ridge estimator and the Liu estimator, respectively, for estimating the parameters of the Bell regression model under multicollinearity. A limitation of the Bell regression model is that it does not account for excess zeros in the count data. Counts with many zeros (zero inflation) are unavoidable in many disciplines, including medicine, public health, environmental sciences, agriculture, and manufacturing applications [12,17]. For such count data, the zero-inflated Bell regression model provides a better fit [12].
In real applications, when there is multicollinearity among the explanatory variables, the matrix $X^{T}X$ is nearly singular, which in turn inflates the variance of the maximum likelihood estimator (MLE). Therefore, traditional estimation methods such as the MLE tend to perform poorly. The ridge, Liu, Liu-type, and other estimators proposed by several authors are alternatives to the MLE for overcoming multicollinearity in the linear regression model [18,19]. These estimators have been extended to GLMs [20][21][22][23][24][25][26][27][28][29]. Robust estimators have also been proposed to handle multicollinearity and outliers together (see [30,31]).
Following Lemonte et al. [12], the zero-inflated Bell regression model (ZIBRM) can be used to model count data with overdispersion. It is also clear from the literature that multicollinearity is often a problem among continuous explanatory variables. To our knowledge, no study has accounted for multicollinearity in the zero-inflated Bell regression model. The main objective of this study is to develop the ridge and Liu estimators for the ZIBRM for modeling count data with overdispersion. The proposed estimators are expected to be more efficient than some of the existing estimators for GLMs. Their superiority will be established through simulated examples and real-life applications. The paper is organized as follows. Section 2 describes the zero-inflated Bell regression model and its parameter estimation by maximum likelihood. Section 3 proposes the ridge and Liu estimators for the zero-inflated Bell regression model and defines different shrinkage parameters for the two estimators. The simulation study and two applications are presented in Sections 4 and 5, respectively. Concluding remarks are provided in Section 6.

Zero-Inflated Bell Regression Model
We assume that $(y_i, x_i)$, $i = 1, 2, \ldots, n$, are independent observations with predictor vector $x_i \in \mathbb{R}^{p+1}$ and response variable $y_i$, which follows the Bell distribution. Then, the probability mass function of $y_i$ can be expressed as follows:

$$P(Y = y) = \frac{\theta^{y} e^{-e^{\theta}+1} B_{y}}{y!}, \quad y = 0, 1, 2, \ldots, \tag{1}$$

where $\theta > 0$ and $B_{y} = (1/e)\sum_{d=0}^{\infty} d^{y}/d!$ are the Bell numbers [32,33]. The mean and variance of the Bell distribution are, respectively,

$$E(Y) = \theta e^{\theta}, \qquad \operatorname{Var}(Y) = \theta(1+\theta)e^{\theta}.$$

We assume that $\psi = \theta e^{\theta}$ and $\theta = W_{0}(\psi)$, where $W_{0}(\cdot)$ is the Lambert function. Then, equation (1) can be written in the new parameterization as

$$P(Y = y) = \exp\left(1 - e^{W_{0}(\psi)}\right)\frac{W_{0}(\psi)^{y} B_{y}}{y!}, \quad y = 0, 1, 2, \ldots$$

In the GLM, the mean of the response variable, $\mu_i = E(y_i)$, is related to a linear function of the predictors through a link function. The linear function is stated as

$$\eta_i = x_i^{T}\beta.$$

The link function relates the mean to the linear predictor, $g(\mu_i) = \eta_i$. The Bell regression model (BRM) can be modeled by assuming the log link, $\mu_i = \exp(x_i^{T}\beta)$. The parameters of the BRM are estimated by the MLE using the iteratively reweighted least-squares algorithm. The log-likelihood is defined as follows:

$$\ell(\beta) = \sum_{i=1}^{n}\left[y_i \log W_{0}\left(e^{x_i^{T}\beta}\right) + 1 - e^{W_{0}\left(e^{x_i^{T}\beta}\right)} + \log B_{y_i} - \log(y_i!)\right]. \tag{5}$$

Then, the MLE is obtained by setting the first derivative of equation (5) to zero. The resulting equation cannot be solved analytically because it is nonlinear in $\beta$. Fisher's scoring algorithm can be used to obtain the MLE, where in each iteration the parameter is updated by

$$\beta^{(r+1)} = \beta^{(r)} + I^{-1}\left(\beta^{(r)}\right) S\left(\beta^{(r)}\right),$$

where $r$ is the iteration counter, $S(\beta) = \partial \ell(\beta)/\partial \beta$ is the score vector, and $I^{-1}(\beta) = \left(-E\left(\partial^{2}\ell(\beta)/\partial\beta\,\partial\beta^{T}\right)\right)^{-1}$. Subsequently, the estimated coefficients are defined as

$$\hat{\beta}_{MLE} = \left(X^{T}\hat{W}X\right)^{-1}X^{T}\hat{W}\hat{z},$$

where $\hat{W}$ is the estimated weight matrix and $\hat{z}$ is the working response vector, both evaluated at the final iteration. The MLE is asymptotically normal with covariance matrix

$$\operatorname{Cov}\left(\hat{\beta}_{MLE}\right) = \left(X^{T}\hat{W}X\right)^{-1}. \tag{8}$$

It is well known that count data often contain an excessive number of zeros, that is, more zero counts are observed than the model predicts. Count data with many zeros (zero inflation) arise in many real applications [34,35]. The BRM is inadequate when the sample contains excess zeros.
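As a numerical check of equation (1), the following minimal Python sketch evaluates the Bell probability mass function and compares the mean and variance obtained by summation with $\theta e^{\theta}$ and $\theta(1+\theta)e^{\theta}$. The value $\theta = 0.8$, the truncation point of 80, and the function names are illustrative assumptions, not part of the paper.

```python
import math

def bell_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] computed with the Bell triangle."""
    row, numbers = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]            # a row starts with the last entry of the previous row
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        numbers.append(row[0])         # the first entry of each row is the next Bell number
    return numbers

def bell_pmf(y, theta, bell):
    """Equation (1): theta^y * exp(1 - e^theta) * B_y / y!."""
    return theta ** y * math.exp(1.0 - math.exp(theta)) * bell[y] / math.factorial(y)

theta = 0.8                            # illustrative value (assumption)
bell = bell_numbers(80)                # the support is truncated at y = 80 for the check
probs = [bell_pmf(y, theta, bell) for y in range(81)]

total = sum(probs)
mean = sum(y * p for y, p in enumerate(probs))
variance = sum(y * y * p for y, p in enumerate(probs)) - mean ** 2

print("sum of probabilities:", total)
print("mean:", mean, "vs theta*e^theta:", theta * math.exp(theta))
print("variance:", variance, "vs theta*(1+theta)*e^theta:", theta * (1 + theta) * math.exp(theta))
```

Because the variance exceeds the mean by the factor $1 + \theta$, the check also illustrates why the Bell distribution can accommodate overdispersion.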
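In this study, we introduce the zero-inflated Bell regression model (ZIBRM) as an alternative way to model count data with excess zeros. The ZIBRM can then be formulated as follows:

$$P(Y = y) = \begin{cases} \pi + (1-\pi)\exp\left(1 - e^{W_{0}(\theta)}\right), & y = 0, \\ (1-\pi)\exp\left(1 - e^{W_{0}(\theta)}\right)\dfrac{W_{0}(\theta)^{y} B_{y}}{y!}, & y = 1, 2, \ldots, \end{cases} \tag{9}$$

where $\pi \in (0, 1)$ and $\theta$ now denotes the mean of the Bell component, that is, it plays the role of $\psi$ in the reparameterized form of equation (1). In notation, $y \sim \mathrm{ZIBell}(\theta, \pi)$. Equation (9) is simple to work with and does not involve any complicated function. The mean and variance of equation (9) are

$$E(Y) = (1-\pi)\theta, \qquad \operatorname{Var}(Y) = (1-\pi)\theta\left[1 + W_{0}(\theta) + \theta\pi\right].$$

In addition, the index of dispersion reduces to $1 + W_{0}(\theta) + \theta\pi$.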
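The zero-inflated mixture can be checked numerically. The sketch below assumes that $\theta$ is the mean of the Bell component, so that the Bell parameter is $W_{0}(\theta)$, and verifies by summation that the mean is $(1-\pi)\theta$ and the variance-to-mean ratio is $1 + W_{0}(\theta) + \theta\pi$. The Newton solver for the Lambert function, the values $\theta = 2$ and $\pi = 0.3$, and all function names are illustrative assumptions.

```python
import math

def lambert_w0(x, tol=1e-12):
    """Principal branch W_0(x) for x >= 0, by Newton's method on w * e^w = x."""
    w = math.log1p(x)                  # starting value at or above the root
    for _ in range(100):
        step = (w * math.exp(w) - x) / (math.exp(w) * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w

def bell_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] computed with the Bell triangle."""
    row, numbers = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        numbers.append(row[0])
    return numbers

def zibell_pmf(y, theta, pi, bell):
    """Zero-inflated Bell probability with Bell-component mean theta and mixing weight pi."""
    w = lambert_w0(theta)
    base = math.exp(1.0 - math.exp(w)) * w ** y * bell[y] / math.factorial(y)
    return pi + (1.0 - pi) * base if y == 0 else (1.0 - pi) * base

theta, pi = 2.0, 0.3                   # illustrative values (assumption)
bell = bell_numbers(80)
probs = [zibell_pmf(y, theta, pi, bell) for y in range(81)]

mean = sum(y * p for y, p in enumerate(probs))
variance = sum(y * y * p for y, p in enumerate(probs)) - mean ** 2
dispersion_index = variance / mean

print("mean:", mean, "vs (1 - pi) * theta:", (1 - pi) * theta)
print("dispersion index:", dispersion_index,
      "vs 1 + W0(theta) + theta * pi:", 1 + lambert_w0(theta) + theta * pi)
```

The extra term $\theta\pi$ in the index shows how zero inflation adds dispersion on top of that already present in the Bell component.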
In terms of the GLM, two link functions are used in modeling zero-inflated regression:

$$\log(\theta_i) = x_i^{T}\beta, \qquad \log\left(\frac{\pi_i}{1-\pi_i}\right) = s_i^{T}\zeta, \tag{10}$$

where $\beta = (\beta_1, \ldots, \beta_p)^{T}$ and $\zeta = (\zeta_1, \ldots, \zeta_q)^{T}$ are vectors of unknown regression coefficients, which are assumed to be functionally independent, and $x_i^{T} = (x_{i1}, \ldots, x_{ip})$ and $s_i^{T} = (s_{i1}, \ldots, s_{iq})$ are observations on $p$ and $q$ known explanatory variables when the model in equation (10) has no intercept. If the model has an intercept, then the numbers of known explanatory variables are $p - 1$ and $q - 1$, and $x_{i1} = s_{i1} = 1$.

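The log-likelihood function is defined as

$$\ell(\nu) = \sum_{i:\, y_i = 0}\log\left[\pi_i + (1-\pi_i)\exp\left(1 - e^{W_{0}(\theta_i)}\right)\right] + \sum_{i:\, y_i > 0}\left[\log(1-\pi_i) + 1 - e^{W_{0}(\theta_i)} + y_i\log W_{0}(\theta_i) + \log B_{y_i} - \log(y_i!)\right],$$

where $\nu = (\beta^{T}, \zeta^{T})^{T}$. Then, the maximum likelihood estimator of $\nu$ is obtained by maximizing $\ell(\nu)$.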
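The log-likelihood of the zero-inflated model can be sketched in code as follows, assuming a log link for the Bell mean $\theta_i$ and a logit link for the zero-inflation probability $\pi_i$. The toy data, coefficient values, and function names are hypothetical and serve only to show how the two parts of the sum are evaluated; this is not the implementation used by the authors.

```python
import math

def lambert_w0(x, tol=1e-12):
    """Principal branch W_0(x) for x >= 0, by Newton's method on w * e^w = x."""
    w = math.log1p(x)
    for _ in range(100):
        step = (w * math.exp(w) - x) / (math.exp(w) * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w

def bell_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] computed with the Bell triangle."""
    row, numbers = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        numbers.append(row[0])
    return numbers

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def zibrm_loglik(y, X, S, beta, zeta):
    """Zero-inflated Bell log-likelihood with log link (count part) and logit link (zero part)."""
    bell = bell_numbers(max(y))
    total = 0.0
    for y_i, x_i, s_i in zip(y, X, S):
        theta_i = math.exp(dot(x_i, beta))                  # mean of the Bell component
        pi_i = 1.0 / (1.0 + math.exp(-dot(s_i, zeta)))      # zero-inflation probability
        w = lambert_w0(theta_i)
        if y_i == 0:
            total += math.log(pi_i + (1.0 - pi_i) * math.exp(1.0 - math.exp(w)))
        else:
            total += (math.log(1.0 - pi_i) + 1.0 - math.exp(w) + y_i * math.log(w)
                      + math.log(bell[y_i]) - math.lgamma(y_i + 1))
    return total

# Hypothetical toy data: intercept plus one covariate for the count part,
# intercept only for the zero-inflation part.
y = [0, 0, 3, 1, 0, 2]
X = [[1.0, 0.2], [1.0, -0.5], [1.0, 1.1], [1.0, 0.4], [1.0, -1.0], [1.0, 0.9]]
S = [[1.0] for _ in y]
loglik = zibrm_loglik(y, X, S, beta=[0.3, 0.5], zeta=[-0.2])
print("log-likelihood at the trial values:", loglik)
```

In practice this function would be passed to a numerical optimizer; the sketch only evaluates it at fixed trial values.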

Ridge and Liu Estimators
In the presence of multicollinearity, $\operatorname{rank}(X^{T}\hat{W}X) \le \operatorname{rank}(X)$, and the near singularity of $X^{T}\hat{W}X$ makes the estimation unstable and inflates the variance [36]. The ridge estimator (RE) [18] and the Liu estimator (LE) [19] have consistently been shown to be attractive alternatives to the MLE when multicollinearity exists. For the Bell regression model, the ridge estimator and the Liu estimator were proposed by Amin et al. [15] and Majid et al. [16], respectively. Extending the ridge and Liu estimators to the zero-inflated Bell regression model, and following Kibria et al. [24] and Asar et al. [37], we define these estimators, respectively, as

$$\hat{\beta}_{RE} = \left(X^{T}\hat{W}X + kI\right)^{-1}X^{T}\hat{W}X\,\hat{\beta}_{MLE},$$

$$\hat{\beta}_{LE} = \left(X^{T}\hat{W}X + I\right)^{-1}\left(X^{T}\hat{W}X + dI\right)\hat{\beta}_{MLE},$$

where $k > 0$ and $0 < d < 1$.
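To illustrate how the two estimators shrink the MLE, the sketch below applies the standard GLM forms of the ridge and Liu estimators, $(S + kI)^{-1}S\hat{\beta}_{MLE}$ and $(S + I)^{-1}(S + dI)\hat{\beta}_{MLE}$ with $S = X^{T}\hat{W}X$, to a small ill-conditioned matrix. The matrix, the MLE vector, and the values of $k$ and $d$ are hypothetical; the helper functions are written here only for the example.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def add_diag(A, c):
    """Return A + c * I."""
    n = len(A)
    return [[A[i][j] + (c if i == j else 0.0) for j in range(n)] for i in range(n)]

def ridge(S, beta_mle, k):
    """(S + k I)^{-1} S beta_mle, with S standing for X^T W X."""
    return solve(add_diag(S, k), matvec(S, beta_mle))

def liu(S, beta_mle, d):
    """(S + I)^{-1} (S + d I) beta_mle, with S standing for X^T W X."""
    return solve(add_diag(S, 1.0), matvec(add_diag(S, d), beta_mle))

def norm_sq(v):
    return sum(a * a for a in v)

S = [[10.0, 9.9], [9.9, 10.0]]     # hypothetical X^T W X with eigenvalues 19.9 and 0.1
beta_mle = [2.0, -1.0]             # hypothetical MLE

beta_ridge = ridge(S, beta_mle, k=0.5)
beta_liu = liu(S, beta_mle, d=0.5)
print("ridge:", beta_ridge)
print("Liu:  ", beta_liu)
```

Setting `k=0` in `ridge` or `d=1` in `liu` returns the MLE unchanged, which is a convenient sanity check on any implementation.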
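Because only the count component of the ZIBRM is considered, the scalar mean squared errors (MSE) of the ridge and Liu estimators are defined, respectively, as follows:

$$\mathrm{MSE}\left(\hat{\beta}_{RE}\right) = \sum_{j=1}^{p}\frac{\lambda_j}{(\lambda_j + k)^{2}} + k^{2}\sum_{j=1}^{p}\frac{\alpha_j^{2}}{(\lambda_j + k)^{2}}, \tag{14}$$

$$\mathrm{MSE}\left(\hat{\beta}_{LE}\right) = \sum_{j=1}^{p}\frac{(\lambda_j + d)^{2}}{\lambda_j(\lambda_j + 1)^{2}} + (d - 1)^{2}\sum_{j=1}^{p}\frac{\alpha_j^{2}}{(\lambda_j + 1)^{2}}, \tag{15}$$

where $\lambda_j$ is the $j$th eigenvalue of the matrix $X^{T}\hat{W}X$, $\alpha_j$ is the $j$th element of $\gamma^{T}\hat{\beta}_{MLE}$, and $\gamma$ is the matrix of eigenvectors of $X^{T}\hat{W}X$ [39].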
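The scalar MSE expressions can be evaluated directly from the eigenvalues $\lambda_j$ and the canonical coefficients $\alpha_j$. The following sketch assumes the standard forms of these expressions, with the MSE of the MLE taken as $\sum_j 1/\lambda_j$; the eigenvalues, coefficients, and shrinkage values are hypothetical.

```python
def mse_mle(lams):
    """Scalar MSE of the MLE: trace of (X^T W X)^{-1} = sum of 1 / lambda_j."""
    return sum(1.0 / lam for lam in lams)

def mse_ridge(lams, alphas, k):
    """Variance term plus squared-bias term of the ridge estimator."""
    variance = sum(lam / (lam + k) ** 2 for lam in lams)
    bias_sq = k ** 2 * sum(a ** 2 / (lam + k) ** 2 for lam, a in zip(lams, alphas))
    return variance + bias_sq

def mse_liu(lams, alphas, d):
    """Variance term plus squared-bias term of the Liu estimator."""
    variance = sum((lam + d) ** 2 / (lam * (lam + 1.0) ** 2) for lam in lams)
    bias_sq = (d - 1.0) ** 2 * sum(a ** 2 / (lam + 1.0) ** 2 for lam, a in zip(lams, alphas))
    return variance + bias_sq

lams = [19.9, 0.1]        # hypothetical eigenvalues; the small one signals collinearity
alphas = [0.7, -2.1]      # hypothetical canonical coefficients

print("MLE:  ", mse_mle(lams))
print("ridge:", mse_ridge(lams, alphas, k=0.5))
print("Liu:  ", mse_liu(lams, alphas, d=0.5))
```

The small eigenvalue dominates the MSE of the MLE through the term $1/\lambda_j$, which is the term that both shrinkage estimators reduce at the cost of some bias.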

Theoretical Comparison Based on MSEM and MSE
Proof. We show that the bias difference is positive definite.
By simplification, $k(d + \lambda_j) > \lambda_j(1 + d)$, which implies that the difference is positive definite. Also, the difference in the variance is obtained as follows:

Estimating k and d.
The efficiency of both the ridge and Liu estimators depends entirely on the shrinkage parameters $k$ and $d$, which control the amount of shrinkage. For $k = 0$ and $d = 1$, the MLE is obtained. In practice, the values of $k$ and $d$ have to be estimated. Numerous methods are available for estimating these two shrinkage parameters, especially in linear regression. In this paper, several methods are considered and extended to estimate $k$ and $d$ in the zero-inflated Bell regression model.
Following [36], Lukman and Ayinde [40], Shah et al. [41], and Ali et al. [30], the following shrinkage parameters are considered for the zero-inflated Bell ridge estimator:
Following [42], the shrinkage parameter $d$ was estimated using the following:
The simulation study is conducted in the R programming language with the help of the zibellreg package [14]. The experiment was replicated 1000 times, and the mean squared error (MSE) was used to evaluate the estimators' performance:

$$\mathrm{MSE}(\beta^{*}) = \frac{1}{1000}\sum_{l=1}^{1000}\left(\beta^{*}_{l} - \beta\right)^{T}\left(\beta^{*}_{l} - \beta\right),$$

where $\beta^{*}_{l}$ denotes the estimate of the true parameter vector $\beta$ in the $l$th replication. The MSEs for the simulated data are reported in Tables 1-12 under different simulation conditions. The MLE does not perform satisfactorily in the presence of multicollinearity. For instance, in Table 1 with sample size 100, $\rho = 0.9$, $p = 4$, and $\pi = 0.2$, the MSE of the MLE is 4.3654, whereas the MSEs of the other estimators are much smaller. This agrees with the literature, which reports that the MLE suffers when there is linear dependency among the explanatory variables.
We also observed that, at a given sample size, the MSE of each estimator increases as the level of multicollinearity increases. Also, the MSE of each estimator decreases as the sample size increases when other factors are kept constant. The MSE rises as the number of explanatory variables or the zero-inflation proportion $\pi$ increases. The performance of the ridge and Liu estimators depends on the shrinkage parameter. For instance, the ridge and Liu estimators perform well with $(k_3, k_5)$ and $d_1$ as their respective shrinkage parameters.
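The simulated MSE criterion used in the experiment, the average squared distance between the replicated estimates and the true parameter vector, can be sketched as follows. The estimates below are hypothetical and stand in for the output of the replications; the function name is an assumption.

```python
def simulated_mse(estimates, beta_true):
    """Average over replications of (b - beta)^T (b - beta)."""
    total = 0.0
    for b in estimates:
        total += sum((b_j - t_j) ** 2 for b_j, t_j in zip(b, beta_true))
    return total / len(estimates)

# Hypothetical output of three replications for a two-parameter model.
beta_true = [2.0, 2.0]
estimates = [[1.0, 2.0], [3.0, 2.0], [2.0, 2.0]]
print("simulated MSE:", simulated_mse(estimates, beta_true))
```

In the actual experiment the list would hold 1000 estimates per estimator and simulation condition, and the same function would be applied to each estimator in turn.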

Blood Transfusion Data.
The data correspond to the count of blood transfusions for thalassemia patients in Mosul, Iraq, for $n = 150$ randomly selected patients. The response variable is the number of blood transfusions. The percentage of zeros observed was 52%. The frequency plot is provided in Figure 1. The following explanatory variables were collected for each patient: $x_1$, the age in months; $x_2$, the duration of thalassemia in months; $x_3$, hemoglobin concentration; $x_4$, the packed cell volume; $x_5$, the number of blood units; and $x_6$, the age in months at the onset of blood transfusion. We first test whether the Poisson distribution fits the data well. The Kolmogorov-Smirnov and Anderson-Darling goodness-of-fit tests give contradictory conclusions. The Kolmogorov-Smirnov test indicates that the data fit the Poisson distribution, with a statistic of 0.19804 and a p-value of 0.41. However, the Anderson-Darling statistic and p-value are 21.987 and 0.0001, respectively. The conflicting conclusions are traceable to the number of excess zeros in the data. We observed overdispersion in the Poisson regression model, since the variance is about twice the mean, which makes the Poisson distribution unsuitable for the data. Finally, we assess the adequacy of the Poisson regression model using the residual deviance [50]. The residual deviance (293.07, p-value < 0.0001) is statistically significant, and we conclude that the Poisson regression model does not fit the data reasonably well. Also, the residual deviance (293.07) divided by its degrees of freedom (143) is greater than 1. This is supported by the overdispersion test, which gives a z-value of 4.9964 with a p-value < 0.0001 [51].
These results indicate overdispersion in the data: the overdispersion test for the Poisson regression model gives an estimated dispersion greater than 1 ($\phi = 1.9554$). We therefore fit other generalized models to the data.
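The deviance-based check quoted above is simple arithmetic and can be reproduced with a short sketch. The residual deviance and degrees of freedom are the values reported in the text. The Pearson-type dispersion function is a common estimate added here as an assumption for illustration, since the text does not state how $\phi$ was computed, and its toy inputs are hypothetical.

```python
def deviance_ratio(deviance, df):
    """Residual deviance divided by its degrees of freedom; values well above 1 suggest overdispersion."""
    return deviance / df

def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square statistic of a Poisson fit divided by the residual degrees of freedom."""
    chi_sq = sum((y_i - mu_i) ** 2 / mu_i for y_i, mu_i in zip(y, mu))
    return chi_sq / (len(y) - n_params)

ratio = deviance_ratio(293.07, 143)           # values reported for the blood transfusion data
print("deviance / df:", round(ratio, 4))

# Hypothetical observed counts and fitted Poisson means, only to show the call.
phi_hat = pearson_dispersion([0, 2, 4], [2.0, 2.0, 2.0], n_params=1)
print("Pearson dispersion (toy data):", phi_hat)
```

A ratio of about 2 is in line with the statement that the variance is roughly twice the mean.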
Consequently, we fit the following regression models to the dataset: the Bell regression model, the negative binomial regression model, the Conway-Maxwell-Poisson regression model, the zero-inflated Poisson regression model (ZIPRM), the zero-inflated negative binomial regression model (ZNBRM), and the ZIBRM. The most appropriate model is selected using the Akaike information criterion (AIC) and the log-likelihood. The results are reported in Table 13. The most suitable model for the blood transfusion dataset is the ZIBRM, followed by the ZNBRM and the ZIPRM.
The results in Table 14 agree with the simulation findings. The ridge estimator with the shrinkage parameter $k_1$ performed best. As expected, the maximum likelihood estimator performed worst because of the linear dependency among the X's. The scalar mean squared errors of the ridge and Liu estimators are obtained using equations (14) and (15). The scalar mean squared error of the maximum likelihood estimator is obtained by taking the trace of equation (8).

Fish Dataset.
These data were used to predict the number of fish caught by 250 groups that went to a state park [4,52]. The response variable is the number of fish caught, and the predictors are whether the trip was not just for fishing ($x_1$: 0 if no, 1 if yes), whether live bait was used ($x_2$: 0 if no, 1 if yes), whether the group brought a camper to the park ($x_3$), the number of people in the group ($x_4$), and the number of children in the group ($x_5$). The condition index is 181.76, which indicates multicollinearity. Long [52] claimed, using the Vuong test, that the zero-inflated Poisson regression model fits better than the Poisson regression model.
This is further supported by the adequacy test using AIC and log-likelihood in Table 15. The overdispersion test shows that there is overdispersion in the data (z-value = 2.2357, p-value of 0.0000). This explains why the Poisson regression model does not fit the data well. Even though the zero-inflated Poisson regression model (ZIPRM) is better than the Poisson regression model, its performance is not good in comparison with the other fitted models. Recently, Taweel and Algamal [4] modeled the data using the zero-inflated negative binomial regression model (ZNBRM). This claim is supported by the result in Table 15. We fit other models in Table 7 and determine the most appropriate one for the data. The zero-inflated Bell regression model is the most appropriate model, followed by the Bell regression model. The result in Table 16 shows that the ridge estimator with the shrinkage parameter $k_3$ produced the preferred estimate in terms of the scalar mean squared error. This shows that the performance of the estimators is a function of the shrinkage parameter.
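The condition index quoted for the fish data can be illustrated with a small sketch. It assumes the common definition $\sqrt{\lambda_{\max}/\lambda_{\min}}$ based on the eigenvalues of the cross-product matrix; the text does not state which definition or matrix was used. The 2 x 2 correlation matrix with $\rho = 0.99$ is hypothetical.

```python
import math

def sym2_eigenvalues(a, b, c):
    """Eigenvalues of the symmetric 2 x 2 matrix [[a, b], [b, c]]."""
    centre = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    return [centre + radius, centre - radius]

def condition_index(eigenvalues):
    """Square root of the ratio of the largest to the smallest eigenvalue (assumed definition)."""
    return math.sqrt(max(eigenvalues) / min(eigenvalues))

# Hypothetical correlation matrix of two predictors with correlation 0.99.
eigs = sym2_eigenvalues(1.0, 0.99, 1.0)
print("eigenvalues:", eigs)
print("condition index:", condition_index(eigs))
```

For two predictors with correlation $\rho$ the index equals $\sqrt{(1+\rho)/(1-\rho)}$, so it grows without bound as the predictors approach exact linear dependence.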

Conclusions
The Poisson regression model is widely adopted for count data because of its simplicity. However, it fits count data with overdispersion poorly. Alternative models such as the Bell regression model, the negative binomial regression model, and others effectively account for overdispersion in count data modeling. This study has also shown the effects of excess zeros on some of these models. Recently, the zero-inflated Bell regression model (ZIBRM) was developed, and it is more appropriate for count data whose overdispersion is due to excess zeros. Employing the frequentist approach, the parameters of the zero-inflated Bell regression model are estimated by maximum likelihood, and the Fisher information is derived. The limitation of the maximum likelihood estimator in the presence of linear dependency among the X's was circumvented in this study by proposing the ZIBRM ridge and Liu estimators. The methodology developed in this study is illustrated by means of a simulation study and empirical applications. In the applications, we showed that the ZIBRM fits better than some existing models when there is overdispersion due to excess zeros. Also, the two proposed estimators outperform the maximum likelihood estimator both theoretically and in simulation. The ridge estimator performed best and is therefore recommended, although each estimator's performance depends on the shrinkage parameter. In conclusion, the zero-inflated Bell regression model is more suitable for modeling count data with multicollinearity and overdispersion caused by excess zeros. Future work will focus on deriving other shrinkage estimators to handle the collinearity issue.

Data Availability
Data are available upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.