FIRTH BIAS CORRECTION FOR ESTIMATING VARIANCE COMPONENTS OF LOGISTICS LINEAR MIXED MODEL USING PENALIZED QUASI LIKELIHOOD METHOD

Firth bias correction was originally applied to correct the bias of variance component estimators obtained by the maximum likelihood method. Extensive research has shown that Firth bias correction is effective in reducing bias for models with a normally distributed response. Questions have been raised about its use in models with a binomially distributed response, which suffer from an under-dispersion problem. This study is motivated by the need to explore Firth bias correction for such models. When a binomial response model is estimated by the maximum likelihood method, the resulting estimator is under-dispersed. Therefore, Penalized Quasi-Likelihood (PQL) is used as an alternative numerical method to estimate the model. This paper aims to investigate whether the Firth method can reduce the bias of the variance components using the PQL technique in longitudinal data.


INTRODUCTION
In Generalized Linear Mixed Models (GLMMs), it is critical to estimate the random effects separately from the fixed effects. The parameters of the random effects are known as the variance components [1].
Both effects are generally estimated by Maximum Likelihood Estimation (MLE). A number of studies have examined the impact of MLE on the estimation of the fixed and random effects in GLMMs. MLE is an asymptotic method: it produces an unbiased parameter estimator when the sample size is large or tends to infinity. However, there is evidence that for small samples the MLE is biased downward. The downward-biased estimator produces an overdispersion estimator, as described by [2]. The bias of an estimator is defined as the difference between the estimator's expected value and the true value of the parameter being estimated [3].
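To illustrate the definition of bias in the simplest possible setting, the following minimal sketch uses the ML estimator of a normal variance rather than a GLMM. This estimator divides by n and has expected value (n - 1)/n times the true variance, so its bias is negative and shrinks as n grows. The function names, sample sizes, number of replications and seed are illustrative assumptions and are not taken from the study.

```python
import random

def ml_variance(sample):
    """ML estimator of a normal variance: divides by n, not n - 1."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / n

def simulated_bias(n, true_var=1.0, reps=20000, seed=1):
    """Monte Carlo estimate of E[estimator] - true value."""
    rng = random.Random(seed)
    sd = true_var ** 0.5
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, sd) for _ in range(n)]
        total += ml_variance(sample)
    return total / reps - true_var

# Theory: bias = -true_var / n, i.e. -0.2 for n = 5 and -0.01 for n = 100.
bias_small = simulated_bias(n=5)
bias_large = simulated_bias(n=100, reps=5000)
```

The simulated bias for the small sample should be clearly negative, while the bias for the large sample should be close to zero, which mirrors the small-sample behaviour described above.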
As discussed by [4], the bias of the variance components is seen most clearly in the case of binary data. A special case of GLMMs that has a dichotomous response variable and involves both fixed and random effects is known as the Logistic Linear Mixed Model (LLMM). The LLMM is usually used to model heterogeneity among subjects and correlations in repeated observations [5]. [...] [8]. Therefore, the bias of the variance components needs to be reduced. To reduce bias, [9] proposed modifying the score function of the likelihood. In a previous study, [1] showed that the Firth method can reduce the bias of the variance components of GLMMs estimated by the maximum likelihood method. Several studies have used the Firth method to reduce bias, but none has explained how to reduce the bias in the LLMM, especially using the PQL technique.
This paper aims to investigate whether the Firth method can reduce bias for the LLMM using the PQL technique in longitudinal data. The study addresses three issues. First, analytical studies are carried out to develop the iterative procedures for the bias-corrected estimates. Second, simulation studies evaluate the performance of the Firth method. Third, the method is applied to a longitudinal study using SUSENAS data, which are collected annually. This paper is organized as follows. Section 2 presents the Logistic Linear Mixed Model. Section 3 discusses the analytical studies of the Firth bias correction method for the Logistic Linear Mixed Model via Penalized Quasi-Likelihood. Section 4 presents the bias simulation studies. Section 5 describes the illustration on a longitudinal study. Finally, the conclusion is presented in Section 6.

LOGISTIC LINEAR MIXED MODEL (LLMM)
A mixed model involves both fixed effects and random effects. If the response variable is assumed to be normally distributed, the model is known as the Linear Mixed Model (LMM). The common form of the linear mixed model is $y = X\beta + Zu + \varepsilon$, where $y$ is an $N \times 1$ column vector of the response variable; $X$ is an $N \times p$ matrix of the $p$ predictor variables; $\beta$ is a $p \times 1$ column vector of the fixed-effects regression coefficients; $Z$ is the $N \times q$ design matrix for the $q$ random effects; $u$ is a $q \times 1$ vector of the random effects; and $\varepsilon$ is an $N \times 1$ column vector of the residuals, the part of $y$ that is not explained by the model. From (3) and (4), we find [...], where [...] is the variance-covariance matrix of [...].
To estimate $\beta$ and $u$, a different method was proposed by Saei and McGilchrist [10]. This method is one of the practical ways of estimating $\beta$ and $u$ because it involves the log-likelihood function directly. In this paper, the method is extended to the LLMM. The method requires the first and second derivatives of $l_1$ with respect to [...] and of $l_2$ with respect to [...], as follows: [...]. Since $l_2$ does not contain the parameter $\beta$, its derivatives with respect to $\beta$ are equal to zero. Let $H$ be the Hessian matrix, that is, the matrix of second derivatives of the log-likelihood function, and let $V$ be minus the Hessian matrix. Then $V$ can be written as follows: [...]. If $V$ is evaluated at $\beta_0$ and $u_0$, then the procedure for estimating $\beta$ and $u$ is [...], with [...] as the partitioning of the matrix [...].
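As a rough illustration of the iterative step described above, the sketch below applies Newton-Raphson to a joint log-likelihood l = l1 + l2 for a logistic model with one random intercept per block. It is a stand-in under stated assumptions, not the authors' implementation: the function name `fit_beta_u`, the N(0, sigma2 * I) distribution for the random effects, the fixed value of sigma2, and the simulated data are all hypothetical. Under these assumptions the score is (X'(y - pi), Z'(y - pi) - u / sigma2) and V, minus the Hessian, has blocks X'WX, X'WZ, Z'WX and Z'WZ + I / sigma2, with W = diag(pi(1 - pi)).

```python
import numpy as np

def fit_beta_u(y, X, Z, sigma2, n_iter=50, tol=1e-10):
    """Newton-Raphson on l = l1 + l2 with sigma2 held fixed.

    Assumes a logit link and u ~ N(0, sigma2 * I); both are illustrative choices.
    """
    p, q = X.shape[1], Z.shape[1]
    beta, u = np.zeros(p), np.zeros(q)
    for _ in range(n_iter):
        eta = X @ beta + Z @ u
        pi = 1.0 / (1.0 + np.exp(-eta))
        w = pi * (1.0 - pi)                       # diagonal of W
        # First derivatives of l1 + l2 with respect to beta and u
        score = np.concatenate([X.T @ (y - pi), Z.T @ (y - pi) - u / sigma2])
        XtW, ZtW = X.T * w, Z.T * w
        # V = minus the Hessian, partitioned by (beta, u)
        V = np.block([[XtW @ X, XtW @ Z],
                      [ZtW @ X, ZtW @ Z + np.eye(q) / sigma2]])
        step = np.linalg.solve(V, score)
        beta, u = beta + step[:p], u + step[p:]
        if np.max(np.abs(step)) < tol:
            break
    return beta, u, V

# Hypothetical data: 10 blocks of 8 observations, intercept plus one covariate.
rng = np.random.default_rng(0)
q, n_per = 10, 8
Z = np.kron(np.eye(q), np.ones((n_per, 1)))      # block indicator matrix
X = np.column_stack([np.ones(q * n_per), rng.normal(size=q * n_per)])
u_true = rng.normal(scale=1.0, size=q)
eta = X @ np.array([-0.5, 1.0]) + Z @ u_true
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta))).astype(float)

beta_hat, u_hat, V = fit_beta_u(y, X, Z, sigma2=1.0)
```

In a full procedure this step would alternate with an update of the variance component; here sigma2 is held fixed to keep the sketch short.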
Then the variance components of $u$ can be defined as [...], where [...] is the rank of the matrix [...].
The variance components of $u$ can also be obtained using the restricted maximum likelihood (REML) method [11]. For the REML estimator, [...] is obtained using the 22 submatrix of the [...] matrix. This matrix can be determined explicitly using the formula below: [...]. Therefore, the variance components of $u$ under the REML method can be written as [...], where [...] is the rank of the matrix [...].
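The difference between an ML-type and a REML-type update of a variance component can be sketched as follows. The update assumed here, sigma2 = (u'u + tr(T)) / q, with T taken either from the random-effect block alone (ML-type) or from the 22 block of the full inverse of V (REML-type), is written in the spirit of [10] and [11]; the exact expressions used in this paper may differ. The function name `variance_updates`, the identity covariance structure, and the synthetic inputs are hypothetical.

```python
import numpy as np

def variance_updates(u_hat, V, p):
    """Return (ML-type, REML-type) updates of a single variance component.

    V is minus the Hessian partitioned by (beta, u); p is the number of
    fixed effects. Assumes u ~ N(0, sigma2 * I), which is illustrative.
    """
    q = u_hat.size
    T_star = np.linalg.inv(V[p:, p:])     # ignores the uncertainty in beta
    T22 = np.linalg.inv(V)[p:, p:]        # random-effect block of V^{-1}
    ml = (u_hat @ u_hat + np.trace(T_star)) / q
    reml = (u_hat @ u_hat + np.trace(T22)) / q
    return ml, reml

# Synthetic positive-definite stand-in for V and a stand-in for u_hat.
rng = np.random.default_rng(1)
p, q = 2, 6
M = rng.normal(size=(40, p + q))
V = M.T @ M + np.eye(p + q)
u_hat = rng.normal(size=q)

ml, reml = variance_updates(u_hat, V, p)
```

Because the 22 block of the full inverse exceeds the inverse of the random-effect block by a positive semi-definite term, the REML-type value is never smaller than the ML-type value in this sketch.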

PENALIZED QUASI LIKELIHOOD
The basic idea of the Firth method is to reduce bias by introducing a small bias into the score function (Firth 1993). Here is the illustration: [...]. Then the modified log-likelihood function for the binomial distribution is [...]. The matrix [...] is the matrix of second-order derivatives [...]. If [...] is evaluated at $\beta_0$ and $u_0$, then the procedure for estimating $\beta$ and $u$ is [...], where [...] is the rank of the matrix [...].
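To make the idea of a modified score concrete, the sketch below implements Firth's correction for an ordinary logistic regression with fixed effects only. It is not the Firth-adjusted PQL developed in this paper. For this model Firth's penalized log-likelihood is l*(beta) = l(beta) + (1/2) log|I(beta)|, and the modified score is X'(y - pi + h(1/2 - pi)), where h holds the diagonal of the hat matrix W^{1/2} X (X'WX)^{-1} X' W^{1/2}. The function name `firth_logistic` and the toy data are hypothetical.

```python
import numpy as np

def firth_logistic(X, y, n_iter=100, tol=1e-8):
    """Firth-penalized logistic regression, fixed effects only (illustrative)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = pi * (1.0 - pi)
        info = (X.T * w) @ X                       # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Diagonal of the hat matrix W^{1/2} X (X'WX)^{-1} X' W^{1/2}
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        score = X.T @ (y - pi + h * (0.5 - pi))    # Firth-modified score
        step = info_inv @ score
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Toy data with complete separation: the ordinary ML slope is infinite here.
x = np.array([-2.0, -1.0, 1.0, 2.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([0.0, 0.0, 1.0, 1.0])

beta_firth = firth_logistic(X, y)
```

The separated toy data were chosen because the penalty keeps the slope estimate finite where the unpenalized likelihood has no maximum, which shows the effect of the modified score in its most visible form.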
The variance components of $u$ can also be obtained using the restricted maximum likelihood (REML) method [11]. For the REML estimator, [...] is obtained using the 22 submatrix of the [...] matrix. This matrix can be determined explicitly using formula (18), which expresses the 22 submatrix in terms of the 11 submatrix: [...]. Therefore, the variance components of $u$ under the REML method can be written as [...], where [...] is the rank of the matrix [...].
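The pattern of subscripts and superscripts in (18) is consistent with the standard partitioned-inverse identity for a matrix with a fixed-effect block and a random-effect block. The following is a sketch under assumed notation: the symbols $B$ (a diagonal weight matrix), $T_{11}$, $T_{22}$ and $T^{*}$, and the $\sigma^{-2} I$ term (which assumes independent random effects with common variance $\sigma^2$), are illustrative choices rather than the paper's confirmed notation.

```latex
V = \begin{pmatrix} X'BX & X'BZ \\ Z'BX & Z'BZ + \sigma^{-2} I \end{pmatrix},
\qquad
V^{-1} = \begin{pmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{pmatrix},
\qquad
T^{*} = \left( Z'BZ + \sigma^{-2} I \right)^{-1},

T_{11} = \left( X'BX - X'BZ \, T^{*} Z'BX \right)^{-1},
\qquad
T_{22} = T^{*} + T^{*} Z'BX \, T_{11} \, X'BZ \, T^{*}.
```

Under this reading, $T_{22}$ equals $T^{*}$ plus a positive semi-definite term that accounts for the estimation of the fixed effects.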

BIAS SIMULATION STUDIES
The simulations were conducted to determine the behavior of the Firth-adjusted PQL (the Firth method applied to PQL). The purpose of the simulation is to assess and compare the performance of the Firth method in reducing the bias of the variance components. Specifically, the data generation model can be written as follows: [...]. Table 2 shows the biases of the variance component estimates of the unadjusted PQL and the Firth-adjusted PQL. These results mirror those in Table 1, which also show a significant difference between the two conditions. The Firth-adjusted PQL yields lower biases of the variance components than the unadjusted PQL method.
The comparison of the results from the two methods is clearly seen in the figures below. In summary, comparing the two methods, the Firth-adjusted PQL has smaller biases in the variance component estimates, which leads to better estimates of the variability of the random effects.
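A simulation study of this kind can be organized along the following lines. The sketch is only a scaffold under assumptions: the random-intercept logistic model, the parameter values, the function names `simulate_clusters` and `bias_summary`, and the three placeholder estimates are hypothetical and do not reproduce the data generation model or the results of this study.

```python
import math
import random

def simulate_clusters(n_clusters, n_per_cluster, beta0, beta1, sigma2, rng):
    """Binary data from logit(pi_ij) = beta0 + beta1 * x_ij + u_i, u_i ~ N(0, sigma2)."""
    rows = []
    for i in range(n_clusters):
        u_i = rng.gauss(0.0, math.sqrt(sigma2))
        for _ in range(n_per_cluster):
            x = rng.gauss(0.0, 1.0)
            pi = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x + u_i)))
            y = 1 if rng.random() < pi else 0
            rows.append((i, x, y))
    return rows

def bias_summary(estimates, true_value):
    """Monte Carlo bias and relative bias of a list of estimates."""
    mean_est = sum(estimates) / len(estimates)
    bias = mean_est - true_value
    return bias, bias / true_value

rng = random.Random(2024)
data = simulate_clusters(n_clusters=20, n_per_cluster=5,
                         beta0=-1.0, beta1=0.5, sigma2=1.0, rng=rng)

# Placeholder estimates, NOT results: in a real study each value would be the
# variance-component estimate obtained from one simulated dataset.
bias, rel_bias = bias_summary([0.8, 0.9, 1.1], true_value=1.0)
```

In practice the same simulated datasets would be fitted with both the unadjusted and the Firth-adjusted PQL, and `bias_summary` would be applied to each set of estimates so that the two methods are compared on identical data.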

ILLUSTRATION ON LONGITUDINAL STUDY
To make this more concrete, consider an illustration using a poverty dataset. Poverty remains one of the complicated problems in every country, especially in developing countries such as Indonesia. To measure poverty, Statistics Indonesia (BPS) uses the concept of the ability to meet basic needs (the basic needs approach). Under this approach, poverty is seen as an economic inability to meet basic food and non-food needs, measured from the expenditure side. Thus, the poor population is the population whose average monthly per capita expenditure is below the poverty line.
The analysis of the illustration focuses on estimation using the LLMM via PQL. We assumed that the response [...], indicating whether the kth household in the ith block at the tth time is poor or not, was (conditionally) binomially distributed with mean [...]. In equation form, the model can be written as follows: [...], where [...] is an unknown parameter to be estimated. Finally, the block and time random effects are assumed to be independent of one another. These random effects were chosen because we expect observations within a block to be correlated. There are many reasons why this could be. For example, a block is drawn from a single region with a common poverty line, so households within a block are more homogeneous than households in different blocks. Turning to the odds ratio, here it is the conditional odds ratio for households with the floor area of the house and the number of household members held constant, and either in the same block or in blocks with identical random effects. When there is large variability between blocks, the relative impact of the fixed effects may be small.
To test whether the variances from the two kinds of random effects differ, there are two hypotheses: [...]. If the variances are equal, the F statistic is close to one; if the F statistic is greater than one, the evidence is against the null hypothesis. Therefore, we can conclude that the variances from the unadjusted PQL and the Firth-adjusted PQL are different. Comparing the two results, the Firth-adjusted PQL yields larger variance component estimates, which leads to better estimates of the variability of the random effects.

CONCLUSION
The aim of the present study was to examine whether the Firth method can reduce bias for the LLMM using the PQL technique in longitudinal data. For the LLMM with multiple random effects, the simulation shows that the Firth-adjusted PQL reduces the bias of the variance component estimates. In general, the simulation results indicate that the variance components from the Firth-adjusted PQL are closer to the true values.
A limitation of this study is the assumption of independence between the two random effects, which may not be realistic in practice. For future research, more complicated models need to be investigated, especially the case where the random effects are correlated. These random effects also require estimation. Based on the results of this study, the Firth-adjusted PQL is preferable to the unadjusted PQL for the model studied. Future work will determine whether the Firth-adjusted PQL is a suitable choice for other models.