POISSON-LOGNORMAL MODEL WITH MEASUREMENT ERROR IN COVARIATE FOR SMALL AREA ESTIMATION OF COUNT DATA



Small Area Estimation (SAE) is concerned with the development of statistical procedures
for estimating small areas (or domains) when direct survey estimates (i.e., those based only on the area-specific sample data) are unreliable or cannot even be calculated because of limited sample size. In the context of small area estimation, direct estimators lead to unacceptably large standard errors for areas with unduly small sample sizes, and no sample units may be selected at all from some small domains. Data obtained from sample surveys can be used to derive reliable direct estimates for large areas or domains, but sample sizes in small areas are rarely large enough for direct estimators to provide adequate precision.
Higher precision in SAE can be obtained by linking information from a particular area with that of other areas through an appropriate model; this is called indirect estimation. Indirect estimation uses two kinds of linking models, implicit and explicit. Traditional methods of indirect estimation are based on implicit models that provide a link to related small areas through supplementary data, whereas explicit small area models make specific allowance for between-area variation. [1] introduced mixed models involving random area-specific effects that account for between-area variation beyond that explained by the covariates included in the model.
The SAE method was first used by [2], who employed a linear mixed model with area random effects for estimating per capita income for small areas in the United States. [2] used Empirical Best Linear Unbiased Prediction (EBLUP), an improvement on Best Linear Unbiased Prediction (BLUP). The BLUP estimator was first introduced by [3] for linear mixed models.
In this case, "best" stands for minimum mean squared error among all linear unbiased predictors, "linear" means that the predictor is a linear combination of the response variable values, and "unbiased" means that the expected value of the prediction error is zero. The BLUP method described in [3] assumes that the variances associated with the random effects in the mixed model are known. In practice, variance components are unknown and must be estimated from the data. The predictor obtained from the BLUP when the unknown variance components are replaced by their estimators is called the empirical best linear unbiased predictor (EBLUP) and is described in [5]. While BLUP was originally proposed by [3], EBLUP is related to the classical shrinkage estimator studied by [4], who established analytically that EBLUP improves on the sample means when the number of small areas is at least three.
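The shrinkage idea behind BLUP-type prediction can be illustrated numerically. The following sketch uses hypothetical simulated data under a simple area-level linear model, with the variance components assumed known (a full EBLUP would estimate them and fit the regression by generalized least squares):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 200                                   # number of small areas (illustrative)
x = rng.uniform(0, 1, m)                  # hypothetical area-level covariate
sigma_v2, D = 0.5, 1.0                    # random-effect and sampling variances (assumed known)
theta = 1.0 + 2.0 * x + rng.normal(0.0, np.sqrt(sigma_v2), m)  # true area means
y = theta + rng.normal(0.0, np.sqrt(D), m)                     # direct survey estimates

# Composite (BLUP-type) predictor: shrink the direct estimate toward
# the regression-synthetic part, with weight gamma = sigma_v^2/(sigma_v^2 + D)
X = np.column_stack([np.ones(m), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]   # crude OLS fit of the linking model
gamma = sigma_v2 / (sigma_v2 + D)
blup = gamma * y + (1 - gamma) * (X @ beta)

mse_direct = np.mean((y - theta) ** 2)
mse_blup = np.mean((blup - theta) ** 2)
```

The weight gamma is large when the between-area variance dominates the sampling variance (trust the direct estimate) and small otherwise (borrow strength from the regression), which is exactly the borrowing-of-strength mechanism discussed above.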
In addition to EBLUP, empirical Bayes (EB) and hierarchical Bayes (HB) estimation and inference methods have also been applied to small area estimation. The BLUP/EBLUP method applies to linear mixed models, which handle the problem of estimating small area means, but it is not suitable for binary or count data. The EB and HB methods are applicable more generally, in the sense of handling models for binary and count data as well as normal linear mixed models. Empirical Bayes methods are procedures for statistical inference in which the prior distribution is estimated from the data. In the HB approach, by contrast, unknown model parameters (including variance components) are treated as random, with values drawn from specified prior distributions. Posterior distributions for the small area characteristics of interest are then obtained by integrating over these priors, with inferences based on the posterior distributions.
As in [1], Bayesian specifications for small area models with count data are derived from basic area-level models for SAE, e.g. the Fay-Herriot model [2] or a generalized linear Poisson model. Bayesian data analysis is based on the posterior distribution of the relevant parameters given the data. For some simple models it is possible to find the exact form of the posterior distribution and an explicit form for the posterior mean. The posterior distribution combines both prior information and observational data: the likelihood function represents the information in the data, while the prior distribution expresses the researcher's prior knowledge or subjectivity. The specification of the prior distribution is therefore very important in Bayesian inference, because it affects inferences drawn from the posterior distribution. However, for reasonably realistic models it is commonly not possible to obtain a closed form for the posterior distribution. In this situation, simulation methods must be used for posterior sampling, i.e., to obtain samples from the posterior distribution, which can then be summarized to estimate the relevant quantities.
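The conjugate Gamma-Poisson pair is a minimal case where the posterior is available in closed form, so no simulation is needed; the sketch below (prior hyperparameters are hypothetical) also checks the exact posterior mean against Monte Carlo draws, previewing the simulation-based route needed when conjugacy fails:

```python
import numpy as np

rng = np.random.default_rng(1)
a0, b0 = 2.0, 1.0                     # Gamma(shape, rate) prior for the Poisson mean (illustrative)
y = rng.poisson(3.0, size=50)         # observed counts

# Conjugacy: Gamma prior x Poisson likelihood -> Gamma posterior in closed form
a_post = a0 + y.sum()
b_post = b0 + y.size
posterior_mean = a_post / b_post      # exact posterior mean, no sampling required

# Monte Carlo approximation from posterior draws (what MCMC would deliver
# in models where no closed form exists)
draws = rng.gamma(a_post, 1.0 / b_post, size=100_000)
```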
The properties of the small area estimators derived under the SAE model rest on the hypothesis that the auxiliary data or covariates are available for all areas and are measured without error. However, it is sometimes difficult to find error-free covariates. When the covariate used in the SAE model is measured with error, using a small area estimator such as the Fay-Herriot estimator while ignoring the measurement error may be worse than simply using the direct estimator [7]. Survey data can be used as an up-to-date covariate, but it may contain measurement error. As noted in [8], ignoring measurement errors can produce biased estimators and lead to incorrect conclusions in data analysis. Many statistical procedures have been developed for inference in measurement error models [9]. In a measurement error model, the observed value combines an unobserved random variable with measurement error. Since the true random variables are unobserved, the population distribution must be inferred from the distribution of the observed variables, which contain measurement error. The problem of estimating the density function of a true or unobserved random variable using its observed sample (surrogate) is called deconvolution [10].
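The bias caused by ignoring measurement error, as noted in [8], can be illustrated with the classical attenuation phenomenon in simple linear regression; the following sketch uses simulated data with assumed variances (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
x = rng.normal(0, 1, n)               # true covariate (unobserved in practice)
w = x + rng.normal(0, 1, n)           # surrogate: x contaminated with additive error
y = 2.0 * x + rng.normal(0, 0.5, n)   # response depends on the TRUE covariate

# Slope from regressing y on the true covariate vs. on the error-prone surrogate
slope_true = np.cov(x, y)[0, 1] / np.var(x)
slope_naive = np.cov(w, y)[0, 1] / np.var(w)
# Attenuation: E[slope_naive] = 2 * var(x) / (var(x) + var(e)) = 1.0 here,
# half the true slope of 2.0
```

The naive slope is biased toward zero by the factor var(x)/(var(x)+var(e)), which is why measurement error cannot simply be ignored when a surrogate covariate is used.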
Many parametric and nonparametric methods have been proposed to estimate the density function of random variables that are unobserved due to measurement error. For the additive measurement error model, [10] proposed the deconvolution kernel density estimator to recover the unknown density function from contaminated data, employing the kernel idea and the Fourier inverse in the construction of the estimator. In this article, we estimate the density function of the unobserved covariate under the measurement error model framework by the Empirical Bayes Deconvolution (EBD) method proposed by [11,12]. The basic idea of the EBD method is to model the prior density as a member of an exponential family of densities. Empirical Bayes inference assumes that an unknown prior density of the unobserved covariate produces independent observations, the surrogates, through a known probability mechanism.
The HB predictor for area-level small area means based on the Fay-Herriot model with measurement error in covariates, proposed by [15], is constructed for continuous responses. For count data, the basic Poisson model is widely used, with the assumption that the mean and variance of the counts are equal. However, this equal mean-variance relationship rarely occurs in observational data. In most cases, the observed variance is larger than the assumed variance, a situation called overdispersion. The Poisson formulation must be adjusted to accommodate this extra variability in the sample observations. We therefore develop the Poisson-Lognormal model using the HB approach in the context of small area estimation, to handle overdispersion in nonsymmetrical count data. When an area-level covariate used in the Poisson-Lognormal model contains measurement error, it is still necessary to utilize the information it carries. Measurement error may be introduced by a measuring technique that involves subjective human judgment, or arise from a cheaper or more convenient substitution of the correct quantity. Treating imprecise data as the true value of the covariate can lead to misleading conclusions. In the structural measurement error model, the true covariate is treated as a random variable whose distribution is unknown. We construct the HB estimator of the Poisson-Lognormal model with measurement error in covariates by adding the distribution of the unobserved covariate as a prior distribution in the HB framework. We use simulation studies to investigate the modeling aspects of the Poisson-Lognormal model with measurement error in covariate, and a real data example to illustrate the practical usefulness of this work.
The remainder of this study is organized as follows. In Section 2, we specify the EBD method for estimating the density function of unobserved covariate. Section 3 presents the Poisson-Lognormal model with measurement error in covariate using HB approach. Sections 4 and 5 present the results of the simulation and real data example. Conclusions follow in Section 6.

THE EMPIRICAL BAYES DECONVOLUTION METHOD
Density estimation is a research area in statistics that has been studied extensively in many fields. The goal of density estimation is to infer an unknown probability density function (pdf) f, given only random samples or observations distributed according to it. Density estimation approaches can be classified into two groups: parametric and non-parametric. In the parametric approach, one assumes that the observations come from a parametric family of densities that must be specified in advance; the task is then to estimate the parameters that best fit the data. For instance, a parametric approach may assume the random variables have a normal distribution.
Such assumptions simplify the problem, since only the parameters of the chosen family need to be determined. In the normal case, density estimation reduces to determining the mean µ and standard deviation σ from the sample points. Sometimes, however, it is not possible to make such strict assumptions about the form of the underlying density, and non-parametric methods are more appropriate. These methods make few assumptions about the density function and allow the data to drive the estimation more directly. [11] proposed the Empirical Bayes Deconvolution (EBD) method for estimating the density function of an unobserved random variable based on an empirical Bayes strategy. Empirical Bayes methods are procedures for statistical inference in which the prior distribution is estimated from the data. In the EBD method, an unknown prior distribution g(x) yields (unobservable) values X i, and each X i independently produces an observed random variable W i according to a known probability mechanism. From [13,14], in the classical structural measurement error model, the true or unobserved random variable X i and its observed variable (surrogate) W i can be written as the additive measurement error model W i = X i + e i, where the X i's are assumed independent and identically distributed (i.i.d.) as X, the errors e i are i.i.d. as e, and X and e are mutually independent. We apply the EBD methodology in structural measurement error problems for estimating the density function of the true or unobserved covariate X i using the observed sample W i. The Bayes deconvolution problem is that of recovering g(x) from the observed sample W, with g restricted to lie in a parametric exponential family,
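A minimal simulation of the additive model, with X drawn from a hypothetical Gamma distribution and independent normal errors, shows how the surrogate's moments combine those of X and e (the surrogate's density is the convolution of the two component densities):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
x = rng.gamma(4.0, 1.0, n)     # i.i.d. true values X (illustrative Gamma choice)
e = rng.normal(0, 1.0, n)      # i.i.d. errors, independent of X
w = x + e                      # surrogate: additive measurement error model W = X + e

# Independence of X and e gives:
#   E[W]   = E[X] + E[e]   = 4 + 0 = 4
#   Var[W] = Var[X] + Var[e] = 4 + 1 = 5
```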
and the marginal density of W i is

f(w) = ∫_T p(w | x) g(x) dx, (1)

where the integral is taken over the X-space T. [11] proposed a likelihood approach to the Bayes deconvolution problem in (1), with the prior g(x) modeled by an exponential family of densities on T, which is assumed to be a finite discrete support set x = (x_1, ..., x_m); the support set of the observations W i is also assumed finite and discrete, W = (w^(1), ..., w^(n)).
The unknown prior density g(x) is then an m-vector g = (g_1, g_2, ..., g_m)^t specifying probability g_j on x_j, modeled as

g(α) = exp(Qα − φ(α)),

where α is a p-dimensional parameter vector, Q is a known m × p structure matrix, and the function φ(α) normalizes g(α) so that its components sum to 1:

φ(α) = log Σ_{j=1}^{m} exp(Q_j α).

Let p_ij = p(w^(i) | x_j) and define the n × m matrix P = (p_ij), so that the relation between g(x) and the marginal density f(w) can be represented as the matrix multiplication f(α) = P g(α). Define the count vector y = (y_1, ..., y_n)^t with y_i = #{k : W_k = w^(i)} for i = 1, 2, ..., n, a sufficient statistic for g, with y ∼ mult_n(N, f(α)). The log-likelihood function for the parameter vector α is

l(α) = Σ_{i=1}^{n} y_i log f_i(α),

and differentiating it gives the score function l̇(α), which is needed for the maximum likelihood calculation. The maximum likelihood estimate α̂ of α is obtained by solving l̇(α̂) = 0.
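A rough numerical sketch of this g-modeling fit can be written as follows, assuming a Poisson surrogate p(w | x), a discrete grid for T, and a low-order polynomial structure matrix Q; these are illustrative choices, not the exact specification of [11], and the likelihood is maximized by a generic numerical optimizer rather than a dedicated score-equation solver:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

rng = np.random.default_rng(3)
N = 2000
x_true = rng.gamma(4.0, 1.0, N)    # unobserved covariate (Gamma assumed for the simulation)
w_obs = rng.poisson(x_true)        # observed surrogate, W | X ~ Poisson(X)

# Discrete support T for g, and the observed count vector y
x_grid = np.linspace(0.5, 15.0, 30)
w_vals, y = np.unique(w_obs, return_counts=True)

# Q: standardized low-order polynomial basis; P: p(w | x) on the grid
Q = np.column_stack([x_grid ** k for k in range(1, 4)])
Q = (Q - Q.mean(0)) / Q.std(0)
P = poisson.pmf(w_vals[:, None], x_grid[None, :])

def neg_loglik(alpha):
    eta = Q @ alpha
    g = np.exp(eta - eta.max())
    g /= g.sum()                   # g(alpha) = exp(Q alpha - phi(alpha))
    f = P @ g                      # marginal f = P g
    return -np.sum(y * np.log(f + 1e-300))

alpha_hat = minimize(neg_loglik, np.zeros(Q.shape[1]), method="BFGS").x
eta = Q @ alpha_hat
g_hat = np.exp(eta - eta.max())
g_hat /= g_hat.sum()               # estimated prior density on the grid
mean_hat = float(x_grid @ g_hat)   # estimated E[X]
```

The recovered mean of g should land near the true E[X] = 4 of the simulated Gamma prior, illustrating that the deconvolution undoes the Poisson contamination.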

THE POISSON-LOGNORMAL MODEL WITH MEASUREMENT ERROR IN COVARIATE
When the response variable is discrete, such as binary, count, or multi-category data, the Poisson regression model is widely used for counts [20]. However, the assumptions of the Poisson regression model often turn out to be unrealistic. More specifically, the Poisson model assumes that the mean equals the variance and therefore cannot account for overdispersion. For this reason, alternative distributions implying different mean-variance relationships are used to handle overdispersion, such as the negative binomial distribution. Another alternative to Poisson regression is to assume that the logarithm of the Poisson rate parameter follows a normal distribution with unknown variance and an unknown mean that depends linearly on a vector of covariates. This model is not trivial to analyze because its likelihood function is not available in closed form. It has been suggested before as a plausible competitor to the Poisson-Gamma and Poisson-Inverse Gaussian models [21]. [21,22,23] have explored the relationships between count responses and error-free covariates in the Poisson-Lognormal model using the HB approach.
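The overdispersion property of the Poisson-Lognormal model, Var(Y) = E(Y) + E(Y)^2 (e^{σ²} − 1) > E(Y), can be verified by simulation; the parameter values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma = 1.0, 0.7
# Poisson rate is lognormal: log(lambda) ~ N(mu, sigma^2)
lam = np.exp(mu + sigma * rng.normal(size=200_000))
y = rng.poisson(lam)

# Theoretical moments of the Poisson-Lognormal:
#   E[Y]   = exp(mu + sigma^2 / 2)
#   Var[Y] = E[Y] + E[Y]^2 * (exp(sigma^2) - 1)  >  E[Y]
m_theory = np.exp(mu + sigma ** 2 / 2)
v_theory = m_theory + m_theory ** 2 * (np.exp(sigma ** 2) - 1)
```

The extra term E(Y)^2 (e^{σ²} − 1) is exactly the slack the plain Poisson model lacks, which is why the Poisson-Lognormal can absorb overdispersion.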
[16] discussed the use of generalized linear mixed models for small area estimation in this situation and provided hierarchical Bayes methods. Within the HB framework, alternative model specifications can be considered when the variable of interest is a count or a proportion [17,18]. According to [19], HB models have many advantages in their specification. Applying the EBD method proposed by [11], we obtain the estimated density of the unobserved covariate, which will be used as the prior density for X.

(4) Stage 4: Prior Distribution
The prior distributions of the parameters of the Poisson-Lognormal model with measurement error in covariate X are specified as follows. When the posterior distribution cannot be determined analytically or via grid approximation, we use simulation to approximate the posterior distribution and its characteristics. The joint posterior distribution in equation (13) cannot be obtained in closed form, and it is difficult or impossible to simulate from it directly, so we turn to indirect methods.
A Gibbs sampler would require generating samples from the full conditional distribution of each of λ, X, β, and σ_v², given the remaining parameters and the data. Because these full conditional distributions are not easy to derive, the posterior distributions of the model parameters are computed by the Markov Chain Monte Carlo (MCMC) method, using the Metropolis-Hastings (MH) algorithm to estimate the model parameters β̂ and σ̂_v² with the prior distributions from Stage 4.
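As a simplified sketch of such an MH-within-Gibbs scheme, the following code combines random-walk MH updates for β and the area random effects with a conjugate inverse-gamma draw for σ_v². It is written for a Poisson-Lognormal area model without the measurement error component, with illustrative priors (flat on β, inverse-gamma on σ_v²) and hand-picked tuning constants, so it is an assumption-laden sketch rather than the paper's full sampler:

```python
import numpy as np

rng = np.random.default_rng(5)
m = 100                                       # number of areas (illustrative)
x = rng.uniform(0, 1, m)                      # error-free covariate for this sketch
beta_true = np.array([0.5, 1.0])
v_true = rng.normal(0, 0.3, m)                # area random effects
y = rng.poisson(np.exp(beta_true[0] + beta_true[1] * x + v_true))

X = np.column_stack([np.ones(m), x])

def loglik(beta, v):
    eta = X @ beta + v
    return np.sum(y * eta - np.exp(eta))      # Poisson log-likelihood up to a constant

beta, v, s2 = np.zeros(2), np.zeros(m), 1.0
a0, b0 = 2.0, 0.5                             # inverse-gamma hyperparameters (illustrative)
draws = []
for it in range(4000):
    # MH step for beta (flat prior, random-walk proposal)
    prop = beta + 0.1 * rng.normal(size=2)
    if np.log(rng.uniform()) < loglik(prop, v) - loglik(beta, v):
        beta = prop
    # Componentwise MH for the v_i (conditionally independent given beta)
    prop_v = v + 0.3 * rng.normal(size=m)
    eta_old, eta_new = X @ beta + v, X @ beta + prop_v
    log_r = (y * (eta_new - eta_old) - np.exp(eta_new) + np.exp(eta_old)
             - (prop_v ** 2 - v ** 2) / (2 * s2))
    accept = np.log(rng.uniform(size=m)) < log_r
    v = np.where(accept, prop_v, v)
    # Gibbs step: sigma_v^2 | v is inverse-gamma by conjugacy
    s2 = 1.0 / rng.gamma(a0 + m / 2, 1.0 / (b0 + 0.5 * np.sum(v * v)))
    if it >= 2000:                            # discard burn-in
        draws.append(beta.copy())

beta_post = np.mean(draws, axis=0)            # posterior mean of beta
```

Only σ_v² admits a tractable full conditional here; the remaining blocks fall back on MH, mirroring the situation described above.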

SIMULATION STUDY
In this section a simulation study is conducted to better understand the performance of the HB Poisson-Lognormal model with measurement error in one covariate, including the structure of the data, the estimation method, and the accuracy of the estimated parameters. The simulation study focuses on model performance, especially the unbiasedness and accuracy of the parameter estimators. In addition, the Bayesian inference based on this simulation must be evaluated to determine whether the Markov chain has reached stationarity at the desired posterior distribution. This is done through convergence diagnostics using trace plots. Any MCMC scheme aims to produce (dependent) samples from a "target" distribution. If well constructed, the Markov chain is guaranteed to have the posterior as its stationary distribution, but this guarantee does not tell us how long the chain must run to reach convergence. The initial position may have a large influence, and the proposal distribution may lead to low acceptance rates.
The chain may also get caught in a local maximum of the likelihood surface. The MH algorithm starts by assuming that the distribution we want to sample from has density proportional to some function f; we also need to pick a "proposal distribution" that changes the location at each iteration of the algorithm.
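The effect of the proposal scale on acceptance rates and mixing can be seen in a toy random-walk Metropolis sampler targeting a standard normal density; the step sizes below are illustrative:

```python
import numpy as np

def rw_metropolis(step, n_iter=20_000, seed=6):
    """Random-walk Metropolis targeting a standard normal density."""
    rng = np.random.default_rng(seed)
    x, chain, accepted = 0.0, [], 0
    for _ in range(n_iter):
        prop = x + step * rng.normal()
        # log acceptance ratio for target density proportional to exp(-x^2 / 2)
        if np.log(rng.uniform()) < (x ** 2 - prop ** 2) / 2:
            x, accepted = prop, accepted + 1
        chain.append(x)
    return np.array(chain), accepted / n_iter

chain_small, acc_small = rw_metropolis(step=0.05)  # tiny steps: high acceptance, slow exploration
chain_big, acc_big = rw_metropolis(step=2.4)       # larger step: moderate acceptance, better mixing
```

Trace plots of the two chains would show the small-step chain wandering slowly (the behavior the diagnostics above are meant to detect), while the larger-step chain traverses the target quickly despite rejecting more proposals.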
In this simulation study, we assume the true covariate X is generated from a Gamma distribution and its surrogate has a Poisson distribution with E(W i) = X i. Using 1,000,000 MCMC samples with a thinning interval of 500, the estimated posterior distribution is summarized as follows.

REAL DATA EXAMPLE
The literacy rate (LR) can be measured if all related variables are available. Along with the establishment of regional policy, in which regional governments have greater power to manage their regions, the availability of LR at lower administrative levels is necessary to monitor regional educational development.
The Susenas March 2020 survey provides data at the city/district level, and the data can also be disaggregated by rural/urban status, sex, age group, and family expenditure. This makes it possible to observe discrepancies in literacy levels among groups. However, the sampling design of Susenas accommodates estimation only at the district level; estimates at the lower sub-district level, although still unbiased, will have high variance.
The high variance results in broader confidence intervals, which make the estimates unreliable. One method to obtain accurate estimators from an inadequate sample size in a small area is SAE. Here the count response Y i is the number of illiterate people aged 15 and over in sub-district i, and from the same survey we choose the number of people living in rural areas as the covariate measured with error, W i. Because Y i is nonsymmetrical count data with an overdispersion problem, as shown in Table 2, we construct the Poisson-Lognormal model, combining the small area model with the hierarchical Bayes approach, to predict the number of illiterate people aged 15 years and above; the model takes into account both the random effects of the observed areas and the measurement error in the covariate.
As the Poisson-Lognormal model contains three parameters and the Bayes estimates cannot be obtained in closed form, the Metropolis-Hastings algorithm is used to generate MCMC samples from the conditional posterior density of each parameter; the resulting parameter estimates are shown in Table 3. In addition, the model yields a predicted illiteracy rate for Kepulauan Riau Province in 2020 of 3.66%. This prediction is greater than the estimated illiteracy rate for the province in 2020 reported in the 2021 publication of Kepulauan Riau Province, which is only 1%.

CONCLUSIONS
The main goal of this research was to develop the Poisson-Lognormal model with measurement error in its covariate using the HB approach, in the context of small area estimation for nonsymmetrical count data that exhibit overdispersion. The covariate measured with error is treated as a random variable and modeled with a structural measurement error model. Instead of observing the covariate directly, we observe a contaminated version of the true covariate, the surrogate. One of the more significant findings to emerge from this study is that the true or unobserved covariate can be modeled as a structural measurement error model, with its density function estimated using the Empirical Bayes Deconvolution (EBD) method. In the EBD method, the probability distribution of the unobserved covariate serves as the prior distribution, while its surrogate has a known probability distribution.
Within the HB framework, we developed four stages of the HB approach by adding the estimated density of the unobserved covariate as a prior distribution. Further research could also be conducted to explore the interdependence between the measurement error and the covariate, and whether this impacts the results of the EBD method.