BAYESIAN REGULARIZED TOBIT QUANTILE TO CONSTRUCT STUNTING RATE MODEL

Abstract: This study aims to identify the best model for the stunting rate by applying and comparing several methods based on modifications of the Tobit quantile regression method. The stunting rate dataset is left-censored and violates linear model assumptions; thus, Tobit quantile approaches are used. The Tobit quantile regression is combined with the Bayesian approach, since the Bayesian method yields valid inference even in small samples. The three modified Tobit quantile regression methods considered here are the Bayesian Tobit quantile regression, the Bayesian Adaptive Lasso Tobit quantile regression, and the Bayesian Lasso Tobit quantile regression. This article uses the skewed Laplace distribution as the likelihood function in the Bayesian analysis. This study used data on 3534 stunted children obtained from the Health Departments of several districts and municipalities in West Sumatra, Indonesia. The result of this study


INTRODUCTION
One-third of all deaths in children under five are caused by malnutrition [1]. Malnutrition has serious health, social, and economic consequences throughout one's life and across generations, making it the leading risk factor among children under the age of five worldwide [2]. Low height-for-age, also known as stunting, is a key indicator of chronic malnutrition because it reflects a failure to reach linear growth potential. Globally, depending on the precise definition and estimate, between 171 million and 314 million children under five are currently classified as stunted, with 36 African and Asian countries bearing 90% of this burden. West Sumatra, a province in Indonesia, has a stunting rate above 20%, which is higher than the WHO's tolerance. Therefore, the government of West Sumatra has made stunting a priority issue that must be addressed soon.
The stunting rate variable is a so-called limited dependent variable, whose distribution is mostly continuous but has a point mass at one or more specific values, such as zero. The Tobit model is one statistical approach to modeling limited dependent variables [3]. Tobit regression has become one of the most commonly used statistical tools for describing the relationship between a non-negative response variable and a set of covariates [4][5][6]. Tobit regression has been routinely applied in medicine, biology, ecology, economics, and social sciences [7][8][9][10][11]. This model is written in terms of a latent response $y_i^*$ as

$$y_i^* = x_i'\beta + \varepsilon_i, \quad i = 1, \dots, n,$$

where the $\varepsilon_i$ are residuals with $\varepsilon_i \sim N(0, \sigma^2)$, $x_i = (x_{i1}, \dots, x_{ik})'$, and $\beta = (\beta_1, \dots, \beta_k)'$. The observed stunting rate $y_i$ is assumed to be related to the latent value by the following:

$$y_i = \begin{cases} y_i^*, & \text{if } y_i^* > 0, \\ 0, & \text{otherwise.} \end{cases}$$
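The censoring mechanism above can be illustrated with a short simulation. This is only a sketch under assumed settings: the function name `simulate_tobit`, the standard normal covariates, and the coefficient values are hypothetical choices for illustration and are not taken from the stunting data.

```python
import numpy as np

def simulate_tobit(n, beta, sigma=1.0, seed=0):
    """Draw left-censored-at-zero data from a latent linear model (illustrative)."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    # Assumed design: independent standard normal covariates.
    X = rng.normal(size=(n, beta.size))
    # Latent response y* = x'beta + eps, with eps ~ N(0, sigma^2).
    y_star = X @ beta + rng.normal(scale=sigma, size=n)
    # Observed response: y = y* if y* > 0, and 0 otherwise.
    y = np.maximum(0.0, y_star)
    return X, y, y_star

X, y, y_star = simulate_tobit(n=100, beta=[1.0, 0.0, 1.2])
censored_share = float(np.mean(y == 0.0))
print(f"share of observations censored at zero: {censored_share:.2f}")
```

The point mass at zero appears as the share of observations with `y == 0`, which is the feature that makes ordinary least squares inappropriate for such a response.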
The response variable can be written as $y_i = \max\{0, y_i^*\}$, and the unknown parameter $\beta$ is estimated using the maximum likelihood method. Although the asymptotic theory for maximum likelihood has been well studied in Tobit models [12], a Bayesian method produces exact inference even if n is small [13][14][15]. Alhamzawi and Yu [6] applied a Bayesian approach to the Tobit regression model using the normal density for the residuals and generating $\beta$ from its full conditional posterior distribution using a Gibbs sampler. Alhamzawi and Ali [16] proposed the adaptive lasso in Tobit quantile regression using the Bayesian technique. Alhamzawi and Yu [6] suggested a Bayesian technique for coefficient estimation in the Tobit quantile regression (TobitQReg) model utilizing a g-prior distribution with a ridge parameter. Alhamzawi [4,5] proposed a Bayesian elastic net penalty in TobitQReg.
Mallick and Yi [11] provided a new technique for achieving the Bayesian Lasso in a traditional regression model using a scale mixture of uniforms formulation of the Laplace density.
The objective of this paper was to find the association between demographic, socioeconomic, and health factors and the stunting rate of children under 3 years in West Sumatra, Indonesia, by applying Bayesian Tobit quantile regression and its modified techniques. The analysis is expected to improve the design of intervention measures intended to reduce the prevalence of stunting and improve child health.
Since Eq. (3) is not differentiable at the origin, there is no closed-form solution for $\beta$, and the minimization of (2) can be achieved by a linear programming algorithm [17]. However, in high-dimensional censored data problems, the algorithm of [17] might be inefficient [18][19][20]. From a Bayesian perspective, Yu and Stander [19] suggested a Bayesian formulation of Tobit quantile regression employing a skewed Laplace distribution (SLD) for the errors as a "working model".
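To make the non-differentiability concrete, the following sketch evaluates the quantile check loss and the censored objective. It assumes that Eq. (3) is the usual check function $\rho_\theta(u) = u\{\theta - I(u < 0)\}$ and that Eq. (2) sums this loss over the residuals $y_i - \max\{0, x_i'\beta\}$; the function names are hypothetical.

```python
import numpy as np

def check_loss(u, theta):
    """Check function rho_theta(u) = u * (theta - 1{u < 0}); has a kink at u = 0."""
    u = np.asarray(u, dtype=float)
    return u * (theta - (u < 0))

def tobit_quantile_objective(beta, X, y, theta):
    """Sum of check losses for residuals y_i - max(0, x_i' beta) (assumed form of Eq. (2))."""
    fitted = np.maximum(0.0, X @ np.asarray(beta, dtype=float))
    return float(np.sum(check_loss(y - fitted, theta)))

# Positive and negative residuals are weighted asymmetrically:
print(check_loss(2.0, 0.25))   # theta * u
print(check_loss(-2.0, 0.25))  # (1 - theta) * |u|
```

Because the loss is piecewise linear in the residual, the minimization can be cast as a linear program, which is the route the text attributes to [17].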
The SLD connects the Bayesian analysis to standard frequentist Tobit quantile regression, which proceeds semiparametrically using $\rho_\theta$ as a loss function [21,22]. Let $\varepsilon_i$ follow an SLD whose parameters are the location (fixed at 0), the precision, and the skewness $\theta$, respectively. The density of the SLD for the error term ($\varepsilon_i$) is written explicitly as Under the above density, the joint distribution of

Bayesian Tobit Quantile Regression with Adaptive Lasso and Lasso
The Bayesian approach for the Tobit regression model, using the normal density for the residuals and generating $\beta$ from its conditional posterior distribution using a Gibbs sampler, has been proposed by Alhamzawi & Ali [16]. In the Tobit model, often only an unknown subset of predictors is important, so the problem of covariate selection is to identify these active covariates. Various approaches to dealing with covariate selection in quantile regression models have been proposed recently. Because unpenalized models are susceptible to overfitting, the least absolute shrinkage and selection operator (Lasso) method [23,24] has received much attention over the years. The Lasso is obtained for the quantile model by minimizing the following formula:

$$\min_{\beta} \sum_{i=1}^{n} \rho_\theta\left(y_i - \max\{0, x_i'\beta\}\right) + \lambda \sum_{j=1}^{k} |\beta_j|,$$

where $\lambda \sum_{j=1}^{k} |\beta_j|$ is called the penalty for the selection and estimation of quantile coefficients. Meanwhile, we also consider Tobit quantile regression with the adaptive Lasso penalty, which solves the following [25][26][27]:

$$\min_{\beta} \sum_{i=1}^{n} \rho_\theta\left(y_i - \max\{0, x_i'\beta\}\right) + \sum_{j=1}^{k} \lambda_j |\beta_j|, \qquad (8)$$

where the $\lambda_j$ are non-negative adaptive weights and $\lambda_j |\beta_j|$ is known as the adaptive penalty for selecting and estimating quantile coefficients. As the penalty parameters ($\lambda_j$, $j = 1, \dots, k$) increase, the Tobit quantile regression coefficients of the independent variables are continuously shrunk toward 0 and, due to the form of the adaptive penalty, some coefficients can be set exactly to 0. Now, suppose the errors $\varepsilon_i$ follow the asymmetric Laplace distribution (ALD) with a scale parameter $\sigma > 0$, and assign a Laplace prior distribution to the regression coefficients; this gives the conditional distribution of the regression coefficients in Eq. (10). Under this setting, maximizing the posterior of $\beta$ in Eq. (10) is equivalent to minimizing Eq. (8).
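The equivalence between posterior maximization and penalized minimization can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' sampler: it assumes an ALD likelihood kernel $\exp\{-\sum_i \rho_\theta(y_i - \max\{0, x_i'\beta\})/\sigma\}$ and independent Laplace priors with kernel $\exp\{-\lambda_j |\beta_j|/\sigma\}$. The scaling of the prior by $\sigma$ and all function names are assumptions made for this example.

```python
import numpy as np

def check_loss(u, theta):
    """Check function rho_theta(u) = u * (theta - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (theta - (u < 0))

def adaptive_lasso_objective(beta, X, y, theta, lam):
    """Check loss on censored residuals plus sum_j lam_j * |beta_j|.
    A scalar lam gives the ordinary Lasso penalty."""
    resid = y - np.maximum(0.0, X @ beta)
    return float(np.sum(check_loss(resid, theta)) + np.sum(lam * np.abs(beta)))

def neg_log_posterior(beta, X, y, theta, lam, sigma):
    """Negative log posterior of beta, dropping terms that do not depend on beta.
    Assumes ALD errors with scale sigma and Laplace priors scaled by sigma."""
    resid = y - np.maximum(0.0, X @ beta)
    log_lik = -np.sum(check_loss(resid, theta)) / sigma
    log_prior = -np.sum(lam * np.abs(beta)) / sigma
    return float(-(log_lik + log_prior))

# Illustrative data (not the stunting dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = np.maximum(0.0, X @ np.array([1.0, 0.0, 1.2]) + rng.normal(size=50))
lam = np.array([0.5, 2.0, 0.5])
beta = np.array([0.8, 0.1, 1.0])
sigma = 2.0

penalized = adaptive_lasso_objective(beta, X, y, theta=0.5, lam=lam)
posterior = neg_log_posterior(beta, X, y, theta=0.5, lam=lam, sigma=sigma)
print(np.isclose(posterior, penalized / sigma))
```

For fixed $\sigma$ the two functions differ only by the positive factor $1/\sigma$, so they share the same minimizer in $\beta$, which is the sense in which the posterior mode coincides with the penalized estimate.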

SIMULATION STUDIES
In this section, the performance of the Bayesian Tobit Quantile Regression (BTQR) and its modifications, i.e., Bayesian Adaptive Lasso Tobit Quantile Regression (BALTQR) and Bayesian Lasso Tobit Quantile Regression (BLTQR), is investigated and compared by simulation. The goal of this simulation study is to assess how well the three methods and their associated algorithms recover the true parameters. The methods are evaluated based on the median of mean absolute deviations, referred to as MMAD, and the standard deviation of the MAD. The MMAD is computed from the posterior mean $\hat{\beta}$ of $\beta$; the MMAD and its standard deviation are estimated over 200 replications. Model selection performance is evaluated based on the credible intervals for the approaches in the comparison.
a. Simulation 1 (sparse case): $\beta = (1, 0, 1.2, 0, 0, 8, 0, 0, 0, 0)'$
b. Simulation 2 (very sparse case): $\beta = (3, 0, 0, 0, 0, 0, 0, 0)'$
c. Simulation 3 (dense case): $\beta = (0.80, 0.80, 0.80, 0.80, 0.80, 0.80, 0.80, 0.80)'$
We consider four choices of $\theta$: 0.10, 0.25, 0.50, and 0.75. Under the three error distributions, the standard normal $N(0,1)$ and two further distributions with parameters (3) and (0.5, 1), the censoring levels of $y$ were 30%, 50%, and 30%, respectively. The biases of the three approaches are broadly similar.
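The evaluation criterion can be sketched as follows. Because the exact formula is not reproduced above, this example assumes the form commonly used in this literature, MAD $= \frac{1}{n}\sum_i |x_i'\hat{\beta} - x_i'\beta^{\text{true}}|$ per replication, with MMAD taken as the median over replications. The standard normal design and the noisy placeholder for the posterior mean are hypothetical stand-ins for actual MCMC output.

```python
import numpy as np

def mean_abs_deviation(X, beta_hat, beta_true):
    """MAD for one replication: mean |x_i' beta_hat - x_i' beta_true| (assumed form)."""
    return float(np.mean(np.abs(X @ beta_hat - X @ beta_true)))

def mmad_and_sd(mads):
    """Median of the per-replication MADs and their sample standard deviation."""
    mads = np.asarray(mads, dtype=float)
    return float(np.median(mads)), float(np.std(mads, ddof=1))

rng = np.random.default_rng(1)
beta_true = np.array([3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # Simulation 2
mads = []
for _ in range(200):  # 200 replications with n = 100, as in the text
    X = rng.normal(size=(100, beta_true.size))  # assumed design
    # Placeholder for a posterior mean; a real study would run the sampler here.
    beta_hat = beta_true + rng.normal(scale=0.1, size=beta_true.size)
    mads.append(mean_abs_deviation(X, beta_hat, beta_true))

mmad, sd = mmad_and_sd(mads)
print(f"MMAD = {mmad:.3f}, SD = {sd:.3f}")
```

Using the median over replications makes the summary less sensitive to an occasional poorly mixed chain than a mean would be.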
However, the BLTQR generally behaves much better than the other approaches (BTQR and BALTQR) in terms of absolute bias. Across the three simulations, the absolute bias obtained from the BLTQR method is much smaller at selected quantiles than that of the competing approaches. Most noticeably, when $\theta = 0.75$ the absolute bias generated by all three methods for all parameters is much smaller than the absolute bias at smaller quantiles. For the most extreme quantile ($\theta = 0.90$), however, the values of absolute bias are generally larger than at $\theta = 0.75$. We then check the results for the median of mean absolute deviations (MMAD) and the standard deviations (SD) of the MAD, as presented in Table 4. From Tables 4, 5, and 6 we can observe that, for the MMAD and SD criteria, the Bayesian Lasso Tobit Quantile Regression (BLTQR) method generally performs better than the other methods for all the distributions under consideration. In Simulations 1 and 2, the BLTQR method has the smallest MMAD in all 15 simulation setups. In Simulation 3, the BLTQR method has the smallest MMAD in 14 out of 15 simulation setups. In general, Bayesian Lasso Tobit quantile regression performs well compared to the two other methods, BTQR and BALTQR.

The mean stunting rate is 3.42 cm and the standard deviation is 3.758. Since the data concern stunted children, some children have zero height gain; thus the response is left-censored at zero.
This study assumed ten predictor variables as factors influencing the stunting rate, based on previous studies. The predictor variables consist of nine categorical variables, as presented in Table 7, and one numerical variable, i.e., birth weight ($x_2$). After fitting the linear regression model using the ordinary least squares (OLS) method, it is necessary to check whether the normality assumption of the residuals holds. To do so, the Chi-square test was performed, and the test shows that the normality assumption does not hold, with $p$-value $= 3.56 \times 10^{-8}$. Additionally, the histogram and the Q-Q plot show that the residuals depart from a normal distribution. Similar to the simulation studies, all three methods are then compared: BTQR, BALTQR, and BLTQR. For each method, the MAD is recorded. In this data, we considered four quantiles: 0.25, 0.50, 0.75, and 0.90. We ran the algorithms for 30,000 iterations, discarding the first 1000 as burn-in. The parameter estimates and the width of the 95% credible interval at each quantile for all three methods are provided in Table 8.
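The post-processing of the MCMC output described above (discarding burn-in, then reporting posterior means and 95% interval widths) can be sketched as below. The array `draws` is a random placeholder standing in for sampler output, and the function name `summarize_chain` is hypothetical; only the iteration count and burn-in length come from the text.

```python
import numpy as np

def summarize_chain(draws, burn_in, level=0.95):
    """Discard burn-in, then return posterior mean, equal-tailed interval bounds, and width."""
    kept = np.asarray(draws, dtype=float)[burn_in:]
    alpha = 1.0 - level
    lower, upper = np.quantile(kept, [alpha / 2.0, 1.0 - alpha / 2.0], axis=0)
    return kept.mean(axis=0), lower, upper, upper - lower

# Placeholder chain: 30,000 iterations for two coefficients (not real sampler output).
rng = np.random.default_rng(2)
draws = rng.normal(loc=[0.5, -1.0], scale=[1.0, 0.2], size=(30_000, 2))

post_mean, lower, upper, width = summarize_chain(draws, burn_in=1000)
print("posterior means:", np.round(post_mean, 2))
print("95% interval widths:", np.round(width, 2))
```

A narrower interval at the same level indicates a more concentrated posterior, which is the sense in which interval width is used to compare the three methods.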
Among the three methods, BLTQR achieves the best prediction accuracy. The 95% credible interval for BLTQR is narrower than those of BTQR and BALTQR when $\theta = 0.10$, 0.25, 0.50, and 0.75. However, BLTQR does not perform quite as well as BTQR and BALTQR when $\theta = 0.90$. Table 9 shows that, for the stunting rate dataset, the MAD of BLTQR was about 2.08% and 0.86% lower than that of BTQR and BALTQR, respectively, when $\theta = 0.25$.

CONCLUSIONS
In this article, we construct a model of the stunting rate in selected cities and districts in West Sumatra, Indonesia, using Bayesian Tobit quantile regression and its generalized methods. This study compares the results of the BTQR, BALTQR, and BLTQR methods using a simulation study and an empirical study. The Bayesian Tobit quantile regression and its generalized methods not only accommodate the messy attributes of the stunting rate response but also provide a complete picture of the covariate effects on the stunting rate distribution. Furthermore, they successfully select and model the important categorical predictors. Our findings are summarized below. First, exclusive breastfeeding affects the stunting rate only at the middle quantiles, at $\theta = 0.30$ and $0.50$.
Exclusive breastfeeding does not seem to be an important factor in the high stunting rate.
Comorbidity tends to be an important factor in stunting rates not only at lower quantiles but also at higher quantiles. The analysis of the simulation studies and the stunting rate dataset shows strong support for the use of Bayesian Lasso Tobit quantile regression for inference in Tobit quantile regression models. The proposed method generally behaves much better than the other approaches in terms of the width of the 95% credible interval and absolute bias.

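Figure 1. Normality assumption checking. (a) Histogram of the OLS residuals for the stunting rate dataset, (b) Q-Q plot of the OLS residuals for the stunting rate dataset.

Here $Q_\theta(y_i^* \mid x_i)$ is the $\theta$th conditional quantile of $y_i^*$ given $x_i$ with the parameters $\beta_\theta$, and the random error $\varepsilon_i$ follows a cumulative distribution function $F_i$ whose $\theta$th quantile conditional on $x_i$ equals zero. Assuming the linear model $Q_\theta(y_i^* \mid x_i) = x_i'\beta_\theta$ ($\beta_\theta \in \mathbb{R}^k$), an intuitive estimator for the Tobit quantile is:

$$\hat{\beta}_\theta = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\theta\left(y_i - \max\{0, x_i'\beta\}\right), \qquad (2)$$

where $\rho_\theta$ is the check function

$$\rho_\theta(u) = u\,\{\theta - I(u < 0)\}. \qquad (3)$$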

Table 2. Absolute Bias of Posterior Mean for the Simulated Data in Simulation 2, ε~N(0,1).

DEVA, RUDIYANTO, ZETRA, YAN, ROSALINDARI, YOZZA

For each error distribution, we simulate 200 data sets with sample size n = 100. We fit the models at five different quantiles, $\theta = 0.10$, 0.25, 0.50, 0.75, and 0.95. The MCMC algorithms are run for 17,000 iterations, discarding the first 2000 as burn-in. Methods are evaluated based on the smallest absolute bias of the model parameters. The results for each simulation at selected quantiles for each parameter under the normal distribution are presented in Tables 1, 2, and 3. Other results are available from the authors upon request.

Table 3. Absolute Bias of Posterior Mean for the Simulated Data in Simulation 3, ε~N(0,1)

Table 4. MADs and Standard Deviations (SD) of MADs for Simulation 1

Table 5. MADs and Standard Deviations (SD) of MADs for Simulation 2, $\beta = (3, 0, 0, 0, 0, 0, 0, 0)'$

MODELING STUNTING RATE
All three methods are then applied to construct a model of the stunting rate in West Sumatra, Indonesia. The data, obtained from the Health Offices of several districts and cities in West Sumatra, concern the determinants of stunting in August 2021 and February 2022. The response variable represents the stunting rate of 3534 stunted children (in cm) from August 2021 to February 2022; the summary statistics for the response are provided in Figure 1.

Table 6. MADs and Standard Deviations (SD) of MADs for Simulation 3

Table 7. Summary Statistics of Categorical Variables

Table 8. Estimates of Model Parameters for the Stunting Data Set