Research on Quantile Regression Method for Longitudinal Interval-Censored Data Based on Bayesian Double Penalty

The increasing prominence of censored data in various fields has made parameter estimation and variable selection in censored mixed-effects models one of the hotspots of current research. In this paper, considering the situation in which the response variable is restricted by two-sided limits, a double-penalty Bayesian Tobit quantile regression model is constructed to carry out parameter estimation and variable selection in the interval-censored mixed-effects model. The fixed-effects and random-effects coefficients of the Tobit mixed-effects model are shrunk simultaneously, so that the estimation error of the model is reduced while variable selection is carried out. The posterior distribution of each unknown parameter is derived using the conditional Laplace prior and the mixed truncated normal distribution of interval-censored data, and a Gibbs sampling algorithm for estimating the unknown parameters is then constructed. Monte Carlo simulation shows that the new method has advantages over the classical method in variable selection and parameter estimation accuracy under different model sparsity, different data censoring ratios, and different random error distributions, and that the model achieves automatic variable selection. Finally, the new method is used to analyze the correlation between the crime rate and various economic indicators in China.


Introduction
In practical statistical research, censored data arising from various factors are becoming more and more prominent, and how to estimate parameters and select variables in censored mixed-effects models has become a hot research topic. Interval-censored data challenge traditional least squares estimation methods because of their boundedness, asymmetry, and bias. Earlier studies, such as Song et al. [1] and Ferrari [2], have pointed out that general linear regression methods lead to significant estimation bias when dealing with interval-censored data. Although Lesaffre et al. [3] attempted regression analysis through parameter transformation, the results were not satisfactory. To cope with the asymmetry problem, Espinheira [4] proposed a regression model based on the Beta distribution, but the method was limited to proportion data that conform to the Beta distribution. Zhao [5] investigated the variable selection problem combined with a penalty function in the context of interval censoring. On the other hand, Ying et al. [6] proposed a resampling algorithm based on large-sample properties that significantly improved computational efficiency. However, parameter estimation and variable selection for interval-censored data in high-dimensional contexts still require in-depth research.
As early as 1958, Tobin [7] proposed the Tobit model for restricted response variables. Since this type of model, like traditional linear regression models, relies mainly on the mean to estimate the regression coefficients [8], it often fails to reveal the full extent of the regression information. To improve on this, Powell investigated quantile regression estimation for the Tobit model in 1986 [9], but its asymptotic covariance matrix depends on the error density function, which affects estimation reliability [10]. When the model is applied to high-dimensional longitudinal data, the complexity of censored data, random effects, and random errors further exacerbates the difficulty of parameter estimation [11]. Therefore, developing novel and efficient sampling algorithms for parameter estimation and variable selection in Tobit quantile regression models is important for improving model accuracy and providing reliable statistical tools for related fields.
In Tobit quantile regression models, although mixed-effects models can comprehensively consider the covariates affecting the response variables [12], they are computationally intensive and may affect the accuracy of parameter estimation [13]. Parameter estimation and variable selection of mixed-effects models with censored data using different penalization methods can effectively make up for the shortcomings of traditional methods [14]. Facing the complexity of variable selection and estimation [15], the Bayesian method avoids the difficulty of penalty parameter selection by treating the parameters as distributions [16]. Incorporating the penalty function into the Bayesian method to construct a hierarchical quantile regression model provides a new idea for parameter estimation of the Tobit quantile regression model. In this paper, based on the Bayesian framework, a double-penalty quantile regression method is proposed to deepen the application and effect study of the Tobit quantile regression model in censored data.
In recent years, the use of penalized methods for dimension reduction and modeling of high-dimensional data has attracted much attention. Alhamzawi et al. [17] and Alhamzawi [18] proposed Tobit quantile regression methods with the adaptive Lasso and the adaptive elastic net from a Bayesian perspective, which achieve variable selection through gamma priors and Gibbs sampling algorithms. Alhusseini [19], on the other hand, introduced Lasso penalties into a Bayesian Tobit quantile regression model, with coefficient estimation based on a scale mixture of uniform priors. However, applying these methods directly in mixed-effects models with latent variables may lead to biased regression coefficient estimates. For this reason, Alhamzawi and Ali [20] used a mixture of asymmetric Laplace distributions for regularization to avoid the non-convex minimization problem. Abbas [21] introduced ridge regression parameters into the covariance matrix to deal with the multicollinearity problem. Although most current research focuses on fixed-effects models, inference for censored data in mixed-effects models, especially from the perspective of Bayesian methods with penalties, still needs further exploration.
This paper focuses on longitudinal data Tobit models containing latent variables, which are transformed into Tobit models for interval-censored data by adjusting the constraints on the response variables. Based on the Bayesian approach and combined with the penalty function, the Bayesian double-penalty Tobit quantile regression model with interval censoring of the response variable is constructed. We will explore the parameter estimation and variable selection problems under different penalty function methods, different censoring proportions, and different random error distributions, respectively, with a view to providing new perspectives and ideas for research in this field.

Bayesian Tobit Hierarchical Quantile Regression Model for Longitudinal Interval-Censored Data
Since the response variable of interval-censored data has both a lower and an upper bound, estimates obtained by directly fitting the censored regression model of Powell [9] may fall outside these bounds. In the mixed-effects model, the Tobit model for interval-censored data is developed from the Tobit linear model by introducing latent variables:
$$y_{ij}^{*} = x_{ij}'\beta + z_{ij}'\alpha_i + \varepsilon_{ij},$$
where xij' is the value of the explanatory variables for the i-th individual at the j-th observation time, zij' is the covariate vector corresponding to the random effects, and β and αi are the fixed-effects and random-effects coefficient vectors, respectively. The distribution of the perturbation term εij is unknown; yij is the observed value of the response variable for the i-th individual at the j-th time point, yij* is the corresponding unobserved latent dependent variable, and I(·) is the indicator function that links yij to yij* through the censoring bounds. For the estimation of the quantile regression coefficients, β and α in the above model are obtained by minimizing the objective function in Equation (5), in which ρτ(u) = u(τ − I(u < 0)) is the quantile loss function.
According to the traditional Bayesian quantile regression method, assuming that εij follows an asymmetric Laplace distribution (ALD), the likelihood function of the sample can be written down. Kottas and Krnjajić [22] found that the asymmetric Laplace distribution can be decomposed into a mixture of normal and exponential distributions. The ALD is further decomposed into N(0,1) and E(1/σ) components, so that yij* can be expressed as a mixture of normal and exponential variables [23]. Assuming that each parameter has a prior distribution, a Bayesian Tobit hierarchical quantile regression model for interval-censored data (P-BTQR) can be built.
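To make the two building blocks of this section concrete, the following minimal sketch implements the quantile loss and draws ALD errors through a normal-exponential mixture. The parametrization (the constants `theta` and `kappa2`) is the commonly used one for this mixture and is an assumption here, since the paper's own expression is not reproduced in the text; the function names are illustrative only.

```python
import math
import random

def check_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def draw_ald_error(tau, sigma, rng):
    """Draw one ALD(0, sigma, tau) error as a normal-exponential mixture.

    Assumed parametrization: eps = theta * v + sqrt(kappa2 * sigma * v) * u,
    with v exponential with rate 1/sigma and u standard normal.
    """
    theta = (1.0 - 2.0 * tau) / (tau * (1.0 - tau))
    kappa2 = 2.0 / (tau * (1.0 - tau))
    v = rng.expovariate(1.0 / sigma)  # exponential with rate 1/sigma (mean sigma)
    u = rng.gauss(0.0, 1.0)           # standard normal component
    return theta * v + math.sqrt(kappa2 * sigma * v) * u

rng = random.Random(2024)
errors = [draw_ald_error(0.25, 1.0, rng) for _ in range(20000)]
# Under this construction the tau-th quantile of the errors is zero,
# so the share of non-positive draws should be close to tau.
share_below_zero = sum(e <= 0 for e in errors) / len(errors)
```

The mixture form is what makes Gibbs sampling practical: conditional on the exponential variable, the latent response is normal.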

Bayesian Double Lasso Penalized Quantile Regression Method for Tobit Model
Lasso penalties and adaptive Lasso penalties are applied to fixed-effect β and random effect αi, respectively, and the dual Lasso Bayesian Tobit quantile regression method (PDL-BTQR) and dual-adaptive Lasso Bayesian Tobit quantile regression method (PDAL-BTQR) for interval-censored data are proposed.
Firstly, in the mixed-effects model, both the fixed effects β and the random effects αi are assumed to have conditional Laplace priors. Using the integral identity proposed by Andrews and Mallows [24],
$$\frac{a}{2}e^{-a|z|} = \int_0^{\infty} \frac{1}{\sqrt{2\pi s}}\exp\left(-\frac{z^2}{2s}\right)\frac{a^2}{2}\exp\left(-\frac{a^2 s}{2}\right)ds, \quad a > 0,$$
each Laplace prior can be written as a scale mixture of normal distributions with exponential mixing densities, so that, conditional on the mixing variables, each coefficient follows a zero-mean normal distribution. From Equations (8), (11) and (12), the posterior distribution of the fixed effects β and the random effects αi can then be derived. In Bayesian statistics, maximizing the conditional posterior density function is equivalent to minimizing the negative log posterior density function, because the logarithm is monotonically increasing and converts products into sums, which simplifies the optimization. For Bayesian Tobit quantile regression with a double Lasso penalty, the posterior distribution combines the data likelihood, the prior distribution, and the Lasso penalty term, and the parameters are estimated by minimizing the negative log posterior. Maximizing the conditional posterior density in Equation (13) is therefore equivalent to minimizing the Bayesian Tobit quantile regression objective with the double Lasso penalty in Equation (14). Next, prior distributions are assumed for the remaining unknown parameters.
Based on the prior distributions and prior density functions, the likelihood function of the resulting model can be rewritten. The conditional posterior distribution of the latent variable yij* is an interval-truncated normal distribution, built from right-truncated and left-truncated normal distributions at the censoring bounds. Since yij depends only on the distribution of yij*, yij is also truncated normal. The conditional posterior densities of the other unknown parameters in the model are derived below.
For sl, l = 1, …, k: the posterior distribution of the mixing parameter is an inverse Gaussian distribution, whose conditional probability density function involves the modified Bessel function of the third kind.
For the penalty parameter η1² of the fixed effects: its posterior distribution is obtained from the corresponding conditional density.
For the fixed effects β: the conditional posterior distribution of the fixed effects is a normal distribution.
For rit, i = 1, …, n, t = 1, …, q: the mixing parameter rit follows an inverse Gaussian distribution.
For the penalty parameter η2² of the random effects: its posterior distribution is obtained in the same way as for η1².
For the random effects αi: the conditional posterior distribution of the random effects is a normal distribution with its own mean vector and covariance matrix.
For vij: the conditional posterior density is derived from the mixture representation of the ALD.
Finally, the conditional posterior distribution of σ is derived.
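The latent-variable step described above can be sketched as follows: a censored observation at the lower bound is replaced by a draw from a normal distribution truncated to the region below that bound, one at the upper bound by a draw truncated to the region above it, and an uncensored observation is kept as is. This is a minimal sketch using inverse-CDF sampling from the Python standard library; the function names, and the conditional mean and standard deviation passed in, are hypothetical placeholders for the quantities given by the model's conditional posterior.

```python
import random
from statistics import NormalDist

INF = float("inf")

def draw_truncated_normal(mean, sd, low, high, rng):
    """Inverse-CDF draw from N(mean, sd^2) restricted to [low, high]."""
    dist = NormalDist(mean, sd)
    p_low = 0.0 if low == -INF else dist.cdf(low)
    p_high = 1.0 if high == INF else dist.cdf(high)
    u = rng.uniform(p_low, p_high)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # inv_cdf is undefined at 0 and 1
    x = dist.inv_cdf(u)
    return min(max(x, low), high)        # guard against rounding at the edges

def update_latent(y, mean, sd, lower, upper, rng):
    """Impute the latent response y* given the observed, interval-censored y."""
    if y <= lower:   # censored from below: y* lies at or below the lower bound
        return draw_truncated_normal(mean, sd, -INF, lower, rng)
    if y >= upper:   # censored from above: y* lies at or above the upper bound
        return draw_truncated_normal(mean, sd, upper, INF, rng)
    return y         # uncensored: the latent value is observed

rng = random.Random(1)
latent = [update_latent(y, 2.0, 1.5, 0.0, 6.0, rng) for y in (0.0, 3.2, 6.0)]
```

Inverse-CDF sampling loses accuracy when the truncation region lies far in the tails; a rejection sampler would be a more robust choice there.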

Bayesian Dual Adaptive Lasso Penalized Quantile Regression for Tobit Models
The adaptive Lasso assigns different penalties to different coefficients, which improves estimation accuracy and gives the estimator the oracle property [25]. Under the Lasso penalty studied by Alhamzawi et al. [26], in the mixed-effects model the fixed effects β and the random effects αi depend on the single conditional parameters η1 and η2, respectively, whereas the shrinkage applied to each component of β and αi should be different. Therefore, this section proposes a Tobit quantile regression model with a dual adaptive Lasso penalty from a Bayesian perspective. First, assume that the fixed effects β and the random effects αi both have conditional Laplace priors, Equations (28) and (29), each with a vector of component-specific penalty parameters for β and αi. Equations (28) and (29) can then be rewritten in an equivalent form. Maximizing the conditional posterior density function of β and αi is equivalent to minimizing the Bayesian Tobit quantile regression objective with the double adaptive Lasso penalty in Equation (32).
The double adaptive Lasso with oracle properties [27] is theoretically equivalent to its Bayesian formulation. Considering that the Laplace priors in Equations (28) and (29) are not conjugate, an expression for the conditional prior distribution can be written in integral form. The conditional prior distribution of the fixed effects in Equation (30) can be rewritten, again using the integral identity, by introducing auxiliary variables. Similarly, for the random effects αi, the conditional prior distribution can be rewritten by introducing auxiliary variables, and the joint prior of (αi, S) can be viewed as a mixture of normal and exponential distributions. Next, assuming π(σ) ~ IG(c0, d0), the likelihood function of the model is obtained. In contrast to the dual Lasso Bayesian Tobit quantile regression method, here it is sufficient to change the conditional posterior distributions of the mixing parameters and the penalty parameters. Further derivation is as follows.
For rl, l = 1, …, k: the posterior distribution of the mixing parameter rl is derived first. The adaptive penalty parameter associated with rl follows the inverse Gamma distribution with shape parameter 1 + e0 and scale parameter (f0 + rl/2).
For sit, i = 1, …, n, t = 1, …, q: the mixing parameters sit follow the inverse Gaussian distribution.
For the adaptive penalty parameters of the random effects: the posterior distribution is derived analogously.
Bayesian approaches, which utilize prior distributions of regression coefficients and regularization parameters, allow for a Bayesian treatment of the adaptive Lasso that quantifies uncertainty by introducing prior knowledge and providing posterior distributions [28].
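As an illustration of the penalty-parameter step, the sketch below draws each adaptive penalty parameter from an inverse Gamma distribution with shape 1 + e0 and scale f0 + rl/2, as stated in the text. It assumes the usual convention that the reciprocal of a Gamma(shape, rate = scale) variable is inverse Gamma(shape, scale); the function names and the numerical values of e0, f0, and rl are hypothetical.

```python
import random

def draw_inverse_gamma(shape, scale, rng):
    """Draw from InvGamma(shape, scale).

    If G ~ Gamma(shape, rate=scale) then 1/G ~ InvGamma(shape, scale);
    equivalently scale / Gamma(shape, 1).
    """
    return scale / rng.gammavariate(shape, 1.0)

def update_adaptive_penalties(r, e0, f0, rng):
    """One draw per coefficient: shape 1 + e0, scale f0 + r_l / 2."""
    return [draw_inverse_gamma(1.0 + e0, f0 + r_l / 2.0, rng) for r_l in r]

rng = random.Random(42)
mixing = [0.4, 1.3, 2.0]  # hypothetical current values of the mixing parameters r_l
penalties = update_adaptive_penalties(mixing, e0=0.1, f0=0.1, rng=rng)
```

Because each coefficient has its own mixing variable, each receives its own penalty draw, which is what distinguishes the adaptive Lasso from the plain Lasso with one shared penalty.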

Gibbs Sampling Algorithm for DL-BTQR
The Gibbs sampling algorithm for the dual Lasso Bayesian Tobit quantile regression method (DL-BTQR) is as follows.
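The general shape of such a sampler can be sketched as a loop that visits each block of parameters in turn and redraws it from its conditional posterior given the current values of all the others. For DL-BTQR the blocks would be the latent responses, the mixing parameters, the penalty parameters, β, αi, vij, and σ; since those conditional distributions are not reproduced here, the runnable sketch below substitutes a toy bivariate normal target, and all function names are hypothetical.

```python
import math
import random

def gibbs_sampler(state, updates, n_iter, burn_in):
    """Cycle through conditional updates; keep the draws after burn-in."""
    draws = []
    for it in range(n_iter):
        for name, update in updates:
            state[name] = update(state)  # redraw one block given the rest
        if it >= burn_in:
            draws.append(dict(state))
    return draws

def toy_updates(rho, rng):
    """Full conditionals of a standard bivariate normal with correlation rho."""
    sd = math.sqrt(1.0 - rho ** 2)
    return [
        ("x", lambda s: rng.gauss(rho * s["y"], sd)),
        ("y", lambda s: rng.gauss(rho * s["x"], sd)),
    ]

rng = random.Random(0)
draws = gibbs_sampler({"x": 0.0, "y": 0.0}, toy_updates(0.5, rng),
                      n_iter=6000, burn_in=1000)
posterior_mean_x = sum(d["x"] for d in draws) / len(draws)
```

Replacing the toy updates with the model's own conditional draws, in the order listed above, would give the structure of the sampler without changing the loop.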

Gibbs Sampling Algorithm for DAL-BTQR
A main advantage of the Bayesian double-penalized adaptive Lasso with the Gibbs sampling algorithm is that it does not require consistent initial estimates of the regression coefficients. In high-dimensional data, the number of features may be much larger than the number of samples, causing traditional regression methods to fail. The Bayesian adaptive Lasso combined with the Gibbs sampling algorithm, on the other hand, can handle such high-dimensional situations without consistent initial estimates: by introducing prior distributions and posterior inference, parameter values are obtained directly through sampling, allowing efficient variable selection and parameter estimation. The Gibbs sampling algorithm for the dual adaptive Lasso Bayesian Tobit quantile regression method (DAL-BTQR) is as follows.

Comparative Analysis of Monte Carlo Simulations
The simulated data are generated from the mixed-effects model, in which each explanatory variable x follows a standard normal distribution. In practical applications, there may be strong correlations between neighboring variables and weak correlations between variables in more distant positions; the correlation coefficient between any two explanatory variables xl and xk is determined by ρ, and ρ = 0.5 is chosen as a reasonable approximation. Two sets of explanatory variable coefficients are considered: a sparse coefficient vector for sparse longitudinal data, and the dense coefficient vector β = (1, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5) for dense longitudinal data, with the random effects assumed to follow a zero-mean distribution. Each simulation is repeated 100 times, with weakly informative priors used throughout. This section first simulates the estimation results of three methods for interval-censored data at different quantile points and for the two sets of explanatory variable coefficients: unpenalized Bayesian Tobit quantile regression (P-BTQR) [19], double Lasso penalized Bayesian Tobit quantile regression (PDL-BTQR) [21], and double adaptive Lasso penalized Bayesian Tobit quantile regression (PDAL-BTQR) [17]. The censoring ratio is then changed and the three methods are compared again for the two sets of coefficients. Finally, the distribution of the random errors is changed. To evaluate the accuracy of model estimation, the mean square error (MSE) is again selected as the evaluation index. The reported mean and standard deviation are those of the MSE and of the estimated coefficients over the 100 simulations, and the interval estimate of each explanatory variable coefficient is at the 95% level.
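The design above can be illustrated with a short sketch. It assumes the common autoregressive structure in which the correlation between xl and xk is ρ^|l−k| (strong for neighbors, weak for distant variables), and it takes the MSE to be the average squared difference between estimated and true coefficients; both are assumptions, since the paper's exact expressions are not reproduced in the text, and the function names are hypothetical.

```python
import math
import random

def ar1_correlation(p, rho):
    """Assumed correlation matrix with entries rho ** |l - k|."""
    return [[rho ** abs(l - k) for k in range(p)] for l in range(p)]

def draw_covariates(p, rho, rng):
    """One covariate row: standard normal margins, corr(x_l, x_k) = rho ** |l - k|."""
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(1, p):
        x.append(rho * x[-1] + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0))
    return x

def mse(beta_hat, beta_true):
    """Average squared error of the coefficient estimates (assumed definition)."""
    return sum((b - t) ** 2 for b, t in zip(beta_hat, beta_true)) / len(beta_true)

beta_dense = [1, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]  # dense setting from the text
rng = random.Random(3)
rows = [draw_covariates(len(beta_dense), 0.5, rng) for _ in range(5000)]
```

In a full replication, each of the 100 simulated data sets would be fitted by the three methods and the resulting `mse` values averaged.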

Keeping the remaining simulation settings constant, this section investigates the estimation results for sparse longitudinal data and dense longitudinal data under different quantiles and methods, which contain the mean and standard deviation of MSE together with the estimated coefficients of each explanatory variable and their corresponding interval estimates. The results are presented in Tables 1 and 2. In the mixed-effects model, the double-penalized quantile regression method differs from existing studies in that both fixed effects and random effects have corresponding shrinkage coefficients, thus eliminating irrelevant variables to a greater extent. In addition, this paper breaks with the existing literature by considering the estimation and selection of variables under the influence of different random effects only.
Figures 1-3 refer to sparse longitudinal data with the interval censoring range [0, 6]. At the lower quantile τ = 0.25, the MSE mean and standard deviation show the lowest estimation error and fluctuation for the double adaptive Lasso penalty; that is, the PDAL-BTQR method gives the best estimates, and the redundant variables are almost all shrunk to 0. At the middle quantile τ = 0.5, among the MSE results obtained by the P-BTQR, PDL-BTQR, and PDAL-BTQR methods, the standard deviation of MSE for the unpenalized P-BTQR method is twice that of the PDL-BTQR method and nearly three times that of the PDAL-BTQR method, which demonstrates the superiority of the double-penalty methods for parameter estimation. At the upper-middle quantile τ = 0.75, the mean MSEs of the PDL-BTQR and PDAL-BTQR methods are not significantly different, the former being slightly higher than the latter, but both are smaller than the mean MSE of the P-BTQR method. This means that both double-penalized methods obtain accurate estimation results. At the high quantile τ = 0.95, the MSE means of the three methods are noticeably higher, but the PDAL-BTQR method obtains the smallest MSE mean and standard deviation and the most accurate interval estimates in 100 simulations. In addition, the model setting assumes that the explanatory variable coefficients β1 and β2 are disturbed by random effects; the fluctuations of their estimates are more obvious at the extreme quantiles, indicating that these disturbances have the greatest impact on estimation at τ = 0.25 and τ = 0.95.

Simulation Results under Different Quantiles of Dense Longitudinal Data
According to Figures 4-6, for dense longitudinal data the PDAL-BTQR method estimates better than the PDL-BTQR and P-BTQR methods at the lower quantile τ = 0.25, with the smallest MSE mean value of 0.062. At the middle quantile τ = 0.5, the MSE mean value of the PDL-BTQR method is slightly lower than those of the PDAL-BTQR and P-BTQR methods. At τ = 0.75, the MSE means of the double-penalty PDAL-BTQR and PDL-BTQR methods are smaller than that of the unpenalized P-BTQR method, so under this condition the parameter estimation and variable selection of the unpenalized method are not as good as those of the double-penalty methods. In terms of the mean and standard deviation of MSE, the estimation effects of the PDAL-BTQR and PDL-BTQR methods are not significantly different, which is similar to the results in Table 1. At the high quantile τ = 0.95, the mean square error of all three methods becomes relatively large: the mean MSEs of the P-BTQR and PDL-BTQR methods are 0.245 and 0.201, respectively, while the mean MSE of the PDAL-BTQR method is 0.193, indicating that the dual adaptive Lasso has the best estimation effect at the high quantile. In summary, the double-penalized Bayesian Tobit quantile regression method obtains more accurate parameter estimates than the no-penalty method for both sets of explanatory variable coefficients. Performance is comparable at the middle quantile, but the accuracy of the double-penalized method is higher at the extreme quantiles.

Comparative Analysis of Simulation Results under Different Censoring Ratios
The estimation results of each method under different censoring ratios were compared by setting the censoring ratio to 10%, 20%, and 30%, respectively, while keeping the remaining settings constant. Tables 3 and 4 show only the simulation results at the 0.5 quantile. According to Figures 7-9, in the sparse longitudinal data model, the mean MSE values for the P-BTQR method were 0.029, 0.030, and 0.033 at censoring ratios of 10%, 20%, and 30%, respectively; the mean MSE values for the PDL-BTQR method were 0.026, 0.027, and 0.032; and the mean MSE values for the PDAL-BTQR method were 0.025, 0.026, and 0.031. The mean MSE values thus decrease in order across the three methods. Compared with plain quantile regression, Lasso quantile regression imposes a Lasso penalty on each explanatory variable, which can speed up model computation and reduce the bias of parameter estimation, so the simulation results of the PDL-BTQR method are more accurate than those of the P-BTQR method. However, the Lasso applies the same penalty to every coefficient, and the adaptive Lasso penalty function breaks through this limitation. From the simulation results, the PDAL-BTQR method is more effective than the PDL-BTQR and P-BTQR methods for parameter estimation and variable selection under different censoring ratios.

Simulation Results under Different Censoring Ratios for Dense Longitudinal Data
According to Figures 10-12, for dense longitudinal data, the mean MSEs of the P-BTQR, PDL-BTQR, and PDAL-BTQR methods are 0.042, 0.040, and 0.041, respectively, at a 10% censoring ratio, so the estimation method with the double Lasso penalty is the most accurate. As the censoring ratio increases to 20%, the mean MSE of the P-BTQR method becomes larger while the mean MSEs of the PDL-BTQR and PDAL-BTQR methods decrease, which also indicates that the parameter estimation performance of the double-penalty methods is better than that of the P-BTQR method. When the censoring ratio is further increased to 30%, the MSEs of all three methods increase, but the MSEs of the double-penalized PDL-BTQR and PDAL-BTQR methods are 0.046 and 0.047, respectively, still smaller than the 0.048 of the P-BTQR method. Penalizing both the fixed-effects and random-effects coefficients in the interval-censored mixed-effects model therefore yields more accurate estimates of the model parameters. The PDL-BTQR method outperforms the other two methods for dense longitudinal data as the censoring ratio becomes larger, whereas the PDAL-BTQR method gives the best estimates for sparse longitudinal data. The MSE mean value increases with the censoring proportion, implying that the estimation accuracy of the fixed effects decreases, especially for β1 and β2, because the simulation in this section assumes that the first two variables are subject to random effects. Combining the simulation results for the two sets of explanatory variable coefficients, the PDL-BTQR and PDAL-BTQR methods give better estimation results, and their advantages in variable selection and estimation are more prominent than those of the P-BTQR method.
Keeping the remaining settings constant, this paper considers the variable selection and estimation results of the three methods, P-BTQR, PDL-BTQR, and PDAL-BTQR, simulated under random errors following a standard normal distribution, a t(3) distribution, and an ALD(0, 0.5, 1) distribution, respectively. The advantage of the Bayesian framework is that the unknown parameters can be viewed as following a certain prior conditional distribution. After the prior conditional distributions of the parameters to be estimated are given, the Gibbs sampling algorithm is used for parameter estimation and variable selection, and the estimation results at the 0.5 quantile are shown in Tables 5 and 6. According to Figures 13-15, for sparse longitudinal data with standard normal random errors, the mean MSE values of the P-BTQR, PDL-BTQR, and PDAL-BTQR methods are 0.054, 0.041, and 0.039, respectively. All three methods obtain parameter estimates with small deviation, but the PDL-BTQR method performs better than the P-BTQR method. Comparing the two double-penalty methods, the PDAL-BTQR method is superior, and its MSE standard deviation is the smallest at 0.052, indicating that its results fluctuate less over the 100 simulations and that it obtains accurate estimates more stably. When the random errors follow the t(3) distribution, the mean MSE of the unpenalized P-BTQR method increases markedly to 0.077, indicating that its parameter estimates under the t(3) distribution are not as accurate as those under the standard normal distribution, while the corresponding mean MSEs of the PDL-BTQR and PDAL-BTQR methods are 0.057 and 0.056, respectively. The difference between these two is not significant, which indicates that the double-penalty methods have an obvious advantage in estimation. When the random errors follow the ALD(0, 0.5, 1) distribution, the estimation effects of the PDL-BTQR and PDAL-BTQR methods are almost the same, and their mean MSE values are lower than that of the P-BTQR method. From the simulation results, the PDL-BTQR and PDAL-BTQR methods perform better than the P-BTQR method regardless of the random error distribution.

Simulation Results under Different Random Error Distributions for Dense Longitudinal Data
According to Figures 16-18, in the dense longitudinal data model with standard normal random errors, the MSEs obtained by the P-BTQR and PDAL-BTQR methods have the same mean value and both give accurate estimation results, while the PDL-BTQR method is better in comparison. When the random errors follow the t(3) distribution, similar to the results in Table 5, the mean MSE values of the three methods also increase markedly. The P-BTQR method has the highest mean MSE value of 0.074, indicating that the no-penalty method estimates poorly, while the mean MSE values of the PDL-BTQR and PDAL-BTQR methods are 0.065 and 0.068, respectively, indicating that the double-penalty methods yield less biased parameter estimates. Under the t(3) distribution, the mean MSEs of the PDL-BTQR and PDAL-BTQR methods are lower for sparse longitudinal data than for dense longitudinal data, indicating that the double-penalty methods are more advantageous in handling sparse longitudinal data. When the random errors follow the ALD(0, 0.5, 1) distribution, the mean MSE values obtained by the P-BTQR, PDL-BTQR, and PDAL-BTQR methods are 0.046, 0.043, and 0.044, respectively, and the double-penalty methods continue to perform better than the no-penalty method. In addition, the simulation results show that the PDL-BTQR method has the best parameter estimation effect for dense longitudinal data under the different random error distributions. In summary, the estimation error of the double-penalty Bayesian Tobit quantile regression methods is the smallest whether the random errors follow the standard normal distribution, the t(3) distribution, or the ALD(0, 0.5, 1) distribution. For different types of longitudinal data, both the PDL-BTQR and PDAL-BTQR methods yield better parameter estimation and variable selection results for mixed-effects models with interval-censored response variables.

Time Consumption for the Methods
One important topic in modeling analysis is the time required for computation. Although advances in computer technology mean that much ordinary data can be handled comfortably, the increasing requirements for model accuracy and the emergence of high-dimensional, massive, and complex data make computation time a serious concern even for the most advanced computers. The double-penalized Bayesian quantile regression method proposed in this study also involves large-scale operations. Below we use the sparse longitudinal data mixed-effects model from Section 3 to compare the computing time of the methods considered in this paper. These methods are:
(1) Unpenalized Bayesian Tobit quantile regression for interval-censored data (P-BTQR);
(2) Single-Lasso penalized Bayesian Tobit quantile regression for interval-censored data (PL-BTQR);
(3) Single-Adaptive Lasso penalized Bayesian Tobit quantile regression for interval-censored data (PAL-BTQR);
(4) Double-Lasso penalized Bayesian Tobit quantile regression for interval-censored data (PDL-BTQR);
(5) Double-Adaptive Lasso penalized Bayesian Tobit quantile regression for interval-censored data (PDAL-BTQR).
The prior settings in the Bayesian approach and the parameter settings in the double-penalized quantile regression are the same as in the previous simulations. The number of iterations for all Bayesian methods was 20,000. The computer configuration is: Intel(R) Core(TM) 2 Duo CPU, 2.10 GHz, 2 GB RAM; the running platform is R version 4.4.2, and the Bayesian methods all use BUGS 1.4.
Table 7 gives the average user time, system time, and elapsed time of the above methods over 50 repetitions of the simulation, all in seconds. Because the Bayesian methods call the BUGS software, the user time and system time do not include the actual sampling time, so it is more appropriate to compare the total running time. The double-penalized Bayesian method is slightly shorter in running time than the single-penalized method, and it offers greater advantages in practical applications. Since the double-penalty method considers both fixed and random effects and penalizes them appropriately, it can provide more accurate parameter estimates and more reliable prediction results. In addition, the double-penalty method is capable of automatic variable selection and parameter shrinkage, which further improves the generalization ability and interpretability of the model.
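The distinction between user, system, and elapsed time can be illustrated with a small timing helper. This is a sketch in Python rather than the R and BUGS setup used in the paper, and the helper name is hypothetical; it only shows why elapsed time is the figure to compare when the work is done outside the calling process.

```python
import os
import time

def time_call(fn, *args, repeats=50):
    """Average user, system and elapsed seconds per call over `repeats` runs."""
    cpu_start = os.times()
    wall_start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    wall_end = time.perf_counter()
    cpu_end = os.times()
    return {
        "user": (cpu_end.user - cpu_start.user) / repeats,
        "system": (cpu_end.system - cpu_start.system) / repeats,
        "elapsed": (wall_end - wall_start) / repeats,
    }

# Sleeping stands in for waiting on an external sampler: it consumes
# wall-clock time but almost no CPU time in the calling process.
timing = time_call(time.sleep, 0.01, repeats=3)
```

Here the user and system figures stay close to zero while the elapsed figure reflects the full waiting time, which mirrors the situation described for the external BUGS calls.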
In summary, the double-penalty method provides more accurate parameter estimates and more reliable predictions while maintaining a runtime similar to that of the single-penalty method. This makes the double-penalized Bayesian Tobit quantile regression method an attractive option, especially when dealing with complex data and constructing high-precision models.
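The difference between user, system and elapsed time can be illustrated with a short sketch. Python is used here purely for illustration (the computations in this paper were run in R with BUGS), and the names `time_call` and `fake_external_sampler` are hypothetical. The stand-in function simply sleeps, mimicking a process that waits on an external sampler: the wall-clock time grows while the CPU time charged to the calling process stays close to zero.

```python
import os
import time


def time_call(func, *args, **kwargs):
    """Return (result, user, system, elapsed) in seconds for one call."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    result = func(*args, **kwargs)
    wall_after = time.perf_counter()
    cpu_after = os.times()
    user = cpu_after.user - cpu_before.user        # CPU time in user mode
    system = cpu_after.system - cpu_before.system  # CPU time in kernel mode
    elapsed = wall_after - wall_before             # wall-clock time
    return result, user, system, elapsed


def fake_external_sampler(seconds):
    """Hypothetical stand-in for waiting on an external sampler."""
    time.sleep(seconds)  # the calling process is idle while it waits
    return "done"


result, user, system, elapsed = time_call(fake_external_sampler, 0.2)
print(f"user={user:.3f}s system={system:.3f}s elapsed={elapsed:.3f}s")
```

In this situation only the elapsed time reflects the real cost of the run, which is why the comparison above is based on total running time.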

Interprovincial Longitudinal Crime Rate Data Analysis
The Bayesian double-penalty quantile regression method for longitudinal interval-censored data studied in this paper may be better suited to cases where the dependent variable is continuous rather than categorical; for continuous data it offers higher model accuracy and better variable selection and model estimation.
Many scholars at home and abroad have conducted in-depth empirical studies of the relationship between crime rates and conventional economic indicators such as regional income disparity, urbanization, and the unemployment rate, which are important reasons for rising crime rates [29]. Building on the Monte Carlo simulation analysis in the previous section, this section uses the two new methods, PDL-BTQR and PDAL-BTQR, to examine the relationship between crime rates and economic indicators for 31 provinces across the country from 2010 to 2016, a data set containing 1302 observations. The 31 provinces were classified into eastern, central and western regions. Crime rates were found to be higher in the eastern and western regions than in the central region, suggesting a correlation with regional income disparities at the country's current stage of development.
Because the crime rate is measured as the number of crimes with approved arrests per 10,000 people, it is bounded below by 0 when used as a response variable. For the upper limit, we select the 10 regions with the highest crime rates in the eastern and western regions and take their average crime rate, 9.35, as the upper bound of the response variable. The crime rate is therefore a response variable censored on both sides, taking values in [0, 9.35], with a censoring rate of 12.9%.
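The two-sided restriction and the censoring rate can be made concrete with a minimal sketch. The crime-rate values below are invented for illustration and are not the yearbook data; the function names `censor_interval` and `censoring_rate` are likewise hypothetical. Only the bounds 0 and 9.35 come from the text.

```python
def censor_interval(values, lower, upper):
    """Clip each value to [lower, upper] and record how it was censored."""
    observed, flags = [], []
    for v in values:
        if v < lower:
            observed.append(lower)
            flags.append("left")
        elif v > upper:
            observed.append(upper)
            flags.append("right")
        else:
            observed.append(v)
            flags.append("none")
    return observed, flags


def censoring_rate(flags):
    """Share of observations that hit either bound."""
    return sum(f != "none" for f in flags) / len(flags)


# Hypothetical crime rates per 10,000 people (illustrative only).
rates = [3.2, 11.4, 7.8, 6.0, 1.5, 5.1, 12.0, 8.9]
observed, flags = censor_interval(rates, lower=0.0, upper=9.35)
rate = censoring_rate(flags)
print(observed)
print(f"censoring rate = {rate:.1%}")
```

Here two of the eight invented values exceed the upper bound, so they are recorded as 9.35 and the censoring rate is 25%.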
Previous studies have examined the main causes of rising crime rates from several perspectives, including the overall economy, urban population, education, the wealth gap, and employment. This section therefore takes the explanatory variables to be GDP per capita, the urbanization rate, the regional income gap, education level, and the unemployment rate. The crime rate data are obtained from the Chinese Prosecution Yearbook and the Chinese Law Yearbook, and the economic indicators from the Chinese Statistical Yearbook. The variable definitions and descriptive statistics are shown in Table 8. To provide insight into the relationships between the variables in the data set, correlation coefficients and covariances were analyzed, and the results are displayed in Figures 19 and 20.
The interval-censored model is then applied with $x_{ij} = (1, x_{ij1}, \ldots, x_{ij5})^{\top}$ for the intercept term and five explanatory variables, where $x_{ij1}, \ldots, x_{ij5}$ are the observed values of the explanatory variables for the i-th province in the j-th year; $\beta = (\beta_0, \beta_1, \ldots, \beta_5)^{\top}$ are the coefficients of the intercept and the explanatory variables; $\alpha_i$ is the random effects coefficient; and $z_{ij}$ denotes the explanatory variables that produce a random effect.
We are interested in the degree of influence of each explanatory variable on the response variable at different quantiles, and Table 9 shows the estimation results of the PDL-BTQR and PDAL-BTQR methods at each quantile. The estimates show that GDP per capita, education level, and the unemployment rate are inversely related to the crime rate at each quantile, i.e., as regional GDP per capita, education level, and the unemployment rate increase, they effectively suppress increases in the crime rate. This holds especially for education level, whose estimated coefficients are larger in absolute value at each quantile than those of GDP per capita and the unemployment rate. The urbanization rate and regional income disparity act as positive shocks to crime rates, i.e., increases in the urbanization rate and in income disparity both lead to higher crime rates. Both estimation methods indicate that the effect of the urbanization rate peaks at the 0.5 quantile and that income disparity has its greatest impact at the 0.7 quantile. A high urbanization rate reflects, to some extent, a highly mobile population, which is more likely to breed crime, while growing regional income disparity and a widening gap between rich and poor can easily cause class conflict, trigger social unrest, and generate delinquent behavior. In the real data analysis, model estimation and variable selection are performed simultaneously, and the penalty terms enable automatic variable selection.
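For orientation, the model fitted to the crime rate data can be sketched in the usual two-limit Tobit form. This is a sketch under stated assumptions rather than a restatement of the paper's own equation: it assumes the standard latent-variable formulation with the bounds 0 and 9.35 given above, and the symbols $\varepsilon_{ij}$, $\tau$ and $\rho_{\tau}$ are notational assumptions. The model equation in the methodology section remains authoritative.

```latex
\begin{aligned}
y_{ij}^{*} &= x_{ij}^{\top}\beta + z_{ij}^{\top}\alpha_i + \varepsilon_{ij},
  \qquad i = 1,\dots,31,\; j = 1,\dots,7,\\[4pt]
y_{ij} &=
\begin{cases}
0, & y_{ij}^{*} \le 0,\\
y_{ij}^{*}, & 0 < y_{ij}^{*} < 9.35,\\
9.35, & y_{ij}^{*} \ge 9.35,
\end{cases}\\[4pt]
\rho_{\tau}(u) &= u\,\bigl(\tau - I(u < 0)\bigr), \qquad 0 < \tau < 1 .
\end{aligned}
```

Under this reading, the coefficients reported at each quantile $\tau$ describe how the explanatory variables shift the $\tau$-th conditional quantile of the latent crime rate $y_{ij}^{*}$, and $\rho_{\tau}$ is the standard quantile check loss.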

Discussion
In this paper, for the case in which the response variable is bounded on both sides, we construct a double-penalized Bayesian Tobit quantile regression model for interval-censored data. The model applies penalty functions to the fixed-effect and random-effect coefficients simultaneously and performs parameter estimation and variable selection for the interval-censored mixed-effects model. Monte Carlo simulation is used to obtain estimation results for the two sets of longitudinal data under different estimation methods, different censoring ratios, and different random error distributions, and the new method is then used to analyze the correlation between the crime rate and various economic indicators in China.
In the mixed-effects model with censored data, the general Tobit quantile regression method cannot estimate the parameters effectively [1]. On the one hand, because the mixed-effects model adds random effects to the general linear model [30], there are a large number of unknown parameters and the distribution of the random errors is unknown; random errors under different distributions increase the computational complexity of the model and make parameter estimation very difficult. On the other hand, because the restricted response variable generates a large amount of censored data, the mixed-effects model contains latent variables that make the Markov chain Monte Carlo (MCMC) sampling algorithm for parameter estimation extremely complex, resulting in low computational efficiency and large bias in the estimates. In recent years, parameter estimation and variable selection based on penalty functions under the Bayesian framework has become a hot topic of academic discussion [1]. Building on existing research, and to address the above problems, this paper constructs a Bayesian double-penalty Tobit quantile regression model for censored data, providing a new approach to parameter estimation and variable selection in censored mixed-effects models [31].
For interval-censored data, David Cox proposed the proportional hazards (Cox PH) model in 1972 [32]. It is mainly used to study the relationship between multiple independent variables and a dependent variable (survival time), and it can handle censored data, which makes it highly practical. However, the Cox proportional hazards model may not be the best choice when the data are truncated and no explicit concept of survival time exists. This paper addresses parameter estimation and variable selection when the response variable is restricted on both sides. Because interval-censored data are bounded and biased [33], simple regression methods cannot effectively screen important variables and exclude redundant ones [34]. The main reason is that the fitted and predicted values obtained by traditional regression methods may fall outside the upper and lower bounds of the response variable, and model interpretability is relatively weak [35]. The Tobit quantile regression method provides a new way of estimating parameters in the mixed-effects model with interval-censored response variables [36]. Therefore, this paper first combines the Bayesian method and constructs the Bayesian empirical likelihood function under interval-censored data [37]. Second, the penalty function is introduced, and a more efficient Gibbs sampling algorithm is constructed using the truncated normal distribution of the asymmetric Laplace prior part [38]. Finally, Monte Carlo simulation experiments and real data analysis are carried out, which illustrate advantages of the Bayesian double-penalty Tobit quantile regression model such as high estimation efficiency and robustness.
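The role of the truncated normal distribution in such a Gibbs sampler can be sketched as a data-augmentation step: for each censored observation, a latent value is drawn from a normal distribution restricted to the region beyond the bound. The sketch below is a simplification and not the sampler derived in this paper. It assumes a plain normal conditional with known mean and standard deviation, whereas the full algorithm also conditions on the mixture variables of the asymmetric Laplace representation. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for cdf and inverse cdf
_EPS = 1e-12


def sample_truncated_normal(mu, sigma, lower, upper, rng):
    """Draw from N(mu, sigma^2) restricted to [lower, upper] by inverse CDF."""
    p_low = _STD.cdf((lower - mu) / sigma)
    p_high = _STD.cdf((upper - mu) / sigma)
    u = rng.uniform(p_low, p_high)
    u = min(max(u, _EPS), 1.0 - _EPS)  # inv_cdf requires 0 < u < 1
    x = mu + sigma * _STD.inv_cdf(u)
    # Guard against rounding in the far tails, where inverse CDF loses precision.
    return min(max(x, lower), upper)


def impute_latent(y_obs, mu, sigma, lower, upper, rng):
    """Return a latent value consistent with the observed, censored response."""
    if y_obs <= lower:   # left-censored: latent value lies at or below lower
        return sample_truncated_normal(mu, sigma, -math.inf, lower, rng)
    if y_obs >= upper:   # right-censored: latent value lies at or above upper
        return sample_truncated_normal(mu, sigma, upper, math.inf, rng)
    return y_obs         # uncensored: latent value equals the observation


rng = random.Random(0)
left_draws = [impute_latent(0.0, 1.0, 2.0, 0.0, 9.35, rng) for _ in range(1000)]
right_draws = [impute_latent(9.35, 8.0, 2.0, 0.0, 9.35, rng) for _ in range(1000)]
print(max(left_draws) <= 0.0, min(right_draws) >= 9.35)
```

Inverse-CDF sampling is chosen here only because it needs nothing beyond the Python standard library; a production sampler would normally use a dedicated truncated normal routine that remains accurate in the tails.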
Although this paper proposes the double-penalized Bayesian Tobit quantile regression method for the mixed-effects model with censored data and constructs a Gibbs sampling algorithm for parameter estimation, and although the simulation results confirm that the new method estimates better than the traditional method, some shortcomings remain. First, this paper analyzes only the Lasso and adaptive Lasso penalties among the commonly used variable selection methods; future work could use SCAD, elastic net, adaptive elastic net, and other penalties for parameter estimation and variable selection. Second, this paper focuses on linear models; future work could construct a Bayesian Tobit quantile regression model for censored data under a nonlinear model and explore variable selection in that setting.
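To make the difference between these penalties concrete, the sketch below writes the Lasso, adaptive Lasso, elastic net and SCAD penalties as functions of a single coefficient, in their usual frequentist form. It is illustrative only: the function names and tuning values are assumptions, and the Bayesian method in this paper imposes its penalties through priors (the Lasso penalty corresponds to a Laplace prior) rather than by evaluating these functions directly.

```python
def lasso_penalty(beta, lam):
    """L1 penalty: grows linearly with |beta|."""
    return lam * abs(beta)


def adaptive_lasso_penalty(beta, lam, beta_init, gamma=1.0):
    """L1 penalty reweighted by an initial estimate beta_init (nonzero)."""
    weight = 1.0 / abs(beta_init) ** gamma  # large initial estimate, light penalty
    return lam * weight * abs(beta)


def elastic_net_penalty(beta, lam1, lam2):
    """Combination of L1 and squared L2 penalties."""
    return lam1 * abs(beta) + lam2 * beta ** 2


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty: L1 near zero, flattening to a constant for large |beta|."""
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2 * a * lam * b - b ** 2 - lam ** 2) / (2 * (a - 1))
    return lam ** 2 * (a + 1) / 2


for beta in (0.5, 2.0, 10.0):
    print(
        beta,
        lasso_penalty(beta, 1.0),
        adaptive_lasso_penalty(beta, 1.0, beta_init=4.0),
        elastic_net_penalty(beta, 1.0, 0.5),
        scad_penalty(beta, 1.0),
    )
```

The Lasso penalizes large coefficients as heavily per unit as small ones, while the adaptive Lasso and SCAD relax the penalty on coefficients that appear large, which is the motivation usually given for preferring them in variable selection.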

Conclusions
This paper proposes a Bayesian double-penalized Tobit quantile regression method for interval-censored data in mixed-effects models. The method shrinks the fixed-effect and random-effect parameters using a conditional Laplace prior to improve estimation accuracy. The posterior distributions are derived from a mixture of truncated normal distributions, and a Gibbs sampling algorithm is constructed for parameter estimation. Both the simulations and the real data analysis show that the method outperforms traditional methods in parameter estimation and variable selection and is particularly suitable for censored data. The main conclusions are as follows.
(1) Significantly improved model accuracy and efficiency
In complex mixed-effects models, the double-penalty approach of PDL-BTQR and PDAL-BTQR effectively reduces the estimation error of the model and improves the accuracy of parameter estimation by compressing the random effects coefficients. By performing parameter estimation and variable selection simultaneously, this approach significantly improves the predictive power and interpretability of the model for longitudinal data, regardless of whether the data are sparse or dense. In addition, the double-penalty strategy helps to identify and exclude redundant variables, further optimizing the model structure.
(2) Demonstrated robustness in handling complex and variable datasets
In practical applications, data often have different censoring ratios and complex random error distributions. In such cases the double-penalty method shows good robustness. Whether facing high censoring ratios or different random error distributions, the double-penalty approach provides stable parameter estimation and accurate variable selection. In particular, the PDL-BTQR method excels with dense longitudinal data, while the PDAL-BTQR method performs even better with sparse longitudinal data. This robustness makes the double-penalty method widely applicable and flexible in practice.
(3) New and effective tool for dealing with interval-censored data
In statistics and data analysis, interval-censored data are a common and complex data type. With traditional treatments it is often difficult to estimate parameters accurately and to select variables effectively. The model proposed in this study is particularly suitable for interval-censored data: its superiority in parameter estimation and variable selection is verified by imposing bilateral truncation on the response variable in a simulation study, and the model achieves automatic variable selection. It thus provides a new and effective method for dealing with censored data.

Data Availability Statement: The data will be made available by the authors on request.

Figure 1. Estimated mean of three methods for different quantiles under sparse longitudinal data.

Figure 2. Estimated standard deviation of three methods for different quantiles under sparse longitudinal data.

Figure 3. Estimated confidence interval of three methods for different quantiles under sparse longitudinal data.

Figure 4. Estimated means of three methods for different quantiles under dense longitudinal data.

Figure 5. Estimated standard deviations of three methods for different quantiles under dense longitudinal data.

Figure 6. Estimated confidence intervals of three methods for different quantiles under dense longitudinal data.

Figure 7. Estimated means for sparse longitudinal data with different censoring ratios.

Figure 8. Estimated standard deviations for sparse longitudinal data with different censoring ratios.

Figure 9. Estimated confidence intervals for sparse longitudinal data with different censoring ratios.

Figure 10. Estimated means for dense longitudinal data with different censoring ratios.

Figure 11. Estimated standard deviations for dense longitudinal data with different censoring ratios.

Figure 12. Estimated confidence intervals for dense longitudinal data with different censoring ratios.

Figure 13. Estimated means for sparse longitudinal data with different distributions.

Figure 14. Estimated standard deviations for sparse longitudinal data with different distributions.

Figure 15. Estimated confidence intervals for sparse longitudinal data with different distributions.

Figure 16. Estimated means for dense longitudinal data with different distributions.

Figure 17. Estimated standard deviations for dense longitudinal data with different distributions.

Figure 18. Estimated confidence intervals for dense longitudinal data with different distributions.

Figure 20. Network diagram of covariance between variables.
the interval-censored model; among these, the response variable $y_{ij}$ denotes the crime rate of the i-th province in the j-th year, with $i = 1, \ldots, 31$ and $j = 1, \ldots, 7$; $y_{ij}^{*}$ is a latent variable.

Author Contributions: Conceptualization, K.Z. and T.S.; methodology, T.S.; software, K.Z. and T.S.; validation, K.Z., Y.L. and C.H.; formal analysis, K.Z., T.S. and Y.L.; investigation, K.Z. and T.S.; resources, K.Z.; data curation, T.S.; writing-original draft preparation, T.S. and Y.L.; writing-review and editing, K.Z., Y.L. and C.H.; visualization, K.Z.; supervision, Y.L.; project administration, C.H.; funding acquisition, Y.L. and C.H. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by the National Natural Science Foundation of China, grant number 11701161; the National Social Science Fund of China, grant number 17BJY210; the Key Humanities and Social Science Fund of the Hubei Provincial Department of Education, grant number 20D043; and the Humanities and Social Science Fund of the Hubei Provincial Department of Education, grant number 22Y059.

Table 1. Estimation results for sparse longitudinal data at different quantiles.

Table 2. Estimation results of the three methods with dense longitudinal data.

Table 3. Estimation results for sparse longitudinal data with different censoring ratios.

Table 4. Estimation results for dense longitudinal data with different censoring ratios.

Table 5. Estimation results for sparse longitudinal data with different distributions.

Table 6. Estimation results for dense longitudinal data with different distributions.

Table 7. Comparison of computational runtimes for different simulation methods based on Gibbs sampling.

Table 8. Variable definitions and descriptive statistics.

Table 9. Estimates of the two methods at different quantiles.