LENGTH OF HOSPITAL STAY MODEL OF COVID-19 PATIENTS WITH QUANTILE BAYESIAN WITH PENALTY LASSO

FERRA YANUAR, ATHIFA SALSABILA DEVA, AIDINIL ZETRA, MAIYASTRI


INTRODUCTION
COVID-19 infects the human respiratory system and causes several symptoms, such as fever, cough, and shortness of breath. The COVID-19 pandemic that emerged in early 2020 significantly affected many countries, including Indonesia. The COVID-19 shock has adversely affected Indonesia's long-term income mobilization capacity [1]. COVID-19 also affects the effectiveness of health services because many patients need immediate treatment. Therefore, significant action is required to address this pressing health problem while strengthening the primary health care system [2]. In Indonesia, especially in West Sumatra Province, 43,281 COVID-19 patients had been treated in hospital by early June 2021. The length of stay of COVID-19 patients in the hospital is influenced by certain factors [3], so it is necessary to identify which factors significantly affect it. The estimated length of hospital stay model can support health service activities, the provision of health facilities, and decisions on COVID-19 preparedness, with the aim of reducing negative impacts on the health, economic, and social sectors [4]-[6].
West Sumatra is one of the ten provinces with the most COVID-19 cases in Indonesia [7]. Limited facilities and infrastructure in West Sumatra made it difficult for medical teams to treat the overflow of COVID-19 patients. This study examines the length of hospital stay model for COVID-19 patients in West Sumatra using the Bayesian LASSO and Bayesian Adaptive LASSO quantile regression methods. These methods were used because the preliminary analysis found that the data were not normally distributed. The use of quantile analysis in the Bayesian framework aims to produce more effective and natural model parameter estimates, especially for data that are not normally distributed and violate other classical assumptions [8]-[10]. The LASSO and Adaptive LASSO methods are combined with Bayesian quantile analysis, both to select independent variables and to regularize the estimates, in order to obtain the best model and produce estimated values that are close to the actual values [11]-[14].
Studies related to the Bayesian quantile regression method using the LASSO and Adaptive LASSO methods began with the concept of quantile regression by Koenker and Bassett [15] and the concept of LASSO regularization from Tibshirani's research [16]. The Bayesian quantile regression method using the Asymmetric Laplace Distribution (ALD) for its likelihood function was first introduced by Yu and Moyeed [10]. Li et al. [17] combined the Bayesian quantile regression method with LASSO by adding regularization parameters to the parameter estimation process. Kozumi and Kobayashi investigated numerical simulations of the Gibbs Sampling algorithm in the Bayesian quantile regression method [18]. Research on the Bayesian quantile regression method using the LASSO and Adaptive LASSO methods was developed by Alhamzawi et al. [12] and Tang et al. [13]. Hamid and Al-Husseini examined the Bayesian LASSO concept in composite quantile regression [19]. Yanuar et al. modeled the low birth weight of newborns using Bayesian quantile regression [20] and in combination with the Bootstrap method [21]. Alhamzawi and Mallick developed the reciprocal LASSO method for Bayesian quantile regression [22]. Algamal et al. discussed DNA selection using Bayesian quantile regression [23]. Several Bayesian studies have addressed COVID-19: Yanuar et al. modeled the length of hospital stay of COVID-19 patients using Bayesian quantile regression [24] and examined the performance of the Bayesian concept in Structural Equation Modeling (SEM) analysis of health behavior during the COVID-19 pandemic in West Sumatra [20]. Building on this work, this study aims to construct a model of the length of hospital stay of COVID-19 patients in West Sumatra using modified quantile regression methods.

Factors of Length of Hospital Stay for Patients with COVID-19
The length of stay of COVID-19 patients in the hospital is assumed to be influenced by age [25]-[27]. The age group of 18-25 years has a severe level of vulnerability [25]. Elderly patients have a higher risk of death [28], are more vulnerable, and are treated longer [5], [27], [29], [30]. The length of stay of COVID-19 patients is also influenced by gender. COVID-19 is more dangerous for men [28], with a susceptibility rate of 33% [25]. Female patients with critical illnesses have longer stays [4]. Diagnostic factors related to COVID-19 also affect the patient's length of stay in the hospital [3]-[5], [24]. Patients with a positive diagnosis and Patients Under Supervision (PaUS) are hospitalized for at least 14 days [3]. Diagnosis and administration of D-Dimer [31] and ceftriaxone in female patients reduce the length of stay [4]. Another factor assumed to affect the length of stay is the patient's pre-existing illnesses (comorbidities). Patients with hypertension, diabetes [4], [30], [32], coronary artery disease [32], kidney complications [33], organ failure and decreased leukocytes [34], fever [35], obesity [30], or more than two comorbidities [6] tend to be hospitalized longer than patients without comorbidities. The discharge status of COVID-19 patients is also a determining factor in the length of stay [3], [5], [24], [33]. COVID-19 patients are declared cured after being treated in a hospital for at least 14 days [3]. Patients who are declared dead have a shorter stay, on average 5 or 6 days [33].

Real Data
The data used in this study were COVID-19 patient data obtained directly from the M. Djamil Central General Hospital (RSUP) in Padang City, a referral hospital in West Sumatra Province. The data comprised 1737 patients hospitalized there from March to December 2020. Table 1 below presents a description of the data used. [...] were patients diagnosed with PaUS (Patients under Supervision), and 79.10% were declared cured.
In Figure 1, the histogram of the length of stay of COVID-19 patients is skewed to the left and is not symmetrical, as a normal distribution curve would be. The data are therefore not normally distributed, and the quantile regression method is appropriate for modeling data of this kind. In this study, gender, COVID-19 diagnosis, and discharge status were independent variables of categorical type. These variables were therefore converted to dummy variables for the estimation process in the regression analysis. Table 2 below presents the formation of the dummy variables.
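The dummy coding described above can be sketched as follows. This is only an illustration: the helper `dummy_code`, the category labels, and the choice of `"other"` as the reference category are assumptions made for the example and are not taken from Table 2.

```python
def dummy_code(value, categories, reference):
    """Return 0/1 indicators for every category except the reference one."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return {f"is_{c}": int(value == c) for c in categories if c != reference}


# Hypothetical discharge-status levels; "other" is an assumed reference level.
discharge_levels = ["cured", "died", "outpatient", "other"]

row_died = dummy_code("died", discharge_levels, reference="other")
row_reference = dummy_code("other", discharge_levels, reference="other")
```

A variable with four levels yields three indicator columns; a patient in the reference category has zeros in all of them, so each regression coefficient is read as a contrast with that category.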

Quantile Regression Method
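If a vector $y = (y_1, y_2, \cdots, y_n)'$ is declared as the response variable and $x_i = (x_{i1}, x_{i2}, \cdots, x_{ik})'$ is defined as the predictor variable, then the linear regression model for the $\tau$th quantile, where $0 < \tau < 1$, with $n$ samples and $k$ predictors, for $i = 1, 2, \ldots, n$, is written as:

$$y_i = x_i'\beta(\tau) + \varepsilon_i, \tag{1}$$

with $\beta(\tau)$ as the parameter vector and $\varepsilon$ as the residual vector. The estimated value of the parameters in the quantile regression equation, $\hat{\beta}(\tau)$, is obtained by minimizing the following equation [15]:

$$\hat{\beta}(\tau) = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\left(y_i - x_i'\beta\right), \tag{2}$$

where $\rho_\tau(\varepsilon) = \varepsilon\,(\tau - I(\varepsilon < 0))$ is the loss function, which is equivalent to:

$$\rho_\tau(\varepsilon) = \begin{cases} \tau\,\varepsilon, & \varepsilon \ge 0, \\ -(1-\tau)\,\varepsilon, & \varepsilon < 0. \end{cases} \tag{3}$$

Here $I(\cdot)$ is an indicator function, which has a value of 1 when its argument is true and 0 otherwise.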
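As a minimal sketch of how the loss function works, the snippet below implements both forms of the check loss and uses it to find a quantile of a small sample in the intercept-only case. The function names and the length-of-stay values are hypothetical and serve only as an illustration; a grid over the observed values stands in for a proper optimizer.

```python
def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def check_loss_piecewise(u, tau):
    """Equivalent piecewise form: tau*u if u >= 0, otherwise -(1 - tau)*u."""
    return tau * u if u >= 0 else -(1.0 - tau) * u


def quantile_by_loss(values, tau):
    """Return the observed value that minimizes the summed check loss."""
    return min(values, key=lambda q: sum(check_loss(v - q, tau) for v in values))


# Hypothetical lengths of stay in days (illustrative values only).
stays = [3, 5, 6, 8, 9, 12, 14, 15, 21, 30, 45]

median_stay = quantile_by_loss(stays, 0.5)
upper_stay = quantile_by_loss(stays, 0.9)
```

With $\tau = 0.5$ the minimizer is the sample median; raising $\tau$ penalizes under-prediction more heavily and pushes the minimizer toward the upper tail, which is what makes the method useful for skewed length-of-stay data.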

Bayesian Quantile Regression Method
Yu and Moyeed [10] suggested that minimizing the loss function of the quantile regression is equivalent to maximizing the likelihood function of the Asymmetric Laplace Distribution (ALD), because the ALD log-likelihood equals the negative of the quantile regression loss function up to an additive constant. The ALD is used to form the likelihood function so that the estimator becomes more effective and natural, that is, close to the true value. The ALD is a continuous probability distribution. A random variable $\varepsilon$ has an ALD if its probability density function is:

$$f_\tau(\varepsilon) = \tau(1-\tau)\exp\{-\rho_\tau(\varepsilon)\}, \tag{4}$$

with $0 < \tau < 1$ and $\rho_\tau(\varepsilon)$ as defined in equation (3). The ALD can be represented as a mixture of the exponential and the normal distributions, and this representation is used to form the likelihood function. Let $z$ be a random variable with an exponential distribution, $z \sim \exp(1)$, and let $u$ be a random variable with a standard normal distribution, $u \sim N(0,1)$. A random variable $\varepsilon$ that has an ALD can then be expressed as [15]:

$$\varepsilon = \theta z + p\sqrt{z}\,u, \tag{5}$$

where $\theta = \frac{1-2\tau}{(1-\tau)\tau}$ and $p^2 = \frac{2}{(1-\tau)\tau}$. Based on equation (5), the likelihood function used in parameter estimation for the $\tau$th quantile in the Bayesian quantile regression analysis is formulated in equation (6) as follows [14]:

$$f(y \mid \beta, \sigma, v) \propto \left(\prod_{i=1}^{n} (\sigma v_i)^{-1/2}\right) \exp\left\{-\sum_{i=1}^{n} \frac{\left(y_i - x_i'\beta - \theta v_i\right)^2}{2p^2\sigma v_i}\right\}, \tag{6}$$

with $\sigma > 0$ as the scale parameter and $v_i = \sigma z_i$, which has an $\exp(\sigma)$ distribution. The prior distributions chosen for the parameters are $\beta \sim N(b_0, B_0)$, $v_i \sim \exp(\sigma)$, and $\sigma \sim IG(a, b)$. The corresponding posterior distributions of $\beta$, $v_i$, and $\sigma$ are obtained by combining the likelihood function in equation (6) with these priors.
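The mixture representation of the ALD can be checked numerically. The sketch below, which uses only the Python standard library, draws $\varepsilon = \theta z + p\sqrt{z}\,u$ and verifies the defining property of the ALD that its $\tau$th quantile is zero. The function name, the seed, and the sample size are illustrative choices, not part of the original analysis.

```python
import math
import random


def ald_mixture_draws(tau, n, seed=0):
    """Draw n values of eps = theta*z + p*sqrt(z)*u, z ~ Exp(1), u ~ N(0, 1)."""
    rng = random.Random(seed)
    theta = (1.0 - 2.0 * tau) / (tau * (1.0 - tau))
    p = math.sqrt(2.0 / (tau * (1.0 - tau)))
    draws = []
    for _ in range(n):
        z = rng.expovariate(1.0)   # exponential with rate 1
        u = rng.gauss(0.0, 1.0)    # standard normal
        draws.append(theta * z + p * math.sqrt(z) * u)
    return draws


tau = 0.25
eps = ald_mixture_draws(tau, n=20000)

# For an ALD error term, P(eps < 0) should be close to tau.
share_below_zero = sum(e < 0 for e in eps) / len(eps)
sample_mean = sum(eps) / len(eps)
```

The share of negative draws should be close to $\tau$, and the sample mean close to $\theta$, since $E[z] = 1$ and $E[u] = 0$. Conditional on $z$, the error term is normal, which is what allows the likelihood to be written in the Gaussian form of equation (6).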

Bayesian Quantile Regression with Penalty LASSO and Adaptive LASSO
For the regression model of the $\tau$th quantile with $n$ samples and $k$ independent variables, the Bayesian LASSO quantile regression method places a Laplace (double-exponential) prior distribution on the regression coefficients, with a single regularization parameter shared by all coefficients. The Bayesian Adaptive LASSO quantile regression method uses the same type of prior distribution but assigns each coefficient its own regularization parameter.
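For orientation, the two priors are commonly written in the following generic form. This is a sketch of the standard formulation in the Bayesian LASSO literature, not necessarily the exact parametrization used in this study (some authors additionally scale the penalty by $\sigma$); the symbols $\lambda$, $\lambda_j$, and $s_j$ are introduced here for illustration.

```latex
% Bayesian LASSO: one regularization parameter \lambda shared by all coefficients
\pi(\beta \mid \lambda) \;\propto\; \exp\Big\{-\lambda \sum_{j=1}^{k} |\beta_j|\Big\}

% Bayesian Adaptive LASSO: a separate regularization parameter \lambda_j per coefficient
\pi(\beta \mid \lambda_1,\dots,\lambda_k) \;\propto\; \exp\Big\{-\sum_{j=1}^{k} \lambda_j\,|\beta_j|\Big\}

% Scale mixture of normals that makes Gibbs sampling tractable:
\frac{\lambda_j}{2}\,e^{-\lambda_j|\beta_j|}
  \;=\; \int_0^{\infty} \frac{1}{\sqrt{2\pi s_j}}\,
        \exp\Big\{-\frac{\beta_j^{2}}{2 s_j}\Big\}\,
        \frac{\lambda_j^{2}}{2}\,\exp\Big\{-\frac{\lambda_j^{2} s_j}{2}\Big\}\,ds_j
```

Allowing $\lambda_j$ to differ across coefficients lets the adaptive version shrink unimportant coefficients strongly while penalizing important ones less.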

MAIN RESULTS
The parameter estimation process was carried out by determining the mean and the variance of each parameter from the posterior distributions obtained under both methods. These results were then applied to data on the hospital length of stay of COVID-19 patients in West Sumatra Province to formulate a regression model using the R software. The length of stay model was estimated using the Bayesian LASSO quantile regression method (BLQRM) and the Bayesian Adaptive LASSO quantile regression method (BALQRM). According to Table 3, the Bayesian Adaptive LASSO quantile regression method produces 95% confidence intervals whose widths tend to be smaller than those resulting from the Bayesian LASSO quantile regression method.
Next, the error values from the two methods are compared in Table 4 below. The Bayesian Adaptive LASSO quantile regression method generally produces smaller values of MAD, MSE, and RMSE than the Bayesian LASSO quantile regression method. Based on Table 3 and Table 4, the Bayesian Adaptive LASSO quantile regression method is therefore the better of the two. Figure 3 shows that the density plot for each parameter estimate resembles a normal distribution curve, that is, it is symmetrical, which indicates that the estimated model parameter values are normally distributed. In Figure 4, the ACF plot for each parameter shows an autocorrelation that slowly decays to zero as the lag increases. This indicates that the estimates stabilize and then converge, so it is concluded that the resulting estimates are acceptable.
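The error measures and the autocorrelation diagnostic discussed above can be computed as in the following sketch. It assumes that MAD denotes the mean absolute deviation between observed and fitted values; the function names and the numbers are hypothetical and are not taken from Table 4 or Figure 4.

```python
import math


def error_metrics(observed, fitted):
    """Return MAD, MSE, and RMSE of the residuals (observed - fitted)."""
    residuals = [y - f for y, f in zip(observed, fitted)]
    n = len(residuals)
    mad = sum(abs(r) for r in residuals) / n   # mean absolute deviation (assumed)
    mse = sum(r * r for r in residuals) / n    # mean squared error
    return {"MAD": mad, "MSE": mse, "RMSE": math.sqrt(mse)}


def autocorrelation(chain, lag):
    """Sample autocorrelation of a sequence of posterior draws at a given lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((c - mean) ** 2 for c in chain)
    cov = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return cov / var


# Hypothetical observed and fitted lengths of stay in days.
metrics = error_metrics([10, 14, 7, 21], [12, 13, 9, 18])

# Hypothetical short chain, used only to show the calculation.
acf_lag1 = autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0], lag=1)
```

Smaller values of all three error measures indicate a closer fit, and an autocorrelation that approaches zero at higher lags suggests that successive draws are close to independent.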

CONCLUSIONS
This study shows that the Bayesian Adaptive LASSO quantile regression method is better for modeling the length of stay of COVID-19 patients. This method produces a 95% confidence interval width and MAD, MSE, and RMSE values that are smaller than those of the Bayesian LASSO quantile regression method. The results of implementing the Bayesian Adaptive LASSO quantile regression method are that the length of stay of COVID-19 patients in West Sumatra Province is influenced by age, diagnoses related to COVID-19 (the PaUS and positive categories), number of comorbidities, and patient discharge status (the cured, died, and outpatient categories). If diagnosed with COVID-19, individuals who are elderly and have comorbidities will stay in hospital longer than individuals with other conditions. Thus, to reduce the length of stay of COVID-19 patients in West Sumatra, such individuals should take particular care in their surroundings so as not to be infected with the coronavirus.