Stochastic modeling, analysis, and simulation of the COVID-19 pandemic with explicit behavioral changes in Bogotá: A case study

In this paper, a stochastic epidemiological model is presented as an extension of a compartmental SEIR model with random perturbations to analyze the dynamics of the COVID-19 pandemic in the city of Bogotá D.C., Colombia. This model incorporates the spread of COVID-19 impacted by social behaviors in the population and allows for projecting the number of infected, recovered, and deceased individuals considering the mitigation measures, namely confinement and partial relaxed restrictions. Also, the role of randomness using the concept of Brownian motion is emphasized to explain the behavior of the population. Computational experiments for the stochastic model with random perturbations were performed, and the model is validated through numerical simulations for actual data from Bogotá D.C.


Introduction
The coronavirus disease (COVID-19) that is believed to have emerged in Wuhan, China, in late 2019 spread to Colombia on March 6, 2020, when the country had reported its first positive case (Benı etezet al., 2020; Ministerio de Salud y Protecci on Social, 2020).The same month, the World Health Organization had declared the disease as a pandemic (World Health Organization, 2020).These developments have prompted much research dedicated to understanding the nature of the spread of the disease (Sohrabiet al., 2020;Wanget al., 2020;Wu et al., 2020).Specifically, various mathematical models have been widely employed to predict the future spread of the disease as well as explore the impact of certain interventions on the spread(see (Amouch & Karim, 2021;Linet al., 2020;Musaet al., 2021)).
Most mathematical models describing the spread of the disease employ classical compartments, such as the Susceptible-Exposed-Infected-Recovered (SEIR) structure described as an ordinary differential equation system (Brauer and Castillo-Chavez, 2012).Some researchers have adapted these models to include several relevant factors, including the role of containment strategies (Maier and Brockmann, 2020), mobility patterns (R adulescu et al., 2020), social distancing (Leunget al., 2018;Premet al., 2020), and network intervention strategies to patterns of plateau duration, intensity and duration of social distancing measures (Komarova et al., 2021).
However, the pandemic has prompted lock-downs, widespread closures, and calls to social distance and practice basic hygiene, which has encouraged many people to take caution to limit the spread (Alberto and GongGiordano, 2021;Orabyet al., 2021).To account for the impact of such non-pharmaceutical interventions and the social behavior of the population in response, recently a novel mathematical framework was developed (Ohajunwa et al., 2020).Specifically, three separate models including a baseline model, an explicit intervention model, and an implicit intervention model were created.Of particular relevance to this paper is the explicit intervention model, which accounted for the effect of lock-down through the addition of a Confined compartment to an extended SEIR model.Based on Dirac delta functions, a portion of susceptible individuals was modeled to enter confinement during lock-down and return to become susceptible when lock-down ended.While this model was able to incorporate social behavior through the transmission parameters, this model did not include the influence of stochasticity.Given that many individuals within the population are assumed to react irrationally and unpredictably in response to the pandemic, there is a need to consider stochastic models to capture these effects.Stochastic perturbation models can be used to study the effects of these reactions by introducing environmental noise on the transmission parameter (see (Cai et al., 2019;Grayet al., 2011;Ríos-Guti errez et al., 2021)).
Colombia, like many other countries, has implemented a series of emergency measures in response to the coronavirus outbreak.The government quickly responded by policy implementation by introducing strict lock-downs, PCR testing capacity, contact tracing, and augmenting ICU capacity in the hospitals.While Colombia's management of COVID-19 may be considered as a success, the country was not been able to flatten the curve for more than 100 days at the beginning of the pandemic (see (C ardenas & Martínez, 2020;De la Hoz-Restrepoet al., 2020)).Detailed daily reports and other information can be found at (Instituto Nacional de Salud de Colombia; Ministerio de Salud y Protecci on Social, 2020).
Thus, the goal of this paper is to develop a novel mathematical framework that extends the previous model (Ohajunwa et al., 2020) to not only account for the effects of social behavior, specifically during lock-down conditions, but also the role of stochasticity within the population.Furthermore, we also apply this model to real data from the Colombian city of Bogot a through computational techniques.Specifically, we use data from the National Institute of Health of Colombia (see (Instituto Nacional de Salud de Colombia)) to derive parameters for our simulations and analysis.This model will project the number of infected, recovered and deceased individuals taking into account not only mitigation measures, specifically confinement and partial opening, but also the role of randomness in the behavior of the population in the pandemic.This paper will be structured as follows.In section 2, we introduce the stochastic ordinary differential equation system, including the definition of the variables and parameters.The section also describes the numerical discretization for this system along with calibration of the model.In section 3, we validate the model by performing numerical simulations against real data from Bogot a. Finally, in section 4, we discuss and draw conclusions from our work.

A stochastic model: equations and methods
To address the disease dynamics of the COVID-19 pandemic in the city of Bogot a DC, we propose a stochastic compartmental disease transmission model based on (Ohajunwa et al., 2020) and adapted according to a structure of stochastic differential equations.The phases or compartments are based on the scientific literature and the different decrees adopted by the various governmental and health institutions in charge.
The flow diagram of the proposed mathematical model is shown in Fig. 1.Specifically, individuals transition from one compartment to another according to specific rates and are influenced by noise, which simulates the random dynamics of the epidemic.The model assumes that the total population is homogeneous and constant in size N. Immigration and emigration are not taken into account; that is, the population is closed.
For this work, we introduce the following the sub-populations: Susceptible (S), Confined (C), Exposed (E), Asymptomatic (I A ), Symptomatic (I S ), Quarantined (Q), Hospitalized (H), Recovered (R), and Dead (D).Susceptible individuals consist of individuals who are not infected with COVID-19 and are not isolated from the population.Confined individuals are individuals who were previously susceptible but by their own or regulatory decision are temporarily isolated from the population in order to protect themselves from infection.Exposed individuals are individuals who are no longer susceptible because they have come into contact with asymptomatic or symptomatic infected individuals and are in the incubation period of disease progression.Asymptomatic individuals are infected individuals who do not exhibit any symptoms of COVID-19, while symptomatic individuals are infected individuals who do have symptoms.Hospitalized individuals are individuals who have been symptomatic and enter hospitalization with severe COVID-19 symptoms.Quarantined individuals are symptomatic individuals who have been isolated.Note that in this model, we suppose that symptomatic population immediately go to quarantine and does not include any death of this population.However, with more available data, we could consider this case in the future.Recovered individuals consist of individuals who were previously infectious and has now survived COVID-19.Deceased individuals are individuals who were infectious but did not survive COVID-19.Given that, we assume a closed population size, In addition, we define the parameters used for the model in Table 1.The rates at which susceptible and confined individuals enter and leave confinement depend on the Dirac delta functions 4(t) and j(t).Susceptible individuals become exposed upon contact with the infected population at the transmission rate b.Exposed individuals become infected at the incubation rate, k, such that a fraction p becomes symptomatic infectious and the remaining (1 À p) becomes asymptomatic infectious.Asymptomatic individuals either recover or develop symptoms at a rate of u, with a fraction n recovering and a fraction (1 À n) becoming symptomatic.Symptomatic individuals quarantine at a rate of x.At a rate of g, a fraction q of quarantined individuals are hospitalized and a fraction (1 À q) recover.Lastly, a proportion x of hospitalized individuals die at a rate of a D , while a proportion (1 À x) of hospitalized individuals recover at a rate of a R .
Next, we propose the following system of stochastic differential equations for the modified Stochastic SEIR model with random perturbations (see (Ríos-Guti errez et al., 2021)).Note that this model will include random perturbations of the white noise terms satisfying the Brownian motion governed by system of stochastic differential equations.Let n SðtÞ; EðtÞ; I A ðtÞ; I S ðtÞ; Q ðtÞ; HðtÞ; RðtÞ; DðtÞ; CðtÞ o t!0 be an Itô process given by the following system of differential equations: (1) where fB 1 ðtÞg t!0 and fB 2 ðtÞg t!0 are two independent standard Brownian motions, and considered as generalized white noise functionals, and the constants s 1 and s 2 are the intensities of the noises.We note that fB 1 ðtÞg t!0 and fB 2 ðtÞg t!0 are defined on the complete probability space À U; I; fIg t!0 ; P Á satisfying the usual conditions .The functions 4 ¼ 4(t) ¼ ld(t À t l ) and Since the transmission rate changes according to the dynamics of the confinement or lock, then the following is proposed: (3)

Basic reproduction number
We now briefly discuss the basic reproduction number R 0 , which is the expected number of secondary cases produced by a single infection in a completely susceptible population.Using the method for the epidemic models with random perturbations (Ríos-Guti errez et al., 2021), we define the basic reproduction number as follows: where bðaÞ is the average number of newly infected individuals (in a completely susceptible population) by an infected individual that is infectious between t ¼ 0 and t ¼ a. FðaÞ is the probability that a newly infected individual will continue infecting others during the time interval between 0 and a.This is also called the underlying survival probability (or survival function).
In our case, the infectivity rate when the population is completely susceptible is given by b1 if the first infected individual is symptomatic.When the first infected individual infects a completely susceptible population with N individuals, then the expected number of asymptomatic individuals who were infected by a symptomatic individual is b 1 ðaÞ ¼ N b2 ð1 À pÞ ¼ ðb þ s 2 EðB 2 ðþ∞ÞÞÞð1 À pÞ; the expected number of symptomatic individuals who were infected by a symptomatic individual is and the expected number of asymptomatic individuals who were infected by a asymptomatic individual and have not recovered from the illness, that is, became in symptomatic individuals is and the expected number of symptomatic individuals who were infected by an asymptomatic individual is Note that we wrote E(B i (þ∞)) ¼ lim t/þ∞ B i (t) ¼ 1, 2 due to the need to establish how many people are infected by an infectious individual, which we can interpret as the time of infecting from the beginning to the end of time.
In this way.
is the number of infected individuals by an infectious individual for a / þ ∞.We have the survival functions, F 1 (a) ¼ e Àxa and F 2 (a) ¼ e Àua , for when (1) a symptomatic individual infected by a symptomatic individual remains symptomatic on [0, a], (2) an asymptomatic individual infected by a symptomatic individual remains asymptomatic on [0, a] and (3) an asymptomatic individual infected by an asymptomatic individual remains infectious on [0, a], and (4) a symptomatic individual infected by an asymptomatic individual remains symptomatic on [0, a]; respectively.We obtain the following equation Therefore, in the following equation we introduce b e (a) and F e (a) which are functions corresponding to b(a) and F(a), respectively, for the stochastic system (1).We then have, The integral R þ∞ 0 b e ðaÞF e ðaÞda is the mean of the random variable that describes the basic reproduction number for the proposed stochastic model.Based on (Ríos-Guti errez et al., 2021), we have that On the other hand, using integration-by-parts rule (Chung and Williams, 2014), we have that lim where, We compute that lim l/þ∞ B 2 (l)(upe Àxl þ x(1 À p)e Àul ) ¼ 0 a.s.In addition, B 2 (0) ¼ 0 a.s.Now using (Klebaner, 2012)[p.393], we obtain ! since we assumed fB 1 ðtÞg t!0 and fB 2 ðtÞg t!0 are two Brownian independent motions, in consequence, we get Thus, we obtain the distribution of the basic reproduction number for the proposed model.

Numerical discretization
We now briefly explain the numerical approximation method used in the paper.The Euler-Maruyama method described in (Kloeden and Platen, 1992) and introduced by the Japanese mathematician G. Maruyama (Maruyama, 1955) as an extension of the Euler method, is a numerical integration technique for obtaining approximate solutions for a system of stochastic differential equations from a given initial value where the length of each subinterval is Dt For each stochastic process trajectory, the value of X tiþ1 is approximated using only the value of the previous time step, X ti .Then, to find the trajectories or approximate solutions of a stochastic differential equation by the Euler-Maruyama method, the following equation is implemented: for all i ¼ 0, 1, …, k À 1.In order to carry out the method computationally, it is necessary to know how to calculate DB i .Since the partition is made up of equal intervals, the differences DB i , i ¼ 0, 1, …, k À 1 have the same distribution, DB i ~N(0, Dt).Let h be a random variable with a standard normal distribution h ~N(0, 1).Then

ffiffiffiffiffi ffi
Dt p h has a normal distribution with zero mean and variance Dt; that is, ffiffiffiffiffi ffi Dt p h $ Nð0; DtÞ.We use the following algorithm to implement the Euler-Maruyama method to find the approximate solutions of the stochastic differential equations: To implement in an analogous way the algorithm of the Euler-Maruyama method, previously described, for our proposed model, the respective discretization of the system of stochastic differential equation (1) must be carried out, which is given by: 3. Computational results from the statistical data: A case study from the city of Bogot a In this section, we discuss the stochastic dynamics with environmental noises for the COVID-19 epidemic in Bogot a city.The data used in this article for the simulations, parameter adjustment, and analysis was selected from the publicly available database of the National Institute of Health of Colombia (or in Spanish: Instituto Nacional de Salud, INS) (Instituto Nacional de Salud de Colombia, 2021).This data source provides daily case numbers for both symptomatic and asymptomatic infected individuals without making a clear distinction between the two conditions.It also gives the report of recovered and deceased individuals.However, given that the selected epidemiological data often had irregularities, data cleaning was performed.Subsequently, the new database was filtered with respect to the dates of the analysis.In addition, it is necessary to have the total number of inhabitants in the city of Bogot a D.C. for the mathematical modelling.The population projection of the National Administrative Department of Statistics (or in Spanish: Departamento Administrativo Nacional de Estadística, DANE) (Administrativo Nacional de Estadística, 2018), estimates that for 2020 the total population in the respective city is 7743 955 persons.Finally, the entire methodological procedure was developed in Python 3.7.
For the proposed model ( 1) model, we have taken into account the first 200 days of the pandemic in the city of Bogota.The first contagion report occurred on March 6, 2020 (Ministerio de Salud y Protecci on Social, 2020) (day 1) and the end date of the analysis was set for September 22, 2020 (day 200).The calibration of the model was carried out with the infected, recovered, and deceased data of the first 100 days, that is, from March 6, 2020, to June 14, 2020 (day 100).It is planned for the period from June 15, 2020 to September 22, 2020.The lockdown began on March 24, 2020 (Presidencia de la Repúblicade Colombia, 2020a, 2020b) (day 18) and the partial opening began on May 11, 2020 (Presidencia de la Repúblicade Colombia, 2020a, 2020b) (day 66).By September 22, 2020, the following was reported in Colombia: 777 537 infected, 650 801 recovered, 24 570 deaths, and 3460 714 processed samples.And for the city of Bogot a D.C, the following was reported: 256,162 infected, 213,956 recovered, and 6,610 deaths (see Table 2).Regarding the daily report of infected in the same city, its maximum value or peak occurred around August 6, 2020, with 6068 infected cases (see Figs. 2 and 3).

Calibration of the model
The calibration of the proposed model is carried out in order to find a set of parameters so that the model has a good description of the behavior of the system dynamics, i.e. so that the model fits the data as well as possible and allows us to make the desired projections.The search for the parameters can be carried out by means of different optimization algorithms.The non-linear least squares method of trust region was the one chosen (see (S aez and Rittmann, 1992)), which will be explained later on.Subsequently, the predictions of the model are compared with the real data available for the city of Bogot a.
It is necessary to find the values of the parameters that fit the model according to the available data and satisfy a certain set of constraints to ensure global convergence and consistency of the parameters.We present the nonlinear least squares optimization method of trust region described in (Kloeden and Platen, 1992) for the adjustment of the parameters of the proposed model.Let q ¼ (q 1 , …, q m ) be a multidimensional parameter with a neighborhood defined by B Dk ðpÞ ¼ fq k 2ℝ : jjpjj Dkg for 1 k m, where Dk is the radius of the trusted region.Consider f : R n /R a function that depends on the parameter q.
Given some points (x 1 , y 1 )/(x n , y n ) the residual sum of squares is defined as: where g(q) is the objective function.The information from the objective function in the trust region methods is used to form a model m k (7), which, near the current point q k , have as much as possible a behavior similar to that of the objective function.
where the Hessian is ▽ 2 g z J T J. Since the model may not be a good approximation of the objective function, when taking points q away from q k , the search for the minimum m k that must be restricted to points at B Dk .So, in each iteration of this methodology, the following sub-problem is solved; min p m k ðpÞ; subject to jjpjj Dk.Given the data on infected, recovered, and deceased, and the information on the parameters obtained from a metaanalysis of the epidemiological and health entities in charge of Bogot a, the parameters of the model were fitted with the nonlinear least squares procedure using the trust region algorithm.In Table 3 and Table 4, we can observe the fixed values and those obtained from the model adjustment respectively, and also in the references column we can find the literature, which helped us to establish the fixed and initial values to carry out the optimization process.Note that the standard error for the parameters a D and x are too high, which possibly indicate that there is too much variability of the number of deaths after COVID-19 hospitalized them: there could be days with few casualties and others with too many deaths after being hospitalized by COVID-19.

Model projections
Fig. 4 and Fig. 5 generally illustrate the contrast between the real dynamics and the projections of the proposed model in the first mitigation measures for the pandemic for the SARS-CoV-2 coronavirus.Fig. 4a shows the analysis of the daily report of individuals infected by the virus, that is, the incidence of infected.In contrast, Fig. 4b shows the analysis of the number of accumulated infected.Fig. 5a and b represents the analysis of accumulated recovered and accumulated deceased individuals, respectively.
In the following figures, the actual data is represented by the yellow line.The gray lines are the 20 simulations represents results from the stochastic model subject to parameter adjustment by the trust region method, and the blue line is the drift of the model.Likewise, the graphs specify the analysis period, that is, the period in which the parameters were projected and adjusted.They also indicate the periods of confinement and partial opening.The mean square logarithmic error (MSLE) is calculated as an evaluation measure of the proposed stochastic model.The MSLE only considers the relative difference between the actual and the projected value.This measure treats minor differences between small actual and forecasted values, roughly the same way as significant differences between large actual and forecasted values.The cause of errors is significantly penalized than small ones in those cases where the target value range is large.So, since the drift is the stochastic mean, when evaluated against the actual data with the MSLE measure, this gives a cursory idea of how reliable the model is.For the projections of the incidence of infected, an MSLE of 0.090 is obtained, and for the accumulated infected, an MSLE of 0.005, while for the predictions of accumulated recovered, an MSLE of 0.157 and for accumulated deaths an MSLE of 0.062.Therefore, both the graphs and the superficial or general measurement of the MSLE model infer that the projections of the recovered are not very favorable compared to the other cases.
Fig. 6 and Fig. 7 are not contrasting graphs like the previous ones.One reason could be the difficulty of obtaining the prevalence data for the cases mentioned below.However, as can be seen, the graphs represent the 20 simulations of the stochastic model, the drift of the model given the adjustment of the parameters and with intensity constants s 1 ¼ 0.2 and s 2 ¼ 0.4 and the dates of the mitigation strategies chosen by the national government.Fig. 6a and b show possible projections of the stochastic model for asymptomatic and symptomatic infected individuals, respectively.In contrast, Fig. 7a represent simulations of the prevalence of hospitalized individuals, and Fig. 7b represents these changes in the stochastic dynamics of susceptible individuals.
In our model, we obtain the basic reproduction number for the proposed model (theoretically calculated in (Ohajunwa et al., 2020)) given by Also, from the parameters given in Table 4, we note that the basic reproduction number for COVID-19 in Bogot a has normal distribution with median R Cov 0 ¼ 4:34 and variance 1.79 (standard deviation 1.34) given by:  Colombia and recognized by WHO in the list of variants that reported 25% of new incidence cases in Colombia (see (Khan Burki, 2021;Shultzet al., 2021)).

Conclusion
In this work, a new stochastic epidemiological model is proposed that accounts for randomness in the dynamics of COVID-19.The model provides a perspective of the possible scope for the spread of COVID-19 with random behaviors in the population.We were able to show the importance of including stochasticity within the population that models the real data from Colombian city of Bogot a.Our computational simulations suggest that the proposed model is reliable and robust in projecting the number of infected, recovered and deceased individuals during interventions such as confinement and partial opening.The results from this work can be used to inform data-driven decision making for improving public health in Colombia.
While the model seems to capture the data from the Colombian city of Bogot a well, it is potentially limited by the lack of prevalence data corresponding to registration of individuals who present a certain characteristic of the disease over a period of time, including susceptible, exposed, hospitalized and quarantined sub-populations, as well as the complete registry of the data of asymptomatic infected.The proposed model explains the containment of the pandemic by strict lock-down and decline in number of incidence cases and death.This work also opens a way to explore newer compartment models one can take into account including other factors such as migrations, demographics, spatial and vaccination effects.One can also compare the proposed model against reliable data from another country.Another potential area of research includes using these models in conjunction with machine learning and deep learning to obtain optimal parameters.Another area that can be considered is to study to what extent the results can be generalized to the overall population and across different geographical areas as well as patterns of contact between and within groups and in different social settings.All these will be considered in forthcoming papers.

Fig. 1 .
Fig. 1.A flow diagram of the proposed model dynamics.

Fig. 2 .
Fig. 2. Graph (a) on left denotes number of infected individuals(incidence); Graph (b) on right denotes cumulative number of infected individuals.

Fig. 3 .
Fig. 3. Graph (a) on the left denotes cumulative number of individuals recovered; Graph (b) on the right denotes cumulative number of deceased individuals.

Fig. 4 .
Fig. 4. Graph (a) on left denotes number of infected individuals (incidence); Graph (b) on right denotes cumulative number of infected individuals.

Fig. 5 .
Fig. 5. Graph (a) on left denotes cumulative number of individuals recovered; Graph (b) on right denotes cumulative number of deceased individuals.

Fig. 6 .
Fig. 6.Graph (a) on left denotes number of asymptomatic infected individuals (prevalence); Graph (b) on right denotes number of symptomatic infected individuals (prevalence).

Table 1
Definition of Parameters in the model.Rate at which confined individuals reenter the susceptible sub-population as a function of time k D. Niño-Torres, A. Ríos-Guti errez, V. Arunachalam et al.Infectious Disease Modelling 7 (2022) 199e211

Table 2
Statistical description of the data for the city of Bogot a D.C

Table 3
Adjusted drift parameters.

Table 4
Fixed parameters.