Coefficient identification in a SIS fractional-order modelling of economic losses in the propagation of COVID-19

A fractional-order SIS (Susceptible–Infectious–Susceptible) model with time-dependent coefficients is used to analyse some effects of the novel coronavirus 2019 (COVID-19). This generalized model is suitable for describing the COVID dynamics since it does not presume permanent immunity after contagion. The fractional derivative activates the memory property of the dynamics of the susceptible and infectious population time series. A coefficient identification inverse problem is posed, which consists of reconstructing the time-varying transmission and recovery rates, which are of paramount importance in practice for both medics and politicians. The inverse problem is reduced to a minimization problem, which is solved in a least squares sense. The iterative predictor–corrector algorithm reconstructs the time-dependent parameters in a piecewise-linear fashion. The economic losses emerging from social distancing using the calibrated model are also discussed. A comparison between the results obtained by the classical model and the fractional-order model is included, which is validated by ample tests with synthetic and real data.


Introduction
The COVID-19 coronavirus arised at the end of 2019 in Wuhan, China. It quickly spreads across many countries while still being a pandemic. Governments closed the so called non-essential business and services for weeks in order to prevent growth of the infectionespecially among the vulnerable populants -and thus to save lives.
Several approaches, including mathematical modelling, have been used to study the propagation of COVID-19. The subject of the COVID-19 pandemic modelling has been an active area of research for the recent 2-3 years.
The paper [1] offers a broad review of the history of the pandemic, established facts, and models of infection diffusion, mitigation strategies, and economic impact. The example of COVID-19 pandemic is used to illustrate the theoretical aspects, but the article also includes considerations concerning other historical epidemics and the danger of more infectious and less controllable outbreaks in the future.
In the wake of the COVID-19 pandemic, the Bulgarian government gradually employed non-pharmaceutical public health measures against COVID-19. At the first stage, strong quarantine measures for people SIR (Susceptible-Infectious-Removed) model. A novel and accurate PINN approach is used in [9] to solve the inverse problem in estimating parameters in SIR, SEIR (Susceptible-Exposed-Infectious-Removed) and SEIRS (Susceptible-Exposed-Infectious-Removed-Susceptible) models. In [10], a modified PINN approach is employed to study the parameters identifiability in a SICRD (Susceptible-Infectious-Confirmed-Recovered-Dead) model.
In [11] a state-dependent impulsive SIS model is discussed, where existence and the stability of the ordinary differential equations (ODE) system with saturated treatment is studied. Time-dependent probability and stationary distributions of stochastic SIS systems are considered in [12,13], while the stationary state probabilities of non-Markovian SEIS (Susceptible-Exposed-Infectious-Susceptible) model are compared to the ones of Markovian SIS model in [14]. The timedependent coefficients as piecewise constant functions of time in a SIR model are recovered in [15]. A SIR modelling of vaccinations and waning immunity is done in [16].
A brief review and some insights concerning the economic, financial and social impact of COVID-19 pandemic is given in [17]. In [18] measures to limit the spread of the virus are modelled in the framework of SIS model with deterministic time-dependent transmission rate ( ). Economic losses in the context of COVID-19 via integer-and fractional-order SEIR and SIR models are studied in [19][20][21].
Economic impact of the government interventions is investigated in [22] and it is found that social distancing has a negative effect on stock market returns, but other measures have a positive impact, while all interventions indirectly reduce the confirmed cases. The impact of the COVID-19 pandemic and the subsequent policy interventions substantially increase the volatility and the uncertainty on the global markets [23]. In [24] the effect of the monetary and fiscal policies on the stock and bond markets is studied and it is found that the fiscal policy has more positive effect on the stock market, and the monetary policy -on the bond market. The correlation between credit spreads of the relevant bonds and the corresponding bond market rate of return is investigated in [25] and it is concluded that there is a significant positive correlation, especially stronger for bonds with short maturities. In [26] it is found that the strict policies increase the volatility of returns on the stock markets due to the information campaigns and public event cancellations.
Two fractional epidemic models, the SIRS-and SEIRS-, which generalize the well-known integer order ( = 1) model are discussed in [27]. Estimation of the fractional order with real data of Florida is simulated numerically. Estimation for classical and fractional-order models are obtained in [28], using real data, reported by the World Health Organization. The paper [29] presents a simple cost-benefit analysis based on optimal control on a SIR model of disease propagation. It is shown that a 10-week lockdown is only optimal if the value of life for COVID-19 victims exceeds £10 m. Epidemiological parameters such as the transmission rate, lethality or detection rate of infected individuals in a SEIRD (Susceptible-Exposed-Infectious-Recovered-Dead) model are estimated in [30].
Fractional calculus is widely used to define subdiffusion models which arise in many areas, including biology and epidemiology. The important feature of the fractional-order derivative models are that they incorporate memory and hereditary properties, which is exhibited by many of the physical systems. It is true for the population -the number of infected today does not only depend on the number of infected yesterday but also on the infection course in the near past. The application of fractional-order models impact the parameters of the model itself, while providing (usually) a better fit to the data.
The goal of this paper is two-fold: firstly, we would like to calibrate an extension of the model used in [31] to a fractional-order ODE for studying the economic impact of the social distancing measures taken to slow down the spread of COVID-19, knowing the measurable values of the percentage of the infected people ( ) at certain instances of time.
Secondly, to employ the adjusted fractional ODE model to investigate the economic losses. All of these is done with synthetic and real data. Section 2 discusses the classical SIS model, Section 3 offers short introduction in fractional calculus and then presents the fractionalorder SIS model. The coefficient inverse problem and its solution are studied in Section 4. Computational examples are analysed in Section 5, while in Section 6 the algorithm is applied to real data and the results are thoroughly discussed. Section 7 concludes the paper.

The classical SIS model
We first discuss the simple SIS model analysed in [31]. In this model, an element of the population is either susceptible or infected. The infectious pathogen can be transmitted from an infected element to a susceptible one, with intensity proportional to the transmission rate ( ). An infected element recovers at rate ( ), and becomes part of the susceptible population afterwards.
Denote by ( ) the percentage of infected elements in the population (with constant size ) at time ≥ 0, so that the percentage of susceptible elements is ( ) = 1 − ( ). We assume that ( ) and The solution to (1) with constant coefficients is derived and studied in [32]. If ≤ (1− (0)) (resp. ≥ (1− (0))), then ( ) is an increasing (resp. decreasing) function of . When ≠ , If > , then ( ) has a positive limit as goes to infinity. More precisely, On the other hand, if ≤ , then ( ) goes to 0 as goes to infinity.
is called the basic reproduction number. In the theory of contagious disease modelling, the concept of  0 plays a crucial role since it quantifies the transmission potential of infectious diseases [33]. One of the fundamental results in the mathematical epidemiology is that there is a difference in the epidemic behaviour of the system depending on whether the basic reproduction number is less than one or more than one. It is a prominent example of the so called ≺≺threshold≻≻ behaviour. If the system is closed, i. e. there are no vital dynamics involved, the situation is simple. If  0 < 1, the disease dies out, or if  0 > 1, the infection breaks out into epidemics.
We employ the SIS model because of a couple of reasons. The most pronounced one is that the model is suitable to mimic the epidemic due to the lack of 'removed' compartment. Nowadays, it is already known that vaccinated or already recovered people could get infected (again). So, the transition of the population back to the 'susceptible' compartment is very important feature. Another reason is that the model is relatively simple to demonstrate our idea and could serve as a prototype for other more complex models.
Social distancing measures lower the transmission rate by diminishing the number of inter-personal contacts. Assume that, during an initial period [0, ), no social distancing measures are applied, and such actions have taken place during [ , * ]. In such a case Like the economic models considered in [34], the model considered in [31] for economic losses has two components: the first one estimates losses related to infections, while the second one accounts for losses due to the restrictions in the number of contacts. It summarized the following assumption: where and are non-negative constants. Assumption A1 can be justified as follows. The first component in L accounts for the fact that the infected individuals may require hospitalization and may not be able to work. It is related to the number of new infected see [31]. The second component in L is explained by the fact that, when contacts are reduced via social distancing measures, the corresponding reduction in economic output per unit of time is roughly proportional to the reduction in the number of contacts. As the infection is transmitted through contacts, can be considered as proportional to the average number of contacts per person. Thus the reduction in the number of contacts during [ , * ] is proportional to 0 − 1 .
The second assumption in [31] is as follows: The relation ≤ 0 (1 − (0)) is motivated by the fact that the number of infected elements typically increases during the initial phase of an epidemic provided no containment is applied. The relation 1 < implies that the social distancing measures put in place during the interval [ , * ] ensure that ( * ) can be made arbitrary small for sufficiently large * . The second component of formula (3), proposed in [31], covers the case when the transmission rate is a simple piecewise constant function of time. To elaborate it for our needs, we have to make it work with arbitrary function of time. As we discussed, the economic activity is depressed when ( ) decreases, or when d d ( ) < 0. The bigger the drop in (the more negative the first time derivative is), the greater the economic losses are. So basically we need to integrate the transmission coefficient in the periods where the derivative is negative: where is the total population (susceptible and infected).

Auxiliary results in fractional calculus and the fractional SIS model
In this section we introduce the essential preliminaries of the fractional calculus. The existence and uniqueness of the solution to the fractional SIS model, as well as the positivity of the solution, are proven.
The use of < 1 has just the effect of transforming (5) Then the left (forward) Caputo derivative as defined above could be expressed in this way: The left (forward) Riemann-Liouville derivative is defined as The right (backward) integral is defined as and further we define the right (backward) Caputo and Riemann-Liouville derivatives as There exists a relationship between the Caputo and Riemann-Liouville forward and backward derivatives For later use, we recall the following version of integration by parts, see e.g. [35].

The fractional SIS model
The fractional SIS model with time-dependent coefficients reads Due to the fractional derivative, we already work with units in the left hand-side. In order to keep the units equal on both hand-sides, we raise the coefficients to the power of . Furthermore, the basic reproduction number is defined as Theorem 2. There exists a unique solution ∈  1 [0, ] to the initial-value problem (5) and it remains positive for ∈ [0, ] provided that the initial data is positive.
Outline of the proof. The existence and uniqueness of the solution to the system (5) could be obtained on the base of the theory in [36] and some results in [37]. We need to show the positivity. On the contrary, let us assume that 0 > 0 and let 1 be the first time moment in which ( 1 ) = 0. Then, from (5) we have D 0 ( 1 ) = 0. But from GMF the function ( ) is non-decreasing for each ∈ [0, 1 ] which contradicts the assumption. □

The coefficient inverse problem and its solution
In this section we formulate the inverse problems for the classical and the fractional SIS models.
The function ( ) satisfies the direct problem (1) or (5) if the coefficient functions ( ), ( ) are known. In practice, though, the parameter functions are not known and they have to be identified. After obtaining their ''fair'' values, the model could be used for further robust analysis.
Hereinafter, for the ease of calculation, we introduce the change ( ) = ( ) − ( ). If the pair ( ) ≡ ( ( ), ( )) is known, then obviously The main question is how to find the coefficients ( ) ≡ ( ( ), ( )) if the population size is known at certain times: In the applications, the point observations (6) problem is reduced to the minimization of a functional of a kind The estimation of the pair function ( ) is referred as an inverse modelling problem. It essentially consists of adjusting the parameter values of the mathematical model in such a way to reproduce measured data.

Algorithm
Now the computational algorithm 1 is given in greater detail. First to introduce is the temporal mesh (Fig. 1). With * we denote every temporal point we have a measurement at. In practice, the data is reported usually on a daily basis. In the model (2) (or (5)), though, we know only * ( ) for a particular . This means we cannot recover both the coefficients ( ) and ( ) (or, equivalently, ( )) from this single measurement. To cope with the situation, we proceed as follows.
We define 'main' observation points * , which we will perform the iterations of our algorithm on. They are denoted with crosses on the time axis on Fig. 1. In practice, it is suitable to place these points on 7, 10, or 14 days. Without loss of generality, ▵ could differ for different as well as * could be far from equidistantly distributed. Furthermore, we need to introduce the indices , = 1, … , , which relates the indexation of the measurement points * ≡ * for ∀ .
Further, we will seek the unknown functions ( ) and ( ) as piecewise-linear functions of time. Actually, on iteration , the main point * acts as a temporary final pivot. Using the notation +1∕2 = ( * + * +1 )∕2, we propose: Step 1.1. Minimize the error functional Let ( 1 , 1 ) be the minimum of 1 . At this stage, we assume the coefficient functions to be constant on the interval [ * 0 , * 1 ]. This is done due to the following reason. Each linear function depends on two parameters -the slope and the intercept. It is impossible to uniquely reconstruct both the parameters by a single scalar, part of the minimum of (7). On the next stage, this would change because of the corrector feature of our algorithm.
Step 1.2. The recovered parameters so far are (1) Step 2.1. Minimize the error functional Let again the minimum of 2 be ( 2 , 2 ). Then, we assume the coefficients to be linear in [ * 0 , * 2 ], and these linear functions are build following the paradigm that (2) ( 1∕2 ) = 1 and (2) ( * 2 ) = 2 . Of course, the same is true for (2) . Mathematically speaking, this is easily computed through This means that we correct the result with a whole step backwards, which actually is the first step.
Step 2.2. Now, the reconstructed parameters look like (2) The next pair of steps is iteratively performed from = 3 to = .
Step k.1. Minimize the error functional where −3∕2 is such that * −3∕2 is the closest measurement point to Now we correct the result with a half-step backwards.
Step k.2. The recovered parameters are Step K+1. Finally, the whole piecewise linear parameter functions are identified as follows: The parameters are recovered as piecewise linear functions of time for ∈ [ * 0 , * ]. In general, the estimator functions of the cost functionals (7), (9), (12) are not unique. In order to deal with it, the proposed algorithm is of a predictor-corrector type. It corrects the last half of the previous step except the first step, which is corrected on the second step, and the last step, which is not corrected. Furthermore, the algorithm is fast and robust since the minimizers of the functionals are scalars but not functions.
A summary of the algorithm 1 is given in Appendix.

Computational simulations
In this section we provide numerical simulations to demonstrate the properties of the suggested algorithm. Firstly, we briefly explain the regularization we apply to the problems, and then we show numerical results with the integer-(1) and fractional-order (5) derivative models. In the next section we calibrate the model with real world data.
To put it in other words, we do synthetic data test in Sections 5.2 and 5.3. This means we fabricate values of ( ) and ( ) and solve the direct problem (1) or (5) to obtain synthetic measurements (6). Then we put these observations in algorithm 1 and we directly compare the true ( ) and ( ) with the algorithm output, constituted by the implied ( ) and ( ).
If the fair values of the parameters are not known, which is the case with the real data, it is needed to measure the algorithm performance in another way. This is why we introduce the residual norm (18), which we will talk about shortly.

Regularization
It is a well-known fact that the inverse problems are ill-posed. In general, the following conditions must be satisfied in order a problem to be well-posed: • the solution should exist; • the solution should be unique; • the solution should depend continuously on the input data.
If one or more of the above requirements are not satisfied, the problem is said to be ill-posed. Depending on the case, a common approach is to amend the original problem aiming the new problem to be well-posed or, at least, less severely ill-posed. Of course, the solution to the new problem would be different from the solution to the original (if exists), so the new problem have to be close to the original problem hoping that the new solution would be close to the original solution (again if exists).
This process is called a regularization of the ill-posed problem. Since our problem consists of minimizing a quadratic cost functional, the most common method is to add a regularization term. In order to control the magnitude of regularization (thus the discrepancy from the ≺≺true≻≻ solution), this term is multiplied by a regularization parameter. It is worth mentioning that we empirically checked the need of regularization, as the results from directly minimizing (7), (9), (12) with real data were far from satisfactory.
Hence we describe the type of regularization we apply to every functional. It is the fundamental idea of Tikhonov [38] we use.
To begin with 1 (7), we change it as follows: where is the regularization parameter vector. The quantities 0 and 0 usually denote a priori information about the values of and . We assign the initial approximations to the former.
Considering, , = 2, … , (9), (12), however, we employ a slightly different approach. Besides minimizing the difference with a priori information, the other basic idea of regularization is just to minimize the norm of the function. It is done to suppress oscillations which usually appear in the solution. However, at step , we have found the functions ( ) and ( ) up to −3∕2 . They do not change and thus do not contribute to the gradient of the functional. So, instead of minimizing the norm of the whole functions, we only minimize the difference with the previous estimator: where, as we discussed, −1 and −1 are the minimizers of reg −1 . Now, we proceed with the computational experiments.
Let us proceed to the quasi-real test with nonlinear parameter functions The results are plotted on Figs. 4 and 5.
The described situation matches the scenario of ongoing epidemic when ''climbing to the peak''. The phenomenon of the local minimum of the infected people, while the transmission rate ( ) is still increasing, is due to the latent feature of the disease. Considering Fig. 4, the parameters are approximately recovered, and the implied epidemic curve has very slight discrepancy around the fourth day from the true curve (Fig. 5).      The difference between the true and implied parameters are definitely the best measure for quality of the algorithm, but in the real world setting we do not know the true values. That is why we use the norm of the residual (18) to quantify the goodness-of-fit: For the constant values of the parameters const ( ( ), ( ) ) = 8.4847e-8 and for the nonlinear values it is ( ( ), ( ) ) = 0.0101.

Fractional-order derivative model
We regard now model (5). However, the change is now defined as ( ) = ( ) − ( ) due to the possible non-positivity of ( ) and the model becomes where , as usual, is the fractional order of the derivative.
Here we use the same temporal step and initial condition and regularization parameters. We test only the nonlinear case for and with = 0.9. The results could be viewed on Figs. 6 and 7. Please note that here the functions and estimators are raised to the power of . Although it seems that the coefficients are imperfectly S.G. Georgiev and L.G. Vulkov recovered this time, the residual norm in this case ( ( ), ( ) ) = 0.0071 is to a some extend expectedly smaller that the integer-order case. We continue the discussion shortly in the following section.

Discussion of a real world application
This section is devoted to a real data test. Here we use real-world observations to supply the algorithm with. As discussed, we do not know the true values of ( ) and ( ) (it is the aim of the study to suggest how to find them), so we cannot directly measure how close the implied ( ) and ( ) are to the true ones. Here we use the residual norm (18). Now we proceed with a real data fit. We take the official data for Bulgaria from [39]. We are going to use data for 2021, where the testing level is high enough to get an idea for the course of epidemic. To derive the ratio of infected population, we consider the number of positive test related to the total number of tests. Since the data is very irregular, we apply a simple moving average (SMA) filtering with period of length 7. Since this is a lagging transformation, we take the data from 26th December, 2020 to obtain obs for 1 st January to 4th December, 2021.
First we calibrate the integer-order derivative model (1). The results could be seen on Fig. 8.
The basic reproduction number  0 , defined earlier, is plotted on Fig. 9, together with the ≺≺true≻≻ and implied levels of infected. The residual norm is equal to ( ( ), ( ) ) = 0.0756. The total number of population is = 6 875 040 and we assume that = $50day -1 are the relative spenditures for curing and = $500day -1 are the relative losses for reduced economic activity. Then the total economic losses are L = $ 10 736 577 568.77 for 2021 year. Please note that the first component in L has almost 20 times greater economic impact than the second component. This roughly means that the economic damage caused by the treatment of the infection is times larger than the economic loss caused by the reduction of contacts. Now we proceed to the calibration of the fractional-order derivative model (5).
It is well known that optimal value of the fractional order should be expected to be less than one, but closer to one than to zero. This phenomenon is confirmed by our simulations. After simple grid search,  in Table 1 we present the values of the norm of the residual (18) for different values of the order . They are also plotted on Fig. 10. We can easily deduce that up to four digits precision, the best fit to the real data is achieved for = 0.7625. The optimal one is for a bit higher , but the difference is so small it is not worth seeking it. The norm of residual is ( ( ), ( ) ) = 0.0736. The calibration output is plotted on Figs. 11 and 12. Although following different trajectories (Fig. 11), both transmission coefficient ( ) and recovery coefficient ( ) tend to rise along with the development of the epidemic. This roughly means that the virulence of the coronavirus increases (due to the new strains), but the healthcare system attempts more and more successfully to handle the situation.
It is visible on Fig. 12 the two 'waves' of the epidemic in the spring and in the autumn of 2021. The basic reproduction number  0 goes above 1 and shortly thereafter the number of infected stats to rise. In the beginning of December, 2021, Bulgaria is quitting the fourth wave (since the beginning), but  0 > 1 still holds.
The epidemic curve obs is expectedly recovered in an exact manner as well. This may rise questions about the closeness of the recovered data to their raw counterpart, since it is obvious in both the syntheticand real-data experiments. Although the coefficients are reconstructed in view of piecewise-linear functions, they describe well the disease dynamics. The latter fact is true mainly due to the short period between the 'main' observations. In the real-world investigation, they are distributed on every 7 days, which accounts for the nicely fitted graphs. Despite the hereditary property of the sub-diffusion process, on each iteration, the slope of the recovered linear part does not depend on the previously identified parts, which gives the algorithm the freedom S.G. Georgiev and L.G. Vulkov  to match the data in a precise manner and in the same time there is no possibility of overfitting. The approach performs expectedly well on unknown data without ''memorizing patterns'', which is measured by the smallness of the residual norm .
Finally, the epidemic impact is estimated to be L = $ 10 527 251 758.46 for 2021 year (up to 4th December), using the same values of , and . Please note that, in the fractional-order case, we set ∶= in (4). The total losses are comparable with the ones in integer-order case, but now the first component is 30 times as more as the second component, and the implications are still the same.

Conclusion
It is already a well-known fact that the epidemic changes its behaviour with time. So should do the coefficients in the models that frame the epidemic. To incorporate this feature, we suggest a model with time-dependent coefficients as piecewise linear functions with short linear periods. What is more, the number of the infectious now is a function not only of the level of infectious right before the current moment, but the infection course in the near past must also be considered. It is the memory property that reflects this phenomenon, which is modelled by the fractional-order derivative. Employing such a model, the paper reconstructs the driving parameters and assesses the economic impact of the pandemic and the subsequently imposed non-pharmaceutical measures.
The proposed predictor-corrector method recovers the dynamics coefficients in a step-by-step manner. On each stage, only a pair of scalars are sought which makes the approach fast and resource-efficient. The regularization makes it robust too. The numerical experiments show the algorithm is able to cope with noisy real data. Furthermore, the economic losses are evaluated in the setting of the identified parameters by the algorithm.
The findings of the study are useful for rating possible anti-epidemic measures and taking appropriate decisions to control the spread of the disease in the most harmless way from the view of both healthcare and budgeting. However, the described approach is applicable in calibrating any type of dynamics systems, the latter occurring frequently in modelling the surrounding natural and anthropogenic processes.
Our approach could be enhanced in a couple of ways to consider possible future work. For example, if the gradients of the functionals are found explicitly, the minimization method could be improved. It is also interesting to test the model stability in view of more scarce or corrupted data. Moreover, the algorithm is suitable for calibration of more advanced and complex models, for example by including other compartments for vaccinated, isolated, ICUs (intensive care units), or accounting for inhomogeneous diffusion of the transmission, etc.