SI epidemic model applied to COVID-19 data in mainland China

This article is devoted to parameter identification in the SI model. We consider several methods, starting with an exponential fit to the early cumulative data of SARS-CoV2 in mainland China. This methodology provides a way to compute the parameters at the early stage of the epidemic. Next, we establish an identifiability result. Then we use the Bernoulli–Verhulst model as a phenomenological model to fit the data and derive some results on parameter identification. The last part of the paper is devoted to numerical algorithms for fitting a daily piecewise constant rate of transmission.


Introduction
Estimating the average transmission rate is one of the most crucial challenges in the epidemiology of communicable diseases. This rate conditions the entry into the epidemic phase of the disease and its return to the extinction phase, if it has diminished sufficiently. It is the combination of three factors: the coefficient of virulence, linked to the infectious agent (in the case of infectious transmissible diseases); the coefficient of susceptibility, linked to the host (the two being summarized into the probability of transmission); and the number of contacts per unit of time between individuals [1]. The coefficient of virulence may change over time due to mutation over the course of the disease history. The second and third factors may also change if mitigation measures are taken. This was the case in China from the start of the pandemic [2]. Monitoring the decrease in the average transmission rate is an excellent way to monitor the effectiveness of these mitigation measures. Estimating the rate is therefore a central problem in the fight against epidemics.
The goal of this article is to understand how to compare the SI model to the reported epidemic data, so that the model can be used to predict the future evolution of the epidemic spread and to test various possible scenarios of social mitigation measures. For t ≥ t 0 , the SI model is the following:

$$ S'(t) = -\tau(t) S(t) I(t) \quad \text{and} \quad I'(t) = \tau(t) S(t) I(t) - \nu I(t), \qquad (1.1) $$

where S(t) is the number of susceptible and I(t) the number of infectious at time t. This system is supplemented by initial data

$$ S(t_0) = S_0 \ge 0 \quad \text{and} \quad I(t_0) = I_0 \ge 0. \qquad (1.2) $$

In this model, the rate of transmission τ(t) combines the number of contacts per unit of time and the probability of transmission. The transmission of the pathogen from the infectious to the susceptible individuals is described by a mass action law τ(t) S(t) I(t) (which is also the flux of new infectious). The quantity 1/ν is the average duration of the infectious period and νI(t) is the flux of recovering or dying individuals. At the end of the infectious period, we assume that a fraction f ∈ (0, 1] of the infectious individuals is reported. Let CR(t) be the cumulative number of reported cases. We assume that

$$ CR(t) = CR_0 + \nu f \, CI(t) \quad \text{for } t \ge t_0, \qquad (1.3) $$

where

$$ CI(t) = \int_{t_0}^{t} I(\sigma) \, d\sigma $$

is the cumulative number of infectious individuals.
Assumption 1.1. We assume that S 0 , ν and f are known parameters.
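As an illustration of system (1.1)-(1.3), the following minimal sketch integrates the model with a classical fourth-order Runge-Kutta scheme. It is not taken from the article: the function name `simulate_si`, the step count and the choice ν = 0.2, f = 0.5 are assumptions made for the example, while S 0 , I 0 = 1521 and τ 0 = 3.3214 × 10⁻¹⁰ are values quoted elsewhere in the text.

```python
import math

def simulate_si(s0, i0, cr0, tau, nu, f, days, steps_per_day=50):
    """Integrate (1.1)-(1.3) with RK4; `tau` is a function of time (in days).

    Returns a list of daily tuples (t, S, I, CR).
    """
    def rhs(t, s, i, cr):
        new_cases = tau(t) * s * i            # mass action flux of new infectious
        return (-new_cases, new_cases - nu * i, nu * f * i)

    h = 1.0 / steps_per_day
    t, y = 0.0, (s0, i0, cr0)
    out = [(t,) + y]
    for step in range(1, days * steps_per_day + 1):
        k1 = rhs(t, *y)
        k2 = rhs(t + h / 2, *[a + h / 2 * b for a, b in zip(y, k1)])
        k3 = rhs(t + h / 2, *[a + h / 2 * b for a, b in zip(y, k2)])
        k4 = rhs(t + h, *[a + h * b for a, b in zip(y, k3)])
        y = tuple(a + h / 6 * (p + 2 * q + 2 * r + w)
                  for a, p, q, r, w in zip(y, k1, k2, k3, k4))
        t = step * h
        if step % steps_per_day == 0:
            out.append((t,) + y)
    return out

# S0, I0 and TAU0 are quoted in the text; NU and F are assumptions for this example.
S0, I0, TAU0, NU, F = 1.4e9, 1521.0, 3.3214e-10, 0.2, 0.5
traj = simulate_si(S0, I0, 0.0, lambda t: TAU0, NU, F, days=5)

# At the early stage S(t) is almost constant, so I(t) should stay close to
# I0 * exp((TAU0 * S0 - NU) * t).
early_growth = I0 * math.exp((TAU0 * S0 - NU) * 5.0)
```

Since S' + I' + CR'/f = 0 in the model, the quantity S + I + (CR − CR 0 )/f is conserved along the trajectory, which gives a simple sanity check on any implementation.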
Throughout this paper, the parameter S 0 = 1.4 × 10 9 will be the entire population of mainland China (since COVID-19 is a newly emerging disease). The actual number of susceptibles S 0 can be smaller since some individuals can be partially (or totally) immunized by previous infections or other factors. This is also true for SARS-CoV2, even if COVID-19 is a newly emerging disease. In fact, for COVID-19 the level of susceptibility may depend on blood group and genetic lineage. It is indeed suspected that the blood group O is associated with a lower susceptibility to SARS-CoV2 while a gene cluster inherited from Neanderthal has been identified as a risk factor for severe symptoms [3,4].
At the very beginning of the epidemic, the average duration of the infectious period 1/ν is unknown, since the virus has never been investigated before. Therefore, at the beginning of the COVID-19 epidemic, medical doctors and public health scientists used previously estimated average durations of the infectious period to make public health recommendations. Here we show that the average infectious period is impossible to estimate by using only the time series of reported cases, and must therefore be identified by other means. Indeed, with the data of SARS-CoV2 in mainland China, we will fit the cumulative number of reported cases almost perfectly for any non-negative value 1/ν < 3.3 days. In the literature, several estimates were obtained: 11 days in [5], 9.5 days in [6], 8 days in [7] and 3.5 days in [8]. The recent survey by Byrne et al. [9] focuses on this subject.
Result
In §3, our analysis shows that:
- It is hopeless to estimate the exact value of the duration of infectiousness by using SI models. Several values of the average duration of the infectious period give the exact same fit to the data.
- We can estimate an upper bound for the duration of infectiousness by using SI models. In the case of SARS-CoV2 in mainland China, this upper bound is 3.3 days.

royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 7: 201878

In [10], it is reported that transmission of COVID-19 infection may occur from an infectious individual who is not yet symptomatic. In [11], it is reported that COVID-19-infected individuals generally develop symptoms, including mild respiratory symptoms and fever, on average 5-6 days after the infection date (with a confidence of 95%, range 1-14 days). In [12], it is reported that the median time prior to symptom onset is 3 days, the shortest 1 day, and the longest 24 days. It is evident that these time periods play an important role in understanding COVID-19 transmission dynamics. Here the fraction of reported individuals f is unknown as well.
As a consequence, the parameters 1/ν and f have to be estimated by another method, for instance by a direct survey carried out on an appropriate sample of the population in order to evaluate the two parameters.
The goal of this article is to focus on the estimation of the two remaining parameters. Namely, knowing the above-mentioned parameters, we plan to identify
- I 0 the initial number of infectious at time t 0 ;
- τ(t) the rate of transmission at time t.
This problem has already been considered in several articles. In the early 1970s, London & Yorke [13,14] already discussed the time-dependent rate of transmission in the context of measles, chickenpox and mumps. More recently, in Wang & Ruan [15] the question of reconstructing the rate of transmission was considered for the 2002-2004 SARS outbreak in China. In Chowell et al. [16], a specific form was chosen for the rate of transmission and applied to the Ebola outbreak in Congo. Another approach was also proposed in Smirnova et al. [17].
In §2, we will explain how to apply the method introduced in Liu et al. [18] to fit the early cumulative data of SARS-CoV2 in China. This method provides a way to compute I 0 and τ 0 = τ(t 0 ) at the early stage of the epidemic. In §3, we establish an identifiability result in the spirit of Hadeler [19].
In §4, we use the Bernoulli-Verhulst model as a phenomenological model to describe the data. As it was observed in several articles, the data from mainland China (and other countries as well) can be fitted very well by using this model. As a consequence, we will obtain an explicit formula for τ(t) and I 0 expressed as a function of the parameters of the Bernoulli-Verhulst model and the remaining parameters of the SI model. This approach gives a very good description of this set of data. The disadvantage of this approach is that it requires an evaluation of the final size CR ∞ from the early beginning (or at least it requires an estimation of this quantity).
Therefore, in order to be predictive, we will explore in the remaining sections of the paper the possibility of constructing a day-by-day rate of transmission. Here we should refer to Bakhta et al. [20] where another novel forecasting method was proposed.
In §5, we will prove that the daily cumulative data can be fitted exactly by at most one sequence of day-by-day piecewise constant transmission rates. In §6, we propose a numerical method to compute such a (piecewise constant) rate of transmission. Section 7 is devoted to the discussion, and we will present some figures showing the daily basic reproduction number for the COVID-19 outbreak in mainland China.
2. Estimating τ(t 0 ) and I 0 at the early stage of the epidemic
In this section, we apply the method presented in [21] to the SI model. At the early stage of the epidemic, we can assume that S(t) is almost constant and equal to S 0 . We can also assume that τ(t) remains constant and equal to τ 0 = τ(t 0 ). Therefore, by substituting these values into the I-equation of system (1.1), we obtain

$$ I'(t) = (\tau_0 S_0 - \nu) I(t). $$

Therefore,

$$ I(t) = I_0 e^{\chi_2 (t - t_0)}, \quad \text{where} \quad \chi_2 = \tau_0 S_0 - \nu. \qquad (2.1) $$

By (1.3), the cumulative number of reported cases is then CR 0 plus a multiple of $e^{\chi_2 (t - t_0)} - 1$. We obtain a first phenomenological model for the cumulative number of reported cases (valid only at the early stage of the epidemic)

$$ CR(t) = \chi_1 e^{\chi_2 t} - \chi_3. \qquad (2.3) $$

In figure 1, we compare the model to the COVID-19 data for mainland China. The data used in the article are taken from [22][23][24] and reported in appendix A. In order to estimate the parameter χ 3 , we minimize the distance between CR Data (t) + χ 3 and the best exponential fit $t \to \chi_1 e^{\chi_2 t}$ (i.e. we use the Matlab function fit(t, data,'exp1')).

Result
In §3, our analysis shows that:
- It is hopeless to estimate the fraction of reported cases by using SI models. Several values of the fraction of reported cases give the exact same fit to the data.
- We can estimate a lower bound for the fraction of reported cases. We obtain 3.83 × 10 −5 < f ≤ 1. This lower bound is not significant. Therefore, we cannot say anything about the fraction of unreported cases from this class of models.
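The early-stage fit can be sketched as follows. This is not the authors' Matlab code: instead of the nonlinear `fit(t, data,'exp1')`, the sketch swaps in a plain log-linear least-squares regression of log(CR Data (t) + χ 3 ) on t, and scans χ 3 over a user-supplied grid. The data below are synthetic, and the function names are hypothetical. The last function uses I 0 = χ 1 χ 2 e^{χ 2 t 0 }/(νf), which follows from CR'(t 0 ) = νf I 0 , and τ 0 = (χ 2 + ν)/S 0 , which follows from χ 2 = τ 0 S 0 − ν.

```python
import math

def log_linear_fit(t, y):
    """Least-squares line through (t, log y); returns (chi1, chi2, sse)."""
    n = len(t)
    ly = [math.log(v) for v in y]
    t_mean = sum(t) / n
    l_mean = sum(ly) / n
    sxx = sum((a - t_mean) ** 2 for a in t)
    sxy = sum((a - t_mean) * (b - l_mean) for a, b in zip(t, ly))
    chi2 = sxy / sxx
    chi1 = math.exp(l_mean - chi2 * t_mean)
    sse = sum((b - (l_mean + chi2 * (a - t_mean))) ** 2 for a, b in zip(t, ly))
    return chi1, chi2, sse

def fit_early_phase(t, cr_data, chi3_grid):
    """Pick the chi3 on the grid for which CR + chi3 is closest to an exponential."""
    best = None
    for chi3 in chi3_grid:
        shifted = [c + chi3 for c in cr_data]
        if min(shifted) <= 0.0:
            continue                      # the logarithm needs positive values
        chi1, chi2, sse = log_linear_fit(t, shifted)
        if best is None or sse < best[3]:
            best = (chi1, chi2, chi3, sse)
    return best[:3]

def initial_conditions(chi1, chi2, t0, nu, f, s0):
    """I0 and tau0 deduced from the exponential fit."""
    i0 = chi1 * chi2 * math.exp(chi2 * t0) / (nu * f)
    tau0 = (chi2 + nu) / s0
    return i0, tau0

# Synthetic data generated from CR(t) = 50 e^{0.3 t} - 40 (made-up numbers).
days = [float(d) for d in range(10)]
data = [50.0 * math.exp(0.3 * d) - 40.0 for d in days]
chi1, chi2, chi3 = fit_early_phase(days, data, [float(k) for k in range(101)])
i0, tau0 = initial_conditions(chi1, chi2, t0=0.0, nu=0.2, f=0.5, s0=1.4e9)
```

A log-linear regression weights the early, small counts more heavily than a nonlinear least-squares fit on the raw counts would, so the two approaches can give somewhat different parameters on noisy data.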
The estimated initial number of infected and transmission rate
By using (1.3) and (2.3), we obtain

$$ I_0 = \frac{CR'(t_0)}{\nu f} = \frac{\chi_1 \chi_2 e^{\chi_2 t_0}}{\nu f}, \qquad (2.4) $$

and by using (2.1)

$$ \tau_0 = \frac{\chi_2 + \nu}{S_0}. \qquad (2.5) $$

The influence of the errors made in the estimations (at the early stage of the epidemic) has been considered in the recent article by Roda et al. [25]. To understand this problem, let us first consider the case of a constant rate of transmission τ(t) = τ 0 in the model (1.1). In that case (1.1) becomes

$$ S'(t) = -\tau_0 S(t) I(t) \quad \text{and} \quad I'(t) = \tau_0 S(t) I(t) - \nu I(t). \qquad (2.6) $$

By using the S-equation of model (2.6) we obtain

$$ S(t) = S_0 \exp\left(-\tau_0 \, CI(t)\right), $$

where CI(t) is the cumulative number of infectious individuals. Substituting this formula for S(t) in the I-equation of (2.6) we obtain

$$ I'(t) = S_0 \exp\left(-\tau_0 \, CI(t)\right) \tau_0 \, CI'(t) - \nu \, CI'(t). $$

Therefore, by integrating the above equation between t 0 and t we obtain

$$ CI'(t) = I_0 + S_0 \left[1 - \exp\left(-\tau_0 \, CI(t)\right)\right] - \nu \, CI(t). \qquad (2.7) $$

Remarkably, equation (2.7) is monotone. We refer to Smith [26] for a comprehensive presentation of monotone systems. By applying a comparison principle to (2.7), we are in a position to confirm the intuition about epidemic SI models. Note that the monotone properties are only true for the cumulative number of infectious (they are false for the number of infectious).
Theorem 2.2. Let t > t 0 be fixed. The cumulative number of infectious CI(t) is strictly increasing with respect to the following quantities:
(i) I 0 > 0, the initial number of infectious individuals;
(ii) S 0 > 0, the initial number of susceptible individuals;
(iii) τ > 0, the transmission rate;
(iv) 1/ν > 0, the average duration of the infectiousness period.
Remark 2.3. By using the data for mainland China, we obtain 95% confidence bounds $I^{\pm}_{0,95\%}$ and $\tau^{\pm}_{0,95\%}$ for I 0 and τ 0 (see (2.8) and (2.9)). In figure 2, we plot the upper and lower solutions CR + (t) (obtained by using $I_0 = I^{+}_{0,95\%}$ and $\tau_0 = \tau^{+}_{0,95\%}$) and CR − (t) (obtained by using $I_0 = I^{-}_{0,95\%}$ and $\tau_0 = \tau^{-}_{0,95\%}$), which bound the blue region; the black curve corresponds to the best estimated values I 0 = 1521 and τ 0 = 3.3214 × 10 −10 .
Recall that the final size of the epidemic corresponds to the positive equilibrium of (2.7)

$$ I_0 + S_0 \left[1 - \exp\left(-\tau_0 \, CI_\infty\right)\right] - \nu \, CI_\infty = 0. $$

In figure 2, the changes in the parameters I 0 and τ 0 (in (2.8) and (2.9)) do not significantly affect the final size.
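The positive equilibrium of (2.7), that is the solution CI ∞ > 0 of I 0 + S 0 [1 − exp(−τ 0 CI ∞ )] − ν CI ∞ = 0, can be computed by bisection. The sketch below is only an illustration: the helper names are hypothetical and the value ν = 0.2 is an assumption, whereas I 0 = 1521, τ 0 = 3.3214 × 10⁻¹⁰ and S 0 = 1.4 × 10⁹ are quoted in the text.

```python
import math

def equilibrium_residual(x, i0, s0, tau0, nu):
    """Right-hand side of (2.7) evaluated at CI = x."""
    return i0 - s0 * math.expm1(-tau0 * x) - nu * x

def final_size(i0, s0, tau0, nu):
    """Positive root of the residual, found by bisection.

    The residual is positive at x = 0 (it equals i0) and negative at
    x = (i0 + s0) / nu, so the root lies in between.
    """
    lo, hi = 0.0, (i0 + s0) / nu
    while hi - lo > 1e-12 * hi:
        mid = 0.5 * (lo + hi)
        if equilibrium_residual(mid, i0, s0, tau0, nu) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

S0, I0, TAU0 = 1.4e9, 1521.0, 3.3214e-10   # values quoted in the text
NU = 0.2                                    # assumption made for this example
ci_inf = final_size(I0, S0, TAU0, NU)
ci_inf_more_seeds = final_size(2.0 * I0, S0, TAU0, NU)
ci_inf_more_contacts = final_size(I0, S0, 1.1 * TAU0, NU)
```

Consistently with Theorem 2.2, the computed final size increases when I 0 or τ 0 is increased, but doubling I 0 changes it only marginally because I 0 is tiny compared with S 0 .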

Theoretical formula for τ(t)
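By using the S-equation of model (1.1) we obtain

$$ S(t) = S_0 \exp\left(-\int_{t_0}^{t} \tau(\sigma) I(\sigma) \, d\sigma\right), $$

next by using the I-equation of model (1.1) we obtain

$$ I'(t) = S_0 \exp\left(-\int_{t_0}^{t} \tau(\sigma) I(\sigma) \, d\sigma\right) \tau(t) I(t) - \nu I(t), $$

and by taking the integral between t 0 and t we obtain a Volterra integral equation for the cumulative number of infectious

$$ CI'(t) = I_0 + S_0 \left[1 - \exp\left(-\int_{t_0}^{t} \tau(\sigma) I(\sigma) \, d\sigma\right)\right] - \nu \, CI(t), \qquad (3.1) $$

which is equivalent to (by using (1.3))

$$ CR'(t) = \nu f \left(I_0 + S_0 \left[1 - \exp\left(-\frac{1}{\nu f} \int_{t_0}^{t} \tau(\sigma) CR'(\sigma) \, d\sigma\right)\right]\right) + \nu \, CR_0 - \nu \, CR(t). \qquad (3.2) $$

The following result makes it possible to obtain a perfect match between the SI model and a given cumulative curve by choosing the time-dependent rate of transmission τ(t).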
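Theorem 3.1. Let S 0 , ν, f, I 0 > 0 and CR 0 ≥ 0 be given. Let t → I(t) be the second component of system (1.1). Let $\widehat{CR} : [t_0, \infty) \to \mathbb{R}$ be a two times continuously differentiable function satisfying

$$ \widehat{CR}(t_0) = CR_0, \quad \widehat{CR}'(t_0) = \nu f I_0, \quad \widehat{CR}'(t) > 0 \ \text{ for all } t \ge t_0, \qquad (3.5) $$

and

$$ \nu f (I_0 + S_0) - \widehat{CR}'(t) - \nu \left(\widehat{CR}(t) - CR_0\right) > 0 \ \text{ for all } t \ge t_0. \qquad (3.6) $$

Then

$$ \widehat{CR}(t) = CR_0 + \nu f \int_{t_0}^{t} I(\sigma) \, d\sigma \ \text{ for all } t \ge t_0, \qquad (3.7) $$

if and only if

$$ \tau(t) = \frac{\nu f \left(\dfrac{\widehat{CR}''(t)}{\widehat{CR}'(t)} + \nu\right)}{\nu f (I_0 + S_0) - \widehat{CR}'(t) - \nu \left(\widehat{CR}(t) - CR_0\right)}. \qquad (3.8) $$

Proof. Assume first that (3.7) is satisfied. Then by using equation (3.1) we deduce that

$$ \int_{t_0}^{t} \tau(\sigma) I(\sigma) \, d\sigma = \ln\left(\frac{S_0}{I_0 + S_0 - CI'(t) - \nu \, CI(t)}\right), $$

therefore by taking the derivative on both sides we have

$$ \tau(t) I(t) = \frac{I'(t) + \nu I(t)}{I_0 + S_0 - I(t) - \nu \, CI(t)}, $$

and by using the fact that CR(t) − CR 0 = νfCI(t) we obtain (3.8).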
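The statement of Theorem 3.1 can be checked numerically. The sketch below assumes that formula (3.8) reads τ = νf (CR''/CR' + ν) / (νf (I 0 + S 0 ) − CR' − ν (CR − CR 0 )), which is what one obtains from (1.1) together with CR' = νf I. The target curve is a hypothetical logistic function chosen only for the example (it is not the article's data), and ν = 0.2, f = 0.5 are illustrative values. The code feeds the resulting τ(t) into the SI model and measures how far the model's CR(t) drifts from the target.

```python
import math

NU, F, S0 = 0.2, 0.5, 1.4e9        # nu and f are illustrative choices
K, A, R = 8.0e4, 399.0, 0.15       # hypothetical logistic target curve

def target(t):
    """Return (CR, CR', CR'') for the target curve K / (1 + A e^{-R t})."""
    cr = K / (1.0 + A * math.exp(-R * t))
    dcr = R * cr * (1.0 - cr / K)
    d2cr = R * dcr * (1.0 - 2.0 * cr / K)
    return cr, dcr, d2cr

CR0, DCR0, _ = target(0.0)
I0 = DCR0 / (NU * F)               # compatibility condition CR'(t0) = nu f I0

def tau_38(t):
    """Transmission rate given by the assumed form of (3.8)."""
    cr, dcr, d2cr = target(t)
    num = NU * F * (d2cr / dcr + NU)
    den = NU * F * (I0 + S0) - dcr - NU * (cr - CR0)
    return num / den

def rhs(t, s, i, cr):
    new_cases = tau_38(t) * s * i
    return (-new_cases, new_cases - NU * i, NU * F * i)

def max_relative_gap(t_end=80.0, h=0.02):
    """Integrate the SI model with RK4 and compare CR(t) with the target."""
    t, y, worst = 0.0, (S0, I0, CR0), 0.0
    for step in range(1, int(round(t_end / h)) + 1):
        k1 = rhs(t, *y)
        k2 = rhs(t + h / 2, *[a + h / 2 * b for a, b in zip(y, k1)])
        k3 = rhs(t + h / 2, *[a + h / 2 * b for a, b in zip(y, k2)])
        k4 = rhs(t + h, *[a + h * b for a, b in zip(y, k3)])
        y = tuple(a + h / 6 * (p + 2 * q + 2 * r + w)
                  for a, p, q, r, w in zip(y, k1, k2, k3, k4))
        t = step * h
        worst = max(worst, abs(y[2] - target(t)[0]) / target(t)[0])
    return worst

gap = max_relative_gap()
```

With these choices the decay rate R of the target is smaller than ν, so the reconstructed τ(t) stays positive over the whole simulation; a faster-decaying target would still be reproduced, but with a transmission rate that changes sign.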

Explicit formula for τ(t) and I 0
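Many phenomenological models have been compared to the data during the first phase of the COVID-19 outbreak. We refer to the paper of Tsoularis & Wallace [27] for a nice survey on the generalized logistic equations. Let us consider here, for example, the Bernoulli-Verhulst equation

$$ CR'(t) = \chi_2 \, CR(t) \left[1 - \left(\frac{CR(t)}{CR_\infty}\right)^{\theta}\right] \ \text{ for } t \ge t_0, \qquad (4.1) $$

supplemented with the initial data $CR(t_0) = CR_0$. Let us recall the explicit formula for the solution of (4.1)

$$ CR(t) = \frac{e^{\chi_2 (t - t_0)} \, CR_0}{\left[1 + \left(\dfrac{CR_0}{CR_\infty}\right)^{\theta} \left(e^{\chi_2 \theta (t - t_0)} - 1\right)\right]^{1/\theta}}. \qquad (4.2) $$

Assumption 4.1. We assume that the cumulative numbers of reported cases CR Data (t i ) are known for a sequence of times t 0 < t 1 < · · · < t n+1 (see figure 3).
By using (4.1), we deduce that

$$ CR''(t) = \chi_2 \, CR'(t) \left[1 - (1 + \theta) \left(\frac{CR(t)}{CR_\infty}\right)^{\theta}\right]. $$

Since CR(t) < CR ∞ , by considering the sign of the numerator and the denominator of (4.5), we obtain the following proposition.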

Estimated initial number of infected
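By combining (1.3) and the Bernoulli-Verhulst equation (4.1) for t → CR(t), we deduce the initial number of infected

$$ I_0 = \frac{CR'(t_0)}{\nu f} = \frac{\chi_2 \, CR_0 \left[1 - \left(\dfrac{CR_0}{CR_\infty}\right)^{\theta}\right]}{\nu f}. $$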

Estimated rate of transmission
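By using the Bernoulli-Verhulst equation (4.1) and substituting (4.4) in (3.8), we obtain

$$ \tau(t) = \frac{\nu f \left(\chi_2 \left[1 - (1 + \theta) \left(\dfrac{CR(t)}{CR_\infty}\right)^{\theta}\right] + \nu\right)}{\nu f (I_0 + S_0) + \nu \, CR_0 - CR(t) \left(\chi_2 \left[1 - \left(\dfrac{CR(t)}{CR_\infty}\right)^{\theta}\right] + \nu\right)}. \qquad (4.5) $$

We observe that the rate of transmission given by formula (4.5) becomes negative whenever ν < χ 2 θ: as CR(t) approaches CR ∞ , the numerator tends to νf(ν − χ 2 θ). In figure 5, we plot the numerical simulation obtained from (1.1) to (1.3) when t → τ(t) is replaced by the explicit formula (4.5). It is surprising that we can reproduce perfectly the original Bernoulli-Verhulst solution even when τ(t) becomes negative (see figure 3). This was not guaranteed at first, since the I-class is losing individuals who are recovering.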
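The sign change of the transmission rate can be illustrated numerically. The sketch below evaluates τ(t) along the Bernoulli-Verhulst solution, assuming the forms CR' = χ 2 CR [1 − (CR/CR ∞ )^θ] for (4.1) and τ = νf (χ 2 [1 − (1 + θ)(CR/CR ∞ )^θ] + ν) / (νf (I 0 + S 0 ) + ν CR 0 − CR (χ 2 [1 − (CR/CR ∞ )^θ] + ν)) for (4.5). Only the product χ 2 θ = 0.14 and S 0 come from the text; the split χ 2 = 0.7, θ = 0.2 and the values of CR ∞ , CR 0 and f are hypothetical.

```python
import math

CHI2, THETA = 0.7, 0.2                  # hypothetical split of chi2 * theta = 0.14
CR_INF, CR0, T0 = 8.0e4, 200.0, 0.0     # hypothetical curve parameters
S0, F = 1.4e9, 0.5                      # S0 from the text, f illustrative

def cr_bv(t):
    """Closed-form Bernoulli-Verhulst solution."""
    e = math.exp(CHI2 * (t - T0))
    r = (CR0 / CR_INF) ** THETA
    return CR0 * e / (1.0 + r * (e ** THETA - 1.0)) ** (1.0 / THETA)

def i0_bv(nu):
    """Initial number of infected, CR'(t0) / (nu f)."""
    return CHI2 * CR0 * (1.0 - (CR0 / CR_INF) ** THETA) / (nu * F)

def tau_bv(t, nu):
    """Transmission rate along the Bernoulli-Verhulst curve (assumed form of (4.5))."""
    cr = cr_bv(t)
    x = (cr / CR_INF) ** THETA
    num = nu * F * (CHI2 * (1.0 - (1.0 + THETA) * x) + nu)
    den = nu * F * (i0_bv(nu) + S0) + nu * CR0 - cr * (CHI2 * (1.0 - x) + nu)
    return num / den

def first_negative_day(nu, horizon=200):
    """First integer day on which tau is negative, or None if it never is."""
    for day in range(horizon + 1):
        if tau_bv(float(day), nu) < 0.0:
            return day
    return None

day_fast_recovery = first_negative_day(0.2)   # nu > chi2 * theta
day_slow_recovery = first_negative_day(0.1)   # nu < chi2 * theta
```

With ν = 0.2 the rate never becomes negative over the horizon, whereas with ν = 0.1 it does after a few weeks, in line with the threshold ν = χ 2 θ discussed above.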

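Compatibility of the model SI with the COVID-19 data for mainland China
The SI model is compatible with the data only when τ(t) stays positive for all t ≥ t 0 . From our estimation for the Chinese COVID-19 data, we obtain χ 2 θ = 0.14. Therefore, from (4.6), we deduce that the model is compatible with the data only when ν is sufficiently large, which means that the average duration of the infectious period 1/ν must be shorter than 3.3 days. Similarly, the condition (4.7) implies

$$ 3.83 \times 10^{-5} < f \le 1. $$

So according to this estimation the fraction of reported cases 0 < f ≤ 1 can be almost as small as we want.

5. Computing numerically a day-by-day piecewise constant rate of transmission
Assumption 5.1. We assume that the rate of transmission τ(t) is piecewise constant and for each i = 0, …, n,

$$ \tau(t) = \tau_i \ \text{ whenever } t_i \le t < t_{i+1}. \qquad (5.1) $$

For t ∈ [t i−1 , t i ], we deduce by using assumption 5.1 that

$$ \int_{t_0}^{t} \tau(\sigma) CR'(\sigma) \, d\sigma = \sum_{j=0}^{i-2} \tau_j \left[CR(t_{j+1}) - CR(t_j)\right] + \tau_{i-1} \int_{t_{i-1}}^{t} CR'(s) \, ds. $$

Therefore, by (3.2),

$$ CR'(t) = \nu f \left(I_0 + S_0 \left[1 - \Pi_{i-1} \exp\left(-\frac{\tau_{i-1}}{\nu f} \left(CR(t) - CR(t_{i-1})\right)\right)\right]\right) + \nu \, CR_0 - \nu \, CR(t), \qquad (5.2) $$

where

$$ \Pi_{i-1} = \exp\left(-\sum_{j=0}^{i-2} \frac{\tau_j}{\nu f} \left[CR(t_{j+1}) - CR(t_j)\right]\right). $$

By fixing τ i−1 = 0 on the right-hand side of (5.2), we get

$$ CR'(t) = \nu f \left(I_0 + S_0 \left[1 - \Pi_{i-1}\right]\right) + \nu \, CR_0 - \nu \, CR(t), $$

and when τ i−1 → ∞ we obtain

$$ CR'(t) = \nu f (I_0 + S_0) + \nu \, CR_0 - \nu \, CR(t). $$

By using the theory of monotone ordinary differential equations [26], we deduce that the map τ i → CR(t i ) is monotone increasing, and we get the following result.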
Theorem 5.2. Let assumptions 1.1, 4.1 and 5.1 be satisfied. Let I 0 be fixed. Then we can find a unique sequence τ 0 , τ 1 , …, τ n of non-negative numbers such that t → CR(t), the solution of (3.2), fits exactly the data at any time t i , that is to say that

$$ CR(t_i) = CR_{Data}(t_i), $$

if and only if the two conditions (5.4) and (5.6) are satisfied for each i = 0, 1, …, n + 1.
Remark 5.3. The above theorem means that the data are identifiable for the SI model if and only if the conditions (5.4) and (5.6) are satisfied. Moreover, in that case, we can find a unique sequence of transmission rates τ i ≥ 0 which gives a perfect fit to the data.

Numerical simulations
In this section, we propose a numerical method to fit the day-by-day rate of transmission. The goal is to take advantage of the monotone property of CR(t) with respect to τ i on the time interval [t i , t i+1 ]. Recently, more sophisticated methods were proposed by Bakhta et al. [20] by using several types of approximation methods for the rate of transmission.
We start with the simplest algorithm, algorithm 1, in order to show the difficulties in identifying the rate of transmission.
Step i: For each integer i = 1, …, n we consider the system

$$ S'(t) = -\tau S(t) I(t), \quad I'(t) = \tau S(t) I(t) - \nu I(t) \quad \text{and} \quad CR'(t) = \nu f I(t). $$

This system is supplemented by initial values S(t i ) and I(t i ) obtained from the previous iteration and with CR(t i ) = CR Data (t i ) obtained from the data. The map τ → CR(t i ) being monotone increasing, we can apply a bisection method to find the unique value τ i solving

$$ CR(t_i) = CR_{Data}(t_i). $$

In figure 6, we plot an example of such a perfect fit, which is the same for ν = 0.1 and ν = 0.2. In figure 7, we plot the rate of transmission obtained numerically for ν = 0.2 in (a) and ν = 0.1 in (b). This is an example of a negative rate of transmission. Figure 7 should be compared to figure 4 which gives a similar result.
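The bisection step can be sketched as follows. This is an illustrative reading of algorithm 1, not the authors' implementation: on each day the state (S, I) is carried over from the previous iteration, CR is reset to the data, and τ is found by bisection so that the model hits the next data point. The bracket [−5 × 10⁻⁹, 5 × 10⁻⁹], the RK4 integrator, the function names and the synthetic data are all assumptions of the example.

```python
def rk4_day(state, tau, nu, f, steps=20):
    """Advance (S, I, CR) by one day with a constant transmission rate tau."""
    def rhs(s, i, cr):
        return (-tau * s * i, tau * s * i - nu * i, nu * f * i)
    h = 1.0 / steps
    y = state
    for _ in range(steps):
        k1 = rhs(*y)
        k2 = rhs(*[a + h / 2 * b for a, b in zip(y, k1)])
        k3 = rhs(*[a + h / 2 * b for a, b in zip(y, k2)])
        k4 = rhs(*[a + h * b for a, b in zip(y, k3)])
        y = tuple(a + h / 6 * (p + 2 * q + 2 * r + w)
                  for a, p, q, r, w in zip(y, k1, k2, k3, k4))
    return y

def fit_tau(state, target_cr, nu, f, lo=-5e-9, hi=5e-9, iters=80):
    """Bisection on tau, using that CR after one day is increasing in tau."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rk4_day(state, mid, nu, f)[2] < target_cr:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def algorithm1(cr_data, s0, i0, nu, f):
    """Day-by-day fit; S and I are carried over from the previous iteration."""
    s, i = s0, i0
    taus = []
    for k in range(len(cr_data) - 1):
        state = (s, i, cr_data[k])
        tau = fit_tau(state, cr_data[k + 1], nu, f)
        s, i, _ = rk4_day(state, tau, nu, f)
        taus.append(tau)
    return taus

# Synthetic data produced by the same model with known daily rates (made-up values).
NU, F, S0, I0 = 0.2, 0.5, 1.4e9, 1000.0
true_taus = [4.0e-10, 3.5e-10, 3.0e-10, 2.5e-10, 2.0e-10]
state = (S0, I0, 100.0)
cr_data = [state[2]]
for tau in true_taus:
    state = rk4_day(state, tau, NU, F)
    cr_data.append(state[2])

estimated = algorithm1(cr_data, S0, I0, NU, F)
```

On noise-free data generated by the model itself the daily rates are recovered. On reported data, any mismatch between the carried-over I(t i ) and the local slope of the data propagates to the following days, which is the instability discussed in the text.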
In figures 8-10, we use algorithm 1 and we plot the rate of transmission obtained by using the reported cases of COVID-19 in China, where the parameters are fixed as f = 0.5 and ν = 0.2. In figures 8-10, we observe an oscillating rate of transmission which alternates between positive and negative values. These oscillations are due to the amplification of the error in the numerical method itself. In figure 8, we run the same simulation as in figure 9 but over a shorter period. In figure 8, we can see that the change in the slope of CR(t) at the points t = t i (the black dots), between two consecutive days, is amplified from one day to the next.
In figure 10, we first smooth the original cumulative data by using the Matlab function CR Data = smoothdata(CR Data ,'gaussian',50) to regularize the data and we apply algorithm 1. Unfortunately, smoothing the data does not help to solve the instability problem in figure 10.
We need to introduce a correction when choosing the next initial value I(t i ). In algorithm 1, the errors are due to the following relationship:

$$ CR'(t_i) = \nu f I(t_i), $$

which is not respected at the points t = t i and which should be reflected by the algorithm.

Figure 8. In (a), we plot the cumulative number of reported cases obtained from the data (black dots) and the model (blue curve). In (b), we plot the daily rate of transmission obtained by using algorithm 1. We see that we can fit the data perfectly. But the method is very unstable. We obtain a rate of transmission that oscillates from positive to negative values back and forth.
In figure 11, we smooth the data first by using the Matlab function CR Data = smoothdata(CR Data ,'gaussian',50), and we apply algorithm 2 by using an approximation of equation (6.6). In figure 11, we no longer observe the oscillations of the rate of transmission.

Figure 9. In (a), we plot the cumulative number of reported cases obtained from the data (black dots) and the model (blue curve) on a period six times longer than in figure 8. In (b), we plot the daily rate of transmission obtained by using algorithm 1. We see that we can fit the data perfectly. But the method is very unstable, as in figure 8. We obtain a rate of transmission that oscillates from positive to negative values back and forth.

Figure 10. We apply algorithm 1 to the regularized data. In (a), we plot the regularized cumulative number of reported cases obtained from the data (black dots) and the model (blue curve). In (b), we plot the daily rate of transmission obtained by using algorithm 1. We see that we can fit the data perfectly. But the method is very unstable. We obtain a rate of transmission that oscillates from positive to negative values back and forth.

Figure 11. In this figure, we plot the rate of transmission obtained by using the reported cases of COVID-19 in China with the parameters f = 0.5 and ν = 0.2. We first regularize the data by applying the Matlab function CR Data = smoothdata(CR Data ,'gaussian',50). Then we apply algorithm 2 to the regularized data. In (a), we plot the regularized cumulative number of reported cases obtained after smoothing (black dots) and the model (blue curve). In (b), we plot the daily rate of transmission obtained by using algorithm 2. We see that we can fit the data perfectly and this time the rate of transmission becomes reasonable.

Algorithm 2
We fix $S_0 = 1.4 \times 10^9$, ν = 0.1 or ν = 0.2 and f = 0.5. Then we fit the data by using the method described in §2 to estimate the parameters χ 1 , χ 2 and χ 3 from day 1 to 10. Then we use $S_0 = 1.40005 \times 10^9$.
For each integer i = 0, …, n, we consider the system

$$ S'(t) = -\tau S(t) I(t), \quad I'(t) = \tau S(t) I(t) - \nu I(t) \quad \text{and} \quad CR'(t) = \nu f I(t). $$

Then the map τ → CR(t i+1 ) being monotone increasing, we can apply a bisection method to find the unique τ i solving

$$ CR(t_{i+1}) = CR_{Data}(t_{i+1}). $$

The key idea of this new algorithm is the following correction on the I-component of the system. We start a new step by using the value S(t i ) obtained from the previous iteration and

$$ I_i = \frac{CR'_{Data}(t_i)}{\nu f} \qquad (6.6) $$

and

$$ CR_i = CR_{Data}(t_i). \qquad (6.7) $$

In figure 12, we plot several types of regularized cumulative data in (a) and several types of regularized daily data in (b). Among the different regularization methods, an important one is the Bernoulli-Verhulst best-fit approximation.
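The correction on the I-component can be sketched as follows. This is one possible reading of algorithm 2, not the authors' code: at the start of each day the value of I is reset from the slope of the data through CR' = νf I, with the derivative approximated by a centred difference (an assumption, since the approximation used in the article is not reproduced here). The regularized data are sampled from a Bernoulli-Verhulst curve with hypothetical parameters, and all function names are illustrative.

```python
import math

NU, F, S0 = 0.2, 0.5, 1.4e9                               # values quoted in the text
CHI2, THETA, CR_INF, CR_START = 0.7, 0.2, 8.0e4, 200.0    # hypothetical curve

def cr_smooth(t):
    """Bernoulli-Verhulst curve used here as regularized data."""
    e = math.exp(CHI2 * t)
    r = (CR_START / CR_INF) ** THETA
    return CR_START * e / (1.0 + r * (e ** THETA - 1.0)) ** (1.0 / THETA)

def one_day(state, tau, steps=20):
    """Advance (S, I, CR) by one day with constant tau (RK4)."""
    def rhs(s, i, cr):
        return (-tau * s * i, tau * s * i - NU * i, NU * F * i)
    h = 1.0 / steps
    y = state
    for _ in range(steps):
        k1 = rhs(*y)
        k2 = rhs(*[a + h / 2 * b for a, b in zip(y, k1)])
        k3 = rhs(*[a + h / 2 * b for a, b in zip(y, k2)])
        k4 = rhs(*[a + h * b for a, b in zip(y, k3)])
        y = tuple(a + h / 6 * (p + 2 * q + 2 * r + w)
                  for a, p, q, r, w in zip(y, k1, k2, k3, k4))
    return y

def bisect_tau(state, target_cr, lo=-5e-9, hi=5e-9, iters=80):
    """Find tau such that CR after one day equals target_cr."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if one_day(state, mid)[2] < target_cr:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def algorithm2(cr_data):
    """Day-by-day fit with I reset from the slope of the data on each day."""
    s = S0
    taus, fitted = [], []
    for k in range(1, len(cr_data) - 1):
        # Centred difference for CR'(t_k): an assumption of this sketch.
        i_k = (cr_data[k + 1] - cr_data[k - 1]) / (2.0 * NU * F)
        state = (s, i_k, cr_data[k])
        tau = bisect_tau(state, cr_data[k + 1])
        s, _, cr_next = one_day(state, tau)
        taus.append(tau)
        fitted.append(cr_next)
    return taus, fitted

data = [cr_smooth(float(d)) for d in range(41)]
taus, fitted = algorithm2(data)
```

Because I is recomputed from the data on each day, an error made on one day is not carried into the next, at the price of depending on how well the derivative of the (regularized) data is approximated.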
In figure 13, we plot the rate of transmission t → τ(t) obtained by using algorithm 2. We can see that the original data give a negative transmission rate while, at the other extreme, the Bernoulli-Verhulst regularization seems to give the most regular transmission rate. In figure 13a, we observe that we now recover almost perfectly the theoretical transmission rate obtained in §4. In figure 13b, the rolling weekly average regularization and in figure 13c the Gaussian weekly average regularization still vary a lot and in both cases, the transmission rate becomes negative after some time. In figure 13d, the original data give a transmission rate that is negative from the beginning. We conclude that it is crucial to find a 'good' regularization of the daily number of cases. So far the best regularization method is obtained by using the best fit of the Bernoulli-Verhulst model.
Remark 6.1. For each simulation in figure 13b,c, it is possible to obtain a transmission rate t → τ(t) that is non-negative for all time t by increasing sufficiently the parameter ν. Nevertheless, we do not present these simulations here because the corresponding values of ν needed to obtain a non-negative τ(t) are unrealistic.
In figure 14(a-d respectively), we plot the daily basic reproduction number corresponding to figure 13(a-d respectively). The red line corresponds to R 0 = 1. We see some complex behaviour in figure 14b,c,d, which is again unrealistic.

Figure 14. In this figure, we plot the daily basic reproduction number t → R 0 (t) = τ(t)S(t)/ν obtained by using algorithm 2 with the parameters f = 0.5 and ν = 0.2. We use the cumulative data obtained by using (a) the Bernoulli-Verhulst regularization, (b) the rolling weekly average regularization, (c) the Gaussian weekly average regularization and in (d) we use the original cumulative data.

Discussion
Estimating the parameters of an epidemiological model is always difficult and generally requires strong assumptions about their value and their consistency and constancy over time. Despite this, it is often shown that many sets of parameter values are compatible with a good fit of the observed data. The new approach developed in this article consists first of all in postulating a phenomenological model of growth of infectious, based on the very classic model of Verhulst, proposed in demography in 1838 [28]. Then, obtaining explicit formulae for important parameter values such as the transmission rate or the initial number of infected (or for lower and/or upper limits of these values) gives an estimate allowing an almost perfect reconstruction of the observed dynamics. The use of phenomenological models can also be regarded as a way of smoothing the data. Indeed, the errors concerning the observations of new infected cases are numerous: the census is rarely regular, and many countries report late the cases that occurred during the weekend and, at varying times, add data from specific counts, such as those from homes for the elderly; the number of cases observed is still underestimated, and the calculation of unreported new cases is always a difficult problem [21]; the raw data are sometimes reduced for medical reasons of poor diagnosis or lack of detection tools, or for reasons of domestic policy of states.
For all these causes of error, it is important to choose the appropriate smoothing method (moving average, spline, Gaussian kernel, auto-regression, generalized linear model, etc.). In this article, several methods were used and the one which allowed the model to perfectly match the smoothed data was retained.
In this article, we developed several methods to understand how to reconstruct the rate of transmission from the data. In §2, we reconsidered the method presented in [21] based on an exponential fit to the early data. The approach gives a first estimation of I 0 and τ 0 . In §3, we prove a result connecting the time-dependent cumulative reported data and the transmission rate. In §4, we compare the data to the Bernoulli-Verhulst model and we use this model as a phenomenological model. The Bernoulli-Verhulst model fits the data for mainland China very well. Next, by replacing the data by the solution of the Bernoulli-Verhulst model, we obtain an explicit formula for the transmission rate. We thus derive some conditions on the parameters for the applicability of the SI model to the data for mainland China. In §5, we discretized the rate of transmission and we observed that, given some daily cumulative data, we can get at most one perfect fit of the data. Therefore, in §6, we provide two algorithms to compute numerically the daily rates of transmission. Such numerical questions turn out to be a delicate problem. This problem was previously considered by another French group, Bakhta et al. [20]. Here we use some simple ideas to approximate the derivative of the cumulative reported cases, combined with some smoothing method applied to the data.
To conclude this article, we plot the daily basic reproduction number

$$ R_0(t) = \frac{\tau(t) S(t)}{\nu} $$

as a function of the time t and the parameters f or ν. The above simple formula for R 0 is not the real basic reproductive number in the sense of the number of newly infected produced by a single infectious. But this is a simple formula which gives a tendency about the growth or decay of the number of infectious. In figure 15a, the daily basic reproduction number is almost independent of f, while in figure 15b, R 0 (t) depends on ν mostly for small values of ν. The red curve on each surface in figure 15 corresponds to the turning point (i.e. the time t ≥ t 0 for which R 0 (t) = 1). We also see that the turning point does not depend much on these parameters.

Figure 15. In this figure we plot R 0 (t) = τ(t)S(t)/ν the daily basic reproduction number and we vary the parameter f (a) and ν (b).

Concerning contagious diseases, public health physicians are constantly facing four challenges. The first concerns the estimation of the average transmission rate. Until now, no explicit formula had been obtained in the case of the SIR model, according to the observed data of the epidemic, that is to say the number of reported cases of infected patients. Here, from realistic simplifying assumptions, a formula is provided (formula (4.5)), making it possible to accurately reconstruct theoretically the curve of the observed cumulative cases. The second challenge concerns the estimation of the mean duration of the infectious period for infected patients. As for the transmission rate, the same realistic assumptions make it possible to obtain an upper limit to this duration (inequality (4.8)), which makes it possible to better guide the individual quarantine measures decided by the authorities in charge of public health. This upper bound also makes it possible to obtain a lower bound for the percentage of unreported infected patients (inequality (4.8)), which gives an idea of the quality of the census of cases of infected patients, which is the third challenge faced by epidemiologists, specialists of contagious diseases. The fourth challenge is the estimation of the average transmission rate for each day of the infectious period (dependent on the distribution of the transmission over the 'ages' of infectivity), which will be the subject of further work and which poses formidable problems, in particular those related to the age (biological age or civil age) class of the patients concerned. Another interesting prospect is the extension of methods developed in the present paper to the contagious non-infectious diseases (i.e. without causal infectious agent), such as social contagious diseases, the best example being that of the pandemic linked to obesity [29][30][31], for which many concepts and modelling methods remain available.