Computational estimation from a statistical physics approach and its contributions to Covid-19 in Colombia

The objective of this article is to present a computational estimate from a statistical physics approach and its contributions to Covid-19 in Colombia. Based on the daily data on contagions, recoveries and deaths from March to July 2020, the behavior of the epidemic was estimated using non-linear regression with least-squares curve fitting. Given the benefits that this method offers in the study of physical phenomena, it was used in the present research to develop two types of models, exponential and Gaussian, with which some predictions were made. The coefficients of determination of the exponential model were 0.9641 for contagions, 0.9400 for recoveries and 0.9788 for deaths, and those of the Gaussian model were 0.9799 for contagions, 0.9606 for recoveries and 0.9894 for deaths, showing a good correlation between the models and the real behavior of the pandemic, with the Gaussian model being the closer fit. This was also evidenced by comparing the forecasts of both models with the actual data for the first 13 days of August, leading to the conclusion that the pandemic is beginning to mitigate and the curve is flattening out.


Introduction
Covid-19 is a disease that was previously unknown to mankind. Its understanding is based on the hard work of countless researchers [1]. Many of them have analyzed the epidemic from a statistical point of view using different models specialized in the prediction of infectious diseases, such as susceptible, exposed, infected and recovered (SEIR) [2] and susceptible, infected and recovered (SIR) [3]. However, these models can generate imprecise results due to the systematic variations in the prognosis curve and the complexity of the epidemic [2]. It is therefore extremely important for science and humanity to have more research using different methods, such as non-linear regression modeling [4].
Non-linear regression is a method that uses least-squares curve fitting [5]. This method emerged from the fields of astronomy and geodesy. The first scientists to contribute to it were Carl Friedrich Gauss, Adrien-Marie Legendre and Robert Adrain, around the turn of the nineteenth century [6]. It has been widely used in different areas of knowledge such as statistical mechanics, a discipline that was born in the nineteenth century with the contributions of Rudolf Clausius, James Clerk Maxwell and especially Ludwig Boltzmann [7]. These scientists established the basis of statistical physics, and their contributions remain important in recent research, such as that of Flórez and Laguado in computational fluid dynamics [8], that of Plaza in the modeling of physical and natural phenomena [9], and that of Vera, Delgado and Sepulveda in solar energy [10].
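As a brief illustration of the least-squares criterion mentioned above, the fitting problem can be written as follows. The symbols are assumptions introduced here for illustration and do not come from the original text: t_i is the day index, y_i the reported value on that day, N the number of days, f the chosen model, and \theta its vector of parameters.

```latex
S(\theta) = \sum_{i=1}^{N} \bigl( y_i - f(t_i;\theta) \bigr)^2 ,
\qquad
\hat{\theta} = \arg\min_{\theta} S(\theta)
```

The regression is called non-linear when f depends non-linearly on \theta, as is the case for exponential and Gaussian curves, so the minimum is generally found by iterative numerical methods rather than in closed form.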
Statistical mechanics serves as a link between a macroscopic world treated as continuous and a microscopic world of discrete nature [6]. The same holds for Covid-19, whose discrete components are the data reported each day, which describe behavior that can be modeled continuously over time [11]; this, in

Materials and methods
The research was carried out using a descriptive and applied methodology that can be seen in Figure 1.

Collecting data
Recent studies, such as those conducted by Diaz Pinzón [13] and by Verbel, Mejía, Manjarres and Troncoso [14], show that the pandemic does not have a defined behavior. Figure 2, for example, shows the behavior of the pandemic in Colombia. This figure was made with data on contagions, recoveries and deaths reported by the "Instituto Nacional de Salud" (INS) [15]. It shows that the curve for Colombia has a growing exponential behavior. Although the effects of the pandemic were initially mitigated, the curve has been steadily increasing, reaching levels of contagion and deaths similar to those of countries that initially did not take preventive measures and were greatly impacted by the pandemic. In those countries, drastic measures were taken due to the high rate of contagions and deaths, and a flattening of the curve was observed. In Colombia, by contrast, measures are being taken, but the curve continues to increase exponentially.

Modelling
Using non-linear regression with a computational tool, two types of mathematical models were chosen: exponential [16] and Gaussian [5]. The exponential model was chosen because of the behavior of the curve in Colombia, which does not yet show any kind of flattening, and the Gaussian model was chosen to estimate a flattening in the following months.
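The following is a minimal sketch of how the two model types could be expressed and how an exponential curve could be fitted by non-linear least squares. It rests on several assumptions that are not taken from the article: single-term model forms (the article refers to a number of terms without listing them here), a hand-written Gauss-Newton iteration in place of whatever tool the authors used, and synthetic noise-free data. The names exponential_model, gaussian_model and fit_exponential are hypothetical.

```python
import math


def exponential_model(t, a, b):
    """Single-term exponential curve a * exp(b * t) (assumed form)."""
    return a * math.exp(b * t)


def gaussian_model(t, a, b, c):
    """Single-term Gaussian curve a * exp(-((t - b) / c)^2) (assumed form)."""
    return a * math.exp(-((t - b) / c) ** 2)


def fit_exponential(t, y, iterations=20):
    """Least-squares fit of y ~ a * exp(b * t) by Gauss-Newton iteration."""
    n = len(t)
    # Initial guess from a straight-line fit to ln(y); requires every y > 0.
    ln_y = [math.log(v) for v in y]
    t_mean = sum(t) / n
    ln_mean = sum(ln_y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (li - ln_mean) for ti, li in zip(t, ln_y))
    b = sxy / sxx
    a = math.exp(ln_mean - b * t_mean)

    for _ in range(iterations):
        e = [math.exp(b * ti) for ti in t]
        r = [yi - a * ei for yi, ei in zip(y, e)]    # residuals
        ja = e                                       # d(model)/da
        jb = [a * ti * ei for ti, ei in zip(t, e)]   # d(model)/db
        # Normal equations (J^T J) delta = J^T r, solved in closed form (2x2).
        saa = sum(v * v for v in ja)
        sab = sum(u * v for u, v in zip(ja, jb))
        sbb = sum(v * v for v in jb)
        ga = sum(u * v for u, v in zip(ja, r))
        gb = sum(u * v for u, v in zip(jb, r))
        det = saa * sbb - sab * sab
        if det == 0:
            break
        da = (ga * sbb - gb * sab) / det
        db = (saa * gb - sab * ga) / det
        a += da
        b += db
        if abs(da) < 1e-12 and abs(db) < 1e-12:
            break
    return a, b


# Synthetic, noise-free data for illustration only (not the INS data).
days = list(range(30))
values = [exponential_model(d, 2.0, 0.05) for d in days]
a_fit, b_fit = fit_exponential(days, values)
```

A Gaussian curve has three parameters per term, so the same idea applies with a 3x3 system at each step; in practice a library routine for non-linear least squares would usually be preferred over a hand-written loop.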

Model analysis
The number of terms indicated was chosen to improve the fit of the curve and the resulting determination coefficient.
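As a sketch of how a determination coefficient can be computed, the function below uses the common definition R^2 = 1 - SS_res / SS_tot. Whether the authors' tool used exactly this definition is an assumption, and the name r_squared is hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    # Sum of squared differences between the data and the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Sum of squared differences between the data and their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

A value close to 1 means the model reproduces most of the variation in the data. Adding terms to a model usually raises this value on the fitted data, which by itself does not guarantee better forecasts.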

Prediction development
With the exponential model, predictions were made for the months of August and September, considering that the curve is close to the flattening point, and with the Gaussian model, predictions were made for the months of August through December. For both models, the error between the predicted and actual data was calculated for the first 13 days of August.
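The error calculation described above can be sketched as follows, assuming the error is a relative difference expressed as a percentage of the actual value. The article does not state its error formula in this section, so this definition and the names percent_error and mean_percent_error are assumptions.

```python
def percent_error(actual, predicted):
    """Relative error of a prediction, as a percentage of the actual value."""
    return abs(predicted - actual) / actual * 100.0


def mean_percent_error(actuals, predictions):
    """Average of the daily percentage errors over a comparison window."""
    errors = [percent_error(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)
```

Applied to the 13 daily values of actual and predicted accumulated data, mean_percent_error would give one average figure per model and per variable.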

Results and discussion
Below are the two types of models estimated from data collected between March 6, 2020, the day the first contagion occurred, and July 31, 2020. Figure 3 shows the exponential model for the contagion curve and Figure 4 shows the Gaussian model. The exponential model in Figure 3 is appropriate for predicting a possible increase in contagions until day 209, which is equivalent to September 30th, 2020. The same modeling was done with the recoveries and deaths data.
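The correspondence between day numbers and calendar dates can be sketched as below. The sketch assumes that day 1 is March 6, 2020, an assumption inferred from the statement that day 209 corresponds to September 30th, 2020. The names FIRST_CASE, day_to_date and date_to_day are hypothetical.

```python
from datetime import date, timedelta

# Assumption: day 1 of the series is the date of the first reported contagion.
FIRST_CASE = date(2020, 3, 6)


def day_to_date(day_number):
    """Calendar date for a 1-based day number of the series."""
    return FIRST_CASE + timedelta(days=day_number - 1)


def date_to_day(d):
    """1-based day number of the series for a calendar date."""
    return (d - FIRST_CASE).days + 1
```

Under this convention the fitted data span days 1 to 148, and the comparison window of the first 13 days of August covers days 149 to 161.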

Models
Two models were used for the forecasts. The exponential model predicts an increase in the number of contagions per day, a phenomenon that was evident in the real data, and, if the growth is not contained, it can produce more than 80000 contagions per day after September 30th, 2020. In contrast, the Gaussian model predicts a possible flattening of the curve between days 210 and 270, which is equivalent to the months of August and September 2020. Its peak does not reach 20000 contagions per day, and by December 31, 2020, it is expected to be below 300 contagions per day.
Table 1 shows the determination coefficients of each of the models. The determination coefficients showed a good correlation between the models and the real behavior of the pandemic, with the Gaussian model being the closer fit.
Table 2 shows the predictions of the exponential model for some days in August and September, and Table 3 shows the predictions of the Gaussian model for a possible flattening of the curve. These predictions refer to accumulated data per day: with both models, we calculated the accumulated contagions, recoveries and deaths per day, and these were the data compared with the real ones. It is observed that, after a possible flattening of the Gaussian curve, there will be fewer contagions and deaths by December 31st, 2020 than predicted by the exponential model for September 30th, 2020.
Table 4 shows the comparison between actual contagions and the models for the first 13 days of August. The errors of the Gaussian model were all below 1, while those of the exponential model were all above 1.5. Table 5 shows the comparison between actual recoveries and the models for the first 13 days of August.
In this table we can see that the errors rose above 5% for both models after August 4th, 2020. This suggests that the curve of recoveries per day does not follow a defined pattern but remains between an exponential and a Gaussian behavior. For example, for August 13, 2020, the exponential model predicted almost 300000 recoveries, whereas the Gaussian model predicted just over 200000, and the actual recoveries were 250971, that is, roughly midway between the two models.
Table 6 shows the comparison between actual deaths and the models for the first 13 days of August. This table shows that the deaths curve followed the Gaussian model more closely, since the errors were greater in the exponential model. Table 7 shows the average errors between the models and the real data for contagions, recoveries and deaths during the first 13 days of August. The mean error between the real data and the forecast is lower for the Gaussian model in contagions and deaths, and lower for the exponential model in recoveries.

Conclusions
Approaching the research from the perspective of statistical physics was successful, because the behavior of Covid-19 can be modeled as a macro system, continuous in time, that depends on discrete elements reported day by day. The use of technological instruments in the statistical analysis and the prediction methods facilitates the estimation of the models, making the process more dynamic and shortening the time needed to reach conclusions that can benefit decision making. Differences that may arise between prognostic curves and actual data relate to the nation's response to the pandemic, the presence of asymptomatic patients, prevention measures, data collection capacity, processing times, and follow-up of infected patients, but the results presented here are nonetheless relevant to the analysis of the pandemic. The determination coefficients showed a good correlation between the models and the pandemic, with the Gaussian model being the most accurate, allowing us to infer that the pandemic will begin to show mitigation from day 200, which is equivalent to the second week of September 2020.
The behavior of the pandemic depends on the actions taken by the authorities to mitigate the contagion and on citizens' compliance with these actions. The exponential curve shows that, if the current growth is not mitigated, we could have 2390798 contagions and more than 68000 deaths by September 30th, 2020. If the pandemic follows Gaussian behavior, more than 2 million contagions and about 56800 deaths are expected by December 31st, 2020. These numbers are much lower than those expected from the exponential curve, but they are still worrying figures that should lead the country to strengthen prevention measures.