Estimative of real number of infections by COVID-19 in Brazil and possible scenarios

This paper attempts to provide methods to estimate the real scenario of the novel coronavirus pandemic in Brazil, specifically in the states of Sao Paulo, Pernambuco, Espirito Santo, Amazonas and the Federal District. Using a SEIRD mathematical model with age division, we predict the infection and death curves and state the peak date for Brazil and for the above states. We also predict the ICU demand in these states and how severe a possible collapse of the local health system would be. Finally, we establish some future scenarios, including the relaxation of social isolation and the introduction of vaccines and other efficient therapeutic treatments against the virus.


Introduction
In December 2019, the city of Wuhan in mainland China started experiencing an outbreak of unknown pneumonia cases. Later, the cause of this outbreak was identified as a virus belonging to the Orthocoronavirinae subfamily and the Betacoronavirus genus (Cui et al., 2019), similar to the SARS-CoV virus that caused the SARS crisis in 2003 (Andersen et al., 2020). That similarity suggested the name SARS-CoV-2 for the novel coronavirus, and COVID-19 (Coronavirus Disease 2019) for the disease.
The virus quickly spread to other countries by the end of February and was declared a pandemic by the World Health Organization (WHO) on the 11th of March, classified as a high-risk threat to the world's population (WHO, 2020). Since then, several mathematical models have been used to predict the dynamics of the pandemic in different countries. One of the most influential of those models was developed by Imperial College London (Ferguson et al., 2020).
In Brazil, the first registered case dates back to February 25th, but in this study we present evidence that the infection might have started 19–24 days before the official record. We then proceed to simulate the crisis in specific states and attempt to estimate the real scale of the outbreak, predicting when the infection peak might occur as well as the curve for ICU demand. Finally, we present some future scenarios based on how halting the intervention might affect the curve. We also explore how the introduction of vaccines or available medication might change the infection curve, since several studies are being carried out to evaluate the possible use of pharmaceutical drugs to cure the disease (McCreary & Pogue, 2020), (Negri et al., 2020) and (Tang et al., 2020).

Description of the model
We make use of a SEIRD model, dividing the population into 5 groups: Susceptible, Exposed, Infected, Recovered and Dead. The exposed population differs from the infected population in the development of symptoms: an individual with the virus first enters the exposed group, carrying the virus during its incubation period; then, after the incubation period, the individual passes to the infected group. The rate of infection is proportional to the number of infected and to a contact constant b, which is given by the average number of contacts between individuals multiplied by the probability of contracting the virus during each contact. The rate at which symptoms develop, c, is the inverse of the incubation period c^{-1}. The recovery rate g is given by the fraction of people who recover divided by the average time from the onset of symptoms to recovery, and the death rate m is defined similarly. Another consideration is that people in the exposed group might infect susceptible people with an infection rate k which is a small fraction of b, that is, k = P_exp b, where P_exp determines the fraction of infections caused by exposed individuals.
The following diagram represents the dynamics of these populations (Fig. 1). The model is represented by a set of differential equations, (1) to (5), in which the recovery rate g and death rate m are written in terms of the Infection Fatality Rate (IFR) P_IFR and the average times from the onset of symptoms to recovery, t_r, and to death, t_d.
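As a sketch, a system consistent with the description above and with the equation numbers (1) to (5) referenced later in the text can be written as follows. The normalization by N and the explicit expressions for g and m are assumptions inferred from the wording, not a transcription of the published equations.

```latex
\begin{align}
\frac{dS}{dt} &= -\frac{b\,I(t)\,S(t)}{N} - \frac{k\,E(t)\,S(t)}{N} \tag{1}\\
\frac{dE}{dt} &= \frac{b\,I(t)\,S(t)}{N} + \frac{k\,E(t)\,S(t)}{N} - c\,E(t) \tag{2}\\
\frac{dI}{dt} &= c\,E(t) - g\,I(t) - m\,I(t) \tag{3}\\
\frac{dR}{dt} &= g\,I(t) \tag{4}\\
\frac{dD}{dt} &= m\,I(t) \tag{5}
\end{align}
% Assumed parametrization of the rates:
\[
g = \frac{1 - P_{IFR}}{t_r}, \qquad m = \frac{P_{IFR}}{t_d}, \qquad k = P_{exp}\, b
\]
```

Summing the five right-hand sides gives zero, which is the conservation of the total population N that the model relies on.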
Fig. 1. Representation of a SEIRD model: a susceptible person gets exposed to the virus, becomes infected afterwards and either dies or recovers from the disease.
H.P.C. Cintra, F.N. Fontinele, Infectious Disease Modelling 5 (2020) 720–736
All equations described conserve the total population N, which is assumed constant and homogeneous for the model to be valid. This, of course, is a limitation of the model, since in reality N is not homogeneous. Therefore, N here plays the role of the effective population, that is, the population which the virus might reach during an interval of some months. Estimating the real N is not an easy task, and in the next sections we discuss how we decided to estimate this number.
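To make the conservation of N concrete, the following is a minimal sketch of a SEIRD integration with an explicit Euler scheme. The function names, the step size and every numerical value are illustrative assumptions; they are not the fitted parameters or the integration scheme of the study.

```python
def seird_step(state, rates, dt):
    """Advance (S, E, I, R, D) by one explicit Euler step of length dt (days)."""
    S, E, I, R, D = state
    b, k, c, g, m = rates
    N = S + E + I + R + D              # total population, conserved by the model
    new_exposed = (b * I + k * E) * S / N
    dS = -new_exposed
    dE = new_exposed - c * E
    dI = c * E - (g + m) * I
    dR = g * I
    dD = m * I
    return (S + dt * dS, E + dt * dE, I + dt * dI, R + dt * dR, D + dt * dD)


def simulate(state, rates, days, dt=0.05):
    """Return the list of states visited while integrating for `days` days."""
    history = [state]
    for _ in range(int(round(days / dt))):
        state = seird_step(state, rates, dt)
        history.append(state)
    return history


# Illustrative values only (assumptions, not the fitted parameters of the paper).
b = 0.45                  # contact constant, per day
k = 0.44 * b              # exposed individuals infect at a fraction P_exp of b
c = 1.0 / 5.0             # inverse of an assumed 5-day incubation period
P_IFR, t_r, t_d = 0.007, 14.0, 18.0
g = (1.0 - P_IFR) / t_r   # recovery rate
m = P_IFR / t_d           # death rate

N0 = 1_000_000.0          # assumed effective population
history = simulate((N0 - 1.0, 0.0, 1.0, 0.0, 0.0), (b, k, c, g, m), days=200)
peak_infected = max(s[2] for s in history)
print(f"peak infected: {peak_infected:.0f}, deaths: {history[-1][4]:.0f}")
```

Because the five increments cancel term by term, the sum S + E + I + R + D stays equal to the initial N up to floating-point rounding, whatever value is chosen for the effective population.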
We then divided the population into age groups to better describe how these rates vary from group to group. With that, we changed the model in accordance with what was proposed by (Rocha Filho et al., 2020). In the modified equations, M is the number of age groups, C_ij is the social contact matrix, representing the average number of contacts between a member of the i-th group and members of the j-th group, and P_Inf is the probability of being infected at each contact.
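A short sketch of how a contact matrix can enter the infection term is given below. The specific form, lambda_i = P_Inf * sum_j C_ij * I_j / N_j, is an assumption made for illustration; the normalization used in the study, following (Rocha Filho et al., 2020), may differ. The matrix entries and group sizes are hypothetical.

```python
def force_of_infection(contact_matrix, infected, group_sizes, p_inf):
    """Per-capita infection rate for each age group.

    Assumed form: lambda_i = P_Inf * sum_j C_ij * I_j / N_j.
    """
    groups = range(len(group_sizes))
    return [
        p_inf * sum(contact_matrix[i][j] * infected[j] / group_sizes[j] for j in groups)
        for i in groups
    ]


# Hypothetical two-group example; the matrix entries are placeholders,
# not the contact matrix used in the paper.
C = [[2.0, 1.0],
     [1.0, 3.0]]
I = [10.0, 0.0]        # infected individuals per group
N = [1000.0, 500.0]    # group sizes
lam = force_of_infection(C, I, N, p_inf=0.14)
print(lam)
```

Each susceptible group i then loses individuals at rate lam[i] * S_i, so a group with no infections of its own can still be infected through its contacts with other groups.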
With these definitions, we represent non-pharmaceutical interventions such as social isolation and lockdown by a decrease of b given by a logistic function. Here, b_i is the infection rate before the intervention, t_c is the time when the intervention starts, P_d is the fraction of reduction in the infection rate achieved, and τ is a constant related to the time taken from the start of the intervention until P_d is reached.
When simulating the curves for infections and deaths in Brazil and in the states of Pernambuco, Espirito Santo, Sao Paulo, Amazonas and the Federal District, we used the model described above. When simulating the ICU demand, however, we do not apply the age division, for lack of specific data for each age group. Instead, we apply the simple SEIRD model with b extracted from the fitting of the data of each state and with P_IFR, t_r and t_d appropriate for COVID-19 patients in the ICU.
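One logistic form that matches the verbal description of the intervention is sketched below: the infection rate starts at b_i, falls by a fraction P_d, and the transition is centered at t_c with a time constant (called `tau` here). The exact functional form is an assumption, and the numbers in the example are illustrative.

```python
import math


def infection_rate(t, b_i, t_c, p_d, tau):
    """Assumed form: b(t) = b_i * (1 - P_d / (1 + exp(-(t - t_c) / tau)))."""
    x = (t - t_c) / tau
    # Numerically stable logistic: avoids overflow for large |x|.
    if x >= 0:
        sigmoid = 1.0 / (1.0 + math.exp(-x))
    else:
        e = math.exp(x)
        sigmoid = e / (1.0 + e)
    return b_i * (1.0 - p_d * sigmoid)


# Illustrative values: 50% reduction centered 27 days after the first case.
for day in (0, 27, 60):
    print(day, round(infection_rate(day, b_i=0.45, t_c=27.0, p_d=0.5, tau=3.0), 4))
```

Long before t_c the function returns b_i, at t_c it returns b_i * (1 - P_d / 2), and long after t_c it settles at b_i * (1 - P_d).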

Number of hospitalizations by SARS
According to (Salje et al., 2020), 3.6% of COVID-19 infections are severe and require hospitalization, of which 30% are critical and require an ICU unit. Some studies found a hospitalization rate of around 14% (Wu & McGoogan, 2020). Yet another study found a similar percentage, stating that 19% of infections resulted in hospitalization (COVID & Team, 2020). However, these studies calculate these fractions from the registered cases, which are underreported in many regions.
With the emergence of the novel coronavirus, the number of hospitalizations by SARS per week increased compared to the years 2019, 2018 and 2017. It is important to state that SARS hospitalizations here should not be misunderstood as caused by the SARS-CoV virus, responsible for the SARS epidemic in 2002 (Marra et al., 2003). In Brazil, the term SARS is also used to describe severe acute respiratory infection, independent of the etiological agent. Using the number of hospitalizations by SARS during these years, we build a value for the background behavior, that is, the expected number of hospitalizations due to other respiratory diseases (Fig. 2). The weekly number reported by the Health Ministry is subject to later revision, because results released in the following weeks may refer to hospitalizations from earlier weeks. For example, by the end of the 6th week of 2018, the official report estimated around 50 hospitalizations, but this number was later corrected to be close to 200. Because of this uncertainty in the most recent data, we use the values available from four weeks before the most recent report (Fig. 3).
Fig. 3 shows an increase in hospitalizations by SARS in Brazil. Due to the definition of SARS infection used by Brazil's Health Ministry, most hospitalizations by COVID-19 are diagnosed as SARS. Therefore, we assume that the large increase in SARS hospitalizations in comparison to the background is mainly caused by COVID-19. Comparing 2018 and 2019, the latter presented an average increase (during the epidemiological weeks considered) of 10% in hospitalizations by other causes (influenza, etc.). We assume the same increase could be found from 2019 to 2020; thus, we consider COVID-19 to be responsible for 90% of the increase.
We observed that by the 6th week of 2020, the number of hospitalizations by SARS was 121 above the upper error bar of the background value and 106 above the value for 2019, an increase of 31%, which points to the likely existence of COVID-19 hospitalizations. According to a study performed on COVID-19 patients in Shanghai, hospitalization occurs on average 4 days after symptom onset, ranging from 2 to 7 days. That study, together with the increase of SARS hospitalizations by the 6th week of 2020, suggests the possible existence of COVID-19 cases in Brazil between February 1st and February 6th, 19–24 days before the official record of the first case on the 25th of February.
Following the increase of hospitalizations, by the end of the 13th epidemiological week of 2020 (March 28, 2020), the number of hospitalizations by SARS in Brazil was already 12260, while the background's upper error bar reaches a value of merely 1028, and only 1123 were registered in 2019. From our assumption, 90% of the excess hospitalizations are attributed to COVID-19, resulting in 10023–10108 hospitalizations by infections of the SARS-CoV-2 virus, which translates into 278416 to 280777 infections between March 21, 2020 and March 26, 2020 (according to the average time taken to be hospitalized). Comparing these estimates with the official numbers reported through this period, we find a real number of infections 90 to 200 times larger than the official number (125 times, using the average). That represents a loss of 99.2% (99.0–99.5%) of actual infections. By comparison, a study done in China found that 86% of infections were undocumented prior to January 23rd (Li et al., 2020a).
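The arithmetic behind these estimates can be retraced in a few lines. The sketch below uses only the figures quoted in the text; truncating to whole numbers is an assumption about the rounding, chosen because it reproduces the quoted values, and the variable names are illustrative.

```python
import math

sars_hosp_2020 = 12260     # SARS hospitalizations by the end of week 13 of 2020
background_upper = 1028    # upper error bar of the background value
sars_hosp_2019 = 1123      # hospitalizations registered in 2019
covid_share = 0.90         # share of the excess attributed to COVID-19
hosp_rate = 0.036          # fraction of infections hospitalized (Salje et al., 2020)

# Excess hospitalizations attributed to COVID-19 (assumed truncation to integers).
covid_hosp_low = math.floor(covid_share * (sars_hosp_2020 - sars_hosp_2019))
covid_hosp_high = math.floor(covid_share * (sars_hosp_2020 - background_upper))

# Infections implied by the hospitalization rate.
infections_low = math.floor(covid_hosp_low / hosp_rate)
infections_high = math.floor(covid_hosp_high / hosp_rate)


def undocumented_fraction(ratio):
    """Fraction of infections missed when real infections are `ratio` times the official count."""
    return 1.0 - 1.0 / ratio


print(covid_hosp_low, covid_hosp_high)        # 10023 10108
print(infections_low, infections_high)        # 278416 280777
print(f"{undocumented_fraction(125):.1%}")    # 99.2%
```

The same conversion, 1 - 1/ratio, links any multiplier between real and official infections to the percentage of lost cases: a factor of 200 corresponds to 99.5%.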

Number of tests performed
A study by Imperial College London estimated the number of infections in 11 European countries until March 28th, based on the basic reproduction number of the disease, found to be between 2 and 3 (Zhang et al., 2020), (Zhao et al., 2020) and (Read et al., 2020), and on the type of non-pharmaceutical intervention adopted by the countries on specific dates (Flaxman et al., 2020). With these estimates, we may find the percentage of lost cases, that is, infections not documented, in these countries until the 28th of March by comparing the estimated number of people infected with the official data available on that date. Comparing these percentages with the total number of tests done per 1000 inhabitants and with the number of tests done per day per 1000 inhabitants, we found a linear relation between each of these two testing measures and the percentage of lost cases (Figs. 4 and 5).
The number of points on each graph is different because, although the study considered 11 countries, not all of them had data on tests per day available at (Max Roser & Ortiz-Ospina, 2020). The correlation between the number of tests per day and the fraction of undocumented infections is -0.90, and the correlation between the total number of tests and the fraction of undocumented infections is -0.79. An F-test applied to the data of both relations rejected the null hypothesis that the variation in testing and the variation in underreporting are uncorrelated (p < 0.05). We also compared the undocumented cases with the progression of the outbreak in each country and with the day on which the non-pharmaceutical interventions were imposed, but found no correlation. We evaluated the effect of the rate of increase of testing as well, but it had no observable effect. From this comparison, a country needs to perform 4 (0.94–17) tests per day per 1000 inhabitants in order to obtain an excellent track of infections. Here, the large margin toward higher values of testing arises from the low density of data points at the larger values of the x-axis in Fig. 4. The last official registration of the total number of tests done per 1000 inhabitants in Brazil was 3.46, which corresponds to 97% of cases being lost (89–99.2%). However, the highest value released on the closest date to the 13th epidemiological week reports 1.37 tests per 1000 inhabitants, corresponding to 98.8% of infections being undocumented (95.6–99.7%). A more precise number could be achieved with data on tests per day per 1000 inhabitants, allowing a 2-dimensional regression. Unfortunately, we found no record of this information. Still, when fitting the data with a 2-dimensional regression algorithm, the resulting function states that the most important factor controlling the uncertainty of cases is the number of tests per day per 1000 inhabitants.
That can also be seen by looking at the graphs individually: the total number of tests performed per 1000 inhabitants decreases the percentage of undocumented infections at a much lower rate than the number of tests per day per 1000 inhabitants does.
Both methods found a region of agreement (99.5%–99.7%) for the undocumented infections in Brazil. Since both methods match closely, we decided to accept the estimate for undocumented infections in Brazil and moved on to the simulations of the infection and death curves of the country and of some of its specific regions.
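The kind of analysis described above, a correlation coefficient plus a straight-line fit of lost cases against testing, can be sketched as follows. The data points are synthetic placeholders chosen to lie exactly on a line; they are not the country data behind Figs. 4 and 5, and the function names are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


# Synthetic values for illustration only (NOT data from the study).
tests_per_day_per_1000 = [0.1, 0.2, 0.4, 0.8]
lost_cases_percent = [99.0, 98.0, 96.0, 92.0]

r = pearson_r(tests_per_day_per_1000, lost_cases_percent)
slope, intercept = fit_line(tests_per_day_per_1000, lost_cases_percent)
print(round(r, 3), round(slope, 3), round(intercept, 3))
```

With real data the points scatter around the line, so the fitted slope and intercept carry confidence intervals; that scatter is where wide margins such as 0.94–17 tests per day come from.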

Simulations
For the simulation of the whole of Brazil, we used the World Population Prospects from the United Nations (UN) to evaluate the age distribution in Brazil in 2020 (United Nations & Affairs, 2019). This distribution is used when simulating the expected scenario for the whole country; for the more specific age distributions of each state, we acquired data from the Brazilian Institute of Geography and Statistics (IBGE) census, mentioned in the following sections regarding each state. We found no study measuring the social contact matrix for the country, but (Deps et al., 2006) identified the high levels of social contact in Brazil as an important factor in the spreading of leprosy. Therefore, given the Brazilian culture of proximity, we decided to use the social contact matrix with the highest entries among those available (Poland).
For the values of g and m, we chose to use the ones found in the data of South Korea, Germany, Iceland and Taiwan, since these countries are performing more tests per 1000 inhabitants than Brazil, making their data more reliable (Figs. 6 and 7). For each country, we acquired the average values of t_d and t_r, knowing the CFR.
Data from Taiwan presented large fluctuations in the behavior of m and g, even with an almost constant Case Fatality Rate (CFR) of 1.3% ± 0.2%, making the values of t_d and t_r inconclusive. That might be explained by the early intervention made by the local government, which drastically changed the values of the parameters. Clinical studies performed on Wuhan patients found t_d to be on average 18 days (6–32) (Ruan et al., 2020), and 20 days (17–24).
When fitting the data of those countries with the model to extract b (Table 1, Table 2), we took into consideration in the simulations the non-pharmaceutical intervention in each country in order to better describe b. The value of b was used to set a reference to compare with the ones found with the fitting of data from each state.
For the incubation period c^{-1}, we took an average of the values found in previous studies (Table 3).
The value for k was set to 44% of b, based on the finding that presymptomatic cases were responsible for 44% of infections (He et al., 2020). The parameter P_IFR for each age group was set by re-scaling the international average of the case fatality rate (CFR) (WorldMeters, 2020) with the estimated infection fatality rate of 0.7% (Salje et al., 2020), ranging from 0.001% for those younger than 20 years old to 10.1% for those older than 80 years old (Table 4), while P_survival = 1 - P_IFR.
To simulate the ICU population, we added the hospitalized population H to the set of differential equations (1)–(5). This compartment is introduced by removing individuals from (3) with rate P_h/t_h, where P_h is the fraction of infections that are critical and require ICU units, and t_h is the average time from symptom onset to admission to the ICU. Inside the H compartment, individuals are removed to the death compartment with rate m_h = P_dh/t_dh, where P_dh is the probability of dying upon ICU entry and t_dh is the average time from ICU admission to death. Similarly, individuals are removed to the recovered group with the analogous rate g_h = (1 - P_dh)/t_rh. The result is the following modification in equations (3) to (5), shown here for the infected compartment:
$$\frac{dI}{dt} = c\,E(t) - (1 - P_h)\,g\,I(t) - (1 - P_h)\,m\,I(t) - \frac{P_h}{t_h}\,I(t)$$
Table 5 contains the parameters for the simulation of the ICU population.
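A sketch of the right-hand side of the model with the ICU compartment H is given below. The equation for the infected group follows the text; the equations for H, R and D are assumptions written so that the flows described above balance and the total population is conserved. All numerical values except the critical fraction (3.6% hospitalized times 30% critical) are placeholders, not the entries of Table 5.

```python
def seird_icu_derivatives(state, p):
    """Time derivatives of (S, E, I, H, R, D) for a SEIRD model with an ICU compartment."""
    S, E, I, H, R, D = state
    N = S + E + I + H + R + D
    new_exposed = (p["b"] * I + p["k"] * E) * S / N
    to_icu = p["P_h"] / p["t_h"] * I           # flow from infected into the ICU
    g_h = (1.0 - p["P_dh"]) / p["t_rh"]        # recovery rate inside the ICU
    m_h = p["P_dh"] / p["t_dh"]                # death rate inside the ICU
    dS = -new_exposed
    dE = new_exposed - p["c"] * E
    dI = p["c"] * E - (1.0 - p["P_h"]) * (p["g"] + p["m"]) * I - to_icu
    dH = to_icu - (g_h + m_h) * H              # assumed balance for H
    dR = (1.0 - p["P_h"]) * p["g"] * I + g_h * H
    dD = (1.0 - p["P_h"]) * p["m"] * I + m_h * H
    return (dS, dE, dI, dH, dR, dD)


# Placeholder parameters for illustration (assumptions, not Table 5).
params = {
    "b": 0.45, "k": 0.44 * 0.45, "c": 0.2,
    "g": (1.0 - 0.007) / 14.0, "m": 0.007 / 18.0,
    "P_h": 0.036 * 0.30,   # critical fraction: 3.6% hospitalized, 30% of those critical
    "t_h": 7.0, "P_dh": 0.5, "t_dh": 8.0, "t_rh": 10.0,
}
derivs = seird_icu_derivatives((9.0e5, 5.0e4, 4.0e4, 2.0e3, 7.0e3, 1.0e3), params)
print(derivs)
```

The ICU demand curve is then the time series of H, which can be compared directly with the number of available ICU beds in each state.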

Results
In the simulation for the whole country, we considered N to be 5% of the total population, based on the international behavior of the total number of infections in other countries (Salje et al., 2020). We also selected P_Inf = 14% according to (Rocha Filho et al., 2020).
In order to add the effect of the use of masks by a large number of individuals to the simulation, we use a logistic function to decrease the value of P_Inf by 50%, based on (MacIntyre et al., 2011); the slope of the decreasing region was set to be 10 times slower than the one simulated for social distancing. We also chose P_d = 0.5, taking the national average for the population in social isolation (inloco, 2020).
The curve (Fig. 8) shows good agreement with the values estimated from the number of SARS hospitalizations in the last weeks of March, shown by the + mark on the graph. We also predict that the peak of the infection curve in Brazil should occur 100 days after the first case, which we considered to be at the beginning of February. Therefore, the peak should be in the middle to end of May with 2.4 million infections, ranging from 2.2 to 2.7 million. The number of deaths is estimated to be around 126 thousand, ranging from 114 to 139 thousand. By the end of the first wave, we estimate 8 million infections, ranging from 6.4 to 9.6 million.
The shaded areas represent a 10% deviation from the simulated curve. The value of the deviation was chosen as a reflection of the uncertainty in the value for the effective population N.

Pernambuco
Online data available from the local government (CIEVSPE, 2020) report a total of 0.84 tests per 1000 inhabitants and an average of 0.05 tests per day per 1000 inhabitants, meaning that more than 90% of infections are going undocumented. For the simulation, we acquired data regarding the age and geographical distribution of the population from the last IBGE census (IBGE, 2017a, 2017b, 2017c, 2017d). The official record of the first case dates to the 12th of March; however, data from (CIEVSPE, 2020) now show the ICU entry of a 71-year-old man in the capital of the state, Recife, diagnosed with the SARS-CoV-2 virus before this date. The patient started having symptoms on March 1st. We chose to set this date as the starting point of the simulation. According to (inloco, 2020), the isolation index, which measures the fraction of the population in social isolation, is on average 50%.
The simulation shows a peak close to the 50th day, in the beginning of May, with 8000 infections, ranging from 6000 to 10000 cases. The estimated number of deaths is 1400 (1167–1680). Here we increased the margin of error to 20%, to represent a larger uncertainty in N at specific locations (Fig. 9). Despite the large number of cases lost, when fitting the data with a simulated curve, the value of b is 0.460 ± 0.050, which agrees with the international standards. That indicates good tracking of the rate of change of the infection curve in Pernambuco. The state might not have the precise values of the real infections, but it has a good knowledge of their growth. This is an important feature, since it allows the state to say that its data might represent the real scenario on a smaller scale.
The state of Pernambuco has a total of 1315 ICU beds according to a census carried out by the Brazilian Association of Intensive Medicine (AMIB) in 2016 (A. de Medicina Intensiva Brasileira, 2016). However, recent news reports point to 80% of these beds already being occupied, bringing the available number of ICU beds to 263.
From Fig. 10 we expect an ICU demand higher than the maximum capacity of Pernambuco; however, the capacity may be increased with the construction of field hospitals.

Espirito Santo
In Espirito Santo, the online data provided by the government report a total of 6.70 tests performed per 1000 inhabitants, placing the percentage of undocumented cases close to 88% (78–98%). There are also 161 ICU units available for COVID-19 cases (G. do Estado do Espírito Santo, 2020). The population data for the simulations was retrieved from a local census done by IBGE (IBGE, 2017a, 2017b, 2017c, 2017d). The isolation index is on average 45% (inloco, 2020).
Like Pernambuco, the fitting of the Espirito Santo data reveals a good agreement of b with international parameters, b = 0.436 ± 0.199. However, the large error margin of the data lowers the confidence in it.
Unlike for Pernambuco, we found no record of hospitalizations due to COVID-19 prior to the first case announced on March 6th. Therefore, we chose the official day as the starting point of the disease. The first infectious individual was in the 30–39 years old age group.
The peak in Espirito Santo is close to 70 days after the start, close to May 15th, with a maximum number of infections of around 40000 (32000–48000), as shown in Fig. 11. The number of deaths is estimated at 700 (560–840).
Regarding the ICU demand we expect minor or no issues for the state, according to the current levels of social isolation (Fig. 12).

Federal District
Recent data from the government report 20716 tests, or 6.8 tests per 1000 inhabitants, meaning the state most likely has 86% (78.5–94.2%) of its infections undocumented. Unfortunately, no record of tests per day was found, so better accuracy on lost cases was not possible. The first registration of COVID-19 in the region is from the 5th of March, with non-pharmaceutical interventions starting on the 10th of March (S. de Saúde do Distríto Federal, 2020).
As in the previous states, the IBGE census was used to extract the population's distribution (IBGE, 2017a, 2017b, 2017c, 2017d). The fit of the data with the simulations returns an efficiency of 88% for social isolation, but b and t_d are outside the margin of acceptance, indicating that the state is not efficiently tracking the rate of increase of deaths and cases, which possibly invalidates the estimated efficiency of the social isolation. The isolation index according to (inloco, 2020) is on average 50%.
The simulation shows that the Federal District is currently at its highest number of infections, around 10000 (8000–12000). The maximum number of deaths is projected to reach 190 (158–228). Also, with the current number of infections, the Federal District is losing 89% of its cases (87–91%), in agreement with the margin estimated from the number of tests performed (Fig. 13).
From the AMIB census, the state possesses 659 ICU beds. We assume an occupation of 80% before the disease reached the state.
According to the simulation for the Federal District, no hospitalization issues are expected at the current social distancing level (Fig. 14).

Sao Paulo
The state of Sao Paulo also provided online data gathered by the government (G. do Estado de São Paulo, 2020). The first notified infection dates from the 26th of February. Studies done with cellphone data from Sao Paulo inhabitants found that on average 53.6% ± 3.4% of the population is respecting the social isolation imposed by the local government on the 24th of March (G. do Estado de São Paulo, 2020).
When fitting the data with the model, considering a non-pharmaceutical intervention starting 27 days after the first case, we found an efficiency of 58.3% ± 7% for the social isolation measures, in agreement with that study. We also found b = 0.454 ± 0.52.
Unfortunately, the government did not display data on infections, but with such a high mortality rate, around 8%, the number of infections is probably 4 times larger than the official number (meaning 75% of infections are undocumented), assuming that the number of deaths is in good agreement with the real scenario. However, given the behavior of the previous states and the general scenario of Brazil, it is most likely that Sao Paulo finds itself in a 90% loss scenario. The state has its peak projected to be around the 70th day of infection, or close to the 7th of May (Figs. 15 and 16). The peak number of infections should be 260000 (208000–312000). For the number of deaths, the estimate is close to 6500 (5200–7800).
From the AMIB census, the state of Sao Paulo has a total of 7312 ICU beds, and recent news reports point to 53% of them already being occupied, leaving around 3400 ICU beds available for COVID-19 treatment. Fig. 17 predicts a long period of hospitalization problems for the state of Sao Paulo, with a peak demand for ICU units twice as high as the current capacity.
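The available ICU capacity quoted for each state follows from the census totals and the reported occupancy. A minimal sketch of that calculation, using only figures stated in the text (the function name is illustrative):

```python
def available_beds(total_beds, occupied_fraction):
    """ICU beds left for COVID-19 patients given the pre-outbreak occupancy."""
    return total_beds * (1.0 - occupied_fraction)


# Totals from the AMIB census and occupancy levels quoted in the text.
pernambuco = round(available_beds(1315, 0.80))        # 263
federal_district = round(available_beds(659, 0.80))   # 132
sao_paulo = round(available_beds(7312, 0.53))         # 3437, i.e. around 3400

print(pernambuco, federal_district, sao_paulo)
```

Comparing these numbers with the simulated peak of the ICU population indicates whether a state is expected to exceed its capacity; for Sao Paulo, a peak demand of twice the capacity would correspond to roughly 6900 critical patients at the same time.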

Amazonas
For Amazonas, the fitting of data acquired from the Health Ministry yields b = 0.406 ± 0.096 and t_d = 16 ± 6, showing that, despite the high number of undocumented infections, the state is in the same situation found in other states: it knows the behavior of the curve, but not the true value of each point on the curve. The difference from the previous states is that the value of t_d is also in agreement with international values. The average isolation index for Amazonas is around 51%.
The IBGE census (IBGE, 2017a, 2017b, 2017c, 2017d) was also used here to acquire population data for the state. From the AMIB census, Amazonas possesses 249 ICU beds, with 55% of them being occupied before the outbreak. Unfortunately, no data on tests was found for Amazonas; therefore, we consider a 90% loss of infections. The peak in Amazonas is estimated to occur on May 16th, with 20000 infections (16000–24000). Deaths are estimated to reach 500 in total (400–600) (Fig. 18). Several hospitalization issues are expected in the region during the pandemic (Fig. 19); given the concentration of indigenous tribes throughout the Amazon rain forest territory, a high density of cases should be expected in the indigenous population.

Future scenarios
Simulating the halting of non-pharmaceutical interventions is equivalent to making b increase back up to its initial value. In such simulations, we observe an increase of cases, that is, a second peak of the disease right after the halt. Fig. 20 shows that, to drastically diminish the second peak, social isolation must last about 220 days, supposing an efficiency of 70%. This is equivalent to stating that, in Brazil, quarantine should hold until October, while for a total prevention of the second peak, social isolation must remain in place until December. That is expected and agrees with simulations made by a group from Harvard University, which projected that, to prevent a second peak in the world and a possible resurgence of the virus, social isolation must hold until the beginning of 2021 and social distancing until 2022 or 2024 (Kissler et al., 2020).
However, that scenario might drastically change with the introduction of vaccines or efficient medicine in the population. As shown in the simulations (Figs. 21 and 22), such pharmaceutical interventions are able to rapidly decrease the infection curve. In order to simulate the effect of medicine in the population, we started decreasing the death probability P_IFR and the time from symptom onset to recovery t_r from a specific date, until the reduction reaches its maximum. We supposed that the introduction of medicine decreased both P_IFR and t_r by half within 10 days of its introduction in the population.
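The effect of medicine on the rates can be sketched with time-dependent parameters. The text states only that P_IFR and t_r are halved within 10 days; the linear ramp used below is an assumption, and the function names and numerical values are illustrative.

```python
def treated_value(t, baseline, t_start, ramp_days=10.0, reduction=0.5):
    """Parameter value at time t, reduced linearly (assumed shape) after t_start."""
    if t <= t_start:
        return baseline
    progress = min((t - t_start) / ramp_days, 1.0)
    return baseline * (1.0 - reduction * progress)


def rates_under_treatment(t, p_ifr0, t_r0, t_d, t_start):
    """Recovery rate g and death rate m once the treatment is being introduced."""
    p_ifr = treated_value(t, p_ifr0, t_start)
    t_r = treated_value(t, t_r0, t_start)
    return (1.0 - p_ifr) / t_r, p_ifr / t_d


# Illustrative values: treatment introduced on day 50.
for day in (40, 55, 100):
    g, m = rates_under_treatment(day, p_ifr0=0.007, t_r0=14.0, t_d=18.0, t_start=50.0)
    print(day, round(g, 5), round(m, 6))
```

Halving t_r roughly doubles the recovery rate g, so infected individuals leave the infectious group faster, while halving P_IFR halves the death rate m.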
For the vaccines, we added the term -vS(t) to (1), which removes individuals from the susceptible group at a rate v called the vaccination rate, and added the term vS(t) to (4), placing those individuals in the recovered group and granting them immunity against the virus. The vaccination rate v was chosen to behave according to a logistic function starting at t = 0 and gradually increasing to 0.2 after a specific time.
From the simulations, the safest approach is to keep the intervention in place while introducing the vaccines or drugs into the population, and then to wait a short period of 10 days before halting the intervention.
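A sketch of the vaccination terms is shown below: a logistic vaccination rate that rises from about zero to 0.2, and the SEIRD right-hand side with -vS(t) in the susceptible equation and +vS(t) in the recovered equation. The midpoint and width of the logistic curve are hypothetical, as are the example rates; the text specifies only the upper value of 0.2.

```python
import math


def vaccination_rate(t, v_max=0.2, t_mid=30.0, width=5.0):
    """Logistic vaccination rate; t_mid and width are assumed values."""
    x = (t - t_mid) / width
    if x >= 0:
        return v_max / (1.0 + math.exp(-x))
    e = math.exp(x)                      # stable branch for negative x
    return v_max * e / (1.0 + e)


def seird_with_vaccination(state, rates, t):
    """Time derivatives of (S, E, I, R, D) including the vaccination flow."""
    S, E, I, R, D = state
    b, k, c, g, m = rates
    N = S + E + I + R + D
    new_exposed = (b * I + k * E) * S / N
    vaccinated = vaccination_rate(t) * S     # flow from S directly to R
    return (
        -new_exposed - vaccinated,
        new_exposed - c * E,
        c * E - (g + m) * I,
        g * I + vaccinated,
        m * I,
    )


example_rates = (0.45, 0.198, 0.2, 0.0709, 0.00039)   # illustrative values
derivs = seird_with_vaccination((9.0e5, 5.0e4, 4.0e4, 9.0e3, 1.0e3), example_rates, t=30.0)
print(derivs)
```

Since the vaccination flow leaves S and enters R with the same magnitude, the total population remains conserved, as in the model without vaccination.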

Discussion with other studies
Other studies performed for Brazil found interesting results regarding the action of intervention policies. Bastos and Cajueiro simulated an epidemic scenario for Brazil with social isolation and found that, if social isolation does not last long enough, the lowering of the infection curve is replaced by a mere shift of its peak (Bastos & Cajueiro, 2003, p. 14288). Furthermore, previous models considering the implementation of public policies of social isolation have managed to show a direct relation between the reduction of daily infections in Brazil and social isolation (Crokidakis, 2020).
Simpler simulations using a different compartmental model suggest a possible infection peak of 10^8 cases in Brazil (Crokidakis, 2020); however, they do not include the incubation period, which causes a higher growth rate for the disease (Cintra et al., 2020) and, in turn, a tendency to predict the infection peak sooner than expected.
Our work proposes not only expected scenarios, but also includes an evaluation of underreported cases and ICU demand. Finally, Brazil being a tropical country means that changes in temperature affect the spread of the virus throughout the national territory (Prata et al., 2020). Future research could incorporate the average temperature during the pandemic to better forecast viral spread.

Conclusion
Simulations of the COVID-19 outbreak vary from model to model. Here we try to find a balance between the most precise model, which could be achieved by also considering a group of asymptomatic infections, and the availability of data. With this objective, we decided to simulate the behavior of the disease in Brazil based on international parameters, under the assumption that they would not differ much from those of Brazil (for example, the average time from the onset of symptoms to hospitalization found in Shanghai) and that the main aspects governing transmission would be intervention policies, population demographics and social contact. This assumption might prove to be limited if it is later found that climate effects strongly alter the spread, since Brazil is a tropical country with a higher average temperature than Europe and Asia, where many parameters of the simulations were found.
Another limitation of the model is the assumption of a homogeneous population. We tried here to counteract this limitation by estimating the effective population N according to international parameters and by widening the error margin of the predictions. A better estimate of the outbreak could be made by assessing cities individually; however, that would represent a loss of data, since the demographics available from IBGE are mainly for states and major cities. Another obstacle would be the testing data: the states that provided testing data did so only for the whole state, not for individual cities. We did not consider comorbidities in the population such as diabetes and cancer; however, the age of the individual seems to be the most important factor in determining mortality (N. U. Kingdom, 2020).
We also state here that the nature of the process is stochastic, allowing fluctuations from the deterministic model used to run the simulations. Thus, this study presents an estimate of the real situation and of the expected behavior given the parameters associated with the disease and the efficiency of the intervention. The above results present the dimension of the real scenario, but due to possible initial fluctuations in the stochastic behavior, in reality we might find some deviations from the expected.
Even with its limitations, the model has proven efficient in generating curves that agree with the estimated loss of cases for each state. Of the states studied here, Sao Paulo, Amazonas and Pernambuco present the highest risk of collapse in the health system, while Espirito Santo and the Federal District should have minor issues with system collapse or none at all. The blue curve representing the behavior of the official data, considering the error percentage for Amazonas, exhibited a growth far from the simulation region; however, it falls perfectly inside this region when the data is shifted by 10 days, meaning that if the infection in Amazonas began 10 days earlier than previously thought, the data fits the simulated curve.
Regarding the duration of social isolation, the safer course is to hold the isolation for as long as possible in order to decrease the height of the second peak, while increasing the number of tests performed. None of the simulations considered here assumed the end of the intervention; therefore, the numbers of deaths may be higher. Should any drugs that are efficient in combating the virus come along, the simulations show that the safer way is to first introduce them in the population without breaking the social isolation, and to start the process of reopening about 10 days later.

SUPPLEMENTARY MATERIAL
Source code used for some simulations, with a didactic example of predictions, is available at https://github.com/PedroHPCintra/Coronavirus.