Air temperature influences early Covid-19 outbreak as indicated by worldwide mortality

The Covid-19 outbreak has triggered a global crisis that is challenging governments, health systems and the scientific community worldwide. A central question in the Covid-19 pandemic is whether climatic factors have influenced its progression. To address this question, we used mortality rates during the first three weeks of recorded mortality in 144 countries, during the first wave of the pandemic. We examined the effect of climatic variables, along with the proportion of the population older than 64 years old, the number of beds in hospitals, and the timing and strength of the governmental travel measures to control the spread of the disease. Our first model focuses on air temperature as the central climatic factor and explains 67% of the variation in mortality rate, with 37% explained by the fixed variables considered and 31% explained by country-specific variations. We show that mortality rate is negatively influenced by warmer air temperature. Each additional degree Celsius decreases mortality rate by ~5%. Our second model is centred on the UV Index and follows the same trend as air temperature, explaining 69% of the variation in mortality rate. These results are robust to the exclusion of countries with low incomes, as well as to the exclusion of low- and medium-income countries. We also show that the proportion of vulnerable age classes and access to healthcare are critical factors impacting the mortality rate of this disease. The effect of air temperature at an early stage of the Covid-19 outbreak is a key factor in understanding the primary spread of this pandemic, and should be considered when projecting subsequent waves.


Introduction
The Covid-19 outbreak is due to the coronavirus SARS-CoV-2 and originated in the area of Wuhan, Hubei province, in China (Wu et al., 2020a; Zhou et al., 2020). On March 11, 2020, Covid-19 was officially declared a pandemic by the World Health Organization (WHO), as it had spread extremely rapidly since December 2019, following an exponential curve worldwide (World Health Organization, 2020a). A year later, as of March 11, 2021, the pandemic had caused more than 118 million diagnosed infections worldwide, and at least 2.6 million deaths (Dong et al., 2020). The health, social and economic consequences of this pandemic are huge (Cohen and Kupferschmidt, 2020). Although most of the infections first occurred in the Northern hemisphere during winter in early 2020, the Southern hemisphere was not spared by the disease, despite the summer season.
One debated question about Covid-19 is whether its spread has been affected by climatic factors, as for various other viruses (Neher et al., 2020). Indeed, many viral respiratory diseases show seasonal fluctuation related to climatic factors, especially in temperate regions (Dowell and Ho, 2004; Li et al., 2019). Air temperature and humidity have been shown to influence the spread of influenza viruses (Lowen et al., 2007), as well as the coronaviruses SARS-CoV (Lin et al., 2006), responsible for the severe acute respiratory syndrome (SARS), and MERS-CoV (Gardner et al., 2019), responsible for the Middle East respiratory syndrome. This question is of utmost importance, as the possible influence of climatic factors on the spread of Covid-19 could help decision makers adopt the most suitable strategy to control this disease along its successive waves of infection. Several recent studies have tackled this question since the onset of the Covid-19 outbreak. As for other viruses, temperature and humidity seem to have some influence on the spread of SARS-CoV-2, but recent studies have led to contrasting results. In general, a negative relationship is found between air temperature and the spread of the SARS-CoV-2 virus (Merow and Urban, 2020; Prata et al., 2020; Qi et al., 2020; Tobías and Molina, 2020; Xie and Zhu, 2020), but a positive relationship (Ma et al., 2020) and a lack of effect (Jamil et al., 2020) have also been reported. In the case of humidity, positive (Merow and Urban, 2020) and negative (Ma et al., 2020) effects on the infection rate have also been reported. The majority of studies were restricted to data from specific countries (mainly China at the early stage of the epidemic), or used the number of confirmed positive cases as the outbreak descriptor.
A recent study proposed a worldwide analysis (Merow and Urban, 2020), but the study is also based on the number of Covid-19 positive cases, which may be unevenly assessed among countries, as it is a measure strongly dependent on the countries' screening strategies.
At the early stage of the pandemic, many countries, such as Italy, Spain, France, Belgium, Switzerland, the UK, and the USA, were testing patients exhibiting clear Covid-19 symptoms. Other countries, such as South Korea, Singapore, Australia, United Arab Emirates and Germany, opted to screen their population more widely. Countries performing more tests were either following a strategy of wide screening, whatever the proportion of positive patients, or had numerous patients with clear Covid-19 symptoms. These two contrasting situations lead to infection estimates that are hardly comparable among countries. To show that descriptors based on Covid-19 positive patients might be strongly biased when analysing data reported from multiple countries, we performed a correlation analysis between the mean maximum air temperature between mid-February and mid-March 2020 and the total number of tests performed per country, as of March 20. We found a strong negative correlation, where countries with lower temperatures tended to perform more tests (r = −0.48, n = 41, p value = 0.001). This correlation might be influenced by the higher impact of this disease in the northern hemisphere during the first months of 2020. However, the reasons for this correlation remain unclear, and it may lead to temperature-biased numbers of Covid-19 positive patients across countries. Therefore, we argue that descriptors based on the number of tests performed should not be used to assess the contribution of air temperature (or other climatic factors) to the trans-country Covid-19 outbreak, until an intensive and comparable testing strategy is generalized in all countries.
We propose the mortality rate as a less biased descriptor to test the effects of climatic factors on the Covid-19 outbreak. Indeed, the number of countries recording deaths increased from nine by the end of February 2020 to more than 120 by the end of March 2020 (Dong et al., 2020), providing an opportunity to use the mortality rate over a large latitudinal range. Because the mean time from the onset of the disease to death is 20 days (Wu et al., 2020b), we use the officially reported mortality rate expressed as the number of deaths per inhabitant over a standardized time period of three weeks, corresponding to the 21 days following the first reported Covid-19 death in the country. In almost all countries, the number of Covid-19 deaths publicly reported corresponded to the cases of Covid-19 positive patients and patients displaying clear diagnostic Covid-19 symptoms who died on a given date. However, to evaluate differences among countries in measuring and reporting mortality rates, we examined the relationship between the officially reported Covid-19 mortality rates per European country for the three weeks after the first death and the excess deaths per country (for whatever reason) for the same period, as reported by EuroMOMO, the European mortality reporting network (EuroMOMO, 2020). We found a strong correlation between these two metrics (r = 0.73, n = 19, p value < 0.001), indicating that, at least in Europe, countries were measuring their Covid-19 mortality rates in a comparable way. In non-European countries, although the officially reported mortality may not reflect the exact number of deaths due exclusively to the Covid-19 infection, we expect the biases to be more similar across countries than the biases affecting the number of positively tested patients.
The biases affecting Covid-19 mortality rate may include (i) other underlying health problems that are exacerbated by the viral infection, (ii) patients that display symptoms that are mistakenly attributed to Covid-19 (ascertainment bias), or (iii) unrecorded Covid-19 deaths. While these biases can either overestimate or underestimate mortality rates in a comparable manner among countries, we included the random country-specific variation in our analysis in order to control for all these potential biases.
Using the officially reported mortality rate as the outbreak descriptor implies that the disease is well established. We thus included other known explanatory factors that may explain the differences among countries: the proportion of the most susceptible age group in the population, the number of beds in hospitals (reflecting the difference in health care accessibility among countries), and the timing and strength of governmental travel measures for controlling the spread of the disease. Our aim is to clarify the effect of air temperature on the Covid-19 outbreak by using the mortality rate instead of the confirmed positive cases used in previous studies.

Covid-19 dataset
We collected the number of deaths per day due to Covid-19 in each country affected by SARS-CoV-2 from the dataset hosted by the Centre for Systems Science and Engineering at Johns Hopkins University (Dong et al., 2020). This dataset is maintained daily and is verified against registers provided by various state health authorities as well as the World Health Organization (WHO). The dataset provides information at the country level but also about sub-regions of China, the United States, Canada, and Australia. French and British overseas territories are also included separately in this dataset. Due to the uncertainty regarding the initial stage of the spread of this disease (Wu et al., 2020a), we decided to exclude the data for mainland China. We considered the date of the first registered death in each country as well as the cumulative number of deaths during the first 21 days of death registration in each country or territory. We used the cumulative mortality at day 21 in order to avoid the daily variance of death registration and to compare data extracted from similar epidemic phases across countries. We considered all regions registering at least 21 days of deaths up to April 27, 2020. The final dataset includes 217 territories in 144 countries.
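The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name, the dictionary input of daily new deaths, and the convention that day 1 is the day of the first registered death are all hypothetical choices made here.

```python
from datetime import date, timedelta

CUTOFF = date(2020, 4, 27)   # last date considered, as stated in the text
WINDOW_DAYS = 21             # length of the standardized period


def first_death_and_cumulative(daily_deaths, cutoff=CUTOFF, window=WINDOW_DAYS):
    """Return (date of first death, cumulative deaths over `window` days).

    daily_deaths: dict mapping a date to the number of NEW deaths that day
    (hypothetical input format). Returns None when the territory has no
    death or has fewer than `window` days of records before `cutoff`.
    """
    death_days = sorted(d for d, n in daily_deaths.items() if n > 0)
    if not death_days:
        return None
    first = death_days[0]
    # Assumption: day 1 of the window is the day of the first death itself.
    last = first + timedelta(days=window - 1)
    if last > cutoff:
        return None
    total = sum(n for d, n in daily_deaths.items() if first <= d <= last)
    return first, total
```

A territory whose first death falls too close to the cutoff is simply dropped, which mirrors the inclusion criterion of at least 21 days of records.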

Climatic dataset
The historical weather conditions in each country or territory included in the Covid-19 dataset were obtained from the Dark Sky API dataset by using the R package "darksky" (Rudis, 2017). We considered the geographical coordinates associated with each record in the Covid-19 dataset. We included in the analysis the maximum temperature (°C), relative humidity, and UV Index. These variables were taken daily for the 21 days before (weeks 1 to 3) and the 21 days after the first registered death (weeks 4 to 6) in each country or territory recorded. We computed the average across these 42 days for each climatic variable. However, in order to check the influence of the weather during the time period considered, we repeated our analysis using a shorter period of time, focusing on the conditions that preceded the beginning of the outbreak. In the supporting information, we thus present the results obtained with averaged climatic conditions covering only the three weeks (21 days) before the first registered death (weeks 1 to 3), as well as two weeks (15 days), including the second and third weeks before the first registered death (weeks 1 and 2), and one week (7 days) during the third week before the first registered death (week 1).
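As a rough illustration of the averaging windows described above, the following Python sketch computes the mean of a daily climatic variable around the date of the first registered death. The function name and input format are hypothetical, and two details are assumptions made here because the text does not specify them: the "after" period is taken to start on the day of the first death, and days with no observation are skipped.

```python
from datetime import date, timedelta


def window_mean(daily_values, first_death, days_before=21, days_after=21):
    """Mean of a daily variable over a window around the first death.

    daily_values: dict mapping a date to one daily value, e.g. maximum
    temperature in degrees Celsius (hypothetical input format).
    The window covers `days_before` days ending the day before the first
    death, plus `days_after` days starting on the day of the first death
    (assumption). Missing days are skipped (assumption).
    """
    start = first_death - timedelta(days=days_before)
    values = []
    for offset in range(days_before + days_after):
        day = start + timedelta(days=offset)
        if day in daily_values:
            values.append(daily_values[day])
    if not values:
        raise ValueError("no observations in the requested window")
    return sum(values) / len(values)
```

Setting `days_after=0` gives the shorter pre-outbreak window of 21 days; the one-week and two-week variants would additionally require shifting the end of the window, which this sketch does not attempt.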

Non-climatic dataset
The proportion of the population older than 64 years old per country was calculated on the basis of the population indicators available from the World Bank database (The World Bank, 2019). The indicator of the number of hospital beds per 10 K inhabitants was obtained from the Global Health Observatory (GHO) database of the World Health Organization (WHO) (World Health Organization, 2020c). We used the latest available data per country, which varied from 2005 to 2015, but most data were from the period 2010-2015. Governmental immigration restrictions per country were used as a proxy for the governmental travel measures for controlling the spread of the disease. This last dataset was taken from the Covid-19 Travel Restriction Monitoring database, compiled by the International Organization for Migration (IOM) (2020). We added some missing data directly from the primary source database, the International Air Transport Association (IATA) (2020). For each country and date, from March 8 to March 26, we summed the number of other countries for which immigration restrictions were imposed. We considered all the IOM travel restriction categories with a link to Covid-19 spread control. Based on these daily immigration restrictions, we grouped the countries into three categories. Category 1 (soft restriction) includes countries that imposed virtually no restrictions, or imposed them late (not before March 18) and only on a subset of countries. Category 2 (middle restriction) contains the countries that started imposing noticeable immigration restrictions from March 11 to 17, with a fast daily increase in the number of countries with restricted entrance. Category 3 (strong restriction) groups the countries that imposed immigration restrictions early, before March 11, with a fast daily increase in the number of countries from which immigration was restricted.
The rates of Covid-19 tests performed per country were obtained from the Statista database (Statista, 2020). The nominal Gross Domestic Product (GDP) per capita in each country was obtained from the World Bank statistics by using the R package "WDI" (Arel-Bundock, 2019). To assess the consistency of the Covid-19 mortality rates reported by European countries over the studied period, we compared them to the excess deaths for the same period as reported by EuroMOMO (2020), the European mortality monitoring network. For each country, we took from the EuroMOMO database the weekly z-scores describing the mortality rate deviation from standardized expectation. We calculated the mean z-score of excess deaths over the three complete weeks following the date of the first reported death of Covid-19, in 2020. We then calculated the mean z-score of excess deaths for the same three-week period for years 2015 to 2019, providing a baseline extracted from the five previous years. The fraction of excess deaths attributed to Covid-19 for the three weeks of interest was obtained by subtracting the mean z-score of the 2015-2019 period (baseline) from the mean z-score of 2020.
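The date thresholds of the three categories can be written as a small Python function. This is a deliberately simplified sketch: it uses only the date on which noticeable restrictions started, and ignores the other criteria mentioned above (the speed of the daily increase and whether only a subset of countries was targeted). The function name and its single argument are hypothetical.

```python
from datetime import date


def restriction_category(first_noticeable_restriction):
    """Map the start date of noticeable restrictions to a category (1 to 3).

    first_noticeable_restriction: a date, or None when a country imposed
    virtually no restrictions. Only the date thresholds from the text are
    applied; the rate of increase is not modelled (simplification).
    """
    if first_noticeable_restriction is None:
        return 1  # soft: virtually no restrictions
    if first_noticeable_restriction < date(2020, 3, 11):
        return 3  # strong: restrictions before March 11
    if first_noticeable_restriction <= date(2020, 3, 17):
        return 2  # middle: restrictions from March 11 to 17
    return 1      # soft: restrictions not before March 18
```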
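The baseline subtraction described for the EuroMOMO z-scores can be sketched as follows. The function names and the input layout (a list of three weekly z-scores for 2020, and a dictionary of such lists for 2015 to 2019) are hypothetical; the sketch assumes every year contributes the same number of weeks, so that the mean of yearly means equals the pooled mean.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    values = list(values)
    if not values:
        raise ValueError("mean of an empty sequence")
    return sum(values) / len(values)


def excess_z_attributed(z_2020, z_baseline_by_year):
    """Mean 2020 z-score minus the mean z-score of the baseline years.

    z_2020: weekly z-scores for the three weeks after the first death.
    z_baseline_by_year: dict mapping a year (2015 to 2019 in the text)
    to the weekly z-scores of the same three-week period in that year.
    """
    baseline = mean(mean(weeks) for weeks in z_baseline_by_year.values())
    return mean(z_2020) - baseline
```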

Statistical analysis
Our response variable (mortality rate, θ) is the logarithm of the number of deaths per inhabitant during the first three weeks of death registration in each country or territory included in the Covid-19 dataset. This variable is computed as the cumulative number of deaths (d) on day 21 divided by the population size (N) of each country or region: $\theta = \ln\left(\sum_{t=1}^{21} d_t / N\right)$. We used the logarithm in order to maintain a Gaussian distribution of the model residuals. The fixed predictor variables included temperature, UV Index, relative humidity, proportion of the population older than 64 years old, number of beds in hospitals for every 10 K inhabitants, and the governmental travel measures implemented in each country to contain the spread of this disease. Since temperature and UV Index are highly correlated (r ~ 0.9, p value < 0.001), we included those variables in two separate models. We verified the collinearity among predictor variables by computing the variance inflation factor (VIF). None of the variables included in the separate models with either temperature or UV Index showed collinearity with other predictor variables (VIFs < 2.4) (Zuur et al., 2009). We also estimated the effect of including a spatial correlation structure as well as the effect of the country level as a random effect, following Zuur et al. (2009). The model selection was based on the lowest Akaike Information Criterion value (AIC) (Burnham and Anderson, 2002). The final model structure was a linear mixed model (LMM) that considers an exponential spatial correlation, as well as country level as a random intercept and slope of either temperature or UV Index. We explored all combinations of predictor fixed variables. We averaged all models with a ΔAIC smaller than two units (Burnham and Anderson, 2002). We reported the conditional ($R^2_{GLMM(c)}$) and marginal ($R^2_{GLMM(m)}$) coefficient of determination for each candidate best model included in the model averaging.
These values represent the variance explained by both the fixed predictors and the random variable (conditional), and by the fixed predictors alone (marginal) (Nakagawa et al., 2017). We also analysed the climatic variables independently in a subset of data excluding low-income countries, and in a subset excluding both low-income and middle-income countries. These models have the same geographical correlation and random variable structure as the previous analysis. In addition, we also analysed the relation between climatic variables and mortality rate computed per week in a non-linear generalized additive mixed model (GAMM). In the case of the relationship between the number of Covid-19 tests and the temperature of each country, the model with the lowest AIC was a simple linear relationship without the influence of a spatial correlation structure or the inclusion of a country-level random effect. All statistical analyses were performed using R 3.6.2 (R Development Core Team, 2019). We used the package 'nlme' (Pinheiro et al., 2019) for the linear mixed models and the package 'MuMIn' (Barton, 2019) for the model averaging (full average) and the computation of the determination coefficients.
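Two steps of this section lend themselves to a short illustration: the response variable θ and the ΔAIC < 2 rule used to pick the models to average. The Python sketch below is not the authors' R code (the analysis itself was run with 'nlme' and 'MuMIn'); function names and input formats are hypothetical, and the mixed-model fitting and spatial correlation structure are not reproduced.

```python
import math


def log_mortality_rate(daily_deaths_21, population):
    """theta = ln(sum of deaths over days 1..21 / population size)."""
    if len(daily_deaths_21) != 21:
        raise ValueError("expected exactly 21 daily death counts")
    total = sum(daily_deaths_21)
    if total <= 0 or population <= 0:
        raise ValueError("deaths and population must be positive")
    return math.log(total / population)


def models_within_delta_aic(aic_by_model, max_delta=2.0):
    """Return {model name: delta AIC} for models with delta AIC < max_delta.

    aic_by_model: dict mapping a model label to its AIC value
    (hypothetical input; the AIC values would come from fitted models).
    """
    best = min(aic_by_model.values())
    return {
        name: aic - best
        for name, aic in aic_by_model.items()
        if aic - best < max_delta
    }
```

For instance, a territory with one death per day for 21 days and 21 million inhabitants has θ = ln(10⁻⁶), about −13.8.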

Results and discussion
Very similar results were obtained with the temperature-based and the UV-based averaged models (Table S1). Both approaches resulted in sets of best models in which fixed variables explained between 36% and 41% of the variation in mortality rate ($R^2_{GLMM(m)}$, Table S1), and between 66% and 69% when considering both fixed and random variables ($R^2_{GLMM(c)}$, Table S1). Our best temperature-based model explains 67% of the variation, of which 36% is explained by climatic, demographic and healthcare variables, excluding country-specific effects (Fig. 1). A similar result is obtained with our best UV-based model, which explains 69% of the variation, with 37% explained by the modeled variables, excluding the country-specific effects (Fig. S1, supporting information). The country-specific effects (31% in our temperature-based model and 32% in our UV-based model) account for any random variation specific to each country, such as the timing of governmental interventions or different strategies of reporting mortality. Countries with positive random effects, such as Ecuador, the United States, Spain, and Saudi Arabia, had higher mortality rates independent of the fixed variables included in our model, while countries with negative random effects, such as Japan, France, and Canada, had a lower mortality rate than expected with the factors we modeled. The standard deviation of the random effect was 0.6 and 1.6 for the temperature-based and UV-based models, respectively. This means that both models have an almost homogeneous random effect among countries, with the range of variation of most countries within the range of the standard deviation around the zero value (Fig. 1 and Fig. S1, supporting information). For countries with a much stronger negative effect, such as Japan, the negative effect may be attributed to early governmental interventions, which are also considered in our model through the random variation.
In these models, temperature and UV Index negatively influenced the mortality rate (Fig. 2). Other explanatory variables also contributed to the Covid-19 outbreak. The proportion of the population older than 64 years old had a positive effect on mortality rate, while the number of beds in hospitals showed a negative effect (Table 1 for the best models, and Table S2 for the averaged models). We found no significant effect for the relative humidity and the strength of governmental travel restrictions. However, these two variables showed a significant effect in our preliminary analysis (Quilodran et al., 2020), which considered country-wise Covid-19 mortality rates over a shorter period of time, that is, the two weeks following the first reported death, instead of the three weeks considered in this study. The current study also includes more countries than the preliminary analysis (144 versus 88, respectively).
The best temperature-based model indicates that every extra degree Celsius in air temperature decreases the mortality rate by 4.88% ($e^{-0.05} - 1$, Table 1). The best UV-based model shows that every unit increase in UV Index results in a 16.47% decrease in the mortality rate ($e^{-0.18} - 1$, Table 1). One unit increase in UV Index has a larger effect than one additional degree Celsius in temperature because the former variable has a smaller range of variation (see Fig. 2). The final effects of these two variables are comparable, as indicated by the standardized coefficients, which are similar in both the temperature (−0.28) and the UV Index (−0.27) analyses. In the best temperature-based model, every extra bed in hospitals per 10 K inhabitants decreased the mortality rate by 1.98% ($e^{-0.02} - 1$, Table 1). The standardized coefficient of this variable (−0.22) is slightly smaller in magnitude than that of the temperature (−0.28) (Table 1). Finally, every 1% increase in the proportion of the population older than 64 years old increased the mortality rate by 15.03% ($e^{0.16} - 1$, Table 1). This last variable has a greater effect on the variation of the mortality rate, with a standardized coefficient (0.45) almost twice as high as for the other variables (Table 1). Comparable results for the number of beds in hospitals and the proportion of the population older than 64 years old are found with the best UV-based model (Table 1).
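Because the response variable is a logarithm, a coefficient β translates into a relative change of e^β − 1 in the mortality rate per unit of the predictor. The short Python sketch below applies this conversion to the rounded coefficients quoted in the text for temperature, UV Index and hospital beds; the function name is a hypothetical helper, not part of the original analysis.

```python
import math


def percent_change(beta):
    """Percent change in the (untransformed) mortality rate per unit
    increase of a predictor, for a model fitted on the log scale."""
    return (math.exp(beta) - 1.0) * 100.0


# Rounded coefficients quoted in the text (best models, Table 1).
for label, beta in [("temperature", -0.05), ("UV Index", -0.18), ("hospital beds", -0.02)]:
    print(f"{label}: {percent_change(beta):.2f}% per unit")
```

With these inputs the helper returns about −4.88%, −16.47% and −1.98%, matching the values reported above.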
The influence of climatic variables on the early stage of the outbreak is robust to the time period considered to summarize the climatic conditions of each territory or country (one, two or three weeks instead of the full period of 42 days) (Tables S3 and S4). Similarly, the use of a non-linear generalized additive mixed model (GAMM) computed per week shows similar trends to our LMM analysis (Appendix S1). A previous version of this manuscript published as a preprint (Quilodran et al., 2020) also shows similar results by considering the average climatic conditions of the 15 days before and the 15 days after the first death recorded in each territory or country (30 days in total).
As high-income countries may have more efficient systems to identify and report Covid-19 deaths, we examined the effect of the climatic factors with two subsets of countries. For the first, we excluded all countries with a nominal Gross Domestic Product (GDP) of less than 6000 USD per inhabitant. Countries included in this analysis (n = 78) have a GDP equal to or higher than that of Lebanon, Thailand or Peru, for example. In this case, a similar significant negative effect is obtained for both air temperature (estimate = −0.02, SE = 0.01, t = −3.2, p value < 0.01) and UV Index (estimate = −0.15, SE = 0.05, t = −3.1, p value < 0.01), while humidity has a significant positive effect (estimate = 2.12, SE = 0.97, t = 2.2, p value = 0.03). For the second subset, we excluded all countries with a nominal Gross Domestic Product (GDP) smaller than 40,000 USD per inhabitant, which leaves countries with a GDP equal to or higher than that of the United Kingdom (n = 23). These results indicate that the significant effect of air temperature and UV Index on the Covid-19 mortality rate is robust to the exclusion of countries with low incomes, as well as to the exclusion of low- and medium-income countries. We stress that warm air temperatures and intense UV light may act by reducing the transmission of the SARS-CoV-2 virus, or they may directly or indirectly inactivate this virus. Preliminary results indicate that, under laboratory conditions, the SARS-CoV-2 virus survives less than 4 days at 37°C (World Health Organization, 2020b) while at 4°C it may remain stable for more than 14 days (Chin et al., 2020), offering one possible explanation for the negative relation between air temperature and the Covid-19 mortality rate. Some evidence (Lowen and Steel, 2014) suggests that the transmission efficiency of influenza viruses by respiratory droplets increases when air is cold and dry.
Moreover, in periods of the year or places of the world with cold air and reduced UV light, other human respiratory diseases are more frequent (Hyrkäs-Palmu et al., 2018) and may facilitate inter-individual transmission via coughing or sneezing. When the weather is cold, people also tend to spend more time in closed and populated environments, which also facilitates inter-individual transmission. A causative link between exposure to sunlight (including UV light), a higher synthesis of vitamin D, and better efficiency of the human immune system has been evidenced (Aranow, 2011), which may explain in part the negative relation between the UV Index and the Covid-19 mortality rate. Yet, these modes of action remain hypotheses to be tested in the case of Covid-19.

Conclusion
Humankind is facing an unprecedented worldwide crisis, forcing decision makers to move forward into a labyrinth of unknowns. Understanding how climatic factors have contributed to the Covid-19 outbreak could strengthen the projections of its future spread. Previous analyses were unable to provide an unambiguous view, potentially because of limited data or biased descriptors for country-wise comparisons. Using global datasets recorded during the early mortality-inducing phase of the Covid-19 outbreak, we show that air temperature and UV Index are related climatic variables that may play a role in explaining the Covid-19 mortality rate. A higher value in either of the variables is associated with lower mortality rates. Relative humidity is potentially a third climatic variable able to explain a reduction in Covid-19 mortality rate, but its effect deserves more attention in future studies. Our results confirm the recent observation of Merow and Urban (2020), who also consider the random country variation and find similar results regarding air temperature and UV Index, but who use the confirmed positive cases instead of the mortality rate used in our analysis. Ficetola and Rubolini (2021) also analysed the confirmed positive cases in a recent worldwide analysis that considered environmental and socio-economic variables. They also showed a faster dissemination rate at cold air temperatures, but indicated that governmental measures have the potential to overwhelm the impact of environmental conditions. While these previous analyses are not directly comparable with our study in terms of the methodological approach and the data used (e.g. they used infection rate instead of mortality rate), our analysis is most successful in explaining the observed variation across countries, potentially because the mortality rate is reported by countries in a less variable way.
In our analysis, we also show the importance of access to healthcare systems and age class structure in explaining the Covid-19 outbreak. Although our results do represent an important indication for projections of the dynamics of the disease, it is clear that climatic factors alone will not passively stop the outbreak. Importantly, the effect of air temperature on the Covid-19 outbreak does not ensure that this disease will evolve into a winter-seasonal epidemic viral disease like the influenza virus (Tamerius et al., 2013). Although our results cover the early phase of the outbreak, they provide useful knowledge for understanding and projecting subsequent infection waves. More research is needed for a thorough comprehension of the dynamics of the Covid-19 pandemic in the longer run.

Author contributions
C.S.Q., M.C. and J.I.M.B. conceived the study and compiled the data. C.S.Q. performed the statistical analysis. All authors interpreted the results and contributed critically to the writing of the manuscript.

Data and materials availability
The data used in the analysis is available from the Dryad Digital Repository https://doi.org/10.5061/dryad.1ns1rn8th.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.