Associations between environmental covariates and malaria incidence in high transmission settings of Uganda: A distributed non-linear lagged ecological analysis

Background Environmental factors such as temperature, rainfall, and vegetation cover play a critical role in malaria transmission. However, quantifying the relationships between environmental factors and measures of disease burden relevant for public health can be complex, as effects are often non-linear and subject to temporal lags between changes in environmental factors and changes in the incidence of symptomatic malaria. The aim of this study was to investigate the associations between environmental covariates and malaria incidence in high transmission settings of Uganda.
Methods This study leveraged data from seven malaria reference centres (MRCs) located in high transmission settings of Uganda over a 24-month period (January 2019 - December 2020). Estimates of monthly malaria incidence (MI) were derived from the MRCs' catchment areas. Environmental data, including monthly average measures of temperature, rainfall, and normalized difference vegetation index (NDVI), were obtained from remote sensing sources. A distributed non-linear lagged model was used to investigate the quantitative relationship between environmental covariates and malaria incidence.
Results Overall, the median (range) monthly temperature was 30 °C (26-47), rainfall 133.0 mm (3.0-247), NDVI 0.66 (0.24-0.80), and MI 790 per 1000 person-years (73-3973). A non-linear relationship between environmental covariates and malaria incidence was observed. An average monthly temperature of 35 °C was associated with significant increases in malaria incidence compared to the median observed temperature (30 °C) at month lag 2 (IRR: 2.00, 95% CI: 1.42-2.83) and with significant cumulative increases in MI at month lags 1-4, with the highest cumulative IRR of 8.16 (95% CI: 3.41-20.26) at lag month 4. An average monthly rainfall of 200 mm was associated with significant increases in malaria incidence compared to the median observed rainfall (133 mm) at lag month 0 (IRR: 1.24, 95% CI: 1.01-1.52) and with cumulative increases in the IRR of malaria at month lags 1-4, with the highest cumulative IRR of 1.99 (95% CI: 1.22-2.27) at lag

cumulative IRR of 1.57 (95% CI: 1.09-2.25) at lag month 4. The rate of increase in the cumulative IRR of malaria was higher within lag months 1-2 than within lag months 3-4 for all the environmental covariates.
Conclusions In high-malaria transmission settings, high values of environmental covariates were associated with cumulative increases in the incidence of malaria, with peak associations occurring after variable lag times. The complex associations identified are valuable for designing strategies for early warning, prevention, and control of seasonal malaria surges and epidemics.

Background
Environmental covariates such as temperature, vegetation, and rainfall play a major role in malaria transmission [1][2][3] by changing vector populations, which often leads to changes in malaria burden. Yet the quantitative relationships between changes in these covariates and malaria incidence are not well characterized in many settings, especially in sub-Saharan Africa. Several factors complicate the characterization of these relationships. Firstly, the effect of environmental covariates on mosquito and parasite populations may not be linear. For instance, a moderate increase in rainfall leads to increased humidity, which prolongs the longevity of adult mosquitoes and causes a surge in their population, while heavy rainfall reduces the population by washing away mosquito larvae [4]. Similarly, temperature is a crucial factor in the vector life cycle. For instance, a rise in temperature may increase the number of blood meals taken and eggs laid by the mosquito, increasing mosquito population density and thereby affecting transmission. Low temperatures, especially below 20 °C, and very high temperatures may hamper the completion of the mosquito growth cycle [5,6]. Vegetation may provide an outdoor resting habitat or shelter for mosquitoes from extreme conditions unfavourable for mosquito population growth. Many studies have reported associations between changes in malaria burden and patterns of environmental factors [7][8][9][10][11][12][13]. However, the associations reported vary between settings. For example, a study from South Africa found that an increase in temperature significantly raised malaria infections [12], while another in Ethiopia showed a negative correlation [13].
Environmental covariates may also show effects that are delayed in time, requiring examination of the temporal dimension of the exposure-lag-response relationship. Most studies on the relationships between covariates and the malaria burden have relied on a specific time lag, ignoring the cumulative effect of the environmental covariates, which may last for a period longer than the current time [7,14,15]. From a biological perspective, several distinct periods, including the time for the mosquito to develop, the development period of parasites within the mosquito, and the incubation period of the parasites within the human body, make the assumption of a specific time lag unrealistic, as the observed effect of the environmental covariates at a given lag may be a cumulative effect from the preceding lags. Additionally, the occurrence of extreme environmental conditions in the recent past, such as prolonged rainfall seasons, may have an impact on malaria burden which is not yet clear.
Climate change has had great impacts on infectious diseases, with shifts in malaria transmission areas reported [16,17], which may be reflected in changes in malaria burden captured through surveillance data. Routine malaria surveillance focuses on measures of disease (rather than entomological measures), and measures of disease are of greatest relevance from a public health perspective. Recently, Uganda has experienced extreme environmental conditions in a setting where malaria is already endemic in almost 95% of the country [18], and yet there are limited data on the quantitative relationship between these covariates and malaria. The Uganda Malaria Surveillance Project (UMSP), in collaboration with the National Malaria Control Division (NMCD), has established an enhanced health facility-based malaria surveillance system at 70 public health facilities across the country, referred to as Malaria Reference Centers (MRCs) [19]. At these MRCs, individual patient-level data are collected and resources are provided to maximize laboratory testing of all patients with suspected malaria. Data on the village of residence of patients are captured and catchment areas around the MRCs identified, allowing for the generation of estimates of malaria incidence [20]. In this study, the relationships between malaria incidence and environmental variability in rainfall, temperature, and vegetation in Uganda are quantified by investigating exposure-lag-response effects. Quantifying these relationships is a key step in producing useful systems to predict malaria incidence in the region and to plan effective preventive strategies and sustainable long-term malaria programming for the control of malaria burden.

Methods
Study setting: This study leveraged data from UMSP derived from sentinel surveillance in level III and IV public outpatient facilities that generally see 1000-3000 outpatients per month and have functioning laboratories. These facilities provide care free of charge, including diagnostic testing and medications. A full description of the MRCs and the data captured has been published elsewhere [21]. This study included data from seven of the 70 MRCs. MRCs were included if they met the following criteria: 1) location in a high malaria burden area where indoor residual spraying of insecticide (IRS) was not being implemented, and 2) availability of malaria incidence estimates for the period January 2019 to December 2020. The MRCs included in the analysis were Aduku health centre IV in Kwania District, Lobule health centre III in Koboko District, Awach health centre IV in Gulu District, Lalogi health centre IV in Omoro District, Patongo health centre IV in Agago District, Padibe health centre IV in Lamwo District, and Namokora health centre IV in Kitgum District. The location of these MRCs in Uganda is shown in
Environmental variables: Average monthly environmental data for the period January 2019 to December 2020 were processed from remote sensing sources. The data included temperature (defined as daytime land surface temperature measured in degrees Celsius), the Normalized Difference Vegetation Index (NDVI), defined as a dimensionless index used to measure neighborhood greenness [22], and rainfall. Rainfall data were obtained from the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) database and were measured in millimeters. CHIRPS incorporates 0.05° resolution satellite imagery with in-situ station data to create gridded rainfall time series for trend analysis and seasonal drought monitoring [23]. Temperature and NDVI data were obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard National Aeronautics and Space Administration (NASA) satellites [24]. Global MODIS data are provided every month at 1-kilometer spatial resolution as a gridded level-3 product in the sinusoidal projection and were gap-filled to correct for cloud cover using a random forest model with interpolated values, elevation, and time [25]. Satellite environmental covariates were preferred over nationally available estimates since they have been shown to have an even spatial distribution [26] and were available at a low administrative level such as the village, enabling derivation of health facility catchment area-specific estimates. The downloaded raster files were imported into Quantum Geographic Information System (QGIS) software, and the environmental covariate values at each village centroid were extracted using the Point Sampling tool. To give MRC-specific estimates of an environmental covariate in a given month, the centroid values corresponding to the villages that form the catchment area were averaged. Low values of each covariate (temperature, rainfall, and vegetation cover) were defined as any value below the observed median, while high values were those greater than the median for the respective environmental covariate.
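The averaging and median-based classification described above can be illustrated with a minimal Python sketch. It is not the authors' workflow (which used QGIS and R); the function names and all numeric values are hypothetical and chosen only for illustration, and the label "reference" for a value equal to the median is an assumption.

```python
from statistics import mean, median


def catchment_mean(village_values):
    """Average the village centroid values that make up one catchment area."""
    return mean(village_values)


def classify_against_median(monthly_values):
    """Return the observed median and a 'low'/'high'/'reference' label per value."""
    reference = median(monthly_values)
    labels = []
    for value in monthly_values:
        if value > reference:
            labels.append("high")       # greater than the observed median
        elif value < reference:
            labels.append("low")        # below the observed median
        else:
            labels.append("reference")  # equal to the median (assumed label)
    return reference, labels


# Hypothetical centroid temperatures (degrees Celsius) for three villages.
monthly_temperature = catchment_mean([29.0, 31.0, 30.0])

# Hypothetical series of monthly catchment-level means.
reference, labels = classify_against_median([28.0, 30.0, 35.0])
print(monthly_temperature, reference, labels)
```

In this sketch each catchment-month value is a simple unweighted mean of its village centroids, which matches the description in the text; weighting by village population would be a different design choice.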
Outcome: The outcome was monthly malaria incidence, defined as the total number of malaria cases within a given health facility catchment area divided by the population of the catchment area. Catchment areas were defined as the village where the MRC was located and adjacent villages with malaria incidence similar to that of the village where the MRC is located. Details of how the catchment areas were estimated are published elsewhere [21]. A given catchment area included 1-5 villages. The village-level population estimates for each catchment area were obtained from the AfriPop database and included a fixed population growth function of 0.0029 per unit time [27].
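As a rough illustration of the outcome definition, the sketch below converts monthly case counts to an incidence per 1000 person-years. Two points are assumptions rather than statements from the study: that one month of observation contributes 1/12 person-year per resident, and that the growth rate of 0.0029 compounds once per month (the text does not specify the time unit). All function names and numbers are hypothetical.

```python
def projected_population(base_population, months_elapsed, growth_rate=0.0029):
    """Project a catchment population forward, assuming compound monthly growth."""
    return base_population * (1.0 + growth_rate) ** months_elapsed


def monthly_incidence_per_1000_py(cases, population):
    """Cases in one month divided by person-time at risk, per 1000 person-years."""
    person_years = population / 12.0  # assumption: each resident contributes 1/12 year
    return 1000.0 * cases / person_years


# Hypothetical catchment: 6000 residents at baseline, 50 cases in the first month.
population = projected_population(6000, months_elapsed=0)
print(monthly_incidence_per_1000_py(50, population))
```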
Statistical analysis: Cumulative data for the characteristics of the study populations over the 24-month observation period (January 2019 - December 2020) were summarized and presented as monthly medians with corresponding ranges. A cross-correlation analysis was performed to ascertain the magnitude and direction of time-lagged relationships between environmental covariates and malaria incidence, and to estimate the optimal lags. The optimal lag was defined as the month corresponding to the highest significant correlation coefficient. The Granger causality Wald test was performed to determine the likely effect of lagged environmental factors on the variability of malaria incidence. The distributed lag non-linear model (DLNM) was used to investigate non-linear and lagged (specific and cumulative) effects of environmental covariates on malaria incidence.
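The cross-correlation step can be sketched as follows. This is a simplified illustration and not the authors' code: it computes a Pearson correlation between the covariate at month t minus the lag and incidence at month t, omits the significance testing that the study applied, and picks the lag with the largest absolute correlation, which is an assumed reading of "highest" correlation. The input series are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lagged_correlations(exposure, outcome, max_lag):
    """Correlate exposure at month t - lag with outcome at month t, for each lag."""
    correlations = {}
    for lag in range(max_lag + 1):
        shifted_exposure = exposure[: len(exposure) - lag]
        aligned_outcome = outcome[lag:]
        correlations[lag] = pearson(shifted_exposure, aligned_outcome)
    return correlations


def optimal_lag(correlations):
    """Lag with the largest absolute correlation (significance not assessed here)."""
    return max(correlations, key=lambda lag: abs(correlations[lag]))


# Hypothetical monthly series in which incidence follows the covariate by two months.
exposure = [1, 3, 2, 5, 4, 6, 2, 7, 3, 8]
outcome = [0, 0] + [10 * value for value in exposure[:-2]]
correlations = lagged_correlations(exposure, outcome, max_lag=4)
print(optimal_lag(correlations))
```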
The DLNM is a modeling framework used to investigate associations with potentially non-linear and delayed effects in time-series data [28]. This methodology is based on the definition of a cross-basis, which is a function expressed by the combination of two sets of basis functions that specify the relationships in the dimensions of the predictor and the time lags, respectively. A second-order natural cubic spline for environmental factors, which generated a basis matrix of polynomials, was used for the non-linear effect and the lag effect. More flexible lag effects at shorter delays were obtained by placing spline knots at equal intervals in the range of the environmental variables and on the lag scale. Seasonality of malaria transmission was controlled for by including four degrees of freedom per year in the model, representing the bimodal malaria peak seasons in Uganda [29]. A health facility-specific random variable was added to the model to control for unmeasured differences between the facilities. The model was selected on the basis of the Quasi-Akaike Information Criterion (QAIC). The median value of each variable was defined as the baseline reference for calculating the IRR of the separate effect (in a specific lag month) and the cumulative effect (in all months preceding a specific lag month) on malaria incidence. All analyses were performed using R software version 3.6.0 with the "dlnm" and "lme4" packages. Statistical significance was determined using confidence intervals that did not include the null value of IRR = 1.0.
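Two building blocks of the lag dimension described above can be sketched in a few lines of Python. This is a deliberately reduced illustration, not the model fitted in the study: it omits the spline bases, the seasonality terms, the random effect, and the fitting itself, all of which the authors handled with R packages. It only shows how a lagged exposure matrix is laid out and how lag-specific IRRs combine into a cumulative IRR on the log scale. The function names and IRR values are hypothetical.

```python
from math import exp, log


def lag_matrix(series, max_lag):
    """Row for month t holds [x_t, x_(t-1), ..., x_(t-max_lag)].

    Months without a full lag history are dropped.
    """
    return [
        [series[t - lag] for lag in range(max_lag + 1)]
        for t in range(max_lag, len(series))
    ]


def cumulative_irr(lag_specific_irr):
    """Cumulative IRR up to each lag: exponential of the running sum of log IRRs."""
    cumulative = []
    running_log = 0.0
    for irr in lag_specific_irr:
        running_log += log(irr)
        cumulative.append(exp(running_log))
    return cumulative


# Hypothetical monthly covariate values and hypothetical lag-specific IRRs (lags 0-2).
print(lag_matrix([1, 2, 3, 4], max_lag=2))
print(cumulative_irr([2.0, 2.0, 1.0]))
```

Because the contributions add on the log scale, modest lag-specific IRRs can compound into a much larger cumulative IRR, which is one reason the cumulative estimates in such analyses tend to be larger than any single-lag estimate.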
A thousand simulations were run to rule out the possibility that the effects were solely due to multi-collinearity between temperature, rainfall, and vegetation cover, using the methodology proposed by Jose Barrera-Gómez and Xavier Basagaña in the "Collin" package in R [30]. The results are presented in the supplementary file Fig S2, and the findings suggest that explanations other than multi-collinearity are possible for this result.

Results
Summary data on longitudinal measures of malaria incidence and environmental covariates in high transmission settings of Uganda
Over the 24-month study period, the overall median monthly malaria incidence was 790 (range 73-3973) cases per 1000 person-years (PY), with the catchment area around Patongo health centre having the highest incidence at 1272 (176-3973) cases per 1000 PY and the area around Namokora health centre having the lowest at 337.5 (73-1238) cases per 1000 PY. The overall median temperature was 30.0 °C, with Padibe and Namokora health centres recording the highest temperatures (30.5 °C) and Lobule health centre recording the lowest (28.0 °C). The median monthly rainfall was 133.0 mm, with the highest estimates around Lalogi health centre (148.5 mm, range 8-214 mm) and the lowest around Padibe health centre (111.5 mm, range 6-227 mm). NDVI was highest at Lobule health centre (0.74) and lowest at Patongo health centre (0.61), with the median across all sites estimated at 0.66. Table 1 provides the details of the longitudinal measures of environmental variables at the study sites between January 2019 and December 2020.
Temporal trend and seasonality of malaria incidence and environmental covariates
Malaria incidence across all sites was highest in June 2019 (1344.5 cases per 1000 PY, range 713-2922) and lowest in April 2019 (239.5 cases per 1000 PY, range 103-1128), with a seasonal peak in incidence observed from April to September 2019 that accounted for 28.9% of the observed malaria incidence. Temporal changes in monthly malaria incidence over the 24-month observation period by MRC are presented in the supplementary file Fig S3. Correlation analysis revealed a positive relationship between temperature and malaria incidence at month lag 4 (0.452), and a negative correlation of both rainfall (-0.160) and NDVI (-0.454) with malaria incidence at month lag 4. Across MRCs, the correlation coefficients for temperature with malaria incidence were negative at month lag 1 and positive at month lag 4. This pattern was reversed for both rainfall and NDVI at month lags 1 and 4. In addition, the optimal lags for the correlations between environmental covariates and malaria incidence varied by site (Table 2). The results of the Granger causality tests indicated that the temporal distribution of malaria incidence was strongly affected by temperature, rainfall, and NDVI for all sites combined (Table 3).

Non-linear and lagged effects of environmental covariates on malaria incidence
Temperature
With all sites combined, the incidence rate ratio (IRR) of malaria increased at month lags 0-1 for temperatures of approximately 45-47 °C as compared to the median observed temperature (30.0 °C). A complete summary of the non-linear relationship between monthly temperature and malaria incidence over a four-month period is shown in part a of  (Table 4).

Rainfall
A summary of the non-linear relationship between monthly rainfall and malaria incidence over a four-month period is shown in part a of  (Table 4).

Discussion
The relationship between environmental covariates and malaria burden is complex, as the effect is not only determined in the current period but may also be influenced by preceding time points. This study investigated the quantitative relationship between environmental covariates and malaria incidence in high malaria transmission areas in Uganda. In these settings, temperature, rainfall, and NDVI significantly affected the temporal distribution of malaria incidence. High (greater than the observed median) temperature values increased the IRR of malaria significantly at month lag 4 and the cumulative IRR at month lags 1-4 as compared to the median observed temperature. Similarly, high rainfall increased the IRR of malaria significantly at month lag 0 and the cumulative IRR at month lags 1-4 as compared to the median observed rainfall. High values of NDVI increased the IRR of malaria at month lag 2 and the cumulative IRR significantly at month lags 2-4 as compared to the median observed NDVI.
Malaria control remains a priority in the national health agenda, requiring planning and efficient allocation of the limited resources available [31]. Efficient allocation of resources relies not only on current measures of malaria burden but also on prediction of future malaria burden. Surveillance data have been used to monitor trends in malaria burden and to visualize prior seasonal peaks in different transmission settings. The addition of place of residence to the routine surveillance data collection tool has enabled estimation of health facility catchment areas and generation of malaria incidence estimates, providing a direct measure of disease burden. Combining health facility surveillance data with environmental covariates such as rainfall, temperature, and vegetation coverage available through remote sensing sources may benefit malaria control efforts, as environmental covariates are reported to facilitate malaria transmission [32].
The relationship between environmental covariates and malaria incidence may form a strong basis for malaria early warning systems, as such prediction tools may guide the planning and control of malaria outbreaks. For instance, rainfall and sea surface temperature have been used for malaria early warning in Botswana, with the success of the malaria control program in reducing malaria incidence attributed to the early warnings [26]. Similarly, in South Africa, prediction of malaria based on seasonal climate forecasts showed that short-term predictions coincided closely with the observed malaria cases, which may also benefit the malaria early warning system [33]. In this study, high temperature increased the IRR of malaria at month lag 4. Temperature is a key parameter in mosquito development, biting, and survival; warmer temperatures increase infection rates as the vector reproduces faster, amplifying the likelihood of infection after a mosquito bite [34]. Although the specific effect of temperature on the IRR of malaria increased in month lag 2, the cumulative IRR increased significantly at month lags 1-4. The increased cumulative IRR could possibly be explained by the increased multiplication rate as global warming lengthens the mosquito breeding season [34]. The lagged effects of temperature, measured in months, would provide enough time to design interventions to interrupt malaria transmission, even though the temperature values used in the current study were high compared to the optimal temperature for malaria transmission of 29 °C [35]. Nevertheless, this finding was consistent with previous studies, which have demonstrated that temporal disease risk shifts in response to temperature changes and that an increase in maximum temperature significantly increases the incidence rate of malaria in the current month and later months [36][37][38].
The current study also found that high values of rainfall significantly increased the IRR of malaria at month lag 0 in these settings. Comparable to the specific rainfall effect, the cumulative IRR of malaria was increased significantly at month lags 1-4 at approximately 200 mm. Rainfall creates sites that facilitate mosquito breeding, suggesting that these areas retain water after rains, providing suitable places for mosquito reproduction and increasing the risk of malaria infection and transmission. Although not all mosquitoes need stagnant water, they require at least some form of water for their eggs to hatch, which increases the risk associated with preceding time points. The malaria IRR associated with preceding time points is then increased by the adult mosquitoes that emerge. This finding was consistent with earlier studies. For instance, a study conducted in Kenya showed positive associations between rainfall and malaria burden at lags of 2 to 4 months at rainfall of approximately 100-200 mm in both lowland and highland areas [39].
This study also found a signi . The current study has practical implications, as advance warning of conditions favourable to malaria epidemics will afford national malaria control programmes the leeway needed to stock the commodities required to deal with impending surges or epidemics.
This study has several limitations. First, as this was a population-level study of environmental covariates and malaria, it is possible that some confounders that may have influenced the results, such as socio-economic factors and community practices, were not considered [45,46]. Second, the data available were limited to a 24-month period, as data from previous years comprised only health facility counts of malaria cases rather than incidence, because catchment areas were not available. This limited the ability to control for long-term trends; such long-term trends in rainfall have been shown to influence malaria burden [47]. Third, this study was unable to encompass the entirety of environmental covariates; for instance, because altitude did not vary over time, it was not considered as a covariate in this analysis. However, adding a health facility random variable to the model accounted for site-specific variability. Fourth, the study was conducted around health facilities whose data are prone to missingness, which may have influenced the results. Only health facilities with less than 5 percent missing data on the village of residence for each month were included. Finally, the current study explored the associations between environmental covariates and malaria incidence in high transmission settings, and these findings may only be applicable and generalizable to such settings.

Conclusion
In the present study, high temperature increased the cumulative IRR of malaria significantly at month lags 1-4 compared to the observed median of 30 °C. High rainfall increased the IRR of malaria significantly at month lag 0 and the cumulative IRR at month lags 1-4 compared to the observed median of 133 mm. High NDVI increased the cumulative IRR significantly at month lags 2-4 compared to the observed median of 0.66.
The results highlight the relevance of incorporating the effects of environmental covariates on the cumulative IRR of malaria when developing early warning systems. The complex associations identified are useful for designing accurate strategies for early warning, prevention, and control of seasonal malaria epidemics.

Availability of data and material
The datasets used for this study are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Figure 1 a. Contour plots of the combined effect of time lags and temperature on the incidence risk ratio of malaria. b. Effect of specific temperature and time lags on the incidence risk ratio of malaria. The blue lines are the mean relative risks, and the gray lines are 95% CI. c. Effects of specific temperature and time lags on the cumulative incidence risk ratio of malaria. The red lines are the mean incidence risk ratio, and the gray areas are 95% CI.

Figure 2 a. Contour plots of the combined effect of time lags and rainfall amounts on the incidence risk ratio of malaria. b. Effect of specific rainfall amounts and time lags on the incidence risk ratio of malaria. The blue lines are the mean incidence risk ratio, and the gray lines are 95% CI. c. Effects of specific rainfall amounts and time lags on the cumulative incidence risk ratio of malaria. The red lines are the mean incidence risk ratio, and the gray areas are 95% CI.

Figure 3 a. Contour plots of the combined effect of time lags and normalized difference vegetation index (NDVI) on the incidence risk ratio of malaria. b. Effect of specific NDVI and time lags on the incidence risk ratio of malaria. The blue lines are the mean incidence risk ratio, and the gray lines are 95% CI. c. Effects of specific NDVI and time lags on the cumulative incidence risk ratio of malaria. The red lines are the mean incidence risk ratio, and the gray areas are 95% CI.
