Modelling heterogeneity in malaria transmission using large sparse spatio-temporal entomological data

Background Malaria transmission is measured using entomological inoculation rate (EIR), number of infective mosquito bites/person/unit time. Understanding heterogeneity of malaria transmission has been difficult due to a lack of appropriate data. A comprehensive entomological database compiled by the Malaria Transmission Intensity and Mortality Burden across Africa (MTIMBA) project (2001–2004) at several sites is the most suitable dataset for studying malaria transmission–mortality relations. The data are sparse and large, with small-scale spatial–temporal variation. Objective This work demonstrates a rigorous approach for analysing large and highly variable entomological data for the study of malaria transmission heterogeneity, measured by EIR, within the Rufiji Demographic Surveillance System (DSS), MTIMBA project site in Tanzania. Design Bayesian geostatistical binomial and negative binomial models with zero inflation were fitted for sporozoite rates (SRs) and mosquito density, respectively. The spatial process was approximated from a subset of locations. The models were adjusted for environmental effects, seasonality and temporal correlations and assessed based on their predictive ability. EIR was calculated using model-based predictions of SR and density. Results Malaria transmission was mostly influenced by rain and temperature, which significantly reduces the probability of observing zero mosquitoes. High transmission was observed at the onset of heavy rains. Transmission intensity reduced significantly during Year 2 and 3, contrary to the Year 1, pronouncing high seasonality and spatial variability. The southern part of the DSS showed high transmission throughout the years. A spatial shift of transmission intensity was observed where an increase in households with very low transmission intensity and significant reduction of locations with high transmission were observed over time. Over 68 and 85% of the locations selected for validation for SR and density, respectively, were correctly predicted within 95% credible interval indicating good performance of the models. Conclusion Methodology introduced here has the potential for efficient assessment of the contribution of malaria transmission in mortality and monitoring performance of control and intervention strategies.

(2Á4). Up to 10-fold variations in transmission intensity have been observed within very small localities due to geographical, biological or socio-economic factors (5Á8). Understanding the heterogeneity in transmission and human exposure to malaria infection is critical for optimizing control programs and targeting interventions (9Á12).
Malaria disease burden and transmission can be assessed using incidence or prevalence in human hosts. However, the entomological inoculation rate (EIR) most directly quantifies the exposure of the human population to the infectious stages of the parasite (12Á16). EIR is the product of the human-biting rate, for example, mosquito bites/person/night (which can also be estimated using mosquito density) and the sporozoite rate (SR), which is the proportion of infective mosquitoes (7,17). The measure expresses the average number of infective bites a person receives in a specified unit of time. It can be also used to predict other measures of transmission, which are used to evaluate effectiveness of malaria control program (5,18). Uncertainty due to small sample, low values and variability in the SR and cost complicate precise estimation of EIR requiring standardized entomological surveys conducted over large areas (5,6,13,14). Accurate estimation of EIR requires longitudinal surveys within the study area to take into account spatiotemporal variations and seasonality trends. However, there is a paucity of this type of data due to cost and resources needed to collect them (19Á21).
The Malaria Transmission Intensity and Mortality Burden across Africa (MTIMBA) project was initiated by the INDEPTH Network (22,23) and conducted over a period of 2001Á2004 in several countries in Africa including Tanzania, Kenya, Mozambique, Senegal, Ghana and Burkina Faso. The main objective of the initiative was to assess the relation between the intensity of malaria transmission and all-cause as well as malaria-specific mortality across Africa, taking into account the influence of malaria control activities. The MTIMBA entomological data have been collected fortnightly over large number of locations (households) and to date this is the only available entomological database appropriate to study spaceÁtime heterogeneity of malaria transmission in Africa. These data are sparse with seasonal variations and spatio-temporal correlations. High dependence of climate, environment and ecological factors in the life of mosquito and seasonality any of the survey locations had zero mosquitoes or proportion of infected ones. In standard modelling approaches, EIR is treated as a continuous outcome, logarithmically transformed to fulfil the assumption of normality (21, 24Á27). However, when EIR is estimated as a product of the SR and mosquito density, which are generated from the binomial and a count distribution like Poisson or negative binomial, respectively, normality assumptions are void. To our knowledge, Kasasa et al. (28) is the only literature report analysis of EIR data considering the two sources of data separately. In addition, due to the amount of zeros which is larger than what can be generated by the standard distributions, the data are over/under dispersed and zero inflated (21, 29Á32). Statistical analysis which accounts for these characteristics is essential to obtain unbiased estimates for the regression coefficients (33Á36).
Moreover, the MTIMBA-EIR data have been collected at fixed locations and they are typically geostatistical data. Similar exposures of environmental and climatic conditions to locations which are geographically close introduce spatial correlation between them. Geostatistical models take into account spatial correlation by introducing location-specific random effects as latent observations from a multivariate spatial Gaussian process (37). Spatial correlation between any pair of locations is often considered as a function of distance on the covariance matrix of the process. These models have a large number of parameters. Bayesian formulations (38) allow model fit via Markov Chain Monte Carlo (MCMC) simulation methods (39). However, the estimation process involves covariance matrix computations which are infeasible when the number of locations is too large (40,41). A computational flexible way to overcome this problem is the approximation of the spatial process from a subset of locations using properties of conditional multivariate Gaussian distribution of the process (40Á42). Most of these techniques have been applied in simulated data, observed in regular grid and mainly with Gaussian characteristics. In this study, selection of subset of locations is implemented using methods proposed in our previous work (40,43).
We now demonstrate a rigorous modelling way of analysing large spatio-temporal EIR data and study the heterogeneity, space and temporal patterns of malaria transmission within one MTIMBA site, the Rufiji DSS area in Tanzania (44). The Gaussian process approximation proposed by Banerjee et al. (40) is applied to binomial (SRs) and negative binomial (density) data with zero inflation. The models are fitted using Bayesian MCMC simulation and assessed on the basis of their predictive ability. Model-based predictions of SR and density were multiplied to compute EIR. Model formulation details are given in the methodology section and selected results are presented afterwards. The discussion and conclusion of the findings consider the implications for timing and allocation of resources for malaria interventions.

Study site
The study utilized data collected from one of the MTIMBA sites in Tanzania, the Rufiji DSS (RDSS). people, which is about 47% of the total population of the Rufiji District (INDEPTH Monogram). Rufiji District has an overall mean altitude of B500 metres. Its vegetation is mainly formed of tropical forests and grassland. The district has hot weather throughout the year and two rainy seasons: short rains (OctoberÁ December) and long rains (FebruaryÁMay). The average annual precipitation in the district is between 800 and 1,000 mm. A prominent feature in the District is the Rufiji River with its large flood plain and delta, the most extensive in the country (INDEPTH Monogram; Rufiji DSS Profile, 2000). The majority of the people in the Rufiji District are subsistence farmers.
The main responsible malaria vectors in the area include A. funestus, and members of the A. gambiae complex, including A. gambiae (sensu stricto) and A. arabiensis. Mosquito populations usually peak during the rain seasons especially in areas where rice cultivation is taking place and during the dry months, a high population was usually observed in areas with permanent water bodies (23).

Mosquito data
The entomological data were collected for the period of 3 years, October 2001ÁSeptember 2004 (Source: http:// www.indepth-network.org/dss_site_profiles/rufiji.pdf). The MTIMBA entomological protocol has been well defined in MTIMBA documentation (unpublished). In a snapshot, mosquitoes were captured at least twice every month using Centers for Disease Control (CDC) miniature light traps. The human population in the RDSS was classified into geographical clusters (100Á1,000 people), then for each round a simple random sampling (without replacement) was employed within clusters to select between 20 and 100 'index' people (households) for the set-up of mosquito catches (traps). The traps were fitted indoors with incandescent bulbs and laid close to a human volunteer (randomly selected from members of the household) sleeping under an untreated bednet. Light traps operated from sundown to sunrise (i.e. 6 pmÁ 6 am) for two consecutive nights in each household and bags were emptied every morning. A total of 2,479 unique locations (households) involved were geo-referenced. Collected mosquitoes were counted and sorted into vector species to allow for separate assessment of transmission intensity.

Environmental and climatic data
Remote sensing data were extracted from different sources with different spatial, Sp R , and temporal, T R , resolutions. These include normalized difference vegetation index (NDVI) (Sp R : 250 m 2 ; T R : 16 days; Source: MODIS), day and night temperature (Sp R : 1 km 2 ; T R : 8 days; Source: MODIS), rainfall (Sp R : 8 km 2 ; T R : 10 days; Source: ADDS) and distance to the nearest water bodies (Sp R : 1 km 2 ; Source: Health Mapper).

Statistical analysis
Geostatistical zero inflated negative binomial and logistic regression models were fitted on the mosquito density and SR data, respectively. The models accounted for the effect of environmental and climatic predictors, annual trends, seasonal patterns, and spatial and temporal correlations. The predictive process was used to approximate the spatial process using a subset of locations. Model-based prediction of SR and density were multiplied to obtain estimates of monthly and annual EIR. Details of the model formulation and its implementation are described in the subsections below. Programs used for this analysis are available via contact with the corresponding author.

Model formulation for density data
Let Y it be the number of female mosquitoes and X ð1Þ it be a vector of environmental predictors (extracted from satellite data) observed at location s i , i 01,. . .,n, and calendar month t01,. . .,36 for a specific species. Y it is assumed to follow a negative binomial distribution, Y it ÂNB(r, p it ), where p it 0r/(r'm it ). r is an over-dispersion parameter and m it is the mean mosquito density. Covariates X it , seasonal trends f(t) (1) , spatial U i (1) , temporal o t (1) 0 (e 1 ,e 2 ,. . .,e t ) and non-spatial f i (1) random effects are introduced on the log scale of the mean count via the equation is a residual error term capturing the remaining variability in the data. f(t) (1) is modelled via trigonometric function with a mixture of cycle, C where T c is the period of the season for cycle C (i.e. T 1 012 and T 2 06) and d ð1Þ 1c and d ð1Þ 2c are regression parameters used to describe the amplitude and phase within a period (45,46). Separate models were fitted assuming: (i) a constant seasonal pattern across the 3 years of the study by taking t01,. . .,12; or (ii) a continuous time for the entire study period by taking t01,. . .,36. The seasonal pattern considering dry/wet categorization of the data was also assessed. A zero inflated model formulation was adopted to take into account the excess zeros in the count data. The model is defined as a mixture of a degenerate distribution with mass at zero and a non-degenerate count distribution. The log-likelihood is therefore a sum of the log-likelihood for the non-zero and the zero counts. The distribution of the data is now defined as: where p* is the probability for a count to arise from the zero mass and 1Áp* is the probability to observe a sample from a count distribution (i.e. p(yNu)NB for our case, and u is the vector of parameters associated with the distribution). This probability can be assigned a value between 0 and 1, usually approximates the proportion of zero counts in the sample or can be a function of covariates similar or different from those used in the full model (21,33,34,47). Involving possible sources of zero inflation (e.g. covariates) reduces bias in parameter estimation of p* and other sources of uncertainty. In our case p* is modelled with a logit link as a function of all climatic predictors X + i observed at location s i , i.e.
where a is the corresponding vector of regression coefficients.
Bayesian model formulation requires the specification of prior distributions for all unknown parameters. For the regression coefficients, b (1) , d (1) and a, a standard non-informative uniform prior is adopted, i.e. b (1) Â Unif(Á,), d (1) ÂUnif(Á,) and aÂUnif(Á,), respectively. The latent observations U ð1Þ i introduced at each location s i are assumed to be derived from a multivariate normal distribution with a covariance matrix R ð1Þ nxn , i.e. U ð1Þ ÂMVNð0; R ð1Þ nxn Þ. The S (1) is a matrix with elements R ð1Þ ij and quantify the covariance Cov(U i ,U j ) between the pair of locations s i and s j , respectively. Its distribution defines the Gaussian spatial process. Under the assumption of stationarity, the spatial correlation is taken to be a function of distance between locations. An exponential correlation structure for the covariance matrix of the spatial process is adopted, that is R sp is the spatial variance, d ij is the distance between locations s i and s j and r (1) measuring the correlation decay and also known as the effective range (3/r (1) ) and estimates the distance where the spatial correlation is B5%. The decay parameter r (1) assumed to follow a gamma distribution.
Computation of the Gaussian process requires the inversion of the covariance matrix, S (1) , which for a very large number of locations is not feasible. To enable model fit we approximate the spatial process by a subset of locations, knots, {s i *,i 01,. . .,m} (m BBn) with latent observations U *(1) 0(U(s 1 *),. . .,U(s m *)) T . U *(1) is considered to arise from the same Gaussian process as U (1) and thus U *(1) ÂN(0,S*), where S * is the mxm covariance matrix of the sub-process. These latent observations U of the original process can be approximated by the 'predictions' of the sub-process via the mean of Gaussian conditional dis- Þ is an mxn matrix of the covariance functions between the full and the sub-process (48,49). Selection of subset of location was done using the minimax space filling design implemented in R software (50). The approach optimizes the selection of the best subset by minimizing the maximum of the nearest-neighbour distance between the original survey and the subset locations.
The e ð1Þ t model temporal correlation via a stationary autoregressive process of order one, i.e. e 1 Â Normalð0; r 2 ð1Þ T =ð1 À c 2 ÞÞ and e t je 1,.
where c ð1Þ is an autocorrelation parameter jc ð1Þ jB1 which adopts a bounded uniform distribution, c ð1Þ $ Unif½À1; 1 and r 2 ð1Þ T is the temporal error (51). The / ð1Þ i is assumed to follow a normal distribution with mean zero and a homoscedastic variance r 2 ð1Þ e . Inverse gamma priors are adopted for the variance parameters r 2 ð1Þ sp , r 2 ð1Þ T and r 2 ð1Þ e .

Model for SR
Let N it and Z it be the number of mosquitoes tested and number infected, respectively at location s i and calendar month t. Z it is assumed to arise from a binomial distribution, Z it ÂBin(N it ,p it ) , where p it measures the SR at location s i and time t. The regression function links the SR with other terms of the model (as shown for the density data) and is given as logit i . A similar specification described for the density model is followed in this model.

Data management and environmental lags
To facilitate the assessment of the seasonal pattern, data were summarized by location and calendar month. That implies that all repeated surveys from a specific location within the same month were collapsed (sum of mosquito density/tested and positive) to a single observation.
To account for the environmental-lag-effect on mosquito density or SR, non-spatial (negative) binomial models (with/without zero inflation) were fitted and best lags were assessed. Lags refer to a climate/ environment value at different time intervals prior to the study date that might influence the amount of mosquitoes collected or the SR. Lags considered include the current month (month of collection of mosquitoes); 1/2/3 month(s) prior to the collection; average of current and one previous month; average of one and two previous months; and lastly average of current, one and two previous months. The analysis took into account seasonality, distance from water bodies and time (annual effect) which was incorporated as a binary variable indicating the year of study. Analysis was conducted separately for each species. Fitted values from models with all possible combinations of the environmental lags were calculated and plotted against the observed values (mosquito counts or SR). The combination which best fits the data was used for further analysis. This was implemented in STATA 10 (Stata Corps).

Model validation and prediction
Models were fitted using a training set (85% of the data) randomly selected from the entire data. Validation of the model performance was done on the test locations (the remaining 15% of the data) The predictive ability of the model was assessed by specifically calculate different credible intervals with different probability coverage of the posterior predictive distribution and compare the percentage of test locations correctly predicted within these credible intervals (52). The best predictive ability of the model is observed when higher the number of test locations falls within the narrowest credible interval. The predicted power of the model at 95% credible interval is reported.
Using the estimates obtained from the models, SR and mosquito density were predicted for the whole Rufiji site. The prediction was done at the 250 m resolution.

Calculation of EIR
The EIR can be estimated as a product of the SR and human-biting rate. Depending on the mosquito collection method used (human landing, light trap, etc.), the human-biting rate can be correctly approximated either by the number of blood meals taken on humans/mosquito/ day or by the mosquito density. Established correlation between number of mosquitoes captured by light traps and human landing catches is usually used to adjust light trap collection to equivalence of biting catches and avoid collection bias (53). For this study, EIR was calculated as a product of SR and mosquito density and then adjusted using a correction factor of 1.605 to calibrate estimates obtained from light trap collection (28,53,54).
At a specific pixel j and month t the predicted values of SR,p jt and mosquito density,l jt were obtained for A. funestus and A. gambiae species. EIR estimates representing the infectious bite/person/day were calculated as: EÎR jt ¼ 1:605 Ãp jt af Ãl jt af þp jt ag Ãl jt ag where 1.605 is the correction factor. The EÎR jt was then multiplied by 30.5 and 365 to obtain monthly and annual estimates, respectively. Monthly and annual maps were produced to show seasonal and temporal trends of the transmission.

Geostatistical model implementation
The final model was implemented in OpenBUGS and parameters were estimated using the Gibbs sampler MCMC algorithm. The spatial variance parameter was sampled directly from its inverse gamma full conditional distributions using Gibbs sampling (39). The remaining parameters were simulated using Metropolis algorithm with a normal proposal distribution. The mean of the proposal distribution was the parameter estimated from the previous iteration with a fixed variance (55,56). Two separate chains were run in parallel with a total of 150,000 iterations each. A burn-in of 20,000 iterations was done and the last 5,000 and 1,000 samples were used for posterior inference and prediction, respectively. The Gelman-Rubin model diagnostic tool (57)     collected in the period of JulyÁSeptember. The number of A. gambiae collected was higher during the heavy rains while short rains with high temperature favour the population of A. funestus (Fig. 1). Table 1 summarizes the results of parameter estimations from a multivariate geostatistical models on SRs and mosquito density. The effect of environmental variables differs significantly between species. Rain and temperature are the most influencing factors for density and sporozoite with higher effect on the A. funestus species. No significant effect of distance to the water bodies was obtained. highly pronounced with a significant decrease of mosquito population in Year 2 as compared to Year 1 and later an increase in the Year 3 as compared to Year 2. Spatial ranges are quite high especially for the SRs. The estimate of the overdispersion parameter of A. funestus is twice as large as that of A. gambiae which could be influenced by the amount of zero counts in the data. However, the estimate of r is larger than 1 indicating that the data are not highly overdispersed (58). Day temperature significantly reduces the probability of observing zero mosquito counts. Spatial variability accounts more for the total variability in the data as compared to the non-spatial and temporal variability.

Geostatistical model results
For a total of 63, 99, 368 and 368 test locations selected for validation of SR-AF, SR-AG, Density-AF and Density-AG models respectively, 68.3, 63.6, 84.1 and 89.9% of the locations were correctly predicted within 95% credible interval. Gelman-Rubin diagnostics indicated good convergence of all model parameters. Figure 2 presents selected EIR maps for the Rufiji DSS site for the A. funestus and A. gambiae.

Mapping of EIR
The southern part of the DSS showed high transmission throughout the years. High transmission was observed immediately at the onset of rains, especially during the heavy rain period. At the end of the rainy season (MayÁJune), the transmission spread throughout the region (Fig. 2).
In Fig. 3, monthly time series (median) predicted EIR are plotted for the entire study period. Attributes of each species are also indicated.
The transmission starts peaking in the month of April (just after rains) and gradually drops in July (first year of the study). There was a reduction in the second year of the study and EIR increased again during the last year. A similar monthly trend is observed across years, which emphasizes seasonality. A. funestus are more prominent during the dry months while A. gambiae are more prominent during the rainy periods. The spatial temporal distribution of year-by-year EIR is shown in Fig. 4 with maps of prediction error. The prediction error for the EIR estimates was obtained my multiplying the prediction errors obtained from SR and density models.
Patterns in Fig. 4 show that few surveyed households are located in areas with EIR B1; however, a large proportion of household presented high transmission intensity. Higher prediction errors are seen in areas with few surveyed locations. The errors also capture the effect of heterogeneity arising from unmeasured factors.

Population-adjusted EIR
The annual and species-specific population-adjusted EIR were calculated by averaging predicted inoculation rates at all households (N 014,516) within the RDSS (Fig. 5) excluding all of the other pixels. Results are presented in Table 2.   Overall transmission intensity reduced significantly during Year 2 and 3 as compared to Year 1 of the study. A. funestus was the main responsible vector for transmission in the first (68%) and second (78%) year, while the last year transmission was mainly driven by A. gambiae (63%).

Months
In addition, we assessed the spatial shift (distribution) of transmission intensity over time, as illustrated in Table  3. EIR were categorized into five transmission intensities which were: no transmission (EIR 00), very low (EIR ] 0.0Á1), low (EIR ]1Á10), average (EIR ]10Á100), and high (EIR]100). The change in the percentage of households exposed to a specific level of transmission was then studied.
The proportion of households predicted with very low transmission intensity increased between the first year and the third year of the study, from 4.0 to 7.2%. A significant reduction (over 68%) of locations with high transmission is seen during the last year of the study (i.e. 12.6% in the first year to 4% in the third year).

Discussion
In this study, we assessed spatialÁtemporal variation and heterogeneity of malaria transmission in the Rufiji DSS site using a large geo-referenced biweekly entomological dataset collected over 3 years, and rigorous Bayesian geostatistical models. Our work is amongst the few to address spatial modelling of Entomology inoculation rate (EIR) based on sparse data by applying current Bayesian methodologies approximating spatial processes for large data. The INDEPTH-MTIMBA data, which was used in our application, is the most comprehensive entomological database in Africa. Bayesian spatio-temporal binomial and zero inflated negative binomial regression models were developed to produce monthly maps of EIR taking into account the malariaÁclimate relation and seasonality in transmission (35, 36, 59Á61).
Geostatistical models have been widely used in malaria mapping in recent years (3, 38, 52, 62Á64). Most of these analysis involved standard geostatistical models which are relevant for a moderate number of locations. Computation involved in these models is not feasible for data collected over a large number of survey locations. In this study, we used methods proposed by Barnejee et al. (40) and Finley et al. (42) to approximate the spatial process using a subset of survey locations selected via space filling design implemented in R software. Additive temporal correlations with autoregressive structure were also incorporated in all models. The predictive power of the model suggests good performance of the spatial correlation approximated from a subset of observed location. That might indicate that the subset selected was significantly appropriate. This work adds to the few in literature that indirect evaluates performance of using subsets to approximate the spatial process in real-life field data.
Changes in climate conditions, natural inhabitants and other human activities, which depend on the environment, alter the intensity of malaria transmission (21,65). Our results depict temporal and seasonal variation in EIR along the study period and study area. Transmission was higher during the rainy periods with high temperatures and very low during the dry season or year. Two species A. funestus and A. gambiae are mainly responsible for malaria transmission in this region. Differences on the effect of environmental factors on the mosquito abundance and SRs of the species were observed. The population of A. gambiae increases at the onset of heavy rains while that of A. funestus peaks during the short rains season. Similar results have been reported in the Kilombero valley and other areas with similar climate in Africa and are associated with the preferential conditions of breeding sites of these species (13, 16, 66Á70). A study, which assessed spatio-temporal variation of EIR in Navrongo DSS, showed similar patterns of seasonality  that differed by species (28). Highly significant effects of temperature on the SR and density of A. funestus were observed. Contrary to A. gambiae which has relatively exophilic behaviour, this species is strictly endophilic, which could facilitate choice of conducive a resting environment favouring the gonotrophic cycle resulting in higher survival and hence longer infectivity (71Á73). Knowledge of these characteristics can be important for understanding disease dynamics and for efficient implementation of interventions (5,6,66,74). There was considerable variation over short distances in the intensity of transmission. Small-scale variations in malaria transmission are common in sub-Saharan Africa and create complexity in implementing strategies to combat malaria (8, 28, 59, 75Á77). The spatial correlation was still present over a substantial distance and the spatial variation comprised of about 90% of the total data variance. The spatial correlation arises partly due to spatial pattern in environmental drivers of transmission, partly due to effects of limited mosquito dispersion, and is also affected by human factors such as migration and human population densities (41,42). We had an abundance of data on both mosquito and human populations; however, due to the relative small DSS area, it is difficult to separate the contributions of these different factors to the spatial correlation, which explains the higher spatial range. Such heterogeneity arising from unmeasured factors is captured by the prediction errors.
The methodology described in this study allows estimation of EIR while adjusting for both, temporal and small area spatial variations in a systematic and thorough manner. It acknowledges key characteristic of the data, considers computation difficulties and correlation among potential drivers of malaria transmission. It could be seen that the crude EIR were underestimated as compared to model-based estimations by over 55%. This underlines the importance of utilizing efficient methodologies while estimating epidemiological parameters to allow for proper decisions.
Our formulation allows further expansion and easy incorporation of other covariates in the main structure of the model either as specific covariates or their interaction. The complex component of our proposal is how to separately model SR and density data, incorporate seasonality, choosing environment lags and lastly approximate the spatial correlation when the large number of location has been observed. All these have been worked on. Moreover, DSS sites including Rufiji, collected comprehensive records of all-cause and disease mortality in the human population at the time of this entomological surveillance. The exposure surfaces estimated using this approach can be linked to mortality data to assess the malaria-specific mortality burden. Through that, much more accurate estimates of the benefits to be gained by reducing malaria transmission can be estimated than if it would have been possible from analyses that aggregate EIR over large areas and time periods or those fitted assuming normally distributed EIR.