Deriving risk maps from epidemiological models of vector borne diseases: State-of-the-art and suggestions for best practice

Epidemiological models (EMs) are widely used to predict the temporal outbreak risk of vector-borne diseases (VBDs). EMs typically use the basic reproduction number (R0), a threshold quantity, to indicate risk. To provide an overall view of the risk, these model outputs can be transformed into spatial risk maps, using various aggregation methods (e.g. average R0 over time, cumulative number of days with R0 > 1). However, there is no standardized methodology available for this. Depending on the specific aggregation methods used, the yielded spatial risk maps may have considerably different interpretations. Additionally, the method used to visualize the aggregated data also affects the perceived spatial patterns. In this review, we compare commonly used aggregation and visualization methods and discuss the respective interpretation of risk maps. Research publications using epidemiological modelling methods were drawn from Web of Science. Only publications containing maps of R0 transformed from EMs were considered for the analysis. An example EM was applied to illustrate how aggregation and visualization methods affect the final presentations of risk maps. Risk maps can be generated to show duration, intensity and spatio-temporal dynamics of potential outbreak risk of VBDs. We show that 1) different temporal aggregation methods lead to different interpretations; 2) similar spatial patterns do not necessarily bear the same meaning; 3) visualization methods considerably affect how results are perceived, and thus should be applied with caution. We recommend mapping both intensity and duration of the VBD outbreak risk, using small time-steps to show spatio-temporal dynamics when possible.


Vector-borne diseases and risk mapping approaches
Vector-borne diseases (VBDs) have become a huge concern for public health, causing more than 700,000 deaths each year (WHO, 2017). Their vectors include different groups of arthropods such as mosquitoes, ticks, or sandflies, while the hosts are typically vertebrate animals including humans. Many vector-borne pathogens have been limited to specific areas, e.g. malaria epidemics in Africa (Taffese et al., 2018). With global change, however, vector-borne pathogens can increasingly invade naïve populations of vectors and hosts in formerly disease-free areas, causing disease outbreaks and leading to new challenges in disease surveillance and control (Caminade et al., 2019;Gage et al., 2008;Myers et al., 2000;Tatem et al., 2006). International air travel is a major contributor to the global spread of pathogens. International cargo transportation facilitates the transport of vectors to regions that were previously not reachable due to dispersal limitations (Thomas et al., 2014), which now poses threats through the introduction of related pathogens. Climate change impacts the transmission of pathogens, e.g. by affecting the length of the transmission season, or by enabling diseases to emerge in areas where they could not survive before (ECDC, 2019;WHO, 2017). Consequently, the chances of VBDs emerging in novel places are increasing. To better monitor and control VBDs, it is essential to know where and when a VBD can potentially occur; and if possible, how severe the situation is.
Regarding these questions, a variety of modelling approaches have been introduced and applied. Generally, current modelling studies can be categorized by their use of either correlative or process-based models, either of which has pros and cons . The spatial aspects of risk are typically modelled using correlative ecological niche models (also known as species distribution models) based on species (vectors, virus, infected vectors or hosts) occurrence records and respective explanatory variables describing their environment (Escobar and Craft, 2016;Peterson, 2014). The explanatory variables can include a broad range of factors such as bioclimatic variables, land use, and any other variables that may affect the species' spatial distribution patterns. The output of an ecological niche model is a map of the potential spatial distribution of the target species. As the explanatory variables are typically based on long-term (e.g. yearly or decadal) average data, ecological niche models are mainly applied for gaining estimates of long-term (decadal) trends in risk for large (continental) areas. Ecological niche models are well-accepted for their good performance on spatial distribution mapping. However, correlative models typically do not resolve the short-term dynamics of real-world outbreaks. This is where the process-based epidemiological models discussed in this paper promise an advantage.
Epidemiological models (EMs, also referred to as mechanistic or mathematical models, compartmental epidemic models or compartmental models (Siettos and Russo, 2013)) aim to identify risks at fine temporal resolution at a specific location. EMs can assess the course of an outbreak by simulating the processes that drive the transmission cycle of the disease. EMs do not require occurrence records of the investigated disease or pathogen when constructing the model. Instead, they are dependent on detailed knowledge of the respective processes upon which they are based. One major environmental variable that often drives these processes is ambient temperature .
The temporal resolution of an EM depends on the available data for temperature or other key drivers (Calistri et al., 2016). It can be annual (Wu et al., 2013), monthly (Hartemink et al., 2009), biweekly (Hartley et al., 2012), or even finer. Spatially, these models are often limited to a specific place or region where processes are well understood (Du et al., 2017;Ferguson et al., 2016). Nevertheless, EMs are increasingly used to estimate spatial aspects of risk as well . In this study, we review how values of R 0 obtained from EMs can be transformed into meaningful maps. We demonstrate that the interpretation of such maps differs depending on the underlying model algorithm and the chosen method of aggregation. The visualization methods need to be chosen carefully as well.

The basic reproduction number (R 0 ): a threshold parameter
The basic reproduction number (R 0 ) is a characteristic value for epidemiological assessments (Hethcote, 2000;Ma and Li, 2009;Smith et al., 2014). It was originally defined as the expected number of secondary cases caused by a single infected individual (host) during its lifetime within a susceptible population (Diekmann et al., 1990;Dietz, 1993;Liu et al., 2018). The investigated disease can spread when R 0 > 1; otherwise, it cannot (Diekmann et al., 1990;Hethcote, 2000;Liu et al., 2018;Smith et al., 2014). For an infectious disease, R 0 can be used to determine the herd immunity threshold that is required to prevent it from further spreading (Dietz, 1993;Guerra et al., 2017;Massad et al., 2001). R 0 has also been used as an indicator for the spreading speed of diseases (Ridenhour et al., 2014). R 0 can be measured directly through disease surveillance systems (Marques et al., 1994) or estimated via EMs (Delamater et al., 2019;Ridenhour et al., 2018;Siettos and Russo, 2013). For VBDs, however, R 0 is difficult to assess directly due to the significant workload required to conduct both monitoring and surveillance and to determine the complexity of transmission patterns of VBDs (Delamater et al., 2019). Instead, estimates of R 0 for VBDs are often based on EMs. These estimates are generally also called R 0 , although it must be noted that they may differ considerably from the original definition in several aspects (Delamater et al., 2019;Li et al., 2011). While they all share the threshold at R 0 = 1, absolute values of R 0 are not comparable between different types of EM (Dietz, 1993;Heffernan et al., 2005;Li et al., 2011;Yang, 2014). For instance, values of R 0 calculated for a VBD with the popular next generation matrix (NGM) approach represent the mean number of infections in both vectors and hosts per generation (Martcheva, 2015). This is different from the original definition of R 0 that refers to the number of individuals (hosts) that get infected by a single infected host throughout its life time. Only the survival function approach reliably calculates values of R 0 that are consistent with this definition (Li et al., 2011).

Generating spatial risk maps from epidemiological models: temporal aggregation and visualization
For the calculation of R 0 for VBDs, EMs typically use time series of temperature data for specific locations, which are acquired at daily resolution from weather stations (Cadar et al., 2017;Cheng et al., 2018;Hartemink et al., 2009;Holy et al., 2011;Mordecai et al., 2017;Racloz et al., 2008). This enables capturing of weather extremes and assessment of temporal risk fluctuations. To assess the spatial aspects of risk, time series of point-wise weather station data can be replaced with time series of spatial temperature data representing the weather conditions across the whole study area (typically stored as geographical raster data). The series of spatial R 0 maps derived from this provides spatio-temporal predictions of transmission risk. Since this can result in an unpractically large number of figures (e.g. 365 maps for one year of daily-resolution data), some form of temporal aggregation of the series of R 0 maps is required after modelling. Careful attention must be given during this process, as the method for temporal aggregation of raw EM outputs affects the interpretation and usefulness of the final figure.
In addition to temporal data aggregation methods, visualization methods affect the final presentation of risk maps as well. After all, as every map necessarily is a distorted model of the real world (Monmonier, 1991), decisions on how to distort reality for the benefit of the viewer should be made consciously and coherently. The first thing to consider here is whether the raw (temporally aggregated) data should be displayed as an unclassed map, where each data value corresponds to a unique point on a continuous color gradient. The alternative is a classed map, where the range of raw data values is broken up into a limited number of groups or classes, each of them represented by a unique color. How many classes to use and how to define them are necessary follow-up questions in this case (Slocum et al., 2009). Finally, an appropriate color scheme needs to be found (Brewer, 1994).
In this study, we investigate how the outputs of R 0 -based EMs have been transformed into spatial risk maps in the literature. Commonly used methods are identified and illustrated using an established NGMbased EM as an example. We evaluate the strengths and weaknesses of the different approaches and make recommendations for useful methods for both temporal aggregation and informative visualization schemes. With these analyses, we raise awareness and contribute to the development of a more standardized methodology for future usage.

Methods
A systematic literature search was performed via Web of Science (Clarivate Analytics, 2019) topic searching, using core collections only (last accessed on January 17 th , 2020). Two sets of intentionally broad keyword combinations were applied: 1) "vector AND borne AND basic AND reproduct*" and 2) "basic AND reproduct* AND climate AND change AND (risk OR map OR spatial OR distribut* OR transmit*)", resulting in 457 publications (Fig. 1).
The publications were filtered via the following criteria: they should be 1) original research articles (as reviews and book chapters may cause double counting), 2) written in English and 3) include spatial risk maps (showing R 0 ) that were generated from process-based EMs (Fig. 1).
Among these 457 publications yielded from the primary search, 416 were of article type and written in English. Among them, 46 articles were identified as off-topic based on title or abstract. The remaining 370 articles were downloaded for manual full-text screening for the presence of maps representing R 0. Manual screening was chosen as our primary selection method for the simple reason that maps are difficult to find through search strings but easily recognizable for the human eye. This procedure yielded seventeen articles.
To broaden the search even more, the literature referenced by these 17 research articles as well as the literature citing them was screened and tracked manually, resulting in nine more articles that fit the inclusion criteria. Together with one more article suggested during peerreview, a total number of 27 research articles was analyzed in this review.
The publications were analyzed according to the following details: 1) the temporal aggregation method applied to summarize raw R 0 values over time; 2) the visualization method used to display the aggregated model output data; 3) the method used for estimating R 0 (i.e. NGM or probabilistic); 4) the data source (e.g. time series of daily temperature data or satellite images) that was used by the EM; and 5) the specific VBD investigated together with the respective study area.
To illustrate the differences among the recorded aggregation and visualization methods, these methods were demonstrated with an NGMbased EM developed by Rubel et al. (Rubel et al., 2008) for Usutu virus and previously used for a cross-discipline model comparison (Cheng et al., 2018). The model was run using rastered 0.25 • resolution daily temperature data for 2018 from the E-OBS data set (Haylock et al., 2008). With the same original EM output, a series of spatial risk maps were generated using different spatial transformation methods. More details about the NGM-based EM applied are provided in Supplement S1.

Summary of recorded methods for temporal aggregation and visualization
Overall, the NGM method was applied in 19 studies, whereas the remaining 8 studies followed a probabilistic approach (see Supplement S2). From the 27 research papers analyzed in this review, two types of temporal aggregation methods for the creation of maps from R 0 values were recorded (Fig. 2a): the averaged intensity (mean values of R 0 over Fig. 1. Flowchart of the article selection process concerning spatio-temporal risk maps produced with epidemiological models. First, only articles published in peer-reviewed scientific journals were considered. Articles that were not written in English and those where screening of abstract and/or title indicated that they were outside the scope of this review were removed. The remaining articles were screened visually for the existence of R 0 -based maps in the full-text downloads, resulting in 17 articles fitting the inclusion criteria. The same procedure was then applied to the literature referencing/referenced in these articles, revealing another nine relevant literature records.

Fig. 2.
Overview of the studies using epidemiological models to produce spatial risk maps for vector-borne diseases. a) Studies grouped by the temporal aggregation methods applied. Out of 27 total, 19 studies displayed temporal averages of R 0 , 2 displayed the duration of the at-risk state (R 0 > 1) and 3 more showed both. 3 studies did not apply any temporal aggregation method. b) Relationship between temporal resolution in input data and final maps. Thickness of the connecting lines is equivalent to the number of studies aggregating in this way. c) Temporal resolution of the final maps. time) on the one hand and the duration of the at-risk state (number of days with R 0 > 1) on the other hand. In all but three studies, some kind of temporal aggregation was applied. Three publications used both averaged intensity and the duration of the potential risk in parallel (Brugger and Rubel, 2013;Cheng et al., 2018;Ng et al., 2017). The most frequently used method was the averaged intensity (Fig. 2a). Some form of averaging was performed in 22 studies, using annual, seasonal (transmission season), monthly, bi-weekly, or roughly weekly (8 days) averages of R 0 (Fig. 2b). All of the aggregated NGM-based apüpEMs applied this method. In some cases, average values of R 0 were calculated indirectly, e.g. by first calculating the relationship between R 0 and temperature, then calculating the annual average R 0 based on long-term average temperature data (Cordovez et al., 2014;Kakmeni et al., 2018;Wu et al., 2013). Only five studies included maps that show the duration of the at-risk state (R 0 > 1), three of which also show average R 0 in parallel. Among these, two studies displayed the total number of days with R 0 > 1 throughout the study period (Brugger and Rubel, 2013;Cheng et al., 2018), while the remaining three (Holy et al., 2011;Mordecai et al., 2017;Ng et al., 2017) mapped the number of (consecutive) months with R 0 > 1. One study (Racloz et al., 2008) calculated R 0 based on monthly mean temperatures with no further aggregation. Two further studies only calculated R 0 for a single point in time, so that no aggregation was necessary for the final map (Hartemink et al., 2011;Moraga et al., 2015).
Among the recorded studies, 13 were based on daily resolution weather data, 10 on monthly data, 3 on annual mean temperature and 1 on an 8-day resolution satellite data (Fig. 2b). The resulting maps typically showed annual (n = 14) or monthly (n = 9) representations of R 0 (Fig. 2c). Seasonal (n = 4) and sub-monthly (n = 2) maps were rarer, and in two cases the time frame represented by the maps remained unclear. EMs based on daily resolution data were transformed into final maps of bi-weekly, monthly, seasonal or annual resolution. EMs based on monthly resolution data were transformed into maps of either monthly or annual resolution. In two cases, coarse resolution temperature data was first interpolated to gain daily estimates of R 0 , which were then aggregated again so that the maps met the coarse temporal resolution of the temperature data (Calistri et al., 2016;Zhang et al., 2017).
Regarding visualization methods, 9 studies presented unclassed maps with continuous color ramps. The remaining 18 studies presented classed maps, 7 of which used a relatively large amount (≥ 10) of classes. Averaged intensity maps were typically visualized using diverging color ramps with cold colors for R 0 < 1 and warm colors for R 0 > 1.

Exemplary demonstration of common aggregation and visualization methods
The aggregation methods affect the resulting values and respective interpretations (e.g. intensity or duration) across the risk map; the visualization methods, on the other hand, affect the perceived map patterns. To illustrate how these two factors affect the final presentation of risk maps, a series of figures was derived from the same EM for Usutu virus (Rubel et al., 2008;Cheng et al., 2018) (Figs. 3 and 4).

Temporal aggregation
The averaged R 0 maps (Fig. 3a)-c)) show the intensity of potential risk averaged over a certain time period. The annual average R 0 map (Fig. 3a)) portrays the spatial risk pattern, without any temporal information on the potential start of the transmission season or when during the year the risk is particularly high. The seasonal average R 0 map (Fig. 3b)) mainly shows the spatial risk pattern as well, although with a larger R 0 value range compared with the annual map. As for temporal risk information, the seasonal average R 0 map has a defined transmission season. The series of monthly average R 0 maps (Fig. 3c1)-c4)) captures the spatio-temporal risk dynamic, and shows a sudden decline of risk at the end of the transmission season. The duration map (Fig. 3d)) shows the duration of the at-risk state by means of the total number of days with R 0 > 1. In this example, most parts of Europe show more than 10 weeks of R 0 > 1.

Visualization
We varied visualization settings of Fig. 3a), b) and d), as shown in Fig. 4. With certain visualization setting, some maps can show very similar spatial patterns, e.g. Fig. 4b1) and c1). However, just by changing the value interval, the spatial patterns can certainly differ, e.g. Fig. 4b2) and c2). On the other hand, maps with different spatial patterns under certain visualization settings, e.g. Fig. 4a2)-c2), can also show similar spatial patterns, as seen in Fig. 4a3)-c3). In addition to the monthly average R 0 maps over the transmission season (Fig. 3 c)), more detailed spatio-temporal risk maps were carried out by using biweekly time-steps (Fig. 5). Using finer temporal resolution, these maps capture not only the spatial risk patterns, but also identify the peak of risk. The overall risk across Europe gradually increases from the beginning of June until the first half of August, which is when it begins to drop. After the first half of September (Fig. 5h)), the average R 0 across Europe is below 1.

Discussion
The ultimate purpose of generating risk maps is to show where and when the potential risk of a VBD outbreak is high. To achieve this, it is essential to apply appropriate and practical temporal aggregation methods when producing these maps with EMs. In this review, we provide a detailed comparison of the commonly used aggregation methods, and demonstrate their use with an EM for Usutu virus. In addition, we highlight the importance of reasonable visualizations.

Methods for temporal aggregation of R 0
When transforming the raw EM outputs into spatial risk maps, the purpose of these risk maps needs to be considered. In order to monitor the general spatial risk patterns of a VBD, the relative risk might be more important than the absolute value of R 0 . In this case, long-term (annual or even longer) or mid-term (transmission season) average R 0 risk maps and the total number of days with R 0 > 1 can work well. However, to take preventive measures to stop VBD emergence, it is essential to know both the spatial and temporal patterns of transmission risk. For this purpose, a series of short-term average R 0 maps is preferable. For all these applications, it should be borne in mind that R 0 is calculated under the assumption of a completely susceptible population, i.e. for the early stages of a disease event. Furthermore, for the EMs discussed here, R 0 is calculated for each cell of the spatial raster individually; i.e. there is no simulated spread from one cell to another. Consequently, these maps can only be interpreted as a potentialfor an outbreak to occur. In order to predict the actual course and magnitude of an outbreak, much more sophisticated models would be needed.
In the majority of the articles reviewed here, the authors aggregate R 0 in time by calculating average values over various periods. When applying this method, the primary task is to find a reasonable time period to investigate. It is important to note here that while R 0 itself is a threshold quantity, average R 0 is not. While high average R 0 (qualify R 0 > 1) indicates high risk, low average R 0 does not necessarily mean that transmission cannot occur. Even if the mean R 0 over a certain period is below 1, there could be several consecutive days or weeks with R 0 > 1, which may lead to an outbreak. This is especially problematic for longterm or mid-term average R 0 maps. Hence, maps based on longer-term average values of R 0 are not useful for determining the presence or absence of risk (let alone absolute risk) for ongoing transmission at a given place in the study area. They can, however, be useful to estimate the relative spatial differences in risk across the study area (e.g. Fig. 3a) and b)). The short-term (monthly) average R 0 maps provide temporal information together with the spatial risk maps, capturing the spatiotemporal risk dynamics (e.g. Fig. 3b), c)). With even smaller time steps, averaged R 0 values may still be able to reflect the absolute values of R 0 reasonably well. As an example, we used a two-week time step as an approximation of the extrinsic incubation period of Usutu virus in Fig. 5. This kind of map can be useful to show the beginning and end of a high risk season. In any case, it needs to be kept in mind that absolute values of R 0 calculated by different methods are not comparable (Li et al., 2011), which also applies to average values derived from them. For R 0 NGM calculated by EMs using the popular NGM method, there is the additional complication that the daily R 0 NGM is already the geometric mean per generation (Dietz, 1993). When averaging the daily R 0 NGM using the arithmetic mean, it has the potential to show a distorted picture of the real-world situation. Additionally, R 0 NGM indicates whether or not the intrinsic growth rate r of the investigated VBD is positive, but the relationship between R 0 and r is not explicit and differs depending on the respective NGM (Diekmann et al., 2010;Dietz, 1993). Therefore, average R 0 NGM maps should be interpreted with special caution, and those based on different NGMs are not comparable.
The less frequently applied accumulation-based method does not treat the threshold quantity R 0 as a continuous value. Instead, it shows the spatial risk pattern as the duration of at-risk states (such as number of days with R 0 > 1). This method leads to results that are comparable across different model types (including EMs built with different NGM) as long as they share the threshold at R 0 = 1. The drawback of this method is that it does not take into account the intensity of risk: a day with R 0 = 1.1 and a day with R 0 = 111 would both be equally counted as one day of R 0 > 1, though higher R 0 values generally indicate higher theoretical risk (Diekmann et al., 2010). In certain cases, a period with R 0 being continuously above 1 could last for months across the whole study area, e.g. Fig. 3d). However, for disease control measures, it is also important to know where the potentially most-severely affected areas will be. Since this method ignores the quantity of R 0 , valuable information is lost. Consequently, solely relying on this method would not be sufficient in many scenarios.
The average R 0 maps and the maps showing the total number of days with R 0 > 1 are not comparable in terms of risk, as one shows risk intensity while the other shows duration. In our test case, these two types of maps show similar patterns (Fig. 3b) and d)), but this will not always be the case (see Fig. 4b2) and c2)). These two types of maps should not be interpreted the same way. For instance, the high correlation between the transmission season average R 0 map and the total number of days with R 0 > 1 map indicates that where average R 0 is higher, the duration of risk is also longer, or vice versa. This does not necessarily mean that one can replace or resemble the other. As EMs are typically driven by temperature variables, it is not surprising that the average R 0 is correlated to risk duration, i.e. a long duration of high temperature would probably result in both a high average R 0 and a longer transmission season. Allowing for other parameters (such as human/vector population density) to vary in space as well may break this perceived pattern.

Visualization methods
In the vast majority of the reviewed studies, the authors opted for a classed map rather than one showing continuous values. This is in line with common practice in geography, where classed maps are often preferred based on the argument that the reader would not be able to precisely locate a pixel's color value on a continuous color bar in the legend anyway (Slocum et al., 2009). A classed map thus provides a simplification that helps the reader to estimate which numerical values an area of a certain color represents. Given that several of the articles reviewed here have classed maps with 10 or more classes, it seems worth pointing out that this simplifying effect is lost if the value intervals are too small. This leads to a large number of classes that cannot be distinguished visually any more (Fig. 4a1)-c1)), making these maps visually equivalent to an unclassed map. However, they do not share an unclassed map's main advantage of each color value in the map representing only one unique data value. If a classed map is being produced, the number of classes and the color scheme of the legend should be chosen in a way that enables the reader to easily distinguish the different classes.
Modern map making software supports a variety of methods that can be used to classify different types of data for display in a map (Slocum et al., 2009). The most straight-forward option for aggregated R 0 data is the classic "equal intervals" approach, where the range of values is broken up into value intervals of equal size. It is easy to calculate and the legend can be understood intuitively. However, the chosen size of the intervals (or: number of classes) strongly affects how the presented data is being perceived by the reader. For instance, Fig. 4b1) and c1) show two different temporal aggregates of the same model (transmission season average and duration of the conditions with R 0 > 1, respectively). They are being presented using equal intervals with a large number of classes and are perceived as showing very similar spatial patterns. Fig. 4b2) and c2) show the same data again, using the same equal intervals methods but with less classes. Due to how the values are distributed in the two data sets, the perceived patterns are now dramatically different. In contrast to this Fig. 4a1) and b1) show an average of R 0 over the whole year and the transmission season respectively. Using the exact same value intervals for both maps, the different patterns are clearly visible. The same is true for the equivalent maps using a lower number of classes (Fig. 4a2) and b2)). Another challenge is to define reasonable limits for the area to be investigated with this method, as the geographical extent affects the value range and value distribution of the pixels within the extent, which will in turn affect the classification of the pixels. The equal intervals approach can thus be useful to compare average R 0 for different time periods within the same model (e.g. annual vs. transmission season or July vs. August). Its use for comparisons across temporal transformation methods or between models from different studies is severely limited, though.
In order to enable comparisons across different models, quantilebased classification should be considered as a useful alternative to the equal intervals method. Here, the data values are first ranked and then grouped so that an equal number of data points falls into each class (Slocum et al., 2009). In our example, the annual and transmission season average of R 0 as well as the duration of the conditions with R 0 > 1 were classified into five quantiles, expressed as percentiles in the legend (Fig. 4a3)-c3)). Each of them shows an estimate of relative risk across the study region. Through that, comparisons across different studies using different methods are being made possible. Similarly, this method could enable a cross-disease overview, i.e. though the absolute values of R 0 for different VBD are not comparable, it is still helpful to know which VBD is more likely to occur in a certain area. In our test case, the two classed risk maps (Fig. 4a3) and b3)) show virtually identical patterns, which can be explained by the general correlation between the annual mean temperature and the mean temperature over the warmest quarter (transmission season). However, this does not mean that one map can replace the other. Those two maps still cover different time-periods and have different interpretations. The average R 0 over the transmission season contains more information, but it requirte it requires a transmission season to be determined first. It should also be noted that high relative risk does not necessarily mean high definite risk. For instance, if the map of average R 0 over the second half of September (Fig. 5h) was transformed into a quantile-based risk map, all values within the highest quantile would indeed still be below 1. For the sake of clarity, the data values corresponding to the class limits should thus be noted in the legend, along with or even replacing the percentile labels.
When the quantiles method or any other method that does not lead to regular value intervals (such as "natural breaks" or "optimal", (Slocum et al., 2009)) is used, it is important to report how the classification was achieved. Not only would the reproducibility of the study otherwise be limitedunder certain circumstances, the wary reader may even suspect a map to be manipulated (Monmonier, 1991). Only when their thresholds are reasonable, the risk maps are interpretable. Unfortunately, several of the studies reviewed here are incomplete in this regard.

Temporal dynamics
The temporal resolution of risk maps is typically not a free choice. Instead, it is very often dictated by the input data. For instance, with daily resolution mean temperature data as an input, it is easy to produce maps that show the duration of conditions with R 0 > 1 as well as and risk maps with small time steps. There are plenty of options to aggregate fine temporal resolution to coarse ones. However, this does not work in the opposite direction (Figure 2b)). When the input data is annual average temperature, it is not possible to produce duration maps, nor a seasonal averaged intensity of risk. Interpolating monthly data into daily values may work if the goal is a map of long-term average R 0 (Calistri et al., 2016;Zhang et al., 2017), but it is certainly not a useful method for estimating the number of days with R 0 > 1.

Summary and suggestions
In order to prepare risk maps for potential outbreaks from the raw output of EMs, two main steps are required. First, one or more appropriate methods for temporal aggregation of R 0 values need to be chosen. For a summarizing spatial overview, both maps of duration and average long-term intensity of risk can be useful. Strictly speaking, however, only the duration-based approach leads to results that are comparable across different types of EMs. This method also has a rather straightforward interpretation, making it the best choice for predicting areas of potential disease emergence in non-endemic regions so that preventive measure can be taken. A series of maps based on short-term average R 0 , on the other hand, can be useful to determine the time and place of peak transmission potential within endemic areas.
The second important, but often neglected aspect is how the maps are visualized. Unclassed maps avoid any loss in information, but are often difficult to interpret due to the lack of abstraction. By using different classification methods with the same data, the resulting maps can show very different patterns. Among the methods available, the equal intervals approach is the obvious straight-forward solution. However, quantilebased classification results in more meaningful maps and better comparability across different studies. When classification methods other than equal intervals are used, this needs to be documented properly, if only to avoid the suspicion of purposeful manipulation.
Ultimately, there is always a trade-off to be made between minimizing information loss, facilitating understanding and minimizing the amount of maps. If in doubt, we recommend to combine 1) an overview map of the duration of conditions with R 0 > 1, visualized using quantilebased classification with a 2) series of maps showing the change of shortterm (monthly, bi-weekly) average R 0 throughout the transmission season, visualized using the same set of equal intervals for the entire set. Preferably, this would be accompanied by a digital supplement containing the (temporally aggregated) data stored in a commonly used geodata format such as GeoTIFF or NetCDF.

Author contributions
YC, NT conceptualized the review. YC did the literature searches, wrote the first draft, ran the epidemiological models, drafted the table and prepared the visualizations. NT had the initial idea for the review, co-wrote the results section, prepared parts of Fig. 2 and edited the table and the manuscript. ST, AJ and CB reviewed and provided critical feedback on the draft. All authors discussed the concept, preliminary results and figures at various stages. All authors discussed and revised the manuscript. All authors read and approved the final version of the manuscript.

Funding statement
Yanchao Cheng was funded by China Scholarship Council, No. 201506040059. Stephanie Thomas was funded by the Bavarian State Ministry of the Environment and Consumer Protection and the Bavarian State Ministry of Health and Care through the BayVirMos project (TKP 01KPB-73560). This publication was funded by the German Research Foundation (DFG) and the University of Bayreuth in the funding programme Open Access Publishing.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.