What lies beneath: a spatial mosaic of Zika virus transmission in the 2015-2016 epidemic in Colombia

Background: Temporal incidence patterns provide a crucial window into the dynamics of emerging infectious diseases, yet their utility is limited by the spatially aggregated form in which they are often presented. Weekly incidence data from the 2015-2016 Zika epidemic were available only at the national level for most countries in the Americas. One exception was Colombia, where data at departmental and municipal scales were made publicly available in real time, providing an opportunity to assess the degree to which national-level data are reflective of temporal patterns at local levels. Methods: To characterize differences in epidemic trajectories, our analysis centered on classifying proportional cumulative incidence curves according to six features at three levels of spatial aggregation. This analysis used the partitioning around medoids algorithm to assign departments and municipalities to groups based on these six characteristics. Examination of the features Results: The timing of departmental-level epidemic peaks varied by three months, and departmental-level estimates of the time-varying reproduction number, R ( t ), showed patterns that were distinct from a national-level estimate. The classification algorithm identified moderate support for two to three clusters at the departmental level and somewhat stronger support for this at the municipal level. Variability in epidemic duration, the length of the tail of the epidemic, and the consistency of cumulative incidence data with a cumulative normal distribution function made the greatest contributions to distinctions across these groups. Applying the classification algorithm to simulated data showed that municipalities with basic reproduction number, R 0 , greater than 1 were consistently associated with a particular group. Municipalities with R 0 < 1 displayed more diverse patterns, although in this case that may be due to simplifications of how the model represented spatial interaction among municipalities. Conclusions: The diversity of temporal incidence patterns at local scales uncovered by this analysis underscores the value of spatially disaggregated data and the importance of locally tailored strategies for responding to emerging infectious diseases.

are underestimates of the true extent of transmission of many infectious diseases, particularly those with high proportions of asymptomatic infections, they still provide a uniquely valuable resource given the paucity of publicly available data at similar scales in other countries [11]. Such data are particularly valuable for Zika, given that a range of spatial scales are relevant for activities related to its prevention and control. On the one hand, vector control activities are planned and budgeted on multiple administrative levels but must be targeted on a very local level. On the other hand, communications, surveillance, and possible vaccination programs are generally planned and implemented only on larger administrative scales.
Our goal in this study was to utilize this unique data set on the ZIKV invasion of Colombia to perform a case study on the characteristics of temporal incidence patterns at different spatial scales in the context of an emerging infectious disease. To do so, we took a three-part approach. First, we performed a descriptive analysis of time series of weekly case reports at three distinct scales in Colombia: national, departmental, and municipal. Second, we performed a classification analysis of proportional cumulative incidence curves at departmental and municipal scales to identify distinct patterns of temporal dynamics at each of these scales. Third, we repeated the classification analysis for data simulated with a mechanistic model for ZIKV transmission in Colombia to determine the extent to which distinct statistical patterns in temporal incidence may reflect distinct driving processes.

Data
The weekly number of Zika cases, by municipality, was reconstructed using two data sources. The main data source was a website [12] of the Colombian National Institute of Health (Instituto Nacional de Salud, INS) where the official weekly reports on the cumulative number of suspected and confirmed Zika cases for each municipality were published beginning in early 2016. While the peak of the Colombian epidemic occurred in 2016, a significant number of cases were reported during 2015. To capture this initial portion of the epidemic, we used an additional data source, also available on the INS website. Unfortunately, the number of cases reported in the latter data source seemed to consistently underreport the total number of cases reported by the INS at the national scale. For example, while the official data source reports a cumulative number of 11,712 cases by the end of 2015, this secondary source only reports 3,875 cases for this same period. Therefore, to reconstruct the 2015 portion of the epidemic while accounting for the better known total number of cases, we multiplied the weekly 2015 data by a correction factor. This correction factor was calculated as the ratio between the cumulative number of cases reported by each municipality up to the first week of 2016 according to the official source and the alternative source.
For the simulation model, we used two data sources. First, we obtained gridded population data across Colombia for 2015 at a resolution of 3 arc seconds (~93 m) from WorldPop [13]. We summed these raster data at the municipal level as defined by GIS shapefiles from the National Geographical Information System of Colombia [14]. Second, we based estimates of R0 on a set of ZIKV epidemic size projections for Latin America made early in the epidemic using relationships between environmental variables and transmission metrics [15]. To obtain a single value of R0 for each municipality, we took a weighted sum of the R0 raster at 5 km x 5 km resolution weighted by a population raster aggregated to that scale. We calibrated these R0 estimates to observed dynamics in Colombia by scaling municipal values of R0 from [15] by a constant (2.72) such that the value for Girardot matched an estimate of 4.61 derived from an analysis of temporal incidence patterns there [16].

Descriptive analysis of weekly case reports
We performed two preliminary analyses of differences in weekly case report patterns at different scales of spatial aggregation. First, we generated a barplot of national case reports color-coded by which of 32 departments those national cases arose from. Likewise, for each of those departments, we generated a barplot of departmental case reports color-coded by which of its municipalities those departmental cases arose from. Second, we made estimates of the timevarying effective reproduction number, R(t), for each time series. Following Ferguson et al. [8], we used the EstimateR function from the EpiEstim library [17] in R to estimate R(t) for each time series based on the method introduced by Cori et al. [18].

Classification analysis of cumulative incidence curves
We focused our analysis on cumulative, rather than raw, incidence because of the extreme variability in raw incidence patterns in this data set. With raw incidence, time series with a small number of cases appear extremely jagged, and temporal patterns can be difficult to extract. With proportional cumulative incidence, vastly different temporal patterns are more readily comparable, because they all begin at 0 and end at 1 but arrive there by different paths. Others [19] have criticized the use of cumulative incidence data from epidemics, although these criticisms mostly pertain to parameter estimation and forecasting, neither of which we do here. Rather, our goal was to perform a descriptive analysis of diversity in the temporal patterns of an epidemic as viewed from many different perspectives spatially.
The cumulative incidence curves that we examined were proportional such that they all reached 1 at the time the last case was reported in a given area. Mathematically, for weekly reported Zika incidence Ii,t in location i in week t, we calculated proportional cumulative incidence as ( We excluded areas from our analysis that reported no Zika cases. As a basis for classifying cumulative incidence curves, we defined six features of these curves that we hypothesized represent dimensions in which curves from different areas vary (Table 1). We defined four of these features in reference to cumulative normal density curves, ̂( ), that we fitted to each Ci,t. This involved estimating mean and standard deviation parameters of ̂( ) for each Ci,t on the basis of least squares using the optim function in R. We chose these features because they provided a way to quantify the duration of local epidemics (small , short = short epidemic), to capture whether epidemics appeared strongly locally driven (low 2, large 0 = sporadic transmission fueled by importation), and to characterize shapes that deviated substantially from those predicted by simple epidemic models ( 5% and 95% near zero = "SIR-like" epidemic). Although these idealized scenarios motivated the selection of these features, the fact that all six features were calculated for each Ci,t meant that we were able to capture a wide range of patterns in between these extremes.

Symbol Definition
Standard deviation of ̂( ) 2 R 2 between Ci,t and ̂( ) 5% Difference between the 5% quantile of Ci,t and the 5% quantile of ̂( ) 95% Difference between the 95% quantile of Ci,t and the 95% quantile of ̂( ) Weeks between first and last non-zero Ci,t 0 Weeks with Ci,t = 0 between first and last non-zero Ci,t Table 1. Features of proportional cumulative incidence curves used for classification analysis.
We explored variation in Ci,t at both departmental and municipal scales. To describe how variation in Ci,t curves at these scales was distributed across the six-dimensional feature space, we performed a partitioning around medoids (PAM) clustering analysis [20] on scaled values of the features using the pam function in the cluster library [21] in R. This algorithm identifies medoids of k groups that minimize the sum of distances between each medoid and all group members. We performed this analysis for values of k ranging 2-10 and compared groupings for different values of k on the basis of their average silhouette values. A silhouette value describes how much more dissimilar a point is from points in the next most similar group compared to points in its own group [22]. An ideal classification would be indicated by silhouette values for data points in all groupings close to 1. Silhouette values nearer to or below 0 indicate that points do not cluster particularly well with the group to which they are assigned.

Simulation of epidemic curves to elucidate driving processes
To aid in the interpretation of the classification analysis of observed patterns of temporal incidence, we performed an identical analysis of simulated patterns of temporal incidence. The value of doing so is that it provides an opportunity to determine the extent to which groups identified by the classification analysis might reflect meaningful differences among those groups in terms of transmission processes and their drivers. In other words, we viewed this exercise as a form of validation of the overall approach of performing a classification analysis on the features of proportional cumulative incidence patterns listed in Table 1. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It . https://doi.org/10.1101/276006 doi: bioRxiv preprint We simulated a data set comparable to the observed data using an R implementation of the ZIKV transmission model described by Ferguson et al. [8] parameterized to match the municipal-level R0 values described at the end of the previous section. The model by Ferguson et al. had a number of attractive features, including plausible values of a number of parameters common to ZIKV transmission models, realistic accounting of the timing of transmission-relevant processes in mosquitoes and humans, seasonal variation in transmission, and the ability to capture multiple forms of stochasticity associated with transmission and surveillance. In brief, the model assumes that humans transition from a susceptible compartment into a recovered and immune compartment following a period of incubation and infectiousness and that mosquitoes become infectious and remain so following bites of infectious humans and a seasonally variable incubation period. Mosquito population density is also seasonally variable, driven by seasonal variation in larval carrying capacity and adult mortality. A full description of the model can be found in the paper by Ferguson et al. [8].
To apply this model to Colombia, we used municipal-level human population sizes derived from WorldPop [13] and adjusted seasonally averaged mosquito densities such that seasonally averaged values of R0 matched our municipal-level R0 estimates. Another departure from the original model that we made was to remove explicit spatial coupling, given the complexity of doing so realistically for all 1,122 municipalities in Colombia. Instead, we simulated imported infections (i.e., infections acquired outside a given municipality) to occur at a daily per capita rate that was proportional to a normal probability density function fitted to the temporal pattern of national-scale incidence (timing of national-scale incidence: mean = 32.57 weeks after the first reported case, standard deviation = 8.85 weeks). This time-varying importation function was scaled by a value of 1.55 x 10 -3 , which along with an assumed reporting rate of 11.5% [23] allowed us to approximately match the national total of 85,353 suspected Zika cases. Also, given that our interest was in short-term dynamics rather than long-term dynamics as in [8], we removed human age stratification from the model.

Descriptive analysis of weekly case reports
As a whole, the temporal incidence pattern at the national level was consistent with a typical epidemic trajectory, marked by an increase over approximately five months, a peak around the beginning of February 2016, and a steady decline thereafter over a period of approximately eight months (Fig. 1A). Under a standard set of assumptions about epidemic dynamics, this incidence pattern can be used to estimate the temporal trajectory of the effective reproduction number, R(t) [18]. Applying this technique at the national level yielded estimates of R(t) that began high (range: 1.5-3.5 for the first four months) and gradually declined below 1 by the time the epidemic concluded ( Fig. 1A), all of which is consistent with standard expectations for an epidemic of an immunizing pathogen in an immunologically naive host population.
. CC-BY-NC-ND 4.0 International license is made available under a The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It . https://doi.org/10.1101/276006 doi: bioRxiv preprint Examination of temporal incidence patterns for each of the four largest departments in terms of total incidence (Valle del Cauca, Norte de Santander, Santander, Tolima) showed that patterns at the departmental level were quite different than those at the national level. First, the timing of peak incidence in the departments in This high degree of variability in temporal incidence patterns had substantial impacts on estimates of R(t). At the national level, R(t) estimates never exceeded 3.5, whereas in Santander R(t) was estimated to exceed 5 (Fig. 1D) and in Valle del Cauca it was estimated to exceed 10 (Fig. 1B), due in both cases to more rapid increases in incidence at the departmental level than the national level. In Norte de Santander, R(t) appeared to twice fall well below 1 but then quickly rise back above this critical threshold value (Fig. 1C).
Examination of temporal incidence patterns at the municipal scale revealed even more variability in temporal incidence patterns than at the department level. In the department of Norte de Santander (Fig. 1C), for example, it was clear that one municipality dominated the departmental pattern ( Fig. 2A). The municipalities with the second and third highest incidence both experienced short, unimodal patterns of incidence during the first two months, but incidence patterns thereafter were mostly low and erratic (Fig. 2B & 2C). Other municipalities in . CC-BY-NC-ND 4.0 International license is made available under a The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It . https://doi.org/10.1101/276006 doi: bioRxiv preprint the department had only low, erratic incidence with no sign of a distinct epidemic (e.g., Fig. 1D). With the exception of the first few weeks of transmission, estimates of R(t) at the municipal level were characterized by erratic fluctuations and much larger uncertainty than was apparent at the departmental or national level (Fig. 2).

Classification analysis of cumulative incidence curves
At the departmental level, there was only modest clustering overall, with the highest average silhouette value corresponding to two groups (0.256), a slightly lower value for three groups (0.254), and falling no lower than 0.201 for up to ten groups (Fig. S1). and 95% were the features that were most important for distinguishing two groups (Fig. S2), and contributed further to distinguishing three groups (Fig. S3). Differences in were associated with a difference of approximately two months in the time elapsed between the attainment of 5% and 80% of cumulative incidence (Fig. 3, left: blue longer than red), and differences in 95% were associated with a difference of approximately two months in the time elapsed between the attainment of 80% and 99% of cumulative incidence, but for different groups (Fig. 3, left: red longer than blue). Overall, this meant that the time elapsed between attainment of 5% and 99% of cumulative incidence for both groups was similar, but with one group experiencing epidemics that were fast initially but slow to finish and another group experiencing epidemics that were slower initially but finished more quickly. These patterns were clearest for the curves associated with the medoid of each group (Fig. 3) but were generally apparent for the curves associated with the groups as a whole (Fig. 4). Spatially, groups tended to cluster along northern, central, The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It . https://doi.org/10.1101/276006 doi: bioRxiv preprint and southern strata (Fig. 5, left), with incidence-weighted cartographs showing that the epidemic was mostly dominated by distinct northern and central strata (Fig. 5, top right).    . Department assignments to two (top) and three (bottom) groups are indicated by color, with transparency inversely proportional to silhouette value. The one department (Bogotá) with zero incidence is indicated in black and given a weight equivalent to 1/5 of a case to allow for its inclusion in the right column.
There was somewhat stronger clustering at the municipal level, with the highest average silhouette value corresponding to three groups (0.352), somewhat lower values for five and six groups (0.334, 0.326), and no lower than 0.297 for up to ten groups (Fig. S4). and were the features that were most important in distinguishing two groups (Fig. S5), 95% made additional contributions to distinguishing three groups (Fig. S6), and 2 contributed to . CC-BY-NC-ND 4.0 International license is made available under a The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It . https://doi.org/10.1101/276006 doi: bioRxiv preprint distinguishing four groups (Fig. S7). Proportional cumulative incidence curves for the group with short and small were the most visually distinct group and remained relatively consistent regardless of the number of groups (Fig. 6). Some differences among the other groups were also apparent in the proportional cumulative incidence curves, with some having a long tail (Fig.  6, middle: green) or two discrete jumps (Fig. 6, middle: blue). The timing of discrete jumps varied across municipalities, but curves within a group otherwise resembled the curve associated with the medoid for that group (Fig. 7). Spatially, departments generally consisted of a mixture of municipalities from different groups, and the prominence of some groups in the cartograms varied depending on whether the cartograms were weighted by area, population, or incidence (Fig. 8). The cartograms weighted by population showed that a sizeable portion of the population lives in cities that had no reported cases, such as Medellín and Bogotá (Fig. 8, black in the center column). Among municipalities that did have reported cases, the cartograms weighted by incidence showed that a relatively large proportion of reported cases came from municipal-level epidemics characterized by large and (Fig. 8, right column).    The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It . https://doi.org/10.1101/276006 doi: bioRxiv preprint

Simulation of epidemic curves to elucidate driving processes
We focused our analysis of simulated data at the municipal level given that the simulation model was not equipped to simulate transmission between municipalities, which is likely important for recreating departmental-level patterns. Overall, our model parameterization assumed that R0 > 1 in 34.6% of municipalities. A total of 12.6% (range: 10.4-14.1%) of municipalities had zero simulated cases, with 99.0% (range: 97.0-100.0%) of those having R0 < 1.
Out of 100 simulated datasets, the classification algorithm selected two groups eight times, three groups 80 times, and five and six groups four times each. Average silhouette value was 0.313 (range: 0.288-0.347) when there were two groups and 0.327 (range: 0.291-0.352) when there were three groups (see Fig. S8 for a representative silhouette plot from a randomly selected simulated dataset). Although this indicates a modest preference of the algorithm for three groups, we focused subsequent analyses on the two-group classification due to our desire to evaluate the correspondence between groups selected by the classification analysis and groups defined by R0 above or below 1.
With the two-group classification, 99.1% (range: 90.3-100.0%) of municipalities with R0 > 1 were placed into the group characterized by larger and . Of the municipalities with R 0 < 1, 74.0% (range: 36.3-80.5%) were also placed into that group, with the others placed into the group with smaller and (see Fig. S9 for an example from a randomly selected simulated dataset). When municipalities were classified into three groups, a new group characterized by moderately low and and negative 95% contained 18.8% (range: 0.2-36.1%) of municipalities with R0 > 1 and 44.7% (range: 23.0-56.5%) with R0 < 1 (see Fig. S10 for an example from a randomly selected simulated dataset). In the presence of this third group, 79.9% (range: 63.4-89.7%) of municipalities with R0 > 1 and 32.1% (range: 22.8-38.8%) with R0 < 1 were placed into the group characterized by larger and .
Visual inspection of five simulated datasets showed that the proportional cumulative incidence curves of municipalities placed in the group characterized by large and generally resembled the curves of municipalities with R0 > 1 (Fig. 9, red). In contrast, proportional cumulative incidence curves of municipalities with R0 < 1 were more diverse than those placed in the group characterized by low and (Fig. 9, blue). A similar pattern was apparent spatially, with municipalities placed in the group characterized by large and generally overlapping with municipalities with R0 > 1, but municipalities with R0 < 1 frequently placed in the group characterized by large and (Fig. 10).
. CC-BY-NC-ND 4.0 International license is made available under a The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It . https://doi.org/10.1101/276006 doi: bioRxiv preprint Figure 9. proportional cumulative incidence curves at the municipal level from five randomly selected simulated datasets. The left two columns show two different groups classified by the curve classification algorithm, and the right two columns show two different groups defined by whether those municipalities have a R0 above or below 1. Each municipality's status as having R0 > 1 (red) or R0 < 1 (blue) is indicated in the top left panel. In each of five simulated datasets shown in the other panels, municipality assignments to two groups are indicated by color, with transparency inversely proportional to silhouette value. Black indicates that no reported cases were simulated for that municipality.

DISCUSSION
Temporal incidence patterns play a vital role in inferring pathogen transmission dynamics and drivers thereof. By analyzing data from the 2015-2016 Zika epidemic in Colombia, we showed that these patterns can appear very different depending on the spatial scale at which incidence data are aggregated. Whereas national-level patterns appeared to follow a unimodal pattern . CC-BY-NC-ND 4.0 International license is made available under a The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It . https://doi.org/10.1101/276006 doi: bioRxiv preprint consistent with behavior of standard epidemic models, departmental-level patterns were somewhat more varied and municipal-level patterns were the most varied. Combining these observations with a formal classification of temporal incidence patterns and a model-based exploration of mechanisms capable of generating those patterns, we deduced that there is distinct variation in temporal incidence patterns subnationally and that much of that variation may be driven by spatial variation in local transmission potential. Spatial contiguity of areas classified into the same groups was observable to some extent at the departmental level but was generally not observed at the municipal level, suggesting that there are underlying spatial drivers of the variation that we observed in temporal incidence patterns.
Similar to our findings of differing dynamics at municipal and departmental scales, theoretical analyses of a range of ecological models have proposed that dynamics approach deterministic behavior as spatial scales grow larger and data become increasingly more aggregated [24]. Methods based on long-term dynamics have been proposed for identifying the scales at which behavior transitions from stochastic to deterministic in models of plant competition and predatorprey interactions [25,26]. Epidemics, however, are inherently transient in nature [2], leaving open the question of how best to define characteristic spatial scales in that context. It is certainly the case that the data from Colombia that we examined displayed greater stochasticity at finer spatial scales. At the same time, the greater variability in temporal patterns that we observed at finer scales suggests that models that aspire to a deterministic representation of behavior at coarser scales must account for spatial structure at finer scales. Indeed, a recent attempt to fit a national-scale transmission model to national-scale time series of Zika case reports from Colombia showed that ignoring subnational spatial structure inhibited that model's fit to the data [27]. A theoretical exploration of similar issues concluded that the scale at which spatial structure must be modeled explicitly is expected to vary by pathogen and geographic context, with less mobile pathogens requiring explicit spatial representation at finer scales [28].
Both stochasticity and spatial interaction are expected to contribute to variability in temporal dynamics at local scales [29]. For some municipalities, temporal incidence patterns appeared to be dominated by stochasticity (e.g., those with discrete jumps). For others, there were implications for a role of spatial interaction (e.g., those with two sharp increases or a long tail). Whereas our simulation model was realistic with respect to demography and the inclusion of spatiotemporal variability in local transmission, it made the very simplistic assumption that importation patterns have identical timing and magnitude in all municipalities. This may have caused municipalities with R0 < 1, particularly those with larger populations, to display incidence patterns that simply reflected the national trend used to drive importation. Analyses of subnational spatiotemporal dynamics in a range of contexts show that importation patterns vary substantially over time and as a function of regional connectivity or being positioned on an international border [30][31][32][33]. Future work that includes more realistic spatial interaction among subnational units could be helpful for resolving the hypothesis proposed here about the importance of spatial interaction in shaping temporal incidence patterns at each of the spatial scales that we considered. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It . https://doi.org/10.1101/276006 doi: bioRxiv preprint Our finding of differences in temporal incidence patterns at different spatial scales raises questions about infectious disease forecasting that deserve further consideration, particularly at national and other highly aggregated scales. Analyses of alternative models used to forecast Ebola virus disease incidence in the West African epidemic showed that the accuracy of forecasts can be extremely sensitive to model misspecification [19]. More recent work addressing the predictive limits of forecasting for a number of infectious diseases has suggested that appropriate model structure may itself change over the course of an epidemic, with overly rigid model structures being constrained in their ability to accurately forecast future incidence patterns [34]. Our findings -specifically, that temporal incidence patterns vary across municipalitiessuggest that appropriate model structure could vary spatially. Considering that improvements in weather forecasting have resulted in part from the increasingly high resolution of data and models [35,36], it is logical to expect that improvements in infectious disease forecasting will also require improvements in the spatial and temporal resolution of epidemiological data and models.
Our analysis identified intriguing differences in temporal incidence patterns across spatial scales, but at the same time there are important limitations to acknowledge. First, our conclusions are not dependent on the magnitude of transmission, but do require that patterns in case report data reflect patterns in underlying transmission. With a high rate of asymptomatic infection and the likelihood of extensive variability in reporting rates [37], particularly at the municipal level, much caution is due. Second, our ability to ascribe meaning to the groups identified by our classification algorithm was limited by the simplicity of our simulation model, particularly with respect to spatial interaction. Consequently, while this analysis identified important relationships between scale and epidemic characteristics, it does not provide a final or comprehensive understanding of the spatial transmission dynamics of ZIKV in Colombia. Third, our model relied on a simplified description of seasonal transmission, when in fact patterns of seasonality are likely to vary spatially and to interact strongly with introduction timing [38].

CONCLUSIONS
Previous analyses of Zika [8,27], as well as chikungunya [9,39], have drawn epidemiological inferences and made forecasts on the basis of nationally aggregated time series data. These efforts depend on the implicit assumption that spatially disaggregated temporal patterns are homogeneous and consistent with spatially aggregated temporal patterns. Our analysis showed that while national-level patterns may be somewhat reflective of departmental-level patterns, municipal-level patterns of cumulative incidence are exceedingly diverse and not well approximated by national-level patterns. Although our analysis was limited in its ability to explain the mechanisms that drove these diverse patterns, applying our classification algorithm to simulated data in which driving mechanisms were known showed that spatial differences in driving mechanisms can be associated with perceptible differences in temporal patterns. The initial wave of the Zika epidemic appears to have subsided, but understanding of spatial variation in transmission dynamics remains imperative for time-sensitive applications such as site selection for vaccine trials [40,41] and anticipating future epidemics [8].    The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It . https://doi.org/10.1101/276006 doi: bioRxiv preprint Figure S4. Silhouette plots at the municipal level for groups numbering two to ten obtained by partitioning around medoids. Each bar corresponds to the silhouette value of a given municipality according to the group assignments indicated by different colors in each panel. Higher average silhouette values indicate stronger clustering.    The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It . https://doi.org/10.1101/276006 doi: bioRxiv preprint Figure S8. Silhouette plots at the municipal level based on a randomly selected simulated data set for groups numbering two to ten obtained by partitioning around medoids. Each bar corresponds to the silhouette value of a given municipality according to the group assignments indicated by different colors in each panel. Higher average silhouette values indicate stronger clustering.   The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It . https://doi.org/10.1101/276006 doi: bioRxiv preprint