Endurance, Refuge, and Reemergence of Dengue Virus Type 2, Puerto Rico, 1986–2007

To study the evolution of dengue virus (DENV) serotype 2 in Puerto Rico, we examined the genetic composition and diversity of 160 DENV-2 genomes obtained through 22 consecutive years of sampling. A clade replacement took place in 1994–1997 during a period of high incidence of autochthonous DENV-2 and frequent, short-lived reintroductions of foreign DENV-2. This unique clade replacement was complete just before DENV-3 emerged. By temporally and geographically defining DENV-2 lineages, we describe a refuge of this virus through 4 years of low genome diversity. Our analyses may explain the long-term endurance of DENV-2 despite great epidemiologic changes in disease incidence and serotype distribution.

A partial sequence analysis of 74 DENV-2 isolates collected in Puerto Rico during 7 years of a 14-year period (1987-2001) showed a DENV-2 lineage evolving through a series of turnover events (14). A lineage replacement in 1994 appeared to be associated with a foreign virus, but only 3 other reintroductions were found, all linked to the 1998 epidemic, the largest in Puerto Rico history (14). This was a turning point in the epidemiology of dengue, with DENV-2 (and DENV-1 and -4) rapidly declining during the expansion of DENV-3. However, transmission of DENV-2 persisted at low levels during 1999-2003 and increased thereafter. This serotype turnover offers new opportunities to study the evolution of DENV-2. Our analysis illustrates the genetic composition and population diversity of DENV-2 throughout 22 consecutive years of sampling in Puerto Rico and may explain the evolutionary resilience and long-term establishment of this virus.

Virus Isolates
We complied with the institutional review boards of the Centers for Disease Control and Prevention (CDC) (protocol 4797) and the Broad Institute of MIT and Harvard. DENV was obtained from human serum received through the passive surveillance system administered by CDC. Each sample was accompanied by a form that captured geographic and clinical information maintained for this study without patient identifiers. Primary or secondary status of infection was inferred by absence or presence of serum immunoglobulin G (15). Viruses were rescued into C6/36 cells (16). Selection of 3 isolates per year in the 5 municipalities with the highest reporting of DENV-2 cases resulted in 253 isolates, of which 140 were successfully sequenced and are representative of our virus repository with respect to patient age (27.7 vs. 22.6 years), sex (54.4% vs. 47.4% male), and history of infection (84.6% vs. 77% secondary infections). We also sequenced 20 regional isolates from neighboring countries.

Sequencing
We extracted RNA from tissue culture supernatant using the M48 or MDx BioRobot (QIAGEN, Valencia, CA, USA). cDNA was generated by using Sensiscript RT (QIAGEN) with random hexamers (Applied Biosystems, Foster City, CA, USA). Presence of cDNA was confirmed by PCR by using PfuUltraII (Stratagene, La Jolla, CA, USA) or iTaq (Bio-Rad, Hercules, CA, USA) DNA polymerase and specific oligonucleotides (CDC, Atlanta, GA, USA). Fourteen pooled overlapping 2,000-nt amplicons were generated by reverse transcription-PCR at CDC (San Juan, PR) and sequenced at the Broad Institute (Cambridge, MA, USA) by bidirectional Sanger sequencing on an ABI 3730 after PCR with 96 M13-tailed serotype-specific primers. Resulting reads were trimmed of the primer sequences, filtered for high quality, and assembled by using algorithms developed by the Broad Institute. All coding sequences for the polyproteins (10,173 nt) and parts of the 5′ and 3′ noncoding regions were deposited in GenBank.

Sequence Analyses
Coding sequences for the unprocessed polyprotein (5′ and 3′ noncoding regions excluded) were aligned by ClustalW software (www.ebi.ac.uk/Tools/clustalw/index.html) in MEGA 4 (www.megasoftware.net). Maximum-likelihood analysis and bootstrapping tests were performed in PAUP* (16) under the best-fit substitution model estimated by MODELTEST v3.07 (14) (parameters available on request). The 1983 Jamaican isolate JM_83_M20558 (5) served as outgroup. Mean rates of nucleotide substitution and relative genetic diversity (Ne × t, where Ne is the effective population size and t is the generation time) were estimated by using Bayesian Markov chain Monte Carlo (MCMC) from BEAST v1.4.7 (http://mbe.oxfordjournals.org/content/25/7/1459). We used the general time-reversible substitution model with strict and relaxed molecular clocks and either a constant population size or a Bayesian skyline coalescent prior. All MCMC chains were run for sufficient length to ensure stationary parameters, with statistical error reflected in values of the 95% highest posterior density. Amino acid differences were mapped by using parsimony methods in MacClade v4.08 (17). We determined dN/dS ratios by the single likelihood ancestor counting method in HyPhy, accessed through the Datamonkey server (13). Associations between phylogeny and geographic data were investigated by using Bayesian Tip-association Significance testing (BaTS; http://evolve.zoo.ox.ac.uk/evolve/BaTS.html) with the posterior sample of trees calculated by BEAST. For the parsimony score, association index, and monophyletic clade size, we considered p<0.05 significant.
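The dN/dS ratio compares nonsynonymous with synonymous change. As a minimal sketch of the underlying idea only, the following Python snippet classifies codon differences between two aligned coding sequences. It is not the single likelihood ancestor counting method as implemented in HyPhy, which counts changes along inferred ancestral sequences on the phylogeny and normalizes by the numbers of synonymous and nonsynonymous sites. The function name and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both sequences must be in frame, equal in length, and a multiple of 3.
    Codons containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned, in-frame coding sequences")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        a, b = seq_a[i:i + 3].upper(), seq_b[i:i + 3].upper()
        if a == b or a not in CODON_TABLE or b not in CODON_TABLE:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            syn += 1      # same amino acid: synonymous difference
        else:
            nonsyn += 1   # different amino acid: nonsynonymous difference
    return syn, nonsyn


# Toy example (hypothetical sequences, not DENV-2 data):
# CAG (Gln) -> CTG (Leu) is nonsynonymous; GCA (Ala) -> GCC (Ala) is synonymous.
print(count_codon_differences("ATGCAGGCA", "ATGCTGGCC"))  # prints (1, 1)
```

A raw count like this ignores multiple changes within a codon and the unequal numbers of synonymous and nonsynonymous sites, which is why phylogeny-based estimators are used in practice.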

Results
During 1986-2007, dengue cases in Puerto Rico ranged from 2,000 to ≈16,000 per year (Figure 1, panel A), with major epidemics (>8,000 cases) reported in 1986, 1992, 1994, 1998, and 2007 (2-4,18,19). Despite major fluctuations in serotype circulation, DENV-2 circulated predominantly for 10 years (Figure 1).
Four events merit recognition (Figure 2). First, a mixture of foreign and local strains at the base of subclades IA and IB provides evidence of multiple introductions. Eight Puerto Rico viruses associated with these foreign strains date from 1994 through 1999. These years also are associated with a distinct subgroup basal to subclade IB concomitant with the extinction of clade II in 1997. Second, subclade IB evolved mainly after the introduction of DENV-3 in 1998. Third, a period of limited circulation of DENV-2, reflected in low levels of genetic diversity (1999-2003), coincided with the expansion of DENV-3 and decline of DENV-1 and -4. Fourth, there was a resurgence of DENV-2 during 2004-2007.
Forty-nine amino acid differences mapped to the phylogeny were detected across the major internal branches of the tree. Twenty of these comprise major differences between clades I and II and between subclades IA and IB, as well as substitutions that arose during the continuous evolution of subclade IB (Figure 2). Only 1 aa substitution distinguished isolates in clades I/II from III: a hydrophilic glutamine to a hydrophobic leucine at position 131 in the E protein. Excluding PR79_1995_EU569708 as a possible foreign introduction, 18 aa differences distinguish isolates across clade I, 12 of which separate subclade IB from clade II and potentially could have been involved in the 1994-1997 lineage turnover (Figure 2). The remaining differences between isolates in subclade IB and clade II were present in nonstructural (NS) genes and are predominantly conservative mutations, with the exception of position 31 in NS3, which was nonconservative. Among the additional changes, the only nonconservative mutation was a hydrophobic alanine to hydrophilic threonine at position 137 in NS4B that originated with PR40_1999_EU482730, and most changes were found in the NS genes.
Using Bayesian MCMC and dN/dS analyses, we estimated the mean substitution rates for the full genomes at 9 × 10⁻⁴ to 1.1 × 10⁻³ for all clades, consistent with previously published rates (20,21). The low dN/dS ratios (0.07-0.08) provide evidence of a low percentage of substitutions that have been fixed along independent lineages, possibly indicating purifying (negative) selection. BaTS analysis shows that lineages often correlated with the corresponding region of origin of the isolates. Seven of the 8 regions had >4 isolates in subclade IB or clade II. This association was significant for 6 regions (p<0.05) (Table). The most significant geographic correlations of lineages were found in the San Juan (1986-1990 and 1994-1996), Ponce (1987-1989), and Mayaguez (1989 and 1993) regions (Figure 3).
We investigated other possible associations with the DENV-2 phylogeny, including age and DF/DHF status, but found none. Most DENV-2 infections were secondary (84.6% and 77% of DENV-2 infections in the CDC collection and this study, respectively). However, we found no relationship between phylogeny and incidence of primary or secondary infection in patients.
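To put the estimated rates in perspective, a rough back-of-the-envelope calculation gives the expected number of substitutions accumulating along a single lineage across the 10,173-nt coding region. This sketch assumes the rates are expressed in substitutions per site per year (the usual unit for BEAST analyses of dated sequences; the unit is not stated in the text) and a strict clock. The variable and function names are illustrative.

```python
CODING_LENGTH_NT = 10_173           # polyprotein coding length given under Sequencing
RATE_LOW, RATE_HIGH = 9e-4, 1.1e-3  # assumed unit: substitutions/site/year


def expected_substitutions(rate, length_nt, years=1.0):
    """Expected substitutions along one lineage under a strict clock."""
    return rate * length_nt * years


for years in (1, 22):  # 22 = years of sampling in this study
    low = expected_substitutions(RATE_LOW, CODING_LENGTH_NT, years)
    high = expected_substitutions(RATE_HIGH, CODING_LENGTH_NT, years)
    print(f"{years:>2} yr: {low:.1f} to {high:.1f} substitutions per lineage")
```

Under these assumptions, a lineage would accumulate roughly 9 to 11 substitutions per year across the coding region.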
The year 1999 began a period of low circulation and low genetic diversity of the Caguas lineage of subclade IB (Figure 1; Figure 4, panel D). In the 4 municipalities with uninterrupted DENV-2 transmission, DENV-2 incidence increased 2 years after the island-wide increase (Figure 5). DENV-3 incidence within this DENV-2 refuge was minimal during the period of high DENV-2 incidence but peaked 2 years later, concomitant with an increase across the rest of the island.

Discussion
Puerto Rico is a model for fine-scale studies on DENV evolution in the Americas. The long-term persistence of DENV-2 and its ability to reemerge after transient periods of low circulation is a remarkable aspect of the epidemiology of dengue in the region. The fact that 13% of DENV-2 isolates represent importations or close descendants from importations brings new insights to our understanding of DENV long-term circulation. Foreign viruses were identified in 8 years (1987, 1989, 1991, 1995, 1998, 1999, 2005, and 2007), of which only 1991 and 1998 had been previously sampled (14). Ten of the 18 introductions occurred during periods of high DENV-2 predominance: 1987-1991, 1995, and 2005-2007 (Figure 1, panel B; Figure 2). The other 8 introductions originated from the 1998 epidemic or shortly thereafter (1999). Therefore, DENV-2 seems to be introduced mainly during periods of favorable preponderance, not necessarily epidemic transmission of this serotype. Subclade IA viruses never established themselves, regardless of year of isolation or origin. These assessments showed a previously unknown feature of DENV-2 persistence: the endemic strain is recalcitrant to influences from frequent foreign introductions. The relative inability of "foreign" DENV-2 to persist in the presence of the dominant subclade IB viruses is not well understood. The Puerto Rico strain might be highly adapted and thus have a fitness advantage, the frequently introduced strains might be simply underrepresented, or introduced strains may have disappeared through genetic drift. Isolate PR76_1995_EU569708, which lies basal to this subclade in the phylogeny (Figure 2), is more closely related to South American DENV-2 viruses than to other Puerto Rico viruses, and this lineage does not appear to have progressed, supporting the foreign origin of subclade IB.
Our findings thus show that subclade IB resulted from an introduced strain, as previously suggested by Bennett et al. (14), which became established during a period of proportionally high incidence of foreign introductions. Interestingly, this clade replacement was completed in 1997, less than a year before the detection of DENV-3 and the concomitant decline of DENV-1, -2, and -4. The early portion of subclade IB is seen as a period of short-lived lineages ending in 1997; therefore, the rise and expansion of this subclade occurred mainly in coexistence with DENV-3, a different epidemiologic scenario from that of the now extinct clade II a decade earlier.
The dominance of conservative amino acid changes that segregated the viruses by clade hinders the assessment of phenotypic changes. Compensatory mutations might have conferred replicative advantages that could have influenced the displacement of clade II or the persistence of subclade IB in Puerto Rico; however, this hypothesis has not been tested. Positive selection was not identified, in contrast with previous analyses (14,22-24). Others have not detected positive selection and attribute lineage extinctions or clade replacements to stochastic events rather than natural selection (25). Further analysis to detect site-specific selection is needed to confirm that positive selection is not at play in these populations of viruses.
The period 1999-2003 represents historically low rates of DENV-2 circulation (Figures 1, 2, 4), and the epidemiologic and phylogenetic aspects of this transient retreat had not been studied previously. We show that the genetic variability of DENV-2 decreased during these 4 years, when the virus was transmitted in only a subset of municipalities. DENV-2 represented 29% of the cases in this area but only 5% island-wide. The reason this region became a refuge of DENV-2 for 4 years remains unclear, but the low incidence of DENV-2 in prior years compared with the rest of the island suggests susceptibility to infection in this population (Figure 5). Studies in Thailand showed serotype displacement affecting population diversity and lineage turnover (26). Short-term serotype cross-protection has been suggested to contribute to serotype displacements (27-29), implying that as DENV-3 infected a large susceptible population, cross-protective antibodies momentarily impeded transmission of other serotypes and dissemination of DENV-2 outside the eastern refuge. Our study confirms the utility of systematic sampling and genome sequencing in large-scale surveillance systems as ways to understand the dynamics of dengue transmission and endemicity.

Figure 5. Solid blue line, incidence of DENV-2 within the refuge region; dashed blue line, incidence of DENV-2 in the rest of the island outside the refuge region; solid black line, incidence of DENV-3 within the DENV-2 refuge region; dashed black line, incidence of DENV-3 in the rest of the island outside the refuge region. Incidence was calculated as number of confirmed, positive cases of each serotype per thousand residents.
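The incidence measure described above (confirmed, positive cases of a serotype per thousand residents) can be sketched as follows. The case counts and population sizes are made-up placeholders, not surveillance data, and the function name is hypothetical.

```python
def incidence_per_thousand(confirmed_cases, residents):
    """Confirmed cases of one serotype per 1,000 residents."""
    if residents <= 0:
        raise ValueError("residents must be positive")
    return 1000.0 * confirmed_cases / residents


# Placeholder numbers for illustration only (not from the study).
refuge = incidence_per_thousand(confirmed_cases=45, residents=300_000)
rest = incidence_per_thousand(confirmed_cases=70, residents=3_500_000)
print(f"refuge: {refuge:.3f}, rest of island: {rest:.3f} per 1,000 residents")
```

Normalizing by resident population in this way is what allows a small refuge region to be compared directly with the rest of the island.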