Evolutionary and Ecological Characterization of Mayaro Virus Strains Isolated during an Outbreak, Venezuela, 2010

This virus commonly infects persons residing near enzootic transmission foci because of anthropogenic incursions.


RESEARCH
In 2010, an outbreak of febrile illness with arthralgic manifestations was detected at La Estación village, Portuguesa State, Venezuela. The etiologic agent was determined to be Mayaro virus (MAYV), a reemerging South American alphavirus. A total of 77 cases was reported and 19 were confirmed as seropositive. MAYV was isolated from acutephase serum samples from 6 symptomatic patients. We sequenced 27 complete genomes representing the full spectrum of MAYV genetic diversity, which facilitated detection of a new genotype, designated N. Phylogenetic analysis of genomic sequences indicated that etiologic strains from Venezuela belong to genotype D. Results indicate that MAYV is highly conserved genetically, showing ≈17% nucleotide divergence across all 3 genotypes and 4% among genotype D strains in the most variable genes. Coalescent analyses suggested genotypes D and L diverged ≈150 years ago and genotype diverged N ≈250 years ago. This virus commonly infects persons residing near enzootic transmission foci because of anthropogenic incursions.
T he family Togaviridae, genus Alphavirus, includes several major mosquito-borne pathogens. Alphaviruses can be categorized as Old or New World viruses, which are generally associated with febrile illnesses with either arthralgic or encephalitic syndromes, respectively (1,2). Mayaro virus (MAYV), an exceptional arthralgic New World alphavirus, produces Mayaro fever, which has signs and symptoms similar to those of dengue fever, including an acute febrile illness of 3-5 days' duration, typically with headache, retro-orbital pain, arthralgia, myalgia, vomiting, diarrhea, and rashes (3,4). Previous work has shown that some patients have persistent, severe joint pains for up to 1 year (5).
MAYV has a single-strand, positive-sense RNA genome ≈11,700 nt in length. The first two thirds of the genome encodes 4 nonstructural proteins (NSP1-4) and the other one third encodes the structural proteins (capsid, envelope [E] 3, E2, 6K/TF, and E1). Previous phylogenetic studies using a fragment of the structural polyprotein open reading frame suggest that MAYV occurs in 2 distinct genotypes, D and L (6). Genotype D includes isolates from all countries where MAYV has been detected, and genotype L contains strains detected only in Brazil (6).
The MAYV enzootic transmission cycle is not fully characterized. Previous studies suggest that it circulates between canopy-dwelling mosquitoes of the genus Haemagogus and nonhuman primates (6,21). Consequently, MAYV human seropositivity is largely associated with forest workers and hunters (18). MAYV has the potential to cause large outbreaks, as demonstrated in Brazil, where ≈800 persons were affected (10). Furthermore, Aedes aegypti mosquitoes are moderately competent vectors (22), which suggests that an urban human-mosquito-human transmission cycle could emerge, as has occurred for dengue, chikungunya, and yellow fever viruses with similar enzootic forest cycles (23).
In January 2010, an outbreak of Mayaro fever in Venezuela occurred in La Estación village, Portuguesa State. By June 4, a total of 77 cases were recorded, which represents one of the largest outbreaks detected in South America. To understand the origins, evolution, and ecoepidemiology of MAYV, we sequenced complete genomes of 6 strains isolated during this outbreak and 21 additional isolates, which represented the full known spectrum of MAYV genetic diversity. We performed robust phylogenetic analyses on our complete genome data and previously published partial E2-E1 sequences.

Study Site and Outbreak Details
The study protocol was approved by the Naval Medical Research Center and Naval Medical Research Unit No. 6 Institutional Review Boards (protocols NMRCD.2000.0006, 2010.0010, and NAMRU6.2012.0016). Approval was given in compliance with all applicable federal regulations governing the protection of human subjects.
La Estación village is located at the northwestern corner of the municipality of Ospino, within Portuguesa State, Venezuela ( Figure 1). It is a rural village with a population of 9,538 persons (average age 26 years). In the first quarter of 2010, an outbreak of a febrile illness with arthralgic manifestations was detected. Active surveillance was initiated by the Ministry of Health environmental health team.

Virus Isolation and Identification
Viruses were isolated from human serum samples and identified on Vero E6 cell monolayers as described (16) by using a panel of polyclonal antibodies against alphaviruses and flaviviruses, followed by a MAYV-specific monoclonal antibody (MIAF TRVL4675). Six viruses were isolated from acute-phase serum samples of symptomatic patients. Signs and symptoms and other patient details are described in the results.

Virus Propagation, Reverse Transcription PCR Amplification, and Sequencing
The 6 virus isolates from Venezuela were sequenced by using the Illumina HiSeq1000 platform (Illumina Inc., San Diego, CA, USA) as described (24). To facilitate a more detailed phylogenetic comparison, we determined complete genome sequences for 21 additional MAYV isolates from the World Reference Center for Emerging Viruses and Arboviruses Collection at the University of Texas Medical Branch (Galveston, TX, USA) ( Table 1). Viruses were propagated once in Vero cells, and RNA was extracted from cell culture supernatant by using TriZol LS (Life Technologies, Carlsbad, CA, USA) according to the manufacturer's recommendations. Complete genome sequences were obtained by using reverse transcription PCR (RT-PCR) amplification and sequencing of 6 overlapping RT-PCR amplicons (primer sequences available upon request). RT-PCRs were performed by using the Titan One-Step RT-PCR Kit (Roche Diagnostics, Indianapolis, IN, USA). PCR amplicons were visualized, excised, purified, and sequenced as described (25). Sequences were submitted to GenBank under accession nos. KP842794-KP842820.

Sequence Analysis
Nucleotide sequences were aligned with MAYV sequences available in GenBank by using Clustal X (http://bips.ustrasbg.fr/fr/Documentation/ClustalX/), and manually adjusted by using Se-Al (http://tree.bio.ed.ac.uk/software/ seal/). Twenty-nine genomic sequences (i.e., 27 determined in this study and 2 obtained from GenBank) were manually aligned, and untranslated terminal sequences were removed. A second dataset consisting of all partial E2-E1 envelope glycoprotein sequences (n = 68) was also analyzed. All sequences were confirmed as being nonrecombinant by using Recombination Detection Program version 4 (26).
The presence and nature of selective pressures acting on the MAYV genome were assessed by using methods available in Datamonkey (27), including the single-likelihood ancestor counting (SLAC) and internal fixed effects likelihood (IFEL) methods. Positive and negative selection events at each codon were determined.

Phylogenetic Analysis
Maximum-likelihood (ML) phylogenetic trees were constructed by using the best-fit general time reversible + gamma 4 + invariable sites model, which was identified by using MODELTEST version 3.7 (28). Bootstrapping was performed to assess robustness of topologies by using 1,000 replicate neighbor-joining trees under the ML substitution model. Analyses were performed with PAUP* version 4.0b (Sinauer Associates, Inc., Sunderland, MA, USA).

Coalescent Analysis
Bayesian coalescent analyses were performed by using a general time reversible + gamma 4 nucleotide substitution model, an uncorrelated lognormal molecular clock model, and a Bayesian Skyline population growth model. To ensure statistical efficiency, we applied a Bayesian stochastic search variable selection procedure (29). Inferences were obtained by using a Bayesian Markov chain Monte Carlo approach (30) run for 100 million generations with a 10% burn-in period and sampling every 10,000 states. Tracer version 1.5 (http://tree.bio.ed.ac.uk/software/tracer/) was used to monitor stationarity and efficient mixing. TreeAnnotator version 1.8.0 (http://beast.bio.ed.ac.uk) was used to summarize the posterior tree distribution, and FigTree version 1.3.1 (http://beast.bio.ed.ac.uk) was used to visualize the annotated maximum clade credibility (MCC) tree.

Outbreak Investigation and Virus Isolation
During January-June 4, 2010, a total of 77 clinical cases compatible with MAYV infection were reported from La Estación. Fifty (65%) were in female patients and 27 (35%) in male patients; >50% of patients were homemakers and laborers, and 38 (49%) were 25-54 years of age (31). MAYV was isolated from 6 of 19 acute-phase serum samples, all obtained from symptomatic case-patients. Signs and symptoms for these 6 patients (3 male patients and 3 female patients; age range 15-73 years) are shown in Table  2. Although all 6 patients had arthralgia, several did not have all characteristic signs and symptoms, such as headache (3/6) and nausea and vomiting (4/6). One case-patient (a 73-year-old woman) did not have fever.

Sequence Divergence and Phylogenetic Analysis
In addition to sequencing the complete genomes of the 6 outbreak strains from Venezuela, we also sequenced complete genomes for 21 strains representing the full spectrum of genetic diversity, and known temporal and spatial distributions of MAYV (Table 1). Percentage nucleotide and amino acid sequence identities relative to outbreak strain 16A across all 9 genes of the genome for 10 strains that represent the full spectrum of genetic diversity determined for MAYV in this study are shown in Table 3. Analysis of nucleotide and amino acid sequence identities showed that MAYV is highly conserved (nucleotide sequence identities 96.4%-100% and amino acid sequence identities 97.7%-100%) among genotype D strains ( Table 3). Comparison of genotype L with genotype D showed greater divergence (nucleotide sequence identities 83.8%-88.6% and amino acid sequence identities 90.9%-97.4%) for all genes. An ML phylogeny based on the complete genome sequences of 29 MAYV strains and with a topology similar to that reported by Powers et al. (6) for partial E2-E1 sequences is shown in Figure 2. These results suggested that the trees based on partial genomes are sufficiently resolved for detailed coalescent analyses. A Bayesian MCC tree based on the partial E2-E1 fragment (nt 9,412-11,139 for DQ001069) from 68 sequenced MAYV strains, and with posterior probabilities indicated at relevant nodes is shown in Figure 3. The color of each lineage represents the most probable geographic location for the hypothetical ancestor at the node representing it. This phylogeny was consistent with the complete genome tree (Figure 2) and the phylogeny of Powers et al. (6). In addition to the previously described distinction between genotype D and L strains, there was some evidence of geographic structure within genotype D (Figure 3). In both our ML and MCC phylogenies, the 2010 outbreak sequences from Venezuela grouped within genotype D (posterior probability >0.99), and were most closely related to strains from Peru.
In addition to the previously described genotypes, a recently isolated (2010) strain from Peru (FMD3213) was intermediate between genotypes D and L and showed strong statistical support in both phylogenies. Given its year of isolation, strong support for its position in both phylogenies, and intermediary genetic distance from both genotypes D and L (Table 3), this strain might represent a previously undetected genotype, which we designate N.

Selection Analysis
A mean global nonsynonymous:synonymous (dN:dS) ratio of 0.057 calculated by using the SLAC algorithm indicated that, similar to other arboviruses, purifying selection is the predominant evolutionary force driving MAYV evolution. This result was supported by the 274 codons found by SLAC and the 539 sites detected by IFEL to be under purifying selection. One codon was identified as being under positive selection by using SLAC and 5 by using IFEL (p<0.1). None of the 5 sites detected by IFEL delineated the 2010 outbreak strains, but these 5 sites instead primarily defined genotype L.
Estimated dates of divergence for selected lineages are shown in Table 4. The most recent common ances-

MAYV is a major emerging pathogen in northern South
America and causes sporadic outbreaks of arthralgic disease in the Brazilian Amazon and eastern Bolivia. To date, these outbreaks have been relatively small, except for the epidemic in Belterra, Brazil, in 1977-1978. However, antibody detection and virus isolation rates indicate that MAYV commonly infects persons residing near enzootic transmission foci, and high incidence rates have been detected by using clinical surveillance (32). The Mayaro fever outbreak at La Estación in 2010 is noteworthy because, with the exception of a family found to be seropositive (20), it probably represents the first outbreak documented in Venezuela. Also, with 77 reported cases, it is one of the largest outbreaks ever described. The signs and symptoms recorded corresponded to an influenza-like illness with arthralgia in most cases. One case-patient, a woman who was confirmed to be MAYV positive, had persistent arthralgia 1 month after infection. Cases of Mayaro fever are grossly underestimated in South America because extensive overlap in signs and symptoms means they typically fall under the dengue umbrella. It is essential that countries in South America test for MAYV when patients have dengue-like illness that also involves arthralgia to determine the true incidence of MAYV infection.
A larger proportion (65%) of cases were in female patients during this outbreak. Previous studies in Brazil and Bolivia have shown no sex bias during outbreaks of Mayaro fever (9,10), but a recent clinic-based surveillance study in Bolivia and Peru demonstrated that male patients were more likely to be MAYV infected, probably because of occupational exposure (32). It is unclear why we observed the opposite sex bias during the outbreak in La Estación. Further information on occupational exposure at this location is necessary to better understand the demographics of this outbreak of Mayaro fever.
La Estación is located in a former tropical forest that has recently been converted for coffee production and other farming (33). Several monkey species (i.e., Cebus olivaceus and Alouatta seniculus) and competent MAYV vectors (i.e., Haemagogus mosquitoes) are present within this village (33,34), which suggests enzootic circulation near human residences and work locations, possibly enhanced by encroachment into forests by local residents. This conclusion is supported by our data, which indicate that 50% of female case-patients, including those with higher antibody titers, primarily performed home activities; 37.5% of seropositive male patients performed coffee agricultural activities; and 63% of all seropositive persons resided near the coffee plantation. Nonsynonymous mutations that defined a specific group, clade, or lineage were identified manually. Uninformative mutations were not counted; only synapomorphies that were unique to a group or cluster were noted. There were 143 nonsynonymous synapomorphic mutations of interest, of which 114 were unique to genotype L. The 5 amino acid positions that were detected to be under positive selection and to delineate the genotype L strains were Leu→Ala/Val at position 518 in NSP1; Ala→Prol and Val→Thr at positions 298 and 386 in NSP3, respectively; Ala→Lys at position 249 in NSP4; and Leu→Thr at position 300 in the E1 protein.
In addition, strains from the outbreak in Venezuela in 2010 were also defined by 5 nonsynonymous mutations: Ser→Gly and Pro→Leu at positions 487 and 523 in NSP1, respectively; Val→Ile and His→Tyr at position 586 and 665 in NSP2, respectively; and Ile→Leu at position 156 in the E1 protein. However, these mutations were not identified as being subject to positive selection. Also, none of these unique mutations is in a genomic region known to affect the virulence or transmissibility of alphaviruses, and these amino acid substitutions are mostly conservative in nature. Whether these substitutions played any role in the emergence of MAYV is unclear. Reverse genetic studies are needed to determine if any of these substitutions cause major phenotypic changes.
In addition to the previously described MAYV genotypes (6), we detected a new genotype N that is genetically distinct and has strong phylogenetic support. We also further delineated several clades that segregated by geographic region (Figure 3). The estimate for the origin of MAYV strains in South America (i.e., 670 years [95% HPD 441-912 years] before 2013) should be interpreted with caution because a recent study of Venezuelan equine encephalitis complex alphaviruses showed that time of MRCA estimates for ancestral nodes dating more than a few hundred years ago is influenced by internal branch compression resulting from strong purifying selection (N.L. Forrester et al., unpub. data). Our coalescent analyses also estimated that genotypes D and L diverged ≈107 150 years before 2013, which suggests a relatively recent origin for these genotypes.
The apparent restriction of genotype L to Brazil, when compared with the wider distribution of genotype D, suggests geographic constraints on MAYV dispersal within Brazil. Because there is no apparent geographic barrier that can account for this distribution, the lack of genotype L strains from other countries might represent sampling bias, rather than a true population subdivision. However, we cannot exclude the possibility that potential restrictions associated with vector competence, vector distributions, or alternative vertebrate amplification hosts might affect this apparent population subdivision. The 6 virus sequences from the outbreak in Venezuela in 2010 (the only representatives from Venezuela) grouped as a monophyletic clade within genotype D and showed strong support as determined by genomic and partial sequences. Because genotype D strains show genetic diversity derived predominantly from viruses in Peru, it is not surprising that strains from Peru occupied positions basal to strains from the outbreak in Venezuela in 2010. Although there was strong support for the ancestral strain to have been derived from strains in Peru in 2010, this finding might also have resulted from sampling bias. There is a dearth of MAYV isolates and sequences from several intervening countries, including Brazil, Ecuador, and Colombia, and recent virus sequences from these countries are needed to identify whether MAYV is maintained continuously in Venezuela or if virus spread influences emergence. For example, Ecuador and Colombia might form the ecologic bridge between strains derived from Venezuela and the ancestral strains from Peru. Persons with antibodies against MAYV have been reported among Ecuadorian soldiers living in the Amazon (18), which supports this hypothesis.
Recent work on alphaviruses from Venezuela showed that a 1999 MAYV sequence grouped most closely with sequences from Trinidad in the Brazilian genotype D clade (35). However, these results were based only on E1 3′-untranslated region sequences, and their sequences could not be included in our analyses. Given the position of the 2010 sequences from Venezuela in our phylogeny, it appears that the 2010 outbreak strains did not descend from MAYV isolates previously circulating in Venezuela (i.e., at least since 1999). However, it is also possible that there is co-circulation or regionally independent evolution of genetically distinct MAYV strains within Venezuela. Yellow fever virus has a similar enzootic transmission cycle and was recently shown to undergo regionally independent evolution in Venezuela (36), which supports this hypothesis.
Analysis of sequence identities among MAYV genes demonstrated a high degree of conservation, similar to that seen for North American eastern equine encephalitis virus (37) and western equine encephalitis virus (38). These alphaviruses have relatively recent estimated times of MRCAs, occupy specific and similar ecologic niches, and appear to undergo continuous evolution without major subdivision by geography or time. The MAYV tree topologies we observed also suggested continuous circulation within a distinct ecologic niche in South America. Eastern and Western equine encephalitis viruses have avian hosts, which might account for the lack of population subdivision by geography. In contrast, yellow fever virus, which is maintained by nonhuman primates, has a more geographically structured phylogeny (39). Despite the observation of high seroprevalence in nonhuman primates and isolation of MAYV from primatophilic mosquitoes (6,10), our tree topology and genetic conservation in MAYV suggest that birds or other highly mobile hosts might play some role in its dispersal. This finding could account for reduced geographic structure and observed branching patterns in MAYV phylogeny. Previous work has shown that MAYV replicates efficiently in avian cell cultures, such as Peking duck kidney cells and chick embryonic fibroblasts (40), and can achieve titers as high as 5.7-7 logs, suggesting that birds could be suitable reservoir hosts. However, further studies are necessary to determine if MAYV can replicate at the increased temperatures (≈43°C) in birds during levels of high activity. We hypothesize that during epizootics/ epidemics, nonhuman primates are spillover hosts that might be especially susceptible to MAYV infection. The inability to continuously isolate or detect MAYV from Haemagogus spp. mosquitoes and nonhuman primate hosts in diseaseendemic areas is consistent with the hypothesis that MAYV is maintained in a transmission cycle involving other vertebrate reservoir hosts.
This outbreak in Venezuela indicates that MAYV is a major emerging pathogen in South America, where persons residing near enzootic transmission foci might be at increased risk because of anthropogenic incursions. Strains from the outbreak in Venezuela in 2010 belong to genotype D and are distinct in the MAYV phylogeny. The single isolate from Puerto Maldonado in the Madre de Dios region of Peru in 2010 represents the only genotype N strain. Further surveillance at this and neighboring locations would be useful to determine the true extent of the genetic diversity of genotype N. Filling the current gaps in sequence data for geographic and temporal distributions of MAYV is needed to increase the phylogenetic resolution and aid our understanding of the evolution and spread of this emerging arthralgic alphavirus in the Americas.