Distinct Zika Virus Lineage in Salvador, Bahia, Brazil

Sequencing of isolates from patients in Bahia, Brazil, where most Zika virus cases in Brazil have been reported, resulted in 11 whole and partial Zika virus genomes. Phylogenetic analyses revealed a well-supported Bahia-specific Zika virus lineage, which indicates sustained Zika virus circulation in Salvador, Bahia’s capital city, since mid-2014.

Zika virus is an arthropodborne RNA virus primarily transmitted by mosquitoes of the genus Aedes (1). The virus has 2 genotypes: African, found only on the continent of Africa; and Asian, associated with outbreaks in Southeast Asia, several Pacific islands, and, recently, the Americas (2). In May 2015, Brazil reported its first autochthonous cases of Zika virus infection, which occurred in northeast Brazil (3,4). As of June 30, 2016, all 27 federal states in Brazil had confirmed Zika virus transmission (http://www.paho.org/hq/index.php?option=com_docman&task=doc_view&Itemid=270&gid=35262&lang=en).
The rapid geographic expansion of Zika virus transmission and the virus's association with microcephaly and congenital abnormalities (5) demand a rapid increase in molecular surveillance in areas that are most affected. Molecular surveillance is particularly relevant for regions where other mosquitoborne viruses, particularly dengue and chikungunya viruses, co-circulate with Zika virus (2); surveillance on the basis of clinical symptoms alone is highly inaccurate. Genetic characterization of circulating Zika virus strains can help determine the origin and potential spread of infection in travelers returning from Zika virus-endemic countries. Previous analyses have suggested that Zika virus was introduced in the Americas at least 1 year before the virus's initial detection in Brazil (1). The state of Bahia, Brazil, reported most (93%) suspected Zika virus infections in Brazil during 2015 (2), including cases of Zika virus-associated fetal microcephaly (6); however, except for 1 complete genome, no genetic information from the region has been available (2,7). We report molecular epidemiologic findings resulting from 11 new complete and partial Zika virus genomes recovered from serum samples from patients at the Hospital Aliança in the city of Salvador in Bahia, Brazil.

The Study
Symptomatic patients with suspected Zika virus infection were enrolled in a research study approved by the Brazil Ministry of Health (Certificado de Apresentação para Apreciação Ética 45483115.0.0000.0046, no. 1159.184, Brazil). During April 2015–January 2016, acute Zika virus infection was diagnosed for 15 patients whose serum samples tested positive in a qualitative reverse transcription PCR (RT-PCR) with primers targeting the nonstructural 5 gene (8). Clinical samples were retested for Zika virus positivity by using a separate quantitative RT-PCR (QuantiTect SYBR Green PCR kit; QIAGEN, Valencia, CA, USA) and primers targeting the envelope gene (9). Metagenomic next-generation sequencing libraries were constructed from serum RNA extracts, as described (10,11; online Technical Appendix, http://wwwnc.cdc.gov/EID/article/22/10/16-0663-Techapp1.pdf). Pathogen identification from metagenomic next-generation sequencing data was performed by using the Sequence-based Ultra-Rapid Pathogen Identification bioinformatics pipeline (12; http://chiulab.ucsf.edu/surpi/). Results of the metagenomic analyses and identification of co-infections with chikungunya virus are reported elsewhere (13).
For Zika virus genome sequencing, 2 isolates (Bahia07 and Bahia09; Table) with Zika virus titers >10^4 copies/mL generated sufficient viral metagenomic data for complete genome assembly. For the remaining samples with lower titers, metagenomic next-generation sequencing libraries were enriched for Zika virus sequencing by using xGen biotinylated lockdown capture probes (Integrated DNA  (Table). Distribution of single nucleotide variants across the 11 recovered genomes exhibited distinct patterns (online Technical Appendix Figure 1), indicating that the assembled genomes were unlikely to result from cross-contamination by a single high-titer Zika virus sample. Multiple sequence alignment was performed by using MAFFT version 7 (http://mafft.cbrc.jp/alignment/software/); maximum-likelihood (ML) and Bayesian phylogenetic inferences were determined by using PhyML version 3.0 (http://www.atgc-montpellier.fr/phyml/) and BEAST version 1.8.2 (http://beast.bio.ed.ac.uk/), respectively. The best-fit model was calculated by using jModelTest2 (https://github.com/ddarriba/jmodeltest2; details in online Technical Appendix). Coding regions corresponding to the 11 complete or partial genomes from Bahia were aligned with all published and available near-complete Zika virus genomes and longer subgenomic regions (>1,500 nt) of the Asian genotype as of April 2016 (mean sequence size 8,402 nt with 1,652 distinct nucleotide site patterns). The ML phylogeny was reconstructed by using the best-fit general time-reversible nucleotide substitution model with a proportion of invariant sites (GTR+I). Statistical support for phylogenetic nodes was assessed by using a bootstrap approach with 1,000 bootstrap replicates.
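Two quantities in the paragraph above, the number of distinct nucleotide site patterns and a bootstrap replicate, can be illustrated with a minimal Python sketch. This is not the authors' pipeline (they used MAFFT and PhyML); the function names `site_patterns` and `bootstrap_replicate` and the toy alignment are hypothetical and serve only to show what is being counted and resampled.

```python
import random
from collections import Counter


def site_patterns(alignment):
    """Count how often each alignment column (site pattern) occurs.

    `alignment` is a list of equal-length aligned sequences (strings).
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) yields one tuple of characters per column.
    return Counter(zip(*alignment))


def bootstrap_replicate(alignment, rng):
    """Build one nonparametric bootstrap replicate.

    Columns are sampled with replacement until the replicate has the
    same number of sites as the original alignment.
    """
    n_sites = len(alignment[0])
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in cols) for seq in alignment]


# Toy alignment (hypothetical data, not Zika virus sequences).
toy = ["ACGTACGA", "ACGTACGA", "ACGAACGT", "ACCAACGT"]
patterns = site_patterns(toy)
n_distinct = len(patterns)

# One replicate; a bootstrap analysis would build a tree from each of
# many such replicates and report how often each clade recurs.
rep = bootstrap_replicate(toy, random.Random(1))
```

In a real analysis, each replicate alignment is passed to the tree-building program, and the bootstrap support of a node is the percentage of replicate trees that contain the same clade.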
A Bayesian molecular clock phylogeny was estimated by using the best-fitting evolutionary model (2); specifically, a GTR+I substitution model with 3 components: a strict molecular clock, a Bayesian skyline coalescent prior, and a noninformative continuous time Markov chain reference prior for the molecular clock rate.
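The idea behind a strict molecular clock can be illustrated with a much simpler technique than the Bayesian analysis described above: a root-to-tip regression. Under a strict clock, the genetic distance from the root to each sampled sequence grows linearly with sampling date, so the slope of a least-squares line estimates the substitution rate and the x-intercept estimates the TMRCA. The sketch below is an assumption-laden illustration, not the BEAST model used in the study; the function name `root_to_tip_regression` and all numeric values are hypothetical.

```python
def root_to_tip_regression(dates, distances):
    """Fit distance = rate * (date - tmrca) by ordinary least squares.

    dates     : sampling dates in decimal years
    distances : root-to-tip distances (substitutions per site)
    Returns (rate, tmrca).
    """
    n = len(dates)
    if n != len(distances) or n < 2:
        raise ValueError("need at least 2 paired observations")
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    if sxx == 0:
        raise ValueError("sampling dates must not all be identical")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    rate = sxy / sxx
    if rate <= 0:
        raise ValueError("no positive clock signal in the data")
    intercept = mean_d - rate * mean_t
    tmrca = -intercept / rate  # date at which the fitted distance is 0
    return rate, tmrca


# Hypothetical, noise-free data generated from an assumed rate and root date.
assumed_rate = 1.0e-3   # substitutions/site/year (illustrative value only)
assumed_root = 2014.5   # decimal year (illustrative value only)
dates = [2015.3, 2015.5, 2015.8, 2016.0]
distances = [assumed_rate * (t - assumed_root) for t in dates]

rate, tmrca = root_to_tip_regression(dates, distances)
print(f"rate = {rate:.2e} substitutions/site/year, TMRCA = {tmrca:.2f}")
```

Unlike this regression, the Bayesian approach integrates over tree topologies and, through the coalescent prior and the prior on the clock rate, yields a posterior distribution of node dates rather than a single point estimate.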
The isolates from patients in Salvador clustered together within 1 strongly supported clade (posterior probability 1.00, bootstrap support 100%, Bahia clade C) (Figure; online Technical Appendix Figure 2). This support is notable; most Zika virus genomes in this clade are incomplete, and uncertainty is accounted for in phylogenetic inference. The tree topology accords with previous findings (2,4,5), and time to most recent common ancestor (TMRCA) of the epidemic in the Americas is similar to that previously estimated (2) (American epidemic clade A; Figure). The overall ML and molecular clock phylogenies exhibited many well-supported internal nodes with bootstrap support >60% and posterior probability >0.80 (Figure; online Technical Appendix Figure 2), although several nodes near the ancestor of clade A were less well supported.   Figure 2). The patient denied history of travel, suggesting that multiple Zika virus lineages may circulate in Bahia.

Conclusions
Our results suggest an early introduction and presence (mid-2014) of Zika virus in the Salvador region in Bahia, Brazil. Given the size of the cluster and statistical support for it, this lineage likely represents a large and sustained chain of transmission within Bahia state. Most cases of this Zika virus lineage clustered closely to a sequence from Maranhão, and we found evidence for an additional potential introduction to Bahia from Pará state. Consequently, Zika virus in Salvador during mid-2014 was likely introduced from other regions in Brazil rather than from outside the country. Current findings of Zika virus emergence in Bahia state during mid-2014 are consistent with first-trimester viral infection in pregnant women corresponding to the initial reported cases of fetal microcephaly, which began in January 2015 (5) and peaked in November 2015.
Broader sampling across Bahia is needed to determine whether the Salvador lineage (clade C) identified in this article accounts for most Zika virus cases in the state. Brazil currently faces a major public health challenge from cocirculation of Zika, dengue, and chikungunya viruses (2–4,14,15). Additional molecular surveillance in the Americas and beyond is urgently needed to trace and predict transmission of Zika virus.