Genetic and Serologic Properties of Zika Virus Associated with an Epidemic, Yap State, Micronesia, 2007

Zika virus (ZIKV) is a mosquito-transmitted virus in the family Flaviviridae and genus Flavivirus. It was initially isolated in 1947 from blood of a febrile sentinel rhesus monkey during a yellow fever study in the Zika forest of Uganda (1). The virus was subsequently isolated from a pool of Aedes africanus mosquitoes collected in 1948 from the same region of the Zika forest; a serologic survey conducted at that time showed that 6.1% of the residents in nearby regions of Uganda had specific antibodies to ZIKV (1,2).
Over the next 20 years, several ZIKV isolates were obtained from Aedes spp. in Africa (Ae. africanus) and Malaysia (Ae. aegypti), implicating these species as likely epidemic or enzootic vectors (3-5). Several ZIKV human isolates were also obtained in the 1960s and 1970s from East and West Africa during routine arbovirus surveillance studies in the absence of epidemics (6-8). Additional serologic studies in the 1950s and 1960s detected ZIKV infections among humans in Egypt, Nigeria, Uganda, India, Malaysia, Indonesia, Pakistan, Thailand, North Vietnam, and the Philippines (5). These data strongly suggest widespread occurrence of ZIKV from Africa to Southeast Asia west and north of the Wallace line.
In April 2007, an epidemic of rash, conjunctivitis, and arthralgia was noted by physicians in Yap State, Federated States of Micronesia (11). Laboratory testing with a rapid assay suggested that a dengue virus (DENV) was the causative agent. In June 2007, samples were sent for confirmatory testing to the Arbovirus Diagnostic Laboratory at the Centers for Disease Control and Prevention (CDC, Fort Collins, CO, USA). Serologic testing by immunoglobulin (Ig) M-capture ELISA with DENV antigen confirmed recent flavivirus infection in several patients. Testing by reverse transcription-PCR (RT-PCR) with flavivirus consensus primers generated DNA fragments that, when subjected to nucleic acid sequencing, demonstrated ≈90% nucleotide identity with ZIKV. These findings indicated that ZIKV was the causative agent of the Yap epidemic. We report serologic parameters of the immune response among ZIKV-infected humans, data on estimated levels of viremia, and the complete coding region nucleic acid sequence of ZIKV associated with this epidemic.

Analysis of Patient Samples
Details of the epidemic, including clinical and laboratory findings for all patients, will be reported elsewhere (M.R. Duffy et al., unpub. data). A subset of ZIKV-infected patients for whom acute- and convalescent-phase paired serum specimens had been collected was analyzed by using several serologic assays to evaluate the extent of cross-reactivity to several related flaviviruses. Patients were classified as primary flavivirus/ZIKV infected or secondary flavivirus/ZIKV probable infected. Primary flavivirus/ZIKV-infected patients were those in whom acute-phase serum specimens (<10 days) had no detectable antibodies (by IgG ELISA and plaque reduction neutralization test [PRNT]) to any of the heterologous flaviviruses tested (Tables 1, 2) and were either IgM positive in their acute-phase specimen or IgM and IgG positive for ZIKV in a convalescent-phase specimen (seroconversion). Secondary flavivirus/ZIKV probable-infected patients were those who had detectable antibodies to >1 heterologous flaviviruses in their acute-phase specimen and were also IgM positive for ZIKV in their acute-phase specimen, or IgM and IgG positive for ZIKV in their convalescent-phase specimen. The designation "ZIKV probable" was used because secondary flavivirus infections demonstrate extensive cross-reactivity with other flaviviruses and, in some cases, higher serologic reactivity to the original infecting flavivirus ("original antigenic sin" phenomenon). Thus, in secondary flavivirus infections shown in Tables 1 and 2, serologic data alone are insufficient to confirm ZIKV as the recently infecting flavivirus. However, these secondary flavivirus/ZIKV probable infections were likely recent ZIKV infections because ZIKV was the only virus detected during the epidemic in Yap, a relatively small and isolated island (11).

Nucleic Acid Sequencing and Phylogenetic Analysis
RNA was extracted from patient samples that demonstrated the highest concentration of ZIKV RNA determined by the real-time assay, and for which sufficient sample volume was available (patients 824, 037, 830a, and 958). Briefly, RNA was extracted from 150 μL of serum by using the QIAamp Viral RNA Mini Kit (QIAGEN), and RNA was eluted with 75 μL of RNase-free water. A series of RT-PCRs was performed with each RNA preparation by using primer pairs designed to generate overlapping DNA fragments that spanned the entire polyprotein coding region of the virus. Primers were designed by using the ZIKV MR 766 prototype virus coding region sequence (GenBank accession no. AY632535) and the PrimerSelect software module of the LaserGene package (DNASTAR Inc., Madison, WI, USA). Several primers initially failed to amplify because of sequence mismatches between ZIKV MR 766 and ZIKV Yap 2007. Therefore, primers were redesigned by using newly generated DNA sequence data, and a "genome walking" approach was used to derive complete coding region sequence data. The complete list of amplification and sequencing primers is available upon request.
All RT-PCRs were performed with 10 μL of RNA by using the OneStep RT-PCR Kit (QIAGEN) following the manufacturer's protocol. DNAs were analyzed by 2% agarose gel electrophoresis, and bands of the predicted size were excised from the gel and purified by using the QIAquick Gel Extraction Kit (QIAGEN). Purified DNAs were subjected to nucleic acid sequence analysis with sequencing primers spaced ≈500 bases apart on both strands of the DNA fragments by using the ABI BigDye Terminator V3.1 Ready Reaction Cycle Sequencing Mixture (Applied Biosystems). Nucleotide sequence was determined by capillary electrophoresis by using the ABI 3130 genetic analyzer (Applied Biosystems) following the manufacturer's protocol. Raw sequence data were aligned and edited by using the SeqMan module of LaserGene (DNASTAR Inc.). Because of insufficient sample volume, no single patient's RNA was sufficient to generate DNA that included the entire coding region. Therefore, DNA data obtained from 4 patients were combined to generate a consensus sequence hereafter designated the ZIKV 2007 epidemic consensus (EC) sequence (GenBank accession no. EU545988).
The complete coding region of ZIKV 2007 EC or the nonstructural protein 5 (NS5) gene subregion was aligned with all available flavivirus sequences in GenBank by using the Clustal W algorithm within the MEGA version 4 software package (www.megasoftware.net). Phylogenetic trees were constructed by using either the complete coding region or the NS5 region because a large number of NS5 sequences were available in GenBank and trees for the NS5 region have been constructed (16). Additional ZIKV strains from the CDC/World Health Organization reference collection (strains 41662, 41524, and 41525) isolated from Aedes spp. mosquitoes collected in Senegal in 1984 were also amplified by RT-PCR in the NS5 region, subjected to nucleic acid sequencing as described above, and included in the NS5 region analysis. Trees were constructed from coding region data or from NS5 data by MEGA 4 from aligned nucleotide sequences. We used maximum parsimony, neighbor-joining, or minimum evolution algorithms with 2,000 replicates for bootstrap support of tree groupings. All trees generated nearly identical topology; only the neighbor-joining NS5 tree is shown (Figure 1).

Serologic Analysis
Tables 1 and 2 show results of analysis for IgG and IgM and PRNTs of all acute- and convalescent-phase paired specimens obtained during the epidemic. Specimens were divided into primary and secondary infections on the basis of antibody testing results of acute-phase specimens. IgM antibody response in primary flavivirus/ZIKV-infected patients was specific for ZIKV. However, all of these patients showed some limited degree of cross-reactivity with heterologous flaviviruses. Patient 830a showed IgM-positive results with DENV and Japanese encephalitis virus, whereas all patients showed equivocal results (P/N 2-3) with several of the flaviviruses tested, suggesting low levels of cross-reactivity.
PRNT90 results also showed that the neutralizing antibody response among primary flavivirus/ZIKV-infected patients was highly specific. Most convalescent-phase PRNT titers for heterologous flaviviruses were negative and rarely exceeded 10 (20 in 1 instance; patient 849b).
Most patient specimens from the Yap epidemic tested were secondary flavivirus infections as determined by criteria described for antibody to flavivirus in acute-phase specimens. A subset of these patients for whom acute- and convalescent-phase specimens were available was tested for reactivity against heterologous flaviviruses; results are shown in Tables 1 and 2. In contrast to primary flavivirus/ZIKV-infected patients, secondary flavivirus-infected patients showed a high degree of serologic cross-reactivity with other flaviviruses. Six of 7 patients were positive for IgM against >1 of the heterologous flaviviruses tested, and all demonstrated low levels of cross-reactive IgM as shown by a P/N value in the equivocal range. PRNT90 results showed that among secondary flavivirus/ZIKV-probable patients, the neutralizing antibody response was higher to ZIKV and more cross-reactive, a finding commonly observed among secondary flavivirus infections. A >4-fold PRNT90 titer difference between ZIKV and heterologous flaviviruses was observed in only 3 of the 7 patients. In all other cases, the PRNT difference between ZIKV and other flaviviruses tested was <2-fold; in 2 patients (817b and 844b) the PRNT titer was higher for 1 of the heterologous flaviviruses. The PRNT result for the acute-phase specimen from patient 847 suggests previous yellow fever virus (YFV) vaccination. The convalescent-phase specimen from patient 847 showed a high titer to YFV, a demonstration of the previously described "original antigenic sin" phenomenon observed among flaviviruses (17).
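The fold-difference comparison described above can be illustrated with a short Python sketch. It is not the authors' analysis. The function name, the example titers, and the use of a numeric stand-in for titers reported as below the lowest dilution are all hypothetical. The text prints the criterion as ">4-fold"; the sketch uses "at least 4-fold", which is an assumption about the intended comparison, and the threshold is left as a parameter.

```python
from typing import Dict


def zikv_titer_dominates(zikv_titer: float,
                         heterologous_titers: Dict[str, float],
                         min_fold: float = 4.0) -> bool:
    """Return True if the ZIKV PRNT90 titer is at least `min_fold` times
    every heterologous titer.

    Titers are reciprocal dilutions (10, 20, 40, ...). A result reported
    as "<10" must be given a numeric stand-in by the caller (5 is used
    below; that choice is an assumption, not taken from the paper).
    """
    return all(zikv_titer >= min_fold * titer
               for titer in heterologous_titers.values())


# Hypothetical titers, for illustration only.
clear_case = zikv_titer_dominates(320, {"DENV": 40, "YFV": 5})
ambiguous_case = zikv_titer_dominates(160, {"DENV": 80, "YFV": 5})
```

With these made-up values, `clear_case` is `True` and `ambiguous_case` is `False`, because 160 is only 2-fold higher than 80.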

Real-Time RT-PCR
A real-time RT-PCR was developed by using newly derived sequence data obtained from several ZIKV-infected patients. All acute-phase specimens obtained during the Yap epidemic (n = 157) were tested in this assay with 2 unique primer/probe sets. Seventeen samples were positive, 10 were equivocal, and 130 were negative (data not shown). The equivocal designation indicates that a particular sample was positive by only 1 of the 2 primer sets or showed crossing thresholds >38.5, which suggests either a false-positive result or a sample with low levels of ZIKV RNA below the defined cut-off of the assay. The viral RNA concentrations in the positive specimens were ≈900-729,000 copies/mL. Most (15 of 17) of the ZIKV-positive samples were from specimens collected <3 days after onset; however, 1 specimen (patient 958) collected on day 11 after onset was positive with an estimated titer of ≈339,000 copies/mL.
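The calling rule described above (2 primer/probe sets, a crossing-threshold cut-off of 38.5) can be written as a small Python sketch. This is an illustration under stated assumptions, not the laboratory's software: the function name is hypothetical, `None` is used here to mean that a primer/probe set gave no amplification, and a crossing threshold exactly equal to 38.5 is treated as within the cut-off.

```python
from typing import Optional

CT_CUTOFF = 38.5  # crossing-threshold cut-off stated in the text


def call_specimen(ct_set1: Optional[float], ct_set2: Optional[float]) -> str:
    """Classify one specimen from the crossing thresholds of 2 primer/probe sets.

    None means no amplification was detected for that set (assumed encoding).
    """
    detected = [ct for ct in (ct_set1, ct_set2) if ct is not None]
    if not detected:
        return "negative"
    # Positive requires signal from both sets, each at or below the cut-off.
    if len(detected) == 2 and all(ct <= CT_CUTOFF for ct in detected):
        return "positive"
    # Signal from only 1 set, or any crossing threshold above the cut-off.
    return "equivocal"


# Hypothetical crossing thresholds, for illustration only.
examples = {
    "both sets early": call_specimen(30.1, 31.0),
    "one set only": call_specimen(30.1, None),
    "both sets late": call_specimen(39.0, 39.4),
    "no signal": call_specimen(None, None),
}
```

Keeping the cut-off as a named constant makes it easy to see how many calls would change if a different threshold were chosen.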

Nucleic Acid Sequence and Phylogenetic Analysis
Several RT-PCR-positive serum specimens were selected, and RNA was amplified by RT-PCR to generate DNA sequence data for the complete coding region. Because of limited specimen volume, the complete coding region sequence was obtainable only by combining sequence data from DNA fragments generated from 4 patients. Thus, the designation EC sequence is used to indicate that the sequence was derived from multiple patients during the epidemic. The exact contribution of sequence data from each patient is available upon request. However, the following points should be noted. First, approximately 96% of the complete coding region was obtained from 3 patients; sequence data from the fourth patient were used primarily to fill in short gaps in the data. Second, ≈50% of the coding region data was derived from a complete overlap of data from >2 patients; in these overlap regions the sequence identity between different patients was ≈100%. Only 2 nucleotide differences between patients were noted within the overlapping regions, strongly suggesting that 1 ZIKV strain circulated during the epidemic.
Percentage identity over the entire coding region of ZIKV 2007 EC sequence, when compared with the prototype ZIKV (MR 766, isolated in 1947), was 88.9% and 96.5% at the nucleotide and amino acid levels, respectively. Phylogenetic trees constructed from the complete coding region of all available flaviviruses generated by a variety of methods (neighbor-joining, maximum-parsimony, or minimum-evolution) showed the same overall topology, with the ZIKV prototype and 2007 EC virus placed in a unique clade (clade 10) within the mosquito-borne flavivirus cluster previously described by Kuno et al. (16). Alignment with phylogenetic tree construction by neighbor-joining, maximum-parsimony, or minimum-evolution algorithms was also performed for the NS5 region of all available flaviviruses because extensive sequencing and phylogenetic analysis have been conducted for this region (16).
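For readers unfamiliar with how a percentage identity is obtained from an alignment, the following Python sketch shows one common definition: the share of aligned columns in which both sequences carry the same residue. It is not the procedure used in this study, the function name is hypothetical, and skipping columns that contain a gap is an assumption (other conventions count gaps as mismatches). The short sequences are toy data, not ZIKV sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of ungapped alignment columns with identical residues.

    Both inputs must already be aligned to the same length, with '-'
    marking gaps. Columns with a gap in either sequence are skipped
    (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b)
               for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Toy aligned sequences: 10 columns, 1 mismatch.
toy_identity = percent_identity("ATGAAAAACC", "ATGAAGAACC")
```

The same function works on aligned amino acid sequences, which is how nucleotide and amino acid identities can be compared for the same pair of viruses.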
Three additional ZIKV strains isolated from Senegal in 1984 and sequenced in this study were also included in a tree. This NS5 tree demonstrated similar topology to the complete coding region tree, with all ZIKVs placed within a unique clade (clade 10) along with Spondweni virus (SPOV). Figure 1 shows the NS5 tree with only mosquito-borne flaviviruses (cluster) displayed. This NS5 tree also shows that within the Zika/Spondweni clade there appear to be 3 branches among ZIKVs: Nigerian ZIKVs, prototype MR766, and 2007 Yap virus. Percentage identity among these ZIKVs confirms the tree topology, in which ZIKV 2007 EC is most distantly related to East and West African ZIKV strains (data not shown).
The predicted amino acid sequence of ZIKV 2007 EC contains the Asn-X-Ser/Thr glycosylation motif at position 154 in the envelope glycoprotein, found in many flaviviruses, yet absent by deletion in the prototype ZIKV MR 766. This region of the prototype virus, along with 3 ZIKVs isolated from Senegal in 1984, was sequenced (Figure 2). Included in this alignment is a ZIKV isolate from GenBank (accession no. AF372422). Sequencing confirmed that prototype ZIKV MR766 has a 4-aa (12-nt) deletion when compared with ZIKV 2007 EC virus and ZIKVs from Senegal.
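A motif of the form Asn-X-Ser/Thr can be located in a predicted protein sequence with a simple pattern search, sketched below in Python. This is an illustration rather than the method used in this study. The function name and the toy peptide are hypothetical, and the pattern excludes proline at the X position, which follows the usual definition of the N-linked glycosylation sequon rather than anything stated in the text.

```python
import re
from typing import List

# Asn, then any residue except Pro, then Ser or Thr. The lookahead lets
# overlapping motifs (for example "NNSS") be reported separately.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> List[int]:
    """Return the 1-based position of the Asn in each N-X-S/T motif."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Toy peptide, not ZIKV sequence: motifs begin at positions 3, 12, and 13;
# the "NPS" at position 8 is skipped because X is proline.
toy_positions = find_sequons("MKNDTAANPSGNNSS")
```

Running such a scan on two aligned envelope sequences would show whether a deletion removes the motif, which is the comparison Figure 2 presents.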

Discussion
Historically, ZIKV has rarely been associated with human disease, with only 1 small cluster of human cases in Indonesia reported (9). We report a widespread epidemic of human disease associated with ZIKV in Yap State in 2007. ZIKV epidemics may have occurred but been misdiagnosed as dengue because of similar clinical symptoms and serologic cross-reactivity with DENVs. Our serologic data indicate that ZIKV-infected patients can be positive in an IgM assay for DENVs, particularly if ZIKV is a secondary flavivirus infection. If ZIKV is the first flavivirus encountered, our data indicate that cross-reactivity is minimal. However, when ZIKV infection occurs after a flavivirus infection, our data indicate that the extent of cross-reactivity in the IgM assay is greater. Therefore, if ZIKV infections occur in a population with DENV (or other flavivirus) background immunity, our data suggest that extensive cross-reactivity in the dengue IgM assay will occur, which could lead to the erroneous conclusion that dengue caused the epidemic. Whether this cross-reactivity has occurred is open to speculation. However, reexamination of specimens from dengue epidemics may provide an answer. In addition, use of virus isolation or RT-PCR for laboratory diagnosis of dengue infections would also prevent this misinterpretation.
Levels of viremia among ZIKV-infected patients were relatively low. Unfortunately, measurement of concentration of infectious ZIKV was not possible because a virus isolate was not obtained from any patient during the epidemic. The absence of an isolate also precluded generating a standard curve for the RT-PCR from infectious virus, which could have been used to estimate the concentration of infectious virus in patients.
An estimation of the number of genome copies circulating in ZIKV-infected patients was calculated by using an RNA transcript and provides some indication of infectious virus concentration in ZIKV-infected patients. If one assumes a ratio range of 200-500 genome copies per infectious virus particle, a range reported for several flaviviruses, then the copies/milliliter values in Table 4 would be in the range of ≈2-3,500 infectious virus particles/mL, with only 4 specimens in which ZIKV exceeded 1,000 infectious units/mL (18,19). These findings may partially explain why ZIKV was not isolated, especially if one considers that shipping samples to our laboratory took ≈1 week, and shipping conditions were not conducive to virus isolation. These concentration estimates are also consistent with those of a study in which a ZIKV-infected human volunteer showed low viremia; virus was isolated only on day 4, and the volunteer was unable to infect Ae. aegypti mosquitoes that fed on the patient during the acute stage of disease (10).
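The conversion from genome copies to infectious particles is simple division, and the arithmetic can be checked with a short Python sketch. The function name is hypothetical; the inputs are the copies/mL extremes and the 200-500 copies-per-particle ratio range given in the text. Dividing by the larger ratio gives the lower bound and dividing by the smaller ratio gives the upper bound.

```python
from typing import Tuple


def infectious_particle_range(copies_per_ml: float,
                              ratio_low: float = 200.0,
                              ratio_high: float = 500.0) -> Tuple[float, float]:
    """Return (minimum, maximum) estimated infectious particles/mL.

    ratio_low and ratio_high are the assumed numbers of genome copies
    per infectious particle (200-500 in the text).
    """
    return copies_per_ml / ratio_high, copies_per_ml / ratio_low


# Lowest and highest RNA concentrations reported for positive specimens.
lowest, _ = infectious_particle_range(900)        # 900 / 500 = 1.8
_, highest = infectious_particle_range(729_000)   # 729,000 / 200 = 3,645
```

These bounds, about 1.8 and 3,645 particles/mL, are of the same order as the ≈2-3,500 range quoted above.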
Although generation of a complete coding region nucleic acid sequence by using a combination of patient samples from the epidemic is an unconventional approach, it was performed out of necessity because of limited volumes of patient samples. However, the extent of agreement among overlapping regions confirms that the sequence obtained accurately represents the virus associated with the epidemic. Nucleic acid sequence of ZIKV 2007 showed divergence (11%) from the prototype strain (MR766) isolated in 1947. However, the predicted amino acid sequence is fairly conserved (96%), which is likely the result of the selective pressure maintained on the virus because replication occurs in vertebrate hosts and arthropod vectors.
Phylogenetic trees based on the complete coding region or the NS5 region confirm results of a study in which ZIKV was classified in a unique clade among the mosquito-borne flaviviruses and most closely related to SPOV (16). The NS5 mosquito-borne flavivirus tree (Figure 1), which includes additional ZIKV isolates, confirms these relationships and suggests that there are 3 subclades among ZIKV isolates that reflect geographic origin. Senegal ZIKVs and prototype virus from Uganda may represent West and East African lineages, respectively. The 2007 ZIKV is distantly related to these 2 African subclades and may represent divergence from a common ancestor with spread throughout Southeast Asia and the Pacific. Human ZIKV cases were detected in peninsular Malaysia in 1980, which confirms that ZIKV was active in this region before 2007 (9). Additional sequence analysis of other temporally and geographically distinct ZIKV strains is needed to further elucidate relationships among these viruses.
Of particular interest is an additional 12 nt in the envelope gene (corresponding to 4 aa) in the ZIKV 2007 EC sequence that were not present in the ZIKV prototype virus (Figure 2). This difference is noteworthy because these 4 aa correspond to the envelope protein 154 glycosylation motif found in many flaviviruses and associated in some instances with virulence. This glycosylation motif is also absent because of a 6-aa deletion in the ZIKV isolate obtained from GenBank (accession no. AF372422); however, the geographic and temporal origins of this virus were not available. Loss of the envelope protein 154 glycosylation site has been observed in some flaviviruses, and in the case of Kunjin virus has been shown to occur during passage. However, with Kunjin virus, the glycosylation site motif was lost because of a 1-base mutation, rather than a deletion, that altered the N-X-S/T sequon (20). Loss of this glycosylation site by a 4-aa deletion has also been observed in several lineage-2 West Nile virus (WNV) strains when compared with all other WNV strains (21).
The glycosylation motif in WNV may be lost during extensive mouse brain passage; however, no direct evidence exists to support this hypothesis (21). This process may occur in ZIKV; the glycosylation motif in MR 766 may have been present in earlier passages of prototype MR766 and lost during extensive mouse brain passage. However, earlier passage strains of MR766 were not available for investigating this hypothesis. Alternatively, the presence or absence of this glycosylation motif may represent an ancient evolutionary event with subsequent divergence of 2 ZIKV types with or without the E-154 glycosylation site amino acids. Sequence data derived from 3 additional ZIKV isolates from Senegal showed that the glycosylation motif is intact in these isolates, which suggests evolutionary divergence. More extensive sequence analysis of available ZIKV strains of various temporal, geographic, and passage histories may provide some insight into this issue.