Molecular evolution and the global reemergence of enterovirus D68 by genome-wide analysis

Abstract Human enterovirus D68 (EV-D68) was first reported in the United States in 1962; thereafter, a few cases were reported from 1970 to 2005, but 2 outbreaks occurred in the Philippines (2008) and the United States (2014). However, little is known regarding the molecular evolution of this globally reemerging virus due to a lack of whole-genome sequences and analyses. Here, all publically available sequences including 147 full and 1248 partial genomes from GenBank were collected and compared at the clade and subclade level; 11 whole genomes isolated in Taiwan (TW) in 2014 were also added to the database. Phylogenetic trees were constructed to identify a new subclade, B3, and represent clade circulations among strains. Nucleotide sequence identities of the VP1 gene were 94% to 95% based on a comparison of subclade B3 to B1 and B2 and 87% to 91% when comparing A, C, and D. The patterns of clade circulation need to be clarified to improve global monitoring of EV-D68, even though this virus showed lower diversity among clades compared with the common enterovirus EV-71. Notably, severe cases isolated from Taiwan and China in 2014 were found in subclade B3. One severe case from Taiwan occurred in a female patient with underlying angioimmunoblastic T-cell lymphoma, from whom a bronchoalveolar lavage specimen was obtained. Although host factors play a key role in disease severity, we cannot exclude the possibility that EV-D68 may trigger clinical symptoms or death. To further investigate the genetic diversity of EV-D68, we reported 34 amino acid (aa) polymorphisms identified by comparing subclade B3 to B1 and B2. Clade D strains had a 1-aa deletion and a 2-aa insertion in the VP1 gene, and 1 of our TW/2014 strains had a shorter deletion in the 5′ untranslated region than a previously reported deletion. In summary, a new subclade, genetic indels, and polymorphisms in global strains were discovered elucidating evolutionary and epidemiological trends of EV-D68, and 11 genomes were added to the database. Virus variants may contribute to disease severity and clinical manifestations, and further studies are needed to investigate the associations between genetic diversity and clinical outcomes.


Introduction
Human enterovirus D68 (EV-D68) belongs to the family Picornaviridae and the genus Enterovirus, which was first isolated in California in 1962. [1] After the first case, 26 EV-D68 cases were sporadically reported from 1970 to 2005. [2] However, from October 2008 to March 2009, an outbreak was detected in the Philippines among pediatric patients hospitalized with pneumonia. [3] Since then, this virus has been recognized as a reemerging pathogen. [4] Studies indicated an increasing number of EV-D68-positive cases associated with acute respiratory illness in Asia, Europe, and the United States; some cases were also associated with central nervous system diseases. [5][6][7][8][9] Notably, the largest outbreak in the United States caused 130 laboratoryconfirmed cases in September 2014. [10] The enterovirus genome consists of approximately 7500 nucleotides (nt) encoding a polyprotein, from which 11 viral genes are translated with the characteristic gene order 50-VP4, VP2, VP3, VP1, 2A, 2B, 2C, 3A, 3B, 3C pro , and 3D pol -30. Among them, the VP1 gene has been used to distinguish various enterovirus serotypes, [11][12][13] and phylogenetic analysis has been used to discriminate lineages and detect new or emerging strains, including recently reported subclades B1 and B2 and clade D. [14][15][16] It suggested that interclade variations led to the identification of new clade, which in VP1 gene may alter viral antigenicity. [16] The VP1 gene contains serotype-specific neutralization sites (e.g., the BC loop), which are located at the carboxyl end of the protein and associated with viral antigenicity. [5] Although 1 VP1 deletion in clade-A strains [5] and 1 VP1 insertion in the strain 1737-Yamagata-2008 [17] have been reported, further studies are required to explore the association between genetic characteristics and disease severity. In addition to the VP1 gene, EV-D68 genomes from the early 1960s to mid-1990s underwent a rearrangement in the spacer region of the 50 untranslated region (UTR) between the end of the internal ribosome entry site and the polyprotein open reading frame (ORF). [5] The rearrangement resulted in 2 deletions of 24 and 11 nt in the spacer region, which might have a significant effect on the initiation of translation. Although the virulence was affected by the variations within the internal ribosome entry site, [18,19] the role of the spacer region with respect to viral fitness is not well known. In brief, genetic mutations may affect virulence by enhancing translational efficiency and correlate with the recent increase in EV-D68 cases worldwide.
Enteroviruses (e.g., EV-71) in Taiwan (TW) commonly circulate in the summer; however, an immunofluorescence assay for EV-D68 is not available, and little is known about the molecular genetics and epidemiology of EV-D68 strains in Taiwan. A previous study provided the sequences of 29 VP1 genes from EV-D68/TW from 2007 to 2014. [20] The authors indicated that EV-D68 has been endemic in Taiwan. Because they included only VP1 sequences, further studies were required to understand the genetic characteristics of whole genomes and the association between EV-D68 and severe clinical disease.
The primary goal of the current study was to investigate the molecular phylogeny, diversity, and epidemiology of EV-D68 strains from around the world. To this aim, we performed phylogenetic and genetic diversity analyses on all sequences available from GenBank as well as 11 EV-D68/TW strains isolated in 2014, which were sequenced for this study. Sequences were compared at the clade and subclade level.

Ethics statement
This study was approved by the Institutional Review Board of Chang Gung Medical Foundation, Linkou Medical Center, Taoyuan, Taiwan, with approval number 104-2536B.

Viral RNA isolation and PCR amplification for sequencing EV-D68 genomes
Eleven viral isolates were collected in Taiwan in 2014 for this study, and a further 136 complete/near-complete and 1248 partial genomes of EV-D68 were retrieved from GenBank in December 2015 for analysis. The 11 isolates include 3 from Linkou Chang Gung Memorial Hospital (CGMH), a medical center in northern Taiwan that processed a considerable number of clinical specimens for virus surveillance every year. The other 8 were from the The Taiwan Centers for Disease Control (Taiwan CDC), to which clinical virus isolates from 11 nationwide contracted laboratories for virus surveillance in Taiwan were sent. According to the manufacturer's specifications, RNA was extracted using the LabTurbo RNA Extraction Kit (TaiGen Biotechnology Inc, Taipei, Taiwan). cDNA was synthesized from total RNA using the SuperScript III reverse-transcription system (Invitrogen Corp, Carlsbad, CA). RNA was amplified and sequenced using 6 primer sets, as shown in Table 1. The polymerase chain reaction (PCR) mixture contained 5 mL of cDNA, 0.5 mM of each primer, and KOD plus Taq DNA polymerase (TOYOBO, Osaka, Japan) and was adjusted to a final volume of 25 mL with nuclease-free water. Each round of PCR amplification was performed under the following conditions: an initial denaturation at 94°C for 3 minutes was followed by 35 amplification cycles consisting of 94°C for 30 seconds, 55°C for 30 seconds, and 72°C for 3 minutes and concluded with a final extension at 72°C for 7 minutes. According to the manufacturer's specifications, PCR products were purified from agarose gel using a gel extraction kit (QIAGEN, Hilden, Germany). Purified DNA served as the template for the chain termination reaction with the ABI 3730 XL DNA Analyzer (Applied Biosystems Inc, Foster City, CA). The 11 EV-D68 genomes sequenced in this study have been deposited in GenBank with accession numbers KT711078 to KT711088. Table 2 shows the accession numbers and strain names and also presents clinical and demographic data for future investigations.

Phylogenetic and sequence analysis of EV-D68 genomes
A total of 147 full and 1248 partial sequences of EV-D68 were collected for analysis. These sequences were aligned using Clustal Omega version 1.2 [21] to explore indels in the VP1 gene, with special attention paid to those in Taiwan. To investigate the molecular phylogeny of strains circulating globally, phylogenetic trees were generated using Bayesian Evolutionary Analysis Sampling Trees 2 package [22] on VP1 sequences from strains collected in Taiwan, the United States, China, Japan, Mexico, Canada, France, New Zealand, and Haiti. A maximum clade credibility (MCC) tree under the Hasegawa-Kishino-Yano model was estimated. Thirty million Markov chain Monte Carlo chains with 10% burn-in were generated. One MCC tree was constructed for every 3000 chains, and a single consensus tree was summarized from these MCC trees. In addition, all positionspecific polymorphisms and substitutions identified by sequence analysis were plotted on graphs using WebLogo 3. [23] 3. Results Table 2 summarizes the demographics and clinical symptoms of the 11 patients, including 8 males and 3 females. Nine patients Table 1 Six primer pairs for amplifying and sequencing EV-D68 genomes.

Pair
Start and end positions of F and R primers Primer sequence (from 50 to 30)  [24,25] The results also demonstrated that various clades of this reemerging pathogen were circulating in Taiwan 14.18951, [15] which grouped into subclade B2, clustering with 3 US/2014 strains and 1 Canada/2014 strain.

Sequence identities among clades and subclades
To compare subclade B3 strains to A, B1, B2, C, and D strains, we collected 129 complete genomes to calculate their nucleotide identities based on each segment and ORF.

Genetic diversity of EV-D68 genomes
To further characterize the newly reported subclade B3, we compared the consensus sequence of B3 to B1 and B2 based on 129 complete genomes and identified 34 amino acid substitutions, including 7 in VP1, 6 in 3D, 5 in 2A, 5 in VP2, 3 in 3C, 2 each in VP4, VP3, and 3A, and 1 each in 2B and 3C, as shown in Fig. 2. Five positions in the B3 subclade (i.e., V18, T207, A/ V220, T470, and G558) involved novel substitutions compared with subclades B1 and B2. Six polymorphisms (i.e., positions at 291, 341, 860, 927, 1108, and 2005) have been reported [14] that in subclade B3 were identical to those in subclade B2, although position 1108 presented as both S and G in subclade B2. As described in Table 3, clade B strains showed high sequence identities. Figure 3     substitution. Furthermore, positions 558, 868, 1031, 1141, and 1190 in 3 Linkou CGMH cases had residues "G," "A," "S," "T," and "F," respectively, which diverged from the other 8 TW strains that had residues "A," "V," "N," "N," and "Y." One death occurred among the 3 Linkou CGMH patients, associated with underlying AITL. An "N" residue deletion at position 692 and "RL" residue insertion at positions 860 and 861 in VP1 have been  reported. [6,17] We further explored the existence of genetic indels based on the phylogenetic clades. To conclude, clade D has an N692 deletion and R860/L861 insertions; clade A has an N692 deletion, and clades B and C have neither. In addition to those sequences, which were associated with specific clades, we discovered 5 France strains (CF254007, CF253080, CF307029, CF298012, and CF289007), 1 Italy strain (ITA/ 23341/14), and 1 Japan strain (1737-Yamagata-2008) that have an N692 deletion and R860/L861 insertions. Two deletions of 24 and 11 nucleotides in the 50 UTR have been reported. [7] One deletion was located from positions 681 to 704 and was observed in all 11 TW/2014 strains in this study. The other deletion was located from positions 721 to 731 in the 50 UTR and was observed in all TW/2014 strains except for TW-00880-2014. This strain has only an 8-nt deletion, not the previously reported 11-nt one.

Discussion
Although EV-D68 was first described in the United States in 1962 and few cases were reported until 2005, sporadic outbreaks were detected in the United States, the Netherlands, the Philippines, and Japan from 2008 to 2010. [1][2][3][6][7][8] In 2014, the largest outbreak to date of this reemerging virus in the United States and Canada was reported, and subsequent studies described respiratory diseases in Taiwan, China, the Philippines, and Europe. [10,[24][25][26][27] However, understanding of this virus remains limited due to a lack of whole genome and molecular evolution analyses of strains from around the globe. This study involved a genome-wide analysis of globally circulating strains with sequences downloaded from GenBank and contributed 11 EV-D68/2014 genomes isolated in Taiwan, including 2 severe cases (i.e., cases 4 and 11 in Table 2). A total of 169 strains from around the world were used to generate a tree that included a new subclade, B3, along with previously reported clades A, B1, B2, C, and D [6,[14][15][16] (Fig. 1). All TW/2014 and China/2014 strains belong to subclade B3, and several cases in China were associated with moderate or severe asthma and pneumonia. [24,25] Figure 1 also demonstrates that various clades of this reemerging pathogen cocirculated in Taiwan; for example, TW strains in clade D were isolated in 2007, 2009, 2010, 2011, and 2013, and TW/2007 strains clustered in clades A, C, and D. To further compare clades and subclades, we calculated sequence identities based on each gene and ORF using all the complete genomes. Based on the VP1 gene, average sequence identities of 95.8% and 94.2% were observed by comparing subclade B3 to B1 and B2, respectively, and sequence identities of 87.9%, 91.0%, and 87.2% were found by comparing B3 to A, C, and D, respectively. Similarly, genotype replacements of EV-71 (one of the most common enteroviruses) in Taiwan, Vietnam, and Malaysia have been reported; for example, the predominant genotypes were C2 in 1998, B4 in 1999 to 2003, C4a in 2004 to 2005, 2008, and 2012, and B5 in 2008 to 2009 and 2011 in Taiwan. [28][29][30] An average sequence identity of 88.7% based on the VP1 gene was observed by comparing subgenotype C5 to subclades from C1 and C4, and 82.1% by comparing to genotypes A and B1 to B5. [31] Although the sequence diversity of the VP1 gene among EV-D68 clades was lower than that among EV-71 genotypes, the phenomena of genotype and clade replacement need to be clarified to better monitor the reemergence of EV-D68, as evidenced by previous outbreaks.
In this study, case 11, with underlying AITL, developed ARDS and eventually died. The BAL specimen collected to test for pneumonia revealed no other possible pathogen; only EV-D68 was identified. Regarding the cause of ARDS, EV-D68 associated with ARDS in adults is very rare, and only 1 case was reported in the United States in 2014. [32] In a seroepidemiological study in 2010 in Finland, [33] Smura et al suggested that after EV-D68 infection, most adults generate neutralizing antibodies; however, host factors (e.g., underlying disease) play a key role in determining disease severity. At the other end of the age spectrum, 7 of 11 TW cases occurred in children younger than 5 years of age. Case 4, the other severe case, occurred in a child who was 3 years old and had a seizure. One study reported a geographically and temporally defined cluster of acute flaccid paralysis and cranial nerve dysfunction in children associated with an EV-D68 outbreak with respiratory illness. [34] The authors suggested the possibility of an association between EV-D68 and neurological disease in children. These observations indicated that infants, children, and teenagers have a higher risk of EV-D68 infection than adults.
Rearrangements in the spacer region of the 50 UTR in EV-D68 between the early 1960s and mid-1990s resulted in 2 deletions of 24 and 11 nucleotides. [5] One Taiwanese strain in this study has a deletion at positions 721 to 728 (i.e., 8 nt long) that was shorter than the previously reported 1 that was 11 nt long. More studies are needed to investigate any association between this shorter deletion in the 50 UTR and the efficiency of translation or virulence. We found that clade D strains had an N692 deletion and R860/L861 insertions in the VP1 gene, including 5 China, 5 France, 1 Italy, 1 Japan, and 1 US strains. More complete genomes are needed to confirm these genetic features in the other clade D strains. In addition, the VP1 gene also contains serotype-specific neutralization sites (e.g., BC and DE loops). No EV-D68/TW strains isolated in 2014 had any novel substitutions in these 2 loops; they were identical to most of the US/2014 outbreak strains.
To conclude, we performed phylogenetic and genetic diversity analyses on all available sequences to reveal the molecular phylogeny, diversity, and epidemiology of EV-D68, and contributed 11 new EV-D68 genomes for analysis. A new subclade, genetic indels, and polymorphisms were discovered, which further elucidate the molecular evolution of EV-D68.
Although there was no direct evidence to prove virus variants causing clinical characteristics of patients infected by EV-D68, enterovirus studies indicated that virus variants may contribute to clinical manifestations, disease severity, and outbreaks, such as EV-71 [28,30] and coxsackievirus A6. [35,36] In this decade, EV-D68 has been detected worldwide and has caused outbreaks and severe clinical symptoms. Further studies are needed to investigate the associations between virus variants and clinical outcomes in this globally reemerging virus.