Isolation of Highly Pathogenic H5N1 Influenza Viruses in 2009–2013 in Vietnam

Routine surveillance and surveillance in response to influenza outbreaks in avian species in Vietnam in 2009–2013 resulted in the isolation of numerous H5N1 influenza viruses of clades 1.1.2, 2.3.2.1a, 2.3.2.1b, 2.3.2.1c, and 2.3.4.1. Consistent with other studies, we found that viruses of clade 2.3.2.1c were dominant in Vietnam in 2013 and circulated in the northern, central, and southern parts of the country. Phylogenetic analysis revealed reassortment among viruses of clades 2.3.2.1a, 2.3.2.1b, and 2.3.2.1c; in contrast, no reassortment was detected between clade 2.3.2.1 viruses and viruses of clades 1.1.2 or 2.3.4.1, respectively. Deep-sequencing of 42 of the 53 isolated H5N1 viruses revealed viral subpopulations encoding variants that may affect virulence, host range, or sensitivity to antiviral compounds; virus isolates containing these subpopulations may have a higher potential to transmit and adapt to mammals. Among the viruses sequenced, a relatively high number of non-synonymous nucleotide polymorphisms was detected in a virus isolated from a barn swallow, possibly suggesting influenza virus adaption to this host.


INTRODUCTION
Highly pathogenic avian influenza (HPAI) viruses of the H5N1 subtype were first identified in Vietnam in 2001 (Nguyen et al., 2005). Since 2003 Vietnam has reported more than 2,700 outbreaks of HPAI H5N1 in poultry, far more than any other country (Thailand and Egypt have experienced the second and the third largest numbers of HPAI H5N1 outbreaks in poultry with 1,141 and 1,084 reported events since 2003, respectively 1 ). The large number of HPAI H5N1 outbreaks in Vietnamese poultry in [2003][2004][2005] was paralleled by 93 laboratory-confirmed human infections with HPAI H5N1 influenza viruses in Vietnam, 42 of which were fatal. Since 2015 no human HPAI H5N1 infections have been reported in Vietnam, but HPAI H5N1 viruses remain a threat to the country's public health.
Since their first emergence in 1997 in Hong Kong, HPAI H5N1 viruses have evolved into multiple clades and subclades that are defined on the basis of the sequence of the viral surface glycoprotein hemagglutinin (HA), the major viral antigen. In Vietnam, HPAI H5N1 viruses of multiple clades have been identified, including clades 0, 1, 2.3.2, 2.3.2.1, 2.3.4, 3, 5, 7, and subclades thereof (Influenza Research Database 2 ). The major outbreaks in [2003][2004][2005] were caused by viruses of clade 1, which circulated throughout the country (Wan et al., 2008;Le and Nguyen, 2014). No outbreaks were reported in 2006, presumably due to a vaccination campaign (Le and Nguyen, 2014). , viruses of clade 1 were predominantly isolated in the southern part of the country, whereas clade 2.3.4 viruses of the Fujian lineage (and sub-clades thereof) were detected in the northern part of the country (Dung Le et al., 2008;Wan et al., 2008;Nguyen et al., 2012;Le and Nguyen, 2014 (Le and Nguyen, 2014). Instead, viruses of clade 2.3.2 became dominant in 2009-2010 in the northern part of the country and diversified into subclades 2.3.2.1a and 2.3.2.1b (Nguyen et al., 2012;Creanga et al., 2013;Okamatsu et al., 2013;Le and Nguyen, 2014;Lee et al., 2015). Continued evolution of these viruses resulted in a third subclade (2.3.2.1c) in 2012 (Le and Nguyen, 2014;Lee et al., 2015). In recent years, viruses of subclades 2.3.2.1a and 2.3.2.1c have extended their geographic range and are now also found in the central and southern parts of Vietnam (Le and Nguyen, 2014;Nguyen et al., 2017), whereas viruses of subclade 2.3.2.1b have not been isolated recently (Le and Nguyen, 2014). Viruses of clade 1 have evolved into subclades 1.1.1 and 1.1.2, which have been isolated from the southern and central regions of Vietnam (Le and Nguyen, 2014;Lee et al., 2015;Nguyen et al., 2017). In addition to the dynamic evolution of the HA gene, reassortment among the other seven viral RNA (vRNA) segments of Vietnamese HPAI H5N1 poultry viruses has been reported (Zhao et al., 2012;Nguyen et al., 2017). Reassortment resulted in the emergence of clade 2.3.4.4 H5N6 viruses in Vietnam and other Southeast Asian countries (Wong et al., 2015;Chu et al., 2016;Lee et al., 2017;Nguyen et al., 2017). Thus, several (sub)lineages of HPAI H5 influenza viruses have been co-circulating in Vietnam, providing ample opportunity for further reassortment and the potential emergence of viruses that may transmit readily to humans.
Here, we analyzed more than 3,000 samples obtained from avian species in Vietnam in 2009-2013, resulting in the isolation of a number of HPAI H5N1 influenza viruses. Characterization of the HPAI H5N1 samples by phylogenetic analysis, Sanger sequencing, and deep-sequencing of 42 of the samples revealed multiple genotypes and potentially mammalian-adapting amino acid changes in some of the viral subpopulations.

Cells
Human embryonic kidney (293T) cells were purchased from the American Type Culture Collection (ATCC) and maintained in Dulbecco's modified Eagle's medium with 10% fetal bovine serum and antibiotics.

Isolation of H5N1 Viruses
Oropharyngeal and/or cloacal swabs were collected from apparently healthy, sick, or dead poultry or wild birds and stored in transport medium (DMEM with 0.15% BSA, 100 IU/ml penicillin-streptomycin, 0.5 µg/ml amphotericin B, 100 µg/ml gentamicin, 20 µg/ml ciprofloxacin, and 0.02 M HEPES). In addition, organ samples were collected from dead or euthanized animals.
Nine-to-eleven-day-old specific pathogen-free eggs were inoculated with individual samples (for samples obtained from sick or dead animals), or with pooled samples (up to 10 samples obtained from healthy animals) and incubated at 35 • C for 24-48 h. Allantoic fluid was collected and tested by means of HA assays. For all hemagglutinationpositive samples, vRNA was extracted by using a QIAamp vRNA Minikit (Qiagen, Hilden, Germany). One-step reverse transcriptase (RT)-PCR was performed by using the Superscript III high-fidelity RT-PCR Kit (ThermoFisher, Waltham, MA, United States) and oligonucleotides to conserved regions of the matrix (M) gene (M-73F, 5 -GTCAGGCCCCCTCAAAGC; M-242R, 5 -CGTCTACGCTGCAGTCC). For all influenza virus-positive samples, portions of the viral HA and neuraminidase (NA) genes were amplified as described previously (Hoffmann et al., 2001) and Sanger-sequenced for subtype identification.
All samples were collected as part of the routine site health surveillance by the National Institute of Hygiene and Epidemiology, Vietnam. No ethical license was required.
The isolation of surveillance samples was carried out in an enhanced biosafety level 3 (BSL 3+) laboratory at the University of Wisconsin-Madison. Procedures using embryonated chicken eggs did not require approval in accordance with Public Health Service (PHS) Policy on Humane Care and Use of Laboratory Animals.
For deep-sequencing, amplicons were purified with 0.45× volume AMPure XP beads (Beckman Coulter) and 0.5-1 µg was sheared to an average fragment size of 150 bp on a Bioruptor Pico sonicator (Diagenode). Libraries for nextgeneration sequencing were prepared using the end repair, A-tailing, and adaptor ligation NEBNext DNA library prep modules for Illumina (New England Biolabs) according to the manufacturer's protocol. Multiplexed and barcoded libraries were sequenced on the Illumina HiSeq 2500 platform in a single-end 100 nt run format. Single-end 100 nt reads were first filtered with cutadapt (Martin, 2011) to remove lowquality sequences and adapters. Next, we performed a de novo assembly using Cap3 (Huang and Madan, 1999) to generate a consensus sequence for each vRNA segment of each sample. These consensus sequences were further processed by using the ViVan pipeline (Isakov et al., 2015). We configured the ViVan pipeline to trim the reads by using EA-Tools/fastq-mcf (Aronesty, 2013), with 200,000 reads used for subsampling, minimum read lengths of 16 nucleotides, and minimum quality threshold (Phred) scores of 30. Next, we modified the ViVan pipeline to use Flexbar (Roehr et al., 2017), to trim 10 base pairs at both ends of all reads. The ViVan pipeline used BWA (Li and Durbin, 2010) to align the reads to the reference sequences; it detected sequence variants by its own statistical procedure. We only considered sequence variants with a minimum frequency of 1% and at least 1,000 reads at the position where the variant was found.
The consensus nucleotide sequences of the isolated H5N1 viruses were submitted to GenBank under the following accession numbers: KX513109-KX513409, KX644099-KX644131.

Phylogenetic Analysis
Over 4,400 nucleotide sequences of H5Nx (for HA), HxN1 (for NA), and HxNx (for all internal gene segments) from 1996 to 2017 were downloaded from the NCBI Influenza Virus Resource and GISAID (accessed 23 August 2017; Supplementary Table S1). The datasets were aligned using MAFFT v.7.3 as implemented in Geneious Pro 9.0.3 (Biomatters Ltd.). The datasets were randomly sampled to produce smaller datasets and duplicate sequences were removed using custom scripts. In addition, new avian H5N1 virus sequences generated in this study were used for phylogenetic analysis, including 51 new sequences for the PB2 and PB1 vRNA segments and 53 new sequences for the PA, HA, NP, NA, M, and NS vRNA segments. The final datasets consisted of 346 sequences for the PB2 and PB1 vRNA segments; 425 for the H5-HA vRNA segment; 348 sequences for the PA, NP, M, and NS vRNA segments; and 221 sequences for the N1-NA vRNA RNA segment. To infer the evolutionary relationships of H5N1 viruses, individual gene phylogenies were reconstructed by using the maximum-likelihood (ML) method in RAxML v9.0 (Stamatakis, 2014), using a generalized time reversible nucleotide substitution model plus gamma distributed rates among sites (GTR+ ), with 1,000 nonparametric bootstrap replicates for assessing branch support. The resulting H5 phylogeny (Figure 1) was subsequently annotated with H5 clade designations following the most recent update to the WHO/OIE/FAO H5 nomenclature system (Smith et al., 2015). The remaining gene topologies were shown in Supplementary Figures S1-S7.

Minireplicon Assay
To determine the relative polymerase activities of A/duck/Vietnam/ST1488-3/2012 (H5N1) variants, consensus or variant genes of PB2, PB1, PA, and NP were inserted into the pCAGGS protein expression vector.
Human embryonic kidney (293T) cells were transfected with four protein expression plasmids for PB2, PB1, PA, and NP, with a plasmid for the expression of a virus-like RNA encoding the firefly luciferase gene under the control of the human RNA polymerase I promoter, and with a control plasmid encoding Renilla luciferase by using TransIT-LT1 (Mirus, Madison, WI, United States), and incubated for 48 h at 33 and 37 • C, respectively. The cells were then lysed and the relative luciferase activity was measured by using a dual-luciferase reporter assay kit (Promega, Madison, WI, United States). Data shown are the mean values with standard deviations for the results of three independent experiments.

Biosafety Statement
All recombinant DNA protocols were approved by the University of Wisconsin-Madison's Institutional Biosafety Committee (IBC) after risk assessments were conducted by the Office of Biological Safety. All experiments were approved by the University of Wisconsin-Madison's IBC. This manuscript was reviewed by the University of Wisconsin-Madison Dual Use Research of Concern (DURC) Subcommittee and by the Icahn School of Medicine at Mount Sinai (New York) IBC and DURC Committee. The reviews were conducted in accordance with the United States Government September 2014 DURC Policy. Both institutions concluded that the studies described herein do not meet the criteria of Dual Use Research (DUR) or DURC. In addition, the University of Wisconsin-Madison Biosecurity Task Force regularly reviews the research, policies, and practices of research conducted with pathogens of high consequence at the institution. This task force has a diverse skill set and provides support in the areas of biosafety, facilities, compliance, security, law, and health. Members of the Biosecurity Task Force are in frequent contact with the principal investigator and laboratory personnel to provide oversight and assure biosecurity.

Isolation of Avian Influenza Viruses in Vietnam From 2009 to 2013
We carried out surveillance in live bird markets in different provinces of Vietnam (Figure 2) by collecting oropharyngeal swabs and cloacal samples from apparently healthy chickens and ducks (Table 1). In addition, after reports of suspected or confirmed outbreaks of HPAI H5N1 in backyard poultry flocks in rural villages, we obtained tissue samples from dead animals, and from sick animals that had to be euthanized. We also collected oropharyngeal and/or cloacal samples from apparently healthy poultry in the same village. Moreover, environmental samples and samples from dead wild birds (if available) were collected. In total, we collected 3,015 samples from 1,995 animals,  including 95 tissue samples and 2,920 oropharyngeal and cloacal swabs ( Table 1).
Samples obtained from sick or dead animals were amplified individually in 9-11-day-old embryonated chicken eggs, whereas we pooled up to 10 samples from healthy birds before inoculating them into embryonated chicken eggs. After amplification in embryonated chicken eggs, all samples were tested for hemagglutination activity. For hemagglutination-positive batches, individual samples were amplified in chicken eggs and then subjected to hemagglutination assays. Hemagglutinationpositive samples were then tested for influenza viruses by PCR with oligonucleotides to a conserved region in the viral M segment. Next, portions of the HA and NA vRNAs were Sanger-sequenced to determine the influenza viral subtype. In total, we isolated 254 influenza viruses of five different subtypes: 53 H5N1 viruses, 34 H6N2 viruses, 34 H6N6 viruses, 3 H4N6 viruses, and 130 H9N2 viruses (Tables 1, 2). All H4, H6, and H9 viruses were isolated during surveillance from apparently healthy chickens and ducks in live bird markets (with the exception of one H9N2 virus, which was isolated from an apparently healthy duck during an outbreak response in a village). In contrast, 49 of the 53 H5N1 viruses were isolated from tissue samples from dead or euthanized animals from backyard flocks and wild birds in areas with suspected or reported HPAI H5N1 outbreaks. However, we also isolated one HPAI H5N1 virus from an apparently healthy chicken sampled in a village in the central province of Quang Tri, two HPAI H5N1 viruses from apparently healthy ducks sampled in a live bird market in the northern province of Lang Son, and one HPAI H5N1 virus from an apparently healthy swallow sampled in a village in the southern province of Ninh Thuan.

Phylogenetic Analysis of H5N1 Viruses
To understand the phylogenetic relationships of the novel HPAI H5N1 isolates, we analyzed complete genomes of the circulating H5N1 and H5Nx viruses of the Gs/GD (A/goose/Guangdong/1/1996 [Gs/GD] lineage in combination with 53 novel genomes generated from this study ( Table 2). ML phylogenies were reconstructed for individual gene segments. The HA-H5 phylogeny ( Figure 1A) consistently showed an extensive genetic diversification of the Gs/GD lineage worldwide. Our HA tree also indicated that the new H5 isolates from Vietnam were phylogenetically separated into different clades ( Figures 1B,C), particularly in clades 1.1.2, 2.3.2.1a, 2.3.2.1b, 2.3.2.1c, and 2.3.4.1. Thirteen of the HPAI H5N1 viruses isolated here belonged to clade 1.1.2 and they were closely related to viruses from Vietnam and Cambodia. All new viruses in this clade were isolated in 2012 in the southern part of the country (Figures 1, 2 and Table 2), consistent with previous reports of the circulation of clade 1.1.2 viruses primarily in the southern part of Vietnam (Nguyen et al., 2012;Okamatsu et al., 2013). The clade 1.1.2 viruses characterized here were collected during the same outbreak, explaining the close phylogenetic relationship of most of these isolates.
Viruses of clade 2.3.4 and its subclades were frequently detected in the northern part of Vietnam until 2010 (Dung Nguyen et al., 2012;Le and Nguyen, 2014). We isolated three viruses of clade 2.3.4.1 from the northern and central provinces of Vietnam in 2009 and 2010 (Figures 1, 2 and Table 2). Viruses of clade 2.3.4 were then replaced by viruses of clade 2.3.2.1, which subsequently diverged into three subgroups (Creanga et al., 2013;Okamatsu et al., 2013;Le and Nguyen, 2014;Lee et al., 2015). Here, we isolated 24 H5N1 viruses of clade 2.3.2.1a from the central and northern part of the country in 2010-2012 (Figures 1, 2 and Table 2). In contrast, we isolated only a single virus of clade 2.3.2.1b (Figures 1, 2 and Table 2), which is consistent with a previous report that these viruses have not become dominant in Vietnam (Nguyen et al., 2017). In 2012, viruses of clade 2.3.2.1c emerged and have become dominant (Nguyen et al., 2017). This is consistent with our finding that all 12 HPAI H5N1 viruses isolated in 2013 belong to this subclade (Figures 1, 2 and Table 2); these viruses were isolated from northern, central, and southern provinces of Vietnam, demonstrating the wide geographic range of clade 2.3.2.1c viruses in Vietnam in 2013.  Next, we performed phylogenetic analyses for the remaining seven vRNA segments (Supplementary Figures S1-S7). The phylogenetic relationships of several vRNA segments (i.e., PB1, PA, NP, and NA) were found to be similar to those of the HA segment in that viruses of clades 1.  Table S2). For many of these amino acid differences, we currently do not know their effect on virus replication, virulence, and/or antigenicity.
The HA protein possesses multiple basic amino acids at the cleavage site, a hallmark of HPAI viruses. Compared to the majority of H5 HA proteins, most clade 1.1.2 HA proteins encode a glutamic acid insertion immediately upstream of the HA cleavage site (based on our analysis of sequences available in the Influenza Research Database 3 ), which we also detected among the clade 1.1.2 viruses in the present study (Table 3 and Supplementary Table S2). All H5N1 HA proteins analyzed here encode glutamic acid and glycine at 222 and 224 (HA amino acid positions are based on H5 numbering; Burke and Smith, 2014), respectively, which confer preferential binding to α-2,3-linked sialic acids, that is, avian-type receptors (Wright et al., 2013). At positions 154-156, the clade 1.1.2 HA proteins encode the glycosylation motif "NST, " whereas the clade 2.3.2.1a,c HA proteins lack this glycosylation motif (they encode "DNA" at the respective amino acid positions) ( Table 4). Loss of this glycosylation site may increase affinity for α-2,6-linked sialic acids (i.e., human-type receptors) and transmissibility among mammals via respiratory droplets (Herfst et al., 2012;Imai et al., 2012). In addition, viruses of clades 1.1.2, 2.3.2.1a, and 2.3.2.1c differ at HA positions that are part of the receptor-binding site (RBS; position 219), are located at the rim of the RBS (positions 129, 136, 140, 154-156, 184, and 189), affect HA receptor-binding specificity [positions 94 (Su et al., 2008), 120 (Wang et al., 2010), 123 (Yamada et al., 2006), and 210 (Watanabe et al., 2011)], or may affect HA stability [positions 162 (Kaverin et al., 2015) and 528 (Rudneva et al., 2013)]. Collectively, these differences may affect the virulence and host range of the respective viruses. The NA protein encodes a sialidase whose activity is important for efficient virus release from host cells. In addition, NA contributes to the antigenic properties of viruses. The NA proteins characterized here possess a 20amino acid deletion at positions 49-68 (Tables 3, 4 and  Supplementary Table S2), which is typical for H5N1 NA proteins isolated since 2003 and increases H5N1 virulence in mice (Matsuoka et al., 2009;Zhou et al., 2009). The amino acid at position 309 is located in a pocket on the surface of NA, and an asparagine-to-serine mutation at this position was identified in an antigenic escape variant (Wan et al., 2013). Viruses of clade 1.1.2 encode NA-289(309)N (numbers indicate the amino acid position with and without the 20-amino acid deletion, respectively), whereas those of clades 2.3.2.1a,c encode NA-289(309)D (Tables 3, 4 and Supplementary Table S2); we currently do not know if the aspartic acid in this position affects the antigenic properties of NA. The emergence of NA variants with resistance to NA inhibitors is a major public health concern. Most clade 2.3.2.1a viruses characterized in the present study encode NA-106 (126) Table S2). The NA-H106(126)N mutation resulted in slightly increased (although not high) resistance to the NA inhibitors oseltamivir and zanamivir (Sheu et al., 2008), suggesting that clade 2.3.2.1a viruses may possess slightly reduced sensitivity to commonly used NA inhibitors.
The PB2 protein (one of the viral polymerase subunits) is a major determinant of host range and pathogenicity (Subbarao et al., 1993;Hatta et al., 2001;Li et al., 2005;Shaw and Palese, 2013). The H5N1 viruses studied here encode PB2-627E (Subbarao et al., 1993;Hatta et al., 2001) and -701D (Li et al., 2005), that is, the amino acid residues typically found in avian influenza viruses (Table 3). However, the clade 2.3.2.1a,c viruses possess PB2-147T, -339T, and -588T (Tables 3, 4, and Supplementary Figure S1 and Supplementary Table S2), which we recently identified as novel determinants of avian H5N1 virulence in mammals (Fan et al., 2014). In contrast, viruses of clade 1.1.2 encode the avian-like sequences PB2-147I, -339K, and -588A. A PB2-A271T mutation increased the polymerase activity of an avian influenza virus polymerase in mammalian cells (Bussey et al., 2010). In addition, several computational studies identified the amino acid at position 271 of PB2 as host-specific, with avian influenza viruses typically encoding threonine, and human influenza viruses typically encoding alanine (Shaw et al., 2002;Chen et al., 2006;Finkelstein et al., 2007;Miotto et al., 2010). All clade 1.1.2 viruses characterized in the present study encode PB2-271M (Tables 3, 4, and Supplementary Figure S1 and Supplementary Table S2); this residue is found in 32 of the 89 clade 1.1.2 PB2 proteins deposited in the Influenza Research Database 4 , with the remaining clade 1.1.2 PB2 proteins encoding threonine at this position (with the exception of one virus encoding alanine). We currently do not know the significance of the methionine residue found among clade 1.1.2 viruses at this position.  The PA protein encodes another subunit of the polymerase complex. The clade 1.1.2 and most of the clade 2.3.2.1a viruses analyzed here encode PA-343A (the most commonly found amino acid at this position) ( Table 3 and  Supplementary Table S2); in contrast, the clade 2.3.2.1c and some of the clade 2.3.2.1a viruses analyzed in our study encode PA-343S, which is prevalent among human influenza viruses of the H3N2 subtype. A PA-A343T substitution increased the replicative ability of an H5N1 virus in mammalian cells (Yamaji et al., 2015). Importantly, we recently characterized two of the viruses analyzed here (A/duck/Vietnam/QT1480/2012 and A/duck/Vietnam/QT1728/2013) and found that the PA-A343S mutation alone and in combination with a PA-D347E mutation significantly increased viral polymerase activity and mouse virulence . In addition, viruses of clades 1.1.2, 2.3.2.1a, and 2.3.2.1c differ at several other amino acid positions in PA that affect the polymerase activity, including 204 (Liu et al., 2016), 237 (Hu et al., 2013), and 391 (Liu et al., 2016) (Table 3 and Supplementary Table S2).
The viral nucleoprotein NP of clade 1.1.2 viruses encodes a methionine at position 105, whereas most of the clade 2.3.21a,c viruses analyzed here possess NP-105V (Tables 3, 4 and  Supplementary Table S2). The NP-M105V mutation increased the pathogenicity of an H5N1 virus in chickens (Tada et al., 2011), suggesting that the amino acid at this position contributes to H5N1 virulence in chickens.
In the viral M1 matrix protein, clade 1.1.2 viruses encode isoleucine at position 205 (Tables 3, 4 and  Supplementary Table S2), which conferred high-growth properties to a reassortant human virus (Yasuda et al., 1994); in contrast, clades 2.3.2.1a and 2.3.2.1c viruses encode a valine residue, which is commonly found at this position among H5 influenza viruses. The viruses analyzed here also differ at several amino acid positions in the M2 ion channel protein that affect sensitivity to ion channel inhibitors (Hay et al., 1985(Hay et al., , 1986: The M2-31N residue of clade 1.1.2 viruses (Tables 3, 4 and  Supplementary Table S2) confers resistance to ion channel inhibitors; viruses encoding M2-31S (such as the clade 2.3.2.1a,c viruses) are sensitive to ion channel inhibitors. The amino acids found here at positions 26 and 27 of M2 (M2-26I/L or M2-27V/I, respectively) ( Table 3 and Supplementary Table S2) also affect sensitivity to ion channel inhibitors (Lee et al., 2008;Chen and Guan, 2015).

Analysis of Viral Subpopulations
Next, we assessed the viral subpopulations of the 42 H5N1 influenza viruses that were deep-sequenced ( Table 2). We only considered sequence variants with a Phred quality score >30, a minimum frequency of 1%, and at least 1,000 reads at positions where variants were found. For our analysis, we focused on nonsynonymous single-nucleotide polymorphisms (SNPs), that is, sequence variants that resulted in amino acid changes.
Applying the parameters described above, we detected the largest numbers of non-synonymous SNPs in the PB2, PB1, PA, and HA proteins (Supplementary Table S3). Given that M2 is only 97 amino acids, the number of non-synonymous SNPs in M2 was relatively high; however, almost all of these SNPs were detected in the three clade 2.3.2.1c viruses isolated from swallows. Currently, we do not know the significance of this finding. The fewest numbers of non-synonymous SNPs were detected in NS2 (a short viral protein of 121 amino acids that does not play a major role in host adaptation) and M1, which is relatively conserved among influenza A viruses.
Among the samples analyzed here, we detected a relatively high number of non-synonymous SNPs in a virus isolated from an apparently healthy swallow: 29 non-synonymous SNPs in A/swallow/Vietnam/NT25/2013 ( Table 2). This large number of SNPs may reflect adaptation to a new host. Interestingly, barn swallows have been discussed as "bridge hosts" that may transfer influenza viruses between waterfowl and domestic avian species (Caron et al., 2014(Caron et al., , 2017. Next, we assessed the viral subpopulations for variants that may affect virulence (Supplementary Table S3). For example, we detected viral subpopulations with amino acid changes at HA positions located at the rim of the RBS (which may affect receptor-binding and antigenicity), or at the RBS. Similarly, for NA, we found viral subpopulations FIGURE 3 | Effect of selected mutations detected by deep-sequencing on the viral polymerase activity. 293T cells were transfected with four plasmids expressing the PB1, PA, and NP proteins; consensus or variant PB2 protein of A/duck/Vietnam/ST1488-3/2012 (ST1488-3); a plasmid for the expression of a virus-like RNA encoding the firefly luciferase gene; and a control plasmid encoding Renilla luciferase, and assayed after a 48 h incubation at 33 and 37 • C. Relative polymerase activity was determined and normalized to that of consensus ST1488-3 polymerase complexes. Data shown are mean values with standard deviations for the results of three independent experiments. P-values were calculated by one-way ANOVA, followed by Dunnett's test ( * P < 0.05; * * P < 0.01 compared to consensus ST1488-3).
We also compared the viral subpopulations of the avian samples with those of human H5N1 virus samples (Imai et al., 2018). Only one amino acid position (PA-142) showed sequence variability in both sets of samples; an amino acid change at this position significantly altered the polymerase activity (Imai et al., 2018).

Functional Analysis of Mutations in Viral Subpopulations
To assess the functional effect of SNPs detected by deepsequencing, we selected two mutations, A674E and L675I (Supplementary Table S3) in the PB2 protein of A/duck/Vietnam/ST1488-3/2012 (ST1488-3). These mutations were chosen because of their close proximity, their location on the surface of the protein, and the fact that the amino acid at position 674 may be host specific, based on computational analyses (Shaw et al., 2002;Chen et al., 2006;Finkelstein et al., 2007;Miotto et al., 2010). To assess a potential role of these sequence variants, we tested the polymerase activities of viral polymerase complexes in minireplicon assay in mammalian cells at 37 and 33 • C. Introduction of mutation A674E or L675I into the consensus PB2 protein of ST1488-3 resulted in a small, but significant increase of polymerase activity in 293T cells at both temperatures (Figure 3). These results suggested that the sequence variants detected among H5N1 influenza viruses may play a biological role.
In summary, avian surveillance in Vietnam in 2009-2013 resulted in the isolation of a number of H5N1 viruses from different subclades. By 2013, viruses of clade 2.3.2.1c had become dominant and were isolated from different provinces in Vietnam. Deep-sequencing identified viral subpopulations with amino acid changes that may facilitate adaptation to mammals. A virus isolated from a healthy barn swallow encoded a high number of non-synonymous SNPs, suggesting that these viruses were under adaptive pressure in barn swallows.

AUTHOR CONTRIBUTIONS
ML, HN, and PVMH collected the surveillance samples. GZ, SF, PH, MH, and GN isolated the viruses, carried out Sanger sequencing, and/or analyzed the Sanger and deep-sequencing data. HvB and JD carried out deep-sequencing. TL processed the deep-sequencing data. YS, JJ, and GS performed the phylogenetic analyses. GZ, SF, GN, and YK planned the study and wrote the manuscript.

ACKNOWLEDGMENTS
We thank Susan Watson for scientific editing. We also thank Kelly Moore, Lisa Burley, Alexander Karasin, and Peter Jester for technical assistance.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb. 2019.01411/full#supplementary-material FIGURE S1 | Maximum-likelihood (ML) phylogeny of avian PB2 genes. Red font denotes H5N1 influenza viruses collected in Vietnam in this study. Bootstrap values greater than 50% are indicated at the nodes. The scale bar represents nucleotide substitutions per site. Vertical bars indicate (sub)clades and groups of viruses possessing mammalian-adapting amino acid changes.   TABLE S3 | Summary of deep-sequencing data: shown are non-synonymous SNPs found at a frequency ≥1% that passed our quality control (see the section "Materials and Methods"); data are sorted by viral protein and, for each protein, by the frequency of the SNP.