Full Genome of Influenza A (H7N9) Virus Derived by Direct Sequencing without Culture

An epidemic caused by influenza A (H7N9) virus was recently reported in China. Deep sequencing revealed the full genome of the virus obtained directly from a patient’s sputum without virus culture. The full genome showed substantial sequence heterogeneity and large differences compared with that from embryonated chicken eggs.

Recently, a novel influenza A (H7N9) virus infected humans in China (1,2), leading to great concerns about its threat to public health (3). However, almost all the current genomes of the novel subtype H7N9 virus have been sequenced after culture in embryonated chicken eggs or mammalian cells. Switching the evolutionary selection pressure from the in vivo human respiratory tract to embryonated chicken eggs might introduce mutations into the final genome sequences during culture (4). We report determination of the full genome of the influenza A (H7N9) virus derived directly by deep sequencing, without virus culture, from a sputum specimen of an infected human. Deep sequencing provides a direct way to evaluate the genome characteristics and potential virulence and transmissibility of the novel influenza A (H7N9) virus.

The Study
We collected a sputum specimen from a 54-year-old woman with fever, cough, sputum production, and pneumonia. Influenza A (H7N9) virus was detected in the specimen by specific real-time reverse transcription PCR (RT-PCR). The specimen was then processed with a viral particle-protected nucleic acid purification method (5). Total RNA was extracted and amplified by sequence-independent PCR (5) and then sequenced with an Illumina/Solexa GAII sequencer (Illumina, San Diego, CA, USA). After the sequence adapters and RT-PCR primers were filtered out, the 80-base reads generated by the Illumina/Solexa GAII were aligned directly to the nucleotide sequences of influenza A viruses in the National Center for Biotechnology Information nonredundant nucleotide database by the blastn program in the BLAST (6) software package, version 2.2.22 (www.ncbi.nlm.nih.gov/blast), with parameters -e 1e-5 -F T (-e 1e-5 to select highly similar reads and -F T to mask low-complexity sequence). No assembly was performed before alignment. We obtained 19,177 reads aligned to influenza A viruses.
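The read-counting step above can be illustrated with a minimal Python sketch. It assumes the BLAST hits were written in the 12-column tabular format (query ID in column 1, E-value in column 11); the article does not state which output format was used, and the function name and example records are hypothetical.

```python
def count_aligned_reads(tabular_lines, max_evalue=1e-5):
    """Count distinct reads with at least one hit at or below max_evalue.

    Assumes 12-column tabular BLAST output (an assumption, not stated in
    the article): column 1 is the read ID, column 11 is the E-value.
    """
    aligned = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        read_id = fields[0]
        evalue = float(fields[10])
        if evalue <= max_evalue:
            aligned.add(read_id)  # a read is counted once, however many hits it has
    return len(aligned)


# Hypothetical records: read1 has two hits, read3 fails the E-value cutoff.
example = [
    "read1\tflu_seg4\t98.75\t80\t1\t0\t1\t80\t101\t180\t2e-35\t149",
    "read1\tflu_seg4b\t97.50\t80\t2\t0\t1\t80\t101\t180\t9e-34\t143",
    "read2\tflu_seg6\t100.00\t80\t0\t0\t1\t80\t5\t84\t1e-38\t159",
    "read3\tother\t85.00\t40\t6\t0\t1\t40\t9\t48\t0.002\t38.2",
]
n_aligned = count_aligned_reads(example)
```

Counting distinct read IDs rather than hit lines keeps a read that matches several database entries from being counted more than once.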
We then conducted a reference-guided assembly of the 19,177 reads by using the Seqman program in the DNA-Star software package version 7.1 (www.dnastar.com). The novel influenza A (H7N9) virus A/Anhui/1/2013 was selected as the reference. With 80% minimum sequence similarity tolerance and 12 bp minimum match size, the 19,177 reads were assembled into 439 contigs. The top 8 contigs covered by the most reads corresponded to the 8 genome segments of the novel influenza A (H7N9) virus. The other contigs did not align to the reference virus, which might have resulted from sequencing or assembly errors. By calculating the consensus sequence, we obtained the genome of the influenza A (H7N9) virus directly from the sputum specimen of this patient. Further RT-PCR and Sanger sequencing confirmed the quality of the assembled subtype H7N9 virus genome. Sequences were deposited in GenBank under accession nos. KF226105-KF226120 and KF278742-KF278749.
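As an illustration of what calculating a consensus sequence involves, the following Python sketch takes a simple majority vote at each reference position. It is not the Seqman algorithm, whose internals are not described here; the function name, the ungapped (start, sequence) read representation, and the toy reads are assumptions made for the example.

```python
from collections import Counter


def majority_consensus(aligned_reads, ref_length, uncovered="N"):
    """Majority-vote consensus from ungapped reads placed on a reference.

    aligned_reads: iterable of (start, sequence), where start is the
    0-based reference position of the first base of the read.
    Positions with no coverage are reported as `uncovered`.
    """
    columns = [Counter() for _ in range(ref_length)]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < ref_length:
                columns[pos][base] += 1
    return "".join(
        col.most_common(1)[0][0] if col else uncovered for col in columns
    )


# Toy reads on an 8-base reference; position 2 is covered by G, G, and C.
reads = [(0, "ACGT"), (2, "GTAA"), (2, "CTAA"), (4, "AAC")]
toy_consensus = majority_consensus(reads, ref_length=8)
```

A real assembler also handles insertions, deletions, and base quality, which this sketch deliberately omits.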
The influenza A (H7N9) genome that we report varies from that obtained by Sanger sequencing after passage in the allantoic sac and amniotic cavity of 9-11-day-old specific pathogen-free embryonated chicken eggs for 48-72 hours at 35°C (Table 1). In the nucleocapsid protein (NP) segment, 15 point mutations were found; 13 were synonymous and 2 induced amino acid changes S321N and M371I. In the nonstructural (NS) protein segment, 5 point mutations were found; all caused amino acid changes R59H, P107L, and V111Q. In the polymerase acidic (PA) protein segment, 3 point mutations were found, 1 of which caused amino acid change V707F. In the polymerase basic 1 (PB1) protein segment, 2 point mutations were found, both of which were synonymous. In the PB2 segment, 2 point mutations were found, 1 of which caused amino acid change S534F.
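The distinction between synonymous and amino acid-changing point mutations used above can be sketched in Python with the standard genetic code. The codons in the example are hypothetical, because only the resulting amino acid changes are reported here; the function name is likewise an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (cDNA alphabet).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(ref_codon, alt_codon, aa_position):
    """Return 'synonymous' or a label such as 'S321N' for a codon change."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    return f"{ref_aa}{aa_position}{alt_aa}"


# Hypothetical codons chosen only to reproduce the notation used in the text.
label_a = classify_change("AGC", "AAC", 321)  # serine to asparagine
label_b = classify_change("ATG", "ATA", 371)  # methionine to isoleucine
label_c = classify_change("CTT", "CTC", 100)  # leucine to leucine
```

Because several codons encode the same amino acid, a point mutation in the third codon position is often synonymous, which is consistent with most of the NP changes being silent.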
The influenza A (H7N9) genome also showed substantial intraspecimen heterogeneity. Deep sequencing revealed that average coverage (the ratio of the total number of nucleotides in all reads to the length of the reference gene) varied widely across the 8 genes.
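The coverage definition given above amounts to a one-line calculation, sketched here in Python. The read count and segment length in the example are hypothetical and are not values from this study.

```python
def average_coverage(read_lengths, reference_length):
    """Total number of nucleotides in all reads divided by reference length."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    return sum(read_lengths) / reference_length


# Hypothetical example: 500 reads of 80 bases mapped to a 1,000-nt segment.
example_coverage = average_coverage([80] * 500, 1000)
```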
Average coverage (± SD) was highest for neuraminidase ( Besides gene abundance, the genome sequence of influenza A (H7N9) virus also demonstrated heterogeneity (heterozygous peak threshold 80%). In total, 22 positions were confirmed by PCR and Sanger sequencing to be heterogeneous (Table 2). In the NP segment, 4 positions demonstrated heterogeneity; 3 were synonymous and 1 induced amino acid change E421K. In the NS segment, 3 positions demonstrated heterogeneity; 2 were synonymous and 1 induced amino acid change R140W. In the hemagglutinin segment, 7 positions demonstrated heterogeneity; 6 were synonymous and 1 induced amino acid change H242Y. In the NA segment, 3 positions demonstrated heterogeneity; 2 induced amino acid changes (S92L and S108L) and 1 was synonymous. In the PA segment, 2 positions demonstrated heterogeneity; both were synonymous. In the PB2 segment, 3 positions demonstrated heterogeneity; all were nonsynonymous (S532L, S533L, and S534F). Only 1 of these heterogeneous sites overlapped with the sites that mutated after passage in embryonated chicken eggs.
Compared with the reference influenza A (H7N9) virus strain A/Anhui/1/2013, the influenza A (H7N9) virus from this specimen demonstrated prominent sequence differences (Table 2). In particular, the amino acid at position 627 of PB2 of A/Anhui/1/2013 is K, whereas the corresponding amino acid in the subtype H7N9 genome reported here is E. The amino acid at position 368 of PB1 of A/Anhui/1/2013 is V, whereas the corresponding amino acid in the subtype H7N9 genome reported here is I. The E627K mutation in PB2 and the I368V mutation in PB1 are closely associated with the virulence and transmissibility of avian influenza A virus in mammals (1). MEGA5.0 (www.megasoftware.net) was used to construct the phylogenetic trees on the basis of the nucleotide sequences of influenza A (H7N9) viruses in the Global Initiative on Sharing All Influenza Data (GISAID) database (7). We conducted 2 rounds of phylogenetic analysis. First, to examine whether this subtype H7N9 virus clustered with the available subtype H7N9 strains, we included all influenza A (H7N9) viruses in the GISAID database. We next included all influenza A (H7N9) viruses isolated in China in 2013 to closely investigate the relationships between this virus and available subtype H7N9 genomes isolated during epidemics. However, the phylogenetic topologies based on different gene segments were not consistent (Figures 1, 2; online Technical Appendix Figures 1-6, wwwnc.cdc.gov/EID/article/19/11/13-0664-Techapp1.pdf), suggesting that the influenza A (H7N9) virus may have been evolving for some time (8).
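The residue-by-residue comparison with A/Anhui/1/2013 described above (for example, position 627 of PB2) can be sketched in Python as a scan over two aligned protein sequences. The function name and the 4-residue toy sequences are hypothetical; real input would be full-length, pre-aligned protein sequences of equal length.

```python
def residue_differences(seq_a, seq_b):
    """List (1-based position, residue in seq_a, residue in seq_b) at mismatches.

    Both sequences must already be aligned to the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# Toy sequences: the two differ only at position 3 (E in one, K in the other).
toy_differences = residue_differences("MKEL", "MKKL")
```

Reporting the position together with both residues leaves the direction of the change explicit, which matters when a label such as E627K is read as a change from one strain to another.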

Conclusion
Using deep sequencing technologies, we derived the full-length genome of the novel influenza A (H7N9) virus directly from the sputum specimen of a patient, without conducting virus culture. The full genome revealed substantial sequence heterogeneity within the specimen, obvious sequence variations from that obtained from embryonated chicken eggs, and prominent differences from the available influenza A (H7N9) strains, most of which were sequenced after culture.