RNA-Seq transcriptome data of human cells infected with influenza A/Puerto Rico/8/1934 (H1N1) virus

Human influenza remains a serious public health problem. This data article reports transcriptome data from human cell lines infected with influenza A/Puerto Rico/8/1934 (H1N1) virus. Mock-infected cells were included as controls. Human embryonic fibroblasts (MRC-5) and immortalized cell lines (A549, HEK293FT, WI-38 VA-13) were selected for RNA sequencing on the Illumina NextSeq 500 platform. Raw data were processed with a bioinformatic pipeline comprising quality control with FastQC and MultiQC, adapter and quality trimming with Cutadapt, removal of reads mapping to the influenza A genome with STAR, and transcript quantification with Salmon (GRCh38_RefSeq_Transcripts). Differentially expressed genes were identified using the R package DESeq2 with an FDR-adjusted p-value < 0.001 and an absolute log2(FC) > 1. Lists of differentially expressed genes are provided. The raw and processed RNA-seq data presented in this article were deposited in the European Nucleotide Archive via the ArrayExpress partner repository under the dataset accession number E-MTAB-9511.


Keywords: RNA-seq; Influenza A virus; H1N1; Human cells; Transcriptome

Value of the Data
• The analysis of differentially expressed genes across MRC-5, WI-38 VA-13, A549 and HEK293FT cell lines infected with influenza A virus may contribute to a better understanding of the mechanisms of cell permissiveness to influenza infection.
• Raw FASTQ files available in the ArrayExpress repository can be processed by researchers using their own bioinformatic pipelines or analyzed as part of larger combined data sets for extended statistical analysis.
• The comparison of several cell lines may inform the selection of cell lines for the isolation and propagation of seasonal strains and the production of vaccine strains.
• The identification of activated or deactivated metabolic and signaling pathways in infected cells may indicate directions for future research in the field of antiviral therapy.

Data Description
To study genes involved in the cellular response to influenza A virus infection, the transcriptome analysis of MRC-5, WI-38 VA-13, A549 and HEK293FT human cell lines infected with in-… Fig. 2.

Infection of cells and growth kinetics of influenza virus
Cells were grown in T25 cell culture flasks (TPP, Switzerland) to a 90-100% confluent monolayer, then infected with influenza A/Puerto Rico/8/1934 virus at a dose of 1 TCID50/cell (two replicates) and incubated at 37 °C for 48 h. At 24 and 48 h after infection, the culture medium was collected and virus titers were measured by plaque assay. Mock-infected cells were included as controls. After the incubation period, the cells were washed twice with PBS and lysed directly by adding LIRA reagent (Biolabmix, Russia).

RNA isolation
Total RNA was extracted from the cells with the LRU RNA extraction kit (Biolabmix, Russia) following the manufacturer's protocol. RNA concentration was measured on a Qubit 2 fluorometer (Thermo Fisher Scientific, USA) with the Qubit RNA HS Assay Kit (Thermo Fisher Scientific, USA). The quality of total RNA, expressed as the RNA Integrity Number (RIN), was determined on a Bioanalyzer 2100 instrument (Agilent, USA) using an Agilent RNA 6000 Pico Kit (Agilent, USA) [2]. Only samples with a RIN greater than 7.0 proceeded to library preparation.

Library preparation and sequencing
A total of 16 cDNA libraries were prepared: two biological replicates for each condition (mock-infected, 0 h, and infected, 48 h) of the MRC-5, WI-38 VA-13, A549 and HEK293FT cell lines. The cDNA libraries were constructed according to a standard protocol using a NEBNext Ultra II Directional RNA library preparation kit (New England Biolabs, UK) and a NEBNext mRNA Magnetic Isolation Module (New England Biolabs, UK). Library construction and massively parallel sequencing on an Illumina NextSeq 500 platform were carried out at the Institute of Fundamental Medicine and Biology, Kazan Federal University (Kazan, Russia). For mRNA isolation, fragmentation and priming, 1 μg of total RNA was used. Sequencing was performed with a NextSeq 500/550 High Output v2.5 Kit (Illumina, USA), generating 75-nucleotide single-end reads. For the prepared sequencing libraries, the fragment size distribution was analysed on a Bioanalyzer 2100 instrument (Agilent, USA) using an Agilent High Sensitivity DNA Kit (Agilent, USA), and the libraries were quantified on a Qubit 2 fluorometer (Invitrogen, USA) with the Qubit DNA HS Assay Kit (Thermo Fisher Scientific, USA). Fragment sizes ranged from 250 bp to 700 bp, with a clear peak at 300 bp.
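The experimental design described above (four cell lines, two conditions, two biological replicates) can be enumerated as a sample sheet. The following is a minimal sketch only; the condition labels and field names are hypothetical and are not taken from the deposited metadata.

```python
from itertools import product

# Cell lines named in the article.
CELL_LINES = ["MRC-5", "WI-38 VA-13", "A549", "HEK293FT"]
# Hypothetical condition labels: mock-infected (0 h) and infected (48 h).
CONDITIONS = ["mock_0h", "infected_48h"]
# Two biological replicates per condition.
REPLICATES = [1, 2]


def build_sample_sheet():
    """Return one record per cDNA library in the 4 x 2 x 2 design."""
    return [
        {"cell_line": cell_line, "condition": condition, "replicate": replicate}
        for cell_line, condition, replicate in product(
            CELL_LINES, CONDITIONS, REPLICATES
        )
    ]


samples = build_sample_sheet()
print(len(samples))
```

Enumerating the design this way gives a quick check that the number of records matches the 16 libraries reported.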

RNA-seq analysis
The raw data were saved as FASTQ files. Quality control of the raw and trimmed reads was performed using FastQC and MultiQC [3,4]. Adapter and quality trimming were performed using Cutadapt [5]. Reads aligning to the genome of influenza A/Puerto Rico/8/1934 (H1N1) were filtered out of the trimmed reads using STAR [6]. The filtered reads were used for transcript quantification with Salmon (GRCh38_RefSeq_Transcripts) [7]. The R package tximport was used to convert the transcript-level quantifications to gene-level quantifications [8].
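The final step, collapsing transcript-level estimates to gene level, can be illustrated with a short Python sketch. It is not the tximport implementation: it only sums the `NumReads` column of a Salmon `quant.sf` table per gene, whereas tximport also carries abundance and length information. The transcript identifiers, gene names and values below are invented for illustration, and the `summarize_to_genes` helper is hypothetical.

```python
import csv
import io
from collections import defaultdict


def summarize_to_genes(quant_sf_text, tx2gene):
    """Sum Salmon NumReads per gene using a transcript-to-gene mapping."""
    gene_counts = defaultdict(float)
    reader = csv.DictReader(io.StringIO(quant_sf_text), delimiter="\t")
    for row in reader:
        gene = tx2gene.get(row["Name"])
        if gene is None:
            # Transcripts without a gene assignment are skipped in this sketch.
            continue
        gene_counts[gene] += float(row["NumReads"])
    return dict(gene_counts)


# Toy quant.sf content with the standard Salmon column names (invented values).
toy_quant = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1500\t1350.0\t12.0\t10.0\n"
    "tx2\t900\t750.0\t8.0\t5.0\n"
    "tx3\t2000\t1850.0\t4.0\t7.5\n"
)
# Hypothetical transcript-to-gene mapping.
toy_tx2gene = {"tx1": "GENE_A", "tx2": "GENE_A", "tx3": "GENE_B"}

gene_level = summarize_to_genes(toy_quant, toy_tx2gene)
print(gene_level)
```

In this toy input the two transcripts of `GENE_A` are combined into a single gene-level count, which is the form of input expected by count-based differential expression tools.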

Differential expression analysis
To study genes involved in the cellular response to influenza A virus infection, differentially expressed genes were identified using the R package DESeq2 with an FDR-adjusted p-value < 0.001 and an absolute log2(FC) > 1 [9]. Overall, 2192, 1378, 2647 and 607 genes were significantly up-regulated and 2965, 1176, 2593 and 435 genes were significantly down-regulated in MRC-5, WI-38 VA-13, A549 and HEK293FT cells, respectively. Of these, 238 genes were up-regulated and 100 genes were down-regulated in all four cell lines.
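The selection criteria and the search for genes shared by all cell lines can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: it assumes the DESeq2 results have already been exported as (gene, log2 fold change, adjusted p-value) tuples, the gene identifiers and numbers are invented, and the `classify` and `common_genes` helpers are hypothetical.

```python
import math

PADJ_CUTOFF = 0.001  # FDR-adjusted p-value threshold from the article
LFC_CUTOFF = 1.0     # absolute log2 fold-change threshold from the article


def classify(results):
    """Split (gene, log2fc, padj) records into up- and down-regulated sets."""
    up, down = set(), set()
    for gene, log2fc, padj in results:
        # DESeq2 may report a missing adjusted p-value; such genes are skipped.
        if padj is None or math.isnan(padj) or padj >= PADJ_CUTOFF:
            continue
        if log2fc > LFC_CUTOFF:
            up.add(gene)
        elif log2fc < -LFC_CUTOFF:
            down.add(gene)
    return up, down


def common_genes(gene_sets):
    """Return the genes present in every set."""
    return set.intersection(*gene_sets)


# Invented toy results for two cell lines.
toy_results = {
    "A549": [("G1", 2.5, 1e-6), ("G2", -1.8, 1e-5), ("G3", 3.0, 0.01), ("G4", 0.5, 1e-9)],
    "MRC-5": [("G1", 1.2, 1e-4), ("G2", -2.0, 1e-8), ("G3", 2.0, 1e-5), ("G5", 1.5, float("nan"))],
}

classified = {line: classify(rows) for line, rows in toy_results.items()}
common_up = common_genes([up for up, _ in classified.values()])
common_down = common_genes([down for _, down in classified.values()])
print(sorted(common_up), sorted(common_down))
```

Applying both thresholds before intersecting means that a gene counts as common only if it passes the significance and effect-size criteria in every cell line.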

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.