Data on RNA-seq analysis of Drosophila melanogaster during ageing

Ageing is defined as the gradual decline of the physiological, cellular and molecular state of an organism with time. Age-associated cellular dysfunction usually causes chronic diseases such as diabetes, cancer and other age-related diseases. Many of the genes and pathways involved in ageing are conserved across species. These genes and pathways have been categorised into nine cellular and molecular hallmarks, namely genomic instability, telomere attrition, loss of proteostasis, mitochondrial dysfunction, epigenetic alterations, deregulated nutrient sensing, stem cell exhaustion, cellular senescence and altered intercellular communication. Despite numerous studies, the molecular mechanism of ageing remains poorly understood. Here, we performed genome-wide transcriptome profiling of the ageing process in D. melanogaster by comparing 1-day-old and 60-day-old flies. Raw data were generated on the Illumina HiSeq platform. Differential expression analysis, GO classification and KEGG pathway enrichment analysis were then performed. The raw data have been deposited in the SRA database under BioProject ID PRJNA718442. These data provide a basis for future research to identify the genes and pathways involved in ageing.



Value of the Data
• These data provide a comprehensive, higher-resolution picture of the gene expression changes and pathways involved in ageing in D. melanogaster.
• The dataset and analysis provided here can be useful for researchers who use D. melanogaster to study ageing and age-related diseases such as Alzheimer's disease, cancer and cardiovascular diseases.
• By applying different workflows, the raw RNA-seq data provided here can be reanalysed to investigate the role of coding and non-coding genes in ageing. In addition, the analysis provided here sheds light on potential genes and pathways involved in the ageing process, supporting further molecular research into novel anti-ageing strategies and treatments for age-related diseases.

Data Description
To investigate changes in the molecular landscape during ageing, day 1 and day 60 D. melanogaster flies were chosen as the model system and RNA sequencing was performed on the Illumina HiSeq platform. Table 1 provides accession numbers and links for the raw RNA-seq data. In total, there are three paired-end libraries for day 1 flies and three paired-end libraries for day 60 flies. Raw reads were mapped with HISAT2 and differential expression analysis was performed using edgeR. Table 2 summarises the library statistics and mapping, including the number of raw reads, the number of cleaned reads and the mapping rates. Differentially expressed genes, with their fold changes and expression levels in counts per million (CPM), are listed in Supplementary 1. The differentially expressed genes were then used for GO classification and KEGG pathway analysis. The enriched GO terms for biological process, cellular component and molecular function, together with the number of differentially expressed genes associated with each term, are presented in Tables 3-5, respectively. Table 6 shows the results of the KEGG pathway enrichment analysis in day 60 compared with day 1 flies, including the number of differentially expressed genes associated with each KEGG pathway.
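As a rough illustration of the expression measures mentioned above, the following Python sketch computes counts per million (CPM) for one library and a simple log2 fold change between day 60 and day 1. It is not the edgeR computation itself: edgeR uses normalised library sizes and model-based fold-change estimates. The function names, the `prior` parameter, and the gene counts are hypothetical and chosen only for illustration.

```python
import math


def counts_per_million(counts):
    """Scale the raw read counts of one library to counts per million (CPM)."""
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("library contains no counted reads")
    return {gene: count * 1_000_000 / library_size for gene, count in counts.items()}


def log2_fold_change(cpm_day60, cpm_day1, prior=1.0):
    """log2 ratio of day 60 to day 1 CPM; the prior (an assumption) avoids division by zero."""
    return math.log2((cpm_day60 + prior) / (cpm_day1 + prior))


# Hypothetical counts for two made-up genes, not values from the dataset.
day1 = counts_per_million({"geneA": 250, "geneB": 750})
day60 = counts_per_million({"geneA": 600, "geneB": 400})

for gene in day1:
    print(gene, round(log2_fold_change(day60[gene], day1[gene]), 3))
```

A positive value indicates higher expression in day 60 flies and a negative value indicates lower expression.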

Total RNA extraction, library construction, and RNA-seq
An equal number of male and female flies was used for total RNA extraction. RNA was extracted using a combination of Trizol reagent (Invitrogen, USA) and the RNeasy MinElute Cleanup Kit (Qiagen, Germany). The flies were homogenized in 500 μL of Trizol reagent, and 100 μL of chloroform was then added to the mixture. The sample was thoroughly mixed and centrifuged at 10,000 × g for five minutes. A volume of 1000 μL of isopropanol was added to the aqueous layer and thoroughly mixed. The sample was cleaned up using the MinElute Cleanup Kit according to the manufacturer's protocol. gDNA was removed using the Turbo™ DNase Kit (Thermo Fisher Scientific, USA). The quality of the extracted RNA was assessed by agarose gel electrophoresis, NanoDrop 2000 (Thermo Fisher Scientific, USA) and Agilent 2100 Bioanalyzer (Agilent, USA). High-quality RNA (≥ 5 μg; ≥ 200 ng/μL; OD260/280 = 1.8-2.2) was used for library construction.

Table 3 Enriched GO terms featuring biological process. Significantly differentially expressed genes in day 60 compared with day 1 are categorised into 27 GO terms featuring biological process at a significance threshold of P-value < 0.05. The number of differentially expressed genes related to each GO term is presented as a count with its respective P-value.

For library construction, the standard Illumina protocol was employed. First, mRNA was enriched using poly-T oligo-attached magnetic beads. The mRNA was then fragmented using divalent cations. First-strand cDNA synthesis was performed using SuperScript II, followed by second-strand synthesis. End repair was performed to remove any overhangs prior to adenylation of the 3' ends. Adapters were then ligated, and size selection (150-200 bp) was performed. The purified, size-selected library was sequenced on the Illumina HiSeq platform. The raw data were trimmed and cleaned by removing low-quality reads and adapter sequences.
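The RNA acceptance criteria stated in the extraction step above (≥ 5 μg total, ≥ 200 ng/μL, OD260/280 of 1.8-2.2) can be written as a simple check. This is a minimal sketch: the `RnaSample` class, the `passes_rna_qc` function, and the sample values are hypothetical and are not part of the published workflow.

```python
from dataclasses import dataclass


@dataclass
class RnaSample:
    name: str
    total_ug: float        # total RNA yield in micrograms
    conc_ng_per_ul: float  # concentration in ng/uL
    od260_280: float       # absorbance ratio at 260/280 nm


def passes_rna_qc(sample):
    """Return True if the sample meets the thresholds stated in the text."""
    return (
        sample.total_ug >= 5.0
        and sample.conc_ng_per_ul >= 200.0
        and 1.8 <= sample.od260_280 <= 2.2
    )


# Hypothetical measurements, for illustration only.
samples = [
    RnaSample("day1_rep1", total_ug=8.2, conc_ng_per_ul=410.0, od260_280=2.05),
    RnaSample("day60_rep1", total_ug=4.1, conc_ng_per_ul=205.0, od260_280=1.95),
]

for s in samples:
    print(s.name, "accepted" if passes_rna_qc(s) else "rejected")
```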

Differential expression analysis
RNA-seq reads were aligned to the D. melanogaster reference genome using HISAT2 version 2.1.0 [1]. The genome file, Drosophila_melanogaster.BDGP6.28.dna_sm.toplevel.fa.gz, was downloaded from Ensembl. To quantify the expression level of transcripts, the alignment files generated by HISAT2 were used as input for featureCounts [2]. The resulting counts were then used as input for differential expression analysis with edgeR [3]. The statistical program edgeR

Table 4 Enriched GO terms featuring cellular component. Significantly differentially expressed genes in day 60 versus day 1 are categorised into 25 GO terms featuring cellular component at a significance threshold of P-value < 0.05. The number of differentially expressed genes related to each GO term is presented as a count with its respective P-value.
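The alignment and counting steps described above (HISAT2 followed by featureCounts) could be assembled along the following lines. This sketch only builds the command lines and does not run them. The index prefix, FASTQ and SAM file names, annotation file, sample labels, and thread count are all assumptions, since the article does not list the exact commands or options that were used. The edgeR step is performed in R and is not shown.

```python
def hisat2_build_command(genome_fasta, index_prefix):
    """Command that builds a HISAT2 index from an uncompressed genome FASTA."""
    return ["hisat2-build", genome_fasta, index_prefix]


def hisat2_command(index_prefix, reads_1, reads_2, sam_out, threads=4):
    """Command that aligns one paired-end library with HISAT2."""
    return [
        "hisat2", "-p", str(threads),
        "-x", index_prefix,
        "-1", reads_1, "-2", reads_2,
        "-S", sam_out,
    ]


def featurecounts_command(annotation_gtf, counts_out, alignments, threads=4):
    """Command that counts reads per gene across all alignment files.

    -p marks the input as paired-end; newer Subread releases also need
    --countReadPairs to count fragments instead of individual reads.
    """
    return [
        "featureCounts", "-p", "-T", str(threads),
        "-a", annotation_gtf, "-o", counts_out, *alignments,
    ]


# Hypothetical sample labels: three libraries per age group, as in the dataset.
samples = [f"day{age}_rep{rep}" for age in (1, 60) for rep in (1, 2, 3)]

commands = [
    # Assumes the Ensembl FASTA has been decompressed beforehand.
    hisat2_build_command("Drosophila_melanogaster.BDGP6.28.dna_sm.toplevel.fa", "BDGP6.28")
]
for name in samples:
    commands.append(
        hisat2_command("BDGP6.28", f"{name}_1.fastq.gz", f"{name}_2.fastq.gz", f"{name}.sam")
    )
commands.append(
    featurecounts_command("annotation.gtf", "counts.txt", [f"{n}.sam" for n in samples])
)

for cmd in commands:
    print(" ".join(cmd))
```

Each command list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed and the input files exist. The resulting count table is the input for the edgeR analysis.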
Institutes of Health guide for the care and use of laboratory animals (NIH Publications No. 8023, revised 1978) and Guide for the Care and Use of Laboratory Animals: 8th Edition.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.