RNA-seq data for olive flounder (Paralichthys olivaceus) according to water temperature


Abstract
We provide raw data from a transcriptomic analysis of olive flounder in response to changes in water temperature. At the time of this analysis, the olive flounder genome was not yet available in China, and there were no related references. Therefore, de novo assembly was used to reconstruct the full nucleotide sequences from the sequence information of the sequenced reads. The functions of the expressed genes, based on Gene Ontology analysis, are also categorized and presented.
© 2019 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Data
The major causes of stress in aquaculture can be classified into chemical and physical factors. Among the physical factors, changes in water temperature in particular stress fish and affect their physiological activity, and excessive temperature stimulation can cause mortality. In addition, sudden changes in water temperature caused by cold water in the East Sea of Korea during the summer may slow fish growth and cause disease. These data present RNA-seq results for the olive flounder, a major marine aquaculture species in Korea, as a function of water temperature.
The results of the sequence quality assessment for all samples are summarized in Table 1 as the number of reads and the average number of base pairs (bp). The total sequence length was 121,120,858 bp; the number of unigenes was 108,151; and the average length of the unigenes was 1,120 bp. The mapped reads were normalized to express the amount of RNA as Fragments Per Kilobase of transcript per Million mapped reads (FPKM) (Supplementary Table 1).
The number of reads mapped to each gene or transcript can be used to determine its expression level in each sample. However, the raw number of mapped reads is not an objective measure of expression, because the amount of sequencing data may differ between samples and the number of mapped reads also varies with the length of the gene or transcript. Normalization is therefore required to reduce error and obtain a more objective value. One popular method is the FPKM calculation, which is based on the number of fragments per transcript. For a paired-end read, a pair of reads constitutes a single fragment; therefore, FPKM can be used for RNA-seq analysis of paired-end reads. Expression values greater than 1 are shown separately in Table 2.
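The FPKM normalization described above can be illustrated with a minimal sketch. It uses the standard definition, FPKM = fragments × 10^9 / (transcript length in bp × total mapped fragments). The function name, the unigene identifiers, and all numbers below are hypothetical and are not taken from the dataset; the values in Supplementary Table 1 were produced by the authors' own pipeline, not by this code.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    Dividing by the transcript length (in kb) corrects for longer transcripts
    collecting more fragments; dividing by the total number of mapped
    fragments (in millions) corrects for differences in sequencing depth.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: two unigenes with the same fragment count but
# different lengths, in a sample with 10 million mapped fragments.
counts = {"unigene_A": 500, "unigene_B": 500}
lengths_bp = {"unigene_A": 1000, "unigene_B": 2000}
total_fragments = 10_000_000

fpkm_values = {
    gene: fpkm(counts[gene], lengths_bp[gene], total_fragments)
    for gene in counts
}
for gene, value in fpkm_values.items():
    print(gene, value)
```

In this toy example the longer unigene receives half the FPKM of the shorter one despite identical fragment counts, which is the length correction the paragraph above refers to.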
Table 3 shows the number of genes with a p-value of less than 0.05 and a greater than twofold difference in expression level, based on the analysis of differentially expressed genes (DEGs) between the 13 °C and 20 °C groups for each time period. The Gene ID, p-value, log2 fold change (log2fc) value, etc., for each section are provided in Supplementary Table 2. Genes with a p-value of less than 0.001 in the DEG analysis were divided into three independent categories through Gene Ontology (GO) analysis: Molecular Function, Biological Process, and Cellular Component (Table 4). Detailed GO IDs, categories, gene names, descriptions, etc., are provided in Supplementary Table 3.
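As an illustration of the DEG criteria stated above (p-value below 0.05 and more than a twofold difference, i.e. an absolute log2 fold change above 1), the following sketch filters a small list of records. The field names `gene_id`, `p_value`, and `log2fc`, the function name, and the example rows are assumptions made for this sketch; they do not reproduce the layout or contents of Supplementary Table 2.

```python
import math


def select_degs(rows, p_cutoff=0.05, min_fold=2.0):
    """Keep rows with p < p_cutoff and more than a min_fold-fold change.

    A fold change greater than min_fold in either direction corresponds to
    |log2fc| > log2(min_fold), which is 1 for a twofold cutoff.
    """
    threshold = math.log2(min_fold)
    selected = []
    for row in rows:
        if row["p_value"] < p_cutoff and abs(row["log2fc"]) > threshold:
            direction = "up" if row["log2fc"] > 0 else "down"
            selected.append({**row, "direction": direction})
    return selected


# Hypothetical records, not taken from the dataset.
example_rows = [
    {"gene_id": "gene_1", "p_value": 0.01, "log2fc": 1.8},   # kept, up
    {"gene_id": "gene_2", "p_value": 0.20, "log2fc": 3.0},   # p too high
    {"gene_id": "gene_3", "p_value": 0.03, "log2fc": -1.2},  # kept, down
    {"gene_id": "gene_4", "p_value": 0.04, "log2fc": 0.5},   # change too small
]

degs = select_degs(example_rows)
for deg in degs:
    print(deg["gene_id"], deg["direction"])
```

Passing `p_cutoff=0.001` would mimic the stricter threshold used to choose genes for the GO analysis, again only as a sketch.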

Value of the data
Transcriptome data for olive flounder can provide insight into the gene expression alterations in this species in response to changes in water temperature and can further provide insights into other fish species.
Comparison of gene expression data between low and high temperatures reveals a preliminary stress-related gene associated with environmental changes.
Functional analysis data can be used in future studies to anticipate the biological pathways of olive flounder when the water temperature suddenly changes.

Experimental design, materials, and methods
The average weight and total length of the olive flounder used in the study were 124.2 g and 23.76 cm, respectively. The fish were acclimated at 20 °C for one week. The fish were divided into two groups: group 1, in which the water temperature was decreased to 13 °C within 30 minutes; and group 2, which was maintained at 20 °C. Sampling was performed three times, and samples were named "water temperature_intermediate sampling time-number of repeats". For example, the 13 °C cold stimulation, day 3, 2nd sample was named 13_4d-2.
The kidneys of the fish can be classified into head, body, and tail. The head kidney is located at the front of the kidney (near the head of the fish) and is reported to be involved in hematopoiesis and hormone secretion. Head kidneys were sampled from each group at 4 hours, 1 day, 3 days, and 7 days. Total RNA was isolated from the sampled head kidneys using TRIzol (Invitrogen, Carlsbad, CA, USA).
We created a dUTP second-strand library starting from 200 ng. We then fragmented the RNA in 1× fragmentation buffer (Affymetrix) at 80 °C for 4 min, and purified and concentrated the RNA to 6 µL by ethanol precipitation. We added an index (8-base barcode) to each library to enable pooling of the libraries. The adaptor ligation step was performed using 1.2 µL of index adaptor mix and 4,000 cohesive end units of T4 DNA Ligase (New England Biolabs) overnight at 16 °C in a final volume of 20 µL. Finally, we generated libraries with insert sizes ranging from 225 to 425 bp.
The raw sequencing data were cleaned by removing regions with low quality scores, using the quality trimming of the FastQC program [1], and then assembled. Assembly was performed using Trinity, a method for efficient and robust de novo reconstruction of transcriptomes that consists of three software modules, Inchworm, Chrysalis, and Butterfly, applied sequentially to handle large quantities of RNA-seq reads [2]. We used the CD-HIT program to produce a non-redundant dataset through clustering and alignment of the sequencing data. The confirmation and clustering procedures support full parallel processing.
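The sample naming scheme given earlier in this section ("water temperature_intermediate sampling time-number of repeats", for example 13_4d-2) can be split into its parts with a short parser. This is a minimal sketch: the `SampleName` type, the function name, and the assumption that sampling times carry an `h` (hours) or `d` (days) suffix are ours; only the `d` suffix appears in the example from the text.

```python
import re
from typing import NamedTuple


class SampleName(NamedTuple):
    temperature_c: int  # water temperature in degrees Celsius
    time_value: int     # numeric part of the sampling time
    time_unit: str      # "h" or "d" (the "h" suffix is an assumption)
    replicate: int      # repeat number


# temperature _ time (number + unit) - replicate, e.g. "13_4d-2"
_SAMPLE_PATTERN = re.compile(r"^(\d+)_(\d+)([hd])-(\d+)$")


def parse_sample_name(name):
    """Split a sample name such as '13_4d-2' into its components."""
    match = _SAMPLE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized sample name: {name!r}")
    temperature, value, unit, replicate = match.groups()
    return SampleName(int(temperature), int(value), unit, int(replicate))


parsed = parse_sample_name("13_4d-2")
print(parsed)
```

Keeping the parts as typed fields makes it straightforward to group samples by temperature and time point when comparing the 13 °C and 20 °C groups.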
CD-HIT is implemented in the C++ programming language and uses OpenMP (http://www.openmp.org) for parallelization [3]. The RSEM software package was used to estimate the expression levels of genes and isoforms from the RNA-seq data. A typical RSEM run consists of two steps. First, a set of reference transcript sequences is generated and preprocessed for use by the subsequent RSEM steps. Second, a set of RNA-seq reads is aligned to the reference transcripts, and the resulting alignments are used to estimate abundances and their confidence intervals [4]. InterProScan and Blast2GO software were used to predict protein sequence domains and perform functional analysis. The InterPro database is available on the web server (http://www.ebi.ac.uk/interpro). The database can be searched using the query order or
* When the p-value of the expression level of the same gene between the two groups was less than 0.05, the difference was considered significant and the number of such genes is indicated. The number of genes with significantly higher expression in the G2 group than in the G1 group (**) and the number with lower expression (***) are indicated.
Table 4 Overview of gene ontology.