Data on transcriptome analysis from mesocarp tissue of mango Mangifera indica ‘Chokanan’ fruits

Mangoes come in different sizes, and consumers often favour bigger, fleshier fruits. Many genes play an important role in determining the growth, final size and shape of the mango. To better understand the genes involved in fruit development, a de novo transcriptomic analysis was performed at two stages of fruit development, the immature and ripening stages, using the Illumina HiSeq 4000 platform with 30× sequencing coverage. Approximately 128 Gb of clean nucleotides were obtained from 130 Gb of raw nucleotides sequenced from four fruit mesocarp samples covering both time points. The raw and clean data were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database under accession number PRJNA803945.


Value of the Data
• The data, obtained using an Illumina sequencer, are the first RNA-Seq resource for M. indica 'Chokanan'.
• The data provide a glimpse into the molecular biology of M. indica 'Chokanan' and can support gene manipulation experiments on genes of interest.
• The data presented here can be used for fruit quality improvement in mango varieties through the identification of molecular markers such as single nucleotide polymorphisms (SNPs) and microsatellites.
• The genetic information and gene sequences can contribute to transcriptomic databases.

Data
The RNA of the Malaysian mango cultivar 'Chokanan' was used for de novo transcriptome analysis using Illumina HiSeq 4000 sequencing technology, with a read length of 150 bp at each end. A sequencing coverage depth of 30× was used. Approximately 130 Gb of raw data were generated from four samples of 'Chokanan' mango fruits at two developmental stages. Table 1 summarizes the raw and clean reads generated from the transcriptomic sequencing of immature and ripening fruits of the mango 'Chokanan' variety. The raw and clean reads have been deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) (https://www.ncbi.nlm.nih.gov/sra) under the BioProject accession number PRJNA803945.
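The throughput figures above can be put in perspective with a little arithmetic. The following is a rough sketch only: it assumes that "Gb" means 10^9 bases, that every read is a full 150 bp, and that the four samples were sequenced to equal depth. None of these assumptions is stated in the text, and the per-sample counts actually obtained are those reported in Table 1.

```python
# Approximate figures quoted in the text.
RAW_GB = 130.0        # raw output, in gigabases (assumed: 1 Gb = 1e9 bases)
CLEAN_GB = 128.0      # clean output after trimming, in gigabases
READ_LENGTH_BP = 150  # read length at each end
N_SAMPLES = 4         # two stages x two biological replicates


def retained_fraction(raw_gb, clean_gb):
    """Fraction of sequenced bases that survived cleaning."""
    return clean_gb / raw_gb


def approx_read_count(total_gb, read_length_bp):
    """Estimated number of reads, assuming every read is full length."""
    return total_gb * 1e9 / read_length_bp


retained = retained_fraction(RAW_GB, CLEAN_GB)
raw_reads = approx_read_count(RAW_GB, READ_LENGTH_BP)

print(f"Bases retained after cleaning: {retained:.1%}")
print(f"Estimated raw reads (both ends): {raw_reads / 1e6:.0f} million")
# Assumes equal depth per sample, which the text does not state.
print(f"Estimated raw reads per sample: {raw_reads / N_SAMPLES / 1e6:.0f} million")
```

Under these assumptions roughly 98.5% of the sequenced bases are retained after cleaning.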

Fruit harvest
Mango fruits were harvested at two time points, corresponding to the immature and ripening stages, within the same season. Fruits were harvested from two trees grown in the same soil and from the same rootstock, thus minimizing external factors and genetic variation that may affect fruit development. The fruits were harvested according to days after pollination (DAP): at 21 DAP for the immature stage and at 75 DAP for the ripening stage [1]. Three fruits were harvested from each tree at each time point. Two of the fruits were used for RNA extraction for the transcriptomic analysis. The fruits were dissected by removing the skin, and the mesocarp was cut into 1 cm cubes. The mesocarp was quickly frozen in liquid nitrogen and stored at −80 °C until further use.

Total RNA isolation and cDNA synthesis
Two biological replicates of mango fruit mesocarp from each stage were sent to the laboratory of Apical Scientific Sdn Bhd (Malaysia) for RNA isolation and cDNA synthesis. Transcriptome data were generated from the total RNA extracted from the two replicates at each of the two developmental stages, resulting in four RNA-Seq samples for downstream analysis. A NanoDrop spectrophotometer (Thermo Scientific, Waltham, MA, USA) and the Bioanalyzer RNA 6000 chip system (Agilent Technologies, USA) were used to determine the quality and integrity of the total RNA. A 0.8% agarose gel was also used to examine the integrity of the RNA samples.

Transcriptomic sequencing
The paired-end transcriptomic sequencing and analysis were carried out by Apical Scientific Sdn Bhd (Malaysia) using the Illumina HiSeq 4000 sequencing platform with a read length of 150 bp at each end. Paired-end sequencing libraries with insert sizes of 350 bp were constructed prior to sequencing following the manufacturer's standard protocol (Illumina, San Diego, CA, USA). The raw reads generated from the samples were trimmed using SolexaQA++ with a Phred quality score threshold of Q20. FastQC was run on the FASTQ files with default parameters. De novo assembly of the data was done using Trinity RNA-Seq 2.0 with default settings [2].
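To illustrate what a Q20 threshold means in practice, the sketch below converts Phred scores to error probabilities (Q20 corresponds to a 1 in 100 chance of a wrong base call) and trims a read to its longest contiguous stretch of bases at or above the threshold. This is a simplified illustration of quality-based trimming, not the SolexaQA++ implementation or the exact settings used for this dataset. The function names, the Phred+33 quality encoding, the inclusive threshold and the example read are all assumptions made for the example.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def decode_qualities(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def longest_high_quality_segment(qualities, threshold=20):
    """Return (start, end) of the longest contiguous run with scores >= threshold.

    The end index is exclusive. Ties keep the earliest run, and (0, 0) is
    returned when no base passes the threshold.
    """
    best_start, best_end = 0, 0
    run_start = None
    for i, q in enumerate(qualities):
        if q >= threshold:
            if run_start is None:
                run_start = i
            if i + 1 - run_start > best_end - best_start:
                best_start, best_end = run_start, i + 1
        else:
            run_start = None
    return best_start, best_end


def trim_read(sequence, quality_string, threshold=20, offset=33):
    """Trim a read and its quality string to the longest high-quality segment."""
    qualities = decode_qualities(quality_string, offset)
    start, end = longest_high_quality_segment(qualities, threshold)
    return sequence[start:end], quality_string[start:end]


# Hypothetical read: '#' encodes Q2 and 'I' encodes Q40 under Phred+33.
seq = "ACGTACGTAC"
qual = "##IIIIII#I"
trimmed_seq, trimmed_qual = trim_read(seq, qual)
print(trimmed_seq, trimmed_qual)
print(phred_to_error_probability(20))
```

In this hypothetical read, the two low-quality bases at the start and the low-quality base near the end are removed, leaving the six-base high-quality core.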

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability
Data on transcriptome analysis from mesocarp tissue of mango Mangifera indica 'Chokanan' immature fruits (Original data) (NCBI).