Transcriptomic data during seed maturation in dormant and non-dormant genotypes of wheat (Triticum aestivum L.)

These data profile large-scale transcriptome changes in seed tissues (embryo and endosperm) during maturation in dormant and non-dormant genotypes of hexaploid wheat. Seed dormancy is an adaptive trait that strongly influences the incidence of preharvest sprouting in wheat, that is, the germination of grains on the spike prior to harvest. Given that preharvest sprouting causes substantial yield and quality losses, elucidating the molecular features that regulate seed dormancy is of paramount significance for the development of preharvest sprouting resistant wheat cultivars. The data presented here were produced from total RNA/mRNA samples isolated from developing seeds of dormant and non-dormant wheat genotypes using the Affymetrix GeneChip Wheat Genome Array. The raw and normalized formats of these data are available in Gene Expression Omnibus (GEO), NCBI's gene expression data repository, under accession number GSE83077.


© 2019 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Data
This dataset represents a large-scale transcriptome comparison of embryo and endosperm tissues between dormant and non-dormant wheat genotypes during seed maturation. Total RNA samples were extracted from the embryonic tissues of maturing seeds of the two genotypes, while mRNA samples were isolated from the total RNA samples derived from the corresponding endospermic tissues. The total RNA samples of the embryos and the mRNA samples of the endosperms were subjected to microarray experiments using the Affymetrix GeneChip Wheat Genome Array (Affymetrix, Santa Clara, CA, USA). The raw and normalized formats of these data are available in Gene Expression Omnibus (GEO), NCBI's gene expression data repository (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE83077). Reproducibility of the transcriptomic data from the independent replicates of each sample was confirmed by scatter plot expression analysis with the squared Pearson correlation coefficient (R²) (Table 1; Figs. 1–4).
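As an illustration of the reproducibility check, the squared Pearson correlation coefficient between the signal vectors of two replicates can be computed as in the following minimal Python sketch. The function name `r_squared` and the example values are hypothetical and are not taken from the dataset; the sketch shows only the statistic itself, not the software used to produce Table 1 or the figures.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length signal vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length vectors with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant vector: correlation is undefined")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical log2 signals for five probesets in two replicates
replicate_1 = [7.1, 9.4, 5.2, 11.8, 8.3]
replicate_2 = [7.0, 9.6, 5.5, 11.5, 8.4]
print(round(r_squared(replicate_1, replicate_2), 3))
```

Values of R² close to 1 indicate that the two replicates rank and scale the probeset signals almost identically.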

Plant materials and growth conditions
Plants of the dormant wheat genotype AC Domain and the non-dormant genotype RL4452 [3,4] were grown in a growth chamber at 22 °C/18 °C (day/night) under a 16/8 h photoperiod until harvest, as described before [1]. Maturing seeds of each genotype were harvested at different seed maturation stages, from 20 to 50 days after anthesis (DAA). The seed maturation stages studied were determined based on extrusion of the yellow anther in the spikes, which was designated as 0 DAA. After harvesting, maturing seeds were separated into embryo (including scutellum) and endosperm (including pericarp and aleurone) tissues and immediately frozen in liquid nitrogen. The seed tissue samples were stored at −80 °C until they were used for RNA isolation.

Value of the data
The data profile tissue-specific, large-scale transcriptome changes during seed maturation in dormant and non-dormant genotypes of wheat.
The transcriptomic data can be used as an important genomic resource by wheat researchers studying transcriptome changes in response to loss of dormancy.
The data can be used as a resource to identify genes differentially expressed between different tissues of dormant and non-dormant seeds.
The data are useful for meta-analysis and provide important insights into genes that regulate seed dormancy and thereby preharvest sprouting in wheat.

Isolation of total RNA and mRNA samples
Total RNA was extracted from both embryo and endosperm tissues. The total RNA from the embryos was isolated using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany), while the total RNA from endosperm tissue was isolated as described previously [5]. The total RNA samples from the endosperm tissue were treated with DNase (Ambion, Austin, TX, USA) to remove any contaminating genomic DNA before they were subjected to mRNA isolation using the PolyATtract Kit (Promega, Madison, WI, USA) according to the manufacturer's instructions. For the endosperm tissues, the microarray procedures were performed with mRNA isolated from the total RNA samples in order to avoid interference from endospermic carbohydrate or starch.

Microarray experimental procedure
The total RNA or mRNA samples were used for cDNA synthesis. After purification, biotinylated cRNA samples were prepared using the GeneChip IVT Labelling Kit and the GeneChip Sample Cleanup Module (Affymetrix). The quality of the labelled cRNAs was assessed using an Agilent 2000 Bioanalyzer. Following fragmentation, labelled cRNA samples were hybridized for 16 h at 45 °C on the GeneChip Wheat Genome Array. Washing and staining of the GeneChips were performed in the Affymetrix Fluidics Station 450. Afterwards, the GeneChips were scanned using an Affymetrix Scanner 3000.

Data analysis
The number of probesets with 'present' detection was determined using the Affymetrix Microarray Suite (MAS5) statistical algorithm. Subsequently, the raw data were normalized using the Robust Multiarray Average (RMA) methodology. HarvEST WheatChip (http://harvest.ucr.edu/) [6] was used to annotate the probesets. The probesets that are differentially expressed between dormant and non-dormant seed tissues were identified by analysis of variance using FlexArray software [7]; probesets with a fold change of two or more at a probability level of 0.05 or less were considered differentially expressed. In light of the large number of samples and the associated cost, the following experimental strategy was devised to limit the number of replicates without affecting the statistics. First, microarray analysis of the 20 DAA embryo samples of the AC Domain genotype was performed using four replicates. Reproducibility of the transcriptome data between any two of the four independent replicates was evaluated using scatter plot expression analysis with the squared Pearson correlation coefficient (R²) (Table 1). On the basis of this evaluation, microarray analysis of the remaining samples, irrespective of genotype, tissue type and maturation stage, was performed using two independent replicates. Reproducibility of the transcriptomic data from the two independent replicates of both tissue samples (embryo and endosperm) derived from AC Domain (Figs. 1 and 2) and RL4452 (Figs. 3 and 4) was verified through scatter plot expression analysis as described above.
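The selection criterion described above (a fold change of two or more at a probability level of 0.05 or less) can be sketched in Python as follows. This is an illustrative sketch only, not the FlexArray procedure: it assumes log2-scale expression values (RMA output is on the log2 scale) and a p-value already obtained from a separate analysis of variance. The function name, parameter names and example values are hypothetical.

```python
import math


def is_differentially_expressed(log2_dormant, log2_nondormant, p_value,
                                min_fold=2.0, max_p=0.05):
    """Apply a fold-change and p-value cut-off to one probeset.

    log2_dormant / log2_nondormant: replicate signals on the log2 scale.
    p_value: assumed to come from a separate ANOVA (not computed here).
    """
    mean_dormant = sum(log2_dormant) / len(log2_dormant)
    mean_nondormant = sum(log2_nondormant) / len(log2_nondormant)
    # On the log2 scale, a difference of means equals the log2 fold change
    log2_fold_change = mean_dormant - mean_nondormant
    # abs() keeps both up- and down-regulated probesets
    passes_fold = abs(log2_fold_change) >= math.log2(min_fold)
    return passes_fold and p_value <= max_p


# Hypothetical probeset: two replicates per genotype, p-value supplied
print(is_differentially_expressed([9.0, 9.2], [7.9, 8.1], p_value=0.01))
```

Working on the log2 scale means that a two-fold change in either direction corresponds to an absolute difference of at least 1 between the group means.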