Comparative Transcriptome Analysis between Fertile and CMS Flower Buds in Wucai (Brassica campestris L.)

Wucai (Brassica campestris L. ssp. chinensis var. rosularis Tsen) is a variant of nonheading Chinese cabbage (Brassica campestris L.), which is one of the major vegetables in China. Cytoplasmic male sterility (CMS) has been used for Wucai breeding in recent years. However, the underlying molecular mechanism of Wucai CMS remains unclear. In this study, the phenotypic and cytological features of Wucai CMS were observed by anatomical analysis, and a comparative transcriptome analysis was carried out to identify genes related to male sterility using Illumina RNA sequencing technology (RNA-Seq). Microscopic observation demonstrated that tapetum development was abnormal in the CMS line, which failed to produce fertile pollen. Bioinformatics analysis detected 4430 differentially expressed genes (DEGs) between the fertile and sterile flower buds. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed to better understand the functions of these DEGs. Among the DEGs, 35 genes (53 DEGS) were implicated in anther and pollen development, and 11 genes were involved in pollen cell wall formation and modification; most of these showed downregulated expression in sterile buds. In addition, several genes related to tapetum development (A6, AMS, MS1, MYB39, and TSM1) and a few genes annotated to flowering (CO, AP3, VIN3, FLC, FT, and AGL) were detected and confirmed by qRT-PCR as being expressed at the meiosis, tetrad, and uninucleate microspore stages, thus implying possible roles in specifying or determining the fate and development of the tapetum, male gametophyte and stamen. Moreover, the top four largest transcription factor families (MYB, bHLH, NAC and WRKY) were analyzed, and most showed reduced expression in sterile buds. These differentially expressed transcription factors might result in abortion of pollen development in Wucai. The present comparative transcriptome analysis suggested that many key genes and transcription factors involved in anther development show reduced gene expression patterns in the CMS line, which might contribute to male sterility in Wucai. This study provides valuable information for a better understanding of CMS molecular mechanisms and functional genome studies in Wucai.


Background
Wucai (Brassica campestris L. ssp. chinensis var. rosularis Tsen) is a variant of nonheading Chinese cabbage (Brassica campestris L.), which is the most important species in the Brassicaceae family [1]. As an important autumn and winter vegetable crop, this crop is cultured widely in most parts of China, where it originated, especially in the Yangtze-Huaihe River Basin, and has become increasingly popular in other countries for its beautiful shape and significant levels of vitamins and minerals [2]. In recent years, cytoplasmic male sterility (CMS) has been used in some cultivated breeds [3] to generate stronger plants and higher hybrid seed yield [4,5].
Owing to an interaction between mitochondrial and nuclear genes, the CMS phenotype fails to produce functional anthers, pollen or male gametes [6]. Thus, understanding the delicate and complex processes of anther and pollen development is a prerequisite for comprehension of this unique phenomenon in CMS plants [5]. However, anther and pollen development is a critical phase in the plant life cycle, which contains a series of correlated events involving a diverse range of genes in complex regulatory networks [7][8][9]. Dysfunction of these genes may lead to male sterility [10]. Although many of these genes have been isolated and analyzed to have vital roles in CMS, the regulatory network and the novel genes underlying CMS occurrence are still largely unknown [8,10].
In recent decades, genetic research into CMS occurrence has included two main types, map-based cloning and sequence-based transcriptome assays [7]. Using AFLP and SSR techniques for gene mapping, Xu et al. [11] identified the restorer gene BrRfp from the pol-like CMS restorer line of heading Chinese cabbage (B. rapa). Compared with the gene mapping method, the Illumina sequencing (RNA-Seq) technique could offer several key advantages over existing technologies [12]. This form of transcriptional analysis allows for the determination of genome-wide expression levels as well as identification of new genes and SNPs, especially genes with very low abundance [13][14][15]. Furthermore, the results of RNA-Seq also show high levels of reproducibility for both technical and biological replicates [16]. Therefore, taking these advantages into account, RNA-Seq has been used successfully in the pollen and anther development of Brassica crops, such as B. napus [6,9,14,17], B. rapa [7,18], B. oleracea [19][20][21], B. campestris [5], and B. juncea [22]. However, to the best of our knowledge, the genome-wide transcriptional profiles and related genes of fertile and sterile flower buds from Wucai have not yet been reported through RNA-Seq technology.
In our previous study, a newly sterile plant of Wucai was generated by hybridization with nonheading Chinese cabbage, and a stable sterility line was developed via backcrossing for ten generations. In this present study, the objective was to further understand the differences in the transcriptome between the CMS line and its maintainer line and to find some molecular clues to this CMS system. Accordingly, mRNA was isolated from the flower buds of fertile and sterile plants, respectively, and then, genome-wide transcriptional profiling was performed using the Illumina RNA-Seq platform. Based on bioinformatics analysis, a large number of candidate genes and transcription factors involved in anther and pollen development were isolated, and various screened candidate genes related to pollen development were further analyzed by qRT-PCR. Our results may contribute to an understanding of CMS molecular mechanisms and provide useful information for further heterosis breeding in Wucai.

Phenotypic and cytological characterization
After ten generations of backcrossing, there was no difference in morphological phenotype between the sterile line 12-14A and its maintainer line 12-14B ( Fig. 1a  and b), and the forms of the corolla and flower seemed normal ( Fig. 1c-f ). However, compared with those of the fertile flower, shorter filaments and undeveloped anthers were observed on the stamens of the sterile flower ( Fig. 1g-h).
To accurately characterize the cause of the pollen abortion, semithin sections of the buds from the two lines of Wucai were observed. As shown in Fig. 1i and IM, there was no obvious difference in the meiosis period between sterile and fertile anthers. However, at the tetrad stage, the tapetal cells expanded, and the microspore could not carry out meiotic division (Fig. 1j). After this stage of anther development, the tapetal cells vacuolated and filled in the sacs, and the microspores degraded ( Fig. 1k), which caused pollen abortion (Fig. 1l). In contrast, a normal tapetum and fertile pollen grains developed in the fertile anthers ( Fig. 1m-p).

De novo assembly and sequence annotation
To further understand the molecular mechanisms of CMS differences in Wucai, RNA-Seq was performed using Illumina technology. After the raw data were trimmed, a total of 52,936,673 clean reads for the fertile samples and 52,606,810 for the sterile samples were obtained, and the Q20 and Q30 were > 96.61 and > 92.53%, respectively (Table 1). In addition, the GC contents were consistently approximately 45% for both sterile and fertile samples ( Table 1), suggesting that the sequencing was highly accurate. All clean reads (105,543,483) were assembled using the Trinity program [23]. As the result, 117,332 contigs were obtained with a mean length of 901 nt (Table 1). After clustering, 80,851 unigenes (> 200 bp) were generated; the average length was 1054 nt, and the N50 was 1586 nt ( Table 1). The lengths of all unigenes were longer than 199 bp, and 86.95% of them ranged from 200 to 1999 bp (Additional file 1: Table S1). The assembled unigenes were subjected to search against the Nr, Swiss-Prot and COG databases, and 66,143 (81.81%), 54,857 (67.85%) and 28,129 (34.79%) unigenes were aligned against these three protein databases, respectively (Additional file 2: Table S2). The species distribution showed that that almost all of the sequences matched sequences from the Brassicaceae (Additional file 3: Figure S1).

Identification of differentially expressed genes
To gain better insight into the differences in gene expression patterns, we identified differentially expressed genes (DEGs) between the sterile and fertile lines. A total of 4430 genes (including 147 novel genes) were identified in the sterile and fertile comparison, including 980 genes upregulated and 3450 downregulated in sterile buds ( Fig. 2; Additional file 4: Table S3). Among these DEGs, 1384 specifically expressed genes were observed that were expressed in only the fertile (1044) or sterile (340) samples. These results showed that the number of downregulated DEGs was considerably higher than that of upregulated DEGs. In addition, 147 novel genes were identified that were not annotated to any database. The biological functions of these novel genes remain to be determined (Additional file 5: Table S4).

Functional annotation by Gene Ontology
To investigate the function of the DEGs, the genes that showed significant differential expression were subjected to analysis by Blast2GO software. As shown in Additional file 6: Figure S2

Pathway mapping by Kyoto Encyclopedia of Genes and Genomes
To understand the biological functions of DEGs that might be active in Wucai, pathway annotation was performed against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The results showed that 2217 of 4430 DEGs were assigned to 119 KEGG pathways (Additional file 7: Table S5). The 20 most significantly enriched KEGG pathways are shown in Fig. 3. The pathways with significantly more DEGs were metabolic pathways (676, 15.26%), biosynthesis of secondary metabolites (284, 6.41%), plant-pathogen interaction (162, 3.61%), and starch and sucrose metabolism (124, 2.80%). In starch and sucrose metabolism, a total of 124 DEGs were screened, and 38 of these DEGs were expressed in only fertile buds, while 71 DEGs were downregulated in sterile buds (Additional file 8: Table S6). These pathway annotations provide a basis for investigating gene functions involved in male sterility in Wucai.

Genes related to anther and pollen development
Pollen development is a complex process that involves many events and plays an important role in plant propagation. In this study, all of the DEGs were annotated against the processes of anther and pollen development of A. thaliana. As shown in Table 2, 35 genes are considered to regulate male gametophyte development in Wucai. From A6 to ZAT5, 30 genes were downregulated in sterile buds. In contrast, 5 other genes, BT2, SCC12, TCMO, VAL2 and XPO1, were upregulated in sterile buds. Among these genes, A6, AMS, ENL2, MS1, MYB39, ORTH2, PLRX1 and TSM1 are also considered to be involved in tapetum development. In addition, we found several genes associated with cell wall formation and modification, such as the Pectinesterase gene (PME5), UDP-arabinose mutase gene (RGP1), and Cinnamoyl-CoA reductase gene (CCR2), which might participate in the processes leading to CMS in Wucai.

Differentially expressed transcription factor genes
In the anther and pollen development processes, transcription factors are generally thought to be important regulators. To identify differentially expressed transcription factors, all of the DEGs were annotated. In this study, 131 transcription factors (182 DEGs) were found, including 128 down-and 54 upregulated DEGs (Additional file 9: Table S7). Among these transcription factors, 27 up-and 8 downregulated DEGs were specific to fertile and sterile buds, respectively. In addition, 13 DEGs were associated with 8 WRKY transcription factor genes, and WRKY19 (Unigene3849, CL2284.Contig2, CL2120.Contig3) and WRKY32 (CL4008.Contig1) were upregulated in only sterile buds. Fifteen DEGs were identified with 10 NAC transcription factor genes, and 6 of them were highly expressed in sterile buds. In the bHLH and MYB transcription factor families, a total of 43 DEGs were associated with 16 bHLH and 13 MYB transcription factors, and 10 bHLHs (15 DEGs) and 8 MYBs (16 DEGs) were downregulated in sterile buds, respectively ( Fig. 4, Table 3). These differentially expressed transcription factors might result in abortion of pollen development in Wucai.  All of the 28 DEGs exhibited the same tendency between the RNA-Seq analysis and qRT-PCR results, which suggested that our transcriptome analysis was accurate and reliable.
To further determine the expression pattern of key genes in the anther and pollen development, 2 transcription factors and 6 tapetum and pollen cell wall development genes were selected from the above for qRT-PCR assay (Fig. 6). Among these genes, the pollen cell wall formation gene PME5 (Unigene37636) was highly and specifically expressed in the fertile buds at the tetrad stage. The five tapetum development genes and one transcription factor (BH089) were highly expressed at the meiosis or tetrad stage in the fertile buds, and all of them shoed low abundance in sterile buds. The other transcription factor, BH077, was highly expressed at meiosis in sterile buds. These results further confirmed the reliability of the RNA-Seq data.
In addition, 8 flowering genes were also examined in this study (Fig. 6). Among these genes, CO, AP3 and FT were highly expressed at the tetrad stage, and VIN3 was highly expressed in the meiosis period in sterile buds. FLC, AGL18, AGL104-1 and AGL104-2 were highly expressed at the meiosis or tetrad stage in fertile buds. The abnormal expression of these genes might influence the development of male gametophytes and stamens, leading to male sterility (Fig. 7).

Discussion
In higher plants, male sterility is a common phenotypic trait in which the abortion of stamens occurs and plants fail to produce functional anthers, pollen or male gametes under typical natural conditions [20,24]. As the male reproductive organ, stamens play an important role in plant inheritance [5]. In this present study, morphological comparisons were performed between fertile and sterile lines of Wucai (Fig. 1a-h), and there was no difference between them except the stamens, which had shorter filaments and aborted anthers in the sterile flowers ( Fig. 1g-h). A cytological examination was further carried out to evaluate the differences in pollen development between the fertile and sterile lines, and we observed that anther abortion occurred consistently in the sterile line, in which the tapetum developed abnormally and the microspore began to degrade after the meiotic stage (Fig. 1i-l). These results were consistent      with those of Liu et al. [7] and Zhou et al. [5] and suggested that the abnormal development of the tapetal cells and microspores led to pollen and anther abortion.
To better identify the genes associated with pollen abortion in this CMS line of Wucai, a comprehensive analysis of transcript profiles between fertile buds and sterile buds was performed using RNA sequencing technology, which could detect low abundance transcripts and provide new insights into male sterility through global investigation of gene expression changes [5,25]. A total of 105,543,483 clean reads and 117,332 contigs were obtained based on the RNA-Seq data, and 980 upregulated (1.21%) and 3450 downregulated (4.27%) DEGs out of 80,851 unigenes were identified based on their gene expression levels ( Fig. 2; Table 1; Additional file 4: Table S3). These results indicated that changes in the expression of a large number of related genes could cause male sterility in Wucai, though the development of anther and pollen is a complicated process and involves numerous genes.
In the KEGG enrichment results, 4430 DEGs were classified into 119 metabolic pathways (Fig. 3, Additional file 7: Table S5), and these pathways might encompass all the biological pathways in anther development [26]. Among these pathways, starch and sucrose metabolism provides energy and carbon for anther development, and starch and sucrose are accumulated as energy reserves for pollen maturation [4,27]. In our research, out of 124 DEGs involved in this pathway, 38 DEGs were expressed in only fertile buds and 2 DEGs (Unigene23056 and Unigene11909) were expressed in only sterile buds (Additional file 8: Table S6). The specific expression of these genes might lead to disturbances in the metabolism of starch and sucrose and the processing of energy reserves, which could suppress pollen development and ultimately lead to male sterility [7,28]. This finding was consistent with those of previous works [4,19,29,30].
In addition to the metabolic pathways, many key genes have been identified for pollen and anther development in Arabidopsis [31] and Brassica [9,20,32]. It is important to note that we identified 35 anther and pollen development related genes (53 unigenes) that have homologs in Arabidopsis and Brassica, and most of them were downregulated and associated with the development of the tapetum and pollen cell wall ( Table 2). Among these 53 unigenes, 9 DEGs (6 genes: ACA2, AGD10, AGL18, PME5, TMK1 and ZAT5) were expressed in only fertile buds (log 2 Ratio(S/F) > 17), which might offer new insights into the mechanisms of CMS regulation in Wucai. ACA2 encodes a Calcium-transporting ATPase 2 (plasma membrane-type), which regulates the Ca 2+ -mediated signaling pathway during pollen development [33,34]. The nonexpression of this gene in sterile buds might disrupt the Ca 2+ balance in the pollen mother cell. However, interestingly, AGD10 might be involved in root development as an ARF-GAP protein [35][36][37], and AGL18, encoding a MADS-box protein, has been reported as a flowering-inhibiting factor [38].  (Table 2), in which critical chemical changes could lead to pollen abortion [5].
It has been reported that some constituents of the pollen wall are secreted from tapetal cells [39,40]. Abnormal (early or delayed) tapetal cell degeneration can result in male sterility [5]. In conjunction with our cytological observations of Wucai buds (Fig. 1i-p), several genes related to tapetum development were revealed ( Table 2). As a basic helix-loop-helix (bHLH) protein, AMS is required for tapetal cell biosynthesis, postmeiotic microspore and pollen wall formation, and tapetum programmed cell death (PCD) by directly regulating target genes involved in these biological pathways [41][42][43]. MS1 encodes a transcription factor of the PHD finger family and is specifically expressed in microsporocytes [44]. A6, a tapetum-specific protein secreted by the tapetal cells, displays similarity to β-1,3-glucanases, which degrade callose during pollen development [45]. TSM1 encodes a Fig. 5 qRT-PCR verification of differentially expressed unigenes. S means sterile sample, and F means fertile sample. Relative expression levels were calculated using Actin as an internal control cation-dependent CCoAOMT-like protein involved in phenylpropanoid polyamine conjugate biosynthesis and has a function in stamen/pollen development [46,47]. Downregulated expression of these genes could result in degeneration of the tapetum, eventually leading to abortion.
The regulation of transcription is a fundamental process in all living organisms [48]. Transcription factors can regulate multiple related downstream genes, which are essential components of the cellular machinery and play key roles in plant growth and development [49]. In the present study, 131 transcription factors (182 DEGs) were found (Additional file 9: Table S7). Among these transcription factors, the top four largest families were bHLH (16), MYB (13), NAC (10), and WRKY (8) ( Table 3). The bHLH proteins, which bind as dimers to specific DNA target sites, are a superfamily of transcription factors, and several of them are critical for tapetal PCD and pollen development [41]. MYB transcription factors are also known to be required for anther and aleurone layer development, callose dissolution, and exine formation [19,50,51]. NAC and WRKY transcription factors consist of a large gene family involved in a wide range of biological processes [48,50], and some of them participate in pollen development (WRKY2, WRKY27; GPC, NST1) [48,[52][53][54]. Research over the past several years has demonstrated that changes in the expression of these transcription factors often cause male sterility [5,19].
In addition, FLOWERING LOCUS C (FLC), which encodes a MADS-box transcription factor and functions as a repressor of flowering [55], was noted in our comparative analysis (CL3897.Contig1; Additional file 9: Table S7). It has been reported that overexpression of this gene from B. campestris could affect fertility by the GA pathway in Arabidopsis [56]. However, in our study, we found that the FLC gene was downregulated in sterile buds, and several identified genes involved in stamens (AP3) [57] and the male gametophyte (AGL18 and AGL104) [58,59] (Fig. 6; Additional file 10: Table S8) Fig. 6 Expression of anther and pollen development related genes at different stages using qRT-PCR. S means sterile sample, and F means fertile sample. 1-3 indicate the pollen meiosis stage (bud sizes 0.5-1.5 mm), tetrad stage (1.5-3.0 mm) and uninucleate microspore stage (3.0-4.5 mm) of anther and pollen development, respectively. Relative expression levels were calculated using Actin as an internal control were downstream targets of FLC (Fig. 7). Among these genes, AGL18 and AGL104 showed low expression in fertile buds ( Fig. 6; Additional file 10: Table S8). We speculated that the downregulation of FLC, which is associated with male infertility, might influence the expression of key genes in anther and pollen development, along with other fertility related genes (Fig. 7). This hypothesis must be further verified.
Taken together, the present investigation of the transcriptome could increase our knowledge and understanding of the molecular mechanisms of male sterile in Wucai and provide numerous candidate genes that can be verified through transgenic technology in future.

Conclusions
In this study, a comparative transcriptome analysis of sterile and fertile buds from Wucai was performed through an Illumina sequencing approach, and the different biological processes and genes that regulated anther and pollen development were analyzed using comparative analysis. As a result, a total of 4430 DEGs, 174 novel genes, 35 anther and pollen development related genes, and 47 transcription factors (the top four largest families) were revealed. The RNA-Seq analysis was further confirmed through qRT-PCR. Based on the functional annotation and expression patterns, it was concluded that the occurrence of male sterility is probably related to the functional and metabolic abnormalities of these candidate DEGs in Wucai. These transcriptome data will be important to serve as a reference and provide insights for future elucidation of male sterility in Wucai.

Plant materials
Buds from near-isogenic lines of Wucai, CMS line 12-14A and its maintainer line 12-14B ( Fig. 1a and b), were used as the plant materials in this study. Backcrossed continuously for over ten generations, the male sterile line 12-14A of Wucai was generated from a CMS line of nonheading Chinese cabbage. The sterile line 12-14A and its maintainer line, 12-14B, were planted in the vegetable breeding fields of Anhui Agricultural University (Hefei, Anhui Province, China; longitude 117°1 4′E, latitude 31°52'N) from October until April of the following year.

Morphological and cytological observations
At the full-bloom stage, the flower structures of the CMS and fertile lines were observed using a Canon EOS550D digital camera (Canon, Japan), and the images from petals and stamens were captured with an Olympus SZX10 stereomicroscope (Olympus, Japan). The sections of flower buds from the CMS and fertile plants were obtained following the method described by Peng et al. [60]. The semithin sections were observed and photographed using an Olympus BX61 light microscope (Olympus, Japan) equipped with a Mshot MD30 camera (Olympus, Japan).

RNA extraction and Illumina sequencing
In this experiment, the buds of the sterile or fertile lines were collected from three different plants, respectively. According to the methods of Huang et al. [25], total RNA was isolated from the mixed bud samples using the TRIzol Reagent Kit (Invitrogen, USA) and purified using the Dynabeads® mRNA Purification Kit (Ambion, USA). The isolated RNA samples were sent to 1GENE Technology Co., Ltd. (Hangzhou, China; http://www.1gene.com.cn/) for Illumina sequencing (Illumina MiSeq platform) and unigene annotation. The raw transcriptome data of six samples from three biological replicates of sterile or fertile lines were deposited in the NCBI Short Read Archive (SRA, accession number: SRP145484).

De novo assembly and functional annotation analysis
The paired-end clean reads of each sample were de novo assembled into contigs using Trinity (http://trinityrnaseq.sourseforge.net), and the nonredundant unigenes were further obtained with the TGI Clustering tools [61]. Among these unigenes, there were several unigenes with a high degree of similarity (more than 70%) in the same cluster (starting with CL, followed by the gene family's number); the rest were singletons (starting with Unigene), which had low similarity (less than 70%) or no similarity and could not be clustered with each other. Then, functional annotation of the unigenes was performed using BLASTX alignment (E-value<1e-5) in the nonredundant (nr), Swiss-Prot, and COG databases. With the nr annotations, the Gene Ontology (GO) annotations of the unigenes were obtained through the Blas-t2GO program [62], and GO functional classification was carried out with the WEGO software [63]. The KEGG pathway annotation was performed using a BLAST search against the KEGG database (KEGG, http://www.genome.jp/kegg/).

Differentially expressed gene (DEG) identification
Reads per kilobase per million reads (RPKM) was adopted to compare the differences in unigene expression between the sterile and fertile lines. The DEGs were identified by a false discovery rate (FDR) ≤0.001 and an absolute value of log 2 ratio ≥ 1 (ratio = the fold change of differential expression) [13]. The DEGs were used for GO and KEGG enrichment analyses according to the method described by An et al. [14] and Liu et al. [13].