Genome-Wide Analysis of Glycoside Hydrolase Family 1 β-glucosidase Genes in Brassica rapa and Their Potential Role in Pollen Development

Glycoside hydrolase family 1 (GH1) β-glucosidases (BGLUs) are encoded by a large number of genes, and are involved in many developmental processes and stress responses in plants. Due to their importance in plant growth and development, genome-wide analyses have been conducted in model plants (Arabidopsis and rice) and maize, but not in Brassica species, which are important vegetable crops. In this study, we systematically analyzed B. rapa BGLUs (BrBGLUs), and demonstrated the involvement of several genes in pollen development. Sixty-four BrBGLUs were identified in Brassica databases, which were anchored onto 10 chromosomes, with 10 tandem duplications. Phylogenetic analysis revealed that 64 genes were classified into 10 subgroups, and each subgroup had relatively conserved intron/exon structures. Clustering with Arabidopsis BGLUs (AtBGLUs) facilitated the identification of several important subgroups for flavonoid metabolism, the production of glucosinolates, the regulation of abscisic acid (ABA) levels, and other defense-related compounds. At least six BrBGLUs might be involved in pollen development. The expression of BrBGLU10/AtBGLU20, the analysis of co-expressed genes, and the examination of knocked down Arabidopsis plants strongly suggests that BrBGLU10/AtBGLU20 has an indispensable function in pollen development. The results that are obtained from this study may provide valuable information for the further understanding of β-glucosidase function and Brassica breeding, for nutraceuticals-rich Brassica crops.


Introduction
Glycoside hydrolases (EC 3.2.1) are classified into a group of enzymes that hydrolyze the glycosidic bonds of carbohydrates [1]. At the end of March in 2019, 161 families have been identified and classified in the CAZy (Carbohydrate-Active enZYmes) database [2,3]. Among these families, the glycoside hydrolase (GH) family 1 is recognized for its β-glycosidase activity, which largely contributes to various developmental processes and stress responses in plants [4,5]. Genome-wide analysis of GH1 β-glycosidase genes (BGLUs) has been conducted in three plant species: Arabidopsis, with 48 genes grouped into 10 subfamilies [6]; rice, with 40 genes grouped into eight subfamilies [5]; and maize, with 26 genes grouped into four subfamilies [7,8]. Recently, a comparison between the Arabidopsis and rice BGLUs with respect to sequence identity and expression revealed that these exhibited substantial tissue specificity and differential responses to various stress treatments, although these have a high degree of similarity [9]. However, no systematic analysis of BGLUs in Brassica rapa, which is an important vegetable crop, has been performed to date.
In addition to classifications based on genomic DNA organization, Arabidopsis BGLUs (AtBGLUs) could be classified in relation to their known functions, which shows that genes within the same subfamily may function in similar processes. A large number of AtBGLUs are involved in flavonoid metabolism: AtBGLU1-6 for flavonol accumulation [10,11], AtBGLU7-11 for anthocyanin glucosyltransferase [11,12], and AtBGLU12-17 for flavonoid utilization [10,13]. Seven genes (AtBGLU26, AtBGLU34-39) function as myrosinases for chemical defense against herbivores and pathogen attacks [14][15][16]. AtBGLU18 and AtBGLU33 regulate ABA responses by increasing ABA levels through the hydrolysis of glucose-conjugated ABA (ABA-GE) [17,18]. Scopolin, which is specifically produced in the roots, and which plays a role in a defense against pathogen attack and abiotic stresses [19,20], is controlled by ArBGLU21-23 [21,22]. The gene products encoded by AtBGLU45 and AtBGLU46 hydrolyze monolignol glucosides, thereby regulating lignin biosynthesis [23]. AtBGLU42 is involved in the induction of systemic resistance to bacterial disease, and the release of iron-mobilizing phenolic metabolites during iron deficiency [24]. However, no gene has been reported, with respect to pollen development.
During pollen development, the tapetum secretes various components, such as lipidic precursors and lipidics onto the pollen surface, leading to the formation of sculptured exine and exine cavities by hydrolyzation and other reactions [25]. In addition to lipid components, pollen wall development requires the regulation of polysaccharide metabolism [26], suggesting a possible involvement of the hydrolysis of glycosidic bonds of carbohydrates. Glycoside hydrolase has been reported involved in the cell wall polysaccharide degradation [27] and their coding genes were downregulated in the OsTDR (Tapetum Degeneration Retardation) mutant [28] and the sterile floral buds of B. rapa [29], indicating a possibility that β-glucosidase may play a role in pollen development.
In this study, we systematically identified Brassica rapa β-glycosidase genes (BrBGLUs) and analyzed their expression patterns and phylogenetic relationships. In addition, in silico analyses indicated that BrBGLU10/AtBGLU20 have conserved functions during pollen development, and knocking down AtBGLU20 using antisense oligos in Arabidopsis results in the production of aborted pollen grains. Furthermore, bioinformatics and molecular analyses provide valuable information on the function of BrBGLUs during pollen development.

Identification and Chromosomal Distribution of BrBGLUs
After a HMM (Hidden Markov Model) search, 64 BrBGLU genes were identified and designated as BrBGLU1 to BrBGLU64, according to their positions on the chromosomes ( Figure 1). The locus ID, genome location, coding sequence (CDS) length, and the protein length of the BrBGLUs are listed in Table 1. The genomic DNA sequences of the BrBGLUs ranged from 390 bp to 9617 bp. While the average length was 1293 bp, the length of the CDS of the BrBGLUs ranged from 267 bp to 2058 bp. The BrBGLU genes were heterogeneously distributed among all 10 chromosomes of B. rapa. Chromosome 5 contained the largest number of BrBGLU genes, comprising 15 members (23.4%), whereas chromosome 2 contained only one gene member. We also detected tandem arrays of the BrBGLU genes among the 10 B. rapa chromosomes. The tandem array was defined as 'multiple BrBGLU genes located in neighboring or the same intergenic region' [30]. Ten BrBGLU gene clusters were found on chromosomes A01, A03, A05, A07, and A09. Chromosome 5 contained the maximum number of clusters, comprising 11 BrBGLUs.

Phylogenetic and Gene Structure Analysis of BrBGLUs
To understand the evolutionary relationship of the BrBGLU genes, phylogenetic analysis of the BrBGLU and AtBGULU genes was conducted. To obtain AtBGLUs, HMM searching was performed by using all of the putative protein sequences of the Arabidopsis genome (ARAPORT11, https://www. araport.org) as queries. A total of 48 AtBGLU genes were obtained, which agrees with the results of a previous study [6]. The 64 BrBGLUs and 48 AtBGLUs protein sequences were aligned using ClustaX2 [31]. An unrooted phylogenetic tree was constructed for the 64 BrBGLUs and 48 AtBGLUs, using the NJ method in MEGA6 with a Poisson model. All BGLU proteins were classified into 10 distinct subgroups, namely, BGLU-a to BGLU-j ( Figure 2). The results of the phylogenetic analysis were relatively similar to the findings of a previous study using Arabidopsis [6], with a few exceptions. All B. rapa and Arabidopsis proteins are grouped into 10 subgroups, whereas Arabidopsis subgroups 8 and 9 were combined into a subgroup GH1-c in our analysis. In addition, AtBGLU48 (SENSITIVE TO FREEZING 2, SFR2), which belongs to a distinct lineage from 10 subgroups in a previous study [6,32], was incorporated into the GH1-j subgroup, together with BrBGLU8 and BrBGLU42 ( Figure 2).
Gene structure was commonly diversified during the evolution of the large number of gene families. To expand our knowledge of BrBGLUs in relation to evolution and functional diversification, the gene structures of the BrBGLUs were analyzed on the basis of exon-intron organization, using GSDS 2.0 [33]. The BrBGLUs exhibited 12 distinct exon-intron organization patterns, and the most common organization was 11 exons separated by 10 introns, presenting 19 members (Table 1 and Figure 3). Most genes contained more than two introns, except for BrBGLU46 and BrBGLU55, indicating the possible occurrence of alternative splicing during gene expression. The AtBGLUs exhibited 10 distinct exon-intron organization patterns, and the pattern with 13 exons was the most common [6]. This analysis is consistent with Arabidopsis and rice, where the intron size and number of the BGLUs genes are highly variable [5,6]. Multiple sequence alignment of GH1 proteins was performed using ClustalX2 with default parameters. The unrooted phylogenetic tree was constructed by MEGA 6 with the neighbor-joining (NJ) methods using the following parameters: bootstrap values (1,000 replicates) and Poisson model. The tree is divided into 11 phylogenetic subgroups, designated as GH1-a to GH1-k. Members of Arabidopsis and B. rapa are denoted by blue squares and red circles.
Gene structure was commonly diversified during the evolution of the large number of gene families. To expand our knowledge of BrBGLUs in relation to evolution and functional diversification, the gene structures of the BrBGLUs were analyzed on the basis of exon-intron organization, using GSDS 2.0 [33]. The BrBGLUs exhibited 12 distinct exon-intron organization patterns, and the most common organization was 11 exons separated by 10 introns, presenting 19 members (Table 1 and Figure 3). Most genes contained more than two introns, except for BrBGLU46 and BrBGLU55, indicating the possible occurrence of alternative splicing during gene expression. The AtBGLUs exhibited 10 distinct exon-intron organization patterns, and the pattern with 13 exons was the most common [6]. This analysis is consistent with Arabidopsis and rice, where the intron size and number of the BGLUs genes are highly variable [5,6].

Identification of BrBGLU Genes Involved in Pollen Development
Rice TDR (Tapetum Degeneration Retardation) mutant alters BGLU1 expression with flower specificity [28], and BGLU1 and BGLU13 are found to be related to male organ development in Calamus palustris [34]. These previous reports lead to a hypothesis that BrBGLUs are involved in pollen development. To test this hypothesis, the previously published microarray data relating to male sterility in B. rapa [29] were re-annotated, using the improved B. rapa genome (version 3.0) [35] and analyzed based on pollen development (Table S1). A total of 36 BrBGLUs, represented by 88 probes (or 88 ESTs) showed significant hybridization values, of which 12 BrBGLUs showed over two-fold change in expression levels between fertile and sterile floral buds: six members were upregulated, and members were downregulated in fertile buds. Among these genes, four upregulated genes (BrBGLU10/AtBGLU20, BrBGLU15/AtBGLU3, BrBGLU16/AtBGLU4, and BrBGLU64/AtBGLU41) and two downregulated genes (BrBGLU2/AtBGLU46 and BrBGLU19/AtBGLU30) were described as good candidates that were associated with pollen development. The function of all four upregulated genes has not been known up to now, but at least three, BrBGLU10, BrBGLU15, and BrBGLU64 appeared to be related to pollen wall development. In particular, we further analyzed BrBGLU10/AtBGLU20, as these showed hundred-fold changes between fertile and sterile buds.

Analysis of the Putative Functions of BrBGLU10/AtBGLU20 in Pollen Development
To gain more insights into the functions of the BrBGLUs during pollen development, BrBGLU10, which was highly and specifically expressed in fertile buds, was selected for further analysis. AtBGLU20, the Arabidopsis ortholog of BrBGLU10, was initially named as ATA27, which is one of the A. thaliana anther-specific expressed genes [36]. To confirm the expression patterns of BrBGLU10 and AtBGLU20, RT-PCR was conducted ( Figure 4A,B). The expression level of BrBGLU10 was specifically detected at the F1-F3 stages, with highest levels at the F2 stage, representing the tetrad stage, and AtBGLU20 was specifically expressed before floral stage 12. The RT-PCR results might imply its important role in pollen development. and BrBGLU19/AtBGLU30) were described as good candidates that were associated with pollen development. The function of all four upregulated genes has not been known up to now, but at least three, BrBGLU10, BrBGLU15, and BrBGLU64 appeared to be related to pollen wall development. In particular, we further analyzed BrBGLU10/AtBGLU20, as these showed hundred-fold changes between fertile and sterile buds.

Analysis of the Putative Functions of BrBGLU10/AtBGLU20 in Pollen Development
To gain more insights into the functions of the BrBGLUs during pollen development, BrBGLU10, which was highly and specifically expressed in fertile buds, was selected for further analysis. AtBGLU20, the Arabidopsis ortholog of BrBGLU10, was initially named as ATA27, which is one of the A. thaliana anther-specific expressed genes [36]. To confirm the expression patterns of BrBGLU10 and AtBGLU20, RT-PCR was conducted (Figures 4A, B). The expression level of BrBGLU10 was specifically detected at the F1-F3 stages, with highest levels at the F2 stage, representing the tetrad stage, and AtBGLU20 was specifically expressed before floral stage 12. The RT-PCR results might imply its important role in pollen development.  [29]. D, GO enrichment analysis of genes co-expressed with BrBGLU10. E, Expression pattern of AtBGLU20 and its co-expressed genes in various tissues of Arabidopsis, which was performed using the Arabidopsis eFP Browser (http://bar.utoronto.ca/efp/cgibin/efpWeb.cgi). F, GO enrichment analysis of genes co-expressed with AtBGLU20. S1-S3 represent the floral buds from male-sterile B. rapa. S1, before the tetrad stage. S2, after the tetrad stage. S3, containing aberrant pollen grains. F1-F4 indicate fertile B. rapa floral buds before the tetrad stage (F1), at the tetrad stage (F2), after the tetrad stage, but before containing mature pollen (F3), and containing mature pollen (F4). For Arabidopsis, FS1-12, flower stage 1 to stage 12; FS13-14, flower stage 13 to stage 14. PI, probe intensity.  [29]. D, GO enrichment analysis of genes co-expressed with BrBGLU10. E, Expression pattern of AtBGLU20 and its co-expressed genes in various tissues of Arabidopsis, which was performed using the Arabidopsis eFP Browser (http://bar.utoronto.ca/efp/cgibin/efpWeb.cgi). F, GO enrichment analysis of genes co-expressed with AtBGLU20. S1-S3 represent the floral buds from male-sterile B. rapa. S1, before the tetrad stage. S2, after the tetrad stage. S3, containing aberrant pollen grains. F1-F4 indicate fertile B. rapa floral buds before the tetrad stage (F1), at the tetrad stage (F2), after the tetrad stage, but before containing mature pollen (F3), and containing mature pollen (F4). For Arabidopsis, FS1-12, flower stage 1 to stage 12; FS13-14, flower stage 13 to stage 14. PI, probe intensity.
To demonstrate similar or conserved functions between BrBLU10 and AtBGLU20, we isolated the co-expressed genes of BrBGLU10, using microarray data [29] and AtBGLU20 from the Arabidopsis eFP Browser (http://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi) [37]. With the Pearson's correlation coefficient (PCC) value above 0.90, 183 probes (107 genes) and 25 genes were determined to be co-expressed with BrBGLU10 and AtBGLU20, respectively (Figure 4C, E; Tables S2-S3). BrBGLU10 and its co-expressers were upregulated at the fertile floral bud stage, and the highest expression level was detected at the F2 stage ( Figure 4C), suggesting that BrBGLU10 plays a role during pollen development, especially from the tetrad stage to that before the mature pollen stage. In Arabidopsis, flower and stamen development processes were divided into 14 stages and 12 stages, respectively [37][38][39]. AtBGLU20 and its co-expressers were represented by a high probe intensity (PI) value at flower stages (FS) 9 to 12, indicating that AtBGLU20 plays a role in Arabidopsis pollen development ( Figure 4E). We also conducted Gene Ontology (GO) enrichment analysis to provide more information on the function of BrBGLU10 and AtBGLU20 (Figure 4D,F). The results showed that genes involved in pollen exine formation and pollen wall assembly were highly over-represented among genes co-expressed with BrBGLU10 and AtBGLU20. Taken together, our analysis indicated that BrBGLU10 and AtBGLU20 may be required for pollen development in both B. rapa and Arabidopsis.
To validate AtBGLU20 function in the pollen development, we generated knockdown mutants of AtBGLU20 by introducing antisense constructs under the control of the CaMV35S promoter ( Figure 5A). After screening, four independent knockdown lines were obtained with expression levels ranging from 55% to 85% ( Figure 5B). However, the AtBGLU20 downregulated plants showed normal vegetative growth based on morphology ( Figure 5C), but produced defective pollen grains relative to the wild-type plants ( Figure 5D). These results indicated that normal pollen development in Arabidopsis requires sufficient amounts of AtBGLU20. All data obtained from gene expression, co-expression analysis, and transgenesis led to the conclusion that AtBGLU20 and BrBGLU10 may have indispensable functions in pollen development in both Arabidopsis and B. rapa, respectively.

Identification and Analysis of BrBGLUs
GH1 family genes play an important role in regulating abiotic and biotic stress responses, as well as various developmental processes in plants [9,12,14,18,23,40]. Based on the results of an increase in the number of whole genome sequences from a large number of species, genome-wide analysis of various gene families has been extensively performed. However, genome-wide identification and characterization of the GH1 gene family has only been reported in a few plant species, and there is no information on Brassica species, which are important crops for production of functional foods, as well as health-promoting compounds. In this study, the isolation of BrBGLUs from B. rapa genome (Figure 1), the distribution of BrBGLU genes on chromosomes (Figure 1), phylogenetic analysis (Figure 2), and exon-intron structures ( Figure 3) provides substantial information on the functions and roles of these genes.
Compared with the 49 AtBGLUs and 37 OsBGLUs in Arabidopsis and Rice, respectively [9], 64 BrBGLUs were isolated from the B. rapa genome, which is the largest number so far that has been reported in plants (Figure 1). The high number of BGLU family members in B. rapa could be related to the genome triplication event in this lineage [41]. To adapt different new functions that are suitable for changes in the environment, gene structure was commonly diversified during the evolution of multigene families [42]. For BGLUs, 13 exon-12 intron organization was considered as the ancestral gene structure, with the loss of certain introns leading to other gene structures [6]. The exons present in BrBGLUs varied from 2 to 13, and the most common organization was 11 exons (Figure 3). The introns in Arabidopsis vary from 0 to 13 [6]. This results suggested that little diversity exists in the gene structure of BrBGLUs when compared to AtBGLUs.
BrBGLUs may have originated from Arabidopsis, although duplication, gene loss, and functional diversification may have also occurred. This is supported by the fact that BGLUs from both species could be grouped into 10 subfamilies, with tandem arrays, as defined by Singh et al., 2013 [39], although some families were re-grouped or diverged into other subgroups. Figure 2 shows that AtBGLU subfamilies 8 and 9 [6] were incorporated into one B. rapa subfamily, GH1-c, and BrBGLU51 is composed of GH1-d with six AtBGLUs (AtBGLU34/35/36/37/38/39), indicating the loss of some BGLUs in B. rapa. This phenomenon may result from the rapid evolution of genes similar to that previously observed between Arabidopsis and rice [5]. One more interesting finding was that AtBGLU48 (SFR2) was incorporated into the GH1-j subgroup, with BrBGLU8 and BrBGLU42 (Figure 2). AtBGLU42 is a β-glucosidase, but it is divergent from all other AtBGLUs, and more similar to several β-glycosidases from thermophilic archea and bacteria [32]. SFR2 is involved in the lipid remodeling of the outer chloroplast membrane during freezing tolerance [43,44]. Because two BrBGLUs in the GH1-j subgroup had identities between 85% and 87% with AtBGLU2, Brassica genes may have a similar function of freezing tolerance as that in AtSFR2, although this requires further investigation.
On the basis of Arabidopsis study, most subfamilies of BGLUs in Figure 2 may be associated with specific functions: GH1-a for flavonoid and anthocyanin metabolism, GH1-e for flavonoid utilization, GH1-d for myrosinases, and GH1-i for scopolin hydrolysis. At least 12 genes are known to be involved in flavonoid metabolism in GH1-a: AtBGLU1-6 for flavonol accumulation [10,11], AtBGLU7-11 as anthocyanin glucosyltransferases [10][11][12], and AtBGLU15 for flavonol bisglycoside catabolism under abiotic stress [13]. AtBGLU12-17 in the GH1-e subgroup code for flavonoid-utilizing BGLUs in legumes [10]. An examination of the functions of BrBGLUs that are clustered with AtBGLUs in subgroups GH1a and GH1-e may provide information and understanding into the regulation of flavonoid biosynthesis in Brassica species.
Several subfamilies may be related to abiotic and biotic stress resistance, such as GH1-b, GH1-c, GH1-d, GH1-f, and GH1-i. Myrosinases hydrolyze glucosinolates into active forms that are involved in plant defense against herbivory and pathogens, and in human health promotion [45][46][47][48]. AtBGLU26 and AtBGLU34-39 function as myrosinases [14][15][16]. Except for AtBGLU26 (GH1-h), most genes belong to the GH1-d subgroup (Figure 2). Understanding myrosinase function in Brassicaceae, which is rich in glucosinolates, may provide an excellent strategy for breeding health-promoting Brassica crops [49,50]. ABA also functions in stress responses, including drought stress. AtBGLU18 [17] and AtBGLU33 [18] hydrolyze glucose-conjugated ABA, thereby increasing ABA levels and inducing ABA responses such as drought tolerance. However, these two proteins are separated into two subfamilies, implying the presence of more BGLUs for the regulation of ABA levels. Scopolin is one of the coumarins produced in roots [51], and it plays a role in the defense against pathogen attack and abiotic stresses [19,20]. Three β-glucosidases that hydrolyze scopolin and their encoding genes (AtBGLU21, 22 and 23) have been characterized [21,22]. The GH1-i subfamily includes these three genes and 11 BrBGLUs, which should be examined in relation to scopolin production. The GH1-b subfamily includes two monolignol glucoside hydrolases (AtBGLU45 and AtBGLU46) that control lignin content [23]. Because OsBGLU14, 16, and 18 are involved in lignin biosynthesis with monolignol β-glucosidase activity and compensate for the Arabidopsis bglu45 mutant [52], BrBGLUs in this subfamily may play similar roles. AtBGLU42 in GH1-c is involved in the induction of systemic resistance to bacterial disease, and the release of iron-mobilizing phenolic metabolites under iron deficiency [24]. Several genes in this subfamily would thus be expected to contribute to eliciting defense responses. All of this information may contribute to future research directions in relation to BrBGLUs.

The Potential Functions of BrBGLUs During Pollen Development
Previous studies on rice and other plant species have indicated that β-glucosidases play roles in pollen development [34,36,53]. To identify the BrBGLUs responsible for pollen development, the previously published microarray data relating to male sterility in B. rapa were re-annotated and re-analyzed. Among the 36 BrBGLUs, 12 BrBGLUs showed over a two-fold change between fertile and sterile floral buds (Table S1). However, six genes (four upregulated and two downregulated genes) were more extensively studied in terms of their role in pollen development. We selected one BrBGLU10 for investigation, the homolog of AtBGLU20, which showed hundreds-fold changes in its expression.
We examined the expression levels of BrBGLU10/AtBGLU20 and analyzed the co-expressed genes in both B. rapa and Arabidopsis (Figure 4). An assessment of expression levels strongly suggests that BrBGLU10/AtBGLU20 are involved in pollen development. The cellular contents from the degeneration of the tapetum supports pollen wall formation and subsequent pollen release [39]. Mutations in polysaccharide metabolism-related genes lead to defective pollen wall formation [26]. Glycoside hydrolase has been reported to be involved in the cell wall polysaccharide degradation [27]. The expression patterns of BrBGLU10 and AtBGLU20 suggest that they might play a role from the tetrad stage to mature pollen grains ( Figure 4A, B), which corresponding to the tapetum degradation stage [29,39]. Co-expression analysis is a valuable approach for classifying and visualizing transcriptomic data to identify genes with similar cellular functions and regulatory pathways [54][55][56], although this is not always the case [57,58]. In plants, co-expression analysis under various experimental conditions has been used for predicting gene function [55,59]. Figure 4C,D shows that this gene possibly regulates pollen development. In particular, GO annotation of co-expressed genes reflects that pollen wall and exine formation are influenced by BrBGLU10/AtBGLU20, indicating that the hydrolysis of glucose moieties is necessary for proper pollen development.
Because BrBGLU10 had a high sequence identity with AtBGLU20 (87% at the nucleotide level and 84% at the amino acid sequence level), both genes may thus have similar functions. Therefore, knocking down AtBGLU20 may provide direct evidence for its function in pollen development. Figure 5 shows that the suppression of AtBGLU20 expression had no effect on plant growth and development, although this aborted pollen production. This result implies that BrBGLU10/AtBGLU20 are critical to pollen grain development.

Plant Materials and Growth Conditions
Seeds of B. rapa subsp. pekinensis (Chiifu) were germinated in Petri dishes in the dark at 23 • C for two days, then the germinated seeds were transferred to a 4 • C growth chamber with 16 h of light for 25 days to induce vernalization. After vernalization, the seedlings were transplanted into 15 cm × 15 cm × 18 cm pots containing potting soil and grown in a 23 • C growth chamber with 16 h of light. The floral buds were collected from 10 plants with three biological replicates, as previously described [29], and stored at −70 • C until use. Root and shoot tissues were collected from three-week-old seedlings without vernalization. Stem tissue was sampled from the plants one week after bolting.
A. thaliana (L.) Heynh var. Columbia (Col-0) plants were grown under 140 µmol/m 2 /s light intensity at 23 ± 1 • C with a long day cycle with 16 h of light for plant transformation. Seeds were sown in 55 mm × 55 mm pots in potting soil, stratified for three days at 4 • C, and then transferred to the growth room. The plants were then kept under a transparent polythene lid for one week to increase humidity and support equal germination. The plate-cultured seeds were sterilized with 30% bleach and 0.1% Triton X-100 (Sigma, St. Louis, MO, USA), stratified for three days at 4 • C, and sown in Petri dishes with dimensions of 100 mm × 100 mm × 20 mm. The dishes contained half-strength MS media (Duchefa Biochemie, Netherlands) supplemented with 0.8% phytoagar and 1% sucrose.

Antisense Constructs and Plant Transformation
The full-length coding sequence of AtBGLU20 was cloned from first-strand complementary DNA (cDNA), using the primers BGLU20F (Table S4). Then, the fragments were inserted into T&A cloning vectors (RBC T&A cloning kit, Real Biotech Corporation, Taiwan). After confirmation of the AtBGLU20 sequence in the T&A vector by sequencing, the fragment was cloned into pCambina 3300-35S binary vectors and used in plant transformation. Col-0 were used for transformation with Agrobacterium tumefaciens GV3101 carrying the above binary plasmid using the floral dip method [60]. The transformants were selected on plates containing 25 mg/mL glufosinate in MS medium (Sigma, St. Louis, MO, USA), and also confirmed by genomic DNA PCR analysis.

Reverse Transcription PCR and qRT-PCR
Total RNA (1 µg) from each sample was used in reverse transcription. First-strand cDNA was synthesized with a PrimeScript™ RT reagent kit with a gDNA Eraser kit (TaKaRa, Japan). The concentration of the synthesized cDNA was determined, and the cDNA was diluted to 20 ng/µL for PCR analysis. Semi-RT-PCR was performed, which consisted of denaturation at 94 • C 5 min; followed by 25 cycles of 94 • C for 30 s, 55 • C for 30 s, and 72 • C for 60 s. The qRT-PCR conditions were pre-denaturation at 95 • C for 30 s; followed by 30 cycles of 95 • C for 5 s, 60 • C for 20 s, and 72 • C for 15 s. All primer sequences used in this study are listed in Table S4. The semi-RT-PCR products were separated on 1.5% agarose gels, and stained with ethidium bromide. The qRT-PCR results were analyzed using the 2 −∆∆C T method, with three biological replicates.

Pollen Viability
For pollen viability and pollen developmental progression, flowers collected from Col-0 and AtBGLU20 antisense transgenic plants were fixed in Carnoy's fixative (6:3:1 alcohol:chloroform:acetic acid) for 2 h. Then, the anthers were detected and stained with a solution of Malachite green, acid fuchsin, and Orange G for approximately 12 h, as previously described [61].

Identification of BrBGLUs and Phylogenetic Tree Construction
The protein sequence of 48 BGLU members were downloaded from TAIR (http://www. arabidopsis.org/tools/bulk/sequences/index.jsp) [6]. All putative protein sequences of B. rapa (version 3.0) were downloaded from BRAD (http://brassicadb.org/brad/index.php) [35] and used as queries to search against the Hidden Markov Model (HMM) profile (Version 3.1b2) with the Pfam HMM library (Pfam 32.0) [62]. A total of 64 protein sequences with PF00232 (E value below 1E −5 ) were obtained, and these sequences were considered as BrBGLUs candidates and used for further analysis. Multiple sequence alignment of full-length BGLU proteins and phylogenetic tree construction were conducted using ClustalX2 [31]. The phylogenetic tree was generated by the MEGA6 program, using the neighbor-joining method with the 'pairwise deletion' option and 'Poisson correction' model, with a bootstrap test of 1000 replications [63].

Chromosomal Location, Nomenclature, and Gene Duplication of BrBGLUs
The position of each BrBGLU on B. rapa chromosomes was identified from BRAD (http://brassicadb.org/brad/index.php). For nomenclature, the 'Br' for B. rapa was added, followed by BGLU, and numbered according to its position from top to bottom on B. rapa chromosomes 1-10. MCScanX software was used to search potentially duplicated BrBGLUs [64]. All of the putative protein sequences of B. rapa (version 3.0) were compared with themselves by BLASTP, with a tabular format and an e-value of < 10 −5 . Then, tandem, segmental, and dispersed duplications were identified using MCScanX, using default criteria.

Co-Expression and Gene Ontology Enrichment Analysis
AtBGLU20 was used as bait gene for genome-wide co-expression analysis to identify genes of similar function from Expression Angler [65]. BrBGLU10 was represented by two EST probes Brapa_ESTC004210 and Brapa_ESTC007739, which were used as bait for co-expression analysis. A cutoff threshold of 0.90 for the Pearson correlation coefficient was used. The expression pattern analysis was performed using the Arabidopsis eFP browser (http://bar.utoronto.ca/efp/cgi-bin/efpWeb. cgi) [37]. Clustering analysis for categorization was performed with the TIGR Multi-Experiment Viewer (http://www.tm4.org/mev.html). GO enrichment analysis was performed using agriGO (http://bioinfo.cau.edu.cn/agriGO/index.php) [66].

Microarray Analysis
To analyze the gene expression patterns of BrBGLUs in B. rapa during pollen development, the previously published microarray data relating to male sterility analysis were downloaded from NCBI's Gene Expression Omnibus (GSE47665) [29]. The microarray data were re-annotated using BLASTX by comparing with the newly improved B. rapa reference genome sequence (version 3.0) [35].

Conclusions
In conclusion, 64 BrBGLUs have been identified in B. rapa genome, which were classified into 10 subgroups with Arabidopsis counterparts, and the GH1-i subgroup included putative pollen development-related BrBGLU10. Base on its known function in Arabidopsis, BrBGLUs may participate in various defense responses against biotic and abiotic stresses, flavonoid metabolism, and pollen development. This study has provided valuable information for a better understanding of BGLUs, and for their biotechnological application to crops.