Genome-Wide Analysis of the UGT Gene Family and Identification of Flavonoids in Broussonetia papyrifera

Broussonetia papyrifera is a multifunctional deciduous tree that serves as both a food and a source of traditional Chinese medicine for humans and animals. Further analysis of the UGT gene family is of great significance to the utilization of B. papyrifera. The substrates of plant UGTs include highly diverse and complex chemicals, such as flavonoids and terpenes. To deepen our understanding of this family, a comprehensive analysis was performed. Phylogenetic analysis showed that 155 BpUGTs were divided into 15 subgroups. Conserved motif analysis showed that BpUGT proteins in the same subgroup possessed similar motif structures. Tandem duplication was the primary driving force for the expansion of the BpUGT gene family. Global promoter analysis indicated that BpUGTs were associated with complex hormone regulatory networks and the stress response, as well as the synthesis of secondary metabolites. Expression pattern analysis showed that the expression level of BpUGTs in leaves and roots was higher than that in fruits and stems. Next, we determined the composition and content of flavonoids, the main products of the BpUGT reaction. A total of 19 compounds were isolated and analyzed by UPLC-ESI-MS/MS in three Broussonetia species (B. kazinoki, B. papyrifera, and B. kazinoki × B. papyrifera), and the number of compounds differed among the three. The total flavonoid content and antioxidant capacities of the three species were also analyzed. All assays exhibited the same trend: the hybrid paper mulberry showed a higher total flavonoid content, a higher total phenol content and higher antioxidant activity than the other two species. Overall, our study provides valuable information for understanding the function of BpUGTs in the biosynthesis of flavonoids.


Introduction
Flavonoids, a type of polyphenolic compound with a C6-C3-C6 double aromatic ring, have been extracted from the leaves, skins, roots and fruits of many plants [1]. Modern pharmacological studies have shown that flavonoids are beneficial in drug development and health care, with anti-cancer [2], antioxidant [3], anti-inflammatory [4], antibacterial [5], anti-allergic [6], anti-tumor [7] and other pharmacological activities. In plants, flavonoids exist in various modified forms generated by hydroxylation, methylation, acylation, and glycosylation, among which glycosylated flavonoids are the most common natural compounds. Most natural flavonoids are C-glycosides and O-glycosides, and the most abundant flavonoid glycosides in plants are flavone glycosides [8]. The most frequently reported flavonoid O-glycosides are 7- and 3-O-glycosides, and the flavonoid C-glycosides are found mainly as 6- and 8-C-glycosides. Although flavonoid C-glycosides are less well known than flavonoid O-glycosides, they exhibit a wide range of benefits for human health [9]. Glycosylation is mainly catalyzed by glycosyltransferases (GTs), which are classified into 111 families (http://www.cazy.org/GlycosylTransferases.html) (accessed on 5 January 2021). Both flavonoid C-glycosides and O-glycosides are catalyzed
leaves, which were located in the same place, were rapidly frozen in liquid nitrogen and were dried with a vacuum centrifuge concentrator (CV100-DNA, Baijiu, Beijing, China). The dried experimental materials were stored at −20 °C before analysis. All concentrations used in this study were calculated by dry weight (DW). Three independent biological replicates were performed for each plant.

Identification of UGT Family Genes in Paper Mulberry Genome
To identify the UGT family genes in the paper mulberry genome, two approaches were used. Firstly, the 122 known UGT protein sequences of A. thaliana were downloaded from the TAIR database v10.0 (https://www.arabidopsis.org/) (accessed on 5 January 2021) and were used as queries to search the UGT protein database by using a local BLASTP program [21]. Secondly, the Hidden Markov Model (HMM) seed file of the UGT domain (PF00201, http://pfam.xfam.org/) (accessed on 5 January 2021) was obtained, and the HMM profile was used as a query to search the local paper mulberry protein sequence database using HMMER 3.0 (http://www.hmmer.org/) (accessed on 5 January 2021). The results of both methods were combined and redundant sequences were removed; the remaining sequences were taken as the candidate UGT protein sequences of paper mulberry. The candidate protein sequences were further verified by scanning against SMART (http://smart.embl-heidelberg.de/) (accessed on 7 January 2021), PFAM, and CDD Search (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) (accessed on 7 January 2021) to confirm the presence of the UGT domain in each candidate sequence.
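The combination step described above (pool the BLASTP and HMM hits, drop redundant sequences, then keep only candidates with a confirmed UGT domain) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and all sequence identifiers are hypothetical, and the domain-confirmed set is assumed to come from a separate SMART/Pfam/CDD scan.

```python
def merge_candidates(blastp_hits, hmmer_hits):
    """Pool two hit lists and drop duplicates, preserving first-seen order."""
    seen = set()
    merged = []
    for seq_id in list(blastp_hits) + list(hmmer_hits):
        if seq_id not in seen:
            seen.add(seq_id)
            merged.append(seq_id)
    return merged


def keep_with_domain(candidates, domain_confirmed):
    """Keep only candidates whose UGT domain was confirmed by a domain scan."""
    return [seq_id for seq_id in candidates if seq_id in domain_confirmed]


# Hypothetical identifiers, for illustration only (not real gene models).
blastp_hits = ["Bp01g0001", "Bp02g0042", "Bp04g0107"]
hmmer_hits = ["Bp02g0042", "Bp04g0107", "Bp09g0310"]
domain_confirmed = {"Bp01g0001", "Bp02g0042", "Bp09g0310"}

candidates = merge_candidates(blastp_hits, hmmer_hits)
verified = keep_with_domain(candidates, domain_confirmed)
print(candidates)
print(verified)
```

Preserving first-seen order is a design choice made here so that the output is reproducible; a plain set union would give the same members.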

Phylogenetic, Gene Structure and Conserved Motif Analysis
For phylogenetic analysis, the full-length amino acid sequences of UGT proteins from Arabidopsis and B. papyrifera were aligned using MEGA X software with default parameters. An unrooted phylogenetic tree was constructed by the maximum likelihood (ML) method based on the JTT amino acid substitution model with 1,000 bootstrap replicates. The iTOL (http://itol.embl.de/help.cgi) (accessed on 10 January 2021) online tool was used to illustrate and edit the phylogenetic tree. The gene structure of the BpUGT genes was examined with the online tool GSDS 2.0 (http://gsds.cbi.pku.edu.cn) (accessed on 13 January 2021). The MEME Suite web server 5.1.1 (Multiple Expectation Maximization for Motif Elicitation) (http://meme-suite.org/tools/meme) (accessed on 16 January 2021) was used to analyze the conserved motifs of the BpUGT sequences with default parameters, except that the number of motif repetitions per sequence was set to 0 or 1 and the maximum number of motifs was set to 20.

Chromosomal Location, Gene Duplication Events and Promoter Cis-Regulatory Analysis
The chromosomal location information of all the BpUGT genes was extracted from the paper mulberry genome database and visualized with the graphical tool MapInspect version 1.0 (https://mapinspect.software.informer.com/) (accessed on 3 February 2021). To identify gene duplications, all the BpUGT amino acid sequences were searched against the paper mulberry genome using BLASTP with an e-value cutoff of 1 × 10⁻⁵. To identify the cis-elements in the promoter sequences of the UGT family genes in paper mulberry, the 2,000 bp genomic sequences upstream of the transcription start site were analyzed using the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) (accessed on 3 February 2021).
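Extracting the 2,000 bp upstream region depends on strand: for a gene on the minus strand, "upstream" lies at higher reference coordinates and the sequence must be reverse-complemented. The sketch below illustrates this under stated assumptions: coordinates are 1-based and inclusive, `tss` is the reference coordinate of the transcription start site, and the function names are hypothetical rather than part of any tool named above.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_interval(tss, strand, chrom_length, upstream=2000):
    """1-based inclusive (start, end) of the region upstream of the TSS.

    Returns None when no upstream bases exist (TSS at the chromosome edge).
    """
    if strand == "+":
        start, end = max(1, tss - upstream), tss - 1
    elif strand == "-":
        start, end = tss + 1, min(chrom_length, tss + upstream)
    else:
        raise ValueError("strand must be '+' or '-'")
    return (start, end) if start <= end else None


def extract_promoter(chrom_seq, tss, strand, upstream=2000):
    """Promoter sequence read 5'->3' relative to the gene."""
    interval = promoter_interval(tss, strand, len(chrom_seq), upstream)
    if interval is None:
        return ""
    start, end = interval
    region = chrom_seq[start - 1:end]  # convert to 0-based slicing
    return region if strand == "+" else reverse_complement(region)


# Toy 10-bp "chromosome" with a 3-bp upstream window, for illustration only.
toy = "AACCGGTTAC"
print(extract_promoter(toy, tss=5, strand="+", upstream=3))
print(extract_promoter(toy, tss=5, strand="-", upstream=3))
```

Truncating the window at the chromosome ends, rather than raising an error, is an assumption made here so that genes near sequence boundaries still yield a (shorter) promoter.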

Expression Analysis of BpUGT Genes in Different Tissues
The expression profiles of the BpUGT genes in ten tissues (fruit, shoot apex, young leaf, developing leaf, mature leaf, immature stem, phloem of proximal stem, phloem of mature stem, phloem of root, and root tip) were examined using transcriptome data. Each tissue had three biological replicates. Transcript abundance was represented by fragments per kilobase of transcript per million fragments mapped (FPKM) values, which were calculated from RNA-seq reads. The expression values of each gene in the different tissues were averaged and presented as log2 values. The results were presented as heatmaps using the TBtools software.
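The averaging and log2 transformation described above can be sketched as follows. The text does not say how zero FPKM values were handled before taking the logarithm, so the pseudocount of 1 used here is an assumption, as are the function name and the example values.

```python
import math


def log2_mean_fpkm(replicates, pseudocount=1.0):
    """Average replicate FPKM values, then return log2(mean + pseudocount).

    The pseudocount keeps the logarithm defined for undetected genes;
    its value (1.0) is an assumption, not taken from the study.
    """
    if not replicates:
        raise ValueError("at least one replicate is required")
    mean = sum(replicates) / len(replicates)
    return math.log2(mean + pseudocount)


# Made-up FPKM triplicates for one gene in two tissues (illustration only).
print(log2_mean_fpkm([3.0, 3.0, 3.0]))
print(log2_mean_fpkm([0.0, 0.0, 0.0]))
```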

Sample Extraction
The flavonoids were extracted according to the method described by Chen et al. [22] with minor modifications. The lyophilized leaf powder (20 mg) was transferred into 2 mL centrifuge tubes containing 1 mL of methanol-acetic acid (0.1%). The mixtures were shaken with a vortex mixer (Kylin-Bell, Haimen, China) for 30 s, sonicated in a KQ-50B ultrasonic cleaner (Ultrasonic instruments, Kunshan, China) at room temperature for 1 h and then centrifuged in a SIGMA 2K15 (Sigma Centrifuges, Germany) at 12,000 rpm for 10 min. The supernatants were collected in new 2 mL centrifuge tubes. The extracts were filtered through Millipore membrane filters (0.22 µm) prior to UPLC-DAD and UPLC-MS/MS analysis.

UPLC-PDA and UPLC-MS/MS System and Conditions
The samples were examined with ultra-performance liquid chromatography (UPLC, Waters, Milford, MA, USA) coupled to a triple quadrupole mass spectrometer (XEVO®-TQ) with electrospray ionization (ESI). The separation was carried out on a ZORBAX Eclipse Plus C18 column (150 mm × 3.0 mm) with a particle size of 1.8 µm (Agilent Technologies, Santa Clara, CA, USA) at 40 °C. The mobile phases were 0.1% formic acid (A) and acetonitrile (B), with the following gradient: 0-1 min (5% B), 1-8 min (5-40% B), 8-12 min (40-95% B), 16-17 min (95-100% B), 17-21 min (100% B), 21-25 min (5% B). The system was operated in positive ion ESI mode with a flow rate of 1.0 mL/min. Chromatograms were acquired at 350 nm and photodiode array spectra were recorded from 200 to 800 nm. The UPLC-MS/MS analysis of flavonoids was performed using a XEVO® TQ-MS triple quadrupole mass spectrometer (Waters, Milford, MA, USA), which was connected to an Ultra Performance Liquid Chromatograph (UPLC-MS/MS, Waters). The UPLC separation conditions were the same as mentioned above. The flavonoids were analyzed in positive ion (PI) mode and the MS detection conditions were as follows: capillary voltage, 3.00 kV; cone voltage, 30 V; desolvation gas flow, 650 L/h; cone gas flow, 50 L/h; collision gas flow, 0.12 mL/min; collision energy, 15 eV; desolvation temperature, 300 °C; source temperature, 150 °C; scan range, m/z 100-1000. The injection volume of each sample was 1 µL. Analytical software (MassLynx™, V 4.1, SCN 846, Waters Corp., Manchester, UK) was used for system control and data processing. For data acquisition, each MS scan and UV detection was acquired three times.

Quantitative and Qualitative Analysis of Flavonoids
The compounds were identified by comparing their retention times, the UV-vis spectral characteristics of their peaks and their mass spectrometric information with those of the standards, using Mass Hunter qualitative software. Quercetin 3-rutinoside (rutin) and trans-5-caffeoylquinic acid were used as the standards for the semi-quantification of all flavonoids and chlorogenic acids, respectively. The linear regression equations were as follows: for rutin, Y = 0.0041X + 0.0032 (r² = 0.9991); for chlorogenic acid, Y = 0.0021X + 0.0058 (r² = 0.9982). The concentrations of phenolic compounds were presented as micrograms of the corresponding standard per 1 g of DW. All analyses were performed with three biological replicates.
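Semi-quantification with these calibration lines amounts to solving Y = aX + b for X. The sketch below uses the slopes and intercepts reported above; the function name is hypothetical, and because the text does not state the units of X (standard amount) or Y (detector response), the example treats both as unitless.

```python
# (slope, intercept) pairs taken from the regression equations in the text.
RUTIN = (0.0041, 0.0032)
CHLOROGENIC_ACID = (0.0021, 0.0058)


def invert_calibration(response, slope, intercept):
    """Solve response = slope * x + intercept for x."""
    if slope == 0:
        raise ValueError("slope must be non-zero")
    return (response - intercept) / slope


# Round-trip check with an arbitrary x of 10 (illustrative, not measured data).
response = RUTIN[0] * 10 + RUTIN[1]
print(invert_calibration(response, *RUTIN))
```

Converting the resulting X to micrograms per gram DW would additionally require the extract volume and sample mass (1 mL and 20 mg in the extraction protocol), applied according to the units of the calibration, which are not given here.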

Total Polyphenol and Flavonoid Contents
The total polyphenol content was determined by means of the Folin-Ciocalteu reaction as previously reported [23]. A sample solution (100 µL, 5 mg/mL) was added to 100 µL of the Folin-Ciocalteu reagent and then mixed with 1000 µL of double distilled water and 200 µL of 20% sodium carbonate. The reaction solution was incubated at room temperature for 1 h and the absorbance of the sample mixture was recorded at 760 nm. To obtain the total phenolic content, a calibration curve (Y = 2.853X + 0.0072, r² = 0.9986) was established with gallic acid (GA) as the standard. The results were expressed as mg gallic acid equivalents (GAE)/100 g dry weight (mg GAE/100 g DW). To measure the flavonoids in the leaf extracts, we used the spectrophotometric method as reported [23]. The reaction system included 100 µL of sample, 1000 µL of double distilled water and 500 µL of AlCl3 (2%, v/v) reagent. The sample absorbance was read at 430 nm after the mixture was kept for 15 min at room temperature. To obtain the total flavonoid content, a calibration curve (Y = 1.1748X + 0.0066, r² = 0.9993) was established with rutin as the reference, and the results were expressed as mg rutin equivalents (RE)/100 g dry weight (mg RE/100 g DW).

Antioxidant Capacity Analysis
The antioxidant capacity was determined using the free radical 1,1-diphenyl-2-picrylhydrazyl (DPPH), ferric reducing ability of plasma (FRAP), and 2,2'-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS) assays. The specific operations were as follows. The DPPH assay was performed according to the reported protocols [24,25]. DPPH (0.2 mM) was dissolved in methanol. The sample (100 µL) was mixed with DPPH (2900 µL) and incubated in the dark at room temperature. After 30 min, the absorbance was measured at 515 nm. GA was used as an authentic standard and the calibration curve was established by plotting the DPPH· scavenging ratio (A0 − At)/A0 (where A0 and At are the absorbance at the initiation and termination of the reaction, respectively) against the GA concentration. The linear regression equation was Y (scavenging ratio) = 13.554X + 0.0184 (r² = 0.9992). The antioxidant capacity was expressed as GAE and data were presented as milligrams of GAE per 100 g DW.
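The DPPH calculation above has two steps: compute the scavenging ratio (A0 − At)/A0, then read the GA-equivalent concentration off the calibration line. A minimal sketch, assuming hypothetical function names and made-up absorbance readings; the units of X in the regression are not stated in the text, so the result is left in calibration units.

```python
def scavenging_ratio(a0, at):
    """DPPH scavenging ratio (A0 - At) / A0."""
    if a0 <= 0:
        raise ValueError("initial absorbance must be positive")
    return (a0 - at) / a0


def gae_from_ratio(ratio, slope=13.554, intercept=0.0184):
    """Invert Y = 13.554 X + 0.0184 (coefficients from the text) for X."""
    return (ratio - intercept) / slope


# Made-up absorbances at 515 nm, for illustration only.
ratio = scavenging_ratio(a0=1.0, at=0.5)
print(ratio)
print(gae_from_ratio(ratio))
```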
The FRAP assay was performed based on a previously described method [23]. A total of 60 mL of 10 mM TPTZ solution (dissolved in 40 mM hydrochloric acid), 60 mL of 20 mM ferric chloride solution, and 600 mL of acetate buffer solution (pH 3.6) were used to prepare the FRAP working solution. The sample (100 µL) was mixed with 2900 µL of the FRAP working solution and reacted in the dark at room temperature for 15 min. Then the absorbance at 593 nm was measured. Similarly, gallic acid (GA) was used as a standard and the calibration curve was Y (absorbance) = 25.522X − 0.0158 (r² = 0.9999). Data were presented as milligrams of GAE per 100 g DW.
The ABTS assay was carried out based on the method described in [26]. The ABTS working solution contained 5 mL of 7 mM ABTS solution, 1 mL of 140 mM potassium persulfate solution, and 380 mL of the ethanol buffer solution. The sample extract and GA (100 µL) were added to 2900 µL of the ABTS working solution and incubated at room temperature. After 6 min, the absorbance at 414 nm was recorded using a UV spectrophotometer. GA was used as an authentic standard and the calibration curve was Y (absorbance) = 24.483X − 0.0059 (r² = 0.9995). GAE was used to describe the results.

Statistical Analysis
Data were obtained from three independent biological replicates. Significance was determined via one-way analysis of variance (ANOVA), and differences were considered significant at p < 0.05.
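For readers unfamiliar with one-way ANOVA, the F statistic compares between-group and within-group variance. The sketch below computes it in plain Python for three groups of three replicates, mirroring the design described above; the function name and the replicate values are made up for illustration and are not data from this study.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and some within-group replication")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times squared mean deviation.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: squared deviation from each group's mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up triplicates for three groups (illustration only).
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)
```

With three groups of three replicates the degrees of freedom are (2, 6), for which the tabulated 5% critical value of F is approximately 5.14; an F statistic above that corresponds to p < 0.05. A statistics package would normally be used to obtain the exact p-value.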

Identification and Phylogenetic Analysis of BpUGTs
A total of 155 putative BpUGTs were identified in the paper mulberry genome using the HMM and BLASTP programs. The length of the deduced UGT proteins varied from 238 to 874 amino acids, with an average of 475. The predicted molecular weight ranged from 26 to 97 kDa. The isoelectric point ranged from 5.03 to 8.7 (Supplementary Materials Table S1). A phylogenetic analysis of the identified UGTs was performed to analyze their grouping pattern and their genetic relationships based on the Arabidopsis UGT sequences. The paper mulberry UGTs were classified into 15 subgroups (A-O), including one newly discovered group (group O), which was present in B. papyrifera but not in A. thaliana (Figure 1). The number of UGTs in each group varied: the largest group, E, had 46 members and the smallest group, N, had only one member. The new group O identified in our study contained two UGT members.

Chromosome Location, Gene Duplication Events, Gene Structure and Conserved Motif Analysis of Paper Mulberry UGTs
To obtain detailed information on the UGT genes in paper mulberry, we investigated their chromosomal locations according to the gene annotation files retrieved from public genomic databases. After curation, 145 UGT genes in B. papyrifera were located on chromosomes, representing 93.5% of the total UGT genes in paper mulberry (Figure 2). The remaining 10 were mapped to scaffolds. Chromosomes chr04 and chr09 contained the greatest number of UGT genes (22 members), whereas chr05 contained the fewest, with only two. The UGT genes were thus unevenly distributed across the paper mulberry genome. The UGT genes of group E, the group with the most members (46 genes), were randomly distributed across 9 chromosomes (chromosomes 1 and 3-10), and the remaining three members were located on scaffolds. The distribution of the UGT genes within each chromosome was also uneven, and most were clustered, suggesting that gene duplication may have occurred during evolution. Gene duplications are considered to play an important role in the expansion and evolution of gene families, so a duplication event analysis of the BpUGT genes was performed. The results indicated that the 155 BpUGTs were involved in 7 segmental duplication events and 28 tandem duplication events, which suggested that tandem duplications may be the primary driving force for the expansion of the BpUGT family (Figure S1, Table S2). The phylogenetic groups G, E, L, F, A and H possessed the largest numbers of tandem duplicated UGTs (8, 5, 5, 3, 2 and 2, respectively), while J, M and I possessed only one each. Groups B, D, K, N and O did not have any tandem duplicated UGTs.
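The counts in this paragraph can be cross-checked with simple arithmetic: the per-group tandem tallies should sum to the 28 reported, and 145 of 155 genes should give the stated 93.5%. The short sketch below performs that check using only numbers from the text; the variable names are illustrative.

```python
# Tandem-duplicated UGTs per phylogenetic group, as listed in the text.
tandem_by_group = {"G": 8, "E": 5, "L": 5, "F": 3, "A": 2, "H": 2,
                   "J": 1, "M": 1, "I": 1}

total_ugts = 155
on_chromosomes = 145

total_tandem = sum(tandem_by_group.values())
pct_on_chromosomes = round(100 * on_chromosomes / total_ugts, 1)
pct_tandem = round(100 * total_tandem / total_ugts, 2)

print(total_tandem)         # 28
print(len(tandem_by_group)) # 9 groups with tandem duplicates
print(pct_on_chromosomes)   # 93.5
print(pct_tandem)           # 18.06
```

The last value matches the 18.06% share of tandem-duplicated genes quoted in the Discussion.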
To better explore the relationships between the structure and the function of paper mulberry UGT genes, and to further clarify the evolutionary relationships within the UGT gene family, the exon/intron structure was analyzed. Among the 155 BpUGTs, 59 had no introns, and 63 contained only one intron (Figure S2C). Within the same subgroup, most members shared similar exon/intron numbers and arrangements, and they had important sequence characteristics, indicating that they had very close evolutionary relationships (Figure S2A). Conserved motifs play an important role in the characterization and classification of a gene family. The results showed that a total of 20 motifs were identified among the BpUGT proteins (Figure S2B).

Cis-Regulatory Analysis on Promoters and Expression Analysis of BpUGT Genes
Cis-acting regulatory elements play an important role in the regulation of gene transcription initiation through interactions with their corresponding trans-regulatory factors, especially in the synthesis of secondary metabolites. To obtain more information, we analyzed the promoter regions of each putative BpUGT gene. Our results showed that the identified cis-regulatory elements could be classified into four functional categories: light response, hormone response, development regulation and stress response. Overall, these findings suggested that the UGT gene family in paper mulberry plays a vital part in the complex hormone regulatory network and may be involved in a variety of stress responses, as well as the synthesis of secondary metabolites, which helps in exploring the regulatory mechanisms of the BpUGT family (Table S3). To obtain a broader understanding of the potential functions of BpUGTs, their expression patterns were analyzed using RNA-seq data from ten tissues (fruit, shoot apex, young leaf, developing leaf, mature leaf, immature stem, phloem of proximal stem, phloem of mature stem, phloem of root and root tip). FPKM values were used to evaluate the gene expression level. BpUGT genes exhibited different expression levels in different tissues. A total of 65 genes showed low expression levels and 80 genes showed high expression levels. The other 10 genes were not detected in any of the examined tissues. Most of the highly expressed BpUGT genes were expressed more in the leaves and roots than in the fruits and stems (Figure 3, Table S4).
Additionally, many of the BpUGT genes showed their highest transcript levels in leaves, indicating that UGTs may play an important role in the biosynthesis of glycosylated secondary metabolites.

Qualitative Analysis of Flavonoid Compounds
The UGT genes catalyze the glycosylation of most flavonoids. To identify the flavonoids and their derivatives, we employed UPLC-Q-TOF-MS in positive ion mode to analyze the leaf extracts, and the compounds were obtained under excellent chromatographic conditions with good peak separation and resolution (Figure 4).
By combining UV absorption maxima, retention time and mass spectra, a total of 19 compounds were definitely or tentatively identified from the leaf samples of Broussonetia. The numbers of compounds isolated from B. papyrifera, B. kazinoki and hybrid paper mulberry were 18, 13 and 19, respectively. The compounds found in each of the studied samples are shown in Table 1. The 19 compounds were identified based on the UPLC-ESI-MS/MS analyses and by comparison with data from standards or the literature. All extracts were analyzed in positive ion mode (m/z, [M + H]+) using UPLC-ESI-MS/MS. These compounds could be divided into two main groups: phenolic acids and flavonoids (apigenin derivatives and luteolin derivatives). The first and second compounds (peaks 1 and 2) showed the same [M + H]+ parent ion at m/z 355 and maximum absorbance at 325 nm in the UV spectra. On the basis of these characteristics, they were considered to be isomers of chlorogenic acid [27]. Peaks 1 and 2 were then identified as neochlorogenic acid and chlorogenic acid, respectively, by co-chromatography with the corresponding standards.
According to the UPLC-ESI-MS/MS analyses and comparison with standards and literature data, the flavonoids were considered to be derivatives of two flavones, apigenin and luteolin. Usually, the sugars attached to glycosylated flavonoids are mainly pentoses (arabinose and xylose) and hexoses (glucose, galactose and rhamnose) [28]. There are two kinds of connections between sugars and flavonoids when they are glycosylated, namely, O-C and C-C connections [29]. The former is relatively common, while the latter generally occurs in specific plant groups, and most of the glycosides are located at the C6 and/or C8 positions [30]. Based on the molecular ions at m/z 287 [M]+ and 271 [M]+ in PI mode, peaks 18 and 19 were identified as luteolin and apigenin through their UV spectra and by co-elution with the corresponding standards. Peaks 8 and 10 were tentatively assigned as apigenin isomers, namely 6-C-pentosyl-8-C-glucosyl apigenin or 6-C-glucosyl-8-C-pentosyl apigenin, because they showed the same fragment ions at m/z 565 [M + H]+ and 433 [M + H]+. Peaks 8 and 10 were then confirmed as 6-C-glucosyl-8-C-arabinosyl apigenin (schaftoside) and 6-C-arabinosyl-8-C-glucosyl apigenin (isoschaftoside) by co-elution with their corresponding standards under the same conditions. Mass spectrometry data for the isomers in peaks 9 and 12 showed that there were two substituents: one molecule of glucose and one of rhamnose linked to the aglycone apigenin. A previous study has demonstrated that 6-C-hexosyl isomers are eluted earlier than the 8-C-hexosyl isomers [24]. Therefore, peaks 9 and 12 were characterized as 6-C-rhamnosyl-8-C-glucosyl apigenin (Ap-6-C-Rha-8-C-Glc) and 6-C-glucosyl-8-C-rhamnosyl apigenin (Ap-6-C-Glc-8-C-Rha), which have been reported in B. papyrifera leaves [31]. Peaks 7 and 5 were characterized as luteolin hexosides because they exhibited the same UV absorption wavelength and parent ions.
Finally, peaks 7, 5 and 11 were identified as luteolin 8-C-β-D-glucopyranoside (orientin), luteolin 6-C-β-D-glucopyranoside (isoorientin) and apigenin 8-C-β-D-glucopyranoside (vitexin), respectively, and they were further confirmed by co-elution with their corresponding standards. Orientin, isoorientin, and vitexin were previously separated from mulberry leaves [31].

Quantification of Flavonoids and Analysis Antioxidant Capacity in Sample Leaves
To further quantify the content of flavonoids, we carried out three quantitative analyses of the flavonoid content in the Broussonetia leaves under the same experimental conditions (Table S5). In hybrid paper mulberry, the contents of chlorogenic acid and neochlorogenic acid were the highest among the 19 chemical compounds analyzed. Regarding the flavones, the luteolin content of the leaves was 27.4 ± 2.4 µg/g in hybrid paper mulberry, while it was 2.6 ± 0.1 and 3.4 ± 0.4 µg/g in B. kazinoki and B. papyrifera, respectively. Furthermore, the apigenin content was 9.2 ± 0.6 µg/g in hybrid paper mulberry, the highest among the leaves of the three Broussonetia species; the apigenin content was 2.2 ± 0.2 µg/g in B. kazinoki, and apigenin was not found in B. papyrifera. These results suggested that the hybrid paper mulberry leaves were more suitable for the production of healthcare products and were useful for the development of breeding programs. Taking both the composition and content into consideration, the hybrid paper mulberry could serve as another important natural source of flavonoid C- or O-glycosides. The total phenolic and flavonoid contents in the leaf extracts were analyzed as shown in Figure 5.
The flavonoid content was significantly higher in hybrid paper mulberry (5381.9 mg RE/100 g) than in the other leaves (3663.6 mg RE/100 g in B. papyrifera and 1830.7 mg RE/100 g in B. kazinoki). For total phenols, the content in the hybrid paper mulberry leaf extracts was the highest, at 2909.2 mg GAE/100 g DW. In B. papyrifera and B. kazinoki extracts, the total phenol content was 1928.5 mg GAE/100 g and 1167.9 mg GAE/100 g, respectively. Phenols are the main natural antioxidants in plants, and a previous study also suggested that higher total phenolic compound values reflected higher antioxidant activities. The antioxidant activity determinations (DPPH, FRAP and ABTS assays) of Broussonetia leaves are presented in Table 2. Our results demonstrated that the hybrid paper mulberry leaves possessed higher antioxidant abilities than those of B. papyrifera and B. kazinoki leaves.
Finally, the hybrid paper mulberry could be considered a new source of natural antioxidants. To confirm the effective components, further study of the anti-tumor, antioxidant, and antimicrobial activity of total flavonoids from hybrid paper mulberry leaves is needed, building on the previous research.

Potential Functions of BpUGT Genes Inferred from the Expression Patterns
Glycosylation is one of the most important modification and detoxification processes for plant secondary metabolites and is mediated by a set of GTs. GTs can be classified into at least 111 families, of which the UGT gene family is the largest [33]. UGTs have been identified and analyzed in a few plant species, such as Arabidopsis [34], wheat [35], and cotton [10]. To date, there has been no systematic analysis of the UGT gene family in paper mulberry. Therefore, to deepen our understanding of this family, a comprehensive analysis of phylogenetic relationships, gene duplication, gene location, conserved motifs, intron/exon position, and gene expression was performed. Phylogenetic analysis defined 15 distinct phylogenetic groups in paper mulberry, providing a useful foundation for understanding the structure-function relationships among the UGT family members. Our results showed that the largest group, group E, consisted of sixteen 71-family UGTs, nine 72-family UGTs, six 88-family UGTs, and fifteen 80-family UGTs. In addition, many plant UGT genes belonging to group E have been functionally characterized, with roles including the glycosylation of small-molecule volatile compounds and the synthesis of flavonoid glycosides and anthocyanins [36,37], which indicates that group E makes an important contribution to the glycosylation of plant secondary metabolites. Tandem duplication plays an important role in the expansion of gene families. After curation, we obtained 28 tandem-duplicated genes belonging to 9 UGT gene families, representing 18.06% of the total UGT genes in paper mulberry. These results suggest that tandem duplication has occurred continuously throughout the evolutionary history of the UGT gene family. The gene structure and conserved motif analyses will be useful for further understanding the UGT genes in paper mulberry.
The intron analysis suggested that the conserved intron changed during the evolution of paper mulberry. The analysis of promoter regions suggested that some of the UGT genes contain secondary metabolite-related elements, including MBSI cis-regulatory elements, implying that nine UGT genes might play a significant role in the synthesis of flavonoids. The expression analysis in this study gives a global landscape of the expression of paper mulberry UGTs in different tissues and provides candidates for further functional study of UGT genes in flavonoid biosynthesis. More detailed experiments are still needed to determine the mechanism by which BpUGTs act at flavonoid initiation and in the subsequent steps of flavonoid development. Our study provides systematic insights into the potential roles of UGTs in paper mulberry, which is helpful for screening candidate genes and studying the functions of UGT genes, but a series of experiments is still required to confirm their functions.
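The proportions discussed in this section follow directly from the reported counts. The short sketch below recomputes the tandem-duplication share and the size of group E from the figures given in the text; the variable names and dictionary keys are illustrative assumptions, and no data beyond the quoted counts is used.

```python
# Counts taken from the text; names are illustrative only.
total_bpugts = 155
tandem_duplicated = 28

# Share of BpUGT genes that are tandem duplicated, in percent.
tandem_share_pct = round(100 * tandem_duplicated / total_bpugts, 2)

# Composition of group E by UGT family, as listed in the text.
group_e_counts = {
    "family_71": 16,
    "family_72": 9,
    "family_88": 6,
    "family_80": 15,
}
group_e_total = sum(group_e_counts.values())
group_e_share_pct = round(100 * group_e_total / total_bpugts, 2)

print(f"Tandem-duplicated genes: {tandem_share_pct}% of {total_bpugts}")
print(f"Group E members: {group_e_total} ({group_e_share_pct}% of all BpUGTs)")
```

The first figure reproduces the 18.06% stated above. The group E total is simply the sum of the four family counts listed in the text and is included only to show how much of the family this single group covers.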

Overview of Polyphenol Compounds in Paper Mulberry
Many flavonoids are main components of compound preparations in traditional Chinese medicine. Therefore, studying the flavonoids in plants is of both theoretical significance and great practical value [38]. B. papyrifera is a typical traditional Chinese medicine and, because of its high protein content, is also a good feed material. It is therefore of great significance to study the bioactive substances of B. papyrifera. In this work, we studied the polyphenol compounds in B. papyrifera. Flavonoid C-glycosides, which had only two aglycones (apigenin and luteolin), and flavonoid O-glycosides were identified by their UV-vis absorption spectra combined with mass spectrum data (Table 1). There was no difference in the composition of either flavonoid O-glycosides or flavonoid C-glycosides among the different samples. However, the content of the flavonoids varied widely among the samples. Taking both the composition and content into consideration, hybrid paper mulberry could serve as another important natural source of flavonoid C-glycosides. This study also showed that chlorogenic acid and neochlorogenic acid were very abundant in leaves. Chlorogenic acid, which occurs in natural plant materials such as Lonicerae japonicae flos, Lonicerae flos and Eucommia ulmoides, has a variety of biological activities, including an anti-inflammatory property and an ability to prevent diseases. It can also be used as a health medicine or as a food additive [39].
Previous studies have shown that chlorogenic acid down-regulated the expression levels of critical inflammation molecules (interleukin-1β, interleukin-6, tumor necrosis factor-α, and nuclear factor-κB) in jejunal and ileal mucosa and up-regulated the expression levels of inflammation repressors (suppressor of cytokine signaling 1 and toll-interacting protein) [40]. Recent studies have shown that chlorogenic acid has a positive effect on improving the intestinal health of animals and enhancing the body's antioxidant capacity, with great potential for application in livestock and poultry production [41]. Many phenolic compounds are the main natural antioxidants in plants, and previous studies have also suggested that a higher total phenolic compound content reflects a higher antioxidant activity. Our results demonstrated that the hybrid paper mulberry leaves possessed a high total flavonoid content and strong antioxidant activity. As a plant commonly used as both medicine and food, hybrid paper mulberry contains alkaloids, flavonoid C-glycosides, flavonoid O-glycosides, vitamins and other chemical constituents. Our study showed that nine flavonoid C-glycosides (peaks 3, 4, 5, 7, 8, 9, 10, 11, 12) and five flavonoid O-glycosides (peaks 6, 13, 14, 15, 16) were detected in paper mulberry and hybrid paper mulberry. In particular, the content of orientin and vitexin was quite high. Orientin and vitexin have been found to increase the antioxidant activity of serum and tissue and to decrease the amount of malondialdehyde in mice [42]. Flavonoid C-glycosides and flavonoid O-glycosides appear to have positive influences on human health, and specifically have antioxidant, hepatoprotective, anticancer and antidiabetic potential [43]. On the basis of the known biosynthetic activities in higher plants and the compounds detected in paper mulberry, a possible biosynthesis pathway for flavonoid C- and O-glycosides was proposed (Figure S3).

Conclusions
In this study, 155 BpUGT genes were identified in the B. papyrifera genome. These genes were clustered into 15 distinct evolutionary groups (A-O) based on the phylogenetic analysis. These groups provide a useful foundation for understanding the structure-function relationships among the UGT family members. This work is the first to characterize the UGT gene family in paper mulberry. Moreover, it will provide novel insights into the functional analysis of the special traits of related gene families in plants. A total of 19 chemical compounds were identified and quantified by UPLC-ESI-MS/MS. Compared with B. papyrifera and B. kazinoki, the hybrid paper mulberry leaves had the highest total phenol and flavonoid contents. The phenol and flavonoid contents of the leaves were determined, and their antioxidant activities were measured using DPPH, FRAP and ABTS assays. All assays exhibited the same trend: the hybrid paper mulberry leaves showed a higher total flavonoid content, a higher total phenol content and greater antioxidant activities than the leaves of B. papyrifera and B. kazinoki. Thus, the hybrid paper mulberry is a plant with development value, and the biological activity of its compounds needs to be further studied and exploited at the molecular and gene levels.
Supplementary Materials: The following are available online, Figure S1: Schematic representations of segmental duplications of the paper mulberry BpUGTs. Figure S2: Phylogenetic relationship, gene structure and conserved motif analysis of the BpUGT family. Figure S3: Putative flavonoid C-glycosides and O-glycosides biosynthesis pathway in paper mulberry. CHS: chalcone synthase; CHI: chalcone isomerase; F2H: flavanone 2-hydroxylase; F3′H: flavanone 3′-hydroxylase; CGT: C-glycosyltransferase; Gly: glycoside; Ap: apigenin; Lu: luteolin. Table S1: The detailed information of 155 BpUGTs identified in paper mulberry. Table S2: The identified gene cluster and duplication analysis of BpUGT genes. Table S3: Cis-element analysis of 2000 bp nucleotide sequence data upstream of the translation initiation codon of BpUGT genes.