Integrated Metabolomic and Transcriptomic Analysis of the Anthocyanin and Proanthocyanin Regulatory Networks in Red Walnut Natural Hybrid Progeny Leaves

Walnuts are one of the most important dry fruit crops worldwide, typically exhibiting green leaves and yellow–brown or gray–yellow seed coats. A specic walnut type, red walnut ‘RW-1’, with red leaves and seed coats was selected as the plant material because of its higher anthocyanin and proanthocyanin (PA) contents. Anthocyanins and PAs coprise important secondary defense methods for plants to respond to biotic and abiotic stresses. However, few studies have focused on the molecular mechanism of anthocyanin biosynthesis in walnuts. (Delphin), peonidin 3-O-(6-O-malonyl-beta-D-glucoside), petunidin 3-O-(6-O-malonyl-beta-D-glucoside), petunidin 3-O-arabinoside and pelargonidin 3-O-(6-O-malonyl-beta-D-glucoside).

common anthocyans, and common glycosyl includes glucose, galactose, sucrose, etc. [3]. Various types and amounts of glycosyls bound to separate positions of anthocyanidin resulted in a signi cant increase in anthocyanin compounds [4]. PAs are oligomers or polymers polymerized by avane-3-alcohol subunits, and their degree of polymerization may vary according to plant species [5]. The structural changes of PAs depend on the properties of the avan-3-alcohol initiator and extension unit, the position and stereo con guration of the link with the lower unit, the degree of polymerization, and whether there is modi cation. These factors increase the structural diversity of PAs [6]. Anthocyanins and PAs act to protect plants from ultraviolet ray damage, attract insect pollination, resist low temperatures [7,8], and contribute to colors [9].
The anthocyanin and PA biosynthesis pathways have been well characterized in some plants [10].
Meanwhile, the structural genes in anthocyanin and PA biosynthesis are regulated by multiple transcription factors, among which MYB transcription factors have been widely studied. The grape R2R3-MYB TFs VvMYBA1, VvMYBA6 and VvMYBA7 regulate anthocyanin biosynthesis by activating UFGT and 3AT expression [14]. The overexpression of MdMYB3 increases anthocyanin accumulation in tobacco by activating the gene expression of CHS, CHI, UFGT and FLS [15]. The overexpression of GhMYB1a in gerbera and tobacco (Nicotiana tabacum) decreases anthocyanin accumulation by upregulating the structural gene expression of NtCHS, NtF3H, NtDFR, NtANS, and NtUFGT [16]. For PA biosynthesis, AtTT2, a positive R2R3 MYB regulator in Arabidopsis, followed the same expression pattern as AtANR and in uenced the PAs in seed coats [17]. Meanwhile, there have been a large number of reports on AtTT2 homologous genes in other species, including VvMybPA1 and VvMybPA2 in Vitis vinefera [18,19], FtMYB1 and FtMYB2 in Fagopyrum tataricum [20], MtMYB14 in Medicago truncatula [21], and GhMYB10 and GhMYB36 in Gossypium hirsutum [22].
MYB regulates anthocyanin and PA biosynthesis either alone or by forming the MBW complex (MYB-bHLH-WD40) [23,24]. For example, AcMYB123 and AcbHLH42 in Actinidia chinensis cv. Hongyang promotes anthocyanin accumulation by activating the promoters of AcANS and Ac3FGT1 [25]. PyMYB10 and PyMYB114 cotransformed with PybHLH3 induced the promoter activity of PyDFR, PyANS, PyUFGT, PyGST and PyABC transporters [26]. Composed of AtMYB123 (TT2), AtbHLH42 (TT8), and TTG1 proteins in Arabidopsis, it was reported to activate the expression of DFR, LDOX, and ANR genes, which led to the accumulation of PAs in the seed coat [27]. FaMYB9/11, FabHLH3, and FaTTG1 in strawberry [28] and DkMYB2/4, DkMYC1, and DkWDR1 in persimmon [29] are transcriptional activation complexes of PA biosynthesis pathways, which indicates that MBW complexes are well conserved in plants [30]. Therefore, the present study aimed to elucidate the regulatory mechanism of the main structural genes and transcription factors involved in anthocyanin accumulation based on red and green walnuts.
Walnuts (Juglans regia L.) are produced by an ancient fruit tree from the Juglandaceae family. They rank in the top four nuts in the world, and the walnut tree is considered to be an important ecological tree. The walnut industry has been included as the core point of industrial poverty alleviation in many areas and plays an important role in helping to target poverty alleviation and to increase farmers' income. At present, the phenotypic traits of walnut varieties are similar, with green leaves and yellow-brown or grayyellow seed coats. Luckily, a red walnut accession 'RW-1' was found by our research group with a red colored leaf, pericarp, seed coat and xylem owing to high anthocyanin content [31]. In our previous studies, we found that the growth of red walnut was less than that of ordinary walnut, which was caused by weak photosynthesis [32]. The enrichment of anthocyanins in red walnut can not only improve fruit quality but also weaken the growth potential of trees and reduce the need for pruning, which is in line with the development trend of dwarf and dense planting fruit trees. Although several genes, such as bHLHs and CHSs, have been identi ed in anthocyanin biosynthesis [33,34], the molecular mechanism of anthocyanin and PA biosynthesis has not yet been clearly elucidated in red walnut.
In recent years, the combination analysis of the transcriptome and metabolome has been widely used to clarify anthocyanin and PA biosynthesis and accumulation in plants [13,[35][36][37][38][39][40]. In the current study, the regulatory networks of anthocyanin and PA biosynthesis in walnut were constructed using two 'RW-1' natural hybrid accessions with red and green leaves, respectively. The comparative analysis of metabolomics and transcriptomics aimed to elucidate the pathway of anthocyanin and PA metabolites and to identify the differentially expressed genes (DEGs) in walnut anthocyanin and PA biosynthesis.

Results
Widely targeted metabolome and anthocyanidin detection in red walnut natural hybrid progeny leaves over developmental stages To maintain consistency of the genetic background, the RW-1 natural hybrid progenies SG and SR walnuts with separate leaf colors were investigated. During the young leaf period, the SR walnut leaves were completely red, while the SG walnut leaves were green. Next, the color difference between SR leaves and SG leaves became inconspicuous with leaf development. The SR leaves retained the red color only in the leaf vein; in contrast, the SG leaves only showed green color at all times (Fig. 1a).
The metabolome of 18 samples was pro led by a Widely Targeted Metabolome and anthocyanidin detection approach. A total of 395 compounds in the walnut leaves were classi ed into 19 classes (Table   S2), including 26 anthocyanins and 4 proanthocyanins. The quality of the Widely Targeted metabolome data was reliable, as evidenced by the PCA results (Fig. S2a, b) and the correlation analysis of population samples (Fig. S3) of the time course and two comparative analyses.
As Fig. 1b shows, 88, 20 and 36 DAMs among SG-1 vs. SR-1, SG-2 vs. SR-2, and SG-3 vs. SR-3 were identi ed between paired samples, respectively. Furthermore, the histogram showed that 30 upregulated metabolites, 58 downregulated in the rst stage, 13 upregulated while 7 downregulated in the second stage, and 22 upregulated while 14 downregulated in the third stage (Fig. 1c). The enrichment analysis of KEGG pathways showed that the most signi cant enrichment pathways in all three stages were anthocyanin biosynthesis (Fig. S4).
Components and contents of avonoids and anthocyanins during walnut leaf development stages In this work, a total of 88 avonoid metabolites were identi ed, including 55 avonoids, 3 iso avonoids, 26 anthocyanins and 4 proanthocyanins.
To determine whether the red pigmentation of red walnut is caused by anthocyanins, we analyzed the soluble anthocyanins in green and red leaves using a UPLC-ESI-MS/MS system. A total of 26 anthocyanins were detected in walnut leaves and were signi cantly upregulated in SR-1 compared with SG-1; meanwhile, only 11, 13 and 12 anthocyanins were detected in SR-1, SR-2 and SR-3 compared with the controls, respectively (Fig. 1d) PAs, including proanthocyanin B1, proanthocyanin B2, proanthocyanin B3 and proanthocyanin C1, were detected in the walnut leaves. proanthocyanin C1 was upregulated in SR compared with SG, while proanthocyanin B1 and proanthocyanin B3 were upregulated in SR-1 and SR-3 but downregulated in SR-2 compared with the controls (Fig. 1d). Moreover, 8 downaccumulated avonoids contents were found to be signi cantly different in SG-1 vs. SR-1, 2 downaccumulated and 1 upaccumulated in second period, while 2 downaccumulated and 3 upaccumulated in third period (Fig. S5). In summary, these patterns of pigment accumulation were consistent with the strikingly different leaf color phenotypes of SR and SG.
Differentially expressed genes between green-and red-leaf walnuts RNA sequencing (RNA-Seq) was used to pro le genome-wide gene expression and transcriptome changes during leaf development. With three biological replicates, transcriptome sequencing of the 18 walnut leaf samples yielded a total of 134.01 Gb clean data with 95.22% of bases scoring Q30. Furthermore, 92.23-95.27% of the total clean reads were unique matches with the walnut reference genome, and 4,131 novel genes were identi ed, including 3,154 annotated genes [33].
From Fig. 2a, 5,708, 1,587 and 703 DEGs were obtained from SG-1 vs. SR-1, SG-2 vs. SR-2, and SG-3 vs. SR-3, respectively, containing 24 identical genes (Fig. 2a). According to the results of transcriptomic analysis, there were 5,708 DEGs between SG-1 and SR-1, of which 3,513 were upregulated and 2,195 were downregulated. There were 1,587 DEGs between SG-2 and SR-2, including 810 upregulated genes and 777 downregulated genes. There were 703 DEGs between SG-3 vs. SR-3, including 364 upregulated genes and 339 downregulated genes. The pro les of the DEGs indicated that the differences in gene expression affected leaf color in walnuts. To better understand biological functions of DEGs in different color leaves, GO analysis were applied (Fig. 2b-d). The DEGs were enriched in biological process, cellular component and molecular function according to GO database. In the biological process category, most of DEGs were annotated to metabolic process, cellular process and single-organism. With respect to the cellular component category, DEGs were mainly assigned to cell, cell part and membrane. In the molecular function category, DEGs were primarily enriched in catalytic activity, binding and transporter activity.
Analysis of structural genes involved in anthocyanin and PA biosynthesis.
On the basis of a reported anthocyanin biosynthesis pathway, a pathway diagram that included the expression heat map of each structural gene in the anthocyanin biosynthesis pathway of walnut were constructed (Fig. 3). In view of leaf color differences in appearance at Stage 1, the combination analysis of transcriptomic and metabolomic data explained the anthocyanin biosynthesis pathway in SG and SR walnuts. The results showed that 17 DEGs were involved in the anthocyanin biosynthesis pathway, and 4 DEGs were involved in the PA biosynthesis pathway. Structural genes, including C4H genes (gene40343 and gene42522), CHS genes (gene4994, gene39336, gene35863 and gene32601) [34], F3H gene (gene40994), F3'5'H gene (gene4387), ANS gene (gene1297) and UFGT genes (gene28510, gene35146, gene1870, gene24302, gene35144, gene1697, and gene36923), were all upregulated, and only one UFGT gene (gene35048) was downregulated in red leaves (Fig. 3). For the biosynthetic pathway of PA, LAR and ANS, ANR and UFGT compete for substrates, and two LAR genes (gene38150, gene640) and one ANR gene (gene24378) were upregulated, while one ANR gene (gene21099) was downregulated in red leaves.

Identi cation of TFs Related to Anthocyanin and PA Biosynthesis
Anthocyanin and PA biosynthesis are regulated by the MBW (MYB-bHLH-WD40) complex. As shown in Fig. 4a, MYB and bHLH ranked as the top two TFs related to anthocyanin biosynthesis in walnut leaves. Based on the Arabidopsis MYB protein domains, 135 putative walnut MYB protein sequences were obtained with default parameters using HMMER and BLASTP (Table S3). A phylogenetic tree was constructed using the 135 JrMYBs and MYBs of other species related to anthocyanin and PA biosynthesis (Fig. 4b). Nine subfamilies and 25 JrMYBs were obtained from the current study. Based on a false discovery rate (FDR) < 0.05 and | Log 2 FC | ≥ 1.5, 3 differentially expressed MYB genes related to anthocyanin biosynthesis in red frames, JrMYB1b (gene38312), JrMYB6a (gene32351), JrMYB123 (gene9445), and one MYB gene related to PA biosynthesis, JrTT2 (gene39085), were screened (Fig. 4c). The bHLHs involved in anthocyanin biosynthesis have been reported by our research group, including JrbHLHA1, JrbHLHA2, JrEGL1a, and JrEGL1b [33]. For WD40, we obtained 23 DEGs, including 6 upregulated (gene1567, gene24750, gene31286, gene32032, gene42048, and gene5786) and 10 downregulated genes (gene10090, gene1113, gene13751, gene15328, gene16177, gene16178, gene1631, gene3696, gene37190, and gene38133) in the rst stage, 8 upregulated (gene15790, gene22436, gene37190, gene37530, gene39719, gene5160, gene7486, and gene9207) and 1 downregulated gene (gene3696) in the second stage, and 1 downregulated gene (gene3696) in the third stage. Among them, gene3696 showed a downregulated trend in all three stages. These TFs may exert an effect on or participate in the regulation of structural and regulatory genes in anthocyanin biosynthesis. qPCR analysis of DEGs related to anthocyanin and PA biosynthesis Based on our above results, 12 differentially expressed structural genes involved in the anthocyanin biosynthetic pathway and 4 MYB genes were analyzed using qPCR methods. The results in Fig. S6 show that eight structural genes, JrC4Hs (gene40343, gene42522), JrF3'5'H (gene4387), JrUFGTs (gene35144, gene35146, gene1697, gene24302 and gene1870), and four MYB genes (gene38312, gene32351, gene9445, gene39085), were highly expressed in red leaf walnut at the rst stage, which indicated that the high expression of genes involved in anthocyanin biosynthesis was related to the color of red walnut leaves. It was also further suggested that the transcriptome data were accurate and consistent with the expression of genes related to anthocyanin biosynthesis in red walnut.

Discussion
The primary pigments in red walnut have been identi ed as avonoids, particularly anthocyanins [31,34], and previous studies on walnut coloration are limited. The current work aimed to explore more comprehensive metabolites involved in the color changes in walnut leaves, and 395 metabolites were obtained from walnut leaves using the widely targeted metabolomics and anthocyanidin detection approach (Table S2). This was the rst study to present a genome-wide examination of anthocyanins and the gene expression pro les of walnuts, aiming to provide a more comprehensive landscape of the metabolites involved in the color changes in walnut leaves during development. Moreover, we also found that PA accumulation was higher in SR than in SG, especially in the rst stage (Fig. 1d). By transcriptomic and metabolic analysis, 21 core genes were identi ed in anthocyanin and PA biosynthesis (Fig. 3). These ndings provide a theoretical basis for further study of the mechanism of red walnut.
Anthocyanins are secondary metabolites in plants, such as ornamental plants, fruits, vegetables and medical plants, and play various roles in many biological processes, including determining fruit quality and ower colors, improving resistance, and avoiding UV and strong light damage [41,42]. In addition to the rich nutrition in kernels, walnuts are also an important ecological tree. Red walnuts RW-1 possesses great ornamental value, nutritional value and economic bene ts. A total of 26 anthocyanins were detected in green-and red-leaf walnuts (Fig. 2), and cyanidin 3-O-galactoside, delphinidin 3-O-galactoside, and delphinidin 3-O-glucoside were the main anthocyanins in SR. Importantly, the anthocyanin content was consistent with leaf color. Interestingly, 6 anthocyanin compounds, cyanidin, delphinidin, malvidin, pelargonidin, peonidin, and petunidin, existed not only in SR but also in SG (Fig. 2). Similar results were also found in white Salvia miltiorrhiza owers [43]. The color depended on the types and contents of anthocyanins, copigment, chlorophyll, vacuole pH, and metal ions [44]. The 6 anthocyanins accumulated less in green leaf walnuts than in red leaf walnuts, suggesting that complete anthocyanin metabolic pathways were also present in SG. The other factors affecting the color of walnut leaves need to be studied further.
The core genes for walnut anthocyanin and PA biosynthesis involve multiple enzymes encoded by structural genes (C4H, CHS, CHI, F3H, F3' H, F3'5'H, DFR, ANS, UFGT, LAR and ANR) [44]. The results from this study showed that the expression levels of C4Hs, CHSs, F3H and F3'5'H in SR were signi cantly higher than those in SG (Fig. 3, Fig. S6), which are common to both anthocyanin and PA biosynthesis pathways. The expression of ANS and UFGT genes in SR was signi cantly higher than that in SG (Fig. 3,  Fig. S6), suggesting that the genes mainly regulated red coloration in red leaf walnuts. As walnut leaves develop, these structural genes are downregulated, and the color tends towards green. The UFGT gene is involved in the nal steps of the avonoid biosynthetic pathway (biosynthesis and accumulation of anthocyanins). The expression of UFGT genes (gene35144, gene35146, gene1697, gene24302 and gene1870) was higher in SR than in SG at Stage 1, and UFGT gene (gene35038) was the opposite. It is worth noting that the 3 UFGT gene trends during the 3 stages corresponded to the color change in walnut leaves. Anthocyanidins are extremely unstable and easy to degrade; therefore, glycosylation is important to stabilize. As the last step of the anthocyanin biosynthesis pathway, UFGTs are considered to be the key enzymes controlling anthocyanin biosynthesis in many plants, and they play an important role in anthocyanin metabolism [11, [45][46][47]. The present study rst explored the expression pro le of UFGTs using red walnuts, providing new insights for studying UFGT gene function. The expression of LAR and ANR was also higher in SR than in SG, suggesting that these genes may play important roles in PA biosynthesis (Fig. 3, Fig. S6).
Anthocyanin biosynthesis is regulated by the MBW (MYB-bHLH-WD40) complex. In the MBW complex, bHLH and MYB transcription factors have DNA binding functions and can speci cally bind to the promoters of structural genes in the anthocyanin biosynthesis pathway, while the WD40 protein plays a stable role in the MBW complex [23,48]. Actinidia chinensis Planch AcMYB123 and AcbHLH42 promote anthocyanin accumulation in the inner pericarp of Actinidia chinensis Planch by activating the promoters of AcANS and AcF3GT1 [25]; PyWRKY26 and PybHLH3 act together on the PyMYB114 promoter, and cotransfection can activate the expression of PyUFGT to regulate the accumulation and transportation of anthocyanin in pear [26]. In this study, we identi ed ve MYBs, four bHLHs and six WD40s based on their expression levels and phylogenetic analysis. The manner in which TFs play a role in regulating structural genes needs further study. These ndings can help to elucidate the molecular mechanism and regulatory networks of anthocyanin biosynthesis in walnut and provide a biological basis for breeding new walnut cultivars.

Conclusions
In this study, metabolomics and transcriptomics were used to reveal the anthocyanin biosynthesis metabolic pathway. A total of 26 anthocyanins in SR were signi cantly upaccumulated comparing with SG, and the transcriptome analysis demonstrated that the expression of structural genes (C4H, F3H, F3'5'H and UFGTs), four MYBs, six WD40s in anthocyanin biosynthetic pathway were signi cantly higher in SR walnut. Delphinidin 3-O-galactoside, cyanidin 3-O-galactoside, delphinidin 3-O-glucoside and cyanidin 3-O-glucoside were identi ed as the primary components of anthocyanidins because of their higher contents. For proanthocyanin, proanthocyanin C1 was upregulated in SR compared with SG, while proanthocyanin B1 and proanthocyanin B3 were upregulated in SR-1 and SR-3 but downregulated in SR-2 compared with the controls. Our results provide valuable information on anthocyanin and PA metabolites and candidate genes in anthocyanin and PA biosynthesis, which provides new insights into anthocyanin and PA biosynthesis in walnuts.

Plant material and growth conditions
Red walnut (Juglans regia L. accession RW-1, germplasm resource number: JUREG4108210002) was introduced from Taihang Mountain, China (Fig. S1). To maintain a relatively consistent genetic background, various color phenotypes of natural hybrid plants (SG for green leaf natural hybrid plants and SR for red leaf natural hybrid plants) were grown in the Fruit Tree Experimental Station of Horticulture College, Henan Agricultural University, Zhengzhou, Henan, China. Sampling permission was obtained from the public land management agency of Henan Agricultural University. The leaves were collected according to the color change of SR walnut leaves. During the full red period (SR-1), red-green period (SR-2), and full green period (SR-3), the leaves of SG walnut from the same period of development were collected as the control [34] (Fig. 1a). All samples were immediately frozen in liquid nitrogen and stored at -80°C until RNA and metabolite extraction.

Extraction and quanti cation of metabolites
The sample widely targeted metabolome and anthocyanidin detection were performed at Wuhan MetWare Biotechnology Co., Ltd. (Wuhan, China) as previously described [38]. In brief, 100 mg leaves were crushed into powder and extracted with 0.6 mL of 70% aqueous methanol overnight at 4°C [49]. Following centrifugation at 10,000 g for 10 min, the supernatant was ltered through a microporous membrane (0.22 µm) for subsequent liquid chromatography/tandem mass spectrometry (LC-MS/MS) analysis.
Principal component analysis (PCA) was performed to verify the differences and reliability of metabolites in all the samples. Differentially accumulated metabolites (DAMs) between groups were ltered with VIP value ≥1 and absolute Log 2 FC (fold change) ≥1. Subsequently, the differentially expressed metabolites with signi cant enrichment were mapped to the Kyoto Encyclopedia of Genes and Genomes (KEGG, https://www.genome.jp/kegg), and a plant metabolic network (PMN, https://plantcyc.org/) was used to analyze the phenylalanine metabolism pathway, avonoid metabolism pathway and anthocyanin biosynthesis pathway [49,50]. We constructed metabolic pathways based on the KEGG database.

RNA extraction, sequencing and transcriptome data analysis
Total RNA from walnut leaves was extracted using the Omega Plant RNA Kit (Omega Biotek, Norcross, Georgia) according to the manufacturer's instructions. The integrity and quality of total RNA were examined on 1% agarose gels, and the RNA concentrations were measured by a NanoDrop 1000 spectrophotometer (Thermo Fisher Scienti c, Waltham, MA, United States). A cDNA library was constructed and sequenced on a HiSeq 2,500 platform (Illumina, San Diego, CA, United States) (NCBI accession PRJNA688391) by Biomarker Biotechnology Corp. (Beijing, China), following paired-end reads were produced after cluster generation.
The expression abundance of unigenes was represented as FPKM (fragments per kilobase of transcript per million fragments mapped) values, and the DEGs [false discovery rate (FDR) < 0.05, and | Log 2 FC | ≥ 1.5] between red leaves and green leaves were obtained using the DESeq 1.8.3 package [51]. Functional annotation was performed using Gene Ontology (GO), and the pathways with signi cant enrichment were identi ed based on KEGG pathways employing the cluster Pro ler R package [52].

Quantitative Real-time (qRT) PCR Assay
The expression of structural genes and transcription factor genes in the anthocyanin biosynthetic pathway was examined by qRT-PCR. First-stand cDNA was synthesized using the FastQuant RT Kit (with gDNase) (Tiangen Biotech, Beijing, China), and qRT-PCR was performed using ChamQ Universal SYBR qPCR Master Mix (Vazyme, Nanjing, China) on an ABI 7500 Real-Time PCR system (Applied Biosystems, Foster City, CA, United States). Jr18S (GenBank accession No. XM_019004991.1) was used as the housekeeping gene [53]. Quanti cation was evaluated using the 2 − ΔΔCt method. All the primers are shown in Table S1.

Statistical analysis
All data were analyzed using SPSS 22.0 software, expressed as the mean ± standard deviation (SD) of three replicates. Signi cant differences were carried out using a one-sided paired t-test ( ** p< 0.01, * p< 0.05) between red leaf and green leaf samples.

Consent for publication
Not applicable.

Availability of data and materials
The datasets presented in this study can be found in online repositories. The name of the repository and accession numbers can be found at the National Center for Biotechnology Information, https://www.ncbi.nlm.nih.gov/, PRJNA688391.

Competing interests
The authors declare that they have no competing interests. between the three compared groups of leaf samples. c Number of differentially accumulated metabolites (DAMs) in SG-1 vs. SR-1, SG-2 vs. SR-2 and SG-3 vs. SR-3. d Concentrations of anthocyanins in leaves of SG and SR. Numbers refer to -fold change in anthocyanin contents.