Genome-Wide Identification of the MYB and bHLH Families in Carnations and Expression Analysis at Different Floral Development Stages

Carnations are one of the most popular ornamental flowers in the world, with varied flower colors that have long attracted breeders and consumers alike. The differences in carnation flower color are mainly the result of the accumulation of flavonoid compounds in the petals. Anthocyanins are a type of flavonoid compound that produce richer colors. The expression of anthocyanin biosynthetic genes is mainly regulated by MYB and bHLH transcription factors (TFs). However, these TFs have not been comprehensively reported in popular carnation cultivars. Herein, 106 MYB and 125 bHLH genes were identified in the carnation genome. Gene structure and protein motif analyses show that members of the same subgroup have similar exon/intron and motif organization. Phylogenetic analysis combining the MYB and bHLH TFs from Arabidopsis thaliana separates the carnation DcaMYBs and DcabHLHs into 20 subgroups each. Gene expression (RNA-seq) and phylogenetic analyses show that DcaMYB13 in subgroup S4 and DcabHLH125 in subgroup IIIf have expression patterns similar to those of DFR, ANS, and GT/AT, which regulate anthocyanin accumulation, both during petal coloring and in the comparison of red-flowered and white-flowered carnations. DcaMYB13 and DcabHLH125 are therefore likely to be key genes responsible for the formation of red petals in carnations. These results lay a foundation for the study of MYB and bHLH TFs in carnations and provide valuable information for the functional verification of these genes in studies of tissue-specific regulation of anthocyanin biosynthesis.


Introduction
Carnation (Dianthus caryophyllus L.) is one of the most popular cut flowers of the genus Dianthus in the Caryophyllaceae family. Along with roses (Rosa hybrida) and chrysanthemums (Dendranthema morifolium (Ramat.) Tzvel.), carnations are among the most popular ornamental flowers in the world. More than 30,000 varieties of carnations have been cultivated worldwide [1], and they are favored by growers and consumers for their large flowers, numerous varieties with diverse colors, and long vase life. Flower color in carnations has long been an important target of artificial selection by plant breeders, who are well aware of the economic benefits that new color varieties can yield [2]. Variations in flower color between different lineages and cultivars mainly depend upon pigment formation, with the four major pigment classes being chlorophylls, carotenoids, betalains, and flavonoids [3,4]. Differences in carnation color are mainly the result of differences in the accumulation of flavonoid compounds in petals [5]. Flavonoids are a diverse group of compounds that includes pigments such as the anthocyanins, which are deposited in the vacuoles of plant cells and produce colors ranging from red to purple and blue [6,7]. Anthocyanin biosynthesis is a complex process that involves numerous structural genes and transcription factors. The structural genes code for enzymes such as cinnamate 4-hydroxylase (C4H), 4-coumaroyl CoA ligase (4CL), phenylalanine ammonia-lyase (PAL), chalcone synthase (CHS), chalcone isomerase (CHI), flavanone 3-hydroxylase (F3H), flavonoid 3′,5′-hydroxylase (F3′5′H), flavonoid 3′-hydroxylase (F3′H), dihydroflavonol 4-reductase (DFR), and anthocyanidin synthase (ANS) [6,8]. In addition to these structural genes, transcription factors can determine flower color by regulating specific sites of anthocyanin biosynthesis.
Many transcription factors are involved in the transcriptional regulation of anthocyanin biosynthesis, including R2R3-MYBs, bHLHs, WD40 (WDR), WRKY, bZIP, and MADS-box proteins [9,10]. These transcription factors (TFs) regulate the expression of anthocyanin structural genes mainly by binding cis-acting elements in the promoters of the downstream structural genes. Among them, the MYB-bHLH-WD40 (MBW) complex is the most widely studied regulator of the anthocyanin synthesis pathway [11][12][13].
The MBW complex comprises three types of regulatory proteins: R2R3-MYBs, bHLHs, and WD40 proteins. The MYB family is one of the largest families of transcription factors in plants and is important in the regulation of anthocyanin biosynthesis [14,15]. According to the number of imperfect repeats (R) in the MYB domain, MYB proteins can be divided into four subfamilies, from 1R-MYB with one R repeat to 4R-MYB with four R repeats. Of the four subfamilies, R2R3-MYB is the most abundant and is widely involved in regulating plant growth and development [16][17][18]. The bHLH family is also a large class of transcription factors in plants. The bHLH transcription factors regulate many cellular processes, such as the fate of epidermal cells, photomorphogenesis, and flower organ development [19]. bHLH transcription factors have been shown to be involved in anthocyanin biosynthesis in many plants [20][21][22][23]. The first 200 amino acids at the N terminus of the bHLH protein sequence form the MYB-interacting region (MIR), and the next 200 amino acids usually contain negatively charged regions necessary for interaction with WD40 or the RNA polymerase II complex. WD40 is not usually catalytic but serves as a docking platform for many interacting proteins [7,24]. In Arabidopsis thaliana, three regulatory proteins, namely, AtTT2/AtMYB123, AtTT8/AtbHLH042, and AtTTG1 (WD40 protein), have been reported to act together as an MBW complex to promote proanthocyanidin production in the seed coat by activating expression of the target gene BAN/ANR [25,26]. Two MYB proteins (DcMYB6 and DcMYB7) have been shown to interact with members of the MBW complex to control anthocyanin biosynthesis in purple carrots (Daucus carota var. sativa Hoffm.) [12]. A study in lychee (Litchi chinensis Sonn.) found that LcbHLH1 and LcbHLH3 play an important role in LcMYB1's regulation of anthocyanin production, suggesting that the LcMYB1-LcbHLH complex enhances anthocyanin accumulation and may be related to the activation of DFR and ANS transcription [27].
In this study, we performed genome-wide identification of DcaMYB and DcabHLH genes and identified a total of 106 MYB genes and 125 bHLH genes in the carnation genome. The gene structure, motif distribution, chromosomal location, phylogeny, and expression of these genes at each flowering stage were comprehensively analyzed. Through these analyses, this study aims to characterize the MYB and bHLH genes in carnations and reveal their roles in regulating anthocyanin synthesis as it relates to flower color. These results provide information for further functional analysis of DcaMYBs and DcabHLHs, which should ultimately support the breeding and development of new color varieties in carnations.

Identification of DcaMYBs and DcabHLHs
We identified 106 MYB genes and 125 bHLH genes in the carnation genome by combining the results of homology searches, conserved-domain checks, and HMM searches. The MYB and bHLH genes from carnation were named DcaMYB1-DcaMYB106 and DcabHLH1-DcabHLH125, respectively. Further sequence analysis shows that the DcaMYB and DcabHLH products differ in both amino acid sequence length and molecular weight. The longest MYB protein is DcaMYB51, with 1027 amino acids, and the shortest is DcaMYB73, with only 136 amino acids; the longest bHLH protein is DcabHLH21, with 969 amino acids, and the shortest is DcabHLH55, with only 90 amino acids. The molecular weights of the DcaMYBs range from 34,425.05 to 251,683.56, with isoelectric points ranging between 4.85 (DcaMYB93) and 5.22 (DcaMYB92). Similarly, the molecular weights of the DcabHLHs range from 21,838.14 to 237,109.03, with isoelectric points between 4.87 (DcabHLH21, DcabHLH112) and 5.35 (DcabHLH55). The main sequence information, molecular weight, and isoelectric point of each predicted protein are given in Supplementary Materials Tables S1 and S2.
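The per-protein properties summarized above (length and molecular weight) can be illustrated with a short sketch. It is not the tool used to produce Tables S1 and S2, which the text does not name; it assumes unmodified peptides built from the 20 standard amino acids and uses average residue masses. The function names and the toy sequences are hypothetical.

```python
# Average residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added for the free N and C termini


def protein_mass(seq):
    """Approximate average molecular weight (Da) of an unmodified peptide."""
    seq = seq.upper().rstrip("*")  # drop a trailing stop symbol if present
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


def length_extremes(proteins):
    """Return (longest_id, shortest_id) by amino acid count."""
    longest = max(proteins, key=lambda k: len(proteins[k]))
    shortest = min(proteins, key=lambda k: len(proteins[k]))
    return longest, shortest


# Hypothetical toy sequences, not real carnation proteins.
toy_proteins = {"geneA": "MGRSPCCEKA", "geneB": "MGR"}
longest_id, shortest_id = length_extremes(toy_proteins)
print(longest_id, shortest_id, round(protein_mass(toy_proteins[shortest_id]), 2))
```

Isoelectric point estimation needs a pKa table and an iterative charge calculation, so it is left out of this sketch.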

Gene Structure and Motif Analysis of DcaMYBs and DcabHLHs
In this study, 10 conserved motifs were identified separately for the DcaMYB and DcabHLH transcription factors using MEME. Most motifs are located in an ordered fashion at the N terminus of DcaMYBs, while in only a few cases the motifs are distributed irregularly at the C terminus. Motifs 1, 2, 3, 5, 6, and 9 are widely distributed in the domain of DcaMYBs. A few motifs, such as motif 4 and motif 10, cluster together on a distinct branch of the phylogenetic tree (Figure 2b and Figure S1). Among the 10 conserved motifs identified in DcabHLHs, motifs 1 and 2 are present in all protein sequences (Figure 3b and Figure S2). Members of the same subgroup usually contain similar motifs; for example, motif 6 only exists in subgroup XIII, motif 10 is only found in subgroup IV (c + b), and motif 9 is mainly concentrated in subgroup III (d + e).
Introns and exons are identified in all MYB and bHLH genes in carnation. Most genes in the same subgroup have exons and introns of similar length and number (Figures 2c and 3c). For example, all the genes in the S22 subgroup of MYB contain no introns and are roughly the same length. Similarly, all the genes in the III (d + e) subgroup of bHLH contain no introns. In general, the motif composition and gene structure of MYB and bHLH genes of the same group are similar. These results strongly support the reliability of the group classifications derived from similarity to known A. thaliana subgroups.

Chromosomal Locations of DcaMYBs and DcabHLHs
Most DcaMYBs are unevenly distributed on 15 chromosomes, and seven are mapped to three unanchored contigs. Of the 125 DcabHLHs, 119 are unevenly distributed on 15 chromosomes, and 6 are mapped to four unanchored contigs (Figure 4). In the MYB gene family of carnations, chromosome 10 has the most DcaMYBs at 12, chromosome 12 has 11, chromosome 8 has 10, and every other chromosome carrying MYB genes has between one and nine. The collinearity analysis reveals 21 pairs of segmentally duplicated genes in the MYB family of carnations, with the largest number of segmentally duplicated genes found on chromosome 13 (Figure S3). In the bHLH gene family of carnations, chromosome 2 has the most DcabHLHs at 14, chromosome 9 has 13, chromosomes 3 and 12 each have 12, and every other chromosome carrying bHLH genes has between one and nine. The collinearity analysis reveals 31 pairs of segmentally duplicated genes in the bHLH family of carnations, with the largest number of segmentally duplicated genes found on chromosome 3 (Figure S3).

Expression Profiles of DcaMYBs and DcabHLHs
In the process of evolution, different genes in a family often diverge functionally over time, sometimes resulting in new functions among genes. Gene expression studies are an important means by which such divergence in gene function can be detected. To explore how MYB and bHLH transcription factors regulate anthocyanin synthesis in carnations, we analyzed two sets of differentially expressed genes representing floral color variation: S3 vs. S4 and red vs. white. S3 and S4 are the third and fourth stages of flower development, where S3 flowers are uncolored and S4 flowers are starting to color; the red vs. white comparison contrasts red-flowered and white-flowered carnations at full bloom (Figure 5a). We identified a total of 1446 genes shared by the two sets. These include various transcription factors such as MYB, bHLH, AP2, WRKY, and bZIP, of which 12 are MYB transcription factors and 11 are bHLH transcription factors (Figure 5a).
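The overlap between two sets of differentially expressed genes is a plain set intersection, which can be sketched as follows. The gene identifiers and family labels are hypothetical placeholders, not the carnation data, and the function names are illustrative only.

```python
from collections import Counter


def shared_degs(comparison_a, comparison_b):
    """Genes that are differentially expressed in both comparisons."""
    return set(comparison_a) & set(comparison_b)


def count_by_family(gene_ids, family_of):
    """Count shared genes per transcription factor family (unannotated genes are skipped)."""
    return Counter(family_of[g] for g in gene_ids if g in family_of)


# Hypothetical toy input: DEG lists from the two comparisons.
s3_vs_s4 = ["g1", "g2", "g3", "g5"]
red_vs_white = ["g2", "g3", "g4", "g5"]
tf_family = {"g2": "MYB", "g3": "bHLH", "g4": "WRKY"}

overlap = shared_degs(s3_vs_s4, red_vs_white)
print(sorted(overlap), dict(count_by_family(overlap, tf_family)))
```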
To further understand the role of these MYB and bHLH genes during the color formation of carnation petals, we analyzed their expression in seven floral development stages (S1-S3 petals are colorless and S4-S7 petals have red margins), and in four floral stages of red-flowered and white-flowered carnations. Previous studies show that genes such as DFR, ANS, and GT/AT also affect anthocyanin biosynthesis [32,33]. In our study, the DFR, ANS, and GT/AT genes in carnations are identified, and their expression increases with petal coloring (Figure 5). We find that DcaMYB13 and DcabHLH125 not only have similar expression patterns to each other, but also share similar expression patterns with DFR, ANS, and GT/AT during petal coloring across the seven floral development stages. In addition, they show the same trend in the four stages of red flower development, but not in the white flower. We also find that the expression patterns of DcaMYB84 and DcaMYB87 are similar to those of DFR, ANS, and GT/AT across the seven floral development stages with petal coloring. DcabHLH19 is highly expressed during the non-colored S1-S3 stages, but its expression decreases during the S4-S7 stages when red floral margin coloration occurs, suggesting that it may act as a negative regulator (Figure 5b).
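One way to put a number on "similar expression patterns" is a Pearson correlation between the stage-wise expression profiles of a transcription factor and a structural gene. The text does not say that a correlation coefficient was computed, so this is an assumed, illustrative measure; the profiles below are hypothetical values, not the measured FPKMs.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)


# Hypothetical expression values over seven stages (S1-S7).
structural_gene = [2, 4, 6, 8, 10, 12, 14]   # rises with petal coloring
candidate_activator = [1, 2, 3, 4, 5, 6, 7]  # rises in step with it
candidate_repressor = [7, 6, 5, 4, 3, 2, 1]  # falls as coloring proceeds

print(pearson(candidate_activator, structural_gene))
print(pearson(candidate_repressor, structural_gene))
```

A coefficient near +1 would match the co-expression described for DcaMYB13 and DcabHLH125, and a value near -1 would match the opposite trend described for DcabHLH19. Co-expression alone does not establish regulation.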

Discussion
Carnations are one of the most popular ornamental flowers in the world, with diverse colors and beautiful patterns among the key characteristics attracting growers and consumers [34][35][36]. Differences in carnation flower color are the result of many factors, with anthocyanin pigment deposition in petal cells being one of the main ones [2,37]. In the ornamental plants studied thus far, MYB transcription factors have been found to be important in the biosynthesis of anthocyanin, as have bHLH transcription factors, which can interact with MYB proteins to regulate anthocyanin biosynthesis [13,38,39].
In this study, genome-wide analysis was used to explore the regulation of carnation flower color by the MYB and bHLH families. Based on conserved features, the genes are grouped into different subgroups. A large number of studies show that the MYB genes involved in regulating anthocyanin biosynthesis are mainly concentrated in the S4, S5, S6, and S7 subgroups [18,29]. Among them, the MYB genes of the S5, S6, and S7 subgroups mainly function as activators to promote anthocyanin biosynthesis [28,40]. In the S6 subgroup, genes such as AtMYB75, AtMYB113, and AtMYB114 in A. thaliana [41,42], MdMYB10 [43] and MdMYB110a [44] in apple (Malus pumila Mill.), and VvMYBA1 [45] and VvMYBA2 [46] in grape (Vitis vinifera L.) have been shown to promote anthocyanin synthesis. However, the MYB genes related to anthocyanin synthesis in the S4 subgroup function as repressors to inhibit anthocyanin biosynthesis [47][48][49]. In carnations, four DcaMYB genes (DcaMYB13, 33, 38, and 95) fall in the S4 subgroup, whereas in A. thaliana, eleven AtMYB genes (AtMYB3, 4, 6, 7, 8, 20, 32, 42, 43, 85, and 99) are known. Similarly, the S7 subgroup in carnations has fewer genes than in A. thaliana, which may be the result of gene loss during carnation evolution. This suggests that the carnation MYB family may have undergone neofunctionalization and sub-functionalization. Most of the bHLH genes known from the IIIf subgroup, such as AtbHLH1, AtbHLH2, and AtbHLH42, are reported to be involved in anthocyanin biosynthesis [19,30,31]. DcabHLH125 is clustered in subgroup IIIf (Figure 1b) and its expression increases with petal coloring, suggesting that this gene is involved in regulating anthocyanin biosynthesis in carnations.
In addition, we analyzed the protein motif content and gene structure of the MYB and bHLH gene families in carnations. Most R2R3-MYB motifs in DcaMYBs are distributed at the N terminus, where the conserved DNA-binding domain determines the binding of MYB to different cis-acting elements; only a few motifs are irregularly distributed at the C terminus. Motifs 1, 2, and 3 are present in almost all DcaMYBs. However, some motifs only appear in specific groups, such as motif 4 and motif 10, which only occur in MYB-related gene families, suggesting a specific function for these motifs (Figure 2b). Similar patterns are observed in DcabHLH proteins, where motifs 1 and 2 are found in all proteins, while motif 6 is only present in subgroup XIII and motif 9 is concentrated in subgroup III (d + e) (Figure 3b). Most DcaMYBs and DcabHLHs contain two exons and one intron, but some genes consist of only one exon. In general, protein motifs and gene structure are similar within the same subgroup.
Multiple MYB genes from the S4, S5, S6, and S7 subgroups have been reported to regulate anthocyanin biosynthesis [39]. In our study, DcaMYB13 belongs to the S4 subgroup, and expression analysis shows that its expression pattern is similar to that of DFR, ANS, and GT/AT, which regulate anthocyanin accumulation, suggesting that DcaMYB13 is likely to be involved in anthocyanin biosynthesis. The expression of DcaMYB84 and DcaMYB87 during the seven flower development stages in carnation is also similar to that of genes such as DFR, suggesting that they may be candidates for the regulation of anthocyanin synthesis. We also reviewed the literature on the bHLH family and found that genes in subgroup IIIf are often involved in anthocyanin biosynthesis [30,31]. DcabHLH125 belongs to subgroup IIIf, and its expression pattern is similar to that of DFR, ANS, and GT/AT, which indicates that it may play an important role in anthocyanin synthesis in carnation. DcabHLH19 is highly expressed during the non-colored S1-S3 stages, but its expression decreases during the S4-S7 stages when red coloration occurs, suggesting that it may act as a negative regulator, inhibiting anthocyanin biosynthesis.
The biosynthesis of anthocyanins is regulated by the transcription factors in the MBW complex [50]. In the model species A. thaliana, AtTT8 (bHLH) regulates anthocyanin synthesis by forming MBW transcription complexes with AtTT2 (MYB) and AtTTG1 (WDR). The interaction of AN2 (MYB) and JAF13 (bHLH) is found to promote pigment accumulation in petunias (Petunia spp.) [38]. The co-expression of MdMYB10 with MdbHLH3 and MdbHLH33 in apples can effectively induce anthocyanin biosynthesis [43]. In our study, DcaMYB13 and DcabHLH125 have similar expression patterns in the transcriptomes, and these patterns are similar to those of DFR, ANS, and GT/AT, suggesting that DcaMYB13 and DcabHLH125 may interact and jointly regulate the expression of the downstream genes DFR, ANS, or GT/AT to promote anthocyanin biosynthesis. Further elucidating the interaction between the MYB and bHLH gene families in the biosynthesis of anthocyanin in carnation flowers will be directly applicable to the development of novel cultivars. To this end, we have begun a series of gene-editing experiments to better understand protein interactions between these gene families.

Identification of MYB and bHLH Family Members in the Carnation Genome
Potential MYB and bHLH genes were identified in the carnation genome [51] assembled by our team at the Shenzhen Genome Institute of the Chinese Academy of Agricultural Sciences in 2022. The 168 MYB protein sequences and 225 bHLH protein sequences of A. thaliana, as well as 138 MYB protein sequences and 211 bHLH protein sequences of rice (Oryza sativa L.), were downloaded from the Plant Transcription Factor Database [52,53] as query sequences. BLASTP (BLAST 2.12.0+) searches were performed using these amino acid sequences as queries against the carnation genome database. All candidate MYB and bHLH family sequences were confirmed with the NCBI conserved domain database (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi, accessed on 1 July 2022). The hidden Markov model (HMM) files of the MYB DNA-binding domain (PF00249) and the bHLH DNA-binding domain (PF00010) from the Pfam database (http://pfam.xfam.org/search, accessed on 3 July 2022) were used to further identify MYB and bHLH transcription factors by HMM search in TBtools [54].
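The way the three lines of evidence can be combined is sketched below. How the BLASTP and HMM hit lists were merged is not stated here, so the sketch makes its assumptions explicit: candidates supported by either search are pooled, and only those confirmed by the conserved-domain check are kept. The E-value cutoff of 1e-5 is likewise an assumption, the protein identifiers are hypothetical, and the BLAST input is assumed to be in the standard 12-column tabular format.

```python
def blast_subjects(tabular_text, max_evalue=1e-5):
    """Collect subject IDs from 12-column tabular BLAST output below an E-value cutoff.

    Columns: qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore. The cutoff is an assumed value.
    """
    hits = set()
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        subject_id, evalue = fields[1], float(fields[10])
        if evalue <= max_evalue:
            hits.add(subject_id)
    return hits


def confirmed_family_members(blast_hits, hmm_hits, domain_confirmed):
    """Pool BLASTP and HMM candidates, then keep those with a confirmed domain."""
    return (set(blast_hits) | set(hmm_hits)) & set(domain_confirmed)


# Hypothetical toy input; IDs are placeholders, not real gene models.
toy_blast = "\n".join([
    "\t".join(["AT_query1", "Dca001", "45.0", "100", "55", "0",
               "1", "100", "1", "100", "1e-30", "120"]),
    "\t".join(["AT_query2", "Dca002", "30.0", "50", "35", "0",
               "1", "50", "1", "50", "0.5", "25"]),
])
toy_hmm_hits = {"Dca001", "Dca003", "Dca004"}
toy_domain_confirmed = {"Dca001", "Dca003"}

members = confirmed_family_members(
    blast_subjects(toy_blast), toy_hmm_hits, toy_domain_confirmed
)
print(sorted(members))
```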

Phylogenetic Analysis and Classification of Carnation DcaMYBs and DcabHLHs
The full-length amino acid sequences of MYBs and bHLHs from A. thaliana and carnation were used for phylogenetic analysis. An unrooted neighbor-joining (NJ) tree was constructed using MEGA11, with a bootstrap test of 1000 replicates to assess clade support [55]. The carnation DcaMYBs and DcabHLHs were classified into different groups according to how they clustered with known MYBs and bHLHs from A. thaliana [19,29,30]. Finally, the phylogenetic tree was rendered in Evolview [56].

Gene Structure and Conserved Motif Analysis of DcaMYBs and DcabHLHs
Multiple Em for Motif Elicitation (MEME) (http://meme-suite.org/tools/meme, accessed on 13 July 2022) was used to identify conserved motifs in DcaMYB and DcabHLH protein sequences. The maximum number of motifs was set to 10, while the other parameters were kept at their defaults. TBtools was used to visualize the results.

Chromosomal Locations of DcaMYBs and DcabHLHs
The chromosomal locations of DcaMYBs and DcabHLHs in the carnation genome were obtained with TBtools according to the annotation data of the carnation genome. The members of the MYB and bHLH gene families were then named according to their positions on the chromosomes.
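Naming by chromosomal position amounts to sorting the genes by chromosome and start coordinate and numbering them in that order, as in the sketch below. The exact ordering rules are assumptions: anchored chromosomes are sorted numerically, genes on unanchored contigs are placed last, and ties are broken by start coordinate. The record layout and gene identifiers are hypothetical.

```python
def name_by_position(records, prefix):
    """Assign sequential names in chromosome/start order.

    records: iterable of (gene_id, chromosome_number_or_None, start_bp),
    where None marks a gene on an unanchored contig (assumed to sort last).
    """
    def sort_key(record):
        _, chrom, start = record
        unanchored = chrom is None
        return (unanchored, 0 if unanchored else chrom, start)

    ordered = sorted(records, key=sort_key)
    return {
        gene_id: f"{prefix}{index}"
        for index, (gene_id, _, _) in enumerate(ordered, start=1)
    }


# Hypothetical toy annotation records.
toy_records = [
    ("g3", 2, 500),
    ("g1", 1, 900),
    ("g2", 1, 100),
    ("g4", None, 50),
]
names = name_by_position(toy_records, "DcaMYB")
print(names)
```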

RNA-Sequencing (RNA-seq) Data Analysis of DcaMYBs and DcabHLHs
To better characterize the changing flower color of carnation, transcriptome data from samples at seven flower development stages of carnation were used to identify differences in expression (PRJNA796118). In addition, transcriptome data of red and white carnation petals at four stages of petal development were downloaded [57]. Clean reads were obtained by removing adapter-containing reads, poly-N-containing reads, and low-quality reads from the raw data. Using HISAT2 2.1.0, we mapped the clean reads to the reference genome published in 2022 [51] with default parameters. The mapping output was processed with StringTie to obtain FPKM (Fragments Per Kilobase of exon model per Million mapped fragments) values for all genes in each sample. TBtools was used to generate the heatmap of expression (average FPKM values) at different stages.
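For reference, the FPKM definition and the per-stage averaging used for the heatmap can be written out in a few lines. StringTie reports FPKM directly, so this sketch only illustrates the arithmetic and is not part of the pipeline; the function names and the numbers are hypothetical.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of exon model per Million mapped fragments."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Equivalent to fragments / (length in kb) / (library size in millions).
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def mean_fpkm(replicate_values):
    """Average FPKM across the replicates of one stage."""
    values = list(replicate_values)
    if not values:
        raise ValueError("at least one replicate is required")
    return sum(values) / len(values)


# Hypothetical example: 500 fragments on a 2 kb transcript
# in a library of 25 million mapped fragments.
example = fpkm(500, 2000, 25_000_000)
print(example, mean_fpkm([10.0, 12.0, 14.0]))
```

Because FPKM normalizes by both transcript length and library size, values for the same gene can be compared across stages, which is what the stage-wise heatmap relies on.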