Genome-wide analysis of basic helix–loop–helix superfamily members related to anthocyanin biosynthesis in eggplant (Solanum melongena L.)

The basic helix–loop–helix (bHLH) superfamily is considered the second largest transcription factor (TF) family. It plays regulatory roles in the developmental processes of plants and in their defense responses. In recent years, many bHLH superfamily genes have been identified and characterized in herbaceous and woody plants. However, comprehensive genomic and functional analyses of these genes in eggplant (Solanum melongena L.) have not been reported. In this study, 121 bHLH TFs were identified in the recently released eggplant genome. The phylogeny, gene structure and conserved motifs of the SmbHLH genes were comprehensively studied. Subsequently, the phylogenetic relationship between the bHLH proteins of eggplant and those of other species was analyzed, and the proteins were classified into 17 subfamilies. Among these, 16 subgroups clustered into the functional clades of Arabidopsis. Two candidate genes (SmbHLH1, SmbHLH117) that may be involved in anthocyanin biosynthesis were screened. The tissue-specific or differential expression of the bHLH genes in different tissues and under various light and temperature conditions suggested differential regulation of tissue development and metabolism. This study not only provides a solid foundation for the functional dissection of the eggplant bHLH gene family but may also be useful for future work on anthocyanin synthesis in eggplant.


INTRODUCTION
Basic/helix-loop-helix (bHLH) transcription factors (TFs) are widely found in animals and plants (Xu et al., 2017a;Xu et al., 2017b). The bHLH domain is approximately 50-60 amino acids in length and consists of two conserved motifs: the basic region located at the N-terminus and the helix-loop-helix region (HLH region) at the C-terminus (Toledo-Ortiz, Huq & Quail, 2003;Li et al., 2006). The basic region contains approximately 10-15 amino acids, six of which are basic amino acid residues involved in DNA binding. The HLH region is mainly composed of hydrophobic residues and participates in the formation of dimers (Meier-Andrejszki et al., 2007). Aside from the two conserved regions, the remaining bHLH protein sequences differ greatly (Morgenstern & Atchley, 1999). With the emergence of genomic sequencing, bHLH superfamily genes in species such as Arabidopsis (Carretero-Paulet et al., 2010), tomato (Sun, Fan & Ling, 2015), apple (Mao et al., 2017), peach (Zhang et al., 2018), and potato (Wang et al., 2018a) have been identified, analyzed, and divided into 15-26 subfamilies on the basis of bioinformatics.
Many studies have shown that bHLH proteins are involved in anthocyanin biosynthesis (Nakatsuka et al., 2008;Zhao et al., 2017;Hu et al., 2016) and in responses to biotic and abiotic stresses (Naing et al., 2018), such as light (Zhang et al., 2018), cold (Yao et al., 2018), and hormonal signals (Liu et al., 2018). They have also been shown to participate in organ development (Carretero-Paulet et al., 2010). Anthocyanin synthesis is regulated by MYB-bHLH-WD40 complexes, and the co-regulation of MYB and bHLH has been extensively studied (Xu et al., 2017a;Xu et al., 2017b). DhMYB2 interacts with DhbHLH1 to regulate anthocyanin production in Dendrobium hybrid petals, and DhbHLH1 is also responsible for the distinct anthocyanin pigmentation in lip tissues. VvbHLH003, VvbHLH007, and VvbHLH010 were found to be related to anthocyanin or flavonol biosynthesis in grapes. The promoters of most genes that are involved in flavonoid or anthocyanin biosynthesis contain a G-box or E-box element that can be recognized by bHLH family members (Wang et al., 2018a;Wang et al., 2018b).
Eggplant (Solanum melongena L.) is an important Solanaceae crop that is widely cultivated throughout the world (Azuma et al., 2008). Eggplant peel occurs in various colors, such as white, purple, green and orange, which makes this plant a good candidate for studying anthocyanin synthesis. Purple eggplants reportedly have a higher anthocyanin concentration than other dark fruits and vegetables; the concentration in purple eggplants is 2.34 times higher than that of grapes and 7.08 times higher than that of red onions (Wu et al., 2006). In addition, the publication of the eggplant genome data provides a valuable resource for the genome-wide analysis of the bHLH family (Hirakawa et al., 2014). Although many bHLHs have been identified and characterized in numerous plants, the bioinformatics of bHLH genes and their function in the anthocyanin synthesis of eggplant have not been reported. In this study, we surveyed the putative bHLH gene family and identified 121 members encoding bHLH TFs. Phylogenetic analyses were performed among the 121 bHLH proteins of eggplant, 152 Arabidopsis proteins, and 14 proteins related to anthocyanin synthesis, which were obtained from other species: 1 tomato protein (Qiu et al., 2016), 1 potato protein (Vincenzo et al., 2014), 2 grape proteins (Hichri et al., 2010), 2 apple proteins (Xie et al., 2012;Xu et al., 2017a), 2 tobacco proteins (Bai et al., 2011), 1 snapdragon protein (Shen et al., 1998), 3 petunia proteins (Shimada, Otsuki & Sakuta, 2007;Gerats et al., 1984), 1 rice protein (Sweeney et al., 2006), and 1 maize protein (Burdo et al., 2014). In addition, the expression profiles of SmbHLHs that may be involved in anthocyanin biosynthesis were analyzed during fruit development and in response to LED light and temperature.
These findings provide the first insights into the possible roles of bHLH proteins in the diversification of plant forms, based on an analysis of the entire bHLH family, and into the possible mechanisms of anthocyanin biosynthesis in eggplant.

Identification and analysis of eggplant bHLH family genes
To identify the bHLH sequences in the eggplant genome, the protein sequence data of the eggplant genome were downloaded from the Eggplant Genome Database (http://eggplant.kazusa.or.jp/). A total of 152 Arabidopsis bHLH protein sequences were obtained from TAIR (https://www.arabidopsis.org/). In addition, the Hidden Markov Model profile for the bHLH binding domain (PF00010) was downloaded from the Pfam database (http://pfam.xfam.org/). The HMMER program was used to search for bHLH proteins among all eggplant proteins with a cut-off E-value of 1e-5, using PF00010 as a query. Using the Arabidopsis bHLH protein sequences as queries, the BLASTP program was used to search the eggplant protein database. The amino acid sequences of the candidate bHLH family genes were thereby obtained. To ensure that the candidate genes were SmbHLH sequences, the candidate sequences were submitted to the Pfam (http://pfam.xfam.org/) and SMART (http://smart.embl-heidelberg.de/) databases for confirmation. Sequences lacking the bHLH domain were excluded.
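The candidate selection described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: it assumes that HMMER results were written with `hmmsearch --tblout` and BLASTP results in tabular format 6, that the two hit lists were pooled (the text does not say whether the union or the intersection was taken), and that the 1e-5 cutoff was also applied to BLASTP. All gene identifiers are hypothetical.

```python
# Sketch: pool HMMER and BLASTP hits, then keep only domain-confirmed candidates.
# Assumptions: pooling by union, same E-value cutoff for BLASTP, hypothetical IDs.

def parse_hmmer_tblout(text, max_evalue=1e-5):
    """Return target names from `hmmsearch --tblout` output that pass the cutoff.

    Column 1 is the target name and column 5 the full-sequence E-value.
    """
    hits = set()
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/footer comments
        fields = line.split()
        if float(fields[4]) <= max_evalue:
            hits.add(fields[0])
    return hits


def parse_blast_tabular(text, max_evalue=1e-5):
    """Return subject IDs from BLAST tabular output (format 6) that pass the cutoff.

    Column 2 is the subject ID and column 11 the E-value.
    """
    hits = set()
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            hits.add(fields[1])
    return hits


def candidate_set(hmmer_hits, blast_hits, domain_confirmed):
    """Pool both searches and keep sequences confirmed to carry the bHLH domain."""
    return sorted((hmmer_hits | blast_hits) & domain_confirmed)
```

Here `domain_confirmed` stands for the set of sequences in which Pfam or SMART reported a bHLH domain; how that set is produced is left outside the sketch.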

Phylogenetic tree analysis, gene structure and conserved motif characterization
The complete amino acid sequences were aligned using MAFFT, and an unrooted phylogenetic tree was constructed using MEGA6 (Tamura et al., 2013) with the following parameters: 1000 bootstrap replications, the p-distance model, and pairwise deletion of gaps or missing data. Only bootstrap values greater than 50 were displayed on the tree. The full-length gene and CDS sequences of the eggplant bHLH genes were downloaded from the Eggplant Genome Website (http://eggplant.kazusa.or.jp/) and prepared in the required input format. The online Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/) was used to analyze the structure of the eggplant bHLH genes and to obtain the number of introns in each gene. The eggplant bHLH amino acid sequences were downloaded and arranged in the desired order. The bHLH protein sequences of eggplant were uploaded to the online search tool MEME (http://meme-suite.org/tools/meme), and the conserved motifs of the proteins were analyzed.
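To make the distance settings concrete, the p-distance with pairwise deletion can be sketched as below. This is only an illustration of the definition (proportion of differing sites among the sites where both sequences have a residue), not a reimplementation of MEGA6; the example sequences and the choice of `-` and `?` as gap and missing-data symbols are assumptions.

```python
def p_distance(seq_a, seq_b, missing="-?"):
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: a site is skipped only if one of these two
    sequences has a gap or missing-data symbol at that position.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a in missing or b in missing:
            continue  # site removed for this pair only
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


# Hypothetical aligned fragments: 5 comparable sites, 2 differences.
example = p_distance("MKR-LS", "MKQALT")
```

With complete deletion, by contrast, any column containing a gap in any sequence would be dropped for all pairs, which usually leaves far fewer sites in a family as divergent as bHLH.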

Expression analysis
Eggplant (Solanum melongena L.) cv. 'Jingqiejingang', 'Changza NO.8' (labeled PP, purple peel eggplant), 'Baiqiezi101' (labeled WP, white peel eggplant), 'Lvyichangqie' (labeled GP, green peel eggplant), and 'Africa Red Eggplant' (labeled O-RP, orange-red peel eggplant) seeds were placed on a damp filter paper and incubated in a dark 28 °C incubator until germination. The germinated seeds were sown in the greenhouse at Shandong Agricultural University. When the eggplant seedlings were at the three-true-leaf stage, three temperature treatments (28 °C, 4 °C, and 40 °C) were applied according to Wang et al. (2016). The leaves for RNA extractions were harvested at 1, 3, 6, and 12 h after the treatments.
To investigate the expression patterns in different tissues, the stems, leaves, petioles, flowers in bloom, and the peels and pulps of the fruit from different developmental stages of four eggplant varieties (PP, WP, GP, and O-RP) were collected simultaneously from 8-week-old plants under natural conditions. To determine the expression response to different light-emitting diode (LED) light qualities, LED red:blue light ratios of 1:1, 3:1, 6:1, and 9:1 were applied. The treatment was conducted according to the method of Di (2017). The leaves and peels of the fruit from the eggplant cv. 'Jingqiejingang' were harvested.
All of the samples collected were frozen immediately in liquid nitrogen and stored at −80 °C until use. To analyze the expression patterns of the SmbHLH genes, qRT-PCR was performed using the qRT-PCR Probe Kit (VAZYME, China) according to the manufacturer's instructions. The β-actin gene (GenBank: JX524155.1) was used as a reference gene. The primers used for qPCR analysis were designed with Primer Premier 5 and are listed in Table S1. The PCR products were sequenced to confirm the specific amplifications.
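The text does not state how relative expression was calculated from the qRT-PCR data. A commonly used approach with a single reference gene such as β-actin is the 2^-ΔΔCt method, sketched below purely as an assumption; the Ct values in the example are hypothetical and are not measurements from this study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    Assumes roughly 100% amplification efficiency for both the target
    and the reference gene (an assumption of the method itself).
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample      # normalize to reference
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control     # compare to control
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier
# in the treated sample than in the control, at equal reference Ct.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
```

In a design like the one described, the control could be the untreated sample or a chosen calibrator tissue; which one is used changes the scale of the reported values but not the ranking of samples.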

Identification and characterization of eggplant bHLH gene family
To identify putative bHLH proteins in eggplant, 152 bHLH protein sequences of Arabidopsis and the bHLH Hidden Markov Model were used. The Pfam and SMART programs were used to remove redundant proteins. A total of 121 genes in the eggplant genome were identified as putative members of the SmbHLH family (designated as SmbHLH1-SmbHLH121) (Table 1). The gene ontology (GO) analysis of the eggplant bHLH genes is presented in Table S2. The GO analysis revealed that SmbHLHs mainly functioned in protein and DNA binding. To further predict the function of these genes, PSORT online software was used to predict subcellular localization. The results showed that the predicted probability of nuclear localization was more than 90.0% (Table 1). The ID number of each gene in the eggplant genome database was used to retrieve the number of amino acids, molecular weight, and isoelectric point of the encoded protein. The annotation information revealed that the length of the SmbHLH proteins ranged from 67 (SmbHLH28) to 796 (SmbHLH18) amino acids, and the molecular weights ranged from 7.93 kDa (SmbHLH28) to 91.16 kDa (SmbHLH18), which indicated that the SmbHLH gene family may have undergone a long evolutionary history and participates in different biological processes. The predicted isoelectric point of the SmbHLH proteins was between 4.32 (SmbHLH119) and 10.11 (SmbHLH106) (Table 1).
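The molecular weights above were taken from the database annotation. For readers who wish to check such values from a sequence, a minimal sketch is given below; it sums standard average residue masses and adds one water molecule for the free termini. The function name is hypothetical and the example peptide is not an eggplant sequence.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153


def molecular_weight_kda(protein_sequence):
    """Approximate average molecular weight of an unmodified protein, in kDa."""
    sequence = protein_sequence.upper().rstrip("*")  # drop a trailing stop symbol
    unknown = set(sequence) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue symbols: {sorted(unknown)}")
    daltons = sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS
    return daltons / 1000.0


# Glycyl-alanine dipeptide, roughly 0.146 kDa.
dipeptide_kda = molecular_weight_kda("GA")
```

As a rough plausibility check, an average residue mass of about 0.11 to 0.12 kDa is in line with the reported 7.93 kDa for the 67-residue SmbHLH28.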

Gene structure and conserved motif analysis of the eggplant bHLH family
A neighbor-joining phylogenetic tree was constructed with MEGA6 (Fig. 1). The genomic sequence and the corresponding cDNA sequence of each SmbHLH gene were submitted together to GSDS to display the gene structure. The number of introns ranged from 0 to 22 (Fig. 1). In addition, Table S3 shows that 120 genes (99.2%) contained fewer than 10 introns: 13 genes (10.7%) had no introns, 20 genes (16.5%) contained one intron, and the remaining genes contained two or more introns. The MEME program was used to identify the conserved motifs of the SmbHLH proteins (Fig. 1). Ten conserved motifs were identified, and the number of conserved motifs per SmbHLH protein ranged from one to nine. Twenty-three genes (19.0%) had only one conserved motif (Table S3). Each of the predicted motifs was identified only once in each SmbHLH protein sequence. In general, SmbHLH proteins on closely adjacent clades of the phylogenetic tree had the same or similar conserved motifs.
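The percentages in this paragraph follow directly from the gene counts and the family size of 121. A short sketch of the arithmetic, using only counts stated in the text (the number of genes with two or more introns is derived as the remainder):

```python
TOTAL_GENES = 121


def share(count, total=TOTAL_GENES, digits=1):
    """Percentage of the gene family, rounded as in the text."""
    return round(100 * count / total, digits)


intron_classes = {"no introns": 13, "one intron": 20}
# Remaining genes contain two or more introns.
intron_classes["two or more introns"] = TOTAL_GENES - sum(intron_classes.values())

intron_shares = {label: share(count) for label, count in intron_classes.items()}
fewer_than_ten_introns = share(120)
single_motif_genes = share(23)
```

By this calculation, the genes with two or more introns number 88, or about 72.7% of the family.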

N-terminal conserved domain analysis of eggplant bHLH protein
To analyze the characteristics of the N-terminal DNA-binding region of the bHLH family in eggplant, a sequence logo of the conserved amino acid residues in the N-terminal region of the bHLH proteins was plotted with WEBLOGO (https://weblogo.berkeley.edu/logo.cgi).
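The height of each stack in a sequence logo reflects how conserved that alignment column is. A minimal sketch of the underlying quantity, the information content in bits, is given below. It omits the small-sample correction that logo software typically applies, and the aligned fragments in the example are hypothetical.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20, gap_chars="-"):
    """Information content (bits) of one alignment column, without
    small-sample correction: log2(alphabet size) minus Shannon entropy."""
    residues = [r for r in column if r not in gap_chars]
    if not residues:
        return 0.0
    total = len(residues)
    entropy = -sum(
        (n / total) * math.log2(n / total) for n in Counter(residues).values()
    )
    return math.log2(alphabet_size) - entropy


def logo_heights(aligned_sequences):
    """Information content for every column of an alignment."""
    return [column_information(column) for column in zip(*aligned_sequences)]


# Hypothetical aligned fragments of a basic region.
heights = logo_heights(["ERRR", "ERKR", "EKRR", "ERRR"])
```

A fully conserved column reaches the maximum of log2(20), about 4.32 bits, which is how residues described as extremely conserved appear in such a logo.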

Phylogenetic tree analysis of the eggplant bHLH protein family
To predict the function of the SmbHLH family members of eggplant, an unrooted phylogenetic tree was constructed using 121 SmbHLH proteins, 152 AtbHLH proteins, and 14 proteins related to anthocyanin synthesis, including 1 potato protein, 1 tomato protein, 2 bHLH proteins from grape, 2 apple proteins, 2 tobacco proteins, 1 snapdragon protein, 3 petunia proteins, 1 rice protein, and 1 maize protein. These bHLH proteins could be classified into seventeen distinct subfamilies based on the clade support values and classification from Arabidopsis (Fig. 3). Group A was the largest subfamily with 35 proteins, whereas the smallest groups, D and M, contained only six proteins. The numbers of eggplant bHLH proteins within each subfamily varied from 1 to 18. The proteins related to anthocyanin synthesis based on the phylogenetic tree were concentrated in group P. Therefore, the genes involved in anthocyanin synthesis in eggplant were probably SmbHLH1 and SmbHLH117. Further experiments are needed to explore and verify their functions in eggplant.
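The inference used here, that eggplant proteins grouping with known anthocyanin-related bHLH proteins are candidate regulators, can be expressed as a simple lookup once every protein has a subfamily label. The sketch below is illustrative only: the reference protein names and the group of the third eggplant entry are placeholders, and only the placement of SmbHLH1 and SmbHLH117 in group P is taken from the text.

```python
def candidates_by_clade(assignments, reference_proteins, prefix="SmbHLH"):
    """Return eggplant proteins that share a subfamily with any reference protein.

    assignments: mapping of protein name -> subfamily label
    reference_proteins: names of proteins with a known role (here, anthocyanin synthesis)
    """
    reference_groups = {
        assignments[name] for name in reference_proteins if name in assignments
    }
    return sorted(
        name
        for name, group in assignments.items()
        if group in reference_groups and name.startswith(prefix)
    )


# Placeholder labels except for SmbHLH1 and SmbHLH117 (group P in the text).
subfamily = {
    "SmbHLH1": "P",
    "SmbHLH117": "P",
    "SmbHLH_other": "A",       # hypothetical eggplant protein outside group P
    "reference_bHLH_1": "P",   # hypothetical anthocyanin-related reference
    "reference_bHLH_2": "P",
}
candidates = candidates_by_clade(subfamily, ["reference_bHLH_1", "reference_bHLH_2"])
```

Shared clade membership suggests, but does not demonstrate, shared function, which is why experimental verification is still required.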

Figure 3 Phylogenetic tree analysis of bHLH between different species.
Phylogenetic tree constructed with bHLH of eggplant, Arabidopsis thaliana and genes related to anthocyanin biosynthesis including one potato gene, one tomato gene, two grape genes, two apple genes, two tobacco genes, one snapdragon gene, three petunia genes, one rice gene and one maize gene. The green circle represents the eggplant bHLH protein, the blue diamond represents the Arabidopsis bHLH protein, and the red box represents the bHLH protein related to anthocyanin synthesis.

Figure 4 ... and pulps (F), respectively. The eggplant β-actin gene (GenBank JX524155.1) was used as an internal control. The PCR primers were designed to avoid the conserved region and to amplify products of 150 to 300 bp. Primer sequences are shown in detail in Table S1. DOI: 10.7717/peerj.7768/fig-4

Expression profiles of eggplant bHLH genes in different colored tissues
Based on the phylogenetic tree (Fig. 3) and the results of previous studies in our laboratory, SmbHLH113, SmbHLH26, SmbHLH9 and SmbHLH10 were also probably involved in anthocyanin biosynthesis. Thus, to gain insight into their roles in anthocyanin biosynthesis in differently colored tissues of eggplant, the expression profiles of six putative SmbHLH genes, namely, SmbHLH1, SmbHLH117, SmbHLH113, SmbHLH26, SmbHLH9, and SmbHLH10, were determined by qRT-PCR. As shown in Fig. 4, the six SmbHLH genes showed different patterns of tissue-specific expression in the different peel color varieties of eggplant. The transcript abundance of SmbHLH1 was higher in the purple tissues of the purple peel eggplant (PP) such as leaves (Fig. 4B), petioles (Fig. 4C), flowers (Fig. 4D) and peels (Fig. 4E), and in the leaves of orange-red peel eggplant (O-RP) (Fig. 4B) and green peel eggplant (GP) (Fig. 4E). However, it was lower in the white and green tissues, such as in the pulps of the four varieties (Fig. 4F), the stems of white peel (WP) and GP eggplant (Fig. 4A), and in the peels of WP, O-RP, and GP (Fig. 4E). Meanwhile, the transcript abundance of SmbHLH117 was higher in green tissues, such as the stems of GP (Fig. 4A), and the leaves of O-RP and GP (Fig. 4B), but lower in the flowers and peels of the four varieties. SmbHLH113 expression was higher in the stems of WP and GP (Fig. 4A), the leaves of O-RP and PP (Fig. 4B), the petioles of O-RP (Fig. 4C), and the pulps of WP, O-RP, and PP (Fig. 4F). The transcript abundance of SmbHLH9 was higher only in the stems, petioles, and flowers of O-RP (Figs. 4A, 4C, 4D). Almost no expression of SmbHLH26 or SmbHLH10 was observed in any tissue of the four varieties.

Expression profiles of eggplant bHLH genes under different LED lights
Light intensity and light quality have different effects on the expression of genes. Under different LED lights (red:blue light ratios of 1:1, 3:1, 6:1, and 9:1), we examined the expression of the above six SmbHLH genes in the leaves (Fig. 5) and peels (Fig. 6) of eggplant. The transcript abundances of SmbHLH1, SmbHLH117, SmbHLH113, SmbHLH26, SmbHLH10, and SmbHLH9 in leaves (Fig. 5), and of SmbHLH113, SmbHLH26, and SmbHLH10 in peels (Figs. 6C, 6D, 6E), were higher under the R:B = 6:1 treatment than under the other treatments. The expression of SmbHLH1, SmbHLH117, and SmbHLH9 in the peels of eggplant (Figs. 6A, 6B, 6F) was higher under the R:B = 9:1 treatment. These results suggested that all of the above genes related to anthocyanin biosynthesis could positively regulate the plant response to an LED red:blue light ratio of 6:1.

Response of eggplant bHLH genes expression to temperature
To investigate the roles of SmbHLH genes under different temperatures, trials at 28 °C (optimum temperature), 4 °C (low temperature), and 40 °C (high temperature) were conducted. The relative expression levels of the above six genes, namely, SmbHLH1, SmbHLH117, SmbHLH113, SmbHLH26, SmbHLH9, and SmbHLH10, were determined by qRT-PCR (Figs. 7-9). At the optimum growth temperature of eggplant (28 °C), there were large changes in gene expression with increasing treatment time. The expression of SmbHLH26 and SmbHLH10 peaked at 6 h (Figs. 7D, 7E) and that of SmbHLH9 at 3 h (Fig. 7F), whereas the expression levels of the other genes were relatively stable. Under treatment at 4 °C, the transcript abundance levels of SmbHLH1, SmbHLH117, and SmbHLH9 were higher after 3 h of cold treatment (Figs. 8A, 8B, 8F), while those of SmbHLH26 and SmbHLH10 were higher after 6 h of cold treatment (Figs. 8D, 8E). The expression levels of the six genes were greatly induced by a high temperature of 40 °C (Fig. 9). The expression levels of SmbHLH1, SmbHLH117, SmbHLH26, and SmbHLH9 peaked at 6 h (Figs. 9A, 9B, 9D, 9F), while those of SmbHLH113 and SmbHLH10 peaked at 12 h (Figs. 9C, 9E). These results suggest that all of the above-mentioned genes related to anthocyanin biosynthesis positively regulate the plant's response to temperature changes, especially to stress caused by low and high temperatures.

DISCUSSION
A number of studies have shown that the bHLH TFs may be involved in responses to multiple stressors, disease resistance, or growth control in plants (Samira et al., 2018;Cheng et al., 2018). The bHLH family is a key determinant of the specification and differentiation of cells in plants and vertebrates (Carretero-Paulet et al., 2010). A total of 121 genes in the eggplant genome were identified as putative members of the SmbHLH family. Subcellular localization prediction indicated that most of the SmbHLH proteins were located in the nucleus (Table 1). The number of bHLH proteins in eggplant was similar to that in other Solanaceae plants, such as potato and tomato: 159 SlbHLH genes were identified in the tomato genome (Sun, Fan & Ling, 2015) and 124 StbHLH proteins were identified in potato (Wang et al., 2018a). Toledo-Ortiz, Huq & Quail (2003) found that the residues Ile-20, Leu-24, Gln-28, Lys-36, Met-50, Ile-55, Val-58 and Leu-61 in the bHLH domains of plants were more conserved than in animals. The results of this study are consistent with previous research, and the residues Arg-7, Leu-18 and Leu-48 showed extreme conservation among the 121 bHLH proteins of eggplant (Fig. 2). These conserved amino acid residues may play an important role in the evolution of eggplant (Wang et al., 2015). The residues Glu-4, Arg-7 and Arg-8 in the basic region of the bHLH domain play an important role in DNA binding (Atchley & Fitch, 1997), and Leu-18 and Leu-48 in the helix-loop-helix regions play an important role in dimerization activity (Simionato et al., 2007). The basic region of the bHLH proteins contained 8 amino acids in eggplant, which was six amino acids shorter than that described by Atchley, Terhalle & Dress (1999).
To date, the biological functions of most SmbHLHs remain unclear. However, approximately 40% of Arabidopsis bHLH proteins have been functionally characterized (Sun, Fan & Ling, 2015). Previous research showed that the bHLH family can be divided into 15-25 subfamilies (Pires & Dolan, 2010). In this study, the phylogenetic tree was constructed with the bHLH domain regions of bHLH proteins of 11 species (121 eggplant proteins, 152 Arabidopsis proteins, 1 potato protein, 1 tomato protein, 2 grape proteins, 2 apple proteins, 2 tobacco proteins, 1 snapdragon protein, 3 petunia proteins, 1 rice protein, and 1 maize protein), and these bHLH proteins were classified into seventeen distinct subfamilies (Fig. 3). This classification is consistent with the subfamily classifications previously reported in phylogenetic analyses of other species (Sun, Fan & Ling, 2015;Zhang et al., 2018;Wang et al., 2018a;Wang et al., 2018b). Members within the same clade may have common evolutionary origins and conserved molecular functions, and may be involved in the same pathway or biological process (Pires & Dolan, 2010). In this study, the eggplant SmbHLH61, SmbHLH83 and SmbHLH21 genes clustered in one clade with the Arabidopsis thaliana AT5G53210, AT3G06120 and AT3G24140 genes, which regulate stomatal development (Liu et al., 2018;Pillitteri et al., 2007). Therefore, we infer that SmbHLH61, SmbHLH83 and SmbHLH21 may also regulate the development of stomata in eggplant leaves. Previous studies have shown that the bHLH subfamily plays important roles in anthocyanin synthesis (Zhang et al., 2018;Zhao et al., 2017). The two eggplant orthologs, SmbHLH1 and SmbHLH117, clustered in the same subgroup as the genes involved in anthocyanin synthesis, so SmbHLH1 and SmbHLH117 may be involved in regulating anthocyanin biosynthesis in eggplant.
The results showed that the transcript abundance of SmbHLH1 was higher in the purple tissues of the purple peel eggplant such as leaves, petioles, flowers and peels. However, it was lower in the white and green tissues. Meanwhile, the transcript abundance of SmbHLH117 was higher in green tissues, but lower in the flowers and peels of the four varieties (Fig. 4).
Light is an inducing factor for anthocyanin synthesis and can increase the content of anthocyanins in most plants (Mancinelli, 1985). Light intensity and the light environment have different influences on anthocyanin synthesis. Light can increase MdbHLH33 expression levels in apple and promote the accumulation of anthocyanins in the skin (Takos et al., 2006). In this study, the results suggested that SmbHLH1, SmbHLH117, SmbHLH113, SmbHLH26, SmbHLH10, and SmbHLH9, which are related to anthocyanin biosynthesis, could positively regulate the plant response to an LED red:blue light ratio of 6:1 (Figs. 5 and 6). In addition, anthocyanin synthesis in plants is affected by temperature (Lin-Wang et al., 2011). Low temperature can stimulate the accumulation of anthocyanins by up-regulating the expression of biosynthetic genes (Crifo et al., 2011). Our results showed that all of the above-mentioned genes related to anthocyanin biosynthesis responded positively to temperature stress at 4 °C (Fig. 8) and 40 °C (Fig. 9).

CONCLUSIONS
A total of 121 SmbHLH genes were identified from the eggplant genome, and their gene structures and conserved protein motifs were characterized. Phylogenetic comparisons of the bHLH gene families of eggplant and other species revealed two SmbHLH genes (SmbHLH1 and SmbHLH117) related to anthocyanin biosynthesis in eggplant. Six SmbHLH genes related to anthocyanin biosynthesis showed different expression patterns in various tissues of different eggplant varieties and under different LED light qualities and temperature conditions. These findings provide comprehensive information for further analysis of the biological function and evolution of the SmbHLH gene family in eggplant.

ADDITIONAL INFORMATION AND DECLARATIONS
Funding
This work was supported by the National Natural Science Foundation of China (31672169) and Science and Technology Innovation Team of Shandong Agricultural University "Double First Class"-Facility Horticulture Advantages Team (SYL2017YSTD07). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Grant Disclosures
The following grant information was disclosed by the authors: National Natural Science Foundation of China: 31672169.