Heat shock transcription factor (Hsf) gene family in common bean (Phaseolus vulgaris): genome-wide identification, phylogeny, evolutionary expansion and expression analyses at the sprout stage under abiotic stress

Background Common bean (Phaseolus vulgaris) is an essential crop with high economic value. The growth of this plant is sensitive to environmental stress. Heat shock factor (Hsf) is a family of antiretroviral transcription factors that regulate plant defense system against biotic and abiotic stress. To date, few studies have identified and bio-analyzed Hsfs in common bean. Results In this study, 30 Hsf transcription factors (PvHsf1–30) were identified from the PFAM database. The PvHsf1–30 belonged to 14 subfamilies with similar motifs, gene structure and cis-acting elements. The Hsf members in Arabidopsis, rice (Oryza sativa), maize (Zea mays) and common bean were classified into 14 subfamilies. Collinearity analysis showed that PvHsfs played a role in the regulation of responses to abiotic stress. The expression of PvHsfs varied across different tissues. Moreover, quantitative real-time PCR (qRT-PCR) revealed that most PvHsfs were differentially expressed under cold, heat, salt and heavy metal stress, indicating that PvHsfs might play different functions depending on the type of abiotic stress. Conclusions In this study, we identified 30 Hsf transcription factors and determined their location, motifs, gene structure, cis-elements, collinearity and expression patterns. It was found that PvHsfs regulates responses to abiotic stress in common bean. Thus, this study provides a basis for further analysis of the function of PvHsfs in the regulation of abiotic stress in common bean. Supplementary Information The online version contains supplementary material available at 10.1186/s12870-021-03417-4.

to improve the tolerance of common bean under abiotic stress.
Heat shock transcription factor (Hsf ) regulates signal transductions associated with heat stress in plants [5]. Hsfs participate in signal reception and transmission, regulation of gene expression, resistance to stress and heat tolerance in plants. Members of the Hsf family have five conserved domains [6]: DNA-binding Domain (DBD), oligomerization domain, (OD), C-terminal activation domain (CTAD), nuclear localization signals (NLS), and nuclear export signals (NES) [7]. Among them, DBD and OD domain are the most evolutionary conserved and are used to accurately identify and combine the promoters of heat shock protein (Hsp) [8]. The C-terminal activation domain contains a repressor domain: LFGV-peptide [9], which directly regulates the expression of genes in response to heat shock. The NLS and NES modulate the transnuclear transport of Hsf protein and free distribution of proteins in the nucleus and cytoplasm. Hsf can activate the heat shock protein (Hsp) thereby promote refolding, assembly, distribution and degradation of damaged proteins. In this way, it helps plants to resist abiotic stress [10].
The first Hsf gene was cloned in yeast. Since then, several mammalian Hsf genes have been identified. In plants, the first Hsf gene was cloned in tomato (Solanum lycopersicum) [6]. To date, Hsfs have been reported in several species, including Arabidopsis [11], rice (Oryza sativa) [12], soybean (Glycine max) [13], maize (Zea mays) [14] due to the continuous improvement of genome sequencing technology. The number of Hsf genes varies across plants. For instance, wheat, soybean, maize and Arabidopsis have 56, 52, 30 and 21 genes, respectively [7].
Hsf gene members are involved in the regulation of growth and development in plants. Carotenoids and chlorophyll have been shown to promote plant photosynthesis in transgenic tobacco by upregulating HsfA9 expression [15]. A total of 14 Hsfs have been identified in Citrus, all of which participate in fruit development and maturation. For instance, CrHsfB2a and CrHsfB5 were found to regulate citrate content [16]. Various stimuli of biotic and abiotic stress can upregulate expression level of Hsfs, hence influence plant response to stress. The expression of HsfA2, HsfB1, HsfA4a, HsfB2a, HsfB2b and HsfA7a in Arabidopsis has been shown to increase with temperature. However, high temperatures do not increase expression of some Hsfs [11]. Previously, 19 Hsf members, including VHsf01, VHsf05, VHsf15 and VHsf18 were found to exhibit different gene expression patterns between resistant and susceptible grape species under high-temperature stress (47 °C) and heat adaptation (38 °C). The differential expression of these genes may explain differences in heat tolerance among various grape species [17]. Different types of biotic and abiotic stresses including cold, drought, pests and diseases increase the expression of Hsf members, such as HsfB3a, HsfA6a, HsfA2a and HsfA9b in Cassava (Manihot esculenta Crantz) [18]. It has also been reported that salinity, osmotic pressure and cold stress increase the expression of HSFA6b, an ABA-mediated stress response gene in Arabidopsis, [19]. Heat, drought and salt stress similarly alters the expression level of Hsf genes in Populus euphratica [20].
In the study, 30 members of Hsf gene family were identified in common bean. These members' phylogeny, motif, gene structure, evolutionary expansion and expression patterns of the identified genes were analyzed at the sprout stage. Our bionformatic analysis provide insights into PvHsfs and give useful information for further functional dissection of PvHsfs in common bean.

Identification of Hsf members in P. vulgaris
The hmm search identified 35 protein sequences in the reference genome of P. vulgaris. Among them, 30 were found to be members of the Hsf family after removal of duplications. The 30 protein sequences were located on nine chromosomes of P. vulgaris except LG10 and LG11. The Hsf members were named based on their chromosomal location (e.g., PvHsf1-PvHsf30) (Fig. 1). PvHSF06 had the longest protein length (490) and CDS length (1470) whereas PvHSF12 had the shortest protein length (206) and CDS length (618). The isoelectric point of PvHsf members ranged from 4.82 to 9.08, and the molecular weight ranged from 23,907.29 to 54,538.32. The detailed information of PvHsf members is shown in Table S1.

Evolutionary analysis of PvHsfs
MEGAX software was used to analyze the protein sequence of PvHsf members using the Maximum Likelihoodphy function and jtt + g + i model predicted by MEGA. The 14 subgroups were divided based on the analysis result of PvHsf members (Fig. 2). In these 14 subgroups, the subgroup VI, VII and VIII had only one PvHsf member. The subgroup XIV had the highest number of PvHsf members (4).

Motifs and gene structure of PvHsfs
Using MEME tool, 10 motifs of PvHsf members were identified (indicated using squares with different colors) and the sequence of each motif is shown in Fig. S1. The location of squares represents the location of each motif ( Fig. 3A and B). Motifs-1, 2, 3 and 5 were identified in all subgroups of PvHsf. Subgroup I, II, III and IV had similar motifs which were found in different locations.
Motif-10 was not found in subgroup V while motif-9 was only found in subgroup IX and X. Motif-7 was found in subgroup XIV whereas motif-8 was present in subgroup XI, XII, XIII and XIV. Each subgroup PvHsf members had similar motifs. The structures of exons and introns of PvHsfs were determined using the GSDS (Fig. 3C). Each subgroup members had a conserved Hsf domain (pink box), and the location of exon and intron were similar in each subgroup.

Evolutionary analysis of four species
In this study, members of PvHsf genes were identified in P. vulgaris whereas members of Hsf genes in plants, including monocotyledonous (O. sativa and Z. mays) and dicotyledonous (Arabidopsis) were retrieved from published research [11,12,14]. Also, the jtt + g + i model was the best model to find the evolutionary relationship between Hsf members in four species, which predicted by MEGAX. The four species Hsf members were also devided into 14 subgroups while subgroup III and XI had not monocot members; 20 motifs were identified from the four species Hsf members in motif analysis (Fig. S2), although the order of motifs had changed compared with PvHsfs, the structure of motifs was similar. Motif-1, 2 and 3 encoded DBD domain while motif-9 encoded -LFGV-motif (C-terminal activation domain); Most Hsf members also had two exons, which the results of gene structure was similar with PvHsfs; The results of evolution, gene structure and motif analyses revealed that members of Hsf in each subgroup had similar motifs and gene structure (Fig. 4).

Cis-elements analysis of PvHsfs
Using plantCARE, 10 cis-elements were identified in PvHsfs. These cis-elements found to be involved in the regulation of hormone responsiveness, environmental stress and germination (Table S2). Elements marked in red such as ARBE, TCA-element, TATC-box, P-box, TGA-element and AuxRR-core were hormone-related elements; Three elements marked in blue (LTR, ARE, MBS) were stress-related elements whereas CAT-box was the germination-related element. These results showed that PvHsfs might regulate hormone responsiveness, environmental stress and germination (Fig. 5).

Collinearity and Ka/Ks
PvHsfs had 16 pairs of collinearity. PvHsf22 and PvHsf27 exhibited the most pairs (3 pairs) (Fig. 6A). Compared to Arabidopsis (1) and rice (1), PvHsfs in soybeans had more collinear genes (36), indicating a close association between the common bean and soybean (Fig. 6B). KA/KS analysis revealed that there were no two pairs of PvHsfs (Table S3), indicating that PvHsfs eliminated harmful mutations, while maintaining the protein (purity selection).

In Silico tissue-specific expression analysis
Tissue-specific expressions of PvHsfs were obtained from the phytozome database. The expression levels of PvHsfs in flowers, flower buds, young pods, leaves, green mature pods, stems and roots were as shown in Fig. 7. All PvHsfs members were specifically expressed in different tissues, with various PvHsfs members exhibiting marked variations in expression in different tissues. For instance, compared to other tissues, PvHsf02 was highly expressed in green mature pods; Compared with other different tissues, PvHsf05 was highly expressed in roots and pods higher than other tissues; PvHsf03 was highly expressed in flowers, stems and young pods; PvHsf07 levels were suppressed in flowers, flower buds and young pods while PvHsf28 levels were elevated in leaves, stems and roots.

Tissue-specific expression analysis at the sprout stage
The expression of PvHsfs in cotyledon, hypocotyl and radicle at the sprout stage was also analyzed. The expression levels of PvHsfs were specific in the cotyledon, hypocotyl and radicle. At the sprouting stage, expression levels were relatively low in the hypocotyl, relative to cotyledons and radicles (Fig. 8). Some PvHsfs, including PvHsf01, PvHsf03, PvHsf09, PvHsf17, PvHsf21 and PvHsf22 were highly expressed in the cotyledons while some PvHsfs, including PvHsf05, PvHsf10, PvHsf16 and PvHsf24 were highly expressed in both cotyledons and radicles. These results imply that cotyledons and radicles are potential target tissues for studies on PvHsfs at the sprouting stage.

Stress-associated expression levels
QRT-PCR was used to assess the expressions of PvHsfs under heat and cold stress conditions. Stress dysregulated the expression levels of PvHsfs, with specific characteristic expressions under different stressors. Under heat stress, PvHsfs levels, apart from PvHsf15 levels, were markedly elevated under heat stress except PvHsf15 (Fig. 9). In CK and under cold stress conditions, there were no significant changes in the expressions of dome PvHsfs, such as PvHsf01, PvHsf03, PvHsf05, PvHsf16 and PvHsf29. However, these levels were significantly altered under heat stress, indicating that these PvHsfs are highly responsive to heat stress, when compared to cold stress.
QRT-PCR was also performed to evaluate the expressions of PvHsfs under salt and heavy metal stressors (Fig. 10). Under salt stress conditions, PvHsf21 and PvHsf22 levels were significantly elevated, however, PvHsf01, PvHsf03, PvHsf09, PvHsf17 and PvHsf24 levels were markedly suppressed. Moreover, exposure to Cd 2+ stress suppressed the levels of most PvHsfs, apart from PvHsf09, PvHsf21 and PvHsf22. Under salt and heavy metal stressors, levels of most PvHsfs were significantly dysregulated, apart from those of PvHsf03. PvHsf01, PvHsf17, PvHsf21, PvHsf22 and PvHsf24. Therefore, PvHsfs can be used as candidate members of the Hsf family for studies at sprouting stages under stress.

Discussion
Various species have different Hsf members. There are 29 Hsf members in Tartary buckwheat (Fagopyrum tataricum) [21], 28 members in poplar (Populus) [22], 21 members in Arabidopsis [23], 19 members in grapes (Vitis vinifera) [24], 18 members in tomatoes (Solanum lycopersicum) [7] and 16 members in alfalfa (Medicago sativa) [22]. Duplication of gene family members during plant evolution is associated with genomic replication events [17]. In this study, 30 Hsf family members were identified in the genome of the common bean (PvHsf1-PvHsf30) (Fig. 1). Compared to other dicotyledons, the common bean was established to have more Hsf members. These PvHsfs were all found to be located in 11 linkage groups. Therefore, based on these results, after differentiation from early ancestors, the common bean may have had more genomic replication events.
Introns regulate gene expressions [29]. Therefore, it is important to elucidate on gene functions by analyzing  [30]. Analyses of gene structures of PvHsf members revealed that all PvHsfs had more than one intron. However, intron lengths were different for each subgroup, and each subgroup member exhibited a similar structure. The shortest introns were in subgroup XIII, while the longest introns were in subgroup V. Comparable findings have been reported in Soybean [31], Hypericum perforatum [32] and cotton [33].
Rice and maize are monocotyledons, while common bean (P. vulgaris) and Arabidopsis are dicotyledons. Phylogenetic tree analysis revealed that Hsf members in the above four species were in the same subgroups (14) while subgroup III and XI had no monocot members, which could be attributed to evolution of plants into monocots and dicots. Hsfs of the same subgroup in these four species exhibited similar gene structures. However, the order of motifs in the four species exhibited some differences. Same subgroup Hsfs had similar motifs, such as motifs-1, 2, 3 (DBD domain) as well as motif-8 (C-terminal domain) and they also exhibited similar motif compositions, implying that Hsf members play similar functions in proteins. Comparable outcomes have been found among Hsf members in maize [34], moso bamboo (Phyllostachys edulis) [35], pumpkin (Cucurbita moschata) [36] and tea plant (Camellia sinensis) [37]. Moreover, Hsf members appear before plants differentiate into monocotyledons or dicotyledons, and same subgroup members exhibit similar motifs and evolutionary relationships [21].
The cis-elements in promoter regions of gene family members regulate the expressions of metabolic pathway-related genes [38]. In this study, seven ciselements related to hormones (such as ARBE, TATCbox, P-box, AuxRR-core, TCA and TGA elements) were identified, indicating that PvHsfs may be involved in the roles of hormones in plant growth and development. Three stress-related cis-elements (LTR, ARE and MBS elements) were found in different PvHsfs, including PvHsf09, PvHsf10, PvHsf15, PvHsf17 and PvHsf24. Under abiotic stress conditions, the expression levels of these PvHsfs exhibited significant changes (Figs. 9 and 10), confirming that stress-related cis-acting elements are responsive to abiotic stress. Also, Hormone-related cis-elements, including ABRE, AuxRR-core, P-box and TGA-elements, have been found in carnation (Dianthus caryophyllus) [39] and Hypericum perforatum [32]. Hsf members while three stress-related cis-elements (LTR, ARE and MBS elements) have also been found in Hsf members from Brassica juncea [40] and Hypericum perforatum [32]. Moreover, Hsf members in carnation (Dianthus caryophyllus) [39], Brassica juncea [40], Hypericum perforatum [32] and tea plants (Camellia sinensis) [37] had a CAT-box element. Under abiotic stress, the expressions of Hsf members exhibited some variations. Collinearity analysis revealed that PvHsfs had more pairs of homologous genes in soybean (37) than in Arabidopsis (1) and rice (1), indicating that PvHsfs are closely associated with legumes. The collinear gene of PvHsfs in Arabidopsis, AT2G41690, has been reported to exert some effects during abiotic stress [41], indicating that PvHsfs influence abiotic stress responses in plants.
Gene expression patterns can be used to investigate the biological functions of various genes [42]. In crops, Hsf members exhibit tissue-specific expressions. For instance, DcaHsfs exhibit different expression patterns in carnation, as well as in members of the same class [39]. StHsf genes are highly expressed in various potato (Solanum tuberosum) tissues [43]. The expressions of PvHsfs in the phytozome database exhibit tissue-specificity (flowers, flower buds, young pods, leaves, green mature pods, stems and roots), indicating that PvHsfs are involved in plant growth and development.
During plant growth, the sprouting stage is the first and most important stage. It directly affects plant development and yield. Moreover, during abiotic stress, it is the most sensitive stage [44,45]. As a result, studies have evaluated the effects of stress on the sprouting stage [46]. Herein, PvHsfs levels at the sprout stage were specific in the cotyledon and radicle, indicating that these tissues can be used as target tissues to assess PvHsfs at the sprout stage.
Stress, especially abiotic stress, alters the expressions of Hsf members [7]. For instance, heat stress alters the expressions of CarHsfs in chickpea (Cicer arietinum) [47]. Moreover, various stresses dysregulate the expression levels of CsHsf genes in tea plants [37]. In pumpkins (Cucurbita moschata), cold and heat stress have been shown to significantly alter the expressions of some CmHsfs [36]. Abiotic stressors, such as heat, cold, salt, drought and cadmium have been shown to alter the expressions of most TaHsfs in bread wheat (Triticum aestivum) [48]. Collectively, these findings imply that Hsf

Conclusions
In summary, this study identified 30 members of PvHsf from the reference genome and comprehensively analyzed the location, motifs, gene structure, cis-elements and collinearity among PvHsf members. The PvHsfs' expression from the phytozome database and the analysis at the sprout stage in different tissues all revealed that PvHsfs had the tissue-specific expression. In addition, the expression of PvHsfs under heat, cold, salt and heavy metal stress showed PvHsfs might regulate responses to abiotic stresses in common bean. This study lays the foundation for further identification of PvHsfs and adds to our understanding on the role of PvHsfs in the regulation of abiotic stress resistance in common bean.

Identification of Hsf members in P. vulgaris
Genomic data (genes, cDNAs and proteins) of P. vulgaris (PhaVulg1_0) was derived from the ensembl plants database [49] while data of Hsf protein domain (PF00447) was obtained from the PFAM database [50]. These data were screened using the HMMER software [51] to identify Hsf members. In addition, ExPASy Proteomics Server [52] and Plant Protein Phosphorylation DataBase (P3DB) [53] were screened to identify the Hsf members in common bean (PvHsf ). The location of PvHsf members was mapped based on the reference genome and named depending on their chromosomal location using TBtools [54].

Analysis of Hsf members in P. vulgaris
MEGA X [55] was used to align protein sequences of PvHsf and Hsf members in three species (Arabidopsis, maize, rice) reported previously [11,12,14]. Maximum Likelihoodphy analysis was performed using 1000 replicates as bootstrap values and the jtt + g + i model predicted by MEGA. The MEME tool [56] was used to identify motifs with E-value of less than 1e − 20 and 10-50 amino acids numbered based on their corresponding E-values. The gene structure of PvHsfs was analyzed using Gene Structure Display Server (GSDS) [57]. Gene-wise [58] was used to determine the coordinates corresponding to DNA (containing exon and intron) and protein sequences. The cis-acting elements of PvHsfs were uncovered using the plantCARE software [38]. Circos software [59] was used to analyze gene duplication events in PvHsfs via the MCScanX function [60]. The expression of PvHsfs was visualized using a heatmap constructed using TBtools (phytozome data) [61]. All databases and software links are shown in Table S4.

Preparation of plant materials and qRT-PCR analysis
A locally-grown common bean variety "Longjiang Ziyun" was obtained from the National Coarse Cereals Engineering Research Center (Daqing, Heilongjiang, China). The seeds were placed in an incubator away from light at 26 °C to allow sprouting. The plants were separately exposed to different stress treatments on the fifth day. Cold stress was induced by exposing plants to a temperature of 4 °C and heat stress was induced by exposing plants to a temperature of 45 °C [62,63]. Salt stress was triggered by treatment with 70 mmol/L (NaCl), while heavy metal stress was simulated by exposing plants to 0.5 mg/L (CdCl 2 ) and 60 mg/L (HgCl 2 ) [64][65][66]. For control (CK) treatment, hypocotyl, radicle and cotyledon were collected as for samples in the analysis of tissue-specific expression. The radicles under abiotic stress treatments were collected as samples respectively while the CK was served as the control tissue sample. RNA Easy Fast Kit (DP452, Tiangen, Beijing) was used to extract RNA and cDNA was obtained by total RNA reverse transcription using HiScript SuperMix was used to extract for qPCR (+gDNA wiper) (R223-01, Vazyme, Nanjing). The Primer premier 5.0 software (PREMIER Biosoft, San Francisco, USA) was used to design primers of PvHsf members (Table S5). In this experiment, Pvactin11 gene served as the internal reference gene [67]. The expression of each PvHsf member was determined through qRT-PCR using the Light Cycler system (Roche 480II, Roche, Switzerland) and TransStart ® Top Green qPCR SuperMix (AQ131-04, TransGen Biotech, Beijing). For each treatment, three biological replicates were prepared, and for each sample, three technical replicates were prepared. The relative mRNA expression was calculated as previously described [68].