Genome-wide analysis of growth-regulating factors (GRFs) in Triticum aestivum

The Growth-Regulating Factor (GRF) family encodes a type of plant-specific transcription factor (TF). GRF members play vital roles in plant development and stress response. Although GRF family genes have been investigated in a variety of plants, they remain largely unstudied in bread wheat (Triticum aestivum L.). The present study was conducted to comprehensively identify and characterize the T. aestivum GRF (TaGRF) gene family members. We identified 30 TaGRF genes, which were divided into four groups based on phylogenetic relationship. TaGRF members within the same subgroup shared similar motif composition and gene structure. Synteny analysis suggested that duplication was the dominant reason for family member expansion. Expression pattern profiling showed that most TaGRF genes were highly expressed in growing tissues, including shoot tip meristems, stigmas and ovaries, suggesting their key roles in wheat growth and development. Further qRT-PCR analysis revealed that all 14 tested TaGRFs were significantly differentially expressed in responding to drought or salt stresses, implying their additional involvement in stress tolerance of wheat. Our research lays a foundation for functional determination of TaGRFs, and will help to promote further scrutiny of their regulatory network in wheat development and stress response.


INTRODUCTION
Growth-Regulating Factors (GRFs) are plant-specific transcription factors that play important roles in regulating plant growth and abiotic stress response (Kim et al., 2012;Baucher et al., 2013). The first GRF gene OsGRF1 was identified from rice, where it was shown to play an essential role in regulating gibberellic acid (GA)-induced stem elongation (Van der Knaap, Kim & Kende, 2000). In recent years, with the development of reference genomes, many GRF genes have been identified and characterized from plant species at genome-wide levels (Omidbakhshfard et al., 2015). Protein sequence analysis has determined that there are two conserved domains, QLQ (Gln, Leu, Gln) and WRC (Trp, gene structure, and conserved motif and domain prediction. In total, 30 TaGRFs were identified from the wheat genome. On this basis, the gene expression patterns of wheat GRF were analyzed based on RNA-seq data from different wheat tissues. The expression patterns of TaGRFs under drought and salt stresses were also analyzed by qRT-PCR.

Chromosomal location and gene duplication of TaGRFs
The wheat genome GFF3 gene annotation file was from the wheat database IWGSC v1.1 (https://wheat-urgi.versailles.inra.fr/Seq-Repository/Assemblies). Gene structure annotations of TaGRFs were extracted from the GFF3 file. The start and end location information of the TaGRFs in the corresponding chromosomes was used to draw the physical map by the software MapInspect (Fang et al., 2019). The orthologous genes from wheat and its subgenome donor were identified by the common tool ''all against all BLAST search''. The cutoff values (e-value <10 −10 , identity >80%) were used to assure the reliability of the orthologues. Then we used Multiple Collinearity Scan toolkit (MCScanX) to depict their homology relationships (Wang et al., 2012). Synteny diagrams were generated using the R package ''circlize''. Gene duplication events were divided into tandem duplication events and segmental duplication events. Tandem duplication events were determined by the following evaluation criteria: (1) length of the aligned sequence >80% but of each sequence, (2) identity >80%, (3) threshold ≤ 10 −10 , (4) only one duplication can be recognised when genes are tightly linked; and (5) intergenic distance is less than 25 kb. When genes passed the criteria for (1), (2), and (3), but were on a different chromosome, they were deemed to be segmental duplications (Fang et al., 2020;Jiang et al., 2020).

Gene Ontology annotation in TaGRF family genes
The functional annotation of GRF sequences and the analysis of annotation data were performed using Blast2GO (http://www.blast2go.com) (Conesa et al., 2005). The fulllength amino acid sequence of the TaGRF proteins were uploaded to the original program, drawn and annotated. The program provides the output defining three categories of GO classification namely biological processes, cellular components, and molecular functions.

Multiple conditional transcriptome analysis of TaGRFs
The multiple transcriptome data were downloaded from the Wheat Expression Browser (http://www.wheat-expression.com/) (Ramírez-González et al., 2018); and the heat maps of TaGRFs were generated using the R package ''pheatmap''.

Growth and stress treatment of wheat seedlings
Seeds of Emai 170 (a hexaploid common wheat cultivars) were sterilized on the surface with 1% hydrogen peroxide, rinsed thoroughly with distilled water, and germinated in an incubator at 25 • C for 2 days (He et al., 2020;Ma et al., 2016). According to the reported method, the seedlings were transferred to 1/2 strength Hoagland nutrient solution and cultured in continuous ventilation (Yin et al., 2019;Zhu, Gong & Yin, 2019). After five days (when wheat seeding reached the stage of one heart and one leaf), 85.5 mM NaCl and 82.5 mM mannitol were applied to seedings. Every two days, 2 M KOH or 0.4 M H 2 SO 4 was used to adjust the pH of culture solution to 6.0. During the application, the plants were grown at 16 h/8 h (day/night) and 25 • C. Leaves and roots were collected at 2 h, 4 h, 8 h, 12 h, 24 h, 96 h and 144 h after treatments. Three biological repeats are included for each treatment. Finally, the samples were immediately frozen with liquid nitrogen and stored at −80 • C.

RNA isolation and qRT-PCR analysis
According to the manufacturer's instructions, total RNA of samples were extracted by TRizol reagent (Invitrogen, USA) and cleansing DNA with DNaseI (TaKaRa, USA). The first cDNA was reverse-transcribed from RNA by RevertAid Reverse Transcriptase (Vazyme, China). Gene-specific primers were designed using Primer 5.0; and the ADP-ribosylation factor Ta2291 (F: GCTCTCCAACAACATTGCCAAC, R: GCTTCTGCCTGTCACATACGC) was used as an internal reference gene for qRT-PCR analysis (Paolacci et al., 2009). The qRT-PCR reaction system and protocol were carried out as manufacturer's instructions for SYBR R (Vazyme, China). For each sample, settings included three technical replicates. Relative gene expression level was calculated using the 2 − Ct method (Yin et al., 2018b).

Identification and analysis of wheat GRF transcription factor gene family members
For identification of GRF TF genes in wheat, both BLAST and Hidden Markov Model (HMM) searches were performed. The 35 known GRF proteins (Table S1), including Arabidopsis (9), maize (14), rice (12) as the query sequences to conduct BLASTp against the wheat reference genome IWGSCv1.1. Using the HMM of WRC (PF08879) and QLQ (PF08880) domains were used as the query sequences for HMMER3.0 searching. The candidate proteins were verified by NCBI CDD and SMART Online Tools to determine that the TaGRF contained both WRC and QLQ domains. Finally, a total of 30 TaGRFs were identified from the wheat genome. We named wheat GRF genes (TaGRFs) according to the naming rule of Schilling et al. (2020); the corresponding gene IDs are shown in Table 1. Using the same method, we identified 6, 10, and 18 GRFs from T. urartu, Ae. tauschii and T. dicoccoides, respectively (Table S2). The deduced polypeptides ranged in length from 206 (TaGRF2-2A) to 611 (TaGRF4-4B) amino acids, with the predicted molecular weights ranging between 21.6 to 64.2 kDa. Their isoelectric points ranged from 4.72 (TaGRF1-2B) to 10.23 (TaGRF2-2A). Their instability parameters were between 41.46 According to the phylogenetic relationships ( Fig. 1), the 30 TaGRFs could be divided into four sub-categories (Group I to IV). Group I consisted of a single member, TaGRF7-6A. TaGRF2-2D, TaGRF8-6A, TaGRF8-6B, and TaGRF8-6D. Group III consisted of TaGRF4-4A, TaGRF5-4A, TaGRF4-4B, TaGRF5-4D, and TaGRF4-4D. The remaining TaGRF genes were classified in Group IV. The TaGRF gene structure map showed that all wheat GRF gene members contain 1 to 4 introns, with the majority having 2 to 3 introns (Fig. 1, and Table S3). There were 2 to 5 exons, with most TaGRF having 2 to 4 exons. The exon number of TaGRF genes within same group were relatively consistent.
Conservative motif analysis indicated that TaGRF protein domains are highly conserved among the 30 members. Each member contains only two structural domains: WRC (Motif 1) and QLQ (Motif 2) (Fig. 1). Lengths and the most matching sequences of 20 motifs were shown in Table S4.
In order to further analyze the conservation degree of QLQ and WRC domains in TaGRFs, we performed multiple sequence alignment of these two domains. The results indicate that, as highlighted in Fig. 1, the QLQ and WRC motifs are highly conserved. The N-terminal QLQ motif was conserved with one Leu and two Gln in all the TaGRF proteins. The WRC motif was also highly conserved with one Trp, Arg, and Cys in each of the TaGRF proteins. A zinc finger motif (CCCH) was also found within the WRC domain in all TaGRF proteins (Fig. 2).

Chromosome localization of wheat TaGRF genes
Based on the GFF3 genome reference files, the chromosome map of TaGRF genes was generated using MapInspect software (Fig. 3, and Table S3). The three sub-genomes A, B, and D contained 15, 11, and 10 TaGRFs, respectively. But the TaGRFs are not uniformly distributed among chromosomes (chromosome 2, 9; chromosome 4, 6; chromosome 6,

TaGRF gene promoter Cis-element analysis
Analysis of the cis-elements in the promoter sequence was important for understanding the regulatory functions of genes. The cis-acting element analysis was performed in Plant-CARE by using upstream sequences (1.5 kb) of TaGRF genes extracted from the wheat genome. The detailed information including function and location were displayed in Table S5. The results showed that all 30 TaGRF genes contained several TATA boxes and CAAT boxes, indicating that TaGRF genes can be normally transcribed. When focusing on the cis-acting elements associated with wheat growth and development, hormonal and stress responses, it can be seen in Fig. 5 the TaGRF gene promoter contains a large number of cis-elements, with the largest number found in TaGRF9-6D having 19 cis-elements, and TaGRF3-2A with the least, containing only 8 cis-elements. There were several different light-related elements in these cis-elements, such as AE-box, Box 4, I-box, C-box, Sp1, circadian, CAG-motif, 3-AF1 binding site, LAMP-element, TCT-motif, GATT-motif, ATCT-motif, and Gap-box. This suggests that the GRF gene family may play a role in light response.
In addition, a large number of responsive hormones and stress-related cis-elements were found in the promoter region of the TaGRFs, including auxin (11 TGA-elements), gibberellin (8 GARE-motifs and 6 P-boxs), jasmonic acid methyl ester (65 TGACG-motifs), abscisic acid (72 ABREs) and other hormone response components, as well as anaerobic induction (25 AREs), drought (19 MBSs) and low temperature (3 LTRs) and other stress response cis-elements. This suggests a potential role for the wheat GRF family in wheat growth and development and in a variety of hormones and stress.

Gene Ontology annotation in TaGRF family genes
The GO item analysis was performed using Blast2Go and the results indicated the putative participation of 30 TaGRF proteins in diverse biological processes (Fig. 6, and Table S6). Total ten different GO items of biological processes were defined. Majority of the TaGRFs were predicted to function in 'regulation of transcription, DNA-templated (GO: 0006355)' (76.67%), followed by 'response to deep water (GO: 0030912)' (20%) and 'response to gibberellin (GO: 0009739)' (20%). Molecular function prediction showed that about 76.67% of the TaGRFs were evidenced to participation of 'ATP binding (GO: 0005524)' and 'hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides (GO: 0016818)'. Cellular localization prediction indicated that the majority of TaGRF proteins (80%) were localized in the nucleus (Fig. 6). Peng, Sun & Nevo, 2011). Obviously, common wheat has a complicated evolutionary history, and its ancestors' origin is affected by many factors, but research shows that wheat has two major polyploid evolutionary events (Ling et al., 2013). Homology reflects the phylogeny of a species, so it can be used to transfer annotations for one known gene to another newly sequenced genome. In order to further infer the evolutionary origin and homology of the wheat GRF family, Sixty-four GRFs were identified from T. aestivum (30 TaGRFs), T. urartu (6 TuGRFs), T. dicoccoides (18 TdGRFs) and Ae. tauschii (10 AeGRFs) using a computer-based method (Fig. 7A, and Table S7). There were no paralogous gene pairs in Ae. tauschii and T. urartu, and 21 and 5 paralogous gene pairs in T. aestivum and T. dicoccoides, respectively. Among them, 18 orthologous gene pairs were identified from T. aestivum and Ae. tauschii, 5 orthologous gene pairs were identified from T. dicoccoides and Ae. tauschii, 5 orthologous gene pairs were identified from T. urartu and Ae. tauschii, 20 orthologous gene pairs were identified from T. aestivum and T. dicoccoides, 12 orthologous gene pairs were identified from T. aestivum and T. urartu, and 3 orthologous gene pairs were identified from T. urartu and T. dicoccoides. Given phylogenetic analyses and homology results of four wheat species, it was speculated that 8 TaGRFs   Table S2). The 89 homologous are shown in Table S7. Full-size DOI: 10.7717/peerj.10701/ fig-7

Gene expression pattern analyses of TaGRFs
For multigene families, analysis of gene expression patterns often provides useful clues for determining gene function. Transcriptome data from growth and abiotic stresses were downloaded from Wheat Expression Browser to examine their expression patterns. The results showed that 28 TaGRF genes (except TaGRF2-2A and TaGRF7-6A) were expressed in different tissues or under different stress treatments (Fig. 8, and Table S8). TaGRF1-2A and TaGRF1-2D were highly expressed in various tissues. TaGRF5-4A and TaGRF5-4D had the highest expression in shoot tip meristem. Most of the genes were expressed in shoot tip meristems more significantly than other tissues. About half of the TaGRF genes were expressed under NaCl treatment, and TaGRF4-4A, TaGRF4-4B and TaGRF4-4D were significantly expressed under NaCl treatment. It was speculated that TaGRF family members may play important roles in the development of wheat shoot tip meristems, and TaGRF4 may play an important role in wheat salt tolerance.

Quantitative-real time PCR analysis
To further understand the potential role of the TaGRF genes in abiotic stresses (NaCl and mannitol), qRT-PCR was used to analyze the expression pattern of TaGRFs. Based on transcriptome analysis, we selected 14 TaGRFs for qRT-PCR. In the two treatments of this study, the expression of all 14 TaGRFs differed from the control, although their degree of difference was often substantial (Fig. 9). After treatment with NaCl, the expression of TaGRF4-4A, TaGRF4-4D, TaGRF5-4A, TaGRF10-6A, TaGRF10-6B and TaGRF10-6D were higher in treated leaves than in the control 96 h after treatment. Expression of the genes TaGRF1-2B, TaGRF1-2D, TaGRF3-2D, and TaGRF9-6D were lower in the leaves 96 h after treatment than in the control group. However, in the roots, the expression of TaGRF1-2B and TaGRF3-2D were much higher than in the control 96 h after treatment, while the expression of TaGRF4-4A, TaGRF4-4D, TaGRF10-6B, and TaGRF10-6D were much lower than in the control 96 h after treatment. Among these, the expression trends of TaGRF1-2B, TaGRF3-2D, TaGRF4-4A, TaGRF4-4D, TaGRF10-6B, and TaGRF10-6D were completely reversed in the roots compared to the leaves 96 h after treatment.
After treatment with mannitol, TaGRF1-2B, TaGRF6-4A, TaGRF10-6A and TaGRF9-6D were expressed higher in the roots than in the control group 24 h and 96 h after treatment. However, in the leaves, only TaGRF6-4A, TaGRF10-6A and TaGRF9-6D was expressed higher than the control group at 24 h and 96 h after treatment. The expression level of TaGRF1-2B did not change compared with the control group. TaGRF1-2B and TaGRF3-2B were significantly down-regulated in leaves at 2 and 4 h after treatment.

DISCUSSION
With the in-depth development of plant genomics research, especially the rapid development of sequencing technology, the entire genome sequencing of many plant species has been completed, providing favorable conditions for the identification of plant In general, the number of GRF transcription factor members in terrestrial plants is between 8 to 20, but, typically, fewer are found in lower plant taxa such as mosses and algae (Omidbakhshfard et al., 2015). For example, 9 AtGRFs occur in Arabidopsis (Kim, Choi & Kende, 2003), 12 OsGRFs in rice (Choi, Kim & Kende, 2004), 14 ZmGRFs in maize (Zhang et al., 2008), 17 BrGRFs in Chinese cabbage (Wang et al., 2014), 9 CsGRFs in sweet orange (Citrus sinensis L. Osbeck) (Liu et al., 2016), 18 PeGRFs in bamboo (He et al., 2018), while the moss (Physcomitrella patens) has only 2 GRFs (Omidbakhshfard et al., 2015). Based on wheat genomic data, the present study identified 30 wheat GRF transcription factors (TaGRF). Although the wheat genome is considerably larger than the Arabidopsis genome (16 GB vs 125 MB), the number of GRF genes in wheat is only three times that of Arabidopsis (30:9), indicating that there is a large amount of gene loss during genome replication in wheat. According to phylogenetic analysis, the 30 GRF transcription factors in wheat could be divided into four groups. Studies have shown that GRF transcription factors of rice and maize can be divided into three groups, five groups in Arabidopsis, and six in rapeseed, indicating that GRF transcription factors in monocotyledons different from dicotyledons in evolution patterns and characteristics. In this study, phylogenetic analysis of GRFs of wheat, Arabidopsis, rice, maize, T. urartu, T. dicoccoides and Ae. tauschii were carried out compared. It was shown that most TaGRFs preferentially clustered with GRF in T. urartu, T. dicoccoides and Ae. taus chii, followed by rice and maize. The results showed that the GRFs in wheat were closely related to those in T. urartu, T. dicoccoides and Ae. tauschii. The number of GRF genes in Group IV is greater than in Group I to III, implying that the variability in the number of GRF genes in the different groups may be the result of independent gene gain or loss in these groups. It is generally believed that exon-intron structure is important for understanding evolutionary and functional relationships (Hu & Liu, 2011). In addition, gain or loss events in exons or introns provide structural and functional differentiation (Xu et al., 2012). With regard to corresponding gene structures within each group, most TaGRF genes shared a similar gene structure, having two to four introns/exons, which is in accordance with Arabidopsis and rice (Choi, Kim & Kende, 2004;Kim, Choi & Kende, 2003). 22 of the TaGRF genes contained three or four exons and 29 of the TaGRF genes contained two or three introns. This indicated that, the structural evolution of the TaGRF gene is conservative to some extent. Gene replication events are the main drivers of genome and genetic system evolution (Moore & Purugganan, 2003). Wheat has a complex evolutionary history with two major polyploid events (Ling et al., 2013). About 50-70 million years ago, before the genetic grouping of herbs, the first genome duplication directly produced an ancient doubling event. The second time was that the traceability of common wheat originated from the forming process of the tetraploid wheat (T. dicoccoides, A and B sub-genome) which hybridized by sub-genome progenitor T. urartu and Aegilops speltoides (B sub-genome) 300,000 years ago approximately. Again, about 8,000 years ago, the tetraploid wheat was hybridized with Ae. tauschii (D sub-genome) and formed hexaploid wheat (T. aestivum, A, B, and D sub-genome) naturally. We found that some genes were deleted during polyploidization by comparing GRF genes of T. aestivum, T. urartu, Ae. tauschii and T. dicoccoides.
Studies have shown that the expression level of the GRF gene is significantly higher in developing tissues than in mature tissues (Kim, Choi & Kende, 2003;Kim & Kende, 2004;Choi, Kim & Kende, 2004). For example, the GRF genes in rice were found to be strongly expressed in buds, immature leaves, and flower buds, and participates in plant growth and development by regulating cell proliferation in actively growing tissues (Choi, Kim & Kende, 2004). The expression profile of TaGRF genes analyzed by wheat tissue transcriptome data, showed that most of the TaGRF gene is highly expressed in wheat shoot tip meristems, and weakly expressed in other relatively mature tissues, which was similar to the previous conclusions. OsGRF6 participates in regulating the growth and development of rice infloreses (Gao et al., 2015), and the three genes with the highest homology level, TaGRF4-4A, TaGRF4-4B and TaGRF4-4D, have high expression in stigma and ovary, indicating that their function may be related to the growth and development of stigma and ovary. In addition, cis-acting elements related to the regulation of meristem expression, such as cat-box and CCGTCC motif, were found in the promoter region of TaGRF genes, indicating that the TaGRF genes play important roles in wheat growth tissues, especially in stem tip meristems.
Plants have evolved a series of signal pathways and defense systems to resist stresses. In previous researches, the activation of genes responsing stresses enhanced the plant's tolerance (Heidel et al., 2004;Sakuma et al., 2006). Over-expression of AtGRF7 in Arabidopsis under stress conditions increased resistance to osmotic and drought stress (Kim et al., 2012). It has been reported that GRF transcription factors acted as key roles in plant growth by coordinating stress responses and defense signals (Casadevall et al., 2013;Liu et al., 2014a;Liu et al., 2014b). For example, Arabidopsis growth regulators 1 and 3 (AtGRF1 and AtGRF3) played significant roles in the regulation of plant growth, defense signals, and stress responses (Casati, 2013;Hewezi et al., 2012). In our study, the cis-elements of 12 TaGRFs (TaGRF1-2A, TaGRF1-2B,  contained 1 to 2 copies of MBS (the MYB binding site is involved in drought-inducing). The qRT-PCR results showed that under NaCl stress and mannitol simulated drought stress, 14 TaGRF genes we tested responded to external abiotic stresses, either positively or negatively. Among them, 9 genes (TaGRF1-2B, TaGRF3-2B, TaGRF3-2D, TaGRF4-4A, TaGRF4-4D, TaGRF6-4A, TaGRF9-6D, TaGRF10-6A, TaGRF10-6B) were significantly expressed in treatment with NaCl and mannitol. These genes may play an active role in wheat's response to NaCl stress and drought stress. According to transcriptomic data and qRT-PCR results, TaGRF1-2D, TaGRF4-4A and TaGRF4-4D were all up-regulated in salt stress, indicating that they may play a certain role in wheat response to salt stress. But more experimental evidence is needed to understand how they work in wheat in response to salt stress.

CONCLUSIONS
This study provides a reference point for subsequent studies involving functions of the TaGRF gene family. TaGRF gene family has extensive expression profiles which span multiple developmental stages and stresses, implying their crucial roles in various physiological functions and abiotic stresses. In summary, our findings provide new clues that will be useful for improving stress tolerance of wheat.

ADDITIONAL INFORMATION AND DECLARATIONS Funding
This study was funded by the Key Projcet of Hubei Provience Departmen of Education (grant number D20191305). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Grant Disclosures
The following grant information was disclosed by the authors: Hubei Provience Departmen of Education: D20191305.