Identification of the SPL genes in C. quinoa
A total of 23 CqSPL genes were identified in quinoa using two BLASTp methods and were designated CqSPL1-CqSPL23 according to their chromosomal positions (Additional file 2: Table S1). The general characteristics of the CqSPLs, including coding sequence (CDS) length, molecular weight (MW), isoelectric point (pI), and subcellular localization, were also analyzed (http://cello.life.nctu.edu.tw/).
Among the 23 CqSPL proteins, CqSPL11 and CqSPL12 were the smallest with 119 amino acids each, while CqSPL17 was the largest with 1190 amino acids. The molecular masses of these proteins ranged from 21.3 kDa (CqSPL12) to 132.135 kDa (CqSPL17), while their pI ranged from 5.74 (CqSPL15) to 10.24 (CqSPL1 and CqSPL12), with a mean of 6.69. Moreover, four of the 23 CqSPL genes contained the ANK domain. Subcellular localization predictions showed that most CqSPL proteins are located in the nucleus, with 7 in the endoplasmic reticulum, 8 in the cytoplasm and plasmid, 9 in the chloroplast, and 1 (CqSPL9) in the plasmid (Table S1). We also found that quinoa has more SPL genes (23) than A. thaliana (15), S. lycopersicum (15), V. vinifera (17), and S. bicolor (19), but fewer than O. sativa and Z. mays (29 SPL genes each) [37][40].
Multiple sequence alignment, phylogenetic analysis, and classification of the CqSPL genes
The 23 CqSPL genes were divided into 8 clades (groups 1-8) in the phylogenetic tree, following the classification previously proposed by Cenci and Rouard. Their agreement with the classified groups of SPL proteins in Arabidopsis suggested that these SPL genes remained conserved during evolution (Fig. 1; Additional file 2: Table S1).
Among the 8 subfamilies, subfamily II had the most members (6 CqSPLs), while subfamily VI had only 1 CqSPL. Subfamilies I, III, V, and VIII had 2 CqSPL genes each, and subfamilies IV and VII had 4 each. The phylogenetic tree showed that some CqSPLs clustered closely with AtSPLs (bootstrap support ≥70) (Fig. 1), indicating that these proteins may be orthologs with similar biological functions.
Multiple sequence alignment of the AtSPLs with the eight CqSPL subfamilies
Previous studies reported that all SPL genes contain conserved SBP domains, composed of 2 zinc fingers (Zn 1 and 2) and a bipartite nuclear localization signal (NLS). The basic region consists of 14 conserved amino acids, spanning 70-80 amino acids (Fig. 2, Table S1). In the present study, only subfamily I was not fully conserved between quinoa and Arabidopsis. The Zn-1 (Cys3His-type) in CqSPL6 from subfamily I lacks Cys, while its Zn-2 (Cys2HisCys-type) lacks C2H, both of which are still conserved in Arabidopsis (Fig. 2). Conversely, the NLS is relatively conserved in quinoa, whereas in Arabidopsis one of the R residues in the RRRK located at the C-terminus of the SBP domain is mutated. Overall, the SBP domains in Arabidopsis and quinoa were highly conserved, indicating that the SBP structural domain was established at an early stage in plants.
Conserved motifs and structure analysis of the CqSPL genes
The exon-intron structures of the CqSPL genes were obtained by comparing their coding sequences with the corresponding genomic DNA sequences to determine their structural diversity. The results revealed that the 23 CqSPL genes had different numbers of exons, ranging from 3 to 17, and the SBP domain was present in most of the CqSPL genes (17, ~69.5%) (Fig. 3, Additional files 2 and 3: Tables S1 and S2). Furthermore, CqSPL1, CqSPL12, and CqSPL18 had the same exon-intron structure, with 3 exons and 2 introns each (Fig. 3B). Six CqSPL genes had four introns, while CqSPL13 and CqSPL17, both in subfamily II, had the most introns (16) (Fig. 3A, B). Generally, the CqSPL genes from the same subfamily had similar gene structures, but subfamily II showed greater variation in the number of introns and may therefore have evolved more diverse functions.
Further structural analysis of the CqSPL genes identified 10 diverse motifs, denoted motifs 1-10. As shown in Fig. 3C, motifs 3 and 4 were widely distributed and located adjacent to each other in the CqSPLs. CqSPL genes of the same subfamily usually possessed a similar motif composition. For instance, subfamily I contained motifs 2, 3, 4, 6, 7, and 9, except for CqSPL13, while subfamily II contained all motifs (motifs 1-10). Moreover, subfamilies III, IV, V, VI, VII, and VIII had the same motifs (motifs 1, 3, and 4). We also found that some motifs may only be distributed in specific positions. For instance, motifs 3 and 7 were always distributed at the start and the end of the patterns, respectively, and motif 1 was always located between motifs 3 and 4 in subfamily I (Fig. 3C, Table S2). Generally, genes from the same subfamily had similar structural composition and clustered together, consistent with the phylogenetic tree classification.
Chromosomal distribution and gene duplication of CqSPL genes
Physical mapping of the SPL genes against the latest C. quinoa genome database showed that the 23 CqSPL genes are unevenly distributed across chromosomes (Chr) 1 to 18 (Fig. 4, Additional file 4: Table S3). Each SPL gene was named based on its physical location on the C. quinoa chromosomes. No CqSPL genes were found on Chr2, Chr4, Chr5, Chr13, Chr17, and Chr18. Chr11 contained the most CqSPL genes (4 genes, ~17.39%), followed by Chr6, Chr7, and Chr14 (3 genes each, ~13.04%), while Chr1, Chr3, Chr9, Chr12, Chr15, and Chr16 contained the fewest (1 gene each, ~4.35%). Chr8 and Chr10 contained 2 (~8.70%) CqSPL genes each, and almost all SPL genes were located near the ends of the chromosomes, except on Chr7. Only one SPL gene duplication event was evident in C. quinoa, involving CqSPL16 and CqSPL17 on Chr11 (Fig. 4, Table S3).
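The percentages quoted above follow directly from the per-chromosome gene counts. The short Python sketch below recomputes them from the counts given in the text; the variable and function names are illustrative only and are not part of the original analysis.

```python
# Gene counts per chromosome as reported in the text
# (12 of the 18 chromosomes carry at least one CqSPL gene).
genes_per_chr = {
    "Chr1": 1, "Chr3": 1, "Chr6": 3, "Chr7": 3, "Chr8": 2, "Chr9": 1,
    "Chr10": 2, "Chr11": 4, "Chr12": 1, "Chr14": 3, "Chr15": 1, "Chr16": 1,
}


def chromosome_shares(counts):
    """Return {chromosome: percentage of all genes}, rounded to two decimals."""
    total = sum(counts.values())
    return {chrom: round(100 * n / total, 2) for chrom, n in counts.items()}


total_genes = sum(genes_per_chr.values())
shares = chromosome_shares(genes_per_chr)

for chrom, pct in shares.items():
    print(f"{chrom}: {genes_per_chr[chrom]} gene(s), {pct}%")
```

The counts sum to 23, which matches the number of CqSPL genes identified.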
Gene duplication events, which mainly include tandem duplications and segmental duplications, play an essential role in gene family expansion and the generation of new functions [41]. A tandem duplication event is defined as a chromosomal region of 200 kb containing two or more genes [42]. Accordingly, a duplication event analysis of the CqSPL genes was performed to explore the evolutionary conservation of the gene family. The quinoa genome exhibited 7 pairs of segmental duplications but no tandem duplication events (Fig. 5, Additional file 5: Table S4). The 14 paralogs, which resulted from the 7 pairs of duplicated fragments, were denoted LG1-14, indicating an evolutionary relationship among the CqSPL genes. LG6 had the most CqSPLs (n=3), followed by LG7, LG10, and LG14 (n=2 each), while LG1, LG3, LG8, LG9, and LG14 had the fewest (n=1). As expected, all genes were linked within their subfamilies. Subfamily II had more linked genes (4 SPL genes) than subfamilies III, IV, V, VII, and VIII, which had two SPL genes each (Table S4). These results showed that some CqSPL genes may have been produced by segmental duplication and that these duplication events acted as the main evolutionary drivers of new functions in CqSPL genes.
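The 200 kb criterion for tandem events can be illustrated with a minimal Python sketch. It checks only the distance between neighbouring genes on the same chromosome; it does not test sequence similarity and is not the pipeline used in this study. The gene names and coordinates are hypothetical and are not taken from Table S3.

```python
from collections import defaultdict

TANDEM_WINDOW_BP = 200_000  # 200 kb window from the definition in the text


def find_tandem_pairs(genes, window=TANDEM_WINDOW_BP):
    """Return pairs of neighbouring genes on the same chromosome whose
    start positions lie within `window` base pairs of each other.

    `genes` is an iterable of (name, chromosome, start_position) tuples.
    """
    by_chrom = defaultdict(list)
    for name, chrom, start in genes:
        by_chrom[chrom].append((start, name))

    pairs = []
    for items in by_chrom.values():
        items.sort()  # order genes by start position along the chromosome
        for (start1, name1), (start2, name2) in zip(items, items[1:]):
            if start2 - start1 <= window:
                pairs.append((name1, name2))
    return pairs


# Hypothetical coordinates for illustration only.
example_genes = [
    ("geneA", "Chr1", 1_000_000),
    ("geneB", "Chr1", 1_150_000),  # 150 kb from geneA: within the window
    ("geneC", "Chr1", 5_000_000),  # far from geneB: not a tandem candidate
    ("geneD", "Chr2", 1_100_000),  # different chromosome: never paired with Chr1
]

tandem_pairs = find_tandem_pairs(example_genes)
print(tandem_pairs)
```

An empty result from such a scan would correspond to the absence of tandem events reported above.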
Evolutionary analysis of the CqSPL and SPL genes of different species
We selected three monocotyledonous (Z. mays, O. sativa, and S. bicolor) and three dicotyledonous (A. thaliana, S. lycopersicum, and V. vinifera) plants and compared their SPL genes with the CqSPLs. The 23 CqSPL genes and the SPL genes from the other six plants were used to construct a phylogenetic tree with 10 conserved motifs (identified by the MEME web server) using the NJ method in Geneious R11. The CqSPL genes exhibited an uneven distribution in the phylogenetic tree because genes from the same subfamily tend to share the same motifs and cluster together. Almost all SPL genes from these seven plants contained motifs 1, 2, 4, and 5, except for the first subfamily in quinoa (CqSPL6 and CqSPL15) (Fig. 6, Additional file 2: Table S1). Subfamilies I and II contained the most diverse motifs, with motifs 10 and 7 almost always distributed at the beginning and the end of the motif patterns, respectively. Meanwhile, motif 9 was always distributed at the end of the pattern in subfamilies III, IV, VII, and VIII. Overall, CqSPL genes from groups I and III showed higher homology with SPL gene clusters of S. lycopersicum, whereas most SPL genes in other groups clustered with A. thaliana, S. lycopersicum, and V. vinifera, implying that they are closely related and might have similar functions.
To further understand the evolutionary relationships of SPL genes, we constructed comparative syntenic maps of quinoa and the six representative species. The 23 CqSPL genes showed collinear relationships with various SPLs in A. thaliana (15), S. lycopersicum (15), V. vinifera (17), S. bicolor (19), O. sativa (29), and Z. mays (29) (Additional file 6: Table S5). The numbers of homologous pairs between quinoa and Z. mays, O. sativa, S. bicolor, A. thaliana, S. lycopersicum, and V. vinifera were 3, 3, 6, 16, 20, and 25, respectively (Fig. 7, Table S5).
We found CqSPL genes that were collinear with at least one gene from each of the six plants, such as CqSPL21 (Solyc05g015840/EER97011/AT5G50670.2/VIT_14s0068g01780/BGIOSGA005075/Zm00001d021056), suggesting that these orthologous genes were already conserved before the species diverged. Therefore, we speculate that they might have played an essential role in the evolution of the quinoa SPL gene family. Interestingly, some gene pairs collinear with 12 CqSPL genes were identified in A. thaliana, S. lycopersicum, and V. vinifera but were not found in S. bicolor, O. sativa, and Z. mays. This suggested that these orthologous pairs may have been formed via gene duplication during the differentiation of dicotyledonous and monocotyledonous plants.
Expression patterns of the CqSPL genes in different plant organs
The expression of 15 representative genes (selected from the eight subfamilies) was analyzed in four organs (roots, stems, leaves, and flowers) using qRT-PCR to evaluate the potential function of CqSPL genes. The CqSPL genes exhibited different expression patterns in the roots, stems, leaves, and flowers, suggesting that these genes may have diverse regulatory roles in plants. Three genes (CqSPL3, CqSPL7, and CqSPL19) had the highest expression in the stem, while eight genes (CqSPL2, CqSPL5, CqSPL6, CqSPL9, CqSPL11, CqSPL14, CqSPL15, and CqSPL20) had the highest expression in the leaves. CqSPL1, CqSPL12, CqSPL18, and CqSPL20 were highly expressed in the flowers (Fig. 8A). Most genes from the same subfamily exhibited similar expression patterns, suggesting that they might have similar functions. All CqSPL genes were expressed at lower levels in the roots than in the stems, leaves, and flowers; therefore, we speculate that SPL genes may be closely associated with stem, leaf, and flower development in plants. The qRT-PCR analysis showed differential expression of the SPL genes in different tissues and provided preliminary evidence for the biological function of the SPL genes in quinoa.
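Relative expression from qRT-PCR is commonly computed with the 2^-ΔΔCt method. The Python sketch below assumes that method, since this section does not state which quantification approach was used, and shows how the organ with the highest expression could be identified for one gene. All Ct values, the choice of roots as calibrator, and the function name are hypothetical.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the 2^-ΔΔCt method.

    ΔCt  = Ct(target) - Ct(reference gene), for the sample and the calibrator.
    ΔΔCt = ΔCt(sample) - ΔCt(calibrator).
    """
    delta_ct = ct_target - ct_reference
    delta_ct_cal = ct_target_cal - ct_reference_cal
    return 2.0 ** -(delta_ct - delta_ct_cal)


# Hypothetical Ct values for one gene: organ -> (Ct target, Ct reference gene).
ct_values = {
    "root": (28.0, 18.0),
    "stem": (27.0, 18.0),
    "leaf": (25.0, 18.0),
    "flower": (26.0, 18.0),
}

# Assumption: roots serve as the calibrator sample (fold change = 1).
cal_target, cal_reference = ct_values["root"]

fold_change = {
    organ: relative_expression(ct_t, ct_r, cal_target, cal_reference)
    for organ, (ct_t, ct_r) in ct_values.items()
}
peak_organ = max(fold_change, key=fold_change.get)

print(fold_change)
print("highest expression in:", peak_organ)
```

With these illustrative values, a Ct that is three cycles lower in leaves than in roots corresponds to an eight-fold higher relative expression.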
Some CqSPLs might also regulate the fruit development of quinoa, thus affecting its nutritional composition and development rate [3][4]. We analyzed the expression of 15 SPL genes at five post-anthesis stages (7D, 14D, 21D, 28D, and 35D) to identify genes that could potentially regulate quinoa fruiting-related genes. The results showed that most CqSPL genes exhibited different expression patterns at the five stages of fruit development. There was a significant increase in the expression of two genes (CqSPL2 and CqSPL15) and a decrease in the expression of two genes (CqSPL7 and CqSPL18) in quinoa fruits. Interestingly, CqSPL1, CqSPL3, CqSPL5, CqSPL11, and CqSPL20 showed the highest expression on day 21 of fruit development, whereas the expression of most genes (CqSPL5, CqSPL11, CqSPL12, CqSPL14, CqSPL18, CqSPL19, and CqSPL20) was the highest at 28 days (Fig. 8B). These findings indicated that SPL genes play an essential role in fruit development, providing a theoretical basis for studying the nutritional value of quinoa. Furthermore, correlations between CqSPL gene expression patterns were observed (Fig. 8). Positive correlations were observed among most CqSPL genes. However, a few CqSPL genes were significantly negatively correlated, such as CqSPL6 with CqSPL21/CqSPL1, as well as CqSPL1 with CqSPL9 (P < 0.05).
Expression patterns of CqSPL genes under various abiotic stress conditions
To determine whether abiotic stresses affect CqSPL expression, we evaluated the expression of 15 CqSPL genes in the roots, leaves, and stems under six abiotic stresses. The results showed that some CqSPL genes were significantly upregulated, while others were downregulated under different stress treatments. Most CqSPL genes also showed significant expression differences in different tissues, which increased with treatment time, depending on the type of stress [43]. For example, most SPL genes were induced by cold stress in the stem, and the expression of CqSPL11 and CqSPL12 was initially upregulated but later downregulated in the roots, leaves, and stems. CqSPL1 and CqSPL5 were significantly upregulated, while CqSPL2 was significantly downregulated in the stem under flooding stress. Generally, most genes exhibited different patterns when subjected to different treatments and were significantly reduced during the early phases of the treatments. CqSPL1, CqSPL7, CqSPL5, CqSPL18, and CqSPL20 demonstrated similar expression patterns, and different tissues showed upregulated expression trends of SPLs with prolonged treatment time, indicating their rapid inhibition by abiotic stresses. However, the expression patterns of some SPLs, such as CqSPL2, CqSPL19, and CqSPL20, exhibited a reverse trend. They were upregulated by heat stress but downregulated by cold stress treatment in the stem (Fig. 9). Notably, CqSPL1 was highly expressed in the different plant tissues under the six different stress treatments, and thus, it could be a potential candidate gene responsible for abiotic stress responses in quinoa.
The CqSPL genes showed many coordinated expression patterns under several abiotic stresses (Fig. 9B). Most CqSPL genes showed significant positive correlations; for example, CqSPL12, CqSPL15, CqSPL2, CqSPL3, CqSPL18, CqSPL6, CqSPL19, CqSPL11, CqSPL9, and CqSPL14 were significantly positively correlated with one another, as were CqSPL1 and CqSPL5. On the other hand, some pairs of CqSPL genes (CqSPL5 and CqSPL20) were significantly negatively correlated.
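Pairwise correlations of this kind can be computed from the expression profiles of each gene across samples. The Python sketch below uses the Pearson coefficient, which is an assumption because the correlation method is not named in this section, and it does not perform the significance test. The gene names and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical relative expression values across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
}

# Correlation for every unordered pair of genes.
correlations = {
    (g1, g2): pearson_r(expression[g1], expression[g2])
    for g1, g2 in combinations(expression, 2)
}

for (g1, g2), r in correlations.items():
    direction = "positive" if r > 0 else "negative"
    print(f"{g1} vs {g2}: r = {r:.2f} ({direction})")
```

Deciding which pairs are significantly correlated would additionally require a test on each coefficient and enough samples per gene.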