Genetic identification and characterization of chromosomal regions for kernel length and width increase from tetraploid wheat

Improvement of wheat (Triticum aestivum L.) yield could relieve global food shortages. Kernel size, as an important component of 1000-kernel weight (TKW), is always a significant consideration for wheat breeders seeking to improve yield. Wheat related species possess numerous elite genes that can be introduced into wheat breeding. It is thus vital to explore, identify, and introduce new genetic resources for kernel size from wheat wild relatives to increase wheat yield. In the present study, quantitative trait loci (QTL) for kernel length (KL) and width (KW) were detected in a recombinant inbred line (RIL) population derived from a cross between a wild emmer accession ‘LM001’ and a Sichuan endemic tetraploid wheat ‘Ailanmai’, using a linkage map constructed with the Wheat 55 K single nucleotide polymorphism (SNP) array and phenotypic data from six different environments. We identified eleven QTL for KL and KW, including two major ones, QKL.sicau-AM-3B and QKW.sicau-AM-4B, the positive alleles of which were from LM001 and Ailanmai, respectively. They explained 17.57 to 44.28% and 13.91 to 39.01% of the phenotypic variance, respectively. For these two major QTL, Kompetitive allele-specific PCR (KASP) markers were developed and used to successfully validate their effects in three F3 populations and two natural populations, namely a panel of 272 Chinese wheat landraces and a panel of 300 Chinese wheat cultivars. QKL.sicau-AM-3B was located at 675.6–695.4 Mb on chromosome arm 3BL. QKW.sicau-AM-4B was located at 444.2–474.0 Mb on chromosome arm 4BL. Comparison with previous studies suggested that these two major QTL were likely new loci. Further analysis indicated that the positive alleles of QKL.sicau-AM-3B and QKW.sicau-AM-4B had a large additive effect, increasing TKW by 6.01%. Correlation analysis between KL and other agronomic traits showed that KL was significantly correlated with spike length, length of uppermost internode, TKW, and flag leaf length. KW was also significantly correlated with TKW. Four genes, TRIDC3BG062390, TRIDC3BG062400, TRIDC4BG037810, and TRIDC4BG037830, associated with kernel development were predicted in physical intervals harboring these two major QTL on wild emmer and Chinese Spring reference genomes.

Conclusions: Two stable and major QTL for KL and KW across six environments were detected and verified in three biparental populations and two natural populations. Significant relationships between kernel size and yield-related traits were identified. KASP markers tightly linked to the two major QTL could contribute greatly to subsequent fine mapping. These results suggest the application potential of wheat related species in wheat genetic improvement.
Keywords: Tetraploid wheat, Wheat 55 K SNP array, Kernel size, QTL validation

Background
Wheat (Triticum aestivum L.) is one of the main food crops in the world [1]. The pace of population growth requires a stable increase of wheat yield [2]. Wheat yield is determined by three key components, including productive spike number per unit area, kernel number per spike, and 1000-kernel weight (TKW) [3]. TKW is mainly affected by kernel size including kernel length (KL), kernel width (KW), and kernel thickness [4]. Therefore, KL and KW play vital roles in wheat yield formation.
To date, quantitative trait loci (QTL) for kernel size have been detected on all wheat chromosomes [5]. For example, seven QTL for KW were detected on chromosomes 1A, 4D, 5A, 5B, 6D, and 7B [6]. Three stable QTL were identified in more than three environments, including two for KL and one for KW [7]. Xin et al. [8] identified two QTL for KW. Furthermore, several genes for kernel size have been isolated and cloned in wheat via a map-based cloning approach. For example, the grain-shape gene Tasg-D1, encoding a Ser/Thr protein kinase glycogen synthase kinase 3, was associated with the formation of round grains in wheat [9]. Ketoacyl thiolase 2B (KAT-2B), involved in β-oxidation during jasmonic acid (JA) synthesis, played a role in the determination of kernel weight [10].
Wheat breeding currently faces the bottleneck of a narrow genetic base [11]. Fortunately, the large diversity of unexploited genetic resources in wheat related species could help meet future wheat production challenges. It is feasible to identify and utilize novel QTL/genes for KL and KW from elite germplasm of wheat and its related species [12]. For example, a major QTL (QGD-4BL) controlling kernel size of the upper spikelet was identified in wild emmer (T. turgidum ssp. dicoccoides) [13]. Four QTL for KL and one for KW were detected in durum wheat [14]. Okamoto et al. [15] found that P1 had a positive effect on KL in Polish wheat (T. turgidum ssp. polonicum). TtGRF4-A (ortholog of rice OsGRF4) was associated with kernel size and kernel weight in wild emmer [16].
As the progenitor of modern tetraploid and hexaploid cultivated wheat, wild emmer has the highest nucleotide diversity across the Triticum taxonomic groups, making it an invaluable gene pool for the genetic improvement of wheat [17]. Thus, identification of QTL/genes for KL and KW from wild emmer will facilitate progress in meeting future wheat production challenges. In this study, we aimed to identify and validate major QTL for KL and KW in a recombinant inbred line (RIL) population derived from a cross between a wild emmer accession and a Sichuan endemic tetraploid wheat 'Ailanmai'.

Genetic populations
Four biparental populations developed by the single-seed descent method were used in this study. They were derived from the crosses Ailanmai × LM001 (AM, 121 F8 RILs including parents) [18], LM001 × PI 503554 (MP, 102 F3 lines), Ailanmai × AS 2268 (AAs, 102 F3 lines), and Ailanmai × PI 193877 (API, 72 F3 lines). Notably, the 121 RILs of AM were previously genotyped using the Wheat 55 K SNP array [18] and were used for QTL mapping in this study. The other three populations were used for validating QTL identified in this study. Ailanmai (T. turgidum L., 2n = 4x = 28, AABB) is a local dwarf variety from Sichuan province, and LM001 is a wild emmer accession (T. turgidum ssp. dicoccoides, 2n = 4x = 28, AABB). PI 503554 (T. turgidum ssp. durum) and PI 193877 (T. turgidum ssp. dicoccon) were from the U.S. National Plant Germplasm System (NPGS), and AS 2268 (T. carthlicum Nevski) was collected and preserved by the Triticeae Research Institute of Sichuan Agricultural University. In addition, two natural populations were used to further verify the effects of the major QTL: (I) a panel of 272 Chinese wheat landraces (CWL) genotyped using the Wheat 660 K SNP array [19], and (II) a panel of 300 Chinese wheat cultivars (CWC) genotyped using the Wheat 55 K SNP array [20]. Information on the two natural populations is listed in Table S1.

Phenotypic evaluation
The phenotype of the AM RIL population was measured in six different environments, including Chongzhou (103°38′E, 30°32′N) in 2017, 2018, 2019, and 2020 (2017CZ, 2018CZ, 2019CZ, and 2020CZ), Wenjiang (103°51′E, 30°43′N) in 2020 (2020WJ), and Ya'an (103°0′E, 29°58′N) in 2020 (2020YA) in China. The planting design of all experiments was consistent with the previous study [18]. Field management followed local agricultural practices [21]. Thirty kernels from each line were scanned using an Epson Expression 10000 XL. KL and KW were evaluated using WinSEEDLE (Regent Instruments Canada Inc) based on the selected objects in the image [21]. Then, the average values of each line in a single environment and the best linear unbiased prediction (BLUP) value estimated from the average values across environments were used for QTL detection and further analysis. The data for spike length (SL), effective tiller number (ETN), length of uppermost internode (UIL), TKW, and grain number per spike (GNS) were retrieved from our previous study [18]. Flag leaf length (FLL) and flag leaf width (FLW) were measured about ten days after anthesis. The FLL (from leaf base to tip) and FLW (at the widest part of the leaf) were measured on the main tiller of five typical plants per row for each line [22]. The phenotypic average value of each trait in multiple environments was used to calculate the BLUP value of each trait for further analysis. All observations were made during the previous experiment [18]; the kernel size data are presented in the current study, along with validation of the identified QTL.
The F2 populations of MP, AAs, and API were grown in 2020CZ, and the harvested F3 seeds of each plant (line) were used for phenotyping. Their planting design and field management were consistent with those of the AM population. Thirty kernels from each plant were scanned using an Epson Expression 10000 XL. KL and KW were evaluated using WinSEEDLE (Regent Instruments Canada Inc) based on the selected objects in the image [21]. Then, the average values of KL and KW were used for validating the major QTL identified in this study. Details of the environments in which agronomic traits were measured are listed in Table S2.
The 272 CWL were planted in six different environments, including 2012YA, 2013-2015WJ, and 2014-2015CZ [19]. The average value of each accession in a single environment was used for further analysis [19].
The 300 CWC were planted in three different environments, including Beijing in 2018 and 2019, and Baoding in 2019 [20]. One hundred and twenty seeds of each accession were planted in a single row of 2 m in length with 0.7 m spacing between the rows in three environments [20]. The kernel-related traits were measured using the SC-A wheat grain appearance quality image analysis system developed by the Hangzhou Wanshen Detection Technology Co [20].

Data analysis
SAS 8.0 (SAS Institute, Cary, NC, USA) was used to analyze the BLUP of the agronomic traits and the broad-sense heritability (H2) in different environments. According to the description of Smith et al. [23], the SPSS Statistic 24.0 program (IBM SPSS, Armonk, NY, USA) was used to obtain Pearson's correlation coefficients among agronomic traits based on the BLUP values, and to perform descriptive statistical analyses and the independent-sample t-test (P < 0.05). Frequency distributions of KL and KW values were plotted in the Origin 9.0 software using a Gaussian distribution. The individuals of the AM RILs were divided into two groups based on the genotypes of the closest markers for each of the two major QTL, and then the differences between the two groups for the corresponding traits were analyzed. Furthermore, Excel (Microsoft Corporation, Microsoft Excel 2010, USA) was used to perform the binary linear regression analysis.
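The binary linear regression mentioned above can be sketched in a few lines. This is a minimal illustration only, assuming that "binary" means two predictors (KL and KW) and one response (TKW); the study itself used Excel. The function name `fit_two_predictor_ols` and the toy numbers are hypothetical and are not data from this study.

```python
from statistics import mean


def fit_two_predictor_ols(x1, x2, y):
    """Ordinary least squares fit of y = b0 + b1*x1 + b2*x2.

    Uses the closed-form solution of the normal equations for two
    predictors, written with centered sums of squares and cross-products.
    """
    if not (len(x1) == len(x2) == len(y)) or len(y) < 3:
        raise ValueError("need three or more observations of equal length")
    m1, m2, my = mean(x1), mean(x2), mean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Toy data built so that y = 1 + 2*x1 + 3*x2 exactly (illustrative, not study data).
kl = [1.0, 2.0, 3.0, 4.0]
kw = [1.0, 0.0, 2.0, 1.0]
tkw = [6.0, 5.0, 13.0, 12.0]
b0, b1, b2 = fit_two_predictor_ols(kl, kw, tkw)
```

When the goal is to judge which trait contributes more, the two predictors would normally be standardized first, because KL and KW are measured on different scales and raw coefficients are not directly comparable.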

QTL mapping
The details of DNA extraction and 55 K SNP array analysis of the AM population are described in previous work [18]. The genetic map was constructed by Mo et al. [18].
Inclusive composite interval mapping (ICIM) in IciMapping 4.1 (https://www.isbreeding.net/) was used to detect QTL, and a test with 1,000 permutations (p < 0.05) was used to define the logarithm (base 10) of odds (LOD) threshold for QTL [24]. A LOD score of 2.5 was chosen as the threshold for declaring significant QTL [25]. The QTL × Environment (QE) interaction effects were analyzed using IciMapping with the preset parameters: step = 1 cM, PIN = 0.001, LOD = 5.0. In the present study, QTL identified in more than three environments and explaining more than 10% of the phenotypic variance (PVE) were defined as major ones, and those less than 1 cM apart were treated as identical [26]. Furthermore, QTL were named in accordance with the International Rules of Genetic Nomenclature (http://wheat.pw.usda.gov/ggpages/wgc/98/Intro.htm). The 'sicau' represents Sichuan Agricultural University.
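The classification rules in this paragraph can be expressed as a short sketch. It rests on several assumptions that the text does not settle: "more than three environments" is read as at least four, the BLUP data set is not counted as an environment, the 10% cutoff is applied to the smallest PVE among the detections, and QTL closer than 1 cM are grouped by chaining neighbouring positions. The function names and all numbers are hypothetical.

```python
def is_major_qtl(pve_by_env, min_env_count=3, min_pve=10.0):
    """Return True if a QTL meets the 'major' rule sketched in the text.

    pve_by_env maps an environment label to the PVE (%) observed there.
    Assumption: an entry labelled "BLUP" is ignored when counting
    environments, and the PVE cutoff applies to the smallest value.
    """
    envs = {env: pve for env, pve in pve_by_env.items() if env != "BLUP"}
    if not envs:
        return False
    return len(envs) > min_env_count and min(envs.values()) > min_pve


def merge_close_qtl(positions_cm, max_gap=1.0):
    """Group QTL peak positions (cM) lying less than max_gap from a neighbour."""
    groups = []
    for pos in sorted(positions_cm):
        if groups and pos - groups[-1][-1] < max_gap:
            groups[-1].append(pos)  # same locus as the previous peak
        else:
            groups.append([pos])  # start a new locus
    return groups


# Hypothetical PVE values for illustration only (not taken from Table 3).
stable = {"E1": 18.0, "E2": 25.0, "E3": 31.0, "E4": 44.0, "BLUP": 30.0}
minor = {"E1": 12.0, "E2": 8.0}
three_env = {"E1": 20.0, "E2": 22.0, "E3": 24.0}

stable_is_major = is_major_qtl(stable)
minor_is_major = is_major_qtl(minor)
three_env_is_major = is_major_qtl(three_env)
merged = merge_close_qtl([10.0, 14.0, 10.5])
```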
As F3 is a segregating generation, we selected 15 F3 kernels from each line of the MP, AAs, and API populations for germination and grew them in a greenhouse. Leaves of the 15 seedlings were collected and pooled for DNA extraction to represent the F2 genotype. High-quality genomic DNA was extracted using the Plant Genomic DNA Kit (Tiangen Biotech, Beijing, China) and was then used for genotyping with KASP markers. Details of the amplification reaction system and conditions are listed in Table S3. The lines were divided into two groups (Data set 1 and 2) based on the genotyping results. Data set 1 represented lines with homozygous alleles from Ailanmai or LM001, whereas Data set 2 represented lines with homozygous alleles from the other parents. Lines with heterozygous genotypes were excluded from the analysis. Finally, we evaluated the differences in KL or KW between the two groups with the independent-sample t-test (P < 0.05) to determine the effects of the major QTL.
The flanking marker AX-111112626 was included in the CWL natural population genotyped using the Wheat 660 K SNP array, and AX-108974756 was included in the CWC natural population genotyped using the Wheat 55 K SNP array. According to the genotypes of these two flanking markers in the CWL and CWC populations, the lines were divided into two groups: (1) lines with the same genotype as Ailanmai and (2) lines with the same genotype as LM001. The BLUP values of KL and KW data from all environments of CWL and CWC were used to analyze the differences between the two groups with the independent-sample t-test (P < 0.05).
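The grouping-and-comparison step used for validation can be sketched as follows. This is an illustration under stated assumptions, not the SPSS procedure used in the study: it computes a pooled-variance (equal-variance) independent-sample t statistic, and it expresses the effect as a percentage of the negative-allele group mean, a definition the text does not spell out. The function name, genotype labels, and trait values are hypothetical.

```python
import math
from statistics import mean, variance


def compare_allele_groups(records, positive, negative):
    """Compare trait means between two homozygous genotype classes.

    records is an iterable of (genotype, trait_value) pairs. Any genotype
    other than `positive` or `negative` (for example heterozygous calls)
    is skipped. Returns (percent_difference, t_statistic, degrees_of_freedom)
    for a pooled-variance independent-sample t-test.
    """
    records = list(records)
    pos = [value for genotype, value in records if genotype == positive]
    neg = [value for genotype, value in records if genotype == negative]
    n1, n2 = len(pos), len(neg)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs two or more lines")
    pooled = ((n1 - 1) * variance(pos) + (n2 - 1) * variance(neg)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("no within-group variation; t is undefined")
    diff = mean(pos) - mean(neg)
    return diff / mean(neg) * 100, diff / se, n1 + n2 - 2


# Hypothetical kernel lengths (mm); "H" marks a heterozygous line that is excluded.
lines = [
    ("LM001", 7.0), ("LM001", 7.2), ("LM001", 7.4),
    ("Ailanmai", 6.0), ("Ailanmai", 6.2), ("Ailanmai", 6.4),
    ("H", 6.8),
]
pct, t_stat, df = compare_allele_groups(lines, positive="LM001", negative="Ailanmai")
```

The P value would then be read from a t distribution with `df` degrees of freedom, which a statistics package provides.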

Physical intervals of the major QTL and comparison with previously reported QTL
To predict the physical intervals of the major QTL identified in this study, the sequences of their flanking markers were used in BLAST searches (E-value of 1e-5) against the genome sequences of the wild emmer wheat 'Zavitan' WEWseq v2 (http://202.194.139.32/blast/blastresult.php) [27] and the International Wheat Genome Sequencing Consortium (IWGSC) Chinese Spring (CS) RefSeq v2.1 (https://urgi.versailles.inrae.fr/download/iwgsc/IWGSC_RefSeq_Assemblies/v2.1/) [28]. The annotations and functions of genes were retrieved from UniProt (http://www.uniprot.org/). To assess whether the QTL identified here were novel, we compared their physical positions with those of previously reported KL and KW QTL by anchoring the flanking marker sequences of the reported QTL on CS. Furthermore, to identify possible regulatory genes of KL and KW, the spatio-temporal expression patterns of the genes identified in the intervals of QKL.sicau-AM-3B and QKW.sicau-AM-4B were analyzed using the Triticeae Multi-omics Center website (https://202.194.139.32/expression/index.html).
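The novelty check described above amounts to asking whether a physical interval overlaps any previously reported one on the same chromosome. The sketch below illustrates that logic. The query interval (670.4 to 690.5 Mb on CS chromosome 3B) is the one given in this paper for QKL.sicau-AM-3B; the "reported" entries and the function names are hypothetical placeholders, not QTL from the literature.

```python
def intervals_overlap(a, b):
    """True if two (start_mb, end_mb) intervals share any position."""
    a_start, a_end = sorted(a)
    b_start, b_end = sorted(b)
    return a_start <= b_end and b_start <= a_end


def overlapping_reports(query, reported):
    """Names of reported QTL overlapping the query on the same chromosome.

    query is (chromosome, start_mb, end_mb); reported is a list of
    (name, chromosome, start_mb, end_mb). An empty result means that no
    overlap was found among the entries supplied.
    """
    chrom, start, end = query
    return [
        name
        for name, r_chrom, r_start, r_end in reported
        if r_chrom == chrom and intervals_overlap((start, end), (r_start, r_end))
    ]


query_3b = ("3B", 670.4, 690.5)  # CS interval stated in this paper
reported = [  # hypothetical placeholders for illustration
    ("hypothetical_QTL_1", "3B", 600.0, 640.0),
    ("hypothetical_QTL_2", "3B", 685.0, 700.0),
    ("hypothetical_QTL_3", "4B", 675.0, 680.0),
]
hits = overlapping_reports(query_3b, reported)
```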

Phenotypic data analyses
LM001 showed longer and narrower kernels than Ailanmai (Fig. 1; Table 1). The values of KL and KW in each environment showed a continuous distribution (Fig. S1a, b). KL and KW ranged from 5.86 to 9.43 mm and from 2.74 to 4.16 mm, respectively, in the AM RIL population (Table 1). The standard deviation (STD) of KL and KW ranged from 0.45 to 0.60 and from 0.11 to 0.24, respectively. The H2 values of KL and KW were 0.79 and 0.72, respectively (Table 1). These results indicated that both KL and KW had high repeatability over testing environments, suggesting that KL and KW were mainly controlled by genetic factors.
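For orientation, broad-sense heritability in multi-environment trials is commonly estimated from variance components in the form below. This is offered as an assumption: the text does not state the exact expression used, and the symbols are introduced here only for illustration, with $\sigma_G^2$ the genotypic variance, $\sigma_{GE}^2$ the genotype × environment variance, $\sigma_\varepsilon^2$ the residual variance, $e$ the number of environments, and $r$ the number of replicates.

```latex
H^2 = \frac{\sigma_G^2}{\sigma_G^2 + \dfrac{\sigma_{GE}^2}{e} + \dfrac{\sigma_\varepsilon^2}{re}}
```

Under this form, a value such as 0.79 would mean that roughly four fifths of the variance among line means is attributable to genotype.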

Correlation analyses between kernel traits and other yield-related traits
Significant and positive correlations for KL and KW were detected among most environments (P < 0.05). The correlation coefficients ranged from 0.62 to 0.82 for KL and from 0.29 to 0.45 for KW (Table 2).

QTL detection
A total of eleven putative QTL associated with KL (six QTL) and KW (five QTL) were identified in the AM population and they were located on chromosomes 1B, 2A, 2B, 3B, 4B, 6A, 6B, and 7A (Table 3).
The six QTL for KL explained 4.56 to 44.28% of the phenotypic variance. QKL.sicau-AM-3B, a major and stable locus, was detected in five environments and in the BLUP data, and explained 17.57 to 44.28% of the phenotypic variance. The positive allele was from LM001 (Table 3). The remaining five QTL, detected in one or two environments, explained between 4.56 and 18.59% of the phenotypic variance.
Furthermore, the five QTL for KW explained 13.91 to 39.01% of the phenotypic variance. QKW.sicau-AM-4B, a major QTL, was detected in all six environments and in the BLUP data. This locus explained 13.91 to 39.01% of the phenotypic variance, and the positive allele was contributed by Ailanmai (Table 3). The other four QTL were detected in fewer than three environments and explained between 22.96 and 29.82% of the phenotypic variance (Table 3). Furthermore, twenty-eight QTL were detected using QE interaction analysis (Table S4). QKL.sicau-AM-3B controlling KL and QKW.sicau-AM-4B controlling KW were identified by both multi-environment and individual-environment analyses, further showing that they were major and stable QTL.
According to the polymorphism of KASP-AX-111112626, the lines in the AM RIL population were divided into two groups: lines homozygous for the Ailanmai allele and lines homozygous for the LM001 allele (heterozygotes were excluded). The group with the positive allele of QKL.sicau-AM-3B (from LM001) had significantly greater values than that with the negative one (from Ailanmai) (Fig. 2a and b). Likewise, the lines from the CWL population were divided into two groups. The group with the positive allele of QKL.sicau-AM-3B had 3.29% higher values than that with the negative one (P < 0.05; Fig. 3d). In the MP population, the lines with the positive allele had 11.46% higher values than those with the negative one, indicating that QKL.sicau-AM-3B is indeed a major QTL controlling KL (Fig. 3a).
Additionally, according to the polymorphism of KASP-AX-108974756, the lines from the AM population were also divided into two groups. The group with the positive allele of QKW.sicau-AM-4B had significantly higher values than that with the negative one in six environments and the BLUP data set (P < 0.05; Fig. 2c, d). In the CWC population, the group with the positive allele of QKW.sicau-AM-4B had significantly greater values (by 0.78%) than that with the negative one (P < 0.05; Fig. 3e). In the AAs and API populations, the group with the positive allele from Ailanmai had significantly greater values than that without this allele; the differences between the two groups were 10.17 and 10.08%, respectively, with an average of 10.13% across the two validation populations, indicating that QKW.sicau-AM-4B is also a major QTL controlling KW (Fig. 3b, c).

Relationship between kernel size and other agronomic traits
In this study, we evaluated the correlation coefficients between kernel size and other agronomic traits (Fig. S2). Positive and significant correlations were observed between both kernel-size traits (KL and KW) and TKW (P < 0.05; Fig. S2d, l). This result indicated that selection for larger kernels might lead to indirect selection for heavier kernels [29]. Kernel size traits such as KL and KW greatly influence TKW. For example, Cui et al. found that, compared with other kernel traits, KW has the largest effect on TKW [30]. Liu et al. also reported that TKW was mainly affected by KW [31]. In the current study, KL likely contributed more to TKW than KW (Fig. 5; Table S5), suggesting that, at the tetraploid level, increasing KL through the positive allele of QKL.sicau-AM-3B may be more effective for increasing TKW than increasing KW through the positive allele of QKW.sicau-AM-4B. As expected, KL and KW were positively correlated with UIL (Fig. S2c, k). A longer UIL contributes to ventilation, light transmittance, and lower relative humidity around the spikes, thus reducing the likelihood of diseases and insect pests such as scab, which is conducive to dry matter accumulation and affects kernel size [32]. KL and FLL showed a significant positive correlation, and KW was positively correlated with FLW (P < 0.05; Fig. S2g, o). Theoretically, FLL and FLW determine flag leaf area, which is related to the strength of the assimilation tissue and the vascular bundle area; these factors determine the kernel filling intensity of wheat, which is closely correlated with kernel size [33]. Furthermore, the results indicated that larger flag leaves increased yield by providing more photosynthate to the kernels [34]. These findings provide a scientific basis for evaluating the complex relationships among wheat yield components, which will be helpful in understanding how wheat yield can be increased.

Stable and novel QTL controlling KL and KW
We compared the major QTL identified in this study with those detected in previous studies through aligning physical positions of their closest markers (Table S6).

Predictive genes in the intervals where major QTL were located
QKL.sicau-AM-3B was located between 675.6 and 695.4 Mb on wild emmer 3BL and between 670.4 and 690.5 Mb on CS 3BL by anchoring its flanking markers AX-111112626 and AX-110375013 (Fig. 4a, b, c, and d). There were twenty-nine shared predicted genes (Table S7). Expression analyses showed that twenty-four genes can be expressed in kernel (Fig. S3a). Similarly, QKW.sicau-AM-4B was mapped between 444.2 and 474.0 Mb on chromosome arm 4BL of wild emmer and between 432.7 and 462.5 Mb on chromosome arm 4BL of CS by anchoring its flanking markers AX-108974756 and AX-110915030 (Fig. 4e, f, g and h). There were forty shared predicted genes (Table S7). Expression analyses showed that thirty-four genes can be expressed in kernel (Fig. S3b).
Of these sixty-nine genes, four were involved in kernel development. For example, TRIDC3BG062390 encoded fructose-bisphosphate aldolase (FBA) (Fig. S3a). FBA is an important enzyme involved in plant metabolism, and it is directly involved in the fixation and distribution of photosynthate [42]. Cytosolic and plastidic FBAs were expressed in plant photosynthetic tissues [43]. FBA regulates kernel size development by affecting plant photosynthesis. In addition, there were two wild emmer genes lacking functional annotation, TRIDC4BG037810 and TRIDC4BG037830. Nonetheless, they were highly expressed in kernel at different growth stages (Fig. S4a, b). Therefore, we identified the annotations of their orthologs in CS [28]. TraesCS4B03G0584900 (TRIDC4BG037810) and TraesCS4B03G0585000 (TRIDC4BG037830) encoded heat-shock protein (HSP; Table S7) and were also highly expressed in kernel (Fig. S4c, d). HSP has been widely reported in graminaceous plants [44]. At high temperature, the role of HSP is to ensure the normal growth of kernels in wheat by protecting soluble starch synthase [45]. It was reported that HSP, as a molecular chaperone, aids in refolding soluble starch synthases denatured by heat and thus prevents them from aggregating, which is beneficial to starch synthesis in the kernel [46]. Thus, these genes related to kernel development may provide information for fine mapping and gene cloning of the identified major and novel QTL.

Utilization of elite alleles for kernel size from wheat related species
In the current study, two major, stably expressed, and novel QTL for kernel-related traits, QKL.sicau-AM-3B and QKW.sicau-AM-4B, were identified from a wild emmer accession and a local landrace and validated in five populations with different genetic backgrounds. The combination of QKL.sicau-AM-3B and QKW.sicau-AM-4B had the largest additive effect on TKW (Fig. 5). These results suggest that they have great potential in wheat breeding. Previous studies showed that pyramiding of elite genes was an effective method to improve a given trait [47]. In this study, we observed transgressive segregation in the AM RIL population. For example, AM-3 has longer and wider kernels than both parents (Fig. 1). Interestingly, AM-3 carries the positive alleles of both QKL.sicau-AM-3B and QKW.sicau-AM-4B, implying the possibility of pyramiding these two positive alleles from wheat related species in wheat breeding.

Conclusions
Two major and novel QTL, QKL.sicau-AM-3B and QKW.sicau-AM-4B, were identified in the AM RIL population. Both were successfully verified in their corresponding validation populations with newly developed KASP markers. Some genes involved in the regulation of kernel growth and development were detected in the intervals where the major KL and KW QTL were located. Significant correlations between kernel size and other agronomic traits were detected and discussed. KASP markers tightly linked to the two major QTL could contribute greatly to subsequent fine mapping. This study indicated that wheat related species have great potential for wheat yield improvement.

Fig. 5 The effects of different combinations of QKL.sicau-AM-3B and QKW.sicau-AM-4B on increasing 1000-kernel weight (TKW) in the AM population. '+' and '-' represent lines with and without the positive allele of the corresponding QTL based on the genotype of flanking markers, respectively. ** Significance at the 0.01 probability level, * significance at the 0.05 probability level