QTL mapping for kernel-related traits in a durum wheat x T. dicoccum segregating population

Durum wheat breeding relies on grain yield improvement to meet future demand while coping with climate change. Kernel size and shape are the determinants of thousand kernel weight (TKW), a key component of grain yield, and understanding the genetic control of these traits supports progress in yield potential. The present study aimed to dissect the genetic network responsible for kernel size components (length, width, perimeter, and area) and kernel shape traits (width-to-length ratio and form coefficient), as well as their relationships with kernel weight, plant height, and heading date in durum wheat. Quantitative trait locus (QTL) mapping was performed on a segregating population of 110 recombinant inbred lines, derived from a cross between the domesticated emmer wheat accession MG5323 and the durum wheat cv. Latino, evaluated in four different environments. A total of 24 QTLs stable across environments were found and further grouped in nine clusters on chromosomes 2A, 2B, 3A, 3B, 4B, 6B, and 7A. Among them, a QTL cluster on chromosome 4B was associated with kernel size traits and TKW, with the parent MG5323 contributing the favorable alleles, which highlights its potential to improve durum wheat germplasm. The physical positions of the clusters, defined by projection on the T. durum reference genome, overlapped with already known genes (e.g., BIG GRAIN PROTEIN 1 on chromosome 4B). These results might provide genome-based guidance for the efficient exploitation of emmer wheat diversity in wheat breeding, possibly through yield-related molecular markers.


Introduction
Durum wheat [Triticum turgidum L. subsp. durum (Desf.) Husn., 2n = 4x = 28, AABB] is the most cultivated subspecies of the tetraploid wheats, with a global production of 38.1 million tons in 2019 (De Vita and Taranto, 2019; Beres et al., 2020; Xynias et al., 2020). Although durum wheat represents only 5%-8% of world wheat production, it is the 10th most important crop worldwide and an integral component of the Mediterranean diet because of its use in foods such as pasta, couscous, and other semolina-based products (Dhanavath and Prasada Rao, 2017; De Vita and Taranto, 2019; Arriagada et al., 2020). Because durum wheat is mostly a rain-fed crop, and given the expected climatic scenarios, increasing biomass and thousand kernel weight (TKW) are key goals for developing varieties that can outperform current cultivars under severe climatic conditions and meet rising cereal demand (De Vita and Taranto, 2019; Rahman et al., 2020; Xynias et al., 2020).
Grain yield is a quantitative trait determined by several interrelated plant and grain components. Kernel weight contributes about 20% of the genetic variation in grain yield in bread wheat (Schierenbeck et al., 2021). Moreover, seed morphology descriptors, including kernel size [i.e., length (L), perimeter (P), and area (A)] and shape (roundness), have been shown to determine grain weight and, therefore, grain yield (Cui et al., 2011; Patil et al., 2013; Qu et al., 2021; Mohammadi et al., 2021). In addition, kernel size and shape play a role in determining other quality factors for the semolina industry, such as test weight, flour yield, and milling quality (Tyagi et al., 2015; Wang et al., 2021), and ash distribution (Ficco et al., 2020). The optimum grain morphology ideotype in durum wheat has a large and spherical (thick) shape that maximizes the endosperm-to-bran ratio, whereas small-sized kernels have the lowest test weight and semolina yield (Ficco et al., 2020). Larger kernels have also shown a positive influence on seedling vigor and early growth in different crops, such as rice and bread wheat (Avni et al., 2018; Sun et al., 2020).
Understanding the genetic mechanisms that regulate grain size and shape may facilitate the selection of the ideal kernel architecture through molecular markers (Russo et al., 2014). Most work on QTL mapping for kernel-related traits (size and shape) and kernel weight has been performed in bread wheat [recently reviewed by Brinton and Uauy (2019), Cao et al. (2020), and Gupta et al. (2020)], with around 1,000 QTLs described for kernel-related traits (Singh et al., 2022) along with some fine mapping studies [e.g., Zhao et al. (2021)]. By contrast, approximately 300 QTLs for kernel-related traits have been described to date in durum wheat, located on all 14 tetraploid wheat chromosomes, according to the publicly available linkage and association mapping studies until February 2022 [based on 18 studies reviewed by Maccaferri et al. (2019), QTLome; 10 recent studies reviewed by Arriagada et al. (2020); and the works of Desiderio et al. (2019) and Mangini et al. (2021)]. Many of the detected QTLs are related to TKW, whereas less information is available on the genetic basis of kernel size and shape (Arriagada et al., 2020). In the cited studies, around 170 loci were found related to TKW, together with approximately 90 QTLs related to kernel size factors [L, width (W), P, and A] and 37 QTLs related to kernel shape [W-to-L ratio (WL) and form coefficient (FC)]. Nevertheless, as some of these major QTLs are environment-specific, they should be considered prudently in breeding programs (Arriagada et al., 2020). Indeed, some recent research is based on integrative approaches, such as meta-analysis of previous QTL studies, to identify hot spot genomic regions for yield-related traits (Arriagada et al., 2020; Yang et al., 2021; Miao et al., 2022).
Note that heading date (HD) and plant height (PH) might also affect yield and its components: PH reduction through major dwarfing genes increases harvest index, and HD delimits grain weight by marking the transition from spike development to the grain setting and filling period. Accordingly, known regulatory genes of phenology and development have shown pleiotropic effects on multiple kernel traits (Arriagada et al., 2020; Mangini et al., 2021; Haugrud et al., 2023).
Knowledge of the molecular factors controlling kernel-related traits is most extensive in rice, with approximately 20 genes already cloned. The close relationship between wheat and rice has allowed the cloning of orthologous bread wheat genes, whereas such a translational approach has not yet been applied in durum wheat. For example, TaGW2, the ortholog of OsGW2, encoding an E3 ubiquitin ligase involved in the pathway for cell wall expansion, has been demonstrated to control grain weight and kernel architecture in bread wheat (Zhai et al., 2018). Similarly, the bread wheat ortholog of the rice gene BIG GRAIN 1 was mapped on chromosome 4B and proved to be related to auxin transport and the regulation of seed growth (Liu et al., 2015; Milner et al., 2021). Most recently, Guo et al. (2022) demonstrated enhanced grain weight and grain yield in wheat upon localized overexpression of the gene TaCYP78A5, whose homologs were previously shown to affect seed size in several plant species.
The functions of the genes with a role in kernel-related traits are highly diverse and include the following: i) metabolism of growth regulators such as auxins, for example, TaTGW6 (Hu et al., 2016); ii) cell division and proliferation, as for FUWA (Chen et al., 2015); iii) carbohydrate metabolism, as for TaSus1 (Mohler et al., 2016); iv) ubiquitination processes, as for the protein encoded by TaSDIR1-4A; and v) transcriptional regulation, as for the transcription factor TaGL3A (Yang et al., 2019).
The present study was conceived to unlock and dissect genetic variability behind kernel morphological traits and TKW, by performing a QTL mapping analysis on a recombinant inbred line (RIL) population derived from a cross between a T. dicoccum accession and a durum wheat cultivar. The research work included: i) the high-throughput phenotyping of kernels by image analysis, ii) the identification of QTL regions for the related traits and further identification of QTL clusters, iii) a comparison of these QTLs with previous genetic knowledge to highlight stable and novel QTL regions for the mentioned traits, and iv) the identification of candidate genes within the physical positions of the QTL clusters.

Phenotypic evaluation of the RIL population
The parental lines and the RIL population were evaluated in four different environments (location × year) including Valenzano (BA, Italy) in 2012-2013 (V13), Bologna (BO, Italy) in 2013-2014 (B14), and in Fiorenzuola d'Arda (PC, Italy) in 2014-2015 (F15) and 2019-2020 (F20). A randomized complete block design was developed with two replications for V13 and B14 and three replications for F15 (then reduced to two replications, due to field experimental issues) and F20. Each experimental unit consisted of a single 1-m row with 20-25 plants each. Trials were fertilized following the standard agronomic practices for each location, and weeds were chemically controlled. For further analysis, 110 RILs were considered in each environment, except for F20, where only 103 lines were harvested.
The phenotypic characterization was performed on a random sample of 100 kernels for each experimental unit. Each sample was scanned with an Epson Expression 10000XL scanner. Subsequently, the kernel morphology descriptors presented in Table 1 were analyzed with the software WinSEEDLE™ Pro Version 2011a (Regent Instruments Canada Inc.). In addition, for B14, F15, and F20, TKW, PH, and HD were scored for each experimental unit. PH was scored at maturity and included the spikes. HD was recorded as the number of days between 1 April and the day when 50% of the tillers within a plot had the spike emerged from the flag leaf. Three samples of 100 kernels were randomly chosen from the seed bulk of each experimental unit and weighed, and the mean value was used to calculate the corresponding TKW, expressed in g/1,000 seeds.
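The TKW calculation described above can be sketched as follows. This is only a minimal illustration, assuming that the three 100-kernel weights are averaged and then scaled to 1,000 kernels; the function name and the sample weights are hypothetical and do not come from the study.

```python
from statistics import mean


def thousand_kernel_weight(sample_weights_g, kernels_per_sample=100):
    """Return TKW (g per 1,000 kernels) from replicate sample weights.

    Assumption: replicate weights are averaged, then scaled to 1,000 kernels.
    """
    if not sample_weights_g:
        raise ValueError("at least one sample weight is required")
    return mean(sample_weights_g) * 1000 / kernels_per_sample


# Hypothetical weights (g) of three 100-kernel samples from one experimental unit.
example_tkw = thousand_kernel_weight([4.8, 5.0, 5.2])
print(f"TKW = {example_tkw:.1f} g/1,000 seeds")
```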

Statistical analysis
The statistical analyses were performed in R software (R Core Team, 2021) using the phenotypic data from each environment and for each trait. The Shapiro-Wilk test and Student's t-test were performed, using the R/rstatix package (Kassambara, 2021), to evaluate the normality of the data and the differences between parents, respectively. Descriptive statistical analyses and analysis of variance (one-way ANOVA, p < 0.05) were performed to determine the effect of the RILs. Repeatability for each environment was estimated for each trait. Adjusted overall means across environments [best linear unbiased predictions (BLUPs)] were calculated by fitting a linear mixed model through the gamem_met function in R/metan. In this model, environment was considered a fixed effect, whereas genotype and genotype × environment interaction (GEI) were considered random effects. The model was formalized as follows:

$$y_{ijk} = \mu + \mathrm{Gen}_k + \mathrm{Env}_i + \mathrm{Rep}_j(\mathrm{Env}_i) + \mathrm{Gen}_k \times \mathrm{Env}_i + \epsilon_{ijk}$$

where $y_{ijk}$ is the response variable (that is, the phenotypic value of the trait of interest) measured for genotype k in environment i and replicate j; $\mu$ is the overall mean; $\mathrm{Env}_i$ is the effect of environment i; $\mathrm{Rep}_j(\mathrm{Env}_i)$ is the effect of replicate j within environment i; $\mathrm{Gen}_k$ is the effect of genotype k; $\mathrm{Gen}_k \times \mathrm{Env}_i$ is the interaction effect between genotype k and environment i; and $\epsilon_{ijk}$ is the error associated with each phenotypic value $y_{ijk}$. This model assumes that the random effects of Gen follow a normal distribution with mean 0 and variance $\sigma^2_g$. Likewise, the model assumes $\epsilon_{ijk} \sim N(0, \sigma^2_\epsilon)$, that is, the error terms are independent and normally distributed with mean 0 and variance $\sigma^2_\epsilon$. The across-environment broad-sense heritability ($H^2$) was estimated from the variance components of the BLUP model, where $\sigma^2_g$ is the genotypic variance, $\sigma^2_i$ is the GEI variance, and $\sigma^2_r$ is the residual variance. Pearson's correlation coefficients were calculated for all trait combinations based both on the data recorded for each environment and on BLUPs. All these analyses were performed by using the R/metan package (Olivoto and Lúcio, 2020).

Table 1. Kernel morphology descriptors.
Descriptor | Definition | Trait category (unit)
Length (L) | The straight distance between the two farthest points on the projected image perimeter. | Kernel size (mm)
Width (W) | The maximum width measured perpendicular to length. | Kernel size (mm)
Perimeter (P) | The length of the seed's outline. | Kernel size (mm)
Area (A) | The two-dimensional area occupied by the seed projection. | Kernel size (mm²)
Width-to-Length Ratio (WL) | The comparison of the width and length. | Kernel shape
Form Coefficient (FC) | Indicates the seed shape through the formula 4πA/P², where A is area and P is perimeter, with a value of 0 for a filiform object and 1 for a perfect circle. | Kernel shape
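The two shape descriptors defined in Table 1 follow directly from the size descriptors. The sketch below computes them from the stated formulas; the function names and the example measurements are hypothetical, chosen only to fall within the range of values reported for the RIL means.

```python
import math


def width_to_length_ratio(width_mm, length_mm):
    """WL: kernel width divided by kernel length (dimensionless)."""
    return width_mm / length_mm


def form_coefficient(area_mm2, perimeter_mm):
    """FC = 4*pi*A / P**2: 1 for a perfect circle, approaching 0 for a filiform object."""
    return 4 * math.pi * area_mm2 / perimeter_mm ** 2


# Hypothetical kernel: L = 8.1 mm, W = 3.1 mm, P = 19.0 mm, A = 19.0 mm^2.
print(f"WL = {width_to_length_ratio(3.1, 8.1):.2f}")
print(f"FC = {form_coefficient(19.0, 19.0):.2f}")
```

A circle of any radius r has A = πr² and P = 2πr, so the formula returns exactly 1 in that case, which is a quick sanity check of the implementation.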
Valladares García et al. 10.3389/fpls.2023.1253385 Frontiers in Plant Science frontiersin.org
quality control using FastQC version 0.11.7 (Andrews, 2010). Subsequently, reads containing adapter sequences were discarded using cutadapt version 1.17 (Martin, 2011), and the resulting FASTQ files were trimmed to a base quality of 10 from both ends with TRIMMOMATIC version 0.30 (Bolger et al., 2014) using the following parameters: LEADING = 10, TRAILING = 10, SLIDINGWINDOW = 4, and MINLEN = 50. Filtered reads were aligned to the reference sequence of T. turgidum L. ssp. durum, cv. Svevo (Maccaferri et al., 2019) using the Burrows-Wheeler Aligner (BWA-MEM) version 0.7.15 with default parameters (Li and Durbin, 2009), and duplicated reads were marked in the alignment file using the "MarkDuplicates" command of the Picard software (http://broadinstitute.github.io/picard). Genetic variants were called from the resulting alignments with marked duplicated reads using the SAMtools/BCFtools pipeline version 1.7, with the BCFtools call parameters set to -m and -v. In addition to the SAMtools/BCFtools pipeline, FreeBayes version 1.0.0 (Garrison and Marth, 2012) was used for single-nucleotide polymorphism (SNP) calling with default parameters. SNP calls detected by both the SAMtools/BCFtools pipeline and FreeBayes were subsequently filtered to retain variants supported by more than 20 reads and with a mapping quality higher than 50. The subset of filtered SNPs called by both the SAMtools/BCFtools pipeline and FreeBayes was considered for further analyses. Along with SNPs, the pipeline identified raw small indels, which were hard filtered using the same parameters used for the other genetic variants. The annotation and prediction of the functional effects of the genetic variants were carried out using the SnpEff toolbox version 4.3t (Cingolani et al., 2012) and the Svevo genome annotations (Maccaferri et al., 2019), including low-confidence and high-confidence annotated genomic features.
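The consensus filtering described above (variants called by both callers, supported by more than 20 reads, with mapping quality above 50) can be illustrated with a small sketch. It does not parse real VCF files: the in-memory record layout, the names `Call` and `consensus_variants`, and the example variants are all hypothetical, and applying the thresholds to each caller's record is an assumption.

```python
from typing import Dict, List, NamedTuple, Tuple

VariantKey = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)


class Call(NamedTuple):
    depth: int   # number of supporting reads
    mapq: float  # mapping quality


def passes(call: Call, min_depth: int = 20, min_mapq: float = 50.0) -> bool:
    """Keep calls with more than min_depth reads and mapping quality above min_mapq."""
    return call.depth > min_depth and call.mapq > min_mapq


def consensus_variants(
    bcftools_calls: Dict[VariantKey, Call],
    freebayes_calls: Dict[VariantKey, Call],
) -> List[VariantKey]:
    """Return variants present in both call sets that pass the filters in each."""
    shared = bcftools_calls.keys() & freebayes_calls.keys()
    return sorted(
        key for key in shared
        if passes(bcftools_calls[key]) and passes(freebayes_calls[key])
    )


# Hypothetical call sets for illustration only.
bcftools = {
    ("chr4B", 1000, "A", "G"): Call(35, 60.0),
    ("chr4B", 2000, "C", "T"): Call(12, 60.0),  # too few reads
    ("chr7A", 500, "G", "A"): Call(40, 60.0),   # missing from FreeBayes
}
freebayes = {
    ("chr4B", 1000, "A", "G"): Call(33, 58.0),
    ("chr4B", 2000, "C", "T"): Call(14, 59.0),
}
print(consensus_variants(bcftools, freebayes))
```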

QTL mapping
The high-density genetic map used in this study was previously constructed (Desiderio et al., 2014; Maccaferri et al., 2015), with a total of 10,840 markers assembled into 14 linkage groups corresponding to the 14 durum wheat chromosomes and an overall map length of 2,363.4 cM. For each trait, the R/qtl package (Broman et al., 2003) was used for QTL analysis with the mean values of each genotype in each single environment and with the BLUP values as adjusted means for the combined data. The procedure described by Desiderio et al. (2019) was performed as follows: (i) a permutation test to define the logarithm of odds (LOD) threshold at a genome-wide significance level of 5% after 1,000 permutations; (ii) an initial scan of the genome using simple interval mapping (SIM) with a 1-cM step; (iii) evaluation of the position and effect of the QTLs with the multiple imputation method [composite interval mapping (CIM)]; and (iv) the "addqtl" command to search for additional QTLs. When more than one QTL was identified for the trait under consideration, a model containing the QTLs and their possible interactions was tested with the "addint" command. If these putative loci remained significant, then the "refineqtl" command re-evaluated the QTL positions based on the full model.
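The logic of step (i), a permutation-based genome-wide LOD threshold, can be sketched in a few lines. This is a simplified illustration and not the R/qtl implementation: it uses single-marker regression on fully genotyped markers instead of interval mapping, the data are simulated, and all names are hypothetical. Phenotypes are shuffled relative to genotypes, the maximum LOD across markers is recorded for each permutation, and the 95th percentile of these maxima gives the 5% threshold.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """LOD score of a single-marker regression with two homozygous classes (0/1)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_alt = 0.0
    for allele in (0, 1):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        if group:
            group_mean = sum(group) / len(group)
            rss_alt += sum((y - group_mean) ** 2 for y in group)
    if rss_null == 0.0:
        return 0.0  # no phenotypic variation to explain
    if rss_alt == 0.0:
        return math.inf  # the marker explains all the variation
    return (n / 2) * math.log10(rss_null / rss_alt)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold: (1 - alpha) quantile of the per-permutation maximum LOD."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the genotype-phenotype association
        max_lods.append(max(marker_lod(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return max_lods[index]


# Simulated population: 110 lines, 10 unlinked markers, one QTL at marker index 3.
rng = random.Random(1)
n_lines, n_markers = 110, 10
markers = [[rng.randint(0, 1) for _ in range(n_lines)] for _ in range(n_markers)]
phenotypes = [2.0 * markers[3][i] + rng.gauss(0.0, 0.5) for i in range(n_lines)]

# 200 permutations keep the demo fast; the study used 1,000.
threshold = permutation_threshold(markers, phenotypes, n_perm=200)
lods = [marker_lod(m, phenotypes) for m in markers]
print(f"5% genome-wide LOD threshold: {threshold:.2f}")
print(f"LOD at the simulated QTL marker: {lods[3]:.2f}")
```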
The additive effects of QTLs were estimated as half the difference between the phenotypic values of the respective homozygotes. If a QTL was found close to the threshold estimated by permutation and co-located with a significant QTL, then it was considered a putative QTL. The confidence intervals (CIs) of each QTL were determined as proposed by Darvasi and Soller (1997). Next, for each trait, QTLs found in more than one environment were considered to correspond to the same stable QTL, provided that their CIs overlapped and that the additive effect was conferred by the same parent. Furthermore, QTLs were named according to the rule "Q + trait code + chromosome.locus number," where Q stands for QTL, trait code refers to the trait acronym presented in Table 1, and the last element refers to the wheat chromosome on which the corresponding QTL is located. If two or more QTLs for the same trait were located on the same chromosome, then a consecutive number (".1", ".2", ".3") was added.
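The additive effect estimate and the naming rule described above can be expressed compactly. The sketch below is illustrative: the function names, the sign convention (first class minus second class), the hyphen between trait code and chromosome (inferred from names such as QP-7A.1), and the phenotypic values are assumptions rather than details stated in the methods.

```python
from statistics import mean


def additive_effect(values_class_a, values_class_b):
    """Half the difference between the mean phenotypes of the two homozygous classes.

    Assumed sign convention: positive when the class A allele increases the trait.
    """
    return (mean(values_class_a) - mean(values_class_b)) / 2


def qtl_name(trait_code, chromosome, locus_number=None):
    """Build a name such as 'QL-2B' or 'QP-7A.1' from trait code, chromosome, and locus number."""
    base = f"Q{trait_code}-{chromosome}"
    return base if locus_number is None else f"{base}.{locus_number}"


# Hypothetical kernel lengths (mm) of lines carrying each parental allele at a QTL peak.
mg5323_allele = [8.4, 8.6, 8.5]
latino_allele = [8.0, 8.2, 8.1]
print(f"{qtl_name('L', '7A')}: additive effect = {additive_effect(mg5323_allele, latino_allele):.2f} mm")
print(qtl_name("P", "7A", 1))
```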

Clustering of QTL and identification of candidate genes
To compare the QTLs identified in the present study with data from the literature and to hypothesize candidate genes, the durum wheat reference genome was used as a framework to combine physical and genetic information. This comparison procedure included two steps to project current knowledge onto the durum wheat reference genome: (i) updating the tetraploid QTLome provided by Maccaferri et al. (2019) with the most recent QTLs for kernel-related traits, retrieved by a literature survey of the publicly available linkage and association mapping studies until August 2022 (Desiderio et al., 2019; Arriagada et al., 2020; Mangini et al., 2021; Supplementary Material, Appendix A); and (ii) compiling a list of wheat and/or rice cloned genes with known functions affecting kernel size, shape, and kernel weight, whose sequences were used as queries in a BLAST search against the durum wheat reference genome to define their physical positions.
Further steps were then necessary to anchor the main results of the present study on the reference genome: (i) grouping the QTLs identified in the present study by defining QTL clusters as regions where QTLs for different traits co-located in the Latino x MG5323 map (based on total or partial overlap of CIs) and defining the QTL with the highest LOD and R² values within each cluster as the major QTL of that cluster; (ii) initial anchoring of the peak and flanking SNP markers of the best QTL of each cluster on the tetraploid wheat consensus map, defining the related genetic position, and selecting the coinciding/nearest QTL from previous studies; (iii) for each cluster, projecting the CI on the durum wheat reference genome assembly (Svevo.v1; Maccaferri et al., 2019) by BLASTing the nucleotide sequences of the CI flanking markers against Svevo.v1 at https://plants.ensembl.org, after an intermediate step on the consensus map of tetraploid wheat (Maccaferri et al., 2015), in order to use more, and more reliable, markers as a bridge and thus increase the consistency and accuracy of the genome projection; (iv) hypothesizing candidate genes within the physical interval of the QTL clusters by screening high-confidence Svevo genes based on their functional annotation (previously obtained via blast2GO PRO, available at https://figshare.com/s/2629b4b8166217890971); (v) in addition, based on the assumption that cvs. Latino and Svevo have high (0.85) genome sequence similarity (Mazzucotelli et al., 2020; unpublished exome data), identifying polymorphisms between the MG5323 genome and the Svevo.v1 assembly (Maccaferri et al., 2019) in the promoter and gene sequences of candidate genes and evaluating their possible effects based on SnpEff (Cingolani et al., 2012); and (vi) analyzing the bread wheat homologs of the candidate genes for their transcriptional profiles in different tissues/organs (leaf, grain, root, and spike) and at different developmental stages (seedling, vegetative, and reproductive). Bread wheat (cv. Chinese Spring) homologs on the IWGSC RefSeq v1.1 genome assembly were retrieved from the Triticeae Gene Tribe homology database (http://wheat.cau.edu.cn/TGT/) (Chen et al., 2020), whereas gene expression data were downloaded from the ExpVIP platform (Wheat Expression Browser, www.wheatexpression.com; Ramirez-Gonzales et al., 2018), which collects published transcriptome data on bread wheat. Transcript abundances were expressed in log2 (transcripts per million).
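As a small illustration of the expression scale mentioned above, the sketch below converts transcripts-per-million (TPM) values to a log2 scale. The pseudocount of 1, which keeps zero expression defined, is an assumption of this sketch and is not stated in the text; the function name and the TPM values are hypothetical.

```python
import math


def log2_tpm(tpm, pseudocount=1.0):
    """Return log2(TPM + pseudocount); the pseudocount is an assumption of this sketch."""
    if tpm < 0:
        raise ValueError("TPM cannot be negative")
    return math.log2(tpm + pseudocount)


# Hypothetical TPM values of one gene in four tissues.
expression_tpm = {"leaf": 0.0, "grain": 7.0, "root": 1.0, "spike": 31.0}
for tissue, tpm in expression_tpm.items():
    print(f"{tissue}: {log2_tpm(tpm):.1f}")
```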

Phenotypic characterization of the RIL population
The two parents and the RILs were evaluated for traits related to kernel morphology, size, and weight, as well as for HD and PH, in four environments, and BLUPs across the four environments were calculated (Table 2). This analysis showed significant differences (p < 0.01) between the two parental lines in each environment and across them for most of the traits, except for A, for which the difference was statistically significant only in B14. As expected, MG5323 showed greater values for L and P (longer and narrower kernels), PH, and HD and a lower kernel weight compared with the cv. Latino. Furthermore, regarding the RIL mean values in the different environments, L ranged from 7.9 mm to 8.3 mm, W from 3.0 mm to 3.2 mm, P from 18.8 mm to 19.6 mm, A from 18.6 mm² to 20.3 mm², WL from 0.38 to 0.41, FC from 0.65 to 0.67, TKW from 44.3 g to 53.5 g, HD from 30 to 45 days, and PH from 100.8 cm to 106.9 cm. Broad-sense heritability (H²) values calculated for each trait in each environment and on BLUPs were high (0.80-0.99), with the highest values obtained for L and the kernel shape traits (FC and WL). As depicted in Figure 2, the frequency distribution of phenotypic values for each trait in each environment and across environments suggested the contribution of several loci controlling the phenotypic variation for each trait (quantitative nature), including HD. The only exception was PH, whose bimodal distribution indicated one major gene. In addition, high transgressive segregation was observed for all traits in both directions, including TKW, which implies the presence of superior alleles for the kernel-related traits in both parents.
The analysis of variance (one-way ANOVA) detected highly significant differences among RILs for all traits in each environment (p < 0.0001, Appendix C), which indicates that genetic factors explain a large fraction of the observed phenotypic variability. However, for F15, the replication factor was also significant and larger than the genotype factor, which could imply experimental error in the field; for this reason, replication 1 was removed from the rest of the analysis. Variance components computed by BLUPs on the overall dataset across environments revealed the effects of RILs, environments, and GEI, as shown in Table 3. The variance of the environment (ENV) component ranged from 3.6% for PH to 68.5% for HD. The variance component related to genotype (GEN) ranged from 19.1% for HD to 78.6% for PH, with kernel L having the highest value among the kernel-related traits and TKW the lowest. The variance of the GEN component was greater than that of the GEI component (ranging from 7.2% to 26%) for all the traits considered in this study, suggesting that genetic factors contributed largely to the phenotypic variability, as also shown by the moderate to high heritability computed for all traits (Table 3).
Correlation analysis was performed among the nine evaluated traits for the phenotypic data of each environment (Appendix D) and for BLUPs (Figure 3). The kernel-related traits can be divided into primary traits (L and W) and secondary traits (P, A, WL, and FC); the latter are derived from combinations of the primary traits, which leads to inherent correlations between them. In detail, for each environment and for BLUPs, kernel L was the main feature related to the secondary size traits A (r ≈ 0.7) and P (r ≈ 1). Meanwhile, kernel W was the main trait for the WL and FC attributes (r ≈ 0.7). Interestingly, TKW showed a significant and highly positive correlation with A (r ≈ 0.9) and W (r ≈ 0.8) and a moderate positive correlation with L (r ≈ 0.4) and P (r ≈ 0.5) for all environments and BLUPs. The correlations between TKW and kernel shape traits were significant (p < 0.05), with r-values lower than those of the traits mentioned before. In addition, PH showed a moderate positive correlation with TKW and A (r ≈ 0.3) as well as with L, W, and P (r ≈ 0.2), whereas HD was positively correlated with L and P (r ≈ 0.3) and negatively correlated with W (r ≈ −0.3), WL (r ≈ −0.4), and FC (r ≈ −0.5).

QTL analysis
QTL analysis was performed for all traits recorded in the four individual environments (V13, B14, F15, and F20) and on BLUPs, finding a total of 100 individual significant QTLs and five suggestive/putative QTLs (Appendix E). The number of QTLs identified was 15, 19, 24, and 18 in V13, B14, F15, and F20, respectively, whereas 24 QTLs were identified on BLUPs. The explained phenotypic variance ranged from 6.9% to 22%, with an average of 12.5%, for kernel size/weight-related traits and from 6.8% to 42.7% for HD. The highest average explained variation was found for P (14.2%), whereas the lowest was calculated for FC (10.7%). QTLs for kernel traits were found on all chromosomes, with the exception of chromosome group 1 and chromosomes 5B, 6A, and 7B. QTLs for HD were mostly distributed on chromosome groups 2 and 5, in addition to one QTL each on chromosomes 7B and 3A, whereas QTLs for PH were all on 4B. Many QTLs were coincident or close together, suggesting that the same genomic region was the genetic determinant of the same trait in different environments and, thus, that stable QTLs were identified. Therefore, for each trait, QTLs whose peak markers were less than 10 cM apart and/or had overlapping CIs were considered to correspond to the same QTL, provided that the additive effect was conferred by the same parent (Table 4). The 10-cM threshold was prudently chosen on the basis of the size of the largest QTL CI calculated for the 100 identified QTLs. An exception was made for the QTLs for L, P, A, and TKW on chromosome 4B identified at 63 cM in F20. These QTLs were grouped with the QTLs for the same traits identified at about 79-82 cM in B14, F15, and BLUPs, considering the greater consistency of the latter QTLs and attributing the shift to the reduced number of RILs used in F20.
Figure 2. Frequency distribution for the nine phenotypic traits analyzed for each environment (V13, B14, F15, and F20) and BLUPs (overall data). Traits are denoted as L, length; W, width; P, perimeter; A, area; WL, width-to-length ratio; FC, form coefficient; TKW, thousand kernel weight; HD, heading days; and PH, plant height. For more details on the trait description, please refer to Table 1 and the Materials and Methods section.
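The grouping rule described above (same trait, same favorable parent, and peaks less than 10 cM apart and/or overlapping CIs) can be sketched as follows. This is a simple greedy grouping written for illustration, not the authors' procedure; the class and function names and the example QTL positions are hypothetical, and the manual exception made for chromosome 4B in F20 is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QtlHit:
    trait: str
    chromosome: str
    environment: str
    peak_cm: float
    ci_start_cm: float
    ci_end_cm: float
    favorable_parent: str


def same_qtl(a, b, max_peak_distance_cm=10.0):
    """True if two hits satisfy the grouping rule for one stable QTL."""
    if (a.trait, a.chromosome, a.favorable_parent) != (b.trait, b.chromosome, b.favorable_parent):
        return False
    peaks_close = abs(a.peak_cm - b.peak_cm) < max_peak_distance_cm
    cis_overlap = a.ci_start_cm <= b.ci_end_cm and b.ci_start_cm <= a.ci_end_cm
    return peaks_close or cis_overlap


def group_hits(hits):
    """Greedily assign each hit to the first existing group containing a matching hit."""
    groups = []
    for hit in sorted(hits, key=lambda h: (h.trait, h.chromosome, h.peak_cm)):
        for group in groups:
            if any(same_qtl(hit, member) for member in group):
                group.append(hit)
                break
        else:
            groups.append([hit])
    return groups


# Hypothetical hits for kernel length on chromosome 7A.
hits = [
    QtlHit("L", "7A", "V13", 50.0, 46.0, 54.0, "MG5323"),
    QtlHit("L", "7A", "B14", 55.0, 51.0, 60.0, "MG5323"),
    QtlHit("L", "7A", "F20", 90.0, 86.0, 95.0, "MG5323"),
]
for group in group_hits(hits):
    print([h.environment for h in group])
```

Because the grouping is greedy, a hit that bridges two already separate groups would not merge them; a union-find structure would be a more robust choice for large result tables.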
After grouping, a total of 42 different QTLs were defined (Table 4), and, among them, 24 were identified in at least one environment and by BLUPs and were therefore considered environmentally stable. The 42 QTLs were distributed on 11 of the 14 chromosomes of the MG5323 x Latino linkage map. In more detail, for the kernel size traits (L, W, P, and A), a total of 21 QTLs were identified on chromosomes 2A, 2B, 3A, 3B, 4A, 4B, 5A, 6B, and 7A. Overall, nine loci were detected for kernel shape traits (WL and FC), located on chromosomes 2A, 2B, 3A, and 7A. For TKW, five QTLs were identified: two located on chromosome 3B, two on chromosome 4B, and one on 6B. For HD, six QTLs were found, mapped on chromosomes 2A, 2B, 3A, 5A, 5B, and 7B. For PH, only one major QTL on 4B was found. Regarding the effect, for 23 of the 42 QTLs, the allele with a positive additive effect was contributed by the parent MG5323: four QTLs associated with L, two with W, six with P, two with A, two with TKW, six with HD, and one with PH. Meanwhile, the parent Latino carried all the alleles increasing the kernel shape traits (WL and FC) and thus conferring more roundness to the kernels. In all the QTLs detected for HD, the alleles with a positive effect were contributed by the parent MG5323, which is indeed the late parent. Major/moderate QTLs (with R² above 15%) were found, including three for L (QL-2B, QL-4B, and QL-7A), one for W (QW-7A), two for P (QP-4B and QP-7A.1), two for A (QA-4B and QA-6B), one for WL (QWL-7A), one for FC (QFC-7A), one for TKW (QTKW-6B), and two for HD (QHD-2A and QHD-2B). No significant epistatic interactions were identified in this study.
The 24 environmentally stable QTLs are reported in Table 4. Interestingly, the most stable regions, identified in all environments and across them, were located on chromosomes 7A and 2B (QL-7A, QWL-2B, QWL-7A, and QFC-7A), in addition to QHD-2B, QHD-7B, and QPH-4B, which were identified in three environments and by BLUPs. The trait with the poorest stability was kernel W, for which five of the six QTLs identified were detected in only one or two environments.

QTL clusters
Because of the geometrical or biological nature of the relationships between the traits under study, co-location of QTLs for different traits was expected, as also suggested by Pearson's correlation analysis. Such co-location suggests the pleiotropic effect of a single gene or of a set of linked genes on multiple related traits. Thus, 83 of the initial 100 QTLs were grouped into nine different QTL clusters, defined as regions where QTLs for different traits co-located, with their CIs fully or partially overlapping (Table 5). Furthermore, the genetic position on the tetraploid wheat consensus map (Maccaferri et al., 2015) of each QTL of the nine clusters was obtained by projecting the molecular markers of the CIs, corroborating the co-location of these loci. The nine clusters identified were located on chromosomes 2A (cluster 1), 2B (clusters 2 and 3), 3A (cluster 4), 3B (cluster 5), 4B (clusters 6 and 7), 6B (cluster 8), and 7A (cluster 9).
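The cluster definition above (QTLs for different traits whose CIs fully or partially overlap) amounts to merging overlapping intervals per chromosome and keeping the merged regions that contain more than one trait. The sketch below illustrates this under stated assumptions: the tuple layout, the function name, and the CI positions are hypothetical, and only the QTL names echo the naming used in the study.

```python
def cluster_by_ci(qtls):
    """Group QTLs with overlapping CIs on the same chromosome.

    qtls: iterable of (name, trait, chromosome, ci_start_cm, ci_end_cm).
    Returns only the groups that involve at least two different traits.
    """
    clusters = []
    current, current_chrom, current_end = [], None, None
    for qtl in sorted(qtls, key=lambda q: (q[2], q[3])):
        _, _, chrom, start, end = qtl
        if current and chrom == current_chrom and start <= current_end:
            current.append(qtl)  # CI overlaps the region merged so far
            current_end = max(current_end, end)
        else:
            if current:
                clusters.append(current)
            current, current_chrom, current_end = [qtl], chrom, end
    if current:
        clusters.append(current)
    return [c for c in clusters if len({q[1] for q in c}) > 1]


# Hypothetical CIs (cM); positions are illustrative, not taken from Table 5.
example_qtls = [
    ("QL-4B", "L", "4B", 76.0, 84.0),
    ("QP-4B", "P", "4B", 78.0, 86.0),
    ("QTKW-4B.2", "TKW", "4B", 80.0, 88.0),
    ("QPH-4B", "PH", "4B", 20.0, 30.0),
    ("QW-7A", "W", "7A", 10.0, 15.0),
]
for cluster in cluster_by_ci(example_qtls):
    print([q[0] for q in cluster])
```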
Clusters 1 and 2, constituted by five and eight QTLs, respectively, were the only ones found related to HD and shape/size kernel traits (WL, FC, P, and W). Clusters 3, 4, and 9 included 11, 7, and 20 QTLs, respectively, and highlighted the expected geometrical relationship between main traits and their derivative ones. Most of these clusters indicated independence of the L and W traits, except for cluster 9, where QTLs for both traits were detected. Meanwhile, clusters 5, 6, 7, and 8, composed of 3 to 14 different QTLs, were associated with kernel size/shape and TKW. Two major and consistent clusters were found on chromosome 4B, associating PH, W, and TKW in cluster 6 as well as L, P, A, and TKW in cluster 7. In both clusters, the positive alleles of all the QTLs were derived from the emmer parent (MG5323). This was coherent with the high transgressive segregation of TKW observed in the RIL population, despite the lower TKW value of MG5323. Notably, in cluster 7, the QTLs explained from 9.6% to 12.7% of the kernel weight variance and up to 22% of the variation in the kernel size traits (L, P, and A). The availability of the durum wheat reference genome (Svevo.v1; Maccaferri et al., 2019) made it possible to define the physical intervals of the clusters identified. To this aim, the best QTLs (those with the highest LOD and R²) related to size/shape traits within each cluster were projected on the Svevo genome. In this way, the largest physical regions were detected on chromosomes 2B (cluster 3), 3A (cluster 4), and 6B (cluster 8), which spanned more than 90 Mbp (119 Mbp, 95 Mbp, and 216 Mbp, respectively). Clusters 1 and 2 spanned approximately 10 Mbp and clusters 5 and 9 about 50 Mbp, whereas the two clusters identified on chromosome 4B, clusters 6 and 7, spanned 18 Mbp and 25 Mbp, respectively (Figure 4, Table 5).
Figure 3. Pearson correlation coefficient (r) among the nine phenotypic traits analyzed using BLUPs (overall data). Traits are denoted as L, length; W, width; P, perimeter; A, area; WL, width-to-length ratio; FC, form coefficient; TKW, thousand kernel weight; HD, heading days; and PH, plant height. For more details on the trait description, please refer to Table 1.

Identification of possible candidate genes for the QTL clusters
Candidate genes were hypothesized by inspecting the functional annotations [Gene Ontology (GO) terms] of the high-confidence Svevo genes retrieved within and/or near the physical intervals of the major QTL for each cluster (Table 5). Most attention was given to genes whose GO terms are likely associated with kernel development and grain yield based on previous knowledge (e.g., hormone pathways and sugar metabolism). An updated list of known genes controlling kernel-related traits and kernel weight previously described in rice and/or wheat was also considered (Appendix B). Clusters 3, 4, and 8 were characterized by a large physical interval with a high number of annotated genes (around 500 for each cluster), which made it impossible to manually inspect each gene under the CI of each QTL; therefore, only the comparison with known genes from Appendix B was carried out for these clusters. The overall results are reported in Appendix E and represented in Figure 4.
About 300 Gb of 150-bp Illumina paired-end reads were obtained and aligned against the durum wheat genome Svevo.v1 (Maccaferri et al., 2019), yielding an average sequencing depth of 24.7 and a mean genome coverage of 98.7% (Appendix F). Overall, 11,414,704 genome-wide DNA variants were detected between MG5323 and the reference durum wheat genome, and these were inspected to support the role of selected candidate genes. This analysis assumes extensive genetic similarity between the genomes of cvs. Latino and Svevo. Indeed, Latino and Svevo showed 85% similarity when genotyped with the Illumina iSelect 90K wheat array, whereas the similarity of each of them to MG5323 was 46% (Mazzucotelli et al., 2020). Therefore, we used the Svevo.v1 genome assembly as a surrogate for the cv. Latino genome, and we assumed that polymorphisms (SNPs or small INDELs) in corresponding genes between Svevo and MG5323 were likely conserved between Latino and MG5323. Notably, as a de novo assembly of the MG5323 genome was not achievable, we cannot exclude that larger structural variations at candidate genes underlie the target traits. The analysis focused on the gene sequences and upstream regions (2,000 bp) of the 15 candidate genes, which included both the novel proposed candidates (TRITD2Bv1G019940, TRITD3Bv1G229090, TRITD3Bv1G229910, TRITD3Bv1G235190, TRITD3Bv1G239650, TRITD4Bv1G175480, TRITD4Bv1G179270, TRITD7Av1G052720, and TRITD7Av1G055870) and some known cloned genes (D61, TRITD3Av1G163790; BG1, TRITD4Bv1G171270; GW2, TRITD6Bv1G096950; FUWA, TRITD6Bv1G115800; TaSUS1, TRITD7Av1G050690; and TaGASR7, TRITD7Av1G071860) located in the CI of the identified QTLs (Table 5, Figure 4). A total of 67 SNPs between orthologous sequences of the MG5323 and Svevo genomes were identified: 46 in the upstream regions and 21 in the gene sequences (Appendix G). No SNPs were identified for TRITD2Bv1G019940 and TRITD6Bv1G115800 (FUWA).
The highest numbers of SNPs in the upstream region were identified in TRITD3Bv1G235190 and TRITD7Av1G055870 (9 and 11, respectively). Considering the gene sequences, only eight genes carried SNPs, and, among them, only three genes (TRITD3Bv1G229090, TRITD3Bv1G229910, and TRITD3Bv1G235190) had SNPs in the coding sequence (two, two, and seven SNPs, respectively). Seven of these were synonymous variants, whereas the remaining four (two in TRITD3Bv1G229090, one in TRITD3Bv1G229910, and one in TRITD3Bv1G235190) were classified as missense variants. Three of the missense variants changed the chemical characteristics of the amino acid (Leu/Gln, Glu/Gln, and Val/Ile), with possible consequences for protein folding and/or activity.
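The split between upstream and gene-sequence variants described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function name `classify_variant`, the example coordinates, and the strand handling are assumptions made for illustration, and only the 2,000-bp upstream window comes from the text.

```python
UPSTREAM_BP = 2000  # upstream window size stated in the text


def classify_variant(pos, gene_start, gene_end, strand="+", upstream_bp=UPSTREAM_BP):
    """Return 'gene', 'upstream', or 'outside' for a 1-based variant position.

    gene_start <= gene_end are 1-based inclusive reference coordinates.
    For a '-' strand gene, the upstream window lies beyond gene_end.
    (Hypothetical helper, not part of any published tool.)
    """
    if gene_start <= pos <= gene_end:
        return "gene"
    if strand == "+":
        if gene_start - upstream_bp <= pos < gene_start:
            return "upstream"
    else:
        if gene_end < pos <= gene_end + upstream_bp:
            return "upstream"
    return "outside"


# Hypothetical forward-strand gene spanning 10,000-12,000 bp
positions = (7_999, 8_000, 9_999, 10_000, 12_001)
example = [classify_variant(p, 10_000, 12_000) for p in positions]
```

Counting the labels returned for all variants of a gene would give per-gene tallies comparable in form to the upstream and gene-sequence counts reported above.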
After identifying the bread wheat orthologs of the candidate genes, the bread wheat gene expression atlas available through ExpVIP was inspected to gain functional evidence supporting the candidates. Bread wheat homologs were identified for all candidates (Appendix G) but one (TRITD4Bv1G175480). The expression profiles of these genes were analyzed in silico across plant organs and developmental stages, considering both relevant (spikes, grains, and reproductive stage) and non-relevant (leaves, roots, and vegetative stage) tissues (Figure 5). The most highly expressed gene was TaSUS1/TraesCS7A02G158900; the other genes can be classified into three groups according to their general expression profile. One group contained genes with a generally medium expression level (D61/TraesCS3A02G245000, FUWA/TraesCS6B02G235400, GW2/TraesCS6B02G215300, TraesCS3B02G470300, and TraesCS3B02G452800); a second group included genes with a generally low expression level (BG1/TraesCS4B02G292300, TraesCS2B02G079600, TraesCS3B02G450900, TraesCS3B02G353200, TraesCS3B02G462900, TraesCS4B02G312300, TraesCS7A02G175200, and TraesCS4B02G307400); lastly, two genes (TaGASR7/TraesCS7A02G208100 and TraesCS7A02G164000) showed higher expression in specific organs (leaf and spikes). Of particular interest is the expression profile of some novel candidate genes that showed a specific induction in spikes at the vegetative (TraesCS2B02G079600 and TraesCS7A02G164000) or reproductive plant stage (TraesCS3B02G353200).
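As an illustration of how genes could be sorted into coarse expression groups from log2-transformed tpm values, a minimal sketch follows. The pseudocount, the two cutoffs, and the example tpm values are all hypothetical and were not used in this study; the grouping reported above was based on the general expression profiles.

```python
import math


def log2_tpm(tpm, pseudocount=1.0):
    """log2-transform a tpm value; the pseudocount (an assumption) avoids log2(0)."""
    return math.log2(tpm + pseudocount)


def expression_group(mean_log2, low_cutoff=2.0, high_cutoff=5.0):
    """Assign a coarse group from a mean log2 value; both cutoffs are hypothetical."""
    if mean_log2 < low_cutoff:
        return "low"
    if mean_log2 < high_cutoff:
        return "medium"
    return "high"


# Hypothetical tpm values for one gene across four tissues
tpm_values = [0.0, 3.0, 7.0, 15.0]
logs = [log2_tpm(v) for v in tpm_values]
group = expression_group(sum(logs) / len(logs))
```

Organ-specific induction, as noted for some candidates, would not be captured by a mean across tissues and would require comparing individual tissue values.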

Discussion
Unraveling the genetic bases of yield components, such as TKW, is an ongoing and essential task to drive grain yield improvement. To this end, attention should be paid to kernel size and shape, which are important determinants of grain weight and have been modified by domestication, selection, and breeding for grain yield. The molecular mechanisms behind these traits have been studied mainly in bread wheat, whereas in durum wheat much remains to be explored (Desiderio et al., 2019; Mangini et al., 2021; Haugrud et al., 2023). Moreover, wheat ancestors such as cultivated emmer (T. dicoccum) should be considered promising genetic resources for studying the effects of genetic improvement and restoring durum wheat diversity (Rahman et al., 2020; Mohammadi et al., 2021). In this context, the present study was conceived to dissect the genetic network behind kernel size and shape traits, kernel weight, PH, and HD by performing QTL mapping on a RIL population derived from a T. dicoccum accession.

Detection of environmentally stable QTLs, trait relationships, and favorable alleles from T. dicoccum MG5323
In this study, the ANOVA across four environments (location-year) showed that the genotypic effect was higher than the GEI effect for all traits. Thus, we were able to detect environmentally stable QTLs (24) for most of the kernel morphological traits, which supports their reliability in determining the considered traits. Kernel W was the most unstable trait. This could imply that kernel W is controlled by minor-effect genes under a relatively higher environmental effect, as reported in two previous studies in durum wheat, where low heritability was also detected for this trait (Desiderio et al., 2019; Sun et al., 2020).
Co-locating loci defined nine QTL clusters. Some of them (clusters 3, 4, and 9) were expected as a consequence of the inherent geometrical relationships between the main kernel traits and those mathematically derived from them, as also suggested by the Pearson's correlation coefficients. The relationship between kernel L and W is more intriguing. Indeed, identifying regions that independently control these two kernel traits might allow this genome-based information to be used to obtain the target kernel ideotype. On the other hand, a single QTL determining both traits would allow breeders to focus on one genomic region to efficiently increase kernel A. In this study, as shown above, W and L were controlled by different clusters, and no significant correlation between them was identified, so the independence of the two traits can be inferred, as in previous studies (Desiderio et al., 2019; Mangini et al., 2021). The only exception was cluster 9, on chromosome 7A, which included QTLs with positive allelic effects provided by both parents. This finding suggests that the genes controlling the two traits in this region are closely linked, which could allow them to be exploited to increase both traits in parallel.
The strongest significant positive relationship between a kernel size trait and TKW was detected by Pearson's correlation analysis for kernel A (r ≈ 0.9) and was further confirmed by coincident loci detected for both traits in clusters 5, 7, and 8. There is compelling and expected evidence for this relationship, suggesting that TKW improvement is due to the increase in kernel A (Russo et al., 2014; Desiderio et al., 2019; Sun et al., 2020; Mangini et al., 2021). Moreover, these co-located loci confirmed the expectation that gene(s) responsible for variation in kernel size/shape might also affect kernel weight. Recent examples supporting this assumption have been documented in both bread and durum wheat (Avni et al., 2018; Desiderio et al., 2019; Wang et al., 2019; Xin et al., 2020; Li et al., 2021; Mangini et al., 2021; Qu et al., 2021), and, in some cases, it has also been confirmed by QTL cloning in rice and wheat (Yamamuro et al., 2000; Chen et al., 2015; Zhang et al., 2017). The other significant positive relationship was found between kernel W and TKW (r ≈ 0.8), which was confirmed by clusters 6 and 8; however, the environmental dependency of the QTLs related to kernel W, as explained above, needs to be taken into account in further studies on these relationships.
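For readers less familiar with the statistic, the Pearson correlation coefficient referred to above can be computed as in the following minimal sketch. The function `pearson_r` and the trait values are illustrative assumptions; the values are hypothetical and are not BLUPs from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical kernel area and TKW values for four lines (illustration only)
area = [14.0, 15.0, 16.0, 17.0]
tkw = [40.0, 44.0, 45.0, 51.0]
r_area_tkw = pearson_r(area, tkw)
```

A value of r close to 1 indicates that lines with a larger kernel area also tend to have a higher TKW, which is the pattern described for A and TKW above.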
Two clusters, 1 and 2, on chromosomes 2A and 2B, respectively, grouped QTLs for kernel shape/size traits and HD, highlighting well-known ectopic effects of plant phenology on yield components (Wilhelm et al., 2009). The study by Mangini et al. (2021), using a RIL population derived from a cross between two durum wheat lines, also reported a cluster on chromosome 2A associated with HD and kernel traits; however, it included kernel A and kernel L, a relationship not found in this work, which might be due to the different genetic backgrounds.
The two clusters on 4B (6 and 7) are of major interest because they included TKW besides kernel size traits, with favorable alleles originating from MG5323. This result is consistent with several studies supporting T. dicoccum as a donor of valuable alleles to increase seed size and weight (Thanh et al., 2013; Faris et al., 2014; Russo et al., 2014; Wang et al., 2019), whereas Guan et al. (2018) referred to wheat chromosome 4B as a "QTL-hotspot," that is, a shared genomic region with a pleiotropic effect or tightly linked loci affecting two or more traits. Notably, for cluster 7, this study suggests that L is the main trait contributing to A, which implies that increasing kernel A through L could be achieved by using a T. dicoccum line in durum wheat breeding.

Comparative analysis of QTL clusters
The physical positions of the clusters detected in this study were compared with QTLs from previous studies (from both linkage and association mapping; Appendix A) to assess the novelty of our results (Table 6); QTLs were first projected onto the consensus map as a bridge and then onto the reference genome. Most of the QTLs identified in this work fell within regions previously identified for kernel-related traits, despite different genetic backgrounds and experimental conditions. Nonetheless, this study increased the number of traits associated with each of the co-locating QTLs.
[Figure 4 caption: Schematic representation of clusters of QTL anchored on the durum wheat reference genome (created using MapChart version 2.3). Parts of the chromosomes are represented by including some markers surrounding the QTL clusters; SNP marker IDs are on the right, whereas their positions on the durum wheat reference genome are reported in bp on the left. The names of the flanking markers of the cluster intervals (major QTL of each cluster) are in bold. QTL names are according to Table 4. Traits are denoted as L, length; W, width; P, perimeter; A, area; WL, width-to-length ratio; FC, form coefficient; TKW, thousand kernel weight; HD, heading days; and PH, plant height. For more details on the trait description, please refer to Table 1 and the Materials and Methods section. The + or − signs preceding the QTL name indicate the positive or negative additive effect of the allele carried by the parental line MG5323. Environments where reported QTLs were identified are also indicated in parentheses. Known and hypothesized candidate genes are shown in red. Complete information for this figure is in Tables 5 and 6.]
Clusters 1 and 2 on chromosomes 2A and 2B, respectively, coincided only with QTLs for TKW (Patil et al., 2013; Graziani et al., 2014; Table 6), whereas the physical interval of cluster 3, related to L and WL, overlapped with a QTL from Desiderio et al. (2019) for the same traits. Cluster 4 coincided with QTLs for TKW previously detected by Avni et al. (2018) and Sun et al. (2020), although the association with this trait was missing in our study. However, two loci described in Wang et al. (2019) in association with kernel W coincided with our result. Furthermore, cluster 5 might correspond to a locus associated with TKW detected by Faris et al. (2014) on chromosome 3B, also using a T. dicoccum-derived population.
However, in the present study, the mentioned QTL was also found related to A, whereas no other traits were previously reported for the same region, which could be an indication of a likely new relationship involving the two traits found here.
For cluster 7, the region had been previously identified for kernel A and W, although in a modern genetic background (Mangini et al., 2021). In addition, QTLs for TKW had already been found by Blanco et al. (2012) and Elouafi and Nachit (2004). In an analogous interspecific durum x emmer population, Russo and co-authors (2014) identified a QTL on chromosome 4B related to TKW and kernel A and W, where the favorable allele was donated by the T. dicoccum line. However, this locus was located at about 27 Mbp on the reference genome and is thus unlikely to overlap with our cluster (594 Mbp to 619 Mbp on chromosome 4B). Overall, these comparisons suggest that cluster 7 detected in this work is likely to be new for the relationships found (between kernel L and P, which increase A, and TKW), as no previous QTLs for L and P were found in this specific region.
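The positional reasoning used in these comparisons amounts to a simple interval check, sketched below under stated assumptions. The helper names and the second interval are hypothetical; only the cluster 7 boundaries (594 to 619 Mbp) and the approximate position of the previously reported 4B locus (about 27 Mbp) are taken from the text.

```python
def locus_in_interval(locus_mbp, interval_mbp):
    """True if a point locus lies inside a closed (start, end) interval in Mbp."""
    start, end = interval_mbp
    return start <= locus_mbp <= end


def intervals_overlap(a, b):
    """True if two closed (start, end) intervals share at least one position."""
    return max(a[0], b[0]) <= min(a[1], b[1])


# Cluster 7 interval on chromosome 4B (Mbp), as reported in the text
CLUSTER7_4B = (594.0, 619.0)
# Approximate position of the previously reported 4B locus (Mbp), from the text
PREVIOUS_LOCUS_4B = 27.0

same_region = locus_in_interval(PREVIOUS_LOCUS_4B, CLUSTER7_4B)

# Hypothetical confidence interval of an earlier QTL, for illustration only
hypothetical_qtl = (600.0, 640.0)
shares_region = intervals_overlap(CLUSTER7_4B, hypothetical_qtl)
```

Such a check only compares coordinates on the same chromosome and reference assembly; it says nothing about whether the same gene underlies two overlapping QTLs.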
Only two previous QTLs coincided with the physical interval of cluster 9, one related to TKW (Patil et al., 2013) and one associated with kernel W (Sun et al., 2020). The latter was consistent with some of the traits associated with this cluster in our work (W, WL, and FC), whereas no coincidences were found for L and P.
Notably, none of our QTLs co-localized with the domestication-related chromosome regions, namely the Q and Brt loci and their corresponding cloned genes, located on chromosome 5A and on the short arm of the group 3 chromosomes, respectively.

Candidate genes hypotheses for the QTL clusters
Hypotheses about candidate genes, both novel and known cloned genes (Table 5), were proposed based on their position within the QTL regions, functional annotations (GO terms), polymorphisms between parental lines (Appendix G), and the expression profiles of bread wheat homologs (Figure 5). On the basis of previous knowledge, some GO terms are more likely to be associated with functions related to kernel development and grain yield (such as hormone pathways and sugar metabolism).
The co-location of yield-related traits with QTLs for HD, which suggests a pleiotropic relationship, has been described before (Gegas et al., 2010; Mangini et al., 2021). In this study, the physical positions of clusters 1 and 2 (on 2A and 2B, respectively), which also include QTLs for HD, were compared with the known positions of the major genes Ppd-A1 and Ppd-B1 (36.6 Mbp and 56.3 Mbp on the Svevo genome, respectively; Maccaferri et al., 2019), which are key components of the photoperiod/flowering regulatory pathway. As reported in Table 5, the physical interval detected in this study for cluster 1 on chromosome 2A included Ppd-A1. Cluster 2 on chromosome 2B, instead, is slightly shifted with respect to Ppd-B1. This shift could be a consequence of the gap present in the 2B genetic map or of the process of anchoring the QTL on the reference genome. Although markers may appear to co-segregate in a genetic map, their physical positions on the genome can differ slightly, depending also on the recombination rate of the target region. Alternatively, the gene TRITD2Bv1G019940, located at 43 Mbp and encoding a coiled-coil domain-containing protein 6G with GO terms related to the control of cell proliferation, could be a candidate for cluster 2. However, although the bread wheat homolog showed specific expression in spikes at reproductive stages, no SNPs were identified between the MG5323 and Svevo alleles.
[Figure 5 caption: Heatmap of gene expression of bread wheat homologs of QTL candidate genes. The expression level of each gene in different organs and at different developmental stages is reported, as obtained from the ExpVIP database. Gene expression levels are expressed as log2 of transcript abundances, normalized as transcripts per million (tpm). Expression level is shown according to the color scale reported, from blue for no expression to red for the highest expression.]
Analogously, the pleiotropic consequences of the known Rht1 (Rht-B1b), located at 29.3 Mbp on chromosome 4B, were corroborated by this study, as the semi-dwarfing gene overlapped with the position of cluster 6 (Table 5). This gibberellin-insensitive dwarfing gene has been comprehensively documented to have pleiotropic effects such as increased grain number and lodging tolerance, which favor grain yield and led to its wide adoption in bread wheat during the Green Revolution. Its other pleiotropic effects are reduced seed size, kernel weight, and micronutrient and protein content (Patil et al., 2013; Russo et al., 2014; Mohler et al., 2016; Velu et al., 2017; Guan et al., 2018). The wild-type allele present in emmer makes plants taller and grains larger and heavier, as shown by the positive additive effect of the MG5323 allele at cluster 6. The physical interval of cluster 3 on chromosome 2B did not include any orthologs of known genes related to kernel traits. Notably, in most of the clusters associated with kernel W (clusters 4, 8, and 9), genes known to be involved in cell development, or whose functional annotation is related to it, were retrieved, strengthening their potential as candidates for this trait. In cluster 4 on chromosome 3A, such a gene is the wheat ortholog (TRITD3Av1G163790) of the known rice gene D61 (Os01g0718300).
It encodes a brassinosteroid insensitive-like leucine-rich repeat receptor kinase (Avni et al., 2018) associated with cell elongation (Nakamura et al., 2006). However, the expression of the bread wheat homolog (TraesCS3A02G245000) is higher in spikes at the vegetative than at the reproductive stage, and only three SNPs in the upstream region were identified between MG5323 and Svevo. Cluster 8 on chromosome 6B encompasses two known genes, GW2 (TRITD6Bv1G096950) and FUWA (TRITD6Bv1G115800), which are known to control grain size by regulating cell division (Chen et al., 2015; Zhai et al., 2018). Of the two, GW2 might be the more reliable candidate because the bread wheat homolog (TraesCS6B02G215300) showed specifically higher expression in grains at the reproductive stage, and three polymorphisms were identified between the MG5323 and Svevo gene sequences. GW2 encodes an E3 RING ligase and mediates ubiquitination in the ubiquitin-26S proteasome system. This gene has been shown to negatively regulate grain size in rice and in bread wheat (Hong et al., 2014; Simmonds et al., 2016; Nadolska-Orczyk et al., 2017; Zhai et al., 2018). Within cluster 9 on chromosome 7A, the annotations of TRITD7Av1G052720 and TRITD7Av1G055870, which encode a receptor protein kinase and a MADS box transcription factor, respectively, mention the regulation of cell growth and cell differentiation. Both genes are reliable candidates, because a higher number of SNPs was found between the TRITD7Av1G052720 alleles of MG5323 and Svevo, and a significant upregulation in spikes at the vegetative stage was seen for the bread wheat homolog of TRITD7Av1G055870.
[Table 6 footnote: For previously reported QTLs, the corresponding reference, trait, mapping population/germplasm collection, and flanking markers with position (in Mbp) on the reference genome Svevo.v1 are reported. Clusters found in this study are shown in bold. QTLs that are nearby but do not overlap the cluster positions are shown in italics. Traits are denoted as L, length; W, width; P, perimeter; A, area; WL, width-to-length ratio; FC, form coefficient; TKW, thousand kernel weight; HD, heading days; and PH, plant height. For more details on trait description, please refer to Table 1.]
In most of the clusters related to kernel A (clusters 5 and 7), genes associated with auxin metabolism were retrieved (TRITD3Bv1G229090, TRITD3Bv1G229910, TRITD3Bv1G235190, TRITD3Bv1G239650, TRITD4Bv1G175480, TRITD4Bv1G179270, and TRITD4Bv1G171270). Several lines of evidence indicate that auxins play an important role in organ size by regulating cell expansion, cell division, and differentiation, thus affecting stem elongation, lateral branching, vascular development, growth responses, and various aspects of seed development, including development of the embryo, endosperm, and seed coat (Teale et al., 2006; Zhao, 2010; Cao et al., 2020b). Among the candidates of cluster 5 on chromosome 3B, two genes (TRITD3Bv1G229090 and TRITD3Bv1G235190) carry polymorphisms in the CDS with a moderate effect on the protein. In addition, for the latter, the bread wheat homolog showed a specific induction of gene expression at reproductive stages, including in spikes. In cluster 7 specifically, two genes related to auxin regulation (TRITD4Bv1G175480 and TRITD4Bv1G179270) and one related to cell proliferation (TRITD4Bv1G177190) were included.
In addition, near the physical region of cluster 7, we found the known gene BIG GRAIN 1 (BG1) (corresponding to TRITD4Bv1G171270 at 582 Mbp), which encodes a plasma membrane-associated protein (Liu et al., 2015) and could be involved in the control of the relationships found for this cluster. This gene overlapped with the position of the loci related to TKW in this cluster (Figure 4); therefore, it was considered to lie within the cluster's interval. In rice, BG1 (GenBank accession Q10R09.1) has been described as a positive regulator of the auxin signaling pathway involved in gravitropism, plant growth, and grain development. Rice dominant mutants overexpressing this gene showed increased grain size, with greater L, W, and A, associated with longer epidermal cells and a higher number of parenchyma cells in both the palea and lemma of the spikelet hull (Liu et al., 2015). Nevertheless, a recent study of the orthologous gene in bread wheat showed that, although overexpression of BG1 led to larger seed size, it also reduced the number of seeds per plant, thus causing no significant overall increase in yield, and was associated with lower concentrations of essential elements (zinc and phosphorus) and lower protein content (Milner et al., 2021). Our additional in silico evidence could not provide further support for the candidate genes of cluster 7. Indeed, no or few polymorphisms were identified, apart from TRITD4Bv1G175480, which has three SNPs in the upstream region, and no specific induction in spike or grain organs, in a context of generally low expression, was reported for the bread wheat homologs. Notably, the bread wheat homolog of the BG1 gene (TraesCS4B02G292300) also showed very low expression, with a slight induction in spikes at the vegetative stage only. Within the physical interval of cluster 9 (chromosome 7A), we found a known candidate gene that affects grain size by regulating sugar metabolism.
This is TaSUS1 (TRITD7Av1G050690), which encodes a sucrose synthase that catalyzes the first step in the conversion of sucrose to starch. It has been correlated with TKW, as starch is the main component of the grain endosperm (70%) (Nadolska-Orczyk et al., 2017). In addition, within the region of cluster 9, we identified the ortholog of TaGASR7 (TRITD7Av1G071860), which is considered a negative regulator of grain weight in wheat through an effect on grain L (Dong et al., 2014). Both genes carry polymorphisms, although only a few, in the upstream region.

Conclusion
Exploring new genetic resources to increase wheat yield is a vital task to cope with climate change and future food demands. In this regard, the current study helps to lay the foundations for understanding the genetic basis of the relationships between kernel-related traits (size, shape, and TKW) by identifying nine clusters of co-located loci in a T. dicoccum-derived population. In particular, a major and stable QTL related to kernel size traits (L, A, and P) and kernel weight was detected on chromosome 4B, with the superior allele donated by the T. dicoccum accession. This study further supports the role of this ancestral species as a source of favorable alleles for durum wheat breeding, although validation of the detected QTL, for example by fine mapping, is needed to refine its position and then study its possible interactions with other traits and loci.

Data availability statement
The dataset (variants MG5323-Svevo.v1) presented in this study can be found in the online repository www.figshare.com with the doi: 10.6084/m9.figshare.24119532.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.