Genetic Diversity and Population Structure in Türkiye Bread Wheat Genotypes Revealed by Simple Sequence Repeats (SSR) Markers

Improving wheat genotypes by exploiting the genetic diversity available in germplasm is essential to ensure food security. This study investigated the molecular diversity and population structure of a set of Türkiye bread wheat genotypes using 120 microsatellite markers. In total, 651 polymorphic alleles were evaluated to determine genetic diversity and population structure. The number of alleles ranged from 2 to 19, with an average of 5.44 alleles per locus. Polymorphic information content (PIC) ranged from 0.031 to 0.915 with a mean of 0.43. In addition, the gene diversity index ranged from 0.03 to 0.92 with an average of 0.46. The expected heterozygosity ranged from 0.00 to 0.359 with a mean of 0.124. The unbiased expected heterozygosity ranged from 0.00 to 0.319 with an average of 0.112. The mean values of the number of effective alleles (Ne), Nei's gene diversity (H) and Shannon's information index (I) were estimated at 1.190, 1.049 and 0.168, respectively. The highest genetic distance (GD) was estimated between genotypes G1 and G27. In the UPGMA dendrogram, the 63 genotypes were grouped into three clusters. The first three principal coordinates explained 12.64, 6.38 and 4.90% of genetic diversity, respectively. AMOVA attributed 78% of the diversity to variation within populations and 22% to variation between populations. The current populations were found to be highly structured. Model-based cluster analyses classified the 63 genotypes studied into three subpopulations. The values of the F-statistic (Fst) for the identified subpopulations were 0.253, 0.330 and 0.244, respectively. In addition, the expected heterozygosity (He) values for these subpopulations were 0.45, 0.46 and 0.44, respectively. Therefore, SSR markers can be useful not only in genetic diversity and association analyses of wheat but also in characterizing its germplasm for various agronomic traits or mechanisms of tolerance to environmental stresses.


Introduction
Bread wheat (Triticum aestivum L.) is one of the most important species belonging to the genus Triticum in the Poaceae family [1]. The genome of this cereal (2n = 6x = 42, AABBDD) consists of three diploid genomes, AA, BB and DD, which are inherited from three ancestral species: Triticum urartu Thumanjan ex Gandilyan (A genome), Aegilops speltoides Tausch (B genome) and Aegilops tauschii Coss (D genome) [2]. The rather large genome size (17,000 Mb) and the high proportion of repetitive sequences (80%) are important obstacles to overcome in bread wheat research [3]. Therefore, efficient and sufficient tools should be used in bread wheat genome research.
The green revolution increased yield and quality in wheat production and led to the emergence of high-yielding varieties. As the world's population continues to grow, climate change and the resulting global warming are having a serious impact on food supplies. World wheat production is expected to increase by about 50% by 2050 to meet the food needs of a growing population [4,5]. However, in recent decades, wheat yields have not increased sufficiently either worldwide [6] or in Türkiye [7]. As a result, wheat production is unable to meet demand. Given the negative effects of climate change and the growing world population, which is expected to exceed 9 billion by 2050, increasing wheat production to ensure global food security is a high priority [8]. In this case, the biggest challenge for wheat farmers is to improve grain yields and crop tolerance to various environmental stressors to meet growing demands [9]. Wheat has accumulated quite a large amount of genetic variability during its evolution. Today, this genetic diversity has generally decreased due to repeated cultivation, adaptation, development and use of local varieties for desirable traits [10]. The resulting homogeneity of the genetic background has become a major challenge for the future genetic development of wheat.
Plant breeding programs mainly focus on genetic diversity, inheritance, conservation and evolution [11]. Homogeneity in a population would mean that all members of that population behave similarly in the face of a stressor and could not withstand an epidemic [12]. Potential new alleles can be used to overcome such adverse conditions [13]. Genetic diversity is key to the adaptation and survival of wheat species under biotic and abiotic stressors, as such stressors are expected to be major constraints to food security [14]. On the other hand, domestication and selection pressures, as well as the use of modern breeding techniques, have already narrowed the wheat gene pool [15]. National and regional strategies should be developed to characterize and preserve the genetic diversity of wheat species. The decline in the level of genetic diversity has led to the use of such genetic resources in breeding programs. Morphological and molecular markers are commonly used to characterize wheat species and assess genetic diversity. Such tools allow breeders to select genotypes that are well adapted to specific conditions and resistant to various biotic and abiotic stresses. Agromorphological markers, especially quantitative traits, are often influenced by environmental factors. To address this problem, several molecular markers have emerged as biotechnological tools for studying genetic diversity and population structure [16].
With advances in molecular biology, a number of molecular marker techniques have been developed, such as random amplified polymorphic DNA (RAPD) [17], amplified fragment length polymorphisms (AFLP) [18], inter-simple sequence repeats (ISSR) [19], start codon targeted markers (SCoT) [20], inter-primer binding site (iPBS)-retrotransposons [21], expressed sequence tag (EST) [22], single nucleotide polymorphism (SNP) [23], next-generation sequencing (NGS) [24], diversity arrays technology (DArT) [25] and simple sequence repeats (SSR) [26]. Of these, SSR markers served as effective molecular markers for studying genetic diversity in hexaploid Türkiye wheat embryos [7,27,28]. However, the number of genotypes and markers used in these studies seemed insufficient for genome-wide association mapping. SSR markers play a key role in marker-assisted selection (MAS) in wheat breeding programs [29]. Currently, SSR databases are available for various crops [30]. To date, many studies have identified SSR markers as effective tools for use in breeding programs [31]. It has been reported that SSR markers offer a more efficient choice than SNPs, due to their faster mutation rates and higher levels of polymorphism that can be found with several highly polymorphic markers [32]. Therefore, SSR markers are widely used to analyze genetic diversity and population structure, as well as to elucidate phylogenetic relationships among plant genetic resources, as such relationships play a key role in developing appropriate breeding programs [30]. SSR markers can originate from coding or non-coding regions of genomes [33]. It was previously reported that markers located in promoter regions can affect gene expression levels, while those located in coding sequences can affect protein structure and function [34].
SSR markers have many advantages, such as co-dominance, high levels of polymorphism, chromosome specificity and high reproducibility; they are also excellent for identifying and monitoring target traits within varieties [35]. SSR markers are very efficient in wheat research due to their co-dominant structure and wide coverage across the genome [29].
Türkiye encompasses a high level of bread wheat genetic diversity as it is a major center of wheat domestication and diversity. However, there is little information on the population structure and germplasm diversity of wheat. Therefore, the main objective of this study was to investigate genetic diversity and population structure in a set of Türkiye wheat genotypes using SSR markers.

Genetic Materials
In this study, 63 genotypes of bread wheat (Triticum aestivum L.) were used as plant material. Variety names and locations are given in Table 1. Bread wheats were collected from eight different regions of Türkiye. All samples were obtained from the Türkiye National Gene Bank [36].

Extraction of Genomic DNA
Genomic DNA extractions were performed according to the CTAB protocol [37]. The quality of extracted DNA was assessed by agarose gel electrophoresis (0.8%).

PCR Amplification
For SSR analysis, a total of 425 SSR primers were first tested on five randomly selected wheat genotypes. Of these, the 120 primers that showed polymorphism were selected for PCR amplification and genotyping of all 63 bread wheat genotypes [38]. Details of the primers used in this study are given in Table S1.

Statistical Data Analysis
TotalLab TL120 software (TotalLab Ltd., Gosforth, Newcastle upon Tyne, UK) was used to generate the data matrices [41]. Several informative parameters such as major allele frequency (MAF), gene diversity (GD) and polymorphic information content (PIC) were estimated using PowerMarker version 3.25. POPGENE 1.32 software was used to determine unbiased expected heterozygosity (uHe), expected heterozygosity (Exp-Het), effective number of alleles (ne), Nei's gene diversity (h) and Shannon's information index (I) values [42]. The Dice similarity index [43] was used to calculate the genetic similarity between each pair of genotypes. NTSYS-pc V2.1 was used to construct a dendrogram using the unweighted pair group method with arithmetic mean (UPGMA) and SAHN clustering [25]. Principal coordinate analysis (PCoA) and analysis of molecular variance (AMOVA) were performed using GenAlEx v6.5 [44]. The Bayesian model-based clustering algorithm implemented in STRUCTURE 2.2 was used to obtain an explicit picture of genetic composition [45]. For this analysis, input values and parameters were selected as described by Evanno et al. [45]. MCMC chains were run with a burn-in period of 100,000 iterations, followed by 100,000 iterations, using a model that allowed for admixture and correlated allele frequencies. Finally, the most likely number of sub-populations was determined using the Structure Harvester website [46].
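As a rough illustration of the marker statistics named above, the following sketch computes MAF, gene diversity and PIC for a single locus from its allele frequencies. It uses the common textbook definitions (gene diversity as one minus the sum of squared allele frequencies, PIC after Botstein et al.); whether the software used in this study applies a sample-size correction is not stated in the text, so the function name `locus_summary` and the example frequencies are purely illustrative assumptions.

```python
from itertools import combinations


def locus_summary(freqs):
    """Return (MAF, gene diversity, PIC) for one locus.

    freqs: allele frequencies at the locus; they must sum to 1.
    Uncorrected textbook formulas are assumed here.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    maf = max(freqs)                              # major allele frequency
    gd = 1.0 - sum(p * p for p in freqs)          # gene diversity: 1 - sum(p_i^2)
    # PIC subtracts the contribution of uninformative allele pairs
    cross = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    pic = gd - cross
    return maf, gd, pic


# Hypothetical locus with three alleles (not data from this study)
maf, gd, pic = locus_summary([0.5, 0.3, 0.2])
print(f"MAF={maf:.3f}  GD={gd:.3f}  PIC={pic:.4f}")
```

Because PIC equals gene diversity minus a non-negative term, it can never exceed gene diversity at the same locus, which matches the pattern of the ranges reported in this study.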

Marker Polymorphism, Genetic Diversity and Principal Coordinate Analysis (PCoA)
Information on the descriptive parameters of the SSR markers is shown in Table 2. Of the 425 markers, 120 showed polymorphisms. Genetic variation at the SSR loci of the bread wheat genotypes was calculated based on Na, MAF, Exp-Het, uHe, GD, H, Ne, I and PIC values (Table 2).
Principal coordinate analysis (PCoA) was conducted using Nei's neutral genetic distance. The first three principal coordinates explained 12.64, 6.38 and 4.90% of genetic diversity, respectively (23.92% in total). The presence of genetic diversity was confirmed by the distribution of genotypes in the diagram (Figure 1). The results of the AMOVA showed that the fraction of genetic diversity within populations was greater than that between them (78% vs. 22%) (Table 3).
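To make the "percentage explained" figures concrete, the sketch below shows how classical PCoA derives them from a distance matrix: the squared distances are double-centred (Gower's transformation) and each axis receives the share of its eigenvalue in the total. This is a minimal standard-library illustration, not the GenAlEx implementation; the eigenvalues are obtained here by simple power iteration with deflation instead of a full eigendecomposition, all function names are hypothetical, and the toy matrix is made up (four points at the corners of a 3 x 4 rectangle).

```python
import math


def double_center(dist):
    """Gower transformation: B = -1/2 * J * D^2 * J for a symmetric matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
             for j in range(n)] for i in range(n)]


def top_eigen(mat, iters=500):
    """Largest eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    n = len(mat)
    vec = [float(i + 1) for i in range(n)]        # arbitrary start vector
    for _ in range(iters):
        nxt = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    val = sum(vec[i] * mat[i][j] * vec[j] for i in range(n) for j in range(n))
    return val, vec


def pcoa_explained(dist, axes=2):
    """Percentage of total variation carried by the first `axes` coordinates.

    Assumes the distance matrix yields no negative eigenvalues, so the
    trace of the centred matrix equals the total variation.
    """
    b = double_center(dist)
    n = len(b)
    total = sum(b[i][i] for i in range(n))
    shares = []
    for _ in range(axes):
        val, vec = top_eigen(b)
        shares.append(100.0 * val / total)
        # deflate: remove the axis just found before searching for the next
        b = [[b[i][j] - val * vec[i] * vec[j] for j in range(n)] for i in range(n)]
    return shares


# Toy distances between four hypothetical genotypes (not data from this study)
toy = [[0.0, 3.0, 4.0, 5.0],
       [3.0, 0.0, 5.0, 4.0],
       [4.0, 5.0, 0.0, 3.0],
       [5.0, 4.0, 3.0, 0.0]]
explained = pcoa_explained(toy, axes=2)
print([round(x, 2) for x in explained])
```

In the toy case two axes capture all of the variation because the points lie in a plane; with 63 genotypes and 120 loci the variation is spread over many axes, which is why the first three coordinates account for only part of the total.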

Genetic Distance and Cluster Analysis for SSR Markers
Phylogenetic relationships were investigated for the 63 bread wheat genotypes using 120 SSR markers. Dice similarity coefficients were calculated for the 120 SSR markers, and a UPGMA tree was generated (Figure 2). Genetic distance (GD) values ranged from 0.184 to 0.420 with a mean of 0.279. The highest GD was observed between genotypes G1 and G27.
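The two steps described here, Dice similarity followed by UPGMA clustering, can be sketched in a few lines. The example below is only an illustration under stated assumptions: band profiles are coded as 0/1 vectors, distance is taken as one minus the Dice coefficient, and the genotype labels and profiles are invented. It is not the NTSYS-pc procedure itself, and the helper names `dice` and `upgma` are hypothetical.

```python
def dice(x, y):
    """Dice similarity 2a / (2a + b + c) for two 0/1 band profiles."""
    shared = sum(1 for a, b in zip(x, y) if a and b)
    total = sum(x) + sum(y)
    return 2.0 * shared / total if total else 1.0


def upgma(dist):
    """Average-linkage (UPGMA) clustering.

    dist: dict mapping frozenset({label_a, label_b}) -> distance.
    Returns the list of merges as (cluster_a, cluster_b, distance_at_merge).
    """
    d = dict(dist)
    sizes = {label: 1 for pair in d for label in pair}
    merges = []
    while len(sizes) > 1:
        pair = min(d, key=d.get)                  # closest pair of clusters
        a, b = sorted(pair)
        height = d.pop(pair)
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        new = f"({a},{b})"
        for other in sizes:
            da = d.pop(frozenset((a, other)))
            db = d.pop(frozenset((b, other)))
            # size-weighted mean keeps every original genotype equally weighted
            d[frozenset((new, other))] = (size_a * da + size_b * db) / (size_a + size_b)
        sizes[new] = size_a + size_b
        merges.append((a, b, height))
    return merges


# Hypothetical band profiles for three genotypes (not data from this study)
profiles = {
    "G1": [1, 1, 1, 0, 0, 0],
    "G2": [1, 1, 1, 1, 0, 0],
    "G3": [0, 0, 0, 1, 1, 1],
}
names = sorted(profiles)
distances = {
    frozenset((p, q)): 1.0 - dice(profiles[p], profiles[q])
    for i, p in enumerate(names) for q in names[i + 1:]
}
merges = upgma(distances)
for a, b, height in merges:
    print(f"merge {a} + {b} at distance {height:.3f}")
```

In this toy case G1 and G2 share most bands and join first, while G3 joins the pair at the average of its distances to both members.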

Discussion
Detection of genetic variation using molecular markers is highly dependent on the mode of reproduction, domestication history and size of the samples analyzed. Collection, conservation and management of genetic resources are key issues in sustainable agriculture development [47]. Assessing levels and patterns of genetic diversity allows accurate classification of species and identification of individuals with desirable traits [48]. Existing genetic resources, their geographic location and relationships are commonly used to determine population diversity [49]. Comprehensive knowledge of bread wheat genetic diversity will have a significant impact on germplasm conservation and utilization. Such knowledge also facilitates breeding programs. Breeders have made significant progress in detecting various morphological traits and variation of molecular traits at the DNA level [50]. Molecular markers offer efficient tools for improving traditional breeding programs because they are not affected by environmental and developmental factors [51]. SSR markers are commonly used to analyze the genetic diversity of wheat genotypes [30]. In this study, 120 SSR markers were used to determine molecular variation and population structure in a core collection of Türkiye bread wheat genotypes.

Monitoring of Genetic Diversity
Using 120 SSR markers, 651 alleles were identified in 63 wheat genotypes. The number of polymorphic alleles per locus ranged from 2 to 19 with an average of 5.442. Polymorphism can result from SSR expansion, contraction or interruption [52]. The current total number of alleles was higher than the values of 458 [53], 49 [54] and 38 [31] and lower than the values of 1620 [48] and 939 [27] reported in previous studies. Teshome et al. [35] reported that the number of alleles (Na) per locus ranged from 2 to 6. The current average number of alleles was lower than the values of 5.7 [55], 10 [56], 10.06 [30], 7.97 [48], 7.2 [57], 6.8 [58], 5.9 [7] and 5.89 [59] and higher than the values of 3.3 [60] and 5.05 [61] reported in previous studies. The differences in the results of these studies are mainly attributed to differences in genotypes and number of markers. The number of alleles per marker largely depends on the relative distance of the locus from the centromere, the allele frequency motif and the number of repeats [16]. Allelic diversity, expressed as the number of alleles per locus, is also influenced by genetic composition [30].
Exp-Het values ranged from 0.00 to 0.359 with an average of 0.124. Differences in heterozygosity values among studies can be attributed to several factors, including the molecular markers used, the number of selections and the geographic location of the wild-type origin and of the samples. Our result is higher than that of Teshome et al. [62], with values of 0-0.05, and lower than those of Ateş Sönmezoglu and Terzi [7], with an average value of 0.75, and Tsonev et al. [63], with an average of 0.185.
Genetic diversity (GD) values ranged from 0.031 to 0.920, with an average value of 0.460. Arystanbekkyzy et al. [64] indicated that genetically distinct genotypes can facilitate breeding programs for desired traits. Henkrar et al. [65], Ateş Sönmezoglu and Terzi [7] and Belete et al. [30] observed greater gene diversity for primers producing a higher number of alleles. Our result was lower than Tsonev et al. [63] with an average of 0.658 and Mohi-Ud-Din [53] with an average of 0.936.
In this study, the highest values of h, ne and I were 1.667, 1.578 and 0.459, respectively, while the lowest values were 0.556, 1.00 and 0.00, respectively. A higher number of effective alleles indicates greater genetic diversity and is therefore generally desirable in breeding programs. The Shannon information index is also an indicator of genetic variation in a population. Teshome et al. [66] reported an I value of 0.53, which was greater than the current results. Mohi-Ud-Din et al. [53] reported the average number of effective alleles per locus as 18.32, indicating considerable diversity in the genotypes studied. The lower values of diversity indices in the present study were attributed to differences in germplasm.
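The lower bounds quoted above (ne = 1.00 and I = 0.00) are what the standard formulas give for a locus with a single fixed allele. The short sketch below illustrates this with the usual definitions, ne = 1 / sum(p_i^2) and I = -sum(p_i ln p_i); the function names and example frequencies are illustrative assumptions and do not reproduce the software output of this study.

```python
import math


def effective_alleles(freqs):
    """Effective number of alleles: 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)


def shannon_index(freqs):
    """Shannon's information index: -sum(p_i * ln p_i), skipping absent alleles."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Two equally frequent alleles: the most diverse two-allele locus
ne_balanced = effective_alleles([0.5, 0.5])
i_balanced = shannon_index([0.5, 0.5])

# A single fixed allele: no diversity at all
ne_fixed = effective_alleles([1.0])
i_fixed = shannon_index([1.0])

print(ne_balanced, round(i_balanced, 4), ne_fixed, i_fixed)
```

Both indices grow as allele frequencies become more even, so a locus dominated by one allele contributes little to either measure even when several rare alleles are present.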
PIC and MAF values indicate significant genetic variability among all wheat genotypes used. They are also reliable indicators of genetic diversity in the plant. The current MAF values ranged from 0.143 to 0.984 with an average of 0.631. Our result was higher than that of Mohi-Ud-Din et al. [53], with an average of 0.296. The polymorphism information content (PIC) is used as an indicator of the diversity of a gene or DNA segment in a population. It also indicates evolutionary pressure on alleles and mutations. Current PIC values ranged from 0.031 to 0.915 with an average of 0.430. In this study, 25 markers had a PIC value of ≥0.5, indicating their potential use in wheat germplasm genetic diversity studies. A locus has high diversity when the PIC value is ≥0.5 and low diversity when the PIC value is ≤0.25 [67]. Compared with similar studies conducted on wheat genotypes with SSR markers, the current average PIC value was lower than the values of 0.62 [63], 0.65 [68], 0.65 [48], 0.52 [28], 0.50 [7], 0.57 [69] and 0.83 [53], and higher than the averages of 0.205 reported by Erayman et al. [27], 0.19 by Demirel [70], 0.32 by Pour-Aboughadareh et al. [54] and 0.33 by Kumar et al. [51]. Of the 25 SSR markers, 21 had a PIC value greater than 0.800, indicating that these markers were highly informative and effective.
Principal coordinate analysis (PCoA) is commonly used to spatially represent relative genetic distances between populations [57]. It reduces a multidimensional dataset to a few axes that capture key patterns across multiple loci and samples. The two-dimensional diagram reflects the actual distances between genotypes [56]. In this study, the first three principal coordinates explained 12.64, 6.38 and 4.90% (23.92% in total) of the total variation. Data were considered reliable when the explained portion of variation was ≥25% [71]. SSR-based clustering offers reliable differentiation of wheat genotypes based on their origin. In this study, significant correlations were found between PCoA clustering and cluster analysis. Mohi-Ud-Din et al. [53] indicated that PCoA was unable to group 56 genotypes based on their population. However, Pour-Aboughadareh et al. [54] found that PCoA confirmed the clustering pattern. Based on AMOVA results, there was more variability within populations (78%) than between populations (22%). Consistent with our results, Mohi-Ud-Din et al. [53] found that differences between populations accounted for 7% of total genetic diversity, with the rest (93%) attributed to differences within populations.

Genetic Identity, Genetic Distance and Clustering Analysis
Genetic differences between populations play a huge role in the conservation of genetic resources [72]. Our results showed that the highest genetic distance (GD) occurred between G1 and G27 and the lowest between G10 and G11; G41 and G; G43 and G; G62 and G63. Kumar et al. [12] found dissimilarity indices ranging from 0.62 to 0.85. Erayman et al. [27] reported similarity indices between 0.52 and 0.97 for all species and between 0.69 and 0.97 for wheat cultivars.
The current SSR markers were able to group all genotypes well based on phylogenetic relationships. The UPGMA method divided the present genotypes into three main clusters. Cluster I included 56 (88.89%); cluster II included 6 (9.52%); and cluster III included 1 (1.59%) genotype. UPGMA analysis showed a mixed pattern, as accessions from different geographic regions were grouped into the same subgroups. The clustering was therefore not able to clearly distinguish between wheat genotypes based on geographic origin, and no significant relationship was found between geographic origin and genetic similarity. Such a case indicated gene flow between genotypes. Differences between genotypes were attributed to the greater genetic distance between them. Such findings were also supported by analysis of population structure. The present results are consistent with those of Mohammadi et al. [71] and Pour-Aboughadareh et al. [54]. Tsonev et al. [63] divided 117 varieties into 2 major clusters, consistent with the 2 major subpopulations of the K = 2 genetic structure analysis. Mohi-Ud-Din et al. [53] used UPGMA analysis to assess the genetic diversity of wheat genotypes using SSR markers and grouped wheat genotypes into five major clusters. Pour-Aboughadareh et al. [54] found that for phylogenetic relationships, SSR markers gave better performance than gene-based techniques.

Population Diversity, Gene Differentiation and Gene Flow of Populations
Population structure analysis of natural diversity is used to detect genes/QTLs for agronomic traits [73]. Such analysis reveals similarities between genotypes and sub-populations. It has been proven to be more reliable and provide more information than other clustering algorithms [30]. The population structure facilitates the selection of different parents and the mapping of marker-trait relationships for use in breeding programs. In this study, analysis of the population structure showed that all varieties came from three subpopulations. The genetic composition of a population is largely determined by various factors, including recombination, genetic drift and natural selection. Subpopulation A contained 18 wheat genotypes (28.57%); subpopulation B contained the highest number of genotypes (28; 44.44%); and subpopulation C contained 11 wheat genotypes (17.46%). In addition, 6 wheat genotypes (9.52%) were in mixed groups.
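The shares quoted for the three subpopulations and the mixed group can be checked with a few lines of arithmetic. The snippet below simply recomputes the percentages from the genotype counts given in the text; the dictionary keys are labels chosen here for illustration.

```python
# Genotype counts per STRUCTURE group as reported in the text
counts = {"A": 18, "B": 28, "C": 11, "mixed": 6}

total = sum(counts.values())                      # should equal the 63 genotypes studied
shares = {group: round(100.0 * n / total, 2) for group, n in counts.items()}

for group, pct in shares.items():
    print(f"{group}: {counts[group]} genotypes ({pct}%)")
```

The four counts add up to 63, so every genotype in the panel is assigned either to one subpopulation or to the mixed group.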
The small number of genotypes included in the mixed groups indicated that the genotypes present had a wide range of genetic pools. This study analyzed the population structure of wheat genotypes representing the diversity of Türkiye wheat genotypes. The Bayesian model yielded clustering results similar to those of UPGMA and PCoA. STRUCTURE analysis revealed three groups (A, B and C) at K = 3. Group B contained the highest number of genotypes. The present results on population structure are consistent with the findings of Mohi-Ud-Din et al. [53], who divided 56 wheat genotypes into three sub-populations, as well as Tascioglu et al. [48], who divided wheat genotypes into three sub-groups based on the Bayesian model and PCA. On the other hand, the present results on population structure are not consistent with those of Le Couviour et al. [74] and Tsonev et al. [63], mainly due to the different genetic materials used in these studies. The F-statistic (Fst) value was determined to be 0.253 for the first, 0.330 for the second and 0.244 for the third sub-population. The expected heterozygosity (He) was determined as 0.452 for the first, 0.463 for the second and 0.444 for the third sub-population. Mohi-Ud-Din et al. [53] reported two significant differences (p < 0.01) in paired population Fst values.
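The choice of K = 3 rests on the ΔK criterion of Evanno et al., which looks for the sharpest change in the log-likelihood curve across values of K. The sketch below shows one simplified way to compute it from replicate STRUCTURE runs, using the replicate means for the second-order difference and the replicate standard deviation as the denominator. The log-likelihood values are invented for illustration and the function name `delta_k` is hypothetical; the run outputs of this study are not given in the text.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Delta K for each K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K -> list of ln P(D) values from replicate runs.
    Uses |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd(L(K)).
    """
    avg = {k: mean(values) for k, values in lnp_by_k.items()}
    scores = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in avg or k + 1 not in avg:
            continue                               # undefined at the ends of the range
        second_diff = abs(avg[k + 1] - 2.0 * avg[k] + avg[k - 1])
        scores[k] = second_diff / stdev(lnp_by_k[k])
    return scores


# Made-up ln P(D) values for three replicate runs per K (not data from this study)
toy_runs = {
    1: [-5200.0, -5204.0, -5196.0],
    2: [-4900.0, -4906.0, -4894.0],
    3: [-4600.0, -4602.0, -4598.0],
    4: [-4580.0, -4590.0, -4570.0],
    5: [-4570.0, -4585.0, -4555.0],
}
scores = delta_k(toy_runs)
best_k = max(scores, key=scores.get)
print(scores, "-> most likely K =", best_k)
```

In the toy data the likelihood rises steeply up to K = 3 and then flattens, so ΔK peaks at K = 3. Note that ΔK cannot be evaluated for the smallest and largest K tested, which is why a range of K values around the expected number of groups has to be run.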

Conclusions
Genetic diversity analysis is needed to facilitate the conservation, classification, maintenance and use of the valuable genes available in genetic resources. In Türkiye, many efforts have been made to identify the best wheat genotypes in terms of yield and agromorphological traits. Although wheat genotypes collected from some regions have been previously characterized using other marker systems, here an SSR marker set was used to assess genetic diversity and population structure in a set of bread wheat genotypes. Our results showed acceptable values for the average allele number, PIC, GD, Exp-Het and uHe parameters. In addition, the mean values of Ne, H and I for all genotypes tested were estimated at 1.190, 1.049 and 0.168, respectively. AMOVA showed that variability within populations was higher than that between them (78% vs. 22%). In addition, the Fst values for the assumed sub-populations were 0.253, 0.330 and 0.244, respectively. In conclusion, our findings again showed that there is a high level of genetic diversity among Türkiye bread wheat genotypes, which in turn can be taken into account in future wheat breeding programs.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/genes14061182/s1, Table S1: Details of 120 SSR markers, including primer name, primer sequences and chromosomal location.
Informed Consent Statement: All data needed to conduct this study is provided within the manuscript.
Data Availability Statement: Data is contained within the article.

Conflicts of Interest:
The authors declare no conflict of interest.