Genetic Diversity and Relationships of Terebinth (Pistacia terebinthus L.) Genotypes Growing Wild in Turkey

Genetic diversity and relationships of 54 wild-grown terebinths (Pistacia terebinthus L.) were determined using 40 SSR (simple sequence repeat) markers (38 in silico polymorphic SSR markers and 2 SSR markers). In silico polymorphic SSR analysis, 430 alleles were identified. The number of alleles per locus ranged from 3 to 25 with a mean value of 11 alleles per locus. The values of polymorphism information content (PIC) ranged from 0.34 (CUPOhBa4344) to 0.91 (CUPSiBa4072) with a mean PIC value of 0.68. Genetic distances were estimated according to the UPGMA (Unweighted Pair Group Method with Arithmetic Average), the Structure, and Principal Coordinates (PCoA) based clustering. The structure analysis and UPGMA clustering of the genotypes depicted two major clusters. PCoA results supported cluster analysis results. The dendrogram revealed two major clusters. Forty-two samples were obtained from the Kazankaya canyon and 12 samples from the Karanlıkdere region. The two regions are 130 km apart from each other but in a dendrogram, we did not find geographical isolation. The results proved the efficiency of SSRs for genetic diversity analysis in the terebinth. Based on the results, SSRs can be applied as a trustworthy tool for the evaluation of genetic diversity in terebinth genotypes. Molecular analysis on the terebinth genotypes in this study will promote the germplasm collection and the selection of the populations in future studies on terebinths for genetic mapping, genetic diversity, germplasm characterization, and rootstock breeding.


Introduction
Cultivation of pistachio (Pistacia vera L.) has gained more importance recently because its fruits have been proven to have antioxidant and anti-inflammatory effects on humans. The fruits are rich in fibers, minerals, and unsaturated fat. It also has high economic value for farmers.
The cultivated pistachio has several wild relatives such as Pistacia khinjuk Stocks, Pistacia lentiscus L., and Pistacia terebinthus L. [1]. Having knowledge of genetic relationships among different species of pistachio is vital for the proper evaluation of cultivated and wild species. The genus Pistacia is distributed circumpolarly around the globe. Pistachio originated in south-central Asia and migrated throughout the Mediterranean region of Southern Europe and North Africa, Middle East [1] and eastwards up to China. Pistacia texana Swingle and P. mexicana Kunth remain completely remote from other members cydonia [31], etc. On the other hand, in silico polymorphic SSR markers developed by comparing sequences of different genotypes for repeat lengths, provide a high percentage of polymorphic SSRs. In silico microsatellite variability may provide a useful proxy for PCR product polymorphism [32]. The identification of new SSRs in silico is a relatively cost and time-efficient approach compared to SSRs being developed from genomic libraries [33].
The evaluation and proper determination of genetic diversity among terebinth genotypes is lacking in the literature. The use of in silico polymorphic SSR markers for the determination of genetic diversity among terebinth genotypes can be a reliable tool. Thus, in the present study, we focused on genetic diversity among a large number of terebinth (Pistacia terebinthus L.) genotypes with different origins using in silico polymorphic SSR markers. The results may improve the knowledgebase on the level of diversity of the collection of proper terebinth trees and will be useful for the conservation and management of these genetic resources.

Plant Material
A set of 54 terebinth genotypes were used as experimental materials (Table 1).  Figure 1A,B, Table 1). The geographic coordinates were determined by reference to the World Geodetic System 1984 (WGS84).
In our field expedition to determine wild species in fruiting stage in the Yozgat province, we saw that only Kazankaya Canyon and Karanlıkdere region, which were microclimatic areas within Yozgat, had terebinth populations. The two regions are 130 km apart from each other. The ages of trees in both regions are estimated to be 25-30 years old. Moreover, Kazankaya Canyon was more populated by terebinth trees thus we selected 42 genotypes from the Kazankaya Canyon for evaluation. Three vegetation types are seen in the Kazankaya Canyon study area. Rock vegetation is clearly separated from the forest and hygrophilic vegetation in the area. The Karanlıkdere region has steppe vegetation. The Kazankaya canyon is 10 km long and terebinth trees are more populated within the first 3 km of the canyon. The leaf samples of 42 genotypes were collected at 10 m intervals. In the Karanlıkdere region, terebinth trees were found together with grapevine. The terebinth trees grow wildly in the Yozgat province Karanlıkdere region (latitude: 39°34′25.92″, longitude: 34°36′04.22″) and the Kazankaya Canyon (latitude: 40°13′43.46″, longitude: 35°19′36.02″) in middle Turkey ( Figure 1A,B, Table 1). The geographic coordinates were determined by reference to the World Geodetic System 1984 (WGS84).

DNA Extraction
Total genomic DNA was isolated from fresh leaves by the CTAB method described by Doyle and Doyle [34] with some modifications [21]. The Qubit Fluorometer (ThermoFisher Scientific, Waltham, MA, USA) was used to quantify isolated DNA. Followed by diluting the extracted DNA to 10 ng/µL for SSR-PCR reactions, the samples stored at −20 • C.

In Silico Polymorphic SSR-PCR Reactions
Forty-five in silico novel SSR markers developed by Khodaeiaminjan et al. [35] were used to construct in silico polymorphic SSR-based genetic linkage maps in pistachio and five SSR markers developed by Motalebipour et al. [36] in pistachio were used for testing the amplification and polymorphism in eight terebinth genotypes. The 38 of 45 in silico polymorphic novel SSR markers have been used for the first time in the characterization study of Pistacia wild species. Finally, 40 SSR primer pairs were selected for the characterization of 54 terebinth genotypes ( Table 2). All SSR-PCR reactions were done based on a three-primer strategy according to Scheulke [37] with minor modifications. A total volume of 12.  PCR amplifications were done in two consecutive steps. The first step included initial denaturation at 94 • C for 3 min, followed by 28 cycles of 94 • C for 30 s, 58 • C for 45 s, and 72 • C for 60 s. The second step involved 10 cycles of 94 • C for 30 s, 52 • C for 45 s and 72 • C for 60 s, and a final extension at 72 • C for 5 min. When the PCRs were completed, the reactions were subjected to denaturation for capillary electrophoresis in an ABI 3130xl genetic analyzer (Applied Biosystems Inc., Foster City, CA, USA (ABI)) using a 36-cm capillary array with POP7 as the matrix (ABI). Then, samples were denatured by mixing 0.5 µL (in 6-FAM and VIC labeled primers) or 1.0 µL (in NED and PET labeled primers) of the amplified product, 0.3 µL of the size standard and 9.7 µL of Hi-Di formamide. The ABI data collection software 3.0 was used for resolving the fragments, and then SSR fragment analysis was done using the GeneMapper 4.0 (Applied Biosystems Inc., Bedford, MA, USA).

Genetic Diversity
After capillary electrophoresis of the in silico polymorphic SSR loci PCR samples, an effective number of alleles (Ne), expected heterozygosity (He), the number of alleles per locus (Na), and observed heterozygosity (Ho) were calculated using the GenAlEx version 6.5 program [38]. PIC for the loci was calculated using PowerMarker software version 3.25 [39].

Genetic Relationship
The UPGMA (Unweighted Pair Group Method with Arithmetic Average) based dendrogram analysis was conducted in MEGA4 software [40]. The genetic similarity coefficients among terebinth genotypes were calculated according to Jaccard prior to the pedigree to be obtained. Principal Coordinates (PCoA) based clustering was also done using the GenAlEx version 6.5 program. Population structure and identification of admixed individuals were performed using the model-based software program, STRUCTURE 2.3.4 [41]. In this model, a number of populations (K) are considered to be available, with each of them characterized by a set of allele frequencies at each locus. Individuals in the sample are given to populations (clusters), or jointly to more populations if their genotypes indicate that they are admixed. Ln P (D) values (logarithm probability for each K) were applied to determine the Delta K indicating the probable population number. The term of Delta K is calculated by the change ratio of logarithm probability (∆K = 2 to ∆K = 10). In the diagram, the highest K of Delta K confers the information about the probable population number.

Results
Fifty SSR primer pairs were used to discover polymorphisms for terebinth genotypes and 40 out of them generated scorable and polymorphic bands and were consequently applied for fingerprinting of 54 genotypes. Allele sizes of all SSR primers in genotypes were determined by capillary electrophoresis.

Polymorphism Levels of SSR Loci
Of 50 screened SSR primer pairs, 10 failed in the amplification, and the 40 remaining SSR markers generated polymorphic alleles in eight genotypes, and they were consequently used for characterization of all terebinth genotypes (Table 3).
In total, 430 alleles were detected for all studied genotypes, and the number of alleles varied between 3 to 25 alleles per locus (Table 3) with a mean value of 11. The highest number of alleles was obtained from the CUPSiPa732 locus (Na = 25). The number of effective alleles (Ne) ranged from 1.56 (CUPOhBa4344) to 12.38 (CUPSiBa4072) with a mean of 4.95. The observed heterozygosity (Ho) ranged from 0.28 to 1.00 with a mean of 0.68. Observed heterozygosity (He) was the highest in the CUPVSiirt625 and CUPVSiirt1326 loci. The average value of expected heterozygosity (He) was 0.71 and the highest value (0.92) was seen in the CUPSiBa4072 locus. Polymorphism information content (PIC) of the loci ranged from 0.34 (CUPOhBa4344) to 0.91 (CUPSiBa4072) with an average of 0.68 (Table 3). Except two *SSR markers, the other thirty-eight are in silico polymorphic SSRs.

Genetic Relationships among Terebinth Genotypes
The generated dendrogram from the UPGMA analysis was shown in Figure 2A. Genetic similarity coefficients ranged from 0.08 to 0.78. All genotypes were grouped into two significant clusters by the UPGMA clustering analysis (Figure 2A). The highest genetic similarity coefficient (0.08) was achieved between the K28 and K22 genotypes while the lowest genetic similarity coefficient (0.78) was obtained between the M7 and K55 genotypes.
Principal Coordinate Analysis (PCoA) was done to envision the relationship among the terebinth genotypes in more detail. The variations expressed on axes 1, 2, and 3 were 67.34, 21.46, and 5.08%, respectively. The first principal component accounts for 93.88% of the total variance. The results of the PCoA showed that all genotypes are evidently separated from each other ( Figure 2B).
Structural genetic analysis was performed in 54 terebinth genotypes using 40 amplified loci by STRUCTURE and STRUCTURE HARVESTER programs ( Figure 2C). The highest value of Delta K (∆K) was obtained at ∆K = 2 ( Figure 2D). ∆K = 2 corresponded to the greatest possible number of populations in the study. Thus, all genotypes were categorized into two major clusters similar to the UPGMA analysis results (Figure 2A).
The dendrogram of the relationships of genotypes was very similar to the structural genetic analysis (Figure 2C,D). Overall patterns of genetic differentiation examined using the UPGMA and the structure analysis, and PCoA were in accordance with each other (Figure 2). separated from each other ( Figure 2B).
Structural genetic analysis was performed in 54 terebinth genotypes using 40 amplified loci by STRUCTURE and STRUCTURE HARVESTER programs ( Figure 2C). The highest value of Delta K (ΔK) was obtained at ΔK = 2 ( Figure 2D). ΔK = 2 corresponded to the greatest possible number of populations in the study. Thus, all genotypes were categorized into two major clusters similar to the UPGMA analysis results (Figure 2A). The dendrogram of the relationships of genotypes was very similar to the structural genetic analysis ( Figure 2C,D). Overall patterns of genetic differentiation examined using the UPGMA and the structure analysis, and PCoA were in accordance with each other (Figure 2).

Discussion
Molecular characteristics were used to elucidate the genetic diversity of 54 terebinth genotypes.

In Silico Polymorphic SSR
Comprehensive studies of genetic diversity in particular in underutilized species are pivotal not only for breeding but also for genetic conservation and management strategies. The studies on genetic diversity and relationships of terebinth genotypes using SSRs are limited in the literature [26,[42][43][44][45] and to date, no single report on the application of SSRs to determine genetic diversity and relationships of terebinth genotypes is available in Turkey. Therefore, this highlighted the importance of the current study to develop a unique identity card for Turkish terebinth and for their conservation. In terms of the advantages of the SSR technique, we present a few polymorphic SSR loci in the present study (Tables 2 and 3). Vendramin et al. [42] examined 10 SSR primers for 82 genotypes among Pistacia species including P. terebinthus, P. vera, P. atlantica subsp. mutica, P. mutica × P. khinjuk, and P. integerrima and found that the most recently evolved species in Anacardiaceae family is P. terebinthus. They obtained 73 total fragments with 68 polymorphic ones where the average number of polymorphic fragments per primer was 6.8. The genetic diversity studies on wild Pistacia attempt to increase the genetic base and breed stress-tolerant crops to mitigate the impacts of climatic resilience. In a study, the transferability of 47 SSRs was evaluated in 8 wild Pistacia species including P. terebinthus, and these SSRs had the lower number of transferable loci (72.9%). A total of 168 alleles were generated with an Agronomy 2021, 11, 671 8 of 11 average of 3.6 alleles per locus [26]. In another study, the transferability of 110 SSR primer pairs was tested in eight wild Pistacia species including P. terebinthus, which had the lower number of transferable loci (66.7%) [43]. In a study, a total of 206 SSRs were applied for the characterization of twenty-four P. vera cultivars and twenty wild Pistacia genotypes belonging to P. terebinthus, P. atlantica, P. integerrima, P. chinenesis, and P. lentiscus genotypes. In P. terebinthus, a total of 416 alleles were generated by 142 SSR loci, ranging from one to seven with an average of 3.4 alleles per locus [44] indicating lower alleles per loci than our findings and identify unique diversity locked within the wild terebinth genotypes growing in middle Anatolia. In another study, a total of 55 genic SSRs were evaluated for characterization of 11 P. vera cultivars and 78 wild Pistacia genotypes which belong to nine Pistacia species (P. terebinthus (11 genotypes), P. khinjuk, P. eurycarpa, P. atlantica, P. mutica, P. integerrima, P. chinensis, P. lentiscus, and P. palaestina). In P. terebinthus, 46 SSR primers were tested for 11 P. terebinthus and 180 alleles were detected with a mean value of 3.27 [45]. In the present study, 430 alleles were produced by 40 in silico polymorphic SSR loci, ranging from 3 to 25 with an average of 11 alleles per locus for genetic characterization of 54 terebinth genotypes. The obtained values on alleles per locus were higher compared to values in previous studies. Evidence suggests genetic diversity in all crop species has declined during polyploid formation and domestication followed by intensive selection. Our results showed that the level of detected diversity was higher than previous studies in the literature, based on the average PIC, He, and Ho per locus [26,[42][43][44][45] indicating that terebinth populations in Turkey are still untouched with high genetic diversity and it is well known that decreasing genetic diversity is indicative of the strong selection pressure and role of domestication, whereby homozygosity for a few quality traits alone has added to genomic bottlenecks.
The frequency of alleles and the number of expressed alleles per locus which correspond to genetic diversity were determined by PIC. In this study, the PIC values were obtained between 0.34 and 0.91 with an average of 0.68, while Pistacia Karci et al. [45] calculated it as 0.33, Motalebipour et al. [44] reported it as 0.50, and Topcu et al. [43] calculated it as 0.51. In our study, He and Ho were 0.71 and 0.68, respectively, while Karci et al. [45] reported 0.38 and 0.34, Motalebipour et al. [44] calculated as 0.56 and 0.47, Topcu et al. [43] reported 0.56 and 0.52, Zaloglu et al. [26] found as 0.64 and 0.57, Vendramin et al. [42] reported as 0.38 and 0.43, respectively. As a result, in this study, the average number of PIC, He, and Ho values were found to be higher compared to other research in the literature. This could be the effect of in silico polymorphic SSR loci and also the use of more terebinth genotypes in the current studies. In addition, high levels of genetic variation between individuals within a population can be attributed to genetic drift, mutations, and environmental conditions. Since, the terebinth plants in this study consist of genotypes from different geographical locations and different climatic conditions, it is expected that there will be large differences within the population.

Genetic Relationships among Terebinth Genotypes
Although the morphological classification of Pistacia species started by Zohary [9], the first study at the molecular level was carried out by Parfitt and Badanes [46]. The researchers divided the Pistacia genus based on leaf phenology into two groups as deciduous (P. atlantica, P. khinjuk, P. chinensis, P. palaestina, P. terebinthus, and P. vera) and evergreen (P. lentiscus, P. weinmannifolia, P. texana, and P. mexicana). Although many morphological studies have been done on Pistacia species, molecular studies are very limited. However, it is still discussed whether P. palaestina and P. terebinthus are different or the same species. In the first classification, Engler [47] reported that P. palaestina is a variety of P. terebinthus. Later, Zohary [9] reported that P. palaestina and P. terebinthus were different species. However, in a latter classification by Yaltirik [10], P. palaestina was described as a subspecies of P. terebinthus. Golan-Goldhirsh et al. [48] evaluated the polymorphism among and within Pistacia species of the Mediterranean region by AFLP and RAPD analyses. The researchers reported that P. terebinthus and P. palaestina were extremely similar. Classification studies Agronomy 2021, 11, 671 9 of 11 of Yaltirik and Engler's were then supported by Kafkas [5] and Kafkas et al. [49]. Avanzato [50] reported the monoecious forms of P. terebinthus, located in the Rodopi mountains (Bulgaria) using molecular markers (RAPD) and phenological observations and initiated a monoic P. vera breeding program.
The results of in silico polymorphic SSR-based structural genetic analysis and UPGMA clustering were similar to the genetic relationships of terebinth genotypes reported by Karci et al. [45], Motalebipour et al. [44], and Vendramin et al. [42] which were clustered in the same groups.
The result of this study demonstrated that SSR markers used in this study were enough to discriminate against the genotypes. This study obtained a robust dendrogram that agrees with relationships established among terebinth genotypes in previous studies [26,[42][43][44][45]. The overall PCoA based on genetic similarity matrices was applied to envision the genetic relationships among terebinth genotypes ( Figure 2B). The first principal component accounts for 93.88% of the total variance. Our results proved that there is a high level of genetic diversity in the terebinth genotypes studied in this paper. Previous studies confirmed high genotypic diversity among wild horticulture crops for most of the traits [51][52][53][54][55][56].

Conclusions
In this study, terebinth genotypes from Turkey were analyzed by using 40 in silico polymorphic SSRs and SSRs. The results revealed that all 54 used genotypes were grouped at clusters with the same geographic origin. The SSRs were also the most powerful and efficient tools to differentiate the genotypes. Moreover, the information of the microsatellite number of repeats besides polymorphisms in silico may facilitate the reduction of the number of candidate polymorphic microsatellites in laboratory testing and decrease the cost, time, and labor. Consequently, in silico polymorphic novel SSR markers which were used in this study represent the first characterization study of Pistacia terebinthus L. The results can be used in the selection of proper terebinth trees as rootstocks and also can be used for future studies on genetic mapping, genetic diversity, germplasm characterization, and rootstock breeding. It is also appropriate to verify the gene pool (GP) for this crop wild relative, in order to evaluate the crossing with the cultivated species (Pistacia vera), and therefore, verify the selection of hybrids with different and perhaps better fruit characteristics, and possibly more resistance to pathogens [57][58][59][60][61][62].