Assessment of genetic diversity and phylogenetic relationship of local coffee populations in southwestern Saudi Arabia using DNA barcoding

The genetic diversity of local coffee populations is crucial for breeding new varieties better adapted to an increasingly stressful environment driven by climate change and to evolving consumer preferences. Unfortunately, local coffee germplasm conservation and genetic assessment have not received much attention. Molecular tools offer substantial benefits in identifying and selecting new cultivars or clones suitable for sustainable commercial utilization. New annotation methods, such as chloroplast barcoding, are necessary to produce accurate and high-quality phylogenetic analyses. This study used DNA barcoding techniques to examine the genetic relationships among fifty-six accessions collected from the southwestern part of Saudi Arabia. PCR amplification and sequence characterization were used to investigate the effectiveness of four barcoding loci: atpB-rbcL, trnL-trnF, trnT-trnL, and trnL. The highest number of nucleotide sites, nucleotide diversity, and average number of nucleotide differences were recorded for atpB-rbcL, while trnT-trnL had the highest variable polymorphic sites, segregating sites, and haploid diversity. Among the four barcode loci, trnT-trnL recorded the highest number of singleton variable sites, while trnL recorded the highest number of parsimony-informative sites. Furthermore, the phylogenetic analysis clustered the Coffea arabica genotypes into four different groups, with three genotypes (KSA31, KSA38, and KSA46) found to be the most divergent, standing alone in the cluster and remaining apart throughout the analysis. The study demonstrates the presence of considerable diversity among coffee populations in Saudi Arabia. It also shows that DNA barcoding is an effective technique for identifying local coffee genotypes, with potential applications in coffee conservation and breeding efforts.


INTRODUCTION
Coffee is one of the most commercially significant crops, and the second most traded commodity after oil (Mussatto et al., 2011). In addition to its high export value, coffee has also gained cultural significance over the past few decades. Although more than 125 species have been reported in the genus Coffea, only two, Coffea arabica L. (also known as Arabica coffee) and C. canephora Pierre ex A. Froehner (known as Robusta coffee), are grown commercially (Mishra, 2019). The total annual global coffee production in 2022 was 10.2 million tons, about 60% of which was Arabica coffee (USDA, 2023). Coffee's genetic development is progressing at a sluggish pace despite its enormous economic relevance (Mishra, 2019). The collection, characterization, and wise use of accessible germplasm material for any crop plant species contribute to its genetic development and long-term viability (Nguyen & Norton, 2020). Therefore, enhancing diversity from both local and foreign sources is critical for the improvement of crops (Migicovsky et al., 2019). For historical reasons, the main issue with Arabica coffee has been its narrow genetic base, which limits its adaptation to changing environments (Mishra, 2019). To get around this problem, breeders made use of wild diploid coffee species to introduce new genes into Arabica genotypes (Mishra, 2019). For instance, the leaf rust-resistant Arabica cultivar Timor Hybrid got its resistance from its C. canephora parent; it was later used as a parent to develop several new rust-resistant cultivars such as Catimor and Ruiru 11 (World Coffee Research, 2023). For bean and liquor quality traits, the wild tetraploid Arabica genotypes from the species' center of origin and the little-known ancient varieties from the Arabian Peninsula offer a wide gene pool to explore (Montagnon et al., 2021). Despite the potential importance of coffee heirlooms from the Arabian Peninsula as a source of genetic diversity, there is limited information available on these genotypes. This information is essential for the development of new coffee varieties that can better adapt to changing environmental conditions, increasing pest and disease pressure, and changing consumer preferences (Herrera & Lambot, 2017). Furthermore, since over 60% of wild coffee species are in danger of extinction due to accelerated environmental change, gathering complete information and characterizing this germplasm is of utmost importance (Davis et al., 2019).
Another issue facing the coffee industry as it struggles to cope with an over-supplied market is adulteration. It has long been known that coffee is often adulterated with less expensive and readily available plant material (Oliveira & Franca, 2015). Coffee adulteration has become a more serious issue for the industry in recent years due to the significant expansion in the variety of coffee recipes, stores, and ultimately consumers (Choudhary et al., 2020). Therefore, developing molecular means such as genetic barcodes to identify and authenticate varieties can help mitigate the problem.
In Saudi Arabia and Yemen, C. arabica has been cultivated for at least four centuries on the terraced slopes and narrow valleys of the western mountains at altitudes ranging mostly from 1200 to 2000 m above sea level (a.s.l.) (Al-Zaidi et al., 2016; Al-Asmari, Zeid & Al-Attar, 2020). Most of what is grown now in southwestern Saudi Arabia consists of old cultivars that have been around for hundreds of years (Tounekti et al., 2017). It is likely that these diverse populations are a result of successive introductions of genetic material from Eastern Ethiopia by Arab traders over centuries of uninterrupted exchange across the narrow strait of Bab El-Mandeb (Montagnon et al., 2022). Therefore, it is safe to assume that the southwestern corner of the Arabian Peninsula contains the most genetic diversity of C. arabica outside the species' center of origin in the Ethiopian highlands (Montagnon et al., 2021). Regrettably, the scientific community has shown only limited interest in these genetic resources, the notable exceptions being the 1989 FAO expedition to southern Yemen (Eskes, 1989) and three subsequent studies (Tounekti et al., 2017; Montagnon et al., 2021; Al-Ghamedi et al., 2023). These studies reported the existence of considerable diversity among coffee populations in the Arabian Peninsula. It is worth noting that the present coffee populations have evolved over hundreds of years in a semi-arid environment (De Pauw, 2002) marked by recurring droughts, uneven distribution of rainfall, heat stress, and high irradiance. Therefore, it is expected that these genotypes could be the source of interesting genes that confer stress tolerance (Tounekti et al., 2018).
In recent years, DNA metabarcoding has emerged as a progressive alternative approach enabling qualitative analysis (species or genus identification for certain taxa) and, to some extent, quantitative analysis of complex biological mixtures. This method utilizes high-throughput sequencing (HTS) and comparative analysis of specific DNA sequences known as "DNA barcodes" to differentiate the species present within the mixture (Omelchenko et al., 2022). One of the main challenges in plant barcoding is the selection of an appropriate DNA barcode for the target taxa (Coissac, Riaz & Puillandre, 2012; Taylor & Harris, 2012). The effectiveness of the primary chloroplast markers, initially suggested by the CBOL group to consist of matK and rbcL, is a crucial factor to consider in this context. The same study also demonstrated that the trnL marker reliably identifies 50% of the plant species considered, affirming its credibility as a taxonomic tool for plant identification (Valentini et al., 2009).
Differences among coffee species have been established through phylogenetic analyses using different barcode intergenic spacer sequences (Cros et al., 1998; Tesfaye et al., 2014), introns (Tesfaye et al., 2007), plastid DNA and the internal transcribed spacer (ITS) region of rDNA (Lashermes et al., 1997), and different combinations of four plastid and ITS primers (Jingade et al., 2019). Similarly, chloroplast DNA (cpDNA) sequence variation is widely used for identification and for making phylogenetic inferences at different taxonomic levels (Li et al., 2019). Introns and intergenic spacers are known to exhibit high rates of mutation (Barakat et al., 2010). The trnT-trnL, trnL-trnF, and atpB-rbcL intergenic spacers and the trnL intron region have been successfully used for species identification at low taxonomic levels. These regions have also been used in phylogenetic studies to determine cytoplasmic differences as well as the demographic history of several species (Barakat et al., 2010; Mashaly et al., 2017). These markers were successfully used for the identification of species and the construction of phylogenies at different taxonomic levels within the Rubiaceae family (Kårehed et al., 2008; Ginter, Razafimandimbison & Bremer, 2015). Therefore, these four barcode loci were used for the identification of local coffee genotypes present in the southwestern region of Saudi Arabia.
Overall, further research is necessary to fully comprehend the diversity and potential of diploid and tetraploid coffee species and to utilize this information to develop new coffee varieties that can better meet the needs of farmers and consumers in the future. The present study aims to use DNA barcoding to identify the local coffee genotypes in Saudi Arabia, to estimate the genetic diversity of the local coffee populations, and to examine their genetic relatedness using chloroplast intergenic spacer markers.

Plant material
The plant material for the study was collected as previously described by Al-Ghamedi et al. (2023). A survey was carried out at several sites in the Sarawat mountain range, running parallel to the Red Sea from the southeast to the northwest through the three administrative regions of Jazan, Assir, and Al-Baha. The survey covered a narrow strip of terraced mountains located between latitudes 17°N and 20°N, the most northern location where coffee is commercially grown in the world. The coffee gardens included in the survey were found at altitudes ranging from 900 to 2000 m a.s.l. In total, we collected young leaves from 56 accessions, from Jebel Fayfa (Fayfa district), Eddayer, Maadi (Haroub district), Jebel Al-Gahr (Al-Rayth district), Rayda valley (Assouda district in Assir region), Mahayel Assir district, Al-Majarda district, and Jebel Shada (Al-Mekhwah district of Al-Baha region) (Table 1). We tagged and sampled 3-4 trees representing each tree population. Each accession was given a code starting with the acronym "KSA" (e.g., KSA-1), but, for the sake of simplicity, we dropped the acronym in the figures. The letter "R" was added to the code of accessions 1-19, 45, and 51 to indicate that they were sourced from a small, local coffee germplasm collection established in the Fayfa district.

DNA extraction
Portions of this text were previously published as part of a preprint (Khemira et al., 2023). Plant material, consisting of young leaves from various C. arabica accessions, was collected from representative trees in each population, transported to the laboratory in a storage container, and stored at −20 °C prior to DNA extraction. The leaves were sanitized by immersing them in a 5% sodium hypochlorite solution for 1-2 min and then rinsing them with sterile distilled water. The material was then ground in liquid nitrogen and stored in a −80 °C freezer. DNA was extracted from 100 mg of mixed powder using an innuPREP Plant DNA Kit (Analytik, Jena, Germany), following the manufacturer's protocol. DNA quality and concentration were determined using a Nanodrop ND-1000 spectrophotometer (Saveen Werner, Limhamn, Sweden).

Chloroplastic DNA amplification and sequencing
Four chloroplast DNA regions were considered (Table 2). PCR was performed in a 25 µl volume containing 2 µl of template DNA, 10 µl of 1X innuMix Standard PCR, and 1 µM of each primer (Table 2) (Khemira et al., 2023). The GeneAmp PCR System 9700 was used with the following program: initial denaturation at 94 °C for 5 min; 35 cycles of denaturation at 94 °C for 1 min, annealing at 49-52 °C for 60-75 s, and elongation at 72 °C for 60-75 s; followed by a final extension at 72 °C for 10 min. To check the reliability of the PCR, a negative control using sterile water was included in all amplifications.
The PCR products were checked by electrophoresis on 1% agarose gel in TAE buffer, and DNA was visualized under UV light after staining with ethidium bromide. The amplified products were purified using the GFX PCR kit (GE Healthcare, Chicago, IL, USA). Sequencing reactions were carried out by Congenic using Sanger technology, separately for each strand to obtain independent forward and reverse sequences. The forward and reverse fragments were aligned, and additional reactions were conducted in case of any discrepancies.
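The strand-comparison step can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the helper names `reverse_complement` and `strand_discrepancies` are hypothetical, and the sketch assumes the two reads have already been trimmed to the same region and have equal length.

```python
# Illustrative sketch only; function names are hypothetical, not from the study.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def strand_discrepancies(forward: str, reverse: str) -> list[int]:
    """Return 0-based positions where the forward read disagrees with the
    reverse-complemented reverse read. Positions called N on either strand
    are ignored. Assumes both reads cover the same region (equal length)."""
    fwd = forward.upper()
    rev_rc = reverse_complement(reverse).upper()
    if len(fwd) != len(rev_rc):
        raise ValueError("reads must cover the same region (equal length)")
    return [
        i
        for i, (a, b) in enumerate(zip(fwd, rev_rc))
        if a != b and "N" not in (a, b)
    ]
```

Any position returned by `strand_discrepancies` would correspond to a base call that the two strands do not support equally, which is the situation in which a repeat sequencing reaction would be warranted.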

Sequence analysis
The scanner software-2 was utilized to determine the quality of the sequences. The four barcode sequences of each C. arabica genotype were manually curated and aligned using the Contig assembly program in BioEdit 7.0 software to ensure high-quality sequences.
Nucleotide sequences obtained from the 57 accessions were initially aligned using CLUSTAL W (Thompson, Higgins & Gibson, 1994) and analyzed with MEGA program version X. The number of individuals, number of nucleotide sites, variable polymorphic sites, number of segregating sites, number of haplotypes, nucleotide diversity, and average number of nucleotide differences of each barcode marker and consensus sequence were measured using DnaSP (v6) (Rozas et al., 2019). The number of insertion events was counted as the number of variable sites where polymorphism arises from the addition of one or more nucleotides. Likewise, the number of deletions was counted as the number of variable sites where polymorphism arises from the removal of one or more nucleotides. The number of transitions was counted as the number of variable sites where polymorphism is due to an exchange between two purines (A and G) or two pyrimidines (C and T). The number of transversions was counted as the number of variable sites where polymorphism results from the replacement of a purine with a pyrimidine or vice versa. The number of mutation events in a sequence was obtained by combining the number of variable sites with the number of distinct mutations observed at the same nucleotide site across different samples. This quantification therefore accounts for both different types of polymorphism and multiple mutations at the same site. Various parameters were estimated for each sequence region to differentiate them: the number of monomorphic or polymorphic sites, the number of parsimony-informative sites (PIS), nucleotide diversity (π), haplotype diversity (Hd), and the total number of mutations (Hosein et al., 2017; Rabaan et al., 2020), as well as singleton variable sites (STVS) (Pettengill & Neel, 2010). The percentage of polymorphic sites for each sequence was determined by dividing the number of variable nucleotides by the length of the entire region and multiplying the result by 100 (Chen et al., 2023).
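The summary statistics described above can be sketched in a few lines of Python for a toy alignment. This is an illustration under stated assumptions, not a reimplementation of DnaSP: the function name `summarize` is hypothetical, columns containing gaps or ambiguous bases are simply dropped, and DnaSP's own treatment of gaps and of sites with more than two variants may differ.

```python
from collections import Counter
from itertools import combinations


def summarize(alignment):
    """Toy diversity summary for equal-length aligned DNA sequences.
    Columns with any character other than A, C, G, T are discarded
    (an assumption; real tools offer several gap-handling options)."""
    seqs = [s.upper() for s in alignment]
    cols = [c for c in zip(*seqs) if all(b in "ACGT" for b in c)]
    n, length = len(seqs), len(cols)
    if n < 2 or length == 0:
        raise ValueError("need at least two sequences and one usable site")

    segregating = singleton = informative = 0
    for col in cols:
        counts = Counter(col)
        if len(counts) == 1:
            continue  # monomorphic site
        segregating += 1
        # Parsimony-informative: at least two states, each seen twice or more.
        if sum(1 for c in counts.values() if c >= 2) >= 2:
            informative += 1
        else:
            singleton += 1

    # Rebuild the gap-free sequences, then compare every pair.
    clean = ["".join(col[i] for col in cols) for i in range(n)]
    pairs = list(combinations(clean, 2))
    k = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)

    haplotypes = Counter(clean)
    hd = n / (n - 1) * (1 - sum((c / n) ** 2 for c in haplotypes.values()))

    return {
        "sites": length,
        "segregating_sites": segregating,
        "singleton_sites": singleton,
        "parsimony_informative_sites": informative,
        "haplotypes": len(haplotypes),
        "haplotype_diversity": hd,
        "avg_pairwise_differences": k,      # k
        "nucleotide_diversity": k / length, # pi
        "percent_polymorphic": 100 * segregating / length,
    }
```

For example, `summarize(["ACGTACGT", "ACGTACGA", "ACCTACGA", "ACCTTCGT"])` reports three segregating sites out of eight (37.5% polymorphic), of which one is a singleton and two are parsimony-informative.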

Evolutionary analysis by Maximum Likelihood method
The Maximum Likelihood method and the Kimura 2-parameter model proposed by Kimura (1980) were used to assess the evolutionary relationships among the genotypes. The tree with the highest log likelihood (−22360.57) is shown. The Neighbor-Joining and BioNJ algorithms were applied to a matrix of pairwise distances estimated using the Maximum Composite Likelihood approach to obtain the initial tree for the heuristic search. The topology with the superior log likelihood value was retained. The tree was drawn to scale, with branch lengths proportional to the number of substitutions per site. This analysis involved 57 nucleotide sequences. There was a total of 5381 diverse positions present in the final dataset. Evolutionary analyses were conducted in MEGA X (Kumar et al., 2018).
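To make the substitution model concrete, the sketch below computes the pairwise Kimura 2-parameter distance, d = −(1/2) ln[(1 − 2P − Q) √(1 − 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. This is only the closed-form pairwise distance implied by the model; it is not the Maximum Likelihood tree search performed in MEGA X, and the function name `k2p_distance` is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura (1980) two-parameter distance between two aligned sequences.
    Sites with gaps or ambiguous bases in either sequence are skipped."""
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    d = -0.5 * math.log(inner)
    return d if d > 0 else 0.0  # avoid returning -0.0 for identical sequences
```

Because transitions and transversions enter the formula separately, two sequence pairs with the same raw proportion of differences can receive different corrected distances.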

RESULTS
All four intergenic spacer barcode sequences (atpB-rbcL, trnT-trnL, trnL-trnF, trnL) were successfully amplified, each yielding a single band of the expected size. The respective sequences for each barcode were submitted to the National Center for Biotechnology Information (NCBI) via BankIt submission. The accession number of each barcode for the 56 C. arabica genotypes is presented in Table 3. The number of nucleotide sites (NNS), variable polymorphic sites (VPS), number of segregating sites (NSS), number of haplotypes (NH), nucleotide diversity (ND), and average number of nucleotide differences (ANND) for each barcode primer, together with the cumulative results for all four primers, are summarized in Table 4. The combined sequences showed the highest NNS (4114), followed by the atpB-rbcL primer, while the trnL primer had the lowest NNS. The trnT-trnL primer had the highest number of VPS (341), followed by atpB-rbcL, while the lowest (154) was recorded for trnL-trnF. The combined sequences had the highest NSS, followed by the trnT-trnL primer, while trnL and trnL-trnF had the lowest. The number of haplotypes was highest for trnT-trnL and lowest for trnL-trnF and atpB-rbcL, while trnL and the combination of all four markers were intermediate. The primer atpB-rbcL had the highest ND, followed by trnT-trnL (0.051), with trnL-trnF showing the lowest ND. Additionally, atpB-rbcL had the highest ANND (185.54), while trnL-trnF exhibited the lowest (25.23).
The nucleotide base composition of each barcode primer was determined and is presented in Table 5. The average nucleotide base composition of atpB-rbcL was 33.15% T(U), 16.60% C, 34.49% A, and 15.76% G. For trnL, the composition was 26.7% T(U), 15.9% C, 37.6% A, and 19.8% G. trnT-trnL had a composition of 39.18% T(U), 13.88% C, 33.84% A, and 13.10% G. For trnL-trnF, the composition was 32.83% T(U), 19.81% C, 32.21% A, and 15.13% G (Table 5). The singleton variable sites (STVS) and parsimony-informative sites (PIS) for each chloroplast barcode are presented in Table S1. The trnT-trnL barcode recorded the highest number of STVS (338), followed by trnL-trnF (133), trnL (52), and atpB-rbcL, which had the lowest number (1). The total number of PIS was 182 for trnL, 155 for trnT-trnL, 137 for atpB-rbcL, and 45 for trnL-trnF (Table S1). A phylogenetic tree was constructed based on the concatenated sequences of all four barcode primers using the maximum likelihood method and the Kimura 2-parameter model (Fig. 1). The percentage of trees in which the associated taxa clustered together is shown next to the branches. This analysis involved 56 nucleotide sequences, and the final dataset comprised 4,114 positions. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The final phylogenetic tree divided the 56 accessions into four distinct groups. The first group contained six accessions (KSA42, KSA29, KSA2R, KSA41, KSA43, and KSA11R) that were mostly from the Rayda district of Assir region. The second group contained seven accessions (KSA51R, KSA3R, KSA27, KSA60, KSA4R, KSA7R, and KSA1R), all from the Jazan Region except KSA60, which was from Assir. The third group was formed of 12 accessions (KSA45R, KSA13R, KSA39, KSA25, KSA35, KSA59, KSA52, KSA36, KSA24, KSA22, KSA37, and KSA46), all collected from the Jazan Region except KSA59, which was from the north of Assir Region. The fourth and largest group contained 43 accessions that can be further subdivided into four subgroups. The first subgroup (IVa) was a diverse one and contained 12 accessions originating from the three regions. Subgroup IVb contained three accessions (KSA33, KSA28, and KSA5R), all from the Jazan Region. Subgroup IVc contained seven accessions, six from Jazan and one

DISCUSSION
The genetic diversity present in the wild or primitive relatives of any crop plays a crucial role in the effectiveness of crop improvement programs. These wild or unknown genotypes exist in diverse habitats, many of which are currently facing significant threats due to habitat degradation and climate change (Davis et al., 2019). Therefore, developing molecular means such as the genetic barcodes used to identify and validate coffee varieties can help mitigate the problem.
In Saudi Arabia and Yemen, C. arabica has been cultivated for at least four centuries on the terraced slopes and narrow valleys of the western mountains at altitudes ranging mostly from 1,200 to 2,000 m above sea level (a.s.l.) (Al-Zaidi et al., 2016; Al-Asmari, Zeid & Al-Attar, 2020). Food and beverage adulteration is another widespread malpractice of concern to both traders and consumers. In particular, coffee adulteration is practiced to offset high prices and product shortages or to reduce production expenses (Flores-Valdez et al., 2020). Therefore, there is a real need to develop methods and models for detecting and quantifying the adulterants commonly used in coffee.
It is estimated that approximately 60% of wild coffee species are at risk of extinction worldwide. Similarly, underutilized old varieties are disappearing from the orchards. This underscores the pressing importance of preserving these species through both in situ and ex situ measures to safeguard their genetic diversity for future use.
While morphological descriptors are commonly used to characterize different coffee species, molecular markers are considered more efficient in distinguishing closely related species and cultivars (Mishra, Jingade & Huded, 2022). They are also more precise and reliable than morphological and biochemical markers (Hao et al., 2009). Furthermore, several studies have demonstrated that specific regions of the chloroplast genome can serve as DNA barcodes for a wide variety of plant species (Skuza et al., 2019; Meena et al., 2020). Selecting suitable plastid regions provides sufficient genetic information for distinguishing between genotypes. Additionally, when choosing DNA barcoding loci, variable regions should be given primary consideration (Mahadani & Ghosh, 2014). Therefore, the objective of this study was to identify fifty-six local Arabica coffee accessions from southwestern Saudi Arabia and to evaluate the evolutionary and phylogenetic relationships among them using four chloroplast DNA barcoding markers (atpB-rbcL, trnL-trnF, trnL, and trnT-trnL). All four regions were successfully amplified using universal primers, yielding clear and reliable results. In contrast, earlier studies reported cases of only partial amplification of the respective barcode loci using universal barcode primers (Hamon et al., 2017; Wu et al., 2021). Other studies have reported a 100% success rate for PCR amplification and sequencing in mangrove (Guyeux et al., 2019), duckweeds (Meena et al., 2020), and Coffea (Taberlet et al., 1991). The PCR amplification and sequencing of rbcL fragments in core barcodes of mangrove DNA samples achieved a 100% success rate. Our results demonstrated higher universality and success rates than those of Kress et al. (2009) and were consistent with Pei et al. (2015), where success rates ranged from 90% to 100% in forest plant communities within tropical and subtropical regions.
Similarly, other studies (Vickers, 2017; Wu et al., 2019b) have reported that additional barcode primers, including matK, rbcL, and trnL-trnF, amplify successfully within coffee species. However, no significant differences were recorded in the rate of coffee identification between rbcL + trnH-psbA and other combinations of random fragments, which aligns with the findings of the present study using all four barcodes for genotype identification.
Despite the abundance of available data on DNA barcoding of angiosperms, there is currently limited information regarding specific barcodes that can guarantee accurate species identification in all cases (Weigand et al., 2019). Often, a barcode that performs effectively for one group of plants may prove inadequate for another group, especially in the case of recently diverged species (Li et al., 2015). The current study successfully identified all fifty-six accessions as Coffea arabica, except KSA2, KSA41, KSA42, and KSA43 for atpB-rbcL, showcasing the effectiveness of the universal DNA barcode primers. Likewise, multiple studies have extensively documented the reliability of matK and rbcL, either individually or in combination, as DNA barcodes that can be used with confidence across various plant species (Carneiro de Melo Moura et al., 2019). Several reports have recommended the utilization of rbcL as a valuable DNA barcode locus, primarily due to its relatively compact length of 500 bp, high success rate of PCR amplification, and excellent sequencing quality (Wu et al., 2019a; Wu et al., 2019b; Hong et al., 2022). However, other DNA barcodes, such as trnL-trnF and the trnL spacer, have also been suggested as reliable alternative barcodes for the identification of species (Kang, 2021). The extent of sequence variation among the species or terminals under analysis is a crucial factor in determining the effectiveness of any barcoding locus (Carneiro de Melo Moura et al., 2019).
The number of singleton variable sites was found to be higher in trnL, trnL-trnF, and trnT-trnL compared to atpB-rbcL. Similarly, trnL and atpB-rbcL had more parsimony-informative sites than the rbcL barcode spacer region. These findings are consistent with a previous study by Mishra, Jingade & Huded (2022), which reported that trnL-trnF and matK barcodes exhibited greater variability than rbcL in Indian C. arabica genotypes. The present study also found similar results for PIS among the four barcodes analyzed. Similarly, previous research has indicated that trnL-trnF and matK loci exhibit greater sequence polymorphism than rbcL, as suggested by Kimura (1980) and Kumar et al. (2018). The current study's results support these findings. Hence, the present study found that all four barcode sequences, which were evaluated as candidate barcode markers, met the DNA barcoding criteria outlined by Li et al. (2015). Specifically, these markers exhibited sufficient sequence variability to enable effective discrimination among the Saudi coffee genotypes.
The phylogenetic analysis grouped the Saudi C. arabica genotypes into four groups with a clear influence of geographic origin, suggesting that the genotypes of each region share one or more common ancestors (Fig. 1). For instance, accessions KSA11R, KSA41, KSA42, and KSA43 from the isolated Rayda district of Assir region were grouped in clusters I and II. The accessions representing very old trees (KSA36, KSA44, KSA46, KSA47) segregated in the middle of the phylogenetic tree in groups III and IVa. Similar results were reported by Mishra, Jingade & Huded (2022), where the grouping using single and multi-locus barcode primers was strongly influenced by the geographic origin of the genotypes. A molecular analysis of coffee genotypes from Saudi Arabia using SRAP markers grouped them into five distinct groups based mostly on their geographic origin (Al-Ghamedi et al., 2023). The accessions collected from Jazan region primarily clustered in groups II and IV, whereas those from Al-Baha and Assir regions formed a different group. Similar surveys of genetic diversity among coffee populations in northern Yemen (Montagnon et al., 2021) and southern Yemen (Eskes, 1989) found that each district (valley) has its own cultivars. Another study using genotyping by sequencing (GBS) showed that genetic closeness correlated with geographic proximity (Hamon et al., 2017). The current study provides further evidence to support this finding. It was also suggested that chloroplast sequences provide more insights into species evolution because they are more conserved (Guyeux et al., 2019). For future studies on this economically significant crop, we recommend using sequencing and genome-wide association studies (GWAS) to discover additional polymorphic markers associated with important agro-morphological traits. These markers would be beneficial for a range of investigations in Coffea. Ultimately, the polymorphic markers established and confirmed in this research hold potential as a valuable genomic asset for molecular breeding, genotype identification, and biogeography studies on Arabica coffee.

Table 4 Summary of nucleotide sites, variable polymorphic sites, number of segregating sites, haploid diversity, nucleotide diversity, and average number of nucleotide difference.
Notes. NNS, number of nucleotide sites; VPS, variable polymorphic sites; NSS, number of segregating sites; NH, number of haplotypes; ND, nucleotide diversity; ANND, average number of nucleotide difference.
from Al-Baha. Subgroup IVd contained 11 accessions, eight from the Jazan region, two from Assir, and one from Al-Baha.