Genetic diversity in eggplant (Solanum melongena L.) germplasm from three secondary geographical origins of diversity using SSR markers

The Indo-Burmese region is the primary center of eggplant diversity, from where the crop spread to several secondary centers of diversity. In this study, the genetic diversity among fifty-six eggplant accessions collected from three countries (Bangladesh, Malaysia, and Thailand) was assessed using sixteen polymorphic SSR markers to determine suitable parents for heterotic hybridization. The estimates of genetic diversity among the populations of the three countries varied from 0.57 to 0.74, with a Shannon’s index value of 0.65. The mean value of expected heterozygosity and Nei’s index was 0.49, with an average PIC value of 0.83. A dendrogram constructed using UPGMA (unweighted pair group method with arithmetic mean) categorized all accessions into six groups. The AMOVA (analysis of molecular variance) attributed 77% of the total variation to differences within the populations from the three countries and 23% to differences among the populations. The results revealed high genetic differentiation among the eggplant germplasm; accessions that are farther from each other show a high level of diversity and can therefore be recommended as parents in breeding programs. Hence, accessions EB12, ET11, ET13, ET15, ET16, and ET17 could be crossed with accessions EM3, EB34, and EB3 for improvement in future breeding programs.


Introduction
Solanum melongena L., also known as eggplant, belongs to the family Solanaceae and is regarded as one of the most beneficial vegetables worldwide. The crop ranks among the high-value vegetables with the highest antioxidant activity and nutritional value (Liu et al., 2018). The Indo-Burma region is considered the primary origin and center of eggplant domestication, where the highest diversity of this crop is found (Augustinos et al., 2016). The crop is cultivated extensively throughout the tropics and warm temperate regions, especially in the southern USA and the Mediterranean region (Liu et al., 2018). Despite its profitability and nutritional value, breeding efforts for this vegetable are limited compared to other members of the Solanaceae family, such as potato and tomato (Hurtado et al., 2012). The study of genetic diversity is vital in breeding programs because it enables the effective utilization of germplasm in the improvement of closely related species (Jasim et al., 2018). It is essential to assess polymorphisms among existing cultivars and to select parents for hybridization. Generally, morphological characterization is considered the first step towards exploring genetic variation in eggplant (Sulaiman et al., 2020). However, morphological characters are limited in distinguishing homozygous from heterozygous individuals. In addition, morphological characters cannot define the exact level of diversity among existing germplasm because of the additive gene action underlying economically important traits (Jasim et al., 2018). Molecular markers are not influenced by the environment and can reveal genotypic differences at the DNA level.
Hence, molecular markers play a crucial role in analyzing plant genealogy, gene mapping, construction of genetic maps, evolution, germplasm characterization, selection for characters, diversity studies, and the determination of genome organization (Zuki et al., 2020; Sarif et al., 2020). Microsatellites, also known as simple sequence repeats (SSRs), have become one of the most prevalent genetic markers because of their co-dominant inheritance, multi-allelic nature, reproducibility, high genome coverage, abundance, and suitability for polymerase chain reaction (PCR)-based assays (Demir et al., 2010; Chukwu et al., 2020). The first set of SSR markers for eggplant was developed by screening small-insert genomic libraries for di- and trinucleotide repeats (Nunome et al., 2003). Subsequently, a small set of SSR markers developed from genic DNA sequences by Stàgel et al. (2008) was lodged in public databases. Similarly, over 1,000 SSR markers were identified by Nunome et al. (2009) while screening enriched cDNA and gDNA libraries. Barchi et al. (2011) isolated approximately 2,000 putative eggplant SSR markers from restriction-site associated DNA tags, a subset of which exhibited polymorphism among the mapping population parents. A wide range of SSR markers is publicly available for eggplant, either as genomic SSRs (from SSR-enriched genomic libraries) or as EST-SSRs (from genic libraries). Genomic SSRs are usually located in non-coding regions, whereas EST-SSRs are derived from expressed regions of the genome (Muñoz-Falcón et al., 2011). EST-SSRs are also less polymorphic than genomic SSRs (Muñoz-Falcón et al., 2011). The main objective of this study was to evaluate genetic diversity among the collected materials using both genomic SSR and EST-SSR polymorphic markers and to identify suitable parents for heterotic hybridization in future breeding programs.
This study will be useful for the conservation and characterization of eggplant germplasm in future breeding programs.

Planting materials
Fifty-six accessions of eggplant (S. melongena), forming three populations (33 from Bangladesh (EB), 15 from Thailand (ET), and 8 from Malaysia (EM)), were used in this study (Tab. 1). The materials were selected to represent the genetic diversity of local materials in each country (Fig. 1).

DNA extraction, genotyping and electrophoresis
Young leaves from individual accessions (approximately 100 mg) were used for the extraction of genomic DNA following a slightly modified CTAB procedure (Oladosu et al., 2015). The extracted DNA was diluted to 50 ng/μL in TE buffer and stored at −20°C until PCR amplification. The DNA concentration and purity were quantified using a Nanodrop 2000 spectrophotometer (ND 1000). The purity of the extracted DNA, measured for individual samples as the ratio of absorbance at 260 nm to that at 280 nm, ranged from 1.95 to 2.0. The PCR was conducted in a 15 μL reaction containing 7.5 μL 2× Taq DNA polymerase master mix (Thermo Scientific, USA), 4.5 μL nuclease-free water, 1 μL forward primer (10 μM), 1 μL reverse primer (10 μM), and 1 μL DNA template (50 ng/μL). The PCR was run on a PCR machine using a touchdown protocol optimized for eggplant: initial denaturation at 94°C for 3 min; 10 cycles of 94°C for 30 s, annealing at 65-55°C for 1 min (decreasing by 1°C per cycle), and 72°C for 30 s; 30 cycles of 94°C for 30 s, annealing at 55°C for 1 min, and 72°C for 1 min; and a final extension at 72°C for 5 min, followed by rapid cooling to 4°C prior to analysis. The annealing temperature depended on the primers. To separate the amplified DNA fragments, 5 µL of PCR product was loaded on 2% MetaPhor™ agarose (Lonza Rockland, Inc., USA) in 1X TBE buffer (0.05 M Tris, 0.05 M boric acid, 1 mM EDTA; pH 8.0) pre-stained with Midori Green Nucleic Acid Staining Solution (1:100,000) and run at 80 V for 60 min. The gels were then documented using a Molecular Imager® (GelDoc™ XR, Bio-Rad). A 100 bp DNA ladder (GeneDirex) was used to score the bands.
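
The reaction setup and cycling program described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' protocol file: the function names are hypothetical, and it assumes that the 1°C per-cycle decrement applies to the annealing step, starting at 65°C, as is usual in touchdown PCR.

```python
# Reaction components in microlitres, as listed in the text.
REACTION_UL = {
    "2x Taq master mix": 7.5,
    "nuclease-free water": 4.5,
    "forward primer (10 uM)": 1.0,
    "reverse primer (10 uM)": 1.0,
    "DNA template (50 ng/uL)": 1.0,
}


def total_volume(components):
    """Sum the component volumes (uL) of one reaction."""
    return sum(components.values())


def final_concentration(stock, volume_ul, total_ul):
    """Dilution rule C1*V1 = C2*V2, solved for C2."""
    return stock * volume_ul / total_ul


def touchdown_annealing(start_c=65, end_c=55, touchdown_cycles=10, fixed_cycles=30):
    """Annealing temperature per cycle.

    Assumption: the temperature drops 1 degree C per cycle from start_c
    during the touchdown phase, then stays at end_c for the fixed cycles.
    """
    touchdown = [max(start_c - i, end_c) for i in range(touchdown_cycles)]
    return touchdown + [end_c] * fixed_cycles


volume = total_volume(REACTION_UL)
primer_um = final_concentration(10.0, 1.0, volume)
schedule = touchdown_annealing()

print(f"Total reaction volume: {volume} uL")
print(f"Final primer concentration: {primer_um:.2f} uM each")
print(f"Cycles: {len(schedule)}, first annealing steps: {schedule[:3]}")
```

Under these assumptions the component volumes add up to the stated 15 μL, and each primer ends up at roughly 0.67 μM in the final reaction.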

Data analysis
The banding patterns were transformed into binary data using the UVIDoc software, with 1 indicating the presence and 0 the absence of a band at each locus. The data were then analyzed using the POPGENE software Version 1.32 (Yeh, 1997). Polymorphic Information Content (PIC) values were computed using the following formula:

$$\mathrm{PIC}_i = 1 - \sum_{j=1}^{N} p_{ij}^{2}$$

Here, p_ij is the frequency of the j-th allele for the i-th marker, summed over N alleles (Anderson et al., 1993). Clustering and principal component analysis (PCA) were carried out using the NTSYS Pc software Version 2.20 for multivariate analysis. First, the data were standardized to remove the effects of different measurements using the STAND function. The similarity coefficient was then computed from the transformed data using the DICE similarity index, and the relationships were represented in a dendrogram using the unweighted pair group method with arithmetic average (UPGMA) and the SAHN (sequential, agglomerative, hierarchical, and nested clustering) module in NTSYS Pc 2.20. The goodness of fit between the dendrogram and the dissimilarity matrix was estimated by the cophenetic correlation coefficient (r) according to Rohlf (1998). The average genetic distance was then used as a cut-off value to define genotype clusters. PCA was performed using the DCENTER, EIGEN, and GRAPHICS modules as described by Rohlf (1998) to complement the cluster analysis. The distribution of genetic variation within and among populations from different countries was determined by analysis of molecular variance (AMOVA) using the GenAlEx 6.502 software (Peakall and Smouse, 2006). The significance of the estimated parameters was tested based on 10,000 bootstrap resamples.
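
As an illustration of the PIC calculation, the sketch below computes allele frequencies for one marker and applies the Anderson et al. (1993) definition, PIC = 1 minus the sum of squared allele frequencies. The function names and the example allele sizes are hypothetical and are not taken from the study data.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Return {allele: frequency} for the alleles scored at one marker."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def pic(alleles):
    """PIC_i = 1 - sum_j p_ij^2 for a single marker i."""
    freqs = allele_frequencies(alleles)
    return 1.0 - sum(p * p for p in freqs.values())


# Hypothetical marker scored in four accessions (allele sizes in bp).
example_locus = ["160", "160", "172", "180"]
example_pic = pic(example_locus)
print(f"PIC = {example_pic:.3f}")
```

With frequencies of 0.5, 0.25, and 0.25, the sum of squares is 0.375, so the example marker has a PIC of 0.625. A marker with a single allele has a PIC of 0.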

Polymorphism analysis with SSR (simple sequence repeat) markers
Of the 102 markers screened, the 16 that showed polymorphic bands were selected for genetic diversity analysis. The expected heterozygosity ranged from 0 to 0.756, with an average of 0.493 (Tab. 3). The observed heterozygosity was zero (0)
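
The contrast between observed and expected heterozygosity can be made concrete with a small sketch. It assumes diploid genotypes scored as allele pairs and uses Nei's gene diversity (1 minus the sum of squared allele frequencies) for the expected value. The function names and genotypes are hypothetical, and the software used in the study may apply a sample-size correction that is not shown here.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at the locus."""
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)


def expected_heterozygosity(genotypes):
    """Nei's gene diversity: 1 - sum of squared allele frequencies."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical locus in four fully homozygous (selfed) lines.
inbred_lines = [("A", "A"), ("A", "A"), ("B", "B"), ("B", "B")]
ho = observed_heterozygosity(inbred_lines)
he = expected_heterozygosity(inbred_lines)
print(f"Ho = {ho:.2f}, He = {he:.2f}")
```

In this example every line is homozygous, so the observed heterozygosity is 0 even though two alleles segregate at equal frequency and the expected heterozygosity is 0.5. This is the pattern expected in a collection of selfing lines.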

Clustering using SSR markers
Data from the sixteen selected SSR markers were analyzed for clustering using the NTSYS software, and all accessions were grouped in a dendrogram. The similarity coefficient ranged from 0.23 to 0.88. All accessions were classified into six groups at a threshold level of 0.33 (Fig. 2). Cluster I consisted of 31 accessions (30 from Bangladesh and one from Malaysia). Cluster II had 5 accessions from Thailand and Bangladesh, Cluster III consisted of only one accession from Bangladesh, Clusters IV and V had 6 accessions each, and Cluster VI consisted of seven accessions from Thailand. The result of the PCA is presented in Fig. 3. Accessions such as EB12, ET11, ET13, ET15, ET16, and ET17 were farthest from the center, whereas accessions such as EM3, EB34, and EB3 were located near the center.
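
The clustering procedure can be sketched in plain Python as follows. This is an illustrative re-implementation under stated assumptions, not the NTSYS code: the Dice coefficient is taken as 2a / (2a + b + c) on binary band profiles, distance is taken as 1 minus similarity, and UPGMA is implemented as average linkage over all member pairs. The accession names and band profiles are hypothetical.

```python
from itertools import combinations


def dice_similarity(x, y):
    """Dice coefficient 2a / (2a + b + c) for two binary band profiles."""
    shared = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    mismatched = sum(1 for i, j in zip(x, y) if i != j)
    if 2 * shared + mismatched == 0:
        return 0.0  # neither profile has any band
    return 2 * shared / (2 * shared + mismatched)


def upgma(profiles):
    """Average-linkage clustering; returns [(cluster_a, cluster_b, distance)]."""
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1.0 - dice_similarity(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }

    def average_distance(c1, c2):
        # UPGMA: mean distance over all pairs of original members.
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: average_distance(*p))
        merges.append((c1, c2, average_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical band profiles (1 = band present, 0 = band absent).
example_profiles = {
    "acc1": [1, 1, 0, 0],
    "acc2": [1, 1, 0, 1],
    "acc3": [0, 0, 1, 1],
    "acc4": [0, 1, 1, 1],
}
example_merges = upgma(example_profiles)
for left, right, distance in example_merges:
    print(left, "+", right, f"at distance {distance:.3f}")
```

Cutting the resulting merge history at a chosen distance, for example the average genetic distance, yields the cluster membership in the same way that a threshold line is drawn across a dendrogram.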

Analysis of molecular variance (AMOVA) using SSR markers
The SSR profiles of the eggplant genotypes in this research were analyzed using AMOVA to partition the genetic variance among and within populations. Variation among the three populations (Bangladesh, Malaysia, and Thailand) accounted for 23% of the total, whereas variation within populations accounted for 77% across the 56 genotypes (Tab. 5). The AMOVA showed highly significant (p ≤ 0.01) genetic differences both among and within populations. This indicated that genetic dissimilarity was higher within populations than among populations.
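
The variance partition reported by AMOVA can be illustrated with a simplified sketch. It assumes a one-level design (individuals within populations), binary profiles, and a squared distance equal to the number of mismatched bands. The function names and toy data are hypothetical, and the permutation test used for significance is omitted.

```python
from itertools import combinations


def squared_distance(x, y):
    """Squared distance between binary profiles = number of mismatches."""
    return sum(1 for a, b in zip(x, y) if a != b)


def sum_of_squares(group):
    """Sum of squared deviations from pairwise distances: sum(d^2) / n."""
    total = sum(squared_distance(x, y) for x, y in combinations(group, 2))
    return total / len(group)


def amova(populations):
    """Partition variation into among- and within-population components."""
    groups = list(populations.values())
    everyone = [ind for group in groups for ind in group]
    n_total, n_pops = len(everyone), len(groups)

    ssd_within = sum(sum_of_squares(g) for g in groups)
    ssd_among = sum_of_squares(everyone) - ssd_within
    ms_among = ssd_among / (n_pops - 1)
    ms_within = ssd_within / (n_total - n_pops)

    # Effective sample size per population for unequal group sizes.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (n_pops - 1)
    var_within = ms_within
    var_among = (ms_among - ms_within) / n0  # may be negative; not clamped here
    total = var_among + var_within
    return {
        "pct_among": 100 * var_among / total,
        "pct_within": 100 * var_within / total,
        "phi_pt": var_among / total,
    }


# Hypothetical data: two populations with two individuals each.
toy_populations = {
    "pop_A": [[0, 0], [0, 1]],
    "pop_B": [[1, 1], [1, 0]],
}
toy_result = amova(toy_populations)
print({key: round(value, 3) for key, value in toy_result.items()})
```

For the toy data, about one third of the variation lies among populations and two thirds within them. The study's 23% and 77% split is read in the same way.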

Discussion
This research revealed the level of genetic diversity among the available germplasm of eggplant collected from three different countries using SSR markers (Fig. 2). The determination of genetic variation among germplasm is vital in the breeding and conservation of genetic resources. It is also important in genetic improvement and in the exploitation of genes for tolerance to abiotic stress. The detection of polymorphism within germplasm is important in breeding. There are reports (Nunome et al., 2003; Stàgel et al., 2008; Demir et al., 2010) of low polymorphism frequency within intraspecific lines and cultivars among crops of the Solanaceae family, possibly due to their autogamous nature. Eggplant is an autogamous crop, and most of the materials are commercial varieties, so low heterozygosity is not unexpected (Cericola et al., 2013). A similar value of Ho (0.038) was reported by Augustinos et al. (2016). Low values of Ho were also observed by Liu et al. (2018) and Vilanova et al. (2012), who reported 0.03 and 0.06, respectively. The high level of homozygosity indicates that pure lines can be obtained by selecting individuals from this germplasm (Vilanova et al., 2014; Gramazio et al., 2019). However, estimates of molecular diversity depend on the number of markers, the types of markers, and the accessions tested (Augustinos et al., 2016; Tümbilen et al., 2011; Gramazio et al., 2019).
The high average PIC value, the high mean number of alleles per locus, and the high level of expected heterozygosity in this research are comparable with other studies, indicating that great diversity exists among the collected germplasm. Nunome et al. (2003) reported an average of 3.1 alleles per locus and an expected heterozygosity of 0.38 when 11 germplasm accessions of S. melongena were evaluated using 16 polymorphic dinucleotide genomic microsatellites. Stàgel et al. (2008) evaluated thirty-eight S. melongena accessions using 11 polymorphic EST-SSR markers and reported a mean of 3.1 alleles per locus and an average PIC value of 0.38. From their study, it was observed that genomic SSR markers are more polymorphic than EST-SSRs. This was in tandem with reports by Kalia  showed an average of 5 alleles/locus. Considering the five SSRs with the highest PIC values, the average number of alleles per locus in this study was 5, which is greater than the value reported by Demir et al. (2010). This indicates that wide diversity exists among materials of different origins and types, and that SSR markers are useful for the analysis of genetic diversity in eggplant. In this study, the PIC values ranged from 0.660 to 0.966, with an average of 0.830. This was higher than the mean PIC values of 0.401, 0.47, and 0.507 reported by Vilanova et al. (2012), Muñoz-Falcón et al. (2011), and Liu et al. (2018), respectively. A PIC value greater than 0.5 indicates a highly polymorphic locus, a value of 0.25-0.50 an intermediately polymorphic locus, and a value lower than 0.25 a locus of low polymorphism (Gramazio et al., 2019; Kalia et al., 2011; Nunome et al., 2009; Ge et al., 2013). The average PIC value of 0.830 in this research therefore indicates a high level of polymorphism at the loci studied. The level of genetic diversity measured in eggplant varies among published studies.
Hurtado et al. (2012) recorded high genetic diversity for some Chinese accessions (He = 0.494) and some Sri Lankan accessions (He = 0.540), which is similar to that observed in this study. The dendrogram coefficient ranged from 3.51 to 12.89, indicating a high amount of variation among the existing materials. Higher diversity was observed among genotypes of Groups I to VI due to their different morphological characters. Accessions from different countries were admixed within clusters, indicating that these accessions had a common origin or broadly similar morphological characters. On the other hand, accessions that were distant from one another are likely to have different agronomic traits or distinct origins. Accessions of different origins falling in the same cluster suggest an exchange of genetic materials by plant breeders from different geographical locations. Dissimilarities among accessions could be due to environmental influence occurring over a long period of time.
The AMOVA showed highly significant genetic differences within populations. High genetic dissimilarity existed among the accessions within populations (77% of the total variation), whereas dissimilarity among the populations was lower (23%). These results were similar to those of Mazid et al. (2013), in which 67% of the variation was present within groups of 41 rice genotypes and 33% among the groups.

Conclusion
Microsatellite markers are valuable tools for determining the genetic relationships among eggplant accessions, such as those from the three countries (Bangladesh, Malaysia, and Thailand) used in this study. These markers revealed a high level of polymorphism within populations and a lower level among populations. To improve eggplant accessions and widen their genetic base, the populations with the least genetic similarity could be selected as parental materials. Therefore, hybridization should be conducted between two distant populations, for example any accessions of Cluster I with those of Cluster V. Hence, accessions EB12, ET11, ET13, ET15, ET16, and ET17 could be crossed with accessions EM3, EB34, and EB3 for improvement in future breeding programs. The analysis of molecular variance showed that 77% of the total genetic variation was due to differences within populations, whereas 23% was due to differences among populations.