Genetic Diversity Analysis of Capsicum Genus by SSR Markers

Pepper germplasm is genetically diverse and has great breeding potential. The objectives of this study were to determine the genetic diversity and population structure of 32 accessions of Capsicum germplasm and to contribute to pepper breeding. The genetic diversity of different species of pepper germplasm was studied at the molecular level, providing a reference for the collection, study and rational utilization of pepper germplasm resources. Eighty pairs of SSR primers were designed based on the whole-genome coding region sequences of pepper. Thirty-two accessions of Capsicum germplasm from 12 species (subspecies), differing in geographical origin and traits, were used to screen the 80 primer pairs, which yielded 40 polymorphic SSR primer pairs with clear bands and good stability. DPS, MEGA7 and POPGENE32 software were used to analyze the genetic diversity of the 32 accessions. The 40 primer pairs amplified 122 polymorphic bands, an average of 3.05 bands per primer pair, which showed that the SSR primers were highly practical for the genetic analysis of pepper. The mean values of effective allele number (Ne), observed heterozygosity (Ho), expected heterozygosity (He), Shannon-Weaver index (I) and polymorphism information content (PIC) showed that the pepper accessions are rich in genetic information. UPGMA cluster analysis and principal component analysis divided the accessions into 10 clusters, which were basically consistent with the origins of the Capsicum species.


Introduction
Capsicum is a genus of the Solanaceae family. In 1983, the IBPGR recognized five cultivated species: Capsicum annuum L., Capsicum frutescens L., Capsicum chinense Jacquin, Capsicum baccatum L. and Capsicum pubescens Ruiz & Pavon. In addition to the cultivated species, most wild species have certain special traits; for example, Capsicum chacoense is resistant to bacterial speck, and Capsicum angulosum tolerates CMV and TLCVC. It is important to study the origin, evolution and classification of pepper in order to estimate relationships among species and to exploit key genes through hybridization between wild and cultivated species. It is equally important to analyze the relationships among the cultivated species within the genus in order to improve pepper breeding.
Simple sequence repeats (SSRs), also called microsatellites, are molecular markers that have been used to assess the diversity of species. Tam et al. (2005) studied the genetic relationship between pepper and tomato using RFLP, RAPD and SSR markers. Compared with other molecular markers, SSRs require little template DNA and are an effective tool for studying population diversity and genetic relationships (Zhebentyayeva et al., 2003). SSRs located in coding regions are associated with transcribed genes, so data from the transcribed regions of the samples can reflect their true genetic diversity (Kong et al., 2012).
Zhou et al. (2009) used 109 pairs of SSR primers and reported 75.76% polymorphism. Saleh et al. (2016) used 28 SSR markers to evaluate 407 individual pepper plants and classified them into 3 groups. Zhang et al. (2016) investigated the genetic structure of 372 GenBank Chinese pepper germplasm accessions using 28 pairs of SSR primers. These studies show that SSR markers are suitable for analysing the genetic diversity of pepper.
This study was based on 80 primer pairs designed from the whole-genome coding region sequences of pepper. We studied 32 pepper accessions representing 12 species (subspecies) and identified 40 primer pairs that showed high genetic polymorphism. UPGMA cluster analysis and principal component analysis divided the accessions into 10 clusters, which were basically consistent with the origins of the Capsicum species.

1 Results
1.1 SSR markers' diversity
Eighty primer pairs designed and developed by our laboratory were used for PCR amplification and gel electrophoresis detection (Figure 1). Forty primer pairs with better polymorphism were screened out. These 40 primer pairs produced 122 alleles, an average of 3.05 alleles per primer pair. Sixteen primer pairs generated 4 alleles each, such as CA02g1958, CA05g2028, CA06g2745, Capana02g0029 and CA01g0951.

1.2 Genetic diversity analysis
POPGENE32 was used to process the amplification data; the results are shown in Table 3. The number of alleles (Na) ranged from 2 to 4, with a mean of 3.05. The effective number of alleles (Ne) ranged from 1.2899 to 3.9454, with a mean of 2.59795. Observed heterozygosity (Ho) ranged from 0 to 1, with a mean of 0.436425. Expected heterozygosity (He) ranged from 0.2285 to 0.7667, with a mean of 0.5975. The Shannon-Weaver index (I) ranged from 0.3845 to 1.3793, with a mean of 0.5081. Polymorphism information content (PIC) ranged from 0.1995 to 0.6992, with a mean of 0.5058. The results show that the primers are rich in polymorphism information (Table 1).
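
The indices reported above can be illustrated with a minimal Python sketch for a single codominant locus. It assumes the common textbook definitions (Ne = 1/Σp², He = 1 - Σp², I = -Σ p ln p), which may differ in detail from what POPGENE32 actually computes (for example, small-sample corrections). The function name `locus_diversity` and the example genotypes are hypothetical and are not data from this study.

```python
import math
from collections import Counter


def locus_diversity(genotypes):
    """Summarise one codominant locus.

    genotypes: list of diploid genotypes, each a 2-tuple of allele labels.
    Definitions are the usual textbook ones (an assumption; POPGENE32
    may apply additional corrections).
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(alleles)
    total = len(alleles)
    freqs = [count / total for count in counts.values()]

    sum_sq = sum(p * p for p in freqs)            # expected homozygosity
    na = len(freqs)                               # observed number of alleles
    ne = 1.0 / sum_sq                             # effective number of alleles
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    he = 1.0 - sum_sq                             # expected heterozygosity
    shannon = -sum(p * math.log(p) for p in freqs)
    return {"Na": na, "Ne": ne, "Ho": ho, "He": he, "I": shannon}


# Hypothetical genotypes for four accessions at one SSR locus
example = [("A", "A"), ("A", "B"), ("B", "B"), ("A", "B")]
stats = locus_diversity(example)
print(stats)
```

With two equally frequent alleles, as in this toy input, Ne equals Na; the more uneven the allele frequencies, the further Ne falls below Na.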

Cluster analysis
The marker data were used to generate a 0/1 matrix (absence/presence of allele at the marker locus), which was employed to estimate the genotypic distances between lines. We used the UPGMA method to perform cluster analysis (Figure 2) and genetic relationship analysis. At a genetic similarity coefficient of 0.19, the 32 accessions were divided into 10 clusters. The first cluster consisted of 5 accessions of C. annuum L., 1 accession of C. galapagoense and 1 accession of C. tovarii, indicating that the two wild species C. galapagoense and C. tovarii are closely related to the cultivated species C. annuum L. The second cluster consisted of 1 accession of C. praetermissum. The third cluster consisted of 3 accessions of C. frutescens and 1 accession of C. eximium. The fourth cluster consisted of 1 accession of C. eximium. The fifth cluster consisted of 4 accessions of C. chinense. The sixth cluster consisted of 5 accessions of C. annuum L. The seventh cluster consisted of 1 accession of C. pubescens. The eighth cluster consisted of 3 accessions of C. chacoense. The ninth cluster consisted of 5 accessions of C. baccatum. The tenth cluster contained only Yushanhu. These results indicate abundant genetic polymorphism among pepper species.
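
The clustering step can be sketched in plain Python as follows. This is only an illustration of UPGMA (average linkage) on a 0/1 band matrix, not the DPS and MEGA 7 workflow used in the study. The Jaccard distance is an assumption, since the text does not name the similarity coefficient, and the accession names and band profiles are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - (bands shared by both profiles / bands present in either)."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return 0.0 if either == 0 else 1.0 - both / either


def upgma(profiles):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples.

    profiles: dict mapping accession name -> list of 0/1 band scores.
    Clusters are tuples of accession names.
    """
    names = sorted(profiles)
    # Pairwise distances between individual accessions
    base = {
        frozenset((a, b)): jaccard_distance(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }

    def average_distance(c1, c2):
        # UPGMA: unweighted mean over all between-cluster accession pairs
        total = sum(base[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        merges.append((c1, c2, average_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
    return merges


# Hypothetical band profiles (1 = band present, 0 = band absent)
profiles = {
    "acc1": [1, 1, 0, 0, 1, 0],
    "acc2": [1, 1, 0, 0, 0, 0],
    "acc3": [0, 0, 1, 1, 0, 1],
    "acc4": [0, 0, 1, 1, 0, 0],
}
merges = upgma(profiles)
for left, right, dist in merges:
    print(left, right, round(dist, 3))
```

Cutting the merge history at a chosen distance (equivalently, a similarity threshold) gives the cluster membership, which is how a dendrogram such as Figure 2 is read.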

Discussion
In this study, primers were designed from two pepper genomes (CM334 and No.1 Zunla). Because the SSR markers are located in coding regions, the genes that carry them are expressed and can contribute to phenotypic traits. Therefore, sequence variation in transcribed regions can be assayed at the whole-genome level with SSR markers from gene coding regions. Furthermore, such SSR markers can reflect functional diversity and indicate genetic diversity among germplasm.
The five major cultivated Capsicum species come from three different centers of origin. Mexico is the primary center of origin for C. annuum, and Guatemala is the secondary center. The primary center of origin for C. chinense and C. frutescens is Amazonia. Peru and Bolivia are the primary centers of origin for C. pendulum and C. pubescens. The genetic base of C. annuum has become progressively narrower under long-term artificial selection, which has greatly affected pepper production.
In the cluster analysis, the 32 accessions were divided into 10 clusters, and the results were in line with the origins of the Capsicum species. We found that C. eximium was close to C. frutescens, so C. eximium can be considered a wild relative of C. frutescens. Yushanhu differed markedly from the other species and can be used for studying the evolution of Capsicum. All C. baccatum accessions clustered together, as did all C. chinense accessions.
Lee et al. (2016) found that C. galapagoense was close to C. annuum. Our results support this conclusion, and we additionally found that C. tovarii was close to C. annuum. However, C. baccatum and C. frutescens clustered together in Lee's study, which differs from our result, in which one C. eximium accession clustered with C. frutescens.
To validate these results, further association analysis that incorporates phenotypic data is needed (Rivera et al., 2016). Li et al. (2015) analyzed 857 Capsicum spp. germplasm accessions using 23 morphological traits and reported an average genetic diversity index of 1.75 and a coefficient of variation of 75.95%, which indicated abundant genetic diversity among the 857 hot pepper germplasm accessions.

-Maroof et al., 1984). DNA quality was estimated on 1% agarose gels, and DNA concentrations were determined with a Thermo Scientific NanoDrop One spectrophotometer. The final DNA concentration was adjusted to 50 ng µl⁻¹, and samples were stored at -20°C until use.

(Qin et al., 2014; Kim et al., 2014). On the one hand, our laboratory designed SSR markers in coding regions (CDS), following the approach used for cucumber primers (Ren et al., 2009; Lv et al., 2012). On the other hand, we used primers that our laboratory had designed previously (Jia et al., 2017). To judge whether the genetic information of the two genomes was located on the same chromosome, we searched for the best hit with NCBI blat. In this study, we used 80 primer pairs to analyse the genetic diversity of 32 pepper accessions. Finally, we screened out 40 primer pairs that showed good polymorphism (Table 3, repeated primers no longer shown).

Data analysis
Polymorphic bands were scored as binary data: a present band was scored as 1 and an absent band as 0, and ambiguous bands were not counted. DPS v7.05 was used to analyze the genetic similarity among the materials, and MEGA 7 was used to cluster the materials by the UPGMA method to establish the genetic relationships among the pepper germplasm. POPGENE32 was used to calculate the number of alleles (Na), effective allele number (Ne), observed heterozygosity (Ho), expected heterozygosity (He) and Shannon-Weaver index (I). PIC1.06 was used to calculate polymorphism information content (PIC) by:
$$\mathrm{PIC} = 1 - \sum_{i=1}^{n} P_i^2 - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 P_i^2 P_j^2$$
In the formula, n is the number of alleles at the locus, and P_i and P_j are the frequencies of the ith and jth alleles (Shete et al., 2000).
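
As a worked illustration of the PIC calculation, the sketch below assumes the standard definition of PIC: one minus the sum of squared allele frequencies, minus the sum over all allele pairs i < j of 2·Pi²·Pj². It is not the PIC1.06 program itself, and the function name `pic` and the example frequencies are hypothetical.

```python
def pic(freqs):
    """Polymorphism information content for one locus.

    freqs: allele frequencies at the locus (should sum to 1).
    Uses the standard definition, assumed here to match the study's formula:
    PIC = 1 - sum(P_i^2) - sum over i < j of 2 * P_i^2 * P_j^2
    """
    n = len(freqs)
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    return 1.0 - homozygosity - cross_term


# Hypothetical allele frequencies
print(pic([0.5, 0.5]))                 # two equally frequent alleles
print(pic([0.25, 0.25, 0.25, 0.25]))   # four equally frequent alleles
```

For two equally frequent alleles the function returns 0.375, and for four equally frequent alleles it returns 0.703125.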