Identification of groundnut (Arachis hypogaea) SSR markers suitable for multiple resistance traits QTL mapping in African germplasm

Background: This study aimed to identify and select informative Simple Sequence Repeat (SSR) markers that may be linked to resistance to important groundnut diseases such as Early Leaf Spot, Groundnut Rosette Disease, rust and aflatoxin contamination. To this end, 799 markers were screened across 16 farmer preferred and other cultivated African groundnut varieties that are routinely used in groundnut improvement, some with known resistance traits. Results: The SSR markers amplified 817 loci and were graded on a scale of 1 to 4 according to successful amplification and ease of scoring of amplified alleles. Of these, 376 markers exhibited Polymorphic Information Content (PIC) values ranging from 0.06 to 0.86, with 1476 alleles detected at an average of 3.7 alleles per locus. The remaining 423 markers were either monomorphic or did not work well. The best performing polymorphic markers were subsequently used to construct a dissimilarity matrix that indicated the relatedness of the varieties in order to aid selection of appropriately diverse parents for groundnut improvement. The closest related varieties were MGV5 and ICGV-SM 90704 and most distant were Chalimbana and 47-10. The mean dissimilarity value was 0.51, ranging from 0.34 to 0.66. Discussion: Of the 376 informative markers identified in this study, 139 (37%) have previously been mapped to the Arachis genome and can now be employed in Quantitative Trait Loci (QTL) mapping and the additional 237 markers identified can be used to improve the efficiency of introgression of resistance to multiple important biotic constraints into farmer-preferred varieties of Sub-Saharan Africa.

and other farmer preferred traits is accomplished only through artificial hybridization in targeted breeding from, for example, diploid wild relatives of groundnut with known abiotic and biotic stress resistance and/or tolerance [5]. In general, inheritance of disease resistance has been governed by quantitative recessive genes with low heritability that are controlled by epistatic effects and the environment [9]. The narrow genetic base of cultivated groundnut and variation in ploidy levels further limit introgression of resistance traits by interspecific hybridization [2].
Detection of polymorphic molecular markers associated with genes governing disease and insect resistance has progressed rapidly over the past two decades. This has accelerated the development of cultivar resistance breeding programs for enhanced yield and grain quality [16,17,18]. SSR markers are preferred due to their co-dominance, simplicity, high polymorphism, repeatability, multi-allelic nature and transferability within the genus Arachis, and significant polymorphism has been identified in novel Simple Sequence Repeats (SSRs) by He et al. [19]. These markers have enhanced phylogenetic studies of the Arachis species for pre-breeding parent determination, integration of SSR-based maps in both diploid and tetraploid species [20,21,22], comprehensive Quantitative Trait Loci (QTL) analysis for linkage to disease and pest resistance [23,24,25], comparative mapping studies [26,27] and identification of candidate genome regions controlling rust and Late Leaf Spot (LLS) resistance [28,29]. Wang et al. [30] constructed a genetic linkage map from SSRs derived from bacterial artificial chromosome end sequences, facilitating the identification of markers linked to resistance gene homologs and map-based cloning. Even markers with low polymorphism add to the total pool of SSRs available in wild species for the transfer of target traits and should not be disregarded [31].
Table 1 African Arachis germplasm used in this study grouped according to their attributes of disease tolerance/resistance, productivity and quality traits and farmer preference.
This study was undertaken to identify and select informative SSR markers that may be linked to resistance to Early Leaf Spot (ELS), Groundnut Rosette Disease (GRD), rust and aflatoxin contamination across 16 farmer-preferred and other cultivated African groundnut varieties that are routinely used in groundnut improvement. The aims were to aid the identification of suitable parents for mapping populations or marker-assisted introgression, and to select a subset of SSR markers that are evenly spread across the groundnut genome for future resistance QTL mapping.

DNA extraction
A total of 799 SSRs (supplementary data), comprising di- and tri-nucleotide motifs from both genomic and expressed sequence tag (EST) SSRs, as compiled by Zhao et al. [32], were screened across 16 cultivated groundnut varieties indigenous to Africa. These varieties are listed in Table 1. Genomic DNA was extracted from 14-day-old seedlings, with one leaf from each of three individual plants combined into a single sample for each genotype. Extraction followed the CTAB method of Mace et al. [33], with the phenol-chloroform extraction step omitted.

SSR marker properties and performance
A total of 799 markers (Supplementary data) were screened to identify the most informative markers for QTL mapping and pre/post-breeding applications.
Marker allele profiles obtained after capillary electrophoresis using GeneMapper 4.0 were graded on a scale of 1 to 4 for ease of scoring, as illustrated in Fig. 1 (1 = clear single peaks; 2 = clear peaks with multiple stutter peaks; 3 = peaks not well defined but scorable; 4 = difficult to score due to noise, multiple loci binding or low availability). For grades 1, 2 and 3 the numbers of polymorphic markers obtained were 182, 61 and 133, respectively. In total, 423 markers were excluded from the final data set. These included 93 that were scored as grade 4, 169 that failed to amplify PCR products in the majority of the 16 varieties (i.e. availability < 0.38) and 161 monomorphic markers. This screening provided 376 high quality polymorphic markers that worked well (average success rate of 94.2%) across the 16 varieties.
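The screening logic described above can be sketched as a small classifier. This is only an illustration: the `MarkerSummary` record, its field names and the order in which the exclusion checks are applied are assumptions, not the authors' pipeline. Only the grade scale, the 0.38 availability threshold and the reported counts come from the text.

```python
from dataclasses import dataclass

MIN_AVAILABILITY = 0.38  # availability threshold quoted in the text


@dataclass
class MarkerSummary:
    """Hypothetical per-marker summary; field names are illustrative."""
    name: str
    grade: int            # ease-of-scoring grade, 1 (best) to 4 (worst)
    availability: float   # fraction of the 16 varieties with a scored product
    n_alleles: int        # number of distinct alleles observed


def classify(marker: MarkerSummary) -> str:
    """Assign one screening outcome; the order of the checks is an assumption."""
    if marker.availability < MIN_AVAILABILITY:
        return "failed"
    if marker.grade == 4:
        return "grade4"
    if marker.n_alleles < 2:
        return "monomorphic"
    return "retained"


# Counts reported in the text, used here only as a consistency check.
retained_by_grade = {1: 182, 2: 61, 3: 133}
excluded = {"grade4": 93, "failed": 169, "monomorphic": 161}
total_retained = sum(retained_by_grade.values())   # 376
total_excluded = sum(excluded.values())            # 423
total_screened = total_retained + total_excluded   # 799
```

The three totals reproduce the 376 retained, 423 excluded and 799 screened markers reported in this section.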
PowerMarker results were compiled for allele number, major allele frequency, how well each marker worked (availability), heterozygosity and PIC (Supplementary data).
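As a minimal sketch of how these per-marker statistics can be derived from genotype calls, the function below computes availability, allele number, major allele frequency, observed heterozygosity and PIC. It assumes diploid-style calls of two fragment sizes per variety and the standard PIC definition of Botstein et al. (1980); the function name, input layout and toy fragment sizes are illustrative assumptions, and PowerMarker's exact handling of missing data may differ.

```python
from collections import Counter


def marker_stats(genotypes):
    """Summarise one SSR marker.

    genotypes: one entry per variety, either an (allele_a, allele_b) tuple of
    fragment sizes or None where the marker failed to amplify.
    """
    scored = [g for g in genotypes if g is not None]
    if not scored:
        raise ValueError("marker was not scored in any variety")

    alleles = [allele for genotype in scored for allele in genotype]
    freqs = [count / len(alleles) for count in Counter(alleles).values()]

    # PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    sum_sq = sum(p * p for p in freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )

    return {
        "availability": len(scored) / len(genotypes),
        "allele_number": len(freqs),
        "major_allele_frequency": max(freqs),
        "heterozygosity": sum(a != b for a, b in scored) / len(scored),
        "pic": 1 - sum_sq - cross,
    }


# Toy data (invented fragment sizes): two equally frequent alleles, one failure.
example = marker_stats([(100, 100), (104, 104), None, (100, 104)])
```

For this toy marker the two alleles each have frequency 0.5, which gives the maximum PIC attainable by a biallelic marker (0.375).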
Markers that were highly heterozygous confounded data interpretation and were examined carefully to determine whether they had amplified two loci. If so, they were split into two sets of alleles, denoted by appending _1 or _2 to the marker name. If both sets of alleles were heterozygous and polymorphic, these markers were retained. If one set of alleles was homozygous, that set was discarded. Markers that would have resulted in two homozygous loci were not split. In total, 18 split markers were retained, giving 394 polymorphic loci from 376 markers.
The PIC range observed in this study (0.06 for Ah-671 to 0.86 for Ah1TC4F12) was similar to that reported by Pandey et al. [37] (PIC range 0.10 to 0.89). The mean PIC value obtained in the current study was 0.49, with values above 0.5 observed in 174 (44%) of the loci analyzed, which was high compared to the findings of Cuc et al. [39], where only 15.7% of SSR markers showed PIC values > 0.5. A study by He et al. [19] gave a lower percentage (34%) of polymorphic markers than that shown in our study. The number of polymorphic markers identified in this study (376 or 47% of the total number screened) was also high compared to other studies in groundnut, which ranged from 3 to 33% of the markers analyzed [19,20,38]. However, the values were comparable to those reported by Cuc et al. [39] (44% with mean PIC 0.46; PIC range 0.12 to 0.75) and Mace et al. [40] (PIC range 0.29 to 0.60; mean PIC 0.47), with variations ascribed to genotype differences. The polymorphic markers identified in this study are therefore highly informative.
Marker GM2009 had a PIC value of 0.67 and has been shown to be closely associated with the major QTL for Late Leaf Spot (LLS) [23]. The genetic similarity of LLS and ELS disease resistance mechanisms [9] further supports the significance of this marker for QTL analysis for ELS resistance. Markers IPAHM108_1/2 and IPAHM123_2 had PIC values of 0.69/0.72 and 0.73 across 5 and 6 alleles, respectively. These were similar to those from a previous study by Cuc et al. [39] in which IPAHM108/123 had PIC values of 0.62/0.75 across 3 and 4 alleles, respectively. Other polymorphic IPAHMx markers varied from those of Cuc et al. [39] in terms of both low (IPAHM659_2/105/136/177) and high polymorphisms (IPAHM689), whilst allele numbers were fairly consistent in comparison. These variations in marker characteristics could be due to the inherent genotypic constitution of the cultivars used but cannot be confirmed as there were no common genotypes between this study and that of Cuc et al. [39]. Other markers that had high PIC values in this study as well as in that of Varshney et al. [41] were Ah1TC1E01, Ah1TC4F12 and Ah1TC6E01, with PIC values of 0.60-0.90. These similarities across different studies further highlight their usefulness in the present study across globally cultivated Arachis spp. The polymorphic markers identified in this study may also be useful across other legume species in comparative genomics studies, as was ascertained with polymorphic soybean derived EST-SSRs in the Arachis genome [26].
These markers produced an average of 3.7 alleles per locus, for a total of 1476 alleles. The number of alleles per marker ranged from 2 to 11 with a mean of 3.74. Both higher numbers of alleles ranging from 2 to 14 [2,19] and lower numbers ranging from 2 to 8 [39] have been reported by previous studies. The most polymorphic markers, with PIC values > 0.70, reported by Hildebrand et al. [42] had allele numbers ranging from 5 to 11. The most informative SSR markers in this study were Ah2TC7H11, Ah1TC3E02, Ah1TC4F12, GNB70, Ah2TC11H06, AHGS0798, pPGPseq3B5, Ah2TC9H09, Lec1, Ah1TC2G05, AHGS0965, GA161, TC04G02, Ah1TC3B04, TC11A04, TC3E05, TC05A06 and GNB18. These had allele numbers ranging from 8 to 11 and PIC values of 0.78 to 0.86 and were considered important to distinguish all the varieties for use in marker-assisted selection (MAS) and other diversity studies.
Table 3 Polymorphic SSR loci identified in this study that were previously mapped to Arachis linkage groups (LG) (Gautami et al. [45], Wang et al. [30]).
LG: Markers
a04 (LG9): GM1062, Ap40, GM890, GM2246, TC11B04, GM1720, IPAHM105, GM2589, GM1919, GM1311
a09 (LG18): GM2450, GM849, GM2359, GM1291, GM1911, PM675, AHGS0695, Ah1TC5D06, Ah1TC1D02, AHGS0993
a06 (LG5,10): IPAHM659, GM1489, GM1490, GM2337, IPAHM245, TC11A04, GM1573, IPAHM689, GM1916, Ah2TC7C06
a03 (LG7): GM1717, GM2402, GM2215, GM2528, GM2206, GM1954, Ah1TC0A01, pPGSseq19G7, AHGS0132
a05 (LG19): GM1049, GA34, GM1577, GM2078, RN16F05, GM1702, pPGSseq10D4, Ah1TC6E01, GA32
b07 LG2
Major Allele Frequency (MAF) ranged from 0.18 to 0.97 with a mean of 0.58 and heterozygosity ranged from 0 to 0.38 with a mean of 0.20. Markers with MAF between 0.5 and 0.8 (181 polymorphic markers in this study) have been reported to contribute approximately equally to information in linkage disequilibrium studies and should be useful in QTL mapping [43].

Dissimilarity matrix pairwise comparisons across the sixteen Arachis sp. varieties
A dissimilarity matrix was calculated from the allelic data of the 376 polymorphic markers (Table 2). Values ranged from 0.34 for the most closely related varieties, MGV5 and ICGV-SM 90704, to 0.66 for the most distant varieties, Chalimbana and 47-10, with a mean of 0.51. These dissimilarity values were high in comparison to the genetic distance values of previous studies in Arachis sp. [27,44], which ranged from 0.091 to 0.288 and 0.083 to 0.117, respectively. Subsequently, the most appropriate combinations for the development of bi-parental mapping populations for disease tolerance/resistance QTL mapping were identified by selecting the most distantly related varieties with contrasting expression of the trait and dissimilarity values above 0.5. As such, for ELS and LLS QTL mapping, ICG 7878 can be combined with the farmer-preferred varieties (FPVs) 47-10, JL 24 and ICGV 86124 (dissimilarity values of 0.607, 0.597 and 0.582, respectively) as well as with the high yielding and quality trait variety FLEUR II (dissimilarity value: 0.57). ELS resistant genotype ICGV-SM 95714 will also combine well with ...
Sixty-three percent of the dissimilarity values calculated ranged from 0.50 to 0.66 and resulted from 237 polymorphic markers that could differentiate all varieties for the various traits of yield, quality and disease resistance. Nineteen percent of these values were associated with recommended crosses for introgression of ELS resistance. The high number of markers used in this study therefore enhanced the potential for targeted introgression of multiple disease resistance, yield and quality traits into farmer preferred and commercial groundnut varieties.
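The parent-selection rule described above (contrasting trait expression and dissimilarity above 0.5) can be sketched as follows. The text does not state which dissimilarity index was used, so a simple-matching index over shared alleles is assumed here; the function names, the input layout and the toy varieties `R1`, `S1` and `S2` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def dissimilarity(g1, g2):
    """Simple-matching dissimilarity between two varieties (assumed index).

    g1, g2: dict mapping marker name -> (allele_a, allele_b).
    Markers missing from either variety are skipped.
    """
    shared = [m for m in g1 if m in g2]
    if not shared:
        raise ValueError("no markers scored in both varieties")
    matched = 0
    for marker in shared:
        # Multiset intersection counts the alleles the two genotypes share.
        overlap = Counter(g1[marker]) & Counter(g2[marker])
        matched += sum(overlap.values())
    return 1 - matched / (2 * len(shared))


def candidate_crosses(genotypes, resistant, threshold=0.5):
    """Resistant x non-resistant pairs with dissimilarity above the threshold."""
    pairs = []
    for a, b in combinations(sorted(genotypes), 2):
        if (a in resistant) == (b in resistant):
            continue  # both resistant or both susceptible: no trait contrast
        d = dissimilarity(genotypes[a], genotypes[b])
        if d > threshold:
            pairs.append((a, b, round(d, 3)))
    return sorted(pairs, key=lambda pair: -pair[2])


# Toy genotypes with invented fragment sizes, not data from this study.
toy = {
    "R1": {"m1": (100, 100), "m2": (200, 200), "m3": (300, 300)},
    "S1": {"m1": (104, 104), "m2": (204, 204), "m3": (300, 300)},
    "S2": {"m1": (100, 100), "m2": (200, 200), "m3": (304, 304)},
}
crosses = candidate_crosses(toy, resistant={"R1"})
```

In this toy set only the R1 x S1 pair passes the threshold, because R1 and S2 share alleles at two of the three markers.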

Genetic tree analysis
A neighbor-joining tree, illustrating the relatedness among the varieties, is presented in Fig. 2. The 16 varieties were grouped into three large clusters and a single outlier, ICGV-SM 95714. The majority of FPVs (47-10, ICGV 86124, JL 24 and Pendo) were grouped together in cluster 1 with ICGV 86124, 47-10, JL 24 and Pendo forming a more closely related sub-group (sub-cluster 1a). This may be attributed to low levels of outcrossing [13,14,15]. Seed exchange among smallholder farmers, planting proximity of preferred varieties, farmer preference for specific varieties and collection of seed for this study from a common geographic location may also have influenced the overall composition and relatedness of the varieties over the years. ELS resistant varieties ICG 7878 and ICGV-SM 95714 were noticeably distant from the majority of the varieties and hence more useful for QTL mapping of the trait and its introgression into the other 14 varieties. ICGV-SM 95714 showed the lowest score for PCR performance across the varieties (90.9%), which may have contributed to its independent clustering.
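For readers unfamiliar with how a neighbor-joining tree is built from a dissimilarity matrix, the sketch below implements the standard algorithm in plain Python. It is not the software used for Fig. 2, which is not named in this section; the function name, the nested-label convention and the five-taxon toy matrix are illustrative assumptions.

```python
def neighbor_joining(names, dist):
    """Neighbor joining on a symmetric distance table.

    names: list of taxon labels; dist: dict of dicts with dist[a][b] == dist[b][a].
    Returns (joins, last_edge): joins is a list of (left, right, left_len,
    right_len) tuples in the order they were merged, and last_edge is
    (node_a, node_b, length) for the final branch.
    """
    nodes = list(names)
    d = {a: dict(dist[a]) for a in names}
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Row sums over the nodes that are still active.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        len_a = d[a][b] / 2 + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        u = f"({a},{b})"  # label of the new internal node
        d[u] = {}
        for k in nodes:
            if k in (a, b):
                continue
            d[u][k] = d[k][u] = (d[a][k] + d[b][k] - d[a][b]) / 2
        nodes = [k for k in nodes if k not in (a, b)] + [u]
        joins.append((a, b, len_a, len_b))
    a, b = nodes
    return joins, (a, b, d[a][b])


# Toy five-taxon distance matrix, not data from this study.
labels = ["a", "b", "c", "d", "e"]
rows = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
toy_dist = {x: {y: rows[i][j] for j, y in enumerate(labels)}
            for i, x in enumerate(labels)}
joins, last_edge = neighbor_joining(labels, toy_dist)
```

With the toy matrix, taxa a and b are merged first, with branch lengths 2 and 3. Ties between equally good pairs are broken by iteration order, so other implementations may draw an equivalent tree with its nodes joined in a different sequence.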

Marker map distribution
A total of 139 (37%) of the 376 markers that were found to be polymorphic in this study have been previously mapped [30,45] (Table 3). The number of markers per linkage group (LG) across the chromosomes (aa and bb) ranged from 0 for LG b06 to 18 for LG9 of chromosome a04. On average, the mapped markers were distributed evenly across all LGs with the exception of LG b06 of chromosome bb. These can be used to identify markers linked to various resistance and quality trait QTLs and their locations on the genome. The 139 mapped polymorphic SSRs represent an appreciable number, since other studies successfully constructed genetic maps from 144 SSRs [46], 175 SSRs [47], 181/188 SSRs [23] and 324 SSRs [24] on recombinant inbred line populations, as well as with larger marker numbers: 895 for the tetraploid genome [45] and 1724 for the diploid genome [48].

Conclusions
In this study, 376 highly informative SSR markers were identified from 799 that were screened. This allowed genetic diversity assessment of 16 African groundnut cultivars with a wide repertoire of disease resistance and farmer preferred traits and a dissimilarity 'tool' was constructed that provides guidance on which parental combinations to use for mapping population development. In addition, 139 of these markers have been previously mapped and can now be employed in Quantitative Trait Loci (QTL) mapping. The additional 237 informative markers identified can be used to improve the efficiency of introgression of resistance to multiple important biotic constraints into farmer-preferred varieties of Sub-Saharan Africa.

Financial support
This study was supported by USAID Feed-the-Future Programme (EEM-G-00-04-00013) under the "Improving groundnut farmer incomes and nutrition through innovation and technology enhancement" (I-FINITE) project.