Genetic Characterization of a Core Set of a Tropical Maize Race Tuxpeño for Further Use in Maize Improvement

The tropical maize race Tuxpeño is a well-known race of Mexican dent germplasm which has greatly contributed to the development of tropical and subtropical maize gene pools. In order to investigate how it could be exploited in future maize improvement, a panel of maize germplasm accessions was assembled and characterized using genome-wide Single Nucleotide Polymorphism (SNP) markers. This panel included 321 core accessions of the Tuxpeño race from the International Maize and Wheat Improvement Center (CIMMYT) germplasm bank collection, 94 CIMMYT maize lines (CMLs) and 54 U.S. Germplasm Enhancement of Maize (GEM) lines. The panel also included other diverse sources of reference germplasm: 14 U.S. maize landrace accessions, 4 temperate inbred lines from the U.S. and China, and 11 CIMMYT populations (a total of 498 entries with 795 plants). Clustering analyses (CA) based on Modified Rogers Distance (MRD) clearly partitioned all 498 entries into their corresponding groups. No subclusters were observed within the Tuxpeño core set. Various breeding strategies for using the Tuxpeño core set, based on the grouping of the studied germplasm and the genetic distances among the groups, are discussed. To facilitate sampling of diversity within the Tuxpeño core, a minicore subset of 64 Tuxpeño accessions (20% of the core set) representing the diversity of the core set was developed, using an approach combining phenotypic and molecular data. The untapped diversity of the Tuxpeño landrace can be further used for maize improvement through the core and/or minicore subset, which are available to the maize community.


Introduction
Knowledge of genetic diversity within and among maize landraces is essential for effectively managing the conservation of landraces and using them in plant breeding. Maize landraces have genetic diversity in terms of plant and ear morphology, adaptation, and consumer traits such as grain quality and yields. Following studies based upon chromosomal knob morphology [1,2] and isozyme markers [3][4][5][6][7][8], several analyses of maize landraces using DNA markers have been carried out [9][10][11][12]. Based on genotyping 193 landrace accessions at 99 microsatellite loci, Matsuoka et al. [9] presented phylogenetic analysis indicating a single domestication for maize and developed a scenario for its spread through the Americas. Reif et al. [10] used 25 simple sequence repeat (SSR) markers to characterize 25 maize race accessions from Mexico and examined their relationships on the basis of morphological data. Vigouroux et al. [11] analyzed the population genetic structure of maize races by genotyping 964 individual plants, representing most of the entire set of about 350 races native to the Americas, with 96 microsatellites. They identified the highland of Mexico and the Andes as potential sources of genetic diversity, which are currently underrepresented among elite lines in maize breeding programs. Most recently, Sharma et al. [12] revealed significant phenotypic and microsatellite-based genetic diversity in 48 landrace accessions in India, and identified promising accessions which could be utilized for introgression of novel traits in broad-based pools/populations.
The tropical maize race Tuxpeño has been incorporated in pools and populations at CIMMYT [13], where pools are maize populations with a broad genetic base. It is known for its productivity per se and for its combining ability in crosses with race ETO, developed at Estacion Tulio Ospina, Colombia; this combination is known as the Tuxpeño-ETO heterotic pattern in tropical maize breeding [14][15][16]. Tuxpeño is predominantly a white dent with a cylindrical ear type. Some accessions of race Tuxpeño are of the yellow dent type; these were collected mainly in the Huasteca region of San Luis Potosi, Hidalgo, and Veracruz in Mexico. Since 1988, the long-term accession evaluation experiments at CIMMYT have included 2,366 accessions of the race Tuxpeño. Of these, 1,350 accessions were uniquely identified as belonging to the race Tuxpeño. They are mostly from Mexico, but also include introductions from Brazil, Ecuador, Guatemala, and Venezuela. A multivariate cluster analysis of phenotypic data collected from seven trials was used to create a core set containing 321 accessions (23.8% of the 1,350 Tuxpeño race accessions) [17][18][19][20][21][22][23][24].
CIMMYT has developed and released CIMMYT maize lines (CMLs) since 1984. The CMLs are carefully selected for good general combining ability (GCA) and a significant number of value-added traits such as drought tolerance, nitrogen use efficiency, acid soil tolerance, and resistance to diseases and insect pests (http://www.cimmyt.org/ru/component/content/article/459-international-maize-improvement-network-imin/434-cimmytmaize-inbred-lines-cml). They are used as parental lines for hybrids in one to several maize mega-environments (MEs). Two heterotic patterns have been classified within CMLs (i.e. CML-A as dent kernel type and CML-B as flint kernel type). CMLs were developed from tropical, subtropical and highland white and yellow dent CIMMYT populations and pools, including germplasm from Central America, the Caribbean, Mexico, South America, and the USA. Some of them originated from populations and gene pools with a background of Tuxpeño germplasm.
The GEM project in the United States is designed to broaden U.S. maize breeding germplasm. It is a public-private sector collaboration in which elite tropical and subtropical germplasm (i.e. from non-Corn Belt dent races of maize) is crossed with private sector inbred lines (http://www.public.iastate.edu/~usda-gem/). GEM has used some of the elite germplasm of the Latin American Maize Project (LAMP) identified as a source of new genetic diversity for broadening the genetic base of U.S. maize hybrids, and breeding crosses are grouped into stiff stalk (SS) and non-stiff stalk (NSS) heterotic patterns [25][26][27][28]. As Tuxpeño germplasm has not been widely used in the GEM project, a comparison of the genetic diversity of the two would be of interest to maize breeders.
In this study, the Tuxpeño core set containing 321 accessions, together with 14 U.S. landrace accessions, 11 CIMMYT populations, 4 temperate inbred lines, 94 CMLs and 54 GEM lines, was characterized using SNPs across the maize genome. The objectives were to assess genetic diversity and genetic distance among the Tuxpeño core and other germplasm; to investigate potential utilization of the Tuxpeño core in maize improvement; and to develop a minicore subset of the Tuxpeño core to facilitate sampling untapped alleles, if they existed.

Phenotypic evaluation and formation of Tuxpeño core set
The seven trial sets mentioned above were conducted from 1988 to 2008 at three CIMMYT experimental stations (i.e., Tlaltizapán, 18°41′48″N, 99°07′48″W, 940 m above sea level; Agua Fria, 20°27′00″N, 97°38′24″W, 100 m above sea level; and Poza Rica, 20°33′00″N, 97°27′00″W, 60 m above sea level). The experimental design was an alpha lattice with two replications. Each plot consisted of two 5 m rows spaced 75 cm apart. Two seeds per hill were sown and later thinned to establish 32 plants per plot. Six check entries were included in each trial at each experiment station. Forty-four traits were evaluated for each accession, in four categories:
Morphological traits (8): plant height; ear height; ratio of ear height to plant height; tillering in scale; tassel type; percentage of erect plants; grain type; grain color.
Agronomic traits (19): days to 50% anthesis; days to 50% silking; ratio of anthesis to silking; foliar disease scale; root lodging (%); stalk lodging (%); number of plants harvested; number of ears harvested; ratio of harvested ears to harvested plants; field ear weight per plot (kg); rating on ear rot; rating on easiness of shelling; ear quality; grain moisture (%); grain shelling (%); adaptation in scale; agronomic scale; ratio of grain yield (kg) to grain moisture (%); yield per hectare (kg/ha).
Vegetative traits (8): germination (%); rating on seedling vigor; number of leaves above the ear; days to leaf senescence; ratio of days to silking to days to leaf senescence; rating on forage production; rating on pubescence; rating on husk cover.
Reproductive traits (9): ear length; ear diameter; kernel length; kernel width; kernel row number per ear; ratio of ear diameter to ear length; cob diameter; ratio of cob diameter to ear diameter; ratio of kernel width to kernel length.
Detailed information on these traits can be found in Table S2.
A multivariate cluster analysis (Ward-MLM), the D-method sample allocation strategy, and selection indexes (ESIM) were used to select a core set representing the phenotypic diversity of the race Tuxpeño [17][18][19][20][21][22][23]. All trait data, for both discrete and continuous variables (44 traits in total), were included in calculating the Gower distance among the accessions [24]. Based on the Gower distance, the Ward method was used to make a preliminary grouping, which was then improved by MLM using maximum likelihood estimation. For each accession in the core set, the accession name, the trial set in which it was evaluated, its race classification, the value of each trait in the separate trial sets and the mega-environment (ME) from which it originated are listed in Table S2.
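The Gower distance step described above can be illustrated with a minimal sketch. It covers only the distance between two accessions with mixed trait types; the Ward-MLM grouping is not shown. The function name, the trait encoding ("num"/"cat") and the toy values are assumptions made for illustration and are not taken from the study.

```python
def gower_distance(a, b, kinds, ranges):
    """Gower distance between two accessions described by mixed traits.

    a, b    : lists of trait values (None marks a missing value)
    kinds   : "num" for continuous traits, "cat" for categorical traits
    ranges  : range (max - min) of each continuous trait over all accessions;
              the entry is ignored for categorical traits
    """
    total, used = 0.0, 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue  # traits with a missing value are skipped for this pair
        if kind == "num":
            # continuous trait: absolute difference scaled by the trait range
            total += abs(x - y) / rng if rng > 0 else 0.0
        else:
            # categorical trait: 0 if the states match, 1 otherwise
            total += 0.0 if x == y else 1.0
        used += 1
    return total / used if used else float("nan")


# Toy example (hypothetical values): plant height (cm), days to anthesis, grain color
kinds = ["num", "num", "cat"]
ranges = [100.0, 20.0, None]
acc_a = [250.0, 80.0, "white"]
acc_b = [200.0, 90.0, "yellow"]
d_ab = gower_distance(acc_a, acc_b, kinds, ranges)
print(round(d_ab, 4))
```

Each trait contributes a value between 0 and 1, so the distance is the mean of the per-trait contributions over the traits observed in both accessions.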

SNP genotyping
Genotyping was performed using Illumina GoldenGate assay on 1,536 bi-allelic SNP markers developed by Yan et al. [30]. The details of the SNP genotyping procedure and allele scoring have also been described [30]. The software Illumina BeadStation 500 G (Illumina, Inc., San Diego, CA, USA) was used for SNP genotyping according to the protocol described by Fan et al. [31]. Allele calling was re-checked manually and further analysis was carried out.

Clustering analysis and genetic diversity
A neighbor-joining tree of the 498 entries was constructed based on the Modified Rogers genetic distance (MRD) using 1,041 SNPs. Briefly, the pair-wise MRD between every two entries was calculated using R code (http://www.R-project.org), and the neighbor-joining method implemented in the DARwin5 program (http://darwin.cirad.fr/darwin) was applied to the distance matrix to construct the dendrogram. An additional tree was constructed to show the relationships among the different germplasm groups (Tuxpeño core, CML-A, CML-B, CML-A/B, GEM-SS, GEM-NSS, CIMMYT populations, U.S. landraces), based on Nei's genetic distance [32]. Bootstrap support for this tree was determined by resampling across the 1,041 SNP loci 1,000 times. The output of each bootstrap sample was summarized to obtain a consensus tree.
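As an illustration of the distance used for the dendrogram, the following sketch computes the Modified Rogers distance between two entries from per-locus allele frequencies at bi-allelic SNPs. It is written in Python rather than the R code used in the study, which is not reproduced here; the function name and the example frequencies are hypothetical.

```python
import math


def modified_rogers_distance(p, q):
    """Modified Rogers distance between two entries.

    p, q : per-locus frequencies (0..1) of one of the two alleles at each
           bi-allelic SNP; None marks a missing locus.
    MRD = sqrt( (1 / (2m)) * sum over loci and alleles of (p - q)^2 ),
    where m is the number of loci scored in both entries.
    """
    squared, m = 0.0, 0
    for pl, ql in zip(p, q):
        if pl is None or ql is None:
            continue  # loci missing in either entry are skipped
        # two alleles per locus: (pl - ql)^2 + ((1 - pl) - (1 - ql))^2
        squared += 2.0 * (pl - ql) ** 2
        m += 1
    return math.sqrt(squared / (2.0 * m)) if m else float("nan")


# Hypothetical allele frequencies at four SNPs for two entries
entry_1 = [1.0, 0.0, 0.5, 1.0]
entry_2 = [0.0, 0.0, 0.5, 1.0]
print(modified_rogers_distance(entry_1, entry_2))
```

The distance ranges from 0 (identical allele frequencies at every locus) to 1 (alternative alleles fixed at every locus).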
Two genetic diversity parameters, gene diversity and observed heterozygosity, were quantified for sets of entries. Gene diversity, often referred to as expected heterozygosity, is defined as the probability that two randomly chosen alleles from the population are different. The estimator of gene diversity is defined for the r-th locus as $D_r = 1 - \sum_{i=1}^{m} X_i^2$, where $m$ is the number of alleles and $X_i$ is the population frequency of the $i$-th allele at locus $r$ [33].
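A minimal sketch of the two parameters for a single locus is given below, following the estimator stated in the text (without any small-sample correction). The genotype coding as two-character strings such as "AG" and the function names are assumptions made for this example.

```python
from collections import Counter


def gene_diversity(genotypes):
    """Gene diversity (expected heterozygosity) at one locus: 1 - sum of X_i^2.

    genotypes : list of diploid genotype calls such as "AA" or "AG";
                None marks a missing call.
    """
    alleles = Counter()
    for g in genotypes:
        if g is not None:
            alleles.update(g)  # counts each of the two alleles in the call
    n = sum(alleles.values())
    if n == 0:
        return float("nan")
    return 1.0 - sum((count / n) ** 2 for count in alleles.values())


def observed_heterozygosity(genotypes):
    """Proportion of called genotypes that carry two different alleles."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return float("nan")
    return sum(1 for g in called if g[0] != g[1]) / len(called)


# Hypothetical calls for four plants at one SNP
calls = ["AA", "AG", "GG", "AG"]
print(gene_diversity(calls), observed_heterozygosity(calls))
```

For a bi-allelic SNP the gene diversity at a locus cannot exceed 0.5, which is reached when both alleles have a frequency of 0.5.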

Adaptation and genetic divergence of Tuxpeño core
A GIS-based approach for defining global maize production environments, called "mega-environments (MEs)", has been useful for targeting maize germplasm for introduction and adaptation trials [34]. The program DIVA-GIS (http://www.diva-gis.org/) was used to assign the maize growing environments based on the altitude, latitude and longitude information of the accessions. The MEs of 299 Tuxpeño accessions were defined based on their available geographic information.
Within the Tuxpeño core, 277 accessions were classified into 10 subgroups according to the 10 major geographic regions where they were collected (i.e. Guatemala and 9 states in Mexico: Chiapas, Hidalgo, Jalisco, Nayarit, Nuevo Leon, Sinaloa, San Luis Potosi, Tamaulipas, Veracruz; Table 1), based on available passport data. The program Arlequin [35] was used to perform an analysis of molecular variance (AMOVA; [35,36]) and to investigate the population differentiation among these 10 subgroups. The statistical significance of each variance component, as well as of the pair-wise Fst values, was assessed based on 1,000 permutations of the data.

Minicore subset formation
Data for 44 phenotypic traits (i.e. 31 continuous, 11 categorical and two nominal variables; Table S2, [21]) and genotypic data (1,433 SNPs covering the 10 chromosomes) from the evaluation of 321 Tuxpeño accessions were used to develop a minicore subset with a sample size equal to 20% of the entire core set (that is, 64 accessions). The morphological Gower distance [24] and MRD [37] were calculated between every pair of the 321 accessions and then combined following the Gower principle, that is, as the average of the two distances weighted by the number of variables included in each distance calculation. MRD therefore carried more weight than the morphological distance, because there were more SNPs than phenotypic traits (i.e., 1,433 vs. 44). The resulting matrix D of combined distances proved to be a Euclidean distance matrix, as all the eigenvalues of the similarity matrix S = 1 − D were positive, that is, S was a positive definite matrix. Because the phenotypic data were collected in seven different sets of trials, a sequential strategy was used to obtain the minicore subset. First, we defined the number of accessions to be selected from each trial set according to the diversity of that trial set. That is, the number of accessions selected was proportional to the average of the distances between accessions within each trial set: $n_i = 64 \times d_i / \sum_{j=1}^{7} d_j$, where $n_i$ is the number of accessions to be selected from the $i$-th set, $d_i$ is the average of the distances between accessions within the $i$-th trial set, and 64 is the number of accessions to be selected to form the minicore. Second, 1,000 minicore subset candidates were randomly and independently drawn following a stratified random sampling process in which each trial set was a stratum; for each candidate subset, the average distance between its 64 accessions was then calculated. Finally, the candidate showing the maximum average distance between accessions was selected as the minicore subset [38].
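The sequential strategy above can be sketched as follows. This is an illustrative outline under stated assumptions, not the code used in the study: the function names, the toy strata and the toy distance are hypothetical, and the largest-remainder rounding of the proportional allocation is an assumption, since the text does not say how fractional allocations were rounded. The sketch also assumes that every stratum holds at least as many accessions as are allocated to it.

```python
import random
from itertools import combinations


def combined_distance(gower_d, mrd, n_traits=44, n_snps=1433):
    """Average of the two distances weighted by their numbers of variables."""
    return (n_traits * gower_d + n_snps * mrd) / (n_traits + n_snps)


def mean_pairwise(ids, dist):
    """Average distance over all pairs of accessions in ids."""
    pairs = list(combinations(ids, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0


def allocate(strata, dist, total):
    """Accessions per stratum, proportional to its mean within-stratum distance."""
    d = {s: mean_pairwise(ids, dist) for s, ids in strata.items()}
    d_sum = sum(d.values())
    raw = {s: total * d[s] / d_sum for s in strata}
    n = {s: int(raw[s]) for s in strata}  # round down first
    # assumption: remaining places go to the largest fractional remainders
    by_remainder = sorted(strata, key=lambda s: raw[s] - n[s], reverse=True)
    for s in by_remainder[: total - sum(n.values())]:
        n[s] += 1
    return n


def select_minicore(strata, dist, total, n_candidates=1000, seed=0):
    """Draw stratified random candidates; keep the one with the largest mean distance."""
    rng = random.Random(seed)
    n = allocate(strata, dist, total)
    best, best_score = None, -1.0
    for _ in range(n_candidates):
        candidate = [x for s, ids in strata.items() for x in rng.sample(ids, n[s])]
        score = mean_pairwise(candidate, dist)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Toy data: accessions are points on a line, distance is the absolute difference
strata = {"set1": [0, 4, 8, 12], "set2": [100, 110, 120, 130]}
toy_dist = lambda a, b: abs(a - b)
quota = allocate(strata, toy_dist, total=4)
minicore, score = select_minicore(strata, toy_dist, total=4, n_candidates=200)
print(quota, minicore, score)
```

In the study, the distance function would be the combined Gower and MRD distance between accessions, the strata would be the seven trial sets, and the total would be 64.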
To evaluate the minicore subset we used three criteria: (1) the increase in the average distance between accessions in the minicore relative to the core set; (2) a comparison of allele richness (expected and observed heterozygosity); and (3) a comparison of means, standard errors, and ranges between the core and the minicore, together with the calculation of the range recovery (RR, %) in the minicore. As discussed by Marita et al. [39], allele richness is an evaluation from the point of view of taxonomists or geneticists, who look for core subsets that ensure the inclusion of restricted or rare alleles, whereas the distance between accessions is an evaluation from the point of view of breeders, who look for the inclusion of "generalized" alleles.
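The range recovery in criterion (3) can be illustrated with a short sketch, assuming that RR for a trait is the range observed in the minicore expressed as a percentage of the range observed in the core. The function name and the plant height values are hypothetical.

```python
def range_recovery(core_values, minicore_values):
    """Percentage of the core range of one trait recovered by the minicore."""
    core_range = max(core_values) - min(core_values)
    if core_range == 0:
        return 100.0  # no variation in the core, so nothing is lost
    mini_range = max(minicore_values) - min(minicore_values)
    return 100.0 * mini_range / core_range


# Hypothetical plant heights (cm) in a core and in a subset drawn from it
core_ph = [100, 150, 200, 300]
mini_ph = [150, 300]
print(range_recovery(core_ph, mini_ph))
```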

Genotypic data
A total of 1,433 polymorphic SNPs (93.3% of the 1,536 assayed) were successfully called, with less than 10% missing data, in 350 accessions (including 321 Tuxpeño core accessions, 14 U.S. landraces, 11 CIMMYT populations and 4 temperate inbreds; 647 plants in total). They were evenly distributed across the whole maize genome, with coverage ranging from 103 SNPs on chromosome 10 to 213 SNPs on chromosome 1 (Table S3). The 94 CMLs and 54 GEM lines were genotyped with a set of SNPs [40] that has 1,041 SNPs in common with these 1,433 SNPs (Table S3). Marker names and physical positions of the 1,433 SNPs are listed in Table S3, in which the 1,041 SNPs also used for genotyping the 148 GEM and CML lines are marked.

Dendrogram of all entries
The neighbor-joining tree of all 498 entries is shown in Fig. 1, where lines from the same germplasm group (e.g. Tuxpeño core, CMLs and GEM lines) tended to cluster together. All U.S. landraces clustered together except one accession named "Mexican June", which grouped with lines from CIMMYT populations (La Posta-Across 8443, Population 23, 28, 32, and Pool 24). Entries from CIMMYT populations were scattered next to the group of the Tuxpeño core, except Population 21, which clustered amongst the Tuxpeño accessions. Population 21 is composed of seven Tuxpeño race accessions and some families from Pool 24 (which is mainly based on Tuxpeño germplasm but also includes some materials from Central America). Lines from the GEM heterotic groups SS and NSS were clearly separated. Mo17 and the other three temperate inbred lines grouped with the GEM lines: Mo17 and CI7_1 clustered in the NSS group, while K22_1 and DAN340 clustered between the NSS and SS groups. However, lines from the CML heterotic groups A and B were not clearly separated. The grouping of the different germplasm is also shown in Fig. S1, where bootstrap values above 50% are shown. Tuxpeño accessions collected from the same region were not necessarily grouped together (Fig. 2).

Genetic diversity among Tuxpeño core, GEM, CMLs and other germplasm
Gene diversity (expected heterozygosity) and observed heterozygosity of the different sets of germplasm, as revealed by SNP markers, are shown in Table 2. Using 1,433 SNPs, the set of U.S. landraces had higher values for gene diversity and heterozygosity than the Tuxpeño core, the temperate inbreds, and the CIMMYT populations, which may be due to the inclusion of Southern dent and Corn Belt dent races in this set [41]. On the basis of 1,041 SNPs, the set of GEM lines had the highest gene diversity among all the germplasm assembled in this study. This may result from the clear heterotic groups (SS and NSS) within the GEM lines ([26]; http://www.public.iastate.edu/~usda-gem/).

Genetic distances among Tuxpeño core, GEM-SS, GEM-NSS, CML-A and CML-B
Pair-wise MRD among the Tuxpeño core, CML heterotic groups A and B, and GEM heterotic groups SS and NSS, as well as MRD within each group, are shown in Table 3. According to the Tukey-Kramer comparison of MRD means, larger genetic distances were observed between the Tuxpeño core and the GEM groups than between the Tuxpeño core and the CML groups. MRD between CML heterotic groups A and B was smaller than that between GEM heterotic groups SS and NSS. According to the genetic distances, the Tuxpeño core was closer to the GEM-NSS group than to the GEM-SS group. MRD within the Tuxpeño core was the smallest (Table 3). The relationships among the different germplasm groups based on MRD were consistent with those based on Nei's genetic distance, as shown by Table 3 and Fig. S1.

Adaptation, genetic divergence and phenotypic variation of Tuxpeño core
The set of 321 Tuxpeño accessions represents 27 geographic regions (Mexican states and other countries) of landrace adaptation, among which 10 major regions were identified. More than 5 accessions were collected from each of these 10 regions (Table 1). In total, 299 of the 321 accessions were classified into their corresponding MEs, based on available latitude, longitude and altitude data. A total of 171 accessions from 16 states of Mexico were classified into the non-equatorial tropical/subtropical lowland wet mega-environment (day length: 12.5 to 13.4 hours, mean temperature ≥24°C, precipitation ≥600 mm and <2000 mm). The second largest group was classified into the tropical mid-altitude mesic mega-environment (day length: 11 to 12.5 hours, mean temperature >18°C and <24°C, precipitation ≥200 mm and <600 mm), to which belonged 41 Tuxpeño core accessions collected in Guatemala and in the Mexican states of Chiapas, Tamaulipas, and Veracruz. Twenty-six Tuxpeño core accessions each were classified into the non-equatorial tropical/subtropical lowland mesic (day length: 12.5 to 13.4 hours, mean temperature ≥24°C, precipitation ≥200 mm and <600 mm) and the non-equatorial tropical/subtropical mid-altitude wet (day length: 12.5 to 13.4 hours, mean temperature >18°C and <24°C, precipitation ≥600 mm and <2000 mm) mega-environments, which are the third largest groups (Table S4).
The AMOVA (Table S5) revealed that a very low percentage (1.30%) of the variation was partitioned among the 10 subgroups of Tuxpeño accessions. Only 9.74% of the variation was attributed to differences among individuals within these 10 subgroups. The majority of the variation was found within individuals (88.96%). Pair-wise Fst among these 10 subgroups showed that, in general, the accessions from Veracruz, Chiapas, and Guatemala were significantly differentiated from those from most other states in Mexico (P ≤ 0.01). Accessions from Hidalgo showed no significant differentiation from those of any other subgroup (Table 4). However, genetic differentiation based on molecular data did not completely concur with the morphological Gower distance (Table 5), suggesting no strong association between molecular and phenotypic data in this study. Most accessions in this Tuxpeño core are late white dent, with a few late yellow dent accessions collected from the Huasteca regions of Veracruz, Hidalgo, and San Luis Potosi. CIMMYT populations have used most of them, but accessions from Chiapas and Guatemala have perhaps been much less exploited.
The range and mean are summarized in Table 6 for certain important agronomical and yield-related or reproductive traits of the 321 Tuxpeño accessions evaluated in the seven trial sets. Wide variations were observed in days to 50% anthesis (AN), days to 50% silking (SI), plant height (PH), ear height (EH), ear length (EL) and ear diameter (ED). Other traits such as number of leaves above ear (LAE), kernel length (KL), kernel width (KWD), and ratio of kernel width to length (KWL) showed a relatively narrow range of variation.

Minicore subset of Tuxpeño
A minicore subset containing 64 accessions was defined. The genetic diversity represented by gene diversity, heterozygosity and Gower distance (Gd) in the minicore and core collections was compared. Gene diversity and heterozygosity of the minicore subset were higher than those of the core set (Table 2). In addition, the Gd of the minicore subset (0.3289) was higher than that of the core set (0.3159). Finally, the means, standard deviations and ranges of 14 agronomical and yield related continuous variables characterized for the entire core set were recovered in the minicore (Table 6). Thus, the minicore subset satisfactorily reduced the number of genotypes while maintaining the diversity of the core collection (i.e. it reduced some of the redundancy in the entire core set). The collecting sites (states or departments in Mexico and Guatemala) and CIMMYT accession identification numbers (Acc.ID) of these 64 Tuxpeño minicore accessions are shown in Table S6.
Table 3. Average and standard error of modified Rogers pair-wise genetic distances studied by 1,041 SNP markers within (diagonal) and between (lower diagonal) Tuxpeño core (Tux.core), CML heterotic groups, and GEM heterotic groups; number of accessions per group (n); results of the Tukey-Kramer comparison of group means (lower letters).
Table 4. Pair-wise Fst studied based on 1,433 SNPs for 10 subgroups of Tuxpeño core classified according to the regions they were collected from (i.e., 9 states of Mexico and Guatemala).

Genetic diversity of Tuxpeño core set and minicore subset
The Tuxpeño core set for breeding use was chosen to best represent phenotypic diversity within the race. The accessions covered 23 states of Mexico and parts of Brazil, Ecuador, Guatemala, and Venezuela, and included landraces and old breeding populations. Relatively high gene diversity and heterozygosity were revealed by the SNP markers. In addition, the geographic locations (mega-environments) where the Tuxpeño core accessions were collected show a wide climatic range. This confirms a previous study which indicated that Tuxpeño is the most widely adapted Mexican landrace, as it is found in 19 climatic types [42]. Environmental differences seem to drive the overall patterns of maize diversity [42,43]. Ecogeographical information on where the collections originated is central to understanding the variety of other sites to which they can adapt. Breeders can select promising accessions with potential adaptation and use them in their breeding programs. The present results indicate that the minicore subset can capture the genetic variation present in the Tuxpeño core set. We used a strategy combining phenotypic and genotypic data to develop the minicore. A distance was defined using both phenotypic and genotypic variables to achieve an effective classification of genotypes. Including morphological traits in the distance measure is better than using only genotypic or marker data, since the traits provide additional information that is generally independent of the genotypic information. The use of the weighted average of the morphological and genetic distances followed the Gower principle, in which more variables produce larger effects. Agronomically important and stress-tolerance traits can be evaluated using the minicore.
Mining new alleles for useful traits either in the minicore or in the core is cost-effective, as the number of accessions is substantially reduced compared to that of the entire Tuxpeño race collection at the CIMMYT maize germplasm bank.
The present study on the core set of the largest collection in CIMMYT (i.e. race Tuxpeño) can be extended and applied to other landrace collections. As shown in Figure 2, the relationships among the accessions do not necessarily follow the geographic pattern of their collection sites. Hence, genotyping a large number of accessions, and of plants per accession, would be necessary in order to establish the relationships among landraces and to devise sampling strategies in the future.

Grouping of Germplasm
Clustering analysis based on MRD and Nei's genetic distance revealed clear separation among the different germplasm groups (Fig. 1; Fig. S1). No subclusters were formed within the Tuxpeño core, which is consistent with the high within-individual variation (89%) revealed by AMOVA (Fig. 2, Table S5). The 94 CMLs were not well separated into the A (mostly dent type) and B (flint type) patterns, the conventional heterotic groups classified by CIMMYT breeders. This is as expected, because most germplasm sources from which the lines were extracted were established from a mixture of different racial complexes [44,45]. Similar results were demonstrated in previous studies [46,47]. For the CMLs analyzed in this study, more than 50% of the base populations included Tuxpeño germplasm (dent kernel) in their formation, as CIMMYT gene pools and populations used Tuxpeño germplasm for its high productivity per se and its good combination with other germplasm (Table S1; [13]). This is reflected in the relatively low genetic distance between the CMLs and the Tuxpeño core (Table 3).
Table 5. Average of Gower pair-wise phenotypic distances within (diagonal) and between (lower diagonal) 10 subgroups of Tuxpeño core originated from 9 states of Mexico and Guatemala; standard errors of the means (in parentheses); results of the Tukey-Kramer comparison of means (lower letters); number of accessions in each subgroup (n).
On the other hand, the 54 U.S. GEM recommended lines formed two clear groups corresponding to the NSS and SS heterotic patterns. Among its genetic distances from all other groups, the Tuxpeño core had the largest genetic distance from the GEM-SS lines. In this study, larger genetic distances were observed between tropical germplasm (i.e. Tuxpeño core, CML-A and CML-B) and SS than between tropical germplasm and NSS, which is consistent with a previous study [48]. A large genetic distance between heterotic germplasm can be useful for developing lines with good combining ability in hybrid breeding [49,50]. GEM-SS can be an excellent heterotic germplasm against CML-A, CML-B and Tuxpeño germplasm, considering that the CMLs analyzed in this study did not show large MRD from the other germplasm groups.
The gene diversity parameter used for evaluating the genetic diversity in this study is less sensitive to the sample sizes of the subsets [11,51]. However, the number of alleles at each locus is restricted to a maximum of two when using bi-allelic SNP markers, which may limit the measurement of genetic diversity. Detecting genetic diversity with a large number of SNPs could mitigate this limitation. In addition, ascertainment bias might affect the measurement of diversity and population differentiation because SNP genotyping chips were used. Allele frequencies may be affected, and differences among temperate lines may be overestimated compared to those within tropical lines, because most SNPs (1,106 out of 1,536) used in the present study were developed from sequencing the set of 27 parental lines of the nested association mapping (NAM) population; that is, SNPs were selected to maximize polymorphism between B73 and the 26 other inbred parental genotypes, about half of which are tropical [30]. With the availability of the maize genome sequence and the advance of genotyping-by-sequencing technology, larger numbers of good-quality SNPs can be used for the molecular characterization of maize landraces, which makes it possible to control ascertainment bias [52,53,54].

Further use of Tuxpeño core set in maize breeding programs
Tuxpeño germplasm has been exploited in tropical maize improvement for its yield potential [55][56][57], superior plant type [58,59], and resistance to drought and pests [60,61]. It constitutes the largest collection in the CIMMYT maize germplasm bank. Although the genetic distances and allelic frequency differences were much larger between Tuxpeño and the GEM groups than between Tuxpeño and the CML groups, the cluster analysis still showed clear separation of the CMLs from Tuxpeño. The divergence between them implies that there may be untapped allelic variation in Tuxpeño germplasm, which can be used for broadening the genetic diversity within the CML-A or CML-B groups.
The 54 GEM lines investigated in our study have a 50% or 75% background of temperate germplasm and a 25% or 50% background of tropical germplasm. The genetic diversity of GEM was broader in this study than that of the tropical germplasm (i.e. CML and Tuxpeño). Moreover, the large allelic frequency differences between GEM and tropical germplasm imply that the tropical germplasm can be used in a temperate breeding program. The incorporation of elite tropical and subtropical germplasm into elite temperate germplasm, to combine favorable alleles into germplasm pools adapted to temperate environments and to broaden their genetic base, has been carried out in previous studies [62,63]. Whitehead et al. [62] suggested that 25% elite exotic germplasm can be incorporated in the important U.S. heterotic groups without disrupting the highly productive combining ability for grain yield expressed in BSSS and non-BSSS hybrid combinations. On the other hand, GEM germplasm can be considered as an exotic source for improving tropical maize lines and populations. Promising results were observed in the breeding crosses, where clearer separation was observed between the F1 crosses from CML-A × GEM-SS and CML-B × GEM-NSS [40].
The larger separation between the GEM heterotic groups (i.e. SS and NSS), compared to the genetic divergence between the CML heterotic groups (i.e. CML-A and CML-B), provides tropical and temperate maize breeders with potential germplasm sources for hybrid maize breeding, in which the genetic distances between opposite heterotic lines and populations can be increased. For example, allied breeding cross combinations can be made between GEM-SS and CML-A (or Tuxpeño minicore), and between GEM-NSS and CML-B (or Tuxpeño minicore). GEM lines are subtropical-temperate adapted, and more tropical germplasm should be incorporated for their use in tropical breeding. In the above breeding cross combinations, selection for the tropically adapted SS-A and NSS-B heterotic patterns is recommended for tropical maize breeding. Although Tuxpeño is one of the heterotic patterns in tropical maize breeding, it may contribute to enhancing GEM-SS heterotic lines. The same can be done with the Tuxpeño minicore for enhancing CML-A and CML-A/B in similar grain types. Selection for adaptation and for increased genetic divergence must be done as a priority using standard breeding procedures. As a result, superior lines and hybrids can be developed in the adapted regions.
Table 6. Statistical description of 14 agronomical and yield related traits of the Tuxpeño core (321 accessions) and the selected minicore (64 accessions) evaluated in seven trials at CIMMYT stations. Footnote 1: percentage of the range in the entire core recovered by the minicore subset. Abbreviations: AN = days to 50% anthesis; SI = days to 50% silking; PH = plant height; EH = ear height; LAE = number of leaves above the ear; EL = ear length; ED = ear diameter; KL = kernel length; KWD = kernel width; KRN = kernel row number; EDL = ratio of ear diameter to ear length; COB = cob diameter; CED = ratio of cob diameter to ear diameter; KWL = ratio of kernel width to kernel length. doi:10.1371/journal.pone.0032626.t006
In addition, short-stature improved populations and lines of Tuxpeño germplasm are good sources for improving farmers' landraces without altering grain type and adaptation. The CIMMYT maize genebank has used the improved gene pools and lines in participatory maize breeding in the state of Oaxaca, Mexico (Taba et al. unpublished data; [20]) for evolutionary maize germplasm conservation. In this way, the genetic diversity of the race can be maintained in situ on farm [64] and modern maize production can be realized with small-scale farmers.