Introduction

Ancestry informative markers (AIMs) are genetic loci that show large frequency differences between populations and are useful for studying the ancestral contributions to recently admixed groups.1, 2 AIMs are often used in genome association studies to test the genetic homogeneity of the studied population and to correct for possible population stratification.3, 4 Panels of AIMs for Latin American populations have been proposed.5, 6, 7 These maps will allow the application of admixture mapping as an approach for identifying genetic risk factors for complex diseases in populations of mixed ancestry.6

In Mexico, the current population is predominantly Mestizo, resulting from the admixture of Native Americans with Spaniards who arrived in the country during the conquest and, in smaller proportion, with Africans brought to the country as slaves. Some studies show that the ancestral genetic contribution in the Mexican population varies by region, with an increased European background in the north and a predominant Native American background in the south of the country.8, 9, 10, 11 However, very few admixture studies have been carried out in populations from the northeastern part of Mexico, which includes the states of Nuevo Leon, Coahuila, Tamaulipas, San Luis Potosi and Zacatecas.

In this study, we used 98 markers (82 autosomal and 16 Y-single-nucleotide polymorphisms (Y-SNPs)) to characterize admixture proportions in a northeastern Mexican sample. Admixture proportions were estimated using different numbers of AIMs to evaluate the minimum number of markers useful to classify individuals according to their ancestral composition. Overall, we found a higher European contribution in our samples compared with recent studies performed in Mexico City with similar markers. We further show evidence of differential gene flow in the Mexican population, with an increased European contribution through the paternal line.

Materials and methods

Sample origin

This study was evaluated and approved by the Ethics Committee of the University Hospital of the Universidad Autonoma de Nuevo Leon in Mexico.

A total of 100 DNA samples extracted from the peripheral blood of consecutive non-related Mexican males attending a prostate cancer detection campaign were selected from the DNA bank. Participants claimed to have parents and grandparents from northeastern Mexico. The concentration and quality of the DNA samples were determined by spectrophotometry.

Marker selection and genotyping

AIMs

A panel of 82 autosomal markers showing large frequency differences among European, African and Native American populations was selected from the literature and genotyped with the GoldenGate assay (Illumina, San Diego, CA, USA), using a total of 500 ng of genomic DNA. The results obtained by the GoldenGate assay were validated in 50 samples by PCR-restriction fragment length polymorphism for marker rs1800498 and by SNaPShot for markers rs3340 and rs2695 (see Supplementary Table S1 for a description of the methods). Allelic discrimination was performed using the BeadStudio v3.0 (Illumina) and the Gene Mapper v3.0 (Applied Biosystems, Foster City, CA, USA) software. Eight markers were excluded from further analyses because of genotyping errors, and the primary results are based on 74 AIMs. In the analysis, subgroups of 54, 34, 24 and 14 AIMs were selected according to their degree of ancestry information content, represented by the FST European/Native American/African value of each marker, as well as the chromosomal coverage. Markers with lower FST values were gradually excluded from consecutive subgroups. The subgroups of markers are listed in Supplementary Table S2.
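The gradual exclusion of lower-FST markers described above can be sketched as a simple ranking step. This is only an illustration under stated assumptions: the marker names and FST values are hypothetical, the function name nested_subgroups is ours, and the chromosomal coverage criterion that the study also applied is not modeled here.

```python
def nested_subgroups(fst_by_marker, sizes):
    """Return {size: [marker ids]}, keeping the highest-FST markers in each subgroup.

    Assumption: ranking uses FST only; chromosomal coverage is ignored in this sketch.
    """
    # Sort by descending FST; ties are broken alphabetically for reproducibility.
    ranked = sorted(fst_by_marker, key=lambda m: (-fst_by_marker[m], m))
    return {n: ranked[:n] for n in sizes if n <= len(ranked)}


# Hypothetical marker names and FST values, for illustration only.
example_fst = {"snpA": 0.62, "snpB": 0.15, "snpC": 0.48, "snpD": 0.33, "snpE": 0.27}
subgroups = nested_subgroups(example_fst, sizes=[4, 2])
```

Because every subgroup is a prefix of the same ranking, each smaller panel is contained in the larger ones, which mirrors the consecutive exclusion of the least informative markers.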

Y-SNPs

Sixteen binary markers of the Y chromosome were included in the characterization of the study population, 14 of which were described by Brion et al.12 and included in the GoldenGate assay. The remaining two markers, the YAP marker, common in populations of African ancestry, and the M3 SNP, characteristic of Native Americans, were genotyped by PCR and PCR-restriction fragment length polymorphism, respectively, as described earlier.13, 14 Haplogroups were defined according to the Y-Chromosome Consortium recommendations.15 Native American, European and African paternal contributions were determined directly from haplogroup frequencies, considering worldwide phylogeographic information.16, 17 Within haplogroup P (Q–R), the ancestral Amerindian lineage defined by P36 (haplogroup Q) was not genotyped; thus, its frequency was estimated from the frequencies of M3 and P36 reported in Native Americans and Hispanics,18, 19 as suggested by Rangel-Villalobos et al.20

Statistical analysis and software

Hardy–Weinberg departures were determined by an exact test using the De Finetti software (http://ihg2.helmholtz-muenchen.de/cgi-bin/hw/hwa1.pl). Frequency data for the ancestral European, Native American and African populations were obtained from the public HapMap database (http://www.hapmap.org/) and from earlier studies.6, 7, 21 The ancestral population contributions to the sample, as well as the number of generations since the admixture process, were estimated using the Admixmap v3.7 software3, 22 with 2000 iterations for the burn-in phase and 10 000 iterations for data gathering. Variation in the individual and population admixture proportions for each subgroup of markers was evaluated by analysis of variance on ranks, with Tukey's test for multiple comparisons, and by linear correlation.
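For readers unfamiliar with exact tests of Hardy–Weinberg proportions, the following minimal sketch shows one standard formulation for a biallelic marker: the probability of each possible heterozygote count is computed conditional on the observed allele counts, and the P value is the total probability of the outcomes no more likely than the observed one. It is not the code of the software used in the study; the function name and the genotype counts are hypothetical.

```python
import math


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg P value for genotype counts AA, AB and BB."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n_bb + n_ab  # copies of allele B

    def log_weight(h):
        # Log of 2^h * n! / (aa! * h! * bb!); the factor shared by every
        # heterozygote count is dropped because the weights are normalized below.
        aa = (n_a - h) // 2
        bb = (n_b - h) // 2
        return (h * math.log(2) + math.lgamma(n + 1)
                - math.lgamma(aa + 1) - math.lgamma(h + 1) - math.lgamma(bb + 1))

    # The heterozygote count must have the same parity as the allele count.
    hets = range(n_a % 2, min(n_a, n_b) + 1, 2)
    logs = {h: log_weight(h) for h in hets}
    peak = max(logs.values())
    weights = {h: math.exp(v - peak) for h, v in logs.items()}
    total = sum(weights.values())
    observed = weights[n_ab]
    # Sum the outcomes that are no more probable than the observed one
    # (a small tolerance absorbs floating-point rounding).
    return sum(w for w in weights.values() if w <= observed * (1 + 1e-9)) / total


# Hypothetical counts for 100 individuals.
p_balanced = hwe_exact_p(25, 50, 25)  # genotypes at Hardy-Weinberg proportions
p_no_hets = hwe_exact_p(50, 0, 50)    # complete absence of heterozygotes
```

Working in log space with lgamma avoids overflow from large factorials, so the same sketch applies to samples much larger than the 100 individuals analyzed here.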

Results

AIMs

A total of 74 AIMs were used to determine the ancestral composition of the population. Detailed information about the 74 AIMs, as well as the frequencies obtained in this study for each marker, is shown in Supplementary Table S3.

The results obtained during the validation stage for markers rs1800498, rs3340 and rs2695 were 100% consistent with the genotypes obtained with the GoldenGate assay (Illumina) for all the re-genotyped samples.

An analysis of the sample with the program Admixmap v3.7, using the panel of 74 AIMs, indicated that the average Native American contribution was 56% (range: 27.4–81.2%), the average European contribution was 38% (range: 16.7–70.5%) and the average West African contribution was 6% (range: 1.3–11.9%). Figure 1 shows the individual admixture proportions in a three-dimensional plot. The number of generations since the admixture process, represented by the sum intensities parameter, was estimated as 11.2 per Morgan.

Figure 1

Individual admixture composition. The figure shows the individual ancestral composition and the mean admixture proportions of the study population using 74 ancestry informative markers (AIMs). The x axis represents the Native American admixture proportion of the study population, and the y and z axes the European and African admixture proportions, respectively. Native Americans, NAM; Europeans, EUR; and Africans, AFR.

Variation in admixture proportions using subgroups of AIMs

To determine variation in the individual and population admixture proportions, subgroups of 54, 34 and 24 AIMs were defined according to the criteria described in the Materials and methods section. Figure 2 shows a graphical representation of the individual admixture proportions using the four different panels of AIMs. Individual admixture proportions were similar when using 74, 54, 34 and 24 AIMs. Most individuals had Native American proportions between 45 and 75%, and the West African contributions were low.

Figure 2

Triangle plot showing individual ancestry composition using different subgroups of ancestry informative markers (AIMs).

Figure 3 reports the average ancestral contributions estimated using the four subgroups of AIMs. There were no differences in the mean Native American, European and African proportions among the different marker subgroups. However, a comparison of the estimated contributions among the different AIM subgroups, using the results for 74 AIMs as a reference, showed that the R² values decreased with the number of AIMs used in the analyses. For the European contribution, the R² values ranged from 0.9353 for 54 AIMs to 0.7058 for 24 AIMs; for the Native American contribution, from 0.9447 for 54 AIMs to 0.7121 for 24 AIMs; and for the West African contribution, from 0.864 for 54 AIMs to 0.5759 for 24 AIMs. The usefulness of fewer than 24 AIMs for ancestry determination was evaluated using two additional subgroups of 20 and 14 AIMs. With these subgroups, we found a statistically significant difference in the median proportions for the African component (P=0.013 and P<0.001, respectively).
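The comparison against the 74-AIM reference can be illustrated with the squared Pearson correlation between the individual ancestry estimates from a reduced panel and those from the full panel. The sketch below is a generic calculation and not the statistical package used in the study; the function name and the four estimates per panel are hypothetical values chosen for illustration.

```python
def r_squared(reference, reduced):
    """Squared Pearson correlation between two equal-length lists of estimates."""
    n = len(reference)
    if n != len(reduced) or n < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(reference) / n
    mean_y = sum(reduced) / n
    # Sums of cross-products and squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reference, reduced))
    sxx = sum((x - mean_x) ** 2 for x in reference)
    syy = sum((y - mean_y) ** 2 for y in reduced)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical individual European proportions from a full and a reduced panel.
full_panel = [0.30, 0.35, 0.40, 0.45]
small_panel = [0.30, 0.40, 0.35, 0.45]
agreement = r_squared(full_panel, small_panel)
```

A value close to 1 means that the reduced panel ranks and spaces individuals almost as the full panel does, even when the group means are identical, which is why R² can fall while the average proportions stay the same.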

Figure 3

Mean admixture composition using different subgroups of ancestry informative markers (AIMs).

Y-SNPs

Figure 4 and Supplementary Table S4 show detailed results for the 15 binary Y-chromosome markers (rs2032624, corresponding to M173, was excluded because of genotyping errors). The selected Y-SNPs allowed all the studied individuals to be classified into the represented haplogroups. All males in the sample carried the ancestral M168 mutation, which represents the out-of-Africa diaspora.16 The most frequent haplogroups were R (35%) and P (28%), followed by haplogroups D (10%) and Q (8%); the least frequent were J, I, G and K (6, 5, 3 and 2%, respectively). The presence of the Eurasian haplogroups J, I, G and K in European populations allows the inference that they were introduced into Native American populations by European males.12 As the ancestral African haplogroups A and B were not found in the sample, the African paternal ancestry was established exclusively by the YAP indel. The Native American contribution was estimated on the basis of the proportion of the haplogroups Q1a3a and P36 (see the Materials and methods section). Consequently, assuming that the large majority of the European and Eurasian haplogroups were brought to Mexico from Europe by the Spaniards, all the remaining haplogroups (neither African nor Amerindian) were considered as the European paternal contribution to the Mexican population sample. This resulted in a predominantly European contribution (78.15–78.38%), followed by Native American (11.85–11.62%) and African (10%) contributions.
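The European paternal share reported above follows by subtraction: once the African share (10%) and the estimated Native American share (11.62–11.85%) are fixed, every remaining haplogroup is counted as European. A minimal sketch of that arithmetic is given below; the function name is ours and is used only for illustration.

```python
def european_paternal_share(african_pct, native_american_pct):
    """Treat every haplogroup that is neither African nor Amerindian as European."""
    return 100.0 - african_pct - native_american_pct


# The two Native American estimates reported in the text bracket the European share.
european_low = european_paternal_share(10.0, 11.85)
european_high = european_paternal_share(10.0, 11.62)
```

The two results correspond to the 78.15–78.38% range given in the text, so the width of the European interval is entirely inherited from the uncertainty in the Native American estimate.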

Figure 4

Phylogenetic tree and haplogroup frequencies defined with the Y markers analyzed. The length of each branch has no significance. *These haplogroups include Amerindian alleles (n=100).

Discussion

Previous admixture studies conducted in Mestizo populations from Mexico indicate a substantial variation in admixture proportions. In this study, the admixture levels of the Mestizo community in northeastern Mexico are examined using a panel of 89 AIMs, including 74 autosomal markers and 15 Y-SNPs.

Our analysis of autosomal markers shows a Native American contribution of 56%, a European contribution of 38% and, to a lesser extent, an African contribution of 6%. These results are similar to the data reported by Lisker et al.23 for the Mestizo populations from the states of Coahuila (northeastern region of Mexico) and Guanajuato (Central Mexico), where the estimated Native American proportions are 55.6 and 51.1%, respectively. In a recent study using 13 combined DNA index system-short tandem repeats (CODIS-STRs), Rubi-Castellanos24 reported for the Mestizo population from Nuevo Leon state a similar European contribution (38.2%), but lower Native American (43.3%) and higher African (18.5%) admixture proportions than those found in this study (56 and 6%, respectively). However, the results of this study differ from the estimates based on D1S80/HLA-DQA1 and 10 STR loci reported earlier by Cerda-Flores et al.10, 11 for the Monterrey metropolitan area, who reported a European contribution of 51.5–61.9% and a lesser Native American contribution (31.9–42.6%). The difference between their study and ours could be explained by the markers used to estimate admixture, by the statistical method (the type of genetic marker under study determines a preference for a particular statistical approach), by the number of individuals included in the study and/or by the selection of the population sample. For example, a significant association between European admixture proportion and higher educational status has been reported in Mexico City.21 This intra-population structure in Mexican populations, by socioeconomic and/or educational status, deserves further research. Using a panel of 69 AIMs, Martinez-Marignac et al.21 estimated that a Mestizo sample from Mexico City had 65% Native American, 30% European and 5% African contributions. It is important to note that there is considerable overlap between the markers used in the study by Martinez-Marignac and those used in our study (54 AIMs in common), which facilitates a direct comparison of the admixture estimates. In this regard, the Mestizo population from northeastern Mexico has higher European ancestry (38 vs 30%) and lower Native American ancestry (56 vs 65%) than the sample from Mexico City. Both samples show similar African contributions (6 vs 5%).

Decreasing the number of AIMs had little impact on the estimated average proportions of ancestry: as the number of AIMs used decreased, the average admixture proportions were maintained. These results show that as few as 24 AIMs can yield similar admixture proportions. This observation is supported by Kosoy et al.,25 who made a similar estimation of ancestry using sets of 128, 96, 64, 48 and 24 AIMs. It is important to note that panels of 20 and 14 AIMs were also evaluated in our study, and the difference found in the mean of proportions for the African component indicates that the minimum number of markers useful for admixture estimation is 24 AIMs. However, when the number of AIMs is reduced, the individual admixture estimates are more affected than the average admixture proportions, as indicated by the R² values in the correlation analyses.

There are several reports on admixture in Mestizos from West Mexico, such as Jalisco,11, 23 and in populations from the Central-South region, such as Puebla, Mexico City, Guerrero, Tlaxcala, Oaxaca, Veracruz, Tabasco, Campeche and Yucatán.6, 11, 21, 23, 25, 26, 27, 28, 29, 30 The admixture estimates reported in these studies are based on blood groups, serum proteins, STRs and AIMs. As a whole, the data are consistent with the history of the Mexican population and reflect a gradient with a higher proportion of Native American ancestry in the southern states and an increased African contribution in the states of the Gulf of Mexico.

This is the first study in Mexico that reports data on 15 Y-SNPs for the analysis of the male ancestral component of our population, with some particular considerations. We did not include the marker P36 of haplogroup Q for the Native American component; however, the estimations based on different reports of Hispanics and Native Americans were very similar (11.6 and 11.8%, respectively) and support this result. Although the Native American marker M242 (ancestral to P36) could increase this estimation, its low frequency (2.3%) in an earlier report of Amerindians31 suggests that our estimation would not change substantially. For the European component, the described Spanish ancestry of the Mexican Mestizos32 allows us to assume that the Eurasian haplogroups G, I, J and K were received from Spanish males during and after the Conquest, rather than from recent Asian gene flow. Finally, the African component did not include the ancestral African lineages A and B, and was exclusively represented by YAP, which is related to the Bantu populations of that continent.16

Accordingly, our estimations showed a clear predominance of paternal European and Eurasian ancestry (78%) in the sample analyzed, compared with Amerindian (12%) and African (10%) ancestry. The differences between the AIM and Y-SNP ancestry estimates clearly show a sex-biased admixture process, as expected from the historical pattern of paternal lineage in Mexico and Latin America, in which the main Y-chromosome contribution to Mestizos comes predominantly from European males.21, 29 The Y-SNP estimations found in this study differ significantly from an earlier report analyzing M3 and YAP in western Mexican Mestizos (P=0.046 and 0.0073), in which these components were estimated at 60–64%, 25–21% and 15%, respectively.20 Our results also differ from the estimates of paternal contributions reported by Martinez-Marignac et al.21 in a central Mexican Mestizo population using the M3 and M170 markers (60 and 40% for European and Native American contributions, respectively). However, these results are in agreement with the hypothesis of a higher European component in the northern region and, similarly, a higher Amerindian ancestry in the central and southeastern regions.23

This study describes admixture proportions in a Mestizo population from northeastern Mexico, estimated with a panel of AIMs and Y-chromosome polymorphisms. This sample is characterized by a predominant Native American ancestry, but the European contribution is higher than that of other regions of Mexico. In contrast, the paternal ancestry determined by Y-SNPs was mainly European. The ancestral variability in admixture proportions observed throughout Mexico and reported in different Mexican-Mestizo populations emphasizes the need to correct for possible population stratification in association studies. This study provides relevant information that will set a reference for future determinations of the admixture proportions in the Mestizo population from Mexico.

Conflict of interest

The authors declare no conflict of interest.