Data on a genome-wide association study of type 2 diabetes in a Maya population

Maya communities have been shown to have a higher prevalence of type 2 diabetes (T2D) than Mexican mestizo populations. Furthermore, some variants associated with the risk for T2D have been described. In this study, we describe the results of a pilot genome-wide association study (GWAS) using 817,823 single nucleotide polymorphisms (SNPs) to identify candidate variants for replication in future studies. Herein, we present the GWAS data, which were divided into three parts: first, 1289 ancestry informative markers (AIMs) for Latino populations, comprising European, African, and Native American SNPs obtained from the literature, were selected; second, a hypothesis-free GWAS was performed to select candidate genes associated with T2D, which identified 24 candidate genes; and third, 39 SNPs previously associated with T2D or related traits were replicated. This article is associated with the original article published in “Gene” under the title “Pilot genome-wide association study identifying novel risk loci for type 2 diabetes in a Maya population”.



Data
A pilot genome-wide association study was performed in a Maya population with high prevalence of T2D and compared with a group of healthy Maya volunteers using statistical models for genotype, allelic, and Armitage trend tests [1].
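As a rough illustration of the three tests named above, the following sketch computes the genotype, allelic, and Armitage trend statistics for a single SNP from case and control genotype counts. It is not the SAS procedure used in the study; the function names, the additive weights (0, 1, 2), and the example counts are assumptions made for illustration only.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail p-value of a chi-square statistic (closed forms for df 1 and 2)."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")


def pearson_chi2(table):
    """Pearson chi-square statistic for a table of counts.

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def genotype_test(cases, controls):
    """2 x 3 genotype test (2 df); counts are ordered (AA, Aa, aa)."""
    stat = pearson_chi2([list(cases), list(controls)])
    return stat, chi2_sf(stat, 2)


def allelic_test(cases, controls):
    """2 x 2 allelic test (1 df) on allele counts derived from genotypes."""
    def alleles(g):
        return [2 * g[0] + g[1], g[1] + 2 * g[2]]
    stat = pearson_chi2([alleles(cases), alleles(controls)])
    return stat, chi2_sf(stat, 1)


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test (1 df) with additive weights (assumed)."""
    n_case, n_ctrl = sum(cases), sum(controls)
    n = n_case + n_ctrl
    cols = [r + s for r, s in zip(cases, controls)]
    k = len(weights)
    t = sum(w * (r * n_ctrl - s * n_case)
            for w, r, s in zip(weights, cases, controls))
    var = (n_case * n_ctrl / n) * (
        sum(w * w * c * (n - c) for w, c in zip(weights, cols))
        - 2 * sum(weights[i] * weights[j] * cols[i] * cols[j]
                  for i in range(k) for j in range(i + 1, k))
    )
    stat = t * t / var
    return stat, chi2_sf(stat, 1)


if __name__ == "__main__":
    # Hypothetical counts (AA, Aa, aa); not taken from the dataset.
    cases, controls = (10, 20, 30), (30, 20, 10)
    for name, test in [("genotype", genotype_test),
                       ("allelic", allelic_test),
                       ("trend", trend_test)]:
        stat, p = test(cases, controls)
        print(f"{name}: chi2 = {stat:.3f}, p = {p:.3e}")
```

The p-values use closed-form chi-square tail formulas so that the sketch needs only the standard library; a statistics package would normally be used instead.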
Specifications Table

Subject area: Genetics, Genomics and Molecular Biology.
More specific subject area: Diabetes and Maya populations
Type of data: Tables, figures and text file
How data was acquired: All the genotypic data was obtained by "Centro de Investigaciones y Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV)" by analysis of the DNA obtained from blood samples of volunteer donors acquired from the primary health centre located in a rural and an urban community in the State of Yucatán, México by the "Centro de Investigaciones Regionales Dr. Hideyo Noguchi, Universidad Autónoma de Yucatán". The genotypic data obtained from the microarray "Affymetrix Axiom Genome-wide LAT1 array" were analysed by CINVESTAV and "Centro Internacional de Mejoramiento de Maíz y Trigo".
Data format: Original and analysed dataset.

Experimental factors
Patients were recruited in 2010, and DNA was extracted from whole blood using the automated extraction system Chemagic Prepito® (

Value of the Data

New genotypic data in a Maya population with T2D compared with a non-diabetic Maya population; such data are scarce in the literature.
These data contain the descriptive data of 24 candidate variants associated with T2D in a Maya population.
Complementary data for the AGTR2 rs1914711 variant associated with T2D in a Maya population.
The results obtained could also be used for comparison in future genetic association studies in Mexican patients with T2D.
Ancestry markers are included.
A. Totomoch-Serra et al. / Data in brief 28 (2020) 104866

populations. To define the number of clusters grouped in the Maya population, a five-fold cross-validation analysis was performed.

Statistical genetic analyses
Genotype, allelic, and Armitage trend association models were run using the SAS software in the CASE-CONTROL procedure (SAS/Genetics (TM) 9.4, SAS Institute, Cary, NC, USA). The threshold for a positive association was set at p < 2.200 × 10⁻⁸, and at p < 0.001 for the replication analysis of 39 SNPs previously associated with T2D. Power analysis was performed using the Quanto software (Quanto 1.2.4, California, USA). Because the Maya population of this study is small and apparently isolated, the degree of endogamy was previously calculated as 1 − He/Ho, where He and Ho are the expected and observed heterozygosity, respectively [1]. Endogamy was null or very low, since the inbreeding parameters were between −0.25 and 0.25. Consequently, no adjustment for relatedness among participants was needed.

Chr, chromosome; SNP-Array ID, single nucleotide polymorphism identifier in the array; dbSNP, single nucleotide polymorphism identifier; Pos, position. Six groups were formed for cross-validation analysis, which are shown in the Frequency group column. In the admixture analysis, the minimum cross-validation error using the ancestry markers for the population of this study was for K = 2, which is represented by the partition of the allele frequency of each SNP into the two groups (G1 and G2); AF G1 is the allelic frequency for group 1; AF G2 is the allelic frequency for group 2.
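The endogamy check described under Statistical genetic analyses can be sketched for a single biallelic SNP as follows. The text writes the ratio as He/Ho; the sketch instead uses the widely used estimator F = 1 − Ho/He, so the orientation of the ratio is an assumption that should be checked against [1]. The function names and the example counts are hypothetical.

```python
def heterozygosity(genotype_counts):
    """Return (Ho, He) for one biallelic SNP.

    genotype_counts is ordered (AA, Aa, aa). He is the heterozygosity
    expected under Hardy-Weinberg proportions, 2p(1 - p).
    """
    n_hom1, n_het, n_hom2 = genotype_counts
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)   # frequency of allele A
    observed = n_het / n
    expected = 2 * p * (1 - p)
    return observed, expected


def inbreeding_coefficient(genotype_counts):
    """F = 1 - Ho/He (conventional form; an assumption, see the lead-in)."""
    observed, expected = heterozygosity(genotype_counts)
    return 1 - observed / expected


def endogamy_is_low(f_value, limit=0.25):
    """Apply the range quoted in the text: F between -0.25 and 0.25."""
    return -limit <= f_value <= limit


if __name__ == "__main__":
    # Hypothetical genotype counts; not taken from the dataset.
    for counts in [(25, 50, 25), (30, 40, 30), (50, 0, 50)]:
        f = inbreeding_coefficient(counts)
        print(counts, round(f, 3), endogamy_is_low(f))
```

In practice the coefficient would be averaged over many SNPs or computed per individual; the single-locus version is shown only to make the arithmetic explicit.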
Chr, chromosome; SNP-Array ID, single nucleotide polymorphism identifier in the array; GenDist, genetic distance; PhysPosit, physical position; T2D-G, type 2 diabetes group; OR, allele odds ratio for the first allele in the allele column (if the OR is greater than 1, the first allele is associated with diabetes); LowerCI, lower limit of the confidence interval; Upper CI, upper limit of the confidence interval; ProbGenotype, probability genotype model; ProbAllele, probability allele model; ProbTrend, probability trend model; lpa, logarithmic probability allele model; lpg, logarithmic probability genotype model; lpt, logarithmic probability trend model.
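To make the OR, LowerCI, Upper CI, and logarithmic probability columns in the legend above concrete, the sketch below derives them from allele counts. It assumes a 95% interval computed on the log-odds scale (Woolf method) and assumes that the logarithmic probabilities are −log10 of the p-values; neither choice is stated in the text, and the function names and counts are hypothetical.

```python
import math


def allele_odds_ratio(case_alleles, control_alleles, z=1.959964):
    """Allele odds ratio for the first allele with a Woolf-type interval.

    Each argument is (count of first allele, count of second allele).
    z = 1.959964 gives a 95% interval (assumed level). All four counts
    must be non-zero.
    """
    a, b = case_alleles
    c, d = control_alleles
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def neg_log10(p_value):
    """Assumed definition of the lpa, lpg, and lpt columns: -log10(p)."""
    return -math.log10(p_value)


if __name__ == "__main__":
    # Hypothetical allele counts; not taken from the dataset.
    odds_ratio, lower, upper = allele_odds_ratio((40, 80), (80, 40))
    print(f"OR = {odds_ratio:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
    # The genome-wide threshold quoted in the text, on the -log10 scale.
    print(f"threshold = {neg_log10(2.2e-8):.2f}")
```

With these example counts the odds ratio is below 1 and the interval excludes 1, which under the legend's convention would mean the first allele is less frequent in the T2D group.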