Effects of cutoff thresholds for minor allele frequencies on HapMap resolution: A real dataset-based evaluation of the Chinese Han and Tibetan populations

  • Articles/Medical Genetics
Chinese Science Bulletin

Abstract

Genomic variation is the genetic basis of phenotypic diversity among individuals, including variation in disease susceptibility and drug response. The greatest promise of the International HapMap is to provide roadmaps for identifying genetic variants predisposing to complex diseases. The single nucleotide polymorphism (SNP) is the fundamental element of the HapMap. SNP allele frequency is one of the major factors affecting the resulting HapMap, because it is the quantity from which linkage disequilibrium (LD) is calculated, haplotypes are constructed, and tagging SNPs (tagSNPs) are selected. The cutoff thresholds for the frequency of minor alleles used in making the map therefore have profound effects on the resolution of that map. To date, most researchers have adopted their own cutoff thresholds, and there has been little real dataset-based evaluation of the effects of different cutoff thresholds on HapMap resolution. To assess the implications of different cutoff values, we analyzed our own data for the centromeric genes on Chromosome 15 in Chinese Han and Tibetan populations, using minor allele frequency cutoff values of ≥0.01 (0.01 group), ≥0.05 (0.05 group), and ≥0.10 (0.10 group), and constructed HapMaps from each of the datasets. The resolution, study power and cost-effectiveness of each map were compared. Our results show that the 0.01 threshold provides the greatest power (P = 0.019 in Han and P = 0.029 in Tibetan for the 0.01 vs. 0.05 threshold) and detects the most population-specific haplotypes (P = 0.012 for the 0.01 vs. 0.05 threshold). However, in the regions studied, the 0.05 cutoff threshold did not significantly increase power above the 0.10 threshold (P = 0.191 in Han; P = 1.000 in Tibetans), and did not improve resolution over the 0.10 value for population-specific haplotypes (P = 0.592) either. Furthermore, the 0.05 and 0.10 values produced the same figures for tagging efficiency, LD block number, LD length, study power and cost savings in the Tibetan population. These results suggest that a lower cutoff value is more appropriate for studies in which population-specific haplotypes are crucial, and that the most appropriate cutoff value may differ between populations. Because only a limited number of genes were studied in this project, more studies should be conducted to further address this important issue.
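
As an illustration of the grouping step described in the abstract, the following minimal Python sketch computes a minor allele frequency (MAF) per SNP and keeps the SNPs that meet each of the three cutoffs. It is not the authors' pipeline: the function names, the genotype encoding (two-letter strings for biallelic SNPs) and the toy data are assumptions made only for this example.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the MAF of one biallelic SNP.

    `genotypes` is a list of two-letter strings such as "AG"
    (a hypothetical encoding chosen for this sketch).
    """
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    # A monomorphic SNP (or an empty list) has no minor allele.
    if total == 0 or len(counts) < 2:
        return 0.0
    return min(counts.values()) / total


def filter_by_maf(snps, cutoff):
    """Keep only SNPs whose MAF is greater than or equal to `cutoff`."""
    return {
        name: g
        for name, g in snps.items()
        if minor_allele_frequency(g) >= cutoff
    }


# Toy data for 50 individuals (100 alleles) per SNP; values are invented.
snps = {
    "snp_a": ["AA"] * 48 + ["AG"] * 2,   # MAF = 2/100
    "snp_b": ["CC"] * 43 + ["CT"] * 7,   # MAF = 7/100
    "snp_c": ["GG"] * 30 + ["GT"] * 20,  # MAF = 20/100
    "snp_d": ["TT"] * 50,                # monomorphic, MAF = 0
}

# The three cutoffs named in the abstract.
groups = {cutoff: sorted(filter_by_maf(snps, cutoff))
          for cutoff in (0.01, 0.05, 0.10)}

for cutoff, names in groups.items():
    print(f"MAF >= {cutoff:.2f}: {names}")
```

Each higher cutoff retains a subset of the SNPs retained by the lower one, which is why the choice of threshold directly changes the set of markers available for LD calculation, haplotype construction and tagSNP selection.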



Author information

Corresponding author

Correspondence to YiMing Wang.

Additional information

Supported by the Key Construction Program of the National “985” Project of China (Phase II), the Natural Science Foundation of Guangdong Province (Grant No. 031673), and the Guangzhou Municipal Science and Technology Foundation (Grant Nos. 2002Z3-C7191, 2004Z3-C7501).

About this article

Cite this article

Xiong, S., Hao, Y., Rao, S. et al. Effects of cutoff thresholds for minor allele frequencies on HapMap resolution: A real dataset-based evaluation of the Chinese Han and Tibetan populations. Chin. Sci. Bull. 54, 2069–2075 (2009). https://doi.org/10.1007/s11434-009-0302-4

  • DOI: https://doi.org/10.1007/s11434-009-0302-4
