Sensitive detection of chromatin-altering polymorphisms reveals autoimmune disease mechanisms

Abstract

Most disease associations detected by genome-wide association studies (GWAS) lie outside coding genes, but very few have been mapped to causal regulatory variants. Here, we present a method for detecting regulatory quantitative trait loci (QTLs) that does not require genotyping or whole-genome sequencing. The method combines deep, long-read chromatin immunoprecipitation–sequencing (ChIP-seq) with a statistical test that simultaneously scores peak height correlation and allelic imbalance: the genotype-independent signal correlation and imbalance (G-SCI) test. We performed histone acetylation ChIP-seq on 57 human lymphoblastoid cell lines and used the resulting reads to call 500,066 single-nucleotide polymorphisms de novo within regulatory elements. The G-SCI test annotated 8,764 of these as histone acetylation QTLs (haQTLs)—an order of magnitude larger than the set of candidates detected by expression QTL analysis. Lymphoblastoid haQTLs were highly predictive of autoimmune disease mechanisms. Thus, our method facilitates large-scale regulatory variant detection in any moderately sized cohort for which functional profiling data can be generated, thereby simplifying identification of causal variants within GWAS loci.

Figure 1: Validation of ChIP-seq data.
Figure 2: SNP calling.
Figure 3: haQTLs.
Figure 4: Correlations and molecular mechanisms of haQTLs.
Figure 5: haQTL distribution at promoter and nonpromoter H3K27ac peaks.
Figure 6: haQTLs provide candidate molecular mechanisms for GWAS SNPs.

Accession codes

Primary accessions

Gene Expression Omnibus

References

  1. Boyle, A.P. et al. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 22, 1790–1797 (2012).

  2. Manolio, T.A. Genomewide association studies and assessment of the risk of disease. N. Engl. J. Med. 363, 166–176 (2010).

  3. Maurano, M.T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).

  4. Musunuru, K. et al. From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature 466, 714–719 (2010).

  5. Bauer, D.E. et al. An erythroid enhancer of BCL11A subject to genetic variation determines fetal hemoglobin level. Science 342, 253–257 (2013).

  6. van den Boogaard, M. et al. Genetic variation in T-box binding element functionally affects SCN5A/SCN10A enhancer. J. Clin. Invest. 122, 2519–2530 (2012).

  7. Smemo, S. et al. Obesity-associated variants within FTO form long-range functional connections with IRX3. Nature 507, 371–375 (2014).

  8. Spieler, D. et al. Restless legs syndrome-associated intronic common variant in Meis1 alters enhancer function in the developing telencephalon. Genome Res. 24, 592–603 (2014).

  9. Altshuler, D., Daly, M.J. & Lander, E.S. Genetic mapping in human disease. Science 322, 881–888 (2008).

  10. Faye, L.L., Machiela, M.J., Kraft, P., Bull, S.B. & Sun, L. Re-ranking sequencing variants in the post-GWAS era for accurate causal variant identification. PLoS Genet. 9, e1003609 (2013).

  11. The GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).

  12. Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).

  13. Montgomery, S.B. et al. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature 464, 773–777 (2010).

  14. Pickrell, J.K. et al. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464, 768–772 (2010).

  15. Veyrieras, J.B. et al. High-resolution mapping of expression-QTLs yields insight into human gene regulation. PLoS Genet. 4, e1000214 (2008).

  16. Raj, T. et al. Polarization of the effects of autoimmune and neurodegenerative risk alleles in leukocytes. Science 344, 519–523 (2014).

  17. Battle, A. et al. Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Res. 24, 14–24 (2014).

  18. Degner, J.F. et al. DNase I sensitivity QTLs are a major determinant of human expression variation. Nature 482, 390–394 (2012).

  19. Gaulton, K.J. et al. A map of open chromatin in human pancreatic islets. Nat. Genet. 42, 255–259 (2010).

  20. Kasowski, M. et al. Variation in transcription factor binding among humans. Science 328, 232–235 (2010).

  21. McDaniell, R. et al. Heritable individual-specific and allele-specific chromatin signatures in humans. Science 328, 235–239 (2010).

  22. Ni, Y., Hall, A.W., Battenhouse, A. & Iyer, V.R. Simultaneous SNP identification and assessment of allele-specific bias from ChIP-seq data. BMC Genet. 13, 46 (2012).

  23. Smith, A.J. et al. Use of allele-specific FAIRE to determine functional regulatory polymorphism using large-scale genotyping arrays. PLoS Genet. 8, e1002908 (2012).

  24. Kasowski, M. et al. Extensive variation in chromatin states across humans. Science 342, 750–752 (2013).

  25. Kilpinen, H. et al. Coordinated effects of sequence variation on DNA binding, chromatin structure, and transcription. Science 342, 744–747 (2013).

  26. McVicker, G. et al. Identification of genetic variants that affect histone modifications in human cells. Science 342, 747–749 (2013).

  27. Altshuler, D.M. et al. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58 (2010).

  28. The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).

  29. Wang, Z. et al. Combinatorial patterns of histone acetylations and methylations in the human genome. Nat. Genet. 40, 897–903 (2008).

  30. Heintzman, N.D. et al. Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459, 108–112 (2009).

  31. Kumar, V. et al. Uniform, optimal signal processing of mapped deep-sequencing data. Nat. Biotechnol. 31, 615–622 (2013).

  32. Gerstein, M.B. et al. Architecture of the human regulatory network derived from ENCODE data. Nature 489, 91–100 (2012).

  33. Neph, S. et al. An expansive human regulatory lexicon encoded in transcription factor footprints. Nature 489, 83–90 (2012).

  34. DePristo, M.A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).

  35. Bamshad, M.J. et al. Exome sequencing as a tool for Mendelian disease gene discovery. Nat. Rev. Genet. 12, 745–755 (2011).

  36. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Series B Stat. Methodol. 57, 289–300 (1995).

  37. Potapova, A. et al. Systematic cross-validation of 454 sequencing and pyrosequencing for the exact quantification of DNA methylation patterns with single CpG resolution. BMC Biotechnol. 11, 6 (2011).

  38. Hou, S. et al. Identification of a susceptibility locus in STAT4 for Behcet's disease in Han Chinese in a genome-wide association study. Arthritis Rheum. 64, 4104–4113 (2012).

  39. Kirino, Y. et al. Genome-wide association analysis identifies new susceptibility loci for Behcet's disease and epistasis between HLA-B*51 and ERAP1. Nat. Genet. 45, 202–207 (2013).

  40. Pai, A.A. et al. The contribution of RNA decay quantitative trait loci to inter-individual variation in steady-state gene expression levels. PLoS Genet. 8, e1003000 (2012).

  41. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  42. Kent, W.J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002).

  43. Jurinke, C., van den Boom, D., Cantor, C.R. & Koster, H. The use of MassARRAY technology for high throughput genotyping. Adv. Biochem. Eng. Biotechnol. 77, 57–74 (2002).

  44. Skotte, L., Korneliussen, T.S. & Albrechtsen, A. Association testing for next-generation sequencing data using score statistics. Genet. Epidemiol. 36, 430–437 (2012).

  45. O'Neill, R. Algorithm AS 47: function minimization using a simplex procedure. J. R. Stat. Soc. Ser. C Appl. Stat. 20, 338–345 (1971).

  46. Nelder, J.A. & Mead, R. A simplex method for function minimization. Comput. J. 7, 308–313 (1965).

  47. Benjamini, Y. & Yekutieli, D. The control of the false discovery rate in multiple testing under dependency. Ann. Stat. 29, 1165–1188 (2001).

  48. Degner, J.F. et al. Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data. Bioinformatics 25, 3207–3212 (2009).

  49. Kheradpour, P. & Kellis, M. Systematic discovery and characterization of regulatory motifs in ENCODE TF binding experiments. Nucleic Acids Res. 42, 2976–2987 (2014).

  50. Welter, D. et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 42, D1001–D1006 (2014).

  51. Beecham, A.H. et al. Analysis of immune-related loci identifies 48 new susceptibility variants for multiple sclerosis. Nat. Genet. 45, 1353–1360 (2013).

  52. Tsoi, L.C. et al. Identification of 15 new psoriasis susceptibility loci highlights the role of innate immunity. Nat. Genet. 44, 1341–1348 (2012).

  53. Eyre, S. et al. High-density genetic mapping identifies new susceptibility loci for rheumatoid arthritis. Nat. Genet. 44, 1336–1340 (2012).

  54. Cortes, A. et al. Identification of multiple risk variants for ankylosing spondylitis through high-density genotyping of immune-related loci. Nat. Genet. 45, 730–738 (2013).

Acknowledgements

This work was funded by the Agency for Science, Technology and Research (A*STAR), Singapore, including grant no. GIS/11-IAF300-4.

Author information

Contributions

R.C.-H.d.R. performed the computational and statistical analyses; J.P. performed ChIP-seq experiments and PyroMark validation with assistance from E.P.; R.C.-H.d.R. and S.L.R. programmed the G-SCI test; C.C.K. performed Sequenom validation; S.P., R.C.-H.d.R., J.P. and M.L.H. designed the study; S.P. provided overall supervision; S.P., R.C.-H.d.R. and J.P. drafted the manuscript with assistance from M.L.H. and C.C.K.

Corresponding author

Correspondence to Shyam Prabhakar.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 H3K27ac peak-height correlation across data sets.

Peak heights were input corrected and normalized by the total number of reads in the library, and Pearson correlations were then computed between pairs of data sets. GIS, samples sequenced in this study; KAS, samples sequenced by Kasowski et al. (2013) (ref. 24); MCV, samples sequenced by McVicker et al. (2013) (ref. 26); ENC, samples sequenced by ENCODE.
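
The normalization and pairwise correlation described in this legend can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the subtraction-based input correction, the reads-per-million scaling, the toy counts and the function names (`normalize_peak_heights`, `pearson`, `pairwise_correlations`) are all assumptions, since the legend does not give these details.

```python
import math

def normalize_peak_heights(chip_counts, input_counts, chip_total, input_total):
    """Input-correct and library-size-normalize peak heights for one sample.

    Assumption: input correction is a subtraction of the library-scaled
    input signal, floored at zero. The legend does not specify the formula.
    """
    scale = 1e6  # reads-per-million scaling (illustrative choice)
    heights = []
    for chip, inp in zip(chip_counts, input_counts):
        chip_rpm = chip * scale / chip_total
        input_rpm = inp * scale / input_total
        heights.append(max(chip_rpm - input_rpm, 0.0))
    return heights

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def pairwise_correlations(samples):
    """Return {(name_a, name_b): r} for every unordered pair of samples."""
    names = sorted(samples)
    return {
        (a, b): pearson(samples[a], samples[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Toy read counts for three hypothetical samples over four peaks
samples = {
    "sample_A": normalize_peak_heights([120, 40, 300, 15], [10, 8, 12, 9], 2_000_000, 1_000_000),
    "sample_B": normalize_peak_heights([100, 55, 280, 20], [11, 7, 13, 8], 1_800_000, 1_000_000),
    "sample_C": normalize_peak_heights([90, 60, 310, 10], [9, 9, 10, 10], 2_200_000, 1_000_000),
}
for pair, r in pairwise_correlations(samples).items():
    print(pair, round(r, 3))
```

With 57 samples, the same pairwise loop would yield 57 × 56 / 2 = 1,596 correlation values.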

Supplementary Figure 2 Histogram of H3K27ac peak-height correlations for all pairs in the 57-sample set.

Correlation values are the same as in Supplementary Fig. 1. Median R = 0.87.

Supplementary Figure 3 Quantile-quantile plot of observed versus expected haQTL P values for the 500,066 called SNPs.

The dashed black line is the expected null distribution.
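
The comparison behind a quantile-quantile plot of this kind can be sketched as follows. This is an illustrative sketch that assumes null P values are uniformly distributed on (0, 1); the function name `qq_points`, the plotting position (i − 0.5)/n and the toy P values are choices made here, not details taken from the paper.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 P values for a QQ plot.

    Under a uniform null, the i-th smallest of n P values is expected
    near (i - 0.5) / n (an assumed plotting position). All P values
    must be strictly positive.
    """
    observed = sorted(p_values)
    n = len(observed)
    points = []
    for i, p in enumerate(observed, start=1):
        expected = (i - 0.5) / n
        points.append((-math.log10(expected), -math.log10(p)))
    return points

# Toy P values (hypothetical, for illustration only)
toy = [0.8, 0.04, 0.3, 1e-6, 0.55]
for expected, observed in qq_points(toy):
    print(f"expected={expected:.2f} observed={observed:.2f}")
```

Points whose observed value exceeds the expected value lie above the diagonal, which is the pattern produced by an excess of small P values relative to the null.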

Supplementary Figure 4 Functional enrichment of low-frequency haQTLs.

(a) Enrichment of low-frequency (minor allele frequency below 5%) haQTLs in GM12878 TF ChIP-seq peak regions. Only low-frequency called SNPs within histone acetylation peaks were used as the negative control. (b) Low-frequency haQTL enrichment in GM12878 TF binding sites. Binding sites were defined as motif matches within the corresponding ChIP-seq peaks, and only haQTLs that overlapped the binding site were considered in the analysis. The P values and enrichment scores were calculated as in Fig. 4b,c.

Supplementary Figure 5 Low-frequency haQTL distribution at promoter and nonpromoter H3K27ac peaks.

Similar to Fig. 5, except that only low-frequency (minor allele frequency below 5%) haQTLs and SNPs were used in the analysis. For both promoter and nonpromoter peaks, the low-frequency haQTLs show enrichment at the hypersensitive region of the peak (right panels).

Supplementary Figure 6 Venn diagram of autoimmune GWAS SNPs that are in strong LD with the regulatory QTL sets considered in this study.

Supplementary Figure 7 LD of haQTL SNPs with autoimmune GWAS SNPs.

The upper two bars are identical to Fig. 6b: they represent the GWAS statistics of the set of 8,764 haQTLs identified regulome-wide by our G-SCI pipeline, which includes an effect-size filter. When that filter was removed, the haQTL set expanded to 17,936 SNPs. Both haQTL sets show enrichment for LD with GWAS SNPs, with the latter having a stronger P value due to the larger number of tested SNPs. See also Supplementary Table 6.

Supplementary Figure 8 Functional enrichment of haQTLs with and without the effect-size filter.

(a) haQTLs in GM12878 TF ChIP-seq peak regions. (b) haQTLs in GM12878 TF binding sites (motifs in peak regions). The P values and enrichment scores were calculated as in Fig. 4b,c. Although both haQTL sets were enriched for evidence of transcription factor binding, the filtered set showed stronger enrichment.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8 and Supplementary Tables 1, 2, 4, 6 and 7 (PDF 1154 kb)

Supplementary Table 3

List of haQTLs (XLSX 1403 kb)

Supplementary Table 5

List of autoimmune GWAS SNPs (XLSX 57 kb)

About this article

Cite this article

del Rosario, RH., Poschmann, J., Rouam, S. et al. Sensitive detection of chromatin-altering polymorphisms reveals autoimmune disease mechanisms. Nat Methods 12, 458–464 (2015). https://doi.org/10.1038/nmeth.3326

  • DOI: https://doi.org/10.1038/nmeth.3326