Effect of natural genetic variation on enhancer selection and function

Heinz, S.; Romanoski, C. E.; Benner, C.; Allison, K. A.; Kaikkonen, M. U.; Orozco, L. D.; Glass, C. K.

doi:10.1038/nature12615

Article
Published: 13 October 2013

Nature volume 503, pages 487–492 (2013)

Abstract

The mechanisms by which genetic variation affects transcription regulation and phenotypes at the nucleotide level are incompletely understood. Here we use natural genetic variation as an in vivo mutagenesis screen to assess the genome-wide effects of sequence variation on lineage-determining and signal-specific transcription factor binding, epigenomics and transcriptional outcomes in primary macrophages from different mouse strains. We find substantial genetic evidence to support the concept that lineage-determining transcription factors define epigenetic and transcriptomic states by selecting enhancer-like regions in the genome in a collaborative fashion and facilitating binding of signal-dependent factors. This hierarchical model of transcription factor function suggests that limited sets of genomic data for lineage-determining transcription factors and informative histone modifications can be used for the prioritization of disease-associated regulatory variants.

**Figure 1: Genetic variation affects LDTF binding.**

**Figure 2: Genetic variation supports the LDTF collaborative binding model.**

**Figure 3: Validation of predicted binding and modification patterns.**

**Figure 4: p65 binding is largely determined by LDTF binding.**

**Figure 5: Validation of strain-specific enhancer activity and causal variants.**

Accession codes

Data deposits

Data are available in the Gene Expression Omnibus (GEO) under accession GSE46494.

Acknowledgements

We thank A. J. Lusis for providing access to eQTL data (http://systems.genetics.ucla.edu/) and for productive conversations. We thank D. Pollard for discussions and suggestions, and L. Bautista for assistance with figure preparation. These studies were supported by National Institutes of Health (NIH) grants DK091183, CA17390 and DK063491 (C.K.G.). M.U.K. was supported by the Foundation Leducq Career Development award and grants from Academy of Finland, Finnish Foundation for Cardiovascular Research and Finnish Cultural Foundation, North Savo Regional fund. C.E.R. was supported by the American Heart Association Western States Affiliates (12POST11760017) and the NIH (5T32DK007494).

Author information

S. Heinz and C. E. Romanoski: These authors contributed equally to this work.

Authors and Affiliations

Department of Cellular and Molecular Medicine, University of California, San Diego, 9500 Gilman Drive, Mail Code 0651, La Jolla, California 92093, USA
S. Heinz, C. E. Romanoski, C. Benner, K. A. Allison, M. U. Kaikkonen & C. K. Glass
Integrative Genomics and Bioinformatics Core, Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, USA
C. Benner
San Diego Center for Systems Biology, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA
C. Benner & C. K. Glass
Department of Biotechnology and Molecular Medicine, A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, PO Box 1627, 70211 Kuopio, Finland
M. U. Kaikkonen
Department of Molecular Cell and Developmental Biology, University of California, Los Angeles, 3000 Terasaki Life Sciences Building, Los Angeles, California 90095, USA
L. D. Orozco
Department of Medicine, University of California, San Diego, 9500 Gilman Drive, Mail Code 0651, La Jolla, California 92093, USA
C. K. Glass

Contributions

S.H., C.K.G. and C.E.R. designed the study; S.H., C.E.R., K.A.A., M.U.K. and L.D.O. performed experiments; C.E.R. performed all genetic-variation-related analysis; C.B. wrote custom code for HOMER2 and analysed data; K.A.A. and S.H. analysed data; C.E.R., S.H. and C.K.G. wrote the manuscript.

Corresponding author

Correspondence to C. K. Glass.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 ChIP-Seq data characteristics.

a, Summary of ChIP-Seq features identified. The number of ChIP-seq regions/peaks identified in untreated primary thioglycolate-elicited macrophages is tabulated for H3K4me2, H3K27ac, PU.1 and C/EBPα. Peaks for p65 were quantified in macrophages treated with 100 ng ml⁻¹ KLA for 1 h. Unless otherwise noted, modification and binding were considered strain-specific at ≥fourfold difference between strains in sequenced tags, and the FDR was <1 × 10⁻¹⁴ based on Poisson cumulative distribution testing and Benjamini and Hochberg correction. b–e, Reproducibility and strain-specific binding. Two separate pools of thioglycolate-elicited macrophages from mice from each strain (C57BL/6J and BALB/cJ) were treated with KLA for 1 h. ChIP-seq for p65 was performed separately on each pool (see Methods). The number of normalized sequencing tags at the union of peaks identified in the indicated experiments is shown. Peaks highlighted in red were deemed experiment-specific using criteria applied throughout this study (fourfold, and FDR < 1 × 10⁻¹⁴ from the cumulative Poisson distribution and Benjamini and Hochberg FDR estimation). The number of experiment-specific peaks is indicated (red) relative to the total number of peaks (black). f, Comparison of the p65 log₂ peak tag ratio between strains and experimental sets for all peaks (black), highlighting experiment-specific peaks (red) identified in either d or e. g, Heat map showing pairwise correlation between all p65 experiments. Pearson correlation coefficients are given for each comparison.
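
The strain-specificity criteria in this legend (at least fourfold difference in tags, FDR < 1 × 10⁻¹⁴ from a cumulative Poisson test with Benjamini and Hochberg correction) can be illustrated with a minimal Python sketch. It is not the HOMER code used in the study. The function names are hypothetical, and the choice of the lower tag count as the Poisson mean is an assumption made for illustration, since the legend does not state how the expected count was defined.

```python
import math

def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), with lam > 0.

    The tail is summed directly so that very small P values are not
    lost to rounding, as they would be with 1 - CDF.
    """
    if k <= lam:
        # Not an enrichment: 1.0 is a conservative upper bound on the P value.
        return 1.0
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total, i = 0.0, k
    while True:
        total += term
        i += 1
        term *= lam / i  # next Poisson probability, P(X = i)
        if term <= total * 1e-17:
            break
    return min(total, 1.0)

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest P value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

def call_strain_specific(tag_pairs, min_fold=4.0, max_fdr=1e-14):
    """Flag peaks with >= min_fold tag difference and FDR < max_fdr.

    tag_pairs holds (tags_strain_a, tags_strain_b) as normalized integer
    counts. ASSUMPTION: the lower count (floored at 1) is the Poisson mean.
    """
    pvalues, folds = [], []
    for tags_a, tags_b in tag_pairs:
        high = max(tags_a, tags_b)
        low = max(min(tags_a, tags_b), 1)
        pvalues.append(poisson_upper_tail(high, low))
        folds.append(high / low)
    qvalues = benjamini_hochberg(pvalues)
    return [f >= min_fold and q < max_fdr for f, q in zip(folds, qvalues)]

# Hypothetical tag counts for three peaks in two strains.
calls = call_strain_specific([(200, 10), (50, 45), (8, 1)])
```

In this toy input only the first peak passes both filters: the third peak has an eightfold difference, but its tag counts are too low to reach the FDR threshold, which is the reason for combining the two criteria.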

Extended Data Figure 2 Strain-specific LDTF binding correlates with variant density and location in LDTF motifs but not with genomic context.

a, Genomic features do not distinguish between strain-similar and strain-specific LDTF binding. Peaks were restricted to promoter-distal peaks (>3 kb to gene start sites). Genomic features (distance to nearest gene, distance to nearest repeat, CpG content and conservation score) were compared among three pairs of strain-similarly bound and strain-specifically bound PU.1 and/or C/EBPα loci (listed as groups 1–6). Box midlines are medians, boundaries are first and third quartiles. Whiskers extend to the extreme data points. CpG content and conservation were quantified in 1-kb regions centred on the LDTF peak. P values from two-sided t-test are given if below 0.05. b, Strain-specific C/EBPα binding occurs in regions with increased variant density. ChIP-Seq tag counts in 200-bp peak regions were stratified into five bins according to log₂ ratios of peak tag counts in BALB/cJ versus C57BL/6J mice (x axis, log₂ ratio), and the variant density distributions are shown per bin. c, d, Variant density distribution in strain-specific peaks. Mean variant densities within 10-bp bins relative to ChIP-Seq peak centres in strain-similar (red) or strain-specific (blue) peaks. e, Strain-specific PU.1 binding correlates with mutations in PU.1 motifs. PU.1 motif mutations were quantified in PU.1-bound regions and plotted against the logarithmic ratio of PU.1 peak tag counts in each strain (binding ratio) (x axis). The frequency of motifs that were mutated in BALB/cJ are plotted in red and those mutated in C57BL/6J in blue. f, The analogous relationship as shown in e for PU.1 is plotted for C/EBP motif mutations versus C/EBPα strain binding ratio.
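
The variant density profiles in panels c and d (mean variant density within 10-bp bins relative to peak centres) can be sketched as follows. This is an illustrative Python example, not the authors' pipeline. The function name, the ±100-bp window and the single-chromosome coordinate lists are assumptions chosen to keep the example short.

```python
from bisect import bisect_left

def variant_density_profile(peak_centres, variant_positions,
                            half_window=100, bin_size=10):
    """Mean number of variants per peak in each bin around the peak centre.

    All coordinates are assumed to lie on one chromosome. Bin 0 covers
    [centre - half_window, centre - half_window + bin_size), and so on.
    """
    if not peak_centres:
        return []
    variants = sorted(variant_positions)
    n_bins = (2 * half_window) // bin_size
    counts = [0] * n_bins
    for centre in peak_centres:
        # Binary search for the variants falling inside the window.
        lo = bisect_left(variants, centre - half_window)
        hi = bisect_left(variants, centre + half_window)
        for pos in variants[lo:hi]:
            counts[(pos - centre + half_window) // bin_size] += 1
    return [c / len(peak_centres) for c in counts]

# Hypothetical peak centres and variant coordinates.
profile = variant_density_profile([1000, 2000], [995, 1003, 2050, 5000])
```

Computing one profile for strain-similar peaks and one for strain-specific peaks, then plotting both against bin offset, gives curves of the kind described in the legend.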

Extended Data Figure 3 Strain-specific PU.1 and C/EBPα binding correlates with strain-specific LDTF motifs.

a, Top and degenerate motifs enriched in H3K4me2 and PU.1 or C/EBPα ChIP-Seq peaks. b, NF-κB consensus and degenerate motifs enriched in p65 ChIP-Seq peaks. These motifs were used to query individual genome sequences and identify strain-specific motifs in subsequent analysis. Degenerate ‘weak’ motif occurrence numbers for a given factor include ChIP-Seq peaks containing ‘strong’ motifs. Position weight matrices and log-odds score thresholds for each motif are given in Supplementary Table 1. c, d, Mutations in LDTF motifs affect PU.1 (c) and C/EBPα (d) binding. Left panels show scatterplots for the ChIP-Seq-defined binding of PU.1 (c) and C/EBPα (d) between C57BL/6J (x axes) and BALB/cJ (y axes). Strain-specific motifs were queried within 100-bp of each peak position. Red symbols designate binding events at loci where a polymorphism mutated a C/EBP, PU.1 or AP-1 motif in the C57BL/6J genome, whereas the motif was intact in the BALB/cJ genome. Blue points highlight mutations in these motifs in the BALB/cJ genome only. Violin plots in the right panels show the effect of each motif mutation (along x axes: PU.1, C/EBP, AP-1 and NF-κB) on the ratio of PU.1 (c) and C/EBPα (d) binding between mouse strains (y axes: positive values denote BALB/cJ-specific, negative values denote C57BL/6J-specific). Tag ratio distributions for peaks overlapping C57BL/6J motif mutations are on the left (light colours), those for peaks overlapping BALB/cJ motif mutations are on the right (dark colours). The fold-difference between mean binding ratios is indicated under the pair of distributions for each motif. The grey distribution indicates PU.1- or C/EBPα-bound loci not overlapping strain-specific motifs.

Extended Data Figure 4 Effects of cognate motif distance from peak centre, variant position within a motif and the presence of alternative motifs on strain-differential binding of PU.1 and C/EBPα.

a–d, PU.1 and C/EBP motif mutations near the experimentally derived peak centre are associated with impaired binding. a, c, The ratios of the frequencies of variant-containing motifs at the given distances from strain-differentially versus strain-similarly bound peak centres (>twofold versus <twofold tag count ratio) for 570 PU.1 (a) and 278 C/EBP (c) variant-containing motifs are shown, respectively. b, d, The distribution of absolute strain peak tag count ratios of peaks whose centre is at the given distances from mutated PU.1 (b) or C/EBP (d) motifs. Box midlines are medians, and boundaries are first and third quartiles. Whiskers extend to the extreme data points. P values are from two-sided t-test. e, f, Effects of alternative PU.1 and C/EBP motifs and core mutations on binding. The number of non-mutated ‘alternative’ PU.1 or C/EBP motifs in the strain with a PU.1 or C/EBP motif mutation was counted, and the absolute respective PU.1 or C/EBPα log₂ strain binding ratio is shown. g, Defining the C/EBP motif core by comparing differential versus similar C/EBPα binding. Sequence variants within C/EBP motifs located in loci devoid of alternative C/EBP motifs (n = 178) were counted according to whether they were in differential (blue) or similar (red) C/EBPα-bound peaks. h, The distribution of PU.1 binding strain log₂ ratios (x axis) is shown for PU.1 mutations located in the PU.1 core and non-core nucleotides (defined in Fig. 1g). i, The C/EBPα binding strain log₂ ratio is shown for C/EBP core and non-core mutations as defined in g. j, k, Motif mutations predominantly occur at differentially bound loci. The odds ratios (x axis; equation shown in box) describing the relative effect of the indicated characteristics of mutated motifs on differential binding relative to similar binding are shown for PU.1 (j) and C/EBPα (k). Whiskers show 95% confidence intervals. nt, nucleotides. l, m, The percentage of respective motif mutations consistent with altered PU.1 (l) and C/EBPα (m) binding is shown for the indicated categories of motif mutations.
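
Panels j and k report odds ratios with 95% confidence intervals, but the boxed equation from the figure is not reproduced in this text. As a hedged illustration only, the sketch below uses the standard 2 × 2 odds ratio with a Woolf (log-scale) confidence interval, which may differ from the formula in the figure. The function name and the counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for a 2 x 2 table.

    a: motifs with the characteristic, in differentially bound peaks
    b: motifs with the characteristic, in similarly bound peaks
    c: motifs without it, in differentially bound peaks
    d: motifs without it, in similarly bound peaks
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Woolf standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts, not taken from the paper.
estimate, lower, upper = odds_ratio_ci(30, 10, 20, 40)
```

An interval that excludes 1 indicates that the characteristic is associated with differential binding at the 5% level under this approximation.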

Extended Data Figure 5 Analysis pipeline for predicting functional PU.1 mutations in NOD.

Data are shown in Fig. 1h.

Extended Data Figure 6 LDTF motif mutations are enriched at strain-specific C/EBPα-bound loci relative to strain-similar loci.

a, The log₂ odds ratio for observing a C57BL/6J-specific versus BALB/cJ-specific mutation in the indicated three bins of C/EBPα binding ratios: similar (middle bin), or strain-specifically C/EBPα bound (left and right bins). Details are in the Methods. b, Collaborative binding is largely not mediated by direct protein–protein interactions. A total of 14,199 loci bound by PU.1 and C/EBPα were centred on the PU.1 weak motif (0 on x axes) and cumulative instances of C/EBP and AP-1 motifs were plotted at each position relative to the central PU.1 motif. Interferon response factor (IRF) half-sites are plotted as control for a factor that requires direct protein–protein interactions with PU.1 for DNA binding. The motifs in each comparison showing overlapping sequence and base pair distances are indicated to the right. Peak distances from the central PU.1 motif are indicated in the histograms. RC denotes reverse complement. c, Allele-specific C/EBPα binding in F₁ heterozygotes is similar to binding in homozygous parental strains. C/EBPα ChIP-seq reads from CB6F1/J hybrid F₁ macrophages were mapped with no mismatches to both parental genome sequences to identify allele-specific reads. C/EBPα log₂ peak tag ratios between the parental strains (BALB/cJ versus C57BL/6J) are shown on the x axis, and the log₂ ratio of allele-specific reads in the F₁ hybrids is shown on the y axis (BALB/cJ allele versus C57BL/6J allele). C57BL/6J-specific C/EBPα regions are blue, BALB/cJ-specific C/EBPα regions are red, and strain-similar C/EBPα regions are black. Strain-specific or similar regions were defined from parental data.

Extended Data Figure 7 Strain-specific epigenetic marks correlate with LDTF binding, and LDTF mutations segregate with altered H3K4me2 deposition.

a–f, Strain-specificity of LDTF binding and epigenetic marks. The relative amount of H3K4me2 (a–c) and H3K27ac (d–f) between C57BL/6J and BALB/cJ (x axes) is highly correlated with the amount of bound PU.1, C/EBPα or product (PU.1 × C/EBPα). The log₂ ratios of the peak tag counts for PU.1, C/EBPα and PU.1 × C/EBPα in each strain are shown relative to the log₂ of the peak tag count ratios for H3K4me2 or H3K27ac. Loci containing strain-specific LDTF motifs in a differentially PU.1- or C/EBPα-bound peak are highlighted. Correlation coefficients (Pearson) are indicated for each comparison. g, LDTF mutations segregate with altered H3K4me2 deposition. The log₂ of the ratio of the product of the normalized peak tag counts for PU.1 and C/EBPα in 200 bp in each strain (x axis) is compared to the log₂ H3K4me2 peak tag ratio in 1 kb (y axis) for loci containing at least a PU.1 or C/EBPα peak. Strain-specific LDTF motif mutations are indicated by the designated symbols and coloured by the mutated strain (C57BL/6J red, BALB/cJ blue). The distribution of H3K4me2 strain ratios stratified by corresponding LDTF strain mutations is shown to the right, with P value from a two-sided t-test. h, Relationships between H3K27ac patterns in different cell types. ES, embryonic stem. Hierarchical clustering of H3K27ac-positive regions as determined by ChIP-Seq and analysis with HOMER. The number of ChIP-seq tags in each of the 86,264 H3K27ac-marked regions used for comparison with eQTL data in Fig. 2e that were detected in at least one cell type was clustered using Euclidean distance.

Extended Data Figure 8 LDTFs prime the p65 cistrome.

a, The 69,517 regions that gained p65 in C57BL/6J after KLA treatment were analysed for binding of PU.1 and C/EBPα with and without KLA treatment as shown in the pie charts. Loci not bound by PU.1 or C/EBPα after KLA treatment were analysed by de novo motif finding. The most enriched motif was AP-1, and the second-most enriched motif was NF-κB. b, Violin plots of the p65 strain ratios of mean-normalized p65 binding for p65-bound peaks stratified by motif mutations present in either BALB/cJ or C57BL/6J. Mutated motifs included PU.1 (strong and weak), C/EBP (strong and weak), C/EBP:AP-1 heterodimers, AP-1 and NF-κB. The effect on p65 binding per group is shown by comparing the mean-normalized p65 tag binding ratio along the y axis (log₂(BALB/cJ–C57BL/6J); positive values denote BALB/cJ-specific, negative values denote C57BL/6J-specific). White circles indicate the distribution means, and the average fold change associated with C57BL/6J-mutating and BALB/cJ-mutating SNPs in the respective motifs is given beneath. One-sided t-test P values between each pair of distributions ranged from 1 × 10⁻²⁹ to 1 × 10⁻¹⁴. c, Variant density in strain-specific and strain-similar p65 peaks. Mean variant density within 10-bp bins relative to p65 ChIP-Seq peak centres in strain-similar (red) or strain-specific peaks (blue). d–f, The variant density distribution in strain-specific p65 peaks is broader than those for PU.1 or C/EBPα. Fold enrichment of variant densities in strain-specific relative to strain-similar peaks (y axes) for PU.1 (d), C/EBPα (e) and p65 (f) is shown relative to the peak centres (x axes). Ratios plotted in d and e are from data in Extended Data Fig. 2c and d, respectively.

Extended Data Figure 9 Validation of strain-specific enhancer activity.

a, Enhancer activity in transient reporter assays correlates with strain-specific LDTF and p65 binding. Luciferase assay results for 24 loci (20 strain-specific enhancers with strain-specific motifs, 1 positive control with strain-similar enhancer activity (row 7, column 3), 2 negative controls lacking enhancer activity in both strains (row 8, columns 1 and 2), and 1 strain-specific enhancer lacking a strain-specific motif (row 8, column 3)) in transiently transfected RAW264.7 cells 48 h after transfection. Each 1-kb locus is represented by the horizontal midline within a box (see Fig. 5). ChIP-seq tag pile-ups are shown for PU.1 (green), C/EBPα (blue), p65 (red), H3K27ac (purple) and H3K4me2 (orange) for C57BL/6J (above midline) and BALB/cJ (below midline) with identical scales. Binding/modification data are shown after treatment with 100 ng ml⁻¹ KLA. Vertical black lines indicate SNP locations. Horizontal bars indicate average luciferase (enhancer) activity of the empty vector (blue, no enhancer), activity of a locus cloned from either strain under basal conditions (grey; C57BL/6J above, BALB/cJ below), or after overnight stimulation with 100 ng ml⁻¹ KLA (pink). Luciferase values from transiently transfected cells were normalized to the activity measured for a co-transfected UB6 promoter-β-galactosidase reporter construct. Empty vector values were scaled to 0.5 for the first four loci, and to 1 for the remaining loci. Constructs in which the predicted motif-disrupting variant alleles were swapped are denoted by ‘M’, with mutations causing a significant effect in at least two out of three replicates being denoted by an additional asterisk (P < 0.05, one-sided t-test). Error bars show s.d. from three biological replicates, average values are indicated next to each bar. Experiments were replicated at least three times. Significant strain-specific enhancer activity is indicated by a dagger (grey without treatment, red after KLA treatment, one-tailed t-test, P < 0.05). b, Chromatinization is necessary for the strain specificity of a subset of enhancers. RAW264.7 cells were stably transfected with the two constructs containing the loci that showed strain-specific binding but lacked strain-specific enhancer activity in transient reporter assays (row 4, column 1 and row 1, column 3, marked by an asterisk). Luciferase activity measured in lysates of stably transfected cells was normalized to total protein content. RLU, relative light units.

Extended Data Figure 10 Motif analysis identifies causal SNPs in enhancers.

Regions of ∼1 kb size centred on PU.1 or C/EBPα ChIP-Seq peaks of similar tag count in C57BL/6J and BALB/cJ (t-test (P < 0.05) are marked with an asterisk. The strain and motif mutated by a variant are indicated below, denoted by the ‘m’ prefix. In the table, plus signs indicate whether a tested enhancer contains an alternative motif for the same factor, a variant at a motif position that is not located at a motif core as defined in Fig. 1g and Extended Data Fig. 4g, or a variant in a motif located less than 20 bp away from the peak centre. Characteristics of the loci and primer sequences are in Supplementary Table 3. b, Identifying causal variants by motif analysis. Left panels show the ChIP-Seq pile-ups and SNP locations as in Extended Data Fig. 9. Right panels plot the relative enhancer reporter luciferase activities of the loci shown on the left, either in the wild-type configuration or when swapping the SNP indicated by a black triangle by site-directed mutagenesis. Motifs mutated by the indicated SNPs are shown above, with the mutation underlined and in red. c, To confirm that the centrally located PU.1 motif is essential for the C57BL/6J-specific activity, a 1-kb fragment of the locus from C57BL/6J or BALB/cJ was cloned into the luciferase reporter as described in Fig. 5 and the effects of swapping alleles at the predicted causal PU.1 SNP and flanking control 5′ and 3′ SNPs on enhancer activity are shown. Swapping alleles at the PU.1 SNP reversed strain-specific enhancer activity, whereas swapping alleles at either flanking SNP had no significant effect.

Supplementary information

Supplementary Table 1 - HOMER-formatted motif files for the motifs used for strain-specific motif finding listed in Extended Data Figure 3a,b

The header rows, which begin with ">", list the consensus motif, the motif name, and the log-odds threshold above which a given sequence is considered to be positive for the motif. Below each header is the position weight matrix that lists the frequency of each nucleotide (A, C, G, T in the columns from left to right, respectively) at each position (rows) of the motif from top to bottom. (XLSX 41 kb)
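
The motif file layout described above can be read with a short parser. The following Python sketch relies only on what the description states (a ">" header carrying consensus, name and log-odds threshold, followed by one A, C, G, T row per motif position). Tab-separated header fields, the scoring function with a uniform 0.25 background and natural logarithm, and the example motif are assumptions for illustration, not a reproduction of HOMER itself.

```python
import math
from typing import NamedTuple

class Motif(NamedTuple):
    consensus: str
    name: str
    threshold: float  # log-odds score above which a sequence is positive
    pwm: list         # one [A, C, G, T] frequency row per motif position

def parse_motifs(text):
    """Parse motif records: a '>' header line, then the matrix rows."""
    motifs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # ASSUMPTION: header fields are tab-separated.
            fields = line[1:].split("\t")
            motifs.append(Motif(fields[0], fields[1], float(fields[2]), []))
        elif motifs:
            motifs[-1].pwm.append([float(x) for x in line.split()])
    return motifs

def log_odds_score(motif, sequence, background=0.25):
    """Sum of log(frequency / background) over the motif positions.

    ASSUMPTION: uniform background and natural log. Frequencies must be
    non-zero, otherwise math.log raises ValueError.
    """
    column = {"A": 0, "C": 1, "G": 2, "T": 3}
    return sum(math.log(row[column[base]] / background)
               for row, base in zip(motif.pwm, sequence.upper()))

# Hypothetical four-position motif, not one of the matrices from the table.
example = (">GGAA\thypothetical_motif\t2.0\n"
           "0.1\t0.1\t0.7\t0.1\n"
           "0.1\t0.1\t0.7\t0.1\n"
           "0.7\t0.1\t0.1\t0.1\n"
           "0.7\t0.1\t0.1\t0.1\n")
motif = parse_motifs(example)[0]
intact = log_odds_score(motif, "GGAA") > motif.threshold
mutated = log_odds_score(motif, "GCAT") > motif.threshold
```

Scoring the same position in each strain's genome sequence and comparing the two calls against the threshold is one way to flag a strain-specific motif: here the intact sequence passes and the doubly mutated one does not.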

Supplementary Table 2 - Strain-specific PU.1-bound loci where NOD broke the C57BL6/BALBc haplotypes

Loci are shown in rows. The number of variants at each region is shown between C57BL/6J and BALB/cJ in column 4. The number of variants with alleles matching the binding pattern observed across NOD, C57BL/6J, and BALB/cJ are shown in column 5. (XLSX 44 kb)

Supplementary Table 3 - Strain-similar loci cloned for luciferase reporter assays

The genomic location, variant information, strain-specific motif information, and primer sequences used to clone strain-similar loci are shown in columns for the 9 loci tested (data in Extended Data Figure 10a). (XLSX 36 kb)

About this article

Cite this article

Heinz, S., Romanoski, C., Benner, C. et al. Effect of natural genetic variation on enhancer selection and function. Nature 503, 487–492 (2013). https://doi.org/10.1038/nature12615

Received: 29 January 2013
Accepted: 29 August 2013
Published: 13 October 2013
Issue Date: 28 November 2013
DOI: https://doi.org/10.1038/nature12615
