Rapid profiling of transcription factor–cofactor interaction networks reveals principles of epigenetic regulation

Abstract
Transcription factor (TF)–cofactor (COF) interactions define dynamic, cell-specific networks that govern gene expression; however, these networks are understudied due to a lack of methods for high-throughput profiling of DNA-bound TF–COF complexes. Here, we describe the Cofactor Recruitment (CoRec) method for rapid profiling of cell-specific TF–COF complexes. We define a lysine acetyltransferase (KAT)–TF network in resting and stimulated T cells. We find promiscuous recruitment of KATs for many TFs and that 35% of KAT–TF interactions are condition specific. KAT–TF interactions identify NF-κB as a primary regulator of acutely induced histone 3 lysine 27 acetylation (H3K27ac). Finally, we find that heterotypic clustering of CBP/P300-recruiting TFs is a strong predictor of total promoter H3K27ac. Our data support clustering of TF sites that broadly recruit KATs as a mechanism for widespread co-occurring histone acetylation marks. CoRec can be readily applied to different cell systems and provides a powerful approach to define TF–COF networks impacting chromatin state and gene regulation.


Introduction
Gene expression is coordinated by transcription factor (TF) binding to cis-regulatory elements (CREs) throughout the genome (1,2). TF function is subsequently carried out by the recruitment of cofactors (COFs) that are not classical DNA-binding TFs (2) but are recruited to DNA by protein-protein interactions (PPIs) with DNA-bound TFs to modulate transcription. COFs perform diverse functions related to gene regulation, including modifying histones, remodeling chromatin and interacting with the transcriptional machinery (1,3-6). TF-COF interactions can be cell type specific (7,8), can change in response to different signals (9-13) and can be altered in a range of diseases (7,14,15). As such, TF-COF interactions define a dynamic network that governs gene regulation in the cell. Nevertheless, despite their central role in coordinating cellular responses, there is surprisingly limited information about cell-specific TF-COF networks, due primarily to a lack of methods for high-throughput profiling of TF-COF complexes.
Multiple high-throughput methods exist for assaying PPIs; however, each has features that complicate their use for rapid profiling of cell-specific, DNA-bound TF-COF complexes. The yeast two-hybrid assay (16-18) is widely used to profile pairwise PPIs but is unable to evaluate cell-type-specific in-

Nuclear extraction
The nuclear extract protocols are based on the well-established Dignam and Roeder procedure (33) and implemented as previously described (30,32) with modifications detailed below. Approximately 100 million cells were harvested for each nuclear extraction protocol. To harvest suspension cells, the cells were collected in a falcon tube and placed on ice. Suspension cells were pelleted by centrifugation at 500 × g for 5 min at 4 °C. Supernatant was collected and cell pellets were resuspended in 10 ml of 1× phosphate-buffered saline (PBS) with protease inhibitor and pelleted again at 500 × g for 5 min at 4 °C; the PBS wash was then aspirated, leaving behind the cell pellet. To collect HEK293 cells for nuclear extraction, medium was aspirated from the flask and 10 ml of warmed 1× PBS was used to gently wash the cells. PBS was aspirated off and 10 ml of ice-cold 1× PBS with protease inhibitor was added to the flasks. A cell scraper was used to lift the HEK293 cells from the flask, and cells were collected in a falcon tube and put on ice. Collected cells were pelleted at 500 × g for 5 min at 4 °C before PBS was aspirated. All remaining steps were performed identically for suspension and HEK293 cells. To lyse the plasma membrane, the cells were resuspended in 1.5 ml of Buffer A [10 mM HEPES, pH 7.9, 1.5 mM MgCl2, 10 mM KCl, 0.1 mM protease inhibitor, phosphatase inhibitor (Santa Cruz Biotechnology, catalog # sc-45044), 0.5 mM dithiothreitol (DTT; Sigma-Aldrich, catalog # 4315)] and incubated for 10 min on ice. After the 10-min incubation, Igepal detergent (final concentration of 0.1%) was added to the cell and Buffer A mixture, which was then vortexed for 10 s. To separate the cytosolic fraction from the nuclei, the sample was centrifuged at 500 × g for 5 min at 4 °C to pellet the nuclei. The cytosolic fraction was collected into a separate microcentrifuge tube. The pelleted nuclei were then resuspended in 100 μl Buffer C [20 mM HEPES, pH 7.9, 25% glycerol, 1.5 mM MgCl2, 0.2 mM ethylenediaminetetraacetic acid (EDTA), 0.1 mM protease inhibitor, phosphatase inhibitor, 0.5 mM DTT and 420 mM NaCl] and then vortexed for 30 s. To extract the nuclear proteins (i.e. the nuclear extract), the nuclei were incubated in Buffer C for 1 h while mixing at 4 °C. To separate the nuclear extract from the nuclear debris, the mixture was centrifuged at 21 000 × g for 20 min at 4 °C. The nuclear extract was collected in a separate microcentrifuge tube and flash frozen using liquid nitrogen. Nuclear extracts were stored at −80 °C.

CoRec experiments
Microarray DNA double-stranding and PBM protocols are as previously described (30,32,34,35). Any changes to the previously published protocols are detailed below. Double-stranded microarrays were pre-wetted in HBS (20 mM HEPES, 150 mM NaCl) containing 0.01% Triton X-100 for 5 min and then de-wetted in an HBS bath. Next, the array was incubated with a mixture of the binding reaction buffer [20 mM HEPES, pH 7.9, 100 mM NaCl, 1 mM DTT, 0.2 mg/ml bovine serum albumin, 0.02% Triton X-100, 0.4 mg/ml salmon testes DNA (Sigma-Aldrich, catalog # D7656)] and nuclear extract for 1 h in the dark. The array was then rinsed in an HBS bath containing 0.05% Tween 20 and subsequently de-wetted in an HBS bath. After the protein incubation, the array was incubated for 20 min in the dark with 20 μg/ml primary antibody for the COF of interest (Supplementary Table S1) diluted in 2% milk in HBS. After the primary antibody incubation, the array was rinsed in an HBS bath containing 0.1% Tween 20 and de-wetted in an HBS bath. Microarrays were then incubated for 20 min with 20 μg/ml of either Alexa 488 or Alexa 647 conjugated secondary antibody (Supplementary Table S1) diluted in 2% milk in HBS. The array was rinsed in an HBS bath containing 0.05% Tween 20 and then placed in a Coplin jar containing 0.05% Tween 20 in HBS. The array was agitated in solution in the Coplin jar at 125 rpm on an orbital shaker for 3 min and then placed in a new Coplin jar with 0.05% Tween 20 in HBS to repeat the washing step. It was then placed in a Coplin jar containing HBS and washed for 2 min as described above. After the washes, the array was de-wetted in an HBS bath. Microarrays were scanned with a GenePix 4400A scanner and fluorescence was quantified using GenePix Pro 7.2. Replicate fluorescence image scans highlighting reproducibility are provided (Supplementary Figure S1). Exported fluorescence data were normalized with MicroArray LINEar Regression (113). Analysis of normalized CoRec microarray data was performed using the publicly available CoRec analysis package (https://github.com/Siggers-Lab/CoRec).

CoRec microarray design
Nonredundant TF binding position weight matrix (PWM) models from the JASPAR 2018 core vertebrate set were obtained using the JASPAR 2018 R Bioconductor package. The 452 human models were collapsed into consensus sequences and filtered for equivalence based on nucleotide identity and relative sequence length (cutoff > 0.9). A sequence length filter was used to ensure that composite site models for two TFs (e.g. A + B) and half-site models (e.g. A or B) were both included in the final design. Filtering led to a final set of 346 TF models to be included in the final design. These 346 models represent ∼65% of human TFs based on comparison against human motifs in the CIS-BP database (36) (motif comparison P-value < 0.0005 using TomTom; MEME Suite version 5.3.3) (37-39). TF consensus binding sites were embedded within a 34-nt DNA probe sequence attached to a 24-nt common primer sequence. For all 346 consensus sequences, DNA probes corresponding to each possible single-nucleotide variant (SV) sequence were also included on the microarray. The 60-nt probe sequences were organized as follows: 2-nt GC cap + 34-nt binding site (TF site or SV probe) + 24-nt common primer. Two hundred sixty-one background target DNA probes were included in the design to estimate background fluorescence intensities in the experiments. Background probe sequences were randomly selected from the human genome (hg38).
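As an illustration of the probe layout described above, the following minimal Python sketch enumerates all single-nucleotide variants of a consensus site and assembles a 60-nt probe (2-nt cap + 34-nt site region + 24-nt primer). The function names, the placeholder filler used to pad a site to 34 nt and the placeholder primer are assumptions made for illustration; the actual flanking and primer sequences are not given in this text.

```python
from typing import List

BASES = "ACGT"


def single_nucleotide_variants(site: str) -> List[str]:
    """Return every sequence that differs from `site` at exactly one position."""
    variants = []
    for i, ref in enumerate(site):
        for alt in BASES:
            if alt != ref:
                variants.append(site[:i] + alt + site[i + 1:])
    return variants


def embed_site(site: str, length: int = 34, filler: str = "A") -> str:
    """Center `site` in a `length`-nt region (hypothetical filler base, not the real flanks)."""
    if len(site) > length:
        raise ValueError("site is longer than the probe region")
    left = (length - len(site)) // 2
    right = length - len(site) - left
    return filler * left + site + filler * right


def build_probe(site_region: str, primer: str, cap: str = "GC") -> str:
    """Assemble cap + 34-nt binding-site region + 24-nt common primer."""
    if len(cap) != 2 or len(site_region) != 34 or len(primer) != 24:
        raise ValueError("expected a 2-nt cap, 34-nt site region and 24-nt primer")
    return cap + site_region + primer


# Hypothetical consensus and placeholder primer, for illustration only.
consensus = "GGGAATTTCC"
placeholder_primer = "T" * 24
probe_set = [consensus] + single_nucleotide_variants(consensus)
probes = [build_probe(embed_site(s), placeholder_primer) for s in probe_set]
```

A consensus of length n yields 3n variant probes, so each consensus + SV probe set contains 3n + 1 sequences.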

CoRec motif generation and motif strength
Log-transformed PBM fluorescence values (median fluorescence over five replicate probes) were normalized against background fluorescence levels to yield a z-score that quantifies binding to each probe sequence:

$$z = \frac{f - \mu_{\mathrm{bg}}}{\sigma_{\mathrm{bg}}}$$

where $f$ is the log fluorescence value of the probe, $\mu_{\mathrm{bg}}$ is the mean background log fluorescence value and $\sigma_{\mathrm{bg}}$ is the standard deviation of the background log fluorescence values. COF binding specificity to each consensus + SV probe set was modeled using a z-score binding motif that captures the change in binding for every SV substitution across the consensus binding site. z-score motifs are defined as

$$z_{ik} - \mu_i$$

where $z_{ik}$ is the z-score for nucleotide variant $k$ at position $i$ of the motif and $\mu_i$ is the median z-score for all nucleotide variants at position $i$. The z-score binding motif is akin to a binding energy matrix that quantifies the impact of nucleotide variants on the total binding energy. COF binding strength is quantified using a motif strength score based on the median z-score of the 10 top-scoring probes (∼15% of probes) contributing to the motif (i.e. for this consensus + SV probe set) (Supplementary Files S1 and S2). We found that quantifying motif strength over a subset of top-scoring probes was more robust than using only the top-scoring probe or the seed probe.
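The quantities defined in this paragraph can be sketched in a few lines of Python. This is an illustrative sketch, not the published CoRec package: it assumes the standard form z = (f − μ_bg)/σ_bg implied by the definitions above, uses the sample standard deviation (the text does not say which estimator is used), and takes the per-position median over all four nucleotides at a position. All function names are hypothetical.

```python
from statistics import mean, median, stdev
from typing import Dict, Iterable, List


def probe_z_score(log_fluorescence: float, background: List[float]) -> float:
    """z-score of one probe relative to background log-fluorescence values."""
    mu_bg = mean(background)
    sigma_bg = stdev(background)  # assumption: sample standard deviation
    return (log_fluorescence - mu_bg) / sigma_bg


def z_score_motif(columns: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Center each motif position on its median z-score (z_ik - mu_i)."""
    motif = []
    for column in columns:
        mu_i = median(column.values())
        motif.append({base: z - mu_i for base, z in column.items()})
    return motif


def motif_strength(probe_z_scores: Iterable[float], top_n: int = 10) -> float:
    """Median z-score of the top_n highest-scoring probes in a probe set."""
    top = sorted(probe_z_scores, reverse=True)[:top_n]
    return median(top)
```

Taking the median over the top 10 probes rather than the single best probe follows the robustness argument made in the text.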
To allow for motif comparison to published PWM-type TF binding models, the z-score motifs were also transformed into position probability matrices (PPMs) using a Boltzmann distribution formalism and the calculated motif strength (MS) (Supplementary Files S3 and S4): β is calculated as follows for each individual CoRec motif: The adaptive β function addresses the fact that the z-score change with nucleotide variants is dependent on overall binding strength quantified by the motif score. To account for this, β values are scaled based on the motif score. Various functional forms were tried and gave comparable results.
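A Boltzmann-type transformation of a z-score motif into a PPM can be sketched as below. This is a generic sketch under stated assumptions: each column is converted with weights proportional to exp(β·z), and β is passed in as a plain number because the adaptive β function is not recoverable from this text. The function name is hypothetical.

```python
import math
from typing import Dict, List


def z_motif_to_ppm(z_motif: List[Dict[str, float]], beta: float) -> List[Dict[str, float]]:
    """Convert per-position z-scores to probabilities with Boltzmann weights.

    beta is an input here; the source scales beta by motif strength using a
    function that is not reproduced in this sketch.
    """
    ppm = []
    for column in z_motif:
        z_max = max(column.values())  # subtract the maximum for numerical stability
        weights = {base: math.exp(beta * (z - z_max)) for base, z in column.items()}
        total = sum(weights.values())
        ppm.append({base: w / total for base, w in weights.items()})
    return ppm
```

Because each column is normalized independently, subtracting a per-position constant (such as the median z-score) leaves the resulting probabilities unchanged, so the sketch accepts either raw or median-centered z-scores.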
The fluorescence value for each spot (i.e. cluster of many thousands of identical DNA molecules) is an aggregate measurement of all protein complexes that recruit the target COF. Given that our motifs match well to canonical TF binding motifs, we interpret the motifs as indicating simple TF-COF complexes. However, we cannot rule out the possibility of higher order complexes bridging across multiple DNA molecules. Additionally, we cannot rule out the possibility that complexes are stabilized by free DNA ends. However, we report only high-affinity binding that is supported by motifs that match known TF binding motifs.

TF motif cluster generation
Motif clusters were generated using a set of 946 human TF binding models from the JASPAR 2022 (37) CORE database. Pairwise motif comparison was performed using TomTom (MEME Suite version 5.3.3) (37-39) with default settings. The distance between two motifs was defined as max(15 + log10(P-value), 0) using the TomTom returned P-values. Initial motif clusters were created using agglomerative clustering with the complete linkage method (i.e. clusters were combined based on the maximal distance between two motifs from the different clusters). Clusters were then manually curated to account for the canonical TF families represented (Supplementary File S5). For integration with publicly available PPI datasets, we mapped individual human TFs to our TF motif clusters. To do this, we mapped 6445 human TF motifs in the CIS-BP database to our TF motif clusters. CIS-BP motifs were assigned to their best match (P-value < 0.0005, TomTom default settings). At this stringency, 65% of ∼1200 human TFs can be assigned to TF motif clusters (Supplementary File S5).
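The distance definition and the complete-linkage rule described above can be sketched as follows. The distance function follows the formula in the text; the naive clustering loop and its `cutoff` parameter are assumptions for illustration, since the text does not state where the tree was cut before manual curation.

```python
import math
from typing import Dict, FrozenSet, List


def motif_distance(p_value: float) -> float:
    """Distance between two motifs: max(15 + log10(P-value), 0)."""
    return max(15.0 + math.log10(p_value), 0.0)


def complete_linkage(names: List[str],
                     dist: Dict[FrozenSet[str], float],
                     cutoff: float) -> List[List[str]]:
    """Naive agglomerative clustering with complete linkage.

    `cutoff` is a hypothetical stopping threshold, not a value from the source.
    """
    clusters = [[name] for name in names]

    def linkage(a: List[str], b: List[str]) -> float:
        # Complete linkage: the largest pairwise distance between the clusters.
        return max(dist[frozenset((x, y))] for x in a for y in b)

    while len(clusters) > 1:
        best = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        d, i, j = best
        if d > cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical P-values for three motifs, for illustration only.
p_values = {("A", "B"): 1e-12, ("A", "C"): 1e-2, ("B", "C"): 1e-3}
distances = {frozenset(pair): motif_distance(p) for pair, p in p_values.items()}
example_clusters = complete_linkage(["A", "B", "C"], distances, cutoff=5.0)
```

With this definition, any pair of motifs with a P-value of 1e−15 or smaller has distance 0, and a P-value of 1 gives the maximum distance of 15.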

Matching COF motifs to TF binding motifs
To infer the identity of TFs recruiting each COF, CoRec motifs were matched to databases of TF binding motifs. All data presented in the main section of this paper are presented at the summarized level, after passing all quality, replicate and condition filters. Motif quality filter: First, we filtered out low-quality CoRec motifs by requiring that the motif strength be above 0.4 (i.e. MS > 0.4) and the minimum average information content (averaged over windows of five consecutive positions) be above 1.0. Replicate filter: Second, we identified reproducible motifs by combining replicate experiments. We found that requiring replicate motifs to match the same reference motif, or group of similar motifs, to be considered replicated could overestimate differences in our data since small differences in nucleotide frequencies can alter motif matching. Therefore, we used a more direct Euclidean distance (ED) metric computed between replicate PPM motifs to determine whether they were consistent. To be considered replicated, we required an ED < 0.4 between motifs derived from the same consensus + SV probes. If more than two replicates were present, motifs were considered replicated if the ED < 0.4 between at least two consensus + SV probes from that replicate set. CoRec motifs meeting these thresholds were then each compared to the set of 946 reference motifs using the memes Bioconductor R package (version 1.2.5), an R wrapper of the MEME Suite (version 5.3.3) (37). Comparisons were done with the run_tomtom() function using ED and requiring a minimum overlap of five nucleotides between the target and query motifs (dist = 'ed', min_overlap = 5). Matches were considered significant if the adjusted P-value was < 0.01. Adjusted P-values were calculated by TomTom (38,39) as the raw P-value multiplied by the number of motifs in the reference library. Critically, replicate CoRec motifs were assigned to a motif cluster according to the single replicate motif with the lowest TomTom adjusted P-value (i.e. based on the best match). Condition filter: When comparing across cell types or stimulation conditions, we again require that motifs meet our replicate filtering criteria and then use an additional ED metric to determine whether binding is replicated across conditions. We then assign the motif groups (i.e. across replicates and conditions) to a single motif cluster according to a best-match criterion. For across-condition or cell-type comparisons, we require that at least one replicate motif from condition 1 (or cell type 1) matches one replicate motif from condition 2 (or cell type 2) with ED < 0.25. If this criterion is met, we annotate the motifs as conserved across the conditions (Supplementary Files S6 and S7).
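The replicate and condition filters can be sketched as simple distance checks between PPMs. This is a minimal sketch, not the published implementation: the text does not specify how the ED is aggregated over motif positions, so the mean of per-position Euclidean distances is used here as an assumption, and all function names are hypothetical.

```python
import math
from typing import Dict, List

PPM = List[Dict[str, float]]


def ppm_distance(a: PPM, b: PPM) -> float:
    """Mean per-position Euclidean distance between two equal-length PPMs.

    Assumption: averaging over positions; the source does not state the exact
    aggregation used for its ED thresholds.
    """
    if len(a) != len(b) or not a:
        raise ValueError("PPMs must be non-empty and of equal length")
    per_position = [
        math.sqrt(sum((col_a[base] - col_b[base]) ** 2 for base in "ACGT"))
        for col_a, col_b in zip(a, b)
    ]
    return sum(per_position) / len(per_position)


def is_replicated(replicates: List[PPM], threshold: float = 0.4) -> bool:
    """True if at least one pair of replicate motifs lies within the threshold."""
    return any(
        ppm_distance(replicates[i], replicates[j]) < threshold
        for i in range(len(replicates))
        for j in range(i + 1, len(replicates))
    )


def is_conserved(condition_1: List[PPM], condition_2: List[PPM],
                 threshold: float = 0.25) -> bool:
    """True if any replicate from condition 1 matches any replicate from condition 2."""
    return any(ppm_distance(a, b) < threshold for a in condition_1 for b in condition_2)


def adjusted_p_value(raw_p: float, n_reference_motifs: int) -> float:
    """Raw P-value multiplied by the reference library size, as described in the text."""
    return raw_p * n_reference_motifs
```

Comparing PPMs directly avoids the instability noted in the text, where small shifts in nucleotide frequencies can change which reference motif is the best match.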

RNA-seq, proteomic and PPI analyses
RNA-seq analyses: To examine gene expression values in resting Jurkat and SUDHL4 cells, RNA sequencing (RNA-seq) data were obtained from the DepMap consortium (40). To examine induced gene expression in T-cell receptor (TCR)/CD28-activated Jurkat T cells, we used data from (41). Differential expression analysis was performed using the edgeR package (release 3.18) (42). Preprocessed expected count profiles for each cell type were used as input, with filters for 100 reads required for each gene and a bcv value set to 0.2. Genes with log2(FC) > 2 or log2(FC) < −2 (FC = fold change) were determined to be differentially expressed (Supplementary File S8). Proteomic analyses: To examine protein levels across cell types, normalized relative protein expression levels for resting Jurkat and SUDHL4 cells were obtained from the DepMap consortium and used as provided (43). Proteins with log2(FC) > 2 or log2(FC) < −2 were determined to be differentially expressed between the cell lines (Supplementary File S9). PPI analyses: To compare our CoRec-defined interactions with PPI data from public databases, individual TF-COF interactions in PPI databases were mapped onto TF motif cluster annotations using 6445 human TF motifs in CIS-BP (described above). Using this mapping, we could compare COF interactions with TF motif clusters across PPI and CoRec datasets.

ChIP-seq experiments
Jurkat cells in complete growth medium at a concentration of 1 × 10⁶ cells/ml were collected in a conical tube. For each sample, 4 × 10⁷ cells were cross-linked with 1% formaldehyde (final concentration) (Thermo Fisher, catalog # 033314.AP) for 10 min at room temperature (RT) with gentle rotation. Cross-linking was stopped by adding glycine solution in PBS to a final concentration of 125 mM, and cells were rotated at RT for 5 min. Fixed cells were pelleted at 1000 × g for 10 min at 4 °C, washed twice with 14 ml of ice-cold PBS and pelleted at 1000 × g for 10 min at 4 °C each time. The washed cell pellet was resuspended in 1 ml of ice-cold PBS, transferred to a 1.5-ml lo-bind DNA tube (Eppendorf, catalog # 022431005), centrifuged at 2400 × g for 5 min at 4 °C, flash frozen with liquid nitrogen and stored at −80 °C for later use.
The same procedure was repeated with lysis buffer 2 (200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA (ethylene glycol tetraacetic acid), 10 mM Tris-HCl, 0.1 mM protease inhibitor) at RT followed by pelleting at 10 000 × g for 5 min at 4 °C. Pellets were washed with 400 μl of sonication buffer [0.1% sodium dodecyl sulfate (SDS) in TE (Tris-EDTA)] and then resuspended in 1.4 ml of sonication buffer. Resuspensions were then split into two 1.5-ml lo-bind DNA tubes for sonication (each tube with 700 μl of liquid). During sonication, each tube was placed in a 1.5-ml microcentrifuge tube placed in a Benchtop 1.5-ml Tube Cooler (Active Motif, catalog # 53076). The nuclei were sonicated using an Active Motif Q120AM sonicator with a 2-mm probe (Active Motif, catalog # 53056) at 40% amplitude for 40 min with 30 s ON and 30 s OFF cycles (80 cycles total). Cell debris was pelleted at 21 000 × g for 10 min at 4 °C. Soluble chromatin was transferred to a new 1.5-ml lo-bind DNA tube, and 300 μl of sonication buffer was added along with additional reagent to create RIPA buffer (150 mM NaCl, 0.1% sodium deoxycholate, 1% Triton X-100, 5% glycerol). Thirty microliters of the combined soluble chromatin was saved to be checked for chromatin shearing upon reverse cross-linking via 1.5% agarose gel and Bioanalyzer 2100 using the DNA High Sensitivity Kit (Agilent, catalog # 5067-4626). The rest of the soluble chromatin mixture was mixed by end-over-end agitation for 10 min at 4 °C and then flash frozen and stored at −80 °C.
For immunoprecipitation, each chromatin sample was thawed on ice 2 h prior to the experiment and precleared with 20 μl of washed Protein A/G dynabeads via incubation for 1 h on a rocking platform at 4 °C. Precleared chromatin was separated from beads via a spin at 12 000 × g for 10 min at 4 °C and supernatant was transferred to new 1.5-ml lo-bind DNA tubes.
DNA concentration was then measured via the Qubit dsDNA HS Assay Kit (Invitrogen, catalog # Q32851). For immunoprecipitation, 25 μg of precleared chromatin and 3 μg rabbit polyclonal anti-H3K27ac antibody (Abcam, ab4729) were mixed together and nutated overnight at 4 °C. The next day, 14.5 μl of washed Protein A dynabeads were added to each sample and the mixture was nutated for 4 h at 4 °C. After 4 h, samples were placed on the magnet and supernatant was gently removed. Seven hundred microliters of wash buffer 1 (150 mM NaCl, RIPA buffer) was added to the sample and nutated for 15 min before samples were placed on the magnet and supernatant was removed. This process was repeated for wash buffer 2 (400 mM NaCl, RIPA buffer), wash buffer 3 (TE, 250 mM LiCl, 0.5% sodium deoxycholate, 0.5% NP-40) and wash buffer 4 (TE, 0.02% Triton). After removing wash buffer 4, beads were resuspended in elution buffer (TE, 250 mM NaCl, 0.3% SDS) supplemented with 0.8 U of Proteinase K and moved to 0.2-ml polymerase chain reaction tubes. The samples were then uncross-linked in the thermocycler at 42 °C for 30 min, 65 °C for 5 h and 15 °C for 10 min. ChIP DNA was then purified using the Qiagen MinElute Reaction Cleanup Kit (catalog # 28204) and eluted in 12 μl of 1× TE buffer. Sample concentrations were measured via Qubit and library preparation was performed using the NEBNext Ultra II DNA Library Prep Kit for Illumina (NEB, catalog # E76455S) following the provider's instruction manual. Amplified libraries were bioanalyzed again to check the library preparation success. Samples were pooled by molarity and sequenced on a NovaSeq 6000-S4 to obtain ∼35-40 million reads per sample.
To determine differential H3K27ac peaks between the untreated and the TCR-stimulated samples, we used the R Bioconductor package DiffBind (release 3.4.11). Replicate bam read and peak files were used as inputs. Samples were normalized and differential peaks were called using default settings, with the comparison made between the treatment groups. DiffBind-calculated log(FC) and false discovery rate (FDR) values were used to determine differential peaks between the two sets of samples. All DiffBind H3K27ac peaks were linked to genes using the R package ChIPseeker (v1.30.3) (48) with GRCh38, Ensembl release 102 (46) and Bioconductor org.Hs.eg.db (v3.8.2), with tssRegion set to −3000 to 3000 (default setting). If a gene had multiple peaks in which some changed and some did not, it was called an induced or reduced gene and was filtered out of the 'no change' group. The NF-κB target gene list was obtained from https://www.bu.edu/nf-kb/gene-resources/target-genes/.

Motif analysis
Motifs were identified in genomic regions (e.g. H3K27ac peaks or gene promoters) using FIMO (--max-strand) (version 5.3.3) (37,38). Regions were scored using −log10(P-value) of the single best occurrence of each motif. If a sequence had no occurrences of a given motif with a P-value ≤ 1e−5, the sequence was assigned a score of 0 for that motif. To assign scores to TF motif clusters (as opposed to single motifs), all motifs associated with the TF cluster were evaluated and the maximum motif score was used. Motif enrichment between sets of genomic elements (e.g. promoters with induced acetylation relative to unchanged promoters) was quantified using the two-sided Wilcoxon rank-sum test.
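The region and cluster scoring rules described above reduce to two small functions. The sketch below assumes the per-occurrence P-values for a motif in a region have already been obtained (e.g. from a FIMO output table); the function names and example motif identifiers are hypothetical.

```python
import math
from typing import Dict, Iterable, List


def region_motif_score(p_values: Iterable[float], threshold: float = 1e-5) -> float:
    """Score a region by -log10(P) of its best motif occurrence, or 0 if none pass."""
    passing = [p for p in p_values if p <= threshold]
    if not passing:
        return 0.0
    return -math.log10(min(passing))


def cluster_score(scores_by_motif: Dict[str, float], cluster_members: List[str]) -> float:
    """Score a TF motif cluster as the maximum score over its member motifs."""
    return max((scores_by_motif.get(m, 0.0) for m in cluster_members), default=0.0)


# Hypothetical occurrences for two motifs in one promoter, for illustration only.
scores = {
    "motif_1": region_motif_score([1e-6, 1e-8]),
    "motif_2": region_motif_score([1e-3]),
}
promoter_cluster_score = cluster_score(scores, ["motif_1", "motif_2"])
```

Taking the maximum over cluster members means a region is credited with the strongest site for any TF in the cluster, which matches the cluster-level resolution at which TFs are inferred.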
For motif analyses of gene promoters, we defined promoters as the region from 500 bp upstream to 100 bp downstream of the transcription start site (TSS) for a protein coding gene (GRCh38, Ensembl release 102). Alternative TSSs were not considered. Acetylated promoters were defined as induced if they overlapped DiffBind regions [log2(FC) ≥ 1.5] or as unchanged if they overlapped DiffBind regions [log2(FC) ≤ 0.1]. Promoter acetylation level was defined as the maximum Genrich score [i.e. −log10(P-value)] for any overlapping H3K27ac ChIP-seq peak from either condition. Unacetylated promoters were identified as promoters that overlapped an ATAC-seq peak but not an H3K27ac peak. ATAC-seq peaks were defined using publicly available data in resting Jurkat cells [Gene Expression Omnibus (GEO) accession number GSM4706085] and identified using Genrich v0.6 with ATAC-seq mode enabled (-j). Promoters that were unacetylated in both conditions were assigned an acetylation score of 0.
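The promoter definition and the induced/unchanged labels can be sketched as follows. This is an illustrative sketch: the coordinate convention (how interval ends are counted) is not specified in the text and is an assumption here, the thresholds are applied exactly as written above, and the function names are hypothetical.

```python
from typing import Optional, Tuple


def promoter_window(tss: int, strand: str,
                    upstream: int = 500, downstream: int = 100) -> Tuple[int, int]:
    """Return (start, end) spanning 500 bp upstream to 100 bp downstream of the TSS.

    Upstream and downstream are strand-aware; the exact end-point convention
    is an assumption of this sketch.
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        return tss - downstream, tss + upstream
    raise ValueError("strand must be '+' or '-'")


def acetylation_class(log2_fc: float) -> Optional[str]:
    """Label a promoter by the log2(FC) of its overlapping DiffBind region."""
    if log2_fc >= 1.5:
        return "induced"
    if log2_fc <= 0.1:
        return "unchanged"
    return None  # between the two thresholds: not assigned to either group
```

Leaving promoters with intermediate fold changes unassigned keeps the induced and unchanged sets well separated for the enrichment comparison.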

CoRec is an array-based method to profile cell-specific TF-COF interactions
To examine TF-COF networks in different cell types and stimulus conditions, we have developed the CoRec method to profile TF-COF complexes present in cell nuclear extracts. The CoRec approach is built upon the PBM technology for the high-throughput measurement of protein-DNA binding (34,49,50) and extends work from our lab using PBMs to analyze the DNA binding of TF-COF complexes from cell extracts (30-32). In CoRec, we apply nuclear extracts to a DNA microarray and profile the recruitment of a target COF to thousands of customized DNA sequences representing different TF binding sites. We infer the identity of the TFs involved in COF recruitment based on the pattern of DNA sites to which the COF is recruited. Therefore, many TF-COF complexes can be characterized simultaneously, providing a rapid and high-throughput assay to identify TF-COF complexes functioning in a cell. Briefly, nuclear extracts are applied to a double-stranded DNA microarray, and DNA-bound COFs are labeled using fluorescently labeled antibodies (Figure 1A). The CoRec microarray contains consensus binding sites that span the DNA binding site specificity of ∼65% of human TFs, allowing us to profile COF recruitment by the majority of human TFs (see the 'Materials and methods' section). Critically, for each of the 346 consensus binding sites we also include all SV sites on the microarray, defining 346 consensus + SV probe sets on the array (Figure 1B). By profiling the differential recruitment of each COF to the consensus + SV probe sets, we can define COF 'recruitment motifs' that can be matched against large motif databases [e.g. CIS-BP (36) and JASPAR (51)] to infer the identity of interacting TFs at the level of TF family (31,32) (Figure 1C and D). The CoRec approach is inherently multiplexed as COF recruitment to all 346 TF sites is assayed in parallel. Eight individual COFs can be profiled per microarray, allowing thousands of potential TF-COF complexes to be profiled in a single experiment.
COF recruitment to each motif is quantified by a motif strength calculated from the PBM fluorescence intensities for each consensus + SV probe set. The motif strength depends on TF concentration, TF-DNA binding affinity and TF-COF interaction affinity (Figure 1E). As such, motif strength provides an aggregate measure of COF recruitment by all TFs in the cell that bind to a given DNA motif. For example, the ETS motif strength quantifies the aggregate ability of the ETS family members present in the extract to bind that motif and recruit a given COF. Consequently, the CoRec-defined motif strength represents a cell-specific recruitment activity for individual TF DNA binding site motifs.
To evaluate the applicability and reproducibility of CoRec, we performed experiments for the COFs P300 (acetyltransferase), BRD4 (scaffold protein) and TBL1XR1 (subunit of NCOR/SMRT repressor complexes) using extracts from three different cell lines: HEK293 (embryonic adrenal precursor cells) (52), Jurkat (T-cell leukemia cells) and SUDHL4 (B-cell lymphoma cells). Replicate experiments showed excellent agreement in the COF recruitment to all consensus and SV probes (Figure 1F-H). The measured COF recruitment strength is quantified by a z-score based on the probe fluorescence values (see the 'Materials and methods' section). To demonstrate the sensitivity of CoRec in quantifying differential COF recruitment to DNA variants across a range of binding strengths, we highlight COF recruitment to consensus + SV probe sets derived from motifs for E2F7, ZBED1 and MAF:NFE2 (heterodimer) (Figure 1F-H). Despite the wide range of z-scores across the indicated probe sets, the COF recruitment motifs are in excellent agreement with published TF motifs (Figure 1I-K) and allow us to infer the identity of TF-COF complexes binding with different strengths across a range of DNA sites (Figure 1E). These results demonstrate that CoRec is a robust method to investigate a broad range of TF-COF complexes present in cell extracts.

CoRec identifies cell-specific TF-COF interaction networks
To evaluate our ability to define cell-specific TF-COF interaction networks, we examined recruitment of the COFs P300, BRD4 and TBL1XR1 to our full CoRec microarray using extracts from three cell types. The TF-COF interaction network is represented as a heatmap indicating significant matches of COF motifs to TF motifs (Figure 2A; see the 'Materials and methods' section). A heatmap with individual replicates is provided to highlight reproducibility (Supplementary Figure S2). As many TFs bind DNA with similar sequence specificity, we collapsed TFs into 184 TF specificity 'clusters' based on the similarity of their known binding motifs (Supplementary File S5; see the 'Materials and methods' section). Many TF clusters correspond to conventional TF families (e.g. NFKB_REL cluster); however, some TF families were split or combined based on motif similarity (e.g. MYBL_OVOL is a combined cluster). The intensity of each cell in the heatmap corresponds to the maximum COF motif strength for that TF cluster and represents how strongly a COF is recruited to DNA by TFs in that cluster.
We identified 28 distinct TF clusters that recruit the three COFs (P300, BRD4 and TBL1XR1) across the three cell types. Comparing the average number of TF clusters per COF, we find the most diverse recruitment for P300 (8.7 clusters), then TBL1XR1 (6.3 clusters) and finally BRD4 (5.0 clusters) (Figure 2B). P300 is a broad transcriptional activator with defined subdomains that interact with a variety of TFs (53,54). It has been suggested that P300 must interact with multiple TFs to stabilize its own binding at chromatin, and thus interacting with a large number of TFs may be necessary for P300's function (55). While BRD4 can directly interact with TFs, contributing to site-specific recruitment, it also contains two bromodomains that bind acetylated histones (56,57). As our assay involves short (60-bp) unchromatinized DNA probes, it is of note that BRD4 has the lowest average number of recruitment clusters, consistent with a model in which BRD4 recruitment in vivo is mediated by interactions with both TFs and chromatin.
To determine whether the number of CoRec-defined TF-COF interactions is comparable to other approaches, we compared our results to published PPI datasets (Figure 2D). To evaluate TF-COF interactions in a single cell type, we first examined data from cell-specific PPI predictions (58) and proximity labeling approaches (59). We find comparable numbers of TF interactions per COF between CoRec (average of 6.3 clusters per COF) and these published approaches [average of 3 and 9 clusters per COF for cell-specific PPI predictions (58) and BioID (59), respectively]. To understand how our approach compared to larger datasets, we examined the number of TF-COF interactions reported for each COF in public PPI databases [STRING (v12.0, physical subnetwork) (60), HIPPIE (v2.3) (61), BioGrid (release 4.4.221) (62) and APID (version: March 2021) (63)] (Figure 2D). These databases aggregate PPI data measured using different approaches (e.g. yeast two-hybrid, co-immunoprecipitation, etc.) and cell types. As expected, the number of PPIs reported for each COF in the databases is higher than the number of CoRec-detected interactions given their broad inclusion criteria. However, it was surprising that our CoRec data for TBL1XR1 in only three cell types captured half of the reported interactions in the largest database (14 clusters from CoRec, 28 in STRING), suggesting that interactions in TBL1XR1 are undersampled in existing databases. Despite the large number of PPIs currently represented in the public databases, 26% (12/46) of the COF-TF cluster interactions we identified have not been previously reported (Supplementary Figure S3A). Specifically, we identified five unreported interactions each for TBL1XR1 and BRD4, which have the fewest reported interactions in the databases, and two new TF interactions for P300.
Of the 28 TF clusters that recruit the three COFs, we found that 68% (19/28) were observed in only one cell type, 14% (4/28) were found in two cell types and 18% (5/28) were observed across all three cell types (Figure 2C). The most broadly recruiting TF cluster is the ETS cluster, which recruits multiple COFs in all three cell types. ETS TFs comprise a large family with 28 members in humans (64). ETS factors regulate housekeeping genes in a number of cell types (65,66). Further, they can bind redundantly to regulate housekeeping genes and genes that are constitutively expressed at high levels (65,67,68). Examining the STRING PPI database, we find that the ETS cluster has the most interactions (718 individual ETS-COF interactions) of any of our reported TF clusters, supporting their broad COF interaction ability. Our data suggest that promiscuous COF recruitment across cell types is a feature of the ETS cluster and that more generally this may be a feature of TFs that regulate constitutively expressed genes.
In contrast to the broadly acting TF clusters, cell-type-specific interactions can highlight cell-specific TF functions. For example, the MEF and BCL6 clusters are cell type specific and only recruit COFs in SUDHL4 B cells. BCL6 and MEF2B (member of the MEF cluster) are both crucial regulators of B-cell development, and changes in their COF interactions have been implicated in diffuse large B-cell lymphoma (DLBCL) (69-72). In our dataset, BCL6 interacts strongly with TBL1XR1. Notably, the BCL6-TBL1XR1 interaction is rarely disrupted in germinal center B-cell DLBCL, the subtype to which SUDHL4 belongs, but is often disrupted in the more aggressive DLBCL subtype known as activated B-cell DLBCL (72). These results support the ability of CoRec to identify functionally relevant TF-COF interactions in a cell-type-specific manner. We further note that because CoRec assays endogenous proteins, it provides a straightforward method to profile the impact of cell-specific mutations on TF-COF interactions and DNA binding, as exemplified by the BCL6-TBL1XR1 interactions in DLBCL subtypes.

TF-COF binding motifs can differ from canonical TF binding motifs
A central feature of CoRec is that recruitment motifs are determined independently for each COF. Comparison of COF motifs can reveal both COF- and cell-specific differences that suggest additional means for achieving gene regulatory specificity. For example, the NFAT cluster recruits P300 in both Jurkat and SUDHL4 cells, but the recruitment motifs differ between the cell types. In Jurkat cells, the P300 recruitment motif reveals a strong preference for a guanine at position 8 (5′-NNTTTCCGNN-3′), whereas in SUDHL4 cells there is a strong preference for adenine at this position (5′-NNTTTCCANN-3′) (Figure 2E). NFAT family member motifs appear to have a relatively equal preference for A and G at this position, and previous studies have noted NFATC2 binding to both of these types of sequences, albeit with varying affinities (73). Strikingly, in SUDHL4 cells, BRD4 and TBL1XR1 are also recruited to NFAT motifs with the same preference for adenine at position 8, suggesting that it is a cell-type-specific difference (Supplementary Figure S3B).
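The position-specific comparison described above can be illustrated with a minimal sketch. The frequency matrices below are toy constructions built from the two consensus strings quoted in the text, not the measured CoRec motifs, and the helper names (`make_pfm`, `preferred_base`, `differing_positions`) and the 0.5 frequency cutoff are assumptions introduced only for illustration.

```python
BASES = "ACGT"

def make_pfm(consensus, strong=0.85):
    """Build a toy position frequency matrix from a consensus string.

    'N' columns get uniform frequencies; other columns put `strong`
    on the consensus base and split the remainder evenly.
    """
    weak = (1.0 - strong) / 3
    pfm = []
    for ch in consensus:
        if ch == "N":
            pfm.append({b: 0.25 for b in BASES})
        else:
            pfm.append({b: (strong if b == ch else weak) for b in BASES})
    return pfm

def preferred_base(pfm, position):
    """Return (base, frequency) for the most frequent base at a 1-based position."""
    column = pfm[position - 1]
    base = max(BASES, key=lambda b: column[b])
    return base, column[base]

def differing_positions(pfm_a, pfm_b, min_freq=0.5):
    """List 1-based positions where both motifs show a clear but different preference."""
    diffs = []
    for pos in range(1, min(len(pfm_a), len(pfm_b)) + 1):
        base_a, freq_a = preferred_base(pfm_a, pos)
        base_b, freq_b = preferred_base(pfm_b, pos)
        if freq_a >= min_freq and freq_b >= min_freq and base_a != base_b:
            diffs.append((pos, base_a, base_b))
    return diffs

# Toy stand-ins for the P300 recruitment motifs in the two cell types.
jurkat_p300 = make_pfm("NNTTTCCGNN")
sudhl4_p300 = make_pfm("NNTTTCCANN")
print(differing_positions(jurkat_p300, sudhl4_p300))  # [(8, 'G', 'A')]
```

Requiring a minimum frequency in both motifs keeps uninformative positions (the N columns) from being reported as differences.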
COF-specific differences within individual cell types were also observed. For example, TBL1XR1 and P300 are recruited to the TCF7_LEF cluster in Jurkat cells, but the motifs for each COF differ in distinct ways. For P300, position 11 of the motif shows a distinct preference for cytosine, whereas for TBL1XR1 there is little or no base preference at this position (Figure 2F). In contrast, the TCF7_LEF cluster member motifs all indicate a strong-to-moderate preference for guanine at this position (represented by the LEF1 motif for illustration). While the weak preference shown for TBL1XR1 may result from differences in motif generation, the strong cytosine preference for P300 suggests an altered specificity for TF-P300 complexes. Cell-specific COF motifs identified by CoRec provide a means to identify additional mechanisms of TF-COF binding specificity.

Gene expression, protein levels and PPI data do not predict TF-COF interactions
To investigate whether messenger RNA (mRNA) levels, protein levels and PPI data can explain our observed cell-specific TF-COF interaction data, we further analyzed our CoRec results from Jurkat and SUDHL4 cell lines for which we have both RNA-seq and whole proteome mass spectrometry data (40,43). The COFs themselves have similar mRNA and protein levels in both cell types (Supplementary Figure S3C and D), suggesting that cell-specific TF-COF differences are not a result of COF levels. Focusing on the TFs, we identified 16 TF clusters that recruit COFs in only one of the two cell types (Figure 3A), which is statistically significant (2.7 standard deviations above the mean) based on random permutations of row labels (Supplementary Figure S4A). We then evaluated whether at least one TF associated with each of the 16 TF clusters was differentially expressed [log2(FC) > 2] on the mRNA or protein level, which could explain the observed cell-specific difference in COF recruitment (Figure 3A and Supplementary Files S8 and S9). We found that in 75% (12/16) of the cases, there was at least one TF cluster member that had differential mRNA levels that might explain the observed COF recruitment differences. Examining proteomic data, we find that 62.5% (10/16) of the differential COF recruitment could be explained by changing TF protein levels. These numbers were not statistically significant based on row permutation testing (Supplementary Figure S4B), suggesting that interaction data are not readily explained by gene and protein levels. Finally, we asked whether PPI data from the STRING database would further support predictions of differential recruitment. Of the cases where altered TF-COF interactions were supported by differential mRNA or protein levels, we found that 50% (8/16) were also supported by PPI data for the relevant COF, highlighting the difficulty in predicting how TF-COF networks may change across cell types.
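As a rough illustration of how a permutation-based significance estimate of this kind can be computed, the sketch below scores the number of cell-specific clusters against a null distribution obtained by shuffling row labels. The exact permutation scheme used in the study is not described here, so the shuffling strategy, the function names and the toy recruitment vectors are all assumptions, and the output is not a reproduction of the reported 2.7 standard deviations.

```python
import random
import statistics

def count_cell_specific(recruits_a, recruits_b):
    """Count clusters that recruit a COF in exactly one of two cell types."""
    return sum(1 for a, b in zip(recruits_a, recruits_b) if a != b)

def permutation_z_score(recruits_a, recruits_b, n_permutations=1000, seed=0):
    """Return (observed count, z-score) against a row-label permutation null.

    Assumption: the null is built by shuffling the cluster labels of one
    cell type while keeping the other fixed.
    """
    rng = random.Random(seed)
    observed = count_cell_specific(recruits_a, recruits_b)
    shuffled = list(recruits_b)
    null_counts = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        null_counts.append(count_cell_specific(recruits_a, shuffled))
    mean = statistics.mean(null_counts)
    sd = statistics.pstdev(null_counts)
    z = (observed - mean) / sd if sd > 0 else 0.0
    return observed, z

# Hypothetical recruitment calls for 16 clusters (True = COF recruited).
jurkat = [True] * 8 + [False] * 8
sudhl4 = [False] * 8 + [True] * 8
observed, z = permutation_z_score(jurkat, sudhl4)
print(observed, round(z, 2))
```

A positive z-score indicates more cell-specific clusters than expected if recruitment calls were unrelated between the two cell types.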
Two examples illustrate how we can leverage expression, proteomic and PPI datasets to refine our CoRec measurements and also highlight the importance of cell-specific measurements. The MEF cluster recruits P300 in SUDHL4 but not in Jurkat (Figure 2A). Members of the MEF cluster are expressed in both cell types, but MEF2B and MEF2C are expressed at higher levels in SUDHL4 (Figure 3B). Examining the protein levels themselves, the trend is even more striking as no MEF members are identified in Jurkat, while MEF2A and MEF2C proteins are both detected in SUDHL4. Furthermore, interactions between P300 and both MEF2A and MEF2C have been reported (60); therefore, a plausible scenario is that differences in MEF2 TF levels could explain the observed recruitment differences we see between SUDHL4 and Jurkat. In contrast, YY1_YY2 recruits P300 in Jurkat but not in SUDHL4; however, TFs in this cluster show no differences in mRNA or protein levels (Figure 3C), suggesting that P300 recruitment is likely regulated by post-translational modifications or other mechanisms that differ between these two cell types. These examples demonstrate the difficulty in explaining (or predicting) differential recruitment data using mRNA, proteomic and existing PPI data, and highlight the need for cell-specific measurements, as afforded by CoRec.

CoRec identifies cell state-dependent TF-COF interactions
Cell-specific profiling of TF-COF complexes provides a means to examine the TF-COF complexes driving epigenetic changes as cells differentiate and respond to signals. Histone tail acetylation has been correlated with CRE activity and gene expression changes (74-76). Therefore, we sought to determine whether profiling KAT-TF interactions could identify transcriptional activators important for a given biological output.
We profiled seven COFs from three main KAT subfamilies (P300/CBP family, GCN5/PCAF family and MYST family) to examine recruitment of KATs in resting and TCR-stimulated Jurkat T cells. As transcriptional changes downstream of TCR signaling have been found to occur in as little as 1 h (77), we examined the early changes in KAT recruitment (45 min post-stimulation) that likely initiate the gene expression response. We observed 111 TF-COF interactions across the KATs and T-cell conditions, involving 43 unique TF motif clusters (Figure 4A). Of the 111 interactions, 20% (22/111) are gained upon T-cell stimulation, 15% (17/111) are lost upon stimulation and 65% are present in both conditions, demonstrating that even at this early 45-min time point, ∼35% of the TF-COF interactions have been altered. All KATs, except GCN5, exhibited changes in TF interactions in response to T-cell signaling at this time point (Figure 4B). Of the KATs that showed signal-induced TF cluster interactions, TIP60 exhibited the largest number of clusters that recruited only in the stimulated condition (eight clusters), while MOF had the fewest (one cluster).
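The gained/lost/shared bookkeeping above amounts to simple set operations on (KAT, TF cluster) pairs. The sketch below shows one way to express it and re-derives the rounded percentages from the counts quoted in the text; the pair names are hypothetical placeholders rather than measured interactions, and the function names are assumptions.

```python
def classify_interactions(resting, stimulated):
    """Split (KAT, TF cluster) pairs into gained, lost and shared sets."""
    gained = stimulated - resting   # present only after stimulation
    lost = resting - stimulated     # present only in resting cells
    shared = resting & stimulated   # present in both conditions
    return gained, lost, shared

def percent(part, total):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / total)

# Hypothetical toy network, for illustration only.
resting = {("KAT_A", "cluster_1"), ("KAT_B", "cluster_2"), ("KAT_C", "cluster_3")}
stimulated = {("KAT_A", "cluster_1"), ("KAT_B", "cluster_2"), ("KAT_A", "cluster_4")}
gained, lost, shared = classify_interactions(resting, stimulated)
print(len(gained), len(lost), len(shared))  # 1 1 2

# Counts quoted in the text: 111 interactions, 22 gained, 17 lost.
total, n_gained, n_lost = 111, 22, 17
n_shared = total - n_gained - n_lost
print(percent(n_gained, total), percent(n_lost, total), percent(n_shared, total))  # 20 15 65
print(percent(n_gained + n_lost, total))  # 35
```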
Outside of simple loss or gain of interactions, we identified several broad categories based on KAT recruitment, including 4 TF clusters with strongly induced recruitment (all recruited KATs have motif strength > 1.5), 2 TF clusters with partially induced recruitment (multiple recruited KATs have motif strength > 1.5 and constitutive recruitment of other KATs), 3 TF clusters with diminished recruitment (multiple recruited KATs have motif strength < −1.5 and constitutive recruitment of other KATs) and 15 TF clusters with recruitment specific to only one KAT (Figure 4A). Examples of motifs with different motif strengths that match the same TF clusters are provided (Supplementary Figure S5). Comparing the number of KAT-interacting TF clusters to those from cell-specific BioID measurements (HEK293 cells), we observed comparable numbers of interactions, i.e. an average of 15.9 clusters per COF for CoRec versus 11.8 per COF for BioID (Supplementary Figure S6A). The data highlight that even at relatively short stimulation times, we observe large-scale changes in the KAT-TF network of T cells.

Stimulus-dependent activators recruit multiple KATs
Canonical downstream activators of TCR and CD28 coreceptor signaling involve three major TF pathways: NF-κB, NFAT and AP-1 (78,79). Interactions between KATs and members of these TF families have been reported to contribute to gene activation (60,80-85). To determine whether profiling KAT-TF interactions can identify these TF activators, we identified TF clusters that recruited multiple KATs more strongly after T-cell stimulation (i.e. motif strengths are strictly higher after activation, motif strength > 1.5) (Figure 4A). Four TF clusters met this criterion (NFKB_REL, NFAT, NR_half and NR_DR1), two of which match the expected canonical TCR pathways (NF-κB and NFAT). The motifs identified for NFKB_REL and NFAT are consistent with motifs of RelA and NFATC1, regulators known to be activated by TCR/CD28 signaling (Figure 4A). Relaxing our activator selection criteria to allow for both constitutive and induced KAT recruitment, we identified two additional clusters, AP-1 and STAT. The AP-1 cluster represents the third canonical TCR pathway, and recruits PCAF and TIP60 upon stimulation, but constitutively recruits CBP and P300 (Figure 4A). This differential activator status for AP-1 suggests a more complex KAT interaction landscape than the strict stimulus-induced activators NF-κB and NFAT.
Among our set of strict activators, we also identified two nuclear receptor (NR) TF clusters. The NR_half cluster contains motifs defined by a single 5′-AGGTCA-3′-type NR half-site motif, while the NR_DR1 cluster contains motifs defined by direct repeats of the half-site separated by a one-nucleotide spacer (DR1). The NR4A family binds the NR_half motifs, and all members of this family have been shown to be rapidly induced upon TCR stimulation, including as early as 30 min post-stimulation (86-88). NR4A1 has also been shown to interact with PCAF (89), and we find an increased recruitment of PCAF to the NR_half cluster upon stimulation (Figure 4A). The NR_DR1 cluster represents motifs bound by a range of type II NRs (90,91). However, the peroxisome proliferator-activated receptor (PPAR) TF family binds as heterodimers with retinoid X receptor α (RXRα) to NR_DR1 cluster motifs (90), and both the PPAR family and RXRα play a role in T-cell activation, regulating cytokine expression and contributing to T-cell survival (92,93). We note that the NR_half and NR_DR1 clusters recruit distinct repertoires of KATs, supporting the conclusion that separate NR complexes bind to each motif. These results demonstrate that profiling KAT interactions can provide a protein-level approach to identify transcriptional activators of cell signaling events.

Promiscuous KAT recruitment is a feature of transcriptional activators in T cells
KATs catalyze deposition of different repertoires of histone acetylation marks (75,94) (Figure 5A), many of which can facilitate, or are correlated with, enhanced transcription (95,96). Genome-wide maps of H3K27ac and P300 binding are widely used to predict enhancer activity (97-99); therefore, we anticipated that TF activators may interact strongly with P300 (and paralog CBP) following TCR stimulation. However, the simple combination of H3K27ac marks and P300 binding does not identify all enhancers (76,100,101), and recruitment of other KATs (e.g. PCAF, GCN5, TIP60, MOZ and MOF) has been observed at active regulatory elements (74,95,100,102). These observations support a model in which many KATs are recruited to active regulatory elements (95); however, it is not clear whether this is by individual TFs each recruiting separate KATs, by promiscuous KAT recruitment by TFs or by a combination of recruitment by TFs and other chromatin-associated factors (e.g. chromatin marks or other regulatory COFs).
In light of these findings, we re-examined the KAT recruitment specificity of both stimulus-dependent and constitutive activators. Focusing first on strict stimulus-induced TF activators (NFKB_REL, NFAT, NR_DR1, NR_half), we find induced recruitment of paralogs CBP and P300 that deposit H3K27ac, as well as induced recruitment of PCAF and TIP60 that deposit different histone marks (Figure 5B). When also considering AP-1, the other canonical TCR signaling TF pathway, we find that stimulus-enhanced PCAF recruitment was the one common feature across all stimulus-induced TF activators. PCAF is known to deposit H3K9ac and H3K14ac, both marks that have been correlated with active regulatory elements (101). Notably, we did not find a widespread stimulus-dependent increase in recruitment for MOZ, MOF and GCN5 (Figure 4A and B). These observations demonstrate that stimulus-specific TF activators downstream of TCR signaling interact with multiple KATs in a signal-dependent fashion that would predict deposition of a number of histone acetylation marks at activated regulatory elements.
Studies have identified binding of multiple KATs at active transcriptional regulatory elements in resting T cells (74,95). Therefore, we next examined constitutive recruiters in our dataset to assess the repertoires of KATs recruited. We identified five TF clusters that exhibit constitutive, promiscuous recruitment (> 4 KATs with motif strength > 3 in both conditions). TFs from these families are known to regulate either housekeeping genes (ETS, YY1) (65,103,104) or genes associated with T-cell functions (ETS, RUNX, bHLH, IRF_STAT) (105-109). For example, IRF1 and IRF4 act as competing pioneer factors in T cells (105), and members of ETS and RUNX were found to occupy the vast majority of accessible chromatin regions in mouse CD4 and CD8 T cells (106). These data demonstrate that broadly acting T-cell activators are associated with, and can be identified by, constitutive KAT recruitment. Furthermore, the observation that KATs are promiscuously recruited to constitutive activators provides a mechanism for the widespread co-occupancy observed for KATs and multiple histone acetylation marks at regulatory elements in T cells (74,110).

CBP and P300 paralogs exhibit unique KAT recruitment profiles
CBP and P300 are paralogous KATs with homologous TF-interacting domains (111). Despite their sequence homology and co-occupancy at many genomic loci (74), studies have identified functional differences between them (112-114). Strikingly, while we find a number of TFs that recruit both CBP and P300, we also find numerous differences in their recruitment profiles. There are 10 TF clusters that recruit both P300 and CBP, 8 that uniquely recruit CBP and 6 that are unique to P300 (Figure 4A). However, even for the 10 shared TF clusters, 6 exhibit different patterns of condition-specific recruitment. For example, the E2F_v2 cluster recruits CBP in both resting and stimulated Jurkat cells but recruits P300 only in the stimulated condition. TFs in this cluster are implicated in controlling the proliferation of T cells after TCR stimulation and have been shown to recruit both P300 and CBP to activate target genes (115). The interaction data indicate joint recruitment of CBP and P300 upon stimulation, but a CBP-specific role in resting T cells. Overall, our data demonstrate that CBP and P300 exhibit differential recruitment profiles in both resting and TCR-stimulated T cells, providing a possible mechanism for their overlapping yet specific functions.

KAT-TF interactions highlight ascertainment bias in PPI databases
To evaluate potentially novel KAT-TF interactions in our dataset, we compared our results to PPI data in the STRING database (see the 'Materials and methods' section) (60). This database contains both direct and indirect associations between proteins, and includes PPIs determined from experiments as well as computational predictions such as text mining or conserved co-expression (60). Evaluating KAT-TF interactions at the level of TF clusters, we find that 40% (44/111) of our KAT-TF interactions represent previously unreported interactions (Figure 6A). We identified at least one unreported TF cluster interaction for every KAT, except for P300. Strikingly, the majority of interactions with MOF and MOZ were unreported (85.7% and 92.9%, respectively). This was in contrast to P300 and CBP, where most TF cluster interactions had been previously reported (Figure 6B). To evaluate whether the number of novel interactions we find might be explained by the extent to which a KAT has been previously studied, we compared the number of PPIs for each KAT in the STRING database to the percent of known KAT-TF interactions in our data (Figure 6C). We identified a clear relationship (logarithmic fit, adjusted R2 = 0.92, P-value < 0.01) that suggests that the number of unreported interactions identified for each KAT is likely a result of ascertainment bias of PPIs reported in the public databases. These data indicate that interactions with certain KATs are understudied, and suggest that similar biases may exist for other COFs.
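For readers unfamiliar with this type of fit, the sketch below shows a least-squares logarithmic fit, y = a + b ln(x), together with an adjusted R2, using only the Python standard library. The input values are synthetic numbers chosen to lie exactly on a logarithmic curve, not the data in Figure 6C, and the function name `log_fit` is an assumption.

```python
import math

def log_fit(x, y):
    """Least-squares fit of y = a + b * ln(x); returns (a, b, adjusted R^2)."""
    lx = [math.log(v) for v in x]
    n = len(x)
    mean_lx = sum(lx) / n
    mean_y = sum(y) / n
    sxx = sum((u - mean_lx) ** 2 for u in lx)
    sxy = sum((u - mean_lx) * (v - mean_y) for u, v in zip(lx, y))
    b = sxy / sxx
    a = mean_y - b * mean_lx
    ss_res = sum((v - (a + b * u)) ** 2 for u, v in zip(lx, y))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    r2 = 1.0 - ss_res / ss_tot
    # One predictor (ln x), so the adjustment uses n - 2 degrees of freedom.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return a, b, adj_r2

# Synthetic example: number of database PPIs per KAT (x) versus percent of
# CoRec interactions already reported (y). Values are hypothetical.
n_ppis = [50, 100, 200, 400, 800, 1600, 3200]
pct_known = [5 + 10 * math.log(v) for v in n_ppis]
a, b, adj_r2 = log_fit(n_ppis, pct_known)
print(round(a, 3), round(b, 3), round(adj_r2, 3))
```

With real data the adjusted R2 will fall below 1, and a separate test on the slope would be needed to obtain a P-value.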

KAT recruitment by NF-κB correlates with H3K27ac levels at induced genes
Genes with induced acetylation were expressed at lower levels prior to stimulation (Figure 7B) and at higher levels post-stimulation (Figure 7C); these genes included many known activated T-cell genes, including NR4A1, EGR3 and IL2RA (encoding CD25) (Figure 7A). As H3K27ac marks are deposited by P300 and CBP, we examined whether motifs from TF clusters with enhanced CBP/P300 recruitment following stimulation were enriched in the promoters that gained acetylation. We found that NF-κB motifs, those associated with the strongest induced recruitment of both CBP and P300, were the only significantly enriched motifs in these promoters compared to promoters that did not gain acetylation (Figure 7E). This is consistent with other studies showing that induced H3K27ac marks are dependent on NF-κB under specific stimulation conditions (116,117). In contrast, motifs from the two strongest constitutive recruiters of P300 and CBP, the IRF_STAT and ETS clusters, were not enriched in induced genes but were enriched in promoters with constitutive acetylation. Further supporting the role of NF-κB in altering acetylation levels and promoting gene activation, genes associated with induced H3K27ac peaks are enriched in NF-κB target genes (Figure 7F). For example, within the cis-regulatory regions of the NF-κB target gene RelB, there are several regions of increased acetylation that overlap κB binding sites, and previous studies have indicated that RelB transcription is dependent on NF-κB RelA binding (Figure 7D) (118). To further evaluate the impact of NF-κB sites on H3K27ac marks, we examined whether the number of predicted κB binding sites that overlapped with the H3K27ac peaks correlated with H3K27ac levels. We found a statistically significant increase in H3K27ac levels with a single NF-κB site, and a further increase with two sites, but no further gain beyond two sites (Figure 7G). These results suggest that the stimulus-enhanced CBP/P300 interactions seen for NF-κB factors correspond to concomitant gains in histone acetylation and induced gene expression.

Heterotypic clusters of KAT-recruiting TF sites correlate with promoter H3K27ac levels
We identified a number of TF clusters that constitutively recruit KATs (Figure 4A) and hypothesized that they likely play a role in histone acetylation levels in both resting and stimulated T cells. To test this hypothesis, we examined whether the presence of TF motifs associated with constitutive CBP/P300 recruitment correlated with constitutive genome-wide H3K27ac levels at gene promoters, defined as the region from 500 bp upstream to 100 bp downstream of a gene's TSS. We first examined whether multiple instances of the CBP/P300-recruiting motifs predict higher promoter acetylation levels, and we found significantly higher levels of promoter acetylation with the occurrence of up to three motifs (Figure 7H). As a control, we examined the impact of motifs that do not recruit CBP/P300 but do recruit MOZ/MOF, and we found no increase beyond one motif for this group. Given that CBP and P300 were recruited by several TF clusters, we then asked whether heterotypic motif groups, defined by motif matches from different TF clusters, were also predictive of acetylation levels. Strikingly, we found an even stronger trend in which promoters with motifs from up to five or more CBP/P300-recruiting TF clusters predicted increasingly higher acetylation levels. This trend was not seen with motifs from TF clusters that recruit only MOZ/MOF and not CBP/P300 (Figure 7I). This trend of acetylation levels with heterotypic motif clusters was also stronger than that seen for homotypic motif clusters (Supplementary Figure S7C). These results demonstrate that KAT-TF networks can determine TF families that are predictive of genome-wide acetylation levels, and that high levels of promoter acetylation are likely a result of heterotypic clusters that contain binding sites for multiple KAT-recruiting TFs from distinct families.
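The distinction between the total number of motif matches and the number of distinct TF clusters in a promoter can be made concrete with a short sketch. It assumes motif matches are already available as (genomic position, TF cluster) pairs and uses the promoter definition given above (500 bp upstream to 100 bp downstream of the TSS); the coordinate convention, function names and example hits are hypothetical.

```python
def promoter_window(tss, strand, upstream=500, downstream=100):
    """Return inclusive (start, end) of the promoter, accounting for strand."""
    if strand == "+":
        return tss - upstream, tss + downstream
    return tss - downstream, tss + upstream

def motif_counts(hits, tss, strand):
    """Return (total motif matches, number of distinct TF clusters) in the promoter.

    `hits` is an iterable of (position, cluster_name) pairs.
    """
    start, end = promoter_window(tss, strand)
    in_promoter = [cluster for pos, cluster in hits if start <= pos <= end]
    return len(in_promoter), len(set(in_promoter))

# Hypothetical motif matches near a TSS at position 10000.
hits = [(9600, "ETS"), (9750, "ETS"), (9900, "RUNX"), (10050, "IRF_STAT"), (10400, "ETS")]
print(motif_counts(hits, tss=10000, strand="+"))  # (4, 3)
print(motif_counts(hits, tss=10000, strand="-"))  # (3, 3)
```

The first value corresponds to the motif-count analysis and the second to the heterotypic analysis; a promoter with many matches from a single cluster scores high on the former but not the latter.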

Discussion
CoRec is a method for rapid profiling of cell-specific TF-COF complexes and provides an alternative to other methods for high-throughput analysis of PPIs that focuses on DNA-bound TF-COF complexes. In CoRec, TFs that recruit target COFs are inferred using DNA binding motifs. As multiple TFs can often bind the same motif, our assay provides an aggregate, quantitative measure of COF recruitment by TFs that may bind the same DNA sites in vivo, which is quantified as the motif strength (Figure 1E). We demonstrate that CoRec can be used to identify cell-specific TF-COF interactions for different classes of COFs (Figure 2A). Profiling KAT-TF interactions in resting and stimulated Jurkat T cells, we found that 35% of the interactions occurred only in one condition, highlighting the importance of cell context for TF-COF interactions. These differences in interactions may be due to changes at the transcriptional level, altered nuclear localization or post-translational modifications that alter PPIs. We show that profiling interactions with KATs provides a means to identify transcriptional activators and propose that using CoRec to profile interactions with other classes of COFs can provide a means to characterize TF function. CoRec can be readily applied to different cell systems, requiring only nuclear extracts and COF antibodies for labeling. We anticipate that using CoRec in different cell states and disease contexts (e.g. different cancer cell types) will provide insights into how TF-COF networks alter chromatin state and gene regulation.
Investigating regulators of histone acetylation, we find that promiscuous recruitment of KATs is a prominent feature in our KAT-TF network in T cells, with 66% (29/44) of TF clusters recruiting at least two KATs and 30% (13/44) recruiting at least four KATs (Figure 4A). The recruitment of multiple different KATs to the same DNA sites, whether by single TFs or by paralogous TFs that bind the same DNA sites, predicts co-occurrence of KATs and their cognate histone acetylation marks at regulatory elements. Indeed, studies in primary T cells that examined genome-wide binding of five KATs (P300, CBP, PCAF, MOF, TIP60) (74) and the occurrence of 18 acetylation marks (110) reported widespread colocalization of these KATs and histone acetylation marks. This widespread co-occurrence suggests low KAT specificity for many genes, which is at odds with genetic perturbation data for individual KATs (95). These observations led Anamika et al. (95) to propose a model in which initiation of gene expression is mediated by specific KATs, whereas maintenance of expression requires less KAT specificity and involves the recruitment of many KATs to the acetylated chromatin environment. As KAT complexes often contain bromodomains, the authors proposed that KAT co-occurrence may result from bromodomain-mediated interactions with acetylated histone tails (95). While our data are consistent with this model, they suggest that the KAT co-occurrence at regulatory elements is also the result of promiscuous KAT recruitment by constitutive TF activators. Beyond the mechanisms by which multiple marks are deposited at regulatory elements, it also remains unclear what roles these multiple acetylation marks play in the maintenance of gene expression. One potential explanation is that dynamic TF-COF interactions (119), in combination with active deposition and removal of acetylation marks (120), make multiple acetylation marks necessary in order to maintain open chromatin and create a more constant level of expression. Future studies that integrate TF-COF interaction data with gene perturbation approaches should help to clarify the roles of constitutive TF activators in promiscuous KAT recruitment and the impact of different KATs on the maintenance of basal levels of gene expression.
The combinatorial action of multiple TFs is a feature of eukaryotic enhancers and promoters; however, deciphering the individual roles of TFs within regulatory elements remains a challenge (121). Here, we sought to understand the role of TFs by examining their recruitment of COFs associated with specific epigenetic marks. We focused on H3K27ac marks associated with the activity of promoters and enhancers. To characterize potential activators in T cells, we profiled TFs that recruit the KATs CBP and P300 that deposit H3K27ac. In TCR-activated T cells, we found that the NF-κB family had the strongest stimulus-induced recruitment for both CBP and P300, and that NF-κB family motifs were the most enriched in loci that gained H3K27ac (Figures 5B and 7). These results are consistent with the known role of NF-κB as a potent activator downstream of T-cell signaling (78,79,122,123). In resting T cells, we identified a number of TF clusters that recruit either CBP or P300 (Figure 4A). Unexpectedly, we found that heterotypic clusters of these TF motifs at individual promoters were correlated strongly with the H3K27ac levels (Figure 7I), and were more predictive than the total number of TF motifs (Figure 7H) or the number of homotypic motifs (Supplementary Figure S7C). We hypothesize that heterotypic motif clusters may be predictive of H3K27ac as both CBP and P300 contain multiple TF-interacting subdomains (111) that could facilitate synergistic CBP/P300 recruitment (13,121). Indeed, a recent study has highlighted the individual roles of these subdomains in P300 genome-wide binding (55). It is unclear whether similar relationships exist for other histone marks; however, similar studies performed for other classes of COFs should provide insight into how heterotypic motif clusters relate to additional features of promoters or enhancers. We anticipate that a modified version of the CoRec design could be used to study cooperative COF recruitment by motif pairs, which might provide additional insight into the role of motif clusters in COF recruitment.
We highlight several limitations of our method and study. The first is that our assay identifies TFs at the level of TF clusters defined by similar DNA binding motifs, and not at the level of individual TFs. Integration with other data, such as gene expression data, protein abundance data or published PPI data, can be used to refine observations. The second is that CBP/P300 recruitment is inferred through the presence of H3K27ac and not by direct measurement of their binding. We tried CUT&RUN and ChIP-seq of both P300 and CBP in Jurkat T cells but were unable to recover high-confidence peaks. We note that others have recently commented on difficulties for P300 (55).

Figure 1. CoRec is an array-based method to profile cell-specific TF-COF interactions. Overview of the CoRec method: (A) Nuclear extract from cell type of interest is applied to DNA microarray. (B) Microarray contains 346 sets of consensus DNA probes (canonical TF binding site) and all SV probes. (C) COF recruitment to all consensus + SV probe sets is used to determine COF recruitment motifs. (D) COF recruitment motifs are matched to TF DNA binding motifs to infer TF identities. (E) Schematic illustrating that multiple TFs with different COF interaction strengths may contribute to COF recruitment on the microarray. (F-H) Comparison of replicate experiments for three COFs profiled in three cell types. COF recruitment is quantified using microarray fluorescence-based z-scores. Consensus + SV probes associated with select TFs are highlighted. (I-K) COF recruitment motifs corresponding to consensus + SV probes highlighted in each scatter plot are shown along with best-match TF motifs from the JASPAR database.

Figure 2. CoRec identifies cell-specific TF-COF interaction networks. (A) Heatmap illustrating CoRec TF-COF interaction data. Columns indicate COFs profiled in each experiment. Rows indicate TF motif clusters recruiting each COF. Heatmap cells indicate the maximum motif strength of COF motifs matching each TF cluster. (B) Average number of TF clusters identified per COF. (C) Number of unique TF clusters identified in CoRec experiments for different cell types. (D) Number of TF clusters identified for COFs in public PPI databases, individual studies and CoRec experiments. (E) Comparison of P300 recruitment motifs that matched the NFAT motif cluster found in Jurkat (top) and SUDHL4 (middle), and a representative NFAT motif from JASPAR. (F) Comparison of P300 (top) and TBL1XR1 (middle) recruitment motifs in Jurkat that matched the TCF7_LEF motif cluster, and a representative TCF/LEF family motif from JASPAR.

Figure 3. Gene expression, protein levels and PPI data cannot predict cell-specific TF-COF interactions. (A) Comparison of CoRec cell specificity to cell-specific mRNA expression, protein levels and PPI data. Column 1: Cell-specific recruitment observed for different TF clusters. Columns 2 and 3: Differential expression or protein levels [log2(FC) > 2] for at least one member within each TF cluster that is in the same sense as the cell-specific COF recruitment. Columns 4-6: Levels of support for COF cell specificity. 'Differential' indicates support from either differential expression or protein levels. 'PPI' indicates that the indicated COF can interact with at least one of the differential TFs for that cluster. (B, C) Comparison of mRNA expression (left) and protein levels (right) for TFs in Jurkat and SUDHL4 with members of the MEF cluster or the YY1_YY2 cluster highlighted.

Figure 4. KAT-TF interaction network reveals KAT and TF specificities. (A) Heatmap of KAT-TF interaction network. Columns indicate profiled KATs and experimental conditions: untreated Jurkat T cells or 45 min post-TCR stimulation. Rows indicate TF motif clusters recruiting each COF. Heatmap cells indicate the maximum motif strength of COF motifs matching each TF cluster. (B) Total number of TF clusters recruiting each KAT and condition specificity.

Figure 5. KAT-TF interactions reveal promiscuous KAT recruitment. (A) Schematic indicating histone lysine residues acetylated by different KATs. (B) COF recruitment of four TF clusters with TCR stimulation-induced recruitment. COF recruitment is quantified by a normalized motif strength relative to the maximum for each KAT. Histone acetylation marks catalyzed by each KAT are indicated. Representative COF motifs matching each TF cluster are shown along with matching TF motifs from JASPAR. (C) COF recruitment of constitutively recruiting TF clusters. COF recruitment is quantified as in panel (B).
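The normalization named in Figure 5B (motif strength relative to the maximum for each KAT) can be illustrated with a minimal sketch. It assumes motif strengths are held as a nested mapping from KAT to TF cluster to a non-negative score, and that the maximum is taken over all entries supplied for that KAT. The function name, data layout and example values are hypothetical and are not taken from the study.

```python
def normalize_by_kat_max(strengths):
    """Scale each KAT's motif strengths by that KAT's own maximum.

    `strengths` maps KAT name -> {TF cluster -> motif strength}.
    This layout is an assumption made for illustration only.
    """
    normalized = {}
    for kat, by_cluster in strengths.items():
        # Largest strength observed for this KAT (0.0 if it has no entries).
        peak = max(by_cluster.values(), default=0.0)
        normalized[kat] = {
            cluster: (score / peak if peak > 0 else 0.0)
            for cluster, score in by_cluster.items()
        }
    return normalized


# Hypothetical example values, not data from the paper.
example = {"P300": {"NFAT": 2.0, "AP1": 4.0}, "MOZ": {"NFAT": 1.0}}
scaled = normalize_by_kat_max(example)
```

With this scaling, the strongest cluster for each KAT has a value of 1, so recruitment by KATs with different absolute signal ranges can be compared on one axis.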

Figure 6. CoRec provides a means to profile TF recruitment of understudied KATs. (A) Heatmap indicating novel KAT-TF interactions. Columns indicate profiled KATs and experimental conditions: untreated Jurkat T cells or 45 min post-TCR stimulation. Rows indicate TF motif clusters recruiting each COF. Heatmap cells indicate whether PPIs with each KAT and a member of the TF clusters have been reported in a public PPI database. (B) Proportion of KAT-TF cluster interactions that are novel. Interactions annotated as in panel (A). (C) Percentage of KAT-TF interactions reported in PPI databases compared to the total number of PPIs reported for that KAT.

Figure 7. P300/CBP recruitment identifies TFs associated with genome-wide H3K27ac levels. (A) Differential H3K27ac peaks in TCR-stimulated Jurkat T cells. Log2(FC) of H3K27ac levels between untreated and 45 min post-TCR stimulation conditions. Select genes linked to differential acetylation peaks are indicated. (B) Gene expression levels in unstimulated Jurkat for genes associated with TCR-stimulated decreasing [log2(FC) < −2], unchanging [−0.5 < log2(FC) < 0.5] or increasing [log2(FC) > 2] H3K27ac levels at associated peaks. (C) TCR-stimulated gene expression changes (2 h post-stimulation) for gene sets defined as in panel (B). (D) ChIP-seq H3K27ac tracks in untreated (UT_H3K27ac) and 45 min TCR-treated (T_H3K27ac) Jurkat cells. κB binding sites predicted within the region are indicated. (E) Enrichment of motifs from TF clusters in loci defined by differential H3K27ac levels. (F) Proportion of known NF-κB target genes associated with induced H3K27ac. (G) Impact of κB site number on changes in H3K27ac levels at ChIP-seq peaks. (H) Impact of the number of specific COF-recruiting motifs on promoter H3K27ac levels (peak scores). Motifs evaluated in gene promoters (−500 nt to +100 nt from TSS). Data shown for motifs recruiting CBP/P300 (left panel) and motifs recruiting MOZ/MOF and not CBP/P300 (right panel). (I) Impact of the number of COF-recruiting motifs from unique TF clusters (i.e. heterotypic motifs) on promoter H3K27ac levels (peak scores). Motif identity and promoter elements defined as in panel (H). Adjusted P-values are as follows: P > 0.05 (ns), *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001 and ****P ≤ 0.0001.
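The thresholds stated in the Figure 7 legend can be written out as a minimal sketch: the three H3K27ac change categories from panel (B) and the significance labels for adjusted P-values. The function names are hypothetical. Returning `None` for peaks that fall between the stated intervals is an assumption, since the legend defines only the three categories.

```python
def classify_h3k27ac_change(log2_fc):
    """Assign a peak to a panel (B) category from its log2 fold change."""
    if log2_fc < -2:
        return "decreasing"
    if -0.5 < log2_fc < 0.5:
        return "unchanging"
    if log2_fc > 2:
        return "increasing"
    # Assumption: peaks outside the three stated intervals are left unassigned.
    return None


def significance_label(adjusted_p):
    """Map an adjusted P-value to the star notation used in the legend."""
    if adjusted_p <= 0.0001:
        return "****"
    if adjusted_p <= 0.001:
        return "***"
    if adjusted_p <= 0.01:
        return "**"
    if adjusted_p <= 0.05:
        return "*"
    return "ns"
```

The checks run from the most to the least stringent cutoff so that each P-value receives the strongest label it qualifies for.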