A Structured Workflow for Mapping Human Sin3 Histone Deacetylase Complex Interactions Using Halo-MudPIT Affinity-Purification Mass Spectrometry

Although a variety of affinity purification mass spectrometry (AP-MS) strategies have been used to investigate complex interactions, many of these are susceptible to artifacts because of substantial overexpression of the exogenously expressed bait protein. Here we present a logical and systematic workflow that uses the multifunctional Halo tag to assess the correct localization and behavior of tagged subunits of the Sin3 histone deacetylase complex prior to further AP-MS analysis. Using this workflow, we modified our tagging/expression strategy with 21.7% of the tagged bait proteins that we constructed, allowing us to quickly develop validated reagents. Specifically, we apply the workflow to map interactions between stably expressed versions of the Sin3 subunits SUDS3, SAP30, or SAP30L and other cellular proteins. Here we show that the SAP30 and SAP30L paralogues strongly associate with the core Sin3 complex, but SAP30L has unique associations with the proteasome and the myelin sheath. Next, we demonstrate an advancement of the complex NSAF (cNSAF) approach, in which normalization to the scaffold protein SIN3A accounts for variations in the proportion of each bait capturing Sin3 complexes and allows a comparison among different baits capturing the same protein complex. This analysis reveals that although the Sin3 subunit SUDS3 appears to be used in both SIN3A and SIN3B based complexes, the SAP30 subunit is not used in SIN3B based complexes. Intriguingly, we do not detect the Sin3 subunits SAP18 and SAP25 among the 128 high-confidence interactions identified, suggesting that these subunits may not be common to all versions of the Sin3 complex in human cells. This workflow provides the framework for building validated reagents to assemble quantitative interaction networks for chromatin remodeling complexes and provides novel insights into focused protein interaction networks.

Deciphering networks of protein-protein interactions is key to understanding how the components of a cell's protein machinery are organized and how that machinery functions. One method of investigating such interactions uses an exogenously expressed affinity-tagged "bait" to function in place of the endogenous version of the protein to survey physical interactions between the bait and endogenous prey proteins. The bait is affinity purified from cell extracts, together with "prey" proteins that might interact either directly or indirectly with the bait. These copurifying proteins can be identified and quantitated using sensitive "shotgun" mass spectrometry approaches. The combination of affinity purification and mass spectrometry (AP-MS) thus probes the networks of protein interactions in a cell, providing valuable insight into the structure and functions of its protein machinery.
A variety of AP-MS workflows have been developed. High-throughput strategies have sought to map the protein interaction landscape globally in standard immortal cell lines. For example, Hein and coworkers generated 1330 HeLa cell lines stably expressing a variety of GFP-tagged mouse or human proteins and analyzed the GFP-purified bait-associated proteins using reversed phase LC-MS (1). Similarly, Huttlin et al. identified the interaction partners for 2594 HA-tagged human proteins expressed in HEK293T cells (2). Such strategies elucidate important global protein interaction patterns and can provide insight into the likely functions of uncharacterized proteins. In addition to high-throughput pipelines, medium-throughput strategies can characterize interactions in more depth. Indeed, more targeted experiments can explore the different properties of protein isoforms, determine distinct regions of proteins important for a given interaction, or examine perturbation of protein interaction patterns as cells are treated with drugs. For example, Hubner and coworkers determined that two isoforms of Pericentrin interact with different subsets of centrosomal proteins (3). In another study, Banks et al. used a series of truncation mutants of the TNIP2 protein to identify two distinct regions of TNIP2 that interact either with NFKB1 or with the ESCRT-I complex (4). In a third study, Sardiu et al. investigated the effect of the cancer drug SAHA on interactions centered around subunits of the Sin3 co-repressor complex (5). Such studies depend on efficient experimental workflows for developing and testing reagents that can be used to identify legitimate protein-protein interactions and reduce the likelihood of false positive results.
Here we present a medium-throughput strategy applied to investigating interactions among components of the Sin3 complex, an important regulator of global gene expression (6). Clarifying the precise role of Sin3 components in controlling critical cellular processes is important as abnormal Sin3 function is associated with carcinogenesis (7). Notably, SIN3A, the gene encoding the key scaffolding protein around which Sin3 complexes are built, was recently identified as one of the top 127 significantly mutated genes across common cancer types (8). We first describe how a structured decision-making process can be used to screen recombinant complex subunits and verify correct bait protein localization, before exploiting the validated reagents in comprehensive AP-MS analyses. We have used the workflow to develop 23 cell lines constitutively expressing modest levels of Halo-tagged Sin3 components.
Here we present AP-MS analyses based on three of these cell lines, Halo-SUDS3, Halo-SAP30 and Halo-SAP30L, and identify 128 high-confidence bait-prey interactions. These interactions include 17 subunits previously identified as Sin3 subunits, several Sin3 associated transcription factors, and putative Sin3 associated proteins BAHCC1 and TNRC18. Surprisingly, we did not find Sin3 subunits SAP18 and SAP25 in our purifications, suggesting that these components are not tightly integrated into all versions of the Sin3 complex. Finally, adjusting the Halo-MudPIT protocol can result in notable changes to the interactions identified, highlighting the importance of interpreting data mindful of factors such as whether nucleases have been included in buffers. To summarize, workflows for developing protein interaction networks that address the suitability of the reagents and purification procedures are important for generating refined models of protein complex organization and function.
Cloning ORFs into Plasmid Vectors-cDNA was prepared from human placental total RNA (Clontech, Palo Alto, CA) using the iScript cDNA synthesis kit (Bio-Rad, Hercules, CA). The coding sequences of SUDS3 (NP_071936), SAP30L (NP_078908) and SAP18 (NP_005861) were amplified from cDNA using the PCR primers listed in Supplementary Data and cloned into either pFN21A, pFC14A, or CMVd2 pcDNA5/FRT PacI PmeI (9) as indicated in the figure legends. A codon-optimized ORF coding for SAP30 (NP_003855) was synthesized and subcloned into pFN21A or CMVd2 pcDNA5/FRT PacI PmeI (9) (Supplementary Data). Halo-SIN3A in pFN21A was generated using site-directed mutagenesis to modify clone #FHC11647 (Promega) to match the SIN3A NCBI sequence NP_001138830, and to insert a stop codon at the 3′ end of the ORF. Halo-SIN3B in pFN21A was generated by modifying clone #FHC01991 (Promega) to match the SIN3B NCBI sequence NP_056075, and to insert a stop codon at the 3′ end of the ORF. Halo-SAP25 was made by cloning a synthetic sequence (Supplementary Data) coding for the SAP25 protein (UniProtKB - Q8TEE9) into pFN21A. SNAP-tagged versions of SUDS3, SAP30 and SAP30L were generated by subcloning the coding sequences between the SgfI and PmeI restriction sites in SNAP-FLAG pcDNA5 (Supplementary Data).
Stable Cell Line Construction-Flp-In™-293 cell lines stably expressing Halo-SUDS3, Halo-SAP30 or Halo-SAP30L were generated using the Flp-In™ System (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. In brief, Flp-In™-293 host cells were cotransfected with pcDNA5/FRT constructs (coding for a Halo-Sin3 subunit) and pOG44 (coding for Flp recombinase) and cultured in media containing Hygromycin B. Twenty-four hygromycin resistant clones were isolated and expression of the Halo-Sin3 subunit was confirmed by Western blotting. Three stable transformants for each subunit were selected for expansion.
Purification of Recombinant Protein Complexes-For HaloTag purifications, lysates were incubated with Magne® HaloTag® beads prepared from either 0.1 ml of bead slurry (transiently transfected HEK293T cells) or 0.2 ml of bead slurry (Flp-In™-293 cell lines) overnight at 4°C. Beads were washed a minimum of four times with buffer containing either 10 mM HEPES (pH 7.5), 1.5 mM MgCl2, 0.3 M NaCl, 10 mM KCl, and 0.2% Triton X-100 (Flp-In™-293 cell lines) or 50 mM Tris·HCl pH 7.4, 137 mM NaCl, 2.7 mM KCl and 0.05% Nonidet® P40 (transiently transfected cells). Bound proteins were eluted by incubating the beads with buffer containing 50 mM Tris·HCl pH 8.0, 0.5 mM EDTA, 1 mM DTT, and 2 Units AcTEV™ Protease (Invitrogen) for 2 h at 25°C. Eluates were passed through a Micro Bio-Spin™ chromatography column (Bio-Rad) to remove residual traces of beads. For SNAP purifications, lysates were incubated overnight at 4°C with SNAP-Capture magnetic beads prepared from 0.08 ml bead slurry. Beads were washed with the wash buffer described for Halo purifications above. Bound proteins were eluted by incubating the beads with buffer containing 50 mM Tris·HCl pH 8.0, 0.5 mM EDTA, 1 mM DTT, and 1 Unit PreScission Protease at 4°C overnight.
Mass Spectrometry-HaloTag® protein complexes were TCA precipitated and digested with Trypsin as described by Swanson et al. (12). Digested samples were first loaded onto microcapillary columns packed with three immobilized phases (reversed phase; strong cation exchange; reversed phase) and bound peptides were then eluted from the column using an Agilent 1100 series quaternary HPLC pump. Peptides were resolved using 10 multidimensional chromatography steps as previously described (13), and analyzed using a linear ion trap (LTQ) mass spectrometer (Thermo Fisher Scientific, Waltham, MA).
The resulting .RAW files were converted to .ms2 files using RAWDistiller v. 1.0 before matching MS/MS spectra to a database containing 36628 human protein sequences (National Center for Biotechnology Information, June 2016 release) and 199 common contaminants using the ProLuCID algorithm version 1.3.5 (14). The database also included shuffled versions of all sequences to enable estimation of false discovery rates (FDRs). The total number of sequences searched was 73654. The database was searched for peptides with static modifications of +57 Daltons on cysteine residues (carboxamidomethylation) and with dynamic modifications of +16 Daltons on methionine residues (oxidation). A mass tolerance of 4.5 amu was used for precursor ions and 0 amu for fragment ions. Only fully tryptic peptide sequences were considered.
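As a quick consistency check on the database size reported above, the arithmetic can be sketched as follows. The variable names are illustrative only, and the sketch assumes that each target sequence is paired with exactly one shuffled version; the counts are those stated in the text.

```python
# Counts stated in the text; variable names are illustrative only.
human_sequences = 36628
contaminant_sequences = 199

# Assumption: every target sequence has exactly one shuffled (decoy) counterpart.
target_sequences = human_sequences + contaminant_sequences
total_sequences = 2 * target_sequences

print(target_sequences, total_sequences)
```

Under that assumption the total agrees with the 73654 sequences reported as searched.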
Inaccurate matches were filtered out using DTASelect (15) in combination with an in-house software algorithm, swallow. Swallow works with DTASelect to select peptide spectral matches at given specific spectrum, peptide, and protein FDRs. The spectral and protein FDRs for each run are presented in supplemental Table S1. The mean spectral FDR for the 33 MudPIT runs was 0.19% ± 0.15% (standard deviation), and the mean protein FDR was 0.92% ± 0.95% (standard deviation). A DTASelect filter also established a minimum peptide length of 7 amino acids, and proteins that were subsets of others were removed using the parsimony option in Contrast (15).
We have reported some of the mass spectrometry data used in this study previously. A summary of the mass spectrometry runs used in this study, including details of where they were first reported, is presented in supplemental Table S1. Mass spectrometry data have also been deposited to the PeptideAtlas repository (16) (www.peptideatlas.org), or in the MassIVE repository (http://massive.ucsd.edu); the identifiers for each data set are listed in supplemental Table S1.
Fluorescence Microscopy-Microscopy was performed essentially as described previously (17). In brief, cells stably expressing either Halo-SUDS3 or Halo-SAP30 were plated at 40% confluence in MatTek glass bottom culture dishes. Cells were first cultured for ~24 h at 37°C in 5% CO2. HaloTag® TMRDirect™ was then added to the culture medium at a concentration of 20 nM to label Halo-tagged proteins, and cells were further cultured overnight at 37°C in 5% CO2. Cells were stained with Hoechst dye for 1 h to label nuclei, washed twice with Opti-MEM® reduced serum medium, and imaged using either an UltraVIEW VoX spinning disk confocal microscope (Perkin Elmer) with a 40x plan-apochromat (NA 1.2) objective or an LSM (Zeiss, Thornwood, NY) microscope with a 40x C-Apochromat (NA 1.2) objective. We excited TMRDirect and Hoechst with a 561 and 405 nm laser, respectively. All experiments implemented a multiband dichroic filter (Vox 405/488/561/640 nm or LSM 405/458/561) and a multiband emission filter (Vox 415-475/580-650 nm or LSM 426-583/583-696 nm).
Experimental Design and Statistical Rationale-AP-MS analyses used a minimum of three biological replicates (supplemental Table S1). The number of biological replicates for analyses using transiently transfected cells was: 9 (Halo-control); 3 (Halo-SUDS3); 3 (Halo-SAP30). For analyses using stable cell lines, the number of biological replicates used was: 5 (controls); 4 (Halo-SUDS3); 5 (Halo-SAP30); 4 (Halo-SAP30L). Searched data were analyzed using Contrast and NSAF7 software (15), and lists of proteins with a high probability of being in the experimental samples relative to the control samples were made using parameters generated by the QSPEC algorithm (18). A high-confidence interaction was established when a protein was both identified in > 50% of experimental replicates and when QSPEC interaction parameters met the conditions QSPEC log2 FC > 2 and QSPEC FDR UP < 0.05. Four samples were excluded from analysis for the following reasons: MudPIT run failure (1 sample: Halo-control #1, transient transfection experiment), sample cross contamination during purification (1 sample: Halo-control #1, stable cell line experiment), fewer than 200 total proteins detected (1 sample: Halo-SAP30L #2, stable cell line experiment), impure sample of TEV protease used for elution (1 sample: Halo-SUDS3 #1, stable cell line experiment).
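The high-confidence criteria described above can be expressed as a simple filter. The following is a minimal sketch, not the software used in the study: the `PreyResult` class, the function name, and all numerical values are hypothetical, and the QSPEC statistics are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class PreyResult:
    """Hypothetical container for one bait-prey result (not a QSPEC data structure)."""
    name: str
    detected_in: int    # experimental replicates in which the prey was identified
    n_replicates: int   # total experimental replicates for the bait
    log2_fc: float      # QSPEC log2 fold change, bait versus control
    fdr_up: float       # QSPEC FDR for enrichment in the bait samples


def is_high_confidence(p, min_fraction=0.5, min_log2_fc=2.0, max_fdr=0.05):
    """Apply the three criteria stated in the text; all comparisons are strict."""
    detected_fraction = p.detected_in / p.n_replicates
    return (detected_fraction > min_fraction
            and p.log2_fc > min_log2_fc
            and p.fdr_up < max_fdr)


# Made-up values for illustration only; these are not data from this study.
examples = [
    PreyResult("prey_A", 4, 4, 5.1, 0.001),
    PreyResult("prey_B", 2, 4, 6.0, 0.001),  # fails: exactly 50% is not > 50%
    PreyResult("prey_C", 3, 4, 1.5, 0.001),  # fails: log2 FC is not > 2
    PreyResult("prey_D", 4, 4, 3.0, 0.20),   # fails: FDR is not < 0.05
]
high_confidence = [p.name for p in examples if is_high_confidence(p)]
print(high_confidence)
```

Note that with four replicates, the strict "> 50%" criterion means a prey must be detected in at least three of them.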

Establishing a Structured Workflow for Mapping Sin3 Complex Interactions-Previously, we had identified a set of more than 20 proteins consistently copurifying with several FLAG-tagged subunits of the human Sin3 complex (5). To map the network of interactions among these proteins more precisely, as well as to investigate the interactions between the Sin3 complex and other protein complexes more broadly, we developed a structured approach for building a Sin3 protein interaction network that combines innovative features of the Halo affinity tag with Multidimensional Protein Identification Technology (MudPIT) (19).
The decision-making process that we used to construct and test our Halo-tagged reagents is outlined in Fig. 1A using standard flowchart symbols (20). First, we subcloned coding sequences for Sin3 subunits into pFN21A for expression of the subunit with an N-terminal Halo tag in human cells. pFN21A drives robust expression of the recombinant protein using the CMV promoter, allowing us to test the behavior of the protein in small scale experiments (Fig. 1A). To ask whether the tagged subunit could associate with other Sin3 subunits, we Halo-purified complexes from lysates of cells transfected with the pFN21A (CMV) construct and analyzed the proteins copurifying with the Halo-tagged bait by MudPIT. If the expected Sin3 subunits failed to copurify with the bait, we then moved the Halo tag to the C terminus of the protein and retested the ability of the bait to connect with other Sin3 complex components.

Fig. 1. A, Sin3 subunit ORFs are first cloned into pFN21A for transient CMV driven expression of the N-terminally Halo-tagged subunit in 293T cells ➀. The pFN21A construct is used in trial AP-MS experiments to assess whether the recombinant subunit can capture other known Sin3 subunits ➁. If the Halo-tagged subunit copurifies with Sin3 complex components, the subunit ORF is cloned into pcDNA5/FRT for building Flp-In™-293 cell lines ➂; otherwise, the bait protein is reengineered by changing the location of the affinity tag or its expression is increased by adjusting the strength of the promoter ➃. The pcDNA5/FRT construct is tested for bait expression in 293T cells prior to generating cell lines expressing the Halo-tagged subunit using the weak CMVd2 promoter. Clonal isolates are screened by SDS-PAGE followed by Western blotting, and three isolates expressing each Halo-tagged subunit are screened for correct localization of the subunit to the nucleus using fluorescently-tagged cell-permeable Halo ligands ➄. Finally, complexes are purified from ~1 × 10^9 Flp-In™-293 cells stably expressing the Halo-tagged subunit and analyzed by MudPIT ➅. B, Application of the workflow to developing 23 Sin3 related stable cell lines for AP-MS analysis. Of the 5 cases where we modified the bait expression strategy, we used a higher strength promoter for detectable expression in 3 cases and switched tag location in another 2 cases (the tag caused both improper association with Sin3 complexes and abnormal localization). C, An example of affinity tag location influencing bait localization in HEK293T cells expressing either Halo-HDAC2 or HDAC2-Halo. Halo tag localization in live cells is visualized with HaloTag® TMRDirect™ fluorescent ligand (red); nuclei are stained with Hoechst dye (blue). D, Temporal (green) and economic (blue) impact of testing transiently transfected bait constructs prior to investigating cell lines stably expressing baits.
Once we had confirmed that other Sin3 subunits were present in purified samples generated using a transiently transfected bait, we made Flp-In™-293 cells stably expressing CMVd2 driven Halo-tagged Sin3 subunits. We had previously found that the weaker CMVd2 promoter allows us to express the tagged recombinant proteins at levels that more closely reflect the expression levels of the endogenous version of the protein (4,9). Consistent with these previous findings, we found that although CMV driven expression of the tagged Sin3 subunit Halo-SAP30 yields greater amounts of the recombinant protein compared with the endogenous protein (supplemental Fig. S1), CMVd2 driven expression of Halo-SAP30 yields less recombinant than endogenous SAP30 protein. In addition to enabling lower bait expression levels, the pcDNA5/FRT vector that we used contains an FRT recombination site, which allows systematic integration of the Sin3 subunit into a distinct site in human Flp-In™-293 cells. After confirming expression of the CMVd2 expressed recombinant protein in transiently transfected HEK293T cells, we isolated hygromycin B resistant stable transformants expressing Halo-tagged Sin3 subunits in Flp-In™-293 cells as described under "Experimental Procedures." Having confirmed stable expression of the protein by Western blotting, we imaged these cells to assess proper subunit localization. We found that proper nuclear localization of the recombinant protein could be discerned more clearly in the stable cell lines, which had modest isogenic protein expression, than in the transiently transfected cells, which were susceptible to mislocalization of the overexpressed recombinant protein (21). If the recombinant protein was not localized to the nucleus, we again adapted our tagging strategy. For recombinant proteins that both copurified with other Sin3 subunits in trial AP-MS experiments and localized to the nucleus when stably expressed, we used the stable cell lines for large scale AP-MS analyses.
Using a larger number of stably expressing cells (10^9) than we had with transiently transfected cells (10^7) allowed us to isolate sufficient amounts of the CMVd2 (weakly expressed) Halo-tagged bait protein for further analysis.
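The decision logic described in the preceding paragraphs can be summarized as a small function. This is a simplified reading of the workflow for illustration only, not a reproduction of Fig. 1A: the function name, argument names, and returned strings are all hypothetical.

```python
def next_step(copurifies_with_sin3, expressed_in_stable_line=None, nuclear=None):
    """Return the next action for a tagged bait (simplified, hypothetical sketch).

    Arguments left as None represent checks that have not yet been performed.
    """
    # Trial AP-MS with the transiently expressed construct comes first.
    if not copurifies_with_sin3:
        return "re-engineer bait (move tag or change promoter) and repeat trial AP-MS"
    # Build the stable line only once the bait captures known Sin3 subunits.
    if expressed_in_stable_line is None:
        return "build stable cell line and confirm expression by Western blotting"
    if not expressed_in_stable_line:
        return "re-engineer bait (use a stronger promoter)"
    # Localization is checked in the stable line before large scale purification.
    if nuclear is None:
        return "image live cells labeled with a fluorescent Halo ligand"
    if not nuclear:
        return "re-engineer bait (move tag)"
    return "large scale AP-MS from the stable cell line"


print(next_step(False))
print(next_step(True))
print(next_step(True, True, True))
```

The ordering reflects the rationale given in the text: the cheap, fast transient test is performed before the slower stable cell line steps.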
We have applied this workflow for the development of 23 cell lines stably expressing affinity-tagged Sin3 related proteins (Fig. 1B). We modified our strategy in 5 cases (21.7%). In two of these cases, we moved the tag to the C terminus of the protein (this resulted both in a more robust association with Sin3 subunits and in proper nuclear localization (Fig. 1C)). The economic cost of identifying problems such as these quickly by performing trial mass spectrometry analyses ($40.46 × 23 = $930 for a single analysis of each bait (Fig. 1D)) was mitigated by economic savings in not using suboptimal reagents for AP-MS analyses ($273 per biological replicate per cell line). Importantly, we achieved significant time savings by identifying suboptimal reagents early. For example, we could purify a sample from transiently transfected cells for AP-MS analysis within 3.1 days, whereas generation of stable cell lines prior to purification extended this time to 36.5 days (Fig. 1D).
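The figures quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Values stated in the text; variable names are illustrative only.
cell_lines = 23
modified_cases = 5
trial_cost_per_bait = 40.46   # USD, single trial analysis of one bait
transient_days = 3.1          # time to a purified sample, transient transfection
stable_days = 36.5            # time to a purified sample, stable cell line

fraction_modified = modified_cases / cell_lines
total_trial_cost = trial_cost_per_bait * cell_lines
days_saved = stable_days - transient_days

print(f"{fraction_modified:.1%} of baits required a modified strategy")
print(f"${total_trial_cost:.2f} total for one trial analysis per bait")
print(f"{days_saved:.1f} days between the transient and stable timelines")
```

The product comes to $930.58, which the text reports as $930, and the transient test returns a sample roughly 33 days sooner than the stable cell line route.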
This structured workflow (Fig. 1) not only establishes a basis for building a detailed Sin3 complex interaction network in Flp-In™-293 cells, but also lays the groundwork for investigating Sin3 interaction networks in other cell types and under different cellular conditions. For example, although there is evidence that the Sin3 interactome may be modulated in response to drug treatment with HDAC inhibitors in HEK293 cells (5), such effects have yet to be studied in clinically relevant cancer cell lines. Two key Sin3 subunits, SUDS3 and SAP30, have well characterized direct interactions with the core Sin3 scaffolding protein SIN3A (22)(23)(24)(25), and so we initially characterized these two proteins using our Halo-MudPIT workflow; the results of these analyses are presented below to illustrate our approach.
Transiently Transfected Halo-SUDS3 and Halo-SAP30 Associate with Other Sin3 Complex Components-The Sin3 subunits SUDS3 and SAP30 are tightly integrated with SIN3A. The interaction between SUDS3 and SIN3A has been mapped by NMR and involves an alpha helix within SUDS3 residues 205-228 binding the SIN3A HDAC interaction domain (SIN3A residues 608-729) (25). The interface between SAP30 and the Sin3 complex has also been well characterized and involves contacts between a SIN3A paired amphipathic helix domain (PAH3) and a SAP30 domain that resembles the SAP DNA binding motif (23). Consistent with these observations, we also found that overexpressed SIN3A copurified with affinity-tagged versions of SUDS3 and SAP30, as well as with a paralog of SAP30, SAP30L (supplemental Fig. S2). To test whether N-terminally Halo-tagged versions of human SUDS3 and SAP30 might associate with the full complement of known Sin3 subunits, we purified either Halo-SUDS3 or Halo-SAP30 complexes from transiently transfected HEK293T cells and identified proteins in the resulting eluates using MudPIT (Fig. 2A). Cells for control samples were transfected with plasmids expressing the Halo tag alone. Peptides shared by multiple proteins (for example peptides present in different isoforms or paralogs) were assigned to proteins as described by Zhang et al. (26). We next used the QSPEC algorithm (18) to define sets of proteins significantly enriched in experimental but not control replicates (log2 FC > 2, FDR < 0.05), limiting the sets to proteins detected in > 50% of experimental replicates (Fig. 2A). The 146 SUDS3-associated and 100 SAP30-associated proteins are listed in supplemental Table S2. To assess whether these sets of proteins contained known Sin3 subunits, we first used the DAVID bioinformatics resource (27) to survey these lists for enriched gene ontology (GO) terms in the cellular component category (Fig. 2B and supplemental Table S2). GO terms associated with both sets of proteins were enriched for the descriptor "Sin3 complex" (GO:0016580), with p_adj = 2.5 × 10^-13 (SUDS3) and p_adj = 3.3 × 10^-10 (SAP30).

Fig. 2. A, Protein complexes were Halo affinity purified from HEK293T cells transiently transfected with Halo-SUDS3 in pFN21A (3 replicates) or Halo tag alone controls (9 replicates). Proteins in the eluates were identified by digesting samples with trypsin prior to separating tryptic peptides using multidimensional chromatography for mass spectrometry analysis (MudPIT). Proteins identified in each replicate were quantitated by spectral counting: proteins significantly enriched in Halo-SUDS3 compared with control samples were identified using QSPEC (18) (QSPEC log2 FC > 2, QSPEC FDR UP < 0.05, detected in > 50% of replicates; supplemental Table S2). B, Gene ontology terms (cellular component) enriched in Halo-SUDS3 or Halo-SAP30 complexes. Groups of proteins significantly enriched with either Halo-SUDS3 or Halo-SAP30 in pFN21A transiently expressed in HEK293T cells were analyzed for GO term enrichment (cellular component) using the DAVID annotation tool (27) (supplemental Table S2, "DAVID analysis results"). Significantly enriched GO terms (p_adj < 0.05) were summarized using REViGO (57) (supplemental Table S2, "Revigo analysis results"); treemap segment areas are proportional to -log10 p_adj. C, Sin3 complex subunits copurify with both Halo-SUDS3 and Halo-SAP30. dNSAF values for Sin3 subunits copurifying with Halo-SUDS3 or Halo-SAP30 visualized using Circos (58).
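For readers unfamiliar with spectral counting, the following minimal sketch illustrates the standard normalized spectral abundance factor (NSAF), in which each protein's spectral count is divided by its length and then by the sum of these ratios over all proteins in the sample. It is an illustration under stated assumptions rather than the NSAF7 software used here: it does not implement the distribution of shared spectra that distinguishes dNSAF, and the protein names and numbers are invented.

```python
def nsaf(spectral_counts, lengths):
    """Compute NSAF values from per-protein spectral counts and lengths.

    Both arguments are dicts keyed by protein name. This simplified sketch
    treats all spectra as unique to one protein (no dNSAF-style sharing).
    """
    # Spectral abundance factor: spectral count divided by protein length.
    saf = {name: spectral_counts[name] / lengths[name] for name in spectral_counts}
    total = sum(saf.values())
    # Normalize so that the values within one sample sum to 1.
    return {name: value / total for name, value in saf.items()}


# Invented numbers for illustration only; not data from this study.
counts = {"protein_A": 120, "protein_B": 30, "protein_C": 10}
lengths = {"protein_A": 1200, "protein_B": 300, "protein_C": 200}
values = nsaf(counts, lengths)
print(values)
```

In this toy example protein_A and protein_B receive the same NSAF despite a fourfold difference in spectral counts, because the longer protein is expected to yield proportionally more peptides.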
In addition to examining proteins for enriched GO terms, we asked whether our sets of Halo-SUDS3 and Halo-SAP30 associated proteins contained 22 Sin3 subunits and associated proteins previously described by our group and others (5, 6, 28), only 7 of which have been annotated with the term GO:0016580 (Sin3 complex). We have grouped these 22 proteins as either "core Sin3 subunits"  Table S2). Both Halo-SUDS3 and Halo-SAP30 purifications contained 20 of these proteins (Fig. 2C and supplemental Table S2). Intriguingly, neither SUDS3 nor SAP30 associated with SAP18 or SAP25. Although SAP18 had initially been described as a subunit of the Sin3 complex (30), SAP18 functions in the absence of other Sin3 subunits as part of the ASAP complex to regulate RNA splicing (31,32). Murine SAP25 had previously been identified as a core component of the mSin3 complex (28), and exogenously expressed mSAP25 also associates with human Sin3 components in HeLa cell lysates (28). To ascertain whether SAP25 and SAP18 are available for integration into the Sin3 complex in HEK293T cells, we analyzed RNA-seq data that we had generated in a previous study using RNA purified from HEK293T cell extracts (4). In addition to transcripts corresponding to the core Sin3 subunit proteins that we had identified in Fig. 2C, we detected robust levels of SAP18 transcripts (Fig. 2D). In contrast, we detected negligible levels of SAP25 mRNA. SAP25 might therefore be absent in Halo-SAP30 and Halo-SUDS3 purifications because SAP25 is not expressed in HEK293T cells under normal culture conditions. Indeed, overexpressed Halo-SAP25 was able to capture endogenous HEK293T cell Sin3 complexes (Fig. 2E). Conversely, different tagged versions of SAP18 did not capture Sin3 complexes (Fig. 2E), which, together with the findings of Fig. 2D, suggests that SAP18 is not tightly integrated into the Sin3 complex.
Having confirmed the ability of these transiently overexpressed Sin3 subunits to combine with endogenous Sin3 subunits, we constructed Flp-In™-293 cell lines stably expressing Halo-tagged versions of SUDS3 and SAP30 using the weak CMVd2 promoter. Although many published studies continue to rely on experiments using transiently overexpressed proteins, and the results of such studies can answer important questions, overexpressed proteins can misfold, mislocalize, or create other artifacts that do not reflect the behavior of the endogenous protein (21). We had previously tested several lower strength promoters and found that the CMVd2 driven expression level of the recombinant protein was closest to the expression level of the endogenous protein for the bait that we tested (4,9). We examined the localization of either stably expressed Halo-SUDS3 or Halo-SAP30 by covalently labeling the Halo tag with fluorescent ligands in live cells and imaging the labeled proteins using confocal microscopy. Consistent with the function of Sin3 complexes as chromatin modifiers, we observed the stably expressed Sin3 proteins in the nucleus (Fig. 2F).
Evaluating Complexes Purified from Sin3 Stable Cell Lines-Having tested recombinant Halo-Sin3 proteins both for association with other Sin3 subunits and proper localization, we used the cells stably expressing either Halo-SUDS3, Halo-SAP30 or Halo-SAP30L (another validated Sin3 stable cell line) to purify Sin3 complexes from Flp-In™-293 cell lines. Initial analysis of the eluates from these purifications by SDS-PAGE and silver staining indicated putative endogenous Sin3 subunits in the experimental purifications but not in controls (a representative analysis is shown in Fig. 3A). MudPIT mass spectrometry analysis of the eluates confirmed the presence of core Sin3 subunits, other Sin3 associated proteins and Sin3 associated transcription factors (Fig. 3B). Consistent with our previous observations, we failed to detect significant amounts of the Sin3 subunits SAP18 or SAP25. In addition to monitoring the purifications for the presence of Sin3 subunits, we also asked whether the three bait proteins were associating primarily with the Sin3 complex, or whether they additionally copurified with other groups of proteins. There were 21 proteins copurifying with all three bait proteins (Fig. 3C). We used the ClueGO/CluePedia algorithm (33,34) to ask if this subset of proteins shared GO terms and found that 16/21 of the proteins shared the associated enriched GO terms "nuclear transcriptional repressor complex" and/or "histone deacetylase complex." The other 5 proteins include: the previously characterized Sin3 associated proteins ARID4A, ARID4B and ING1; FASN (fatty acid synthase); and LGALS3BP, a glycoprotein elevated in human colorectal carcinoma (34) that might function as an immune stimulator after oncogenic transformation (34). In addition to these 21 shared proteins, a second set of 38 proteins consistently copurified with Halo-SAP30L, but not with Halo-SAP30 or with Halo-SUDS3. ClueGO analysis revealed that these included several tubulin proteins as well as components of the proteasome. Although we and others (35) found Halo-SAP30L or endogenous SAP30L mainly in the nucleus and the majority of cellular tubulin is detected in the cytoplasm (supplemental Fig. S3), SAP30L could potentially associate with the fraction of cellular tubulin associated with chromatin in the nucleus (36,37).

[Figure legend fragment] ...680LT labeled goat anti-Mouse secondary antibodies (loading control). Western blots were imaged using a Li-Cor infra-red imaging system. Subcellular localization of Halo-SUDS3 and Halo-SAP30 proteins was assessed by labeling live cells with HaloTag® TMRDirect™ fluorescent ligand (red) and staining DNA (nuclei) with Hoechst dye (blue).
Halo-MudPIT Analysis Extends Network of STRING Annotated Sin3 Subunit Interactions-We next wanted to compare interactions captured using our Halo-MudPIT approach to interactions that have been previously reported. We initially compared our data to experimental evidence for protein-protein interactions curated in the STRING database (38). To ask whether STRING-identified Sin3 subunit interactors were also identified using our Halo-MudPIT approach, we used the DyNet analysis tool (39) to compare our network of high-confidence Halo-MudPIT identified interactions (log2 FC > 2, FDR < 0.05) with medium confidence STRING annotated interactions (STRING score > 0.4) for the same bait proteins (Fig. 4A).
Although most bona fide Sin3 subunits were identified both in our purifications and by STRING by at least one bait (Fig. 4A, gray lines), neither FAM60A nor SAP30L was identified by STRING. In contrast, both FAM60A and SAP30L were identified in all of our Halo-MudPIT analyzed samples. In addition, whereas none of the Sin3 core subunits/other associated proteins were detected by STRING alone, we detected many interactions between either Halo-SUDS3, Halo-SAP30 or Ha-
Several proteins were annotated in the STRING network but not captured in our purifications. First, CTPS1 and CTPS2 were shown as interacting with both SAP30 and SAP30L in the STRING network but were not present in our purifications. Peculiarly, this STRING annotation is based on an unreferenced interaction observed among putative homologs of these proteins in other species. Similarly, most of the SUDS3 interactions detected by STRING only (red lines connecting SUDS3 to MORF4L1, MORF4L2, AIRE, MSL3, TAF3, PHF12, and ING3/4/5) are not in fact based on experimentally evidenced interactions among the human proteins but on interactions among putative homologs in other species. The other SUDS3 interactors detected by STRING only, LACC1 (C13orf31) and MXD1, were reported in a high throughput screen for interactors in HEK293 cells (2). In this screen, both proteins used as baits detected SUDS3 as prey, whereas LACC1 and MXD1 were not detected in SUDS3 purifications, suggesting that overexpression of LACC1 and MXD1 might be needed to capture these interactions. Retinol Binding Protein 1 (RBP1), recently reported in association with SAP30 in HeLa cell purifications (1), was also not detected in our Flp-In™-293 Halo-SAP30 purifications.
In addition to comparing the components captured by our Halo-tagged baits with interactions listed in the STRING database, we asked whether our baits captured proteins previously detected in FLAG-SAP30 and FLAG-SAP30L purifications by Sardiu et al. (5) (Fig. 4B). Except for the transcription factor BBX, we detected all proteins that had been previously captured by both FLAG-SAP30 and FLAG-SAP30L in all our Halo purifications (Fig. 4B). Curiously, we did not detect many of the proteins that were previously found only in FLAG-SAP30L (but not in FLAG-SAP30) purifications (Fig. 4B). As we had used a different approach to purify complexes using Halo-tagged baits, the absence of these proteins might result from the loss of weakly associated interactors under our modified purification conditions. Our Halo-MudPIT approach for capturing high-confidence Sin3 complex interactions recapitulates most information annotated in the STRING database (Fig. 4A) as well as most previously reported interactions defined using FLAG-tagged Sin3 subunits (5). Taken together, the analyses of Fig. 4 indicate that our Halo AP-MS procedure isolates Sin3 complexes containing most Sin3 components isolated previously in other studies.

Quantifying Relative Sin3 Subunit Composition for Different Bait Purifications with the Halo-MudPIT Approach-Previously, we found evidence that populations of Sin3 complexes appear to be heterogeneous, with some subunits not appearing together in the same complex. For example, although complexes purified through the SAP30 subunit contain most other Sin3 subunits, they do not contain the homologous subunit SAP30L; similarly, SAP30L complexes do not contain SAP30 (5). When we compare Sin3 complexes purified using different baits, we would like to ask whether each bait associates with a similar set of Sin3 subunits, or whether there is variation in subunit preference among baits. To estimate the relative levels of different subunits within specific complexes, some previous studies have used complex normalized spectral abundance factor (cNSAF) values instead of dNSAF values (40,41). When we calculated cNSAF values for Sin3 complexes purified from cells stably expressing Halo-SUDS3, Halo-SAP30, and Halo-SAP30L, we noticed that Halo-SUDS3 purifications contained lower cNSAF values than the other baits for several subunits including SIN3A, HDAC1, ING1/2, and FAM60A (Fig. 5A (upper bar graph)). We thought that such differences might result from technical differences among purifications using different bait proteins. Although our AP-MS strategy uses the CMVd2 promoter to limit the effects of overexpressing bait proteins, we still observe differences in expression among different baits. Some baits might capture comparatively more of the endogenous Sin3 complexes than others, either because they are expressed at a higher level or because they are more efficiently incorporated into Sin3 complexes (Fig. 5B). Alternatively, some bait proteins (SUDS3) might dissociate faster from the Sin3 complex than other baits (SAP30 or SAP30L), resulting in proportionally more bait than prey in these purifications.
Therefore, if we compared the cNSAF values for a given prey captured by three different baits, it would be difficult to assess whether a particularly low (or high) cNSAF value for a given bait prey combination reflected abnormal bait expression/incorporation into Sin3 complexes, or alternatively reflected a genuine antagonism for this combination of subunits in cells (for example, SAP30 and SAP30L not coexisting in the same Sin3 complex).
With these two possibilities in mind, we decided to calculate the ratio of the cNSAF value of each prey to the cNSAF value of a key complex subunit (in this case we chose SIN3A as the key subunit) (Fig. 5A (lower bar graph) and Fig. 5B). If different baits were to capture heterogeneous populations of SIN3A containing complexes without preference for subunit combinations, we would expect the ratios cNSAF_PREY/cNSAF_SIN3A to be the same for different baits (Fig. 5B). We calculated the ratios cNSAF_PREY/cNSAF_SIN3A for the baits Halo-SUDS3, Halo-SAP30 and Halo-SAP30L (Fig. 5A (lower bar graph)) and found that the cNSAF_PREY/cNSAF_SIN3A ratios for the preys HDAC1, ING1/2, and FAM60A varied less among the three baits than the original cNSAF values did (compare Fig. 5A (upper bar graph) with Fig. 5A (lower bar graph)). This is consistent with a model where the lower cNSAF values for some Halo-SUDS3 preys are caused by Halo-SUDS3 capturing proportionally lower amounts of SIN3A containing complex, rather than by a preference of Halo-SUDS3 for SIN3A containing complexes lacking these subunits. In contrast, this analysis indicated that Halo-SAP30 captured minimal amounts of endogenous SAP30L, and that Halo-SAP30L captured minimal amounts of endogenous SAP30 (Fig. 5A (lower bar graph)), corroborating our previous findings that these subunits likely do not coexist in the same Sin3 complexes (5). Although we chose the cNSAF approach here, repeating the analysis using dNSAF rather than cNSAF values gives similar results (supplemental Fig. S4).
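The effect of this normalization can be illustrated with a small worked example. The sketch below assumes that cNSAF is computed like NSAF (spectral count divided by protein length, then divided by the sum of those values) with the sum restricted to the subunits of the complex, and it ignores the redistribution of shared spectra used for dNSAF. The function names, the uniform protein length, and the spectral counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Dict


def cnsaf(spectral_counts: Dict[str, float], lengths: Dict[str, float]) -> Dict[str, float]:
    """Length-normalized spectral counts, scaled to sum to 1 over the complex."""
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: v / total for p, v in saf.items()}


def normalize_to_key_subunit(values: Dict[str, float], key: str = "SIN3A") -> Dict[str, float]:
    """Express each subunit's cNSAF relative to the key subunit's cNSAF."""
    ref = values[key]
    if ref == 0:
        raise ValueError(f"{key} was not detected; the ratio is undefined")
    return {p: v / ref for p, v in values.items()}


# Hypothetical data: the same length for every protein, for simplicity.
lengths = {"BAIT": 100.0, "SIN3A": 100.0, "HDAC1": 100.0}

# Bait A is well incorporated; bait B is recovered in large excess over the
# complex, but the captured complex has the same HDAC1:SIN3A composition.
counts_a = {"BAIT": 100.0, "SIN3A": 100.0, "HDAC1": 50.0}
counts_b = {"BAIT": 300.0, "SIN3A": 50.0, "HDAC1": 25.0}

cnsaf_a, cnsaf_b = cnsaf(counts_a, lengths), cnsaf(counts_b, lengths)
ratio_a = normalize_to_key_subunit(cnsaf_a)
ratio_b = normalize_to_key_subunit(cnsaf_b)

print("cNSAF HDAC1:", cnsaf_a["HDAC1"], cnsaf_b["HDAC1"])
print("HDAC1/SIN3A:", ratio_a["HDAC1"], ratio_b["HDAC1"])
```

In this toy case the raw cNSAF value for HDAC1 differs threefold between the two baits, whereas the ratio to SIN3A is the same for both. This is the behavior expected when the baits capture different proportions of an otherwise identical complex.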
Intriguingly, this analysis suggested that SIN3B associated with Halo-SUDS3 but not Halo-SAP30 or Halo-SAP30L. To ask whether SIN3B does indeed associate with SUDS3, but not with SAP30, we expressed either Halo-SIN3A, Halo-SIN3B or Halo tag alone in HEK293T cells and isolated complexes using Halo affinity purification (Fig. 5C). Consistent with the results of Fig. 5A, Halo-SIN3A captured both endogenous SUDS3 and SAP30. In contrast, Halo-SIN3B captured SUDS3 but significantly less SAP30 (Fig. 5C).
In summary, by normalizing to the cNSAF value of a key complex subunit (cNSAF_KS), and then comparing the cNSAF_PREY/cNSAF_KS ratios across different baits, we can address whether different baits associate with Sin3 complexes containing distinct combinations of subunits.
Differential Network Analysis of "Transient Transfection" and "Stable Cell Line" Networks-Having used two different methods to analyze Sin3 associated proteins, first using transiently transfected cells overexpressing Sin3 subunits, and second using larger numbers of cells stably expressing Sin3 subunits at lower levels, we decided to compare the resulting networks of Sin3 subunit interactions mapped using each method. To do this we first generated two networks made using either transiently transfected Halo-SUDS3 and Halo-SAP30 (transient network), or with stably expressing Halo-SUDS3 or Halo-SAP30 cells (stable network). We used the QSPEC parameter log2 FC (42) to quantitate prey proteins enriched with the bait proteins in these networks. We next compared these two networks using the DyNet tool (39) to generate the differential network shown in Fig. 6. This differential network indicates the differences in log2 FC values (Δlog2 FC) between the transient and stable networks by edge color and width; node colors indicate whether prey proteins are present in the transient network or stable network only, or in both networks.
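The logic of such a differential network can be sketched in a few lines. The example below is not the DyNet implementation: it assumes each network is a mapping from a (bait, prey) pair to a log2 FC value, computes Δlog2 FC as stable minus transient (an arbitrary sign convention chosen here), and reports it only for edges present in both networks. All names and values are hypothetical.

```python
from typing import Dict, Optional, Tuple

EdgeKey = Tuple[str, str]  # (bait, prey)


def differential_network(
    transient: Dict[EdgeKey, float],
    stable: Dict[EdgeKey, float],
) -> Dict[EdgeKey, Tuple[str, Optional[float]]]:
    """Classify each edge and compute delta log2 FC for shared edges."""
    result: Dict[EdgeKey, Tuple[str, Optional[float]]] = {}
    for edge in transient.keys() | stable.keys():
        in_transient, in_stable = edge in transient, edge in stable
        if in_transient and in_stable:
            # Positive values mean stronger enrichment in the stable network.
            result[edge] = ("both", stable[edge] - transient[edge])
        elif in_stable:
            result[edge] = ("stable_only", None)
        else:
            result[edge] = ("transient_only", None)
    return result


# Toy log2 FC values (made up for illustration).
transient = {("SAP30", "HDAC1"): 3.0, ("SAP30", "PREY_R"): 2.5}
stable = {("SAP30", "HDAC1"): 5.5, ("SAP30", "PREY_S"): 4.0}

diff = differential_network(transient, stable)
for edge, (status, delta) in sorted(diff.items()):
    print(edge, status, delta)
```

In a plotted network, the status would determine node color and the Δlog2 FC value would determine edge color and width.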
Subunits of the Sin3 complex (diamond nodes) and several previously identified Sin3 associated proteins (hexagonal nodes) including the transcription factors FOXK1 and FOXK2, and the trinucleotide repeat containing protein TNRC18 (43,44), were present in both transient and stable networks. In addition, the previously identified Sin3 associated protein BAHCC1 (44), a TNRC18 paralog, was identified only in the stable network. Of these 21 Sin3 subunits/associated proteins, most (17) showed greater QSPEC log2 FC values in the stable network (pink edges, Fig. 6), suggesting a greater enrichment of bona fide Sin3 complex components in the stable cell line purifications. One hundred fifty-three proteins were present in the transient network only. We used the DAVID annotation tool (27) to ask whether this group of proteins was enriched with gene ontology terms in the "cellular component" category and found significant enrichment of the terms "ribosome" (54 proteins, adjusted p = 1.9 × 10⁻⁶⁹) and "nucleolus" (68 proteins, adjusted p = 1.4 × 10⁻⁴⁶). Ribosomal proteins are assembled onto rRNA during ribosomal biogenesis in the nucleolus (45). It is possible that many of the proteins that we found only in purifications using transiently transfected cells are associating with the bait protein indirectly via interactions with nucleic acids. Although stable cell line lysates were treated with a nuclease that would mitigate such interactions, transiently transfected cells were lysed without adding nuclease. We found several proteins not previously identified as Sin3 associated proteins in the stable network. These included several DNA repair proteins (XRCC5, MSH2 and PRKDC) and components of the proteasome (PSMC2 and PSMD2). Finally, to ensure that the major differences between the two networks did not result from differences in the numbers of replicates used for the stable cell line and the transient transfection network analyses, we repeated the analysis of Fig. 6 using only 3 biological replicates for all Halo tagged baits and using 6 biological control replicates for both transient transfection and stable cell line networks (supplemental Fig. S5). The results of this complementary analysis are similar to, and support, the analysis of Fig. 6.

DISCUSSION
This paper describes an AP-MS based workflow using Halo affinity capture and MudPIT mass spectrometry for systematically mapping interactions among components of large protein complexes: we then use the workflow to determine interactions among several components of the human Sin3 complex.
Our affinity purification approach offers some advantages over traditional affinity capture methods, which use antibodies that bind small peptide tags. For example, the Halo tag becomes covalently attached to its corresponding ligand allowing complexes to be rapidly concentrated on the immobilized beads (46). This may be propitious for the capture of low abundance complexes as, in contrast to antibody/antigen interaction kinetics, the rate of dissociation of the Halo tag from the affinity resin is zero (47). Indeed, a previous comparison between Halo and FLAG affinity capture methods supports increased detection of bait interacting proteins for identical baits and decreased detection of proteins nonspecifically in control purifications when using the Halo versus the FLAG tag (47). Covalent capture reduces the need for dual tag strategies such as the TAP tag (48), which are sometimes used to increase purity, as well as the need to concentrate nuclear proteins by preparing nuclear extracts prior to affinity purification (49). Importantly, covalent capture using the Halo tag enables purification of proteins expressed at modest levels using the relatively weak CMVd2 promoter, reducing gene dosage artifacts (21).
In developing the workflow that we used to build suitable reagents for AP-MS analyses, we anticipated, and tried to avoid, several possible sources of systematic error. First, although we used a system designed to integrate our recombinant bait at a predetermined genomic locus to create isogenic cell lines, we still considered the possibility of variation among different clonal isolates. For example, Fukushige and Sauer (50) had previously used targeted DNA integration to make β-galactosidase expressing cell lines and noted higher β-galactosidase expression in some anomalous transformants. They identified chromosomal duplication in cells from one such transformant as a possible source for the difference in phenotype. To reduce the impact of possible artifacts resulting from abnormal cell lines, we initially purified complexes from three distinct clonal isolates for each Sin3 subunit analyzed (in some cases only two clonal isolates were represented in final AP-MS analyses; for example, one Halo-SAP30L replicate was excluded from our final analysis-see Experimental Procedures). A comparison of the three clonal isolates used for 5 replicate Halo-SAP30 AP-MS analyses (supplemental Fig. S6) did not suggest consistent differences in prey dNSAF values that depended on the clonal isolate used. In addition to taking measures to mitigate biological artifacts from abnormal cell lines, we also considered the impact of systematic technical artifacts that might be introduced during purification. Indeed, Leek et al. had previously stressed the negative impact of such batch effects for studies using high throughput technologies (51). As small amounts of buffer contaminants could result in spurious protein identifications with the sensitive protein identification technology that we were using, we avoided situations where all replicates corresponding to a specific cell line were analyzed in parallel.
For example, four people performed the 18 purifications used for the stable cell line network on 14 separate dates with no similar samples processed together by the same person.
Having acted to reduce systematic errors in analyses using cells stably expressing a bait, we noticed reproducible differences between these analyses and similar analyses using transiently transfected cells and an alternative purification protocol. Although we have not systematically investigated the causes for these differences, we believe there are several factors that might explain them. First, the Halo purification protocol for affinity purifying complexes from transiently transfected cells did not use a nuclease in the lysis buffer. Additional proteins could reflect genuine specific nucleic acid mediated interactions that are captured in the purifications using the transiently transfected lysates. This could explain the presence of RNA binding proteins in these purifications that were not detected in the equivalent purifications using stable cell line lysates. Nucleic acids might also regulate protein-protein interactions by binding to proteins and changing their secondary structure. For example, the TIP5 subunit of the chromatin remodeling complex NoRC, which targets HDAC1 containing complexes to ribosomal gene promoters (52), binds to pRNA, promoting a more open configuration exposing new interaction surfaces (53). Other differences among the purification protocols might also explain why some interactions are observed only in the transient transfection experiments. For example, capture of some interactors may depend on the salt concentration in the lysis buffer (here the buffer used with transiently transfected cells contained 150 mM NaCl compared with 0.42 M NaCl in the buffer used to lyse cells stably expressing the bait). Finally, the differences between the two analysis methods might simply result from substantial overexpression of the bait in the transiently transfected cells. Gibson et al. have suggested that transiently overexpressed baits can misfold (21), and misfolded or partially folded proteins can result in aberrant bait-prey interactions (54).
To summarize, our observations suggest that the set of proteins identified in AP-MS experiments can change appreciably with different purification protocols; a larger systematic study could determine how different factors influence such changes.
As we focused specifically on components of the Sin3 complex enriched in the stable cell line purifications, we observed a "bait effect" with marked differences in absolute values of cNSAF (or dNSAF) for Sin3 subunits captured using different baits (Fig. 5). Normalizing cNSAF values to the cNSAF value of a key complex subunit can moderate these effects and can help assess whether certain combinations of subunits are favored. Curiously, this analysis indicates that although Halo-SUDS3 appears to assemble into both SIN3A and SIN3B based complexes, Halo-SAP30 and Halo-SAP30L appear to be preferentially used with SIN3A based complexes (Fig. 5A (lower bar graph) and Fig. 5C). None of our purifications contained the subunits SAP18 and SAP25. Although SAP18 was initially identified through its association with Sin3/HDAC complexes (30), Zhang et al. also observed that the interaction depended on exogenous overexpression of SAP18 in HEK293T cells. Schwerk et al. later found that the majority of SAP18 is associated with the RNA processing complex ASAP, and suggested that only a small proportion of endogenous SAP18 might associate with the Sin3 complex (55). We did not detect Sin3 components by Western blotting in either Halo-SAP18 or SAP18-Halo purifications using HEK293T cells (Fig. 2E). In contrast to SAP18, SAP25 appears to be a definitive Sin3/HDAC component but may only be expressed and incorporated into the Sin3 complex in response to distinct cellular signals. Although SAP25 binds to the PAH1 domain at the N terminus of the SIN3A scaffolding protein (56), it was found to be mainly cytoplasmic in HeLa cells (28), and its nuclear involvement with Sin3/HDAC complexes may depend on Ras-regulated signal transduction pathways. The absence of SAP25 in our purifications was likely due to low expression of endogenous SAP25 in HEK293T cells (Fig. 2D); indeed, we were able to capture Sin3 components with exogenously expressed Halo-SAP25 (Fig. 2E).
To summarize, we report a robust Halo-MudPIT based workflow for mapping protein-protein interactions among components of large protein complexes and show how the workflow can be applied to identify high-confidence interactions among components of protein complexes such as the Sin3/HDAC complex. Determining the architecture of such complexes is challenging, as cells assemble bespoke versions of the Sin3/HDAC complex with distinct combinations of paralogous subunits for different situations (29). The Halo-MudPIT approach provides a platform for addressing these challenges and developing more refined understanding of Sin3/HDAC subunit organization and usage.
Original data underlying this manuscript can be accessed from the Stowers Original Data Repository at http://www.