Differential complex formation via paralogs in the human Sin3 protein interaction network

Despite the continued analysis of HDAC inhibitor efficacy in clinical trials, the heterogeneous nature of the protein complexes they target limits our understanding of the beneficial and off-target effects associated with their application. Among the many HDAC protein complexes found within the cell, Sin3 complexes are conserved from yeast to humans and likely play important roles as regulators of transcriptional activity. The functional attributes of these protein complexes remain poorly characterized in humans. Contributing to the poor definition of Sin3 complex attributes in higher eukaryotes is the presence of two Sin3 scaffolding proteins, SIN3A and SIN3B. Here we show that paralog switching influences the interaction networks of the Sin3 complexes. While SIN3A and SIN3B do have unique interaction network components, we find that SIN3A and SIN3B interact with a common set of proteins. Additionally, our results suggest that SIN3A and SIN3B may possess the capacity to form hetero-oligomeric complexes. While one principal form of SIN3B exists in humans, the analysis of rare SIN3B proteoforms provides insight into the domain organization of SIN3B. Together, these findings shed light on the shared and divergent properties of human Sin3 proteins and highlight the heterogeneous nature of the complexes they organize.


SUMMARY
Over 13,000 protein-coding genes within the human genome, or 70%, have at least one paralog (1). The acquisition of additional copies of a gene through duplication events provides opportunities for the development of unique gene products with distinct regulatory mechanisms (2). Functional divergence can result from gene duplications, and protein paralog identity can influence the composition of large protein complexes (3). However, the consequences of paralog switching are largely overlooked during the characterization of proteins, protein complexes, and protein interaction networks.
Classically associated with transcriptional repression, the removal of histone lysine acetyl groups by the Sin3 histone deacetylase (HDAC) complexes represents a central mechanism whereby transcriptional status is regulated (4). Named for the scaffolding protein of the complex, Sin3 complexes are well studied in Saccharomyces cerevisiae (5,6). However, the presence of additional components not found in lower eukaryotic forms of the Sin3 complexes likely increases the diversity of complex function in higher eukaryotes. Contributing to this expansion of components is the acquisition of paralogous genes encoding Sin3 proteins. The two Sin3 paralogs present within mammals, SIN3A and SIN3B, have undergone significant divergence and maintain only 63% sequence similarity at the protein level in humans.
Genetic deletion of murine Sin3a results in early embryonic lethality whereas deletion of Sin3b induces late gestational lethality (7,8). That SIN3A and SIN3B cannot compensate for the loss of one another provides evidence for paralog-specific functions within mammals and suggests that variations of the Sin3 complex have functional consequences. In addition to having critical roles during mammalian development, SIN3A and SIN3B also have contrasting influences on breast cancer cell metastasis. SIN3A was identified as a suppressor of metastasis, whereas SIN3B was proposed to be pro-metastatic (9). However, the mechanisms responsible for divergent influences on development as well as cancer cell metastatic potential remain poorly understood.
SIN3A has been identified as one of the 127 most significantly mutated genes across multiple cancer types (10) and is considered a cancer driver gene (11). Not surprisingly, FDA-approved HDAC inhibitors (HDACis) are effective constituents of multicomponent cancer treatment therapies. While these compounds target the activity of the Sin3 complex catalytic subunits HDAC1/2 (12), heterogeneity within the population of HDAC complexes targeted by these compounds is likely responsible for the well-documented off-target effects associated with HDACis (13) and ultimately prevents our complete understanding of the mechanisms through which these compounds function. The evidence that SIN3A and SIN3B differentially influence metastatic potential, along with the existence of FDA-approved chemotherapeutic agents that target Sin3 complex function, warrants further investigation into the unique properties of complexes containing SIN3A and/or SIN3B as components. Here, we characterize human SIN3A and SIN3B, highlighting the unique and shared properties of Sin3 paralogs in humans.
Together, our results shed light on the molecular targets of chemotherapeutic HDAC inhibitors and highlight consequences associated with paralog switching on Sin3 complex composition.
Recombinant proteins were affinity-purified and analyzed by MudPIT (Tables S1, S2A-C) as previously described (15) (19)(20)(21). Notably, all of these proteins were enriched by SIN3A (Fig. 1E-F). Among these proteins, only SUDS3 and BRMS1L met statistical criteria for enrichment by SIN3B_2 (Fig. 1E). While peptides mapping to BRMS1, ING1/2, and SAP30/SAP30L were observed following SIN3B purification (Fig. 1F), these proteins did not meet criteria for enrichment. These data suggest that SIN3B_2 may interact with at least a subset of proteins with homology to Rpd3L-specific components.
The yeast Rpd3S complex also contains components with homology to human proteins: yeast Rco1 and Eaf3 have homology to human PHF12 and MORF4L1, respectively (22,23).
PHF12 was specifically enriched by SIN3B_2 but not SIN3A. GATAD1, a PHF12 interaction partner, was also specifically enriched by SIN3B_2. SIN3A-purified samples were devoid of peptides that mapped to PHF12 (Fig. 1F). Together, these data suggest that SIN3A and SIN3B_2 interact with proteins that are homologous to yeast Rpd3L-specific components, but only SIN3B_2 interacts with proteins that are homologous to Rpd3S-specific components.
To assess the characteristics of each isoform, we expressed SIN3B_1 and SIN3B_3 fused
Previous studies of mouse SIN3A identified a 327-residue HDAC interaction domain (HID) that is essential and sufficient for interactions with HDAC2 (24). Though a HID within SIN3B has not been experimentally defined, alignment of SIN3A and SIN3B isoform proteins reveals that a region of SIN3B_2 has high homology to the experimentally defined SIN3A HID (Fig. 2A, S2). The additional exon found within SIN3B_1 resides in the region that aligns with the SIN3A HID, whereas the HID region in SIN3B_3 is shorter, missing about one-fifth of its N-terminus (Fig. 2A
HaloTags were transiently expressed in 293T cells for subsequent protein isolation and HDAC activity assays (Fig. 3D). Activity of the purified protein complexes was assessed with a fluorometric HDAC activity assay. As SIN3B_1 protein levels were consistently lower than those of SIN3B_2 (Fig. 3D), activity was normalized to bait protein abundance (Fig. 3E, S3, Table S3). The enzymatic activity of SIN3B_1-purified samples was consistently lower than that of purified complexes containing recombinant SIN3B_2 (Fig. 3E). Of note, the HDAC activity of both SIN3B complexes was almost completely inhibited by suberoylanilide hydroxamic acid (SAHA) (Fig. 3E). These data suggest that, like the HID region in SIN3A, this region influences the construction of catalytically active SIN3B-containing complexes.
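The normalization of HDAC activity to bait protein abundance described above can be illustrated with a minimal sketch. The function name, the background-subtraction step, and all numeric values below are assumptions made for illustration only; they are not taken from the assay data in Fig. 3E or Table S3.

```python
def normalize_activity(raw_signal, background, bait_abundance):
    """Background-subtract a fluorescence reading and divide by bait abundance.

    Background subtraction is an assumed step; the text states only that
    activity was normalized to bait protein abundance.
    """
    if bait_abundance <= 0:
        raise ValueError("bait abundance must be positive")
    return (raw_signal - background) / bait_abundance


# Hypothetical readings (arbitrary fluorescence units) and relative bait levels.
samples = {
    "SIN3B_1": {"raw": 1200.0, "bait": 0.5},
    "SIN3B_2": {"raw": 5200.0, "bait": 1.0},
}
background = 200.0

normalized = {
    name: normalize_activity(s["raw"], background, s["bait"])
    for name, s in samples.items()
}
```

Dividing by bait abundance keeps a poorly expressed bait from appearing less active merely because less of it was purified, so any remaining difference reflects activity per unit of bait.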

The Sin3 HID influences complex assembly-
While the HID of Sin3 proteins influences interactions between HDAC1/2 and Sin3 proteins, we next sought to determine what influence alterations to this region in SIN3B isoforms had on other protein-protein interactions.
In addition to HDAC1/2, SIN3A and SIN3B both interact with RBBP4/7. Unlike interactions with HDAC1/2, SIN3B_1 and SIN3B_3 did not display a decreased capacity to interact with RBBP4/7 compared to SIN3B_2 (Fig. 4A, Table S2E-F). The enrichment of RBBP4/7 by SIN3B_3 suggests that the C-terminal half of SIN3B is sufficient for association with these proteins. Of note, RBBP4 isoform c (RBBP4_c, NP_001128728.1) was consistently observed as an interaction partner of SIN3B_3 but not of other SIN3B isoforms (Fig. 4A, Table S2E-F). cNLS Mapper predicts that a nuclear localization signal with a score of 3.8 is present in RBBP4 isoform a (RBBP4_a, NP_005601.1) but is absent in isoform c (Fig. S4). As SIN3B_3 was observed exclusively within the cytoplasm, this interaction may represent an interaction between cytosolic proteins.
Proteins that have homology to Rpd3S- and Rpd3L-specific components also displayed differential enrichment by the SIN3B isoforms (Fig. 4B). Rpd3L-specific component homologs SUDS3 and BRMS1L were both consistently less abundant following SIN3B_1 and SIN3B_3 purifications compared to SIN3B_2-purified samples (Fig. 4B and Table S2E-F). Further, SIN3A met criteria for enrichment by SIN3B_2 (Fig. 1E, Table S2E) but not by SIN3B_1 or SIN3B_3 (Table S2E), suggesting that Sin3 hetero-dimerization is also influenced by the SIN3B HID.
In addition to Rpd3L-specific component homologs, the Rpd3S-specific component homolog PHF12 and its interaction partner GATAD1 were also less abundant in SIN3B_1-purified samples and barely detected in the SIN3B_3 pull-downs compared to SIN3B_2-purified samples (Fig. 4B, Table S2E-F). These data suggest that the HID region likely influences the organization of both major Sin3 interaction networks but is not required for interactions between SIN3B and RBBP4/7.
KPNA4, and KPNB1, which are involved in the import of nuclear proteins (Fig. 1E-F, 4C, Table S2E). Notably, SIN3B_1 and SIN3B_2 contain a sequence predicted by cNLS Mapper (25) to be a bipartite nuclear localization signal (NLS) (Fig. 5A, S2). This sequence is absent within SIN3B_3 (Fig. 5A, S2) and this isoform failed to enrich karyopherin proteins (Fig. 4A-B, Table S2E).

SIN3A and SIN3B have divergent nuclear localization signals-
To test the accuracy of the predicted NLS sequence in SIN3B isoforms, basic residues within the predicted bipartite sequence in SIN3B_2 were mutated to alanine residues. Basic residues found within SIN3A that align with the predicted SIN3B NLS were also mutated. Open reading frames encoding wild-type (
Surprisingly, the introduction of mutations to homologous residues in SIN3A did not inhibit the nuclear localization of SIN3A (Fig. 5F-G). The observation that KPNA2/3/4 and KPNB1 were not enriched by SIN3A purification (Fig. 1E, 4C, Table S2E) supports the conclusion we derived from the mutational analyses of nuclear localization signals: SIN3A and SIN3B_1/2 are imported to the nucleus via distinct molecular interactions (Figure 6).
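The design of such alanine-substitution mutants can be sketched in a few lines. This is an illustrative sketch only: the function name, the toy sequence, and the window coordinates are hypothetical and do not correspond to the SIN3B or SIN3A sequences or to the constructs used in this study. Only lysine (K) and arginine (R) are treated as basic here, which is an assumption.

```python
def mutate_basic_to_alanine(sequence, start, end):
    """Replace K and R with A within sequence[start:end] (0-based, end exclusive).

    Residues outside the window are left untouched.
    """
    region = sequence[start:end]
    mutated = "".join("A" if aa in "KR" else aa for aa in region)
    return sequence[:start] + mutated + sequence[end:]


# Hypothetical toy sequence with two basic clusters separated by a linker,
# loosely resembling the layout of a bipartite NLS.
toy = "MSGKRPLEDSSAGTPKKKRKVQ"
mutant = mutate_basic_to_alanine(toy, 3, 20)
```

Restricting the substitution to a window keeps basic residues elsewhere in the protein intact, so any change in localization can be attributed to the candidate signal rather than to unrelated charge changes.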

Discussion:
To understand the therapeutic potential of targeting Sin3 complex function, we must first characterize the heterogeneous population of Sin3 HDAC complexes. Through a comparative analysis of human Sin3 protein forms, we highlight the influence of paralog switching on complex composition and identify the shared and unique features of Sin3 protein paralogs.

SIN3B is a component of two protein interaction networks-
The single S. cerevisiae Sin3 protein is partitioned into two distinct protein complexes, known as Sin3S (or Rpd3S) and Sin3L (or Rpd3L) (5,6).
mapping to these proteins were observed following SIN3B_2 purification but that these proteins did not meet criteria for enrichment (Fig. 1F, Table S2E). Thus, interactions between SIN3B and these proteins may be weak or indirect. As SIN3A was enriched by SIN3B_2, a possible explanation for the weak identification of these proteins following SIN3B_2 purification is that indirect interactions between SIN3B_2 and FAM60A, TNRC18, ARID4A/B, and SAP130 could be mediated by SIN3A (Fig. 6). While isoform 2 (NP_001284524) likely represents the principal SIN3B isoform in humans, alternative splicing produces multiple SIN3B isoforms in mouse (38,39) and it was reported that ratios of human SIN3B gene products are responsive to cellular cues (38).

Protein domain organization is partially conserved between SIN3A and SIN3B-
Though alternative isoforms of SIN3B are likely a small portion of the total SIN3B within humans, these alternative proteoforms provide insight into the domain organization of SIN3B.
SIN3B_1 has an additional exon within a region that has high homology to the SIN3A HID (Fig. 2A, S2). Therefore, we used this alternative SIN3B proteoform to provide insight into the dependence of SIN3B protein-protein interactions on the SIN3B HID. We demonstrate that isoform 1, which contains a 32-residue sequence not found within isoform 2, has a decreased catalytic potential as well as weaker associations with Sin3 network components (Fig. 2, Table S2). It should be noted that isoform 1 was still capable of enriching HDAC1/2 (Table S2). Thus, the addition of this exon does not completely abolish interactions with HDAC1/2 but does diminish both HDAC1/2 binding and the complex's catalytic capacity (Fig. 3C, 3E).
A region within SIN3B that has sequence homology to a portion of the SIN3A HID is critical for interactions between SIN3B and PHF12 (31). Our results support this observation as the additional exon found within SIN3B_1 disrupts interactions between SIN3B and PHF12 (Fig. 5F). Additionally, we observed poor interactions between SIN3B_1 and homologs of yeast Sin3L components SUDS3 and BRMS1L (Fig. 3A, 4B, Table S2). Results obtained by others suggest that the HID region is critical for interactions between SIN3A and SUDS3 (21).
Together, these results suggest that the organizing role of the HID region is conserved between SIN3A and SIN3B. As SUDS3/BRMS1 and PHF12 were less abundant in SIN3B_1 compared to SIN3B_2 enrichments, the HID is likely important for the organization of both forms of the Sin3 complex in humans.
Notably, SIN3A was less abundant following SIN3B_1 enrichment compared to SIN3B_2 enrichment (Fig. 4B, Table S2E). Thus, the HID appears to be critical for hetero-oligomerization of the Sin3 complex. Previous findings suggest that yeast Sds3 is essential for complex integrity (17) and that mammalian BRMS1 (32) and SUDS3 (21,32) are capable of forming dimers. As SUDS3 and BRMS1L were enriched by both SIN3A and SIN3B_2, it is possible that these proteins mediate the formation of SIN3A-SIN3B_2 hetero-oligomeric complexes.
SIN3B_3 provides us with insight into the mechanisms responsible for SIN3B nuclear import. This isoform, resulting from an alternative start codon, is significantly shorter than SIN3B_1/SIN3B_2 and has absent or disrupted PAH and HID domains (Fig. 2A). Interestingly, recombinant SIN3B_3 failed to localize to the nucleus (Fig. 2C). This isoform lacks the predicted NLS, which prompted us to investigate whether this sequence is required for SIN3B nuclear import. Analysis of the SIN3B_3 interaction network revealed that, unlike other SIN3B isoforms, SIN3B_3 did not enrich KPNA2/3/4 or KPNB1. The introduction of mutations to the predicted NLS of SIN3B isoform 2 resulted in cytoplasmic localization; however, mutations to conserved or similar residues within SIN3A had no effect on nuclear localization. Thus, our findings indicate that SIN3A and SIN3B have distinct nuclear localization signals. Consistent with this, it has recently been shown that nonsense mutations at residue 949 in SIN3A result in a truncated protein with cytoplasmic localization (40), suggesting that the NLS of SIN3A resides within its C-terminus and is distinct from the SIN3B NLS. Additionally, SIN3A, unlike SIN3B_2, did not enrich KPNA2, KPNA3, KPNA4, or KPNB1 (Fig. 4C, Table S2E), further suggesting that SIN3A and SIN3B may experience unique modes of nuclear import (Fig. 6).
Not all identified proteins displayed differential interactions with SIN3B isoforms. In fact, RBBP4 and RBBP7 were consistently identified as the most abundant non-bait proteins in all SIN3B isoform purifications (Fig. 4A). As SIN3B_1 and SIN3B_3 enriched RBBP4/7, these results show that the C-terminal half of SIN3B is sufficient for interactions with these proteins.
While the HID may act as an organizing domain for much of the complex, RBBP4/7 do not depend on this region for interactions with SIN3B. Thus, while HDAC1/2 and RBBP4/7 likely form a shared core complex with Sin3 proteins in humans, they likely do so via interaction with distinct regions of Sin3 proteins. This is surprising as HDAC1/2 and RBBP4/7 co-exist within multiple HDAC complexes and have been proposed to form a preformed sub-module (41).
In total, these results provide insight into the shared and unique properties of human Sin3 scaffolding proteins. While a conserved HID is present in both proteins, the identity of the Sin3 protein found within complexes influences their interaction networks and, likely, composition.
We also show that hetero-oligomeric forms of the Sin3 complex can exist, as SIN3B enriches SIN3A. These findings highlight the influence of paralog switching on protein complex composition and outline the need for future studies that further delineate the unique functions of the distinct classes of Sin3 complexes. Future studies should take into account additional heterogeneity within the population of Sin3 complexes that is introduced by other complex subunits that exist as protein paralogs.

Acknowledgments:
Original data underlying this manuscript can be accessed from the Stowers Original Data
Cells were washed two times with Opti-MEM media before imaging in Opti-MEM media.

Affinity purification of recombinant proteins from Flp-In™-293 cells and Multidimensional Protein Identification Technology (MudPIT) Analysis of Transiently Expressed Protein-
Cells were lysed and recombinant proteins were isolated using Magne® HaloTag® Beads (Promega) as previously described (15). Affinity purified (AP) proteins were TCA precipitated and digested with Lys-C or rLys-C, Mass Spec Grade (Promega Corporation), followed by Sequencing Grade Trypsin (Promega Corporation). Peptides were loaded onto triphasic MudPIT microcapillary columns as previously described (42). Columns were placed in-line with an 1100 Series HPLC system (Agilent Technologies, Inc., Santa Clara, CA) coupled to a linear ion trap mass spectrometer (Thermo Fisher Scientific) and peptides were resolved using 10-step MudPIT chromatography as previously described (43).
Acquired .RAW files were converted to .ms2 files using RAWDistiller (44). ProLuCID v1.3.5 (45) was used to match spectra against a database containing human protein sequences (National Center for Biotechnology Information, June 2016 release) along with shuffled sequences for false discovery rate (FDR) estimation. The database was searched for fully tryptic peptides with a static modification of +57 daltons for cysteine and a dynamic modification of +16 daltons for methionine residues. DTASelect and Contrast (46) were used to filter results, and NSAFv7 (47) was used to calculate label-free quantitative dNSAF values and generate final reports (Tables S2A-B). The mean ± s.d. spectral FDR for the 19 MudPIT runs was 0.198% ± 0.105%, the mean ± s.d. peptide FDR was 0.355% ± 0.225%, and the mean ± s.d. protein FDR was 1.30% ± 0.769%. A DTASelect filter also established a minimum peptide length of 7 amino acids, and proteins that were subsets of others were removed using the parsimony option in Contrast.
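The two quantities referred to above, a decoy-based FDR and a normalized spectral abundance factor, can be illustrated with a simplified sketch. It is not the NSAFv7 or DTASelect implementation. The function names and example values are hypothetical, the sketch computes plain NSAF rather than dNSAF (which additionally distributes spectra shared between proteins), and the FDR formula of twice the decoy matches divided by the total matches is one common convention for shuffled-database searches, assumed here.

```python
def nsaf(spectral_counts, lengths):
    """Return NSAF per protein: (SpC / length) divided by the sum of SpC / length.

    spectral_counts and lengths are dicts keyed by protein identifier.
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectral counts in this run")
    return {p: value / total for p, value in saf.items()}


def decoy_fdr(n_decoy, n_total):
    """Estimate FDR as 2 * decoy matches / total matches (assumed convention)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 2.0 * n_decoy / n_total


# Hypothetical run: three proteins whose counts scale with their lengths.
counts = {"bait": 100, "prey_1": 50, "prey_2": 10}
lengths = {"bait": 1000, "prey_1": 500, "prey_2": 100}
abundances = nsaf(counts, lengths)

# Hypothetical run with 1 shuffled-sequence match among 1000 matches in total.
fdr = decoy_fdr(1, 1000)
```

Dividing spectral counts by protein length corrects for the tendency of longer proteins to yield more peptides, and dividing by the run total makes values comparable between runs of different depth.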
Data that has been previously described was included in our analyses and is summarized in Table S1. All mass spectrometry data has been deposited into the MassIVE repository (http://massive.ucsd.edu). Data set identifiers are supplied in Table S1.
Statistical analysis of proteomics data sets-
A minimum of three biological replicates were acquired for each affinity purification mass spectrometry (APMS) analysis. To identify high-confidence interaction partners, QSPEC v1.3.5 (16) was used to calculate Z-statistic, log2 fold change, and FDR values of identified proteins. Prey proteins that were not present in at least half of at least one bait protein purification (Table S2C)

Enzyme activity assays-
HDAC activity assays of transiently produced proteins were performed as described (49).
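The replicate-presence criterion mentioned under the statistical analysis of proteomics data sets can be sketched as follows. This is a hedged illustration, not the procedure used with QSPEC: the function name and example counts are hypothetical, and the sketch assumes that "present" means a nonzero spectral count in a replicate and that the criterion is evaluated per bait across its replicates. What was done with prey proteins failing the criterion is not stated in the surviving text, so the sketch only reports whether a prey passes.

```python
def passes_presence_filter(counts_by_bait, min_fraction=0.5):
    """Return True if the prey is detected in at least min_fraction of the
    replicates of at least one bait.

    counts_by_bait maps a bait name to a list of per-replicate spectral
    counts for a single prey protein.
    """
    for replicate_counts in counts_by_bait.values():
        if not replicate_counts:
            continue  # a bait with no replicates cannot satisfy the criterion
        detected = sum(1 for count in replicate_counts if count > 0)
        if detected / len(replicate_counts) >= min_fraction:
            return True
    return False


# Hypothetical spectral counts for two prey proteins, three replicates per bait.
prey_x = {"SIN3A": [0, 0, 0], "SIN3B_2": [3, 0, 5]}
prey_y = {"SIN3A": [1, 0, 0], "SIN3B_2": [0, 0, 0]}

keep_x = passes_presence_filter(prey_x)  # detected in 2 of 3 SIN3B_2 replicates
keep_y = passes_presence_filter(prey_y)  # detected in at most 1 of 3 replicates
```

Requiring reproducibility in only one bait, rather than in all of them, preserves paralog-specific interactors that are consistently seen with one Sin3 protein but absent from the other.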