Proteomic Analysis Reveals a Novel Mutator S (MutS) Partner Involved in Mismatch Repair Pathway*

The mismatch repair (MMR) family is a highly conserved group of proteins that function in correcting base-base and insertion-deletion mismatches generated during DNA replication. Disruption of this process results in characteristic microsatellite instability (MSI), repair defects, and susceptibility to cancer. However, a significant fraction of MSI-positive cancers express MMR genes at normal levels and do not carry detectable mutations in known MMR genes, suggesting that additional factors and/or mechanisms may exist to explain these MSI phenotypes in patients. To systematically investigate the MMR pathway, we conducted a proteomic analysis and identified MMR-associated protein complexes using the tandem affinity purification coupled with mass spectrometry (TAP-MS) method. The mass spectrometry data have been deposited to the ProteomeXchange with identifier PXD003014 and DOI 10.6019/PXD003014. We identified 230 high-confidence candidate interacting proteins (HCIPs). We subsequently focused on MSH2, an essential component of the MMR pathway, and uncovered a novel MSH2-binding partner, WDHD1. We further demonstrated that WDHD1 forms a stable complex with MSH2 and MSH3 or MSH6, i.e. the MutS complexes. The specific MSH2/WDHD1 interaction is mediated by the second lever domain of MSH2 and the Ala1123 site of WDHD1. Moreover, we showed that depletion of WDHD1, like loss of MSH2, led to 6-thioguanine (6-TG) resistance, indicating that WDHD1 likely contributes to the MMR pathway. Taken together, our study uncovers new components involved in the MMR pathway and provides candidate genes that may be responsible for the development of MSI-positive cancers.

nuclease activity plays a critical role in 3′-5′ excision involving EXO1. EXO1 then excises nascent DNA from the nick toward and beyond the mismatch to generate a single-strand gap, which is filled by DNA polymerase δ (lagging strand) or ε (leading strand) using the parental DNA strand as a template. Finally, the nick is sealed by DNA ligase I (19,20). In addition, two MutS homologues, MSH4 and MSH5, share similar structure and sequence features with the other members of the MutS family. Recent evidence suggests that they function beyond MMR and are involved in processes such as recombinational repair, DNA damage signaling, and immunoglobulin class switch recombination (21,22).
It has been well documented that impairment of MMR genes, especially MSH2 and MLH1, causes susceptibility to certain types of cancer, including hereditary nonpolyposis colorectal cancer. At the cellular level, deficient MMR results in a strong mutator phenotype known as microsatellite instability (MSI), which is a hallmark of MMR deficiency (3-5). However, a significant fraction of MSI-positive colorectal cancers express MMR genes at normal levels and do not carry detectable mutations or hypermethylation in known MMR genes (23). Similarly, certain noncolorectal cancer cells with MSI also appear to have normal expression of known MMR proteins (24,25). These observations suggest that additional factors and/or mechanisms may exist to explain these MSI phenotypes in patients.
To address this question, we performed tandem affinity purification coupled with mass spectrometry analysis (TAP-MS) to uncover MMR-associated protein complexes. Our proteomics study of the MMR family led to the discovery of many novel MMR-associated proteins, and gene ontology analysis expanded the roles of MMR in multiple biological processes. Specifically for MSH2, we uncovered a novel MutS binding partner, WDHD1, which associates with both MutSα (MSH2-MSH6 heterodimer) and MutSβ (MSH2-MSH3 heterodimer). We provide additional evidence suggesting that WDHD1 is involved in the MMR pathway and could serve as a potential biomarker for MSI phenotypes in cancer patients.
Antibodies-The anti-MSH2 antibody was obtained from Cell Signaling Technology. The monoclonal anti-FLAG M2, anti-β-actin, and anti-WDHD1 antibodies were purchased from Sigma-Aldrich. The anti-Myc (9E10) antibody was obtained from Covance.
Coprecipitation and Western blotting-Cells were lysed on ice for 20 min with NETN buffer (100 mM NaCl, 1 mM EDTA, 20 mM Tris-HCl, and 0.5% Nonidet P-40) containing protease inhibitors. The soluble fractions were collected after centrifugation and incubated for 4 h at 4°C with protein A agarose beads coupled with anti-MSH2 antibody or with S-protein beads. The precipitates were then washed and boiled in 2× sodium dodecyl sulfate (SDS) loading buffer. Samples were resolved by SDS-polyacrylamide gel electrophoresis (PAGE) and transferred to a polyvinylidene fluoride membrane, and immunoblotting was carried out with antibodies as indicated.
Clonogenic Survival Assays-Briefly, a total of 1 × 10^3 HeLa cells were seeded onto 60-mm dishes in triplicate. Twenty-four hours after seeding, cells were treated with different concentrations of 6-TG (0, 1, 3, and 8 µM) for 3 days, washed, and cultured in fresh medium. After 14 days, cells were stained with crystal violet and colonies were counted. Colony numbers were expressed as a percentage of the colonies formed in the absence of the drug. Results are the averages of data obtained from three independent experiments.
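The normalization used in the clonogenic assay (colonies as a percentage of the untreated dishes) can be sketched as follows. This is a minimal illustration, not the authors' analysis script; the function names and the colony counts are hypothetical and are not data from the study.

```python
from statistics import mean


def percent_survival(colony_counts, untreated_counts):
    """Express each dish's colony count as a percentage of the untreated mean."""
    baseline = mean(untreated_counts)
    if baseline <= 0:
        raise ValueError("untreated dishes must contain colonies")
    return [100.0 * count / baseline for count in colony_counts]


def summarize(counts_by_dose):
    """Map each 6-TG dose (micromolar) to the mean percent survival of its dishes."""
    untreated = counts_by_dose[0]
    return {
        dose: mean(percent_survival(counts, untreated))
        for dose, counts in counts_by_dose.items()
    }


# Hypothetical colony counts, three dishes per dose (invented for illustration).
example_counts = {
    0: [200, 190, 210],
    1: [150, 160, 140],
    3: [80, 90, 70],
    8: [20, 25, 15],
}
survival = summarize(example_counts)
```

A knockdown line that is resistant to 6-TG would show a flatter decline in `survival` across doses than the parental line.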
Tandem Affinity Purification-HEK293T cells were transfected with plasmids encoding various SFB-tagged MMR proteins. Stable cell lines were selected with media containing 2 µg/ml puromycin and confirmed by immunostaining and Western blotting. HEK293T cells stably expressing SFB-tagged MMR proteins were lysed with NETN buffer on ice for 20 min. After removal of cell debris by centrifugation, crude lysates were incubated with streptavidin Sepharose beads for 1 h at 4°C. The bead-bound proteins were washed three times with NETN buffer and eluted twice with 2 mg/ml biotin (Sigma) for 1 h at 4°C. The eluates were combined and then incubated with S-protein agarose (Novagen) for 1 h at 4°C. The S beads were washed three times with NETN buffer. The proteins bound to S-protein agarose beads were separated by SDS-PAGE and visualized by Coomassie Blue staining. The eluted proteins were identified by mass spectrometry analysis, performed by the Taplin Biological Mass Spectrometry Facility (Harvard Medical School).
Mass Spectrometry Analysis-Gel bands were excised, cut into small pieces, and destained completely. Disulfide bonds were reduced with 5 mM tris(2-carboxyethyl)phosphine (TCEP), cysteines were alkylated with 10 mM iodoacetamide (IAA), and the samples were digested with trypsin overnight at 37°C. The peptides were extracted with acetonitrile and vacuum dried. Samples were reconstituted in HPLC solvent A (2.5% acetonitrile, 0.1% formic acid), delivered onto a Proxeon EASY-nLC II liquid chromatography pump (Thermo Fisher, Waltham, MA), and eluted with an acetonitrile gradient in which solvent B (97.5% acetonitrile, 0.1% formic acid) increased from 6% to 30% over 30 min. The eluates directly entered an Orbitrap Elite mass spectrometer (Thermo Fisher) operated in positive ion mode and in a data-dependent manner, with full MS scans from 350 to 1250 m/z, a resolution of 60,000, and an automatic gain control target of 1 × 10^6. The top 10 precursors were then selected for MS2 analysis.
The MS/MS spectra were searched using SEQUEST (ver. 28) (Thermo Fisher). Spectra were converted to mzXML using a modified version of ReAdW.exe. Database searching included all entries from the human Uniprot database (March 11, 2014). This database was concatenated with one composed of all protein sequences in the reverse order. The number of entries in the database was 141,456. Searches were performed using a 50 ppm precursor ion tolerance for total protein level analysis. The product ion tolerance was set to 1 Da. Enzyme specificity was set to partially tryptic with two missed cleavages. Carboxyamidomethylation of cysteine residues (+57.021 Da) was set as a static modification, and oxidation of methionine residues (+15.995 Da) was set as a variable modification. The identified peptides were filtered to a false discovery rate < 1% based on the target-decoy method. The parameters XCorr, ΔCn, missed cleavages, peptide length, charge state, and precursor mass accuracy were considered for peptide-spectrum match (PSM) filtering using a linear discriminant analysis (28,29). Single peptide identifications were removed. The identified proteins and peptides are shown in Supplemental Tables S1 and S2.
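The target-decoy idea behind the < 1% false discovery rate filter can be sketched as follows. This is a simplified illustration under stated assumptions: each PSM is ranked by a single score, whereas the study combined several parameters through a linear discriminant analysis, and the false discovery rate is estimated as decoy hits divided by target hits. The function name and the example scores are hypothetical.

```python
def filter_by_fdr(psms, max_fdr=0.01):
    """Keep target PSMs above the loosest score cutoff whose estimated FDR <= max_fdr.

    psms: list of (score, is_decoy) tuples; a higher score means a better match.
    The FDR at a cutoff is estimated as (decoy hits) / (target hits) above it.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = 0
    decoys = 0
    keep = 0  # number of top-ranked PSMs that pass the cutoff
    for position, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            keep = position
    return [psm for psm in ranked[:keep] if not psm[1]]


# Hypothetical scores: 250 confident target hits, then a low-scoring tail
# that contains three reversed-sequence (decoy) hits and one target hit.
example_psms = [(1000 - i, False) for i in range(250)]
example_psms += [(500, True), (499, True), (498, True), (497, False)]
accepted = filter_by_fdr(example_psms)
```

In this example the third decoy pushes the estimated rate above 1%, so the low-scoring target at 497 is excluded along with the decoys.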
Mismatch Repair Protein Interactome Analysis-For the evaluation of potential protein-protein interactions, identified proteins and the corresponding PSM numbers (10 MMR protein baits with one biological repeat for each) were assessed using the CRAPome methodology. The CRAPome scoring strategy is based on a quantitative comparison of the abundance (spectral counts) of coprecipitating proteins in purifications with the bait against the distribution of prey abundances across a set of negative controls. This fold change (FC) score includes the primary score FC-A and the more stringent score FC-B. The FC-A calculation averages the counts across all controls, whereas the FC-B score takes the average of the three highest spectral counts for the abundance estimate (30). In this study, we used 233 TAP-MS datasets with randomly selected baits as the control group. An FC-B score higher than two was taken as the threshold for potential binding proteins. To further select for HCIPs, we chose the proteome profiling data of HEK293T whole-cell lysate as the background to assess the specificity of protein-protein interactions. The spectral count for each identified protein was normalized by the total spectral count. By comparison with this global expression background, only proteins that were enriched above the average enrichment fold following the TAP-MS procedure were included in the HCIP lists.
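The difference between the FC-A and FC-B control estimates can be illustrated with a short sketch. It is a simplification, not the CRAPome implementation: the pseudocount and the plain averaging of bait replicates are assumptions made here to keep the example small, and the spectral counts are invented.

```python
from statistics import mean


def fold_change_scores(bait_counts, control_counts, pseudocount=0.1):
    """Return (fc_a, fc_b) for one prey protein.

    bait_counts: spectral counts of the prey in the bait purifications.
    control_counts: spectral counts of the same prey across negative controls.
    pseudocount: assumed small constant that avoids division by zero.
    """
    bait_avg = mean(bait_counts)
    # FC-A style estimate: average over all controls.
    control_all = mean(control_counts)
    # FC-B style estimate: average of the three highest controls (more stringent).
    control_top3 = mean(sorted(control_counts, reverse=True)[:3])
    fc_a = (bait_avg + pseudocount) / (control_all + pseudocount)
    fc_b = (bait_avg + pseudocount) / (control_top3 + pseudocount)
    return fc_a, fc_b


# Hypothetical prey: seen in both bait runs and in only two of ten controls.
fc_a, fc_b = fold_change_scores([12, 8], [0, 0, 0, 0, 0, 0, 0, 0, 1, 2])
passes_threshold = fc_b > 2  # threshold on FC-B stated in the text
```

Because FC-B uses only the highest control counts, it is never larger than FC-A for the same prey, which is why it serves as the more conservative filter.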
The HCIPs of MMR proteins were analyzed with Cytoscape (31). We analyzed the network, created custom styles, and then applied the yFiles organic layout with minor adjustments when necessary. Principal component analysis of the interactomes was carried out with R statistical computing software, using the normalized spectral counts of the HCIPs for each MMR protein. Gene ontology annotation and the associated p values were based on the Knowledge Base provided by Ingenuity Pathway Analysis software (IPA, Ingenuity Systems), which contains findings and annotations from multiple sources, including the Gene Ontology Database. False discovery rate correction of the p values, computed in R, was used to correct for multiple testing and to identify significantly enriched functions (32).
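A false discovery rate correction of enrichment p values can be sketched as follows. The text does not name the specific procedure, so the Benjamini-Hochberg step-up method is assumed here; the study performed this step in R, and the Python function and example p values below are illustrative only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that adjusted
    # values never decrease as the raw p value increases.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p values for four functional categories.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in adjusted_p]
```

Under the same assumption, the equivalent call in R would be `p.adjust(p, method = "BH")`.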
Experimental Design and Statistical Rationale-All the TAP-MS experiments of MMR proteins were performed with two biological replicates in HEK293T cells. These biological replicates came from two independent stable clones. The proteins purified by TAP were digested with trypsin and analyzed by MS. The raw data were searched with SEQUEST and filtered to a false discovery rate < 1% as described in the methods. The identified proteins were filtered by combining the CRAPome analysis and the background enrichment strategy, considering the two biological replicates. Functional enrichment of the HCIPs was analyzed by IPA. False discovery rate correction of the p values was used to correct for multiple testing. Clonogenic survival assays were performed with at least three biological replicates, and statistical analysis was performed using the Student's t test.

Proteomics Study of Mismatch Repair Protein Interactome Using TAP-MS Approach-To build the interaction network of the DNA MMR pathway, we used the well-established tandem affinity purification followed by mass spectrometry (TAP-MS) strategy (33-35), outlined in Fig. 1A, to identify the binding proteins. In humans, the DNA MMR pathway includes 10 proteins: MSH2, MSH3, MSH4, MSH5, MSH6, PMS1, PMS2, MLH1, MLH3, and EXO1. We established HEK293T derivative cell lines stably expressing each of the triple-tagged (S-protein, FLAG, and streptavidin-binding peptide) MMR proteins. TAP experiments were performed twice for each protein using independent stable clones, and the purified proteins were digested and analyzed by mass spectrometry for identification. The identified protein numbers are shown in Figs. 1B and 1C. Details of the identification results are shown in Supplemental Tables S1 and S2. In total, 131,449 peptides and 20,001 proteins were acquired from the 20 TAP-MS experiments. Analysis of our repeat purifications verified the strong reproducibility of our TAP-MS procedure (Fig. 1D), especially for the proteins identified with high PSMs, supporting the high quality of our TAP-MS data.
To obtain the list of high-confidence candidate interacting proteins (HCIPs), we submitted the 20 TAP-MS results for the DNA MMR proteins, with spectral count information, together with 233 control purifications of randomly selected, unrelated bait proteins for CRAPome analysis (30). The FC-B score was used to filter our TAP-MS dataset for HCIPs. Of the 14,340 total identifications, 648 proteins had a score higher than two. Furthermore, to improve the confidence of our interacting protein list, we adopted the proteome profiling data of the input cell lysate as background for our protein-protein interaction study, which allowed us to remove background contaminants. Finally, we obtained 230 HCIPs as the "interactome" for all 10 MMR proteins, with 36.1% of the proteins identified as nuclear proteins and 45.0% as cytoplasmic components (Fig. 1E). The details of the identified proteins and HCIPs for each bait protein are shown in Fig. 1C and Supplemental Table S3.
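The background enrichment step, which compares normalized spectral counts in a purification with those in the input lysate, can be sketched roughly as follows. This is one possible reading of the procedure rather than the authors' code: the floor value for proteins absent from the lysate profile is an assumption, and the prey names and counts are invented.

```python
def normalize(counts):
    """Convert raw spectral counts into fractions of the total spectral count."""
    total = sum(counts.values())
    return {protein: count / total for protein, count in counts.items()}


def enriched_above_average(tap_counts, lysate_counts, floor=1e-6):
    """Return preys whose TAP-versus-lysate enrichment exceeds the average fold.

    floor: assumed minimum lysate fraction for proteins missing from the
    lysate profile, so that the ratio stays finite.
    """
    tap = normalize(tap_counts)
    lysate = normalize(lysate_counts)
    fold = {
        protein: fraction / max(lysate.get(protein, 0.0), floor)
        for protein, fraction in tap.items()
    }
    average_fold = sum(fold.values()) / len(fold)
    return {protein for protein, value in fold.items() if value > average_fold}


# Hypothetical counts: two specific preys and one abundant background protein.
tap_example = {"prey_A": 50, "prey_B": 30, "abundant_background": 20}
lysate_example = {"prey_A": 10, "prey_B": 5, "abundant_background": 485}
hcip_candidates = enriched_above_average(tap_example, lysate_example)
```

The abundant background protein is present in the purification but is far more prominent in the lysate, so its enrichment falls below the average and it is dropped.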
Overview of Protein-Protein Interaction Network of Human DNA Mismatch Repair Pathway-To understand the interactomes of MMR proteins, we first used IPA to reveal the function of all the identified HCIPs. The IPA analysis found that interactomes are highly enriched in proteins with reported roles in the MMR pathway, cell cycle, cellular growth and proliferation, DNA damage response, cellular development, cell morphology, and cellular assembly and organization (Fig. 2A). Our results are in agreement with many published reports, which not only further demonstrate the high reliability of our dataset and methodology but also provide us with clues on how these proteins function in the MMR pathway.
MMR proteins do not function in isolation. There are many interactions among these MMR proteins and their HCIPs. Therefore, we studied the interactome network of all the HCIPs using Cytoscape (Fig. 2B). From the interaction data among the various DNA MMR proteins, we found strong binding among some of these proteins, which are already known as functional complexes involved in MMR, for example, the MutS and MutL complexes. MSH2 forms heterodimers with MSH6 and MSH3, which are called MutSα and MutSβ, respectively, while MLH1 forms heterodimers with PMS2 and PMS1, which are MutLα and MutLβ. As some of the HCIPs are shared among several MMR proteins, the spectral counts of the common HCIPs were compared by unsupervised principal component analysis of the 10 TAP-MS results. We generated the principal component analysis plot with the top two principal components, which explained 21.4% and 16.3% of the total data variation (Fig. 2C). As expected, our analysis validated the MutS and MutL complexes. The DNA exonuclease EXO1 has been reported to function in DNA MMR by excising mismatch-containing DNA tracts directed by strand breaks to the mismatch (36,37). According to our HCIP list, EXO1 interacts with both MutS and MutL complexes, supporting active interaction and coordination between MutS, MutL, and EXO1 in the lesion recognition, incision, and excision steps of the MMR process. We also identified an interaction between MSH4 and MSH5, suggesting that they may function as a complex in the MMR pathway and/or other cellular processes.
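The "fraction of total variation explained" reported for the top two principal components can be illustrated with a small, dependency-free sketch. It assumes a plain covariance-based PCA solved by power iteration with deflation; the study used R, and the function names and the tiny data matrix below are hypothetical. For real matrices, a library routine (for example `prcomp` in R) would be the practical choice.

```python
import math


def covariance(rows):
    """Sample covariance matrix of the columns (features) of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    return [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]


def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and eigenvector of a symmetric matrix by power iteration."""
    d = len(matrix)
    # Asymmetric start vector; a start orthogonal to the answer would fail.
    vec = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(matrix[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0, vec
        vec = [x / norm for x in nxt]
    product = [sum(matrix[a][b] * vec[b] for b in range(d)) for a in range(d)]
    return sum(vec[a] * product[a] for a in range(d)), vec


def explained_variance(rows, components=2):
    """Fraction of total variance captured by each of the top components."""
    cov = covariance(rows)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))
    ratios = []
    for _ in range(components):
        value, vec = top_eigenpair(cov)
        ratios.append(value / total)
        # Deflation: subtract the component just found before the next pass.
        cov = [[cov[a][b] - value * vec[a] * vec[b] for b in range(d)]
               for a in range(d)]
    return ratios


# Hypothetical matrix: rows are baits, columns are two shared preys.
example_rows = [[0.0, 0.0], [2.0, 0.0], [0.0, 1.0], [2.0, 1.0]]
ratios = explained_variance(example_rows)
```

In this toy case the first feature carries four times the variance of the second, so the two components explain about 80% and 20% of the variation.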
Subinteractome Network Study of MutS, MutL, and EXO1-As EXO1 interacts with both MutS and MutL complexes, they may form a large "MMR repairsome" involved in the MMR process. Here, we further studied the subinteractome network. First, we integrated the HCIPs of the MutS complexes (including MSH2, MSH3, and MSH6) and the MutL complexes (including MLH1, PMS1, and PMS2) individually, and then built a subnetwork with the HCIPs of these three components, MutS, MutL, and EXO1 (Fig. 3A). The proteins in the three circles around the baits or complexes are the proteins identified as HCIPs only in the corresponding TAP-MS experiment. The proteins labeled in purple are the HCIPs identified by at least two baits. Some of the commonly identified HCIPs are involved in DNA repair pathways. For instance, x-ray repair cross-complementing protein 3 (XRCC3) is involved in the homologous recombination repair pathway (38). This protein was identified as an HCIP of the MutS and MutL complexes, indicating that it may play a role in the DNA MMR pathway. It is also possible that MutS and MutL associate with XRCC3 and function in the homologous recombination repair pathway.
To globally reveal the functions of HCIPs of MutS, MutL, and EXO1 identified in our TAP-MS study, we used the software IPA for the localization and function analyses (Fig. 3B). Many of the HCIPs localize in the nucleus, which include 53.73% HCIPs of MutS complex, 51.16% of EXO1, and 34.29% of MutL complex. The functional analysis illustrates that these HCIPs are highly enriched in several functional pathways, including the MMR pathway, homologous recombination repair, nucleotide excision repair, cell cycle, and DNA replication. The proteins with these functions may be involved in DNA MMR pathway and vice versa.

Validation of MSH2 Interactome Reveals a Novel MutS-Binding Partner WDHD1-To further validate our proteomics data, we performed an in-depth study of the MSH2 interactome. In this interactome, we identified several known MSH2-binding proteins, including MSH3, MSH6, and EXO1 (Fig. 4A). Notably, we uncovered WDHD1 as a major MSH2-associated protein (Fig. 4A). To confirm that WDHD1 exists in the same complex as MSH2, we performed reverse TAP-MS analyses using SFB-tagged WDHD1 as the bait protein and identified MSH2, MSH3, and MSH6 as WDHD1-associated proteins (Fig. 4A). These data suggest that
Fig. 4 legend (partial): Immunoprecipitation reactions were performed using S-protein beads and then subjected to Western blot analyses using antibodies as indicated. (F) WDHD1 depletion confers an increased cellular resistance to 6-TG. Colony-formation assays were performed as described in the Experimental Procedures. Statistical analysis was performed using the Student's t test. A p value less than 0.05 was considered significant. An asterisk (*) marks a significant p value. Data are presented as mean ± S.E.
KIAA1671, SMARCAD1, SDF4, and MeCP2 (negative control) with Myc-tagged MSH2 in HEK293T cells. Results indicated that, besides the known interactions (i.e. MSH2-MSH3, MSH2-MSH6, MSH2-EXO1), several HCIPs such as WDHD1, KIAA1671, and SMARCAD1 also bind to MSH2, therefore validating the MSH2 interactome we identified (Fig. 4B). In addition, we confirmed the MSH2-WDHD1 interaction between endogenous proteins (Fig. 4C), suggesting that these two proteins indeed associate with each other in vivo.
Mapping the Interaction Domains of WDHD1 and MSH2-We next attempted to define the MSH2-binding region(s) on WDHD1. A series of truncation mutants of WDHD1 were coexpressed with SFB-tagged MSH2 in HEK293T cells. We were able to map the minimal MSH2-binding region to a small region at the C terminus of WDHD1 (residues 1122-1126). Interestingly, within this region, an Ala1123-to-Pro missense mutant of WDHD1 was detected in a lung cancer patient in the Catalogue of Somatic Mutations in Cancer (COSMIC). We found that mutation of the Ala1123 site alone (Ala1123Pro) or of both the Ala1123 and Phe1124 sites (Ala1123Pro/Phe1124Ala) abolished the interaction between MSH2 and WDHD1 (Fig. 4D), indicating that this interaction may contribute to cancer development.
Next, we sought to identify the region(s) of MSH2 responsible for its interaction with WDHD1. Again, we generated a series of truncation and internal deletion mutants of MSH2. As shown in Fig. 4E, the D5 mutant of MSH2 (lacking the second lever domain, residues 550-620) showed a dramatically reduced MSH2-WDHD1 interaction, indicating that this domain of MSH2 is important for its binding to WDHD1.
WDHD1 Depletion Confers an Increased Cellular Resistance to 6-thioguanine (6-TG)-It is well documented that the levels of MSH2 inversely correlate with 6-TG and N-methyl-N'-nitro-N-nitrosoguanidine (MNNG) resistance (39). Consistently, fewer colonies formed in parental HeLa cells upon 6-TG treatment, while knockdown of MSH2 or WDHD1 in these cells resulted in resistance to 6-TG, i.e. more colonies formed after 6-TG treatment (Fig. 4F). These results indicate that WDHD1 may not only bind to MSH2 but also function with MSH2 in the MMR pathway.

DISCUSSION
This work provides an extensive analysis of the MMR protein-protein interaction network, identifies 230 HCIPs, and therefore greatly broadens our current understanding of the MMR pathway. We uncovered several uncharacterized partners for MMR proteins such as MutS, MutL, and EXO1 (Fig. 3). The biological significance of these interactions remains to be determined. Given that the MMR pathway is a critical genome maintenance pathway and MMR deficiency leads to MSI and cancer development, we speculate that some of the MMR-binding proteins discovered in this study may be mutated or downregulated in cancer and therefore contribute to cancer development and the MSI phenotypes identified in cancer patients. This possibility warrants further investigation.
In our proteomics analysis of the MMR pathway, we built a subnetwork with HCIPs for three MMR components, MutS, MutL, and EXO1 (Fig. 3A). It is known that MMR is implicated in other repair processes, including DNA damage signaling, homologous recombination, interstrand cross-link repair, and meiotic DNA recombination (40). Indeed, the HCIPs of the three MMR components clearly indicate connections between MMR and other DNA repair pathways. For example, XRCC3 was identified as an HCIP of the MutS and MutL complexes. This protein is implicated in homologous recombination repair (41,42), indicating that the MMR pathway may participate in homologous recombination repair through interactions with multiple factors involved in homologous recombination. MutL complexes have also been shown to participate in the repair of interstrand cross-links, with evidence that MutLα interacts specifically with the Fanconi anemia protein FANCJ (Fanconi anemia group J protein, also known as BRCA1-interacting protein 1, BRIP1) to facilitate interstrand cross-link repair (43). Indeed, BRIP1 was repeatedly identified in our purifications of MutL complexes (Fig. 3A). Moreover, another Fanconi anemia protein, FAN1 (FANCD2/FANCI-associated nuclease 1), was also identified in the MutL complexes (Fig. 3A). The specific interaction between FAN1 and MutL was further confirmed by reverse purification conducted by us and others (44,45), suggesting that MutL may participate in interstrand cross-link repair through its interaction with several Fanconi anemia proteins.
MSH2 is a central component of the MMR pathway that recognizes mismatches arising during DNA replication. The analysis of the MSH2 interactome revealed not only several known components of the MutS complex, including MSH3 and MSH6, but also several previously unidentified partners, such as WDHD1, KIAA1671, and SMARCAD1 (Fig. 4A). In particular, the WDHD1 protein contains an amino-terminal WD40 domain (tryptophan-aspartic acid (W-D) dipeptide repeat) and a carboxyl-terminal high-mobility group (HMG) motif. It has been shown that WDHD1 acts as a component of the replisome to regulate DNA replication and S phase progression (46-49). It is also well documented that MMR corrects DNA mismatches generated during DNA replication. Thus, it is reasonable to speculate that WDHD1 may function to recruit the MutS complex to chromatin during DNA replication and thus facilitate the MMR pathway in removing mismatches behind ongoing DNA replication forks. In this study, we not only validated the interaction between MSH2 and WDHD1 but also showed that a missense mutant of WDHD1, Ala1123-to-Pro, detected in a lung cancer patient (COSMIC), abolished the WDHD1-MSH2 interaction, indicating that this mutation in WDHD1 may be functionally important for lung cancer development. Of note, we also checked cBioPortal and found 163 WDHD1 mutations in colorectal cancer, endometrial cancer, bladder cancer, and others, indicating that WDHD1 may be mutated in multiple types of cancers and contribute to tumorigenesis. Moreover, similar to MSH2, knockdown of WDHD1 confers cellular resistance to 6-TG, indicating that WDHD1 likely participates in the MMR pathway. Future studies will be directed at defining whether and how WDHD1 may facilitate the loading of MSH2 during DNA replication and promote MMR.
In conclusion, our proteomics analysis of the MMR pathway provides a rich resource for further exploration of MMR functions in various DNA repair pathways, which will offer new ideas and therapeutic approaches for cancer patients.