Development of a Novel Cross-linking Strategy for Fast and Accurate Identification of Cross-linked Peptides of Protein Complexes*

Knowledge of elaborate structures of protein complexes is fundamental for understanding their functions and regulations. Although cross-linking coupled with mass spectrometry (MS) has been presented as a feasible strategy for structural elucidation of large multisubunit protein complexes, this method has proven challenging because of technical difficulties in unambiguous identification of cross-linked peptides and determination of cross-linked sites by MS analysis. In this work, we developed a novel cross-linking strategy using a newly designed MS-cleavable cross-linker, disuccinimidyl sulfoxide (DSSO). DSSO contains two symmetric collision-induced dissociation (CID)-cleavable sites that allow effective identification of DSSO-cross-linked peptides based on their distinct fragmentation patterns unique to cross-linking types (i.e. interlink, intralink, and dead end). The CID-induced separation of interlinked peptides in MS/MS permits MS3 analysis of single peptide chain fragment ions with defined modifications (due to DSSO remnants) for easy interpretation and unambiguous identification using existing database searching tools. Integration of data analyses from three generated data sets (MS, MS/MS, and MS3) allows high confidence identification of DSSO cross-linked peptides. The efficacy of the newly developed DSSO-based cross-linking strategy was demonstrated using model peptides and proteins. In addition, this method was successfully used for structural characterization of the yeast 20 S proteasome complex. In total, 13 non-redundant interlinked peptides of the 20 S proteasome were identified, representing the first application of an MS-cleavable cross-linker for the characterization of a multisubunit protein complex. Given its effectiveness and simplicity, this cross-linking strategy can find a broad range of applications in elucidating the structural topology of proteins and protein complexes.


Proteins form stable and dynamic multisubunit complexes under different physiological conditions to maintain cell viability and normal cell homeostasis. Detailed knowledge of protein interactions and protein complex structures is fundamental to understanding how individual proteins function within a complex and how the complex functions as a whole. However, structural elucidation of large multisubunit protein complexes has been difficult because of a lack of technologies that can effectively handle their dynamic and heterogeneous nature. Traditional methods such as nuclear magnetic resonance (NMR) analysis and x-ray crystallography can yield detailed information on protein structures; however, NMR spectroscopy requires large quantities of pure protein in a specific solvent, whereas x-ray crystallography is often limited by the crystallization process.
In recent years, chemical cross-linking coupled with mass spectrometry (MS) has become a powerful method for studying protein interactions (1)(2)(3). Chemical cross-linking stabilizes protein interactions through the formation of covalent bonds and allows the detection of stable, weak, and/or transient protein-protein interactions in native cells or tissues (4 -9). In addition to capturing protein interacting partners, many studies have shown that chemical cross-linking can yield low resolution structural information about the constraints within a molecule (2,3,10) or protein complex (11)(12)(13). The application of chemical cross-linking, enzymatic digestion, and subsequent mass spectrometric and computational analyses for the elucidation of three-dimensional protein structures offers distinct advantages over traditional methods because of its speed, sensitivity, and versatility. Identification of cross-linked peptides provides distance constraints that aid in constructing the structural topology of proteins and/or protein complexes. Although this approach has been successful, effective detection and accurate identification of cross-linked peptides as well as unambiguous assignment of cross-linked sites remain extremely challenging due to their low abundance and complicated fragmentation behavior in MS analysis (2,3,10,14). Therefore, new reagents and methods are urgently needed to allow unambiguous identification of cross-linked products and to improve the speed and accuracy of data analysis to facilitate its application in structural elucidation of large protein complexes.
A number of approaches have been developed to facilitate MS detection of low abundance cross-linked peptides from complex mixtures. These include selective enrichment using affinity purification with biotinylated cross-linkers (15)(16)(17) and click chemistry with alkyne-tagged (18) or azide-tagged (19,20) cross-linkers. In addition, Staudinger ligation has recently been shown to be effective for selective enrichment of azide-tagged cross-linked peptides (21). Apart from enrichment, detection of cross-linked peptides can be achieved by isotope-labeled (22)(23)(24), fluorescently labeled (25), and mass tag-labeled cross-linking reagents (16,26). These methods can identify cross-linked peptides with MS analysis, but interpretation of the data generated from interlinked peptides (two peptides connected with the cross-link) by automated database searching remains difficult. Several bioinformatics tools have thus been developed to interpret MS/MS data and determine interlinked peptide sequences from complex mixtures (12, 14, 27-32). Although promising, further developments are still needed to make such data analyses as robust and reliable as analyzing MS/MS data of single peptide sequences using existing database searching tools (e.g. Protein Prospector, Mascot, or SEQUEST).
Various types of cleavable cross-linkers with distinct chemical properties have been developed to facilitate MS identification and characterization of cross-linked peptides. These include UV photocleavable (33), chemical cleavable (19), isotopically coded cleavable (24), and MS-cleavable reagents (16, 26, 34-38). MS-cleavable cross-linkers have received considerable attention because the resulting cross-linked products can be identified based on their characteristic fragmentation behavior observed during MS analysis. Gas-phase cleavage sites result in the detection of a "reporter" ion (26), single peptide chain fragment ions (35)(36)(37)(38), or both reporter and fragment ions (16,34). In each case, further structural characterization of the peptide product ions generated during the cleavage reaction can be accomplished by subsequent MSn+1 analysis. Among these linkers, the "fixed charge" sulfonium ion-containing cross-linker developed by Lu et al. (37) appears to be the most attractive as it allows specific and selective fragmentation of cross-linked peptides regardless of their charge and amino acid composition based on their studies with model peptides.
Despite the availability of multiple types of cleavable cross-linkers, most of the applications have been limited to the study of model peptides and single proteins. Additionally, complicated synthesis and fragmentation patterns have impeded wide adoption of most of the known MS-cleavable cross-linkers by the community. Here we describe the design and characterization of a novel and simple MS-cleavable cross-linker, DSSO, and its application to model peptides and proteins and the yeast 20 S proteasome complex. In combination with new software developed for data integration, we were able to identify DSSO-cross-linked peptides from complex peptide mixtures with speed and accuracy. Given its effectiveness and simplicity, we anticipate a broader application of this MS-cleavable cross-linker in the study of structural topology of other protein complexes using cross-linking and mass spectrometry.

EXPERIMENTAL PROCEDURES
Materials and Reagents-General chemicals were purchased from Fisher Scientific or VWR International (West Chester, PA). Bovine heart cytochrome c (98% purity) and bovine erythrocyte ubiquitin (98% purity) were purchased from Sigma-Aldrich. Synthetic peptide Ac-IR7 (Ac-IEAEKGR; 98.1% purity) was synthesized by GL Biochem (Shanghai, China). Sequencing grade modified trypsin was purchased from Promega (Fitchburg, WI). The 20 S proteasome core particle was affinity-purified using a Pre1-TAP-expressing yeast strain as described previously (39).
Synthesis and Characterization of DSSO-Fig. 1A displays a two-step synthesis scheme of DSSO with an extended spacer length of 10.1 Å. Briefly, sulfide S-1 was first synthesized by mixing 3,3′-thiodipropionic acid (2.50 g; 14.0 mmol) with N-hydroxysuccinimide (3.30 g; 28.6 mmol) in dioxane (60 ml). The reaction mixture was stirred under an atmosphere of argon, and a solution of N,N′-dicyclohexylcarbodiimide (5.79 g; 28.1 mmol) in dioxane (20 ml) was added dropwise. After 12 h, the insoluble urea was filtered from the reaction. The filtrate was concentrated to form a white solid. The solid residue was washed with cold diethyl ether followed by cold hexanes. After drying under reduced pressure, 5.20 g (70%) of sulfide S-1 was recovered and used without further purification. To synthesize DSSO, a solution of sulfide S-1 (0.600 g; 1.61 mmol) in CHCl3 (30 ml) at 0°C was mixed with a solution of m-chloroperbenzoic acid (0.371 g; 1.61 mmol) in CHCl3 (10 ml). The reaction product was filtered and washed with cold CHCl3 (10 ml) and cold MeOH (10 ml).
Cross-linking of Synthetic Peptides with DSSO-Synthetic peptides Ac-IR7, Ac-myelin, and substance P were dissolved in DMSO to 1 mM and cross-linked with DSSO dissolved in DMSO in a ratio of 1:1 in the presence of 1 eq of diisopropylethylamine similarly as described (21). The cross-linked peptide solution was then diluted to 1 pmol/µl in 4% ACN, 0.1% formic acid for liquid chromatography multistage tandem mass spectrometry (LC MSn) analysis.
Cross-linking of Cytochrome c and Ubiquitin with DSSO-Lyophilized bovine cytochrome c or ubiquitin was reconstituted in 1× PBS (pH 7.5) to 200 µM, 20 µl of which was mixed with 2 µl of 20 mM DSSO (in DMSO) in a molar ratio of 1:10 (protein:cross-linker) for the cross-linking reaction as described (21). The cross-linked protein was digested with trypsin (1%, w/w) overnight at 37°C. The cross-linked peptide digest was then diluted to 1 pmol/µl in 4% ACN, 0.1% formic acid for LC MSn analysis.
Cross-linking of Yeast 20 S Proteasome with DSSO-Affinity-purified yeast 20 S proteasome complex was concentrated by Microcon (Billerica, MA) to ~1.2 µM in 1× PBS buffer (pH 7.5). Typically, 50 µl of the 20 S proteasome was cross-linked with 3 µl of DSSO (20 mM) dissolved in DMSO (final concentration, ~1 mM) at a molar ratio of 1:1000 (protein:cross-linker). Cross-linking was performed for 0.5 h or overnight and quenched with excess ammonium bicarbonate buffer. Cysteine residues were reduced with 5 mM DTT at 56°C for 30 min and alkylated with 10 mM chloroacetamide for 30 min at room temperature. The cross-linked protein complex was digested with trypsin (2%, w/w) overnight at 37°C. Digested peptides were desalted by C18 OMIX ZipTip (Varian, Palo Alto, CA) prior to LC MSn analysis.
For some analyses, two-dimensional LC MSn analysis was carried out. Off-line strong cation exchange chromatography was performed as the first dimension of separation using an ÄKTA HPLC system (GE Healthcare) as described (9). Each fraction was desalted by ZipTip prior to LC MSn analysis.
LC MSn Analysis-LC MSn analysis of DSSO-cross-linked peptides was performed using an LTQ-Orbitrap XL mass spectrometer (Thermo Scientific, San Jose, CA) with an on-line Eksigent NanoLC system (Eksigent, Dublin, CA). The LC separation was the same as described previously (21). The MSn method was set specifically for analyzing DSSO-cross-linked peptides. Each acquisition cycle of an MSn experiment includes one MS scan in FT mode (350-1800 m/z, resolution of 60,000 at m/z 400) followed by two data-dependent MS/MS scans with normalized collision energy at 10 or 15% on the top two peaks from the MS scan and then three MS3 scans operated in the LTQ with normalized collision energy at 29% on the top three peaks from each of the MS/MS scans. For initial analyses, MS/MS spectra were acquired in the LTQ in LC MSn experiments. For automated data analysis, MS/MS spectra were obtained in FT mode (resolution of 7500).
Data Analysis of DSSO-cross-linked Peptides-Monoisotopic masses of parent ions and corresponding fragment ions, parent ion charge states, and ion intensities from LC MS/MS and LC MS3 spectra were extracted using in-house software based on the Raw_Extract script from Xcalibur v2.4 (Thermo Scientific). Database searching was performed with a developmental version of Protein Prospector (v5.5.0, University of California San Francisco) (http://prospector.ucsf.edu/prospector/mshome.htm) using its software suite, i.e. Batch-Tag and MS-Bridge, as described (27). Using in-house scripts, extracted MS3 data were reformatted such that MS3 fragment ions were directly linked to their MS/MS parent ions. For cytochrome c (P62894) and ubiquitin (P62990) analyses, database searching of MS3 spectra was performed using Batch-Tag against their accession numbers in the Swiss-Prot database (September 1, 2009). For the 20 S proteasome, a Batch-Tag search of MS3 data was performed against a decoy database consisting of a normal Saccharomyces Genome Database concatenated with its reversed version (13,490 total protein entries). The mass tolerances for parent ions and fragment ions were set as ±20 ppm and 0.6 Da, respectively. Trypsin was set as the enzyme, and a maximum of two missed cleavages was allowed. Protein N-terminal acetylation, methionine oxidation, and N-terminal conversion of glutamine to pyroglutamic acid were selected as variable modifications. In addition, three defined modifications on uncleaved lysines were chosen, including alkene (C3H2O; +54 Da), sulfenic acid (C3H4O2S; +104 Da), and thiol (C3H2SO; +86 Da) modifications due to remnants of the cross-linker (Fig. 1). Initial acceptance criteria for peptide identification required a reported expectation value ≤0.05. For the 20 S proteasome analysis, the false positive rate for peptide identification was less than 1%.
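As a rough illustration of how the three remnant modifications relate to their elemental compositions, the following sketch computes monoisotopic masses from the formulas given above. The atomic masses are standard monoisotopic values; the names `MONO`, `REMNANTS`, and `monoisotopic_mass` are hypothetical and are not part of Protein Prospector or the in-house scripts.

```python
# Standard monoisotopic atomic masses (Da); helper names here are illustrative only.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "S": 31.972071}

# Elemental compositions of the DSSO remnants as stated in the text.
REMNANTS = {
    "alkene": {"C": 3, "H": 2, "O": 1},                # C3H2O, nominal +54 Da
    "sulfenic acid": {"C": 3, "H": 4, "O": 2, "S": 1},  # C3H4O2S, nominal +104 Da
    "thiol": {"C": 3, "H": 2, "O": 1, "S": 1},          # C3H2SO, nominal +86 Da
}


def monoisotopic_mass(formula):
    """Sum atomic masses weighted by atom counts."""
    return sum(MONO[element] * count for element, count in formula.items())


REMNANT_MASS = {name: monoisotopic_mass(f) for name, f in REMNANTS.items()}
```

Under these assumptions, the sulfenic acid and thiol remnants differ by one water, and the thiol and alkene remnants differ by one sulfur atom, which is the origin of the nominal 18 Da and 32 Da differences used later in the text.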
The Link-Finder program (http://www.ics.uci.edu/~baldig/Link-Finder/) was developed to search MS/MS data and identify the list of putative DSSO-interlinked and dead end products based on their unique MS fragmentation patterns as illustrated in Fig. 1 (for details, see "Results"). Monoisotopic masses and charges of parent ions measured in MS scans for those putative cross-linked peptides identified by the Link-Finder program were subsequently submitted to MS-Bridge to determine cross-linked peptide sequences by mass mapping with a given cross-linker (i.e. DSSO) and protein sequences (27). The parent mass error for MS-Bridge search was set as ±10 ppm, and only one cross-link was allowed in the cross-linked peptides for general search. All three types of cross-linked peptides (14), i.e. interlinked (type 2), intralinked (type 1), and dead end-modified (type 0), can be computed and matched in MS-Bridge (27).
The search results from the Link-Finder, Batch-Tag, and MS-Bridge programs were integrated using in-house scripts to compile a list of cross-linked peptides identified with high confidence. The final results were validated manually by examining MS/MS spectra and MS3 spectra, respectively.
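The ppm tolerances quoted above can be made concrete with a short sketch. This is only an illustration of the standard relative mass error calculation, not the MS-Bridge implementation; the function names are hypothetical.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Relative mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass, theoretical_mass, tol_ppm=10.0):
    """True if the observed mass lies within +/- tol_ppm of the theoretical mass."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm
```

For a 1000 Da species, a ±10 ppm window corresponds to ±0.01 Da, so an observed mass of 1000.005 Da would match and 1000.02 Da would not.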

RESULTS
Development of Novel Sulfoxide-containing MS-cleavable Cross-linker-To develop a robust MS-cleavable cross-linking reagent, the incorporated MS-labile bond must have the ability to selectively and preferentially fragment prior to peptide backbone breakage independently of peptide charges and sequences. It is well documented that methionine sulfoxide-containing peptides have preferential fragmentation at the C-S bond adjacent to the sulfoxide during collision-induced dissociation (CID) analysis (40), and this fragmentation is dominant and much more labile than peptide bonds. Such labile fragmentation has often been observed as the loss of 64 Da (−SOCH4) from oxidized methionine-containing peptides in our routine peptide analysis. Therefore, we expect that if a sulfoxide is incorporated in the spacer region of an NHS ester, the C-S bond adjacent to the sulfoxide will be MS-labile and prone to preferential fragmentation. To test this, we designed and synthesized a CID-cleavable cross-linker, namely DSSO, which contains two NHS ester functional groups and two symmetric MS-labile C-S bonds adjacent to the sulfoxide (Fig. 1A). DSSO has a spacer length of 10.1 Å, making it well suited for detecting protein interaction interfaces of protein complexes and generating highly informative distance constraints. In comparison with existing MS-cleavable cross-linkers, DSSO can be easily synthesized in a two-step process as shown in Fig. 1A.
Proposed CID Fragmentation Pattern of DSSO-cross-linked Peptides-Three types of cross-linked peptides can be formed during the cross-linking reaction: interlinked (type 2), intralinked (type 1), and dead end (type 0)-modified peptides (14), among which interlinked peptides are the most informative for generating distance constraints. Fig. 1, B-D, show the proposed fragmentation schemes of DSSO-cross-linked peptides. As shown in Fig. 1B, during CID analysis of a DSSO-interlinked peptide α-β, the cleavage of one C-S bond next to the sulfoxide separates the interlinked peptide into a pair of peptide fragments, i.e. αA/βS in which the α peptide fragment is modified with the alkene (A) moiety (+54 Da) and the β peptide fragment is modified with the sulfenic acid (S) moiety (+104 Da). If peptides α and β have different sequences, two possible pairs of fragments (i.e. αA/βS and αS/βA) will be observed due to the breakage of either of the two symmetric C-S bonds next to the sulfoxide in the spacer region of DSSO (Fig. 1B), thus resulting in four individual peaks in the MS/MS spectrum. But if peptides α and β have the same sequences, only one fragment pair, i.e. two peaks, will be detected in the MS/MS spectrum. To determine sequences of interlinked peptides and assign the cross-linking site, the resulting peptide fragments (i.e. αA, βS, αS, or βA) generated in MS/MS can be further subjected to MS3 analysis. Because these fragments represent single peptide sequences, the interpretation of the MS3 spectra by the Batch-Tag program in Protein Prospector is identical to the identification of a single peptide with a defined modification (in this case, remnant of the cross-linker). This will dramatically simplify data interpretation and improve the identification accuracy of cross-linked products.
DSSO dead end-modified peptides have a defined mass modification (+176 Da) due to the half-hydrolyzed DSSO (Fig. 1C). MS/MS analysis of a dead end-modified peptide αDN would result in two possible fragment ions, i.e. αA and αS, due to the cleavage of the C-S bond on either side of the sulfoxide. We name the αA and αS fragments as the dead end fragment pair, and the mass difference between these fragments correlates to the difference between the remnants of DSSO attached to the fragments. Similarly, intralinked peptides (e.g. αintra) also have a defined mass modification (+158 Da) due to DSSO cross-linking of two distinct lysines in the same peptide sequence (Fig. 1D). The cleavage of the C-S bond will result in only one fragment peak in MS/MS with the same mass as the parent ion observed in MS. MS3 analysis of fragment ions detected in MS/MS will lead to the detection of y or b ions containing either A or S modifications.
As shown in Fig. 1E, the sulfenic acid-containing fragment (e.g. αS, βS, or αA+S) may undergo further fragmentation and lose a water molecule (−18 Da) to generate a new fragment containing an unsaturated thiol (T) moiety (+86 Da) (e.g. αT, βT, or αA+T). We do not expect any complication with data analysis as the thiol-containing fragment ion will become the dominant ion instead of the sulfenic acid-modified fragment ion in the MS/MS spectrum. Thus, we anticipate that the total number of pairs and peaks will remain similar as shown in Fig. 1, B-D. Because of the specific and unique MS/MS fragmentation patterns for different types of DSSO-cross-linked peptides, there are fixed mass relationships between parent ions and their fragment ions as listed in Fig. 1F. For DSSO-interlinked peptides (α-β), the mass sum of each fragment pair (αA/βS or αS/βA) is equivalent to the mass of the parent ion (Fig. 1F, Equation 1). If αS or βS loses a water and becomes αT or βT, respectively, the fragment pairs will be αA/βT and αT/βA, and the mass sum of each fragment pair plus a water will be the same as the parent mass (Fig. 1F, Equation 2). As for the dead end (DN)-modified peptide αDN, each fragment (i.e. αA, αS, or αT) has a distinct mass difference from the parent ion (Fig. 1F, Equation 3). For the intralink peptide αintra, the fragment mass could be either the same as the parent mass (i.e. αA+S) or 18 Da less than the parent mass (i.e. αA+T) (Fig. 1F, Equation 4). Moreover, there is a defined mass difference (Δ32 Da) between the T- and A-modified forms of the same sequence (Fig. 1F, Equation 5). These characteristic mass relationships were incorporated into the Link-Finder program to identify DSSO-cross-linked peptides.
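The mass relationships described in this paragraph can be written out explicitly. The following is a reconstruction from the nominal remnant masses given in the text (+54, +104, +86, +176, and +158 Da), not a reproduction of Fig. 1F itself; the notation M for neutral mass is an assumption, and the 122 and 72 Da values are derived here as 176 − 54 and 176 − 104.

```latex
\begin{align}
M_{\alpha\text{-}\beta} &= M_{\alpha_A} + M_{\beta_S} = M_{\alpha_S} + M_{\beta_A} \\
M_{\alpha\text{-}\beta} &= M_{\alpha_A} + M_{\beta_T} + 18 = M_{\alpha_T} + M_{\beta_A} + 18 \\
M_{\alpha_{DN}} &= M_{\alpha_A} + 122 = M_{\alpha_S} + 72 = M_{\alpha_T} + 90 \\
M_{\alpha_{intra}} &= M_{\alpha_{A+S}} = M_{\alpha_{A+T}} + 18 \\
M_{\alpha_T} - M_{\alpha_A} &= 32
\end{align}
```

All constants are nominal masses in Da; the first relationship follows from 54 + 104 = 158, the mass added by an intact DSSO bridge.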
Characterization of DSSO-cross-linked Model Peptides by MSn Analysis-To characterize the new DSSO linker, we first cross-linked several model peptides, including Ac-IR7, Ac-myelin, and substance P. Under our experimental conditions, the major cross-linked products for Ac-IR7 and Ac-myelin are interlinked, whereas substance P mostly formed dead end-modified peptides. All of the cross-linked model peptides were subjected to LC MSn analysis. The interlinked Ac-IR7 peptide (α-α) was detected as doubly charged (m/z 923.46 2+) and triply charged (m/z 615.97 3+) ions (Fig. 2A). MS/MS analyses of the two differently charged parent ions resulted in two dominant fragment ions, respectively (Fig. 2, B and C). Because the two interlinked sequences are identical, only one fragment pair (i.e. αA/αS) was observed as expected. The results suggest that MS/MS fragmentation of interlinked peptides is independent of peptide charges. It should be noted that besides unique mass relationships the fragment ions in each pair have a defined charge relationship associated with the charge of the parent ion. In other words, the sum of the observed charges for each fragment in a pair equals the charge of the parent ion. For example, the triply charged parent ion (m/z 615.97 3+) generated the fragment pair with one doubly charged (αA 2+) ion and one singly charged (αS 1+) ion, whereas the doubly charged parent ion (m/z 923.46 2+) only produced a fragment pair with two singly charged (αA 1+ and αS 1+) ions. This information can be used to validate the fragment pairs identified by masses. The respective MS3 analysis of αA and αS ions (Fig. 2, D and E) allowed unambiguous identification of the peptide sequence and cross-linked site based on a series of y and b ions.
(Fig. 2 legend fragment: Sequences of Ac-IR7, Ac-myelin, and substance P are Ac-IEAEKGR, Ac-ASQKRPSQRHG, and RPKPQQF, respectively.)
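The charge relationship described above can be checked with a small sketch. It converts the two reported m/z values of the interlinked Ac-IR7 peptide to neutral masses and tests whether fragment charges sum to the parent charge. The proton mass is the standard value; the function names are hypothetical and are not taken from Link-Finder.

```python
PROTON = 1.007276  # Da, mass of a proton


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass of an [M + zH]z+ ion."""
    return charge * (mz - PROTON)


def charges_consistent(parent_charge, fragment_charges):
    """Fragment-pair charges should add up to the parent ion charge."""
    return sum(fragment_charges) == parent_charge


# Interlinked Ac-IR7 observed at two charge states (values from the text).
mass_from_2plus = neutral_mass(923.46, 2)
mass_from_3plus = neutral_mass(615.97, 3)
```

Both charge states give a neutral mass near 1844.9 Da, as expected if they arise from the same species, and the reported fragment charges (2+ and 1+ from the 3+ parent; 1+ and 1+ from the 2+ parent) satisfy the charge-sum rule.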
Similar analysis was carried out for interlinked Ac-myelin (β-β), and a characteristic fragment pair was observed in MS/MS spectra of the parent ion (β-β) at three different charge states (m/z 458.23 6+, 549.68 5+, and 686.84 4+), respectively (Fig. 2, F-I), which represent the expected fragmentation of two identical interlinked peptides. Whereas the fragment pair βA/βS was detected in MS/MS spectra of quintuply and quadruply charged interlinked Ac-myelin (β-β) (m/z 549.68 5+ and 686.84 4+) (Fig. 2, H and I), the fragment pair βA/βT was observed in the MS/MS spectrum of sextuply charged interlinked Ac-myelin (β-β) (m/z 458.23 6+) (Fig. 2G). The βT fragment, namely the β peptide fragment containing an unsaturated T moiety (+86 Da), was generated due to the loss of H2O from the sulfenic acid moiety on the βS fragment (Fig. 1E). This is likely due to excess collision energy deposited on the highest charged species as the collision energy chosen for CID analysis in the LTQ-Orbitrap XL mass spectrometer does not change with peptide charges during LC MSn runs.
In addition to interlinked peptides, dead end-modified peptides were analyzed. Fig. 2J displays the MS spectrum of the DN-modified substance P (γDN; m/z 538.76 2+). As predicted in Fig. 1D, MS/MS analysis of γDN led to two expected fragments, the alkene-containing (γA; m/z 478.03 2+) and sulfenic acid-containing (γS; m/z 502.95 2+) peptide fragments, representing the characteristic feature of dead end-modified peptides. The fragment ions carry the same charge state as the parent ion, and MS3 analysis of the γA fragment confirmed its sequence unambiguously (Fig. 2L). Taken together, the results clearly demonstrate that the new MS-cleavable bonds in DSSO are labile and can be preferentially fragmented prior to peptide bond breakage, and the desired fragmentation is independent of peptide charge states and sequences.
Characterization of DSSO-cross-linked Peptides of Model Proteins by MSn Analysis-We next evaluated the applicability of DSSO for protein cross-linking under physiological conditions. Model proteins cytochrome c (1, 19-21, 31, 41-43) and ubiquitin (18,38) have been extensively utilized to test various new cross-linking strategies because they have a relatively large number of lysine residues accessible for cross-linking. Based on our previous work (21), cytochrome c was cross-linked with a 10-fold excess of DSSO. The cytochrome c cross-linking efficiency using DSSO was comparable with the efficiency using disuccinimidyl glutarate or our previously developed azide-disuccinimidyl glutarate cross-linkers (21), indicating that DSSO is as effective for protein cross-linking reactions. The DSSO-cross-linked cytochrome c was then digested with trypsin and analyzed by LC MSn. Three types of cross-linked peptides of cytochrome c (i.e. interlink, intralink, and dead end) were observed. Fig. 3, B and C. Based on the series of y (i.e. y1-7) and b (i.e. b2-7) ions, the sequence of the MS/MS fragment ion αA (m/z 478.99 2+) was unambiguously identified as Ac-GDVEK(A)GKK in which Lys at the fifth position from the N terminus was determined to be modified with the alkene moiety. MS3 analysis of the corresponding fragment pair ion βT (m/z 352.40 2+) determined its sequence as K(T)KGER. Although there are two lysine residues in the sequence, the occurrence of y4 and a1 ions indicates that the first N-terminal Lys is modified with an unsaturated thiol moiety. Taken together, the identity and cross-linking site of the interlink peptide [Ac-GDVEKGKK interlinked to KKGER] was determined unambiguously. 24), indicating that this peptide is a heterodimeric interlinked peptide. The same characteristic fragment pairs, i.e. αA/βT and αT/βA, were also identified but with different charges in the MS/MS spectra of the quadruply (m/z 481.5069 4+) and quintuply (m/z 385.4070 5+) charged parent ions, respectively (Fig. 4, B and C). It is noted that some charge distribution of fragment ions was observed in the pairs (Fig. 4C) due to the high charge state of the parent ion. Nevertheless, the dominant ions are the characteristic fragment ions of the interlinked peptide. MS3 analysis of the αA (m/z 496.60 3+) fragment revealed its sequence identity unambiguously as HK(A)TGPNLHGLFGR in which the Lys at the second position from the N terminus was modified with the alkene moiety (Fig. 4D). In combination with the MS-Bridge result, the interlinked peptide was identified as [HKTGPNLHGLFGR interlinked to GKK]. These results demonstrate that preferred fragmentation of the C-S bonds in DSSO-interlinked peptides of cytochrome c occurs as expected and is independent of peptide charge states and sequences.
To illustrate how dead end-modified peptides of cytochrome c behave in MSn analysis, Fig. 5A shows the MS/MS spectrum of a selected dead end-modified peptide (m/z 880.8975 2+). As shown, two major fragment ions (m/z 820.20 2+ and 835.88 2+) were detected, and they are 122 and 90 Da less than the parent ion, respectively. Such mass differences between the parent ion and its fragment ions fit well with those predicted for DSSO dead end-modified peptides (Fig. 1F, Equation 3), identifying the ion at m/z 820.20 2+ as the αA fragment and the ion at m/z 835.88 2+ as the αT fragment. MS3 analysis of the αA fragment (m/z 820.20 2+) (Fig. 5B) as well as the MS-Bridge result of the parent ion (m/z 880.8975 2+) identified its sequence as K(DN)TGQAPGFSYTDANK.
As discussed above (Fig. 1D), we predicted that MS/MS analysis of the intralinked peptide (αintra) would lead to either a fragment ion (αA+S) containing one Lys(A) and one Lys(S) with the same mass as the parent ion or a fragment ion (αA+T) containing one Lys(A) and one Lys(T) with a mass 18 Da less than that of the original parent ion. 606.24 2+) was detected with a mass 18 Da less than that of the parent ion. This suggests that the peptide is potentially an intralinked peptide of cytochrome c, and its MS/MS fragment ion (m/z 606.24 2+) can be labeled as αA+T. Mass mapping of the parent ion m/z 611.9802 3+ using MS-Bridge matched to an intralinked peptide, GGK*HK*TGPNLHGLFGR, where the two N-terminal Lys* residues are linked. Because the CID-induced C-S bond breakage can occur at either side of the sulfoxide, a mixture of two fragments with identical masses but with A or T moieties at either Lys can be generated.
ment and charge determination of both parent ions in MS and their fragment ions in MS/MS spectra. MS3 was obtained in the LTQ to achieve the highest sensitivity. As shown, the first data extraction step is to generate the text files containing peak lists of MS/MS and MS3 data, respectively. Based on the unique MS/MS fragmentation profiles of DSSO-cross-linked peptides and the defined mass relationships between parent ions and their fragment ions (Fig. 1), the Link-Finder program was developed to automatically search MS/MS data to identify putative DSSO-cross-linked peptides (Fig. 6B). As discussed above, the interlinked products produce distinct MS/MS spectra with two pairs of dominant peptide fragments (αA/βS/T and αT/S/βA). For each MS/MS scan, among the top eight most abundant peaks, if there is a fragment pair with a mass sum equal to their parent mass with or without a water loss (−18 Da), the parent ion will be categorized as a possible interlinked peptide.
If two such pairs are found and the mass difference between any two fragments from the two distinct pairs is 32 Da, i.e. the mass difference between the thiol and alkene moieties, then it is almost certain that the parent ion is a true interlinked product. The dead end product typically has two major fragment ions representing the parent peptide attached to either a thiol or an alkene moiety. Among the top three peaks, if there are two peaks with a mass difference of 32 Da and one of them is 90 Da less than the parent mass, then the parent ion is categorized as a high confidence dead end peptide. Using the Link-Finder program, a list of parent ions was identified as putative interlinked or dead end-modified peptides. The generated list of parent ion masses was then subjected to MS-Bridge to identify putative cross-linked peptides of all types by mass matching with high mass accuracy (<10 ppm).
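The screening rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Link-Finder program itself: peaks are assumed to be already converted to neutral masses and supplied as (mass, intensity) tuples, the function names are hypothetical, and the 0.5-Da tolerance is an arbitrary placeholder that the text does not specify. The 32- and 90-Da values are the nominal differences quoted in the text.

```python
from itertools import combinations

WATER = 18.0106            # monoisotopic mass of H2O, Da
THIOL_ALKENE_DELTA = 32.0  # nominal thiol/alkene mass difference from the text, Da
DEAD_END_LOSS = 90.0       # nominal loss for one dead end fragment from the text, Da


def top_peaks(peaks, n):
    """Return the n most intense (mass, intensity) peaks."""
    return sorted(peaks, key=lambda p: p[1], reverse=True)[:n]


def close(a, b, tol):
    """True if two masses agree within tol (Da)."""
    return abs(a - b) <= tol


def find_fragment_pairs(parent_mass, peaks, tol=0.5):
    """Pairs among the top eight peaks whose masses sum to the parent
    mass, with or without a water loss."""
    masses = [m for m, _ in top_peaks(peaks, 8)]
    pairs = []
    for m1, m2 in combinations(masses, 2):
        total = m1 + m2
        if close(total, parent_mass, tol) or close(total, parent_mass - WATER, tol):
            pairs.append((m1, m2))
    return pairs


def classify_interlink(parent_mass, peaks, tol=0.5):
    """Label a parent ion as a possible or high-confidence interlink."""
    pairs = find_fragment_pairs(parent_mass, peaks, tol)
    if not pairs:
        return "none"
    # Two distinct pairs whose fragments differ by the thiol/alkene mass.
    for p, q in combinations(pairs, 2):
        if any(close(abs(a - b), THIOL_ALKENE_DELTA, tol) for a in p for b in q):
            return "high-confidence interlink"
    return "possible interlink"


def is_dead_end(parent_mass, peaks, tol=0.5):
    """Top three peaks contain two peaks 32 Da apart, one of which is
    90 Da lighter than the parent."""
    masses = [m for m, _ in top_peaks(peaks, 3)]
    for m1, m2 in combinations(masses, 2):
        if close(abs(m1 - m2), THIOL_ALKENE_DELTA, tol) and any(
            close(parent_mass - m, DEAD_END_LOSS, tol) for m in (m1, m2)
        ):
            return True
    return False
```

In practice the tolerance would need to reflect the mass accuracy of the analyzer used for MS/MS, and charge-state deconvolution would precede this step.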
For MS3 data, only the original parent ion observed in the MS scan is listed as the precursor ion during database searching. To extract the MS3 parent ion (a fragment ion in MS/MS) for a Batch-Tag search, the second data extraction step is carried out using in-house scripts to generate a modified MS3 text file. The Batch-Tag search result provides high confidence identification of single peptide fragments generated in MS/MS that are initially cross-linked. Finally, the results from three different types of searches, i.e. Batch-Tag (MS3 data), Link-Finder (MS/MS data), and MS-Bridge (MS data), are integrated using in-house scripts within the Link-Finder program to obtain accurate and reliable identification of cross-linked peptides. Among them, MS3 sequencing with Batch-Tag searching is essential for unambiguous identification of cross-linking sites.
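The in-house integration scripts are not described in detail, so the following Python sketch shows only one plausible way to combine the three searches: each search contributes a collection of parent-ion keys, and a candidate is reported together with the searches that support it. The key type (for example, an MS scan number) and both function names are assumptions made for illustration.

```python
def integrate_evidence(link_finder, ms_bridge, batch_tag):
    """Return {parent_ion_key: [names of searches supporting it]}.

    Each argument is an iterable of parent-ion keys (hypothetical;
    e.g. MS scan numbers shared across the three result lists).
    """
    searches = (
        ("Link-Finder", link_finder),
        ("MS-Bridge", ms_bridge),
        ("Batch-Tag", batch_tag),
    )
    evidence = {}
    for name, keys in searches:
        for key in set(keys):  # ignore duplicate hits within one search
            evidence.setdefault(key, []).append(name)
    return evidence


def with_all_three(evidence):
    """Parent-ion keys supported by all three searches, in sorted order."""
    return sorted(key for key, names in evidence.items() if len(names) == 3)
```

Keeping the per-search support list, rather than a simple intersection, makes it easy to inspect candidates backed by only one or two lines of evidence.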
Identification of DSSO-cross-linked Peptides of Model Proteins by Automated Database Searching. The newly developed integrated work flow was first used to identify DSSO-cross-linked peptides of cytochrome c. In total, 19 interlinked peptides were unambiguously identified and are summarized in Table I (for details, see supplemental Table 1 and supplemental Fig. 1). Cross-linked peptides have characteristic fragment pairs in MS/MS spectra and were identified by the Link-Finder program. In addition, one or two MS/MS fragment pair ions were sequenced by MS3 to provide unambiguous identification. Moreover, all of the parent masses fit well with cross-linked peptides identified by the MS-Bridge program with high mass accuracy. In comparison with reported cross-linking studies of cytochrome c (14, 19-21, 31, 41-43), three novel interlinks were identified in this work. Besides the interlinked peptides, seven intralinked and eight dead end peptides were also identified (see supplemental Table 1). For the dead end-modified peptides, each has a dead end fragment pair, and at least one of the fragment ions was sequenced, which correlates very well with the MS-Bridge and Batch-Tag results. The intralinked peptides were mainly identified by Batch-Tag and MS-Bridge results.
TABLE I
Summary of DSSO-interlinked peptides of cytochrome c identified by LC MSn
All of the interlinked peptides displayed characteristic fragment pairs and were identified by Batch-Tag, MS-Bridge, and Link-Finder. AA, amino acid; Mod., modification.
* Peptide fragments containing these sites were not sequenced by MS3.
‡ They were identified from different fragment pair ions by MS3.

In addition to products with one cross-link (i.e. types 0, 1, and 2), peptides containing two cross-links were also identified using this integrated work flow. In this work, 11 non-redundant DSSO-cross-linked peptides with two links (e.g. one interlink with one dead end, one interlink with one intralink, or one intralink with one dead end) were identified and are summarized in supplemental Table 1. This type of information is not commonly reported because peptide sequencing of multilinked peptides is highly complicated. This demonstrates the ability of our new cross-linking strategy to identify such complex products. Based on the crystal structure of bovine heart cytochrome c (Protein Data Bank code 2B4Z) (44), we calculated the distances between α carbons of the identified cross-linked lysine residues (Table I and supplemental Table 1). Among the 26 non-redundant interlinked lysines in cytochrome c identified in this work (excluding linkages between two adjacent lysines), all of the linkages had distances between their α carbons within the range of 5.3-19.3 Å. This is consistent not only with the length of a fully expanded DSSO (10.1-Å spacer length) and two lysine side chains but also with the previous results using NHS ester cross-linkers of similar lengths (21, 31, 43, 45). The results suggest that our cross-linking conditions did not induce significant disturbance to cytochrome c structural conformations.
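The distance check described above can be illustrated with a short Python sketch. It is a minimal example, assuming standard fixed-column PDB ATOM records; the function names and the (chain, residue number) keys are hypothetical, and the software actually used for the calculation is not stated in the text.

```python
import math


def parse_ca_atoms(pdb_lines):
    """Return {(chain_id, residue_number): (x, y, z)} for the alpha
    carbons found in ATOM records (fixed-column PDB format)."""
    coords = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            key = (line[21], int(line[22:26]))
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            # Keep the first conformer if alternate locations are present.
            coords.setdefault(key, xyz)
    return coords


def ca_distance(coords, res_a, res_b):
    """Euclidean distance (in Å) between the alpha carbons of two residues."""
    return math.dist(coords[res_a], coords[res_b])
```

For example, after reading a structure file into a list of lines, `ca_distance(coords, ("A", 5), ("A", 8))` would give the α carbon separation of residues 5 and 8 in chain A, which can then be compared with the span expected for the cross-linker plus two lysine side chains.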
In addition to cytochrome c, the same strategy was successfully applied to identify DSSO-cross-linked peptides of ubiquitin. Three interlinked, one intralinked, and five dead end peptides were identified as summarized in supplemental Table 2 and supplemental Fig. 1. Based on the crystal structure of bovine ubiquitin (Protein Data Bank code 1AAR), all of the identified inter/intralinked lysines in ubiquitin had distances between their α carbons within the range of 6-18 Å. The identified cross-linked lysines are consistent with the known structure of ubiquitin and previous reports (18, 38). It is interesting to note that one of the identified interlinked peptides is [LIFAGK48QLEDGR interlinked to LIFAGK48QLEDGR], which is a cross-link formed between the two molecules of a ubiquitin dimer. Residue Lys48 is located at a hydrophobic patch important for protein interactions, and Lys48 is also an in vivo chain linkage site for polyubiquitination required for ubiquitin/ATP-dependent proteasomal degradation (46). The same Lys48-Lys48 cross-link was identified previously using an alkyne-tagged NHS ester but only after selective enrichment coupled with CID and electron transfer dissociation analyses (18). In comparison, we were able to identify the Lys48-interlinked peptide without any enrichment, further demonstrating the effectiveness of our approach in identifying DSSO-cross-linked peptides from complex mixtures.
Structural Elucidation of Yeast 20 S Proteasome Complex Using DSSO Cross-linking. The ubiquitin-proteasome degradation pathway plays an important role in regulating many biological processes (46). The 26 S proteasome complex is the macromolecular machine responsible for ubiquitin/ATP-dependent protein degradation, and it is composed of two subcomplexes: the 20 S core particle and the 19 S regulatory complex. To date, only the crystal structure of the 20 S proteasome complex has been resolved; the structures of the 19 S and 26 S complexes remain elusive, thus hindering the understanding of the structure-function relationship of the 26 S proteasome complex. As a step toward an effective cross-linking strategy for elucidating the structures of the 19 S and 26 S proteasome complexes, we investigated the structure of the yeast 20 S proteasome complex using the DSSO cross-linking approach. The cross-linking of the 20 S proteasome complex was carried out in PBS buffer under conditions allowing efficient cross-linking of all subunits as judged by one-dimensional SDS-PAGE (supplemental Fig. 2). The tryptic digest of the cross-linked proteasome complex was subjected to LC MSn analysis, and the data were analyzed using the integrated work flow described above (Fig. 6). In total, 13 unique interlinked peptides were identified, including 10 intrasubunit and three intersubunit heterodimeric interlinks, as summarized in Table II. In addition, 21 dead end-modified peptides were identified by multiple lines of evidence as illustrated in supplemental Table 3. The fragmentation behavior of the dead end-modified peptides of the 20 S subunits is the same as that of cytochrome c, showing two distinct dead end pairs in MS/MS spectra. This is illustrated with an example shown in supplemental Fig. 4.
The experimentally determined structure of the yeast 20 S proteasome holocomplex (Protein Data Bank code 1RYP) was used to assess the cross-linked lysine pairs identified in this study. For each identified cross-link, the distance between the α carbons was calculated, and the results are summarized in Table II. Considering the spacer length of DSSO and lysine side chains, the theoretical upper limit for the distance between the α carbon atoms of paired lysines is ~26 Å. Our reported distances are within this upper limit, providing some evidence that the proteasome cross-links are formed in the native state. The quaternary proteasome structure is formed by four stacked seven-member rings in the order αββα. The side view and basal view of the arrangement among one set of the symmetric αβ rings and their subunits are shown in Fig. 8. The α carbon trace is shown for all subunits, and the cross-linked lysines are shown in space-filling representation. Lysines forming intrasubunit cross-links appear in blue, and those forming intersubunit cross-links appear in red. The images in Fig. 8 were generated using UCSF Chimera visualization software (47).
DISCUSSION
We have presented a novel cross-linking strategy for structural analysis of model proteins and the yeast 20 S proteasome complex by combining a newly designed MS-cleavable cross-linker, DSSO, with an integrated data analysis work flow. This approach is effective and facilitates fast and accurate identification of DSSO-cross-linked peptides by LC MSn.
The new MS-cleavable cross-linker DSSO is attractive for cross-linking studies of protein complexes for a number of reasons: 1) It can be easily synthesized and can cross-link protein complexes effectively at low micromolar concentrations (~1 μM); 2) It has two symmetric CID-labile C-S bonds that preferentially fragment prior to peptide backbone breakage; 3) The CID-induced cleavage of interlinked peptides is specific and independent of peptide charges and sequences; 4) DSSO-cross-linked peptides generate characteristic fragmentation patterns in MS/MS spectra that are unique to different types of cross-linked peptides for easy identification; 5) There are unique mass and charge relationships between MS/MS peptide fragment ions and their parent ions, permitting automated data processing. In comparison with existing MS-cleavable cross-linkers (16, 34-36, 38), the DSSO cross-linker provides specific and selective fragmentation of cross-linked peptides for identification. The fragmentation patterns of DSSO-cross-linked peptides are similar to those of the fixed charge sulfonium ion-containing cross-linked model peptides developed by Lu et al. (37). Although DSSO does not carry a fixed charge, our results demonstrated that the preferential cleavage of the C-S bond adjacent to the sulfoxide in DSSO is as effective as cleavage of the C-S bond in the sulfonium ion-containing cross-linker (i.e. S-methyl 5,5′-thiodipentanoylhydroxysuccinimide) (37). However, fragmentation of the sulfonium ion-containing cross-linked peptide requires the formation of a five-member ring with the sulfonium ion and the amide of the linker such that it is not feasible to change spacer lengths in these cross-linkers. In contrast, the simple fragmentation mechanism gives DSSO the flexibility of changing its spacer length to accommodate cross-linking of lysines at different distances while maintaining the symmetry of the linker and its easily interpretable fragmentation patterns.
In addition, DSSO has better potential for studying protein interactions by in vivo cross-linking. It is well known that cross-linking studies of protein complexes are extremely challenging because of the inherent limitations of current cross-linkers. With improvements in database searching of non-cleavable interlinked peptides, it is possible to identify cross-linked peptides of protein complexes using non-cleavable cross-linkers (12, 13). However, this requires a special program for data interpretation, and the false positive rate of identifying interlinked sequences is higher than that of identifying single sequences. Here we have demonstrated the feasibility of using the novel DSSO cross-linking strategy to study the structure of the yeast 20 S proteasome complex. This work represents a major advancement in structural elucidation of multisubunit protein complexes with improved data analysis and accuracy, as such an application of MS-cleavable cross-linkers has not been reported before.
In addition to the design of this novel MS-cleavable linker, we developed an integrated data analysis work flow to achieve fast, easy, and accurate identification of cross-linked peptides and the cross-linking sites. Identification of DSSO-cross-linked peptides from complex mixtures was accomplished with high confidence by integrating data analyses of three different data sets: MS, MS/MS, and MS3 data. Because of the difficulty in interpreting MS/MS spectra of unseparated interlinked peptides, many previously reported interlinked products were determined based only on parent masses. In contrast, all of the interlinked peptides of cytochrome c, ubiquitin, and the yeast 20 S proteasome complex were identified in this work with three lines of evidence, including characteristic fragmentation pairs (Link-Finder), peptide sequence determination by MS3 sequencing (Batch-Tag), and mass mapping (MS-Bridge). This procedure permits the identification of cross-linked peptides with high accuracy, reliability, and speed. It is important to note that existing database search programs can be easily adapted for analyzing DSSO-cross-linked peptides; thus, a broad application of the DSSO-based cross-linking strategy is foreseeable. Furthermore, cross-linked peptides of cytochrome c with two links can be identified, suggesting the capability of the new cross-linking strategy for identifying more complex cross-linked products.
Cross-linking/mass spectrometry has previously been applied to the yeast 20 S proteasome complex using Ru(II)(bpy)₃²⁺ (tris(2,2′-bipyridyl)ruthenium(II) dication)/ammonium persulfate/light-mediated cross-linking (48), in which multiple subunit interconnectivity was determined based on MS identification of co-migrated subunits by SDS-PAGE after cross-linking. No cross-linked peptides were identified because of the complicated chemistry of the radical-based cross-linking reaction. Therefore, our work describes the first successful use of a cross-linking/mass spectrometry strategy to determine intersubunit and intrasubunit interaction interfaces of the yeast 20 S proteasome complex. Although only 13 interlinked peptides of the yeast 20 S proteasome were identified and reported here, this work presents the first step toward full characterization of proteasome structures using cross-linking/mass spectrometry in the future. The feasibility of using the DSSO-based cross-linking strategy to identify cross-linked peptides of a large protein complex at a concentration of 1 μM or less is very significant and of great promise to structural studies of protein complexes because purifying protein complexes at high concentrations is technically challenging.
During LC MSn analysis using the LTQ-Orbitrap XL mass spectrometer, collision energy cannot be adjusted on the fly to account for differences in peptide charge states; therefore, a compromise collision energy is set for the entire LC MSn run. Thus, there exists a possibility that the collision energy may be too high for highly charged ions but too low for peptides with lower charges. Future improvement in charge selection and energy adjustment during LC MSn data acquisition may be needed to further enhance the quality of the results. Additionally, optimized peptide separation prior to LC MSn analysis will be necessary to improve the dynamic range of peptide analysis and allow the detection of low abundance cross-linked peptides. Moreover, refinement of the Link-Finder program is needed to improve the identification of intralinked peptides. Lastly, the addition of an affinity tag to the sulfoxide-containing cross-linker will improve detection of cross-linked peptides, which will be the subject of our future study.
FIG. 8. Mapping identified DSSO-interlinked lysines onto crystal structure of yeast 20 S proteasome. The lysines forming intrasubunit cross-links appear space-filled in blue, and those forming intersubunit cross-links appear space-filled in red.
In summary, we developed a new MS-cleavable cross-linker, DSSO, that is applicable to model peptides and proteins and to a multisubunit protein complex. The unique MS features of DSSO-cross-linked peptides together with our integrated data analysis work flow for analyzing LC MSn data greatly reduce the time spent identifying cross-linked peptides. Given its simplicity, speed, and accuracy, we believe that this cross-linking strategy will have a broad application in elucidating structures of proteins and protein complexes in the future.