Development of Large-scale Cross-linking Mass Spectrometry*

Cross-linking mass spectrometry (CLMS) provides distance constraints to study the structure of proteins, multiprotein complexes and protein-protein interactions which are critical for the understanding of protein function. CLMS is an attractive technology to bridge the gap between high-resolution structural biology techniques and proteomic-based interactome studies. However, as outlined in this review there are still several bottlenecks associated with CLMS which limit its application on a proteome-wide level. Specifically, there is an unmet need for comprehensive software that can reliably identify cross-linked peptides from large data sets. In this review we provide supporting information to reason that targeted proteomics of cross-links may provide the required sensitivity to reliably detect and quantify cross-linked peptides and that a reporter ion signature for cross-linked peptides may become a useful approach to increase confidence in the identification process of cross-linked peptides. In addition, the review summarizes the recent advances in CLMS workflows using the analysis of condensin complex in intact chromosomes as a model complex.


Cross-linking Mass Spectrometry and the Analysis of Protein Complexes and Protein-Protein Interactions-Chemical cross-linking in combination with mass spectrometry is a technology that has been used for over a decade to reveal the topology of protein complexes and protein-protein interactions (1). Cross-linking mass spectrometry (CLMS)¹ is based on the use of a bifunctional reagent that covalently links two reactive residues (usually lysines) under near-physiological conditions to create an informative linkage within a protein or between proteins. The linkage has a defined length and indicates which residues are within the distance constraint defined by the length of the selected cross-linker. For example, the bissulfosuccinimidyl suberate (BS3) cross-linker introduces a maximal distance constraint of 30 Å between the linked residues (the length of the lysine side chains plus the length of the BS3 linker). The cross-linked protein sample is then digested by a protease (e.g. trypsin) into linear and cross-linked peptides, followed by mass spectrometric (MS) analysis to identify where in the protein sequence the linkages have occurred. This linkage information can then be used to refine or build a 3D model of a protein or a protein complex, or to describe the protein-protein interaction (PPI) networks of a cell. Results from a cross-linking experiment are usually presented as a detailed linkage map when isolated proteins are analyzed, or as an interaction network map for in vivo experiments (Fig. 1). When cross-linking data are combined with data from other structural biology techniques, such as X-ray crystallography, electron microscopy (EM), computational modeling, and NMR, a high-resolution structure of a multiprotein complex can be obtained. The increasing performance of CLMS technology has the potential to reveal the PPI networks of the entire human cell and to provide novel systems approaches to structural biology.
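The way a cross-link acts as a distance constraint on a structural model can be illustrated with a minimal Python sketch. It assumes the 30 Å upper bound quoted above for BS3; the residue identifiers and coordinates are hypothetical placeholders, not data from any study cited here.

```python
import math

# Upper distance bound in angstroms, taken from the 30 Å figure quoted for BS3.
BS3_MAX_DISTANCE = 30.0


def satisfies_constraint(coord_a, coord_b, max_distance=BS3_MAX_DISTANCE):
    """Return True if two (x, y, z) positions lie within the cross-linker bound."""
    return math.dist(coord_a, coord_b) <= max_distance


# Hypothetical model: (chain, residue number) -> coordinates in angstroms.
model = {
    ("A", 12): (0.0, 0.0, 0.0),
    ("A", 45): (10.0, 10.0, 10.0),
    ("B", 7): (40.0, 0.0, 0.0),
}

# Hypothetical identified cross-links between residue pairs.
links = [(("A", 12), ("A", 45)), (("A", 12), ("B", 7))]

# Links whose residues are farther apart than the constraint allows.
violations = [
    link for link in links
    if not satisfies_constraint(model[link[0]], model[link[1]])
]
print(violations)
```

A link that violates the bound points either to a false identification or to a region of the model that needs refinement.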
Despite the apparent straightforwardness of the CLMS approach, many technical challenges have not yet been overcome, and the number of studies in which cross-linking mass spectrometry has resolved major biological issues is still limited, especially when cross-linked peptides must be identified from complex peptide mixtures (2-4). Currently, the major challenges are (1) the low abundance of cross-linked peptides and (2) the reliable identification of cross-linked peptides. In this review, we discuss these challenges and their current solutions in greater detail. We present the most impressive studies reflecting recent progress and examine the strategies that enabled the advancement.
Bottlenecks in Large Scale CLMS-Low Abundance Problem-In a cross-linking experiment, the desired cross-linked peptides are formed when the cross-linker has reacted with residues at both of its ends, thereby producing an informative linkage site that contains the distance constraint information. Cross-linkers of different specificity have been developed (reactive toward Lys, Asp, Glu, Cys, Arg, or nonspecific), although lysine-reactive cross-linkers are the most commonly used. Unfortunately, cross-linked peptides are far less abundant than unmodified linear peptides and cross-linker-modified peptides, in which the cross-linker has reacted with an amino acid at one end only and with an H2O molecule at the other end. It was previously estimated that cross-links account for less than 0.1% of the total theoretical peptide combinations (5). The low amount of cross-linked peptides is a consequence of several factors. First, there are few pairs of specific reactive residues close enough in the native protein structure to provide a cross-link. 
NHS-ester based cross-linkers target on average 5-7% of the residues in a protein. Second, the yield of cross-linking is variable and depends on the cross-linker reactivity and accessibility. Cross-linking is in competition with hydrolysis, and frequently cross-linker-modified peptides are generated instead of cross-linked peptides. Third, in a complex mixture, not all protein molecules will be cross-linked at the same residues. In this case, protein molecules of the same kind will be cross-linked differently, creating an even more complex sample. Fourth, the ionization, peptide length, chromatographic and fragmentation properties of cross-linked peptides are normally less efficient or favorable compared with those of unmodified, non-cross-linked peptides. 
As a result, a very complex mixture of peptides is generated, in which each unique cross-linked peptide is present in relatively low abundance. This makes the detection of cross-links by mass spectrometry a considerable challenge, and the identification of cross-linked peptides is like looking for "a needle in a haystack".

¹ The abbreviations used are: CLMS, cross-linking mass spectrometry; AMAS, N-α-maleimidoacet-oxysuccinimide ester; BS3, bissulfosuccinimidyl suberate; BS2G, bis(sulfosuccinimidyl) 2,2,4,4-glutarate; BS(PEG)5, PEGylated bis(sulfosuccinimidyl)suberate; CLIP, click-enabled linker for interacting proteins; CID, collision-induced dissociation; DSS, disuccinimidyl suberate; DSSO, disuccinimidyl sulfoxide; DST, disuccinimidyl tartrate; DSP, dithiobis(succinimidylpropionate); DTSSP, 3,3′-dithiobis(sulfosuccinimidyl propionate); EDC, 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride; EM, electron microscopy; ETD, electron transfer dissociation; EThcD, electron transfer and higher-energy collision dissociation; FDR, false discovery rate; HCD, higher-energy collisional dissociation; IMAC, immobilized metal ion affinity chromatography; NMR, nuclear magnetic resonance; ORF, open reading frames; PEG, polyethylene glycol; PPI, protein-protein interaction; PIR, protein interaction reporter; QTOF, quadrupole time-of-flight mass spectrometer; SAXS, small-angle X-ray scattering; SCX, strong cation exchange; Sulfo-GMBS, N-γ-maleimidobutyryl-oxysulfosuccinimide ester.
Incomplete Knowledge About the Fragmentation of Crosslinked Peptides-Apart from the general low abundance nature of the cross-linked peptides, another problem regarding the identification is the incomplete knowledge about the fragmentation pattern. Cross-linked peptides are fundamentally different molecules than linear and modified peptides, as the cross-linked peptides are composed of two linear peptides connected with a cross-linker. Moreover, cross-linkers are often bulky and contain different functional groups that might affect how linked peptides fragment in mass spectrometers.
To date, only a few studies have focused on mapping the dissociation trends of cross-linked peptides (6-11). They examined MS-cleavable and noncleavable cross-linkers, but none carefully investigated cross-linkers containing other functional groups. These studies established that fragmentation occurs mainly, as expected, at the cleavable bonds and in the linear moiety of both peptides, typically yielding a series of y-ions and b-ions. Tryptic peptides often produce more pronounced y-ions than b-ions, and overall, under collision-induced dissociation (CID), ions from the longer (alpha) peptide are more pronounced than ions from the shorter (beta) peptide (14). Rappsilber and coworkers improved the coverage of the beta peptide by applying the electron-transfer dissociation (ETD) fragmentation method and showed that the choice of fragmentation method has a distinct effect on the intensity of the generated fragments (12). ETD has been successfully implemented by Burlingame, Adkins and Heck together with the 1,3-diformyl-5-ethynylbenzene (DEB), click-enabled linker for interacting proteins (CLIP) and disuccinimidyl sulfoxide (DSSO) cross-linkers (4,13,14). Hence, cross-linked peptides can be identified as two linear peptides carrying a modification.
Additionally, fragmentation can occur at the peptide amide bond adjacent to the cross-linked residues, generating ions that contain parts of both peptides and the cross-linker (Fig. 2: y4α, y3β, b2α, b2β). A tendency toward dissociation at the amide bond C-terminal to the cross-linked lysine was reported (9,10). If the peptide-cross-linker amide bond is cleaved, the moiety fragments as a regular modified or unmodified peptide (Fig. 2: Lα13, Lβ5, Lα5, Lβ13). The group of Robert J. Chalkley reported that 71.9% of 3885 disuccinimidyl suberate (DSS) cross-linked peptides fragment at the amide bond joining the lysine ε-amine with the cross-linker or at the peptide bond joining a modified lysine to adjacent residues (15). Also, DEB cross-linked peptides under ETD fragment at positions equivalent to Lα13, Lβ5, Lα5 and Lβ13, and this was useful to confirm the identity of cross-links belonging to the GroEL-GroES complex (14). This increased knowledge of the fragmentation pattern can be used to improve scoring algorithms.
When higher-energy C-trap dissociation (HCD) is employed, additional small fragments can be generated, arising from multiple fragmentations of peptides and from fragmentation at the bonds adjacent to cross-linked residues. This exposes fragments containing both cross-linked residues joined by the cross-linker, or the cross-linked residue and its immonium ions. All of these can serve as reporter ions for cross-linked peptides. Barysz and colleagues studied the fragmentation spectra of over 700 cross-linked peptides and found several small fragments (m/z < 400) that contain the cross-linker backbone and the remains of one or both linked residues.
Many isotope-encoded and chemically cleavable cross-linkers were developed for producing reporter ions. For example, 3,3′-dithiobis(sulfosuccinimidyl propionate) (DTSSP) results in the presence of a 66 Da doublet peak and the disappearance of the cross-linked peak after addition of a reducing agent (17, 19-23). However, only the Protein Prospector software considers reporter ions that are naturally embedded in every cross-linked lysine, regardless of which cross-linker is used. This leaves room for further improvements.
The Problem of Reliable Identification of Cross-linked Peptides from Large Data Sets-Cross-linked peptides share fragmentation features with standard peptides (series of b- and y-ions), but their precursor mass is a combination of the masses of two peptides and a cross-linker. In addition, cross-linked peptides produce ions typical of a cross-linked species, as outlined above. On top of that, unequal fragmentation efficiency of the two linked peptides forming the cross-link was reported (15). Ion trap-based CID and higher-energy collisional dissociation (HCD) of cross-links typically favor the formation of product ions from only one of the two constituent peptides. This impairs the confident assignment of cross-links, which depends on high product-ion quality for both cross-linked peptides, especially in complex mixtures (15). All of these factors put particular demands on software that must match large-scale cross-linked spectra with protein sequences in databases (Fig. 3).
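The statement that the precursor mass combines two peptide masses and a cross-linker mass can be made concrete with a short Python sketch. It uses standard monoisotopic residue masses and assumes a BS3/DSS-type linker that adds 138.06808 Da when both ends have reacted; the peptide sequences are hypothetical, and modifications other than the cross-link are ignored.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.007276
# Assumed mass added by a fully reacted BS3/DSS cross-link (C8H10O2).
BS3_LINK_MASS = 138.06808


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def crosslinked_mass(alpha, beta, link_mass=BS3_LINK_MASS):
    """Neutral mass of two peptides joined by one cross-linker."""
    return peptide_mass(alpha) + peptide_mass(beta) + link_mass


def precursor_mz(neutral_mass, charge):
    """m/z of the protonated precursor at the given charge state."""
    return (neutral_mass + charge * PROTON) / charge


# Hypothetical tryptic peptides, each containing one internal lysine.
mass = crosslinked_mass("AKLVTDR", "GEKSLR")
print(round(mass, 4), round(precursor_mz(mass, 4), 4))
```

A search engine has to find the peptide pair whose summed mass matches the observed precursor, which is why the candidate space grows so quickly with database size.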
A further challenge is the large number of cross-linked peptide combinations in a given database. For a database containing n peptides, the number of possible combinations is $n^2/2 + n$. This means that when the number of peptides increases linearly, the number of combinations increases quadratically, and this number defines the required search space (24). Not only is the database large, but so is the number of acquired spectra when working with complex peptide mixtures. This presents a computational challenge that requires different approaches (25).
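The quadratic growth of the search space can be illustrated numerically. The sketch below simply evaluates the expression quoted in the text; the function name and the example database sizes are illustrative assumptions.

```python
def pair_search_space(n_peptides):
    """Number of candidate combinations, using the n^2/2 + n expression from the text."""
    return n_peptides ** 2 / 2 + n_peptides


# Illustrative database sizes: each tenfold increase in peptides
# gives roughly a hundredfold increase in candidate combinations.
for n in (100, 1_000, 10_000):
    print(n, pair_search_space(n))
```

Doubling the number of peptides roughly quadruples the number of candidates, which is why strategies that reduce the problem to a linear search are attractive for proteome-scale databases.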
FIG. 3. Architecture of a search engine able to identify cross-linked peptides from large databases. 1. Acquired data are extracted as raw files. 2. Search parameters are specified, the cross-linker is selected, the database is chosen, and the search is performed. Results are provided in a table and displayed according to confidence. Each match can be visualized in two tabs, a spectrum viewer and a protein viewer. 3. Only high-confidence matches (FDR < 5%) are extracted and used for 4. modeling the structure of protein complexes and PPI networks. An ideal search engine should be able to search any type of cross-linking data, provide a score reflecting the confidence of the matches, annotate spectrum matches, extract the high-confidence spectra and provide an FDR. Cross-links can be used directly for building models of protein structures or PPI networks (38).
To judge the quality of the match of a candidate cross-link spectrum to a theoretical spectrum, the following features are typically considered: (1) the error tolerance for MS and MS/MS, (2) the number of assigned peaks and their intensity, (3) the presence of ions with loss of water or ammonia, and (4) the presence of reporter ions for cross-linked peptides. Each of these features is scored and contributes a certain part of a total score that aims to express how good the match is (26-30). For very complex samples that require a large search space, sophisticated scoring functions are a prerequisite for the confident identification of cross-links, and MS/MS data need to be the basis for identification.
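How such features might be combined into one score can be sketched as follows. This is a toy illustration, not the scoring function of any of the cited search engines: the weights, the tolerance and the example peaks are hypothetical, and neutral-loss ions are not scored separately (they could be added to the theoretical m/z list).

```python
def match_peaks(observed, theoretical_mz, tol_ppm=20.0):
    """Return observed (m/z, intensity) peaks within tol_ppm of any theoretical m/z."""
    return [
        (mz, intensity)
        for mz, intensity in observed
        if any(abs(mz - t) / t * 1e6 <= tol_ppm for t in theoretical_mz)
    ]


def toy_score(observed, theoretical_mz, reporter_mz,
              weights=(0.4, 0.4, 0.2), tol_ppm=20.0):
    """Weighted sum of matched-peak fraction, matched-intensity fraction
    and reporter-ion presence. The weights are arbitrary placeholders."""
    matched = match_peaks(observed, theoretical_mz, tol_ppm)
    total_intensity = sum(intensity for _, intensity in observed)
    frac_peaks = len(matched) / len(theoretical_mz) if theoretical_mz else 0.0
    frac_intensity = (
        sum(intensity for _, intensity in matched) / total_intensity
        if total_intensity else 0.0
    )
    has_reporter = 1.0 if match_peaks(observed, reporter_mz, tol_ppm) else 0.0
    w_peaks, w_intensity, w_reporter = weights
    return w_peaks * frac_peaks + w_intensity * frac_intensity + w_reporter * has_reporter


# Hypothetical spectrum and candidate fragments.
observed = [(200.0, 50.0), (300.0, 30.0), (400.0, 20.0)]
score = toy_score(observed, [200.0, 300.001, 500.0], reporter_mz=[400.0])
print(round(score, 3))
```

The MS/MS error tolerance enters through `tol_ppm`; a tighter tolerance lowers the chance of random peak matches at the cost of missing poorly calibrated fragments.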
False positive identification of cross-linked peptides arises mainly from cases in which the shorter peptide (beta peptide, see Fig. 2) is assigned wrongly. The shorter the beta peptide is, the higher the likelihood that it will be assigned incorrectly. If the longer peptide (alpha peptide) carries several fragment matches, the beta peptide can at times be assigned as a match even if it only fits the required precursor mass. Such cases will be more common in complex peptide mixtures than in the analysis of single proteins because more sequences are available that can be matched randomly. It was suggested that valid matches in experiments with cross-linked cell lysates should require a beta peptide longer than five amino acids (3), as peptides with four or fewer amino acids can easily show random matches to signals in a spectrum because: (1) the C-terminal amino acids can overlap between the alpha and beta peptide (Lys/Lys, Arg/Arg); (2) the ions might be spectral noise; or (3) the $b_{n-1}$ ions are identical in the alpha and beta peptide when either one of the C-terminal amino acids is cleaved off. Additionally, four-amino acid peptides often cannot be assigned to a protein unambiguously, as the protein database might contain at least two proteins with that peptide.
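The suggested length requirement for the beta peptide translates into a simple post-search filter. The sketch below assumes that "longer than five amino acids" means a minimum of six residues; the function names and sequences are hypothetical.

```python
# "Longer than five amino acids" is read here as at least six residues.
MIN_BETA_LENGTH = 6


def split_alpha_beta(peptide_1, peptide_2):
    """Return (alpha, beta), where alpha is the longer peptide."""
    if len(peptide_1) >= len(peptide_2):
        return peptide_1, peptide_2
    return peptide_2, peptide_1


def passes_beta_length_filter(peptide_1, peptide_2, min_beta_length=MIN_BETA_LENGTH):
    """Keep a cross-link match only if its shorter peptide is long enough."""
    _, beta = split_alpha_beta(peptide_1, peptide_2)
    return len(beta) >= min_beta_length


# Hypothetical candidate matches.
candidates = [("AKLMNPQR", "GKR"), ("AKLMNPQR", "TKDEVR")]
kept = [pair for pair in candidates if passes_beta_length_filter(*pair)]
print(kept)
```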
The problem is simplified when working with small protein complexes, because few proteins need to be considered in the database. Several programs have been reported that automatically match MS2 spectra of cross-linked peptides in such smaller databases, and most of these programs work with a variety of cross-linkers (8, 31-33). This is, however, not the case for programs that are designed to work with large-scale data sets. These programs typically require specific cross-linking chemistries (amine-reactive cross-linkers, cleavable cross-linkers, isotope-labeled cross-linkers) and/or specific MS workflows (25,34,35). They are often restricted to the one or few experimental conditions they were optimized for and are less successful in producing meaningful results when applied to data obtained under different experimental conditions. So far, at least six search engines capable of working with large data sets have been reported. xQuest can work as a generic tool suitable for most cross-linkers, but in this mode it can search databases of only up to 100 proteins (25). Larger databases can be searched, but this requires an isotope-labeled cross-linker. An advantage is that it works well with xProphet, which calculates the false discovery rate (FDR) based on a target-decoy strategy (36). The software XlinkX is designed to work only with S-S-cleavable cross-linkers and can both determine the accurate precursor masses and obtain the sequences of the two linked peptides by combining the information from MS, MS2, and MS3 (4). The search engines Protein Prospector and pLink avoid the ($n^2/2 + n$) "n-square problem" by considering a cross-linked peptide as a single peptide bearing an unknown modification, enabling the search to be done against a regular linear peptide database. pLink can identify cross-linked peptides linked through lysine residues from databases the size of the E. coli proteome (37). It works with light and heavy labeled cross-linkers. 
The software visualizes the spectrum matches so that they can be manually validated and their confidence assigned. Xi works well when used with lysine-reactive cross-linkers and has been shown to produce meaningful results for data sets of the size of the chicken chromosome (38). Kojak is open-source software that works with a variety of cross-linker chemistries, and it was shown to correctly identify cross-links for 26S ribosome (39). However, its scoring algorithm needs further optimization, as it scores long peptides higher and in many cases omits correct shorter peptides. These software packages were developed by scientific laboratories mainly for internal use and only later became open access; as a consequence, support and flexibility are limited.
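The target-decoy strategy mentioned above can be sketched for cross-link matches. The text does not give a formula, so the estimator used here, (TD − DD)/TT, is an assumption based on a commonly used form for cross-link searches; the labels, scores and function names are hypothetical.

```python
def crosslink_fdr(labels):
    """Estimate FDR from match labels: 'TT' (both peptides target),
    'TD' (one decoy) and 'DD' (both decoy). Assumed estimator: (TD - DD) / TT."""
    tt = labels.count("TT")
    td = labels.count("TD")
    dd = labels.count("DD")
    if tt == 0:
        return 0.0
    return max(td - dd, 0) / tt


def accept_at_fdr(scored_matches, max_fdr=0.05):
    """Return target-target matches from the longest score-ranked list
    whose estimated FDR stays at or below max_fdr."""
    ranked = sorted(scored_matches, key=lambda match: match[0], reverse=True)
    best_cut = 0
    labels = []
    for position, (_, label) in enumerate(ranked, start=1):
        labels.append(label)
        if crosslink_fdr(labels) <= max_fdr:
            best_cut = position
    return [match for match in ranked[:best_cut] if match[1] == "TT"]


# Hypothetical (score, label) pairs from a search against targets plus decoys.
matches = [(10.0, "TT"), (9.0, "TT"), (8.0, "TD"), (7.0, "TT"), (6.0, "DD")]
accepted = accept_at_fdr(matches, max_fdr=0.05)
print(len(accepted))
```

The threshold of 5% mirrors the FDR cutoff named in the caption of Fig. 3; a stricter value trades sensitivity for confidence.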
Strategies to Tackle Drawbacks of CLMS-Over the years, several strategies were developed to improve the identification of cross-linked peptides. They comprise sophisticated methods for the enrichment of cross-linked peptides prior to and during mass spectrometry acquisition, and the clever design of chemical cross-linkers.
Development of Enrichment Methods for Cross-linked Peptides-Considerable efforts have been made to enhance the signal intensity and increase the identification rate of the low-abundance but informative cross-linked peptides by MS. One strategy includes separation of the cross-linked proteins before digestion by SDS-PAGE, followed by Western blot analysis to localize the multiprotein complex of interest in the gel. Only the area containing the complex is cut out from the gel and analyzed by MS. This approach focuses the analysis on the proteins of interest and significantly reduces the complexity of the sample (38). SDS-PAGE is also used to optimize the cross-linking protocol. Another common strategy is to separate the cross-linked peptide mixture by reverse-phase HPLC or isoelectric focusing before injection into the mass spectrometer. This reduces the complexity of the sample and thus allows for the detection of cross-linked peptides (35). Cross-link enrichment is achieved to some extent, as cross-linked peptides are typically larger and thus more hydrophobic than the other peptides, so they tend to elute later from the reverse-phase column. Another common strategy is based on strong cation exchange chromatography of cross-linked peptide mixtures. A tryptic cross-linked peptide has at least four residues that can hold a proton; it therefore typically carries more positive charges than a linear peptide at pH 2 and elutes later from the strong cation exchange matrix. This separation was shown to be an efficient way of separating cross-linked peptides (25). The high charge state of cross-linked peptides is also utilized during MS/MS acquisition to selectively fragment cross-linked peptides. Only peptides of charge state z > 2+ are selected for fragmentation. This promotes fragmentation of cross-linked peptides even if they have lower intensity than linear peptides. 
Trnka and Burlingame introduced a new amine-specific cross-linker, DEB (14), which contains two additional protonation sites. The same concept was utilized by Reilly in the amine-specific diethyl suberthioimidate (DEST) cross-linker (40). Unlike BS3/DSS, these two cross-linkers preserve the basicity of the targeted amines and thereby generate cross-links in the z > 4+ charge state. This can further improve the charge-based separation of cross-linked peptides from linear peptides and at the same time allows MS acquisition to be focused on ions of charge state 4+ or higher. Peptide-level size exclusion chromatography has also been used for enrichment, as cross-linked peptides have a higher molecular weight and tend to elute with a lower retention time (3,41).
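Charge-state-dependent precursor selection can be expressed as a small filter. The sketch below assumes that z > 2+ means a minimum charge of 3 and combines it with a top-N intensity selection; the data structure, the default values and the example precursors are hypothetical.

```python
def select_precursors(precursors, min_charge=3, top_n=10):
    """Pick the most intense precursors at or above min_charge for fragmentation.

    precursors: list of dicts with 'mz', 'charge' and 'intensity' keys.
    """
    eligible = [p for p in precursors if p["charge"] >= min_charge]
    eligible.sort(key=lambda p: p["intensity"], reverse=True)
    return eligible[:top_n]


# Hypothetical survey scan: intense doubly charged linear peptides and
# weaker, more highly charged candidates for cross-linked peptides.
survey_scan = [
    {"mz": 523.28, "charge": 2, "intensity": 9.0e6},
    {"mz": 611.33, "charge": 2, "intensity": 7.5e6},
    {"mz": 702.87, "charge": 4, "intensity": 4.0e5},
    {"mz": 845.42, "charge": 3, "intensity": 6.0e5},
]
selected = select_precursors(survey_scan, min_charge=3, top_n=2)
print([p["mz"] for p in selected])
```

Raising `min_charge` to 4 or 5 corresponds to the stricter selection that becomes possible with cross-linkers that preserve the basicity of the targeted amines.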
Enrichable Cross-linkers for Selective Purification of Cross-links-To selectively purify cross-linked peptides from high-abundance linear peptides, affinity-tagged reagents were introduced. These reagents enrich the cross-linked peptides together with cross-linker-modified peptides, as the latter also contain the affinity group and bind to the affinity matrix. Affinity-tagged reagents can provide the highest enrichment factor of all the strategies discussed here.
Many cross-linkers containing an affinity tag have been developed, using various affinity groups and chemistries (42-46). Pierce offers a sulfo-N-hydroxysuccinimidyl-2-(6-[biotinamido]-2-(p-azido benzamido)-hexanoamido) ethyl-1,3′-dithiopropionate (Sulfo-SBED) cross-linker, which comprises a biotin functionality, allowing affinity enrichment on streptavidin beads (47,48). A few cross-linkers that use alkyne-azido click chemistry, sulfhydryl capture or a phosphoryl group have been demonstrated to work in proof-of-principle studies (49-52), but none of them has been applied to a biological problem yet. One cross-linker, bis(succinimidyl)-3-azidomethyl glutarate (BAMG), contains an azido group which, in cross-linked peptides, can be reduced to an amine group. Reduction enables isolation of cross-linked peptides by diagonal strong cation exchange chromatography. CID of reduced cross-linked peptides shows abundant cleavage of the cross-link amide bonds, along with cleavage of the peptide bonds of the composing peptide pair (53). The Borchers lab has shown that the entire cross-linker can serve as an antigenic group for anti-cross-linker antibody-based affinity enrichment of cross-linked peptides.
The downside of affinity-tagged cross-linkers is that they are usually bulky, which decreases their solubility, accessibility and cross-linking efficiency. One exception is the azide-tagged, acid-cleavable disuccinimidyl bis-sulfoxide (Azide-A-DSBSO) cross-linker, which is used with biarylazacyclooctynone. The cross-linker is membrane-permeable and MS-cleavable, and it carries a bio-orthogonal azide tag that functions as an enrichment handle, permitting selective isolation of cross-linked proteins and peptides through azide-based conjugation chemistry and subsequent affinity purification. It also has an acid-cleavable site adjacent to the azide tag, allowing for acid elution of cross-linked peptides from streptavidin beads. DSBSO delivered PPI data for mammalian cells and for human proteasome complexes (54). Another exception is the Protein Interaction Reporters (PIRs) developed in the Bruce lab (5,34,55). These cross-linkers contain a biotin moiety and, as a second functionality, low-energy CID-cleavable bonds, and they are reporter encoded. PIR cross-linked peptides are selectively purified before MS acquisition and can be targeted during MS (see paragraph below). When a PIR cross-linker was applied to entire E. coli cells, it delivered over 708 unique cross-linked peptides, which were used to assemble the first protein interaction network that contains topological features for many interactions (34).

MS-cleavable Cross-linkers to Simplify Identification of Cross-linked Peptides-To simplify the identification of cross-linked peptides, MS-cleavable cross-linkers containing labile bond(s) were developed. Photo- and chemically-cleavable cross-linkers are also used, but MS-cleavable cross-linkers are more widespread because each cross-linked peptide can be fragmented separately from the others in the mixture, so that the m/z values of the precursor, the cleaved peptides and the fragment ions are registered specifically for each cross-linked peptide and the information can be linked. During MS2 acquisition the labile bonds break open and produce two linear peptides, each with half of the cross-linker attached, which serves as a mark indicating the presence of a cross-linked peptide. In MS3 both peptides are further fragmented to yield regular series of y-ions and b-ions. This significantly decreases the database size for such cross-linked peptides, as they can be searched as regular linear peptides carrying a modification rather than as two peptides connected by a cross-linker. The downside of performing MS3 is that it reduces the number of precursor ions that can be analyzed because of the need to acquire two extra MS/MS spectra. However, peptide backbone fragmentation can also be achieved without the need for MS3 by applying ETD-MS/MS (4). The newest Orbitrap instruments offer EThcD, in which ETD and HCD are performed sequentially before scanning and detection, and this fragmentation method could also be used to avoid MS3.
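The MS2 signature of an MS-cleavable cross-linker, two released peptides that together account for the precursor, can be searched for with a simple complementary-mass check. The sketch below assumes that cleavage conserves mass, so that the neutral masses of the two released species sum to the precursor neutral mass; the tolerance and all masses are hypothetical.

```python
def find_complementary_pairs(precursor_mass, fragment_masses, tol_da=0.02):
    """Return pairs of fragment neutral masses that sum to the precursor
    neutral mass within tol_da, using a two-pointer scan over sorted masses."""
    masses = sorted(fragment_masses)
    pairs = []
    low, high = 0, len(masses) - 1
    while low < high:
        total = masses[low] + masses[high]
        if abs(total - precursor_mass) <= tol_da:
            pairs.append((masses[low], masses[high]))
            low += 1
            high -= 1
        elif total < precursor_mass:
            low += 1
        else:
            high -= 1
    return pairs


# Hypothetical deconvoluted MS2 fragment masses for a 3000.0 Da precursor.
fragments = [500.0, 1200.01, 1799.995, 2500.3, 900.0]
pairs = find_complementary_pairs(3000.0, fragments)
print(pairs)
```

Each pair found this way nominates two linear peptides, each carrying a cross-linker remnant, which can then be searched like ordinary modified peptides.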
To date, several different types of MS-cleavable cross-linkers have been developed (56-65), but only a few are broadly used. Back et al. employed a tertiary amino group within the N-benzyliminodiacetate N-succinimidyl ester (BID) cross-linker, and Michael Goshe introduced a labile Asp/Val-Pro bond (56,57). Gavin Reid synthesized for the first time a "fixed charge" sulfonium ion-containing cross-linker with a labile C-S bond directly adjacent to the fixed charge, and thereby specifically and selectively dissociated the cross-linked peptide chains from each other, overcoming the "proton mobility" issue (66). The design of the cross-linker was further simplified by Athit Kao (61). The sulfonium ion was replaced with a sulfoxide, which was previously shown to trigger fragmentation at the adjacent C-S bond. This provided an additional advantage: flexibility in changing the length of the spacer. Disuccinimidyl sulfoxide (DSSO) is a very efficient reagent, which proved to work for isolated multiprotein complexes but also on a proteome-wide level (4). Equally efficient is the PIR mentioned above, which contains a CID-cleavable Asp-Pro Rink bond. Recently, Andrea Sinz proposed an automated workflow for the identification of cross-linked peptides from complex peptide mixtures using the MS-cleavable cross-linker BuUrBu and the software MeroX (67-69). The reagent contains a thiourea moiety flanked by two NHS esters and, upon MS/MS fragmentation of cross-links, releases an indicative doublet as well as backbone fragments. This allows for an automated screening procedure for cross-links and provides fragment ion information without the need for MS3. Further use of such reagents holds great promise for obtaining comprehensive protein topology data on a large scale.
The design and synthesis of new reagents is ongoing, with the aim of creating an ideal cross-linker that is stable, reactive and sufficiently soluble, contains an affinity handle and an MS-cleavable bond, is relatively small and has a spacer of suitable length. Shorter cross-linkers provide higher-resolution structural data but can link fewer reactive residues. With longer cross-linkers more reactive residues are within reach, but the formed linkages are longer and hence the resulting distance constraints are less precise. The optimal spacer length for large-scale analysis still needs to be assessed. Once available, the ideal cross-linker will significantly improve the detection and identification of cross-links.
Targeted CLMS-A novel concept is the application of targeted MS to improve the detection of cross-linked peptides in complex mixtures. In targeted MS, the acquisition is focused on peptides of interest, and acquisition time is saved by ignoring other peptides. This method is particularly useful when the peptides of interest are of low abundance and are present in a very complex mixture. In the regular strategy of fragmenting the 6-10 most intense ions per spectrum, cross-linked peptides could be missed altogether. Targeted MS approaches are most frequently implemented on triple quadrupole instruments operating in the selected reaction monitoring mode (SRM, often also called multiple reaction monitoring, MRM) (70,71). The function of the quadrupole is to concentrate the available measurement time on the targeted analytes. This enables signal accumulation, which translates into an improved limit of detection. The approach, however, requires a priori information about the target, such as the m/z of the precursor ion, the retention time and a set of high-intensity fragment ions unique to the targeted peptide (reporter ions) (72). When a QTOF mass spectrometer is available, a parallel reaction monitoring (PRM) experiment can be performed, which allows parallel detection of all transitions in a single analysis and does not require prior information about the target (73). The method could easily be adapted to analyze cross-links.
In one study, the authors used the reporter ions for cross-linked peptides at m/z 222.149, 239.175 and 350.222 to target the fragmentation of cross-linked peptides added to trypsin-digested hemoglobin (18). They could identify the cross-linked peptides and showed that precursor ion scanning that utilizes reporter ions for cross-links allows for more sensitive detection of cross-linked peptides. Furthermore, a targeted MS technique was adopted to gain information about the structure of the condensin and cohesin complexes in intact chromosomes (38). The m/z values of the condensin and cohesin cross-linked peptides identified in the analysis of the purified complexes were used to direct peptide fragmentation in the analysis of mitotic chromosomes. This was performed on an LTQ-Orbitrap instrument, utilizing the inclusion list feature rather than precursor ion scanning, and resulted in considerably improved detection of the peptides of interest. The results demonstrated the detection of 15 high-confidence cross-linked peptides matching the condensin I complex in a mixture of over 600 other proteins. Through these 15 high-confidence cross-linked peptides, the architecture of the condensin I complex was revealed. It was found that the condensin coiled-coils interact with each other along their lengths, and this provided sufficient data to refine the proposed model of condensin I complex action in cells (74,75). Further, the identification of cross-links outside the inclusion list established that condensin I complexes interact with each other through the N-terminal tail of the CapH linker subunit. The cross-linking data revealed that both histones H2A and H4 may have roles in condensin interactions with chromatin (Fig. 4).
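Both targeting approaches described here, reporter-ion screening and inclusion lists, reduce to tolerance-based m/z matching. The sketch below uses the three reporter m/z values quoted in the text; the tolerances, the inclusion list entries and the example spectrum are hypothetical.

```python
# Reporter ion m/z values for cross-linked peptides, as quoted in the text.
REPORTER_MZ = (222.149, 239.175, 350.222)


def count_reporter_ions(spectrum_mz, reporters=REPORTER_MZ, tol_da=0.005):
    """Count how many reporter ions have a peak within tol_da in the spectrum."""
    return sum(
        any(abs(mz - reporter) <= tol_da for mz in spectrum_mz)
        for reporter in reporters
    )


def on_inclusion_list(precursor_mz, inclusion_list, tol_ppm=10.0):
    """Return True if a precursor m/z matches any inclusion list entry."""
    return any(
        abs(precursor_mz - target) / target * 1e6 <= tol_ppm
        for target in inclusion_list
    )


# Hypothetical inclusion list built from cross-links of a purified complex.
inclusion_list = [650.3300, 812.4102]
print(on_inclusion_list(650.3301, inclusion_list))

# Hypothetical fragment spectrum containing two of the three reporter ions.
print(count_reporter_ions([222.150, 239.174, 500.0]))
```

An instrument method would trigger fragmentation only for precursors on the list, whereas reporter-ion counting can be applied after acquisition to flag spectra that likely stem from cross-linked peptides.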
Advances in Current Pipelines-Some of the most significant studies based on CLMS were communicated after 2010. One study determined the topology of the GroEL-GroES chaperonin complex (14) by using the earlier mentioned reagent DEB to generate cross-linked peptides of z > 4+. The identification of cross-linked peptides was facilitated by electron transfer dissociation of the DEB-peptide bonds to yield diagnostic ions equivalent to Lα13, Lβ5, Lα5 and Lβ13 (see Fig. 2). ETD was described to be superior to CID in terms of providing higher coverage for both peptides forming the cross-link. In 2012 the Aebersold group mapped the interaction network of protein phosphatase 2A affinity-purified from human cells (35). They identified 176 interprotein and 570 intraprotein cross-links that linked specific trimeric PP2A complexes to adaptor proteins that control their cellular functions. This was achieved by combining affinity purification, on-bead cross-linking with d0/d12 disuccinimidyl suberate (DSS), SCX enrichment of cross-linked peptides, xQuest and molecular modeling. Spatial constraints directed the modeling of the binding interface between immunoglobulin binding protein 1 and PP2A and revealed the topology of the TCP1 ring complex chaperonin interacting with the PP2A regulatory subunit 2ABG. A year later, an important study was published in Cell in which the structure and subunit topology of the INO80 chromatin remodeler and its nucleosome complex (1 MDa) were solved using cross-linking and electron microscopy (76). This study utilized the same strategy to identify cross-linked peptides as in (35): d0/d12 DSS cross-linking, SCX enrichment of cross-linked peptides and the xQuest search engine. Additionally, the software xProphet was used to improve the estimation of the false discovery rate of cross-linked peptide assignments (36). 
The same year, the Blankenship lab cross-linked cyanobacterial cells and, using affinity purification, mass spectrometry, molecular modeling, and time-resolved spectroscopy, mapped the interaction between phycobilisomes and the reaction centers of photosystems I and II. They utilized the membrane-permeable cross-linker dithiobis[succinimidylpropionate] (DSP) and the search engines MassMatrix and xQuest (77). In another example, whole-chromosome cross-linking information and molecular modeling provided a 3D model of the SMC2/SMC4 sub-complex, which is built of long patches of coiled-coils. This was achieved by applying in situ BS3 cross-linking, targeted MS, SCX enrichment, and the in-house software Xi (38). The obtained model was in good agreement with a partial crystal structure obtained later for the isolated complex (78). Cross-linking data were valuable in modeling the structure of the 26S proteasome holocomplex and helped to build a molecular-level model of the PsbQ protein bound to photosystem II (79,80).
First attempts have been made to cross-link living cells and identify protein-protein interactions. In a pioneering study done on E. coli lysate in 2008, the Aebersold lab identified 22 internally cross-linked proteins and validated seven interprotein cross-links. They used isotopically tagged DSS, enrichment by SCX chromatography, and targeted proteomics (isotopic pattern). The advance was made possible by the parallel development of the xQuest software, which can reliably identify linkages from a database of 4000 proteins (25). The study suggested that direct purification of cross-linked peptides is needed to increase the yield of identifiable cross-links (81). The Bruce lab introduced PIR technology, which enables selective purification and cleavage of cross-linked peptides under CID to produce two modified peptides whose sequences were determined in MS3, considerably simplifying the assignment of cross-links and adding confidence to it (5). In total, the application of this strategy resulted in the identification of 1500 cross-linked peptides involving 400 proteins in whole E. coli cells (34), 368 cross-linked peptide pairs in mammalian cells (82), and 626 cross-links in Pseudomonas aeruginosa (83). All the enrichment strategies reported earlier were utilized: affinity purification, SCX chromatography, and MS analysis of only z > 4+ charged precursors. The Luitzen de Jong lab also synthesized BAMG, a cross-linker that is both enrichable (it contains an azido group that is convertible to an amine and hence can be enriched by SCX chromatography) and cleavable (the amine group fragments during MS2 together with the peptide bonds). They reported the identification of 265 intraprotein and 19 interprotein cross-links in HeLa cell nuclear extract (53). The Meng-Qiu Dong lab used the in-house-developed software pLink and identified 394 interlinks from BS3-treated E. coli lysates. 85.6% of these cross-links were compatible with the structures of the corresponding proteins and complexes deposited in the PDB.
Besides BS3, pLink also works with the DSS, EDC, AMAS, and sulfo-GMBS cross-linkers (81). The Lan Huang lab advanced the cross-linking technology by synthesizing the Azide-A-DSBSO cross-linker described earlier and showing its ability to reveal the protein interaction network in mammalian cells (54). In a similar fashion, the Heck lab utilized an MS-cleavable cross-linker and merged CID and ETD fragmentation data on cross-links to increase the number of fragment ions that can be used for assignment; using the in-house-developed software XlinkX, they identified 2179 unique cross-linked peptides at 5% FDR in human cell lysate (4).
These highlighted examples indicate that cleavable cross-linkers significantly simplify the identification of cross-links and are therefore the most effective cross-linkers in complex systems so far.
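Several of the studies above judge their cross-links against known structures, for example the 85.6% of BS3 interlinks reported as compatible with PDB entries. A minimal sketch of such a check follows. It assumes that compatibility means a Cα-Cα distance within the 30 Å limit quoted earlier in this review for BS3, and that Cα coordinates are already available in a dictionary keyed by (chain, residue number). The function name and data layout are hypothetical, and the cited studies may have used different cutoffs or atoms.

```python
import math

# Maximal distance quoted in this review for BS3; applying it to
# C-alpha atoms is an assumption made for this sketch.
MAX_CA_DISTANCE = 30.0  # Angstrom


def compatibility(cross_links, ca_coords, max_dist=MAX_CA_DISTANCE):
    """Count cross-links that satisfy the distance constraint.

    cross_links: list of ((chain, resnum), (chain, resnum)) pairs.
    ca_coords:   dict mapping (chain, resnum) to an (x, y, z) tuple.
    Returns (satisfied, evaluated); links with a residue missing from
    the structure are skipped because they cannot be judged.
    """
    satisfied = 0
    evaluated = 0
    for site_a, site_b in cross_links:
        if site_a not in ca_coords or site_b not in ca_coords:
            continue
        evaluated += 1
        if math.dist(ca_coords[site_a], ca_coords[site_b]) <= max_dist:
            satisfied += 1
    return satisfied, evaluated


# Hypothetical coordinates and links, for illustration only.
coords = {("A", 10): (0.0, 0.0, 0.0),
          ("A", 50): (10.0, 0.0, 0.0),
          ("B", 5): (0.0, 40.0, 0.0)}
links = [(("A", 10), ("A", 50)),   # 10 A apart: satisfied
         (("A", 10), ("B", 5)),    # 40 A apart: violated
         (("A", 10), ("B", 99))]   # residue not in the structure: skipped
ok, total = compatibility(links, coords)
print(f"{ok}/{total} cross-links within {MAX_CA_DISTANCE} A")
```

Reporting the number of evaluated links alongside the number satisfied keeps unresolved regions of a structure from inflating or deflating the compatibility figure.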

CONCLUSIONS
The constant improvement of the CLMS technology over the past decade has resulted in a powerful tool to study the structure of protein complexes. The technology has so far provided data that helped to understand the function of key multiprotein complexes in the human body. Currently, CLMS faces the next challenge: the elucidation of the protein-protein interaction networks of an entire cell. To reach this goal, the major bottlenecks in CLMS need to be resolved, such as the low abundance of cross-linked peptides and the reliable identification of cross-links. These are challenging tasks that put demands on the refinement of enrichment and identification strategies for cross-linked peptides. An appealing solution is the development of an enrichable cross-linker that is small, reactive, resuspendable in water, can efficiently penetrate the cell membrane, and can be easily purified from complex peptide mixtures. The cross-linker should also contain an MS-cleavable bond and thereby ensure reliable identification of cross-links. In parallel, targeted proteomics is emerging as a novel tool that can increase the likelihood of detection and reliable identification of cross-links, which could greatly improve the quality and reproducibility of CLMS data. For example, the Bruce lab reported a technical reproducibility of 70% and a reproducibility of 40% among bioreplicates; in total, 400 proteins out of 4237 ORFs were identified as cross-linked. In addition, an increased understanding of the fragmentation pattern of cross-linked peptides would enable the enhancement of search engines designed to work with large data sets from cross-linking experiments. Reporter ions for cross-linked peptides can serve as a way of separating cross-linked peptide spectra from linear and modified peptide spectra and, when integrated into scoring algorithms, will further increase the identification confidence of cross-linked peptides.
New search strategies, such as treating cross-links as single linear peptides carrying a large unknown mass modification, have already yielded meaningful data from the analysis of the E. coli proteome.
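The arithmetic behind this search strategy can be sketched as follows. The precursor mass of a cross-link equals the mass of peptide A plus the mass of peptide B plus the linker, so for a candidate peptide A the "unknown modification" is the remainder, which should in turn match another peptide plus the linker. The sketch assumes a BS3/DSS cross-link (linker mass increment 138.06808 Da), standard monoisotopic residue masses, and a 10 ppm tolerance. The function names and the tiny peptide list are hypothetical, and this is not the scoring scheme of any published search engine.

```python
# Monoisotopic amino acid residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
LINKER = 138.06808  # mass added by a BS3/DSS cross-link (C8H10O2)


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def partner_candidates(precursor_mass, peptide_a, database, tol_ppm=10.0):
    """Peptides that explain the 'unknown modification' on peptide_a."""
    modification = precursor_mass - peptide_mass(peptide_a)
    target = modification - LINKER  # expected mass of the second peptide
    tol = precursor_mass * tol_ppm * 1e-6
    return [pep for pep in database if abs(peptide_mass(pep) - target) <= tol]


# Hypothetical example: a cross-link between AKR and GKL.
database = ["GKL", "AKR", "VKDE"]
precursor = peptide_mass("AKR") + peptide_mass("GKL") + LINKER
print(partner_candidates(precursor, "AKR", database))
```

In a real search the first peptide would be proposed from its fragment ions, and the partner would then be confirmed against the remaining fragments rather than by mass alone.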
In parallel, a new era of applications of CLMS to biological and structural problems is emerging; cross-linking information is used not alone but in combination with other structural biology methods (38,84). Merging data from X-ray crystallography, NMR, EM, SAXS, native MS, and homology modeling with the distance constraints provided by cross-linking is an interesting way forward to solve the structure of huge macromolecular machineries. Such approaches may considerably increase the resolution of protein structures and, above all, enable investigations of macromolecular complexes that are outside the reach of high-resolution structural biology techniques. Each subunit can be studied separately, and the subunits can then be assembled into an active complex, which allows the function of the complex to be elucidated.