Trifunctional cross-linker for mapping protein-protein interaction networks and comparing protein conformational states

To improve chemical cross-linking of proteins coupled with mass spectrometry (CXMS), we developed a lysine-targeted enrichable cross-linker containing a biotin tag for affinity purification, a chemical cleavage site to separate cross-linked peptides away from biotin after enrichment, and a spacer arm that can be labeled with stable isotopes for quantitation. By locating the flexible proteins on the surface of 70S ribosome, we show that this trifunctional cross-linker is effective at attaining structural information not easily attainable by crystallography and electron microscopy. From a crude Rrp46 immunoprecipitate, it helped identify two direct binding partners of Rrp46 and 15 protein-protein interactions (PPIs) among the co-immunoprecipitated exosome subunits. Applying it to E. coli and C. elegans lysates, we identified 3130 and 893 inter-linked lysine pairs, representing 677 and 121 PPIs. Using a quantitative CXMS workflow we demonstrate that it can reveal changes in the reactivity of lysine residues due to protein-nucleic acid interaction. DOI: http://dx.doi.org/10.7554/eLife.12509.001


Introduction
Proteins execute diverse functions by interacting with multiple protein partners in different complexes. The study of protein complex structures and protein-protein interactions is critical for understanding their functions. Recently, chemical cross-linking of proteins coupled with mass spectrometry analysis (CXMS) has emerged as a powerful tool for the analysis of such structures and interactions (Sinz, 2006;Leitner et al., 2010;Petrotchenko and Borchers, 2010;Singh et al., 2010;Rappsilber, 2011;Bruce, 2012). CXMS methods are less time-consuming and less demanding of sample purity than are traditional methods; this technology has thus been increasing in popularity.
In this work, we developed a series of chemical cross-linkers with a modular design as pioneered previously (Trester-Zedlitz et al., 2003). They each contain a biotin tag for affinity purification and a cleavage site that can be used to release cross-linked peptides from streptavidin beads. We selected the cross-linker with the best performance and developed a robust enrichment protocol with >97% enrichment efficiency. We termed it Lysine-targeted enrichable cross-linker (Leiker). Using our eLife digest Proteins fold into structures that are determined by the order of the amino acids that they are built from. These structures enable the protein to carry out its role, which often involves interacting with other proteins. Chemical cross-linking coupled with mass spectrometry (CXMS) is a powerful method used to study protein structure and how proteins interact, with a benefit of stabilizing and capturing brief interactions.
CXMS uses a chemical compound called a linker that has two arms, each of which can bind specific amino acids in a protein or in multiple proteins. Only when the regions are close to each other can they be "cross-linked" in this way. After cross-linking, the proteins are cut into small pieces known as peptides. The cross-linked peptides are then separated from the non cross-linked ones and characterized.
Although CXMS is a popular method, there are aspects about it that limit its use. It does not work well on complex samples that contain lots of different proteins, as it is difficult to separate the cross-linked peptides from the overwhelming amounts of non cross-linked peptides. Also, although it can be used to detect changes in the shape of a protein, which are often crucial to the protein's role, the method has not been smoothed out.
Tan, Li et al. have now developed a new cross-linker called Leiker that addresses these limitations. Leiker cross-links the amino acid lysine to another lysine, and contains a molecular tag that allows cross-linked peptides to be efficiently purified away from non cross-linked peptides. As part of a streamlined workflow to detect changes in the shape of a protein, Leiker also contains a region that can be labeled.
Analysing a bacterial ribosome, which contains more than 50 proteins, showed that Leiker-based CXMS could detect many more protein interactions than previous studies had. These included interactions that changed too rapidly to be studied by other structural methods. Tan, Li et al. then applied Leiker-based CXMS to the entire contents of bacterial cells at different stages of growth, and identified a protein interaction that is only found in growing cells.
In future, Leiker will be useful for analyzing the structure of large protein complexes, probing changes in protein structure, and mapping the interactions between proteins in complex mixtures.
previously developed pLink identification software , we here demonstrate that the use of Leiker effectively facilitates CXMS analysis in a variety of sample types, from purified complexes, crude immunoprecipitates, to highly complex whole-cell lysates.
Quantification of cross-linker modified peptides has the potential to detect protein conformational changes and changes in molecular interactions, though these methods are not mature. To address this potentially critical application of our technology, we synthesized stable isotope-labeled Leiker. Also, we established an automated data analysis workflow for the relative quantitation of light and heavy Leiker cross-links. As a proof of concept, we carried out a quantitative CXMS analysis of an RNA-binding protein L7Ae. Using deuterium-labeled Leiker, we found that for the three L7Ae lysine residues that are buried upon RNA binding, their mono-links decreased dramatically in the presence of RNA, exactly as expected. We further extended the application of quantitative CXMS to a highly complex system consisting of log-phase and stationary-phase E. coli cells and identified a growth phase specific protein interaction.

Design, synthesis and evaluation of Leiker
We aimed to develop a cross-linker similar to the widely used BS 3 but that had two major advantages: first, a biotin tag for affinity purification of cross-linked peptides, and second, a cleavage site to release cross-linked peptides after enrichment on streptavidin beads without carrying the biotin group; biotin can interfere with subsequent LC-MS/MS analysis. After experimenting with different designs of Leiker ( Figure 1, Figure 1-figure supplements 1-6, and Appendix), we found that bAL1 and bAL2 worked the best and there was no difference in performance between these two ( Figure 1-figure supplement 5). Hereafter, Leiker refers to either bAL1 or bAL2. In this study, bAL2 was used in most of the experiments and a bAL2-based CXMS workflow is illustrated in Figure 2. Both bAL1 and bAL2 feature a one-piece design with an azobenzene-based chemical cleavage site (Yang et al., 2010) and a 9.3-Å carbon chain that connects two sulfo-NHS esters. This spacer arm is shorter than that of BS 3 (11.4 Å ), so it may confer a higher specificity to Leiker in capturing protein-protein interactions. Inter-, loop-, and mono-linked peptides generated by either all produce a reporter ion of m/z 122.0606 in higher-energy collisional dissociation (HCD) spectra ( Figure 2D). It can be used to verify the identification of Leiker-cross-linked peptides. For quantitative CXMS analysis, we synthesized isotope-labeled bAL2 in which six hydrogen atoms in the spacer arm were replaced with deuterium ( Figure 1 and Figure 1-figure supplement 6). The six-dalton difference was sufficient to separate peptides cross-linked by [d 0 ]-Leiker from the same peptides cross-linked by [d 6 ]-Leiker.

Leiker enabled robust enrichment of cross-linked peptides
To assess to what extent Leiker could improve the identification of low-abundance cross-linked peptides from a complex background, a mixture of ten standard proteins (Figure 3-source data 1), consisting of RNase A, lysozyme, PUD-1/PUD-2 heterodimer, GST, aldolase, BSA, lactoferrin, bgalactosidase, mouse monoclonal antibody, and myosin, was treated with Leiker, digested with trypsin, and then diluted or not with a tryptic digest of non-cross-linked E. coli lysates at different ratios (1:1, 1:10, and 1:100, w/w). The peptide mixture was then incubated with streptavidin agarose. After extensive washes, cross-linked peptides were released using the Na 2 S 2 O 4 elution buffer. BS 3 was used in parallel as a control. As is shown in Figure 3A the number of BS 3 -linked peptide pairs identified decreased dramatically, from 109 in the undiluted sample to only one in the 100-fold diluted sample. The number of Leiker-linked peptide pairs identified after enrichment was in no way affected by increasing background complexity, with >160 inter-links detected in each sample. To be noted, these inter-linked peptide pairs, or inter-links for abbreviation, can result from either intraprotein or inter-protein cross-linking (illustrated in Figure 2B). Strikingly, cross-linking products, including inter-, loop-and mono-links ( Figure 2C) constituted over 97% of all peptides identified post enrichment ( Figure 3A). Of the Leiker-linked lysine pairs that can be mapped to the pdb structures (Figure 3-source data 1), 82% have Ca -Ca distance 22 Å and 93% have Ca -Ca distance 30 Å (FDR < 5%, E-value < 0.01), which is comparable to BS 3 (Figure 3-figure supplement 1). This result demonstrated that Leiker enables effective enrichment of cross-linked peptides.
The ten standard proteins also allowed us to assess the specificity of Leiker. Because Leiker has more functional groups than BS 3 does, a concern arises that Leiker may produce more cross-linking artifacts. Cross-links between non-interacting proteins are surely artifacts, which include all the interprotein cross-links identified from the ten-protein mixture except those between the light-chain and the heavy-chain of myosin, between the light-chain and the heavy-chain of an IgG antibody, and between PUD-1 and PUD-2, which form a heterodimer. We found that the percentage of artifactual cross-links is 3% for both Leiker and BS 3 (Figure 3-source data 2), fitting with the filtering criteria that were applied (FDR cutoff 0.05 followed by E-value cutoff 0.01). The results demonstrate that Leiker is as specific as BS 3 . Further, we cross-linked highly complex E. coli lysates with either Leiker or BS 3 for a side-by-side comparison. After enrichment and a single reverse phase LC-MS/MS analysis, Leiker yielded at least a fourfold increase in the number of inter-links identified ( Figure 3B).

Application of Leiker to large protein assemblies and immunoprecipitates
Next, we applied Leiker to real-world samples, starting with purified E. coli 70S ribosome, a 2.5 MDa ribonucleoprotein (RNP) complex consisting of more than 50 proteins. A total of 222 inter- linked lysine pairs were identified with high confidence, including 95 inter-molecular and 127 intramolecular cross-links ( Figure 4-source data 1). This is three times as many as in a previous study (Lauber and Reilly, 2011). Of the 95 cross-links connecting two lysine residues that are both present in the crystal structure of a 70S ribosome (Fischer et al., 2015) (PDB code: 5AFI), 75% are compatible with the crystal structure with a Ca-Ca distance 22 Å , which is the length of the spacer arm of Leiker plus two lysine side chains. Among the subset of intra-molecular cross-links, 84% have Ca-Ca distances 22 Å ; among the subset of inter-molecular cross-links, 50% have Ca-Ca distances 22 Å and 73% have Ca-Ca distances 30 Å , which could be a reasonable cutoff considering conformation flexibility of proteins in solution ( Figure 4-source data 1 and Figure 4-figure supplement 1).
One particular ribosomal protein L9 is a good example to illustrate conformational flexibility and the dynamic nature of interactions between proteins or protein complexes. A large b-factor in the crystal structure has suggested that L9 is highly mobile. It has been observed to adopt an extended, rod- like conformation in the crystal structure (Schuwirth et al., 2005) and a strikingly different bent conformation in the solution structure of the ribosome determined using cryo-EM (Fischer et al., 2015;Seidelt et al., 2009). Bending of L9 was echoed in this study, as reflected in the cross-links bridging L9 and L2 and the cross-links bridging the two termini of L9 (Figure 4-figure supplement 2). Three additional cross-links involving L9 have Ca-Ca distances >50 Å if measured within a ribosomal particle ( Figure 4-source data 1). We propose these apparently long distance cross-links, which are similar to the ones observed in a previous CXMS study (Lauber and Reilly, 2011), reflect interactions between ribosomal particles. L9 locates at the interface between ribosomal particles in higher-order configurations (e.g. polysome) (Brandt et al., 2009). Dimerization or oligomerization of 70S ribosomes in the absence of mRNA was also observed using negative staining EM from highly purified non-cross-linked 70S ribosomes ( Figure 4-figure supplement 3).
The peripheral regions of the ribosome are critical for protein translation and regulation (Savelsbergh et al., 2000;Valle et al., 2003;Kothe et al., 2004). However, despite of extensive studies on the ribosome structures, these peripheral parts are still largely missing because they are either too dynamic or refractory to crystallography. For E. coli ribosomal proteins currently lacking well-defined coordinates in the 70S crystal structures, Leiker-based CXMS provided remarkably more linkages than other ribosomal proteins. The top four proteins with the most inter-molecular cross-links identified are S1, L1, L7/12, and L31, all of which are mobile components in the peripheral regions and often invisible in the crystal structure ( Figure 4A and Figure 4-source data 2). S1 is the largest ribosomal protein, which binds to mRNA and initiates translation, but has no high-resolution structures available either alone or in the context of the 70S ribosome (Fischer et al., 2015;Lauber et al., 2012). A previous CXMS analysis of the 30S subunit revealed interaction between S1 and a region near the 3' end of 16S rRNA (Lauber et al., 2012), but it is unknown whether or not S1 interacts with the 50S subunit. Our analysis of the 70S ribosome revealed extensive contacts between the C-terminal mRNA binding domain of S1 and L1 in the 50S subunit (Figure 4-figure supplement 4A). Since the two of them localize to a region where both tRNA and mRNA leave the ribosome, the observation of six cross-links between them hints that there might be a coordination between deacylated tRNA release and mRNA exit from the ribosome. The 30S proteins that were found to interact with S1 in this study were largely consistent with those identified in the previous study (Lauber et al., 2012). In particular, four cross-links were identified between the N-terminal peptide of S1 (M1-K14) and the N-terminal peptide of S2 (M1-K11) (Figure 4-source data 1 and Figure 4-figure supplement 4A). This result agrees perfectly with a recent structural finding on the direct interaction between S1 and S2 (Byrgazov et al., 2015). L1 had the highest number of cross-links with S1, followed by cross-links with L33, L5, L9, S13, and S2 (Figure 4-figure supplement 4B). The proximity of L1 to L33 and L5 implicates a rotated conformation of L1 in the sample (Figure 4-source data 3), which was repeatedly observed in various structures of the 70S ribosome in different functional states (Valle et al., 2003). Furthermore, beyond the expected interactions between L7/12 and L6, L10, or L11 (Diaconu et al., 2005), we also found novel interactions between L7/12 and L19 or S3 ( Figure 4A and Figure 4-figure supplement 4C). These findings suggest that the highly flexible L7/12 stalk might be able to contact the 30S subunit, given the predicted large length of this dynamic stalk (Diaconu et al., 2005). Nine cross-links between E. coli L31 and L5 placed L31 in the central protuberance region (Figure 4-figure supplement 4D), which is supported by the crystal structure of T. thermophilus 70S ribosome (Voorhees et al., 2009) and the newly revealed structure of 70S ribosome (Fischer et al., 2015). Together, these results demonstrate that Leiker-based CXMS analysis can provide structural information that is highly complementary to crystallography and cryo-EM, especially for the flexible or dynamic regions that cannot be deduced using traditional methods.
Combining CXMS and immunoprecipitation (IP) has great potential for the detection of binding partners in close proximity among co-immunoprecipitated proteins; such method may be widely adopted in biology laboratories. Much progress has been made recently in this area by the use of a modified anti-GFP single-chain antibody that cannot be cross-linked so that GFP-tagged protein complexes can be cross-linked on beads and separated away from the antibody for CXMS analysis (Shi et al., 2015). For highly heterogeneous IP samples, however, cross-linked peptides can be inundated by non-cross-linked peptides even if the antibody is removed from the background. As a test, we prepared a crude immunoprecipitate of a TAP-tagged yeast exosome subunit Rrp46 (Figure 4figure supplement 5), from which 740 proteins were identified at 0.1% protein FDR. The . Leiker-based CXMS analyses of large protein assemblies. (A) Analysis of a purified E. coli 70S ribosome revealed the locations of highly dynamic periphery ribosomal proteins S1, L1, and L7/12 that were refractory to crystallography and cryo-EM analysis. Cross-links to S1, L1, and L7/12 are colored red, blue, and yellow, respectively, and the cross-linked residues on these three proteins are numbered according to the Uniprot sequences. (B) Analysis of a crude immunoprecipitate of the yeast exosome complex. Dashed blue and grey lines denote 50 compatible and 22 incompatible cross-links, respectively, according to the structure of the RNA-bound 11-subunit exosome complex (PDB code: 4IFD). Rrp44, green; Rrp40, orange; Rrp4, violet; Rrp42, gold; other exosome subunits, yellow; RNA, black. Known and candidate exosome regulators revealed by Leikercross-links are shown along the periphery and highlighted in green and yellow circles, respectively. (C) Connectivity maps of the ten-subunit exosome core complex based on the inter-molecular cross-links identified in the current IP-CXMS experiments or on previous yeast two-hybrid (Y2H) studies (Stark et al., 2006;Uetz et al., 2000;Oliveira et al., 2002;Luz et al., 2007;Yu et al., 2008). Blue solid lines: experimentally identified putative direct protein-protein interactions; grey dashed lines: theoretical cross-links according to the crystal structure; Ca-Ca distance cutoff 30 Å . immunoprecipitated proteins were eluted off IgG beads and cross-linked with Leiker. To evaluate the sensitivity of the method, we varied the amount of immunoprecipitates from 40 mg to 3 mg of proteins and found that the number of inter-link identifications did not change much as the input decreased from 40 to 20 mg (Figure 4-figure supplement 6). From three experiments starting with 40 mg of proteins, a total of 195 cross-linked lysine pairs (43 inter-molecular and 152 intra-molecular) were identified ( Figure 4B and Figure 4-source data 4). Thanks to cross-linking, not only did we identify all ten exosome core subunits, but also 15 putative direct protein-protein interactions amongst the core subunits, which generated a connectivity map more complete than the one from yeast two-hybrid experiments (Stark et al., 2006;Uetz et al., 2000;Oliveira et al., 2002;Luz et al., 2007;Yu et al., 2008) and showed that among the co-immunoprecipitated proteins, Rrp41 and Rrp45 directly bind to the bait protein Rrp46 ( Figure 4C). Of the cross-links identified, 69% were compatible with the crystal structure of an RNA-bound 11-subunit exosome complex (Makino et al., 2013) (PDB code: 4IFD). Among the cross-links that disagreed with the RNA-bound structure, 68% involved the catalytic subunit Rrp44, which has a large rotation relative to the rest of the exosome core between the RNA-bound and the RNA-free states (Makino et al., 2013;. The crude Rrp46 immunoprecipitate should mainly contain apo exosome, because magnesium was included in the buffer to activate the nuclease activity of exosome. Therefore, the presence (in the crystal structure) or absence (in our exosome preparation) of bound RNA is likely to be the primary reason behind most of the seemingly inconsistent inter-molecular cross-links.
To fulfill different functions in multiple biological processes (Houseley and Tollervey, 2009), the core exosome complex must recruit additional regulators, of which only a few are known. Here we found two known (Mpp6 [Milligan et al., 2008] and Ski7 Araki et al., 2001) and four potential exosome regulators through nine cross-links with core exosome subunits ( Figure 4B and Figure 4source data 4). These cross-links revealed residues in close proximity. Ski7 was found to cross-link with Rrp4 via K111, which fits well with previous co-IP results obtained by using different fragments of Ski7 (Araki et al., 2001) and a recently published CXMS study of the yeast exosome (Shi et al., 2015). Among the newly identified candidate regulators, the translation initiation factor Tif1 stood out; it had interactions with the Rrp4 and Rrp44 exosome subunits ( Figure 4B). Translation has been implicated in RNA quality control (Shoemaker and Green, 2012). The linkages identified here support the hypothesis that exosome complexes 'stand by' the translation machinery and recognize and degrade aberrant mRNA molecules.

Application of Leiker to lysates
We further tested Leiker for the purpose of mapping protein-protein interaction networks using E. coli and C. elegans lysates. E. coli whole-cell lysates are commonly used for evaluating CXMS methods (Rinner et al., 2008;Yang et al., 2012;Weisbrod et al., 2013). In three independent experiments, Leiker-treated, trypsin digested E. coli lysates were fractionated on a high pH reverse phase column, and cross-linked peptides were enriched from each of the 10 or 11 fractions (Figure 5-figure supplement 1). After filtering the data by requiring FDR < 0.05, E-value < 0.01, spectral count ! 3, we identified a total of 2003 non-redundant inter-linked lysine pairs including 1386 (69%) intra-molecular and 617 (32%) inter-molecular cross-links (Figure 5-source data 1 and Figure 5-figure supplement 2). Protein structure information is available in the PDB database for 984 intra-molecular cross-links identified with Leiker, and is consistent with 80% of them, indicating the high quality of the results ( Figure 5-source data 1). Of note, the inter-molecular cross-links represent 436 pairs of protein-protein interactions, and 25% of the cross-links are supported by the combined network of the bacteriome.org database (Peregrín-Alvarez et al., 2009) Figure 5-source data 1). Most of the inter-molecular cross-links suggest novel protein-protein interactions. Based on the Leiker cross-links, we constructed a protein-protein interaction network and extracted the most highly connected module (Bader and Hogue, 2003;Saito et al., 2012) ( Figure 5A). This 12-protein module consists of 9 ribosomal proteins and two DNA-binding proteins (the Hu heterodimer DBHA/ DBHB) organized around a translation elongation factor Tu (EF-Tu). Evidently, it is enriched with proteins that function in translation, suggesting that DBHA/DBHB also plays a role in this process. Indeed, previous studies reported that a small fraction of this Hu heterodimer is bound to ribosomes (Rouvière-Yaniv and Kjeldgaard, 1979) and that this protein can enhance or repress translation of the mRNA molecules that it binds to (Balandina et al., 2001). In contrast, the most connected module obtained from the previously identified BS 3 cross-links  comprised only three ribosomal proteins ( Figure 5A). These results indicate the potential of Leiker in generating comprehensive protein-protein interaction networks using CXMS. Since ribosomal proteins dominated E. coli whole-cell lysates, we prepared samples in which ribosomes were removed by centrifugation through a layer of sucrose cushion. Analysis of the ribo-free samples (two repeats) with Leiker identified 1971 inter-links, 1127 of which were not identified in the whole-cell lysates (5% FDR, E-value < 0.01, spectral count ! 3) ( Figure 5B Applying Leiker to an even more complex lysate from C. elegans, which has a similar number of protein coding genes as human (~20,000), we identified 459 inter-links (5% FDR, E-value < 0.01, spectral count ! 3) ( Figure 5-source data 3). We also analyzed a C. elegans mitochondrial fraction and identified 547 inter-linked lysine pairs (5% FDR, E-value < 0.01, spectral count ! 3), of which 434 were not detected in the whole-worm lysate ( Figure 5-source data 4). Together, we identified 893 non-redundant cross-linked lysine pairs from C. elegans and constructed protein-protein interactions between 155 proteins ( Figure 5-figure supplement 3B).
In order to compare with previous studies, we also applied a less stringent cutoff (5% FDR, E-value < 0.01, spectral count ! 1) to the data sets of E. coli and C. elegans whole-cell lysates. This allowed us to determine that the number of C. elegans cross-links identified in this study was 23 times as many as the previous record ( Figure 5C) . The number of E. coli crosslinks identified in this study is four times greater than the number of PIR-identified inter-links  and eight times greater than the number of BS 3 -identified inter-links . Half of the BS 3 -identified cross-links  were recapitulated in this study ( Figure 5D).

Leiker-based quantitative CXMS analysis
Relative quantification of cross-linker modified peptides can reveal changes in protein conformation and/or interactions between a protein and another molecule (e.g. nucleic acid, ligand, or protein). To apply Leiker in quantitative CXMS, we synthesized deuterium-labeled Leiker ([d 6 ]-bAL2) in addition to the unlabeled version ([d 0 ]-bAL2). Few software tools reported to date directly support quantitative CXMS (Fischer et al., 2013;Walzthoeni et al., 2015). We therefore modified the quantification software pQuant  and established an automated data analysis workflow for quantitative CXMS (Figure 6 and Materials and methods). As a proof-of-principle experiment, we compared the RNA-free and H/ACA RNA-bound states of a Pyrococcus furiosus ribosomal protein L7Ae (Rozhdestvensky et al., 2003;Li and Ye, 2006). We treated RNA-free L7Ae with [d 0 ]- The best protein-protein interaction cluster extracted from the Leiker-identified or BS 3 -identified  inter-links from E. coli whole-cell lysates. Node size represents the degree of connectivity of the indicated protein in the network. Line width represents the spectral counts of every inter-molecular cross-link. The line color is set to blue when the two peptides of an interlink are both attributed to unique proteins, to grey if either could be assigned to multiple proteins. All the lines connected to EF-Tu1 are grey because EF-Tu1 differs from EF-Tu2 by only one amino acid. (B) Comparison of the identified inter-links in E. coli whole-cell lysates and ribo-free lysates (5% FDR, E-value < 0.01, spectral count ! 3). bAL2 and the assembled L7Ae-RNA complex with [d 6 ]-bAL2 in the forward reaction, and switched the isotope labels in the reverse reaction ( Figure 7A). An equal amount of BSA protein was included in each sample to control for possible difference in cross-linking efficiency between [d 0 ]-and [d 6 ]-Leiker. We expected that the formation of the protein-RNA complex would block the access of Leiker to lysine residues at the binding interface, which would be manifested as large abundance decrease of mono-or inter-linked peptides at these sites.
Mono-linked peptides are usually neglected in CXMS, but they are valuable because they indicate that the modified lysine residues are exposed to solvent. Mono-links at all 15 lysine residues and the N-terminus of L7Ae were reliably quantified ( Figure 6) in both forward and reverse labeling experiments. Three mono-links at K35, K42, and K84 consistently had significantly higher abundance (>5 fold) in the RNA-free state (F) than in the RNA-bound state (B) ( Figure 7B-D, Figure 7-figure supplement 1 and Figure 7-source data 1). None of inter-links passed the quantification criteria described above. These results suggest that the three lysine residues are buried upon RNA binding, either due to direct protein-RNA binding or indirect protein conformational changes induced by RNA binding. This is in perfect agreement with the crystal structure (Li and Ye, 2006) (PDB code: 2HVY), which shows that K35, K42, and K84 all bind to the RNA, each with a buried area greater than 20 Å 2 ( Figure 7C).
Lastly, we applied quantitative CXMS to E. coli lysates. The log phase and the stationary phase cell lysates were cross-linked, respectively, with [d 0 ]-and [d 6 ]-bAL2 in the forward labeling experiment, or with [d 6 ]-and [d 0 ]-bAL2 in the reverse labeling experiment. After a single enrichment step without pre-fractionation, a total of 161 inter-linked lysine pairs were quantified in both the forward and the reverse labeling experiments, and most of them had similar [log phase]/[stationary phase] ratios in the two experiments ( Figure 8 and Figure 8-source data 1). Noticeably, the cross-link between YqjD and ElaB increased at least 10 times in the stationary phase compared to the log phase. These two paralogous proteins are associated with the inner membrane of E. coli cells through their C-terminal transmembrane motifs and both bind to stationary phase ribosomes, probably through their N-terminal regions (Yoshida et al., 2012). It is suggested that YqjD binding to ribosomes inhibits translation (Yoshida et al., 2012). Association of YqjD and ElaB has been detected but the sites of interaction are not known (Hu et al., 2009). Here, our results not only confirm previous findings, but also provide new insights that YqjD and ElaB form a heterodimer through their central regions, presumably as a stronger, divalent anchoring site for ribosomes to inhibit protein translation in the stationary phase.

Discussion
In this study, we developed an MS-friendly and isotope-encodable cross-linker called Leiker that enables the efficient enrichment of cross-linked peptides through biotin-based immobilization and azobenzene-based chemical cleavage. With an enrichment efficiency of 97% or more, Leiker yields a fourfold increase in the number of identified cross-linked peptide pairs from complex samples. Also established is a workflow for quantitative CXMS based on deuterium-labeled Leiker.
In theory, a comprehensive network of putative direct protein-protein interactions could be obtained by applying Leiker to lysates. However, the interaction networks obtained as such are limited, because the cross-links identified are dominated by those from highly abundant proteins, for example, EF-Tu and ribosomal proteins in E. coli. This can be overcome with subcellular fractionation, which can separate abundant proteins from less abundant ones. We increased the number of unique inter-link identifications by more than 50% (from 2003 to 3130) by simply removing ribosomes from the E. coli lysates ( Figure 5B). This is also obvious by contrasting the CXMS results of the whole-worm lysate and the mitochondrial fraction of C. elegans, from which 459 and 547 interlinked lysine pairs were detected, respectively, with an overlap of only 113. We anticipate that extensive protein fractionation coupled with Leiker-assisted CXMS will pave the way towards constructing comprehensive interactomes for different model organisms, and next-generation cross-link identification software of higher sensitivity will also help. Further, with the advantage of heavy isotope labeling for quantification in addition to the enrichment function, Leiker shows promise for use in differential interactome analysis (Ideker and Krogan, 2012). ratios can be calculated based on the monoisotopic peak, the most intense peak, or the least interfered peak of each isotopic cluster as specified by users. The accuracy of the ratio calculation was evaluated with the confidence score s (range: 0-1, from the most to the least reliable). If a cross-link have ratios with s < 0.5, the median of these ratios is assigned to this cross-link. The cross-link ratios of the proteins of interest are normalized to the median ratio of all BSA cross-links. For each cross-link, the median [state1]/[state2] ratio of three independent forward labeling experiments is plotted against the median ratio of three independent reverse labeling experiments. Cross-links that are only present in state1 or state2 due to a dramatic conformational change cannot be quantified as described above because the ratios would be zero or infinite and their s values would be 1. Therefore, if a cross-link does not have a valid ratio after automatic quantification, the EICs were manually inspected to determine if it was an all-or-none change. DOI: 10.7554/eLife.12509.034 When we examined the cross-links identified from E. coli against the protein structures deposited in the PDB database, we noted that the intra-molecular cross-links in both the whole-cell lysates and the ribo-free samples had similar rates of structural compatibility (80% and 84%, respectively). This shows that the quality of our Leiker-based CXMS data is high. Interestingly, the inter-molecular cross-links detected from the ribo-free samples had a much higher rate of structural compatibility (69%) than those detected in the whole-cell lysates (12%). Given that 92% of the inter-links with existing structural information in the whole-cell lysate involved at least one ribosomal protein and many were between ribosomal proteins, we think that most of the apparently incompatible intermolecular cross-links seen in the whole-cell lysates likely result from cross-linking of adjacent ribosomal particles.
Previous cross-linking studies have typically treated mono-linked peptides as by-products, and have ignored them. This is regrettable, as they carry structural information about proteins and always outnumber inter-links ( Figure 3). Leiker also generates abundant mono-links. In this study, we demonstrate that mono-links are highly valuable in mapping RNA-binding lysine residues. As the positively charged lysine residue is frequently involved in binding the negatively charged phosphate backbone of DNA and RNA, relative quantification of lysine mono-links would be particularly suited for mapping the DNA or RNA binding surface on a protein. We suggest that mono-link data should be used in routine practice.

Preparation of protein samples
RNase A, lysozyme, aldolase, BSA, lactoferrin, b-galactosidase, and myosin were obtained from Sigma-Aldrich. Recombinant GST containing an N-terminal His tag was expressed in E. coli BL21 cells from the pDYH24 plasmid and purified with glutathione sepharose (GE Healthcare, Piscataway, NJ). PUD-1/PUD-2 heterodimers were purified on a HisTrap column followed by gel filtration. Stock solutions of the ten standard proteins were individually buffer exchanged into 20 mM HEPES, pH8.0 by ultrafiltration, and then mixed to make a total protein mixture with a 2 mg/ml protein concentration.
Purification of 70S ribosomes from E. coli cells was performed as previously described (Guo et al., 2011). E. coli cells (DH5a) were grown in 2 L LB medium to an OD 600 =0.8. Cells were collected by centrifugation, washed with 100 mL lysis buffer (50 mM HEPES-KOH, pH 7.5, 500 mM KCl, 12 mM MgCl 2 , 1 mM DTT, 1 mM PMSF) and resuspended in 100 mL of lysis buffer. Cells were then disrupted with an Ultrasonic Cell Disruptor. The lysate was clarified at 13,000 rpm for 1 hr at 4˚C in a JA 25.50 motor (Beckman Coulter, UK). The supernatant was layered on a sucrose cushion (50 mM HEPES-KOH, pH 7.5, 500 mM KCl, 12 mM MgCl 2 , 33% sucrose) and centrifuged at 30,000 rpm for 18 hr in a 70Ti rotor (Beckman Coulter) at 4˚C. The supernatant was collected as the ribo-free lysate. The pellet was resolved with a buffer containing 50 mM HEPES-KOH, pH 7.5, 500 mM KCl, and 12 mM MgCl 2 . The crude ribosomes were then layered on a 10-50% sucrose gradient (50 mM HEPES-KOH, pH 7.5, 500 mM KCl, 12 mM MgCl 2 , 10% to 50% sucrose) and centrifuged at 28,000 rpm for 5 hr in an SW28 rotor (Beckman Coulter) at 4˚C. The gradient was scanned at 260 nm and fractionated in an ISCO gradient collector. The fractions of 70S ribosomes were pooled and concentrated with Amicon Ultra centrifugation filters (Millipore, China) with a buffer containing 50 mM HEPES-KOH, pH 7.5, 500 mM KCl, and 12 mM MgCl 2 .
The yeast exosome complex was immunoprecipitated with IgG beads as described previously , with the following modifications: a gentle wash buffer (150 mM NaCl) was applied and the mono-Q anion exchange step was not performed. These modifications were made in order to maintain the interaction of the proteins in the sample. Eluted proteins were exchanged into 20 mM HEPES, pH 8.0, 150 mM NaCl.
E. coli OP50 lysates and C. elegans N2 lysates were prepared following a protocol from Bing et al. Zhao et al., 2015). Mitochondria were isolated from the wild-type N2 worms as described previously (Shen et al., 2014) and lysed by incubation in 100 mM HEPES pH 8.0, 1% NP-40, 10 mM CaCl 2 at 4˚C for 30 min.
The Pyrococcus furiosus L7Ae and the H/ACA RNA were prepared as described previously (Li and Ye, 2006). The buffer was exchanged to 50 mM HEPES, pH 7.6, 1 M NaCl.
E. coli (MG1665) cells were grown at 37˚C in 500 mL M9 minimal medium from a 1 mL overnight culture. Log phase cells were harvested after 11 hr at OD 600 0.7; stationary phase cells were harvested after 26 hr at OD 600 2.3. Cell lysates were prepared in 50 mM HEPES pH 8.0, 150 mM NaCl using a FastPrep system (MP Biomedicals, Santa Ana, CA) using two volumes of glass beads at 6.5 m/s, 20 s per pulse for four pulses, with 5 min of cooling on ice between pulses. The lysates were cleared by centrifugation at top speed in a tabletop microfuge for 30 min. Protein concentrations were determined using the bicinchoninic acid assay.

Trypsin digestion
At room temperate (RT), protein pellets were dissolved (assisted by sonication) in 8 M urea, 20 mM methylamine (to reduce carbamylation), 100 mM Tris, pH 8.5, reduced with 5 mM TCEP for 20 min and alkylated with 10 mM iodoacetamide for 15 min in the dark. Then, the samples were diluted with 3 volumes of 100 mM Tris, pH 8.5 and digested with trypsin at 1/50 (w/w) enzyme/substrate ratio at 37˚C for 16-18 hr.

CXMS analysis of model proteins
The optimal protein-to-cross-linker mass ratio was determined by a titration experiment. 1 ml of cross-linker at increasing concentrations (2.5 mg/ml, 5 mg/ml, 10 mg/ml, 20 mg/ml, 40 mg/ml) in DMSO was incubated with 20 ml of 2 mg/ml of the ten-protein mixture at RT for 1 hr to make 16:1, 8:1, 4:1, 2:1, and 1:1 protein-to-cross-linker mass ratios, respectively. The reactions were quenched with 20 mM NH 4 HCO 3 at RT for 20 min. Cross-linking products were analyzed by SDS-PAGE. The 4:1 ratio was ultimately chosen for both the one-piece and the two-piece Leiker. Higher dosages were avoided to minimize excessive cross-linking.
For comparison of the one-piece and two-piece Leikers, 50 ml of the 2 mg/ml ten-protein mixture was incubated with 0.5 ml of 50 mg/ml AL or bAL1 at RT for 1 hr. The reactions were quenched as described above. For AL, the solution was mixed with 350 ml of 8 M urea, 100 mM Tris, pH 8.5, and filtered with an Amicon Ultra-0.5 10-kD filter device (Millipore). Excess cross-linker molecules were removed by two additional washes with urea. Click chemistry was subsequently performed on the membrane. In a 100 ml reaction, 28 nmol of azide-biotin was added (an amount equal to the starting amount of the alkyne group of AL), followed by the addition of 2 mM CuSO 4 , 2 mM TCEP, and 200 mM TBTA. Samples were gently rotated and incubated at RT for 2 hr. The excess free azide-biotin was then removed by washes with urea in the filter device. Finally, the proteins were collected by centrifugation with the filter device placed upside down inside the tube. Recovered proteins were transferred to a new 1.5 mL tube, precipitated at -20˚C with four volumes of pre-cooled acetone for at least 30 min, and digested with trypsin. The bAL1 samples were processed in the same way except that the reaction mixture was precipitated directly without going through the 10-kD filter device.
The AL-and bAL1-cross-linked peptides were enriched in parallel. The tryptic digests, without formic acid (FA) acidification, were directly mixed with an equal volume of 20 mM HEPES, pH 8.0 and incubated with 40 ml pre-washed high capacity streptavidin agarose for 2 hr. Then, the beads were washed three times with 20 mM HEPES, pH 8.0, 1 M KCl, once with H 2 O, three times with 10% acetonitrile (ACN), and another three times with H 2 O, each time with 1 mL buffer or H 2 O, with 5min rotation. Supernatants were removed carefully with a 1 mL syringe needle connected to a vacuum pump. Loss of beads was avoided by keeping the beveled surface of the needle tip in contact with the wall of the tube. After the extensive washes, the peptides were released by incubating the beads with 5Â bed volumes of cleavage buffer (300 mM Na 2 S 2 O 4 in 6 M urea, 2 M thiourea, 10 mM HEPES, pH8.2) (Yang et al., 2010) at 37˚C for 30 min, with end-to-end rotation. Recovered peptides were acidified with 5% FA and subsequently desalted on home-made C18 desalting columns, followed by elution with 70% ACN/0.1% FA. Eluates were vacuum dried and reconstituted in 0.1% FA for mass spectrometry analyses. The color of the beads could be used to monitor the entire enrichment process: a bright yellow color indicated the binding of Leiker-linked peptides; a return to a white color occurred when the cleavage reaction was successful.
Comparison of bAL1 and bAL2 was carried out in two samples. For the first comparison, 50 mg of the ten-protein mixture was cross-linked with bAL1 or bAL2 at 4:1 protein-to-cross-linker mass ratio and then digested with trypsin. After mixing with the tryptic digest of an E. coli lysate containing 500 mg of total proteins, the digested Leiker-linked peptides were affinity purified with 20 ml of highcapacity streptavidin agarose. For the second comparison, 30 mg of ribosome was treated with bAL1 or bAL2 at 8:1, 4:1, or 2:1 protein-to-cross-linker mass ratios, digested, and enriched using 20 ml of high-capacity streptavidin agarose.
For the serial dilution experiment (Figure 3), 200 mg of the ten-protein mixture was treated with 50 mg of bAL1 at RT for 1 hr. After quenching, the proteins were precipitated and digested with trypsin. Four equal aliquots of this digest were either not diluted to serve as a control (1:0) or diluted with the tryptic digest of a non-cross-linked E. coli lysate at 1:1, 1:10, or 1:100 (w/w) ratio. Each mixture was enriched with 200 ml of pre-washed streptavidin agarose.
CXMS analysis of purified ribosomes and the immunoprecipitated exosome complex 30 mg of ribosome was treated with bAL2 at 8:1, 4:1, or 2:1 protein-to-cross-linker mass ratios. 40 ml of the exosome complex sample (1 mg/ml) was incubated with 0.25 ml of 40 mg/ml bAL2 at RT for 1 hr. 20 ml of high-capacity streptavidin agarose was used to enrich Leiker-linked peptides in each sample.
Negative staining of E. coli 70S ribosome 70S ribosomes were negatively stained with 0.2% uranyl acetate. Carbon coated grids were first glow-discharged to increase the surface hydrophilicity using a Harrick Plasma cleaner. 4 mL aliquots of 70S ribosomes (~10 nM) were placed on grids for about 1 min, and excessive liquid was absorbed by filter paper. After that 0.2% uranyl acetate was applied on the grid for about 1 min and absorbed using filter paper. The grids were air-dried and examined using an FEI Tecnai Spirit BioTwin microscope (FEI, Hillsboro, OR) (120 KV) at 49,000Â magnification.
CXMS analysis of E. coli and C. elegans cell lysates E. coli or C. elegans lysates prepared as described previously Zhao et al., 2015) (1 mg of total proteins) were treated with 250 mg bAL1 at RT for 1 hr, in 300 ml reactions; NH 4 HCO 3 was added to quench the reactions. Proteins were precipitated and digested with trypsin. After centrifugation in a bench top centrifuge at top speed for 30 min and filtering with a 50-kD cutoff filter, the digested peptides were brought to a volume of 3 mL with 2% ACN, 20 mM HEPES, pH 8.2; the pH was adjusted to 10.0 with ammonia prior to high-pH reverse phase separation on an Xtimate column (10Â250 mm) packed with 5 mm C18 resin (Welch Materials, China) at a flow rate of 2 mL/min. A 70 min gradient was applied as follows: 0-6% B in 10 min, 6-40% B in 40 min, 40-100% B in 10 min, 100% B for 10 min (A = 4% ACN, 5 mM NH 4 COOH, pH 10, B = 80% ACN, 5 mM NH 4 COOH, pH 10). A total of 39 two-min fractions were collected, and then combined into 9-11 fractions of similar shades of color judging by naked eyes. These pooled samples were evaporated to 200-300 ml volumes before Leiker-linked peptides were enriched with 50 ml of high-capacity streptavidin beads from each sample. For the ribo-free lysates, 3 mg of proteins were cross-linked with 0.75 mg bAL2 at RT for 1 hr, and subjected to tryptic digestion and fractionation as described above.
C. elegans mitochondria were prepared as described previously (Shen et al., 2014), and the CXMS analysis was performed as described above except with two differences: 3.2 mg of total proteins was used as the starting material and the collected fractions were pooled into 5 fractions.

Quantitative CXMS analysis of the L7Ae-RNA complex
In the forward and reverse labeling experiments, 0.7 nmol of RNA-free L7Ae was treated with [d 0 ]-bAL2 and [d 6 ]-bAL2, respectively; an equal amount of L7Ae was pre-incubated with 1 nmol of the 65 nt H/ACA RNA at 4˚C for 30 min and then treated with [d 6 ]-bAL2 and [d 0 ]-bAL2, respectively. An equal amount of BSA was spiked into each cross-linking reaction. A 4:1 protein-to-cross-linker ratio (w/w) was used for each reaction. The cross-linking reactions were quenched with ammonium bicarbonate after 1 hr at RT. The paired [d 0 ]-and [d 6 ]-bAL2 samples were combined and subjected to acetone precipitation and trypsin digestion.

Quantitative CXMS analysis of E. coli lysates
In the forward labeling experiment, the log phase and the stationary phase cell lysates (100 mg proteins each) were cross-linked with 50 mg of [d 0 ]-bAL2 and 50 mg of [d 6 ]-bAL2, respectively, with 1 mg of BSA spiked into each sample. After 1 hr at RT, the two reactions were quenched, mixed, precipitated with acetone, and digested with trypsin. The reverse labeling experiment was conducted in the same way except that the log phase lysate was cross-linked with [d 6 ]-bAL2 and the stationary phase lysate was cross-linked with [d 0 ]-bAL2.

LC-MS/MS analysis
All protein samples were analyzed with an EASY-nLC 1000 system (Thermo Fisher Scientific, Waltham, MA) interfaced with a Q-Exactive mass spectrometer (Thermo Fisher Scientific). A two-column setup was used, consisting of a pre-column (100 mmÂ4 cm, 3 mm C18) with a frit at each end and an analytical column (75 mmÂ10 cm, 1.8 mm C18) with a 5 mm tip. For the Leiker-cross-linked samples after enrichment, typically one third of a reconstituted sample was injected and separated with a 65 min linear gradient at a flow rate of 300 nl/min as follows: 0-5% B in 2 min, 5-28% B in 41 min, 28-80% in 10 min, 80% for 12 min (A = 0.1% FA, B = 100% ACN, 0.1% FA). Slight modifications to the separation method were made for different samples. A 120 min gradient was used with a more gradual ramp to 28% buffer B. The Q-Exactive mass spectrometer was operated in data-dependent mode with one full MS scan at R = 70000 (m/z = 200), followed by ten HCD MS/MS scans at R = 17,500 (m/z = 200), NCE = 27, with an isolation width of 2 m/z. The AGC targets for the MS1 and MS2 scans were 3e6 and 1e5, respectively, and the maximum injection times for MS1 and MS2 were both 60 ms. For cross-linked samples, precursors of the +1, +2, +7 or above, or unassigned charge states were rejected; exclusion of isotopes was disabled; dynamic exclusion was set to 30 s.
For accurate mass analysis, 20 mg/ml of [d 0 ]-bAL2 or [d 6 ]-bAL2 in methanol was sprayed directly into a LTQ Orbitrap XL mass spectrometer (Thermo Fisher Scientific) operated in the negative mode with a spray voltage of 0.8 kV and a scan mass range of 150-1000 m/z.

Identification of cross-linked peptides with pLink
The Xcalibur raw data was converted to ms2 files using RawExtract (McDonald et al., 2004). Crosslinked peptides were identified using pLink software as described previously , with the following modifications Cross-linker was set to AL, bAL1, bAL2, [d 6 ]-bAL2, or BS 3 ; The minimum peptide length was 5 amino acids for lysate samples; oxidation on Met was set as a variable modification.
For the ten-protein mixture and ribosome complexes, the search databases consisted of the sequences of all of the proteins in question. The sequences were downloaded from NCBI or Uniprot.
Prior to the CXMS analysis of the exosome complex, LC-MS/MS analyses of digested, uncrosslinked samples were carried out to identify the proteins present in the samples. For protein identification, the precursors of +1 or unassigned charge states were rejected; MS2 spectra were searched against a S. cerevisiae protein database (downloaded from Uniprot on 2013-04-03) using ProLuCID2 (Xu et al., 2006) and filtered using DTASelect 2.0 (Tabb et al., 2002) with a spectral false identification rate 1% and a minimum of two identified peptides for each protein. A restricted database containing only the identified proteins (740 in total) was generated using Contrast 2.0 (Tabb et al., 2002). MS2 spectra from the cross-linked samples were then searched against this small database using pLink.
For the CXMS analysis of E. coli whole-cell lysates and ribo-free lysates, the sequences of the entire proteome of the K12 strain were downloaded from Uniprot on 2014-07-31 and used for searching.
For the CXMS analysis of C. elegans lysates, a database consisting of proteins identified from N2 C. elegans lysates generated with ProLuCID2 was used for searching (unpublished).
For the CXMS analysis of C. elegans mitochondrial proteins, a restricted database was constructed in a similar way as for the exosome complex.
Quantification of cross-linked peptides with pQuant pQuant  was used to determine the heavy-to-light ratio (H/L) of each cross-link. The regression model Y = aX +e is used to calculate peptide ratios. The optimal value of a is solved using the least-squares method asâ ¼ X X j Y j = X X j X j , and the estimated standard error ofâ iŝ is then normalized to the interval of [0,1], and is named confidence score. If the value ofŝ is zero (the highest confidence), there is no interference signal; if the value is one (the lowest confidence), the peptide signals are inundated by interference signals. For each identified cross-link spectrum, an extracted ion chromatogram (EIC) was constructed for each isotopic peak of the light-and heavy-labeled precursor. The H/L ratios can be calculated based on the monoisotopic peak, the most intense peak, or the least interfered peak of each isotopic cluster as specified by users. For L7Ae, all options yielded similar results and we selected the monoisotopic peak. For the highly complex samples of the log phase versus stationary phase E. coli, the option of the least interfered peak performed the best. For each cross-link, every identified spectrum (E-value < 0.001) will lead to a H/L ratio and a confidence score s, because pQuant conducts the quantitation independently starting from each identified MS/MS spectrum. In most cases, the H/L ratios obtained for the same precursor ion are close, but sometimes the ratios may differ due to multiple reasons including local interference signals or a sudden decrease followed by recovery in signal intensity in the chromatograms, all of which can affect the calling of the start and the end of a chromatogram peak. Ratios with s values above or equal to 0.5 were discarded. The median H/L ratio obtained from the remaining spectra was assigned to a cross-linked lysine pair or a mono-linked lysine residue as the final quantification value. If a cross-link had no assigned ratio value (i.e., none of its ratios had a s value less than 0.5), we manually evaluated the reconstructed ion chromatograms to assess abundance changes. All the ratios were normalized against the median value of all the H/L ratios belonging to the spiked-in BSA.