Structure of Complement C3(H2O) Revealed By Quantitative Cross-Linking/Mass Spectrometry And Modeling

The slow but spontaneous and ubiquitous formation of C3(H2O), the hydrolytic and conformationally rearranged product of C3, initiates antibody-independent activation of the complement system that is a key first line of antimicrobial defense. The structure of C3(H2O) has not been determined. Here we subjected C3(H2O) to quantitative cross-linking/mass spectrometry (QCLMS). This revealed details of the structural differences and similarities between C3(H2O) and C3, as well as between C3(H2O) and its pivotal proteolytic cleavage product, C3b, which shares functional similarity with C3(H2O). Considered in combination with the crystal structures of C3 and C3b, the QCLMS data suggest that C3(H2O) generation is accompanied by the migration of the thioester-containing domain of C3 from one end of the molecule to the other. This creates a stable C3b-like platform able to bind the zymogen, factor B, or the regulator, factor H. Integration of available crystallographic and QCLMS data allowed the determination of a 3D model of the C3(H2O) domain architecture. The unique arrangement of domains thus observed in C3(H2O), which retains the anaphylatoxin domain (that is excised when C3 is enzymatically activated to C3b), can be used to rationalize observed differences between C3(H2O) and C3b in terms of complement activation and regulation.

that forms the arm of the puppeteer. MGs 1-5, LNK, and half of MG6 are contributed by the β-chain, whereas the remaining domains come from the α-chain.
Comparing the crystal structures of C3 and C3b revealed significant domain rearrangements between them (11). Most dramatically, the CUB arm swings away from the shoulders toward the "feet" of the puppeteer (supplemental Fig. S1). As a result, the TED (i.e. the puppet) rotates and is repositioned. This is accompanied by exposure and activation of the thioester group, allowing attachment of C3b to surface-borne nucleophiles. The crystal structure of C3(H2O) has not been reported. New binding sites for complement components and cell-surface receptors are created in both nascent C3b and C3(H2O) (7, 12-18). Both proteins bind factor B, which is subsequently cleaved to Bb. Importantly, both the resultant C3bBb and C3(H2O)Bb complexes are C3 convertases, generating further molecules of C3b and thereby stoking a positive-feedback loop.
Because C3(H2O) (unlike C3b) is a spontaneously arising product of C3 domain rearrangements and thioester hydrolysis, C3(H2O)Bb (rather than C3bBb) is the initiating convertase of the alternative pathway of complement activation. Thus the constitutive presence of C3(H2O) ensures the alternative pathway can be activated quickly and indiscriminately, allowing a rapid response to any cell not protected by the appropriate regulatory molecules such as factor H. Inappropriate regulation of complement activity is linked to many autoimmune, inflammatory and ischemia/reperfusion (I/R) injury-related diseases (19).
It has been shown that hydrolysis of the thioester in C3 alone does not necessarily result in transition to active C3(H2O) (20). Despite use of diverse methodologies (7, 9-13, 21-27), the remodeling of domains that underlies spontaneous formation of C3(H2O), and therefore triggers complement, is poorly understood. Current structural models of C3(H2O) rely on epitope-mapping (21), hydrogen-deuterium exchange (27), other biophysical solution studies (9) and negative-staining EM images (25). These indicate a "C3b-like" structure but do not provide direct evidence regarding placements of the ANA and TED relative to specific domains within the shoulders and body of the C3(H2O) molecule. It has been proposed that the ANA domain acts as a safety catch in native C3. Removal of the ANA triggers the dramatic structural transition into C3b (24). More knowledge of the C3(H2O) structure is required to test if the safety catch role of ANA (presumably displaced in C3(H2O) rather than removed, as in C3b) and subsequent domain reconfigurations are general mechanisms, relevant both to the spontaneous but rare hydrolytic C3 to C3(H2O) transition, and to the proteolytic cleavage-dependent but rapid C3 to C3b transition.
Further understanding of this event depends on the ability to elucidate, in solution, the dynamic processes whereby the domains of a protein molecule are reorganized, following a triggering event, to form a new stable arrangement. Quantitative cross-linking/mass spectrometry (QCLMS) using isotope-labeled cross-linkers (Fig. 2A) has emerged as a new approach with which to elucidate the details of protein conformational changes (28-31). In this approach, chemical cross-linking captures proximities between amino acid residues and the residues involved are identified by mass spectrometry. Quantitative comparison of the cross-linking results obtained for two different conformations of a protein allows the details of the conformational change to be elucidated. We have developed a workflow for QCLMS analysis (32). In our benchmark study, we used QCLMS to accurately reveal differences and similarities between C3 and C3b in terms of the spatial arrangements of their domains (32). In another application, this technique successfully revealed conformational changes involved in maturation of the proteasome lid complex (33).
Here we apply our QCLMS workflow, and an integrative modeling approach, to interrogate the unknown arrangement of domains in C3(H2O), a key component of the complement alternative pathway. We combined knowledge of the crystal structures of C3 and C3b with QCLMS data sets for C3(H2O), C3 and C3b. We thus generated structural models for the conformational transition of C3 to C3(H2O) that are consistent with other biophysical studies and with previously observed functional similarities and differences between these proteins.

After incubation (two hours) on ice, reactions were quenched with 10 μl of 2.5 M ammonium bicarbonate for 45 min on ice. For the monitoring of cross-linking, aliquots containing 5 pmol of cross-linked protein from each of the above six reactions were subjected to SDS-PAGE using a NuPAGE 4-12% Bis-Tris gel (Life Technologies, Carlsbad, CA) and MOPS running buffer (Life Technologies). The protein bands were visualized using the Colloidal Blue Staining Kit (Life Technologies) (Fig. 2B). Cross-linking reactions were repeated for "experiment II" as described for "experiment I".
For each of the four samples, a 20 μg (40 μl) aliquot was fractionated using SCX-Stage-Tips (36) with a small variation of the protocol previously described for linear peptides (37). In short, peptide mixtures were first loaded on a SCX-Stage-Tip in loading buffer (0.5% v/v acetic acid, 20% v/v acetonitrile, 50 mM ammonium acetate). The retained peptides were eluted in two steps, with buffers containing 100 mM ammonium acetate and 500 mM ammonium acetate, into two fractions. These peptide fractions were desalted using C18-Stage-Tips (38) prior to mass spectrometric analysis.
Experiment II-Preparation of four quantitation samples was repeated as described for "experiment I." A 4-μg (8 μl) aliquot of each sample was desalted using C18-Stage-Tips for mass spectrometric analysis without pre-fractionation.
Mass Spectrometric Analysis-Experiment I-SCX-Stage-Tip fractions were analyzed using a hybrid linear ion trap-Orbitrap mass spectrometer (LTQ-Orbitrap Velos, Thermo Fisher Scientific, Bremen, Germany) applying a "high-high" acquisition strategy. Peptides were separated on an analytical column that was packed with C18 material (ReproSil-Pur C18-AQ 3 μm; Dr. Maisch GmbH, Ammerbuch-Entringen, Germany) in a spray emitter (75-μm inner diameter, 8-μm opening, 250-mm length; New Objective, Woburn, MA) (39). Mobile phase A consisted of water and 0.5% v/v acetic acid. Mobile phase B consisted of acetonitrile and 0.5% v/v acetic acid. Peptides were loaded at a flow rate of 0.6 μl/min and eluted at 0.3 μl/min using a linear gradient going from 3% mobile phase B to 35% mobile phase B over 130 min, followed by a linear increase from 35% to 80% mobile phase B in 5 min. The eluted peptides were directly introduced into the mass spectrometer. MS data were acquired in the data-dependent mode. For each acquisition cycle, the mass spectrum was recorded in the Orbitrap with a resolution of 100,000. The eight most intense ions with a precursor charge state of 3+ or greater were fragmented in the linear ion trap by collision-induced dissociation (CID). The fragmentation spectra were then recorded in the Orbitrap at a resolution of 7,500. Dynamic exclusion was enabled with single repeat count and 60-s exclusion duration.
Experiment II-Non-fractionated peptide samples were analyzed using a hybrid quadrupole-Orbitrap mass spectrometer (Q Exactive, Thermo Fisher Scientific). Peptides were separated on a reversed-phase analytical column of the same type as described above. Mobile phase A consisted of water and 0.1% v/v formic acid. Mobile phase B consisted of 80% v/v acetonitrile and 0.1% v/v formic acid. Peptides were loaded at a flow rate of 0.5 μl/min and eluted at 0.2 μl/min. The separation gradient consisted of a linear increase from 2% mobile phase B to 40% mobile phase B in 169 min and a subsequent linear increase to 95% B over 11 min. Eluted peptides were directly sprayed into the Q Exactive mass spectrometer. MS data were acquired in the data-dependent mode. For each acquisition cycle, the MS spectrum was recorded in the Orbitrap at 70,000 resolution. The ten most intense ions in the MS spectrum, with a precursor charge state of 3+ or greater, were fragmented by higher-energy collision-induced dissociation (HCD). The fragmentation spectra were then recorded in the Orbitrap at 35,000 resolution. Dynamic exclusion was enabled, with single-repeat count and a 60-s exclusion duration.
Identification of Cross-linked Peptides-The raw mass spectrometric data files were processed into peak lists using MaxQuant version 1.2.2.5 (40) with default parameters, except that "Top MS/MS Peaks per 100 Da" was set to 20. The peak lists were searched against C3 and decoy C3 sequences using Xi software (ERI, Edinburgh) for identification of cross-linked peptides. Search parameters were as follows: MS accuracy, 6 ppm; MS2 accuracy, 20 ppm; enzyme, trypsin; specificity, fully tryptic; allowed number of missed cleavages, four; cross-linker, BS3/BS3-d4; fixed modifications, carbamidomethylation on cysteine; variable modifications, oxidation on methionine; modifications by BS3/BS3-d4 that are hydrolyzed or amidated on the other end. The linkage specificity for BS3 was assumed to be for lysine, serine, threonine, tyrosine and protein N termini. Identified candidates for cross-linked peptides were validated manually in Xi, after applying an estimated false discovery rate (FDR) of 3% for cross-linked peptides (Fischer and Rappsilber, submitted). We used a 3% FDR because this was the lowest threshold that still returned enough decoy matches for a meaningful FDR estimate. Only those cross-linked peptide pairs identified with fragment signals of both peptides in MS2 spectra were used to generate the list of identified cross-linked residue pairs and used for subsequent quantitation. Identification information of all quantified cross-linked peptides and the annotated best-matched MS2 spectra for these cross-linked peptides are provided in supplemental Table S1 and supplemental File S1.
Quantitation of Cross-link Data Using Pinpoint Software-Quantitation was carried out in each pair-wise comparison. For each cross-linked peptide, the elution peak areas of light (BS3 cross-linked) and heavy (BS3-d4 cross-linked) signals were retrieved using Pinpoint (Thermo Fisher Scientific) (32, 33). The error tolerance for precursor m/z was set to 6 ppm. Signals were only accepted within a window of retention time (defined in spectral library) ±10 min. Manual inspection was carried out to ensure the correct isolation of elution peaks and the use of correct isotope peaks for quantitation. "Match between runs" (41) was performed manually using Pinpoint software based on high m/z accuracy and reproducible chromatographic retention time for MS1 signals. Thus, signals of each identified cross-linked peptide were quantified in every quantitation sample. All transferred identifications were verified based on their MS1 signal pattern (either shown as doublet signals or singlet signals with a 4-Da mass shift between paired label-swapped replicas).
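The acceptance criteria described above (a 6 ppm precursor m/z tolerance and a retention-time window of ±10 min around the library value) can be illustrated with a minimal Python sketch. The function names and the example values are hypothetical and are not part of Pinpoint; the sketch only shows how such a filter could be expressed.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of an observed m/z, in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def signal_accepted(observed_mz, theoretical_mz, observed_rt, library_rt,
                    ppm_tolerance=6.0, rt_window_min=10.0):
    """Return True if a signal lies within both the m/z and the RT window.

    The defaults mirror the tolerances quoted in the text; retention times
    are assumed to be given in minutes.
    """
    within_mz = abs(ppm_error(observed_mz, theoretical_mz)) <= ppm_tolerance
    within_rt = abs(observed_rt - library_rt) <= rt_window_min
    return within_mz and within_rt


# Hypothetical example: a 5 ppm deviation, 8 min away from the library RT.
example = signal_accepted(1000.005, 1000.0, observed_rt=58.0, library_rt=50.0)
```

In this sketch a signal must pass both criteria; the manual inspection of peak isolation and isotope peaks described in the text is not modeled.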
Differences between the yields of cross-linked peptide pairs were expressed in terms of "signal fold-changes" (i.e. by how many-fold the two signals differed). The signal fold-change of a cross-linked peptide pair was calculated as log2(C3/C3(H2O)) or log2(C3b/C3(H2O)). Within each quantitation sample, signal fold-changes of all observed cross-linked peptide pairs were first normalized to their median. This corrected systematic errors introduced by minor differences in mixing ratios during sample preparation. Then the signal fold-change for a residue pair was calculated as the median of all its supporting cross-linked peptides. Only those cross-links that were consistently quantified in both paired replicas (i.e. with label-swapping) were accepted for subsequent structural analysis, and the average of signal fold-changes of a residue pair from replicated analyses was calculated. When a cross-linked residue pair was quantified in both experiment I and experiment II, the average of signal fold-changes in the two experiments was reported. All quantified cross-links are listed in supplemental Table S2. Within each pair-wise comparison, the "Significance A" test from the standard proteomics data analysis tool Perseus (version 1.4.1.2) (40) was carried out based on fold-change values to determine cross-links that are significantly enriched in either conformation. The following parameters were used for the test: "Side": both; "Use for truncation": p value; "Threshold value": 0.05.
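The first two steps of this calculation (median normalization of peptide-level log2 ratios, then taking the median per residue pair) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' script: "normalized to their median" is interpreted here as subtracting the median log2 ratio, and all identifiers and peak areas are hypothetical.

```python
import math
from statistics import median


def normalized_log2_ratios(peak_areas):
    """peak_areas maps a peptide-pair id to (area_state_a, area_state_b).

    Returns log2(a/b) for each peptide pair, shifted so that the median
    of the sample is zero (assumed meaning of median normalization).
    """
    raw = {pid: math.log2(a / b) for pid, (a, b) in peak_areas.items()}
    sample_median = median(raw.values())
    return {pid: value - sample_median for pid, value in raw.items()}


def residue_pair_fold_changes(normalized, peptide_to_residue_pair):
    """Median of the normalized ratios of all peptides supporting a residue pair."""
    grouped = {}
    for pid, value in normalized.items():
        grouped.setdefault(peptide_to_residue_pair[pid], []).append(value)
    return {pair: median(values) for pair, values in grouped.items()}


# Hypothetical peak areas for three cross-linked peptide pairs.
areas = {"pepA": (8.0, 1.0), "pepB": (2.0, 1.0), "pepC": (1.0, 1.0)}
mapping = {"pepA": "pair1", "pepB": "pair1", "pepC": "pair2"}
fold_changes = residue_pair_fold_changes(normalized_log2_ratios(areas), mapping)
```

Averaging over label-swapped replicas and over experiments I and II would follow as a plain mean of the per-replica values and is omitted here.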
Visualizing Cross-linking Data in Crystal Structures-PyMol (version 1.2b5) (42) was used to visualize cross-linking data. Cross-links were displayed in the crystal structures of C3 (PDB 2A73) and C3b (PDB 2I07) as solid lines between the Cα atoms of linked residues. In the case of a residue missing from the crystal structures, the nearest residue in the sequence was used for display purposes. The distance of a cross-linked residue pair in the crystal structures was measured between the Cα atoms. The theoretical cross-linking limit was calculated as the sum of side-chain lengths of cross-linked residues plus the spacer length (11.4 Å) of the cross-linker. An additional 2 Å were added for each residue to allow for residue displacement in the crystal structures. The following side-chain lengths were used for the calculation: 6.0 Å for lysine, 2.4 Å for threonine, 2.4 Å for serine and 6.5 Å for tyrosine. For example, for a lysine-lysine cross-link, this limit is 27.4 Å.
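The theoretical cross-linking limit defined above is simple arithmetic and can be written out as a short Python sketch. The side-chain lengths, spacer length and per-residue allowance are taken from the text; the function names and the one-letter residue keys are illustrative choices.

```python
# Side-chain lengths (in Å) quoted in the text, keyed by one-letter code.
SIDE_CHAIN_LENGTH = {"K": 6.0, "T": 2.4, "S": 2.4, "Y": 6.5}
SPACER_LENGTH = 11.4           # BS3 spacer, in Å
DISPLACEMENT_PER_RESIDUE = 2.0  # allowance per residue, in Å


def crosslink_limit(residue_a, residue_b):
    """Maximum Cα-Cα distance (Å) compatible with a cross-link."""
    return (SIDE_CHAIN_LENGTH[residue_a] + SIDE_CHAIN_LENGTH[residue_b]
            + SPACER_LENGTH + 2 * DISPLACEMENT_PER_RESIDUE)


def within_limit(ca_distance, residue_a, residue_b):
    """True if a measured Cα-Cα distance does not exceed the limit."""
    return ca_distance <= crosslink_limit(residue_a, residue_b)


lys_lys_limit = crosslink_limit("K", "K")  # 6.0 + 6.0 + 11.4 + 2.0 + 2.0
```

For a lysine-lysine pair this reproduces the 27.4 Å limit quoted in the text; a lysine-serine pair gives 23.8 Å by the same formula.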
Determining the Structures of C3(H2O), C3, C3b with Integrative Modeling Platform (IMP)-Our integrative approach for determining the structure of the three macromolecules proceeded as follows (43-48): (1) gathering of data, (2) representation of subunits and translation of the data into spatial restraints, (3) configurational sampling to produce an ensemble of models that optimally satisfies the restraints, and (4) analysis and assessment of the ensemble. The modeling protocol was scripted using the Python Modeling Interface (PMI), a library for modeling macromolecular complexes based on the open-source IMP package (46), release 2.5.0. The protein domains were represented as rigid or flexible, based on known protein structures. The cross-linking data (sets of 82, 75, and 85 cross-links for C3, C3b, and C3(H2O), respectively; supplemental Table S3) were encoded into a Bayesian scoring function that restrained the distances spanned by the cross-linked residues (44, 45). We also included the disulfide bond between residues 851 and 1491. Models of C3, C3b, and C3(H2O) were computed separately. The 200 best scoring models (i.e. solutions) for each protein were clustered to yield the localization density maps (further described below). The average precision of the solutions for C3(H2O) (average r.m.s. (root-mean-square) deviation with respect to the cluster center when superposed on the MG1-6_α′NT domain) was 15 Å (Fig. 6E, green bars). The precision of the domains varies because the intra-domain cross-links are not uniformly distributed. The Cα r.m.s. deviation of the solutions with respect to the crystallographic structures (PDB 2I07 and 2A73) was used to estimate the accuracy of the C3 and C3b models (Fig. 6E, blue and red bars). The C3 and C3b crystallographic structures were reproduced with an accuracy of ~11 Å.
Representation of Domains-The domains of the complement protein C3 were represented by beads arranged into either a rigid body or a flexible string on the basis of the available crystallographic structures (PDB 2I07 for C3b and PDB 2A73 for C3 and C3(H2O)) (Fig. 6A). The beads representing a structured region were kept rigid with respect to one another during configurational sampling (i.e. rigid bodies). Segments without a crystallographic structure or the linkers between the rigid domains were represented by a flexible string of beads, where each bead corresponded to a single residue.
Bayesian Scoring Function-The Bayesian approach estimates the probability of a model, given information available about the system, including both prior knowledge and newly acquired experimental data. The approach is extensively described elsewhere (44, 45, 49, 50). Briefly, using Bayes' theorem, we estimate the posterior probability of a model, given the data and the prior information, as proportional to the product of the likelihood (the probability of observing the data, given the model) and the prior (the probability of the model, given the prior information). To account for the presence of noisy cross-links, we parameterized the likelihood with a set of variables defined as the uncertainties of observing the cross-links in a given model (44, 45). One variable (subscript C) is the average uncertainty for cross-links that were consistently identified in all cross-linking experiments; the other (subscript I) is the average uncertainty for cross-links that were identified only once. For instance, cross-link 1049 TED -1409 MG8 was identified for C3 in both C3(H2O)/C3 and C3b/C3 quantification experiments, and therefore its uncertainty was estimated by the C variable. Conversely, the cross-link 203 MG2 -1049 TED was identified for C3 in C3b/C3 but not in C3(H2O)/C3, and its uncertainty was estimated by the I variable.
The prior terms comprised the excluded volume and the sequence connectivity, which are described elsewhere (44, 45). Moreover, the disulfide bond was implemented as a harmonic restraint on the distance between the two cross-linked residues.
Sampling Model Configurations-Structural models were obtained by Replica Exchange Gibbs sampling, based on Metropolis Monte Carlo sampling (50). This sampling was used to generate configurations of the system as well as values for the uncertainty parameters. The Monte Carlo moves included random translation and rotation of rigid bodies (4 Å and 0.03 rad, maximum, respectively), random translation of individual beads in the flexible segments (5 Å maximum), and a Gaussian perturbation of the uncertainty parameters. The sampling was run on 32 replicas, with temperatures ranging between 1.0 and 2.5. Two independent sampling calculations were run for each system, each one starting with a random initial configuration, for a total of 200,000 models per system. We divided this set of models into two ensembles of the same size to confirm sampling convergence (data not shown).
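The acceptance rules that underlie replica-exchange Metropolis Monte Carlo sampling can be sketched generically in Python. This is not the IMP implementation: the geometric spacing of the 32 temperatures between 1.0 and 2.5 is an assumption (the text gives only the range), the score is assumed to behave like an energy (lower is better), and all function names are hypothetical.

```python
import math


def temperature_ladder(n_replicas=32, t_min=1.0, t_max=2.5):
    """Geometrically spaced temperatures (the spacing is an assumption)."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** k for k in range(n_replicas)]


def metropolis_accept_probability(score_old, score_new, temperature):
    """Probability of accepting a proposed move within one replica."""
    delta = score_new - score_old
    return 1.0 if delta <= 0 else math.exp(-delta / temperature)


def swap_accept_probability(score_i, temp_i, score_j, temp_j):
    """Probability of exchanging configurations between replicas i and j."""
    exponent = (1.0 / temp_i - 1.0 / temp_j) * (score_i - score_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)


ladder = temperature_ladder()
```

A move that lowers the score is always accepted, whereas an uphill move is accepted with a probability that grows with temperature; this is what lets the high-temperature replicas cross barriers and pass favorable configurations down the ladder through swaps.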
Analysis of the Model Ensemble-For each ensemble, the solutions were grouped by k-means clustering on the basis of the r.m.s. deviation of the domains after the superposition of the MG1-6_α′NT domain (supplemental Fig. S3 in Supplemental File). For C3 and C3b, the models cluster into a single configuration. For C3(H2O), the models cluster into two configurations (A and B). We chose the cluster that best satisfied the cross-linking data. The precision of a cluster was calculated as the average r.m.s. deviation with respect to the cluster center (i.e. the solution with the lowest r.m.s. deviation with respect to the others). The solutions of a cluster, superposed on the MG1-6_α′NT domain, were converted into the probability of a volume element being occupied by a given domain (i.e. the localization density, Fig. 6B, 6C, 6D) (51).
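The definition of cluster precision given above can be made concrete with a small Python sketch. It assumes that the models have already been superposed on a common reference domain and that each model is a list of (x, y, z) bead coordinates; whether the center itself is included in the average is not stated in the text, so excluding it is an assumption of this sketch, as are all names used.

```python
import math


def rmsd(model_a, model_b):
    """R.m.s. deviation between two equally sized lists of (x, y, z) beads."""
    squared = sum((a - b) ** 2
                  for bead_a, bead_b in zip(model_a, model_b)
                  for a, b in zip(bead_a, bead_b))
    return math.sqrt(squared / len(model_a))


def cluster_center_and_precision(models):
    """Return (index of the cluster center, cluster precision).

    The center is the model with the lowest mean r.m.s. deviation to all
    other members; the precision is the mean r.m.s. deviation of the
    other members with respect to that center.
    """
    n = len(models)
    mean_to_others = [
        sum(rmsd(models[i], models[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]
    center = min(range(n), key=lambda i: mean_to_others[i])
    return center, mean_to_others[center]


# Hypothetical one-bead "models" placed along the x axis.
toy_models = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(3.0, 0.0, 0.0)]]
center_index, precision = cluster_center_and_precision(toy_models)
```

In the toy example the middle model is the center, and the precision is the mean of its distances to the other two models.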
Accession Codes-IMP modeling scripts and models are publicly available at http://salilab.org/Complement. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (52) (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository with the data set identifier PXD003486.

C3(H2O) Versus C3 Comparison Confirms Structural Rearrangements in C3(H2O)-To interrogate the unknown arrangement of domains in C3(H2O), we first compared QCLMS data for C3(H2O) and C3 (for which a crystal structure has been determined). Using our tried and tested workflow (Fig. 2A) (32, 33), a total of 94 cross-linked pairs of amino acid residues ("cross-links") were quantified for these two proteins (Fig. 3A). Ten cross-links were identified uniquely in C3 and 22 cross-links were found only in C3(H2O). Of 62 cross-links that were observed in both C3 and C3(H2O), one was significantly enriched in C3 compared with C3(H2O) and four were significantly enriched in C3(H2O) compared with C3 (determined using the "Significance A" test from Perseus version 1.4.2.1 with p < 0.05) (40). The observation of 37 (39% of the total) cross-links differing in occurrence or in enrichment between C3(H2O) and C3 reflects significant structural differences between these proteins. Yet these proteins must also have structural features in common, given that 57 out of 94 cross-links showed no significant differences in yields between the two proteins.
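The tallies reported for the C3(H2O) versus C3 comparison follow from simple bookkeeping, which the short Python sketch below reproduces from the counts given in the text. The function and key names are illustrative only.

```python
def summarize_comparison(unique_a, unique_b, shared, enriched_a, enriched_b):
    """Tally cross-links in a two-state comparison.

    unique_a / unique_b: cross-links seen in only one state.
    shared: cross-links seen in both states.
    enriched_a / enriched_b: shared cross-links significantly enriched in one state.
    """
    total = unique_a + unique_b + shared
    differing = unique_a + unique_b + enriched_a + enriched_b
    mutual = shared - enriched_a - enriched_b
    return {
        "total": total,
        "differing": differing,
        "mutual": mutual,
        "percent_differing": 100.0 * differing / total,
    }


# Counts from the text: 10 C3-unique, 22 C3(H2O)-unique, 62 shared,
# of which 1 is enriched in C3 and 4 are enriched in C3(H2O).
summary = summarize_comparison(10, 22, 62, 1, 4)
```

With these inputs the total is 94, the number of differing cross-links is 37 (about 39% of the total), and 57 cross-links remain as mutual.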
The 57 cross-links that are preserved upon the formation of C3(H2O) from C3 (referred to hereafter as "C3(H2O)-C3 mutual" cross-links) were inspected to identify structural features shared between the activated (C3(H2O)) and non-activated (C3) versions of this protein. Altogether 29 of these 57 cross-links, covering ten domains, connect residues within the same domains (Fig. 3B). It seems likely that these domains retain their structures following transition of C3 to C3(H2O). Furthermore, out of the total of 16 cross-links between domains within the β-chain, 15 are C3(H2O)-C3 mutual, suggesting that the respective β-chains share highly similar domain architectures. By contrast, the ten C3(H2O)-C3 mutual cross-links between domains in the α-chain account for only about 35% of the inter-domain cross-links identified in the α-chain. Thus, following transition to C3(H2O), the α-chain undergoes more conformational adjustments than the β-chain. Only three of twelve cross-links between the α-chain and the β-chain (the MG6β segment was treated as part of the β-chain for the purposes of this analysis) are C3(H2O)-C3 mutual. In the crystal structure of C3, all three are located at the interface between the α-chain and the β-chain, linking MG3 and MG7 (Fig. 3B).
From the above, it is clear that the cross-links that are not C3-C3(H2O) mutual, which indicate structural differences, occur predominantly in the α-chain. They are concentrated in two distinct regions. The first group of C3-specific or C3(H2O)-specific cross-links occurs in the vicinity of ANA and its neighboring domains in the "shoulder" region of the molecule (Fig. 3C). Inspection of cross-linking networks implied close proximities between these domains in both C3 and C3(H2O). Yet nine C3-unique (and enriched) cross-links and 17 C3(H2O)-unique (and enriched) cross-links were found within this region (Fig. 3C). This observation suggests that these domains undergo significant rearrangements while remaining intimately associated during the C3-C3(H2O) transition. Of these 26 non-mutual cross-links, five involve pairs of residues within the ANA domain and one involves a pair of residues within the MG8 domain. This observation implies that transition-associated conformational changes within these two domains accompany the more general domain rearrangements in the region. The second group of C3-specific or C3(H2O)-specific cross-links (Fig. 3C) revealed a drastic repositioning of TED relative to the main body of the molecule. In C3, TED was exclusively cross-linked to MG8 and MG3, at the "shoulder" of the molecule, whereas in C3(H2O) six unique cross-links suggest a newly established proximity between TED and MG1 at the "foot" of the molecule.
In summary, simple inspection of the QCLMS data reveals radical structural rearrangements in the α-chain of C3(H2O) compared with C3. The TED domain in particular was relocated. In addition, rearrangements at the "shoulder" region resulted in significant although less dramatic shifts in relative position of its component domains. In contrast, the conformation of the β-chain is largely preserved between C3 and C3(H2O). Such a structural transition is very similar to what was observed when comparing the crystal structure of C3 (24) with the crystal structure of its activated fragment form, C3b (11). Structural similarities between C3(H2O) and C3b were therefore investigated based on QCLMS data for both proteins.

C3(H2O) Versus C3b Comparison Confirms Similar Domain Architectures-In total, 92 cross-links were quantified in the C3(H2O) versus C3b comparison (Fig. 4A). The detection of 57 C3(H2O)-C3b mutual cross-links, distributed throughout the structure of C3b, indicates that these two activated derivatives of C3 have similar structures (Fig. 4B). As with the C3 versus C3(H2O) comparison, the arrangements of β-chain domains appear to be nearly identical, but C3(H2O) and C3b also share extensive structural features in their α-chains, and in the interfaces between their α-chains and β-chains.
Of the total of 21 C3b-C3(H2O) mutual cross-links connecting α-chain residues to β-chain residues, eleven are also present in C3, whereas ten are not. Of these ten C3(H2O)-C3b mutual, but non-C3, cross-links, five connect TED and MG1. This supports suggestions that the TED domain in C3(H2O) [...] (25, 54, 55). Our data additionally suggest differences between C3(H2O) and C3b with respect to the organization of the CUB and TED domains relative to MG1. A cross-link 75 MG1 -1293 CUB was detected in C3b but was absent in C3(H2O). This agrees with a previous SAXS-based observation, which suggested that TED and CUB are positioned, on average, further away from the MG domains in C3(H2O), compared with what is seen in the various crystal structures of C3b (9). In addition, we observed a C3(H2O)-unique cross-link 44 MG1 -1049 TED . To enable cross-linking between this pair of residues, a CUB-TED position with respect to MG1 is required that was not captured by cross-linking in C3b (Fig. 4D). Thus, although TED is mobile, relative to MGs 1-6, in both C3(H2O) and C3b, it appears to be more mobile in C3(H2O) (9).

[Figure legend fragment] Key: mutual to all three (48); C3b-unique (7); C3(H2O)-C3 mutual (14); C3(H2O)-C3b mutual [...] (Fig. 3C). G, Detailed view of six C3b-unique cross-links (red) within the "shoulder" region (four from Ser 727, the α′-chain N terminus in C3b) drawn on the C3b structure (PDB 2I07). These cross-links suggest structural features of C3b that are likely to be missing from C3(H2O).
Unique [...] Table S2). In total, 101 cross-links were quantified (Fig. 5A) in this comparison, including nine that were unique to C3(H2O) and 48 that were mutual to all three proteins (Fig. 5B). None of the cross-links fell into the theoretical category of being mutual to C3 and C3b, but absent in C3(H2O). This analysis provided valuable additional insights into the architecture of C3(H2O).
The three-way comparison of cross-links reinforces the aforementioned inferences from the C3(H 2 O)-C3 comparison regarding ANA. Seven C3(H 2 O)-C3 mutual cross-links within this domain suggest a broadly C3-like ANA structure in C3(H 2 O) (Fig. 5D). But three cross-links within the N-terminal ␣-helix of ANA are unique to C3 (Fig. 5E). This, together with data between cross-links from ANA to MG3 and MG7 (discussed further below), implies that a conformational rearrangement accompanies C3(H 2 O) formation that involves both relocation of the ANA domain and a structural change within the N terminus of the ANA domain.
The three-way comparison also supplements the previously described two-way comparisons by suggesting that C3(H2O) formation from C3, like C3b formation from C3, involves a movement of MG8 toward MG3 (Fig. 5C, 5D, 5H). Specifically, a cross-link between MG3 and MG8 (267 MG3 -1409 MG8 ) that had similar yields in both C3b and C3(H2O) was not detected in C3. In the C3 crystal structure, the Cα atoms of residues 267 and 1409 are 35 Å apart and therefore lie beyond the theoretical cross-linking limit for BS3 (27.4 Å). In C3b, these residues are 17.3 Å apart.
In the C3-to-C3b transition the approach of these residues is facilitated by proteolytic removal of the ANA domain. In contrast, for the case of the C3-to-C3(H2O) transition, the ANA domain remains attached and therefore this movement of MG8 toward MG3 would require it to be displaced. Indeed, three cross-links between ANA and MG7 occurred exclusively in C3(H2O), supporting a movement of the ANA domain toward MG7 in C3(H2O). All three of these cross-links involve residues that are separated in C3 by distances (54, 46, and 44 Å) substantially greater than the theoretical cross-link limit (Fig. 5C). Despite its movement toward MG7, ANA remains proximal to MG8 in C3(H2O), as evidenced by six MG8-ANA cross-links, four of which are C3(H2O)-unique (Fig. 5D). Furthermore, a C3(H2O)-unique cross-link between MG3 and ANA residues, 241 MG3 -670 ANA , effectively replaces the C3-unique cross-link between these two domains, 267 MG3 -650 ANA (Fig. 5D, 5E). This observation suggests a new ANA-MG3 contact region has formed in C3(H2O).
Taken together, this cross-linking network in C3(H2O) between ANA on the one hand, and MG7, MG8, and MG3 on the other, places the ANA domain close to a hypothetical region of C3(H2O) where MG7, MG8, and MG3 converge, as they do in the C3b crystal structure (Fig. 5F, 5H). This new arrangement could be accomplished by migration of MG8 toward MG3 and a shift of the MG8-MG7 interface.
In the case of C3b formation, comparison of the crystal structures implies that the α′-NT segment relocates from its position in C3 to the other side of the molecule. Four cross-links from Ser 727 in C3b supported this relocation. But none of these cross-links were observed in C3(H2O). Instead, the relative position of ANA and α-NT in C3(H2O) may be inferred to be similar to that in C3, on the basis of three C3/C3(H2O)-mutual cross-links between residues in ANA and α-NT (Fig. 5G). Presumably, the presence of the ANA domain in C3(H2O) prevents the relocation of the α′-NT segment to the opposite side of the molecule.
Moreover, the observation of C3-unique cross-link 882 MG7 -1539 C345C and C3b-unique cross-links 1479 Anchor -1573 C345C and 1346 MG8 -1475 Anchor implies that in C3(H2O), the arrangement of the neck and the head (supplemental Fig. S1) is identical to that of neither C3 nor C3b.
In summary (Fig. 5I), inspection of QCLMS data for a three-way comparison allows us to conclude that C3(H2O) adopts a C3b-like conformation in terms of the relative arrangement of its TED and CUB domains. The presence of the ANA domain at the N terminus of the α-chain restricts the rearrangements possible in C3(H2O), thus resulting in a unique domain architecture within the α-chain that is different from the ones in C3 and C3b. We next sought to build a structural model of C3(H2O) based on the structures of C3 and C3b and the differences between them in terms of intramolecular cross-links.
Integrative Modeling of C3(H 2 O)-In addition to structural inferences drawn from manual inspection and interpretation of cross-links, we used IMP to compute models from the data (46). Our QCLMS data included two key observations that allowed us to model the 3D structure of C3(H 2 O). First, the 48 cross-links that are mutual to C3, C3b, and C3(H 2 O) confirmed that the structures of individual domains, together with the architecture of the β-chain, are largely preserved during the structural transitions that accompany C3 activation. Hence, when modeling the structure of C3(H 2 O), we were able to treat individual domains, and the entire key ring-like core structure contributed by the β-chain (see Fig. 6A), as rigid bodies with known structures (from the crystal structure of C3). Second, a total of 85 quantified cross-links (supplemental Table S3) provided high-confidence distance restraints between pairs of residues in C3(H 2 O), which allowed individual rigid bodies to be assembled into a 3D model of C3(H 2 O). Precise cross-linked sites were further assessed by manually inspecting the supporting fragmentation spectra, considering only lysine, serine, threonine, and tyrosine residues and the protein N termini as possible sites. If multiple possible sites were present in a peptide, they were disambiguated with the help of backbone fragmentation events and, in some cases, using the chromatographic behavior of the cross-linked peptides as supporting information (examples shown in supplemental Fig. S2). To assess our ability to build accurate 3D models in this way, we also built cross-link-based models of C3 and C3b using the same approach and compared these against their crystal structures.
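The rigid-body treatment described above can be illustrated with a minimal sketch of a single random move: a domain's coordinates are rotated about their centroid around a random axis and then translated, so that all intradomain distances are preserved. This is not the IMP interface and not the authors' sampling code; the function name, step sizes, and toy coordinates are assumptions chosen only for illustration.

```python
import math
import random


def random_rigid_move(coords, max_shift=5.0, max_angle=math.radians(30), rng=random):
    """Rotate points about their centroid by a random angle around a random
    axis (Rodrigues' formula), then apply a random translation.
    Step sizes are illustrative assumptions, not values from the article."""
    n = len(coords)
    centroid = [sum(p[k] for p in coords) / n for k in range(3)]

    # Draw a random unit vector as the rotation axis.
    while True:
        axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(a * a for a in axis))
        if norm > 1e-9:
            break
    ux, uy, uz = (a / norm for a in axis)

    angle = rng.uniform(-max_angle, max_angle)
    c, s = math.cos(angle), math.sin(angle)
    shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]

    moved = []
    for p in coords:
        x, y, z = (p[k] - centroid[k] for k in range(3))
        dot = ux * x + uy * y + uz * z
        # Cross product axis x v.
        cx, cy, cz = uy * z - uz * y, uz * x - ux * z, ux * y - uy * x
        # Rodrigues: v*cos + (axis x v)*sin + axis*(axis.v)*(1 - cos)
        rx = x * c + cx * s + ux * dot * (1 - c)
        ry = y * c + cy * s + uy * dot * (1 - c)
        rz = z * c + cz * s + uz * dot * (1 - c)
        moved.append((rx + centroid[0] + shift[0],
                      ry + centroid[1] + shift[1],
                      rz + centroid[2] + shift[2]))
    return moved


# Toy three-point "domain"; coordinates are arbitrary.
random.seed(1)
domain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
moved_domain = random_rigid_move(domain)
```

Because the move is a pure rotation plus translation, only the distances between different rigid bodies change, and those are the distances that the cross-link restraints act on.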
The models of C3, C3b, and C3(H 2 O) relied on the cross-link data sets, the crystallographic structures of domains in C3, the known disulfide bonds, the known primary structure, and the excluded volume, encoded in a Bayesian scoring function (Experimental procedures). The domains were represented by rigid bodies connected by flexible linkers (Fig. 6A). We sampled the models by generating random translations and rotations of the rigid bodies, followed by analysis of the best-scoring models (44-48). The resulting models of C3 and C3b (Fig. 6B, 6C) satisfied all of the input cross-link-derived restraints. Their architectures corresponded well to the respective crystal structures (Fig. 6E), as reflected in the accuracy (Cα r.m.s. deviation of the solutions with respect to the crystallographic structures) of ~11 Å in both cases. In the case of C3(H 2 O), the solutions with the best scores fell into two distinct clusters (A and B) corresponding to two alternative configurations of domains MG8 and ANA with respect to the other "shoulder" domains (supplemental Fig. S3A, S3B). Cluster B better satisfied the cross-linking data (supplemental Fig. S3C, S3D). The solutions in cluster B have a precision of 15 Å and satisfied 95% of the input cross-link-derived restraints. The four cross-links that were not satisfied were 44 MG1 -1181 TED , 44 MG1 -1195 TED , 267 MG3 -1049 MG8 and 727 α-NT -1567 C345C (Fig. 6F). 44 MG1 -1181 TED and 44 MG1 -1195 TED could reflect the aforementioned mobility of TED with respect to the MG1-6 core. 267 MG3 -1049 MG8 and 727 α-NT -1567 C345C may reflect conformational changes within domains that were not allowed for in our rigid body-based modeling approach. For example, residue 1409 MG8 lies within a βαβ-βαα motif that exhibits different conformations between the structures of C3 (PDB 2A73) and C3b (PDB 2I07).
The α-NT segment (727-745) connecting the ANA and MG6 domains appeared as a flexible loop in the C3 crystal structure yet was treated as a rigid body, together with the MG1-6 domains, for modeling purposes. It is also possible that the C345C domain is flexible relative to the shoulder region. Overall, the C3(H 2 O) models agree well with the inferences based on manual inspection of cross-links (Fig. 5I) discussed earlier.
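The two figures of merit quoted above, the fraction of satisfied cross-link restraints and the Cα r.m.s. deviation from a reference structure, can be sketched as follows. The 35 Å Cα-Cα cutoff is the satisfaction criterion given in a figure legend of this article. The function names and toy coordinates are hypothetical, and the r.m.s. deviation is computed without superposition, on the assumption that both coordinate sets are already in a common frame; the article does not state how the models were aligned.

```python
import math


def satisfied_fraction(coords, cross_links, cutoff=35.0):
    """coords maps residue number -> (x, y, z) of its C-alpha atom.
    A cross-link counts as satisfied when the distance is below cutoff."""
    violated = [
        (i, j) for i, j in cross_links
        if math.dist(coords[i], coords[j]) >= cutoff
    ]
    return 1 - len(violated) / len(cross_links), violated


def ca_rmsd(model, reference):
    """C-alpha r.m.s. deviation between two equally ordered coordinate
    lists, assuming they already share a frame (no superposition)."""
    if len(model) != len(reference):
        raise ValueError("coordinate lists must have equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(model, reference))
    return math.sqrt(total / len(model))


# Toy data (arbitrary coordinates, hypothetical residue numbers).
toy_coords = {1: (0.0, 0.0, 0.0), 2: (10.0, 0.0, 0.0),
              3: (0.0, 40.0, 0.0), 4: (0.0, 0.0, 30.0)}
toy_links = [(1, 2), (1, 3), (1, 4), (2, 4)]
fraction, violated = satisfied_fraction(toy_coords, toy_links)

# Arithmetic check on the reported values: 4 of 85 restraints violated.
percent_satisfied = round(100 * (85 - 4) / 85)
```

With four of 85 restraints violated, the satisfied fraction is 81/85, which rounds to the 95% reported for cluster B.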

DISCUSSION
The low-rate spontaneous activation of C3 occurs via a concerted process consisting of the hydrolysis of a thioester bond and a putative rearrangement of protein domains. This process is responsible for the ubiquitous and constitutive presence of C3(H 2 O) in plasma. It is the presence of C3(H 2 O) that maintains the alternative pathway of complement in "tickover mode", which is essential for the remarkable, near-instantaneous, response of complement to infection or danger first noted more than 100 years ago (56,57). Unlike in the cases of C3 and C3b, no crystal structure of C3(H 2 O) has been reported and therefore the structural basis for spontaneous C3 activation is unproven.
In the current study we utilized QCLMS to reveal structural differences and similarities between C3(H 2 O), its progenitor C3, and its functional analogue C3b. This allowed us to infer the domain architecture of C3(H 2 O). We also combined QCLMS with structural knowledge, from crystallography, about the domains of C3 to build a computational model of the 3D structure of C3(H 2 O) with a precision of about 15 Å.
In QCLMS, pairs of amino acid residues within peptides that are tethered by a bi-functional cross-linker are identified using mass spectrometry. A potential source of error here is the possibility of misidentifying which residue within a peptide participates in the cross-link. A lead peptide-spectrum match (PSM) is included in the supplement for every linked residue pair, and these show no sign that this error is abundant. The impact of such errors on our findings is likely to be negligible for several reasons. First, our primary concern is modeling the spatial arrangement of protein domains (of between 70 and 300 residues) within a large (~180 kDa) multidomain structure; hence misallocation, within a few residues, of a small number of cross-linked sites will not have significant consequences. Second, our findings are based on multiple, mutually corroborating cross-links. Third, a recent study using a similar approach reported that randomly altering cross-linked sites within an 11-residue window had few repercussions (58). Moreover, to validate our procedure, we generated cross-link-derived models of C3 and C3b structures with an accuracy (compared with crystal structures) of about 11 Å. These results demonstrate the capacity of QCLMS to produce a set of self-consistent and highly informative distance restraints for proteins in solution.
[Figure legend fragment: Green and orange circles are satisfied and violated cross-links, respectively (a satisfied cross-link is defined when the Cα-Cα distance is below 35 Å (44)).]
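The site-misassignment test mentioned above (randomly altering cross-linked sites within an 11-residue window) can be sketched as below. This is one possible reading of that procedure, shifting each site by up to five residues in either direction, and is not the code of the cited study. The function name and the `seq_len` value are hypothetical placeholders; the two residue pairs are cross-links named in the text.

```python
import random


def jitter_sites(links, seq_len, window=11, rng=random):
    """Shift each cross-linked residue by a random offset within a window
    centred on the reported site, clamped to the range 1..seq_len."""
    half = window // 2
    jittered = []
    for i, j in links:
        new_i = min(seq_len, max(1, i + rng.randint(-half, half)))
        new_j = min(seq_len, max(1, j + rng.randint(-half, half)))
        jittered.append((new_i, new_j))
    return jittered


# Two cross-links from the text; seq_len is a placeholder upper bound.
random.seed(0)
reported = [(44, 1181), (727, 1567)]
perturbed = jitter_sites(reported, seq_len=2000)
```

Re-running a modeling calculation on many such perturbed restraint sets, and comparing the resulting domain arrangements, is the kind of check that would show whether residue-level misassignment matters at domain-level resolution.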
We have thus provided experimental evidence that C3(H 2 O) adopts a C3b-like spatial arrangement of TED and CUB domains relative to the unchanging key ring-like core structure of MGs 1-6 (Fig. 4B). We further show that the rearrangement of the remaining domains (ANA, MG7, MG8, Anchor, and C345C) results in a unique conformation at the shoulder, neck, and head regions of C3(H 2 O), which may be considered intermediate between those of C3 and C3b. We observed cross-links between TED and MG1 that are mutual to C3b and C3(H 2 O) but are not compatible with a single conformation of either protein. These cross-links imply that TED is mobile relative to the core of the protein. This observation agrees with several other strands of evidence for TED mobility in C3b (25,54,55) and shows that there is a similar level of TED mobility in C3(H 2 O).
Our integrative model of C3(H 2 O) explains previously reported observations (8,9,21,25,27). A hydrogen/deuterium-exchange study reported that significant repositioning or reorientation of the CUB and TED domains accompanies the formation of C3(H 2 O) from C3. Images of C3(H 2 O) obtained by negative-stain transmission electron microscopy show a predominance of C3b-like conformers in which TED had migrated from the shoulders of the molecule to its feet. Moreover, the similar structures of C3(H 2 O) and C3b explain their functional similarities. Both can bind factor H, and both act as a platform for binding of factor B and the subsequent factor D-mediated cleavage of factor B to Bb, although C3(H 2 O) is less effective in this role than C3b (see below) (59).
To explain the QCLMS data, a previous structural model of C3(H 2 O) based on EM images needs to be adjusted. In that model the ANA domain at the N terminus of the α-chain was proposed to migrate, together with the α′-NT-equivalent segment, from the MG8-MG3 side of the C3 structure to the opposite, MG7, side in C3(H 2 O) (25) (as does the nascent α′-NT of C3b). In contrast, our data indicate that the α′-NT-equivalent segment does not migrate across the structure to become exposed on the other side of C3(H 2 O). This implies that this segment of C3(H 2 O) would not contribute to the binding site for FH and FB as α′-NT does in C3b. These potential differences in the FH-binding and FB-binding surfaces (Fig. 5I) are consistent with the reportedly lower affinity of factor B for C3(H 2 O) compared with C3b (59), and may also pertain to the lower efficacy of factor H in deactivation of the C3(H 2 O)Bb complex (60). Taken together, our data place the ANA domain at the contact point of the MG3, MG7, and MG8 domains (Fig. 5H, 5I, 6D), and on the opposite face of the molecule to the TED. Interestingly, the EM images (25) of C3(H 2 O) in complex with a Fab that binds to an epitope within the ANA domain support the ANA location in our model. It has been hypothesized that the ANA domain of C3 works as a molecular safety catch; its excision by proteolysis removes steric constraints on the domain rearrangements required for C3b formation (24). In agreement with this hypothesis, our data show that in C3(H 2 O) the ANA domain is displaced (Fig. 5H, 5I, 6D), and that this displacement has an effect on the surrounding domains similar to that of excision. Interestingly, an inactive C3(H 2 O)/C3 intermediate has been reported, with a hydrolyzed thioester, which resembles native C3 in overall shape (20,25). Taken together, these observations imply that it is the displacement of ANA that serves as the trigger for the dramatic structural rearrangements that accompany formation of active C3(H 2 O).
Our analysis of C3(H 2 O) has demonstrated that a hitherto unknown structure can be determined by QCLMS data and modeling, widening the application range of this technology.