Mass Spectrometry-Based Structural Analysis of Cysteine-Rich Metal-Binding Sites in Proteins with MetaOdysseus R Software

ABSTRACT: Identification of metal-binding sites in proteins and understanding metal-coupled protein folding mechanisms are aspects of high importance for the structure-to-function relationship. Mass spectrometry (MS) has brought a powerful adjunct perspective to structural biology, providing information that ranges from metal-to-protein stoichiometry to quaternary structure. Currently, the different experimental and/or instrumental setups usually require the use of multiple data analysis software packages, and in some cases, these lack some of the main data analysis steps (MS processing, scoring, identification). Here, we present a comprehensive data analysis pipeline that addresses charge-state deconvolution, statistical scoring, and mass assignment for native MS, bottom-up, and native top-down with emphasis on metal−protein complexes. We have evaluated all of the approaches using assemblies of increasing complexity, including free and chemically labeled proteins, from low- to high-resolution MS. In all cases, the results have been compared with common software, showing that MetaOdysseus outperformed them.


■ INTRODUCTION
Mass spectrometry (MS) has become a powerful approach for the structural and physicochemical characterization of metal−protein complexes. 1−11 Among all amino acid residues responsible for d-block metal-ion binding, cysteine (Cys) is the most frequent residue in structural sites that build and stabilize the protein structure. 12 It participates in many catalytic and chemical/posttranslational reactions, which is explained by its high nucleophilic character. The identification of the Cys redox state and metal coordination position has been the subject of many studies focusing on the structural characterization of metalloproteins and the mechanisms of their folding. 6,12,13 Of all techniques used for that purpose, electrospray ionization mass spectrometry (ESI-MS) is currently one of the most informative. In particular, native ESI-MS has become an indispensable tool for the speciation of metal−protein complexes, especially when studying spectroscopically silent metal ions (e.g., Zn2+). 1−3,5,7,10,11 In around 10% of the proteome, Zn2+ ions play catalytic, structural, and regulatory roles. 2,12−16 Cysteine chemical labeling and improved ESI sources, together with recent advances in the transmission technology of Orbitrap and time-of-flight (TOF) mass analyzers, have permitted the elucidation of the Cys oxidation state in the native-like structure. 2−6,14,17 The different reactivity of a free Cys residue and a metal-binding Cys enables its identification with the use of an appropriate electrophile. 14 A good example of a protein class that is intensively studied in terms of structure and transient species is the metallothionein (MT) family, whose representatives were used in this study. 1−4,11,14 In particular, mammalian MTs are low-molecular-mass (∼6 kDa) proteins involved in the homeostatic control of Zn2+ and Cu+ ions, controlling metal fluctuations in the cytosol, nucleus, and mitochondria. 18−20 They comprise at least a dozen MT proteins and represent one of the major cellular zinc buffering systems. 21
The process of data analysis involves three steps. (i) Preprocessing of the raw mass spectra through baseline subtraction, smoothing, and peak picking. The spectra should be noise-filtered while preserving all of the information. (ii) Charge deconvolution. All of the peak series have to be identified, and the correct charge state should be assigned. (iii) Identification of the deconvoluted mass with a peptide/protein search engine. The identifications need to be accompanied by a scoring system. A great number of deconvolution algorithms have been developed, mainly classified into peak assignment or fitting (simulation) algorithms. Peak assignment algorithms perform peak detection and assign a charge state to each peak using multiple charge states (e.g., MaxEnt, 22,23 AutoMass, 24 Z-score 25) or isotope peak spacing (e.g., THRASH, 26 Z-score 25). Simulation algorithms simulate multiple hypothetical mass and charge distributions, and the mass−charge distribution that best fits the data is taken to be the most likely (e.g., CHAMP, 27 Massign, 28 UniDec, 29 PeakSeeker 30).
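The use of multiple charge states for peak assignment can be illustrated with a short worked example. The Python sketch below is not taken from any of the cited algorithms; it assumes positive-mode ions of the form [M + zH]z+ and two peaks already known to be adjacent charge states. The function names and the 6000 Da test mass are hypothetical.

```python
PROTON = 1.007276  # proton mass in Da


def mz_of(mass, z):
    """m/z of a [M + zH]z+ ion for a neutral mass M (assumed ion form)."""
    return (mass + z * PROTON) / z


def charge_from_adjacent(mz_low_charge, mz_high_charge):
    """Charge of the peak at mz_low_charge, given the neighbouring peak
    carrying one more charge (mz_high_charge < mz_low_charge)."""
    return round((mz_high_charge - PROTON) / (mz_low_charge - mz_high_charge))


def neutral_mass(mz_value, z):
    """Neutral mass recovered from one peak and its assigned charge."""
    return z * (mz_value - PROTON)


# Hypothetical 6000 Da species observed at z = 4 and z = 5.
peak_4 = mz_of(6000.0, 4)
peak_5 = mz_of(6000.0, 5)
z = charge_from_adjacent(peak_4, peak_5)
mass = neutral_mass(peak_4, z)
```

Solving the two m/z equations for z gives z = (m/z_high − m_p)/(m/z_low − m/z_high), which is what `charge_from_adjacent` evaluates before rounding to an integer.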
While peak assignment algorithms are fast and simple, they are often not suitable for complex spectra because of the difficulty in peak picking, and they do not produce quantitative results. On the other hand, simulation algorithms are quantitative but computationally intensive. Regardless of the algorithm used for ESI deconvolution, the quality of the results is judged with a scoring system that is usually algorithm-dependent. Recently, a universal score (UniScore) based on the evaluation of critical steps during deconvolution has been released, promoting an unbiased scoring system for comparing multiple deconvolution algorithms. 31
Protein/peptide database search and statistical modeling tools are well established in peptide-based bottom-up proteomics. Efforts to find new peptide-spectrum matches (PSMs) or to integrate search engines with scoring models have resulted in a variety of software in addition to the commercial offerings Mascot 32 and SEQUEST. 33 These include freely available programs such as MetaMorpheus, 34 MS-GF+, 35 MaxQuant, 36 OMSSA, 37 and X!Tandem. 38 Commonly, bottom-up MS/MS spectra are scored without charge deconvolution.
Nevertheless, protein-based top-down MS generates a high degree of complexity that presents a challenge for data analysis, which is more pronounced for native top-down MS. 3,39 The sequential analysis pipeline requires peak picking, charge deconvolution, proteoform identification, and quantification steps. However, most of the available software focuses exclusively on a single step of the pipeline, and packages that include all steps are scarce. 40 For instance, algorithms for charge deconvolution based on isotope spacing include MS-Deconv 41 and TopFD, 42 in addition to the first high-resolution top-down deconvolution algorithm, THRASH, 26 and the commercially available SNAP 43 algorithm by Bruker Daltonics and Xtract 44 algorithm by Thermo Scientific. Following mass spectra deconvolution, common data analysis tools for identification are ProSight PTM, 45 MS-TopDown, 46 MS-Align+, 47 and TopPIC. 42 A tool that incorporates all of the steps is MASH Suite Pro, 48 which accommodates the enhanced THRASH (eTHRASH), MS-Deconv, 41 and TopFD 42 algorithms for spectral deconvolution and MS-Align+ 47 or TopPIC 42 for database searching.
Here, to address the deficits of the available MS-based software, we developed MetaOdysseus, a comprehensive data analysis pipeline in the R language that integrates native MS, bottom-up, and native top-down protein analysis with a special focus on metal−protein complexes. The characterization of metal−protein complexes, involving determination of the metal−protein stoichiometry, identification of metal-binding sites, or characterization of metal-coupled folding events, requires combining multiple MS-based approaches. 2,3,7,14 Although much effort has been put into developing new proteomic software, less attention has been paid to integrating native MS and proteomics into a single software package. Moreover, the commonly available software presents deficits for the data analysis of metal−protein complexes at the peptide and intact protein levels. For native MS analysis of intact protein complexes, although many deconvolution algorithms have been released, a global pipeline for the data analysis is still missing; thus, we incorporated several preprocessing schemes; two deconvolution algorithms, one based on peak assignment and the other on mass spectra simulation; the UniScore metrics, promoting the global evaluation of the deconvolution; and a targeted mass assignment engine. We demonstrated that deconvolution results (measured as the UniDec metric) obtained with MetaOdysseus agreed closely with those of the UniDec algorithm. For peptide-level MS analysis, MetaOdysseus surpassed Mascot and MS-GF+, providing a higher number of PSMs and equal or higher sensitivity at the same false discovery rate (FDR) and at every precision value. For native top-down analysis, we compared deconvolution results obtained with MetaOdysseus and with the eTHRASH algorithm included in MASH Suite Pro and demonstrated that MetaOdysseus yielded a higher true-positive rate and a lower false-negative rate.
To benchmark the identification step, we compared our software with two common free tools for matching a single candidate sequence against MS spectra: ProSight Lite 45 and MS-Align+. While ProSight Lite requires the definition of the exact position of PTMs, MS-Align+ identifies unexpected PTMs and thus appears capable of top-down analysis of metalloproteins. However, two major limitations hamper its use for this purpose. (i) The Cys protection parameters are fixed, with or without modification, and only two types of modification are allowed, carbamidomethylation or carboxymethylation. Many studies focused on metal-binding proteins have used other reagents for Cys modification and, in the case of metal-binding Cys proteins, incomplete Cys modification is commonly observed. 3,4,6 (ii) The maximum number of unexpected or blind PTMs is two, which limits a proteoform to binding a maximum of two metal ions and excludes a large part of metalloproteins from the analysis. Although both ProSight Lite and MetaOdysseus provided similar identifications, MetaOdysseus is a time-saving solution that runs automatically, in contrast to the manual pinpointing required with ProSight Lite. Altogether, this research shows that MetaOdysseus provides a global solution for the analysis of metal−protein complexes.

Overview of Software
The software presented herein was implemented in the R programming language and incorporates a set of algorithms for processing, charge-state deconvolution, target mass assignment, and statistical evaluation for different MS experiments (native MS, bottom-up, and native top-down) and ionization sources (ESI and MALDI).
File Format. Raw or preprocessed MS data (centroided and/or smoothed and background-removed) can be imported as xy, mzXML, or mzML files.
Preparation of Mass Spectra (Optional). The mass spectra are smoothed, and the background is removed with asymmetric least squares. For smoothing, we incorporated three algorithms: (1) Savitzky−Golay filters; 49 (2) a finite-difference approach; 50 and (3) the discrete wavelet transform. 51
Peak Detection (Optional). Peak detection is performed with a continuous wavelet transform-based algorithm. 52 It convolves the mass spectra with a Mexican hat wavelet, and the true peaks are registered as local maxima.
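As an illustration of the smoothing step, the following Python sketch applies a five-point quadratic Savitzky−Golay filter using the standard convolution coefficients (−3, 12, 17, 12, −3)/35. It is a minimal stand-in rather than the MetaOdysseus implementation (which is written in R); the function name and the choice to leave the two points at each edge unsmoothed are assumptions made for brevity.

```python
# Standard 5-point quadratic/cubic Savitzky-Golay smoothing coefficients.
SG5_COEFFS = (-3, 12, 17, 12, -3)
SG5_NORM = 35


def savgol5(intensities):
    """Smooth a list of intensities with a 5-point Savitzky-Golay filter.

    The first and last two points are returned unchanged (assumed edge rule).
    """
    y = list(intensities)
    smoothed = y[:]
    for i in range(2, len(y) - 2):
        window = y[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG5_COEFFS, window)) / SG5_NORM
    return smoothed


# A single-point spike is attenuated and spread over its neighbours.
spike = [0, 0, 0, 0, 35, 0, 0, 0, 0]
smoothed_spike = savgol5(spike)
```

Because the filter is a local quadratic fit, it reproduces any quadratic signal exactly while damping point-to-point noise, which is why it tends to preserve peak shape better than a plain moving average.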
Charge-State Deconvolution. For ESI-MS spectra, where multiple charge states are expected for each species, we focused on addressing the problem of assigning mass series through two algorithms, one based on peak assignment, adapted from the Z-score 25 algorithm, and the other based on spectra simulation. The peak assignment algorithm can be used for high-resolution mass spectra, and thus, a mass with only one charge state can also be deconvoluted. The simulation algorithm was incorporated to account for lower-resolution peaks that are broadened by incomplete desolvation or adduct formation, which are common in native mass spectra. 30
Mass Assignment. A theoretical in silico mass pattern is generated from the protein sequence, and the mass assignment problem is solved with a simple cross-correlation mass-to-mass matching algorithm. For the generation of theoretical masses, we added features addressing metalloprotein studies, that is, the incorporation of several Cys-labeling reagents and metal ions with partial labeling modification.
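The idea of a theoretical mass library with partial Cys labeling and variable metal load can be sketched as follows. This Python example is an assumption-laden illustration, not the matching algorithm used in MetaOdysseus: it uses a nearest-mass lookup within a ppm tolerance instead of cross-correlation, average mass shifts (Zn2+ replacing two thiol protons, carbamidomethylation by IAM), and a hypothetical apo-protein mass of 6000 Da.

```python
from itertools import product

# Average-mass shifts in Da (modelling assumptions for this sketch).
ZN_SHIFT = 65.38 - 2 * 1.008  # Zn2+ bound with loss of two protons
IAM_SHIFT = 57.05             # carbamidomethylation of one Cys


def build_library(apo_mass, max_zn, n_cys, label_shift=IAM_SHIFT):
    """Theoretical masses for every (number of Zn, number of labels) pair."""
    return {
        (n_zn, n_label): apo_mass + n_zn * ZN_SHIFT + n_label * label_shift
        for n_zn, n_label in product(range(max_zn + 1), range(n_cys + 1))
    }


def assign(observed_mass, library, tol_ppm=50.0):
    """Return ((n_zn, n_label), error_ppm) for the closest library entry
    within tol_ppm, or None when nothing matches."""
    best = None
    for key, theoretical in library.items():
        error_ppm = (observed_mass - theoretical) / theoretical * 1e6
        if abs(error_ppm) <= tol_ppm and (best is None or abs(error_ppm) < abs(best[1])):
            best = (key, error_ppm)
    return best


library = build_library(apo_mass=6000.0, max_zn=7, n_cys=20)  # hypothetical protein
hit = assign(6000.0 + 19 * IAM_SHIFT, library)
```

Enumerating all label counts from zero to the number of Cys residues is what allows incompletely labeled species to be assigned.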
Statistical Scoring. A scoring scheme was incorporated for each MS experiment to evaluate the deconvolution results and/or the mass assignment. For native MS, we integrated the UniScore for the simulation algorithm, which evaluates the quality of the deconvolution with a universal score that is software-independent. 31 After the mass assignment, each feature is reported with its mass error, DScore, UScore, FScore, CSScore, and MScore along with the global UniScore. For MALDI-MS analysis of enzymatically digested proteins (peptide-mass fingerprinting, PMF), a PMF score is computed based on similarity descriptors, 53 and its confidence interval is calculated with bootstrapping. An empirical p-value is computed by producing a list of decoy protein sequences via permutation and calculating a histogram of decoy PMF scores after interrogation against the experimental spectrum. For bottom-up MS/MS, the quality of single peptide-spectrum matches (PSMs) is evaluated with a peak match probability score computed from a binomial distribution, and the probability is log-transformed. The problem of multiple PSMs is addressed by estimating the false discovery rate with the target-decoy approach (TDA). 54 We used a conservative TDA with a separate target-decoy database search, building the decoy database from a randomized target protein sequence. For native top-down MS/MS, the quality of the charge-state deconvolution algorithm based on peak assignment was defined as the total score $S = \sum s_{m/z}$, where $s_{m/z} = (\log(\mathrm{SNR}))_{m/z}$ for each charge state $z_i$. A single-spectrum confidence score of the mass assignment is computed as the p-value obtained via permutation tests for the log-transformed probability from a binomial distribution. In addition to the summary given here, a detailed description of each step is provided in the Supporting Information.
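A binomial peak match score of the kind summarized above can be sketched in a few lines of Python. The exact formula used in MetaOdysseus is given in the Supporting Information and is not reproduced here; the version below is one common form, assuming n theoretical fragment ions, k of them matched, and a per-ion random-match probability p, with the tail probability log-transformed as −10·log10. All names and parameters are illustrative.

```python
from math import comb, log10


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more random matches."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def peak_match_score(k, n, p):
    """Log-transformed random-match probability; larger means less likely by chance."""
    return -10 * log10(binomial_tail(k, n, p))


# Hypothetical spectrum: 10 theoretical ions, 10% chance of a random match each.
weak = peak_match_score(3, 10, 0.1)
strong = peak_match_score(8, 10, 0.1)
```

The log transform turns very small probabilities into a convenient positive score, so that PSMs can be ranked and thresholded directly.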

Mass Spectrometry Measurements
ESI-MS analysis was performed on a Bruker Maxis Impact (Bruker Daltonik GmbH, Bremen, Germany) calibrated with a commercial ESI-TOF tuning mix (Sigma-Aldrich). Proteins were buffer-exchanged into 20 mM ammonium acetate using PD-10 desalting columns (Sigma-Aldrich). Protein concentrations were adjusted to 5−15 μM, and samples were directly injected by a syringe pump at a flow rate of 2 μL/min. The following MS parameters were used for direct-injection electrospray ionization mass spectrometry: collision energy, 10 eV; collision RF, 600 Vpp; end plate offset potential, 500 V; capillary potential, 2500 V; nebulizer gas (N2) pressure, 1.5 bar; drying gas (N2) flow rate, 4 L/min; drying temperature, 180 °C. The transfer parameters were: funnel 1 RF, 400 Vpp; isCID energy, 0 eV; multipole RF, 400 Vpp. The quadrupole ion energy was 5 eV, and the low mass for transmission was set at m/z 300. The mass range was set from 50 to 3000 m/z. Native top-down experiments were performed with a data-dependent MS/MS acquisition (DDA) with auto MS/MS with a collision energy of 200 eV, or with a data-independent acquisition (DIA) using broad-band CID (bbCID) with increasing collision energy of 50−200 eV.
A MALDI-TOF/TOF MS Bruker UltrafleXtreme instrument (Bruker Daltonik GmbH, Bremen, Germany) was used for MALDI-MS analysis. The MALDI instrument was controlled by flexControl v 3.4 and flexAnalysis v 3.4 software. 2,5-Dihydroxybenzoic acid (DHB) was used as the MALDI-TOF matrix for protein analysis. The saturated matrix solution was prepared in 30% acetonitrile and 0.1% trifluoroacetic acid. MALDI-MS analysis of proteins was performed in linear positive mode in the 2−20 kDa range. The mass spectra were typically acquired by averaging 2000 subspectra from a total of 2000 laser shots per spot. Laser power was set at 5−10% above the threshold. The calibration was done using a standard peptide and protein calibration mixture obtained from Bruker (Bruker Daltonik GmbH, Bremen, Germany).

Mass Spectrometry Data
Several MS datasets were obtained from different mass spectrometers to demonstrate the capabilities of MetaOdysseus: (1) native MS, native top-down MS/MS, and bottom-up MS/MS data acquired on a regular quadrupole time-of-flight instrument (Bruker Maxis Impact); (2) external native MS data acquired on an Orbitrap Eclipse Tribrid by Robinson's group; 55 (3) peptide-mass fingerprinting data acquired with a MALDI-TOF/TOF MS Bruker UltrafleXtreme instrument.

Cysteine Residues Labeling, Reconstitution and Trypsin Digestion of Metallothionein
Freshly SEC-purified apo-MT proteins (25 μM) were incubated with a freshly prepared 10 mM alkylating reagent (IAM, IAA, NEM, or ethyl iodoacetate) for 15 min in the dark in 100 mM (NH4)2CO3. The alkylation reaction was stopped by freezing on dry ice. All solutions and plastic tubes were previously degassed by purging with nitrogen. To eliminate excess alkylation reagent, samples were desalted with ZipTip μ-C18 (Merck Millipore) and eluted with 10 μL of Milli-Q water/ACN solution (50/50, v/v). To obtain Zn7MT, an appropriate molar equivalent of 500 μM ZnSO4 was added to the particular apo-MT. In-solution trypsin digestion was carried out with proteomics-grade trypsin (Sigma-Aldrich) freshly prepared in 1 mM HCl at a 1:20 trypsin/protein (w/w) ratio at 37 °C for 15 min. The reaction was stopped by freezing the sample on dry ice. Aliquots were immediately analyzed or stored at −20 °C.

Data Availability
All data generated or analyzed during this study and an R script that follows the results section are deposited at https://github.com/ManuelPerisDiaz/Data_MetaOdysseus. The MetaOdysseus R package is deposited in GitHub and can be freely downloaded from the following link: https://github.com/ManuelPerisDiaz/MetaOdysseus.
We benchmarked MetaOdysseus by applying it to systems of increasing complexity, from an isotopically resolved single protein to an unresolved peak distribution of a mix of proteins: (1) high- and low-resolution native ESI-MS spectra for the ∼6 kDa metal-binding proteins apo-MT2 and Zn7MT2; (2) high- and low-resolution native ESI-MS spectra for Zn7MT2 chemically labeled at Cys with iodoacetamide (IAM) or N-ethylmaleimide (NEM); (3) an external dataset of native MS spectra of 17−175 kDa soluble proteins (myoglobin, BSA, ADH, and Herceptin) analyzed with an Orbitrap Eclipse Tribrid. 55 The R-based software may be used sequentially for processing, charge-state deconvolution, and mass assignment of mass spectra, or any of its modules may be used independently. For example, deconvolution can be performed externally, and the mass assignment problem solved within MetaOdysseus. A qualitative comparison of the preprocessing algorithms incorporated in MetaOdysseus is exemplified for low-resolution apo-MT2, which showed that the finite-difference penalty followed by asymmetric least squares provided the mass spectra with the lowest background noise (Figure S2). Subsequently, deconvolution was successfully obtained in all of the cases, fitting the simulated component spectra to the experimental one with R² ranging from 0.85 to 0.99 (Tables S1 and S2). However, as has been demonstrated, R² is not a proper statistic to evaluate the quality of the fit because of the potential for overfitting the raw data. Thus, another scoring metric is required to confidently assess the quality of the deconvolution.
Here, we incorporated the recently presented universal score (UniScore), 31 defined as the weighted average of the deconvolution score (DScore) for each mass. DScore is calculated from four components that capture the fit of each charge state (UScore), the peak shape (MScore), the Gaussian charge-state distribution (CSScore), and the symmetry and resolution of the peaks (FScore). The UniScore obtained varied widely, from 38 to 97, in comparison to the narrower distribution of R² values. Both the broader UniScore distribution and the low correlation (Pearson's correlation coefficient 0.22) between R² and the UniScore indicate that the two metrics provide complementary information and that R² tends to overfit. Deconvolution results obtained with the simulation approach incorporated in MetaOdysseus were compared with those of the UniDec algorithm.
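How per-mass scores roll up into a global score can be made concrete with a small Python sketch. Two points here are assumptions rather than statements from the text: the four component scores are taken to lie on a 0−1 scale and to be combined by multiplication, and the weights in the weighted average are taken to be the intensities of the deconvolved masses. The function names and numbers are hypothetical.

```python
def dscore(uscore, mscore, csscore, fscore):
    """Per-mass deconvolution score (assumed: product of 0-1 components)."""
    return uscore * mscore * csscore * fscore


def uniscore(peaks):
    """Weighted average of DScores.

    peaks: list of (dscore, weight) pairs; weights assumed to be intensities.
    """
    total_weight = sum(weight for _, weight in peaks)
    if total_weight == 0:
        return 0.0
    return sum(score * weight for score, weight in peaks) / total_weight


# Hypothetical spectrum with one dominant, well-fitted mass and one weak mass.
peaks = [(dscore(0.9, 0.9, 1.0, 1.0), 3.0), (dscore(0.5, 0.8, 1.0, 1.0), 1.0)]
global_score = uniscore(peaks)
```

Under these assumptions a single weak component lowers the DScore of its mass, while the weighting keeps minor, poorly fitted masses from dominating the global score.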
In the second step, once the mass spectra were deconvolved, the masses and charges of the components were assigned. To do so, masses deconvoluted from the zero-charge mass distribution were interrogated against a theoretical ion library constructed from the protein sequence. For every mass deconvolved from the mass spectra, DScore, UScore, MScore, CSScore, and FScore are reported along with the identification and its mass error (Table S3). For example, the most intense deconvoluted mass for the apo-MT2 sample was assigned as apo-MT2, with a DScore of 0.75 and a global UniScore of 69 (Tables S1 and S3 and Figure S3). The charge-state series of this component showed [apo-MT2 + zH]z+ with z = 4−6 (Figures S4 and S3). The high UScore of 0.90 indicated that the signals corresponding to these charge states are mostly captured by the deconvoluted mass and do not overlap with adjacent charge states. However, the points lost from the MScore indicate fluctuations in the peak shape between the different charge states. The CSScore and FScore of 100 indicate that the charge-state distributions are well separated and symmetric. Another example is the high-resolution nMS of Zn7MT2 after reaction with the Cys-labeling reagent IAM (Figure 1A). The deconvolved neutral mass spectrum was scored with a UniScore of 62, and two masses were assigned as Zn0IAM19MT2 and Zn0IAM20MT2 with DScore values of 68 and 63, respectively (Figure 1B). In general, the distributions of R² and UniScore for the two software tools agree closely, as we did not find differences in the mean (Wilcoxon test p-values of 0.4 and 0.98, respectively) (Figure 1C,D and Table S4). These results demonstrate that MetaOdysseus provides a quantitative deconvolution and an accurate mass assignment algorithm for low- and high-resolution native MS.

Case 2: Deconvolution and Mass Assignment of MALDI-MS Spectra
The second case addressed is the peak picking of linear- and reflector-mode MALDI-TOF spectra and the mass assignment problem. Two experiments were used to evaluate the software: (1) analysis of intact and chemically Cys-labeled
In the second experiment, after mass deconvolution, the mass assignment algorithm produced a list of peptide masses and their mathematically possible Cys modifications. Tryptic peptides were identified with multiple IAM modifications (Table S6). For Cys-rich metalloproteins, the peptides generated likely harbor multiple Cys residues. For example, three peptides from the region [32−43] of apo-MT2 were identified, 5M [32−44], 4M [32−44], and 2M [32−44] (nM, where n indicates the number of modifications) (Table S6). The software positively identified peptide masses with isotopically well-resolved patterns and could also identify low-abundance signals above the background (Figure S7).
The quality of the deconvoluted and mass-assigned spectra for all 10 datasets was assessed with a global PMF score, 53 and its confidence interval was estimated with bootstrapping. The correlogram showed a low correlation between the mass coverage (MC) and the hit ratio (p < 0.005), and thus, in this case, the PMF score is mostly correlated with the hit ratio (Figure S8 and Table S7). For instance, a higher PMF score was obtained for Zn5MT3 than for Zn4MT3, even though the latter achieved higher mass coverage. To validate a particular PMF score, we constructed a list of decoy sequences via permutation of the protein sequence and computed a p-value (Figure S9). In all cases, the PMF score was statistically significant (p < 0.005).
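The decoy-based validation can be sketched generically in Python. The example below shuffles the residues of the target sequence to build decoys, scores each decoy with a user-supplied function, and reports an empirical p-value with the usual +1 correction so that it is never exactly zero. The scoring function is left abstract because the PMF score itself is not reproduced here; the function name, the number of decoys, and the toy score are hypothetical.

```python
import random


def empirical_p_value(sequence, score_fn, n_decoys=1000, seed=0):
    """Fraction of shuffled decoy sequences scoring at least as well as the target.

    score_fn: any callable mapping a sequence string to a numeric score,
    for example a PMF score against a fixed experimental spectrum.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    observed = score_fn(sequence)
    hits = 0
    for _ in range(n_decoys):
        residues = list(sequence)
        rng.shuffle(residues)
        if score_fn("".join(residues)) >= observed:
            hits += 1
    return (hits + 1) / (n_decoys + 1)


# Toy score for demonstration only: count of "CKC" motifs in the sequence.
toy_p = empirical_p_value("ACKCGSCKCAQGCK", lambda s: s.count("CKC"), n_decoys=500)
```

Shuffling preserves the amino acid composition of the target, so the null distribution reflects sequence order rather than composition.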
To compare the performance obtained with MetaOdysseus, we processed the datasets within BioTools software (Bruker Daltonics), performing peak picking with the averagine-based SNAP peak-finding algorithm and a targeted database search against the protein sequence with Sequence Editor software (Bruker Daltonics). Then, we calculated a confusion matrix for each dataset. Using all 10 datasets, we observed that MetaOdysseus obtained higher accuracy and precision with lower FDR (p < 0.05) (Figure 2). Moreover, we did not find differences in the F1 score, suggesting that MetaOdysseus performs equal to or better than BioTools.
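For reference, the metrics derived from a confusion matrix follow directly from the four counts. The Python sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not values from this study.

```python
def _ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positive and negative counts."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)  # sensitivity, true-positive rate
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "fdr": _ratio(fp, tp + fp),  # 1 - precision
        "fnr": _ratio(fn, tp + fn),  # 1 - recall
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


metrics = classification_metrics(tp=8, fp=2, tn=6, fn=4)  # illustrative counts
```

Note that FDR and precision are complementary, as are the false-negative rate and the true-positive rate, so reporting one of each pair fixes the other.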

Case 3: Deconvolution and Assignment of MS/MS Spectra
Bottom-Up Approach. A dataset of MS/MS spectra collected for enzymatically digested Zn0−7MT3 proteins was employed to demonstrate how MetaOdysseus solves the spectrum matching problem for peptide identification in the bottom-up approach. Given a set of MS/MS spectra (40) and a sequence database, we were able to obtain peptide-spectrum matches (PSMs) for 90% (36 out of 40) of the spectra (Table S8). To evaluate the quality of the PSMs, a simple peak match probability score computed from a binomial distribution was generated for each mass spectrum. Filtering out false positives at an FDR threshold of 1%, estimated with a target-decoy method, resulted in 33% of PSMs (Table S9 and Figure S10). We investigated how MetaOdysseus compares to Mascot and MS-GF+ in terms of their percentage of PSMs, sensitivity, specificity, and precision. MS-GF+ and Mascot achieved 80 and 70% of PSMs, which dropped to 15 and 10%, respectively, after filtering at 1% FDR (Figure 3 and Table S10). We found that at every  Figure 4C). MetaOdysseus can also be used for identification only, by importing already preprocessed mass spectra.
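The FDR filtering with a separate target-decoy search can be sketched as follows. This Python example estimates the FDR at a score threshold as the number of decoy PSMs divided by the number of target PSMs at or above that threshold, and then picks the most permissive threshold that satisfies the FDR limit. It is a generic illustration with hypothetical function names and scores, not the routine implemented in MetaOdysseus.

```python
def fdr_at_threshold(target_scores, decoy_scores, threshold):
    """Estimated FDR: decoy hits / target hits with score >= threshold."""
    n_target = sum(score >= threshold for score in target_scores)
    n_decoy = sum(score >= threshold for score in decoy_scores)
    return n_decoy / n_target if n_target else 0.0


def score_cutoff(target_scores, decoy_scores, max_fdr=0.01):
    """Lowest target score whose estimated FDR does not exceed max_fdr,
    or None if no threshold qualifies."""
    for threshold in sorted(set(target_scores)):
        if fdr_at_threshold(target_scores, decoy_scores, threshold) <= max_fdr:
            return threshold
    return None


# Hypothetical PSM scores from separate target and decoy searches.
targets = [10, 20, 30, 40, 50]
decoys = [5, 15, 25]
cutoff = score_cutoff(targets, decoys, max_fdr=0.01)
accepted = [score for score in targets if score >= cutoff]
```

With small datasets such as the 40 spectra used here, a 1% threshold effectively requires that no decoy scores above the cutoff, which explains the sharp drop in accepted PSMs after filtering.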
Since MS-GF+ only performs identification, without a peak picking step, we evaluated the identification ability of MetaOdysseus and MS-GF+ with an already preprocessed set of mass spectra. We observed that MetaOdysseus achieved equal or higher sensitivity at the same FDR (Figure S11A). The precision−recall curve showed that the sensitivity achieved with MetaOdysseus was similar or higher at every precision value (Figure S11B). The ROC curves showed that for a particular range of specificity, MS-GF+ is more sensitive; however, a similar accuracy measured as AUC (0.68) was obtained for both software tools (Figure S11C). In terms of the number of PSMs, MetaOdysseus achieved 24% against the 15% obtained with MS-GF+.
Native Top-Down Protein Characterization. Here, we report how MetaOdysseus solves the assignment process for a given set of MS/MS spectra of Zn7MT1e obtained by collision-induced dissociation experiments using different collisional activation energies. In general, the presence of the seven Zn2+ ions coordinating all of the 20 Cys residues provides a folded and stable structure. Mass deconvolution returned the charge states and masses for the metal−protein complexes present in the mass spectra (Table S11). Results were compared to the eTHRASH algorithm incorporated in MASH Suite Pro, considering manual inspection of the data as the reference method (Table S12). Deconvolution with MetaOdysseus provided a higher number of true positives (58 vs 17) and a lower number of false positives (11 vs 19) (Figure 5A). The calculated confusion matrix showed that MetaOdysseus provided higher TPR (0.98 vs 0.30) and lower FNR (0.38 vs 0.66). Following spectral deconvolution, we focused on protein identification. Isolation and fragmentation of the ion at m/z 1614.11, which corresponds to [Zn7MT1e]4+, yielded CID fragment ions corresponding to b17 and y31 containing one and three metal ions, where the charge is retained by the N- and C-terminus, respectively (Table S13).
Interestingly, the y31 fragment derived from backbone dissociation between the α- and β-domains contained three Zn2+ ions, pointing to the higher stability of the Zn−S bonds and folding of the α-domain. 14,56 On the other hand, fragment b17, with four Cys residues in the sequence, contained one Zn2+. Analysis of the CID fragments from the data-independent acquisition (DIA) experiment showed that, except for the common b17 fragment, most of them derived from the α-domain (C-terminus). This can be attributed to the higher thermodynamic stability and lower kinetic lability of the Zn−S bonds in the α-domain (Table S13). 14,56 We detected the fragment y36 corresponding to the Zn4Cys11 cluster within the α-domain, which suggests that four Zn2+ ions are coordinated in the α-domain, as is characteristic of mammalian MTs (Figure 5B). To these identifications, we assigned a statistical significance by first computing a score based on the binomial distribution probability and then estimating an empirical p-value via permutation tests (Figure 5C). We compared the database search with ProSight Lite, which requires manually pinpointing the position of the custom modification, here Zn2+. 58 Thus, we added from zero to seven Zn2+ ions to the N- and C-termini, which leads to a total of 49 possible solutions. This manual inspection validated the results found automatically with MetaOdysseus and did not detect any new feature.

Integration of the Native MS and Proteomics for a Comprehensive Data Analysis
The characterization of metal−protein complexes, including the metal−protein stoichiometry, mapping the metal-binding sites, or studying their metal-coupled folding, requires combining several experimental protocols. 2,3,7,14 The interplay between metal ions and proteins and its inherently dynamic nature challenge their investigation. In our recent study, we attempted to develop an experimental approach that combined native MS and proteomics approaches to elucidate the metal−protein stoichiometry and topology of different Zn7−xMT2 species. 2,14 Having prepared a metal−protein complex (e.g., Zn4MT2), the first step is to record mass spectra under nondenaturing conditions. The native-like structure is likely retained because of the kinetic trapping effect, which allowed us to determine the stoichiometry. Afterward, the protein was labeled with IAM, a Cys-alkylating reagent. In principle, one may determine the number of Cys residues coordinating metal ions by the simple premise that a metal-bound thiolate exhibits lower reactivity toward electrophiles than a free Cys residue.
Journal of Proteome Research pubs.acs.org/jpr Article
(Table S1); software parameters used in MetaOdysseus for the deconvolution results in Table S1 (Table S2); mass assignment of the deconvolved spectrum obtained with MetaOdysseus in Tables S1 and S2 (Table S3); deconvolution results obtained with UniDec for the samples assayed (Table S4); charge deconvolution overview (Figure S1); comparison of combined algorithms for ESI-MS spectrum processing performed over apo-MT2 (Figure S2); deconvolution scoring scheme based on UniDec (Figure S3); analysis of ESI-MS spectra for apo-MT2 and Zn7MT2 (Figure S4);  Figure S5); annotated MALDI-TOF-MS spectra of chemically labeled apo- and Zn7MT2 using the R package (Table S5); processed MALDI-TOF-MS spectrum for apo-MT2 and Zn7MT2 chemically labeled by a set of alkylation reagents (Figure S6); annotated peptide-mass fingerprint for IAM-labeled apo-MT2 (Table S6); peptide-mass fingerprint (PMF) of chemically labeled apo-MT2 and apo-MT3 obtained by MALDI-TOF-MS (Figure S7); scores obtained for annotated peptide-mass fingerprint for apo-MT2 and apo-MT3 (Table S7); correlogram for the PMF scores, hit ratio, sequence coverage, and mass coverage for the 10 datasets analyzed (Zn0MT2···Zn7MT2, Zn0MT3···Zn7MT2) with MALDI-MS (Figure S8); permutation tests to calculate the p-value from the null distributions for the PMF scores obtained for the decoy list scored against the experimental MALDI-MS spectrum for Zn0MT2···Zn7MT2 and Zn0MT3···Zn7MT3 proteins (Figure S9); peptide-spectrum matches (PSMs) for the dataset of MS/MS spectra collected for enzymatically digested Zn0−7MT3 proteins (Table S8); deconvolution and assignment of the bottom-up MS/MS spectra with MetaOdysseus (Table S9); evaluation of the relationship between the false positives (FP) and the score achieved for the peptide-spectrum matches, computed as a peak match probability score based on the probabilities obtained from a binomial distribution (Figure S10); peptide-spectrum matches (PSMs) obtained with MS-GF+ for the set of enzymatically digested Zn0−7MT3 proteins (Table S10); comparison between MetaOdysseus (red line) and MS-GF+ (black line) for the peptide-spectrum matches (PSMs) results obtained in terms of their sensitivity, specificity, and precision (Figure S11); mass deconvolution for the native top-down MS/MS obtained with MetaOdysseus using the peak assignment algorithm (Table S11); mass deconvolution for the native top-down MS/MS obtained with MASH Suite Pro using the eTHRASH algorithm (Table S12); and mass assignment for the deconvolved masses from the native top-down MS/MS obtained with MetaOdysseus (Table S13) (PDF)