1 Introduction

Proteomics is a powerful platform for studying both single proteins and complex protein samples. Combining gel- and chromatography-based separation techniques with subsequent mass spectrometry (MS)-based analysis and bioinformatics approaches makes it possible to address questions from all areas of medicine and basic science. In particular, biological processes and pathways can be examined in greater detail. The key technology for protein identification in biological samples is MS. Improvements in MS sensitivity, resolution, and mass accuracy, advances in high-performance liquid chromatography (HPLC) efficiency, progress in software development, and increasing computing power have allowed the transition from “what” to “what and how much” by introducing MS-based protein quantification techniques.

Depending on the research question and the sample at hand, quantification of proteins on a global level as well as accurate quantification of individual proteins can be performed. MS-based proteome-wide analysis of protein changes between different conditions (e.g., healthy versus diseased) makes it possible to uncover pathological mechanisms on the phenotypic level. In addition, the effects of a specific stimulus can be quantified, ranging from changes in the amount of a single protein or its defined post-translational modification (PTM) to the proteome-wide kinetics of the same modification between different stages of the cell cycle. Accurate and reliable quantification of proteins and their PTMs plays a decisive role in the search for new disease-related biomarkers for better diagnostics, prediction, and treatment.

The variety of questions being asked has driven the development of a variety of quantitative MS techniques, which can generally be classified along four dimensions (Fig. 1). According to the study aim and scope, two main strategies are used: (i) untargeted (global) quantification of hundreds or thousands of proteins for protein profile comparison (Chaps. 10, 11, 14–25) and (ii) targeted (single- or several-component) quantification of only a few components (Chaps. 12, 13, 26), which are selectively isolated from a sample and quantified. Depending on the quantification level, a protein-centric (top-down) or peptide-centric (bottom-up) approach can be applied.

Fig. 1 Overview of the selection of MS-based protein quantification strategies, based on the study scope (I); quantification level (II); methodology, i.e., label-based or label-free (III); and the data that will be available for further statistical analysis (IV)

According to the underlying methodology, MS-based quantification can be further divided into two subgroups: (i) label-based quantification utilizing stable isotope labels incorporated within the peptides/proteins (Chaps. 10, 11, 14–16, 18–20, 26, 27) and (ii) label-free quantification (Chaps. 17, 21–25), in which the sample retains its native isotope composition. Label-based technology comprises artificial labeling of peptides or proteins, which introduces a predictable mass difference between different experimental conditions. Depending on the way of tagging, chemical, metabolic, and enzymatic labeling are distinguished. Furthermore, the methodologies for stable isotope tagging can generally be divided into two main groups: post-harvest methods (Chaps. 10, 11, 13, 15, 16, 18, 26, 27) and metabolic labeling methods (Chaps. 14, 19, 20), which involve the incorporation of an isotopic label into the protein while the sample is still metabolically active. Depending on the information they provide, quantification methods are classified as (i) relative (Chaps. 10, 11, 14–22, 24, 25, 27) or (ii) absolute (Chaps. 12, 13, 26). Relative quantification yields a quantitative protein ratio or relative change by comparing the amount of a single protein or of whole proteomes between samples. Absolute quantification, in turn, provides information on the absolute amount or concentration of a protein within a sample. A schematic overview of the main classification of MS-based protein quantification and of quantification strategy selection is provided in Fig. 1. This chapter provides a general overview of MS-based protein quantification (summarized in Table 1) and of the main strategies and methodologies known at the time of writing.

Table 1 Methods for relative and absolute quantification

2 General Principles for MS-Based Quantification

MS comprises the ionization of analytes, the separation of the resulting ions based on their mass-to-charge ratio (m/z), and the subsequent detection of these ions [1, 2]. Among a wide variety of ionization techniques, electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI) are commonly used in proteomics research, as they enable MS analysis of highly polar and large molecules. ESI can be directly combined with separation techniques, e.g., HPLC. MALDI, in contrast, has the advantage of producing singly charged ions of peptides and proteins, minimizing spectral complexity [1,2,3]. The sensitivity and resolution of a mass spectrometer strongly depend on the ability of the mass analyzer to separate ions effectively. The most commonly used mass analyzers for accurate protein quantification are quadrupole, time of flight (ToF), and Orbitrap. In its basic operation with continuous sample introduction, the mass spectrometer continuously acquires mass spectra in the full-spectrum (or full-scan MS1) mode. In this case, a three-dimensional data array, defined by time, m/z, and ion intensity (counts), is acquired [1, 2]. Coupling several mass analyzers in one mass spectrometer allows two or more sequential separations of ions, with fragmentation occurring in between, and is referred to as tandem mass spectrometry [1, 2]. In most tandem mass spectrometry experiments (known as MS/MS or MS2), the first mass analyzer is used to isolate a precursor ion, which then undergoes fragmentation in a collision cell to yield product ions that are sorted and weighed in a second analyzer. The number of steps can be increased to yield MSn experiments (where n refers to the number of generations of ions being analyzed) [2]. Analysis of MS-acquired spectra allows not only determination of the mass of molecular analytes and elucidation of their structure/sequence but also their quantitative analysis.
In the following sections, the most important analytical approaches in quantitative MS-based proteomics are discussed.

2.1 Global and Targeted Quantification

For untargeted proteomics experiments, data-dependent acquisition (DDA) (see Chaps. 10, 11, 14–18, 21, 22, 27) or data-independent acquisition (DIA) (see Chaps. 17, 23–25) can be used. Currently, most liquid chromatography-coupled tandem mass spectrometry (LC-MS/MS) approaches rely on DDA, which comprises the selection of the most abundant (Top N) precursor ions from a full MS1 overview scan for further fragmentation and acquisition of the respective MS2 spectra [3,4,5,6]. Data derived from each MS2 scan can be analyzed with a database search algorithm [7,8,9]. DDA typically yields thousands of protein identifications together with the quantitative information. However, the selection of peptide precursors is stochastic; if too many peptide species co-elute and appear in a single MS1 scan, DDA samples only the most abundant peptides, missing low-abundance ones [7]. Consequently, different subsets of peptides may be selected for fragmentation in different samples, resulting in high variation between replicates, missed identifications of low-abundance peptides, and thus a reduced number of quantifiable proteins [10,11,12]. Moreover, quantification based on DDA depends on the analysis of the chromatographic MS1 peak area, which is particularly susceptible to interferences, especially in complex samples [13, 14]. In spite of these drawbacks, its flexibility, scope of detection, and relatively simple setup and analysis still make DDA the preferred method within the proteomics community. Moreover, DDA allows relative quantification of peptides between samples through a variety of labeling techniques as well as by label-free proteomics [8, 15, 16].
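The Top N selection that underlies DDA can be illustrated with a minimal sketch. The function name, the intensity threshold, and the example scan are hypothetical and chosen only for illustration; real acquisition software additionally applies rules such as dynamic exclusion and charge-state filtering, which are omitted here.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def select_top_n(ms1_peaks: List[Peak], n: int, min_intensity: float = 0.0) -> List[Peak]:
    """Return the n most intense MS1 peaks at or above min_intensity, most intense first."""
    candidates = [peak for peak in ms1_peaks if peak[1] >= min_intensity]
    return sorted(candidates, key=lambda peak: peak[1], reverse=True)[:n]


# Hypothetical MS1 scan containing five co-eluting precursors.
scan = [
    (445.12, 2.0e6),
    (512.30, 8.5e5),
    (623.84, 4.1e4),
    (701.40, 3.3e6),
    (822.95, 9.0e3),
]

selected = select_top_n(scan, n=3)
print(selected)
```

With N = 3, the two least intense precursors in this scan are never fragmented, which mirrors the under-sampling of low-abundance peptides described above.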

In DIA, a set of predetermined sequential mass isolation windows is used to send all precursor ions within the same mass window for simultaneous fragmentation and analysis [17]. This approach allows more reproducible and accurate protein identification and quantification than DDA (see Note 1) [17,18,19,20]. As all ionized peptides within a defined mass isolation window are fragmented and subsequently analyzed [12], the identification and quantification of all precursor peptides is theoretically possible. DIA-based quantification is performed at the MS2 level by extraction of fragment ion chromatograms, which are less susceptible to interferences than MS1-based extracted ion chromatograms [17, 21]. However, the large number of fragment ions derived from different peptides within the same selection window prohibits analysis with a classical database search strategy. Moreover, as DIA acquires nearly complete MS2 data, the direct correlation between a precursor and its fragment ions is lost, resulting in the need for more complex data analysis algorithms (see Chap. 32). Most commonly, spectral libraries containing information on the elution time of each peptide and its fragment ions are used to infer the precursor peptide-fragment connection and thus allow peptide and protein identification [22,23,24]. Owing to the complexity of the data, special software tools, e.g., Spectronaut, OpenSWATH, Skyline, PeakView [25,26,27,28], or others (Chap. 32), are required. A spectral library is usually generated by a preliminary in-depth DDA analysis of the same samples on the same instrument (see Chap. 23), which requires extra time and sample. Furthermore, a peptide that is not present in the spectral library cannot, in principle, be quantified. However, the use of a larger spectral library does not always increase the number of identifications from DIA data and may also lower quantification accuracy [20, 29].
A trend toward increasing coefficients of variation (CVs) with increasing library size was reported and explained as a result of detecting more low-abundance species, which naturally entail a higher variation [20]. Recently, spectral library-free approaches, such as DIA-Umpire and DirectDIA, have been developed [26, 30]. The quantification accuracy of these methods is similar to that of spectral library-based approaches, but with a lower number of identified precursors [20, 26, 30].
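The idea of predetermined sequential isolation windows can be sketched as follows. The m/z range of 400 to 1000 and the window width of 25 are assumed values for illustration only, and fixed-width, non-overlapping windows are a simplification; actual DIA methods may use variable or overlapping windows.

```python
from typing import List, Optional, Tuple

Window = Tuple[float, float]  # (lower m/z bound, upper m/z bound)


def make_windows(start: float, end: float, width: float) -> List[Window]:
    """Build consecutive, non-overlapping isolation windows covering [start, end)."""
    windows = []
    lower = start
    while lower < end:
        windows.append((lower, min(lower + width, end)))
        lower += width
    return windows


def window_index(mz: float, windows: List[Window]) -> Optional[int]:
    """Return the index of the window containing mz, or None if mz is not covered."""
    for index, (lower, upper) in enumerate(windows):
        if lower <= mz < upper:
            return index
    return None


# Assumed acquisition scheme: 400-1000 m/z in steps of 25 m/z.
windows = make_windows(400.0, 1000.0, 25.0)

# Two hypothetical precursors that fall into the same window are fragmented together.
print(window_index(512.30, windows), window_index(520.10, windows))
```

Because both precursors share one window, their fragment ions end up in the same MS2 spectrum, which is why the precursor-fragment relationship has to be reconstructed afterwards.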

While global proteomics is the strategy of choice for, e.g., the discovery of new biomarkers, their further validation requires targeted methods for sensitive, accurate, and specific protein quantification [31]. The targeted MS approaches, i.e., selected reaction monitoring (SRM), also known as multiple reaction monitoring (MRM) (see Chap. 26), and parallel reaction monitoring (PRM) (see Chaps. 12, 26), are highly reproducible and relatively fast. In contrast to the DIA approach, which is normally not performed using heavy isotope labeling, SRM and PRM offer their best performance for absolute quantification when isotope-labeled peptides/proteins are applied as internal standards [32, 33].

SRM is typically performed on triple-quadrupole instruments: Q1 selects a peptide ion, Q2 fragments the peptide, and Q3 selects a specific fragment ion for detection and quantification at the MS2 level [32]. Owing to this two-level mass filtering, most co-eluting peptides are effectively excluded, making SRM a highly sensitive technique [34]. However, because of the quadrupole’s low resolving power, separation of interfering near-isobaric ions that co-elute with the target peptides is limited [35]. In addition to the m/z of the peptides, SRM requires prior knowledge of the fragment ions to target [36].

The limitations of SRM were overcome with the implementation of the PRM approach, which uses a high-resolution Orbitrap or time-of-flight instrument as the MS2 analyzer. PRM uses targeted tandem MS to simultaneously monitor the product ions of a targeted peptide with high resolution and mass accuracy [35, 37]. Compared to SRM, PRM offers comparable accuracy but a wider dynamic range and higher selectivity. However, PRM has a longer cycle time. This approach has been used successfully for the accurate quantification of specific low-abundance peptides/proteins in complex biological samples [38,39,40].

2.2 Peptide or Protein-Centric Approach

Two distinct strategies currently co-exist in MS-based proteomics studies: bottom-up, peptide-centric approaches and top-down, protein-centric approaches. The bottom-up approach enables high-throughput analysis and allows identification and quantification of thousands of proteins in complex samples, owing to their proteolytic digestion into shorter peptides before analysis [41, 42]. This simplifies MS/MS sequencing, since peptides fragment more easily in the mass spectrometer than intact proteins. However, digesting the proteins at the very first stage of the experiment discards the connectivity between peptides and proteins, complicating computational analysis and biological interpretation of the data [42, 43]. Although the analysis of MS/MS spectra yields peptide spectrum matches (PSMs), most search engines and methods return protein lists. These lists vary according to the underlying model and the implied independence assumptions [43, 44]. Moreover, inconsistencies arise from the so-called protein inference problem, which occurs when two or more proteins share the same peptide [42, 44, 45]. In this case, accurate identification is possible only if discriminating (unique) peptides are identified and the database applied is adequate for the sample. Otherwise, all possible proteins with shared peptides are assigned as members of a protein group (according to the inference algorithm) [42, 43]. Protein inference is performed automatically by only a few peptide search engines, e.g., X!Tandem and Mascot, and these can only employ peptide identifications found by their own algorithms. Most of the widely used programs, for example, MS-GF+ and Comet, return only the spectrum identifications without any inference [43]. Thus, identification and quantification of a large proportion of proteins and proteoforms may be restricted by the presence of non-unique, low-quality, or incorrectly identified peptides [42, 44].
In an attempt to solve this issue, programs for protein inference from PSMs (e.g., PIA, ProteinProphet, Scaffold, and IDPicker) were developed [43, 46, 47]. PIA, for example, is able to combine PSMs from different search engine runs and turn them into consistent results [43]. Additionally, protein quantification can be complicated by the need to propagate peptide-level information up to the protein level. Thus, the peptides are usually quantified first, and these data are then transferred to the protein level [45]. Hence, quantification and analysis at the peptide level is preferable because of better detection of differential abundance and higher accuracy [43, 44].
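The role of shared and unique peptides in protein inference can be made concrete with a deliberately simple sketch. It groups proteins that are supported by exactly the same set of identified peptides and lists the peptides that map to a single protein. This is not the algorithm of PIA, ProteinProphet, or any other tool named above; the function names, peptide sequences, and protein accessions are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


def group_proteins(peptide_to_proteins: Dict[str, Set[str]]) -> Dict[Tuple[str, ...], Set[str]]:
    """Group proteins that are supported by an identical set of peptides."""
    protein_to_peptides = defaultdict(set)
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_to_peptides[protein].add(peptide)

    # Proteins with the same peptide evidence cannot be told apart.
    evidence_to_proteins = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        evidence_to_proteins[frozenset(peptides)].append(protein)

    return {
        tuple(sorted(proteins)): set(peptides)
        for peptides, proteins in evidence_to_proteins.items()
    }


def unique_peptides(peptide_to_proteins: Dict[str, Set[str]]) -> Set[str]:
    """Return peptides that map to exactly one protein."""
    return {pep for pep, proteins in peptide_to_proteins.items() if len(proteins) == 1}


# Hypothetical identifications: peptide sequence -> matching protein accessions.
identified = {
    "PEPTIDEA": {"P1"},
    "PEPTIDEB": {"P1", "P2"},
    "PEPTIDEC": {"P3", "P4"},
}

groups = group_proteins(identified)
unique = unique_peptides(identified)
print(groups)
print(unique)
```

In this example P3 and P4 are indistinguishable and form one protein group, whereas P1 is supported by a unique peptide. How P2, which is backed only by a shared peptide, should be reported depends on the inference algorithm and is left open here.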

Distinguishing and quantifying different proteoforms, i.e., protein products of a single gene arising from genetic variation, alternative splicing of RNA transcripts, and PTMs (e.g., phosphorylation, glycosylation, acetylation), is essential for understanding biological processes and pathology [48]. In this context, the top-down MS approach is especially attractive for relative quantification of protein modifications, because MS of a digest mixture may not detect the peptide carrying a particular isoform-specific modification. Moreover, the addition of smaller PTMs (e.g., phosphorylation) has much less effect on the ionization/detection of intact proteins than of peptides. Thus, the top-down approach may be employed for the quantification of modified and unmodified protein species in biological samples [49,50,51,52]. Top-down MS comprises the introduction of intact proteins into the mass spectrometer and analysis of the masses of both intact and fragment ions. This approach allows up to 100% sequence coverage and full characterization of proteoforms [53]. Despite its attractiveness, reproducible and accurate protein quantification by top-down MS remains challenging because of the technical difficulty of proteome-wide analysis at the intact protein level. Quantification is restricted mainly by the much lower efficiency of ionization and detection of large proteins [54]. For complex samples such as human plasma, the target protein is usually enriched prior to top-down MS analysis, e.g., by immunoprecipitation [54, 55]. Successful protein quantification with the top-down approach may also be achieved when using targeted MS [56] or when analyzing relatively small proteins (<30 kDa) [55, 57].

2.3 Label-Free and Label-Based Quantification by Mass Spectrometry

MS is a technique for measuring the mass-to-charge ratios of charged particles and is not in itself quantitative. Different peptides and proteins exhibit a wide range of physicochemical properties, e.g., size, charge, and hydrophobicity; therefore, their mass-spectrometric responses cannot be used for quantitative comparison between different molecular species within one sample. For accurate quantification, it is therefore generally required to compare each individual molecule either between experiments (label-free quantification) or within a single experiment, in which the molecules differ only in their isotopic composition and otherwise have identical physical and chemical properties (label-based quantification) [58].

Label-based methodology comprises in vivo (metabolic) (see Chaps. 14, 19, 20) or in vitro (post-harvest) labeling of peptides or proteins with heavy, non-radioactive isotopes (see Chaps. 10, 11, 13, 15, 16, 18, 26, 27). Owing to the natural occurrence of certain stable heavy isotopes, each peptide/protein contains a certain proportion of these; the isotope pattern seen in the mass spectrometer thus reflects the natural abundances of these heavy isotopes within the peptide. Artificial incorporation of heavy isotopes (e.g., 13C, 15N, 18O, 2H) produces a mass-to-charge shift of the peptide’s peaks in the mass spectrum. Apart from this shift, heavy isotope-labeled peptides, with the exception of those labeled with deuterium, are chemically identical to their native counterparts, and the two peptides therefore behave identically during LC-MS analysis (see Note 2). Given that a mass spectrometer can recognize the mass difference between the labeled and unlabeled forms of a peptide, quantification is achieved by comparing their respective signal intensities [58]. Isotope labels can be introduced as an internal standard into amino acids metabolically, chemically, or enzymatically or, alternatively, as an external standard using spiked synthetic peptides [59]. Labeling techniques thus enable two or more samples to be compared within the same mass spectrum with high accuracy and reproducibility of the quantitative measurements. They also avoid the missing value problem of label-free approaches: the absence of a signal indicates that the peptide signal is below the detection limit rather than that it was not picked up by chance [60]. The major limitations of the approach are the additional steps in sample preparation, high costs, and the limited number of samples that can be analyzed within one experiment [58]. The use of the MS1 spectrum for quantification can be another limitation of the methodology [61].
The higher the number of samples, the more complex the MS1 spectrum becomes, owing to overlapping peaks and the identical properties of the precursor ions [61]. In practice, this limits the number of samples that can be compared in a single experiment [60]. Another limitation of MS1-based quantification is that the number of ions that can be accumulated in the most commonly used high-resolution analyzer, the Orbitrap, is limited [62, 63]. The number of ions for low-abundance peptides can therefore be very small if highly abundant peptides co-elute at the same time in the MS1 spectrum, resulting in less precise quantification due to poor ion statistics [60]. This limitation has been somewhat alleviated by ion-mobility separation or BoxCar [60]. Quantitative information can also be obtained from the MS/MS fragment ion spectrum. In this case, more than one independent spectrum is available for analysis. However, quantification based on isobaric labels can be complicated by co-isolated peptides, which generate reporter ions that are superimposed on those from the selected precursor ion and thus give an inaccurate link between peptide quantity and identity [64]. A double isolation method (MS3) was suggested to address this problem. A disadvantage of MS3 isolation is that it results in a lower number of proteins being quantified [65].
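The two quantities at the heart of MS1-based label quantification, the m/z shift introduced by a label and the intensity ratio of a light/heavy peptide pair, can be sketched in a few lines. The label mass difference of 8.0142 Da and the intensities are assumed example values, not data from the studies cited above.

```python
import math


def mz_shift(delta_mass: float, charge: int) -> float:
    """m/z spacing between light and heavy peaks for a given mass difference and charge."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return delta_mass / charge


def log2_ratio(heavy_intensity: float, light_intensity: float) -> float:
    """log2 of the heavy/light intensity ratio of one peptide pair."""
    return math.log2(heavy_intensity / light_intensity)


# Assumed example: a label adding 8.0142 Da, observed on a doubly charged peptide.
print(mz_shift(8.0142, charge=2))

# Assumed example intensities of the heavy and light forms of the same peptide.
print(log2_ratio(heavy_intensity=4.0e6, light_intensity=1.0e6))
```

The same mass difference produces a smaller m/z spacing at higher charge states, which is one reason why crowded MS1 spectra become harder to interpret as more labeled samples are combined.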

Label-free quantification relies on the comparison of peptides and proteins in their natural state in consecutive experiments (see Chaps. 17, 21–25). It requires highly reproducible sample-handling and analysis protocols [58, 66]. Since all samples in a label-free study are analyzed separately by LC-MS/MS, variation can arise even between technical replicates of the same sample when all samples are run in a single sequence on the same instrument. To account for this bias and to make the data more comparable, normalization is required [58, 67]. Label-free quantification enables the analysis of a virtually unlimited number of samples without the introduction of any labels, thus keeping costs low and minimizing the sample preparation steps. This approach is highly preferable for biomarker research [5, 6, 18]. On the other hand, poor reproducibility may require the analysis of many technical replicates and may lead to low accuracy of the quantitative measurements [58]. Overall, the label-free approach has been shown to give the largest dynamic range and the highest proteome coverage for identification but lower quantification accuracy and reproducibility than labeling-based strategies [68].
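One common and simple way to perform the normalization mentioned above is median centering of log-transformed intensities, sketched below. This is only one of several possible normalization schemes and is not claimed to be the method used in the cited studies; the run names and values are hypothetical.

```python
import statistics
from typing import Dict, List


def median_normalize(log_intensities: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Shift each run so that all runs share the same median log intensity."""
    medians = {run: statistics.median(values) for run, values in log_intensities.items()}
    # Use the mean of the run medians as the common target level.
    target = statistics.mean(list(medians.values()))
    return {
        run: [value - medians[run] + target for value in values]
        for run, values in log_intensities.items()
    }


# Hypothetical log2 intensities of the same three peptides in two LC-MS/MS runs;
# run_B is systematically one log2 unit higher, e.g., due to a loading difference.
runs = {
    "run_A": [20.0, 22.0, 24.0],
    "run_B": [21.0, 23.0, 25.0],
}

normalized = median_normalize(runs)
print(normalized)
```

Median centering assumes that most peptides do not change between runs, so that a global offset reflects technical rather than biological variation.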

2.4 Quantitative Cross-Linking/Mass Spectrometry

Cross-linking MS (XL-MS) is an advanced technique to study single proteins, protein complexes, and protein–protein interaction networks [69, 70] (see Chap. 27). The method is based on the ability of a cross-linker to convert 3D proximity of amino acid residues into covalent bonds [71, 72]. The bridgeable distance between residues depends on the cross-linker used. For example, the commonly used bis[sulfosuccinimidyl] suberate (BS3) links residues up to 25–30 Å apart [71]. Following proteolytic digestion of the proteins, cross-linked peptides are identified using LC-MS and database search [71].
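The notion of a bridgeable distance can be illustrated by checking whether two residue positions in a structural model lie within the reach of a cross-linker. The default cutoff of 30 Å is taken from the upper end of the range quoted above for BS3; the coordinates are hypothetical, and which atoms the distance should be measured between is an assumption left to the user.

```python
import math
from typing import Tuple

Coordinate = Tuple[float, float, float]  # x, y, z in angstroms


def within_crosslink_range(a: Coordinate, b: Coordinate, max_distance: float = 30.0) -> bool:
    """Return True if the Euclidean distance between a and b is at most max_distance."""
    return math.dist(a, b) <= max_distance


# Hypothetical residue positions taken from a structural model.
residue_1 = (0.0, 0.0, 0.0)
residue_2 = (15.0, 20.0, 0.0)   # 25 angstroms from residue_1
residue_3 = (30.0, 40.0, 0.0)   # 50 angstroms from residue_1

print(within_crosslink_range(residue_1, residue_2))
print(within_crosslink_range(residue_1, residue_3))
```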

Quantitative XL-MS allows comparison of cross-linked peptides across experimental conditions and varying biological states and may apply both label-free and label-based approaches [73]. Successful label-based XL-MS was reported using isotope-labeled cross-linkers combined with software tools such as XiQ, xTract, MaxQuant, and Skyline [73,74,75,76]. The majority of the methods are designed for quantification based on the MS1 signal [73]. Combination with isobaric tandem mass tags (TMT) allowed quantification from the reporter ion signal in MS3 spectra [77]. The main limitations of isotope labeling-based XL-MS are costs, complex sample preparation, and reduced data coverage [72]. The advantages of label-free quantitation, which allows an unlimited number of samples, were presented recently with an MS1- and MS2-based XL-MS workflow using Skyline [72, 73]. Since the sample preparation procedure for cross-linking is more sophisticated than in standard proteomics, one might expect a larger variance. However, it has been shown that the reproducibility of label-free XL-MS is in line with that of general quantitative proteomics [72]. Recently, DIA-based XL-MS has been demonstrated to be capable of detecting changing abundances of cross-linked peptides in complex mixtures when combined with the Spectronaut software [78].

3 Methods for Protein/Peptide Quantification

3.1 Relative Quantification

Relative quantification provides abundance ratios between peptides and proteins by comparing their signals originating from different samples (see Chaps. 10, 11, 14–22, 24, 25, 27). Usually performed in “discovery” (non-targeted) mode, it allows quantitative profiling of tens of thousands of peptides from thousands of proteins within a single experiment without prior information. It can be based on heavy isotope labeling or be label-free.

3.1.1 Stable Isotope Labeling-Based Methods

3.1.1.1 Chemical Labeling

The methods for relative quantification by chemical labeling rely on a chemical reaction (without enzymatic catalysis) between a reagent and the peptides (or proteins) in the sample of interest in vitro (i.e., after isolation of the protein/peptide from the biological sample). The reagent used bears different numbers of stable heavy isotopes and thus produces a mass shift in the MS spectrum (e.g., dimethyl labeling) or in the MS/MS spectrum (in the case of isobaric reagents, e.g., iTRAQ, TMT).

One of the first commercially available reagents for chemical labeling was ICAT (isotope-coded affinity tags) [79]. ICAT is a protein-specific non-isobaric chemical label, which consists of three moieties: a sulfhydryl-reactive group for coupling to the analyte cysteines, an affinity group for isolation of the tagged species (peptides), and a linker in a light form (with natural isotope distribution) or a heavy form (containing eight deuterium (2H) atoms instead of 1H). The two samples to be compared are labeled with the light or heavy ICAT reagent and subsequently mixed. The ratio of the peak intensities of light/heavy-labeled peptide pairs obtained by MS correlates with their relative abundance. In the original version, deuterium labeling of the linker was used, but because light- and heavy-labeled peptides eluted differentially, the method was improved by using 13C labeling [80]. Significant disadvantages of the approach are the side reactivity of the biotin tag and its inability to label peptides lacking cysteine.

Another non-isobaric labeling method based on the same principle is ICPL (isotope-coded protein labeling). The advantage of ICPL is its reactivity toward free amines (lysine side chains and the N-terminus), allowing labeling of virtually all peptides present in the samples [81]. This approach can also be performed at the protein level, reducing the influence of variability in sample processing. In this case, however, trypsin is unable to cleave at ICPL-modified lysine residues, resulting in longer peptides that are difficult to fragment. Moreover, some proteins are identified only by peptides that contain no lysine, which restricts their use for quantification [82].

An important group of reagents used for relative quantification comprises the isobaric chemical labels [59, 83]. These rely on isobaric labeling of peptides from different samples, which upon fragmentation give rise to different reporter ions in the MS/MS spectrum. The iTRAQ (isobaric tags for relative and absolute quantification) labels each contain an amine-reactive group, a balance group, and a reporter group. Overall, different reagents have the same molecular weight and upon labeling produce identical mass shifts. Different samples are labeled with reagents containing different distributions of heavy isotopes between the balance and reporter groups and are subsequently mixed. Identical peptides from the samples to be compared co-elute and are detected as a single precursor ion. The iTRAQ labels are designed in such a way that, upon fragmentation, different reagents give rise to reporter ions with identical chemical composition but different molecular weights, owing to their different isotope compositions. Their intensities are proportional to the relative abundances of the labeled peptide originating from the different samples. A major advantage of this method is that it is capable of “multiplexing”; it enables analysis of up to eight samples within a single experiment. The disadvantage of the method is that, similar to other isobaric techniques, ratio compression due to background interference occurs. This problem can be solved through high-resolution sample fractionation and additional isolation and fragmentation (MS3) [83, 84].
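How reporter ion intensities translate into relative abundances, and how background interference compresses the observed ratios, can be shown with a small numerical sketch. The channel names, intensities, and the assumption of an equal background contribution in every channel are hypothetical simplifications.

```python
from typing import Dict


def reporter_ratios(intensities: Dict[str, float], reference: str) -> Dict[str, float]:
    """Express each reporter channel relative to the chosen reference channel."""
    reference_intensity = intensities[reference]
    return {channel: value / reference_intensity for channel, value in intensities.items()}


def add_interference(intensities: Dict[str, float], background: float) -> Dict[str, float]:
    """Add the same background signal to every channel (simplified co-isolation model)."""
    return {channel: value + background for channel, value in intensities.items()}


# Hypothetical reporter intensities of one peptide in two samples (true ratio 4:1).
true_signal = {"channel_1": 1000.0, "channel_2": 4000.0}
true_ratio = reporter_ratios(true_signal, reference="channel_1")["channel_2"]

# Co-isolated peptides contribute an assumed 500 counts to each reporter channel.
observed_signal = add_interference(true_signal, background=500.0)
observed_ratio = reporter_ratios(observed_signal, reference="channel_1")["channel_2"]

print(true_ratio, observed_ratio)
```

In this toy example the true fourfold difference is observed as a threefold difference, which illustrates why the ratio is pulled toward 1 whenever unregulated background ions contribute to the reporter signal.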

A very similar approach is labeling with tandem mass tags (TMTs), which consist of an amine-reactive group, a balance group, and a reporter group; the reporter groups are released upon fragmentation during MS/MS, and their intensities are used to calculate relative peptide amounts between the samples [85] (see Chaps. 10, 15, 16, 18). The multiplexing capability of TMT reagents was recently improved using differential stable isotope incorporation (15N, 13C) across the reporter and mass balancing regions of the tag, giving rise to 6-plex and 10-plex sets for relative quantitative profiling of multiple conditions [86]. Importantly, quantitation of TMT-labeled peptides requires fragmentation using higher-energy collision dissociation (HCD) or electron transfer dissociation (ETD), because TMT reporter ions are not visible in ion traps following collision-induced dissociation (CID). Accurate quantification with any isobaric labeling strategy is possible only when a single precursor ion is selected for MS2 analysis. Co-eluting peptides that are co-isolated with the selected precursor therefore lead to underestimation of the actual differences in protein abundance. This challenge can be overcome by additional fragmentation [84]. Although multiple software packages support relative quantification using TMT reagents, analysis of data acquired with multiple fragmentation steps requires Proteome Discoverer or custom scripts.

A cost-effective alternative to the commercially available TMT and iTRAQ is the N,N-dimethyl leucine (DiLeu) reagent. DiLeu has the same labeling efficacy, protein coverage, and quantitation accuracy as iTRAQ, as well as higher fragmentation efficacy [87]. Due to the inclusion of eight new reporter isotopologues that differ in mass from the existing four reporters by intervals of 6 mDa, DiLeu yields a 12-plex isobaric set that preserves the synthetic simplicity and quantitative performance of the original implementation [88].

Another approach for quantification at the MS/MS level is IPTL (isobaric peptide termini labeling) [89] (see Chap. 11). This uses isobaric sequential labeling of the C- and N-termini of the analyzed peptides with deuterated and non-deuterated succinic anhydride. Upon fragmentation, either the N-terminal or the C-terminal label is lost, which results in differentially labeled C- and N-terminal fragment ion series, respectively. These appear as fragment ion pairs in MS/MS, and their relative intensities can be used for quantification. An advantage of this strategy is that the quantification is based on several data points per MS/MS spectrum, although this complicates data analysis enormously. An improved duplex IPTL labels the peptide N- and C-termini with succinic anhydride and dimethyl groups, respectively, and does not require peptide purification between steps [90]. Triplex-IPTL comprises dimethylation of both peptide termini using different stable isotopes of formaldehyde and cyanoborohydride [91].

A significant advantage of all chemical labeling methods is that they can be applied to practically any type of sample (cell culture, tissues, body fluids, etc.), in contrast to metabolic labeling as discussed below. However, it is crucial to optimize labeling conditions (see Note 3).

3.1.1.2 Metabolic Labeling

Metabolic labeling with stable heavy isotope labels (see Chaps. 14, 19, 20) introduces the label at the earliest time point in an experiment, i.e., during cell growth and duplication. This is achieved by feeding organisms with special media containing a subset of the metabolites in heavy-labeled form. Metabolic labeling ensures lower deviations in quantification, as the samples to be compared can be mixed at a very early stage during the experiment. Metabolic labeling can easily be achieved in cell culture, but scaling up to whole organisms such as Drosophila, Caenorhabditis elegans, and even mice is also possible.

Labeling with 15N-containing media (see Chap. 20) has been used successfully for protein quantification in yeast [92], mammalian cells [93], C. elegans, Drosophila melanogaster [94], Arabidopsis thaliana [59], and rat [95]. Very high levels of isotope incorporation can be achieved by this method, but the mass difference between labeled and unlabeled samples depends on the number of 15N atoms present in each peptide, which poses a significant challenge for data analysis and quantification. Moreover, highly enriched 15N sources are required in order to avoid complex isotope distributions of partially labeled peptides [59]. A computationally simpler method, AACT (amino acid-coded mass tagging), also known as SILAC (stable isotope labeling with amino acids in cell culture), was developed to address these issues [95, 96]. SILAC (see Chaps. 14, 19) takes advantage of the fact that organisms are naturally auxotrophic (or can be genetically manipulated to be auxotrophic) for certain amino acids. These amino acids can therefore be added to growth media in labeled or unlabeled form and are used by the organism for building proteins in vivo. SILAC experiments usually employ lysine and arginine containing different numbers of the heavy isotopes 13C, 15N, and 2H. Using trypsin for protein digestion ensures that each resulting peptide contains at least one labeled amino acid (except for the C-terminal peptide of the protein). Quantitative information is obtained by comparing the intensities of the precursor isotope envelopes of non-labeled and labeled peptides. SILAC can be applied not only in cell culture but also to whole organisms such as Drosophila [97] or mice (SILAM) [98]. However, application in animals involves high costs and sophisticated feeding procedures, which limits the method’s widespread use. As with most other label-based approaches, nearly 100% incorporation of the label should be aimed for when metabolic labeling is applied.
Incomplete labeling results in inaccurate quantification. Additionally, any changes or stress in the experimental organism due to the artificial growth medium should be taken into account (e.g., when using dialyzed fetal bovine serum for mammalian cells). Another important consideration when using SILAC is the metabolic conversion of the isotopically labeled amino acids within the cell. This can lead to incorrect quantification if, for example, the pathway leading from arginine to proline is stimulated when the concentration of the added arginine is not carefully adjusted, or if the conversion is not corrected for [99] (see Note 4). In the case of affinity interaction pull-downs using SILAC in vitro, careful adherence to identical conditions for preparation of heavy and light cell extracts is important for obtaining reliable results [100]. A significant disadvantage of metabolic labeling methods is that they cannot be used for tissues and body fluids from organisms that cannot be labeled easily (e.g., human patients). To overcome this issue, super-SILAC, in which isotopically labeled SILAC samples serve as internal standards, was introduced; this allowed successful quantification in tumor tissue samples [101]. A further limitation of the SILAC approach is that it is practical only for the comparison of up to three samples. 5-plex SILAC was suggested to compare five different cellular conditions within a single experiment: in addition to Lys and Arg, stable isotopically labeled Tyr (13C6 and [13C9,15N1]) was used to introduce the necessary peptide mass shifts [102].
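
The dependence of the 15N mass shift on peptide composition, described above as a challenge for data analysis, can be illustrated with a minimal Python sketch. It assumes complete 15N incorporation; the function name and the example peptides are hypothetical, while the nitrogen counts per residue and the isotope masses are standard values.

```python
# Nitrogen atoms per amino acid residue (backbone N plus side-chain N).
NITROGENS = {
    "A": 1, "R": 4, "N": 2, "D": 1, "C": 1, "E": 1, "Q": 2, "G": 1,
    "H": 3, "I": 1, "L": 1, "K": 2, "M": 1, "F": 1, "P": 1, "S": 1,
    "T": 1, "W": 2, "Y": 1, "V": 1,
}
DELTA_15N = 15.0001089 - 14.0030740  # mass difference 15N - 14N in Da


def n15_mass_shift(peptide: str) -> float:
    """Mass shift (Da) of a fully 15N-labeled peptide versus its 14N form.

    Assumes 100% incorporation; partial labeling would broaden the
    isotope envelope instead of producing one clean shift.
    """
    return sum(NITROGENS[aa] for aa in peptide) * DELTA_15N


if __name__ == "__main__":
    # Hypothetical example peptides: each one gets a different shift.
    for pep in ("LGEYGFQNALIVR", "AGK", "HHWR"):
        print(pep, round(n15_mass_shift(pep), 4))
```

Because every peptide carries a different number of nitrogen atoms, the software must know the sequence (or at least the elemental composition) before it can pair light and heavy signals. In SILAC, by contrast, the shift is fixed by the labeled lysine or arginine residues a peptide contains.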

Another approach that increases SILAC multiplicity, up to 6-plex, is neutron encoding (NeuCode). This method is based on labeling with six lysine isotopologues with varying combinations of 13C, 15N, and 2H substitution that differ in mass by 6 mDa [103]. When NeuCode labeling is combined with PRM for targeted proteome analysis, the multiplicity rises to as many as 30 channels of quantitative information in a single MS experiment [104]. NeuCode can be used both in cell culture and in mammals, with shorter labeling times than the standard approach (∼2 weeks for cultured cells and 3–4 weeks for mammals) [105].

Cell line-specific labeling using amino acid precursors (CTAP) was developed to differentiate the proteomes of individual cell populations in co-culture [106]. This method exploits the inability of vertebrate cells to synthesize certain amino acids required for growth and homeostasis. Transgenic expression of enzymes that synthesize essential amino acids allows vertebrate cells to overcome auxotrophy by producing their own amino acids from supplemented precursors. These precursors can be isotopically labeled, allowing the cell of origin of proteins to be determined from the label status identified by MS/MS [61]. A similar method is nitrilase-activatable noncanonical amino acid tagging (NANCAT), which exploits an exogenous nitrilase to enzymatically convert nitrile-substituted precursors to their corresponding noncanonical amino acids (ncAAs), L-azidohomoalanine (AHA) or homopropargylglycine (HPG), in living cells. Only cells expressing the nitrilase can generate AHA or HPG in cellulo and metabolically incorporate them into nascent proteins. Subsequent click-labeling of the azide- or alkyne-incorporated proteins with fluorescent probes or with affinity tags enables visualization and proteomic profiling of nascent proteomes, respectively [107].

3.1.1.3 Enzymatic Labeling

Heavy stable isotopes can be incorporated during enzymatic proteolysis of proteins. Performing proteolysis in heavy (H₂¹⁸O) or light (H₂¹⁶O) water incorporates, respectively, two 18O or 16O atoms at the C-terminus of the generated peptides, resulting in a mass shift of 4 Da between heavy- and light-labeled peptides [108, 109]. This label can also be incorporated after digestion in a second incubation step with a protease. This method ensures near-complete labeling. In contrast to ICAT, it does not favor cysteine-containing proteins and does not require enrichment of labeled peptides; unlike metabolic labeling, it can be used for human specimens. It is also less expensive than other stable isotope labeling techniques. Acid-catalyzed back-exchange at extreme pH conditions can occur [110] (see Note 5); however, the mild conditions used during ESI or MALDI analyses do not influence the stability of the introduced label. Incomplete labeling by incorporation of only one 18O atom can complicate data analysis and needs to be taken into account [111].
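
The 4 Da shift, and the intermediate signal produced by incomplete labeling, follow directly from the isotope masses. The short Python sketch below is an illustration only; the function names are hypothetical, and the oxygen isotope masses are standard values.

```python
MASS_16O = 15.9949146  # monoisotopic mass of 16O in Da
MASS_18O = 17.9991596  # monoisotopic mass of 18O in Da


def o18_mass_shift(n_exchanged: int) -> float:
    """Mass shift (Da) after exchanging n C-terminal 16O atoms for 18O."""
    if n_exchanged not in (0, 1, 2):
        raise ValueError("the C-terminal carboxyl group has two exchangeable oxygens")
    return n_exchanged * (MASS_18O - MASS_16O)


def o18_mz_shift(n_exchanged: int, charge: int) -> float:
    """Observed m/z spacing between light and labeled peptide at a given charge."""
    return o18_mass_shift(n_exchanged) / charge


if __name__ == "__main__":
    print("complete labeling (2 x 18O):", round(o18_mass_shift(2), 4), "Da")
    print("incomplete labeling (1 x 18O):", round(o18_mass_shift(1), 4), "Da")
    print("m/z spacing for a doubly charged peptide:", round(o18_mz_shift(2, 2), 4))
```

A singly labeled peptide lies about 2 Da above the light form and therefore overlaps with the natural isotope envelope of the light peptide, which is why incomplete labeling has to be modeled during data analysis.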

3.1.2 Label-Free Quantification (LFQ)

Label-based approaches for proteomic quantification usually come at higher cost, require additional sample preparation steps, and are characterized by limited multiplicity. Therefore, it is not surprising that the use of label-free methods has increased during the last few decades. As mentioned above, label-free quantitative approaches (see Chaps. 17, 21–25) rely on the comparison of different features between independent LC-MS or LC-MS/MS measurements. They fall into two general categories: (i) spectral counting (SC) methods, which involve counting the number of identified peptides or acquired fragment spectra, and (ii) methods that compare precursor ion intensities, determined from the extracted ion chromatogram (XIC).

3.1.2.1 Spectral Counting-Based LFQ

Spectral counting-based LFQ relies on the practical observation that more abundant peptides are more likely to be observed and detected in an MS experiment. These approaches use the number of peptides or the number of fragment spectra observed for a particular protein in the analysis. The protein abundance index (PAI), calculated as the ratio of the number of peptides identified for a protein to the total number of peptides that the protein could theoretically produce, was suggested for LFQ [112]. However, Liu et al. found a linear correlation over two orders of magnitude between the number of spectra and the relative protein abundance, whereas no correlation was observed between the relative protein amounts and either the number of peptides or the sequence coverage [11]. Although spectral counting is a relatively simple and reliable technique that is easily implemented, normalization and careful statistical evaluation are still needed for accurate quantification. Accuracy can decrease significantly for proteins with only a few observable peptides, as well as when the quantitative changes between experiments are small [113]. Furthermore, since larger proteins give rise to more peptides than smaller ones, additional normalization factors can be applied to improve the quantification results [114]. Several such approaches have been implemented. The first is the normalized spectral abundance factor (NSAF), calculated as the ratio of the spectral counts for a given protein to its length; this value is then normalized by dividing it by the sum of the corresponding ratios for all proteins identified in the experiment. This method also improves the minimal fold change detectable by SC [114]. The second, the protein abundance factor, normalizes the total number of non-redundant spectra to the molecular mass of the intact protein [115].
The third is the absolute protein expression (APEX) algorithm, a machine learning classification system that corrects spectral counts for the likelihood that a spectrum might be detected [116]. The limited ability of SC to quantify smaller abundance changes can be improved by increasing the scoring requirements for spectrum identification; however, this restricts the accurate quantification of low-abundance proteins [117]. Another disadvantage of SC concerns peptides that can be assigned to more than one protein. The normalized spectral abundance factor approach handles this problem by dividing shared spectra proportionally between the possible contributing proteins, based on the distribution of the other identified unique peptides [118].
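
The NSAF calculation described above (spectral count divided by protein length, then normalized to the sum over all identified proteins) can be sketched in a few lines of Python. The function name, the protein identifiers, and the numbers are hypothetical and serve only as an illustration; handling of shared spectra is not included.

```python
def nsaf(spectral_counts: dict, lengths: dict) -> dict:
    """Normalized spectral abundance factor for each protein.

    spectral_counts: protein id -> number of assigned spectra
    lengths:         protein id -> protein length in amino acids
    """
    # Spectral abundance factor: spectra per residue.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra were assigned to any protein")
    # Normalize so that all values in one experiment sum to 1.
    return {p: value / total for p, value in saf.items()}


if __name__ == "__main__":
    counts = {"P1": 50, "P2": 50, "P3": 10}      # hypothetical spectral counts
    lengths = {"P1": 500, "P2": 250, "P3": 100}  # hypothetical lengths
    for protein, value in nsaf(counts, lengths).items():
        print(protein, round(value, 3))
```

In this toy example P1 and P2 have the same spectral count, but the shorter P2 receives twice the NSAF value, which is the length correction the method is meant to provide.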

3.1.2.2 Intensity-Based LFQ

Signal intensities of ions after ESI correlate with ion concentrations [119, 120]. The peak areas extracted from chromatograms of specific ions in LC-MS measurements (extracted ion chromatograms, XIC) can therefore be used for relative quantification of specific peptides and proteins between different samples. Quantitation is based on the area under the curve or the peak height for each peptide that elutes from the LC column at an expected retention time. The method allows measurements with high precision and wide dynamic range, especially when high-resolution mass spectrometers are used. It can also be applied to MALDI measurements combined with offline LC separation. However, the following important considerations should be taken into account. First, the variation between measurements of the peak intensities of peptides from the same sample (technical replicates) should be recorded, and appropriate normalization should be applied. Second, and more critically, variation of the LC retention time and/or m/z values of identical peptides between measurement runs should be considered. Any such variability requires alignment of individual ion chromatograms for correct quantification and elimination of any global drift in retention time. Practical normalization strategies include the addition of identical amounts of a standard protein to the different samples, or normalization based on a priori information about a protein that does not change quantitatively between the samples compared [121]. Reproducibility of LC separation, stability of the electrospray ion source, and the use of computational algorithms for comparison, retention time alignment, and statistical evaluation of several LC-MS datasets in a single procedure are therefore crucial (see Note 6).
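
As a minimal illustration of the area-under-the-curve calculation and of normalization to a spiked standard, the following Python sketch integrates an XIC peak with the trapezoidal rule and compares two runs. It assumes that peak boundaries and retention time alignment have already been handled; all function names and numbers are hypothetical.

```python
def xic_area(times, intensities):
    """Trapezoidal area under an extracted ion chromatogram peak."""
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    area = 0.0
    for i in range(1, len(times)):
        mean_height = 0.5 * (intensities[i] + intensities[i - 1])
        area += mean_height * (times[i] - times[i - 1])
    return area


def fold_change(area_a, standard_a, area_b, standard_b):
    """Ratio of a peptide between runs A and B after normalizing each run
    to the XIC area of the same spiked standard."""
    return (area_a / standard_a) / (area_b / standard_b)


if __name__ == "__main__":
    times = [0, 1, 2, 3, 4]            # retention time points (arbitrary units)
    run_a = [0, 10, 20, 10, 0]         # hypothetical peptide XIC, run A
    run_b = [0, 5, 10, 5, 0]           # hypothetical peptide XIC, run B
    standard = [0, 4, 8, 4, 0]         # spiked standard, identical in both runs
    ratio = fold_change(xic_area(times, run_a), xic_area(times, standard),
                        xic_area(times, run_b), xic_area(times, standard))
    print("A/B ratio:", ratio)
```

Real workflows add peak detection, retention time alignment, and statistical evaluation across replicates; the sketch covers only the arithmetic of the comparison.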

When carried out using high-resolution mass spectrometers, intensity-based LFQ is more sensitive and accurate than SC [122]. Moreover, it can accurately distinguish smaller fold changes than SC [123].

3.2 Absolute Quantification

Absolute quantification is used to determine the absolute amount (mass, mole number, or copy number) of proteins in a mixture or complex (see Chaps. 12, 13, 26). This is very informative, but label-based methods are usually relatively laborious, and label-free ones are less accurate. Absolute quantification is generally performed at the peptide level, although top-down absolute quantification has been introduced [124].

3.2.1 Label-Based Absolute Quantification

Arguably the most widely used method, AQUA (absolute quantification), employs peptides labeled with heavy stable isotopes (AQUA peptides) as added internal standards [125] (see Chaps. 12, 26). This method can be used for accurate profiling and absolute quantification of proteins within a complex sample, for monitoring changes in post-translational modification [125, 126], and for determining the stoichiometry of subunits within a protein complex [127]. Being a targeted approach, the method requires a priori information about the peptides and proteins that are subject to analysis. The specific characteristics of the targeted precursor ion (elution time, m/z value, charge state), optimum fragmentation conditions (collision energy), and resulting fragmentation pattern are determined in prior measurements. Peptides labeled with heavy stable isotopes (13C- and 15N-labeled amino acids), identical in sequence to the peptides of interest naturally present in the sample, are synthesized chemically. The labeled and endogenous peptides have identical physicochemical properties but are separated by a specific mass shift in the mass spectrum. The AQUA peptides are added to the protein digest or peptide sample at known concentrations and analyzed in SRM mode. The co-eluting analytes (i.e., the endogenous and the mass-shifted labeled peptides) are selected for fragmentation on the basis of their previously determined elution time and m/z value. The intensities of the fragment ions of the peptide of interest are compared with those of the AQUA peptide, and the ratio directly reflects their quantitative relationship. As the amount of the added peptide is known, the amount of the sample peptide can be deduced. The AQUA approach allows very specific, targeted detection of the peptides of interest, thereby minimizing the variability and the influence of background noise. Even in complex samples, several hundred peptides can be targeted within a single LC–MS/MS experiment [128].
As the method is strictly hypothesis-driven, it allows the selection of peptides with optimal chromatographic performance and ionization efficiency (i.e., good “detectability”), which do not undergo uncontrolled modification in vitro (e.g., oxidation of methionine) and which are unique to the protein of interest. Such peptides are called proteotypic peptides and can be identified or predicted for a particular proteomic platform using peptide libraries and public databases [113, 129, 130]. The AQUA strategy nevertheless suffers from quantification uncertainties, because the peptide standards are spiked in only after sample preparation and enzymatic proteolysis, so that losses occurring during these earlier steps are not accounted for. Moreover, any losses of the standard peptides (for example, during storage) directly influence the quantification results. Several critical aspects should be considered when an AQUA experiment is being planned, such as incomplete proteolytic digestion, the exact amount of AQUA peptide, the point at which the AQUA peptides are added, and the number of AQUA peptides used for each protein to be quantified (see Note 7).
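
The final quantification step of an AQUA experiment, deducing the endogenous amount from the light-to-heavy fragment ion ratio and the known spiked amount, can be sketched as follows. This is a simplified illustration; summing the transitions before taking the ratio is one possible choice and an assumption of this sketch, and all names and numbers are hypothetical.

```python
def aqua_amount(light_intensities, heavy_intensities, spiked_fmol):
    """Estimate the endogenous peptide amount (fmol) from SRM transitions.

    light_intensities: fragment ion intensities of the endogenous peptide
    heavy_intensities: intensities of the same transitions of the AQUA peptide
    spiked_fmol:       known amount of AQUA peptide added to the sample
    """
    if not light_intensities or len(light_intensities) != len(heavy_intensities):
        raise ValueError("light and heavy transitions must be paired")
    heavy_total = sum(heavy_intensities)
    if heavy_total == 0:
        raise ValueError("no signal detected for the AQUA peptide")
    ratio = sum(light_intensities) / heavy_total
    return ratio * spiked_fmol


if __name__ == "__main__":
    light = [2000.0, 1000.0, 500.0]   # hypothetical endogenous transitions
    heavy = [4000.0, 2000.0, 1000.0]  # hypothetical AQUA peptide transitions
    print("endogenous peptide:", aqua_amount(light, heavy, spiked_fmol=50.0), "fmol")
```

The estimate is only as good as the stated spike amount and the completeness of digestion, which is why these points are emphasized in Note 7.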

In order to simplify the quantification of several peptides per protein, heavy-labeled standard proteins can be used instead of individual peptides. Several approaches have been developed in that direction, including PSAQ (protein standard absolute quantification) [131], absolute SILAC [132], absolute NeuCode [104], and FLEXIQuant (full-length expressed stable isotope-labeled proteins for absolute quantification) [133]. Protein Epitope Signature Tags (PrESTs), developed in the course of the Human Protein Atlas project [134], can be combined with SILAC, allowing accurate and streamlined quantification of the absolute or relative amounts of a large number of proteins of interest at a time, in a wide variety of applications [135].

QconCAT (concatenated signature peptides encoded by QconCAT genes) uses artificial, labeled standard proteins assembled from diverse peptides belonging to different proteins [136]. It relies on synthetic DNA that encodes a concatenated series of peptides of interest, which is expressed in Escherichia coli grown in stable isotope-labeled media. After purification and quantification, the concatenated standard is added to cell lysates at the digestion step. As with AQUA, this approach does not allow sample fractionation before analysis.

3.2.2 Label-Free Absolute Quantification

Label-free absolute quantification has several advantages: (i) it omits the time-consuming and often costly step of introducing standard peptides, and (ii) it offers the opportunity to compare virtually unlimited numbers of samples. On the other hand, it entails the disadvantages of lower accuracy and the requirement for high reproducibility. One of the first label-free approaches used for absolute quantification was emPAI (exponentially modified PAI) [137], calculated as emPAI = 10^PAI − 1. It is proportional to the protein content in a protein mixture and can therefore be used to estimate absolute amounts of proteins. However, this approach has lower accuracy than other methods [123].
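
A minimal Python sketch of the emPAI calculation (10 raised to the power of PAI, minus 1) is given below. Converting emPAI values into molar percentages by dividing by their sum is shown as one way of using the stated proportionality to protein content; the function names and example numbers are hypothetical.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """Exponentially modified protein abundance index.

    PAI is the number of observed peptides divided by the number of
    peptides the protein could theoretically produce.
    """
    if n_observable <= 0:
        raise ValueError("a protein needs at least one observable peptide")
    pai = n_observed / n_observable
    return 10 ** pai - 1


def molar_percent(empai_values: dict) -> dict:
    """Protein content in mol %, assuming emPAI is proportional to molar amount."""
    total = sum(empai_values.values())
    return {name: 100.0 * value / total for name, value in empai_values.items()}


if __name__ == "__main__":
    # Hypothetical proteins: (observed peptides, observable peptides)
    proteins = {"P1": (10, 20), "P2": (5, 20), "P3": (2, 10)}
    values = {name: empai(obs, total) for name, (obs, total) in proteins.items()}
    for name, share in molar_percent(values).items():
        print(name, round(values[name], 3), round(share, 1), "mol %")
```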

An approach termed APEX (absolute protein expression) based on spectral counting can also be used for profiling absolute protein quantities per cell [116]. Important features of APEX are the correction factors that it introduces, providing a relationship of direct proportionality between the numbers of observed and expected peptides.

In addition to the spectral counting-based absolute quantification methods discussed above, several intensity-based methods have also been reported. As incomplete digestion is a critical issue when performing absolute quantification of peptides or proteins [138], an alternative approach (generally known as “Top3”) has been developed that deals with this problem. In this approach, the quantities of the three most abundant tryptic peptides of a protein are averaged [139]. It is generally assumed that some parts of the protein are completely digested and that the three most abundant peptides therefore reflect the protein concentration. The protein sample is spiked with a known amount of standard protein, and, after digestion, the average MS signal response of the standard protein is used to calculate a universal signal response factor (ion counts per mole of protein). This factor is then applied to calculate the concentration of the proteins in the sample to be analyzed [138].
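
The Top3 procedure can be sketched in Python as follows: the three most intense peptides of the spiked standard define a response factor, which is then applied to the Top3 value of each protein of interest. Function names, units (fmol), and intensities are hypothetical and chosen only for illustration.

```python
def top3(peptide_intensities):
    """Average intensity of the three most intense peptides of one protein."""
    if len(peptide_intensities) < 3:
        raise ValueError("Top3 needs at least three quantified peptides")
    return sum(sorted(peptide_intensities, reverse=True)[:3]) / 3


def response_factor(standard_intensities, standard_fmol):
    """Universal signal response factor: Top3 signal per fmol of spiked standard."""
    return top3(standard_intensities) / standard_fmol


def protein_amount_fmol(peptide_intensities, factor):
    """Amount of a protein estimated from its Top3 signal and the response factor."""
    return top3(peptide_intensities) / factor


if __name__ == "__main__":
    # Hypothetical standard protein spiked at 100 fmol.
    factor = response_factor([9e6, 6e6, 3e6, 1e6], standard_fmol=100.0)
    # Hypothetical protein of interest measured in the same run.
    print(protein_amount_fmol([3e6, 3e6, 3e6, 5e5], factor), "fmol")
```

The sketch rejects proteins with fewer than three quantified peptides; how such proteins are treated in practice is a design decision that the text above does not specify.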

In the intensity-based absolute quantitation (iBAQ) algorithm, the summed intensities of the precursor peptides that map to each protein are divided by the number of theoretically observable peptides, which is taken to be all tryptic peptides between 6 and 30 amino acids in length [140]. This operation converts a measure that is expected to be proportional to mass (intensity) into one that is proportional to molar amount (iBAQ). Interestingly, in a comparison of methods, iBAQ and Top3 divided by the number of identified peptides gave the most accurate quantitation, with iBAQ showing less bias when calculating the abundance of smaller proteins [141]. Relative iBAQ (riBAQ), which is the iBAQ value (calculated by MaxQuant) of a protein or protein group divided by the sum of all non-contaminant, non-reversed iBAQ values for a replicate, is equivalent to a normalized molar intensity [142, 143].
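
The iBAQ and riBAQ calculations can be illustrated with the Python sketch below. The in silico digestion rule (cleavage after K or R, but not before P, with no missed cleavages) is an assumption of this sketch and not necessarily the rule used by MaxQuant; the sequence and intensities are hypothetical.

```python
def tryptic_peptides(sequence: str) -> list:
    """Fully cleaved tryptic peptides (assumed rule: cut after K/R, not before P)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def n_observable(sequence: str, min_len: int = 6, max_len: int = 30) -> int:
    """Number of theoretically observable peptides (6 to 30 residues)."""
    return sum(min_len <= len(p) <= max_len for p in tryptic_peptides(sequence))


def ibaq(peptide_intensities, sequence: str) -> float:
    """Summed precursor intensity divided by the number of observable peptides."""
    n = n_observable(sequence)
    if n == 0:
        raise ValueError("protein has no theoretically observable peptide")
    return sum(peptide_intensities) / n


def ribaq(ibaq_values: dict) -> dict:
    """Each iBAQ value divided by the sum of all iBAQ values of the replicate."""
    total = sum(ibaq_values.values())
    return {name: value / total for name, value in ibaq_values.items()}


if __name__ == "__main__":
    seq = "MAAAAAK" "GGGGGGRPLLLLLLK" "TTR"  # hypothetical protein sequence
    print(tryptic_peptides(seq))
    print("observable peptides:", n_observable(seq))
    print("iBAQ:", ibaq([4e6, 2e6], seq))
```

In a real analysis, contaminant and reversed (decoy) entries would be removed before the riBAQ normalization, as stated above.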

The total protein approach (TPA) determines protein copy numbers per cell without protein standards [144]. The method is based on the observation that the 3000 most abundant proteins of the cell already constitute >99% of the proteome mass. Thus, the MS signal (LFQ intensity) of a protein expressed as a fraction of the total MS signal is a good proxy for the fraction of total protein mass that the protein represents. This can then be converted into numbers of molecules per cell by measuring or estimating the volume and protein content of the analyzed cells [144]. A comparison showed that data obtained by the TPA and SILAC-PrEST approaches are similar [144]. Recently, a DIA-based label-free absolute quantification method that uses the TPA algorithm (DIA-TPA) was reported [145].
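
The conversion from signal fraction to copies per cell can be sketched as follows. The sketch assumes that the total protein content per cell is known from an independent measurement or estimate; the function names, the value of 200 pg protein per cell, and the intensities are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def mass_fractions(intensities: dict) -> dict:
    """Fraction of the total MS signal per protein, used as a proxy
    for its share of the total protein mass."""
    total = sum(intensities.values())
    return {name: value / total for name, value in intensities.items()}


def copies_per_cell(mass_fraction: float, mw_da: float, protein_per_cell_pg: float) -> float:
    """Copy number per cell from mass fraction, molecular weight (Da = g/mol),
    and total protein content per cell (pg, assumed to be known)."""
    grams = mass_fraction * protein_per_cell_pg * 1e-12
    return grams / mw_da * AVOGADRO


if __name__ == "__main__":
    signal = {"P1": 1.0e9, "P2": 9.99e11}  # hypothetical LFQ intensities
    fractions = mass_fractions(signal)
    # Hypothetical 50 kDa protein in a cell containing 200 pg of protein.
    n = copies_per_cell(fractions["P1"], mw_da=50_000.0, protein_per_cell_pg=200.0)
    print(f"{n:.3e} copies per cell")
```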

Nowadays, MALDI-MS is still considered by many researchers to be a non-quantitative technique (see Chap. 13). However, MALDI mass spectrometry imaging (MSI) permits label-free in situ analysis of chemical compounds directly from the surface of two-dimensional biological tissue slices [146]. MALDI-MSI is a key label-free technology for quantitative analysis of drugs, metabolites, and formulations as well as biomarkers in tissues [147]. It allows both relative and absolute protein quantification. Absolute quantitative MSI experiments deliver concentration levels of analytes, which are expressed in units of moles or mass quantity of compound per mass/volume or area of tissue. Relative quantification is achieved by visualizing and estimating relative concentration values of the analyte across the tissue or whole body section in comparison with other compounds [146]. Achieving reliable relative quantification by MALDI-MSI requires the application of a signal intensity normalization method, baseline correction (subtraction), and mass spectral realignment (recalibration) [148].

4 Notes

  1. Quantification performance of DIA is superior to that of DDA, especially in terms of reproducibility and accuracy. In DDA, but not in DIA, quantification accuracy decreases at low protein/peptide amounts. However, for DIA-based quantification several issues should be taken into account. The larger the spectral library, the higher the coefficients of variation at the peptide and protein level. In DIA, small fold changes might entail the risk of more false-positive discoveries. For DIA, quantification at the peptide level is preferable because of better differential abundance detection and higher accuracy [20].

  2. For relative quantification using stable isotopes, the quantitative correspondence does not always apply exactly when deuterium is used as a label, as labeling with deuterium can affect retention time in LC [80].

  3. Relative quantification using stable isotope chemical or enzymatic labeling: The labeling procedure has to be optimized to ensure ideal labeling; 100% label incorporation should be aimed at, which might not be achievable for all approaches. Additionally, side reactions should be avoided to prevent erroneous quantification results.

  4. Relative quantification using metabolic labeling: In general, large-scale SILAC experiments use both isotope-coded arginine and lysine to obtain labeling of all possible tryptic peptides, thereby maximizing quantitative coverage of all potential peptides in a given experiment. Quantification using SILAC may be disturbed by the fact that the isotopically labeled amino acid arginine is a metabolic precursor of proline and as such might be converted to labeled proline. As with other labeling approaches, complete incorporation of the heavy label should be aimed at (which should be limited only by the isotopic enrichment of the commercially available labeling sources).

  5. Relative quantification using enzymatic labeling: Under extreme pH conditions in H₂¹⁶O buffers, acid-catalyzed back-exchange could result in partial loss of the 18O label. Therefore, it is recommended that the enzymatic reactions are stopped by addition of protease inhibitors or freezing of the reaction mixture, rather than by acidifying with 10% TFA.

  6. Label-free quantification: The most crucial parameter in label-free quantification is the consistent reproducibility of the LC separation, ionization, and mass measurements of the peptides. All variations of peptide intensities, as well as LC retention times, should be recorded between technical replicates and used for normalization and alignment between runs.

  7. Absolute quantification: First, when peptides from protease digests are to be quantified, complete digestion of the protein sample must be guaranteed. Missed protease cleavages affecting the targeted peptide will result in an artificial decrease in the amounts observed in quantification. Additionally, AQUA peptides are usually obtained in known absolute amounts in lyophilized form and therefore have to be dissolved quantitatively. As it is advisable to add standard peptides after rather than before digestion, any variability and losses during the prior sample preparation should be minimized. Finally, for reliable quantification results, several peptides per targeted protein should be monitored, in order to provide more than one reference value per protein.