Tools for Label-free Peptide Quantification

The increasing scale and complexity of quantitative proteomics studies complicate subsequent analysis of the acquired data. Untargeted label-free quantification, based either on feature intensities or on spectral counting, is a method that scales particularly well with respect to the number of samples. It is thus an excellent alternative to labeling techniques. In order to profit from this scalability, however, data analysis has to cope with large amounts of data, process them automatically, and do a thorough statistical analysis in order to achieve reliable results. We review the state of the art with respect to computational tools for label-free quantification in untargeted proteomics. The two fundamental approaches are feature-based quantification, relying on the summed-up mass spectrometric intensity of peptides, and spectral counting, which relies on the number of MS/MS spectra acquired for a certain protein. We review the current algorithmic approaches underlying some widely used software packages and briefly discuss the statistical strategies for analyzing the data.

Over recent decades, mass spectrometry has become the analytical method of choice in most proteomics studies (e.g. Refs. 1-4). A standard mass spectrometric workflow allows for both protein identification and protein quantification (5) in some form. For a long time, the technology has been used mainly for qualitative assessments of protein mixtures, namely, to assess whether a specific protein is in the sample or not. However, for the majority of interesting research questions, especially in the field of systems biology, this binary information (present or not) is not sufficient (6). The necessity of more detailed information on protein expression levels drives the field of quantitative proteomics (7,8), which enables the integration of proteomics data with other data sources and allows network-centered studies, as reviewed in Ref. 9. Recent studies show that mass-spectrometry-based quantitative proteomics experiments can provide quantitative information (relative or absolute) for large parts, if not the entire set, of expressed proteins (10-12).
Since the isotope-coded affinity tag protocol was first published in 1999 (13), numerous labeling strategies have found their way into the field of quantitative proteomics (14). These include isotope-coded protein labeling (15), metabolic labeling (16,17), and isobaric tags (18,19). Comprehensive overviews of different quantification strategies can be found in Refs. 20 and 21. Because of the shortcomings of labeling strategies, label-free methods are increasingly gaining the interest of proteomics researchers (22,23). In label-free quantification, no label is introduced to any of the samples. All samples are analyzed in separate LC/MS experiments, and the peptide properties of the individual measurements are then compared. Regardless of the quantification strategy, computational approaches for data analysis have become the critical final step of the proteomics workflow. Overviews of existing computational approaches in proteomics are provided in Refs. 24 and 25. The computational label-free quantification workflow is visualized in Fig. 1. Comparing peptide quantities using mass spectrometry remains a difficult task, because mass spectrometers have different response values for different chemical entities, and thus a direct comparison of different peptides is not possible. The computational analysis of a label-free quantitative data set consists of several steps that are mainly split into raw data signal processing and quantification. Signal processing steps comprise data reduction procedures such as baseline removal, denoising, and centroiding.
These steps can be accomplished in modular building blocks, or the entire analysis can be performed using monolithic analysis software. Recently, it has been shown that it is beneficial to combine modular blocks from different software tools into a consensus pipeline (26). The same study also illustrates the diversity of methods that are modularized by different software tools. In another recent publication, monolithic software packages are compared (27). In that study, the authors identify a set of seven metrics: detection sensitivity, detection consistency, intensity consistency, intensity accuracy, detection accuracy, statistical capability, and quantification accuracy. Although these metrics are not independent and software parameter settings are only loosely reported, such comparative studies are of great interest to the field of quantitative proteomics. A general conclusion from these studies is that the choice of software might, to a certain degree, affect the final results of the study.
Absolute quantification of peptides and proteins using intensity-based label-free methods is possible and can be done with excellent accuracy, if standard addition is used. With the help of known concentrations, calibration lines can be drawn, and absolute protein quantities can be directly inferred from these calibration measurements (28). Furthermore, it has been suggested that peptide peak intensities can be predicted and absolute quantities can be derived from these predictions (29). However, the limited accuracy of predictions or the need for peptides of known concentrations limits these approaches to selected proteins/peptides only and prevents their use on a proteome-wide scale.
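The calibration-line idea can be illustrated with a minimal sketch. It assumes a linear detector response and uses an ordinary least-squares fit; the function names, the spiked standard amounts, and the peak areas are hypothetical values chosen for illustration and are not taken from Ref. 28.

```python
def fit_calibration_line(concentrations, intensities):
    """Least-squares fit of intensity = slope * concentration + intercept."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_i = sum(intensities) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (i - mean_i)
              for c, i in zip(concentrations, intensities))
    slope = sxy / sxx
    intercept = mean_i - slope * mean_c
    return slope, intercept


def estimate_concentration(intensity, slope, intercept):
    """Invert the calibration line to read off an unknown concentration."""
    return (intensity - intercept) / slope


# Hypothetical standards (fmol on column) and their measured peak areas.
standards_fmol = [1.0, 2.0, 5.0, 10.0]
peak_areas = [210.0, 410.0, 1010.0, 2010.0]

slope, intercept = fit_calibration_line(standards_fmol, peak_areas)
unknown_fmol = estimate_concentration(810.0, slope, intercept)
```

The estimate is only meaningful inside the concentration range covered by the standards, which is one reason the approach is limited to selected peptides.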
Spectral counting methods have also been used for the estimation of absolute concentrations on a global scale (30), albeit at drastically reduced accuracy relative to intensity-based methods. In one study, the authors used a mixture of 48 proteins with known concentrations and predicted the absolute copy number amounts of thousands of proteins based on that mixture. Despite the fact that large, proteome-wide data sets will dilute the effects of different peptide detectabilities on the individual protein level, such methods will always be limited in their accuracy of quantification.
Label-free quantification is generic: it is not restricted to any model system and can also be employed with tissue or body fluids (31,32). However, the label-free approach is more sensitive to technical deviations between LC/MS runs, as information is compared between different measurements. Therefore, the reproducibility of the analytical platform is crucial for successful label-free quantification. The recent success of label-free quantification could only be accomplished through significant improvements of algorithms (33-36). An increasingly large collection of software tools for label-free proteomics has been published as open source applications or has entered the market as commercially available packages. This review aims at outlining the computational methods that are generally implemented by these software tools. Furthermore, we illustrate strengths and weaknesses of different tools. The review provides an information resource for the broad proteomics audience and does not illustrate all algorithmic details of the individual tools.

MATERIALS AND METHODS
The Nature of LC-MS/MS Data-Quantitative proteomics data from LC/MS and/or LC-MS/MS experiments typically have a large data volume (tens to hundreds of gigabytes per sample are not uncommon), and the data are rather complex. Typically, digested proteins (i.e. complex peptide samples) are separated on a liquid chromatography column and ionized, and the resulting MS spectra are recorded by a mass spectrometer. For MS/MS experiments, peptide ions are selected (based on their intensity or through an inclusion list) for fragmentation and fragment ion spectra are recorded. These MS/MS spectra usually form the basis of the identification (which we do not consider here), but they also can be used for spectral counting.
Depending on the resolution of the mass analyzer, and because the ionization is a stochastic process, even identical ions will not be measured at the exact same m/z; instead they form a distribution of measurements around the true m/z value. This distribution is called a (raw) peak and can be described by a mathematical model (a normal distribution is a good approximation, but not quite sufficient). The process of peak picking or centroiding aims at estimating the parameters of the peak model, such as the centroid, intensity, width, and skew. Centroiding reduces the raw measurement data to a handful of parameters for each compound and, most important, yields a single value for the m/z of the ion. The centroid m/z can be reported as the position of the maximal intensity, or by averaging over m/z (raw data points weighted by intensity). Likewise, the intensity of a peak can be read off as the maximum height from the raw data (the peak apex), or one can compute the area under the curve (i.e. the peak volume). It is important to know whether the data are centroided or not, because some software can handle only one type of input data. Fig. 1 shows a typical data set generated from a biological sample using HPLC-MS and illustrates its multidimensional nature. After being eluted from the column, analytes are continuously injected into the mass spectrometer, which records mass spectra (scans) at high speed. Stacking individual spectra yields a three-dimensional dataset, a so-called map. When peptide mass spectrometry is preceded by liquid chromatography fractionation, the observed signal corresponding to a single charge state of a peptide is actually a two-dimensional intensity distribution in retention time and mass-to-charge. The data points belonging to this distribution are called a feature (e.g. the two-dimensional signal in Fig. 1).
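The two ways of reporting a centroid (apex position versus intensity-weighted mean) and the two ways of reporting an intensity (apex height versus area) can be made concrete with a small sketch. It assumes that the raw data points of a single, already isolated peak are given; the function name and the sample values are hypothetical.

```python
def centroid_peak(mz, intensity):
    """Summarize one raw peak by apex, weighted-mean m/z, height, and area."""
    apex_index = max(range(len(intensity)), key=intensity.__getitem__)
    total = sum(intensity)
    # Centroid as the mean m/z of the raw points, weighted by intensity.
    weighted_mz = sum(m * i for m, i in zip(mz, intensity)) / total
    # Area under the curve by the trapezoidal rule.
    area = sum(0.5 * (intensity[k] + intensity[k + 1]) * (mz[k + 1] - mz[k])
               for k in range(len(mz) - 1))
    return {
        "apex_mz": mz[apex_index],
        "weighted_mz": weighted_mz,
        "height": intensity[apex_index],
        "area": area,
    }


# Hypothetical raw data points of one symmetric peak.
raw_mz = [500.00, 500.01, 500.02, 500.03, 500.04]
raw_intensity = [10.0, 40.0, 100.0, 40.0, 10.0]
peak = centroid_peak(raw_mz, raw_intensity)
```

For a symmetric peak both centroid definitions coincide; for skewed peaks they differ, which is why it matters which convention a given tool uses.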
Computational Methods-Quantification methods can be divided into feature-intensity-based methods and spectral counting methods. In the former, one tries to account for all signals corresponding to a specific charged peptide on the MS level; in the latter, one tries to infer the expression level of the peptide from the number of MS/MS identifications. Map alignment is especially important for feature-intensity-based quantification, whereas in spectral counting one can use the identification of the peptide to assign corresponding quantities across maps. Only accurate alignments of maps enable the correct comparison of quantitative properties. In the following, we describe the main steps that are necessary for label-free data processing.
Signal Processing-Depending on the type of instrument, the processing of the raw data can differ. However, there are certain generic steps in signal processing that apply to most instruments and to both intensity-based methods and spectral counting. These are baseline filtering, noise filtering, centroiding, and charge estimation.
In MALDI spectra, and to some extent in ESI spectra, a baseline is apparent that adds to the signal caused by the analytes. In MALDI spectra, the baseline can become dominant in the low m/z regions and disappears with increasing m/z. It is typically shaped like an exponential decay and can be attributed to matrix material. The baseline leads to poorly resolved peak shapes due to a loss of baseline separation between adjacent peaks. The baseline thus interferes with intensity estimation and has to be removed computationally. Morphological filters such as the Top-hat filter can be used for this task.
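A top-hat filter subtracts the morphological opening (an erosion followed by a dilation) from the signal. The following is a minimal sketch of that idea on a one-dimensional intensity array with a flat structuring element; the function names and the window size are assumptions made for illustration and do not reproduce the implementation of any particular tool.

```python
def sliding(values, half_width, reducer):
    """Apply min or max over a window of +/- half_width points."""
    n = len(values)
    return [reducer(values[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def tophat_baseline_removal(signal, half_width):
    """Subtract the morphological opening (estimated baseline) from the signal."""
    eroded = sliding(signal, half_width, min)    # erosion: running minimum
    opened = sliding(eroded, half_width, max)    # dilation: running maximum
    return [s - b for s, b in zip(signal, opened)]


# Hypothetical spectrum: constant baseline of 5 with one narrow peak.
spectrum = [5.0, 5.0, 5.0, 5.0, 15.0, 5.0, 5.0, 5.0, 5.0]
corrected = tophat_baseline_removal(spectrum, half_width=2)
```

The window has to be wider than the widest peak of interest; otherwise the opening follows the peak and the filter removes signal along with the baseline.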
In addition to the baseline signal, every mass spectrometer suffers from high-frequency noise (electronic noise, usually attributed to the detector, and chemical noise, usually attributed to solvents, buffers, and contaminants), and thus peaks expected to be approximately Gaussian in shape might not be convex any longer. This is a potential pitfall for algorithms that rely on local minima to separate isotope peaks. A noise filter will smooth the data by removing high-frequency noise; for example, a Savitzky-Golay filter will work well.
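As an illustration of Savitzky-Golay smoothing, the sketch below applies the standard 5-point quadratic kernel (-3, 12, 17, 12, -3)/35 to a list of intensities. The fixed window size and the choice to leave the two points at each edge untouched are simplifying assumptions; production code would typically use a library routine with configurable window and polynomial order.

```python
# Standard Savitzky-Golay coefficients for a 5-point window and a
# quadratic (or cubic) local polynomial.
SG_COEFFS = (-3, 12, 17, 12, -3)
SG_NORM = 35


def savgol5(values):
    """Smooth interior points; the two points at each edge are kept as-is."""
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return smoothed


# A quadratic signal passes through the filter unchanged ...
parabola = [float(x * x) for x in range(7)]
parabola_smoothed = savgol5(parabola)

# ... whereas an isolated one-point spike is damped.
spike = [0.0, 0.0, 0.0, 35.0, 0.0, 0.0, 0.0]
spike_smoothed = savgol5(spike)
```

Because the filter fits a local polynomial rather than averaging, it preserves peak height and width better than a plain moving average of the same window size.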
Finally, a signal that has been baseline corrected and smoothed is subjected to centroiding. The computational problem ranges here from almost trivial (e.g. for high-resolution spectra) to a fitting of overlapping (skewed) Gaussians, for example, in the case of highly charged ion trap signals. In general, this fitting is interwoven with the problem of obtaining the (initially) unknown charge state of a peptide, as the charge state z determines the distance of the isotope peaks, namely, 1/z. The resulting model fit can be used to analytically determine the peak volume and the height of the peak. Usually the peak volume is taken as the intensity of a centroided peak, because it corresponds directly to the ion count. However, for high-resolution spectra, the height of the peak (which is easier to determine) serves equally well.
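The relation between isotope spacing and charge can be turned into a simple estimator: pick the charge z whose expected spacing is closest to the observed mean spacing. The sketch assumes centroided isotope peaks of a single peptide and uses approximately 1.00335 Da (the 13C/12C mass difference) as the neutral isotope spacing, which refines the 1/z rule of thumb; the function name and the maximum charge are hypothetical.

```python
# Approximate mass difference between 13C and 12C in Da (assumed constant).
ISOTOPE_SPACING_DA = 1.00335


def estimate_charge(isotope_mz, max_charge=6):
    """Return the charge whose expected isotope spacing best fits the data."""
    spacings = [b - a for a, b in zip(isotope_mz, isotope_mz[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return min(range(1, max_charge + 1),
               key=lambda z: abs(mean_spacing - ISOTOPE_SPACING_DA / z))


# Hypothetical centroided isotope patterns.
doubly_charged = estimate_charge([500.0000, 500.5017, 501.0034])
triply_charged = estimate_charge([400.0000, 400.3345, 400.6690])
```

On low-resolution data the spacing of highly charged ions cannot be resolved reliably, which is why charge estimation and peak fitting are usually solved together there.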
Feature-based Quantification-Algorithmically, the main steps in feature-based quantification can be divided into (i) signal processing, (ii) feature finding, and (iii) map alignment. The advent of high-resolution mass spectrometers has made the signal processing and peak picking tasks simpler than they were on low-resolution instruments. However, the quantification methods are complex, and good quantification remains a challenge.
A central task in the processing of mass spectrometric data is the detection of peptide features for all ions eluted from the liquid chromatography column. Peptides elute over time from the liquid chromatography column, get ionized, and are injected into the mass spectrometer. The mass spectrometer takes new measurements in regular, small time intervals, thereby sampling the amount of the eluting ion over time, resulting in an elution profile. In each measurement, an ion gives rise to a typical isotope pattern, which is caused by its atomic composition (see Fig. 2 for examples of an elution profile and an isotope pattern). Via integration over the elution profile and isotope pattern, peptide feature intensities can be determined. In general, one can assume that the two-dimensional distribution is a product of two independent distributions. Thus, for the marginal distribution over m/z, similar reasoning applies as for individual spectra. Automated detection of these features allows their comparison across different experiments. Fundamental to a quantitative comparison of analytes is the linear correlation of electrospray ionization intensity with ion concentration within a certain dynamic range. Most algorithms try to heuristically determine the extent and intensity of a feature by fitting appropriate distribution models to the data. This is done in areas of high signal intensity (e.g. by working on intensity-sorted lists of peaks). The intensity of a feature can then be determined either by using the model parameters or simply by summing up all peak intensities in the feature region.
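The simpler of the two intensity definitions, summing all peak intensities in the feature region, can be sketched as follows. The sketch assumes centroided peaks given as (retention time, m/z, intensity) tuples and a rectangular feature region that has already been determined by a feature finder; all names and values are hypothetical.

```python
def feature_intensity(peaks, rt_range, mz_range):
    """Sum the intensities of all centroided peaks inside a rectangular region."""
    rt_lo, rt_hi = rt_range
    mz_lo, mz_hi = mz_range
    return sum(intensity for rt, mz, intensity in peaks
               if rt_lo <= rt <= rt_hi and mz_lo <= mz <= mz_hi)


# Hypothetical centroided peaks: (retention time in s, m/z, intensity).
peaks = [
    (100.0, 500.00, 50.0), (100.0, 500.50, 30.0),   # scan 1: two isotope peaks
    (101.0, 500.00, 100.0), (101.0, 500.50, 60.0),  # scan 2: elution apex
    (102.0, 500.00, 40.0), (102.0, 500.50, 25.0),   # scan 3
    (101.0, 620.30, 500.0),                         # unrelated peptide
]
total = feature_intensity(peaks, rt_range=(99.5, 102.5), mz_range=(499.9, 501.1))
```

Summation is robust and fast, whereas a model-based intensity can compensate for truncated or overlapping features at the cost of depending on how well the model fits.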
Spectral Counting-Besides this feature-intensity-based quantification method, spectral counting methods are also used for differential quantification. Although spectral counting is commonly used to derive quantitative information at the protein level, the differential quantification of peptides forms the foundation of this concept. In the following we discuss spectral counting (SC) concepts and illustrate how these concepts are involved in differential peptide quantification. SC in its simplest form counts the number of tandem spectra that are assigned to the same protein. There have been numerous studies using SC for the inference of quantitative information in label-free shotgun proteomics data. A collection of methods has recently been reviewed (37). Peptide-spectrum matches can be used to infer differential ratios of peptides, but these methods are also gaining popularity for differential protein quantification. Methods that extend the simple SC to differential protein quantification include the protein abundance index (38); its extended version, the exponentially modified protein abundance index (39); the normalized spectral abundance factor (40); and the absolute protein expression (41). The robust intensity-based averaged ratio (RIBAR) and its extended version xRIBAR are part of a recent approach by Colaert et al. (36) that correlates the summed intensity of corresponding fragment spectra in two experiments and which has been shown to outperform other SC-based approaches such as the exponentially modified protein abundance index and normalized spectral abundance factor. Despite the development of novel methods to calculate protein abundance from MS/MS spectra, any such approach will struggle to reach high quantification accuracy because of the data-dependent ion sampling and dynamic exclusion list settings.
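To make one of these measures concrete, the sketch below computes the normalized spectral abundance factor in its commonly stated form: the spectral count of each protein is divided by its length, and the result is divided by the sum of these length-normalized counts over all proteins in the run. The protein names, counts, and lengths are hypothetical, and refinements such as the handling of shared peptides are left out.

```python
def nsaf(spectral_counts, protein_lengths):
    """Normalized spectral abundance factor for every protein in one run."""
    # Spectral abundance factor: count divided by protein length.
    saf = {protein: spectral_counts[protein] / protein_lengths[protein]
           for protein in spectral_counts}
    total = sum(saf.values())
    return {protein: value / total for protein, value in saf.items()}


# Hypothetical spectral counts and protein lengths (amino acids).
counts = {"protein_A": 10, "protein_B": 20}
lengths = {"protein_A": 100, "protein_B": 400}
abundance = nsaf(counts, lengths)
```

In this example protein_B has twice as many spectra but is four times longer, so its normalized abundance is half that of protein_A.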
Recently, different label-free abundance measures have been compared, and their results were integrated with RNA expression data (42). Although the feature-based measure was more accurate, the authors found that, if normalized to the transcript abundance, spectral counting and feature-based methods perform equally well. Hoekman et al. (26) implemented a framework that allows the combination of different quantification approaches.
Map Alignment-The purpose of map alignment is to assign the same peptide features between maps for comparison. This is done using the assumption that the chromatographic elution time of a peptide, as well as its ionization behavior, stays relatively constant between measurements and that the measured m/z does not differ. Whereas the differences in the m/z are rather marginal, the shifts in the RT dimension can become very large and frequently show some nonlinearity.
Several algorithmic approaches have been used to adjust for these distortions. Lange and coauthors (43) used pose clustering techniques to find the best parameters for an affine transformation. The approach is simple and robust, but it cannot deal with nonlinear transformations. Descriptions of similar, more recent approaches can be found in Refs. 44 and 45. The approach discussed in Ref. 46 uses the similarity of individual scans to compute a scan-wise alignment, whereas other methods use nonlinear functions to model the shift in retention time.
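A nonlinear retention time correction can be sketched with a piecewise-linear warp through landmark pairs (for example, peptides identified in both maps). This is a deliberately simple stand-in for the spline or LOWESS fits used by actual tools, not a reimplementation of any of the cited methods; the function name and the landmark values are hypothetical, and at least two landmarks with distinct retention times are assumed.

```python
from bisect import bisect_right


def make_rt_warp(landmarks):
    """Build a function mapping retention times of one map onto a reference.

    landmarks: (rt_in_map, rt_in_reference) pairs with distinct rt_in_map.
    """
    points = sorted(landmarks)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def warp(rt):
        # Pick the enclosing segment; the outermost segments are extended
        # linearly for retention times outside the landmark range.
        k = min(max(bisect_right(xs, rt) - 1, 0), len(xs) - 2)
        t = (rt - xs[k]) / (xs[k + 1] - xs[k])
        return ys[k] + t * (ys[k + 1] - ys[k])

    return warp


# Hypothetical landmarks: retention time (s) in the map and in the reference.
warp = make_rt_warp([(100.0, 110.0), (200.0, 205.0), (300.0, 320.0)])
aligned_rt = warp(150.0)
```

An affine model would fit a single line through all landmarks; the piecewise variant lets the shift vary along the gradient, at the price of being more sensitive to wrong landmark pairs.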
Apart from the pairwise alignment of two maps, another important aspect is grouping the correct features together across many maps. A discussion about metrics for map alignment, as well as an overview and assessment of different methods, can be found in Ref. 47.
Figure caption (fragment): Stacked side by side, these spectra form two-dimensional maps. In these maps, individual peptides being eluted from the column give rise to sets of peaks across multiple spectra. Feature-finding algorithms can identify features, which can be defined as all mass-spectrometric signals (peaks) caused by the same peptide. Elution profiles have ideally a Gaussian shape, but they can be significantly distorted. The projection of a feature along the m/z axis accordingly corresponds to the isotope profile of the peptide.
Normalization-Once the peptide features of different maps are assigned to each other after map alignment, one needs to correct for systematic biases in the measured intensities. This is often called "intensity normalization." Normalization is a critical step in the label-free computational proteomics pipeline. It is necessary to account for variability in intensity signals (e.g. systematic errors in experimentation, sample preparation, chromatography, and mass spectrometry (48)). The microarray community has done extensive research in normalization procedures. Callister et al. (48) assessed the performance of four different normalization strategies for label-free proteomics data. They include a global normalization, linear regression, local regression, and quantile normalization. The authors found that normalization metrics need to be adapted depending on the data set. They conclude that quantile normalization has some advantages over other techniques, because no iterations are necessary and it does not force the mean to be zero (in log scale), as successive parts of the data (quantiles) are equalized from run to run. However, in their studies, linear regression models showed the best performance in most cases (49). Global normalization methods use information from all peak intensities per spectrum or run in order to scale the individual intensities. Kultima et al. (49) compared 10 different normalization metrics and showed that linear regression that takes the analysis order into account performed best on three independent peptidomics (analysis of endogenous peptides) data sets. Wang et al. (50) argue that global normalization by a constant factor is feasible, but they caution that only a constant number of the most intense signals should be used for normalization if non-random missing data as a result of instrument detection limits is a concern.
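Quantile normalization can be sketched in a few lines: within each run the intensities are ranked, and every value is replaced by the mean, across runs, of the values that share its rank. The sketch assumes a complete matrix (every peptide quantified in every run) without ties; the function name and numbers are hypothetical, and real data with missing values need additional handling.

```python
def quantile_normalize(runs):
    """Give all runs the same intensity distribution.

    runs: list of equally long intensity lists, one list per LC/MS run,
    with peptides in the same order in every run.
    """
    n = len(runs[0])
    sorted_runs = [sorted(run) for run in runs]
    # Mean of the k-th smallest value over all runs.
    rank_means = [sum(s[k] for s in sorted_runs) / len(runs) for k in range(n)]
    normalized = []
    for run in runs:
        order = sorted(range(n), key=run.__getitem__)
        out = [0.0] * n
        for rank, index in enumerate(order):
            out[index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical intensities of three peptides in two runs.
normalized_runs = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After normalization every run contains exactly the same set of values, only assigned to different peptides, which is what is meant by equalizing the quantiles from run to run.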
Besides the publications by Kultima et al. (49) and Callister et al. (48), additional review articles discuss the issue of normalization of label-free proteomics data (51,52).
Software packages for label-free quantification cover a wide range of normalization techniques, but each package offers only a limited set of methods. Some use normalization on individual maps (mzMine2, Corra), most use a list of matched peptide intensity pairs for normalization, and some provide no information at all. mzMine2 works on single maps and offers multiple normalization schemes (e.g. average intensity and maximum intensity normalization). Additionally, normalization to an internal standard that must be present in all maps is possible. Corra also operates on single raw maps and employs the LIMMA package for normalization before peak picking. MaxQuant and OpenMS' ProteinQuantifier both ensure that the median of peptide ratios is zero (in log space). pView2 uses a "median of medians" normalization. Mascot Distiller offers mean, sum, and median normalization of peptide ratios. Progenesis employs an iterative median-of-ratios approach using a reference map. msInspect uses a linear model based on the highest intensity peptides between multiple runs. The most involved technique is implemented in SuperHirn: maps are split into retention time segments, which are normalized separately. Normalization itself is performed hierarchically based on matched pairs in similar maps.
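Centering the median of the log ratios at zero can be written down directly. The sketch below illustrates the general idea for one pair of maps with matched peptide intensities; it is not the code of MaxQuant or ProteinQuantifier, and the function name and intensity values are hypothetical.

```python
import math
from statistics import median


def median_center_log_ratios(intensities_a, intensities_b):
    """Return log2(b/a) per matched peptide, shifted so the median is zero."""
    log_ratios = [math.log2(b / a) for a, b in zip(intensities_a, intensities_b)]
    shift = median(log_ratios)
    return [ratio - shift for ratio in log_ratios]


# Hypothetical matched peptide intensities in two maps. Most peptides are
# measured twice as intense in map B (a global bias); one is truly regulated.
map_a = [100.0, 200.0, 400.0]
map_b = [200.0, 400.0, 1600.0]
centered = median_center_log_ratios(map_a, map_b)
```

The underlying assumption is that most peptides do not change between samples, so the median ratio reflects technical bias rather than biology.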

SOFTWARE PACKAGES
There is a growing collection of tools for label-free quantification implementing one or several of the techniques discussed in the preceding section. Out of the plethora of available software tools, we have selected several commercial and academic packages that are widely known and (to some extent) maintained. Table I gives an overview of computational tools, as well as information on their licenses, release dates, and input formats.
Some commercial packages such as SIEVE are restricted to the native vendor format and cannot read open community formats like mzML, mzData, or mzXML, which can be easily converted into one another (e.g. via OpenMS/TOPP (34,53) or ProteoWizard (54)). [...] (36) and Census, both of which are freely available. The internal details of Census are unknown, but they involve normalization for protein length and variability. Mascot Distiller and Scaffold are commercial alternatives, with the latter additionally supporting Gene Ontology term annotation. Mascot Distiller supports exponentially modified protein abundance index values, and Scaffold normalizes counts by the total count within the sample, gives access to relative and absolute counts, and allows for filtering rules.
Feature-based methods usually follow similar steps from raw data to protein expression tables (centroiding, feature finding, map alignment, and normalization, as well as protein inference) but differ in the implementation details, which are not always published, even for non-commercial tools. Progenesis and OpenMS/TOPP offer wavelet-based peak picking, suitable for low-resolution data, whereas MaxQuant fits a Gaussian curve and SuperHirn uses a simple local-maxima heuristic. Feature finding in MaxQuant is done using a graph-based approach that iteratively uses the best sub-graphs as predicted by an averagine model. OpenMS/TOPP uses either a wavelet approach based on an averagine model or a model-based approach on centroided data incorporating an RT shape fit and averagine models in the m/z dimension. For map alignment, SuperHirn uses a LOWESS fit, and OpenMS/TOPP uses a linear (affine) model or B-spline driven by either pose clustering or MS2 identification landmarks with respect to a reference map. Similarly, msInspect (58) employs smoothing-spline regression. Progenesis takes a different approach: it first performs map alignment based on centroided data, guided by (user-defined) landmarks. Once a master map of all peak information from all maps is created, features are identified using an isotope-fitting procedure. Statistical post-processing or visualization at the protein level (where inference methods differ widely) is not supported by all tools and in this case must be delegated to dedicated statistical tools such as R. pView (59) has a tight R integration, Corra (60) features plots, and mzMine2 (35) allows for basic analysis procedures (e.g. PCA). Spectrolyzer has potent visualization capabilities and built-in classification and regression functionality.
Almost all packages run on Windows, with the exception of Corra and SuperHirn. Not every package provides a binary installer, so manual compilation might be required. Commercial packages tend to be Windows only; all non-commercial packages support Linux (with the exception of VIPER (61)) (see Table I for details).

CONCLUSION
Quantitative proteomics is highly relevant for systems biology, biomarker discovery, and many other biomedical applications. Among all the methods for differential peptide quantification, label-free approaches provide the highest flexibility, and as a result of recent progress in software and hardware, their dynamic range and accuracy are continuously improving. Both SC and intensity-based measures have been shown to provide good quantification results. The intensity-based measures avoid stochastic effects in ion sampling and are therefore slightly more accurate, and they potentially provide higher reproducibility. SC is easy to implement and fast.
There is a large collection of software solutions that are currently used for label-free peptide quantification, and each comes with different strengths and weaknesses. For users who intend to use standard workflows and do not need to develop algorithms and pipelines themselves, monolithic solutions such as Progenesis or MaxQuant are very suitable tools for fast data analysis. If more flexibility is needed or if an understanding of the underlying algorithms is required, open-source packages have their advantages. Large proteomics labs and core facilities will most likely appreciate the modularity and automation provided by pipeline tools.
A current challenge arises from the increasing number of samples in more and more complex proteomics studies, in particular in clinical proteomics. Although label-free techniques scale well in general, many software tools have issues with these large-scale studies. The sheer amount of data involved (hundreds of LC/MS runs resulting in hundreds of gigabytes of data) certainly causes problems, but there are also algorithmic scalability issues when these maps need to be aligned and linked. Whereas small analyses can be run on laptop computers, studies comprising more than a dozen maps usually require more powerful hardware. Multicore central processing units with a large amount of random access memory (64+ GB) and a generous amount of hard disk space are recommended for these larger studies.
Although there is still room for improvement, software tools for label-free quantification have reached a level of sophistication that makes their use convenient and reliable for most purposes. In many cases, label-free quantification is thus a good alternative to labeling techniques in quantitative proteomics.