Ultra-sensitive, high throughput and quantitative proteomics measurements

https://doi.org/10.1016/j.ijms.2004.09.024

Abstract

We describe the broad basis and application of an approach for very high throughput, ultra-sensitive, and quantitative proteomic measurements based upon the use of ultra-high performance separations and mass spectrometry (MS). An overview of the accurate mass and time (AMT) tag approach and a description of the incorporated data analysis pipeline necessary for efficient proteomic studies are presented. Adjunct technologies, including stable-isotope labeling methodologies and improvements in the utilization of liquid chromatography (LC)–MS peak intensity information for quantitative purposes are also discussed. Related areas include the use of automated sample handling for improving analysis reproducibility, methods for using information from the separation for more confident peptide peak identification, and the utilization of smaller diameter capillary columns having lower volumetric flow rates to increase electrospray ionization efficiency and allow for more predictable and quantitative results. The developments are illustrated in the context of studies of complex biological systems.

Introduction

With recent advances in genome sequencing, the biological research paradigm is rapidly shifting toward an understanding of biology that benefits from a global “systems” perspective. As a result, biology is evolving from a largely qualitative descriptive science to a quantitative and ultimately predictive science in which the ability to collect and productively use large amounts of biological data is crucial. Developing a systems-level understanding of how an organism functions benefits greatly from global measurements of proteins because of their primary role in nearly all cellular processes. Interest in proteome-wide measurements ranges, for example, from the analysis of human plasma for the identification of diagnostic biomarkers to the characterization of biochemical pathways of microorganisms important for environmental bioremediation or the understanding of human host-microbial pathogen interactions.

Proteomics measurements that yield insight into biochemical processes can lead us to new global predictive computational models that provide a more solid basis for understanding environmental and human health. However, to successfully reach this stage of predictive modeling, major advances in the ability to measure these highly complex systems are still required. It is expected that numerous proteomics measurements (e.g., time course, comparative disease states, a range of environmental perturbations, etc.) will be needed to provide sufficient data for understanding even the simplest biological systems, which can involve many thousands of different gene products. mRNA expression studies using microarrays have similarly shown that hundreds or thousands of measurements are often essential to support even relatively modest scientific objectives. In addition to throughput, data quality is highly important, since the more quantitative and reproducible the measurements, the fewer measurements are needed to achieve a given objective.

At present, most global proteomics measurements are qualitative in nature, providing little more than “parts lists” of proteins with uncertain quality and limited information on co- and post-translational modifications. While these more qualitative proteome measurements can be useful, they generally have significant limitations. First, the likelihood that both “false positive” and “false negative” identifications will result from these measurements is substantial; their levels of confidence are often ill defined, and provided only in qualitative terms. These uncertainties are greatest for lower abundance proteins, where measurement quality (e.g., the signal to noise ratio) is lower or where related factors (e.g., the identification of only a single tryptic peptide for a protein) degrade confidence in the identification, but again in a poorly defined manner. Second, because protein detection depends significantly on the sensitivity (and other details) of a specific measurement, relatively small run-to-run variations in detection limits and other aspects of sample handling and instrument performance can result in significant changes in the proteins detected. Important biological processes potentially associated with changes in protein abundances may be obscured by measurement noise, and extensive sets of replicates may be needed to achieve acceptable levels of confidence in such cases.

Section snippets

Proteome analysis technologies

Proteome measurement capabilities with the desired comprehensive quality and quantity clearly require further advances in measurement throughput and data quality. The most mature proteome analysis technology is based on separations using two-dimensional polyacrylamide gel electrophoresis (2D PAGE), in conjunction with protein identification using mass spectrometric (MS) analysis and available protein or genome sequence data [1]. However, proteome coverage with 2D PAGE is problematic for

Accurate mass and time tag approach

To overcome the “too many peaks, too little time” bottleneck we developed a strategy that increases throughput by avoiding routine MS/MS measurements. The technical foundation for this strategy involves advanced separations combined with very accurate mass spectrometric measurements; in particular, ultra-high pressure capillary LC combined with Fourier transform ion cyclotron resonance (FTICR) mass spectrometry, and a supporting data analysis and management infrastructure.
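As a rough illustration of how identification without routine MS/MS could work in principle, the sketch below assigns an observed LC-MS feature to a previously established tag when both its measured mass and its elution time agree within set tolerances. This is a minimal sketch, not the authors' pipeline: the names `AmtTag`, `Feature`, and `match_feature`, the use of a normalized elution time on a 0 to 1 scale, and the tolerance values (5 ppm, 0.02) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AmtTag:
    """Hypothetical database entry: a peptide with a reference mass and elution time."""
    peptide: str
    mass: float  # monoisotopic mass in Da
    net: float   # normalized elution time, assumed to lie in [0, 1]


@dataclass(frozen=True)
class Feature:
    """Hypothetical observed LC-MS feature."""
    mass: float
    net: float


def ppm_error(observed: float, reference: float) -> float:
    """Relative mass error in parts per million."""
    return (observed - reference) / reference * 1e6


def match_feature(
    feature: Feature,
    tags: List[AmtTag],
    mass_tol_ppm: float = 5.0,  # illustrative tolerance, not from the source
    net_tol: float = 0.02,      # illustrative tolerance, not from the source
) -> Optional[AmtTag]:
    """Return the tag closest in mass among those inside both tolerance windows."""
    candidates = [
        tag
        for tag in tags
        if abs(ppm_error(feature.mass, tag.mass)) <= mass_tol_ppm
        and abs(feature.net - tag.net) <= net_tol
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda tag: abs(ppm_error(feature.mass, tag.mass)))
```

Requiring agreement in both dimensions is what lets two peptides of nearly identical mass be told apart when they elute at different times; a feature that matches in mass but not in time is left unassigned rather than guessed.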

The use of FTICR

Proteomic data analysis pipeline

Proteomic analysis of biological samples—identifying peptides and proteins, and quantifying their abundances—using MS technology generally produces large volumes of data. An analysis often provides thousands of separate MS or MS/MS spectra during the LC separation steps. Data analysis tools are used for performing database searches to identify peptides (e.g. from MS/MS datasets or AMT tags), interpret and extract detected masses from MS datasets, and assign peptide identifications to MS

Quantitation strategies

Proteome measurements often involve comparing protein abundances between two cellular populations that differ as a result of some change or perturbation. For comparative studies that employ stable-isotope labeling, the AMT tag strategy can increase throughput and precision by directly comparing two proteomes in the same analysis, such as comparing perturbed systems to a common “reference proteome”. A stable-isotope labeled (e.g., 15N or 18O labeled) reference proteome provides an effective
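A comparison against a labeled reference can be pictured as a ratio of the two peak intensities observed for the same peptide in a single analysis. The following is a minimal sketch under stated assumptions rather than the method used in this work: the function names `abundance_ratio` and `log2_ratio` are hypothetical, intensities are taken as already extracted and paired, and a pair is treated as unquantifiable when either member is missing.

```python
import math
from typing import Optional


def abundance_ratio(unlabeled: float, labeled: float) -> Optional[float]:
    """Ratio of unlabeled to labeled peak intensity for one peptide pair.

    Returns None when either intensity is missing (zero or negative),
    because a ratio cannot be formed from a single member of the pair.
    """
    if unlabeled <= 0 or labeled <= 0:
        return None
    return unlabeled / labeled


def log2_ratio(unlabeled: float, labeled: float) -> Optional[float]:
    """Log2 of the abundance ratio, so equal-sized increases and decreases are symmetric."""
    ratio = abundance_ratio(unlabeled, labeled)
    return None if ratio is None else math.log2(ratio)
```

For example, intensities of 4.0e6 (unlabeled) and 1.0e6 (labeled reference) give a ratio of 4.0 and a log2 ratio of 2.0, whereas a pair with one missing intensity yields no ratio at all.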

MS peak intensity based quantitation

Comparative measurements based on isotopic labeling generally require that both versions of the peptide (labeled and unlabeled) be detected, but often only one of the two is detected (e.g., when there are large changes in relative protein abundance between the two labeled samples or low signal-to-noise levels for the measurements). Approaches based upon the use of peak intensities are attractive for this purpose, but peptide abundance measurements obtained

Electrospray ionization efficiency

The conditions under which ionization suppression occurs are relatively well understood and are related to both analyte concentration and ESI volumetric flow rates [62], [63], [64]. Large, conventional flow rates result in greater compound-to-compound variation due to the increased analyte competition for charge and the proximity to the electrospray droplet surface. Converting the electrospray ionization from a conventional flow rate to a nanoflow regime (see Fig. 12 [62]) produces smaller

Sensitivity and dynamic range for quantitative measurements

The use of smaller i.d. capillary columns considerably improves sensitivity, while also improving the practical dynamic range of measurements when the absolute sample size is constrained. To examine both the sensitivity and the range of relative protein abundances measurable for complex proteomic samples, we examined a tryptic digest of a mixture containing a 10^6:1 difference in protein abundances for two standards (75 femtomoles cytochrome c, and 75 zeptomoles bovine serum albumin) and 5 ng of

Future directions

The field of proteomics continues to advance, increasingly driven by enhanced MS instrumentation, computational technologies, and quantitative methodologies. As capabilities continue to mature, they will push the limits of dynamic range detection, efficiency, and quantitation, while providing faster analyses with more reproducibility, and result in a parallel increase in data production. Improvements in the data pipeline, e.g., improved storage, processing, and analysis techniques, will be

Acknowledgments

We thank the U.S. Department of Energy (DOE) Office of Biological and Environmental Research and the NIH National Center for Research Resources (Grant RR018522) for supporting portions of this research. We also thank the Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by DOE and located at Pacific Northwest National Laboratory (PNNL) for use of instrumentation. PNNL is operated by Battelle for the DOE under Contract No. DE-AC06-76RLO 1830.

References (86)

  • D.J. Pappin et al.

    Curr. Biol.

    (1993)
  • P. James et al.

    Biochem. Biophys. Res. Commun.

    (1993)
  • J.R. Yates et al.

    Anal. Biochem.

    (1993)
  • L. Andersson et al.

    Anal. Biochem.

    (1986)
  • H.J. Cooper et al.

    J. Am. Soc. Mass Spectrom.

    (2002)
  • M.V. Gorshkov et al.

    J. Am. Soc. Mass Spectrom.

    (1998)
  • J.K. Eng et al.

    J. Am. Soc. Mass Spectrom.

    (1994)
  • D.M. Horn et al.

    J. Am. Soc. Mass Spectrom.

    (2000)
  • S.P. Gygi et al.

    Biotechnology

    (2000)
  • X. Yao et al.

    J. Proteome Res.

    (2003)
  • M. Heller et al.

    J. Am. Soc. Mass Spectrom.

    (2003)
  • M.E. Belov et al.

    J. Am. Soc. Mass Spectrom.

    (2004)
  • M.S. Wilm et al.

    Int. J. Mass Spectrom. Ion Process.

    (1994)
  • A. Schmidt et al.

    J. Am. Soc. Mass Spectrom.

    (2003)
  • L. Pasa-Tolic et al.

    J. Am. Soc. Mass Spectrom.

    (2002)
  • N.L. Anderson et al.

    Mol. Cell. Proteomics

    (2002)
  • M.R. Wilkins, K.L. Williams, R.D. Appel, D.F. Hochstrasser (Eds.), Proteome Research: New Frontiers in Functional...
  • S.P. Gygi et al.

    Proc. Natl. Acad. Sci. U.S.A.

    (2000)
  • A. Shevchenko et al.

    Anal. Chem.

    (1996)
  • M. Wilm et al.

    Nature

    (1996)
  • W.J. Henzel et al.

    Proc. Natl. Acad. Sci. U.S.A.

    (1993)
  • M. Mann et al.

    Biol. Mass Spectrom.

    (1993)
  • K.A. Cox et al.

    Biol. Mass Spectrom.

    (1992)
  • M.J. Huddleston et al.

    Anal. Chem.

    (1993)
  • K. Jonscher et al.

    Rapid Commun. Mass Spectrom.

    (1993)
  • J.A. Loo et al.

    Science

    (1990)
  • J.A. Loo et al.

    Anal. Chem.

    (1991)
  • R.D. Smith et al.

    Anal. Chem.

    (1990)
  • A.J. Tomlinson et al.

    J. Liq. Chromatogr.

    (1995)
  • A.L. McCormack et al.

    Anal. Chem.

    (1997)
  • J.R. Yates et al.

    Anal. Chem.

    (1996)
  • A. Ducret et al.

    Protein Sci.

    (1998)
  • A.J. Link et al.

    Electrophoresis

    (1997)
  • J.R. Yates

    J. Mass Spectrom.

    (1998)
  • M.P. Washburn et al.

    Nat. Biotechnol.

    (2001)
  • H. Liu et al.

    BioTechniques

    (2002)
  • D.A. Wolters et al.

    Anal. Chem.

    (2001)
  • S.B. Ficarro et al.

    Nat. Biotechnol.

    (2002)
  • W. Weckwerth et al.

    Rapid Commun. Mass Spectrom.

    (2000)
  • M.B. Goshe et al.

    Anal. Chem.

    (2001)
  • Y. Oda et al.

    Nat. Biotechnol.

    (2001)
  • H. Zhou et al.

    Nat. Biotechnol.

    (2001)
  • W.J. Qian et al.

    Anal. Chem.

    (2003)