Urinary Proteome Analysis using Capillary Electrophoresis Coupled to Mass Spectrometry: A Powerful Tool in Clinical Diagnosis, Prognosis and Therapy Evaluation

Proteome analysis has emerged as a powerful tool to decipher (patho)physiological processes, resulting in the establishment of the field of clinical proteomics. One of the main goals is to discover biomarkers for diseases from tissues and body fluids. Due to the enormous complexity of the proteome, a separation step is required for mass spectrometry (MS)-based proteome analysis. In this review, the advantages and limitations of protein separation by two-dimensional gel electrophoresis, liquid chromatography, surface-enhanced laser desorption/ionization and capillary electrophoresis (CE) for proteomic analysis are described, focusing on CE-MS. CE-MS enables separation and detection of the small molecular weight proteome in biological fluids with high reproducibility and accuracy in a single processing step and in a short time. As sensitive and specific single biomarkers generally may not exist, a strategy to overcome this diagnostic void is to shift from single analyte detection to simultaneous analysis of multiple analytes that together form a disease-specific pattern. Such approaches, however, are accompanied by additional challenges, which we will outline in this review. Besides the choice of adequate technological platforms, a high level of standardization of proteomic measurements and data processing is also necessary to establish proteomic profiling. In this regard, demands concerning study design, choice of specimens, sample preparation, proteomic data mining, and clinical evaluation should be considered before performing a proteomic study.


Requirements for clinical proteomic studies
In initial clinical proteomic studies, which applied relatively crude methods and analyzed a limited number of study subjects, differences between patients and controls could be observed (1,2). However, it soon became evident that the dynamics of the human proteome are far more complex than expected, and that differences of a similar order of magnitude can be found when comparing controls with controls. As presented schematically in Figure 1, many organizational, technical, and procedural aspects should be taken into consideration for the successful completion of a clinical proteomic study. These were outlined in detail recently (3) and will be discussed briefly here.

Study design
Study design must involve proper selection of patient and control groups, choice of sample material, and the selection of a proteomic platform that fulfills the methodological requirements. Further, cooperation between clinicians, statisticians, clinical chemists and basic scientists is required (3). It is generally obsolete to use samples from healthy individuals as the sole control group; patients with similar clinical characteristics/comorbidities must be included as additional controls. Individuals included in the study have to be selected carefully in order to exclude the possibility for marker identification to be influenced by drug administration and other therapeutic regimens.
For reproducibility and comparability of clinical proteome analyses, it is of major importance to minimize variability of sample collection, handling, and storage. Another challenging task is to establish a uniform sampling protocol that is equally applicable to any sample, irrespective of its physical and biochemical properties. The options to control processes of sample collection and preparation are limited, particularly when multiple centers are involved and samples are transferred from the site of collection to the site of analysis. The best strategy to overcome these uncertainties of sample acquisition appears to be to perform large studies that provide sufficient information to decide which markers lie within or exceed the normal range of analytical and biological variability.
This also argues for the establishment of a public database, where essential information, such as the patient's proteome profile and all clinical information, can be deposited. An example of the organization of proteomic and clinical data using a database system is shown in Figure 2. Ideally, with the help of appropriate software, patient groups can be selected retrospectively for inclusion in a comparative study on the basis of the recorded clinical data and by a direct linkage to the patient's proteomic profile. A first step in this direction is the recently reported human urinary proteome database (4).
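The linkage between clinical records and proteomic profiles described above can be illustrated with a minimal sketch. It is not the layout of the database in Figure 2 or of the urinary proteome database (4); the table names, column names, and demonstration rows are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the structure of any published proteome database.
SCHEMA = """
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    diagnosis  TEXT NOT NULL,
    age        INTEGER NOT NULL
);
CREATE TABLE sample (
    sample_id  INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
    specimen   TEXT NOT NULL
);
CREATE TABLE peptide_signal (
    sample_id  INTEGER NOT NULL REFERENCES sample(sample_id),
    peptide_id INTEGER NOT NULL,
    intensity  REAL NOT NULL,
    PRIMARY KEY (sample_id, peptide_id)
);
"""


def build_demo_db():
    """Create an in-memory database filled with invented demonstration rows."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.executemany(
        "INSERT INTO patient VALUES (?, ?, ?)",
        [(1, "disease", 61), (2, "disease", 48), (3, "control", 55)],
    )
    con.executemany(
        "INSERT INTO sample VALUES (?, ?, ?)",
        [(10, 1, "urine"), (11, 2, "urine"), (12, 3, "urine")],
    )
    con.executemany(
        "INSERT INTO peptide_signal VALUES (?, ?, ?)",
        [(10, 501, 1200.0), (10, 502, 310.0),
         (11, 501, 950.0), (12, 501, 80.0), (12, 502, 400.0)],
    )
    con.commit()
    return con


def select_profiles(con, diagnosis, min_age=0):
    """Return {sample_id: {peptide_id: intensity}} for patients matching
    the clinical filter (diagnosis and minimum age)."""
    rows = con.execute(
        """
        SELECT s.sample_id, ps.peptide_id, ps.intensity
        FROM patient AS p
        JOIN sample AS s ON s.patient_id = p.patient_id
        JOIN peptide_signal AS ps ON ps.sample_id = s.sample_id
        WHERE p.diagnosis = ? AND p.age >= ?
        ORDER BY s.sample_id, ps.peptide_id
        """,
        (diagnosis, min_age),
    ).fetchall()
    profiles = {}
    for sample_id, peptide_id, intensity in rows:
        profiles.setdefault(sample_id, {})[peptide_id] = intensity
    return profiles
```

Calling `select_profiles(build_demo_db(), "disease", min_age=50)` would return only the profile of the one invented patient who meets both criteria. This is the kind of retrospective group selection the paragraph has in mind.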

Source of specimen
Generally, proteomic profiling can be performed with tissue extracts, cell lysates, or body fluids (such as blood, urine, cerebrospinal fluid, and saliva) as biological materials. Due to proximity to the primary site of the disease, affected cells and tissues should contain the highest amounts of biomarkers. This property makes them an ideal source for biomarker definition. Unfortunately, these are not easily accessible and require invasive or surgical methods for sampling. This also applies to samples from cerebrospinal, synovial, and body cavity fluids. The poor accessibility of tissue is reflected by the fact that to date proteomic studies using tissue samples are rare (5,6). However, as outlined recently by Lescuyer et al. (7), such approaches may be more successful than serum (or plasma)-based approaches, which, as the authors point out, have not resulted in measurable diagnostic success to date.
Malignant transformation and increased death of cells in disease-affected tissues and organs are associated with diffusion of tissue- and organ-specific proteins into the extracellular space and into the blood circulation. The role of blood as a transporter for molecules to and from tissues, together with the ease of sample collection, makes blood an attractive source for biomarker discovery. Unfortunately, there are a number of problems to overcome before blood can be effectively used for clinical proteomic research (8)(9)(10). Complete identification of the human plasma proteome is currently impossible due to the high dynamic range between low- and high-abundance proteins. By removal of highly abundant proteins, most notably albumin and IgG, this dynamic range can be reduced, but this also creates new challenges, such as loss of low-abundance proteins through their binding to albumin or to the resin of the subtraction column. As reported by Kolch et al. (11), persistent proteolytic activity in the blood sample is another source of experimental variability, making meaningful comparison of serum proteome data between individual samples even more challenging. These reports and considerations suggest that, while blood is certainly the richest source of biomarkers, it is highly unlikely that these can actually be identified using today's technologies.
Figure 2 Composition of a database for storage and retrieval of protein/peptidome profiles, protein/peptide sequences, and patient clinical records allowing sample selection and differential proteomic profiling for biomarker discovery.
Via proteolytic degradation, stable, protease-resistant peptide fragments are generated, traversing epithelial barriers and passing into the urine by glomerular filtration. Although peptides and small proteins of the blood, which enter the lumen of the renal tubule, are mostly reabsorbed by proximal tubular cells, a small quantity escapes this process and is excreted into urine. As a consequence, urine can be used as a biomarker source for various diseases (10,15).
Urine as a sample matrix provides several advantages. First, it is non-invasively accessible and can be obtained in large quantities. Second, urine is relatively stable in its composition if handled properly. As reported by Schaub et al. (12), this is also the case after long storage times. Third, since urine represents the ultrafiltrate of plasma, the composition of the urinary peptidome is highly susceptible to changes caused not only by renal but also by a wide range of non-renal diseases including cardiovascular, autoimmune and infectious diseases as well as certain types of cancer.

Sample preparation
Since biological fluids are complex in their composition, their preanalytical processing is a prerequisite for an efficient MS-based analysis. Currently, a plethora of different protocols for sample preparation for different tissues and body fluids exist, and it is beyond the scope of this manuscript to review them. Highly standardized and reproducible preparation protocols with the ability to eliminate interfering compounds are required to ensure analytical reproducibility. Furthermore, it should be kept in mind that each additional step in sample preparation will introduce new, additional artifacts. Hence, preanalytical manipulation should be robust and kept to a minimum.

Biomarker quantification
A limitation of proteomic methods with respect to their clinical applicability was the inability to specify the amount of a particular protein or peptide in a given sample. To solve this problem, several MS-based quantification strategies have been developed, many of which are based on the use of isotope-coded labels (13). These, however, are time-consuming and expensive. As a consequence, further efforts were made to develop more accurate and simpler quantification strategies based on signal intensity/ion counting (14). While these measures, in contrast to stable isotope labelled internal standards (15), do not permit absolute analyte quantification, they were specially adapted to perform relative quantification with acceptable deviation characteristics (+/-10%, (16) and own observation). In recent experiments, both absolute and relative quantification gave highly similar results, with clear methodological advantages of the ion counting approach (17), as it avoids direct manipulation of the specimens. In addition, ion counting procedures are not limited to the analysis of biomarkers with known peptide sequence, making them an ideal tool for biomarker discovery.

Proteomic data mining
Due to the large amount of information provided by a single proteome analysis, adequate software solutions are required to correctly interpret and process proteomic data. Mandatory features include the ability to determine the charge of a particular peak, to identify and combine peaks of the same analyte at different charge states, and to perform efficient standardization/normalization to compensate for differences between individual measurements.
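Two of the features listed above, charge determination and the combination of peaks of the same analyte at different charge states, can be sketched as follows. This is a simplified illustration, not the algorithm of any particular software package. It assumes positive-mode ions carrying z protons, charge assignment from the spacing of isotope peaks, and a hypothetical mass tolerance of 50 ppm; the function names are invented for this example.

```python
PROTON_MASS = 1.007276       # Da, mass of a proton
ISOTOPE_SPACING = 1.003355   # Da, mass difference between 13C and 12C


def charge_from_isotope_spacing(mz_peaks):
    """Estimate the charge z of one isotope cluster.

    Neighbouring isotope peaks lie about ISOTOPE_SPACING / z apart on the
    m/z axis, so z is the rounded ratio of the two spacings.
    """
    peaks = sorted(mz_peaks)
    if len(peaks) < 2:
        raise ValueError("need at least two isotope peaks")
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return max(1, round(ISOTOPE_SPACING / mean_spacing))


def neutral_mass(mz, z):
    """Neutral mass of an ion observed at m/z with z attached protons."""
    return z * (mz - PROTON_MASS)


def combine_charge_states(signals, tol_ppm=50.0):
    """Merge signals that belong to the same analyte.

    signals: iterable of (mz, z, intensity) tuples.
    Signals whose neutral masses agree within tol_ppm (an assumed
    tolerance) are merged; their intensities are summed.
    Returns a list of (mean_neutral_mass, summed_intensity), sorted by mass.
    """
    by_mass = sorted((neutral_mass(mz, z), inten) for mz, z, inten in signals)
    groups = []  # each entry: (running mean mass, summed intensity, count)
    for mass, inten in by_mass:
        if groups and abs(mass - groups[-1][0]) / groups[-1][0] * 1e6 <= tol_ppm:
            ref, total, n = groups[-1]
            groups[-1] = ((ref * n + mass) / (n + 1), total + inten, n + 1)
        else:
            groups.append((mass, inten, 1))
    return [(mass, total) for mass, total, _ in groups]
```

For example, a peptide of 2000 Da observed as a doubly charged ion near m/z 1001.007 and as a triply charged ion near m/z 667.674 would be merged into a single entry. Normalization between measurements, the third feature named above, is deliberately left out of this sketch.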
Biomarkers are defined from these datasets, generally on the basis of multivariate statistical analyses. The subsequent classification algorithm may be based, among others, on linear discriminant analysis (18) or support vector machines (19). As with any classification procedure, these methods have their own advantages and drawbacks. It should be noted, however, that neither of these supervised learning methods includes a variable selection procedure per se. Therefore, statistical evaluation of the different peptides appears mandatory. Nevertheless, a given biomarker showing statistical significance does not automatically perform well as a class-discriminating item. Considering the high dimensionality of the statistical problems, the statistical analysis must correct for the multiple testing artifacts inherent to such analyses. An example may illustrate why this is of utmost importance. Assume that $n$ independent tests are performed at a critical significance level of 0.05 and that all $n$ null hypotheses are true. The probability for a single test to come to a non-significant result (that is, a correct conclusion) is hence $1 - 0.05 = 0.95$ (95%). Since the $n$ tests are independent of each other, the probability that all $n$ tests correctly retain their null hypotheses is the product of the single probabilities: $0.95 \times \dots \times 0.95 = 0.95^{n}$. Hence, the probability of wrongly rejecting at least one of the $n$ null hypotheses is $1 - 0.95^{n}$. If our experiment involves 200 tests on 200 biomarkers, the experiment-wide error probability is $1 - 0.95^{200} = 0.999965$. In other words, it is almost certain that, when performing 200 tests on 200 biomarkers, at least one, and most likely several, of the findings declared significant are false positives. Because of the test independence, the number of such false positives among $n$ biomarkers follows the binomial distribution with the significance level $\alpha$ as the probability of »success« (i.e. having a false positive).
In the example of 200 biomarkers tested at the significance level of 0.05, the probability of at least $k = 5$ false positives amounts to 0.97355. Even for $k = 8$, the probability of at least that many false positives is still 0.78669. Bonferroni corrections, and their relatives such as the Holm procedure, are the most widespread approach for controlling the experiment-wide false positive rate (20). Distribution-free resampling methods, such as that of Westfall and Young (21), are excellent methods to control the experiment-wide error rate. A major drawback of these procedures is that they may lack sufficient statistical power. This has led Benjamini and Hochberg to introduce the elegant approach of the false discovery rate, which conserves more statistical power (22).
These reports and the application of statistical methods on a theoretical example shown above clearly underline the importance of using proper statistics. If not strictly observed, the data obtained will likely hold no value, and will be proven invalid in the next set of experiments.
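The figures in the 200-biomarker example, and the difference in power between the correction procedures named above, can be reproduced with a short sketch. It uses only the Python standard library. The function names and the list of p-values are hypothetical and chosen for illustration, and the implementations are textbook versions of the Bonferroni, Holm, and Benjamini-Hochberg procedures, not the code used in the cited studies.

```python
from math import comb


def familywise_error(n_tests, alpha=0.05):
    """Probability of at least one false positive among n independent tests
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def prob_at_least_k_false_positives(n_tests, k, alpha=0.05):
    """Upper binomial tail P(X >= k) for X ~ Binomial(n_tests, alpha)."""
    below_k = sum(
        comb(n_tests, i) * alpha ** i * (1.0 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1.0 - below_k


def bonferroni(p_values, alpha=0.05):
    """Reject H0_i if p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Step-down Holm procedure: compare the sorted p-values with
    alpha / m, alpha / (m - 1), ... and stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject


def benjamini_hochberg(p_values, q=0.05):
    """Step-up Benjamini-Hochberg procedure controlling the false
    discovery rate at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0  # largest 1-based rank k with p_(k) <= k / m * q
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff = rank
    reject = [False] * m
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


# Invented p-values for eight hypothetical peptides.
EXAMPLE_P_VALUES = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
```

With these definitions, `familywise_error(200)` evaluates to about 0.999965, and `prob_at_least_k_false_positives(200, 5)` and `prob_at_least_k_false_positives(200, 8)` evaluate to about 0.97355 and 0.78669. On the invented p-values, Bonferroni and Holm each reject one hypothesis, whereas Benjamini-Hochberg rejects two, which illustrates the gain in power.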

Clinical validation
The performance of the biomarker(s), defined as the ability to correctly classify samples into healthy and diseased subjects, may best be expressed as a Receiver Operating Characteristic (ROC) plot, since this will indicate the degree of overlap between the two groups by plotting the sensitivity against 1 - specificity at each threshold level (23). ROC plot analysis is the method of choice, since it is independent of the prevalence of the disease in the sample cohort.
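A minimal sketch of how such a ROC plot is constructed is given below. The classification scores and labels are invented for illustration, and the function names are hypothetical. The sketch assumes that a higher score indicates disease and that both groups are represented in the data.

```python
def roc_curve(scores, labels):
    """Return ROC points as (1 - specificity, sensitivity) pairs.

    scores: classifier outputs, higher meaning more disease-like (assumed).
    labels: 1 for diseased, 0 for control.
    Every distinct score is used once as the decision threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both diseased and control samples are required")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Invented example: four diseased (1) and four control (0) samples.
EXAMPLE_SCORES = [0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]
EXAMPLE_LABELS = [1, 1, 0, 1, 1, 0, 0, 0]
```

For the invented data, `auc(roc_curve(EXAMPLE_SCORES, EXAMPLE_LABELS))` gives 0.875. Adding a second copy of every control sample changes the prevalence but leaves the curve and the area unchanged, because sensitivity and specificity are each computed within one group only.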
After identification of candidate biomarkers in a training set of well-defined samples, their discriminating and prognostic value must be validated in a second, independent sample set to prevent overfitting of training set data. Additionally, the validation process should test the biomarker's ability to discriminate between the disease of interest and other diseases and health conditions. For example, heat shock proteins, in particular HSP70, which were described as biomarkers for certain types of cancer (24)(25)(26), are frequently released by affected cells after the onset of other, non-cancer associated stimuli, such as oxidative stress, heavy metals, tobacco smoke, and metabolic poisons (27). Therefore, it is difficult to utilize HSP up-regulation as a specific tumor marker.

MS-based proteome analysis
Two-dimensional separation of proteins according to their isoelectric point and molecular weight in sodium dodecylsulfate polyacrylamide gels (SDS-PAGE), first described by O'Farrell (28), provided the basis for proteomic research. It soon became clear that high resolution protein separation is only one part of the solution. The other is unambiguous identification of the proteins. Initially, proteins excised from the gel were examined by Edman degradation, a tedious and often unsuccessful procedure. Therefore, whenever possible, detection was performed by nitrocellulose transfer and staining with specific antibodies (29). Subsequently, the implementation of mass spectrometry led to a step-by-step identification of hundreds of proteins based on a proteolytic in-gel digest, gel extraction, and MS analysis of the resultant peptide fragments (30).
In recent years, two major technologies emerged that allowed high-throughput screening of proteomes. These two methods are proteomic microarrays and separation technologies coupled to mass spectrometry. Protein-detecting microarrays rely on the development of antibody engineering technologies and automated spotting techniques for biomolecule immobilization onto solid supports and will not be covered here (for more details see (31)). Electrophoretic and chromatographic separation technologies, such as two-dimensional electrophoresis (2DE), liquid chromatography (LC), surface-enhanced laser desorption/ionization (SELDI) and capillary electrophoresis (CE), were developed concurrently with the microarray technology, and were coupled to mass spectrometers with different ionization sources, i.e. matrix-assisted laser desorption/ionization (MALDI) or electrospray ionization (ESI), and analyzing systems, i.e. quadrupole, time-of-flight (TOF) or Fourier transform ion cyclotron resonance. We will briefly outline the advantages and shortcomings of the different technological platforms, and subsequently focus on CE-MS. For a more detailed review of the technologies, we refer to (32).

Two-dimensional gel electrophoresis coupled to mass spectrometry (2DE-MS)
In 2DE, proteins are resolved by electrofocusing according to their isoelectric point, followed by orthogonal separation by SDS-PAGE. MS detection of proteolytic digests has become the method of choice for protein identification from the 2D gels. It involves tryptic digestion of excised protein spots, extraction of the proteolytic fragments, and, as soon as mass information of at least three fragment ions is available, comparison with public databases provided by e.g. the National Center for Biotechnology Information. Matches can subsequently be verified by tandem mass spectrometric (MS/MS) sequencing or immunoblotting.
2DE-MS is a time-consuming, technically challenging approach with high analytical (gel-to-gel) variability, compromising comparison between samples. While the latter is to an extent solved by the use of two-dimensional difference gel electrophoresis (2D-DIGE) (33), enabling simultaneous resolution of two differentially labelled samples within the same 2D gel, comparison of several different experiments still remains challenging. Furthermore, the approach is far too time-consuming to be applied in clinical routine laboratories. Hence, potential biomarkers defined by 2DE-MS have to be transferred to an application platform, where they need to be validated. Despite these limitations, 2DE-MS remains a commonly used technique for comparative analysis of large proteins and definition of potential biomarkers >20 kDa (34).

Liquid chromatography coupled to mass spectrometry (LC-MS)
LC represents a very versatile high resolution separation method. A great variety of LC columns (e.g. reversed phase, ion exchange, size exclusion) are available that allow separation of large amounts of analyte. LC can be combined with any mass spectrometer. LC-MS was further extended to multidimensional separation with different separation media. Most important in this field is multidimensional protein identification technology (MudPIT), which uses cation exchange pre-fractionation followed by reversed phase separation and MS/MS detection (35). A disadvantage is the considerable time required for analysis, which makes its use for high-throughput screening of hundreds of samples difficult. Further, comparative examination of datasets obtained from multi-dimensionally separated samples still represents an unresolved challenge. Other challenges of the LC method are lipids and detergents in the sample interfering with separation and detection, and analytes precipitating on the column material. Like 2DE, the approach is currently not suited for routine clinical application.

Surface-enhanced laser desorption/ionization coupled to mass spectrometry (SELDI-MS)
SELDI uses selective adsorption of proteins to different active surfaces, e.g. hydrophilic, reversed-phase, or affinity reagents such as lectins or antibodies, to reduce the complexity of a given biological sample. Application of matrix material followed by laser desorption allows soft laser ionization for MS detection. SELDI has been used for the tentative identification of biomarkers for a variety of diseases. Unfortunately, the data could generally not be validated subsequently (e.g. (36)). The advantages of SELDI are its ease of operation, high-throughput capabilities, and low sample volume requirements. Several limitations have essentially precluded the use of SELDI for clinical diagnostic purposes. These include difficulties with inter-laboratory comparison of datasets, since the proteome profiles generated by SELDI are influenced by factors such as the type of surface coating, pH and salt conditions, and protein concentration of the sample. Different conditions and chip surfaces lead to different datasets from the same sample, frequently resulting in a complete lack of comparability of datasets. Another concern raised is the lack of mass precision.

Capillary electrophoresis coupled to mass spectrometry (CE-MS)
CE separates analytes with high resolution based on differential migration through a liquid-filled capillary in an electric field. By online coupling of CE to an ESI-TOF mass spectrometer, thousands of polypeptides can be analyzed within 45 to 60 min (11). CE and LC are similar with respect to resolution and compatibility with mass spectrometers. Advantages of CE are the absence of buffer gradients and speed (37). Furthermore, CE is generally insensitive towards precipitating proteins and peptides that often interfere with LC separation, and the capillary can be reconditioned with NaOH, enabling efficient cleaning after each run. Due to these capabilities, CE-MS can be applied for the separation of virtually all naturally occurring peptides and small proteins. This enables reproducible separation and detection of the low molecular weight proteome of any sample in one single step. As an example of the reproducibility of CE-MS, four peptide profiles of the same urine sample are presented in Figure 3. CE-MS has been successfully used for the identification and validation of specific biomarker patterns in several studies (38).
CE is limited in the separation of larger proteins (>20 kDa) due to the generally low pH used, although to a lesser extent than LC. Another limitation is the small sample volume that can be applied onto the capillary, which greatly hampers CE-MS/MS applications, but, due to the high sensitivity of mass spectrometers in the low fmol range, is of little concern in CE-MS. Today, CE-MS has been proven a stable and versatile platform for low molecular weight proteomic profiling.

Bottom-up and top-down proteomics
In a top-down experiment, intact proteins and peptides are subjected to analysis by mass spectrometry. In bottom-up analyses, proteins are digested by proteases and the generated peptides are analyzed by MS/MS. While the latter method is best suited for the identification of large proteins, the former is more applicable for peptides and small proteins and gives a more accurate definition of the potential biomarkers. As outlined recently, any biomarker should be defined by its accurate molecular composition (39). The identification of a theoretical protein based on a few tryptic fragments may be quite misleading, as it generally does not allow accounting for posttranslational modifications, which may in fact confer »biomarker quality« to the protein: glycated albumin may serve as a biomarker for diabetes, while albumin precursor, which would be defined as a biomarker based on several tryptic peptides, certainly does not (40). If albumin precursor had been defined as a biomarker based on a bottom-up 2DE- or LC-MS/MS approach, the subsequent validation of the results using an alternative technology for clinical application would have failed, due to the inaccurate definition of the original biomarker.
With the exploitation of top-down strategies for CE-MS, LC-MS, and SELDI-MS, the low molecular mass range of the proteome, also termed the peptidome, came into focus as a source of information. The peptidome consists of all naturally occurring peptides, many of which are the products of proteolytic degradation. The rationale behind performing peptidomic analysis is provided by the finding that many disease states are reflected by changes in the peptide composition of biological fluids. As suggested by Haubitz et al. (41), Villanueva et al. (42), or Rossing et al. (43), some of these peptides are produced by (disease-)specific proteases. Altered activity of proteases may be more readily assessed by analysis of reaction products, the proteolytic fragments generated, than by direct assessment of the proteases themselves.
It appears that the top-down approaches are currently better suited for clinical applications than the bottom-up methods. This is in part due to the more accurate definition of biomarkers, including posttranslational modifications, and the apparently higher resolution obtained in the top-down approach (while distinguishing between e.g. 4700 and 4701 Da can easily be accomplished, it is almost impossible to distinguish between 60000 and 60001 Da). In addition, the higher throughput, producing larger numbers of independent samples for statistical analysis, appears to be a prerequisite in a multiple parameter setting. Hence, while the high-molecular weight proteome, generally analyzed via the bottom-up approach, may hold more information than the low molecular weight proteome/peptidome, its information is largely not accessible in a statistically sound approach. In contrast, the low molecular weight proteome/peptidome can be assessed in a statistically meaningful way.
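The resolution argument in the preceding paragraph can be made concrete with a rough worked estimate. It assumes, as a simplification not stated in the text, that separating two species that differ by $\Delta M = 1$ Da requires a mass resolving power of at least $R = M / \Delta M$:

$$
R_{4700} = \frac{4700\ \mathrm{Da}}{1\ \mathrm{Da}} = 4700, \qquad
R_{60000} = \frac{60000\ \mathrm{Da}}{1\ \mathrm{Da}} = 60000, \qquad
\frac{R_{60000}}{R_{4700}} \approx 12.8
$$

Under this assumption, resolving the 1 Da difference at 60 kDa demands roughly thirteen times the resolving power needed at 4.7 kDa.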

Single versus multi-marker applications
Biomarkers are molecules (generally assessed in body fluids) used as indicators for the detection of pathological changes or disease states, drug response, etc. Many of these are proteins and peptides, the exclusive focus of this review. It is important to note that a biomarker is not only defined by its molecular structure, but also by its intended use. It can only be utilized for the intended use, but not beyond (e.g. a biomarker that indicates disease at an advanced state cannot be used as early predictor) until its value has been proven for this purpose (44).
The potential of a protein to serve as a biomarker depends on how selective and sensitive an assessment it enables of a (patho)physiological situation. Most of the analytes currently used for screening and diagnostic purposes have been discovered after extensive physiological and biochemical characterization of the disease. This laborious procedure resulted in the identification of single markers with often moderate diagnostic value, mostly due to low specificity. As a prominent example, prostate specific antigen (PSA) is currently widely used as a marker for prostate cancer. Its prognostic relevance, however, is the subject of ongoing debates due to a lack of specificity. This shortcoming not only results in unnecessary biopsies, but also in higher rates of false positive diagnosis (for review see (45)). Early detection of renal impairment, which is of vast importance for the initiation of renoprotective intervention, mainly relies on the detection of microalbuminuria. The significance of this marker is underlined by the observation of Mogensen (46) that glomerular filtration rate (GFR) and serum creatinine levels, as alternative estimates of kidney function, did not change within the microalbuminuria range. However, microalbuminuria is also found in apparently healthy individuals, and cannot be utilized as a predictive marker of renal disease (47). These two examples underline the need for other, more accurate biomarkers to overcome the limitations of today's diagnostics.
It appears questionable whether screening for a single marker enables reliable, early detection of disease, unambiguously distinguishing it from other pathological conditions, and/or monitoring the efficacy of therapy. An alternative strategy is to identify several markers, which may not be »optimal« for use alone but work in concert, and to combine them into a disease-specific pattern (32). Recently, a number of different proteomic technologies have been introduced to establish disease-specific marker patterns for clinical diagnosis and therapeutic monitoring as an alternative to single biomarker-based approaches. This multi-marker approach is now widely accepted, but it comes with several challenges, which were unknown or not fully appreciated, and therefore disregarded, in the beginning of the proteomics era. As outlined recently in suggestions for guidelines for clinical proteomics, the general criteria that are applied to a biomarker (e.g. known identity, reproducible detection, known deviation) also apply to the single biomarkers in a multi-marker panel (3). The initial enthusiasm and subsequent failure to deliver valid results were mostly a consequence of ignoring these principles. It is now evident that an ill-defined »pattern« does not constitute a multi-marker panel and cannot serve as a clinical diagnostic tool.

Urinary biomarkers for renal diseases
Chronic kidney disease (CKD) is characterized by a slow, but progressive loss of renal function. Renal biopsy, which is the standard method to differentiate between glomerular, tubular and vascular renal diseases, entails the risk of procedural complications. For differential diagnosis and continuous monitoring of disease progression, a non-invasive alternative to renal biopsy is highly desirable. For this reason, proteomic studies were initiated to identify urinary biomarkers for all types of CKD.
In an attempt to define urinary polypeptide markers specific for membranous glomerulonephritis, Neuhoff et al. used SELDI and CE-MS as proteomic platforms (48). In two clinical studies (49,50), CE-MS profiles from patients with diabetes type I or II with/without macroalbuminuria and healthy volunteers were analyzed to create stage-specific polypeptide patterns. In patients with type II diabetes mellitus and unchanged albumin excretion rate, the detected peptide pattern differed significantly from that in patients with high-grade albuminuria. Comparable results were obtained for patients with diabetes type I, suggesting that the urinary proteome contains a much greater variety of polypeptides than previously demonstrated. Further, the effects of therapeutic intervention could be clearly demonstrated by CE-MS in an independent study on microalbuminuric subjects after Candesartan treatment (51). The results were recently confirmed in a study on 500 case and control samples (52). The authors demonstrate the efficiency of urinary peptidome profiling by CE-MS to detect both diabetes and diabetic nephropathy (DN), and to predict development of DN in a blinded, prospectively collected population.
Prediction of disease development was also demon strated by Decramer et al. (53), who applied CE-MS-based urinary proteome analysis to define specific biomarker patterns for different grades of ureteropelvic junction obstruction, a frequently encountered pathology in newborns. In their blinded prospective study, the biomarker patterns could predict the clinical outcome of newborns without signs of proteinuria with 95% accuracy nine months in advance. The accuracy was increased even further to 97% after 12 months (54). Those data not only indica ted the potential of urinary proteomics to enable the diagnosis of renal disease, but also suggested the potential to gauge the prognosis.

Urinary biomarkers for urological disorders
Theodorescu et al. (55) described the detection and validation of biomarkers of urothelial carcinoma using CE-MS. A specific biomarker pattern was established in a training set composed of 46 patients with urothelial carcinoma and 33 healthy subjects and refined further using CE-MS profiles of 366 urine samples from healthy volunteers and patients with malignant and non-malignant genitourinary diseases. By this two-step biomarker discovery approach, a biomarker model composed of 22 urinary peptides was established, which, when applied to a blinded test set containing 31 urothelial carcinoma patients, 11 healthy individuals and 138 non-malignant genitourinary disease patients, correctly classified all urothelial carcinoma patients and all healthy controls. Differentiation between bladder cancer and other malignant and non-malignant diseases was accomplished with 86-100% specificity.
In a study on prostate cancer (PCa), the importance of proper sampling was underlined (56). Whereas midstream urine, which was investigated initially, did not enable identification of valid biomarkers, first-void urine served as a source of PCa-specific biomarkers, indicating that the identified biomarkers may originate from secretions of the prostate into urine. After refinement of the PCa-specific biomarker pattern using urine samples from 51 PCa patients and 35 patients with benign hyperplasia as well as 184 controls, a model based on 12 potential biomarkers was established and validated in a second blinded set of 264 patient samples. In combination with PSA, it enabled correct identification of 90% of the PCa samples with 61% specificity.
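The sensitivity and specificity figures quoted in this section follow directly from the counts of correctly and incorrectly classified samples in a blinded test set. The sketch below shows the arithmetic using the urothelial carcinoma test set described above (31 patients and 11 healthy individuals, all classified correctly). The per-group counts behind the reported 86-100% specificity range are not given in the text, so the third example uses hypothetical counts chosen purely for illustration.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased samples that the panel classifies as diseased."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-diseased samples that the panel classifies as non-diseased."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Counts taken from the text: all 31 carcinoma patients and all
    # 11 healthy controls in the blinded test set were classified correctly.
    print(f"sensitivity (carcinoma): {sensitivity(31, 0):.0%}")
    print(f"specificity (healthy):   {specificity(11, 0):.0%}")

    # HYPOTHETICAL counts, not from the source: a comparison group of
    # 50 non-malignant samples with 7 false positives.
    print(f"specificity (example):   {specificity(43, 7):.0%}")
```

Reporting the underlying counts alongside the percentages makes it easier to judge how much weight a given figure can bear, particularly for small groups such as the 11 healthy controls.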

Application of urinary proteome analysis to non-renal diseases
Urine samples from 40 patients after allogeneic or autologous hematopoietic stem cell transplantation and 5 patients with sepsis were examined by CE-MS analysis (57,58). A pattern consisting of 16 polypeptides indicating early graft-versus-host disease (GVHD) discriminated patients with GVHD from those without GVHD with 82% specificity and 100% sensitivity. A subsequent blinded multicenter validation study of more than 100 patients with 599 prospectively collected samples confirmed these results (59). Currently, an intervention trial comparing preemptive therapy based on proteome profiling with standard treatment is being conducted. First results indicate a benefit for the as yet limited number of patients (Weissinger and Metzger, unpublished).
Zimmerli et al. (60) examined patients undergoing coronary artery bypass grafting or patients after acute coronary infarction. Urine samples from patients and controls were analyzed using CE-MS to identify coronary artery disease (CAD) specific biomarkers. In a blinded assessment of 59 samples, specific urinary biomarkers identified CAD patients with 98% sensitivity and 83% specificity. These data were further validated in an independent study, in which the authors also identified several urinary collagen fragments as biomarkers for CAD and showed clear evidence for the relevance and abundance of different types of collagen in arteriosclerotic plaques (61). In a further independent study on prospectively collected samples from patients with type I diabetes, the urinary biomarker panel for CAD was validated again (62). More importantly, this study demonstrated that the urinary polypeptide patterns not only allowed accurate diagnosis, but also that patients with suspicious proteome profiles were at roughly doubled risk (OR 2.2 [1.3-5.2]; P=0.0016) of suffering acute vascular events, on average 1.4±1.3 years after the baseline visit, compared with controls with unsuspicious profiles. The proteome pattern has thus proven its ability to identify patients at risk of future CAD events.

Pathophysiological role of biomarkers
Although many potential urinary biomarkers defined by CE-MS profiling have not been sequenced yet, sequences are available for more than 400 different urinary peptides (4). Many of these peptides are derived from abundant proteins: albumin, beta-2-microglobulin, uromodulin, or collagen. Consequently, a valid question is whether peptidomics is just another way to measure glomerular injury, which could equally be assessed by measuring albuminuria (63). The fact that differential diagnosis based on urinary proteome analysis is possible (32,41,64,65) and that patients in complete remission without albuminuria still exhibit apparently disease-specific changes in urinary polypeptides (66) strongly suggests that these peptides contain information about the pathogenesis and are not mere degradation products. It is tempting to speculate that these peptides are indicators of disease-specific protease activity, as suggested by Haubitz et al. (64). This hypothesis is strengthened by findings recently published by Nemirovskiy et al. (67), in which the presence of specific collagen fragments correlated with the disease-specific activity of matrix metalloproteases. As a further indication, our recent finding that collagen fragments are significantly reduced in diabetic urine (23) fits with the observed increase of collagen and extracellular matrix in patients with diabetes and DN described by Cooper et al. (62). This further supports the hypothesis that reduced activity of proteases and protection of the extracellular matrix from proteolysis by advanced glycation endproducts may be key pathological changes in diabetes mellitus (43).
A similar scenario may be applicable to albuminuria. Consequently, an albumin-derived biomarker is not simply »an albumin fragment«, but rather a specific fragment, defined by its specific C- and N-terminus. Consistent with this view, the presence of urinary fragments of albumin and alpha-1-antitrypsin associated with nephrotic syndrome in chronic kidney disease has recently been described (68). A thorough examination of the sequences of urinary peptides and comparison with protease specificities may strengthen the above hypothesis and lead to better insight into the regulation and pathological role of proteases in disease.

Concluding remarks
Separation technologies coupled to MS currently appear to be the preferred option for a generic approach to identify and evaluate biomarker profiles in individuals. The available separation techniques 2DE, LC, SELDI, and CE differ with respect to throughput, robustness, accuracy, and reproducibility. It appears that CE-MS fulfills the requirements for broad application in routine clinical practice, as indicated by the validation of marker patterns specific for GVHD, renal disease, prostate cancer and bladder cancer in hundreds of patient samples (4,53,55,56,65). It must be stated, however, that future implementation of proteome profiling in laboratory diagnosis relies on more than just technological advancements. Of equal importance are concerted efforts to develop global standardization procedures for the conduct and reporting of clinical proteomic studies. Through the adoption of standardized methods for the identification of disease-specific biomarkers, the information provided by proteomic platforms will bring clinical chemists a step closer to the ultimate goal of capturing the critical pieces of information on a particular disease in one single diagnostic step. This will hopefully result in the integration of MS-based proteomic methods into the armamentarium of clinical laboratories.