Urine in Clinical Proteomics

Urine has become one of the most attractive biofluids in clinical proteomics as it can be obtained non-invasively in large quantities and is stable compared with other biofluids. The urinary proteome has been studied by almost any proteomics technology, but mass spectrometry-based urinary protein and peptide profiling has emerged as most suitable for clinical application. After a period of descriptive urinary proteomics the field is moving out of the discovery phase into an era of validation of urinary biomarkers in larger prospective studies. Although mainly due to the site of production of urine, the majority of these studies apply to the kidney and the urinary tract, but recent data show that analysis of the urinary proteome can also be highly informative on non-urogenital diseases and used in their classification. Despite this progress in urinary biomarker discovery, the contribution of urinary proteomics to the understanding of the pathophysiology of disease upon analysis of the urinary proteome is still modest mainly because of problems associated to sequence identification of the biomarkers. Until now, research has focused on the highly abundant urinary proteins and peptides, but analysis of the less abundant and naturally existing urinary proteins and peptides still remains a challenge. In conclusion, urine has evolved as one of the most attractive body fluids in clinical proteomics with potentially a rapid application in the clinic.

Molecular & Cellular Proteomics 7:1850-1862, 2008.

URINE

Production
Human urine plays an important role in clinical diagnostics. Physicians have examined urinary samples from patients to diagnose various disorders for centuries. The philosopher Hermogenes (5th century before the common era) already described the color and other attributes of urine as indicators of certain diseases (1). Urine is produced by the kidney and allows the human body to eliminate waste products from blood. The kidney also maintains whole body homeostasis and produces hormones including renin and erythropoietin (2). The human kidney (Fig. 1) is composed of 1 million functional units called nephrons, which can be divided into two functional parts: the glomerulus, which filters the plasma yielding the so-called "primitive" urine, and the renal tubule, which reabsorbs most of the primitive urine. In 24 h, about 900 liters of plasma flow through the kidneys, of which 150-180 liters are filtered. However, more than 99% of this primitive urine is reabsorbed. The remainder (the "final" urine) exits the kidney via the ureter into the bladder (Fig. 1). Therefore urine may contain information not only from the kidney and the urinary tract but also from more distant organs via plasma obtained by glomerular filtration. In healthy individuals, 70% of the urinary proteome originates from the kidney and the urinary tract, whereas the remaining 30% represents proteins filtered by the glomerulus (3). The analysis of the urinary proteome might therefore allow the identification of biomarkers of both urogenital and systemic diseases.
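The volumes quoted above can be checked with simple arithmetic. The following Python sketch is illustrative only: the function name is hypothetical, and the reabsorbed fraction is set to exactly 99% as an assumption (the text states "more than 99%"), so the computed final urine volume is an upper bound.

```python
PLASMA_FLOW_L = 900.0              # liters of plasma through the kidneys per 24 h
FILTERED_RANGE_L = (150.0, 180.0)  # liters filtered by the glomeruli per 24 h


def final_urine_upper_bound(filtered_liters, reabsorbed_fraction=0.99):
    """Liters of final urine left after tubular reabsorption.

    reabsorbed_fraction=0.99 is an assumed lower limit ("more than 99%"),
    so the returned value is an upper bound on final urine volume.
    """
    return filtered_liters * (1.0 - reabsorbed_fraction)


for filtered in FILTERED_RANGE_L:
    filtered_share = filtered / PLASMA_FLOW_L
    urine = final_urine_upper_bound(filtered)
    print(f"{filtered:.0f} L filtered ({filtered_share:.1%} of plasma flow): "
          f"final urine below {urine:.1f} L")
```

Under these assumptions, roughly one-sixth to one-fifth of the plasma flow is filtered, and less than about 1.5-1.8 liters per day leaves the kidney as final urine.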

Urinary Protein Content
Urine from a healthy individual contains a significant amount of peptides and proteins. The number of proteins and peptides identified in urine is still increasing. One of the first attempts to define the urinary proteome was published in 2001 (4). Using LC-MS, tryptic peptides of pooled urine samples were analyzed, and 124 proteins were identified. Although this study was not designed to define urinary biomarkers for disease, it showed the information potentially hidden in the urinary proteome and also indicated a possible approach toward its mining. In 2004, this number increased to 1400 distinct spots on two-dimensional electrophoresis gels of which about 420 identified spots yielded 150 unique protein annotations (5). This number of identified urinary proteins increased significantly to around 1500 in 2006 by combining one-dimensional gel electrophoresis and reverse phase liquid chromatography coupled to (Orbitrap) mass spectrometry (6), further underlining the complexity of the human urinary proteome. In a very recent study (2008), we determined that the human urinary proteome apparently contains over 100,000 different peptides, at least 5000 with high frequency (observed in >40% of individuals examined in different studies) (7). It is therefore "safe to state" that urine is indeed a rich non-invasive source of potential biomarkers of disease that awaits exploration.

Urine as a Source for Biomarkers
Advantages-Compared with other body fluids, urine has several characteristics that make it a preferred choice for biomarker discovery. First, urine can be obtained in large quantities using non-invasive procedures. This allows repeated sampling of the same individual for disease surveillance. The availability of urine also allows easy assessment of reproducibility or improvement in sample preparation protocols. Second, urinary peptides and lower molecular mass proteins are generally soluble. Therefore solubilization of these low molecular weight proteins and peptides, a process with a major influence on the proteomics analysis of cells or tissues, is generally not an issue. Furthermore, these lower molecular mass compounds (<30 kDa) can be analyzed in a mass spectrometer without additional manipulation (e.g. tryptic digests). Third, in general, the urinary protein content is relatively stable, probably because urine "stagnates" for hours in the bladder; hence proteolytic degradation by endogenous proteases may be essentially complete by the time of voiding. This is in sharp contrast to blood, for which activation of proteases (and consequently generation of an array of proteolytic breakdown products) is inevitably associated with its collection (8,9). Two laboratories independently showed that the urinary proteome did not change significantly when urine was stored up to 3 days at 4°C or up to 6 h at room temperature (10,11). In addition, urine can be stored for several years at −20°C without significant alteration of its proteome. However, these considerations may not apply to specialized applications, such as the recently described urinary exosomes that may be less stable (12). Finally, as described above, the urinary proteome reflects changes not only in the kidney and genitourinary tract but also at more distant sites. This will be developed in more detail below.
Disadvantages-Urine has the disadvantage that it widely varies in protein and peptide concentrations mostly because of differences in the daily intake of fluid. However, this shortcoming can be countered by standardization based on creatinine (13) or peptides generally present in urine (14). In addition, definition of disease-specific biomarkers in urine, and most likely in other compartments, is complicated by significant changes in the proteome during the day. These changes are likely caused by variations in the diet, metabolic or catabolic processes, circadian rhythms, and exercise as well as circulatory levels of various hormones (15). The reproducibility of any analysis is reduced by these physiological changes even if the analytical method shows high reproducibility. However, these variations appear mostly limited to a fraction of the urinary proteome; a large portion remains unaffected by these processes (16).
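The idea behind creatinine-based standardization can be illustrated with a minimal Python sketch. It is not the procedure of reference 13: the function name, the peptide names, the signal values, and the creatinine concentrations are all hypothetical and chosen only to show how dividing by creatinine removes the effect of dilution.

```python
def normalize_to_creatinine(signals, creatinine_mmol_per_l):
    """Divide each raw peptide signal by the sample's creatinine concentration.

    signals: dict mapping a (hypothetical) peptide name to its raw signal.
    creatinine_mmol_per_l: creatinine concentration of the same urine sample.
    """
    if creatinine_mmol_per_l <= 0:
        raise ValueError("creatinine concentration must be positive")
    return {name: value / creatinine_mmol_per_l for name, value in signals.items()}


# Hypothetical example: the same individual sampled after high and low fluid intake.
# The concentrated sample is three times more concentrated in every analyte.
dilute = normalize_to_creatinine({"peptide_A": 200.0, "peptide_B": 50.0}, 4.0)
concentrated = normalize_to_creatinine({"peptide_A": 600.0, "peptide_B": 150.0}, 12.0)

print(dilute)
print(concentrated)
```

In this constructed case both samples yield identical normalized values, which is the behavior one hopes for when the only difference between samples is fluid intake.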

Techniques
Almost any known mass spectrometry technique has been used for the analysis of the urinary proteome including two-dimensional gel electrophoresis (2DE)-MS, LC-MS, SELDI-TOF, and capillary electrophoresis (CE)-MS (17). The ideal sequence for biomarker discovery would be mass spectrometry-based discovery followed by ELISA-based validation and clinical application. This is easier said than done, and to our knowledge, no examples of this ideal "sequence" are available for the discovery of urinary biomarkers yet. However, over the last few years profiling approaches that allow the use of mass spectrometry-based techniques in the discovery/validation/clinical phase for the analysis of the urinary proteome have emerged: SELDI-TOF and CE-MS (17). The technical details of both techniques can be found elsewhere (17,18), but we will describe briefly the advantages and disadvantages of both SELDI-TOF and CE-MS (Table I).
In addition we will describe LC-MS and 2DE-MS, which, until now, could be used for biomarker discovery but not for clinical applications.

The abbreviations used are: 2DE, two-dimensional gel electrophoresis; CE, capillary electrophoresis; 2D, two-dimensional; PSA, prostate-specific antigen; UPJ, ureteropelvic junction; RCC, renal cell carcinoma; BCa, bladder cancer; TCC, urothelial (transitional cell) carcinoma; Reg-1, regenerating protein-1; PCa, prostate cancer; BPH, benign prostatic hyperplasia; GVHD, graft-versus-host disease; CAD, coronary artery disease; ECM, extracellular matrix.

Fig. 1. 70% of the urinary proteins and peptides originate from the kidney and the urinary tract, whereas the remaining 30% originates from the circulation (3). About 900 liters of plasma flow through the kidneys each day. 150-180 liters of this plasma are filtered by the glomeruli (the "primitive urine"). However, more than 99% of this primitive urine is reabsorbed by the renal tubule. The remainder ("final urine") exits the kidney via the ureter into the bladder.
SELDI-TOF-Advantages of SELDI-TOF include the capacity to analyze multiple samples in a short time and its ease of use. Therefore SELDI-TOF has been used in numerous studies aimed at the definition of biomarkers (19). Although the technology is easy to use, it is, unfortunately, also prone to generating artifacts (20,21). This may be due, in part, to difficulties with calibration and lack of precision of the determined molecular masses of the analytes. Furthermore only a very small fraction of all proteins in a sample binds to the chip surface. Therefore, only a fraction of the information contained in a biological sample can be exploited for the presence of biomarkers even if there are a number of different chip surfaces available. In addition, binding to the different chip surfaces varies depending on sample concentration, pH, salt content, and the presence of interfering compounds. Finally the SELDI-TOF instrument cannot be directly interfaced with MS/MS instruments for sequencing.
CE-MS-CE-MS (18) provides relatively fast analysis (1 h) and high resolution; it is rather robust and compatible with most buffers and analytes, and it provides a stable constant flow, avoiding elution gradients that may interfere with MS detection. A disadvantage of CE is that the analysis is restricted to low molecular weight proteins as the larger proteins tend to precipitate at the low pH generally used in the running buffer. This might be seen as a drawback but is of little concern for the analysis of urine as the urinary proteome contains a high percentage of low molecular weight proteins (7). Another potential drawback of CE-MS is that only a relatively small sample volume can be loaded onto the capillary, leading to a potentially lower sensitivity of detection. However, improvement of both coupling and the detection limits of mass spectrometers enables detection in the amol range, making this issue less relevant (22). Sequencing of CE-MS-defined biomarkers can be performed (although with limited success because of the small sample volume that can be loaded) by direct interfacing of CE with MS/MS instruments (23) or by subsequent targeted sequencing (24).
LC-MS-Other techniques, such as LC-MS or 2DE-MS, have been used to study the urinary proteome but have, in general, only been applied on a reduced number of individuals without subsequent blinded validation.
The majority of the approaches based on LC-MS rely on digestion of the sample with trypsin and separation of the resulting tryptic peptides by nano-LC-MS. Using this method, sequencing of tryptic peptides by nano-LC-MS/MS can be automatically triggered, providing sequence information on the peptides detected and identification of the proteins from which they are derived. Nano-LC-MS/MS has proved efficient for qualitative description of the urine proteome (6, 25-28). However, this approach suffers currently from at least two drawbacks for sample profiling and biomarker discovery. (i) As the sample is digested with trypsin, the complexity of the resulting mixture is much higher than that of the starting material. This leads to MS/MS undersampling, resulting in incomplete analytic coverage of the digest (29). This would be less of an issue if differential studies could be performed based on the MS signal of tryptic peptides and not only on comparison of protein lists identified by MS/MS. This might soon be the case with the evolution of mass spectrometers toward high mass accuracy and resolution and the development of new bioinformatics software for peptide pattern alignment across multiple runs (30-34). (ii) Even with modern instruments, nano-LC-MS profiling of highly complex tryptic mixtures will probably remain a quite laborious process that may be applicable to only a limited number of patients. This problem might be circumvented by the use of targeted mass spectrometry approaches like multiple reaction monitoring to validate potential biomarkers previously identified by nano-LC-MS/MS in the discovery phase on a reduced number of patients. Multiple reaction monitoring-based approaches allow quantification with high sensitivity and selectivity of peptides in complex mixtures and may be applied at high throughput to screen simultaneously several biomarkers on large cohorts of patients (35,36).
This method has been applied recently to detect proteins at very  (37). The use of such hypothesis-driven approaches generated by nano-LC-MS/MS on a limited number of samples may also prove to be useful in the future for the clinical validation of candidate markers in urine.
2DE-MS-2DE-MS is still the most accessible technique, allows the study of large molecules, and has been used on numerous occasions for the description of the urinary proteome (38). However, it has the drawbacks that reproducibility is low, analysis time is long, and the technique is difficult to automate. The recently introduced concept of 2D DIGE using fluorescence dyes and internal standards provides better reproducibility and more accurate quantification (39). This allows the satisfactory comparison of two samples, but the comparison of several different experiments remains a challenge. However, recently the first studies showing the use of 2D DIGE to compare the urinary proteomes of multiple healthy individuals and different disease states (40,41) were published. Another limitation is that it is impossible to study peptides (in general, peptides/proteins <10 kDa) by 2DE.

The Need for Standards and the New Trend
Standards-Appropriate techniques are not the only prerequisites for clinical proteomics. Basic principles should be applied to increase the chances of clinical application of the identified biomarkers. This issue has been discussed in detail in a number of recent studies to which we refer the reader (38,42,43), and we only summarize the guidelines that need to be respected in any clinical proteomics study.
The technical platform must be well characterized (standard operating procedures and known technical variability of the platform) and allow appropriately precise measurements. To reduce the inevitable biological variability to a minimum, standard protocols for urine collection and preparation, as outlined recently (38), are highly advisable together with high numbers of comparable data sets. Standardization of protocols will allow exchanging resources and data between laboratories and increase the potential application of urinary biomarkers. World-wide (Human Kidney and Urine Proteome Project (HKUPP) (44)) and European initiatives (European Kidney and Urinary Proteomics (EuroKUP) (45)) are ongoing for standardization of kidney and urine proteomics. In addition, several publications describing detailed urine sample preparation protocols for specific proteomics applications were published recently (38, 46-48).
It is imperative that proper statistical methods are used in combination with a precise clinical question or hypothesis (42). A Student's t test alone is insufficient. Correction for multiple testing (e.g. adjustment according to the Bonferroni correction (49)) or similar adjustments (50-52) must be applied. As several of the underlying hypotheses for statistical evaluation (e.g. even distribution of data, comparability of data sets, absence of bias, etc.) are generally not fulfilled and statistics does not enable assessment of the correct classification rate, the results must be validated using an independent blinded set of samples.
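The effect of the Bonferroni correction mentioned above can be shown with a short Python sketch. The p-values are hypothetical and the function is a plain textbook implementation (each p-value is multiplied by the number of tests and capped at 1), not the procedure of any specific study cited here.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-values, significance flags) under Bonferroni correction.

    Each raw p-value is multiplied by the number of tests m and capped at 1.0;
    a candidate is kept only if its adjusted p-value is at most alpha.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, significant


# Hypothetical raw p-values for four candidate urinary peptides.
raw = [0.0001, 0.004, 0.03, 0.2]
adjusted, significant = bonferroni(raw)

for p, p_adj, keep in zip(raw, adjusted, significant):
    print(f"raw p = {p:<7} adjusted p = {p_adj:.4f} significant: {keep}")
```

In this constructed example the third candidate (raw p = 0.03) would pass an uncorrected 0.05 threshold but fails after correction. With thousands of peptides per sample, many such candidates are expected by chance alone, which is why blinded validation on independent samples remains necessary.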
Sequencing and exact definition of the detected native proteins or peptides upon profiling represents a frequently underestimated issue. Using profiling methods such as SELDI-TOF or CE-MS, biomarkers are defined by several physical characteristics (e.g. mass and affinity to a certain surface chip for SELDI-TOF or accurate mass and retention time for CE-MS). The use of a bottom-up analysis for sequence determination (i.e. MS/MS analysis after tryptic digestion and database search) is difficult to implement because the biomarkers must first be purified. In the case of SELDI-TOF, this is usually done by using the chromatographic matrix related to the surface chip on which the biomarker was identified. In the case of CE-MS studies, the use of preparative CE to isolate a specific fraction containing the marker of interest has also been described although with limited success (53). However, upon tryptic digestion of a biomarker, which itself often represents a fragment of a larger protein, connectivity to the mass of the biomarker is lost. Moreover the bottom-up approach usually does not take posttranslational modifications into account. Posttranslational modifications are a major, sometimes even the most important, part of biomarker definition, and failure to identify them may subsequently result in failure of the validation process (18,42,43). Consequently it may be more appropriate and accurate to define a potential biomarker via several physical characteristics (e.g. precise mass and retention time). Recent advances in the field of FT-ICR and Orbitrap instruments and the introduction of electron transfer dissociation now allow sequencing of molecules >10 kDa. Although these approaches are not routine methods yet, they clearly show the path toward sequence determination that does enable accurate definition of biomarkers.
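Defining a biomarker by physical characteristics rather than by sequence amounts to matching each detected signal against a reference within stated tolerances. The Python sketch below illustrates that idea only; the class, the marker values, and the tolerances (50 ppm in mass, 1 min in time) are hypothetical assumptions and are not taken from the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    """A biomarker defined only by physical characteristics (hypothetical values)."""
    name: str
    mass_da: float   # reference mass in daltons
    time_min: float  # reference retention/migration time in minutes


def matches(marker, mass_da, time_min, ppm_tol=50.0, time_tol_min=1.0):
    """True if a detected signal lies within both tolerances of the marker.

    ppm_tol and time_tol_min are illustrative assumptions; suitable values
    depend on the instrument and on how runs are calibrated.
    """
    mass_error_ppm = abs(mass_da - marker.mass_da) / marker.mass_da * 1e6
    time_error_min = abs(time_min - marker.time_min)
    return mass_error_ppm <= ppm_tol and time_error_min <= time_tol_min


reference = Marker(name="marker_1", mass_da=1500.0, time_min=25.0)

print(matches(reference, 1500.03, 25.4))  # 20 ppm and 0.4 min away
print(matches(reference, 1500.30, 25.4))  # 200 ppm away in mass
print(matches(reference, 1500.03, 27.0))  # 2 min away in time
```

Requiring agreement in two independent dimensions reduces the chance that an unrelated peptide of similar mass is mistaken for the marker, although it does not replace sequence identification.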
As outlined recently in greater detail (43), currently the privileged pathway of biomarker discovery with the largest chances to be applicable in the clinic consists of the following: a clear clinical question → many samples obtained in a standardized fashion → analysis by instrumentation allowing relatively high throughput and high reproducibility → appropriate statistical analysis for this type of large sample number → validation of the potential biomarkers in a blinded study → sequencing of these biomarkers.
New Trend: from Single Markers to Panels-The potential of a protein or peptide to serve as a biomarker depends on how selectively and sensitively it enables disease assessment. Most of the analytes currently used in the clinical laboratory for screening and diagnostic purposes have been identified based on knowledge of the underlying disease gathered over a long period of time. This tedious and laborious procedure often resulted in the identification of single markers with frequently only moderate diagnostic value mostly because of low specificity. For example, prostate-specific antigen (PSA) is currently widely used as a marker for prostate cancer. Its prognostic relevance, however, is the subject of ongoing debates because of a lack of specificity when PSA levels are only moderately increased (4-10 ng/ml). This uncertainty not only results in unnecessary biopsies but also in higher rates of false positive prostate cancer diagnosis (54). Another example is the use of microalbuminuria as an early non-invasive marker of renal damage. Microalbuminuria can be present in diabetic patients before apparent damage to glomerular function or increased serum creatinine levels (55). However, microalbuminuria is also found in apparently healthy individuals and cannot be utilized as a predictive marker of renal disease (56).
These two examples underline the need for more accurate biomarkers.
Can a single marker fulfill the requirements to reliably detect a disease as early as possible, to unambiguously distinguish it from other pathological conditions, and to monitor the efficacy of therapy? Probably not. An alternative strategy is the identification of several markers that individually do not offer high specificity and sensitivity but that work in concert as a panel (or pattern) (18). The general criteria that are applied to a biomarker to be used for clinical assessment (e.g. known identity, reproducible detection, and known deviation) also apply to the single biomarkers that make up the multimarker panel (43).

THE USE OF URINE IN CLINICAL PROTEOMICS
The field of biomarker identification using urinary proteomics is moving toward the application phase. Most of the studies described below showed potential for clinical application and adhered to the recommendations for biomarker discovery described above. A number of older studies will also be cited to give a more complete overview of the attempts to identify biomarkers. As the majority of the urinary proteins and peptides have been found to originate from the kidney and the urinary tract (3), most of the completed studies have focused on these organs and tissues.

Urogenital Disease: Non-cancer
Kidney Transplantation-One of the main areas of research with the aim to identify urinary biomarkers has been the evaluation of kidney transplant-associated complications. Acute rejection is one of the key factors that determines long term graft function and survival in renal transplant patients (57). SELDI was used by three independent laboratories to detect potential biomarkers for acute allograft rejection in kidney transplant patients (58-60). Clusters of urinary proteins correctly classified between 30 and 50 patients (depending on the study) with high sensitivity and specificity. However, these results were only obtained on training sets, and an independent validation on a separate cohort is still lacking. One follow-up study describes the identification of two proteins that were used in the above prediction and the use of one of them, β-defensin-1 (a host defense peptide), in an immunoassay to predict acute transplant rejection (61). The use of this single biomarker allowed the prediction of acute rejection but with a significantly lower sensitivity and specificity. This further indicates that the use of several urinary proteins or peptides yields higher diagnostic specificity and sensitivity. Although the three different laboratories all studied acute renal allograft rejection, completely different biomarkers were defined. This gives the impression that the results of SELDI are erratic but can be explained by the use of different chip surfaces and progress in chip surface preparation and might also originate from different instrument settings such as the signal to noise ratio or mass calibration.
CE-MS was used on urinary samples from patients with different grades of subclinical or clinical acute transplant rejection, patients with urinary tract infection, and patients without evidence of rejection or infection (62). Substantial differences were found between patients with transplanted kidneys and patients with native kidneys most likely because of treatment with cyclosporin A, a calcineurin inhibitor immunosuppressant. In addition, a distinct urinary polypeptide pattern identified 16 of the 17 patients with acute tubulointerstitial rejection. Potentially confounding variables, such as acute tubular lesions, tubular atrophy, tubulointerstitial fibrosis, calcineurin inhibitor toxicity, proteinuria, hematuria, allograft function, and different immunosuppressive regimens, did not affect the results. To enable differentiation between infection and acute rejection, an additional biomarker pattern was developed. These polypeptide patterns were validated in a blinded assessment of nine acute rejection patients, seven patients with urinary tract infection, and 10 controls. Most samples were correctly classified using these biomarkers (62). This suggests that urinary proteome analysis can be used for the non-invasive monitoring of renal transplant patients, although it awaits validation in larger cohorts.
Chronic Kidney Disease-Chronic kidney disease is becoming a global health problem as the number of individuals with chronic kidney disease is steadily increasing. This is mainly because of the increased life expectancy and the increasing incidence of type II diabetes (63). Early detection of chronic kidney disease is mandatory to reduce the number of patients requiring renal replacement therapy. Currently chronic kidney disease is detected at a late stage when renal function has already significantly deteriorated mainly because of the absence of non-invasive biomarkers. Therefore the selection of urinary polypeptide biomarkers for chronic kidney diseases is of utmost importance.
One of the first reports was the analysis of urinary polypeptide markers of membranous glomerulonephritis by SELDI and CE-MS (64). Using identical urine samples, three potential biomarkers were defined using SELDI analysis compared with 200 potential biomarkers from the CE-MS analysis. Additional work using CE-MS on urine samples from patients with other chronic renal diseases suggested that panels of 20-50 urinary polypeptide markers allow discrimination (differential diagnosis) between different kidney diseases such as IgA nephropathy, focal-segmental glomerulosclerosis, membranous glomerulonephritis, minimal change disease, and diabetic nephropathy (14,16,65). A recent study using CE-MS on 3600 samples obtained from 20 different centers (Europe, America, and Australia) allowed the establishment of a database of more than 5000 urinary peptides (7). This database was used to define biomarkers of chronic kidney disease in general but also for the differential diagnosis of, for example, focal and segmental glomerulosclerosis and membranous glomerulonephritis. Validation of these biomarkers in a blinded and independent heterogeneous cohort (as encountered in everyday life in the clinic) of 134 individuals correctly identified 89 of the 101 chronic renal disease patients, yielding an 88% sensitivity and 100% specificity. Furthermore the three patients with focal and segmental glomerulosclerosis and three of the four patients with membranous glomerulonephritis among the 134 individuals were identified in this population. This study shows the potential of urinary biomarkers to identify patients with chronic kidney disease in a heterogeneous clinical setting.
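The sensitivity and specificity reported for this validation cohort follow directly from the counts given in the text. The Python sketch below reproduces the calculation; it assumes that the 33 individuals who were not chronic renal disease patients served as controls and that 100% specificity means none of them was classified as diseased. Neither detail is stated explicitly in the text.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of diseased individuals correctly classified as diseased."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of non-diseased individuals correctly classified as non-diseased."""
    return true_negatives / (true_negatives + false_positives)


TOTAL_INDIVIDUALS = 134
PATIENTS = 101            # chronic renal disease patients in the cohort
DETECTED_PATIENTS = 89    # patients correctly identified

# Assumption: every individual who is not a patient is a control.
controls = TOTAL_INDIVIDUALS - PATIENTS

sens = sensitivity(DETECTED_PATIENTS, PATIENTS - DETECTED_PATIENTS)
# Assumption: 100% specificity corresponds to zero false positives.
spec = specificity(controls, 0)

print(f"controls: {controls}")
print(f"sensitivity: {sens:.1%}")
print(f"specificity: {spec:.1%}")
```

The computed sensitivity of 89/101 rounds to the 88% quoted above.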
Diabetic Nephropathy-In addition to disease-specific biomarkers, stage-specific urinary polypeptide markers can be defined. This will be exemplified below for the selection of urinary markers of diabetic nephropathy. Stage-specific biomarkers for diabetic nephropathy in patients with diabetes mellitus were defined (66,67). In these two studies, the individual data sets of healthy volunteers (n = 9 and 39, respectively), patients with diabetes type I or II without macroalbuminuria (n = 28 and 46, respectively), and patients with intermittent or persistent macroalbuminuria (n = 16 and 66, respectively) were combined to create typical polypeptide patterns. In patients with type II diabetes mellitus and a normal albumin excretion rate, the detected polypeptide pattern differed significantly from that in patients with advanced albuminuria. Comparable results were obtained for patients with diabetes type I. A recent study on a larger cohort, including 300 patients and controls (68), further confirms the initial findings on type I diabetic patients based on a standardized sample preparation protocol (7). This study defined, and in a blinded assessment subsequently validated, biomarkers for diabetes and diabetic nephropathy and biomarkers that enabled differentiation of diabetic nephropathy from other chronic renal diseases. In addition, these biomarkers could also be used to predict microalbuminuric patients at increased risk of progressing toward diabetic nephropathy over a 3-year period (68). The validity of this approach was subsequently confirmed using an independent set of samples from the Coronary Artery Calcification in Type 1 (CACTI) Diabetes study with similar results for detection of both diabetes and diabetic nephropathy (69). These data indicate that the urinary biomarkers can be used not only for detection of diabetes and diabetic nephropathy but also to predict disease progression.
The use of urinary biomarkers in the prediction of disease progression is confirmed by the studies described in the next paragraph.
Obstructive Nephropathy-Antenatal screening detects fetal hydronephrosis (dilation of the kidney due to urine accumulation) in around 1 of 100 births with about 20% of the cases being clinically significant. Ureteropelvic junction (UPJ) obstruction (Fig. 2A) is found in 40-50% of these clinically significant cases (70). UPJ obstruction is thus a frequently encountered clinical situation. UPJ obstruction is functionally defined as a restriction to the urinary outflow that, when left untreated, will cause progressive renal deterioration. Alternatively obstruction has been more generally defined as a condition that hampers optimal renal development (71). Because hydronephrosis is not always synonymous with obstruction, the differentiation between a dilated obstructed and dilated non-obstructed kidney is often a difficult problem (Fig. 2A). No reference standards are available to correctly identify obstruction. Further diagnosis, based on arbitrary threshold values (in the absence of reference standards), is usually achieved through repeating the various radiologic investigations. These radiologic investigations expose these infants to radiation and may need injection of radiocontrast or radioisotope material. We have studied the urinary proteome of the newborns with UPJ obstruction to identify biomarkers of obstruction that can be used to predict whether a neonate with UPJ obstruction evolves toward spontaneous resolution or surgery (Fig. 2A) (72,73). We used CE-MS-based urinary proteome analysis to define specific biomarker patterns for different grades of ureteropelvic junction obstruction. In a blinded prospective study on 36 UPJ obstruction patients, these patterns predicted with 95% accuracy the clinical outcome of the newborns 9 months in advance (Fig. 2B) (73).
After 15 months of follow-up, the accuracy of the prediction increased to 97% as one of the newborns with UPJ obstruction required surgery at a late stage as predicted by the urinary proteome analysis (Fig. 2B  (72)). A multicenter prospective study on 358 UPJ patients is ongoing for validation. These data and the recent study on the urinary proteome-based prediction of the progression of microalbuminuric diabetic patients (68) strongly suggest the possibility to predict the progression of disease by urinary proteome analysis.
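The scoring rule used for the UPJ patients can be illustrated with a minimal Python sketch. It assumes only what is stated for Fig. 2B: each patient receives a membership value between -1 and 1, a negative value suggests evolution toward surgery, and a positive value suggests spontaneous resolution. The function names and the handling of a value of exactly 0 are illustrative assumptions and are not taken from the original studies; the support vector machine model itself is not reproduced here.

```python
def predict_outcome(membership: float) -> str:
    """Map a membership value in [-1, 1] to a predicted clinical evolution.

    Negative -> surgery, positive -> spontaneous resolution (as in Fig. 2B).
    Treating exactly 0 as "spontaneous resolution" is an arbitrary assumption.
    """
    if not -1.0 <= membership <= 1.0:
        raise ValueError("membership value must lie in [-1, 1]")
    return "surgery" if membership < 0 else "spontaneous resolution"


def percent_correct(n_correct: int, n_total: int) -> float:
    """Percentage of patients whose predicted outcome matched follow-up."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_correct / n_total


# Counts reported in the Fig. 2 legend: 34 of 36 correct at 9 months,
# 35 of 36 correct at 15 months.
at_9_months = percent_correct(34, 36)
at_15_months = percent_correct(35, 36)
print(f"9 months: {at_9_months:.0f}%, 15 months: {at_15_months:.0f}%")
```

With the reported counts this gives 94% and 97%, matching the figures quoted in the Fig. 2 legend.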

Urogenital Disease: Cancer
Renal Cell Carcinoma-One of the first applications of urinary proteome analysis for a clinically relevant question aimed to define renal cell carcinoma (RCC)-specific biomarkers (74). Samples of 218 individuals were analyzed by SELDI-TOF. Samples from patients before nephrectomy for RCC (n = 48), normal healthy volunteers (n = 38), and outpatients with benign diseases of the urogenital tract (n = 20) were used as a training set for biomarker definition. The defined markers were subsequently validated in two blinded assessments with an initial "blind" group of 32 samples (12 patients with RCC, 11 healthy controls, and nine patients as disease controls) and a second group of 80 samples (36 patients with RCC, 31 healthy volunteers, and 13 patients with benign urological conditions). Although in the first round sensitivities and specificities of 81.8-83.3% were achieved, the values significantly declined, ranging from 41.0 to 76.6%, for the second set of samples collected 10 months later. The authors analyzed possible contributing factors including sample stability, changing laser performance, and chip variability to assess the long term robustness of the approach. One of the main conclusions from this study was the need for rigorous evaluation of such variables that may influence stability/robustness.
Bladder Cancer-Bladder cancer (BCa) is among the five most common malignancies worldwide. Urothelial (transitional cell) carcinoma (TCC) constitutes 95% of all BCa cases in the Western countries. Of the BCa patients, 80% have superficial carcinomas that can be treated, but these patients must be closely screened for recurrence. This requires cytological examination of urine, which lacks sensitivity especially for lower stage tumors. Cystoscopy is more sensitive but invasive. This underscores the need for novel, non-invasive biomarkers of BCa. A number of studies have been performed with the aim to identify urinary biomarkers of BCa.
Many potential biomarkers were identified including psoriasin (S100A7 (75)), metalloproteinases MMP-2 and -9, fibronectin (76), orosomucoid, and zinc-α2-glycoprotein (77), but there was no subsequent validation of these biomarkers in prospective studies. These studies on biomarkers for BCa will therefore not be described in more detail. However, some examples of identification and validation in prospective studies of urinary biomarkers of BCa by proteomics analysis were published; these are outlined below.
FIG. 2. A, UPJ obstruction, a stenosis at the intersection of the ureter and the pelvis of the kidney, induces urine accumulation in the kidney leading to hydronephrosis. A low number of newborns with severe UPJ obstruction need immediate surgery after birth, whereas another, larger group with low level obstruction evolves to spontaneous resolution of the obstruction. However the remaining group of neonates with "intermediate" obstruction needs medical surveillance to determine the evolution of the pathology. The common clinical practice to decide whether infants should undergo surgery depends on radiologic investigations involving exposure of the newborns to radiation. B, urinary proteomics has been shown to efficiently predict the progression of UPJ patients (72,73). Urinary protein profiles from 36 patients from the "intermediate" UPJ obstruction group were classified using a hierarchic disease model based on the discriminating polypeptides between the healthy newborn and mild (spontaneous resolution) and severe (surgery) UPJ obstruction. Each patient was scored with this model using support vector machines. This results in membership values between -1 and 1. A negative value suggests evolution toward surgery, and a positive value suggests evolution toward the spontaneous resolution of UPJ obstruction. This prediction was compared with the clinical evolution after 9 and 15 months of follow-up. This resulted in 34 of 36 correct predictions (94%) at 9 months and 97% (35 of 36) at 15 months as one patient evolved to severe UPJ obstruction at a late stage (arrow). This figure was partly reproduced, with permission, from Decramer et al. (73).

One of the first studies using 2D DIGE for the analysis of the urinary proteome aimed to identify biomarkers of BCa (41). 2D DIGE was used to analyze seven different sets of patients and healthy controls yielding 12 clearly differentially expressed spots. One of the differentially expressed proteins was regenerating protein-1 (Reg-1). Reg-1 is proposed to act as an inhibitor of apoptosis leading to Reg-1-activated proliferative activity. Reg-1 expression in BCa biopsies was found to be associated with tumor progression and clinical outcome. In the next step an immunoassay was developed to study Reg-1 expression in urine. In a prospective analysis on 80 individuals containing 32 BCa patients (stage Ta to T2), this Reg-1 immunoassay allowed discrimination between BCa patients and healthy controls with a specificity and sensitivity of 81.3 and 81.3%, respectively (41).
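Because the studies in this section are compared through their sensitivity and specificity, a short Python sketch of how these two values follow from classification counts may be useful. The counts below are hypothetical: they were chosen only to be arithmetically consistent with the 32 BCa patients and an assumed 48 controls (80 individuals in total) of the Reg-1 study and are not reported in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts of a diagnostic test against the clinical reference."""
    true_pos: int   # patients correctly classified as diseased
    false_neg: int  # patients missed by the test
    true_neg: int   # controls correctly classified as non-diseased
    false_pos: int  # controls wrongly classified as diseased

    @property
    def sensitivity(self) -> float:
        """Fraction of diseased individuals detected by the test."""
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self) -> float:
        """Fraction of non-diseased individuals correctly excluded."""
        return self.true_neg / (self.true_neg + self.false_pos)


# Hypothetical counts (assumption): 32 patients and 48 controls.
reg1 = ConfusionCounts(true_pos=26, false_neg=6, true_neg=39, false_pos=9)
print(f"sensitivity = {reg1.sensitivity:.4f}")
print(f"specificity = {reg1.specificity:.4f}")
```

Both fractions evaluate to 0.8125 for these illustrative counts, which corresponds to the reported 81.3% after rounding.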
SELDI-TOF profiling was used by several laboratories in detecting BCa in blinded sets of samples: sensitivity ranged from 71.7 to 93.3%, and specificity ranged from 62.5 to 87% (78) to discriminate BCa patients from healthy controls. As described above, in SELDI-TOF comparability of the data sets is not easy to achieve because of differences in chip surfaces and conditions in the different studies.
CE-MS was also used for the detection and validation of biomarkers of TCC (11). A BCa-specific biomarker pattern was established by initial definition in a training set composed of 46 patients with TCC and 33 healthy subjects and further refinement using CE-MS spectra of 366 urine samples from healthy volunteers and patients with malignant and non-malignant genitourinary diseases. By this two-step biomarker discovery approach, the authors were able to establish a prediction model composed of 22 urinary peptides, which, when applied to a blinded test set containing 31 TCC patients, 11 healthy individuals, and 138 non-malignant genitourinary disease patients, correctly classified all TCC patients and all healthy controls. Differentiation between bladder cancer and other malignant and non-malignant diseases (such as renal nephrolithiasis) was accomplished with sensitivities of 86-100%.
Urinary proteome studies allowed identification of biomarkers that distinguished between BCa and controls in prospective studies with variable sensitivities and specificities. This is promising. The next challenge is to find biomarkers that can predict tumor stage, recurrence, progression, and treatment response in patients with BCa.
Prostate Cancer-In a pilot study (79), CE-MS was used to define potential urinary peptide biomarkers for prostate cancer (PCa). Urine samples from 47 patients who underwent prostate biopsy were analyzed. On the basis of prostate biopsy, 26 patients in this group were diagnosed as having PCa and 21 as having benign prostatic hyperplasia (BPH). The data indicated several polypeptides allowing prediction of PCa with 92% sensitivity and 96% specificity. However, these data could not be validated in a larger cohort,³ once more underlining the importance of validation in a blinded, independent test set. Upon more thorough analysis, first void urine was found to contain potentially useful biomarkers for PCa, whereas the generally used midstream urine appeared not to contain significant PCa-related information (80). These results indicate that the biomarkers originate from secretions of the prostate into the urine and also underline the importance of accurate sampling. After refinement of the PCa-specific biomarker pattern using urine samples from 54 PCa and 35 BPH patients, a model with 12 potential biomarkers resulted in the correct classification of 89% of the PCa and 51% of the BPH patients in a second blinded cohort of 213 patient samples (80). Inclusion of age and free PSA increased the sensitivity and specificity to 91 and 69%, respectively.

Application of Urinary Proteome Analysis to Non-urogenital Diseases
It has been estimated that 30% of the urinary proteins do not originate from the urogenital tract (Fig. 1), and the first studies showing the identification and validation of urinary markers for diseases other than urogenital diseases are emerging. A first example is the clinical follow-up of patients after allogeneic hematopoietic stem cell transplantation (81,82). Urine samples from 40 patients after hematopoietic stem cell transplantation (35 allogeneic and five autologous) and five patients with sepsis were collected during a period of 100 days (a maximum of 10 samples per patient) for CE-MS analysis. A pattern consisting of 16 differentially excreted polypeptides indicated early graft-versus-host disease (GVHD), enabling discrimination of patients with early GVHD from patients without complications with 82% specificity and 100% sensitivity. A subsequent blinded multicenter validation study of 100 patients with more than 600 samples collected prospectively confirmed the results although with reduced specificity and sensitivity (83). First preliminary data on patients that received preemptive therapy of GVHD based on urinary proteome analysis indicate a clear benefit: reduction of both occurrence of GVHD and lethality.⁴ These preliminary data are currently being further substantiated in a multicenter prospective trial.
Zimmerli et al. (84) were able to define and validate biomarkers for coronary artery disease (CAD) in urine. Urine from 88 CAD patients and 282 controls was examined by CE-MS. This resulted in the identification of 15 peptides that defined a characteristic CAD signature panel. In a second step this panel was evaluated in a blinded study on 47 CAD patients and 12 healthy individuals. CAD patients were identified with greater than 90% sensitivity and specificity. In addition, the polypeptide CAD signature panel significantly changed toward the healthy polypeptide signature after therapeutic intervention. These data were further substantiated in another study⁵ where patients with CAD could be distinguished from patients presenting symptoms of CAD but without clinical evidence in the coronary angiography. The prospective value of the data could further be validated in prospectively collected samples from patients with type I diabetes (69). In this blinded study the value of urinary proteome analysis in the prediction of future CAD events could be demonstrated. Although still limited in number, these examples show that urine can also be a source of biomarkers for more distant organs.

FROM BIOMARKERS TO PATHOPHYSIOLOGY
The field of urinary proteomics has advanced and is now entering the era of validation of the selected urinary biomarkers for a number of urogenital and systemic pathologies. Most of us also entered this biomarker research with the hope to find clues to better understand the pathophysiology of disease: this necessitates identification of the biomarkers. As described above, urinary protein profiling enables selection and validation of biomarkers for disease. But the identification of these biomarkers remains challenging, especially of proteins >10 kDa. Nevertheless several studies reported a number of sequenced biomarkers. Although we refrain from listing the different proteins or peptides identified for each specific disease by urinary proteome analysis, we aim to discuss the major conclusions drawn from biomarker sequences.
Most of the currently identified urinary biomarkers for disease are (i) abundant plasma proteins or fragments thereof (e.g. albumin, β2-macroglobulin, and α1-antitrypsin) because of leakage in the pathological state and (ii) abundant kidney and structural proteins (i.e. collagens and uromodulin) (85). These proteins or peptides were identified using various mass spectrometry-based proteomics techniques. Although useful as biomarkers, these abundant urinary proteins at first sight provide little information on the underlying pathology. However, the specific fragments of these abundant proteins might give some clues on the underlying pathophysiology of disease (7). For example, renal disease without albuminuria still exhibits disease-specific changes in urinary polypeptides (16), including specific fragments of albumin (86). This strongly suggests that these peptides contain clues about the pathogenesis and are not simple degradation products of abundant urinary proteins. It is tempting to speculate that the disease-specific peptides may be indirect indicators of the activity of disease-specific proteases (65). This hypothesis is further strengthened by work (87) in which the presence of specific collagen fragments correlated with the disease-specific activity of matrix metalloproteases. Moreover a similar process has been described in the case of some cancer biomarkers identified in plasma that have been shown to be fragments of abundant plasma proteins specifically cleaved by proteases released from cancer cells (88). Although the evidence is still scarce, it is an attractive hypothesis that urinary peptides of diagnostic value are not merely degradation products of abundant larger proteins but a result of distinct, disease-specific processes in many cases due to significant changes in the activity of proteases.
This assumption is supported by sometimes apparently unrelated findings; for example, the increase of collagen and extracellular matrix in patients with diabetes and diabetic nephropathy has been established by a variety of methods. Our recent findings that collagen fragments are significantly reduced in diabetic urine (68) fit in this scenario and further support the hypothesis that both reduced activity of proteases and protection of the extracellular matrix from proteolysis by advanced glycosylation end products may be key pathological changes in diabetes mellitus (85). A similar scenario may be applicable to albuminuria. Consequently an albumin-derived biomarker is not simply "an albumin fragment" but rather a specific fragment defined by its specific N and C terminus. Similarly the presence of specific urinary fragments of albumin and α1-antitrypsin associated with nephrotic syndrome in chronic kidney disease has recently also been described (89). Unfortunately such essential detailed information about protein processing by proteases is difficult to obtain both from the "protein side" as nano-LC-MS/MS approaches often identify proteins based on the sequencing of a few tryptic peptides that do not necessarily map the cleavage sites (6) and from the "peptide side" as profiling approaches like CE-MS and SELDI-TOF do not provide the sequence of the detected biomarker peptides. A thorough examination of the sequences of the urinary peptides and comparison with protease specificities may strengthen the above hypothesis and lead to a better insight into the regulation and pathophysiological role of specific proteases in many diseases.
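The idea that a peptide biomarker is defined by its N and C terminus can be made concrete with a small Python sketch that locates a sequenced peptide within its parent protein and reports the residues flanking both termini, that is, the residue pairs a protease would have to cleave between. The sequences and the function name are placeholders chosen for illustration only; they are not real urinary peptides, and the sketch reports only the first occurrence of the peptide in the parent sequence.

```python
from typing import Optional


def locate_fragment(protein: str, peptide: str) -> Optional[dict]:
    """Locate `peptide` in `protein` and describe both cleavage sites.

    Positions are 1-based and inclusive. Each site is given as a pair
    (residue before the cut, residue after the cut); None marks a peptide
    terminus that coincides with the protein terminus (no cleavage needed).
    Returns None when the peptide does not occur in the protein.
    """
    start = protein.find(peptide)  # 0-based index of first occurrence
    if start == -1 or not peptide:
        return None
    end = start + len(peptide)     # 0-based, exclusive
    return {
        "start": start + 1,
        "end": end,
        "n_term_site": (protein[start - 1] if start > 0 else None, peptide[0]),
        "c_term_site": (peptide[-1], protein[end] if end < len(protein) else None),
    }


# Placeholder sequences (assumption): not taken from any real protein.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
toy_peptide = "GHIKLM"
print(locate_fragment(toy_protein, toy_peptide))
```

Collecting such residue pairs over many sequenced peptides would give the raw material for the comparison with known protease specificities proposed above; the comparison itself is outside the scope of this sketch.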
Another attractive hypothesis is that the urinary peptidome displays to a large degree the turnover of the extracellular matrix. This hypothesis has been generated as a result of the observation that the most abundant urinary peptides (based on ion counting) are not, as expected, the "usual suspects" like albumin or uromodulin but specific collagen degradation products (7). Further several distinct collagen peptides are significantly reduced in diseases where an increase of ECM has been reported (85). Consequently these peptides may be derived from ECM turnover. Changes in this turnover also result in indicative changes in urinary peptides, which may serve as a very specific indicator for such a change, which in turn is likely to be disease-specific. Such changes in the ECM turnover may be due to e.g. invasion of tumors (ECM needs to be "dissolved" to make room for the growing tumor), fibrosis (reduced ECM degradation), increased arterial stiffness (change in ECM composition), changes in endothelium, etc. Therefore mapping of the collagen cleavage sites might incriminate specific proteases in ECM turnover not identified previously and define their activities under pathophysiological conditions.

CHASING LOW ABUNDANCE URINARY PROTEINS
The dynamic range of protein concentrations in body fluids often spans several orders of magnitude (35,90,91). A major challenge is thus to identify low abundance components in complex protein mixtures with high dynamic range. Immunodepletion of abundant urinary proteins might help to unmask the low abundance proteins as was shown by 2D DIGE analysis of urine of patients with diabetic nephropathy (40). A major drawback of such an immunosubtraction method appears to be co-depletion. For example, depletion of plasma for human serum albumin co-depleted another 815 species (not including albumin) (92). When capturing IgGs, another 2091 species (not including IgG) were co-depleted. These IgG-co-depleted proteins contained 56% sequences coding for antibodies and 44% low abundance cytokines or related proteins. Interestingly fewer proteins could be detected in the albumin- and IgG-depleted plasma sample than in the samples destined to be discarded (92).
Recent developments might help to uncover the underexplored urinary proteome. A novel and very efficient approach has been described for capturing the "hidden proteome," rare proteins that constitute the vast majority in any cell or tissue lysate and in biological fluids (93-95). It is based on a combinatorial library of hexameric peptide ligands bound to porous polyacrylate beads named "ProteoMiner" (formerly called "equalizer beads"; Fig. 3). Each bead contains billions of copies of a unique hexapeptide ligand distributed throughout its porous structure, and each bead potentially has a ligand different from every other bead. With a population of millions of individual peptide ligands obtained by combinatorial chemistry, any protein present in the starting material could theoretically interact with one or a few particular beads. Once the most abundant protein species have saturated their binding sites, the remaining molecules are washed away in the flow-through, whereas minor protein species get progressively enriched on their corresponding beads. Thus, instead of simplifying the complex mixture into fractions or partitioning away the most abundant proteins, this approach captures the species present in solution up to the saturation of the solid phase ligand library. The protein mixture is thus "equalized," and the dynamic range of protein concentrations is strongly reduced. This ligand library has been efficiently used for capturing and revealing a very large population of previously undetected proteins from serum (96), platelets (97), or red blood cells (98). It has also been applied to urine (27), and analysis of the sample by one-dimensional gels, 2D gels, and SELDI-TOF revealed that the treatment induced a strong decrease in the levels of the most abundant proteins, notably albumin, while it allowed detection of numerous previously undetected species.
Moreover nano-LC-MS/MS analysis of the treated urine by high resolution, fast sequencing Fourier transform mass spectrometry resulted in the identification of more than 300 protein species in only one analytical run of about 1 h to be compared with identification of 134 proteins in non-treated urine. Thus, application of the ProteoMiner technology may allow extending protein profiling toward lower abundance species. However, treatment with peptide ligand libraries will modify the abundances of proteins in the treated sample. It still needs to be assessed whether this approach can be used for differential proteomics studies, i.e. whether proteins found differentially expressed in samples to be compared are still found with the same differential expression ratio in treated samples. Test experiments performed by spiking standard proteins in cell lysates indicated that, if the protein does not saturate the beads, the relative quantitative information is well conserved and the method is reproducible (98). Although further validation is needed, this technology may represent a useful way to detect and quantify low abundance proteins in urine.
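The equalization principle, and the caveat about quantitation, can be illustrated with a deliberately simplified Python model. It assumes that each protein species binds its own ligand with a fixed capacity and that any excess is washed away, so the captured amount is the smaller of the input amount and the capacity. The abundances, the capacity value, and the protein labels are arbitrary illustrative numbers, and real binding behavior (shared ligands, differing affinities, incomplete capture) is not modeled.

```python
import math


def dynamic_range_log10(amounts: dict) -> float:
    """Orders of magnitude between the most and least abundant species."""
    positive = [a for a in amounts.values() if a > 0]
    return math.log10(max(positive) / min(positive))


def equalize(amounts: dict, bead_capacity: float) -> dict:
    """Idealized capture: each species is retained up to `bead_capacity`.

    Species above capacity saturate their beads (excess is washed away);
    species below capacity are captured in full.
    """
    return {name: min(amount, bead_capacity) for name, amount in amounts.items()}


# Arbitrary relative abundances (assumption), spanning six orders of magnitude.
sample = {"albumin": 1e6, "uromodulin": 1e4, "cytokine": 10.0, "rare_peptide": 1.0}
treated = equalize(sample, bead_capacity=100.0)

print("range before:", dynamic_range_log10(sample))
print("range after: ", dynamic_range_log10(treated))

# Ratio between two non-saturating species is unchanged by the treatment.
print("ratio before:", sample["cytokine"] / sample["rare_peptide"])
print("ratio after: ", treated["cytokine"] / treated["rare_peptide"])
```

In this toy model the two saturating species end up at the same level, so their original 100-fold difference is lost, whereas the ratio between the two non-saturating species is preserved. This mirrors the observation that relative quantitative information is conserved only as long as a protein does not saturate the beads.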

CONCLUSIONS
Urinary proteome analysis is emerging as a powerful diagnostic and prognostic tool not only in kidney disease but also in diseases of more distant organs. Although urinary proteome analysis is far from becoming a routine tool in the clinical setting, studies on larger cohorts of patients reveal its potential in clinical diagnosis. Efforts have to be made to validate these panels of biomarkers on even larger and, probably more importantly, heterogeneous cohorts to move away from the bench paradigm "disease versus control" to the bedside.
FIG. 3. Sample "equalization" using ProteoMiner beads. A schematic representation of the reduction of the dynamic concentration range of a protein mixture using combinatorial ligand libraries is shown. Each bead (gray sphere) carries a single type of ligand and interacts with one protein species (blue, red, green, and purple). Proteins in excess (blue and green) rapidly saturate the binding sites of the corresponding beads; they remain thus uncaptured and are eliminated during bead wash. Captured proteins are finally desorbed from the beads and collected.
The contribution of urinary proteomics to the understanding of the pathophysiology of disease upon analysis of the urinary proteome is still modest, however. The evolution of mass spectrometers toward high mass accuracy and resolution, new ways to explore low abundance proteins and peptides, and new bioinformatics tools should help to sequence more biomarkers in the near future and thus learn more about the pathophysiology of the underlying disease.
* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. To whom correspondence should be addressed: U858/I2MR, Equipe 5, BP 84225, 31432 Toulouse Cedex 4, France. E-mail: joost-peter.schanstra@toulouse.inserm.fr.