Targeted Metabolomics for Clinical Biomarker Discovery in Multifactorial Diseases

Introduction
The vast majority of this book deals with monogenic disorders, which are relatively rare but have just one or a small number of characteristic genotypes and usually very pronounced clinical and biochemical phenotypes. In contrast, this chapter discusses multifactorial diseases, which are far more prevalent and pose a completely different kind of challenge both for socio-economic systems and for biomedical research. As an example, we focus on chronic kidney disease (CKD) and relevant animal models thereof. Together with diabetic retinopathy, myocardial infarction, and stroke, diabetic nephropathy is one of the most severe sequelae of type II diabetes mellitus (T2D) and, considering the obesity-related pandemic of T2D, will represent a major health issue in the decades to come (Mensah et al., 2004; James et al., 2010). Of course, all of these diseases have an important genetic component, as demonstrated by pedigree analyses and a growing number of twin studies (Walder et al., 2003; Vaag & Poulsen, 2007). Still, with rare exceptions, this genetic component is seen as a predisposition for, rather than a cause of, the actual disease. In particular, recent genome-wide association studies (GWAS) on large population-based cohorts have revealed a few single nucleotide polymorphisms (SNPs) that are significantly associated with T2D, but the contribution of any single SNP to an individual's risk of developing T2D is marginal (Groop & Lyssenko, 2009). To fully understand the interaction of the identified genetic loci and to appreciate the meaning of the genetic background in a personalized medicine approach, complex haplotypes would have to be analyzed, and this has not even been achieved in basic diabetes research, let alone in any clinical application.
Yet, genetic research in diabetology has gained new momentum in the last few years, since it became obvious that combining GWAS with more detailed phenotyping than just a generic diagnosis of T2D immediately led to improved statistics and to a much better biochemical plausibility of the findings (Gieger et al., 2008; Illig et al., 2010). Specifically, genome-wide significance could be achieved on much smaller cohorts than in classical GWAS, which makes this combination a more cost-efficient tool in biomedical research. The statistical power could be further improved by defining metabolic phenotypes based on knowledge of the underlying biochemical pathways, e.g., by using groups of metabolites that are synthesized or degraded by the same enzymes or by calculating ratios of the concentrations of products and substrates.

The fundamentals of metabolomics
Metabolomics systematically identifies and quantifies low-molecular-weight compounds in biological samples such as body fluids, tissue homogenates, or cell cultures. Metabolite concentrations allow inferences to be made about the complex interactions between biological processes at the molecular level. As pointed out above, metabolomics is increasingly appreciated as the richest source of information in functional genomics (Nicholson et al., 1999). Until recently, systems biology has mainly relied on three other 'omics' technologies, namely genomics, transcriptomics, and proteomics. Important as these areas have been, they fail to provide a real-time phenotype, i.e., a picture of what is actually happening in a dynamic biological system. Recent advances in mass spectrometry have added metabolomics as another powerful and practical tool to the systems biology toolbox (Weckwerth, 2003). The metabolome is the sum of all low-molecular-weight metabolites in a biological system. By assessing hundreds of metabolites simultaneously, modern mass-spectrometric techniques produce high-resolution biochemical snapshots showing the functional endpoints of genetic predisposition as well as the sum of all environmental influences, including nutrition, exercise, and medication. This snapshot is an almost real-time image of the physiology (or pathophysiology) of a cell or an entire organism (Weckwerth, 2003).

Technological advances paved the way into clinical applications
Mass spectrometric assays revolutionized the diagnosis of inherited metabolic disorders, a development co-pioneered in the late 1990s by Adelbert Roscher (Röschinger et al., 2003). This and similar pilot projects around the world taught the diagnostics community some crucial lessons. Quantitation of endogenous metabolites on tandem mass spectrometers, using multiple reaction monitoring (MRM) and stable isotope dilution (SID) for absolute quantitation and combined with advanced data analysis tools, fulfills the strictest quality criteria in terms of precision and accuracy without suffering any of the shortcomings of immunoassays, such as cross-reactivities. This makes the technology an ideal platform for clinical chemistry (Unterwurzacher et al., 2008). What is more, the superior sensitivity of triple quadrupole mass spectrometers combined with MRM and SID enabled the detection of metabolites in biologically relevant sample types, such as plasma or serum, whereas the limited sensitivity of the previous NMR-based workflows restricted their use mainly to urine. (Urine is analytically very convenient as a sample type, but metabolite concentrations in urine are not subject to strict homeostatic regulation as they are in blood.) Furthermore, it has been shown for many disorders that multiparametric biomarkers reduce biological noise in the data by internal normalization and improve diagnostic sensitivity and specificity. This, in turn, led to a marked reduction of healthcare costs (Röschinger et al., 2003; Weinberger, 2008).
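The principle behind absolute quantitation by stable isotope dilution can be illustrated with a short calculation. The following Python sketch assumes the simplest possible case: a single isotope-labelled internal standard spiked at a known concentration, and a response factor of 1 between analyte and standard. The function name, the parameter names, and the peak areas are hypothetical and are not taken from any particular assay or software.

```python
def sid_concentration(analyte_area, standard_area, standard_conc_um,
                      response_factor=1.0):
    """Single-point stable isotope dilution estimate in µM.

    Assumption: analyte and labelled standard respond identically
    (response_factor = 1.0) unless a measured factor is supplied.
    """
    if standard_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    if response_factor <= 0:
        raise ValueError("response factor must be positive")
    # The area ratio scales the known concentration of the spiked standard.
    return (analyte_area / standard_area) * standard_conc_um / response_factor


# Hypothetical MRM peak areas, for illustration only.
conc = sid_concentration(analyte_area=8.4e5, standard_area=4.2e5,
                         standard_conc_um=10.0)
print(f"estimated concentration: {conc:.1f} µM")
```

In routine practice, multi-point calibration curves and quality control samples would replace this single-point estimate; the sketch only shows why the labelled standard makes the result independent of absolute signal intensity.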

Targeted metabolomics or metabolic profiling
There are two approaches to metabolomics, usually called targeted metabolomics and metabolic profiling. While both approaches are complementary, targeted metabolomics, i.e., the identification and quantitation of defined sets of structurally known and biochemically annotated metabolites, takes advantage of our functional understanding of many biochemical pathways. Unlike protein-protein interactions or regulatory relationships at the transcript level, many biochemical pathways have been explored in great detail, which offers an invaluable source of background information and enables evidence-based interpretation of metabolomics data sets. For the majority of these pathways, the substrates and products of enzymatic reactions, the reaction mechanisms, equilibria, kinetics and energetics of these reactions, as well as cofactors and compartmentalization have been elucidated. This information makes functional interpretation of the data set and, thus, phenotyping of the analyzed cell or organism a straightforward process (Modre-Osprian et al., 2009; Weinberger, 2008). One other major advantage of targeted metabolomics is that it generally provides quantitative information. These quantitative data, the molar concentrations of the metabolites involved in a pathway, facilitate the immediate understanding of any alterations between different biological states and allow for comparison and meta-analysis of several independent studies (Enot et al., 2011). Targeted metabolomics enables the systematic quantitation of a wide range of biologically relevant molecule classes in cells, tissues, or clinically relevant fluids. The technology comprises an automated sample preparation workflow integrated with sensitive mass spectrometric methods and a tailor-made software solution.
Many hundreds of metabolites can be identified and quantified using this novel platform, which is also well suited for high-throughput and routine applications.

Proof-of-concept for targeted metabolomics: Neonatal screening
The proof-of-concept for targeted metabolomics was first delivered in clinical diagnostics, namely in neonatal screening for inborn errors of metabolism. As mentioned above, the diagnosis of inherited disorders of amino acid metabolism, such as phenylketonuria, or of fatty acid oxidation, such as medium-chain acyl-CoA dehydrogenase (MCAD) deficiency, was revolutionized by the use of mass spectrometric assays (Röschinger et al., 2003). The idea of quantifying specific sets of amino acids and acylcarnitines to diagnose specific metabolic disorders was a logical continuation of Sir Archibald Garrod's (1857-1936) concept of chemical pathology. This laid the foundation for what is now referred to as 'targeted metabolomics'. The introduction of tandem mass spectrometry and the transition from expensive monoparametric assays to multiparametric assays has enabled the simultaneous diagnosis of 20-30 monogenic diseases, a significant improvement in diagnostic performance, particularly in specificity and in the predictive values for very rare diseases. This improved diagnostic performance was achieved without raising costs. On the contrary, neonatal screening is now reimbursed by health insurance providers in many Western countries and has led to substantial healthcare savings. These medical and commercial benefits have turned neonatal screening into an impressive success story and led to its introduction in most industrialized countries within less than a decade (Röschinger et al., 2003; Weinberger, 2008).

Data exploitation
The whole data-related workflow for targeted metabolomics has recently been summarized (Enot et al., 2011). In the context of this chapter, we refer only to the pathway mapping aspects of this workflow. The unique level of understanding that makes metabolomics data suitable for a key role in functional genomics results mainly from this step of data handling, namely the biochemical interpretation in the context of pathway and background knowledge. Despite its importance, very few standardized procedures have been developed and/or published for this step, and some experts would probably consider it their proprietary methodology to derive biochemical and pathobiochemical insight from multivariate metabolic datasets. The last few years have seen multiple efforts to systematically annotate endogenous metabolites, which have led to databases such as KEGG (Kanehisa & Goto, 2000), Reactome (Vastrik et al., 2007), BioCyc (Karp et al., 2005), HMDB (Wishart et al., 2007) and OMIM (Online Mendelian Inheritance in Man, 2011). Despite suffering from some serious shortcomings in terms of pathway coverage and data curation, these may serve as a more or less accepted framework for future knowledge collection.
These databases also provide the background for various attempts at visualizing metabolic pathways and mapping data onto these charts, although most of these projects still follow a static approach of predefined (and predrawn) maps that cannot do justice to the dynamics of biochemical networks. In the following paragraph, we would like to demonstrate a few concepts of how dynamic representation and simulation (Modre-Osprian et al., 2009) of metabolic pathways enable the first steps of generating hypotheses from multivariate datasets. Firstly, the electronic availability of metabolites and metabolic reactions facilitates an almost trivial but nevertheless powerful approach that is analogous to a gene set enrichment analysis (GSEA; Subramanian et al., 2005). Any given set of metabolites that has been identified by statistics as significantly different between two biological states or clinical cohorts can be mapped onto the entirety of metabolic pathways, and these pathways can then be ranked by the number of altered metabolites they contain (Fig. 1). This is a way of structuring the data that scientists from transcriptomics and proteomics are familiar with, although the definition of metabolic pathways does not follow a classification system as strict as the classical gene ontology (GO; Ashburner et al., 2000). Note also that a reliable selection of species-specific enzymatic reactions, instead of the generic reference pathways, is necessary to reduce the risk of false positive hits. Secondly, starting from a particular metabolite of interest, exploration of the reactions that either synthesize or degrade this metabolite immediately generates a list of enzymes of interest for further investigation. This concept of exploring shells of reactions around a metabolite is illustrated for tryptophan (Trp) metabolism in Fig. 2 and can be expanded stepwise around every metabolite serving as a new seed node.
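The first concept, ranking pathways by the number of altered metabolites they contain, can be sketched in a few lines of Python. The pathway membership table below is a small, hand-written toy example and is not a copy of KEGG or any other database; the function name `rank_pathways` and the set of altered metabolites are likewise hypothetical. The sketch only counts hits and performs no statistical enrichment test.

```python
# Toy pathway membership, for illustration only (not a database export).
PATHWAYS = {
    "Tryptophan metabolism": {
        "tryptophan", "kynurenine", "serotonin", "5-hydroxytryptophan"},
    "Arginine and proline metabolism": {
        "arginine", "ornithine", "putrescine", "spermidine", "spermine"},
    "Urea cycle": {"arginine", "ornithine", "citrulline"},
}


def rank_pathways(altered, pathways):
    """Rank pathways by how many altered metabolites they contain."""
    hits = {name: sorted(members & altered)
            for name, members in pathways.items()}
    # Most hits first; ties are broken alphabetically for reproducibility.
    ranked = sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(name, found) for name, found in ranked if found]


# Hypothetical set of metabolites flagged as significantly different.
altered = {"tryptophan", "kynurenine", "ornithine"}
ranking = rank_pathways(altered, PATHWAYS)
for name, found in ranking:
    print(f"{name}: {len(found)} altered ({', '.join(found)})")
```

A real analysis would normalize for pathway size and, as noted in the text, restrict the membership table to species-specific reactions.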
Each of these reactions can then be characterized by a ratio of product and substrate concentrations as a measure of enzymatic activity. Assessment of such ratios reduces biological noise and often dramatically increases the significance of the findings (Gieger et al., 2008; Wang-Sattler et al., 2008). Lastly, moving even further from a traditional textbook representation of metabolic pathways, one can apply route finding algorithms to find and depict connections between metabolites of interest across the boundaries of (often artificially) predefined pathways. Such algorithms can identify the shortest route, routes up to a defined length, routes that do not share a certain metabolite (termed node-disjoint paths) or enzyme (so-called edge-disjoint paths), depending on the respective biological question (Fig. 3). Here, the main prerequisite for avoiding a potentially very large number of trivial hits is the exclusion of common cofactors and small inorganic molecules that connect many metabolites to many others, e.g., H2O, CO2, ATP, NADP. Using tools like these, and keeping in mind all the caveats discussed above, enzymes and pathways involved in the pathophysiology of a certain disease or in the mode of action of a drug can be identified more efficiently. In addition, hypotheses for designing further validation experiments and studies can be formulated. Yet, all of this needs to be combined with another plausibility check, which originates from inherent redundancies in metabolism: quite often, groups of compounds are metabolized by the same enzyme and should, therefore, be influenced in at least a similar (if not the same) way by regulatory mechanisms, drugs, etc. If this rule of thumb is severely challenged, one should always check for possible analytical or statistical artifacts or, as is not uncommon in pharmaceutical R&D, for interference by xenobiotics, e.g., a drug or drug metabolite disturbing the signal for an endogenous metabolite.
Fig. 3. Route finding across metabolic pathways. Nine paths from arginine to spermine, ranging in length from four to six steps, were calculated based on the KEGG dataset, and the settings allowed for joint nodes, e.g., ornithine (MarkerIDQ™ software, Biocrates).
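As a rough illustration of route finding with excluded cofactors, the Python sketch below enumerates all simple routes up to a given number of reaction steps on a tiny, hand-written reaction list. It is not the algorithm used by the software named in the figure caption, and the reaction list is a deliberately simplified assumption (for example, the aminopropyl donor in polyamine synthesis is omitted). All identifiers are hypothetical.

```python
from collections import defaultdict

# Common cofactors and small molecules excluded from the graph,
# following the examples given in the text.
CURRENCY = {"H2O", "CO2", "ATP", "NADP"}

# Simplified toy reactions as (substrates, products); illustration only.
REACTIONS = [
    ({"arginine", "H2O"}, {"ornithine", "urea"}),
    ({"ornithine"}, {"putrescine", "CO2"}),
    ({"arginine"}, {"agmatine", "CO2"}),
    ({"agmatine", "H2O"}, {"putrescine", "urea"}),
    ({"putrescine"}, {"spermidine"}),
    ({"spermidine"}, {"spermine"}),
]


def build_graph(reactions, excluded):
    """Connect each substrate to each product, skipping excluded compounds."""
    graph = defaultdict(set)
    for substrates, products in reactions:
        for substrate in substrates - excluded:
            for product in products - excluded:
                graph[substrate].add(product)
    return graph


def find_routes(graph, start, goal, max_steps):
    """Enumerate simple paths from start to goal with at most max_steps steps."""
    routes = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            routes.append(path)
            continue
        if len(path) - 1 >= max_steps:
            continue  # step budget used up without reaching the goal
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in path:  # keep paths simple (no revisits)
                stack.append((neighbour, path + [neighbour]))
    return sorted(routes, key=lambda route: (len(route), route))


graph = build_graph(REACTIONS, CURRENCY)
routes = find_routes(graph, "arginine", "spermine", max_steps=6)
for route in routes:
    print(" -> ".join(route))
```

Without the `CURRENCY` filter, compounds such as H2O or CO2 would act as hubs linking unrelated metabolites, which is the source of the trivial hits mentioned above.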

Quantitative experimental information for computational biology
The quantitative information of targeted metabolomics opens new possibilities for validating computational systems biology approaches that use detailed kinetic models to simulate and predict the dynamic response of metabolic networks in the context of human diseases. It also supports the design of tailored kinetic models of human-specific metabolic pathways, including detailed knowledge about all metabolic reactions concerned. Besides statistical model building and data mining-based approaches (Baumgartner et al., 2004; Baumgartner et al., 2005; Baumgartner & Graber, 2008), computational systems biology is essential to combine knowledge of human physiology and pathology, starting from genomics, molecular biology and the environment, through the levels of cells, tissues, and organs, all the way up to integrated systems behaviour. Applying systems biology approaches in the context of human health and disease can be expected to yield new insights. Eventually, a new discipline, systems medicine, will emerge at the interface between medicine and systems biology (van der Greef et al., 2006; van der Greef et al., 2007; Lemberger, 2007). Higher levels of organization are extremely complex, and even models at the cell and subcellular levels are forced to resort to simplifications to minimize modeling and computational complexity (Crampin et al., 2004; Nakayama et al., 2005; Yugi & Tomita, 2004). Additionally, some parameters and constants for kinetics, binding and concentrations of biomolecules are typically not known, which reduces the model's ability to respond correctly to dynamic changes in external conditions. A high-quality network of human-specific metabolic pathways, including detailed knowledge about all metabolic reactions concerned, is essential to design tailored kinetic models for a better understanding of human physiology and its relationship with diseases.
While such large networks are used to analyze the global structure or functional connectivity of the network (Ma et al., 2007), deterministic and stochastic models are mainly used for simulating specific metabolic pathways as well as regulatory and signaling networks (Goel et al., 2006).
Results of in silico experiments should be related to quantitative experimental data (e.g., from neonatal screening) in order to gain better insight into the dynamic properties of complex biochemical networks under the constraints of various disease conditions and, finally, to obtain a better understanding of the pathophysiological aspects of genetic disorders (Modre-Osprian et al., 2009).
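To make the notion of a deterministic kinetic model concrete, the following Python sketch simulates a single enzymatic conversion of a substrate S into a product P with Michaelis-Menten kinetics, integrated by the explicit Euler method. This is a textbook toy model chosen for illustration, not one of the models cited above; the parameter values (`s0`, `vmax`, `km`) are arbitrary assumptions, and a real pathway model would involve many coupled reactions and a more robust integrator.

```python
def simulate_michaelis_menten(s0, vmax, km, dt=0.01, t_end=10.0):
    """Simulate S -> P with rate v = vmax * S / (km + S), explicit Euler."""
    s, p = s0, 0.0
    trajectory = [(0.0, s, p)]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        rate = vmax * s / (km + s)
        converted = min(rate * dt, s)  # never consume more than is present
        s -= converted
        p += converted
        trajectory.append((i * dt, s, p))
    return trajectory


# Arbitrary illustrative parameters (concentrations in µM, time in minutes).
trajectory = simulate_michaelis_menten(s0=100.0, vmax=10.0, km=50.0)
t_final, s_final, p_final = trajectory[-1]
print(f"t = {t_final:.1f}: S = {s_final:.2f}, P = {p_final:.2f}")
```

Measured molar concentrations, such as those delivered by targeted metabolomics, are what allow the simulated trajectories of S and P to be compared with experimental data.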

Use case: Biomarker development in CKD
Chronic kidney disease is a major health problem associated with increased risk of cardiovascular disease, renal failure and other complications (James et al., 2010). The cost of treating these complications places a disproportionately large burden on national health care budgets (Eknoyan et al., 2004; James et al., 2010; Mendelssohn & Wish, 2009). With an aging population and a worldwide epidemic of diabetes, the most common causes of CKD have shifted from infection/inflammation and inheritance to hypertension, other vascular disorders, and diabetes. It is estimated that at least 40 million people in the EU have some degree of CKD. This number is expected to increase every year and may even double over the next decade, and the trend is similar all over the world (European Kidney Health Alliance, 2011; James et al., 2010). One of the major reasons for this is the dramatic increase of T2D, which accounts for up to 95% of the total diabetes incidence (American Diabetes Association, 2000; Kurukulasuriya & Sowers, 2010; Ritz & Stefanski, 1996). Diabetic nephropathy is one of the most severe complications of diabetes and by far the most common cause of end-stage renal disease (ESRD; Susztak & Bottinger, 2006). Most people are unaware of their disease at early stages and do not get the right treatment in time. The classical renal function markers, serum creatinine level and estimated glomerular filtration rate (eGFR), are known to be insensitive and late markers of CKD (National Kidney Foundation, 2002). The gold standard for assessing renal function is measuring the true GFR with test substances like inulin or iothalamate, but this is an invasive procedure that is far too tedious for routine application.
It is of the highest importance to develop markers that can predict or detect CKD at an earlier stage, making it possible to intervene therapeutically to prevent, or at least slow down, the progression of kidney damage towards ESRD and to control related complications. While the classical diagnostic markers are restricted to traditional endpoints for kidney damage, metabolic markers can assess pathophysiological and pathobiochemical changes that play a role in the exacerbation of renal damage. This use case is based on two studies explained in further detail below. In a preclinical study on puromycin-treated Sprague-Dawley rats, several classes of metabolites covering the main pathways of metabolism were quantitated. The absolute concentrations of the metabolites were determined by MRM with SID (Jarman et al., 1975). The aim of the preclinical rat study was to evaluate metabolic changes in these rats, focusing on nephrotoxicity.
Cohorts consisted of three dosage groups (10 mg/kg/day, 20 mg/kg/day and 40 mg/kg/day) and one control group to which only a vehicle was administered. Samples were taken on days 3, 7, 14 and 22 after the start of the experiment, except for the highest dosage group, where all animals had to be sacrificed on day 14 because of complete renal failure. One of the metabolites associated with exacerbation of renal damage was symmetric dimethylarginine (SDMA) in plasma (Fig. 4), which has been extensively discussed in the literature as a marker for renal failure (Bode-Böger et al., 2006; Vallance et al., 1992). SDMA is hardly metabolized in the body and is eliminated only by renal excretion; since no specific tubular resorption has been reported, it could be interpreted as an internal test substance for renal clearance (Bode-Böger et al., 2006; Martens-Lobenhoffer & Bode-Böger, 2006). As seen in Fig. 4, SDMA was increased in the two highest dosage groups, and there was also an increase over time within these groups. Many of the preclinical findings were confirmed in a clinical biomarker study on progression of CKD that was performed at Montpellier University hospital as part of an EU-funded consortium (ETB Urosysteomics). The participating patients were divided into three cohorts according to severity of kidney disease, based on eGFR as proposed by Bauer et al., 2008: no to moderate renal function impairment (eGFR > 30 ml/min/1.73 m², corresponding to stages 1 to 3 of CKD as proposed by the National Kidney Foundation, for simplicity referred to as stage 3), severe renal function impairment (30 ml/min/1.73 m² > eGFR > 15 ml/min/1.73 m², corresponding to stage 4) and renal failure (eGFR < 15 ml/min/1.73 m², corresponding to stage 5 treated with dialysis).
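The cohort assignment described above amounts to a simple threshold rule on eGFR, sketched here in Python. The function name and the cohort labels are hypothetical. The text leaves the boundary values of exactly 30 and 15 ml/min/1.73 m² unassigned; this sketch assumes, in line with the usual National Kidney Foundation convention, that 30 belongs to the milder cohort and 15 to stage 4.

```python
def ckd_cohort(egfr):
    """Assign a study cohort from eGFR in ml/min/1.73 m².

    Assumption: boundary values (exactly 30 or 15) fall into the
    milder of the two adjacent cohorts; the text does not specify this.
    """
    if egfr < 0:
        raise ValueError("eGFR cannot be negative")
    if egfr >= 30:
        return "stage 3"  # stages 1 to 3, pooled as in the text
    if egfr >= 15:
        return "stage 4"  # severe renal function impairment
    return "stage 5"      # renal failure


# Hypothetical eGFR values, for illustration only.
for value in (72.0, 22.5, 9.0):
    print(f"eGFR {value:5.1f} -> {ckd_cohort(value)}")
```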
The patients in this study were mixed cases with different etiologies of CKD (diabetic and non-diabetic), and several analyses were performed to exclude confounding factors of these diseases and to look at biomarkers influenced by kidney damage, regardless of the underlying disease. Just as in the preclinical study, many of the quantitated metabolites were found to be significantly up- or downregulated with progressing CKD. In a discriminant analysis, the cohorts could be separated almost completely, which indicates that the data set contains sufficient information to distinguish the stages from one another (Fig. 5). Further statistical analyses both identified novel markers (Lundin et al., submitted for publication) and confirmed biomarkers that had already been found in previous studies (e.g., nephrotoxicity of model compounds, early prognostic markers for acute rejection and chronic nephropathy in kidney transplant patients; Boudonck et al., 2009; Lundin & Weinberger, 2009). One of the findings was elevated levels of SDMA, as already observed in the rat model (Fig. 6). Another finding that was reproduced in both studies was a change in the concentration of the essential amino acid Trp. Tryptophan is crucial for protein synthesis and might therefore play an important part in cellular differentiation, development and growth (Badawy, 1988). A drop in Trp concentration has been associated with impaired kidney function before (Egashira et al., 2006; Saito et al., 2000). In the puromycin-treated rat model of nephrotoxicity, Trp drops to less than a third of its concentration between low and high dose and at different time points (Fig. 7). The same trend can be observed in the clinical study on progression of CKD (Fig. 8, left), which underlines the value of translational research in metabolomics. This phenomenon could partly be explained by albumin depletion since, in peripheral blood, Trp is bound to albumin to a significant extent (Walser & Hill, 1993).
This would result in a drop of Trp as albumin is depleted in progressing kidney disease. As seen in Fig. 8 (right), there is indeed a drop in albumin in the clinical study, but not of the same magnitude as that of Trp; hence, it can be assumed that other mechanisms are also involved here.
Another explanation for the drop in Trp concentration might lie in its degradation pathways (see Fig. 9). When analyzing the two main catabolic pathways originating from Trp, both the rat model and the clinical study revealed that both pathways (towards serotonin and through kynurenine towards niacin) are upregulated (Fig. 10). The steep increase and high statistical significance of the kynurenine/Trp (product-to-substrate) ratio in the clinical study suggest a markedly increased activity through this pathway, in keeping with the fact that the kynurenine pathway accounts for 95% of tryptophan catabolism (Walser & Hill, 1993). The Trp-degrading enzyme indoleamine 2,3-dioxygenase (IDO), which catalyzes the initial and rate-limiting step in the niacin pathway, seems to have an increased activity in progressing CKD.
Fig. 7. Tryptophan depletion in puromycin-treated rats. Tryptophan shows both a dose-dependent (p = 4.9·10⁻⁶, comparing day 14 in all cohorts) and a time-dependent (p = 1.8·10⁻³ in the cohort treated with 10 mg/kg/day, p = 1.2·10⁻⁶ in the cohort on 20 mg/kg/day, and p = 2.5·10⁻⁹ in the cohort on 40 mg/kg/day) decrease.
Fig. 8. Left: Tryptophan shows a highly significant decline in later stages of CKD (p = 6.2·10⁻⁹). Right: This could in part be explained by the fact that Trp is albumin-bound, but although the albumin depletion is significant (p = 1.56·10⁻⁹), its fold-change is by far less pronounced than the one for Trp.
Fig. 9. Simplified scheme of the two main pathways catabolizing Trp. To the right, the niacin pathway is illustrated, where the rate-limiting step is the conversion of Trp to kynurenine catalyzed by IDO (Ball et al., 2009). To the left, the serotonin pathway is shown with its two key reactions (tetrahydrobiopterin-dependent tryptophan hydroxylation and pyridoxal phosphate-dependent decarboxylation of 5-hydroxytryptophan).
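The use of a product-to-substrate ratio as a proxy for enzymatic activity, here kynurenine/Trp for IDO, can be sketched as follows. The concentrations in this Python example are invented round numbers chosen purely for illustration; they are not data from the rat or the clinical study, and the function and variable names are hypothetical.

```python
from statistics import mean


def product_substrate_ratio(product_um, substrate_um):
    """Return the product/substrate concentration ratio (both in µM)."""
    if substrate_um <= 0:
        raise ValueError("substrate concentration must be positive")
    return product_um / substrate_um


# Invented (kynurenine, Trp) concentrations in µM per sample; NOT study data.
samples = {
    "stage 3": [(2.0, 50.0), (2.5, 50.0)],
    "stage 5": [(4.0, 20.0), (5.0, 20.0)],
}

# Mean kynurenine/Trp ratio per cohort.
mean_ratio = {
    cohort: mean(product_substrate_ratio(kyn, trp) for kyn, trp in values)
    for cohort, values in samples.items()
}
fold_change = mean_ratio["stage 5"] / mean_ratio["stage 3"]
print(f"kynurenine/Trp fold-change, stage 5 vs stage 3: {fold_change:.1f}")
```

In this toy example, the ratio changes more strongly than either metabolite alone, because a falling substrate and a rising product both push it in the same direction. A real analysis would test the ratio statistically, often after log transformation.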

Outlook and potential applications
There is a wide range of potential applications of metabolomics in the areas of biomedical research, pharmaceutical R&D and clinical diagnostics. Besides the well-established procedure of neonatal screening, metabolomics is currently being applied in biomarker and diagnostics research, as demonstrated by the example of CKD above. Still, the diagnostic potential of metabolomics is not confined to typical metabolic disorders, but extends into fields such as cancer (Osl et al., 2008) and neurologic disorders (Urban et al., 2010). Metabolomics can also be applied in drug development, where it is used to uncover new drug targets, prioritize lead compounds and assess drug toxicity, enabling the development of novel, smarter and safer drugs. In addition, metabolomics has the potential to identify individuals likely to benefit from a given therapy, minimizing the risk of side effects and avoiding unnecessary drug use.

Conclusion
The most important difference between metabolomics and other -omics approaches is the level of functional understanding, which the metabolome currently offers to a much greater extent than the other -omes. The first successful examples of combining GWAS with metabolic phenotypes, so-called metabotypes, have recently been published and show significant promise for a more useful outcome of population-based association studies in general. This phenotyping is particularly useful when the disease in question affects metabolically active organs and when large-scale transport phenomena are affected. Consequently, metabolomics is very well suited for biomarker discovery in multifactorial diseases like T2D and CKD. It is not only possible to find novel markers but also to explain the pathophysiological effects behind the disease, e.g., inhibition or upregulation of an enzyme in a specific pathway. Additionally, because metabolites are not species-specific, the analytical assays do not need to be redeveloped, which saves both cost and time.