Abstracts of the NIH-FDA Conference “Biomarkers and Surrogate Endpoints: Advancing Clinical Research and Applications”



Plenary Session 1
Historical Perspectives

Biomarkers as Surrogates in Clinical Research
Scott L. Zeger, Ph.D.

Johns Hopkins University School of Hygiene and Public Health, Room E3132, 615 North Wolfe Street, Baltimore, MD 21205-2179, USA

The biotechnology revolution has made possible highly innovative monitoring of biological and disease processes using biological markers or "biomarkers". The pressure to evaluate new therapies in clinical research more efficiently motivates the use of biomarkers as substitutes or "surrogates" for more traditional clinical endpoints. This talk will summarize a framework for evaluating biomarkers as surrogates in clinical research developed by a National Institutes of Health working group over the past year. The distinct roles of biomarkers in early development of new therapies and in phase III clinical trials will be detailed. The talk will then focus on the evidence necessary to establish a biomarker as a surrogate endpoint in the evaluation of a particular therapy-disease combination. Statistical approaches to using a surrogate endpoint in a single trial or in multiple trials will be presented. The concepts will be illustrated with an example from a recent schizophrenia clinical trial.

The degenerative disorders of the mature nervous system (Alzheimer's, Parkinson's, and Huntington's diseases) produce major disability and early death, but no therapies have minimized neuronal cell death and thereby slowed the progression of illness. Advances in experimental therapeutics have been hampered by a lack of biological markers that might provide useful and meaningful surrogate endpoints in therapeutic trials. Huntington's disease (HD) is an autosomal dominant disorder characterized by motor and mental deterioration resulting from selective neuronal degeneration. HD is caused by a single gene mutation (4p16.3) produced by CAG trinucleotide repeats within its coding region. CAG repeat expansions greater than 38 represent a highly predictive trait marker for developing clinical features of HD.
The greater the CAG repeat expansions, the earlier the onset of illness and seemingly the more rapid its progression. Experimental treatments are being examined (NS 35284) in persons who have manifest HD in an effort to slow clinical decline. Neuroimaging techniques, including magnetic resonance spectroscopy of brain lactate concentrations, are being evaluated as potential biomarkers to assess the impact of putative neuroprotective therapies. Volumetric MR and dopamine receptor positron emission tomography (PET) imaging are also being assessed as potential biomarkers of clinical onset (phenoconversion) in healthy persons at risk for HD for use in secondary prevention studies aimed at postponing the onset of illness. Parkinson's disease (PD) is a disorder of unknown etiology involving progressive degeneration of the dopaminergic nigrostriatal pathway and advancing motor and mental disability. Several trials have been undertaken to examine promising interventions aimed at slowing clinical progression, but neuroprotective effects can only be inferred without biomarkers. Cerebrospinal fluid markers of dopamine metabolism have not proven sufficiently sensitive as indices of disease progression. However, single photon emission computerized tomography imaging of the dopamine transporter and fluorodopa PET imaging of dopamine synthesis are promising neuroimaging techniques to assess the integrity of presynaptic nigrostriatal neurons and the underlying progression of PD.

There is a clear need for objective measures by which to select the "most appropriate" molecule to develop. The cost to reach a critical decision point on compound selection or progression in multicenter clinical trials often may approach hundreds of millions of dollars. Very expensive decisions are a hallmark of the pharmaceutical industry. Unfortunately, many such decisions are based on best guesses drawn from preliminary and incomplete data.
Indeed, in many clinical trials, selecting the appropriate patient population, the correct dose and dosing schedule, and, most importantly, obtaining early evidence of efficacy and adverse reactions remain formidable challenges. In very few cases have well-defined surrogate markers allowed for development of important therapeutic entities. Reduction of blood pressure (surrogate for stroke prevention) and reduction of serum cholesterol (surrogate for atherosclerosis and myocardial infarction) are perhaps the two best known. Broader and more general strategies must be employed to discover useful surrogates or biological markers that can be used to diagnose disease, monitor disease progression, and assess response to therapy. New technologies that embrace large-scale "objective" measures of physiological status, clinical symptomatology, and medical informatics will be needed to achieve the desired goals. An imperative exists to standardize clinical records if we are ever to extract valuable information that will lead to identification of surrogate markers from existing and future medical databases. The fundamental advances being derived from the "new" genetics, combinatorial chemistry, and high-throughput screening cannot be exploited without new breakthroughs that will dramatically improve our ability to conduct efficient and thoughtful clinical trials.

Overview

A review of the scientific literature reveals that there are many terms and meanings used when describing the measurement of biological parameters and their substitution for clinical endpoints in clinical trials. In an effort to develop an effective dialogue to discuss the topics presented in the conference, a working group on definitions was organized. The working group developed terminology and a conceptual model that were recommended to serve as the convention for discussion at this conference.

Definitions
Biomarker (Biological Marker) - A characteristic that is measured and evaluated as an indicator of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention.

Clinical Endpoint - A characteristic or variable that measures how a patient feels, functions, or survives.

Surrogate Endpoint - A biomarker intended to substitute for a clinical endpoint.

Global Assessment - Evaluation of the risk and benefit balance for a patient or a group of patients.

Conceptual Model
A conceptual model was developed to demonstrate the potential interactions of biomarkers and surrogate endpoints (Figure 1). Biomarkers used in evaluating a therapeutic intervention may have utility in assessing safety or efficacy. In fact, some biomarkers may have dual functions that assess efficacy for some interventions and safety for others. A subset, represented by the quadrant of the biomarker sphere, may achieve the status of a surrogate endpoint in a clinical trial. Surrogate endpoint status is evaluated by considering those factors that relate to the ability of a biomarker to accurately substitute for a clinical endpoint. The evidence supporting the linkage of a biomarker to a clinical endpoint may be derived from epidemiologic studies, clinical trials, in vitro analyses, animal models, and simulated biological systems. Many biomarkers have been proposed as potential surrogate endpoints, but relatively few are likely to achieve this status because of the complexity of disease mechanisms and the limited capability of a single biomarker to reflect the collective impact of multiple therapeutic effects on ultimate outcome. The evaluation of the risk and benefit balance of an intervention, based on a biomarker employed as a surrogate endpoint, for a patient or group of patients is referred to as the "global intervention assessment". Scenarios for the relationships of biomarkers as surrogate endpoints are shown in Figure 2.

Objectives
• Review current status of biomarkers for three chronic lung diseases - asthma, chronic obstructive pulmonary disease (COPD), and pulmonary fibrosis
• Address current opportunities to evaluate genetic and physiologic (e.g., biochemical, imaging, and quality of life) biomarkers for chronic lung diseases
• Identify potential genetic markers, markers of function, markers of therapeutic benefit, and promising new technological advances that offer improved capacity to precisely and accurately assess response to therapies
• Determine research needs for the development, evaluation, and validation of biomarkers for these diseases
• Identify new research questions and scientific approaches to evaluate promising biomarkers and advance preclinical research to clinical research
• Identify potential data registries or databases for retrospective analyses or meta-analyses of potential biomarkers, intermediate endpoints, and surrogate endpoints

Moderator: S. Buist, M.D., University of Oregon Health Sciences Center, USA

Over the past 10 years, it has become accepted practice to treat asthma with anti-inflammatory medication (e.g., corticosteroids) with the hope that suppression of inflammation will not only improve the signs and symptoms of asthma but also prevent the pathological changes. The aim of new drug discovery is to target mechanisms in the immunoinflammatory cascade that might modify the asthmatic process. These types of programs may lead to novel biomarkers of disease activity, which may be shown to be surrogate endpoints during clinical development. Positional cloning in asthma and differential gene expression studies of asthmatic tissue may also provide novel biomarkers. Finally, genotypic analysis of asthmatic patients in clinical trials may reveal biomarkers of disease susceptibility and severity and drug response. The measures of disease response most usually utilized in clinical trials are lung function, symptoms, and beta-agonist use.
Measurement of bronchial responsiveness to various stimuli is the commonest surrogate for bronchial inflammation but has several limitations. Other direct measures of bronchial inflammation, such as bronchoalveolar lavage, bronchial biopsy, and induced sputum analysis, are used in research settings. Indirect measures such as exhaled nitric oxide or hydrogen peroxide and serum and urinary eosinophil products are currently under assessment. Recently, asthma exacerbation has been used as an endpoint in some clinical studies and is considered by some to be one of the best measurements of disease activity. The rate of asthma exacerbations may reflect the intensity of asthmatic inflammation. It has the added advantage of being an intuitive endpoint with obvious relevance to the therapeutic situation. Unfortunately, large numbers of patients and a prolonged observation period are needed to provide discrimination between different groups. This may be the endpoint against which a novel biomarker should be validated as a surrogate endpoint.
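The sample-size problem noted here can be made concrete with a back-of-envelope calculation. The sketch below is illustrative only: it assumes exacerbations follow a Poisson process (real exacerbation counts are often overdispersed) and uses a standard normal-approximation formula; the rates and follow-up time are hypothetical, not drawn from any trial cited in these abstracts.

```python
import math
from statistics import NormalDist

def poisson_rate_sample_size(rate_control, rate_treated, follow_up_years,
                             alpha=0.05, power=0.80):
    """Approximate per-arm sample size to detect a difference between two
    Poisson event rates (events per patient-year), via the normal approximation."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # two-sided significance threshold
    z_beta = z(power)            # desired power
    diff = rate_control - rate_treated
    n = (z_alpha + z_beta) ** 2 * (rate_control + rate_treated) / (
        follow_up_years * diff ** 2)
    return math.ceil(n)

# Hypothetical: 0.8 vs. 0.6 exacerbations per patient-year, 1 year of follow-up.
n_per_arm = poisson_rate_sample_size(0.8, 0.6, 1.0)
```

Even a 25 percent rate reduction under these assumptions requires several hundred patients per arm, consistent with the point that exacerbation endpoints demand large trials and prolonged observation.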

Biomarkers for Chronic Lung Diseases
Robert P. Schleimer, Ph.D.

Johns Hopkins University School of Medicine at the Johns Hopkins Asthma and Allergy Center, USA
For the past century, the primary markers of asthma have included symptoms (e.g., reversible airflow obstruction, airways hyperreactivity, excessive mucus production) and evidence of eosinophilic inflammation (e.g., elevations of eosinophils or their proteins in blood, bronchoalveolar lavage, or sputum). In the past decade, there has been an exhilarating expansion in the appreciation of molecular markers of allergic airways inflammation. The elucidation of type 1 and 2 subsets of T cells, and the predominance of type 2 cells in asthma, led to the use of their specific products (IL-4 and IL-5) as markers. More recently, numerous cell surface markers have become available to enumerate and characterize T cells (including IL-12 receptors, IFN receptors, an IL-1 receptor family member, and distinctive subsets of chemokine receptors). Several transcription factors have been shown to be selectively expressed in T cell subsets (e.g., GATA-3 and others). Progress has been made in distinguishing activated (primed) eosinophils from resting cells. In addition to traditional markers, including buoyant density, survival, and staining with the EG2 antibody, it is becoming clear that activated eosinophils express elevated levels of CD44, CD69, and the chemokine receptor CCR3. Indeed, the presence of a variety of eosinophil-active chemokines (e.g., eotaxin-1 and -2, RANTES, MCP-4) has been established in allergic inflammation of the airways. Among markers of endothelial activation, VCAM-1 has emerged as a marker, an active participant in the response, and a potential therapeutic target. Exciting progress is being made in the description of polymorphisms of various candidate genes that may be causally involved and genetically associated with allergic airways disease, including polymorphisms of the IL-4 promoter, the Fc receptor, beta adrenergic receptors, and LTC4 synthase.
In the case of the chemokine network, polymorphisms have been identified that are associated with allergic disease and are expressed at different frequencies in individuals of African or European descent. With the sequence of the human genome imminent and the rapid growth in the awareness of functionally relevant gene and promoter polymorphisms, integration of available information will become more important than ever before.

Assessing Biological Response In Vivo
Stephen I. Rennard, M.D.

University of Nebraska Medical Center, USA
A variety of methods are available for assessing biological response in vivo. These include assessment of symptoms, measurement of pulmonary physiology, and assessment of biological materials. Symptom assessment in lung disease is advancing rapidly with the development of validated disease- and organ-specific instruments for assessing a variety of domains (Mahler and Jones 1997). Physiologic assessment of lung function has been an active area of investigation for nearly five decades. Recent advances in the acquisition of biological materials relevant to chronic lung disease promise to extend these means of assessing patients with pulmonary disorders. Relevant materials can be obtained in exhaled air, through sputum obtained either spontaneously or following induction, and by bronchoscopy, and measures relevant to pulmonary disease can be made both in peripheral blood and in urine specimens. A number of gases in the exhaled breath reflect biological activities relevant to lung disease, including nitric oxide (Kharitonov et al. 1994) and carbon monoxide (Horvath et al. 1998), which may reflect inflammatory status, and pentane (Wispe et al. 1985), which may reflect oxidant activity. Exhaled air also contains a large number of other molecules that can be collected and quantified, including hydrogen peroxide and a variety of proteins (Schneider et al. 1993). The lower respiratory tract can also be sampled by collection of sputum. The technique of sputum induction permits collection of samples in a reproducible fashion even from normal individuals who do not ordinarily produce sputum (Fahy et al. 1993). Because sputum contains both biochemical and cellular elements, sputum analyses have permitted studies of intraluminal inflammation in airways disease. Bronchoscopy is an invasive procedure but permits sampling of the lower respiratory tract with capabilities not available through sputum.
Specifically, the cells lining the airways can be recovered either by brushing or by biopsy. This permits both functional studies using ex vivo cultures and histologic studies using a variety of methods. Biopsies can also be taken from more distal regions of the lung with the technique of transbronchial biopsy (Wang and Mehta 1995). Finally, the technique of bronchoalveolar lavage can sample the intraluminal contents not only of the airways but also of the alveolar spaces (Linder and Rennard 1988). Materials relevant to lung disease can also be recovered in the peripheral blood. In this regard, measures as simple as peripheral blood leukocytes can correlate with smoking and risk for developing chronic obstructive pulmonary disease (Bridges et al. 1986). Such measures, presumably, could serve as biomarkers. Finally, the urine contains metabolic end products relevant to chronic lung disease. In this context, both leukotriene metabolites (Drazen et al. 1992) and elastin-specific crosslinks (Stone et al. 1995) have been quantified in urine and related to disease activity. In summary, a spectrum of methods is available for assessing biological responses in vivo in humans relevant to chronic lung disease. Their application should help advance both our understanding of the natural history of chronic lung disease and the evaluation of therapeutic interventions.

Chronic Obstructive Pulmonary Disease, Functional Markers, Computerized Tomography, and Quality of Life
Robert A. Wise, M.D.

Johns Hopkins University School of Medicine, USA
Chronic obstructive pulmonary disease (COPD) encompasses chronic bronchitis and emphysema. Chronic bronchitis is defined by symptoms, whereas emphysema is defined by anatomical derangement. Both disorders lead to airflow obstruction that impairs ventilatory capacity and gas exchange. Functional measures have relied on forced expiratory spirometry, which is a standardized and reproducible test. Exercise capacity is imperfectly correlated with spirometric indices. Other factors that may contribute to exercise capacity and disability include efficiency of exercise, nutritional factors, psychological factors, and the tendency to dynamically trap air. Emphysema is detected and quantified by computerized tomography of the lung; however, methods can take into account extent, severity, and pattern of abnormality in different ways. The correlations of morphologic emphysema with survival, functional status, progression of disease, and response to treatment are not well developed at present. Measures of disease-specific quality of life (QOL) have been validated for COPD but overlap with measures of dyspnea and symptoms. Other important impacts of COPD on quality of life, such as fear of dependency, social embarrassment, and poor sleep quality, are not well represented on current QOL instruments. Many intermediate outcome measures are available for COPD, but there are still needs for standardization and validation.

Pulmonary Center, Boston University School of Medicine, USA
Many conditions of both known and unknown etiology may produce excess connective tissue deposition in the lungs. These disorders are variously known as interstitial pulmonary fibrosis or fibrosing alveolitis. These processes may develop over a course measured in days, as in the adult respiratory distress syndrome (ARDS), or over a course measured in years, as in pneumoconiosis or idiopathic pulmonary fibrosis. The accumulation of collagen disrupts function, and there is evidence that inhibition of collagen deposition attenuates the physiologic disturbance. Fibrosing lung diseases may have serious consequences. The onset of the fibroproliferative phase of ARDS heralds a poor outcome. Interstitial lung fibrosis has a 5-year survival of less than 50 percent. Currently, there is no effective antifibrotic therapy in use. The ability to evaluate new therapies would be greatly enhanced by a biomarker of disease activity. Several potential biomarkers for fibrogenic activity have been identified. The potential usefulness of each of these markers will be discussed.

Pharmacokinetic and Pharmacodynamic Methods in Biomarker Development and Application

Objectives
• Demonstrate how PK/PD models that incorporate biomarkers and surrogate endpoints can be used to accelerate drug development
• Address novel PK/PD modeling approaches involving biomarkers and surrogate endpoints
• Suggest that PK/PD models can assist in evaluating the biological plausibility of biomarkers
• Provide a forum for discussion on application of PK/PD principles to biomarkers and surrogate endpoints

The Food and Drug Administration Modernization Act of 1997 (FDAMA) is the most profound regulatory development in three decades. In particular, the "fast track" provision and statutory recognition of the single clinical trial effectiveness standard provide powerful pathways for acceleration of drug development and regulatory approval of new drugs. The fast track provision of FDAMA legislatively confirms the Accelerated Approval Regulation within the Food and Drug Act and provides for FDA to facilitate development and expedite review of new drugs that address unmet medical needs. Approval of new fast track drugs may be based on a demonstration of efficacy using a surrogate endpoint but is subject to postapproval validation of the surrogate endpoint or confirmation of clinical benefit. Linking (modeling) drug doses and concentrations (pharmacokinetics) with surrogate endpoint measurements (pharmacodynamics) can be used to strengthen the evidentiary basis supporting regulatory approval. The single clinical trial effectiveness standard described in FDAMA (and accompanying committee reports reflecting congressional intent) provides: "The Secretary [may] determine, based on relevant science" that substantial evidence of effectiveness may consist of "one adequate and well controlled clinical investigation and confirmatory evidence (obtained prior to or after such investigation)," where "confirmatory evidence" comprises "scientifically sound data from any investigation in the NDA [New Drug Application] that provides substantiation as to the safety and effectiveness of the new drug."
Furthermore, "confirmatory evidence" may "consist of earlier clinical trials, pharmacokinetic data, or other appropriate scientific studies." Additional regulatory guidance concerning circumstances in which a single effectiveness trial, including a pharmacokinetic, pharmacodynamic, or clinical endpoint trial, may be sufficient for regulatory approval can be found in a recently published FDA guidance on evidence of effectiveness (FDA Guidance for Industry 1998). Although these two provisions of FDAMA codify regulatory practices already utilized by FDA in a limited fashion through recent regulations or discretionary practices, their statutory status provides a powerful incentive for expanded applications. For those instances in which pharmacokinetic and pharmacodynamic analyses of biomarker or surrogate endpoint data are applicable, advanced pharmacometric methods already exist (e.g., clinical pharmacology and population pharmacokinetic techniques, modeling and simulations of clinical trials, and pharmacogenetic techniques). Nevertheless, widespread application of these provisions will be possible only after several barriers are overcome. These barriers include (1) broad understanding and acceptance of the sometimes complex quantitative concepts and methods utilizing kinetic techniques, (2) limited numbers of scientists who are trained to use such methods, and (3) a paucity of validated surrogate endpoints and routine methods for advancing biomarkers to surrogate endpoint status.
The pharmacokinetic/pharmacodynamic arena poses a series of questions about new drugs in development. These same questions also apply to the customized management of individual patients: Does the drug work at its target? What is dose-response at the target? What is the duration of drug action? In most cases, obtaining answers to these questions is slow, resource intensive, and generally problematic. Biomarkers (e.g., external imaging of radiolabeled exogenous compounds) offer an alternative approach to rapid answers. Fowler and colleagues at the Brookhaven National Laboratory have elegantly demonstrated the use of positron emission tomography in this context. They applied 11C-labeled deprenyl as a phenotypic probe for monoamine oxidase, type B (MAO-B). This probe is covalently bound to the enzyme. After a tracer dose is given, the patterns of radiodistribution that are imaged externally correspond to variations in enzymatic activity. For lazabemide, an investigational agent for inhibiting MAO-B, the initial human studies answered all three questions by demonstrating that (1) the enzyme was inhibited in situ, (2) there was a clear dose-response curve for inhibition, and (3) the frequency of dosing to maintain inhibition would be either once or twice a day. This type of study is far more common in neuropharmacology than other therapeutic areas. The development of suitable probes for use as exogenous biomarkers in other disease types could stimulate many more relevant PK/PD studies, with consequent benefits in drug development and individualized patient treatment.
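The three questions (target engagement, dose-response at the target, and duration of action) map naturally onto a sigmoid Emax (Hill) model of target inhibition. The sketch below is a generic illustration of that model, not lazabemide or deprenyl data: the IC50, Hill coefficient, and dose units are hypothetical.

```python
def emax_inhibition(dose, ic50, hill=1.0):
    """Fraction of target (e.g., an enzyme such as MAO-B) inhibited at a given
    dose, under a sigmoid Emax (Hill) model with full maximal inhibition."""
    return dose ** hill / (ic50 ** hill + dose ** hill)

def dose_for_inhibition(target_fraction, ic50, hill=1.0):
    """Invert the model: dose required to reach a target fractional inhibition."""
    return ic50 * (target_fraction / (1.0 - target_fraction)) ** (1.0 / hill)

# Hypothetical IC50 of 10 (arbitrary dose units): 90% inhibition needs 9x the IC50.
dose_90 = dose_for_inhibition(0.9, ic50=10.0)
```

Fitting such a curve to externally imaged inhibition data is one way a PK/PD study can answer the dose-response and dosing-frequency questions directly.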

Pharmacokinetic/Pharmacodynamic Population Model Linking Cortisol Production with Fluticasone Concentration
Elena V. Mishina, Ph.D.

Center for Drug Evaluation and Research, FDA, USA
Fluticasone propionate (FP) is a novel corticosteroid with potent anti-inflammatory activity (receptor affinity is 18 times higher than that of dexamethasone) and low systemic bioavailability. It has shown significant therapeutic efficacy in the treatment of asthma. The main adverse effect - suppression of adrenal function - has been assessed by measurement of the plasma cortisol level. Modeling is applied to predict the pharmacodynamic effect from the administered dose. Modeling of cortisol (CT) suppression has additional complexities due to nonstationary secretion of cortisol (circadian rhythm) and a downregulation effect. Previous models describe cortisol secretion as a cosine function or as a sum of linear functions. We applied a superposition of two Bateman functions to characterize the pulsatile circadian rhythm of cortisol in plasma. CT and FP levels were measured over 24 hours in 12 healthy subjects after inhalation of 0.5, 1, and 2 mg of FP and placebo. NONMEM was used for pharmacokinetic/pharmacodynamic modeling. The PK of FP was best characterized by a two-compartment model with first-order absorption (CL/F 158.8 L/hr, V/F 2800 L, Ka 7.1 hr-1). CT plasma levels were described by two pulsatile Bateman functions over 24 hours. The effect of FP was modeled using the Hill equation with inhibition of cortisol secretion. Cortisol peaks were estimated to occur in pulses at 3:30 a.m. (large, sharp) and 1:30 p.m. (small, shallow) with an amplitude ratio of 7:1. The IC50 of FP was 350 pg/mL, which is consistent with values predicted from receptor affinity. Cortisol suppression (the adverse effect) is predicted to be negligible at therapeutic doses (50 mg).
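A minimal sketch of the model structure described in this abstract: two Bateman pulses for baseline cortisol secretion and Hill-type (Imax) inhibition by fluticasone. Only the 3:30 a.m. and 1:30 p.m. pulse times, the 7:1 amplitude ratio, and the 350 pg/mL IC50 come from the abstract; the rate constants and amplitude scale are placeholders, not the fitted NONMEM estimates.

```python
import math

def bateman(t, amplitude, ka, ke, t0):
    """Bateman (rise-and-fall) pulse beginning at time t0 (hours)."""
    if t < t0:
        return 0.0
    dt = t - t0
    return amplitude * ka * ke / (ka - ke) * (math.exp(-ke * dt) - math.exp(-ka * dt))

def cortisol_secretion(t):
    """Baseline circadian secretion: a large, sharp pulse at 03:30 and a
    small, shallow pulse at 13:30 with a 7:1 amplitude ratio.
    Rate constants (ka, ke) and the amplitude scale are placeholders."""
    return (bateman(t, amplitude=7.0, ka=2.0, ke=0.3, t0=3.5) +
            bateman(t, amplitude=1.0, ka=2.0, ke=0.3, t0=13.5))

def suppressed_secretion(t, fp_conc, ic50=350.0):
    """Cortisol secretion under Hill (Imax) inhibition by fluticasone;
    fp_conc and ic50 in pg/mL."""
    inhibition = fp_conc / (ic50 + fp_conc)
    return cortisol_secretion(t) * (1.0 - inhibition)
```

At a fluticasone concentration equal to the IC50, secretion is halved; at concentrations well below it, suppression is negligible, matching the abstract's conclusion for therapeutic doses.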

Panel Discussion: Pharmacokinetic/Pharmacodynamic Methods in Biomarker Development and Application
Arthur J. Atkinson, Jr., M.D.
Warren G. Magnuson Clinical Center, National Institutes of Health, Building 10, Room 1C-227, 9000 Rockville Pike, Bethesda, MD 20892, USA

Statistical attempts to validate biomarkers as putative surrogate endpoints are based on an assessment of the extent of correlation between the biomarker and clinical outcome. Whether based on results from a single study or a meta-analysis of several studies, the fundamental approach has been to estimate the proportion of the outcome difference that can be explained by the effect of the treatment on the biomarker. However, it is axiomatic that correlation is no proof of causation. Thus, a number of strategies have been employed in the attempt to distinguish between "spurious correlations" and the "true correlation" expected in a causal relationship. Of central importance to this discussion is the use of pharmacokinetic/pharmacodynamic (PK/PD) models to evaluate the biological plausibility of a putative biomarker by formally demonstrating appropriate quantitative relationships among drug exposure, changes in a biomarker, and clinical outcome. In its most developed form, this analysis is carried out in the context of a disease progression model. In addition, kinetic analysis can be used in conjunction with the development and use of biomarkers (1) as an integral part of interpreting the raw data on which a biomarker is based (e.g., the interpretation of PET scan data) or (2) as part of investigations of pathophysiology needed to support the biological plausibility of a biomarker.
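The "proportion of the outcome difference explained" can be illustrated by comparing the treatment coefficient before and after adjusting for the biomarker (in the spirit of Freedman's proportion-of-treatment-effect measure). Everything below is synthetic: the data simulate a fully mediated effect, and the small OLS helper exists only to keep the sketch self-contained.

```python
import random

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] +
         [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(1)
n = 500
z = [i % 2 for i in range(n)]                       # treatment arm (0/1)
s = [2.0 * zi + random.gauss(0, 1) for zi in z]     # biomarker, fully mediates z
y = [3.0 * si + random.gauss(0, 1) for si in s]     # clinical outcome

beta_unadj = ols([[1.0, zi] for zi in z], y)[1]               # y ~ z
beta_adj = ols([[1.0, zi, si] for zi, si in zip(z, s)], y)[1] # y ~ z + s
pte = 1.0 - beta_adj / beta_unadj   # proportion of treatment effect explained
```

Because the simulated biomarker fully mediates the treatment effect, the adjusted coefficient collapses toward zero and the estimated proportion explained approaches 1; as the abstract stresses, such a correlation-based statistic must still be backed by causal and biological-plausibility evidence.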

Value of Precise, Sensitive Measurements of Clinical Pharmacokinetics and Pharmacodynamics
Stephen A. Williams, M.D., Ph.D.

Pfizer Inc., USA
The pressure to make rapid, correct decisions in drug development is driven by prospects of a limited payback period, so that every development day lost, or spent futilely developing something that is later unsuccessful, is often worth $1 million. In this environment, precise measures of pharmacokinetics and innovative clinical measurements of pharmacodynamics are valuable contributors to the speed and quality of decision-making. In early development, the information they provide is used for proof of mechanistic concept and profiling the compounds' effects. In late development, where primary endpoints are usually relatively conventional, novel measuring tools used as secondary endpoints can improve the ability to differentiate compounds from their competitors or can more fully characterize a potential strength or weakness of the product. For diseases where no drugs have been registered successfully, there may be no existing method, or the sensitivity of the method may be in question (e.g., Alzheimer's neuroprotection, osteoarthritis joint preservation). In these circumstances, and for unprecedented mechanistic approaches, there is no alternative but to use a novel clinical measure. However, fully "validated" measurement tools do not just appear. Their development, evolution, and implementation constitute a long and complex process that requires active management and collaboration among stakeholders from academia, industry, and regulatory authorities. Doing this more efficiently will lead to well-informed choices for physicians, better drugs, and reduced patient suffering.

The technique of magnetic resonance imaging (MRI) has provided a window on the brain, allowing the actual pathology of multiple sclerosis (MS) to be visualized as it evolves in the living patient. Natural history studies of the activity of new and otherwise active lesions have shown that MRI-detected pathological activity can be seen at a rate of 5 to 10 times the rate of clinical relapses.
Systematic MRI monitoring has now been used to supplement the clinical monitoring of clinical trials (Paty and McFarland 1998; Miller et al. 1996). Beginning in the 1980s, it was clear that MRI could define MS lesions quite precisely in both the brain and spinal cord. Diagnostically abnormal scans can be seen in more than 90 percent of patients with clinically definite MS (CDMS). In spite of the advances related to MRI scanning, however, the final diagnosis of MS relies on clinical findings and clinical judgment. The use of MRI after the age of 50 is more difficult because of the nonspecificity of white-matter lesions at that age. The correlation between clinical findings and MRI findings is statistically significant but without a high correlation coefficient. This lack of specific correlation is probably due to the fact that most of the lesions are not located in eloquent tracts. In addition, the major use of MRI is in diagnosis, precisely because MRI reveals many neurologically asymptomatic lesions. Also, the analysis of data from groups of patients has shown that MRI techniques can measure the evolving pathology over time in both an accurate and objective way. MRI findings correlate modestly with neurological findings. The best correlations have been with neuropsychological function. When natural history studies are done using frequent systematic MRI scans, new lesions can be seen to appear and old, stable lesions to enlarge. In addition, most active lesions enhance with gadolinium. The time scale of the evolution of active lesion changes is usually about 4 weeks waxing and 8 weeks waning. Enhancement probably occurs as the first event in active lesions and lasts on average about 4 weeks (1 to 8 weeks). Active lesions gradually reduce in size and degree of enhancement and then remain stable for an extended period. This period of lesion stability probably reflects the stable residual chronically demyelinated plaque.
Some chronically active plaques gradually enlarge over time. New MRI techniques such as magnetization transfer (MT) imaging (Dousset et al. 1994) and T2 relaxation analysis (MacKay 1994) may help identify the degree of demyelination that has occurred in these stable lesions. In addition, MR spectroscopy (MRS) may also identify the elements of active demyelination by the detection of neutral fat as a degradation product of myelin. The final irreversible damage in the MS lesion is axonal loss (Trapp et al. 1998). Axonal integrity can be measured by MRS techniques such as the height of the N-acetyl aspartate (NAA) peak (Arnold et al. 1992). These new combinations of MRI investigative techniques are exciting for understanding the evolution of MS pathology. A practical application of MRI monitoring in MS has been the adjudication of clinical trials (Paty and Li 1993). Acute phase monitoring can be done by identifying active (new, enlarging, or enhancing) lesions. One can then compare a placebo group with treated groups in a way similar to measuring the number of clinical relapses in both placebo and treated groups. Repeated pretreatment scans prior to baseline can also give a reasonable idea of the untreated MRI activity rates. Baseline activity also seems to predict on-study activity. Chronic phase monitoring is done by measuring the total extent of the MRI-detected MS pathological involvement in the brain. This measurement of the MRI burden of disease (BOD) or lesion load can be considered the MRI equivalent of chronic neurological impairment measures such as the EDSS. Over the past 10 years, several studies have shown that MRI activity and quantitative measures have predictable changes over time and are sensitive to treatment effects. Four clinical trials with interferon beta show both a clinical effect on relapses or disability and an MRI effect on both activity measures and chronic impairment measures (IFNB Multiple Sclerosis Study Group 1995; Jacobs et al. 1996; PRISMS Study Group 1998; European Study Group 1998). In the future, the application of MR techniques to clinical trials will be enhanced by more specific measures of pathology such as MT, T2 relaxation analysis, and MRS. It is the understanding of evolving pathology in vivo that is the most exciting aspect for MR in the future.

Novel Brain Imaging Techniques and Drug Development in Schizophrenia
Marc Laruelle, M.D.

Columbia University, USA
Abnormalities of dopamine function in schizophrenia are suggested by the common antidopaminergic properties of antipsychotic medications. However, direct evidence of a hyperdopaminergic state in schizophrenia has been difficult to demonstrate given the difficulty of measuring dopamine transmission in the living human brain. Such evidence has recently emerged from new brain imaging techniques. Three studies reported an increase in dopamine transmission following acute amphetamine challenge in patients with schizophrenia compared with matched healthy controls (Laruelle et al. 1996; Breier et al. 1997; Abi-Dargham et al. 1998), thus demonstrating a dysregulation of dopamine in schizophrenia. The dysregulation of dopamine function revealed by the amphetamine challenge is present at onset of illness and in patients never previously exposed to neuroleptic medications. However, this dysregulation was observed in patients experiencing an episode of illness exacerbation but not in patients studied during a remission phase. This finding has important consequences for the development of new treatment strategies for both the exacerbation and remission phases of schizophrenia. Studying the modulation of this abnormal response by drugs such as 5HT2A antagonists or metabotropic receptor agonists might be useful as proof of concept in drug development. In addition, the recent availability of high-resolution positron emission tomography cameras might document the possible mesolimbic selectivity of these alterations, providing further guidance to drug development. We present the first evidence that structural in vivo magnetic resonance imaging (MRI) of the brain over the course of Alzheimer's disease (AD) supports a neuropathology/neuroanatomy staging model of sequential and progressive regional involvement of the entorhinal cortex (EC), hippocampus, and neocortex (Braak et al. 1993; Gomez-Isla et al. 1996; West et al. 1994; Bobinski et al. 1995).
Currently, only reductions in the size of the hippocampus are sufficiently characterized in vivo to be considered as an early (preclinical) diagnostic marker for AD. Hippocampal atrophy is nearly always found in patients with AD, and it is relatively uncommon in normal elderly persons (de Leon et al. 1997). Hippocampal atrophy is associated with memory deficits in normal subjects (Golomb et al. 1994); in preclinical AD, it is anatomically specific (Convit et al. 1997) and predicts future AD (de Leon et al. 1989, 1993). AD is marked by both hippocampal and neocortical atrophy (Convit et al. 1997). Recent MRI data show that EC surface area measurements were superior to hippocampal volume in classifying very mild AD patients and controls (Bobinski et al. 1999). In a 3-year longitudinal study of normal elderly persons, the baseline EC uniquely predicted hippocampal volume reductions and was superior to hippocampal volume in predicting memory decline. In summary, MRI data support a model where the EC is the first brain region affected in the course of AD. Subsequent hippocampus atrophy is associated with significant memory impairments and predicts future neocortical atrophy and dementia.

University of California, San Diego, USA
The only proven acute stroke treatments act by improving blood flow to the ischemic brain region. Numerous neuroprotective drugs are effective in animal models but have failed in clinical trials. Many of these failures occur because preclinical investigations optimize the conditions for treatment and therefore often differ substantially from feasible patient studies. Identification of critical treatment variables is essential, and the available clinical rating scales are rather inefficient. Specifically, we would like to identify surrogate endpoints that can be used in phase II trials to facilitate protocol designs for phase III studies. An important advantage of stroke investigations is that the cause of the disorder is simply a mechanical disruption of blood supply. Therefore, it is possible to model the problem quite well in animals. At present, no single abnormality has been identified that is responsible for irreversible ischemic nervous system damage. Further understanding of the pathophysiology of ischemic injury is needed to develop valid biomarkers. Until then, clinical investigation will depend on empiric methods. For this purpose, imaging methods may provide usable surrogate endpoints for identifying salvageable tissue, but no such methods have yet been proven useful.

Mental Health Research Institute, University of Michigan, USA
Biological markers for complex cognitive and affective behavioral states are difficult to envision. There are relatively few biological indices of the subtleties of brain function. Among the most actively studied are those found with sleep electroencephalography, neuroendocrine measures, behavior, and genetics (neuroimaging is being presented separately). Each of these four areas has produced some information useful for tracking patient status, response, or response potential, yet none is very powerful. Each brings novel data to the question: sleep patterns, genetics of drug metabolism, and basal and stress responses elicited by social stressors. It is likely that integration of these several markers will provide the best index of treatment response.

Background
The pathology of acute ischemia is typically related to plaque instability, thrombus formation, and coronary flow obstruction resulting in myocardial damage. The intervention is targeted to restore patency, whether through use of thrombolytic agents or direct revascularization. The relevant biomarker would be related to a patency and/or perfusion assessment. Patency can be assessed directly by angiography or indirectly by perfusion techniques. Cardiac echocardiography is an important part of the management of acute coronary syndromes, and much research has gone into identifying prognostic features. Echocardiography is therefore a useful tool in risk stratification, with high-quality research indicating excess mortality in patients with lower global left ventricular function (ejection fraction), more extensive regional abnormalities, moderate or severe mitral valve regurgitation and acute left ventricular dilation, and longer term remodeling. Unfortunately, such studies have wide confidence intervals; few analyses have been done to determine the predictive power of these variables for cardiac events or death; and no studies have yet demonstrated causal relationships between echocardiographic findings and subsequent events. A promising echocardiographic tool, myocardial perfusion assessment by the use of intravenous microbubble injection, is currently in development. Only one multicenter assessment of test performance has been reported, indicating poor sensitivity for segmental perfusion defects (compared with single photon emission computerized tomography sestamibi), with reasonable specificity. This technique is not yet sufficiently refined to replace angiography in determining coronary patency.

University of Alabama, School of Medicine, Birmingham, AL, USA
There are a multitude of biomarkers useful for detection of stable cardiovascular diseases, including ultrasound methods, x-ray approaches, radionuclide methods, and magnetic resonance (MR) strategies. The usefulness of these methods for two of the most common cardiological problems, ischemic heart disease and valvular heart disease, is described below. In ischemic heart disease, certain biomarkers are useful for detection and for prognostication. Ultrasonic methods used clinically can be divided into transthoracic and transesophageal approaches. Ultrasonography is the most widely used of the imaging biomarkers, and it is used to evaluate morphologic alterations that occur as a consequence of myocardial damage, such as left ventricular aneurysm, ventricular septal defect, and so forth. Ultrasonography is also used to assess left ventricular function both at rest and during infusion of a myocardial stimulant such as dobutamine. Ventricular function at rest correlates well with prognosis, while function during dobutamine stimulation can evaluate viability of poorly contractile segments of heart muscle. X-ray methods have largely revolved around the use of electron beam-computed tomography. The level of calcium in the coronaries is thought to be a means of screening for important coronary artery disease (CAD). It is also possible to assess the patency of coronary artery bypass grafts. Radionuclide methods are useful for detecting the extent of CAD using myocardial perfusion imaging with thallium-201 or a technetium-based tracer. Radionuclides provide the optimal approach for detecting and prognosticating in CAD. Finally, MR methods include imaging to assess morphology, evaluate ventricular function, assess myocardial perfusion, and determine the patency of arterial vasculature and ultimately the composition of arterial plaque.
Nuclear magnetic resonance spectroscopic methods are being applied in a research mode to assess viability and other metabolic characteristics of myocardium. Finally, these techniques, by monitoring changes in ventricular function and metabolism, should provide insight into the optimal timing for valve repair or replacement so that myocardial function would be preserved in the postoperative state. Imaging approaches provide a means to comprehensively detect and prognosticate in cardiovascular disease states by providing a number of biomarkers for myocardial function, perfusion, morphology, arterial patency, arterial wall composition, and myocardial metabolism.

Allegheny General Hospital, MCP/Hahnemann School of Medicine, USA
Electron beam-computed tomography (EBCT) is a highly effective method for detection of calcific coronary atherosclerosis. Thus, the EBCT calcium score is a very useful marker of the extent of coronary atherosclerosis, particularly for studies of clinical epidemiology. Stratification of a population by calcium score correlates well overall with likelihood of coronary stenoses and prevalence of coronary events. In populations with chest pain, those with normal EBCT calcium scores have a much lower likelihood of coronary stenoses than those with high calcium scores. Effective treatment of hyperlipidemia can be associated with a reduction in calcium score. However, the calcium score also has important limitations that restrict its use as a biomarker for coronary atherosclerosis at the present time. The variability of calcium score on repeat EBCT is excessively high. Age-dependent increases in score complicate interpretation. The method may not identify patients with early disease and potentially hazardous vulnerable plaques. A high calcium score is sensitive but nonspecific as a marker for coronary stenoses. Although scores may change with lipid-lowering treatment or increase over time as part of the natural history of coronary atherosclerosis, there is no evidence that such changes correlate with changes in the actual extent of atherosclerosis or risk of clinical events.

St. Louis University School of Medicine, State University of New York (SUNY), USA
In clinical trials, sample size can be reduced by identifying high-risk patient subsets at entry. The magnitude of risk can be related to clinical characteristics and the extent of electrocardiographic (ECG) abnormalities. Novel lead configurations, more frequent ECG acquisition, and use of selected coding criteria have the potential to further enhance identification. Definitions of Q or non-Q wave myocardial infarction vary considerably in trials of acute coronary syndrome (ACS). Cardiac serum markers (i.e., serial CK-MB and cardiac troponins) identify the presence of myocyte injury. Appropriate cutpoints for the definition of myocardial infarction have become problematic of late with the advent of more sensitive markers, especially in patients with heart failure and hypertension and those after interventional and cardiac surgical procedures. Sampling frequency, the type of assay used, missing samples, and determination of myocardial infarction represent challenges to clinical trial design. Estimates of myocardial infarction size have been considered as a surrogate endpoint for mortality, enhancing the specificity of this composite endpoint, but have become more problematic in the era of recanalization. Myocardial infarction definitions in clinical trials of ACS will undoubtedly represent tradeoffs between sensitivity and specificity. Standardized definitions should be used for reporting purposes and for comparability among studies.
Transplantation I Physiologic, Histologic, and Pharmacologic Markers of Graft Function

Background
Graft survival for all solid organ transplantation procedures is restricted by acute and chronic rejections. The solution to this problem is induction of a state of donor-specific tolerance in the patient so rejections will not occur. Current methods of diagnosing allograft dysfunction are inadequate in that significant organ damage occurs prior to the establishment of a clinical diagnosis. Clinical tolerance remains an elusive goal despite success in animal models. One of the main hurdles in developing tolerance strategies is the lack of a clinical biomarker or a "tolerance assay". The development of assays or novel technologies that will enable detection of allograft dysfunction/rejection, monitor responses to therapy, and predict long-term outcomes is vital for the success of transplantation clinical trials.

Objectives
• Assess graft dysfunction for renal, hepatic, and cardiac allografts by histological criteria and identify newer methods to quantitatively assess the degree of dysfunction • Define physiological and pharmacological criteria for graft dysfunction and validate the techniques • Identify areas in need of further diagnostic tool refinement

University of Cincinnati Medical Center, USA
Kidney transplantation outcomes have improved progressively over the past three decades. However, despite a large body of literature produced during this period, few reports indicate agreement on the clinical presentation of acute rejection, the treatment of acute rejection, the response of acute rejection to therapy, and the correlations between the pathologic findings and the clinical presentation and response to treatment. In the precyclosporine era, a number of clinical parameters associated with a diagnosis of acute rejection were described. However, after the introduction of cyclosporine, these became less obvious in the patient experiencing an acute rejection episode. Measurement of serum creatinine has been the most significant biochemical marker of acute rejection, but considerable tissue damage may occur prior to the serum creatinine becoming elevated. A search for a more reliable biochemical marker of graft dysfunction remains elusive, and histologic assessment of the allograft has therefore become the gold standard for the diagnosis of acute rejection. The Banff classification of acute rejection grades the process according to histologic severity. A strong correlation has been shown between the histologic severity and clinical and biochemical parameters and provides a reliable means for stratifying patient risk of treatment success or failure. By using the pathologic findings in conjunction with other markers of acute rejection, the clinician should be able to make the decision on treatment so as to offer the patient the maximum benefit of judicious antirejection therapy, while avoiding unnecessary overimmunosuppression in the absence of data supporting the benefit of such therapy to an individual patient.

Use of Surrogate Endpoints in Cardiac Transplantation Abstract
Leslie W. Miller, M.D.

University of Minnesota, USA
Acute cellular rejection and chronic allograft rejection (allograft coronary disease) are the two primary endpoints in most trials in heart transplantation. Unlike renal transplantation, there is no biochemical marker to suggest or define rejection, and therefore, endomyocardial biopsy has remained the gold standard for diagnosis of rejection in heart transplantation for the past 25 years. However, there is limited evidence to suggest that the incidence, frequency, severity, or time to rejection have a good correlation with graft function, survival, or development of chronic rejection. Perhaps the most definitive study regarding the correlation between acute and chronic rejection was with the use of intravascular ultrasound to measure direct intimal thickening within the allograft vessel as the marker of chronic rejection. Only the average biopsy score in the first 3 months posttransplant had any correlation with development of intimal thickening using the current grading system to define acute cellular rejection. Function of the graft (ejection fraction) can be readily measured, but patients who demonstrate evidence of severe graft dysfunction or hemodynamic compromise have a 40 to 50 percent mortality at 1 year. Therefore, all heart transplant recipients are biopsied on a predetermined protocol basis in hopes of detecting rejection before graft dysfunction develops. Data from protocol kidney biopsies, which were not driven by change in function, have confirmed a significant "trafficking" of lymphocytes and inflammatory cells that, when left untreated, often resolves entirely and is never associated with a change in function. These data suggest that many of the biopsies interpreted as showing histologic rejection in heart transplantation may represent a potentially innocent immunological response.
This phenomenon is one of the main reasons for the lack of correlation of acute rejection with chronic rejection and the problem with the use of biopsy-proven rejection as a primary endpoint in heart transplant trials. A number of noninvasive or surrogate markers have been examined in heart transplantation, including (1) echocardiography to define alterations in diastolic compliance that may precede overt systolic dysfunction; (2) measurement of voltage from endocardial electrodes; and (3) markers of immune activation, including originally simple T-cell subsets, but more recently both surface and soluble interleukin-2 receptor. Other approaches have included examination of proinflammatory cytokines, such as tumor necrosis factor and IL-6, as well as the adhesion molecules ICAM and VCAM. The most important advance in use of biologic markers as endpoints for defining chronic allograft rejection in heart transplant recipients is the use of intravascular ultrasound. This new technology allows direct examination and measurement of the amount of intimal thickening within the allograft vessels. This safe and highly reproducible technology has now become the most definitive surrogate for chronic rejection in any vascularized allograft. The major deficiency in the field of transplantation is the lack of a bioassay of the level of immunosuppression. One recent approach is the use of a cell line expressing donor antigens with which a mixed lymphocyte culture can be performed at any time following transplantation to assess the degree of donor-specific alloreactivity. This test not only provides a relative quantitation of low, medium, or high reactivity but also can be used as a target for increasing or decreasing immunosuppression.

The University of Texas Medical School at Houston, Division of Immunology and Organ Transplantation, USA
The advent of a variety of novel immunosuppressive agents has led to a need to understand their pharmacokinetics and pharmacodynamics when either used alone or in drug combinations. Initial data that the pharmacokinetic behavior of an immunosuppressive drug is important to predict outcome were first obtained with cyclosporine (CsA) (Kahan et al. 1982, 1983, 1984). Almost 20 years of investigation have shown that concentration rather than dose determines outcome: A low drug exposure represents a risk factor for acute rejection episodes (Lindholm and Kahan 1993) and a variable exposure, a risk factor for chronic rejection. Parallel considerations may be important for the dosing of tacrolimus, mycophenolate mofetil, and possibly sirolimus. Pharmacodynamic assays to quantitate drug effects on transplant recipient lymphoid cells have been limited due to the rapid reversibility of the effects, general insensitivity of the assays, and the difficulty of assaying cells ex vivo without altering their pathophysiologic state. The largest effort has been reported with CsA, estimating IL-2 mRNA content by Southern blots with specific probes (Yoshimura et al. 1987), measuring cytokine production by patient lymphocytes (Yoshimura and Kahan 1985), and most recently by in vitro calcineurin assays (Batiuk et al. 1995). Using the median effect equation to obtain a rigorous model of drug action, one can evaluate the immunosuppressive as well as toxic interactions between two agents as more than additive (synergistic), additive, or less than additive (antagonistic) (Kahan 1985). Due to the possibility of interactions of CsA and sirolimus at the tissue level, namely pharmacokinetic interactions at cytochrome P450 3A4 (Stepkowski et al. 1996) or pharmacodynamic interactions at the level of low-density lipoprotein generation and metabolism, we have recently developed new mathematical models that describe combined pharmacokinetic and pharmacodynamic effects.
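The median-effect analysis described above can be sketched numerically. The sketch below, with entirely hypothetical dose-response data and function names, fits the median-effect equation fa/fu = (D/Dm)^m by log-linear regression and then scores a two-drug combination with the mutually exclusive form of the combination index (CI = d1/Dx1 + d2/Dx2), where CI < 1, = 1, and > 1 correspond to the synergistic, additive, and antagonistic interactions mentioned in the abstract:

```python
import math

def median_effect_fit(doses, fa):
    """Fit the median-effect equation fa/fu = (D/Dm)^m by linear
    regression of log10(fa/fu) on log10(D); returns (m, Dm).
    The dose/effect pairs supplied are hypothetical examples."""
    xs = [math.log10(d) for d in doses]
    ys = [math.log10(f / (1.0 - f)) for f in fa]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    # intercept = -m * log10(Dm)  =>  Dm = 10 ** (-intercept / m)
    intercept = mean_y - m * mean_x
    dm = 10.0 ** (-intercept / m)
    return m, dm

def combination_index(d1, d2, fa, m1, dm1, m2, dm2):
    """Combination index (mutually exclusive form):
    CI = d1/Dx1 + d2/Dx2, where Dx_i is the dose of drug i alone
    that would produce the observed combined effect fa.
    CI < 1 synergy, CI = 1 additivity, CI > 1 antagonism."""
    dx1 = dm1 * (fa / (1.0 - fa)) ** (1.0 / m1)
    dx2 = dm2 * (fa / (1.0 - fa)) ** (1.0 / m2)
    return d1 / dx1 + d2 / dx2
```

Because the median-effect plot is linear in log-log coordinates, noise-free data recover m and Dm exactly; with real assay data the same regression gives least-squares estimates.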

Department of Immunology and Clinic of Organ Transplantation, Ospedali Riuniti di Bergamo -Mario Negri Institute for Pharmacological Research, Bergamo, Italy
Pharmacological monitoring is a key step in the management of transplant recipients to allow adequate immunosuppression to avoid graft rejection and minimize drug toxicity. Therapeutic monitoring of trough blood cyclosporine (CsA) concentration has been widely adopted to adjust CsA dose in individual subjects. However, trough-level monitoring is not of universal help. More informative than trough CsA concentration is the area under the concentration-time curve (AUC), which is calculated from the individual complete pharmacokinetic profile. However, this approach is quite expensive and time consuming and increases discomfort for the patients, making it seldom feasible in routine outpatient clinic monitoring. Thus, abbreviated CsA AUC profiles have been proposed. Recent data show the possibility of accurately estimating the CsA AUC using only three very early blood samples after Neoral dosing. Attempts to define abbreviated kinetic profiles in AUC monitoring have also been extended to the more recent immunosuppressants that are now entering routine clinical application. This is the case for tacrolimus and mycophenolate mofetil, as well as sirolimus, for which a limited sampling strategy represents an efficient approach to assess total exposure to drugs. Although the proposed strategies are good predictors, they are difficult to apply to day-by-day drug monitoring in clinical practice since in all cases the last time-point of blood sampling is far from drug dosing (6, 9, or even 12 hours), making the procedure cumbersome for outpatients and taxing for the transplant centers in terms of staff effort. Evaluating drug exposure is the conventional way to optimal pharmacological monitoring of transplant patients but reveals little about the level of immunosuppression achieved by a given antirejection drug. Thus, efforts should focus on setting up simple, accurate, and precise methods for monitoring the level of T-lymphocyte inhibition in these circumstances.
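The two approaches contrasted above can be illustrated in a short sketch: a full-profile AUC computed by the trapezoidal rule, and a limited-sampling estimate built from a few early post-dose concentrations. The regression coefficients in the limited-sampling function are hypothetical placeholders, not values from the abstract; in practice they must be fitted against full pharmacokinetic profiles in a training population.

```python
def trapezoid_auc(times, conc):
    """AUC over the sampled interval by the trapezoidal rule.
    times in hours, conc in concentration units (e.g., ng/mL)."""
    return sum((t2 - t1) * (c1 + c2) / 2.0
               for (t1, c1), (t2, c2)
               in zip(zip(times, conc), zip(times[1:], conc[1:])))

def limited_sampling_auc(c0, c1, c2, a=100.0, b0=1.5, b1=3.0, b2=2.0):
    """Limited-sampling AUC estimate from three early post-dose
    samples (e.g., predose, 1 h, and 2 h after Neoral dosing).
    The coefficients a, b0, b1, b2 are HYPOTHETICAL: a real model
    regresses full-profile AUCs on the early concentrations."""
    return a + b0 * c0 + b1 * c1 + b2 * c2
```

The appeal of the limited-sampling form is operational: all samples are drawn within the first couple of hours after dosing, avoiding the 6- to 12-hour final time points that make full or abbreviated profiles impractical for outpatients.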

Prediction of Long-Term Renal Allograft Outcome Using Image Analysis of Sirius Red Staining in Protocol Biopsies
Paul C. Grimm, M.D.

Department of Pediatrics, University of California at San Diego, USA
The 6-month Banff Chronic Score (BCS) is a predictor of the 24-month serum creatinine in renal transplant patients. As components of the Banff Chronic Score are subject to sampling error, computerized image analysis of interstitial fibrosis may allow more precise quantitation. The objective of this study was to assess whether quantitation of interstitial fibrosis by image analysis could predict long-term graft outcome. We studied 6-month protocol allograft biopsies from 51 patients with at least 3 years of followup. 1/serum creatinine graphs were used to estimate the time to graft failure (TTGF) by extrapolating to a creatinine level of 5 mg/dL. A blinded observer analyzed biopsy fibrosis by using the mean particle size of Sirius Red-stained tubulointerstitial collagen using image analysis with watershed segmentation. The BCS was used as a comparison. The total BCS of the 6-month biopsy was correlated with TTGF (p = 0.0011, r = 0.44, r2 = 0.181). The mean particle size of interstitial Sirius Red staining was also correlated with TTGF (p = 0.0001, r = 0.516, r2 = 0.266). This study of computerized image analysis indicates that Sirius Red analysis correlates more strongly with TTGF than does the BCS. Further development is necessary to determine whether this will be useful in predicting allograft outcome.
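The TTGF estimate described above rests on the observation that 1/serum creatinine tends to decline roughly linearly in a failing graft. A minimal sketch of that extrapolation, with hypothetical data and function names, regresses 1/creatinine on time and solves for the time at which creatinine reaches the 5 mg/dL failure threshold:

```python
def time_to_graft_failure(months, creatinine, failure_cr=5.0):
    """Estimate time to graft failure (TTGF) by fitting a straight
    line to 1/serum creatinine versus time and extrapolating to the
    time at which creatinine reaches failure_cr (5 mg/dL in the
    abstract).  Inputs: months since transplant and serum creatinine
    in mg/dL at each visit (hypothetical sample data in the test)."""
    ys = [1.0 / c for c in creatinine]
    n = len(months)
    mx = sum(months) / n
    my = sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(months, ys)) / \
            sum((x - mx) ** 2 for x in months)
    intercept = my - slope * mx
    if slope >= 0:
        return float('inf')  # no decline in 1/creatinine: failure not projected
    return (1.0 / failure_cr - intercept) / slope
```

This is an extrapolation, so it inherits the usual caveats: short followup, assay noise, or a nonlinear course all shift the projected failure time.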

University of Texas Medical School, Houston, USA
Long-term renal allograft survival is due to the efficacy of immunosuppressants or to an immunoregulatory recipient (recip) hyporesponsiveness. In vitro immunologic evaluation parameters were used to identify immunologically low-risk allograft recips with improved long-term graft survival. Recips whose pretransplant (Tx) sera had little IgG anti-HLA class I antibody (< 10 percent PRA, ELISA-developed) experienced a 35 percent versus a 70 percent rejection frequency (p < 0.01) and an 85 percent versus 74 percent 1-year graft survival (p < 0.01) when compared with recips with reactive anti-HLA sera (PRA ≥ 10 percent). The pre-Tx PRA sera < 10 percent delineated an unsensitized, weak immune responder. The recip-donor mixed lymphocyte reaction (MLR) also served as an in vitro correlate, reflecting recip antidonor hyporesponsiveness or hyperresponsiveness. Hypo-MLR recips experienced only 27 percent versus 54 percent rejection episodes (p < 0.05) and had a 92 percent versus 79 percent 1-year graft survival (p < 0.01) compared with hyper-MLR recips (SI ≥ 10). There was a significant correlation between recips with pre-Tx PRA sera < 10 percent and a post-Tx hypo-MLR: 89 percent (46/52) of post-Tx hypo-MLR versus 19 percent (12/63) of hyper-MLR responders displayed pre-Tx PRA sera < 10 percent (p < 0.001). These data suggest that unsensitized recips (PRA < 10 percent) may develop an immunoregulated status resulting in donor hyporesponsiveness and improved graft survival and may be candidates for tapering and/or withdrawing immunosuppressants.

Predictive Markers of Cancer in Smokers
Margaret R. Spitz, M.D.

University of Texas, MD Anderson Cancer Center, USA
Risk biomarkers used in the design of chemoprevention trials, and especially in the identification of high-risk cohorts, can affect study size and duration as well as stratification and analytic considerations. Their value is further enhanced if they are also predictive of response to the intervention. These biomarkers should have high sensitivity and specificity; the biomarker assay should be standardized, validated, relatively easy to measure, and should require noninvasive techniques. Since only a fraction of smokers will develop neoplastic lesions, markers of genetic susceptibility to tobacco carcinogenesis would in theory help identify highest risk smokers for enrollment in chemoprevention trials. A number of low penetrance/high frequency genes are likely to be relevant in the etiology of tobacco-related cancers. For example, the dose of tobacco carcinogens to which aerodigestive tract tissue is exposed is modulated by genetic polymorphisms in the cytochrome P450 multigene family of enzymes. These enzymes are involved in phase I oxidative processes that may create intermediates more reactive than the parent compounds. The derivative compounds may covalently bind to DNA and form carcinogen-macromolecular adducts. Phase II metabolic processes generally inactivate these genotoxic compounds through conjugation that promotes cellular excretion. Cancer risk is thus defined by the balance between metabolic activation and detoxification of tobacco carcinogen compounds, as well as by the efficiency of DNA repair. Several phenotypic assays are predictive of risk. These include the host cell reactivation assay that measures DNA repair capability and the mutagen sensitivity assay in which the frequency of in vitro bleomycin- or benzo[a]pyrene-induced breaks in cultured peripheral lymphocytes is quantified as a measure of sensitivity. These assays are not optimal since they require viable lymphocytes and are labor intensive.
Nevertheless, our data show that they are predictive of risk and might also be useful to predict patients at higher risk of developing recurrence and/or second primaries. Emphasis should now be placed on studying phenotype/genotype correlations and on comparing markers in surrogate tissue (peripheral lymphocytes) with molecular and cellular changes in the target (lung) tissue. In the near future, microarray technology will enable large-scale, low-cost genotyping with the use of automated workstations for extracting DNA and performing DNA amplification, hybridization, and detection. As many as 100 genes at a time can be analyzed for this allele signature analysis. The initial chip being developed will include putative genes for tobacco-induced cancer susceptibility as well as nicotine addiction. The legal and ethical implications of this emerging technology are immense.

The University of Texas, M.D. Anderson Cancer Center, Houston, Texas 77030, USA
Cancer chemoprevention is the intervention in the multistep process of carcinogenesis by chemical agents (Hong et al. 1995; Hong and Sporn 1997; Lotan 1996). Prevention trials that rely on cancer development as an endpoint are prolonged and require large populations. Therefore, biomarkers are needed to serve as intermediate endpoints (Hong et al. 1995; Hong and Sporn 1997). We investigated the expression of nuclear retinoic acid receptors (RARs) and retinoid X receptors (RXRs) (alpha, beta, and gamma) in surgical specimens collected from normal, premalignant, and malignant head and neck and lung tissues during retinoid chemoprevention trials (Hong et al. 1995; Lotan 1996, 1997) using a nonradioactive in situ hybridization technique (Xu et al. 1994; Xu and Lotan 1998). The level of RARbeta messenger ribonucleic acid (mRNA) was suppressed in 50 to 65 percent of oral premalignant lesions and head and neck squamous cell carcinomas (Xu et al. 1994; Lotan et al. 1995). Treatment of leukoplakia patients with 13-cis-retinoic acid caused an increase in RARbeta expression, which was associated with clinical response (Lotan et al. 1995). Similar results were obtained recently with specimens from normal and malignant lung biopsies (Xu et al. 1997). Thus, RARbeta can serve as an intermediate biomarker because its level decreases during the carcinogenic process, it is upregulated by the chemopreventive agent (retinoid), and this upregulation is associated with clinical response. Several ongoing trials will assess in a prospective manner the potential of employing RARbeta as an intermediate biomarker for chemoprevention trials.

Theoretical and Practical Considerations in the Use of Surrogate Endpoints in Cancer Prevention Research
Arthur Schatzkin, M.D., Dr.P.H.

National Cancer Institute, USA
Because cancer occurs relatively infrequently, prevention trials with cancer endpoints tend to be large, long, and costly. Studies with surrogate cancer endpoints may be smaller, shorter, and cheaper. The key question, though, is whether studies with surrogate endpoints give us the right answers about cancer. Using colorectal cancer as an illustration, I will discuss two potential surrogate endpoints: epithelial cell hyperproliferation and adenoma formation. Studies with colorectal mucosal proliferation as an endpoint have been influential (e.g., the calcium chemoprevention story). Although hyperproliferation has been postulated as a relatively early event in carcinogenesis, whether it is a necessary step on the pathway to cancer is uncertain. Alternative pathways to cancer that bypass (and possibly offset) hyperproliferation are plausible, which casts doubt on using hyperproliferation as a surrogate endpoint. The most definitive way to address this uncertainty is to integrate proliferation markers in prevention trials with cancer endpoints, the very studies we were trying to avoid. Less persuasive evidence can be gleaned from observational cohort studies of proliferation versus cancer. Furthermore, even if we establish that hyperproliferation is a good surrogate endpoint in a prevention trial with treatment A (a chemopreventive agent or dietary modulation), there is no guarantee that this marker is a valid surrogate for treatment B. Similar problems are likely to attend the use of other potential molecular and cellular surrogates. In contrast, adenomatous polyp recurrence in the large bowel is widely regarded as a strong surrogate endpoint for intervention studies. Adenoma formation, a relatively late event in carcinogenesis, appears to be a necessary step in the development of most colorectal cancers (the adenoma-carcinoma sequence). 
Even for adenomas, however, inferences to cancer may be problematic: (1) A small proportion of cancers may arise from areas of flat dysplasia not readily observable through the colonoscope; (2) preventive interventions may differentially affect the pathways to bad (progressing to cancer) and innocent (not progressing) adenomas, complicating the interpretation of polyp trials; and (3) because recurrent adenomas tend to be small, if an intervention operates primarily in the transition from small to large adenoma, or large adenoma to cancer, then this effect will be largely missed in polyp trials. Cervical intraepithelial neoplasia type 3 (CIN3) may be one of the strongest surrogates for cancer, but this state of severe dysplasia/carcinoma in situ is obviously very close to being invasive cervical cancer. The analogous state for large bowel might be the so-called advanced adenoma (more than 1 cm in size, villous elements, or high-grade dysplasia). An adenoma recurrence trial with large or advanced adenomas as endpoints, however, would have to be substantially larger and more expensive than the current generation of polyp trials. For surrogate endpoints in cancer prevention research, study cost and inferential certainty may well be directly related.

Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, USA
A variety of measurements and outcomes have been proposed as possibly predictive of the development of cancer. The approaches to assessing the predictive value of such variables are complex and increase in complexity when considering multiple variables simultaneously. Nevertheless, it may be the case that combinations of biomarkers or early clinical events (e.g., precancerous changes) will prove to be better predictors of disease than any single variable. Statistical methods for evaluating interventions based on multiple outcomes have been developed; their potential application to the setting of cancer prevention studies will be considered.

Biomarkers and Surrogate Endpoints in Breast Cancer Prevention Studies
Barbara S. Hulka, M.D., M.P.H.

University of North Carolina at Chapel Hill, USA
Cancer prevention studies evaluate the effectiveness of interventions to reduce cancer risk in population subgroups using the model of randomized, double-blind clinical trials. Biomarkers are used to enrich the study population with individuals at high risk of breast cancer (i.e., susceptibility markers), measure compliance with the intervention, and serve as surrogate endpoints. A study population at high risk of disease serves both to increase the potential for benefits to study participants and to reduce sample size requirements by increasing the frequency of outcome events. Population groups with germline alterations in major genes, such as the breast cancer genes BRCA1 and BRCA2, or with common polymorphisms in specific P450 genes may be suitable for targeting specific interventions. Biomarkers of compliance are specific to the intervention. Most critical to trial design is the need for surrogate endpoints that may be identified well in advance of the occurrence of disease. To be useful, these measurable biological events must occur with greater frequency than clinically detectable breast cancer, occur at an earlier point in disease pathogenesis, and be strong predictors of invasive cancer occurrence. Breast parenchymal patterns, depicted on mammograms as variations in radiographic density, illustrate a potential surrogate endpoint. A high percentage relative to a low percentage of density in the breast confers at least a fourfold increased risk of breast cancer. Thus, the correlates of breast density, including hormone replacement therapy, require further evaluation and will be discussed.

Johns Hopkins University School of Medicine, Otolaryngology -Head & Neck Surgery, Baltimore, Maryland 21205-2196, USA
Clonality is a fundamental characteristic of human cancer. One transformed cell gives rise to daughter cells that all exhibit the same genetic change that initially provided a growth advantage to the parent cell (Nowell 1976). The faithful transmission of these and other genetic changes in subsequent daughter cells has been well documented in vitro and in vivo (Sidransky et al. 1992). Correlation of these clonal genetic changes with histopathologic progression has led to the development of molecular progression models (Fearon et al. 1990). Thus, genetic markers able to detect the clonal outgrowth of neoplastic cells may be useful in the detection of primary cancers. We have demonstrated that point mutations in critical oncogenes and tumor suppressor genes can be used in polymerase chain reaction detection of rare neoplastic cells in bodily fluid. We have identified ras gene mutations in the stool of patients with colorectal cancer, p53 mutations in the urine of patients with bladder cancer, and both ras and p53 mutations in the sputum of patients with lung cancer (Sidransky 1997). Although widespread microsatellite instability is rare, we found that markers composed of larger repeat sizes (trinucleotides and tetranucleotides) were more likely to display instability in many tumor types (Mao et al. 1994). We have identified more than 90 percent of bladder cancers at initial presentation and followup with a panel of 20 carefully selected markers (Mao et al. 1996; Steiner et al. 1997). Furthermore, we have identified a clonal population of cells several months before cystoscopic detection in at least two patients. Patients without evidence of recurrence (NED) have reverted to normal by molecular analysis. Epigenetic promoter methylation can inactivate tumor suppressor genes and serves as a marker of clonal expansion. 
Using a panel of differentially methylated genes, including p16, we have shown that 50 percent of lung cancer patients display a methylated DNA pattern in sputum and serum (Ahrendt et al. 1999; Esteller et al. 1999). Novel automated approaches will allow rapid assessment of these tests in prospective trials and rapid integration into the clinical setting. Identification of clonal DNA alterations in clinical samples continues to evolve as a promising method of cancer detection (Ju et al. 1996).

Genetics of Preneoplasia and Lung Cancer Development
Adi F. Gazdar, M.D.

Hamon Cancer Center, UT Southwestern Medical Center, Dallas, TX, USA
Three concepts are important in understanding the pathogenesis of lung cancer: (1) lung cancer is preceded by multiple preneoplastic changes that take many years to evolve into invasive cancer; (2) multiple molecular changes involving recessive and dominant oncogenes are present in lung cancers; and (3) these changes commence early in the pathogenesis. These concepts can be explained by the field cancerization theory, which states that much of the upper aerodigestive tract has been damaged by exposure to carcinogens present in tobacco smoke. The major findings are as follows:
1. Preneoplastic bronchial lesions can be recognized by fluorescent bronchoscopy.
2. Women smokers are at greater risk for developing lung cancer, but they have fewer bronchial preneoplastic lesions than male smokers.
3. Molecular changes are present in the majority of smokers, and they commence early during lung cancer pathogenesis, even in histologically normal epithelium.
4. Similar changes are present in the bronchial epithelium of smokers with lung cancer.
5. Patients with small-cell lung cancer have more extensive bronchial epithelial changes than other lung cancer patients.
6. Numerous small clonal patches, most consisting of only a few hundred cells, are present in the epithelium of lung cancer patients.
7. Preneoplastic lesions and molecular changes persist for many years after smoking cessation.

Issues and Challenges in the Application of Biomarkers in Cancer Detection and Prevention
Sudhir

National Cancer Institute, USA
New technologies coming from the field of molecular and cellular biology can now identify genetic as well as antigenic changes during the early stages of the malignant process. Some of these changes show promise as biomarkers for preneoplastic development or for early malignant transformation. However, it is essential to fully understand the molecular pathogenesis of cancer (i.e., the natural history of tumor progression at the molecular level) before accurate molecular determinants or biomarkers can be developed. These biomarkers must then be validated for sensitivity, specificity, and positive predictive value in defined clinical settings to fully evaluate their performance characteristics. In addition, for these biomarkers to be effective in earlier cancer detection, they need to be validated in asymptomatic populations using carefully designed epidemiologic studies, with consideration of interindividual and intraindividual variability, cost-effectiveness, and ethical issues. The authors will discuss an array of issues and challenges that need to be addressed prior to the application of biomarkers in earlier cancer detection and prevention trials.

The past two decades have seen remarkable advances in medical imaging. The development of magnetic resonance imaging, in particular, has brought unprecedented power to the study of joint disease and its causes and has offered a unique opportunity to explore arthritis in ways not imaginable in the past. This discussion will review the different roles that medical imaging can play in clinical trials and outline the deliverables that imaging markers must meet to be useful in these roles. It also will review the most promising imaging markers available today for evaluating arthritic changes in bone, articular cartilage, and other joint structures and point to areas where further progress can be anticipated.

University of California, San Diego and Veterans Affairs Medical Center, San Diego, USA
Radiological evaluation of arthropathies is classically based on analysis of two fundamental findings: the distribution and morphology of osteoarticular abnormalities. Ankylosing spondylitis (AS) primarily targets synovial and cartilaginous joints, as well as sites of tendinous and ligamentous attachment to bone. Radiographic abnormalities in patients with AS predominate in the sacroiliac joints and spine, followed in descending order of frequency by the hips, glenohumeral joints, knees, hands, wrists, and feet. In the past, scientific investigations commonly evaluated the efficacy of imaging in providing an accurate diagnosis of early AS. Although the early and accurate diagnosis of AS is still a primary goal of imaging, other concerns and questions are voiced with increasing frequency. For example, given the substantial constraints imposed by managed care, how should imaging be utilized to provide a cost-effective impact on patient care and outcome? Specifically, what is the optimal imaging algorithm for the assessment of patients? Other questions focus on more academic (and corporate) concerns: What is the most sensitive and specific means for detecting response to a new therapy? Although conventional radiography continues to be the principal method of radiologic assessment, other imaging methods (e.g., magnetic resonance imaging) are expected to be used in the future with increasing frequency to detect pathologic changes involving joints, bones, and entheses.

Harvard Medical School, USA
A variety of modalities have now shown feasibility for modifying the progression of articular cartilage damage in osteoarthritis. This has resulted in a need for better methods to image early changes in cartilage and monitor the progression of these changes. The advantages and limitations of traditional imaging methods, such as radiography, low-frequency ultrasound, computed tomography, magnetic resonance imaging, and arthroscopy, for assessing articular cartilage, will be examined. In addition, newer technologies, such as optical coherence tomography (OCT), will be discussed. OCT is a new method of high-resolution imaging that allows distances to be measured on a micron scale. OCT can be envisioned as analogous to ultrasound, measuring the intensity of backreflected infrared light rather than sound. All technologies will be discussed with respect to their ability to identify fine changes in the cartilage necessary for monitoring therapeutic modalities.

University of California, San Diego School of Medicine, USA
Cytokines play a key role in the perpetuation of rheumatoid arthritis (RA). Careful studies of the cytokine profile in RA have demonstrated an abundance of macrophage- and fibroblast-derived cytokines such as IL-1, TNF-alpha, and IL-6. Furthermore, production of proinflammatory mediators, including prostaglandins and metalloproteinases, is regulated in rheumatoid synovium by these cytokines. In an effort to understand the mechanism of action of antirheumatic drugs, surrogate markers for clinical trials have been evaluated. Although peripheral blood and synovial effusions are more accessible and a few studies have demonstrated clinical correlations with cytokines or soluble cytokine receptors, they are probably less useful than direct examination of synovial tissue. Serial synovial biopsies to evaluate surrogate markers were first used to examine the effect of steroids and methotrexate on metalloproteinase gene expression. Subsequent studies have demonstrated that blind percutaneous biopsies are essentially equivalent to arthroscopic samples and can be used to evaluate the expression of cytokines. For instance, methotrexate and anti-TNF-alpha therapy significantly decrease synovial expression of cytokines as determined by immunohistochemistry. This method for assessing cytokine expression is reproducible and has been carefully validated. Attempts to quantify cytokine mRNA in synovial tissue using nested reverse transcriptase-polymerase chain reaction are also promising, although no studies to date have directly compared mRNA and protein analyses. Additional studies are required to develop clear criteria for determining the most appropriate techniques for assaying cytokines as well as the best cytokine surrogate markers.

Genetic and Major Histocompatibility Complex Markers of Disease Severity in Rheumatoid Arthritis
Cornelia M. Weyand, M.D.

Mayo Clinic, USA
Rheumatoid arthritis (RA) is a chronic inflammatory disease that often leads to disability and crippling. It is now recognized that RA is a multigene disorder with several genetic risk factors contributing to pathogenesis. We have proposed that multiple subtypes of RA exist and that different phenotypes of RA correspond to the inheritance of different arrays of disease risk genes. This concept has been supported by a detailed analysis of the distribution of human leukocyte antigen (HLA)-DRB1 polymorphisms in patient cohorts, indicating that the clinical heterogeneity in course and outcome correlates with HLA-DRB1 allelic polymorphisms. A set of HLA-DRB1 alleles has been recognized as disease associated, but different alleles are enriched in distinct subtypes of RA. Rheumatoid factor-positive destructive disease is preferentially associated with the HLA-DRB1*0401 allele, whereas HLA-DRB1*0404 and B1*0101 predispose to milder and seronegative disease. Inheritance of two copies of RA-associated alleles carries a high risk for extra-articular spreading of RA. Homozygosity for HLA-DRB1*0401 has been reported in patients with the most serious complication of RA, rheumatoid vasculitis. In addition to HLA-encoded polymorphisms, abnormalities in the generation and function of CD4 T cells can be useful in dissecting patient subsets with different variants of RA. In patients with extra-articular RA, unusual CD4+ lymphocytes emerge that are characterized by a deficiency for the CD28 molecule. CD4+CD28- T cells have a tendency to form large clonal populations, produce high amounts of interferon-gamma, and exhibit autoreactivity. Accumulation of these CD4 T cells in a subset of RA patients likely reflects a fundamental abnormality in T cell homeostasis. Appreciation of the heterogeneity of the synovial component of RA has come from studies describing at least three different patterns of lymphoid organization and tissue cytokine production in the synovium of RA patients. 
Genetic elements determining disease expression in the inflammatory lesions await identification, but candidate genes include cytokine genes and tissue injury response genes. The ultimate goal of these studies is to dissect the phenotypic and genotypic heterogeneity of RA and to correlate combinations of disease-risk genes with clinical variants of the disease. The recognition of clinical subcategories will be required to optimize pathogenic studies.

Molecular markers of cartilage turnover have been studied in osteoarthritis (OA) (Lohmander et al. 1995, 1998), supporting a relationship between marker concentrations in joint fluid, serum, or urine and cartilage turnover. This provides face validity for these markers to monitor dynamic changes in the target tissue. Other aspects of validity for the use of markers are less well supported, but new data suggest that markers will be useful in future trials. Increased serum concentrations of hyaluronan, C-reactive protein, and cartilage oligomeric matrix protein predict future OA progression (Månsson et al. 1995; Sharif et al. 1995a, 1995b; Spector et al. 1997), which can be used to select high-risk individuals in early trials. Data on within- and between-patient variability for molecular markers in joint fluid, serum, and urine are available for stable OA cohorts (Lohmander et al. 1998) and suggest that (1) variability differs between markers in the same compartment, (2) variability is lower within than between patients, and (3) markers are responsive to change. Some 30 patients per treatment arm would be needed to show a change of 0.5 standard deviation with 80-percent power (Lohmander et al. 1998). The final answer on the utility of these surrogate measures must await the availability of an agent that changes OA disease progression.
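The quoted figure of roughly 30 patients per arm is consistent with a standard normal-approximation sample-size calculation for detecting a within-patient change of 0.5 standard deviations at two-sided alpha = 0.05 and 80-percent power. A minimal sketch (the paired, within-patient design and the exact alpha are assumptions, not stated in the abstract):

```python
from math import ceil
from statistics import NormalDist  # Python standard library (3.8+)

def n_per_arm_paired(effect_sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per arm to detect a within-patient change of `effect_sd`
    standard deviations (paired design, normal approximation)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # ~1.96 for two-sided alpha = 0.05
    z_beta = z(power)            # ~0.84 for 80-percent power
    return ceil(((z_alpha + z_beta) / effect_sd) ** 2)

print(n_per_arm_paired(0.5))  # ~32 per arm, near the "some 30 patients" quoted
```

A two-sample (between-arm) comparison of the same effect would roughly double this number, so the design assumed here matters.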

Collagen Type II Cross-Linked Telopeptides: A Promising Marker of Cartilage Degradation in Arthritis
David Eyre, Ph.D.

Burgess Professor of Orthopedic Research, University of Washington, Seattle, USA
A specific cartilage degradation marker has potential value in chondro-protective drug development and clinical management of arthritis patients. There is a need for minimally invasive biochemical assays that can assess the rate of cartilage degradation in patients with degenerative joint diseases. The collagen framework of cartilage turns over extremely slowly in the normal adult, and its gross degradation in articular cartilage is believed to be a critical, irreversible event in osteoarthritis. A degradation assay for cartilage collagen is particularly desirable. We describe an advance in developing an immunoassay designed to measure pyridinoline cross-linked telopeptides from type II collagen in human urine, serum, and synovial fluid.

Professor of Medicine and Head, Rheumatology Division; Director Multipurpose Arthritis and Musculoskeletal Diseases Center, Indiana University Medical Center, USA
Development of potential disease-modifying osteoarthritis (OA) drugs has prompted efforts to develop outcome measures of OA progression in humans. Although interest exists in various imaging procedures and arthroscopy, these techniques have not been validated or been proved suitable for use in clinical trials. Interest has grown, therefore, in biochemical/immunochemical tests for monitoring OA progression. Given the problems that exist in relating the serum concentration of a cartilage-derived molecule to events in an index joint, it has been suggested that synovial fluid (SF) measurements may be more useful as markers of disease activity or severity in OA. However, unless factors that affect the kinetics of removal from the joint are taken into account, the SF concentration of these molecules cannot be a valid quantitative indicator of changes in articular cartilage metabolism. Although synovitis increases clearance of proteins from the joint space, we found that even after adjustment for the increased clearance rate in the OA knee, the synovial fluid concentration of sulfated glycosaminoglycans (most of which are derived from the cartilage) did not correlate with the severity of concurrent or subsequent cartilage damage. This emphasizes that the SF concentration of a cartilage-derived marker is related not only to its clearance but also to other, more proximal variables, such as its rate of degradation, matrix permeability, and the rate of synthetic activity by chondrocytes in the OA cartilage.

Objectives
• Review the state of current biomarkers and surrogate endpoints in sepsis clinical research.
  - What is the experience to date in clinical trials with specific biomarkers and secondary endpoints?
  - What is the association of biomarkers and surrogate endpoints with mortality or organ failure?
  - How do physiologic endpoints (e.g., reversal of shock, development of organ failure) interact with biomarkers and mortality?
  - Have these physiologic endpoints been affected by the experimental therapies?
• Discuss the role of biomarkers and surrogate endpoints in enhancing development of trials of novel therapies for septic shock, and review their role in preclinical and clinical studies in measuring therapeutic and adverse effects of agents.
  - Can substitution of biomarkers for clinical endpoints be used to evaluate the safety and efficacy of a novel therapy?
  - What is necessary to validate biomarkers in preclinical and clinical studies?
  - Can information be obtained from existing databases of completed trials to validate biomarkers?
  - Will trial design be improved with novel endpoints?
  - Can biomarkers be applied as surrogate endpoints in Phase III trials of therapies for sepsis?

Major breakthroughs for the treatment of sepsis remain elusive. More than 10,000 patients (21 studies) have been enrolled in trials of nonsteroidal anti-inflammatory agents in septic shock, and no agent has altered outcomes. When the treatment effects from these trials are pooled, a small (3 percent) beneficial effect, equal to a 7-percent reduction in mortality, is found (Natanson et al. 1998). These data suggest that mediator-specific anti-inflammatory therapies as currently applied (e.g., patient selection, dose, duration of therapy) have at most only small beneficial effects. Power analysis suggests that demonstrating these modest effects would require large trials (more than 5,000 patients). To date, however, clinical parameters (e.g., severity of illness scores) or biologic markers (e.g., blood cytokine and endotoxin levels or cell antigen expression) have not proven useful for selecting patients for these inflammation-modifying therapies (Reinhart et al. 1996; Abraham et al. 1998). 
Although biomarkers in some diseases correlate with pathogenesis and outcome (e.g., CD4 cell counts in HIV infection), syndromes such as sepsis are more difficult to characterize because the clinical importance of biomarkers may depend on their context (e.g., tumor necrosis factor [TNF] has beneficial and harmful effects) and the net effects of other signaling molecules (Brandtzaeg et al. 1996). Identifying new host- or microbial-derived biomarkers, refining the use of existing markers, and developing models that reflect the complex nature of these interactions will be important in future clinical trial development.
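The scale of trial implied by a 3-percent absolute mortality reduction can be sketched with the standard two-proportion sample-size formula. The 43-percent control-arm mortality below is an assumption chosen so that a 3-percent absolute drop matches the quoted 7-percent relative reduction; it is not a figure from the abstract:

```python
from math import ceil
from statistics import NormalDist  # Python standard library (3.8+)

def n_per_arm(p_control: float, p_treated: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per arm to detect a drop from p_control to p_treated
    (two independent proportions, normal approximation)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)
    z_beta = z(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)

# Assumed 43% baseline mortality: 3% absolute is a ~7% relative reduction.
per_arm = n_per_arm(0.43, 0.40)
print(per_arm, 2 * per_arm)  # thousands per arm; the total exceeds 5,000
```

Under these assumptions the total enrollment comes out well above the 5,000-patient figure cited, which is consistent with the abstract's power-analysis claim.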

IL-6 and Tumor Necrosis Factor Levels as Markers of Response in Sepsis Trials
Edward Abraham, M.D.

Division of Pulmonary Sciences and Critical Care Medicine, University of Colorado Health Sciences Center, USA
Although elevated circulating IL-6 levels have been shown to correlate with poor outcome in septic patients, it has been more difficult to show any relation between IL-6 and response to therapy. The INTERSEPT and NORASEPT II trials examined the utility of a murine anti-tumor necrosis factor (anti-TNF) alpha monoclonal antibody in patients with septic shock. In neither study did IL-6 levels predict response to anti-TNF therapy. Similarly, a p55 TNF receptor fusion protein was studied in two clinical trials of patients with severe sepsis with or without hypotension. Although there was a significant relationship in both studies between IL-6 levels and outcome, IL-6 levels had no apparent value for predicting response to anti-TNF therapy. In contrast, a small study using Fab fragments of a murine monoclonal antibody showed a linear relationship between the dose of this anti-TNF therapy and improved survival, but only in patients with IL-6 levels greater than 1,000 pg/mL. These results could not be confirmed in a larger European study. However, a large North American study of more than 2,000 patients aimed at further examining the relationship between IL-6 levels and outcome with this anti-TNF therapy was recently completed, and results should be available shortly. Circulating levels of TNF are present in only a minority of patients with severe sepsis or septic shock. However, in the NORASEPT II trial, those patients with detectable plasma TNF alpha levels seemed to have a better response to the anti-TNF antibody than patients without such elevations in circulating TNF.

Cytokine Balance in Acute Respiratory Distress Syndrome: Implications for Detecting Acute Lung Injury
Thomas R. Martin, M.D.

University of Washington School of Medicine, USA
Acute respiratory distress syndrome (ARDS) is characterized by an intense inflammatory response in the lungs that begins before ARDS is clinically evident. Markers of inflammation in blood and lungs were among the first measurements made to predict the onset and outcome of ARDS. Despite initial studies in small samples, single cytokine markers in blood and lung fluids are not consistently predictive. The biological activity of individual cytokines is determined by the balance between cytokines and their naturally occurring inhibitors and by other factors, such as binding to tissue matrix. In the Seattle ARDS Specialized Center of Research program, we have prospectively studied 25 patients at risk for ARDS and 45 with established ARDS, using serial bronchoalveolar lavage. Proinflammatory cytokines and chemokines are detectable in the lungs of patients at risk, increase at the onset of ARDS, and decline with time. For tumor necrosis factor alpha, IL-1alpha, and IL-6, the molar concentrations of naturally occurring inhibitors (sTNFRI and II, IL-1RA, sIL-6R and others) exceed those of the ligands at all times by tenfold or more. High concentrations of anti-IL-8 IgG and alpha-2-macroglobulin, which binds IL-8, also are present. The only cytokine that increases with time in ARDS is migration inhibitory factor, a naturally occurring antagonist of the inhibitory effects of cortisol on macrophage cytokine production. Neither the cytokines nor their inhibitors were strong predictors of outcome. Thus, cytokine balance complicates the use of single cytokine measurements as predictors of ARDS. Markers of the effects of inflammation on the structural components of the alveolar wall may prove to be more useful in predicting the onset or outcome of ARDS.

Nonmortal Clinical Endpoints for Trials in Critically Ill Patients
Gordon R. Bernard, M.D.

Vanderbilt University School of Medicine, USA
The cause of death in most intensive care unit (ICU) patients is multiple organ dysfunction or failure (MOD). Life support is increasingly effective (at least in the short term). Well-established methods exist for outcome prediction (e.g., APACHE, SAPS, MPM), but there are no well-established methods for systematically quantifying morbidity (or severity of illness). We hypothesized that standardized assessment of organ dysfunction can be used as a tool to measure important clinical morbidity in clinical trials and clinical practice. Methods included development of standard definitions through a series of consensus conferences (Antioxidants in ARDS Study Group, Chicago, July 1993; Acute Lung Injury Specialized Center of Research Coordination Group, Bethesda, September 1993; Sepsis Round Table Group, Brussels, March 1994). The selection criteria for MOD assessment variables were that they be (1) simply and easily measured, (2) useful in heterogeneous groups of patients, (3) reflective of specific organ function, (4) unaffected by therapeutic interventions that may appear to but do not restore organ function, (5) continuous variables, (6) abnormal in only one direction, and (7) correlated with increasing mortality. Results will be presented at the conference. MOD assessment is simple and easy to apply; it describes ICU morbidity in a clinically meaningful manner and, when combined with OFF (organ failure-free) day analysis, avoids the confounding effect of high mortality rates. Such a measure would be a useful standard measure of outcome in clinical trials in critically ill subjects. It is likely to be a more sensitive and specific outcome variable than mortality in clinical trials in the critically ill.

Prospects for Functional Immune Assessment in Severe Infections and Septic Shock
Stephen F. Lowry, M.D.

University of Medicine & Dentistry of New Jersey-Robert Wood Johnson Medical School, New Brunswick, New Jersey, USA
Critical care risk evaluation systems that utilize demographic and biochemical parameters have been proposed as a means to direct some therapeutic decisions or to evaluate responses to therapy. The refinement of additional disease or condition-specific biomarkers to enhance therapeutic decisions has great appeal but lacks prospective investigation. In theory, such a biomarker should be definitively altered by the prevailing clinical condition and responsive to intervention. Such markers should also be attainable in real time, be cost effective, and exhibit a high degree of sensitivity and specificity with respect to clinical outcome (surrogate endpoint). The function of solid organs is currently assessed by biochemical or clinical parameters that do not provide insight into the competence of the innate and acquired immune systems. Toward this end, recent efforts to quantify soluble inflammatory mediators or their respective soluble or cell-associated receptors suggest that such an approach is clinically feasible and potentially useful as a surrogate endpoint of disease severity in critically ill patients. It remains to be established that such markers are responsive to novel therapies. Results of these studies and the prospects for development of biomarkers of immune function will be discussed.

Validating Biomarkers and Ascertaining their Relationship to Clinical Endpoints
Polly E. Parsons, M.D.

Denver Health Medical Center, University of Colorado Health Sciences Center, USA
Patients with sepsis are not a homogeneous population (Parsons and Moss 1996; Abraham et al., in press). They frequently have preexisting/comorbid conditions, including alcohol abuse and diabetes, that contribute to outcome and influence biomarker measurements. In 351 patients at risk for acute respiratory distress syndrome (ARDS), we found that the incidence of ARDS was 43 percent in those with a history of alcohol abuse compared with 22 percent for those who were not alcoholic (Moss et al. 1996). The effect was most pronounced in the cohort of septic patients (n = 109), in which the incidence of ARDS was 52 percent in the alcoholics compared with 20 percent. In 113 patients with septic shock, we found that the incidence of ARDS was 25 percent for those with diabetes (n = 32) compared with 47 percent for those without (Moss et al. 1997). Alcohol abuse and diabetes also affect the measurements of biomarkers. ICAM-1 levels are decreased in the circulation of septic patients with a history of alcohol abuse compared with septic patients without alcohol abuse (Moss et al. 1998), and patients at risk for and with diabetes have increased levels of ICAM-1 in their circulation (Lampeter et al. 1992; Roep et al. 1994). Numerous other preexisting/comorbid conditions, including renal failure, liver disease, age, gender, and drug abuse, which could also influence clinical outcomes and biomarker measurements, remain to be investigated.

Severity of Infectious Challenge Alters the Effects of Anti-Inflammatory Agents in Sepsis
Peter Q. Eichacker, M.D.

Clinical Center, National Institutes of Health, USA
Although anti-inflammatory agents were shown to markedly improve survival in published animal models of sepsis, these agents have had at best only small beneficial effects in clinical sepsis trials. These differing results suggest that factors encountered in clinical trials were not controlled for in preclinical studies. One such factor may be the severity of infectious challenge. Humans were studied in sepsis trials with mortality rates of 30 to 50 percent, whereas published preclinical trials were performed in animal models with 70 to 90 percent mortality rates. We therefore investigated the relationship between control mortality and the effects of anti-inflammatory agents on survival in animal studies of sepsis we performed as well as in published animal and human trials. Both in our own animal studies and in those in the literature, there was a significant association between control mortality and the effects of anti-inflammatory agents on survival. Agents were most beneficial at high mortality rates. As mortality rates decreased, agents became less effective or even harmful. The small beneficial effects of anti-inflammatory agents in clinical trials, where control mortality was not high, were consistent with findings in animal studies. Thus, severity of infectious challenge may be an important factor altering the effects of anti-inflammatory agents in sepsis. Anti-inflammatory agents may be most beneficial in septic patients with a high likelihood of dying. However, these agents may have little effect or be harmful in septic patients with less severe infections associated with a low mortality rate. These findings suggest that effective use of anti-inflammatory agents in sepsis may not be possible unless reliable markers differentiating the severity of infection in patients can be identified.

cDNA microarray technology promises to become a pivotal tool in understanding the functional genomics of complex diseases (Heller et al. 1997). Critically ill or injured patients frequently die of incompletely understood conditions such as septic shock, acute respiratory distress syndrome, and ultimately multiple organ dysfunction syndrome. Activation of host inflammatory pathways causes tissue injury and thereby acts as a major pathogenic mechanism in these syndromes. At a basic level, the clinical and biological manifestations of host responses are determined by quantitative and qualitative changes in gene expression. Therefore, organ injury syndromes might be defined by their associated patterns of altered gene expression. From paired samples of cells or tissues, cDNA microarrays can quantitate relative changes in mRNA levels for thousands of genes simultaneously (Chen et al. 1998). Furthermore, cDNA microarrays can be used to detect genetic polymorphisms that affect outcome and to identify new gene targets for drug development (Ramsay 1998). Because cDNA microarrays generate huge data sets even from relatively simple experiments, difficulties with validating and conceptually handling this quantity of information need to be resolved. Clustering data from genes with shared characteristics, developing software to aid in the interpretation of results, and rapidly making results widely available (Iyer et al. 1999) are some potential approaches to these problems. Serial samples from patients, human or rodent models, and cultured cells could be used to create a public-domain, functional genomics database relevant to investigating septic shock and multiple organ failure.
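Clustering genes with shared expression characteristics, one of the data-handling approaches proposed above, can be sketched minimally. The synthetic expression matrix, the correlation threshold, and the `cluster_genes` helper are hypothetical illustrations, not the methods of any study cited in the abstract:

```python
import numpy as np

# Hypothetical sketch: group genes whose expression profiles across
# serial samples are highly correlated. Rows = genes, columns = samples;
# values stand in for log2 ratios from paired microarray hybridizations.
rng = np.random.default_rng(0)
base_up = np.linspace(0.5, 3.0, 8)   # expression rising over serial samples
base_down = -base_up                 # expression falling over serial samples
expression = np.vstack(
    [base_up + rng.normal(0, 0.1, 8) for _ in range(5)]
    + [base_down + rng.normal(0, 0.1, 8) for _ in range(5)]
)

def cluster_genes(profiles, r_threshold=0.8):
    """Greedy single-pass clustering: a gene joins the first cluster whose
    seed profile it correlates with above the threshold, else seeds a new
    cluster. Returns clusters as lists of row indices."""
    clusters = []
    for i, profile in enumerate(profiles):
        for members in clusters:
            r = np.corrcoef(profiles[members[0]], profile)[0, 1]
            if r >= r_threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

print(cluster_genes(expression))  # → [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
```

Real microarray analyses typically use hierarchical clustering over thousands of genes; this single-pass variant is only meant to show why grouping co-regulated genes collapses a huge data set into a few interpretable patterns.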

Biomarkers and Surrogate Endpoints for Anti-inflammatory Therapies for Sepsis
Jay P. Siegel

Center for Biologics Evaluation and Research, FDA, USA
Mortality, the endpoint of primary clinical interest in sepsis, occurs in about 30 percent of patients in most trials, usually within a few days to a few weeks. Thus, the frequency and rapidity of the endpoint do not in themselves give rise to a need for a surrogate. However, drug development in this area has been extremely difficult, in part because of the broad diversity of the target population. Sepsis patients differ with regard to many parameters that might influence response to therapy, including site of infection, type of infection, underlying disease, metabolic state, and presence of shock and organ failure. Identifying the optimal target population, dose, regimen, and concomitant therapies could be greatly facilitated by identification of a marker that effectively predicts effects on mortality and/or other important clinical outcomes. However, given the diversity of pathophysiological processes present during sepsis, one must be cautious about assuming that an agent with a favorable effect on one process will not have an unfavorable effect on others. FDA-coordinated efforts currently underway to analyze sepsis trials to date and to identify useful markers will be described briefly.

Objectives
Modification of the function, chemistry, and structure of neurons is a major determinant of inflammatory and neuropathic pain. The challenge is to:
• Identify which signals initiate plasticity and develop markers for these
• Characterize the cellular, molecular, and physiological markers that may be involved in pain responses
• Discover the participation of novel genes in plasticity that are relevant to pain mechanisms
• Use imaging techniques to identify pain-activated areas in humans that may provide opportunities to follow effectiveness of new therapeutic approaches
• Utilize this information to improve the diagnosis and initiate novel treatment strategies for pain

McGill University, Montreal, Quebec, Canada
Pain is a subjective experience that results from injury to cutaneous, musculoskeletal, visceral, or nervous tissue. Since structural and functional alterations in pain transmission and pain modulation pathways in the central nervous system can emerge from injuries that are not easily recognized or that appear to be minor, identifying the neural basis of altered pain perception is often elusive. Human functional brain imaging is an emerging technique that, with caution, may provide insights into the diagnosis and treatment of pain conditions. Functional magnetic resonance imaging and high-resolution positron emission tomography have been used to image neural activity related to peripheral and central neuropathic pain conditions. For example, some neuropathic pain states produce reduced thalamic activity, whereas allodynia in central nervous system pain patients leads to activation of pain-related cortical areas such as the anterior cingulate cortex. Functional brain imaging has begun to be used to compare neural activation patterns related to cutaneous, musculoskeletal, visceral, and neuropathic pain, but data are not yet available to predict the source of pain based on such activation patterns. Competitive radioligand binding studies using opiate ligands such as diprenorphine have identified possible forebrain sites of pain-related opiate actions. Similar radioligand binding studies may prove useful in screening novel analgesic medications. At the present time, functional brain imaging alone cannot be used to diagnose pain conditions or to determine the effectiveness of analgesic treatments. However, with appropriately designed experiments and cautious interpretation, this method may provide important insights into the neural substrate of different pain conditions and may be useful in evaluating new analgesic treatments.

Imaging Modalities and Pharmacologic Markers for Analgesia
James C. Eisenach, M.D.

Wake Forest University School of Medicine, USA
Other than reversal of opiate analgesia with naloxone and imaging of regional brain glucose metabolism or blood flow before and after opiates, there has been little investigation into imaging and markers of analgesia or analgesic mechanisms. Development of high-resolution positron emission tomography (PET) scanners and synthesis of highly selective ligands should allow examination of pharmacologic mechanisms of analgesia and sites of action in humans. For example, we have recently demonstrated spinal cholinergic activation by PET in response to opiate administration, a marker for one site of opiate analgesic activity. In the future, activity at known sites of action could predict analgesic activity in certain pain syndromes and help guide drug development.

Behavioral Markers of Pain
Francis J. Keefe, Ph.D.

Health Psychology Program, Ohio University, Athens, Ohio 45701, USA
Pain is an unpleasant sensory and emotional experience that is typically assessed through self-report measures such as numeric or adjective rating scales. Although pain is a personal experience, it does have behavioral correlates. People who have pain may talk about their pain, reduce their activity level, take medication, or exhibit pain-related body postures or facial expressions. Over the past decade, there has been a growing recognition that an assessment of such pain-related behaviors may provide a useful supplement to the more typical self-report measures of pain. The purpose of this presentation is to provide an overview of recent research on the behavioral assessment of pain. The presentation is divided into three parts. In the first part, the clinical and theoretical rationale for the measurement of pain behavior is discussed. The second part describes and evaluates behavioral observation protocols that are increasingly being used to assess pain behavior. These protocols feature the use of trained observers who systematically observe and code nonverbal pain behaviors that occur in standardized situations or the natural environment. The reliability and validity of behavioral observation protocols have been supported by numerous studies carried out in patients suffering from arthritis pain, back pain, and cancer pain. The third part of this presentation focuses on important future directions for pain behavior assessment. Innovative approaches will be highlighted, including the use of electronic event monitoring to assess pain medication intake and hand-held electronic diaries to record day-to-day variations in pain behavior. Taken together, findings from recent studies suggest that behavioral methods can play an important role in the assessment of pain. These methods provide a useful adjunct to self-reports of pain and yield information that can be quite helpful in understanding and treating pain.

Clinical Trial Design Issues for Evaluation of Pain Biomarkers
Christine N. Sang, M.D., M.P.H.

Massachusetts General Hospital, Boston, MA, USA
Neuropathic pain comprises heterogeneous mechanisms, which are usually assessed clinically only on the basis of subjective measures of overall pain intensity and unpleasantness. The use of surrogate markers to distinguish between pain mechanisms may facilitate the conduct of clinical trials and the development of new treatment strategies. Study groups or subgroups that address pain mechanisms may be distinguished ad hoc by using surrogate endpoints, and surrogate endpoints may be used to evaluate patients' responses to treatment. However, the validity of such endpoints, that is, a clear relationship between the surrogate endpoint and the true endpoint, is critical to the validity of the data from clinical trials that use these markers. We will discuss the validation of pain biomarkers using clinical trials to evaluate both the marker itself and the true endpoint, using intensity and spread of allodynia as a specific example.

University of California at San Francisco, USA
There have been many developments in the understanding of pain at the molecular and cellular levels. With these developments have come specific markers of the presence of mechanisms highly specific to pain. These objective markers provide the possibility for unique therapies as well as detection of disruption of what might be presumed to be necessary mechanisms for the experience of pain. However, our understanding of pain is far removed from placing it in the objective realm, and the traditional subjective assessments, of which the visual analog scale is the gold standard, will undoubtedly remain at the forefront of clinical pain research. Nevertheless, the existence of markers such as the tetrodotoxin-resistant sodium channel, the vanilloid receptor, and specific second messenger isoforms provides an opportunity to explore further the objectification of pain transmission and presents significant potential for pharmacological targeting and as pain transmission markers. Recent neuropathology research has identified mechanistic links between nerve injury and pain that have been useful in understanding neural biomarkers of pain. Several clinical studies of historical importance have sought to link the form of neuropathy with pain conditions, but structure-function relationships have been too general to provide much prognostic significance. This is due in part to the retrospective protocols of the studies, which have compared sural nerve biopsies with descriptions of clinical complaints. More recent prospective studies demonstrate a strong relationship between injury of small sensory nerve fibers and spontaneous neuropathic pain (Griffin et al.). The strongest correlate of spontaneous neuropathic pain in the feet was not changes in the morphology of sural nerve biopsies, although small fiber loss was evident in many cases, but small fiber loss in distal biopsies of epidermal tissue of the feet.
These studies reinforce the thought that sensory fiber loss is a common feature of neuropathies with spontaneous neuropathic pain, and that the small sensory fiber loss is greatest distally. Laboratory work has demonstrated quite convincingly that the neuropathies with rapid axonal degeneration are the most painful, and that the cytokine-driven process of Wallerian degeneration is the link that unites painful neuropathies of different causes (Myers et al. 1993, 1999). Although predominantly demyelinating neuropathies can be painful, many of these also involve axonal injury, and it is this component of the neuropathy that best relates to pain. We believe that the direct effect of tumor necrosis factor alpha, upregulated in response to endoneurial release of nociceptive neuropeptides and Schwann cell activation, is the key factor that links nerve injury, pain, and sensitization of the afferent sensory pathway. Thus, markers for cytokine upregulation and axonal degeneration could provide important diagnostic, prognostic, and treatment endpoint measures without the need for a more invasive nerve biopsy. In this regard, the recent work of Helena Brisby in Gothenburg, Sweden, is important (Brisby et al., in press). She has shown that human patients with acute, painful disk herniation in the lumbar spine have significant increases in cerebrospinal fluid (CSF) neurofilaments from injured nerve root axons and increased S-100 protein in CSF, presumably as a consequence of Schwann cell injury during Wallerian degeneration. It is proposed that continued research to identify molecular markers of axonal injury shed into the blood or CSF will be instrumental in designing a relatively noninvasive biomarker for pain.

A Rat Model of Painful Peripheral Neuropathy as a Biomarker for the Clinical Efficacy of New Analgesics
Gary J. Bennett, Ph.D.

MCP Hahnemann University, USA
Damage to peripheral somatosensory nerves by disease or trauma sometimes results in a chronic syndrome of abnormal pain sensation. Examples of such painful peripheral neuropathies include diabetic neuropathy, postherpetic neuralgia, causalgia, and the toxic neuropathy caused by chemotherapy. These abnormal pain conditions respond poorly or not at all to standard nonsteroidal anti-inflammatory drugs and opiate analgesics. Testing new analgesics in these patients is a difficult and expensive undertaking, and testing in normal animals seems pointless because neuropathic pain has no obvious counterpart in normal physiology. Recently developed rat models of painful peripheral neuropathies can serve as biomarkers of clinical conditions, which has led to a rapid increase in research for new drug therapies. Several models are available, but only two have been used extensively in pharmacological research. The chronic constriction injury (CCI) model of Bennett and Xie (1988) is produced by tying loosely constrictive ligatures around the rat's sciatic nerve at midthigh level. The ligatures evoke intraneural edema, the swelling is opposed by the ligatures, and the nerve self-strangulates. The spinal nerve transection (SNT) model developed by Kim and Chung (1992) involves tight ligation (and hence transection) of the L5 and L6 spinal nerves close to their respective ganglia. Both models produce signs of abnormal pain that resemble those found in patients (heat-hyperalgesia, mechanohyperalgesia, mechanoallodynia, and cold-allodynia), plus signs of spontaneous pain. The animal models have been accurate in predicting clinical efficacy. Drugs with efficacy in the rats have efficacy in human patients and vice versa. Moreover, drugs that are weakly effective in rats (e.g., carbamazepine and nonspecific Ca2+ channel blockers) are also weakly effective in patients, and drugs that do not work in rats (e.g., benzodiazepines) are also clinically ineffective.
Research with these biomarkers has already led to the discovery of at least three entirely new classes of drugs that have clinical efficacy: N-methyl-D-aspartate receptor blockers, N-type calcium channel blockers, and gabapentin-like drugs. These drugs have little or no effect on normal acute pain sensation; thus, they would have been impossible to discover using the animal models traditionally used to discover new analgesics.

Key References:
Bennett GJ, Xie Y-K. A peripheral mononeuropathy in rat that produces disorders of pain sensation like those seen in man.

Biomarkers of Toxicity and Surrogate Endpoints for Safety
Objectives
• Review the current status of toxicology biomarkers in drug development and clinical safety assessments
  - Do we need more information on biomarkers, and are we using our current information wisely?
• Demonstrate that biomarkers can serve as early predictors of insidious adverse effects related to chronic drug exposures
  - What is needed to establish the utility of a biomarker as an indicator of clinically significant toxicology?
• Discuss the development of genetically modified rodent models as better predictors of drug effects in humans
  - Can or should transgenic humanized models be developed in other species besides the mouse?
• Discuss the clinical application of pharmacogenetics to predict and prevent adverse drug effects
  - Do the benefits outweigh the costs/risks associated with clinical pharmacogenetic profiling?
• Identify technological approaches that will provide opportunities for developing more and better toxicology biomarkers

The Status of Toxicity Biomarkers and Safety Evaluation Approaches
The present study sought to determine whether cardiac troponin T (cTnT), a cardiac protein used to diagnose ischemic myocardial injury, also may be useful in detecting doxorubicin (DXR) cardiotoxicity. Spontaneously hypertensive rats (SHRs) were given 1 mg/kg DXR weekly. Myocardial tissues and serum samples were collected and analyzed after 2, 4, 6, 8, 10, or 12 weeks of treatment. DXR lesion scores were assessed semiquantitatively by light microscopy, and serum levels of cTnT were quantified by an ELISA method (Enzyum). Increases of cTnT (0.03-0.05 ng/mL) in serum and lesion scores of 1 or 1.5 were noted in 1/5 and 2/5 SHRs given 2 or 4 mg/kg DXR, respectively. All SHRs given 6 mg/kg or more DXR had elevations of cTnT in serum and myocardial lesions. The average cTnT concentrations and the average lesion scores increased with the cumulative DXR dose (0.13 versus 0.40 ng/mL and 1.4 versus 3.0) in SHRs given 6 and 12 mg/kg DXR, respectively. It is concluded that cTnT is released from DXR-damaged myocytes, and measurements of serum levels of this protein can provide a sensitive means for assessing the early and continuing cardiotoxic effects of DXR.
Striking interindividual differences in environmental toxicity and cancer susceptibility among human populations often reflect polymorphisms in drug-metabolizing enzymes (DMEs) and receptors that control DME levels. Compared with interspecies (human-mouse) differences of twofold to perhaps twentyfold, human interindividual variability in DME activities or DME receptor affinity has often been shown to vary fourfold to more than 10,000-fold. How can these different human alleles be studied most effectively and efficiently in an experimental animal model? Transgenic knock-in technology can be used to insert unique human alleles in place of the orthologous mouse genes. The knock-in of each gene is a separate targeting event, however, requiring (1) construction of the targeting vector and transfection into embryonic stem cells, (2) generation of a targeted mouse, and (3) backcross breeding of the knock-in mouse (at least six times) to produce a suitably genetically homogeneous (99 percent) background (to decrease interindividual variability). These experiments require years to complete, making this very powerful technology inefficient for routine applications. If, on the other hand, the initial knock-in targeting vector includes sequences that would allow the knocked-in gene to be exchanged (possibly even repeatedly) for yet another new allele, then testing a battery of human polymorphic alleles in transgenic mice could be accomplished in several months instead of several years. This gene swapping can be done by zygotic injection of the human allele cassette into the fertilized ovum of the parental knock-in mouse strain or by cloning mice from fibroblasts containing the nucleus wherein each human allele has already been swapped. In mouse cells, we have succeeded in gene swapping by exchanging one gene (including its regulatory regions) flanked by heterotypic lox sites with a second gene (including its regulatory regions) flanked by heterotypic lox sites.
This research is supported in part by National Institutes of Health grants R01 ES07058, R01 ES08147, R01 AG09235, R01 ES06321, and P30 ES06096.

Pharmacogenetics as Applied to Human Drug Safety Testing
Richard M. Weinshilboum, M.D.

Mayo Medical School/Mayo Clinic/Mayo Foundation, USA
A large number of functionally significant, common, genetic polymorphisms for enzymes that participate in drug and xenobiotic biotransformation have been described over the past three decades. We now appreciate that genetic polymorphisms for both phase I and phase II drug- and xenobiotic-metabolizing enzymes can be important factors for variation in toxicity after exposure to these agents. Prototypic examples of genetically polymorphic drug-metabolizing enzymes will be described in the context of the rapid advances that are occurring in human genomics as well as molecular pharmacology and toxicology.

The Role of Mass Spectrometry in the Development of Biomarkers
Ian A. Blair, Ph.D.

University of Pennsylvania, USA
The quantitation and identification of DNA adducts in biological samples as molecular dosimeters of both exogenous and endogenous carcinogens require sophisticated analytical methodology because adducts are normally present in such low concentrations. Mass spectrometric techniques have high sensitivity and are highly specific because only ions derived from the analyte of interest are monitored. The development of liquid chromatography/mass spectrometry methodology based on the atmospheric pressure ionization techniques of electrospray ionization, ionspray, and atmospheric pressure chemical ionization has had a profound effect on our ability to identify analytes of interest when only trace amounts of material are available. These techniques are extremely robust and are amenable to the quantitation of DNA adducts in biological samples. High sensitivity can be attained by the use of selected reaction monitoring of a specific product ion after collision-induced dissociation has been performed on the protonated molecular ion derived from the analyte of interest. The ability to use this methodology as a dosimeter for environmental chemicals was exemplified in recent studies we have conducted on butadiene exposure. Liquid chromatography/mass spectrometry, in combination with stable isotope methodology, provided a robust, sensitive, and accurate means to quantify the butadiene-derived N7-guanine adducts. Using this methodology, it was possible to demonstrate the presence of two diastereomeric forms of trihydroxybutylguanine in liver DNA and to assess the excretion of these DNA adducts in the urine. Studies on endogenous carcinogens have relied more heavily on the use of gas chromatography/electron capture negative ion chemical ionization mass spectrometry because of the high sensitivity of this technique. However, liquid chromatography/mass spectrometry has been very useful for identifying novel targets for molecular dosimeters.
Using this methodology, we have recently identified a new reactive electrophilic aldehyde from the breakdown of lipid peroxides that forms covalent adducts with DNA bases. The characterization of these DNA adducts has provided two novel biomarkers that can be employed to assess exposure to oxidative stress. This research is supported by National Institutes of Health grant CA63878.
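The stable-isotope-dilution quantitation mentioned above rests on a simple ratio calculation, sketched below. The peak areas, the spiked amount, and the `quantify` helper are hypothetical values for illustration, not data from the butadiene studies:

```python
# Hypothetical sketch of stable-isotope-dilution quantitation: a known
# amount of an isotopically labeled internal standard is spiked into the
# sample, and the analyte amount is inferred from the ratio of the
# analyte's chromatographic peak area to the standard's.
def quantify(analyte_area, standard_area, standard_amount_fmol):
    """Assumes the labeled standard and analyte ionize with equal
    efficiency, so peak areas are directly proportional to amounts."""
    return standard_amount_fmol * (analyte_area / standard_area)

# 50 fmol of labeled adduct standard spiked in; observed peak areas:
print(quantify(analyte_area=12000.0, standard_area=30000.0,
               standard_amount_fmol=50.0))  # 20.0 fmol
```

Because the labeled standard co-elutes with the analyte and experiences the same losses during sample preparation and ionization, the ratio cancels most sources of systematic error, which is why the abstract describes the approach as robust and accurate.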

Director International Strategic Toxicological Sciences, Medicines Safety Evaluation, Glaxo Wellcome Research and Development, Ware, England, UK
Most toxicities that are not lethal within minutes are likely to have some impact on gene expression, be a result of gene expression, or both. Some of the gene expression changes (GECs) that occur as toxicity develops are expected to be unique to the mechanism of toxicity (e.g., free radical production, inhibition of cellular respiration). Some GECs are expected to be unique to the type of toxicity (e.g., apoptosis, nongenotoxic carcinogenicity) but common among mechanisms that cause the same type of toxicity. Other GECs are expected to be adaptive responses to changes in such things as blood pressure and nutrition. Determination of GECs that reliably indicate the above would allow development of gene expression-based biomarkers of toxicity. The most valuable GECs will probably be critical core genes that have been conserved during evolution. These GECs will be shared among species, greatly improving interspecies comparisons and extrapolations. If gene expression analysis in discovery and toxicology studies identifies GECs indicative of toxicity, and in vitro studies with human and animal cell lines establish that the human gene homologs behave as the animal genes do, GECs can be used as biomarkers of toxicity in clinical trials and patients. These biomarkers will be especially valuable when they occur in easily assessable tissues (e.g., lymphocytes) and for early detection of chronic toxicity. Although functional or morphologic evidence of chronic toxicities may take weeks to years to develop, characteristic GECs likely occur within hours or days. Identification of these GECs would enable early detection of the chronic toxicity, markedly shortening necessary animal studies and protecting clinical trial subjects and patients.
To fully harvest the benefit of gene expression analysis, reliable broad-coverage gene expression technologies, computer systems capable of manipulating large data sets, and scientists who understand pathophysiology at the molecular and transcriptional level are necessary. Development of these technologies, systems, and scientists should be a major focus of future resource commitment.

Gene Expression Analysis for Toxicology: Moving Beyond Phenomenology
Spencer B. Farr, Ph.D.

Phase-1 Molecular Toxicology, Santa Fe, New Mexico, USA
The science of toxicology is changing radically. There are two fundamental forces compelling this change: (1) high-throughput chemical library synthesis and screening and (2) the human genome project and attendant technology for rapid polymorphism and expression analysis. Toxicologists are faced with the challenge of developing high-throughput toxicity screens to keep pace with the increased number of "hits" derived from upstream screening activities. In addition, toxicologists will need to better understand the interaction between a potential drug and the individual genetic makeup of the patient exposed to that compound. To make meaningful progress in either area, toxicologists will have to effectively assimilate and wisely use an avalanche of information.
Without appropriate bioinformatics developed specifically for these challenges, the effort will be stunted. The purpose of this presentation is to present an overview of the interplay among screening, molecular toxicologic analysis, and associated informatics.

Additional Thoughts on Toxicity Biomarkers: The Isoprostanes, Indices of Oxidative Stress In Vivo
Jason D. Morrow, M.D.

Vanderbilt University School of Medicine, USA
Over the past decade, the role that oxidant stress plays in human pathophysiology has received considerable attention. A number of biomarkers have been developed to measure oxidative injury although most have serious shortcomings when applied to the assessment of oxidant stress in vivo. We have previously described a series of prostaglandin F2-like compounds, termed F2-isoprostanes (F2-IsoPs), produced independently of the cyclooxygenase enzyme by the free-radical-catalyzed peroxidation of arachidonic acid in vivo in humans. A large body of evidence suggests that quantification of these compounds is an accurate index of lipid peroxidation in vitro and in vivo. F2-IsoPs can be quantified by mass spectrometry or immunologically. A particular interest of mine is the role of oxidative injury in atherosclerosis and neurodegenerative disorders such as Alzheimer's disease. We have found that risk factors for atherosclerosis are associated with markedly increased circulating levels of F2-IsoPs and that antioxidants such as vitamin E, which decrease the incidence of atherosclerosis, suppress plasma F2-IsoP levels. Furthermore, F2-IsoPs are selectively increased in cerebrospinal fluid from patients with Alzheimer's disease, and levels correlate with disease severity. Therefore, these studies suggest that quantification of F2-IsoPs may provide a useful biomarker of oxidant stress in humans.

Surrogate Endpoints for Treatment-Induced Change in Risk of Osteoporotic Fractures: Introduction
Steven R. Cummings, M.D.

University of California, San Francisco Coordinating Center, USA
Bone mass, measured in a number of ways, is strongly associated with the risk of fracture. In addition, biochemical markers of bone resorption and formation can be measured in urine or serum. Single measurements of these markers are related to the subsequent risk of fracture. To be useful surrogates for the effect of treatment on the risk of fracture, changes in these markers during treatment should predict changes in the risk of fractures. This session will review the evidence that changes in these markers predict changes in risk of fracture. In addition, we will review how well two common outcomes, changes in vertebral dimensions on x-ray and changes in stature during treatment, predict change in risk of nonspinal fractures and morbidity. The goals of the session are to review the state of the art in biomarkers and surrogate endpoints for change in fracture risk and identify key questions that still need to be addressed.

Division of Metabolism and Endocrine Drug Products, Food and Drug Administration (Retired), USA
Biochemical markers of bone turnover reflect bone remodeling at the skeletal level and provide information about bone formation (by the osteoblast) and resorption (by the osteoclast). Important formation markers include serum bone alkaline phosphatase (BAP) and serum osteocalcin, or bone Gla protein (OC). Specific markers of bone resorption include urinary pyridinoline crosslinks (Pyr), urinary deoxypyridinoline crosslinks (dPyr), and their amino- and carboxy-terminal telopeptides, NTx and CTx. Serum formation markers are secreted by the osteoblast and are subject to less biologic variation; diurnal variation of serum BAP is also less evident. Urine assays of resorption markers usually show high biologic variability, and their clinical usefulness is further limited by diurnal variation. Potential uses of these markers are to (1) predict bone loss at menopause to identify "fast" versus "slow" bone losers, (2) monitor response to antiresorptive therapy in postmenopausal osteoporosis, (3) estimate fracture risk, (4) predict changes in bone mineral density in response to antiresorptive therapy, and (5) monitor compliance with treatment. However, biochemical markers provide little or no help in the diagnosis of osteoporosis. Significant changes in biochemical markers are generally observed within 3 to 6 months of treatment with antiresorptive drugs in postmenopausal osteoporosis. In response to treatment, changes in formation markers lag behind changes in resorption markers. Conflicting data in the literature preclude the use of these markers as surrogates for efficacy endpoints in clinical practice. The clinical utility of these markers other than in postmenopausal osteoporosis has not been clearly demonstrated. Future long-term longitudinal studies with measurements of specific markers of bone formation and resorption will lead to a better understanding of their clinical usefulness.

Is Change in Bone Mineral Density an Adequate Surrogate for Assessing the Effect of Antiresorptive Medications on Fracture Risk?
Dennis M. Black, Ph.D., Jim Pearson and Steven R. Cummings, M.D.

University of California, San Francisco, USA
Bone mineral density (BMD) has been shown to be a strong predictor of future fracture risk, and it has been thought that antiresorptive drugs that reduce fracture risk do so by improving bone mineral density. In fact, the U.S. Food and Drug Administration and other regulatory agencies base their approval of some medications solely on demonstrated effectiveness for BMD. If the pathway of action of a drug in reducing risk is solely through its effect on BMD, then a drug's effect on BMD should accurately predict its effect on fracture risk. To test this hypothesis, we compared the BMD effects with the vertebral fracture effects from a large number of recent clinical trials of osteoporosis medications. To compile a comprehensive list of trials, we performed MEDLINE searches of the published literature, examined abstracts from major meetings over the past 2 years, and queried experts in the field. We included all trials of antiresorptive medications that were large enough to have at least five fractures per treatment arm. A total of 11 trials of 6 medications, which randomized a total of about 19,000 women, were included. We correlated the active/placebo BMD difference with the log of the relative risk for the observed fracture effect and performed a weighted linear regression analysis. There were two major findings from this analysis. First, we found that, in general, the studies that showed the largest BMD effects tended to show the largest fracture reductions. However, despite the large number of studies, the correlation was only moderate and only marginally statistically significant (p = .09). Second, we found that the changes in BMD were too small to account for the large decreases in fracture risk found in these trials. This suggests that factors other than BMD must account for at least some of the action of these drugs. 
To confirm these results, we performed an analysis using the method of Freedman and colleagues to estimate the proportion of the fracture effect that could be explained by the change in BMD in a large trial of alendronate. The results showed that change in BMD could explain only 15 to 17 percent of the effect of alendronate. We conclude that BMD alone is not an adequate surrogate for assessing the potential effects of a drug on fracture risk. Since no other surrogates are presently available, trials with fracture endpoints are necessary to ascertain how effective a drug will be in reducing fracture risk.
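The weighted regression described in this abstract can be sketched in a few lines. The trial summaries below are invented for illustration only; they are not the data analyzed in the abstract:

```python
import math

# Hypothetical trial summaries: active-placebo BMD difference (%),
# observed vertebral-fracture relative risk, and fracture count
# (used here as a rough precision weight).
trials = [
    (1.0, 0.90, 40),
    (3.0, 0.70, 120),
    (5.0, 0.55, 200),
    (6.5, 0.50, 150),
]

# Weighted least-squares regression of log(RR) on BMD difference.
sw = sum(w for _, _, w in trials)
mx = sum(w * x for x, _, w in trials) / sw
my = sum(w * math.log(rr) for _, rr, w in trials) / sw
sxy = sum(w * (x - mx) * (math.log(rr) - my) for x, rr, w in trials)
sxx = sum(w * (x - mx) ** 2 for x, _, w in trials)
slope = sxy / sxx
intercept = my - slope * mx

# A negative slope means larger BMD gains go with lower fracture risk.
print(f"log(RR) = {intercept:.3f} + {slope:.3f} * BMD_diff")
```

The Freedman-style "proportion of treatment effect explained" cited in the abstract is a separate calculation: one compares the treatment coefficient before and after adjusting for the candidate surrogate in the same model.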

Quantitative Ultrasound as a Surrogate Endpoint for Fracture Risk in Osteoporosis Studies
Mary L. Bouxsein, Ph.D.

Orthopedic Biomechanics Laboratory, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, MA, USA
Osteoporosis is an enormous and growing public health concern. In the United States alone, it is estimated that 25 million postmenopausal women are at risk for fracture due to low bone mass. Despite the increased awareness of osteoporosis and the expansion of clinical bone densitometry, only a minority of individuals at risk are diagnosed, because traditional bone densitometry scans are not yet available to all patients who might benefit from knowledge of their skeletal status. Thus, it is clear that there is a growing need for bone densitometry techniques that can increase access to bone density testing. Quantitative ultrasound (QUS) has recently been introduced and shows promise as a tool for widespread screening and evaluation of skeletal status, as it is quick, easy to use, portable, and free from ionizing radiation and requires a relatively low capital investment. Moreover, due to the nature of the interaction between ultrasound and bone, QUS may reflect skeletal properties that are distinct from bone mineral density. Devices currently approved by the U.S. Food and Drug Administration include those designed to assess bone at the heel and tibia. Prospective studies indicate that heel QUS is a strong predictor of hip fracture risk, with risk ratios (RR ≈ 2.0) similar to those obtained from femoral BMD. However, there are few data available to evaluate whether QUS can be used to monitor therapeutic response and thus whether changes in QUS are predictive of changes in fracture risk. Randomized clinical trials incorporating QUS, BMD, and fracture as outcome measures are required to fully evaluate the potential clinical utility of QUS.

Assessment of Trabecular Bone Using Magnetic Resonance Imaging: Role in the Study of Osteoporosis
Sharmila Majumdar, Ph.D.

University of California, San Francisco, USA
Currently, bone mineral density (BMD) testing is the most common clinical method for assessing skeletal status and fracture risk; however, there is strong evidence that trabecular bone structure may be of importance in osteoporosis (Kleerekoper et al. 1985; Parfitt 1982, 1987; Hodgskinson and Currey 1990), and considerable effort is being expended in developing techniques to assess trabecular bone microarchitecture noninvasively. In addition, bone macroarchitecture and changes in trabecular bone microarchitecture may affect the biomechanical competence of trabecular bone. The heterogeneity in the microarchitecture of trabecular bone is primarily governed by physiological function and mechanical loading on the skeleton. As a result, trabecular bone microarchitecture depends on the anatomic site and shows directional anisotropy in both its mechanical properties and its architecture (Zhu et al. 1994; Ciarelli et al. 1991; Martens et al. 1983; Schoenfeld et al. 1974; Goldstein 1987). Traditionally, trabecular structure has been assessed using two-dimensional analysis of histomorphometry sections obtained from iliac crest biopsies. However, the anisotropy of trabecular orientation and the connectivity of trabecular bone, which is an inherently three-dimensional (3-D) quantity, are likely to play an important role in determining bone strength. With recent hardware and software advances, magnetic resonance (MR) images with spatial resolutions of 80 to 200 µm and slice thicknesses of 300 to 700 µm, which resolve the trabecular structure, have been obtained both in vitro and in vivo. In conjunction with 3-D image processing and an understanding of the mechanisms of image formation, these high-resolution images may be used to quantify trabecular bone architecture. 
In addition to standard stereological measures such as trabecular bone volume, mean trabecular width, mean trabecular spacing, and mean intercept length as a function of angle, parameters such as 3-D connectivity (as measured by the Euler number), the three-dimensional fabric tensor, and texture-related parameters such as fractal dimension may be derived from such images. In vitro, quantitative measures of trabecular architecture derived from such images have been compared with those obtained from higher resolution (18 µm) images and with biomechanical properties. In vivo, high-resolution MR techniques combined with standard techniques of stereology and texture analysis have been used to determine the relationship between trabecular bone structure parameters, age, and measures of BMD and osteoporotic status. Furthermore, these structure measures may be combined with density measures to assess the added role of trabecular microarchitecture, as well as with finite element modeling to predict the mechanical properties of bone. The issues associated with longitudinal assessment of trabecular bone structure in vivo are complex and will be discussed. Recent results emphasize the need for studies to establish the role of bone structure in the pathophysiology of osteoporosis and in the mechanism of therapeutic action. With recent advances in technology and research, MR imaging combined with 3-D image analysis provides a potentially unparalleled tool for this purpose.

Radiographically Detected Vertebral Deformities and Loss of Stature as Surrogate Endpoints in Osteoporosis
Michael C. Nevitt, Ph.D.

University of California, San Francisco, USA
The primary objective of osteoporosis treatment and prevention is to reduce the risk of clinical fractures and their associated morbidity, including pain, limitation of activity, and disability. Although hip fractures cause the greatest morbidity, a wide variety of other fractures are common in elderly persons and also result in substantial morbidity. Radiographically detected vertebral deformities are the most common fracture in elderly patients, a hallmark of osteoporosis, and the primary fracture endpoint in most clinical trials of osteoporosis treatments. However, a large proportion of these radiographically detected deformities are asymptomatic; an even larger proportion escape clinical detection, and their clinical and functional impact remains uncertain. We will examine evidence for the validity of radiographically detected vertebral deformities as surrogate endpoints for clinical fractures and the morbidity caused by osteoporosis. What is the morbidity associated with vertebral deformities? What is the relationship between deformities and clinical spine fractures and between deformities and other clinical fractures? How well do the treatment effects observed for vertebral deformities agree with the effects of the same treatment on clinical spine fractures and on other clinical fractures? Do we see effects of treatment on morbidity related to spine fractures? Does a reduction in deformities or spine fractures explain the effect observed on morbidity? Similarly, we will examine the validity of loss of stature as a valid surrogate endpoint for radiographically detected vertebral deformities or spine fractures.

Biological Markers of Bone Turnover: Clinical Value of Biochemical Markers
Richard Eastell, M.D.

University of Sheffield, UK
Bone turnover markers could have clinical utility in selecting treatment for osteoporosis if they predicted response to treatment or allowed monitoring of response. So far, these markers have been evaluated in clinical trials against another surrogate marker, bone mineral density; they need to be evaluated against the probability of sustaining a fracture. The markers could be used to evaluate efficacy of treatment and to enhance compliance. We need to know the relationship between response and its variability in order to characterize an individual's response. We need to know whether excessive dosage is harmful, such that titration of treatment to response is a valid exercise. We need to know whether compliance is enhanced by treatment monitoring. We need to conduct studies that monitor bone turnover after cessation of therapy to understand how long treatment effects persist. Finally, we could use markers to evaluate anabolic treatments for osteoporosis, not just to identify such compounds in the short term, but also to devise the optimal dosing regimen.

Biological Markers of Bone Turnover: Current Status as Surrogate Endpoints in Clinical Trials
David B. Karpf, M.D.

Director of Clinical Research, Osteoporosis and Metabolic Bone Disease, Roche Pharmaceuticals, Clinical Associate Professor, Department of Medicine, Stanford University School of Medicine, USA
In the setting of clinical trials, biochemical markers of bone turnover should be considered validated surrogate efficacy parameters for compounds whose primary mechanism of action is antiresorptive (or where at least a significant proportion of the antifracture efficacy is presumed to be mediated by this mechanism). Substantial evidence supports a key role of enhanced bone turnover in the pathogenesis of osteoporosis and osteoporotic fractures:
• Estrogen deficiency is associated with an increased rate of bone turnover, whether measured histomorphometrically or by use of biochemical markers (Eriksen et al. 1990; Eastell et al. 1988; Riis 1996; Riis et al. 1996). With any imbalance between resorption depth and wall thickness, any increase in activation frequency will be associated with an increased rate of decline in bone mass. In addition, estrogen deficiency is associated with an increase in the degree of negative bone balance at each basic multicellular unit (BMU), which, together with the increased activation frequency, further exacerbates the rate of decline in bone mass.

• Increased bone turnover is predicted to decrease bone strength (and increase fracture risk) independently of its effect on bone mass (e.g., via an increased number of resorption spaces, decreased ability of trabeculae to withstand buckling loads due to the presence of resorption lacunae, and decreased trabecular connectivity due to increased perforation) (Einhorn 1992; Parfitt et al. 1983). This biomechanical theory has been supported by several prospective studies in which patients in the highest quartile for bone turnover demonstrated a significantly increased relative risk for either hip or vertebral fractures. Increased bone turnover was as predictive of fracture risk as decreased bone mineral density (BMD) and was independent of BMD; hence, the risk of high turnover was additive to that imposed by low bone mass (Garnero et al. 1995; Ross et al. 1997). Data from prospective fracture trials have shown that the fracture efficacy demonstrated is substantially greater than that predicted by the increase in BMD alone (Cummings et al. 1996). Both the magnitude and the time course of the fracture efficacy observed are inconsistent with the entire effect being mediated solely or substantially by increased BMD and support the concept that the efficacy derives primarily from the substantial and rapid decrease in bone turnover (Black et al. 1996). For antiresorptive agents, there appears to be a relationship between the magnitude of the turnover suppression and the magnitude of the fracture efficacy. Thus, in the clinical trial setting, markers of bone turnover constitute validated surrogates for the clinical endpoint of fractures. Recommendations for future study and analysis include quantification of the relationship between the decrease in bone turnover and the reduction in fracture risk, as well as studies to evaluate the validity of turnover markers as efficacy surrogates for anabolic agents.

Objectives
• Assess the use of imaging modalities as correlates to physiologic markers
• Evaluate various approaches used to identify cellular markers of disease state, progression of disease, and therapeutic competence (efficacy)
• Determine research needs for the advancement of marker identification useful in early cancer detection and in identifying cancer therapeutic drug targets
• Identify other technological approaches that will provide opportunities for developing more and better biomarkers to measure therapeutic efficacy
• Examine the roles of industry, academia, and government in the development of toxicity biomarkers

The molecular mechanisms that regulate the metastasis of tumor cells to specific organs are diverse and both tumor and organ specific. Data will be presented that demonstrate that the organ microenvironment can influence the pattern of gene expression and the biologic phenotype of metastatic tumor cells, including regulation of cellular survival, angiogenesis, and growth at the organ-specific metastatic site. Insight into the molecular mechanisms regulating this process, as well as a better understanding of the interaction between the metastatic cell and the organ-specific microenvironment, provides a foundation for the design of new therapeutic approaches. The purpose of this study was to determine whether epidermal growth factor receptor (EGF-R) signaling regulates, in part, human bladder and pancreatic carcinoma cell proliferation, invasion, or angiogenesis. EGF-R overexpression correlates with bladder (TCC) and pancreatic carcinoma (PC) progression. We evaluated whether EGF-R blockade by use of (1) dominant negative mutant EGF-Rs, (2) EGF-R-specific tyrosine kinase inhibitors, or (3) neutralizing anti-EGF-R antibody (C225) has therapeutic benefit against high-grade TCC and PC growing orthotopically in nude mice. 
In vitro treatment of each cell type with EGF-R blockade therapies resulted in inhibition of EGF-R-specific phosphorylation, as measured by Western blotting, and maximal 50 percent cytostasis. A dose-dependent decrease was observed in the treated cells' expression of the angiogenic factors vascular endothelial growth factor, IL-8, and bFGF and of the matrix metalloproteinase 9 protease, at both the mRNA and protein levels (p < 0.005). In contrast, cell cycle-related proteins such as p16, p21, and CDK2 were not downregulated, although the cyclin-dependent kinase inhibitor p27 was elevated after treatment. Systemic therapy of established TCC and PC tumors with EGF-R blockade therapies alone or in combination with Taxol or gemcitabine, respectively, resulted in growth inhibition, tumor regression, and abrogation of metastasis (p < 0.0005). Therapy conferred a significant survival advantage to the treated mice (p = 0.0001). The expression of proliferating cell nuclear antigen, Rb, MMP9, VEGF, IL-8, and bFGF (protein and mRNA levels) was significantly reduced in treated versus control tumors (p = 0.001), whereas p27 was induced to high levels. The inhibition of the angiogenic peptides resulted in subsequent involution of the tumor neovascularization, as determined by microvessel density (p < 0.005), contributing in part to an increased tumor cell apoptotic index. These experiments indicate that therapeutic strategies targeting EGF-R signaling have a significant antitumor effect, mediated in part by inhibition of cellular proliferation, invasion, and angiogenesis that leads to apoptosis and tumor regression. Ongoing studies are analyzing the identical biomarkers in patient specimens after treatment with EGF-R blockade therapies in combination with cytotoxic drugs. 
Collectively, these data support the hypothesis that the microenvironment of different organs can influence the biological behavior of tumor cells during the metastatic process and provide a therapeutic basis for interfering with metastasis by downregulation of receptor number or function.

Biochemical Endpoints in Mechanism-Based Clinical Cancer Trials
James K.V. Willson, M.D.

Case Western Reserve University and University Hospitals of Cleveland, USA
Our group has conducted a series of mechanism-based phase I clinical cancer trials in which one endpoint of the trial was the modulation of a molecular target. A procedure for sequential computerized tomography-guided biopsies of solid tumors was developed to investigate the modulation of a biochemical target in tumor tissues. In one phase I trial, we used this approach to determine the dose and schedule of a novel modulator of DNA alkyltransferase activity in tumor tissues. In this trial, 28 patients completed sequential tumor biopsies, which proved informative for biochemical studies of alkyltransferase. The drug exposure (area under the curve) required to deplete the enzyme in tumors was tenfold higher than that required to deplete alkyltransferase activity in peripheral blood mononuclear cells. This experience demonstrates the feasibility of using molecular endpoints in the development of cancer therapeutics. It also demonstrates the limitation of studying a surrogate tissue and the importance of investigating biochemical endpoints in the relevant target tissue.

Practical Issues in Current Drug Development: The Gross Philadelphia Chromosome as a Chronic Myelogenous Leukemia Surrogate Endpoint; Progression-Free Survival as an Endpoint for Tumoristatic Therapies
Robert J. Spiegel, M.D.

Schering-Plough Research Institute, USA
The question to be addressed is, Are there now instances where outcomes short of survival should be accepted as appropriate surrogates for new therapies in the treatment of cancer? In particular, does the new genetic understanding of the basis of certain malignant transformations allow demonstrations of effect on tumor genetics to serve as adequate surrogates of a new therapeutic's efficacy? In chronic myelogenous leukemia, the gross Philadelphia chromosome (Ph1) aberration has been recognized for decades as a pathognomonic marker of this disease. With the introduction of interferon and bone marrow transplantation, for the first time some patients were demonstrated to lose detectable Ph1 following treatment. A number of studies have now confirmed that patients who lose Ph1 can expect durable long-term responses. The question arises, Can the demonstration of loss of Ph1 and normalization of hematologic values for some X period of time be considered a suitable surrogate of treatment efficacy and preclude the necessity of following cohorts of hundreds of patients over 3 to 5 years to demonstrate survival benefit? This model raises interesting questions for other therapies in the future that may be able to eliminate specific genetic defects that are easily tracked. A larger and related question involves the recent introduction into the clinic of new therapies that are expected to produce tumoristatic rather than tumoricidal effects. Some current approaches to gene therapy, such as replacement of tumor suppressor genes (p53) or reversal of oncogenic gene stimuli (farnesyl protein transferase inhibitors in RAS-positive tumors), may produce successful control of tumor growth but without a bystander effect and without dramatic conventional tumor responses. These approaches to cancer therapy may well produce control of existent tumor but not elimination. 
Demonstration of clinical benefit in these circumstances may take hundreds of patients followed for many years, and this produces serious challenges to the normal development process where early readout is necessary to optimize drug formulations or determine a positive decision to invest further resources. A discussion of specific therapeutic approaches and specific disease settings will be made.

Genentech, Inc., USA
The use of biomarkers to select therapy has been a frequent topic for theoretical discussion. Herceptin, a recently approved humanized monoclonal antibody, is the first drug whose use is specifically based on the presence of a biomarker, the HER2/neu oncogene. The challenges associated with the practical implications of using a biomarker to select therapy will be discussed.

Biomarkers and Angiogenesis -A Tabula Rasa
Thomas Boehm, Ph.D.

Children's Hospital/Harvard Medical School, Boston, MA, USA
The field of angiogenesis research, and particularly the application of the antiangiogenesis concept as a potential way to treat cancer, was pioneered by Judah Folkman during the past 30 years. During the past 5 years, several molecules of varying potency have been found to be capable of controlling certain aspects of angiogenesis and thereby of blocking tumor growth. Nevertheless, only a handful of publications can be found on biomarkers and angiogenesis. Moreover, these publications describe molecules secreted from tumor cells to stimulate endothelial cells (bFGF and VEGF). In other words, these proteins are not true markers of angiogenesis but of the presence of tumor cells. It is now generally accepted that tumor cells need to establish an adequate capillary network in order to expand. This means that endothelial cells will undergo phenotypic changes, because they have to migrate, proliferate, and degrade and remodel the matrix to accomplish the task of forming a functional unit supplying the tumor cells with oxygen and nutrients. If these phenotypic changes can be measured and quantified, we could detect cancer earlier or monitor antiangiogenic therapy. Before this can be accomplished, the endothelial cell-specific molecules lost or overexpressed during tumor growth must be identified. A possible approach to isolating true endothelial cell markers will be presented.

Prognostic Markers
Donald Berry, Ph.D.

Prognostic markers are of little clinical interest unless they have therapeutic implications. Despite some attitudes to the contrary, it may not be appropriate to use the most intensive and most toxic therapies in patients with the poorest prognosis. But assessing interactions between therapy and biomarkers is difficult and requires very large numbers of patients. Such assessment is made even more difficult because the typical experiment involves assessing numerous markers. One result is that the false-positive rate increases, giving rise to biomarkers that are flashes in the pan. These and related problems in the case of HER-2/neu and p53 and their possible interaction with doxorubicin in node-positive breast cancer will be discussed, as will the general notion of using surrogate markers in clinical research from the Bayesian perspective.
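The multiplicity problem raised here can be made concrete with a small calculation; the marker counts below are illustrative, not taken from any cited trial. If each of k truly uninformative markers is tested independently at a nominal significance level alpha, the chance of at least one spurious "prognostic marker" grows quickly:

```python
# Probability of at least one false-positive finding when k independent
# null markers are each tested at significance level alpha.
def family_wise_error(k: int, alpha: float = 0.05) -> float:
    return 1 - (1 - alpha) ** k

for k in (1, 10, 20, 50):
    print(f"{k:3d} markers tested -> P(at least 1 false positive) = "
          f"{family_wise_error(k):.2f}")
```

With 20 null markers tested at alpha = 0.05, the family-wise error rate already exceeds 60 percent, which is one reason apparently promising markers so often fail to replicate.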

Radiolabeled Probes as a Tool for the Assessment of Receptor Expression or Vascularity
Albert F. LoBuglio, M.D.

University of Alabama at Birmingham, Comprehensive Cancer Center, USA
A recent phase I clinical trial utilizing a humanized anti-Vitronectin receptor antibody provided an opportunity for our group to initiate attempts at measuring antibody binding or blockade of this angiogenesis-dependent endothelial cell receptor. We proposed that the antibody may bind available Vitronectin receptors and that administration of radiolabeled tracer doses might be used to image the tumor vascular bed and, when administered after infusion of a therapeutic dose, might demonstrate blockade of the receptor. This initial attempt was only partially successful and suggested that the antibody's binding affinity was too low and, in particular, that its "off-rate" was too rapid. Evidence for this hypothesis was then presented by Viti and colleagues, who showed that single-chain antibodies directed to an alternative intravascular target were able to image the tumor vascular bed only if their off-rate was very slow. Such high-affinity, low off-rate versions of the anti-Vitronectin receptor antibody have been described. The generalization of this concept (i.e., use of a probe carrying an isotope or other imaging agent as a means of delineating the tumor vascular bed) might well be applicable to magnetic resonance imaging and positron emission tomography scanning as a means of estimating the "volume" or "extent" of vascular bed per volume of tumor. As new anticancer agents are developed that target the tumor neovasculature, novel approaches to measuring responses are needed. Functional imaging has been suggested as a means of detecting changes in tumor blood flow and metabolism. Several challenges confront the development of these new technologies. We must identify noninvasive methods to quantify changes in tumor vasculature, and we must validate these methods as a means of quantifying the effects of antivascular agents. We must also be able to control for the variability of repeated measures. 
Finally, we must be able to coregister images with traditional scans to follow the responses of particular lesions over time. We have identified several advanced imaging modalities that we believe have the potential to address these challenges: positron emission tomography (PET) utilizing radiolabeled glucose (¹⁸FDG), radiolabeled carbon monoxide (¹¹CO), or radiolabeled water (H₂¹⁵O); and dynamic magnetic resonance imaging (MRI). PET-FDG relies on the fact that malignant lesions have elevated glycolysis, and therefore there is increased uptake of [¹⁸F]fluorodeoxyglucose. Conversion of ¹⁸FDG to ¹⁸FDG-6-phosphate occurs, which traps the radioisotope in neoplastic tissues. Our hypothesis is that changes in tumor vascularity in response to agents may alter glucose metabolism and therefore change the amount of uptake of ¹⁸FDG. PET-CO is designed to assist in the quantification of tumor blood volume. ¹¹CO is administered by inhalation, binds tightly to red blood cells, and has a short half-life (20 minutes). Red blood cell volume can be quantified, as can changes in blood volume over time. PET-H₂O allows for the measurement of tumor blood flow. H₂¹⁵O is delivered intravenously and is rapidly cleared (half-life < 5 minutes). Short scanning times can be used, and flow in tumor vessels can be standardized to major vessels and to arterial blood samples. The technique has been validated in myocardial and cerebral blood flow studies. Dynamic MRI relies on the rapid (less than 1 minute) administration of a gadolinium-based contrast agent. Ultrafast analysis of signal intensity and patterns of contrast uptake within tumors can correlate with microvessel density. Pharmacokinetic modeling of gadolinium clearance, as well as measurements such as the rate of enhancement, time to peak enhancement, clearance rate, and area under the curve, can be performed. Recent advances in MRI utilizing gradient-recalled echo and inhaled carbogen may enhance these measurements. 
Our current clinical trial is designed to evaluate these various imaging modalities in patients currently being treated with antiangiogenic agents (anti-VEGF, thalidomide, isolated hepatic perfusion). Pretreatment imaging with CT, PET, and dynamic MRI is followed by interval imaging while on therapy. Correlation with tumor biopsy is performed when available.
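The short tracer half-lives quoted above (about 20 minutes for ¹¹CO; under 5 minutes for ¹⁵O-labeled water) are what constrain scanning times. A minimal sketch of the underlying arithmetic, assuming simple exponential decay and using only the half-life figures quoted in the abstract:

```python
def remaining_fraction(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of initial radiotracer activity left after elapsed_min,
    assuming simple exponential decay: A(t) = A0 * 2**(-t / t_half)."""
    return 2.0 ** (-elapsed_min / half_life_min)

# 11CO: half-life of about 20 minutes, per the abstract
print(remaining_fraction(20, 20))  # 0.5 -- one half-life elapsed

# 15O-labeled water: using the < 5-minute bound quoted above,
# a 20-minute delay leaves little usable signal
print(remaining_fraction(20, 5))   # 0.0625
```

This is why ¹⁵O-water studies must use short scanning times while ¹¹CO blood-volume measurements can tolerate somewhat longer acquisitions.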

Background
Graft survival for all solid organ transplantation procedures is limited by acute and chronic rejection. The solution to this problem is induction of a state of donor-specific tolerance in the patient so that rejection does not occur. Current methods of diagnosing allograft dysfunction are inadequate in that significant organ damage occurs prior to the establishment of a clinical diagnosis. Clinical tolerance remains an elusive goal despite success in animal models. One of the main hurdles in developing tolerance strategies is the lack of a clinical biomarker or a "tolerance assay". The development of assays or novel technologies that will enable detection of allograft dysfunction/rejection, monitor responses to therapy, and predict long-term outcomes is vital for the success of transplantation clinical trials.

Objectives
• Address the validation of histological evidence of graft dysfunction by immunological methods
• Develop noninvasive techniques that use peripheral blood and urine to establish biomarkers that may be used as surrogate endpoints in transplantation clinical trials
• Evaluate newer methods of functional prediction by genomic DNA typing
Research Director, Laboratory of Immunogenetics and Transplantation, Brigham and Women's Hospital, Boston, MA, USA
CD4+ T cell recognition of alloantigen is the key initial event that leads to graft rejection. There are two nonmutually exclusive pathways of allorecognition. In the direct allorecognition pathway, T cells recognize intact donor major histocompatibility complex (MHC) molecules on donor antigen-presenting cells. In the indirect pathway, T cells recognize processed donor alloantigen presented as allopeptides by recipient antigen-presenting cells. There is increasing evidence to suggest that both pathways play important roles in the rejection process but that indirect allorecognition may play the dominant role in chronic rejection. Indeed, T cells from patients with chronic graft dysfunction exhibit specific alloreactivity to donor MHC peptides with epitope spreading. The utility of these assays as a biomarker of allograft dysfunction will be discussed.

Immune Parameters Correlating With Long-Term Graft Outcome
Nancy L. Reinsmoen, Ph.D.

Professor Pathology, Director of Clinical Transplantation Immunology Lab, Box 3712, Duke University Medical Center, Department of Pathology, Durham, NC 27710, USA
Our long-term goal is to identify immune parameters that predict graft outcome and use these parameters to identify recipients who are candidates for individualization of immunosuppression. We have used the immune parameters of donor antigen-specific hyporeactivity, peripheral blood allogeneic microchimerism, donor antigen-specific anti-HLA antibodies, and cytokine gene polymorphism to test solid organ recipients transplanted at two centers. We have found that hyporeactivity of the donor antigen-specific response at 1 year posttransplant, as determined by a decreased donor antigen-specific proliferative response, is predictive of a chronic rejection-free state in kidney, heart, and lung recipients (predictive value = 96 percent for kidney, 100 percent for lung, and 91 percent for heart recipients). Furthermore, kidney recipients who have experienced an acute rejection episode and remain responsive to donor antigen are at high risk for developing chronic rejection versus those who develop hyporeactivity (odds ratio = 3.61, p = 0.0042). The presence of high levels of peripheral blood allogeneic microchimerism (1:10,000) in lung recipients is also associated with a chronic rejection-free state (BOS grade < 1) (p = 0.029 at 18 months posttransplant). Recently, we have identified kidney recipients who, at the time of transplantation, had donor class II-directed antibodies detectable only by ELISA or flow cytometry techniques but not by cytotoxicity, and were at high risk for acute and chronic rejection. Of seven such kidney recipients, all had one or more acute rejection episodes, and three have developed biopsy-proven chronic rejection. We have also investigated whether cytokine (TNF-α, TGF-β, and IL-10) genotype polymorphisms associated with high and low cytokine production confer a greater propensity for acute or chronic rejection. 
In heart recipients (n = 28), four allelic patterns were found only in seven recipients with coronary artery disease (CAD); four other patterns were found only in eight recipients without CAD; and three other unique patterns were found in four recipients with and nine recipients without CAD (p < 0.001). We are currently genotyping lung and kidney recipients. Thus, the use of multiple parameters to assess a recipient's immune profile may more accurately predict graft outcome, thus permitting individualization and optimization of immunosuppression.
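Statistics like the predictive values (91 to 100 percent) and the odds ratio of 3.61 reported above are derived from 2×2 tables of marker status versus outcome. A minimal sketch of the arithmetic, using hypothetical counts chosen only to illustrate the formulas (not the study's actual data):

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table laid out as:
                     outcome+   outcome-
        marker+         a          b
        marker-         c          d
    """
    return (a * d) / (b * c)

def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of marker-positive subjects in whom the outcome occurs."""
    return true_pos / (true_pos + false_pos)

# Hypothetical counts, for illustration only
print(odds_ratio(12, 8, 5, 12))          # 3.6
print(positive_predictive_value(24, 1))  # 0.96
```

An odds ratio near 3.6 means the odds of the outcome are about 3.6 times higher in the marker-positive group; a predictive value of 0.96 means 96 percent of marker-positive subjects experienced the outcome.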

Objectives
• Identify current biomarkers used in the evaluation of therapeutics and preventive interventions for hepatitis C virus (HCV) infections
• Explore immune response parameters in the control and cure of HCV infections
• Determine areas of need, opportunity, and viable approaches to developing clinical assessment tools to evaluate HCV therapies
• Identify applicable tools, technologies, and strategies to evaluate response to HCV treatments

Yale University Medical School, USA
The selection of appropriate biomarkers to monitor hepatitis C virus (HCV) progression and determine the effectiveness of therapy is hampered by the limited understanding of the pathophysiology of the infection. It would be useful to identify candidate biomarkers that could be used both in studies of the disease's natural history and in monitoring experimental therapy. Such biomarkers would enable the clinical research community to evaluate the following: How much viral RNA and virus-encoded protein are present in body fluids and in the liver? How many hepatocytes are infected with HCV? In which other cells and tissues is HCV present? How many activated T cells are present in the blood, and how many of these are specific for HCV antigens? How many lymphocytes are present in the liver, what is their subset distribution, what is their level of activation, and how many are tolerized or apoptotic? How many of these lymphocytes are HCV-specific? How many hepatocytes are damaged, and how much of this damage is due to virus infection versus the host immune response? How much damage affects noninfected hepatocytes? What effector arms of the immune system are active in HCV-infected liver? It would clearly be desirable to move beyond techniques based on reverse transcriptase-polymerase chain reaction, Western blotting, and T cell subset analysis of the peripheral blood, and beyond H and E histology of liver biopsy material, and deploy a wider range of biomarkers in investigational settings. We look forward to a wide-ranging discussion of the options.

Overview of the Hepatitis C Virus: Natural History and Current Therapeutic Regimens
Karen L. Lindsay, M.D.

University of Southern California, USA
Acute infection with the hepatitis C virus (HCV) is associated with a high frequency of chronic infection, the potential for development of chronic liver disease which may result in liver failure (requiring liver transplantation or resulting in death), and hepatocellular carcinoma. Extrahepatic disorders, such as mixed cryoglobulinemia and renal disease, also occur. The natural history of HCV has been studied in two populations: individuals identified or retrospectively studied during the acute phase of the infection and individuals identified during the chronic phase of the infection. Studies on the natural history of acute HCV infection indicate that more than 80 percent of cases develop chronic infections. The estimated incidence of acute infection in the United States has, however, markedly diminished since the late 1980s. The relatively low incidence of new infections as well as other factors have made prospective studies of the immunologic response to acute infection and treatment of acute HCV very difficult. Once chronic infection occurs, progression of the liver disease follows a variable course over time. Retrospective/prospective followup studies of cohorts who acquired HCV infection in the 1970s have shown that liver-related mortality is slightly increased in the first 22 years of infection, but the percentage of individuals with significant hepatic fibrosis is low. Modeled analyses of the progression of hepatic fibrosis based on serial liver biopsies have demonstrated that male sex, age 40 or older at infection, and a history of heavy alcohol consumption are associated with an increased risk of fibrosis. In patients with clinically compensated cirrhosis, the estimated 10-year survival is at least 80 percent, but once clinical decompensation occurs, survival decreases to less than 40 percent. In patients coinfected with HIV, the course of the chronic liver disease is accelerated. 
Based on the results of a large recent study, the estimated prevalence of hepatitis C in the United States is 1.8 percent, corresponding to an estimated 3.9 million Americans who have been infected with HCV, most of whom have chronic infections. Currently, the primary endpoint of treatment for chronic HCV is nondetectability of HCV RNA by reverse transcription-polymerase chain reaction at the end of treatment and for at least 6 months after treatment is discontinued (sustained virological response [SVR]). In adult patients with elevated serum alanine aminotransferase and clinically compensated chronic liver disease, unselected patients treated with standard doses of interferon alpha-2b monotherapy for 48 weeks demonstrated a virological response rate of 24 percent at the end of treatment, and SVR is achievable in approximately 13 percent of patients. Treatment with interferon alpha-2b combined with ribavirin is associated with SVR rates of 28 percent in patients infected with genotype 1 HCV who were treated for 48 weeks and 69 percent in patients infected with other genotypes of HCV and treated for 24 weeks. Long-term followup of patients with SVR has demonstrated clinical benefit, and data from several Japanese studies have raised the possibility that treatment with interferon alpha, even in the absence of SVR, is associated with a decrease in the risk of progressive liver disease or development of hepatocellular carcinoma. Identification of immunological markers associated with recovery from acute infection, slow progression of liver injury, virological response during therapy, and particularly those which are associated with and could predict a sustained virological response off treatment are potentially of great benefit.
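As a quick consistency check on the epidemiology above (back-of-the-envelope arithmetic only, not part of the original study): a prevalence of 1.8 percent corresponding to 3.9 million infected Americans implies a reference population of roughly 217 million people:

```python
prevalence = 0.018  # 1.8 percent, as quoted
infected = 3.9e6    # 3.9 million Americans estimated ever infected with HCV

# prevalence = infected / population, so population = infected / prevalence
implied_population = infected / prevalence
print(round(implied_population / 1e6))  # 217 (million)
```

That figure is consistent with the two quoted numbers describing the same surveyed population rather than independent estimates.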

Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
The hepatitis C virus (HCV) is notable for the high rate of chronic infection that occurs in nearly all individuals who become infected. Liver biopsies from individuals with chronic HCV infection are notable for the presence of numerous mononuclear cells, at least some of which are CD4+ and CD8+ T lymphocytes. The immune response to HCV is polyclonal and multispecific, both in terms of antibody and cellular immune responses. Individuals who recover from acute HCV infection appear to have quantitatively more vigorous CD4+ cell proliferative responses against one or more HCV proteins compared with those individuals who develop chronic disease. However, there is compartmentalization of a unique set of CD4+ cell responses in the liver during chronic infection. CD8+ cell responses are less well characterized, in part because of the technical difficulties involved in isolating and characterizing these cells. HCV-specific cytotoxic T lymphocytes can be readily isolated from the liver of chronically infected persons and recognize multiple epitopes within the HCV polyprotein. Even individuals with the same human leukocyte antigen type do not consistently recognize the same epitope. Thus, there does not appear to be an immunodominant response on the CD8+ cell level in this infection. CD8+ cells do appear to play some role in limiting viral replication; these responses are insufficient to eradicate the virus completely, however, and likely cause liver injury once chronic infection is established. The kinetics of the CD8+ cell response in the liver is poorly understood in human studies. Cytokines produced by both CD4+ and CD8+ cells may play an important role in both inhibiting viral replication and causing liver injury. Intrahepatic CD4+ and CD8+ T cells produce type 1 cytokines such as interferon gamma, although this appears insufficient to control viral replication. 
A better understanding of the role of cellular immunity in the pathogenesis of HCV infection may aid in the development of vaccines and immunotherapeutic intervention strategies.

Key References:
Abrignani S. Antigen-independent activation of resting T-cells in the liver of patients with chronic hepatitis. Dev Biol Stand 1998;92:191-194.
Ballardini G, Groff P, Pontisso P, Giostra F, Francesconi R, Lenzi M, Zauli D, Alberti A, Bianchi FB. Hepatitis C virus (HCV) genotype, tissue HCV antigens, hepatocellular expression of HLA-A,B,C, and intercellular adhesion-1 molecules. Clues to pathogenesis of hepatocellular damage and response to interferon treatment in patients with chronic hepatitis C. J Clin Invest 1995;95:2067-2075.

Role of the Cellular Immune Response During Recovery From Hepatitis C Virus Infection
Barbara Rehermann, M.D.

Liver Diseases Section, NIDDK, National Institutes of Health, Bethesda, MD, USA
The hepatitis C virus (HCV) has a remarkable ability to establish clinically inapparent, persistent infections in individuals who are otherwise immunocompetent. Both the immunologic and virologic correlates that determine resolution versus chronic evolution of acute HCV infection are still unknown. The reasons are that patient cohorts and HCV strains are generally heterogeneous and neither the duration nor the route of infection is known. Out of necessity, therefore, most human studies have concentrated on patients with chronic HCV infection. To date, these reports suggest that once chronic infection is established, the HCV-specific immune response exerts some control over viral load but is unable to terminate persistent infection or resolve chronic hepatitis in most cases. Nevertheless, HCV-specific T cells are considered key players in the defense against HCV infection because of their ability to recognize viral antigens in, and to eliminate virus from, HCV-infected cells. A strong T helper 1 (Th1) cell-dominated response targeted against immunodominant epitopes in the nonstructural proteins of the virus has been demonstrated in the few patients with known, acute, self-limited hepatitis C. Lack of such a T cell response in the early phase of infection is associated with chronic evolution of disease. HCV-specific helper and cytotoxic T cells have been demonstrated in the peripheral blood for up to two decades after biochemical and virologic recovery from disease. In contrast, circulating HCV-specific antibodies may not be maintained for such a long time and were completely lost in 46 percent of patients studied between 10 and 20 years after recovery from a single-source outbreak of hepatitis C. Therefore, cellular rather than humoral immune responses persist for decades after recovery from hepatitis C. 
These findings imply that the percentage of cases with acute self-limited HCV infection and complete recovery may be underestimated in the general population since these recovered patients do not display HCV antibodies as markers of prior HCV infection.

Hepatitis C Virus Vaccine Design and Biomarkers of Protective Immunity
Christopher M. Walker, Ph.D.

The Ohio State University, USA
Vaccines that protect from hepatitis C virus (HCV) infection are at the earliest stages of development. Progress has been slow, in part because biomarkers of protective immunity against this virus have not yet been identified. Anti-HCV antibodies may be an important component of immunity, and preliminary data suggest that immunization with recombinant E2 glycoprotein protects chimpanzees from infection with low doses of a genetically matched virus (Choo et al. 1994). It is unlikely that antienvelope antibody titers alone will be a useful marker for vaccine evaluation, because sterilizing immunity (i.e., complete neutralization of infectivity) will probably not be achieved against a virus as variable as HCV. A more realistic goal for HCV vaccines may be termination of viremia during the acute phase of infection. Resolution is observed in a minority of individuals in the first few weeks after infection, but mechanisms of protective immunity remain to be elucidated. Virus-specific T lymphocytes belonging to the helper (CD4) and cytotoxic (CD8) subsets probably contribute to control of ongoing virus replication. Indeed, the observed correlation between virus-specific CD4+ T helper cell activity and resolution of acute HCV infection (Diepolder et al. 1995; Missale et al. 1996) might have implications for vaccine design and marker development. Studies correlating acute phase immune responses with outcome of infection are incomplete, however. Further investigation of this issue using human subjects is important but also complicated by several factors, including difficulty in identifying individuals with acute infection, limited access to liver tissue, and uncertainty about virus genetic variability and its impact on detection of HCV-specific T lymphocyte responses. Nonhuman primate models may be crucial for defining and stimulating immune responses that protect from chronic hepatitis C. Chimpanzees are the only species other than humans reproducibly infected by HCV. 
Although rare and endangered, they may be of great value in defining acute-phase immune responses that determine infection outcome. Preliminary data suggest that a CD8+ cytotoxic T lymphocyte response directed against multiple viral proteins is temporally associated with termination of virus replication in two animals. Strikingly, these animals failed to make antibodies against viral structural proteins. Eliciting strong cytotoxic cell responses by immunization has been the holy grail of HIV vaccine research, and several strategies for delivery of antigens into the class I major histocompatibility complex processing pathway have emerged. Many of these approaches, including nucleic acid vaccines, are now being applied to HCV. Comparing multiple vaccine options in chimpanzees is neither practical nor feasible. Methods for monitoring the breadth, frequency, and duration of vaccine-induced T cell responses in a small nonhuman primate model such as the rhesus macaque are needed to facilitate studies of HCV vaccine immunogenicity.

Schering-Plough Research Institute, USA
Approximately 10 years have passed since the principal agent responsible for parenteral (blood-borne) transmission of infectious non-A, non-B hepatitis was cloned. The molecular characterization of the hepatitis C virus (HCV) has provided critical insight into small molecule intervention strategies targeting the virally encoded genes required for viral replication (e.g., nonstructural protease, helicase, polymerase). HCV infection becomes chronic in the majority of exposed individuals, with a significant fraction developing serious life-threatening sequelae such as cirrhosis, liver failure, and hepatocellular carcinoma. Therapy with interferon alpha, alone or in combination with ribavirin, achieves an antiviral effect through a direct inhibition of viral replication as well as through immunomodulation. Differences in sustained response are seen between type 1 and non-1 genotypes; however, the biological basis for these observations has not yet been determined. The mechanism(s) for virus persistence is (are) largely unknown but probably include the generation of viral quasispecies and the evasion of antibody neutralization through immunoselection and hypervariability in the viral envelope. The complex interplay between host and virus can exist for decades with host immunity largely ineffective at eliminating the virus and virus-infected cells. Adjunctive immunotherapeutic approaches to address this immune-tolerant (compromised?) state may be useful in an effort to improve the sustained response rate. Activation of the cellular immune response may assist in the control and elimination of virus and virus-infected cells and perhaps improve efficacy in a genotype-independent manner. The interplay and role of the various components of the cellular immune response to viral infection have been broadly defined (antigen-presenting cells, T helper 1 cells, and cytotoxic T lymphocytes). 
In particular, therapeutic advantage may be taken of cooperative cellular interactions to process and present viral antigen and, in the process, activate the immune system. The elicitation of an effective CTL response to virus-infected cells and the in vivo measures of that response may require the application of metrics other than the accepted biomarker of circulating HCV viral load. In the course of any future immunotherapeutic clinical trial, it may be necessary to identify other surrogate measures of safety and efficacy. The timely development of DNA therapeutic vaccines and cytokine adjuvants and other approaches to immunomodulation will require an understanding of the relative contributions of antigens, modes of presentation, and their activity in eliciting immune recognition, recruitment, and activation.

Objectives
• Review current state of biomarkers for autoimmune diseases, including type 1 diabetes, multiple sclerosis, scleroderma, and systemic lupus erythematosus
• Determine criteria for identification, development, and validation of biomarkers for disease risk, diagnosis, disease activity or stage, immune activity, and response to therapy
  - What are our gold standards?
  - What are the criteria and approaches for defining and validating a biomarker?
• Discuss advantages and disadvantages of genetic, immunologic, inflammation, quality-of-life, and end-organ-specific biomarkers for diagnosis, staging, and evaluation of therapy of autoimmune diseases in general and of specific diseases
• Identify opportunities for potential new biomarkers for autoimmune diseases, research approaches to develop these, and pathways for transition of new biomarkers from basic research to clinical research to clinical practice
• Evaluate the possibility for clinical trials of new agents for prevention and treatment of autoimmune disease to be done in a timely, cost-effective, and definitive manner with available biomarkers

Biomarkers in Lung Disease in Scleroderma
Barbara White, M.D.

University of Maryland Research Service, VA Maryland Health Care System, USA
Interstitial lung disease is the leading cause of death in patients with systemic sclerosis (SSc). We have used bronchoalveolar lavage (BAL) to study immune abnormalities in the lungs of SSc patients. All patients were seen at the Johns Hopkins and University of Maryland Scleroderma Center and had pulmonary function tests (PFTs), including measurement of forced vital capacity (FVC) and single-breath diffusion capacity for carbon monoxide (DLco). In a series of 89 patients, we found that the presence of alveolitis, defined by an abnormal BAL cell differential with equal to or greater than 3 percent neutrophils or equal to or greater than 2.2 percent eosinophils or defined by interstitial inflammation on lung biopsy, was associated with greater subsequent decline in FVC and DLco compared with patients who did not have alveolitis. The presence of alveolitis was also associated with worse survival. Treatment of alveolitis with cyclophosphamide was associated with normalization of BAL cell differentials and an improved outcome in FVC, DLco, and survival compared with untreated patients with alveolitis. In a similar study of 61 SSc patients, the presence of high levels of CD8+ T cells, especially activated CD8+ T cells, was associated with lower initial and followup FVC and DLco values. These CD8+ T cells were found to make a type 2 pattern of cytokine mRNAs. Production of type 2 cytokine mRNAs was associated with a greater decline in FVC and DLco over time. In contrast, activation of cytolytic pathways of CD8+ T cells, as measured by soluble Fas ligand, granzyme A, and granzyme B levels, was not associated with a worse pulmonary outcome. These data suggest that activated CD8+ T cells may contribute to pulmonary fibrosis in SSc, especially through the production of type 2 cytokines. 
Consideration could be given to using BAL cell differential, numbers and percents of activated CD8+ T cells, and production of type 2 cytokine mRNAs by CD8+ T cells as surrogate endpoints in SSc lung disease.

Neuroimmunology Branch, NINDS, NIH Building 10, Room 5B-16, 10 Center DR MSC 1400, Bethesda, MD 20892-1400
Similar to insulin-dependent diabetes mellitus and rheumatoid arthritis, multiple sclerosis (MS) is considered a T cell-mediated autoimmune disease in which T helper 1 (Th1) lymphocytes (i.e., T cells that secrete interferon gamma [IFN-γ] and tumor necrosis factor-α/β [TNF-α/β]) contribute to organ damage. During recent years, a number of candidate autoantigens have been identified in MS, among them myelin basic protein (MBP). Based on this knowledge, a number of novel immunotherapies have been developed. These try to interfere specifically with T cell recognition of an autoantigenic peptide or to inhibit more globally the release of pathogenic mediators such as the Th1 cytokines. We have explored immunological measures that allow us to monitor the fluctuation of antigen-specific T lymphocytes during the natural course of MS or under a specific treatment, but we have also begun to develop markers that may be useful to assess the immunological disease state (with respect to Th1 responsiveness).

Barbara Davis Center for Childhood Diabetes, University of Colorado Health Sciences Center, USA
A combination of genetic analysis, autoantibody determination, and physiologic testing allows the identification of individuals at high risk for a series of autoimmune disorders and more precise diagnosis of specific disorders. In particular, for type 1A diabetes mellitus, Addison's disease, and celiac disease, specific class II human leukocyte antigen alleles are associated with a high risk of disease. The highest risk genotype is DQ8/DQ2 for type 1 diabetes (risk approximately 6 percent in the United States), DQ8/DQ2 with DRB1*0404 for Addison's disease (risk approximately 1/200), and DQ2/DQ2 for celiac disease (risk approximately 1/20 in U.S. infants). The autoimmunity of type 1 diabetes and celiac disease often begins early in life (prior to 9 months in many DQ8/DQ2 infants with expression of GAD, ICA512, or insulin autoantibodies) and prior to age 3 years for celiac disease (transglutaminase autoantibodies). As autoantibody assays are improved and cutoffs are set at or above the 99th percentile of normal populations, the autoantibodies are remarkably stable prior to the development of overt disease. This stability also allows the definition of response to therapy. For celiac disease transglutaminase autoantibodies, a gluten-free diet is associated within months with a marked decrease in autoantibodies as well as resolution of intestinal pathology. It should be similarly possible to design trials for the prevention of the appearance of anti-islet autoantibodies and to evaluate therapies to suppress such antibodies. Approximately one-third of DQ8/DQ2 first-degree relatives of patients with type 1 diabetes develop anti-islet autoantibodies prior to age 2. At present, autoantibodies provide the only specific and sensitive immunologic marker for type 1 diabetes. Current T cell assays in workshop evaluation were not able to distinguish patients with diabetes from controls, and reported assays of serum IL-4 have lacked specificity (interference from heterophile antibodies). 
It is likely that T cell assays will be improved, in particular with the introduction of tetramers to quantitate specific T cells. In the nonobese diabetic mouse model of type 1 diabetes, we believe insulin is the primary autoantigen, and a dominant insulin peptide (B:9-23) has been identified. The T cell clones recognizing this peptide show restricted Vα chain utilization. The development of both specific T cell and antibody assays coupled with determination of metabolic function and cytokine production by specific cells will hopefully provide additional surrogate markers.
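The 99th-percentile cutoff strategy described above can be sketched in a few lines. The data here are synthetic stand-ins for a normal control population; any real assay would set its cutoff from its own control sera:

```python
import random
import statistics

random.seed(0)
# Synthetic stand-in for assay values from a normal control population
controls = [random.gauss(1.0, 0.2) for _ in range(1000)]

# statistics.quantiles(n=100) returns the 99 percentile cut points;
# index 98 is the 99th percentile of the control distribution
cutoff = statistics.quantiles(controls, n=100)[98]

def is_autoantibody_positive(value: float) -> bool:
    """Flag a sample as positive if it exceeds the 99th-percentile cutoff."""
    return value > cutoff

# By construction, roughly 1 percent of controls exceed the cutoff
positives = sum(is_autoantibody_positive(v) for v in controls)
print(positives, "of", len(controls), "controls above cutoff")
```

Setting the cutoff this high trades sensitivity for specificity: about 1 percent of healthy controls will still test positive, which is the false-positive rate the 99th-percentile choice accepts.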

A Correlate Does Not a Surrogate Make: Autoantibodies and Beta-Cell Function in Prevention of Type 1 Diabetes
Carla J. Greenbaum, M.D.

University of Washington, Department of Veterans Affairs, Puget Sound Health Care System, USA
The ideal surrogate endpoint must have a strong biological rationale, the marker must predict disease, and treatment effects must be the same on the marker as on the disease. For the prevention of type 1 diabetes, markers include autoantibodies and beta-cell function. Autoantibodies are useful in prediction of disease; however, not all antibody-positive subjects progress to disease. In addition, clinical trial data indicate that change in autoantibodies is not associated with clinical outcome. For beta-cell function, there is a strong biological rationale. First-phase insulin release (FPIR) is useful in predicting disease; however, not all subjects with low FPIR progress to diabetes. Moreover, glucose homeostasis is a result of both insulin secretion and insulin action. Interpretation of insulin secretion measurements during intervention trials may be complicated by the adaptation of the beta-cell to changes in insulin sensitivity. Also, interventions may affect insulin secretion by mechanisms unrelated to the autoimmune process. These caveats notwithstanding, beta-cell function is currently the best surrogate endpoint for trials aimed at preventing type 1 diabetes.

Michael B. Stewart, Princeton, USA
Substantial unmet medical needs persist in autoimmune diseases for both new diagnostic and therapeutic modalities. Particularly needed are agents that will allow the subclinical and clinical course to be altered rather than simply treating the symptoms of established diseases. Scientific knowledge about the basic mechanisms of immune regulation and inflammatory response, as well as the genomics of diseases mediated by these mechanisms, is growing rapidly. There is great opportunity for meaningful advances if clinical and basic research resources can be aligned. 
Escalating costs for clinical trials, protracted timelines, and the large numbers of patients required for currently used composite clinical endpoints and indices mandate greater efficiency in determining how well new agents can address unmet medical needs. That efficiency is most likely to be achieved by validating correlations between specific biological mechanisms of disease and clinical outcomes. The benefits will include faster evaluation of new therapeutic agents or diagnostic methods in early clinical studies, as well as fewer patients and shorter timelines in confirmatory trials. Types of markers to be considered include molecules in systemic circulation, cells and cellular responses, and tissue-localized markers assessed biochemically or by imaging techniques.

Alcoholism has multiple biological and environmental determinants; alcohol-dependent persons demonstrate considerable individual differences in the biological underpinnings of the disorder and in treatment response. Recent research suggests that the endogenous opioid system plays a central role in alcohol reinforcement and dependence. Recent clinical trials of naltrexone treatment for alcoholism have demonstrated efficacy, and findings have also shown variable responsiveness to opioid antagonist therapy. Several opioid-related variables, including baseline alcohol craving, high levels of baseline somatic distress, and family history of alcoholism, appear to modulate the effectiveness of naltrexone treatment. Our research examines individual differences among alcohol-dependent subjects in three direct measures of the opioid system and the effectiveness of these measures in predicting naltrexone treatment outcomes. First, we employ positron emission tomography imaging to measure mu- and delta-opioid receptor binding throughout key central nervous system regions. We will assess individual differences in receptor blockade at baseline and after induction on naltrexone therapy.
Second, characterization of hypothalamo-pituitary-adrenal axis stimulation (i.e., adrenocorticotropic hormone and cortisol release) following opioid receptor blockade by naloxone provides a functional assessment of opioid activity. In preliminary data, we have observed a reciprocal relationship between endogenous opioid activity and opioid receptor binding, such that low opioid activity is associated with a compensatory upregulation of opioid receptors. Third, naltrexone is rapidly converted to its major active metabolite 6-beta-naltrexol, which provides a stable marker for naltrexone bioavailability. Research has demonstrated individual variability in naltrexone metabolism, with fourfold differences across subjects in peak 6-beta-naltrexol levels. Our data suggest that differing individual biotransformation rates and associated differences in 6-beta-naltrexol levels influence naltrexone effects on alcohol reinforcement and consumption. This research initiative will provide important information on the mechanism of action of naltrexone pharmacotherapy for alcoholism and, in the future, may allow us to individualize naltrexone pharmacotherapy based on patients' endogenous opioid system characteristics.

Neuroimaging of Cue-Induced Craving States
Anna Rose Childress, Ph.D.; William McElgin, B.S.; P. David Mozley, M.D.; and Charles P. O'Brien, M.D., Ph.D.

Penn/VA Addiction Treatment Research Center, 3900 Chestnut Street, Philadelphia, PA 19104-6178, USA
Powerful drug incentive (craving) states are cardinal features of addictive disorders and can fuel the relapse tendency that characterizes addiction. These craving states can be triggered by cues signaling the drug, and this fact has enabled their study under controlled laboratory conditions. Our early research found that video cues for cocaine could reliably elicit profound craving and arousal in users of the drug. The craving state was often likened by cocaine users to strong sexual desire and was sometimes accompanied by cocaine-like symptoms (e.g., ear-ringing, head-buzzing, taste of cocaine in the back of the throat, even mild euphoria). Understanding the substrates of this particular biomarker may help explain the differential vulnerability to addiction: Many individuals are exposed to highly rewarding drugs, yet only a relatively small subset develop the compulsive craving/drug use characteristic of addiction. We recently studied the brain substrates of cocaine craving (Childress et al. 1999) by monitoring brain activity (indexed by regional cerebral blood flow, with O-15 positron emission tomography) during video cues that reliably induce the state. Based on preclinical data showing limbic activation both to cocaine and to cocaine cues and on the salient emotional and motivational properties of cocaine craving, we hypothesized and subsequently confirmed increases in limbic activity (amygdala, anterior cingulate, and temporal pole) during cue-induced cocaine craving. Comparison regions (cerebellum, dorsolateral prefrontal cortex, basal ganglia, occipital cortices, and thalamus) did not show differential activation. Cocaine-naive controls neither craved nor showed differential limbic activation during the cocaine videos.
Results suggest limbic activation characterizes appetitive drug craving. Ongoing work with induced craving for natural rewards (e.g., sexual activity) will help determine whether the brain activation during pathognomonic cocaine craving differs, either quantitatively or qualitatively, from normal desire.

Biomarkers of Addiction Revealed Through Brain Imaging: State of the Art
Edythe D. London, Ph.D.

National Institute on Drug Abuse, USA
Brain imaging has been applied to studies of cerebral abnormalities in substance abusers, even during abstinence. Some convergent data implicate the same brain regions in the genesis of several addiction-related behavioral states. For example, positron emission tomography (PET) studies of cerebral glucose metabolism demonstrated a correlation between cue-elicited craving and activation in the dorsolateral prefrontal cortex, amygdala, and cerebellum. To some extent, these findings have been replicated in other studies, including assessments of cerebral blood flow using PET and functional magnetic resonance imaging (fMRI). Nonetheless, discrepancies exist and may reflect variations in the experimental paradigms or the analytical methods used. One major consideration is that brain activity reflects the complexity of simultaneous thoughts and feeling states. Determining the extent to which a change in a particular brain region is a function of a given feeling state or of another process (e.g., remembering taking cocaine), even if significantly correlated with the state of craving, requires careful study design. The spatial and temporal resolution of brain imaging techniques has improved greatly. Future studies will benefit from pairing these refined methods with paradigms that isolate specific behavioral states or probe the function of particular brain regions as markers for vulnerability to addiction or for drug-induced damage.

Keith Wesnes, Ph.D.

Cognitive Drug Research Ltd., Reading RG30 1EA, UK, and University of Northumbria, UK
Cognitive function is crucial for the execution of the activities of daily living. Aspects of cognitive function such as attention, short-term memory, long-term memory, reasoning, and planning are all vital processes that underlie our ability to conduct everyday tasks.
While we all accept that complex tasks require a high level of involvement of various elements of cognitive function, it is becoming acknowledged that even the simplest everyday tasks depend on adequate cognitive functioning. Unfortunately, many professionals mistakenly believe that behavioral observation, electroencephalography, positron emission tomography scanning, and so forth can directly measure cognitive function. These are, however, indirect measures, and their role as surrogate endpoints for changes in human cognitive function is therefore seriously flawed. The only direct assessment of cognitive function is measurement of an individual's capability to perform a cognitive task. Cognitive tasks are therefore the only valid surrogates for changes in cognitive function in clinical research. Cognitive tasks designed by psychologists are used in clinical research for two major reasons: (1) to identify the current status of functioning of various aspects of mental ability and (2) to measure change in cognitive function. A wide range of procedures is available for assessing current status, and if time is not an issue, a typical comprehensive neuropsychological workup can be conducted in about 2 hours. The requirements for assessing change make the number of available procedures dramatically smaller. We need parallel forms to minimize training effects. Furthermore, pretraining on tests ideally ensures that change over time reflects a change in the cognitive status of the individual rather than simply an experience-based difference in the ability to do the task. Because the greatest requirement in clinical research is to assess change in individuals as a result of experimental treatments and manipulations, this will be the principal focus of this presentation. Traditional pencil-and-paper measures such as the digit symbol substitution test will be reviewed and their limitations highlighted.
The advantages of automated procedures will be highlighted, and examples of their application to clinical research will be assessed. An important area of clinical research is the development of novel medicines and therapies. Patients have the right to know whether medicines will alter their ability to conduct the tasks of daily living. Cognitive function assessment can provide this information at any time during the clinical research program, enabling decisions to be made about the relative utility of novel medicines. Many medicines (not only those for treating central nervous system disorders) can affect cognitive function, and any medicine can potentially interact with alcohol and other compounds. Improvements in cognitive function can also be identified, resulting either directly from the compound or from successful treatment of a medical or psychiatric condition known to impair cognitive function (e.g., diabetes, depression). The stages of validation of computerized cognitive tests will be discussed, with examples from the automated system most widely used in clinical research, plus examples of the reliability and sensitivity of the system in current clinical use. Finally, future developments of cognitive testing for clinical research will be outlined.

Genetic Markers of Alcoholism and Alcohol Abuse
Henri Begleiter, Ph.D.

Department of Psychiatry, State University of New York Health Science Center, Brooklyn, NY, USA
A complex illness such as alcoholism does not have a pattern of inheritance that results from a single genetic abnormality. The psychiatric diagnosis of alcoholism does not inform us about its possible biological underpinnings. For the past two decades, we have pursued the possible brain anomalies that may be present in alcoholics and their relatives, as well as in their young unaffected offspring. In the early 1980s, we identified an abnormal event-related brain potential response manifested as a low P3 voltage. This low P3 response is prevalent in young sons and daughters of alcoholic families. More recently, we have demonstrated that this neuroelectric anomaly is also present in young adult offspring from high-density families. In addition, we have found that this electroencephalographic feature is highly heritable. In the nationwide Collaborative Study on the Genetics of Alcoholism, we have conducted linkage analysis of a large number of pedigrees and have found some quantitative trait loci that relate to both qualitative traits (diagnosis) and quantitative traits (neurophysiology). We have postulated that these neuroelectric features index an imbalance in central nervous system (CNS) excitatory/inhibitory homeostasis, manifested as CNS disinhibition and hyperexcitability. A number of neurobiological systems are hypothesized to be involved.

Themes in the Development of Addiction-Related Biomarkers
Robert L. Stout, Ph.D.

Center for Alcohol and Addiction Studies, Brown University, USA
The presenters in this session on behavioral and biochemical markers cover an especially diverse set of markers. The issue of markers in the sense of early indications of future status has received little direct attention in substance abuse research. It is not always easy to assess current substance use, so looking at markers may in some ways seem premature. Nonetheless, it is important to enrich our armamentarium of alternative outcome measures and indicators, giving students of alcohol problems maximum flexibility in conducting studies and, by using multiple fallible indicators, maximum confidence in the results of our studies. There have been many applications of computerized adaptive testing in the assessment of individual differences, but the use of adaptive techniques in outcome evaluation is novel. Because of concerns about subject burden and the cost of assessment, this technology will undoubtedly receive further research scrutiny. Biochemical markers of consumption have a firmly established role in outcome assessment. Because all existing markers have important weaknesses, however, their promise can only be realized through further development of the technology. Vulnerability measures as described by Dr. Begleiter could potentially be used in outcome assessment by enabling us to target the highest-risk subjects for the most intensive interventions.

Cardiovascular II

Background
In patients with stable coronary artery disease (CAD), the atherosclerotic process can induce a host of coronary functional and anatomic abnormalities that eventually affect myocardial performance. A number of factors likely contribute to this process, including procoagulant tendencies, inflammation, metabolic abnormalities, coronary microcalcification, oxygen radicals, and oxidized lipids. Treatment is usually multitargeted and frequently focused on the clinical presentation of ischemia. A relevant biomarker might target coronary flow limitations and/or abnormalities; expressions of ischemia such as perfusion, myocardial metabolism, or left ventricular function; or factors involved in CAD. Ischemia can be assessed directly by myocardial metabolic assessment (31P spectroscopy or other techniques), indirectly by perfusion techniques such as nuclear imaging or MRI, or by ECG changes (ambulatory ECG).

Calcium as a Biomarker for Atherosclerosis Progression and Regression in Coronary Artery Disease
Douglas P. Boyd, Ph.D.

Imatron Inc., USA
Atherosclerosis is a silent disease that develops slowly over decades of life until it manifests itself as clinical coronary artery disease, with symptoms including angina, heart attack, heart failure, and sudden death. The treatment of atherosclerosis involves lifestyle modifications and medical treatment of lipid disorders. Histologic studies have shown that approximately 20 percent of the volume of plaque in coronary atherosclerosis is marked by detectable levels of calcium (Rumberger et al. 1995). Although soft plaques with no detectable calcium exist, large studies have shown that 96 percent of patients with clinical coronary disease demonstrated by angiography have detectable coronary calcium (Laudon et al. 1999). Other studies have shown that the prevalence of calcium correlates with increased risk of future coronary events, and the greater the amount of calcium, the higher the risk (Arad et al. 1996). The risk, sometimes expressed as an "odds ratio," can be 20:1 or higher and is independent of the presence or absence of symptoms of cardiac disease. In the past, coronary artery calcification (CAC) could be detected by fluoroscopy, but the results were variable because of the relative insensitivity of the fluoroscopic technique and its requirement for a skilled operator. Today the gold standard for the detection and quantification of CAC is electron beam CT (EBCT) scanning using a 100-millisecond scan speed. Recent research has focused on the reproducibility of CAC scores and the ability of such quantitation to track the progression and regression of atherosclerotic disease. Callister and colleagues (1998) demonstrated in a retrospective study of 149 patients the ability to track disease progression after 12 to 15 months of treatment with a statin drug. Untreated patients advanced in plaque volume by 52 percent.
Those treated who achieved a final low-density lipoprotein (LDL) cholesterol level of less than 120 mg per deciliter had a net regression of about 7 percent. Regression analysis showed an association between the degree of regression and the final LDL level achieved. Such drug studies depend on the accuracy of CAC scoring: with higher accuracy, the longitudinal interval could be shortened and fewer patients would need to be enrolled in a blinded study. Currently, CAC is scored using the Agatston score (Agatston et al. 1990), which approximates the mass of calcium present, and a volume score, which estimates plaque volume. The major sources of error include motion artifact, electrocardiographic triggering errors, resolution blurring, and variability in background subtraction as determined by a threshold. All of these issues can be addressed by technical improvements, many of which are under way. Some of the improvements will be made in the EBCT scanner itself (reducing scan time to the 35-50 msec range, introducing multiple slices, increasing resolution by doubling detector pitch), and others require improvements in the CAC scoring workstation algorithms (interpolation scoring, linearization, and normalization of background using self-calibration). These techniques and others will improve the accuracy of CAC scoring in future years, providing an increasingly precise biomarker for early detection, treatment monitoring, and interventional research studies in coronary artery disease.
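The Agatston calculation referenced above can be illustrated with a short sketch. The published algorithm sums, over all detected lesions, the lesion area multiplied by a density weighting factor set by the lesion's peak attenuation in Hounsfield units (HU): 1 for 130-199 HU, 2 for 200-299, 3 for 300-399, and 4 for 400 and above. The lesion areas and peak HU values below are hypothetical, chosen only to show the arithmetic; real scoring works slice by slice on segmented CT images.

```python
# Hedged sketch of Agatston coronary calcium scoring (Agatston et al. 1990).
# Each lesion contributes (area in mm^2) x (density weight by peak HU).
# Lesion values are illustrative, not taken from the abstract.

def density_weight(peak_hu):
    """Agatston density weighting factor for a lesion's peak attenuation."""
    if peak_hu < 130:
        return 0  # below the calcium detection threshold
    if peak_hu < 200:
        return 1
    if peak_hu < 300:
        return 2
    if peak_hu < 400:
        return 3
    return 4

def agatston_score(lesions):
    """Sum area x weight over lesions; each lesion is (area_mm2, peak_hu)."""
    return sum(area * density_weight(peak) for area, peak in lesions)

# Example: three hypothetical calcified lesions on one scan.
lesions = [(5.0, 150), (3.0, 260), (2.0, 410)]
print(agatston_score(lesions))  # 5*1 + 3*2 + 2*4 = 19.0
```

This is also where the scoring errors listed above enter: motion artifact and background-subtraction variability shift lesion areas and peak HU values, which the threshold-based weighting then amplifies.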

Inflammatory Biomarkers in the Prediction of Future Coronary Events Among Apparently Healthy Men and Women
Paul M. Ridker, M.D., M.P.H.

Brigham and Women's Hospital, Harvard Medical School, USA
Myocardial infarction and stroke commonly occur among individuals without hyperlipidemia. In an attempt to better predict future coronary events, epidemiologic studies have explored a series of novel risk factors, including biomarkers of inflammatory function. Specifically, recent large-scale prospective studies indicate that nonspecific inflammatory markers such as C-reactive protein and serum amyloid A, as well as direct markers of cellular adhesion (soluble intercellular adhesion molecule 1) and of cytokine activation (interleukin-6), are all elevated many years in advance among those at high risk for future events. This has been shown for women as well as men and is present in subgroups of patients traditionally considered low risk. Further, the predictive value of inflammatory markers appears independent of lipid levels.

The purpose of the session will be to propose an agenda for advancing the scientific approaches by which clinical biomarkers will be discovered, evaluated, and, in appropriate instances, developed as effective surrogate endpoints. This session will be organized as a free-flowing discussion designed to address several topics. The first is the research infrastructure in support of developing clinical biomarkers, including (1) Centers of Excellence that provide an environment in which fundamental scientific inquiry is integrated with clinical investigation in the process of translational research; (2) technological resources (e.g., chip technology, mass spectrometry, proteomics, magnetic resonance imaging); (3) training and the training environment; (4) requests for proposals that encourage development of biomarkers in designated areas; and (5) other infrastructure support. The second topic, research infrastructure in support of classifying clinical biomarkers, will include (1) What evidence is necessary to establish the utility of a surrogate endpoint for a particular treatment and disease?
(2) What new methodologies will enhance our ability to generate the evidence necessary to establish the utility of a surrogate endpoint? (3) Who should be systematically evaluating surrogate endpoints? How can such research be stimulated and disseminated for maximal benefit? Should there be a repository of such information on surrogate endpoints? If so, where should it be located, and who should manage it? and (4) What will research on biomarkers and surrogate endpoints look like over the next 5 to 10 years? Participants in this session will include bench scientists, clinical investigators, and statisticians with both scientific and regulatory perspectives. Session participants will draft a tentative agenda and action plan for research and activities needed in the future.

Issue: Evaluation of Biomarkers and Surrogate Endpoints
Harry B. Burke, M.D., Ph.D., and Donald E. Henson, M.D.

New York Medical College, USA
As types of prognostic factors, biomarkers and surrogate endpoints are important for assessing the natural history of disease, selecting the optimal therapy, and evaluating the effectiveness of treatment. Two issues are central to the initial evaluation of biomarkers and surrogate endpoints. The first is the time from diagnosis to the analysis of outcomes (e.g., mortality): the longer this interval, the longer the prediction interval can be. To provide, for example, 10-year survival predictions, a patient population must be followed for 10 years. The 10-year information is used to assess predictive accuracy and to provide 10-year outcome predictions for future patients. Once the relationship between the surrogate outcome and the true outcome is known, the factor can be used as a surrogate endpoint. The second issue is the accrual of a sufficient number of outcomes so that the assessment of the factor is statistically reliable. A human specimen bank that contains abnormal and normal tissue, white cells, serum, and plasma facilitates the assessment of biomarkers and surrogate endpoints because, if the tissue was collected long ago and the patients were followed through time, it eliminates the initial waiting time and accrual problems inherent in both.

MS HFS-452, 200 C Street, SW, Washington, DC 20204, USA
Recently, we have witnessed explosive growth in the use of biological markers of bone turnover as clinical endpoints in evaluating the safety and efficacy of therapies other than antiresorptive drugs. These markers have been applied as assessment tools in nutritional therapy involving conventional foods, dietary supplements, functional foods, food fortificants, infant formulas, and medical foods.
In safety and efficacy testing of nutritional products, the ideal biological marker should reflect a change in physiological status such as toxicity, nutritional deficiency or sufficiency, as well as health benefits. The following discussion focuses on safety testing and presents evidence supporting the utility of biological markers of bone turnover in clinical testing of nutritional products. Both resorptive and formative markers used as an index of bone turnover show variable and often weak responses to nutritional therapy when compared with the classic endpoints used in drug trials: bone mineral density (BMD) and fracture rate. This was observed in a 3-year calcium and vitamin D supplementation trial in elderly men and women, in which significant increases in BMD at several sites were consistent with a reduction in fracture rate in the supplemented group. Serum osteocalcin showed a small but significant reduction with supplementation compared with the placebo group; the resorptive marker, urinary N-telopeptide of type 1 collagen of bone (NTx), did not change. The absence of an increase in either bone marker relative to the placebo group is an important indicator of nutritional adequacy. The usefulness of these biomarkers in this capacity is evident in a 2-year study comparing supplementation with milk to calcium carbonate or placebo. Significant increases in serum osteocalcin and urinary NTx occurred in women with the lowest calcium intake, demonstrating nutritional inadequacy in the placebo group and suggesting poor compliance in the milk group. A recent study demonstrated significant increases in type I C-terminal procollagen peptide (PICP), osteocalcin, and urinary deoxypyridinoline (DPY) in response to acute dietary calcium depletion. This study further supports the potential usefulness and sensitivity of these markers as indicators of nutrient insufficiency.
There is also growing evidence that this technology has application in improving nutritional intervention practices. An unexpected suppressive effect of magnesium supplements on serum intact parathyroid hormone levels and bone turnover rate was demonstrated using serum biochemical markers of bone formation (osteocalcin and PICP) and resorption (type I collagen telopeptide, ICTP). Biomarkers may therefore prove useful in designing and testing functional foods or dietary supplements for specific interventions such as optimizing peak bone mass in young adults. Ricci and colleagues used urinary pyridinium cross-links and osteocalcin to monitor bone loss during a weight reduction intervention in postmenopausal women. They demonstrated that calcium supplementation was effective in reducing the bone loss that occurs during energy restriction. The markers have also been used effectively to demonstrate lack of benefit of calcium supplementation during states of increased physiological need such as lactation. Calcium supplementation did not affect the levels of bone turnover markers (osteocalcin, bone alkaline phosphatase [bAlkP], and urinary DPY) in Gambian women consuming a calcium supplement compared with those consuming a placebo for 52 weeks. The U.S. Food and Drug Administration (FDA) is also aware of the use of these markers in nutrient fortification trials in developing countries, notably in situations where the fortificant may interfere with calcium or bone metabolism. Osteocalcin is unique among the bone markers in that it has utility as both a marker of bone formation and turnover and a sensitive indicator of vitamin K nutritional status. Functional indication of vitamin K status is measured by the degree of undercarboxylation of the glutamic acid residues in this bone protein. Measurement of undercarboxylated osteocalcin has demonstrated that the current recommended daily allowance for vitamin K is insufficient to meet the needs of nonhepatic tissue.
Collaborators from FDA, the Centers for Disease Control and Prevention, and Yale Medical School are using sera from the third National Health and Nutrition Examination Survey (NHANES III) to determine the utility of measuring undercarboxylated osteocalcin to estimate vitamin K status in the U.S. population. Biological markers of bone turnover may also prove useful in demonstrating, in large cross-sectional surveys, the age at which skeletal maturity occurs. Urinary NTx and bAlkP measures were included in the 1999 survey (NHANES IV), with the objective that these markers would facilitate the determination of when skeletal maturity occurs in the various racial/ethnic and gender groups. This information has direct bearing on the most appropriate ages to initiate dietary interventions. We have measured osteocalcin and bAlkP in samples from the previous NHANES survey for this same purpose. These few examples will hopefully help broaden scientific awareness of the potential utility of bone markers in clinical testing of nutritional products and in establishing nutritional status of vitamin K.

Issue: Oxidizability of Lipoproteins as a Promoter of Atherosclerosis and Lack of Large-Scale Clinical Trials
Thomas G. Cole, Ph.D., and Nilima Parikh, B.Sc.

Director, Core Laboratory for Clinical Studies, Washington University, Box 8046, 660 South Euclid Avenue, St. Louis, MO 63110, USA
Oxidized low-density lipoprotein (LDL) has been suggested as a causative factor in atherosclerosis. The measurement of the oxidizability of LDL in clinical studies has been hampered by tedious methodology and a lack of instrumentation that allows analysis of large numbers of specimens. We developed a method to measure the kinetics of Cu²⁺-induced production of conjugated dienes (CD) in LDL simultaneously in 96 specimens. LDL was isolated from ethylenediaminetetraacetic acid (EDTA) plasma by sequential ultracentrifugation at 1.063 and 1.019 g/mL KBr in 10 µM EDTA. KBr was removed by rapid gel filtration, and LDL apo B was measured by nephelometry. LDL (50 µg/mL apo B) in 10 µM EDTA/phosphate-buffered saline was incubated with Cu²⁺ (25 µM) at 37 °C in a Bio-Tek PowerWave 200 plate reader using a disposable microtiter plate with a UV-transparent bottom (Costar). The rate of oxidation (CD production) was monitored at 234 nm, with readings every 3 minutes for 3 hours. Lag time (LT), maximum velocity (Max V), and total CD production were calculated.

Precision
                      Within Run (n = 19)    Between Run (n = 19)
                      Mean      % CV         Mean      % CV
LT (min)              69.7      2.3          77.8      6.1
Max V (µmol/min·g)    11.7      5.9          11.9      3.9
CD (µmol/g)           454.0     4.8          485.0     3.9

Pretreatment of plasma with vitamin E lengthened LT, and preoxidation of LDL with Cu²⁺ reduced LT. In a typical run, 46 specimens and 2 quality control specimens were analyzed in duplicate, allowing the analysis of the large numbers of specimens often encountered in clinical studies.
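The lag time and maximum velocity above are conventionally derived from the shape of the 234-nm absorbance curve: Max V is the steepest slope of the propagation phase, and LT is where the tangent at that steepest point intersects the baseline absorbance. A minimal sketch of that tangent-method calculation, using a simulated sigmoidal curve and hypothetical helper names (nothing in the code comes from the assay itself):

```python
# Hedged sketch: derive lag time (LT) and maximum velocity (Max V) from an
# absorbance-vs-time curve, as in conventional conjugated-diene kinetics.
import math

def diene_kinetics(times_min, absorbance):
    """Return (lag_time_min, max_slope_per_min) from a sigmoidal CD curve."""
    # Max V: steepest slope between consecutive readings (3-min spacing here).
    slopes = [(absorbance[i + 1] - absorbance[i]) / (times_min[i + 1] - times_min[i])
              for i in range(len(times_min) - 1)]
    k = max(range(len(slopes)), key=lambda i: slopes[i])
    max_v = slopes[k]
    # LT: intersect the tangent at the steepest point with the baseline
    # (initial) absorbance.
    t_mid = 0.5 * (times_min[k] + times_min[k + 1])
    a_mid = 0.5 * (absorbance[k] + absorbance[k + 1])
    lag = t_mid - (a_mid - absorbance[0]) / max_v
    return lag, max_v

# Simulated propagation curve: baseline, rapid diene formation, plateau.
times = [3 * i for i in range(61)]  # readings every 3 min for 3 hours
curve = [0.1 + 0.5 / (1 + math.exp(-(t - 70) / 8)) for t in times]
lt, max_v = diene_kinetics(times, curve)
print(round(lt, 1), round(max_v, 4))
```

Total CD production would follow from the plateau absorbance via the diene extinction coefficient; that conversion is omitted here.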

Issue: Emerging Technologies in Drug Development
Richard A. Frank, M.D., Ph.D.

Senior Director, Sanofi Research, 9 Great Valley Parkway, Malvern, PA 19355, USA
The need for relevant, validated biomarkers (Colburn 1997; Rolan 1997) to confirm drug mechanisms of action in nonclinical and clinical studies highlights a key distinction between classical imaging and positron emission tomography, single photon emission computerized tomography, or functional magnetic resonance imaging. In contradistinction to computerized tomography, these emerging technologies quantify molecular biomarkers or mechanistic intermediates. These snapshots of toxicological, cognitive, physiological, pharmacokinetic, and biochemical processes (Gibson et al. 1993; Langstrom et al. 1993) might be called bioimages, or pharmacoimages, for their utility in drug development (Frank, submitted). These rapid, noninvasive measurements can be made repeatedly at the site of action in small numbers of unanesthetized humans with minimal perturbation of the system studied, and the quantitative results can be expressed in familiar units of measure. Thus, all the criteria are satisfied for a clinical pharmacology tool, as well as two criteria for approval of a New Drug Application based on a single phase III trial: multiple endpoints involving different events and persuasive data showing the related pharmacologic effect (U.S. Food and Drug Administration 1998). During phase I or IIa clinical trials, the value added makes these technologies attractive for portfolio management, venture capital milestones, and out-licensing strategies. The pharmaceutical industry can be a valuable source of potential tracers if proprietary issues are acknowledged. For clinical investigations such as the longitudinal course of disease, multisite validation, or drug dependence and recidivism, the broad societal benefit argues for public-private partnerships.
Forced oscillation has been superimposed on natural breathing for more than 40 years as an index of respiratory airflow resistance. We used a new method of nonsinusoidal forcing (impulse oscillometry, IOS) to measure a newly defined index of low-frequency reactance (LFRX) as a biomarker for small (peripheral) airway obstruction. Using plethysmographic airway resistance as a gold standard, we showed in 43 patients with chronic obstructive pulmonary disease and in 32 asthmatic patients that LFRX was 20 percent more sensitive and more specific than conventional spirometry (FEV1) in detecting response to inhaled bronchodilator. In 29 patients with a history of asthma, LFRX showed changes of greater than 50 percent and greater than 2 standard deviations in 13 patients classified conventionally (by FEV1) as nonresponders. We conclude that LFRX is a substantially more sensitive biomarker of changes in small airway disease in patients with chronic lung disease. Twenty percent more patients are classified correctly using this biomarker with bronchodilator challenge, whereas 144 percent more patients are correctly classified in response to bronchoconstriction challenge.
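The sensitivity/specificity comparison above amounts to scoring each test's responder calls (FEV1 change or LFRX change) against the plethysmographic gold standard. A minimal sketch of that scoring, with entirely made-up patient labels:

```python
# Hedged sketch: sensitivity and specificity of a responder-classification
# test against a gold standard. All patient labels below are hypothetical.

def sensitivity_specificity(test_positive, gold_positive):
    """Both arguments are parallel lists of booleans, one entry per patient."""
    pairs = list(zip(test_positive, gold_positive))
    tp = sum(t and g for t, g in pairs)            # true positives
    tn = sum((not t) and (not g) for t, g in pairs)  # true negatives
    fn = sum((not t) and g for t, g in pairs)      # missed responders
    fp = sum(t and (not g) for t, g in pairs)      # false alarms
    return tp / (tp + fn), tn / (tn + fp)

# Ten hypothetical patients: gold-standard responder status vs. test calls.
gold = [True, True, True, True, True, False, False, False, False, False]
fev1 = [True, True, True, False, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(fev1, gold)
print(sens, spec)  # 0.6 0.8
```

A "20 percent more sensitive" claim then just compares the two tests' sensitivity values computed this way on the same patients.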

Issue: Imaging Technology in Development of Biomarkers
Laurie Hall, Ph.D., Professor; Paul Watson; Ashwini Kshirsagar; and Jenny Tyler. Herchel Smith Laboratory, University of Cambridge, University Forvie Site, Robinson Way, Cambridge CB2 2PZ, UK, Phone: 44-0-122333-6805, Fax: 44-0-122333-6748, E-Mail: tac12@hslmc.cam.ac.uk   It is well known that radiological magnetic resonance imaging (MRI) can visualize injury of human articular joints. Our hypothesis is that new MRI protocols will enable accurate measurement of osteoarthritis and rheumatoid arthritis. This presentation will demonstrate, for the first time, fully automated MRI procedures for quantitating both the amount and composition of human articular cartilage in the hand and knee: specifically, measurement of thickness, total volume, and selected volumes of cartilage via three-dimensional (3-D) MRI, and measurement of the MRI parameters of water in finger and knee cartilage, including Mo, T1, T2, MTR, Msat/Mo, and D. Important features especially relevant to clinical drug trials include quantitative 3-D assessment of each joint as an intact organ, including all soft tissue, bone, musculature, and vasculature; fully automated measurement of images from MRI scanners worldwide; use of image coregistration for automated repeat measurements of diseased tissues; and measurement of animal models of osteoarthritis and rheumatoid arthritis using human MRI protocols.
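Two of the measurements described (total cartilage volume and thickness from a segmented 3-D image) reduce to simple voxel arithmetic once a segmentation mask exists. A minimal sketch, with a hypothetical mask and voxel size standing in for the authors' automated segmentation pipeline:

```python
import numpy as np

# Hypothetical segmented 3-D MRI volume: True where a voxel is cartilage.
# Voxel size: 0.3 x 0.3 mm in-plane, 1.5 mm slice thickness (assumed values).
voxel_mm = (0.3, 0.3, 1.5)
mask = np.zeros((64, 64, 20), dtype=bool)
mask[20:40, 20:40, 5:10] = True          # stand-in cartilage region

voxel_volume = np.prod(voxel_mm)                   # mm^3 per voxel
total_volume = mask.sum() * voxel_volume           # total cartilage volume, mm^3
# Crude thickness estimate: tallest column of cartilage voxels along z.
thickness = mask.sum(axis=2).max() * voxel_mm[2]   # mm

print(f"volume = {total_volume:.0f} mm^3, max thickness = {thickness:.1f} mm")
```

Real protocols add registration and partial-volume handling, but the volumetric bookkeeping is this simple.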

Issue: Expanding the Concept of Surrogate Endpoints To Include Accessible Anatomic and Functional Field Effects to Assess Risk for Cancer in the Nonaccessible Organ Sites (e.g., Prostate) for Chemoprevention
Donald E. Henson, M.D. Medical Officer, Program Director, National Cancer Institute, National Institutes of Health, Executive Plaza North, Room 330, MSC 7346, 6130 Executive Boulevard, Bethesda, MD 20892-7346, USA, Phone: (301) Fax: (301)   Surrogate endpoints are needed for evaluating therapy. However, the concept of surrogate can be extended to anatomic sites to include risk assessment for cancer chemoprevention. For accessible sites, such as the uterine cervix, analysis of risk usually depends on cytology. For sites that are not accessible, such as lung, it is more difficult to evaluate risk of cancer, since a biopsy or other diagnostic test cannot be justified unless there is evidence of disease (e.g., an elevated prostate-specific antigen for a prostate biopsy). Therefore, to assess risk in nonaccessible sites, physicians might be able to take advantage of the anatomical or functional field effects in cancer. Patients with lung cancer often develop additional cancers of the upper aerodigestive tract (UADT) because of a field effect. Thus, the risk for lung cancer might be assessed by evaluating biomarkers for cancer in the oropharynx (surrogate for lung), since all UADT epithelium, including lung, is exposed to cigarette smoke. Theoretically, the risk for breast cancer can be tested by evaluating biomarkers (functional field effect) along the mammary line, which might serve as a surrogate for breast tissue. Potential examples of functional and anatomical surrogate sites will be presented for cancers of the lung, prostate, breast, esophagus, and ovary.

Field Research Scientist, Ciphergen Biosystems, Inc., 2817 New Providence Court, Falls Church, VA 22042, USA, Phone: (703) Fax: (703)   Protein profiling has become increasingly important for indicating global changes between normal and disease states and for searching for disease biomarkers.
Surface-enhanced laser desorption/ionization (SELDI) ProteinChip TM technology was used to generate and analyze protein profiles for cancer biomarker discovery. Laser capture microdissection (LCM)-procured normal and prostate cancer (PCA) cells were added onto the ProteinChip TM array. After binding and washing, the masses of the bound proteins were determined by the SELDI mass reader. Comparison maps showing the changes in the protein profiles between the normal and PCA samples were generated by SELDI software. Using 1,000 to 2,000 LCM-procured cells, prostatic acid phosphatase and prostate-specific antigen were found in both normal prostate and PCA, and prostate-specific membrane antigen was found in two out of three matched tissue samples, based on their molecular weights. We also found two unique proteins in PCA cells below 1 kDa and at least three unique proteins under 1 kDa using 1 µL of seminal plasma. The ProteinChip TM technology provides rapid, sensitive protein analysis in complex biological samples. Moreover, it is readily adaptable to a clinical assay format. Further studies will be required to determine whether the unique proteins identified in this preliminary study will prove to be novel biomarkers for diagnosis or prognosis of PCA.

Director, Transplant Immunology Lab, Departments of Internal Medicine and Immunology, Health Sciences Centre, University of Manitoba, Room GG549, 820 Sherbrook Street, Winnipeg, MB R3A 1R9, Canada, Phone: (204) Fax: (204)   A noninvasive test of renal allograft pathology would safely predict rejection and its resolution following treatment. Magnetic resonance (MR) spectral analysis of urine, combined with a reliable classification strategy, provides such a test.
Our strategy for generating disease classifiers, validated on several biofluids and tissues, includes novel spectral preprocessing, bootstrap-based cross-validation, and simple, robust linear discriminant analysis. It was applied to urine MR spectra (ppm range 0.0 to 9.0, 4,096 points/spectrum) using transplant protocol biopsy pathology (Banff scored) as the reference "gold standard." MR spectra from 22 patients (normal histology) and 22 normal controls constituted the "normal" group, and those of 34 patients with biopsy-proven acute rejection (n = 15), chronic rejection (n = 9), or acute/chronic rejection (n = 10) made up the "rejection" group. The optimized classifier correctly allocated both the normals (3 misclassifications; sensitivity = 91 percent, specificity = 95 percent, positive predictive value = 95 percent) and the rejections (1 misclassification; sensitivity = 95 percent, specificity = 91 percent, positive predictive value = 92 percent). These results suggest that urine MR spectra can distinguish between normal histology and acute and/or chronic allograft inflammation. Ongoing studies are developing classifiers for specific graft pathology. Urine MR spectral analysis, with an appropriate classifier strategy and algorithms, provides a rapid, noninvasive test of graft pathology that allows repetitive sampling of the graft.

Head, Section of Nephrology, University of Manitoba, 820 Sherbrook Street, Winnipeg, MB R3A 1R9, Canada, Phone: (204) Fax: (204)   A noninvasive test of renal allograft pathology would safely predict rejection and its resolution following treatment. Infrared (IR) spectral analysis of urine, combined with a reliable classification strategy, provides such a test.
Our strategy for generating disease classifiers, validated on several biofluids and tissues, includes novel spectral preprocessing, bootstrap-based cross-validation, and simple, robust linear discriminant analysis. It was applied to urine IR spectra (800 to 5,000 cm−1 range, 2,163 points/spectrum) using transplant protocol biopsy pathology (Banff scored) as the reference "gold standard." IR spectra from 22 patients (normal histology) and 22 normal controls constituted the "normal" group, and those of 34 patients with biopsy-proven acute rejection (n = 15), chronic rejection (n = 9), or acute/chronic rejection (n = 10) constituted the "rejection" group. The optimized classifier correctly allocated both the normals (3 misclassifications; sensitivity = 88 percent, specificity = 89 percent, positive predictive value [PPV] = 88 percent) and the rejections (3 misclassifications; sensitivity = 89 percent, specificity = 88 percent, PPV = 88 percent). These results suggest that urine IR spectra can distinguish between normal histology and acute and/or chronic allograft inflammation. Ongoing studies are developing classifiers for specific graft pathology. Urine IR spectral analysis, with the appropriate classifier strategy and algorithms, provides a rapid, noninvasive test of graft pathology that allows repetitive sampling of the graft.
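The classification strategy shared by the two preceding abstracts (linear discriminant analysis evaluated by bootstrap-based cross-validation, reported as sensitivity, specificity, and PPV) can be sketched as below. The data are synthetic, and the feature count, ridge term, and resampling details are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for preprocessed urine spectra: 44 "normal" and
# 34 "rejection" samples, 50 spectral features each (all values hypothetical).
X = np.vstack([rng.normal(0.0, 1.0, (44, 50)),
               rng.normal(0.6, 1.0, (34, 50))])
y = np.array([0] * 44 + [1] * 34)        # 0 = normal, 1 = rejection

def fisher_lda_fit(X, y):
    """Two-class Fisher linear discriminant: weight vector and threshold."""
    m0, m1 = X[y == 0].mean(0), X[y == 1].mean(0)
    Sw = np.cov(X[y == 0], rowvar=False) + np.cov(X[y == 1], rowvar=False)
    # Small ridge term keeps the pooled scatter matrix invertible when p ~ n.
    w = np.linalg.solve(Sw + 1e-3 * np.eye(X.shape[1]), m1 - m0)
    thresh = 0.5 * ((X[y == 0] @ w).mean() + (X[y == 1] @ w).mean())
    return w, thresh

def sensitivity(y_true, y_pred):
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp / (tp + fn)

# Bootstrap cross-validation: fit on a resample, test on the left-out samples.
sens_list = []
for _ in range(200):
    idx = rng.integers(0, len(y), len(y))
    oob = np.setdiff1d(np.arange(len(y)), idx)
    w, t = fisher_lda_fit(X[idx], y[idx])
    sens_list.append(sensitivity(y[oob], (X[oob] @ w > t).astype(int)))

print(f"out-of-bootstrap sensitivity: {np.mean(sens_list):.2f}")
```

Specificity and PPV follow from the same confusion-matrix counts; the out-of-bootstrap averaging is what guards the reported figures against overfitting.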

Issue: Technologies for Identification of Proteins as Biomarkers
Yingming Zhao, Ph.D., and Brian T. Chait. Assistant Professor, Department of Human Genetics, Mount Sinai School of Medicine, New York, NY 10029, USA, Phone: (212) Fax: (212) 824-2508, E-Mail: zhao@vaxa.crc.mssm.edu   Biological machines control many critical cellular processes, such as DNA replication, gene expression, and protein sorting. Identification of the protein components of such machines is often the first step in gaining a detailed understanding of their operation. In this presentation, we report our use of capillary LC-ion trap mass spectrometry and matrix-assisted laser desorption ionization-tandem ion trap mass spectrometry (MALDI-TIMS), in combination with a protein sequence database search using the software tool PepFrag, for the identification of gel-separated proteins from various preparations of biological complexes. Using this approach, we were able to identify a protein at a level of 0.2 pmol loaded on sodium dodecyl sulfate polyacrylamide gel, which is at least 50 times more sensitive than the more conventional Edman peptide sequencing method. This approach was used for the analysis of 450 protein bands from two preparations of enriched yeast nuclear pore complex. The analysis yielded 430 positive protein identifications, which resulted in the identification of 220 different proteins. In addition to the yeast nuclear pore complex, we also applied the methods to the identification of protein components of other biological machines (e.g., RNA polymerase II holoenzyme, SWI/SNF complex, Stat1a TAD-associated proteins). These studies demonstrated that mass spectrometry-based protein identification and sequencing methods provide an extremely powerful tool for the dissection of a complicated biological system.
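The database-search step can be illustrated with a toy peptide-mass-fingerprint matcher: predict tryptic peptide masses for each candidate sequence and count observed masses that match within a tolerance. This is a deliberate simplification; real tools such as PepFrag score MS/MS fragment ions, and the sequences and masses below are hypothetical:

```python
# Monoisotopic residue masses (Da) for a few amino acids; a water mass is
# added per peptide. Only the residues used in the toy database are listed.
RESIDUE = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "K": 128.09496,
           "R": 156.10111, "L": 113.08406, "E": 129.04259}
WATER = 18.01056

def tryptic_peptides(seq):
    """Cleave after K or R, no missed cleavages -- a simplified trypsin rule."""
    peps, start = [], 0
    for i, aa in enumerate(seq):
        if aa in "KR":
            peps.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peps.append(seq[start:])
    return peps

def peptide_mass(pep):
    return sum(RESIDUE[aa] for aa in pep) + WATER

def score(observed, seq, tol=0.01):
    """Count observed masses matching a predicted tryptic peptide mass."""
    predicted = [peptide_mass(p) for p in tryptic_peptides(seq)]
    return sum(any(abs(o - m) < tol for m in predicted) for o in observed)

# Toy two-entry "database" and two observed peptide masses.
db = {"protA": "GASKLLER", "protB": "AAAKGGGR"}
observed = [peptide_mass("GASK"), peptide_mass("LLER")]
best = max(db, key=lambda name: score(observed, db[name]))
print(best)   # protA: both observed masses match its tryptic peptides
```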

Trainee Travel Award Abstracts
The Cardiovascular Imaging Center, Department of Cardiology, The Cleveland Clinic Foundation, 9500 Euclid Avenue, Cleveland, OH 44195, USA, Phone: (216) Fax: (216)   Doppler echocardiography is a useful tool to assess left ventricular (LV) dysfunction, a common clinical problem. Color M-mode echocardiography (CMM) provides blood propagation velocities. We correlated CMM characteristics with early diastolic transmitral pressure gradients (P_MV), the LV isovolumic relaxation time constant (τ), and left atrial (LA) pressures, all valuable indices of LV function. Twelve patients undergoing cardiac surgery had transesophageal CMM performed at baseline, during norepinephrine infusion, and under partial-bypass conditions. Simultaneous high-fidelity LA and LV pressures were also obtained. LV end-diastolic (EDV) and end-systolic volumes (ESV) were measured from transesophageal echo. τ, P_MV, and LAP were correlated with the CMM-derived propagation velocity (V_p); with pressure gradients calculated using the simplified Euler equation, which relates pressure gradients to spatial and temporal velocity changes (P_Euler); and with early diastolic peak filling velocities (E). V_p was strongly correlated with τ (58 ± 16 msec, r = −0.78, p < 0.001, Figure 1) but changed little with preload (baseline: 37 ± 12 cm/sec versus partial bypass: 34 ± 16 cm/sec, p = 0.3). E also correlated with LAP (r = 0.68), as did P_Euler with P_MV (y = 0.79x + 0.48, r = 0.91, Figure 2). V_p is a preload-independent index of LV relaxation. CMM-derived pressure gradients may be useful for the clinical assessment of LV filling and function.

The Cardiovascular Imaging Center, Department of Cardiology, The Cleveland Clinic Foundation, 9500 Euclid Avenue, Cleveland, OH 44195, USA, Phone: (216) Fax: (216)   Quantitative assessment of left ventricular (LV) contractility is a challenging task and a common clinical problem. Color tissue Doppler echocardiography (CTDE), a novel technique, allows noninvasive measurement of myocardial contraction velocities.
Myocardial strain represents the spatial derivative of these velocities (dv/ds). CTDE images of the interventricular septum were recorded from the apical four-chamber view in seven closed-chest, anesthetized dogs during four to five different inotropic stages. Simultaneous LV volumes and pressures were obtained with a combined conductance, high-fidelity pressure catheter. Peak elastance (E_max), the slope of end-systolic pressure-volume relationships during caval occlusion, was used as the "gold standard" of LV contractility. Peak systolic CTDE basal-septal velocities (S_m) and peak (ε_peak) and mean (ε_mean) strain were compared against E_max by linear regression analysis. E_max and the CTDE systolic indices increased during stimulation with dobutamine and decreased with esmolol infusion. A stronger relationship was found between E_max and ε_peak (r = 0.94, p < 0.01, y = 0.27x + 0.56, Figure 1) and ε_mean (r = 0.85, p < 0.01, y = 0.18x + 0.39, Figure 2) than for S_m (r = 0.75, p < 0.01). CTDE-derived ε_peak and ε_mean are strong noninvasive indices of LV contractility and can be used in the acute and serial clinical assessment of patients with LV dysfunction.

Research Associate, Department of Pathology, Loyola University Medical Center, Room 2646, 2160 Fax: (708)   Acute endothelial injury is often caused by oxidative stress induced by an inflammatory reaction associated with the production of oxygen-derived free radicals. Such injury occurs in the liver in response to an array of disease processes that include viral infection, alcohol abuse, cirrhosis of any cause, chemotherapy or other drug-induced injury, and allograft rejection. This leads to neutrophilic activation and adherence and an accelerated endothelial inflammatory reaction associated with uninhibited, widespread vascular injury. The latter circumstance characterizes the sequence of events in end-stage liver disease, which characteristically terminates in multisystem organ failure.
Enzyme-linked immunosorbent assays (ELISA) and colorimetric immunoassays for inducible nitric oxide synthase (iNOS), human soluble L-selectin (hs-L-Sel), antiannexin V (anti-Anx V), human soluble thrombomodulin (hs-TM), free and total tissue factor pathway inhibitor (F-TFPI and T-TFPI), and human soluble interferon-gamma (IFN) were used to measure these releasable endothelial response biomarkers in plasma. Individual patient groups (4 to 6 patients each) with various liver and renal autoimmune diseases, such as acute liver failure (ALF), acute renal failure (ARF), hepatitis C and autoimmune hepatitis (HCV auto), hepatocellular carcinoma with hepatitis C (HCC, HCV), hepatitis C cirrhosis (HCV cirrh), hepatocellular carcinoma with cirrhosis (HCC cirrh), and hepatitis C cryoglobulinemia (HCV cryo), were included. The seven response markers may provide a disease-specific profile for each disorder, as shown in the table below. Although antiannexin V was elevated in groups 1 and 2; L-selectin in groups 1, 2, 5, and 6; and hs-TM in groups 1, 2, and 3, a constant elevation of iNOS was found in all groups. The results suggest that measurement of these endothelial response markers may help in the categorization of patients into various risk groups. Furthermore, these markers may be helpful in the assessment of therapy directed

Over the past decade, there has been accelerating interest in the replacement of final endpoints with surrogate endpoints in randomized clinical trials. Unfortunately, the treatment of surrogate variables for cancer clinical trials in the medical and statistical literature has often been heuristic and ad hoc in character. Nevertheless, there are underlying issues present in much of the more recent research, an understanding of which will enable us to highlight the inherent methodological limitations of techniques for assessing the validity of surrogate endpoints. We begin by outlining the common approaches to surrogate variables adopted in the literature.
A detailed discussion of the methodological strengths and weaknesses of each approach, and of the resulting impact on reliability and quality, is underpinned by a discussion of empirical examples. Pointers to improved methodology are discussed. Although the practical constraints of randomized clinical trials often encourage the use of surrogate variables, the choice of a particular method for use in a trial remains ad hoc. However, this choice can have a substantial impact on the quality and reliability of findings. We outline some factors that need to be taken into consideration and a methodological framework for the development of more robust methods.

Biology, Department of Organ Transplantation, Swedish Medical Center, Suite 400, 1120 Cherry Street, Seattle, WA 98104, USA, Phone: (206) Fax: (206)   Intestinal mucosal injury occurs in necrotizing enterocolitis (NEC), and intestinal mucosal compromise has been related to the development of complications (e.g., multiple organ dysfunction and systemic inflammatory response syndrome [SIRS]) in critically ill patients. Intestinal fatty acid binding protein (IFABP) is a 15-kD cytoplasmic protein unique to the mature small intestinal enterocyte. In animal models of acute intestinal mucosal injury, measurable concentrations of IFABP were found in serum and were then excreted in the urine. We hypothesized a similar occurrence in humans with clinical diseases involving acute mucosal compromise. A radioimmunoassay for IFABP was developed in our laboratory. Healthy staff volunteered random serum, plasma, and urine specimens. Preliminary blinded, prospective studies were performed in two groups: Group 1 comprised 21 consecutive patients admitted to a neonatal intensive care unit for NEC, and Group 2 comprised 100 consecutive patients admitted to a surgical intensive care unit. In controls, all IFABP levels were less than 1.87 ng/mL. In NEC, plasma IFABP was directly related to disease severity (Table 1).
For adults admitted to the SICU, urine IFABP correlated with the development of SIRS and peaked an average of 1.4 days before SIRS was diagnosed. These preliminary studies support the hypothesis that IFABP concentrations correlate with the severity and development of disease in NEC and SIRS. Development of this marker may aid the clinician/scientist in predicting and monitoring the course of disease and in assessing the effects of interventions.