Future research directions in the use of biomarkers.

Many DNA adduct studies have been carried out in occupational groups considered to be at risk of cancer on the basis of epidemiological results relating to exposures decades ago. Even new epidemiological publications on cancer cannot accurately address the effective exposures after about 1970. This is one justification for biomarker studies. Another justification is exposures for which epidemiological studies have not been conducted or have provided inadequate results, in spite of suspicions raised by short-term or animal experiments. The modulation of environmental carcinogenesis by host polymorphism in genes for xenobiotic metabolizing and DNA repair enzymes is currently under extensive investigation. The studies relating phenotype/genotype to cancer are presently being extended to various end points that may be related to cancer, such as DNA adducts and cytogenetic damage. Adjustment for a metabolic phenotype or genotype may also increase the precision of the measurement. Mutations in oncogenes and tumor suppressor genes may give clues to the etiology of cancer.


Introduction
The use of biomarkers (defined as indicators of exposure, effect, and individual susceptibility) is relatively recent. Many of the methods used have not been extensively validated and it is not known, in most instances, to what extent the biomarkers predict the risk of mutation or cancer.
In this paper we will review, subjectively, some areas of biomarker research that are familiar to us. The areas covered include DNA adducts, adducts and mutations, and mutations in oncogenes and tumor suppressor genes, including their products, oncoproteins. Mutation epidemiology, whether carried out through a twin or a population registry or linked to a cancer registry, may be a powerful source of new patients for genetic analysis. Because genes involved in familial cancer may also operate in common cancers, a large interest in these genes has developed.

DNA Adducts
In the future, the emphasis in DNA adduct research should be on quantification of specific adducts. For some 10 years, aromatic adduct profiles have been presented in the literature. While these profiles have served a purpose as a measure of the general adduct level, there is now a need to improve the specificity of adduct studies, aiming at the criteria of general analytical chemistry; however, because the analysis involves multiple steps, a great deal of standardization is required. The most straightforward approach is to use both external and internal standards for the adducts to be identified. This is not feasible in exposures to complex mixtures such as polycyclic aromatic hydrocarbons (PAHs). The approach is more feasible for compounds that form only one main type of adduct.
Another kind of general problem with adducts is the use of surrogate tissues instead of target tissues. Furthermore, half-lives of adducts are largely unknown even in surrogate tissues. A few studies have addressed these problems.
Smoking is a known risk factor for laryngeal cancer. Aromatic adducts of laryngeal tissue obtained from surgery were analyzed; there was a relationship to smoking, most clearly in the tumor tissue. Adduct levels in both tumor and normal laryngeal tissues showed a correlation of about 0.9 with those in total white blood cells (1).
Smokers had elevated levels of 7-methylguanine, particularly in their lymphocyte DNA as compared to the granulocyte DNA (2). The adduct levels were highest in the bronchial DNA of smokers, almost four times the level in nonsmokers (3). In a small number of smokers, both target (bronchial) and surrogate (lymphocyte) DNA were available, showing a correlation of 0.8.
Larynx tissue samples obtained from surgery have also been assayed for 7-methylguanine-DNA adducts. There was a relationship to smoking, and larynx adduct levels were two times the level in white blood cells. There was only a modest correlation between 7-alkylguanines and aromatic adducts.
In the latter part of this section, we will discuss some examples of how the problems of specificity and quantification can be tackled on very different kinds of exposures.

PAH Adducts
We have attempted to study the nature of the aromatic adducts detected in the postlabeled samples from Silesia, an industrialized area of Poland, which has been a focus of our studies for years. The methods applied with samples before postlabeling included nuclease P1 treatment, butanol extraction, and immunoaffinity chromatography (IAC) using an antibody raised against benzo[a]pyrene-modified DNA (4). The results on IAC are shown in Figure 1. The antibody binds benzo[a]pyrene diol epoxide (BPDE)-modified deoxyguanosine 3'-monophosphate (dGMP), a microsomally produced mixture of 10 different PAH-DNA adducts, and DNA obtained from lymphocytes of coke workers (highly occupationally exposed to PAHs) and Silesian residents (environmentally exposed to PAHs). However, samples of both coke workers and other Silesians showed more binding by the IAC column in the winter, a time of heavy exposure to PAHs. In high-performance liquid chromatography (HPLC) analysis using a flow-through radioactivity detector, typical seasonal adduct peaks were noted; they were particularly prominent in lymphocyte DNA collected in the winter. These peaks eluted in the area of PAH-DNA adducts, giving additional support that the adducts are PAH-like (5).
Another problem with complex mixtures is the difficulty in quantitating the results. In an illustrative experiment, DNA adducts of a number of 3H-labeled PAHs were prepared in a microsomal system and used for optimization and measurements of recoveries in the postlabeling assay. The optimal labeling conditions for all tested compounds were very similar. The recoveries varied from 3 to 60% among different PAHs, indicating that the levels of these adducts could be considerably underestimated when analyzing human samples from PAH-exposed populations (6). Because we have found similar results with an entirely different group of compounds, we concluded that different adducts require different conditions for optimal labeling (7). Thus, the absence of proper standards, or unknown adducts, makes quantitative interpretation of the postlabeling results difficult, if not impossible.
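The effect of incomplete recovery on quantification can be illustrated with simple arithmetic. The following is a minimal sketch, assuming that a measured adduct level scales linearly with labeling recovery; the function names are hypothetical and the measured value of 1.0 is an arbitrary unit, not a result from the studies cited.

```python
def underestimation_factor(recovery: float) -> float:
    """Factor by which the true adduct level exceeds the measured level,
    assuming (as a simplification) that the signal scales linearly with recovery."""
    if not 0.0 < recovery <= 1.0:
        raise ValueError("recovery must be a fraction in (0, 1]")
    return 1.0 / recovery


def recovery_corrected_level(measured_level: float, recovery: float) -> float:
    """Correct a measured adduct level for a known labeling recovery."""
    return measured_level * underestimation_factor(recovery)


# The recoveries reported in the text ranged from 3% to 60%.
for recovery in (0.03, 0.60):
    corrected = recovery_corrected_level(1.0, recovery)  # 1.0 = arbitrary unit
    print(f"recovery {recovery:.0%}: true level is about {corrected:.1f}x the measured level")
```

Under this assumption, an adduct recovered at 3% would be underestimated roughly 33-fold, and one recovered at 60% less than 2-fold. In a mixture of unknown composition the appropriate recovery is itself unknown, which is why such a correction cannot simply be applied to human samples.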
Further methods and efforts are needed to characterize the levels of adducts of individual PAHs or other aromatic compounds, which is a large task.

Alkenes
Gasoline is one of the most common solvent vapors to which workers and the general public are exposed. The exposure is particularly to the volatile alkanes and alkenes. Additionally, many of the components of gasoline are found in vehicle exhaust either because of incomplete combustion or because of the chemical reactions taking place in engines. Incomplete combustion also creates new types of compounds that are not present, or only in minor quantities, in gasoline. Typically these are various PAHs, but aliphatic compounds such as alkenes (e.g., ethene, propene, butadiene, isoprene) and various aldehydes are also being formed. Fuel and engine development has been focused on the reduction of the polycyclic aromatic compounds. Catalytic converters, on the other hand, are effective in removing the main part of volatile hydrocarbons from engine exhaust; however, even in optimal conditions, a fraction remains. In cold starts and in malfunctioning converters, a high proportion of hydrocarbons is released unburned.
The concerns about the harmful effects of engine exhausts have traditionally been focused on polycyclic aromatic compounds (8,9). There has been increasing concern about the effects of alkenes such as ethene, propene, butadiene, and isoprene because their epoxides (metabolites in humans) are carcinogenic (10). The present risk estimates of environmental cancer ascribe approximately as many cancers to butadiene alone as to polycyclic aromatic matter, including PAHs (11). The contribution of butadiene is projected to increase far beyond that of polycyclic compounds; in 2010, it is estimated that butadiene alone will cause approximately five times more cancer than polycyclic material, based on a comparison of motor vehicle exhausts (11).
The extraordinary carcinogenicity of butadiene in rodents influences such projections (8); however, for this compound human occupational data are also becoming available (12). New carcinogenic alkenes are still being detected. Isoprene, an analogue of butadiene that possesses two double bonds capable of cross-linking DNA, has recently been found to be a potent carcinogen in rodents (10). It can be assumed that other dialkenes will be found in vehicle exhaust that will alter the risk estimates for particular compounds. Effort should be focused on DNA binding products of alkenes, including butadiene and isoprene. The occupational groups that are most heavily exposed include tank truck drivers, tank ship unloaders, butadiene manufacturing workers, and garage workers. The method of adduct detection can rely on the newly developed postlabeling technique for monoadducts (7), which has been successfully used in studies of experimental animals exposed to 1-alkenes (Figure 2) (13).

UV-Adducts
For cross-links, the technique used for cisplatinum and UV cross-links can be applied (14,15). This modification of the postlabeling technique (Figure 3) is necessary because cross-linked dinucleotides label very poorly (14,15). In the modification, a normal nucleotide is left on the 5'-side of the cross-linked dinucleotide, resulting in a number of labeled trinucleotides. This kind of modification is so far the only way to label cross-linked products. The results on excised human skin, irradiated at approximately 310 nm and analyzed by HPLC radioactivity detection, are shown in Figure 4.

UV-Adducts and Mutations
The incidence of malignant melanoma and other skin cancers has increased markedly in many countries with primarily fair-skinned populations (16). In Sweden the incidence of malignant melanoma and nonmelanomatous skin cancer has increased more than any type of cancer, representing 4.5 and 3.2% annual increases during the last 20-year time period, respectively. Solar ultraviolet (UV)-irradiation is thought to be an important cause of the nonmelanomatous skin cancer, but it may also contribute to melanoma (16). UV light has complex action on biological organisms and is considered a complete carcinogen, with both initiation and promotion capacities in model systems (16,17). UV irradiation causes specific dipyrimidine adducts in DNA that are likely to be related to the mutagenicity and tumor-initiating potential of UV light. Mammalian cells can repair the adducts at various rates (18). The relationship between DNA repair and cancer is illustrated by several skin diseases such as xeroderma pigmentosum, in which repair defect predisposes to skin cancer. Decreased repair of UV damage also contributes to common skin tumors such as basal cell carcinoma (18).
Specific UV-induced photoproducts may be measured by the novel modification of the 32P-postlabeling technique (15). Further methods development involves adaptation of the method to human skin in situ. UV-induced adducts can be determined in parallel with mutation measurements in the p53 gene. The assay of UV-specific CC to TT mutations in the p53 gene in human skin has been published (19). CC to TT mutations in the p53 gene are rare in internal organs, which implicates UV as the main causative factor. The codons conveying transforming properties should be complemented with silent mutations that give no growth advantage, thus serving as measures of mutation frequency.
UV-induced photoproducts caused experimentally and through suntanning can be studied in an early biologically effective target dose in human skin of healthy individuals and patients with skin diseases. The in situ transformation of the adducts, including those on the p53 gene, to p53 mutations could be measured in several codons as an early indication of potential hazard for skin cancer in individuals. This would tie different photoproducts to mutations that appear relevant to skin cancers in healthy and predisposed humans. Adducts indicate the target dose at a level of a nucleotide in the p53 gene, DNA repair indicates the efficiency of damage removal, and p53 mutations indicate fixation of damage as mutations relating to cancer risk.

Figure 4. Human epidermis exposed to UVB and analyzed by the method used for cross-links. Some cyclobutane dimers (such as TT=C) and 6-4 photoproducts (such as TT-T) are shown in HPLC radioactivity ... (x-axis: retention time, min).

Mutation Studies in Tumor Suppressor Genes
Many cancer-related genes are excessively large (20). This applies to both tumor suppressor genes (Table 1) and to many DNA repair genes. Thus, the screening of the retinoblastoma (Rb) gene with 27 exons requires a huge number of polymerase chain reactions (PCR) because only a few hundred nucleotides can be accurately assayed at one time. There are many up-to-date methods used in the detection of unknown mutations at the gene, mRNA, and protein levels (21-28):
* direct sequencing
* denaturing/constant gradient gel electrophoresis (DGGE/CDGE), < 600 bp
* capillary electrophoresis (CDCE)
* ligation-mediated assay
* single-strand conformation polymorphism (SSCP), ~300 bp
* chemical cleavage, < 2000 bp
* application of mismatch repair enzymes
* protein truncation assays, large
* mRNA level
* functional tests.
Although protein truncation assays (24) are powerful in detecting deletion and frameshift mutations, they fail to detect missense mutations.
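The dependence of the screening workload on the fragment size that each method can handle can be sketched as follows. This is an illustrative calculation only: it assumes each exon is amplified separately and split into fragments no longer than the method's limit, it ignores primer overlap and intronic flanks, and the exon lengths are hypothetical placeholders rather than the actual Rb exon sizes.

```python
import math

# Approximate upper fragment sizes (bp) taken from the method list above.
# Treating "< 600 bp" and "< 2000 bp" as hard ceilings is a simplification.
METHOD_MAX_BP = {"SSCP": 300, "DGGE/CDGE": 600, "chemical cleavage": 2000}


def amplicons_needed(exon_lengths_bp, max_fragment_bp):
    """Count PCR fragments needed when every exon is amplified on its own
    and cut into pieces no longer than max_fragment_bp."""
    if max_fragment_bp <= 0:
        raise ValueError("max_fragment_bp must be positive")
    return sum(math.ceil(length / max_fragment_bp) for length in exon_lengths_bp)


# Hypothetical exon lengths for a 27-exon gene (NOT the real Rb exon sizes).
hypothetical_exons = [150] * 20 + [450] * 5 + [1200] * 2

for method, max_bp in METHOD_MAX_BP.items():
    count = amplicons_needed(hypothetical_exons, max_bp)
    print(f"{method}: {count} fragments per sample")
```

Even with a generous fragment limit, the count cannot fall below the number of exons under these assumptions, and it must be multiplied by the number of samples screened.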
It is of utmost importance to be able to screen mutant species in a reliable and fast fashion. The results would be helpful in the analysis of all mutations, irrespective of whether mutations are inherited or somatic and to the extent that the mutations are scattered in different parts of the genome. This would result in simplified analysis of disease carriers in genetic diseases if parental DNA samples are not available and of samples from somatic mutations in cancer patients, even in large genes. The aim is to widen the main bottleneck in the analysis of mutations.
Capillary electrophoresis has been used extensively in protein sequencing and to some extent in DNA sequencing. The primary feature of capillary electrophoresis is its high separation power, giving a baseline separation of long nucleotide sequences that differ in size by one nucleotide only. For mutational analysis the applications are new but essentially analogous to single-strand conformation polymorphism (SSCP) (25) or denaturant gradient gel electrophoresis (DGGE) (26). In the first application, genomic DNA sequences are amplified, denatured, and analyzed on capillary electrophoresis as single-stranded molecules. In the second application, dubbed constant denaturant capillary electrophoresis (CDCE) (26), melting profiles of the nucleotide sequences are analyzed; it is important that the sequence to be analyzed contain domains melting both at high and low temperatures. This ensures separation of heteroduplexes containing mismatches, as in DGGE, but with a constant denaturant concentration in the capillary. This is different from the SSCP type of analysis in that the samples are analyzed as partially melted heteroduplexes and not as single strands. In both methods, the DNA is available for sequence analysis; however, experience with analysis of certain sequences with capillary electrophoresis will lead to some understanding of the types and locations of the mutations. An extensive sequence analysis is required before the ground rules can be established.
We used the 19 commonly found ras mutants cloned in a plasmid (Figure 5) (27). We have devised primers that allow us to use SSCP- and DGGE-type capillary electrophoresis and sequences of different lengths in order to compare the separation power of the two methods. Most of the mutations could be detected as homoduplexes and the rest as heteroduplexes.

Oncoproteins and Growth Factors
Cellular growth signaling can be divided into four separate stages: a) extracellular growth factors (e.g., platelet-derived growth factor [PDGF], epidermal growth factor [EGF], and transforming growth factor-α [TGFα]); b) growth factor receptors at cell membranes (e.g., PDGF receptor and a common receptor for EGF and TGFα); c) intracellular signaling proteins, G-proteins that interact between the membrane receptors and nuclear processes (these forms involve many multistep pathways and include proteins such as ras [p21] and raf); d) nuclear factors of many functions such as transcription factors, cell-cycle control proteins, and DNA repair proteins, which have many interactions. p53 protein appears to take part in each of these functions.
The α-tocopherol, β-carotene (ATBC) cancer-prevention-trial serum bank including 30,000 middle-aged smoking men can be used to identify the possible association between the level of growth factors and oncoproteins (jointly called oncoproteins) in respect to lung and colorectal cancer. The cancer types were selected because of their high incidence and known increase in mutations or elevation of oncoproteins (29-31). The oncoproteins selected included ras p21 protein, p53 protein, and, for squamous lung cancer only, epidermal growth factor receptor (EGFR). The special advantage of this study, as compared to others carried out in this field, is the large, well-characterized study population, which enables the assessment of the oncoproteins years before clinical diagnosis. Interview data and serological analysis enable the control for confounding variables.
The objectives of this work are fourfold. One objective is to analyze more thoroughly the role of oncoproteins in early stages of cancer. The lag time between detection of oncoproteins in serum and clinical diagnosis of cancer can be assessed in a large number of cancers developed in this population. A second objective is to analyze the prognostic value because many oncoproteins are already being used for prognostic purposes in the treatment of cancer (32,33). As a third objective, the appearance and possible fluctuation of the levels of oncoproteins provide information about irreversibility, which is mechanistically important. And fourth, the variables affecting the normal levels of oncoproteins will also become available.
For predictive and preventive purposes, it is important to develop markers either for general population screening or screening of some risk groups. The criteria of predictivity are well established in general population screening; this generally implies that both false positive (low specificity) and false negative (low sensitivity) results undermine the marker. Based on the multiple pathways to oncogenesis, it would be overly optimistic to assume that one or a few oncoprotein markers would fulfill the criteria for population screening. However, oncoprotein screening may serve a purpose in the case of special groups such as those seeking medical advice about their symptoms, families with a high incidence of cancer, or tobacco smokers or other heavily exposed populations. In such cases, the criteria of population screening do not hold, and the tolerance in predictivity can be lower (34). The direction of false diagnosis largely depends on the individual situation. Yet, it is important that diagnostic efficiency is maintained, for example, that the cost-benefit ratio is reasonable.
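The difference between general population screening and screening of risk groups can be illustrated with the standard relationship between sensitivity, specificity, prevalence, and predictive value. The sketch below uses hypothetical marker characteristics and prevalences chosen only for illustration; none of the numbers are results from the ATBC study or from the references cited.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (positive predictive value, negative predictive value)
    for a marker applied to a group with the given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("predictive value undefined for these inputs")
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical marker: 80% sensitivity, 90% specificity (illustrative only).
SENSITIVITY, SPECIFICITY = 0.80, 0.90

# Hypothetical prevalences: general population vs. a heavily exposed group.
for label, prevalence in (("general population", 0.001), ("risk group", 0.05)):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"{label}: PPV = {ppv:.3f}, NPV = {npv:.4f}")
```

With these assumed values, the same marker yields a positive predictive value below 1% in the general population but close to 30% in the risk group, which is one way to see why a marker unsuitable for population screening may still be useful in selected groups.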

Genetic Epidemiology
Studies have been initiated with the Swedish Cancer Registry (~1.2 million patients) and the Twin Registry (~50,000 twin pairs) to analyze familial cancer in the Cancer Registry, which covers all of Sweden since the 1950s (Figure 6), and to find out to what extent mono- and dizygotic twins are represented in the Registry. Because the Swedish Cancer Registry is one of the largest in the world and the Twin Registry is the largest, these data sources will provide unique patient material for analysis of common cancers using sib pair analysis. However, as the collection of material involves several generations, including a large number of dead persons, logistics have to be worked out for the identification of important genes in common cancers.

Conclusions
This work uses a number of parameters that are currently available to predict the health outcome of the exposures of concern (Figure 7). According to proponents of molecular epidemiology, the present-day findings can be translated to risk estimates in the absence of epidemiological data (35). The relevant exposure data from epidemiological studies date decades back and are usually uncertain (see Figure 7); this impedes direct extrapolation to the risks of the current exposure. Furthermore, individual metabolic factors can be taken into consideration.