Multistage Models of Carcinogenesis by

The simple multistage model of carcinogenesis is outlined. It provides a satisfactory explanation of the power law for the age incidence of many forms of epithelial carcinoma, for the effects in human populations of changing exposures to supposed carcinogenic agents, and for many of the observed effects of applied carcinogens in animal experiments. In particular, the evidence on the effects of starting and stopping cigarette smoking suggests that both an early and a late stage may be affected. In the absence of direct evidence on the nature of the cellular changes there is some reluctance to accept a model with more than two stages, and several forms of two-stage models provide good general explanations of observed phenomena. Such a model has recently been applied to breast cancer; another approach to this disease, effectively involving transformations of the time scale, is discussed.


Introduction
Multistage and related models of carcinogenesis have been discussed for about 30 years, and the growth in the literature has been almost as rapid as the rise of cancer incidence with age. In a short paper I cannot attempt a comprehensive review, and I shall aim to outline the topic in a general way, making more specific comments about some of the points which happen to have interested me over this period. More comprehensive reviews of the mathematical theory have been given by Armitage and Doll (1), Whittemore (2), and Whittemore and Keller (3); Peto (4) has provided a stimulating general review.

Early Work
The flurry of work in the early 1950s, which led to the formulation of a number of related models, was probably motivated by evidence from various sources. First, there was the epidemiological evidence that the mortality or incidence rates for many forms of human cancer increased rapidly with age. This might be a general effect of aging, the body becoming more susceptible to insults of various sorts, or it might be because carcinogenesis is a complex process requiring time and perhaps involving several qualitatively different stages. There were two considerations favoring the second explanation: the fact that people exposed to a high but short-lived carcinogenic risk (for example from irradiation or industrial hazards) often acquire cancer after a long period of time; and the fact that animal experiments such as those of Berenblum and Shubik (5) showed that some chemicals are especially effective either early or late in the induction process (the present terms for these being "initiators" and "promoters"), suggesting that qualitatively different processes were at work during the early and late phases. I shall discuss later some more recent work on the distinction between age per se and duration of exposure to carcinogens.

*Department of Biomathematics, University of Oxford, 5 South Parks Rd., Oxford OX1 3UB, England.
In the reviews mentioned earlier, fuller descriptions of some of the early models are given than can be presented here. They include the "multicell" theory of Fisher and Holloman (6) (requiring a mutationlike change to a specific number of neighboring cells in a tissue, and inconsistent with the unicellular nature of most tumors); and the "multistage" or "multihit" models of Stocks (7) and Nordling (8) (in which a specific number of changes in any order are required). The very similar model of Armitage and Doll (9) introduced the idea of a specific ordering of the changes, so as to accommodate the evidence from initiation-promotion experiments and also a number of features of the epidemiology of human cancer. There was also a series of papers by Iverson and Arley, starting with one (10) which postulated a randomly occurring initiating event followed by a randomly distributed induction period. This rather general formulation encompasses most of the other models, since in a multistage model the first stage can be taken as the initiating event while all subsequent events are subsumed into the induction period (1).

Derivation of the Basic Model
It will be useful to outline the theory of the Armitage-Doll model in slightly different terms to those of the original paper.
Suppose that, in a particular tissue, there are $N$ cells (or cell lines, if they divide) that can potentially experience carcinogenic transformation. The final development of cancer is the $k$-th and last of a series of sudden and irreversible changes (or stages) which must take place in a specific order. The clinical detection of the disease may be delayed by the period required for the tumor to grow to a detectable size: we shall assume that this is a relatively short lag and shall not consider it in any detail. (Many writers systematically replace the current time $t$ by $t - w$, where $w$ is the assumed lag.) Suppose that, for any cell which has experienced $i-1$ changes [which we shall call an $(i-1)$-cell], the event rate for the next change is $\lambda_i$, independent of time. That is, the probability that the $i$-th change takes place in $(t, t + dt)$ is $\lambda_i\,dt + o(dt)$. This defines a time-homogeneous birth process. We should like to know $f(t)$, the event rate for the $k$-th change at time $t$ (the process starting at time 0). General and particular solutions for this problem are well known (11-14) but are algebraically cumbersome. Fortunately, an approximation is adequate for almost all purposes. Consider the position for values of $t$ small enough to make the probabilities of any of the changes in $(0, t)$, in any one cell, very small. We can either take the limit of the general expression as $t \to 0$ (13,14), or use a straightforward argument (9) to show that

$$f(t) \approx \frac{\lambda_1 \lambda_2 \cdots \lambda_k \, t^{k-1}}{(k-1)!} \qquad (1)$$
The cumulative probability, $F(t)$, that the $k$-th change has taken place by time $t$ is

$$F(t) \approx \frac{\lambda_1 \lambda_2 \cdots \lambda_k \, t^{k}}{k!} \qquad (2)$$

Clearly, Eq. (2) cannot hold indefinitely as $t$ increases. However, in most cases lifetime values of $t$ will still be sufficiently small for the limiting assumption (which concerns single cells) to be adequate.
For the particular tissue with $N$ cells, the probability that cancer (i.e., the $k$-th change) has not appeared by time $t$ is

$$1 - G(t) = \{1 - F(t)\}^{N} \approx \exp\{-N F(t)\} = \exp\{-(\alpha/k)\, t^{k}\}, \qquad \alpha = \frac{N \lambda_1 \lambda_2 \cdots \lambda_k}{(k-1)!} \qquad (3)$$

Thus, the distribution function $G(t)$ for the time to appearance of the first cancer is a Weibull distribution, with a density function

$$g(t) = G'(t) = \alpha\, t^{k-1} \exp\{-(\alpha/k)\, t^{k}\}$$

and hazard function

$$h(t) = \frac{g(t)}{1 - G(t)} = \alpha\, t^{k-1} \qquad (4)$$

We have here the familiar power law. The limiting approximation on which it depends seems reasonably secure, since it assumes small rates per cell, but Moolgavkar (15) has pointed out that, for some values of the parameters which are plausible for human cancer, it may appreciably overestimate the hazard to be expected at high ages.
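The quality of the small-$t$ approximation can be checked numerically. The following is only a minimal sketch, assuming for convenience that all $k$ rates are equal (so that the exact single-cell waiting time has an Erlang distribution); the function names and the numerical values of `lam`, `k` and `n_cells` are hypothetical and are not estimates from any data set.

```python
import math

def exact_cdf_equal_rates(t, lam, k, terms=60):
    """Exact P(k-th change by time t) for one cell when all k rates equal lam.

    This is the Erlang distribution function, written as a Poisson tail sum
    to avoid cancellation when the probability is very small.
    """
    x = lam * t
    return sum(math.exp(-x) * x ** j / math.factorial(j)
               for j in range(k, k + terms))

def approx_cdf(t, rates):
    """Small-t approximation of Eq. (2): (product of rates) * t^k / k!."""
    k = len(rates)
    return math.prod(rates) * t ** k / math.factorial(k)

def tissue_hazard(t, rates, n_cells):
    """Power-law hazard of Eq. (4): alpha * t^(k-1), alpha = N * prod(rates) / (k-1)!."""
    k = len(rates)
    alpha = n_cells * math.prod(rates) / math.factorial(k - 1)
    return alpha * t ** (k - 1)

# Hypothetical illustrative values.
lam, k, n_cells = 1e-3, 6, 1e8
rates = [lam] * k

# Ratio of approximate to exact single-cell probability at t = 70.
ratio = approx_cdf(70.0, rates) / exact_cdf_equal_rates(70.0, lam, k)

# Doubling t multiplies the power-law hazard by 2^(k-1).
doubling_factor = tissue_hazard(2.0, rates, n_cells) / tissue_hazard(1.0, rates, n_cells)
```

With these values the approximation exceeds the exact probability by a few percent, which is the direction of error noted above for high ages.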

Human Cancer
The age-specific mortality rates for cancers at a particular site, or more directly the age-specific incidence rates obtained from cancer registries, can be regarded as roughly analogous to the hazard functions described mathematically by Eq. (4), since the denominators of the rates are the numbers of people alive at the ages in question. From Eq. (4),

$$\log h(t) = \log \alpha + (k-1) \log t \qquad (5)$$

and this linear log-log relation has been observed for a wide range of sites and human populations (16,17). It seems to be the usual finding in most epithelial carcinomas, but a variety of quite different relationships is seen for many nonepithelial tumors and for epithelial tumors at sex-specific sites (4). The slope in Eq. (5) is commonly in the range four to six, suggesting there may be around five to seven discrete stages.
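As an illustration of how Eq. (5) is used, the slope of log rate on log age can be estimated by ordinary least squares and the implied number of stages read off as slope plus one. The sketch below uses synthetic rates generated exactly from the power law; the ages, the value of `alpha` and the function name are hypothetical.

```python
import math

def loglog_slope(ages, rates):
    """Least-squares slope of log(rate) regressed on log(age)."""
    xs = [math.log(a) for a in ages]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic rates from h(t) = alpha * t^(k-1) with hypothetical alpha and k.
ages = [30, 40, 50, 60, 70]
alpha, k = 1e-12, 6
rates = [alpha * a ** (k - 1) for a in ages]

slope = loglog_slope(ages, rates)
implied_stages = slope + 1
```

Real registry data would of course scatter about the line, and the cautions discussed in the text apply to any reading of `implied_stages` as a literal count.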
However, there are several reasons for caution. In the first place, several other diseases show rapidly increasing age-incidence curves, and one would not seek to explain them all by models of this sort. Secondly, a power law, with a slope of $k-1$, or something very close to it, could be obtained with fewer than $k$ stages. Suppose some of the stages had rates increasing as powers of the time elapsing since the previous stage. Then the slope $k-1$ would be the sum of the (power + 1) for all stages before the last plus the power for the last; for instance, $k-1 = 4$ would arise from five constant rates, or two linearly increasing rates followed by a constant rate, or a quadratic rate followed by a linear rate. A reductio ad absurdum is to postulate one stage with $\lambda \propto t^{k-1}$; the model then becomes purely tautological. Two-stage models are discussed below.
Third, a similar effect (of a high slope with a small number of stages) will be obtained if one or more of the event rates increases with age (rather than with time since last event).
Fourth, the Weibull hazard, Eq. (4), can be obtained more generally, on the argument (18) that the time to first tumor in a tissue is the minimum of N random variables (the time to tumor in the N cells), and that Eq. (4) is a standard limit of the distribution of minima in large samples. However, for this limiting form to be valid there are restrictions on the shape of the extreme left-hand tails of the distributions of the cell-specific times, namely, that they are power functions like Eq. (1), and this might be taken to provide at least weak support for the multistage theory.
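The bookkeeping behind the second reason for caution above (rates that increase as powers of the time since the previous stage) can be written out in a few lines. This is merely a sketch of the stated rule; the function name is hypothetical.

```python
def implied_slope(powers):
    """Log-log slope implied by a sequence of stages.

    powers[j] is the power of elapsed time in the rate of stage j + 1
    (0 means a constant rate). The slope is the sum of (power + 1) over
    all stages before the last, plus the power of the last stage.
    """
    *earlier, last = powers
    return sum(p + 1 for p in earlier) + last

five_constant = implied_slope([0, 0, 0, 0, 0])   # five constant rates
two_linear_one_constant = implied_slope([1, 1, 0])
quadratic_then_linear = implied_slope([2, 1])
single_tautological = implied_slope([4])          # one stage, rate ~ t^4
```

All four configurations give the same slope of four, which is why the power law alone cannot fix the number of stages.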
Confidence in a multistage model must clearly depend on wider considerations than the power law. In particular, we need to consider the effects of external carcinogenic agents, data from animal experiments, and biological plausibility. These and other topics are taken up in later sections of the paper.
As already noted, the cancers of sex-specific organs tend not to follow the power law. This is understandable since many of these organs are subject to changes in their hormone dependence at various periods throughout life or, like the uterine cervix, are affected by changes in sexual habits. Some tentative explanations of age-incidence can often be given in qualitative terms (9). Some recent quantitative modeling for breast cancer, in terms of a two-stage model, is discussed in a later section.

Animal Experiments and the Effects of Applied Carcinogens
In experiments in which animals receive continuous application of a carcinogenic agent, the time to first tumor commonly follows a distribution close to the Weibull (19,20). Such experiments not only provide a measure of support for the general theory, but also enable one to study the dose-response relationship. In the simple multistage model, suppose that $m$ of the $k$ stages are affected by the carcinogen, so that, for these values of $i$, $\lambda_i = d\,\lambda_{0i}$, where $d$ is dose intensity. Then, from Eq. (4) and the definition of $\alpha$ in Eq. (3), the hazard function should be proportional to $d^m$. It is common to find $m < k$, suggesting that some but not all of the stages are affected by a particular carcinogen.
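The dependence on $d^m$ can be seen directly by scaling the affected rates in the power-law hazard. The following is a minimal sketch; the base rates, the choice of affected stages and the function name are hypothetical.

```python
import math

def hazard_at_dose(t, base_rates, affected, dose, n_cells=1.0):
    """Eq. (4) hazard when each stage listed in `affected` (0-based indices)
    has rate dose * base rate and the remaining stages keep their base rate."""
    k = len(base_rates)
    rates = [dose * r if i in affected else r
             for i, r in enumerate(base_rates)]
    alpha = n_cells * math.prod(rates) / math.factorial(k - 1)
    return alpha * t ** (k - 1)

base_rates = [1e-3] * 5      # hypothetical, k = 5
affected = {0, 3}            # hypothetical: m = 2 stages respond to dose

# Doubling the dose should multiply the hazard by 2^m.
dose_ratio = (hazard_at_dose(10.0, base_rates, affected, 2.0)
              / hazard_at_dose(10.0, base_rates, affected, 1.0))
```

Here `dose_ratio` is close to 4, that is $2^m$ with $m = 2$, whatever the value of $t$.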
Carcinogenic agents may, of course, not be applied at constant rates, and the question arises how the hazard function $h(t)$ is affected if a particular rate constant, say $\lambda_i$, is an arbitrary function of time $\lambda_i(t)$, which in the simplest case might be proportional to the dose intensity $d(t)$ of an applied carcinogen. The answer (9) is that $h(t)$ is proportional to a weighted mean of $\lambda_i(T)$ in $(0, t)$, the weight at time $T$ ($0 < T < t$) being proportional to $T^{i-1}(t-T)^{k-i-1}$. This means that, for small values of $i$ (early stages affected), what matters is the value of $\lambda_i(T)$ at low $T$, whereas for high values of $i$ (say $k$ or $k-1$) the more recent values of $\lambda_i(T)$ carry most weight.
These effects are explored more fully by Whittemore and Keller (3) and by Day and Brown (21).
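The weighting by $T^{i-1}(t-T)^{k-i-1}$ can be explored numerically. The sketch below, restricted to stages $i < k$, integrates a time-varying rate against that weight by the midpoint rule; the exposure patterns, the values of `k` and `t`, and the function names are hypothetical.

```python
import math

def stage_weight(T, t, i, k):
    """Unnormalised weight T^(i-1) * (t - T)^(k-i-1) for stage i, 1 <= i < k."""
    return T ** (i - 1) * (t - T) ** (k - i - 1)

def relative_hazard(t, i, k, rate_i, steps=20000):
    """Midpoint-rule integral over (0, t) of rate_i(T) times the stage weight."""
    h = t / steps
    total = 0.0
    for j in range(steps):
        T = (j + 0.5) * h
        total += rate_i(T) * stage_weight(T, t, i, k)
    return total * h

def early_exposure(T):
    """Hypothetical exposure during the first 20 time units only."""
    return 1.0 if T < 20.0 else 0.0

def late_exposure(T):
    """Hypothetical exposure during the last 20 time units (40 to 60) only."""
    return 1.0 if T >= 40.0 else 0.0

k, t = 5, 60.0
first_stage_early = relative_hazard(t, 1, k, early_exposure)
first_stage_late = relative_hazard(t, 1, k, late_exposure)
penultimate_early = relative_hazard(t, k - 1, k, early_exposure)
penultimate_late = relative_hazard(t, k - 1, k, late_exposure)
```

For the first stage the early exposure dominates, and for the penultimate stage the late exposure dominates, in line with the verbal argument. With a constant rate the integral reduces to a multiple of $t^{k-1}$, recovering the power law.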
In this context one could broadly explain an initiator-promoter experiment by saying that the initiator affects primarily the first step and the promoter primarily a later step (perhaps the second of two). However, Stenback et al. (22-24) have shown that the interpretation of these experiments may be complicated by aging and other effects. Earlier, Peto et al. (25) had carried out some experiments with regular benzpyrene applications to mice, which showed that under these circumstances the incidence of tumors depended on the time since start of exposure and not on age. This result is consistent with the view that the first stage is affected and that its enhanced event rate in the presence of benzpyrene is much greater than the natural background rate. Thus, whether or not it also affects some late stage(s), benzpyrene appears at least to "initiate" the first stage. In contrast with the age-independent effect of an initiator, however, Stenback et al., in experiments similar to (but much longer than) those of Berenblum and Shubik (5), found that the "promoting" effect of TPA declined with age, suggesting a systemic aging effect in the response to TPA. Finally, to illustrate that the opposite effect is possible, Gray et al. (26), in experiments on radon inhalation by rats, found the incidence at a fixed time after start of exposure to increase with age. This is what might be expected if radon affected the second or a later stage, since with increasing age at start of exposure there would be more cells that had already undergone one or more of the early stages spontaneously. We return to the question of age effects in the next section.

Human Data and Exposure to Carcinogens
The considerations outlined in the first two paragraphs of the section titled "Animal Experiments and the Effects of Applied Carcinogens" would be expected to apply to human exposures as well as to animal experiments. One of the most illuminating examples is provided by the effects of starting and stopping smoking at different ages (4,27-29).
Data from prospective studies, such as the British and American data analyzed by Doll (16), show that nonsmokers have a log-log relationship for lung cancer with a slope $k-1$ of about four. For cigarette smokers, the same slope is obtained if time is measured not from birth but from the start of smoking. This is reasonable if smoking enhances one or more of the $\lambda_i$ to such high levels that the naturally occurring changes are very much less frequent than those induced by smoking.
Consider now the effect of stopping smoking. Smokers who stop retain their high rates, but at a constant level, perhaps until the nonsmokers' rates rise to that level. This is precisely what would be expected if smoking affected the $(k-1)$th of $k$ stages, for there would be a pool of ex-smokers with $(k-1)$-cells, waiting for the final change which would occur at a constant rate. In due course, the pool will be augmented by naturally occurring $(k-1)$-cells and the rate will start to rise. Consider, secondly, the incidence rate at a fixed time after start of smoking, as a function of age at starting. The data are sparse but seem to indicate either little effect of age at starting or at most a rather modest positive effect. This would be consistent with an early stage being affected: if the first stage were affected so that $\lambda_1$ were increased dramatically by smoking, the process would effectively start at that point, but if the second stage were affected the number of 1-cells available for further transformation would increase approximately linearly with age at starting. Moreover, general considerations about the delay in the effect on a population of a marked increase in smoking suggest that an early stage is affected.
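The plateau after stopping can be reproduced from the weighting rule for a time-varying rate: for the penultimate stage the weight at time $T$ is proportional to $T^{k-2}$, which integrates in closed form. The sketch below assumes a constant background rate plus a constant excess during exposure; all numerical values and the function name are hypothetical.

```python
def penultimate_stage_hazard(t, k, background, excess, start, stop):
    """Relative hazard at age t when only stage k-1 is affected.

    The stage rate is `background` throughout life, plus `excess` between
    ages `start` and `stop`. Integrates rate * T^(k-2) over (0, t) in closed form.
    """
    p = k - 1
    natural = background * t ** p / p
    upper = min(max(t, start), stop)      # exposure accumulated up to age t
    induced = excess * (upper ** p - start ** p) / p
    return natural + induced

k = 5
# With no background rate the hazard is flat after stopping at age 50.
flat_55 = penultimate_stage_hazard(55.0, k, 0.0, 1.0, 20.0, 50.0)
flat_70 = penultimate_stage_hazard(70.0, k, 0.0, 1.0, 20.0, 50.0)

# With a small background rate the hazard eventually starts to rise again.
rising_55 = penultimate_stage_hazard(55.0, k, 0.05, 1.0, 20.0, 50.0)
rising_70 = penultimate_stage_hazard(70.0, k, 0.05, 1.0, 20.0, 50.0)
```

The induced component freezes at its value on stopping, while the natural component continues to grow as $t^{k-1}$, which is the behaviour described for ex-smokers.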
Thus, different arguments support effects on both early and late stages. Both effects could, of course, be present. Some skin-painting experiments with benzpyrene on mice (19) have suggested an incidence proportional to (dose)$^2$, in turn suggesting that two stages are affected by benzpyrene or that there is one stage with a quadratic effect. A preliminary analysis by Whittemore and Altshuler (30) of the study on British doctors (31) suggested that the incidence rate was proportional to the number of cigarettes, which provisionally implied that one stage was affected proportionally to dose, i.e., that $m = 1$. However, an analysis (32) of a "reliable" subset of the doctors' data suggests a response more than proportional to dose; the estimate of $m$ may be reduced by errors of measurement of smoking habits; and the effect of smoking on a particular $\lambda_i$ may be less than proportional to the daily consumption of cigarettes. The evidence thus points, somewhat loosely, toward the involvement of two stages.
A useful discussion of the effects of removal of carcinogenic exposure in a range of human cancers, as well as in animal experiments, is given by Day and Brown (21).
The concept that a carcinogen may affect only some of the rate constants helps us to understand some of the observed interactions between different carcinogens. In some instances, as in the interaction between smoking and asbestos exposure (33), the effects of the two agents appear to be multiplicative. This would be expected if they acted, with proportionate effects on the rate constants, for two different stages, say the $i$-th and $j$-th, since the hazard function, Eq. (4), involves the product $\lambda_i \lambda_j$. On the other hand, if both agents affect the same rate constant $\lambda_i$, their effects could well be additive.
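The contrast between multiplicative and additive interaction follows from the product form of the hazard. The sketch below compares the two cases, assuming that two agents acting on the same stage add their excess rates; the multipliers and base rates are hypothetical.

```python
import math

def relative_hazard(base_rates, multipliers):
    """Ratio of Eq. (4) hazards, exposed to unexposed.

    At fixed t and N the hazard depends on the rates only through their product.
    """
    exposed = [r * m for r, m in zip(base_rates, multipliers)]
    return math.prod(exposed) / math.prod(base_rates)

base_rates = [1e-3] * 5          # hypothetical, k = 5
a, b = 5.0, 10.0                 # hypothetical rate multipliers for agents A and B

# Agents act on different stages (here the first and the fourth).
different_stages = relative_hazard(base_rates, [a, 1.0, 1.0, b, 1.0])

# Agents act on the same stage and their excess rates add (an assumption).
combined = 1.0 + (a - 1.0) + (b - 1.0)
same_stage = relative_hazard(base_rates, [combined, 1.0, 1.0, 1.0, 1.0])
```

With these values the joint relative hazard is about 50 when different stages are affected and about 14 when the same stage is affected.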

Two-Stage Models: Breast Cancer
In the absence of direct biological evidence about a succession of stages, models with several (five to seven) stages have often been regarded as implausible. A two-stage model with exponential proliferation of the 1-cells has been discussed (34). The exponential growth in the rate constant for the second stage has much the same effect as a low-order polynomial and it is not surprising that the two-stage model with proliferation mimics fairly closely the multistage model with constant rates. Other two-stage models are detailed elsewhere (35-37).
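To see how exponential proliferation can mimic a power law, one can examine the local log-log slope of a two-stage hazard. The sketch below rests on an assumed deterministic approximation, not taken from the text: 1-cells arise at a constant rate $N\mu_1$, each clone grows as $e^{gs}$ after a time $s$, and the second change occurs at rate $\mu_2$ per 1-cell, giving a hazard $N\mu_1\mu_2(e^{gt}-1)/g$. The parameter values and function names are hypothetical.

```python
import math

def two_stage_hazard(t, n_cells, mu1, mu2, g):
    """Assumed deterministic approximation: N * mu1 * mu2 * (exp(g t) - 1) / g."""
    return n_cells * mu1 * mu2 * math.expm1(g * t) / g

def local_loglog_slope(f, t, eps=1e-4):
    """Numerical derivative of log f with respect to log t at t."""
    upper, lower = t * (1 + eps), t * (1 - eps)
    return (math.log(f(upper)) - math.log(f(lower))) / (math.log(upper) - math.log(lower))

def example_hazard(t):
    """Two-stage hazard with hypothetical parameters (growth rate 0.1 per year)."""
    return two_stage_hazard(t, n_cells=1e8, mu1=1e-7, mu2=1e-7, g=0.1)

slope_at_40 = local_loglog_slope(example_hazard, 40.0)
slope_at_50 = local_loglog_slope(example_hazard, 50.0)
slope_at_60 = local_loglog_slope(example_hazard, 60.0)
```

Under these assumptions the local slope between ages 40 and 60 lies roughly between four and six, the range commonly seen for epithelial carcinomas, although unlike a true power law it drifts upward with age.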
Moolgavkar and Venzon have studied a generalization of the model (39) permitting growth also of the 0-cells and have been able to fit data for a wide range of human cancers. The model has been adapted for breast cancer by Moolgavkar, Day, and Stevens (40) who postulate growth in the rate constant for the first initiation during puberty (with menarche following a logistic curve), subsequent proliferation of 1-cells with an enhanced rate during pregnancy, a reduced rate after menopause, and a protective effect of first birth by a subsequent reduction in 1-cell proliferation. They provide extremely impressive fits to data.
Pike and his colleagues (41,42) have obtained equally impressive fits with a model conceptually different from, but very similar in its consequences to, that of Moolgavkar et al. Pike et al. adopt a power law with an index $k-1$ of 4.5, but assume that "time" (as used in the formula) is effectively expanded or contracted during a woman's life. Exposure starts at menarche (for which again a logistic curve is assumed), "time" moves more rapidly during reproductive life, with a temporary spurt during pregnancy, a fall after the first birth and a further fall after menopause. These authors suggest that the constancy of breast cancer rates in postmenopausal Japanese women (in contrast to the rise in other population groups) may be an effect of their low weights and low estrogen levels. The changes in the rate of passage of "time" are equivalent, in the simple multistage model, to the multiplication of all the rate constants by some factor varying throughout a woman's life, and may be motivated by the view that the rate constants depend on the rate of metabolism of stem cells, which may vary in the way indicated. It would, of course, be a rather strong assumption that all the rate constants should remain in the same ratios to each other although varying greatly with time.
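The idea of a transformed time scale can be sketched by integrating a piecewise-constant rate of passage of "time" and applying the power law to the result. This is a deliberately simplified illustration: it omits the logistic curve for menarche and the temporary spurt during pregnancy, and the three rate values, like the function names, are hypothetical rather than those fitted by Pike et al. Only the index 4.5 is taken from the text.

```python
import math

def effective_time(age, menarche, first_birth, menopause,
                   rate_before_birth=1.0, rate_after_birth=0.7,
                   rate_after_menopause=0.1):
    """Accumulated "time" from menarche to `age` under a piecewise-constant rate."""
    segments = [
        (menarche, first_birth, rate_before_birth),
        (first_birth, menopause, rate_after_birth),
        (menopause, float("inf"), rate_after_menopause),
    ]
    total = 0.0
    for lower, upper, rate in segments:
        if age > lower:
            total += rate * (min(age, upper) - lower)
    return total

def relative_incidence(age, menarche, first_birth, menopause, index=4.5):
    """Power law applied to the transformed time scale."""
    return effective_time(age, menarche, first_birth, menopause) ** index

late_first_birth = relative_incidence(60.0, 13.0, 25.0, 50.0)
early_first_birth = relative_incidence(60.0, 13.0, 20.0, 50.0)
```

In this sketch an earlier first birth lowers the accumulated "time" and hence the incidence at age 60, and the slow passage of "time" after menopause flattens the age-incidence curve.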

Low-Dose Extrapolation
Considerable interest has been expressed in recent years in the assessment of low-dose carcinogenicity on the basis of extrapolation toward zero dose from the results of animal experiments in which high doses of test substances are used (43). Setting aside the important questions of the extrapolation from laboratory animal to man, there are serious problems about downward extrapolation within one animal species. The results depend heavily on the assumed nature of the dose-response curve at very low doses (44-47).
One plausible and helpful assumption is, however, suggested by many multistage models. Suppose, as before, that $m$ stages are affected by the carcinogen. Since we are dealing with very low doses it will be inappropriate to assume the $\lambda_i$ to be proportional to the dose intensity $d$, because there may well be background effects, but a linear relation seems reasonable. At fixed $t$, therefore, from Eq. (3), the cumulative incidence at dose $d$ will be

$$P(d) = 1 - \exp\Big\{-\prod_{i=1}^{m} (\alpha_i + \beta_i d)\Big\} \qquad (6)$$

where the parameters $\alpha_i$ and $\beta_i$ absorb the constants and terms involving $t$ in Eq. (3). A more general form is

$$P(d) = 1 - \exp\{-(\theta_0 + \theta_1 d + \theta_2 d^2 + \cdots)\} \qquad (7)$$

where all the $\theta_i$ are nonnegative, and this model has been studied in detail (48-51). In Eq. (7), if $\theta_1 > 0$, the response curve is essentially linear at low doses. This restriction will give more "conservative" assessments (i.e., a given excess risk will be reached at lower doses) than most or all other models proposed. Now, a maximum likelihood estimate (50) of $\theta_1$ may be 0, in which case a steeper curve will be fitted at low doses; however, some nonzero value of $\theta_1$ will always be consistent with the data, and so linear extrapolation can scarcely be excluded as a reasonable procedure.
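The role of the linear coefficient at low doses can be illustrated by evaluating the polynomial form of Eq. (7) at two small doses. The following is only a sketch; the coefficient values and function names are hypothetical.

```python
import math

def multistage_risk(d, thetas):
    """Eq. (7): P(d) = 1 - exp(-(theta_0 + theta_1 d + theta_2 d^2 + ...))."""
    return 1.0 - math.exp(-sum(th * d ** i for i, th in enumerate(thetas)))

def excess_risk(d, thetas):
    """Risk at dose d over and above the background risk at dose 0."""
    return multistage_risk(d, thetas) - multistage_risk(0.0, thetas)

with_linear_term = [0.01, 0.5, 2.0]     # hypothetical, theta_1 > 0
without_linear_term = [0.01, 0.0, 2.0]  # hypothetical, theta_1 = 0

d = 1e-4
# Doubling a small dose roughly doubles the excess risk when theta_1 > 0 ...
ratio_linear = excess_risk(2 * d, with_linear_term) / excess_risk(d, with_linear_term)
# ... but roughly quadruples it when the leading term is quadratic.
ratio_quadratic = excess_risk(2 * d, without_linear_term) / excess_risk(d, without_linear_term)
```

At the same small dose the excess risk with $\theta_1 > 0$ is far larger than with $\theta_1 = 0$, which is the sense in which the linear form is the more "conservative" one.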
Other models have been proposed. Hartley and Sielken (52,53) generalize Eq. (7) to include time. Cornfield and his associates (54-56) have studied a multihit model (i.e., one involving hits occurring in an arbitrary order), which, with certain assumptions about background effects, has similar consequences to Eq. (6). However, Van Ryzin (57) points out that the low-dose linearity of Eq. (6) depends essentially on the assumption that the $\lambda_i$ are asymptotically linear in $d$. This linearity would follow if the background incidence were due to a carcinogenic agent, the dose of which combined additively with the applied dose. This need not be true.

Biological Evidence and Conclusions
In the construction of mathematical models for biological phenomena it is not uncommon to find that theories of quite disparate types provide good fits to the same data. Discrimination between models must then depend partly on general biological plausibility and partly on the ability of the models to explain new data. This is essentially the position with our present topic. As a statistician I can offer no authoritative guide to biological mechanisms. There seems little doubt, though, that the multistage theory, in some form or another, has provided a useful framework for hypothesis formation and for the design of observational and experimental studies. A number of experimental biologists maintain that carcinogenesis is a multistage process (the term 'multistep' is often used) (58,59), with perhaps an initial mutationlike stage of initiation being followed by one or more steps of a different nature (such as the activation of an oncogene). Evans and DiPaolo (60,61) have identified a number of specific stages in the progression of guinea pig fetal cells to neoplasia, such as morphological transformation, anchorage-independent growth, colony forming in agar, etc.
Until and unless we obtain direct evidence about the presence and nature of intermediate stages, any statistical theory is likely to remain largely unfalsifiable, particularly if it is allowed to be modified with the flexibility to which we have become accustomed.
The main contenders for generally applicable theories seem to be (a) the multistage theory, (b) some form of two-stage theory, and (c) the time-transformation theory of Pike and his colleagues. Until we have clear evidence for more than two stages, it seems best to regard the multistage theory, like the dogmas of certain religions, as permitting either a literal or a figurative interpretation. That is, one can either assume that there really are $k > 2$ separate stages or one can regard some of the intermediate stages as being fictional shorthand for a single proliferative stage. There does seem a need to preserve at least two stages, so that we can distinguish between "early" and "late" effects of carcinogens.
The explicit two-stage models, with appropriate assumptions about proliferation, seem to explain many of the known facts. However, the observations on the effects of starting and stopping smoking, described above, suggest that at least three stages are involved for lung cancer (two affected by smoking and a final stage). Moreover, the multiplicativity of the effects of asbestos and radiation with that of smoking suggests at least a third stage. In any proliferative model, the precise nature of the proliferative parts of the process is likely to remain indeterminate until and unless direct biological observations become available.
The time-transformation model is relatively new and its full consequences have not, as far as I know, been explored. In one sense, it avoids some of the assumptions of the other models, in that the power law can be invoked as an empirical observation without any reference to stages. On the other hand, the particular way in which the response function is modified by changing circumstances (which we have seen would be equivalent to changing the rate for each of a number of stages by the same multiple) seems more specific than is required by other models, and it is unclear whether the model provides a suitable explanation for initiator-promoter data or other data in which an early or a late effect is indicated.
In many areas of biomathematics the ingenuity of the mathematician often seems to run ahead of the ability of the biological scientist to provide the data needed to validate the mathematical models. In the study of carcinogenesis it is encouraging to see, to an increasing extent, the close cooperation between mathematicians and statisticians on the one hand, and biologists on the other, and I believe that in this sort of collaboration lies the key to the solution of some of the problems I have discussed in this paper.

I am grateful to Mr. Richard Peto for helpful comments on the first draft of this paper.