Profound synchrony of age-specific incidence rates and tumor suppression for different cancer types as revealed by the multistage-senescence model of carcinogenesis

The age-specific trend of cancer incidence rates, but not its magnitude, is well described employing the multistage theory of carcinogenesis by Armitage and Doll in combination with the senescence model of Pompei and Wilson. We derived empirical parameters of the multistage-senescence model from U.S. Surveillance, Epidemiology, and End Results (SEER) incidence data from 2000–2003 and 2010–2013 for The Cancer Genome Atlas (TCGA) cancer types. Under the assumption of a constant tumor-specific transition rate between stages, there is an extremely strong linear relationship (P < 0.0001) between the number of stages and the stage transition rate. The senescence tumor suppression factor for 20 non-reproductive cancers is remarkably consistent (0.0099±0.0005); however, five female reproductive cancers have significantly higher tumor suppression. The peak incidence rate for non-reproductive cancers occurs at a younger age for cancers with fewer stages and their carcinogenic stages are of longer duration. Driver gene mutations are shown to contribute on average only about a third of the carcinogenic stages of different tumor types. A tumor’s accumulated incidence, calculated using a two-variable (age, stage) model, is strongly associated with intrinsic cancer risk. During both early adulthood and senescence, the pace of tumor suppression appears to be synchronized across most cancer types, suggesting the presence of overlapping evolutionary processes.


INTRODUCTION
Cancer incidence rates in adults generally rise exponentially with post-pubertal age, peak around 80 years, and then approach zero after age 100, although estimates are higher in centenarians if incidental tumors discovered at autopsy are included [1]. Mortality rates show a similar age-dependent trend delayed by survival, the time from clinical manifestation of the cancer until death [2,3].
For most cancers in adults, the rise in the age-specific rate of cancer incidence ASR(t) can be represented by probabilities of occurrence of successive independent stages and by a power function [4]. The probability per unit time of the i-th stage occurring, up to a total of k stages, can be denoted by $p_i$. Using the notation and simplification described in the review by Frank [5], we assume that for each cancer these probabilities are equal ($p_1 = p_2 = \dots = p_k = u$). Therefore, if each stage of a unicellular or multicellular process proceeds at a small, roughly constant stage-transition rate u per year, the probability of any step occurring after t years is $1 - e^{-ut} \approx ut$. At age t the probability that k−1 steps have occurred is approximately $(ut)^{k-1}$, and the final stage-transition rate is u (yr⁻¹); therefore, the approximate rate (incidence) of occurrence at time t is
$$\mathrm{ASR}(t) \approx u\,(ut)^{k-1} = u^{k}\,t^{k-1} \tag{1}$$
Nordling [6] and Armitage and Doll [7] accounted for and modeled age-related cancer incidence as a power function that is dependent on a discrete number of ordered sequential stages:
$$\mathrm{ASR}(t) = \frac{u^{k}\,t^{k-1}}{(k-1)!} \tag{2}$$
Generalizing the factorial function to the gamma function as Γ(k) = (k−1)! for natural numbers k,
$$\mathrm{ASR}(t) = \frac{u^{k}\,t^{k-1}}{\Gamma(k)} \tag{3}$$
To accommodate the old age decline in this multistage power function, Pompei and Wilson [8] developed their "beta model" of senescence, in which the empirical tumor suppression term (1−bt), with $0 \le t \le b^{-1}$, is based on the linear decline in cellular senescence or cell population doubling b (yr⁻¹) and is equal to unity at birth for cancers of non-reproductive organs (henceforth referred to as "non-reproductive cancers"), or alternatively around the time of puberty, 15 years, for cancers of reproductive organs or "reproductive cancers". Thus, t represents age for non-reproductive cancers, and (age − 15) ≥ 0 for reproductive cancers.
$$\mathrm{ASR}(t) = a\,t^{k-1}\,(1 - bt) \tag{4}$$
Senescence tumor suppressor b (yr⁻¹) is a constant parameter and a is a variable parameter that can be substituted according to Equation 3, that is, $a = u^{k}/\Gamma(k)$:
$$\mathrm{ASR}(t) = \underbrace{\frac{u^{k}}{\Gamma(k)}}_{A}\;\underbrace{t^{k-1}}_{B}\;\underbrace{(1 - bt)}_{C} \tag{5}$$
where A is the "stage-transition-rate tumor suppression" term that is age-independent and exponential, B is an age-dependent "power law" growth term, and C is an age-dependent linear "senescence tumor suppression" term.
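The three-term structure described above (an age-independent stage term, a power-law growth term, and a linear senescence term) can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard Armitage-Doll prefactor u^k/Γ(k) and the exponent k−1 stated in the text; the function name and the demonstration parameter values are hypothetical and are not fitted values from this study.

```python
import math


def multistage_senescence_asr(t, u, k, b):
    """Return the modeled incidence rate (per person-year) at model time t.

    t : years since birth (non-reproductive) or since age 15 (reproductive)
    u : stage-transition rate per year, assumed equal for every stage
    k : number of stages (may be non-integer, hence the gamma function)
    b : senescence tumor suppressor per year
    """
    if t <= 0:
        return 0.0
    senescence = 1.0 - b * t             # term C, linear decline with age
    if senescence <= 0:
        return 0.0                       # model is only defined up to t = 1/b
    stage_term = u ** k / math.gamma(k)  # term A, age-independent
    growth = t ** (k - 1)                # term B, power-law rise
    return stage_term * growth * senescence


# Hypothetical parameter values chosen only for illustration.
u_demo, k_demo, b_demo = 0.02, 6.0, 0.0099
for age in (40, 60, 80, 100):
    rate = multistage_senescence_asr(age, u_demo, k_demo, b_demo)
    print(f"age {age:3d}: {rate * 1e5:.2f} per 100,000 person-years")
```

Setting b to zero recovers the pure power law, which makes it easy to see how much of the old-age downturn comes from the senescence term alone.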
Harding et al. [9] produced best fit curves of the multistage-senescence model to U.S. Surveillance, Epidemiology, and End Results (SEER) program age-specific cancer incidence for three decades of data (1979-1983, 1989-1993, and 1999-2003). In a similar manner, we investigate the multistage-senescence model of carcinogenesis and model-fit to the age-specific rate of cancer incidence from SEER data sets for 2000-2003 and 2010-2013 [10] of The Cancer Genome Atlas (TCGA) cancer types [9,11]. We show that just two variables (age and the number of carcinogenic stages) have a major influence, not only on the trend, but on the magnitude of age-specific incidence rate for all cancer types analyzed. In addition, the multistage-senescence model of carcinogenesis indicates that the number of stages is positively related to the age of peak incidence rate and inversely related to the stage-transition rate. Finally, the U.S. SEER data were matched to the cancer types from the National Cancer Institute TCGA data set to allow an estimate of the proportion of the total number of stages associated with driver gene mutations.

RESULTS
The age-specific rate of cancer incidence was estimated following the methods of Harding et al. [9] from U.S. Census data [12] and the SEER cancer registry data for 2000-2003 and 2010-2013 [10] of 23 non-reproductive cancers as well as five female-specific cancers and two male-specific cancers. The parameter values are given in Table 1A-1C for the geometric mean of the stage-transition rate uμ, senescence tumor suppressor b, and stages k for the weighted model-fits to SEER 2010-2013 incidence rates employing the multistage-senescence model (Equation 5) for males and females separately as well as for both sexes pooled. We will use the terms "both" or "both sexes pooled" to refer to the case when all incidence cases (both male and female, M&F) of non-reproductive cancer types were considered and corresponding estimates produced. We will use the term "male" when male-only cancer incidence data (both reproductive and non-reproductive) were tabulated and analyzed to produce estimates, and analogously for "female." When we refer to "male and female," the estimates were produced separately for males and females and the estimates (not the incidence data) were pooled. Exact P values are given unless the statistical test is not significant after multiple-testing correction.
Results from 2010-2013 are presented in this section and in Supplementary Table 1A
Table footnote: Abbreviations for non-reproductive cancer types and footnotes 1-4 are explained in Table 1A. LGG, PCPG and READ are cancer types with P values for model-fitted uμ or k larger than 0.1 and were omitted from further analysis.

Senescence tumor suppressor
We fit the multistage-senescence model (Equation 5) to cancer incidence rate and analyzed the trends in the senescence tumor suppressor b, a parameter linked to declining cancer rates in very old age, with sex and different cancer types. There is no significant variation (when adjusted for multiple testing) in b (yr⁻¹) for non-reproductive cancers between sexes, according to the paired t test, with the b value for female cancers on average only about 2% smaller than that for males for the same cancer (n = 20, P > 0.1, Holm method). The mean b value for female non-reproductive cancers is 0.0098±0.0008 (SD, n = 20) and for males it is 0.0100±0.0008 (SD, n = 20, omitting rectal cancer, READ, to compare paired values). The mean value of b for non-reproductive cases of both sexes pooled, omitting READ, is 0.0100±0.0005 (SD, n = 20), and 0.00991±0.0006 (SD, n = 21) including READ.
The mean value of b for all (non-reproductive and reproductive) male cancers is 0.0106±0.0030 (SD, n = 23, Table 1A, males) and for females 0.0102±0.0011 (SD, n = 25, Table 1B, females). We also performed multiple comparison tests, which are described in Supplementary Note 1. The term (1 − bt) is zero when t = b⁻¹ = 98.2 years for females and 94.1 years for males.
Table footnote: Abbreviations for non-reproductive cancer types and footnotes 1-4 are explained in Table 1A. LGG and PCPG are cancer types with P values for model-fitted uμ or k larger than 0.1 and were omitted from further analysis.
The mean b value for female reproductive cancers is 0.0117±0.0008 (SD, n = 5, Table 1B, female). The b values of female reproductive cancers compared to the b values of their non-reproductive cancers (given above) are significantly different (P = 0.002), according to the Welch two-sample t test. This assessment was not repeated for male reproductive cancers as only two cancer types were analyzed (Table 1A, male).
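The Welch comparison above can be approximated from the rounded summary statistics quoted in this section. The sketch below is an illustration only: it computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom from means, standard deviations, and sample sizes, whereas the reported P value was presumably obtained from the individual fitted b values. The helper name is hypothetical, and converting t to a P value would additionally require a Student t distribution function from a statistics library.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and approximate degrees of freedom from summaries."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for unequal variances
    dof = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, dof


# Rounded values quoted in the text: female reproductive vs non-reproductive b.
t_stat, dof = welch_t_from_summary(0.0117, 0.0008, 5, 0.0098, 0.0008, 20)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Because the inputs are rounded, the result should be read as an order-of-magnitude check on the reported significance rather than a reproduction of it.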

Stages, age of peak incidence, and stage-transition rate
We analyzed the trends in the number of cancer stages k, age of peak incidence (yr), and geometric mean of the stage-transition rate uμ (yr⁻¹) of the multistage-senescence model (Equation 5) with sex and different cancer types. There is a significantly greater (Welch two-sample non-paired t test, P = 0.019) mean number of stages of female non-reproductive cancers (k = 5.9±1.5 SD, n = 20) compared to their reproductive cancers (k = 3.7±1.4 SD, n = 5) (Table 1B, female).
The age of peak cancer incidence rate is dependent on the tumor suppression b and the number of stages k the cancer passes through until diagnosed; there is no dependence on the geometric mean of the stage-transition rate uμ. By taking the derivative of Equation 4, the age of peak incidence rate can be calculated (Equation 7), and consequently the model-fitted peak incidence rate (per 100,000 person-years). The age at peak incidence rate increases in a non-linear manner with k for male and female cancers (Figure 1A, 1B). There is an extremely significant Spearman's rank correlation with k for age at peak incidence rate, both model-fitted […]
The model-fitted cumulative probability of being diagnosed with at least one of the cancers considered is 41% for females and 55% for males, computed using Equation 8. These values are similar to 42% for females and 56% for males calculated from the SEER data directly.
There is an extremely significant linear relationship (P < 0.0001) of the geometric mean of the stage-transition rate uμ with respect to the number of stages k (Figure 2A, 2B). Given the assumption that stages occur at the same rate, Figure 2A, 2B indicate that the rate of change of each stage is faster when there are more stages in a cancer. However, if the assumption does not hold, one possibility that would be consistent with this pattern is that rates are slower during the initial stages of cancer initiation and increase during later stages (see Discussion). The linear regression analyses of the geometric mean of the stage-transition rate uμ versus the number of stages k were compared for males and females by one-way analysis of variance (ANOVA), and the slopes and intercepts did not differ significantly between paired regions (Supplementary Note 2).
There is an extremely significant difference in uμ for non-reproductive cancers between sexes, according to the paired t test, with the uμ value for male cancers on average about one-third greater than that for females for the same cancer (n = 20, P < 0.0001).

Two-variable model
In previous publications on the multistage and multistage-senescence models, only the trends in cancer incidence rate were analyzed, whereas we developed a two-variable model (age and the number of carcinogenic stages) that uniquely takes account of the relative magnitude of incidence rate of different cancers. An age- and stage-dependent model is obtained by substituting in Equation 5 for u: […] where constants b, c, and d equal 0.0099, 0.0046, and 0.0087, respectively, as assessed for non-reproductive cancers of both sexes pooled (Figure 3). The effect of the senescence tumor suppressor factor in the two-variable model is considerable, as assessed from the increase in cumulative probability of cancer over a lifetime when b is changed from 0.0099 (mean of non-reproductive cancers for both sexes pooled) to zero. For cancer stages k = 2, 3, 4, 6, and 8, the cumulative probability up to 101 years (=1/b) was reduced by 67, 75, 80, 85, and 88%, respectively (Supplementary Note 3), indicating greater tumor suppression b for the more complex cancers of longer latent periods. The age of peak incidence rises from 67 to 76, 81, 87, and 90 years when b = 0.0099 as k increases from 2 to 3, 4, 6, and 8 (Equation 7). Complex cancers, compared to those with few stages, also have a higher cumulative incidence as is evident in the rise of probability over lifespan from 0.00034 to 0.0055, 0.0075, 0.0094, and 0.011, as k increases from 2 to 3, 4, 6, and 8.
Figure caption (fragment): […] (Table 1A). (B) Red triangles represent values for females (Table 1B). The gray bars indicate one standard error in the estimate of the parameters by nonlinear least squares.
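The reported reductions in cumulative probability (67, 75, 80, 85, and 88% for k = 2, 3, 4, 6, and 8) can be checked with a short calculation. The sketch below assumes that the incidence rate is proportional to t^(k−1)(1 − bt), that cumulative probability is well approximated by the integrated rate (reasonable when the probability is small), and that the age-independent prefactor cancels in the ratio. Under those assumptions the reduction has the closed form k/(k+1); the function names are hypothetical.

```python
def integrate_midpoint(f, lo, hi, steps=20_000):
    """Simple midpoint-rule integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


def suppression_reduction(k, b):
    """Fractional drop in integrated incidence up to age 1/b due to (1 - b*t)."""
    horizon = 1.0 / b
    with_term = integrate_midpoint(
        lambda t: t ** (k - 1) * (1.0 - b * t), 0.0, horizon
    )
    without_term = integrate_midpoint(lambda t: t ** (k - 1), 0.0, horizon)
    return 1.0 - with_term / without_term


b = 0.0099  # mean value quoted for non-reproductive cancers, both sexes pooled
for k in (2, 3, 4, 6, 8):
    numeric = suppression_reduction(k, b)
    closed_form = k / (k + 1)  # from integrating t**(k-1) * (1 - t/T) exactly
    print(f"k = {k}: numeric {numeric:.3f}, closed form {closed_form:.3f}")
```

The closed form shows directly why the reduction grows with k: the more steeply incidence rises with age, the larger the share of lifetime risk that falls in the years where the senescence term is strongest.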
There is a 1,800-fold range in the cumulative probability over lifespan of different cancers derived from SEER data (second-last column, Table 1A-1C), from 0.00012 for male adrenal cancer, ACC, to 0.22 for prostate cancer, PRAD. The SEER to two-variable model ratios of the cumulative probability over lifespan were evaluated (last column, Table 1A-1C) and were highest for prostate cancer (40) and breast cancer, BRCA (30), which have substantial environmental or extrinsic risk proportions [13]. When considering the non-reproductive cancers of either males or females, relatively high SEER to two-variable model ratios ranging from 3.2 to 7.8 are also found for bladder (BLCA), colorectal (COADREAD), lung (LUAD), skin (SKCM), and thyroid cancers (THCA), which correspond with cancer risk that is greatly influenced by extrinsic factors.
The ratios of known extrinsic to intrinsic incidence rates for 13 non-reproductive cancer types were obtained from data reported by Wu et al. [13], matching their cancer types to SEER TCGA cancers as closely as possible. The SEER to two-variable ratio of the cumulative probability over lifespan was strongly associated with the extrinsic to intrinsic cancer risk ratio (Spearman's ρ = 0.73, P = 0.0047, n = 13).

Driver mutations
In this subsection, the role of reported mutational cancer driver genes is quantified as a portion of the number of carcinogenic stages identified using the multistage-senescence model. The mean number of driver mutations as assessed by Iranzo et al. [14] for 20 TCGA non-reproductive cancers (excluding glioma, LGG and neuroendocrine tumors, PCPG) of both sexes pooled was 1.7 compared to 6.2 stages obtained by best fit to the multistage-senescence model (Table 1C). Therefore, on average about two-thirds of the cancer stages do not involve driver genes. This premise assumes that the multi-stage model is valid, and that each driver mutation contributes to a single stage, which may not be the case (see Discussion).
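The arithmetic behind the driver-gene share can be laid out explicitly. The sketch below uses only the two means quoted above (1.7 driver mutations and 6.2 stages) and the unweighted regression line y = 0.14 k + 0.86 reported with Figure 4; the variable names are illustrative. It computes a ratio of means, which need not equal the mean of per-cancer percentages reported elsewhere in this section.

```python
mean_drivers = 1.7  # mean driver mutations, 20 non-reproductive cancers
mean_stages = 6.2   # mean fitted number of stages k for the same cancers

driver_fraction = mean_drivers / mean_stages
non_driver_fraction = 1.0 - driver_fraction
print(f"share of stages with a driver mutation: {driver_fraction:.1%}")
print(f"share of stages without one: {non_driver_fraction:.1%}")

# Regression line reported for driver mutations versus stages.
slope, intercept = 0.14, 0.86
predicted_drivers = slope * mean_stages + intercept
print(f"regression prediction at k = {mean_stages}: {predicted_drivers:.2f}")
```

The regression evaluated at the mean number of stages returns a value close to the mean number of driver mutations, as expected for a least-squares line, which gives a quick consistency check on the quoted coefficients.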
We investigate whether the number of stages k is related to published evaluation of the number of driver genes for various cancer types using the 2010-2013 SEER data (Figure 4). The outlier endometrial cancer, UCEC (standardized residual ≥3.0), was excluded from the analysis as it was the only type of cancer that demonstrated more driver genes than the number of stages. Endometrial cancers consist of four categories with highly variable mutation frequency and copy number [15]. Driver mutations from Iranzo et al. [14] are linearly correlated to a moderate degree with the number of stages in the 2010-2013 data (Pearson's r = 0.42, P = 0.033, n = 26) and in the 2000-2003 data (Pearson's r = 0.39, P = 0.054, n = 25). In terms of percentage contribution from driver mutations to the number of stages, the mean is 34% (similar in value to that derived from Table 2) with a wide 95% confidence interval of 9.9% to 64.3%, possibly partly due to differing contributions of driver genes to carcinogenesis as detailed in the Discussion. Comparing the number of stages with the number of driver genes from Bailey et al.
Figure 4 caption (fragment): […] (Table 1C, n = 20) are represented by black circles. Female-specific cancers BRCA, CESC, OV, UCEC, and UCS are represented by red triangles, and male-specific cancers PRAD and TGCT are represented by blue squares. UCEC is omitted as an outlier (standardized residual ≥3.0). The linear regression fit to the data is y = 0.14 k + 0.86, where y is the number of driver mutations as assessed by Iranzo et al. [14]. The regression is not weighted. The size of the points is proportional to the number of TCGA cases of that cancer type for illustration only.

DISCUSSION
This analysis shows that the synchrony in both the rise and fall of U.S. SEER cancer incidence in aging adults is profound, with the possible evolution of the stage-transition rate and senescence tumor suppression mechanisms. However, the task of disentangling and quantitatively identifying aging effects on cancer etiology is recognized to be especially difficult, with on the one hand effects that promote cancer such as inflammation and genomic instability [17] and on the other hand senescence-related defensive systems that suppress cancer.

Senescence tumor suppression
The tumor suppression factor b (yr⁻¹) is relatively constant for non-reproductive cancers regardless of cancer type or sex. Senescence tumor suppression is assumed to increase linearly with age. There are various potential cellular mechanisms that likely contribute to the suppression of malignant cancers in old age such as telomere erosion, Hayflick limit, stem cell exhaustion, senescent cells, accelerating systemic mass loss, and epigenetic aging changes [18-20]. For example, apoptosis rates in bone marrow increased from ~7% of cells, both in 0- to 9-year-olds and in 50- to 59-year-olds, to three-fold more in 80- to 100-year-olds [21]. The tumor suppressor factor is of a similar value whether the malignancy is mesothelioma primarily arising from inhaled asbestos, cervical cancer from exposure to human papillomavirus, melanomas from sun exposure, or post-menopausal breast cancer with elevated insulin levels. The constancy of the tumor suppression factor indicates tissue adaptation to various extrinsic environmental causes of cancer. The suppression of malignant tumors is accompanied by an increase in the prevalence of benign cancers. Imaida et al. [22] studied autopsies of 871 Japanese patients aged 48 to 113 years at death and found that the ratio of prevalence of latent cancers (those not diagnosed clinically) to cancers with metastasis increases from 0.64 in 48- to 84-year-old patients to 1.4 in older patients. Although there is evidence of mosaic aging of normal tissues [23], overwhelmingly there is synchrony in systemic aging [24] and the tumor suppressor factor, which is accompanied by the decline of cancer incidence rates for all cancer types in the aging adult.
The aggregated SEER cumulative probability over lifespan of 20 non-reproductive cancers analyzed is 73% greater for males than for females (Table 1C). In parallel with this, males have a greater acceleration of mass loss than females as measured in some major organs and body cell mass of normal populations [24]. This may be indicative of a greater rate of aging in males [25]. DNA mutation accumulation is greater in sex-specific cancers [26]. The tumor suppression of female reproductive cancers is shown to be stronger than for female non-reproductive cancers. A possible tumor suppressive mechanism is the early onset of mass loss in female reproductive organs compared to non-reproductive ones (breast, ovary, and uterus at about 25, 35, and 21 years of age, respectively) and an elevated decline in mass loss of functional tissue (breast, ovary, and uterus lose about 35, 46, and 35% mass, respectively, from 25 to 70 years of age) [24]. This greater suppression of female reproductive cancers may be an evolutionary adaptation to counter estrogen having a strong proliferative effect.
The interplay between the p53 master tumor suppressor and insulin-like growth factor 1 (IGF-1), which stimulates the mammalian target of rapamycin (mTOR), is critical to normal cell growth and carcinogenesis: viable p53 down-regulates these two highly evolutionary-conserved pathways [27]. In fact, insulin, growth factors and amino acids all activate the mTOR pathway, which stimulates protein synthesis and cell growth [25]. IGF-1 production provides an important protein determining post-natal growth and growth hormone (GH) signaling. On the adverse side, IGF-1, although not directly mutagenic, is a potent mitogen and a cancer risk factor; for example, prospective blood samples show elevated IGF-1 levels in individuals later diagnosed with prostate and pancreatic cancer [28,29]. IGF-1 has also been implicated in increased cancer risk of breast, colorectal, lung, and other cancers [30]. The down-regulation of the GH/IGF-1/insulin system decreases cancer risk and increases longevity in animal models; however, in humans the results are somewhat contradictory, although genetic studies of the GH/IGF-1/insulin system support their involvement in human longevity [31]. Cancer is virtually unknown in patients with congenital IGF-1 deficiency, exhibiting dwarfism and obesity [32]. A meta-analysis study confirms that the prevalence of most cancers increases with adult height, which is influenced by hormone levels, especially growth factors [33].
Multiple studies of serum IGF-1 concentrations show an increase in adolescence, a peak during puberty, and then initially in adulthood a rapid decline; thereafter IGF-1 levels change more slowly to about a fifth to a tenth of the maximal value at 80 years of age [34,35]. For both sexes, the largest, multicenter study [36] measured an annual fractional decline in IGF-1 activity (estimated as the gradient over the intercept of a linear trend between 25 and 80 years) of about 0.0087, which is similar to the tumor suppressor factor b value of 0.0099. Growth hormones can have a profound influence on both normal and carcinogenic tissues. Organ functional mass loss, greater than fat-free mass loss, starting early in adulthood and accelerating in old age may be indicative of the reduction in primary growth hormones and metabolic rate with adult age [24,37]. Consequently, the systemic cancer suppressor factor b and involutional changes could be associated with the declining plasma IGF-1 in aging adults.
Cancer and aging are inextricably interconnected. In 1957, Williams [38] proposed the theory of antagonistic pleiotropy: genes such as TP53 or transforming growth factor-β (TGF-β), and biological processes that enhance reproductive success early in life, lead to an evolutionary trade-off, with later fitness decline and death. In this view, p53-dependent replicative senescence would be one such biological process. In terms of cancer, p53, replicative senescence, and indeed growth hormones are a double-edged mechanism: they can both advance and impede oncogenesis [39,40]. Contemporary observations are that longevity-enhancing, protective, genetic variants become more prevalent with increasing age of the very old [41]. The heritability of living to the mid-80s is only 20-30% (twin study [42]); however, the heritability of living past 100 is between 33% (females) and 48% (males). Hence, there is the possibility that the evolution of tumor suppression, although associated with frailty, nevertheless counteracts cancer in the very old.

Stages
This work demonstrates that adult cancers with an early age of peak incidence rate (yr) have fewer cancer stages k than more complex cancers that reach maximum incidence rate at a later age (Figure 1A, 1B). The multistage model is based on cellular changes that are specific, discrete, and stable and that proceed in a unique order, although the changes are not necessarily gene mutations [7,43]. Malignant tumors typically acquire a range of biological capabilities or hallmarks that include proliferating in a sustained manner, evading growth suppression, resisting cell death, acquiring replicative immortality, inducing angiogenesis, and activating invasion and metastasis [17]. Biological mechanisms that support these hallmarks include genetic mutations and epigenetic modifications [5,44]. Most cancer cell lines also exhibit very short telomeres but escape replicative senescence through mechanisms such as telomerase activation or telomeric recombination [45-47]. Four-fifths of tumors are solid tumors, which generate a vascular blood supply to deliver nutrients and oxygen and so enable growth beyond a few millimeters. Two-thirds of solid tumors in a Norwegian population were registered with metastases at death [48]. This process involves genetic and epigenetic changes in which cells commonly change their phenotype, such as during epithelial-to-mesenchymal transition or mesenchymal-to-epithelial transition, and exhibit hybrid features via intermediate or partial states [49,50].

Stage-transition rate
The geometric mean of the stage-transition rate uμ (yr⁻¹) has a wide range from 0.00064 to 0.040 for the various cancer types (Table 1A-1C) and its increase with the number of stages k is remarkably robust (Figure 2A, 2B). For example, testicular cancer, TGCT, with 1.5 stages has a value of uμ of 0.0013, whereas for bladder cancer, BLCA, with 8.5 stages, uμ is 0.036 for both sexes pooled. Therefore, if the number of cancer stages is small, the duration of each stage is longer. This finding may result from earlier stages being of longer duration than later ones. One could speculate that if, contrary to our findings, the transition rate was initially rapid, then the cumulative incidence of the cancer could be extremely high by middle age, so there would be strong evolutionary pressures to reduce the incidence of the cancer. This pressure could result in additional obstacles to the cancer's formation, which would then require additional mutations for a cancer to overcome. In this way, a fewer-stage cancer could be converted to a many-stage cancer. Therefore, it appears that cancer cells out-compete aging, slower-dividing, normal cells by acquiring quickening carcinogenic stages via Darwinian selection that widens the cancer cell traits from those of the initial cell, and in the process augments the competitive advantages of cancer cells (especially high-stage tumors) over those of the surrounding non-cancer cells [51].
Several tumor suppression factors have been posited as influencing the generally lower cancer rates in women than men, including adult females having generally shorter stature, longer telomeres, less telomere attrition and lower rates of thymic involution, healthier lifestyles, better T cell production, and more robust p53 response [52-54]. An exception is DNA methylation, which is the most accurate parameter of biological aging and a tumor suppression factor that has equal influence in men and women [20]. We identified some additional gender-specific carcinogenic factors by comparing the parameters of the multistage-senescence model derived for 20 types of non-reproductive cancers in males and females (Table 1C). For example, cancers in males have ~14% more stages than those in females (P = 0.0033). Intriguingly, the geometric mean of the stage-transition rate uμ in males is 33% greater than that of females (P < 0.0001). This is a novel explanation of male susceptibility to carcinogenesis. However, it is worth further consideration because a small change in the stage-transition rate u results in a large change in the incidence rate, as u is raised to the power k. Surprisingly, there is no sex-dependent significant difference in the extrinsic-risk-dependent parameter, the SEER to two-variable model ratio of the cumulative probability over a lifetime, which tentatively suggests that male susceptibility is not due to lifestyle.
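The sensitivity of incidence to the stage-transition rate follows from u being raised to the power k. The sketch below shows the fold change in the u^k term when u is multiplied by 1.33 (the male-to-female ratio quoted above) for a few illustrative stage counts. It holds k fixed and ignores every other term, so it illustrates the leverage of u only; it is not an estimate of the actual male-to-female incidence ratio, since the fitted k also differs between the sexes. The function name is hypothetical.

```python
def incidence_fold_change(u_ratio, k):
    """Fold change in u**k when u is scaled by u_ratio and k is held fixed."""
    return u_ratio ** k


u_ratio = 1.33  # male uμ is reported as 33% greater than female uμ
for k in (2, 4, 6, 8):
    fold = incidence_fold_change(u_ratio, k)
    print(f"k = {k}: incidence term changes {fold:.1f}-fold")
```

For a six-stage cancer the same 33% difference in u corresponds to a more than five-fold difference in the u^k term, which is why modest differences in the fitted rate can matter.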

Two-variable model
Although the two-variable model based on age- and stage-dependence is too broad a stroke to characterize cancer incidence for all cancer types, it is instructive to analyze the trends it describes. The two-variable model for non-reproductive cancers indicates that for a small number of stages, the peak incidence rate is lower and occurs at a younger age (Figure 3). Complex cancers, compared to those with few stages, have a much higher cumulative incidence; for example, the cumulative probability over lifespan of an eight-stage cancer is 33 times that of a rarer two-stage cancer. Therefore, in general, the more complex the staging of specific adult cancers, the higher the cancer incidence rates and the longer the latent periods. Another observation is that cancers with fewer stages, which peak at an early age, have far broader incidence rate peaks than the more complex cancers, partly because senescence has less influence earlier in adulthood. Age 60 years appears pivotal; before that age cancers with few stages dominate diagnoses, whereas after that age high-stage cancers dominate incidence rates.
There is great variation in the endogenous (e.g., biologic aging) and exogenous (e.g., radiation) non-intrinsic factors driving the total cancer risk and the age-adjusted incidence rate of regions around the world, with the ratio of high to low incidence rates being up to ten-fold or more [13]. A finding was that the two-variable, age- and stage-dependent model is highly dependent on the intrinsic risk of carcinogenesis [13]. This was determined by the strong association (P = 0.005) of the ratio of the cumulative probability of cancer over lifespan, calculated from SEER data (numerator) and the two-variable model (denominator), with the ratio of the extrinsic cancer risk (numerator) and the intrinsic cancer risk (denominator). Hence, there appears to be an evolutionary component to multistage [55] and multistage-senescence models, where natural selection acts to suppress early-onset, rare, low-stage cancers when able to affect Darwinian fitness, which is less effective as reproductive rates decline and late-onset, common, high-stage cancers develop.

Driver mutations
All cancers possess somatically acquired mutations. Most somatic mutations are passenger mutations, which far outnumber, and rise with, the number of mutational driver genes [14]. Sporadic cancers arise through somatic evolution that parallels the increase in mutations in normal cells with aging. Early events in a cancer's development are characterized by a constrained set of common driver genes, and later events are ascribed to a greater set (~four-fold more) of drivers and increased genomic instability [56]. This applies to different cancer types and subtypes. Primary tumor cells may lie dormant for years before circulating cells form metastases [50]. Only ~0.03% of circulating melanoma cells in mice formed lung metastases [57]. However, other researchers report that the formation of driver gene mutations mainly arises during the initial stages of carcinogenesis, as primary and metastatic tumors share almost the same driver genes [58]. Non-shared metastatic driver genes do not have functional consequences. The median number of mutated driver genes was three for adult acute myeloid leukaemia, with the number of driver mutations increasing with age [59]. The mean number of driver gene mutations of various cancer types is approximately two (Table 2) [14,16].
Mutations in the TP53 gene (or p53 protein), the most common driver gene in cancers, particularly epithelial cancers, significantly increase in tumors with a high number of stages as evaluated from the steepness of the power function rise in cancer rates with aging (Equation 1) [60]. Driver gene mutations can contribute to one or more stages. This quantitative trend of the loss of the TP53 gene increasing with a cancer's stages can be interpreted as p53 contributing to multiple carcinogenic stages and promoting increasingly rapid progression. Early mutations dominate cancerous tissues, as shown in breast cancer [61]. This is supportive of metastatic driver genes arising early in cancer development (especially those involving the driver genes TP53 and KRAS) [62], although further stages comprise the metastatic process. Notwithstanding the central dogma in oncology of the dominant role of driver genes and mutations, by our analysis driver mutations contribute (making no allowance for non-singular driver mutation contributions) to only approximately one-third of the carcinogenic stages, which underlines the complexity of tumorigenesis [63].

Models of carcinogenesis
There are many alternatives to the multistage-senescence model of carcinogenesis. A review of carcinogenic models identified five mathematical model types, namely mutations, genomic instability, nongenotoxic mechanisms, Darwinian cell selection, and tissue organization [29]. A well-known model by Moolgavkar, Venzon, and Knudson (MVK) [64], based on a two-mutation initiation-promotion process, seeks to address a perceived deficiency of the Armitage-Doll model [7] by taking account of cell division and differentiation. This model is especially suited to childhood cancers such as the most common type, acute lymphoblastic leukemia, and the much rarer tumor, retinoblastoma, which have been experimentally and theoretically shown to have undergone two genetic changes [65,66]. Childhood cancers often result from defects in developmental signaling pathways of stem cells [19,67]. The primary incidence peak of childhood cancers is not influenced by senescence as represented by the multistage-senescence model. Cancer models have been produced based on genomic instability and the somatic cellular evolution of cancer, which are likely common to tumors of all ages [68]; nevertheless, childhood and adult cancers develop differently and require models of a different nature.
The decline in cancer incidence rates in old age has been modeled by Cook et al. [69] for stomach cancer in males, assuming that only 1% of the population is susceptible to this cancer; the modeling led to similar fits between the susceptibility and multistage-senescence models except for those aged more than 105 years. Our model is in agreement with the analysis of Cook et al. [69], indicating that incidence peaks at a younger age for rarer cancers. However, Ritter et al. [70] suggest that it is unlikely that cancer-susceptible people are depleted after age 80. A recent analysis by Belikov [71] shows that Erlang probability distributions (summing independent, exponentially-distributed events or stages) closely follow U.S. SEER incidence rate curves for 20 prevalent cancers. That analysis takes no account of a cellular-senescence component as a factor in the cancer downturn in old age. It generates the number of successive carcinogenic events, with a very wide and perhaps improbably large range of 4 to 41 stages, proffered to be driver mutations or epimutations. Based particularly on colorectal cancer common in adulthood, it has been proposed that three driver genes are needed for a cell to evolve through breakthrough, expansion, and invasive stages to an advanced cancer [72]. In contrast, our study estimates the number of stages for colorectal cancer, COADREAD (Table 1A-1C), as between six and seven, possibly including observed epigenetic modifications [5,44]. The Armitage-Doll and MVK models both assumed a constant number of stages for all tumor types, although analysis of contemporary SEER and driver gene data and contemporary understanding of the development of genomic instability, epigenetic changes, and metastases indicate that this is certainly not the case.
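The Erlang description mentioned above can be related to the power-law rise of the multistage theory with a short numerical sketch. It assumes k sequential stages with one common transition rate; the values of k and the rate below are hypothetical and are not taken from Belikov [71] or from our fits. For small values of rate × age, the Erlang density reduces to a power law in age with exponent k − 1, and the ratio of the two is exp(−rate × age).

```python
import math


def erlang_pdf(t, k, rate):
    """Density of the waiting time for k sequential exponential stages with a common rate."""
    return rate ** k * t ** (k - 1) * math.exp(-rate * t) / math.factorial(k - 1)


def power_law_approx(t, k, rate):
    """Early-time limit of the Erlang density (rate * t << 1): a power law in t."""
    return rate ** k * t ** (k - 1) / math.factorial(k - 1)


# Hypothetical illustration: six stages, 0.001 transitions per year.
k, rate = 6, 0.001
for age in (40, 60, 80):
    ratio = erlang_pdf(age, k, rate) / power_law_approx(age, k, rate)
    print(age, round(ratio, 4))  # ratio equals exp(-rate * age)

# The Erlang density itself peaks at (k - 1) / rate, without any senescence term.
erlang_mode = (k - 1) / rate
```

In this sketch the downturn of the Erlang curve comes only from the exponential factor, whereas the multistage-senescence model attributes the downturn to a separate senescence term.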

Limitations of determining temporal trends in cancer incidence
Finally, there is a limited potential to identify temporal trends in multistage-senescence model parameter values, as seen in the five data sets consisting of the 1979-1983, 1989-1993, and 1999-2003 data sets of Harding et al. [9] and our 2000-2003 and 2010-2013 data sets (Figure 1). Care was taken to duplicate the data retrieval and fitting methods of Harding et al. [9], as three of the five data sets originate from this published study. Ideally, all data sets would derive from our own analysis; in addition, data from the 2000-2003 and 2010-2013 groups come from a larger geographic region than those of earlier years. The 1999-2003 and 2000-2003 data sets overlap, leaving four independent data sets rather than five, which limits temporal analyses.
The prostate cancer incidence rate is influenced by its detection by transurethral resection of the prostate and by prostate-specific antigen (PSA) [73]. The use of PSA tests from 1986 onwards in the U.S. has allowed the early detection of prostate cancer. This has also resulted in considerable overdiagnosis, that is, detection of prostate cancers that would otherwise not have been diagnosed within the patient's lifetime. An Australian study indicated that overdiagnosis was common in prostate, breast, renal, and thyroid cancers, and melanoma [74]. An analysis of SEER data from 1988 to 1998 indicated over-diagnosed prostate cancer rates of ~29% for white subjects and 44% for black subjects [73]. It is proposed that early diagnosis and overdiagnosis due to PSA availability may partly explain the declining number of stages in prostate cancers diagnosed over four decades. This demonstrates the potential to examine temporal trends in incidence-related parameters, provided there is future analysis of more data sets.
A general limitation of our analyses of age-specific cancer incidence rates is their cross-sectional nature. The age-specific incidence rates emanate from various groups of people, who may have different cancer risks for reasons other than age itself, and these reasons are not accounted for in the carcinogenesis models. Hanson et al. [75] point out that cross-sectional cancer incidence studies often offer limited conclusions about cancer trends because they aggregate ages above 85 years, examine a single period, or fail to consider period and cohort influences; only the last of these is a limitation of our study. There are significant differences between the characteristics of the TCGA and SEER datasets, such as age and stage at diagnosis [76]. The potential inaccuracy of population estimates for age groups above 85 years was investigated in a data quality study by Miller et al. [77] that analyzed (in a similar manner to our study) the 2010 Census and U.S. SEER registry records for 2008-2012, which yielded cancer incidence rates that usually did peak (a trend observed in our study) and then decrease in the oldest old.

CONCLUSION
A multitude of carcinogenic models have been proposed in the scientific literature; however, the multistage-senescence model is unique in quantifying several important parameters of the age-specific incidence rate of different cancer types based solely on the number of development stages. The finding, perhaps controversial, that cancer-suppressive mechanisms counter the aging-dependent escalation in cancer complexity and incidence gives support to preventing cancer by augmenting extant mechanisms. The two-variable model gives us a useful tool to further investigate the biological mechanisms that drive specific cancers, such as those that are affected by ethnicity or ionizing radiation.

MATERIALS AND METHODS
All statistical analysis was performed using R version 3.6.0 [78]. The full code that we developed for all analyses (including processing data, computing incidence rates, statistical analysis, and making plots), as it stood at the time of publication, is posted under the MIT License in the GitHub repository at https://github.com/canghel/cancer-incidence-v5.

SEER cancer incidence data
The methods to estimate incidence rates from person-years at risk for the older U.S. population and to fit the multistage-senescence model were reproduced as closely as possible from Harding et al. [9]. The age-specific cancer incidence data from the SEER cancer registry that were analyzed were for 2000-2003 and 2010-2013. These periods were chosen to allow use of the SEER 18 registries data for all the calculations [10], to be near in date to the Census 2000 and 2010 population estimates [12], and to allow reasonable comparison to previous literature. As in Harding et al. [9], we restricted the selection of cases to malignant behavior, known age, and first matching record for each person.
The incident cases of 30 cancer types corresponding to TCGA cancer types [76] were extracted from the SEER data [10] using SEER*Stat 8.3.5 software. The cancer types consist of 23 non-reproductive cancers as well as five female-specific cancers and two male-specific cancers. The correspondence between TCGA codes and SEER histology is given in the Supplementary information. We used the Census fractions of the population at older ages to infer the populations for the oldest age categories, assuming that these fractions remain constant over the subsequent three years. The discrepancy between such ratios is less than ~15% for the groups under 85 years.
The crude rate per 100,000 persons in each five-year age category and the associated standard error were determined. Although the assumption of constant 85+ population fractions over time may fail for the oldest populations, especially for males with higher mortality rates, the computed standard errors for the incidence rates in these populations are also large, and hence these incidence rates carry little weight in the multistage beta model fit. For the peak age of incidence and the peak rate of incidence, we restricted the values reported to ages less than 105 to avoid outliers.
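The population apportioning and rate calculation described above can be sketched as follows. This is a minimal illustration in Python, not the authors' R code. The function names and all numbers are hypothetical, and the standard error uses the common Poisson form (square root of the case count divided by person-years), which is an assumption about the calculation rather than a statement of the exact procedure used.

```python
import math


def split_oldest_population(pop_85_plus, census_counts):
    """Apportion an aggregated 85+ population using Census age-group fractions."""
    total = sum(census_counts.values())
    return {group: pop_85_plus * n / total for group, n in census_counts.items()}


def crude_rate_per_100k(cases, person_years):
    """Crude rate per 100,000 and a Poisson-based standard error (assumed form)."""
    rate = 1e5 * cases / person_years
    std_err = 1e5 * math.sqrt(cases) / person_years
    return rate, std_err


# Hypothetical Census counts for the oldest five-year groups.
census_counts = {"85-89": 6000, "90-94": 3000, "95-99": 900, "100+": 100}
populations = split_oldest_population(50_000, census_counts)

# Hypothetical: 50 cases over four calendar years, population held constant.
person_years = 4 * populations["90-94"]
rate, std_err = crude_rate_per_100k(cases=50, person_years=person_years)
print(populations["90-94"], round(rate, 2), round(std_err, 2))
```

Because the standard error scales with the inverse square root of the case count, the sparse oldest age groups automatically receive low weight in an inverse-variance weighted fit.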

Multistage-senescence model fit
We fit the multistage-senescence model (Equation 4) by non-linear least squares, weighted proportionally to the inverse standard error squared, using the Levenberg-Marquardt algorithm from the minpack.lm v.1.2-1 package [79] in R v.3.6.0 [78]. The curves were fit beginning at age 50 when the number of cases for each cancer begins to rise [9], with three exceptions where the incidence peaks at much younger ages than most cancer types: thyroid carcinoma (THCA) and cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC) both for females at 30 years, and testicular germ cell tumors (TGCT) in males at 20 years. The time t in Equations 4 and 5 corresponds to age in non-reproductive cancers and to time since the onset of puberty, estimated as age minus 15 years in reproductive cancers.
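The weighting scheme and the choice of time variable can be illustrated with a small, dependency-free sketch. It assumes that Equation 4 has the Pompei-Wilson beta-model form, rate = a · t^(k−1) · (1 − βt). It also replaces the Levenberg-Marquardt routine of minpack.lm with a simpler profile search: for each trial k the model is linear in a and aβ, so these are solved by weighted normal equations and the k with the smallest weighted residual is kept. The function names, the grid, and the synthetic data are hypothetical.

```python
def model_time(age, reproductive=False, puberty_age=15.0):
    """Time variable t: age itself, or time since puberty for reproductive cancers."""
    return age - puberty_age if reproductive else age


def fit_beta_model(times, rates, std_errs, k_grid):
    """Weighted least squares for rate = a * t**(k-1) * (1 - beta*t), profiled over k."""
    best = None
    for k in k_grid:
        s11 = s12 = s22 = b1 = b2 = 0.0
        rows = []
        for t, y, se in zip(times, rates, std_errs):
            tau = t / 100.0                    # rescale time to keep the sums well conditioned
            x1, x2, w = tau ** (k - 1), tau ** k, 1.0 / se ** 2  # inverse-variance weight
            rows.append((x1, x2, y, w))
            s11 += w * x1 * x1
            s12 += w * x1 * x2
            s22 += w * x2 * x2
            b1 += w * x1 * y
            b2 += w * x2 * y
        det = s11 * s22 - s12 * s12
        c1 = (b1 * s22 - b2 * s12) / det       # c1 = a * 100**(k-1)
        c2 = (s11 * b2 - s12 * b1) / det       # c2 = -a * beta * 100**k
        wssr = sum(w * (y - c1 * x1 - c2 * x2) ** 2 for x1, x2, y, w in rows)
        if best is None or wssr < best[0]:
            best = (wssr, k, c1, c2)
    wssr, k, c1, c2 = best
    return {"a": c1 / 100.0 ** (k - 1), "k": k, "beta": -c2 / (100.0 * c1), "wssr": wssr}


# Synthetic, noise-free check with hypothetical parameters, fitted from age 50 upward.
true_a, true_k, true_beta = 2e-9, 6.0, 0.01
ages = [52.5 + 5.0 * i for i in range(10)]     # midpoints of five-year groups, 50-99
times = [model_time(age) for age in ages]
rates = [true_a * t ** (true_k - 1) * (1 - true_beta * t) for t in times]
std_errs = [0.05 * r for r in rates]
fit = fit_beta_model(times, rates, std_errs, [i / 100 for i in range(200, 1201)])
print(fit)
```

A grid over k is coarser than an iterative optimizer, but it makes the role of the inverse-variance weights explicit and avoids any dependence on starting values.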
By taking the derivative of Equation 4 [80], the age of peak incidence is obtained. The total probability of each cancer type, the cumulative ASR, is computed analogously for Equation 4 as in Pompei and Wilson. Parameter a is a multiplicative term [7], whose values are given in Supplementary Tables 4A-4C and 5A-5C.
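As a hedged illustration of these two quantities, suppose that Equation 4 has the beta-model form rate = a · t^(k−1) · (1 − βt); this form is an assumption here. Setting its derivative to zero gives a peak at t = (k − 1)/(kβ), and integrating from t = 0 to t = 1/β, where the rate falls to zero, gives a/(k(k + 1)β^k). The sketch below evaluates both; the function names and the value k = 6 are hypothetical, while β = 0.0099 is the mean senescence factor reported for non-reproductive cancers.

```python
def model_rate(t, a, k, beta):
    """Assumed beta-model form: a * t**(k-1) * (1 - beta*t)."""
    return a * t ** (k - 1) * (1 - beta * t)


def peak_time(k, beta):
    """Root of the derivative of model_rate: t = (k - 1) / (k * beta)."""
    return (k - 1) / (k * beta)


def cumulative_incidence(a, k, beta):
    """Integral of model_rate from t = 0 to t = 1/beta: a / (k * (k + 1) * beta**k)."""
    return a / (k * (k + 1) * beta ** k)


# Hypothetical six-stage cancer with the mean reported senescence factor.
t_peak = peak_time(k=6, beta=0.0099)
print(round(t_peak, 1))  # about 84 years
```

For reproductive cancers, where t is time since puberty, the corresponding age of peak incidence would be the peak time plus 15 years. The peak time depends only on k and β, so it rises with the number of stages.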

TCGA driver gene data
The numbers of driver mutations are taken from Table S6 of Iranzo et al. [14], with the total number of samples per cancer type assumed to be the same as in Martincorena et al. [81].

Additional information
Supplementary information is available in the online version of the paper. The full code and additional documentation, as at the time of publication, are posted under the MIT License in the GitHub repository at https://github.com/canghel/cancer-incidence-v5.

AUTHOR CONTRIBUTIONS
R.B.R. designed the study, carried out exploratory analysis based on published data, and wrote the first version of the paper. C.V.A. carried out data processing and computational analysis. D.S.D. carried out data processing and computational analysis and provided statistical expertise. All authors participated in the interpretation of the results, critically revised the paper, and approved the final version to be published.

ACKNOWLEDGMENTS
This study has benefited from the library facilities that McGill University makes available to its adjunct professors (R.B.R.). We are very grateful to Helena Rummens of Deep River, Ontario for assistance in editing the paper. An anonymous reviewer is thanked for critically reading the manuscript and suggesting thoughtful improvements.

CONFLICTS OF INTEREST
The authors declare that they have no conflicts of interest.

Abbreviations for non-reproductive cancer types and footnotes 1-4 are explained in Supplementary Table 1A. LGG, PCPG and READ are cancer types with P values for model-fitted uμ or k larger than 0.1 and were omitted from further analysis. The mean, SD and paired t-test P values in each column are for the 20 non-reproductive, paired cancer types for males and females, for the parameter in the corresponding column. †Values are no longer significant at the 5% level when Holm's method for multiple-testing correction is applied for the paired t tests in the table. *Values are no longer significant at the 10% level when Holm's method for multiple-testing correction is applied for the paired t tests in the table. Male (M) and female (F) parameter values are compared by paired t test for 20 paired cancer types in the last row of the table.

Supplementary
Abbreviations for non-reproductive cancer types (footnote 0) and footnotes 1-4 are explained in Supplementary Table 1A. LGG and PCPG are cancer types with P values for model-fitted uμ or k larger than 0.1 and were omitted from further analysis.