A Review of Estimation of Key Parameters and Lead Time in Cancer Screening

Early detection combined with effective treatment is the only way to fight against cancer, and cancer screening is the primary technique for early detection. Although mass cancer screening has been carried out for decades, many problems remain unsolved, and the statistical theory of cancer screening is still underdeveloped. Screening sensitivity, time duration in the preclinical state, and time duration in the disease-free state are the three key parameters in cancer screening, since all other estimates are functions of them. Lead time is the length of time by which diagnosis is advanced by screening, and it serves as a measure of the effectiveness of screening programs. In this article, we review the major probability models and statistical methodologies that have been developed for estimating the three key parameters and the lead time distribution. These methods can be applied to screening for other chronic diseases after slight modifications.


Introduction
Cancer screening, as the primary technique for early detection, has been carried out since the 1960s. The goal of screening is to catch the disease early, before symptoms appear. The United States Preventive Services Task Force (USPSTF) has recommended screening schedules for almost all of the most prevalent cancers (USPSTF 2016), such as breast, lung, colon, and cervical cancer. Although different cancer sites have their specific characteristics and developmental stages, they also share some common features.
The commonly followed progressive model used in cancer screening and its parameters are outlined below. A cohort of apparently healthy individuals is enrolled in a screening program to detect the presence of a specific disease. The stochastic disease progression model was first proposed by Zelen & Feinleib (1969) and has been used since then. In this model, the disease progresses through three states: $S_0 \to S_p \to S_c$ (see Figure 1). $S_0$ refers to the disease-free state or the state in which the disease cannot be detected; $S_p$ refers to the preclinical disease state, in which an asymptomatic individual unknowingly has the disease that a screening exam can detect; and $S_c$ refers to the state in which the disease manifests itself in clinical symptoms. The progressive disease model describes the natural history of lesions detected by screening for cancer. The goal of screening programs is to detect the cancer in the preclinical state ($S_p$), so that it may be treated before adverse symptoms arise.
Sensitivity is the probability that a screening exam result is positive, given that an individual is in the preclinical state $S_p$. More specifically, let the binary variable $D$ represent the true disease status of an individual; that is, $D$ takes the value one when an individual has the disease and zero otherwise. The binary variable $X$ represents the test result from a screening exam, with $X = 1$ indicating that the test is positive. The sensitivity is the probability of correctly identifying those who have the disease, that is, $\beta = P(X = 1 \mid D = 1)$. Specificity is the probability of correctly identifying those who do not have the disease, that is, $\alpha = P(X = 0 \mid D = 0)$. Ideally, we desire the test to have both a sensitivity and a specificity of 100%, but in reality this is unachievable. In fact, neither sensitivity nor specificity can be estimated directly from the summary data of a mass screening. To see why, suppose $n$ people take part in one screening exam; according to their true disease status and their screening results, they can be classified into four categories as in Table 1.
From Table 1, the sensitivity is $\beta = n_{11}/(n_{11} + n_{21})$, and the specificity is $\alpha = n_{22}/(n_{12} + n_{22})$, where $n_{11}$ and $n_{12}$ can be obtained by a follow-up exam, such as a biopsy after a positive screening result to confirm whether the finding is cancerous. However, for individuals who screen negative (the majority in a mass screening), confirmation of the true disease status is neither cost-effective nor ethical. Therefore, $n_{21}$ and $n_{22}$ are usually unknown, although their sum is observed. Hence, $\beta$ and $\alpha$ cannot be obtained from the data directly. Moreover, a screened-negative individual who is followed and later found to be positive may fall into one of two cases: either the earlier screening exam gave a false negative, or the case is newly developed. Nevertheless, the sensitivity can be estimated by likelihood-based methods from mass screening data (Shen & Zelen 1999, Wu, Rosner & Broemeling 2005, Wu, Wu, Banicescu & Cariño 2005).
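To make the identifiability problem concrete, the following minimal Python sketch computes the sensitivity and specificity from a fully known version of Table 1, and then compares two hypothetical tables that agree on everything a screening program observes (the two screen-positive cells and the total number of screen negatives) yet imply different sensitivities. The counts and the function name are illustrative assumptions, not data from any study.

```python
def sensitivity_specificity(n11, n12, n21, n22):
    """Return (beta, alpha) from a fully known 2x2 table.

    n11: test positive, diseased      n12: test positive, not diseased
    n21: test negative, diseased      n22: test negative, not diseased
    """
    beta = n11 / (n11 + n21)    # sensitivity
    alpha = n22 / (n12 + n22)   # specificity
    return beta, alpha


# Two hypothetical tables with identical observable counts:
# n11 = 40, n12 = 160, and n21 + n22 = 9800 screened negatives.
table_a = (40, 160, 10, 9790)
table_b = (40, 160, 40, 9760)

beta_a, alpha_a = sensitivity_specificity(*table_a)
beta_b, alpha_b = sensitivity_specificity(*table_b)
print(f"table A: sensitivity={beta_a:.2f}, specificity={alpha_a:.4f}")
print(f"table B: sensitivity={beta_b:.2f}, specificity={alpha_b:.4f}")
```

Because the split of the 9800 screen negatives is never observed, the data alone cannot distinguish table A from table B, which is why model-based likelihood methods are needed.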
Sojourn time is the time from when the disease first develops to the manifestation of clinical symptoms. If one enters the preclinical state ($S_p$) at age $t_1$ and becomes clinically incident ($S_c$) later at age $t_2$, then $(t_2 - t_1)$ is the sojourn time (see Figure 1). The nature of data collection in a screening program makes exact observation of the onset time of either $S_p$ or $S_c$ impossible. Therefore, estimation of the sojourn time distribution is difficult. However, this information can be obtained under model assumptions. For example, previous analyses have shown that the preclinical state of breast cancer may last from 1 to 4 years (Shen & Zelen 1999, Shen, Wu & Zelen 2001, Wu, Rosner & Broemeling 2005, Wu, Wu, Banicescu & Cariño 2005), and that the sojourn time may be longer for colorectal cancer (Wu, Erwin & Rosner 2009b). Cancers with longer sojourn times are more likely to be detected in the preclinical state, which is the goal of implementing a screening program.
The transition density from the disease-free state ($S_0$) to the preclinical state ($S_p$) is the probability density function (PDF) of the time duration in the disease-free state $S_0$, i.e., $t_1$ in Figure 1. It is commonly assumed that the sojourn time and the transition time are independent (Wu, Rosner & Broemeling 2005, Wu, Wu, Banicescu & Cariño 2005). Due to the imperfect sensitivity of the test and the interval-censored nature of the data, the transition density is typically estimated by relying on common parametric models or interval-constant assumptions.
Lead time is the length of time by which the diagnosis is advanced by screening. In Figure 1, if one is offered a screening exam at time $t$ within the time interval $(t_1, t_2)$ and cancer is diagnosed, then $(t_2 - t)$ is the lead time. An individual with a longer lead time usually has a better prognosis than one with a shorter lead time. For a particular screen-detected case, the lead time is unobservable: once cancer is diagnosed, it is treated immediately, making it impossible to observe the onset of the clinical state $S_c$.
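The definitions of sojourn time and lead time reduce to simple arithmetic once the (in practice unobservable) onset ages are known. The short Python sketch below illustrates this for a hypothetical individual; the function names and ages are assumptions made for illustration only.

```python
def sojourn_time(t1, t2):
    """Time spent in the preclinical state S_p (from age t1 to age t2)."""
    if t2 < t1:
        raise ValueError("clinical onset t2 must not precede preclinical onset t1")
    return t2 - t1


def lead_time(t1, t2, screen_age, detected=True):
    """Lead time gained when a screen at `screen_age` finds the cancer.

    Returns t2 - screen_age if the exam falls inside (t1, t2) and detects
    the disease; otherwise 0.0 (the diagnosis is not advanced).
    """
    if detected and t1 < screen_age < t2:
        return t2 - screen_age
    return 0.0


# Hypothetical individual: enters S_p at age 52.0, would reach S_c at 55.5.
print(sojourn_time(52.0, 55.5))         # 3.5
print(lead_time(52.0, 55.5, 54.0))      # 1.5
print(lead_time(52.0, 55.5, 51.0))      # 0.0 (screened before entering S_p)
```

A false-negative exam inside the preclinical window (`detected=False`) also yields zero lead time, which corresponds to an interval case.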
The three key parameters in screening are the sensitivity, the sojourn time, and the transition density. They are key because all other estimates, including the lead time, are functions of these three parameters.
In the next few sections, we review the existing statistical methods used to estimate the three key parameters in cancer screening, as well as the methods for estimating the lead time. Finally, we close the article with a brief discussion of the variations of these methods as applied to different cancer sites, such as breast, colon, and lung cancer.

Estimation of the Three Key Parameters
We first introduce some notation used in the remainder of the article. Consider a group of initially asymptomatic individuals scheduled for $K$ ordered screening exams at ages $t_0 < t_1 < \cdots < t_{K-1}$, where $t_{i-1}$ represents a person's age when receiving the $i$th screen, $i = 1, \ldots, K$. For an annual screening program, $t_i = t_0 + i$. We define $\beta$ as the sensitivity of the screening exam, with $\beta = \beta(t_i)$ if it is age-dependent. The function $w(t)$ is the PDF of the time duration in $S_0$; note that it is often modeled as a sub-PDF, because some individuals may stay in $S_0$ throughout their lifetime. Finally, $q(\cdot)$ is the probability density function of the sojourn time in $S_p$, with the survival function $Q(x) = \int_x^{\infty} q(t)\,dt$.
The mass screening data used in these methods usually consist of three pieces of information from each screening cycle: $n_i$ is the total number of individuals examined at the $i$th screening (at age $t_{i-1}$); $s_i$ is the number of individuals diagnosed at the $i$th screening exam, that is, the number of screen-detected cases; and $r_i$ is the number of individuals found in the clinical state ($S_c$) within the $i$th screening interval $(t_{i-1}, t_i)$, that is, the number of interval cases. Table 2 shows the data format for a mass screening program with $K$ scheduled exams, where $t_0$ is the age at the first exam, and the triplets $(n_i, s_i, r_i)$ stratified by the initial age are the data we use.
Shen & Zelen (1999) proposed a likelihood function to estimate the screening sensitivity and the mean sojourn time under both a stable and a nonstable disease model. The stable model assumes that the transition density is constant over all ages, $w(t) = w$, whereas the nonstable model allows the transition density $w(t)$ to depend on $t$. In their approach, $w(t)$ is taken to be a step function of age with discontinuities every five years. The sojourn time was assumed to follow an exponential distribution with mean $\mu$ in both the stable and nonstable models, i.e., $Q(x) = \exp(-x/\mu)$. The estimated parameters are the sensitivity $\beta$, the mean sojourn time $\mu$, and the transition density $w$.
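The two modeling ingredients of the stable and nonstable models, an exponential sojourn time and a transition density that is either constant or piecewise constant in five-year age bands, can be sketched in a few lines of Python. The function names, the band boundaries, and the numerical rates below are hypothetical and chosen only for illustration.

```python
import math


def exponential_survival(x, mean_sojourn):
    """Q(x) = exp(-x / mu): probability that the sojourn time exceeds x."""
    return math.exp(-x / mean_sojourn)


def step_transition_density(age, start_age, rates, width=5.0):
    """Nonstable model: w(t) is constant within consecutive `width`-year bands.

    `rates[k]` applies on [start_age + k*width, start_age + (k+1)*width).
    Ages outside the covered range get density 0.0 in this sketch.
    The stable model is the special case in which all rates are equal.
    """
    k = math.floor((age - start_age) / width)
    if 0 <= k < len(rates):
        return rates[k]
    return 0.0


# Hypothetical values, for illustration only.
mu = 2.0
rates = [0.001, 0.0015, 0.002]            # bands 40-45, 45-50, 50-55
print(exponential_survival(1.0, mu))      # chance of still being preclinical after 1 year
print(step_transition_density(47.0, 40.0, rates))
```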
Consider the $i$th screening interval $[t_{i-1}, t_i)$ for a fixed age stratum. Let $D_i$ be the probability that a preclinical individual is diagnosed at the $i$th screening, given at age $t_{i-1}$. To be diagnosed at the $i$th screening ($i > 1$), a person must have tested negative at all $(i - 1)$ previous screening exams and must remain in the preclinical state at least until $t_{i-1}$; the explicit expression for $D_i$ is given in Shen & Zelen (1999). Let $I_i$ be the probability of an individual being incident in the $i$th interval. Such a person has not been detected at any of the $i$ previous exams (true negative or false negative) and develops clinical cancer at a time point $t$ after $t_{i-1}$, having entered the preclinical state at any time before $t$. The full likelihood function derived from $D_i$ and $I_i$ depends only on the sensitivities $\alpha_j$ of the different modalities and on the parameter vector of the sojourn time distribution. The overall sensitivity, $\beta = \alpha_1 + \alpha_2 + \alpha_3$, applies when two screening modalities are used simultaneously in each exam, such as mammography and physical exam in breast cancer, or chest X-ray and sputum cytology in lung cancer; here $\alpha_1$, $\alpha_2$, and $\alpha_3$ correspond to detection by modality 1 only, by modality 2 only, and by both, so that $\beta_1 = \alpha_1 + \alpha_3$ and $\beta_2 = \alpha_2 + \alpha_3$ represent the sensitivity of each modality (see Shen et al. (2001) for details). Correspondingly, $s_{i1}$, $s_{i2}$, and $s_{i3}$ denote the numbers of cases detected by modality 1 only, by modality 2 only, and by both, with $s_{i1} + s_{i2} + s_{i3} = s_i$.
By treating $r_i$ and $s_i$ as approximately Poisson, they developed a simplified conditional likelihood function. In both papers (Shen & Zelen 1999, Shen et al. 2001), the data were not stratified by age, which means that Table 2 could be collapsed into a vector. Two breast cancer screening datasets, the Health Insurance Plan (HIP) study and the Canadian National Breast Screening Study, were analyzed under both the stable and nonstable models. In the nonstable model, estimates of the transition rate $w$ for each five-year interval can be obtained from the incidence data in the SEER database. The innovation of this work is that a likelihood function was developed to estimate the sensitivity and the mean sojourn time.
2.2. Estimation of Age-Dependent Sensitivity and Transition Probability
Wu, Rosner & Broemeling (2005) developed statistical inference procedures to estimate the sojourn time, the age-dependent sensitivity, and the age-dependent transition density from the disease-free state to the preclinical state. Both maximum likelihood estimates (MLEs) and Bayesian posterior estimates were used to estimate the parameters. Age was treated as a covariate of the sensitivity and the transition probability density.
Consider a cohort of initially asymptomatic individuals who enter the screening program at age $t_0$. There are $K$ ordered screening exams at ages $t_0 < t_1 < \cdots < t_{K-1}$, followed by a follow-up period after the last exam, during which incident cases may be detected. Let $(n_{i,t_0}, s_{i,t_0}, r_{i,t_0})$ be the data from the $i$th screening for the stratum with starting age $t_0$. The likelihood for the individuals aged $t_0$ at study entry depends on two probabilities. The first, $D_{k,t_0}$, is the probability that an individual is detected at the $k$th screening exam (at age $t_{k-1}$), given that this person is in the state $S_p$, for $k = 1, 2, \ldots, K$. Here, we work with a single testing modality, so there are no $\alpha$ terms as in the previous section. The second, $I_{k,t_0}$, is the probability that an individual is incident during the $k$th interval $(t_{k-1}, t_k)$. For one screening study, the likelihood for all age groups is proportional to the product of the stratum-specific likelihoods over the starting ages $t_0$. The likelihood is thus a function of the three key parameters $\beta(t)$, $w(t)$, and $q(x)$.
The parametric models for the three key parameters were chosen as follows. The sensitivity $\beta(t)$ was linked to age $t$ through a logistic function with parameters $b_0$ and $b_1$, where $\bar{t}$ is the average age at entry in the study group. A log-normal density with parameters $\mu$ and $\sigma^2$ was used for the transition probability $w(t)$. Because the integral of $w(t)$ over all ages is the lifetime risk of developing the cancer and should always be less than 1, $w(t)$ is in fact a sub-PDF. Hence, an upper limit was set on $w_{\max} = \int w(t)\,dt$: for breast cancer it was set to 0.2, and for heavy smokers in lung cancer it was 0.3 (Wu, Rosner & Broemeling 2005, Liu, Levitt, Riley & Wu 2015). For the sojourn time, the log-logistic distribution was adopted, in part due to its convenient survival function $Q(x) = [1 + (\rho x)^{\kappa}]^{-1}$. The unknown parameters $\theta = (b_0, b_1, \mu, \sigma^2, \kappa, \rho)$ were estimated from the likelihood function described above. Although the integrals in $D_{k,t_0}$ and $I_{k,t_0}$ are not available in closed form, numerical integration can be applied. Simulations were carried out to evaluate the reliability of the proposed likelihood; the detailed procedure can be found in Wu, Wu, Banicescu & Cariño (2005). Both Markov chain Monte Carlo (MCMC) estimates and MLEs were obtained. The model was applied to the HIP female breast cancer study, yielding estimates of the age-dependent sensitivity and transition probability along with the sojourn time.
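The following is a minimal Python sketch of how the detection and incidence probabilities could be evaluated numerically from their verbal definitions, and how they could enter a log-likelihood for one age stratum. Everything specific here is an assumption made for illustration rather than the authors' implementation: the exact logistic form of the sensitivity, the scaling of the log-normal density by `w_max`, the way missed exams are accumulated over "generations" of entry into the preclinical state, the multinomial-type form of the log-likelihood, the Simpson integrator, and all parameter values and counts.

```python
import math


def beta(t, b0, b1, t_bar):
    """Assumed logistic link between age t and sensitivity."""
    return 1.0 / (1.0 + math.exp(-b0 - b1 * (t - t_bar)))


def w(t, mu, sigma, w_max):
    """Log-normal density scaled by w_max, so that it is a sub-PDF."""
    if t <= 0.0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    return w_max * math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))


def Q(x, kappa, rho):
    """Log-logistic survival function of the sojourn time."""
    return 1.0 / (1.0 + (rho * x) ** kappa)


def integrate(f, a, b, n=200):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * f(a + j * h)
    return total * h / 3.0


def detection_and_incidence(ages, follow_up, theta, t_bar, w_max=0.2):
    """Return lists (D, I) with D[k-1] = D_k and I[k-1] = I_k."""
    b0, b1, mu, sigma, kappa, rho = theta
    grid = [0.0] + list(ages) + [ages[-1] + follow_up]  # grid[i + 1] = t_i
    sens = [beta(t, b0, b1, t_bar) for t in ages]
    D, I = [], []
    for k in range(1, len(ages) + 1):
        t_exam, t_next = grid[k], grid[k + 1]           # t_{k-1} and t_k
        d_sum = i_sum = 0.0
        for g in range(k):  # generation g entered S_p in (t_{g-1}, t_g]
            lo, hi = grid[g], grid[g + 1]
            # Probability of being missed at the exams at t_g, ..., t_{k-2}.
            missed = math.prod(1.0 - sens[j] for j in range(g, k - 1))
            d_sum += missed * integrate(
                lambda x: w(x, mu, sigma, w_max) * Q(t_exam - x, kappa, rho), lo, hi)
            i_sum += missed * (1.0 - sens[k - 1]) * integrate(
                lambda x: w(x, mu, sigma, w_max)
                * (Q(t_exam - x, kappa, rho) - Q(t_next - x, kappa, rho)), lo, hi)
        # Entered S_p after the k-th exam and became clinical before t_k.
        i_sum += integrate(
            lambda x: w(x, mu, sigma, w_max) * (1.0 - Q(t_next - x, kappa, rho)),
            t_exam, t_next)
        D.append(sens[k - 1] * d_sum)
        I.append(i_sum)
    return D, I


def log_likelihood(counts, ages, follow_up, theta, t_bar, w_max=0.2):
    """Multinomial-type log-likelihood for one age stratum (assumed form)."""
    D, I = detection_and_incidence(ages, follow_up, theta, t_bar, w_max)
    return sum(s * math.log(d) + r * math.log(i) + (n - s - r) * math.log(1.0 - d - i)
               for (n, s, r), d, i in zip(counts, D, I))


# Hypothetical stratum: four annual exams starting at age 50, one year of follow-up.
ages = [50.0, 51.0, 52.0, 53.0]
theta = (1.0, 0.05, 4.2, 0.3, 2.0, 0.5)   # (b0, b1, mu, sigma, kappa, rho)
counts = [(1000, 7, 1), (980, 3, 1), (960, 3, 1), (940, 3, 1)]  # (n_k, s_k, r_k)
D, I = detection_and_incidence(ages, 1.0, theta, t_bar=50.0)
print(D, I, log_likelihood(counts, ages, 1.0, theta, t_bar=50.0))
```

In practice such a function would be passed to a numerical optimizer or an MCMC sampler; a pure-Python integrator is used here only to keep the sketch self-contained.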
2.3. Key Parameters Estimation When Sensitivity Depends on Sojourn Time
Wu, Cariño & Wu (2008) argued that the screening sensitivity should be a function of both the age at diagnosis and the amount of time spent in the preclinical state, rather than depending only on the age at diagnosis. Intuitively, as the cancer gets closer to progressing from the preclinical state to the clinical state, it should be easier to detect by a screening exam than it was previously.
In this way, the sensitivity is modeled as $\beta = \beta(t, s \mid S)$, where $t$ represents an individual's age at the screening exam, $s$ is the time the person has already spent in the preclinical state, and $S$ is the sojourn time in $S_p$ ($s < S$). The probability $D_{k,t_0}$ that an individual with initial age $t_0$ is diagnosed at the $k$th screening exam (at age $t_{k-1}$), given that this person is in the state $S_p$, and the probability $I_{k,t_0}$ that an individual with initial age $t_0$ is an incident case during the $k$th interval $(t_{k-1}, t_k)$ are modified accordingly, since the sensitivity at each exam now depends on the time already spent in $S_p$ and on the sojourn time. The sensitivity is expressed in terms of the age $t$, the time $s \in [0, S]$ already spent in the preclinical state $S_p$, the sojourn time $S$, and $\bar{t}$, the average age at entry for the entire study group.
Under this model, the sensitivity is increasing in $s$, with the maximum attained at $s = S$, that is, at the moment the cancer transitions from the preclinical to the clinical state. When $b_1 > 0$, the sensitivity is a monotonically increasing function of age $t$. This method was applied to breast cancer data, such as the HIP study (Wu et al. 2008).
Motivated by the observation that age seems to have little effect on the screening sensitivity in lung cancer, Kim & Wu (2016) treated the sensitivity as a function of the time spent in the preclinical state and the sojourn time only. The sensitivity was modeled through the ratio $s/S$ of the time spent in the preclinical state to the sojourn time, with two additional parameters: $\tau$ controls the overall sensitivity, and $\gamma$ reflects the rate at which the sensitivity changes. When $s/S$ is close to zero, the sensitivity increases rapidly if $\gamma < 1$, while it increases slowly if $\gamma > 1$.
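The qualitative behavior described for the two parameters can be illustrated with a simple power form in the ratio of time spent in the preclinical state to the sojourn time. This form is only one hypothetical choice consistent with the description above; the exact expression used by Kim & Wu (2016) may differ, and the function name and numerical values are assumptions.

```python
def ratio_sensitivity(s, S, tau, gamma):
    """Hypothetical sensitivity tau * (s / S) ** gamma, for 0 <= s <= S.

    tau   scales the overall sensitivity (the value reached at s = S);
    gamma controls how quickly sensitivity rises as s / S grows from zero.
    """
    if S <= 0.0 or s < 0.0 or s > S:
        raise ValueError("require S > 0 and 0 <= s <= S")
    return tau * (s / S) ** gamma


# Early in the preclinical phase (s / S = 0.1), with tau = 0.9:
early_fast = ratio_sensitivity(0.2, 2.0, tau=0.9, gamma=0.5)  # gamma < 1: rises quickly
early_slow = ratio_sensitivity(0.2, 2.0, tau=0.9, gamma=2.0)  # gamma > 1: rises slowly
print(early_fast, early_slow)
```

Under this assumed form, the sensitivity at the end of the sojourn time equals `tau` regardless of `gamma`.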
The probabilities $D_{k,t_0}$ and $I_{k,t_0}$ take the same form as in Equations 2, 3 and 4. This method, combined with the likelihood in Equation 1, was applied to the Johns Hopkins Lung Project data in Kim & Wu (2016).

Estimation of the Lead Time Distribution
Lead time is the length of time by which the diagnosis is advanced by screening. It can serve as a surrogate measure of how effective a screening program is. In the case of cancer, survival time is typically measured from the time of diagnosis. Hence, earlier detection of the tumor due to screening will make the patient's survival appear longer, even if there is no real effect on mortality. When the survival benefit is compared between the screened group and the control group, the survival of the screened group must be adjusted for lead time, so accurate estimation of the lead time is necessary.
Many researchers have proposed methods to estimate the lead time (Kafadar & Prorok 1994, Kafadar & Prorok 1996, Kafadar & Prorok 2003, Straatman, Peer & Verbeek 1997). Most of these methods assume that the sojourn time follows an exponential distribution. Due to the memoryless property of the exponential distribution, the lead time then follows the same exponential distribution. These publications provide estimates of the mean and variance of the lead time under the exponential assumption. We focus on three major methods in this section.
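The memoryless argument can be checked numerically. If the sojourn time is exponential with mean mu and a case is detected after having spent a time u in the preclinical state, the lead time is the remaining sojourn time, whose survival function is P(S > u + l | S > u). The short Python sketch below, with hypothetical values of mu and u, shows that this conditional survival does not depend on u and coincides with the original exponential survival function.

```python
import math


def exp_survival(x, mu):
    """P(S > x) for an exponential sojourn time with mean mu."""
    return math.exp(-x / mu)


def residual_survival(l, u, mu):
    """P(S > u + l | S > u): survival of the remaining sojourn (lead) time."""
    return exp_survival(u + l, mu) / exp_survival(u, mu)


mu = 2.0   # hypothetical mean sojourn time, in years
for u in (0.0, 0.5, 3.0):
    # The second and third columns agree for every u.
    print(u, residual_survival(1.0, u, mu), exp_survival(1.0, mu))
```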

3.1. Local Lead Time Distribution for the Screen-Detected Cases
Prorok (1982) made a major contribution by deriving the conditional probability distribution of the lead time, given that an individual was detected at the $i$th screening exam.
Consider a screening program with a total of $K$ screening exams. If an individual enters the preclinical state $S_p$ during the time interval $(t_{i-1}, t_i]$, $i = 0, 1, \ldots, K - 1$, this person is a member of the $i$th generation, where $t_{-1} = 0$. Prorok (1982) argued that the lead time distribution at a given screening, say the $(j + 1)$th, is a weighted average of the lead time distributions for all generations potentially detectable at that screening. The local lead time PDF $f_{D_j}(l)$ for individuals detected in $S_p$ at the $(j + 1)$th screening (at time $t_j$) is therefore defined as a weighted average of the generation-specific densities $f_{ij}(l)$ with mixing weights $D_{ij}$, where $f_{ij}(l)$ is the lead time distribution for members of the $i$th generation who are detected at the $(j + 1)$th screening but not before. The $i$th generation lead time distribution is calculated from $w_i(\cdot)$ and $Q_i(\cdot)$, the transition density from $S_0$ to $S_p$ and the survival function of the sojourn time for the $i$th generation, respectively, and involves $u$, the length of time from entering $S_p$ to being detected at the screening at $t_i$, which is a random variable.
The weighting factor $D_{ij}$ is the probability that an individual is detected at the $(j + 1)$th screening, given that the person belongs to the $i$th generation. It is built from the following components: $P(E_i)$ is the probability that an individual belongs to the $i$th generation; $P(t_i)$ is the probability that an $i$th-generation individual is in $S_p$ at time $t_i$; $Q_{v_i}(t_j - t_i)$ is the probability that the length of time $(\tau - t_i)$ for an $i$th-generation individual is not less than $t_j - t_i$, where $\tau$ represents the time point at which this individual enters $S_c$; and the term $f(\beta_{ij})$ accounts for the sensitivities of the screening exams. The derivation of these probabilities can be found in Prorok (1976) and Prorok (1982).
Simulations were conducted to explore the properties of the lead time based on the derived distribution. In the simulations, the sojourn time was assumed to follow a generalized gamma distribution with the mean fixed at 2 years and three different variances, corresponding to a coefficient of variation larger than, smaller than, and equal to one. The results showed that the local lead time distribution for screen-detected cases stops changing after a certain number (four or five) of screening exams, given a fixed screening interval of 1 year. This suggested a possible stopping rule when designing screening programs, since continued screenings are not expected to yield any additional benefit. However, this study focuses only on screen-detected cases, whose lead time is positive, and ignores the interval cases, whose lead time is zero.
3.2. Global Lead Time Distribution When Lifetime Is Fixed
Wu, Rosner & Broemeling (2007) rigorously evaluated the lead time distribution based on model parameters for the whole cohort participating in the screening program, including both the screen-detected and the interval incident cases. In this way, the proportion of patients whose lead time is zero can be estimated, together with the distribution of the lead time for those patients who were detected early by screening. Thus, the lead time distribution is a mixture of a point mass at zero and a probability density function of a positive continuous random variable.
Let us consider an initially asymptomatic individual with no history of cancer who is assumed to take $K$ screening exams at ages $t_0 < t_1 < \cdots < t_{K-1}$, with the lifetime $T$ treated as a fixed value. With $D = 1$ denoting that the individual develops clinical cancer and $L$ denoting the lead time, $P(D = 1)$ is the probability of developing (clinical) cancer after age $t_0$, and $P(L = 0 \mid D = 1)$ is the probability that the lead time is zero, i.e., the collective probability of being an interval case. The continuous part of the distribution is given by the joint probability density function $f_L(z, D = 1)$ for $z \in (0, T - t_0)$. The validity of the probability calculation can be verified by checking that $P(L = 0 \mid D = 1) + \int_0^{T - t_0} f_L(z \mid D = 1)\,dz = 1$. The lead time distribution depends on the three key parameters: the sensitivity $\beta(\cdot)$, the transition probability $w(\cdot)$, and the distribution of the sojourn time $q(\cdot)$. The method was applied to the HIP study, and the posterior predictive distribution of the lead time was estimated using MCMC posterior samples. Bayesian inference was performed to explore the properties of the lead time under different screening intervals (6, 9, 12, 18 and 24 months), given the initial screening age $t_0 = 50$ and lifetime $T = 80$. Later, this method was applied to various cancer screening studies, including breast, lung, and colon cancer (Wu et al. 2007, Wu, Erwin & Rosner 2011, Wu, Erwin & Rosner 2009a).

3.3. Global Lead Time Distribution When Lifetime Is a Random Variable
Wu, Kafadar, Rosner & Broemeling (2012) extended the lead time distribution by allowing the lifetime $T$ to be a random variable, which is more realistic. The lead time distribution when $T$ is a random variable is obtained by combining the fixed-lifetime quantities $P(L = 0 \mid D = 1, T = t)$ and $f_L(z \mid D = 1, T = t)$, which can be calculated by Equations 5 and 6, with the conditional lifetime distribution $f_T(t \mid T \ge t_0) = f_T(t)/P(T \ge t_0)$. The validity of this mixed probability distribution can be verified in the same way as in the fixed-lifetime case. The actuarial life table from the United States Social Security Administration was used to estimate the lifetime distribution $f_T(t \mid T \ge t_0)$ (see http://ssa.gov/OACT/STATS/table4c6.html). The life table provides the conditional probability of death within one year from age 0 to age 119, denoted as $b_N = P(T < N + 1 \mid T \ge N)$, $N = 0, 1, \ldots, 119$. From these probabilities the conditional density was approximated, and the final lifetime distribution was taken to be a step function. Because the lifetime $T$ is random, the number of screening exams $K = (T - t_0)/\Delta$, where $\Delta$ is the screening interval, is a function of $T$ and hence also a random variable. The distribution of the lead time is therefore a weighted average across different lifetimes. Additional simulations were done in Kendrick, Rai & Wu (2015).
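One way to turn annual conditional death probabilities into a step-function lifetime distribution is the standard chain rule for survival: the probability of dying in the year starting at age N, given survival to the starting age, is the death probability at age N multiplied by the probabilities of surviving each earlier year. The Python sketch below implements this rule; it is an illustration under that assumption and not necessarily the exact approximation used in the cited work, and the toy probabilities are made up rather than taken from the life table.

```python
def conditional_lifetime_pmf(b, t0):
    """Return {N: P(N <= T < N + 1 | T >= t0)} for N = t0, ..., len(b) - 1.

    b[N] = P(T < N + 1 | T >= N) is the one-year conditional death
    probability at integer age N. Spreading pmf[N] uniformly over
    [N, N + 1) gives a step-function density for the lifetime.
    """
    pmf = {}
    alive = 1.0                      # P(T >= N | T >= t0)
    for N in range(t0, len(b)):
        pmf[N] = alive * b[N]
        alive *= 1.0 - b[N]
    return pmf


# Toy table with four ages (made-up numbers); death is certain in the last year.
toy_b = [0.1, 0.2, 0.5, 1.0]
pmf = conditional_lifetime_pmf(toy_b, t0=1)
print(pmf)
print(sum(pmf.values()))
```

With a real life table, `b` would hold the 120 annual probabilities and `t0` the age at the first screening exam; the resulting weights are the ones used to average fixed-lifetime results across lifetimes.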

Discussion
Accurate estimation of the three key parameters in cancer screening lays a foundation for evaluating the effectiveness of a screening protocol. In particular, all the quantities of interest (lead time, rate of overdiagnosis, survival benefits, etc.) can be expressed as functions of the sensitivity, the sojourn time distribution, and the transition density. Estimation of the unobserved lead time is another important topic in cancer screening, as lead time is essential in evaluating the survival benefit of cancer screening.
In this paper, we reviewed three existing methods for estimating the three key parameters. The stable and nonstable models proposed by Shen & Zelen (1999) provide a way to estimate the key parameters using likelihood-based methods. Under the nonstable model, the transition probability varies across age groups but is assumed constant within each age group. Conversely, the stable model treats the transition probability as constant over all ages. In this approach, the sensitivity was fixed across age groups and the sojourn time was modeled by an exponential distribution. To allow the sensitivity to vary by age, Wu, Rosner & Broemeling (2005) modeled the sensitivity using a logistic function of age, while assuming a log-normal density for the transition time from $S_0$ to $S_p$. While it has been argued that the sensitivity is negatively correlated with the sojourn time (Walter & Day 1983), Wu et al. (2008) extended the sensitivity model in Wu, Rosner & Broemeling (2005) by modeling the sensitivity using both age and the ratio of the time already spent in the preclinical state to the full sojourn time.
Since breast cancer screening programs began before screening for other cancer sites, the earliest developed models are known to be quite accurate for breast cancer. It is now commonly accepted that the sensitivity of mammography increases with a woman's age. The medical explanation is that the breast tissue of younger women is denser and more fibrous than that of older women, whose breast tissue is relatively softer and fattier. Previous probability models also showed this trend (Wu, Rosner & Broemeling 2005, Wu, Wu, Banicescu & Cariño 2005, Chen, Brock & Wu 2010). This may not hold for screening at other cancer sites, such as the widely recommended lung cancer screening using low-dose computed tomography (Liu et al. 2015), for which the sensitivity does not seem to be affected by age. Whether the sensitivity of the fecal occult blood test for colorectal cancer depends on age is currently uncertain (Prevost, Launoy, Duffy & Chen 1998, Wu et al. 2009b).
We also reviewed three methods for estimating the lead time distribution in cancer screening. Prorok (1982) derived the lead time distribution for those detected at the $i$th exam and used it to determine stopping rules when designing a screening program. However, the method considers only the screen-detected cases, for which the lead time is positive. Wu et al. (2007) derived the lead time distribution for the whole cohort by considering both screen-detected cases and interval cases when the lifetime is fixed. The distribution of the lead time is a mixture of a point mass at zero (for the interval cases) and a piecewise continuous density function (for the screen-detected cases); the probability calculation was dramatically simplified in this model, and it includes Prorok's result as a special case. Wu et al. (2012) extended the model in Wu et al. (2007) to treat the lifetime as a random variable. In this case, the lead time distribution is a weighted average of the lead time distributions under different lifetimes. This is the first prospective study: for people at their current age, and based on the existing data, one can make predictive inference on the distribution of the lead time under different future screening schedules. Thus, one can use this method to infer future outcomes under various screening schedules, such as the probability of early detection, how early the detection would be for a screen-detected case, and the probability of no benefit for an interval case. These methods have been applied to estimate the lead time distribution in breast cancer screening (Wu et al. 2007, Shows & Wu 2011, Wu et al. 2012), lung cancer screening (Wu et al. 2011, Jang, Kim & Wu 2013), and colorectal cancer screening (Wu et al. 2009a).
The effectiveness of screening is constantly debated. Questions regarding the efficient design of cancer screening programs have arisen, such as at what age to start screening and how frequently patients should be re-screened. For example, there has been recent controversy about whether mammography in breast cancer screening benefits women in their 40s. Here, we reviewed several statistical methods, and we hope that effective and novel statistical evaluation of screening protocols will be an integral part of this debate.

Figure 1: Disease progressive states and the lead time.
$T$ represents the lifetime, a fixed value. Let $D$ represent the true disease status, with $D = 1$ indicating having cancer and $D = 0$ indicating no clinical disease in one's lifetime. Let $L$ represent the lead time of an individual. The distribution of the lead time is a mixture of the conditional probability $P(L = 0 \mid D = 1)$ and the conditional density function $f_L(z \mid D = 1)$, for $z \in (0, T - t_0)$.

Table 1: True disease status and test result in one mass screening.
                              Disease (D = 1)    No disease (D = 0)
Test positive (X = 1)         n_11               n_12
Test negative (X = 0)         n_21               n_22

Table 2: A sample of mass cancer screening data.