Statistical methods for elimination of guarantee-time bias in cohort studies: a simulation study

Background Aspirin has been considered to be beneficial in preventing cardiovascular diseases and cancer. Several pharmaco-epidemiology cohort studies have shown protective effects of aspirin on diseases using various statistical methods, with the Cox regression model being the most commonly used approach. However, there are some inherent limitations to the conventional Cox regression approach such as guarantee-time bias, resulting in an overestimation of the drug effect. To overcome such limitations, alternative approaches, such as the time-dependent Cox model and landmark methods have been proposed. This study aimed to compare the performance of three methods: Cox regression, time-dependent Cox model and landmark method with different landmark times in order to address the problem of guarantee-time bias. Methods Through statistical modeling and simulation studies, the performance of the above three methods were assessed in terms of type I error, bias, power, and mean squared error (MSE). In addition, the three statistical approaches were applied to a real data example from the Korean National Health Insurance Database. Effect of cumulative rosiglitazone dose on the risk of hepatocellular carcinoma was used as an example for illustration. Results In the simulated data, time-dependent Cox regression outperformed the landmark method in terms of bias and mean squared error but the type I error rates were similar. The results from real-data example showed the same patterns as the simulation findings. Conclusions While both time-dependent Cox regression model and landmark analysis are useful in resolving the problem of guarantee-time bias, time-dependent Cox regression is the most appropriate method for analyzing cumulative dose effects in pharmaco-epidemiological studies.


Background
Extensive studies have elaborated and documented protective effect of aspirin for the primary prevention of cardiovascular diseases, including myocardial infarction (MI), stroke, coronary artery disease (CAD) [1][2][3][4][5][6]. Distinct from other non-steroidal anti-inflammatory drugs (NSAIDs), aspirin has the capacity to irreversibly inhibit cyclooxygenase (COX) and suppress thromboxane production. Findings of experimental studies have suggested that COX-2 suppression [7][8][9] or anti-platelet effect of aspirin [10,11] could interact with cancer cells and play a significant role in blocking tumor angiogenesis, invasiveness and metastatic potential. Accordingly, several observational studies have reported an inverse association between aspirin use and cancer incidence or mortality. However, the results have been inconsistent and controversy still remains regarding its anticancer effects [12][13][14][15], in part due to variations in study designs and statistical methods for analyzing the effectiveness of aspirin. For example, Fraser et al. [16] conducted a population-based study of breast cancer patients in Scotland, in which Cox's proportional hazard models was used to assess the relationship between aspirin use and survival. The authors found significantly reduced all-cause (HR = 0.53, 95% CI = 0.45 to 0.63) and breast cancer-specific mortality (HR = 0.42, 95% CI = 0.31 to 0.55) among participants who consumed aspirin following a diagnosis of breast cancer. Similarly, using Cox proportional hazards modeling, Jacobs et al. [17] reported that total cancer incidence is significantly lower among regular long-term aspirin users (RR = 0.84, 95% CI = 0.76 to 0.93). In contrast, a study by McMenamin and colleagues [18] found a non-significant decrease in the risk of lung cancer-specific mortality (p = 0.5975) with aspirin use. In this study, the authors used a time-dependent Cox model whereby low-dose aspirin usage was treated as time-varying covariate, with users not considered to be exposed until after a lag of six months following initial prescription. Further, Cardwell et al. [19], in their conditional logistic regression analysis of nested case-control data, also found no evidence for protective effects of low-dose aspirin usage against colon cancer-specific mortality (OR = 1.06; 95% CI = 0.92 to 1.24).
The above-mentioned studies of aspirin use show that when evaluating drug effects, results are often confounded by time-related biases that tend to exaggerate the protective effects, even when the drug has no or little effect [20]. Timerelated biases can be further categorized into guarantee-time bias (also known as immortal time bias or time-dependent bias), time-window bias, or time-lag bias. For the purpose of this paper, we focus on guarantee-time bias, which is frequently encountered in time-to-event analyses evaluating drug effects. Guarantee-time bias refers to bias arising from incorrect handling of the time from the beginning of followup to the first treatment exposure [21]. The bias occurs if treated patients are assumed to have already been treated from time of cohort entry. For example, in a study of patients with heart disease, heart transplant is a time-varying treatment. While some patients may receive a transplant at the start of follow-up, others may undergo transplant long after the beginning of follow-up or die before undergoing the transplant. Failure to account for time-varying feature of the treatment can result in biased estimation of the treatment effect, in this case, in favor of the treatment group [22].
This study aimed to evaluate the extent of guaranteetime bias in estimation of drug effects on disease outcomes. We conducted simulation studies in which timefixed, time-dependent Cox model, and landmark analysis were compared for handling guarantee-time bias. For illustrative purposes, effect of cumulative rosiglitazone dose on the risk of hepatocellular carcinoma was investigated using the Korean National Health Insurance Data.

Guarantee-Time bias
Guarantee-time bias in cohort studies can distort the results in favor of the treatment group, depending on the type of event (if it benefits the patient or not) [21,23,24]. In detail, consider this study as a cohort study and the situation where drug effect and event occurrence are independent. Event is defined as the disease of interest. Define the random variables W and T 0 as the time to initiation of drug usage and time to occurrence of event. The binary random variable Z is defined to be Z = I[W ≤ T 0 ], which represents drug usage. The random variables T 0 and T 1 are defined as the time to event conditional on Z = 0 or 1 respectively, i.e. T = (1 − Z)T 0 + ZT 1 . As shown in Fig. 1, the person whose value of T 0 is smaller than W is allocated to the non-user group (person #1). Otherwise, individuals are allocated to the drug-user group, according to cumulative drug exposure (persons #2 to 4). In other words, to have received the treatment implies that the subject survived or was eventfree, up to the initiation of drug use. As regards drug-user groups, the longer the survival times, the higher the probability of belonging to the high-dose group. This is the guarantee-time bias phenomenon, i.e. subjects must have lived long enough to receive the treatment and consume large amounts of drug.

Statistical modeling for Guarantee-Time bias
Our statistical modeling closely follows the paper by Nam and Zelen [25]. For convenience, we consider the situation where the drug usage is binary. The probability density functions of W , T 0 , T 1 will be denoted by g(w) , q 0 (t) and q 1 (t) respectively. The survival functions will be de- Note that by definition Z = 1 if the time for drug usage is observed. If the drug usage does not cause a change in the survival function, the conditional density function of non-user group f(t| z = 0) and drug-user group f(t| z = 1) would be the same.
Under the hypothesis that q 0 (t) = q 1 (t), if we simply compare the survival functions of the two groups according to the drug usage, the two groups of conditional density functions are not equal to f(t| z = 0) ≠ f(t| z = 1), as it has been proven in the paper by Nam and Zelen [25]. Therefore, it is not appropriate to simply compare the survival functions of two groups according to drug usage.

Landmark method
In the landmark method, selected landmark time τ 0 is set, and binary random variable Z(τ 0 ) is defined as Z(τ 0 ) = I(W < τ 0 | T > τ 0 ). The conditional density functions for Z(τ 0 ) can be computed as follows [25]: Hence if q 0 (t) = q 1 (t), it can be seen that the two functions are the same as As a result, it is possible to make a valid comparison of two survival functions between the drug-user and non-user groups using the landmark method. In this study, landmark analysis was conducted using time-fixed Cox regression with drug exposure status defined prior to landmark time. One thing to mention in the landmark method is that it is very important to carefully select the landmark time.

Time-dependent Cox regression model
In a time-dependent Cox regression model, a timedependent covariate in the model tracks whether the classifying event has occurred during the estimation process [23]. This method eliminates guarantee-time bias by using drug usage as a time-dependent covariate; subjects are classified as unexposed until the start of drug usage and exposed thereafter [26][27][28]. The time-dependent Cox regression model has the advantage of using all study follow-up data since it starts analysis at the time of cohort entry. By including all data, this method has increased statistical power over the landmark method [29].

Simulation setting
We conducted extensive simulations to compare the performance of our three methods, Cox regression, time-dependent Cox regression and landmark method, in terms of empirical type I error and power. We used a discrete-time survival model to describe the effect of cumulative drug dose on the outcome using three indicator variables Z 1 (t) , Z 2 (t) , Z 3 (t) which represent low, moderate and high-drug dose groups. We set N as total sample size and d as the number of drugusers. Among N subjects, d subjects consumed drug for different durations and the dosage was fixed at a constant dose of 0.5 per unit of time. We assumed that drug is taken continuously without interruption until the end of the study or event.
We divided the total observation period (10-year) into ten intervals as needed for the discrete-time survival model. The overall simulation setting is represented in Fig. 2. We assumed that the drug user rate is 5%. First, we randomly assigned time to initiation of drug use between 0 and 10 for each drug-user. Using these initiation times, we calculated cumulative dose at each time t and categorized each subject into four groups: non-user group and low(Z 1 (t) = 1), moderate(Z 2 (t) = 1), high-dose(Z 3 (t) = 1) groups. At each discrete time point, the probability of disease occurrence at time t is as follows: Fig. 1 Guarantee-Time Bias Considering four people, time to initiation of drug use w i and a time to event t i were randomly generated for each person. The person whose value of T 0 is smaller than W is allocated to the non-user group (person #1). Otherwise, individuals are allocated to the drug-user group, according to cumulative drug exposure (persons #2 to 4) If the event occurred within the observation period, survival time is the point of event occurrence. On the other hand, if the event did not occur, survival time is equal to 10 years and censored. For the landmark analysis, we considered subjects whose starting point of drug intake is before landmark time as drug-users, and otherwise nonuser. We used 5 and 7 as landmark times, respectively. Subjects who died or who were censored before the landmark time were excluded from the analysis.
Simulations were performed under two different scenarios. In the first scenario, type I error rates for each of the statistical methods were evaluated under the null hypothesis, whereby β 1 , β 2 and β 3 values were set to zero to represent a case of no significant protective effect of drug on disease outcome. For the power comparisons in scenario 2, we simulated data from the alternative hypothesis (i.e. cumulative drug dose is associated with disease outcome) for each of the three β 0 values denoting disease incidence: log(0.015), log(5 * 0.015) , and log(10 * 0.015). For each simulation setting, 1000 replication datasets were generated. All simulations were performed using the R statistical software [30].

Simulation results
The empirical type I errors of three methods under the null hypothesis of no association between cumulative drug dose and disease are illustrated in Table 1. Each value represents the total number of cases out of 1000 replications for which the null hypothesis is incorrectly rejected. As expected, Cox regression model with fixed covariate values tended to produce inflated type I error rates, especially in the moderate and high dose groups. Time-dependent Cox regression and landmark method, on the other hand, displayed well-controlled type I error   Table 2. Statistical power of the timedependent Cox regression approach was higher than that of landmark analyses in the high-dose group. In the case of moderate and low-dose groups, the landmark method using landmark time 5 demonstrated higher statistical power compared to the time-dependent Cox regression. This is attributed to the fact that there was extremely small number of cases for the high-dose group, compared with the moderate group. Table 3 summarizes the bias and mean squared errors (MSE) of hazard ratios for estimation of the association between cumulative drug dose and outcome. Time-dependent Cox regression had much lower bias than the landmark method and generally showed smaller MSE values compared with the other methods. For instance, for the low-dose group, when the β 0 value was log(0.015), the bias of the time-dependent Cox regression model was 0.0009, while the bias of the landmark analyses were 0.0673 and 0.0416 for τ = 5 and τ = 7, respectively.

Real data example
We applied our simulation findings to a real data from the Korean National Health Insurance Database (NHID). The sample was followed for 12 years from 2002 to 2013. The event of interest was incidence of hepatocellular carcinoma (HCC). Among 47,738 patients with incident diabetes, 203 hepatocellular carcinoma cases were identified. Full details of the study design have been described elsewhere [31,32]. In the current study, total prescribed doses of rosiglitazone were calculated at each year and summed to produce cumulative doses of follow-up years. Discretetime survival analysis was used to take into account of the intermittent administration of rosiglitazone and the varying dosage between patients. Based on cumulative dose of rosiglitazone use during the study period, drug users were categorized into low (<1350 mg), moderate (1350-4499 mg), high groups (≥4500 mg). In the landmark analysis, sixth year was chosen as a landmark time since the number of incident HCC was balanced at that time point. Data analysis was carried out using the SAS statistical software version 9.4 [33].
Results based on the NHID data are illustrated in Table 4. In the Cox regression, the risk of HCC incidence was lowest among subjects exposed to high cumulative doses of rosiglitazone (HR = 0.443, 95% CI = 0.218 to 0.899). However, the protective pattern of dose-response relationship and effect sizes were

Discussion
In this study, we examined the phenomenon of guaranteetime bias through statistical modeling and simulation study. Specifically, our simulation study assessed the performance of three methods, namely Cox regression, timedependent Cox regression and landmark method based on time-fixed Cox regression. These methods depend upon the proportional hazard assumption. According to our simulation results, time-fixed Cox regression was shown to be vulnerable to guarantee-time bias [20,22]. Pharmaco-epidemiological studies typically involve the use of time-varying exposures, such as cumulative dose.
Hence, under such situations, applying the time-fixed Cox regression approach can induce bias due to model misspecification. Our results are in accordance with results previously reported by Mi et al. [29]. However, unlike previous studies, we performed a simulation by including cumulative dose groups as exposures. Landmark analysis has been suggested in previous studies as a simplified alternative to time-dependent Cox regression for elimination of guarantee-time bias in time-to-event data. In our simulation, landmark method was comparable to the time-dependent Cox regression method in terms of type I error. But landmark analysis tended to slightly increase MSE. One possible explanation for these results could be the sample size. For landmark analysis to produce efficient estimates, it is imperative that optimal landmark time is specified. As illustrated by our results, if the landmark point is too early, there is a greater possibility of imbalanced observations among treatment groups. On the other hand, if the landmark point is too late, a significant proportion of events may be omitted, giving rise to insufficient number of cases to achieve adequate power. Thus, it is important that landmark studies are designed with sufficient number of participants to maintain adequate statistical power. Additionally, the landmark method will produce estimates with minimal bias conditional upon the treatment being evenly distributed across the follow-up [29]. Recently, the use of landmark super models, a pooled summary analysis of several landmarks, has been advocated to remedy the problem of low statistical power related to the landmark method [34,35]. Further studies are warranted to compare the performance of landmark super models with the time-dependent Cox model.
Nevertheless, the landmark approach has several advantages over the time-dependent Cox regression model. Most notably, this method has the advantage of computational simplicity because drug use is defined as a timefixed covariate by using landmark time. Thus, one can visualize the survival curve of drug users using the Kaplan-Meier method. But, unconditional Kaplan-Meier estimates of time-varying status of drug users are unstable because the number of drug users can be quite small at early time point [36].

Conclusions
In conclusion, to avoid guarantee-time bias in observational studies of drug effects, we recommend incorporating time-dependent exposure status in the analysis. While both time-dependent Cox regression model and landmark analysis were found to be useful in resolving the problem of guarantee-time bias, time-dependent Cox regression was the most appropriate method for analyzing cumulative and long-term drug exposure. We recommend the time-dependent Cox regression for estimating hazard ratios of cumulative doses. Alternatively, due to its graphical capabilities, the landmark method may be a suitable alternative for visualizing survival curves for treatment groups.

Availability of data and materials
The data used in this study was simulated. The simulation program is in R and available from https://github.com/jihyeon1/yonsei_biostat.
Authors' contributions ISC, YRC, JHK, HRY, SYJ, and GRK contributed equally to the conception, analysis/interpretation of data, and drafting of the manuscript. CMN edited and revised the manuscript. All authors have read and approved the final manuscript.