A treatment‐specific marginal structural Cox model for the effect of treatment discontinuation

Abstract Patients taking a prescribed medication often discontinue their treatment; however, this may negatively impact their health outcomes. If doctors had statistical evidence that discontinuing some prescribed medication shortened, on average, the time to a clinical event (e.g., death), they could use that knowledge to encourage their patients to stay on the prescribed treatment. We describe a treatment‐specific marginal structural Cox model for estimation of the causal effect of treatment discontinuation on a survival endpoint. The effect of treatment discontinuation is quantified by the hazard ratio of the event hazard rate had the population followed the regime “take treatment a until it is discontinued at some time ν,” versus the event hazard rate had the population never discontinued treatment a. Valid causal analysis requires control for treatment confounding, regime confounding, and censoring due to regime violation. We propose new inverse probability of regime compliance weights to address the three issues simultaneously. We apply the framework to data from the Global Anticoagulant Registry in the FIELD–Atrial Fibrillation (GARFIELD‐AF) study. Patients from this study are treated with one of two types of oral anticoagulants (OACs). We test whether the causal effect of treatment discontinuation differs by type of OAC, and we also estimate the size and direction of the effect. We find evidence that OAC discontinuation increases the hazard for certain events, but we do not find evidence that this effect differs by treatment.

anticoagulant (OAC). There are two main classes of OAC-vitamin K antagonists (VKA) and non-vitamin K oral anticoagulants (NOACs), with NOACs being the newer of these two classes. 3,4 The VKAs and the NOACs act differently within the body. Specifically, VKAs such as Warfarin inhibit the synthesis of various clotting factors, whereas NOACs such as dabigatran and rivaroxaban directly target a particular clotting factor. 4 Despite their potential to prevent stroke and clotting, 4 high levels of both VKA and NOAC treatment discontinuation (some has high as 53%) have been reported in the AF population. [5][6][7] This has sparked an interest in studying the causal effect of OAC discontinuation on endpoints such as stroke and death. Furthermore, because VKAs and NOACs work differently to reduce the risk of stroke, it is important to examine whether or not the effect of OAC discontinuation differs by class of OAC.
The Global Anticoagulant Registry in the FIELD-Atrial Fibrillation (GARFIELD-AF) study contains data on patients recently diagnosed with nonvalvular atrial fibrillation. Recruitment for the prospective study began in 2009 and was completed in 2016. Each patient in the study had at least one risk factor for stroke and agreed to 2 years of follow-up. 8,9 A rich set of baseline information was collected on each patient, including: age, gender, race, and medical history (e.g., hypertension, diabetes, heart failure, bleeding history). Time-varying information was also collected on each patient. This included if and when a patient had a stroke, myocardial infarction (MI), left atrial appendage procedure (LAAP), 10 or various bleeding events (e.g., minor bleed; major bleed; nonmajor, clinically relevant bleed) as well as if and when a patient discontinued treatment. Refer to Kakkar et al. 8 for additional details on the information obtained throughout the study. We focus on the 23,882 patients from cohorts 3-5, whose initial treatment was either VKA or NOAC. Interest lies in estimating the causal effect of OAC discontinuation on the following endpoints: death, cardiovascular death, stroke/systemic embolism (SE), acute MI, as well as combinations of these endpoints.
According to the GARFIELD-AF Steering Committee, most patients that do not permanently discontinue treatment, but instead temporarily get off treatment, are off treatment for just one to 7 days before getting back on. For this reason, we formally define treatment discontinuation as being off treatment for at least seven consecutive days. Under this definition, 2902 (1399 from VKA group; 1503 from NOAC group) of the 23,882 AF patients discontinued treatment during follow-up, and prior to regime violation, which is discussed later in this section. In both the group of patients that started on VKA and the group of patients that started on NOAC, most patients that discontinued treatment did so early into follow-up. Refer to Figure 1 for a visual representation of when (how long into follow-up) the GARFIELD-AF patients discontinued treatment.
While the time-dependent (TD) Cox proportional hazards (PH) model is often used to analyze survival data with time-varying covariates, it has been shown than in certain cases, using a TD Cox PH model to estimate the causal Months Percent of Patients (among those that discontinued) Treatment NOAC VKA F I G U R E 1 Months from start of treatment to discontinuation effect of treatment (or in this case, treatment discontinuation) on a given survival endpoint can yield biased estimates of the causal effect. 11,12 In particular, these issues arise when there exists a TD confounding variable such that past exposure predicts the TD confounder. We call this "regime confounding." This is the case for the treatment discontinuation problem we consider. For example, consider the time-varying information on major bleeding recorded in the GARFIELD-AF study. The occurrence (or absence) of a major bleed is predictive of treatment discontinuation (or persistence), and it is prognostic for clinical endpoints such as death. Furthermore, whether or not an individual experiences a major bleed is influenced by the individual's previous choice to either stay on the drug or to discontinue it. Thus, analyzing the causal effect of OAC discontinuation in the GARFIELD-AF patients warrants an advanced approach.
Structural failure time models 13,14 and marginal structural models (MSMs) 12,15 are two common approaches for estimating causal effects in the presence of TD confounders that are themselves affected by past exposure. MSMs are an appealing choice because they resemble standard models, unlike structural failure time models. 15 This makes them easy to interpret. Yang et al. 16 used MSMs to estimate the causal effect of treatment discontinuation on a survival endpoint. Specifically, they accomplished this by first casting the intervention, in this case treatment discontinuation, as a treatment policy of the form "take treatment until you discontinue treatment at time ν," where ν is any positive real number. The authors then proposed using a dynamic regime MSM to estimate the population hazard rate of having a clinical event had all patients in the population followed the treatment policy dictated by "ν." By computing the hazard ratio of the event hazard had all patients followed the time-to-discontinuation regime "ν," relative to the event hazard had all patients remained on treatment (never discontinued), they were able to quantify the causal effect of treatment discontinuation on survival. While this approach allows intervention to be a function of the time-to-discontinuation time "ν," it does not allow it to be a function of more than one treatment. In the context of fixed treatment regimes (as opposed to dynamic regimes, where the treatment policy takes into account each patient's evolving outcomes), we extend the framework in Yang et al. 16 to the case where each patient's initial treatment is not necessarily the same. We do this by considering the following treatment-specific time-to-discontinuation policy: "take treatment a (VKA or NOAC) until you discontinue treatment a at time ν"; however, the extension is not straightforward. Yang et al. 16 took for granted that a patient will neither switch treatments throughout follow up nor get back on a treatment once he/she has discontinued treatment. If either of these two events do occur, the treatment-specific time-to-discontinuation regime that we are interested in is violated. 3100 (1738 from VKA group; 1362 from NOAC group) of the patients in the GARFIELD-AF data have violated the regime in one of these two ways. Two naive approaches to account for the issue of regime violation are to (1) ignore when a patient has violated the regime and use the initial treatment (VKA/NOAC) to determine which treatment group the patient belongs to or (2) completely remove patients that violate the regime from the analysis. Unfortunately, both of these solutions potentially bias the analysis. If the first approach were taken, patients from both OAC groups would be used to represent a single OAC group-contaminating the desired treatmentspecific analysis. Additionally, any potential effect of discontinuation on survival may be weakened because patients that discontinued treatment, but then later got back on treatment, would still be included in the group of patients that discontinued. If the second approach is taken, the resulting sample may not be representative of the population-biasing results. This would be the case if patients that violate the regime are systematically different than patients who do not. In this paper, we address regime violation by artificially censoring patients when they violate the treatment-specific time-to-discontinuation regime. Artificially censoring patients in this way may induce informative censoring, so it must be appropriately accounted for in the analysis. We propose inverse probability of regime compliance (IPRC) weights to appropriately adjust for artificially censoring these patients, as well as to adjust for any treatment confounding and regime confounding that may exist. We apply this method to the GARFIELD-AF data in order to test whether the effect of OAC discontinuation differs by treatment and to estimate the (potentially treatment-specific) effect of OAC discontinuation.
The remainder of this paper is organized as follows. Section 2 introduces the statistical framework used throughout the paper. This includes the notation, the formal specification of the causal parameter of interest, assumptions, and the treatment regime MSM for the hazard of treatment discontinuation. In Section 3, unbiased estimating equations for estimation of the treatment regime MSM and the asymptotic behavior of these estimators is discussed. The performance of the proposed IPRC weighted estimator is evaluated and compared against two naive estimation methods via simulation studies, and then in Section 4 the IPRC weighted estimator for β Ã a is applied to the GARFIELD-AF data. The results of this analysis are discussed in Sections 4.2 and 4.3. We end with a brief conclusion.

| Notation
Suppose a sample of patients initiated on one of two treatment options are followed over time, where it is possible that some patients discontinue their treatment during follow up. Suppose further that the endpoint of interest is a clinically relevant failure time (e.g., time-to-death), which may be censored for certain individuals. Let X i denote the p Â 1 vector of observed baseline covariates for patient i, and let A i be his/her initial treatment assignment, where A i 0, 1 f g. For the GARFIELD-AF analysis, we set A i ¼ 0 if VKA is the initial treatment and A i ¼ 1 if NOAC is the initial treatment. Let T i and C i be patient i's possibly unobserved failure time and time-to-censoring, respectively, and let Þ is the indicator that the clinical outcome was observed for patient i. Let D i be patient i's potentially unobserved time-to-treatment-discontinuation, and let Þ is the indicator that patient i discontinued treatment prior to the clinical event and censoring. Define the time-dependent discontinuation indicator at time t to be Additionally, let Q i l ð Þ be a q Â 1 vector of all time-dependent covariates, excluding Z V i l ð Þ, that are observed for patient i at time l as long as he/she is still at risk at time l. The collection of these time-dependent covariate vectors available on patient i at time t is then denoted Using the potential outcomes framework, 17,18 let D a ð Þ i be patient i's time-to-discontinuation had he/she taken treatment a. Let T a,v ð Þ i be the failure time that would be observed had patient i taken treatment a and discontinued that treatment at time ν, where ν IR þ , and let T a,∞ ð Þ i be the potential failure time had patient i never discontinued treatment a. Furthermore, define Q a,ν ð Þ i l ð Þ to be the q Â 1 vector of time-dependent covariates, excluding the time-dependent discontinuation indicator, that would be observed at time l had patient i been on treatment a until discontinuing treatment at time v. Similarly, the collection of time-dependent covariate vectors that would be available on patient i at time t, under the treatment regime dictated by a, ν ð Þ, is denoted by Q Recall that there are two ways that a patient can become inconsistent with the regime "take treatment a until discontinuing a at time ν," which we call "regime violation." If a patient switches treatment, for example, goes from VKA to NOAC or vice versa, then he/she becomes inconsistent with the regime of interest at the time treatment is switched. Additionally, a patient becomes inconsistent with the regime of interest the moment he/she gets back on any drug (VKA or NOAC) after already having discontinued initial treatment. Let R i denote patient i's potentially unobserved time-to-regime-violation, and let Þ is the indicator that patient i violated the regime of interest during follow up. When Γ Si ¼ 1, we artificially censor patient i at time S i due to regime violation. Let R a,ν ð Þ i be the potential outcomes version of R i , defined in the same manner as T a,ν ð Þ i and Q a,ν ð Þ i . We assume i ¼ 1, …,n independent and identically distributed copies We define the observed event counting process as Þand the observed at-risk process as The event and at-risk process under the policy dictated by a, ν ð Þ are written as N a,ν ð Þ i t ð Þ and Y a,ν ð Þ i t ð Þ, respectively.

| Model and assumptions
Define the treatment-regime-specific hazard of failure which is the hazard of failing had all patients followed the regime "take treatment a until discontinuing at time ν." This is the causal parameter of interest. We assume censoring is non-informative, which means that the censoring time is independent of the full set of potential variables. Under this assumption, we have where Δ a,ν ð Þ ¼ 1 T a,ν ð Þ ≤ C À Á . Section 5 provides a brief explanation on how to extend the proposed method to the case when censoring depends on the observed data. We consider the following treatment-regime-specific marginal structural Cox model for the causal parameter: Under the model given in (1), β a is the log of the relative hazard of failing had the population taken treatment A ¼ a and discontinued treatment, compared to if the population had taken treatment A ¼ a and stayed on treatment a (never discontinued). If the effect of discontinuation is modified by treatment, this difference in the effect of discontinuation on the log of the relative hazard of failure had the population taken NOAC, versus had the population taken VKA, is quantified by the parameter β 3Ã . Interest lies in testing whether the effect of treatment discontinuation on failure time differs by treatment (test β 3Ã ¼ 0) and in making inference on β a .
Due to the fundamental problem of causal inference that at most only one potential outcome can be observed for a particular subject, the parameters in the MSM are not identifiable with observed data in general. In order to estimate (1), we make three additional assumptions. First, we make the consistency assumption, 18,19 which states that the observed data are equal to the corresponding potential outcomes under the treatment regime that was actually followed. Specifically, if patient i followed the regime "take treatment a until discontinuing ν days after starting treatment", we assume that , respectively. For the counting process and at-risk process, the consistency assumption implies that The full set of potential variables is denoted by We define F Ai to be equal to F i , excluding A i ; we define F Di to be equal to F i , excluding D i and D a ð Þ i ; and we define F Ri to be equal to F i , excluding R i and R a,ν We make the no unmeasured confounders assumption, 19 under which the propensity score, the hazard of discontinuation, and the hazard of regime violation are independent of F A ,F D , and F R , respectively, given the observed data available at time t. Thus, these three quantities are defined as follows. The propensity of receiving treatment a as the initial treatment is the hazard of treatment discontinuation is and the hazard of regime violation is Using the definitions of the treatment discontinuation hazard and the regime violation hazard given in (2) and (3), respectively, we now define can be viewed as the probability that individual i did not discontinue treatment before time l and the probability that individual i did not violate the posited treatment regime before time l, respectively. The quantity f D ljH i l ð Þ È É can be interpreted as the probability that individual i discontinues treatment at some time within l, l þ dl ½ . Finally, the positivity assumption 20 we require is threefold. We require pr A ¼ ajX The positivity assumption ensures that for each time point l for which a patient is still at risk for discontinuing an OAC, it is possible for that patient to follow any of the treatment-discontinuation regimes still available at time l. It also ensures that the estimating equations given in Section 3 are well defined.

| Theory
Similar to Yang et al., 16 we define the following mean zero martingale process under the fixed treatment policy dictated by a,ν ð Þ: where Λ a0 l ð Þ is the cumulative baseline hazard at time l had the population taken treatment A ¼ a. If the full set of potential variables were observed on each individual in the observed data, estimators for β a and Λ a0 t ð Þ, denoted b β a and b Λ a0 t ð Þ, could be obtained using the following estimating equations: The estimating equations in (4) and (5) extend the equations in (10) of Yang et al. 16 to the case where the treatment policy is a function of both treatment and discontinuation (instead of just dis- β a is the maximum partial likelihood estimator 16,21,22 for β a and b Λ a0 t ð Þ is the Breslow estimator 16,23,24 for Λ a0 t ð Þ. Since, however, F is not observed on each individual, we approximate the solutions to (4) and (5) by solving the weighted observed-data estimating equations In (8), g . Refer to the Supporting information for a proof that (6) and (7) are unbiased estimating equations for β a and Λ a0 t ð Þ, respectively. We call the subject-specific, time-dependent weights, ω ai t ð Þ, inverse probability of regime compliance (IPRC) weights. Note that at any time t and for each individual i with A i ¼ a, only one of the three lines that make up ω ai t ð Þ in (8)  The (8) are not known, and so they must be estimated using the observed data and plugged into (8). We write the resulting estimated TD, subjectspecific weights as b ÞÂ1 vector of parameters and let α D be a f Â 1 vector of parameters. Suppose two separate TD Cox PH models are As discussed in Yang et al., 16 one way to obtain a root-n consistent estimator for θ D ν ð Þ=f D ν ð Þ, using the estimators just described, would be to set , which is root-n consistent for θ D ν ð Þ=f D ν ð Þ. A possible choice for the remaining weight function is to set θ R t ð Þ ¼ pr R ≥ t j A, X ð Þ , which can also be estimated using a Cox PH model-this time without the TD covariates. The propensity score model can be fit by logistic regression.
Choosing the weight functions in the manner described above yields stabilized weights. 15,16 Stabilized weights are desirable because they can improve the efficiency of the estimation of β a , compared to when other choices for the weight function are used. See Yang et al. 16 for a nice discussion of stabilized weights in the context of treatment regime MSMs. If the propensity model, the model for the hazard of regime violation, and the model for the hazard of treatment discontinuation are correctly specified, and b β a is estimated using the scheme for nuisance function estimation described above, then it can be shown that b β a is asymptotically Normally distributed. Yang et al. 16 recommend using the nonparametric bootstrap 25 to estimate the variance of b β a .

| Simulation study
We study the performance of the proposed IPRC weights via simulations. The simulation scenario builds upon the scenario discussed in Yang et al. 14 Specifically, we generate the treatment indicator, A, according to a Bernoulli distribution with mean equal to 0:5, and we generate G such that G $ exp 0:2 ð Þ. We then simulate a 1 Â 3 vector from a multivariate normal distribution with mean equal to 0:2G À 4 and covariance matrix equal to 0:7 jiÀjj for i,j ¼ 1, 2, 3. The vector represents the values of a time-dependent covariate, Q t ð Þ, at times t 1 ¼ 0, t 2 ¼ 5, and t 3 ¼ 10. We assume Q t ð Þ remains constant between times t 1 , t 2 , and t 3 . The time-to-treatment-discontinuation, D, is generated according to the proportional hazards model The time-to-regime-violation is generated according to the proportional hazards model . The time-to-event is generated according to the proportional hazards model For each observation, if R < T and A ¼ 1, the value of T associated with that observation is multiplied by 5, which means that regime violation affects failure time in one of the treatment arms. The time-to-censoring is generated according to the proportional hazards model . According to this data generating scheme, we have the following MSM: where β 2Ã þ β 3Ã a quantifies the relative hazard of patients who took treatment A ¼ a and never stopped treatment compared to those who took treatment A ¼ a and discontinued treatment. The parameter β 3Ã quantifies the difference in the effect of discontinuation (never stop treatment versus discontinue treatment) on the log of the relative hazard of failure had the population taken a ¼ 1, versus had the population taken a ¼ 0. Importantly, under this simulation scenario there is regime confounding because Q t ð Þ predicts discontinuation, it is affected after treatment discontinuation, and it is related to the time-to-event.
We compare three estimators for β 1Ã ,β 2Ã , and β 3Ã : (i) the estimator based on the proposed IPRC weights (treatmentregime-specific MSM method); (ii) the Naive 1 estimator, which is obtained by fitting a Cox PH model for failure time, adjusting for treatment, 1 À Z V t ð Þ, and the treatment Â 1 À Z V t ð Þ f ginteraction; and iii.) the Naive 2 estimator, which is obtained by fitting a Cox PH model for failure time that adjusts for the time-dependent covariate Q t ð Þ, in addition to the covariates specified for the Naive 1 estimator. Under the 6 parameter settings considered, only our proposed treatment-specific MSM approach consistently estimates β 1Ã , β 2Ã , and β 3Ã (see Table 1 and Figure 2). The naive approaches tend to underestimate β 1Ã and β 2Ã , and they tend to overestimate β 3Ã . Moreover, the 95% Wald confidence intervals, which were generated using the robust standard error output by R software, 26 achieve their nominal coverage under our MSM approach, but not under the naive approaches.

| IPRC weight estimation
For the GARFIELD-AF discontinuation analysis, the vector of time-varying covariates at time l, Q l ð Þ, consists of the following time-dependent indicators that are 1 if the event is true and 0 otherwise: whether or not a patient experienced a minor bleed, major bleed, or nonmajor clinically relevant bleed since treatment initiation; whether or not a patient has had a LAAP since treatment initiation; whether or not a patient has had a nonhemorrhagicstroke/SE or since treatment initiation; and whether or not a patient has had an MI since treatment initiation. This amounts to six TD indicatorsone for each event just described. When the endpoint of interest involves either nonhemorrhagic stroke/SE or MI, the T A B L E 1 Simulation results under the Naive 1 method, Naive 2 method, and our proposed treatment-specific MSM method under the three methods considered, and for each simulation scenario studied in Section 3.2 corresponding TD indicators for these events are excluded from Q l ð Þ. We consider 30 baseline covariates. After turning the categorical covariates into dummy variables, this amounts to 97 baseline covariates. See Table S1 for a summary of the baseline covariates.
Fitting the treatment regime MSM involves the following steps. First, we estimate π X i ð Þ using logistic regression. We then estimate the TD hazard for regime violation described in (3) using a TD Cox PH model, and we estimate the TD hazard for discontinuation given in (2) using a TD Cox PH model with estimated TD, subject-specific weights In the above, and also for the computation of b by fitting a Cox PH model for the hazard of regime violation given just X and A. The weights in (9) serve the same purpose as piece 2 in (8). Namely, to control for any bias induced by artificially censoring patients that violated the posited treatment policy. Because the denominator in (9) is a function of the fitted TD Cox PH model for the hazard of regime violation, the TD regime-violation model must be fit prior to fitting the TD Cox model for the discontinuation hazard. At this point, the only nuisance function that still needs to be estimated in (8) Þ are estimators for the discontinuation hazard and the discontinuation survival distribution given just X and A. They are also obtained by fitting a Cox PH model. Finally, the estimated TD, subject-specific weights b ω ai t ð Þ are computed, and β Ã 1 , β Ã 2 , and β Ã 3 are estimated by fitting a TD Cox PH model with weights equal to b Þb ω 0i . All of the model fitting was done using R. 26 For each modeling step described above, LASSO 27 variable selection was performed to reduce the dimension of X. The Cox PH models were fit using the coxph() function in the R package survival. 28,29 Mean imputation is used to handle missingness in the continuous covariates. Inference is carried out using robust variance estimates computed by the software. Results are described in the next section.

| Constant effect of discontinuation
Of the 23,882 patients considered, 3100 patients (1738 from VKA group; 1362 from NOAC group) violated the treatment policy and were artificially censored at that time, and 365 patients (216 from VKA group; 149 from NOAC group) were censored prior to violating the treatment policy. See Figure 3 for a bar plot of when regime violation occurred for the group of patients on VKA and the group of patients on NOAC. Before discussing the results of the final MSM fit, we   Table 2 for the full description of hazard ratio estimates and associated p-values.
Based on the results from fitting the final treatment regime MSM, the effect of OAC discontinuation does not significantly differ by type of OAC for the endpoints we considered (testing at α level ¼ 0:05, with smallest p-value = 0.145). Refer to Table 3 for the parameter estimates and associated p-values, and see Figure 4 for a forest plot of the failure hazard ratios for each endpoint had treatment been discontinued versus had treatment never been discontinued, by treatment group. Accordingly, we removed from the treatment regime MSM the interaction term for the interaction of OAC and treatment discontinuation, and we refit the model. See Table 4 for the results. We find that OAC discontinuation significantly increases the hazard for death ( b β 2Ã ¼ 0:48; p-value < 0.001), stroke/SE ( b β 2Ã ¼ 0:79; p-value < 0.001), MI ( b β 2Ã ¼ 0:60; p-value = 0.024), death/stroke/SE ( b β 2Ã ¼ 0:51; p-value < 0.001), and death/stroke/SE/MI ( b β 2Ã ¼ 0:51; p-value < 0.001). After using a Bonferroni correction to adjust for multiple comparisons, there still is a significant effect of OAC discontinuation on all of the previously mentioned endpoints, except for MI. For certain endpoints (death; cardiovascular death; death/stroke/SE; death/stroke/SE/MI) there is evidence that β 1Ã is significantly less than zero. This suggests that NOACs reduce the risk for having a clinical event, compared to VKAs, among patients who never discontinue treatment. This agrees with findings in the literature. 30 Finally, the analyses were also run using a 30 day definition of discontinuation, in order to see how robust the results are to changes in the definition of treatment discontinuation. The results are qualitatively similar to those under the 7 day definition of treatment discontinuation, but the effect size is slightly smaller. Refer to Tables S2 and S3.

| Time-varying effect of discontinuation
The model we have considered thus far assumes the effect of treatment discontinuation is constant over time. To study whether there is a time-varying effect of treatment discontinuation, we can add a term such as β Ã TD t À ν ð Þz ν t ð Þ to the current model, so that the treatment-regime-specific MSM becomes: In this way, the effect of treatment discontinuation is now a function of the duration of treatment discontinuation. We fit the model given in (10) to the GARFIELD-AF data, excluding the term β 3Ã az ν t ð Þ, as we already found that the effect of discontinuation does not significantly differ by type of OAC (see Section 4.2 and Table 3). After fitting the model to the various endpoints of interest, we find a significant time-varying effect of treatment discontinuation for death ( b β Ã TD ¼ À0:004; p-value ≤ 0.001), cardiovascular death ( b β Ã TD ¼ À0:003; p-value = 0.004), death/stroke/SE ( b β Ã TD ¼ À0:004; p-value ≤ 0.001), and death/stroke/SE/MI ( b β Ã TD ¼ À0:004; p-value ≤ 0.001). Refer to Table 5. For each of those endpoints, b β Ã TD is less than zero, indicating that the effect of discontinuation on those endpoints dampens over time.

| CONCLUSION
We consider a treatment-specific marginal structural Cox model for the effect of treatment discontinuation on a survival endpoint, and we propose IPRC weights for estimating the parameters of the MSM. The IPRC weights control for three potential sources of bias-bias due to non-randomized treatment, bias due to regime confounding, and bias caused by artificially censoring patients. The adjustment for this last source of bias, within the IPRC weights, is a key contribution of the proposed framework. Using this framework, we estimate the causal effect of OAC discontinuation in a population of patients with Atrial Fibrillation. We do not find evidence that the effect of OAC discontinuation differs by treatment (VKA/NOAC), but we do find evidence that treatment discontinuation increases the hazard for certain clinical events (death; stroke/SE; death/stroke/SE; death/stroke/SE/MI). Combining these findings with the insight that most patients that discontinue treatment do so relatively early after treatment initiation, it may be worthwhile for clinicians to emphasize the importance of remaining on OACs, especially early into treatment initiation. The proposed framework is widely applicable to other disease settings with treatment discontinuation and/or regime violation.
In the Application to GARFIELD-AF Study, we illustrate the application of the proposed method to multiple outcomes (death; stroke/SE; death/stroke/SE; death/stroke/SE/MI). When analyzing one outcome, for example, death, we treat other variables: stroke, SE, and MI as time-varying confounders. The strategy is useful to correct for confounding biases; however, one cannot study the effect of discontinuation on multiple outcomes simultaneously. One potential solution is to recast the problem in a competing risks setting, where the failure event is classified into one of several mutually exclusive types, and occurrence of one type of event precludes the occurrence of an event of another type. Extension of the proposed framework to competing risks is possible by changing the MSMs to eventspecific models. This will be an interesting topic for research in the future.
A limitation of this work is that it relies on a three-part assumption of no unmeasured confounders, which is a strong yet unverifiable assumption. Additionally, we have assumed that censoring is non-informative. If the censoring assumption is relaxed so that censoring can depend on the observed data, R can be reset to denote the time to censoring or regime violation, whichever comes first, and Γ Si can be reset as the indicator that either censoring or regime violation occurred during follow up. This scenario may require a more complicated model for R than the model described in this paper. Finally, the IPRC weights that we consider are the product of three inverse probability weights-which may be unstable if the denominator(s) of those weights is(are) close to zero. In such cases, the weights may need to be truncated in order to fit the model, and sensitivity analyses should be conducted in order to examine how sensitive the results are to different levels of truncation (e.g. compare results after truncating weights greater than: 50, 100, 150). Augmenting the IPRC weighting approach using outcome regression may help, but will be a future topic for research.