Simulation of anticoagulation in atrial fibrillation patients with rivaroxaban—from trial to target population

The populations included in randomized controlled clinical trials and observational studies differ, and the effectiveness and safety of rivaroxaban for stroke prevention in patients with atrial fibrillation (AF) have varied among studies. This study aimed to estimate the real-world outcomes of rivaroxaban in patients with AF accurately. A discrete event simulation (DES) was used to predict the counterfactual results of the ROCKET AF study. The hypothetical cohorts of patients were generated using Monte Carlo simulation according to the baseline covariate distributions that matched the marginal distribution of covariates reported in the ROCKET AF and three observational studies. The DES model structure was constructed based on a priori knowledge about disease progression and possible outcomes of patients with AF. The DES model accurately replicated the overall results of the ROCKET AF study. Both predicted stroke/systemic embolism (SE) and major bleeding rates were lower in the three observational studies than in the simulated ROCKET AF study. The risk difference of stroke/SE and major bleeding was not significant among the predicted outcomes of the three observational studies. Although some differences existed in the absolute rates of stroke/SE and major bleeding between observed and simulated studies, the results confirmed that rivaroxaban was noninferior to warfarin for the prevention of stroke/SE with no significant difference in the risk of major bleeding in large AF populations, which was similar to the results of ROCKET AF.


Introduction
Atrial fibrillation (AF) is the most commonly diagnosed and treated arrhythmia in clinical practice, with an increasing health burden. In the United States, 2.7 to 6.1 million individuals currently suffer from AF, and it is estimated to be prevalent in more than 8 million people by the year 2050 [1,2]. Stroke is the most feared complication of AF and is usually prevented by oral anticoagulation [3]. Non-vitamin K antagonist oral anticoagulants (NOACs) have become an alternative to vitamin K antagonists for preventing stroke in patients with AF [4]. Rivaroxaban, one of the most commonly used NOACs, was approved for stroke prevention in patients with AF based on the pivotal randomized controlled clinical trial (RCT), the ROCKET AF study [5]. This trial demonstrated the efficacy and safety of rivaroxaban in reducing AF-related stroke risk.
RCTs are conducted on highly selective populations and are managed in tightly controlled settings. Therefore, RCTs are considered the gold standard for assessing treatment efficacy, and the results have the highest reliability. Nevertheless, RCTs, such as the ROCKET AF study, usually exclude certain patient groups; for example, AF patients with a CHADS2 score of 0 to 1 were excluded [5], although they account for about 40% of the entire AF population in the real-world setting [6,7]. It is well known that AF patients with different stroke risks estimated by the CHADS2 score might receive different benefits from anticoagulation therapy [8]. Therefore, rigorous exclusions could limit the generalizability of evidence from RCTs, as the benefits and risks in the patient populations who are actually treated in real clinical practice may differ. Consequently, uncertainty exists when physicians make decisions about rivaroxaban anticoagulation for patients who fall outside the inclusion criteria of the ROCKET AF study.
Real-world observational studies using routine electronic healthcare databases, such as insurance claims data or registry data, are available for large and diverse patient populations, and could be used to capture rare adverse events and long-term outcomes, as well as provide outcome estimates of treatment effectiveness in broad patient populations. However, the results of observational studies often differ from those of RCTs, which might mainly result from the differences in patient characteristics, drug adherence, and outcome measurement across studies that differ in design [9]. The XANTUS study, a real-world, prospective, observational cohort study, described the use of rivaroxaban in a broad unselected AF patient population [6]. Patients in the ROCKET AF and XANTUS studies exhibited different baseline characteristics.
In the XANTUS study, a lower CHADS2 score and a lower proportion of patients with prior stroke, heart failure, hypertension, or diabetes mellitus compared to the ROCKET AF study were observed, which might have contributed to the outcome discrepancy between these two studies [10].
Moreover, the effectiveness and safety of rivaroxaban in patients with AF also varied among different observational studies. The incidence of stroke/systemic embolism (SE) was observed to be 0.8 events per 100 patient-years in the XANTUS study [6], whereas different rates (1.9 and 4.6 per 100 patient-years) were observed in two other observational studies [7,11] (Supplementary Table 1). Such significant variations in observational studies might result from differences in study design, data source, definition of outcomes, length of observation, analysis methods, etc. [12]. Although real-world studies could support and extend RCT findings to larger patient populations, the results could be biased.
To obtain generalizable results on the use of rivaroxaban in patients with AF, it is desirable to make the results of the ROCKET AF study and real-world studies complement each other. Generalizing the baseline characteristics of the RCT population to match those of real-world patients could facilitate the generation of evidence for effectiveness and safety of treatments in excluded populations, thus providing more relevant evidence for decision-makers [9]. Discrete event simulation (DES) is a method that can mimic disease pathways and outcomes over time as a function of treatment and patient-level covariates [13]. Using DES, we could generalize the results to real-world patients with a different distribution of characteristics from the RCT population. Previous studies [14,15] have used DES to estimate the percentage of patients with atherosclerotic cardiovascular disease who would require lipid-lowering therapy (LLT) intensification in real practice and to evaluate the impact of LLT intensification on cardiovascular events. In this study, we proposed a DES to predict the counterfactual outcomes of ROCKET AF had it been conducted in larger observational study populations.

Data sources and study population
The ROCKET AF study was selected as an example of an RCT that evaluated the efficacy and safety of rivaroxaban versus warfarin in patients with AF. XANTUS, a prospective observational study, was chosen to investigate whether the results regarding the effectiveness and safety of rivaroxaban obtained in the ROCKET AF study could translate into real-world clinical practice. Two other retrospective observational studies (Laliberté, 2014 [7] and Amin, 2017 [11]) were used to assess the effectiveness and safety of rivaroxaban versus warfarin in routine care. Patient baseline characteristics of the four studies were collected, including age, sex, previous thromboembolic events (stroke, SE), transient ischaemic attack (TIA), myocardial infarction (MI), heart failure (HF), hypertension, diabetes mellitus (DM), and CHADS2 risk of stroke. The incidence rates of events, such as stroke/SE, major bleeding, intracranial haemorrhage (ICH), gastrointestinal (GI) bleeding, and MI, were extracted as events per 100 patient-years.

Discrete event simulation (DES) model
A DES model was developed and reprogrammed in Python (version 3.7, The Python Software Foundation, USA). The hypothetical cohorts of patients were generated according to the baseline covariate distributions that matched the marginal distribution of covariates reported in the ROCKET AF, XANTUS, and two other observational studies (Laliberté, 2014 [7] and Amin, 2017 [11]), using Monte Carlo simulation by random sampling. Random sampling continued until 7000 patients were simulated for each treatment group, which was similar to the sample size of the ROCKET AF study. Fig. 1 presents the DES model structure built based on a priori knowledge about disease progression and possible outcomes of patients with AF receiving rivaroxaban [3,16]. The model was designed to predict treatment outcomes based on patients' baseline characteristics. The CHADS2 score of individual patients, representing their stroke risk, was calculated according to the simulated patients' baseline characteristics. Therefore, patients at different stroke risk would trace different probabilistic pathways in the model based on their treatment assignment (rivaroxaban or warfarin). The incidence of events in the simulation model was obtained based on the rates reported in the ROCKET AF study (Supplementary Table 2) [5,17-21]. Although the CHA2DS2-VASc score is now recommended in the clinical guidelines for stroke risk assessment of patients with AF, the CHADS2 score was the mainstream score for stroke prediction when the ROCKET AF study was conducted. Some patients with a CHADS2 score of 0 could obtain a CHA2DS2-VASc score of 2 to 3, and those patients are eligible for anticoagulation therapy. As patients with a CHADS2 score of 0-1 were excluded from the ROCKET AF study, the incidence of the events used in the model for this subset of patients was extrapolated based on the Randomized Evaluation of Long-Term Anticoagulation Therapy (RE-LY) trial, which investigated the efficacy and safety of dabigatran in AF patients.
Cardiovascular events, such as stroke/SE, major bleeding (ICH, GI bleeding, etc.), non-major clinically relevant (NMCR) bleeding, MI, and unknown death, were recorded during the simulated two-year follow-up period. The patients' cardiovascular profiles were updated from the first year to the second year. Therefore, the CHADS2 score of individual patients was calculated again when they entered the second year. Patients who suffered death, fatal stroke/SE, fatal major bleeding, or fatal MI were removed from the simulation model after their outcomes were recorded. The event rates, hazard ratios (HRs), and risk differences (RDs) were calculated and reported.
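The cohort generation and score-driven event sampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the marginal prevalences, the age distribution, all function names, and the linear rate function `annual_stroke_rate` are hypothetical placeholders; covariates are sampled independently from their marginals; and only a single event type (first stroke/SE) is tracked. The CHADS2 scoring itself follows the standard definition.

```python
import random

# Hypothetical marginal distributions, for illustration only (not study values).
PREVALENCE = {"hf": 0.30, "hypertension": 0.75, "diabetes": 0.25, "prior_stroke_tia": 0.20}
AGE_MEAN, AGE_SD = 72.0, 9.0


def sample_patient(rng):
    """Draw one patient, sampling each covariate independently from its marginal."""
    patient = {name: rng.random() < p for name, p in PREVALENCE.items()}
    patient["age"] = rng.gauss(AGE_MEAN, AGE_SD)
    return patient


def chads2(patient):
    """CHADS2: 1 point each for HF, hypertension, age >= 75, DM; 2 for prior stroke/TIA."""
    return (int(patient["hf"]) + int(patient["hypertension"])
            + int(patient["age"] >= 75) + int(patient["diabetes"])
            + 2 * int(patient["prior_stroke_tia"]))


def annual_stroke_rate(score):
    """Hypothetical event rate per patient-year that rises with the CHADS2 score."""
    return 0.005 + 0.005 * score


def simulate_patient(patient, rng, years=2):
    """Return (had_event, follow_up_years) for the first stroke/SE within follow-up."""
    for year in range(years):
        # Recompute the score at the start of each year, since age changes.
        rate = annual_stroke_rate(chads2(patient))
        time_to_event = rng.expovariate(rate)
        if time_to_event < 1.0:
            return True, year + time_to_event
        patient["age"] += 1.0
    return False, float(years)


def simulate_cohort(n_patients=7000, seed=0):
    """Simulate one arm and return the incidence per 100 patient-years."""
    rng = random.Random(seed)
    events, follow_up = 0, 0.0
    for _ in range(n_patients):
        had_event, time_at_risk = simulate_patient(sample_patient(rng), rng)
        events += had_event
        follow_up += time_at_risk
    return 100.0 * events / follow_up


if __name__ == "__main__":
    print(f"Incidence per 100 patient-years: {simulate_cohort():.3f}")
```

A fuller model would draw competing event times (major bleeding, MI, death) for each patient, apply a treatment-specific rate for each arm, and remove patients only after fatal events, as the text describes.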

Comparison of simulated and observed results in ROCKET AF
To validate the DES model, we compared the simulated results and observed results of the ROCKET AF study. The RDs and relative HRs (RHRs) of each outcome were calculated. RDs were estimated as the absolute risk difference of each outcome, and were calculated by subtracting the observed incidence from the simulated incidence [22]. HRs and 95% confidence intervals (CIs) were calculated using Cox proportional-hazards models. The RHRs were calculated by dividing the simulated HRs by the observed HRs for each outcome [22]. Both RDs and RHRs reflected the model error, which could have been caused by misspecification of the simulation structure or assumptions about input parameters. RDs around 0 (±0.10%) and RHRs close to 1 (0.90-1.10) represented low simulation model error.
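The two validation metrics reduce to simple arithmetic, sketched below. The function names are illustrative, and expressing the RD tolerance in the same units as the incidences (events per 100 patient-years) is an assumption about how the ±0.10% threshold is applied. The HR values in the usage example are the stroke/SE figures quoted elsewhere in this article (simulated 0.868, observed 0.79).

```python
def risk_difference(simulated_incidence, observed_incidence):
    """RD: simulated incidence minus observed incidence (same units for both)."""
    return simulated_incidence - observed_incidence


def relative_hazard_ratio(simulated_hr, observed_hr):
    """RHR: simulated HR divided by observed HR; 1 means perfect agreement."""
    return simulated_hr / observed_hr


def is_low_model_error(rd, rhr, rd_tolerance=0.10, rhr_bounds=(0.90, 1.10)):
    """Apply the thresholds stated in the text: |RD| <= 0.10 and 0.90 <= RHR <= 1.10."""
    return abs(rd) <= rd_tolerance and rhr_bounds[0] <= rhr <= rhr_bounds[1]


if __name__ == "__main__":
    rhr = relative_hazard_ratio(0.868, 0.79)
    print(f"RHR for stroke/SE: {rhr:.3f}")
```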

Comparison of simulated results among the ROCKET AF, XANTUS, and two observational studies
The simulated outcomes for hypothetical cohorts of patients with marginal covariate distributions similar to the XANTUS and two observational studies were predicted. Hypothetical cohorts of 7000 rivaroxaban patients and 7000 warfarin patients were simulated for the XANTUS and the other two observational studies. As XANTUS was a single-arm study with no baseline information about patients using warfarin, the baseline characteristics of patients using rivaroxaban were used for both arms in the simulation. The DES model developed and validated in the RCT population was used to estimate outcomes of the three observational study populations. This process was conducted by replacing the baseline characteristics of the RCT cohorts with those of the simulated cohorts of the observational studies. The event rates, RDs, HRs, and RHRs were estimated.

Statistical analyses
Demographics and clinical characteristics at baseline for cohorts of involved studies and simulated cohorts are summarized descriptively as means ± SD or proportions, as appropriate. Key summary measures of the DES model were reported as event rates (events per 100 patient-years), RDs, HRs, and RHRs, with associated 95% CIs estimated using the 2.5th and 97.5th percentiles. Simulation of ROCKET AF was repeated 10, 20, 50, 100, 200, 300, 500, 1000, and 2000 times, respectively, to obtain the optimal number of iterations for stable results. Finally, each simulation was repeated 1000 times with a computer running time of approximately 30 min. Analyses were performed using STATA software (version 13, StataCorp, College Station, Texas, USA) and Python (version 3.7, The Python Software Foundation, USA).
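A percentile-based 95% CI over repeated simulation runs can be computed as in the sketch below. The helper names are illustrative, the interpolation rule (linear interpolation between order statistics) is an assumption because the text does not specify one, and the replicate values in the usage example are synthetic rather than study results.

```python
import math
import random


def percentile(values, q):
    """Return the q-th percentile (0-100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def percentile_ci(replicates, level=95.0):
    """Return the (lower, upper) percentile interval, e.g. 2.5th and 97.5th for 95%."""
    tail = (100.0 - level) / 2.0
    return percentile(replicates, tail), percentile(replicates, 100.0 - tail)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for 1000 replicate estimates of a hazard ratio.
    replicates = [rng.gauss(0.87, 0.02) for _ in range(1000)]
    low, high = percentile_ci(replicates)
    print(f"95% percentile interval: {low:.3f} to {high:.3f}")
```

Repeating the calculation for increasing numbers of replicates (10, 20, 50, and so on) and checking when the interval stops moving is one way to judge how many iterations are enough.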

Baseline characteristics of observed and simulated patients
Table 1 (Ref. [5-7,11]) and Supplementary Table 3 outline the baseline characteristics of the observed and simulated patients in the four studies. The average age in each of these four studies was above 70 years. The proportion of each comorbidity varied among the different studies. A larger proportion (54.9%) of patients on rivaroxaban with prior stroke/TIA was included in the ROCKET AF study than in the three observational studies. Meanwhile, the proportion of patients with HF, hypertension, or DM was also higher in the ROCKET AF study. The mean CHADS2 score was 3.48 ± 0.94 in the ROCKET AF study, which was much higher than that in the three observational studies.

Simulation of ROCKET AF
The DES model accurately replicated the overall results of the ROCKET AF (Fig. 1 and Table 2). The simulation was repeated 1000 times to obtain robust and convergent results (Supplementary Figs. 1,2). The simulated incidence of stroke/SE and major bleeding was 1.718 vs. 1.980, and 3.463 vs. 3.379 per 100 patient-years for rivaroxaban and warfarin, respectively. The RDs between the simulated and observed results were relatively low for each outcome (Table 2). The simulated HRs comparing rivaroxaban and warfarin in the risks of stroke/SE and major bleeding were 0.868 (95% CI, 0.863-0.872) and 1.025 (95% CI, 1.021-1.029), respectively (Table 3, Ref. [5-7,11]). The RHRs between the simulated and observed results were approximately 1. The estimated risks and HRs thus closely matched the observed risks and HRs in the ROCKET AF study, indicating a low error of the simulation model.

Discussion
The ROCKET AF study and three observational studies (XANTUS, Laliberté (2014) [7] and Amin (2017) [11]) contributed to the clinical evidence for rivaroxaban in stroke prevention in patients with AF. However, the effectiveness and safety of rivaroxaban varied among the four studies. In this study, a DES model was proposed to predict the counterfactual outcomes of the ROCKET AF study that could have been observed in larger observational study populations. The DES accurately replicated the overall results of the ROCKET AF study. Counterfactual results of the ROCKET AF study obtained by using the populations in observational studies showed relatively lower stroke/SE rates and major bleeding rates than those in the simulated ROCKET AF. Moreover, most of the simulated HRs between the rivaroxaban and warfarin arms were similar to the corresponding observed HRs, indicating similarities in benefits and harms of rivaroxaban in patients with AF in the ROCKET AF study.
As an RCT, the ROCKET AF study investigating the efficacy and safety of rivaroxaban in AF patients is regarded as the gold standard. Nevertheless, the study was performed in selected AF patients with moderate-to-high risk of stroke (CHADS2 score ≥2 and mean score: 3.5), resulting in a lack of external validity and generalizability [23]. In comparison, real-world studies, such as the XANTUS, Laliberté (2014) [7], and Amin (2017) [11] studies, could reflect real-world treatment patterns among diverse populations and provide outcome estimates in broad patient populations. In fact, the results of observational studies often differ from those of RCTs and differ from each other. As baseline covariates, such as age and history of stroke, are also risk factors for the studied outcomes, differences in patient characteristics could be a common barrier, leading to outcome discrepancies across studies. As a result, the treatment effects might differ across different patient populations. In addition, differences in data sources, outcome measures, and patient adherence, as well as confounding bias of observational studies, contribute to the discrepant results. Furthermore, the follow-up periods were different in the XANTUS, Laliberté (2014) [7], and Amin (2017) [11] studies, ranging from 0.5 to 1 year, which was shorter than that in the ROCKET AF study (about 2 years). Considering that some rare events could not be detected and differences in low-incidence events might not be found during the short follow-up period, the absolute event rates and the relative benefits of rivaroxaban might be inaccurate.
To estimate the real-world effectiveness and safety of rivaroxaban in patients with AF accurately, we used the DES method to model the pathways and 2-year outcomes of rivaroxaban anticoagulation in AF patients. DES is a strategy for modeling disease pathways and outcomes over time as a function of treatment and patient-level covariates [9]. Several studies have used DES to simulate LLT in patients with atherosclerotic cardiovascular disease [14,15,24]. In addition, a recent study used baseline characteristics from two observational studies to replicate the efficacy and safety of dabigatran compared to warfarin in the RE-LY study with a DES model [25]. That study found that differences in patient populations can explain a substantial portion of observed differences in outcomes across studies. In this study, a Monte Carlo simulation was used to generate hypothetical cohorts of patients. The DES built in this study could keep track of patient-level covariates and account for the changes in patients' stroke and bleeding risk factors over time [9]. Therefore, the stroke and major bleeding risks could be modified as the patient grew older or had a greater comorbidity burden. Event rates and treatment effects could then be estimated based on predefined relationships between the outcomes and risk factors of stroke and bleeding. The baseline characteristics of ROCKET AF patients were generalized to match the baseline of patients treated in routine care, which facilitated the generation of evidence for the effectiveness and safety of rivaroxaban in excluded AF populations.
Our results indicated that the observed outcomes of the ROCKET AF, XANTUS, Laliberté (2014) [7], and Amin (2017) [11] studies differed from each other. However, the differences were not significant among the corresponding simulated studies. In the Laliberté (2014) [7] study, a wide discrepancy was found between the observed and simulated stroke incidence in the rivaroxaban arm, with an observed rate of 4.6 and a simulated rate of 1.097 per 100 patient-years. A similar trend was found in the observed and simulated incidence of major bleeding in the Amin (2017) [11] study. This inconsistency might be caused by the inherent limitations of real-world studies, such as short follow-up, unbalanced confounding bias, etc. Interestingly, stroke/SE incidence in the rivaroxaban group was similar in the simulated XANTUS, Laliberté (2014) [7], and Amin (2017) [11] studies (1.118, 1.097, and 1.318 per 100 patient-years, respectively), which was much lower than that in the simulated ROCKET AF study (1.718 per 100 patient-years). Patients enrolled in the ROCKET AF study were of moderate-to-high stroke risk. In comparison, the baseline characteristics of the three observational studies were similar and could represent the whole AF population. The stroke risk of patients in the three observational studies was much lower than that in the ROCKET AF study. Accordingly, the simulated stroke/SE incidence of the three observational studies might reflect the real-world stroke/SE rate in AF patients using rivaroxaban to some extent, as might the other simulated outcomes.
It is worth noting that most observed and simulated HRs between rivaroxaban and warfarin for each outcome were similar in our study, with most RHRs around 1. In terms of the HR for stroke/SE comparing rivaroxaban and warfarin, the simulated HRs in the Laliberté (2014) [7] and Amin (2017) [11] studies were 0.780 and 0.824, respectively, which were close to the observed HR of 0.79 in the ROCKET AF study. These results confirmed that rivaroxaban was non-inferior or even superior to warfarin for the prevention of stroke/SE in the real-world setting, which was in accordance with the results obtained in two previous meta-analyses reporting that HRs for stroke/SE comparing rivaroxaban and warfarin were 0.75 (95% CI, 0.64 to 0.85) and 0.83 (95% CI, 0.73 to 0.94) in the real-world setting, respectively [26,27]. With respect to the HR for major bleeding, there was no significant difference between groups, with the observed HR being 1.04 in the ROCKET AF study and the simulated HR being 1.034 in the Amin (2017) [11] study. The HRs for major bleeding obtained in this study were also similar to those reported in two previous meta-analyses considering real-world studies, with the HRs being 1.02 (95% CI, 0.95 to 1.10) and 0.99 (95% CI, 0.91 to 1.07), respectively [26,27]. Therefore, similar results for the effectiveness and safety were observed when comparing rivaroxaban and warfarin in patients with AF.
It could be easier to understand the difference in effectiveness between rivaroxaban and warfarin when the molecular mechanisms of these two drugs are clarified. As a traditional oral anticoagulant, warfarin exerts its activity through inhibition of the synthesis of vitamin K-dependent coagulation factors II, VII, IX, and X, as well as proteins C and S [28]. By comparison, rivaroxaban, as a NOAC, selectively inhibits free and clot-bound factor Xa, and further inhibits the generation of thrombin, thrombin-mediated activation of coagulation, and thrombin-mediated platelet aggregation [28]. It is notable that rivaroxaban can also exert an antiplatelet effect by acting through protease-activated receptor 1, which may lead to reduced frequency of atherothrombotic events and improved outcomes in patients [29].
This study has some limitations. First, the model error could not be neglected, as the DES model structure and pathway were built based on a priori knowledge regarding disease progression and possible outcomes of AF patients receiving rivaroxaban, and the model lacked a multivariable outcome prediction component. Second, the CHA2DS2-VASc score, rather than the CHADS2 score, is now recommended in the clinical guidelines for stroke risk assessment of patients with AF, as it has the advantage of identifying a subset of low-risk AF patients with a CHADS2 score of 0-1 [8,30]. However, in this study, the relationship between baseline characteristics and clinical outcomes was calculated according to the patient's CHADS2 score, as it was the mainstream score for stroke prediction when the ROCKET AF study was conducted. Third, as the ROCKET AF study excluded patients with a CHADS2 score of 0 to 1, the incidence of the events used in the simulation model for this subset of patients was based on the RE-LY trial, which investigated the efficacy and safety of dabigatran in patients with AF. This could also introduce some errors in the model. Moreover, individual-level information was not available in our study. Therefore, the bootstrapping method, which could preserve the covariance structure among the baseline characteristics of observational studies and could increase the accuracy of the simulation, could not be used. In addition, the covariates and outcomes of observational studies might be imprecise, as the data were not originally recorded for research purposes and some vital information might be missing. All these factors might have negatively impacted the accuracy of the simulation model and the predicted outcomes.

Conclusions
To estimate the real-world effectiveness and safety of rivaroxaban in patients with AF, the DES method was used to model the pathways and 2-year outcomes of rivaroxaban anticoagulation in these patients. The simulated event incidences of the observational studies, such as stroke/SE incidence and major bleeding incidence, were lower than those in the simulated ROCKET AF study and might reflect the real-world event rates in patients with AF. Moreover, the results confirmed that rivaroxaban was noninferior to warfarin for prevention of stroke/SE with no significant difference in the risk of major bleeding in large AF populations, which was similar to the results of the ROCKET AF study.

Fig. 1. The structure of the discrete event simulation (DES) model. The DES model structure was built based on a priori knowledge about disease progression and possible outcomes of AF patients. The model was designed to predict treatment outcomes conditional on patients' baseline characteristics. Patients at different stroke and major bleeding risk would trace different probabilistic pathways in the model based on their treatment assignment (rivaroxaban or warfarin). DES could keep track of patient-level covariates and account for the changes in patients' stroke and bleeding risk factors over time. Therefore, the stroke and major bleeding risk could be modified as the patient grew older or had a greater comorbidity burden.

Table 3. Comparisons of each outcome between observed results and simulated results in ROCKET AF, XANTUS, and two observational studies.
SE, systemic embolism; MB, major bleeding; ICH, intracranial hemorrhage; GIB, gastrointestinal bleeding; MI, myocardial infarction; HR, hazard ratio; 95% CI, 95% confidence interval; RHR, relative hazard ratio. RHR was calculated by dividing the simulated HR by the observed HR. An RHR of 1 indicates no difference between simulated outcomes and observed outcomes.

Table 4. Simulated results and the risk difference for rivaroxaban arms in the simulation.
SE, systemic embolism; MB, major bleeding; ICH, intracranial hemorrhage; GIB, gastrointestinal bleeding; MI, myocardial infarction; RD, risk difference (simulated event rate of observational studies minus simulated event rate of ROCKET AF). The lower and upper limits of the 95% confidence interval for the RD between two studies were calculated using the website http://vassarstats.net/ based on methods described by Robert Newcombe derived from a procedure outlined by E.B. Wilson in 1927.
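The Newcombe interval for a difference between two independent proportions, built from Wilson score intervals, can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of the website's calculator: the function names are hypothetical, the version without continuity correction is used because the footnote does not say which variant was chosen, and treating an event rate as a proportion with a known denominator is itself an assumption.

```python
import math


def wilson_interval(events, n, z=1.959964):
    """Wilson score interval for a single proportion (no continuity correction)."""
    p = events / n
    denominator = 2.0 * (n + z * z)
    centre = 2.0 * n * p + z * z
    spread = z * math.sqrt(z * z + 4.0 * n * p * (1.0 - p))
    return (centre - spread) / denominator, (centre + spread) / denominator


def newcombe_difference_ci(events1, n1, events2, n2, z=1.959964):
    """Return (difference, lower, upper) for p1 - p2 using Newcombe's hybrid score method."""
    p1, p2 = events1 / n1, events2 / n2
    l1, u1 = wilson_interval(events1, n1, z)
    l2, u2 = wilson_interval(events2, n2, z)
    difference = p1 - p2
    lower = difference - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = difference + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return difference, lower, upper


if __name__ == "__main__":
    # Generic textbook-style counts (56/70 vs. 48/80), not data from this study.
    difference, lower, upper = newcombe_difference_ci(56, 70, 48, 80)
    print(f"difference = {difference:.4f}, 95% CI {lower:.4f} to {upper:.4f}")
```

Because each limit combines the Wilson limits of the two proportions, the interval is generally asymmetric around the point estimate and stays within the range -1 to 1.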