Transcatheter aortic valve implantation versus surgical aortic valve replacement for treatment of severe aortic stenosis: comparison of results from randomized controlled trials and real-world data

Objective Results from randomized controlled trials (RCTs) and real-world studies (RWS) appear to be discordant. We aimed to investigate whether data derived from RCTs and RWS evaluating long-term all-cause mortality of transcatheter aortic valve implantation (TAVI) versus surgical aortic valve replacement (SAVR) in patients with severe aortic stenosis (AS) were in agreement. Methods RCTs or RWS comparing TAVI and SAVR and reporting long-term (≥2-year follow-up) all-cause mortality were identified. We also carried out subgroup analyses to assess the effect in different subgroups. A pre-designed data extraction form including 5 domains and 26 items was used to explore the relationship between RCTs and RWS. Mortality and effect in different subgroups were evaluated using random-effects meta-analyses. Results Five RCTs (5421 participants; TAVI: 2759, SAVR: 2662) and 33 RWS (20839 participants; TAVI: 6585, SAVR: 14254) were identified. Pooled RCT analysis showed no difference in all-cause mortality between TAVI and SAVR (HR=0.97, 95% CI: 0.88-1.07; P=0.55). In RWS, TAVI was associated with an increased risk of all-cause mortality (HR=1.46, 95% CI: 1.26-1.69; P<0.001) compared to SAVR. Conclusion These results highlight the inconsistencies between RCTs and RWS in assessing long-term all-cause mortality in the treatment of AS using TAVI or SAVR, which may be caused by interactions of clinical characteristics or study design. RCTs as well as RWS are both developing and improving; the strengths of each approach to design, measurement, and evaluation can and should inform the other.

displacing and functionally replacing the native valve with a bioprosthetic valve, delivered via a catheter, irrespective of whether the femoral artery (transfemoral [TF] placement), the transthoracic placement (transapical [TAp] or transaortic [TAo] access), or other access was used. SAVR was defined as the treatment of AS through median or minimal sternotomy under cardiopulmonary bypass (CPB). RCTs or RWS were included if they enrolled patients with severe AS, compared TAVI with SAVR, and reported long-term all-cause mortality (follow-up ≥2 years). For studies with duplicated data, we included only the report with the most informative and complete dataset.

Assessment of Risk of Bias
Two researchers (DW, LH) assessed the methodological quality of the chosen studies. For RCTs, reviewers used the Cochrane risk of bias assessment tool [11] to evaluate the following seven items: randomization sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective reporting, and other bias. We defined other bias as company sponsorship of a trial or baseline characteristics that differed between intervention groups. For RWS, the Newcastle-Ottawa Scale (NOS) [12] was used to assess the risk of bias. Scores ≤6 were categorized as a high risk of bias, while scores >6 were considered a low risk of bias. The results were compared, and disagreements were resolved by discussion.

Data Extraction
Data derived from RCTs and RWS were extracted by two researchers (YZ, ZC) independently. A pre-designed data extraction form, including five domains and 26 items, was used to explore the relationship between data from RCTs and RWS. The number of events in each arm and the hazard ratio (HR) with the corresponding 95% confidence intervals (CIs) were extracted. HRs were given precedence over risk ratios (RRs), since they incorporate time-to-event data and allow for censoring. When HRs with corresponding variance were not presented, we calculated them from Kaplan-Meier curve data or summary data using an HR calculation spreadsheet provided by Tierney et al. [13] based on statistical methods reported by Parmar et al. [14]. Two reviewers independently completed all data extraction and disagreements were resolved through discussion or, if necessary, arbitration by a third reviewer (DK).
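As an illustration of one of the simpler summary-data conversions, the sketch below recovers the log hazard ratio and its standard error from a reported HR and 95% CI, then derives a two-sided P value under a normal approximation. It is a minimal sketch, not the Tierney spreadsheet itself: the function names are hypothetical, and it assumes the reported interval is symmetric on the log scale. The example input is the pooled RCT estimate from the abstract.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_and_se(hr, ci_low, ci_high, z=Z_95):
    """Return (log HR, standard error of log HR) from an HR and its CI.

    Assumes the interval is symmetric on the log scale, so that
    SE = (ln(upper) - ln(lower)) / (2 * z).
    """
    if not (0 < ci_low <= hr <= ci_high):
        raise ValueError("expected 0 < ci_low <= hr <= ci_high")
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return log_hr, se


def two_sided_p(log_hr, se):
    """Two-sided P value for H0: HR = 1, using a normal approximation."""
    z_stat = abs(log_hr / se)
    return math.erfc(z_stat / math.sqrt(2.0))


# Pooled RCT estimate reported in the abstract: HR=0.97, 95% CI 0.88-1.07
log_hr, se = log_hr_and_se(0.97, 0.88, 1.07)
p_value = two_sided_p(log_hr, se)
print(f"log HR = {log_hr:.4f}, SE = {se:.4f}, P = {p_value:.2f}")
```

Because the published HR and CI limits are rounded to two decimals, a P value recovered this way will only approximate the one reported.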

Statistical Analysis
The main outcome was long-term all-cause mortality (over a follow-up period of at least 2 years). We used the random-effects model to pool the data and evaluated statistical heterogeneity between summary data using the I² statistic. We performed subgroup analyses by the TAVI access route used for valve delivery (TF vs. TAp), sex (RCTs only), surgical risk (high- vs. non-high-risk patients), and type of TAVI (balloon-expandable vs. self-expandable); for RWS, geographic variation (Europe vs. North America vs. rest of the world) and whether the association of other procedures with SAVR (concomitant procedures vs. isolated

INTRODUCTION
Aortic stenosis (AS) is a common condition among elderly individuals and the prevalence of this disease increases with age, from 0.2% for ages 50-59 years to up to 10% for ages 80-89 years [1,2]. Without aortic valve replacement (AVR), symptomatic AS results in a life expectancy of <3 years [3]. The standard of treatment for patients with symptomatic severe AS is valve replacement, through surgical aortic valve replacement (SAVR) or by transcatheter aortic valve implantation (TAVI). So far, several randomized controlled trials (RCTs) comparing the two procedures have been conducted focusing on patients with high and low surgical risk. Three trials (Nordic Aortic Valve Intervention [NOTION] [4], Placement of Aortic Transcatheter Valve [PARTNER 1] [5] and CoreValve US High Risk Pivotal Trial [US CoreValve] [6]) have obtained 5-year follow-up results. RCTs confer the least biased estimates of treatment effects; their strict inclusion and exclusion criteria allow for recruiting participants most likely to benefit from an intervention. However, RCTs do not necessarily reflect real-world settings. Studies have shown that approximately 80% (82.2% in NOTION and 77.5% in PARTNER 1) of screened patients were excluded and not assigned to randomization [7]. On the other hand, real-world studies (RWS) may provide better generalizability to routine practice, as they often have less restrictive inclusion criteria, or tend to enrol "all comers". However, results of observational studies should always be interpreted cautiously because of the greater potential for bias in these studies. Comparison of these two study designs has indicated that their results may be conflicting [8,9]. This may be attributed to the complex interplay of factors, such as the varied clinical characteristics (patients, intervention, control, time [PICT]) in addition to the study design.
Therefore, this study compared data derived from RCTs and RWS on long-term all-cause mortality when treating AS with TAVI or SAVR and examined the clinical characteristics and clinical profiles that may underlie discrepancies.

METHODS
We followed the reporting standards for systematic reviews and meta-analyses according to the PRISMA statement [10]. The concordance and reasons for discrepancies were further evaluated after conducting a meta-analysis.

Search Strategy
EMBASE, MEDLINE, and the Cochrane Central Register of Controlled Trials (CENTRAL) were searched from inception to May 30, 2019. We used database-specific subject headings (e.g., MeSH terms) and free-text terms to search for potentially eligible studies (Supplementary Materials). We also searched ClinicalTrials.gov to identify additional relevant clinical studies.

Inclusion Criteria
In both RCT and RWS settings, the intervention/exposure was transcatheter aortic valve replacement (TAVI group) and the control/non-exposure was surgical aortic valve replacement (SAVR group). TAVI was defined as the treatment of AS by

Studies Retrieved and Characteristics
The electronic search yielded 3564 unique citations (Figure 1). After full-text and reference screening, 38 studies, including five RCTs (NOTION, PARTNER 1A, PARTNER 2A, SURTAVI [Surgical Replacement and Transcatheter Aortic Valve Implantation], and US CoreValve) and 33 RWS, met the eligibility criteria (details of the included studies are presented in the Supplement).
The results of risk of bias assessment for the 38 included studies are presented in Figures 1 and 2 in the Supplement. There were three RCTs considered unclear in the item of allocation concealment, and two industry-funded trials were graded as

SAVR) was used for stratification. Publication bias was assessed by visual inspection of funnel plots, and statistical asymmetry was evaluated using Begg's and Egger's tests. Publication bias was assessed only when 10 or more studies were included.
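To make the asymmetry test concrete, the following is a minimal sketch of Egger's regression in its classical form: the standardized effect (log HR divided by its standard error) is regressed on precision (1 divided by the standard error), and an intercept far from zero suggests small-study effects. The function name and return structure are hypothetical, the 10-study threshold is taken from the text, and the sketch returns the t statistic and its degrees of freedom rather than a P value to avoid depending on a statistics library.

```python
import math

MIN_STUDIES = 10  # publication bias assessed only with 10 or more studies


def eggers_test(log_hrs, ses):
    """Egger's regression: (log HR / SE) regressed on (1 / SE).

    Returns the intercept, its standard error, the t statistic, and the
    degrees of freedom (k - 2) for comparison against a t distribution.
    """
    k = len(log_hrs)
    if k != len(ses):
        raise ValueError("log_hrs and ses must have the same length")
    if k < MIN_STUDIES:
        raise ValueError(f"at least {MIN_STUDIES} studies are required")

    precision = [1.0 / se for se in ses]
    std_effect = [y / se for y, se in zip(log_hrs, ses)]

    x_bar = sum(precision) / k
    z_bar = sum(std_effect) / k
    sxx = sum((x - x_bar) ** 2 for x in precision)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxz = sum((x - x_bar) * (z - z_bar)
              for x, z in zip(precision, std_effect))

    # Ordinary least squares fit of std_effect = intercept + slope * precision
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar

    df = k - 2
    sse = sum((z - intercept - slope * x) ** 2
              for x, z in zip(precision, std_effect))
    residual_var = sse / df
    se_intercept = math.sqrt(residual_var * (1.0 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")

    return {"intercept": intercept, "se": se_intercept,
            "t": t_stat, "df": df}
```

The two-sided P value follows from comparing `t` with a t distribution on `df` degrees of freedom, for example with a statistics package of the reader's choice.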
Generally, descriptive statistics were used to summarize baseline variables in all groups. Data descriptions included frequencies and percentages for dichotomous data and mean with standard deviation or median with interquartile ranges for continuous data. Student's t-test was used for comparison of means; when the data distribution was not normal, nonparametric tests were used (Mann-Whitney U test), and Pearson's chi-square test was used for categorical comparisons. All analyses were performed using STATA version 14.0.
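The random-effects pooling and I² statistic described under Statistical Analysis can be sketched as follows. This is an illustration only: the text does not name the between-study variance estimator, so the DerSimonian-Laird method is assumed here as a common default, and the function name and inputs are hypothetical.

```python
import math


def dersimonian_laird(log_hrs, ses, z=1.959964):
    """Pool log hazard ratios with a DerSimonian-Laird random-effects model.

    Assumption: the DerSimonian-Laird estimator for tau^2; the source only
    states that a random-effects model was used.
    """
    k = len(log_hrs)
    if k != len(ses) or k < 2:
        raise ValueError("need at least two studies with matching inputs")

    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_hrs)) / sum_w

    # Cochran's Q and the between-study variance tau^2
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hrs))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: share of total variability attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights and pooled estimate
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_hrs)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    return {
        "hr": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se_pooled),
        "ci_high": math.exp(pooled + z * se_pooled),
        "log_hr": pooled,
        "se": se_pooled,
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }
```

When the studies agree closely, tau² is truncated at zero and the result coincides with the fixed-effect estimate; as heterogeneity grows, the weights become more equal and the confidence interval widens.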

Subgroup Analyses for Death from Any Cause
Stratified meta-analyses for death from any cause within the RCTs and RWS were performed in terms of access route (TF vs. TAp), geographic variations (RWS only), concomitant procedures with SAVR (RWS only), sex (RCTs only), surgical risk of participants (high vs. non-high-risk), and type of TAVI heart valve system (balloon-expandable vs. self-expandable) (Figures 5 and 6).
For the RCTs, meta-analysis showed a survival benefit for participants assigned to TAVI through the TF route during follow-up compared to patients with SAVR (

Clinical Characteristics
We explored the concordance of data between RCTs and cohort studies to find the causes of discordance. The details of PICT and other important characteristics are summarized in Table 1. Compared with RWS, patients enrolled in RCTs were on average 4 years older (81.5±6.6 vs. 77.5±8.7), had more comorbidities and a lower Society of Thoracic Surgeons Predicted Risk of Mortality (STS-PROM) score (6.3±3.3% vs. 7.4±5.3%).
In the TAVI (intervention/exposure) group, patients within RCTs were primarily assigned to the TF approach (2358/2758, 85.5%) and the total procedure time was shorter than that in RWS (73.2±59.0 vs. 125.6±68.5 minutes; P<0.001). The two TAVI heart valve systems (self-expandable and balloon-expandable)

studies failed to report this information. All RCTs presented conflict of interest disclosures, compared to only 45.5% of RWS.

DISCUSSION
Long-term all-cause mortality when treating AS by TAVI or SAVR differed between RCTs and RWS. The discordance resulted not only from different study designs, but also heterogeneous clinical characteristics between RCTs and RWS. Compared with RWS, patients in RCTs were 4 years older, had more comorbidities, and had lower STS-PROM scores. The arterial route of TAVI access within RCTs was primarily TF and the mean total procedure time was markedly shorter; more patients in RCTs received newer generation TAVI balloon-expandable devices. The total SAVR procedure time as well as CPB time were markedly shorter in RCTs than in RWS. The RWS group had a higher percentage of isolated procedures (both isolated SAVR and isolated TAVI) than that in RCT group.
Patients undergoing TAVI in RCTs were older (81.6±6.6 years) than the average life expectancy in the US (79.1 years, 95% CI: 79.0-79.1) [15]. Non-valvular causes of death during the long-term follow-up may mask the surgical effect, which can be more significant in RCTs. In RWS, there is an inherent and practical bias in treatment selection, where physicians tend to select TAVI rather than SAVR for patients with a shorter life expectancy.
Regarding SAVR (control/non-exposure), more minimally invasive approaches as well as more bioprosthetic aortic valves were used in RCTs compared with RWS. A significantly shorter total procedure time (210.6±68.4 vs. 269.1±119.0 minutes) and total CPB time (99.7±41.5 vs. 111.7±65.3 minutes) were observed in RCTs compared with RWS, but this did not apply to total aortic cross-clamp time (74.0±30.2 vs. 71.5±41.7 minutes). The proportion of isolated SAVR was higher in the RWS group than in the RCT group (87.2% vs. 82.5%).
The median follow-up time in cohort studies was 23.1 months (interquartile range: 16.4 to 38.8 months). Only two RCTs presented this data (median follow-up time: 35.5 and 37.7 months, respectively).
All RCTs (5/5, 100%), but only 5/33 (15.2%) RWS, were registered on ClinicalTrials.gov; more RCTs than RWS were multicentre studies (5/5 vs. 6/33, P<0.001). Eleven single-centre RWS were performed at university-affiliated hospitals, six at cardiovascular centres, one at a clinic, and the remaining eight

was used in RWS, while in RCTs, which mentioned more newer-generation devices (SAPIEN XT), similar survival was achieved with TAVI and SAVR. The heart valve system used changed with time; for example, SAPIEN was replaced by SAPIEN XT and more recently SAPIEN 3, and CoreValve has been replaced by Evolut and Evolut R. It is reasonable to expect that newer-generation devices might lead to better long-term outcomes.
Two included RWS tended to favour TAVI [20,21]; they were performed in the setting of "prior cardiac surgery (mostly coronary artery bypass grafting [CABG])" and "porcelain aorta", where surgical AVR is particularly challenging. This result is consistent with the recommendations from existing guidelines that suggest patient frailty, and conditions such as porcelain aorta, history of chest radiation, or patent coronary bypass grafts may render patients less suitable for SAVR [22].
Our observation that TAVI was associated with worse long-term overall survival than SAVR in RWS was supported by the findings from a meta-analysis that involved 4197 patients with severe AS. Takagi et al. [7] reported that the pooled results of 14 RWS with a propensity score analysis showed TAVI to be inferior to SAVR in 3-year overall survival. Another meta-analysis [23] including four RCTs found TAVI was superior to SAVR in 2-year survival (HR: 0.87, 95% CI: 0.76-0.99; P=0.038). Compared with these studies, our study included both RWS and RCTs, and

While SAVR remains the standard treatment, several studies have demonstrated that 30% to 60% of patients with severe symptomatic AS are denied or not referred for surgery, leading to off-label use of TAVI [16]. Incomplete or poor risk adjustment in RWS may affect the results. For example, one study that adopted several different adjustment methods to examine the effect of TAVI compared with SAVR found that TAVI could accordingly be associated with either substantial benefits or harms [17]. Factors such as the impact of the initial learning curve and the different algorithms for patient selection and perioperative care may account for the discordance in all-cause mortality with TAVI compared with SAVR in the RWS population versus the RCT population.
A survival benefit has been found for patients randomized to TAVI through TF access in RCTs. TF-TAVI is associated with higher rates of major vascular damage and those with more comorbidities are generally not eligible for this approach, particularly those with peripheral vascular disease [18]. The result was the opposite in RWS; in this subgroup, one study [19] reported a significantly higher Charlson comorbidity index (an indicator of coexisting conditions) for TAVI, meaning that patients who underwent TAVI had more comorbidities than those who underwent SAVR. We detected a detrimental effect associated with TAVI over SAVR when a balloon-expandable bioprosthesis (mainly SAPIEN)

CONCLUSION
These results highlight the inconsistencies between RCTs and RWS in assessing long-term all-cause mortality when treating AS using TAVI or SAVR, which may be caused by interactions of clinical characteristics or study design. RCTs as well as RWS are both developing and improving; the strengths of each approach to design, measurement, and evaluation can and should inform the other.

quantitatively compared the differences between the two research approaches by a thorough comparison, which makes our results more objective and convincing.
This study had several limitations. First, cohort studies were selected to represent the real-world situation, but their inherent limitations may lead to inaccurate results [24,25]. We attempted to minimize these limitations by performing a strict quality assessment, but poor reporting quality of the included studies did not allow definitive judgments about risk of bias in all domains. Second, this study concentrated on long-term all-cause mortality because it can provide unbiased results. Other important outcomes might yield additional clinical insights, and thus further research is required. Third, given the high cost of TAVI as well as SAVR, almost all 38 studies were performed in developed countries; therefore, these results may not be generalizable to developing countries. The number of multicentre, international studies was small, and we were unable to find large-scale and carefully conducted nationwide registry studies that may provide an objective real-world conclusion with robust risk adjustment.