Female “Paradox” in Atrial Fibrillation—Role of Left Truncation Due to Competing Risks

Female sex in patients with atrial fibrillation (AF) is a controversial and paradoxical risk factor for stroke—controversial because it increases the risk of stroke only among older women of some ethnicities and paradoxical because it appears to contradict male predominance in cardiovascular diseases. However, the underlying mechanism remains unclear. We conducted simulations to examine the hypothesis that this sex difference is generated non-causally through left truncation due to competing risks (CR) such as coronary artery diseases, which occur more frequently among men than among women and share common unobserved causes with stroke. We modeled the hazards of stroke and CR with correlated heterogeneous risk. We assumed that some people died of CR before AF diagnosis and calculated the hazard ratio of female sex in the left-truncated AF population. In this situation, female sex became a risk factor for stroke in the absence of causal roles. The hazard ratio was attenuated in young populations without left truncation and in populations with low CR and high stroke incidence, which is consistent with real-world observations. This study demonstrated that spurious risk factors can be identified through left truncation due to correlated CR. Female sex in patients with AF may be a paradoxical risk factor for stroke.


Introduction
The prevalence of atrial fibrillation (AF) increases rapidly with age [1][2][3], imposing a heavy burden of AF-related stroke among older people worldwide [1]. The risk of AF-related stroke can be reduced by oral anticoagulation therapy [3]. In the clinical setting, physicians decide on anticoagulation therapy based on the predicted stroke risk in their patients. Risk prediction is usually made by CHA2DS2-VASc scoring [4], which incorporates known risk factors, such as hypertension and prior stroke. A score of one or two points out of a maximum of nine is considered the threshold [3].
Female sex is given one point in the CHA2DS2-VASc [4], and the latest meta-analysis shows that its integrated hazard ratio (HR) for stroke is 1.24 [5]. However, it is a controversial risk factor because it predicts stroke inconsistently: it does so among older women but not among younger women [5][6][7][8][9][10]. In addition, female sex predicts stroke in European populations but not in Asian populations [5,11], especially in populations with high stroke and low coronary artery disease (CAD) incidence, such as Taiwan [12], Hong Kong [13], Korea [14], and Japan [15]. These inconsistencies have led to the latest recommendation to avoid anticoagulation in patients with AF whose risk factor is female sex alone [3]. In addition, female sex as a risk factor appears to contradict the fact that cardiovascular diseases, such as CAD and even stroke in general, occur more frequently among men than women [5,16–20]. Therefore, it is paradoxical that sex acts in opposite directions only in AF-related stroke. This paradox also occurs in patients with AF: stroke occurs more frequently among women, but CAD, cardiovascular deaths, and overall deaths occur more frequently among men [5,21]. Although researchers have speculated about potential mechanisms, including reproductive hormones, differences in clinical practice, and residual confounding [10,22–24], it is unclear why female sex predicts stroke in patients with AF, why it does so preferentially among older European women, and where the apparent paradox comes from.
Disease risk factors are usually identified through survival analysis, which assumes non-informative censoring and non-informative left truncation (LT), i.e., individuals drop out "independently" of the outcome during and before the follow-up. When there are competing risks (CR) and unobserved heterogeneity correlated with the outcome of interest, it may lead to informative censoring, which could generate a spurious association between the outcome and the variables associated with CR [25], a phenomenon called "false protectivity" under some circumstances [26].
Based on this phenomenon, the age/ethnicity dependency and the apparently opposite prediction described above, as well as the fact that AF is prevalent among older people, we hypothesize that female sex in AF is non-causally associated with stroke because of informative LT caused by premature deaths from CR, such as CAD, which share common, unobserved causes with stroke [4,27–29]. Other male-predominant diseases that share risks with stroke, including cancer [30], hepatic disease [31], and respiratory disease [32,33], may also be included in the CR. This study aimed to examine this hypothesis through simulations.

Population
Suppose a hypothetical population is composed of N same-aged persons of each sex, free from AF, who will eventually develop AF. We were interested in incident stroke after the diagnosis of AF. We assumed that the major CR is cardiovascular death, mainly associated with CAD. However, CR can encompass other diseases whose risk is correlated with that of stroke and whose mortality is higher and earlier in men than in women. Hence, we define CAD in our model as a collective representation of all potential CR. We defined time 0 as the age at which CAD began to develop in this population (typically, 40 years). For simplicity, we assumed that all individuals were diagnosed with AF at time $T_e$, and defined the cohort of patients with AF at that time. The AF cohort was left-truncated; it consisted only of persons who were alive and free from CAD at the time of AF diagnosis. We followed up the cohort for subsequent strokes over a prespecified period $T_c$ (Figure 1). Individuals were censored if they did not develop a stroke before the end of the study or if they developed CR before stroke during the follow-up period.

Model
We modeled stroke and CAD risks such that they are heterogeneous and correlated among individuals and that men have a higher risk for CAD than women (Figure 2). We assumed constant hazards for simplicity. Let $\lambda_1[i]$ and $\lambda_2[i]$ be the log hazards of the $i$th individual for AF-related stroke and CAD (representative of all CR combined), respectively. We model

$$\lambda_1[i] = \alpha_1 + \beta_1 x[i] + v_1[i]$$

and

$$\lambda_2[i] = \alpha_2 + \beta_2 x[i] + v_2[i],$$

where $x[i]$ is 1 if the $i$th individual is male and 0 if female. Here, the constants $\alpha_1$ and $\alpha_2$ represent log baseline hazards for stroke and CAD, respectively, and the coefficients $\beta_1$ and $\beta_2$ represent the corresponding log HR of male sex. The last terms $v_1[i]$ and $v_2[i]$ capture the unobserved heterogeneity in individual risks for stroke and CAD, respectively, which differ among individuals. Stroke and CAD are thromboembolic events that plausibly share an underlying pathophysiology. Indeed, they share the same predisposing factors, such as disease [4], lifestyle [27,28], socioeconomics [28], and genetics [29]. Accordingly, the susceptibility to stroke and CAD is likely to be positively correlated. Considering the correlation, we assumed that the unobserved individual risks $v_1[i]$ and $v_2[i]$ followed a bivariate normal distribution with mean 0 and standard deviations (SD) $\sigma_1$ and $\sigma_2$ with a correlation $\rho$:

$$\begin{pmatrix} v_1[i] \\ v_2[i] \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix} \right).$$
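The model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' R code: the function name `draw_log_hazards` is hypothetical, the baseline value 0.2 is read as a hazard on the natural scale (so its logarithm enters the log hazard), and the correlated pair of heterogeneity terms is built from two independent standard normals. The example values follow the Figure 2 caption (SDs of 2, correlation 0.6, log HR of male sex for CAD of 1.2).

```python
import math
import random


def draw_log_hazards(n, is_male, alpha1, alpha2, beta1, beta2,
                     sigma1, sigma2, rho, rng):
    """Return n pairs (lambda1, lambda2) of log hazards for one sex.

    lambda1 is the log hazard of stroke, lambda2 that of CAD (all CR).
    alpha1 and alpha2 are passed on the log scale.
    """
    x = 1.0 if is_male else 0.0  # male indicator
    pairs = []
    for _ in range(n):
        # Two independent standard normals turned into a correlated pair:
        # v1 has SD sigma1, v2 has SD sigma2, and corr(v1, v2) = rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        v1 = sigma1 * z1
        v2 = sigma2 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        pairs.append((alpha1 + beta1 * x + v1, alpha2 + beta2 * x + v2))
    return pairs


if __name__ == "__main__":
    rng = random.Random(2023)
    log_base = math.log(0.2)  # assumption: 0.2 is a natural-scale hazard
    men = draw_log_hazards(5, True, log_base, log_base, 0.0, 1.2,
                           2.0, 2.0, 0.6, rng)
    for lam1, lam2 in men:
        print(f"stroke log hazard {lam1:6.2f}   CAD log hazard {lam2:6.2f}")
```

Drawing the pair from two independent normals avoids any dependency on a numerical library; any other bivariate normal sampler would serve equally well.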

Parameters (Table 1)
We set the parameters such that the simulation generally reproduced real-world observations. We set the log HR $\beta_2$ of male sex for CAD to 0.7 or 1.2 such that the simulations reproduced the HR of male sex for CAD reported in the real world [18][19][20]. We assumed no effect of sex on AF-related stroke ($\beta_1 = 0$). However, we additionally evaluated scenarios in which male sex slightly increased stroke risk in AF ($\beta_1 = 0.2$). For individual heterogeneity terms, we examined 12 combinations of variability within populations and the correlation between the risks. For the variability, we presumed two levels ($\sigma_1, \sigma_2 = 1.0$ and $2.0$) based on our previous simulation on the development of stroke in AF populations [34], in which SD = 1.85 for individual risks reproduced the HR = 2.4 of prior stroke. For the correlations, we assumed three levels: high, moderate, and no correlation ($\rho = 0.8$, $0.4$, and $0$). The time $T_e$ for AF diagnosis was set arbitrarily because it was unit-free (we can scale it as desired). For the fixed $T_e$, we set the baseline hazard $\alpha_2 = 0.2$ of CAD such that it generated reasonable proportions of left-truncated persons. We set the baseline hazard $\alpha_1 = 0.2$ for stroke to the same value as CAD, such that the proportions of stroke among simulated AF patients largely agreed with the real-world observations [11]. In addition, to simulate various populations worldwide with differing relative predominance of CAD and stroke, we examined a total of nine combinations of different baseline hazards for CAD and stroke ($\alpha_1, \alpha_2 = 0.1$, $0.5$, and $1.0$). The follow-up period $T_c$ was set relative to the time $T_e$ of AF diagnosis, such that the ratio $T_c/T_e$ was comparable to the ratio of the real-world study periods to the interval between CAD and AF onset. Finally, to assess the effect of informative censoring alone, we set the time of AF diagnosis $T_e = 0$, which corresponds to young populations with early-onset AF.

Figure 2. Individual risks (log hazard) of stroke and CAD (CR) in a simulated population. The risks are correlated among individuals, and men have a systematically higher risk for CAD than women. Parameters were set to $\sigma_1 = \sigma_2 = 2$, $\rho = 0.6$, and $\beta_2 = 1.2$.

Simulation
We conducted 1000 simulations for each scenario. For each simulation, we generated a "population of AF" and conducted a "cohort study" to estimate the HR of female sex. Starting with an initial population of 80,000 individuals per sex, we first generated the time $T_2$ to CAD for each person. Of those who did not develop CAD before AF diagnosis at time $T_e$, we randomly registered 50,000 individuals per sex who constituted the "simulated cohort" of patients with AF. We subsequently conducted a "simulated cohort study" following the cohort over the $T_c$ period and recorded the time $T_1$ from registration to stroke for each person. When a person developed CAD before stroke during the follow-up, they were censored at the development of CAD. Finally, we estimated the HR by fitting a Cox proportional hazards model on the simulated cohort data consisting of the triplets [min($T_1$, $T_c$, $T_2 - T_e$), $\delta$, sex], where the indicator $\delta$ is 1 if stroke occurred and 0 if censored. The simulation was performed using R 4.2.0 (R Core Team, https://www.R-project.org/ accessed on 1 February 2023).
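A single replicate of this procedure can be sketched as follows. The sketch is a simplified illustration and departs from the study in labeled ways: it is written in Python rather than R, it keeps every survivor instead of registering a fixed 50,000 per sex, and it replaces the Cox model with a crude incidence-rate ratio (events divided by person-time, women over men), which will not coincide numerically with a Cox HR. The values of `t_e` and `t_c`, the reading of 0.2 as a natural-scale baseline hazard, and the function names are assumptions made for illustration only.

```python
import math
import random


def simulate_cohort(n_initial, is_male, alpha1, alpha2, beta1, beta2,
                    sigma1, sigma2, rho, t_e, t_c, rng):
    """Simulate one sex; return (n_survivors, stroke_events, person_time)."""
    x = 1.0 if is_male else 0.0
    survivors, events, person_time = 0, 0, 0.0
    for _ in range(n_initial):
        # Correlated unobserved heterogeneity (v1 for stroke, v2 for CAD).
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        v1 = sigma1 * z1
        v2 = sigma2 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        h1 = math.exp(alpha1 + beta1 * x + v1)  # constant stroke hazard
        h2 = math.exp(alpha2 + beta2 * x + v2)  # constant CAD hazard
        t2 = rng.expovariate(h2)  # time from 0 to CAD
        if t2 <= t_e:
            continue  # left-truncated: CAD occurred before AF diagnosis
        survivors += 1
        t1 = rng.expovariate(h1)  # time from registration to stroke
        person_time += min(t1, t_c, t2 - t_e)
        if t1 <= t_c and t1 <= t2 - t_e:
            events += 1  # stroke observed (delta = 1); otherwise censored
    return survivors, events, person_time


def female_rate_ratio(rho, seed, n_initial=30000, t_e=1.0, t_c=0.5):
    """Crude stroke rate ratio (women / men) in the left-truncated cohort."""
    rng = random.Random(seed)
    log_base = math.log(0.2)  # assumption: 0.2 is a natural-scale hazard
    args = dict(alpha1=log_base, alpha2=log_base, beta1=0.0, beta2=1.2,
                sigma1=2.0, sigma2=2.0, rho=rho, t_e=t_e, t_c=t_c, rng=rng)
    n_m, d_m, pt_m = simulate_cohort(n_initial, True, **args)
    n_f, d_f, pt_f = simulate_cohort(n_initial, False, **args)
    return {
        "surv_m": n_m / n_initial,
        "surv_f": n_f / n_initial,
        "rate_ratio": (d_f / pt_f) / (d_m / pt_m),
    }


if __name__ == "__main__":
    for rho in (0.0, 0.8):
        print(rho, female_rate_ratio(rho, seed=1))
```

Comparing the run with no correlation against the run with high correlation is the point of the example: sex has no effect on stroke in either run, so any excess stroke rate in women reflects selection rather than causation.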

Results
Of the initial population, 75–89% of men and 88–94% of women were diagnosed with AF while alive (Supplementary Table S1). In all scenarios, a greater proportion of men were left-truncated than women, with a difference of 5–13%. The HR of male sex for CAD estimated in the initial populations ranged from 1.31 to 1.32 and from 1.62 to 1.67 for $\beta_2 = 0.7$ and $1.2$, respectively.
The HR of female sex exceeded one whenever a correlation existed between stroke and CAD risk (Table 2). Across the scenarios, there were some tendencies in the magnitude of HR. First, the higher the correlation between stroke and CAD risk, the higher the HR. Second, the wider the stroke and CAD risk distributions within the population, the higher the HR. Finally, the more susceptible men were to CAD than women, the higher the HR. Even when we assumed that male sex increased the risk of AF-related stroke, female sex was identified as a risk factor in some scenarios (Supplementary Table S2). In younger populations without LT, the HR of female sex was substantially lower than that of the corresponding scenarios in older populations (Table 3), underlining the importance of informative LT. Among populations with various combinations of CAD and stroke incidence, the HR was higher when CAD incidence was higher and when stroke incidence was lower (Table 4).

Table 3. HR of female sex for stroke in younger populations with AF (no left truncation).

Estimates (standard deviations) from 1000 simulations. Low, moderate, and high risks correspond to the baseline hazards of 0.1, 0.5, and 1.0, respectively. Other parameters are set to $\sigma_1 = 2$, $\sigma_2 = 2$, $\rho = 0.8$, $\beta_1 = 0$, and $\beta_2 = 0.7$. The estimated HR of male sex for CAD in the initial populations was 1.31 in all scenarios.

From time 0 to AF diagnosis, the distribution of stroke risk shifted differently between sexes, such that high-risk men preferentially disappeared (Figure 3), illustrating the role of informative LT in generating the non-causal association.
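The differential shift can be made concrete by comparing the mean unobserved stroke susceptibility among persons who remain free of CAD until AF diagnosis. The following Python sketch is illustrative only: the function name, the diagnosis time of 1.0, and the reading of 0.2 as a natural-scale baseline hazard for CAD are assumptions, and it does not reproduce Figure 3.

```python
import math
import random


def mean_v1_among_survivors(is_male, n, alpha2, beta2,
                            sigma1, sigma2, rho, t_e, rng):
    """Mean stroke heterogeneity v1 among those free of CAD at time t_e."""
    x = 1.0 if is_male else 0.0
    total, kept = 0.0, 0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        v1 = sigma1 * z1
        v2 = sigma2 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        t2 = rng.expovariate(math.exp(alpha2 + beta2 * x + v2))
        if t2 > t_e:  # survives to AF diagnosis, so enters the cohort
            total += v1
            kept += 1
    return total / kept


if __name__ == "__main__":
    rng = random.Random(7)
    log_base = math.log(0.2)  # assumption: 0.2 is a natural-scale hazard
    for rho in (0.0, 0.8):
        men = mean_v1_among_survivors(True, 30000, log_base, 1.2,
                                      2.0, 2.0, rho, 1.0, rng)
        women = mean_v1_among_survivors(False, 30000, log_base, 1.2,
                                        2.0, 2.0, rho, 1.0, rng)
        print(f"rho={rho}: mean v1 men {men:.2f}, women {women:.2f}")
```

Before truncation the mean of the stroke heterogeneity term is zero in both sexes, so any gap between the two printed means arises purely from who survives to diagnosis.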

Discussion
Our simulation demonstrates that a variable can be identified as a risk factor even if it does not cause the outcome of interest. This spurious association arises when the variable reduces the risk of a preceding CR that shares common unobserved causes with the outcome. We found that this association was generated mainly through informative LT by the correlated CR events. Importantly, the spurious association was not due to longevity alone because it arose only when the unobserved risks for CR and stroke were correlated. The results of this study suggest that some of the observed stroke predictions by female sex in AF may be due to men with a high stroke risk dropping out through correlated events, rather than women being inherently more prone to experiencing a stroke.
This phenomenon may be regarded as an instance of collider bias in a causal context: conditioning on a result (survival until AF diagnosis) creates an association between originally independent causes (sex and unobserved susceptibility) [35]. From this viewpoint, the differential shift in stroke risk between the sexes (Figure 3) can be interpreted in the following manner: "men in AF cohort, who, in spite of being male, have survived until AF diagnosis, tend to have lower unobserved susceptibility to stroke than women." This bias occasionally presents an apparent paradox such as the "low birth-weight paradox" [36], but the bias may go unnoticed unless overtly paradoxical [35]. Researchers interested in causation should pay attention to potential bias, not necessarily a paradox, introduced by informative LT via preceding CR.
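The selection argument can also be written out under the model's own assumptions (constant hazards and bivariate normal heterogeneity). The short derivation below is a sketch rather than part of the original analysis; the male indicator $x$ (1 for men, 0 for women) and the residual term $\varepsilon$ are notation introduced here.

```latex
% Probability of remaining free of CAD until AF diagnosis at T_e,
% given sex x and the unobserved CAD susceptibility v_2:
P(T_2 > T_e \mid x, v_2) = \exp\!\left(-T_e \, e^{\alpha_2 + \beta_2 x + v_2}\right)

% Bivariate normality gives v_1 = \rho (\sigma_1 / \sigma_2) v_2 + \varepsilon,
% with \varepsilon independent of v_2 and of T_2, hence:
E[v_1 \mid T_2 > T_e, x] = \rho \, \frac{\sigma_1}{\sigma_2} \, E[v_2 \mid T_2 > T_e, x]
```

Because the survival probability decreases in $v_2$, and does so at a larger rate for men when $\beta_2 > 0$, survivors have a lower mean $v_2$ than the initial population, and surviving men have a lower mean than surviving women. With $\rho > 0$ this gap carries over to $v_1$; with $\rho = 0$ the right-hand side is zero for both sexes, which matches the finding that the spurious association requires correlated risks.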
We postulated that CAD is a major CR contributing to informative LT because in general (1) it occurs earlier than AF [2,37,38] and (2) associated mortality is higher and occurs earlier among men than among women [18][19][20][37][38][39]. Furthermore, it is plausible that the risks of CAD and stroke, both of which are thromboembolic events, are correlated through shared predisposing factors, including cardiometabolic diseases [4], lifestyle [27,28], socioeconomics [28], and genetics [29]. In addition to CAD and other cardiovascular events [18,38,40], CR could include any disease if mortality is higher and/or earlier in men and is correlated with stroke. Some diseases, such as cancer [30], hepatic diseases [31], and chronic obstructive pulmonary diseases [32,33] may satisfy this condition and therefore could act as a CR, contributing to informative LT [8,41].
The results of this study may explain some puzzling observations in AF populations regarding stroke risk prediction based on female sex. First, the dependency on age (female sex predicts stroke only in older populations) [5][6][7][8][9][10] can be explained by the different latencies during which CR events proceed (Table 2 as opposed to Table 3). Second, dependency on ethnicity (female sex predicts stroke in European populations but not in Asian populations) [5,11–15] can be explained by the different baseline hazards of CAD and stroke (Table 4). In summary, inconsistent risk prediction by female sex may have resulted from varying degrees of informative LT across populations. In addition, our hypothesis is consistent with another paradoxical observation in AF: stroke occurs more frequently in women, whereas CAD, cardiovascular mortality, and overall mortality occur more frequently in men [5,21]. Although the proposed mechanism is based solely on simulations, the potential explanation of several paradoxical observations suggests that our hypothesis warrants further investigation.
It is important to note that our hypothesis does not contradict potential sex differences in the atrial substrate. Prior studies reported that a deteriorated atrial substrate in AF was associated with increased stroke risk and was more prevalent in women [42,43]. However, this female predominance became less pronounced or was even reversed in young patients [43,44], analogous to the "paradox" in stroke risk. Furthermore, greater deterioration in women is paradoxical given the higher incidence of AF in men [3]. Some sex differences in the atrial substrate may be generated by LT through CR.
Although a risk factor is sometimes misinterpreted as causative, it is a predictor of disease irrespective of its causal role [45]. For example, prior stroke in AF is a non-causal, "Bayesian" risk factor upon which the presence of underlying causes is inferred from the result that the stroke occurred [34,46]. Female sex may be another non-causal risk factor, what one might call a "paradoxical" risk factor, upon which an apparent association is generated through informative LT. There may be such "paradoxical" factors in other diseases where the same structure exists. An example might be the female predominance in the incidence of Alzheimer's disease [47].
This study had several limitations. First, our hypothesis remains to be verified in cohort studies, which may pose a significant challenge because left-truncated individuals are never observed. In addition, there are several technical limitations. First, our model includes simplifications, such as constant hazards and the homogeneous development of AF. Second, because of the unobservable nature of LT, we started with a hypothetical population of "potential AF patients." Because there is no real-world counterpart, we could not calibrate our LT process (development of CAD), although we believe that our model generated a reasonable LT considering the common predisposing factors between AF and CAD [3,48]. Third, we calibrated the hazard of CR only to CAD to maintain simplicity. Owing to these limitations, our estimates should be regarded as qualitative rather than quantitative. Finally, a caveat is in order: although the phenomenon illustrated in this study may pose a challenge to causal interpretation, it does not pose any to prediction. Even if our hypothesis is correct, female sex could be a useful predictor of stroke in appropriate settings.

Conclusions
We demonstrated that spurious risk factors can emerge in the absence of causal roles through left truncation due to preceding competing risks correlated with the outcome of interest. Female sex may be a paradoxical risk factor for stroke among patients with AF.
Supplementary Materials: The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/life13051132/s1, Table S1: Proportion (%) of AF in initial populations and of stroke in AF populations; Table S2: HR of female when female sex is protective against stroke in AF; R code for simulation.