Do GWAS-Identified Risk Variants for Chronic Lymphocytic Leukemia Influence Overall Patient Survival and Disease Progression?

Chronic lymphocytic leukemia (CLL) is the most common leukemia among adults worldwide. Although genome-wide association studies (GWAS) have uncovered the germline genetic component underlying CLL susceptibility, the potential use of GWAS-identified risk variants to predict disease progression and patient survival remains unexplored. Here, we evaluated whether 41 GWAS-identified risk variants for CLL could influence overall survival (OS) and disease progression, defined as time to first treatment (TTFT), in a cohort of 1039 CLL cases ascertained through the CRuCIAL consortium. Although this is the largest study assessing the effect of GWAS-identified susceptibility variants for CLL on OS, we found only a weak association of ten single nucleotide polymorphisms (SNPs) with OS (p < 0.05), which did not remain significant after correction for multiple testing. In line with these results, polygenic risk scores (PRSs) built with these SNPs in the CRuCIAL cohort showed a modest association with OS and a low capacity to predict patient survival, with an area under the receiver operating characteristic curve (AUROC) of 0.57. Similarly, seven SNPs were associated with TTFT (p < 0.05); however, these did not reach the multiple testing significance threshold, and the meta-analysis with previously published data did not confirm any of the associations. As expected, PRSs built with these SNPs showed reduced accuracy in predicting disease progression (AUROC = 0.62). These results suggest that susceptibility variants for CLL do not impact overall survival and disease progression in CLL patients.


Introduction
Chronic lymphocytic leukemia (CLL) is the most common form of leukemia among adults worldwide [1], and its global health burden has risen substantially over the past 30 years [2]. CLL remains an incurable disease [3] with a heterogeneous clinical course and a 10-year survival rate ranging from 47.3% to 72.5% in males and 58.2% to 78.7% in females [4]. Traditional clinical prognostic factors include Rai and Binet staging systems, lymphocyte doubling time, cytogenetic alterations, and point mutations, which are used for patient risk stratification and clinical management [5]. Although age, sex, exposure to chemicals, race/ethnicity, and family history of hematological cancers influence the risk of CLL, recent studies have suggested that the combination of these classical factors with genetic markers might help in predicting disease onset and clinical outcome [6]. However, despite the overall success of genome-wide association studies (GWAS) in identifying susceptibility loci for CLL [7][8][9][10], there remains an unmet need to characterize genetic markers associated with disease progression and overall patient survival. Considering this background, we investigated whether GWAS-identified susceptibility variants for CLL could influence the overall survival (OS) of CLL patients and their disease progression, defined as time to first treatment (TTFT). Finally, we explored whether the effect of selected variants on OS and TTFT could be mediated by immune-related processes through a comprehensive battery of functional experiments performed in the 500FG cohort recruited in the context of the Human Functional Genomics Project (HFGP).

Results
CLL patients from the CRuCIAL cohort had a mean age of 65.87 years and a male/female ratio of 1.57, which is in line with the worldwide median age at diagnosis and gender distribution (Table 1) [11]. Median follow-up time was 76.77 months, with no variation in follow-up statistics for censored patients, and the total number of deceased patients was 287. OS did not differ significantly by country, ruling out the possibility of any deviation due to multicenter randomized patient recruitment. CLL patients with TTFT data in the CRuCIAL cohort had a mean age of 65.03 years and a male/female ratio of 1.62. The median time to first treatment was 759.49 days and the number of deceased patients was 111 (Table 1).
Data are mean ± standard deviation, n (%), or percentiles (25th-75th percentiles).
Cox regression analyses showed that ten genetic variants within the CAMK2D, CASP8, CFLAR, CXXC1, GPR37, IRF8, LEF1, MYNN, PRKD2, and TERC loci were associated with OS at the p < 0.05 level (Table 2). Although potentially interesting, none of these associations remained significant after correction for multiple testing, which suggested a weak effect (if any) of these genes in determining patient survival. The lack of previous studies assessing the impact of GWAS-identified risk variants on OS precluded a meta-analysis.
Abbreviations: SNP, single nucleotide polymorphism; HR, hazard ratio. Significant results in bold (p < 0.05). δ Cox regression analysis assuming a log-additive model of inheritance. Ϯ Cox regression analysis assuming a dominant model. ¥ Cox regression analysis assuming a recessive model.
As expected, we found a weak association between the weighted and unweighted PRSs and OS of CLL patients (HR = 1.22, p = 1.80 × 10⁻⁵ and HRScaled_80% = 1.19, p = 7.61 × 10⁻⁵). Weighted and unweighted PRSs increased the capacity to predict OS by only 6-7%, with an area under the receiver operating characteristic curve (AUROC) for the unweighted and weighted PRS of 0.56 and 0.57, respectively (Table 3).
In agreement with these findings, we found that none of these SNPs were correlated with host immune parameters (cQTL data, absolute numbers of 91 blood-derived cell populations, 106 serum immunological proteins, or 7 steroid hormones), which reinforced the hypothesis of a null effect of these markers in determining overall patient survival.
On the other hand, Cox regression analyses adjusted for age, sex, and country of origin revealed that seven genetic variants within the ACOXL, CASP8, GRAMD1B, MYNN, PRKD2, TERC, and ZBTB7A|MAP2K2 loci were associated with TTFT at p < 0.05 level (Table 4). However, none of the associations with TTFT remained significant after correction for multiple testing, suggesting that these susceptibility variants for CLL do not have a relevant role in determining disease progression.
In line with these data, a meta-analysis combining our data with those from a previous GWAS confirmed that none of these loci have a significant impact on TTFT (Table 5). These findings support the notion of a null effect of susceptibility variants on disease progression in CLL.
Abbreviations: SNP, single nucleotide polymorphism; HR, hazard ratio; CI, confidence interval. Meta-analysis was performed assuming a fixed-effect model. Significant results in bold (p < 0.05).
η Authors report the effect found for rs62410363 (a SNP in strong linkage disequilibrium with rs11715604, r² = 0.97). δ Cox regression analyses were adjusted for age, sex, and country of origin and were calculated according to a log-additive model of inheritance.

Discussion
This is the largest study evaluating the association of GWAS-identified susceptibility variants for CLL with OS, and one of the first studies assessing the effect of GWAS-identified susceptibility variants for CLL in disease progression. Even though we found potentially interesting associations between ten SNPs within the CAMK2D, CASP8, CFLAR, CXXC1, GPR37, IRF8, LEF1, MYNN, PRKD2, and TERC loci and the OS of CLL patients, none of these associations remained significant after correction for multiple testing. As expected, we found a modest association between weighted and unweighted PRSs and OS, which increased the prediction capacity by only 7%. These findings suggest that susceptibility variants for CLL do not have a large influence on OS, which is in agreement with a previous study that, using a similar approach, demonstrated that susceptibility variants do not influence the OS of patients diagnosed with multiple myeloma, another B cell malignancy [13].
This study failed to find a statistically significant association between GWAS-identified risk variants for CLL and TTFT. We found that only seven SNPs within the ACOXL, CASP8, GRAMD1B, MYNN, PRKD2, TERC, and ZBTB7A|MAP2K2 loci showed an association with TTFT at the p < 0.05 level. None of these associations remained significant after correction for multiple testing, and a meta-analysis of our data with those from a previous GWAS on TTFT confirmed the null effect of these variants on disease progression. In agreement with these results, we found that weighted and unweighted PRSs did not have the capacity to predict TTFT. Nonetheless, in light of the relatively low power of our study (80% power to detect an HR of 1.45 for a SNP with a frequency of 0.25) and the small number of studies assessing the impact of GWAS-identified risk variants for CLL on OS and TTFT, we cannot rule out the possibility that some of these SNPs might have a stronger effect on the modulation of OS and disease progression. In fact, we found it interesting that carriers of the CXXC1 rs1036935A allele had poorer OS, as our team has previously reported that the presence of this allele is associated with decreased numbers of CD19+CD20+ B cells [14], a subtype of cells poorly expressed in CLL patients. The CXXC1 locus encodes a protein of the SETD1 complex, which acts as an epigenetic transcriptional activator; if deregulated, it can lead to tumor progression and poorer survival [15].
This study has both strengths and weaknesses. The major strengths of this study are the large number of genetic markers analyzed and the relatively large size of the study population. Another strength is the comprehensive functional analysis conducted in the HFGP cohort, which included cQTL data after stimulation of whole blood, PBMCs, and MDMs with LPS, PHA, Pam3Cys, CpG, B. burgdorferi, and E. coli, as well as data on serological and plasmatic inflammatory proteins, serum steroid hormones, and blood-derived immune cell types. A limitation of this study is its multicentric nature, with inevitable drawbacks such as the impossibility of uniformly collecting medication history and Rai-Binet status for a significant proportion of the patients analyzed. In addition, considering that all study participants were of European ancestry, we could not determine the impact of GWAS-identified variants on patient survival and TTFT in other ethnic or ancestral populations.

Study Participants
This study included 1039 CLL patients ascertained through the CRuCIAL consortium. CLL patients were diagnosed following the updated international criteria [5]. Study participants were of European ancestry and provided their written informed consent to participate in the study, which was approved by the ethical review committees of the participating institutions. A detailed description of the study cohort has been reported elsewhere [14]. The main characteristics of the patients are included in Table 1. This study followed the Declaration of Helsinki.

SNP Selection and Genotyping
A total of 41 single nucleotide polymorphisms (SNPs) were selected based on published GWAS, functionality data, and linkage disequilibrium between the SNPs (Supplementary Table S1) [14]. Genotyping of selected SNPs was carried out at GENYO (Centre for Genomics and Oncological Research: Pfizer/University of Granada/Andalusian Regional Government, Granada, Spain) using KASPar® assays (LGC Genomics, Hoddesdon, UK) according to previously reported protocols. For internal quality control, approximately 5% of samples were randomly selected and included as duplicates. Concordance between the original and the duplicate samples for the selected SNPs was ≥99.0%. Call rate was higher than 90%.

Statistical Analysis and Meta-Analysis
The Hardy-Weinberg Equilibrium (HWE) test was performed in the control group using a standard observed-expected chi-square (χ²) test. The primary outcome was OS and the endpoint was defined as death from any cause. Survival time was calculated as the time from CLL diagnosis until the occurrence of the study endpoint, censoring at the date of death or the last observed follow-up time. The second outcome was time to first treatment (TTFT), defined as the interval between CLL diagnosis and date of the first treatment or last follow-up, while the endpoint was defined as death from any cause. Association with OS and TTFT, defined as hazard ratio (HR), was calculated for each SNP using Cox regression analysis adjusted for age, sex, and country of origin. Considering the number of SNPs and the number of inheritance models tested (log-additive, dominant, and recessive), we set a significance threshold to 0.00041 (0.05/41/3) using the Bonferroni correction. Association analyses were performed using STATA (v12.1; Stata Corp, College Station, TX, USA) and power calculations were estimated using the survSNP package in R (v4.1.1; R Core Team, 2018).
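The Bonferroni threshold quoted above follows from dividing the nominal alpha by the total number of tests (41 SNPs × 3 inheritance models). The following is a minimal illustrative sketch of that arithmetic only; the function names are hypothetical and are not part of the STATA workflow used in this study.

```python
def bonferroni_threshold(alpha, n_snps, n_models):
    """Per-test significance threshold after Bonferroni correction.

    Hypothetical helper: divides the nominal alpha by the number of
    SNP-by-model tests (n_snps * n_models).
    """
    return alpha / (n_snps * n_models)


def passes_correction(p_value, threshold):
    """True if a p-value survives the corrected threshold."""
    return p_value < threshold


# 41 SNPs tested under log-additive, dominant, and recessive models.
threshold = bonferroni_threshold(0.05, 41, 3)
print(f"{threshold:.5f}")  # 0.00041

# Illustrative value only: a nominally significant p = 0.03 does not survive.
print(passes_correction(0.03, threshold))  # False
```

Under this threshold, an association reported at p < 0.05 is treated as nominal unless its p-value also falls below roughly 4.1 × 10⁻⁴.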
In order to confirm potentially interesting associations with disease progression, we conducted a meta-analysis of the CRuCIAL data with those from a previous GWAS [12] using METAL and assuming a fixed-effect model; the I² statistic was used to assess statistical heterogeneity between cohorts. Next, in order to confirm whether susceptibility variants could predict OS and disease progression, we computed weighted and unweighted polygenic risk scores (PRSs) considering those SNPs associated with OS and TTFT at a threshold of p < 0.05. We built PRSs considering either subjects with a call rate of 100% (n = 891 and 290) or 80% (n = 1003 and 323) for OS and TTFT, respectively. A PRS estimates an individual's risk of a phenotype or disease and is calculated as the sum of that individual's genotypes weighted by the corresponding genotype effect sizes from summary statistic data. A detailed explanation of how the PRS scores were generated has been published in [16].
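The two procedures described in this paragraph can be sketched in a few lines. The block below is an illustrative sketch under stated assumptions, not the pipeline used in this study: the meta-analysis is assumed to use inverse-variance weighting of log hazard ratios (one of the schemes a fixed-effect analysis may use), and the rescaling of scores for subjects with missing genotypes is a hypothetical choice; the actual PRS procedure is the one described in [16]. All function and parameter names are hypothetical.

```python
import math


def fixed_effect_meta(log_hrs, std_errs):
    """Pool per-cohort log hazard ratios with inverse-variance weights.

    Returns (pooled HR, standard error of the pooled log HR, I^2 in percent).
    Assumption: inverse-variance weighting; not confirmed by the source.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total
    pooled_se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted dispersion of cohort estimates around the pooled one.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_hrs))
    df = len(log_hrs) - 1
    # I^2 = max(0, (Q - df) / Q), expressed as a percentage.
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return math.exp(pooled), pooled_se, i2


def polygenic_risk_score(dosages, log_hrs=None, min_call_rate=0.8):
    """Unweighted or weighted PRS for one subject.

    dosages: allele counts (0, 1, 2) per SNP, None for a failed call.
    log_hrs: per-SNP effect sizes; if None, the unweighted score is returned.
    Returns None when the subject's call rate is below min_call_rate.
    """
    called = [(i, d) for i, d in enumerate(dosages) if d is not None]
    if len(called) / len(dosages) < min_call_rate:
        return None
    if log_hrs is None:
        score = sum(d for _, d in called)
    else:
        score = sum(d * log_hrs[i] for i, d in called)
    # Hypothetical rescaling so subjects with missing calls stay comparable.
    return score * len(dosages) / len(called)


# Illustrative inputs only (not study data).
hr, se, i2 = fixed_effect_meta([math.log(1.10), math.log(1.25)], [0.08, 0.12])
prs = polygenic_risk_score([1, 2, 0, None, 1], min_call_rate=0.8)
```

Setting `min_call_rate=1.0` mirrors the stricter 100% call-rate subset, while `0.8` mirrors the 80% subset at the cost of imputing a proportional contribution for missing SNPs.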

Functional Effect of the GWAS-Identified Risk Variants on Immune Responses
Mechanistically, we investigated whether selected SNPs were correlated with production of nine cytokines after in vitro stimulation of peripheral blood mononuclear cells (PBMCs) from 408 healthy subjects from the Human Functional Genomics Project (HFGP) cohort with LPS (100 ng/mL, Sigma-Aldrich, St. Louis, MO, USA), PHA (10 µg/mL, Sigma-Aldrich, St. Louis, MO, USA), Pam3Cys (10 µg/mL, EMC microcollections, Tuebingen, Germany), CpG (100 ng/mL, InvivoGen, San Diego, CA, USA), B. burgdorferi (ATCC strain 35210), and E. coli (ATCC 25922). In addition, we investigated the correlation between SNPs and circulating concentrations of 103 serum and plasmatic inflammatory proteins, absolute numbers of 91 blood-derived immune cell populations (Supplementary Tables S2 and S3), and 7 plasma steroid hormones. Experimental protocols for the functional experiments have been previously described in detail [17,18]. Functional results for selected SNPs have been previously published by our team as part of a study in the context of the CRuCIAL consortium that aimed at validating the associations of GWAS-identified risk variants for CLL [14].

Conclusions
This study suggests that susceptibility variants for CLL do not substantially impact the overall survival of CLL patients, and confirms previous results suggesting the null effect of these variants on TTFT.
Funding: This work was supported by the European Union's Horizon 2020 research and innovation program, No. 856620, and by grants from the Instituto de Salud Carlos III and FEDER (Madrid, Spain; PI17/02256 and PI20/01845) and from the Consejería de Transformación Económica, Industria, Conocimiento y Universidades y FEDER (PY20/01282). "The Mayo studies in InterLymph were supported in part by the US National Cancer Institute grants P50 CA97274 and R01 CA92153."