Genetically Determined Height and Risk of Non-Hodgkin Lymphoma

Although the evidence is not consistent, epidemiologic studies have suggested that taller adult height may be associated with an increased risk of some non-Hodgkin lymphoma (NHL) subtypes. Height is largely determined by genetic factors, but how these genetic factors may contribute to NHL risk is unknown. We investigated the relationship between genetic determinants of height and NHL risk using data from eight genome-wide association studies (GWAS) comprising 10,629 NHL cases, including 3,857 diffuse large B-cell lymphoma (DLBCL), 2,847 follicular lymphoma (FL), 3,100 chronic lymphocytic leukemia (CLL), and 825 marginal zone lymphoma (MZL) cases, and 9,505 controls of European ancestry. We evaluated genetically predicted height by constructing polygenic risk scores using 833 height-associated SNPs. We used logistic regression to estimate odds ratios (OR) and 95% confidence intervals (CI) for the association between genetically determined height and the risk of four NHL subtypes in each GWAS and then used fixed-effect meta-analysis to combine subtype results across studies. We found suggestive evidence of an association between taller genetically determined height and increased CLL risk (OR = 1.08, 95% CI = 1.00–1.17, p = 0.049), which was slightly stronger among women (OR = 1.15, 95% CI = 1.01–1.31, p = 0.036). No significant associations were observed with DLBCL, FL, or MZL. Our findings suggest that there may be some shared genetic factors between CLL and height, but other endogenous or environmental factors may underlie reported epidemiologic height associations with other subtypes.

INTRODUCTION
In epidemiologic studies, taller adult height has been associated with an increased risk of several subtypes of non-Hodgkin lymphoma (NHL) (1); however, the results have not been consistent across studies for common subtypes, such as diffuse large B-cell lymphoma (DLBCL), follicular lymphoma (FL), and chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL) (2)(3)(4)(5)(6)(7)(8)(9)(10)(11)(12)(13). Taller height has also been positively associated with other cancers in both epidemiologic and Mendelian randomization studies (14)(15)(16). The underlying reason or biological mechanism for these associations is not understood. Adult attained height is not thought to be causally related to cancer, but is instead thought to be a marker of other biological influences, including hormonal and growth factors, cellular divisions, and nutrition during formative years, that may contribute to cancer risk over the life course of an individual.
The reported positive associations between height and multiple subtypes of NHL suggest the existence of common biological pathways, possibly mediated by genetic factors. Height is a highly polygenic trait and estimated to be 70-80% heritable based on twin studies (17)(18)(19). Genome-wide association studies (GWAS) have identified hundreds of common variants associated with height to date, which explain approximately 16% of the variance of height (20), with additional rare variants also contributing to its variance (21). A deeper understanding of how height is associated with NHL risk may provide insight into underlying biological mechanisms leading to lymphoma.
In this study, we used previously established height-related genetic variants to construct polygenic risk scores (PRS) for height in order to examine the associations between genetically inferred height and the risk of four common NHL subtypes: DLBCL, FL, CLL, and marginal zone lymphoma (MZL).

METHODS
We used data from eight GWAS of NHL subtypes in populations of European ancestry (Supplementary Table 1), the details of which have been reported previously for DLBCL (22), FL (23), CLL (24), and MZL (25). The largest GWAS, the National Cancer Institute (NCI) NHL GWAS, consisted of cases and controls of European ancestry from 22 studies of NHL, including nine prospective cohort studies, eight population-based case-control studies, and five hospital or clinic-based case-control or case-series studies. The other seven GWAS consisted of two population-based case-control studies, one combined clinic- and population-based case-control study, one clinic-based case-control study with additional registry cases, one case series with population-based controls, one randomized clinical trial with population-based controls, and one case-control study enriched for familial cases (Supplementary Table 1). The 29 epidemiologic studies contributing to these GWAS were conducted in North America, Australia, and Europe. In total, genotype data were available for 3,100 CLL cases, 3,857 DLBCL cases, 2,847 FL cases, 825 MZL cases, and 9,505 controls. All studies obtained informed consent from participants and were approved by their appropriate Institutional Review Boards.
Genotyping was performed on commercially available Illumina and Affymetrix platforms (Supplementary Table 1). Standard quality control and filtering were applied to each GWAS, and each GWAS was imputed separately. Imputation was performed using IMPUTE2 (26) with the 1000 Genomes Project version 2, February 2012 release (27) as the reference panel. As reported previously (22)(23)(24)(25), quantile-quantile plots of the results showed no substantial evidence of inflation.
Previous studies have reported 848 independent, autosomal SNPs associated with height, primarily in European populations (20,21). Wood et al. (20) identified 697 independent SNPs associated with height based on an approximate conditional analysis, taking the linkage disequilibrium among SNPs into account. Marouli et al. (21) identified an additional 113 novel autosomal loci, which were >1 Mb from the loci reported by Wood et al., and 38 SNPs near established loci that were shown to be independent of previously reported SNPs based on conditional analyses. We constructed weighted polygenic risk scores (PRS) using 833 of these previously reported SNPs, which were available in our datasets with information scores >0.3 and p-values for Hardy-Weinberg equilibrium >$1 \times 10^{-5}$. To generate the PRS for the ith individual, the height-increasing allelic dosage for each SNP was multiplied by the reported meta-analysis beta coefficient (20,21), and the products were summed across SNPs:
$$\mathrm{PRS}_i = \sum_{j=1}^{n} w_j x_{ij}$$
where $w_j$ is the weight or beta coefficient for the jth SNP derived from the literature, $x_{ij}$ is the allelic dosage of the jth SNP for the ith individual, and n = 833 is the number of SNPs in the score.
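The weighted score described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, SNP identifiers, weights, and dosages are hypothetical, and the allele-flipping helper simply assumes that a negative reported beta means the counted allele is the height-decreasing one.

```python
from typing import Dict, Tuple


def align_to_height_increasing(dosage: float, beta: float) -> Tuple[float, float]:
    """Return (dosage, weight) expressed for the height-increasing allele.

    Assumption: dosage is on a 0-2 scale for the allele the beta refers to.
    If that beta is negative, count the other allele and flip the sign.
    """
    if beta < 0:
        return 2.0 - dosage, -beta
    return dosage, beta


def polygenic_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted PRS for one individual: sum over SNPs of weight * dosage."""
    return sum(weights[snp] * dosages[snp] for snp in weights)


# Hypothetical weights (literature betas) and imputed dosages for one person.
weights = {"rsA": 0.03, "rsB": 0.05}
dosages = {"rsA": 1.0, "rsB": 1.8}
score = polygenic_score(dosages, weights)
print(f"PRS = {score:.3f}")
```

Imputed dosages are fractional, which is why the score is computed on floating-point values rather than on integer genotype counts.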
As previous studies have shown etiologic heterogeneity among NHL subtypes (1), we analyzed each NHL subtype separately. For each GWAS, we performed logistic regression with the PRS, which was normally distributed, modeled as a continuous variable. Models were adjusted for sex, age, and statistically significant principal components for population stratification (p < 0.05) at the GWAS level. Since all controls in the UCSF1/NHS GWAS were female, models for this GWAS were adjusted for age and significant principal components of ancestry only. In a sensitivity analysis, we also adjusted for genetically determined body mass index (BMI), using previously established loci. The regression analyses were performed in R using the glm function.
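As a rough illustration of the per-GWAS model, the sketch below fits a logistic regression of case status on a continuous PRS plus covariates and converts the PRS coefficient to an OR with a Wald 95% CI. The analyses in this study were run with the glm function in R; this standard-library Python version (Newton-Raphson fitting, simulated data, hypothetical function names and effect sizes, principal components omitted for brevity) is only a stand-in for the same kind of model.

```python
import math
import random
from typing import List, Sequence, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_logistic(covariates: Sequence[Sequence[float]], y: Sequence[int],
                 max_iter: int = 50, tol: float = 1e-10) -> Tuple[List[float], List[float]]:
    """Maximum-likelihood logistic regression; an intercept is added as column 0.

    Returns (coefficients, standard errors).
    """
    X = [[1.0] + list(row) for row in covariates]
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(max_iter):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            mu = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    info[j][k] += mu * (1.0 - mu) * xi[j] * xi[k]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    # Standard errors: square roots of the diagonal of the inverse information
    # matrix from the last iteration (at the converged estimate, up to tol).
    se = [math.sqrt(solve(info, [float(k == j) for k in range(p)])[j]) for j in range(p)]
    return beta, se


# Simulated data only: columns are PRS, standardized age, sex (0/1).
random.seed(1)
covs, status = [], []
for _ in range(500):
    prs, age = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    sex = float(random.random() < 0.5)
    eta = -0.1 + 0.08 * prs + 0.3 * age + 0.2 * sex
    status.append(1 if random.random() < 1.0 / (1.0 + math.exp(-eta)) else 0)
    covs.append([prs, age, sex])

beta, se = fit_logistic(covs, status)
odds_ratio = math.exp(beta[1])  # index 1 is the PRS coefficient
ci = (math.exp(beta[1] - 1.96 * se[1]), math.exp(beta[1] + 1.96 * se[1]))
print(f"OR per unit PRS = {odds_ratio:.2f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}")
```

For a GWAS in which one covariate does not vary among controls, as with sex in UCSF1/NHS, that column would simply be left out of the covariate rows.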
For each NHL subtype, GWAS-specific risk estimates were pooled using fixed-effect meta-analysis (28). Only one GWAS contributed to MZL, so no meta-analysis was performed for MZL. Between-GWAS heterogeneity was evaluated using Cochran's Q test and quantified using the I² metric (29). Meta-analyses were conducted using the package metan in Stata v15 (StataCorp, College Station, TX). The UCSF1/NHS GWAS was excluded from the sex-specific meta-analysis of FL because all controls were female. Between-sex heterogeneity was evaluated by performing a fixed-effect meta-analysis of the sex-specific estimates to obtain the p-value from Cochran's Q test and I².
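The pooling step can be sketched as an inverse-variance fixed-effect meta-analysis with Cochran's Q and I². The study used the metan package in Stata; the Python below is only an illustrative equivalent, the function names are hypothetical, and the three input estimates are invented values rather than results from any GWAS in this analysis.

```python
import math
from typing import List, Tuple


def log_or_and_se(odds_ratio: float, ci_low: float, ci_high: float) -> Tuple[float, float]:
    """Recover the log OR and its standard error from an OR and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    return math.log(odds_ratio), se


def fixed_effect_meta(estimates: List[Tuple[float, float]]) -> Tuple[float, float, float, float]:
    """Inverse-variance fixed-effect pooling of (log OR, SE) pairs.

    Returns (pooled log OR, pooled SE, Cochran's Q, I-squared in percent).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled_se, q, i2


# Invented per-study ORs and 95% CIs, for illustration only.
studies = [log_or_and_se(1.10, 0.95, 1.27),
           log_or_and_se(1.02, 0.85, 1.22),
           log_or_and_se(1.15, 0.90, 1.47)]
pooled, pooled_se, q, i2 = fixed_effect_meta(studies)
low, high = math.exp(pooled - 1.96 * pooled_se), math.exp(pooled + 1.96 * pooled_se)
print(f"Pooled OR = {math.exp(pooled):.2f} (95% CI {low:.2f}-{high:.2f})")
print(f"Q = {q:.2f} on {len(studies) - 1} df, I2 = {i2:.1f}%")
# The heterogeneity p-value is the upper tail of a chi-square distribution with
# k - 1 degrees of freedom evaluated at Q (e.g., scipy.stats.chi2.sf(q, df)).
```

The same function applies to the between-sex comparison: passing the male and female estimates as two "studies" gives a Q statistic on one degree of freedom.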

RESULTS
No association was observed between genetically predicted height and the risk of DLBCL (OR = 1.05, 95% CI 0.97-1.14), FL (OR = 1.06, 95% CI 0.97-1.16), or MZL (OR = 1.10, 95% CI 0.95-1.28) (Table 1). Adjustment for BMI did not alter these results (data not shown). For FL, there was some heterogeneity in the GWAS-specific risk estimates (I² = 57.5%, p-heterogeneity = 0.07, Supplementary Table 2), attributable primarily to the inclusion of the SCALE study. Removal of SCALE from the meta-analysis for FL risk slightly strengthened the combined association, but the resulting association did not reach statistical significance (OR = 1.09, 95% CI 0.99-1.20, p = 0.08). In sex-specific analyses for these three subtypes, no significant associations were seen among men or women, with risk estimates similar to those observed for both sexes combined.

DISCUSSION
In this first examination of genetically inferred height and NHL risk, we observed a borderline positive association between genetically inferred taller height and the risk of CLL. This finding is consistent with observations from epidemiologic studies examining height and CLL risk, which generally have reported positive or non-significant associations (2,3,5,11). The prospective UK Million Women Study, with 920 CLL cases, found an increased risk of CLL with increasing height (2). Although the Netherlands Cohort Study observed an association between taller adult height and risk of NHL overall, it did not observe a significantly elevated risk of CLL with 165 CLL cases (5). A previous pooled analysis within the InterLymph Consortium found an overall positive association between reported height and risk of CLL with a modest magnitude of association (OR per 10-cm increase = 1.10, 95% CI = 1.02-1.19) (11), which is comparable to the association with a one-unit increase in the PRS from the present study, but without a suggestion of heterogeneity by sex. Collectively, the present and published data provide evidence for a positive association between adult height and risk of CLL, although additional studies are needed to clarify possible sex differences.
No evidence of association was observed between genetically determined height and the risk of DLBCL, FL, or MZL. Although the findings are not consistent (3,8), some epidemiologic studies have reported positive findings for common NHL subtypes and height (2,30). The UK Million Women Study reported a positive association with height for FL, DLBCL, and NHL overall (2). A large pooled analysis within the InterLymph Consortium found that greater adult height was positively associated with FL among males, but not females (12). Consistent with our results, no significant associations were reported between adult height and MZL or DLBCL in the InterLymph pooled analysis (13,31); however, an increased risk of DLBCL was observed for both sexes combined when comparing the highest with the lowest quartile of usual adult height in a minimally adjusted model (31). It should be noted that some of the cases and controls included in our analysis were also included in the InterLymph pooled analyses of reported height and NHL risk (32); however, our study also included 2,550 cases and 4,330 controls from prospective studies, which were not part of the pooled InterLymph analyses (Supplementary Table 1).
Despite the suggestive associations in the literature for self-reported or measured height and FL and DLBCL risk, there are several possible reasons why our study of genetically determined adult height did not identify statistically significant associations. A recent methodologic paper on the use of polygenic risk scores demonstrated that null results are often explained by inadequate sample size (33). Even though our study utilized data from the largest GWAS of NHL to date, it is possible that our study was underpowered to detect an association for these NHL subtypes. In addition, it is possible that the genetic component of height that is pleiotropically related to risk of these NHL subtypes is not well captured by the current set of height-associated variants. The PRS for height constructed for the present analyses accounts for <20% of the heritable component of adult height (20,21). A PRS incorporating a larger fraction of the heritable component of adult height would provide a more precise measure of the genetic contribution to height and allow for a better assessment of the association with NHL risk. As more of the genetic component of height is identified, it may be possible to create a better PRS that accounts for a larger percentage of height variation. Finally, it is also possible that the non-genetic component of height, such as childhood nutrition, or a factor correlated with height is responsible for the reported epidemiologic associations between adult stature and risk of NHL.
Among the noteworthy limitations of our study is the fact that our participants were from populations of European descent, limiting the generalizability of results. However, this restriction to Europeans also has the benefit of minimizing bias from population stratification, which is an important concern with studies of height. We did not adjust for multiple testing, because previous studies have shown individual NHL subtypes to have distinct etiologies (1). However, it is possible that the finding we observed with CLL may be a false-positive result, and additional studies are needed to replicate our findings. We were unable to perform a true Mendelian randomization analysis due to lack of data on height from many studies, but we were able to combine multiple GWAS of NHL by meta-analysis to achieve sample sizes of several thousand cases for three of the four subtypes. This approach further enabled a direct assessment of between-study and between-sex heterogeneity of the observed associations. Future studies with even larger sample sizes may be able to evaluate the shared genetic correlation with height using cross-trait LD Score regression.
In conclusion, we present the first polygenic risk score analysis examining the genetic contribution of attained adult height to the risk of the four most common NHL subtypes. Although we did not observe statistically significant associations with DLBCL, FL, or MZL, we observed evidence of a positive association between genetically inferred height and CLL risk, lending more support to previous observational studies. Additional studies are warranted to explore the association between genetically inferred height and risk of other NHL subtypes, particularly those that have been reported to be associated with self-reported or measured height in epidemiologic studies (34,35). Additional insight could be gained from studies that explore the relationship between NHL risk and environmental or endogenous factors that may be correlated with, or determinants of, height. As more knowledge is gained about the genetic architecture and determinants of height, a more informative PRS can be constructed to represent a larger fraction of the heritable component of adult height and underlying biological pathways. A re-examination of the associations of NHL subtypes with genetically inferred height may be merited in the future as our understanding of height and its underlying biology grows.

ETHICS STATEMENT
All studies involving human participants were reviewed and approved by their respective Institutional Review Boards. The patients/participants provided their written informed consent to participate in this study.
Study for their valuable contributions as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX,
UCSF2-The UCSF studies were supported by the NCI, National Institutes of Health, CA1046282 and CA154643. The collection of cancer incidence data used in this study was supported by the California Department of Health Services as part of the statewide cancer reporting program mandated by California Health and Safety Code section 103885; the National Cancer Institute's Surveillance, Epidemiology, and End Results Program under contract HHSN261201000140C awarded to the Cancer Prevention Institute of California, contract HHSN261201000035C awarded to the University of Southern California, and contract HHSN261201000034C awarded to the Public Health Institute; and the Centers for Disease Control and Prevention's National Program of Cancer Registries, under agreement #1U58 DP000807-01 awarded to the Public Health Institute. The ideas and opinions expressed herein are those of the authors, and endorsement by the State of California, the California Department of Health Services, the National Cancer Institute, or the Centers for Disease Control and Prevention or their contractors and subcontractors is not intended nor should be inferred.

UTAH/Sheffield-National Institutes of Health CA134674. Partial support for data collection at the Utah site was made possible by the Utah Population Database (UPDB) and the Utah Cancer Registry (UCR). Partial support for all datasets within the UPDB is provided by the Huntsman Cancer Institute (HCI) and the HCI Cancer Center Support grant, P30 CA42014. The UCR was supported in part by NIH contract HHSN261201000026C from the National Cancer Institute SEER Program with additional support from the Utah State Department of Health and the University of Utah. Partial support for data collection in Sheffield, UK was made possible by funds from Yorkshire Cancer