Immunomethylomic approach to explore the blood neutrophil lymphocyte ratio (NLR) in glioma survival

Differentially methylated regions (DMRs) within DNA isolated from whole blood can be used to estimate the proportions of circulating leukocyte subtypes. We use the term “immunomethylomics” to describe the application of these immune lineage DMRs to studying leukocyte profiles. Here, we applied this approach to peripheral blood DNA from 72 glioma patients with molecularly defined brain tumors, representing common patient groups with defined characteristic survival times and risk factors. We first estimated the proportions of leukocyte subtypes in samples using deconvolution algorithms with reference DMR libraries from isolated leukocyte populations and Illumina 450K DNA methylation data. Then, we calculated the neutrophil to lymphocyte ratio (NLR) using methylation-derived cell composition estimates (mdNLR). The NLR is considered an indicator of immunosuppressive cells in cancer patients. Elevated mdNLR scores were observed in glioma patients compared to mdNLR values of published controls. Significantly decreased survival times were associated with mdNLR ≥ 4.0 in Cox proportional hazards models adjusted for age, gender, tumor grade, and molecular subtype (HR 2.02, 95% CI, 1.11–3.69). We also identified five myeloid-related CpGs that were highly correlated with the mdNLR (adjusted R2 ≥ 0.80). Each of the five myeloid CpG loci was associated with survival when adjusted for the above covariates and offer a simplified approach for utilizing fresh or archived peripheral blood samples for interrogating a very small number of methylation markers to estimate myeloid immune influences in glioma survival. The mdNLR (based on DNA methylation) is a novel candidate methylation biomarker that represents immunosuppressive myeloid cells within the blood of glioma patients with potential application in clinical trials and future epidemiologic studies of glioma risk and survival.

diagnosis, clinical behavior, acquired genetic alterations, and associated germline variants [4]. Among these groups, IDH mutant only and TERT mutant only tumors are the most common and comprise about 75% of adult glioma patients [4].
While the molecular classification of tumors has substantially improved our understanding of glioma prognosis, immune factors are notably absent in existing prognostic models. This omission is significant as immune evasion is a recognized hallmark of cancer [5], and there is abundant evidence that glioma patients suffer systemic immune defects, with the most profound alterations occurring in GBM patients [6][7][8][9][10]. Recent studies have emphasized the important role of developmentally immature and aberrantly activated myeloid-derived cells as contributing to cancer immunosuppression and adversely affecting patient survival [11][12][13]. Furthermore, immune interventions represent a potentially powerful new therapeutic approach in glioma [14,15].
The best methods to assess altered myeloid populations or systemic immunosuppression, more generally, are still evolving [16] and, as a result, large-scale studies are lacking. However, the peripheral blood neutrophil to lymphocyte ratio (NLR), which can be derived using the common five-part white blood cell differential (neutrophils, basophils, eosinophils, monocytes, lymphocytes), has emerged as a surprisingly robust marker of cancer associated inflammation [17]. Increases in the blood NLR have been remarkably consistent in their association with poor cancer survival. A recent meta-analysis including 100 independent studies encompassing over 40,000 subjects demonstrated that an elevated NLR was a statistically significant predictor of poor overall survival, cancerspecific survival, as well as progression free and disease free survival, even after adjustment for established risk predictors [18]. There are four studies showing shorter survival times in glioma patients with an elevated NLR [19][20][21][22]. Importantly, however, no study has taken into account the molecular features of glioma in conjunction with the NLR or other immune factors.
In this study, our goal was to apply a new epigenetic approach to immune profiling to explore myeloid-related blood markers in glioma survival. Specifically, we examined the peripheral blood DNA methylation status of glioma cases using bioinformatic algorithms that deconvolute the complex methylation signature of whole blood into its component cell compartments [23][24][25][26]. This approach to immune studies is based on recent epigenetic discoveries showing that differentially methylated regions (DMRs) provide highly specific and quantitative markers of immune cell profiles [27,28]. Recently, we developed and validated an algorithm to estimate the NLR from 450K methylation data (methylation-derived NLR; mdNLR) [29]. Our results showed strong agreement between mdNLR and cytological NLR, and elevated mdNLR was significantly associated with diminished patient survival times in head and neck squamous cell carcinoma and bladder cancer, as well as breast and ovarian cancer risk [29], paralleling the now considerable literature on the relationship between cytological NLR and cancer survivorship [18]. Here, we studied the association of the mdNLR with survival among glioma patients.
Because altered myeloid differentiation is implicated in immune alterations in glioma, we also explored the idea that associations of mdNLR in glioma may be linked to myeloid-specific developmental CpG loci. We identified myeloid versus lymphoid specific CpGs on the 450K array that strongly correlate with the mdNLR. This provides important evidence that the NLR is a surrogate marker of myeloid suppression. Consequently, both the mdNLR and the myeloid single CpGs are potential markers of skewed myeloid profiles that may be useful in characterizing immune defects associated with survival in glioma.

Patient samples
Patients were chosen from the University of California San Francisco (UCSF) Adult Glioma Study (AGS) who had both archival blood and tumor marker data [30]. AGS participants represent primary glioma patients; no recurrent or secondary GBM cases are included. Seventy-two cases were selected from the two most prevalent molecular subtypes of glioma [4] (i.e., cases with IDH mutation only or TERT promoter mutation only). Samples from cases aged 40 to 59 were selected as follows: all available non-GBMs and IDH-only GBMs were included. TERT-only GBMs were chosen to match the ages of both the IDH-only GBMs and the TERTonly non-GBMs. Blood samples were collected from patients a median of 100 days after they were histologically diagnosed. Clinical information was collected on patient treatments including temozolomide (TMZ) chemotherapy, radiation therapy, extent of surgery, and steroid use at the time of blood sampling. The anticoagulated whole blood was processed, and DNA was isolated and bisulfite converted as previously described (27).

Quality control and preprocessing of the DNA methylation data
Illumina 450K arrays were run by the UCSF Human Genomics core. Preprocessing and quality control was accomplished using the minfi Bioconductor package [31]. To ensure high-quality methylation data, CpG loci having a sizable fraction (>25%) of detection p values above a predetermined threshold (detection P > 10E-5) were excluded [32]. Subset Quantile Within Array (SWAN) normalization was performed for type 1/2 probe adjustment [33]. The presence of technical sources of variability induced by plate and/or BeadChip was examined using principal components analysis (PCA), and the top K principal components [34] were examined in terms of their association with plate and BeadChip. If plate and/or BeadChip was found to be significantly associated with any of the top K principal components, we applied ComBat method [35] for normalization using the sva Bioconductor package.

Cell mixture deconvolution analysis
Using the preprocessed and normalized methylation data, we applied an optimized reference-based cell mixture deconvolution methodology [28] to gain insight into the cellular composition of the samples considered here. Specifically, the proportions of CD4+ T cells, CD8+ T cells, B cells, natural killer (NK) cells, monocytes, and granulocytes were estimated for each sample using the function "estimateCellCounts" in the minfi Bioconductor package using an optimized reference library set of CpGs.
Computing the methylation-derived neutrophil lymphocyte ratio (mdNLR) Estimation of the mdNLR was carried out as previously described [29]. Briefly, the method requires three main steps: (i) identify differentially methylated CpGs among leukocyte subtypes, (L-DMRs), (ii) perform cell mixture deconvolution to estimate the proportion of leukocyte subtypes using L-DMRs identified in step 1, and (iii) compute the ratio of the predicted proportion of neutrophil granulocytes to lymphocytes. The mdNLR was computed by taking the ratio of predicted granulocyte and lymphocyte fractions, mdNLR i ¼ω Gran;i The mdNLR scores are based on beta values using 300 L-DMR CpGs [28]. A publicly available implementation of this method is available in the IDOL R package (https://www.r-project.org/). 1

Statistical analyses of the mdNLR and clinical outcomes
Associations between mdNLR and clinical covariates were assessed using either logistic regression or linear regression models. Cox proportional hazards regression models were used to examine the association between mdNLR and survival time and were fit using the "coxph" function in the survival R package. Survival models were adjusted for established risk predictors and potential confounders, including age, gender, histological subtype (GBM versus non-GBM), and IDH/TERT mutation status (IDH-only mutation versus TERT-only mutation). The proportionality assumption was assessed by plotting the scaled Schoenfeld residuals against time, and the "cox.zph" function in the survival R package was used for testing the proportionality of each predictor included in our survival models [36]. In our survival analyses, mdNLR was modeled both as a continuous predictor and by dichotomizing subjects into high and low mdNLR groups. The binary cut point of mdNLR >4 is based on previous studies [19]. We also compared the performance of different survival models that included known risk factors compared with analyses including mdNLR and single locus CpGs. Three metrics were computed using the packages survival and survAUC to compare the performance of these models: concordance index (c-index), the Gerds and Schumacher Brier score, and the Song and Zhou [37] time-dependent area under the receiver operator characteristic curve (tAUROC) [38]. Log-rank tests were used to judge differences between the experimental and baseline model. The baseline model contained patient age, gender, tumor grade, and mutation status (TERT mutant only vs IDH mutant only).

Identification of myeloid-specific single locus markers of the mdNLR
While the mdNLR requires 300 CpGs to estimate the neutrophil lymphocyte ratio (29), we hypothesized that the NLR (and the mdNLR) is a biomarker of the known influx of myeloid-derived suppressor cells into the peripheral blood that occurs with the development of a new cancer (11), and as a result of this, reasoned that there may exist individual influential CpGs arising during myeloid differentiation that could serve as surrogates for the mdNLR. To test this hypothesis, we first sought to identify myeloid-specific markers. The M values of 54 samples from the Reinius dataset (excluding the six whole blood samples, GSE35069 [39]) were modeled according to if they were predominantly myeloid or lymphoid cells, adjusting for the proportion of the blood cells in the samples as measured by flow cytometry. The top 100 loci were selected using the RnBeads automatic rank cutoff approach. A second model then evaluated the relationship between the mdNLR as the outcome and the top 100 myeloid-specific loci to obtain a reduced list of methylation-derived mdNLR surrogates. For variance stabilization, beta values were converted to M values and were modeled assuming linear, quadratic, and cubic relationships with survival time; we then computed adjusted R 2 values to assess the correlation of each methylation site. We first modeled the 100 myeloidspecific loci using the methylation data from subjects in this study and then repeated the models in the Hannum [40] [GSE40279] and Liu [41] [GSE42861] blood methylation datasets. For the top 10 models, the adjusted R 2 ranged between 40-86%. Five loci were consistently found to obtain an adjusted R 2 over 80% in all three datasets. Each of the five loci was markedly demethylated in myeloid compared to lymphoid cells and stem cells (using ENCODE resources).

Neutrophil lymphocyte ratio in glioma patients assessed by immunomethylomics
The study sample sizes, clinical characteristics, and available demographic/epidemiological data are given in Table 1. Leukocyte cell composition of whole blood was calculated with our validated algorithm and optimized reference libraries using the IDOL procedure [28], Additional file 1: Figure S1. Combining the myeloid and lymphocytic subtypes allowed the calculation of the mdNLR. The mdNLR scores among glioma cases were then compared with a large public database of blood methylation data collected on 656 non-cancer adults [40]. Figure 1a compares the distributions of mdNLR among glioma cases and the non-cancer comparison group; the median mdNLR of glioma patients was elevated compared to the non-cancer group. Higher glioma tumor grade was associated with increased mdNLR values (panel B), but mdNLR scores were similar among cases whose tumors contained IDH1 vs TERT promoter mutation (panel C).

Association of mdNLR with glioma survival times
Median survival in cases with mdNLR < 4 was 52 months compared to those with elevated mdNLR scores; 22 months (Fig. 1d). Kaplan-Meier survival curves were further stratified by histopathology (GBM vs non-GBM) and suggested shorter survival times among GBM cases, although sample sizes are limited (Fig. 1e). Cox proportional hazards models that included known prognostic factors (age, grade, mutation status) indicated significant association of a high mdNLR (>4) with an increased risk of death; HR 2.02, 95% CI, 1.11-3.69, P = 0.02 (Table 2). A Cox model including chemotherapy and steroid use suggests that mdNLR is associated with survival time, independent of therapy; HR 1.84, 95% CI, 1.00-3.38, P = 0.049 (Additional file 2: Table S1).
Glioma grading was based on WHO 2007 criteria; however, since we know IDH mutation and 1p19q codeletion status, we can reclassify these cases using the new WHO 2016 brain tumor classification [42]. Based on the WHO 2016 criteria, two anaplastic oligodendroglioma or oligoastrocytomas cases would have been classified as GBM instead of non-GBM due to having evidence of microvascular proliferation. This reclassification would not have substantially altered the results of this analysis.

Association of single CpG myeloid differentiation loci with mdNLR and survival
Candidate loci representing myeloid-specific CpGs were identified, and the top 100 included loci hypomethylated in myeloid cells compared to lymphoid cells and only a few loci that were hypermethylated in myeloid cells Fig. 2. Genes associated with these myeloid-specific loci are summarized in Table 3. Five loci were chosen that showed very strong correlation with the mdNLR across three independent blood DNA methylation datasets, Fig. 3. Among the different models examined, the quadratic form best fit the regression of CpG methylation and mdNLR. Table 4 describes the methylation levels of these five loci according to glioma patient characteristics (tumor grade, mutation status, NLR status). The data indicate the strong association of each individual loci with patient NLR status.
We compared the performance of survival models that contain the mdNLR and found a significant difference from the base model which did not contain the mdNLR and a modest increase in the concordance score and Brier score ( Table 5). Models that individually included one of each of the five myeloid-specific differentiation CpGs revealed that the loci were significant compared to the base model and produced concordance and Brier's scores equivalent to the mdNLR. As similar results were found when any of the five loci were included, we included only one of them (cg00901982) in Table 5. We also examined models containing the mdNLR in addition to each of the five loci ( Table 6). As expected, when both variables are included in the models, little additional variance is explained.

Discussion
Shifts in the distribution and numbers of blood leukocytes as well as the emergence of aberrant myeloid cells with immunosuppressive properties are important predictors of cancer patient survival [11][12][13]. The simple NLR in the whole blood has received a great deal of attention as a replicated marker of cancer inflammation linked to poor survival [18]. Because the NLR reflects the relative balance of the myeloid and lymphocytic lineages in peripheral blood, it is sensitive to the altered myelopoiesis arising in chronic inflammation and cancer.
A main finding of this study is that DMRs that distinguish leukocyte subtypes can be used to estimate the NLR ratio and that this epigenetically derived metric, like the cytological NLR, is associated with glioma occurrence and survival times. Our observation of  elevated mdNLR values in GBM patients compared to controls is consistent with previous studies [18]. Although the mdNLR is less dramatically elevated in non-GBM compared with GBM cases, the data suggests alterations in some lower grade patients. Our sample of glioma patients was restricted to tumor subtypes containing either IDH or TERT mutations exclusively. After adjustments for these molecular features and other prognostic factors, the elevated mdNLR was a significant prognostic indicator of shorter survival times. Thus, the immunomethylomic approach to the evaluation of the NLR holds considerable promise in immune profiling. Currently, there is intense interest in multiscale assessment of immune function in cancer patients receiving traditional treatments and new immunotherapies [43]. Immunomethylomics can readily provide cell ratios as in the mdNLR and has the potential to identify aberrant epigenetic subsets of immune cells.
When we evaluated the performance of multivariate survival models with or without the mdNLR, we found a significant but modest improvement of model fit by inclusion of the mdNLR. This is not unexpected as the molecular subtypes selected for the current study represent very divergent prognostic groups. Survival for patients with IDH-only mutant glioma is much longer compared with those harboring TERT promoter mutation only tumors. Thus, survival models containing these mutation factors explain a large degree of variation in survival times and improvements in predictive performance above  Sequence corresponding to 50 bases upstream and 50 bases downstream of the CpG location based on the GRCh37/hg19 build the base model were modest in size. This is not uncommon in cancer studies. Nonetheless, the direction of the association of the mdNLR with survival is consistent with previous studies in glioma and other solid tumors that implicate myeloid factors in cancer inflammation. However, a caveat to this interpretation is that decreases in lymphocytes could drive increases in the mdNLR without alterations in the myeloid compartment. While the mdNLR is affected by either increased myeloid or decreased lymphocyte counts, the individual myeloid-specific differentiation loci are less susceptible to this effect of lymphocyte depletion. It is of interest therefore, that each of the five myeloid-specific loci performed similarly to the mdNLR and produced largely comparable performance metrics in multivariate analyses. The single CpG sites that correlated with the mdNLR are not located in classic promoter regions and their functional significance is unknown. Further work is needed to develop a deeper understanding of the potential biological significance of the five loci. It will be of interest to know whether the single CpG myeloid loci identified in this study are related to the immature and aberrant phenotype of MDSCs. The current state of immunomethylomics does not allow for the direct estimation of MDSCs that are hypothesized to be the drivers of immunosuppression associated with elevated NLR. Thus, future discovery studies of the methylomes of isolated MDSCs that identify their unique DMR repertoire are needed. Specific MDSC defining DMRs can then be added to the reference libraries of normal neutrophil and monocyte cell types and used for cell profiling. Until that time, the current markers of leukocytes, mdNLR, and myeloid differentiation are easily implemented in clinical studies and large population studies. Unprocessed peripheral blood and archival samples are suitable for immunomethylomic profiling. The single CpG myeloid differentiation markers can be used in single locus quantitative assay formats without the requirement for extensive array-based analysis.

Conclusions
The ratio of neutrophils to lymphocytes in blood has been associated with immune suppression and decreased survival times in multiple solid tumors. Based on immune cell-specific DMRs and validated cell deconvolution algorithms, we estimated the NLR in blood from glioma patients and found elevated mdNLR values in glioma patients compared to a published control group. The patient mdNLR scores were increased in patients with grade IV tumors compared with grade II/III. High mdNLR scores were significantly associated with shorter survival times. We explored the contribution of myeloid differentiation loci in driving the mdNLR and identified candidate single (myeloid-associated) gene loci that were highly correlated with the mdNLR. These loci likely represent myeloid differentiation-specific demethylation events, consistent with the hypothesis that the NLR and    mdNLR are biomarkers of myeloid suppression. Individual myeloid differentiation loci were tested in multivariate survival analyses and found to perform as well as the mdNLR in improving survival predictions. Single myeloid differentiation loci provide a simpler and cheaper alternative to the mdNLR, which requires complex array data. Immunomethylomics may be an alternative to conventional cell analysis in profiling glioma risk and survival factors.

Endnote
1 The IDOL R-package has been submitted to the Comprehensive R Archive Network (CRAN) and is available through Github. Software is also available by request.

Additional Files
Additional file 2: Table S1. Cox proportional hazards survival models including mdNLR, age, grade, mutation status, and chemotherapy and dexamethasone. (see Additional file 2: Table S1