Improved prognostic stratification using NCCN- and GELTAMO-international prognostic index in patients with diffuse large B-cell lymphoma

The National Comprehensive Cancer Network (NCCN)-International Prognostic Index (IPI) and GELTAMO (Grupo Español de Linfomas/Trasplante Autólogo de Médula Ósea)-IPI were developed to enable better risk prediction in patients with diffuse large B-cell lymphoma (DLBCL). The present study compared the effectiveness of risk prediction between the IPI, NCCN-IPI, and GELTAMO-IPI in patients with DLBCL, particularly in terms of identifying high-risk patients. Among 439 patients who were enrolled in a prospective DLBCL cohort treated with R-CHOP immunochemotherapy, risk groups were classified according to the three IPIs, and the prognostic significance of individual IPI factors and IPI models was analyzed and compared. All three IPIs effectively separated the analyzed patients into four risk groups according to overall survival (OS). The estimated 5-year OS of patients classified as high-risk according to the IPI was 45.7%, suggesting that the IPI is limited in the selection of patients who are expected to have a poor outcome. In contrast, the 5-year OS of patients stratified as high-risk according to the NCCN- and GELTAMO-IPI was 31.4% and 21.9%, respectively. The results indicate that the NCCN- and GELTAMO-IPI are better than the IPI in identifying patients with poor prognosis, suggesting the superiority of enhanced, next-generation IPIs for DLBCL.


INTRODUCTION
The International Prognostic Index (IPI) has been widely adopted in clinical practice since its introduction almost 25 years ago for patients with aggressive non-Hodgkin lymphoma (NHL) [1]. The IPI is clinically useful because it is reproducible, allows convenient scoring, and categorizes patients into risk groups. Several modified versions of the IPI according to the subtypes of NHL have been described [2][3][4]. These modifications and the original IPI, which comprises five factors, have been used in patients with aggressive NHL, including DLBCL [1]. The addition of rituximab to chemotherapy has improved the outcome of patients with DLBCL and necessitated a re-evaluation of the role of the IPI. It was concluded that the IPI remains a valid prognostic indicator for patients with DLBCL in the rituximab era [5].
Despite maintaining its overall prognostic value, the IPI has been criticized because it cannot effectively separate patients who are expected to have a poor outcome in the rituximab era [6]: in contrast to the pre-rituximab era, the 5-year overall survival (OS) of the IPI-defined high-risk group has improved significantly, approaching 40 to 50% [7][8][9][10], suggesting that even patients classified into the poorest risk group according to the IPI have up to a 50% chance of cure. Sehn et al. reported a convergence of Kaplan-Meier curves among the high-intermediate (HI) and high-risk categories defined by the IPI, and suggested a Revised-IPI (R-IPI) for better prediction of survival [7]. However, in the R-IPI, the 4-year OS of patients in the high-risk category was 55%, and patients expected to have a dismal prognosis were not distinguished [7].
In 2014, Zhou et al. proposed the National Comprehensive Cancer Network (NCCN)-IPI [8], which applied enhanced stratification and scoring of age and of the ratio of serum lactate dehydrogenase (LDH) to the upper limit of normal (ULN). In addition, they included the involvement of major extranodal organs [bone marrow, central nervous system (CNS), liver/gastrointestinal tract, and lung] as a factor of the NCCN-IPI instead of the conventional definition of "involvement of >1 extranodal sites" used in the IPI. In their study using the NCCN study cohort comprising 1,650 individuals from seven NCCN centers, and the British Columbia Cancer Agency (BCCA) validation cohort (n = 1,138), the 5-year OS of NCCN-IPI-defined high-risk patients was 33% in the NCCN cohort and 38% in the BCCA cohort, suggesting improved selection of the high-risk group compared to the IPI [8]. Recently, the Grupo Español de Linfomas/Trasplante Autólogo de Médula Ósea (GELTAMO)-IPI Project Investigators proposed a new IPI incorporating the elevation of beta-2 microglobulin (B2MG) above the ULN and an enhanced scoring system that differs from that of the NCCN-IPI (Table 1). They reported that the GELTAMO-IPI yielded a better discrimination of high-risk DLBCL patients compared to the NCCN-IPI (5-year OS 39% vs. 49%) [9].
The purpose of the present study was to validate and compare the effectiveness of the risk assessment between the IPI, NCCN-IPI, and GELTAMO-IPI among patients with DLBCL treated with rituximab-CHOP (R-CHOP) immunochemotherapy, particularly in terms of identifying high-risk patients.

Patient characteristics and classification
Among 603 patients who were enrolled in the PROCESS study, 164 patients were excluded [8 patients did not satisfy the inclusion criteria and 156 patients lacked data on baseline serum beta-2-microglobulin (B2MG)], and the remaining 439 patients, who had complete clinical, radiologic, and laboratory data enabling their classification according to the three IPI schemes, were included in the current study (Figure 1).
The baseline characteristics of the analyzed patients are summarized in Table 2. The overall characteristics of the 439 patients did not deviate from those of all the patients in the PROCESS cohort. During the median follow-up duration of 55.0 months (95% CI 53.1-57.0), 133 patients (30.3%) experienced progression-free survival (PFS) events and 120 patients (27.3%) died. Five-year PFS and OS rates were 66.8% and 70.6%, respectively.
According to the IPI, the proportion of patients in the low-risk group was the highest (43%). In the NCCN-IPI, the proportion in the low-intermediate (LI)-risk group was the highest (45%). In the GELTAMO-IPI, most patients were classified into the LI-risk group (61%). The patterns of distribution of patients according to the three IPIs were overall similar to those of the original NCCN and GELTAMO studies (Figure 2) [8,9]. The NCCN- and GELTAMO-IPI classified a relatively smaller proportion of patients into the high-risk group (8.9% in the NCCN-IPI and 6.8% in the GELTAMO-IPI), compared to the IPI (18.2% in the high-risk group).

Prognostic significance of individual IPI factors in the IPI, NCCN-IPI, and GELTAMO-IPI
All five factors of the IPI showed a significant difference in OS, with hazard ratios (HRs) between 2.27 and 4.10 (Table 3). Enhanced stratification of age (in the NCCN- and GELTAMO-IPI), serum LDH (in the NCCN-IPI), and performance status (PS; in the GELTAMO-IPI) resulted in more effective risk stratification, except between the groups aged ≤ 40 vs. 41-60 years in the NCCN-IPI (p = 0.175). Involvement of the extranodal sites designated by the NCCN-IPI failed to show prognostic significance (p = 0.755). Patients with an increased serum B2MG level showed significantly inferior OS compared to those without increased B2MG. Ann Arbor staging lost its prognostic significance in the multivariate analyses performed for all three IPIs. Otherwise, most factors maintained independent prognostic significance (Table 4).

Stratification of patients according to the IPI, NCCN-IPI, and GELTAMO-IPI
All three IPI schemes effectively separated the analyzed patients into four risk groups according to OS (Table 5 and Figure 3). The estimated 5-year OS of patients classified into the high-risk group according to the IPI was 45.7%, suggesting that the IPI is limited in the selection of patients who are expected to have a poor outcome. In contrast, the 5-year OS of patients stratified as high-risk according to the NCCN- and GELTAMO-IPI was 31.4% and 21.9%, respectively (Table 5). In the reclassification calibration statistic analysis, the NCCN- and GELTAMO-IPI showed superior risk prediction (separating patients into high-risk vs. non-high-risk) compared to the IPI (Table 6). Comparison between the NCCN- and GELTAMO-IPI was not statistically feasible because the numbers of patients in the high-risk group by either of the two IPIs were small and 23 patients were classified as high-risk by both the NCCN- and GELTAMO-IPI.

DISCUSSION
In the present study, the NCCN- and GELTAMO-IPI, the revised versions of the IPI that feature enhanced scoring systems (and the addition of serum B2MG in the case of the GELTAMO-IPI), showed improved prognostic power to detect patients with dismal prognosis compared to the IPI.
The population we analyzed reflects real-world clinical practice for DLBCL patients because they were accrued from 27 medical centers with a nation-wide distribution, and our prospective cohort had no specific interventions relevant to patient selection or additional investigative therapy. Our patients had a median age of 60 years (57 years in the NCCN cohort and 63 years in the BCCA cohort of the NCCN-IPI study, and 60 years in the GELTAMO-IPI study). Forty-eight percent of the patients were > 60 years of age, 57% were male, 51% had LDH > 1x ULN, 50% had Ann Arbor stage III or IV disease, and 12% had PS > 1, showing that these characteristics did not deviate significantly from the populations in the original studies of the NCCN- and GELTAMO-IPI.
In the present study, involvement of the NCCN-designated extranodal sites had no prognostic significance. The prognostic implication of involvement of the gastrointestinal tract, one of the designated sites, is controversial. Studies have suggested poor survival [11], no association [10], and even favorable outcomes [12]. In a Japanese retrospective study of 1,221 patients, involvement of the small intestine was an IPI-independent poor prognostic factor, whereas involvement of the stomach or colon was not [13]. In addition, the involvement of extranodal sites other than the NCCN-defined lesions has been suggested as a poor prognostic indicator, including the genitourinary tract [14] and female reproductive organs [10]. In the validation of the NCCN-IPI by the GELTAMO group, the NCCN-designated extranodal disease lost its prognostic value in multivariate analysis [9], and therefore it was not included in the GELTAMO-IPI. A large-scale retrospective analysis (n = 25,992) using the Surveillance, Epidemiology, and End Results (SEER) database from 2004 to 2009 reported that the sites of extranodal involvement are more prognostic than the number of involved sites [11]. However, a Danish-Canadian study reported that involvement of three or more extranodal sites is independently associated with dismal outcomes [10]. Considering the above results, the prognostic impact of extranodal sites in terms of their number or anatomic location is still an area of debate.
In our study, advanced Ann Arbor staging lost its prognostic significance in multivariate analyses, although it retained prognostic value in several studies using large cohorts [8,9,13]. However, several lines of evidence suggest that the prognostic role of Ann Arbor staging is at least more limited than that of other IPI factors in the rituximab era. Ziepert et al. performed a meta-analysis of three large clinical trials [5] and reported that for patients who received rituximab-containing immunochemotherapy, Ann Arbor staging was not prognostic of OS in the MInT trial (p = 0.5217), MegaCHOEP study (p = 0.107), and RICOVER-60 trial (p = 0.061). Moreover, the use of positron emission tomography/computed tomography (PET/CT) in response evaluation may contribute to the attenuation of the prognostic significance of Ann Arbor staging. In the Danish-Canadian study conducted by El-Galaly et al., patients were staged and restaged by PET/CT. The authors reported no significant difference in prognosis among patients with stage I, II, and III disease, with only stage IV patients displaying an inferior OS. The 3-year OS was 89% [95% confidence interval (CI), 83-95%], 76% (95% CI, 62-90%), 82% (95% CI, 70-94%), and 62% (95% CI, 54-70%) for stage I, II, III, and IV disease, respectively [10]. The authors stated that the increased sensitivity of PET/CT may have upstaged a proportion of patients, particularly by detecting extranodal sites that would not be found by conventional CT [10]. Our study also integrated PET/CT for response evaluation. It is noteworthy that as the modality of response evaluation shifts from CT to PET/CT, stage migration may occur, which may attenuate the prognostic significance of Ann Arbor staging.
Recently published studies reproduced the overall satisfactory prognostic stratification of DLBCL patients according to the NCCN-IPI in cohorts of 100 to 443 DLBCL patients [10][15][16][17][18][19]. However, in the aforementioned Danish-Canadian study, the NCCN-IPI was suboptimal in identifying the high-risk group: the 3-year OS of patients in the high-risk group was 48% [10]. Therefore, modifications of the NCCN-IPI, such as integrating other clinical or laboratory factors into the index, have been attempted to further improve the separation of patients with expected dismal outcomes. The GELTAMO-IPI was developed after a validation study of the NCCN-IPI using 2,156 patients with DLBCL from the archives of 20 hospitals in the GELTAMO network in Spain [9]. In the development of the GELTAMO-IPI, enhanced scoring was applied to age and PS, and involvement of extranodal sites was excluded. Notably, serum B2MG was included as an IPI factor. B2MG is a component of the major histocompatibility complex class I molecule, and it is present on all nucleated cells [20]. Elevated serum B2MG has been used as a prognostic indicator in the International Staging System for multiple myeloma [21] and the Follicular Lymphoma International Prognostic Index-2 for follicular lymphoma [22], and its potential role as a prognostic biomarker has been reported in many subtypes of mature lymphoid malignancies [23][24][25][26] and in lymphoma-associated hemophagocytic lymphohistiocytosis [27]. A mechanism linking elevated serum B2MG to poor prognosis has been suggested, in which B2MG is an indicator of heavy tumor burden with a high cellular turnover rate [28]. However, this remains unclear considering that the elevation of B2MG was independent of serum LDH and Ann Arbor staging in previous studies [9,29] as well as in the present study. Further investigations are required on this issue.
In the present study, we did not integrate any biologic prognostic markers recently defined or suggested by advances in genomics, molecular biology, or immunology in the field of DLBCL. Cell of origin [30], stromal gene signature or its protein expression [31][32][33], double hit [34], and co-expression of MYC and BCL2/BCL6 (double expresser) [35] were not analyzed. However, the aim of the present study was to validate and compare IPIs, and the above integrations are beyond its scope. It is a limitation of our study that we could not compare the efficiency of selecting the high-risk group between the NCCN- and GELTAMO-IPI. In conclusion, our study shows that the NCCN- and GELTAMO-IPI have a significant advantage in identifying patients with poor prognosis, with a 5-year OS rate of approximately 20 to 30%, by using basic clinical information and blood tests that are inexpensive and have a rapid turnaround time. Therefore, when selecting high-risk patients, it would be more reasonable to use the NCCN- or GELTAMO-IPI rather than the IPI in clinical practice.

Patients
Analyses were conducted with patients enrolled in the PROCESS (Prospective Cohort Study with Risk-Adapted Central Nervous System Evaluation in DLBCL) study from 27 hospitals belonging to the Consortium for Improving Survival of Lymphoma (CISL) in Korea. The original purpose, inclusion and exclusion criteria, and detailed information on the study conduct were as previously described [36]. Briefly, to evaluate risk factors for CNS relapse in patients with DLBCL, adult patients with newly diagnosed DLBCL who were planned to receive three-weekly R-CHOP as primary treatment were included. Patients with primary CNS DLBCL were excluded. Baseline evaluation of CNS involvement of DLBCL was recommended for any symptomatic patients or those with features indicating a high risk of CNS involvement. However, the evaluation was not obligatory and there were no other specific interventions for the treatment of DLBCL. Interim and final response evaluations were conducted using PET/CT. The study started in August 2010 and completed patient enrollment in August 2012. Follow-up data regarding disease status and survival were updated every 6 months, with the latest update performed in February 2017. This study was approved by the institutional review boards of the participating institutions.

IPI, NCCN-IPI, and GELTAMO-IPI
Risk groups were classified according to the scores calculated as described for the IPI, NCCN-IPI, and GELTAMO-IPI, respectively. For the analysis of serum LDH and B2MG levels, normalized values (ratio to the ULN of each participating institution) were calculated and used.
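As an illustration of the normalization and scoring steps described above, the following minimal Python sketch computes the ratio to the ULN and an NCCN-IPI score and risk group. The cut-offs follow the NCCN-IPI as published by Zhou et al. [8] and are not reproduced from Table 1 of this article; the `Patient` record, its field names, and the function names are hypothetical and are not part of the study's analysis code.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical record holding the baseline values used for scoring."""
    age: int                 # years at diagnosis
    ldh: float               # measured serum LDH
    ldh_uln: float           # upper limit of normal at the treating institution
    stage: int               # Ann Arbor stage, 1-4
    ecog_ps: int             # ECOG performance status
    major_extranodal: bool   # bone marrow, CNS, liver/GI tract, or lung involved


def ratio_to_uln(value: float, uln: float) -> float:
    """Normalize a laboratory value to the institution-specific ULN."""
    if uln <= 0:
        raise ValueError("ULN must be positive")
    return value / uln


def nccn_ipi_score(p: Patient) -> int:
    """NCCN-IPI score (0-8), assuming the cut-offs published in [8]."""
    score = 0
    # Age: >40 to 60 = 1 point, >60 to <75 = 2 points, >=75 = 3 points
    if p.age >= 75:
        score += 3
    elif p.age > 60:
        score += 2
    elif p.age > 40:
        score += 1
    # LDH ratio: >1 to 3 = 1 point, >3 = 2 points
    ratio = ratio_to_uln(p.ldh, p.ldh_uln)
    if ratio > 3:
        score += 2
    elif ratio > 1:
        score += 1
    # One point each for stage III-IV, major extranodal disease, and PS >= 2
    score += int(p.stage >= 3)
    score += int(p.major_extranodal)
    score += int(p.ecog_ps >= 2)
    return score


def nccn_ipi_group(score: int) -> str:
    """Map an NCCN-IPI score to one of the four risk groups."""
    if score <= 1:
        return "low"
    if score <= 3:
        return "low-intermediate"
    if score <= 5:
        return "high-intermediate"
    return "high"
```

Using the ratio rather than the absolute LDH value keeps the score comparable across institutions whose assays have different reference ranges, which is why the normalization is done before any cut-off is applied.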

Statistical analysis
PFS was defined as the time from the date of diagnosis to the date of disease progression, relapse, death from any cause, or last follow-up. OS was defined as the time from diagnosis to death from any cause. Patient survival was analyzed using the Kaplan-Meier method and compared by the log-rank test. Multivariate analyses by backward conditional Cox regression model were conducted with variables that had p < 0.1 in univariate analysis. All p-values were two-sided and statistical significance was accepted at the level of p < 0.05. To compare the ability to predict high-risk patients between two IPIs, the risk category of each IPI was dichotomized into either a high-risk or a non-high-risk group (patients with low risk + LI risk + HI risk) as defined by the respective IPI. A reclassification calibration statistic was used, in which an IPI with a smaller statistic (χ²) and a larger p-value is considered to have better risk prediction than its counterpart [37].
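To make the survival estimation and the dichotomization step concrete, the following is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python, together with a helper that collapses the four risk groups into high-risk vs. non-high-risk. It is an illustration under stated assumptions, not the software used in this study: the function names and the group labels are hypothetical, and the log-rank test, Cox regression, and reclassification calibration statistic are not implemented here.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient (e.g. months from diagnosis)
    events : 1 if the event (death for OS) was observed, 0 if censored
    Returns (time, survival probability) pairs at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Patients censored at time t are still counted as at risk at t
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def survival_at(curve: List[Tuple[float, float]], t: float) -> float:
    """Read the step function: estimated survival probability at time t."""
    s = 1.0
    for event_time, prob in curve:
        if event_time > t:
            break
        s = prob
    return s


def is_high_risk(group: str) -> bool:
    """Collapse four risk groups into high-risk vs. non-high-risk."""
    return group == "high"
```

For example, calling `survival_at(curve, 60)` on a curve built from follow-up times in months gives the estimated 5-year survival for that group, provided follow-up extends that far.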