Introduction

Immunotherapy targeting programmed cell death protein 1 (PD-1) or programmed death ligand 1 (PD-L1) has demonstrated remarkable efficacy in the treatment of several solid cancers including breast cancer1,2,3,4,5,6,7,8. PD-1, a transmembrane protein, plays an important role in downregulating functions of the immune system9. PD-L1 can be expressed in both tumor and immune cells. PD-L1/PD-1 binding has been shown to induce immune tolerance in peripheral tissues9,10, in which the PD-L1/PD-1 signaling pathway promotes tumor escape from immune surveillance10,11. Currently, diagnostic factors used to predict survival outcomes include the expression profiles of the tumor micro-environment and PD-L1/co-inhibitory proteins12,13. PD-L1 immunohistochemistry (IHC) is the only clinically approved method for predicting response to immune checkpoint inhibitor therapy. However, PD-L1 IHC testing is complex due to the availability of multiple IHC assays, each with its own reagents and other clinical designs4,5,8, resulting in nonuniform clinical application of PD-L1 IHC.

This study aimed to perform a comprehensive, retrospective evaluation of PD-L1 expression based on IHC assays with three PD-L1 antibodies (SP142, SP263, and 22C3) in tumors of patients with breast cancer. Accordingly, we investigated PD-L1 expression in tumor samples and compared survival outcomes predicted by using PD-L1 expression data obtained using each PD-L1 antibody.

Results

Patient characteristics based on PD-L1 expression

A total of 316 breast cancer patients at Gangnam Severance Hospital were included in this study. The median patient age was 50 years (range 25–86 years). Clinical characteristics were grouped and compared based on PD-L1 expression according to antibody-IHC assays, as shown in Table 1. Positive PD-L1 expression was associated with higher Ki67, estrogen receptor (ER) and progesterone receptor (PR) negativity, triple-negative breast cancer (TNBC) subtype, and higher histological grade (HG). Positive PD-L1 expression was related to lower American Joint Committee on Cancer (AJCC) stage for the SP142 group, but not for the SP263 or 22C3 groups. Positive PD-L1 expression was associated with the receipt of chemotherapy for the SP263 group. There was no statistical difference in age, lympho-vascular invasion (LVI), and receipt of radiotherapy between the three groups.

Table 1 Comparison of patient and tumor characteristics, and PD-L1 status in patients with breast cancer.

Correlation of PD-L1 assays with breast cancer tissue microarray (TMA) results

TMAs containing tumor tissue from 316 patients were stained with each antibody. A representative case that was positive for all three antibodies (Fig. 1a–d) showed > 90% stromal tumor-infiltrating lymphocyte (TIL) infiltration as well as intratumoral TIL. A discordant case that was positive for 22C3 and SP263 but negative for SP142 (Fig. 1e–h) showed less TIL infiltration than the triple-positive case. PD-L1 expression was evaluated in 289/316 (91.5%) of the TMA cores using the 22C3 antibody, and in 301/316 (95.3%) and 299/316 (94.6%) of the TMA cores using the SP142 and SP263 antibodies, respectively. The prevalence of positive PD-L1 expression differed by antibody. With 22C3, 19.7% of the patients with breast cancer had PD-L1-positive tumors [combined positive score (CPS) ≥ 1]. With SP142, only 14.3% of the patients had PD-L1-positive tumors [≥ 1% immune cells (IC)], whereas SP263 staining yielded the highest proportion of PD-L1-positive tumors (38.5%) (Fig. 2a). Tumor tissues from 273 patients were stained with all three PD-L1 antibodies, and the correspondence of PD-L1 expression among the three assays is depicted in Fig. 2b. Overall agreement among the three antibodies was moderate (κ = 0.413, P < 0.001). Pairwise agreement was moderate between 22C3 and SP263 (κ = 0.424, P < 0.001), but only fair between 22C3 and SP142 (κ = 0.398, P < 0.001) and between SP142 and SP263 (κ = 0.361, P < 0.001).
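The pairwise agreement statistics can be illustrated with a minimal Python sketch of Cohen's kappa for two binary assays. The 2 × 2 counts in the usage note are hypothetical (the per-pair contingency tables are not reported here), and the function names are illustrative. The verbal labels follow the commonly used Landis and Koch bands, which appear consistent with the "fair" and "moderate" wording above, although the text does not state which scale was applied. The overall three-antibody κ would require a multi-rater statistic and is not reproduced.

```python
def cohen_kappa(both_pos, a_only, b_only, both_neg):
    """Cohen's kappa for two binary assays, from the four cells of a 2x2 table."""
    n = both_pos + a_only + b_only + both_neg
    observed = (both_pos + both_neg) / n           # observed agreement
    a_pos = (both_pos + a_only) / n                # positivity rate, assay A
    b_pos = (both_pos + b_only) / n                # positivity rate, assay B
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)  # chance agreement
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Verbal label for kappa (Landis and Koch bands; assumed, not stated in the text)."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"
```

With hypothetical counts, `cohen_kappa(20, 5, 10, 65)` evaluates to about 0.625; applied to the reported values, `landis_koch(0.424)` returns "moderate" and `landis_koch(0.398)` returns "fair".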

Figure 1

Positive and negative expression patterns of three PD-L1 antibodies. A representative case that was positive for all three antibodies (a–d) exhibited > 90% stromal tumor-infiltrating lymphocyte (TIL) infiltration as well as intratumoral TIL. A discordant case that was positive for 22C3 and SP263 antibodies but lacked SP142 expression (e–h) presented with less TIL infiltration than the triple-positive case.

Figure 2

Immunohistochemical staining patterns of PD-L1 based on the use of three PD-L1 antibodies: (a) 57/289 patients (19.7%) had PD-L1-positive tumors using 22C3; 43/301 patients (14.3%) had PD-L1-positive tumors using SP142; 115/299 patients (38.5%) had PD-L1-positive tumors using SP263. (b) Venn diagram illustrating the correspondence of PD-L1 staining among the three PD-L1 antibodies, with the kappa value for each comparison.
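As a quick arithmetic check, the positivity rates in panel (a) follow directly from the reported counts. The short Python sketch below recomputes them; the dictionary layout and function name are illustrative only.

```python
# Positive cases and evaluable cores per antibody, as reported in panel (a).
counts = {"22C3": (57, 289), "SP142": (43, 301), "SP263": (115, 299)}


def positivity_rate(positive, evaluable):
    """Percentage of evaluable cases that were PD-L1 positive, to one decimal place."""
    return round(100 * positive / evaluable, 1)


rates = {antibody: positivity_rate(*pair) for antibody, pair in counts.items()}
```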

Prognostic significance of PD-L1 expression based on the antibodies

At a median follow-up time of 78.5 months (range, 0–325 months), 49 patients developed recurrence. Among them, 39 exhibited distant metastasis and 14 demonstrated loco-regional recurrence (four cases presented with distant metastasis and loco-regional recurrence simultaneously). There were 2, 9, and 10 recurrences in cases of discrepancy between 22C3 and SP142, 22C3 and SP263, and SP142 and SP263, respectively. Only negative PD-L1 expression based on the 22C3-IHC assay was significantly associated with decreased recurrence-free survival (RFS) [Fig. 3a; hazard ratio (HR) 2.537, 95% confidence interval (CI) 1.188–5.421, P = 0.0337, log-rank test] and distant metastasis-free survival (DMFS) (Fig. 4a; HR 2.867, 95% CI 1.247–6.589, P = 0.0131, log-rank test). However, RFS and DMFS did not differ significantly between negative and positive PD-L1 expression based on SP142- and SP263-IHC assays (Figs. 3b,c, 4b,c).

Figure 3

Kaplan–Meier survival curves of recurrence-free survival (RFS) in relation to PD-L1 expression based on PD-L1 antibody-IHC assays in patients with breast cancer. (a) Patients with negative PD-L1 expression based on the 22C3-IHC assay were associated with poor RFS (HR 2.537, 95% CI 1.188–5.421, P = 0.0337, log-rank test). (b, c) PD-L1 expression with SP142- and SP263-IHC assays did not show significantly different RFS.

Figure 4

Kaplan–Meier survival curves of distant metastasis-free survival (DMFS) in relation to PD-L1 expression based on PD-L1 antibody-IHC assays in patients with breast cancer. (a) Patients with negative PD-L1 expression based on the 22C3-IHC assay were associated with poor DMFS (HR 2.867, 95% CI 1.247–6.589, P = 0.0131, log-rank test). (b, c) DMFS did not significantly differ based on SP142- and SP263-IHC assay results.

In the univariate Cox proportional hazard model, age and positive PD-L1 expression based on the 22C3-IHC assay were found to be significant prognostic factors for RFS (Table 2; HR 0.207, 95% CI 0.050–0.855, P = 0.0295). However, PD-L1 expression based on SP142- or SP263-IHC assays was not a significant factor for RFS (Table 2). In the multivariate analysis, PD-L1 expression based on the 22C3-IHC assay was confirmed as a significant predictor of RFS (Table 2; HR 0.206, 95% CI 0.050–0.853, P = 0.0293).

Table 2 Cox proportional hazard analysis for recurrence-free survival (RFS).

In the univariate analyses of DMFS, only PD-L1 expression based on the 22C3-IHC assay was a significant factor of favorable DMFS (Supplementary Table 1; HR 0.122, 95% CI 0.017–0.888, P = 0.0378). Multivariate analysis confirmed that PD-L1 expression based on the 22C3-IHC assay was significantly associated with favorable DMFS (Supplementary Table 1; HR 0.121, 95% CI 0.017–0.886, P = 0.0376).

Evaluation of Cox proportional hazard model using Harrell’s c-index, net reclassification index (NRI), integrated discrimination improvement (IDI), and time-dependent areas under the curve (AUC)

To quantify the improvement in predictive ability contributed by PD-L1 expression according to each antibody, we calculated Harrell’s c-index, NRI, and IDI for the multivariate models. Adding PD-L1 expression based on the 22C3-IHC assay to the null model significantly increased the c-index for both RFS (model 1 in Table 3; c-index 0.626, 95% CI 0.536–0.689, P = 0.0001) and DMFS (model 1 in Supplementary Table 2; c-index 0.633, 95% CI 0.574–0.692, P < 0.0001). This addition also improved the discrimination of RFS as measured by NRI and IDI (model 1 in Table 3; P = 0.0008 and P = 0.04, respectively), and the discrimination of DMFS as measured by NRI (model 1 in Supplementary Table 2, P < 0.0001) and IDI (model 1 in Supplementary Table 2, P = 0.044). However, adding PD-L1 expression data based on SP142- or SP263-IHC assays did not substantially improve the discrimination of RFS or DMFS (models 2 and 3 in Table 3 and Supplementary Table 2).

Table 3 Evaluation of multivariate Cox proportional hazard model using Harrell’s c-index, NRI, IDI, and time-dependent AUC for RFS.

In the time-dependent AUC model for RFS, adding PD-L1 expression data based on each of the three antibodies to the null model increased the discriminatory ability (Table 3). However, adding PD-L1 expression data based on the 22C3-IHC assay to the null model yielded a higher discriminatory value than adding data based on SP142- or SP263-IHC assays (Table 3, Fig. 5a,b). In the time-dependent AUC model for DMFS, adding PD-L1 expression data based on the 22C3-IHC assay to the null model likewise yielded greater discriminatory power than adding data based on SP142- or SP263-IHC assays (Supplementary Table 2 and Fig. 5c,d).

Figure 5

Comparison of improved discriminatory performance for each PD-L1 antibody using the time-dependent AUC graphs of RFS and DMFS. (a) Among the three PD-L1 antibodies, 22C3-IHC staining showed a higher AUC value in RFS. (b) Via addition to the null model in RFS, the improved AUC value based on the 22C3-IHC assay was observed to be superior among the three PD-L1 antibodies (AUC of 22C3 = 0.636, AUC of SP142 = 0.606, AUC of SP263 = 0.606). (c) Among the three PD-L1 antibodies, 22C3-IHC staining demonstrated a higher AUC value for DMFS. (d) Via addition to the null model in DMFS, the improved AUC value based on the 22C3-IHC assay was found to be superior among the three PD-L1 antibodies (AUC of 22C3 = 0.634, AUC of SP142 = 0.584, AUC of SP263 = 0.596).

Prognostic impact of PD-L1 expression in TNBC

Clinical characteristics were compared according to PD-L1 expression based on each PD-L1 antibody-IHC assay performed for the TNBC subtype, as shown in Supplementary Table 3. Positive PD-L1 expression using the SP263 antibody was associated with HG and higher Ki67, and negative PD-L1 expression using the SP142 antibody was related to lower AJCC stage. Otherwise, there was no statistical difference in the other clinicopathologic characteristics.

Survival outcomes based on PD-L1 expression according to each antibody were compared for the TNBC subtype. RFS differed significantly according to PD-L1 expression in all antibody-IHC assays (Supplementary Fig. 1a; HR 3.462, 95% CI 1.489–8.048, P = 0.0039; Supplementary Fig. 1b; HR 2.701, 95% CI 1.026–7.108, P = 0.0442; Supplementary Fig. 1c; HR 2.371, 95% CI 1.097–5.127, P = 0.0281, respectively). Additionally, decreased DMFS was observed with negative PD-L1 expression based on the 22C3- and SP263-IHC assays in the TNBC subtype (Supplementary Fig. 2a; HR 4.184, 95% CI 1.710–10.24, P = 0.0017; Supplementary Fig. 2c; HR 2.746, 95% CI 1.188–6.350, P = 0.0181, respectively), but not with the SP142-IHC assay (Supplementary Fig. 2b; HR 2.611, 95% CI 0.922–7.398, P = 0.0708).

Univariate analysis of RFS in each PD-L1 antibody was performed using the Cox proportional hazard model. In the TNBC subtype, positive PD-L1 expression based on 22C3- and SP263-IHC assays was a significant prognostic factor for RFS (Supplementary Table 4; 22C3, HR 0.095, 95% CI 0.013–0.700, P = 0.021; SP263, HR 0.406, 95% CI 0.176–0.934, P = 0.0034, respectively). However, PD-L1 expression based on the SP142-IHC assay was not a significant factor in RFS (Supplementary Table 4; P = 0.078).

In the multivariate analyses of RFS, only positive PD-L1 expression based on the 22C3-IHC assay was a prognostic factor of decreased recurrence (Supplementary Table 4; HR 0.114, 95% CI 0.015–0.848, P = 0.034), and not SP263 (Supplementary Table 4; P = 0.511).

Discussion

In an era where immune checkpoint inhibitors are being utilized for TNBC, we investigated the prognostic impact of three different PD-L1 antibodies, for which coupled immune checkpoint inhibitors were available, namely PD-L1 (SP142)-atezolizumab, PD-L1 (22C3)-pembrolizumab, and PD-L1 (SP263)-durvalumab. To evaluate the prognostic discriminatory power of each antibody, we established prediction models including a null model that did not include PD-L1 expression data for model development. PD-L1 expression based on the 22C3-IHC assay was consistently the most powerful prognostic factor. Furthermore, pre-existing prediction models showed improved discriminatory power when PD-L1 expression based on the 22C3-IHC assay was added as a parameter. In this study, we focused on the manner in which PD-L1 expression based on different antibody assays correlated with long-term survival outcome and clinicopathologic characteristics.

Comparative analyses of concordant and discordant rates for different PD-L1 antibodies have not been widely reported14,15,16. PD-L1 has been studied as a prognostic factor in breast cancer17, with positive prognostic results being consistently reported despite the use of different antibodies and scoring methods18,19,20. Recently, the IMpassion130 trial demonstrated improved outcome in advanced TNBC patients presenting with PD-L1-expressing tumors who were treated with the SP142 antibody and the immune checkpoint inhibitor atezolizumab4. In addition to the IMpassion130 trial, the KEYNOTE-522 clinical trial demonstrated an improved pathological complete response rate among early TNBC patients with PD-L1 (22C3)-expressing tumors after neoadjuvant treatment combined with pembrolizumab5.

PD-L1 IHC is usually performed after surgical resection because, owing to tumor heterogeneity, an accurate PD-L1 IHC assay requires a substantial amount of tumor tissue. Kim et al. reported relatively good agreement between small biopsy samples and surgical specimens for the three commercial PD-L1 antibodies (concordance rates of 73%–96%, 65%–80%, and 72%–91% in 26, 20, and 46 paired samples for the 22C3, SP142, and SP263 PD-L1 IHC assays, respectively)21. Additionally, several studies have reported reliable agreement between paired small tumor samples and surgical specimens22,23. In breast cancer, all three assays demonstrated good correlation for IC score, and the concordance rate was highest at a 1% cutoff value16.

This study has several limitations. First, this was a retrospective study that used samples collected before the immune checkpoint inhibitor era; therefore, we could not assess the actual effect of immune checkpoint inhibitors in PD-L1-positive patients. Furthermore, although PD-L1 expression based on the 22C3-IHC assay appeared to have the strongest prognostic power, the RFS and DMFS curves also showed separation for the two other antibodies. With a sufficiently large population, PD-L1 expression based on the SP263-IHC assay, and especially that based on the SP142-IHC assay, may demonstrate a significant prognostic effect. Interestingly, positive PD-L1 expression based on the SP142-IHC assay had no effect on RFS and DMFS, whereas PD-L1 expression based on the 22C3-IHC assay was an independent prognostic factor. These results might reflect the cohort composition, as the cohort was not restricted to patients with TNBC but comprised a mixture of molecular subtypes. In particular, the SP142-IHC assay evaluated only %IC within the tumor area, including the peri-tumoral stroma. As most TMA cores consist of intra-tumoral tissue, peri-tumoral immune cells (IC) might not have been included. Moreover, TIL are heterogeneously distributed; thus, TIL, and consequently PD-L1 expression, might have been underestimated. Conversely, PD-L1 expression with 22C3 was determined using the CPS method, in which the denominator is the total number of tumor cells and the numerator is the number of cells that stain positive for PD-L1. Because of this difference in scoring, the positivity rate with 22C3 might have differed considerably from that observed with SP142. However, this difference could not explain the improved prognosis of patients whose tumor tissues stained positive for PD-L1 with the 22C3 antibody, or the higher AUC values in the survival model that included PD-L1 expression based on the 22C3-IHC assay.

In conclusion, our findings may indicate that PD-L1 expression based on the 22C3-IHC assay is a more powerful discriminatory marker than that based on SP142- and SP263-IHC assays in breast cancer. Our findings warrant additional validation using large-scale studies.

Methods

Patients

This retrospective study used tumor tissues collected from 316 patients who underwent primary curative surgery for breast cancer between September 1999 and June 2015 at the Gangnam Severance Hospital in Seoul, Korea. All patients were treated according to standard protocols. The following data were recorded: age at surgery, tumor size, lymph node status, HG, ER status, PR status, human epidermal growth factor receptor-2 (HER2) status, LVI, Ki67 labeling index, PD-L1 status according to each antibody, treatment modalities, and survival outcomes. Tumor HG was determined using the modified Scarff–Bloom–Richardson grading system. Anatomical tumor-node-metastasis (TNM) classification was based on the TNM staging system of the American Joint Committee on Cancer, 8th edition.

All procedures were performed in accordance with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. The study protocol was approved by the institutional review board (IRB) of the Gangnam Severance Hospital (local IRB No. 3–2018-0067). The need for informed consent was waived under the approval of the IRB due to the retrospective design of the study.

TMA construction, IHC staining, and interpretation of results

On hematoxylin and eosin-stained slides of tumors, a representative area was selected, and the corresponding spot was marked on the surface of the paraffin-embedded block. Using a hollow needle, the selected area was punched out and the resulting 2-mm tissue core was placed in a 10 × 5 recipient block. Each separate tissue core was assigned a unique TMA location number that was linked to a database including other clinicopathologic data.

As previously described24, 3 µm-thick tissue sections were cut from formalin-fixed paraffin-embedded TMA blocks. After deparaffinization with xylene and rehydration with graded alcohol solutions, IHC was performed using the Ventana Discovery XT Automated Slide Stainer (Ventana Medical System, Tucson, AZ, USA). Cell Conditioning 1 buffer (citrate buffer, pH 6.0; Ventana Medical System) was used for antigen retrieval. The sections were incubated with primary antibodies against estrogen receptor (ER; 1:150; clone 6F11; Novocastra Laboratories, Ltd., Newcastle upon Tyne, UK), progesterone receptor (PR; 1:100; clone 16; Novocastra Laboratories Ltd.), HER2 (1:1500; polyclonal; DAKO, Glostrup, Denmark), PD-L1 (prediluted; clone SP142; Ventana Medical System), PD-L1 (prediluted; clone SP263; Ventana Medical System), and PD-L1 (1:50; clone 22C3; DAKO). Appropriate positive and negative controls were included.

Molecular subtyping

Nuclear staining values ≥ 1% were considered indicative of ER and PR positivity25. HER2 staining was interpreted based on the 2018 American Society of Clinical Oncology/College of American Pathologists guidelines26. Only samples with strong and circumferential membranous HER2 immunoreactivity (3+) were considered positive, whereas those with 0 and 1+ HER2 staining were considered negative. Cases with equivocal HER2 expression (2+) were further evaluated for HER2 gene amplification via silver in situ hybridization (SISH). Breast cancer subcategorization was based on the results of IHC staining for ER, PR, and HER2, as well as the SISH results for HER2. The specimens were categorized as follows: (i) Luminal/HER2-negative (ER- and/or PR-positive and HER2-negative); (ii) HER2-positive (HER2-positive regardless of the ER and PR statuses); (iii) TNBC (ER-, PR-, and HER2-negative).
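The subtyping rules above can be written as a small decision function. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and it assumes that an equivocal (2+) case is called HER2-positive when SISH shows amplification, which the text implies but does not state explicitly.

```python
def her2_positive(ihc_score, sish_amplified=None):
    """HER2 status from the IHC score (0-3), with SISH resolving equivocal 2+ cases."""
    if ihc_score == 3:
        return True
    if ihc_score in (0, 1):
        return False
    if ihc_score == 2:
        if sish_amplified is None:
            raise ValueError("A 2+ case needs a SISH result")
        return bool(sish_amplified)  # assumption: amplified means HER2-positive
    raise ValueError("The IHC score must be 0, 1, 2, or 3")


def molecular_subtype(er_percent, pr_percent, ihc_score, sish_amplified=None):
    """Assign one of the three subtypes defined in the text."""
    if her2_positive(ihc_score, sish_amplified):
        return "HER2-positive"              # regardless of ER and PR status
    if er_percent >= 1 or pr_percent >= 1:  # nuclear staining >= 1% counts as positive
        return "Luminal/HER2-negative"
    return "TNBC"
```

For example, `molecular_subtype(0, 0, 2, sish_amplified=False)` returns "TNBC", whereas the same case with amplification returns "HER2-positive".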

Interpretation of PD-L1 immunohistochemistry results

To evaluate PD-L1 expression based on the 22C3-IHC assay, the CPS was calculated by dividing the number of PD-L1-stained cells (tumor cells, lymphocytes, and macrophages) by the total number of viable tumor cells and multiplying the quotient by 100. CPS ≥ 1 was considered positive.

$$\mathrm{CPS}=\frac{\text{Number of PD-L1-staining cells (tumor cells, lymphocytes, macrophages)}\times 100}{\text{Total number of viable tumor cells}}$$
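The CPS definition translates directly into code. The Python sketch below assumes that the cell counts are already available (in practice they are estimated by a pathologist); the function names and the counts in the usage note are hypothetical.

```python
def combined_positive_score(pdl1_tumor_cells, pdl1_lymphocytes, pdl1_macrophages,
                            viable_tumor_cells):
    """CPS = PD-L1-staining cells (tumor cells, lymphocytes, macrophages) x 100
    divided by the total number of viable tumor cells."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    stained = pdl1_tumor_cells + pdl1_lymphocytes + pdl1_macrophages
    return 100 * stained / viable_tumor_cells


def is_cps_positive(cps, cutoff=1.0):
    """Positivity at the CPS >= 1 threshold used for the 22C3 assay in this study."""
    return cps >= cutoff
```

With hypothetical counts of 3 stained tumor cells, 2 stained lymphocytes, and 1 stained macrophage among 200 viable tumor cells, the CPS is 3.0, which is positive at the CPS ≥ 1 cutoff.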

For PD-L1 expression based on the SP142-IHC assay, the intensity of tumor-infiltrating IC staining was examined. Immune cells present in the intra-tumoral and contiguous peri-tumoral stroma, including lymphocytes, macrophages, dendritic cells, and granulocytes, were evaluated. The % IC was determined as the proportion of tumor area exhibiting PD-L1 staining of any intensity. For PD-L1 expression based on the SP263-IHC assay, staining of immune cells at any intensity was considered positive, and the overall percentage of staining was visually estimated to determine the PD-L1 expression level.

$$\%\,\mathrm{IC}=\frac{\text{Area of PD-L1-positive immune cells}\times 100}{\text{Tumor area}}$$
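As a sketch of the % IC calculation, the Python snippet below treats both areas as measured quantities in the same (arbitrary) unit. This is an assumption made for illustration, since the score is estimated visually in practice. The ≥ 1% cutoff is the SP142 threshold reported in the Results; no SP263 cutoff is encoded because none is stated in this section.

```python
def percent_ic(pdl1_immune_cell_area, tumor_area):
    """% IC: tumor area occupied by PD-L1-staining immune cells, as a percentage."""
    if tumor_area <= 0:
        raise ValueError("tumor_area must be positive")
    return 100 * pdl1_immune_cell_area / tumor_area


def is_sp142_positive(ic_percent, cutoff=1.0):
    """SP142 positivity at the >= 1% IC threshold reported in the Results."""
    return ic_percent >= cutoff
```

For hypothetical areas of 3 and 200 units, `percent_ic(3, 200)` gives 1.5, which would be scored as SP142-positive.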

Representative images are shown in Fig. 1.

Statistical analysis

Recurrence-free survival (RFS) was defined as the period between the primary surgery and any case of recurrence (loco-regional and/or distant metastasis) of breast cancer, death occurring due to any cause, or event of the last follow-up. DMFS was defined as the period between the primary curative surgery and diagnosis of breast cancer-derived distant metastasis, death occurring due to any cause, or event of the last follow-up. The data of patients who did not exhibit relevant events were censored at the completion of follow-up.
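The endpoint definition can be sketched as a function that returns a follow-up time and an event indicator for each patient. The field names, the dates in the usage note, and the conversion of 30.4375 days per month are assumptions made for illustration; they are not taken from the study database.

```python
from datetime import date

DAYS_PER_MONTH = 30.4375  # assumed average month length


def months_between(start, end):
    """Elapsed time between two dates, in months."""
    return (end - start).days / DAYS_PER_MONTH


def rfs_record(surgery, last_follow_up, recurrence=None, death=None):
    """Return (months, event) for RFS.

    The event is the earlier of recurrence (loco-regional and/or distant) and
    death from any cause; patients without an event are censored at last follow-up.
    """
    event_dates = [d for d in (recurrence, death) if d is not None]
    if event_dates:
        return months_between(surgery, min(event_dates)), True
    return months_between(surgery, last_follow_up), False
```

A patient operated on 1 January 2010 with recurrence on 1 January 2012 contributes an event at roughly 24 months; the same patient without recurrence or death and with last follow-up on 1 January 2015 is censored at roughly 60 months. DMFS could be handled in the same way by passing the date of distant metastasis instead of any recurrence.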

Clinical characteristics were compared between PD-L1-negative and -positive groups assessed using each antibody. Continuous variables were compared between the two groups using Student’s t-test or the Mann–Whitney test. Categorical variables were compared using the Chi-square test or Fisher’s exact test. Survival curves were generated using the Kaplan–Meier method and compared between the two groups using the log-rank test. Cox proportional hazard models were used to identify factors associated with survival outcome (RFS and DMFS). We applied the backward likelihood method (significance level of 0.10 for entering effects and 0.05 for removing effects) in the Cox proportional hazard models.
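For readers unfamiliar with these methods, the Kaplan–Meier estimator and the two-group log-rank test can be sketched in plain Python as follows. This is a didactic implementation under simple assumptions (two groups, one degree of freedom), not the SPSS procedure used in the study, and the function names are illustrative.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk  # product-limit step
            curve.append((t, survival))
        at_risk -= n_leaving                     # events and censored both leave
    return curve


def logrank_chi2(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    event_times = sorted({t for t, e in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return (observed_a - expected_a) ** 2 / variance


def chi2_p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))
```

In this sketch, `logrank_chi2` raises a division error if the variance is zero (for example, when no event time has both groups at risk), so a production analysis would rely on a validated statistics package.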

Time-dependent receiver operating characteristic curves were generated to identify the PD-L1 antibody that contributed most to the prognostic value. Furthermore, to investigate the additional prognostic power of PD-L1 expression based on the IHC assays performed using the three antibodies, we calculated Harrell’s c-index for each Cox proportional hazard model27. The c-index measures concordance for time-to-event data, with values increasing from 0.5 to 1.0 indicating improved prediction. For model comparison, the bootstrapping method was used with 1,000 resamples. We also assessed the NRI and IDI to evaluate the improvement in discriminatory ability contributed by PD-L1 expression using each of the three antibodies when data were added to the survival model28. Significant improvement was recognized in the prediction model when NRI > 0 and IDI > 0.
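The three discrimination measures can be illustrated with the following Python sketch. The c-index function follows the usual pairwise definition for right-censored data. The NRI and IDI functions use the simpler binary-outcome (category-free) forms as an assumption; the survival-time versions and the bootstrap comparison used in the study are not reproduced, and all names are illustrative.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher risk score failed earlier.

    A pair is comparable when the subject with the shorter time had an event;
    ties in risk score count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


def _mean(values):
    return sum(values) / len(values)


def idi(old_probs, new_probs, events):
    """IDI: mean gain in predicted risk among events minus the gain among non-events."""
    rows = list(zip(old_probs, new_probs, events))
    gain_events = _mean([new - old for old, new, e in rows if e])
    gain_nonevents = _mean([new - old for old, new, e in rows if not e])
    return gain_events - gain_nonevents


def continuous_nri(old_probs, new_probs, events):
    """Category-free NRI: net upward moves among events plus net downward moves among non-events."""
    rows = list(zip(old_probs, new_probs, events))
    ev = [new - old for old, new, e in rows if e]
    non = [new - old for old, new, e in rows if not e]
    nri_events = (sum(d > 0 for d in ev) - sum(d < 0 for d in ev)) / len(ev)
    nri_nonevents = (sum(d < 0 for d in non) - sum(d > 0 for d in non)) / len(non)
    return nri_events + nri_nonevents
```

A model whose risk scores rank every comparable pair correctly has a c-index of 1.0, and constant scores give 0.5. Positive NRI and IDI values indicate that the extended model moves predicted risks in the right direction for events and non-events, respectively.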

Statistical analysis was performed using the IBM SPSS Statistics version 24 software (IBM Corp., Armonk, NY, USA). The threshold for statistical significance was set at P < 0.05, with a 95% confidence interval (CI) not including 1.

Consent for publication

All authors have provided consent for publication.

Ethics approval and consent to participate

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and the 1964 Helsinki Declaration. The protocol was approved by the institutional review board (local IRB No. 3–2018-0067) of Gangnam Severance Hospital. The need for informed consent was waived under the approval of the IRB due to the retrospective design.