Comparison of three scoring methods using the FDA-approved 22C3 immunohistochemistry assay to evaluate PD-L1 expression in breast cancer and their association with clinicopathologic factors

In the evaluation of PD-L1 expression to select patients for anti-PD-1/PD-L1 treatment, uniform guidelines that account for different immunohistochemistry assays, different cell types and different cutoff values across tumor types are lacking. Data on how different scoring methods compare in breast cancer are scant. Using the FDA-approved 22C3 diagnostic immunohistochemistry assay, we retrospectively evaluated PD-L1 expression in 496 primary invasive breast tumors that were not exposed to anti-PD-1/PD-L1 treatment and compared three scoring methods (TC: invasive tumor cells; IC: tumor-infiltrating immune cells; TCIC: a combination of tumor cells and immune cells) in expression frequency and association with clinicopathologic factors. In the entire cohort, positive PD-L1 expression was observed in 20% of patients by TCIC, 16% by IC, and 10% by TC, with a concordance of 87% between the three methods. In the triple-negative breast cancer patients, positive PD-L1 expression was observed in 35% by TCIC, 31% by IC, and 16% by TC, with a concordance of 76%. Associations between PD-L1 and clinicopathologic factors were investigated according to receptor groups and whether the patients had received neoadjuvant chemotherapy. The three scoring methods showed differences in their associations with clinicopathologic factors in all subgroups studied. Positive PD-L1 expression by IC was significantly associated with worse overall survival in patients with neoadjuvant chemotherapy and showed a trend for worse overall survival and distant metastasis-free survival in triple-negative patients with neoadjuvant chemotherapy. Positive PD-L1 expression by TCIC and TC also showed trends for worse survival in different subgroups. Our findings indicate that the three scoring methods with a 1% cutoff are different in their sensitivity for PD-L1 expression and their associations with clinicopathologic factors.
Scoring by TCIC is the most sensitive way to identify PD-L1-positive breast cancer by immunohistochemistry. As a prognostic marker, our study suggests that PD-L1 is associated with worse clinical outcome, most often shown by the IC score; however, the other scores may also have clinical implications in some subgroups. Large clinical trials are needed to test the similarities and differences of these scoring methods for their predictive values in anti-PD-1/PD-L1 therapy.


Background
Over the past decade, monoclonal antibody-based immune checkpoint inhibitors targeting programmed death-1 (PD-1) and its ligand, programmed death-ligand 1 (PD-L1), have been developed and approved by the U.S. Food and Drug Administration (FDA) for the treatment of solid tumors such as non-small cell lung cancer, melanoma, urothelial carcinoma, and head and neck squamous cell carcinoma [1][2][3][4][5]. The FDA has also approved several diagnostic immunohistochemistry (IHC) assays corresponding to these drugs to detect PD-L1 expression and inform the selection of patients for treatment [6][7][8][9][10][11]. However, when these different immune checkpoint inhibitors are used for the same tumor type, the corresponding IHC assays may be scored differently.
Taking urothelial carcinoma as an example, when the 22C3 Dako PharmDx IHC assay is used, a Combined Positive Score, which factors in expression in both tumor cells and tumor-infiltrating immune cells, is calculated, and a score of ≥ 10 is considered positive. On the other hand, the 28-8 Dako PharmDx IHC assay scores the expression in the urothelial tumor cells only, with ≥ 1% as the cutoff for positivity. The Ventana SP142 assay, in contrast, measures PD-L1 expression in the tumor-infiltrating immune cells only, with ≥ 5% staining considered positive.
The scoring of PD-L1 expression also varies in different tumor types when the same assay is used. For example, with the 22C3 Dako PharmDx IHC assay, expression in ≥ 1% of tumor cells is considered positive for non-small cell lung cancer, a Combined Positive Score of ≥ 10 is considered positive for urothelial carcinoma, and a Combined Positive Score of ≥ 1 is considered positive for gastric adenocarcinoma and cervical cancer. Overall, there is a lack of uniform guidelines in the evaluation of PD-L1 that account for different IHC assays, different cell types, and different cutoff values across tumor types.
While breast cancer is not a robustly immunogenic tumor type overall, certain subtypes, largely the estrogen receptor-negative tumors, have more abundant immune cell infiltration, representing opportunities for immune checkpoint inhibitors. Emerging clinical trials testing the utility of PD-1/PD-L1 inhibitors have brought promise to the treatment of triple-negative breast cancer (TNBC) patients [12][13][14]. In the phase Ib KEYNOTE-012 trial, pembrolizumab, a PD-1 inhibitor, had an overall response rate of 18.5% as a single agent in patients with PD-L1-positive advanced TNBC, and the response appeared durable [12]. In cohort B of the phase II KEYNOTE-086 study, pembrolizumab monotherapy showed durable antitumor activity as first-line therapy for patients with PD-L1-positive metastatic TNBC, with an objective response rate of 21.4% [13]. In the IMpassion130 phase III trial, the PD-L1 inhibitor atezolizumab in combination with the chemotherapy drug nab-paclitaxel prolonged progression-free survival in patients with metastatic TNBC, and the survival benefit was significantly higher in PD-L1-positive than in PD-L1-negative patients [14]. These data led to the accelerated FDA approval of atezolizumab plus chemotherapy for the treatment of patients with PD-L1-positive, unresectable, locally advanced or metastatic TNBC. Trials for other subtypes of breast cancer are underway [15].
The issues experienced in the assessment of PD-L1 expression in other solid tumors are also encountered with breast cancer. Earlier studies used various commercial antibodies to detect PD-L1 prior to FDA approval of PD-L1 IHC diagnostics [16]. More recent PD-L1 expression studies using FDA-approved antibodies have applied various scoring systems [16][17][18][19][20][21][22][23]. In the clinical trials for PD-1/PD-L1 inhibitors in TNBC, positivity was defined as PD-L1 expression in the stroma or in ≥ 1% of tumor cells in the KEYNOTE-012 trial (a 22C3 clone was used before the 22C3 Dako PharmDx IHC assay was available), a Combined Positive Score of ≥ 1 with the FDA-approved 22C3 Dako PharmDx IHC assay in the KEYNOTE-086 trial, and an immune cell score of ≥ 1% with the Ventana SP142 antibody in the IMpassion130 trial [12][13][14]. Before a uniform PD-L1 detection system is agreed upon, an understanding of the staining frequencies of different PD-L1 antibodies and the differences between scoring methods and their clinicopathologic correlates is needed for meaningful comparison of data from clinical studies and for selection of patients for anti-PD-1/PD-L1 treatment. In this study, we use the 22C3 Dako PharmDx assay, which is one of the first FDA-approved assays and widely used in many clinical laboratories, to evaluate three scoring methods for PD-L1 expression frequency and their associations with clinicopathologic factors, including stromal tumor-infiltrating lymphocyte (TIL) levels, in breast cancer. The three scoring methods included a score for PD-L1 expression in the tumor cells, which is the standard approach to evaluate an IHC marker in tumor and is also used in the evaluation of PD-L1 in other solid tumors such as lung cancer; a score for PD-L1 expression in tumor-infiltrating immune cells, which was used in the IMpassion130 trial; and a combined tumor cell and immune cell score, which is equivalent to the method used in the KEYNOTE-086 trial. This is the first comprehensive evaluation of different scoring methods of PD-L1 in breast cancer.

Human breast tumor samples
This retrospective study was approved by the institutional review board of The University of Texas MD Anderson Cancer Center. Four hundred ninety-six patients diagnosed with invasive breast cancer during 2004 to 2016 and treated at our institution were included. At the last follow-up, none of the patients had received anti-PD-1/PD-L1 treatment as recorded in our clinical database. All samples were from surgical excision specimens. In patients with more than one tumor focus, only the largest tumor was included. Patient age, race/ethnicity, tumor size, histologic type, histologic grade, lymph node status, distant metastasis, pathologic stage, prognostic and predictive marker status, history of neoadjuvant chemotherapy (NACT), residual cancer burden category, and clinical follow-up data were retrospectively collected from slide review and patients' medical records. Because histologic type and grade could be altered by NACT, these parameters were recorded according to the information in the pretreatment biopsy report if the patient received NACT. The American Society of Clinical Oncology (ASCO)/College of American Pathologists (CAP) guideline recommendations [24][25][26] were used as references for categorizing estrogen receptor (ER), progesterone receptor (PR), and HER2 status as part of the routine pathologic evaluation. As a minor modification to the guideline for ER and PR, positive staining was defined as nuclear staining in at least 5% of invasive carcinoma, because tumors with low expression of ER and PR are clinically managed similarly to ER/PR negative tumors. In the current study, patients were categorized as follows based on receptor status: positive for ER and PR but negative for HER2 (ER/PR positive group); HER2 positive, regardless of ER and PR status (HER2 group); and negative for ER, PR and HER2, or triple-negative (TNBC group).
IHC for assessment of PD-L1 and stromal TIL
Tissue microarrays (TMAs) were constructed from representative archival paraffin blocks in the Pathology files of primary tumors using a 1.0-mm manual tissue arrayer (Beecher Instruments, Inc., Sun Prairie, WI). All blocks were from surgical excision specimens. Duplicate punches from different areas of the same tumor were obtained in 95% of the samples. Unstained tissue sections 4-μm thick were prepared from the TMAs, and IHC for PD-L1 was performed using the FDA-approved PD-L1 IHC 22C3 pharmDx kit (Dako North America Inc., Carpinteria, CA) on the Dako AutostainerLink 48 according to the manufacturer's instructions. Slides were counterstained with Mayer's hematoxylin. Results were evaluated with known positive and negative tissue controls. Percent PD-L1 expression in invasive tumor cells (TC) was calculated as the number of viable invasive carcinoma cells showing membranous staining of any intensity divided by the total number of viable invasive carcinoma cells. Percent PD-L1 expression in tumor-infiltrating immune cells (IC) was assessed as the proportion of tumor area occupied by PD-L1-positive immune cells of any intensity in any cell compartment. Percent PD-L1 expression in tumor-infiltrating immune cells and invasive tumor cells (TCIC) was calculated as the number of those cells showing PD-L1 staining (membranous staining for invasive tumor cells and any staining for immune cells) divided by the total number of invasive tumor cells. For each of these percentages, 1% or greater was considered positive. Of note, the TCIC percentage used in our study was equivalent to the Combined Positive Score in the KEYNOTE-086 trial [13]. For example, a TCIC of 5% was equivalent to a Combined Positive Score of 5.
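The three definitions above can be illustrated with a minimal Python sketch. The function names, parameter names, and cell counts are hypothetical and chosen only for illustration; they are not data or code from this study. Because IC is an area-based estimate made by the pathologist, it is passed in as a percentage rather than derived from cell counts.

```python
CUTOFF_PERCENT = 1.0  # a score of 1% or greater is called positive


def tc_percent(pos_tumor_cells, viable_tumor_cells):
    """TC: PD-L1-positive invasive tumor cells / all viable invasive tumor cells."""
    return 100.0 * pos_tumor_cells / viable_tumor_cells


def tcic_percent(pos_tumor_cells, pos_immune_cells, viable_tumor_cells):
    """TCIC: stained tumor cells plus stained immune cells, divided by the
    number of viable invasive tumor cells (not by tumor plus immune cells)."""
    return 100.0 * (pos_tumor_cells + pos_immune_cells) / viable_tumor_cells


def pdl1_calls(pos_tumor_cells, pos_immune_cells, viable_tumor_cells,
               ic_area_percent):
    """Return a positive/negative call for each scoring method."""
    tc = tc_percent(pos_tumor_cells, viable_tumor_cells)
    tcic = tcic_percent(pos_tumor_cells, pos_immune_cells, viable_tumor_cells)
    return {
        "TC": tc >= CUTOFF_PERCENT,
        "IC": ic_area_percent >= CUTOFF_PERCENT,  # area-based, supplied by the reader
        "TCIC": tcic >= CUTOFF_PERCENT,
    }


# Made-up case: 200 viable tumor cells, 1 stained tumor cell,
# 4 stained immune cells, and an estimated IC area of 0.5%.
example = pdl1_calls(pos_tumor_cells=1, pos_immune_cells=4,
                     viable_tumor_cells=200, ic_area_percent=0.5)
print(example)
```

In this made-up case TC is 0.5% and IC is 0.5%, so both are negative, while TCIC is 2.5% and positive. The three scores use different numerators and denominators, so TCIC is not simply the sum of TC and IC.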
On the whole slide sections from which the TMAs were generated, stromal TILs (sTILs) were evaluated as the area of the tumor stroma occupied by mononuclear inflammatory cells divided by the total tumor stromal area according to the International TILs Working Group guidelines [27,28]. Although sTIL evaluation for the current study was conducted prior to the publication of recommendations for post-NACT TILs by the Group on breast cancer [29], the same principles were applied in this study, including assessment of sTIL within the borders of the residual tumor bed as defined by the Residual Cancer Burden [30]. For correlative analyses, ≥ 5%, ≥ 10%, and ≥ 20% were first used as cutoffs for sTILs in the current study, and associations between sTILs and clinicopathologic factors were found similar between these three cutoffs; therefore, data using only the 10% cutoff are presented below.
PD-L1 expression was evaluated by three breast pathologists (HG, QD, and LH). sTILs were evaluated by HG and LH. Difficult and discrepant cases were resolved by discussion and joint review at a multi-headed microscope by at least two pathologists.

Statistical analysis
Statistical analysis was carried out using SAS 9.3 for Windows (SAS Institute Inc.) and SPSS Statistics 23.0 (IBM). Associations of PD-L1 staining and sTIL levels with clinicopathologic factors were assessed using the Fisher exact test. Multivariate analysis was performed using logistic regression or exact logistic regression, depending on the sample size, and included all clinicopathologic factors with a p value of 0.05 or less from the Fisher exact tests. Factors with a p value of 0.05 or less in the multivariate model were presented in this article. Overall survival was defined as the time from the initial breast cancer diagnosis until death from any cause or date of last follow-up. Distant metastasis-free survival was calculated as the duration between the initial breast cancer diagnosis and the time of distant metastasis. Recurrence-free survival was calculated as the duration between the initial breast cancer diagnosis and the time of either local regional recurrence or distant metastasis. Survival endpoints were estimated and plotted using the Kaplan-Meier method. Survival was compared between patient groups categorized by PD-L1 status and sTIL levels using the log-rank test. All tests were two-sided, and p values of 0.05 or less were considered statistically significant. For survival analysis, any p value between 0.05 and 0.08 was considered a trend.
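For readers who want to see the mechanics of two of the methods named above, the following is a minimal standard-library Python sketch of a two-sided Fisher exact test on a 2 × 2 table and of the Kaplan-Meier estimator. It is not the SAS or SPSS code used in this study, and the function names and example numbers are assumptions made only for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))  # tolerance for float ties
    return min(1.0, total)


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the endpoint occurred at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk, surv, curve, i = len(data), 1.0, [], 0
    while i < len(data):
        t, deaths, removed = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


# Made-up inputs, for illustration only.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
curve = kaplan_meier(times=[6, 12, 12, 20, 30], events=[1, 0, 1, 1, 0])
print(round(p_value, 4), [(t, round(s, 3)) for t, s in curve])
```

Comparing two Kaplan-Meier curves with the log-rank test, as done in this study, needs an additional step that is not shown here.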

Results
Comparison of the three PD-L1 scoring methods
Among the 496 patients, TCIC, TC, and IC scores for the primary breast tumors could be assessed in 470 patients for comparison. In the entire cohort, positive PD-L1 expression was observed in 20% of patients by TCIC, 16% by IC, and 10% by TC (Fig. 1a, b). Comparison of the three methods showed that in 87% (408/470) of patients, the staining results (positive or negative) were concordant between all scoring methods, including 7% that were positive and 80% that were negative for PD-L1. In the TNBC group (n = 93), positive PD-L1 expression was observed in 35% of patients by TCIC, 31% by IC, and 16% by TC (Fig. 1c, d). Concordance (positive or negative) between the three scoring methods was reached in 76% (71/93) of patients, including 11.8% that were positive and 64.5% that were negative for PD-L1. The discordance was due largely to differences between TC and the other two methods; concordance (positive or negative) between TCIC and IC was 96% in both the entire cohort and the TNBC group. Representative images of the staining results are shown in Fig. 2.
Among the 496 patients included in the study, 349 patients had no NACT, and 147 patients had NACT at the time of surgical excision. The associations between PD-L1 expression and clinicopathologic factors in the subgroups of patients without NACT and with NACT are summarized in Tables 1 and 2. In the subgroup without NACT, histologic grade (grade 3), sTIL level (≥ 10%), and PR status (negative) were significantly associated with positive PD-L1 staining by all three scoring methods, while race/ethnicity (black), ER status (negative), HER2 status (positive), TNBC status (yes), and receptor group (not ER/PR positive) were significantly associated with positive PD-L1 by one or two scoring methods.
In the subgroup with NACT, histologic grade (grade 3), sTIL status (≥ 10%), ER status (negative), PR status (negative), TNBC status (yes), and receptor group (not ER/PR positive) were significantly associated with positive PD-L1 staining by all scoring methods, while race/ethnicity (black) was associated with positive PD-L1 only by TC.
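The concordance figures reported earlier in this section follow from simple counting. The Python sketch below shows one way to compute three-way and pairwise concordance from per-patient positive/negative calls, and checks the reported percentages against the counts given in the text (408/470 and 71/93). The function names and the five-patient toy list are hypothetical; only the two count pairs come from this article.

```python
def three_way_concordance(calls):
    """calls: list of (tcic, ic, tc) booleans, one tuple per patient."""
    n = len(calls)
    all_pos = sum(1 for c in calls if all(c))
    all_neg = sum(1 for c in calls if not any(c))
    return {
        "all_positive": all_pos / n,
        "all_negative": all_neg / n,
        "concordant": (all_pos + all_neg) / n,
    }


def pairwise_concordance(calls, i, j):
    """Fraction of patients whose calls agree for methods at positions i and j."""
    return sum(1 for c in calls if c[i] == c[j]) / len(calls)


# Toy data (made up): order of each tuple is (TCIC, IC, TC).
toy = [(True, True, True), (True, True, False), (True, False, False),
       (False, False, False), (False, False, False)]
toy_summary = three_way_concordance(toy)
toy_tcic_vs_ic = pairwise_concordance(toy, 0, 1)

# Reported counts from the text, converted to whole percentages.
entire_cohort_pct = round(100 * 408 / 470)
tnbc_pct = round(100 * 71 / 93)
print(toy_summary, toy_tcic_vs_ic, entire_cohort_pct, tnbc_pct)
```

The toy list mirrors the pattern described above: most disagreement comes from TC being negative when TCIC and IC are positive.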
With regard to receptor status, the entire cohort included 348 patients in the ER/PR positive group, 46 patients in the HER2 group, and 99 patients in the TNBC group. The results for the ER/PR positive group are shown in Additional file 1, Tables S1 and S2. sTIL level (≥ 10%) was significantly associated with positive PD-L1 staining by all scoring methods in both the subgroup with NACT and the subgroup without NACT. Race/ethnicity (black) and histologic grade (grade 3) were associated with positive PD-L1 by at least one of the scoring methods.
In the HER2 group, histologic grade (grade 3), sTIL level (≥ 10%), ER status (negative), and PR status (negative) were significantly associated with positive PD-L1 staining by at least one of the scoring methods in the subgroup without NACT (Additional file 1, Table S3). There were too few patients in the HER2 subgroup with NACT for meaningful statistical analysis (Additional file 1, Table S4).
In the TNBC group, race/ethnicity (black) and sTIL level (≥ 10%) were significantly associated with positive PD-L1 staining by TCIC and IC scores in the subgroup without NACT (Table 3). In the subgroup with NACT, age (≥ 50 years) was the only factor associated with PD-L1 staining, by TCIC (Table 4).
Multivariate analysis was conducted in all patients with NACT, all patients without NACT, and the ER/PR positive subgroup without NACT. Other subgroups had relatively small numbers of patients. As shown in Tables 5 and 6 and Additional file 1, Table S5, sTIL (≥ 10%) retained a significant association with positive PD-L1 staining in all these subgroups by each scoring method. Black race/ethnicity (vs. white and Latino), histologic grade (grade 3), and TNBC group (vs. ER/PR positive group) were also significantly associated with positive PD-L1 staining by at least one of the scoring methods in the all-patient subgroups.

Associations between sTIL, PD-L1 status, and other clinicopathologic factors
In the entire cohort comprising all receptor groups, higher sTIL level (≥ 10%) was associated with histologic grade (grade 3), ER status (negative), PR status (negative), TNBC status (yes), and receptor group (not ER/PR positive) both in the subgroup with NACT and the subgroup without NACT (Tables 1 and 2). In addition, higher sTIL level was associated with age (< 50 years), race/ethnicity (black), histologic type (invasive ductal carcinoma), and HER2 status (positive) in the subgroup without NACT and with tumor size (smaller tumor) in the subgroup with NACT. Higher sTIL level was also associated with age (< 50 years), race/ethnicity (black), histologic type (invasive ductal carcinoma), and histologic grade (grade 3) in the ER/PR positive group without NACT, and with age (< 50 years) in the HER2 group without NACT (Additional file 1, Tables S1 and S3). Interestingly, while sTIL level was significantly associated with TNBC status in the entire cohort (Tables 1  and 2), it was not associated with any clinicopathologic factors in the TNBC group (Tables 3 and 4).
Higher sTIL level was associated with positive PD-L1 staining by all three scoring methods in the entire cohort and in the ER/PR positive group with or without NACT (Tables 1 and 2, Additional file 1, Tables S1 and S2). The direct association was also seen in the HER2 and TNBC subgroups without NACT by TCIC and IC, but not by TC (Table 3 and Additional file 1, Table S3). Since both PD-L1 and sTIL level were associated with receptor status (Tables 1 and 2), the associations between sTIL, PD-L1 expression, and receptor status were further explored. As shown in Tables 7 and 8, when patients were grouped by sTIL level (≥ 10% vs. < 10%), positive PD-L1 expression by each scoring method was significantly associated with TNBC receptor group (vs. ER/PR positive group) only in the lower sTIL subgroup with NACT, suggesting that for patients without NACT or with high stromal TIL levels, receptor status does not affect PD-L1 expression.
In the multivariate analysis, younger age (< 50 years), histologic grade (grade 3), TNBC receptor group (vs. ER/PR positive group), and a tumor size of ≤ 2 cm (vs. > 5 cm) were significantly associated with higher sTIL level in at least one of the three subgroups (all patients without NACT, all patients with NACT, and the ER/PR positive patients without NACT) tested (Tables 5 and 6 and Additional file 1, Table S5). Among these factors, histologic grade (grade 3) was the only significant factor in all of these subgroups.

Association between race/ethnicity, PD-L1 status, sTIL, and receptor status
Race/ethnicity (black) was frequently significantly associated with positive PD-L1 staining in both univariate and multivariate analyses (Tables 1, 2, 3, 5, and 6, Additional file 1, Tables S1, S2, and S5). Additional univariate analysis showed significant association between race/ethnicity and receptor status in the subgroup without NACT, indicating that black race/ethnicity was significantly associated with TNBC status (p = 0.001; data not shown). sTIL level was also associated with receptor group and race/ethnicity in the subgroup without NACT (Table 1).
To further understand the impact of race/ethnicity on PD-L1 staining, we analyzed TCIC and IC scores, which were significantly associated with race/ethnicity in the multivariate analysis (Table 5), in the black race/ethnicity subgroup without NACT in a multivariate model including sTIL, receptor group, and histologic grade. Of these factors, only higher sTIL level (≥ 10%) was independently associated with positive PD-L1 staining (p = 0.0003, odds ratio 27, 95% CI 4.57-159.67, for both TCIC and IC).
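As a reading aid for the interval reported above, the short Python check below assumes a Wald-type confidence interval, which is symmetric around the estimate on the log-odds scale. The article does not state which interval type was used, so this is an assumption. Under it, the geometric mean of the two bounds should recover the odds ratio, and the width of the interval implies the standard error of the log odds ratio. The variable names are illustrative.

```python
from math import log, sqrt

# Values reported in the text.
reported_or = 27.0
ci_lower, ci_upper = 4.57, 159.67

# Under a Wald interval: log(OR) +/- 1.96 * SE, so the bounds are
# symmetric around log(OR) and their geometric mean equals the OR.
implied_or = sqrt(ci_lower * ci_upper)
implied_se = (log(ci_upper) - log(ci_lower)) / (2 * 1.96)

print(round(implied_or, 2), round(implied_se, 2))
```

The implied odds ratio agrees with the reported value of 27 to within rounding, so the reported interval is consistent with this form. The wide interval reflects the small size of the subgroup analyzed.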

Association of PD-L1 and sTIL with prognosis
Overall survival, recurrence-free survival, and distant metastasis-free survival were evaluated according to PD-L1 expression and sTIL level for the 495 patients for whom follow-up data were available. Follow-up times ranged from 3 months to 154 months (median follow-up, 48 months). As shown in Fig. 3, in the entire cohort, positive PD-L1 staining by IC was significantly associated with worse overall survival in the subgroup with NACT (p = 0.021; Fig. 3a). In the same subgroup, positive PD-L1 staining by TCIC showed a trend for worse overall survival (p = 0.064; Fig. 3b).
In the TNBC group, positive PD-L1 staining by IC showed a trend for worse overall survival (p = 0.055; Fig. 4a) and worse distant metastasis-free survival (p = 0.073; Fig. 4b) in the subgroup with NACT. In the ER/PR positive group and HER2 group, no significant association was seen between PD-L1 expression and survival.
A trend for better recurrence-free survival was observed for higher sTIL level in the TNBC group without NACT (p = 0.076, Additional file 1, Fig. S1). The association of PD-L1 staining with survival in sTIL subgroups was also investigated. In TNBC patients without NACT and with higher sTIL (≥ 10%), positive PD-L1 staining by TC showed a trend for worse distant metastasis-free survival (p = 0.056; Fig. 4c). In TNBC patients with NACT and higher sTIL, positive PD-L1 staining by IC showed a trend for worse distant metastasis-free survival (p = 0.053; Fig. 4d). No significant association or trend was observed between PD-L1 and survival in TNBC patients with lower sTIL levels, or in the entire cohort or other receptor subgroups when patients were grouped by sTIL level.

PD-L1 expression using TCIC 10% cutoff and association with prognosis
While the manuscript of our study was being reviewed, a press release regarding the KEYNOTE-355 trial announced that pembrolizumab plus chemotherapy significantly improved progression-free survival compared to chemotherapy alone in patients with metastatic TNBC whose tumors expressed PD-L1 with a Combined Positive Score of ≥ 10 (https://bit.ly/2HtT4rj; unpublished data). It is possible that in the near future, this new cutoff will be applied in clinical settings. Therefore, it was of interest to examine the expression of PD-L1 in our cohort using a cutoff of ≥ 10% by TCIC, which would be equivalent to a Combined Positive Score of ≥ 10. In the entire cohort, positive PD-L1 expression was seen in 10% of patients (47/471), including 11% (36/340) of those without NACT and 8% (11/131) of those with NACT. In the TNBC group, positive PD-L1 expression was found in 19% (18/93) of patients, including 19% (11/59) of those without NACT and 21% (7/34) of those with NACT. Overall survival, recurrence-free survival, and distant metastasis-free survival were also evaluated in the entire cohort and in the TNBC patients using the TCIC ≥ 10% cutoff. No significant association or trend was observed between PD-L1 expression and prognosis in those subgroups with or without NACT (p > 0.08).
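The positive rates at the TCIC ≥ 10% cutoff can be re-derived from the counts given in the paragraph above. The following Python sketch does this and also confirms that the subgroups with and without NACT add up to the stated totals. The dictionary layout and names are illustrative; the counts are those reported in the text.

```python
# (positive, total) counts at the TCIC >= 10% cutoff, as reported in the text.
counts = {
    "entire cohort": (47, 471),
    "entire cohort, no NACT": (36, 340),
    "entire cohort, NACT": (11, 131),
    "TNBC": (18, 93),
    "TNBC, no NACT": (11, 59),
    "TNBC, NACT": (7, 34),
}

# Positive rate for each group, rounded to a whole percentage.
rates = {name: round(100 * pos / total) for name, (pos, total) in counts.items()}


def subgroups_add_up(whole, part_a, part_b):
    """Check that two subgroups sum to the parent group in both columns."""
    return all(counts[part_a][k] + counts[part_b][k] == counts[whole][k]
               for k in (0, 1))


cohort_ok = subgroups_add_up("entire cohort", "entire cohort, no NACT",
                             "entire cohort, NACT")
tnbc_ok = subgroups_add_up("TNBC", "TNBC, no NACT", "TNBC, NACT")
print(rates, cohort_ok, tnbc_ok)
```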

Discussion
Previous studies have demonstrated that PD-L1 expression in breast cancer is positively associated with high TIL levels and with the presence of poor prognostic factors such as high histologic grade, negative ER and PR status, positive HER2 status, and TNBC status [5,16,31]. Similar associations were observed in our study by using different scoring methods. In the entire cohort, PD-L1 positivity by each scoring method was associated with higher nuclear grade, higher sTIL level, and negative PR status both with and without NACT in the univariate analysis; however, the scoring methods showed differences for other clinicopathologic factors. For example, in the subgroup without NACT, HER2 status was associated with only the TCIC score, and TNBC status was associated with only the IC score.
In the multivariate analysis, sTIL remained significantly associated with PD-L1 positivity by all scoring methods in the entire cohort; however, race/ethnicity was significantly associated with TCIC and IC, but not TC, in the subgroup without NACT, and histologic grade was significantly associated with TCIC and IC, but not TC, in the subgroup with NACT (Tables 5 and 6). Interestingly, even though black race/ethnicity was associated with TNBC in the entire cohort without NACT, supporting previous studies [32,33], when PD-L1 expression by TCIC and IC in the black race/ethnicity subgroup without NACT was investigated in a multivariate model, only sTIL level, not receptor status, was  independently associated with TCIC and IC, suggesting that in black patients, sTIL level is a stronger predictor than receptor status for PD-L1 expression. It may appear that TCIC and IC scores had similar associations with clinicopathologic factors across various subgroups; however, in the ER/PR positive subgroup without NACT, the same factors were significantly associated with TC and TCIC, while race/ethnicity was significantly associated with only IC in the multivariate analysis (Additional file 1, Table S5). Thus, in the same subgroup of patients, PD-L1 positivity may be associated with different clinicopathologic factors depending on the scoring method. In our study, PD-L1 positivity by IC was significantly associated with worse overall survival in all patients with NACT, whereas TCIC showed a similar trend. In the TNBC subgroup with NACT, PD-L1 positivity by IC showed trends for worse overall survival and distant metastasis-free survival. These results are consistent with a previous report that demonstrated PD-L1 as a poor prognostic factor in post NACT residual TNBC [34].  
The prognostic value of PD-L1 expression by IHC in breast cancer has been inconsistent across previous studies, partially owing to technical issues related to different antibody clones, cutoff points, and scoring systems. While some studies demonstrated a direct correlation between PD-L1 expression and clinical outcome, others identified PD-L1 as an indicator for worse survival, or no association was found [16,17,19,20,22,23,31]. Some of these studies were performed before the FDA approval of PD-L1 IHC diagnostics, indicating that these discrepancies may be attributed in part to a difference in PD-L1 detection antibodies. But even in studies using FDA-approved, commercially standardized clones, the prognostic value of PD-L1 still was not consistent (Table 9). Of note, the scoring systems in those studies varied in whether tumor cells and/or immune cells were assessed, suggesting that both different clones and different scoring systems contributed to the differing conclusions. Furthermore, although the TC scores in those studies were presumably comparable, the IC scores were often not clearly defined [19,21]. Importantly, the IC score in the current study, which was adopted from the IMpassion130 trial [14], is the proportion of tumor area occupied by PD-L1-positive immune cells, different from the typical IHC interpretation where the denominator is the total number of cells. Also, because the TCIC score used in our study (equivalent to the Combined Positive Score in the KEYNOTE-086 trial [13]) measures the number of PD-L1 staining immune cells and invasive tumor cells divided by the total number of invasive tumor cells, the TCIC score does not represent the sum of TC and IC.
In the assessment of PD-L1 expression in solid tumors, the cutoff value, like the IHC antibody and scoring method, is another variable that may lead to divergent results. There were several reasons for selecting 1% as the cutoff point in our study in order to compare the three scoring methods. A meta-analysis of 20 clinical trials of anti-PD-1/PD-L1 therapy in melanoma, non-small cell lung cancer, and renal cell carcinoma patients noted that 1% was among the most frequently used cutoff values for PD-L1 [35]. One percent was also used as the cutoff in the majority of recent studies of PD-L1 expression in breast cancer using FDA-approved antibodies [16][17][18][19][20][21][22][23] (Table 9). In recently published TNBC clinical trials with pembrolizumab and atezolizumab, a Combined Positive Score of 1 (the equivalent of 1% TCIC in our study) and a 1% IC score, respectively, were the cutoffs used to evaluate PD-L1 expression, and the latter was included as the cutoff in selecting patients in the FDA approval of the drug [13,14]. With ≥ 1% considered positive in our study, in the TNBC group, the positive rate for PD-L1 expression was 31% by IC, somewhat lower than the 41% rate of positivity reported in the IMpassion130 trial [14]. Differences in cohort selection may have contributed to this difference. In our cohort, all patients had surgical resection of the primary tumor, and most did not have metastasis, whereas the IMpassion130 trial focused on metastatic or unresectable locally advanced TNBC. Based on our finding of PD-L1 being an indicator for worse prognosis, it is reasonable to postulate that the IMpassion130 trial may have selected patients who were more likely to have PD-L1 expression. Also, it has been shown that PD-L1 expression in breast cancer is focal [36]; therefore, the use of TMA in our study may partially explain the lower positivity than that reported in the IMpassion130 trial. In addition, the difference may be due to our small sample size, interobserver variation, or differences in the antibodies.
Our results on sTILs showed a trend for better recurrence-free survival with higher sTIL level (≥ 10%) in the TNBC group without NACT. This result is consistent with findings by others showing that higher TIL is a good prognostic indicator [37,38]. Because PD-L1 was associated with sTILs in many subgroups in our study, and because PD-L1 and sTILs had significant associations or trends with survival, the prognostic role of PD-L1 in subgroups according to sTIL level was further examined. In the subgroups with high sTIL levels, positive PD-L1 by TC showed a trend for worse distant metastasis-free survival in the TNBC group without NACT (Fig. 4c), and positive PD-L1 by IC showed a trend for worse distant metastasis-free survival in the TNBC group with NACT (Fig. 4d). There was no association or trend found in the subgroups with low sTIL levels. The trends in the post-NACT TNBC subgroup with or without further grouping by sTIL level (Figs. 3 and 4) are particularly interesting. It is well known that among TNBC patients who do not experience pathologic complete response, not all patients have relapse. It has also been shown that in TNBC patients, the presence of high TIL levels in residual disease after NACT is associated with better prognosis [39,40]. Although our sample size was small and further investigation is necessary to confirm our findings, our results suggest that PD-L1 could serve as a marker to select TNBC patients for further treatment after NACT, and it may further stratify those with high sTILs in residual tumors in terms of prognosis.
The urgent clinical need for effective immunotherapy in treating breast cancer, especially TNBC, is mirrored by rapid new advances in this area. Studies such as that presented by Rugo et al. at the 2019 European Society for Medical Oncology (ESMO) Congress have compared PD-L1 expression in TNBC between different PD-L1 assays using different scoring methods. In that study (abstract LBA20), using the SP142 assay with an IC 1% cutoff and the 22C3 assay with a Combined Positive Score 1 cutoff, 45% of TNBC patients had positive PD-L1 expression with both assays, 36% of patients had positive expression with 22C3 and negative expression with SP142, and 18% had negative expression with both assays. In the current study, a high concordance (96%) between IC and TCIC was reached using the 1% cutoff with the 22C3 assay, suggesting that the difference observed in the study by Rugo et al. may be due mainly to the differences between the assays. The recent press release on KEYNOTE-355 indicated the importance of cutoff values; in that study, a Combined Positive Score of ≥ 10 identified metastatic TNBC patients who had significantly improved progression-free survival on pembrolizumab plus chemotherapy compared to chemotherapy alone (https://bit.ly/2HtT4rj; unpublished data). It is not surprising that when the cutoff value was raised to TCIC 10% in our cohort, the positive rate was much lower (10% in the entire cohort and 19% in the TNBC group) compared with the positive rate when TCIC 1% was used (20% in the entire cohort and 35% in the TNBC group). Although the goal of this study was to demonstrate how each of the three scoring methods impacts PD-L1 evaluation using one assay, our other ongoing studies aim to investigate how different assays detect PD-L1 expression. It would be interesting to compare PD-L1 expression using a TCIC 10% cutoff with 22C3 and using an IC 1% cutoff with SP142.
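The inter-assay figures quoted above can be turned into an overall percent agreement for comparison with the 96% concordance between IC and TCIC. The following is a minimal sketch, assuming that concordance is read as the share of cases on which two methods give the same call; the function and variable names are illustrative and are not taken from either study.

```python
# Proportions of TNBC cases quoted above from abstract LBA20 (percent).
both_positive = 45        # SP142 IC >= 1% and 22C3 Combined Positive Score >= 1
only_22c3_positive = 36   # positive with 22C3, negative with SP142
both_negative = 18        # negative with both assays


def overall_percent_agreement(both_pos, both_neg):
    """Share of cases (percent) on which the two assays give the same call."""
    return both_pos + both_neg


agreement = overall_percent_agreement(both_positive, both_negative)
positive_by_22c3 = both_positive + only_22c3_positive

# The three quoted categories sum to 99%; the remainder is not stated in the
# text above and is left unassigned here rather than guessed.
unassigned = 100 - (both_positive + only_22c3_positive + both_negative)

print(f"Overall agreement between assays: {agreement}%")
print(f"Positive with 22C3 (score >= 1): {positive_by_22c3}%")
print(f"Not covered by the quoted categories: {unassigned}%")
```

Read this way, the two assays agree on 63% of cases, which can be set against the 96% agreement between IC and TCIC obtained with the 22C3 assay alone.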
Our study had the limitations of a retrospective study. Because many HER2-positive and TNBC patients receive neoadjuvant therapy and show pathologic complete response, selection bias inevitably influenced the makeup of the subgroups with and without NACT in our study. Also, although the study design included two different areas of the tumor to make duplicate TMA punches, the validity of using TMA to capture PD-L1 expression in breast cancer needs to be further verified. In addition, the predictive value of PD-L1 in anti-PD-1/PD-L1 therapy could not be addressed in our study since none of the patients was treated with anti-PD-1/PD-L1 agents. Furthermore, although our entire cohort was relatively large, some of the subgroups had small numbers of patients, hindering meaningful statistical analysis. Additional studies with cohorts enriched for rarer subtypes and in patient populations given anti-PD-1/PD-L1 therapy may provide useful information in these regards.
The identification of the role of PD-1/PD-L1 in tumor immune escape more than a decade ago has revolutionized immunotherapy in human tumors [41,42]. With accumulating experience, it has become conceivable that key players such as TILs, PD-1, and PD-L1 act dynamically in the process of tumor initiation and progression. Therefore, detection of PD-L1 expression by IHC in a tissue sample may provide only a glimpse of the tumor in its interaction with the immune response. Noninvasive approaches that can reflect dynamic spatial and temporal changes in PD-L1 expression in the tumor would likely guide more efficient treatment. To this end, positron emission tomography (PET) imaging studies to detect PD-L1 expression in vivo have shown promising advances [43,44]. However, before such techniques are available for clinical practice, IHC remains the best assay to evaluate PD-L1 expression in human tumors. Although most patients receiving anti-PD-1/PD-L1 therapy experience only mild toxicities, including skin rash, dysthyroidism, and gastrointestinal events, more severe immune-mediated side effects occur in some patients, and treatment-related deaths have been reported [1,2,5]. Therefore, to select patients for this treatment and avoid unnecessary adverse effects, technical issues related to PD-L1 IHC such as the one addressed in this study are of clinical importance.

Conclusions
Our findings indicate that the three scoring methods, with a 1% cutoff, are different in their sensitivity for detecting PD-L1 and their associations with clinicopathologic factors and clinical outcomes. We have shown that scoring by TCIC is the most sensitive way to identify PD-L1-positive breast cancer. In a setting where the desire is to include as many patients as possible for a clinical trial, this score may be the most useful. Alternatively, one can use the combination of the TC score and IC score to reach almost the same sensitivity, although this comparability may be dependent on the selected cutoff values of the individual scores. On the other hand, when PD-L1 expression is assessed as a prognostic marker, our study suggests that PD-L1 is associated with worse clinical outcome, most often shown by the IC score; however, the other scores may also have clinical implications in some subgroups. Beyond the population untreated with immune checkpoint inhibitors studied here, the predictive values of these scoring methods for anti-PD-1/PD-L1 therapy are deferred to large clinical trials.
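The comparison between TCIC and the combination of the TC and IC scores can be sketched as follows. This is a minimal illustration, assuming each case carries three percentage scores and that a case is positive when a score reaches the cutoff; the `Case` class, the helper functions, and all score values are hypothetical and are not data from our cohort.

```python
from dataclasses import dataclass

CUTOFF = 1.0  # percent; the cutoff applied to all three scores in this sketch


@dataclass
class Case:
    tc: float    # TC score (percent)
    ic: float    # IC score (percent)
    tcic: float  # TCIC score (percent)


def by_tc(case, cutoff=CUTOFF):
    return case.tc >= cutoff


def by_ic(case, cutoff=CUTOFF):
    return case.ic >= cutoff


def by_tcic(case, cutoff=CUTOFF):
    return case.tcic >= cutoff


def by_tc_or_ic(case, cutoff=CUTOFF):
    # Combination of the individual scores: positive if either one is positive.
    return by_tc(case, cutoff) or by_ic(case, cutoff)


def positive_rate(cases, call):
    """Fraction of cases called positive by the given scoring rule."""
    return sum(1 for c in cases if call(c)) / len(cases)


def concordance(cases, call_a, call_b):
    """Fraction of cases on which two scoring rules give the same call."""
    return sum(1 for c in cases if call_a(c) == call_b(c)) / len(cases)


# Made-up scores for illustration only.
cases = [
    Case(tc=0.0, ic=0.0, tcic=0.0),
    Case(tc=5.0, ic=0.0, tcic=5.0),
    Case(tc=0.0, ic=2.0, tcic=2.0),
    Case(tc=0.5, ic=0.5, tcic=1.0),  # below cutoff on TC and IC, at cutoff on TCIC
    Case(tc=0.0, ic=0.0, tcic=0.0),
]

for name, call in [("TC", by_tc), ("IC", by_ic), ("TCIC", by_tcic),
                   ("TC or IC", by_tc_or_ic)]:
    print(f"{name}: {positive_rate(cases, call):.0%} positive")
print(f"TCIC vs TC or IC: {concordance(cases, by_tcic, by_tc_or_ic):.0%} concordant")
```

The fourth hypothetical case shows why the two approaches may not coincide exactly: a tumor can fall below the cutoff on each individual score yet reach it on the combined score, and how often this happens depends on the cutoffs chosen.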
Additional file 1: Table S1. Association of PD-L1 staining with clinicopathologic factors in estrogen receptor/progesterone receptor positive patients without neoadjuvant chemotherapy. Table S2. Association of PD-L1 staining with clinicopathologic factors in estrogen receptor/progesterone receptor positive patients with neoadjuvant chemotherapy. Table S3. Association of PD-L1 staining with clinicopathologic factors in HER2 positive patients without neoadjuvant chemotherapy. Table S4. Association of PD-L1 staining with clinicopathologic factors in HER2 positive patients with neoadjuvant chemotherapy. Table S5. Summary of multivariate analysis in estrogen receptor/progesterone receptor positive patients without neoadjuvant chemotherapy showing the odds ratio (95% confidence interval) of variables significantly associated with PD-L1 scoring methods and sTIL level. Figure S1. Kaplan-Meier plots of recurrence-free survival between tumors with higher stromal tumor-infiltrating lymphocyte level (≥ 10%) and lower stromal tumor-infiltrating lymphocyte level in the triple negative group without neoadjuvant chemotherapy.