The Quantitative ER Immunohistochemical Analysis in Breast Cancer: Detecting the 3 + 0, 4 + 0, and 5 + 0 Allred Score Cases

Background and objectives: The immunohistochemical approach currently used to determine the estrogen receptor (ER) positivity of breast cancers (BCs) is inherently subjective and further limited by its semi-quantitative nature. Software-based analysis of digitized slide images may overcome some of these limitations. However, such an approach requires that the entire staining procedure be standardized. We aimed to establish a procedure for the photometric and morphometric analysis of BC immunohistochemical parameters that could be used for diagnostic purposes and that is in line with the current semi-quantitative scoring system. Materials and Methods: Semi-quantitative analysis of ER-stained tissue sections was performed following the Allred scoring system guidelines. The quantitative analysis was performed in ImageJ software after color deconvolution. The quantitative analysis of 66 cases of invasive lobular BC included the percent of ER-positive cells, the average nuclear coloration intensity, and the quantitative ER score. The percent of ER-positive tumor cells was counted using a standard grid overlay, while optical density (0.0–1.0) was measured within each nucleus at the grid points. Results: Statistical analysis revealed a significant positive correlation (r = 0.886, p < 0.001) between the subjective semi-quantitative and quantitative ER scores, with a large effect size (d = 3.8215). We observed strong, statistically significant correlations between the individual parameters of the total ER score (percentage of ER-positive nuclei and color intensity) obtained by the two independent methods. Conclusions: Besides excluding subjectivity, the proposed quantitative approach detected previously unreported cases of 3 + 0, 4 + 0, and 5 + 0 Allred scores; these cases were detected only by this approach.


Introduction
Around 70% of human breast cancers (BCs) express estrogen receptors (ERs) and, for diagnostic, therapeutic, and other purposes, BCs are divided into estrogen-dependent and estrogen-independent ones [1]. The ER, as a transcription factor, regulates the genetically programmed progression of the cell cycle and growth in mammary glands. Its quantitative expression, which could bear on both clinical (disease outcome) and laboratory work, has been extensively studied [2][3][4][5][6]. One of the most important prognostic and predictive features of BCs is their ER-positivity, and based on this information the

Patients
Out of all cases diagnosed with BC during a two-year period (2009-2011), 66 cases of invasive lobular carcinomas (ILCs) belonging to the two most common variants (classical or pleomorphic) were chosen for this analysis. Tissue samples of ILC were obtained by breast-conserving surgery or mastectomy with axillary dissection in the Clinical Centre Niš and other clinical centers from south-eastern Serbia. The samples were routinely processed, embedded in paraffin, and archived together with their corresponding histopathological diagnosis and clinical documentation in the Centre of Pathology of the Clinical Centre Niš. The study design was approved by the Ethics Committee (No. 12-3627-2/3) on 14 April 2016.

Immunohistochemical Staining
Immunohistochemical staining was performed on characteristic tumor areas (1-2 paraffin blocks per case) from samples (regions) selected microscopically on the basis of standard hematoxylin and eosin (H&E) staining. Tissue from the paraffin blocks was cut into 4-µm-thick sections and placed on Superfrost glass slides. Antigen retrieval of deparaffinized and rehydrated samples was done in citric acid buffer (pH 6.0) in a microwave oven for 20 min. After cooling to room temperature, endogenous peroxidase was blocked using 3% (w/w) hydrogen peroxide. After the samples were washed (PBS, pH 7.4), the primary anti-ER antibody (Monoclonal Mouse Anti-Human Estrogen Receptor α (ER); Clone 1D5; Code N1575, Ready-to-use; Dako, Glostrup, Denmark) was applied for 40 min at room temperature in a moist chamber. Visualization was achieved by incubation of slides with Dako LSAB2 System-HRP (Code K0673, 15 mL) and diaminobenzidine (DAB), followed by washing and counterstaining with Mayer's hematoxylin.

Scoring System
Semi-quantitative analysis of ER-stained tissue sections was performed following the Allred scoring system guidelines. To obtain the final scores, the individual scores for the percentage of ER-positive cancer cell nuclei (0-5) and for the staining intensity of the nuclei (0-3) (Figure 1) were summed. The percentage of ER-positive cancer cell nuclei was scored as follows: score 1, less than 1% of positive cancer cell nuclei; score 2, from 1 to 10%; score 3, from 11 to 33%; score 4, from 34 to 66%; and score 5, 67% or more (Figure 1; first addend). The staining intensity of the nuclei was scored as follows: 1, weak; 2, medium; and 3, strong (Figure 1; second addend).
Figure 1. The addend combinations for the Allred scoring system; red circled combinations are those score combinations that were not found in our study or are extremely rare.
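The scoring rule described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and two details are assumptions not stated in the text: a proportion score of 0 is assigned when no nuclei are positive, and fractional percentages that fall between the listed ranges (for example, 10.5%) are assigned to the higher range.

```python
def allred_proportion_score(percent_positive: float) -> int:
    """Map the percentage of ER-positive nuclei (0-100) to the first addend (0-5)."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must lie between 0 and 100")
    if percent_positive == 0:
        return 0  # assumption: no positive nuclei gives score 0
    if percent_positive < 1:
        return 1
    if percent_positive <= 10:
        return 2
    if percent_positive <= 33:
        return 3
    if percent_positive <= 66:
        return 4
    return 5


def allred_total_score(percent_positive: float, intensity_score: int) -> int:
    """Sum the proportion score (0-5) and the intensity score (0-3)."""
    if intensity_score not in (0, 1, 2, 3):
        raise ValueError("intensity_score must be 0, 1, 2, or 3")
    return allred_proportion_score(percent_positive) + intensity_score


if __name__ == "__main__":
    # A case with 50% positive nuclei and medium staining: 4 + 2
    print(allred_total_score(50, 2))
```

Keeping the two addends as separate functions mirrors the "first addend + second addend" notation used in Figure 1.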

Experimental Scoring System
Tissue sections were observed using a BX-50 microscope (Olympus Co., Tokyo, Japan), and images of selected fields were captured with a SONY CCD Color Video camera HYPER HAD connected to the microscope. After the adjustment of the Köhler illumination field, the aperture opening (0.3) and illumination were set to a constant value, all camera options related to automatic image correction (shutter, gain) were switched off, and the white balance was adjusted to 3200. We used a blue microscope filter and a neutral density filter (ND6). The obtained images, saved in TIFF format without additional corrections, were further analyzed using ImageJ software version 1.5 (http://rsb.info.nih.gov/ij/).
For the analysis of each tissue section, at least 10 fields under 400× magnification were chosen. Fields with normal or dysplastic breast tissue, as well as those with focal lobular breast carcinoma "in situ", were not analyzed. The same precisely defined grid system was used for the analysis of all images obtained from the studied cases. The outline of each nucleus found on the grid points was drawn and saved in the ROI (Region of Interest) manager. Afterward, color deconvolution, based on the Landini algorithm and incorporated as a plugin, was applied to all images, and the saved nuclear outlines were overlaid on these modified images (Figure 2). Inside each outlined nucleus, optical density (OD) was measured. For calibration purposes, OD was set to a range from 0 (on a binary image corresponding to almost white, while in our case this was a pale blue hematoxylin coloration) to 255 (almost black). According to the Lambert-Beer law, the OD of each nucleus is directly proportional to the amount of dye bound to the nuclear structures. Thus, OD = 0 means that there is no dye, OD = 1.0 means that 90% of photons are absorbed, while OD = 2.0 corresponds to 99% of absorbed photons [11]. The total number of measured nuclei per case was taken to correspond to 100%, while the limit between positive and negative nuclei was set to 0.1 OD, corresponding to 10% of the DAB OD [18]. The quantitative ER score was assessed as follows:
Quantitative ER score = 1/20 × (percent of ER-positive cancer cell nuclei + (average nuclear intensity × 100)).
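As a worked illustration of the two relations above, the following Python sketch computes the absorbed fraction of light from an OD value and evaluates the quantitative ER score from a list of per-nucleus OD readings. The function names are hypothetical. One assumption is made that the text does not settle: the average nuclear intensity is taken over all measured nuclei, not only over the positive ones.

```python
POSITIVE_OD_THRESHOLD = 0.1  # limit between negative and positive nuclei (from the text)


def absorbed_fraction(od: float) -> float:
    """Fraction of incident light absorbed at a given OD (transmittance = 10**-OD)."""
    return 1.0 - 10.0 ** (-od)


def quantitative_er_score(nuclear_ods):
    """Return (percent positive, mean OD, quantitative ER score) for one case.

    nuclear_ods: OD values (0.0-1.0) measured inside each outlined nucleus.
    Assumption: the mean OD is computed over all measured nuclei.
    """
    if not nuclear_ods:
        raise ValueError("at least one nucleus must be measured")
    n_positive = sum(1 for od in nuclear_ods if od > POSITIVE_OD_THRESHOLD)
    percent_positive = 100.0 * n_positive / len(nuclear_ods)
    mean_od = sum(nuclear_ods) / len(nuclear_ods)
    score = (percent_positive + mean_od * 100.0) / 20.0
    return percent_positive, mean_od, score


if __name__ == "__main__":
    print(absorbed_fraction(1.0))  # about 0.90, i.e., 90% of photons absorbed
    print(absorbed_fraction(2.0))  # about 0.99
    print(quantitative_er_score([0.05, 0.05, 0.3, 0.6]))
```

In the small example, two of four nuclei exceed OD = 0.1 (50%) and the mean OD is 0.25, so the score is (50 + 25) / 20 = 3.75.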

Statistical Analysis
The values obtained for the evaluated photometric parameters, as well as the semi-quantitative ER score, were subjected to the following statistical methods: (i) descriptive statistics (mean (X), standard deviation (SD), median, maximum, and minimum values), (ii) correlation analysis (Pearson test), and (iii) effect size. All statistical analyses were performed using SigmaStat 2.0 (SPSS Inc., Chicago, IL, USA) and GraphPad Prism version 5.03 (San Diego, CA, USA).
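For readers who wish to reproduce this kind of analysis outside SigmaStat or GraphPad Prism, the sketch below shows one way to compute a Pearson correlation coefficient and an effect size in plain Python. The function names are hypothetical, and the effect size is implemented as Cohen's d with a pooled standard deviation, which is an assumption: the text reports d but does not state which formula was used.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("samples must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cohens_d(x, y):
    """Cohen's d using the pooled sample standard deviation (assumed formula)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)


if __name__ == "__main__":
    semi_quantitative = [3, 5, 6, 7, 8]       # made-up illustrative scores
    quantitative = [2.9, 4.6, 6.2, 6.8, 7.9]  # made-up illustrative scores
    print(pearson_r(semi_quantitative, quantitative))
```

The input values in the usage example are invented for illustration only and are not data from the study.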

Results
Both the semi-quantitative and quantitative analyses of the 66 cases of ILC considered in our study included the determination of the following three parameters: percent of ER-positive cells, average nuclear coloration intensity, and quantitative ER score. In the classical, semi-quantitative approach, the first two parameters were evaluated by at least two experienced pathologists; the coloration intensity was based on their subjective assessment (and took integer values, 0-3), while the percentage of ER-positive nuclei was assessed by manual counting and scored according to the Allred ranges (Figure 3). The quantitative approach proposed in this work performed both the counting and the evaluation of coloration intensity with software that used deconvoluted color intensity and a counting grid. The results of the quantitative assessment were scaled to be compatible with the Allred score already in use. Following the subjective semi-quantitative analysis of all cases included in the study, we initially considered the ER-negative ones. The deconvoluted color intensity assessment allowed us to detect that OD values of 0.1 and less were present in the negative cases (Figure 4). All cases that were determined to have an OD higher than 0.1 were observable by the pathologists. Thus, we chose OD = 0.1 as the limit between the scores 0 and 1 for nuclear coloration, to maintain compatibility with the Allred scoring system. Since the maximum OD value cannot exceed 1 [11], the chosen limit (OD = 0.1) represents the lowest 10% cutoff value.
Medicina 2019, 55.
The histograms in Figures 3 and 4 summarize the descriptive results of both approaches (number of cases and their breakdown into specific score combinations) and present the frequency of occurrence of specific parameter values within the ranges of the studied cases. As can be seen from the histograms, the quantitative approach pointed to the existence of ILCs that contained a low percentage of colored nuclei but whose coloration intensity was still assignable to the score 1, i.e., to the possibility of the total score 3 + 0, which was hardly, if at all, detected by the naked human eye. There were even cases where the total score was 4 + 0, in which a low percentage of colored nuclei was counted but a much greater color intensity was revealed after deconvolution. We observed additional unusual cases where the blue background staining masked the brown coloration of the positive nuclei; hence, these were not accounted for by simple visual inspection by the pathologists, but were clearly detectable after deconvolution by the software. Surprisingly, these cases would be classified as belonging to the 3 + 0, or even 4 + 0 and 5 + 0, total Allred scores (Table 1). For the rest of the cases assessed in this work, there was an excellent correlation between the results of the semi-quantitative and quantitative approaches.
The correlation analysis performed on all cases revealed a statistically significant positive correlation (r = 0.886, p < 0.001) between the subjective semi-quantitative and the quantitative ER scores, with a large effect size (d = 3.8215) (Figure 5, up). Note the cases with low OD values (quantitative score) in the plot, i.e., the ones that do not fit the good correlation of the remaining cases. These cases represent instances of score overestimation by the pathologists (Figure 5, up); hence, the quantitative approach provides a means of detecting human errors.
The correlations between the individual parameters of the total ER score (percentage of ER-positive nuclei and color intensity) obtained by the two independent methods are presented in Table 2. The results show that both parameters have strong, statistically significant correlations. Inherently, the color intensity of the ER-positive nuclei was directly and positively related to the number of these nuclei, and the lower-lying cases of low-intensity coloration are the problematic ones, as stated above for the subjective semi-quantitative assessment.
The correlation analysis revealed a statistically significant positive correlation (r = 0.719, p < 0.001) between the quantitative nuclear coloration intensity and the percent of ER-positive nuclei (Figure 5, down). However, since the Allred ranges for the percentage of ER-positive nuclei are unevenly spaced, the scores rise over small increments at first and reach their maximum at 67%; if we were to exclude this final score, the correlation would become much higher. The noted correlation (Figure 5, down) between the percentage of ER-positive nuclei and the stain intensity is in agreement with the generally rare occurrence of 1 + 2 and 1 + 3 cases (Figure 1), which were not detected at all in our sample.

Discussion
The standardization of immunohistochemical staining procedures, making them more adequate and routine, is still debatable. Besides these laboratory-oriented issues, the interpretation of stained tissue has become important for BC therapy outcome. Quantitative immunohistochemistry protocols require an appropriate procedure to be followed during tissue sample processing and image analysis [11,19]. The use of quantitative immunohistochemistry, based on single pixels in cells/tissue, is quite difficult in everyday clinical practice due to variations in tissue sampling, further processing, and analysis. The best way to avoid these discrepancies is to automate all of the processes mentioned [19]. Although immunohistochemical staining has numerous advantages, there are still no definite solutions for the standardization and interpretation of results [20]. The procedure itself can be easily standardized; however, the interpretation of the results is based only on a visual, subjective scoring system [21]. Many pathologists differentiate positive from negative immunohistochemical results according to a subjective assessment of positivity and percentage abundance, where the defined limits are between 5 and 45% [22].
Immunohistochemical semi-quantitative assessment of ER expression, for the purposes of therapy inclusion, is recommended by the American Society of Clinical Oncology (ASCO) [23]. Semi-quantitative procedures have both inter- and intra-observer variations [24]; however, some studies have shown that they are still useful for the evaluation of biopsy samples and are significant in everyday clinical practice [25][26][27]. Their major drawback lies in the evaluation of borderline cases of positivity; thus, it is very important to improve them in terms of steroid receptor positivity analysis [28]. Some publications emphasize that 80% of laboratories report ER positivity in cases with medium/strong expression, but only 37% do so in cases with weak expression [29]. Several publications applied semi-quantitative scores for the estimation of nuclear staining as a direct measure of the number of ERs in cells [27]. However, these systems carry a high level of subjectivity, and inter-observer variations are still present [21]. In order to ensure data standardization, different software has been used, and a significant correlation with semi-quantitative scores and biochemical parameters was found [30][31][32][33]. Nevertheless, complexity and high pricing are major limitations for the application of software in routine diagnostics.
Tissue sections adequately stained with monoclonal antibodies have a two-colored character, where the brown coloration arises from the antibody-bound structures (the nucleus in our case), while the blue is due to non-specific hematoxylin background staining affecting all tissue structures. These premises were taken into consideration in our experimental design, where image color deconvolution was applied (Figure 5). Previous methods tried to overcome this color-related issue by subtracting the color intensity of the background from that of the nucleus [21]. The problem with this technique is that dark blue and light brown nuclei cannot be distinguished; this is avoided in our study by separating the DAB-staining signals from those originating from hematoxylin (deconvolution). This deconvolution algorithm was previously suggested by Ruifrok and Johnston for the same purposes as presented here [34].
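The idea behind separating the DAB signal from the hematoxylin signal can be sketched for a single pixel as follows. This is an illustrative sketch of stain unmixing in optical density space, not the code of the ImageJ plugin used in the study. The stain vectors are the hematoxylin and DAB values commonly quoted for this method and should be treated as assumptions; the third (residual) vector is taken as the normalized cross product of the two, and all function names are hypothetical.

```python
import math


def _normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return sum(x * y for x, y in zip(a, _cross(b, c)))


# Assumed stain vectors: OD contribution per R, G, B channel, normalized to unit length
H_VEC = _normalize((0.650, 0.704, 0.286))    # hematoxylin
DAB_VEC = _normalize((0.268, 0.570, 0.776))  # DAB
RES_VEC = _normalize(_cross(H_VEC, DAB_VEC))  # residual, orthogonal to both stains


def rgb_to_od(rgb):
    """Convert 8-bit RGB intensities to per-channel optical density."""
    # (c + 1) / 256 avoids log10(0) for fully dark channels; white maps to OD 0
    return tuple(-math.log10((c + 1) / 256) for c in rgb)


def unmix_od(od):
    """Solve od = h*H_VEC + d*DAB_VEC + r*RES_VEC for (h, d, r) by Cramer's rule."""
    det = _det3(H_VEC, DAB_VEC, RES_VEC)
    return (_det3(od, DAB_VEC, RES_VEC) / det,
            _det3(H_VEC, od, RES_VEC) / det,
            _det3(H_VEC, DAB_VEC, od) / det)


def deconvolve_pixel(rgb):
    """Return (hematoxylin, DAB, residual) amounts for one RGB pixel."""
    return unmix_od(rgb_to_od(rgb))


if __name__ == "__main__":
    print(deconvolve_pixel((120, 80, 40)))  # an arbitrary brownish pixel
```

Because the unmixing is done on optical densities and not on raw color intensities, a dark blue nucleus and a light brown nucleus end up with different DAB amounts even when their overall darkness is similar, which is the distinction that simple background subtraction cannot make.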
We focused our study on the analysis of immunohistochemically stained characteristic areas of tumors (periphery of the sample), selected on the basis of standard H&E staining. The analysis of these areas seems to be the most adequate one, since a number of researchers have found different ER expression in the central parts compared to the periphery of the tumor mass [35]. This is of great importance since, in some cases, heterogeneous expression of ER can be observed [36]. Also, ER expression appeared more suitable for analysis than progesterone receptor expression, owing to the relatively equal expression of progesterone receptors in the surrounding cells [7]. Besides the ER expression, these areas are characterized by different morphological features of tumor cells and have a higher proliferative index than the central parts [28]. Although we cannot eliminate subjective nuclei labeling during the analysis, the use of a grid with a predetermined number of grid points partially decreases this form of subjectivity. The strong positive correlation between the subjective semi-quantitative and quantitative ER scores (Figure 5, up, r = 0.886) suggests that an approach in which 10 fields at the periphery of the tumor mass are chosen for analysis might be adequate for the estimation of ER expression.
One of the major features of this study is that the analysis is based on simple, cheap, and available software: ImageJ. Finally, it represents a great benefit for patients with BC, since receptor expression analysis is crucial for the determination of the prognostic indexes [25]. Although no generally accepted standards for morphometric and photometric analysis are available [37], the results of this study can be useful for comparison between different histological variants of BC. The Allred scoring system is based on the semi-quantitative estimation of the percentage abundance of positive cancer cell nuclei and staining intensity, where the expression limit is 10% of weakly or 1% of medium-stained cancer cell nuclei [18]. However, previous studies have estimated different cutoff values between positive and negative immunohistochemical staining [38,39]. A shortcoming of the Allred scoring system is that some of the possibilities for the final score are only hypothetical (3 + 0, 4 + 0, and 5 + 0) (Table 1). Our study may overcome such shortcomings, based on the strong positive correlation between the quantitative nuclear stain intensity and the percent of ER-positive nuclei (Figure 5, down).

Conclusions
The currently proposed approach has two major benefits: first, it makes it possible to detect cases in which the stain intensity was obscured by the background coloration, and second, it eliminates the overestimation of score values by pathologists. The comparison and evaluation of semi-quantitative scoring systems (such as the Allred scoring system) are necessary for the standardization of quantitative ER expression assessment. Thus, the suggested deconvolution method can distinguish different variants of ILC, reduce intra-laboratory variations, and exclude subjectivity during ER analysis. The deconvolution method can also be useful for the detection of borderline ER (+) cases, which directly affects the use of hormonal therapy.