BerEP4 and AE1/3 are Reliable Markers of Epithelial Content for Biomarker Discovery Using Reverse Phase Protein Arrays (RPPA)

Background: Reverse phase protein array (RPPA) allows precise protein quantification. Use of homogenised samples confounds interpretation due to uncertainty of cellular content. We hypothesised that antibodies AE1/3 (Type I/II keratins) and BerEP4 (EpCAM) could help to quantify epithelium in tissue from colorectal cancers (CRC). Methods: AE1/3 and BerEP4 expression was quantified using RPPA and immunohistochemistry (IHC) in formalin-fixed paraffin-embedded tissues from 19 CRCs. For RPPA, samples were tested in sextuplicate. IHC was quantified by transforming images into positive/negative pixels using ImageJ. Results: In RPPA, the BerEP4 expression was greater than AE1/3 (p < 0.001). There was excellent precision with a mean coefficient of variation of 5.9% for AE1/3 and 6.9% for BerEP4. Expression of AE1/3 and BerEP4 showed significant correlation (p < 0.0001). For IHC, antibodies showed specific epithelial staining. There was no difference in intensity of expression but there were slight differences in extent of expression with, on average, 3.5% more AE1/3-positive pixels than BerEP4-positive pixels (p = 0.016). Antibody expression (number of positive pixels) showed significant correlation (p < 0.0001). Comparison of RPPA with IHC showed significant correlation for each antibody (BerEP4: p = 0.044; AE1/3: p = 0.037). Combining markers (using the geometric mean) improved the correlation between the assays (p = 0.028). Conclusion: Expression of AE1/3 correlates with that of BerEP4 in both RPPA and IHC. For each antibody, the two assays show significant correlation. Either antibody can be used to quantify tumour epithelium in RPPA, with improved performance if markers are combined. Normalising RPPA data to "epithelial housekeepers" may improve biomarker data interpretation.


Introduction
Biomarker studies are notorious for their high failure rate. Despite huge investment by both academia and industry, very few biomarkers make the transition from interesting laboratory data to tests with clinical utility. The reasons for this are manifold [1,2]. Some biomarkers are "true" biomarkers inasmuch as they are robust and reproducible but the information they provide does not help to improve patient management beyond that informed by traditional clinical parameters. In such cases, the lack of success of the validated biomarker has to be accepted as part of the vicissitudes of science. In other cases, the biomarker is doomed to failure due to an inappropriately defined and limited test population, insufficiently powered studies or technical weaknesses (at the pre-analytical, analytical or post-analytical stage).
These issues need to be addressed to increase the chance of successful biomarker discovery. Variability in the analytical stage can be reduced through use of automated assays. This would ensure that each of the technical steps of the assay is identical, that the algorithms for quantifying the data are free of observer variability and that the assays can be adapted for high-throughput analysis. Reverse Phase Protein Array (RPPA) represents a high-throughput technique that allows simultaneous precise quantification of protein expression in a large number of biological samples [3][4][5]. Over the past decade, RPPA has been successfully used for different applications including basic and preclinical research, biomarker discovery and clinical trials [6][7][8][9][10].
RPPA is a "morphology-independent" assay, i.e. extracts of homogenised tissue are interrogated. Since composition of tissue is variable, the interpretation of the data is confounded by uncertainty regarding the actual cellular contents of the tissue. This can be obviated by histological review of each sample prior to use but this creates further conundrums: (i) the speed of acquisition of data is reduced (which may be important when large studies are being undertaken), (ii) there is observer variation in the review and (iii) there may still be uncertainty due to differences between the reviewed section and the actual tissue being tested. Thus a solution is required which would allow capture of all the requisite data in as objective a manner as possible.
In many assays, the problem of variation in the templates is dealt with by a process known as "normalisation" [11][12][13][14][15]. A surrogate "reference" marker which is thought to represent the variable feature is chosen and used as a measure of the feature. For example, in real-time quantitative PCR assays, uncertainty about the amount of nucleic acid template can be reduced by testing for a "housekeeping" gene. This is a gene which is unchanging (in either expression or copy number, depending on the assay) and therefore the "quantity" of gene is linearly related to the amount of template [16].
We hypothesised that the epithelial content of a tissue sample could be quantified with a reference marker for epithelium in an RPPA-based experiment. This would allow expression of biomarkers to be normalised to the amount of epithelium present within tissue samples. BerEP4 and AE1/AE3 (AE1/3) are commonly used in routine diagnostic histopathology in a qualitative manner as markers of epithelium. BerEP4 is a monoclonal antibody which detects an epitope on Epithelial Cell Adhesion Molecule (EpCAM, CD326), a transmembrane glycoprotein. AE1/3 is a cocktail of two monoclonal antibodies which detect epitopes on Type I and Type II keratins, which are present in the cytoskeleton of epithelial cells [17,18]. We reasoned that these would be good antibodies since expression of these markers is maintained in malignancy and, in routine diagnostic practice, they are usually used when tumours are so poorly differentiated that they are not obviously epithelial in origin [19].
In this study we investigated whether the quantification of epithelium using BerEP4 and AE1/3 in RPPA reflected that seen by immunohistochemistry (IHC).

Tissue samples
Nineteen sequential cases of colorectal cancer (CRC) were selected for study. Each case was reviewed by MI and a single tumour block was chosen for testing. Formalin-fixed paraffin-embedded (FFPE) tumour tissue was used for both RPPA and IHC. Approval of the study and access to the tissue were granted by Nottingham Health Sciences Biobank.

Protein extraction for RPPA
For protein extraction, 20 μm thick sections were cut from formalin-fixed paraffin-embedded (FFPE) tumour blocks. Tissue was deparaffinised in xylene and the xylene was removed using graded ethanol (100%, 96%, and 70%). Excess alcohol was removed by centrifugation at 16,000 g for two minutes. Next, 40 µl of lysis buffer (20% SDS, 0.5 M DTT and 0.5 M Tris-HCl solution at pH 8) was added to the tissue pellet and incubated at 100°C for 60 minutes using a thermomixer. Finally, the tubes were centrifuged at 14,000 rpm for 20 minutes and the supernatants were collected and stored at -80°C until used.

Protein quantification
Protein in the extracted lysates was quantified as described previously [20,21]. In brief, lysates were robotically spotted in triplicate onto nitrocellulose-coated glass slides (Grace Bio-labs, USA) with a microarraying robot (MicroGrid 610, Digilab, Marlborough, MA, USA). In addition, serial two-fold dilutions of 1.0 mg/mL bovine serum albumin (Sigma-Aldrich, UK) were spotted onto the array as a total protein standard. Printed arrays were stained with Fast Green stain (0.005% Fast Green FCF (Sigma-Aldrich, UK) in 30% ethanol, 10% acetic acid and 60% water, v/v/v) for 45 min at room temperature on a rocking platform at 40 rpm. Destaining was performed twice, for 15 min each, with 30% ethanol, 10% acetic acid and 60% water. After washing, the slide was air dried and scanned at 700 nm using an Odyssey scanner (LI-COR, Lincoln, USA).
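The use of the BSA dilution series as a total protein standard can be illustrated with a minimal sketch. The source does not state how the standard curve was fitted, so the straight-line least-squares fit below is an assumption, and the signal values, `fit_line` and `concentration_from_signal` are hypothetical names and numbers chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_signal(signal, slope, intercept):
    """Invert the fitted line to estimate protein concentration (mg/mL)."""
    return (signal - intercept) / slope


# Two-fold dilution series starting at 1.0 mg/mL BSA (as in the text).
bsa_conc = [1.0 / 2 ** i for i in range(6)]
# Made-up, perfectly linear Fast Green signals (assumption, not real data).
bsa_signal = [50.0 + 2000.0 * c for c in bsa_conc]

slope, intercept = fit_line(bsa_conc, bsa_signal)
lysate_conc = concentration_from_signal(1050.0, slope, intercept)
```

In practice the standard curve may not be linear over the whole dilution range, in which case a different model would be needed.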

Reverse Phase Protein Microarray (RPPA)
RPPA methodology has been described previously [22]. In brief, equal amounts of protein lysates (1 µg/µl) were loaded into a 384-well plate. Samples were robotically spotted in sextuplicate onto nitrocellulose-coated glass slides. Blocking of slides was performed by incubating the arrays overnight at 4°C with Super G blocking buffer (Grace Bio-labs, USA). Next, the arrays were incubated with the primary antibodies (Table 1) overnight at 4°C with agitation at 20 rpm. After three washes of 5 min each with Tris-buffered saline (TBS) / 0.5% Tween 20, slides were incubated with infrared LI-COR secondary antibodies (680 CW anti-mouse Ig antibody, Cat number: 926-65010) diluted 1:5000 in washing buffer for 30 minutes at room temperature in the dark. Arrays were visualised using an Odyssey high-resolution scanner (LI-COR, Lincoln, USA) at 21 μm resolution using the 700 nm (red) channel. The Axon GenePix Pro 6 Microarray Image Analysis software was used to determine the average fluorescence intensities of all spots on each array. Protein intensity signals were finally determined with background subtraction using RPPanalyzer, a package for the R statistical language available on CRAN (http://cran.r-project.org) [23]. For comparative analysis, BerEP4 and AE1/3 intensity signals were normalised to the protein concentration of the lysate.

Immunohistochemistry (IHC)
Whole tissue sections from each tumour block were cut in duplicate and fixed to specially coated slides by baking for 20 minutes at 60°C. Antibody information for BerEP4 and for AE1/3 is given in Table 1. IHC was performed using a BenchMark ULTRA automated IHC/ISH slide staining system (Ventana, Roche) in accordance with the manufacturer's instructions. All materials came with the UltraView (UV) immunostaining kit (Ventana, Roche). The following specific incubation steps were used for the procedure: antigen retrieval for 8 minutes; UV inhibitor for 4 minutes; primary antibody for 32 minutes (see dilutions in Table 1); UV Horse-Radish Peroxidase UNIV MULT for 8 minutes; UV 3,3'-Diaminobenzidine (DAB) and UV DAB H2O2 for 8 minutes; UV Copper for 4 minutes; Haematoxylin II for 12 minutes.

Image Analysis
Digital images of the immunostained slides were obtained using the Nanozoomer (Hamamatsu). Slides were scanned at X40 and saved in the proprietary .ndpi format. The images were viewed using the NDPIv2 viewing software (Hamamatsu) and exported for image analysis in JPEG format at magnification X0.3. Images were analysed using the publicly available software ImageJ (http://imagej.nih.gov/ij/) and two features were evaluated, i.e. mean intensity of pixels and number of positively stained pixels. Images were opened in ImageJ and were processed using the colour deconvolution plugin. This plugin separates the image into 3 images, i.e. "colour 1", an image containing the haematoxylin colour, "colour 2", an image containing the DAB brown colour, and "colour 3", containing the remaining colours (Figure 1). Analysis was restricted to "colour 2" and a histogram was obtained of each image showing the mean pixel intensity. In order to quantify the number of positively staining pixels, the image was processed using the "Make Binary" function in ImageJ to create an image which consisted of only black and white pixels (intensity 0 and 255 respectively). The total number of black pixels (corresponding to stained pixels) could then be calculated via the histogram function. For comparative analysis it was necessary to normalise the number of positive pixels to the total number of pixels comprising the tissue section (indicating the proportion of tissue containing tumour). For this, the polygon selection tool was used to outline the tissue area and the total number of pixels in this area was ascertained via the histogram function.
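The normalised pixel count described above (black pixels divided by the pixels inside the outlined tissue area) can be sketched in a few lines. This is only an illustration in plain Python rather than the ImageJ workflow itself; the tiny image, the boolean tissue mask standing in for the polygon selection, and the function name `positive_pixel_fraction` are all hypothetical.

```python
def positive_pixel_fraction(binary_image, tissue_mask):
    """Fraction of tissue pixels that are black (0), i.e. DAB-positive.

    binary_image: rows of pixel values, 0 (stained) or 255 (unstained).
    tissue_mask:  rows of booleans, True where the pixel lies in the tissue.
    """
    positive = 0
    total = 0
    for image_row, mask_row in zip(binary_image, tissue_mask):
        for pixel, in_tissue in zip(image_row, mask_row):
            if in_tissue:
                total += 1
                if pixel == 0:
                    positive += 1
    if total == 0:
        raise ValueError("tissue mask contains no pixels")
    return positive / total


# Hypothetical 3 x 4 binary image and tissue outline.
binary = [
    [0, 0, 255, 255],
    [0, 255, 255, 255],
    [255, 255, 255, 255],
]
mask = [
    [True, True, True, False],
    [True, True, True, False],
    [False, False, False, False],
]
fraction = positive_pixel_fraction(binary, mask)  # 3 stained of 6 tissue pixels
```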

Statistical analysis
Statistical analysis was undertaken via the use of GraphPad Prism 6.5. Precision of the RPPA assay was tested using raw data to calculate the coefficient of variation (CV). For intensity of IHC, the mean value of pixel intensity for "Colour 2" was used. For all other tests of the RPPA and IHC, raw data were firstly normalised (see above) and normalised values were then transformed using the square-root method [24] to allow use of parametric tests. Normal distribution was confirmed by the Shapiro-Wilk test and comparative analysis was performed using paired t-test and linear regression analysis. In order to assess correlation of RPPA and IHC using both markers in combination, the geometric mean (calculated by the square root of the product of the values for each biomarker [25]) was used. A value of p < 0.05 was considered significant.
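The transformation and combination steps described above can be sketched as follows. The analysis in the study was run in GraphPad Prism; this plain-Python version is only an illustration, and the function names and example values are hypothetical.

```python
import math


def sqrt_transform(values):
    """Square-root transform of normalised values."""
    return [math.sqrt(v) for v in values]


def geometric_mean_pair(a, b):
    """Geometric mean of two marker values: sqrt(a * b)."""
    return math.sqrt(a * b)


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical normalised values for three samples (not study data).
berep4 = sqrt_transform([16.0, 25.0, 36.0])
ae13 = sqrt_transform([1.0, 4.0, 9.0])
combined = [geometric_mean_pair(b, a) for b, a in zip(berep4, ae13)]
r = pearson_r(berep4, ae13)
```

The same `pearson_r` function could then be applied to the combined RPPA values against the combined IHC values.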

Precision of BerEP4 and AE1/3 in the RPPA assay
The use of BerEP4 and AE1/3 as robust and reliable markers of epithelium in routine diagnostic practice is well established. In order to evaluate their utility in RPPA, we firstly tested the precision (i.e. reproducibility of data in simultaneous replicates) of these antibodies. Each of the 19 cases of CRC was spotted as 6 replicates onto the array; BerEP4 had a mean coefficient of variation (CV) of 6.9% whilst AE1/3 had a mean CV of 5.9%. This degree of precision is excellent for this kind of assay and indicates that these antibodies perform robustly in RPPA [26].
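For reference, the per-sample CV and its mean across samples can be computed as in the sketch below. Whether the sample or the population standard deviation was used is not stated in the text; the sample standard deviation is assumed here, and the replicate values are invented for illustration.

```python
import statistics


def percent_cv(replicates):
    """Coefficient of variation (%) = 100 * sample SD / mean."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


def mean_percent_cv(samples):
    """Mean of the per-sample CVs across all samples."""
    return statistics.mean(percent_cv(reps) for reps in samples)


# Hypothetical raw fluorescence values for six replicate spots of one sample.
replicates = [95.0, 100.0, 105.0, 98.0, 102.0, 100.0]
cv = percent_cv(replicates)
```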

Quantification of epithelium by RPPA
The normalised and transformed RPPA data for BerEP4 and AE1/3 are shown in Table 2. Comparison of the normalised fluorescence data showed that the intensity of fluorescence was greater for BerEP4 than AE1/3 with a mean fold difference of 17. Pairwise comparison of the samples (using a paired t-test) showed that this difference was statistically significant (p < 0.001). The expression of the two antibodies was significantly correlated (Figure 2, Pearson's correlation coefficient r = 0.9752, p < 0.0001, 95% CI = 0.9351 to 0.9906).
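The paired comparison works on the per-sample differences between the two antibodies. A minimal sketch of the paired t statistic is given below; it stops at the statistic and degrees of freedom, since the p-value needs the t distribution (available in a statistics package), and the input values are hypothetical.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Paired t statistic and degrees of freedom for two matched lists."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical transformed values for three matched samples.
t_stat, dof = paired_t_statistic([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```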

Quantification of epithelium by IHC and comparison with RPPA
The IHC staining for both antibodies was reviewed by MI and showed strong and specific staining in the epithelium with little staining in the stroma (Figure 1). The digital data were confirmed as normally distributed and pairwise comparison of mean pixel intensity values showed no difference in intensity of staining between the two antibodies. The number of stained pixels represents the area stained by the antibodies and comparison of the number of stained pixels (normalised to the total number of pixels in the tissue section) showed that, on average, there were 3.5% more AE1/3-positive pixels than BerEP4-positive pixels (p = 0.016, paired t-test on transformed data). As with the RPPA, the expression of the two antibodies was significantly correlated (Figure 2, Pearson's correlation coefficient r = 0.9484, p < 0.0001, 95% CI = 0.8681 to 0.9803). Although RPPA and IHC represent two completely different methodologies, it was important to see whether the data from the two assays were correlated. For each of the antibodies, there was a statistically significant correlation (for BerEP4, Pearson's correlation coefficient r = 0.4652, 95% CI = 0.01 to 0.76, p = 0.044; for AE1/3, Pearson's correlation coefficient r = 0.48, 95% CI = 0.03 to 0.77, p = 0.037). As we had used 2 different markers, we sought to combine the data. The geometric means of BerEP4 and AE1/3 for RPPA and IHC were plotted and showed a significant correlation (Figure 2, Pearson's correlation coefficient r = 0.50, 95% CI = 0.06095 to 0.7783, p = 0.028). The correlation of the geometric means was confirmed by randomly resampling the data (Supplementary Figure 1).
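As a worked check on the confidence intervals quoted above, the sketch below computes a 95% CI for Pearson's r using the Fisher z-transformation with n = 19 samples. The text does not say how the intervals were derived, so the method is an assumption; it does, however, reproduce the reported intervals for r = 0.4652 and r = 0.9752 to within rounding.

```python
import math


def pearson_ci(r, n, z_crit=1.959964):
    """Approximate CI for Pearson's r via the Fisher z-transformation.

    z_crit defaults to the two-sided 95% normal quantile.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Inter-assay correlation for BerEP4 (reported CI: 0.01 to 0.76).
low_berep4, high_berep4 = pearson_ci(0.4652, 19)

# BerEP4 versus AE1/3 in RPPA (reported CI: 0.9351 to 0.9906).
low_rppa, high_rppa = pearson_ci(0.9752, 19)
```

The wide intervals for the inter-assay correlations reflect the small sample size: with 19 cases, the standard error on the z scale is 1/sqrt(16) = 0.25.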

Discussion
RPPA is a powerful method of quantifying proteins but it is morphology-independent. Data are therefore confounded by uncertainty around the actual cellular composition of the tissue sample. An epithelial housekeeper or reference marker accurately reflecting the epithelial content of a tissue sample would allow biomarker expression to be normalised to epithelial content and could improve data analysis. BerEP4 and AE1/3 are well established in routine diagnostic practice as robust qualitative markers of epithelium and this was indeed confirmed by our immunostaining in this study. However, it is well known that performance of antibodies in different assays may vary markedly; thus an antibody which performs well in immunohistochemistry may not perform well in Western blot [27]. In order to assess the utility of BerEP4 and AE1/3 as quantitative markers in RPPA assays, we firstly evaluated the precision (i.e. reproducibility) of these antibodies. Our data showed that the mean coefficient of variation (CV) for AE1/3 was 5.9% and for BerEP4 was 6.9%. The CV is frequently used as an indicator of the precision of an assay and a value of less than 10% is regarded as acceptable [26]. The data therefore indicate that the antibodies perform consistently in RPPA. For every sample, the level of expression of BerEP4 in RPPA was higher than that of AE1/3 (p < 0.001) with a mean 17-fold difference. This would suggest that the protein detected by BerEP4 is present in greater amounts than that detected by AE1/3 although it is possible that it represents differences in antibody properties (e.g. affinity or avidity for the antigen). In the IHC, the results were reversed. There was no difference in the mean intensity of staining between the 2 groups but the epithelial content quantified by AE1/3 was, on average, 3.5% greater than that quantified by BerEP4 (p = 0.016). The difference between the 2 antibodies in the IHC is relatively trivial but it does confirm other data suggesting the poorly quantitative nature of the IHC assay [28,29]. There was a significant correlation between the two antibodies in both RPPA and IHC. Even though they are measuring different proteins, this implies that the proteins are common to a particular component in the tissue. The IHC demonstrates that the component is the epithelium and it can thus be inferred that it is the same in the RPPA assay. As well as being precise, the RPPA assay using BerEP4 and AE1/3 needs to be accurate. We therefore validated the RPPA against the IHC. The intensity of fluorescence for both antibodies in the RPPA was significantly correlated with positive pixel count in the IHC (BerEP4, p = 0.044; AE1/3, p = 0.037). Combining the biomarkers improved the correlation between the two assays (p = 0.028). This confirms that these antibodies can be used to quantify relative epithelial content in homogenised tissue lysates. This, in turn, means that these antibodies can be used to "normalise" biomarker expression in RPPA thus, to some degree, reducing the uncertainty surrounding cellular content in tissue samples.

Table 2:
Expression of AE1/3 and BerEP4 by RPPA and by IHC. The RPPA data have been normalised for protein loading and have undergone square-root transformation to generate parametric data. %CV is the coefficient of variation calculated for replicates for each sample and derived using raw fluorescence data. For the IHC data, normalised pixel count refers to the number of positive pixels normalised to the total number of pixels of the tissue section. Mean pixel intensity is derived from only the pixels which were positive in the tissue section.

ISSN: 2469-5807
Our data show that, where expression of a biomarker is known to be restricted to the epithelium, performing an assay with these antibodies will obviate the need for morphological assessment of tissue. However, there are two caveats: (i) where it is possible that the biomarker may be expressed in other tissue compartments, optimal data analysis will continue to be morphology-dependent, i.e. tissue samples will need to be reviewed histologically in order to ascertain which tissue compartment is expressing the biomarker of interest and at what level, and (ii) these two antibodies are excellent epithelial markers in colorectal tissue, but tumours arising in other organs may require different epithelial housekeepers.
In summary, our data suggest BerEP4 and AE1/3 can be used for the relative quantification of epithelium in RPPA. By inference, other tissue compartments could similarly be evaluated. This allows expression of biomarkers tested using RPPA to be normalised to epithelial content, possibly improving analysis and increasing the chance of biomarker discovery.
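One way the proposed normalisation could look in practice is sketched below. The study does not prescribe a formula, so dividing the biomarker signal by the geometric mean of the two epithelial markers is an assumption made for illustration, and the function names and values are hypothetical.

```python
import math


def epithelial_index(berep4_signal, ae13_signal):
    """Combined epithelial reference: geometric mean of the two markers."""
    return math.sqrt(berep4_signal * ae13_signal)


def normalise_to_epithelium(biomarker_signal, berep4_signal, ae13_signal):
    """Biomarker signal per unit of epithelial reference signal."""
    index = epithelial_index(berep4_signal, ae13_signal)
    if index <= 0:
        raise ValueError("epithelial reference signal must be positive")
    return biomarker_signal / index


# Hypothetical RPPA signals for one lysate (arbitrary units).
normalised = normalise_to_epithelium(12.0, 4.0, 9.0)  # 12 / sqrt(36) = 2.0
```

Two lysates with the same raw biomarker signal but different epithelial content would then yield different normalised values, which is the effect the normalisation is meant to capture.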

Figure 1: Quantification of the immunohistochemistry. Two sequential sections stained with BerEP4 and AE1/3 are shown at the top and show near identical patterns of expression. The process of quantification is shown for the section stained with BerEP4. The section initially undergoes colour deconvolution to extract the colours of Haematoxylin (Colour 1), DAB (Colour 2) and the remainder (Colour 3). From Colour 2, the mean pixel intensity can be calculated. The Colour 2 image then undergoes binary transformation and all the brown-stained pixels are converted to black and all the non-stained pixels are converted to white. The number of black pixels then indicates the area showing positive staining.

Figure 2: Comparison of antibodies and assays. (a) and (b) show correlation of AE1/3 and BerEP4 in RPPA and IHC respectively. (c) shows the inter-assay correlation when the geometric means are plotted. There is a significant correlation for each of the antibodies individually (not shown) but the performance is better when the markers are combined.

Table 1: List of antibodies used in RPPA and IHC.