Serum glycoprotein biomarker validation for esophageal adenocarcinoma and application to Barrett’s surveillance

BACKGROUND & AIMS: Esophageal adenocarcinoma (EAC) is thought to develop from asymptomatic Barrett’s esophagus (BE) with a low annual rate of conversion. Current endoscopy surveillance for BE patients is probably not cost-effective. Previously, we discovered serum glycoprotein biomarker candidates which could discriminate BE patients from EAC. Here, we aimed to validate candidate serum glycoprotein biomarkers in independent cohorts, and to develop a biomarker panel for BE surveillance.
METHODS: Serum glycoprotein biomarker candidates were measured in 301 serum samples collected from Australia (4 states) and USA (1 clinic) using lectin magnetic bead array (LeMBA) coupled multiple reaction monitoring mass spectrometry (MRM-MS). The area under receiver operating characteristic curve was calculated as a measure of discrimination, and multivariate recursive partitioning was used to formulate a multi-marker panel for BE surveillance.
RESULTS: Different glycoforms of complement C9 (C9), gelsolin (GSN), serum paraoxonase/arylesterase 1 (PON1) and serum paraoxonase/lactonase 3 (PON3) were validated as diagnostic glycoprotein biomarker candidates for EAC across both cohorts. A panel of 10 serum glycoproteins accurately discriminated BE patients not requiring intervention [BE+/-low grade dysplasia] from those requiring intervention [BE with high grade dysplasia (BE-HGD) or EAC]. Tissue expression of C9 was found to be induced in BE, dysplastic BE and EAC. In longitudinal samples from subjects that have progressed towards EAC, levels of serum C9 glycoforms were increased with disease progression.
CONCLUSIONS: Further prospective clinical validation of the confirmed biomarker candidates in a large cohort is warranted. A first-line BE surveillance blood test may be developed based on these findings.
Abbreviations
AAL: Aleuria aurantia lectin
%CV: % coefficient of variation
AUROC: Area under receiver operating characteristics curve
BE: Barrett’s esophagus
BE-HGD: Barrett’s esophagus with high-grade dysplasia
BE-ID: Barrett’s esophagus which is indefinite for dysplasia
BE-LGD: Barrett’s esophagus with low-grade dysplasia
BMI: Body mass index
C1QB: Complement C1q subcomponent subunit B
C2: Complement C2
C3: Complement C3
C4B: Complement C4-B
C4BPA: C4b-binding protein alpha chain
C4BPB: C4b-binding protein beta chain
C9: Complement component C9
CFB: Complement factor B
CFI: Complement factor I
CI: Confidence interval
CP: Ceruloplasmin
EAC: Esophageal adenocarcinoma
EPHA: Erythroagglutinin from Phaseolus vulgaris
FFPE: Formalin-fixed, paraffin-embedded
GERD: Gastroesophageal reflux disease
GSN: Gelsolin
JAC: Jacalin from Artocarpus integrifolia
LeMBA: Lectin magnetic bead array
MRM-MS: Multiple reaction monitoring-mass spectrometry
NPL: Narcissus pseudonarcissus lectin
NSE: Non-specialized epithelium
OR: Odds ratio
PGLYRP2: N-acetylmuramoyl-L-alanine amidase
PON1: Serum paraoxonase/arylesterase 1
PON3: Serum paraoxonase/lactonase 3
RBP4: Retinol-binding protein 4
SERPINA4: Kallistatin
SIS: Stable isotope-labeled internal standard


INTRODUCTION
Esophageal cancer is the sixth most common cause of cancer-related mortality in men, with 3-fold higher rates in men than women (1,2). Of the two main histological subtypes (adenocarcinoma and squamous cell carcinoma), the incidence of esophageal adenocarcinoma (EAC) has been rising continuously in Western countries, and accounts for the majority of cases (3-5). Despite aggressive treatment, EAC has a 5-year survival of less than 20% (6). EAC is thought to develop from the metaplastic condition Barrett's esophagus (BE) as a consequence of gastroesophageal reflux disease (GERD) through a metaplasia-dysplasia-adenocarcinoma sequence (Figure 1A) (7-9).
Currently, BE patients usually undergo endoscopy-biopsy surveillance with the degree of dysplasia assessed by histopathology as a biomarker to monitor risk of neoplastic progression (10).
Patients diagnosed with high grade dysplasia (BE-HGD) are treated with endoscopic mucosal resection, radiofrequency ablation or surgery, in an attempt to halt further disease progression (10-12). The significant cost of endoscopy, combined with the low annual progression rate to HGD or EAC, means that the cost-effectiveness of endoscopic surveillance is questioned at the population level (13-16).
Furthermore, the evaluation of dysplasia in tissue biopsies by histopathology is prone to interobserver variability and sampling error (17). A less costly and minimally invasive diagnostic procedure is needed for cost-effective screening and surveillance of at-risk populations (18,19).
As the first step towards developing a blood-based EAC diagnostic test, we focused on differential glycosylation of serum glycoproteins during EAC pathogenesis. We established a new glycoprotein biomarker pipeline which couples lectin-based glycoprotein isolation with state-of-the-art discovery and targeted proteomics (20-23). We then applied it to identify and verify changes in the lectin binding profile of serum glycoproteins between healthy, BE and EAC patients (22). Here, we report results from validation in independent cohorts, and evaluation of biomarker panels for surveillance of BE patients.

Figure 1. (A) Pathogenesis of esophageal adenocarcinoma (EAC) and clinical management of patients during each stage. In response to exposure to gastric and bile acid, non-specialized esophageal epithelium (NSE) converts to Barrett's esophagus (BE), and may progress through low grade dysplasia (LGD) and high grade dysplasia (HGD) stages to EAC. Patients at high risk of BE undergo endoscopic screening to detect the asymptomatic metaplastic BE condition. Patients with BE or BE-LGD undergo endoscopy-biopsy surveillance to detect BE-HGD for endoscopic treatments. (B) Workflow of the study. A total of 301 serum samples from two different patient cohorts were subjected to lectin magnetic bead array-coupled multiple reaction monitoring mass spectrometry (LeMBA-MRM-MS). Biomarker candidates for EAC and surveillance were identified by statistical analysis. Tissue expression of the top glycoprotein biomarker candidate, complement C9, was evaluated by immunohistochemistry.

Clinical cohorts
Ethical approval was obtained from all participating institutions, and all patients provided informed consent to participate in the studies. We investigated two independent cohorts recruited in Australia and the United States of America, respectively. The Australian samples were selected from participants recruited into The Progression of Barrett's Esophagus to Cancer Network (PROBE-NET). The clinical diagnosis linked to the sample was based on histological examination of biopsies taken at the same endoscopy. The diagnoses were provided to the researcher performing the biomarker measurements to allow batch randomization design for the assay. Serum samples were stored at -80°C, aliquoted, and shipped to the Translational Research Institute, Brisbane on dry ice for this study. Formalin-fixed, paraffin-embedded (FFPE) tissue sections from the Ochsner Health System were selected and shipped to Brisbane, Australia for immunohistochemistry.
Both PROBE-NET and the Ochsner cohort provided information on patients' age, sex and body mass index (BMI, calculated as weight (kg) / [height (m)]²), whereas ethnicity was provided by the Ochsner cohort only, and education, alcohol drinking and tobacco smoking were available only in the PROBE-NET cohort. Data on demographics and lifestyle factors were compared among different clinical and histological groups using Pearson's chi-square test or Fisher's exact test as appropriate. P < 0.05 was considered to be statistically significant. Analyses were performed using SAS 9.4 software.
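As an illustration of the group comparison described above, the following is a minimal sketch of a two-sided Fisher's exact test for a 2x2 table, written in plain Python rather than SAS. The function name and the example counts are hypothetical and are not taken from the study data.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts (illustration only): rows are two histological groups,
# columns are ever-smokers and never-smokers.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))  # 0.4857
```

With cell counts this small, the exact test would normally be preferred over the chi-square approximation.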

Serum glycoprotein biomarker measurement and analysis
Serum glycoprotein biomarker candidates were measured using our previously reported lectin magnetic bead array (20,21)-coupled multiple reaction monitoring (MRM) mass spectrometry assay (24). In this method, lectin binding is used to isolate glycoproteins with particular glycan structures. Based on our previous work for BE/EAC (22,23), four lectins were selected for the independent validation cohorts, namely AAL, EPHA, JAC, and NPL (Vector Laboratories, Burlingame, CA). The PROBE-NET and Ochsner cohorts were analyzed independently, with a block randomization design for each cohort. The PROBE-NET method measured 365 peptides belonging to 106 proteins, while the Ochsner method measured 381 peptides belonging to 115 proteins, inclusive of standard peptides. Detailed methods are provided in Supplemental Methods and Supplemental Table 1.
Skyline was used for inspecting and processing MRM data (25). The quality of the acquired datasets (Supplemental Table 2A for the PROBE-NET cohort and Supplemental Table 2B for the Ochsner cohort) was evaluated by the % coefficient of variation (% CV) of spiked-in stable isotope-labeled internal standard (SIS) peptides and of peptides derived from the spiked-in internal standard protein chicken ovalbumin. The data were normalized using the median intensity of 8 SIS peptides.
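The two quality-control quantities mentioned above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that % CV is computed from the sample standard deviation and that normalization divides each peptide intensity by the median SIS intensity of the same sample. The function names and all numbers are hypothetical.

```python
from statistics import mean, median, stdev

def percent_cv(values):
    """Percent coefficient of variation: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)

def normalize_sample(peptide_intensities, sis_intensities):
    """Divide each peptide intensity by the median SIS intensity of that sample.

    Assumption: normalization is done per sample by simple division.
    """
    factor = median(sis_intensities)
    return {name: value / factor for name, value in peptide_intensities.items()}

# Hypothetical intensities of one SIS peptide across four runs
sis_across_runs = [98.0, 102.0, 100.0, 100.0]
print(round(percent_cv(sis_across_runs), 2))  # 1.63

# Hypothetical sample: two target peptides and eight SIS peptides
sample = {"peptide_A": 50.0, "peptide_B": 200.0}
sis_in_sample = [90.0, 95.0, 100.0, 100.0, 100.0, 100.0, 105.0, 110.0]
print(normalize_sample(sample, sis_in_sample))
```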
Peptide intensities were converted into protein intensity with a Pearson correlation coefficient cutoff set at 0.6 (22). As recently highlighted by others (26), this step serves as quality control for peptide-level measurements, resulting in a robust protein-level quantitative dataset for downstream statistical analysis. The normalized protein intensities were transformed using the natural logarithm and z-scores were calculated for downstream statistical analysis.
JMP Pro 13.2 (SAS Institute, Inc., Cary, NC, USA) was used for univariate and multivariate biomarker statistical analyses. Univariate logistic regressions were conducted on EAC vs NSE outcome, EAC vs BE outcome, and the surveillance outcome (BE-HGD or EAC vs BE or BE-ID or BE-LGD), against each of the glycoprotein_lectin biomarker candidates. Odds ratios (ORs) with 95% Wald confidence intervals (95% CIs), area under receiver operating characteristic curves (AUROCs) and Likelihood Ratio P values were calculated for both PROBE-NET and Ochsner cohorts.
Recursive partitioning (also known as Classification and Regression Trees, CART) was used to identify a multivariate panel of markers that would discriminate between surveillance outcomes (BE-HGD+EAC vs BE+BE-ID+BE-LGD). The PROBE-NET dataset was used as the training set to develop predictive models, and the Ochsner dataset was used as the validation set. The set of 217 markers that were available in both PROBE-NET and Ochsner datasets was used in the training set, as well as baseline characteristics, including age, gender, and BMI. To avoid overfitting, models were limited to 6, 8, and 10 biomarkers. The prediction formulas derived from PROBE-NET were then applied to the Ochsner dataset to determine sensitivity and specificity in the validation set.
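To make the transformation and the discrimination measure concrete, the sketch below applies a natural-log transform followed by z-scoring, then computes a rank-based AUROC for a single marker. It is not the JMP procedure used in the study. The use of the sample standard deviation, the function names and the toy data are all assumptions made for illustration.

```python
from math import log
from statistics import mean, stdev

def log_zscores(intensities):
    """Natural-log transform, then standardize to mean 0 and SD 1.

    Assumption: the sample standard deviation is used for scaling.
    """
    logged = [log(v) for v in intensities]
    mu, sd = mean(logged), stdev(logged)
    return [(v - mu) / sd for v in logged]

def auroc(scores, labels):
    """Rank-based AUROC: the probability that a randomly chosen case
    scores higher than a randomly chosen control (ties count as half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical normalized intensities for one glycoprotein_lectin marker;
# label 1 = EAC, label 0 = comparison group (toy data only).
intensities = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
labels = [0, 0, 1, 0, 1, 1]

z = log_zscores(intensities)
print(round(auroc(z, labels), 3))  # 0.889
```

Because both the log transform and z-scoring preserve rank order, the AUROC of a single marker is the same before and after transformation. The transformation matters for the odds ratios, which are then expressed per standard deviation of the log intensity.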

RESULTS
Workflow of this study is depicted in Figure 1B.  Table 3). Data for the top 10 biomarkers that differentiate EAC from NSE, and EAC from BE in PROBE-NET cohort are shown in Figure 2A and 2B, respectively. Out of these candidates, 16 candidates for EAC vs NSE comparison and 9 candidates for EAC vs BE comparison were also significantly different in the Ochsner cohort, confirming these glycoproteins as validated biomarkers. As illustrated in Figure 2C, 8 validated biomarkers overlap between the two lists.
These biomarkers are potentially most useful, being able to distinguish EAC from NSE and BE.
Next we considered BMI as a potential confounding factor for EAC biomarker validation.
Correlation analysis between BMI and biomarker levels in all PROBE-NET samples revealed no substantial correlation (|r|<0.6) (Supplemental Table 4

Biomarkers for BE surveillance
Having confirmed univariate biomarkers for detection of EAC from NSE and BE in independent cohorts, we next evaluated the ability of serum glycoproteins to be used as a surveillance tool for BE and BE-LGD patients, i.e. to distinguish between patients who require treatment (BE-HGD and EAC) and those who do not (BE, BE-ID and BE-LGD). Eight biomarkers showed AUROC > 0.6 in a BE surveillance setting for both the PROBE-NET and Ochsner cohorts (Figure 3, Supplemental Table 3), comprising 3 glycoforms of GSN, 2 glycoforms of C9, AAL-binding PON1, AAL-binding PON3, and EPHA-binding complement factor B (CFB).
Next, we sought to generate a multimarker panel for BE surveillance, using the PROBE-NET cohort for modeling and the Ochsner cohort for model validation. The minimal panel of six biomarkers showed an AUROC of 0.83, with 83% specificity and 67% sensitivity for the PROBE-NET cohort, and a moderate specificity of 61% and sensitivity of 63% for the Ochsner cohort (Table 2).
Addition of 4 more biomarkers to the panel increased the AUROC to 0.93, and improved the specificity and sensitivity measures for PROBE-NET, as well as the sensitivity for Ochsner cohort (
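For reference, the sensitivity and specificity figures reported for the panels are derived from a confusion matrix of predicted versus histologically confirmed status. The sketch below shows that calculation. The function name and the example labels are hypothetical and do not reproduce the study's data.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for binary labels.

    1 = requires intervention (BE-HGD or EAC), 0 = does not.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # true positives
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # false negatives
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)  # true negatives
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # false positives
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical validation-set labels (illustration only)
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(predicted, actual)
print(round(sens, 3), round(spec, 3))  # 0.667 0.8
```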

Complement C9 expression in BE and EAC tissue
In agreement with our previous finding of complement pathway dysregulation in EAC pathogenesis (22), 5 of the 10 glycoprotein biomarker candidates in the final surveillance biomarker panel (Table 2) belonged to the complement pathway. As a first step to evaluate alterations of the complement pathway in EAC at a tissue level, we optimized immunohistochemistry staining for the top candidate complement C9. Staining specificity of the method was confirmed by neutralization of the antibody with recombinant C9 protein prior to staining (Supplemental Figure 1). We then evaluated expression of C9 in esophageal tissue sections from a subset of the Ochsner cohort. As shown in Figure 4A, C9 was detected in BE and EAC, but not in squamous esophageal epithelium. Dysplastic BE showed particularly strong staining in the plasma membrane and/or cytoplasm (Figure 4A). Strong staining in immune infiltrates served as an expected positive control. Quantitation of staining intensity score against the tissue phenotype (Figure 4B) showed a statistically significant association between C9 expression levels and the histology groups of squamous epithelium, columnar epithelium, Barrett's mucosa, and dysplasia/EAC (P<0.001).

Serum complement C9 glycoforms in progressor samples
As an additional evaluation, we examined C9 lectin pull-down levels in samples from PROBE-NET participants who had progressed from BE to BE-LGD (N = 4), BE-LGD to BE- Significant elevation of C9_EPHA and C9_NPL was observed following progression in this small patient cohort (Figure 5).

DISCUSSION
This study advances our previous serum glycoprotein research (22,23) and represents an important step towards developing cost-effective EAC surveillance. Over the years, several studies have been carried out to identify circulatory biomarker candidates to diagnose the BE-dysplasia-EAC disease spectrum (19,27). These studies have explored genetic alterations in cell-free circulating DNA (28), serum miRNA changes (29), circulating tumor cells (30), glycan profile alteration in serum (31,32), circulatory autoantibodies (against cancer antigens) (33), volatile organic compounds found in breath analysis (34), metabolic changes in urine (35), and a panel of serum proteins (36) as promising diagnostic biomarker candidates for BE and/or EAC. However, none of these biomarker candidates has progressed from bench to bedside, likely due to the lack of subsequent validation studies in large independent cohorts of patients. Here, we have addressed this gap by validating eight serum glycoprotein biomarkers for EAC in two independent patient cohorts including dysplastic samples.
In addition to demonstrating the robustness of our mass spectrometry-based, glycoprotein-centric proteomics workflow for biomarker validation, the current study confirmed our previous finding of complement activation in EAC (22). The complement system consists of a cascade of circulating proteases that are locally activated, leading to the formation of the membrane attack complex on the immunogen and recruitment of phagocytes. Complement components are predominantly expressed and secreted into the plasma by the liver, but are also expressed in other tissues (37), and some complement proteins may be expressed by tumor cells (38). While complement components are primarily involved in mediating the innate immune response, recent studies have revealed an apparently paradoxical tumor-promoting role of the complement system (38,39). Complement C3 has been reported to play an autocrine role in ovarian and lung cancer tumor growth (40). C5a is elevated in the serum of lung cancer patients (41) and increases the invasiveness of C5aR+ tumors (42). Fucosylated C9 was previously reported to be elevated in the serum of lung cancer patients (43).

Disclosures
The authors have no conflicts of interest to disclose.