Diagnostic potential for a serum miRNA neural network for detection of ovarian cancer

Version of Record

Accepted for publication after peer review and revision.


Version of Record published: November 3, 2017
Accepted Manuscript published: October 31, 2017
Accepted: October 11, 2017
Received: May 23, 2017


Abstract

Recent studies posit a role for non-coding RNAs in epithelial ovarian cancer (EOC). Combining small RNA sequencing from 179 human serum samples with a neural network analysis produced a miRNA algorithm for diagnosis of EOC (AUC 0.90; 95% CI: 0.81–0.99). The model significantly outperformed CA125 and functioned well regardless of patient age, histology, or stage. Among 454 patients with various diagnoses, the miRNA neural network had 100% specificity for ovarian cancer. After using 325 samples to adapt the neural network to qPCR measurements, the model was validated using 51 independent clinical samples, with a positive predictive value of 91.3% (95% CI: 73.3–97.6%) and negative predictive value of 78.6% (95% CI: 64.2–88.2%). Finally, biologic relevance was tested using in situ hybridization on 30 pre-metastatic lesions, showing intratumoral concentration of relevant miRNAs. These data suggest circulating miRNAs have potential to develop a non-invasive diagnostic test for ovarian cancer.

https://doi.org/10.7554/eLife.28932.001

eLife digest

Ovarian cancer is a major cause of cancer death among women. A woman’s survival often hinges on doctors detecting the tumor before it has spread beyond the ovary. Unfortunately, most women with ovarian cancer are not diagnosed until they have symptoms – such as pelvic pain, bloating, swelling of the abdomen or appetite loss. By then, the disease has usually spread and is difficult to treat. There is currently no reliable test to diagnose ovarian cancer before symptoms emerge. Some tests measure proteins in the blood or use ultrasound images to identify ovary tumors. These tests usually still identify the disease too late. Sometimes they produce “false positive” results, which may cause women without cancer to undergo unnecessary surgery.

Many ovarian cancers have defects in small pieces of genetic information called microRNAs. These microRNAs impact the tumor in multiple ways, and cells release microRNAs into the blood. Testing a seemingly healthy woman’s blood for the same pattern of altered microRNAs found in women with ovarian cancer might be one way to detect the disease earlier.

Now, Elias et al. have identified a pattern of seven microRNAs in the blood that appears to predict ovarian cancer. In the experiments, a computer program searched for microRNA patterns in women with ovarian cancer. The program sifted through the microRNAs in blood from women with and without ovarian cancer. Over time, the computer program “learned” to identify a pattern of microRNAs found only in women with ovarian cancer. It then created a formula for identifying ovarian cancer based on seven of the microRNAs.

Elias et al. then verified that the formula accurately detected ovarian cancer by testing it on blood samples from more women with and without cancer. They also found the seven microRNAs in tiny ovarian cancer tumors collected from women. This suggests the formula might be able to detect even the smallest tumors. More studies are needed to determine when this cancer-linked pattern first emerges and confirm that this ovarian cancer-detection formula works. If the test is validated, it might be used to screen women who are at high risk for ovarian cancer because of mutations in the BRCA1 and BRCA2 genes.

https://doi.org/10.7554/eLife.28932.002

Introduction

Invasive epithelial ovarian cancer (EOC) is the leading cause of death from gynecologic cancer among women in developed countries (Siegel et al., 2016). Most women with EOC present with advanced stage disease, where 5-year survival rates average 25–30%, highlighting the need for an effective screening strategy. Unfortunately, two large-scale randomized clinical trials involving ultrasound and CA125, the Prostate, Lung, Colorectal, and Ovarian Cancer (PLCO) trial and the United Kingdom Collaborative Trial of Ovarian Cancer Screening (UKCTOCS) trial, did not demonstrate a meaningful impact on overall survival from EOC (Zhu et al., 2011; Jacobs et al., 2016). These and other non-experimental longitudinal studies reaffirm that CA125 can detect advanced disease but has poorer sensitivity for early stage and non-serous cancers. In addition, CA125 has limited specificity, with the majority of abnormal CA125 values being the result of non-gynecologic malignancies or benign gynecologic conditions (Moss et al., 2005). The hope that adding more biomarkers to CA125 would improve screening was not realized in a re-analysis of the PLCO data or in a recent longitudinal study from the European Prospective Investigation of Nutrition and Cancer (Zhu et al., 2011; Terry et al., 2016). In a separate strategy to improve EOC outcome, several panels (which have CA125 as part of them) have received FDA approval for use in the differential diagnosis of EOC, to encourage referral of EOC cases to centers with greater expertise in cancer surgery and chemotherapeutic treatment (Karst and Drapkin, 2010). However, these have not been effective for early diagnosis.

Among the alternatives to serum proteins for the diagnosis or early detection of EOC, circulating microRNAs (miRNAs) have shown great potential (Nakamura et al., 2016). miRNAs are short (18–24 nucleotide) non-coding RNAs that regulate gene expression through post-transcriptional modification of mRNA transcripts. miRNAs have several advantages over protein measures: (1) PCR amplifies detection of rare transcripts in blood; (2) all miRNAs use the same units of measure, easing incorporation into multiplexed panels; and (3) miRNAs play a critical role in ovarian cancer biology, whereas the function of CA125 is unknown (Deb et al., 2017; Katz et al., 2015). Moreover, non-invasive sampling of circulating miRNAs has a clear advantage over analytes obtained through biopsy (Wang et al., 2016).

Preliminary studies have suggested that circulating miRNA profiles are altered in women with ovarian cancer (Nakamura et al., 2016; Chung et al., 2013; Langhe et al., 2015; Resnick et al., 2009; Zuberi et al., 2015; Samuel and Carter, 2016). In addition, miRNAs have prognostic significance for EOC survival (Merritt et al., 2008; Bagnoli et al., 2016; Cramer and Elias, 2016). However, efforts to develop a diagnostic signature based on circulating miRNAs have been hampered by issues regarding the best statistical approach to develop a model, reproducibility of miRNA measurement across technology platforms (e.g. qPCR, next generation sequencing, microarray), and the biologic heterogeneity of EOC (Nakamura et al., 2016). In this study, our objective was to develop a serum-based miRNA model for the diagnosis of ovarian cancer that could address these concerns and demonstrate the biologic and clinical relevance of this diagnostic tool.

Results

To produce our diagnostic circulating miRNA signature from human sera, we constructed a study population of pre-treatment (prior to either surgery or chemotherapy) subjects comprising 179 women selected from three independent prospective studies (ERASMOS, PMP, and NECC) (Table 1). ERASMOS contributed consecutive cases presenting for evaluation of an adnexal mass, while PMP allowed enrichment of the population for specific histopathologic diagnoses. NECC added healthy controls age-matched to PMP. After completing small RNA sequencing on the sera, subjects were randomly assigned into model training and testing sets (Figure 1). After the randomization, the training and testing sets were demographically similar, and there were no differences in the distribution of histopathological diagnoses between the sets (Table 2).

Figure 1


Flowchart of study design.

(a) Protocol for miRNA sequencing, filtering, batch adjustment and separation into the training and testing sets. (b) Protocol for model development and testing.

https://doi.org/10.7554/eLife.28932.003

Table 1

Demographics of patients in the model study populations.

https://doi.org/10.7554/eLife.28932.004

	ERASMOS (n = 60)	PMP/NECC (n = 119*)	p-value
Age, years, median (SD)^†	57 (9.8)	56 (7.1)	0.44
CA-125, units/ml, median (SD) ^†	155 (689.8)	88.1 (1335.5)	0.72
Histology, n (%)^‡
Control	0 (0)	15 (12.6)	<0.0001
Serous cystadenoma/cystadenofibroma	7 (11.7)	14 (11.8)
Endometrioma	0 (0)	15 (12.6)
Other benign lesion	9 (15.0)	0 (0)
Borderline mucinous tumor	2 (3.3)	0 (0)
Borderline serous tumor	5 (8.3)	15 (12.6)
Stage I/II serous adenocarcinoma	5 (8.3)	20 (16.8)
Stage III/IV serous adenocarcinoma	19 (31.2)	10 (8.4)
Stage I/II clear cell/endometrioid adenocarcinoma	6 (10.0)	20 (16.8)
Stage III/IV clear cell/endometrioid adenocarcinoma	0 (0)	10 (8.4)
Mucinous adenocarcinoma	1 (1.7)	0 (0)
Other ovarian cancer	6 (10.0)	0 (0)
Stage, n (%)^‡
Not applicable	16 (26.7)	59 (49.6)	<0.0001
I	9 (15.0)	22 (18.5)
II	8 (13.3)	18 (15.1)
III	19 (31.2)	18 (15.1)
IV	8 (13.3)	2 (1.7)
Grade, n (%)^‡
Not applicable	16 (26.7)	44 (37.0)	0.07
Borderline	7 (11.7)	15 (12.6)
1 (well-differentiated)	6 (10.0)	12 (10.1)
2 (moderately differentiated)	3 (5.0)	12 (10.1)
3 (poorly differentiated)	28 (46.7)	36 (30.3)

ERASMOS – Effects of Regional Analgesia on Serum miRNA after Oncology Surgery Study

PMP – Pelvic Mass Protocol
NECC – New England Case Control study

*15 samples from NECC, 104 samples from PMP
^†Student’s t-test

^‡chi-square test

Table 2

Demographics of patients after stratified random sampling into training and testing sets.

https://doi.org/10.7554/eLife.28932.005

	Training (n = 135)	Testing (n = 44)	p-value
Age, years, median (SD)*	56 (8.1)	56 (8.3)	1.0
CA-125, units/ml, median (SD)*	126.5 (1193.5)	105.6 (577.8)	0.91
Pathology, n (%)^†			1.0
Control	11 (8.1)	4 (9.1)
Benign lesions	34 (25.2)	11 (25.0)
Borderline tumors	16 (11.9)	5 (11.4)
Stage I/II invasive cancers	41 (30.4)	12 (27.3)
Stage III/IV invasive cancers	33 (24.4)	12 (27.3)

*student’s t-test

^†chi-square test
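
The stratified random assignment summarized in Table 2 can be illustrated with a minimal sketch. The group sizes below are the row totals of Table 2; the function name, the 25% test fraction, and the seed are assumptions made for illustration, and the authors' actual randomization procedure is not described at this level of detail.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), sampling each label group separately."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    train, test = [], []
    for label in sorted(by_label):
        members = by_label[label]
        rng.shuffle(members)
        # Take the same (rounded) share of every group for the testing set.
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)

# Row totals from Table 2 (training + testing counts per category).
groups = {"control": 15, "benign": 45, "borderline": 21,
          "early_invasive": 53, "late_invasive": 45}
labels = [name for name, n in groups.items() for _ in range(n)]
train_idx, test_idx = stratified_split(labels)
```

Sampling within each diagnostic category, rather than from the pooled cohort, is what keeps the histology mix of the two sets close to each other.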

We then deployed a series of statistical tools, including machine-learning approaches, to analyze the miRNA-seq data and create an algorithm with the best performance for discriminating cases of ovarian cancer from benign tumors, non-invasive (‘borderline’) tumors, or healthy controls. This began with three candidate strategies for selecting the miRNA variable inputs used to build the models: significance-based (by t-test), correlation-based feature subset, or expression fold change (Table 3). Each resulting miRNA variable list was entered into each of 11 different models, which were compared both by AUC (Table 4) and by sensitivity and specificity (Figure 2).

Figure 2


Clinical performance characteristics of the tested models.

Sensitivity (blue bars) and specificity (orange bars) of the classifiers on the testing set depending on the method of variable selection. Whiskers denote 95% confidence intervals. (a) Performance of models created on the subset of miRNAs selected using the significance-based filter. (b) Performance of models created on variables selected using the CFS subset algorithm. (c) Performance of models created using variables selected by the fold change-based filter. The red arrow denotes the model with the best performance characteristics, the neural network using the variables selected by the fold change-based filter.

https://doi.org/10.7554/eLife.28932.006

Table 3

miRNA variables used in model building identified through univariate testing

https://doi.org/10.7554/eLife.28932.007

Significance-based selection	Correlation-based feature subset selection	Expression fold change selection
miR-29a-3p	miR-16-2-3p	miR-23b-3p
miR-30d-5p	miR-200a-3p	miR-29a-3p
miR-200a-3p	miR-200c-3p	miR-32-5p
miR-200c-3p	miR-320b	miR-92a-3p
miR-320d	miR-320d	miR-150-5p
miR-320c		miR-200a-3p
miR-450b-5p		miR-200c-3p
miR-203a		miR-203a
miR-486-3p		miR-320c
miR-1246		miR-320d
miR-1307-5p		miR-335-5p
		miR-450b-5p
		miR-1246
		miR-1307-5p
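
Two of the three variable selection strategies, the significance-based filter and the fold change-based filter, can be sketched as follows. This is an illustration only: the miRNA names and expression values are invented, the inputs are assumed to be log2-scale expression, the cutoff of 1.0 is hypothetical, and Welch's statistic stands in for the t-test without any claim about the exact variant the authors used.

```python
import math
from statistics import mean, variance

def log2_fold_change(cases, controls):
    """Difference of group means; inputs are assumed to be log2 expression."""
    return mean(cases) - mean(controls)

def welch_t(cases, controls):
    """Welch's t statistic for two independent groups with unequal variances."""
    se = math.sqrt(variance(cases) / len(cases)
                   + variance(controls) / len(controls))
    return (mean(cases) - mean(controls)) / se

def select_by_fold_change(expr_cases, expr_controls, min_abs_lfc=1.0):
    """Keep miRNAs whose absolute log2 fold change meets the cutoff."""
    return [name for name in expr_cases
            if abs(log2_fold_change(expr_cases[name],
                                    expr_controls[name])) >= min_abs_lfc]

# Toy log2 expression values, invented for illustration only.
expr_cases = {"miR-A": [8.1, 8.4, 7.9, 8.6], "miR-B": [5.0, 5.2, 4.9, 5.1]}
expr_controls = {"miR-A": [6.0, 6.3, 5.8, 6.1], "miR-B": [5.1, 5.0, 5.2, 4.9]}
selected = select_by_fold_change(expr_cases, expr_controls)
t_stat = welch_t(expr_cases["miR-A"], expr_controls["miR-A"])
```

A fold-change filter ranks by effect size alone, whereas a t-test also accounts for within-group spread, which is one reason the two lists in Table 3 overlap without being identical.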

Table 4

Performance of the eleven statistical models on the testing set by variable selection method.

Results are shown for the testing set.

https://doi.org/10.7554/eLife.28932.008

	Variable selection method
Statistical model	Significance-based variable subset AUC (95% CI)	Correlation-based feature selection subset AUC (95% CI)	Fold change-based variable subset AUC (95% CI)
Linear discriminant analysis	0.80 (0.66–0.93)	0.76 (0.62–0.90)	0.78 (0.64–0.92)
Logistic regression	0.81 (0.68–0.94)	0.75 (0.61–0.90)	0.82 (0.70–0.94)
Neural network	0.84 (0.72–0.96)	0.75 (0.60–0.89)	0.90 (0.81–0.99)
Support vector machine	0.77 (0.63–0.91)	0.73 (0.58–0.87)	0.77 (0.63–0.91)
Multivariate adaptive regression splines	0.57 (0.40–0.74)	0.66 (0.49–0.82)	0.73 (0.58–0.88)
Naive Bayes classifier	0.75 (0.60–0.89)	0.68 (0.52–0.84)	0.75 (0.60–0.89)
Least Absolute Deviation regression tree	0.77 (0.63–0.91)	0.61 (0.44–0.78)	0.69 (0.53–0.84)
Functional tree	0.78 (0.64–0.91)	0.77 (0.63–0.91)	0.68 (0.52–0.84)
Bayesian network	0.72 (0.56–0.87)	0.67 (0.52–0.83)	0.72 (0.56–0.87)
Random forest	0.78 (0.64–0.91)	0.71 (0.56–0.86)	0.76 (0.62–0.90)
Elastic net	0.80 (0.67–0.93)	0.76 (0.62–0.90)	0.79 (0.66–0.92)
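
For readers who want to see how an AUC and an approximate 95% CI of the kind reported in Table 4 can be obtained, the sketch below computes the AUC as a Mann-Whitney rank statistic and a confidence interval from the Hanley-McNeil variance approximation. The choice of the Hanley-McNeil formula is an assumption, since the table does not state how its intervals were derived, and the toy scores are invented.

```python
import math

def auc_mann_whitney(pos_scores, neg_scores):
    """AUC as the probability that a case outscores a non-case (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))

def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate CI for an AUC using the Hanley-McNeil variance formula."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    half = z * math.sqrt(var)
    return max(0.0, auc - half), min(1.0, auc + half)

# Invented classifier scores for three cases and three non-cases.
toy_auc = auc_mann_whitney([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])

# Testing-set composition from Table 2: 24 invasive cancers, 20 non-cancers.
low, high = hanley_mcneil_ci(0.90, n_pos=24, n_neg=20)
```

With the testing-set composition from Table 2, this approximation gives an interval of about 0.81 to 0.99 for an AUC of 0.90, which agrees with the interval listed for the fold change-based neural network, although that agreement does not establish which method was actually used.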

Although many of the models performed well, the neural network model employing miRNA expression fold changes was the only model to meet our pre-specified statistical objective, with an AUC of 0.90 (95% CI: 0.81–0.99; p=0.03 over a theoretical AUC of 0.75). The network consisted of 14 individual miRNAs with seven neurons in the hidden layer (Source code 1). Because the network relied on complex interactions between miRNA levels, we tested whether its performance was biased by the batch adjustment performed at the initial step of the analysis. The neural network worked equally well on the adjusted and unadjusted raw datasets, with an AUC of 0.93 (95% CI: 0.89–0.98) on the training set and 0.90 (95% CI: 0.80–0.99) on the testing set (Figure 3; Supplementary file 1A [by model] and 1B [by sample]). In post-hoc secondary analyses, the neural network worked equally well for older and younger patients, serous and non-serous histologies, and early and advanced stage disease (Supplementary file 2A-C).
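
The architecture described above, 14 miRNA inputs feeding seven hidden neurons and a single output, can be sketched as a forward pass. Only the layer sizes come from the text. The tanh and logistic activations, the random weights, and the all-zero input are assumptions for illustration; the trained weights are in the authors' Source code 1 and are not reproduced here.

```python
import math
import random

N_INPUTS, N_HIDDEN = 14, 7  # layer sizes stated in the text

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer (tanh, assumed) followed by a logistic output unit."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# Hypothetical, untrained weights; a real model would load fitted values.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)]
            for _ in range(N_HIDDEN)]
b_hidden = [0.0] * N_HIDDEN
w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
b_out = 0.0

sample = [0.0] * N_INPUTS  # placeholder for 14 normalized miRNA levels
prob = forward(sample, w_hidden, b_hidden, w_out, b_out)
```

Because every hidden neuron sees all 14 inputs, the output can depend on combinations of miRNA levels rather than on each level separately, which is the property the batch-adjustment check was meant to protect.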

Figure 3


ROC curves for the neural network analysis.

(a) Performance of the neural network on the training set of raw, non-batch-adjusted data (red line) and in the batch-adjusted training set (black line) (b) Performance of the neural network on raw (red line) and batch-adjusted (black line) data in the testing set.

https://doi.org/10.7554/eLife.28932.009

Serum CA125 data were available for 120 subjects (Supplementary file 1B and 3A). Among these, the neural network (AUC 0.93; 95% CI 0.88–0.97) significantly outperformed CA125 (AUC 0.74; 95% CI 0.65–0.83; p=0.001; Figure 4). The primary advantage of the neural network over CA125 was avoiding false positives (8/43 for the neural network versus 23/43 for CA125; p=0.002) (Supplementary file 2A). Notably, the neural network and CA125 levels were independent of one another (Figure 4—figure supplement 1; Supplementary file 3B). We tested using the neural network and CA125 in a tiered testing strategy, subjecting all negative neural network algorithm results to a second review with CA125, but found this would increase the probability of a false positive test result from 4.2% (5/120) to 19.2% (23/120) and a false negative rate from 5.8% (7/120) to 13.3% (16/120) (Figure 4—figure supplement 2). The alternative of initial screening with CA125 followed by neural network yielded only three additional correctly diagnosed cases of invasive cancer at the expense of 19 additional false positive results.
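
The false-positive comparison quoted above (8/43 versus 23/43) can be checked with a simple test on the 2x2 table. The sketch uses a two-sided Fisher exact test, which is an assumption: the text does not name the test behind p=0.002, and because both markers were measured on the same subjects, a paired method such as McNemar's test would be the stricter choice if the discordant counts were available.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(k):
        # Hypergeometric probability of k events in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    observed = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= observed * (1 + 1e-9))

# Rows: neural network, CA125. Columns: false positive, correctly negative.
p_value = fisher_exact_two_sided(8, 35, 23, 20)
```

The resulting p-value falls well below 0.01, the same order of magnitude as the reported value, though an exact match should not be expected when the test differs.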

Figure 4 (with 2 figure supplements)


ROC curves for neural network analysis compared to CA-125.

The neural network (AUC 0.93; 95% CI 0.88–0.97) significantly outperformed CA125 (AUC 0.74; 95% CI 0.65–0.83) in terms of overall operating characteristics (p=0.001).

https://doi.org/10.7554/eLife.28932.010

The specificity of the neural network algorithm for the diagnosis of ovarian cancer was tested using an external, independent, dataset previously published by Keller, et al (Keller et al., 2011). These data were generated via a third technology platform, probe-based microarray, which fortunately contained all 14 miRNAs from our original signature, allowing for 1:1 mapping without exclusions (Supplementary file 4A and Supplementary file 6). The neural network perfectly classified patients in the training set (AUC 1.00, 95% CI 1.00–1.00) and provided very good discriminatory power on the testing set (AUC 0.93, 95% CI 0.81–1.00), with an overall sensitivity of 75% and specificity of 100%. The signature was specific to ovarian cancer compared to all other diagnoses, as it did not show any clinically-efficient diagnostic capabilities for any of the 12 other morbidities analysed in the set and showed good performance in distinguishing ovarian cancer samples against all other diagnoses combined (AUC 0.92, 95% CI 0.82–1.00) (Figure 5).

Figure 5


Specificity of miRNA signature for ovarian cancer compared to other diagnoses.

The neural network 14 miRNA signature did not separate any other diagnoses from the control group in the published dataset by Keller et al. (2011). The study also included 70 healthy controls. The number of subjects (n) denotes the number of cases of the given diagnosis in the Keller et al. dataset. (a) Pancreatic ductal cancer (n = 45); (b) Prostate cancer (n = 23); (c) Stomach cancer (n = 13); (d) Other pancreatic cancers (n = 48); (e) Melanoma (n = 35); (f) Lung cancer (n = 32); (g) Periodontitis (n = 18); (h) Pancreatitis (n = 38); (i) Multiple sclerosis (n = 23); (j) Acute MI (n = 20); (k) Chronic obstructive pulmonary disease (n = 24); (l) Sarcoidosis (n = 45). (m) Overall, the neural network was highly specific for ovarian cancer cases against all other diagnoses (i.e. healthy controls or other cancers).

https://doi.org/10.7554/eLife.28932.013

Having established our miRNAs of interest using next generation sequencing, we next sought to validate the sequencing data across technology platforms by measuring the miRNAs from the neural network using qPCR. While small RNA sequencing is a more robust technology for miRNA discovery, qPCR is a more time-efficient and cost-effective diagnostic tool. For this, we used 120 samples from PMP and NECC for which we had excess RNA. We internally validated the 14 miRNAs in the neural network (plus an additional nine potential reference miRNAs derived from the sequencing data) by qPCR and recalibrated the algorithm to accept qPCR inputs (Supplementary file 6). We then performed a global sensitivity analysis on the best neural network for qPCR data and iteratively removed the variables that contributed least to the classifier’s performance. This reduced the neural network to only seven miRNAs (miR-29a-3p, miR-92a-3p, miR-200c-3p, miR-320c, miR-335-5p, miR-450b-5p, and miR-1307-5p) plus four normalizers (miR-423-3p, miR-191-5p, miR-221-3p, and miR-103a-3p). To increase the statistical power of this qPCR-based classifier and create a fully locked-down model for clinical application, we added 205 more samples from PMP and NECC, including more than 100 additional healthy controls, to create a 325 subject population for qPCR model development (Table 5).
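
The iterative pruning step described above can be illustrated with a greedy backward elimination loop. This is a generic sketch rather than the authors' global sensitivity analysis: the scoring function, the feature names, and the stopping rule are all hypothetical, and in practice the score would come from refitting and evaluating the classifier on each candidate subset.

```python
def backward_eliminate(features, score_fn, min_features=1, tolerance=0.0):
    """Greedily drop the feature whose removal hurts the score least."""
    current = list(features)
    best = score_fn(current)
    while len(current) > min_features:
        # Score every subset obtained by leaving one feature out.
        candidates = [(score_fn([f for f in current if f != drop]), drop)
                      for drop in current]
        cand_score, drop = max(candidates)
        if cand_score < best - tolerance:
            break  # every possible removal costs more than the tolerance
        current.remove(drop)
        best = max(best, cand_score)
    return current

# Toy scorer: only two hypothetical features carry any signal.
INFORMATIVE = {"miR-X": 0.3, "miR-Y": 0.2}

def toy_score(subset):
    return 0.5 + sum(INFORMATIVE.get(f, 0.0) for f in subset)

kept = backward_eliminate(["miR-X", "miR-Y", "miR-Z", "miR-W"], toy_score)
```

In this toy run the two uninformative features are removed and the loop stops once any further removal would lower the score, mirroring how a 14-miRNA panel could shrink to a smaller one without losing performance.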

Table 5

Clinical characteristics of the qPCR model set.

https://doi.org/10.7554/eLife.28932.014

Characteristic	qPCR model set (N = 325)
Age, years, median (SD)	58.0 (10.1)
Grade, n (%)
Borderline	15 (4.6)
1	21 (6.4)
2	27 (8.3)
3	100 (30.8)
unspecified	10 (3.1)
Not applicable	150 (46.2)
FIGO Stage, n (%)
I/II	75 (23.1)
III/IV	83 (25.5)
Not applicable	167 (51.4)
Histology, n (%)
Control	123 (37.8)
Serous cystadenoma/cystadenofibroma	14 (4.3)
Endometrioma	15 (4.6)
Borderline serous tumor	15 (4.6)
Serous adenocarcinoma	100 (30.8)
Endometrioid/clear cell adenocarcinoma	48 (14.8)
Mucinous adenocarcinoma	10 (3.1)

These samples were randomized 3:1 into training and testing sets to create a neural network. The resulting network performed well, with an AUC of 0.89 on the training set and an AUC of 0.80 on the testing set.

We then tested the clinical performance of the final, locked-down diagnostic test on a completely independent external sample set collected from 51 preoperative patients treated in Lodz, Poland (Table 6). In this population, the neural network had a positive predictive value of 91.3% (95% CI: 73.3–97.6%) and a negative predictive value of 78.6% (95% CI: 64.2–88.2%) with an AUC of 0.85 (Figure 6).
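
Positive and negative predictive values with confidence intervals can be computed from a confusion matrix as sketched below. The counts are hypothetical: they were chosen only because they sum to 51 and reproduce the reported point estimates, and the confusion matrix itself is not given in this section. The Wilson score interval is likewise an assumption, so the intervals it produces need not match those reported above.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half

def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from confusion-matrix counts."""
    return tp / (tp + fp), tn / (tn + fn)

# Hypothetical counts consistent with PPV 91.3% and NPV 78.6% in 51 patients.
tp, fp, tn, fn = 21, 2, 22, 6
ppv, npv = predictive_values(tp, fp, tn, fn)
ppv_ci = wilson_interval(tp, tp + fp)
npv_ci = wilson_interval(tn, tn + fn)
```

Both predictive values depend on the prevalence of cancer in the tested population, so figures obtained in a preoperative cohort like this one would not carry over directly to a screening setting.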

Figure 6


ROC curve for neural network analysis using qPCR inputs from the clinical test set.
https://doi.org/10.7554/eLife.28932.015

Table 6

Clinical characteristics of the external validation set.

https://doi.org/10.7554/eLife.28932.016

Characteristic	Polish external validation set (N = 51)
Age, years, median (SD)	55.5 (16.1)
Grade, n (%)
Borderline	4 (7.8)
1	2 (3.9)
2	7 (13.7)
3	13 (25.5)
unspecified	3 (5.9)
Benign	22 (43.1)
FIGO Stage, n (%)
I	7 (13.7)
II	3 (5.9)
III	18 (35.3)
IV	1 (2.0)
Benign	22 (43.1)
Histology, n (%)
Serous cystadenoma/cystadenofibroma	6 (11.8)
Endometrioma/endometriosis	10 (19.6)
Mature teratoma	6 (11.8)
Borderline serous tumor	2 (3.9)
Borderline seromucinous tumor	2 (3.9)
Serous adenocarcinoma	4 (7.8)
Mucinous adenocarcinoma	1 (2.0)
Endometrioid adenocarcinoma	1 (2.0)
Clear cell adenocarcinoma	9 (17.6)
Mixed adenocarcinoma	3 (5.9)
Adenocarcinoma unspecified	7 (13.7)

Ideally, a serum biomarker should have biologic relevance to the clinical disease. To this end, we returned to the ERASMOS patient set to examine whether the expression levels of the miRNAs changed in the cancer patients after surgical cytoreduction. Among the patients with ovarian cancer in the study, 27 had both preoperative and postoperative serum miRNAs profiled. These included 4/7 target miRNAs in the qPCR neural network model. Circulating levels of all four miRNAs decreased within 72 hr of tumor removal, with significant changes for miR-200a-3p and miR-200c-3p (Figure 7A–D).

Figure 7


Change in miRNA expression from the preoperative sample to postoperative day 3 after surgical cytoreduction.

n = 27.

https://doi.org/10.7554/eLife.28932.017

We also wanted to test whether the miRNAs were in fact coming from the earliest lesions of this disease. For this, we assembled paraffin-embedded tissue sections from independent sets of 15 cases of serous tubal intraepithelial carcinoma and 15 Stage I high grade (serous or Grade 3 endometrioid) epithelial ovarian cancers. Immunohistochemistry was performed on sequential sections for TP53 and Ki67 to highlight the lesions. We then performed in situ hybridization for three of the miRNAs in our neural network: miR-200c-3p, miR-335-5p, and miR-92a-3p (Figure 8). In 100% of the samples, there was complete overlap between lesional cells and the miRNAs crucial for neural network performance, suggesting that the miRNAs detected in the serum are present even in early lesions in the fallopian tube epithelium and raising the possibility of detection of pre-metastatic disease.

Figure 8


In situ expression of selected miRNAs from the serum signature.

Sections of fallopian tubes showing serous tubal intraepithelial carcinoma (STIC) lesions and Stage I high grade serous ovarian cancer (HGSOC). Lesional cells are indicated by TP53 and Ki-67 staining. (top) STIC lesion in continuity with normal fallopian tube. 20x. (middle) STIC lesion in continuity with normal fallopian tube and invasive cancer with p53-null lesion. 10x. (bottom) HGSOC intraluminal to the fallopian tube. 10x.

https://doi.org/10.7554/eLife.28932.018

Finally, we have constructed a web calculator (http://biostat.umed.pl/ovaries) to demonstrate how to use these models. The calculator accepts various inputs depending on the method of circulating miRNA quantification (sequencing, qPCR, or microarray) and returns the estimated probability of ovarian cancer for a given patient.

Discussion

We have described the development of a diagnostic model for ovarian cancer using sequencing of circulating miRNA. This is the first study in ovarian cancer to combine next generation sequencing technology for serum miRNA with machine learning techniques. Not only does sequencing provide greater sensitivity for miRNA detection than other methods, but expression levels of various miRNAs are not linearly related and relationships among miRNAs tend to be obscured by more basic statistical approaches. The neural network as presented has several advantages over a traditional biomarker like CA125. The neural network recognized more Stage I/II ovarian cancers and had significantly fewer false positives. This likely reflects an ability to discriminate relevant biology more than to quantify tumor burden. For example, the neural network correctly classified 35/43 (81%) borderline tumors as being non-invasive neoplasms, compared to just 20/43 (47%; p=0.002) for CA125. An additional strength of our study is the incorporation of multiple independent datasets. The ERASMOS specimens were obtained from cases enrolled sequentially, reflecting the natural frequency of different ovarian tumor subtypes in the clinical population, including the fact that most women with invasive ovarian cancer present with advanced stage disease. The Pelvic Mass Protocol samples allowed us to enrich the study population for less common clinical cases that would be expected to confound a conventional screening algorithm, including benign complex ovarian masses, borderline tumors, early stage cancers, and non-serous histologic subtypes. NECC provided age-matched healthy controls. The specificity of our model was tested using a publicly available dataset from Keller, et al where we showed that the neural network performed well across disease stages, histologic subtypes, and diagnostic platforms. 
This ability to specifically identify ovarian cancers and discriminate ovarian cancer from other diagnoses sets the current work apart from prior miRNA studies (Nakamura et al., 2016; Chung et al., 2013; Resnick et al., 2009; Zuberi et al., 2015; Häusler et al., 2010; Zheng et al., 2013). Finally, we tested our signature using a completely external, independent set of samples from Poland, showing that in a clinical sample set the test performed well without additional modifications.

There appears to be biologic relevance to the serum miRNAs in the neural network. The rapid change in circulating levels of miR-200a and miR-200c after surgical cytoreduction suggests these are being produced actively by tumors. Although other miRNAs did not show as great a decrease, this may reflect differing half-lives for different miRNA species. In future work, it would be interesting to measure changes over a longer time frame than 72 hr, but that was the endpoint for ERASMOS, which is an anesthesia-focused study. We also demonstrated expression of several miRNAs from the neural network in pre-metastatic lesions. This both confirms prior work suggesting that these miRNAs are detectable in advanced ovarian cancer specimens and adds the new finding that these miRNAs are expressed in very early stage and even pre-invasive lesions (Bagnoli et al., 2016). Future work will examine the kinetics of these miRNA changes in tumor pathogenesis.

The phase II specimens used in this study are like those used to support development of assay panels subsequently approved for the differential diagnosis of ovarian cancer vs. a benign pelvic mass. The first panel, named OVA1, was approved by the FDA in 2009 and consisted of 5 analytes including CA125 (Zhang et al., 2004; Ueland et al., 2011). While those authors emphasized the assay’s negative predictive value of 95% (when combined with physician assessment), the assay had an AUC of only 0.80 (95% CI: 0.73–0.88) for pre-menopausal women and 0.82 (95% CI: 0.77–0.87) for post-menopausal women. The second panel was approved in 2011 and consisted of just two markers, CA125 and HE4, combined with menopausal status (Moore et al., 2010). While the ROMA algorithm had an overall AUC for discriminating cancer from benign tumors of 0.91 (95% CI: 0.88–0.94), this was in the setting of including borderline tumors as malignancies. Moreover, the positive predictive value of the test for distinguishing benign masses from Stage I/II EOC was only 0.27. In 2016, the FDA approved an updated version of the OVA1 test which retained CA125 but replaced 2 of the markers with HE4 and FSH (Coleman et al., 2016). This improved the overall AUC to 0.92 (95% CI: 0.89–0.96) for the assay alone and 0.94 (95% CI: 0.91–0.97) when combined with physician assessment, although 80% of the tumors in this study were benign. Although the above panels included some clinical information and therefore are not equivalent to our panel, we point out that the AUC of our panel to distinguish a malignant from benign pelvic mass was similar, while not including borderline tumors as positive results and agnostic to clinical or imaging information. As timely referral to a gynecologic oncologist is a strong predictor of ovarian cancer survival, we believe that there is a role for a test based on blood markers alone (Earle et al., 2006).

FDA approval of the various panels for use in the differential diagnosis of pelvic masses did not extend to their use in the general population. Based upon the results of the PLCO and UKCTOCS randomized clinical trials (or so called ‘phase 4’) the US Preventive Services Task Force and the Society of Gynecologic Oncology (SGO) have not recommended routine screening for ovarian cancer (Zhu et al., 2011; Skates et al., 2001). However, screening with CA125 and transvaginal ultrasound is recommended by the National Comprehensive Cancer Network guidelines and the SGO for women with known hereditary syndromes of ovarian cancer (such as women with germline BRCA1/2 mutations), even though there is currently no evidence that this screening strategy improves survival in elevated risk populations (Schorge et al., 2010).

Recent studies (Chung et al., 2013; Langhe et al., 2015; Resnick et al., 2009; Zuberi et al., 2015) have identified circulating (serum/plasma) miRNAs that are altered in ovarian carcinomas, and there is limited overlap with miRNAs that emerged from our analysis. One possible cause of this difference is the limited number of samples examined in these studies. For example, in Langhe et al., a training set of five serous ovarian carcinomas and five benign serous cystadenomas was selected for the initial experiments. The validation set was 20 serous ovarian carcinomas and 20 benign serous cystadenomas. In Resnick et al., 28 ovarian carcinoma patients and 15 healthy controls were used to identify differential expression of circulating miRNAs. Such limited numbers diminish the statistical robustness of the results. Another possible cause for the differences is the miRNA expression profiling platform. A recent study (Mestdagh et al., 2014) systematically compared 12 different miRNA expression platforms. Specifically, for serum miRNAs there was a 12-fold difference between the highest and lowest number of detected miRNAs when identical samples were profiled by different platforms. According to this report, the LNA-based platform from Exiqon has the highest specificity but may be limited in sensitivity and therefore in detection. To circumvent both these concerns, we started with next-generation sequencing of 179 samples, which captures all small RNAs and addresses any issues of detection or specificity due to limitations of platforms. Next, we performed validation using the Exiqon qRT-PCR platform on 325 local samples, and a further validation using an additional cohort of 51 samples from Poland. The large number of samples, along with the methodologies used for identification and validation of the circulating miRNAs in our study, provides strong support for our conclusions and distinguishes our work from prior reports.

Our study does have several limitations. Whether our miRNA panel will prove useful in differential diagnosis or early detection will require further study in the following areas. First, additional study is necessary to determine whether integrating clinical risk factors could further improve its performance. Second, confirmation in other phase II data sets is necessary to validate our study results and demonstrate their generalizability. Third, specimens collected and stored months or years prior to a clinical diagnosis (so-called phase III specimens) are necessary to demonstrate the model’s potential in the early detection of EOC in a general population or elevated-risk setting. For the former, we have access to PLCO specimens; for the latter, we plan to apply to the National Clinical Trials Network for specimens. Fourth, a logical extension of our work is to determine whether our current miRNA panel (or a new one) would be useful in predicting survival after EOC. A tissue-based MiROvaR signature involving 35 miRNAs for predicting EOC prognosis has recently been described (Bagnoli et al., 2016). Although several miRNAs appear in both the tissue signature and our model (Supplementary file 4B), full concordance is unlikely since the tissue model was built to predict prognosis whereas our model was built to predict diagnosis. In addition, about two-thirds of the miRNAs in the tissue signature are not reliably detectable in circulation, which can be attributed to the fact that relatively few miRNAs circulate in serum (Mestdagh et al., 2014; Dinh et al., 2016). Our serum panel relies on a smaller number of miRNAs simply because the neural model prioritizes those that provide novel information. If miRNAs are correlated (for example, within the same chromosomal cluster), they will vary together, so knowing one conveys sufficient information about the rest for them to be excluded from model building.
Finally, more remains to be learned about the basic biology of serum miRNAs. Do they all come from cancer cells, or also from other cells in the tumor microenvironment? (Likely, both contribute to the signature.) It is noteworthy that two of the miRNAs are members of the mir-200 family, confirming prior reports identifying these miRNAs as overexpressed in ovarian cancer (Zuberi et al., 2015; Pecot et al., 2013). Some of the miRNAs incorporated into the neural network algorithm have connections to other disease types. For example, miR-1246 has been identified in the serum of ovarian cancer, lung cancer, prostate cancer, and stroke patients (Todeschini et al., 2017; Zhang et al., 2016; Alhasan et al., 2016; Li et al., 2015). However, as noted in Figure 5, the network as a whole was specific to ovarian cancer, again emphasizing the importance of multimarker panels.
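The earlier point that correlated miRNAs carry redundant information can be sketched with a simple correlation filter. This is an illustrative stand-in only: the text does not describe the neural model's variable prioritization in these terms, and the greedy rule, the 0.9 threshold, the function names, and the placeholder miRNA names and values below are all assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def prune_correlated(features, threshold=0.9):
    """Greedily keep a feature only if |r| with every already kept feature is below threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical expression values; names are placeholders, not data from the study
features = {
    "miR-A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "miR-B": [2.1, 4.0, 6.2, 7.9, 10.1],  # nearly proportional to miR-A
    "miR-C": [5.0, 1.0, 4.0, 2.0, 3.0],   # only weakly related to miR-A
}
print(prune_correlated(features))
```

In this toy input, the second marker tracks the first almost exactly and is dropped, while the third adds independent information and is retained, which mirrors why a panel can be smaller than the set of individually informative miRNAs.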

In conclusion, serum miRNA adds to the toolbox of options for diagnosing ovarian cancer. We plan several future studies to characterize the miRNA neural network. Whether serum miRNA offers a lead-time advantage over other putative biomarkers remains to be proven. We need to study the performance characteristics of the miRNA neural network in high-risk and low-risk populations. Finally, we are performing laboratory investigations to elucidate the biologic function of these miRNAs and to understand the kinetics of miRNA expression in ovarian cancer pathogenesis. With an improved understanding of miRNA analytic approaches, we can develop better models for this and other diseases.

Flowchart of study design.

Demographics of patients in the model study populations.

Demographics of patients after stratified random sampling into training and testing sets.

Clinical performance characteristics of the tested models.

miRNA variables used in model building identified through univariate testing

Performance of the eleven statistical models on the testing set by variable selection method.

ROC curves for the neural network analysis.

ROC curves for neural network analysis compared to CA-125.

Specificity of miRNA signature for ovarian cancer compared to other diagnoses.

Clinical characteristics of the qPCR model set.

ROC curve for neural network analysis using qPCR inputs from the clinical test set.

Clinical characteristics of the external validation set.

Change in miRNA expression from preop to post-operative day three after surgical cytoreduction.

In situ expression of selected miRNAs from the serum signature.

Principal component analysis identified a prominent batch effect among the study populations.

Hierarchical clustering of the eleven statistically significant miRNAs identified using univariate analysis.

Author details

Kevin M Elias
Wojciech Fendler
Konrad Stawiski
Stephen J Fiascone
Allison F Vitonis
Ross S Berkowitz
Gyorgy Frendl
Panagiotis Konstantinopoulos
Christopher P Crum
Magdalena Kedzierska
Daniel W Cramer
Dipanjan Chowdhury (for correspondence)
