Computed Tomography Based Radiomics as a Predictor of Survival in Ovarian Cancer Patients: A Systematic Review

Simple Summary
Ovarian cancer represents the most lethal gynecological malignancy. Since many new drugs have been recently introduced as adjunctive treatments for this pathology, an early prediction of outcome might be helpful to further improve outcomes. Radiomics represents a recent advancement, relying on extraction of quantitative features from imaging examinations. Indeed, clinical images, such as computed tomography images, may contain quantitative information, reflecting the underlying pathophysiology of a tumoral tissue. Radiomic analyses can be performed in tumor regions and metastatic lesions, as well as in normal tissues. The radiomic process relies on quantitative features, usually extracted by dedicated software, and then culminates in analysis and model building, according to a defined clinical question. This systematic review aims to evaluate association between radiomics based on computed tomography images and survival (in terms of overall survival and progression free survival) in ovarian cancer patients.

Abstract
The objective of this systematic review was to assess the results of radiomics for prediction of overall survival (OS) and progression free survival (PFS) in ovarian cancer (OC) patients. A secondary objective was to evaluate the findings of papers that based their analyses on inter-site heterogeneity. This systematic review was conducted according to the PRISMA statement. After the initial retrieval of 145 articles, the final systematic review comprised six articles. Association between radiomic features and OS was evaluated in 3/6 studies (50%); all articles showed a significant association between radiomic features and OS. Association with PFS was evaluated in 5/6 (83%) articles; the period of follow-up ranged between six and 36 months. All the articles showed significant association between radiomic models and PFS. Inter-site textural features were used for analysis in 2/6 (33%) articles. They demonstrated that high levels of inter-site textural heterogeneity were significantly associated with incomplete surgical resection in breast cancer gene-negative patients, and that lower heterogeneity was associated with complete resectability. There were some differences among papers in methodology; for example, only 3/6 (50%) articles included validation cohorts. In conclusion, radiomic models have demonstrated promising results as predictors of survival in OC patients, although larger studies are needed to allow clinical applicability.


Introduction
Ovarian cancer (OC) represents one of the most lethal gynecological cancers in the world, accounting for about 14,000 deaths in 2020 in the US [1]. Computed Tomography (CT) is part of the standard pre-treatment evaluation of OC patients to evaluate the spread of the disease [2]. Traditionally, radiologists subjectively evaluate clinical images, based on their training and experience, to provide a diagnosis or an assessment of a clinical state [3].
In the last few years, personalized and precision medicine have begun paving the way towards tailored and individualized treatments, based on the increasing knowledge of the tumoral microenvironment, at a microbiological and molecular level. Radiomics represents a recently introduced translational field of research, aiming to find associations between quantitative information extracted from imaging examinations and clinical data to support evidence-based clinical decision-making [4].
The concept underlying radiomics is that clinical images, such as CT images, may contain quantitative information, reflecting the underlying pathophysiology of a tumoral tissue [5]. Radiomic analyses can be performed in tumor regions and metastatic lesions, as well as in normal tissues [4,6]. The radiomic process relies on quantitative features, usually extracted by dedicated software, and then culminates in analysis and model building, according to a defined clinical question. Despite some promising results of radiomics as a predictor of prognosis in cancer patients [5,7,8,9], there are many concerns about the applicability of these models and ethical issues [10,11]. Indeed, the main limitations of most radiomic studies are different approaches to the extraction of features, in terms of software, as well as in terms of tissues considered for feature extraction (the tumor itself or the metastases), the high number of features despite the small number of patients included, and the lack of validation cohorts [9,12].
So far, a few narrative reviews about the role of radiomics in evaluation of ovarian cancer patients have been published [3,13,14,15,16]. To the best of our knowledge, no systematic review has specifically addressed the role of radiomics in evaluation of survival in OC patients.
The main objective of this systematic review was to assess the results of radiomics for prediction of overall survival (OS) and progression free survival (PFS) in OC patients. A secondary objective was to evaluate the findings of papers that based their analyses on inter-site heterogeneity.

Methods
This systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-analysis for Diagnostic Test Accuracy (PRISMA-DTA) statement [17], which describes an evidence-based minimum set of items for reporting in systematic reviews and meta-analyses of diagnostic studies.

Search Strategy
Two authors (SR and LM) performed a comprehensive computer literature search of the electronic databases PubMed, Cochrane, and Web of Science to find primary publications evaluating CT based texture analysis in ovarian cancer. No beginning date limit or language restrictions were used; the literature search was last updated on 31st October 2020; the search was expanded by also screening the references of the retrieved articles for additional potentially eligible studies.

Study Selection
The search terms consisted of ((ovary) OR (ovarian)) AND ((radiomic) OR (radiomics) OR (texture) OR (textural)) AND ((survival) OR (prognosis) OR (prognostic) OR (predictive) OR (predicting) OR (prediction)). Articles in which CT-based texture analysis and radiomics were employed for prognostic purposes of ovarian cancer were obtained in full for further evaluation. Studies were excluded if they were case reports, conference abstracts, reviews, or short communications because they do not provide sufficient information to assess the methodological quality. Uncertainties were resolved in consensus.
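The structure of the search string reported above (synonyms joined by OR within each concept, concepts joined by AND) can be sketched as follows. This is only an illustration of how the reported string is composed; the function name `build_query` is hypothetical, and no database interface is assumed.

```python
# Term groups copied from the search string reported in the text.
TERM_GROUPS = [
    ["ovary", "ovarian"],
    ["radiomic", "radiomics", "texture", "textural"],
    ["survival", "prognosis", "prognostic", "predictive", "predicting", "prediction"],
]


def build_query(groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    blocks = []
    for terms in groups:
        ored = " OR ".join(f"({term})" for term in terms)
        blocks.append(f"({ored})")
    return " AND ".join(blocks)


query = build_query(TERM_GROUPS)
print(query)
```

Keeping the three concepts (anatomical site, imaging technique, outcome) as separate groups makes it straightforward to add or remove a synonym without rewriting the whole string.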

Data Extraction
For each eligible article, information was collected concerning the basic study (authors, year of publication, country of origin, journal impact factor, quartile ranking of the journal, study design), patient characteristics (number of patients evaluated, mean/median age, stage according to the International Federation of Gynecology and Obstetrics (FIGO) staging, OS and PFS, including median follow-up in months), technical aspects (presence of a validation group, extraction of features exclusively from the ovaries or from more than one site of disease, number of features included in the final model, categories and names of features included, software used for segmentation and for feature extraction, CT scan manufacturer and protocol of acquisition, ROI tracing).

Quality Assessment
The overall quality of the included studies was critically evaluated based on the revised "Quality Assessment of Diagnostic Accuracy Studies" tool (QUADAS-2) [18]. This tool comprises four domains (patient selection, index test, reference standard, and flow and timing); each domain was assessed in terms of risk of bias, and a graph was constructed accordingly.

Literature Search
The initial search yielded 145 articles, all in English. According to inclusion and exclusion criteria, six full-text articles were included in this systematic review. Details about the literature search results are reported in Figure 1. Given the small number of papers included, and the heterogeneity of the quantitative analyses performed as well as of the results, a meta-analysis of pooled data was not performed.

Basic Study and Patient Characteristics
Six articles evaluating the association between CT based radiomics and survival in OC patients were selected [19][20][21][22][23][24]. The selected articles were published between 2017 and 2019 by researchers from Europe (n = 2), US (n = 3), and China (n = 1). All the studies were retrospective; the number of patients included ranged between 38 and 364; the mean/median age of patients ranged between 50 and 75 years; most of the patients included had FIGO stage III or IV disease; median follow-up ranged between 23.1 and 59 months (Table 1).

Methodological and Technical Aspects of the Included Studies
Three out of six (50%) studies included one or more validation cohorts [19,23,24], and two of these (2/6, 33%) included both internal and external validation [19,23]. The number of features included in the prognostic models, categories and names of the features, software used for segmentation, and feature extraction are summarized in Table 2. Technical details about CT acquisition and ROI tracing of the included studies are summarized in Table 3. The overall quality assessment of the studies is reported in Figure 2.

Overall Survival
Association between radiomic features and OS was evaluated in 3/6 (50%) studies; the median period of follow-up, where declared, ranged between 53 and 59 months. All the articles showed a significant association between radiomic features and OS. Specifically, Lu et al. drew up a prognostic score of four features, named radiomic prognostic vector (RPV), selected from a larger group of 42 features by using least absolute shrinkage and selection operator (LASSO). Using the RPV, OS differences were confirmed in two independent validation datasets; furthermore, the addition of RPV improved the clinically available prognostic methods (stage, age, and postoperative residual disease) in all three datasets, as measured by the concordance index [19].
The other two studies based their model on features indicating inter-site similarity. Meier et al. demonstrated that smaller values of similarity entropy (SE) were significantly associated with longer OS (p = 0.014) [20]; Vargas et al. demonstrated that SE, similarity level cluster shade (SCS), and inter-site similarity level cluster prominence (SCP) were associated with shorter OS (p = 0.02, 0.017 and 0.028, respectively) [22].
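The concordance index mentioned above as the measure of prognostic improvement can be illustrated with a minimal sketch of Harrell's C for right-censored data. The toy cohort and variable names are hypothetical, and this is not the code used by any of the reviewed studies.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable patient pairs that are ranked correctly.

    times       -- follow-up time of each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means shorter predicted survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before the last follow-up of patient j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, one of them censored at 15 months.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risk)
print(round(c_index, 2))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so adding a radiomic score to a clinical model is judged by how far it moves this value upward.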

Progression Free Survival
Association with PFS was evaluated in 5/6 (83%) studies; the median period of follow-up, where declared, ranged between 26 and 59 months. Lu et al. [19] demonstrated that the above-mentioned RPV was also significantly associated with PFS. Meier et al. [20] demonstrated that lower values of inter-site textural heterogeneity metrics, namely inter-site cluster variance (SCV) and inter-site cluster prominence (SCP), were significantly associated with longer PFS, with p-values of 0.006 and 0.021, respectively.
Rizzo et al. [21] showed that, at univariate analysis, three features, belonging to the gray level run length matrix (GLRLM), shape 3D, and gray level co-occurrence matrix (GLCM) categories, were significantly associated with disease progression at 12 months. Multivariate analysis, on the other hand, confirmed the higher risk only for the feature belonging to the shape 3D category. Furthermore, the authors demonstrated that a clinical-radiomic model outperformed a clinical model (p = 0.04), with corresponding AUC (95% CI) of 0.87 (0.76-0.97) and 0.73 (0.54-0.93), respectively.
Wei et al. [23] built a radiomic signature for association with PFS with four radiomic features. These included a zone-size variance feature from the gray-level size zone matrix (GLSZM), features extracted from the wavelet transform, and a first-order statistics (FOS) feature. The discrimination accuracy of the radiomic nomogram for predicting 3-year recurrence risk was 88.9% (95% CI, 85.8-92.5%), whereas it was only 73.7% (95% CI, 69.4-78.1%) via the clinical prognostic model alone.
Zargari et al. [24] used particle swarm optimization (PSO) to narrow down the number of features included in the model (n = 11) and to give each one a different weight. The features belonged to the shape and density, discrete cosine transform (DCT), gray level difference method (GLDM), and wavelet groups. Among these, the features in the shape and density group showed greater weights than others in the DCT and GLDM groups, indicating the importance of the features coming from the frequency domain analysis in establishing the optimal synthetic feature (p-values between 0.002 and 0.017).

Radiomic Similarity
Analysis was based on radiomic features extracted from several sites of disease in 3/6 (50%) papers; however, only 2/3 (67%) based their analysis on the inter-site textural heterogeneity metrics. Meier et al. showed that high levels of three inter-site textural heterogeneity metrics were significantly associated with incomplete surgical resection in breast cancer gene (BRCA)-negative patients, but not in BRCA-positive patients. Comparison of texture heterogeneity metrics and surgical resection status demonstrated significant association only for the SCV, suggesting that lower heterogeneity was associated with complete resectability. On the other hand, the evaluated texture metrics were not able to distinguish between BRCA mutation carriers and non-mutation carriers [20].
Vargas et al. confirmed that lower heterogeneity was associated with complete resectability. However, they did not find significant associations between inter-site texture heterogeneity metrics and the cancer genome atlas (TCGA) classification of the OC subtype, when grouped by mesenchymal versus non-mesenchymal subtypes, with the exception of inter-site homogeneity [22].
All the above-mentioned findings are summarized in Table 4.

Discussion
Since OC presents no specific symptoms in the early stage of the disease, over 75% of women are diagnosed at advanced stages with a 5-year survival of 15−25% [1].
The gold standard treatment for OC is primary cytoreduction followed by platinum-based adjuvant chemotherapy. It has been clearly demonstrated that the absence of residual tumor (RT) at the end of surgical cytoreduction is the most important factor positively affecting patient survival [25,26]. A 10% increase in the rate of complete resection (RT = 0) is associated with a 5% improvement in patient OS [27]. Unfortunately, no effective tools are available at diagnosis for preoperatively selecting the patients in whom complete resection will be achieved, and who will, therefore, have a better OS. In many institutions, resectability is evaluated laparoscopically with an easy and reproducible score [28]. When a complete resection seems to be achievable, the laparoscopy is converted to a laparotomy for primary cytoreduction. When a complete resection does not seem to be achievable, the laparotomy is omitted in favor of neoadjuvant chemotherapy followed by interval debulking surgery and adjuvant chemotherapy.
Despite the high responsiveness to platinum-based regimens, up to 80% of patients with advanced OC experience relapses with a median progression-free survival of 12-18 months [29].
OC is a complex disease, with different histological features and molecular expression. Implementation of biobanking and analysis of tumor samples is considered a promising way to identify markers predictive of response [30]. Among these new methods, radiomics may represent a noninvasive pre-operative tool for stratification of OC patients according to outcome.
Radiomic analysis is a multistep process, where each single step is important for the robustness of the final result. For example, segmentation can be performed with different software programs and can rely on manual, semi-automatic, or automatic contouring. This is both challenging, because many tumors have unclear borders, and contentious because there is no consensus on the need to seek either the ground truth or the reproducibility of image segmentation [4,5]. In this systematic review, only 2/6 papers used the same software for segmentation, thus, demonstrating a moderate variability.
Likewise, the software used for feature extraction showed an even larger variability, because none of the six papers used the same one. This partially accounts for the different results among the articles. Indeed, the radiomic features belong to different categories and not all the software programs extract all the categories. Principally, the features may belong to the following orders of categories. The so-called first order statistics features describe the distribution of individual voxel values without concern for spatial relationships, such as mean, median, maximum, minimum, entropy, skewness, and kurtosis of the histogram of values. The first-order statistics features also include the shape features, which describe the shape of the traced ROI and its geometric properties.
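As an illustration of the first order statistics listed above, the following minimal sketch computes them from a flat list of voxel values. It assumes that the gray levels have already been discretized to integers and uses population moments; dedicated radiomics software may apply different discretization and normalization conventions, so the values are not meant to match any specific package.

```python
import math
from collections import Counter
from statistics import mean, median


def first_order_features(voxels):
    """Histogram-based features that ignore the spatial position of the voxels."""
    n = len(voxels)
    mu = mean(voxels)
    # Population central moments of order 2, 3, and 4.
    m2 = sum((v - mu) ** 2 for v in voxels) / n
    m3 = sum((v - mu) ** 3 for v in voxels) / n
    m4 = sum((v - mu) ** 4 for v in voxels) / n
    # Shannon entropy (in bits) of the gray-level histogram.
    counts = Counter(voxels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {
        "mean": mu,
        "median": median(voxels),
        "minimum": min(voxels),
        "maximum": max(voxels),
        "entropy": entropy,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
    }


# Hypothetical region of interest with four equally frequent gray levels.
roi = [1, 1, 2, 2, 3, 3, 4, 4]
features = first_order_features(roi)
print(features)
```

Because these features depend only on the histogram, shuffling the voxels inside the region leaves every value unchanged, which is why they cannot capture texture.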
The second order statistics features describe textural features, meaning the statistical inter-relationships between voxels [4], and in radiomics this can readily provide a measure of intratumoral heterogeneity. Examples of these features are all the matrices based on different gray levels, such as the GLCM, GLRLM, and so on. Higher order statistics features are generated by statistical methods after imposing filter grids on the image to extract repetitive or nonrepetitive patterns. These include fractal analyses, Minkowski functionals, and wavelets [4]. Once the features have been extracted, a reduction in number is needed, to avoid redundancy. Furthermore, the stability and reproducibility of the model must be assessed and then confirmed before applying a predictive model in a clinical setting. Indeed, model fitting is usually optimal in the training set for model building, whereas its validation in external cohorts may provide more reliable fitting estimates [31]. The first validation is usually internal cross-validation, but the reference standard should be validation in independent cohorts, ideally prospectively collected within clinical trials [9].
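The gray level co-occurrence matrix named above as an example of second order statistics can be sketched for a small 2D image as follows. This is a simplified illustration with a single offset and two derived features (contrast and homogeneity); the function names and toy images are hypothetical, and clinical software typically works in 3D, over several directions, and after gray-level discretization.

```python
def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no voxel pairs for this offset")
    return [[value / total for value in row] for row in counts]


def contrast(p):
    """Large when neighboring voxels differ strongly in gray level."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


def homogeneity(p):
    """Large when neighboring voxels have similar gray levels."""
    size = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(size) for j in range(size))


# Two hypothetical 2 x 2 images with two gray levels (0 and 1).
uniform = [[0, 0], [0, 0]]
checker = [[0, 1], [1, 0]]
p_uniform = glcm(uniform, levels=2)
p_checker = glcm(checker, levels=2)
print(contrast(p_uniform), homogeneity(p_uniform))
print(contrast(p_checker), homogeneity(p_checker))
```

Unlike first order statistics, these values change when the same gray levels are rearranged in space, which is what allows them to serve as a measure of heterogeneity.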
The CT based radiomic association with OS in OC patients was partially addressed by 3/6 articles included in this review. Although all of them demonstrated a significant association of the radiomic model with OS, the features were extracted from the ovarian masses in one article that also included validation cohorts [19], whereas they were extracted from the peritoneal carcinomatosis in 2/3 articles that based their model on the similarity indices [20,22]. These results are promising, but the methodological differences also indicate the need for larger cohorts, possibly supplemented by external validation cohorts.
The CT based radiomic association with PFS in OC was addressed by 5/6 articles included in this review. Three out of five based their feature extraction solely on the ovarian masses [19,21,23], and 2/5 on peritoneal carcinomatosis [20,24]; 3/5 articles included validation cohorts [19,23,24]. The five articles demonstrated associations between the radiomic model and PFS; all the models included features that fell within the second order statistical features, but in 3/5 articles features from the first order statistics and from higher order statistics features were also included [21,23,24]. Furthermore, one paper based its conclusions on evaluation of inter-site heterogeneity features [20]. Notably, there is also a wide variation in the methods used for association with PFS and, consequently, the models are still not transferable as such into clinical practice.
OC is a heterogeneous disease composed of many different histologic subtypes, the most frequent being the high-grade serous, low-grade serous, endometrioid, clear cell, and mucinous type. Each histological type is associated with unique clinical etiologies, sensitivity to therapies, and molecular signatures, including diverse transcriptional regulatory programs [32,33]. Since histological sampling from different sites of disease is not feasible over a period of time, CT based radiomics may represent an optimal tool to capture the heterogeneity of the disease. For this reason, a secondary objective of this systematic review was a dedicated evaluation of papers that based their models on inter-site heterogeneity. In this review, 3/6 articles based their considerations on the extraction of features from many sites of disease [20,22,24], and among these 2/3 calculated specific similarity indices [20,22]. Based on the TCGA data, a prognostic algorithm for high grade serous ovarian cancer, known as classification of ovarian cancer (CLOVAR) with four subtypes has been defined: differentiated, immunoreactive, mesenchymal, and proliferative [32,34], where the mesenchymal subtype has demonstrated shorter survival. Vargas et al. developed 12 quantitative metrics to capture spatial inter-site imaging heterogeneity in high-grade serous ovarian cancer. The authors demonstrated that metrics capturing the differences in texture similarities across sites were associated with shorter OS [22]. Later, Meier et al. assessed the associations between inter-site texture heterogeneity parameters, survival, and BRCA status, demonstrating that high inter-site cluster variance was associated with lower PFS and OS. Furthermore, the authors demonstrated that high values of all the three metrics included were significantly associated with lower rate of complete surgical resection in BRCA-negative patients [20].
However, neither of these studies included validation of their models and the number of patients was limited (n = 38 in [22] and n = 88 in [20]); therefore, for this objective too, larger studies including validation cohorts are needed.
This review has some inherent limitations. The first is the small number of papers included, which is related on the one hand to the novelty of the topic, and on the other to the restriction to radiomics papers on OC that based their analysis on CT images and had survival as their endpoint. However, these selection criteria were chosen to answer an unmet clinical need for clarification regarding the role of radiomics as a predictor of survival in OC. Second, all the studies included were retrospective, with considerable variability in the number of patients included and in methodology. These methodological differences were taken into account and prevented us from carrying out a meta-analysis. Third, in the last few years new drugs have been introduced as treatments in OC, such as poly ADP-ribose polymerase (PARP) inhibitors, angiogenesis inhibitors, and immunomodulators, which have significantly increased survival, although with an increase in treatment-related toxicities [35,36]. In this regard, none of the included articles evaluated whether the differences in survival were related to the introduction of new drugs. A further limitation is that, given the availability of different software packages, some free to download and others commercial, none of the six included papers used the same software for texture extraction. Furthermore, the radiomics extraction methodology, including details of normalization, creation of isotropic images, and interpolation, was not described in all the studies and, therefore, is not reported in this review.

Conclusions
In conclusion, radiomic models have demonstrated promising results as predictors of survival in OC patients. However, larger studies that include validation cohorts, and take into account the introduction of new drugs that may prolong survival, are needed to transfer the prediction models to clinical practice.