Image analysis-derived metrics of histomorphological complexity predict prognosis and treatment response in stage II-III colon cancer

The complexity of tumor histomorphology reflects underlying tumor biology, which affects both the natural course of the disease and the response to treatment. This study presents a method for computer-aided analysis of cytokeratin-stained tumor sections which relies on multifractal (MF) analyses to quantitatively evaluate the morphological complexity of the tumor-stroma interface. The approach was applied to a colon cancer collection from a randomized adjuvant treatment study. Metrics obtained with the method acted as independent markers for the natural course of the disease and for benefit of adjuvant treatment. Comparative analyses demonstrated that MF metrics outperformed standard histomorphological features such as tumor grade, budding and configuration of the invasive front. Notably, the MF analysis-derived “αmax” metric constitutes the first response-predictive biomarker in stage II-III colon cancer showing significant interactions with treatment in analyses using a randomized trial-derived study population. Based on these results, the method appears to be an attractive and easy-to-implement tool for biomarker identification.

Scientific Reports | 6:36149 | DOI: 10.1038/srep36149

[...] histomorphological markers predict response to treatment. Notably, scoring of grade, TBC and TB is subjective, not standardized and may be time-consuming.
Digital image analysis of slide section scans allows quantitative and observer-independent characterization of morphological and histomorphological features. Unsupervised and supervised analyses of tumor morphology have identified novel prognostic markers and metastasis-associated patterns of immune infiltration 10,11. The complexity of the tumor structure can also be described by fractal geometry. A few studies have applied multifractal (MF) analyses to breast 12, colorectal 13,14, prostate 15 and lung tumor collections (reviewed in ref. 16). Some of these studies have identified associations between specific MF metrics and outcome. However, none of the earlier studies analyzed material from randomized studies, which has prevented clear distinctions between marker impact on natural course/prognosis and on response to treatment.
In this study we analyzed the prognostic and response-predictive capacity of two MF metrics collected by digital image analyses of cytokeratin-stained sections from 291 tumors of stage II-III colon cancer patients participating in a previously reported randomized trial comparing surgery alone with surgery plus adjuvant fluoropyrimidine chemotherapy 17. Scores for grade, TB and TBC were also collected from the same cases and related to MF metrics and outcome.

Material and Methods
Patients. The 291 patients were derived from a randomized Nordic clinical trial evaluating the efficacy of 5-FU (5-fluorouracil) based adjuvant chemotherapy 17. The original study included 2224 patients under the age of 76 years with radically resected stage II and III colorectal cancer during the period 1991-1997. The patients were randomized to surgery alone or surgery followed by 5-FU based adjuvant chemotherapy. The adjuvant chemotherapy regimens included 5FU/leucovorin for 4-5 months according to either a modified Mayo Clinic schedule or the Nordic schedule, or 5FU/levamisole for 12 months. Some centers also randomized patients treated with 5FU/leucovorin to +/− levamisole. This study is based on a selection of Swedish cases from the original study. None of the patients were treated with radiotherapy or chemotherapy prior to surgery. Clinical data on patients and tumor characteristics were retrieved from pathology reports. Survival data were available through the regional centers of epidemiological oncology. Patient characteristics are listed in Table 1.
All methods were carried out in accordance with relevant guidelines and regulations. All participants gave written informed consent, which also covered prognosis-related studies. The ethical committee of the Karolinska Institutet, Stockholm, Sweden, approved the analysis within this cohort (Dnr 00-260, 2014/664-32).
[...] 19. The average number of tumor buds in 10 fields of view was then used either as a continuous value (for correlation analyses) or to dichotomize cases into a high-budding group (≥ 10 buds per high-power field area) and a low-budding group (< 10 buds).
Information on tumor differentiation grade was retrieved from the clinicopathological records.

Digital image analysis. Inclusion criteria and area selection.
For the digital analyses a subset of 291 cases was used. The major exclusion criteria were partial tissue detachment from the glass, focusing issues of the slide scanning microscope which resulted in a blurred digital image of > 10% of the sample area, or poor quality of immunohistochemical staining (too weak specific staining and/or too strong background staining).
For quantification of MF metrics one region per case was selected. All area selection was performed blinded to outcome data. Selection of areas for analyses was based on the following criteria:
1. The largest possible tissue region was selected if not contradictory with other selection criteria.
2. The selected area should include stroma/tumor interface regions.
3. Areas of artificial tissue damage should be avoided.
4. When more than one tumor tissue sample/slide was available from a case, analyses were based on the sample with the deepest invasion.
Image segmentation and classification. The intention was to obtain as large a tumor area as possible, excluding sample margins and areas with tissue damage or unspecific staining. For the classification of tumor and non-tumor regions we used Definiens eCognition Developer® software, trial version 9.1. The analytical pipeline consisted of segmentation of the image derived from the pan-cytokeratin-stained section (Fig. S1A), followed by image classification using a nearest neighbour object-based algorithm. The pipeline was developed and tuned on 10 randomly selected images and then applied to the rest of the samples. Regions of tumor tissue, tumor stroma and "background areas" were classified. Each image was re-evaluated by a pathologist (AM), and classifier settings were modified if deemed necessary. The classified images were then binarized into "tumor vs. non-tumor" images (Fig. S1B), from which two types of contour images were derived. The first type was made by outlining the cancer contour, and thus represented the tumor-stroma interface (Fig. S1C1). The second type, in addition to the tumor-stroma outlines, also contained contours of internal structures of the tumor tissue (Fig. S1C2).
Multifractal analysis. The multifractal analysis was performed on the contour images, obtained as described above, using the FracLac plug-in for the freely available ImageJ software (Karperien, A., FracLac for ImageJ. http://rsb.info.nih.gov/ij/plugins/fraclac/FLHelp/Introduction.htm. 1999-2013.). FracLac default settings were used: box sizes (scale window) ranged from 10 pixels (minimum) to a maximum of 60% of the image; Q ranged from −10 to +10; no multifractal filters were used; minimal density was 0.1 and maximal density 0.98. The generalized fractal dimension D(q) vs Q spectrum was generated for each case, and the correlation coefficient (r²) of the regression line for D(q) vs Q was computed. The Hölder exponent ('α') reflects the local regularity of the analyzed structure. High values of α indicate high local irregularity of the observed structure. For the analyses in the current study we used α max, the maximal local irregularity score among the scores obtained on different scales. The second metric, f(α), is obtained as a function of α, forming the multifractal spectrum, and characterizes global irregularity. FracLac-derived data were used to obtain, for each tumor, four multifractal metrics: α max and f(α) max for the 'outline' images (Fig. S1D1), and α max internal structure and f(α) max internal structure for the images with internal structures (Fig. S1D2). Representative images of two tumor samples from the analyzed cohort with low and high multifractal metrics are shown in Fig. 1.
For survival analyses, Cox regression was performed in uni- and multivariate settings with median-based dichotomization of cases into "metric-high" and "metric-low" groups.
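The median-based dichotomization used for the survival analyses can be illustrated with a minimal Python sketch. The function name, the example scores and the handling of ties (values equal to the median are placed in the "low" group) are assumptions made for illustration; the text does not state how ties were treated.

```python
from statistics import median


def dichotomize_by_median(values):
    """Split metric values into 'high' and 'low' groups at the cohort median.

    Assumption: values equal to the median are labelled 'low'; the tie rule
    used in the study is not specified.
    """
    cutoff = median(values)
    groups = ["high" if v > cutoff else "low" for v in values]
    return groups, cutoff


# Made-up illustrative scores, not data from the study.
alpha_max_scores = [1.62, 1.71, 1.55, 1.80, 1.68, 1.74]
groups, cutoff = dichotomize_by_median(alpha_max_scores)
```

The resulting labels would then enter a Cox model as a binary covariate.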
All statistical tests were performed using SPSS V20 (SPSS Inc., Chicago, IL) and R (version 3.2.2). p values < 0.05 were considered statistically significant.
As survival endpoints we used Cancer Specific Survival (CSS), which was measured from the date of surgery to the date of death from CRC, and Time To Recurrence (TTR), which was measured from the date of surgery to the date of local recurrence, distant metastasis or death from CRC.
To estimate interactions between adjuvant treatment and the studied factors, a test of interaction was used. The model included the analyzed characteristic (high/low), adjuvant chemotherapy (+/−) and their product (analyzed characteristic * adjuvant chemotherapy).
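Written out, and assuming the interaction test was fitted as a Cox proportional hazards model (the symbols M, C and the β coefficients are notation introduced here, not taken from the original), the model with a product term has the form:

```latex
% M = 1 for "metric-high", 0 for "metric-low"
% C = 1 for adjuvant chemotherapy, 0 for surgery alone
h(t \mid M, C) = h_0(t)\,\exp\!\left(\beta_M M + \beta_C C + \beta_{MC}\, M C\right)

% Hazard ratio for chemotherapy vs. surgery alone within each marker group:
\mathrm{HR}_{\text{low}} = e^{\beta_C}, \qquad
\mathrm{HR}_{\text{high}} = e^{\beta_C + \beta_{MC}}

% Test of interaction:
H_0 : \beta_{MC} = 0
```

Under this formulation, a significant interaction means that the treatment effect differs between the metric-high and metric-low groups, which is what distinguishes a response-predictive marker from a purely prognostic one.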

Results
Multifractal properties of the dataset. Scaling selection has been shown to be critical for multifractal (MF) analyses 20,21. To show that the parameters for the MF analysis in our study were selected properly, we performed a series of analyses to investigate the behavior of the data set. The individual data from the spectra of the generalized fractal dimension D(q) of each sample were used to produce a summary plot of D(q) vs Q for the whole cohort. As shown in Supp. Fig. S1E, the D(q) vs Q curve is nonlinear, sigmoid in shape and descending. This indicates that the analyzed data have MF properties within the selected settings and can thus be subjected to MF analysis 22. The average correlation coefficient (r²) for the regression lines for D(q) vs Q was 0.826 (standard deviation 0.063), which indicates a good fit of the line to the data.
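To make the quantities D(q), α and f(α) concrete, the following Python sketch estimates them by box counting on a binary contour image. It uses the moment-based (Chhabra-Jensen) formulation, one common approach; it is not a reproduction of FracLac's implementation, and the function names, the fixed (non-shifted) grid and the test shapes are assumptions made for illustration.

```python
import math


def box_masses(pixels, size):
    """Count foreground pixels in each occupied box of side `size`."""
    counts = {}
    for row, col in pixels:
        key = (row // size, col // size)
        counts[key] = counts.get(key, 0) + 1
    return list(counts.values())


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def multifractal_spectrum(pixels, sizes, qs):
    """Return {q: (D_q, alpha_q, f_q)} estimated over the given box sizes."""
    total = len(pixels)
    log_sizes = [math.log(s) for s in sizes]
    probs_by_size = [[m / total for m in box_masses(pixels, s)] for s in sizes]
    result = {}
    for q in qs:
        d_terms, alpha_terms, f_terms = [], [], []
        for probs in probs_by_size:
            z = sum(p ** q for p in probs)          # partition sum
            mu = [p ** q / z for p in probs]        # normalized q-th moments
            alpha_terms.append(sum(m * math.log(p) for m, p in zip(mu, probs)))
            f_terms.append(sum(m * math.log(m) for m in mu))
            if q == 1:
                # q = 1 is the information dimension (limit of the general case)
                d_terms.append(sum(p * math.log(p) for p in probs))
            else:
                d_terms.append(math.log(z) / (q - 1))
        # Each quantity is the slope of its term against log(box size).
        result[q] = (slope(log_sizes, d_terms),
                     slope(log_sizes, alpha_terms),
                     slope(log_sizes, f_terms))
    return result


# Sanity check on non-fractal shapes: a filled square has dimension 2,
# a straight line has dimension 1, for every q.
square = [(r, c) for r in range(64) for c in range(64)]
line = [(0, c) for c in range(64)]
square_spectrum = multifractal_spectrum(square, [2, 4, 8, 16], [-10, 0, 1, 2, 10])
line_spectrum = multifractal_spectrum(line, [2, 4, 8, 16], [-10, 0, 1, 2, 10])
```

For such uniform shapes D(q) is flat in q; a descending, nonlinear D(q) curve of the kind reported for the cohort is what indicates multifractal scaling. Summary values such as the maxima of α and f(α) could be read off the returned dictionary, although how FracLac defines α max and f(α) max is not reproduced here.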

Study population and multifractal metrics collection.
Clinico-pathological characteristics of the study population are described in Tables 1 and 2 and in Tables S1 and S2.
The mean CSS in the subgroup treated with surgery alone (n = 150) was 96 months compared with 102 months in patients receiving adjuvant chemotherapy (n = 141). This difference was not statistically significant (Fig. S2). The mean TTR was 90 and 95 months for the surgery-alone and the adjuvant group, respectively, with no statistically significant difference between the groups (Fig. S2).
Digital image analysis of cytokeratin-stained samples was used to collect data from each case for four MF metrics: α max, f(α) max, α max internal structure and f(α) max internal structure (see Material and Methods for details) (Fig. 1). The MF metrics displayed only moderate associations with each other, with Spearman correlation coefficients r = 0.650 (for α max and f(α) max) and r = 0.547 (for α max internal structure and f(α) max internal structure) (p < 0.001).
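For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation computed on ranks. A minimal Python sketch follows; the function names and example values are illustrative assumptions, and the study itself used SPSS and R rather than this code.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up illustrative values, not data from the study.
rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

Because only ranks are used, the coefficient is insensitive to the scale and distribution of the metrics.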

Association of MF metrics with clinicopathological and histomorphological characteristics.
As shown in Table 1, proximal localization was associated with higher score for both multifractal metrics. Furthermore, both digital metrics were higher in tumors from females than males. No associations were found with patient age, DNA mismatch repair (MMR) status or usage of adjuvant chemotherapy. Analyses of correlations between the two MF metrics and histomorphological features demonstrated that both MF metrics were associated with TB and TBC (Table 2). Only f(α) max was associated with tumor differentiation.
Analyses which included internal tumor structures are available in Supp. Tables 1 and 2. Interestingly, among clinicopathological characteristics only tumor localization was associated with α max internal structure and f(α) max internal structure .

Multifractal parameters are associated with the post-surgery natural course of stage II-III colon cancer.
A set of analyses was performed to determine whether the MF metrics were associated with the intrinsic aggressiveness of the disease. For this purpose the associations between MF metrics and TTR were analyzed in the subgroup of patients not receiving adjuvant chemotherapy. As shown in Fig. 2A,B, both MF metrics were associated with shorter TTR as determined by Kaplan-Meier analyses and Cox regression analyses. f(α) max yielded the strongest results (HR = 2.5, 95% CI 1.50-4.18, p < 0.001). Survival analyses with CSS yielded similar results (data not shown). High f(α) max was also associated with appearance of distant metastases and stage III tumors (Table 1).
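The Kaplan-Meier analyses referred to above estimate the recurrence-free fraction over time while accounting for censored follow-up. A minimal Python sketch of the product-limit estimator is given below; the function name and the example follow-up data are illustrative assumptions, not the study's code or data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time per patient (for example, months from surgery)
    events : 1 if the event (recurrence) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)
        if observed:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases both leave the risk set
        i += leaving
    return curve


# Made-up illustrative follow-up data, not data from the study.
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

Curves computed separately for the metric-high and metric-low groups are what a log-rank test or a Cox model then compares.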
For comparative purposes similar analyses were done using the histomorphological features. As shown in Fig. 2A, only the TBC was statistically significantly associated with TTR.
Multivariate analyses identified both MF metrics as independent prognostic factors for TTR (Table 3A,B). Among the histomorphological features only tumor border configuration remained statistically significant in the multivariate analyses (Supp. Table 3). Interestingly, when the strongest histomorphological predictor and the strongest MF predictor were combined, both acted as independent prognostic factors (Table 3C).
Additional analyses were performed to test the prognostic value of MF metrics in a stage-dependent manner. As shown in Supp. Fig. 3, f(α) max preserved its prognostic strength in both stage groups, while α max acted as a strong prognostic factor in stage II but not in stage III. Together, these data identify MF metrics as independent prognostic factors for recurrence in stage II-III colon cancer, performing better than grade and budding. f(α) max showed the strongest link with outcome and was also significantly associated with survival in both stage II and III.

MF metrics define a patient group which benefits from adjuvant chemotherapy. We investigated whether morphological tumor features, as defined by MF metrics or the traditional histomorphological characteristics, were associated with benefit of adjuvant chemotherapy. The effects of adjuvant chemotherapy on TTR were analyzed in subgroups of patients defined by their morphological features.
Interestingly, adjuvant treatment significantly improved TTR in the subgroup with high α max but not in the low-score group (Fig. 3, left and middle part). These findings were further supported by the interaction test, which demonstrated a significant interaction between marker status and treatment for α max (Fig. 3, right part). The analyses of the images that included internal tumor structures yielded similar results (Fig. S4).
Notably, none of the histomorphological scores showed any significant interaction with adjuvant treatment (Fig. S5).

Discussion
The relationship between morphology and biological properties of tumors is well established and illustrated by the use of features such as grade, tumor border configuration (TBC) and tumor budding (TB) in colon cancer diagnosis. This study presents an alternative to these observer-dependent scoring methods by the use of multifractal analyses of scanned cytokeratin-stained tissue sections.

Figure caption (fragment): [...] illustrating TTR of stage II-III colon cancer treated with surgery alone after median-based dichotomization of cases based on the f(α) max metric. All MF metric-related analyses were performed with median-based dichotomization of cases into "metric-high" and "metric-low" groups.
Multifractal analyses have been used in earlier studies to characterize different aspects of tumor morphology, including nuclear morphology, tissue architecture as determined by binarized H&E stainings, and growth pattern/tumor-stroma interface identified by cytokeratin staining 12-14,17,23,24. Some of these studies have also linked these metrics to outcome and have suggested relationships to intrinsic aggressiveness of disease or response to chemotherapy 12,24. Notably, none of the earlier studies relied on analyses of samples from randomized studies, and they have therefore not been able to stringently distinguish between associations related to the natural course of the disease and those related to response to treatment. The present study is thus the first to stringently report on the ability of different MF metrics to act either as prognostic or as response-predictive markers in cancer.
The present analysis demonstrates that MF metrics act as independent prognostic markers. Among the visually defined morphological characteristics, tumor differentiation grade is clinically implemented as a prognostic factor but still belongs to the category IIA prognostic factors 7. TBC and TB have been extensively studied during past decades as prognostic factors, and the American Joint Committee on Cancer/UICC has recommended the assessment of TBC 9. However, the lack of established evaluation systems and poor reproducibility of the scoring have hampered practical implementation of TB and TBC 25,26. Our finding that the MF metrics perform as well as, or better than, differentiation grade, TB and TBC as prognostic markers should prompt further validation studies on MF metrics as prognosticators. Multivariable analyses including the MF metrics and the three visually defined features indicated that f(α) max and TBC acted as independent markers and could thus possibly be used together in future multiparametric risk scores.
This study also examined the potential ability of MF metrics to act as predictors of benefit of adjuvant 5-FU-based chemotherapy. Interestingly, median-based dichotomization of patients based on α max, or α max internal structure, identified patient groups which displayed differential benefit of treatment, such that the high-marker group showed significant improvement of CSS. Notably, none of the three visually defined histomorphological characteristics (tumor grade, TB and TBC) demonstrated any significant interactions with treatment.
Identification of markers predicting benefit of adjuvant chemotherapy is a highly active research area 1. Some candidate markers subject to ongoing validation include thymidylate synthase and MSI 27-30. Notably, significant interactions with treatment have not yet been demonstrated for any of these earlier candidate markers.
It is noted that the present study relies on a patient cohort where the chemotherapy regimen is different from the schedules currently in use. An important task for future studies will therefore be to validate the indications of response-predictive potential of MF metrics in other, more modern, tumor collections. However, none of these modern collections will provide an opportunity to analyze the effects on a surgery-alone group. Even if the absolute risks of recurrence have diminished due to improvements with time from the 1990s, the relative importance of the MF metrics reported here is likely robust.
Future studies should also aim to define the biology which is captured by the MF metrics. Such studies can possibly be done by collecting MF metrics from tumor collections which are well annotated with regard to molecular features such as gene-expression patterns and mutation profiles.
Digital pathology and whole slide imaging are presently becoming part of routine practice at many pathology departments. This development is encouraging for the present study since it lowers the technological threshold for implementation of MF metrics as routine biomarkers. Findings from the present study should also encourage other independent efforts to fully exploit the potential of digital image analyses for detection of clinically relevant biology impacting on prognosis and response to treatment.

Figure caption: Kaplan-Meier plots illustrating TTR of stage II-III colon cancer patients receiving surgery alone (red lines) or surgery together with adjuvant chemotherapy after dichotomization of the study cohort based on α max (upper part) or f(α) max (lower part). The log-rank test was used for statistical analyses. (Right part) Interactions between MF metrics and treatment were analyzed using a formal interaction test. All MF metric-related analyses were based on median-based dichotomization of cases into "metric-high" and "metric-low" groups.