Identification of patterns of tumour change measured on CBCT images in NSCLC patients during radiotherapy

In this study, we propose a novel approach to investigate changes in the visible tumour and surrounding tissues with the aim of identifying patterns of tumour change during radiotherapy (RT) without segmentation on the follow-up images. On-treatment cone-beam computed tomography (CBCT) images of 240 non-small cell lung cancer (NSCLC) patients who received 55 Gy of RT were included. CBCTs were automatically aligned onto planning computed tomography (planning CT) scan using a two-step rigid registration process. To explore density changes across the lung-tumour boundary, eight shells confined to the shape of the gross tumour volume (GTV) were created. The shells extended 6 mm inside and outside of the GTV border, and each shell is 1.5 mm thick. After applying intensity correction on CBCTs, the mean intensity was extracted from each shell across all CBCTs. Thereafter, linear fits were created, indicating density change over time in each shell during treatment. The slopes of all eight shells were clustered to explore patterns in the slopes that show how tumours change. Seven clusters were obtained, 97% of the patients were clustered into three groups. After visual inspection, we found that these clusters represented patients with little or no density change, progression and regression. For the three groups, the survival curves were not significantly different between the groups, p-value = 0.51. However, the results show that definite patterns of tumour change exist, suggesting that it may be possible to identify patterns of tumour changes from on-treatment CBCT images.


Introduction
With 1.82 million incidences in 2012, lung cancer is one of the most common types of cancer worldwide (Ferlay et al 2014). Also, lung cancer is the most common cause of cancer mortality, with 1.6 million deaths globally (Ferlay et al 2014). Depending on the patient's prognosis, different treatment modalities can be used to treat the disease, including surgery, radiotherapy (RT) and chemotherapy. According to Lambin et al (2017), personalising treatment for every patient can improve patient outcome.
Personalised treatment relies on medical imaging (European Society of Radiology 2015). Imaging plays an important role in the pathway of lung cancer management; it is used to diagnose the disease, in staging, to plan RT, assess treatment response and to verify patient's position before treatment using cone-beam computed tomography (CBCT) images. Information obtained from on-treatment images during RT is currently used to inform treatment adaptation, which in lung cancer patients is typically restricted to response to changes in the tumour position. Larger changes may have a dosimetric impact on the tumour coverage or organs at risk (OARs), and offline replanning can correct for these changes.
The majority of the published work on lung cancer has reported a correlation between the gross tumour volume (GTV) before the start of RT and overall survival using CT or PET/CT (Käsmann et al 2018). However, the results of GTV volume changes during RT were contradicting. For example, Jabbour et al (2015) and Brink et al (2014) suggested a correlation between the sub-groups of tumour volume change and overall survival. However, a recent study of 394 patients, found that three sub-groups of GTV volume change during treatment did not differ in their outcomes (Kwint et al 2020). Most studies used computed tomography or positron-emission tomography images at selected timepoints to assess tumour changes (Roengvoraphoj et al 2018).
Image-guided RT has become a standard of care, using CBCT to verify and correct a patient's position before treatment (Jaffray et al 2002). CBCT images are also used to guide decisions on the need for plan adaption during treatment due to anatomical changes. In addition, CBCT images could provide useful information on day to day changes of the tumour throughout treatment. This information can help distinguish the patterns of tumour change, enabling the possibility of early treatment adaption or providing information on how the tumour is responding to RT.
Drawing inspiration from (Buizza et al 2018, Astaraki et al 2019 who have developed methods to quantity intra-tumour heterogeneity by dividing the tumour into regions, in this study, we analyse on-treatment CBCT images for lung cancer patients, extracting information on density changes across the visible tumour and lung-tumour boundary during RT. These density changes are clustered to explore patterns that, we hypothesise, can identify the patterns of tumour change. Furthermore, the obtained clusters were correlated against outcome data to investigate if the clusters display any differences in overall survival.

Patients
A cohort of 240 non-small cell lung cancer (NSCLC) patients who received 55 Gy in 20 fractions with or without induction chemotherapy, treated between 1 January 2013 and 31 October 2018 at the Christie NHS Trust were collected. All images were collected with institutional approval (UKCat, REC reference 17/NW/0060). For each patient, an RT planning CT scan with delineations and all available on-treatment CBCTs were collected. RT planning and contouring were performed in Pinnacle v9-9.8 (Philips Healthcare, Andover, MA, USA) and all CBCTs were acquired on Elekta XVI system v5.52 and v5.69 (Elekta AB, Stockholm, Sweden). Patient demographics and clinical data were collected, including outcome data, summarised in table 1 of supplementary material (available online at https://stacks.iop.org/PMB/ 65/215001/mmedia). All patients involved in this study had a minimum of five CBCTs, including the first three fractions and all available CBCTs thereafter. No other discrimination was used in patient selection.

Image registration
CBCTs were automatically registered to the planning CT scan for each patient. To minimise potential registration uncertainties due to small changes in tumour volume across treatment, we used a two-step rigid alignment process. The first step aligned the vertebral bones, correcting gross rotation and translation set-up errors. For this, a clip box containing the spinal cord delineation was created, limited to the slices containing the planning target volume (PTV). The second step fine-tuned the registration by performing a translation-only soft tissue match based on a mask defined by the PTV. A clip box containing the PTV was created, ensuring that the soft tissue match includes the GTV and the surrounding area. Registrations were visually inspected to check for gross errors. Image registration was performed using World-match v9.0 (Wolthaus et al 2005).

Intensity normalisation of CBCTs
It is established that the imaging quality of CBCT is generally poor compared to CT due to the difference in scatter conditions. Therefore, the signal in each voxel does not directly represent the density of the object being imaged (Richter et al 2008). To equalize the distribution of intensity levels across CBCT images, intensity normalisation was performed. The approach used was adapted from van Timmeren et al (2017), who have demonstrated their methodology to be robust. A correction factor was calculated for each CBCT by defining a spherical region with a diameter of 2 cm at the centre of the heart delineation on CT. This region was rigidly transferred across all CBCT images, allowing the mean intensity from the CBCT in this region to be extracted. As the signal in the heart is driven by the blood volume, and it is expected to be in the range between 40 and 60 HU (Padilla et al 2010). Therefore, a reference value was set to 50 HU.
For quality control, the spherical region was visually verified to ensure that an area representative of the heart was being sampled in each CBCT. Next, to find the correction factor, the following formula was used: Scale factor = mean intensity in spherical region/50 HU.
Finally, the derived scale factor, for each CBCT, was multiplied with the intensity levels of the CBCT image to correct the intensity information. The extended GTV, excluding part of the extension that extends into the mediastinum. The masked ROI is the anatomical region that will be studied.

Creation of shells
Eight shells with a thickness of 1.5 mm were created in the area around the tumour-lung boundary, the shells extend 6 mm inwards and outwards of GTV boundary, see figure 1 of supplementary material. Evidence from surgical resection studies has provided information on the extent of microscopic disease spread, which is around 4 mm (Giraud et al 2000). Therefore, by extending the GTV ±6 mm, our approach was able to cover the expected extent of microscopic deposits for the majority of the patients.
To define the lung-tumour boundary, we first created a region of interest (ROI) based on the GTV contour expanded by 6 mm. The lung contour was then used to ensure the ROI did not extend into other tissue outside the lung, i.e. mediastinum, diaphragm or chest wall, see figure 1. Patients without a lung delineation were excluded from analysis. All shells were confined to this ROI. In the case of patients with small tumours, less than 6 mm in diameter, our approach will not allow for the creation of some inner shells. Hence, these patients were excluded from further analysis. We used 1.5 mm as the thickness of each shell because it approximates the voxel size in plane diagonal (1.4 mm) on CBCTs.
In terms of the image processing operations; in each voxel, the 3D Euclidean distance was computed on the expanded GTV mask to measure the separation of points in the image. Then, we applied a set of thresholds to the distance transform map; the first shell was between −6.0 and −4.5 mm, the second one between −4.5 and −3.0 mm, until the last shell which was between 4.5 and 6 mm.
The created shells were rigidly propagated across all CBCT images and used to calculate the mean intensity in each shell for each CBCT. Eight shells were created; shells 1-3 describe the changes inside the GTV, while shells 4-5 show changes at the lung-tumour boundary and shells 6-8 represent density changes in the surrounding lung tissue. World-match v9.0 (in-house software) was used to create the concentric shells.

Data analysis
For each shell, the mean intensity from each CBCT was extracted and plotted versus days since the start of RT. A linear fit was used to calculate the slope, indicating the predominant density change in each shell across treatment. The slopes of all eight shells were clustered to identify patterns of tumour changes. This part of the analysis was carried out using MATLAB version R2015b.
Affinity propagation, an unsupervised clustering algorithm, was used to explore clusters in the data. Unlike other clustering algorithms such as k-means, the affinity propagation algorithm does not require a predefined number of clusters; it identifies clusters in the data by sending messages between pairs of data points, until they converge in the same cluster (Dueck 2009). These messages represent the suitability of a pair to be an exemplar of another sample in the data. In our study, the slopes of all eight shells were used by the affinity propagation algorithm to partition the data points.
To visualise the clusters in 3D space, three simplified slopes were used; representing density change inside the target volume (shells 1-3), across the lung-tumour boundary (shells 4-5) and in the nearby lung tissue (shells 6-8). Figure 2 of supplementary material shows these simplified slopes for a single patient.
To confirm the affinity propagation clusters, sensitivity analysis was performed. For this the dataset was divided into k sets, k = 100. For each iteration, 10% of the data samples was left out and clusters regenerated using affinity propagation to examine the stabilities of the clusters. Finally, after the generation of all clusters, we compared the initial affinity propagation clusters to the clusters generated during the sensitivity analysis for each iteration. The adjusted mutual information and adjusted Rand score were used as metrics to measure the consensus and similarity between the initial clusters and the clusters generated in each iteration, respectively. The adjusted mutual information and adjusted Rand score, can range between 0.0 and 1.0, a 0.0 score is bad and 1.0 is perfect.
The clustering and cluster sensitivity analysis was performed in python version 3.6, using scikit learn cluster package.

Statistical correlation
Finally, we investigated whether different clusters display different clinical or survival characteristics. The clinical variables available for the patients included, Tumour, Node, Metastasis (TNM) stage, age, tumour volume, gender and tumour histology. The differences in TNM stage, gender and tumour histology between the different clusters was categorically tested using Pearson chi-square test. A univariate Kaplan-Meir survival analysis was performed to test whether different clusters show differences in their risk to treatment outcome. The analysed endpoint was overall survival, and all analyses were compared between the different groups. Furthermore, a multivariate Cox regression analysis adjusted for tumour volume was performed to test the potential impact of tumour volume on treatment outcomes between the groups. Other variables included in the Cox regression model were age, gender and TNM stage. The tumour volume and age are continuous variables, whereas gender and TNM stage are categorical.
All statistical analyses were performed using RStudio version 1.1.456, using the packages survival, survminer.

Patients
A total of 31 out of 240 patients were excluded from the analysis; 14 patients were excluded because they had less than five CBCT scans available, four patients had very small tumours (could not create inner shells), three patients did not have a delineated lung contour. An additional ten patients were excluded because of poor image registration between the CT and CBCTs based on visual inspection, see figure 3 of supplementary material for an example. Therefore, 209 patients remained for analysis. The image registration time was ≈1.1 min per patient.

Intensity normalisation of CBCTs
The mean intensity values in the ROI in the heart for CBCT images of all patients ranged between −242 and 306 HU, see figure 4 of supplementary material. This large inter-patient variation demonstrates the need for normalisation. Figure 5 of supplementary material summarises the distribution of intensity values in the GTV as observed on planning CT images, CBCT images before and after intensity normalisation for all patients. As expected and seen in figure 5 of supplementary material, the intensity levels on the original CBCTs are inconsistent compared to the intensity levels on planning CT images. After correction, the intensity distribution shows improved consistency (lower variation) across all CBCT images, ensuring the sampling of consistent information across CBCTs. Figure 2 shows the linear fits applied to the mean intensity values of each shell across treatment for an example patient. Figure 3 shows the distribution and range in the slopes showing density change in all shells for all patients. We observe that there is greater variability in the slopes in shell 1, as well as larger outliers. The ANOVA test between the 8 shells was significant (p-value < 0.05), we can conclude that there are significant differences between the means of the shells. The multiple pairwise comparison analysis using Tukey honestly significant difference (HSD) test indicated that; shells inside the tumour are significantly different to those outside the tumour, but that adjacent shells do not differ significantly. The full output from the analysis is shown in table 2 of the supplementary material. Majority of the slopes in all shells are negative, implying a general decrease in density across treatment.   (Tukey 1977). Outlier points are those beyond 1.5 interquartile range, and are represented by plus (+) signs.

Data analysis
Seven clusters were generated by affinity propagation, demonstrating particular patterns of tumour change during treatment. The majority of the patients were grouped into three clusters, see figure 4. On visual inspection, we determined these clusters contained patients showing tumour regression (n = 58) in cluster 7, patients with mild tumour progression (n = 14) in cluster 5 and patients showing little or no density change (n = 136) in cluster 2, represented by the blue, green and red colours respectively. The legend of figure 4 shows the number of patients in each cluster.
Exemplar patients for the three major clusters (two, five and seven) and exemplar patient in cluster 3 can be seen in figure 5. The exemplar patient in cluster 3 was included because of the massive change that occurred during treatment. Exemplar patients in clusters one, four and six are provided in figure 7 in supplementary material. These patients were identified by affinity propagation as outliers, patients who could not be associated with patients in other clusters. These outlier patients were grouped on their own because of the distinctive patterns as observed in the slope showing density change in each shell. Two outlier patients were grouped in cluster 1, while clusters three, four and six each had one patient. All these outlier patients indicate signs of tumour regression similar to patients in cluster 7.
The differences between these outlier patients and patients grouped in cluster 7 are; whereas patients in cluster 7 show a decrease in intensity in all shells across lung-tumour boundary, the shells inside the target volume of the outlier patients in cluster one and four indicate little or no density change. However, the same cannot be said about the outlier in cluster 6. The density change in each shell for this particular patient is similar to that observed in patients grouped in cluster 7.
The outlier patient in cluster 3 shows interesting changes, see figure 5; the shells outside target volume show a decrease in intensity while the shells inside the target volume become brighter. These changes are caused by partial collapse of the lung that reappeared later during treatment.
After affinity propagation, sensitivity analysis was performed. The results of sensitivity analysis, with k = 100, is shown in figure 6 of supplementary material. As seen, in 78 out of 100 iterations, the adjusted mutual information and adjusted Rand scores is above 75%. Implying that there is a high consensus and similarities between the initial and sensitivity analysed affinity propagation clusters. However, in 22 out of 100 iterations both the adjusted mutual information and adjusted Rand scores were poor, ranging between 0.32-0.68 and 0.11-0.68, respectively. This could be as a result of leaving out key exemplar data points.

Statistical correlation
Outcome data of four patients could not be found in the database and were therefore excluded from further analysis. Additionally, the five outlier patients were excluded from further analysis. Therefore, 200 patients were available for statistical analysis, see table 1 of supplementary material for patient demographics. The Pearson chi-square tests between the three major clusters, gender or tumour histology, achieved p-value of 0.8476 and 0.5839 respectively, suggesting that these variables are independent. Age, N-stage and M-stage did not show any statistically significant differences between the clusters (p-value > 0.05). However, the Pearson chi-square test between the three main clusters and T-stage, achieved p-value = 0.0072, implying that there is an association between tumour stage and our clusters. As seen in the corresponding rows in table 1 of supplementary material majority of the patients with T-stage 4 were grouped in cluster 7 representing patients with regressing tumours. Regressing tumour tended to be larger than progressing tumours, with the mean GTV volume 83.17 cm 3 compared to the 35.92 cm 3 observed in the patients in the progressing group. The cross classification of the clusters can be found in table 1 of supplementary materials. Finally, the Kaplan-Meier curves, including the three major clusters, are shown in figure 6. There are no significant differences in overall survival between the clusters, p-value = 0.51. The median survival for the little/no change, regression and progression clusters is 502, 450 and 1033 d, respectively. Also, the Cox regression model adjusted for tumour volume did not show any significant differences in overall survival between the major groups. The lack of significant differences in survival between the groups could be as a result of a small dataset. Figure 8 in supplementary material shows the scaled Schoenfeld residuals plot for the covariate clusters (which is a three-level factor) from the multivariate Cox regression model, as seen there is no pattern with overall survival time between the three groups.
Table1 shows the results of the proportional hazard test for each of the covariate included in the Cox regression model adjusted for tumour volume. As seen, the results are not statistically significant for each of the covariates, and the global test is also not statistically significant, p-value = 0.58. Therefore, we can assume the proportional hazards.

Discussion
In this study, we have presented a novel approach for investigating changes at the boundary of lung cancer tumours during RT by analysing density changes over time in shells across the lung-tumour boundary. The slopes of density change in all shells were used to cluster the patients into groups (groupings displaying similar characteristics). Three main clusters were found showing different patterns of tumour changes, representing little or no density change, tumour regression and progression.
Tumour changes in lung cancer patients who underwent RT have been investigated using different imaging modalities such as CT, or CBCT (Knap et al 2010, Berkovic et al 2017. Knap et al (2010) analysed daily CBCTs of 20 lung cancer patients treated with chemo-radiation to determine tumour change, and observed significant tumour shrinkage in one-third of all patients from planning CT to last CBCT (end of RT). Berkovic et al (2017) implemented an adaptive RT strategy assessing tumour and lymph nodes volume changes on CBCTs of 41 NSCLC patients who underwent 15-20 fractions of chemo-radiation. The study used the observed changes to determine potential dosimetric benefits for OARs while assessing the risk of underdosage of microscopic diseases. They reported an average tumour volume shrinkage of 1.6% per day, which could potentially be translated into a significant decrease in dose to OARs using adaptive RT.
Despite the small cohorts used, the above-discussed studies have found a constant rate of decrease in tumour volume during treatment. However, Woodford et al (2007)observed varying GTV volume changes in 17 NSCLC patients, ranging between 12% and 87%. Furthermore, after analysis, these patients were divided into three groups, representing; sharp decrease in tumour volume, steady decrease and no clear sign of tumour volume decrease. Also, in another study tumour progression was observed in 5% of the patients (Brink et al 2014). Our study supports this, we have found different patterns of tumour changes; little or no density change, tumour progression and tumour regression. In our study, 28% of the patients showed signs of tumour regression, representing mild and sharp GTV volume decrease. Tumour progression was observed in 7% of the patients.
After performing a sensitivity analysis, we found a high similarity and agreement (>75%) between the initial affinity propagation clusters and those generated during sensitivity analysis (in 78 out of the 100 iterations) indicating that our clusters are stable, and represent distinct patterns of tumour change during RT. Furthermore, during sensitivity analysis we have demonstrated that excluding patients with large changes does not affect other clusters because the patterns of change as observed in the slopes of density change are different. Therefore, our approach can capture distinct changes across patients in our cohort.
Surprisingly, the Kaplan-Meier curves between the major groups show no significant differences, indicating there are no significant differences in overall survival between the three major clusters. This may indicate that tumour response during treatment is not a major driver of overall survival, other variables might be more important post-treatment. The lack of differences in survival could be explained by the small cohort (in terms of survival analysis), as only 200 patients were included in the survival analysis. Our results agree with Kwint et al (2020) who found that distinct subgroup of GTV change during treatment in a cohort of 394 NSCLC patients did not result in different outcomes. Although we expected tumours of different histologies to change differently, our results could not confirm this. However, we found that T-stage and tumour size were statistically significant between the clusters.
Throughout literature, several challenges have been highlighted when analysing CBCT; quality of CBCT is not optimal, due to artefacts, it is worse than that of CT (Scarfe and Farman 2008), which can often lead to uncertainties during treatment adaptation. Pre-processing steps were adopted; normalisation of information on CBCT and two-step rigid registration. We showed that after intensity normalisation of CBCTs, image consistency was improved. Furthermore, after visual inspection, the two-step registration proved to have worked well in the majority of the patients.
Image registration is another factor that could have affected the results. To deal with these uncertainties, different fits were explored, but we opted for linear fits because they are easy to quantify and their interpretation is straightforward. Linear fits were used to compensate for some of the errors that could have emerged as a result of image registration, i.e. spatial and geometric errors. At this stage, we cannot quantify these errors. Other factors that could potentially have affected the results include noise, variability in repeated CBCT images and motion artefacts because all images used are 3D CBCT acquired in free breathing.
Additionally, the exclusion of small tumours (<3 mm radius) could have biased the results. However, the aim of the study was to investigate density changes across lung-tumour boundary. Thus, creating shells inside the target volume, at lung-tumour boundary and surrounding lung tissue was necessary to capture changes happening to the visible tumour and surrounding tissue during RT. Accounting for changes in different parts of the tumour and the surrounding tissue (more information) makes the shell approach better than previous work, which mainly focuses on visible tumour shrinkage. Additionally, our approach does not require segmentation on the follow up images, and calculating the mean in each shell makes our approach more robust to small registration errors. Yet, we are collapsing the 3D spatial information in each shell down to a mean, a single value, losing spatial information.
Hence, we need better ways to capture and maintain this spatial information. The spatial information is critical for the full understanding of local tumour changes as it will undoubtedly be the case that different aspects of the tumour will change in different ways (O'Connor et al 2015, Wu et al 2016, i.e. one part may regress, while another part may progress or remain unchanged. As part of the future works, we will explore different techniques that can include spatial and temporal information, which will better enable us to accurately distinguish patterns of tumour change during RT.
This study has shown that particular clusters representing unique patterns of tumour change during RT exist. If we can accurately identify patterns of tumour response, better treatment adaption can be performed to increase dose to the target volume and spare surrounding healthy tissue.

Conclusion and future works
Using this novel approach, three groups representing distinct patterns of tumour response during RT were identified using changes in intensities around lung-tumour boundary across CBCT images. Surprisingly, we found that progressing tumours tended to be smaller compared to regressing tumours. However, the three groups of tumour response patterns did not show a significant difference in overall survival.