Predicting radiation treatment planning evaluation parameter using artificial intelligence and machine learning

Purpose: This study proposes a new method for predicting a dose-volume parameter for radiation treatment planning evaluation using machine learning, and evaluates the performance of different learning algorithms in the parameter prediction. Methods: The dose distribution index (DDI) was calculated for fifty prostate volumetric modulated arc therapy plans and compared with the values predicted by machine learning using four families of algorithms, namely linear regression, tree regression, support vector machine (SVM) and Gaussian process regression (GPR). Root mean square error (RMSE), prediction speed and training time were determined to evaluate the performance of each algorithm. Results: The squared exponential GPR algorithm had the smallest RMSE (0.0038), together with a relatively high prediction speed (4,100 observations/s) and a short training time (0.18 s). All linear regression, SVM and GPR algorithms performed well, with RMSE in the range 0.0038–0.0193. However, the RMSE of the medium and coarse tree regression algorithms was larger than 0.03, showing that they are not suitable for predicting DDI in this study. Conclusion: Machine learning can be used to predict dose-volume parameters such as DDI in radiation treatment planning QA. Selecting a suitable machine learning algorithm is important for determining the parameter effectively.


Introduction
In radiotherapy, a dose delivery plan is created by a radiation dosimetrist and a medical physicist. This plan specifies how to irradiate a volume of interest using a dose delivery technique. To guarantee that a highly conformal dose is deposited at the target while the nearby organs-at-risk (OARs) are spared, a quality assurance (QA) program is needed to evaluate the dose plan and ensure that it fulfills the minimum standard [1][2][3]. For convenience in plan evaluation, the dose distribution in an irradiated volume such as the target or an OAR is expressed as a dose-volume histogram (DVH). A DVH is a histogram relating absorbed dose to the irradiated volume in treatment planning, and is useful for calculating different plan evaluation parameters [4][5][6]. These dose-volume and radiobiological parameters are based on medical physics, radiobiology and mathematical expressions reflecting the merit of the dose distribution of a treatment plan in QA [7][8][9][10][11].
The dose distribution index (DDI) is a relatively new dose-volume evaluation parameter proposed to help decision-making in radiation treatment planning QA [12]. This index consolidates the dose coverage, conformity and homogeneity for the planning target volume (PTV), OAR, and remaining volume at risk (RVR) into one value. DDI has been shown to be an effective and straightforward way to evaluate a treatment plan in a radiotherapy delivery program [12,13]. However, calculating the DDI is still complicated because it involves a number of DVHs from the targets and OARs. This is an issue especially when many potential plans are generated for a treatment and require fast evaluation [14,15]. Since there are already many studies on dose prediction for treatment planning using machine learning algorithms [16][17][18], we addressed this problem by using machine learning, trained on the related DVHs, to predict the DDI value [19][20][21].
This Science Note focuses on machine learning applications after the dose and DVH calculation in radiation treatment planning. The aim is to confirm that machine learning algorithms can efficiently predict dose-volume parameters such as DDI in treatment planning QA. Prostate volumetric modulated arc therapy (VMAT) plans were used as the data for machine learning [21,22].

Dose distribution index
The DDI is given in [12] by equation (1), where W_T, W_O and W_R are the weights assigned to the PTV, OAR and RVR, and I_T, I_O and I_R are the components that quantify the plan quality on the PTV, OAR and RVR. These components, which describe the dose coverage and conformity of the PTV (I_T), OAR (I_O) and RVR (I_R) in equation (1), are defined in equations (2)-(4), in which the integrals over dose (dD) represent the areas under the DVH curves of the PTV, OAR and RVR, respectively. In equation (2), PTV is the planning target volume, D_p is the prescribed dose, D_m is the maximum dose received by the entire PTV and D_M is the maximum dose received within the PTV. In equations (3) and (4), N is the number of OARs, and W_Oi is a non-negative weight that reflects the relative clinical importance of the i-th OAR.
DDI can be determined solely from the prescribed dose and the DVHs of the PTV, OAR and RVR involved [12]. All this information is usually available from the computerized treatment planning system. The contributions of the PTV, OAR and RVR to conformity and homogeneity are estimated and then combined in a weighted sum. The weights are set to correspond to the relative importance assigned to each volume in the radiation treatment planning process.
Instead of evaluating a treatment plan by calculating separate evaluation parameters from the DVHs of the targets and OARs [7,9,10], DDI simplifies the QA process by providing a single value based on a group of DVHs. However, determining the DDI requires all the corresponding DVHs of the targets, OARs and RVR in the treatment plan. Since there are typically 500-800 dose-volume points in one DVH and generally 5-10 DVHs to be considered in a DDI calculation, it is worthwhile to investigate whether it is more efficient to use machine learning to predict the DDI.
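The DDI components are described above in terms of the area under each DVH curve. The following is a minimal sketch of that one building block only, not of the DDI formula in [12]: it estimates the area under a sampled cumulative DVH with the trapezoidal rule. The function name, the sampling and the toy values are assumptions made for illustration.

```python
def area_under_dvh(doses, volumes):
    """Trapezoidal estimate of the area under a sampled cumulative DVH.

    doses   -- increasing dose values (e.g. in Gy); hypothetical sampling
    volumes -- fractional volume receiving at least each dose (1.0 down to 0.0)
    """
    if len(doses) != len(volumes):
        raise ValueError("doses and volumes must have the same length")
    area = 0.0
    for i in range(1, len(doses)):
        # Each pair of neighbouring samples contributes one trapezoid.
        area += 0.5 * (volumes[i - 1] + volumes[i]) * (doses[i] - doses[i - 1])
    return area


# Toy DVH (invented values): full coverage up to 10 Gy, falling linearly to zero at 20 Gy.
toy_area = area_under_dvh([0.0, 10.0, 20.0], [1.0, 1.0, 0.0])
```

For a cumulative DVH expressed in fractional volume and starting at zero dose, this area equals the mean dose of the structure, which gives a quick sanity check on the toy data (15 Gy here). With 500-800 points per DVH and 5-10 DVHs per plan, a direct calculation repeats such sums over several thousand samples per plan.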
In this study, fifty prostate VMAT plans from the Grand River Hospital were used. The dual-arc technique was used with 6 MV photon beams [23]. The DVHs of the targets and OARs in each plan were determined to calculate the DDI as per equation (1).
Workflow of machine learning
The Regression Learner app in MATLAB's Statistics and Machine Learning Toolbox was used in this study. The workflow of machine learning is shown in figure 1.

Data collection and preprocessing
The radiation treatment plan data consisted of the DVH of the PTV and of each OAR. The OARs in these plans included the rectum, bladder, left femur and right femur. In data preprocessing, each DVH was transformed into features D_x, defined as the maximum dose received by x percent of the relative volume. The number of features required was tested iteratively, and D_100%, D_75%, D_50%, D_25% and D_M, as shown in figure 2, were found to be adequate and efficient for predicting the DDI value. D_M is the maximum dose of the DVH.
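The feature extraction described above can be sketched as follows. This is an illustrative reading of the preprocessing step, not the code used in the study: it assumes a cumulative DVH sampled at increasing doses, with relative volume in percent starting at 100, and uses linear interpolation between samples. All function names and the toy DVH are hypothetical.

```python
def dose_at_volume(doses, volumes, x):
    """Return D_x, the highest dose received by at least x percent of the volume.

    doses   -- increasing dose values (e.g. in Gy)
    volumes -- cumulative relative volume in percent, non-increasing from 100
    """
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")
    for i in range(1, len(doses)):
        v0, v1 = volumes[i - 1], volumes[i]
        if v1 < x <= v0:
            # The curve crosses x inside this interval: interpolate linearly.
            d0, d1 = doses[i - 1], doses[i]
            return d0 + (v0 - x) * (d1 - d0) / (v0 - v1)
    return doses[-1]  # the sampled curve never drops below x


def max_dose(doses, volumes):
    """Return D_M, the highest sampled dose that still has non-zero volume."""
    return max(d for d, v in zip(doses, volumes) if v > 0)


def dvh_features(doses, volumes):
    """Reduce one DVH to the five features D_100%, D_75%, D_50%, D_25% and D_M."""
    features = [dose_at_volume(doses, volumes, x) for x in (100, 75, 50, 25)]
    features.append(max_dose(doses, volumes))
    return features


# Toy DVH with invented values (dose in Gy, volume in percent).
toy_doses = [0, 10, 20, 30, 40, 50, 60, 70, 80]
toy_volumes = [100, 100, 100, 100, 90, 50, 10, 0, 0]
toy_features = dvh_features(toy_doses, toy_volumes)
```

Applied to the PTV and the four OARs, this reduces each plan from several hundred dose-volume points per structure to five values per structure, which then serve as the predictors for the regression models.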

Model selection, training and evaluation
Supervised regression machine learning algorithms were found to be suitable for representing the response (figure 1) [24]. The algorithms used in this study were linear regression [25], tree regression [26], SVM [27,28] and GPR [29]. They were examined for their performance in predicting the DDI value. In the training process, k-fold cross-validation was carried out [30]. In k-fold cross-validation, the dataset is divided into k groups. Each group takes a turn as the held-out fold while the other k − 1 groups act as the training data. The overall performance is the average over all folds. k-fold cross-validation is useful because no separate subset of the data needs to be permanently held out for testing, which makes it particularly beneficial for a small training dataset. A 4-fold cross-validation was used in this study. The root mean square error (RMSE) was used to evaluate the performance of all models.
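The cross-validation procedure described above can be illustrated with a short sketch. It is not the Regression Learner implementation: a one-feature least-squares line stands in for the regression models, the folds are contiguous rather than shuffled, and all names and data are hypothetical.

```python
import math


def k_fold_indices(n, k):
    """Split the indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (stand-in model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def cross_validated_rmse(xs, ys, k=4):
    """Average the held-out RMSE over k folds."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        predictions = [slope * xs[i] + intercept for i in held_out]
        scores.append(rmse([ys[i] for i in held_out], predictions))
    return sum(scores) / k
```

With fifty plans and k = 4, `k_fold_indices(50, 4)` gives folds of 13, 13, 12 and 12 plans, so every plan is used for validation exactly once and for training three times. In practice the plans would normally be shuffled before being split into folds.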

Results
Plots of actual versus predicted DDI values, obtained from the different machine learning models trained with the optimal number of features, were used to identify trends in the predictions. The actual versus predicted plots for the linear regression, tree regression (fine, medium and coarse), SVM (linear, quadratic and cubic) and GPR (exponential, Matérn 5/2, rational quadratic and squared exponential) models are shown in figures 3-6, respectively.
The RMSE, R-squared, prediction speed and training time for the linear regression, tree regression, SVM and GPR models are shown in table 1. They are ordered from the best performance to the worst.

Discussion
From the results in figures 3-6 and table 1, all the machine learning algorithms examined predicted the DDI reliably except the tree regression models. The tree regression models (figure 4) were not precise enough for a reliable prediction of DDI. The GPR models (figure 6) were the most accurate of all. Their predictions almost matched the calculated results, with RMSE smaller than 0.004 (table 1).
In the linear regression model, the largest coefficient was that of D_50% for the PTV. This means that the D_50% of the PTV was a strong predictor of the DDI. The left and right femur had large coefficients for D_100%, but their contribution to the overall DDI was small since the D_100% values for the OARs were small. The tree regression was not accurate enough because the resolution of the tree was not fine enough to distinguish DDI values (figure 4). On the other hand, all SVM models performed well (figure 5), with the linear SVM (figure 5(a)) being the best. The SVM models were most accurate when around 5 dose-volume points were extracted from each DVH. For the quadratic (figure 5(b)) and linear (figure 5(a)) SVM, the accuracy decreased as the number of features was increased. A possible explanation is that the increasing number of features made it difficult for the SVM to focus on the features that were most predictive. The GPR algorithms produced the most accurate models (figures 6(a)-(d)). The RMSE values for the squared exponential, Matérn 5/2 and rational quadratic GPR were very similar, as shown in table 1. In terms of computing time, our MATLAB program took about 5 s per treatment plan to determine the DDI from the DVHs, whereas the machine learning method took less than one second. This time advantage is significant when hundreds of treatment planning options need to be compared for a patient.
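As an illustration of the GPR models discussed above, the sketch below computes the posterior mean of a Gaussian process with a squared exponential kernel. It is a simplified stand-in, not the MATLAB implementation used in the study: the kernel hyperparameters and noise level are fixed by hand instead of being fitted, a constant mean is assumed, and the training data are invented.

```python
import math


def sq_exp_kernel(a, b, length_scale=1.0, signal_std=1.0):
    """Squared exponential covariance between two feature vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return signal_std ** 2 * math.exp(-sq_dist / (2.0 * length_scale ** 2))


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gpr_fit(train_x, train_y, noise_std=1e-3):
    """Return the weights alpha = (K + noise^2 I)^-1 (y - mean) and the mean."""
    n = len(train_x)
    mean_y = sum(train_y) / n
    gram = [[sq_exp_kernel(train_x[i], train_x[j]) + (noise_std ** 2 if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    alpha = solve(gram, [y - mean_y for y in train_y])
    return alpha, mean_y


def gpr_predict(x_new, train_x, alpha, mean_y):
    """Posterior mean at x_new: mean plus kernel-weighted sum over training points."""
    return mean_y + sum(a * sq_exp_kernel(x_new, xi) for a, xi in zip(alpha, train_x))


# Invented one-feature training set: a dose feature (arbitrary units) and a DDI-like response.
toy_x = [[0.0], [1.0], [2.0], [3.0]]
toy_y = [0.90, 0.92, 0.95, 0.93]
toy_alpha, toy_mean = gpr_fit(toy_x, toy_y)
toy_prediction = gpr_predict([2.0], toy_x, toy_alpha, toy_mean)
```

Because the assumed noise level is small, the prediction at a training point nearly reproduces its target value. The other GPR variants compared in table 1 differ only in the kernel function that replaces `sq_exp_kernel`.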

Conclusions
Machine learning was used to predict the DDI from the DVHs of prostate VMAT treatment plans using different models, namely linear regression, tree regression, SVM and GPR. All the examined models were precise in their predictions of DDI, except the tree regression models. It is therefore possible to predict a dose-volume parameter from DVH data using machine learning.