Machine Learning Techniques Applied to Dose Prediction in Computed Tomography Tests

Patients exposed to radiation from computed axial tomography (CT) scans face a greater risk of developing tumors or cancer caused by cell mutation in the future. A lower dose level would decrease the number of these possible cases. However, reducing the dose can prevent medical specialists (radiologists) from detecting anomalies or lesions. This work explores a way of addressing both concerns, reducing unnecessary radiation without compromising the diagnosis. We contribute a novel methodology in the CT area to predict the precise radiation that a patient should be given to accomplish this goal. Specifically, from a real dataset composed of the dose data of over fifty thousand patients classified into standardized protocols (skull, abdomen, thorax, pelvis, etc.), we eliminate atypical information (outliers) and then generate regression curves employing diverse well-known Machine Learning techniques. As a result, we have chosen the best analytical technique per protocol; a selection thoroughly carried out according to traditional dosimetry parameters to accurately quantify the dose level that the radiologist should apply in each CT test.


Introduction
Nowadays, one of the most frequently performed medical tests is the so-called computed axial tomography (CT), which obtains a precise image of the human body using X-rays [1]. By means of this non-invasive technique, radiologists are able to observe anomalies or lesions in patients without resorting to invasive procedures. This is the reason why the number of CT-based tests has grown enormously in recent years [2]. However, an inadequate dose of X-rays delivered to the human body over time could result in serious diseases, even increasing the risk of cell mutation, which can lead to the proliferation of tumors. In addition, the younger the patient is, the greater the probability that he/she could develop cancer, due to increased cellular activity [3].
Keeping the well-known risks of computed tomography-related radiation in mind, there is another factor to take into account: the clarity of the image. The higher the dose applied to the body, the higher the quality of the image, which makes it easier for radiologists to make diagnoses. Thus, adjusting the dose received by a patient is mandatory to allow the detection and accurate identification of lesions or diseases, even in their earliest stages, while minimizing the possible risks for patients. For instance, a well-known practice among radiologists is ALARA (As Low As Reasonably Achievable) [4]. This practice consists of keeping the radiation exposure from computed tomography scanning as low as reasonably achievable.

Materials and Methods
The study was approved by the Ethics Committee for Clinical Research, which accepted the waiver of the requirement to obtain patient informed consent.

Data Collection
This work is supported by dosimetry data collected from thirteen CT scanners operating in eight hospitals in the Region of Murcia (Table 1) between May 2015 and December 2017. The total number of analyzed exams was 58,571. In addition to the diagnosable image, the data attached to the exams include information regarding the gender, the type of protocol (listed in Table 2), the dose received by each patient in terms of the CTDIvol and SSDE metrics (if applicable), the type of phantom, the CT employed, the age of the patient, and the BMI metric for those exams carried out from 2015 to 2017. The DRL figure associated with the CTDIvol metric is also included in this work. DRLs are obtained through dosimetry studies carried out in different countries; a summary of DRL values per country/protocol can be found in [20,21], from which we have selected the most representative ones. Matlab (version R2017a), together with its Statistics and Machine Learning Toolbox, is the mathematical tool selected to manage and analyze the entire dataset. The simulations were executed on a computer with an Intel Core i7-8700K processor and 16 GB of RAM.

Methodology
To achieve an appropriate prediction of the dose to be radiated in patients, it is necessary to carry out the following steps (see Figure 1):

1.
From all of the CT tests, the radiology team sets the diagnosable images by protocol. The remaining images, the non-diagnosable ones, are not considered in our work.

2.
A CT test contains a large amount of information (sequence of images, patient data, physical magnitudes, etc.). From all of it, we extract the following parameters of interest: CTDIvol, BMI, and SSDE (if the CT has this feature). These parameters comprise the dataset.

3.
It is necessary to discard the data that can jeopardize the prediction of future results. This phase consists of removing this unrepresentative information, which is also called outliers.

4.
Applying different well-known Machine Learning techniques to each protocol, we obtain the regression curve that best fits the data.

5.
In the decision-making process, we select the best ML technique, employing an objective metric, the root mean square error (RMSE), along with the computational cost.

6.
Once the ML technique is selected, a precise CTDIvol value is calculated from the regression curve, taking the BMI (or SSDE) as the input parameter. This CTDIvol value is, a priori, the new calculated dose to deliver to the patient.


Removing Outliers
Once the data are gathered, those considered atypical (outliers), that is, values denoted as errors or as unrepresentative, are eliminated from the population (in this work, the sample values match the population). There are diverse techniques for doing this [24], which are applied according to the distribution of the sample and the percentage of data to be removed.
If the data are treated as separate variables, outliers are eliminated using the univariate method. Figure 2 shows the boxplot diagram proposed by Tukey in 1977 [25], in which the distribution of a set of data is observed and different regions are identified from their statistical information. As Figure 2 shows, we define the interquartile range (RIC) as the difference between Q3 (third quartile or 75th percentile) and Q1 (first quartile or 25th percentile). In this way, if we want to remove extreme values, those greater than Q3 + 3·RIC and those less than Q1 − 3·RIC will be eliminated. If we intend to be more restrictive, data higher than Q3 + 1.5·RIC and lower than Q1 − 1.5·RIC can be eliminated from our samples. However, the univariate method has a clear drawback: the data removed are located at the ends of each variable, and outside of these zones there could be undetected outliers.
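The univariate method can be sketched as follows. This is a Python illustration rather than the Matlab code used in the paper, and the quartile estimate is deliberately simplified (no interpolation between sorted values):

```python
def tukey_fences(values, k):
    """Return the (low, high) cut-offs Q1 - k*RIC and Q3 + k*RIC for a 1-D sample."""
    s = sorted(values)
    n = len(s)
    q1 = s[n // 4]          # crude 25th percentile (interpolation omitted)
    q3 = s[(3 * n) // 4]    # crude 75th percentile
    ric = q3 - q1           # interquartile range (RIC in the paper's notation)
    return q1 - k * ric, q3 + k * ric

def remove_univariate_outliers(values, k=3.0):
    """Keep only values inside the fences; k=3 removes extremes, k=1.5 is stricter."""
    lo, hi = tukey_fences(values, k)
    return [v for v in values if lo <= v <= hi]
```

With k = 3 only extreme values fall outside the fences; lowering k to 1.5 widens the set of removed points, matching the two fence levels described above.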
Another way of eliminating outliers is to employ the multivariate method. This is ideal for detecting outliers in internal areas of low data density, i.e., regions whose data are not very representative of the whole sample.
To detect and remove outliers, we apply density-based spatial clustering of applications with noise (DBSCAN) [26]. In this technique, the epsilon (Euclidean distance) and MinPts (threshold) parameters must be defined beforehand; the algorithm then operates as follows: for each data point, the number of neighbors within distance epsilon is quantified. If that number exceeds the established threshold (MinPts), the point and its neighbors are included in a cluster, as are the neighbors of those points that fulfill the same condition. The iterative process continues until all of the data are checked and a cluster of connected data is established. On the other hand, if the number does not exceed the value of MinPts, the point is considered to be noise and is therefore eliminated from the sample (see Figure 3). The goal of both techniques is to eliminate the smallest number of data, trying not to exceed 5% of the total sample. Once the atypical data have been removed, the regression techniques described in Section 4.2.2 are applied to the remaining data.
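A minimal sketch of DBSCAN's noise-removal logic is given below. It is hand-rolled in Python for illustration only (the function name and structure are ours); production code would rely on an optimized library implementation:

```python
from math import dist  # Euclidean distance (Python >= 3.8)

def dbscan_filter(points, eps, min_pts):
    """Split points into (kept, noise) with a minimal DBSCAN-style pass:
    core points have at least min_pts neighbours within eps (counting
    themselves); border points lie within eps of a core point; everything
    else is noise and is removed from the sample."""
    n = len(points)
    neighbours = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = [len(nb) >= min_pts for nb in neighbours]
    kept, noise = [], []
    for i, p in enumerate(points):
        if core[i] or any(core[j] for j in neighbours[i]):
            kept.append(p)  # core or border point: stays in the sample
        else:
            noise.append(p)  # isolated point: flagged as noise
    return kept, noise
```

A dense group of points survives the pass, while an isolated point with fewer than MinPts neighbors inside epsilon is dropped, which is exactly the behavior described above.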
The data are separated into two groups, called training and test, in order to obtain a good regression model [27]. The cross-validation technique will be used, which divides the sample into k groups of data. One of them will be used as a test and the rest for training.
This process is repeated during k iterations, with each time using a different group as test, without repetition. The value of k will vary depending on the number of available data. The value of 10 is the one that is selected for this work.
The regression algorithm will be fed with the data allotted to training, fitting a curve to these data and obtaining the mathematical model as a result. For each iteration, the error E_k is estimated; the mean of all the errors yields the total error E, as shown in Figure 4. After training the algorithm and obtaining the model, we verify its performance by means of new data that have not been employed in the training process. We use the test data for this purpose.
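The cross-validation procedure above can be sketched as follows (a Python illustration with caller-supplied `fit`/`predict` functions; the paper's experiments used Matlab's toolboxes):

```python
from math import sqrt

def kfold_cross_validation(data, k, fit, predict):
    """k-fold cross-validation: split `data` into k folds, use each fold once
    as the test set and the rest for training, and return the per-fold RMSE
    values E_k together with their mean, the total error E."""
    folds = [data[i::k] for i in range(k)]  # simple round-robin split
    fold_errors = []
    for i in range(k):
        test = folds[i]
        train = [pair for j, fold in enumerate(folds) if j != i for pair in fold]
        model = fit(train)
        sq = [(predict(model, x) - y) ** 2 for x, y in test]
        fold_errors.append(sqrt(sum(sq) / len(sq)))  # E_k: RMSE on the held-out fold
    return fold_errors, sum(fold_errors) / k         # (E_1..E_k, E)
```

With k = 10, as selected in this work, each observation is used exactly once for testing and nine times for training.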
If the error on the test data is much greater than the error on the training data, the model is overfitting the training data, diminishing its generality for the test set. The reason is the following: the algorithm extracts a large amount of information from the training data, deriving a complex model capable of precisely adjusting its predictions. This model can include noise or random fluctuations owing to the great number of data. When assessing new data (the test set), a smaller number of samples is used, which implies a low probability of the same noise/random patterns appearing. The result is a clear deterioration in the performance of the model. This issue has a negative effect on the precision of the predictions, to the point of making the model unfeasible for the problem contemplated here.


ML Techniques
Machine learning (ML) algorithms, as a subfield of artificial intelligence (AI), have been providing effective solutions to engineering applications and scientific problems for many decades. ML methods have the ability to adapt to new conditions and to detect and estimate patterns. To this end, ML conceives self-learning algorithms that derive knowledge from data in order to carry out system predictions. ML provides a suitable solution that captures the knowledge present in data and gradually enhances the performance of predictive models built to analyze large amounts of data. The main goal is to make the best decisions, or to take the best actions, based on these predictions.
ML is divided into three categories: supervised learning, unsupervised learning, and reinforcement learning. In this work, we focus our attention on supervised learning techniques, since they allow a model to learn from training data to make predictions about unseen or future data. In supervised learning, the input data are defined by labels (such as, for instance, mail labels) or raw data. One of the subcategories of supervised learning is regression analysis, which addresses the prediction of continuous results from labels/raw data. Given a set of variables x, named predictors, and their corresponding response variables y, we can fit a curve (the simplest is a straight line) to these data that minimizes the distance between the sample points and the fitted linear/non-linear graph. The combination of supervised learning and regression analysis fits the requirements of this work, given the nature of the data to be analyzed, allowing us to predict the continuous outcomes of target variables.
In regression analysis techniques, the scientific literature presents different approaches that are useful in the massive analysis of data (Big Data). Furthermore, these techniques help in the forecasting of future doses to be received by patients, which is a distinctive objective of this work.
Regarding regression analysis models, we will concentrate our efforts on specifying (data selection), accommodating (eliminating outliers and anomalous points), and analyzing our large amount of CT exam data by using the following models:

Linear Regression.
This technique consists of finding a line that fits a data set following a certain criterion. The most common criterion, which will also be employed in this work, is least squares adjustment [28].
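As a minimal illustration of the least squares criterion, the closed-form slope and intercept can be computed as below (a Python sketch; the function name is ours, not from the paper's Matlab code):

```python
def least_squares_line(points):
    """Fit y = a*x + b by minimising the sum of squared residuals.
    Closed form: a = cov(x, y) / var(x), b = mean(y) - a * mean(x)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cov = sum((x - mx) * (y - my) for x, y in points)
    var = sum((x - mx) ** 2 for x, _ in points)
    a = cov / var
    return a, my - a * mx
```

For data lying exactly on a line, the fit recovers the slope and intercept; for noisy data, it returns the line minimizing the squared vertical distances.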

Decision Tree Learning.
This scheme breaks down our data by making decisions based on a series of questions. In particular, in the training phase, the decision tree model learns the questions used to assign class labels to the samples. As a tree model, the process starts at the root of the tree and then splits the data along its branches. The splitting procedure is repeated at each child node down to the leaves, so that the samples at each leaf belong to the same class. Note that the error is minimized if the tree is deep, but this can lead to overfitting. Thus, the usual procedure is to prune the tree, restricting its maximum depth. A better way to improve the results of the Decision Tree Learning algorithm is to employ a technique called Bagged Decision Tree, which reduces the variance of a decision tree.

Bagged Decision Tree.
In this technique, multiple regression trees are constructed. In particular, several subsets of data are created from the training samples, each of which is later used to train its own decision tree. The average derived from these different decision trees provides a more robust solution than a single decision tree. The use of several trees also reduces overfitting [29].
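A rough sketch of the bagging idea follows, using depth-1 regression stumps as deliberately simple base learners. The paper's implementation relies on Matlab's bagged tree ensembles; everything below (names, stump splitting rule) is illustrative only:

```python
import random

def fit_stump(points):
    """Depth-1 regression tree: split x at the median, predict the mean y per side."""
    xs = sorted(x for x, _ in points)
    split = xs[len(xs) // 2]
    left = [y for x, y in points if x < split] or [y for _, y in points]
    right = [y for x, y in points if x >= split] or [y for _, y in points]
    return split, sum(left) / len(left), sum(right) / len(right)

def predict_stump(stump, x):
    split, left_mean, right_mean = stump
    return left_mean if x < split else right_mean

def fit_bagged_stumps(points, n_trees=25, rng=None):
    """Bagging: train each stump on a bootstrap resample (drawn with replacement)."""
    rng = rng or random.Random(0)
    return [fit_stump([rng.choice(points) for _ in points]) for _ in range(n_trees)]

def predict_bagged(ensemble, x):
    """Average the individual trees' predictions for a more robust estimate."""
    return sum(predict_stump(s, x) for s in ensemble) / len(ensemble)
```

Averaging over bootstrap-trained trees smooths out the variance of any single tree, which is precisely the motivation for bagging stated above.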

Artificial Neural Networks
Our focus will be on analyzing the data in the training phase with a technique called Bayesian regularization [30]. This algorithm allows us to perform binary classification, and we will use the Levenberg-Marquardt optimization [31] to learn the weight coefficients of the model (in each iteration of the training phase, the coefficients are updated). Furthermore, it is possible to obtain the optimal weights employing cost functions such as the Sum of Squared Errors (SSE). To find the predicted values, multiple single neurons are connected into a multi-layer feedforward neural network. This particular type of network is also called a multi-layer perceptron (MLP), which consists of three layers (input, hidden, and output). Both techniques (Bayesian regularization and Levenberg-Marquardt optimization), together with an MLP infrastructure, achieve a model capable of generalizing the mathematical problem thanks to the minimization of a combination of weights and errors. This algorithm reduces overfitting at the cost of a longer execution time [32].
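For intuition only, a toy MLP is sketched below. It is *not* the Bayesian-regularized Levenberg-Marquardt training used in this work; it is a plain stochastic-gradient-descent network with one tanh hidden layer and a linear output, minimizing the SSE cost named above. All names and hyperparameters are illustrative:

```python
import math
import random

def train_mlp(data, hidden=4, epochs=300, lr=0.02, seed=0):
    """Train a tiny 1-input, 1-output MLP (one tanh hidden layer, linear
    output) by stochastic gradient descent on the squared error.
    Returns (final SSE, initial SSE) so the improvement can be inspected."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0

    def forward(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2

    def sse():
        return sum((forward(x)[1] - y) ** 2 for x, y in data)

    initial = sse()
    for _ in range(epochs):
        for x, y in data:
            h, out = forward(x)
            err = out - y
            for j in range(hidden):
                grad_h = err * w2[j] * (1 - h[j] ** 2)  # backprop through tanh
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err
    return sse(), initial
```

Bayesian regularization and Levenberg-Marquardt replace this naive update with a damped second-order step plus a penalty on the weight magnitudes, which is what curbs overfitting in the paper's setup.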

Gaussian Process Regression
Parametric regression methods, for instance linear/logistic regression, generate a line or a curve in the graph of inputs and outputs, replacing the training data. Accordingly, once the regression weights have been obtained, the original training data may be discarded. On the other hand, non-parametric regression methods retain the initial training data (also called latent variables) as a significant element in generating the regressor function. To this end, test data are compared to the training data points; each output value of a test point is estimated via the distance of the test input to the training inputs. Notably, non-parametric regression assumes that data points with similar input values will be close in output space. The mathematical expressions include the covariance function formed from latent variables, which reflects the smoothness of the response. Covariance and mean functions are used in conjunction with a Gaussian likelihood for prediction, employing the posterior distribution p(f* | X, y, X*) as the initial expression. Here, f* is the posterior prediction, X is the matrix of training inputs, y is the vector of training targets, and X* is the matrix of test inputs.
To maximize this expression, we have carefully studied the mathematical model derived in [33]. From this previous study, we adopt the following exponential function as suitable for the problem at hand, which will in turn be employed as a kernel function: k(x, x') = σ_f² exp(−(1/2) Σ_k (x_k − x'_k)² / l_k²), where σ_f² is the signal variance (the squared standard deviation), l_k is the scale for each predictor, k indexes the predictors, and x and x' are two nearby values.
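This kernel can be sketched as follows, assuming the standard squared-exponential form with one length-scale per predictor (a Python illustration; the paper evaluated it through Matlab's Gaussian process tooling):

```python
import math

def se_kernel(x, x_prime, sigma_f, lengths):
    """Squared-exponential kernel with per-predictor length-scales l_k:
    k(x, x') = sigma_f^2 * exp(-0.5 * sum_k (x_k - x'_k)^2 / l_k^2)."""
    s = sum((a - b) ** 2 / (l ** 2) for a, b, l in zip(x, x_prime, lengths))
    return sigma_f ** 2 * math.exp(-0.5 * s)

def kernel_matrix(xs, sigma_f, lengths):
    """Covariance matrix K with K[i][j] = k(x_i, x_j), used in the GP posterior."""
    return [[se_kernel(a, b, sigma_f, lengths) for b in xs] for a in xs]
```

The resulting matrix is symmetric, with σ_f² on the diagonal, and nearby inputs receive covariances close to that maximum, which encodes the smoothness assumption of the response.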

Support Vector Regression (SVR)
In this case, we consider the training data (x_1, y_1), . . . , (x_l, y_l), where x_i ∈ R^n and y_i ∈ R denote the input space of the sample and its corresponding target value, respectively, and l denotes the size of the training data [34]. Our objective is to find a function f(x) that has, at most, ε deviation from the obtained targets y_i for all of the training data and, at the same time, is as flat as possible. In other words, we do not care about errors as long as they are less than ε. Additionally, the results must avoid senseless predictions in order to find a function f(x) that returns the best fit.
When the relationship between x and y is approximately linear, the model is represented as f(x) = ⟨w, x⟩ + b, with w ∈ R^n and b ∈ R (w represents the coefficients and b is an intercept). Therefore, this problem can be formulated as the convex optimization problem (2): minimize (1/2)||w||² + C Σ_{i=1}^{l} (ξ_i + ξ_i*), subject to y_i − ⟨w, x_i⟩ − b ≤ ε + ξ_i, ⟨w, x_i⟩ + b − y_i ≤ ε + ξ_i*, and ξ_i, ξ_i* ≥ 0. In our case, the optimization problem is posed as a non-linear one. Keeping this in mind and with the support of the work in [35], the solution for (2) is the following Equation (3): f(x) = Σ_{i=1}^{l} (α_i − α_i*) K(x_i, x) + b. The constant C > 0 determines the trade-off between the flatness of f (that is, seeking a small w) and the amount up to which deviations larger than ε are tolerated. On the other hand, α_i, α_i*, α_j, and α_j* are Lagrange multipliers. Finally, ϕ(·) denotes the mapping into the feature space, so that the kernel is K(x_i, x_j) = ⟨ϕ(x_i), ϕ(x_j)⟩. A common kernel used for this model is the radial basis function: K(x, x') = exp(−||x − x'||² / (2σ²)). A more detailed study of these aforementioned techniques can be found in the Supplementary File 1 "Description of Machine Learning Techniques". Finally, Table 3 shows the parameters used in the previously defined algorithms.

Table 3. Machine Learning (ML) Technique parameters.
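The two SVR ingredients above, the ε-insensitive criterion and the radial basis function kernel, can be sketched as follows (a Python illustration; function names are ours):

```python
import math

def eps_insensitive_loss(y_true, y_pred, eps):
    """SVR's epsilon-insensitive loss: deviations up to eps cost nothing."""
    return max(0.0, abs(y_true - y_pred) - eps)

def rbf_kernel(x, x_prime, sigma):
    """Radial basis function kernel K(x, x') = exp(-||x - x'||^2 / (2*sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-sq / (2 * sigma ** 2))
```

Predictions within the ε-tube contribute zero loss, which is what makes the resulting regressor flat; the kernel value decays from one toward zero as the inputs move apart.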

RMSE Metric
We employ the root mean square error (RMSE) metric as a measure of error. The RMSE measures the standard deviation of the error. It is calculated by Equation (4) as the square root of the average of the squared errors, RMSE = sqrt((1/n) Σ_{i=1}^{n} (y_i − ŷ_i)²), with n being the number of samples, y the real value, and ŷ the predicted value. The RMSE metric ranges from zero to infinity, especially punishing those data that are far from the estimated value.
When the relationship between two variables is obtained by means of linear regression, we can also employ the parameter R (correlation coefficient) in the analysis, which shows the degree of linear correlation between the variables. When R approaches 1 or −1, there is a high linear correlation. However, if its value is close to 0, both of the variables are said to be poorly correlated.
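Both metrics can be computed as follows (Python is used here for illustration; the paper's analysis was carried out in Matlab):

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root mean square error: sqrt of the average squared deviation (Equation (4))."""
    n = len(y_true)
    return sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)

def pearson_r(xs, ys):
    """Linear correlation coefficient R in [-1, 1]: covariance over the
    product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A perfect prediction yields an RMSE of zero, while perfectly proportional variables yield R = 1 (or R = −1 for an inverse relationship).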

Results
In this section, two different comparisons are carried out to predict the precise dose level that a patient should receive to obtain a diagnosable image. Firstly, we analyze the radiation output of the CT, given by the CTDIvol parameter, versus the BMI, which depends on the patient's height and weight. Secondly, we compare the same figure of merit, CTDIvol, with respect to the SSDE metric, which reflects the patient's exact morphology. Although the SSDE parameter adjusts, a priori, the dose level better than the BMI, the impossibility of collecting it in some of the CTs (see Table 1), together with the possibility of discerning gender, has motivated this twofold study. Finally, it is notable that the analysis of outliers is a powerful tool for dose optimization during CT examinations, as we have shown in a previous work [36].

Comparison between CTDIvol and BMI
The main objective is the prediction of the optimal dose of radiation that a patient should receive, taking into account his/her BMI and the type of protocol. The predicted parameter is CTDIvol, which will allow technicians to adjust different settings to achieve that radiation value at the CT output. Additionally, gender-segmented regression data for the five protocols indicated in Table 2 will be calculated using each one of the ML techniques described above. However, we have opted to show only the best ML technique, to shorten the length of the paper and make the text more understandable to readers. The remaining studies can be found in the Supplementary File 2 "Results involving ML, Protocols, and dose metrics".
Two figures are plotted for each protocol/gender. The first one illustrates the regression curve after removing the outliers, together with the European DRL reference values. The second shows an error histogram composed of 20 bars. This figure represents both training and test errors and it highlights the relationship between both. The x-axis indicates the error made; that is, the distance between the real data and the value predicted, while the y-axis points out the number of data that have an error of that magnitude.

Skull Protocol
In this protocol, the data dispersion is clearly high, which demonstrates a low linear correlation between BMI and CTDIvol. In the men's case, the coefficient R² reaches a value of 0.000908 when all of the samples are adjusted by means of a linear regression. A slight improvement is obtained if the anomalous data (outliers) are not computed; the value reached for R² is 0.0467. As in the men's case, for women a very low linear correlation between BMI and CTDIvol is attained. By adjusting all of the samples by means of a linear regression, the parameter R² reaches a value of 0.000154. When outliers are eliminated there is a slight improvement; R² increases to 0.0348. For both genders, when the total number of samples is analyzed, Bagged regression trees is the technique that best adjusts the training and test data. This conclusion comes from analyzing the results of Tables 4 and 5. The elimination of outliers was carried out in two phases. Firstly, from the set of samples, those values higher than Q3 + 3·RIC and those lower than Q1 − 3·RIC were ruled out through the univariate method, as described in Section 4. Secondly, samples placed in areas of low data density, far from the usual values and with a significant influence on the error, were also removed. To this end, we employed the multivariate technique also mentioned in the previous section. Note that this procedure was carried out in a similar way in the rest of the protocols analyzed in this section.
After the elimination of outliers, bagged regression trees continues to be the best predictive technique on the training data, obtaining the lowest RMSE of all the ML techniques analyzed. However, the same performance is not achieved with the test data, showing a large difference between the two.
The Neural Networks technique provides excellent RMSE results for the test data and, therefore, for the model presented. This technique also fits the training data well, with smooth transitions and no overfitting. Thus, Neural Networks is the most suitable solution for predicting future doses for the "Men's/Women's Skull" protocol and the CTDIVOL-BMI plots. Gaussian processes is another technique that produces results similar to those of Neural Networks. Nevertheless, its execution time is the longest, which is an inconvenience when the number of samples increases.
Finally, it should be noted that the obtained dose does not exceed the DRL value for any of the European countries comprised in this work, as shown in Figure 5.

Thorax, Abdomen, and Pelvis Protocol
Regarding men, there is low linear correlation between BMI and CTDIVOL (R² = 0.00583), which is greatly enhanced when the outliers are eliminated (R² = 0.283). In the case of women, the linear correlation between BMI and CTDIVOL is low (R² = 0.000355), which substantially increases when the outliers are eliminated from the computation (R² = 0.304).
Gaussian processes obtains the best result (when 100% of the samples are computed), deriving the lowest RMSE in the test data, as shown in Tables 6 and 7. Bagged regression trees achieves the least error in the training data (as occurred in previous cases). After removing outliers, Gaussian processes and Neural Networks are the techniques that accomplish the least error in the test data for men and women, respectively, and it can be said that both are the best predictive models for the "Thorax, Abdomen, & Pelvis" protocol. However, as in the aforementioned protocol, there are no clear differences in terms of RMSE values if we compare all of the ML techniques. It should be noted that GPR is the technique with the highest computational demand. Figure 6a,b, concerning men, show the Gaussian processes results always below most of the European DRLs, with the exception of only the most restrictive DRL (Switzerland), which is exceeded for BMI values of about 35. In the case of women (Figure 6c,d), the Neural Networks outcomes show that the DRLs surpassed are those of Switzerland and Finland when the BMI exceeds the value of 35. The next most restrictive DRL (UK) is exceeded at a BMI value close to 40.
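The model-selection criterion used throughout this section — train every candidate on the training split and keep the one with the lowest test RMSE — can be sketched in plain NumPy, with two toy polynomial candidates standing in for the ML techniques of the study:

```python
import numpy as np

rng = np.random.default_rng(1)
bmi = rng.uniform(18, 40, 300)
ctdi = 0.5 * bmi + rng.normal(0, 1.5, 300)   # synthetic, roughly linear data

# Hold out 25% of the samples as test data.
idx = rng.permutation(300)
train, test = idx[:225], idx[225:]

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

# Two illustrative candidates (polynomial degrees 1 and 3).
candidates = {}
for degree in (1, 3):
    coeffs = np.polyfit(bmi[train], ctdi[train], degree)
    candidates[f"poly-{degree}"] = rmse(ctdi[test], np.polyval(coeffs, bmi[test]))

best = min(candidates, key=candidates.get)   # lowest test RMSE wins
print(best, candidates)
```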

Abdomen and Pelvis Protocol
In the men's case, there is a low linear correlation between BMI and CTDIVOL (R² = 0.00741), which sharply increases when the outliers are eliminated (R² = 0.313). As shown in Table 8, bagged regression trees provides the best result (when all of the samples are computed), achieving the lowest RMSE in the test data. In contrast to the previous cases, Gaussian Process Regression (GPR) obtains the least error in the training data. However, the GPR technique implies a high computational cost. Bagged regression trees achieves the least error for the set of training and test data (with suppression of outliers) and, therefore, it is the best predictive model for this protocol.

Regarding women, there is a low linear correlation between BMI and CTDIVOL (R² = 0.00329), which improves when outlier data are eliminated (R² = 0.338). As observed in Table 9, Neural Networks and bagged regression trees achieve the best results for the test data and training data, respectively.
When the outliers are removed, Neural Networks reaches the best data adjustment, obtaining the least error in the test data. Regarding the DRL metric, Figure 7c,d indicate results that are similar to the men's case.

Thorax Protocol
Regarding men, the linear correlation between BMI and CTDIVOL (R² = 0.00729) is low. The R² value increases when outliers are eliminated (R² = 0.283). As pointed out in Table 10, Gaussian Process Regression (GPR) provides the best results for the test data, since the lowest RMSE values are reached with this solution. In the case of the training data, the bagged regression trees technique is the most notable.
When we eliminate outliers, Neural Networks attains the least error in the test data, so it provides a better prediction under these requirements (BMI-CTDIVOL, protocol, and gender). In this regard, GPR and linear regression also offer good results, although we choose Neural Networks as the best technique. As shown in Figure 8a,b, some European DRLs are surpassed by the Neural Networks plots for BMI values of about 25 (as occurs with the DRLs of countries such as Switzerland or Luxembourg). However, our samples mainly remain below the rest of the standardized DRLs, such as those of Greece, Norway, or France. In the women's case, the R² value obtained points to a low linear correlation between BMI and CTDIVOL (R² = 0.0261); this value is enhanced when outlier data are removed (R² = 0.412). Neural Networks offers the best result for the test data when the outliers are not eliminated.
Under these conditions and as shown in Table 11, Gaussian Process Regression (GPR) reaches the least error in the training data. When outliers are removed, GPR is the best prediction technique, providing the least error on the test data. However, observing the RMSE values, note that most techniques provide good performance, although GPR and Neural Networks imply a high computational cost. As illustrated in Figure 8c,d, for this protocol the exceeding of several European DRLs starts at slightly higher BMI values for women than for men.

Abdomen Protocol
Regarding the men's case, a medium/high linear correlation is observed between BMI and CTDIVOL (R² = 0.555), which slightly increases when a few outlier points are removed from the samples (R² = 0.586). As shown in Table 12, bagged regression trees obtains the least error in the training process when 100% of the data are computed. Gaussian processes and Neural Networks achieve the lowest RMSE in the test data when all of the samples are analyzed. Once outliers are removed, Neural Networks and Gaussian processes are the most efficient models, and thus both ML techniques behave better in terms of prediction; however, the latter requires the highest computational cost. In the case of women, there is a certain linear correlation between BMI and CTDIVOL (R² = 0.197), which substantially increases when outlier points are eliminated (R² = 0.618). As can be observed in Table 13, when outliers are considered, Gaussian processes obtains the lowest RMSE for the test data, while bagged regression trees reaches the least error for the training data. With outliers removed, Neural Networks and Gaussian processes both offer the best adjustments, and therefore they are the best solutions for predicting future doses in patients according to their weight and height (BMI). Although GPR requires the most computational means for the simulations, it is the technique selected. For both genders, Figure 9 illustrates that the DRL values established by diverse European countries begin to be surpassed when the BMI metric is around the value of 20. The Neural Networks and GPR models exceed all DRLs (except that of Poland) for BMI values higher than 30.

Comparison between SSDE and CTDIVOL
CTDIVOL is a metric that is provided by the output of the CT and standardized on a reference volume. However, it is not the real dose received by the patient, since it does not consider his/her morphology. That is, a patient can receive a different amount of radiation than the one indicated by the CTDIVOL value, because his/her morphology usually differs from the standard volume. To address this problem, SSDE is a parameter that incorporates the morphology of the patient, obtained from a scanogram. Therefore, SSDE is a more reliable measurement of the dose received by patients. However, not all current CTs have the ability to collect this information (see Table 1).
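For reference, AAPM Report 204 defines SSDE as CTDIVOL multiplied by a conversion factor that depends on the patient's effective diameter, and the tabulated factors are commonly approximated with an exponential fit. The constants below are the approximate fit quoted for the 32 cm body phantom and should be treated as indicative, not as the study's values:

```python
import math

def ssde(ctdi_vol, effective_diameter_cm):
    """SSDE = f(d) * CTDIvol; exponential fit to the AAPM Report 204
    conversion factors for the 32 cm body phantom (approximate constants)."""
    f = 3.704369 * math.exp(-0.03671937 * effective_diameter_cm)
    return f * ctdi_vol

# A smaller patient absorbs more dose than CTDIvol suggests (f > 1)...
print(round(ssde(10.0, 20.0), 2))
# ...while a larger one absorbs less (f < 1).
print(round(ssde(10.0, 38.0), 2))
```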
Figures analyzing the CTDIVOL-SSDE metrics give us the required knowledge of the actual dose delivered to the patient from any CT with minimum error. Predicting future CTDIVOL-SSDE values implies knowing, beforehand, the dose to deliver for a given patient and protocol. To achieve this, a CTDIVOL-SSDE regression study is carried out, employing the ML techniques described in this work. Specifically, we have also eliminated univariate/multivariate outliers from the data collected during the years 2015 and 2016 for the five protocols under study, as indicated in Table 2.
As in the previous section, two figures are drawn for each of the cases. The first illustrates the regression curve, while the second shows an error histogram with 20 bars, plotting both training and test errors and showing their differences. The x-axis represents the error made, that is, the distance between the real data and the predicted value, while the y-axis indicates the number of samples with an error of a given magnitude. As in the CTDIVOL-BMI study, we only highlight the most representative ML technique. The remaining results for each ML technique can be found in the "Results involving ML, Protocols, and dose metrics" section of the Supporting Information.
In this study, note that there is no separation between men and women, because the SSDE parameter is focused on the morphology/shape of the patient and, therefore, obviates the need to identify the patient as a man or woman.

Skull Protocol
As shown in Figure 10, this protocol is characterized by its high data dispersion. However, a certain linear correlation between CTDIVOL and SSDE is observed (R² = 0.193).

In the Skull protocol, while considering the whole population, the bagged regression trees and GPR techniques obtain the least error in the test data and training data, respectively (as illustrated in Table 14). The outliers were eliminated using the following criterion in order to reduce the error. Firstly, applying the univariate technique, values greater than Q3 + 1.5*RIC and values less than Q1 − 1.5*RIC were left out of this evaluation. In this regard, the high dispersion of the data is a factor to consider. Secondly, additional samples belonging to areas with low data density, as well as those showing a significant influence on the error, were also removed. Under these considerations, the linear correlation (R² = 0.283) and the effectiveness of the prediction improve in comparison with the processing of the raw data.
When the outliers are suppressed, the RMSE is reduced by up to 47%, reaching the best results with the aforementioned techniques. Under these conditions, the bagged regression trees technique predicts better than the rest of the models. Neural Networks and GPR also offer an acceptable performance, while SVR and linear regression do not achieve a good data adjustment. These facts are also corroborated in their corresponding error histograms. Finally, the GPR and Neural Networks models incur more processing and computation cost than the rest of the techniques, as in the case of the BMI-CTDIVOL study.

Thorax, Abdomen, and Pelvis Protocol
There is a high linear correlation between CTDIVOL and SSDE (R² = 0.914), which indicates a notable data adjustment with a very low error, independent of the analyzed technique.
GPR and Neural Networks attain the least error (when we compute all of the data) for the test data and training data, respectively (see Table 15). Some outliers were removed from the raw data following the same procedure described for the Skull protocol, with one exception: to eliminate data employing the univariate method, Q3 + 3*RIC and Q1 − 3*RIC were selected as threshold values in order to achieve a better adjustment. Following this rule, the error was significantly reduced, further improving the linear correlation (R² = 0.958) and, therefore, the effectiveness of the prediction. Note that this procedure was carried out in the rest of the protocols for the CTDIVOL-SSDE prediction.
By removing outliers (Figure 11), we can reduce the RMSE value by up to 38% with Neural Networks, making it the best predictive technique for this protocol. Gaussian processes also obtains remarkable performance, but at the expense of its computational cost. On the contrary, SVR and linear regression are penalized in this protocol, having the highest errors.

Abdomen and Pelvis Protocol
As in the previous protocol, there is a high linear correlation between CTDIVOL and SSDE (R² = 0.704). When 100% of the data are trained, bagged regression trees provides the best result, as shown in Table 16. However, this is not the case for the test data, which are best adjusted by Gaussian processes and linear regression. When outliers are removed using the univariate and multivariate methods, the correlation coefficient grows to a value of 0.894, which improves the linear correlation between CTDIVOL and SSDE.
In this scenario, two techniques stand out for prediction tasks. Bagged regression trees obtains the least error in the training data, while Gaussian processes reduces the test error by up to 55% in comparison with the remaining techniques. As in previous studies, the computation time of this technique is much longer than that of the rest of the models.
Finally, note that linear regression, SVR, and Neural Networks do not achieve good performance in data adjustment. GPR regression figures are grouped in Figure 12.

Thorax Protocol
An appreciable linear correlation between CTDIVOL and SSDE (R² = 0.572) is observed in the Thorax protocol, according to Table 17. The best result for the training data is reached with the bagged regression trees model (when all of the data are computed). In contrast, Gaussian processes and Neural Networks are the techniques that offer a better adjustment of the test data.

By eliminating atypical or unrepresentative data, the correlation coefficient increases up to a value of 0.907, which implies a substantial enhancement in the linear correlation between CTDIVOL and SSDE.
When training the samples without outliers, all of the techniques perform well, appropriately adjusting the data and significantly reducing the RMSE value. We want to emphasize, as on other occasions, the efficiency of Gaussian processes and Neural Networks, since they reduce the error to less than one unit, being therefore the most remarkable predictive techniques for this protocol. As in the previous scenarios, the GPR model stands out in terms of computation requirements; this is the reason why Neural Networks is the technique selected for this protocol. The graphs for this analytical technique and protocol are found in Figure 13.

Abdomen Protocol
Regarding the Abdomen protocol, there is also an excellent linear correlation between CTDIVOL and SSDE (R² = 0.858). Gaussian Process Regression is the technique that best minimizes the RMSE in both the training data and test data when 100% of the samples are analyzed. On the contrary, the worst performance is reached with the SVR and linear regression models.
This protocol presents very few atypical data, although of great magnitude, as can be observed in Table 18. Once they are removed, the RMSE value is significantly reduced (around 16.5%) and the bagged regression trees technique obtains the best predictions for the CTDIVOL-SSDE pair, followed by the GPR technique (although, as in the previous scenarios, its computational cost is the highest). On the other hand, linear regression and SVR exhibit the worst performance. Figure 14 shows the prediction and error graphs for this technique and protocol.

Discussion
The body mass index (BMI) is a metric that depends on weight and height, and it is therefore intrinsically related to the size of the patient. If the body is smaller than the volume of the standardized phantom (16/32 cm), the dose absorbed by the patient will be greater. In the same way, a larger body will receive less radiation.
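For completeness, BMI is simply weight over the square of height (weight in kilograms, height in metres):

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight divided by the square of height."""
    return weight_kg / height_m ** 2

print(round(bmi(80.0, 1.75), 1))  # 26.1
```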
Consequently, it is necessary to adapt the dose radiated by the CT to the size of the patient in order to comply with the recommended dose for a given protocol and obtain a comprehensible image for the radiologist. Under these circumstances, we demonstrate an increase in the CTDIVOL metric as BMI grows, which is consistent with (i) the previous explanation and (ii) the results extracted from the work in [6]. In addition, there is a linear correlation between both figures of merit, which is emphasized if the outliers are removed.
The relationship between BMI and DRL for the different protocols should be highlighted. Firstly, the figures for the 'Skull' protocol illustrate that the BMI-CTDIVOL curve does not exceed the dose of any of the European reference levels included in this study. This is because the size of the head does not vary with the body mass index in the same way as the rest of the body. Secondly, in the 'Thorax, Abdomen, Pelvis' and 'Abdomen, Pelvis' protocols, the regression curve usually remains below most of the DRLs; it only exceeds specific DRLs when the body mass index rises above the value of 30, as is the case with obese patients. Thirdly, in the 'Thorax' protocol, the results show that a few DRL values are exceeded when the BMI metric is higher than 25. However, under usual operating conditions, the regression curve remains below most DRLs for most of the BMI values.
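The DRL check above reduces to finding where a protocol's regression curve first crosses each reference level. A sketch with placeholder numbers (the curve and the DRL values below are invented for illustration, not the study's figures):

```python
import numpy as np

# Hypothetical regression curve: predicted CTDIvol (mGy) as a function of BMI.
def predicted_ctdi(bmi):
    return 0.45 * bmi + 2.0

# Placeholder DRLs (mGy) -- illustrative only, not the European values used here.
drls = {"Country A": 15.0, "Country B": 18.0, "Country C": 21.0}

bmi_grid = np.linspace(15, 45, 301)
for country, drl in drls.items():
    exceeded = bmi_grid[predicted_ctdi(bmi_grid) > drl]
    if exceeded.size:
        print(f"{country}: DRL exceeded from BMI ~ {exceeded[0]:.1f}")
    else:
        print(f"{country}: DRL never exceeded in this BMI range")
```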
Concerning the regression curves for each protocol, they provide very valuable information for the radiologist to establish the appropriate dose value for radiating the patient. Thanks to this set of ML tools, the radiologist knows beforehand the radiation thresholds that must not be exceeded. However, as observed in Figures 8 and 9, these thresholds can be surpassed in specific cases (for instance, when a patient's BMI value is high), and only with the most restrictive DRLs.
In contrast, in the 'Abdomen' protocol, we observe that the regression curves surpass some DRLs of European countries for BMI values within normal weight ranges. In this case, reducing the dose radiated to the patient below the value indicated by the ML tool is a decision that depends on the clinical judgment of the radiologist, since he/she must be able to analyze and discern possible lesions in the CT images.
Regarding the CTDIVOL-SSDE predictions, we notice a high linear correlation between the two variables. This means that a good SSDE prediction is achieved with extremely low error in most of the protocols. This prediction is clearly important, as it makes it possible to know, with sufficient accuracy, the dose that a patient will receive, without compromising the diagnosis, while taking into account (i) his/her morphology and (ii) the output radiation value of the CT. Using this study, the radiologist can avoid situations in which a patient receives a higher dose than required, simply by carrying out an appropriate adjustment of the CT parameters.
In relation to the RMSE metric, it is calculated from the study carried out for the different ML techniques. In this study, an evident decrease of the RMSE is obtained when eliminating outliers, in some cases reducing the error by more than 50% while discarding less than 5% of the data as atypical. This result has a twofold meaning. On the one hand, a high RMSE value is attained when the atypical data are included, in comparison with the same samples trained without these atypical data. On the other hand, among the protocols studied here, greater error is observed (independently of whether outliers are present) in the predictions of the Skull protocol. This is due to the fact that its data are more dispersed than in the rest of the protocols and cannot be adjusted in the same range.
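The effect of discarding a small fraction of atypical samples on the RMSE is easy to reproduce on synthetic data (all numbers below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000
x = rng.uniform(18, 40, n)
y = 0.5 * x + rng.normal(0, 1.0, n)
# Contaminate ~3% of the samples with atypical dose values.
bad = rng.choice(n, size=30, replace=False)
y[bad] += rng.normal(25, 5, 30)

def fit_rmse(xs, ys):
    coeffs = np.polyfit(xs, ys, 1)
    return float(np.sqrt(np.mean((ys - np.polyval(coeffs, xs)) ** 2)))

rmse_all = fit_rmse(x, y)

# Univariate rule on the residuals of a first fit: drop points beyond 3*IQR.
resid = y - np.polyval(np.polyfit(x, y, 1), x)
q1, q3 = np.percentile(resid, [25, 75])
keep = (resid >= q1 - 3 * (q3 - q1)) & (resid <= q3 + 3 * (q3 - q1))
rmse_clean = fit_rmse(x[keep], y[keep])

print(round(rmse_all, 2), round(rmse_clean, 2))
```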
The following factors have been considered and analyzed to select the appropriate ML regression technique: RMSE, overfitting, and computational complexity. Table 19 summarizes this study, highlighting the techniques that provide a better adjustment to the data; that is, focusing on Neural Networks, regression trees, and Gaussian processes. These will be discussed below.

The bagged regression trees technique provides a quick interpretation of the results due to its simplicity. In addition to its low computational cost, it also achieves the best fit to the training data in most cases, obtaining the lowest RMSE values. However, bagged regression trees show a high discrepancy when comparing the training results with their corresponding test data, which indicates a certain overfitting, or lack of generality. The error histogram figures corroborate this overfitting phenomenon: large test bars (which include a greater number of data) point to a higher error than their respective training bars. Moreover, this technique does not draw a smoothly varying curve; it undergoes abrupt variations due to the overfitting, which means that two nearby values of the predictor variable can result in different responses. For example, two patients with similar BMI values would be irradiated with different doses. Furthermore, this variability can cause peaks in the curve that exceed the DRL values at specific points.
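A minimal sketch of the bagging idea, here with depth-1 trees (stumps) on a single predictor to keep it short; real bagged regression trees grow deeper trees, and all data below are synthetic:

```python
import random

random.seed(2)
xs = [random.uniform(18.0, 35.0) for _ in range(150)]
ys = [0.8 * x + random.gauss(0.0, 1.0) for x in xs]

def fit_stump(sx, sy):
    """Depth-1 tree: the split threshold minimizing the squared error."""
    order = sorted(zip(sx, sy))
    best = None
    for i in range(1, len(order)):
        t = order[i][0]
        left = [y for _, y in order[:i]]
        right = [y for _, y in order[i:]]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    return best[1:]

def bagged_fit(sx, sy, n_trees=25):
    """Fit each stump on a bootstrap resample of the training data."""
    n = len(sx)
    stumps = []
    for _ in range(n_trees):
        idx = [random.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([sx[i] for i in idx], [sy[i] for i in idx]))
    return stumps

def bagged_predict(stumps, x):
    """Average the per-tree predictions."""
    return sum(ml if x < t else mr for t, ml, mr in stumps) / len(stumps)

model = bagged_fit(xs, ys)
p22, p30 = bagged_predict(model, 22.0), bagged_predict(model, 30.0)
print(f"predicted dose at BMI 22: {p22:.1f}, at BMI 30: {p30:.1f}")
```

Because each tree is piecewise constant, the averaged curve is itself a step function, which illustrates why the fitted curve is not smooth and can jump between nearby predictor values.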
Neural Networks are a powerful tool that, with an appropriate data selection for the training tasks, allows us to achieve a good fit to the test data. Furthermore, this technique is able to enhance the results obtained by other techniques, although not for all of the protocols. Its main drawbacks lie in: (i) the optimal selection of parameters, such as the number of layers and neurons; (ii) the high variability in the training results; and (iii) the random initialization of the weights, which entails different results in each execution of the network. This means that the model has to be trained repeatedly for each configuration in order to select the iteration that offers the best performance. Once selected, the network weights must be stored to replicate the attained results. The execution time of this technique depends, to a great extent, on the number of layers and neurons used in the multilayer architecture.
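The "train repeatedly, keep the best weights" pattern can be sketched with a tiny one-hidden-layer network. The architecture (3 tanh neurons), learning rate, epoch count, and data below are illustrative assumptions, not the configuration used in the study:

```python
import math
import random

random.seed(3)
raw_x = [random.uniform(18.0, 35.0) for _ in range(80)]
raw_y = [0.8 * x + random.gauss(0.0, 0.5) for x in raw_x]
# Normalize both variables so the tanh units work in a sensible range.
n = len(raw_x)
mx, my = sum(raw_x) / n, sum(raw_y) / n
sx = (sum((x - mx) ** 2 for x in raw_x) / n) ** 0.5
sy = (sum((y - my) ** 2 for y in raw_y) / n) ** 0.5
xs = [(x - mx) / sx for x in raw_x]
ys = [(y - my) / sy for y in raw_y]

H = 3  # hidden tanh neurons

def train(seed, epochs=600, lr=0.1):
    """Full-batch gradient descent from one random initialization."""
    rnd = random.Random(seed)
    w = [rnd.uniform(-1, 1) for _ in range(H)]
    b = [rnd.uniform(-1, 1) for _ in range(H)]
    v = [rnd.uniform(-1, 1) for _ in range(H)]
    c = 0.0
    for _ in range(epochs):
        gw, gb, gv, gc = [0.0] * H, [0.0] * H, [0.0] * H, 0.0
        for x, y in zip(xs, ys):
            h = [math.tanh(w[j] * x + b[j]) for j in range(H)]
            e = sum(v[j] * h[j] for j in range(H)) + c - y
            gc += e
            for j in range(H):
                gv[j] += e * h[j]
                t = e * v[j] * (1.0 - h[j] ** 2)  # backprop through tanh
                gw[j] += t * x
                gb[j] += t
        for j in range(H):
            w[j] -= lr * 2.0 * gw[j] / n
            b[j] -= lr * 2.0 * gb[j] / n
            v[j] -= lr * 2.0 * gv[j] / n
        c -= lr * 2.0 * gc / n
    err = [sum(v[j] * math.tanh(w[j] * x + b[j]) for j in range(H)) + c - y
           for x, y in zip(xs, ys)]
    rmse = (sum(e * e for e in err) / n) ** 0.5
    return rmse, (w, b, v, c)

# Random restarts: store the weights of the best-performing run.
runs = [train(seed) for seed in range(5)]
best_rmse, best_weights = min(runs, key=lambda r: r[0])
print(f"best normalized RMSE over 5 restarts: {best_rmse:.3f}")
```

Keeping `best_weights` is what makes the selected result reproducible, despite the run-to-run variability caused by the random initialization.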
In many of the protocols under study, the ML technique that presents the best performance is Gaussian Process Regression (GPR), due to the low RMSE obtained and the slight differences between the predictions on training and test data. Moreover, since this is a probabilistic model, it is easy to calculate confidence intervals, which are of interest when establishing the thresholds of the doses delivered to patients. Its disadvantage is its algorithmic complexity, which implies longer processing and execution times, although these are fully acceptable on current computers.
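A compact GPR sketch with an RBF kernel, showing how the predictive mean and a 95% confidence interval are obtained; the kernel hyperparameters and the synthetic data are illustrative, not the fitted values of the study:

```python
import math
import random

random.seed(4)
train_x = [18.0 + 17.0 * i / 24 for i in range(25)]
train_y = [0.8 * x + random.gauss(0.0, 0.3) for x in train_x]

SIGNAL_VAR = 25.0   # kernel amplitude sigma_f^2 (assumed)
LENGTH = 5.0        # RBF length scale (assumed)
NOISE_VAR = 0.09    # observation noise sigma_n^2 (assumed)

def rbf(a, b):
    return SIGNAL_VAR * math.exp(-((a - b) ** 2) / (2 * LENGTH ** 2))

def solve(A, b):
    """Gaussian elimination with partial pivoting (inputs are not modified)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Center the targets so the zero-mean GP prior is reasonable.
ym = sum(train_y) / len(train_y)
yc = [y - ym for y in train_y]
K = [[rbf(a, b) + (NOISE_VAR if i == j else 0.0)
      for j, b in enumerate(train_x)] for i, a in enumerate(train_x)]
alpha = solve(K, yc)

def predict(x_star):
    """Posterior mean and 95% confidence interval at x_star."""
    ks = [rbf(x_star, a) for a in train_x]
    mean = sum(k * al for k, al in zip(ks, alpha)) + ym
    v = solve(K, ks)
    var = rbf(x_star, x_star) - sum(k * vi for k, vi in zip(ks, v))
    std = math.sqrt(max(var, 0.0))
    return mean, (mean - 1.96 * std, mean + 1.96 * std)

mean, ci = predict(26.0)
print(f"predicted dose at BMI 26: {mean:.2f} mGy, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```

The interval comes directly from the posterior variance, which is what makes GPR convenient for setting dose thresholds; the cubic cost of the linear solves is the complexity referred to above.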
Finally, once the ML curves are obtained for each protocol/gender, the medical staff must proceed as follows: (i) depending on the patient's pathology, the staff select a protocol/gender; (ii) Tables 4-18 provide the best test RMSE for the selected protocol/gender and, therefore, the most appropriate ML technique; (iii) the medical staff then go to the figure defined for the pair (protocol, ML technique) and, according to the patient's morphology/size, take a BMI (or SSDE) value as input to obtain the new value of CTDIvol; (iv) in the future, the goal is for this new CTDIvol value to be configured in the CT scanner. To achieve this, X-ray technicians (or automated software) should tune input magnitudes such as pitch, scan length, amperage, and kilovoltage.
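The steps above can be sketched as a simple lookup from protocol/gender to the best technique and its fitted curve. The technique names follow the discussion, but the curve coefficients below are purely illustrative placeholders, not the values from Tables 4-18 or the figures:

```python
# Hypothetical mapping: (protocol, gender) -> (best ML technique, fitted curve).
best_model = {
    ("Abdomen", "female"): ("GPR",          lambda bmi: 0.55 * bmi - 4.0),
    ("Thorax",  "male"):   ("GPR",          lambda bmi: 0.45 * bmi - 3.0),
    ("Skull",   "male"):   ("Bagged trees", lambda bmi: 0.20 * bmi + 50.0),
}

def suggest_ctdivol(protocol, gender, bmi):
    """Steps (i)-(iii): select protocol/gender, best technique, evaluate curve.
    Step (iv) would feed the returned CTDIvol back into the CT console by
    tuning pitch, scan length, amperage, and kilovoltage."""
    technique, curve = best_model[(protocol, gender)]
    return technique, round(curve(bmi), 1)

technique, dose = suggest_ctdivol("Abdomen", "female", 24.0)
print(technique, dose)
```

In practice, the lambdas would be replaced by the per-protocol regression models themselves, but the selection logic is the same.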

Conclusions
In this paper, we contribute a novel methodology based on Machine Learning techniques to estimate and predict the dose received by a patient in a CT test. To achieve this goal, the pairs BMI-CTDIvol, in a first stage, and CTDIvol-SSDE, in a second one, are studied and analyzed for five standardized protocols.
Once the outliers are removed from a dataset composed of over fifty thousand dose records belonging to real patients, we obtain CTDIvol-BMI regression curves employing ML techniques. These curves draw the future dose to be delivered and remain below most of the European DRLs for all of the protocols analyzed, except for Abdomen, where our predictions exceed the DRLs of certain countries for normal BMI values. Regarding the CTDIvol-SSDE prediction, similar results were attained: techniques such as Gaussian processes, Neural Networks, and regression trees show an appreciable fit to the computed data, implying a high correlation coefficient and a very small RMSE, even below one unit in some protocols.
As a result, our proposed predictive method provides a reliable and powerful tool for planning the dose to be delivered to each patient. Medical staff will have useful information when deciding whether or not to adjust the dose, which minimizes the impact of the radiation on treated patients.