A Sensitivity and Robustness Analysis of GPR and ANN for High-Performance Concrete Compressive Strength Prediction Using a Monte Carlo Simulation

This study aims to analyze the sensitivity and robustness of two Artificial Intelligence (AI) techniques, namely Gaussian Process Regression (GPR) with five different kernels (Matern32, Matern52, Exponential, Squared Exponential, and Rational Quadratic) and an Artificial Neural Network (ANN) using a Monte Carlo simulation for prediction of High-Performance Concrete (HPC) compressive strength. To this purpose, 1030 samples were collected, including eight input parameters (contents of cement, blast furnace slag, fly ash, water, superplasticizer, coarse aggregates, fine aggregates, and concrete age) and an output parameter (the compressive strength) to generate the training and testing datasets. The proposed AI models were validated using several standard criteria, namely coefficient of determination (R2), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE). To analyze the sensitivity and robustness of the models, Monte Carlo simulations were performed with 500 runs. The results showed that the GPR using the Matern32 kernel function outperforms others. In addition, the sensitivity analysis showed that the content of cement and the testing age of the HPC were the most sensitive and important factors for the prediction of HPC compressive strength. In short, this study might help in selecting suitable AI models and appropriate input parameters for accurate and quick estimation of the HPC compressive strength.


Introduction
High-Performance Concrete (HPC) is known as the third generation of concrete material [1], and its definition was adopted in the early 1970s. Compared with the second generation of concrete, such as High-Strength Concrete (HSC), HPC exhibits not only high compressive strength but also other important characteristics, for instance, high flowability, high elastic modulus, high flexural strength, low permeability, high abrasion resistance, and high durability [1]. The improvement of these mechanical properties has made HPC widely used in long-term construction applications, especially in tall buildings, roadway construction, long-span bridges, and tunnels [2]. The key technology in the manufacturing process of HPC is to make it as dense as possible [2]. Consequently, the main rules that differentiate the manufacture of HPC from that of normal concretes are: (i) the use of a smaller aggregate size, (ii) the addition of supplementary cementitious materials such as silica fume, and (iii) most importantly, the application of a superplasticizer to reduce the water/binder ratio [2]. In order to fabricate HPC, the constituent materials as well as their proportions need to be carefully selected and controlled. A number of investigations in the literature have proposed mix design methods for HPC [3][4][5]. The main objective of these works is to obtain the combination of constituent materials and the corresponding proportions to produce HPC with improved mechanical properties.
The compressive strength of HPC has particularly attracted researchers as an important mechanical property that reflects the quality of this material. The current practice is based primarily on carrying out laboratory experiments to obtain the desired compressive strength as a function of all of the constituent materials. This experimental procedure is very time consuming, costly, and always requires some equipment that might not be at disposal. As a consequence, researchers have been trying to propose formulations which indicate the practical correlation between HPC compressive strength and several related mechanical properties or parameters. Zhou et al. [6] investigated the effect of aggregates on the HPC compressive strength and deduced that it could be predicted by several formulations, except for the cases of aggregates with very low and very high moduli. Duval and Kadri [7] investigated the effect of silica fume on HPC compressive strength and presented a prediction model with a correlation coefficient equal to 0.991. Chan et al. [8] proposed a model to relate the strength and porosity of HPC after exposure to high temperatures. Rashid et al. [9] studied the correlation of several mechanical properties, including modulus of elasticity, tensile strength, and the influence of the specimen size on the compressive strength. Ramezanianpour et al. [10] presented formulas relating the concrete compressive strength to water penetration, concrete resistivity, and rapid chloride penetration.
In general, these papers provide empirical formulas based on experimental results for quick but approximate determination of the HPC compressive strength. However, a drawback of this approach is that it can be achieved only with a limited number of parameters for constructing the relationship functions. Thus, the empirical equation approach becomes impractical and crude when the number of input parameters is large.
Since the first journal article on the civil engineering applications of neural networks (NNs) three decades ago by Adeli and Yeh [11], the penetration of neurocomputing into civil engineering has steadily increased. A review of civil engineering applications of neural networks up to 2000 is presented by Adeli [12], and from then up to 2016 by Amezquita-Sanchez et al. [13]. In recent years, a number of authors have used NNs and other artificial intelligence (AI) approaches to predict the mechanical properties of HPC. Erdal et al. [14] used Gradient-Boosted Artificial Neural Networks (GBANNs) and Bagged Artificial Neural Networks (BANNs) to predict the compressive strength of HPC. Chou and Pham [15] compared NNs, Support Vector Machines (SVMs), the Chi-squared Automatic Interaction Detector (CHAID), Linear Regression (LR), and Classification and Regression Trees (CART) for predicting the compressive strength of HPC. The ensemble models used in these studies showed good performance compared with previous works; the coefficient of determination (R²) reported in [14] is equal to or higher than 93.3% for all models, and the error rates improved significantly, from 4.2% to 69.7%. In other attempts, Cheng et al. [16] used a Genetic Weighted Pyramid Operation Tree (GWPOT), whereas Erdal [17] developed two hybrid ensemble decision trees for the prediction of HPC compressive strength. In general, the aforementioned studies affirmed the effectiveness of AI models for predicting the compressive strength of HPC more accurately and quickly compared with traditional approaches [18][19][20][21][22][23][24].
Rafiei et al. [48] presented a novel deep restricted Boltzmann machine for estimating concrete properties based on mixture proportions and compared its effectiveness with backpropagation (BP) NN and SVM models. They reported a maximum accuracy of 98% for the new model using the concrete test data from the ML repository of UC Irvine. Nguyen et al. [49] used a deep NN model for the prediction of foamed concrete strength. Rafiei et al. [50] presented an innovative approach to the concrete mix design problem through the fusion of an optimization algorithm, the patented neural dynamics optimization model of Adeli and Park [51,52], and an ML classification algorithm used as a virtual lab to predict whether the desired constraints are satisfied in each iteration. The authors assert: "The outcome of this research is an entirely new paradigm and methodology for concrete mix design for the 21st century" and note that the cost savings for large-scale projects, such as a high-rise building structure, can be in the millions of dollars. These modern approaches are expected to have a transformative impact on engineering practice in the coming decade.
To the best of our knowledge, no systematic investigation has been performed to fully analyze the accuracy of a given AI algorithm in predicting the HPC compressive strength. It is well known that the latter greatly depends on the construction of a given dataset for both training and testing parts. As an example, for a given dataset, the accuracy of an AI model might vary within a wide range and be conditioned by the random sampling procedure. A robust approach that possesses the ability to take into account such an effect is therefore indispensable. Moreover, the study of how the uncertainty in the predicted outputs can be influenced by different sources of uncertainty in the input space is also of great importance. The reliable identification of the most important factor is thus crucial.
The main objective of this study is to propose an effective way to fully evaluate the performance of AI algorithms for predicting the compressive strength of HPC. Two AI models were selected to perform the case studies: Gaussian Process Regression (GPR) and an Artificial Neural Network (ANN) trained with the Levenberg-Marquardt algorithm. The proposed models were tested using a total of 1030 compressive strength tests gathered from the available literature, with eight input parameters (i.e., the contents of cement, fly ash, blast furnace slag, water, superplasticizer, fine and coarse aggregates, and the age of the HPC) and the compressive strength as the prediction target. The Monte Carlo approach was then applied to assess the performance of the proposed AI models along with various criteria, such as the Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and coefficient of determination (R²). Finally, a sensitivity analysis was carried out to determine the contribution of each input parameter to the prediction capability of the AI models. The results were finally compared with those reported in the literature.

Research Novelty and Significance
As discussed in the introduction, the estimation of the compressive strength of HPC is crucial in civil engineering applications. Although numerous experimental works have addressed this problem, it is difficult to derive a generalized formulation that takes into account all the factors influencing the compressive strength of HPC. The application of AI algorithms can help reveal the nonlinear relationships between the HPC compressive strength and the corresponding mixture components. In this manner, the influence of each input variable on the HPC compressive strength can be fully quantified. Moreover, numerical techniques such as Monte Carlo simulations can be applied to assess the robustness of the prediction models, taking into account the variability of the input space. Therefore, once constructed, the developed machine learning models can be efficient tools that help researchers and engineers save cost and time in experiments when studying the compressive strength of HPC.

Preparation of Data
A total of 1030 HPC samples, collected from the UC Irvine Machine Learning Repository and reported in published papers [53][54][55][56][57], was evaluated in this study. Research data for this article are also included as Supplementary Materials. All of the samples were fabricated with ordinary Portland cement (OPC) and cured under normal conditions. The available HPC testing data in the literature used specimens of different sizes and shapes; these were converted into unique 150 mm cylinders based on existing guidelines, such as IS 516 1959 [58] and GB 50205 2001 [59]. The HPC compressive strength is obtained as a function of eight inputs: the contents of cement (denoted as X1), blast furnace slag (X2), fly ash (X3), water (X4), superplasticizer (X5), coarse aggregate (X6), fine aggregate (X7), and the HPC age (X8). Table 1 presents the ranges of these components in the database. The database containing 1030 records was divided into two subsets: the training set (70% of the data) and the testing set (the remaining 30%), which were randomly chosen from the initial database under a uniform distribution. Each dataset contained eight vectors, one per input parameter, and one output, the compressive strength of HPC. While various train/test ratios can be used, in this paper the 70/30 ratio for the training/testing datasets was selected, as suggested in published papers [60,61]. It is observed that all input variables cover a wide range of values. Statistical analysis was performed to demonstrate that the choice of these input parameters was relevant, i.e., no major cross-correlation was found in the eight-dimensional input space [53][54][55][56][57]. Therefore, the AI models can be trained with a good generalization capability.
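The random 70/30 split described above can be sketched as follows; this is a minimal stdlib-only illustration, where `split_dataset`, the seed, and the toy stand-in records are all assumptions for the example rather than the paper's actual code.

```python
import random

def split_dataset(records, train_ratio=0.7, seed=42):
    """Randomly split records into training and testing subsets.

    Each record is a 9-tuple (X1..X8, compressive strength); indices are
    drawn uniformly without replacement, mirroring the 70/30 random split.
    """
    rng = random.Random(seed)
    indices = list(range(len(records)))
    rng.shuffle(indices)
    n_train = round(train_ratio * len(records))
    train = [records[i] for i in indices[:n_train]]
    test = [records[i] for i in indices[n_train:]]
    return train, test

# Toy stand-in for the 1030-sample database (values are illustrative only).
database = [tuple(float(i + j) for j in range(9)) for i in range(1030)]
train_set, test_set = split_dataset(database)
```

With 1030 records this yields 721 training and 309 testing samples; drawing the indices from a fresh seed on each Monte Carlo run is what produces the split-to-split variability analyzed later.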

Gaussian Process Regression
In the present study, Gaussian Process Regression (GPR), a nonparametric Bayesian approach to regression, was applied to predict the compressive strength of HPC. An advantage of GPR is its ability to provide uncertainty measures on the predicted values, unlike many supervised ML algorithms. A feature of GPR is that it directly defines a prior probability over a latent function. A Gaussian process is fully specified by its mean function, denoted as m(x), and its covariance (kernel) function, denoted as k(x, x′) [62,63]:

f(x) ∼ GP(m(x), k(x, x′))

The mean function represents the central tendency of the function f; normally, it is assumed to be zero [63]. The covariance function describes the structure and the shape of the function. The relation between the input parameters and the output variable is defined as:

y = f(x) + ε

where ε is independent noise following a zero-mean Gaussian distribution with variance σ_n²:

ε ∼ N(0, σ_n²)

From this relation, the likelihood function is obtained as:

p(y | f) = N(f, σ_n² I)

where y = (y_1, . . . , y_M)ᵀ, f = (f(x_1), . . . , f(x_M))ᵀ, and I is the identity matrix of dimension M × M. The marginal distribution L(f) is a Gaussian with zero mean and a covariance given by the Gram matrix K, as deduced from the definition of the Gaussian process in the work of Mackay [64]:

L(f) = N(0, K)

where K = [k(x_i, x_j)] is the covariance matrix corresponding to the covariance function k. The term "marginal" is used to indicate a non-parametric model. Since both the likelihood and the prior are Gaussian, the marginal distribution of y is also Gaussian:

y ∼ N(0, K_y), with K_y = K + σ_n² I

Let f* = f(x*) denote the vector of function values corresponding to the test inputs x*, and ε* the corresponding noise. The joint Gaussian distribution of the observed targets y and f* is then:

[y; f*] ∼ N(0, [[K_y, K*], [K*ᵀ, K**]])

where K* = [k(x_i, x*_j)] and K** = [k(x*_i, x*_j)]. Conditioning this joint distribution on the observations yields a Gaussian predictive distribution, whose mean and covariance are defined as follows [65]:

E[f*] = K*ᵀ K_y⁻¹ y,  cov(f*) = K** − K*ᵀ K_y⁻¹ K*

The inverse of the covariance matrix K_y can be determined using the Cholesky decomposition [66]. The covariance (kernel) function is a very important factor in GPR, as it defines the similarity of the data, which has a major impact on the prediction results [63].
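The GPR predictive mean and variance can be illustrated with a minimal numerical sketch. The paper solves the linear system via the Cholesky decomposition; here, plain Gaussian elimination stands in, and the one-dimensional training points, the squared-exponential kernel choice, and all hyperparameter values are illustrative assumptions.

```python
import math

def sq_exp_kernel(xi, xj, sigma_l=1.0, sigma_f=1.0):
    """Squared exponential covariance between two scalar inputs."""
    return sigma_f ** 2 * math.exp(-((xi - xj) ** 2) / (2.0 * sigma_l ** 2))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z

def gp_predict(x_train, y_train, x_star, sigma_n=0.1):
    """Predictive mean and variance of a zero-mean GP at a single test input."""
    n = len(x_train)
    K_y = [[sq_exp_kernel(x_train[i], x_train[j]) + (sigma_n ** 2 if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    alpha = solve(K_y, y_train)                          # K_y^{-1} y
    k_star = [sq_exp_kernel(xi, x_star) for xi in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))     # k_*^T K_y^{-1} y
    v = solve(K_y, k_star)                               # K_y^{-1} k_*
    var = sq_exp_kernel(x_star, x_star) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var

mean, var = gp_predict([0.0, 1.0, 2.0], [0.0, 1.0, 2.0], 1.0)
```

At a training input, the predictive mean stays close to the observed target and the predictive variance shrinks toward the noise level σ_n², which is the uncertainty quantification highlighted above as an advantage of GPR.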
In this study, the following five types of covariance functions are used for predicting the compressive strength of HPC [67]:

Matern 3/2: k(x_i, x_j) = σ_f² (1 + √3 r / σ_l) exp(−√3 r / σ_l)

Matern 5/2: k(x_i, x_j) = σ_f² (1 + √5 r / σ_l + 5r² / (3σ_l²)) exp(−√5 r / σ_l)

Exponential: k(x_i, x_j) = σ_f² exp(−r / σ_l)

Squared exponential: k(x_i, x_j) = σ_f² exp(−r² / (2σ_l²))

Rational quadratic: k(x_i, x_j) = σ_f² (1 + r² / (2α σ_l²))^(−α)

where r = ‖x_i − x_j‖ is the Euclidean distance between the variables x_i and x_j, σ_l and σ_f are the characteristic length scale and the signal standard deviation, respectively, and α is the positive shape parameter of the rational quadratic kernel. The hyper-parameters θ of the covariance function can be estimated by several methods [68]. In the rest of the paper, GPR algorithms using different kernel functions are designated as follows: GPR using Matern 3/2 as GPR-32, GPR using Matern 5/2 as GPR-52, GPR using the exponential kernel as GPR-EXP, GPR using the squared exponential kernel as GPR-SQEXP, and GPR using the rational quadratic kernel as GPR-RSQ.
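As a concrete illustration, the five kernels can be written directly as functions of the distance r. The forms below follow the standard parameterizations; the default hyperparameter values (σ_l = σ_f = α = 1) are illustrative, since in practice they are fitted during training.

```python
import math

def matern32(r, sigma_l=1.0, sigma_f=1.0):
    """Matern 3/2 kernel as a function of the Euclidean distance r."""
    a = math.sqrt(3.0) * r / sigma_l
    return sigma_f ** 2 * (1.0 + a) * math.exp(-a)

def matern52(r, sigma_l=1.0, sigma_f=1.0):
    """Matern 5/2 kernel."""
    a = math.sqrt(5.0) * r / sigma_l
    return sigma_f ** 2 * (1.0 + a + a ** 2 / 3.0) * math.exp(-a)

def exponential(r, sigma_l=1.0, sigma_f=1.0):
    """Exponential kernel."""
    return sigma_f ** 2 * math.exp(-r / sigma_l)

def squared_exponential(r, sigma_l=1.0, sigma_f=1.0):
    """Squared exponential kernel."""
    return sigma_f ** 2 * math.exp(-(r ** 2) / (2.0 * sigma_l ** 2))

def rational_quadratic(r, sigma_l=1.0, sigma_f=1.0, alpha=1.0):
    """Rational quadratic kernel with shape parameter alpha."""
    return sigma_f ** 2 * (1.0 + r ** 2 / (2.0 * alpha * sigma_l ** 2)) ** (-alpha)
```

All five equal σ_f² at r = 0 and decay with distance; they differ in the smoothness they impose on the latent function, which is why the kernel choice affects the prediction quality compared in the results.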

Artificial Neural Network (ANN)
Hammoudi et al. [69] showed that the ANN technique is superior to the Response Surface Methodology in predicting compressive strength of recycled concrete aggregates. The prediction performance of the ANN technique for predicting the concrete compressive strength has also been reported in several other studies [70,71].
An ANN consists of artificial neurons that process information from the input space and propagate it towards the output. A typical ANN architecture has three layers: the input, hidden, and output layers (Figure 1). In most cases, the hidden layer performs a nonlinear transformation in order to capture the nonlinear behavior between the input and output variables of the considered problem [72][73][74][75]. In this study, the well-known sigmoid function was employed as the nonlinear transformation of the signal for a given neuron in the hidden layer, as follows [76][77][78][79][80]:

y = 1 / (1 + exp(−(w_1 x_1 + w_2 x_2 + . . . + w_n x_n + b)))

where x = (x_1, x_2, . . . , x_n) are the signals received from the previous neurons, w_i and b are the connection weights and the bias of the considered neuron, and y is its output signal. In the output layer, a linear transformation function is applied to calculate the response of the prediction problem, which is the compressive strength of HPC in this study. The optimal number of neurons in the hidden layer depends on each specific problem: too many or too few artificial neurons in the hidden layer may result in over- or underfitting [81]. In this study, after several tests, eight hidden neurons, a number equal to the dimension of the input space, were selected, which is in line with the number suggested by Behnood and Golafshani [71]. Finally, the Levenberg-Marquardt learning algorithm was chosen for the training process of the model due to its higher efficiency [70,82,83]. In the rest of the paper, the ANN model is denoted as LMNN.
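The forward pass of this architecture, one sigmoid hidden layer followed by a linear output, can be sketched as follows. The weights and biases shown are illustrative placeholders; in the paper they would be obtained by Levenberg-Marquardt training on (normalized) inputs.

```python
import math

def sigmoid(z):
    """Logistic activation used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))

def hidden_neuron(x, weights, bias):
    """Output of one hidden neuron: sigmoid of the weighted input sum plus bias."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def forward(x, hidden_w, hidden_b, out_w, out_b):
    """One sigmoid hidden layer followed by a linear output unit."""
    h = [hidden_neuron(x, w, b) for w, b in zip(hidden_w, hidden_b)]
    return sum(wo * hj for wo, hj in zip(out_w, h)) + out_b

# Illustrative weights for an 8-input, 8-hidden-neuron network (in practice
# they are found by Levenberg-Marquardt training, not set by hand).
n_in, n_hidden = 8, 8
hidden_w = [[0.01] * n_in for _ in range(n_hidden)]
hidden_b = [0.0] * n_hidden
out_w = [1.0] * n_hidden
out_b = 0.0
prediction = forward([0.0] * n_in, hidden_w, hidden_b, out_w, out_b)
```

With all-zero inputs each sigmoid neuron outputs 0.5, so the linear output unit returns the sum of the output weights times 0.5; the nonlinearity of the hidden layer is what lets the network capture the nonlinear mix-strength relationship.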

Monte Carlo Approach and Statistical Analysis
For typical construction and building materials such as concrete, many studies involving Monte Carlo technique have been introduced in the literature, taking into account the variability in the input space. For instance, Wang et al. [84] quantified the size effect of random aggregates and pores on the mechanical properties of concrete. Jaskulski and Wilinski [85] proposed a probabilistic analysis for concrete subjected to shear. Kostic et al. [86] optimized the response surface methodology in predicting the compressive strength of concrete based on a Monte Carlo framework. Numerical prediction models involving the Monte Carlo method can explain the variation of the output results through a statistical analysis.
The Monte Carlo method is extremely robust and efficient for calculating the propagation of the input variability to the output results, especially using numerical AI models [87][88][89][90]. The main idea behind the Monte Carlo method is to generate random realizations in the input space and then calculate the corresponding output through a simulation model [91,92]. This numerical technique therefore lends itself effectively to parallel [93][94][95] and distributed [96][97][98] computing. A concept of the Monte Carlo method involving a two-dimensional input space with a typical probability distribution is presented in Figure 2. The robustness of the model and the sensitivity of the input variables can be investigated through a statistical characterization of the output results (Figure 2).

In this paper, the statistical convergence of the Monte Carlo simulations was investigated using the normalized difference (normdiff) of the running mean of the considered random variable [100][101][102]:

normdiff(n_MC) = 100 × | Ḡ(n_MC) − Ḡ(n_MC − 1) | / Ḡ(n_MC) (%)

where Ḡ(n_MC) is the mean value of the considered random variable G over the first n_MC Monte Carlo runs. This convergence function provides efficient information about the required computational time and ensures reliable results for the later statistical analysis.
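The convergence check, tracking how the running mean of an error measure stabilizes as runs accumulate, can be sketched as follows. The relative-change (percent) form and the synthetic R² samples (drawn around the GPR-32 statistics reported in the robustness analysis, mean 0.893 and StD 0.015) are illustrative assumptions; real values would come from the trained models.

```python
import random

def running_mean_convergence(samples):
    """Percent change of the running mean of G after each additional MC run."""
    conv = []
    total = 0.0
    prev_mean = None
    for n, g in enumerate(samples, start=1):
        total += g
        mean_n = total / n
        if prev_mean is not None:
            conv.append(100.0 * abs(mean_n - prev_mean) / abs(mean_n))
        prev_mean = mean_n
    return conv

# 500 illustrative R^2 values standing in for 500 Monte Carlo runs.
rng = random.Random(0)
conv = running_mean_convergence([rng.gauss(0.893, 0.015) for _ in range(500)])
```

Plotting `conv` against the run index reproduces the qualitative behavior of Figure 7: large early fluctuations that decay as the running mean settles, which is how the required number of Monte Carlo runs is judged.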

Quality Assessment
For the evaluation of the results given by the AI models, several criteria, namely the coefficient of determination (R²), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE), were used in this research. R² is a key output in regression analysis. It is interpreted as the square of the correlation (R) between the predicted and the actual outputs, and varies from 0 to 1. A high value of R² indicates a good correlation between the predicted and the target values. The RMSE measures the average squared difference between the predicted and actual outputs of an AI model [103][104][105], whereas the MAE measures the average absolute error between them [106]. In contrast to R², lower values of RMSE and MAE indicate a better performance of an AI algorithm [107,108]. The quantities R², RMSE, and MAE are defined as follows [100]:

R² = [ Σ (x_i − x̄)(y_i − ȳ) ]² / [ Σ (x_i − x̄)² · Σ (y_i − ȳ)² ]

RMSE = sqrt( (1/N) Σ (x_i − y_i)² )

MAE = (1/N) Σ |x_i − y_i|

where the sums run over i = 1, . . . , N; N is the number of samples; x_i and y_i are the actual and predicted outputs, respectively; and x̄ and ȳ are the means of the actual and predicted outputs, respectively.
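The three criteria can be sketched directly from their definitions; this is a minimal stdlib-only illustration, with R² written as the squared Pearson correlation as described above.

```python
import math

def r_squared(actual, predicted):
    """Square of the Pearson correlation R between actual and predicted values."""
    n = len(actual)
    mx = sum(actual) / n
    my = sum(predicted) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(actual, predicted))
    sx = math.sqrt(sum((x - mx) ** 2 for x in actual))
    sy = math.sqrt(sum((y - my) ** 2 for y in predicted))
    return (cov / (sx * sy)) ** 2

def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    return sum(abs(x - y) for x, y in zip(actual, predicted)) / len(actual)
```

Note that R² as a squared correlation is insensitive to a constant bias: a predictor that overestimates every strength by exactly 1 MPa still scores R² = 1, while its RMSE and MAE both equal 1 MPa. This is one reason the three criteria are used together rather than relying on R² alone.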

Methodology Flowchart
The methodology of the present study for predicting the compressive strength of HPC consists of four main steps (Figure 3):
Step 1: Data preparation. A database containing data for 1030 samples of compressive strength laboratory experiments was used, including the eight input parameters and the compressive strength of HPC as the output variable. The dataset was divided into two parts for the training and validation procedures of the AI models.
Step 2: Model training. In this second step, the proposed AI models such as Gaussian Process Regression with different covariance functions (i.e., GPR-52, GPR-32, GPR-EXP, GPR-SQEXP, GPR-RSQ) and LMNN were trained using the training dataset.
Step 3: Model testing. In this third step, the trained AI models were validated using the testing data set (30% of the initial data). Statistical measures R 2 , RMSE, and MAE were applied in this step to evaluate the performance of the AI models.
Step 4: Robustness and sensitivity analysis. In this final step, the robustness of the AI algorithms was investigated using Monte Carlo simulations. Precisely, 500 Monte Carlo simulations were performed to evaluate the prediction capabilities of the AI models and to provide a statistical comparison between them. Thus, 500 values of each error criterion (R², RMSE, and MAE) were calculated, and the convergence of each model was studied. In terms of the sensitivity analysis, the response of each AI model to the input variability was studied. For this purpose, Monte Carlo simulations were executed while removing the input variables from the problem one at a time.
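The robustness loop of Step 4 can be sketched as follows. The synthetic one-feature data and the least-squares line are stand-ins for the real HPC database and the GPR/ANN models (assumptions for illustration only); what the sketch preserves is the structure of the experiment: a fresh random 70/30 split per run, refitting, and a distribution of test-set errors.

```python
import math
import random

rng = random.Random(1)
# Synthetic stand-in data: strength = 0.05 * cement + noise (illustrative only,
# not the real 1030-sample HPC database).
cements = [rng.uniform(100.0, 500.0) for _ in range(1030)]
data = [(c, 0.05 * c + rng.gauss(0.0, 2.0)) for c in cements]

def one_run(seed):
    """One Monte Carlo realization: random 70/30 split, fit, test-set RMSE."""
    r = random.Random(seed)
    idx = list(range(len(data)))
    r.shuffle(idx)
    n_tr = round(0.7 * len(data))
    train = [data[i] for i in idx[:n_tr]]
    test = [data[i] for i in idx[n_tr:]]
    # "Training": least-squares line as a stand-in for fitting GPR/LMNN.
    xs, ys = zip(*train)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    # Testing: RMSE on the held-out 30%.
    return math.sqrt(sum((y - (a * x + b)) ** 2 for x, y in test) / len(test))

rmses = [one_run(s) for s in range(500)]
mean_rmse = sum(rmses) / len(rmses)
std_rmse = math.sqrt(sum((v - mean_rmse) ** 2 for v in rmses) / len(rmses))
```

The mean and standard deviation of the 500 RMSE values play the role of the mean(RMSE) and StD(RMSE) entries reported for each algorithm: the mean measures accuracy, the StD measures robustness to the random sampling of the split.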

Prediction Capability of the AI Models
In this section, the performance of six AI models (GPR-52, GPR-32, GPR-EXP, GPR-SQEXP, GPR-RSQ, and LMNN) for predicting the HPC compressive strength is investigated (Figures 4 and 5). The errors between the predicted and experimental values of the compressive strength for both the training and testing parts are plotted as a function of the sample index (Figure 4a,c). The probability distributions of the errors in the training and testing datasets are presented in Figure 4b,d, respectively. Figure 5 presents the predicted outputs versus the corresponding testing targets for the GPR-52, GPR-32, GPR-EXP, GPR-SQEXP, GPR-RSQ, and LMNN models. The fitted linear lines are also highlighted in each case in order to demonstrate the performance of the six AI models.

The analysis of errors shows that for the training part, all of the AI models yield good results within a reasonable range (i.e., an error standard deviation of less than 10 MPa). It can be seen in Figure 4b that the errors are highly concentrated around zero. The GPR-32 and GPR-EXP models yield the highest error distribution peaks around zero, showing that these two models are good performers. Similarly, for the testing part, the GPR-32 and GPR-EXP models provided the best prediction results with respect to the statistical analysis of errors (Figure 4d). Regarding the prediction quality based on the linear fit lines, Figure 5 shows that all of the AI models exhibit a strong linear correlation between the predicted and actual compressive strength values (i.e., the linear fit lines lie within ±2° of the slope of the diagonal line). The performance of all AI algorithms is summarized in Table 2. Based on the prediction quality assessment and error analysis, GPR-32 is found to be slightly better than the other models for predicting the compressive strength of HPC.



Robustness Analysis of the AI Models
In this section, the performance of the six proposed AI algorithms is evaluated using Monte Carlo simulations. It has been previously demonstrated that the random sampling procedure for both the training and testing datasets greatly affects the prediction performance of AI algorithms [87,88]. Therefore, the robustness analysis of AI models should be carried out with a sufficient number of cases in order to make the obtained results more representative. A total of 3000 Monte Carlo simulations was performed (500 simulations for each of the six algorithms), and the results are plotted in Figure 6. Only results from the testing parts were considered, as they reflect the prediction performance of the algorithms [88]. The Monte Carlo simulation results obtained for R², RMSE, and MAE exhibit significant variations, with the highest standard deviation (StD) values observed for the LMNN algorithm (Table 3). In terms of the mean values of the criteria, GPR-32 performed better than the other methods, with mean(R²) = 0.893, mean(RMSE) = 5.46, and mean(MAE) = 3.86. The standard deviation values obtained by GPR-32 were also the smallest (i.e., StD(R²) = 0.015, StD(RMSE) = 0.37, and StD(MAE) = 0.21), demonstrating that GPR-32 was the most stable method. In addition, the min and max values of the R², RMSE, and MAE of all AI methods are given in Table 3.

The convergence behavior of the six AI algorithms is displayed in Figure 7, considering R², RMSE, and MAE as random variables (based on the normdiff convergence function introduced above). It is observed that at least 100 Monte Carlo runs are required to ensure statistical convergence of R², RMSE, and MAE for the GPR-based methods, whereas at least 200 runs are required for LMNN. The converged values of R², RMSE, and MAE are presented in Table 3.
It can be concluded that the most accurate AI model for predicting the compressive strength of HPC with respect to R² is GPR-32, followed by GPR-52, GPR-RSQ, GPR-SQEXP, GPR-EXP, and LMNN. A similar trend was also noticed regarding the converged statistical values of RMSE and MAE: GPR-32 was found to be the best predictor in terms of accuracy, followed by GPR-EXP, GPR-52, GPR-RSQ, GPR-SQEXP, and LMNN. The maximum fluctuation of the GPR methods was smaller (0.2%) than that of the LMNN algorithm (1.5%).
Various investigations have been introduced in the literature to predict the compressive strength of HPC using AI approaches. A summary of previous studies, including the reference, the number of data samples, the prediction models used, and the average value of R2, is given in Table 4. Various AI methods have been employed, such as the well-known ANN and SVM. Most recently, several hybrid models have been developed for HPC compressive strength prediction; for instance, the authors of [18] combined Least Squares Support Vector Regression with a global optimization technique called the Firefly Algorithm. In terms of the average value of the coefficient of determination R2, the model proposed in this study outperforms previously published results.
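The five GPR variants compared above can be instantiated, for example, with scikit-learn's kernel classes; this is only a sketch of the kernel choices, as the study's own implementation may differ. In scikit-learn terms, the Exponential kernel corresponds to a Matern kernel with nu = 0.5, and the Squared Exponential kernel to the RBF kernel.

```python
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, RBF, RationalQuadratic

# The five kernel families used for the GPR models, in scikit-learn terms:
#   Exponential         -> Matern(nu=0.5)
#   Matern32            -> Matern(nu=1.5)
#   Matern52            -> Matern(nu=2.5)
#   Squared Exponential -> RBF
#   Rational Quadratic  -> RationalQuadratic
kernels = {
    "GPR-EXP":   Matern(nu=0.5),
    "GPR-32":    Matern(nu=1.5),
    "GPR-52":    Matern(nu=2.5),
    "GPR-SQEXP": RBF(),
    "GPR-RSQ":   RationalQuadratic(),
}

# One regressor per kernel; each would then be fit on the training data.
models = {name: GaussianProcessRegressor(kernel=k, normalize_y=True)
          for name, k in kernels.items()}
```

Each model in `models` is then fitted and scored on identical train/test splits so that the R2, RMSE, and MAE rankings reflect only the kernel choice.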

Input Parameter Sensitivity Analysis
In this section, a sensitivity analysis of the input variables with respect to the prediction accuracy of the AI algorithms for the compressive strength of HPC is presented. The model selected for this investigation is GPR-32, as its prediction accuracy was shown to be the best in the previous sections with respect to both the average and standard deviation values. Indeed, the GPR-32 model yielded standard deviation values 7%, 13%, and 33% lower than those of GPR-52, GPR-EXP, and LMNN, respectively, as indicated in Table 3. In terms of average values, the GPR-32 model was the most efficient for all criteria, namely R2, RMSE, and MAE. In addition, the LMNN model was also selected in this section for the sensitivity analysis of the inputs.
The sensitivity analysis was conducted by successively replacing all values of each input in the eight-dimensional input space with a constant (equal to zero in this study). In this way, the statistical variability of the excluded input is reduced to zero, allowing the prediction models to quantify the influence of that input on the predicted targets, even in the case of highly nonlinear relationships. Monte Carlo simulations were conducted to ensure statistical convergence of the sensitivity analysis: eight groups of simulations were performed, with 500 realizations for each case (i.e., successively excluding the influence of X1 to X8). Statistical measures such as R2, RMSE, and MAE were employed to quantify the change in prediction accuracy compared with the full simulation (i.e., including all input variables).
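The exclusion procedure above can be sketched as follows. This is a simplified single-run illustration on synthetic data, not the study's pipeline: the dataset, the choice to zero out only the test inputs, and the toy target driven mainly by X1 and X8 are all assumptions made for self-containedness.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.metrics import r2_score

rng = np.random.default_rng(42)

# Synthetic stand-in for the 8-input HPC dataset (X1..X8).
n, d = 300, 8
X = rng.uniform(0.0, 1.0, size=(n, d))
# Toy target dominated by X1 and X8, mimicking cement content and age.
y = (3.0 * X[:, 0] + 2.0 * X[:, 7]
     + 0.3 * X[:, 1:7].sum(axis=1)
     + rng.normal(0.0, 0.05, size=n))

X_train, X_test = X[:200], X[200:]
y_train, y_test = y[:200], y[200:]

# Matern nu=1.5 corresponds to the Matern32 kernel (GPR-32).
gpr = GaussianProcessRegressor(kernel=Matern(nu=1.5), normalize_y=True,
                               random_state=0).fit(X_train, y_train)
r2_full = r2_score(y_test, gpr.predict(X_test))

# Zero out one input at a time and record the resulting drop in R2.
drops = []
for j in range(d):
    X_mod = X_test.copy()
    X_mod[:, j] = 0.0          # suppress the influence of input j
    drops.append(r2_full - r2_score(y_test, gpr.predict(X_mod)))

# Scale the raw indices by their sum to obtain percentage sensitivities.
sensitivity = 100.0 * np.asarray(drops) / np.sum(drops)
print(np.argmax(sensitivity) + 1)  # index of the most sensitive input
```

In the full analysis, this loop is repeated over 500 Monte Carlo train/test splits per excluded input, and the averaged drops in R2, RMSE, and MAE yield the sensitivity indices reported below.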
A total of 4000 simulations over the 8 groups were performed, and the convergence functions over the 500 Monte Carlo simulations of each group were evaluated using Equation (19). All convergence functions reached a stationary solution within an acceptable range of error. The probability density functions of the quality assessment criteria (i.e., R2, RMSE, and MAE) over 500 runs, successively excluding the statistical properties of X1 to X8, are shown in Figure 8 for LMNN and GPR-32. Finally, the sensitivity indices of the inputs were obtained by calculating the difference in the considered quality assessment criterion between the full simulation and the case of excluding an input. All eight sensitivity indices were then scaled by their sum and are shown in Figure 9, classified in descending order with respect to R2, RMSE, and MAE, respectively. It is observed in Figure 9 that the most sensitive input is X8 (the age of concrete): its probability distribution in Figure 8 exhibits the largest difference from the others, so the change in the mean value when excluding X8 was the most pronounced. This conclusion is reached with both the LMNN and GPR-32 models, with respect to the three statistical criteria R2 (Figure 9a), RMSE (Figure 9b), and MAE (Figure 9c). The age of concrete (X8) is thus the most important input affecting the prediction accuracy of HPC compressive strength. As also identified in Figure 9, the second most important input is the cement content (X1).

Figure 9. Classification of sensitivity of inputs in the prediction of compressive strength, with respect to: (a) R2, (b) RMSE, and (c) MAE.

Conclusions
In this study, sensitivity and robustness analyses of two AI models, GPR and LMNN, were carried out for the prediction of the compressive strength of HPC using the Monte Carlo approach. Various kernel functions, namely Matern32, Matern52, Exponential, Squared Exponential, and Rational Quadratic, were employed to construct the corresponding GPR models (GPR-32, GPR-52, GPR-EXP, GPR-SQEXP, and GPR-RSQ). The predictive capabilities of the AI models were evaluated using criteria such as RMSE, R2, and MAE. In addition, the Monte Carlo approach with 500 runs was used to deduce the converged statistical values of these criteria, both to analyze the robustness of the AI models and to carry out the sensitivity analyses of the input parameters.
The results showed that the AI models developed in this study performed well in predicting the HPC compressive strength, with GPR-32 (R2 = 0.893, RMSE = 5.46, MAE = 3.86) being the most efficient model among the algorithms compared. The Monte Carlo simulation results showed that two parameters, namely the content of cement and the testing age of HPC, were the most sensitive and important factors for predicting the compressive strength of HPC. It can therefore be concluded that GPR-32 is a promising approach for the prediction of the compressive strength of HPC, and it could also be applied to predict other important properties of HPC such as tensile strength, flexural strength, or modulus of elasticity. However, it was also noticed that this model is sensitive to the selection of input parameters; thus, a sensitivity analysis should be carried out to evaluate the importance of the input parameters, which might help in the suitable selection of input factors for better performance of the predictive models. In general, this study might help engineers select suitable AI models and appropriate parameters for the fabrication process of HPC.