Compressive Strength Prediction of High-Strength Concrete Using Long Short-Term Memory and Machine Learning Algorithms

Abstract: Compressive strength is an important mechanical property of high-strength concrete (HSC), but testing methods are usually uneconomical, time-consuming, and labor-intensive. To this end, in this paper, a long short-term memory (LSTM) model was proposed to predict the HSC compressive strength using 324 data samples with five independent input variables, namely water, cement, fine aggregate, coarse aggregate, and superplasticizer. The prediction results were compared with those of the conventional support vector regression (SVR) model using four metrics: root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and correlation coefficient (R²). The results showed that the prediction accuracy and reliability of LSTM were higher, with R² = 0.997, RMSE = 0.508, MAE = 0.08, and MAPE = 0.653, compared to R² = 0.973, RMSE = 1.595, MAE = 0.312, and MAPE = 2.469 for the SVR model. The LSTM model is recommended for the pre-estimation of HSC compressive strength under a given mix ratio before the laboratory compression test. Additionally, a Shapley additive explanations (SHAP)-based approach was used to analyze the relative importance and contribution of the input variables to the output compressive strength.


Introduction
Concrete is used worldwide owing to its economy, monolithic character, modularity, and durability. High-strength concrete (HSC), which is defined by a compressive strength of more than 40 MPa [1], was developed in the late 1950s and early 1960s in the field of cementitious materials. The American Concrete Institute (ACI) defines HSC as "concrete that meets specific performance and homogeneity requirements that cannot always be achieved through the use of conventional materials and conventional mixing, placing and curing procedures". Nowadays, HSC is widely used in large-span bridges, high-rise buildings, and piers due to its uniform high density, low permeability, and high durability [2].
To better understand design methods and the performance of concrete structures under external loads, it is of great importance to study the mechanical properties of concrete. Among the wide variety of concrete properties, the most important is the compressive strength, as it is directly related to the safety of the structure and is necessary to assess the performance of the structure throughout its life cycle. However, concrete is a non-homogeneous mixture of cement, sand, gravel, supplementary raw materials, and admixtures, and these ingredients are randomly distributed in the concrete mix. Many factors affect the compressive strength of concrete, including waste composition, particle size, water-cement ratio, and aggregate ratio. Therefore, it is quite difficult to accurately predict the concrete compressive strength in such a complex matrix. The currently accepted method is to determine the compressive load-bearing capacity of concrete by physical tests. Generally, cubic and cylindrical concrete specimens are prepared according to certain mix ratios and cured for a period of time, and then the compressive strength is measured with a compression testing machine. This method has been standardized around the world and is widely used for laboratory and field testing, but it is inefficient, uneconomical, and time-consuming. In fact, for any kind of concrete with an expected strength, a reasonable mix design requires many attempts and laboratory tests. Moreover, the design procedure for HSC is more complex than that for normal-strength concrete: it requires experience and more in-depth knowledge of the chemical and mechanical properties of the components, and usually several batches of tests are needed to obtain concrete with the desired properties. Thus, time and cost can be saved if the compressive strength can be estimated early and accurately through calculations before the compression tests are carried out.
Empirical regression methods seem to be more suitable for assessing the compressive load-carrying capacity of concrete than traditional experimental techniques. With the development of artificial intelligence, it has become common and convenient to estimate the compressive strength of concrete using machine learning methods. Indeed, machine learning algorithms such as artificial neural networks (ANN), random forest (RF), support vector machine (SVM), and decision tree (DT) have been widely used for the prediction of the compressive strength of concrete [3][4][5][6][7][8][9][10][11]. Hai et al. [12] proposed a deep neural network (DNN) to predict the compressive strength of rubber concrete, and achieved high accuracy and reliability with R = 0.9874. Abobakr et al. [13] developed an extreme learning machine (ELM) model to predict the compressive strength of high-strength concrete, and the results showed that the ELM method has good prediction accuracy and fast learning speed compared with the traditional back-propagation (BP) neural network. Muliauwan et al. [14] applied three intelligent algorithms (linear regression, ANN, and SVM) to 1030 samples to find the most accurate mapping between inputs and output in concrete mixtures, and the results showed that these methods can predict compressive strength with high accuracy without expensive laboratory experiments. Song et al. [15] used gene expression programming (GEP), ANN, DT, and bagging algorithms to predict the compressive strength of fly ash admixture concrete. The results indicated that the bagging algorithm outperformed the other three algorithms, with the highest prediction correlation coefficient R² = 0.95. To explore the applicability of ensemble learning models, Furqan et al. [16] applied machine learning algorithms with individual learners and ensemble learners to 1030 data samples to predict the compressive strength of sustainable high-performance concrete prepared from waste materials. It was found that ensemble models can improve performance compared to traditional machine learning algorithms. The studies mentioned above demonstrate that machine learning performs well in the regression prediction of concrete strength. However, the algorithms in these studies are mostly traditional machine learning algorithms with limited predictive capability. Better prediction performance requires model tuning to find appropriate model parameters, which is itself a considerable challenge. Compared with conventional machine learning algorithms, a deep learning model with better prediction performance may therefore be a better choice.
With the application of deep learning in civil engineering, the long short-term memory (LSTM) network, a special form of recurrent neural network (RNN) that can learn long-term dependencies, has performed well in many regression problems. Harun et al. [17] estimated the geopolymerization process of fly ash-based polymers using deep LSTM and machine learning models. The results showed that the deep LSTM achieved an accuracy of 99.55%, higher than the 98.83% and 91.62% obtained by SVR and K-nearest neighbor (KNN), respectively. Sarmad [18] used the LSTM model on 1030 samples to predict the compressive strength of high-performance concrete, achieving high accuracy with R² = 0.98. Harun et al. [19] employed two deep learning methods, namely stacked autoencoders and an LSTM network, to predict the compressive strength and ultrasonic pulse velocity of concrete containing silica fume at high temperatures, and the results showed that LSTM achieved better prediction results. Overall, the LSTM model has exhibited good performance in the prediction of the mechanical properties of concrete, but research on its use for concrete strength prediction is still relatively limited, and in-depth analysis is needed before it can be applied more widely. For this reason, this paper proposes an LSTM-based model to predict the HSC compressive strength and compares the prediction results with those of the conventional support vector regression (SVR) model.

LSTM
The LSTM network, proposed by Hochreiter and Schmidhuber in 1997 [20], addresses the vanishing-gradient and exploding-gradient problems by introducing a gating mechanism. As a powerful recurrent neural network model, LSTM can extract both long- and short-term dependencies and thereby achieve effective feature extraction from time-series data [21][22][23]. As shown in Figure 1, the principal structure of the LSTM comprises the forget gate, input gate, update gate, and output gate. In the standard formulation, the main formulas of the LSTM structure are as follows [24,25]:

$$f_t = \sigma\left(W_f [h_{t-1}, x_t] + b_f\right)$$
$$i_t = \sigma\left(W_i [h_{t-1}, x_t] + b_i\right)$$
$$g_t = \tanh\left(W_g [h_{t-1}, x_t] + b_g\right)$$
$$o_t = \sigma\left(W_o [h_{t-1}, x_t] + b_o\right)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot g_t$$
$$h_t = o_t \odot \tanh(c_t)$$

where $f_t$, $i_t$, $g_t$, and $o_t$ are the output values of the forget, input, update, and output gates, respectively; $W_f$, $W_i$, $W_g$, and $W_o$ are weight matrices; $b_f$, $b_i$, $b_g$, and $b_o$ are bias vectors; $c_t$ is the memory cell state; $\sigma$ is the sigmoid activation function; $x_t$ and $h_t$ are the input and the hidden state at step $t$; and $\odot$ denotes the element-wise product.
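To make the gating mechanism concrete, the following is a minimal sketch of a single LSTM time step in plain Python. It assumes the standard cell update (sigmoid forget, input, and output gates, a tanh update gate, and a hidden state h_t computed from the cell state); the names `lstm_step`, `affine`, and `params` are hypothetical and are not taken from the paper, whose model was built in MATLAB.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step. params maps gate name ('f', 'i', 'g', 'o') to (W, b).

    Assumption: each W has shape (hidden, hidden + input) and acts on the
    concatenation [h_prev, x_t].
    """
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]

    def gate(name, activation):
        W, b = params[name]
        return [activation(a) for a in affine(W, z, b)]

    f_t = gate("f", sigmoid)    # forget gate
    i_t = gate("i", sigmoid)    # input gate
    g_t = gate("g", math.tanh)  # update (candidate) gate
    o_t = gate("o", sigmoid)    # output gate

    # New cell state: keep part of the old memory, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # New hidden state: gated, squashed view of the cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every sigmoid gate outputs 0.5 and the update gate outputs 0, so the cell state is simply halved at each step; this is a convenient sanity check before plugging in trained weights.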

Support Vector Regression
Support vector regression (SVR) is the application of SVM to regression problems. Compared with ANN, SVM handles nonlinear regression problems well and has the advantage of converging to a global optimum rather than a local one. Moreover, the model is accurate in predicting strength and easy to implement compared with other methods [26]. As shown in Figure 2, SVR adopts the concept of the ε-insensitive zone, in which a margin is defined to control the deviation of the prediction points. In linear SVR, the function $f(x)$ is used as the solution of the problem [27]:

$$f(x) = w^{T} x + b$$

where $w$ is the weight vector, $x$ is the input vector, and $b$ is the bias. For the nonlinear case, a mapping $\varphi(x)$ is used to map the data to a high-dimensional feature space, so that $f(x) = w^{T} \varphi(x) + b$, and inner products in that space are evaluated with a nonlinear kernel.
Linear regression is then carried out in the higher-dimensional feature space. In the standard ε-insensitive formulation, the coefficients $w$ and $b$ are determined by minimizing the following function [28]:

$$\min_{w,\, b,\, \xi,\, \xi^{*}} \; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \left(\xi_i + \xi_i^{*}\right)$$

subject to

$$t_i - f(x_i) \le \varepsilon + \xi_i, \qquad f(x_i) - t_i \le \varepsilon + \xi_i^{*}, \qquad \xi_i,\ \xi_i^{*} \ge 0$$

where $n$ is the number of samples, $t_i$ is the target value of sample $i$, $C > 0$ is the penalty parameter, and $\xi_i$ and $\xi_i^{*}$ are two slack variables. Because the Gaussian radial basis function (RBF) is the most widely used kernel, it is adopted as the kernel function in this paper. It can be expressed as follows [29]:

$$K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^{2}\right)$$

where $\gamma > 0$ is the kernel parameter.
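The two ingredients of SVR described above, the ε-insensitive zone and the Gaussian RBF kernel, can be sketched in a few lines of plain Python. This is an illustration only: the kernel is written in the common exp(-γ‖x − z‖²) form, the prediction uses the usual dual (kernel expansion) form of the regression function, and the function names, the ε value, and the toy support vector are hypothetical rather than taken from the paper.

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian RBF kernel exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def eps_insensitive_loss(target, prediction, eps):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(target - prediction) - eps)


def svr_predict(x, support_vectors, dual_coefs, b, gamma):
    """Kernel-expansion prediction f(x) = sum_i coef_i * K(x_i, x) + b.

    Assumption: dual_coefs holds one signed coefficient per support vector,
    as produced by a trained SVR solver (not implemented here).
    """
    return sum(coef * rbf_kernel(sv, x, gamma)
               for sv, coef in zip(support_vectors, dual_coefs)) + b
```

For example, `eps_insensitive_loss(50.0, 50.3, 0.5)` is 0 because the prediction lies inside the tube, whereas `eps_insensitive_loss(50.0, 52.0, 0.5)` is 1.5; only the second sample would require a non-zero slack variable.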

Dataset Description
The data set consists of 324 samples collected from the literature [13], each containing 5 input variables and 1 output, the compressive strength. The input variables are water, cement, fine aggregate, coarse aggregate, and superplasticizer. For convenience, the abbreviations of all variables are listed in the Abbreviations section. Figure 3 shows the distribution of these variables and the Pearson correlation coefficients between them. The linear correlation between the individual input variables and the output was found to be weak, indicating a complex nonlinear relationship between the five input variables and the compressive strength. About 80% of the samples were randomly selected for training, and the remaining 20% were used for testing. The statistical characteristics of the data set are shown in Table 1.
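The two data-preparation steps described here, a random 80/20 split and a Pearson correlation check, can be sketched as follows. The function names and the fixed seed are assumptions made for reproducibility of the illustration; the paper does not state how the random selection was seeded or the exact number of samples in each subset.

```python
import math
import random


def train_test_split(n_samples, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) from a seeded random shuffle."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return covariance / math.sqrt(var_a * var_b)
```

Under these assumptions, `train_test_split(324)` yields 259 training and 65 test indices. A value of `pearson_r` close to zero between an input column and the strength column indicates a weak linear relationship, which does not rule out a strong nonlinear one.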

Performance-Evaluation Methods
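When assessing a prediction model, it is important to use several evaluation metrics to judge its effectiveness. In this paper, four metrics, root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and correlation coefficient (R²), were used to analyze the predictive performance. In their standard forms, these metrics are defined as follows [31]:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(t_i - y_i\right)^{2}}$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left|t_i - y_i\right|$$
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left|\frac{t_i - y_i}{t_i}\right|$$
$$R^{2} = \frac{\left[\sum_{i=1}^{n} \left(t_i - \bar{t}\right)\left(y_i - \bar{y}\right)\right]^{2}}{\sum_{i=1}^{n} \left(t_i - \bar{t}\right)^{2} \sum_{i=1}^{n} \left(y_i - \bar{y}\right)^{2}}$$

where $n$ is the number of samples, $t_i$ is the experimental value, $y_i$ is the predicted value, $\bar{t}$ is the mean value of $t$, and $\bar{y}$ is the mean value of $y$.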
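The four metrics can be computed with a few short functions. The sketch below assumes the standard definitions of RMSE, MAE, and MAPE (the latter expressed in percent) and takes R² to be the squared Pearson correlation between experimental and predicted values, which is an assumption based on the text defining the means of both t and y. The sample values in the usage note are made up for illustration and are not from the paper's data set.

```python
import math


def rmse(t, y):
    """Root mean square error between experimental t and predicted y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t, y)) / len(t))


def mae(t, y):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(t, y)) / len(t)


def mape(t, y):
    """Mean absolute percentage error, in percent (requires non-zero t)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(t, y)) / len(t)


def r_squared(t, y):
    """Squared Pearson correlation between t and y (assumed form of R^2)."""
    mean_t = sum(t) / len(t)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_t) * (b - mean_y) for a, b in zip(t, y))
    var_t = sum((a - mean_t) ** 2 for a in t)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance ** 2 / (var_t * var_y)
```

For the hypothetical values `t = [40.0, 50.0, 60.0]` and `y = [42.0, 50.0, 58.0]`, the MAE is 4/3 while `r_squared` is 1, because the predictions are a perfect linear function of the targets. In this form R² measures linear association only, so it is best read together with the error metrics.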

Model Building
The LSTM model developed has five inputs and one output. The LSTM layer had 300 hidden units, and the fully connected layer had 100 units [18]. For the SVR model with the RBF kernel function, the penalty parameter c and the kernel parameter g have a great effect on the predictive performance. Usually, the SVR model requires an optimization algorithm to obtain the best combination of parameters, but this is beyond the scope of this paper. For this reason, a basic SVR model was used. Following reference [32], the two parameters c and g were set to 1 and 0.1, respectively.

Model Training
First, the SVR model was trained on the training set using ten-fold cross-validation. For comparison, the LSTM model was trained on the same training set. For the LSTM model, the "adam" optimizer was selected. The mini-batch size was 64, the number of training epochs was 1000, and the gradient threshold was 1. The initial learning rate was 0.001, and it was dropped by a factor of 0.1 every 250 epochs during training. The model was trained using MATLAB R2021a (The MathWorks Inc., Natick, MA, USA) on a laptop with an Intel(R) Core(TM) i7 processor and 16 GB of memory. Additionally, to speed up the training process, an NVIDIA™ GPU was used. The training process of the model is shown in Figure 4. Both RMSE and loss converged to values close to 0 during training, indicating that the model was well trained.
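Two of the training settings above lend themselves to a short illustration: the stepwise learning-rate drop and the ten-fold partition of the training set. The sketch below is in plain Python rather than MATLAB, counts epochs from 0, and uses contiguous, unshuffled folds; these choices and the function names are assumptions made for illustration, not a description of the authors' implementation.

```python
def piecewise_learning_rate(epoch, initial_lr=0.001, drop_factor=0.1,
                            drop_period=250):
    """Learning rate at a 0-based epoch: multiplied by drop_factor
    once every drop_period epochs."""
    return initial_lr * drop_factor ** (epoch // drop_period)


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, validation_indices) for k contiguous folds.

    The first n_samples % k folds receive one extra sample so that every
    sample is used for validation exactly once.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop
```

With the stated settings, the learning rate is 0.001 for epochs 0 to 249, about 1e-4 for epochs 250 to 499, and about 1e-6 by the final 250 epochs of a 1000-epoch run.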

Comparison of Prediction Results
The predictions of the two models for the dataset are shown in Figure 5. On the test set, the predicted values of the LSTM model clearly match the actual values better. For a more visual representation, the scatter plot of the predictions for each sample is shown in Figure 6. Compared with the SVR model, the scatter points of the LSTM model are closer to the diagonal, indicating higher prediction accuracy, which can also be seen from the fitted correlation coefficient between the predicted and actual values. The histogram of the residuals is shown in Figure 7. The normal distribution curve fitted to the residuals shows that the mean of the LSTM residuals is closer to zero and their standard deviation is smaller.
The prediction output was statistically analyzed, and the evaluation metrics are listed in Table 2. Compared with SVR, the LSTM model obtained higher prediction accuracy, with R² = 0.997, RMSE = 0.508, MAE = 0.08, and MAPE = 0.653, so it can be recommended as a candidate tool for predicting the compressive strength of HSC. Moreover, these results further validate the ability of the LSTM model to capture the complex nonlinear relationship between the five input parameters and the compressive strength of HSC.

Importance Analysis of Input Variables on Output
The results of Section 4 show that, given a mix ratio, a more accurate compressive strength estimate can be obtained with the LSTM model. For HSC, if the pre-estimated compressive strength of the mix design does not meet the designer's expectation, the content of each ingredient needs to be adjusted repeatedly until a suitable mix design is found. However, without knowing the effect and contribution of each input variable to the predicted output, these attempts are blind and require a lot of trial and error. For this reason, a Shapley additive explanations (SHAP)-based method was used to investigate the relative importance of each input variable to the output and whether each variable contributes positively or negatively to it [33]. A detailed description of the SHAP approach can be found in references [34,35].
As shown in Figure 8, the average SHAP values represent the relative importance of the input variables for the output. Among the five variables considered in this paper, cement has the greatest effect on HSC compressive strength, followed by water, coarse aggregate, superplasticizer, and fine aggregate. In addition, the summary plot used to show the global influence of the input features is given in Figure 9, where each point represents the Shapley value of one feature for one observation in the dataset. The position of each point on the x-axis is the Shapley value of the factor, showing its effect on compressive strength, while the y-axis gives the order of importance of the factors. When high feature values of a variable coincide with positive Shapley values in Figure 9, the variable contributes positively to the output compressive strength; conversely, when high feature values coincide with negative Shapley values, it contributes negatively. It can be observed that cement and superplasticizer are positive for the compressive strength, which increases with their amount. In contrast, water, coarse aggregate, and fine aggregate are negative for compressive strength, and an increase in the amount of these three ingredients leads to a decrease in the compressive strength of HSC.
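The quantities plotted in Figures 8 and 9 can be illustrated with an exact Shapley computation, which is tractable here because there are only five input variables (32 coalitions per sample). The sketch below is not the SHAP library and not the authors' procedure: it assumes that an absent feature is replaced by a fixed baseline value (for instance the feature mean), and `model` stands for any hypothetical prediction function, such as a wrapper around a trained network.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model at sample x.

    Assumption: features outside a coalition take their baseline value.
    """
    n = len(x)

    def coalition_value(subset):
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return model(z)

    phi = [0.0] * n
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[j] += weight * (coalition_value(s | {j}) - coalition_value(s))
    return phi


def mean_abs_shapley(model, samples, baseline):
    """Global importance: mean absolute Shapley value of each feature."""
    totals = [0.0] * len(baseline)
    for x in samples:
        for j, value in enumerate(shapley_values(model, x, baseline)):
            totals[j] += abs(value)
    return [total / len(samples) for total in totals]
```

For a toy linear model such as `lambda z: 2.0 * z[0] - z[1]`, the Shapley value of each feature reduces to its coefficient times its deviation from the baseline, and the values sum to the difference between the prediction at the sample and at the baseline. The sign of each value indicates whether that feature pushes the prediction up or down for that sample, and the mean absolute value gives a ranking of the kind shown in an importance bar chart.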

Conclusions
In this paper, the LSTM model was employed to predict the HSC compressive strength, and the predicted results were compared with a conventional SVR model. The main conclusions are summarized as follows.
(1) The LSTM model can capture the complex nonlinear relationship between the five input parameters and the compressive strength of HSC, with R² exceeding 0.99 in both the training and testing stages.
(2) Compared with the conventional SVR model, the LSTM model has superior prediction capacity and is recommended as an alternative method for predicting the compressive strength of HSC. A pre-estimate of the HSC compressive strength can be obtained with the LSTM model before laboratory compression tests are carried out, which will greatly reduce the time and cost of laboratory testing.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: Some or all data, models, or codes that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments:
The authors would also like to thank the three anonymous reviewers for their constructive comments.

Conflicts of Interest:
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.