Predictive Modeling of Recycled Aggregate Concrete Beam Shear Strength Using Explainable Ensemble Learning Methods

Cakiroglu, Celal; Bekdaş, Gebrail

doi:10.3390/su15064957


by Celal Cakiroglu ¹ and Gebrail Bekdaş ²,*

¹ Department of Civil Engineering, Turkish-German University, Istanbul 34820, Turkey

² Department of Civil Engineering, Istanbul University-Cerrahpasa, Istanbul 34320, Turkey

* Author to whom correspondence should be addressed.

Sustainability 2023, 15(6), 4957; https://doi.org/10.3390/su15064957

Submission received: 28 December 2022 / Revised: 6 March 2023 / Accepted: 9 March 2023 / Published: 10 March 2023

(This article belongs to the Special Issue Engineering Properties and Environmental Effect of Recycled Waste in Geotechnical Engineering)


Abstract:

Construction and demolition waste (CDW) together with the pollution caused by the production of new concrete are increasingly becoming a burden on the environment. An appealing strategy from both an ecological and a financial point of view is to use construction and demolition waste in the production of recycled aggregate concrete (RAC). However, past studies have shown that the currently available code provisions can be unconservative in their predictions of the shear strength of RAC beams. The current study develops accurate predictive models for the shear strength of RAC beams based on a dataset of experimental results collected from the literature. The experimental database used in this study consists of full-scale four-point flexural tests. The recycled coarse aggregate (RCA) percentage, compressive strength (f_{c}^{'}), effective depth (d), width of the cross-section (b), ratio of shear span to effective depth (a / d), and ratio of longitudinal reinforcement (ρ_{w}) are the input features used in the model training. It is demonstrated that the proposed machine learning models outperform the existing code equations in the prediction of shear strength. State-of-the-art metrics of accuracy, such as the coefficient of determination (R^{2}), mean absolute error, and root mean squared error, have been utilized to quantify the performances of the ensemble machine learning models. The most accurate predictions could be obtained from the XGBoost model, with an R^{2} score of 0.94 on the test set. Moreover, the impact of different input features on the machine learning model predictions is explained using the SHAP algorithm. Using individual conditional expectation plots, the variation of the model predictions with respect to different input features has been visualized.

Keywords:

recycled aggregate concrete; shear strength; machine learning; XGBoost; SHAP

1. Introduction

The production of cement and natural coarse aggregate has been causing the large-scale consumption of raw materials worldwide [1,2]. The increase in construction projects aiming to overhaul buildings and infrastructure leads to a shortage of natural coarse aggregate [3]. Furthermore, the renewal of infrastructure entails the generation of large amounts of construction and demolition waste (CDW) [4]. The derivation of coarse aggregate by processing CDW is proposed as a solution to reduce the environmental impact of CDW and concrete production [5]. However, the usage of recycled coarse aggregate (RCA) can change the mechanical and workability properties of concrete in an unfavorable way. The mortar attached to RCA lowers the workability, density, and compressive strength of concrete by increasing the amount of absorbed water [6,7]. It is reported that the strength reduction due to the usage of RCA depends on various factors such as the quality of the concrete from which the RCA was derived, the replacement percentage of RCA, and the water/cement ratio [8]. There have been numerous studies in the past few decades investigating the impact of replacing natural coarse aggregate (NCA) in structural members with RCA. In these studies, the RCA replacement is reported to affect the concrete strength to varying degrees [9,10,11,12,13,14,15,16]. The lack of consensus in the literature about the effect of RCA on the load-carrying capacity of structures makes it difficult to develop reliable models able to predict load-carrying capacity. Another factor that prevents the widespread usage of RCA in construction is the heterogeneity and region-specific composition of this material [17].

Literature Review

In recent years, machine learning models have been increasingly applied to the problem of predicting the mechanical properties and load-carrying capacity of recycled aggregate concrete (RAC). Momeni et al. [18] investigated the flexural strength of RAC beams. An artificial neural network (ANN)-based predictive model was trained using particle swarm optimization and an imperialist competitive algorithm on a dataset of experimental results. Dantas et al. [19] developed ANN models for the prediction of the 3-, 7-, 28-, and 91-day compressive strength of concrete that was produced using CDW. Felix et al. [20] developed ANN and nonlinear regression models to predict the elastic modulus of RAC. The Levenberg–Marquardt back-propagation algorithm was used in the training of the ANN model, which was able to predict the elasticity modulus with a coefficient of determination of 0.91. Gholampour et al. [21] utilized gene expression programming (GEP) to develop empirical models capable of predicting the compressive strength, elastic modulus, flexural strength, and splitting tensile strength of RAC. Hammoudi et al. [22] utilized ANN and response surface methodology (RSM) models for the prediction of the compressive strength of RAC. Cement content, RCA content, and slump were used as the input features, and ANN was found to be the more accurate model compared with RSM. Moein et al. [23] presented a comprehensive review of recent progress on the application of machine learning algorithms to the prediction of concrete’s mechanical properties. A list of studies where support vector machines (SVMs), ANNs, decision trees, and evolutionary algorithms have been employed for the prediction of concrete properties was given. Predictive models that combine SVMs with ANNs were recommended for the prediction of concrete strength due to their accuracy and ease of implementation. Mukhtar and Deifalla [24] investigated FRP-reinforced concrete deep beams without stirrups. 
A database consisting of 120 experimental results for the shear strength of FRP-reinforced deep beams was curated. A hybrid model was developed based on mechanics and nonlinear regression.

The current study investigates the effect of using RAC on the shear strength of reinforced concrete beams without stirrups. The prevention of brittle shear failure of these structures is of the utmost importance. However, the current predictive models and design guidelines in the literature are mainly developed toward beams made of natural aggregate concrete (NAC). Numerous experimental studies have been carried out for the shear strength assessment of RAC beams [15,25,26,27,28,29,30,31]. The results of these studies show that the inclusion of recycled aggregates reduces the shear capacity of beams. Although machine learning methodologies have been applied to the prediction of RAC properties such as compressive strength, splitting tensile strength, and elasticity modulus, the application of these techniques in shear strength prediction has been limited. Furthermore, in most of these studies the usage of machine learning methodologies has been limited to ANNs. To the best of the authors’ knowledge, the only studies that utilized machine learning techniques for the prediction of the shear strength of RAC have been carried out by Yu et al. [32] and Ababneh et al. [33]. Yu et al. [32] compared the shear strength predictions of the available design equations to the experimental shear strength of RAC beams. It was observed that the design equation predictions could be significantly inaccurate. The ANN and Random Forest techniques were shown to deliver better results in terms of accuracy. Ababneh et al. [33] investigated the shear strength of RAC beams using ANNs. The impacts of different input variables such as the recycled aggregate content, beam width and effective depth, reinforcement ratio, and shear span to effective depth ratio, on beam shear strength were investigated with a parametric study.

The current study presents the application of six different ensemble learning algorithms to the problem of shear strength prediction of RAC beams without stirrups. A dataset comprising the results of 128 experiments has been used in the training of these predictive models. The RCA percentage, concrete compressive strength, cross-section dimensions, shear span to effective depth ratio, and longitudinal reinforcement ratio have been used as the input features. Furthermore, using the SHAP algorithm, the impact of different input features on the model output has been visualized. State-of-the-art accuracy metrics such as mean absolute error, root mean square error, and the coefficient of determination have been utilized to quantify the accuracy of each model. In addition, the accuracies of different predictive equations from the literature have been compared with the results of the ensemble learning algorithms. Because accurate prediction methodologies for the shear strength of RAC beams are lacking in the literature, the proposed methodology fills a practical gap. The presented data-driven approach enables safer and more economic design of reinforced concrete beams, while at the same time incorporating recycled CDW in the construction process.

2. Design Equations and Machine Learning Methodologies

2.1. Concrete Shear Strength Prediction by Code Provisions and Equations from the Research Literature

This section gives a list of the equations found in the design codes and research publications. In these equations, V_{c}, b, d, ρ_{w}, and f_{c}^{'} stand for the beam shear strength, width and effective depth of the beam cross-section, ratio of the longitudinal reinforcement, and the 28-day compressive strength of the concrete, respectively. The units of these variables in Equations (1)–(8) are millimeters for length, megapascals for stress, and newtons for force. The geometry of a common test setup with a specimen in four-point bending is displayed in Figure 1, where the specimen rests on pin and roller supports. One of the earliest equations for the prediction of the shear strength of reinforced concrete beams without stirrups was developed by Zsutty [34] by applying dimensional analysis and statistical regression to existing experimental results (Equation (1)) [26,34].

V_{c} = 2.21 {(f_{c}^{'} ρ_{w} \frac{d}{a})}^{1 / 3} bd

(1)

In Equation (1), a stands for the shear span length, as shown in Figure 1. Equation (2) shows the simplified shear equation from ACI 318-14, which is still being applied in practice [1,35].

ACI 318-14:

V_{c} = \frac{1}{6} \sqrt{f_{c}^{'}} bd

(2)

An improved version of Equation (2) was introduced in the ACI 318-19 code, which considers the case when the shear reinforcement is less than the required minimum. The main difference in the ACI 318-19 equation (Equation (3)) is the inclusion of the longitudinal reinforcement effect. In Equation (3), the ratio of the longitudinal reinforcement is denoted by ρ_{w}. It can be observed that neither of these two equations considers the effect of using recycled aggregate.

ACI 318-19:

λ = \sqrt{\frac{2}{1 + 0.004 d}}

V_{c} = \begin{cases} 0.66 {(ρ_{w})}^{1 / 3} \sqrt{f_{c}^{'}} bd λ, & \text{if } λ < 1 \\ 0.66 {(ρ_{w})}^{1 / 3} \sqrt{f_{c}^{'}} bd, & \text{otherwise} \end{cases}

(3)

Rahal and Alrefaei [14] proposed a modified version of the ACI equations that considered the effect of using RAC (Equation (4)). The effect of using RCA was introduced into the equations by the strength reduction factor λ_{R} = 0.8.

V_{c} = 0.17 λ_{R} \sqrt{f_{c}^{'}} bd

(4)

Setkit et al. [1] showed that Equations (2)–(4) could be unconservative for certain beam geometries and RCA replacement levels. In order to have a safer prediction procedure, a further reduction factor, β_{r}, was introduced (Equation (5)), which takes different values depending on the level of RCA replacement. In Equation (5), β_{r} = 0.75 for RCA replacement levels between 50 and 100%, and β_{r} = 0.9 for RCA replacement levels less than 50%.

λ = \sqrt{\frac{2}{1 + 0.004 d}}

V_{c} = \begin{cases} 0.66 β_{r} {(ρ_{w})}^{1 / 3} \sqrt{f_{c}^{'}} bd λ, & \text{if } λ < 1 \\ 0.66 β_{r} {(ρ_{w})}^{1 / 3} \sqrt{f_{c}^{'}} bd, & \text{otherwise} \end{cases}

(5)

In addition to Equations (1)–(5), the predictions of the equations from the Eurocode EC2 (Equation (6)), Canadian Standards Association code CSA A23.3-04 (Equation (7)), and Brazilian concrete code NBR6118/2007 (Equation (8)) have also been analyzed [36,37].

Eurocode EC2:

η = 1 + \sqrt{\frac{200}{d}}

V_{c} = \begin{cases} 0.18 {(100 ρ_{w} f_{c}^{'})}^{\frac{1}{3}} bd η, & \text{if } η \leq 2 \\ 0.36 {(100 ρ_{w} f_{c}^{'})}^{\frac{1}{3}} bd, & \text{otherwise} \end{cases}

(6)

CSA A23.3-04:

V_{c} = 0.65 \frac{230}{1000 + d} \sqrt{f_{c}^{'}} bd

(7)

NBR6118/2007:

V_{c} = 0.126 {(f_{c}^{'})}^{2 / 3} bd

(8)
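Equations (1)–(8) can be collected into a few small functions for side-by-side evaluation. The following is a minimal sketch rather than the authors' implementation. It assumes that ρ_{w} is passed as a ratio (for example 0.015) and not as a percentage, and that an RCA replacement level of exactly 50% falls into the β_{r} = 0.75 band. The function names and the sample beam are hypothetical.

```python
import math

# Units follow the text: mm for lengths, MPa for stresses, N for forces.
# Assumption: rho_w is a ratio (e.g. 0.015), not a percentage.

def zsutty(fc, rho_w, b, d, a):
    """Equation (1); a is the shear span length."""
    return 2.21 * (fc * rho_w * d / a) ** (1 / 3) * b * d

def aci_318_14(fc, b, d):
    """Equation (2)."""
    return math.sqrt(fc) * b * d / 6

def size_factor(d):
    """Lambda of Equations (3) and (5); it only acts when it is below 1."""
    return min(1.0, math.sqrt(2 / (1 + 0.004 * d)))

def aci_318_19(fc, rho_w, b, d):
    """Equation (3)."""
    return 0.66 * rho_w ** (1 / 3) * math.sqrt(fc) * b * d * size_factor(d)

def rahal_alrefaei(fc, b, d, lambda_r=0.8):
    """Equation (4)."""
    return 0.17 * lambda_r * math.sqrt(fc) * b * d

def setkit(fc, rho_w, b, d, rca_percent):
    """Equation (5); 50% is assigned to the 0.75 band (an assumption)."""
    beta_r = 0.75 if rca_percent >= 50 else 0.9
    return beta_r * aci_318_19(fc, rho_w, b, d)

def ec2(fc, rho_w, b, d):
    """Equation (6); eta is capped at 2, which gives the 0.36 branch."""
    eta = min(2.0, 1 + math.sqrt(200 / d))
    return 0.18 * eta * (100 * rho_w * fc) ** (1 / 3) * b * d

def csa_a23_3_04(fc, b, d):
    """Equation (7)."""
    return 0.65 * 230 / (1000 + d) * math.sqrt(fc) * b * d

def nbr6118_2007(fc, b, d):
    """Equation (8)."""
    return 0.126 * fc ** (2 / 3) * b * d

# Hypothetical beam: fc = 30 MPa, b = 200 mm, d = 300 mm, a = 900 mm, 100% RCA.
fc, rho_w, b, d, a = 30.0, 0.015, 200.0, 300.0, 900.0
for name, value in [
    ("Zsutty", zsutty(fc, rho_w, b, d, a)),
    ("ACI 318-14", aci_318_14(fc, b, d)),
    ("ACI 318-19", aci_318_19(fc, rho_w, b, d)),
    ("Rahal and Alrefaei", rahal_alrefaei(fc, b, d)),
    ("Setkit et al.", setkit(fc, rho_w, b, d, rca_percent=100)),
    ("EC2", ec2(fc, rho_w, b, d)),
    ("CSA A23.3-04", csa_a23_3_04(fc, b, d)),
    ("NBR6118/2007", nbr6118_2007(fc, b, d)),
]:
    print(f"{name}: {value / 1000:.1f} kN")
```

Writing the two-branch definitions of Equations (3), (5), and (6) with `min` keeps each function to a single expression while remaining equivalent to the piecewise form.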

2.2. Comparison of the Equation Predictions with Experimental Data

In this section, the prediction accuracies of the equations in the literature are presented using state-of-the-art accuracy metrics such as mean absolute error, root mean square error, and the coefficient of determination. Furthermore, the accuracy of each equation has been visualized in a Taylor plot using the Pearson correlation coefficient. Figure 2 shows on the left-hand side the variation of the predicted shear strength in comparison to the actual shear strength values for each one of the Equations (1)–(8). For each equation, the percentage error distributions are shown on the right-hand side with swarm plots and violin plots, which display a smoothed kernel density estimation of the error distribution [38]. Each dot in these swarm plots corresponds to one of the data samples. The Eurocode EC2 equation shows the best performance of the eight equations, followed by ACI 318-14, ACI 318-19, and CSA A23.3-04. The least accurate models were the NBR6118/2007 code equation, the equation developed by Zsutty [34], and the equation developed by Setkit et al. [1]. The coefficient of determination, mean absolute error, and root mean square error values associated with all the equations are listed in Table 1. Figure 2 shows that all the equations except Zsutty’s tend to underestimate shear strength. The negative and positive error percentages on the right-hand side of Figure 2 correspond to overestimated and underestimated shear strength values, respectively.
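The three accuracy metrics and the signed percentage error can be written out in a few lines. This is a minimal sketch under one assumption: the percentage error is taken as (measured − predicted) / measured × 100, which matches the stated sign convention (positive for underestimation) but is not given explicitly in the text. The sample values are hypothetical.

```python
import math

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def percentage_errors(y_true, y_pred):
    """Assumed definition: positive means the prediction is too low."""
    return [100 * (t - p) / t for t, p in zip(y_true, y_pred)]

# Hypothetical measured and predicted shear strengths in kN.
measured = [100.0, 200.0, 300.0]
predicted = [90.0, 210.0, 270.0]
print(mean_absolute_error(measured, predicted))
print(root_mean_squared_error(measured, predicted))
print(r2_score(measured, predicted))
print(percentage_errors(measured, predicted))
```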

The radial axis in Figure 3 represents the Pearson correlation coefficient, which can take values between −1 and 1. Positive Pearson correlation coefficient values indicate that as one of the variables increases, the other variable increases as well. Pearson correlation values close to ±1 indicate an almost perfectly linear relationship between the two variables. The formula for calculating the Pearson correlation coefficient is given in Equation (9), where x_{i} and y_{i} are two sequences of data, n is the length of these sequences, and r_{xy} is the Pearson correlation coefficient between the two sequences [39].

r_{xy} = \frac{n \sum_{i = 1}^{n} x_{i} y_{i} - \sum_{i = 1}^{n} x_{i} \sum_{i = 1}^{n} y_{i}}{\sqrt{n \sum_{i = 1}^{n} {x_{i}}^{2} - {(\sum_{i = 1}^{n} x_{i})}^{2}} \sqrt{n \sum_{i = 1}^{n} {y_{i}}^{2} - {(\sum_{i = 1}^{n} y_{i})}^{2}}}

(9)
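Equation (9) translates term by term into code. The sketch below uses only the Python standard library; the function name and the sample sequences are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient, written exactly as in Equation (9)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    sum_yy = sum(yi * yi for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_xx - sum_x ** 2)
                   * math.sqrt(n * sum_yy - sum_y ** 2))
    return numerator / denominator

# Hypothetical sequences: a perfectly linear and a perfectly inverse relationship.
x = [1.0, 2.0, 3.0, 4.0]
print(pearson_r(x, [2.0, 4.0, 6.0, 8.0]))
print(pearson_r(x, [8.0, 6.0, 4.0, 2.0]))
```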

The Pearson correlation coefficient is the measure of a linear relationship between two sequences of data. According to Figure 3, the ACI 318-19 equation predictions have the most nearly linear relationship with the experimental results, followed by Zsutty’s equation, the NBR6118/2007 equation, and the EC2 equation. It can be observed that all the equations have Pearson correlation values greater than 0.95, which indicates highly linear relationships. The standard deviation of the experimental results is shown with a blank circle. To illustrate the relationship between the Pearson correlation coefficient and the accuracy of a model, the shear strength predictions of the equations have been multiplied by a factor of α. For the equations that underpredict shear strength, α values have been chosen greater than 1. Figure 4a shows the variation of the R^{2} score for a range of α values. In the case of EC2 and ACI 318-14, the R^{2} score drops rapidly after a small initial increase, since these equations have R^{2} scores greater than 0.9 in the beginning. For ACI 318-19, CSA A23.3-04, and the equation developed by Rahal and Alrefaei, the R^{2} score increases for α values up to 1.25, from which point the R^{2} score decreases. The equation of Setkit et al. reaches its highest R^{2} score for an α value of 1.42, and the NBR6118/2007 equation reaches its highest R^{2} score for an α value greater than 1.5. Figure 4b shows that Zsutty’s equation reaches an R^{2} score close to 1 when multiplied by a reduction factor of 0.75. It should be noted that in Figure 4, the curves that reach scores close to 1 belong to equations with a higher Pearson correlation coefficient, such as the ACI 318-19, NBR6118/2007, and Zsutty’s equations. Figure 4 shows that, although some of the equations in the literature have a low accuracy of prediction, this accuracy could be increased by multiplying the predicted values by a constant factor for the particular dataset in this study. It should be noted that the values of α shown in Figure 4 should not be generalized, since they are based on a particular dataset.
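The effect of the scaling factor α on the R^{2} score can be reproduced with a simple sweep. The sketch below is illustrative only: the measured and predicted values are hypothetical, and the closed-form least-squares factor in `best_scale` is an addition for comparison, not a step described in the study. A positive constant factor leaves the Pearson correlation coefficient unchanged, which is why only highly correlated predictions can be scaled to an R^{2} score near 1.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def sweep(y_true, y_pred, alphas):
    """R^2 score of the scaled predictions alpha * y_pred for each alpha."""
    return [(a, r2_score(y_true, [a * p for p in y_pred])) for a in alphas]

def best_scale(y_true, y_pred):
    """Least-squares factor that minimizes SS_res and so maximizes R^2."""
    return (sum(t * p for t, p in zip(y_true, y_pred))
            / sum(p * p for p in y_pred))

# Hypothetical data: an equation that underpredicts every specimen by 30%.
measured = [50.0, 80.0, 120.0, 200.0]
predicted = [0.7 * v for v in measured]

alphas = [1.0 + 0.05 * i for i in range(13)]  # 1.00, 1.05, ..., 1.60
for alpha, score in sweep(measured, predicted, alphas):
    print(f"alpha = {alpha:.2f}  R2 = {score:.3f}")
print("least-squares alpha:", best_scale(measured, predicted))
```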

2.3. Machine Learning Procedures

This section presents the statistical distribution of the dataset used in the machine learning model building process. The input and output features of the models are shown with horizontal bars in Figure 5. In Figure 5, RCA, f_{c}^{'}, b, d, a/d, ρ_{w}, and V_{test} denote the recycled coarse aggregate percentage, 28-day compressive strength of the concrete, width of the beam cross-section, effective depth of the beam cross-section, shear span to effective depth ratio, percentage of the longitudinal reinforcement, and measured shear strength of the specimen, respectively. Each one of these features has been split into four segments, and the number of test samples falling into each one of these segments has been written into the boxes corresponding to these segments. The dataset consists of a total of 128 samples. The topmost horizontal bar in Figure 5 shows that 59 of these samples, which corresponds to 46% of the entire dataset, have an RCA percentage greater than 75%. It can be observed that the RCA percentage was at least 5% in all the tested specimens. The compressive strength of concrete ranges between 20 MPa and 46.8 MPa, with the largest group of specimens (36.7% of the entire dataset) having a concrete compressive strength between 33.4 MPa and 40.1 MPa. The beam width ranges between 150 mm and 400 mm, whereas the effective depth ranges between 160 mm and 600 mm, and only 3% of the specimens have an effective depth greater than 490 mm. The shear span to effective depth ratio in the dataset ranges between 1 and 5.69, and 51.6% of the specimens have an a/d ratio between 3 and 4. The longitudinal reinforcement percentage ranges between 0.53 and 4.09%, with 80% of the dataset having a ρ_{w} value between 0.53 and 2.31%. Finally, the measured shear strength values range between 12.1 kN and 261.5 kN, with 49.2% of the dataset having a shear strength less than 70 kN. The distribution of the input and output features has been further elaborated using histograms together with kernel density estimate curves in Figure 6, where μ and σ stand for the mean value and the standard deviation.

Figure 7 shows the Pearson correlation coefficients between all the input and output features, with positive correlations between two features shown in shades of blue and negative correlations shown in shades of red. The greatest positive correlation can be observed between V_{test} and b, with a correlation coefficient of 0.84. This is followed by the correlation between V_{test} and effective depth, with a coefficient of 0.67. On the other hand, the a/d ratio and V_{test} are inversely correlated, with a coefficient of −0.43. The Pearson correlation coefficients in Figure 7 are calculated using the formula in Equation (9). A positive correlation indicates that the variables increase or decrease together, while a negative correlation coefficient indicates an opposite relationship between the variables.

Six different machine learning algorithms were tested in this study in order to achieve a more accurate prediction of the shear strength of recycled aggregate concrete beams. These algorithms are Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Random Forest, Categorical Gradient Boosting (CatBoost), Adaptive Boosting (AdaBoost), and Extra Trees Regressor. The performances of these algorithms were investigated using the 10-fold cross-validation approach. The machine learning algorithms as well as the 10-fold cross-validation were carried out using the Scikit-learn package of the Python programming language. Before starting the 10-fold cross-validation process, the dataset was split into a training set and a test set in a 70 to 30% ratio. In the 10-fold cross-validation approach, the training set is split into 10 disjoint groups, and the machine learning algorithm is trained using 9 of these 10 groups. The performance of the trained model is then measured based on its predictions for the group that was not included in the training. This process is repeated 10 times, and in each training round a different group is held out for testing the model performance. Finally, the parameters that gave the best predictions are used to measure the model performance on the test set. This procedure is schematically described in Figure 8. A flow chart of the entire analysis is displayed in Figure 9.
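The splitting logic of this procedure can be sketched without any machine learning library. The code below is a hedged illustration and not the study's Scikit-learn pipeline: the helper names are hypothetical, truncating 70% of 128 samples to 89 training samples is an assumption about how the split was rounded, and a mean-value predictor stands in for the ensemble models.

```python
import random

def train_test_split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle the sample indices and cut them into training and test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(n * train_fraction)  # 128 samples -> 89 training samples
    return indices[:n_train], indices[n_train:]

def k_fold(indices, k=10):
    """Yield (training, validation) index lists for k disjoint groups."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield training, validation

def cross_validate(fit, predict, X, y, train_idx, k=10):
    """Mean absolute error averaged over the k held-out groups."""
    scores = []
    for training, validation in k_fold(train_idx, k):
        model = fit([X[i] for i in training], [y[i] for i in training])
        errors = [abs(y[i] - predict(model, X[i])) for i in validation]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / len(scores)

# Placeholder model (an assumption): always predict the training mean.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)

def predict_mean(model, x):
    return model

# Hypothetical data with 128 samples and one feature.
rng = random.Random(1)
X = [[rng.uniform(150.0, 400.0)] for _ in range(128)]
y = [0.4 * row[0] + rng.gauss(0.0, 5.0) for row in X]

train_idx, test_idx = train_test_split_indices(len(X))
print(len(train_idx), len(test_idx))
print(cross_validate(fit_mean, predict_mean, X, y, train_idx))
```

In practice the same loop would be run once per candidate parameter setting, and only the best setting would be evaluated on the held-out test indices.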

3. Results

This section presents the output of the machine learning algorithms compared with the actual experimental measurements. In Figure 10, the predictions for the training and test set are shown in different colors. The exact prediction of a sample and the ±10% deviation from a perfect prediction are shown with a straight solid line and dotted lines, respectively. Figure 11 shows the first tree of the XGBoost model, which is the model with the best coefficient of determination in this study. The tree root splits at a beam width level of 185 mm. After the root, the tree branches into three internal nodes and five leaf nodes. The internal nodes that come after the root split at two different levels of effective depth. If b ≥ 185 mm and d < 367.5 mm, the tree output is determined by the a/d ratio. The entire XGBoost model consists of 100 trees, and the final prediction of the model is calculated by summing up all the tree outputs.
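The way a boosted ensemble turns individual tree outputs into a prediction can be sketched as follows. Only the root split at b = 185 mm, the effective depth split at 367.5 mm, and the overall shape of the first tree come from the text. The second effective depth threshold, the a/d threshold, all leaf values, the stand-in second tree, and the base score are hypothetical placeholders.

```python
def leaf(value):
    return {"leaf": value}

def node(feature, threshold, left, right):
    """Internal node: go left if x[feature] < threshold, otherwise right."""
    return {"feature": feature, "threshold": threshold, "left": left, "right": right}

def tree_output(tree, x):
    """Walk from the root to a leaf and return the leaf value."""
    while "leaf" not in tree:
        branch = "left" if x[tree["feature"]] < tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]

def ensemble_predict(trees, x, base_score=0.0):
    """Boosted prediction: a base score plus the sum of all tree outputs."""
    return base_score + sum(tree_output(t, x) for t in trees)

# Shape of the first tree as described: root on b, two effective depth
# splits below it, and an a/d split for b >= 185 mm and d < 367.5 mm.
first_tree = node(
    "b", 185.0,
    node("d", 250.0, leaf(-8.0), leaf(-3.0)),       # hypothetical threshold and leaves
    node("d", 367.5,
         node("a/d", 2.5, leaf(6.0), leaf(1.0)),    # hypothetical threshold and leaves
         leaf(12.0)),                               # hypothetical leaf
)
second_tree = leaf(2.0)  # stand-in for the remaining trees of the model

specimen = {"b": 200.0, "d": 300.0, "a/d": 3.0}
print(tree_output(first_tree, specimen))
print(ensemble_predict([first_tree, second_tree], specimen, base_score=0.5))
```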

The accuracies of the machine learning models, as well as the duration of training and testing each model, are listed in Table 2. According to Table 2, the XGBoost and Extra Trees Regressor models were able to reach R^{2} scores above 0.94 on the test set. On the training set, both of these models achieved R^{2} scores greater than 0.99. Overall, four out of the six models in this study reached R^{2} scores greater than 0.9 on the test set, and all of the models except for the LightGBM model reached an R^{2} score greater than 0.9 on the training set. It should be noted that these model performances could be significantly improved by using larger datasets for model training. Despite the relatively small number of samples used in the training of the models, the average performance of the machine learning models was better than the average performance of the predictive equations listed in the previous sections. The average R^{2} score of all eight predictive equations presented in this study was 0.74 on the entire dataset, whereas the average R^{2} score of the machine learning models was 0.89 on the test set, which is a 20% improvement.

In Figure 12, for each machine learning model the predicted and target values of the shear strength are plotted together in different colors for the training and test sets separately. It can be observed that the predicted and target values have a better overlap on the training set, which consists of the first 89 samples in Figure 12. On the left-hand side of Figure 12, the variation of the error percentages has been visualized for the training and test sets separately. Overall, smaller error percentages are observed for the training set. In particular, the error percentages of the XGBoost, Extra Trees Regressor, and CatBoost algorithms are significantly smaller on the training set, since these models have a coefficient of determination greater than 0.99 on the training set. The distributions of the error percentages are also visualized using swarm plots and violin plots for the training set (left) and test set (right). The mean value (μ), standard deviation (σ), and minimum and maximum values of the error percentages are also shown on the violin plots for each model. The negative values among these statistical quantities indicate a predicted value greater than a target value.

Interpretation of the Machine Learning Models Using SHAP Approach

The SHAP algorithm is widely used in order to explain the impact of different input features on the predictions of the machine learning models [40,41,42,43,44,45]. The SHAP methodology is based on an additive feature attribution procedure in which an explanation function g is defined as a linear combination of simplified input values x^{'} \in {\{0, 1\}}^{M}, where M is the total number of simplified input features. The simplified input values x^{'} are related to the original input values x through a mapping function h, such that x = h(x^{'}). This additive procedure is shown in Equation (10), where the simplified input features are multiplied with the Shapley values ϕ_{i}. The ϕ_{i} values are calculated as in Equation (11), where F is the set of all the input features and S is a subset of F where the feature with the index i is withheld. The function f in Equation (11) stands for the predictive model. Further details of the SHAP algorithm can be found in [46].

g (x^{'}) = ϕ_{0} + \sum_{i = 1}^{M} ϕ_{i} x_{i}^{'}

(10)

ϕ_{i} = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|! (|F| - |S| - 1)!}{|F|!} [f_{S \cup \{i\}} (x_{S \cup \{i\}}) - f_{S} (x_{S})]

(11)
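For a model with only a few features, Equation (11) can be evaluated exactly by enumerating every subset S. The sketch below does this under one simplifying assumption that is not part of the SHAP algorithm itself: a feature outside S is replaced by a fixed baseline value instead of being averaged over the data. The toy model and all names are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of model f at point x, following Equation (11)."""
    n = len(x)
    features = range(n)

    def value(subset):
        # Assumption: features outside the subset take their baseline value.
        z = [x[j] if j in subset else baseline[j] for j in features]
        return f(z)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(len(others) + 1):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

# Hypothetical toy model with one interaction term.
def toy_model(z):
    return 2 * z[0] + 3 * z[1] + z[0] * z[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
print(sum(phi), toy_model(x) - toy_model(baseline))
```

The last line illustrates the additivity in Equation (10): the Shapley values sum to the difference between the prediction at x and the baseline prediction ϕ_{0}. The enumeration grows exponentially with the number of features, so this direct form is only practical for small feature sets such as the six inputs used here.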

The SHAP summary plot in Figure 13 displays the impact of each input feature on the predicted shear strength. For each specimen from the dataset, there is a dot in Figure 13 whose horizontal distance from the zero SHAP value indicates the impact of a feature on the model prediction. The SHAP values are separately calculated for each input feature and the dots are placed at a horizontal distance from the zero point accordingly. Negative SHAP values correspond to a decreasing effect on the model prediction, whereas a positive SHAP value indicates an increasing effect of an input feature on the model output. The values of the input features determine the colors of the dots in Figure 13. Feature values close to the upper bound are shown in shades of red, while values close to the lower bound are shown in shades of blue. According to Figure 13, which was generated based on the XGBoost model, the beam cross-sectional width is the most impactful input feature, since this variable is associated with the greatest magnitudes of the SHAP value. It can be observed that in samples where the variable b is close to its upper bound, including this variable in the model predictions increases the predicted shear strength. On the other hand, in samples where b is close to its lower bound, the corresponding SHAP value is negative and the inclusion of this input feature decreases the predicted shear strength. Figure 13 shows that the second and third most impactful input features are effective depth and the shear span to effective depth ratio, whereas the RCA percentage has a relatively low impact on the model predictions. It can be seen from Figure 13 that in samples where the RCA percentage is low, adding this parameter to the models has an increasing effect on the shear strength, whereas in samples where the RCA percentage is high, this parameter has a decreasing impact on the predictions. A similar relationship between the values of an input feature and the model prediction can also be observed for the shear span to effective depth ratio. Furthermore, increasing the values of the concrete compressive strength and longitudinal reinforcement ratio also increases the shear strength, according to Figure 13.

The feature dependence plots in Figure 14 and the individual conditional expectation plots in Figure 15 give more detailed information about the impact of each input feature on the predictions of the machine learning models. For each input feature, Figure 14 shows a dependence plot where the value of a feature is plotted against its SHAP value. Each specimen from the dataset is represented by a dot in Figure 14. The colors of the dots in Figure 14 are determined by the values of the most dependent input feature. Figure 14a shows that effective depth is the most dependent input feature on the RCA percentage. In samples where the RCA percentage is high, the SHAP value of RCA tends to be negative, which indicates a decreasing effect on the model prediction. It can be observed that in all of the samples with a 100% RCA replacement ratio, with the exception of a single sample, the SHAP value of this feature is negative. Figure 14d shows that beam width is the most dependent feature on effective depth. For the same or close values of d, it can be observed that in specimens with greater values of b (colored in red), d has greater SHAP values, which indicates a greater and increasing impact on the model prediction. It can be observed that for d values less than 400 mm, effective depth tends to have a decreasing effect on the shear strength predictions. Furthermore, Figure 14c shows that for values of b less than 200 mm, beam width has negative SHAP values and a decreasing effect on the model predictions. Figure 14e shows that increasing the value of the shear span to effective depth ratio decreases the SHAP value of this parameter on a nonlinear curve, whereas the variations of the d and f_{c}^{'} SHAP values can be approximated with a linear curve.

The ICE plots in Figure 15 represent the effect of changing one input feature on the model output while keeping every other input feature constant. For each specimen in the dataset, each input feature is varied between its minimum and maximum values while the other features remain unchanged, and the model predictions are calculated. These predictions constitute the thin blue curves in Figure 15, one per specimen, and the bold curve represents the average of all curves. According to Figure 15, the most significant change in the predicted shear strength values is caused by changing the beam width, which confirms the result of the SHAP algorithm. On the other hand, the flattest curves in Figure 15 are those that show the average variation in shear strength with respect to the RCA ratio and $\rho_{w}$, which is in agreement with the SHAP analysis. Figure 15 also contains information about the deviation of the model predictions from the average: if most of the thin blue curves are clustered in a narrow band around the average curve, then this deviation is small. The curves showing the variation with respect to b have the narrowest deviation from the average.
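The ICE procedure described above can be sketched in a few lines. The snippet below is an illustration under stated assumptions, not the code used in this study: `toy_model` is a hypothetical predictor, the three-feature specimens are made up, and the grid of five points per feature is an arbitrary choice. It sweeps one feature at a time from its dataset minimum to its maximum for every specimen, averages the resulting curves, and reports how much the average prediction changes over the feature range.

```python
def toy_model(x):
    # Hypothetical stand-in for a trained model (NOT the paper's XGBoost model).
    rca, b, d = x
    return 0.1 * b + 0.05 * d + 0.0005 * b * d - 0.02 * rca


def linspace(lo, hi, n):
    # n evenly spaced values from lo to hi, both ends included.
    return [lo + (hi - lo) * k / (n - 1) for k in range(n)]


def ice_curves(model, samples, feature, n_points=5):
    """One curve per sample: sweep `feature` from its dataset minimum to its
    maximum while the sample's other features stay fixed."""
    values = [row[feature] for row in samples]
    grid = linspace(min(values), max(values), n_points)
    curves = []
    for row in samples:
        curve = []
        for g in grid:
            z = list(row)
            z[feature] = g
            curve.append(model(z))
        curves.append(curve)
    return grid, curves


def average_curve(curves):
    # Point-wise mean over all individual curves (the bold curve in an ICE plot).
    return [sum(column) / len(column) for column in zip(*curves)]


# Made-up specimens: (RCA in %, b in mm, d in mm).
samples = [(0.0, 150.0, 250.0), (50.0, 200.0, 300.0),
           (100.0, 250.0, 450.0), (100.0, 300.0, 500.0)]
names = ["RCA", "b", "d"]

swing = {}
for idx, name in enumerate(names):
    grid, curves = ice_curves(toy_model, samples, idx)
    avg = average_curve(curves)
    swing[name] = max(avg) - min(avg)
    print(f"{name}: average prediction changes by {swing[name]:.2f} over the feature range")
```

A flat average curve corresponds to a small swing, which is how the RCA ratio and the reinforcement ratio are characterized in Figure 15. In the toy model, the RCA term was deliberately given a small coefficient so that it yields the flattest curve.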

4. Discussion and Conclusions

The use of RAC can significantly benefit the sustainability of the construction industry by reducing and recycling construction and demolition waste. Since the natural coarse aggregate used in conventional concrete is a finite resource, replacing part of it with recycled coarse aggregate also helps to conserve that resource. On the other hand, compared with conventional concrete, RAC has greater variability in its quality and strength, which makes it more challenging to accurately predict the strength of this type of material. Therefore, modern statistical techniques, such as the machine learning models presented in this study, can play a significant role in the accurate assessment of the quality of RAC. The current study aims to add to the knowledge about the impact of using different levels of RCA in concrete. Eight equations from the literature that predict the shear strength of reinforced concrete beams were investigated using statistical measures of accuracy. A database consisting of the results of 128 different experiments was used in this study. The main findings of the study can be summarized as follows:

Most of the equations in the literature, except for Zsutty’s equation, were found to underpredict shear strength by a large margin, although the equation predictions were found to be linearly correlated with the experimental results, as characterized by a Pearson correlation coefficient greater than 0.95;
In terms of the coefficient of determination, the machine learning models were found to be on average 20% more accurate than the equations in the literature;
The most accurate equation was found to be the Eurocode EC2 equation, while the most accurate machine learning model was the XGBoost model;
Based on the SHAP algorithm and the individual conditional expectation (ICE) plots, changing the beam width was found to have the greatest impact on the machine learning model predictions, whereas changing the RCA percentage in the concrete was found to have the least effect on the model output;
Increasing the percentage of RCA decreases the shear strength.
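The first finding combines a high Pearson correlation with a large underprediction, and the two are not in conflict. The sketch below uses made-up numbers (the `measured` values and the 20% underprediction are assumptions chosen only for illustration) and assumes the common definition of the coefficient of determination as one minus the ratio of the residual to the total sum of squares. It shows that a predictor with a systematic bias can be almost perfectly correlated with the measurements and still have a modest coefficient of determination.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson(y, p):
    # Pearson correlation coefficient between measurements y and predictions p.
    my, mp = mean(y), mean(p)
    cov = sum((a - my) * (b - mp) for a, b in zip(y, p))
    return cov / sqrt(sum((a - my) ** 2 for a in y) * sum((b - mp) ** 2 for b in p))


def r2_score(y, p):
    # Assumed definition: 1 - SS_res / SS_tot.
    my = mean(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, p))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def mae(y, p):
    return mean([abs(a - b) for a, b in zip(y, p)])


def rmse(y, p):
    return sqrt(mean([(a - b) ** 2 for a, b in zip(y, p)]))


# Made-up measured strengths and a hypothetical equation that underpredicts by 20%.
measured = [40.0, 60.0, 80.0, 100.0, 120.0]
predicted = [0.8 * v for v in measured]

r = pearson(measured, predicted)
r2 = r2_score(measured, predicted)
print(f"Pearson r = {r:.3f}, R2 = {r2:.3f}")
print(f"MAE = {mae(measured, predicted):.2f}, RMSE = {rmse(measured, predicted):.2f}")
```

Because the Pearson coefficient is insensitive to a constant scaling of the predictions, it stays at 1 here, while the coefficient of determination drops to 0.64. This is why both measures are worth reporting side by side.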

It should be noted that the relatively small size of the dataset used in developing the predictive models is a significant limitation of the current study. The models presented in this study need to be further developed on bigger datasets, and are not recommended to be used in practice in their current form. Nevertheless, the developed models were able to deliver highly accurate predictions on the test sets. In addition, it should be considered that, in practice, the output of any predictive model should be multiplied with appropriate safety factors. Future research in this area can be carried out by using larger databases that also incorporate the results of numerical analysis. Furthermore, using optimization techniques on larger datasets, new equations considering the addition of different levels of RCA to concrete could be developed.

Author Contributions

Writing: original draft preparation, C.C.; conceptualization, C.C. and G.B.; data curation, C.C.; visualization, C.C.; methodology, G.B.; supervision, G.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data will be made available upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Setkit, M.; Leelatanon, S.; Imjai, T.; Garcia, R.; Limkatanyu, S. Prediction of Shear Strength of Reinforced Recycled Aggregate Concrete Beams without Stirrups. Buildings 2021, 11, 402.
2. Sabău, M.; Remolina Duran, J. Prediction of Compressive Strength of General-Use Concrete Mixes with Recycled Concrete Aggregate. Int. J. Pavement Res. Technol. 2022, 15, 73–85.
3. Leng, Y.; Rui, Y.; Zhonghe, S.; Dingqiang, F.; Jinnan, W.; Yonghuan, Y.; Qiqing, L.; Xiang, H. Development of an environmental Ultra-High Performance Concrete (UHPC) incorporating carbonated recycled coarse aggregate. Constr. Build. Mater. 2023, 362, 129657.
4. Silva, R.V.; De Brito, J.; Dhir, R.K. Availability and processing of recycled aggregates within the construction and demolition supply chain: A review. J. Clean. Prod. 2017, 143, 598–614.
5. Sabău, M.; Bompa, D.V.; Silva, L.F. Comparative carbon emission assessments of recycled and natural aggregate concrete: Environmental influence of cement content. Geosci. Front. 2021, 12, 101235.
6. Deshpande, N.; Londhe, S.; Kulkarni, S. Modeling compressive strength of recycled aggregate concrete by Artificial Neural Network, Model Tree and Non-linear Regression. Int. J. Sustain. Built Environ. 2014, 3, 187–198.
7. Tu, T.Y.; Chen, Y.Y.; Hwang, C.L. Properties of HPC with recycled aggregates. Cem. Concr. Res. 2006, 36, 943–950.
8. Ajdukiewicz, A.; Kliszczewicz, A. Influence of recycled aggregates on mechanical properties of HS/HPC. Cem. Concr. Compos. 2002, 24, 269–279.
9. Etxeberria, M.; Marí, A.R.; Vázquez, E. Recycled aggregate concrete as structural material. Mater. Struct. 2007, 40, 529–541.
10. Gonzalez-Fonteboa, B.; Martinez-Abella, F. Shear strength of recycled concrete beams. Constr. Build. Mater. 2007, 21, 887–893.
11. Knaack, A.M.; Kurama, Y.C. Behavior of reinforced concrete beams with recycled concrete coarse aggregates. J. Struct. Eng. 2014, 141, B4014009.
12. Ignjatović, I.S.; Marinković, S.B.; Tošić, N. Shear behaviour of recycled aggregate concrete beams with and without shear reinforcement. Eng. Struct. 2017, 141, 386–401.
13. Arezoumandi, M.; Smith, A.; Volz, J.S.; Khayat, K.H. An experimental study on shear strength of reinforced concrete beams with 100% recycled concrete aggregate. Constr. Build. Mater. 2014, 53, 612–620.
14. Rahal, K.N.; Alrefaei, Y.T. Shear strength of longitudinally reinforced recycled aggregate concrete beams. Eng. Struct. 2017, 145, 273–282.
15. Etman, E.E.; Afefy, H.M.; Baraghith, A.T.; Khedr, S.A. Improving the shear performance of reinforced concrete beams made of recycled coarse aggregate. Constr. Build. Mater. 2018, 185, 310–324.
16. Pradhan, S.; Kumar, S.; Barai, S.V. Shear performance of recycled aggregate concrete beams: An insight for design aspects. Constr. Build. Mater. 2018, 178, 593–611.
17. Anike, E.E.; Saidani, M.; Olubanwo, A.O.; Anya, U.C. Flexural performance of reinforced concrete beams with recycled aggregates and steel fibres. Structures 2022, 39, 1264–1278.
18. Momeni, E.; Omidinasab, F.; Dalvand, A.; Goodarzimehr, V.; Eskandari, A. Flexural Strength of Concrete Beams Made of Recycled Aggregates: An Experimental and Soft Computing-Based Study. Sustainability 2022, 14, 11769.
19. Dantas, A.T.A.; Leite, M.B.; de Jesus Nagahama, K. Prediction of compressive strength of concrete containing construction and demolition waste using artificial neural networks. Constr. Build. Mater. 2013, 38, 717–722.
20. Felix, E.F.; Possan, E.; Carrazedo, R. A New Formulation to Estimate the Elastic Modulus of Recycled Concrete Based on Regression and ANN. Sustainability 2021, 13, 8561.
21. Gholampour, A.; Gandomi, A.H.; Ozbakkaloglu, T. New formulations for mechanical properties of recycled aggregate concrete using gene expression programming. Constr. Build. Mater. 2017, 130, 122–145.
22. Hammoudi, A.; Moussaceb, K.; Belebchouche, C.; Dahmoune, F. Comparison of artificial neural network (ANN) and response surface methodology (RSM) prediction in compressive strength of recycled concrete aggregates. Constr. Build. Mater. 2019, 209, 425–436.
23. Moein, M.M.; Saradar, A.; Rahmati, K.; Mousavinejad, S.H.G.; Bristow, J.; Aramali, V.; Karakouzian, M. Predictive models for concrete properties using machine learning and deep learning approaches: A review. J. Build. Eng. 2022, 63, 105444.
24. Mukhtar, F.; Deifalla, A. Shear strength of FRP reinforced deep concrete beams without stirrups: Test database and a critical shear crack-based model. Compos. Struct. 2022, 307, 116636.
25. Wardeh, G.; Ghorbel, E. Shear Strength of Reinforced Concrete Beams with Recycled Aggregates. Adv. Struct. Eng. 2019, 22, 1938–1951.
26. Katkhuda, H.; Shatarat, N. Shear behavior of reinforced concrete beams using treated recycled concrete aggregate. Constr. Build. Mater. 2016, 125, 63–71.
27. Sadati, S.; Arezoumandi, M.; Khayat, K.H.; Volz, J.S. Shear performance of reinforced concrete beams incorporating recycled concrete aggregate and high-volume fly ash. J. Clean. Prod. 2016, 115, 284–293.
28. Kim, S.W.; Jeong, C.Y.; Lee, J.S.; Kim, K.H. Size effect in shear failure of reinforced concrete beams with recycled aggregate. J. Asian Archit. Build. Eng. 2013, 12, 323–330.
29. Fathifazl, G.; Razaqpur, A.G.; Isgor, O.B.; Abbas, A.; Fournier, B.; Foo, S. Shear capacity evaluation of steel reinforced recycled concrete (RRC) beams. Eng. Struct. 2011, 33, 1025–1033.
30. Choi, H.B.; Yi, C.K.; Cho, H.H.; Kang, K.I. Experimental study on the shear strength of recycled aggregate concrete beams. Mag. Concr. Res. 2010, 62, 103–114.
31. Sato, R.; Maruyama, I.; Sogabe, T.; Sogo, M. Flexural behavior of reinforced recycled concrete beams. J. Adv. Concr. Technol. 2007, 5, 43–61.
32. Yu, Y.; Zhao, X.; Xu, J.; Chen, C.; Deresa, S.T.; Zhang, J. Machine Learning-Based Evaluation of Shear Capacity of Recycled Aggregate Concrete Beams. Materials 2020, 13, 4552.
33. Ababneh, A.; Alhassan, M.; Abu-Haifa, M. Predicting the contribution of recycled aggregate concrete to the shear capacity of beams without transverse reinforcement using artificial neural networks. Case Stud. Constr. Mater. 2020, 13, e00414.
34. Zsutty, T. Beam shear strength prediction by analysis of existing data. ACI Struct. J. 1968, 65, 943–951.
35. Olalusi, O.B.; Kolawole, J.T. Shear strength assessment of reinforced recycled aggregate concrete member. In The Structural Integrity of Recycled Aggregate Concrete Produced with Fillers and Pozzolans 2022; Woodhead Publishing: Sawston, UK, 2022; pp. 323–347.
36. Younis, A.; El-Sherif, H.; Ebead, U. Shear strength of recycled-aggregate concrete beams with glass-FRP stirrups. Compos. Part C Open Access 2022, 8, 100257.
37. Trautwein, L.M.; de Almeida, L.C.; Gaspar, R. A Comparative Study of the Shear Strength Prediction for Reinforced Concrete Beams without Shear Reinforcement; Applied Mechanics and Materials 2014; Trans Tech Publications, Ltd.: Zurich, Switzerland, 2014; Volume 584, pp. 1135–1140.
38. Seaborn Documentation. Available online: https://seaborn.pydata.org/tutorial/distributions.html#distribution-tutorial (accessed on 25 December 2022).
39. Howell, D.C. Statistical Methods for Psychology, 7th ed.; Cengage Learning: Boston, MA, USA, 2010.
40. Cakiroglu, C.; Islam, K.; Bekdaş, G.; Isikdag, U.; Mangalathu, S. Explainable machine learning models for predicting the axial compression capacity of concrete filled steel tubular columns. Constr. Build. Mater. 2022, 356, 129227.
41. Cakiroglu, C.; Bekdaş, G.; Kim, S.; Geem, Z.W. Explainable Ensemble Learning Models for the Rheological Properties of Self-Compacting Concrete. Sustainability 2022, 14, 14640.
42. Mangalathu, S.; Karthikeyan, K.; Feng, D.C.; Jeon, J.S. Machine-learning interpretability techniques for seismic performance assessment of infrastructure systems. Eng. Struct. 2022, 250, 112883.
43. Somala, S.N.; Chanda, S.; Karthikeyan, K.; Mangalathu, S. Explainable Machine learning on New Zealand strong motion for PGV and PGA. Structures 2021, 34, 4977–4985.
44. Mangalathu, S.; Hwang, S.H.; Jeon, J.S. Failure mode and effects analysis of RC members based on machine-learning-based SHapley Additive exPlanations (SHAP) approach. Eng. Struct. 2020, 219, 110927.
45. Feng, D.C.; Wang, W.J.; Mangalathu, S.; Taciroglu, E. Interpretable XGBoost-SHAP machine-learning model for shear strength prediction of squat RC walls. J. Struct. Eng. 2021, 147, 04021173.
46. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017.

Figure 1. The schematic of the experimental test setup.

Figure 2. Comparison of equation predictions and experimental measurements.

Figure 3. Taylor diagram of the equations.

Figure 4. Variation of the $R^{2}$ score for (a) equations that underpredict V; (b) equations that overpredict V.

Figure 5. Feature ranges.

Figure 6. Distribution of the input features.

Figure 7. Correlation of the input and output variables.

Figure 8. 10-fold cross-validation and predictive model performance evaluation.

Figure 9. Flow chart of the analysis.

Figure 10. Comparison of experimental and predicted shear strength values.

Figure 11. First tree of the XGBoost model.

Figure 12. Comparison of the predicted and target values and the error percentages for (a) XGBoost; (b) LightGBM; (c) Random Forest; (d) Extra Trees; (e) AdaBoost; and (f) CatBoost.

Figure 13. SHAP summary plot for the XGBoost model.

Figure 14. Feature dependence plots (XGBoost) for (a) RCA; (b) $f_{c}^{'}$; (c) b; (d) d; (e) a/d; and (f) $\rho_{w}$.

Figure 15. Individual conditional expectation (ICE) plots (XGBoost) for (a) RCA; (b) $f_{c}^{'}$; (c) b; (d) d; (e) a/d; and (f) $\rho_{w}$.

Table 1. Accuracy of the equations.

Equation	R²	MAE	RMSE
ACI 318-14 (2)	0.9153	8.309	11.05
ACI 318-19 (3)	0.8416	13.06	15.11
Rahal and Alrefaei, 2017 [14] (4)	0.7293	16.06	19.75
Setkit et al., 2021 [1] (5)	0.6034	19.63	23.91
Zsutty, 1968 [34] (1)	0.5764	21.10	24.71
NBR6118/2007 (8)	0.4431	22.82	28.33
CSA A23.3-04 (7)	0.7842	15.24	17.64
EC2 (6)	0.9842	4.028	4.766

Table 2. Accuracy of the machine learning models.

Algorithm	R² (Train)	R² (Test)	MAE (Train)	MAE (Test)	RMSE (Train)	RMSE (Test)	Duration [s]
XGBoost	0.9988	0.9434	0.335	9.451	1.573	13.38	5.26
Random Forest	0.9748	0.9197	4.728	11.51	7.352	15.94	3.06
LightGBM	0.8184	0.7496	12.17	17.54	19.74	28.14	3.83
CatBoost	0.9973	0.9204	1.649	11.77	2.411	15.87	18.54
Extra Trees	0.9988	0.9413	0.317	9.038	1.573	13.63	2.99
AdaBoost	0.9348	0.8904	9.855	15.26	11.83	18.61	3.95
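The conclusion that the machine learning models are on average about 20% more accurate than the equations in terms of the coefficient of determination can be checked against Tables 1 and 2. The short sketch below is one plausible reading of that statement, not a reconstruction of the authors' calculation: it assumes a simple arithmetic mean of the R² values in Table 1 and of the test-set R² values in Table 2, and compares the two means.

```python
# R2 values copied from Table 1 (equations).
equation_r2 = {
    "ACI 318-14": 0.9153,
    "ACI 318-19": 0.8416,
    "Rahal and Alrefaei, 2017": 0.7293,
    "Setkit et al., 2021": 0.6034,
    "Zsutty, 1968": 0.5764,
    "NBR6118/2007": 0.4431,
    "CSA A23.3-04": 0.7842,
    "EC2": 0.9842,
}

# Test-set R2 values copied from Table 2 (machine learning models).
ml_test_r2 = {
    "XGBoost": 0.9434,
    "Random Forest": 0.9197,
    "LightGBM": 0.7496,
    "CatBoost": 0.9204,
    "Extra Trees": 0.9413,
    "AdaBoost": 0.8904,
}

mean_equations = sum(equation_r2.values()) / len(equation_r2)
mean_ml = sum(ml_test_r2.values()) / len(ml_test_r2)

# Assumed comparison: relative difference between the two arithmetic means.
relative_gain = mean_ml / mean_equations - 1.0

print(f"mean R2, equations: {mean_equations:.4f}")
print(f"mean R2, ML models (test): {mean_ml:.4f}")
print(f"relative gain: {relative_gain:.1%}")
```

Under this assumption, the relative gain comes out slightly above 20%, which is consistent with the figure quoted in the conclusions. A different averaging choice, such as using the training scores, would give a different number.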

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
