Data-Driven Predictive Modeling of Steel Slag Concrete Strength for Sustainable Construction

Abstract: Conventional concrete causes significant environmental problems, including resource depletion, high CO₂ emissions, and high energy consumption. Steel slag aggregate (SSA), a by-product of the steelmaking industry, offers a sustainable alternative due to its environmental benefits and improved mechanical properties. This study examined the predictive power of four modeling techniques, namely Gene Expression Programming (GEP), an Artificial Neural Network (ANN), Random Forest Regression (RFR), and Gradient Boosting (GB), for the compressive strength (CS) of SSA concrete. Using 367 datasets from the literature, six input variables (cement, water, granulated furnace slag, superplasticizer, coarse aggregate, fine aggregate, and age) were utilized to predict compressive strength. The models' performance was evaluated using statistical measures such as the mean absolute error (MAE), root mean squared error (RMSE), mean values, and coefficient of determination (R²). Results indicated that the GB model consistently outperformed RFR, GEP, and the ANN, achieving the highest R² values of 0.99 and 0.96 for the training and testing datasets, respectively, followed by RFR with R² values of 0.97 (training) and 0.93 (testing), GEP with R² values of 0.85 (training) and 0.87 (testing), and the ANN with R² values of 0.61 (training) and 0.82 (testing). Additionally, the GB model had the lowest MAE values of 0.79 MPa (training) and 2.61 MPa (testing) and RMSE values of 1.90 MPa (training) and 3.95 MPa (testing). This research aims to advance predictive modeling in sustainable construction through thorough analysis and well-defined conclusions.


Introduction
The construction industry is one of the largest consumers of natural resources [1]. Concrete, one of the most common materials used in the construction industry, comprises cement, water, and fine and coarse aggregate [2]. According to Mehta [3], aggregate is a crucial concrete component since it accounts for 80% of its weight. Aggregate is an excellent structural material owing to its high compressive strength, long lifespan, and easy workability. Unfortunately, the massive scale of aggregate mining, processing, and transportation has depleted resources, increased noise pollution, destroyed habitats, and raised CO₂ emissions [4]. The aggregate industry was Europe's most significant non-energy mining sector in 2018, with a production of 3 billion tons spread over 39 nations [4]. Additionally, roughly 60% of the raw materials used in building and construction projects worldwide come from the lithosphere, and the sector accounts for 32% of all resources consumed, including up to 40% of all energy used and 12% of all water [1,5,6]. Hence, the construction industry needs to reduce its carbon footprint by looking for more environmentally friendly solutions.
... connecting sustainable materials and advanced computational methods. Additionally, by integrating SSA, the study aims to reduce environmental impact and tackle resource scarcity, particularly in Qatar. The research aligns with Qatar's national sustainability goals outlined in the QNV 2030 [8] and with Qatar's national development strategy; both emphasize the significance of recycling and minimizing construction waste.

Literature Review
Steel slag, a by-product of steel manufacturing, is a significant industrial waste that includes dust and large stones [16]. As global crude steel production continues to rise, approximately 150 kg of steel slag is generated per ton of steel, often ending up in open areas and posing environmental hazards [17]. Despite these challenges, steel slag has gained attention for its potential in concrete applications due to its unique properties [18].
Several studies have investigated the mechanical qualities of concrete containing steel slag aggregates (SSA) compared to natural aggregates. For instance, Qasrawi [16] reported that steel slag with a high Fe₂O₃ content enhances concrete's compressive and structural strength, surpassing conventional concrete's strength development over time. This finding aligns with Alizadeh et al. [19], who evaluated hardened concrete with SSA and demonstrated a higher modulus of elasticity, flexural strength, and compressive strength compared to natural aggregate concrete.
In a comprehensive study, Maslehuddin et al. [20] concluded that steel slag concrete exhibits marginally greater compressive strength than limestone concrete when used as an aggregate. Similarly, Wang and Zhao [21] explored the use of blast furnace steel slag as coarse aggregate, finding enhanced properties such as higher compressive, flexural, and bond strengths compared to conventional concrete mixes. Subathra Devi and Gnanavel [22] examined the impact of partially substituting fine and coarse aggregates in M20 grade concrete, recommending 40% and 30% steel slag replacement, respectively. They noted reduced workability with increased replacement percentages.
Regarding durability assessments, Awwad et al. [23] investigated the substitution of SSA for sand in concrete mixes with target strengths of 25 MPa. Their results showed improved concrete strength without compromising workability, particularly at a 30% replacement ratio. Borole et al. [24] used M30 grade concrete to evaluate the effects of partially substituting steel slag for natural aggregate, finding that a 25% replacement rate optimally enhances compressive, flexural, and tensile strengths without detrimental effects. Sinha [25] also studied the effects of replacing fine and coarse aggregates in conventional concrete mixes with steel slag, observing increased compressive strength at 28 days along with improved flexural and tensile strength.
Further enhancing concrete properties, Pushpakumara and Silva [26] evaluated the effectiveness of steel slag in replacing fine and coarse aggregates, determining that concrete containing 75% steel slag exhibits increased unit weight, splitting tensile strength, compressive strength, and corrosion resistance. Kumar et al. [27] researched the potential of replacing coarse aggregate with steel slag, showing significant improvements in compressive strength, stability, and overall concrete density. Additionally, Miah et al. [28] investigated the replacement of first-class burnt clay brick aggregate with steel slag, finding that SSA improves compressive strength and reduces porosity.
Tarawneh et al. [29] addressed environmental considerations and compared SSA's physical and mechanical characteristics with conventional crushed limestone aggregate concrete, noting higher abrasion resistance and accelerated early strength development with steel slag. Nguyen et al. [30] focused on the compressive properties of steel slag concrete in which steel slag replaced the coarse aggregate, observing rapid strength increases within the first 7 days. Aparicio et al. [31] studied the effects of environmental conditions on concrete containing recycled aggregate or SSA, confirming superior compressive strength for SSA concrete at 28 days.
With rapid urbanization, population growth, and stricter laws governing the use of natural resources, civil engineering faces numerous challenges that call for innovative solutions. One solution is to use machine learning (ML) and soft computing techniques (SCT). SCT are a collection of computational methods that can tolerate partial truth, uncertainty, and approximation to help solve complex problems, unlike hard computing techniques, which face difficulties when dealing with such issues [32]. The main concept behind ML and SCT is to mimic human brain functions such as intuition, reasoning, and consciousness.
In recent years, civil engineering has encountered problems requiring intuition and learning from past experiences. SCT combine statistical, probabilistic, and optimization tools to learn from past experiences and use these findings to produce new data, identify patterns, or predict novel trends [14,33–37]. Various machine learning and soft computing techniques, such as artificial neural networks, fuzzy logic, and genetic algorithms, can solve these problems. Several studies have used ML and SCT to predict the structural properties of concrete containing SSA.
Awoyera et al. [38] used gene expression programming (GEP) to predict SSA concrete's compressive and splitting tensile strength. Their empirical studies indicated that steel slag could replace conventional aggregate while yielding similar outcomes. Piro et al. [39] used various modeling techniques, including Artificial Neural Networks (ANNs) and Adaptive Neuro-Fuzzy Inference Systems (ANFIS), to predict the compressive strength of mixtures that included steel slag. Their research showed that the ANFIS model outperformed other models in prediction accuracy.
Penido et al. [40] used multiple machine learning models, such as Support Vector Regression (SVR), ANNs, Extreme Gradient Boosting (XGBoost), and Gaussian Process Regression (GPR), to predict the compressive strength of steel slag concrete. Their results highlighted the ANN model's effectiveness despite mixed experimental validation results. Kioumarsi et al. [41] studied several machine learning models to predict the compressive strength of concrete containing ground granulated blast furnace slag (GGBFS), developing a simplified equation for practical application.
Mohana et al. [42] explored the application of machine learning models in predicting the compressive strength of GGBFS concrete, using the Random Forest (RF) model with the Support Vector Machine (SVM) model for verification. Their research demonstrated the RF model's excellent predictive abilities. Mai et al. [43] used the RF model to address challenges related to the complexity of mix design composition, achieving high correlation coefficients and low error rates.
Recent studies have further emphasized the role of machine learning and soft computing in predicting concrete properties. For example, Kumar et al. [44] proposed ELM, MARS, and DNN-based prediction models for fly ash concrete, demonstrating their effectiveness in predicting compressive strength. Similarly, Kumar et al. [45] used ANNs to predict the compressive strength and permeability of pervious concrete with GGBS. Paudel et al. [46] investigated various ML algorithms to estimate the compressive strength of concrete containing fly ash, confirming the robustness of these approaches. Additionally, Albostami et al. [33] highlighted the effectiveness of MOGA-EPR and GEP techniques in predicting self-compacting concrete properties.
Overall, these studies demonstrated the effectiveness of various machine learning and soft computing techniques in predicting the compressive strength of concrete. These models and techniques could potentially reduce the number of experiments needed to determine the structural factors of concrete, thus reducing expenses. The following section outlines the general methodology used in this research and the principles of the different models.

Methodology Overview
This research used ML and SCT models to predict the CS of concrete containing steel slag aggregate (SSA). The first phase involved compiling data from the literature on steel slag concrete's composition and compressive strength. The literature datasets were then pre-processed to standardize the parameters taken into consideration. Pre-processing was applied before the analysis started: a total of 367 datasets were compiled from various literature sources, comprising values for six input variables, namely cement content (C), steel slag aggregate (SSA), water (W), coarse aggregate (CA), fine aggregate (FA), age (A), and superplasticizer (SP), and for the output, the measured compressive strength (CS). The datasets were then examined for inconsistencies and missing values, and any records with missing or incomplete data were either corrected by cross-referencing the original sources or removed if correction was not possible.
To ensure that the input variables were comparable and to improve the predictive models' performance, the data were scaled using Min-Max normalization, which maps each input variable to the range [0, 1]. The cleaned and pre-processed data were then split into training and testing sets using an 80-20 split, with 80% of the data used for training the models and 20% reserved for testing and validation.
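The scaling and splitting steps described above can be sketched as follows. This is a minimal illustration on synthetic data rather than the study's actual pipeline: the array contents, the random seed, and the use of scikit-learn's `MinMaxScaler` and `train_test_split` are assumptions. The scaler is fitted on the training portion only, which is one common convention; the text does not state at which stage the scaling statistics were computed.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the mix-design table: 334 rows (the size of the
# refined dataset reported in this paper) and 7 input columns
# (C, SSA, W, CA, FA, A, SP). Values are random placeholders, not real data.
rng = np.random.default_rng(0)
X = rng.uniform(low=0.0, high=500.0, size=(334, 7))
y = rng.uniform(low=10.0, high=80.0, size=334)  # placeholder CS in MPa

# 80-20 split; the seed is an arbitrary assumption for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Min-Max normalization: x' = (x - min) / (max - min), applied per column.
scaler = MinMaxScaler(feature_range=(0.0, 1.0))
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)  # reuses the training min and max

print(X_train_scaled.shape, X_test_scaled.shape)
```

With 334 rows, this split yields 267 training rows and 67 testing rows, the sizes reported in the Data Grouping section. Because the test rows are scaled with the training minimum and maximum, a few test values may fall slightly outside [0, 1].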
Subsequently, four distinct ML and SCT models, namely an Artificial Neural Network (ANN), Gene Expression Programming (GEP), Random Forest Regression (RFR), and Gradient Boosting (GB), were used to develop predictive models, with the dataset providing the inputs and the output. A validation process was then carried out using three statistical metrics to assess the models' prediction accuracy: the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE). Finally, a sensitivity analysis was carried out to determine the impact of various factors on the structural properties of steel slag concrete. Figure 1 shows the general methodology followed by any ML or SCT model in the form of a flowchart.
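The three validation metrics named above can be computed directly from measured and predicted strengths. The sketch below uses their standard definitions; the function name `regression_metrics` and the three sample values are hypothetical and serve only to make the arithmetic visible.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mae = np.mean(np.abs(residuals))                 # mean absolute error
    rmse = np.sqrt(np.mean(residuals ** 2))          # root mean squared error
    ss_res = np.sum(residuals ** 2)                  # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                       # coefficient of determination
    return r2, rmse, mae

# Hypothetical measured vs. predicted strengths in MPa.
r2, rmse, mae = regression_metrics([30.0, 50.0, 70.0], [20.0, 50.0, 90.0])
print(f"R2={r2:.3f}, RMSE={rmse:.2f} MPa, MAE={mae:.2f} MPa")
# prints: R2=0.375, RMSE=12.91 MPa, MAE=10.00 MPa
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two are usually reported together.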
This paper's novelty lies in preprocessing experimental data sourced from the literature. Initially, the dataset comprised 1031 entries. Through a careful preprocessing phase, it was refined to 334 entries by excluding mixes containing fly ash and those with 0% SSA replacement. This refined dataset allows for a more focused and accurate analysis, enhancing the validity and relevance of the findings presented in this study.

Data Collection and Statistical Analysis
The first step was to collect data from the literature on steel slag concrete and its strengths. Searches were conducted in scientific databases such as Google Scholar and Mendeley Data. Table 1 shows the parameters used in this study and the statistical measures of the experimental data collected from [47], with 1031 datasets in total. After pre-processing, the number of datasets was reduced to 334 by removing mixes with fly ash content and those with 0% SSA replacement. The table contains the minimum, maximum, average, and standard deviation of the SSA mix amounts, aggregate replacement percentage, and compressive strength values. The input dataset shown in Table 1 comprises C, SSA, W, CA, FA, A, and SP, together with the output, the measured CS.
The dataset, comprising 367 samples collected from various literature sources, was designed to ensure a comprehensive and balanced representation of different experimental conditions and concrete mix compositions. The distribution of SSA content within the dataset ranges from 0% to 100% replacement of natural aggregates, capturing both partial and full replacement scenarios. A roughly equal representation of low (0-30%), medium (30-70%), and high (70-100%) SSA-content samples was included, which ensures diverse coverage and enhances the robustness of the predictive models.
The dataset spans curing periods from 1 day to 365 days, reflecting the early, mid, and late-age strength development of concrete. Representative samples for commonly studied curing periods, such as 7 days, 28 days, and 90 days, were included to support the effective learning of strength gain patterns over various curing durations.
Additionally, the dataset incorporates a variety of values for other input variables, including cement content, water-to-cement ratio, granulated furnace slag, superplasticizer, coarse aggregate, and fine aggregate. This diversity is crucial for capturing the complex interactions between these variables and their combined effect on compressive strength.

Data Grouping
This study evaluated the effectiveness of four different ML and SCT methods: Gene Expression Programming (GEP), an Artificial Neural Network (ANN), Random Forest Regression (RFR), and Gradient Boosting (GB). The acquired data were divided into two sets, with 80% (267 observations) used for model training and the remaining 20% (67 observations) for testing. Tables 2 and 3 present the statistical metrics of the training and testing datasets for the four models, namely the minimum, maximum, average, and standard deviation of the inputs and the output (CS). Moreover, Figure 2 shows the data frequency, while Figure 3 shows the data distribution, where the x-axis represents the variables and the y-axis represents the compressive strength.
Each input variable's feature importance and contribution were assessed using established methods. First, the permutation importance method, illustrated in Figures 3 and 4, was applied. Second, feature importance scores were derived using heat maps generated by the ANN and by tree-based models such as RFR and GB; the tree-based models inherently provide scores based on the variables' effectiveness in splitting data at decision nodes. This approach quantified each input variable's contribution to the model's predictions. Additionally, correlation analysis explored relationships between the input variables and the output variable, with correlation coefficients indicating the strength and direction of these relationships. A coefficient with a larger absolute value (closer to −1 or +1) signifies a stronger relationship, while coefficients close to 0 denote weak or negligible relationships. These techniques facilitated a thorough assessment of each input variable's feature importance and contribution, ensuring a robust and transparent analysis of their impacts on the predictive models.
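The permutation importance method mentioned above can be sketched with scikit-learn's `permutation_importance`. The data below are synthetic and the column labels are borrowed from Table 1 purely as names; the target is constructed so that the first column dominates, so the resulting ranking says nothing about the real dataset. The model choice, seeds, and number of repeats are likewise assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
feature_names = ["C", "SSA", "W", "CA", "FA", "A", "SP"]
X = rng.uniform(size=(300, len(feature_names)))
# Synthetic target: driven mostly by the first column, weakly by the second.
y = 40.0 * X[:, 0] + 5.0 * X[:, 1] + rng.normal(scale=0.5, size=300)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Shuffle one column at a time and record the drop in the model's score (R2).
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
ranking = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A large drop in score after shuffling a column indicates that the model relies on that variable; columns the model ignores score close to zero.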

Developing Models
In this study, four different machine learning and soft computing techniques were used to predict the compressive strength of steel slag concrete. The development of the different models is explained below.
The selection of Gene Expression Programming (GEP), Artificial Neural Networks (ANN), Random Forest Regression (RFR), and Gradient Boosting (GB) is based on their proven effectiveness and complementary strengths in predictive modeling.
Gene Expression Programming (GEP) is chosen for its ability to generate explicit mathematical models, capturing complex relationships in the data through evolutionary algorithms. It is particularly useful for understanding underlying patterns and interactions in concrete properties [48,49].
Artificial Neural Networks (ANNs) are selected due to their robustness in handling non-linear relationships and high adaptability to various datasets, making them suitable for predicting complex behaviors in materials science [50].
Random Forest Regression (RFR) is included because of its ensemble learning technique, which improves prediction accuracy and reduces overfitting by averaging the results of multiple decision trees. This method is also known for its ability to handle large datasets with higher dimensionality [48,49].
Gradient Boosting (GB) is chosen for its efficiency in building predictive models by sequentially correcting the errors of a series of weak models, thus producing a strong predictive performance. It is particularly effective for regression tasks, providing high accuracy and reducing bias [50].
Together, these methods offer a comprehensive approach to predictive modeling, leveraging their unique strengths to enhance the reliability and accuracy of predictions in the study of concrete properties.
This paper's hyperparameter selection and optimization process began with choosing each model's hyperparameter values based on recommendations from [51,52]. These initial values provided a baseline and served as the starting point for subsequent optimization steps.
For Gene Expression Programming (GEP), the parameters were set as follows: number of chromosomes: 30; head size: 8; number of genes: 3; function set: +, −, ×, /, and square root; mutation rate: 0.00138; inversion rate: 0.00346; gene transposition rate: 0.00277; random chromosomes: 0.0026; and gene recombination rate: 0.00277. For the Artificial Neural Network (ANN), parameters such as the learning rate, number of hidden layers, number of neurons per layer, activation functions, and batch size were optimized using grid and random search techniques. For Random Forest Regression (RFR), key parameters such as the number of trees, maximum depth, minimum samples per split, and minimum samples per leaf were tuned through grid search. Similarly, for Gradient Boosting (GB), the learning rate, number of estimators, maximum depth, and minimum samples per split were optimized using grid search and random search.
By employing these hyperparameter tuning techniques, we aimed to ensure the reproducibility and robustness of our results. This systematic approach lays a foundation for the models' predictive performance and generalizability, helping to guard against overfitting to the training data so that the models perform well on unseen data.
The performance of each hyperparameter combination was evaluated using metrics such as the mean absolute error (MAE), root mean squared error (RMSE), and the coefficient of determination (R²). These metrics provided a comprehensive assessment of each model's accuracy and generalizability. The final hyperparameters for each model were selected based on the combination that yielded the best performance metrics during cross-validation. These values were then used to train the final models reported in the study.
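A grid search with cross-validation over the four GB hyperparameters named above might look like the sketch below. The candidate values, the five-fold setting, the RMSE scoring choice, and the synthetic data are all assumptions made for illustration; the paper does not report its search grid or fold count.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic placeholder data with seven input columns.
rng = np.random.default_rng(2)
X = rng.uniform(size=(200, 7))
y = 30.0 * X[:, 0] - 10.0 * X[:, 2] + rng.normal(scale=1.0, size=200)

# Hypothetical grid over the four GB hyperparameters named in the text.
param_grid = {
    "learning_rate": [0.05, 0.1],
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "min_samples_split": [2, 5],
}

search = GridSearchCV(
    estimator=GradientBoostingRegressor(random_state=0),
    param_grid=param_grid,
    scoring="neg_root_mean_squared_error",  # higher (less negative) is better
    cv=5,
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("cross-validated RMSE:", -search.best_score_)
```

Every combination in the grid is scored on held-out folds, and the combination with the lowest average RMSE is retained. A random search would sample from the same ranges instead of enumerating them.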

Artificial Neural Network (ANN)
ANNs are designed to emulate the function and learning ability of the biological nervous system in the human brain, particularly in information processing. ANNs mimic the brain's functionality in two primary ways: acquiring knowledge through a learning process and storing or memorizing information via the strengths of interconnected neurons, known as synaptic weights [32]. The structure of an ANN is characterized by a parallel configuration of neurons, which are highly interconnected and capable of complex training. Data are processed through a series of interconnected layers, divided into three sections: input, hidden, and output layers, each comprising several nodes (neurons). The input layer receives and processes data before passing it to the next nodes. The hidden layers perform complex mathematical operations to extract useful features, while the output layer produces the final output or prediction. An ANN's ability to adapt to changing input and output data, perform non-linear function mapping, and capture unknown relationships makes it a versatile model for addressing real-world problems [14]. The general structure of the ANN model is illustrated in Figure 5.
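The flow of data through the input, hidden, and output layers can be illustrated with a single forward pass. The sketch below uses one hidden layer with a tanh activation and randomly initialized, untrained weights; the layer sizes, the activation, and the function name `forward` are illustrative assumptions, not the architecture used in this study.

```python
import numpy as np

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer with tanh activation and a linear output neuron."""
    hidden = np.tanh(x @ w_hidden + b_hidden)   # hidden-layer activations
    return hidden @ w_out + b_out               # predicted output

rng = np.random.default_rng(3)
n_inputs, n_hidden = 7, 8                       # sizes are illustrative assumptions
w_hidden = rng.normal(size=(n_inputs, n_hidden))  # input-to-hidden weights
b_hidden = np.zeros(n_hidden)
w_out = rng.normal(size=(n_hidden, 1))            # hidden-to-output weights
b_out = np.zeros(1)

x = rng.uniform(size=(4, n_inputs))             # four normalized mixes (placeholders)
prediction = forward(x, w_hidden, b_hidden, w_out, b_out)
print(prediction.shape)  # prints: (4, 1)
```

Training consists of adjusting the weight matrices so that the predictions match measured strengths; the weights here are random, so the outputs carry no physical meaning.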

During the testing phase, specific improvements were made to enhance the performance and accuracy of the ANN model. These improvements focused on optimizing the model architecture, adjusting hyperparameters, and implementing regularization techniques to prevent overfitting. Initially, the model architecture was refined by experimenting with different numbers of hidden layers and neurons per layer. An optimal configuration was identified through iterative testing and evaluation, balancing complexity and performance. It was found that increasing the number of hidden layers and neurons enhanced the model's ability to capture complex patterns in the data. However, care was taken to avoid an overly complex model, which could lead to overfitting.
Hyperparameters were systematically tuned using grid and random search techniques to optimize the model's performance. A range of values for key hyperparameters, including the learning rate, batch size, and the number of epochs, was explored. The model's performance on a validation set was evaluated to identify the optimal combination of hyperparameters, resulting in the lowest validation error and highest predictive accuracy [32].
Regularization techniques were implemented to enhance the model's generalization capabilities and prevent overfitting. These included using dropout layers, which randomly deactivate a proportion of neurons during training, and L2 regularization, which adds a penalty to the loss function based on the magnitude of the model's weights. These techniques ensured that the model did not become overly reliant on any feature or subset of the training data, thereby improving its performance on unseen data.
Additionally, early stopping was employed during training to prevent overfitting and ensure good generalization to new data. By monitoring the validation loss, training was halted when it ceased to improve, avoiding unnecessary iterations that could lead to overfitting.
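Two of the safeguards described above, L2 regularization and early stopping, can be sketched with scikit-learn's `MLPRegressor`. This is an assumption about tooling, since the paper does not name its ANN library, and `MLPRegressor` has no dropout layers, so dropout is not shown. All hyperparameter values and the synthetic data are placeholders.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic placeholder data with seven normalized input columns.
rng = np.random.default_rng(4)
X = rng.uniform(size=(300, 7))
y = 30.0 * X[:, 0] + 10.0 * X[:, 5] + rng.normal(scale=1.0, size=300)

model = MLPRegressor(
    hidden_layer_sizes=(16, 16),   # two hidden layers (illustrative sizes)
    alpha=1e-3,                    # L2 penalty strength
    learning_rate_init=0.01,
    batch_size=32,
    max_iter=500,                  # upper bound on training epochs
    early_stopping=True,           # hold out part of the training data
    validation_fraction=0.2,
    n_iter_no_change=20,           # patience, in epochs
    random_state=0,
)
model.fit(X, y)

print("epochs run:", model.n_iter_)
print("best validation R2:", model.best_validation_score_)
```

With `early_stopping=True`, training halts once the held-out validation score has failed to improve for `n_iter_no_change` consecutive epochs, which mirrors the monitoring of validation loss described in the text.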

Gene Expression Programming (GEP)
Genetic algorithms (GA) are one of the main types of machine learning and SCT; the method is based on the Darwinian principle of natural selection and is used to solve complex problems. It has been applied to many problems, primarily optimization problems controlled by several variables [14]. Ferreira [53] proposed an improved form of genetic programming (GP), namely Gene Expression Programming (GEP). GEP works as a learning algorithm that focuses on understanding the relationships between variables within datasets and creates a model to interpret these relationships. GEP is a type of GA that finds solutions using a combination of chromosomes and expression trees. These chromosomes have mathematical information or functions encoded into them and are used to build the initial chromosome population. Each chromosome is evaluated by checking its fitness, and the chromosomes with the highest fitness are chosen for reproduction. The genetic operations performed include crossover, mutation, and reproduction. The algorithm continues to evolve the population until a satisfactory solution is found. This method produces a relatively simple estimation equation that can be used for practical design and hand calculation [54]. The general methodology of the GEP model can be seen in Figure 6.
All relevant variables, such as cement content, water-to-cement ratio, granulated furnace slag, superplasticizer, coarse aggregate, fine aggregate, and curing period, were included during the initial stages of model development. Preliminary analysis indicated that including all these variables led to overfitting, where the model performed exceptionally well on training data but poorly on validation data. This prompted a closer examination of each variable's contribution to the model's predictive power.
A correlation matrix was generated to examine the relationship between each input variable and the concrete's CS. The fine aggregate variable was found to have a weaker correlation with CS compared to other variables. While fine aggregate contributes to the overall mix design, its direct impact on the compressive strength was less significant in our dataset, which primarily focused on the effects of SSA.
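As an illustration of this screening step, the following minimal sketch computes the Pearson correlation between each input and CS and ranks the inputs by absolute correlation. It is not the code used in this study; the function name `pearson_r` and the five mixes are hypothetical values chosen only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical illustrative mixes (NOT taken from the study's dataset).
inputs = {
    "cement": [300, 350, 400, 450, 500],
    "fine_aggregate": [700, 650, 720, 680, 690],
}
cs = [30.0, 36.0, 41.0, 47.0, 52.0]  # compressive strength, MPa

# One row of the correlation matrix: each input against CS.
correlations = {name: pearson_r(values, cs) for name, values in inputs.items()}

# Inputs ordered from strongest to weakest linear association with CS.
ranked = sorted(correlations, key=lambda k: abs(correlations[k]), reverse=True)
```

A low absolute correlation only indicates a weak linear association in the sampled data; it does not by itself show that a variable has no physical influence.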
Feature importance techniques, such as recursive feature elimination and random forest feature importance, were utilized to evaluate the impact of each variable on the model's performance. It was consistently shown that the fine aggregate variable ranked lower in importance, indicating that its contribution to predicting compressive strength was relatively minor compared to other variables like the SSA content, cement content, and curing period. Less impactful variables were excluded to enhance model simplicity and performance.
In conclusion, the exclusion of the fine aggregate variable from the GEP model was a deliberate decision based on thorough statistical analysis, feature importance evaluation, and the need to balance model complexity with predictive accuracy. This approach ensures that the model remains robust and generalizable without unnecessary variables contributing minimally to its predictive power.
The key parameter settings of the GEP model for predicting CS are as follows: the model utilizes 30 chromosomes with a head size of 8, each containing three genes. The function set includes addition (+), subtraction (−), multiplication (×), division (/), and square root (√) operations. The fitness function used is RMSE. The model applies a mutation rate of 0.00138, an inversion rate of 0.00346, and a gene transposition rate of 0.00277. Additionally, random chromosomes are set at 0.0026, and gene recombination occurs at a rate of 0.00277. The mathematical equation developed by the GEP model is provided below alongside the expression tree (see Figure 7).
(1)
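To illustrate how a GEP gene is read as an expression tree and scored with an RMSE fitness function, the sketch below decodes a short gene written in Ferreira's breadth-first (Karva) notation. It is a simplified illustration and not the model developed in this study: the gene, the terminals `a` and `b`, the head size of 4, and the protected division and square root conventions are all assumptions made for the example.

```python
import math
from collections import deque

# Function set named in the text; "Q" stands for square root here.
ARITY = {"+": 2, "-": 2, "*": 2, "/": 2, "Q": 1}


def decode(gene):
    """Turn a K-expression (symbols listed level by level) into a nested tree."""
    symbols = iter(gene)
    root = [next(symbols)]
    pending = deque([root])
    while pending:
        node = pending.popleft()
        for _ in range(ARITY.get(node[0], 0)):  # terminals have arity 0
            child = [next(symbols)]
            node.append(child)
            pending.append(child)
    return root  # trailing unused symbols form the non-coding region


def evaluate(node, inputs):
    """Evaluate a decoded tree for one record of input values."""
    symbol, children = node[0], node[1:]
    if symbol not in ARITY:
        return inputs[symbol]
    args = [evaluate(child, inputs) for child in children]
    if symbol == "+":
        return args[0] + args[1]
    if symbol == "-":
        return args[0] - args[1]
    if symbol == "*":
        return args[0] * args[1]
    if symbol == "/":
        # Protected division (an assumed convention to avoid dividing by zero).
        return args[0] / args[1] if args[1] != 0 else 1.0
    return math.sqrt(abs(args[0]))  # "Q": protected square root (assumed)


def rmse_fitness(gene, records, targets):
    """RMSE of the gene's predictions; lower values mean a fitter gene."""
    tree = decode(gene)
    errors = [(evaluate(tree, r) - t) ** 2 for r, t in zip(records, targets)]
    return math.sqrt(sum(errors) / len(errors))


# Hypothetical gene (head "+*Qa", tail "bbaab"): decodes to a*b + sqrt(b).
gene = "+*Qabbaab"
records = [{"a": 3.0, "b": 4.0}, {"a": 1.0, "b": 9.0}]
targets = [14.0, 12.0]
fitness = rmse_fitness(gene, records, targets)
```

In a full GEP run, a chromosome holds several such genes joined by a linking function, and mutation, inversion, transposition, and recombination act on the gene strings before each fitness evaluation.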

Random Forest Regression (RFR)
The random forest regression ML method is known for its ability to handle large datasets with different attributes and to provide an accurate estimation of attribute importance. RFR works on the principle of ensemble learning, combining the predictive power of multiple decision trees to enhance accuracy and reliability. Every tree is built independently on a random subset of the training data, which introduces diversity among the trees and reduces overfitting. Through bootstrap aggregation (bagging), RFR can build a robust model by training on different dataset variations. The final prediction is determined by averaging the predictions from all the individual trees in the forest. This mechanism ensures that the collective decision of many trees is more accurate and stable than any individual tree's [43]. This process is repeated continuously until the required degree of precision is attained. Overall, this ensemble strategy is what enhances the predictive power of RFR [55]. The general structure of the RFR model can be seen in Figure 8.
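The bagging and averaging mechanism described above can be sketched with one-split regression trees (stumps) on a single input. This is a minimal illustration under stated assumptions, not the RFR model of this study: the helper names and the age and strength values are hypothetical, and a full random forest would also grow deeper trees and sample a random subset of features at each split.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimising the squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # every x identical: fall back to the mean
        mean = sum(ys) / len(ys)
        return (xs[0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged(xs, ys, n_trees=50, seed=0):
    """Train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees


def predict_bagged(trees, x):
    """Average the predictions of all trees in the ensemble."""
    return sum(predict_stump(t, x) for t in trees) / len(trees)


# Hypothetical data: curing age (days) against compressive strength (MPa).
age = [1, 3, 7, 14, 28, 56, 90]
strength = [8.0, 15.0, 22.0, 28.0, 35.0, 40.0, 42.0]
forest = fit_bagged(age, strength)
early = predict_bagged(forest, 1)
late = predict_bagged(forest, 90)
```

Because each stump sees a different bootstrap sample, the split points differ from tree to tree, and the averaged prediction varies more smoothly with age than any single stump does.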

Gradient Boosting (GB)
Gradient boosting is a powerful machine-learning technique with significant success across various applications. The core idea behind gradient boosting involves sequentially building an ensemble of weak learners, typically decision trees. Each subsequent model attempts to correct the errors of its predecessor by focusing on the residuals, thereby improving overall performance.
One of the seminal tutorials on gradient boosting, by Natekin and Knoll [56], provides a comprehensive introduction to its methodology, highlighting its solid machine-learning aspects and practical implementations. This tutorial emphasizes the algorithm's iterative nature and the importance of choosing appropriate base learners and loss functions to optimize performance.
Further studies by Aziz et al. [57] explore developing AI monitoring and prediction systems using gradient-boosting algorithms. This research underscores the algorithm's effectiveness in predictive maintenance, where data-driven models predict equipment failures and maintenance needs [57].
Another recent study by Guillen et al. [58] explores the application of gradient tree boosting for estimating production functions. This study illustrates the versatility of gradient boosting in handling various types of predictive tasks, particularly in economic and production forecasting contexts [58].
Comparative analyses, like the one by Bentejac and Csorgo, evaluate different gradient-boosting implementations, such as XGBoost, LightGBM, and CatBoost. These comparisons indicate that while all variants share a common foundation, specific implementations may offer advantages in terms of speed, accuracy, or the handling of categorical data [59].
The key concepts and steps in GB, as shown in Figure 9, begin with a simple model to make an initial prediction, such as the mean value for regression tasks. The core of the process involves sequentially adding new models to correct the errors made by the previous ones. Each new model is trained on the residuals, that is, the differences between the actual values and the predictions from the combined ensemble of earlier models. This correction step is driven by gradient descent, where the new model approximates the negative gradient of the loss function to minimize prediction errors effectively [56].
A crucial element in GB is the learning rate, which determines the contribution of each new model to the final prediction. Lower learning rates require more iterations but help achieve higher accuracy and prevent overfitting by making minor adjustments at each step. Regularization techniques, such as limiting tree depth and subsampling, further enhance the model's robustness and generalization capabilities. These techniques help control the model complexity and prevent overfitting, ensuring that the model performs well on unseen data.
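The sequential residual fitting and the role of the learning rate can be illustrated with a minimal sketch for the squared-error loss, for which the negative gradient is proportional to the residual. This is not the GB model of this study: the stump base learner, the helper names, the number of rounds, the learning rate of 0.1, and the data are assumptions chosen for the example.

```python
def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimising the squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gb(xs, ys, n_rounds=100, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Each stump contributes only a fraction (the learning rate) of its fit.
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict_gb(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Hypothetical data: curing age (days) against compressive strength (MPa).
age = [1, 3, 7, 14, 28, 56, 90]
strength = [8.0, 15.0, 22.0, 28.0, 35.0, 40.0, 42.0]

model = fit_gb(age, strength)
baseline_mse = mse(strength, [sum(strength) / len(strength)] * len(strength))
boosted_mse = mse(strength, [predict_gb(model, x) for x in age])
```

A smaller learning rate would need more rounds to reach the same training error, which is the trade-off described above; in practice, the number of rounds is chosen on validation data rather than on the training error shown here.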
GB's iterative nature, combined with its robust handling of various data types and missing values, makes it a powerful tool for numerous applications, including finance, healthcare, and marketing. Each iteration's continuous improvements and refinements allow GB to achieve high predictive accuracy, making it a preferred choice for complex prediction tasks [57].
Overall, GB remains a robust and versatile technique in machine learning, with continuous advancements and applications across diverse fields.

Statistical Indicators and Measurements
Statistical measures like the mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²) are used to evaluate the accuracy of the predictive models. Many of these measures have been used before to assess the accuracy of different models. The R² indicates how well the model's predictions match the actual data. This metric shows how well the independent variables explain the variance in the dependent variable. It ranges from negative infinity to 1, with 1 being the best score. A perfect score means the model explains all the variation in the data, while a negative score indicates the model performs worse than simply using the average value. Essentially, R² is useful for comparing the goodness of fit of various models on the same dataset, and it is calculated using the following formula:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

where $y_i$ are the actual values, $\hat{y}_i$ are the predicted values, and $\bar{y}$ is the mean of the actual values.
Moreover, MAE indicates how far off the predictions are from the actual values, on average. MAE provides a simple way to assess prediction accuracy, with lower values indicating that the model predictions are close to the actual values. It is calculated as the average absolute difference between predicted and actual values. MAE is more robust against outliers compared to RMSE, which makes it a valuable measure for assessing the average prediction error of the model. MAE is calculated using the following formula:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

where $n$ is the number of observations, $y_i$ represents the actual value, and $\hat{y}_i$ is the predicted value.
RMSE is the square root of the mean of the squared differences between predicted and actual values. RMSE is similar to MAE but puts more emphasis on large errors. This is because RMSE squares the errors before averaging and taking the square root, which makes it more responsive to outliers and significant deviations in the predictions. It is calculated using the following formula:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

Generally, these metrics work together to provide a complete picture of the model's accuracy. Ideally, a high R² close to 1 and low values for MAE and RMSE mean the model closely matches the data with minimal prediction error.
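For reference, the three metrics defined above can be computed as in the following minimal sketch. The function names and the four strength values are hypothetical and serve only to show the arithmetic.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_true) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)
    )


# Hypothetical experimental and predicted strengths in MPa.
experimental = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 41.0, 47.0]

scores = {
    "R2": r2_score(experimental, predicted),
    "MAE": mae(experimental, predicted),
    "RMSE": rmse(experimental, predicted),
}
```

For these values the errors are −2, 2, −1, and 3 MPa, so MAE is 2.0 MPa while RMSE is about 2.12 MPa; the gap reflects the heavier weight RMSE places on the largest error.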
Tables 4-6 summarize the evaluation metrics (R², MAE, and RMSE) of the four models created.

Results and Discussion
Predicting the compressive strength of steel slag aggregate (SSA) concrete was carried out using four different models: Random Forest Regression (RFR), Gene Expression Programming (GEP), an Artificial Neural Network (ANN), and Gradient Boosting (GB). Statistical metrics such as the R², mean values, root mean squared error (RMSE), and mean absolute error (MAE) were employed to measure the accuracy of the predictive models.
The results in Tables 4 and 5 show that the MAE values range from 0.79 to 9.71 MPa for the training dataset and from 2.61 to 6.70 MPa for the testing set. For the RMSE, the range for the training set is 1.90 to 12.03 MPa, and for the testing set, it is 3.95 to 8.29 MPa. Moreover, the mean values for the training dataset range from 1.00 to 1.19, and for the testing set, they vary from 1.01 to 1.12. Furthermore, the R² values lie between 0.61 and 0.99 for the training set and between 0.82 and 0.96 for the testing set.
The statistical analysis for the compressive strength models reveals distinct performance differences among the models in the training dataset, as shown in Table 4. For the MAE, the GB model significantly outperforms the others, with an MAE of 0.79, indicating it has the smallest average error. In contrast, the ANN model exhibits the highest MAE at 9.71, suggesting it struggles with accurate predictions compared to the other models. Regarding RMSE, the GB model also achieves the lowest RMSE of 1.90, confirming its superior performance in terms of error magnitude. The ANN model again fares the worst with an RMSE of 12.03, highlighting its less reliable predictions. The mean values across all models are close, with GB at 1.00, showing minimal deviation from the actual values. This suggests that all models maintain a relatively consistent prediction error pattern. The R² value is a crucial indicator of model fit. The GB model attains an almost perfect fit with an R² of 0.99, indicating that it explains nearly all the variability in the dataset. The ANN model, with an R² of 0.61, has the lowest explanatory power among the models, reflecting its weaker predictive capability.
As shown in Table 5, the models demonstrate a consistent performance pattern with slight variations when applied to the testing dataset. The GB model continues to lead with an MAE of 2.61, though the error is higher than for the training dataset, suggesting a slight decrease in performance. The GEP model records an MAE of 5.65, while the ANN model shows improvement but still lags with an MAE of 6.70. GB again achieves the lowest RMSE of 3.95, maintaining its superior performance. Although improved from training, the ANN model still records a higher RMSE of 8.29. The mean values are consistent with the training dataset, with the GB model having a mean value of 1.01, indicating stable performance across datasets. The GB model retains high performance with an R² of 0.96, while the RFR model shows strong performance with an R² of 0.93. The ANN model's R² value improves to 0.82, indicating a better model fit compared to the training phase.
Analyzing the performance across the combined datasets, as shown in Table 6, offers a comprehensive view of each model's robustness and generalization ability. The GB model exhibits the lowest MAE of 1.15, reaffirming its consistent accuracy. The ANN model, despite improvements, has the highest MAE at 8.86, indicating its predictions are less reliable overall. GB maintains its lead with an RMSE of 2.45, showing the lowest overall error. Conversely, ANN records the highest RMSE of 11.14, reflecting its less accurate predictions. The mean values remain consistent, with GB showing a mean value of 1.01, indicating stable and precise predictions. GB achieves the lowest standard deviation (STDV) of 0.06, indicating the smallest spread in prediction errors, while ANN shows more variability with an STDV of 0.32. The GB model, with an R² of 0.98, demonstrates exceptional fit and reliability across all data. The ANN model, with an R² of 0.67, shows the weakest fit among the models. The GB model has the lowest coefficient of variation (COV) at 5.98%, signifying the smallest relative variability in its predictions. The ANN model, with a COV of 31.96%, exhibits the highest variability, indicating less consistent performance.
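The mean, STDV, and COV reported above can be reproduced along the lines of the sketch below, assuming (this is not stated explicitly in the text) that they are computed on the ratio of predicted to experimental strength and that COV is the STDV divided by the mean. The function name, the use of the sample standard deviation, and the data are likewise assumptions.

```python
import statistics


def ratio_statistics(experimental, predicted):
    """Mean, standard deviation, and COV (%) of predicted/experimental ratios."""
    ratios = [p / e for e, p in zip(experimental, predicted)]
    mean = statistics.mean(ratios)
    stdv = statistics.stdev(ratios)  # sample standard deviation (assumed)
    cov_percent = 100.0 * stdv / mean
    return mean, stdv, cov_percent


# Hypothetical experimental and predicted strengths in MPa.
experimental = [20.0, 40.0]
predicted = [22.0, 36.0]
mean, stdv, cov_percent = ratio_statistics(experimental, predicted)
```

Under this reading, a mean close to 1 indicates little systematic over- or under-prediction, and a low COV indicates that the ratio is consistent from one mix to the next.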
Overall, the GB model consistently outperforms the other models across all metrics, demonstrating high accuracy, minimal error, and robust model fit in both the training and testing datasets. This suggests that GB is the most reliable model for predicting compressive strength in this context. While showing some improvement in the testing phase, the ANN model generally underperforms compared to the other models. Its higher MAE and RMSE and lower R² values indicate that it struggles to effectively capture the underlying patterns in the data. Both the GEP and RFR models perform reasonably well, with the RFR model showing solid performance in terms of R² values and error metrics. GEP, while not as robust as GB, still provides reasonable predictive accuracy. The GB model demonstrates the lowest variability and highest prediction stability, as indicated by its low STDV and COV. This makes it a reliable choice for practical applications where consistency is crucial. The R² values across datasets underscore the reliability of the GB and RFR models. These models can be trusted to provide accurate predictions, with the GB model being particularly noteworthy for its near-perfect fit.
Figures 10 and 11 provide scatter plots that compare predicted compressive strength values against experimental values for the training and test datasets. These plots show how closely each model's predictions track the experimental CS values, with reference to a ±30% error range.
The GB model demonstrates the closest alignment with the ideal fit line in both datasets, indicating its robust performance and accurate prediction of CS across different scenarios. This model consistently produces predictions that closely match the actual experimental values, suggesting minimal bias and high reliability.
Conversely, the ANN model shows more variability in its predictions. While some predictions closely align with the ideal line, others deviate more significantly. This variability suggests challenges in capturing all nuances and complexities of the data using the current ANN configuration. Further optimization or feature selection may be necessary to improve its predictive accuracy and reduce these deviations.
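The ±30% error range used in the scatter plots can also be summarized numerically, for example as the share of predictions that fall inside the band. The sketch below is an illustration only: it assumes the band is defined relative to the experimental value, and the function name and data are hypothetical.

```python
def share_within_band(experimental, predicted, tolerance=0.30):
    """Fraction of predictions within +/- tolerance of the experimental value."""
    inside = [
        abs(p - e) <= tolerance * e for e, p in zip(experimental, predicted)
    ]
    return sum(inside) / len(inside)


# Hypothetical experimental and predicted strengths in MPa.
experimental = [20.0, 40.0, 50.0]
predicted = [25.0, 30.0, 70.0]
share = share_within_band(experimental, predicted)
```

Here the first two predictions deviate by 25% of the experimental value and fall inside the band, while the third deviates by 40% and falls outside it.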
Figure 12 illustrates the distribution of predicted compressive strength values around the mean ± standard deviation (STDV). This visualization provides insights into the consistency and spread of predictions made by each model.
The RFR model exhibits the smallest scatter around the mean, with predictions clustered tightly. This indicates high precision and consistency in predicting CS values, reflecting robust performance and accurate modeling of the underlying data patterns.
Similarly, the GEP model shows a concentration of predictions around the mean range, although with slightly more spread compared to RFR. This suggests generally accurate predictions with some variability in specific scenarios, which could be further refined through model tuning.
In contrast, the ANN model again displays more dispersion in its predictions. The broader spread around the mean indicates variability and less precise predictions compared to RFR and GEP. This variability may stem from the ANN's struggle to fully capture the intricate relationships within the data, highlighting areas where model adjustments or additional data preprocessing could enhance performance.
Figure 13 focuses on residual errors, depicting the differences between predicted and actual CS values. Lower residual errors indicate closer agreement between predicted and measured values, reflecting the model's accuracy and ability to minimize prediction discrepancies.
The RFR and GEP models typically show minimal residual errors clustered around zero, indicating substantial agreement between predicted and actual CS values. This underscores their effectiveness in accurately predicting CS and their robust performance in various prediction scenarios.
In contrast, the ANN model may exhibit larger residual errors than RFR and GEP. Higher residual errors point to instances where predicted CS values deviate more significantly from actual measurements, indicating potential areas for improvement in model refinement or data feature selection.
The detailed analysis of Figures 10-13 provides valuable insights into how different modeling techniques predict concrete compressive strength. The GB model emerges as highly effective and reliable, while the ANN model shows variability and may benefit from further optimization. Understanding these performance nuances is crucial for optimizing predictive models and enhancing their reliability in concrete engineering applications.
Furthermore, although the GEP model produced a lower R² value than the GB and RFR models, its unique advantage lies in its interpretability. Unlike ANN and RFR, which operate as black-box models, GEP produces a mathematical equation that explicitly describes the relationship between the input variables and the target output. This makes GEP easier to interpret and understand, providing insight into the underlying mechanics that drive the model's prediction. With a clear mathematical expression, researchers and practitioners can gain a deeper understanding of the factors that influence the outcome. Therefore, despite a lower R² value compared to GB and RFR, the interpretability offered by the GEP model makes it a valuable tool in scenarios where comprehensibility and transparency are required. However, as shown in Equation (1), the FA variable was not considered, which can be attributed to several reasons, such as a limitation in dataset representation, the prioritization of other influential variables, the model treating the addition of FA as unnecessary, the quality of the fine aggregate data, and limitations in the model architecture or training processes.

Sensitivity Study
A sensitivity study or analysis is crucial to many scientific investigations. This parameter sensitivity analysis helps to understand how a particular parameter might affect the output of the model prediction. It identifies which input parameters impact the results most and which have smaller effects. The GEP model was chosen for the sensitivity study because of its simplicity, which makes it well suited to analyzing the factors influencing compressive strength (CS).
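A one-at-a-time parameter sweep of the kind described above can be sketched as follows: one input is varied over a range while the others are held at a baseline mix, and the model output is recorded. The closed-form `toy_strength` function is a hypothetical stand-in and is not Equation (1); its coefficients, the baseline mix, and the sweep values are assumptions made only to show the procedure.

```python
import math


def toy_strength(mix):
    """Hypothetical closed-form stand-in for a fitted model (NOT Equation (1))."""
    return (
        0.06 * mix["cement"]
        + 0.10 * mix["ssa_percent"]
        + 3.0 * math.sqrt(mix["age_days"])
    )


def sweep(model, baseline, name, values):
    """Vary one input over `values`, holding the others at their baseline."""
    return [(v, model({**baseline, name: v})) for v in values]


# Hypothetical baseline mix.
baseline = {"cement": 400.0, "ssa_percent": 50.0, "age_days": 28.0}

ssa_curve = sweep(toy_strength, baseline, "ssa_percent", [0, 25, 50, 75, 100])
age_curve = sweep(toy_strength, baseline, "age_days", [3, 7, 28, 56, 90])
```

The rising trends in this example follow directly from the assumed coefficients and are not findings; with a fitted model substituted for `toy_strength`, the same sweep would yield curves of the kind plotted in a sensitivity figure.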

The Effects of Changing Steel Slag Aggregate Content (SSA)
This sensitivity study examined the relationship between the percentage of SSA in concrete mixes and the resulting compressive strength. As mentioned in the literature review and shown in Figure 14a, increasing the SSA content increases the compressive strength. Moreover, Tarawneh et al. [29] revealed that adding SSA can improve concrete's abrasion factor, impact value, and compressive strength, particularly during the early stages. Miah et al. [28] showed a notable increase in compressive strength and reduced porosity when steel slag was utilized. Sinha [25] also confirmed the trend by observing an increase in compressive, flexural, and split tensile strengths after replacing fine aggregate with steel slag by a specific percentage.

The Effects of Aging on the Compressive Strength of SSA Concrete
This sensitivity analysis focused on investigating the effects of aging on SSA concrete's compressive strength. The study's results showed that the CS increased over time, reflecting the gradual development of concrete's mechanical properties (as shown in Figure 14b). This finding aligns with what Nguyen et al. [30] found. First, compressive strength rapidly increased within the 7-day curing period of concrete, followed by a slower but continuous increase. Additionally, Tarawneh et al. [29] highlighted the beneficial effects of SSA on enhancing concrete properties, specifically abrasion resistance and impact strength. This suggests that this enhancement may be responsible for the observed strength. The study demonstrated the progressive compressive strength enhancement in SSA concrete as it ages. Moreover, Aparicio et al. [31] found that using SSA can increase compressive strength values after 28 days of curing, increasing with the replacement percentage.

Conclusions
The construction industry is a significant consumer of natural resources and faces substantial challenges related to environmental sustainability and resource depletion. This research addresses these challenges by promoting the adoption of locally available eco-friendly alternatives, such as steel slag aggregate (SSA) concrete. The study explored the predictive capabilities of various machine learning (ML) and soft computing techniques (SCT), including Artificial Neural Networks (ANN), Gene Expression Programming (GEP), Random Forest Regression (RFR), and Gradient Boosting (GB), to predict the compressive strength (CS) of SSA concrete. Each model's fundamental principles, advantages, limitations, and similarities were elucidated. A total of 334 datasets were used, encompassing input factors like cement content, water, fine and coarse aggregates, superplasticizers, SSA, and age, with compressive strength as the output.
Statistical metrics such as the coefficient of determination (R²), mean absolute error (MAE), root mean square error (RMSE), and mean values were employed to assess the accuracy of the predictive models. The findings indicated that both the GEP and GB models exhibited excellent R² values, with the GB model achieving the highest R² value of 0.98, followed by the RFR (0.96), GEP (0.86), and ANN (0.67). Additionally, the GB model recorded the lowest MAE and RMSE values, 1.15 MPa and 2.45 MPa, respectively. The RFR model recorded an MAE of 2.52 MPa and an RMSE of 3.69 MPa, the GEP model 5.86 MPa and 7.35 MPa, and the ANN 8.86 MPa and 11.14 MPa. The mean values further confirmed the superior performance of the GB model, followed by RFR, GEP, and ANN.
The hyperparameter tuning for these models was crucial to achieving these results. Parameters for each model were carefully selected and optimized using techniques such as grid search and random search. For example, the GEP model's parameters included a set number of chromosomes, head size, and genes, among others. The ANN model's parameters, such as the learning rate, the number of hidden layers, and batch size, were similarly optimized. This rigorous hyperparameter optimization ensured the models' robustness, reproducibility, and generalizability, providing reliable predictions for unseen data.
The developed models offer practical applications by providing a reliable method for predicting the compressive strength of SSA concrete, which engineers and construction professionals can use to optimize mix designs and reduce the need for extensive physical testing. These models can serve as a decision-making tool, aiding in selecting appropriate mix proportions to meet specific performance criteria, thereby promoting more efficient and sustainable construction practices. The results of this study contribute significantly to understanding the mechanical behavior of SSA concrete. The developed models, particularly GEP, demonstrated strong predictability for estimating the compressive strength of SSA concrete, offering a mathematical framework that incorporates various critical parameters. These findings provide a foundation for future research and practical applications in the construction industry, promoting sustainable construction practices using SSA concrete.

Recommendations and Future Work
One of the limitations of this project is the extent of the investigation. While this study provides essential insights into the predictive abilities of machine learning models and soft computing techniques, there is still room for improvement. It is recommended that future research broaden the dataset used for testing. Including a wider range of parameters, such as different compositions of SSA and various additives, can offer a more accurate picture of the performance of this concrete. Additionally, incorporating a broader range of test scenarios and conditions would enhance the analysis and validate the findings, thereby improving the reliability and usability of the predictive models created.
Notably, the fine aggregate variable was not included in the equation developed by GEP. Several factors could explain why the GEP model excluded fine aggregate. First, the training data might not have adequately represented the effects of fine aggregate on compressive strength, possibly due to insufficient variation or a lack of samples with varied fine aggregate composition. Second, the model might have prioritized other, more significant variables for predicting compressive strength. Additionally, the model could have determined that including fine aggregate did not improve predictive accuracy or that other variables sufficiently captured the interaction between fine aggregate and SSA. The quality of the data regarding fine aggregate or its interaction with SSA could also have influenced the model's ability to capture these effects. Paixão et al. [47] noted that data collected from 17 sources could introduce noise or errors if improperly pre-processed. Moreover, while the GEP model can capture complex relationships, limitations or biases in the model architecture or training process might affect its performance. This indicates opportunities for improvement, such as optimizing parameters or including additional relevant variables.
Furthermore, the geographical and environmental conditions under which the data were collected should be considered, as they might influence the results. Data were sourced from various locations with distinct climates and environmental factors, potentially affecting SSA concrete's material properties and performance. Differences in temperature, humidity, and exposure to environmental stressors such as salinity or pollutants were considered. These factors could impact SSA concrete's compressive strength and durability, thus influencing the predictive modeling results [60].
Future work will expand the database to incorporate additional and more recent experimental studies on the use of steel slag in concrete, increasing the comprehensiveness and reliability of the dataset. This will include ensuring that data collection is diverse and representative, covering various sources, geographical regions, environmental conditions, and varying compositions of SSA. Thorough data preprocessing, including normalization and standardization, will also be crucial to mitigate potential biases. Augmenting the dataset through synthetic data generation or controlled experiments, implementing cross-validation, and regularly updating it with new data will further enhance its comprehensiveness and reliability. By addressing these recommendations, future research can improve the understanding and predictive accuracy of models for SSA concrete, contributing to more reliable and comprehensive results.
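As an illustration of how standardization and cross-validation fit together, the sketch below runs k-fold cross-validation on a single-feature least-squares model, fitting the standardizer on each training fold only so that no information from the held-out fold leaks into preprocessing. The data are synthetic, the folds are contiguous (in practice the samples would usually be shuffled first), and all function names are assumptions made for this example.

```python
import math

def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def fit_standardizer(values):
    """Return (mean, std) for z-score standardization; std falls back to 1."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, std or 1.0

def fit_line(x, y):
    """Ordinary least squares for y = a * x + b with one feature."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    a = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return a, mean_y - a * mean_x

def cross_validated_rmse(x, y, k=5):
    """Average test RMSE over k folds, standardizing with training statistics."""
    scores = []
    for test_idx in kfold_indices(len(x), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        mean, std = fit_standardizer([x[i] for i in train_idx])
        x_train = [(x[i] - mean) / std for i in train_idx]
        x_test = [(x[i] - mean) / std for i in test_idx]
        a, b = fit_line(x_train, [y[i] for i in train_idx])
        squared = [(y[i] - (a * xt + b)) ** 2 for i, xt in zip(test_idx, x_test)]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return sum(scores) / len(scores)

# Synthetic, exactly linear data (not from the paper): the error should be ~0.
x_demo = [float(v) for v in range(1, 11)]
y_demo = [2.0 * v + 5.0 for v in x_demo]
print(cross_validated_rmse(x_demo, y_demo, k=5))
```

Averaging the error over several held-out folds gives a less split-dependent estimate of generalization than a single training and testing partition.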

Figure 1. A flowchart illustrating the methodology process described in the paper.

Figure 2. Frequency of the collected data.

Figure 3. Relationship between two variables on a Cartesian plane.

Buildings 2024, 14

These techniques ensured that the model did not become overly reliant on any feature or subset of the training data, thereby improving its performance on unseen data. Additionally, early stopping was employed during training to prevent overfitting and ensure good generalization to new data. By monitoring the validation loss, training was halted when it ceased to improve, avoiding unnecessary iterations that could lead to overfitting.
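The early-stopping rule described above can be sketched as a small function that scans a sequence of validation losses. The `patience` and `min_delta` parameters, as well as the loss values, are assumptions made for illustration; the paper does not report the stopping criterion's settings.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve on the best value by
    more than min_delta for `patience` consecutive epochs. Both parameters
    are hypothetical settings, not values reported in the paper.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # The loss kept improving often enough; training ran to the last epoch.
    return len(val_losses) - 1, best_epoch

# Illustrative validation losses per epoch (not from the paper).
losses = [10.0, 8.0, 7.0, 7.2, 7.1, 7.3, 6.0]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print(f"stop at epoch {stop_epoch}, restore weights from epoch {best_epoch}")
```

In this example training halts at epoch 5 and the weights from epoch 2 would be restored, so the later improvement at epoch 6 is never seen. That trade-off is why the patience value matters.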

Figure 5. The general structure of the ANN model.

Figure 6. GEP methodology.

This paper specifies the key setting parameters and adjustments in the GEP model for predicting CS: the model utilizes 30 chromosomes with a head size of 8, each containing three genes. The function set includes addition (+), subtraction (−), multiplication (×), division (/), and square root (√) operations. The fitness function used is RMSE. The model applies a mutation rate of 0.00138, an inversion rate of 0.00346, and a gene transposition
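The GEP settings stated above can be collected in a small configuration sketch. The dictionary keys are hypothetical names chosen for this example and do not correspond to any particular GEP software. The tail-size relation t = h(n − 1) + 1, where h is the head size and n is the largest function arity, is the standard GEP gene-structure rule; it is applied here as an assumption, since the paper does not state the tail or gene length.

```python
# Settings as stated in the text; key names are hypothetical.
GEP_SETTINGS = {
    "chromosomes": 30,
    "head_size": 8,
    "genes_per_chromosome": 3,
    # Function set with the number of arguments each function takes.
    "function_arity": {"+": 2, "-": 2, "*": 2, "/": 2, "sqrt": 1},
    "fitness_function": "RMSE",
    "mutation_rate": 0.00138,
    "inversion_rate": 0.00346,
}

def tail_size(head_size, max_arity):
    """Standard GEP relation: t = h * (n_max - 1) + 1."""
    return head_size * (max_arity - 1) + 1

max_arity = max(GEP_SETTINGS["function_arity"].values())
tail = tail_size(GEP_SETTINGS["head_size"], max_arity)
gene_length = GEP_SETTINGS["head_size"] + tail
chromosome_length = gene_length * GEP_SETTINGS["genes_per_chromosome"]

print(f"tail size = {tail}, gene length = {gene_length}, "
      f"chromosome length = {chromosome_length}")
```

The tail contains only terminals (input variables and constants), which guarantees that every gene decodes to a syntactically valid expression regardless of how the head is mutated.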

Figure 8. The general structure of the RFR model.

Figure 9. The process of the GB model.

Figure 10. The relationship between the predicted and the experimental CS using the training dataset for the models: (a) GEP, (b) ANN, (c) RFR, and (d) GB.

Figure 11.

Figure 12. The relationship between the experimental and the predicted CS ratio and the steel slag amount for all datasets for the models: (a) GEP, (b) ANN, (c) RFR, and (d) GB.

Figure 13. Residual errors for the models using all the data.

Table 1. Statistical measures for all data.

Table 2. Statistical measures for training data.

Table 3. Statistical measures for testing data.

Table 4. Training set statistical measurements.

Table 5. Testing set statistical measurements.

Table 6. All datasets statistical measurements.