Shear Strength Prediction of Steel-Fiber-Reinforced Concrete Beams Using the M5P Model

Abstract: This article presents a mathematical model, developed with the M5P model tree, for predicting the shear strength of slender steel-fiber-reinforced concrete (SFRC) beams. Soft computing techniques of this kind are becoming increasingly popular for addressing complex technical problems. Other approaches have known drawbacks: semi-empirical equations can be inaccurate, and some soft computing methods do not produce explicit predictive equations. The model was trained and tested using 332 samples from an experimental database available in the literature. It takes into account the following independent variables: the effective depth d, beam width b w , longitudinal reinforcement ratio ρ, concrete compressive strength f c , shear span to effective depth ratio a/d, and steel fiber factor F sf . The predictive performance of the proposed M5P-based model was also compared with that of existing models from the literature. The evaluation showed that the M5P-based model predicted the actual strength more consistently and accurately than the existing models, achieving an R² value of 0.969 and an RMSE value of 37.307 for the testing dataset. The model was found to be both reliable and straightforward to apply. It is likely to be helpful in assessing the shear capacity of SFRC beams during the pre-planning and pre-design stages and could also inform future revisions of design standards.


Introduction
Shear failure, a critical type of failure in structural concrete beams, occurs when the applied load exceeds the shear capacity, leading to diagonal cracks and potential catastrophic failure [1][2][3][4]. Shear failure is of particular concern due to its sudden and brittle nature, posing significant safety hazards, unlike the more gradual flexural failure [5].
The design of concrete beams to resist shear failure involves calculating the shear capacity, which depends on factors such as the size and shape of the beam, concrete compressive strength, and reinforcing steel. Design codes and standards provide guidelines for calculating the shear capacity, taking these factors into account [6,7].
The design of concrete beams to resist shear failure can be challenging, especially for beams with complex geometries or unusual loading conditions. However, advances in concrete construction [8] and computational tools have led to the development of new types of reinforcement and analytical techniques that can enhance the shear capacity of beams and improve their performance [9][10][11][12][13]. One of those reinforcing materials is steel fibers.
The addition of steel fibers to concrete can significantly improve its resistance to shear failure. According to several sources [14][15][16][17], the inclusion of fibers in concrete can greatly enhance its post-cracking behavior and modify its tensile strength. The addition of steel fibers has been found to significantly affect various properties of concrete members, such as resistance to deformation and cracking, as well as ultimate flexural strength, ductility, toughness, and shear capacity [18][19][20][21][22][23]. To enhance the shear strength of reinforced concrete (RC) elements, fibers are sometimes used to partially replace transverse steel reinforcement (such as vertical stirrups). High-strength concrete is commonly used in high-rise buildings because it allows section dimensions and dead load to be reduced. However, it is also known to behave in a brittle manner, which can give rise to structural issues. Using steel fibers in high-strength concrete can improve its mechanical properties and ductility [24], making the concrete less prone to cracking and better able to deform without failing. By using steel fibers in high-strength concrete, engineers can design structures that are both strong and durable, without sacrificing safety or stability requirements.
The addition of steel fibers to concrete can make the prediction of its shear strength more challenging due to several factors. One of the primary reasons is the complex and nonlinear behavior of the material once fibers are added. Two types of machine learning (ML) models have been used to predict the shear strength of SFRC beams: black-box [25][26][27][28][29][30][31] and white-box [32][33][34][35][36] models. Black-box ML models, such as neural networks, random forests, and support vector machines, are often highly accurate in making predictions, which is their primary benefit. However, they are referred to as "black box" models because they cannot provide an explicit mathematical expression describing how the shear capacity and the input parameters are functionally related. On the other hand, various evolutionary algorithms have an advantage over "black box" models because they can express the connection between inputs and outputs through explicit formulas. One example is gene expression programming (GEP), an evolutionary algorithm that optimizes both functions and coefficients and is well suited to uncovering relationships in nonlinear systems. GEP produces a symbolic model that can be readily understood by humans, allowing users to interpret the solution returned by the algorithm. This interpretability is especially beneficial in scenarios where human input is critical for decision-making. Evolutionary algorithms have been used by a number of researchers to predict the shear capacity of SFRC beams [37][38][39][40][41].
Although the studies mentioned earlier reported good levels of accuracy, it is still recommended to create novel models and conduct statistical analyses to ascertain the significance of any observed variations in performance and accuracy. The main objective of this research is to develop an accurate model for predicting the shear strength of SFRC slender beams using the M5P tree algorithm. The M5P tree algorithm is a machine learning technique that combines the advantages of decision trees and linear regression models, offering high accuracy and interpretability [42][43][44][45][46]. It builds a decision tree, where each leaf node corresponds to a linear regression model fitted to the data in that node. The algorithm has been widely used in various applications due to its effectiveness in handling nonlinear relationships and ease of understanding the model's decision-making process. The proposed model will provide an efficient and reliable tool for the design and assessment of SFRC beams, which have been widely used in various civil engineering applications due to their high strength and ductility properties.
While the M5P model itself is not an original proposal, the contribution of this study lies in its application to the prediction of the shear strength for SFRC beams, which has not been previously explored in the literature. In this research, the aim is to achieve the following objectives: establish a comprehensive database for the shear strength of SFRC slender beams by collecting and analyzing data from experimental studies, develop an innovative and interpretable model using the M5P tree algorithm to predict the shear strength of SFRC slender beams with superior performance compared to existing models, and conduct a safety analysis using the Collins scale to assess the safety performance of the proposed model. By developing a more accurate and reliable prediction tool, this research can contribute to the design and optimization of SFRC beams, ensuring that the desired level of shear strength is achieved while optimizing resource utilization and improving safety during the design and construction of SFRC structures.

Research Methodology
This section provides a detailed description of the methodological strategy utilized to achieve the goals and objectives of this research. The approach includes information about the input parameters, the methodology for gathering data, and the specific actions taken to develop the M5P-based prediction model and to analyze its performance.

Selection of the Input Parameters
According to previous studies [47][48][49][50] on SFRC beams, the factors that have a significant influence on their shear capacity include the effective depth d and the width b w of the cross-section of the beam, the longitudinal reinforcement ratio ρ, the concrete compressive strength f c , the shear span to effective depth ratio a/d, and the steel fiber factor F sf . The individual impact of some of these parameters is discussed in detail below.

Shear Span to Effective Depth Ratio (a/d)
The shear strength of SFRC beams is significantly affected by the shear span-to-effective depth ratio (a/d). As reported by Narayanan and Darwish [51], an exponential relationship exists between the a/d ratio and the shear strength, with strength decreasing as the ratio increases. This is attributed to the arch action effect, which redistributes compressive force within the beam and balances stress distribution between loading points and supports. The effect contributes to increasing the shear resistance and becomes more pronounced with lower a/d ratios, leading to a higher overall shear strength in beams.

Longitudinal Reinforcement Ratio (ρ)
The longitudinal reinforcement has a favorable impact on the shear strength of SFRC beams. However, the positive effect decreases as the reinforcement ratio increases above 3.6%. This was observed by Swami and Bahia [52]. In cases where the reinforcement ratio is lower, the shear strength experiences a higher increase due to the contribution of the dowel action [53].

Concrete Compressive Strength (f c )
The shear strength of SFRC beams is also highly influenced by the concrete compressive strength (f c ). The shear strength is generally reported to grow linearly with an increase in f c . However, some studies have presented contradictory results. Khuntia et al. [54] demonstrated that when f c rises, the shear strength of SFRC beams grows exponentially. This trend is particularly noticeable in high-strength concrete beams, where a robust bond between the concrete matrix and steel fibers exists [55].

Fiber Factor (F sf )
The shear strength of SFRC beams is also influenced by the fiber factor (F sf ), which combines the fiber aspect ratio (l f /D f ), where l f and D f represent the length and diameter of the fiber, respectively, the fiber volume fraction (V f ), and the bond factor (d f ) [41]. The fiber factor can be calculated according to the following equation:
$$F_{sf} = \frac{l_f}{D_f}\, V_f\, d_f$$
In Narayanan and Darwish's study from 1987 [51], d f represents the shape factor or bond factor, with 1.0 for indented fibers, 0.75 for crimped fibers, and 0.5 for round fibers.
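As a minimal sketch of how the fiber factor could be evaluated, the snippet below multiplies the aspect ratio, the volume fraction, and the bond factor quoted above. The function name `fiber_factor`, the dictionary `BOND_FACTOR`, and the convention that V f is entered as a decimal fraction (1% = 0.01) are assumptions made for illustration and are not taken from the source.

```python
# Bond factors d_f as quoted in the text from Narayanan and Darwish [51].
BOND_FACTOR = {"indented": 1.0, "crimped": 0.75, "round": 0.5}


def fiber_factor(l_f, D_f, V_f, fiber_type):
    """Return F_sf = (l_f / D_f) * V_f * d_f.

    l_f, D_f   : fiber length and diameter (same length unit)
    V_f        : fiber volume fraction as a decimal (ASSUMED convention)
    fiber_type : one of the keys of BOND_FACTOR
    """
    return (l_f / D_f) * V_f * BOND_FACTOR[fiber_type]


# Illustrative (hypothetical) fiber: 60 mm long, 0.75 mm diameter, 1% by volume.
example = fiber_factor(60.0, 0.75, 0.01, "crimped")
```

For the illustrative fiber above, the aspect ratio is 80, so the result is 80 × 0.01 × 0.75 = 0.6.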

Data Collection and Pre-Processing
To investigate the impact of the input variables on the mechanical behavior and shear failure mode, researchers have conducted experimental investigations on the shear behavior of SFRC beams. Study [49] compiled a comprehensive database of 488 tests on SFRC beams without stirrups. From this original collection, a subset of 332 tests was selected for this study after filtering out beams with a shear span to effective depth ratio (a/d) of less than 2.5, which are considered non-slender, and beams with a shear-flexure mode of failure. The resulting database of 332 tests was used to develop the models. The evaluation database includes slender beams with rectangular and flanged cross-sections. Figure 1 (adapted from [26]) presents the distribution of the key parameters within this database.

The results from Figure 1 show that most of the values for the concrete compressive strength found in the database fall within the normal strength range, with some outliers for high- and ultra-high-strength concrete. Most of the specimens from the dataset have significant amounts of longitudinal steel, which is typical in shear experiments. The database is also concentrated in the range of small effective depths. The shear span to depth ratio is uniformly distributed, with a/d = 3.5 being the most commonly used. Finally, the histogram of the fiber factor F sf reflects practical considerations and the workability of SFRC.
The database used for experimental evaluation was split into two parts, a training set and a testing set. The training set was used to construct the M5P models, while the testing set was used to assess the performance of the predictive model. Care was taken to ensure that the input variables for both sets were statistically consistent. Of the total 332 tests in the evaluation database, 235 (about 70%) were used for training and the remaining 97 (about 30%) were used for testing the model. The input and output parameters for both sets can be found in Table 1.

Table 1. Summary of the statistical attributes for the input parameters gathered from the available experimental datasets.

M5P Model Tree Techniques
In the current research, the decision tree was constructed using the M5P algorithm, a model tree based on the M5 algorithm developed by Quinlan [56]. M5P incorporates the M5 learning algorithm, along with several enhancements, and offers an effective strategy for learning with continuous classes [56]. M5 is a decision tree algorithm, but what sets M5P apart from other decision tree algorithms is its foundation in regression [57]. M5P combines linear regression with the decision tree algorithm and therefore benefits from the strengths of both methods. The resulting model tree has linear models at its leaves, which generate the regression outputs. The precision of the resulting regression model can be assessed by comparing its predictions with the target values of previously unseen instances.
The M5P algorithm is highly adaptable and can accommodate a broad spectrum of data types in various contexts. This includes its ability to handle multiclass and binary target variables, as well as nominal and numeric attributes, while also dealing with missing values in the dataset [58]. The linear models of M5P generate numeric outputs, and the algorithm accepts both nominal and continuous input attributes.
When a new instance is processed, the traversal begins at the top of the tree and continues down to a leaf. At each node, the branch to follow is chosen according to the test condition on the attribute associated with that node [59]. This step-by-step procedure ensures a systematic evaluation of the instance in question. Establishing the splitting criteria is the initial stage in constructing an M5 model tree.
The M5 algorithm determines the splitting criterion at each node from the standard deviation of the target values of the instances that reach the node, together with an estimate of the expected reduction in error obtained by testing each attribute at that node. The attribute that maximizes the expected error reduction is selected for the split, starting at the tree root, with the standard deviation reduction (SDR) calculated as follows [57,60]:
$$\mathrm{SDR} = SD(T) - \sum_{i} \frac{|T_i|}{|T|}\, SD(T_i)$$
Within this framework, T signifies the collection of instances that reach the node and T i represents the subsets obtained by splitting the node according to the chosen attribute. The average value of the set T is denoted by $\bar{T}$, while SD(T) indicates the standard deviation of T.
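The standard deviation reduction can be illustrated with a short sketch. The code below is not the M5P implementation used in the study; it only evaluates the SDR of a binary split on a single numeric attribute and picks the threshold with the largest SDR. The function names `sdr` and `best_split` and the toy data are hypothetical, and the population standard deviation is assumed.

```python
import numpy as np


def sdr(y, left_mask):
    """Standard deviation reduction of a binary split of the targets y.

    SDR = SD(T) - sum_i |T_i| / |T| * SD(T_i), with T_i the two subsets.
    The population standard deviation (ddof=0) is an assumption here.
    """
    y = np.asarray(y, dtype=float)
    left_mask = np.asarray(left_mask, dtype=bool)
    reduction = y.std()
    for subset in (y[left_mask], y[~left_mask]):
        if subset.size:
            reduction -= subset.size / y.size * subset.std()
    return reduction


def best_split(x, y):
    """Return (threshold, SDR) of the best split 'x <= threshold'."""
    x = np.asarray(x, dtype=float)
    values = np.unique(x)
    if values.size < 2:
        raise ValueError("need at least two distinct attribute values")
    # Candidate thresholds are the midpoints between consecutive values.
    candidates = (values[:-1] + values[1:]) / 2.0
    best = max(candidates, key=lambda t: sdr(y, x <= t))
    return float(best), float(sdr(y, x <= best))


# Toy data (hypothetical): the target jumps between x = 3 and x = 4.
threshold, gain = best_split([1, 2, 3, 4, 5, 6], [1, 1, 1, 5, 5, 5])
```

In this toy case the split at 3.5 separates the two groups perfectly, so both subsets have zero standard deviation and the SDR equals the standard deviation of the full set.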

M5P Derived Models
In the present research, a nonlinear trend was observed in the distribution of a wide range of values, which is indicative of a power law relationship between the input and output parameters. While the M5P model can provide straightforward and efficient guidelines for determining shear capacity, it assumes a linear connection between the input and output parameters. To address this limitation, a new model was developed using log-transformed input and output parameters, based on earlier investigations [1,26]. The obtained results suggest that shear capacity is best described by a linear model in the log-transformed variables, which is equivalent to a power function of the original input parameters.
In Equation (4), the constants a, b, c, e, f, and g take different values under different conditions, that is, for the different linear models of the tree, and all other terms are as previously defined. Figure 2 shows the model tree developed with the M5P approach, and Table 2 lists the coefficients of Equation (4) obtained from the M5P algorithm. The following example illustrates how the M5P model is used to estimate the shear strength of an SFRC beam. A reference beam was selected from the testing dataset, with its characteristics outlined in Table 3. From Table 3 and Figure 2, the beam falls under the LM4 prediction equation of the M5P model, given that b w exceeds 133.52 and d exceeds 269.15. Using the coefficients associated with LM4, the estimated shear strength of the reference beam is 218.187 kN. This value agrees closely with the experimentally determined strength of 220 kN, a difference of less than 1%.
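The worked example can be sketched in code under stated assumptions. Only the two split thresholds and the two strength values below come from the text; the function names, the use of base-10 logarithms, and any coefficients passed to `leaf_prediction` are hypothetical, since the coefficients of Table 2 and the full tree of Figure 2 are not reproduced here.

```python
import math

# Split thresholds quoted in the text for the branch leading to LM4.
BW_SPLIT = 133.52
D_SPLIT = 269.15


def reaches_lm4(b_w, d):
    """True when a beam follows the branch described in the text for LM4."""
    return b_w > BW_SPLIT and d > D_SPLIT


def leaf_prediction(intercept, exponents, inputs):
    """Evaluate a log-linear leaf model and return it in original units.

    log10(V) = intercept + sum_k exponents[k] * log10(inputs[k]),
    i.e. the power law V = 10**intercept * prod_k inputs[k]**exponents[k].
    Base-10 logarithms are an ASSUMPTION; the source does not state the base.
    """
    log_v = intercept + sum(
        exponents[k] * math.log10(inputs[k]) for k in exponents
    )
    return 10.0 ** log_v


def relative_error(predicted, measured):
    """Absolute relative difference between prediction and measurement."""
    return abs(predicted - measured) / measured


# Values quoted in the text for the reference beam (kN).
error = relative_error(218.187, 220.0)
```

With the quoted values, the relative error is about 0.8%, consistent with the close agreement reported for the reference beam.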

Performance Analysis
To assess the effectiveness of the established models, two key statistical indices were taken into account: the root mean square error (RMSE) and the coefficient of determination (R²). These indices, which serve as indicators of the models' performance, are presented in Table 4. The coefficient of determination, commonly denoted as R², is a statistical measure used to assess how well a regression model fits the data. It represents the proportion of the variation in the dependent variable (y) that can be explained by the independent variable(s) (x). R² typically lies between 0 and 1, where 0 indicates that the model cannot explain any of the variation in the dependent variable and 1 indicates that the model perfectly explains all the variation.
The RMSE, on the other hand, is a statistical measure used to evaluate the accuracy of the predictions from a model. It is the square root of the average of the squared differences between the predicted and observed values:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i^{pre} - y_i^{obs}\right)^2}$$
where y pre i is the predicted value, y obs i is the observed value, and n is the total number of observations. The RMSE is often used to evaluate the performance of regression models, such as linear regression or time-series models. It is measured in the same units as the dependent variable, which makes it easier to interpret than squared-unit measures such as the mean squared error (MSE). Table 4 presents the statistical evaluation of the performance of the M5P model for predicting the shear strength of SFRC beams.
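The two indices can be computed as in the following minimal sketch. The function names are illustrative, and R² is taken here as one minus the ratio of the residual to the total sum of squares; this is one common definition and is an assumption, since the source does not state which formula was used.

```python
import numpy as np


def rmse(y_obs, y_pre):
    """Root mean square error between observed and predicted values."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pre = np.asarray(y_pre, dtype=float)
    return float(np.sqrt(np.mean((y_pre - y_obs) ** 2)))


def r_squared(y_obs, y_pre):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed form)."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pre = np.asarray(y_pre, dtype=float)
    ss_res = np.sum((y_obs - y_pre) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Toy values in kN (hypothetical, not from the database).
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]
toy_rmse = rmse(observed, predicted)
toy_r2 = r_squared(observed, predicted)
```

For the toy values every error has a magnitude of 10 kN, so the RMSE is 10 kN and R² is 1 - 300/20000 = 0.985.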
The results in Table 4 show that the M5P model produced predictions that were in good agreement with the experimental data. The prediction ability of the model was evaluated using blind points that were not part of the training dataset. The M5P model predicted the shear strength values for the testing dataset with an R² value of 0.969, indicating a strong association between the predicted and actual values. The closer the R² value is to 1, the more accurate the model is. The RMSE was used as a second statistical metric to compare the performance of the models. The RMSE for the testing dataset for the M5P-based correlation was 37.307, indicating that the predictions from the model were close to the actual values.
Figure 3 presents the results of the M5P model for forecasting the shear strength of SFRC beams as a cross plot of the predicted against the experimental values for both the training and testing datasets. The predicted values are concentrated in a narrow band around the unit slope line (X = Y), which indicates that the M5P model performed very well. These results are supported by the previous statistical analysis, and they demonstrate that the M5P model can accurately forecast the shear strength of SFRC beams. To assess the precision and efficiency of the M5P model from an alternative viewpoint, Figure 4 displays the predicted and experimental values of the shear strength in both the training and testing datasets. The results in Figure 4 confirm the previous observations about the accuracy of the M5P model.

K-Fold Cross Validation Results
To ensure that overfitting does not occur in the proposed models, a widely used technique known as the five-fold cross-validation method is employed. Five-fold cross-validation is a technique for assessing the performance of ML models. The main purpose of this technique is to evaluate the ability of the model to generalize to new data, rather than simply memorizing the training data. To implement five-fold cross-validation with RMSE (root mean squared error) measurement, the steps to follow are:

1. Divide the data into five equal parts (or "folds").
2. Choose one fold as the test set and the other four folds as the training set.
3. Train the model on the training set and use it to predict the target values on the test set.
4. Calculate the RMSE between the predicted values and the actual values in the test set.
5. Repeat steps 2-4 for each of the five folds, using a different fold as the test set each time.
6. Calculate the average RMSE across all five folds. This is the overall measure of the model's performance.

The RMSE of the model on all the data points is 38 kN, while the average RMSE of the five-fold cross-validation is 41 kN. This suggests that the model's performance on unseen data is relatively close to its performance on the training data, indicating that the model is not prone to overfitting. The small difference between the two RMSE values suggests that the model is likely to generalize well to new data. A box plot displaying the distribution of the model performance across the five folds of the data is shown in Figure 5.
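The six steps of the five-fold procedure can be sketched as follows. Because an M5P implementation is not assumed to be available, an ordinary least-squares linear model is swapped in as a stand-in regressor; the function names and the synthetic data are hypothetical and serve only to show the mechanics of the procedure.

```python
import numpy as np


def rmse(y_obs, y_pre):
    """Root mean square error between observed and predicted values."""
    return float(np.sqrt(np.mean((np.asarray(y_pre) - np.asarray(y_obs)) ** 2)))


def fit_linear(X, y):
    """Least-squares linear model with intercept (stand-in for M5P)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]


def k_fold_rmse(X, y, k=5, seed=0):
    """Return (mean RMSE, per-fold RMSE list) for k-fold cross-validation."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)      # step 1
    scores = []
    for i in range(k):                                       # step 5
        test_idx = folds[i]                                  # step 2
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        coef = fit_linear(X[train_idx], y[train_idx])        # step 3
        y_hat = predict_linear(coef, X[test_idx])
        scores.append(rmse(y[test_idx], y_hat))              # step 4
    return float(np.mean(scores)), scores                    # step 6


# Synthetic stand-in data with the same shape as the database (332 x 6).
rng = np.random.default_rng(42)
X_demo = rng.uniform(1.0, 10.0, size=(332, 6))
y_demo = X_demo @ np.arange(1.0, 7.0) + rng.normal(0.0, 1.0, size=332)
mean_rmse_demo, fold_rmse_demo = k_fold_rmse(X_demo, y_demo)
```

Comparing `mean_rmse_demo` with the RMSE obtained when fitting and evaluating on all data mirrors the comparison made in the text between the full-data RMSE and the cross-validated average.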

Comparison with Previously Developed Models
Figure 6a-f compares the shear strengths predicted by the existing models and by the proposed model with those obtained from the experimental results. The dashed line, also referred to as the 1:1 line, represents perfect agreement between prediction and experiment. The solid line is the linear regression fitted to the predicted data points and gives a graphical indication of the deviation between the predicted and experimental values. The closer the data points lie to the 1:1 line, the more accurate the predictions.
Among the models compared, those from Ashour et al. [60] and Khuntia et al. [54] show the lowest prediction performance, that is, the largest discrepancy between predicted and experimental results. The results for Ashour came from an equation proposed in the study [61], which is an adaptation of Zsutty's equation [62] modified to include the fiber factor. Khuntia et al. [54] took into account the post-cracking tensile characteristics of fiber-reinforced concrete (FRC) to develop an equation, which they validated against experimental results from 68 SFRC beam specimens. Conversely, the results computed using the equations proposed by Sabetifar and Nematzadeh [35], Sarveghadi et al. [63], and Chaabene and Nehdi [64] matched the experimental findings more closely. These models were primarily developed using genetic programming (GP). GP, introduced in [65], is a relatively recent ML technique used to generate nonlinear regression equations [66][67][68], and it has proven effective in producing more accurate predictive models. Notably, the equations proposed by Sabetifar and Nematzadeh [35] and Chaabene and Nehdi [64] exhibited a strong predictive performance, characterized by higher R² values and a regression line closely aligned with the 1:1 line.
Figure 6. Predicted versus experimental shear strengths for: (a) Ashour et al. [60], (b) Khuntia et al. [54], (c) Sabetifar and Nematzadeh [35], (d) Sarveghadi et al. [63], (e) Chaabene and Nehdi [64], (f) proposed M5P model.
The predictions of the M5P model are compared with the experimental data in Figure 6f. The scatter of the data points for the proposed M5P model is substantially smaller than that observed for the earlier models, which suggests that the M5P model offers a more precise and accurate prediction. Furthermore, the linear regression line through the data points lies very close to the diagonal line, with a high R² value of 0.9580. This value signifies a strong correlation between the predicted results and the experimental data, indicating that the model has effectively captured the underlying relationships in the data.
In addition to the aforementioned observations, Figure 7 shows the histograms of the proposed M5P model. The histograms reveal a well-distributed pattern, with the mean value close to unity and adequately centered within the distribution. This balanced distribution is an indication of the model's robustness and its ability to generalize well across data points with a wide distribution.
Table 5 provides a detailed overview of the statistical characteristics of the ratios between experimental and predicted shear strengths for the various models being analyzed. The analysis shows that the M5P model is the most accurate in predicting the shear strength of SFRC beams. This is evidenced by its lowest COV and SD values, along with a mean value that closely approaches unity. The R² value for the proposed M5P model was the largest among all considered models. Its RMSE value, which quantifies the error between the predicted and experimental values, was the second smallest, only slightly higher than that of Chaabene and Nehdi's model [64]. Moreover, the M5P model exhibits a considerably lower SD and COV, with 0.1627 and 0.1601, respectively, compared to the values for the other models. With a mean value of 1.04, which is close to 1.0, the M5P model demonstrates a high degree of reliability in estimating the shear strength of SFRC beams. This evidence underlines the effectiveness and reliability of the M5P model in the application studied in this research.
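The ratio statistics discussed above (mean, SD, and COV) can be computed as in the sketch below. The direction of the ratio (experimental over predicted), the use of the sample standard deviation, and the function name `ratio_stats` are assumptions for illustration; the source does not spell out these choices.

```python
import numpy as np


def ratio_stats(v_exp, v_pred):
    """Mean, standard deviation, and coefficient of variation of V_exp/V_pred.

    The ratio direction and the sample standard deviation (ddof=1) are
    ASSUMED conventions, not taken from the source.
    """
    ratio = np.asarray(v_exp, dtype=float) / np.asarray(v_pred, dtype=float)
    mean = float(ratio.mean())
    sd = float(ratio.std(ddof=1))
    return {"mean": mean, "sd": sd, "cov": sd / mean}


# Toy values in kN (hypothetical): ratios of 0.9, 1.0, and 1.1.
stats = ratio_stats([90.0, 100.0, 110.0], [100.0, 100.0, 100.0])
```

For the toy values the mean ratio is 1.0 and the SD and COV are both 0.1. A mean near unity with a small COV is the pattern the text associates with an accurate and consistent model.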

Model Safety Analysis
Most models and design codes balance safety and practicality against a marginal level of accuracy. The proposed model must also maintain adequate safety, consistent with these codes. This study employs the Collins scale [69] to assess the safety of the proposed model and compare it with existing models. The demerit points classification (DPC) method evaluates a model's safety, accuracy, and variability by examining the relationship between the experimental ultimate shear strengths and the estimated theoretical shear capacities. This method assigns a demerit point value (DP) according to the V actual /V predicted ratio. Table 6 presents the safety classification based on the Collins scale, with the different safety levels and their criteria. To evaluate the safety of the proposed model quantitatively and contrast it with earlier models, demerit points are assigned to each prediction made by these models for all 332 data points, based on the criteria outlined in Table 6. The cumulative demerit value for each model is calculated by summing the products of the specimen count and the corresponding penalty within each interval, as shown in Table 7. A lower total indicates a higher safety level for the model under study. For instance, the model from Khuntia et al. [54] has four instances where V actual /V predicted < 0.5. Referring to Table 6, each of these four points carries a penalty of 10, so the penalty for this interval is 4 × 10 = 40.
This method enables an effective evaluation and ranking of the safety performance of the various models, guiding the selection of the safest and most reliable model for practical applications. Notably, the M5P algorithm has the lowest demerit penalty among the prediction models, indicating its superior safety performance. Based on the Collins scale, approximately 70% of the predictions made by the M5P algorithm fall within the safe and acceptable range, further emphasizing its reliability and effectiveness.
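The demerit point calculation described above can be sketched in a few lines. Table 6 is not reproduced in this section, so the interval bounds and penalties below are assumptions based on the commonly cited form of the Collins scale; only the penalty of 10 for V actual /V predicted < 0.5 is confirmed by the text. The names `COLLINS_SCALE`, `demerit_points`, and `total_demerit` are hypothetical.

```python
import math

# (upper bound of interval, penalty). Only the first row is confirmed by the
# text; the remaining rows are ASSUMED and should be checked against Table 6.
COLLINS_SCALE = [
    (0.50, 10),      # extremely dangerous (confirmed: penalty of 10)
    (0.65, 5),       # dangerous (assumed)
    (0.85, 2),       # low safety (assumed)
    (1.30, 0),       # appropriate safety (assumed)
    (2.00, 1),       # conservative (assumed)
    (math.inf, 2),   # extremely conservative (assumed)
]


def demerit_points(ratio):
    """Return the penalty for a single V_actual / V_predicted ratio."""
    for upper_bound, penalty in COLLINS_SCALE:
        if ratio < upper_bound:
            return penalty
    raise ValueError("ratio must be a finite number")


def total_demerit(ratios):
    """Sum the penalties over all specimens; lower totals indicate a safer model."""
    return sum(demerit_points(r) for r in ratios)


if __name__ == "__main__":
    # Four specimens below 0.5, as in the Khuntia et al. example: 4 x 10 = 40
    print(total_demerit([0.40, 0.45, 0.30, 0.49]))
```

Summing specimen by specimen is equivalent to multiplying the specimen count in each interval by its penalty, which is how the totals in Table 7 are described.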

Safety Factor Inclusion
The M5P model equations LM1, LM2, LM3, and LM4 do not consistently overestimate all data points, but rather produce both overestimated and underestimated predictions. For this reason, introducing a reduction factor can further improve the reliability of the predictions, which is especially important where the predictions must satisfy design conservatism. As shown in Figure 9a–d, the V exp /V M5P ratios exhibit a minor level of scatter and follow a lognormal rather than a normal distribution. Therefore, a reduction factor γ needs to be applied to account for the statistical uncertainties associated with estimating the characteristic value of the V exp /V M5P ratio. Equation (6) provides a means of calculating the reduction factor, as recommended by EN 1992 [7,70]. In this equation, β is a reliability index taken as 3.8, α is a sensitivity factor taken as 0.8, a is the slope of the relationship between V exp and V M5P , and η o is the standard deviation of V exp /(a × V M5P ). Those values and relationships are recommended by EN 1992 [7]. As depicted in Figure 10a–d, the slope values a are 1.11, 1.17, 1.23, and 1.04 for equations LM1, LM2, LM3, and LM4, respectively. Substituting these values into the reduction factor equation gives γ values for the shear strength of SFRC slender beams of 0.69, 0.69, 0.85, and 0.65 for equations LM1, LM2, LM3, and LM4, respectively.
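The reduction factor calculation can be illustrated with a short sketch. The reduction factor equation itself is not reproduced in this text, so the form used below, γ = a · exp(−αβη o − 0.5η o ²), is an assumption: it is the lognormal design-value expression used in Eurocode design assisted by testing, which combines the same α and β values. Taking the slope a as a least-squares fit through the origin is likewise an assumption, and all function names and input values are hypothetical.

```python
import math


def slope_through_origin(v_exp, v_m5p):
    """Least-squares slope a of V_exp = a * V_M5P (fit through the origin, assumed)."""
    return sum(e * p for e, p in zip(v_exp, v_m5p)) / sum(p * p for p in v_m5p)


def scatter(v_exp, v_m5p, a):
    """Sample standard deviation of V_exp / (a * V_M5P), i.e. eta_o as defined in the text."""
    ratios = [e / (a * p) for e, p in zip(v_exp, v_m5p)]
    mean = sum(ratios) / len(ratios)
    return math.sqrt(sum((r - mean) ** 2 for r in ratios) / (len(ratios) - 1))


def reduction_factor(a, eta_o, alpha=0.8, beta=3.8):
    """ASSUMED form: gamma = a * exp(-alpha * beta * eta_o - 0.5 * eta_o**2)."""
    return a * math.exp(-alpha * beta * eta_o - 0.5 * eta_o ** 2)


if __name__ == "__main__":
    # Hypothetical strengths in kN, for illustration only
    v_exp = [105.0, 190.0, 320.0, 410.0]
    v_m5p = [100.0, 200.0, 300.0, 400.0]

    a = slope_through_origin(v_exp, v_m5p)
    eta_o = scatter(v_exp, v_m5p, a)
    gamma = reduction_factor(a, eta_o)
    print(f"a = {a:.3f}, eta_o = {eta_o:.3f}, gamma = {gamma:.3f}")
```

Under this assumed form, a larger scatter η o lowers γ, while a slope a above 1.0 (a model that underestimates on average) raises it.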

Conclusions
This study aimed to develop an innovative, interpretable, and easy-to-use model based on the M5P tree algorithm to predict the shear strength of SFRC beams. The model shows superior performance compared to existing models in the field of structural engineering. Accurate prediction of the shear strength of SFRC beams is crucial, as it ensures the safety and reliability of structures, particularly when dealing with challenging loading conditions and complex geometries. Furthermore, a safety analysis using the Collins scale was employed to assess the safety performance of the proposed model. Based on the results and analysis, the following conclusions, which emphasize the novelty of this work, can be drawn:

• The M5P model demonstrated high accuracy in predicting the shear strength of SFRC beams, outperforming existing models in terms of performance metrics. The simplicity and ease of use of the M5P tree algorithm highlight its effectiveness in handling complex relationships and its potential applicability to other civil engineering problems.

• The safety analysis conducted using the Collins scale revealed that the M5P model had the lowest demerit penalty and was the safest among the different prediction models. Approximately 70% of the predictions made by the M5P algorithm fell within the safe and acceptable range, emphasizing its reliability and effectiveness in practical applications.

• By developing a more accurate, reliable, and user-friendly prediction tool, this research provides a significant contribution to the design and optimization of SFRC beams. It ensures that the desired level of shear strength is achieved while optimizing resource utilization and improving safety in SFRC structure design and construction.

The successful development and validation of the M5P model open up new opportunities for future research in the field of SFRC beams. Further refinements and enhancements to the model can be pursued, and additional case studies can be conducted to continue improving the accuracy, reliability, and safety performance of the prediction tool. Ultimately, the M5P model serves as a valuable and novel contribution to the ongoing advancement of SFRC beam design and engineering, addressing a crucial aspect of structural safety and reliability.