PREDICTION OF TBM PENETRATION RATE OF WATER CONVEYANCE TUNNELS IN IRAN USING MODERN METHODS

Tunnel boring machines (TBMs) are used extensively in civil engineering and play an important role in tunnelling projects, so their penetration rate is crucial to the success of their application. In this study, four prediction techniques, namely multiple linear regression, multiple nonlinear regression, Gene Expression Programming (GEP) and Support Vector Machine (SVM), were applied to predict the TBM penetration rate in several water conveyance tunnels: Karaj, Ghomrood, Golab, Nosoud and Sabzkooh. The obtained values of R and RMSE were 0.43 and 3.08 for linear regression, 0.68 and 2.3 for nonlinear regression, 0.74 and 2.09 for GEP, and 0.97 and 0.6 for SVM, respectively. On the merged tunnel database, the SVM method gave the most accurate prediction of penetration rate, with the highest R and the lowest RMSE of all the predictive models. Ranked by R and RMSE, it was followed by GEP, nonlinear regression and linear regression, in that order.


INTRODUCTION
Mechanized tunnelling is now widely used around the world. Because of the growing number of tunnelling projects under study or construction (e.g. water conveyance tunnels, subway tunnels and road tunnels) and the importance of construction time, mechanized tunnelling has become increasingly important [1]. These tunnelling machines can only be used properly if a number of factors are considered, including the selection of an appropriate tunnel boring machine, technology transfer, sound management and the project conditions [2]. Tunnel Boring Machines (TBMs) are therefore a significant option for any tunnelling project because of their capabilities and high advance rates [3]. At first, the only capability of a mechanized excavation tool was boring and advancing. Over time, as the need for high boring and advance rates grew, several auxiliary capabilities were added, such as a conveyor system for removing muck and the temporary and permanent support of excavated areas [4]. In empirical models of TBM performance, the machine and the ground are treated as a single continuous system; their simultaneous effects on the machine's performance are not represented precisely or in detail, and may enter the model only implicitly [5]. The other category of empirical models was mainly developed by some experimental studies based on one or several parameters of rock or through the prediction index of rock

Regression
Regression is a statistical analysis technique used in most engineering and non-engineering analyses. It is widely used to check the validity of data and of modelling results, with the quality of the fit evaluated by the coefficient of determination (R²), which represents how well the fitted relationship matches the data. Regression analysis determines the contribution of the independent variables to predicting the dependent variable [15]; the objective is to predict changes in the dependent variable with respect to the independent variables. Multivariate regression analysis is well suited to studying the effects of several independent variables on the dependent variable [16].
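As an illustration of the multivariate linear case, the following minimal sketch fits a model of the form y = b0 + b1·x1 + ... + bk·xk by ordinary least squares, solving the normal equations in plain Python. The function names and the numbers are assumptions made for this example only; they are not the tunnel data or the coefficients reported in this paper.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares; returns [intercept, b1, ..., bk]."""
    # Prepend a column of ones so the first coefficient is the intercept.
    A = [[1.0] + [float(v) for v in row] for row in X]
    n = len(A[0])
    # Normal equations: (A^T A) beta = A^T y.
    ata = [[sum(r[i] * r[j] for r in A) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(n)]
    # Solve by Gauss-Jordan elimination with partial pivoting.
    M = [row + [rhs] for row, rhs in zip(ata, aty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("independent variables are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def predict(beta, row):
    """Evaluate the fitted linear model for one observation."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


# Synthetic example (NOT data from the tunnels): y = 1 + 2*x1 + 3*x2 exactly.
X_demo = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y_demo = [9, 8, 19, 18, 26]
beta_demo = fit_linear_regression(X_demo, y_demo)
```

Because the synthetic targets are generated without noise, the fitted coefficients should recover the intercept and slopes used to build them, up to floating-point rounding. With real field data the fit is only approximate, and its quality is judged by R² and RMSE.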

Gene Expression Programming (GEP)
Gene expression programming (GEP) is a branch of evolutionary computation. Instead of writing the required program code by hand, GEP uses genetic algorithms and the concept of expression (parse) trees to let the computer build a program from a high-level statement of the task: the user specifies what is to be achieved, and the computer itself generates candidate programs, runs them and returns the desired output. In this setting, the individuals being evolved are program structures, and their fitness is the program's ability to achieve the stated goal. Only minor changes to genetic algorithm operations such as mutation, reproduction and cost function evaluation are needed to use them in GEP. In effect, GEP is a computer program that writes other computer programs. In the current study, the GEP-based software GeneXproTools was used to develop and run the models. GEP is a more recent development of genetic programming. One of its advantages is that creating genetic diversity is very simple, because the genetic operators act at the chromosome level [17]. Another advantage is its multigenic nature, which allows complicated models composed of several sub-models to be evolved. Table 1 lists the GEP parameters.
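The link between a linear chromosome and an expression tree can be sketched as follows. The example assumes the breadth-first gene encoding (Karva notation) commonly used in GEP and a small, hypothetical function set; the text above does not state the encoding or function set used in this study, and the sketch covers only the decoding and evaluation of a single gene, not the evolutionary search itself.

```python
import math

# Hypothetical function set: symbol -> (arity, implementation).
# Any symbol not listed here is treated as a terminal (an input variable).
FUNCTIONS = {
    "+": (2, lambda a, b: a + b),
    "-": (2, lambda a, b: a - b),
    "*": (2, lambda a, b: a * b),
    "Q": (1, lambda a: math.sqrt(a)),  # Q denotes the square root
}


def decode(gene):
    """Build an expression tree from a gene read breadth-first."""
    nodes = [(symbol, []) for symbol in gene]
    used = 1  # symbols consumed by the tree so far (the root is the first)
    i = 0
    while i < used:
        symbol, children = nodes[i]
        arity = FUNCTIONS[symbol][0] if symbol in FUNCTIONS else 0
        if used + arity > len(nodes):
            raise ValueError("gene too short to complete the expression")
        # The next unused symbols become the arguments of this node.
        children.extend(nodes[used:used + arity])
        used += arity
        i += 1
    # Symbols beyond position `used` are non-coding and are ignored.
    return nodes[0]


def evaluate(node, variables):
    """Recursively evaluate a decoded tree for given variable values."""
    symbol, children = node
    if symbol in FUNCTIONS:
        _, fn = FUNCTIONS[symbol]
        return fn(*(evaluate(child, variables) for child in children))
    return variables[symbol]


# The gene "Q*+-abcd" encodes sqrt((a + b) * (c - d)).
tree = decode("Q*+-abcd")
value = evaluate(tree, {"a": 3.0, "b": 1.0, "c": 6.0, "d": 2.0})
```

Since the gene is a plain string, operators such as mutation can act directly on its symbols, which is what is meant by genetic operators acting at the chromosome level.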

Support Vector Machine (SVM)
An SVM is a classifier, that is, a decision boundary that gives the best separation between classes of data by means of support vectors [18]. Only the data points that serve as support vectors are used to train the machine and build the model [19]. The algorithm is not sensitive to the other points; it aims to find the optimal boundary, the one with the maximum possible distance (margin) from the nearest points of every class, which are the support vectors [20]. Put simply, the support vectors are a set of points in the n-dimensional data space that define the class boundary and so allow the data to be classified. Moving one of these vectors can change the classification output [21]. The specification of the SVM used to predict the penetration rate is listed in Table 2.
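The geometric idea of a margin and its support vectors can be illustrated with a small sketch. It does not train an SVM and does not reproduce the model specified in Table 2: it assumes a separating hyperplane w·x + b = 0 is already given and simply measures which points lie closest to it. The function names and the four sample points are hypothetical.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def margin_and_support_vectors(w, b, points, labels, tol=1e-9):
    """Return the smallest class-side distance and the points attaining it."""
    # label * signed distance is positive when a point is on its own side.
    distances = [lab * signed_distance(w, b, x)
                 for x, lab in zip(points, labels)]
    margin = min(distances)
    support = [x for x, d in zip(points, distances) if abs(d - margin) <= tol]
    return margin, support


# Hypothetical 2-D data separated by the vertical line x1 = 0.
w_demo, b_demo = (1.0, 0.0), 0.0
points_demo = [(1.0, 0.0), (3.0, 2.0), (-1.0, 1.0), (-4.0, 0.0)]
labels_demo = [1, 1, -1, -1]
margin_demo, support_demo = margin_and_support_vectors(
    w_demo, b_demo, points_demo, labels_demo)

# Moving a point that is not a support vector leaves the margin unchanged.
moved = [(1.0, 0.0), (5.0, 2.0), (-1.0, 1.0), (-4.0, 0.0)]
margin_moved, _ = margin_and_support_vectors(
    w_demo, b_demo, moved, labels_demo)
```

In this toy case only the two points nearest the boundary determine the margin, which is why the result is insensitive to the remaining points. A trained SVM additionally searches for the hyperplane that maximizes this margin.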

Evaluation parameters
In this paper, the coefficient of determination (R²) and the root-mean-square error (RMSE) are used to evaluate the models. R² indicates the proportion of the variation in the dependent variable that is explained by the independent variables. Its value lies between zero and one, and the closer it is to one, the better the model fits. RMSE is the standard deviation of the differences between the values predicted by a model or statistical estimator and the real (measured) values, so lower values indicate better predictions [22].
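The two measures can be computed as in the minimal sketch below. It assumes the common definitions RMSE = sqrt(mean((measured − predicted)²)) and R² = 1 − SS_res / SS_tot; the exact formulas used in this study are not stated in the text, and some tools report R² as the squared correlation coefficient instead. The sample values are illustrative only.

```python
import math


def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted values."""
    n = len(measured)
    squared_errors = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared_errors / n)


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (NOT measurements from the tunnels).
measured_demo = [2.0, 4.0, 6.0, 8.0]
predicted_demo = [3.0, 3.0, 7.0, 7.0]
rmse_demo = rmse(measured_demo, predicted_demo)      # every error is +/-1
r2_demo = r_squared(measured_demo, predicted_demo)   # 1 - 4/20
```

RMSE is expressed in the units of the predicted quantity, whereas R² is dimensionless, so the two are best read together when comparing models.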

The data
The input and output variables of the predictive models of TBM penetration rate are given in Table 3. In this step, the data of the five tunnels mentioned above (the case studies) were merged; the descriptive statistics of the merged data are presented in Table 4. The projects are described in Table 5.

TBM penetration rate prediction in the five tunnels using linear regression model
To predict the TBM penetration rate, the assumptions used in the predictions for the single tunnels were kept, and the data of all five tunnels were merged. The prediction of TBM penetration rate was therefore carried out on the overall database. Table 6 presents the coefficient of determination of the multivariate linear regression model applied to predict the TBM penetration rate in the database.
The regression coefficients of this predictive model are shown in Table 7, and the linear relationship that they define between the independent variables and the penetration rate is given in Equation 1. The dispersion diagram and the coordination diagram of the measured and predicted values of penetration rate obtained with this model are shown in Figures 1 and 2, respectively.

TBM penetration rate prediction in the five tunnels using nonlinear regression model
To predict the penetration rate, the assumptions used in the prediction for the single tunnels were kept, and the data of all five tunnels were merged. Table 8 presents the coefficient of determination of the multivariate nonlinear regression model applied to predict the TBM penetration rate in the five tunnels. The regression coefficients of this model are shown in Table 9, and the nonlinear relationship that they define between the independent variables and the penetration rate is given in Equation 2. The dispersion diagram and the coordination diagram of the measured and predicted values of penetration rate obtained with this model are shown in Figures 3 and 4, respectively.

Prediction of TBM penetration rate in the five tunnels using Gene Expression Programming (GEP) method
To predict the TBM penetration rate, all the assumptions used in the predictive models for the single tunnels were kept, and the data of all five tunnels were merged. Figure 5 shows R², RMSE and the dispersion diagram of the Gene Expression Programming (GEP) model used to predict the TBM penetration rate in the five tunnels. The coordination diagram of the measured and predicted values of penetration rate obtained with the GEP model is presented in Figure 6. The binary expression tree that the GEP model builds between the input variables and the penetration rate is illustrated in Figure 7, and the corresponding GEP equation is given in Equation 3.

Prediction of TBM penetration rate in the five tunnels using Support Vector Machine (SVM) method
To predict the TBM penetration rate, the assumptions used in the prediction for the single tunnels were kept, and the data of all five tunnels were merged. Figure 8 presents R², RMSE and the dispersion diagram of the Support Vector Machine (SVM) model used to predict the TBM penetration rate in the five tunnels. The figure compares the predicted and measured values with the line of perfect agreement (y = x). As can be seen in Figure 8, apart from a few points, the predicted and measured values lie on this bisector line, which indicates that the predicted values are nearly equal to the measured ones. Figure 9 presents the coordination diagram of the measured and predicted values of penetration rate obtained with the SVM method.

CONCLUSION
Over the past three decades, many models have been proposed to predict TBM performance on the basis of theoretical and empirical research. All of these models aim to estimate the TBM penetration rate accurately and to describe how the rock mass interacts with the TBM specifications. A realistic model must consider both rock mass characteristics and TBM specifications when evaluating TBM performance. The study of TBM performance is a significant part of every tunnelling project and plays a key role in selecting the excavation method and the TBM. Hence, a feasibility study of predicting the TBM penetration rate from the effective parameters was carried out, using multiple linear and nonlinear regression analysis, GEP and SVM. To obtain an acceptable relationship, nine effective parameters comprising field data and machine parameters were taken as the independent variables, and the TBM penetration rate (PR) was taken as the dependent variable. A linear or nonlinear relationship between the independent variables and the TBM penetration rate was then obtained. R² and RMSE were used to assess the accuracy and efficiency of the predictive models. The obtained values of R² and RMSE were 0.43 and 3.08 for linear regression, 0.68 and 2.3 for nonlinear regression, 0.74 and 2.09 for GEP, and 0.97 and 0.6 for SVM, respectively. The results indicated that, in most cases, GEP had higher accuracy and efficiency (in terms of R² and RMSE) than multivariate linear and nonlinear regression in predicting the TBM penetration rate. It is worth noting that, compared with the linear and nonlinear regression techniques, GEP requires some basic knowledge to understand the concept of the method and to interpret its specific outputs.
The analyses showed that the Support Vector Machine (SVM) method performed better (in terms of R² and RMSE) than the other techniques (linear regression, nonlinear regression and GEP). It had the highest R² and the lowest RMSE of all the predictive models.