Modern Machine Learning Techniques for Univariate Tunnel Settlement Forecasting: A Comparative Study

Tunnel settlement commonly occurs during tunnel construction in large cities. Existing forecasting methods for tunnel settlement include model-based approaches and artificial intelligence (AI) enhanced approaches. Compared with traditional forecasting methods, artificial neural networks can be easily implemented, with high performance efficiency and forecasting accuracy. In this study, an extended machine learning framework is proposed that combines particle swarm optimization (PSO) with support vector regression (SVR), back-propagation neural network (BPNN) and extreme learning machine (ELM) to forecast the surface settlement for tunnel construction in two large cities in China. Based on verification with real-world data, the PSO-SVR method shows the highest forecasting accuracy among the three proposed forecasting algorithms.

In this study, an extended AI enhanced approach that combines traditional machine learning techniques with particle swarm optimization (PSO) is proposed. A real-world tunnel surface settlement dataset is employed to verify the performance of the proposed method. Overall, the work described in this paper contributes to both the scientific and industrial areas in the following three points:
1. Utilizing machine learning techniques for tunnel settlement forecasting. Tunnel settlement forecasting is a practical issue in the real-world urbanization process. However, little work has been done in this area; especially now that AI enhanced techniques have developed rapidly, the importance of fully utilizing the historical data from the tunnel construction process must be emphasized.

2. Univariate time series data forecasting with small data size. The tunnel settlement data employed in this study was recorded by a metro tunnel construction company located in Shanghai. For each measured tunnel surface point, a time series dataset of size 100 is provided. Moreover, the construction company only records the height of each measured point. However, it is evident that tunnel settlement is affected by multiple external factors, such as environmental elements and civil works. The univariate and small data size properties make the forecasting problem more challenging.
3. Extended machine learning approaches are proposed. The proposed forecasting method modifies traditional machine learning techniques, such as SVR, BPNN and ELM, to make them more suitable for tunnel settlement forecasting. A PSO process is added to search for the optimal parameters of each model. In the experiment phase, a comparative analysis is performed to justify the effectiveness of the proposed method.

II. LITERATURE REVIEW
In general, there are two approaches to time series data forecasting, namely, model-based methods and data-driven methods. Model-based methods utilize mathematical or physical models to perform simulation and usually require multivariate data to be recorded. The extra variables, beyond the tunnel surface point heights, may include underground water pumping, soil quality measurements and other assumptions. The forecasting accuracy depends on the validity of the physical assumptions. Shi et al. [14] investigated the soil movement in response to tunnel excavation in clays through simulations. Soil movements are the main cause of tunnel settlement. Chakeri et al. [15] designed a FLAC3D (Fast Lagrange Analysis of Continua in 3 Dimensions) model to simulate the tunnel excavation process and consequently investigate the ground surface settlement. The proposed FLAC3D model is a finite-difference approach, which is based on a number of mathematical assumptions. Strokova [16] surveyed traditional model-based prediction methods for tunnel settlement during the construction process. A finite-element based software package named 'Plaxis' and a mathematical model built from real-world tunnel settlement data collected in 2007-2008 at Munich Technical University were utilized for simulation and performance comparison [17]. In summary, model-based methods provide white-box modeling of the tunnel settlement problem. The forecasting accuracy of model-based methods is comparable to that of data-driven approaches when multiple external variables are available and the mathematical assumptions are valid.
Data-driven approaches are grey-box or black-box models that involve a complex internal structure, receive a pre-processed version of the input dataset and output an integrated forecasting result. Conventional data-driven approaches for time series data forecasting include autoregressive (AR) methods [18], artificial neural networks (ANNs) [19], support vector regression (SVR) [20], deep learning neural networks (DLNNs) [21], wavelet methods [22], etc. Ji et al. [23] proposed a least square support vector regression (LSSVR) method for ground surface settlement. Wang et al. [19] reported that by utilizing an adaptive differential evolution (ADE) algorithm to overcome the local extreme issues in the optimal weight searching process of BPNN, the traditional BPNN can outperform most existing forecasting methods, such as SVR and AR models. Kuremoto et al. [24] proposed using a deep belief network with restricted Boltzmann machines to perform time series data forecasting. Wang et al. [25,26] proposed using an extended echo state network (ESN) to forecast electricity energy consumption in China. Wu and Gao [27] combined the AdaBoost algorithm with a long short-term memory (LSTM) neural network to forecast financial time series data. Lu et al. [28] introduced another extended LSTM algorithm combined with the differential evolution (DE) method for electricity price forecasting. Yan et al. [29] proposed a multi-step forecasting algorithm that integrates a convolutional neural network (CNN) with LSTM to forecast single household energy consumption.
III.

Data Description
Two real-world tunnel settlement datasets were employed to study tunnel settlement prediction based on various modern machine learning techniques. Both datasets were collected by a Chinese tunnel construction company: one measures the tunnel surface settlement of the metro line 3 construction in Ningbo city, China, and the other measures the tunnel surface settlement of a subway construction in Zhuhai city, China. Over 700 ground surface sensors were utilized, measuring the overall settlement on each day during the tunnel construction period. The recording frequency is once per day, and the total number of records for each surface point is around 100, depending on the particular construction progress conditions.
In the experiment phase, a total of 10 measured surface points were selected. For each point, the first 5/6 of the recorded length was taken as the training dataset for the machine learning prediction models, including BPNN, SVR and ELM. The remaining 1/6 of the recorded length was used for verification, computing classic error metrics, including root mean square error (RMSE), mean square error (MSE) and mean absolute percentage error (MAPE).
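As a minimal sketch of the evaluation protocol described above, the following Python snippet splits one series into the first 5/6 for training and the last 1/6 for verification, and computes MSE, RMSE and MAPE. The function names and the convention of reporting MAPE as a percentage are assumptions made for illustration; they are not taken from the original implementation.

```python
import math


def split_series(series):
    """Split a series chronologically: first 5/6 for training, last 1/6 for verification."""
    n_train = len(series) * 5 // 6  # integer arithmetic avoids float rounding
    return series[:n_train], series[n_train:]


def mse(actual, predicted):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: no actual value is zero, otherwise MAPE is undefined.
    """
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)
```

For a series of 100 daily records, `split_series` yields 83 training values and 17 verification values.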

Back-propagation Neural Network
BPNN, as one specific form of ANN, is one of the most classic machine learning techniques and has been continuously employed and improved in various application fields [11,12,13]. The most critical limitation of BPNN probably arises when it is used to deal with big data. For very large datasets, parallelization of the original BPNN is required [13]. However, when the data size is very small, BPNN usually provides high forecasting accuracy with minimal time required compared with other machine learning techniques. Over the past few decades, many extensions of BPNN have been proposed. With a pre-processing step, such as particle swarm optimization (PSO), the extended BPNN becomes more suitable for forecasting and prediction under various working conditions.

Support Vector Regression
Support vector regression (SVR) is a state-of-the-art and probably the most commonly applied machine learning technique for various purposes in the field of industrial engineering, including solar energy generation optimization [30], traffic flow forecasting [31], molecular dynamics forecasting [32], etc.
Inheriting the core idea from the support vector machine (SVM), SVR looks for a hyper-plane in a high-dimensional space that best represents the data pattern. Figure 1 shows a simple linear support vector regression plane with insensitive loss variable ε. LibSVM is a tool-box developed by Chang and Lin, which provides easy access to SVR and SVM [28]. For a given set of training data Tr = {(x_i, y_i)}, where x_i is the training input and y_i is the target output value, LibSVM is capable of finding the objective function f(x) once three important parameters are specified: K, C and γ. K stands for the kernel function that maps the low-dimensional input data into a high-dimensional feature space. C and γ can be optimized by the PSO algorithm.
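To make the role of the insensitive loss variable ε concrete, the snippet below sketches the standard ε-insensitive loss used in SVR: deviations smaller than ε cost nothing, and larger deviations are penalized linearly. The function name is hypothetical, and this shows only the loss term, not the full SVR training procedure performed by LibSVM.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Standard epsilon-insensitive loss for a single sample.

    Errors inside the epsilon tube around the regression plane are ignored;
    errors outside it grow linearly with the distance to the tube.
    """
    return max(0.0, abs(y_true - y_pred) - epsilon)


def total_loss(targets, predictions, epsilon):
    """Sum of epsilon-insensitive losses over a dataset."""
    return sum(epsilon_insensitive_loss(t, p, epsilon)
               for t, p in zip(targets, predictions))
```

In the usual SVR formulation, the parameter C weights this summed loss against the flatness of f(x), which is why C has a strong influence on the fitted model.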

Extreme Learning Machine
Extreme learning machine (ELM), proposed by Huang et al. in 2004 and 2006 [33, 34], is known for its fast learning speed and low computational cost while providing competitive classification results [35,36]. ELM is best known as a single-layer feed-forward neural network (NN) and has also been extended to non-NN forms. Compared to other methods in the literature, such as BPNN, multi-layer neural networks and SVM, ELM is much faster in terms of training efficiency, and provides higher generalized classification accuracy in many proven cases.
The traditional ELM algorithm maps the input data samples to the recognized pattern using a single layer of neurons. For any testing sample x, the ELM function mapping can be expressed by:
where a, b are tuned parameters; w is the weight vector for hidden neurons, which is fixed during the training phase. The function f(x) represents the recognized pattern of the input data samples. The tuning-free feed-forward training strategy of ELM is equivalent to solving a linear equation system, which requires very low computational cost.
The basic ELM implementation can be found at http://www.ntu.edu.sg/home/egbhuang/index.html.
To achieve the best result using ELM, two important parameters must be tuned: the number of hidden neurons and the activation function. These two parameters, again, can be optimized using the PSO algorithm.
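The statement that ELM training reduces to solving a linear equation system can be illustrated with a small sketch. It assumes the standard ELM formulation, in which hidden-layer weights and biases are drawn at random and only the output weights are solved by least squares. The sigmoid activation, the small ridge term added for numerical stability, and all names (`elm_fit`, `elm_predict`, `beta`) are assumptions for illustration; this is not the reference implementation linked above.

```python
import math
import random


def _solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def _hidden(x, a, b):
    """Sigmoid outputs of all hidden neurons for one input vector x."""
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(a_j, x)) + b_j)))
            for a_j, b_j in zip(a, b)]


def elm_fit(X, y, n_hidden=10, ridge=1e-6, seed=0):
    """Fit an ELM: random hidden layer, output weights from a linear system."""
    rng = random.Random(seed)
    dim = len(X[0])
    # Hidden weights a and biases b are random and never trained.
    a = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = [_hidden(x, a, b) for x in X]
    # Regularized normal equations: (H^T H + ridge * I) beta = H^T y.
    HtH = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    Hty = [sum(h[i] * t for h, t in zip(H, y)) for i in range(n_hidden)]
    beta = _solve(HtH, Hty)
    return a, b, beta


def elm_predict(model, x):
    """Predict one output value for input vector x."""
    a, b, beta = model
    return sum(w * h for w, h in zip(beta, _hidden(x, a, b)))
```

Because no iterative weight updates are needed, the training cost is dominated by one linear solve whose size equals the number of hidden neurons, which is one of the two parameters left for PSO to tune.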

Rolling Window Size Selection
Considering the properties of the real-world tunnel settlement data, namely a short, univariate series with sparse sampling (one sample per day), we select a suitable rolling window size for each machine learning technique in its training process. The univariate training data was re-organized into batches according to the rolling window size and fed into the machine learning models to predict the value at the next time stamp (Figure 2). The rolling window size is another important parameter for each machine learning model. It determines the length of the effective source data in the training dataset used for each prediction, since older data samples usually have less influence on the prediction results. According to the data description in Section 3.1, the suitable rolling window size usually lies in the range from 1 to 20. For all machine learning techniques, a rolling window size k must be specified for best prediction performance.
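The re-organization of a univariate series into rolling-window batches can be sketched as follows. The function name `make_windows` is hypothetical; the sketch assumes each input consists of k consecutive values and the target is the value that immediately follows them.

```python
def make_windows(series, k):
    """Turn a univariate series into (inputs, targets) for one-step-ahead forecasting.

    Each input is a list of k consecutive values; its target is the next value.
    """
    if not 1 <= k < len(series):
        raise ValueError("k must satisfy 1 <= k < len(series)")
    inputs = [series[i:i + k] for i in range(len(series) - k)]
    targets = [series[i + k] for i in range(len(series) - k)]
    return inputs, targets


# Example with a short series and window size k = 2.
inputs, targets = make_windows([1, 2, 3, 4, 5], 2)
# inputs  -> [[1, 2], [2, 3], [3, 4]]
# targets -> [3, 4, 5]
```

A series of length n produces n - k training pairs, so with about 83 training records a window of k = 20 leaves roughly 63 pairs, which is one reason the window size has to be chosen with care for such short series.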

Learning Techniques
For all three machine learning techniques used in this work, i.e., BPNN, SVR and ELM, there are important parameters to be tuned that strongly affect the final performance of the tunnel settlement forecasting results [37]. In this study, PSO is adopted to find the optimal parameters for the three machine learning techniques. The overall algorithms are denoted as PSO-BPNN, PSO-SVR and PSO-ELM.
Compared to other optimization search algorithms, such as the genetic algorithm (GA), ant colony algorithm and differential evolution (DE) algorithm, the PSO algorithm is more efficient and able to avoid problems of stagnation behavior and premature convergence [38][39][40]. Moreover, the PSO algorithm has a small number of parameters and adopts real number coding. The PSO algorithm does have shortcomings, such as a tendency to fall into local extremes and a convergence speed that is affected by the inertia weight. However, these shortcomings can be mitigated by repeated runs and by selecting an appropriate combination of parameters for the algorithm [41].
Taking the PSO-SVR algorithm as an example, the initial parameters of PSO include the number of
Next, after fixing the parameters of PSO, we look for the optimal parameter combination of SVR using PSO (illustrated in Figure 3). Then the optimal values of C, γ and k (SVR parameters) are obtained when all particles converge (Figure 4). The detailed steps of the PSO-SVR algorithm are listed in Algorithm 1.
Algorithm 1 PSO algorithm looking for the optimal parameter set for SVR
Input: Search space of the vector (C, γ, k), where C ranges from 1 to 10000; γ ranges from -100 to 100; and k, the rolling window size, ranges from 1 to 20.
Output: The optimal values of C, γ and k based on the MAPE evaluation of SVR.
Step 1: For each particle p, a location vector l_p and a velocity vector v_p are assigned.
Step 2: For each particle p, the fitness function is evaluated, which is the MAPE value of SVR using this particular particle's location vector.
Step 3: At each iteration, if the stopping criterion on the fitness function is not satisfied, all particles update their historical optimal location h and the global optimal location g according to their current location and velocity.
Step 4: When the maximum number of iterations is reached, or the MAPE value is less than a pre-defined value, the global optimal location g in the search space is output.
The same process can be applied to search for the optimal parameter combination of BPNN and ELM.
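The steps of Algorithm 1 can be sketched in Python as below. This is a generic PSO loop under stated assumptions, not the authors' implementation: the inertia weight, acceleration coefficients, swarm size and iteration count are illustrative defaults, and `toy_fitness` is a hypothetical stand-in for the MAPE of an SVR model trained with a given (C, γ, k), since evaluating the real fitness requires an SVR library and the settlement data. Only the search bounds are taken from Algorithm 1.

```python
import random

# Search space from Algorithm 1: C, gamma, rolling window size k.
BOUNDS = [(1.0, 10000.0), (-100.0, 100.0), (1.0, 20.0)]
TARGET = (10.0, 0.5, 5.0)  # hypothetical optimum, for demonstration only


def toy_fitness(p):
    """Stand-in for MAPE(SVR(C, gamma, k)); smooth, with its minimum at TARGET."""
    return sum(((x - t) / (hi - lo)) ** 2
               for x, t, (lo, hi) in zip(p, TARGET, BOUNDS))


def pso_minimize(fitness, bounds, n_particles=30, n_iter=200,
                 inertia=0.7, c1=1.5, c2=1.5, tol=None, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    # Step 1: assign each particle a location vector and a velocity vector.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Step 2: evaluate the fitness at each particle's location.
    h_best = [p[:] for p in pos]          # historical optimal locations h
    h_fit = [fitness(p) for p in pos]
    g_idx = min(range(n_particles), key=h_fit.__getitem__)
    g_best, g_fit = h_best[g_idx][:], h_fit[g_idx]  # global optimal location g
    for _ in range(n_iter):
        # Step 4 (early exit): stop once the fitness is below the pre-defined value.
        if tol is not None and g_fit < tol:
            break
        # Step 3: move the particles, then update h and g.
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (h_best[i][d] - pos[i][d])
                             + c2 * r2 * (g_best[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = fitness(pos[i])
            if f < h_fit[i]:
                h_best[i], h_fit[i] = pos[i][:], f
                if f < g_fit:
                    g_best, g_fit = pos[i][:], f
    # Step 4: output the global optimal location.
    return g_best, g_fit


best, best_fit = pso_minimize(toy_fitness, BOUNDS)
```

In a real PSO-SVR run, the fitness function would round the third coordinate to an integer window size, build the rolling-window training set, train the SVR with the candidate C and γ, and return the resulting MAPE. Repeating the search with several seeds is one simple way to apply the repeated-run remedy for local extremes mentioned above.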

IV. EXPERIMENTAL RESULTS
The three machine learning techniques, namely BPNN, SVR and ELM, combined with the PSO parameter optimization algorithm, are applied to a real-world tunnel settlement prediction problem with

The current work has the following limitations. First, the tunnel settlement dataset used in this study is relatively small, which makes DLNN methods, such as long short-term memory (LSTM) and gated recurrent unit (GRU), unsuitable for this study. Instead, three representative non-deep learning techniques, i.e., BPNN, SVR and ELM, were selected to perform the simulations. More machine learning techniques should be tested in future studies. Second, the PSO method is employed to search for the optimal parameter combinations for the three machine learning methods. More search algorithms, such as the genetic algorithm (GA), ant colony algorithm and differential evolution (DE) algorithm, can be adopted and compared in future studies.
VI.