Random Small Sample Prediction Model on Displacement of Extensive Deep Soil Excavation

Abstract: To forecast the displacement of deep foundation pit supports, this paper proposes a method that combines cross validation with the support vector machine (SVM), based on random small samples. Because small, random sets of monitoring data are difficult to fit and forecast, the cross-validation method and different kernel functions of the SVM algorithm are used repeatedly to establish and optimize a displacement prediction model for the underground continuous wall, and validation samples are then used to test the accuracy of the models. The results show that the method meets the precision requirements relatively well and that the Cauchy kernel function performs better than the other kernel functions considered. The method has clear advantages in the accuracy of model fitting and prediction and can be applied in practical engineering.


INTRODUCTION
In foundation pit engineering, the displacement of the supporting structure of a deep foundation pit is an important item of feedback information. Real-time monitoring of this displacement, combined with numerical calculation, makes it possible to understand the stability of the pit in time and to better guide construction. However, because of the limitations of external conditions, the measured data are often incomplete or contain large errors. The question that arises is how to extract the internal law from limited data, which is difficult. For this reason, artificial intelligence methods such as artificial neural networks and genetic algorithms have been introduced into geotechnical engineering [1], but because of the small number of learning samples and the unreasonable design of the learning machine, these methods are prone to over-fitting. Obtaining good predictive ability from a small sample set is therefore the central difficulty in deep foundation pit engineering, the so-called "small sample problem". The support vector machine algorithm is based on statistical learning theory and the structural risk minimization principle. It is a convex quadratic optimization algorithm, which ensures that the extreme solution found is the globally optimal solution [2][3][4], and it rests on a solid mathematical foundation and rigorous theoretical analysis that give it advantages over other algorithms. In the light of the above analysis, cross validation is proposed, which ensures that the optimal parameters and the optimal model are obtained from the small random sample. In this paper, the support vector machine algorithm is used for machine learning and regression on the displacement monitoring data of the supporting structure of the Huainan Golden Digital Plaza deep excavation, in order to make predictions.

*Address correspondence to this author at the School of Civil Engineering and Architecture, Anhui University of Science and Technology, Huainan, Anhui 232001, China; E-mail: lqpzsq@163.com

Support Vector Machine (SVM) Algorithm
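In essence, SVM regression, which is grounded in statistical learning theory, fits the samples to determine a mapping function over the sample region and then uses this mapping function to calculate the values of unknown samples [5][6][7]. For a linear regression problem, the fitting function can be set as:

$$f(x) = w \cdot x + b \tag{1}$$

where $w$ is the weight vector and $b$ is the bias. Slack variables $\xi_i$ and $\xi_i^*$ are introduced to allow for fitting errors. The optimization problem is to minimize the function:

$$\min \; \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^*\right) \tag{2}$$

where the constant $C > 0$ is the penalty factor, which controls the degree of punishment for samples whose error exceeds $\varepsilon$. The constraints are:

$$\begin{cases} y_i - w \cdot x_i - b \le \varepsilon + \xi_i \\ w \cdot x_i + b - y_i \le \varepsilon + \xi_i^* \\ \xi_i,\ \xi_i^* \ge 0, \quad i = 1, \dots, n \end{cases} \tag{3}$$

The first term of formula (2) makes the function flatter and thereby improves the generalization ability, the second term reduces the errors, and $C$ sets the compromise between the two. Introducing the Lagrange function:

$$L = \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^*\right) - \sum_{i=1}^{n} a_i\left(\varepsilon + \xi_i - y_i + w \cdot x_i + b\right) - \sum_{i=1}^{n} a_i^*\left(\varepsilon + \xi_i^* + y_i - w \cdot x_i - b\right) - \sum_{i=1}^{n}\left(\gamma_i \xi_i + \gamma_i^* \xi_i^*\right) \tag{4}$$

where $a_i$, $a_i^*$, $\gamma_i$ and $\gamma_i^*$ are the Lagrange multipliers, with $a_i, a_i^* \ge 0$, $\gamma_i, \gamma_i^* \ge 0$, $i = 1, \dots, n$. Setting the partial derivatives of (4) with respect to $w$, $b$, $\xi_i$ and $\xi_i^*$ to zero gives:

$$w = \sum_{i=1}^{n}\left(a_i - a_i^*\right)x_i, \qquad \sum_{i=1}^{n}\left(a_i - a_i^*\right) = 0, \qquad C - a_i - \gamma_i = 0, \qquad C - a_i^* - \gamma_i^* = 0$$

Substituting these results back into equation (4) gives the dual form of the optimization problem, which maximizes the function:

$$W(a, a^*) = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\left(a_i - a_i^*\right)\left(a_j - a_j^*\right)\left(x_i \cdot x_j\right) + \sum_{i=1}^{n} y_i\left(a_i - a_i^*\right) - \varepsilon\sum_{i=1}^{n}\left(a_i + a_i^*\right) \tag{5}$$

subject to the constraints:

$$\sum_{i=1}^{n}\left(a_i - a_i^*\right) = 0, \qquad 0 \le a_i,\ a_i^* \le C, \quad i = 1, \dots, n \tag{6}$$

This is a quadratic optimization problem, and the SVM fitting function is obtained as:

$$f(x) = \sum_{i=1}^{n}\left(a_i - a_i^*\right)\left(x_i \cdot x\right) + b \tag{7}$$

Only a fraction of the $a_i$, $a_i^*$ are nonzero. The corresponding samples are the support vectors, which generally lie where the function changes most sharply, and the fitting function involves only inner product operations [8].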
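The compromise between flatness and fitting error in the objective can be checked numerically. The following is a minimal sketch, assuming a one-dimensional linear fitting function f(x) = w·x + b and the standard ε-insensitive loss; the function names `eps_insensitive_loss` and `primal_objective` and the sample values are illustrative and are not taken from the paper.

```python
def eps_insensitive_loss(residual, eps):
    """Loss is zero inside the eps-tube and grows linearly outside it."""
    return max(0.0, abs(residual) - eps)


def primal_objective(w, b, xs, ys, C, eps):
    """Flatness term 0.5 * w^2 plus C times the summed eps-insensitive losses."""
    flatness = 0.5 * w * w
    errors = sum(eps_insensitive_loss(y - (w * x + b), eps) for x, y in zip(xs, ys))
    return flatness + C * errors


# Hypothetical data: the first two points lie on the line, the third lies outside the tube.
xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 2.5]
value = primal_objective(w=1.0, b=0.0, xs=xs, ys=ys, C=10.0, eps=0.1)
```

Increasing C in this sketch raises the cost of the one sample outside the tube, which is the sense in which C sets the degree of punishment.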
For nonlinear regression, the samples are first mapped into a high-dimensional feature space by a nonlinear mapping, and regression is then carried out in that feature space; the key to this process is the selection of the kernel function $K(x_i, x_j)$. The optimization problem then becomes maximizing a function of the same form as (5) under the same constraints [9]:

$$W(a, a^*) = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\left(a_i - a_i^*\right)\left(a_j - a_j^*\right)K(x_i, x_j) + \sum_{i=1}^{n} y_i\left(a_i - a_i^*\right) - \varepsilon\sum_{i=1}^{n}\left(a_i + a_i^*\right)$$

The nonlinear mapping appears only implicitly, but the function $f(x)$ can be expressed as:

$$f(x) = \sum_{i=1}^{n}\left(a_i - a_i^*\right)K(x_i, x) + b$$

Because the optimization involves only inner products $(x_i \cdot x_j)$ between training samples, the same holds in the high-dimensional space: the inner products can be computed through the kernel function without knowing the specific form of the transformation [8], which simplifies the derivation and the calculation. Using different kernel functions for the inner product gives different SVM algorithms, and the applicability of each kernel function differs. Widely used kernel functions include: (1) the polynomial kernel function; (2) the radial basis function (RBF) kernel function, with parameter g; (3) the Cauchy kernel function, with parameter u.
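The three kernel functions named above can be sketched in a few lines. The expressions below are common textbook forms and are assumptions: the exact parameterizations used in this paper are not recoverable from the text, in particular whether u multiplies or divides the squared distance in the Cauchy kernel. The function names and the `degree` parameter are illustrative.

```python
import math


def sq_dist(x, z):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2):
    """Assumed form: (x . z + 1) ** degree."""
    return (sum(a * b for a, b in zip(x, z)) + 1.0) ** degree


def rbf_kernel(x, z, g=1.0):
    """Assumed form: exp(-g * ||x - z||^2), with g > 0."""
    return math.exp(-g * sq_dist(x, z))


def cauchy_kernel(x, z, u=1.0):
    """Assumed form: 1 / (1 + u * ||x - z||^2), with u > 0."""
    return 1.0 / (1.0 + u * sq_dist(x, z))


def gram_matrix(kernel, points, **params):
    """Matrix of pairwise kernel values K(x_i, x_j) used by the dual problem."""
    return [[kernel(p, q, **params) for q in points] for p in points]


# Hypothetical one-dimensional inputs, for illustration only.
points = [[0.0], [1.0], [2.0]]
K_rbf = gram_matrix(rbf_kernel, points, g=1.0)
K_cauchy = gram_matrix(cauchy_kernel, points, u=1.0)
```

Only the Gram matrix enters the dual problem, so swapping the kernel changes the algorithm without changing the optimization code.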

Building Random Small Samples
The modeling effect depends on how representative the training samples are and on the factors and results considered. Deep foundation pit monitoring data pose a small sample problem, and the samples also have a certain randomness. To take full advantage of the small sample, this paper uses the "cross-validation" approach to build small random samples and to improve the accuracy of the prediction model.
The cross-validation method is usually applied to the learning samples: a percentage of the learning samples is selected at random as the training set, and the rest are used as the test set. After the model is trained and compared against the test set, the samples are drawn again and the verification is repeated, validating the model each time so that the stability of the established model is ensured. Through analysis and comparison, the model that meets the requirements is finally selected. Given a set of n sample points $P = \{x_1, x_2, \dots, x_i, \dots, x_n\}$, a sample point $x_i$ ($1 \le i \le n$) is removed and the remaining samples form the set $P_{-i} = \{x_1, x_2, \dots, x_{i-1}, x_{i+1}, \dots, x_n\}$, from which a regression model is established [8]. This model is used to estimate the value at the sample point $x_i$, and the cross-validation error e at that point is calculated. The point of greatest change is obtained by optimizing this error; the sample point closest to it in the sample set is chosen and, if it does not already belong to the key point set of the cross-validation process, it is added to the key point set.
By building small "random samples" in this way, the limited learning samples can be used to obtain the optimal parameters and model, and the correctness of the model can then be tested with the validation samples.
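The repeated random split described above can be sketched as follows. This is a minimal illustration under stated assumptions: the model is a simple RBF-weighted kernel smoother used as a stand-in (it is not an SVM), the 80% training fraction and 20 repeats are arbitrary choices, and the sample values are made up. The names `kernel_smoother`, `random_cv_error` and `select_parameter` are hypothetical.

```python
import math
import random


def kernel_smoother(train, g):
    """Stand-in model (NOT an SVM): RBF-weighted average of training targets."""
    def predict(x):
        weights = [math.exp(-g * (x - xi) ** 2) for xi, _ in train]
        total = sum(weights)
        if total == 0.0:  # query far from every training point
            return sum(yi for _, yi in train) / len(train)
        return sum(w * yi for w, (_, yi) in zip(weights, train)) / total
    return predict


def random_cv_error(samples, g, train_fraction=0.8, repeats=20, seed=0):
    """Mean absolute test error over repeated random train/test splits."""
    rng = random.Random(seed)
    n_train = min(len(samples) - 1, max(1, int(train_fraction * len(samples))))
    errors = []
    for _ in range(repeats):
        shuffled = samples[:]
        rng.shuffle(shuffled)                      # draw a new random split
        train, test = shuffled[:n_train], shuffled[n_train:]
        model = kernel_smoother(train, g)
        errors.extend(abs(model(x) - y) for x, y in test)
    return sum(errors) / len(errors)


def select_parameter(samples, candidates, **cv_options):
    """Return the candidate parameter with the smallest cross-validation error."""
    return min(candidates, key=lambda g: random_cv_error(samples, g, **cv_options))


# Hypothetical learning samples: (day, displacement in mm), made up for illustration.
samples = [(float(d), 10.0 + 5.0 * (1.0 - math.exp(-0.1 * (d - 36)))) for d in range(36, 71)]
best_g = select_parameter(samples, candidates=[0.001, 0.01, 0.1, 1.0])
```

Fixing the seed makes every candidate parameter face the same sequence of splits, so the comparison between candidates is not blurred by split-to-split variation.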

Implementation and Results Analysis of the Algorithm
To verify the feasibility of the SVM algorithm combined with "cross validation" and its use in geotechnical engineering, the monitoring data of the Huainan Golden Digital Plaza deep excavation are used. The pit is located in the center of Huainan city. Its excavation depth is 16.15 m, the houses on its north side are only 7.4 m away, and it is supported by bored piles + 5 rotary jet mixing stiffening piles. Fissured, over-consolidated soil is widely distributed across the site; under the action of water or wet-dry cycles the soil strength drops drastically and the shear strength is very unstable. The monitoring data and forecasts for this pit are therefore very important.
The displacement at the top of the supporting structure was monitored for 78 days, and the sensitive area of the foundation pit was selected from the measured data. The 35 measurements taken in the sensitive area on the north side of the pit on days 36 to 70 are used as learning samples, and the eight measurements of days 71 to 78 are used as validation samples, in order to verify the applicability and superiority of the SVM method for establishing a forecast model of pile-top displacement. That is, "cross validation" combined with the SVM algorithm is used to perform regression on the learning samples and establish an SVM forecasting model of displacement against time; the model is then used to predict future displacement, and the measured data are used to verify the correctness of the model. The calculation is carried out with the "cross-validation" method using the RBF kernel function and with the "cross-validation" method using a different kernel function: the Cauchy kernel function is selected, and its results are compared with those of the RBF kernel function. For all parameter choices, the ε-insensitive loss function is used as the loss function. The calculation results are shown in Tables 1 and 2, where ε is the parameter of the ε-insensitive loss function, that is, the allowed error mentioned previously.
To further improve the prediction accuracy and to reflect the actual state of the pile-top displacement in real time, this paper makes the predictions one day at a time. Using the established forecast model, the displacement value of the 71st day is predicted first; the measured value of the 71st day is then added to the learning samples and the model is trained again. After the new prediction model is established, the value of the 72nd day is predicted, and so on until the 78th day.
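The day-by-day procedure is an expanding-window, one-step-ahead forecast. The sketch below illustrates only the loop structure, under stated assumptions: the model is an ordinary least-squares straight line used as a stand-in (it is not an SVM), the displacement values are made up, and the names `fit_linear_trend` and `rolling_forecast` are hypothetical. Any function that takes days and values and returns a predictor could be passed as `fit`.

```python
def fit_linear_trend(days, values):
    """Stand-in model (NOT an SVM): ordinary least-squares straight line."""
    n = len(days)
    mean_d = sum(days) / n
    mean_v = sum(values) / n
    sxx = sum((d - mean_d) ** 2 for d in days)
    sxy = sum((d - mean_d) * (v - mean_v) for d, v in zip(days, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_d
    return lambda day: slope * day + intercept


def rolling_forecast(learn_days, learn_values, future_days, future_values, fit):
    """Predict one day ahead, then add that day's measurement and refit."""
    days, values = list(learn_days), list(learn_values)
    predictions = []
    for day, measured in zip(future_days, future_values):
        model = fit(days, values)        # retrain on all data known so far
        predictions.append(model(day))   # forecast the next day only
        days.append(day)                 # the measured value becomes a
        values.append(measured)          # learning sample for the next step
    return predictions


# Hypothetical displacements (mm); the day ranges follow the text:
# days 36 to 70 for learning, days 71 to 78 for validation.
learn_days = list(range(36, 71))
learn_values = [0.2 * d + 1.0 for d in learn_days]
future_days = list(range(71, 79))
future_values = [0.2 * d + 1.0 for d in future_days]
predictions = rolling_forecast(learn_days, learn_values, future_days, future_values, fit_linear_trend)
```

Because each prediction uses only measurements from earlier days, the validation values never leak into the model that forecasts them.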
With the method of cross validation, the relative fitting error of the SVM for the 35 learning samples can be kept below 4%, a higher precision than when the "cross validation" method is not adopted. The Cauchy kernel function gives a higher fitting precision than the RBF kernel function. Among all the SVM prediction results, the Cauchy kernel function with u = 1 gives the highest precision, with the relative error kept below 20%. As can be seen from Fig. (1), using the method of cross validation, the three kinds of SVM predict well the trend of the displacement curve, which has leveled off. Table 2 shows that different kernel functions lead to different fitting and prediction precision and that, for the same kernel function, different parameters also lead to different precision. Choosing the appropriate kernel function and its parameters is therefore the key to fitting and forecasting.
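The relative errors quoted above can be computed as in the following sketch. The paper does not state its exact definition of relative error, so the usual form |predicted - measured| / |measured| is assumed here; the function name `relative_errors` and the sample values are illustrative only.

```python
def relative_errors(measured, predicted):
    """Relative error of each prediction, in percent of the measured value."""
    return [abs(p - m) / abs(m) * 100.0 for m, p in zip(measured, predicted)]


# Hypothetical values (mm), for illustration only.
measured = [20.0, 25.0, 40.0]
predicted = [19.0, 26.0, 40.0]
errors = relative_errors(measured, predicted)
max_error = max(errors)
```

Reporting the maximum over all samples corresponds to statements of the form "the relative error can be kept below" a given percentage.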

Analysis of the Influence of Various Parameters on the Accuracy of the SVM Model
Different kernel functions and parameters influence the fitting precision differently; the figures below illustrate this by showing the influence of each parameter on the precision.
With the method of cross validation, the accuracy of the SVM model has three main influencing factors: ① the kernel function type and its parameters; ② the loss function type and its parameters; ③ the value of C [10]. This paper uses the ε-insensitive loss function.
A function is defined [11][12] as a measure of the precision of the model, in which Yi* denotes the fitted value of the i-th learning sample and Yi denotes the measured value of the i-th learning sample. From the analysis of the six figures, whether the RBF kernel function or the Cauchy kernel function is used, ε = 0 and C = ∞ are the SVM model parameters that give the highest fitting precision. For the Cauchy kernel function, the fitting accuracy is better when u = 3. For the RBF kernel function, the fitting accuracy is better when g = 1, which is due to the expression of the kernel function. In the RBF kernel function, g can only take positive values and determines how "fat" or "thin" the graph of the kernel is: the greater g, the sharper the "tip" of the graph. As a result the fitting error is smaller, but "over-learning" is easily caused, which impairs the generalization ability. The parameter u plays a similar role in the Cauchy kernel function: with u = 1 the fitting precision is relatively worse, but the prediction precision is better.
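The "fat" versus "tip" behaviour can be seen by evaluating a kernel at a few distances from its centre. The sketch assumes the common forms exp(-g·d²) for the RBF kernel and 1/(1 + u·d²) for the Cauchy kernel; the latter parameterization is an assumption, chosen because it gives u the same sharpening role the text attributes to g. The function names and distances are illustrative.

```python
import math


def rbf_profile(g, distances):
    """RBF kernel value exp(-g * d^2) at each distance d from the centre."""
    return [math.exp(-g * d * d) for d in distances]


def cauchy_profile(u, distances):
    """Cauchy kernel value 1 / (1 + u * d^2) at each distance d (assumed form)."""
    return [1.0 / (1.0 + u * d * d) for d in distances]


distances = [0.0, 0.5, 1.0, 2.0]
wide = rbf_profile(0.1, distances)     # small g: slow decay, "fat" curve
narrow = rbf_profile(10.0, distances)  # large g: fast decay, sharp "tip"
cauchy_u1 = cauchy_profile(1.0, distances)
cauchy_u3 = cauchy_profile(3.0, distances)
```

With a sharper kernel each support vector influences only its immediate neighbourhood, which lowers the fitting error on the learning samples but leaves less smoothing to carry over to unseen days.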

CONCLUSIONS
(1) Finding the underlying law in the site monitoring data of a deep foundation pit is difficult but very meaningful. This paper finds that combining the cross-validation method with the SVM solves the fitting and prediction problems of field settlement monitoring data well. The data and analysis above show that the proposed approach has practical value for predicting nonlinear time series in geotechnical engineering.
(2) The choice of kernel function has an important influence on the learning and prediction performance of the SVM; the kernel function used is directly related to the accuracy of the results. In this paper, the Cauchy kernel function gave the best fitting precision and prediction precision.
(3) The analysis of the influence of various parameters on the accuracy of the SVM model shows that the value of C is positively correlated with the fitting precision, while ε is negatively correlated with the fitting precision. For the Cauchy kernel function, an increase in u affects the prediction precision and the fitting precision in opposite directions. Therefore, to ensure prediction precision, part of the fitting precision may be "sacrificed".
The forecasting method adopted in this paper is not yet very accurate in predicting the displacement at the top of bored cast-in-situ piles. Future studies should combine the findings of this paper with other methods to further improve the prediction ability.

CONFLICT OF INTEREST
We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled "Random Small Sample Prediction Model on Displacement of Extensive Deep Soil Excavation".
Fig. (2). The precision curve with constant g, ε and different C.

(3) Using the RBF kernel function, when C remains the same and only g is changed, the fitting precision curve is as shown in Fig. (4) (ε = 0, C = ∞).

Fig. (4). The precision curve with constant C, ε and different g.

(4) Using the Cauchy kernel function, when ε and C are constant and only u is changed, the fitting precision curve is as shown in Fig. (5) (ε = 0, C = ∞).

Fig. (5). The precision curve with constant C, ε and different u.

(5) Using the Cauchy kernel function, when u and C remain the same and only ε is changed, the fitting precision curve is as shown in Fig. (6) (u = 1, C = ∞).

(6) Using the RBF kernel function, when g and C remain the same and only ε is changed, the fitting precision curve is as shown in Fig. (7) (g = 1, C = ∞).