A RANDOM FOREST MODEL FOR THE PREDICTION OF SPUDCAN PENETRATION RESISTANCE IN STIFF-OVER-SOFT CLAYS

Punch-through is a major threat to jack-up units, especially at well sites with layered stiff-over-soft clays. A model based on the random forest (RF) method is proposed to predict the spudcan penetration resistance in stiff-over-soft clays. The RF model was trained and tested with numerical simulation results obtained from a finite element model implemented with the Coupled Eulerian Lagrangian (CEL) approach. With the proposed CEL model, the effects of the stiff layer thickness, the undrained shear strength ratio, and the undrained shear strength of the soft layer on the bearing characteristics, as well as the soil failure mechanism, were studied numerically. A simplified resistance profile model of penetration in stiff-over-soft clays is proposed, divided into three sections by the peak point and the transition point. The importance of the soil parameters to the penetration resistance was analysed. The trained RF model was then tested against the test set, showing a good prediction of the numerical cases. Finally, the trained RF model was validated against centrifuge tests. The RF model successfully captured the punch-through potential, and was verified using data recorded in the field, showing advantages over the SNAME guideline. The trained RF model is expected to give a good prediction of the spudcan penetration resistance profile, especially if trained with more field data.


INTRODUCTION
Jack-up drilling units are widely employed in the offshore oil and gas industry, due to their economic and operational efficiency. However, the safety of jack-ups is often challenged by complex soil conditions in the seabed. A strong layer overlying a weak layer is a dangerous condition during jack-up installation. In this case, the penetration resistance usually reaches a peak value followed by some reduction, resulting in rapid penetration of the jack-up legs during preloading, a phenomenon known as "punch-through" [7].
Seabeds with stiff clay overlying soft clay, i.e. stiff-over-soft clays, often threaten the safety of jack-ups due to their punch-through potential. Since the 1960s, there have been many theoretical studies on the bearing capacity of foundations in stiff-over-soft clays. Two semi-empirical methods, the projected area method proposed by Yamaguchi [15] and the punching shear model proposed by Meyerhof and Hanna [9], are recommended in most civil engineering codes [6,11]. Furthermore, Madhav and Sharma studied the effect of load distribution at the interface on the ultimate bearing capacity of the underlying soft clay [8]. Yuan and Yan improved the punching shear model for layered soils with more than two layers [16]. The above methods assumed the presence of a soil plug underneath the spudcan, and the evolution of plugs has been comprehensively studied for the case of sand overlying clay [5,17]. However, the assumed failure mechanisms differ from the actual mechanism in stiff-over-soft clay, and these methods concentrate on the peak resistance at the stiff layer surface, so they cannot provide guidance for predicting the full resistance profile.
Much research has been performed to study the spudcan penetration resistance profile in stiff-over-soft clay. Mehryar and Hu initiated the numerical simulation of the punch-through problem with the Remeshing and Interpolation Technique with Small Strain (RITSS) approach, and obtained a preliminary understanding of the punch-through process [9]. Hossain and Randolph studied spudcan penetration in stiff-over-soft clays both experimentally and numerically [3,4], and proposed a semi-empirical model with a peak resistance and a post-peak resistance in the profile. The peak resistance and its position were obtained by curve fitting to the numerical and experimental results, and the post-peak resistance was deduced from the evolution of the soil plug before merging with the bearing profile in the underlying clay. Tjahyono also conducted experimental and numerical studies of spudcan penetration in stiff-over-soft clays, and proposed a semi-empirical model [13]. Zheng et al. proposed formulas for the peak resistance and the resistance at layer interfaces by curve-fitting both numerical and centrifuge results [18], then further extended the method to stiff-soft-stiff clay. All of the methods mentioned are based on the interpolation of numerical and/or centrifuge results, and only a limited number of parameters are considered. Therefore, their applicability to engineering practice is questionable.
This study aimed to propose a robust model for the prediction of the spudcan penetration resistance profile in stiff-over-soft clays. Firstly, a numerical model based on the CEL approach was proposed and validated against centrifuge tests. With the proposed model, a series of parametric studies were conducted to study the influences of stiff layer thickness, soil strength ratio and the soil strength of the soft layer on the penetration resistance profile. Then, a machine learning method, the Random Forest (RF) method, was adopted to analyse the numerical results. The RF model was trained and tested with the numerical results, showing good performance in capturing the punch-through potential. The successful implementation of the Machine Learning method in spudcan penetration resistance profile prediction is of great significance to future offshore geotechnical studies.

NUMERICAL METHODS AND SIMULATIONS

NUMERICAL MODEL
In this study, the Coupled Eulerian Lagrangian method was adopted to simulate the spudcan penetration process in stiff-over-soft clays [10,12]. The soil was modelled as an Eulerian part, as it undergoes very large deformation and flow, while the spudcan was modelled as a Lagrangian part. The spudcan was constrained as a rigid body because its deformation during penetration is negligible. As the problem is axisymmetric, only a quarter of the domain was modelled. The spudcan was discretised with the 10-node three-dimensional element, denoted as C3D10M, while the soil was discretised with the 8-node three-dimensional Eulerian element with reduced integration, denoted as EC3D8R. The spudcan-soil interface was modelled using the general contact method, and a rough interface was assumed.
As severe soil deformation is induced in the region close to the spudcan during penetration, the mesh in this region was refined, with the mesh size gradually increasing towards the outer boundaries. In this study, the mesh within 1.5 times the spudcan diameter (1.5D) from the spudcan centre was refined. Since a finer mesh incurs a higher computational cost, the influence of mesh size on the simulated resistance was studied in order to balance cost and precision. Three cases with different mesh sizes (0.05D, 0.015D and 0.005D, see Fig. 1) in the refined region were simulated and compared. In these cases, the parameters were the same as those of the centrifuge test case E2UU-II-T5 in Ref. [3] (see Table 1). The penetration responses show that the case with the coarsest mesh (0.05D) predicted a much larger resistance than the other two cases. The two finer cases matched very well, with less than 2% difference. It can be inferred that the solution converges with mesh refinement, and that a mesh size of 0.015D in the refined region is fine enough with regard to simulation precision.
The clay was modelled as an elastic, perfectly plastic material obeying the Tresca criterion. Large plastic strains always develop in the clay during spudcan penetration. Hossain and Randolph showed that the effect of strain rate on the spudcan penetration resistance is negligible [2], while Tjahyono demonstrated that the strain softening effect is not negligible in stiff-over-soft clay [13]. Therefore, the strain softening of clay is considered in the present model. In the Abaqus model, a field variable is introduced to represent the degree of softening. The field variable is updated by the user subroutine VUSDFLD during the solution, which in turn updates the soil strength in each time increment.
The strain softening model proposed by Einav and Randolph was incorporated into the present model, as follows [1]:

$$S_u = \left[\delta_{rem} + (1 - \delta_{rem})\, e^{-3\xi/\xi_{95}}\right] S_{u0}$$

in which $S_u$ is the undrained shear strength of the clay updated during the numerical calculation, $S_{u0}$ is the initial undrained shear strength of the clay measured from geotechnical tests, $\delta_{rem}$ is the residual degradation factor, which is the inverse of the soil sensitivity $S_t$, $\xi$ is the accumulated plastic shear strain, and $\xi_{95}$ is the reference accumulated plastic shear strain. A typical value of $\xi_{95}$ = 10, within the range of 10-25 (i.e. 1000-2500% shear strain) suggested in [2] for normal clays with sensitivity $S_t$ = 2-5, was used.
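As an illustration of the softening law, the following minimal Python sketch evaluates the degraded strength for a given accumulated plastic shear strain. It assumes the commonly quoted exponential form S_u = [δ_rem + (1 - δ_rem) exp(-3ξ/ξ_95)] S_u0; the function name and its defaults (δ_rem = 0.33, ξ_95 = 10, the values adopted in this study) are illustrative and are not taken from the VUSDFLD subroutine itself.

```python
import math


def softened_strength(su0, xi, delta_rem=0.33, xi_95=10.0):
    """Undrained shear strength after an accumulated plastic shear strain xi.

    Assumed form: su = [delta_rem + (1 - delta_rem) * exp(-3 * xi / xi_95)] * su0
    su0       initial undrained shear strength (any consistent unit, e.g. kPa)
    xi        accumulated plastic shear strain (>= 0)
    delta_rem residual degradation factor, i.e. 1 / sensitivity
    xi_95     strain at which 95% of the strength reduction has occurred
    """
    if xi < 0.0 or xi_95 <= 0.0:
        raise ValueError("xi must be >= 0 and xi_95 must be > 0")
    factor = delta_rem + (1.0 - delta_rem) * math.exp(-3.0 * xi / xi_95)
    return factor * su0


# Illustrative values only: intact, at the reference strain, fully remoulded.
intact = softened_strength(10.0, 0.0)
at_reference = softened_strength(10.0, 10.0)
remoulded = softened_strength(10.0, 1000.0)
```

With these defaults the strength starts at S_u0, has lost about 95% of the total reduction when ξ = ξ_95, and tends to δ_rem S_u0 at very large strains.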

MODEL VALIDATION
To verify the accuracy of the proposed model in simulating spudcan penetration in stiff-over-soft clay, two centrifuge test cases E1UU-II-T5 and E2UU-II-T5 from [3] were used for comparison.
Both test cases have an underlying clay layer with uniform strength, but differ in the soil strength ratio. Extensive numerical simulations showed that $\delta_{rem}$ = 0.33 and $\xi_{95}$ = 10 gave the best prediction of the spudcan penetration resistance in stiff-over-soft clay, while the numerical cases without strain softening significantly overestimated both the peak and post-peak resistance (see Fig. 2). For E1UU-II-T5, the numerical simulation closely reproduced the centrifuge test, and no punch-through potential was shown. For E2UU-II-T5, the numerical simulation showed significant punch-through potential, as did the centrifuge test.

PARAMETRIC ANALYSIS
Several parameters involving the spudcan geometry, soil properties, and strata configuration affect the penetration resistance in stiff-over-soft clays. To analyse the influences of these parameters systematically, 67 numerical cases were simulated and characterised by three normalised parameters: the normalised upper layer thickness $t/D$; the soil strength ratio $S_{ubs}/S_{ut}$; and the normalised undrained shear strength of the soft layer $S_{ubs}/(\gamma'_b D)$. All parameter values are summarised in Table 1. The penetration resistances of four cases with different layer thicknesses ($t/D$ = 0.5, 0.75, 1.0, 1.5) were compared. The other two normalised parameters were held constant at $S_{ubs}/(\gamma'_b D)$ = 0.4 and $S_{ubs}/S_{ut}$ = 0.2, and the undrained shear strength of the lower layer was uniform with depth ($k$ = 0).
It was shown that the normalised resistance reached a peak value, followed by some reduction. For $t/D$ = 0.5, the normalised penetration resistance first increased to a peak value, then decreased until a transition occurred around the layer interface. The other cases showed a similar trend: the larger the normalised stiff layer thickness, the larger the peak normalised resistance. The transition point also deepened with the normalised stiff layer thickness. It is also shown in Fig. 3 that the post-peak resistance reduction rate increased with the normalised stiff layer thickness. To study the influence of the shear strength ratio, the normalised resistance profiles of two series of numerical simulations with $S_{ubs}/(\gamma'_b D)$ = 0.4 are plotted in Fig. 4. The shear strength ratio was varied between 0.2, 0.4, 0.6 and 0.8, while the other parameters were set as $D$ = 6 m, $t/D$ = 0.75 and $k$ = 0. For $S_{ubs}/S_{ut}$ = 0.8, the penetration resistance increased monotonically in both the stiff and soft layers, and no punch-through potential was observed. As $S_{ubs}/S_{ut}$ decreased, the peak resistance tended to exceed the resistance in the soft layer, and the punch-through potential emerged. It should be noted that, in the cases where $S_{ubs}/S_{ut}$ = 0.6, the post-peak resistance was very close to the peak resistance in the stiff layer. For $S_{ubs}/S_{ut}$ > 0.6, the penetration resistance profile showed no punch-through potential, while for $S_{ubs}/S_{ut}$ < 0.6 it showed significant punch-through potential.
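The distinction drawn above between monotonically increasing profiles and profiles with a post-peak reduction can be checked automatically on a sampled resistance curve. The sketch below is one hypothetical way to do so: it scans a profile and reports the largest drop below a preceding maximum. The function name, the tolerance argument and the sample numbers are illustrative assumptions, not results from this study.

```python
def punch_through_potential(depths, resistances, tol=0.0):
    """Return (flag, peak_depth, peak_resistance, largest_drop).

    flag is True when the resistance falls below an earlier maximum by more
    than tol, i.e. when the profile is not monotonically increasing.
    """
    if len(depths) != len(resistances) or len(depths) < 2:
        raise ValueError("need two equally long sequences with >= 2 points")
    peak = 0          # index of the running maximum so far
    worst_peak = 0    # running maximum that precedes the largest drop
    largest_drop = 0.0
    for i in range(1, len(resistances)):
        if resistances[i] > resistances[peak]:
            peak = i
        elif resistances[peak] - resistances[i] > largest_drop:
            largest_drop = resistances[peak] - resistances[i]
            worst_peak = peak
    if largest_drop > tol:
        return True, depths[worst_peak], resistances[worst_peak], largest_drop
    return False, None, None, largest_drop


# Made-up profiles for illustration (depth and resistance in arbitrary units).
depth = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
with_peak = punch_through_potential(depth, [0.0, 5.0, 8.0, 6.0, 4.0, 5.0, 7.0])
monotonic = punch_through_potential(depth, [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
```

A non-zero tolerance can be used to ignore small numerical oscillations in a simulated profile.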

Fig. 4. Effect of the undrained shear strength ratio on penetration resistance
To analyse the influence of the shear strength of the underlying layer, cases with $S_{ubs}/(\gamma'_b D)$ = 0.2, 0.4 and 0.6 for $S_{ubs}/S_{ut}$ = 0.4 were simulated. As shown in Fig. 5, the potential for punch-through failure existed in all cases. The undrained shear strength of the soft layer influenced not only the normalised resistance in the lower layer but also that in the upper layer. The normalised peak resistance decreased with increasing shear strength of the underlying layer, while the potential for punch-through failure increased.

Fig. 5. Effect of undrained shear strength of lower layer clay on penetration resistance
Normally the undrained shear strength varies linearly with depth, so the influence of the strength gradient in the soft layer was also studied. Results for cases with various normalised strength gradients ($kD/S_{ubs}$ = 0.4, 0.5, 0.6 and 0.8) for $S_{ubs}/S_{ut}$ = 0.4 are plotted in Fig. 6. The influence of the normalised strength gradient on the normalised peak resistance was not significant; however, significant differences were shown in the lower layer.

PREDICTION WITH THE MACHINE LEARNING METHOD
The penetration resistance profile in stiff-over-soft clay usually has a peak value followed by a significant reduction; however, the reduction may be insignificant in some cases, such as for thin stiff layers. It is difficult to find a single nonlinear function that describes the whole profile, as no candidate function clearly fits the shape, and the number of independent parameters of such a function may be more than four. Therefore, the profile was divided into three sections in this study, and each section was assumed to be linear. In the first section, the resistance increases rapidly up to the peak value. In the second, the resistance decreases with penetration depth until the transition around the layer interface. In the last section, the resistance increases steadily in the lower layer. The general resistance profile in stiff-over-soft clay can therefore be illustrated as in Fig. 7. Two points characterise the profile: the peak point and the transition point. Correspondingly, four parameters define the resistance profile: peak resistance, transition resistance, peak depth, and transition depth.
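The three-section idealisation can be written as a piecewise-linear function of depth. The sketch below is a minimal interpretation under two assumptions that are not stated in the text: the resistance is taken as zero at the surface, and the slope of the third section is supplied as a separate, hypothetical argument (lower_slope), since the four characterising parameters do not fix it.

```python
def simplified_profile(depth, peak_depth, peak_q, trans_depth, trans_q,
                       lower_slope):
    """Piecewise-linear resistance at a given depth.

    Section 1: linear rise from (0, 0) to the peak point (assumed origin).
    Section 2: linear change from the peak point to the transition point.
    Section 3: linear increase below the transition point with lower_slope
               (hypothetical extra parameter, resistance per unit depth).
    """
    if depth < 0.0:
        raise ValueError("depth must be non-negative")
    if not 0.0 < peak_depth < trans_depth:
        raise ValueError("require 0 < peak_depth < trans_depth")
    if depth <= peak_depth:
        return peak_q * depth / peak_depth
    if depth <= trans_depth:
        frac = (depth - peak_depth) / (trans_depth - peak_depth)
        return peak_q + frac * (trans_q - peak_q)
    return trans_q + lower_slope * (depth - trans_depth)


# Made-up normalised values for illustration only.
params = dict(peak_depth=0.2, peak_q=10.0, trans_depth=1.0, trans_q=6.0,
              lower_slope=2.0)
sampled = [simplified_profile(d, **params) for d in (0.1, 0.2, 0.6, 1.0, 1.5)]
```

Evaluating the function on a grid of depths reproduces the shape sketched in Fig. 7 once the four characterising parameters have been predicted.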

RANDOM FOREST METHOD
Decision trees are a popular method for various machine learning tasks. However, trees that are grown very deep tend to overfit their training sets, i.e. they have low bias but very high variance. The random forest (or random decision forest) method was proposed to correct this tendency. As shown in Fig. 8, it is an ensemble learning method for classification, regression and other tasks, which operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees. As a mature algorithm, it is available in Python and can easily be called from a Python program. However, to improve the predictive power of the model, some parameters should be tuned. Generally, three parameters are involved. The first is "max_features", the maximum number of features that the random forest algorithm can try in an individual tree. Increasing "max_features" generally improves the performance of the model, as more options are considered at each node. However, it may decrease the diversity among the individual trees, which is a significant advantage of the algorithm. The second is "n_estimators", the number of trees built before taking the majority vote or the average of the predictions. A higher number of trees gives better performance, but makes the code slower. The last is "min_samples_leaf", the minimum number of samples in a leaf. A smaller leaf makes the model more prone to capturing noise in the training data.
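To make the roles of the three tuning parameters concrete, the following is a deliberately small, pure-Python sketch of a random forest regressor: bootstrap sampling, a random subset of features tried at each node, and averaging over trees. It is an illustration of the mechanism only, not the library implementation used in this study, and the step-shaped data set is synthetic.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def _sse(values):
    m = _mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(X, y, max_features, min_samples_leaf, rng):
    """Grow one regression tree; a leaf is a float, a node is a dict."""
    n = len(y)
    if n < 2 * min_samples_leaf or len(set(y)) == 1:
        return _mean(y)
    best = None  # (sum of squared errors, feature index, threshold)
    for f in rng.sample(range(len(X[0])), max_features):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y[i] for i in range(n) if X[i][f] <= thr]
            right = [y[i] for i in range(n) if X[i][f] > thr]
            if min(len(left), len(right)) < min_samples_leaf:
                continue
            score = _sse(left) + _sse(right)
            if best is None or score < best[0]:
                best = (score, f, thr)
    if best is None:  # no admissible split among the sampled features
        return _mean(y)
    _, f, thr = best
    li = [i for i in range(n) if X[i][f] <= thr]
    ri = [i for i in range(n) if X[i][f] > thr]
    return {
        "f": f,
        "thr": thr,
        "left": build_tree([X[i] for i in li], [y[i] for i in li],
                           max_features, min_samples_leaf, rng),
        "right": build_tree([X[i] for i in ri], [y[i] for i in ri],
                            max_features, min_samples_leaf, rng),
    }


def predict_tree(node, row):
    while isinstance(node, dict):
        node = node["left"] if row[node["f"]] <= node["thr"] else node["right"]
    return node


def fit_forest(X, y, n_estimators=25, max_features=2, min_samples_leaf=1,
               seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(y)) for _ in y]  # bootstrap sample
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 max_features, min_samples_leaf, rng))
    return forest


def predict_forest(forest, row):
    return _mean([predict_tree(tree, row) for tree in forest])


# Synthetic data: the target steps from 1 to 3 at x0 = 0.5; x1 is irrelevant.
X = [[i / 20.0, (i * 7 % 20) / 20.0] for i in range(20)]
y = [1.0 if row[0] < 0.5 else 3.0 for row in X]
forest = fit_forest(X, y, n_estimators=25, max_features=2, min_samples_leaf=1)
low = predict_forest(forest, [0.1, 0.5])
high = predict_forest(forest, [0.9, 0.5])
```

Raising min_samples_leaf in this sketch yields shallower trees with smoother predictions, while lowering max_features makes the individual trees differ more from one another.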
In the present work, the four characterising parameters of the resistance profile were predicted separately. Thus, the RF model parameters were tuned for each of the four predicted quantities. Multiple sets of parameters were tried to find the best-performing combination for each, and the optimised parameters are listed in Table 2.

IMPORTANCE ANALYSIS
The random forest algorithm is a highly convenient method with which to analyse data. In this section, the contribution of the independent parameters to the dependent parameters, i.e. the importance of the independent parameters, was analysed. The sum of the importance ratios was 1; the larger the value, the more the parameter contributes to the dependent parameter.
As shown in Fig. 9a, the undrained shear strength ratio had a dominant effect on the normalised peak resistance, with an importance ratio of 0.92. In other words, the other parameters had little effect. Fig. 9b shows that the peak resistance depth was affected by several parameters: the strength ratio, the normalised stiff layer thickness, and the normalised shear strength of the soft layer. The strength gradient of the soft layer showed little effect on the peak resistance depth.
Fig. 9. Importance of the soil parameters to peak resistance and depth: (a) peak resistance; (b) peak resistance depth

The transition resistance was affected by the strength ratio and the undrained shear strength of the lower layer. As shown in Fig. 10a, the normalised stiff layer thickness also showed a small effect, likely due to the soil plug locked underneath the spudcan. As the undrained shear strength gradient in the soft layer affected the shear strength on the shear band underneath the spudcan, it also showed some effect. The depth of the transition resistance was affected by all four normalised parameters. However, as shown in Fig. 10b, the strength ratio and the strength gradient showed the most prominent effects.
Fig. 10. Importance of soil parameters to the transition resistance and depth: (a) transition resistance; (b) transition depth
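The importance ratios discussed in this section are normalised so that they sum to 1. The way such ratios are computed is not detailed here; as a model-agnostic stand-in, the sketch below uses permutation importance (the increase in mean squared error when one input column is shuffled), normalised in the same way. This is a swapped-in technique for illustration, not necessarily the measure used to produce Figs. 9 and 10, and the predictor and data are synthetic.

```python
import random


def permutation_importance(predict, X, y, seed=0):
    """Normalised permutation importances for a fitted predictor.

    predict is any callable mapping one input row (a list) to a prediction.
    The returned values are non-negative and sum to 1, unless no feature
    changes the error at all, in which case all importances are 0.
    """
    rng = random.Random(seed)

    def mse(rows):
        return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    raw = []
    for f in range(len(X[0])):
        column = [row[f] for row in X]
        rng.shuffle(column)  # break the link between feature f and the target
        shuffled = [row[:f] + [v] + row[f + 1:] for row, v in zip(X, column)]
        raw.append(max(0.0, mse(shuffled) - baseline))
    total = sum(raw)
    if total == 0.0:
        return [0.0] * len(raw)
    return [r / total for r in raw]


# Synthetic check: the target depends on the first feature only.
X_demo = [[i / 10.0, (i * 3 % 10) / 10.0] for i in range(10)]
y_demo = [2.0 * row[0] for row in X_demo]
importances = permutation_importance(lambda row: 2.0 * row[0], X_demo, y_demo)
```

For a trained forest, predict would be the forest's own prediction function and X, y a held-out data set.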

RF MODEL TEST
In order to test the trained RF model, the penetration resistance results obtained from numerical simulations were divided into a training set and a test set. The training set was used to train the RF model. Then, the trained RF model read the independent parameters of the test set, and predicted the dependent parameters of the resistance profile.
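The split-and-score procedure described above can be sketched as follows. The split fraction and the goodness-of-fit measure are not stated in the text, so the 20% test fraction and the coefficient of determination (R²) used here are assumptions, as are the function names.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Randomly split a list of cases into (train, test) lists."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(len(rows) * test_fraction))
    test = [rows[i] for i in idx[:n_test]]
    train = [rows[i] for i in idx[n_test:]]
    return train, test


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 is perfect, 0 matches the mean."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R2 is undefined when all observations are equal")
    return 1.0 - ss_res / ss_tot


# 67 case identifiers standing in for the 67 numerical simulations.
cases = list(range(67))
train, test = train_test_split(cases, test_fraction=0.2, seed=0)
```

Each of the four characterising parameters would be scored separately on the test cases, which makes a difference in quality between resistance and depth predictions directly visible.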

Fig. 11. Test of the trained RF model in resistance prediction
The predicted results were compared with the test set. Fig. 11 shows that the peak and transition resistances predicted by the trained RF model agreed well with the test set. However, comparisons of the depths of peak resistance and transition resistance showed poor agreement, as shown in Fig. 12, likely due to the error accumulated from the data extraction. Because the resistance profile was not angular but very smooth, the value was difficult to determine when choosing the peak and transition points in the resistance profile.

DISCUSSION
The trained RF model was further validated against two centrifuge test cases in [3], as no field-recorded resistance profiles are available. The proposed method was also compared with the widely recognised recommended practice SNAME TR5-5 [11]. As shown in Fig. 13, the potential for punch-through was well captured in both cases, while SNAME TR5-5 tended to overestimate the punch-through potential. However, the profiles predicted with the present model were not in good agreement with the centrifuge tests. The difference between the prediction and the tests arose because the RF model was trained with numerical results, which themselves deviate from the true resistance profile. The centrifuge test results may also contain some error, as Xie et al. pointed out that the interpretation of the top stiff layer strength is very challenging in centrifuge tests [14]. If the RF model is trained with field data, it should give a good prediction of the penetration resistance profile in stiff-over-soft clays.

Fig. 13. Validation of the trained RF model against centrifuge tests
Although there is a lack of field-recorded penetration resistance profiles for stiff-over-soft clay, there are some documented punch-through accidents. A JU2000E jack-up unit encountered a punch-through at site S-A. The unit has three spudcans, each with a diameter of 18 m, and the distance from the maximum cross-section to the spudcan tip is 1.17 m. The stratum properties of site S-A are listed in Table 3. Because there is an inter-bedded soft clay layer, spudcans have punch-through potential when installed there. The trained RF model was used to analyse the penetration resistance of the JU2000E platform at site S-A, and was also compared with SNAME TR5-5 (Fig. 14). The peak resistance depth predicted by the RF model was 8.1 m, which is deeper than that predicted by SNAME TR5-5 (6.0 m). The peak resistance predicted by the RF model was 87.3 MN, which is larger than that predicted by SNAME TR5-5 (70.7 MN). Field records show that the JU2000E spudcan preloaded with 85.8 MN encountered punch-through at 7.3 m depth. The resistance profile predicted by the trained RF model therefore gives a better prediction for site S-A than SNAME TR5-5; however, as the training data set is very limited given the lack of field data, the RF model should be trained with more field data before being widely adopted.
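The comparison for site S-A can be made quantitative with a short calculation of relative errors. The sketch below uses only the figures quoted above and assumes that the recorded preload (85.8 MN) and punch-through depth (7.3 m) may be treated as the observed peak resistance and peak depth, which is an interpretation rather than a statement from the field record.

```python
def relative_error(predicted, observed):
    """Signed relative error of a prediction with respect to an observation."""
    return (predicted - observed) / observed


# Field record, interpreted as the observed peak point (assumption).
field = {"load_MN": 85.8, "depth_m": 7.3}

# Predicted peak points quoted in the text.
predictions = {
    "RF": {"load_MN": 87.3, "depth_m": 8.1},
    "SNAME TR5-5": {"load_MN": 70.7, "depth_m": 6.0},
}

errors = {
    name: {key: relative_error(value, field[key]) for key, value in p.items()}
    for name, p in predictions.items()
}

for name, err in errors.items():
    print(f"{name}: load {err['load_MN']:+.1%}, depth {err['depth_m']:+.1%}")
```

Under this interpretation the RF prediction is within roughly 2% of the recorded load and 11% of the recorded depth, whereas SNAME TR5-5 underestimates both by roughly 18%.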

CONCLUSIONS
Various numerical simulations of spudcan penetration in stiff-over-soft clays were conducted with a numerical model based on the Coupled Eulerian Lagrangian method. A preliminary parametric study of the numerical results showed that the undrained shear strength ratio dominates the punch-through potential in stiff-over-soft clays, and that the critical ratio is around 0.6. Furthermore, the Random Forest method was adopted to analyse the numerical results. Importance analysis was used to further analyse the effects of the independent parameters on the resistance profile. The RF model was trained and tested with a training set and a test set extracted from the numerical results. It was shown that the resistance values predicted by the trained RF model agreed well with the test set. Finally, the trained RF model was validated against two centrifuge tests. Although the resistance values predicted by the trained RF model did not agree well with the centrifuge tests, the shapes of the resistance profiles were well captured, which is critical for evaluating the punch-through potential.