Exploring Interactions among EPBM Features using Bayesian Networks

This paper presents an implementation of Bayesian network (BN) and structure learning algorithm to explore the interactions and causal relationships among excavation features of an earth pressure balance tunnel boring machine (EPBM). This study was performed on a data set from the State Route 99 (SR99) tunnel construction in Seattle, WA. A score-based structure learning algorithm was used to construct the BN graphs. The effects of providing explicit information about geologic conditions in the interactions were evaluated. This study demonstrates that BN graphs could systematically model the interactions among EPBM features in a compact and interpretable representation. The score-based algorithm could successfully capture several true and sensible mechanisms of the feature interactions based on data. Furthermore, their dependencies might indicate how the operators controlled the EPBM.


INTRODUCTION
Tunneling is a complex process due to unpredictable interactions between geologic conditions and tunnel boring machine (TBM) behaviors. To control a TBM, the operators have to continuously make real-time interpretations, judgments, and decisions. For example, in soft ground tunneling using an earth pressure balance tunnel boring machine (EPBM), the operators have to interpret the encountered geologic conditions, decide the excavation control parameters, inject foam to condition the encountered ground, balance the chamber pressure, and navigate the shield to achieve the correct tunnel position (JSCE 2016;Maidl et al. 2013). Therefore, tunneling performance, quality, and safety are strongly reliant on the skill and experience of the TBM operators.
Typically, TBM operators are guided by instruction sheets that are prepared by tunnel engineers prior to each tunneling excavation phase. However, during the excavation phase, the operators have to make judgments and adjustments according to the numerous operational data collected by TBM sensors. For humans, continuously making real-time interpretations, judgments, and decisions in various control tasks based on numerous streaming sensor data is not straightforward and may lead to inconsistent results. A more systematic approach is required to utilize TBM operation data to support the tunneling process (Garcia et al. 2021).
Studies to address this challenge have been emerging in recent years. Especially due to the growing number of data collected by TBM sensors, the increase of computing power, and the emergence of machine learning techniques. For example, many studies have been attempted to predict or forecast the ahead geologic conditions (e.g., Erharter et al. 2020;Sousa and Einstein 2012;Zhang et al. 2019;Zhao et al. 2019), predict the tunneling performance (e.g., Martins and Miranda 2013;Zhou et al. 2020), and predict the tunneling-induced settlement (e.g., Chen et al. 2019;Zhang et al. 2021). However, most past studies focus on prediction performance. Less focus has been paid to understand the features (measured variables used as predictors), the causal inference, and the generalization of the data (Sheil et al. 2020).
This study aims to explore interactions and causal relationships among EPBM features in a systematic and quantitative approach. Bayesian Networks (BN) and a structure learning algorithm were used to model the interactions among EPBM features along the tunnel alignment. The relationship between the feature dependencies and EPBM control mechanisms was explored. The effects of providing explicit information about geologic conditions (i.e., soil labels) in the interactions were also evaluated. This study was performed on a data set from the State Route 99 (SR99) tunnel construction in Seattle, WA.

DATA
The SR99 tunnel construction was carried out using a 17. To allow more generalization and interpretability of the results, data from the same sensor types were summed (e.g., for sensors related to volume, force, pressure) or averaged (e.g., for sensors related to speed, length) together.
This study used statistical ring data as the observation data points. In other words, observation records of a feature within a ring were represented by a data point. Depending on the data characteristics, the data point was obtained by taking the final observation (e.g., for features related to volume, length), or taking the average observation (e.g., for features related to force, pressure, speed) along the ring.

Bayesian Networks
Bayesian network (BN) is a probabilistic graphical model that uses a directed acyclic graph (DAG) to compactly represent the joint distributions of a set of variables and their conditional independencies (Koller and Friedman 2009). In a BN graph , the joint probability distribution over variables 1 , … , can be expressed as a product of the individual conditional probability distributions (CPDs), (1) denote the parents of in . And encodes the local independencies ( ) which state that each node , given its parent nodes , is conditionally independent of its non-descendants , as expressed by By using this method, the global complexity of a high-dimensional distribution can be reduced into local probabilistic models and dependencies.

Structure Learning
BN graphs can be constructed manually based on domain knowledge or can be inferred based on data using Bayesian network structure learning (BNSL) algorithms. In general, BNSL can be divided into classes of (i) constraint-based method, (ii) score-based method, and (iii) hybrid method.
In this study, the BN graphs were constructed using the score-based method. Tabu search algorithm (Glover and Laguna 1993) was used to iteratively construct and modify the graphs toward an optimum network model. The score of the constructed graphs was computed based on Bayesian information criterion (BIC), defined as Where is the likelihood function, ̂ is parameters of the graph, n is the sample size, and ( ) is the model dimension or the number of parameters of the whole network.
This study was performed in R programming language using a BN framework implementation by Scutari and Ness (2020).

Data Preparation
The data was prepared by removing (i) observations of nonadvancing stages, (ii) observations with an incomplete set of features, i.e., missing values, (iii) observations with other erroneous records such as data outliers and duplications. The data outliers were determined using a modified interquartile range (Maher 2015).
The data was labeled according to soil types based on the borehole information and physical characteristics described as engineering soil units (ESU) in the geotechnical baseline project reports (WSDOT 2010a; b). The boreholes were envisaged to represent about 25 feet (about 7.62 meters) radius distance of their vicinity. Five labels were created to represent mixtures of the ESU at each borehole location, namely, (i) cohesive clay and silt (CCS), (ii) predominantly cohesive clay and silt with layers of cohesionless silt, fine sand, sand, and gravel (CCS-CSGCSF), (iii) predominantly cohesionless silt, fine sand, sand, and gravel with layers of cohesive clay and silt (CSGCSF-CCS), (iv) predominantly cohesionless silt, fine sand, sand, and gravel with layers of till/till-like deposits (CSGCSF-TLTD), and (v) mix of cohesionless silt, fine sand, sand, and gravel; cohesive clay and silt; till/till-like deposits (TLTD-CSGCSF-CCS).

Model Setup
First, a simple BN graph was constructed using the structure learning algorithm to model the relationship among ground conditioning volumes, i.e., foam volume, foam liquid (solution) volume, and foam agent (surfactant) volume. This was performed to demonstrate the capability of BNSL to model a true physical mechanism that can be easily validated.
Subsequently, more complex BNSL graphs were constructed to model interactions among EPBM excavation features. To evaluate the effect of providing explicit information about geologic conditions, the graph models were constructed with and without soil labels included as a feature. To evaluate the stability and the change of feature interactions along the tunnel alignment, the graphs were sequentially re-trained for each ring chainage. Meaning that a graph was constructed at each ring chainage based on data of all the previous ring chainages.
No prescribed graph structure was provided to the algorithm. The strength of the node connections (links) and the directions was ensured probabilistically by repeating each graph construction 200 times.

Ground Conditioning System
A simple BN graph was constructed to examine the capability of the structure learning algorithm to model the true physical mechanism of ground conditioning volumes. The graph was constructed by the algorithm without any prescribed network skeleton. Figure 1 shows the constructed BN graph at the end of chainage (using all observations in the dataset). This figure shows that BNSL successfully modeled the true physical mechanism of the ground conditioning volumes. The graph shows that the foam volume had conditional dependencies on the foam liquid volume and the air volume. And the foam liquid volume had a conditional dependency on the foam agent volume. This represents the true ground conditioning physical mechanism where foam is a product of foam liquid plus injected air, and foam liquid is a product of foam agent plus water. Figure 1 Interactions among features related to ground conditioning volumes constructed by the structure learning algorithm. Red links denote true relationship.
To evaluate the stability of these interactions, the graph was sequentially re-trained for each ring chainage location using all the previous ring chainages. Normalized data (for visualization) of the ground conditioning volumes, probabilistic strength of the network links, and probabilistic direction of the links along the tunnel alignment are shown in Figure 2(a), (b), and (c), respectively. Probabilistic strength of 1 means a link was always connected in 200 repetitions of graph construction. Probabilistic direction of 1 and 0 means a link always had direction as shown in Figure 1 and always had its reverse direction, respectively, in the repetitions.
This figure shows that the links were consistently connected along the tunnel alignment. This indicates the robustness of this method in capturing the interactions among these features, even when the data distributions were altered as the data were continuously added during tunneling. However, the probabilistic direction of the links shows more mixed results. The figure shows that the link directions could be switched. It seems the algorithm could not decide which feature was the parent or the child node since the foam agent volume can be estimated based on foam liquid volume, but foam liquid volume can also be estimated if the foam agent volume is known. This might be due to the score equivalence of the graph (Koller and Friedman 2009) and further investigation should be perform to draw a stronger conclusion (Chickering 2002).

Interactions among Excavation Features
More complex BN graphs were constructed to model the interactions among EPBM excavation features. Similar to the ground conditioning system model, the graphs were constructed by the structure learning algorithm without any prescribed network skeleton.  Figure 3 shows two samples of BN graphs that model the interactions among EPBM excavation features at ring chainage of 23000 ft (predominantly cohesive soils and till deposits) and 28000 ft (predominantly cohesionless soils). To observe more possible interactions among the features, probabilistic strength of 0.5 was used as the link threshold (the link appeared at least in 50% of the 200graph construction repetition).

Bayesian Network Graphs
Both graphs share several identical results. First, both graphs show that the thrust stroke was independent of other features. This result is sensible since thrust strokes are typically pushed to their length capacity according to the subsequent tunnel lining width and therefore should not have any dependencies on excavation conditions. Second, both graphs show that the penetration rate had conditional dependencies on the advance speed and the cutter rotation speed. This constructed interaction successfully models the true mechanism since in this case, the penetration rate (mm/rotation) was computed as the advance speed (mm/min) divided by the cutter rotation speed (rpm).
Both graphs also show an interaction between the foam volume and the penetration rate, as well as between the foam volume and the advance speed. This is sensible since some studies have shown that ground conditioning can be associated with tunneling excavation performance Roby and Willis 2014). Furthermore, to maintain the expected foam injection ratio (FIR) during each excavation step, foam flow is typically regulated automatically to correspond to the advance speed (Maidl et al. 2013).
Besides the above-mentioned results, both graphs show that interactions among the features might change during tunneling. The graph at chainage 23000 ft shows that the chamber pressure had a conditional dependency on the advance speed, whereas no noticeable link to the screw rotation speed. In contrast, the graph at chainage 28000 ft shows that the screw rotation speed had a conditional dependency on the chamber pressure, whereas no noticeable link between the chamber pressure and the advance speed.
Furthermore, the graph at chainage 23000 ft shows that speed features (i.e., cutter rotation speed and advance speed) were the root parents of the interactions. In contrast, at chainage 28000 ft, it seems force features (i.e., cutter torque, thrust force, and cutter head force) were the root parents. Typically, advance speed and cutter rotation speed are the key excavation control parameters used by EPBM operators. After specifying these control parameters, the operators need to check whether the produced forces and torques are still within the tolerable limits. Therefore, these graphs may indicate that at chainage 23000 ft (predominantly cohesive soils), the operators controlled the speeds without any significant intervention was taken regarding the forces and torques limits. In contrast, at chainage 28000 ft (predominantly cohesionless soils and till deposits), the operators had to keep adjusting the speeds according to the forces and torques limits.

Effects of Soil Labels
To evaluate the effect of providing explicit information about geologic conditions, the BN graphs were also constructed with soil labels included as a feature. Figure 4 shows two samples of BN graphs that model the interactions among EPBM excavation features with soil labels included as a feature. Similar to the previous graphs, probabilistic strength of 0.5 was used as the link threshold. To provide a visualization of the geologic condition along the tunnel alignment , the geologic map and the borehole information of SR99 tunnel are shown in Figure 5 (WSDOT 2010a; b).
By incorporating the soil labels, the dataset contained both numerical and categorical variables. In a conditional linear gaussian BN model with a mixed variable dataset, the categorical features will always be the parent nodes for the numerical features (Scutari and Denis 2015). Hence, the soil label node was the parent node of almost all features on these graphs.
Nevertheless, several key interactions in these graphs were similar to the previous graphs without the soil labels. The thrust stroke was independent of other features. The penetration rate always had conditional dependencies on the advance speed and the cutter rotation speed. The foam volume always interacted with the penetration rate. At chainage 23000 ft, the chamber pressure interacted with advance speed, but not with the screw rotation speed. At chainage 28000 ft, the screw rotation speed had a conditional dependency on the chamber pressure, but in this case, the chamber pressure still interacted with the advance speed.
In general, the graphs with soil labels produced fewer interactions between speed features (i.e., cutter rotation speed and advance speed) and force features (i.e., cutter torque, thrust force, and cutter head force). In contrast, features in the graphs without soil labels had more dependencies to force features. This might indicate that the information about the geologic conditions (i.e., the soil types) was stored and represented by these force-related "reaction" features.

Feature Interactions along Tunnel Alignment
To evaluate the stability and the change of feature interactions along the tunnel alignment, the BN excavation graphs were sequentially retrained for each ring chainage location using all the previous ring chainages. Figure 6 shows the interactions among features related to the penetration rate along the tunnel alignment. This figure shows that along the tunnel alignment, links between the advance speed and the penetration rate, as well as between the cutter rotation speed and the penetration rate, were consistently strong. The direction of the links was also consistent. These interactions successfully modeled the true mechanism of penetration rate interactions along the tunnel alignment.

Penetration Rate
The interaction between the penetration rate and the foam volume was also consistent along the tunnel alignment. This interaction was stronger especially in the end half of the alignment. However, the link direction was inconsistent. Again, this might be due to the score equivalence of the graph. Or this might indicate the tendency on how the operators controlled the advance speed and the injected foam volume. As previously discussed, ground conditioning can be associated with tunneling excavation performance. Injected foam volumes may affect the excavation performance Roby and Willis 2014). Reciprocally, engineers and operators estimate the injected foam volume for the subsequent excavation step based on the previous excavation performance. further investigations should be carried out to obtain more conclusive results on this interaction.

Chamber Pressure
During tunneling, chamber pressures must be regulated to balance the earth pressure. Typically, a target earth pressure is specified according to the overburden pressure and the estimated lateral earth pressure coefficient at a chainage location. Operators regulate the chamber pressure by adjusting the advance speed, which dictates the speed of the chamber to contract or expand, and/or adjusting the screw rotation speed, which dictates the depressurization of the chamber pressure (JSCE 2016;Maidl et al. 2013). Figure 7 shows the interactions among features related to the chamber pressure along the tunnel alignment. This figure shows that the link between the advance speed and the chamber pressure was stronger during the early half of the tunnel alignment. During this period, the direction of the link was mostly consistent. In contrast, the link between the chamber pressure and the screw rotation speed was stronger during the end half of the tunnel alignment. The direction of the link was also relatively consistent during this period.
This graph might reveal how the chamber pressure was regulated by the operators during tunneling. These results might indicate that no significant intervention on the screw rotation speed was required to regulate the chamber pressure during the early half of the tunnel alignment when the geology was dominated by predominantly   cohesive soils. In contrast, more intervention on the screw rotation speed was required to regulate the chamber pressure during the end half of the tunnel alignment when the geology was dominated by mixtures of cohesionless soils and till/till-like deposits. As another evidence, it can be observed in Figure 7(a) that the screw rotation speeds were highly fluctuated during the end half of the tunneling, which indicates stronger control by the operators.

CONCLUSIONS
This study has demonstrated that BN graphs can potentially be utilized to systematically model the interactions among EPBM features in a compact and interpretable representation. This study has also shown that the score based BNSL algorithm could successfully capture several true and sensible mechanisms of the feature interactions based on data, both with and without soil labels. Furthermore, causal relationships between features could be exploited by examining their dependencies and therefore might provide some indication on how the operators controlled the EPBM.
Detailed investigations should be carried out to obtain more conclusive results about the interactions, e.g., by introducing more features to add more complexity, by utilizing probability distribution in a finer spatial distance to capture detailed operator decisions, and by implementing dynamic BNSL to explicitly capture the sequential decision-making processes. 6.