Machine learning for robust structural uncertainty quantification in fractured reservoirs

Including uncertainty is essential for accurate decision-making in underground applications. We propose a novel approach to consider structural uncertainty in two enhanced geothermal systems (EGSs) using machine learning (ML) models. The results of numerical simulations show that a small change in the structural model can cause a significant variation in the tracer breakthrough curves (BTCs). To develop a more robust method for including structural uncertainty, we train three different ML models: decision tree regression (DTR), random forest regression (RFR), and gradient boosting regression (GBR). DTR and RFR predict the entire BTC at once, but they are susceptible to overfitting and underfitting. In contrast, GBR predicts each time step of the BTC as a separate target variable, considering the possible correlation between consecutive time steps. This approach is implemented using a chain of regression models. The chain model achieves an acceptable increase in RMSE from train to test data, confirming its ability to capture both the general trend and small-scale heterogeneities of the BTCs. Additionally, using the ML model instead of the numerical solver reduces the computational time by six orders of magnitude. This time efficiency allows us to calculate BTCs for 2′000 different reservoir models, enabling a more comprehensive structural uncertainty quantification for EGS cases. The chain model is particularly promising, as it is robust to overfitting and underfitting and can generate BTCs for a large number of structural models efficiently.


Introduction
Numerical simulations of physical systems described by differential equations are essential in engineering. Advancements in hardware have enabled computing units to solve coupled nonlinear differential equations, encompassing a wide range of phenomena, from weather forecasting (Bauer et al., 2015) to blood circulation in living bodies (Doost et al., 2016). However, these methods are computationally intensive and highly sensitive to the specific case. Besides the huge energy consumption of the required computational infrastructure (Benoit et al., 2018), its availability is also limited. Furthermore, parameter tuning, sensitivity analysis (Borgonovo and Plischke, 2016), and uncertainty quantification (Abbaszadeh Shahri et al., 2022; Soize, 2017) demand up to millions of simulations.
One of the challenges in geothermal applications is characterizing fluid flow through complex underground networks. While the geometry of a fracture can define the general direction of flow, the local variation of petrophysical properties impacts the specific pathways (Meakin and Tartakovsky, 2009). The enhanced geothermal system (EGS), as an engineered underground reservoir, strongly relies on high flow rate circulation through the impermeable matrix. To enhance the reservoir's permeability, cold fracturing fluid is injected to create new fractures or reopen pre-existing ones (e.g. Kohl and Mégel, 2007). Hence, a complex underground fracture/flow pattern can be observed in any EGS example, such as the model presented by Egert et al. (2020).
Integrating local data from wells with field measurements such as tracer tests (Cao et al., 2020) can provide insights into the state of an EGS. Tracer test campaigns usually yield breakthrough curves (BTCs), which are widely used to extract properties of the porous media and fracture network. However, each measuring method is error-prone, resulting in inherent uncertainty (Bond, 2015; Wellmann et al., 2010). Therefore, incorporating structural uncertainties in numerical simulations of EGS settings makes the flow forecast more realistic (Zhou et al., 2022). This study proposes to replace computationally demanding simulations with fast ML models to quantify the structural uncertainty derived from tracer data in two different EGS settings. Using state-of-the-art ML methods such as decision tree regression (DTR), random forest regression (RFR), and gradient boosting regression (GBR), a large number of BTCs is generated on top of the purely numerical, time-consuming simulations. We train ML models to map geometric data of the uncertain fractures of the EGS reservoir to the simulated BTC. The position of the varying structural elements is used as the input feature, and the entire BTC is chosen as the target variable. The proposed ML models correlate the entire BTC with the input features, rather than using a time window to predict the future.

Tracer models
Tracer flow is simulated for two different cases in this study. The conceptual model introduced by Dashti et al. (2023) is used as the first case. It is called the 'simple case' because it is a highly simplified version of an EGS with a doublet configuration. The conceptual model contains two main transmissive/open faults that are connected to an injection and a production well. An additional sub-horizontal fault/fracture structure connects the major faults at greater depth. However, data related to this structure are subject to uncertainty since this fault is located far from the drilling trajectory, and its existence as a conduit is confirmed only by additional geophysical surveys or hydraulic testing. Fig. 1 shows a schematic view of the simple case.
To comprehensively evaluate the performance of the ML methods, a second, more intricate fracture network model (named the 'complex case') was developed (Fig. 2). The complex case incorporates seven fractures, two of which are designated as uncertain. The impact of varying the depth and dip angle of these two fractures on tracer flow was assessed through 100 scenarios. All scenarios shared identical material properties, while the dip and depth of the uncertain fractures were varied. The modelling assumptions of the complex case are similar to those of the simple case, which are addressed in Dashti et al. (2023).

Machine learning model
The ML model in this study predicts the tracer concentrations over time, i.e. the BTCs, for two cases. Time series estimation for different applications is a well-documented topic (Gudmundsdottir and Horne, 2020; Weigend and Gershenfeld, 1994). For example, Alakeely and Horne (2020) introduced a recurrent neural network to predict the future by incorporating historical data. Such methods predict the system's long-term performance based on a moderate duration of monitoring data. In contrast, our study predicts the entire time series, making the ML models applicable to cases without any historical data.
Due to the nature of the problem, two different strategies are developed.
• Strategy 1: Two ML models, DTR and RFR, are trained to independently predict the tracer concentration values. Both models predict all time steps of the BTC at once from the input features. In this study, DTR and RFR correlate structural information of the geological model with the tracer concentration. While DTR trains a single tree to capture the relation between the input features and the target variable, RFR grows several trees in parallel (bagging). DTR is simple to implement and interpret, but it can be prone to overfitting. Therefore, the more complex RFR is also included in this study. The mathematical foundations of DTR and RFR are well documented in the literature, e.g. Kotsiantis (2013), Liu et al. (2012) and Xu et al. (2005).
• Strategy 2: A GBR model is used to predict the concentration value at each time step by correlating it with the previous predictions. GBR is an ensemble method that combines multiple simple, weak learners sequentially (boosting) to improve the overall performance of the model. This approach, denoted as the chain model, requires GBR to be executed for each time step of the BTC. Details of this approach are elaborated in the following.
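Strategy 1 can be illustrated with a minimal sketch, assuming scikit-learn as the library (the text does not name one) and synthetic stand-in data instead of the simulated BTCs. The array sizes (50 scenarios, 8 coordinates, 169 time steps) follow the simple case; the Gaussian-shaped curves and all variable names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the simulations of the study):
# 50 scenarios, 8 fracture-corner coordinates, 169 time steps per BTC.
n_scenarios, n_coords, n_steps = 50, 8, 169
X = rng.uniform(0.0, 1.0, size=(n_scenarios, n_coords))
t = np.linspace(0.0, 1.0, n_steps)
peak_time = 0.3 + 0.2 * X[:, :1]            # peak position depends on the geometry
Y = np.exp(-((t - peak_time) ** 2) / 0.01)  # one curve per scenario, shape (50, 169)

# Both regressors accept a 2-D target, so the whole BTC is predicted at once.
dtr = DecisionTreeRegressor(random_state=0).fit(X, Y)
rfr = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, Y)

X_new = rng.uniform(0.0, 1.0, size=(3, n_coords))
btc_dtr = dtr.predict(X_new)
btc_rfr = rfr.predict(X_new)
print(btc_dtr.shape, btc_rfr.shape)
```

Each row of the predicted arrays is one complete BTC, which is why the input feature list of these two models never changes.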

Chain GBR model
Fig. 3 provides an overview of the chain regression model for the simple case. A BTC, serving as the target variable, is presented in Fig. 3-a. The input features are the structural geometric information from the reservoir model, namely the coordinates of the four corners of the uncertain sub-horizontal fracture (P1, P2, P3, and P4 in Fig. 3-b). The model correlates the x/z coordinates with the BTC concentration values; the y-coordinates remain fixed across all scenarios for the sake of simplicity. All the governing equations and modelling assumptions behind the calculation of the BTCs are fully addressed in Dashti et al. (2023). For the complex case, the coordinates of the two uncertain fracture surfaces are used as the input features, while the BTC data are the target variables.
The chain model predicts the BTC concentration values in a sequential manner. It starts by predicting the concentration for the first time step (C1) based on the input features (Fig. 3-b and c). For the second time step (C2) and all following ones, the model uses the previously predicted values, i.e. C1 in the case of C2, along with the input features. Some errors can exist in the C1 predicted by GBR. However, to predict C2, the input feature list still contains eight coordinate values that have a higher impact than the recently predicted C1. This gradual addition of predicted values can help the chain model to adjust the weight of the added features, i.e. the previously predicted concentrations. Fig. 3-c illustrates how concentration values from previous steps are concatenated to the input feature list. To predict the first concentration value (C1) in the GBR chain model, the input feature list contains eight values. To predict the concentration for the last time step of the simple case (C169), the input feature list contains eight coordinates and 168 previously predicted concentration values. In the complex case, the BTC includes 140 concentration values. The input feature list of the DTR and RFR models remains fixed, because these two methods predict all the time steps of the BTC merely based on the coordinates of the fractures.
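The growing feature list can be sketched as a hand-written chain, assuming scikit-learn's `GradientBoostingRegressor` as the per-step predictor. The helper names `fit_chain` and `predict_chain`, the synthetic data, and the reduced sizes (12 time steps, 20 trees) are illustrative assumptions chosen to keep the example fast; they are not the configuration of the study.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor


def fit_chain(X, Y, **gbr_kwargs):
    """Fit one GBR per time step; step k sees the coordinates plus C1..C(k-1)."""
    models, features = [], X
    for step in range(Y.shape[1]):
        model = GradientBoostingRegressor(**gbr_kwargs).fit(features, Y[:, step])
        models.append(model)
        # Append this step's *predicted* concentration as a new input feature.
        features = np.column_stack([features, model.predict(features)])
    return models


def predict_chain(models, X):
    """Roll the chain forward, feeding each prediction into the next model."""
    features, predictions = X, []
    for model in models:
        c_step = model.predict(features)
        predictions.append(c_step)
        features = np.column_stack([features, c_step])
    return np.column_stack(predictions)


# Synthetic stand-in data: 50 scenarios, 8 coordinates, 12 time steps.
rng = np.random.default_rng(0)
n_scenarios, n_coords, n_steps = 50, 8, 12
X = rng.uniform(size=(n_scenarios, n_coords))
t = np.linspace(0.0, 1.0, n_steps)
Y = np.exp(-((t - (0.3 + 0.2 * X[:, :1])) ** 2) / 0.02)

models = fit_chain(X, Y, n_estimators=20, random_state=0)
Y_hat = predict_chain(models, X[:5])
print(Y_hat.shape, models[0].n_features_in_, models[-1].n_features_in_)
```

With 169 time steps instead of 12, the last model in such a chain would receive 8 + 168 = 176 inputs, matching the count given in the text.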
The GBR algorithm (Friedman, 2002) is selected due to its simplicity, boosting nature, and efficiency as a predictor for the chain model. Like other supervised ML algorithms (Gupta, 2022), GBR learns a function that maps the input feature/s (x) to the target variable/s (y) with the minimum loss:
$$\hat{f} = \underset{f}{\arg\min}\; L\big(y, f(x)\big)$$
where L is the loss function and f(x) represents the prediction. The loss function is chosen based on the type of learning (e.g., regression, classification) and the type of the target variable (e.g., discrete, continuous). Squared error (L2) loss (Bühlmann and Yu, 2003) is a simple and efficient loss function when outliers are not expected and is hence chosen here:
$$L\big(y, f(x)\big) = \big(y - f(x)\big)^2$$
ML methods that generate an ensemble of predictive models in parallel (bagging methods like RFR) or sequentially (boosting methods like GBR) are more reliable than models consisting of a single strong predictive model (like DTR) (Fanelli et al., 2013; Shu and Burn, 2004). Boosting methods like GBR can perform better on small data sets than bagging methods, which distribute the data set between different predictors. GBR starts with a very simple model, $F_0(x)$, that fits a straight horizontal line (the average of the target variable). In fact, setting the derivative of the loss function with respect to the prediction to zero establishes the average value as the best guess for the first tree. In the next round, the GBR algorithm maps the input features to the residuals (remaining errors) of the previous tree, a process that can be interpreted as gradient descent, since for the squared error loss the residuals are proportional to the negative derivative of the loss with respect to the prediction (Breiman, 1998). The use of residuals rather than absolute values is another reason for choosing GBR. This allows the residuals caused by recently predicted, error-prone concentration values to be included in the model. In subsequent rounds, new decision trees are trained on the accumulated residuals of the whole ensemble (Schapire et al., 2003):
$$F_m(x) = F_{m-1}(x) + \alpha f_m(x)$$
where $F_m(x)$ represents the final general function that connects the input features to the target variable, $F_{m-1}(x)$ contains the information from all previous trees, α is the learning rate that helps to avoid overfitting, and $f_m(x)$ represents the latest tree, which correlates the remaining residuals with the input features. Low learning rates decrease the impact of each tree, i.e., more trees will be needed but the model will also generalize better. GBR minimizes the error of each tree and uses the remaining errors as the target variable of the next tree. In this way, the model is trained based on its minimized errors and aggregates several trees with decreasing errors. He et al. (2022) delved into the details of GBR.
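The boosting loop described here can be written out by hand in a few lines. The following is a minimal sketch for the squared error loss, not the implementation used in the study: it starts from the mean of the target, fits each new tree to the current residuals, and adds the tree's output scaled by a learning rate `alpha`. The data are synthetic, and the settings (learning rate 0.1, 100 trees, depth 3) are chosen to mirror scikit-learn's defaults for `GradientBoostingRegressor`.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression problem (illustrative only).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2

alpha, n_trees = 0.1, 100

# Initial model: the constant that minimizes the squared error is the mean of y.
F = np.full_like(y, y.mean())
mse_history = [np.mean((y - F) ** 2)]

trees = []
for m in range(n_trees):
    residuals = y - F  # proportional to the negative gradient of (y - F)^2
    tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, residuals)
    F = F + alpha * tree.predict(X)  # update: previous ensemble + alpha * new tree
    trees.append(tree)
    mse_history.append(np.mean((y - F) ** 2))

print(f"training MSE: {mse_history[0]:.4f} -> {mse_history[-1]:.4f}")
```

Because each tree predicts the mean residual of its leaves, every update with a learning rate between 0 and 1 cannot increase the training error; a smaller `alpha` simply needs more trees to reach the same error.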

ML model optimization and quality control
Each ML model has two types of arguments: 1) input arguments, which include hyperparameters (parameters related to the model's architecture) and the features selected by the user for predicting the target variable/s, and 2) output arguments, which consist of internal weights and the target variable/s. The ML model is trained to minimize the error by tuning its input arguments, allowing the learning algorithm to optimize the output arguments and achieve better scores on the withheld test set (Alpaydin, 2020; Hutter et al., 2019). This iterative process, known as hyperparameter tuning (Raschka and Mirjalili, 2019), involves optimizing parameters such as the learning rate, the number of trees, and the maximum depth of trees to decrease the error. Determining the optimal number of trees poses a challenge due to the bias-variance trade-off (Oshiro et al., 2012; Probst et al., 2019). Another hyperparameter, the maximum depth of a tree, is defined as the longest path between the root node (first node) and a leaf node (last node).
Grid search is a hyperparameter tuning method that allows input arguments to be defined as a range rather than a single value. It performs an exhaustive search over all possible combinations of values to identify the model with the lowest error, i.e. the highest score. For the RFR model, the number of trees and the maximum depth are defined as arrays with 20 and 10 elements, respectively, which results in 200 combinations. For the DTR model, the maximum depth of the tree, the minimum number of samples in a leaf node, and the minimum number of samples for splitting an internal node are tuned. In total, an ensemble of 540 models has been calculated using hyperparameter tuning for the DTR method. In the GBR algorithm of the chain model, default values are used. Conventionally, higher score values are preferred over lower ones; therefore, the grid search looks for the combination with the maximum negated mean squared error (MSE).
To evaluate the model's performance, k-fold cross-validation (Zhang et al., 1999) has been employed. Rather than splitting the input data once into a train and a test set, it randomly splits them into k "splits". The ML model then keeps one split as the test set and uses all others as the train set. With five splits, five models are run, and in each run a different split is held out as the test set. This five-run procedure is performed for all 200 combinations of hyperparameters in the grid search for the RFR method. It therefore creates 1′000 ML models, each of them being an ensemble of individual trees, and the ensemble with the highest score is used for the final prediction. In this study, we follow the recommendations in the literature (An et al., 2007; Erdogan Erten et al., 2022) and use five splits for cross-validation for all three methods. The training (offline) time for the 1′000 ML models of the RFR method on a Core i7 laptop is approximately 10 s. For DTR, with an ensemble of 2′700 ML models, the training time also remains around 10 s. The simplicity of DTR compared to RFR results in faster computation per model. The chain model proves to be the most time-consuming approach, taking around 70 s for training without any hyperparameter optimization. Several hyperparameters were tested for the chain model, but the training time only increased without improving the model's accuracy. Therefore, default values were chosen for the chain model. For both the simple and complex cases, several values have been tried for the learning rate in hyperparameter tuning, but in the end the default one (0.1) has been used. The time required for predicting a new solution with the trained models (online time) remains in the range of milliseconds. To access the input data and trained ML models of the two cases, please refer to the code and data availability section.
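The combination of grid search and five-fold cross-validation can be sketched with scikit-learn's `GridSearchCV`, assuming that library and synthetic data. The grid is deliberately reduced to 3 × 2 values so that the example runs quickly; the study reports 20 × 10 values for RFR, and the value ranges below are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in data: 50 scenarios, 8 coordinates, 2 target values.
rng = np.random.default_rng(2)
X = rng.uniform(size=(50, 8))
Y = np.column_stack([X.sum(axis=1), X[:, 0] * X[:, 1]])

param_grid = {
    "n_estimators": [10, 20, 30],  # reduced; the study uses an array of 20 values
    "max_depth": [2, 4],           # reduced; the study uses an array of 10 values
}
cv = KFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_squared_error",  # higher (closer to zero) is better
    cv=cv,
    return_train_score=True,
)
search.fit(X, Y)

n_combinations = len(search.cv_results_["params"])
n_fits = n_combinations * cv.get_n_splits()
print(n_combinations, n_fits, search.best_params_, search.best_score_)
```

The same counting gives the numbers quoted in the text: 200 combinations × 5 splits = 1′000 fitted RFR models, and 540 × 5 = 2′700 for DTR.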
Fig. 4-a and b show the distribution of the negative MSE scoring metric in the train and test splits as a function of the two tuned hyperparameters of the RFR model in the simple case. The average of the negative MSE over the four train splits is presented in Fig. 4-a. The distribution of the average scoring metric in the train splits is influenced by both the number and the maximum depth of trees. Based on Fig. 4-a, the accuracy of the model increases as both the maximum depth of trees and the number of trees increase. However, the score distribution in the test split (Fig. 4-b) is more complicated. The scores in the test split are generally lower than those in the train splits (−0.04 to −0.004 versus −0.018 to −0.003). While the score distribution for the train splits promises a more accurate model as the two hyperparameters increase, the heterogeneous distribution in Fig. 4-b raises doubts about this conclusion. The example in Fig. 4 shows that determining the optimal combination, even for only two hyperparameters, is not a straightforward task. Going to higher dimensions can make the situation even more complicated and impractical to resolve by inspection. Therefore, methods like grid search are used to identify the best combination of tuned hyperparameters.

Simple case
Dashti et al. (2023) employed numerical simulations to assess the effects of uncertainty in structural models using 50 different structural scenarios in a simplified EGS setting. In these synthetic models, a 24-hour tracer injection on day eight of the simulation was assumed and monitored over a one-year time span in the production well (see Fig. 5-a, where e.g. the peak concentration time varies between days 54 and 68). To better present the variations, a box plot (Fig. 5-b) is generated by extracting the highest concentration value from each BTC and normalizing these values by their median. The variation of the tracer peak concentration time, as well as a 25 % fluctuation in peak magnitude, emphasizes the significance of structural uncertainty, which can introduce unexpected deviations in the results of important field tests.
The appearance of a second peak between days 100 and 150 in Fig. 5-a is due to the reinjection of the tracer, not multiple flow paths or stagnation zones. The first 30 days of the simulation are disregarded due to the negligible (almost zero) concentration of the tracer in the production well during that period.
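The peak extraction behind the box plot can be sketched with NumPy. The helper name `peak_summary` and the three synthetic curves are hypothetical; only the procedure (take the maximum of each BTC, record its time, normalize the maxima by their median) follows the description above.

```python
import numpy as np


def peak_summary(times, btcs):
    """Return the peak time and the median-normalized peak value of each BTC."""
    peak_idx = btcs.argmax(axis=1)
    peak_values = btcs.max(axis=1)
    return times[peak_idx], peak_values / np.median(peak_values)


# Three synthetic curves sampled daily from day 30 onward (illustrative only).
times = np.arange(30, 366, dtype=float)
centers = np.array([54.0, 60.0, 68.0])  # hypothetical peak days
heights = np.array([0.9, 1.0, 1.1])     # hypothetical peak concentrations
btcs = heights[:, None] * np.exp(-((times - centers[:, None]) ** 2) / 50.0)

peak_times, normalized_peaks = peak_summary(times, btcs)
print(peak_times, normalized_peaks)
```

The two returned arrays are what a box plot of normalized peak concentration versus peak time would be built from.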
Results of the k-fold cross-validation in Fig. 6 show how the RMSE varies across the five splits for the three ML methods. The average RMSE of the chain model is lower than that of DTR and RFR. Apart from the higher absolute accuracy, the homogeneity of the model's performance is another important factor to consider. Based on Fig. 6, the RMSE values of the DTR model show a higher standard deviation, which suggests that it is overfitting the training data. Overfitting occurs when a model learns the training data too well and is unable to generalize to new data. In the case of the DTR model, this may be due to the fact that it is a single-tree model and hence more likely to memorize the training data than an ensemble model like RFR or the chain model. In this study, the simplicity of the DTR model is the main factor leading to overfitting issues. The RFR model mitigates overfitting by growing multiple parallel trees among which the input data are distributed. The chain model also incorporates several sequential models that consistently outperform a single model. Overall, the chain model is the most accurate and robust ML model for predicting BTCs in cases without any historical data: it has a lower average RMSE and a lower standard deviation of RMSE than the RFR and DTR models.
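A comparison of this kind, mean and standard deviation of the RMSE over five splits, can be sketched with scikit-learn's `cross_val_score`. The data are synthetic and only DTR and RFR are included for brevity, so the printed numbers say nothing about the results of the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: 50 scenarios, 8 coordinates, 40 time steps.
rng = np.random.default_rng(3)
X = rng.uniform(size=(50, 8))
t = np.linspace(0.0, 1.0, 40)
Y = np.exp(-((t - (0.3 + 0.3 * X[:, :1])) ** 2) / 0.02)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
candidates = {
    "DTR": DecisionTreeRegressor(random_state=0),
    "RFR": RandomForestRegressor(n_estimators=50, random_state=0),
}

summary = {}
for name, model in candidates.items():
    # The scorer returns the negated RMSE, so flip the sign back.
    rmse = -cross_val_score(model, X, Y, cv=cv, scoring="neg_root_mean_squared_error")
    summary[name] = (rmse.mean(), rmse.std())
    print(f"{name}: mean RMSE = {rmse.mean():.4f}, std = {rmse.std():.4f}")
```

A large spread between splits, relative to the mean, is the kind of signal that the text interprets as a lack of robustness.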
To better assess the trained models and prevent information leakage, two additional scenarios are imported into the three ML models. The trained ML models are then utilized to predict the BTCs of these two new test scenarios. Table 1 presents the accumulated RMSEs of these two test scenarios (test set) and of the models' input data (train set). The ML models exhibit an increase in error when transitioning from train to test scenarios. However, even for the two new test scenarios, the RMSE remains at an acceptable level. The DTR model has the largest difference in RMSE between the train and test sets, which clearly indicates overfitting. The RFR and the chain models yield a better balance in terms of RMSE between the train and test data, suggesting their improved performance and ability to generalize. Fig. 7 shows the numerically simulated BTCs of the two test scenarios and the outputs of the three ML methods. For one of the test scenarios, all three ML methods achieved similar and reliable results compared to the simulation results. For the other test scenario, the DTR method was less accurate than the other methods, likely due to overfitting. The RFR and chain models had similar levels of accuracy.
To further evaluate the trained models, an additional set of 2′000 different structural scenarios is generated and imported into the ML models. In this step, only the connecting fault is perturbed, and the coordinates of its four corners are passed to the three ML models. Fig. 8 provides a visualization of the BTCs generated by the three ML models. The 6′000 BTCs presented in Fig. 8 are calculated on the scale of milliseconds using the DTR, RFR, and chain models. Two extreme cases from the training data are highlighted in blue with dots to illustrate the expected bounds. The RFR method perfectly follows the trend, generating 2′000 almost unique and parallel BTCs (Fig. 8-a), which suggests that it may be underfitting the training data. Underfit models have a high bias due to oversimplification and ignore the underlying patterns in the train data. This problem can directly originate from the insufficient input data used to train the RFR model. The bagging procedure of RFR splits the 50 input data sets into parallel bags, making it difficult for each tree to be a balanced predictor. On the other hand, DTR has generated far fewer unique BTCs, as shown in Fig. 8-b. The covered area with BTC curves in Fig. 8-a [...] are generated, and the majority of the 2′000 BTCs overlap the 50 BTCs used in the training step.
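Why a single tree mostly reproduces training curves while a forest produces many distinct ones can be checked with a small sketch on synthetic data (assuming scikit-learn; the helper `count_unique_curves` is hypothetical). A regression tree can only return the values stored in its leaves, and it has at most as many leaves as training samples, so 2′000 new scenarios cannot yield more than 50 distinct curves.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for 50 training scenarios with 40 time steps each.
rng = np.random.default_rng(4)
X_train = rng.uniform(size=(50, 8))
t = np.linspace(0.0, 1.0, 40)
Y_train = np.exp(-((t - (0.3 + 0.3 * X_train[:, :1])) ** 2) / 0.02)

dtr = DecisionTreeRegressor(random_state=0).fit(X_train, Y_train)
rfr = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_train, Y_train)

# 2000 new, randomly perturbed geometries.
X_new = rng.uniform(size=(2000, 8))


def count_unique_curves(curves):
    """Count distinct predicted curves (rounded to suppress float noise)."""
    return np.unique(np.round(curves, 12), axis=0).shape[0]


n_unique_dtr = count_unique_curves(dtr.predict(X_new))
n_unique_rfr = count_unique_curves(rfr.predict(X_new))
print(f"unique curves: DTR = {n_unique_dtr}, RFR = {n_unique_rfr}")
```

The forest averages the leaf values of many trees, so different inputs usually land on different leaf combinations and the number of distinct curves is typically much larger.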
The chain model consistently generated more reliable BTCs compared to RFR and DTR (Fig. 8-c). However, in some cases, the chain model generated BTCs with irregular patterns, such as concentration values fluctuating around the peak. Despite these local discrepancies, the chain model is still the most reliable ML model for predicting BTCs.
Another notable point is that none of the three data-driven ML methods can be used for extrapolation. In all three subplots of Fig. 8, the density of generated BTCs decreases close to the extreme cases. This issue is most pronounced for the DTR method, while the chain model has generated more BTCs in the vicinity of the extreme cases.

Complex case
For the complex case, 100 BTCs are simulated in the numerical solver and used to train and test the three ML models. The number of scenarios has increased compared to the simple case (with 50 simulations) due to the complexity of the model. In the complex case, a 24-hour tracer injection on day five of the simulation is assumed and monitored for two [...] Fig. 11 shows the numerically simulated BTCs for 100 scenarios of the complex case. Similar to the simple case, the peak concentration [...] Two test scenarios of the ML methods are shown in Fig. 12. The RMSE values confirm the higher accuracy of the chain model. The cumulative RMSE for both scenarios is 0.05 ppm for the chain model, 0.18 ppm for DTR, and 0.16 ppm for RFR. Notably, all three ML models were employed with the same hyperparameters for both the simple and complex cases.

Discussion
Detailing the observed errors is crucial for future work aimed at improving the interpretability of the ML methods' performance. The (negligible) discrepancy likely stems from the distribution of the test scenarios and the size of the input data (50 and 100 scenarios). This finding underscores the sensitivity of data-driven models to the input data distribution. As extrapolation is a known challenge for such models, selecting a test sample near the boundary in this study exemplifies this limitation. A uniform high-density sampling strategy may prove more effective than the Gaussian distribution.
Even with large datasets, data-driven ML methods can still deviate from the underlying physics. Degen et al. (2023) proposed promising physics-based ML methods using order reduction techniques, e.g. the non-intrusive reduced basis method, to build the solution from basis functions that preserve the structure of the physics. In this study, we employed a sequence of concentration values as the target variable, allowing the ML models to learn the temporal relationships. The three tested ML methods have been able to capture the trend for two different cases. The current limitation is that the concentration prediction is restricted to a single point within the model. However, our strategy can be extended to develop ML models that predict the target variable at various points over time.
Meanwhile, the ML methods were significantly faster than the numerical solver, with up to six orders of magnitude reduction in computational time. To numerically solve the problem of the simple case, 12 cores on a high-performance computing cluster need to run for 4 h. The total time for constructing (offline) and applying (online) the ML models remains on the scale of seconds. This substantial reduction makes uncertainty analysis feasible using fast and reliable ML models, without relying on time-consuming simulations that typically span multiple days. This concept is also suited to including structural uncertainties in more complicated EGS settings with several intersecting fractures.
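As a rough plausibility check of the quoted speed-up, assume a representative ML prediction time of 10 ms per BTC (an assumption; the text only states "milliseconds") and compare wall-clock times only, ignoring the number of cores:

```latex
\[
\frac{t_\mathrm{sim}}{t_\mathrm{ML}}
  \approx \frac{4\,\mathrm{h} \times 3600\,\mathrm{s\,h^{-1}}}{10^{-2}\,\mathrm{s}}
  = 1.44 \times 10^{6},
\qquad
2000 \times 4\,\mathrm{h} = 8000\,\mathrm{h} \approx 333\ \mathrm{days}.
\]
```

Under this assumption the ratio is indeed about six orders of magnitude, and simulating all 2′000 scenarios one after another on the same 12 cores would take close to a year of wall-clock time.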

Conclusion and outlook
This study presents a novel approach for using ML methods to quantify the impact of structural uncertainty on BTCs in two EGS reservoirs. The approach is a first test of expanding the range of structural reservoir models using ML techniques, based on an original set of a limited number of numerical scenarios. This meets the specific requirement of uncertainty quantification, which is to provide a broad range of scenarios.
Different ML approaches are trained using the available numerical simulations to predict the BTCs based on the geometries of the perturbed elements. One ML approach used DTR and RFR algorithms to predict the entire BTC at once. Another ML approach employed a chain of GBR models to predict each time step of the BTCs while considering the correlation between consecutive time steps. The DTR model suffered from overfitting, while the RFR and chain models were more reliable, achieving an acceptable accuracy with a balanced accumulated RMSE in train and test scenarios. In the simple case, the RMSE of the DTR model jumped from 0.00011 to 0.15 between train and test scenarios, while for the RFR and chain models it rose from 0.0001 to 0.04 and from 0.00012 to 0.052, respectively.
The trained ML models are further applied to generate BTCs for 2′000 unique structural scenarios in the model with a simple geometry. The chain model was more accurate than the RFR and DTR models. The RFR method produced 2′000 BTCs that closely follow the trend observed in the training set, indicating an underfitting issue, whereas DTR can only replicate the BTCs from the training set. The chain model captures both the general trend and small-scale patterns of the data. However, the accuracy and reliability of all three methods decrease for test cases that are close to the boundaries of the input data. Uniform sampling for selecting the input data can help the ML methods to have a wide and homogeneous distribution in the test data.
The presented approach can be adopted for a broader range of forward calculation schemes. This opens up new possibilities for more complex fractured rock settings. Rather than the coordinates of one or two fractures, a more complex structural network from a real-world EGS case can be used as the input features for the ML methods.
While only structural models were varied herein to assess their impact on the BTCs, future applications could encompass modifications to specific petrophysical properties of the reservoir, further expanding the possibilities of stochasticity. Conversely, integrating more data into the model, such as BTCs or hydraulic testing data obtained from specific EGS well configurations (e.g. Schill et al., 2017), can reduce structural uncertainties. This allows for the rapid elimination of non-viable models using ML-driven routines.
Harnessing the computational efficiency of ML, this innovative approach can be transformed into a surrogate model, effectively representing the core of an inverse, backward calculation scheme for parameter identification. This transformation has the potential to replace conventional analytical solutions, which are currently the primary method for estimating parameters from tracer campaigns.
Fig. 1. A schematic view of the simple case. The two certain sub-vertical faults (Fault_Inj and Fault_Pro) are shown as continuous black lines, and the thinner green lines show traces of the uncertain sub-horizontal fault (Fault_Con). Each green trace makes a unique structural scenario.

Fig. 2. A complex EGS setting with seven fractures. The five certain fractures are shown as grey surfaces with varying shades and solid black borders, while the two uncertain fractures are highlighted via a thick red border and hatched infill. Two arrows show the location of the injection and production wells.

Fig. 3. Workflow developed for the chain GBR model. a) A BTC representing concentration values, C, versus a logarithmic time scale. b) The four corners of the sub-horizontal uncertain fault, P1, P2, P3, and P4, are used in the ML model to predict the first concentration value (C1) for the simple case. c) To predict the second concentration value (C2), the first predicted value (C1) is also included besides the coordinates of the four corners. In each time step, the previous values are added to the list of input features.

Fig. 4. Change in the accuracy of the ML model with respect to different combinations of two hyperparameters of the RFR model on the train (a) and test (b) splits. The accuracy distribution in the train split (a) is smooth, and higher accuracies can be achieved by increasing the number of trees and the maximum depth of each tree. Subplot b depicts the more patchy and anisotropic behavior of the accuracy with respect to the hyperparameters.

Fig. 5. a) Unique BTCs simulated using the finite element solver and used as target variables for the ML models. The BTCs differ from each other due to the changing structural models. b) A box plot visualizing the normalized peak concentration values versus the time of the calculated peak (analysis based on Dashti et al. (2023)).

Fig. 6. Accuracy distribution of the three designed ML models within their splits. RMSE values are used as the accuracy measure.

Fig. 7. Two different test cases were investigated to understand the accuracy of the ML models. The chain model and RFR have a high accuracy in both cases.

Fig. 8. Two thousand BTCs generated using RFR (a), DTR (b), and the chain model (c). Two extreme cases coming from the simulation are highlighted as blue curves with dots.

Fig. 9. Most of the 2′000 BTCs generated by DTR (named Test and shown as a solid black line) exactly match the input data used for training the model.

Fig. 10. A 2D cross-section through the middle of the complex model. Thin black lines represent the traces of the two uncertain fractures that connect the certain fractures shown via two thick black lines. The red and blue traces represent the geometry of the uncertain fractures in the two tests. Arrows show the location of the injection and production wells.

Fig. 11. Thin black curves represent 98 BTCs simulated using the finite element solver. The two test scenarios are named Test 1 and Test 2. For the geological model of the test cases, refer to Fig. 10.

Fig. 12. Simulation and ML-generated results for Test 1 are plotted as red circles and lines. Results related to Test 2 are plotted as blue circles and lines.

Table 1
RMSE values of the three designed ML models within the train and test sets.