Non-intrusive and semi-intrusive uncertainty quantification of a multiscale in-stent restenosis model

Uncertainty estimations are presented of the response of a multiscale in-stent restenosis model, as obtained by both non-intrusive and semi-intrusive uncertainty quantification. The in-stent restenosis model is a fully coupled multiscale simulation of post-stenting tissue growth, in which the most costly submodel is the blood flow simulation. Surrogate modeling for non-intrusive uncertainty quantification takes the whole model as a black-box and maps directly from the three uncertain inputs to the quantity of interest, the neointimal area. The corresponding uncertain estimates matched the results from quasi-Monte Carlo simulations well. In the semi-intrusive uncertainty quantification, the most expensive submodel is replaced with a surrogate model. We developed a surrogate model for the blood flow simulation by using a convolutional neural network. The semi-intrusive method with the new surrogate model offered efficient estimates of uncertainty and sensitivity while keeping relatively high accuracy. It outperformed the result obtained with earlier surrogate models. It also achieved the estimates comparable to the non-intrusive method with similar efficiency. Presented results on uncertainty propagation with non-intrusive and semi-intrusive metamodeling methods allow us to draw some conclusions on the advantages and limitations of these methods.

intrusive uncertainty quantification takes the whole model as a black-box and maps directly the three uncertain inputs to the quantity of interest, the neointimal area. The corresponding uncertain estimates matched the results from quasi-Monte Carlo simulations well. In the semi-intrusive uncertainty quantification, the most expensive submodel is replaced with a surrogate model. We developed a surrogate model for the blood flow simulation by using a convolutional neural network. The semi-intrusive method with the new surrogate model offered efficient estimates of uncertainty and sensitivity while keeping a relatively high accuracy. It outperformed the results obtained with earlier surrogate models. It also achieved the estimates comparable to the non-intrusive method with a similar efficiency. Presented results on uncertainty propagation

Introduction
Numerical simulations of real-world phenomena contribute to a better understanding of these phenomena and to predicting the dynamics of the underlying systems. Many natural phenomena occur across scales in space and time [1][2][3][4][5].
As a result, multiscale models and simulations are widely used [1,[6][7][8][9]. These multiscale models couple mathematical models of relevant processes on different spatial or temporal scales together using suitable scale bridging methods [10].
However, multiscale simulations can suffer from substantial computational cost because of the high computational demands of, usually, the microscale simulations [9]. Uncertainty quantification (UQ) analysis [11] applied to multiscale simulations adds additional substantial computational burden since thousands of runs are required for good estimates of the uncertainties. Therefore simulations can become extremely time-consuming or impractical, even on current state-of-the-art supercomputing infrastructure.
Surrogate modelling is a common non-intrusive way to resolve the computational intensity problem under the UQ scenario. The motivation for this technique stems from the large number of samples needed in the UQ estimates or from frequently required numerical integration in global sensitivity analysis with the Monte Carlo method [12]. A non-intrusive surrogate model takes the complete model as a black box, mimics the behaviour of the original computational model from a limited amount of existing data and evaluates the corresponding responses of the model based on specific inputs [13]. By replacing the original model with a surrogate model, this allows conducting UQ or sensitivity analysis at an acceptable computational cost. Gaussian process regression [14,15] is one of the state of art methods for surrogate modelling. It is a non-parametric Bayesian regression method and its predictive distribution can be efficiently used in UQ and sensitivity analysis [16][17][18][19][20]. There are several other popular methods for surrogate modelling, including polynomial chaos expansion [21,22] and neural networks [23,24].
As opposed to the traditional non-intrusive UQ method which takes the whole simulation as a black box, we have recently proposed a set of semiintrusive algorithms for multiscale UQ [25] and demonstrated their effectiveness for several multiscale UQ scenarios [25,26]. The term "semi-intrusive" refers to additional interventions into the code of the model compared to non-intrusive approaches: one "opens up" the black box and considers the coupled structure of the multiscale model while the single scale models are still viewed as black boxes. Usually the output of a multiscale model is derived from a macroscale submodel, which in turn is implicitly determined by microscale dynamics to which it is coupled. One approach from [25] relies on performing a Monte Carlo UQ on the macroscale submodel while replacing the most costly microscale submodel by a surrogate model. Replacing the expensive part of a model with a relatively cheap surrogate can often significantly improve computational efficiency, but comes at the cost of reduced accuracy.
In [26], a physics-based and an interpolation-based surrogate model were constructed to implement the semi-intrusive UQ for the in-stent restenosis multiscale model [27][28][29]. In that research, the flow solver submodel was replaced with these surrogates. The flow solver takes blood vessel geometry and blood flow velocity as the inputs and outputs the corresponding wall shear stress.
The physics-based surrogate model simplified the fluid simulation to an ideal Poiseuille flow, hence the shear stress could be computed analytically. The interpolation-based surrogate model applied the nearest neighbour method and approximated the shear stress based on a training dataset. We estimated how the three uncertain inputs: blood flow velocity, endothelium regeneration time and maximum deployment depth of the stent, contributed to the quantity of interest (QoI), the neointimal area. All the semi-intrusive UQ were implemented with a quasi Monte Carlo method based on a Sobol sequence [30,31]. The results were compared to black-box Monte Carlo results to demonstrate the efficiency improvement. However, a comparison between the semi-intrusive algorithm and non-intrusive UQ with a surrogate model replacing the complete multiscale model were not explored in that study. In this work, we first improve the surrogate model of micro-scale model (the blood flow simulation) from [26] by using a convolutional neural network (CNN), because of its capability of pattern recognition and feature extraction [32,33]. Additionally, a surrogate model for non-intrusive UQ is designed to directly map the input parameters to the quantity of interest. The uncertainty estimations with both these methods are then carried out and compared in terms of the estimation accuracy and computational efficiency.
The paper is arranged as follows. The two-dimensional multiscale model of in-stent restenosis is shortly introduced in Section 2. The new surrogate model for the blood flow simulation and a surrogate model for the whole in-stent restenosis model are described in Section 2.2 and 2.3. The approach to estimating and analyzing the uncertainty of the response of the in-stent restenosis model is explained in Section 3. The results of surrogate modelling, uncertainty quantification and sensitivity analysis are presented in Section 4. Sections 5 and 6 compare and discuss the UQ performance and summarise the obtained results.

In-stent Restenosis
An arterial stenosis is the abnormal narrowing of an artery, usually due to accumulation of fatty material in the walls and intimal thickening (atherosclerosis). In ischemic heart disease, a stenosis in a coronary artery limits blood flow to the heart muscle, which can result in reduced heart function, shortness of breath, chest pains or a heart attack. Coronary stenosis can be treated using balloon angioplasty, in which a balloon is inserted into the artery via a catheter and inflated, which compresses the fatty plaque against the arterial wall. During this procedure, a wire-mesh stent is deployed to keep the artery from recoiling back to the narrowed state. This procedure damages the vessel wall, and in particular the endothelium, the innermost lining of the artery. This triggers a healing response involving (amongst other processes) growth and proliferation of smooth muscle cells (SMCs) on the inside of the artery. In some cases, an excessive growth response occurs, leading to a significant renewed narrowing of the artery inside of the stent. This is known as an in-stent restenosis (ISR), and is considered an adverse treatment outcome [34][35][36].
The ISR2D model is a two-dimensional simulation of the post-stenting healing response of an artery [28,29], which is used here to test the proposed semi-intrusive multiscale UQ algorithm. Note that a more realistic, but also computationally much more expensive three dimensional version of the model is available [37,38]. The ISR2D model used in this paper consists of three submodels: the IC submodel, which simulates initial conditions in the form of the state of the artery immediately after stenting, the SMC submodel, which is an agent-based simulation of smooth muscle cell growth and endothelium recovery, and the blood flow submodel, which uses the Lattice Boltzmann method (LBM) to simulate blood flow through the artery. The structure of ISR2D is shown in Sufficiently high wall shear stress (WSS) at the arterial wall triggers any present endothelium to produce nitric oxide, which in turn inhibits the growth of the SMCs if it crosses a threshold value. Blood flow thus affects SMC growth, but in turn is also affected by it, as the proliferating SMCs change the geometry of the artery. Note that due to random placement of daughter cells when the growing SMCs divide, variability in the length of the cell cycle, and a random spatial pattern of endothelium recovery, the SMC model is stochastic. The main output of the model is the cross-sectional area of the neointima (the new tissue formed due to SMC proliferation) as a function of time after stenting. A clinically recognised in-stent restenosis occurs if more than 50% of the original cross-sectional area of the artery is covered by the neointima [39].  in the semi-intrusive UQ scenario is therefore highest. Nikishova et al. [26] performed a semi-intrusive UQ analysis for the same model, comparing surrogates based on nearest-neighbour interpolation and on simplified physics to a non-intrusive black box quasi Monte Carlo approach. The UQ estimate with physics surrogate has improved computational efficiency but the means of the cross-sectional area of the neointima resulting from the surrogate models were substantially lower than the black-box Monte Carlo results. On the other hand, the uncertainty estimates with nearest-neighbour interpolation were better but the corresponding speedup was relatively poor. In this paper, an accurate surrogate model using a convolutional neural network is proposed and demonstrated to be a good representation of the blood flow model, while at the same time leading to the desired reduction in computational cost for the multiscale UQ.

Surrogate Model for Blood Flow Simulation
As mentioned in the previous section, the blood flow simulation in the ISR2D model is computationally expensive. To reduce the computational cost of the model, a surrogate model to compute the required wall shear stresses is used to replace the original blood flow model. Convolutional neural networks have been applied to fetch the features from irregular geometries in fluid dynamics prediction [32,33,40]. In the ISR2D application, the aim is to learn the wall shear stress as a nonlinear function of the vessel wall geometry and the blood flow velocity.
The mapping between input and output of the BF simulation can be considered as a function f , which takes the geometry matrix ζ and the inlet blood velocity v in as input and produces a 2 × k-dimensional vector of WSS magnitudes, τ wss as the output: where k = 150 is the grid size along x axis, ζ = (ζ ij ) ∈ R k×k . The geometry matrix ζ was used for CFD simulation. The surrogate modelf replaces the original blood flow model f (ζ, v in ) and offers an approximate prediction of wall shear stress in a reduced amount of time.
The CNN model follows the network structure proposed by [32] and was optimized to fit our application. The model consists of three parts: shape encoding, nonlinear mapping and stress decoding, as shown in Figure 3. The shape encoding layers extract the features of the geometry to the shape code.
A fully connected (FC) layer then maps the shape code together with the blood flow velocity to the stress code. The stress decoding part is responsible for a mapping from the stress code to wall shear stress. In this surrogate model, the geometry input was transformed from a binary map to a 2 × k array which indicates the locations of upper and lower fluid-solid boundaries. The convolution layers then take the information from both boundaries into account and predict the shear stress on these boundaries. There are three convolution layers, a fully connected layer and four deconvolution layers deployed between the input layer and output layer. Each of them is followed by a rectifier linear unit (ReLU) as the activation function. Additionally, the output of each convolution layer is concatenated to the corresponding deconvolution layer to help with the decoding process. The output of the surrogate model represents the shear stresses on the wall surface on one side of the channel. The sequence of the two rows in the input decides which side is predicted. Therefore, a prediction of the WSS on both boundaries requires the surrogate model to be run twice.
The training data for the surrogate model comes from the runs of the ISR2D model [41]. One run of ISR2D calls the LBM solver 1440 times (once per hour of simulated time). This means that with only a few runs of the simulation, a considerable amount of flow data for training is already available. We trained the surrogate model with the data from four runs of the ISR2D simulation, hence 5760 blood vessel geometries and wall shear stress distributions were used for training. A validation dataset was generated from an additional run of the ISR2D simulation. These five runs of simulation for training and validation are the results from previous experiments using the quasi Monte Carlo method [41].
The training optimization was based on the mean squared error loss function:

Surrogate Model for ISR2D
This surrogate model is constructed to replace the whole ISR2D multiscale model for uncertainty quantification and sensitivity analysis. Let the multiscale model function be defined by g(ξ) with ξ denoting an n-dimensional vector consisting of the stochastic variables and uncertain inputs of the model. The response of the model is: As mentioned before, the ISR2D model is a stochastic model and thus includes both aleatory uncertainty and epistemic uncertainty. In the surrogate model for non-intrusive UQ, we assume that the aleatory uncertainty can be separated from the function g, hence the expression for the surrogate model can be written whereŷ denotes the response of the surrogate model, ξ * is the random variable containing aleatory uncertainty and ξ ∼ξ * is the parameter vector without stochastic variable. Assuming that the stochasticity follows a normal distribution N (0, σ * 2 ), such a stochastic model can be quantified by Gaussian process regression (GPR). Note that ISR2D is a time-evolving model, and that we are not only interested in the response at the final timestep, but also in the dynamics of the process. To avoid an extra dimension of input, more precisely, a time t that will significantly increase the computational cost of training and prediction [14], a local surrogate model for each time step is constructed. An alternative choice could be to apply sparse heteroscedastic GP [44,45] and to adopt the time as an extra dimension of input. However, this would require additional estimations for the variances of local input noises and pseudo-inputs for the sparse Gaussian process in the marginal likelihood optimisation. Such high dimensional optimisation may result in a relatively poor inference of the hyperparameters. Therefore we adopt the local surrogate models, and the expression can be rewritten as:ŷ GPR is based on the assumption of the joint Gaussian distribution between training data and prediction mean: where covariance matrices K t ,K t andK t denote the correlation within training data, within new data, and between these two respectively at each time step.
Applying Bayesian inference, the posterior probability distribution of h pred t given training set(ξ train ∼ξ * , y train The mean value offers the prediction and the variance represents the uncertainty of this prediction. The radial basis function kernel was chosen for the computation of covariance matrices: where x i −x j denotes the L 2 norm between two points in the Gaussian process. In each local GPR model, there are three hyperparameters associated with the kernel: length scale l t , signal variance σ f t and noise variance (stochasticity) σ * t . These hyperparameters are trained by optimizing the log marginal likelihood function: (2.9) The training data comes from the qMC result from [41] and the size of the training data was chosen to match the speedup of semi-intrusive UQ. The details of speedup calculation are introduced in the Section 3.2. As the semi-intrusive method gains a speedup of around 7 times, the speedup of the non-intrusive method is also designed to be around 7 for comparison. Therefore, the results of 150 ISR2D simulations from previous experiments using the quasi Monte Carlo method [41] are used for training. The Gaussian process surrogate model used in non-intrusive UQ was built using GPy [46].

Uncertainty quantification and sensitivity analysis
Uncertainty Quantification is the analysis of uncertainty of a computational model [47], including uncertainty in the model response (forward problem), and  Table 1 in Section 4.
Since ISR2D is a stochastic model, the estimates of the response variance and partial variances in SA include the aleatory uncertainty as well. In other words, the model stochasticity is treated as another uncertainty source similar to the uncertain inputs. This approach was adopted in our previous work [41, p. 764]. It is important to note that there exist alternative approaches to deal with stochasticity in uncertainty and sensitivity analysis, for instance [48,49].
whereŷ t (ξ j ) is the value of the model response obtained with the j-th sampled value of the uncertain inputs ξ j and N is the total number of samples. The total Sobol sensitivity index for the i-th parameter together with the stochastic parameter ξ * is defined by: where the partial variance in the numerator is approximated by [51,52]: where ξ ∼ξi,ξ * is a vector of all uncertain parameters in ξ except ξ i and the stochastic parameter ξ * , andŷ t (ξ ∼ξi,ξ * ) j , (ξ i ) j+N , ξ * j+N denotes the model response at time t with the same values of all inputs as forŷ t (ξ j ) except of the i-th input ξ i and of the stochastic parameter ξ * . Since the interpretation of the total sensitivity indices is the portion of uncertainty that would remain if all other parameters were known precisely [53] and the stochasticity is the irreducible part of uncertainty, the effect of stochastic uncertainty is included in the estimator in (3.3).

Speedup
The main purpose of applying SI and NI methods is to speed up the simulation and reduce the computational cost while getting accurate enough estimates of the uncertainties. The speedup of the UQ analysis of these advanced methods was computed as follows:

Results
First, the quality of the blood flow surrogate model is evaluated. Then, the results of UQ and sensitivity analysis based on non-intrusive and semi-intrusive methods are compared to a previously obtained reference solution reported in [41].

Blood flow surrogate model
Before applying the blood flow surrogate model to semi-intrusive UQ analysis, the quality of the surrogate model was evaluated. We used normalized mean absolute error (NMAE) on a test dataset to evaluate the quality of the surrogate model: where · denotes L 1 norm and max{|τ wss |} is the peak stress in the LBM.  poor prediction may be caused by extrapolation, since the growth is stochastic, and the lumen geometry may end up with a previously unseen irregular shape which is not covered by the training dataset.

Uncertainty quantification and sensitivity analysis
Uncertainty in the ISR2D model response is due to the model stochasticity and uncertainty in three model parameters. The ranges of the uncertain parameters are shown in Table 1. These three uncertain inputs were assumed to be uniformly distributed within the given ranges. The model output of interest was the neointimal area as a function of time after stenting and its uncertainty was estimated using the non-intrusive method (NI) and the semi-intrusive (SI) method with surrogates of the blood flow micro model. We compared the uncertainty quantification result and sensitivity analysis result for similar values of UQ speedup. We also show the quasi-Monte Carlo (qMC) result from [41] as the reference solution. The total number of model runs in both qMC and SI experiments was 1024. Figure 5 shows the estimates of the mean and standard deviation with the qMC, the SI and NI methods. The mean is approximated especially well for the first 10 simulated days. After this point, slightly less average growth is observed in both SI and NI estimates than in the qMC results. The NI estimates slightly   Mean Estimation Standard Deviation     Each of the quantities for the uncertain inputs includes the aleatory uncertainty. The area around the qMC results is the 95% confidence interval obtained by bootstrap [54].

Speedup
In Table 3   an improvement of more than a factor three over the nearest-neighbour interpolation based surrogate model. The simplified physics model was even faster, but was also the least accurate one, while the SI with CNN based surrogate model provided the best uncertainty quantification and sensitivity estimates among the four surrogates (see [26] for details on the Phys, and DD I and DD II surrogates).

Convergence
A study of the effect of training sample size on convergence of SIUQ and NIUQ is shown in Table 4. The mean estimation, standard deviation and Kullback-Leibler (KL) divergence [56,57] are estimated for the distributions at the last time step. The four semi-intrusive UQ results are based on the surrogate models trained with the data from 1,2, and 4 ISR2D simulations respectively.
SI-1 * denotes a case in which the surrogate model is trained with only part of the data of one ISR2D simulation (the truncated data from day 0 to day 15).
The error in standard deviation is reduced significantly with improvement of the surrogate model, however the tendency in mean estimation is not obvious. In SI-1 * , the surrogate model was trained with truncated data of one simulation, which resulted in an overestimation of mean. When the surrogate model was Standard Deviation KL Divergence

Discussion
The CNN surrogate model performed well regarding the wall shear stress prediction for the micro model. It takes advantage of convolution layers to fetch latent features in the geometry input and then uses the FC layer and deconvolution layers to map the features to the wall shear stress prediction. Although the CNN surrogate model was able to predict the wall shear stress quite accurately, a small error still exists. This error introduced by the surrogate model then propagated through the iteration and led to the error in the uncertainty estimation and restenosis prediction as shown in Figure 5 and Table 2.
The accuracy of the estimation with SIUQ method depends not only on the quality of the surrogate model but also on the structure of the multiscale sim-  [60]. In the UQ scenario, the surrogate model can also be validated dynamically [25]. We aim to apply these techniques to the three-dimensional versions of the ISR model in future work.

Conclusions and future work
In order to implement uncertainty quantification and sensitivity analysis  [61,62] can be applied here for uncertainty analysis as both methods require to run a certain amount of original simulation to generate training data. Therefore a multifidelity framework can combine both high fidelity data (training data) and low fidelity approximation (generated by or with a surrogate model) together to further improve the accuracy of uncertainty estimation. This method has been widely used in non-intrusive UQ analysis, but hasn't been applied to semiintrusive UQ methods yet. It would be helpful to include the method to our further works.
In this study, we have taken ISR2D simulation as a case study and investigated how three uncertain parameters: blood flow velocity, endothelium regeneration time and deployment depth affect the neointimal area of restenosis.
However, ISR2D is a simplified model for the restenosis process. We aim to apply the UQ techniques to our more realistic and complex model, ISR3D [37,38], and analyse uncertainty parameters including not only the three mentioned in this paper but also other factors, such as fenestration percentage, maximum strain for SMCs and others that may influence the growth. Since ISR3D is suitable for modelling restenosis in realistic geometries, it can eventually be used to perform in silico clinical trials or to design novel stents. These directions of research would require many runs of the 3D model, and this makes it essential to run each individual simulation as cheaply as possible, while retaining the validity of the model. The 3D flow calculation is much more expensive in 3D; for example, the 3D simulations of restenosis done for the InSilc project 1 take around 3000 core hours for each artery reconstructed from optical coherence tomography (OCT) images. Substituting the flow model with a surrogate may be essential for reducing these costs.
Semi-intrusive methods for UQ of multiscale models require modifying those models by adding new submodels or other components, and by changing the connections between them. If the codes are tightly coupled using a low-level communication facility such as MPI, then this entails changing the model code and maintaining multiple versions simultaneously, which is cumbersome and error-prone. We have recently created a new implementation of the multiscale coupling framework, MUSCLE3 [63,64]. With MUSCLE3, submodels are connected to the framework once and can then be coupled in different ways based on a simple configuration file. Advanced non-and semi-intrusive UQ algorithms can be implemented as additional modules, and wired into the multiscale model without changing the code of the submodels. We have already ported ISR3D to MUSCLE3, and aim to implement non-and semi-intrusive UQ with MUSCLE3.