Data-driven reduced-order modelling for blood flow simulations with geometry-informed snapshots

Parametric reduced-order modelling often serves as a surrogate method for hemodynamics simulations to improve the computational efficiency in many-query scenarios or to perform real-time simulations. However, the snapshots of the method require to be collected from the same discretisation, which is a straightforward process for physical parameters, but becomes challenging for geometrical problems, especially for those domains featuring unparameterised and unique shapes, e.g. patient-specific geometries. In this work, a data-driven surrogate model is proposed for the efficient prediction of blood flow simulations on similar but distinct domains. The proposed surrogate model leverages group surface registration to parameterise those shapes and formulates corresponding hemodynamics information into geometry-informed snapshots by the diffeomorphisms constructed between a reference domain and original domains. A non-intrusive reduced-order model for geometrical parameters is subsequently constructed using proper orthogonal decomposition, and a radial basis function interpolator is trained for predicting the reduced coefficients of the reduced-order model based on compressed geometrical parameters of the shape. Two examples of blood flowing through a stenosis and a bifurcation are presented and analysed. The proposed surrogate model demonstrates its accuracy and efficiency in hemodynamics prediction and shows its potential application toward real-time simulation or uncertainty quantification for complex patient-specific scenarios.


Introduction
Computational modelling has demonstrated significant efficiency and flexibility in cardiovascular science for the study of mechanisms of disease, biological/pathological processes, and designs of medical devices [1,2,3,4,5]. Computational models describe the behaviour of systems using governing equations, usually via partial differential equations (PDEs) with particular initial and boundary conditions, but also using cell-based or particle-based approaches. To capture the full complexity of the behavior of the system, the models are generally sophisticated and involve domains with a complex shape. Numerical methods, such as finite element methods (FEM) [6] and finite volume methods [7] are applied to solve the model equations. The models subsequently contribute to studying the underlying dynamics of the system or making predictions to support decision-making in a clinical context.
The evaluation of hemodynamics is one of the important topics in cardiovascular science. The mechanical and biological interaction between blood flow and the vessel wall has a significant impact on the initiation and progression of vascular diseases, e.g. the development of atherosclerotic plaque [8,9,10], or restenosis of arteries after percutaneous intervention [11,12]. Therefore computational fluid dynamics (CFD) techniques are widely applied to perform high-fidelity simulations to mimic and predict the behavior of blood flow, to further support the clinical practice or in-depth study of physiology and pathology [13,14,15,16].
However, owing to the complexity and scale of cardiovascular flow problems, the evaluation of the model could be computationally expensive, especially in the cases where a large number of evaluations are required, such as design optimisation, reliability analysis and uncertainty quantification [17,18,19]. Reduced order modelling (ROM) provides a high-fidelity framework to reduce the computational complexity for solving parametric PDEs [18,20]. It exploits the intrinsic correlation of the full-order model (FOM) solution over a physical domain, time evolution, or parameter space, and projects the system operation to a low-dimensional reduced space. Owing to efficient and reliable prediction based on the reduced systems, such surrogate models also enable the real-time performance of the simulation [21,22,23]. One of the widely applied reduced-order methods is Proper Orthogonal Decomposition (POD) [24]. The POD can be realized using principal component analysis (PCA), or the singular value decomposition (SVD). The projection coefficients can be computed either in an intrusive manner by manipulating the governing equations with Galerkin methods [25,26] or in a non-intrusive manner by formulating it as an interpolation problem [27,20]. Dynamic mode decomposition is another projection-based method for the temporal decomposition of CFD [28,29,30]. The empirical interpolation method [31] and the discrete empirical interpolation method [32] were proposed to recover an affine expansion for the nonlinear problem. Operator inference methods can directly approximate the reduced-order differential operator from data without knowing the full-order operators [33,34].
In cardiovascular science, the shapes of domains for blood flow simulations are either segmented from clinical image data or generated indirectly by computational models [35]. These anatomical geometries of a kind, such as the segments of arteries and organs are similar in terms of general shape but different in details. These shapes are typically irregular and unparameterised. Apart from the parametrisation issue, the difference in the shapes of the domain also leads to distinct spatial discretisations, and therefore results in inconsistency over snapshots due to geometric differences if ROM was intended to apply. To ensure the spatial compatibility among snapshots for ROM, a parameter-independent reference domain is needed hence all the discrete solutions generated from FEM share the same discretisation. Such a method requires a mapping between the reference and target domains that can be computed via an interpolation based on point-to-point (or landmark) correspondence between the two domains, which is generally not a known condition in cardiovascular engineering.
In this work, a data-driven surrogate model based on the non-intrusive reduced order modelling method is proposed. The surrogate model aims to approximate the fluid dynamics of blood flow within vessels of distinct but similar shapes efficiently. Different from existing approaches [36,37], the proposed surrogate model performs a surface registration based on currents to approximate the shape of the target domain from the reference shape. The currents-based registration method enables the registration without point-to-point correspondence [38,39]. The diffeomorphism constructed during surface registration provides the mapping between the reference domain and the target domain, as well as the parametrisation of the shape. The evaluations are subsequently performed on the reference domain to ensure the spatial compatibility of the snapshots, namely geometry-informed snapshots. Therefore a reduced-order model of geometrical parameters can be formulated. POD is applied to construct reduced bases and the projection coefficients are computed via a radial basis function (RBF) interpolation method. A schematic diagram of the proposed surrogate model is shown in Figure 1. The proposed data-driven reduced-order model with geometry-informed snapshots resolves the spatial compatibility and parametrisations issues for geometric ROM problems and enables efficient and accurate prediction of hemodynamics within similar geometries of domains.
Note that this work is presented in the context of blood flow simulations with steady-state. In practice, more complex situations with time dependency (e.g. pulsatile flow) will be encountered. The transient simulations of course capture more details and characteristics of the physiological flow, such as the problems involving mass transport [40] or stagnation [41,42]. However, steady-state simulations with time-averaged velocity in many scenarios have provided enough information on hemodynamics to facilitate investigation and understanding of physiological and pathological problems [43,44,45]. A further discussion can be found in Section 4.

Surface registration with currents
Here we introduce an approach based on non-parametric representation of surfaces using currents and its deformation framework [38,39,46]. The method characterises shapes in the form of currents which enables one to quantify the dissimilarity between two shapes without point correspondences, and subsequently can be applied to perform surface registration. We assume that the shapes of domains in the blood flow simulations are similar and can be achieved by a deformation of the reference domain. A diffeomorphic registration, which transforms shapes from a source coordinate system to a target coordinate system, is generated via minimising the difference between two surfaces and geodesic distance.

Represent surfaces using currents
Given a continues vector fields ω : R d → R d in the ambient space R d , where d = 2 or 3, an oriented piecewisesmooth surface S in R 3 can be characterised by the flux going through the surface, where n(x) denotes the unit normal vector of the surface S at point x. For a curve in two-dimensional case (d = 2), n(x) is replaced by a tangent vector. The computation of the flux defines a linear mapping from a space of vector field W to R. This mapping is called the current associated with surface S . We denote the space of current as W .
By defining the space W as a Reproducing Kernel Hilbert Space (RKHS), the vector field can be represented in the form of a kernel function K : R 3 × R 3 → R, therefore a vector field ω in W can be reformulated with fixed points {x * i } i=1,2,··· ,n and their corresponding vectors β, The space W equips with inner product K(·, x)α, where λ W denotes the scale at which W may spatially vary. By substituting the expression of equation 2, the inner product of space W can be written as K(·, x)α, ω W = α T ω(x), which is known as reproducing property. Consequently, equation 1 can be reformulated into, It shows that S K(·, x)n(x)dS (x) is a unique Riesz representation in W of current [S ] in W [47]. Therefore, the inner produce of two currents [S 1 ] and [S 2 ] can be obtained by, The associated norm can be applied to measure the dissimilarity between two shapes,

Diffeomorphism construction
The shape of a reference domain can be represented by a series of control points (CPs) {x ..,N cp on a boundary (the boundary vertices in a finite element mesh). The surface registration of a reference shape S ref to a target shape S tar can be performed by finding a flow of diffeomorphism ϕ(·, t) : R d → R d such that the dissimilarity of the shapes, as well as geodesic distance of the process, are minimised. The trajectory of these points at a given time t, can be described using a flow differential equation: where v(ϕ(x, t)) is a vector field representing the velocity of the deformation. Similar to equation 2, the velocity vector field can be approximated by a summation of kernel function at discrete CPs, where α i denotes the basis vector of the field and K(·, ·) is a kernel function that decides the weights of each basis based on the distance between two points. Together with equation 6, the flow of diffeomorphism can be reformulated as the motion of CPs, To perform a surface registration from a reference shape S ref to a target shape S tar , a proper velocity vector field can be computed via an optimisation problem formulated as, where [ϕ(S ref )] − [S tar ] 2 W measures the difference between the deformed shape and target shape using currents; 1 0 v 2 W dt denotes the total kinetic energy required to deform the shape from its initial state. For further details of surface registration with currents, see [46,47].
The constructed diffeomorphism has a twofold application in the proposed method. On the one hand, it is applied to generate all the geometry-informed snapshots with different shapes of domains by projecting and solving the FOMs on a reference domain, which is detailed in Section 2.2. On the other hand, since these domains of different shapes can be achieved by diffeomorphisms from the same reference domain, those shapes hence can be parameterised by the displacement of the boundary points of the reference shape before and after the registration.

Radial basis function interpolation
The RBF interpolation method is one of the state-of-the-art methods for multivariate interpolation [48]. The method approximates the desired function by a linear combination of basis functions based on known collocation points. It has been widely applied in computational science and engineering, e.g. [49,50]. In this work, RBF interpolation is applied for both approximating the mapping between the reference and target domain, and predicting the reduced coefficients for non-intrusive ROM. Here we present the description based on a general form of RBF interpolation.
Assume that f : R m → R n is the latent function between inputs and outputs, and a collection of collocation points { f (x i )} i=1 is known. A general form of RBF interpolatorf can be written as: such thatf where ψ : [0, +∞] → R is the radial basis function which is also known as kernel function. It computes the similarity between a prediction point and known collocation points. Multiple choices of kernels are available such as cubic kernel, thin-plate spline kernel or Gaussian kernel. Each kernel leads to its unique asymptotic behaviour, see [51] for more details; λ i denotes the expansion coefficients (weights) for each collocation points; denotes a low-order polynomial function for conditionally positive definite radial functions (e.g. thin-plate spline kernel), where p 1 , p 2 , · · · , p q are the polynomial bases of order not exceeding q. The additional degrees of freedom introduced by the polynomial function are compensated by, Together with the constraint from equation 11, a linear system can be formed, where ] T being the vector form of expansion coefficients for the kernel function and the polynomial function, and output vector, respectively. By solving the linear system above, the expansion and polynomial coefficients can be obtained and hence an interpolation for a new pointx can be estimated based on equation 10.

Geometry-informed snapshots based on FOM on a reference domain
Consider blood flow with a moderate Reynolds number as an incompressible Newtonian fluid to be modelled with the steady Navier-Stokes equations. Within a parametric vessel lumen as the spatial domain Ω(γ) ∈ R d , where d = 2 or 3, such problem can be formulated as: find the vectorial velocity field u(x; γ) : Ω(γ) → R d , and scalar pressure field p(x; γ) : Ω(γ) → R, such that: and subject to boundary conditions: where γ ∈ P ⊂ R N p stands for the geometric parameters; ν and ρ denotes the dynamics viscosity and density of the fluid respectively; f denotes the body force. Equation 15a and 15b represent Dirichlet and Neumann boundary condition on ∂Ω D and ∂Ω N , respectively. Dirichlet boundary condition is generally defined for the inlet and wall boundary of a vessel, while the Neumann boundary condition is applied to the outlet. For the sake of simplicity, the body force term is taken to be zero. As mentioned before, the governing equations should be solved on a parameter-independent reference domain Ω * to ensure the spatial compatibility of the snapshot through a parameterised mapping X(ξ; γ), i.e. x = X(ξ; γ). The variational form of the system is therefore obtained by introducing two functional spaces V {w ∈ H 1 (Ω * ) | w = 0 on ∂Ω * D )} of weighting functions for momentum conservation and Q {q ∈ L 2 (Ω * )} for incompressible constraint over the reference domain. Q can be also applied as the solution space for pressure. Besides, solution spaces S {u ∈ H 1 (Ω * ) | u = g D on ∂Ω * D } is defined over the domain for velocity. The governing equations are then projected to the weighting spaces by multiplying weighting functions and integrating over the domain, which reformulates the problem 14 to: find u(ξ) ∈ S and p(ξ) ∈ Q such that for all w and q, These are the bilinear and trilinear forms for diffusion term, pressure-divergence term and convection term of the system, respectively. The elements κ i j (ξ; γ) and ζ i j (ξ; γ) in the compact forms come from the tensor representation of functions associated with coordinate transformation, where J X (ξ; γ) denotes Jacobian matrix of mapping X(·; γ) and |J X | is the determinant of the matrix. Galerkin spatial discretisation approximates the solution by seeking solutions within corresponding finite dimensional subspace V h ⊂ V and Q h ⊂ Q, and the problem is subsequently reformulated to: seek velocity field u h ∈ S h and pressure The discrete solution of finite element approximation to the problem can be obtained by solving the system above with the given boundary conditions. The computational accuracy of the discretised system derived is related to the degrees of polynomial chosen for shape functions and the size of the elements. In the remaining part of the paper, we denote the discrete solutions in general of a FOM blood flow simulation with geometric parameters γ on a reference domain Ω * as u h (ξ; γ), namely geometry-informed snapshots in context of ROM.

Reduced order model
The application of ROM is motivated by reducing the high computational cost of a FOM owing to the large number of degrees of freedom (DOF) involved in a large-scale system. In other scenarios, e.g. UQ, design optimisation, or reliability analysis where a considerable amount of evaluations are required, ROM frequently serves as a surrogate model technique to make the evaluations computationally tractable.
ROM seeks hidden low-dimensional patterns behind the FOM and approximates the FOM with high fidelity. Here ROM is applied to extract the low-dimensional representation over geometric parameter space and the FOM solution therefore can be approximated by, where {u rb i (γ)} i=1,··· ,N l denotes the reduced coefficients which provide the weighting of each reduced basis and φ i (ξ) is the corresponding reduced basis. The decomposition can be applied to any flow fields such as velocities in different directions, pressure, and shear stress fields. We denote N and N γ as the dimension of snapshots and the number of snapshots available (corresponding to geometric parameters), respectively in the ROM.
A non-intrusive ROM framework consists of two stages, offline and online. The offline stage prepares the necessary ingredients for prediction, including snapshot generation, construction of reduced bases with the dimensionality reduction method, and training an interpolator for predicting reduced coefficients for each basis. During the online stage, the interpolator is applied to predict the reduced coefficients according to new parameters, and together with the reduced bases, an approximation of the system can be subsequently reconstructed via equation 19. One of the widely applied reduced-order methods is POD [24]. The POD can be realized using PCA, or SVD. In this work, SVD is applied to construct reduced bases.

Proper orthogonal decomposition
Consider a snapshot matrix M ∈ R N×N γ consisting of snapshots with respect to corresponding geometric parameters, By performing SVD, such a snapshot matrix can be written in a factorization form, Algorithm 1 Offline stage of the surrogate model Surface registration 1. Collect the data of vessel shapes represented by a collection of boundary points where b i denotes the number of the boundary points for each shape and N s denotes the total number of shapes. For convenience, we denote the boundary points for each shape as or perform atlas regression based on the shape dataset. 3. Perform surface registration based on currents representation, and compute the corresponding diffeomorphism for each shape in the dataset, {ϕ 1 , ϕ 2 , · · · , ϕ N s }.
Shape parametrisation 1. Parameterise the shapes using displacement of boundary points Apply POD to the geometric parameters {γ i } N s i=1 and obtain the reduced coefficients of geometric parameters γ rb i for each shape, which can be applied to fully represent the geometric parameters.
Coordinate transformation 1. Train RBF interpolators which approximate the mapping X i (·; γ) between the reference domain and a target domain based on the input data X i and output ϕ i (X i ), for i = 1, 2, · · · , N s .
Geometry-informed snapshot evaluations 1. Compute the Jacobian matrix J X i of each X i (·, γ) and its determinant |J X i |, for i = 1, 2, · · · , N s . 2. Solve Navier-Stokes equations for each shape on the reference domain Ω * and corresponding discrete geometryinformed snapshot {u h (ξ; γ i )} N s i=1 is obtained. Reduced-order modelling 1. Construct the snapshot matrix [u h (γ 1 )| · · · |u h (γ N s )] and perform POD to obtain orthogonal reduced bases Φ. N l is the number of bases that the cumulative energy of which has reached 99.9%. 2. Project the output of training data to reduced bases and compute the reduced coefficients u rb (γ j ) = [u rb 1 (γ j ), · · · , u rb N l (γ j )] T for each γ j , where j = 1, · · · , N s . 3. Train an RBF interpolator (RBF-ROM) with reduced coefficients of geometric parameters {γ rb i } N s i=1 and corresponding reduced coefficients of flow field {u rb (γ rb where Z and V denote left and right orthonormal matrices. The diagonal matrix Σ consists of singular values σ i , where i = 1, ..., N γ and σ 1 ≥ σ 2 ≥ ... ≥ σ N γ ≥ 0. The objective of POD is to find out a set of orthogonal basisΦ = {φ 1 , φ 2 , · · · , φ N l } from the space Z = {Φ ∈ R N×N l : Φ T Φ = I} containing all possible orthogonal bases, such that the projection residual is minimized: where · F denotes the Frobenius norm. The Eckart-Young theorem [52] shows that the orthogonal bases constructed by the basis vectors {z i } N l i=1 taken from the ith column of Z is the unique solution to the optimization problem. The cumulative energy captured by N l number of reduced bases can be evaluated by the ratio between truncated and complete singular values, The values of σ i decay rapidly which means a small number of bases are sufficient to approximate the data well. Apart from POD on the flow field, the POD was also applied to reduce the dimensionality of the geometrical parameters of a shape obtained from surface registration. As a parameterised non-intrusive ROM, the reduced coefficients of flow fields will be viewed as a function of those reduced geometrical parameters.

Algorithm 2 Online stage of the surrogate model
Surface registration 1. Collect boundary data of a vessel for prediction,X = {x i }ˆb i=1 . 2. Perform surface registration based on currents representation, and compute the corresponding diffeomorphism for the shape,φ.
Shape parametrisation 1. Parameterise the shapes using displacement of boundary pointsγ =φ(X) −X. 2. Projectγ to the reduced space constructed during the online stage and compute the reduced coefficients for the geometric parametersγ rb .
Coordinate transformation 1. Train an RBF interpolator which approximates the mappingX(·,γ) between the reference domain and a target domain based on the input dataX and outputφ(X).
Prediction via ROM 1. Apply RBF-ROM interpolator from offline stage to predict the reduced coefficientsû rb (γ rb ).

Online-offline framework of ROM with geometry-informed snapshots
In the ROM, the framework can be separated into two stages, online and offline, as shown in Figure 1. The offline stage prepares essential ingredients while the online stage makes the prediction. The offline stage mainly consists of surface registration, FOM evaluations on the reference domain and the construction of ROM, while the online stage extracts the geometric parameters from a new shape and predicts the flow profile based on the ROM. The details of the online-offline framework are outlined by algorithm 1 and algorithm 2.
The offline stage collects and prepares the essential ingredients for online prediction. Assume that we are interested in predicting the flow fields in different but similar shapes of the domains. These domains are usually segments of arteries generated either from the segmentation of medical image data or from a computational model. The shapes of the domains are assumed to be similar to each other such that all the shapes can be considered as small deformations from a reference domain. For a computational model simulating e.g. time-dependent tissue growth in the artery, the initial shape of the domain can be chosen as the reference shape. For more general situations, the reference shape can be chosen as the average of the shapes, namely the atlas construction [53,54]. By performing the surface registration between the reference shape and target shapes, the diffeomorphisms between the shapes are constructed and subsequently applied in both shape parametrisation and mapping construction. The displacements of the CPs of each shape are viewed as the geometric parameters of the parametric PDE problem described in Section 2.2. However, such an approach to parametrise the shape may lead to the high dimensionality of parameters owing to the number of the CPs of a two-or three-dimensional shape. Besides, the displacements of CPs only represent local information and correlations can be found between the displacements. Therefore, a dimensionality reduction technique can be employed to further extract the latent low-dimensional representation of these parameters. In this work, POD is also applied to extract the reduced coefficients of the geometrical parameters.
On the other hand, the constructed diffeomorphism also provides a mapping between the reference domain and a target domain. Such a mapping can be approximated by RBF interpolation based on the position of the control points before and after the deformation. These mappings are further used to project the Navier-Stokes equations onto the reference domain using the Jacobian matrix and its determinant, through which all the evaluations of the blood flow simulations on the domains of different shapes can be evaluated on the same discretisation. This ensures spatial compatibility over the snapshots corresponding to the geometric parameters for ROM. Therefore the reduced bases can be constructed by applying POD to the snapshot matrix, and the reduced coefficients of the flow field are considered as a function of the reduced coefficients of the geometrical parameters. In this work, RBF interpolation is also applied to approximate the function between the reduced coefficients of geometric parameters and flow fields.
During the online stage, reduced coefficients of geometric parameters of a new shape can be again computed through surface registration and subsequently, used to estimate the reduced coefficients of corresponding flow fields. The estimated reduced coefficients together with the reduced bases derived in the offline stage reconstruct the entire flow field on the reference domain, which is mapped back to its original domain in the final step.

Error estimation
In order to present the accuracy and performance of the proposed ROM, several measurements were applied. The relative error was employed to visualise the relative difference between two flow fields at a specific point x, where max{|u h |} denotes the maximum absolute value of the vector u h . Relative L 2 error corresponding to geometric parameters γ is employed to measure the overall difference of the estimations between a FOM and the proposed ROM over the domain, where · L 2 denotes L 2 norm and u rb (γ) = [u rb 1 (γ), · · · , u rb N l (γ)] denotes the vector consisting of the reduced coefficients. Besides, the projection error introduced by POD is also measured by the relative L 2 error,

Case one: blood flow in a stenosis
The simulation of blood flowing through a stenotic vessel is a common scenario in cardiovascular science to study the development of arteriosclerosis or in-stent restenosis. We constructed an idealised 2D example of blood where L 0 = 2 mm is the width of the inlet. 500 different shapes of domains were generated by sampling σ 1 and σ 2 within the ranges [0.8, 2], µ 1 and µ 2 within the range [0.3L 1 , 0.7L 1 ] using the Latin hypercube method [55]. 400 of the samples were applied for the construction of the ROM in the offline stage, while the remaining 100 samples were equally split into the validation dataset and test dataset. The dynamic viscosity and density of the flow were set to be 3.5 × 10 −3 g/(mm · s) and 1.06 × 10 −3 g/mm 3 . A constant parabolic velocity profile with a velocity of 200mm/s on its vertex was prescribed at the inlet. A non-slip boundary condition was prescribed for the upper and lower boundaries, and a homogeneous Neumann condition was forced on the outlet. The simulation including mesh generation and finite element method was implemented using the open-source PDE solver FreeFEM [56]. A rectangular shape was chosen for the shape of the reference domain, as shown in Figure 2(b) and a Taylor-Hood P2-P1 finite element spatial discretisation was applied. The surface registration was performed via statistical analysis software Deformetrica [57,58]. One of the surface registration results is demonstrated in Figure 2(c). The result shows that the reference shape deformed gradually toward the target shape and aligned well at the final time step. The average residual of the surface registration over 500 samples is 1.31 × 10 −7 (measured on currents). The results of surface registration were subsequently applied to compute the mappings between the domains of different shapes using RBF interpolation. A cubic kernel is employed with a polynomial function of the first order. Owing to the limited amount of mapping information we have, it is difficult to validate the interpolation itself. Instead, the validation was performed based on the results of the FEM simulation. The result computed on the reference domain was mapped back to its original domain and compared to the result that was directly simulated on the target domain. The accuracy of the simulation result based on the reference domain reflects the accuracy of the mapping. Figure 3 shows the comparison of one selected sample. The regions with relatively higher errors are located close to the deformed boundaries, which means that RBF interpolation introduced minor errors into the snapshots. The average relative L 2 errors of velocities in x and y directions, and pressure of 400 training samples are 0.21%, 0.42% and 0.042%, respectively. It implies that the overall errors introduced by mapping are small, therefore the mapping approximated by the RBF interpolation is accurate enough for the FE simulation. We therefore considered these FOM evaluations based on the reference domain as the snapshots for the reduced model. The surface registration results have also been applied to parameterise the configuration of the domains. In this case, there are 360 points on the boundary and therefore leads to 720 geometric parameters in a two-dimensional problem. POD was applied to further reduce the dimensionality of the parameters to 17, which has already captured 99.9% of the total variance. The percentage of the total variance captured by the number of reduced bases is demonstrated in Figure 4 The discrete results of simulations on the reference domain are subsequently taken as the snapshots of ROM, and POD was performed to build reduced bases. The percentage of the total variance of the velocity and pressure data captured by the number of reduced bases is demonstrated in Figure 4. Similar to the dimensionality reduction to the geometric parameters, we assume that if more than 99.9% of the variance has been recovered by POD, the approximation reconstructed by the first N l bases performs well enough. More rigorous criteria could be set but the benefits are marginal as long as the main patterns have been already covered. The relative L 2 errors introduced by POD are demonstrated in Table 1 RBF interpolation again was utilised to predict reduced coefficients of flow fields based on reduced coefficients of the geometric parameters. A comparison of the expected and predicted reduced coefficients of one test case is shown in Figure 5. The points cluster around the diagonal line, indicating that the RBF interpolation predicted the reduced coefficients well. Figure 6 visualises the velocity fields reconstructed from predicted reduced coefficients and compares the result to the evaluation of the corresponding FOM. Note that in this comparison, the evaluations of the FOM were based on the reference domain. The average relative L 2 errors over 50 validation samples are 0.24%, 3.42% and 0.47% for velocity in x and y directions and pressure, respectively. The error estimations on the test dataset are shown in Table 1. The result shows that the prediction of the surrogate model matches the finite element simulation result well. The relative L 2 error is comparatively large in the y direction owing to its small magnitude compared to the velocity in the x direction. Note that the error estimations here do not include the error introduced by the construction of the diffeomorphism, which was neglected as they were small enough. They only represent how well the non-intrusive ROM predicted based on given snapshot data.
A detailed comparison of FOM and proposed ROM with geometry-informed snapshots on the computational time     and error estimations are demonstrated in Table 1. It is obvious that the computation of ROM barely took any time, while most of the computational effort of the proposed surrogate model was spent on surface registration. Therefore the entire speedup of the proposed method heavily relies on the performance of surface registration. The error of the ROM mainly originated from POD due to the limitation of the amount of training data.

Case two: blood flow in a bifurcation
We present the second example of blood flow simulations through an artery bifurcation, another common scenario in cardiovascular science. The basic configuration of the flow domain is shown in Figure 7(a), which is also considered as the reference shape. A series of new shapes are generated via deformations based on the basic configuration. The four red dots in Figure 7 are the chosen points to control the deformation. We consider the displacement of these points in the x and y directions as random variables and sample within the range [−0. 5  therefore imposed by interpolation. Similar to case one, 500 samples were generated by the Latin hypercube method and the same fluid properties and boundary conditions were applied. The Reynolds number for the case is 60. The surface registration was performed to compute the geometric parameters and the corresponding mapping between the reference shape and other shapes. The average residual of the registration is 6.4 × 10 −6 . There are 580 nodes on the boundary of the reference shapes which leads to 1160 geometric parameters in the two-dimensional case. Owing to the POD, the dimension of the geometric parameters was reduced to 8, which matches the dimensions of parameters used for shape generation. The accuracy of the mapping is measured by the accuracy of the result solved on the reference domain. The average relative L 2 errors of velocities in x and y directions, and pressure of 400 training samples are 0.033%, 0.069%, and 0.026%, respectively.
The bases of the ROM were subsequently constructed. The velocities in the x and y directions require 17 and 21 reduced bases to capture 99.9% of the total variance of the data, while pressure needs 8 reduced bases. The predictions of the ROM for one of the test cases are shown in Figure 8. The average L 2 errors of velocities in the x and y directions and pressure over 50 validation cases are 0.70%, 1.53%, and 2.65%, respectively. The error estimations and computational time of the test dataset are presented in Table 2. Similar to the previous case, the main computational cost of the surrogate model comes from surface registration.

Discussion
The error of the surrogate model mainly stems from three parts: registration, decomposition, and interpolation. The diffeomorphisms were constructed via registration which enables the systems to be solved on a reference domain. The residual in the registration hence introduced an error into diffeomorphism and subsequently propagate to snapshots. The error in the two case studies was reasonably small as the registration aligned well enough, therefore could be neglected. However, it is not trivial to assess if an accurate diffeomorphism can be obtained by registration between two general shapes. Here we emphasize the assumption that the geometries can be achieved by small perturbation from the reference shape. The second part of the error originates from POD where the number of reduced bases was selected to represent the original data. The criteria was that the cumulative energy captured by N l of reduced bases reached 99.9%. The singular values of the SVD decayed rapidly as shown in Figure 4. However, the error introduced by POD is still a majority part of the entire error, as shown in Table 1 and 2, limited by the number of snapshots available for training. The third part of the error originates from the non-intrusive prediction with RBF interpolation. The RBF interpolation can be replaced by other interpolation methods such as polynomial interpolation [59] or a neural network [60], etc. The error of this part should be first evaluated and tuned against a validation dataset before employing the trained interpolator to prediction.
The computational time of the surrogate model for the examples was demonstrated in Table 1 and 2. A speedup of around 3 and 4 for online prediction was achieved in prediction by applying non-intrusive ROM and surface registration for each case respectively. The computational cost in prediction mainly came from surface registration while the non-intrusive ROM barely takes any time as the operation involved is weighted summations. This also enables the possibility of real-time performance in clinical practice owing to the separation of online and offline processes since the main computational effort was distributed to the offline part. However, the speedup shown here is only based on two-dimensional synthetic examples. Generally, the two-dimensional problem itself would be computationally cheap enough for a direct simulation rather than spending a huge amount of time on offline preparation. A three-dimensional problem with a complex domain e.g. image-based patient-specific simulation is much more computationally demanding and therefore closer to the purpose of the method. The corresponding time for prediction also increases. The computational time of surface registration increases almost linearly in log-log scale with the number of boundary vertices if a proper parallelisation is implemented on CPU/GPU as shown in [57]. Therefore, a further investigation on the speedup for a three-dimensional patient-specific dataset would be required. Nevertheless, the examples prove the concept of the proposed method and demonstrate its accuracy and efficiency.
The surrogate model proposed in this work is in the context of biomedical science and engineering, mainly aiming at solving cardiovascular flow. We assume that with state-of-art imaging and segmentation techniques, a large amount of geometric data of arteries are available. Instead of handling each geometric data to perform a simulation individually, the geometric dataset of a kind (similar but distinct) can be leveraged for further prediction based on ROM. The proposed method can be also applied to other physiological flow simulations, e.g. respiratory flow or cerebrospinal flow as long as the surface registration can be performed between the different shapes of the domains and construct corresponding diffeomorphisms. Likewise, such a method of course is not limited to biomedical applications. For example in environmental science, fluvial bank erosion simulation always needs to solve the fluid dynamics problem with changing domains over time. Other fields such as design optimization especially shape optimization in automotive and aerospace engineering also require constant evaluations of fluid dynamics with similar shapes of domains.
In this work, we limit the scope of the problem to the physiological fluid system with moderate Reynolds's number and steady situation. In practice, we will face more complex situations such as problems with time-dependency (e.g. pulsatile flow) or with a wide range of Reynolds numbers depending on the sections of the arteries (e.g. from Re ≈ 1 in arterioles to Re ≈ 4000 in the aorta). For time-dependent cases, the construction of the snapshot matrix would also include the changes of variable fields due to time evolution and an extra dimension for temporal coordinates should be added to the input space for the non-intrusive ROM interpolator. Different Reynolds numbers will lead to completely different behaviours of fluid. For instance, at Reynolds number around 2000, the laminar flow starts to transition into turbulent flow and becomes fully turbulent in the aorta [61]. We leave the study of those more complex situations to our future work.

Conclusion
In this work, a surrogate model for blood flow simulations in an unparameterised vessel lumen is presented. The surrogate model is constructed based on geometry-informed snapshots and non-intrusive reduced-order modelling. The surface registration with currents provides a mapping between reference and target coordinate systems and parameterises the shape without point-to-point correspondence. With the coordinate mapping, all the evaluations of FOM are performed on a reference domain which captures the geometry information in the snapshots and ensures the spatial compatibility of snapshots. The non-intrusive reduced order model is subsequently constructed using POD on the geometry-informed snapshots and the RBF interpolator is trained for predicting the reduced coefficients of ROM based on reduced coefficients of geometric parameters of the shape. Two examples of blood flow simulations based on stenosis vessels and bifurcation vessels are presented and discussed.

Funding
This work has received funding from the European Union Horizon 2020 research and innovation programme under grant agreements #800925 (VECMA project). This work was sponsored by NWO Exacte Wetenschappen (Physical Sciences) for the use of supercomputer facilities, with financial support from the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Netherlands Organization for Science Research, NWO).