Learning Mesh Motion Techniques with Application to Fluid-Structure Interaction

Mesh degeneration is a bottleneck for fluid-structure interaction (FSI) simulations and for shape optimization via the method of mappings. In both cases, an appropriate mesh motion technique is required. The choice is typically based on heuristics, e.g., the solution operators of partial differential equations (PDE), such as the Laplace or biharmonic equation. Especially the latter, which shows good numerical performance for large displacements, is expensive. Moreover, from a continuous perspective, choosing the mesh motion technique is to a certain extent arbitrary and has no influence on the physically relevant quantities. Therefore, we consider approaches inspired by machine learning. We present a hybrid PDE-NN approach, where the neural network (NN) serves as parameterization of a coefficient in a second order nonlinear PDE. We ensure existence of solutions for the nonlinear PDE by the choice of the neural network architecture. Moreover, we present an approach where a neural network corrects the harmonic extension such that the boundary displacement is not changed. In order to avoid technical difficulties in coupling finite element and machine learning software, we work with a splitting of the monolithic FSI system into three smaller subsystems. This allows to solve the mesh motion equation in a separate step. We assess the quality of the learned mesh motion technique by applying it to a FSI benchmark problem. In addition, we discuss generalizability and computational cost of the learned mesh motion operators.


Introduction
Following the past few decades' success of deep learning in fields such as image processing and speech recognition [31], researchers have turned their use to solving differential equations of scientific computing.In [30,43], it is proposed to solve forward and inverse PDEs by parameterizing the solution as a neural network and solving the non-convex optimization problem of minimizing the squared residual of the PDE's strong form, an approach called Physics-Informed Neural networks that has many variants [14].Moreover models known as neural operators have been developed, that are trained to solve an entire class of PDEs at the same time, for instance the DeepONet [36] and the Fourier Neural Operator [32], in a paradigm known as operator learning.Another line of work is to use neural networks to accelerate parts of numerical algorithms that are driven by heuristics or parameter tuning.If a parameter in an algorithm affects its performance and must be tuned by hand, it can instead be chosen during runtime by a neural network with access to some performance indicator.For instance, in [6,51] the authors use neural networks to respectively pick a stabilization parameter of a finite element scheme and to automate the selection of a parameter affecting how coarse problems are constructed in algebraic multigrid methods.In [11], the authors use a convolutional neural network to accurately detect and localize shocks, allowing more efficient shock capturing and avoiding parameter tuning.Automating the selection of parameters in algorithms can improve both the efficiency of the numerical algorithm itself and the scientific workflow, by allowing the user to spend time on other tasks.If a step in an algorithm is driven by heuristics, there is no obvious way of making the optimal choice and learning how to do this from data might be feasible.We take inspiration from operator learning and attempt to solve with machine learning techniques the problem of boundary deformation extension, a computational subproblem that is typically defined by the solution of heuristically chosen PDEs.We do this using two distinct approaches.First, we present an approach where a neural network parameterizes a PDE, the solution of which defines an extension operator.Secondly, we present an approach where a neural network defines a correction to a standard extension operator.
Extending boundary deformation onto the whole domain is a task that can be a crucial and limiting factor in applications, especially in settings, where one wants to avoid remeshing techniques to reduce computational effort, see e.g.[50,29].It appears in monolithic arbitrary Lagrangian-Eulerian (ALE) formulations of fluid-structure interaction (FSI) problems, e.g., [56,47], and the method of mappings for partial differential equation (PDE) constrained shape optimization problems, e.g., [39,40,42].The task can be stated as follows.Let Ω ⊂ R d , d ∈ {2, 3}, be a domain and ∂Ω be its boundary.Given a boundary displacement g : ∂Ω → R d , one needs to apply an extension operator to find a deformation field u : Ω → R d such that u| ∂Ω = g and (id + u)(Ω) is a well-defined (Lipschitz) domain.This imposes requirements on g and the extension operator.Among other things, id + u : Ω → (id + u)(Ω) needs to be bijective, which also implies that g has to be chosen appropriately (e.g.not leading to a self-intersecting boundary of Ω).For computational purposes, it is mandatory that for any fixed mesh that discretizes Ω the transformed mesh is non-degenerate, see Figure 1.In ALE formulations of fluid-structure interaction, the task of finding a suitable extension operator is typically based on heuristic choices, and has no (from a continuous perspective) or little (from a numerical perspective) influence on the physical solution as long as it is ensured that it fulfills the above described requirements (and regularity requirements), see e.g.[17].At the same time, it is the limiting factor for numerical simulations as soon as these requirements are violated.In this work, we apply ideas from machine learning to find a suitable extension operator for fluid-structure interaction problems formulated in the ALE framework.
Several approaches to define the extension operator have been introduced in the literature.In [56], harmonic extension operators, a linear elastic and a biharmonic model have been compared in the context of monolithic ALE formulations of FSI problems.The harmonic Figure 1: Mesh degeneration at the top corner of the elastic structure for harmonic extension (left), but not for the learned extension operator (middle; here shown for hybrid PDE-NN approach with artificial training data) and biharmonic extension (right).
extension operator is obtained by solving the PDE −∆u = 0 in Ω, u = g on ∂Ω. ( The linear elastic model, e.g. also used in [50], is given by −div(η λ tr(ϵ)I + 2η µ ϵ) = 0 in Ω, u = g on ∂Ω, with ϵ = 1 2 (∇u+∇u ⊤ ) and scalar valued functions η λ , η µ that are chosen, e.g., mesh dependent.In the context of shape optimization, [45] chooses η λ = 0 and η µ = µ, where µ solves the PDE −∆µ = 0 in Ω, µ = µ max on Γ, µ = µ min on ∂Ω \ Γ, for constants µ max, µ min > 0. The manually chosen boundary value allows to increase rigidity in areas where large deformations occur, e.g., around ∂Ω. Finally, for large displacements the biharmonic model given by ∆ 2 u = 0 in Ω, u = g, ∇u • n = 0 on ∂Ω, shows better numerical behavior compared to the harmonic or linear elastic extension, see e.g.[56].However, it is also the most expensive of these options since it requires either H 2conforming finite elements, or other techniques such as a mixed approach, cf.e.g.[56,Sec. 4.4.5], or weakly imposed continuity of the normal derivatives [22].Recently, in the context of shape optimization, also different types of extension equations were used.In contrast to fluid-structure interaction simulations, the choice of nonlinear extension operators is justified in many shape optimization applications since solving governing time-dependent PDEs for the state equation is the computational bottleneck of the shape optimization simulation.Differently from the fluid-structure interaction application, the choice of the extension operator can be part of the modeling and, e.g., used to define a shape metric based on the corresponding Steklov-Poincaré operator [46].In [39], the p-Laplacian for p ≥ 2 was introduced as an extension equation.For 1 < p < ∞, the problem is given by where ∥ • ∥ denotes the Frobenius norm.The nonlinearity increases the rigidity for increasing deformation gradient.
In the context of mesh extension operators, machine learning algorithms have been applied in several works.For example, in [16], it is proposed to use neural networks as an ansatz class for mesh motion, by training a network of only coordinate inputs to match the deformation of the boundary vertices.Working in the PINNs methodology, [7] learns a deformation map by linear elastic mesh motion.In [49], neural networks are trained to produce mesh motion for r-adaptivity, taking inputs describing the PDE instance.
In this work we develop two approaches to learn an extension operator.First, we consider a hybrid PDE-NN model [38].We observe that the harmonic extension operator and the p-Laplacian are special cases of the more general formulation where ᾱ is a scalar valued function, ξ denotes the coordinates and θ parameters (e.g., θ = p, θ = µ).Here we shall parameterize ᾱ by using a neural network and in turn θ represents the network's weights and biases.The choice of ᾱ is made based on theoretical considerations such that existence of solutions of (3) is ensured.This requires a sophisticated choice of the architecture and the weights.In the scope of the work, for the sake of simplicity, we restrict ourselves to ᾱ(θ, ξ, u, ∇u) = α(θ, ∥∇u∥ 2 ).We make a theoretically funded choice for the neural net, which serves as parameterization for ᾱ, such that existence of solutions of ( 3) is ensured.The choice of ᾱ then allows us to prove well-posedness of the extension operator by applying ideas from the proof that the p-Laplace equations have unique solutions.Moreover, we follow a supervised learning approach that aims to find a neural network such that the solution of ( 3) is close to the solution of the biharmonic extension.The resulting extension equation requires a nonlinear PDE solve.In order to make it suitable in the setting of the unsteady FSI equations, we follow a "lagging nonlinearity" approach, i.e. we consider the linear extension equation −div(ᾱ(θ, ξ, ū, ∇ū)∇u) = 0 with ū being the deformation variable of the previous time-step in order to compute the new deformation field.
Our second approach is to learn an extension operator where a neural network corrects the harmonic extension operator.The choice of network architecture acts locally and ensures that the resulting extension operator respects the boundary condition constraints, see (22).Hence, it differs from other approaches where such constraints are penalized in the loss function.This is done by including in the network architecture multiplication of the output by a function vanishing on the boundary, which is found by solving a Poisson problem on the computational domain Ω.Also here we follow a supervised learning approach, training the corrected harmonic extension operator to match the biharmonic extension operator.
In the hybrid PDE-NN approach we generalize the harmonic extension to a class of extensions and then select one of them by training a neural network.In the corrected harmonic extension approach, we instead use the harmonic extension as a base that is improved by a trained correction.Both approaches incorporate prior knowledge in the design of the neural networks, e.g.via respecting theoretical requirements or sophisticated choice of features.This is a common strategy.Features like periodicity, can be guaranteed in the network architectures by using a feature transformation, see e.g.[59].Theoretical considerations and classical discretization schemes can be used as prior knowledge and be respected in the NN architecture.The choice in [44] is, e.g., based on theory of unsteady partial differential equations.On unstructured meshes, the connection between message-passing neural networks and classical discretization schemes is demonstrated in [33].Physics-informed neural networks [43] incorporate knowledge or modeling assumptions on the physical model in the objective function.The hybrid PDE-NN approach proposed in [28,58,38] can itself be viewed as a neural network approach that learns a solution operator g → u of (3) from data and involves a solve of a partial differential equation in the output layer, which makes evaluations expensive compared to standard neural network architectures.Learning partial differential equations from data is also addressed e.g. in [1,18,19].
To assess the quality of the learned extension operator, we apply it to the FSI benchmark II [53].More precisely, we do a comparison similar to [56,Fig. 6].In order to be able to include different extension operators and assess their quality without having to modify the FSI system, we introduce a novel splitting scheme for the monolithic system.We discuss the hybrid PDE-NN approach in section 2. In section 3 we introduce the approach of correcting standard extension operators with a neural network.Section 4 presents the novel FSI splitting scheme and numerical results for the FSI benchmark II.Moreover, generalizability and computational costs of the approaches are discussed in sections 4.5 and 4.6.

Approach 1: Hybrid NN-PDE Approach
In order to find the extension operator via (3), we formulate an optimization problem to find ᾱ: Given boundary displacements g i , i ∈ {1, . . ., N d }, N d ∈ N, we search for weights θ such that min θ∈Θ,u∈W where λ denotes a regularization parameter, R(θ) a regularization term, Θ and W closed subsets of suitable Banach spaces, and ||| • ||| is defined by Moreover, u i biharm = u i biharm (g i ) solves the biharmonic equation for boundary conditions g = g i .In the numerical realizations in this work, we neglect the regularization term and set λ = 0.For future work, the term R can be used to add mesh quality measures to the objective function.
In order for the optimization problem to be well-defined, we have to choose ᾱ in a way such that the PDE constraint in (4) is solvable for all admissible θ ∈ Θ.This requires a suitable choice of Θ and ᾱ, which we address in section 2.2.We choose ᾱ such that it is defined via the derivative of a convex function.Conformal approximations of convex functions via finite elements or polynomials is challenging [12,54].It requires, e.g., the choice of appropriate basis functions and non-negative coefficients, or a set of constraints.Therefore, we model ᾱ via input convex neural nets, see [5,48] and section 2.2.

Choice of ᾱ
The partial differential equation ( 3) is not necessarily solvable if we choose ᾱ(θ, ξ, u, ∇u) arbitrarily.In order to find conditions for ᾱ under which (3) can be ensured to be uniquely solvable, we take inspiration from the p-Laplace equations.The weak formulation of the p-Laplace equations, p > 1, can be viewed as the optimality conditions of the optimization problem Since ( 6) is a convex optimization problem with strictly convex objective, (6) has a unique solution.More precisely, we have the following result.
Lemma 1 (see [34,Thm. 2.16]).Let Ω be a bounded Lipschitz domain, p ≥ 2, g ∈ W 1,p (Ω).Moreover, let Then, there exists a unique minimizer of the optimization problem (6) and the solution of the optimization problem is characterized by We prove a similar result for a more general class of PDEs.Hence, we consider a mapping Λ : R → R and the optimization problem min u∈W Ω Λ(∥∇u∥ 2 )dξ (8) and work with Lemma 2. Let Ω be a bounded Lipschitz domain and g, W be defined as in lemma 1.Let Λ : R → R be such that Λ is convex, strictly increasing, continuously differentiable, and assume that there exist a, b, d > 0, c ∈ R and p > 2 (or p = 2 and Λ being affine) such that Moreover, let α be defined by (9).Then the mapping is convex, continuous and Fréchet differentiable with derivative Proof.In order to show continuity and differentiability of F , we rewrite , where Since f 1 is linear, f 2 is strictly convex and Λ is strictly increasing and convex, F is convex.Due to [52,Sec. 4.3.3]we know that the superposition operator f 2 is continuous and continuously differentiable with derivative f ′ 2 (B) : Due to (10) and (11), we further obtain continuity and differentiability of f 3 with derivative f ′ 3 (β) : Due to linearity and boundedness, f 1 is continuous and differentiable with Hence, by applying the chainrule we obtain continuity and differentiability of F : Since W is a closed, affine linear subspace of W 1,p (Ω) d , this concludes the proof.Lemma 3. Let the prerequisites of lemma 2 be fulfilled.Assume further that there exist e ≥ 0, f > 0, such that Moreover, let g ∈ W 1,p (Ω) and W be defined by (7).Then, there exists a unique minimizer of the optimization problem (6) and the solution of the optimization problem is characterized by Proof.The proof follows the line of argumentation of [34,Thm. 2.16].Assume u 1 , u 2 ∈ W are solutions of (8).Since Λ is convex and strictly increasing and ∇u → ∥∇u∥ 2 is strictly convex, we know that ∇u → Λ(∥∇u∥ 2 ) is strictly convex.Hence, we know that ∇u 1 = ∇u 2 .Therefore, This shows uniqueness of minimizers for (8).
In the following, we show existence of solutions.By the continuity of F shown in lemma 2 and (15), we have Therefore, we can choose a sequence (v j ) j∈N ⊂ W such that Ω Λ(∥∇v j ∥ 2 )dξ < I 0 + 1 j for all j ∈ N. Due to (15), we know that ∥∇v j ∥ L p (Ω) is bounded.Since, by the Poincaré inequality, there exists a constant C > 0 such that ∥w∥ L p (Ω) ≤ C∥∇w∥ L p (Ω) for all w ∈ W 1,p 0 (Ω), we know that ∥v j −g∥ L p (Ω) ≤ C∥∇(v j −g)∥ L p (Ω) for all j ∈ N.This implies boundedness of the sequence (∥v j ∥ W 1,p (Ω) ) j∈N .Hence, there exists a weakly convergent subsequence (v j ) j∈J⊂N and v ∈ W such that v j ⇀ v weakly in W 1,p (Ω) for J ∋ j → ∞.Since, by lemma 2, F : W → R is convex and continuous, it is weakly lower semicontinuous.Therefore, F (v) ≤ lim inf J∋j→∞ F (v j ) = I 0 and v is the unique minimizer of (8).The first order necessary optimality conditions for the unconstrained optimization problem then yield (16).

Choice of the neural network
We do a slight abuse of notation and write α(θ, s) instead of α(s), as well as Λ(θ, s) instead of Λ(s), in order to stress that our choices of α and Λ depend on the weights and biases θ of a neural net.By the considerations of the previous section, we choose where Λ : Θ × R → R fulfills the requirements of (16).More precisely, we work with where η 1 > 0, η 2 ≫ η 1 , ϵ > 0, (•) +,ϵ denotes a monotonically increasing, smooth approximation of the max(•, 0) function and Λ(θ, s) denotes a continuously differentiable, monotonically increasing, non-negative Input Convex Neural Network [5,48], i.e. s → Λ(θ, s) is convex, that fulfills Λ(θ, s) = O(s) for s → ∞.The last summand is not realized in the numerics and only needed if the second summand is zero.
The reason why we choose this specific form of α is due to the fact that for small displacements the harmonic extension operator is a computationally cheap and appropriate choice and we want to keep the linearity of the extension operator close to the identity, which is also beneficial for Strategy 3 in section 4.3.2.
In order to fulfill the requirements we ensure that the weights of Λ are non-negative (we realize this by using squared weights).Let σ denote an activation function.The neural network Λ(θ, s) = x k+1 is given by the recursion with weights ((W 0 , b 0 ), (W 1 , b 1 ), . . ., (W k , b k )).This type of neural networks is known as a multilayer perceptron (MLP).
For the hybrid PDE-NN approach, in order to ensure the properties of Lemma 3 we choose where s : t → t 2 , and we use the notation in [26] to denote by s[•] the entrywise application of s to a matrix •.We work with the differentiable activation function σ(x) = ln(1 + e x ), with its derivative, the sigmoid function, σ ′ (x) = e x 1+e x = 1 1+e −x .Moreover, since it is not present in the derivative of the network, we choose the bias in the output layer as b θ k = 0, where k is the depth of the neural network.Its derivative is therefore given by d d s Λ(θ, s) = y k+1 , where Hence, we work with a neural network architecture that is motivated by theoretical considerations.

Approximation Properties of Input Convex Neural Network Architectures
We discuss approximation capabilities of a class of shallow, input-convex neural networks.We show that for one-dimensional input the ansatz class guarantees both convexity and universal approximation which renders it a promising candidate for later numerical studies.The convexity is crucial to ensure the solvability of the PDE at every iteration, whereas the universality provides the guarantee that any convex function can be approximated.
Lemma 4. Let f ∈ C 0 (R) be convex, coercive, piecewise affine linear with n ∈ N linear regions.Then f can be written in the form for a i ≥ 0 and w i , b i , c ∈ R.
Proof.Follows via induction over the number of linear regions.
Lemma 5. Let f ∈ C 0 (R) be convex.Then f can be approximated by ReLU networks as specified in Lemma 4 uniformly on compacta.This means, for any compact subset Proof.Follows from Lemma 4 and uniform continuity of f on any compact set K ⊂ R.
Remark 1.For d ≥ 2, shallow input convex neural networks of the form where c i ≥ 0 and ρ = ReLU , yield no universal approximation to convex functions, see section A. Similar results for classical discretizations can e.g.be found in [12,54].There are other ansatz classes for input convex neural networks that have better approximation properties, see e.g.[10].

Approach 2: NN-corrected Harmonic Extension Approach
In contrast to the approach of learning ᾱ in a non-linear PDE (3) defining an extension operator, we explore a more direct approach by learning a neural network that realizes a pointwise, additive correction to the harmonic extension operator.This neural network takes pointwise input features from the harmonic extension and produces a correction to deal with boundary deformations where the harmonic extension degenerates.Typical boundary extension operators are based on PDEs, so we choose to use the component values and first order derivatives as input features, along with spatial coordinate ξ.We define the neural network-corrected harmonic extension as where u harm solves (1) with boundary condition u = g and N θ is a learned function with parameters θ.The function l is zero on the boundary ∂Ω and positive in the interior Ω, such that (22) satisfies the boundary condition u| ∂Ω = g exactly and the neural network affects the extension in every point in the interior [37].We define the learned extension in this way because even for small deviations from the boundary conditions, the discretized geometry will be incorrect for the problem and introduce numerical artifacts for FSI simulations.We construct our neural network as a standard multi-layer perceptron, defined by the recursion (19) and N θ (x) = x k+1 for a network with k hidden layers.We write θ for the collection of weights W ℓ and biases b ℓ , the parameters of the network to be trained, and refer to the dimensionality of x ℓ as the width of the ℓ-th layer.We choose the Rectified Linear Unit, ReLU(x) = max(0, x), as activation function.We choose to base our extension on the harmonic extension, since it is a linear equation with fast, order optimal solvers available and is suitable for modest boundary deformations.It is therefore a good starting point for the correction, without being prohibitively expensive to compute.Moreover, for a fixed architecture, evaluating the learned neural network correction N θ scales linearly with the number of mesh points in terms of runtime and thus the NNcorrected harmonic extension approach should scale well for larger meshes.
We determine the function l in ( 22) as the solution of the Poisson problem which for a wide class of f give solutions l that are strictly positive in the interior Ω and smooth, see lemma 6.The option of f ≡ 1 is one choice of f such that this lemma holds.
, is bounded, non-negative and not identically zero, Ω is bounded, and ∂Ω is regular at every point, then there exists a unique classical solution l of (23), Proof.By [23,Thm. 4.3], the Dirichlet problem ( 23) has a unique classical solution if every boundary point of Ω is regular and f is bounded and locally Hölder continuous over Ω.Thus, there exists a classical solution l of ( 23), which is by definition C 2 (Ω) ∩ C( Ω).By the strong maximum principle, [23, Thm.3.5], since ∆l ≤ 0 in Ω and l = 0 on ∂Ω, l is positive in the interior.Since l is a classical solution of (23), it is also a weak solution in H 1 (Ω) and is Remark 2. The domain in the FSI benchmark example is two-dimensional and bounded by a finite number of simple closed curves, hence all boundary points are regular [23, p.26].
The fact that l is strictly positive in the interior ensures that it is zero nowhere, so the neural network N θ can correct the extension everywhere.The values of l can also be normalized such that 0 < l(ξ) < 1 everywhere in Ω, since by scaling f the solution of the linear PDE (23) will scale accordingly.
The function f was chosen such that the resulting l solving (23) weights the areas where mesh degeneration is expected to happen, and is proportional to which we then normalized.Figure 2 shows the difference in l resulting from choosing f by (24) compared to using the simple f = 1.The expression for f by (24) was chosen by first assuming f only depends on x, due to the near symmetry of the domain.If ( 23) is actually one-dimensional, the equation reduces to choosing the curvature of l, by −l ′′ = f in Ω.Then, f close to zero in the right half of the domain makes l close to zero as well, due to the boundary condition l = 0 on the right boundary and positivity due to the maximum principle.With this motivation, we experimented with terms that kept f positive over the whole domain and small in the right half, until we found an l we were satisfied with.The significance of the choice of f is analyzed in section 4.4.
Since closed form solutions of ( 23) for arbitrary domains Ω and source terms f are in general not available, the solution is approximated using an appropriate discretization.Maximum principles do not necessarily translate into discrete ones, so since positivity of l in the interior is necessary, we selected continuous linear Lagrange (P 1 ) finite elements with a Delaunay triangulated mesh [27,Thm. 3.1], which together satisfy a discrete maximum principle.We note that we only need to solve for l once for any given mesh and reuse the solution afterwards.
The neural network correction N θ defines a d-dimensional vector field over the domain that we can evaluate where needed to solve the total computational problem.However, for the use in the FSI test problem it is convenient to embed the learned mesh displacement into a finite element function space.To this end, similar to the hybrid PDE-NN approach, we use the space of continuous piecewise quadratic functions (P 2 ).For simplicity, the P 2 embedding of N θ is obtained by linear interpolation (into midpoints) of its P 1 representation.In particular, this construction only requires evaluation of N θ at mesh vertices.
In this work we restrict ourselves to neural networks operating on pointwise values, allowing the same network to be used for any triangulation of the domain, just by applying the network to each mesh point.We then need to encode sufficient information into the network to allow for correction of the mesh extension of varying boundary deformations.For example, predicting the extension correction using only the spatial coordinates ξ is not practicable, since the network would not be able to distinguish the different boundary conditions u| ∂Ω = g.In addition to ξ we also include information from the already computed harmonic extension u harm .To complement this (vertex-)local information, the final network input is an approximate gradient of u harm at the vertex.We remark that since u harm is represented by a C 0 -conforming finite element function, its gradient in a vertex cannot be simply obtained by standard nodal interpolation.Instead, we apply the Clément interpolation [13], which computes the approximation of ∇u harm as the L 2 projection over a patch composed of the finite element cells connected to a given vertex.We note that the L 2 approximation error of the recovered gradient (as P 1 function) decreases linearly with the mesh size [13].Let us finally note that the local averaging gathers information from the neighboring vertices and is in this sense similar to aggregation in graph-neural networks [60].In summary, the inputs to our neural network N θ at ξ are (ξ, u harm (ξ), D c u harm (ξ)), where D c denotes approximating the Jacobian by Clément interpolation.

FSI example
In order to test the learned extension operators and compare them with respect to extension quality, we apply them to the FSI benchmark II. 1 We consider the coupling of the Navier-Stokes equations with St. Venant-Kirchhoff (STVK) or Incompressible Mooney-Rivlin (IMR) type material and choose a monolithic ALE setting formulated on a fixed reference domain.
For the sake of convenience and brevity, we present the PDE system with a harmonic extension equation for the fluid displacement -other extension equations are more appropriate for large displacements and can be straightforwardly included into the system of equations [56,24] as long as they can be represented as PDEs.Let ρ f , ρ s denote the fluid density and structure density, and ν f denote the fluid viscosity.Moreover let µ s , λ s and µ 2 denote Lamé parameters.The FSI model reads as follows.
where T > 0, Ω f denotes the fluid domain with boundary Γ f ∪ Γ i .Γ f,D ⊂ Γ f denotes a subset of the boundary on which Dirichlet boundary conditions for the fluid flow are prescribed.Ω s denotes the structure domain with boundary Γ f ∪ Γ i , where Γ i denotes the interface between the fluid and structure domain.v f and v s denote the fluid and structure velocity, respectively, p f the fluid pressure, w s the structure displacement, and w f the extension of the structure displacement to the fluid domain.In addition, and for STVK or IMR type material.In case of IMR type material we add the equation In order to be able to work with functions defined on the whole domain, in the case of STVK type material, we add the equation for a constant α p = 10 −9 .Whenever it is clear from the context we write J, E, σ f , σ s instead of J(F ), E(F ), σ f (F ), σ s (F ).In order to simplify notation, we also do not explicitly write the subscripts for the states v, p and w.D• denotes the Jacobian of function •.We have compatibility conditions: For the sake of clarity, we restrict the presentation to the STVK type material, mention explicitly when IMR type material is considered, and assume that Γ f,D = Γ f for the presentation of the weak formulation and the splitting scheme (see section 4.1).The weak formulation (for STVK type material) is given by: Find (v, p, w) for all (ψ v , ψ p , ψ w ) ∈ V 0 × P × W 0 .

Splitting the FSI problem into smaller subproblems
In order to simplify the use of arbitrary extension operators, we propose a splitting scheme that handles the extension in a separate step.To do so, we build on ideas and techniques Algorithm 2: Solve FSI equations Input : (v(0), p(0), w(0)) (discretized as described in section 4. that are also present in, e.g., [21,57].The method proposed in [21] substitutes the condition We motivate a similar procedure, use the reasoning of [21] to reduce the system but combine it with another idea.After transformation to the physical domain, the choice of w f in the interior of Ω f does not affect the result of the fluid solution (as long as it is ensured that the corresponding transformation is bi-Lipschitz).Therefore, in the first step, we use w(t + ∆t) = w(t) + ∆t 0 v(t + s)ds, where w(t) denotes a deformation field at timepoint t.After having computed v(t + ∆t) and p(t + ∆t), we compute w| Ωs (t + ∆t) and w(t + ∆t) by applying an extension operator of the solid displacement onto the fluid domain.In this way, we solve for (v, p) and w separately without doing an approximation.Then, we need to recompute (v, p) for the modified deformation field w (which replaces w for the time-stepping).Thus, the procedure requires to solve the fluid system twice, see also Algorithm 1.
More precisely, assume we know the states w, v, p at the time step t.In order to get the states at time step t + ∆t we have to solve three systems of equations.For the case of using an harmonic extension operator the procedure is given as follows.
Second system.Given the solution of the first system we consider the solution of where we impose the Dirichlet boundary conditions w| ∂Ω f = ( w + t+δt t v(ξ)dξ)| ∂Ω f .The solution of this system is used to update w. (Here, we use the subscripts • f and • s in order to clarify that these two equations are solved independently.)It is straightforward to replace the harmonic extension with an arbitrary extension operator that extends the boundary displacement to the interior of the domain, e.g., also those that are not defined via the solution of a PDE.
Third system.Given w we obtain the updated v, p as the solution of where

Discretization
The discretization is done similar to [56,21,24].We use a triangulation of the domains and choose the lowest-order Taylor-Hood finite elements for the velocity and pressure.The deformation is discretized using piecewise quadratic continuous finite elements.As in [56], we further choose a shifted Crank-Nicolson scheme (θ-scheme with θ = 1 2 + ∆t max for sufficiently small ∆t max > 0) for performing the time-stepping in case we work with STVK type material and implicit Euler as time-stepping scheme when we work with IMR type material.The pressure term and the incompressibility condition are handled implicitly.Analogous to [55], in the first summand of the weak form (Jρ , which is a convex combination of the determinant of the deformation gradient of the previous time t n−1 and the current time t n . Since the splitting schemes requires a smaller time-step size than classical monolithic approaches, we work with a time-step size that is iteratively adapted.We choose a time-step-size range [∆t min , ∆t max ] and start with time-steps of size ∆t = ∆t max .If the systems with the maximal time-step get unsolvable, we choose ∆t = max(∆t min , 1 2 ∆t).This is repeated until we obtain solvability or reach ∆t min .If the system is not solvable for ∆t = ∆t min , we stop.After each successful step we adapt the time-step by setting ∆t = min(2∆t, ∆t max ).The algorithm is sketched in Algorithms 1-2.
Simulations are performed on a desktop computer with an AMD Ryzen Threadripper 3970X 32-Core CPU and 128 GiB memory.Neural networks for the NN-corrected harmonic extension approach are trained on an NVIDIA GeForce GTX 1660 SUPER GPU.Nonlinear systems are solved using a Newton based nonlinear solver with linesearch in PETSc [8,9,15].Linear systems are solved using the direct solver MUMPS [3,4].

Numerical Results for FSI benchmark problem II
In order to validate the splitting of the FSI system in section 4.1, we apply it to the FSI benchmark II [53] and plot the displacement of the tip of the flap in Figure 3, compare e.g.[53, p. 256] or [24,Fig. 5.6].We test the harmonic extension, the biharmonic extension and an incremental version of the harmonic extension where the extension is performed on the deformed domain of the previous timestep and only the change of the displacement is extended harmonically, see [47].Moreover, Figure 3 shows the minimal determinant value of the deformation gradient.The simulations of the standard and incremental (see [47]) harmonic extension break if the determinant of the deformation gradient becomes negative.Therefore, with these approaches, it is not possible to simulate the whole time interval.

Generation of Training Data
Training data via FSI benchmark II.For a proof of concept study, we first generate training data by running the FSI benchmark II with the biharmonic extension and a time-step size of 0.0025, and save the deformations for every time step in (15,17] for the hybrid PDE-NN (and (11,17] for the NN-corrected harmonic extension approach), i.e., we generate N d = 800 (and N d = 2400, respectively) data points.
Remark 3.This procedure only makes sense if the FSI problem is solved several times as it is the case, e.g., if the FSI equations are the governing equations of a PDE constrained optimization problem.Moreover, since it is not necessary that the test data is physically realistic, the test set can also be artificially generated.Instead of solving the biharmonic equations and using this as a reference value, also unsupervised learning approaches can be considered, where Ext θ denotes an extension operator that depends on the parameters θ and J denotes a quality measure for the deformed mesh.For example, one can choose J such that the determinant of the transformation gradient is bounded away from 0, see [25], or another mesh quality measure is optimized.This, however, is left for future research.

Artificial training data.
We also created an artificial dataset to train our networks on.This artificial dataset was created by solving for the stationary deformation of a neo-Hookean body representing the solid domain, with the same Lamé parameters as the solid in the FSI simulations.Specifically, we prescribe homogeneous Dirichlet boundary conditions for the displacement on the cylinder boundary and let the body deform under boundary loads selected by hand to create deformations of the solid domain resembling by eye the deformations encountered in the FSI benchmark problem II.Six different load configurations were selected and for each of these configurations 101 different deformations were computed, by varying the amplitude of the loads by cosine factors for 101 equally spaced θ from 0 to 2π.These solid deformations were extended to the fluid domain by the harmonic and biharmonic extension, to serve as input data and target data respectively.
In each base load configuration, there is one boundary load acting vertically on the right side of the deformable body and also a boundary load acting vertically and equally on the bottom and top side of a limited part of the deformable body,  given by (0, else.
The parameters used to create the artificial dataset are given in table 1.

Hybrid PDE-NN Approach
In order to reduce the computational burden, we only take every ⌊ N d N ⌋th training data point into account and work with N = 20 for the FSI training data set and N = 60 for the artificial data set.Every objective function evaluation requires the solution of N nonlinear PDEs (which could potentially be done fully in parallel).In our numerical realization we do not use regularization, i.e., we work with λ = 0.
We choose α to be a neural network with two hidden layers of width 5 with bias on the two hidden layers.Let θ opt denote the approximate solution of the optimization problem (4) that we obtain by applying an L-BFGS method with at maximum 100 iterations.Figure 5 visualizes a neural network for θ = θ opt that we obtained by training on the FSI and artificial dataset.In our numerical experiments ∥∇u∥ 2 reaches values up to around 0.6 for the nonlinear learned extension operator applied to the FSI benchmark II.
We present numerical results for the FSI benchmark II in Figure 6 and Figure 7, where we choose ∆t max = 0.01 and ∆t min = 1 128 ∆t max .More precisely, we show the y-displacement of the tip of the flap and the minimal determinant of the deformation gradient, which serves as an indicator for mesh degeneration, see [56].We consider three different strategies for solving (16): • Strategy 1: First we try the learned nonlinear extension operator, i.e., we solve (3) with ᾱ defined by ( 18) and θ = θ opt .This strategy does not fail due to mesh degeneration.However, it requires several Newton-steps for large displacements.
• Strategy 2: In order to avoid the nonlinearity, we consider a strategy that uses a linear version of the extension operator.Working with a "lagging nonlinearity", i.e., solving (3) with ᾱ defined by ( 18) being approximated by ᾱ(θ opt , ξ, u, ∇u) ≈ α(θ opt , ∥∇u old ∥ 2 ), (32) where u old denotes the displacement of the previous time-step, shows unsatisfactory numerical behavior.Therefore, we combine it with the idea of using incremental versions for extension operators, see [47].We present results for the incremental "lagging nonlinearity" strategy, where we solve and use the update This strategy is successful even for the largest appearing displacements.However, the mesh quality decreases during the course of the simulations and it is not possible to simulate the whole 15 seconds.
• Strategy 3: In order to counteract the decreasing mesh quality that occurs with the second strategy during the oscillations, we consider a third strategy that combines strategies 1 and 2. If the vertical displacement of the tip of the flap is below a certain threshold (we use 0.005), we work with strategy 1, otherwise we choose strategy 2. This approach allows us to simulate the whole 15 seconds and the mesh quality stays stable during the oscillations.Moreover, the extension equation of strategy 1 just needs to be evaluated for few time-steps and in a regime with small deformations at the tip.Instead of using strategy 1 for small deformations, one can also directly use the harmonic extension.We design the hybrid PDE-NN extension such that it is equal to the harmonic extension for small deformations, see (18).

NN-corrected harmonic extension approach
For the NN-corrected harmonic extension, we partition the FSI dataset by selecting the first 1800 snapshots for the training dataset, the following 200 snapshots for the validation dataset, and the last 400 snapshots for the test dataset.The validation set is used to make decisions during training, such as for learning rate scheduling, but is not used to compute gradients of the cost function by backpropagation.The test set is set aside until the networks are fully trained and then used to evaluate the performance of the networks.Because the artificial dataset is not periodic like the FSI dataset, it was randomly split into 85% training data and 15% validation data.The FSI test dataset is used for testing networks trained on both the FSI dataset and the artificial dataset.We use a standard MLP for N θ with 6 hidden layers of width 128 and ReLU activation function.The network has 8 input features (as here Ω ⊂ R 2 ), one for each component of ξ,    u harm , and ∇u harm , and 2 output features, one for each component of the extension.Before the MLP, we include a normalization layer, which transforms the neural network's inputs by the affine transformation x j → x j −µ j σ j , where the real numbers µ j , σ j , j = 1, 2, . . ., 8 are chosen such that the inputs of N θ will have componentwise zero mean and unit variance over all vertices and snapshots of the training set.This normalization layer is part of the neural network's architecture and therefore the values of µ j and σ j are not changed between training and testing.In contrast to normalization approaches like batch and layer normalization, the values of µ j and σ j are not trainable parameters.
We implement the neural network N θ using PyTorch [41].The harmonic extension u harm and boundary condition-preserving function l are computed using FEniCS [2], the FEM we use for the rest of our computations.The correction made by N θ is translated into the finite element function spaces by adding the vertex evaluations of N θ to the basis coefficient vectors.
We train our networks in a supervised fashion by minimizing the sum of absolute error in each mesh vertex for all snapshots, with the biharmonic extension as target.This can be formulated in the learning problem where V denotes the set of vertices in the computational mesh used to generate the dataset.
To solve (33), we used the AdamW optimizer [35], with random batches of 128 snapshots and every mesh vertex included for each snapshot in the batch.We used the PyTorch-built in ReduceLROnPlateau learning rate scheduler, with the reduction factor set to 0.5 and otherwise default parameters, and weight decay regularization with factor 0.01.On the FSI dataset we trained the network for 500 epochs.When training on the artificial dataset, which is smaller, we stopped the training process after only 200 epochs, as we found the training process had stabilized by then.
The network and training hyper-parameters were chosen by using common practices in deep learning and no extensive hyper-parameter search was conducted.The number and width of the hidden layers was increased until we found the network produced satisfactory mesh motion.A brief analysis of the models dependence on random initialization, the function l used, and the number of hidden layers in the neural network is included in section 4.4.
To predict the correction for a given harmonic extension u harm , we pre-process u harm to supply the Clément interpolated gradients at each mesh vertex.For the training process, we did this pre-processing beforehand.When using the trained network as part of the FSIsimulation, the Clément interpolation is done in each time step before the second step of the splitting method described in section 4.1.This interpolation is cheap when done repeatedly, as it can be realized by multiplication with the same pre-assembled sparse matrix for any given mesh.As can be seen in Figure 20, the cost of the Clément interpolation is modest compared to computing the harmonic extension and neural network output and scales well with increased mesh refinement.
To test the fully trained neural network correction (22), we evaluate it by computing the scaled Jacobian mesh quality measure over the FSI test set and by using the extension strategy on the same FSI test problem as the hybrid PDE-NN approach.We solve the FSI-system (25) using the same splitting and discretization approach, outlined in sections 4.1-4.2.The NN-corrected extension operator is not a PDE, but the splitting of the system allows us to use this extension.
The scaled Jacobian mesh quality measure returns the value 1 for equilateral triangles and goes to zero as a cell degenerates.The extension trained on the FSI dataset was able to produce a quality mesh for all snapshots in the test set, with the minimal scaled Jacobian   mesh quality measure being approximately 0.32 across all cells and snapshots.Trained on the artificial dataset, the extension was able to produce a functional mesh for all snapshots, but not of high quality.On some snapshots, the resulting mesh was nearly degenerate, with minimal scaled Jacobian mesh quality indicator being approximately 0.07 across all cells and snapshots.The minimum cell values of scaled Jacobian mesh quality indicator over the FSI test set are shown in Figure 8.In Figure 9, we show the cell resulting in the lowest scaled Jacobian mesh quality value before and after the mesh motion.When implemented as the mesh motion strategy in the FSI benchmark II simulation, the FSI-trained extension is successful and the simulation runs the whole 15 seconds period, as shown in Figure 10.Using the extension trained on the artificial dataset, the FSI simulation was also able to run for the whole 15 second period, as shown in Figure 11, but with lower mesh quality.
Figure 12 shows a histogram of inputs at ξ = (0.6, 0.21), the upper right corner of the flag, which is one of the points where we observed the mesh to degrade the most.The missing tails in the neural network inputs at this point from the artificial dataset indicate that there are characteristic deformations in the FSI simulation that are not available for the artificial dataset.
The network was trained on a GPU and the process took approximately 16 minutes.Evaluating the network to produce an extension correction for every vertex of the mesh in the FSI simulations takes approximately 10 milliseconds per time step.The simulations run to pro- spent on evaluating the neural network, although the real number is higher, since a portion of the time steps were rejected because the non-linear solver did not converge.In comparison, the total run time of the simulations was one order of magnitude higher than the training process.

Parameter study
We performed a parameter study for the neural network in the NN-corrected harmonic extension by analyzing the quality of meshes produced with varying random seeds and hyperparameter selections.The factors we varied are the number of hidden layers in the network and the choice of boundary condition-preserving function l used.
To analyze the sensitivity of the network to random initialization, we trained 10 networks with equal hyperparameters as in section 4.3.3,using the random seeds 0-9.For the fully trained networks, we then computed the scaled Jacobian mesh quality indicator over the deformed fluid domains of the FSI test set.In Figure 13 we plot the median and quartiles over the 10 random initializations of the minimal cell value of scaled Jacobian mesh quality.
From Figure 13, we can see that not all the trained networks produce quality mesh extensions.For some random initializations, the scaled Jacobian mesh quality approaches 0 at certain points of the periodic solid deformation.
To judge the importance of the choice of boundary condition-preserving function l, we train      2: Architectures and parameter counts used for analysis of impact of depth in neural network-correction of harmonic extension.
10 neural networks to correct the harmonic extension, but compute l from (23) with f ≡ 1 instead of using the right hand side (24).Other than the choice of l, the network architecture and training parameters are the same as those used in section 4.3.3.We compute the scaled Jacobian mesh quality indicator's minimum over the mesh elements for the snapshots in the FSI test set and show the median and quantiles over the 10 random initializations in Figure 14.
The results shown in Figure 14 are similar to those in Figure 13, but with lower quality extensions in some areas.These results indicate that NN-corrected harmonic extension can be successful with l that are not hand-tuned for the specific task at hand, but the use of hand-tuned l can be beneficial.
To judge the impact of the number of hidden layers in the neural network correction, we analyzed 10 random initializations of networks with 2, 3, 4, and 5 hidden layers and similar complexity.In these experiments, the width of the network was always chosen such that the total number of parameters was as close as possible to the number of parameters in the architecture used in section 4.3.3.This was done by adjusting the width of the networks.The architectures used are summarized in table 2.
Using the neural network architectures in table 2, we run the training process from section 4.3.3 with 10 different random initializations for each architecture.We compute the scaled Jacobian mesh quality measure's minimum value over all mesh elements over the FSI test set and report the median and quantiles over the random initializations in Figure 15.
The results in Figure 15 indicate that the NN-corrected harmonic extension requires a sufficiently deep network to learn a stable and high-quality mesh extension.
Judging only by the scaled Jacobian mesh quality indicator, a shallower network of for  instance 3 hidden layers can produce satisfactory results.However, we found that deeper networks tend to produce smoother mesh motions and chose to use 6 hidden layers in our network.We show how the smoothness of the mesh motion increases with depth in Figure 16, where we plot the vertical displacement over the FSI benchmark problem II test set of two vertices located at the tip of the deformable solid.

Generalizability
In order to assess the generalizability of the extension operators, we apply the extension operators to larger deformations (section 4.5.1) and for a different geometry (section 4.5.2).

Gravity-driven deformation test
We analyze the quality of transformed meshes obtained by extending solid boundary deformations produced from an elasticity problem [47, section 4].Using the geometry of the FSI benchmark problem II, we apply a uniform gravitational load (0, f grav ) to the solid, with homogeneous Dirichlet boundary conditions on the cylinder boundary and zero-traction boundary conditions elsewhere.We then compute the transient displacement of the solid according to the St. Venant-Kirchoff model (26), with the same material constants as in the FSI benchmark problem II.The elasticity problem is discretized using implicit Euler time-stepping and second order Lagrange elements in space.
For three different values of f grav ∈ {1.0, 2.0, 2.5}, we simulate the system until the solid first reaches its maximum deformation in the y-direction.These maximal solid deformations are then extended to the fluid domain to evaluate the quality of the meshes produced by the different mesh motion techniques.The maximal value of f grav = 2.5 was chosen as this, for our set-up and discretization, results in deformations close to the limit of biharmonic mesh motion.
Similar to [50,56,47], loss of bijectivity of the ALE-mapping is identified by sampling J = det(I + ∇F ) at the degree of freedom locations of a sixth order discontinuous Galerkin space.If a negative value of J is found in an element, we change the sign of the measured mesh quality indicator for that cell to be negative.Thus, if a deformed mesh contains negative mesh qualities in our experiments, the mesh is degenerate.
Figure 17 shows histograms of the scaled Jacobian mesh quality indicator, with negative sign for degenerate cells, for the three test deformations with the biharmonic extension, the harmonic extension, the hybrid PDE-NN extension trained on FSI-data and artificial data, and the NN-corrected harmonic extension trained on FSI-data and artificial data.
The setup of this test problem is similar to the FSI benchmark problem II and Figure 17 shows that the learned extensions are able to produce meshes of comparable quality to the biharmonic extension.The extensions trained on FSI data perform better than the ones trained on the artificial dataset.The artificial-trained hybrid PDE-NN extension degenerates on the two largest deformations and the artificial trained NN-corrected harmonic extension degenerates on the largest deformation.The other learned extensions are non-degenerate in that the mesh quality is bounded away from zero and the determinant of the deformation gradient is positive.

Membrane on fluid test
In order to test the generalizability properties of the different extension operators (trained on the FSI benchmark II dataset) onto different geometries, we consider the membrane fluid test presented in [56,Fig. 7,Sec. 4.2].We generate another data set by running an FSI simulation with biharmonic extension operator, IMR type material and parameters ρ f = 10 3 , ν f = 4 • 10 −3 , ρ s = 8 • 10 2 , µ s = 2 • 10 7 and µ 2 = 1 • 10 5 .We apply the different extension operators to the three snapshots of the simulation depicted in the first line of Figure 18.
Applying the trained hybrid PDE-NN extension to the new geometry is straight forward, since it includes no specifics about the geometry it is used for.In contrast, for the neural network-corrected harmonic extension, one has to compute a new boundary conditionpreserving function l.Switching l between training and testing amounts to a change in network parameters between training and testing, with which it is not reasonable to expect the network to succeed.Also, since the position vector is an input to the network, the network is specific to the training geometry and transferability cannot be expected.
Figure 18 shows histograms of the scaled Jacobian mesh quality indicator, with negative sign for degenerate cells, for the three different snapshots of the membrane FSI problem with the biharmonic extension, the harmonic extension, and the hybrid PDE-NN extension trained on FSI-data and artificial data.The sign for degenerate cells is determined in the same way as for the gravity-driven motion test.
From the results shown in Figure 18, we can conclude that the hybrid PDE-NN extensions are to some extent transferable between geometries.

Runtime comparison for FSI benchmark II
Figures 19, 20 present the average computational time of the different extension operators on the first 40 snapshots of the FSI benchmark II data set split into assembly, linear solves, neural network correction and rest (e.g.mesh moving) for three different refinement levels (with 3935, 15300 and 60320 vertices).While the runtime for the NN correction just slightly increases the runtime of the harmonic extension, the hybrid PDE-NN variants increase the runtime significantly.Qualitatively, the plots for the incremental linearized variants are similar to the results presented in [47,Fig. 7].The assembly and factorization of the matrices for the harmonic and biharmonic extension only needs to be done once while initializing the extension    operators.For the (linearized versions of the) hybrid PDE-NN approach, the matrices change and the assembly needs to be done several times.The computational time for the training process is not reflected in these plots.The training takes approximately two hours and 18 minutes for the hybrid PDE-NN approach and approximately 16 minutes for the NN-corrected harmonic approach.The hybrid PDE-NN training can be significantly sped up if the PDEs for each function and gradient evaluation are computed in parallel, instead of sequentially solving for each training data point.

Conclusion
We presented two approaches to learn an operator that extends boundary deformation to the interior of the domain.We chose a supervised learning approach such that the operators approximate the solution of the biharmonic equation.We demonstrated that the learned operators have the potential to serve as mesh motion technique for FSI simulations.So far, we considered supervised learning approaches.For future work, e.g. in cases where the biharmonic extension does not give satisfactory results, it is also interesting to consider unsupervised approaches, where one tries to optimize a measure for the quality of the deformed mesh.Moreover, we worked with approaches that require the solution of a partial differential equation.Considering approaches that do not require PDE solves is another interesting direction for future research.Hence, for x = (α, 0) ⊤ , α > 0, we obtain f θ ((α, 0) ⊤ ) ≥ 2α.However, h((α, 0) ⊤ ) = α.This yields a contradiction and implies that h(x) can not be represented by a function of the form (35).

Figure 3 :
Figure 3: Numerical results for harmonic and biharmonic extensions.

Figure 4 :
Figure 4: The body deformations caused by the six base load configurations the artificial dataset is based on.

Figure 5 :
Figure 5: Visualization of learned neural net α(θ, s) in the hybrid PDE-NN approach based on training data via FSI benchmark II (left) and artificial training data (right).

Figure 6 :
Figure 6: Numerical results for hybrid PDE-NN approach, trained on the FSI benchmark II training set.

Figure 7 :
Figure 7: Numerical results for hybrid PDE-NN approach, trained on the artificial training set.

Figure 8 :
Figure 8: Minimum cell value of scaled Jacobian mesh quality indicator over the FSI test set for biharmonic extension and NN-corrected harmonic extension trained on FSI dataset and artificial dataset.

Figure 9 :
Figure 9: Cell which attains lowest value of scaled Jacobian mesh quality across all timesteps with the NN-corrected harmonic extension, before (left) and after (right) mesh motion.The left cell has scaled Jacobian mesh quality value of 0.755 and the right cell has scaled Jacobian mesh quality value of 0.068.

Figure 10 :Figure 11 :
Figure 10: Numerical results for the NN-corrected approach compared to the biharmonic and hybrid PDE-NN approaches, trained on the FSI benchmark II dataset.

Figure 12 :
Figure 12: Histogram of neural network inputs at upper right corner of the flag in FSI and artificial datasets.Parts of the distributions for the FSI dataset are not represented in the artificial dataset, indicating that there are features of the deformations in the FSI simulation that are not included in the artificial dataset.

Figure 13 :
Figure 13: Quantiles of minimum over cells of scaled Jacobian mesh quality indicator over snapshots in test dataset for 10 trained networks with varying initializations and same hyperparameters as in FSI-study.

Figure 14 :
Figure 14: Quantiles of minimum over cells of scaled Jacobian mesh quality indicator over snapshots in test dataset for 10 trained networks with varying initializations and same hyperparameters as in FSI-study.The boundary condition-preserving function l is computed from (23) with f ≡ 1.

Figure 15 : 16 :
Figure 15: Quantiles of minimum over cells of scaled Jacobian mesh quality indicator over snapshots in test dataset for 10 trained networks with varying initializations for a varying number of hidden layers.Other than width and depth of neural networks, the same hyperparameters are used as in section 4.3.3 and the total number of parameters is as close as possible to the number of parameters used in section 4.3.3.

Figure 17 :
Figure 17: Mesh quality histograms for the biharmonic, harmonic, hybrid and NN-corrected extension operators on three different deformations of the gravity-driven deformation test (depicted in the first row)

Figure 18 :
Figure 18: Mesh quality histograms for the biharmonic, harmonic, and hybrid PDE-NN extension operators on three different deformations of the membrane on fluid test (depicted in the first row)

Figure 21 :
Figure 21: Contour plot of the function h (left) and a function f θ of the form (35) (right).This function is a lower bound for every function of the form (35) that is equal to h in [−k, 0] × (−k, k).