Multilevel domain decomposition-based architectures for physics-informed neural networks

Physics-informed neural networks (PINNs) are a powerful approach for solving problems involving differential equations, yet they often struggle to solve problems with high frequency and/or multi-scale solutions. Finite basis physics-informed neural networks (FBPINNs) improve the performance of PINNs in this regime by combining them with an overlapping domain decomposition approach. In this work, FBPINNs are extended by adding multiple levels of domain decompositions to their solution ansatz, inspired by classical multilevel Schwarz domain decomposition methods (DDMs). Analogous to typical tests for classical DDMs, we assess how the accuracy of PINNs, FBPINNs, and multilevel FBPINNs scales with respect to computational effort and solution complexity by carrying out strong and weak scaling tests. Our numerical results show that the proposed multilevel FBPINNs consistently and significantly outperform PINNs across a range of problems with high frequency and multi-scale solutions. Furthermore, as expected in classical DDMs, we show that multilevel FBPINNs improve the accuracy of FBPINNs when using large numbers of subdomains by aiding global communication between subdomains.


Introduction
Scientific machine learning (SciML) [1][2][3][4][5] is an emerging and rapidly growing field of research. The central goal of SciML is to provide accurate, efficient, and robust tools for carrying out scientific research by tightly combining scientific understanding with machine learning (ML). The field has provided many such tools which have enhanced traditional approaches, from accelerating simulation algorithms to discovering new scientific phenomena.
One popular SciML approach is physics-informed neural networks (PINNs) [6,7]. PINNs solve forward and inverse problems related to differential equations by using a neural network to directly approximate the solution to the differential equation. They are trained by using a loss function which minimizes the residual of the differential equation over a set of collocation points. The initial concepts behind PINNs were introduced by [6] and others, and later re-implemented and extended in [7]. One of the advantages of PINNs over traditional methods for solving differential equations such as finite difference (FD) and finite element methods (FEM) is that they provide a mesh-free approach, paving the way for their application to problems with complex geometry or in very high spatial dimensions; cf. [8]. Furthermore, they can easily be extended to solve inverse problems by incorporating observational data.
Since their invention, PINNs have been employed across a wide range of domains [3,9]. For example, they have been used to solve forward and inverse problems in geophysics [10], fluid dynamics [11][12][13], and optics [14]. Many extensions of PINNs have also been proposed. For example, PINNs have been extended to carry out uncertainty quantification [15], learn fast surrogate models [16,17], and carry out equation discovery [18].
However, PINNs suffer from a number of limitations. One is that, compared to traditional methods, their convergence properties are poorly understood, although some work has started to explore this [19][20][21]. Another limitation is that, compared to traditional methods, the computational cost of training PINNs is relatively high, especially when they are only used for forward modeling [9]. Finally, a major limitation of PINNs is that they often struggle to solve problems with high frequency and/or multi-scale solutions [22,23]. Typically, as higher frequencies and multi-scale features are added to the solution, the accuracy of PINNs rapidly reduces and their computational cost increases super-linearly [22].
There are multiple reasons for this behavior. One is the spectral bias of neural networks, which is the well-studied property that neural networks tend to learn high frequencies much slower than low frequencies [24][25][26][27]. Another is that, as higher frequencies and more multi-scale features are added, more collocation points and a larger neural network with significantly more free parameters are typically required to accurately approximate the solution. This creates a significantly more complex optimization problem when training the PINN.
Recently, [22] proposed finite basis physics-informed neural networks (FBPINNs), which aim to improve the performance of PINNs in this regime by using an overlapping domain decomposition (DD) approach. In particular, instead of using a single neural network to approximate the solution to the differential equation, many smaller neural networks were placed in overlapping subdomains and summed together to represent the solution. On the one hand, FBPINNs can be seen as a DD-based network architecture for PINNs. On the other hand, by taking this "divide and conquer" approach, the global PINN optimization problem is transformed into many smaller local optimization problems, which are coupled implicitly due to the overlap of the subdomains and their globally defined loss function. The results in [22] show that this approach significantly improves the accuracy and reduces the training cost of PINNs when solving differential equations with high frequency and multi-scale solutions.
In this work, we significantly extend FBPINNs by incorporating multilevel modeling into their design. In particular, instead of using a single DD in their solution ansatz, we add multiple levels of overlapping DDs. This idea is inspired by classical DDMs, where coarse levels are required for numerical scalability when using large numbers of subdomains. Furthermore, to assess the performance of multilevel FBPINNs, we define strong and weak scaling tests for measuring how the accuracy of PINNs and FBPINNs scales with computational effort and solution complexity, analogous to the strong and weak scaling tests commonly used in classical DDMs.
Given these extensions, the performance of PINNs, (one-level) FBPINNs, and multilevel FBPINNs across a range of high frequency and multi-scale problems is investigated. We also compare multilevel FBPINNs to PINNs with Fourier input features [23,28] and self-adaptive PINNs (SA-PINNs) [29], which have both been shown to improve the accuracy of PINNs when solving multi-scale problems. Across these tests, we find that multilevel FBPINNs significantly outperform PINNs in terms of accuracy and computational cost. Furthermore, as expected in classical DDMs, we show that multilevel FBPINNs improve the accuracy of FBPINNs when using large numbers of subdomains by aiding global communication between subdomains.
The remainder of this work is structured as follows. In Section 1.1 we discuss related work on combining ML, PINNs, and DD, and in Sections 1.2 and 1.3 we give a brief overview of neural networks and PINNs. Then we define FBPINNs and extend them to multilevel FBPINNs in Section 2. Our strong and weak scaling tests and corresponding numerical results on the performance of PINNs, FBPINNs, and multilevel FBPINNs across a range of high frequency and multi-scale problems are discussed in Section 3. Finally, in Section 4, we discuss the implications and limitations of our work and further research directions.

Related work
In general, the idea of combining ML with classical DDMs is not new; for early works on using ML to predict the geometrical location of constraints in adaptive finite element tearing and interconnecting (FETI) and balancing DD by constraints (BDDC) methods, see [30]. An overview of the first attempts at combining DD and ML can be found in [31]; a more recent overview is given in [32].
For specifically combining PINNs with DD, some of the first methods in this area were the deep domain decomposition method (D3M) [33], the deep-learning-based domain decomposition method (DeepDDM) [34,35], and its two-level variant [36], which use PINNs to solve local problems and overlapping Schwarz steps to iteratively connect them based on Lions' parallel Schwarz algorithm [37].At the same time, a series of other extensions, like conservative physics-informed neural network (cPINN) and extended physics-informed neural networks (XPINNs) [38] were proposed, which similarly divide the domain and use PINNs to solve each local problem; here, typically a non-overlapping DD is used.A detailed comparison of these methods to FBPINNs is given in Section 2.1.3.
In [39], partition of unity functions, similar to the window functions used in the FBPINN method, are learned. However, this is done in a pure function approximation setting rather than in the solution of PDE-based problems with PINNs.

Neural networks
We first provide a basic definition of a neural network. For the purpose of this work, we simply consider a neural network to be a mathematical function with some learnable parameters. More precisely, the network is defined as $u(x, \theta) : \mathbb{R}^{d_x} \times \mathbb{R}^{d_\theta} \to \mathbb{R}^{d_u}$, where $x$ are some inputs to the network, $\theta$ are a set of learnable parameters, and $d_x$, $d_\theta$, and $d_u$ are the dimensionality of the network's inputs, parameters, and outputs. In a traditional supervised learning setting, learning typically consists of fitting the network function to some training data containing example inputs and outputs, by minimizing a loss function with respect to $\theta$ which penalizes the difference between the network's outputs and the training data.
The exact form of the network function is determined by the neural network's architecture. In this work, we solely use feedforward fully connected networks (FCNs) [40]. In this case, the network function is given by
\[ u(x, \theta) = f_n \circ \dots \circ f_i \circ \dots \circ f_1(x, \theta), \tag{1} \]
where now $x \in \mathbb{R}^{d_0}$ is the input to the FCN, $u \in \mathbb{R}^{d_n}$ is the output of the FCN, $n$ is the number of layers (depth) of the FCN, and $f_i(x, \theta_i) = \sigma_i(W_i x + b_i)$, where $\theta_i = (W_i, b_i)$, $W_i \in \mathbb{R}^{d_i \times d_{i-1}}$ are known as weight matrices, $b_i \in \mathbb{R}^{d_i}$ are known as bias vectors, $\sigma_i$ are element-wise activation functions commonly chosen as rectified linear unit (ReLU), hyperbolic tangent, or identity functions, and $\theta = (\theta_1, \dots, \theta_i, \dots, \theta_n)$ is the set of learnable parameters of the network. Note that the nonlinearity of the network function stems solely from the nonlinear activation functions $\sigma_i$.
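The layer composition described above can be sketched in a few lines of plain Python. This is a minimal illustration only: the helper names `init_fcn` and `fcn`, the uniform initialisation, and the choice of tanh hidden layers with an identity output layer are assumptions made for the example, not details taken from the text.

```python
import math
import random

def init_fcn(layer_sizes, seed=0):
    """Return one (W, b) pair per layer; W has d_i rows and d_{i-1} columns."""
    rng = random.Random(seed)
    params = []
    for d_in, d_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = 1.0 / math.sqrt(d_in)  # assumed initialisation, for illustration
        W = [[rng.uniform(-scale, scale) for _ in range(d_in)] for _ in range(d_out)]
        b = [0.0] * d_out
        params.append((W, b))
    return params

def fcn(x, params):
    """Evaluate the composition of layers f_i(x) = sigma_i(W_i x + b_i)."""
    h = list(x)
    for i, (W, b) in enumerate(params):
        z = [sum(w * v for w, v in zip(row, h)) + b_k for row, b_k in zip(W, b)]
        is_last = i == len(params) - 1
        # tanh on hidden layers, identity on the output layer (assumed choice)
        h = z if is_last else [math.tanh(v) for v in z]
    return h

params = init_fcn([1, 16, 1])  # 1 input, 16 hidden units, 1 output
n_params = sum(len(W) * len(W[0]) + len(b) for W, b in params)
output = fcn([0.5], params)
print(n_params)  # 16*1 + 16 + 1*16 + 1 = 49 learnable parameters
print(output)
```

Counting the entries of every weight matrix and bias vector gives the parameter dimension of the network, here 49 for a single hidden layer of 16 units.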

Physics-informed neural networks
Physics-informed neural networks (PINNs) [6,7] use neural networks to solve problems related to differential equations. In particular, PINNs focus on solving boundary value problems of the form
\[ \mathcal{N}[u](x) = f(x), \quad x \in \Omega \subset \mathbb{R}^d, \qquad \mathcal{B}_k[u](x) = g_k(x), \quad x \in \Gamma_k \subset \partial\Omega, \tag{2} \]
where $\mathcal{N}[u](x)$ is some differential operator, $u(x)$ is the solution, $f$ and $g_k$ are given forcing and boundary data, and $\mathcal{B}_k(\cdot)$ are a set of boundary conditions (BCs) which ensure uniqueness of the solution. For the sake of simplicity, we consider BCs in a broad sense; we do not explicitly distinguish between initial and boundary conditions, and the $x$ variable can include time. Eq. (2) can describe many different differential equation problems, including linear and non-linear problems, time-dependent and time-independent problems, and those with irregular, higher-order, and cyclic boundary conditions. To solve Eq. (2), PINNs use a neural network to directly approximate the solution, i.e., $u(x, \theta) \approx u(x)$. Note, for simplicity throughout this work, we use the same notation for the true solution and the neural network. It is important to note that PINNs provide a functional approximation to the solution, and not a discretized solution similar to that provided by traditional methods such as finite difference methods, and as such PINNs are a mesh-free approach for solving differential equations. Following the approach proposed by [7], the following loss function is minimized to train the PINN,
\[ \mathcal{L}(\theta) = \frac{\lambda_I}{N_I} \sum_{i=1}^{N_I} \left( \mathcal{N}[u](x_i, \theta) - f(x_i) \right)^2 + \sum_{k} \frac{\lambda_B^k}{N_B^k} \sum_{j=1}^{N_B^k} \left( \mathcal{B}_k[u](x_j^k, \theta) - g_k(x_j^k) \right)^2, \tag{3} \]
where $\{x_i\}_{i=1}^{N_I}$ is a set of collocation points sampled in the interior of the domain, $\{x_j^k\}_{j=1}^{N_B^k}$ is a set of points sampled along each boundary condition, and $\lambda_I$ and $\lambda_B^k$ are well-chosen scalar weights that ensure the terms in the loss function are well balanced. Intuitively, one can see that by minimizing the PDE residual, the method tries to ensure that the solution learned by the network obeys the underlying PDE, and by minimizing the BC residual, the method tries to ensure that the learned solution is unique by matching it to the BCs. Importantly, a sufficient number of collocation and boundary points must be chosen such that the PINN is able to learn a consistent solution across the domain.
Iterative schemes are typically used to optimize this loss function. Usually, variants of the gradient descent (GD) method, such as the Adam optimizer [41], or quasi-Newton methods, such as the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm [42], are employed. These methods require the computation of the gradient of the loss function with respect to the network parameters, which can be computed easily and efficiently using automatic differentiation [43] provided in modern deep learning libraries [44][45][46]. Note that gradients of the network output with respect to its inputs are also typically required to evaluate the PDE residual in the loss function, and can similarly be obtained and further differentiated through to update the network's parameters using automatic differentiation.
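To make the structure of the soft-constrained loss concrete, the following sketch evaluates a PDE residual term plus a BC term for the 1D problem $-u'' = 8$ on $[0, 1]$ with $u(0) = u(1) = 0$. It is only an illustration: the second derivative is approximated with central finite differences as a stand-in for automatic differentiation, candidate solutions are plain Python functions rather than neural networks, and the names `residual_loss`, `boundary_loss`, `pinn_loss`, `lam_i`, and `lam_b` are hypothetical.

```python
def residual_loss(u, f, xs, h=1e-3):
    """Mean squared residual of -u'' = f, with u'' from central differences."""
    total = 0.0
    for x in xs:
        u_xx = (u(x + h) - 2.0 * u(x) + u(x - h)) / h**2
        total += (-u_xx - f(x)) ** 2
    return total / len(xs)

def boundary_loss(u, bcs):
    """Mean squared mismatch against Dirichlet data given as (point, value) pairs."""
    return sum((u(x) - g) ** 2 for x, g in bcs) / len(bcs)

def pinn_loss(u, f, xs, bcs, lam_i=1.0, lam_b=1.0):
    """Weighted sum of the interior (PDE) term and the boundary term."""
    return lam_i * residual_loss(u, f, xs) + lam_b * boundary_loss(u, bcs)

xs = [(i + 0.5) / 50 for i in range(50)]  # interior collocation points
bcs = [(0.0, 0.0), (1.0, 0.0)]            # u(0) = u(1) = 0
f = lambda x: 8.0

exact = lambda x: 4.0 * x * (1.0 - x)     # satisfies the PDE and the BCs
wrong = lambda x: x * (1.0 - x)           # satisfies the BCs but not the PDE

loss_exact = pinn_loss(exact, f, xs, bcs)
loss_wrong = pinn_loss(wrong, f, xs, bcs)
print(loss_exact)  # close to zero
print(loss_wrong)  # close to 36, since the residual is -u'' - f = 2 - 8 = -6
```

A candidate that matches the BCs but not the PDE is penalised only through the interior term, which is why both terms and their weights matter when training.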

Hard constrained PINNs
A downside of training PINNs with the loss function given by Eq. (3) is that the BCs are softly enforced. This means the learned solution may deviate from the BCs because the BC term may not be fully minimized. Furthermore, it can be challenging to balance the different objectives of the PDE and BC terms in the loss function, which can lead to poor convergence and solution accuracy [21,47]. An alternative approach, as originally proposed by [6], is to enforce BCs in a hard fashion by using the neural network as part of a solution ansatz. More precisely, the solution to the differential equation is instead approximated by $\mathcal{C}[u](x, \theta) \approx u(x)$, where $\mathcal{C}$ is an appropriately selected constraining operator which analytically enforces the BCs [22,48].
To give a simple example, suppose we want to enforce $u(x = 0) = 0$ when solving a one-dimensional ordinary differential equation (ODE). The constraining operator and solution ansatz could be chosen as $\mathcal{C}[u](x, \theta) = \tanh(x)\, u(x, \theta) \approx u(x)$. The rationale behind this is that the function $\tanh(x)$ is zero at 0, forcing the BC to always be obeyed, but non-zero away from 0, allowing the network to learn the solution away from the BC. In this approach, the BCs are always satisfied and therefore the BC term in the loss function Eq. (3) can be removed, meaning that the PINN can be trained using the simpler unconstrained loss function,
\[ \mathcal{L}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \left( \mathcal{N}[\mathcal{C}u](x_i, \theta) - f(x_i) \right)^2, \tag{4} \]
where $\{x_i\}_{i=1}^{N}$ is a set of collocation points sampled in the interior of the domain. Note that, in general, there is no unique way of choosing the constraining operator, and the definition of a suitable constraining operator for complex geometries and/or complex BCs may be difficult or sometimes even impossible, i.e., this strategy is problem dependent; in this case, one may resort to the soft enforcement of boundary conditions Eq. (3) instead.
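The tanh-based ansatz from the example above can be checked numerically. In the sketch below, `network` is a hypothetical one-hidden-unit stand-in for the neural network, and the parameter tuples are arbitrary values chosen for illustration; the point is only that the constrained output vanishes at $x = 0$ regardless of the parameters.

```python
import math

def network(x, theta):
    """Stand-in for the network output: a tiny one-hidden-unit model (hypothetical)."""
    w1, b1, w2, b2 = theta
    return w2 * math.tanh(w1 * x + b1) + b2

def constrained(x, theta):
    """Hard-constrained ansatz tanh(x) * network(x, theta), which is zero at x = 0."""
    return math.tanh(x) * network(x, theta)

thetas = [(1.0, 0.0, 1.0, 0.5), (-3.0, 2.0, 0.7, -4.0)]  # arbitrary parameters
values_at_zero = [constrained(0.0, theta) for theta in thetas]
values_at_one = [constrained(1.0, theta) for theta in thetas]
print(values_at_zero)  # zero for every parameter choice
print(values_at_one)   # free to vary away from the boundary
```

Because the boundary value is fixed by construction, an optimizer only has to reduce the PDE residual.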

Methods
In this section, we define FBPINNs (Section 2.1) and extend them to multilevel FBPINNs (Section 2.2). We also discuss the similarities and differences of FBPINNs and multilevel FBPINNs to classical DDMs (Section 2.2.2).

Finite basis physics-informed neural networks
As discussed in Section 1, a major challenge when training PINNs is that, when higher frequencies and multi-scale features are added to the solution, the accuracy of PINNs usually rapidly reduces and their computational cost rapidly increases in a super-linear fashion [22,23].
In the FBPINN approach [22], instead of using a single neural network to represent the solution, many smaller neural networks are confined in overlapping subdomains and summed together to represent the solution. By taking this "divide and conquer" approach, the global PINN optimization problem is transformed into many smaller coupled local optimization problems.
Furthermore, FBPINNs ensure that the inputs to each subdomain network are normalized over their individual subdomain. When solving problems with high frequency solutions, this effectively scales each local problem from a high frequency problem to a lower frequency problem, and helps limit the effect of spectral bias; Fig. 1 explains this effect further.
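The effect of the per-subdomain input normalization can be illustrated with a small calculation. The sketch below assumes a linear map of each subdomain onto $[-1, 1]$ and a hypothetical solution of the form $\sin(2\pi \cdot 32\, x)$ on $[0, 1]$; the function names and the specific subdomain are illustrative choices.

```python
def normalise(x, lo, hi):
    """Map x in the subdomain [lo, hi] linearly onto [-1, 1]."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def periods_seen(freq, lo, hi):
    """Number of periods of sin(2*pi*freq*x) that fall inside [lo, hi]."""
    return freq * (hi - lo)

freq = 32.0  # hypothetical solution frequency on the global domain [0, 1]
global_periods = periods_seen(freq, 0.0, 1.0)
local_periods = periods_seen(freq, 0.25, 0.3125)  # one of 16 equal subdomains
left_edge = normalise(0.25, 0.25, 0.3125)
right_edge = normalise(0.3125, 0.25, 0.3125)
print(global_periods)         # 32.0 periods seen by a single global network
print(local_periods)          # 2.0 periods seen by one subdomain network
print(left_edge, right_edge)  # -1.0 1.0
```

Each subdomain network therefore sees only a few oscillations over its normalized input range, even when the global solution oscillates many times.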

Mathematical definition
We now provide a mathematical definition of FBPINNs. First, the global solution domain $\Omega$ is decomposed into $J$ overlapping subdomains $\{\Omega_j\}_{j=1}^{J}$; cf. Fig. 2. Then, for each subdomain $\Omega_j$, a space of network functions is defined,
\[ \mathcal{V}_j = \left\{ v_j(x, \theta_j) \mid x \in \Omega, \ \theta_j \in \Theta_j \right\}, \]
where $v_j(x, \theta_j)$ is a neural network placed in each subdomain and $\Theta_j = \mathbb{R}^{K_j}$ is the linear space of all possible network parameters.
Here, $K_j$ is the number of local network parameters, which is determined by the network architecture. Next, each subdomain network is confined to its subdomain by multiplying each network with a window function $\omega_j(x)$, where $\mathrm{supp}(\omega_j) \subset \Omega_j$. Note the neural network functions used in $\mathcal{V}_j$ generally can have global support, and the window functions are used to restrict them to their individual subdomains. Furthermore, we impose that the window functions form a partition of unity, i.e., $\sum_{j=1}^{J} \omega_j \equiv 1$ on $\Omega$.
Given the space of network functions and the window functions, we define a global space decomposition given by $\mathcal{V}$ as
\[ \mathcal{V} = \sum_{j=1}^{J} \omega_j \mathcal{V}_j. \]
This space decomposition allows for decomposing any given function $u \in \mathcal{V}$ as follows,
\[ u = \sum_{j=1}^{J} \omega_j v_j, \quad v_j \in \mathcal{V}_j. \tag{5} \]
FBPINNs solve the boundary value problem Eq. (2) by using Eq. (5) to approximate the solution, and we refer to Eq. (5) as the FBPINN solution. From a PINN perspective, the FBPINN solution can simply be thought of as a specific type of neural network architecture for the PINN which sums together many locally-confined networks to generate the output solution. The same scheme for training PINNs is used to train the FBPINN. More specifically, the FBPINN solution Eq. (5) is substituted into the PINN loss function Eq. (3) and the same iterative optimization scheme is used to learn the parameters $\{\theta_j\}_{j=1}^{J}$ of each subdomain network. FBPINNs can also be trained with hard BCs by using the same constraining operator approach described in Section 1.3.1. In particular, substituting the FBPINN solution Eq. (5) into the hard-constrained loss function Eq. (4) yields the loss function
\[ \mathcal{L}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \left( \mathcal{N}\Big[ \mathcal{C} \sum_{j=1}^{J} \omega_j v_j \Big](x_i, \theta) - f(x_i) \right)^2. \tag{6} \]
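A minimal sketch of the windowed sum that defines the FBPINN solution is given below. Several ingredients are assumptions made for illustration: the smooth cosine-based bump in `raw_window`, the uniform 1D layout with five subdomains on $[0, 1]$ and overlap ratio 1.9, and the use of a known function in place of trained subdomain networks. With identical local functions, the partition of unity property implies that the windowed sum reproduces that function exactly, which the sketch checks.

```python
import math

def raw_window(x, mu, sigma):
    """Assumed bump: (1 + cos(pi (x - mu) / sigma))^2 inside the subdomain, 0 outside."""
    if abs(x - mu) >= sigma:
        return 0.0
    return (1.0 + math.cos(math.pi * (x - mu) / sigma)) ** 2

def window_values(x, centres, sigma):
    """Normalise the raw bumps so that the windows sum to 1 at x."""
    raw = [raw_window(x, mu, sigma) for mu in centres]
    total = sum(raw)
    return [r / total for r in raw]

def fbpinn_solution(x, centres, sigma, local_fns):
    """Sum of windowed local functions; zero-weight subdomains are skipped."""
    w = window_values(x, centres, sigma)
    return sum(wj * vj(x) for wj, vj in zip(w, local_fns) if wj > 0.0)

J, delta = 5, 1.9                         # illustrative layout on [0, 1]
centres = [j / (J - 1) for j in range(J)]
sigma = 0.5 * delta / (J - 1)             # half-width of each subdomain
target = lambda x: math.sin(6.0 * x)
local_fns = [target] * J                  # stand-ins for trained subdomain networks

xs = [i / 200 for i in range(201)]
max_pou_error = max(abs(sum(window_values(x, centres, sigma)) - 1.0) for x in xs)
max_sum_error = max(abs(fbpinn_solution(x, centres, sigma, local_fns) - target(x)) for x in xs)
print(max_pou_error, max_sum_error)       # both at the level of rounding error
```

In a trained FBPINN the local functions differ per subdomain, and the windows blend them smoothly across the overlaps.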

Computational efficiency of FBPINNs versus PINNs
Assuming the same size network in each subdomain, naively computing the FBPINN solution Eq. (5) has a time complexity of $O(N J S)$, where $N$ is the number of collocation points, $J$ is the number of subdomains, and $S$ is the cost of computing the output of a single subdomain network for a single collocation point. This becomes very expensive as more subdomains are added ($J$ increases). However, because the output of each subdomain network is zero outside of its subdomain after applying the window function, only collocation points within the subdomain need to be included in the summations in Eq. (6). This reduces the computational cost to $O(N C S)$, where $C$ is the average number of subdomains a collocation point belongs to. PINNs have a computational cost of $O(N P)$, where $P$ is the cost of computing the output of the single global PINN network. Importantly, as the problem complexity increases, the size of the PINN network must typically be increased (increasing $P$), whilst for FBPINNs, we can typically keep the subdomain network size fixed and increase $J$ instead; thus, often $C S \ll P$ and FBPINNs are often orders of magnitude more efficient than PINNs. More details on our efficient software implementation are provided in Appendix A.
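The saving from evaluating each subdomain network only on its own collocation points can be made concrete by counting network evaluations. The sketch below assumes a uniform 1D decomposition of $[0, 1]$ with overlap ratio 1.9 and uniformly spaced collocation points; the function names are hypothetical and the counts ignore the per-evaluation cost of the networks.

```python
def subdomains(J, delta):
    """Uniform overlapping intervals on [0, 1] with centres j/(J-1), j = 0..J-1."""
    half = 0.5 * delta / (J - 1)
    return [(j / (J - 1) - half, j / (J - 1) + half) for j in range(J)]

def evaluation_counts(points, doms):
    """Network evaluations when every network sees every point versus only its own points."""
    naive = len(points) * len(doms)
    local = sum(1 for x in points for lo, hi in doms if lo < x < hi)
    return naive, local

n_points = 10000
points = [(i + 0.5) / n_points for i in range(n_points)]
results = {}
for J in (5, 20, 80):
    naive, local = evaluation_counts(points, subdomains(J, delta=1.9))
    results[J] = (naive, local, local / n_points)
    print(J, naive, local, local / n_points)
```

In this layout the average number of subdomains per point stays close to the overlap ratio as the number of subdomains grows, whereas the naive count grows linearly with it.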

FBPINNs versus other methods for combining PINNs with DD
Multiple other approaches exist which combine PINNs with DD; a recent overview can be found in [32]. Similar to FBPINNs, XPINNs [38] divide the domain into subdomains and use separate neural networks to solve each subdomain problem. However, XPINNs use a non-overlapping DD. A downside of this approach is that the global solution contains discontinuities at the subdomain interfaces and additional loss terms are required to enforce coupling between subdomain networks. In contrast, FBPINNs do not require additional loss terms and their solution is continuous across subdomain interfaces. XPINNs and FBPINNs are comparable in terms of their computational cost of training, and both are able to use irregular DDs with different types of subdomain neural networks. The works [49,50] propose a similar approach to XPINNs, but they use extreme learning machines [51] as subdomain networks, where only the parameters of the last layer of the network are learnt. This approach has the advantage of being much faster to train, but the capacity of the subdomain networks is limited, and the performance strongly depends on the initialization.
Other approaches attempt to learn suitable DDs for PINNs. For example, Gated-PINNs [52] use several neural networks, called experts, to propose a solution to a PDE, whilst a gating network is used to weight-average the expert solutions given a coordinate in the domain. Augmented-PINNs (APINNs) [53] build upon this strategy by introducing parameter sharing between experts to capture solution similarities between subdomains. These approaches are more flexible than FBPINNs in that they can adaptively learn DDs, but they are also significantly more computationally expensive to train as they require all experts to be evaluated at each input coordinate and therefore do not scale well with problem size; cf. Section 2.1.2.

Multilevel FBPINNs
In this work we propose multilevel FBPINNs, which extend FBPINNs by adding multiple levels of DDs to their solution ansatz. They are inspired by classical multilevel DD methods, where coarse levels are generally required for numerical scalability when using large numbers of subdomains, and multilevel approaches may significantly improve performance; see, for instance, [54,55]. Our hypothesis is that adding multilevel modeling to FBPINNs similarly improves their performance. The generalization of FBPINNs to two levels was briefly discussed in [56] and we fully introduce the concept here.
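A multilevel FBPINN is defined as follows. First, we define $L$ levels of DDs, where each level, $l$, defines an overlapping DD of $\Omega$ with $J^{(l)}$ subdomains, i.e.,
\[ \left\{ \Omega_j^{(l)} \right\}_{j=1}^{J^{(l)}}, \quad l = 1, \dots, L. \]
Next, we define spaces of network functions $\mathcal{V}_j^{(l)}$ for each level, as well as a partition of unity for each level using window functions, $\omega_j^{(l)}$, with
\[ \mathrm{supp}\big(\omega_j^{(l)}\big) \subset \Omega_j^{(l)} \quad \text{and} \quad \sum_{j=1}^{J^{(l)}} \omega_j^{(l)} \equiv 1 \ \text{on} \ \Omega \quad \forall\, l. \]
We can then define a global space decomposition,
\[ \mathcal{V} = \frac{1}{L} \sum_{l=1}^{L} \sum_{j=1}^{J^{(l)}} \omega_j^{(l)} \mathcal{V}_j^{(l)}, \]
and use this space decomposition to decompose any given function $u \in \mathcal{V}$ as follows,
\[ u = \frac{1}{L} \sum_{l=1}^{L} \sum_{j=1}^{J^{(l)}} \omega_j^{(l)} v_j^{(l)}, \quad v_j^{(l)} \in \mathcal{V}_j^{(l)}. \tag{7} \]
We refer to Eq. (7) as the multilevel FBPINN solution. Note, the original FBPINN solution described in Section 2.1 can be obtained by simply setting $L = 1$; we refer to these as one-level FBPINNs going forward.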
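Analogous to FBPINNs, we can train multilevel FBPINNs by using the same training scheme as PINNs and inserting Eq. (7) into the PINN loss function. When using the hard-constrained PINN loss function Eq. (4), this yields the corresponding multilevel FBPINN loss function
\[ \mathcal{L}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \left( \mathcal{N}[\mathcal{C}u](x_i, \theta) - f(x_i) \right)^2, \tag{8} \]
where $u$ denotes the multilevel FBPINN solution Eq. (7) and $\theta$ collects the parameters of all subdomain networks on all levels.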

Example of a multilevel FBPINN
We now show a simple example of a multilevel FBPINN to aid understanding. In particular, we use a multilevel FBPINN to solve the Laplacian boundary value problem,
\[ -\Delta u = f \ \ \text{in} \ \Omega = [0, 1]^d, \qquad u = 0 \ \ \text{on} \ \partial\Omega. \]
First we consider the 1D case ($d = 1$), and set $f = 8$. Then the exact solution is given by $u(x) = 4x(1 - x)$. We create an $L = 3$ level FBPINN to solve this problem, with $J^{(1)} = 1$, $J^{(2)} = 2$, and $J^{(3)} = 4$. Each level uses a uniform DD given by
\[ \Omega_j^{(l)} = \left[ \frac{(j - 1) - \delta/2}{J^{(l)} - 1}, \ \frac{(j - 1) + \delta/2}{J^{(l)} - 1} \right], \tag{9} \]
where $\delta$ is defined as the overlap ratio and is fixed at a value of $\delta = 1.9$. Note that an overlap ratio of less than 1 means that the subdomains are no longer overlapping. The subdomain window functions form a partition of unity for each level and are constructed from $\mu_j^{(l)} = (j - 1)/(J^{(l)} - 1)$ and $\sigma_j^{(l)} = (\delta/2)/(J^{(l)} - 1)$, which represent the center and half-width of each subdomain, respectively. The window functions for each level are plotted in Fig. 3(a). An FCN, Eq. (1), with 1 hidden layer, 16 hidden units, and tanh activation functions is placed in each subdomain, and the $x$ inputs to each subdomain network are normalized to the range $[-1, 1]$ over their individual subdomains.
Note that multilevel FBPINNs are not restricted to the particular choice of DD, level structure, window function, partition of unity, and subdomain network architecture used above. This choice may not be optimal, and the optimal choice is clearly problem-dependent. For example, it may be beneficial to use an irregular DD, level structure, and varying subdomain network sizes for problems where the solution has varying complexity in different parts of the domain.
The resulting multilevel FBPINN solution is shown in Fig. 3(c), and the individual subdomain network solutions (with the constraining operator and window function applied) are shown in Fig. 3(b). In this case, we find the FBPINN closely matches the exact solution.
Next we consider the 2D case ($d = 2$), with a source term $f$ chosen such that the exact solution is known in closed form. In this case we create an $L = 3$ level FBPINN to solve this problem, using a uniform rectangular DD for each level with $J^{(1)} = 1 \times 1 = 1$, $J^{(2)} = 2 \times 2 = 4$, and $J^{(3)} = 4 \times 4 = 16$, as shown in Fig. 3(d) and (e). The size of each subdomain along each dimension is defined similarly to Eq. (9) using, again, an overlap ratio of $\delta = 1.9$. The subdomain window functions are again constructed from the center and half-width of each subdomain along each dimension. An FCN, Eq. (1), with 1 hidden layer, 16 hidden units, and tanh activation functions is placed in each subdomain, and the $x$ inputs to each subdomain network are normalized to the range $[-1, 1]$ along each dimension over their individual subdomains.
Similar to above, the multilevel FBPINN is trained using the hard-constrained loss function Eq. (8), using a constraining operator whose scalar parameter is set to 0.2. The loss function is minimized using the Adam optimizer with a learning rate of $1 \times 10^{-3}$ and $N = 80 \times 80 = 6{,}400$ uniformly-spaced collocation points across the domain.
The resulting multilevel FBPINN solution is shown in Fig. 3(f). Similar to the 1D case, we find the multilevel FBPINN solution closely matches the exact solution.
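The three-level 1D construction can be sketched as follows. The subdomain centers and half-widths follow the definitions in the text; everything else is an assumption for illustration: the cosine-based window shape, treating the single subdomain on the coarsest level as having a window identically equal to 1 (the center formula is undefined for one subdomain), averaging the levels so that all windows together sum to 1, and using the exact solution $4x(1 - x)$ in place of trained subdomain networks.

```python
import math

def raw_window(x, mu, sigma):
    """Assumed bump: (1 + cos(pi (x - mu) / sigma))^2 inside the subdomain, 0 outside."""
    if abs(x - mu) >= sigma:
        return 0.0
    return (1.0 + math.cos(math.pi * (x - mu) / sigma)) ** 2

def level_windows(x, J, delta):
    """Partition of unity windows for one level with J uniform subdomains on [0, 1]."""
    if J == 1:
        return [1.0]  # assumed: a single subdomain covers the whole domain
    sigma = 0.5 * delta / (J - 1)
    raw = [raw_window(x, j / (J - 1), sigma) for j in range(J)]
    total = sum(raw)
    return [r / total for r in raw]

def multilevel_solution(x, level_sizes, local_fns, delta=1.9):
    """Average over levels of the windowed sums of local functions (assumed 1/L weighting)."""
    out = 0.0
    for J, fns in zip(level_sizes, local_fns):
        w = level_windows(x, J, delta)
        out += sum(wj * v(x) for wj, v in zip(w, fns))
    return out / len(level_sizes)

exact = lambda x: 4.0 * x * (1.0 - x)
level_sizes = [1, 2, 4]                          # J per level
local_fns = [[exact] * J for J in level_sizes]   # stand-ins for trained networks

xs = [i / 100 for i in range(101)]
max_error = max(abs(multilevel_solution(x, level_sizes, local_fns) - exact(x)) for x in xs)
n_networks = sum(level_sizes)
print(n_networks, max_error)  # 7 subdomain networks; error at rounding level
```

In an actual multilevel FBPINN the networks on different levels are free to represent different parts of the solution, for example coarse levels capturing low-frequency content and fine levels capturing local detail.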

Multilevel FBPINNs versus classical multilevel DDMs
Whilst multilevel FBPINNs are inspired by classical multilevel DDMs, a number of differences and similarities exist between these approaches. We believe it is insightful to briefly discuss these below.
Most classical DDMs can be described in terms of the abstract Schwarz framework [54,57]. Similar to FBPINNs, this framework is based on a decomposition of a global function space $V$ into local spaces $\{V_j\}_{j=1}^{J}$ defined on overlapping subdomains $\Omega_j$, where
\[ V = \sum_{j=1}^{J} R_j^\top V_j. \tag{11} \]
Here, $R_j^\top : V_j \to V$ is an interpolation (or prolongation) operator from the local into the global space. These notions can be defined in a similar fashion at the continuous and discrete level. For the sake of simplicity, we suppose here a variational discretization of the PDE to be solved. The space decomposition Eq. (11) allows for decomposing any given discrete function $u \in V$ as
\[ u = \sum_{j=1}^{J} R_j^\top v_j, \quad v_j \in V_j; \tag{12} \]
due to the overlap, this decomposition is generally not unique. Schwarz DDMs are then based on solving local overlapping problems corresponding to the local spaces $\{V_j\}_{j=1}^{J}$ and merging them via the prolongation operators $R_j^\top$. Classical one-level Schwarz methods based on this framework are typically not scalable to large numbers of subdomains. In particular, since information is only transported via the overlap, their rate of convergence will deteriorate when increasing the number of subdomains [54]. In order to fix this, multilevel methods add coarser problems to the Schwarz framework to facilitate the global transfer of information; in particular, the coarsest level typically corresponds to a global problem.
We note that:
• In classical Schwarz methods, the global discretization space $V$ is often fixed first, and then the local spaces $\{V_j\}_{j=1}^{J}$ are constructed. In FBPINNs, we do the opposite; we define a local space of neural network functions on an overlapping DD $\{\Omega_j\}_{j=1}^{J}$ and construct the global discretization space from them.
• In classical Schwarz methods, the local functions $v_j \in V_j$ are generally not defined on the global domain $\Omega$ outside the overlapping subdomain $\Omega_j$; the prolongation operators $R_j^\top$ extend the local functions to $\Omega$ such that $\mathrm{supp}(R_j^\top v_j) \subset \Omega_j$, $\forall\, v_j \in V_j$. On the other hand, in FBPINNs, the local neural network functions $v_j$ generally have global support, and the window functions $\omega_j$ are used to confine them to their subdomains. This difference stems from the fact that the local neural networks are not based on a spatial discretization but a function approximation; cf. Section 1.3. Nonetheless, both the prolongation operators and the window functions ensure locality; cf. Eqs. (5) and (12). Note that the prolongation operators in the restricted additive Schwarz (RAS) method [58] also include a partition of unity, such that they are very close to the window functions in FBPINNs.
• A key difference is how the boundary value problem is solved. Whereas in DDMs, local subdomain problems are explicitly defined and solved in a global iteration, in FBPINNs, the global loss function is minimized. Moreover, classical DDMs can exploit properties of the system to be solved. For instance, if the PDE is linear elliptic, convergence guarantees for classical DDMs can be derived; cf. [54,55]. In FBPINNs, we always have to solve a non-convex optimization problem, Eq. (3) or Eq. (4), which makes the derivation of convergence bounds difficult. Note that there are also nonlinear overlapping DDMs, for instance, additive Schwarz preconditioned inexact Newton (ASPIN) [59] and restricted additive Schwarz preconditioned exact Newton (RASPEN) methods [60].
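The deterioration of one-level Schwarz methods with the number of subdomains can be observed in a small experiment. The sketch below is an illustration under stated assumptions and is not the method used for FBPINNs: it applies an alternating (multiplicative) Schwarz iteration with exact local solves to a finite difference discretization of $-u'' = 8$ on $[0, 1]$ with homogeneous Dirichlet BCs, using a fixed overlap of two grid nodes per side. The grid size, overlap, tolerance, and function names are all hypothetical choices.

```python
def solve_local(u, lo, hi, f, h):
    """Solve -u'' = f on nodes lo+1..hi-1 with Dirichlet data u[lo], u[hi] (Thomas algorithm)."""
    n = hi - lo - 1
    rhs = [h * h * f] * n
    rhs[0] += u[lo]
    rhs[-1] += u[hi]
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0], d[0] = -0.5, rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    u[hi - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        u[lo + 1 + i] = d[i] - c[i] * u[lo + 2 + i]

def schwarz_sweeps(J, M=64, ov=2, f=8.0, tol=1e-8, max_sweeps=100000):
    """Number of sweeps over J overlapping subdomains until the error drops below tol."""
    h = 1.0 / M
    exact = [4.0 * (i * h) * (1.0 - i * h) for i in range(M + 1)]
    u = [0.0] * (M + 1)  # zero initial guess, which also satisfies the BCs
    for sweep in range(1, max_sweeps + 1):
        for j in range(J):
            lo = max(0, j * M // J - ov)
            hi = min(M, (j + 1) * M // J + ov)
            solve_local(u, lo, hi, f, h)
        if max(abs(a - b) for a, b in zip(u, exact)) < tol:
            return sweep
    return max_sweeps

sweeps_2 = schwarz_sweeps(2)
sweeps_8 = schwarz_sweeps(8)
print(sweeps_2, sweeps_8)  # more subdomains require more sweeps
```

Because each sweep only passes information between neighbouring subdomains through the overlap, the sweep count grows as subdomains are added; this is the behaviour that coarse levels are designed to counteract.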

Numerical results
In this section we assess the performance of multilevel FBPINNs. In particular, we investigate the accuracy and computational cost of using multilevel FBPINNs to solve various differential equations, and compare them to PINNs, PINNs with Fourier input features, SA-PINNs, and one-level FBPINNs.
First, in Section 3.1, we introduce the problems studied. Then, in Section 3.2 we introduce a notion of strong and weak scaling, inspired by classical DDMs, for assessing how the accuracy of FBPINNs and PINNs scales with computational effort and solution complexity. In Section 3.3, we list the common implementation details used across all experiments. Finally, in Section 3.4 we present our numerical results.

Problems studied
The following problems are used to assess the performance of multilevel FBPINNs:

Homogeneous Laplacian problem in two dimensions
First, we consider the 2D homogeneous Laplacian problem already presented above, namely
\[ -\Delta u = f \ \ \text{in} \ \Omega = [0, 1]^2, \qquad u = 0 \ \ \text{on} \ \partial\Omega, \tag{13} \]
with the same source term and exact solution as in the 2D example above. This problem is used to carry out simple ablation tests of the multilevel FBPINN. In particular, we assess how varying the number of levels and subdomains as well as the overlap ratio and size of the subdomain networks (architecture) affects the multilevel FBPINN performance.

Multi-scale Laplacian problem in two dimensions
Next, we consider a multi-scale variant of the Laplacian problem Eq. (13) above by using a source term whose exact solution is known and consists of several frequency components. In this case, multi-scale frequencies are contained in the solution, and two parameters of the problem allow us to control the number of components and the frequency of each component. We use this problem to assess how the performance of the multilevel FBPINN scales when more multi-scale components are added to the solution.

Helmholtz problem in two dimensions
Lastly, we study the 2D Helmholtz problem with a constant (scalar) wave number, $k$. Here, homogeneous Dirichlet boundary conditions and a Gaussian point source with a scalar width, $\sigma$, placed in the center of the domain are used. Note that, for this problem, the exact solution is not known, and instead, we compare our models to the solution obtained from FD modeling, as described in Appendix B. In this case, the solution contains complex patterns of standing waves where the dominant frequency of the solution depends on the wave number, $k$. We use this problem to test the multilevel FBPINN on a more realistic problem. We first carry out some simple ablation tests by assessing how varying the number of levels and subdomains as well as the overlap ratio and size of the subdomain networks affects the multilevel FBPINN performance. Then, we assess how the performance of the multilevel FBPINN scales when the value of $k$ is increased.

Definition of strong and weak scaling
For both the multi-scale Laplacian and Helmholtz problems, we carry out strong and weak scaling tests. These assess how the accuracy of the multilevel FBPINN scales with computational effort and solution complexity and are inspired by the strong and weak scaling tests commonly used in classical DD. They are defined in the following way:
• Strong scaling: We fix the complexity of the problem and increase the model capacity. For optimal scaling, we expect the convergence rate and/or accuracy to improve at the same rate as the increase of model capacity.
• Weak scaling: We increase the complexity of the problem and the model capacity at the same rate. For optimal scaling, we expect the convergence rate and/or accuracy to stay approximately constant.
For all our tests, increasing the model capacity means increasing the number of levels, number of subdomains, and/or the size of the subdomain networks. The exact factors varied and their rates of increase are detailed in the relevant results sections below. Note that all of the multilevel FBPINNs tested have been trained on a single GPU, and hence we only show strong and weak scaling tests with respect to model capacity and not hardware parallelization.

Common implementation details
Many of the implementation details of the multilevel FBPINNs, one-level FBPINNs, and PINNs tested are the same across all tests. These details are presented here; some are only changed for ablation studies, in which case they are described in the relevant results section below.

Fig. 4. Hierarchy of levels used in the multilevel FBPINN. For all the multilevel FBPINNs tested we use an exponential level structure. This means that the number of subdomains in each level is given by $2^{d(l-1)}$, where $l$ is the level number and $d$ is the dimensionality of the domain. Our hypothesis is that this helps the multilevel FBPINN model solutions with frequency components that span multiple orders of magnitude.
Level structure. Firstly, all multilevel FBPINNs use an exponentially increasing number of subdomains per level. In particular, we choose $J^{(l)} = 2^{d(l-1)}$ for $l = 1, \dots, L$. This level structure is shown in Fig. 4. This constraint is chosen so that the multilevel FBPINN is able to contain an exponentially large number of subdomains with a relatively small number of levels; our hypothesis is that this helps the multilevel FBPINN model solutions with frequency components that span multiple orders of magnitude.
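As a rough illustration of this exponential level structure, the following sketch enumerates the number of subdomains per dimension and per level. The function name `level_structure` and its arguments are hypothetical and are not taken from the authors' code.

```python
def level_structure(num_levels, dim):
    """Return (subdomains per dimension, total subdomains) for each level."""
    levels = []
    for level in range(1, num_levels + 1):
        per_dim = 2 ** (level - 1)   # 2^(l-1) subdomains along each dimension
        total = per_dim ** dim       # equals 2^(d(l-1)) subdomains in the level
        levels.append((per_dim, total))
    return levels


# Three levels in 2D: 1 x 1, 2 x 2 and 4 x 4 subdomains.
structure = level_structure(num_levels=3, dim=2)
for per_dim, total in structure:
    print(per_dim, total)
```

With seven levels in 2D, the finest level alone already holds 64 × 64 subdomains, which is why only a few levels are needed to cover several orders of magnitude in length scale.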

Domain decomposition. All FBPINNs tested use a uniform rectangular DD for each level, with all multilevel FBPINNs having $2^{l-1}$ subdomains along each dimension at level $l$. The size of each subdomain along each dimension is defined similarly to Eq. (9), i.e., all 2D DDs look similar to those shown in Fig. 3(d) and (e). Furthermore, all FBPINNs use the same subdomain window functions, given by Eq. (10).

Loss function and optimization. All FBPINNs and PINNs tested use the hard-constrained variants of their loss functions. For fairness, the same constraining operator is used across all models tested for a given problem. Furthermore, the same collocation points are used for training whenever multiple models are compared on a given problem. This is similarly the case for all testing points used after training. The SA-PINNs tested use learnable weights for each collocation point in their loss function; this is described in more detail in Appendix D. All tests use the Adam optimizer with a learning rate of $1 \times 10^{-3}$, except for the PINNs with Fourier input features, which are trained using a learning rate of $1 \times 10^{-4}$ because their convergence was found to be unstable when using a larger learning rate. For robustness, all models are trained 10 times using different random starting seeds, and all results are reported as averages over these different seeds. All models are evaluated using the normalized L1 test loss, given by $\mathcal{L}(\theta) = \frac{1}{M} \sum_{i=1}^{M} \| \hat{u}(x_i, \theta) - u(x_i) \| / \sigma$, where $M$ is the number of test points and $\sigma$ is the standard deviation of the set of true solutions $\{u(x_i)\}_{i=1}^{M}$.
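The normalized L1 test loss can be sketched in a few lines. This is a minimal illustration for scalar-valued solutions, not the authors' implementation; the function name is hypothetical, and the use of the population standard deviation (dividing by the number of test points) is an assumption.

```python
import math


def normalized_l1_loss(predictions, targets):
    """Mean absolute error divided by the standard deviation of the targets."""
    m = len(targets)
    mean = sum(targets) / m
    # Assumption: population standard deviation of the true solution values.
    sigma = math.sqrt(sum((u - mean) ** 2 for u in targets) / m)
    abs_error = sum(abs(p - u) for p, u in zip(predictions, targets))
    return abs_error / (m * sigma)


# Toy example: two test points with true values 0 and 2 (standard deviation 1).
loss = normalized_l1_loss(predictions=[0.5, 1.5], targets=[0.0, 2.0])
print(loss)
```

Dividing by the standard deviation makes losses comparable across problems whose solutions have different amplitudes.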
Software and hardware implementation. All FBPINNs and PINNs tested are implemented using a common training framework written in JAX [46]. Further details on our software implementation are given in Appendix A. All models are trained on a single NVIDIA RTX 3090 GPU.

Results
Here, we will discuss the results for the model problems described in Section 3.1.

Homogeneous Laplacian problem in two dimensions
First, we carry out simple ablation tests of the multilevel FBPINN using the 2D homogeneous Laplacian problem described in Section 3.4.1.
To carry out our ablation tests, we first train a baseline multilevel FBPINN to solve this problem, using $L = 3$ levels, an overlap ratio along each dimension of $\delta = 1.9$, and FCNs with 1 hidden layer and 16 hidden units for each subdomain network. The multilevel FBPINN is trained using a constraining operator that enforces the boundary condition as a hard constraint. Given this baseline model, we then vary different hyperparameters over a range of values and measure the change in performance. This is carried out for the number of levels ranging from $L = 2$ to 5, the overlap ratio ranging from 1.1 to 2.7, and the number of hidden units in the subdomain network ranging from 2 to 32. Our results are shown in Fig. 5. We observe that the accuracy of the multilevel FBPINN does not depend significantly on the number of levels, likely because in this case the solution is very simple. However, its accuracy increases as the overlap ratio increases, likely because there is more communication between the subdomain networks, which is similar to what is expected in classical DDMs. Furthermore, its accuracy increases as the number of free parameters of the subdomain networks increases. This is expected as the capacity of the model increases. Thus, the multilevel FBPINN has similar characteristics to classical DDMs for this problem.
We carry out two other benchmark tests. First, we train a PINN with 3 hidden layers and 64 hidden units, and second, we train four one-level FBPINNs with $J^{(1)} = 2$, 4, 8, and 16 subdomains along each dimension, respectively. All other relevant hyperparameters are kept the same as the baseline model. These results are also shown in Fig. 5. In these tests, the PINN is able to solve the problem, although its final accuracy is lower than the baseline multilevel FBPINN and its convergence curve is more unstable. Furthermore, the accuracy of the one-level FBPINN reduces as more subdomains are added. This is analogous to the expected behavior of one-level classical DDMs, which are not scalable to large numbers of subdomains, and shows that coarse levels are required for scalability. It is therefore likely that the additional levels in FBPINNs serve the same purpose as in classical DDMs, i.e., they allow direct transfer of global information.

Multi-scale Laplacian problem in two dimensions
Next, we evaluate the strong and weak scalability of the multilevel FBPINN using the multi-scale Laplacian problem described in Section 3.4.2.

Strong scaling test. First, we carry out a strong scaling test. Here, the problem complexity is fixed and we assess how the performance of the multilevel FBPINN changes as the capacity of the model is increased. In particular, we fix the problem complexity by choosing $n = 6$ with $\omega_i = 2^i$ for $i = 1, \dots, n$ in Eq. (14). Thus, the solution contains 6 multi-scale components with exponentially increasing frequencies. This represents a much more challenging problem than the homogeneous problem studied above. The exact solution in this case is shown in Fig. 6.
We then increase the capacity of the multilevel FBPINN by increasing the number of levels, testing from $L = 2$ to 7. The rest of the hyperparameters of the multilevel FBPINN are kept fixed across all tests. Namely, we use $N = 320 \times 320 = 102{,}400$ uniformly-spaced collocation points throughout the domain, an overlap ratio of $\delta = 1.9$ and FCNs with 1 hidden layer and 16 hidden units for each subdomain network. All models are trained using the constraining operator $[\mathcal{C}\hat{u}](x, \theta) = \tanh(x_1/\sigma) \tanh((1 - x_1)/\sigma) \tanh(x_2/\sigma) \tanh((1 - x_2)/\sigma)\, \hat{u}(x, \theta)$ with $\sigma = 1/\omega_n$. $M = 350 \times 350$ uniformly-spaced test points are used to test all models.
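To make the role of the constraining operator concrete, the sketch below evaluates the tanh envelope on the unit square. It is only an illustration: the function name `constrain` is hypothetical, `u_hat` stands in for the raw network output, and the value of `sigma` assumes the width is the reciprocal of the highest frequency, $2^6 = 64$.

```python
import math


def constrain(x1, x2, u_hat, sigma):
    """Multiply the raw output u_hat by a tanh envelope that vanishes on the boundary."""
    envelope = (
        math.tanh(x1 / sigma)
        * math.tanh((1.0 - x1) / sigma)
        * math.tanh(x2 / sigma)
        * math.tanh((1.0 - x2) / sigma)
    )
    return envelope * u_hat


sigma = 1.0 / 2 ** 6  # assumed: reciprocal of the highest frequency, 64

on_boundary = constrain(0.0, 0.3, u_hat=5.0, sigma=sigma)   # zero on the boundary
in_interior = constrain(0.5, 0.5, u_hat=5.0, sigma=sigma)   # envelope is ~1 here
print(on_boundary, in_interior)
```

Because the envelope is exactly zero on the boundary and close to one a few widths away from it, the homogeneous Dirichlet condition holds by construction and no boundary loss term is needed.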
The results of this study are shown in Fig. 6. We find that the accuracy of the multilevel FBPINN increases as the number of levels increases, where the $L = 2$, 3, and 4 models are unable to accurately model the solution, whilst the $L = 5$, 6 and 7 models are able to accurately model all of the frequency components. The test shows that the multilevel FBPINN is able to solve a high frequency, multi-scale problem, and exhibits strong scaling behavior somewhat analogous to what is expected of classical DDMs. However, we note that the accuracy of the 7-level FBPINN is worse than that of the 6-level FBPINN. We believe this may be because at the finest level of the 7-level FBPINN, each subdomain network only contains approximately 10 × 10 collocation points. More collocation points may allow this level to converge more accurately.
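The estimate of roughly 10 × 10 collocation points per finest-level subdomain can be checked with a short calculation. This is a back-of-the-envelope sketch that assumes each subdomain's width is the overlap ratio times the spacing between subdomain centres; the exact subdomain definition is the one given in Eq. (9).

```python
points_per_dim = 320     # collocation points along each dimension
finest_level = 7         # finest level of the 7-level FBPINN
overlap_ratio = 1.9

subdomains_per_dim = 2 ** (finest_level - 1)              # 64 subdomains per dimension
spacing_in_points = points_per_dim / subdomains_per_dim   # 5 points between centres

# Assumption: subdomain width = overlap ratio x centre spacing.
points_per_subdomain_dim = overlap_ratio * spacing_in_points
print(subdomains_per_dim, points_per_subdomain_dim)
```

Under this assumption each finest-level network sees about 9.5 points along each dimension, consistent with the figure of approximately 10 × 10 quoted above.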
Five other benchmark tests are carried out for this problem. First, we train a PINN with 5 hidden layers and 256 hidden units, a PINN with 256 Fourier input features (with the Fourier feature hyperparameter of Appendix C set to 5), 5 hidden layers and 256 hidden units, and a SA-PINN with 5 hidden layers and 256 hidden units. Then, we train a one-level FBPINN with $J^{(1)} = 64$ subdomains along each dimension and a three-level FBPINN with $J^{(1)} = 1$, $J^{(2)} = 8$, and $J^{(3)} = 64$ subdomains along each dimension, respectively. All other relevant hyperparameters are kept the same as the baseline model above. These results are also shown in Fig. 6. We find that the accuracy of the standard PINN is poor, and it is only able to model some of the cycles in the solution. Furthermore, its convergence curve is very unstable, and its training time is an order of magnitude larger than that of the $L = 7$ level FBPINN tested. Its poor convergence is likely due to spectral bias and the increasing complexity of the PINN's optimization problem, as discussed in Section 2.1 and [22]. For this problem, adding Fourier features to the PINN significantly improves its accuracy, although its training time remains high and it converges more slowly than the multilevel FBPINNs tested. The SA-PINN does not offer any improvement over the standard PINN. The one-level FBPINN is able to model the solution, although its accuracy is lower than that of the $L = 7$ level FBPINN. Finally, the three-level FBPINN benchmark performs similarly to the $L = 7$ level FBPINN. This suggests that multilevel FBPINNs with stronger coarsening ratios can be used, and that multilevel FBPINNs are not strongly dependent on their coarsening ratio.
Weak scaling test. Next, we carry out a weak scaling test. Here, the problem complexity, number of collocation points and model capacity are scaled at the same rate, and we assess how the performance of the multilevel FBPINN changes. We increase the model capacity in the same way as the strong scaling test above, i.e., the number of levels is increased from $L = 2$ to 7. However, now the problem complexity is also scaled, such that for each test $n = L - 1$ and $\omega_i = 2^i$ for $i = 1, \dots, n$. Furthermore, each test has $(5 \times 2^{L-1}) \times (5 \times 2^{L-1})$ uniformly-spaced collocation points. Note that the number of subdomains, number of collocation points, and the frequency range of the solution all grow exponentially, and the multilevel FBPINN is in alignment with the problem structure.
All other hyperparameters are fixed to the same values as the strong scaling test above.
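The weak scaling schedule can be tabulated with a short script. This sketch assumes the reading that each test with a given number of levels uses one fewer frequency component than levels, frequencies that double from 2 upwards, and 5 collocation points per finest-level subdomain spacing; the function and key names are hypothetical.

```python
def weak_scaling_configs(min_levels=2, max_levels=7):
    """List the problem and model settings for each weak scaling test."""
    configs = []
    for num_levels in range(min_levels, max_levels + 1):
        num_components = num_levels - 1
        configs.append({
            "levels": num_levels,
            # Frequencies 2, 4, ..., 2^(levels - 1).
            "frequencies": [2 ** i for i in range(1, num_components + 1)],
            # 5 * 2^(levels - 1) collocation points along each dimension.
            "points_per_dim": 5 * 2 ** (num_levels - 1),
            # 2^(levels - 1) subdomains along each dimension at the finest level.
            "finest_subdomains_per_dim": 2 ** (num_levels - 1),
        })
    return configs


configs = weak_scaling_configs()
for cfg in configs:
    print(cfg["levels"], cfg["frequencies"], cfg["points_per_dim"])
```

The ratio of collocation points to finest-level subdomains stays fixed at 5 per dimension across all tests, which is what keeps the work per subdomain network constant as the problem grows. The largest test coincides with the 7-level configuration of the strong scaling test (320 points per dimension).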
The results of this test are shown in Fig. 7. We find that the multilevel FBPINNs are able to model all of the problems tested accurately, that is, modeling all of their frequency components. However, the normalized L1 accuracy of the multilevel FBPINNs does reduce somewhat as the problem complexity increases. Thus, in this test the multilevel FBPINN exhibits near, but not perfect, weak scaling. As an additional test, we plot the contribution to the FBPINN solution from each level for the 5-level FBPINN in Appendix E.

Helmholtz problem in two dimensions
Finally, we test the multilevel FBPINN using the more complex Helmholtz problem described in Section 3.4.3. Again, we carry out ablation tests first and then carry out a weak scaling study assessing how the performance of the multilevel FBPINN changes as the wave number, $k$, increases.
Given a baseline model (with $L = 4$ levels, an overlap ratio of $\delta = 1.9$, and 16 hidden units for each subdomain network; see Fig. 8), we then vary different hyperparameters over a range of values and measure the change in performance. This is carried out for the number of levels ranging from $L = 2$ to 5, the overlap ratio ranging from 1.1 to 2.7, and the number of hidden units in the subdomain network ranging from 2 to 32. Our results are shown in Fig. 8. We obtain similar results to the ablation tests carried out in Section 3.4.1 for the homogeneous Laplace problem. Namely, the accuracy of the multilevel FBPINN improves as the overlap ratio and the number of free parameters of the subdomain networks increase. Furthermore, its accuracy improves as the number of levels increases, likely because the solution contains relatively high frequencies and multiple subdomains are needed. We observe relatively large variations of the test loss between different random initializations of the models below a value of $10^{-1}$. In particular, the final loss can be somewhere between $10^{-2}$ and $10^{-1}$.
We carry out two other benchmark tests. First, we train a PINN with 5 hidden layers and 256 hidden units, and second, we train four one-level FBPINNs with $J^{(1)} = 2$, 4, 8, and 16 subdomains along each dimension, respectively. All other relevant hyperparameters are kept the same as the baseline model. These results are also shown in Fig. 8. Here, the PINN converges poorly, which again highlights the shortcomings of PINNs when solving more complex problems. Furthermore, the convergence of all the one-level FBPINNs is much slower than that of the multilevel FBPINN, and their final accuracy is worse. This again suggests that multiple levels are required for scalability.
Weak scaling test. We carry out a weak scaling study, where both the problem complexity and model capacity are scaled at the same rate. In a similar fashion to the weak scaling test in Section 3.4.2, the capacity of the multilevel FBPINN is increased by increasing the number of levels, testing from $L = 2$ to 6. For each test, $(10 \times 2^{L-1}) \times (10 \times 2^{L-1})$ uniformly-spaced collocation points are used. The problem complexity for each test is increased by setting $k = 2^{L}\pi/1.6$ and $\sigma = 0.8/2^{L}$ in Eq. (15). All other hyperparameters are fixed to the same values as the baseline model used in the ablation tests above.
The results of this test are shown in Figs. 9 and 10. We find that the multilevel FBPINN is able to accurately model all the problems tested, except for the highest wave number test. In this case, the multilevel FBPINN successfully models the dominant frequency and overall concentricity of the solution but fails to model its more complex motifs. We believe that here the FBPINN is struggling to satisfy both the point source and Dirichlet boundary conditions. Without the Dirichlet boundary condition, the solution to Eq. (15) is that of a simple point source. For all tests, we notice that in the first few training steps this is the solution learned by the multilevel FBPINN, which is then updated to the correct solution after further training. Thus, it appears the presence of the Dirichlet boundary condition leads to an optimization problem which remains challenging. This is consistent with the challenges that arise when solving Helmholtz problems using classical iterative numerical solvers. Further work is required to understand this behavior; one may be able to address this problem by using subdomain scheduling strategies to incrementally train the multilevel FBPINN, as proposed in [22].
Finally, we carry out the same weak scaling test but using a PINN instead of a multilevel FBPINN. For each test, the PINN's architecture is kept fixed at 5 hidden layers and 256 hidden units whilst the number of collocation points and problem complexity are increased in the same way as the previous test. We also test a PINN with 256 Fourier input features, 5 hidden layers and 256 hidden units, and a SA-PINN with 5 hidden layers and 256 hidden units. The values of the Fourier feature hyperparameter (see Appendix C) are hand-selected for each test, and are 0.4, 1, 2, 1.5, and 5 in order of increasing problem complexity. All other relevant hyperparameters are kept the same as the FBPINNs tested. The results of this study are shown in Figs. 9 and 10. In this case, we find that the PINN and SA-PINN are unable to accurately model any of the solutions, and their training time is an order of magnitude larger than that of the multilevel FBPINN. The PINN with Fourier input features is able to model the solutions with a similar level of accuracy as the multilevel FBPINN, but its training time is an order of magnitude larger. Thus, multilevel FBPINNs still outperform PINNs for this problem.

Discussion
Across all the problems studied, we find that the multilevel FBPINNs consistently outperform the one-level FBPINNs and PINNs tested. The multilevel FBPINNs are more accurate than the one-level FBPINNs when a large number of subdomains are used, suggesting that coarse levels are required for scalability by improving the global communication. Furthermore, the multilevel FBPINNs are significantly more accurate and computationally efficient than the PINNs tested. However, we have only started to investigate multilevel FBPINNs in this work and there are many important research questions outstanding.
One important question would be to investigate the performance of multilevel FBPINNs on problems with complex geometries and solutions which have varying complexity in different parts of the domain. In this work, we restrict ourselves to problems with rectangular geometries and homogeneous solution complexity, and use multilevel FBPINNs with uniform rectangular decompositions and an exponential level structure. However, for problems with complex geometry and varying solution complexity, it is likely to be beneficial to use irregular DDs with irregular level structures and varying subdomain network sizes, to capture inhomogeneities in the solution. Whilst the FBPINN framework can represent such a model, implementing it is practically challenging for two reasons. First, to remain computationally efficient it is likely that a fully asynchronous training code across subdomains is required, because each subdomain network would require a different amount of computation to be evaluated and trained. Secondly, it may be challenging to choose an optimal DD, level structure and subdomain network sizes, especially if characteristics of the solution are not known beforehand. One interesting direction here would be to try to learn the DD, for example by jointly learning the parameters of the FBPINN window functions with the subdomain network parameters or using a gating network similar to [52,53].
Another valuable direction would be to study the theoretical convergence properties of multilevel FBPINNs. A major limitation of PINNs compared to classical DDMs is that their convergence properties are still poorly understood. In particular, whilst the multilevel FBPINN exhibits good scaling properties for the Laplacian problems studied, it remains unclear why the optimization of the high wave number Helmholtz problem is challenging; note that the convergence of classical DDMs for high wave number Helmholtz problems is also not fully understood.
Furthermore, it is important to investigate ways to improve the computational efficiency of multilevel FBPINNs further. Whilst multilevel FBPINNs are over an order of magnitude more efficient than the PINNs tested, their training is still likely to be slower than many traditional methods, such as numerical solvers for finite difference or finite element systems. Fundamentally, this is because (FB)PINNs yield a non-convex optimization problem, which is relatively expensive compared to the linear solves which traditional methods typically rely on. One way to accelerate (multilevel) FBPINNs, as suggested in [22], is to provide more inputs to the subdomain networks, such as BCs and PDE coefficients, and train across a range of these inputs so that the multilevel FBPINN learns a fast surrogate model which does not need to be retrained for each new solution.
Alongside this, multi-GPU training of (multilevel) FBPINNs should be investigated. Here we solve all problems using a single GPU, but multi-GPU training will become essential for problem sizes where 10,000+ subdomains are required (for example, 3D problems, or problems with highly multi-scale solutions). In Section 2.1.2, we show that (multilevel) FBPINNs are theoretically scalable to large problem sizes: the computational cost of evaluating the FBPINN solution scales as the product of the number of collocation points, the average number of subdomains a collocation point belongs to, and $S$, the cost of computing the output of a single subdomain network for a single collocation point. Importantly, this cost is independent of the total number of subdomains, and scales linearly with the number of collocation points. Practically, it may be challenging to achieve perfect linear scaling when using multiple GPUs because of the communication required between GPUs. For FBPINNs, the only communication required between subdomains is within their overlapping regions, where subdomain solutions are summed together; note that, in the multilevel case, subdomains on different levels may also overlap, which increases the required communication but improves the numerical scalability. One possible parallel implementation of FBPINNs was proposed by [22] (Algorithm 1 and Fig. 4), where a separate GPU is used to train each subdomain network. Importantly, communication between GPUs is only required in the forward pass when summing the outputs of the subdomain networks in the overlapping regions; the backpropagation and updating of the subdomain networks can then be done independently on each GPU for each subdomain network. Future work will investigate the parallel scalability of FBPINNs in detail.
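The claim that the evaluation cost does not depend on the total number of subdomains can be illustrated in 1D by counting how many subdomains contain each collocation point. The sketch below is hypothetical: it assumes a single level whose subdomain centres are evenly spaced on [0, 1], with each subdomain's width equal to the overlap ratio times the centre spacing, which may differ in detail from the definition in Eq. (9).

```python
def subdomains_containing(x, num_subdomains, overlap_ratio):
    """Count the 1D subdomains that contain the point x."""
    spacing = 1.0 / (num_subdomains - 1)        # distance between neighbouring centres
    half_width = 0.5 * overlap_ratio * spacing  # assumed subdomain half-width
    count = 0
    for j in range(num_subdomains):
        centre = j * spacing
        if abs(x - centre) < half_width:
            count += 1
    return count


def average_membership(num_subdomains, overlap_ratio=1.9, num_points=2001):
    """Average number of subdomains per uniformly-spaced collocation point."""
    points = [i / (num_points - 1) for i in range(num_points)]
    total = sum(subdomains_containing(x, num_subdomains, overlap_ratio) for x in points)
    return total / num_points


avg_few = average_membership(num_subdomains=8)
avg_many = average_membership(num_subdomains=64)
print(avg_few, avg_many)
```

Under these assumptions the average membership stays close to the overlap ratio whether 8 or 64 subdomains are used, so the number of subdomain network evaluations per point is set by the overlap and not by the size of the decomposition. The brute-force loop over all subdomains is only for clarity; an efficient implementation would look up the few relevant subdomains directly.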

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. Scaling high frequency problems to low frequency problems using domain decomposition. FBPINNs decompose the domain into many subdomains, and use neural networks within each subdomain to learn the local solution. The input coordinates to each network are normalized to the range [−1, 1] over their individual subdomains. When solving problems with high frequency solutions, this effectively scales each local problem from a high frequency problem to a lower frequency problem, and helps reduce the network's spectral bias.

Fig. 2. Plot of a square domain decomposed into four overlapping subdomains, using a uniform rectangular decomposition.

Fig. 3. Example of a multilevel FBPINN solving Laplace's equation in one and two dimensions. For the 1D problem, the multilevel FBPINN uses $L = 3$ levels, where each level has 1, 2 and 4 subdomains respectively. The window functions, $\omega_j^{(l)}(x)$, used for each level are shown in (a), the individual solutions learned by each subdomain network are shown in (b), and the multilevel FBPINN solution is shown in (c). For the 2D problem, the multilevel FBPINN uses $L = 3$ levels, where each level has 1 × 1, 2 × 2 and 4 × 4 subdomains respectively, using a uniform rectangular DD. The DDs for level 2 and level 3 are plotted in (d) and (e), and the multilevel FBPINN solution is shown in (f). Note the subdomain boundaries and window functions extend past the problem domain (in this case, $[0, 1]^d$). Example collocation points used to train the multilevel FBPINN are plotted in (a), (d) and (e).
Network architecture. All FBPINNs tested use FCNs with identical architectures as their subdomain networks. The PINNs tested either use FCNs or FCNs with Fourier input features (see Appendix C for the definition of Fourier features). For all the FBPINNs tested, the input coordinates to each subdomain network are normalized to the range [−1, 1] along each dimension over their individual subdomains. For the PINNs tested, the input coordinates are normalized to the range [−1, 1] along each dimension over the global domain. tanh is used for all activation functions.
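The input normalization amounts to an affine map per subdomain. The following is a minimal 1D sketch; the function names and the example subdomain bounds are hypothetical and chosen only for illustration.

```python
def normalize_to_subdomain(x, lower, upper):
    """Map x from the subdomain interval [lower, upper] to [-1, 1]."""
    return 2.0 * (x - lower) / (upper - lower) - 1.0


def local_frequency(frequency, lower, upper):
    """Frequency seen by the subdomain network in its normalized coordinate."""
    return frequency * (upper - lower) / 2.0


lower, upper = 0.25, 0.75   # hypothetical subdomain bounds
left = normalize_to_subdomain(0.25, lower, upper)
centre = normalize_to_subdomain(0.5, lower, upper)
right = normalize_to_subdomain(0.75, lower, upper)

# A component with frequency 64 in the global coordinate appears with
# frequency 16 in the normalized coordinate of this subdomain.
scaled = local_frequency(64.0, lower, upper)
print(left, centre, right, scaled)
```

The narrower the subdomain, the lower the frequency each network has to represent in its own coordinate, which is the mechanism by which the decomposition reduces spectral bias (see Fig. 1).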

Fig. 5. Ablation tests using the homogeneous Laplacian problem. The convergence curve of a baseline multilevel FBPINN is plotted when changing the number of levels (top right), overlap ratio (bottom left), and number of hidden units for each subdomain network (bottom right). The baseline model has $L = 3$ levels, an overlap ratio of $\delta = 1.9$, and 16 hidden units for each subdomain network. The exact solution is shown (top left). Convergence curves of two other benchmarks are shown: a PINN (bottom right), and one-level FBPINNs with varying numbers of subdomains (top right). The lists which label each model in the top right plot contain the number of subdomains along each dimension for each level in the model. Filled region edges show the minimum and maximum loss values across 10 random starting seeds and lines show the average.

Fig. 6. Strong scaling test using the multi-scale Laplacian problem. In this test the problem complexity is fixed and the solutions estimated using multilevel FBPINNs with increasing numbers of levels are plotted (top row). The title of each plot describes the level structure (first line) and the number of collocation points along each dimension (second line). The color-coded convergence curves and training times for each model are shown (bottom row). Filled region edges show the minimum and maximum loss values across 10 random starting seeds and lines show the average. Error bars show the minimum and maximum loss values and training times. The exact solution is shown (middle row). Plots of the solutions and convergence curves of a PINN, PINN with Fourier input features, SA-PINN, one-level FBPINN and three-level FBPINN benchmark are also shown (middle and bottom row).

V. Dolean et al.

Fig. 7. Weak scaling test using the multi-scale Laplacian problem. In this test the problem complexity is increased (in this case, the number of frequency components in the solution) (top row) and the solutions estimated using multilevel FBPINNs with increasing numbers of levels and collocation points are plotted (middle row). The title of each plot describes the level structure (first line) and the number of collocation points along each dimension (second line). The color-coded convergence curves and training times for each model are shown (bottom row). Filled region edges show the minimum and maximum loss values across 10 random starting seeds and lines show the average. Error bars show the minimum and maximum loss values and training times.

Fig. 8. Ablation tests using the Helmholtz problem. The convergence curve of a baseline multilevel FBPINN is plotted when changing the number of levels (top right), overlap ratio (bottom left), and number of hidden units for each subdomain network (bottom right). The baseline model has $L = 4$ levels, an overlap ratio of $\delta = 1.9$, and 16 hidden units for each subdomain network. The solution obtained from FD modeling is shown (top left). Convergence curves of two other benchmarks are shown: a PINN (bottom right), and one-level FBPINNs with varying numbers of subdomains (top right). The lists which label each model in the top right plot contain the number of subdomains along each dimension for each level in the model. Filled region edges show the minimum and maximum loss values across 10 random starting seeds and lines show the average.

Fig. 9. Weak scaling test using the Helmholtz problem. In this test the problem complexity is increased (in this case, the wave number) (top row) and the solutions estimated using multilevel FBPINNs with increasing numbers of levels and collocation points are plotted (second row). The title of each plot describes the level structure (first line) and the number of collocation points along each dimension (second line). Three benchmarks using a PINN, a PINN with Fourier input features, and a SA-PINN, all with a fixed network size and increasing numbers of collocation points, are also shown (third and fourth row).


Fig. 10. Color-coded convergence curves and training times for each model displayed in Fig. 9. Filled region edges show the minimum and maximum loss values across 10 random starting seeds and lines show the average. Error bars show the minimum and maximum loss values and training times.

Fig. E.12. Contribution to the FBPINN solution from each level, for the 5-level FBPINN shown in Fig. 7. Top left shows the full FBPINN solution after summing the levels.