Article

Numerical Investigation of a Class of Nonlinear Time-Dependent Delay PDEs Based on Gaussian Process Regression

School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan 430073, China
* Author to whom correspondence should be addressed.
Fractal Fract. 2022, 6(10), 606; https://doi.org/10.3390/fractalfract6100606
Submission received: 27 August 2022 / Revised: 7 October 2022 / Accepted: 11 October 2022 / Published: 17 October 2022
(This article belongs to the Special Issue Novel Numerical Solutions of Fractional PDEs)

Abstract:
Probabilistic machine learning and data-driven methods have gradually shown high efficiency in solving the forward and inverse problems of partial differential equations (PDEs). This paper focuses on the forward problem of solving time-dependent nonlinear delay PDEs with multiple delays based on multi-prior numerical Gaussian processes (MP-NGPs), which we construct to solve complex PDEs that may involve fractional operators, multiple delays and different types of boundary conditions. We also quantify the uncertainty of the prediction through the posterior distribution of the predicted solution. The core of MP-NGPs is to first discretize time and then, at each time step, perform a Gaussian process regression based on multiple priors to obtain the solution at the next time step; this procedure is repeated until the last time step. Different types of boundary conditions are studied in this paper, including Dirichlet, Neumann and mixed boundary conditions. Several numerical tests show that the methods considered in this paper work well in solving nonlinear time-dependent delay PDEs, where delay partial differential equations, delay partial integro-differential equations and delay fractional partial differential equations are considered. Furthermore, in order to improve the accuracy of the algorithm, we construct Runge–Kutta methods within the framework of multi-prior numerical Gaussian processes. The results of the numerical experiments show that the prediction accuracy of the algorithm is clearly improved when the Runge–Kutta methods are employed.

1. Introduction

In the last decade, more and more attention has been paid to the theoretical studies and practical applications of probabilistic machine learning and data-driven methods [1,2], and some of these studies have gradually been extended to the field of differential equations. How to predict the solution of (nonlinear) dynamic systems represented by (delay) partial differential equations (PDEs) has attracted considerable interest [3,4,5,6,7,8]. However, (deep) neural networks are favored and used more often by researchers [9,10,11,12,13,14,15,16], and the research targets may involve both the forward and inverse problems [17].
Machine learning with Gaussian processes (GPs) [18] is not new, and has been widely employed in many fields, such as geology [19], meteorology [20] and some financial applications [21]. Similar to other machine learning methodologies, Gaussian processes can be utilized in regression or classification problems. Moreover, Gaussian processes belong to a family of methods named kernel machines [22]. However, compared with other kernel machine methods, such as support vector machines (SVMs) [23] and relevance vector machines (RVMs) [24], the rigorous probabilistic derivation behind Gaussian processes limits the popularity of this method in industry. Another drawback of Gaussian processes is that the computational cost may be expensive, due to the Cholesky decomposition of the covariance matrix. As a kernel machine with very strong learning ability, Gaussian processes (GPs) have been widely used in the forward and inverse problems of ordinary differential equations (ODEs) and PDEs in recent years.
Markus et al. (2021) [25] constructed multi-output Gaussian process priors with realizations in the solution set of the linear PDEs; the construction was fully algorithmic via Gröbner bases. However, the construction for building GP priors by the parametrization of such systems was done via a pullback. Paterne et al. (2022) [26] efficiently inferred inhomogeneous terms modeled as GPs by using a truncated basis expansion of the GP kernel from the perspective of exact conjugate Bayesian inference. Mamikon et al. (2022) [27] proposed a framework which combined co-Kriging with the linear transformation of GPs together with the use of kernels given by spectral expansions in eigenfunctions of the boundary value problem, to infer the solution of a well-posed boundary value problem (BVP). To estimate parameters of nonlinear dynamic systems, represented by ODEs from sparse and noisy data, Yang et al. (2022) [28] bypassed the need for numerical integration, and proposed a fast and accurate method, named manifold-constrained Gaussian process inference (MAGI), explicitly conditioned on the manifold constraint that derivatives of Gaussian processes must satisfy the considered ODEs, based on a GP model over time series data. Raissi et al. (2018) [29] proposed a method for the forward problem of time-dependent PDEs based on GPR, which was named numerical Gaussian processes (NGPs). The method can solve equations with high accuracy, and the frame of the algorithm is flexible for solving different problems. Generally speaking, the study of Gaussian processes is an important branch of probabilistic numerics [30,31,32,33,34,35] in numerical analysis.
On the basis of NGPs, we propose a new method, multi-prior numerical Gaussian processes (MP-NGPs), designed to solve complex PDEs. To the best of our knowledge, most methods based on Gaussian processes do not consider delay terms or high-dimensional problems, and only a few researchers consider fractional and integro-differential operators, which obviously cannot meet the requirements of practical applications. All of these features are handled by MP-NGPs. We also give a rigorous mathematical derivation of MP-NGPs. Several PDEs are solved in the following sections to demonstrate the validity of MP-NGPs.
It is well known that Gaussian processes [18] possess a strict mathematical basis and are Bayesian in nature. Unlike traditional methods, including the finite element method [36], the spectral method [37] and the finite difference method [38], MP-NGPs, like NGPs [29], solve PDEs by means of statistical methods. In MP-NGPs, the Bayesian method [39] is used to construct an efficient data-driven learning machine, which is then used to infer the solution of the equations.
This paper mainly explores the use of MP-NGPs to solve the forward problem of a class of time-dependent nonlinear delay partial differential equations with multiple delays (1),
$$
\begin{cases}
\dfrac{\partial u(\mathbf{x},t)}{\partial t}=g(u,\mathbf{x})\,\mathcal{L}_{\mathbf{x}}u(\mathbf{x},t)+\displaystyle\sum_{i=1}^{m}h_i(u,\mathbf{x})\,\mathcal{L}_{\mathbf{x}}^{s_i}u(\mathbf{x},t-s_i)+f(\mathbf{x},t), & \mathbf{x}\in\Omega,\ 0<t<T,\\[2mm]
u(\mathbf{x},t)\big|_{\mathbf{x}\in\partial\Omega}=\varphi(\mathbf{x},t), & \\[1mm]
u(\mathbf{x},t)=\psi(\mathbf{x},t), & t\le 0,
\end{cases}
\tag{1}
$$
where $t$ is the temporal variable, $\mathbf{x}$ is the $D$-dimensional spatial vector, $s_i$ ($i=1,2,\ldots,m$) are the delays ($s_m>s_{m-1}>\cdots>s_1>0$), $\Omega\subset\mathbb{R}^{D}$, and $\partial\Omega$ is the boundary of $\Omega$. $u(\mathbf{x},t)$ is the solution of (1), $f(\mathbf{x},t)$ is the inhomogeneous term, $g(u,\mathbf{x})$, $h_i(u,\mathbf{x})$ ($i=1,2,\ldots,m$), $\varphi(\mathbf{x},t)$ and $\psi(\mathbf{x},t)$ are given functions, and $\mathcal{L}_{\mathbf{x}}$, $\mathcal{L}_{\mathbf{x}}^{s_i}$ ($i=1,2,\ldots,m$) are linear operators.
The linear operators considered in this paper include three different types: partial differential, integro-differential and fractional-order operators. We should point out that the fractional-order operators in (1) are understood in the sense of Caputo [40,41],
$$
\frac{\partial^{\alpha}}{\partial x^{\alpha}}u(x,t)=\frac{1}{\Gamma(n-\alpha)}\int_{-\infty}^{x}\frac{1}{(x-\upsilon)^{\alpha-n+1}}\,\frac{\partial^{n}u(\upsilon,t)}{\partial\upsilon^{n}}\,\mathrm{d}\upsilon,
\tag{2}
$$
where $\alpha>0$ is a constant ($n-1<\alpha\le n$, $n\in\mathbb{N}^{+}$) and $\Gamma(\cdot)$ is the Gamma function.
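To make the definition (2) concrete, the following minimal Python sketch (our own illustration, not the authors' code) approximates the Caputo-type derivative of $u(x)=e^{0.5x}$ for $0<\alpha<1$ by truncating the lower limit at a finite point and removing the weak singularity with a change of variables; the closed-form value $0.5^{\alpha}e^{0.5x}$ serves as a check. The function name and all numerical parameters are assumptions made for this example.

```python
import numpy as np
from math import gamma

def caputo_deriv(u_prime, x, alpha, x_min=-40.0, n_quad=4000):
    """Approximate the Caputo-type derivative (2) for 0 < alpha < 1.

    The lower limit is truncated at x_min, and the weak singularity at
    upsilon = x is removed by the substitution r = s**(1/(1-alpha)),
    with r = x - upsilon."""
    R = x - x_min
    s = np.linspace(0.0, R**(1.0 - alpha), n_quad)
    r = s**(1.0 / (1.0 - alpha))
    integrand = u_prime(x - r) / (1.0 - alpha)
    # composite trapezoidal rule on the transformed, non-singular integrand
    integral = 0.5 * np.sum((integrand[1:] + integrand[:-1]) * np.diff(s))
    return integral / gamma(1.0 - alpha)

# check on u(x) = exp(0.5 x): the exact derivative is 0.5**alpha * exp(0.5 x)
alpha, x = 0.25, 0.7
approx = caputo_deriv(lambda v: 0.5 * np.exp(0.5 * v), x, alpha)
exact = 0.5**alpha * np.exp(0.5 * x)
print(approx, exact)   # the two values should agree closely
```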
Apparently, the boundary conditions of problem (1) are Dirichlet boundary conditions. Other types of boundary conditions are discussed, together with the corresponding treatments, in Section 4. To further demonstrate the efficiency and flexibility of MP-NGPs, we consider another type of time-dependent PDE, wave equations, which still involve delay terms and inhomogeneous terms, in Section 5.
We arrange the outline of this paper as follows: Section 2 introduces MP-NGPs according to the workflow of solving nonlinear delay PDEs. In Section 3, Runge–Kutta methods based on MP-NGPs are constructed. Section 4 introduces the processing of Neumann and mixed boundary conditions in MP-NGPs. Section 5 gives two methods of solving wave equations with MP-NGPs. Finally, we present some discussions and conclusions in Section 6 and Section 7, respectively.

2. Multi-Prior Numerical Gaussian Processes

In this section, we introduce MP-NGPs based on the workflow of solving nonlinear delay PDEs with multi-delays.

2.1. Prior

Applying the backward Euler scheme to the first equation of (1) to discretize time, with time step size $\Delta t$, we obtain
$$
u^{n-1}=u^{n}-\Delta t\,g(u^{n},\mathbf{x})\,\mathcal{L}_{\mathbf{x}}u^{n}-\Delta t\sum_{i=1}^{m}h_i(u^{n},\mathbf{x})\,\mathcal{L}_{\mathbf{x}}^{s_i}u^{n-\tau_i}-\Delta t\,f^{n},
\tag{3}
$$
where $n=1,2,\ldots,N$, $N=T/\Delta t$, $u^{n}=u(\mathbf{x},t_n)$, $u^{n-\tau_i}=u(\mathbf{x},t_n-s_i)$, $\tau_i=s_i/\Delta t$ and $f^{n}=f(\mathbf{x},t_n)$. We approximate the nonlinear terms $g(u^{n},\mathbf{x})\mathcal{L}_{\mathbf{x}}u^{n}$ and $h_i(u^{n},\mathbf{x})\mathcal{L}_{\mathbf{x}}^{s_i}u^{n-\tau_i}$ ($i=1,2,\ldots,m$) by $g(\mu^{n-1},\mathbf{x})\mathcal{L}_{\mathbf{x}}u^{n}$ and $h_i(\mu^{n-1},\mathbf{x})\mathcal{L}_{\mathbf{x}}^{s_i}u^{n-\tau_i}$ ($i=1,2,\ldots,m$), respectively, where $\mu^{n-1}$ denotes the predicted solution (posterior mean) of the previous time step. Then, introducing the notations $\mathcal{L}_{\mathbf{x}}^{0}u^{n}=u^{n}-\Delta t\,g(\mu^{n-1},\mathbf{x})\mathcal{L}_{\mathbf{x}}u^{n}$, $\mathcal{L}_{\mathbf{x}}^{i}u^{n-\tau_i}=-\Delta t\,h_i(\mu^{n-1},\mathbf{x})\mathcal{L}_{\mathbf{x}}^{s_i}u^{n-\tau_i}$ ($i=1,2,\ldots,m$) and $\mathcal{L}_{\mathbf{x}}^{m+1}f^{n}=-\Delta t\,f^{n}$, we obtain the simplified equation
$$
u^{n-1}=\mathcal{L}_{\mathbf{x}}^{0}u^{n}+\sum_{i=1}^{m}\mathcal{L}_{\mathbf{x}}^{i}u^{n-\tau_i}+\mathcal{L}_{\mathbf{x}}^{m+1}f^{n}.
\tag{4}
$$
Equation (4) captures the whole structure of solutions at each time step. We can link the solutions at different time steps by setting reasonable prior assumptions. Assume the following prior hypotheses [29]
$$
\begin{aligned}
u^{n}(\mathbf{x})&\sim\mathcal{GP}\big(0,\,k_{u,u}^{n,n}(\mathbf{x},\mathbf{x}';\theta_0)\big),\\
u^{n-\tau_i}(\mathbf{x})&\sim\mathcal{GP}\big(0,\,k_{u,u}^{n-\tau_i,n-\tau_i}(\mathbf{x},\mathbf{x}';\theta_i)\big),\quad i=1,2,\ldots,m,\\
f^{n}(\mathbf{x})&\sim\mathcal{GP}\big(0,\,k_{f,f}^{n,n}(\mathbf{x},\mathbf{x}';\theta_{m+1})\big),
\end{aligned}
\tag{5}
$$
to be $m+2$ mutually independent Gaussian processes. Considering computational feasibility, we set the prior assumption on $u^{n}$ rather than on $u^{n-1}$. Assume the covariance functions to be of the following squared exponential form [18],
$$
\begin{aligned}
k_{u,u}^{n,n}(\mathbf{x},\mathbf{x}';\theta_0)&=\sigma_0^{2}\exp\!\Big(-\tfrac12\sum_{d=1}^{D}w_{0,d}(x_d-x'_d)^2\Big),\\
k_{u,u}^{n-\tau_i,n-\tau_i}(\mathbf{x},\mathbf{x}';\theta_i)&=\sigma_i^{2}\exp\!\Big(-\tfrac12\sum_{d=1}^{D}w_{i,d}(x_d-x'_d)^2\Big),\quad i=1,2,\ldots,m,\\
k_{f,f}^{n,n}(\mathbf{x},\mathbf{x}';\theta_{m+1})&=\sigma_{m+1}^{2}\exp\!\Big(-\tfrac12\sum_{d=1}^{D}w_{m+1,d}(x_d-x'_d)^2\Big),
\end{aligned}
\tag{6}
$$
where $\{\theta_j\,|\,j=0,1,\ldots,m+1\}$ denotes the hyper-parameters; in Equations (6), $\theta_j=\{\sigma_j^{2},(w_{j,d})_{d=1}^{D}\}$, $j=0,1,\ldots,m+1$. In short, any information about the PDEs, such as delay terms, can be encoded into MP-NGPs through the assumption of multiple priors, as long as the structure constructed by these priors is feasible.
By Mercer’s theorem [42], given a positive definite covariance function $k(\mathbf{x},\mathbf{x}')$ we can decompose it as $k(\mathbf{x},\mathbf{x}')=\sum_{i=1}^{\infty}\lambda_i\varphi_i(\mathbf{x})\varphi_i(\mathbf{x}')$, where $\lambda_i$ and $\varphi_i$ satisfy $\int k(\mathbf{x},\mathbf{x}')\varphi_i(\mathbf{x})\,\mathrm{d}\mathbf{x}=\lambda_i\varphi_i(\mathbf{x}')$. Treating $\{\sqrt{\lambda_i}\,\varphi_i\}_{i=1}^{\infty}$ as a set of orthogonal basis functions, one constructs a reproducing kernel Hilbert space (RKHS) [43], which is the so-called kernel trick [43]. The covariance functions in Equation (6) do not have a finite decomposition. Different covariance functions, such as rational quadratic covariance functions and Matérn covariance functions, can be selected appropriately under different prior information [18].
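As a concrete illustration of the covariance family (6), the short Python sketch below (our own, with assumed hyper-parameter values) evaluates the anisotropic squared exponential kernel on a small set of points; `sq_exp_kernel` is a hypothetical helper, not part of the authors' code.

```python
import numpy as np

def sq_exp_kernel(X, X2, sigma2, w):
    """Squared exponential (ARD) covariance of the form used in (6):
    k(x, x') = sigma2 * exp(-0.5 * sum_d w_d * (x_d - x'_d)**2).

    X: (n, D) array, X2: (m, D) array, w: (D,) array of inverse squared
    length-scales. Returns the (n, m) covariance matrix."""
    diff = X[:, None, :] - X2[None, :, :]            # (n, m, D) pairwise differences
    return sigma2 * np.exp(-0.5 * np.sum(w * diff**2, axis=-1))

# small usage example with assumed hyper-parameter values
X = np.random.rand(5, 2)
K = sq_exp_kernel(X, X, sigma2=1.0, w=np.array([4.0, 1.0]))
print(K.shape)   # (5, 5), symmetric and positive semi-definite
```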

2.2. Training

By the properties of GPs [18], we can obtain y G P ( 0 , K ) , where
$$
\mathbf{y}=\begin{bmatrix}\mathbf{u}_b^{n}\\ \mathbf{u}^{n-\tau_1}\\ \vdots\\ \mathbf{u}^{n-\tau_m}\\ \mathbf{f}^{n}\\ \mathbf{u}^{n-1}\end{bmatrix},\qquad
\mathbf{K}=\begin{bmatrix}
k_{b,b}^{n,n} & k_{b,u}^{n,n-\tau_1} & \cdots & k_{b,u}^{n,n-\tau_m} & k_{b,f}^{n,n} & k_{b,u}^{n,n-1}\\
 & k_{u,u}^{n-\tau_1,n-\tau_1} & \cdots & k_{u,u}^{n-\tau_1,n-\tau_m} & k_{u,f}^{n-\tau_1,n} & k_{u,u}^{n-\tau_1,n-1}\\
 & & \ddots & \vdots & \vdots & \vdots\\
 & & & k_{u,u}^{n-\tau_m,n-\tau_m} & k_{u,f}^{n-\tau_m,n} & k_{u,u}^{n-\tau_m,n-1}\\
 & & & & k_{f,f}^{n,n} & k_{f,u}^{n,n-1}\\
 & & & & & k_{u,u}^{n-1,n-1}
\end{bmatrix},
$$
with
$$
\begin{aligned}
k_{b,b}^{n,n}&=k_{u,u}^{n,n}(\mathbf{x}_b^{n},\mathbf{x}_b^{n})+\sigma_{u_n}^{2}\mathbf{I},\qquad
k_{f,f}^{n,n}=k_{f,f}^{n,n}(\mathbf{x}_f^{n},\mathbf{x}_f^{n})+\sigma_{f_n}^{2}\mathbf{I},\\
k_{u,u}^{n-\tau_i,n-\tau_i}&=k_{u,u}^{n-\tau_i,n-\tau_i}(\mathbf{x}_u^{n-\tau_i},\mathbf{x}_u^{n-\tau_i})+\sigma_{u_{n-\tau_i}}^{2}\mathbf{I},\quad i=1,2,\ldots,m,\\
k_{b,u}^{n,n-\tau_i}&=k_{b,u}^{n,n-\tau_i}(\mathbf{x}_b^{n},\mathbf{x}_u^{n-\tau_i})=\mathbf{0},\qquad
k_{u,f}^{n-\tau_i,n}=k_{u,f}^{n-\tau_i,n}(\mathbf{x}_u^{n-\tau_i},\mathbf{x}_f^{n})=\mathbf{0},\quad i=1,2,\ldots,m,\\
k_{b,f}^{n,n}&=k_{b,f}^{n,n}(\mathbf{x}_b^{n},\mathbf{x}_f^{n})=\mathbf{0},\qquad
k_{u,u}^{n-\tau_l,n-\tau_k}=k_{u,u}^{n-\tau_l,n-\tau_k}(\mathbf{x}_u^{n-\tau_l},\mathbf{x}_u^{n-\tau_k})=\mathbf{0},\quad l\neq k,\ l,k=1,2,\ldots,m,\\
k_{b,u}^{n,n-1}&=\mathcal{L}_{\mathbf{x}'}^{0}k_{u,u}^{n,n}(\mathbf{x}_b^{n},\mathbf{x}_u^{n-1}),\qquad
k_{u,u}^{n-\tau_i,n-1}=\mathcal{L}_{\mathbf{x}'}^{i}k_{u,u}^{n-\tau_i,n-\tau_i}(\mathbf{x}_u^{n-\tau_i},\mathbf{x}_u^{n-1}),\quad i=1,2,\ldots,m,\\
k_{f,u}^{n,n-1}&=\mathcal{L}_{\mathbf{x}'}^{m+1}k_{f,f}^{n,n}(\mathbf{x}_f^{n},\mathbf{x}_u^{n-1}),\\
k_{u,u}^{n-1,n-1}&=\mathcal{L}_{\mathbf{x}}^{0}\mathcal{L}_{\mathbf{x}'}^{0}k_{u,u}^{n,n}(\mathbf{x}_u^{n-1},\mathbf{x}_u^{n-1})
+\sum_{i=1}^{m}\mathcal{L}_{\mathbf{x}}^{i}\mathcal{L}_{\mathbf{x}'}^{i}k_{u,u}^{n-\tau_i,n-\tau_i}(\mathbf{x}_u^{n-1},\mathbf{x}_u^{n-1})\\
&\quad+\mathcal{L}_{\mathbf{x}}^{m+1}\mathcal{L}_{\mathbf{x}'}^{m+1}k_{f,f}^{n,n}(\mathbf{x}_u^{n-1},\mathbf{x}_u^{n-1})+\sigma_{u_{n-1}}^{2}\mathbf{I},\qquad \mathbf{x}_b\in\partial\Omega.
\end{aligned}
$$
It is worth noting that $\mathbf{K}$ is a symmetric matrix, and the elements below the upper triangular part of $\mathbf{K}$ are omitted here.
Consider additive noise in the observed data and introduce the notations $\mathbf{u}_b^{n}=u(\mathbf{x}_b^{n},t_n)+\boldsymbol{\varepsilon}_{u}^{n}$, $\mathbf{f}^{n}=f(\mathbf{x}_f^{n},t_n)+\boldsymbol{\varepsilon}_{f}^{n}$, $\mathbf{u}^{n-1}=u(\mathbf{x}_u^{n-1},t_{n-1})+\boldsymbol{\varepsilon}_{u}^{n-1}$ and $\mathbf{u}^{n-\tau_i}=u(\mathbf{x}_u^{n-\tau_i},t_{n-\tau_i})+\boldsymbol{\varepsilon}_{u}^{n-\tau_i}$ ($i=1,2,\ldots,m$), where $\boldsymbol{\varepsilon}_{u}^{n}\sim\mathcal{N}(\mathbf{0},\sigma_{u_n}^{2}\mathbf{I})$, $\boldsymbol{\varepsilon}_{f}^{n}\sim\mathcal{N}(\mathbf{0},\sigma_{f_n}^{2}\mathbf{I})$, $\boldsymbol{\varepsilon}_{u}^{n-1}\sim\mathcal{N}(\mathbf{0},\sigma_{u_{n-1}}^{2}\mathbf{I})$ and $\boldsymbol{\varepsilon}_{u}^{n-\tau_i}\sim\mathcal{N}(\mathbf{0},\sigma_{u_{n-\tau_i}}^{2}\mathbf{I})$ ($i=1,2,\ldots,m$). These $m+3$ additive noise terms are assumed to be mutually independent. Notice that $k_{b,b}^{n,n}$ is equal to $k_{u,u}^{n,n}$ evaluated at the boundary points; that is, under Dirichlet boundary conditions the boundary points can be seen as known points used for the next time step. The treatment of other types of boundary conditions is introduced in Section 4.

Based on the prior assumptions (5), the hyper-parameters $\{\theta_j\,|\,j=0,1,\ldots,m+1\}$ are trained by minimizing the following negative log marginal likelihood (NLML),
$$
-\log p(\mathbf{y}\,|\,\mathbf{x}_b^{n},\mathbf{x}_u^{n-\tau_1},\ldots,\mathbf{x}_u^{n-\tau_m},\mathbf{x}_f^{n},\mathbf{x}_u^{n-1},\theta_0,\ldots,\theta_{m+1})=\tfrac12\log|\mathbf{K}|+\tfrac12\mathbf{y}^{\mathrm T}\mathbf{K}^{-1}\mathbf{y}+\tfrac{N}{2}\log 2\pi,
\tag{7}
$$
where $N$ here denotes the length of $\mathbf{y}$. The quasi-Newton optimizer L-BFGS-B [44] is applied for training in this paper. It should be pointed out that other optimization algorithms, such as the whale optimization algorithm (WOA) [45] and the ant colony optimization algorithm [46], can also be used for training.

The training procedure is one of the most important parts of the algorithm and can be seen as the "regression nature" of MP-NGPs. In order to numerically solve the forward problem of PDEs, MP-NGPs subtly turn algebraic problems into optimization problems, which are more natural for machine learning methods.
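The following hedged Python sketch illustrates the training step for a single squared exponential prior: it assembles the NLML of (7) for one covariance block and minimizes it with the quasi-Newton optimizer L-BFGS-B from SciPy. It is a simplified stand-in for the full multi-prior matrix $\mathbf{K}$; the data, the jitter and the hyper-parameter names are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_marglik(log_params, X, y):
    """NLML of a 1-D GP with squared exponential kernel plus noise:
    0.5*log|K| + 0.5*y^T K^{-1} y + (N/2)*log(2*pi).
    A single-prior stand-in for the multi-prior matrix K described above."""
    sigma2, w, noise = np.exp(log_params)
    diff = X[:, None] - X[None, :]
    K = sigma2 * np.exp(-0.5 * w * diff**2) + (noise + 1e-8 * sigma2) * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (np.sum(np.log(np.diag(L)))          # 0.5 * log|K|
            + 0.5 * y @ alpha
            + 0.5 * len(X) * np.log(2 * np.pi))

X = np.linspace(0.0, 1.0, 25)
y = np.sin(2 * np.pi * X) + 0.05 * np.random.randn(25)

# train the (log) hyper-parameters with the quasi-Newton optimizer L-BFGS-B
res = minimize(neg_log_marglik, x0=np.zeros(3), args=(X, y), method="L-BFGS-B")
print(np.exp(res.x))   # optimized sigma^2, w and noise variance
```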

2.3. Posterior

We can easily obtain the following posterior distribution of the prediction $u^{n}(\mathbf{x}_*^{n})$ based on the conditional distribution of Gaussian processes,
$$
u^{n}(\mathbf{x}_*^{n})\,\Big|\,
\begin{bmatrix}\mathbf{u}_b^{n}\\ \mathbf{u}^{n-\tau_1}\\ \vdots\\ \mathbf{u}^{n-\tau_m}\\ \mathbf{f}^{n}\\ \mathbf{u}^{n-1}\end{bmatrix}
\sim\mathcal{N}\!\left(\mathbf{q}^{\mathrm T}\mathbf{K}^{-1}
\begin{bmatrix}\mathbf{u}_b^{n}\\ \mathbf{u}^{n-\tau_1}\\ \vdots\\ \mathbf{u}^{n-\tau_m}\\ \mathbf{f}^{n}\\ \mathbf{u}^{n-1}\end{bmatrix},\;
k_{u,u}^{n,n}(\mathbf{x}_*^{n},\mathbf{x}_*^{n})-\mathbf{q}^{\mathrm T}\mathbf{K}^{-1}\mathbf{q}\right),
\tag{8}
$$
where $\mathbf{q}^{\mathrm T}:=\big[k_{u,b}^{n,n}(\mathbf{x}_*^{n},\mathbf{x}_b^{n}),\,k_{u,u}^{n,n-\tau_1}(\mathbf{x}_*^{n},\mathbf{x}_u^{n-\tau_1}),\ldots,k_{u,u}^{n,n-\tau_m}(\mathbf{x}_*^{n},\mathbf{x}_u^{n-\tau_m}),\,k_{u,f}^{n,n}(\mathbf{x}_*^{n},\mathbf{x}_f^{n}),\,k_{u,u}^{n,n-1}(\mathbf{x}_*^{n},\mathbf{x}_u^{n-1})\big]$.
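A minimal sketch of the conditional step (8), assuming the joint covariance matrix $\mathbf{K}$, the cross-covariance vector $\mathbf{q}$ and the stacked data vector have already been assembled; the helper name `gp_posterior` and the toy kernel in the usage lines are our own illustrations.

```python
import numpy as np

def gp_posterior(K, q, k_ss, y):
    """Posterior mean and variance at a test point, following (8):
    mean = q^T K^{-1} y,  var = k(x_*, x_*) - q^T K^{-1} q.
    K: (N, N) training covariance, q: (N,) cross-covariance vector,
    k_ss: scalar prior variance at the test point, y: (N,) observations."""
    L = np.linalg.cholesky(K)
    Kinv_y = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Kinv_q = np.linalg.solve(L.T, np.linalg.solve(L, q))
    return q @ Kinv_y, k_ss - q @ Kinv_q

# toy usage with an assumed squared exponential kernel on scalar inputs
X = np.linspace(0.0, 1.0, 10)
y = np.sin(2 * np.pi * X)
k = lambda a, b: np.exp(-0.5 * 25.0 * (a[:, None] - b[None, :])**2)
K = k(X, X) + 1e-6 * np.eye(10)
xs = np.array([0.37])
mean, var = gp_posterior(K, k(X, xs)[:, 0], 1.0, y)
print(mean, var)
```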

2.4. Propagating Uncertainty

Notice that $\{\mathbf{x}_u^{n-1},\mathbf{u}^{n-1}\}$ ($n>1$) and $\{\mathbf{x}_u^{n-\tau_i},\mathbf{u}^{n-\tau_i}\}$ ($n>\tau_i$, $i=1,\ldots,m$) are artificially generated data, so we cannot use the outcome of the conditional distribution (8) directly in the next time step. We should first propagate the uncertainty and marginalize out $\mathbf{u}^{n-1}$ (and $\mathbf{u}^{n-\tau_i}$ when $n>\tau_i$) by exploiting (when $\tau_q<n-\eta\le\tau_{q+1}$)
$$
u^{n-\eta}\,\Big|\,
\big\{\mathbf{u}_b^{1},\ldots,\mathbf{u}_b^{n-\eta};\;
\mathbf{u}^{1-\tau_1},\ldots,\mathbf{u}^{0};\;\ldots;\;
\mathbf{u}^{1-\tau_q},\ldots,\mathbf{u}^{0};\;
\mathbf{u}^{1-\tau_{q+1}},\ldots,\mathbf{u}^{n-\eta-\tau_{q+1}};\;\ldots;\;
\mathbf{u}^{1-\tau_m},\ldots,\mathbf{u}^{n-\eta-\tau_m};\;
\mathbf{f}^{1},\ldots,\mathbf{f}^{n-\eta};\;
\mathbf{u}^{0}\big\}
\sim\mathcal{N}\big(\boldsymbol{\mu}^{n-\eta},\boldsymbol{\Sigma}^{n-\eta,n-\eta}\big),
\tag{9}
$$
to have (when $\tau_p<n\le\tau_{p+1}$)
$$
u^{n}(\mathbf{x}_*^{n})\,\Big|\,
\big\{\mathbf{u}_b^{1},\ldots,\mathbf{u}_b^{n};\;
\mathbf{u}^{1-\tau_1},\ldots,\mathbf{u}^{0};\;\ldots;\;
\mathbf{u}^{1-\tau_p},\ldots,\mathbf{u}^{0};\;
\mathbf{u}^{1-\tau_{p+1}},\ldots,\mathbf{u}^{n-\tau_{p+1}};\;\ldots;\;
\mathbf{u}^{1-\tau_m},\ldots,\mathbf{u}^{n-\tau_m};\;
\mathbf{f}^{1},\ldots,\mathbf{f}^{n};\;
\mathbf{u}^{0}\big\}
\sim\mathcal{N}\big(\mu^{n}(\mathbf{x}_*^{n}),\Sigma^{n,n}(\mathbf{x}_*^{n},\mathbf{x}_*^{n})\big),
\tag{10}
$$
where $\mu^{n}(\mathbf{x}_*^{n})=\mathbf{q}^{\mathrm T}\mathbf{K}^{-1}\big[\mathbf{u}_b^{n};\,\boldsymbol{\mu}^{n-\tau_1};\ldots;\boldsymbol{\mu}^{n-\tau_p};\,\mathbf{u}^{n-\tau_{p+1}};\ldots;\mathbf{u}^{n-\tau_m};\,\mathbf{f}^{n};\,\boldsymbol{\mu}^{n-1}\big]$, $\Sigma^{n,n}(\mathbf{x}_*^{n},\mathbf{x}_*^{n})=k_{u,u}^{n,n}(\mathbf{x}_*^{n},\mathbf{x}_*^{n})-\mathbf{q}^{\mathrm T}\mathbf{K}^{-1}\mathbf{q}+\mathbf{q}^{\mathrm T}\mathbf{K}^{-1}\mathbf{S}\mathbf{K}^{-1}\mathbf{q}$, and $\mathbf{S}=\mathrm{diag}\big(\mathbf{0},\boldsymbol{\Sigma}^{n-\tau_1,n-\tau_1},\ldots,\boldsymbol{\Sigma}^{n-\tau_p,n-\tau_p},\mathbf{0},\ldots,\mathbf{0},\boldsymbol{\Sigma}^{n-1,n-1}\big)$.
Then, from (10), we obtain both the distributional form $\mathbf{u}^{n}\sim\mathcal{N}(\boldsymbol{\mu}^{n},\boldsymbol{\Sigma}^{n,n})$ and the artificial data $\{\mathbf{x}_u^{n},\mathbf{u}^{n}\}$ of the next time step, where $\boldsymbol{\mu}^{n}=\mu^{n}(\mathbf{x}_u^{n})$ and $\boldsymbol{\Sigma}^{n,n}=\Sigma^{n,n}(\mathbf{x}_u^{n},\mathbf{x}_u^{n})$.
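The corrected predictive moments of (10) amount to one extra quadratic form in the covariance $\mathbf{S}$ of the generated data, as in the following sketch (our own illustration; the function name, arguments and toy numbers are assumptions):

```python
import numpy as np

def propagate_uncertainty(K, q, k_ss, y_mean, S):
    """Predictive mean and variance with uncertainty propagated through the
    artificially generated data, following (10):
      mu    = q^T K^{-1} y_mean
      Sigma = k_** - q^T K^{-1} q + q^T K^{-1} S K^{-1} q,
    where y_mean stacks exact data and posterior means, and S is the
    block-diagonal covariance of the generated entries of y (zeros for exact
    data such as boundary and inhomogeneous-term observations)."""
    Kinv_q = np.linalg.solve(K, q)
    mu = Kinv_q @ y_mean
    sigma = k_ss - q @ Kinv_q + Kinv_q @ S @ Kinv_q
    return mu, sigma

# toy usage: a 3-point "data" vector whose last entry is generated with variance 0.01
K = np.array([[1.0, 0.2, 0.1], [0.2, 1.0, 0.3], [0.1, 0.3, 1.0]])
q = np.array([0.5, 0.4, 0.3])
S = np.diag([0.0, 0.0, 0.01])          # only the generated entry is uncertain
print(propagate_uncertainty(K, q, 1.0, np.array([1.0, 0.8, 0.9]), S))
```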

2.5. Basic Workflow

Summing up the above introduction, we have the following algorithm workflow:
  • Firstly, we obtain the initial data $\{\mathbf{x}_u^{0},\mathbf{u}^{0}\}$, the delay term data $\{\mathbf{x}_u^{1-\tau_i},\mathbf{u}^{1-\tau_i}\}$ ($i=1,\ldots,m$) and the inhomogeneous term data $\{\mathbf{x}_f^{1},\mathbf{f}^{1}\}$ by randomly selecting points in the spatial domain $\Omega$, and obtain the boundary data $\{\mathbf{x}_b^{1},\mathbf{u}_b^{1}\}$ by randomly selecting points on $\partial\Omega$. Then we train the hyper-parameters $\{\theta_j\,|\,j=0,1,\ldots,m+1\}$ by minimizing (7) with the obtained data.
  • With the hyper-parameters obtained in step 1, (8) is used to predict the solution for the first time step and obtain the artificially generated data $\{\mathbf{x}_u^{1},\mathbf{u}^{1}\}$, where the points $\mathbf{x}_u^{1}$ are randomly sampled from $\Omega$ and $\mathbf{u}^{1}$ is the corresponding random vector. We obtain the delay term data $\{\mathbf{x}_u^{2-\tau_i},\mathbf{u}^{2-\tau_i}\}$ ($i=1,\ldots,m$) and the inhomogeneous term data $\{\mathbf{x}_f^{2},\mathbf{f}^{2}\}$ by randomly selecting points in $\Omega$, and the boundary data $\{\mathbf{x}_b^{2},\mathbf{u}_b^{2}\}$ by randomly selecting points on $\partial\Omega$.
  • From $\{\mathbf{x}_u^{1},\mathbf{u}^{1}\}$, $\{\mathbf{x}_u^{2-\tau_i},\mathbf{u}^{2-\tau_i}\}$ ($i=1,\ldots,m$), $\{\mathbf{x}_f^{2},\mathbf{f}^{2}\}$ and $\{\mathbf{x}_b^{2},\mathbf{u}_b^{2}\}$, we train the hyper-parameters $\{\theta_j\,|\,j=0,1,\ldots,m+1\}$ for the second time step.
  • With the hyper-parameters obtained in step 3, (10) is used to predict the solution for the second time step and obtain the artificially generated data $\{\mathbf{x}_u^{2},\mathbf{u}^{2}\}$, where the points $\mathbf{x}_u^{2}$ are randomly sampled from $\Omega$. We also obtain the delay term data $\{\mathbf{x}_u^{3-\tau_i},\mathbf{u}^{3-\tau_i}\}$ ($i=1,\ldots,m$) and the inhomogeneous term data $\{\mathbf{x}_f^{3},\mathbf{f}^{3}\}$ by randomly selecting points in $\Omega$, and the boundary data $\{\mathbf{x}_b^{3},\mathbf{u}_b^{3}\}$ by randomly selecting points on $\partial\Omega$.
  • Steps 3 and 4 are repeated until the last time step.
In essence, Gaussian process regression is exploited to predict the solution of the next time step at each iteration. Figure 1 shows the basic idea of MP-NGPs. At each iteration, the optimized hyper-parameters of the current time step are saved and used as the initial values of the hyper-parameters in the next time step, which can significantly speed up the training process.
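The workflow can be summarized by the following schematic Python loop (a sketch of the steps above, not the authors' implementation); `train_step` and `predict_step` are stubs standing in for the NLML training and posterior prediction of Sections 2.2–2.4, and all data in the example are placeholders.

```python
import numpy as np

def train_step(data, theta_init):
    # placeholder for minimizing the NLML (7); returns "optimized" hyper-parameters
    return theta_init

def predict_step(data, theta, x_new):
    # placeholder for the posterior (10); returns artificial data (mean, covariance)
    return np.zeros(len(x_new)), np.eye(len(x_new))

T, dt, n_pts = 1.0, 0.01, 20
theta = np.ones(3)                              # initial hyper-parameters
x_u = np.random.rand(n_pts)                     # artificial data locations
u_prev = np.exp(0.5 * x_u)                      # initial condition u^0 (example)

for n in range(1, int(T / dt) + 1):
    data = {
        "boundary": np.random.rand(2),          # boundary data at t_n
        "delay": np.random.rand(n_pts),         # delay-term data at t_n - s_i
        "forcing": np.random.rand(n_pts),       # inhomogeneous-term data at t_n
        "previous": u_prev,                     # u^{n-1} (exact at n = 1, artificial later)
    }
    theta = train_step(data, theta)             # warm-started from the previous step
    u_prev, cov = predict_step(data, theta, x_u)
```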

2.6. Processing of Fractional-Order Operators

For simplicity, we consider the following delay problem to show the processing of the fractional-order operators,
$$
\frac{\partial u}{\partial t}=g(u,x)\Big(\frac{\partial^{\alpha}}{\partial x^{\alpha}}u+\frac{\partial^{\beta}}{\partial x^{\beta}}u\Big)+h(u,x)\frac{\partial^{2}}{\partial x^{2}}u(x,t-s)+f,
\tag{11}
$$
where the continuous real function $u(x,t)$ is absolutely integrable on $\mathbb{R}^{2}$, and its partial derivatives of arbitrary order are assumed to be continuous and absolutely integrable on $\mathbb{R}^{2}$.
Applying the backward Euler scheme to (11), with $g(u^{n},x)$ and $h(u^{n},x)$ replaced by $g(\mu^{n-1},x)$ and $h(\mu^{n-1},x)$, respectively, gives
$$
u^{n-1}=u^{n}-\Delta t\,g(\mu^{n-1},x)\frac{\partial^{\alpha}}{\partial x^{\alpha}}u^{n}-\Delta t\,g(\mu^{n-1},x)\frac{\partial^{\beta}}{\partial x^{\beta}}u^{n}-\Delta t\,h(\mu^{n-1},x)\frac{\partial^{2}}{\partial x^{2}}u^{n-\tau}-\Delta t\,f^{n}.
\tag{12}
$$
From (12), we can obtain
$$
u^{n-1}=\mathcal{L}_{x}^{0}u^{n}+\mathcal{L}_{x}^{1}u^{n-\tau}+\mathcal{L}_{x}^{2}f^{n},
\tag{13}
$$
where $\mathcal{L}_{x}^{0}u^{n}=u^{n}-\Delta t\,g(\mu^{n-1},x)\frac{\partial^{\alpha}}{\partial x^{\alpha}}u^{n}-\Delta t\,g(\mu^{n-1},x)\frac{\partial^{\beta}}{\partial x^{\beta}}u^{n}$, $\mathcal{L}_{x}^{1}u^{n-\tau}=-\Delta t\,h(\mu^{n-1},x)\frac{\partial^{2}}{\partial x^{2}}u^{n-\tau}$ and $\mathcal{L}_{x}^{2}f^{n}=-\Delta t\,f^{n}$. Assume the Gaussian process priors
$$
u^{n}(x)\sim\mathcal{GP}\big(0,k_{u,u}^{n,n}(x,x';\theta_0)\big),\quad
u^{n-\tau}(x)\sim\mathcal{GP}\big(0,k_{u,u}^{n-\tau,n-\tau}(x,x';\theta_1)\big),\quad
f^{n}(x)\sim\mathcal{GP}\big(0,k_{f,f}^{n,n}(x,x';\theta_2)\big).
\tag{14}
$$
We can easily obtain the following results from Section 2.2,
$$
k_{u,u}^{n-1,n-1}(x,x')=\mathcal{L}_{x}^{0}\mathcal{L}_{x'}^{0}k_{u,u}^{n,n}(x,x')+\mathcal{L}_{x}^{1}\mathcal{L}_{x'}^{1}k_{u,u}^{n-\tau,n-\tau}(x,x')+\mathcal{L}_{x}^{2}\mathcal{L}_{x'}^{2}k_{f,f}^{n,n}(x,x').
\tag{15}
$$
How to deal with the first term on the right-hand side of (15) is one of the most important tasks. Denote $k_{i}^{n}(x,x')=\mathcal{L}_{x}^{0}\mathcal{L}_{x'}^{0}k_{u,u}^{n,n}(x,x')$. Following [40], the Fourier transform with respect to the spatial variables $(x,x')$ is applied to $k_{u,u}^{n,n}(x,x')$ to obtain $\hat{k}_{u,u}^{n,n}(\omega,\omega')$; the excellent properties of the squared exponential covariance function $k_{u,u}^{n,n}(x,x')$ guarantee the validity of the Fourier transforms of its derivatives, since $k_{u,u}^{n,n}(x,x')$ and its partial derivatives vanish as $x^{2}+x'^{2}\to+\infty$. Then we can obtain the following intermediate kernels:
$$
\begin{aligned}
\hat{k}_{u,u}^{n,n-1}(\omega,\omega')&=\big[1-\Delta t\,g'\,(\mathrm{i}\omega')^{\alpha}-\Delta t\,g'\,(\mathrm{i}\omega')^{\beta}\big]\hat{k}_{u,u}^{n,n}(\omega,\omega'),\\
\hat{k}_{i}^{n}(\omega,\omega')&=\big[1-\Delta t\,g\,(\mathrm{i}\omega)^{\alpha}-\Delta t\,g\,(\mathrm{i}\omega)^{\beta}\big]\big[1-\Delta t\,g'\,(\mathrm{i}\omega')^{\alpha}-\Delta t\,g'\,(\mathrm{i}\omega')^{\beta}\big]\hat{k}_{u,u}^{n,n}(\omega,\omega'),
\end{aligned}
\tag{16}
$$
where $g=g(\mu^{n-1}(x),x)$ and $g'=g(\mu^{n-1}(x'),x')$.
Finally, the inverse Fourier transform is performed on $\hat{k}_{u,u}^{n,n-1}(\omega,\omega')$ and $\hat{k}_{i}^{n}(\omega,\omega')$ to obtain the kernels $k_{u,u}^{n,n-1}(x,x')$ and $k_{i}^{n}(x,x')$.
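As a rough numerical illustration of this Fourier treatment (our own sketch, not the authors' code), the snippet below applies the multiplier $(\mathrm{i}\omega)^{\alpha}$ to a one-dimensional squared exponential kernel slice with NumPy's FFT; the grid, hyper-parameters and fractional order are assumed values, and the discrete transform only approximates the continuous one used above.

```python
import numpy as np

# Apply the Fourier multiplier (i*omega)^alpha to a 1-D squared exponential
# kernel slice k(r) = sigma^2 * exp(-w/2 * r^2), mimicking the spectral
# treatment of the Caputo-type operator described above.
alpha = 0.5            # fractional order (assumed value for the demo)
sigma2, w = 1.0, 4.0   # kernel hyper-parameters (assumed values)

N, L = 2048, 40.0                                # grid size and domain half-width
x = np.linspace(-L, L, N, endpoint=False)
dx = x[1] - x[0]
k = sigma2 * np.exp(-0.5 * w * x**2)

omega = 2.0 * np.pi * np.fft.fftfreq(N, d=dx)    # angular frequencies
multiplier = (1j * omega) ** alpha               # principal branch of (i*omega)^alpha
frac_k = np.real(np.fft.ifft(multiplier * np.fft.fft(k)))

print(frac_k[N // 2 - 1 : N // 2 + 2])           # fractional derivative of the kernel slice
```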

2.7. Numerical Tests

To verify the validity of the proposed method, three examples are provided. Firstly, we introduce the maximum absolute error $E$ to measure the difference between the conditional posterior mean of the prediction solution and the exact solution:
$$
E(t_n)=\max_{\mathbf{x}_*}\big|u^{n}(\mathbf{x}_*)-\bar{u}^{n}(\mathbf{x}_*)\big|,
\tag{17}
$$
where $t_n$ is the current time, $\mathbf{x}_*$ is a point randomly sampled from $\Omega$, $u^{n}(\mathbf{x}_*)$ is the exact solution, and $\bar{u}^{n}(\mathbf{x}_*)$ is the posterior mean of $u^{n}(\mathbf{x}_*)$ (the prediction solution).
We also introduce the notation $\max E(\Delta t)=\max_{1\le n\le N}E(t_n)$, which can be seen as a measure of the prediction error under a given time step size $\Delta t$.
Then we introduce the relative spatial error $L_2$ to represent the relative error of the prediction [29],
$$
L_2(t_n)=\sqrt{\frac{\sum_{\mathbf{x}_*}\big[u^{n}(\mathbf{x}_*)-\bar{u}^{n}(\mathbf{x}_*)\big]^{2}}{\sum_{\mathbf{x}_*}\big[u^{n}(\mathbf{x}_*)\big]^{2}}}.
\tag{18}
$$
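Both error measures are straightforward to evaluate once the exact and predicted values at the sampled points are available, as in the following sketch (our own; the arrays in the usage lines are synthetic, not taken from the experiments):

```python
import numpy as np

def max_abs_error(u_exact, u_pred):
    """Maximum absolute error E(t_n) of (17) over the sampled points."""
    return np.max(np.abs(u_exact - u_pred))

def relative_l2_error(u_exact, u_pred):
    """Relative spatial error L2(t_n) of (18)."""
    return np.sqrt(np.sum((u_exact - u_pred)**2) / np.sum(u_exact**2))

# usage on assumed arrays of exact and predicted values at random points
u_exact = np.exp(0.5 * np.random.rand(50))
u_pred = u_exact + 1e-3 * np.random.randn(50)
print(max_abs_error(u_exact, u_pred), relative_l2_error(u_exact, u_pred))
```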
Example 1.
$$
\begin{cases}
\dfrac{\partial u}{\partial t}=u\sin u\cdot\Big[(1-x)\dfrac{\partial^{\alpha}}{\partial x^{\alpha}}u(x,t)+\sin(2\pi x)\dfrac{\partial^{\beta}}{\partial x^{\beta}}u(x,t)\Big]+u\,x\,\dfrac{\partial^{2}}{\partial x^{2}}u(x,t-0.2)+f(x,t), & (x,t)\in(0,1)\times(0,1],\\[1mm]
u(0,t)=1+t^{2},\quad u(1,t)=e^{0.5}(1+t^{2}), & t\in(0,1],\\[1mm]
u(x,t)=e^{0.5x}(1+t^{2}), & -0.2\le t\le 0,
\end{cases}
\tag{19}
$$
where $0<\alpha,\beta<1$. Example 1 is a fractional partial differential equation with one delay. The fractional partial derivatives $\frac{\partial^{\alpha}}{\partial x^{\alpha}}u(x,t)$ and $\frac{\partial^{\beta}}{\partial x^{\beta}}u(x,t)$ are defined by (2) in the Caputo sense [40].
We fix $(\alpha,\beta)$ to be $(0.25,0.75)$, $(0.5,0.5)$ and $(0.75,0.25)$ in turn. The exact solution of (19) is $u(x,t)=e^{0.5x}(1+t^{2})$, and the inhomogeneous term $f(x,t)$ differs as $(\alpha,\beta)$ changes,
$$
f(x,t)=
\begin{cases}
q(x,t)-u\sin u\,(1-x)\,\dfrac{1+t^{2}}{2\,\Gamma(0.75)}\displaystyle\int_{-\infty}^{x}\dfrac{e^{\upsilon/2}}{(x-\upsilon)^{0.25}}\,\mathrm{d}\upsilon-u\sin u\,\sin(2\pi x)\,\dfrac{1+t^{2}}{2\,\Gamma(0.25)}\displaystyle\int_{-\infty}^{x}\dfrac{e^{\upsilon/2}}{(x-\upsilon)^{0.75}}\,\mathrm{d}\upsilon, & (\alpha,\beta)=(0.25,0.75),\\[3mm]
q(x,t)-u\sin u\,(1-x)\,\dfrac{1+t^{2}}{2\,\Gamma(0.5)}\displaystyle\int_{-\infty}^{x}\dfrac{e^{\upsilon/2}}{(x-\upsilon)^{0.5}}\,\mathrm{d}\upsilon-u\sin u\,\sin(2\pi x)\,\dfrac{1+t^{2}}{2\,\Gamma(0.5)}\displaystyle\int_{-\infty}^{x}\dfrac{e^{\upsilon/2}}{(x-\upsilon)^{0.5}}\,\mathrm{d}\upsilon, & (\alpha,\beta)=(0.5,0.5),\\[3mm]
q(x,t)-u\sin u\,(1-x)\,\dfrac{1+t^{2}}{2\,\Gamma(0.25)}\displaystyle\int_{-\infty}^{x}\dfrac{e^{\upsilon/2}}{(x-\upsilon)^{0.75}}\,\mathrm{d}\upsilon-u\sin u\,\sin(2\pi x)\,\dfrac{1+t^{2}}{2\,\Gamma(0.75)}\displaystyle\int_{-\infty}^{x}\dfrac{e^{\upsilon/2}}{(x-\upsilon)^{0.25}}\,\mathrm{d}\upsilon, & (\alpha,\beta)=(0.75,0.25),
\end{cases}
\tag{20}
$$
where $q(x,t)=2e^{0.5x}t-0.25\,x\,e^{x}\big[1+(t-0.2)^{2}\big](1+t^{2})$.
Numerical tests of solving (19) are performed with noiseless data and noisy data, respectively, where the backward Euler scheme is employed. We only add noise to the initial data $\mathbf{u}^{0}$; the additive noise on $\mathbf{u}^{0}$ is $\boldsymbol{\varepsilon}_{u}^{0}\sim\mathcal{N}(\mathbf{0},0.1^{2}\mathbf{I})$. Figure 2, Figure 3 and Figure 4 show the posterior distribution of the predicted solution at some time nodes, where we fix the number of initial data points (and of inhomogeneous term data points, artificially generated data points and delay term data points) to be 20 and $\Delta t$ to be 0.01, and take $(\alpha,\beta)$ to be $(0.25,0.75)$, $(0.5,0.5)$ and $(0.75,0.25)$ in turn. In each of these figures, (a) presents the results of the experiments with noiseless data and (b) presents the results with noisy data. Moreover, Table 1 shows the maximum absolute error $E$ between the conditional posterior mean of the prediction solution and the exact solution at some time nodes in detail, where noiseless data are used in the experiments, and Table 2 shows the maximum absolute error $E$ at some time nodes where noisy data are used. It is found that the method works well when solving nonlinear delay fractional equations, and, as expected, higher prediction accuracy is obtained when noiseless data are used.
Example 2.
$$
\begin{cases}
\dfrac{\partial u}{\partial t}=(u^{2}+1)\displaystyle\int_{0}^{x}u(\upsilon,t)\,\mathrm{d}\upsilon+\dfrac{\partial^{2}u(x,t)}{\partial x^{2}}+(u+1)\dfrac{\partial^{2}u(x,t-0.2)}{\partial x^{2}}+f(x,t), & (x,t)\in(0,1)\times(0,1],\\[1mm]
u(0,t)=2(1+t^{3}),\quad u(1,t)=1+t^{3}, & t\in(0,1],\\[1mm]
u(x,t)=(2-x^{2})(1+t^{3}), & -0.2\le t\le 0.
\end{cases}
\tag{21}
$$
Example 2 is a partial integro-differential equation with one delay. The exact solution of (21) is $u(x,t)=(2-x^{2})(1+t^{3})$, and the inhomogeneous term is $f(x,t)=3t^{2}(2-x^{2})+2(1+t^{3})+2\big[1+(t-0.2)^{3}\big]\big[1+(1+t^{3})(2-x^{2})\big]-\frac{1}{3}(1+t^{3})(6x-x^{3})\big[5-4x^{2}+x^{4}+2t^{3}(2-x^{2})^{2}+t^{6}(2-x^{2})^{2}\big]$.
Numerical tests of solving (21) are performed with noiseless data and noisy data, respectively, where the backward Euler scheme is employed. We add noise to the initial data $\mathbf{u}^{0}$ and the inhomogeneous term data $\mathbf{f}^{n}$ ($n=1,2,\ldots,100$); the additive noise on $\mathbf{u}^{0}$ and $\mathbf{f}^{n}$ is $\boldsymbol{\varepsilon}_{u}^{0}\sim\mathcal{N}(\mathbf{0},2^{2}\mathbf{I})$ and $\boldsymbol{\varepsilon}_{f}^{n}\sim\mathcal{N}(\mathbf{0},4^{2}\mathbf{I})$, respectively. Figure 5 shows the posterior distribution of the predicted solution at different time nodes, where we fix the number of initial data points (and of inhomogeneous term data points, artificially generated data points and delay term data points) to be 10 and $\Delta t$ to be 0.01. In Figure 5, the five pictures on the upper side present the results of the experiments with noiseless data, and the five pictures on the lower side present the results with noisy data. Moreover, Table 3 shows the maximum absolute error $E$ at some time nodes in detail, where noiseless data and noisy data are used in the experiments, respectively. It is found that the method works well when solving nonlinear delay integro-differential equations. It is worth noting that the computational accuracy of the prediction solution is not sensitive to noise, even though the level of noise added to the data is not low.
Example 3.
$$
\begin{cases}
\dfrac{\partial u}{\partial t}=(u\sin u+x_1x_2x_3)\displaystyle\sum_{i=1}^{3}\dfrac{\partial^{2}u}{\partial x_i^{2}}+(u^{2}+x_1^{2})\dfrac{\partial^{2}}{\partial x_1\partial x_2}u(\mathbf{x},t-0.1)+(u\,x_2x_3-1)\dfrac{\partial^{2}}{\partial x_2\partial x_3}u(\mathbf{x},t-0.2)+f(\mathbf{x},t), & (x_1,x_2,x_3,t)\in(0,2\pi)^{3}\times(0,1],\\[1mm]
u(\mathbf{x},t)\big|_{\mathbf{x}\in\partial(0,2\pi)^{3}}=\Big(\dfrac12+\dfrac{\sin t}{10}\Big)\displaystyle\prod_{i=1}^{3}(\cos x_i+\sin x_i), & t\in(0,1],\\[1mm]
u(\mathbf{x},t)=\Big(\dfrac12+\dfrac{\sin t}{10}\Big)\displaystyle\prod_{i=1}^{3}(\cos x_i+\sin x_i), & t\le 0.
\end{cases}
\tag{22}
$$
Example 3 defines a problem with two delays and three spatial dimensions. The exact solution of (22) is $u(x_1,x_2,x_3,t)=\big(\tfrac12+\tfrac{\sin t}{10}\big)\prod_{i=1}^{3}(\cos x_i+\sin x_i)$, and the inhomogeneous term is
f ( x 1 , x 2 , x 3 , t ) = 1 10 cos t ( cos x 1 + sin x 1 ) ( cos x 2 + sin x 2 ) ( cos x 3 + sin x 3 ) 1 10 ( 5 sin ( 0.2 t ) ) ( cos x 1 + sin x 1 ) ( cos x 2 sin x 2 ) ( cos x 3 sin x 3 ) ( 1 + 1 10 x 2 x 3 ( 5 + sin [ t ] ) ( cos x 1 + sin x 1 ) ( cos x 2 + sin x 2 ) ( cos x 3 + sin x 3 ) ) 1 10 ( 5 sin ( 0.1 t ) ) ( cos x 1 sin x 1 ) ( cos x 2 sin x 2 ) ( cos x 3 + sin x 3 ) ( x 1 2 + 1 100 ( 5 + sin t ) 2 ( cos x 1 + sin x 1 ) 2 ( cos x 2 + cos x 2 ) 2 ( cos x 3 + sin x 3 ) 2 ) ( 1 10 ( 5 + sin t ) ( cos x 1 + sin x 1 ) ( cos x 2 + sin x 2 ) ( cos x 3 sin x 3 ) + 1 10 ( 5 + sin t ) ( cos x 1 + sin x 1 ) ( cos x 2 sin x 2 ) ( cos x 3 + sin x 3 ) + 1 10 ( 5 + sin t ) ( cos x 1 sin x 1 ) ( cos x 2 + sin x 2 ) ( cos x 3 + sin x 3 ) ) ( x 1 x 2 x 3 + 1 10 ( 5 + sin t ) ( cos x 1 + sin x 1 ) ( cos x 2 + sin x 2 ) ( cos x 3 + sin x 3 ) sin ( 1 10 ( 5 + sin t ) ( cos x 1 + sin x 1 ) ( cos x 2 + sin x 2 ) ( cos x 3 + sin x 3 ) ) ) .
Numerical tests applying the backward Euler scheme to solve (22) are performed with noiseless data. Figure 6 shows the relative spatial error $L_2$ and the maximum absolute error $E$ as time increases, where we fix the number of initial data points (and of inhomogeneous term data points, artificially generated data points and delay term data points) to be 20 and $\Delta t$ to be 0.01. The number of training data points on each boundary face is fixed to be 5. Furthermore, Table 4 shows the relative spatial error $L_2$ and the maximum absolute error $E$ at some time nodes in detail. According to the experimental results, we conclude that the method remains effective when solving nonlinear time-dependent delay equations with high dimensions and multiple delays.

3. Runge–Kutta Methods

For the purpose of improving the accuracy of the methodology considered in Section 2, we will consider Runge–Kutta methods [47,48] based on the frame of MP-NGPs.

3.1. MP-NGPs Based on Runge–Kutta Methods

Applying a Runge–Kutta method with $q$ stages to the first equation of (1), with an appropriate approximation of the nonlinear terms, we obtain [47]
$$
\begin{aligned}
\frac{u^{n+1}-u^{n}}{\Delta t}&=\sum_{j=1}^{q}b_j\Big[g(\mu^{n},\mathbf{x})\mathcal{L}_{\mathbf{x}}u^{n+\varsigma_j}+\sum_{k=1}^{m}h_k(\mu^{n},\mathbf{x})\mathcal{L}_{\mathbf{x}}^{s_k}u^{n+\varsigma_j-\tau_k}+f^{n+\varsigma_j}\Big],\\
\frac{u^{n+\varsigma_i}-u^{n}}{\Delta t}&=\sum_{j=1}^{q}a_{ij}\Big[g(\mu^{n},\mathbf{x})\mathcal{L}_{\mathbf{x}}u^{n+\varsigma_j}+\sum_{k=1}^{m}h_k(\mu^{n},\mathbf{x})\mathcal{L}_{\mathbf{x}}^{s_k}u^{n+\varsigma_j-\tau_k}+f^{n+\varsigma_j}\Big],\quad i=1,2,\ldots,q.
\end{aligned}
\tag{24}
$$
Rearranging (24), we have
$$
\begin{aligned}
u_{q+1}^{n}:=u^{n}&=u^{n+1}-\Delta t\sum_{j=1}^{q}b_j\Big[g(\mu^{n},\mathbf{x})\mathcal{L}_{\mathbf{x}}u^{n+\varsigma_j}+\sum_{k=1}^{m}h_k(\mu^{n},\mathbf{x})\mathcal{L}_{\mathbf{x}}^{s_k}u^{n+\varsigma_j-\tau_k}+f^{n+\varsigma_j}\Big],\\
u_{i}^{n}:=u^{n}&=u^{n+\varsigma_i}-\Delta t\sum_{j=1}^{q}a_{ij}\Big[g(\mu^{n},\mathbf{x})\mathcal{L}_{\mathbf{x}}u^{n+\varsigma_j}+\sum_{k=1}^{m}h_k(\mu^{n},\mathbf{x})\mathcal{L}_{\mathbf{x}}^{s_k}u^{n+\varsigma_j-\tau_k}+f^{n+\varsigma_j}\Big],\quad i=1,2,\ldots,q,
\end{aligned}
\tag{25}
$$
where $u^{n+\varsigma_j}=u(\mathbf{x},t_n+\varsigma_j\Delta t)$, $u^{n+\varsigma_j-\tau_k}=u(\mathbf{x},t_n+\varsigma_j\Delta t-s_k)$ and $f^{n+\varsigma_j}=f(\mathbf{x},t_n+\varsigma_j\Delta t)$. Prior assumptions are set on $u^{n+1}$, $u^{n+\varsigma_i}$, $f^{n+\varsigma_i}$ and $u^{n+\varsigma_i-\tau_k}$ ($i=1,2,\ldots,q$; $k=1,2,\ldots,m$), and Equations (25) turn into a regression problem.
Then directly write the joint distribution of u n + 1 , u n + ς i , f n + ς i , u n + ς i τ k , u q + 1 n , u i n ( i = 1 , 2 , , q ; k = 1 , 2 , , m ) . For simplicity, we introduce the Runge–Kutta methods by solving the following problem:
$$
\frac{\partial u}{\partial t}=g(u,x)\mathcal{L}_{x}u+h(u,x)\mathcal{L}_{x}^{s}u(x,t-s)+f.
\tag{26}
$$
The trapezoidal rule is applied, within the structure of the Runge–Kutta methods, to (26), with $g(u^{n+1},x)$, $h(u^{n+1},x)$, $g(u^{n},x)$ and $h(u^{n},x)$ all replaced by $g(\mu^{n},x)$ and $h(\mu^{n},x)$. Then we have
$$
\frac{u^{n+1}-u^{n}}{\Delta t}=\frac12\Big[g(\mu^{n},x)\mathcal{L}_{x}u^{n+1}+h(\mu^{n},x)\mathcal{L}_{x}^{s}u^{n+1-\tau}+f^{n+1}\Big]+\frac12\Big[g(\mu^{n},x)\mathcal{L}_{x}u^{n}+h(\mu^{n},x)\mathcal{L}_{x}^{s}u^{n-\tau}+f^{n}\Big].
\tag{27}
$$
Rearranging (27), we have
$$
u_{1}^{n}:=u^{n},\qquad
u_{2}^{n}:=u_{3}^{n}:=u^{n+1}-\frac{\Delta t}{2}\Big[g(\mu^{n},x)\mathcal{L}_{x}u^{n+1}+h(\mu^{n},x)\mathcal{L}_{x}^{s}u^{n+1-\tau}+f^{n+1}\Big]-\frac{\Delta t}{2}\Big[g(\mu^{n},x)\mathcal{L}_{x}u^{n}+h(\mu^{n},x)\mathcal{L}_{x}^{s}u^{n-\tau}+f^{n}\Big].
\tag{28}
$$
Indeed, Equation (28) coincides with the structure of the Runge–Kutta methods when $q=2$, $\varsigma_1=0$, $\varsigma_2=1$, $b_1=b_2=1/2$, $a_{11}=a_{12}=0$ and $a_{21}=a_{22}=1/2$. Rewriting (28) in operator form, we obtain
$$
u_{1}^{n}:=u^{n},\qquad
u_{2}^{n}:=u_{3}^{n}:=\mathcal{L}_{x}^{0}u^{n+1}+\mathcal{L}_{x}^{1}u^{n+1-\tau}+\mathcal{L}_{x}^{2}f^{n+1}+\mathcal{L}_{x}^{3}u^{n}+\mathcal{L}_{x}^{1}u^{n-\tau}+\mathcal{L}_{x}^{2}f^{n},
\tag{29}
$$
where $\mathcal{L}_{x}^{0}$, $\mathcal{L}_{x}^{1}$, $\mathcal{L}_{x}^{2}$ and $\mathcal{L}_{x}^{3}$ are the linear operators $\mathcal{L}_{x}^{0}u=u-\frac{\Delta t}{2}g(\mu^{n},x)\mathcal{L}_{x}u$, $\mathcal{L}_{x}^{1}u=-\frac{\Delta t}{2}h(\mu^{n},x)\mathcal{L}_{x}^{s}u$, $\mathcal{L}_{x}^{2}f=-\frac{\Delta t}{2}f$ and $\mathcal{L}_{x}^{3}u=-\frac{\Delta t}{2}g(\mu^{n},x)\mathcal{L}_{x}u$.
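To see why the trapezoidal stage structure is expected to help, the following self-contained sketch (our own illustration, unrelated to the GP machinery) compares the backward Euler and trapezoidal updates on the scalar test equation $u'=\lambda u$: the former converges at first order and the latter at second order in $\Delta t$.

```python
import numpy as np

# Compare backward Euler and trapezoidal updates on u' = lam * u, u(0) = 1,
# whose exact solution is exp(lam * t); the trapezoidal rule is the q = 2
# Runge-Kutta scheme singled out above.
lam, T = -2.0, 1.0
for dt in (0.1, 0.05, 0.025):
    n_steps = int(round(T / dt))
    u_be, u_tr = 1.0, 1.0
    for _ in range(n_steps):
        u_be = u_be / (1.0 - lam * dt)                                   # backward Euler
        u_tr = u_tr * (1.0 + 0.5 * lam * dt) / (1.0 - 0.5 * lam * dt)    # trapezoidal
    exact = np.exp(lam * T)
    print(dt, abs(u_be - exact), abs(u_tr - exact))
# the backward Euler error shrinks roughly linearly in dt,
# the trapezoidal error roughly quadratically
```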
Assume the following prior hypotheses:
$$
\begin{aligned}
u^{n}(x)&\sim\mathcal{GP}\big(0,k_{u,u}^{n,n}(x,x';\theta_1)\big), & u^{n+1}(x)&\sim\mathcal{GP}\big(0,k_{u,u}^{n+1,n+1}(x,x';\theta_2)\big),\\
u^{n-\tau}(x)&\sim\mathcal{GP}\big(0,k_{u,u}^{n-\tau,n-\tau}(x,x';\theta_3)\big), & u^{n+1-\tau}(x)&\sim\mathcal{GP}\big(0,k_{u,u}^{n+1-\tau,n+1-\tau}(x,x';\theta_4)\big),\\
f^{n}(x)&\sim\mathcal{GP}\big(0,k_{f,f}^{n,n}(x,x';\theta_5)\big), & f^{n+1}(x)&\sim\mathcal{GP}\big(0,k_{f,f}^{n+1,n+1}(x,x';\theta_6)\big),
\end{aligned}
\tag{30}
$$
to be mutually independent Gaussian processes, and assume covariance functions to be the following exponential form of (6). By the properties of the Gaussian processes, we have
y G P ( 0 , K ) ,
where
K = k b , b n + 1 , n + 1 k b , u n + 1 , n + 1 τ k b , f n + 1 , n + 1 k b , b n + 1 , n k b , u n + 1 , n τ k b , f n + 1 , n k b , 1 n + 1 , n k b , 2 n + 1 , n k u , u n + 1 τ , n + 1 τ k u , f n + 1 τ , n + 1 k u , b n + 1 τ , n k u , u n + 1 τ , n τ k u , f n + 1 τ , n k u , 1 n + 1 τ , n k u , 2 n + 1 τ , n k f , f n + 1 , n + 1 k f , b n + 1 , n k f , u n + 1 , n τ k f , f n + 1 , n k f , 1 n + 1 , n k f , 2 n + 1 , n k b , b n , n k b , u n , n τ k b , f n , n k b , 1 n , n k b , 2 n , n k u , u n τ , n τ k u , f n τ , n k u , 1 n τ , n k u , 2 n τ , n k f , f n , n k f , 1 n , n k f , 2 n , n k 1 , 1 n , n k 1 , 2 n , n k 2 , 2 n , n ,
y = u b n + 1 u n + 1 τ f n + 1 u b n u n τ f n u 1 n u 2 n , k b , b n + 1 , n + 1 = k u , u n + 1 , n + 1 ( x b n + 1 , x b n + 1 ) + σ u n + 1 2 I , k b , b n , n = k u , u n , n ( x b n , x b n ) + σ u n 2 I , k u , u n + 1 τ , n + 1 τ = k u , u n + 1 τ , n + 1 τ ( x u n + 1 τ , x u n + 1 τ ) + σ u n + 1 τ 2 I , k u , u n τ , n τ = k u , u n τ , n τ ( x u n τ , x u n τ ) + σ u n τ 2 I , k f , f n + 1 , n + 1 = k f , f n + 1 , n + 1 ( x f n + 1 , x f n + 1 ) + σ f n + 1 2 I , k f , f n , n = k f , f n , n ( x f n , x f n ) + σ f n 2 I , k b , 1 n , n = k u , u n , n ( x b n , x u n ) , k b , 2 n + 1 , n = L x 0 k u , u n + 1 , n + 1 ( x b n + 1 , x u n ) , k u , 2 n + 1 τ , n = L x 1 k u , u n + 1 τ , n + 1 τ ( x u n + 1 τ , x u n ) , k f , 2 n + 1 , n = L x 2 k f , f n + 1 , n + 1 ( x f n + 1 , x u n ) , k b , 2 n , n = L x 3 k u , u n , n ( x b n , x u n ) , k u , 2 n τ , n = L x 1 k u , u n τ , n τ ( x u n τ , x u n ) , k f , 2 n , n = L x 2 k f , f n , n ( x f n , x u n ) , k 1 , 1 n , n = k u , u n , n ( x u n , x u n ) + σ u n 2 I , k 1 , 2 n , n = L x 3 k u , u n , n ( x u n , x u n ) , k 2 , 2 n , n = L x 0 L x 0 k u , u n + 1 , n + 1 ( x u n , x u n ) + L x 1 L x 1 k u , u n + 1 τ , n + 1 τ ( x u n , x u n ) + L x 2 L x 2 k f , f n + 1 , n + 1 ( x u n , x u n ) + L x 3 L x 3 k u , u n , n ( x u n , x u n ) + L x 1 L x 1 k u , u n τ , n τ ( x u n , x u n ) + L x 2 L x 2 k f , f n , n ( x u n , x u n ) + σ u n 2 I .
Other kernels are assumed to be 0 .
We can write the posterior distribution of the prediction u n + 1 ( x * n + 1 ) as follows:
u n + 1 ( x * n + 1 ) u b n + 1 u n + 1 τ f n + 1 u b n u n τ f n u n u n N q T K 1 u b n + 1 u n + 1 τ f n + 1 u b n u n τ f n u n u n , k u , u n + 1 , n + 1 ( x * n + 1 , x * n + 1 ) q T K 1 q ,
where q T : = [ k u , b n + 1 , n + 1 ( x * n + 1 , x b n + 1 ) , k u , u n + 1 , n + 1 τ ( x * n + 1 , x u n + 1 τ ) , k u , f n + 1 , n + 1 ( x * n + 1 , x f n + 1 ) ,
k u , b n + 1 , n ( x * n + 1 , x b n ) , k u , u n + 1 , n τ ( x * n + 1 , x u n τ ) , k u , f n + 1 , n ( x * n + 1 , x f n ) , k u , u n , n τ m ( x * n , x u n τ m ) , k u , 1 n + 1 , n ( x * n + 1 , x u n ) , k u , 2 n + 1 , n ( x * n + 1 , x u n ) ] , k u , 2 n + 1 , n = L x 0 k u , u n + 1 , n + 1 ( x * n + 1 , x u n ) .
Since $\{x_u^{n},\mathbf{u}^{n}\}$ ($n>0$), $\{x_u^{n-\tau},\mathbf{u}^{n-\tau}\}$ ($n>\tau$) and $\{x_u^{n+1-\tau},\mathbf{u}^{n+1-\tau}\}$ ($n\ge\tau$) are artificially generated data, we cannot directly use the outcome of the conditional distribution (32) in the next time step. To propagate the uncertainty, we should marginalize out these generated data ($\mathbf{u}^{n}$, $\mathbf{u}^{n-\tau}$ and $\mathbf{u}^{n+1-\tau}$) by employing
u n + 1 η u b 1 , u b 2 , , u b n + 1 η u 1 τ , u 2 τ , , u n + 1 η τ f 1 , f 2 , , f n + 1 η u b 0 , u b 2 , , u b n η u τ , u 1 τ , , u n η τ f 0 , f 1 , , f n η u 0 N μ n + 1 η , Σ n + 1 η , n + 1 η , n + 1 η τ u n + 1 η u b 1 , u b 2 , , u b n + 1 η u 1 τ , u 2 τ , , u 0 f 1 , f 2 , , f n + 1 η u b 0 , u b 2 , , u b n η u τ , u 1 τ , , u 0 f 0 , f 1 , , f n η u 0 N μ n + 1 η , Σ n + 1 η , n + 1 η , n + 1 η > τ
to get
u n + 1 ( x * n + 1 ) u b 1 , u b 2 , , u b n + 1 u 1 τ , u 2 τ , , u n + 1 τ f 1 , f 2 , , f n + 1 u b 0 , u b 2 , , u b n u τ , u 1 τ , , u n τ f 0 , f 1 , , f n u 0 N μ n + 1 ( x * n + 1 ) , Σ n + 1 , n + 1 ( x * n + 1 , x * n + 1 ) , n + 1 τ u n + 1 ( x * n + 1 ) u b 1 , u b 2 , , u b n + 1 u 1 τ , u 2 τ , , u 0 f 1 , f 2 , , f n + 1 u b 0 , u b 2 , , u b n u τ , u 1 τ , , u 0 f 0 , f 1 , , f n u 0 N μ n + 1 ( x * n + 1 ) , Σ n + 1 , n + 1 ( x * n + 1 , x * n + 1 ) , n + 1 > τ
where μ n + 1 ( x * n + 1 ) = q T K 1 u b n + 1 μ n + 1 τ f n + 1 u b n μ n τ f n μ n μ n (if n + 1 τ < 0 , replace μ n + 1 τ by u n + 1 τ ; if n τ < 0 , replace μ n τ by u n τ ), Σ n + 1 , n + 1 ( x * n + 1 , x * n + 1 ) = k u , u n + 1 , n + 1 ( x * n + 1 , x * n + 1 ) q T K 1 q + q T K 1 S K 1 q , S = d i a g ( 0 , Σ n + 1 τ , n + 1 τ , 0 , 0 , Σ n τ , n τ , 0 , Σ 2 × 2 n , n ) (if n + 1 τ < 0 , replace Σ n + 1 τ , n + 1 τ by 0 ; if n τ < 0 , replace Σ n τ , n τ by 0 ), and Σ 2 × 2 n , n = Σ n , n Σ n , n Σ n , n Σ n , n .
On the basis of Equation (34), we have u n + 1 N μ n + 1 , Σ n + 1 , n + 1 , and obtain the artificial data { x u n + 1 , u n + 1 } of the next step, where μ n + 1 = μ n + 1 ( x u n + 1 ) , Σ n + 1 , n + 1 = Σ n + 1 , n + 1 ( x u n + 1 , x u n + 1 ) .

3.2. Numerical Test for the Problem with One Delay Applying the Trapezoidal Rule

In this subsection, the methods applying the backward Euler scheme and the trapezoidal scheme are used to solve Example 4, and numerical tests are carried out. Moreover, the impact of data noise and of the amount of training data on the maximum absolute error $E$ is investigated.
Example 4.
$$
\begin{cases}
\dfrac{\partial u}{\partial t}=(u^{2}+\sin u+2x+1)\dfrac{\partial^{2}u}{\partial x^{2}}+u\,x\,\dfrac{\partial}{\partial x}u(x,t-0.2)+f(x,t), & (x,t)\in(0,2\pi)\times(0,1],\\[1mm]
u(0,t)=e^{-t},\quad u(2\pi,t)=e^{-t}, & t\in(0,1],\\[1mm]
u(x,t)=e^{-t}(\cos x+\sin x), & t\le 0.
\end{cases}
\tag{35}
$$
The exact solution of (35) is $u(x,t)=e^{-t}(\cos x+\sin x)$, and the inhomogeneous term is $f(x,t)=e^{-t}(\cos x+\sin x)\big\{2x+e^{-2t}(\cos x+\sin x)^{2}+\sin\!\big[e^{-t}(\cos x+\sin x)\big]\big\}-e^{0.2-2t}\,x\cos 2x$.
Table 5 presents the maximum absolute error $E$ between the conditional posterior mean of the prediction solution and the exact solution at five time nodes for different time step sizes $\Delta t$ under the backward Euler scheme, where the number of initial data points (and of inhomogeneous term data points, artificially generated data points and delay term data points) is fixed to be 20. It is found that the method applying the backward Euler scheme is first-order accurate in time, which is in line with the first-order convergence of the backward Euler scheme. Since calculating the temporal convergence order requires the maximum absolute errors under two consecutive time step sizes, the first value in the last column of Table 5 and Table 6 is omitted.
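The temporal order reported here can be computed from the maximum absolute errors at consecutive time step sizes, as in the following sketch (our own; the error values in the usage line are illustrative, not taken from Table 5):

```python
import numpy as np

def observed_order(errors, dts):
    """Observed temporal convergence orders from maximum absolute errors
    measured at consecutive time step sizes: p = log(E1/E2) / log(dt1/dt2)."""
    errors, dts = np.asarray(errors, float), np.asarray(dts, float)
    return np.log(errors[:-1] / errors[1:]) / np.log(dts[:-1] / dts[1:])

# usage with assumed (illustrative) error values
print(observed_order([2.0e-2, 1.0e-2, 5.1e-3], [0.1, 0.05, 0.025]))
```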
Figure 7a shows the impact of the number of initial data points (and of inhomogeneous term data points, artificially generated data points and delay term data points) on $\max E(\Delta t)=\max_{1\le n\le N}E(t_n)$, where we take $\Delta t$ as $10^{-1}$, $10^{-2}$ and $10^{-3}$ in turn. The results show that when $\Delta t$ equals $10^{-1}$, the prediction accuracy quickly saturates as the number of training data points increases, but for smaller $\Delta t$, increasing the amount of training data still has the potential to improve the prediction accuracy, and saturation occurs much later.
Moreover, we investigate the impact of noise on the prediction accuracy. We add noise to the initial data $\mathbf{u}^{0}$, the delay term data $\mathbf{u}^{-\tau}$ ($\tau=1,2,\ldots,19$) and the inhomogeneous term data $\mathbf{f}^{n}$ ($n=1,2,\ldots,100$) to investigate the impact of the noise level on the accuracy. The additive noise on $\mathbf{u}^{0}$, $\mathbf{u}^{-\tau}$ and $\mathbf{f}^{n}$ is $\boldsymbol{\varepsilon}_{u}^{0}\sim\mathcal{N}(\mathbf{0},\sigma^{2}\mathbf{I})$, $\boldsymbol{\varepsilon}_{u}^{-\tau}\sim\mathcal{N}(\mathbf{0},\sigma^{2}\mathbf{I})$ and $\boldsymbol{\varepsilon}_{f}^{n}\sim\mathcal{N}(\mathbf{0},\tfrac14\sigma^{2}\mathbf{I})$, respectively. We take $t=0.2,0.6,1.0$ and observe the change of $E$ under different $\sigma$ at these time nodes (see Figure 7b). The results show that the larger $\sigma$ is, the lower the prediction accuracy. As the iteration progresses, the impact of the noise on the accuracy of the current time step becomes smaller and the accuracy gradually improves, since the hyper-parameters, which encode the knowledge of the considered problem, are trained better and better as the iterations continue.
Table 6 shows the maximum absolute error $E$ between the conditional posterior mean of the prediction solution and the exact solution at five time nodes for different time step sizes $\Delta t$ under the trapezoidal scheme, where the number of initial data points (and of inhomogeneous term data points, artificially generated data points and delay term data points) is fixed to be 20. It is found that the method applying the trapezoidal rule is still first-order accurate in time, which is apparently not consistent with the second-order convergence of the trapezoidal rule. Therefore, we conclude from these results that the convergence rate of MP-NGPs using Runge–Kutta methods is also first order. Fortunately, the accuracy of the method applying the trapezoidal scheme (Runge–Kutta methods) does exceed that of the backward Euler scheme when the other experimental conditions are fixed, which means that applying the Runge–Kutta methods to MP-NGPs is still very useful when the current accuracy is not satisfactory. The results in Figure 8 also show that the Runge–Kutta methods exceed the backward Euler scheme in prediction accuracy.

4. Processing of Neumann and Mixed Boundary Conditions

In the sections above, we did not discuss the processing of boundary conditions other than Dirichlet boundary conditions. This section introduces how to solve two-dimensional (one spatial dimension plus time) PDEs with Neumann or mixed boundary conditions; that is, the spatial variable $x$ is one-dimensional.
As is well known, Dirichlet, Neumann and mixed boundary conditions take the following general forms, respectively, for the two-dimensional case of problem (1):
$$
u(\mathrm{lb},t)=\phi_1(t),\qquad u(\mathrm{ub},t)=\phi_2(t),\qquad 0<t\le T,
\tag{36}
$$
$$
\frac{\partial u}{\partial x}\Big|_{x=\mathrm{lb}}=\psi_1(t),\qquad \frac{\partial u}{\partial x}\Big|_{x=\mathrm{ub}}=\psi_2(t),\qquad 0<t\le T,
\tag{37}
$$
$$
\Big[\lambda_{11}(t)\frac{\partial u}{\partial x}+\lambda_{12}(t)u\Big]\Big|_{x=\mathrm{lb}}=\xi_1(t),\qquad
\Big[\lambda_{21}(t)\frac{\partial u}{\partial x}+\lambda_{22}(t)u\Big]\Big|_{x=\mathrm{ub}}=\xi_2(t),\qquad 0<t\le T,
\tag{38}
$$
where ϕ 1 ( t ) , ϕ 2 ( t ) , ψ 1 ( t ) , ψ 2 ( t ) , λ 11 ( t ) , λ 12 ( t ) , λ 21 ( t ) , λ 22 ( t ) , ξ 1 ( t ) , ξ 2 ( t ) are given functions.
In order to avoid the repetitive introduction, we will focus on the processing of Neumann and mixed boundary conditions in this section.

4.1. Neumann Boundary Conditions

For the Neumann boundary conditions (37), introduce $v^{n}=\frac{\partial u^{n}}{\partial x}$, where $v^{n}(x)=v(x,t_n)=\frac{\partial u(x,t_n)}{\partial x}$. Denote $\mathcal{L}_{x}^{v}u^{n}=\frac{\partial u^{n}}{\partial x}$, where $\mathcal{L}_{x}^{v}$ is a linear operator.
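For Neumann data the required cross-covariances are obtained by differentiating the kernel, since $\mathrm{cov}\big(v(x),u(x')\big)=\partial k(x,x')/\partial x$. The following sketch (our own, for a one-dimensional squared exponential kernel with assumed hyper-parameters) writes these derivative kernels explicitly and checks one of them against a finite difference:

```python
import numpy as np

# Cross-covariances involving the derivative observations v = du/dx used for
# Neumann data, obtained by differentiating a 1-D squared exponential kernel.
sigma2, w = 1.0, 4.0
k    = lambda x, xp: sigma2 * np.exp(-0.5 * w * (x - xp)**2)       # cov(u(x), u(x'))
k_vu = lambda x, xp: -w * (x - xp) * k(x, xp)                      # cov(v(x), u(x')) = dk/dx
k_vv = lambda x, xp: w * (1.0 - w * (x - xp)**2) * k(x, xp)        # cov(v(x), v(x')) = d^2k/(dx dx')

# finite-difference check of the derivative kernel at a sample pair of points
x, xp, h = 0.3, 1.1, 1e-6
print(k_vu(x, xp), (k(x + h, xp) - k(x - h, xp)) / (2 * h))        # should match closely
```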
Return to the problem (1), then we can get y G P ( 0 , K ) , where
y = v b n u n τ 1 u n τ m f n u n 1 , K = k v , v n , n k v , u n , n τ 1 k v , u n , n τ m k v , f n , n k v , u n , n 1 k u , u n τ 1 , n τ 1 k u , u n τ 1 , n τ m k u , f n τ 1 , n k u , u n τ 1 , n 1 k u , u n τ m , n τ m k u , f n τ m , n k u , u n τ m , n 1 k f , f n , n k f , u n , n 1 k u , u n 1 , n 1 ,
k v , v n , n = L x v L x v k u , u n , n ( x b n , x b n ) + σ v n 2 I , k f , f n , n = k f , f n , n ( x f n , x f n ) + σ f n 2 I , x b n = { lb , ub } , k u , u n τ i , n τ i = k u , u n τ i , n τ i ( x u n τ i , x u n τ i ) + σ u n τ i 2 I , i = 1 , 2 , , m , k v , u n , n τ i = k v , u n , n τ i ( x b n , x u n τ i ) = 0 , k u , f n τ i , n = k u , f n τ i , n ( x u n τ i , x f n ) = 0 , i = 1 , 2 , , m , k v , f n , n = k u , f n , n ( x b n , x f n ) = 0 , k u , u n τ l , n τ k = k u , u n τ l , n τ k ( x u n τ l , x u n τ k ) = 0 , l = 1 , 2 , , m , k = 1 , 2 , , m , k v , u n , n 1 = L x v L x 0 k u , u n , n ( x b n , x u n 1 ) , k u , u n τ i , n 1 = L x i k u , u n τ i , n τ i ( x u n τ i , x u n 1 ) , i = 1 , 2 , , m , k f , u n , n 1 = L x m + 1 k f , f n , n ( x f n , x u n 1 ) , k u , u n 1 , n 1 = L x 0 L x 0 k u , u n , n ( x u n 1 , x u n 1 ) + i = 1 m L x i L x i k u , u n τ i , n τ i ( x u n 1 , x u n 1 ) + L x m + 1 L x m + 1 k f , f n , n ( x u n 1 , x u n 1 ) + σ u n 1 2 I .
where v b n = v ( x b n , t n ) + ε v n , ε v n N ( 0 , σ v n 2 I ) . The additive noise ε v n is assumed to be independent of other noise.
By (5), train the hyper-parameters { θ j | j = 0 , 1 , , m + 1 } by minimizing NLML (39),
$$
-\log p(\mathbf{y}\,|\,\mathbf{x}_b^{n},\mathbf{x}_u^{n-\tau_1},\ldots,\mathbf{x}_u^{n-\tau_m},\mathbf{x}_f^{n},\mathbf{x}_u^{n-1},\theta_0,\ldots,\theta_{m+1})=\tfrac12\log|\mathbf{K}|+\tfrac12\mathbf{y}^{\mathrm T}\mathbf{K}^{-1}\mathbf{y}+\tfrac{N}{2}\log 2\pi.
\tag{39}
$$
After the training part, the posterior distribution of u n ( x * n ) can be directly written as
u n ( x * n ) v b n u n τ 1 u n τ m f n u n 1 N q T K 1 v b n u n τ 1 u n τ m f n u n 1 , k u , u n , n ( x * n , x * n ) q T K 1 q ,
where q T : = [ k u , v n , n ( x * n , x b n ) , k u , u n , n τ 1 ( x * n , x u n τ 1 ) , , k u , u n , n τ m ( x * n , x u n τ m ) , k u , f n , n ( x * n , x f n ) ,
k u , u n , n 1 ( x * n , x u n 1 ) ] , k u , v n , n ( x * n , x b n ) = L x v k u , u n , n ( x * n , x b n ) .
On the basis of Section 2.4, in order to accurately propagate the uncertainty, marginalize u n 1 (and perhaps u n τ j , when n > τ j ) by employing (assume τ q < n η τ q + 1 )
u n η v b 1 , v b 2 , , v b n η u 1 τ 1 , u 2 τ 1 , , u 0 u 1 τ q , u 2 τ q , , u 0 u 1 τ q + 1 , u 2 τ q + 1 , , u n η τ q + 1 u 1 τ m , u 2 τ m , , u n η τ m f 1 , f 2 , , f n η u 0 N μ n η , Σ n η , n η ,
to obtain (assume τ p < n τ p + 1 )
u n ( x * n ) v b 1 , v b 2 , , v b n u 1 τ 1 , u 2 τ 1 , , u 0 u 1 τ p , u 2 τ p , , u 0 u 1 τ p + 1 , u 2 τ p + 1 , , u n τ p + 1 u 1 τ m , u 2 τ m , , u n τ m f 1 , f 2 , , f n u 0 N μ n ( x * n ) , Σ n , n ( x * n , x * n ) ,
where μ n ( x * n ) = q T K 1 v b n μ n τ 1 μ n τ p u n τ p + 1 u n τ m f n μ n 1 , Σ n , n ( x * n , x * n ) = k u , u n , n ( x * n , x * n ) q T K 1 q + q T K 1 S K 1 q , and S = d i a g ( 0 , Σ n τ 1 , n τ 1 , , Σ n τ p , n τ p , , 0 , , Σ n 1 , n 1 ) .
By the posterior distribution (42), obtain the simplified form u n N μ n , Σ n , n , and then obtain the artificial data { x u n , u n } of the next time step, where μ n = μ n ( x u n ) , and Σ n , n = Σ n , n ( x u n , x u n ) .
Let us return to the problem (11) in Section 2.6, and the operator L x 0 contains fractional-order operators. Then k v , u n , n 1 ( x , x ) can be obtained by performing the inverse Fourier transform on k ^ v , u n , n 1 ( ω , ω ) , where
k ^ v , u n , n 1 ( ω , ω ) = ( i ω ) [ 1 Δ t · g · ( i ω ) α Δ t · g · ( i ω ) β ] k ^ u , u n , n ( ω , ω ) .

4.2. Mixed Boundary Conditions

For the mixed boundary conditions (38), the processing of the boundary information is more complicated. Introduce $w_1^{n}=\lambda_{11}^{n}\frac{\partial u^{n}}{\partial x}+\lambda_{12}^{n}u^{n}$ and $w_2^{n}=\lambda_{21}^{n}\frac{\partial u^{n}}{\partial x}+\lambda_{22}^{n}u^{n}$, where $\lambda_{11}^{n}=\lambda_{11}(t_n)$, $\lambda_{12}^{n}=\lambda_{12}(t_n)$, $\lambda_{21}^{n}=\lambda_{21}(t_n)$ and $\lambda_{22}^{n}=\lambda_{22}(t_n)$. Denote $\mathcal{L}_{x}^{w_1}u^{n}=\lambda_{11}^{n}\frac{\partial u^{n}}{\partial x}+\lambda_{12}^{n}u^{n}$ and $\mathcal{L}_{x}^{w_2}u^{n}=\lambda_{21}^{n}\frac{\partial u^{n}}{\partial x}+\lambda_{22}^{n}u^{n}$, where $\mathcal{L}_{x}^{w_1}$ and $\mathcal{L}_{x}^{w_2}$ are linear operators.
Return to the problem (1), then we can obtain y G P ( 0 , K ) , where
y = w 1 , lb n w 2 , ub n u n τ 1 u n τ m f n u n 1 , K = k w 1 , w 1 n , n k w 1 , w 2 n , n k w 1 , u n , n τ 1 k w 1 , u n , n τ m k w 1 , f n , n k w 1 , u n , n 1 k w 2 , w 2 n , n k w 2 , u n , n τ 1 k w 2 , u n , n τ m k w 2 , f n , n k w 2 , u n , n 1 k u , u n τ 1 , n τ 1 k u , u n τ 1 , n τ m k u , f n τ 1 , n k u , u n τ 1 , n 1 k u , u n τ m , n τ m k u , f n τ m , n k u , u n τ m , n 1 k f , f n , n k f , u n , n 1 k u , u n 1 , n 1 ,
k w 1 , w 1 n , n = L x w 1 L x w 1 k u , u n , n ( lb , lb ) + σ w 1 , n 2 , k w 2 , w 2 n , n = L x w 2 L x w 2 k u , u n , n ( ub , ub ) + σ w 2 , n 2 , k f , f n , n = k f , f n , n ( x f n , x f n ) + σ f n 2 I , k u , u n τ i , n τ i = k u , u n τ i , n τ i ( x u n τ i , x u n τ i ) + σ u n τ i 2 I , i = 1 , 2 , , m , k w 1 , u n , n τ i = k w 1 , u n , n τ i ( lb , x u n τ i ) = 0 , k w 2 , u n , n τ i = k w 2 , u n , n τ i ( ub , x u n τ i ) = 0 , k u , f n τ i , n = k u , f n τ i , n ( x u n τ i , x f n ) = 0 , i = 1 , 2 , , m , k w 1 , f n , n = k w 1 , f n , n ( lb , x f n ) = 0 , k w 2 , f n , n = k w 2 , f n , n ( ub , x f n ) = 0 , k u , u n τ l , n τ k = k u , u n τ l , n τ k ( x u n τ l , x u n τ k ) = 0 , l = 1 , 2 , , m , k = 1 , 2 , , m , k w 1 , u n , n 1 = L x w 1 L x 0 k u , u n , n ( lb , x u n 1 ) , k w 2 , u n , n 1 = L x w 2 L x 0 k u , u n , n ( ub , x u n 1 ) , k w 1 , w 2 n , n = L x w 1 L x w 2 k u , u n , n ( lb , ub ) , k u , u n τ i , n 1 = L x i k u , u n τ i , n τ i ( x u n τ i , x u n 1 ) , i = 1 , 2 , , m , k f , u n , n 1 = L x m + 1 k f , f n , n ( x f n , x u n 1 ) , k u , u n 1 , n 1 = L x 0 L x 0 k u , u n , n ( x u n 1 , x u n 1 ) + i = 1 m L x i L x i k u , u n τ i , n τ i ( x u n 1 , x u n 1 ) + L x m + 1 L x m + 1 k f , f n , n ( x u n 1 , x u n 1 ) + σ u n 1 2 I .
where w 1 , lb n = w 1 ( lb , t n ) + ε w 1 , n , ε w 1 , n N ( 0 , σ w 1 , n 2 ) , w 2 , ub n = w 2 ( ub , t n ) + ε w 2 , n , ε w 2 , n N ( 0 , σ w 2 , n 2 ) . The additive noise ε w 1 , n and ε w 2 , n are assumed to be independent of other noise.
By (5), train the hyper-parameters { θ j | j = 0 , 1 , , m + 1 } by minimizing NLML (46),
$$
-\log p(\mathbf{y}\,|\,\mathbf{x}_b^{n},\mathbf{x}_u^{n-\tau_1},\ldots,\mathbf{x}_u^{n-\tau_m},\mathbf{x}_f^{n},\mathbf{x}_u^{n-1},\theta_0,\ldots,\theta_{m+1})=\tfrac12\log|\mathbf{K}|+\tfrac12\mathbf{y}^{\mathrm T}\mathbf{K}^{-1}\mathbf{y}+\tfrac{N}{2}\log 2\pi.
\tag{46}
$$
After the training part, the posterior distribution of u n ( x * n ) can be directly written as
u n ( x * n ) w 1 , lb n w 2 , ub n u n τ 1 u n τ m f n u n 1 N q T K 1 w 1 , lb n w 2 , ub n u n τ 1 u n τ m f n u n 1 , k u , u n , n ( x * n , x * n ) q T K 1 q ,
where q T = [ k u , w 1 n , n ( x * n , lb ) , k u , w 2 n , n ( x * n , ub ) , k u , u n , n τ 1 ( x * n , x u n τ 1 ) , , k u , u n , n τ m ( x * n , x u n τ m ) , k u , f n , n ( x * n , x f n ) , k u , u n , n 1 ( x * n , x u n 1 ) ] , k u , w 1 n , n ( x * n , lb ) = L x w 1 k u , u n , n ( x * n , lb ) , k u , w 2 n , n ( x * n , ub ) = L x w 2 k u , u n , n ( x * n , ub ) .
By Section 2.4, in order to accurately propagate the uncertainty, marginalize u n 1 (and perhaps u n τ j , when n > τ j ) by employing (assume τ q < n η τ q + 1 )
u n η w 1 , lb 1 , w 1 , lb 2 , , w 1 , lb n η w 2 , ub 1 , w 2 , ub 2 , , w 2 , ub n η u 1 τ 1 , u 2 τ 1 , , u 0 u 1 τ q , u 2 τ q , , u 0 u 1 τ q + 1 , u 2 τ q + 1 , , u n η τ q + 1 u 1 τ m , u 2 τ m , , u n η τ m f 1 , f 2 , , f n η u 0 N μ n η , Σ n η , n η ,
to obtain (assume τ p < n τ p + 1 )
u n ( x * n ) w 1 , lb 1 , w 1 , lb 2 , , w 1 , lb n w 2 , ub 1 , w 2 , ub 2 , , w 2 , ub n u 1 τ 1 , u 2 τ 1 , , u 0 u 1 τ p , u 2 τ p , , u 0 u 1 τ p + 1 , u 2 τ p + 1 , , u n τ p + 1 u 1 τ m , u 2 τ m , , u n τ m f 1 , f 2 , , f n u 0 N μ n ( x * n ) , Σ n , n ( x * n , x * n ) ,
where μ n ( x * n ) = q T K 1 w 1 , l b n w 2 , u b n μ n τ 1 μ n τ p u n τ p + 1 u n τ m f n μ n 1 , Σ n , n ( x * n , x * n ) = k u , u n , n ( x * n , x * n ) q T K 1 q + q T K 1 S K 1 q , and S = d i a g ( 0 , Σ n τ 1 , n τ 1 , , Σ n τ p , n τ p , , 0 , , Σ n 1 , n 1 ) .
On the basis of the posterior distribution (49), obtain the simplified form u n N μ n , Σ n , n , and then obtain the artificial data { x u n , u n } of the next time step, where μ n = μ n ( x u n ) , and Σ n , n = Σ n , n ( x u n , x u n ) .
Return to problem (11) in Section 2.6, and the operator L x 0 contains fractional-order operators. Then k w 1 , u n , n 1 ( x , x ) and k w 2 , u n , n 1 ( x , x ) can be obtained by performing the inverse Fourier transform on k ^ w 1 , u n , n 1 ( ω , ω ) and k ^ w 2 , u n , n 1 ( ω , ω ) , respectively, where
k ^ w 1 , u n , n 1 ( ω , ω ) = [ λ 11 n ( i ω ) + λ 12 n ] [ 1 Δ t · g · ( i ω ) α Δ t · g · ( i ω ) β ] k ^ u , u n , n ( ω , ω ) , k ^ w 2 , u n , n 1 ( ω , ω ) = [ λ 21 n ( i ω ) + λ 22 n ] [ 1 Δ t · g · ( i ω ) α Δ t · g · ( i ω ) β ] k ^ u , u n , n ( ω , ω ) .

4.3. Numerical Tests for a Problem with Two Delays and Different Types of Boundary Conditions

In this subsection, we solve a nonlinear PDE with two delays; Dirichlet, Neumann and mixed boundary conditions are considered for this PDE in turn.
Example 5.
$$
\begin{cases}
\dfrac{\partial u}{\partial t}=(u+1)\dfrac{\partial^{2}u}{\partial x^{2}}+\sin u\,\dfrac{\partial}{\partial x}u(x,t-0.1)+\cos u\,\dfrac{\partial}{\partial x}u(x,t-0.2)+f(x,t), & (x,t)\in(0,2\pi)\times(0,1],\\[1mm]
u(x,t)=(1+t^{2})\Big(\dfrac{x}{2\pi}+\sin x\Big), & -0.2\le t\le 0.
\end{cases}
\tag{51}
$$
Example 5 is a nonlinear PDE with two delays. The exact solution of (51) is $u(x,t)=(1+t^{2})\big(\frac{x}{2\pi}+\sin x\big)$, and the inhomogeneous term is
$$
\begin{aligned}
f(x,t)=\frac{1}{2\pi}\Big(&2t(x+2\pi\sin x)+(1+t^{2})\sin x\,\big[2\pi+(1+t^{2})(x+2\pi\sin x)\big]\\
&-\big[1+(t-0.1)^{2}\big](1+2\pi\cos x)\sin\!\Big[(1+t^{2})\Big(\frac{x}{2\pi}+\sin x\Big)\Big]\\
&-\big[1+(t-0.2)^{2}\big](1+2\pi\cos x)\cos\!\Big[(1+t^{2})\Big(\frac{x}{2\pi}+\sin x\Big)\Big]\Big).
\end{aligned}
$$
The Dirichlet, Neumann and mixed boundary conditions of Example 5 are as follows:
$$
u(0,t)=0,\qquad u(2\pi,t)=1+t^{2},\qquad 0<t\le 1,
\tag{52}
$$
$$
\frac{\partial u}{\partial x}\Big|_{x=0}=\Big(\frac{1}{2\pi}+1\Big)(1+t^{2}),\qquad
\frac{\partial u}{\partial x}\Big|_{x=2\pi}=\Big(\frac{1}{2\pi}+1\Big)(1+t^{2}),\qquad 0<t\le 1,
\tag{53}
$$
$$
\begin{aligned}
\Big[(4+t)\frac{\partial u}{\partial x}+(1+t^{2})u\Big]\Big|_{x=0}&=\Big(\frac{1}{2\pi}+1\Big)(4+t)(1+t^{2}), & 0<t\le 1,\\
\Big[(1+t)^{2}\frac{\partial u}{\partial x}+(3+2\sin t)u\Big]\Big|_{x=2\pi}&=\Big[\Big(\frac{1}{2\pi}+1\Big)(1+t)^{2}+(3+2\sin t)\Big](1+t^{2}), & 0<t\le 1.
\end{aligned}
\tag{54}
$$
Numerical tests of solving (51) are performed with noiseless data. Figure 9 shows the posterior distribution of the predicted solution at different time nodes, where we fix the number of initial data points (and of inhomogeneous term data points, artificially generated data points and delay term data points) to be 10 and $\Delta t$ to be 0.01. Figure 9a–c presents the results of the experiments with Dirichlet, Neumann and mixed boundary conditions, respectively, and Figure 9d shows the maximum absolute error $E$ as time increases. Moreover, Table 7 shows the maximum absolute error $E$ at some time nodes in detail, where Dirichlet, Neumann and mixed boundary conditions are all considered. It is found that the method based on MP-NGPs works well when solving nonlinear delay PDEs with Dirichlet, Neumann or mixed boundary conditions, and the prediction accuracies under the different types of boundary conditions are close to each other.

5. Processing of Wave Equations

To further demonstrate the efficiency of MP-NGPs, we give the processing of another type of time-dependent PDE, the wave equation,
$$
\begin{cases}
\dfrac{\partial^{2}u(x,t)}{\partial t^{2}}=\dfrac{\partial^{2}u(x,t)}{\partial x^{2}}+\dfrac{\partial u(x,t-s)}{\partial x}+f(x,t), & \mathrm{lb}<x<\mathrm{ub},\ 0<t\le T,\\[1mm]
u(\mathrm{lb},t)=\phi_1(t),\quad u(\mathrm{ub},t)=\phi_2(t), & 0<t\le T,\\[1mm]
u(x,t)=\psi(x,t), & \mathrm{lb}\le x\le\mathrm{ub},\ -s\le t\le 0,\\[1mm]
\dfrac{\partial u}{\partial t}(x,0)=\varphi(x), & \mathrm{lb}\le x\le\mathrm{ub},
\end{cases}
\tag{55}
$$
where ϕ 1 ( t ) , ϕ 2 ( t ) , ψ ( x , t ) , and φ ( x ) are given functions. Apparently, delay terms and inhomogeneous terms are also considered in the wave equations.
We will give two methods of how to solve wave equations by the algorithm of MP-NGPs.

5.1. Second Order Central Difference Scheme

The first method applies a second-order central difference approximation of the second-order time derivative to the first equation of (55) to discretize the temporal variable:
$$
\frac{u^{n-1}+u^{n+1}-2u^{n}}{\Delta t^{2}}=\frac{\partial^{2}}{\partial x^{2}}u^{n}+\frac{\partial}{\partial x}u^{n-\tau}+f^{n},
\tag{56}
$$
where $n=0,1,\ldots,N-1$, $N=T/\Delta t$, $u^{n}=u(x,t_n)$, $u^{n-\tau}=u(x,t_n-s)$, $\tau=s/\Delta t$ and $f^{n}=f(x,t_n)$.
Rearrange (56) to obtain
$$
u^{n+1}=2u^{n}+\Delta t^{2}\frac{\partial^{2}}{\partial x^{2}}u^{n}-u^{n-1}+\Delta t^{2}\frac{\partial}{\partial x}u^{n-\tau}+\Delta t^{2}f^{n}.
\tag{57}
$$
Obviously, the scheme (56) is a multi-step method, while the backward Euler scheme is a one-step method. Then introduce the notations $\mathcal{L}_{x}^{0}u^{n}=2u^{n}+\Delta t^{2}\frac{\partial^{2}}{\partial x^{2}}u^{n}$, $\mathcal{L}_{x}^{1}u^{n-1}=-u^{n-1}$, $\mathcal{L}_{x}^{2}u^{n-\tau}=\Delta t^{2}\frac{\partial}{\partial x}u^{n-\tau}$ and $\mathcal{L}_{x}^{3}f^{n}=\Delta t^{2}f^{n}$, so that we obtain the simplified equation
$$
u^{n+1}=\mathcal{L}_{x}^{0}u^{n}+\mathcal{L}_{x}^{1}u^{n-1}+\mathcal{L}_{x}^{2}u^{n-\tau}+\mathcal{L}_{x}^{3}f^{n}.
\tag{58}
$$
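For orientation, the following sketch (our own deterministic analogue, not MP-NGPs) applies the three-level update (57) to the plain wave equation $u_{tt}=u_{xx}$ with the delay and forcing dropped, using finite differences in space; MP-NGPs instead realize the spatial operators through the kernel, but the time-stepping pattern is the same.

```python
import numpy as np

# Classical central-difference scheme for u_tt = u_xx on [0, pi] with u = 0 on
# the boundary; the exact solution of this check problem is u = cos(t) sin(x).
nx, nt, T = 200, 400, 1.0
x = np.linspace(0.0, np.pi, nx)
dx, dt = x[1] - x[0], T / nt

u_prev = np.sin(x)                                   # u^0
u_xx = np.zeros(nx)
u_xx[1:-1] = (u_prev[2:] - 2 * u_prev[1:-1] + u_prev[:-2]) / dx**2
u_curr = u_prev + 0.5 * dt**2 * u_xx                 # u^1 (since u_t(x, 0) = 0)
u_curr[0] = u_curr[-1] = 0.0

for n in range(1, nt):
    u_xx[1:-1] = (u_curr[2:] - 2 * u_curr[1:-1] + u_curr[:-2]) / dx**2
    u_next = 2 * u_curr - u_prev + dt**2 * u_xx      # three-level update as in (57)
    u_next[0] = u_next[-1] = 0.0
    u_prev, u_curr = u_curr, u_next

print(np.max(np.abs(u_curr - np.cos(T) * np.sin(x))))   # small maximum error
```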
Assume the Gaussian process prior as follows:
$$
u^{n}(x)\sim\mathcal{GP}\big(0,k_{u,u}^{n,n}(x,x';\theta_0)\big),\quad
u^{n-1}(x)\sim\mathcal{GP}\big(0,k_{u,u}^{n-1,n-1}(x,x';\theta_1)\big),\quad
u^{n-\tau}(x)\sim\mathcal{GP}\big(0,k_{u,u}^{n-\tau,n-\tau}(x,x';\theta_2)\big),\quad
f^{n}(x)\sim\mathcal{GP}\big(0,k_{f,f}^{n,n}(x,x';\theta_3)\big).
\tag{59}
$$
According to the properties of Gaussian processes, then we can obtain y G P ( 0 , K ) , where
y = u n u n 1 u n τ f n u b n + 1 , K = k u , u n , n k u , u n , n 1 k u , u n , n τ k u , f n , n k u , b n , n + 1 k u , u n 1 , n 1 k u , u n 1 , n τ k u , f n 1 , n k u , b n 1 , n + 1 k u , u n τ , n τ k u , f n τ , n k u , b n τ , n + 1 k f , f n , n k f , b n , n + 1 k b , b n + 1 , n + 1 ,
k u , u n , n = k u , u n , n ( x u n , x u n ) + σ u n 2 I , k u , u n 1 , n 1 = k u , u n 1 , n 1 ( x u n 1 , x u n 1 ) + σ u n 1 2 I , k f , f n , n = k f , f n , n ( x f n , x f n ) + σ f n 2 I , k u , u n τ , n τ = k u , u n τ , n τ ( x u n τ , x u n τ ) + σ u n τ 2 I , k u , b n , n + 1 = L x 0 k u , u n , n ( x u n , x b n + 1 ) , k u , b n 1 , n + 1 = L x 1 k u , u n 1 , n 1 ( x u n 1 , x b n + 1 ) , k u , b n τ , n + 1 = L x 2 k u , u n τ , n τ ( x u n τ , x b n + 1 ) , k f , b n , n + 1 = L x 3 k f , f n , n ( x f n , x b n + 1 ) , k b , b n + 1 , n + 1 = L x 0 L x 0 k u , u n , n ( x b n + 1 , x b n + 1 ) + L x 1 L x 1 k u , u n 1 , n 1 ( x b n + 1 , x b n + 1 ) + L x 2 L x 2 k u , u n τ , n τ ( x b n + 1 , x b n + 1 ) + L x 3 L x 3 k f , f n , n ( x b n + 1 , x b n + 1 ) + σ u n + 1 2 I , x b n + 1 = { lb , ub } .
Any kernel not mentioned above is equal to 0 .
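In an implementation, these cross-covariances are generated by letting the operators act analytically on the chosen base kernels. A minimal symbolic sketch, assuming a one-dimensional squared-exponential kernel (our assumption for illustration only; the kernels actually used may differ):

```python
import sympy as sp

# x is the argument tied to the data at step n; x_p is the argument tied to the
# boundary points at step n+1, on which the operators act.
x, x_p = sp.symbols("x x_p")
l, dt = sp.symbols("l Delta_t", positive=True)

# Base squared-exponential kernel -- an assumption made only for this sketch.
k = sp.exp(-(x - x_p) ** 2 / (2 * l ** 2))

# Operators acting on the second argument.
L0 = lambda expr: 2 * expr + dt ** 2 * sp.diff(expr, x_p, 2)   # L_x^0
L2 = lambda expr: dt ** 2 * sp.diff(expr, x_p, 1)              # L_x^2

k_u_b    = sp.simplify(L0(k))   # cov(u^n(x), u^{n+1}(x_p)) contribution
k_utau_b = sp.simplify(L2(k))   # cov(u^{n-tau}(x), u^{n+1}(x_p)) contribution

# The boundary/boundary block applies the operator in both arguments.
L0_x = lambda expr: 2 * expr + dt ** 2 * sp.diff(expr, x, 2)
k_b_b_term = sp.simplify(L0_x(L0(k)))   # the L^0 L^0 k term of k_{b,b}

print(k_u_b)
```

The resulting expressions are then evaluated at the training and boundary points to fill the corresponding blocks of $K$.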
Following (5), the hyper-parameters $\{\theta_{j} \mid j = 0, 1, 2, 3\}$ are trained by minimizing the negative log marginal likelihood (NLML) (60):
\[
-\log p\big(y \mid x_{b}^{n+1}, x_{u}^{n-\tau}, x_{f}^{n}, x_{u}^{n}, x_{u}^{n-1}, \theta_{0}, \theta_{1}, \theta_{2}, \theta_{3}\big)
= \frac{1}{2} \log |K| + \frac{1}{2} y^{T} K^{-1} y + \frac{N}{2} \log 2\pi.
\]
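A minimal sketch of this training step, assuming a user-supplied routine build_K(theta) (a placeholder name) that assembles the joint covariance from the blocks above; the NLML is evaluated through a Cholesky factorization for numerical stability and minimized with a quasi-Newton optimizer such as L-BFGS-B:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.optimize import minimize

def nlml(theta, y, build_K):
    """Negative log marginal likelihood for a given hyper-parameter vector theta."""
    K = build_K(theta)                      # placeholder: assembles the joint covariance
    K = K + 1e-8 * np.eye(K.shape[0])       # jitter for numerical stability
    c, low = cho_factor(K, lower=True)
    alpha = cho_solve((c, low), y)          # K^{-1} y
    log_det = 2.0 * np.sum(np.log(np.diag(c)))
    n = y.size
    return 0.5 * log_det + 0.5 * y @ alpha + 0.5 * n * np.log(2.0 * np.pi)

def train(theta0, y, build_K):
    """Minimize the NLML starting from an initial hyper-parameter guess theta0."""
    res = minimize(nlml, theta0, args=(y, build_K), method="L-BFGS-B")
    return res.x
```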
After the training part, the posterior distribution of $u^{n+1}(x_{*}^{n+1})$ can be written directly as
\[
u^{n+1}(x_{*}^{n+1}) \,\big|\, u^{n}, u^{n-1}, u^{n-\tau}, f^{n}, u_{b}^{n+1}
\sim \mathcal{N}\Big( q^{T} K^{-1} y,\; k_{u,u}^{n+1,n+1}(x_{*}^{n+1}, x_{*}^{n+1}) - q^{T} K^{-1} q \Big),
\]
where $q^{T} = \big[\, k_{u,u}^{n+1,n}(x_{*}^{n+1}, x_{u}^{n}),\; k_{u,u}^{n+1,n-1}(x_{*}^{n+1}, x_{u}^{n-1}),\; k_{u,u}^{n+1,n-\tau}(x_{*}^{n+1}, x_{u}^{n-\tau}),\; k_{u,f}^{n+1,n}(x_{*}^{n+1}, x_{f}^{n}),\; k_{u,u}^{n+1,n+1}(x_{*}^{n+1}, x_{b}^{n+1}) \,\big]$.
On the basis of Section 2.4, in order to propagate the uncertainty accurately, marginalize $u^{n}$, $u^{n-1}$ (and $u^{n-\tau}$ when $n > \tau$) by employing
\[
u^{n+1} \,\big|\, u^{0};\; u^{-1}, u^{0}, \ldots, u^{n-1};\; u^{-\tau}, u^{1-\tau}, \ldots, u^{n-\tau};\; f^{0}, f^{1}, \ldots, f^{n};\; u_{b}^{1}, u_{b}^{2}, \ldots, u_{b}^{n+1}
\sim \mathcal{N}\big( \mu^{n+1}, \Sigma^{n+1,n+1} \big), \quad n \le \tau,
\]
\[
u^{n+1} \,\big|\, u^{0};\; u^{-1}, u^{0}, \ldots, u^{n-1};\; u^{-\tau}, u^{1-\tau}, \ldots, u^{0};\; f^{0}, f^{1}, \ldots, f^{n};\; u_{b}^{1}, u_{b}^{2}, \ldots, u_{b}^{n+1}
\sim \mathcal{N}\big( \mu^{n+1}, \Sigma^{n+1,n+1} \big), \quad n > \tau,
\]
to obtain
\[
u^{n+1}(x_{*}^{n+1}) \,\big|\, u^{0};\; u^{-1}, u^{0}, \ldots, u^{n-1};\; u^{-\tau}, u^{1-\tau}, \ldots, u^{n-\tau};\; f^{0}, f^{1}, \ldots, f^{n};\; u_{b}^{1}, u_{b}^{2}, \ldots, u_{b}^{n+1}
\sim \mathcal{N}\big( \mu^{n+1}(x_{*}^{n+1}), \Sigma^{n+1,n+1}(x_{*}^{n+1}, x_{*}^{n+1}) \big), \quad n \le \tau,
\]
\[
u^{n+1}(x_{*}^{n+1}) \,\big|\, u^{0};\; u^{-1}, u^{0}, \ldots, u^{n-1};\; u^{-\tau}, u^{1-\tau}, \ldots, u^{0};\; f^{0}, f^{1}, \ldots, f^{n};\; u_{b}^{1}, u_{b}^{2}, \ldots, u_{b}^{n+1}
\sim \mathcal{N}\big( \mu^{n+1}(x_{*}^{n+1}), \Sigma^{n+1,n+1}(x_{*}^{n+1}, x_{*}^{n+1}) \big), \quad n > \tau,
\]
where $\mu^{n+1}(x_{*}^{n+1}) = q^{T} K^{-1} \big[\mu^{n};\; \mu^{n-1};\; \mu^{n-\tau};\; f^{n};\; u_{b}^{n+1}\big]$ (if $n = 0$, replace $\mu^{n}$ by $u^{n}$; if $n < 2$, replace $\mu^{n-1}$ by $u^{n-1}$; if $n \le \tau$, replace $\mu^{n-\tau}$ by $u^{n-\tau}$), $\Sigma^{n+1,n+1}(x_{*}^{n+1}, x_{*}^{n+1}) = k_{u,u}^{n+1,n+1}(x_{*}^{n+1}, x_{*}^{n+1}) - q^{T} K^{-1} q + q^{T} K^{-1} S K^{-1} q$, and $S = \mathrm{diag}\big(\Sigma^{n,n}, \Sigma^{n-1,n-1}, \Sigma^{n-\tau,n-\tau}, 0, 0\big)$ (if $n = 0$, replace $\Sigma^{n,n}$ by $0$; if $n < 2$, replace $\Sigma^{n-1,n-1}$ by $0$; if $n \le \tau$, replace $\Sigma^{n-\tau,n-\tau}$ by $0$).
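Numerically, the propagated posterior differs from a standard GP posterior only by the correction term $q^{T} K^{-1} S K^{-1} q$, which carries the covariance of the artificial data forward. A small sketch of this computation (array names and shapes are ours):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def propagated_posterior(K, q, k_star_star, y_mean, S):
    """Posterior mean/covariance at the test points with uncertainty propagation.

    K           : joint covariance of the training targets (m x m)
    q           : cross-covariance between training targets and test points (m x p)
    k_star_star : prior covariance at the test points (p x p)
    y_mean      : posterior means of the training targets (m,)
    S           : block-diagonal covariance of the artificial training data (m x m)
    """
    c, low = cho_factor(K + 1e-8 * np.eye(K.shape[0]), lower=True)
    Kinv_q = cho_solve((c, low), q)                        # K^{-1} q
    mu = Kinv_q.T @ y_mean                                 # q^T K^{-1} y
    cov = k_star_star - q.T @ Kinv_q + Kinv_q.T @ S @ Kinv_q
    return mu, cov
```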
By the posterior distribution (63), we obtain the simplified form $u^{n+1} \sim \mathcal{N}\big(\mu^{n+1}, \Sigma^{n+1,n+1}\big)$ and then the artificial data $\{x_{u}^{n+1}, u^{n+1}\}$ of the next time step, where $\mu^{n+1} = \mu^{n+1}(x_{u}^{n+1})$ and $\Sigma^{n+1,n+1} = \Sigma^{n+1,n+1}(x_{u}^{n+1}, x_{u}^{n+1})$.

5.2. Backward Euler Scheme

Firstly, let $\frac{\partial}{\partial t} u(x,t) = v(x,t)$ to obtain
\[
\frac{\partial}{\partial t} u^{n} = v^{n}, \qquad
\frac{\partial}{\partial t} v^{n} = \frac{\partial^{2}}{\partial x^{2}} u^{n} + \frac{\partial}{\partial x} u^{n-\tau} + f^{n},
\]
where $v^{n} = v(x, t_{n})$. Apply the backward Euler scheme to the system (64) to discretize time,
\[
\frac{u^{n} - u^{n-1}}{\Delta t} = v^{n}, \qquad
\frac{v^{n} - v^{n-1}}{\Delta t} = \frac{\partial^{2}}{\partial x^{2}} u^{n} + \frac{\partial}{\partial x} u^{n-\tau} + f^{n}.
\]
Rearrange (65) to obtain
\[
u^{n-1} = u^{n} - \Delta t\, v^{n}, \qquad
v^{n-1} = v^{n} - \Delta t \frac{\partial^{2}}{\partial x^{2}} u^{n} - \Delta t \frac{\partial}{\partial x} u^{n-\tau} - \Delta t\, f^{n}.
\]
Then introduce the operators $\mathcal{L}_{x}^{0} u^{n} = u^{n}$, $\mathcal{L}_{x}^{1} v^{n} = -\Delta t\, v^{n}$, $\mathcal{L}_{x}^{2} v^{n} = v^{n}$, $\mathcal{L}_{x}^{3} u^{n} = -\Delta t \frac{\partial^{2}}{\partial x^{2}} u^{n}$, $\mathcal{L}_{x}^{4} u^{n-\tau} = -\Delta t \frac{\partial}{\partial x} u^{n-\tau}$ and $\mathcal{L}_{x}^{5} f^{n} = -\Delta t\, f^{n}$, which give the simplified form
\[
u^{n-1} = \mathcal{L}_{x}^{0} u^{n} + \mathcal{L}_{x}^{1} v^{n}, \qquad
v^{n-1} = \mathcal{L}_{x}^{2} v^{n} + \mathcal{L}_{x}^{3} u^{n} + \mathcal{L}_{x}^{4} u^{n-\tau} + \mathcal{L}_{x}^{5} f^{n}.
\]
Assume the Gaussian process prior as follows:
\[
\begin{aligned}
u^{n}(x) &\sim \mathcal{GP}\big(0,\, k_{u,u}^{n,n}(x, x'; \theta_{0})\big), &
v^{n}(x) &\sim \mathcal{GP}\big(0,\, k_{v,v}^{n,n}(x, x'; \theta_{1})\big),\\
u^{n-\tau}(x) &\sim \mathcal{GP}\big(0,\, k_{u,u}^{n-\tau,n-\tau}(x, x'; \theta_{2})\big), &
f^{n}(x) &\sim \mathcal{GP}\big(0,\, k_{f,f}^{n,n}(x, x'; \theta_{3})\big).
\end{aligned}
\]
According to the properties of Gaussian processes, we obtain $y \sim \mathcal{GP}(0, K)$, where
\[
y =
\begin{bmatrix}
u_{b}^{n}\\ u^{n-\tau}\\ f^{n}\\ u^{n-1}\\ v^{n-1}
\end{bmatrix},
\qquad
K =
\begin{bmatrix}
k_{b,b}^{n,n} & k_{b,u}^{n,n-\tau} & k_{b,f}^{n,n} & k_{b,u}^{n,n-1} & k_{b,v}^{n,n-1}\\
 & k_{u,u}^{n-\tau,n-\tau} & k_{u,f}^{n-\tau,n} & k_{u,u}^{n-\tau,n-1} & k_{u,v}^{n-\tau,n-1}\\
 & & k_{f,f}^{n,n} & k_{f,u}^{n,n-1} & k_{f,v}^{n,n-1}\\
 & & & k_{u,u}^{n-1,n-1} & k_{u,v}^{n-1,n-1}\\
 & & & & k_{v,v}^{n-1,n-1}
\end{bmatrix},
\]
where only the upper triangular part of the symmetric matrix $K$ is written out, and
\[
\begin{aligned}
k_{b,b}^{n,n} &= k_{u,u}^{n,n}(x_{b}^{n}, x_{b}^{n}) + \sigma_{u^{n}}^{2} I, &
k_{u,u}^{n-\tau,n-\tau} &= k_{u,u}^{n-\tau,n-\tau}(x_{u}^{n-\tau}, x_{u}^{n-\tau}) + \sigma_{u^{n-\tau}}^{2} I,\\
k_{f,f}^{n,n} &= k_{f,f}^{n,n}(x_{f}^{n}, x_{f}^{n}) + \sigma_{f^{n}}^{2} I, &
k_{b,u}^{n,n-1} &= \mathcal{L}_{x}^{0} k_{u,u}^{n,n}(x_{b}^{n}, x_{u}^{n-1}),\\
k_{b,v}^{n,n-1} &= \mathcal{L}_{x}^{3} k_{u,u}^{n,n}(x_{b}^{n}, x_{v}^{n-1}), &
k_{u,v}^{n-\tau,n-1} &= \mathcal{L}_{x}^{4} k_{u,u}^{n-\tau,n-\tau}(x_{u}^{n-\tau}, x_{v}^{n-1}),\\
k_{f,v}^{n,n-1} &= \mathcal{L}_{x}^{5} k_{f,f}^{n,n}(x_{f}^{n}, x_{v}^{n-1}),
\end{aligned}
\]
\[
\begin{aligned}
k_{u,u}^{n-1,n-1} &= \mathcal{L}_{x}^{0} \mathcal{L}_{x}^{0} k_{u,u}^{n,n}(x_{u}^{n-1}, x_{u}^{n-1}) + \mathcal{L}_{x}^{1} \mathcal{L}_{x}^{1} k_{v,v}^{n,n}(x_{u}^{n-1}, x_{u}^{n-1}) + \sigma_{u^{n-1}}^{2} I,\\
k_{u,v}^{n-1,n-1} &= \mathcal{L}_{x}^{0} \mathcal{L}_{x}^{3} k_{u,u}^{n,n}(x_{u}^{n-1}, x_{v}^{n-1}) + \mathcal{L}_{x}^{1} \mathcal{L}_{x}^{2} k_{v,v}^{n,n}(x_{u}^{n-1}, x_{v}^{n-1}),\\
k_{v,v}^{n-1,n-1} &= \mathcal{L}_{x}^{2} \mathcal{L}_{x}^{2} k_{v,v}^{n,n}(x_{v}^{n-1}, x_{v}^{n-1}) + \mathcal{L}_{x}^{3} \mathcal{L}_{x}^{3} k_{u,u}^{n,n}(x_{v}^{n-1}, x_{v}^{n-1}) + \mathcal{L}_{x}^{4} \mathcal{L}_{x}^{4} k_{u,u}^{n-\tau,n-\tau}(x_{v}^{n-1}, x_{v}^{n-1})\\
&\quad + \mathcal{L}_{x}^{5} \mathcal{L}_{x}^{5} k_{f,f}^{n,n}(x_{v}^{n-1}, x_{v}^{n-1}) + \sigma_{v^{n-1}}^{2} I, \qquad x_{b}^{n} = \{lb, ub\}.
\end{aligned}
\]
Any kernel not mentioned above is equal to 0 .
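In code, the joint covariance above is conveniently assembled block by block, with every unspecified block set to zero since the corresponding priors are independent. A small helper of this kind (a sketch; the names and calling convention are ours):

```python
import numpy as np

def assemble_K(blocks, sizes):
    """Assemble the symmetric joint covariance from a dictionary of non-zero blocks.

    sizes  : numbers of points in each data group, here [u_b^n, u^{n-tau}, f^n, u^{n-1}, v^{n-1}]
    blocks : dict mapping (i, j) with i <= j to the corresponding covariance block;
             every block that is not supplied is taken to be zero.
    """
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    K = np.zeros((offsets[-1], offsets[-1]))
    for (i, j), B in blocks.items():
        K[offsets[i]:offsets[i + 1], offsets[j]:offsets[j + 1]] = B
        if i != j:
            K[offsets[j]:offsets[j + 1], offsets[i]:offsets[i + 1]] = B.T
    return K

# Example: K = assemble_K({(0, 0): k_bb, (0, 3): k_b_um1, (3, 3): k_um1_um1},
#                         [n_b, n_tau, n_f, n_u, n_v])
```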
Following (5), the hyper-parameters $\{\theta_{j} \mid j = 0, 1, 2, 3\}$ are trained by minimizing the NLML (69):
\[
-\log p\big(y \mid x_{b}^{n}, x_{u}^{n-\tau}, x_{f}^{n}, x_{u}^{n-1}, x_{v}^{n-1}, \theta_{0}, \theta_{1}, \theta_{2}, \theta_{3}\big)
= \frac{1}{2} \log |K| + \frac{1}{2} y^{T} K^{-1} y + \frac{N}{2} \log 2\pi.
\]
After the training part, the joint posterior distribution of $u^{n}(x_{*u}^{n})$ and $v^{n}(x_{*v}^{n})$ can be written directly as
\[
\begin{bmatrix} u^{n}(x_{*u}^{n})\\ v^{n}(x_{*v}^{n}) \end{bmatrix}
\,\Big|\, u_{b}^{n}, u^{n-\tau}, f^{n}, u^{n-1}, v^{n-1}
\sim \mathcal{N}\!\left( q^{T} K^{-1} y,\;
\begin{bmatrix}
k_{u,u}^{n,n}(x_{*u}^{n}, x_{*u}^{n}) & k_{u,v}^{n,n}(x_{*u}^{n}, x_{*v}^{n})\\
 & k_{v,v}^{n,n}(x_{*v}^{n}, x_{*v}^{n})
\end{bmatrix}
- q^{T} K^{-1} q \right),
\]
where
\[
q^{T} =
\begin{bmatrix}
k_{u,u}^{n,n}(x_{*u}^{n}, x_{b}^{n}) & k_{u,u}^{n,n-\tau}(x_{*u}^{n}, x_{u}^{n-\tau}) & k_{u,f}^{n,n}(x_{*u}^{n}, x_{f}^{n}) & k_{u,u}^{n,n-1}(x_{*u}^{n}, x_{u}^{n-1}) & k_{u,v}^{n,n-1}(x_{*u}^{n}, x_{v}^{n-1})\\
k_{v,u}^{n,n}(x_{*v}^{n}, x_{b}^{n}) & k_{v,u}^{n,n-\tau}(x_{*v}^{n}, x_{u}^{n-\tau}) & k_{v,f}^{n,n}(x_{*v}^{n}, x_{f}^{n}) & k_{v,u}^{n,n-1}(x_{*v}^{n}, x_{u}^{n-1}) & k_{v,v}^{n,n-1}(x_{*v}^{n}, x_{v}^{n-1})
\end{bmatrix}.
\]
On the basis of Section 2.4, in order to propagate the uncertainty accurately, marginalize $u^{n-1}$, $v^{n-1}$ (and $u^{n-\tau}$ when $n > \tau$) by employing
\[
\begin{bmatrix} u^{n}\\ v^{n} \end{bmatrix}
\,\big|\, u_{b}^{1}, u_{b}^{2}, \ldots, u_{b}^{n};\; u^{1-\tau}, u^{2-\tau}, \ldots, u^{n-\tau};\; f^{1}, f^{2}, \ldots, f^{n};\; u^{0}, v^{0}
\sim \mathcal{N}\!\left( \begin{bmatrix} \mu_{u}^{n}\\ \mu_{v}^{n} \end{bmatrix},
\begin{bmatrix} \Sigma_{u,u}^{n,n} & \Sigma_{u,v}^{n,n}\\ & \Sigma_{v,v}^{n,n} \end{bmatrix} \right), \quad n \le \tau,
\]
\[
\begin{bmatrix} u^{n}\\ v^{n} \end{bmatrix}
\,\big|\, u_{b}^{1}, u_{b}^{2}, \ldots, u_{b}^{n};\; u^{1-\tau}, u^{2-\tau}, \ldots, u^{0};\; f^{1}, f^{2}, \ldots, f^{n};\; u^{0}, v^{0}
\sim \mathcal{N}\!\left( \begin{bmatrix} \mu_{u}^{n}\\ \mu_{v}^{n} \end{bmatrix},
\begin{bmatrix} \Sigma_{u,u}^{n,n} & \Sigma_{u,v}^{n,n}\\ & \Sigma_{v,v}^{n,n} \end{bmatrix} \right), \quad n > \tau,
\]
to obtain
\[
\begin{bmatrix} u^{n}(x_{*u}^{n})\\ v^{n}(x_{*v}^{n}) \end{bmatrix}
\,\big|\, u_{b}^{1}, u_{b}^{2}, \ldots, u_{b}^{n};\; u^{1-\tau}, u^{2-\tau}, \ldots, u^{n-\tau};\; f^{1}, f^{2}, \ldots, f^{n};\; u^{0}, v^{0}
\sim \mathcal{N}\!\left( \begin{bmatrix} \mu_{u}^{n}(x_{*u}^{n})\\ \mu_{v}^{n}(x_{*v}^{n}) \end{bmatrix},
\begin{bmatrix} \Sigma_{u,u}^{n,n}(x_{*u}^{n}, x_{*u}^{n}) & \Sigma_{u,v}^{n,n}(x_{*u}^{n}, x_{*v}^{n})\\ & \Sigma_{v,v}^{n,n}(x_{*v}^{n}, x_{*v}^{n}) \end{bmatrix} \right), \quad n \le \tau,
\]
\[
\begin{bmatrix} u^{n}(x_{*u}^{n})\\ v^{n}(x_{*v}^{n}) \end{bmatrix}
\,\big|\, u_{b}^{1}, u_{b}^{2}, \ldots, u_{b}^{n};\; u^{1-\tau}, u^{2-\tau}, \ldots, u^{0};\; f^{1}, f^{2}, \ldots, f^{n};\; u^{0}, v^{0}
\sim \mathcal{N}\!\left( \begin{bmatrix} \mu_{u}^{n}(x_{*u}^{n})\\ \mu_{v}^{n}(x_{*v}^{n}) \end{bmatrix},
\begin{bmatrix} \Sigma_{u,u}^{n,n}(x_{*u}^{n}, x_{*u}^{n}) & \Sigma_{u,v}^{n,n}(x_{*u}^{n}, x_{*v}^{n})\\ & \Sigma_{v,v}^{n,n}(x_{*v}^{n}, x_{*v}^{n}) \end{bmatrix} \right), \quad n > \tau,
\]
where $\begin{bmatrix} \mu_{u}^{n}(x_{*u}^{n})\\ \mu_{v}^{n}(x_{*v}^{n}) \end{bmatrix} = q^{T} K^{-1} \big[u_{b}^{n};\; \mu^{n-\tau};\; f^{n};\; \mu_{u}^{n-1};\; \mu_{v}^{n-1}\big]$ (if $n = 1$, replace $\mu_{u}^{n-1}$ and $\mu_{v}^{n-1}$ by $u^{n-1}$ and $v^{n-1}$, respectively; if $n \le \tau$, replace $\mu^{n-\tau}$ by $u^{n-\tau}$), $\begin{bmatrix} \Sigma_{u,u}^{n,n}(x_{*u}^{n}, x_{*u}^{n}) & \Sigma_{u,v}^{n,n}(x_{*u}^{n}, x_{*v}^{n})\\ & \Sigma_{v,v}^{n,n}(x_{*v}^{n}, x_{*v}^{n}) \end{bmatrix} = \begin{bmatrix} k_{u,u}^{n,n}(x_{*u}^{n}, x_{*u}^{n}) & k_{u,v}^{n,n}(x_{*u}^{n}, x_{*v}^{n})\\ & k_{v,v}^{n,n}(x_{*v}^{n}, x_{*v}^{n}) \end{bmatrix} - q^{T} K^{-1} q + q^{T} K^{-1} S K^{-1} q$, $S = \mathrm{diag}\big(0, \Sigma_{u,u}^{n-\tau,n-\tau}, 0, \Sigma_{2\times 2}^{n-1,n-1}\big)$ (if $n = 1$, replace $\Sigma_{2\times 2}^{n-1,n-1}$ by $0$; if $n \le \tau$, replace $\Sigma_{u,u}^{n-\tau,n-\tau}$ by $0$), and $\Sigma_{2\times 2}^{n-1,n-1} = \begin{bmatrix} \Sigma_{u,u}^{n-1,n-1} & \Sigma_{u,v}^{n-1,n-1}\\ & \Sigma_{v,v}^{n-1,n-1} \end{bmatrix}$.
By the posterior distribution (73), we obtain the simplified form $u^{n} \sim \mathcal{N}\big(\mu_{u}^{n}, \Sigma_{u,u}^{n,n}\big)$ and then the artificial data $\{x_{u}^{n}, u^{n}\}$ of the next time step, where $\mu_{u}^{n} = \mu_{u}^{n}(x_{u}^{n})$ and $\Sigma_{u,u}^{n,n} = \Sigma_{u,u}^{n,n}(x_{u}^{n}, x_{u}^{n})$.
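The overall time-marching structure of the backward Euler variant is summarized by the skeleton below. The two helper routines are stubs standing in for the NLML minimization and the propagated posterior evaluation described above, so the sketch only illustrates the data flow, in particular the switch from exact history data to artificial delay data once $t_{n} - s > 0$.

```python
import numpy as np

def train_step(*data):
    """Stub: would assemble K(theta) and minimize the NLML."""
    return None

def predict_step(theta, x_star, prev_cov):
    """Stub: would return the propagated posterior mean/covariance at x_star."""
    p = x_star.size
    return np.zeros(p), np.zeros((p, p))

def mp_ngp_march(x_u, u0, history, boundary, forcing, dt, n_steps, tau_steps):
    """Sketch of the MP-NGP loop: artificial data of step n-1 feed the regression of step n."""
    means = [u0(x_u)]
    covs = [np.zeros((x_u.size, x_u.size))]      # step 0 carries exact (zero-variance) data
    for n in range(1, n_steps + 1):
        t_n = n * dt
        # Delay data: exact history while t_n - s <= 0, artificial data afterwards.
        if n <= tau_steps:
            u_tau = history(x_u, t_n - tau_steps * dt)
        else:
            u_tau = means[n - tau_steps]
        theta = train_step(boundary(t_n), u_tau, forcing(x_u, t_n), means[n - 1])
        mu, Sigma = predict_step(theta, x_u, covs[n - 1])
        means.append(mu)
        covs.append(Sigma)
    return means, covs
```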
These two methods reflect the flexibility of NGPs and MP-NGPs. Owing to the limited space of this paper, we only introduce the processing of wave equations; other types of time-dependent PDEs can be solved with the same ideas.

5.3. Numerical Tests

Example 6.
\[
\begin{cases}
\dfrac{\partial^{2} u(x,t)}{\partial t^{2}} = \dfrac{\partial^{2} u(x,t)}{\partial x^{2}} + \dfrac{\partial u(x,t-0.2)}{\partial x} + f(x,t), & 0 < x < 2\pi,\ 0 < t \le 1,\\[4pt]
u(0,t) = e^{-t}, \quad u(2\pi,t) = e^{-t}, & 0 < t \le 1,\\[2pt]
u(x,t) = e^{-t}(\cos 2x + \sin x), & 0 \le x \le 2\pi,\ -0.2 \le t \le 0,\\[2pt]
u_{t}(x,0) = -\cos 2x - \sin x, & 0 \le x \le 2\pi.
\end{cases}
\]
Example 6 is the wave equation with one delay. The exact solution of (74) is $u(x,t) = e^{-t}(\cos 2x + \sin x)$, and the inhomogeneous term is $f(x,t) = e^{-t}(5\cos 2x + 2\sin x) - e^{0.2-t}(\cos x - 2\sin 2x)$.
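The consistency of this exact solution with the stated inhomogeneous term can be checked symbolically; the short script below (our verification, with the signs written out explicitly) substitutes $u(x,t)=e^{-t}(\cos 2x+\sin x)$ into (74) and confirms that the residual vanishes.

```python
import sympy as sp

x, t = sp.symbols("x t")
u = sp.exp(-t) * (sp.cos(2 * x) + sp.sin(x))            # exact solution of Example 6
f = sp.exp(-t) * (5 * sp.cos(2 * x) + 2 * sp.sin(x)) \
    - sp.exp(sp.Rational(1, 5) - t) * (sp.cos(x) - 2 * sp.sin(2 * x))

u_delay = u.subs(t, t - sp.Rational(1, 5))              # u(x, t - 0.2)
residual = sp.diff(u, t, 2) - sp.diff(u, x, 2) - sp.diff(u_delay, x) - f
print(sp.simplify(residual))                            # expected output: 0
```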
Numerical tests of solving (74) are performed with noiseless data. Figure 10 shows the posterior distribution of the predicted solution at different time nodes, where we fix the number of initial data points (inhomogeneous term data points, artificially generated data points and delay term data points) to be 15 and fix $\Delta t$ to be 0.01. Figure 10a,b presents the results of the experiments employing the second-order central difference scheme and the backward Euler scheme, respectively, and Figure 10c shows the maximum absolute error $E$ as time increases. Moreover, Table 8 lists the maximum absolute error $E$ at some time nodes in detail. According to the results of the numerical tests, both methods based on MP-NGPs solve the wave equation with one delay effectively, and the second method, which applies the backward Euler scheme, achieves better prediction accuracy than the first method, which applies the second-order central difference scheme.

6. Discussion

In this paper, a new method, called multi-prior numerical Gaussian processes (MP-NGPs), is designed and proposed on the basis of numerical Gaussian processes (NGPs) to solve complex time-dependent PDEs. Compared with NGPs, the proposed MP-NGPs possess a structure built on a stricter mathematical deduction, and their application is more flexible: delay terms and inhomogeneous terms can be incorporated directly. The main idea of MP-NGPs is to discretize time, apply Gaussian process regression to transform the latent information of the PDE into a scheme controlled by several operators, and predict the solution from a statistical perspective. MP-NGPs build a Gaussian process regression at each time step to predict the solution at the next time step. The basic idea of MP-NGPs is very similar to that of NGPs, but the applications of MP-NGPs are explored more deeply. Moreover, this methodology has natural advantages in handling noisy data and in quantifying the uncertainty of the predicted solution. It should be mentioned that MP-NGPs can only solve a class of fractional-order PDEs composed of the special fractional-order derivatives (2); other kinds of fractional-order derivatives still deserve further research within MP-NGPs. Furthermore, multi-spatial-dimension problems on irregular domains, the inverse problems of PDEs, such as estimating constant or variable coefficients based on Gaussian processes, and specific application problems of MP-NGPs may be directions of our future investigation.

7. Conclusions

In this manuscript, the capability of MP-NGPs to solve nonlinear delay partial differential equations with multi-delays is explored. The equations investigated include partial differential equations, partial integro-differential equations, fractional partial differential equations and wave equations. We also consider Neumann and mixed boundary conditions of PDEs within MP-NGPs. Numerical experiments show that the proposed method can solve nonlinear delay partial differential equations with satisfactory precision, and the algorithm applying the backward Euler scheme is stably first-order accurate in time. To improve the prediction accuracy, Runge–Kutta methods under the framework of MP-NGPs are proposed in this manuscript. The algorithm applying the Runge–Kutta methods effectively improves the prediction accuracy, although its temporal convergence rate is still first order according to the numerical experiments.

Author Contributions

Conceptualization, W.G., W.Z. and Y.H.; methodology, W.G. and W.Z.; software, W.G. and W.Z.; validation, W.G. and Y.H.; formal analysis, W.G. and Y.H.; investigation, W.G., W.Z. and Y.H.; writing—original draft preparation, W.Z.; writing—review and editing, W.G. and Y.H.; visualization, W.G., W.Z. and Y.H.; supervision, W.G. and Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 71974204).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Basic workflow of MP-NGPs.
Figure 2. Fractional equation of Example 1: the posterior distribution of the predicted solution at different time nodes, where $(\alpha, \beta)$ is fixed to be $(0.25, 0.75)$. The red solid line represents the exact solution, the blue dashed line represents the posterior mean, and the green shading indicates two posterior standard deviations centered on the posterior mean. Noiseless data and noisy data are used, respectively.
Figure 3. Fractional equation of Example 1: the posterior distribution of the predicted solution at different time nodes, where ( α , β ) is fixed to be ( 0.5 , 0.5 ) . Noiseless data and noisy data are used, respectively.
Figure 4. Fractional equation of Example 1: the posterior distribution of the predicted solution at different time nodes, where ( α , β ) is fixed to be ( 0.75 , 0.25 ) . Noiseless data and noisy data are used, respectively.
Figure 5. Integro-differential equation: the posterior distribution of the predicted solution at different time nodes, where Δ t = 0.01 . Noiseless data and noisy data are used, respectively.
Figure 6. Problem of Example 3: Time versus the relative spatial error L 2 and maximum absolute error E until the final time step is reached, where Δ t is fixed to be 0.01.
Figure 7. Results of Example 4: (a) impact of the number of initial data points (inhomogeneous term data points, artificially generated data points and delay term data points) on $\max E(\Delta t) = \max_{1 \le n \le N} E(t_{n})$, when the backward Euler scheme is employed; $\Delta t$ is fixed to be $10^{-1}$, $10^{-2}$ and $10^{-3}$ in turn. (b) Impact of the level of the additive noise on the maximum absolute error $E$, when the backward Euler scheme is used; the number of initial data points (inhomogeneous term data points, artificially generated data points and delay term data points) is fixed to be 20, and $\Delta t = 0.01$.
Figure 8. Problem of Example 1: (a) The posterior distribution of the predicted solution at different time nodes, where the backward Euler scheme is used. Fix the number of initial data points (inhomogeneous term data points, artificially generated data points and delay term data points) to be 30, and take Δ t = 0.01 . (b) The posterior distribution of the predicted solution at different time nodes, where the trapezoidal scheme is used. The other experimental conditions are the same as those with the backward Euler scheme. (c) Time versus maximum absolute error E until the final time step is reached, where the backward Euler scheme and the trapezoidal scheme are both considered.
Figure 9. (a) The posterior distribution of the predicted solution at different time nodes, where Dirichlet boundary conditions are employed. (b) The posterior distribution of the predicted solution at different time nodes, where Neumann boundary conditions are employed. (c) The posterior distribution of the predicted solution at different time nodes, where mixed boundary conditions are employed. (d) Time versus the maximum absolute error E until the final time step is reached.
Figure 10. Wave equation of Example 6: (a) The posterior distribution of the predicted solution at different time nodes, where the second order central difference scheme is used. (b) The posterior distribution of the predicted solution at different time nodes, where the backward Euler scheme is used. (c) Time versus maximum absolute error E until the final time step is reached, where the second order central difference scheme (SCD) and the backward Euler scheme (BE) are both considered.
Table 1. Fractional equation of Example 1: maximum absolute error E at some time nodes, where Δ t = 0.01 , ( α , β ) is fixed to be ( 0.25 , 0.75 ) , ( 0.5 , 0.5 ) , and ( 0.75 , 0.25 ) in turn, and noiseless data are used.
t: 0.1, 0.2, 0.3, 0.4, 0.5
(α, β) = (0.25, 0.75): 4.006275 × 10^{-3}, 4.169644 × 10^{-3}, 5.313127 × 10^{-3}, 7.771295 × 10^{-3}, 1.038458 × 10^{-2}
(α, β) = (0.5, 0.5): 4.289846 × 10^{-3}, 5.309005 × 10^{-3}, 8.044738 × 10^{-3}, 1.138058 × 10^{-2}, 1.385240 × 10^{-2}
(α, β) = (0.75, 0.25): 4.731742 × 10^{-3}, 6.114796 × 10^{-3}, 8.990011 × 10^{-3}, 1.248342 × 10^{-2}, 1.469581 × 10^{-2}
t: 0.6, 0.7, 0.8, 0.9, 1.0
(α, β) = (0.25, 0.75): 1.022259 × 10^{-2}, 7.038555 × 10^{-3}, 3.247575 × 10^{-3}, 5.014752 × 10^{-3}, 1.554705 × 10^{-2}
(α, β) = (0.5, 0.5): 1.270620 × 10^{-2}, 7.946964 × 10^{-3}, 2.464540 × 10^{-3}, 7.199783 × 10^{-3}, 1.560388 × 10^{-2}
(α, β) = (0.75, 0.25): 1.318964 × 10^{-2}, 8.319928 × 10^{-3}, 2.440425 × 10^{-3}, 6.997903 × 10^{-3}, 2.024694 × 10^{-2}
Table 2. Fractional equation of Example 1: maximum absolute error E at some time nodes, where Δ t = 0.01 , ( α , β ) is fixed to be ( 0.25 , 0.75 ) , ( 0.5 , 0.5 ) , and ( 0.75 , 0.25 ) in turn, and noisy data are used.
t: 0.1, 0.2, 0.3, 0.4, 0.5
(α, β) = (0.25, 0.75): 9.751693 × 10^{-3}, 9.933069 × 10^{-3}, 7.915402 × 10^{-3}, 5.985707 × 10^{-3}, 4.335663 × 10^{-3}
(α, β) = (0.5, 0.5): 1.037415 × 10^{-2}, 1.226151 × 10^{-2}, 1.153429 × 10^{-2}, 9.480904 × 10^{-3}, 6.599661 × 10^{-3}
(α, β) = (0.75, 0.25): 1.107313 × 10^{-2}, 1.362223 × 10^{-2}, 1.362442 × 10^{-2}, 1.230488 × 10^{-2}, 9.452340 × 10^{-3}
t: 0.6, 0.7, 0.8, 0.9, 1.0
(α, β) = (0.25, 0.75): 3.930905 × 10^{-3}, 4.989465 × 10^{-3}, 1.076718 × 10^{-2}, 2.762690 × 10^{-2}, 2.925931 × 10^{-2}
(α, β) = (0.5, 0.5): 3.722859 × 10^{-3}, 4.222247 × 10^{-3}, 7.953070 × 10^{-3}, 2.115376 × 10^{-2}, 2.769300 × 10^{-2}
(α, β) = (0.75, 0.25): 5.542730 × 10^{-3}, 3.738703 × 10^{-3}, 8.613066 × 10^{-3}, 2.476015 × 10^{-2}, 5.135747 × 10^{-2}
Table 3. Integro-differential equation of Example 2: maximum absolute error E at some time nodes, where Δt = 0.01; experiments are performed with noiseless data and noisy data, respectively.
t: 0.1, 0.2, 0.3, 0.4, 0.5
Noiseless data: 1.998069 × 10^{-4}, 5.508548 × 10^{-4}, 9.958003 × 10^{-4}, 1.473446 × 10^{-3}, 2.054287 × 10^{-3}
Noisy data: 1.857561 × 10^{-2}, 1.760902 × 10^{-2}, 6.577157 × 10^{-3}, 3.764989 × 10^{-2}, 3.478251 × 10^{-2}
t: 0.6, 0.7, 0.8, 0.9, 1.0
Noiseless data: 2.789148 × 10^{-3}, 3.663247 × 10^{-3}, 4.655392 × 10^{-3}, 5.852885 × 10^{-3}, 7.195341 × 10^{-3}
Noisy data: 5.215384 × 10^{-3}, 2.053576 × 10^{-2}, 1.473903 × 10^{-2}, 7.691779 × 10^{-3}, 1.786041 × 10^{-2}
Table 4. Problem of Example 3: Relative spatial error L 2 and maximum absolute error E at some time nodes, where Δ t is fixed to be 0.01.
t: 0.1, 0.2, 0.3, 0.4, 0.5
L_2: 1.255236 × 10^{-2}, 1.287883 × 10^{-2}, 1.319996 × 10^{-2}, 1.311049 × 10^{-2}, 1.293817 × 10^{-2}
E: 3.240473 × 10^{-2}, 3.420406 × 10^{-2}, 3.573738 × 10^{-2}, 3.626811 × 10^{-2}, 3.617718 × 10^{-2}
t: 0.6, 0.7, 0.8, 0.9, 1.0
L_2: 1.339241 × 10^{-2}, 1.301079 × 10^{-2}, 1.289006 × 10^{-2}, 1.318331 × 10^{-2}, 1.341626 × 10^{-2}
E: 3.835969 × 10^{-2}, 3.759043 × 10^{-2}, 3.745873 × 10^{-2}, 3.913333 × 10^{-2}, 4.018637 × 10^{-2}
Table 5. Problem of Example 4: maximum absolute error E at five time nodes for different Δ t under the backward Euler scheme and the convergence order in time.
Columns: Δt; E at t = 0.2, 0.4, 0.6, 0.8, 1.0; max E(Δt); max E(2Δt)/max E(Δt)
Δt = 0.2: 5.196883 × 10^{-2}, 4.358087 × 10^{-2}, 3.277721 × 10^{-2}, 2.416694 × 10^{-2}, 1.781159 × 10^{-2}; max E(Δt) = 5.196883 × 10^{-2}; ratio = *
Δt = 0.1: 2.643025 × 10^{-2}, 2.202658 × 10^{-2}, 1.643945 × 10^{-2}, 1.206984 × 10^{-2}, 8.922375 × 10^{-3}; max E(Δt) = 2.643025 × 10^{-2}; ratio = 1.966
Δt = 0.05: 1.354968 × 10^{-2}, 1.112114 × 10^{-2}, 8.246767 × 10^{-3}, 6.030003 × 10^{-3}, 4.610772 × 10^{-3}; max E(Δt) = 1.354968 × 10^{-2}; ratio = 1.951
Δt = 0.025: 6.893477 × 10^{-3}, 5.593641 × 10^{-3}, 4.121273 × 10^{-3}, 3.001243 × 10^{-3}, 2.359743 × 10^{-3}; max E(Δt) = 6.893477 × 10^{-3}; ratio = 1.966
Δt = 0.0125: 3.459075 × 10^{-3}, 2.800349 × 10^{-3}, 2.065297 × 10^{-3}, 1.507394 × 10^{-3}, 1.192749 × 10^{-3}; max E(Δt) = 3.462585 × 10^{-3}; ratio = 1.991
Δt = 0.00625: 1.753664 × 10^{-3}, 1.414149 × 10^{-3}, 1.036199 × 10^{-3}, 7.571422 × 10^{-4}, 5.988719 × 10^{-4}; max E(Δt) = 1.763992 × 10^{-3}; ratio = 1.963
Table 6. Problem of Example 4: maximum absolute error E at five time nodes for different Δ t under the trapezoidal scheme and the convergence order in time.
Columns: Δt; E at t = 0.2, 0.4, 0.6, 0.8, 1.0; max E(Δt); max E(2Δt)/max E(Δt)
Δt = 0.2: 4.495057 × 10^{-2}, 1.990879 × 10^{-2}, 1.849233 × 10^{-2}, 1.384659 × 10^{-2}, 1.045600 × 10^{-2}; max E(Δt) = 4.495057 × 10^{-2}; ratio = *
Δt = 0.1: 1.638529 × 10^{-2}, 1.275052 × 10^{-2}, 9.718359 × 10^{-3}, 7.421787 × 10^{-3}, 5.774761 × 10^{-3}; max E(Δt) = 1.747229 × 10^{-2}; ratio = 2.573
Δt = 0.05: 8.155066 × 10^{-3}, 6.595744 × 10^{-3}, 5.033765 × 10^{-3}, 3.856745 × 10^{-3}, 2.979750 × 10^{-3}; max E(Δt) = 8.184114 × 10^{-3}; ratio = 2.135
Δt = 0.025: 4.079031 × 10^{-3}, 3.340249 × 10^{-3}, 2.562043 × 10^{-3}, 1.969803 × 10^{-3}, 1.528144 × 10^{-3}; max E(Δt) = 4.088401 × 10^{-3}; ratio = 2.002
Δt = 0.0125: 2.029881 × 10^{-3}, 1.701985 × 10^{-3}, 1.300908 × 10^{-3}, 9.986143 × 10^{-4}, 7.749402 × 10^{-4}; max E(Δt) = 2.043320 × 10^{-3}; ratio = 2.001
Δt = 0.00625: 1.036153 × 10^{-3}, 8.602172 × 10^{-4}, 6.599498 × 10^{-4}, 5.096151 × 10^{-4}, 3.956461 × 10^{-4}; max E(Δt) = 1.040213 × 10^{-3}; ratio = 1.964
Table 7. Example 5: maximum absolute error E at some time nodes, where Δt = 0.01; Dirichlet, Neumann and mixed boundary conditions are used, respectively.
t: 0.1, 0.2, 0.3, 0.4, 0.5
Dirichlet: 1.933660 × 10^{-2}, 1.133342 × 10^{-2}, 9.185798 × 10^{-3}, 1.085891 × 10^{-2}, 1.424333 × 10^{-2}
Neumann: 2.687072 × 10^{-3}, 2.880986 × 10^{-3}, 3.562039 × 10^{-3}, 6.123954 × 10^{-3}, 1.009985 × 10^{-2}
Mixed: 6.755766 × 10^{-3}, 6.675441 × 10^{-3}, 3.984626 × 10^{-3}, 6.110999 × 10^{-3}, 9.978279 × 10^{-3}
t: 0.6, 0.7, 0.8, 0.9, 1.0
Dirichlet: 1.890978 × 10^{-2}, 2.444410 × 10^{-2}, 3.139654 × 10^{-2}, 4.025414 × 10^{-2}, 5.092472 × 10^{-2}
Neumann: 1.704219 × 10^{-2}, 2.634275 × 10^{-2}, 3.677554 × 10^{-2}, 4.920981 × 10^{-2}, 6.465664 × 10^{-2}
Mixed: 1.571632 × 10^{-2}, 2.541023 × 10^{-2}, 3.595499 × 10^{-2}, 4.830915 × 10^{-2}, 6.284298 × 10^{-2}
Table 8. Wave equation of Example 6: maximum absolute error E at some time nodes, where Δt = 0.01; the second-order central difference scheme (SCD) and the backward Euler scheme (BE) are applied, respectively.
t: 0.1, 0.2, 0.3, 0.4, 0.5
SCD: 3.521926 × 10^{-2}, 1.731871 × 10^{-2}, 2.562235 × 10^{-2}, 3.339859 × 10^{-2}, 3.854518 × 10^{-2}
BE: 9.577671 × 10^{-4}, 1.647835 × 10^{-3}, 2.197030 × 10^{-3}, 2.490295 × 10^{-3}, 2.566658 × 10^{-3}
t: 0.6, 0.7, 0.8, 0.9, 1.0
SCD: 4.179861 × 10^{-2}, 4.218729 × 10^{-2}, 3.356628 × 10^{-2}, 2.681414 × 10^{-2}, 2.301241 × 10^{-2}
BE: 2.471149 × 10^{-3}, 2.195480 × 10^{-3}, 1.811016 × 10^{-3}, 1.411214 × 10^{-3}, 1.300891 × 10^{-3}
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
