Application of neural networks to inverse elastic scattering problems with near-field measurements

Abstract: This paper is concerned with the application of a machine learning approach to inverse elastic scattering problems via neural networks. In the forward problem, the displacements are approximated by linear combinations of the fundamental tensors of the Cauchy-Navier equations of elasticity, which are expressed in terms of sources placed inside the elastic solid. From the near-field measurement data, a two-layer neural network built from gated recurrent units (a GRU-to-GRU, sequence-to-sequence architecture) is used to reconstruct the shape of an unknown elastic body. Moreover, the convergence of the method is proved. Finally, the feasibility and effectiveness of the presented method are examined through numerical examples.


Introduction
Direct and inverse time-harmonic elastic wave scattering by a bounded body [1] has a wide range of applications, such as radar/sonar, geophysical exploration, medical imaging, seismology and materials sciences [2][3][4][5]. The forward problem is to determine and approximate the diffracted field given an incident wave and the elastic body. We refer the reader to, e.g., [6,7] for the existence and uniqueness of forward solutions via the variational approach. The inverse scattering problem (see e.g., [8][9][10]) is to determine the physical or geometric information of the scatterer based on knowledge of the scattered field. This paper is concerned with both direct and inverse problems.
In recent years, numerical approaches for position and shape reconstruction in many elastic scattering problems have drawn the attention of a large number of scholars. In [11], Arens proved that, in two dimensions, the shape of an elastically rigid body can be determined from the compressional and shear far-field patterns for all incident elastic plane waves with distinct directions at a fixed frequency. In [7], Bao et al. investigated the direct and inverse problems for a penetrable anisotropic elastic body embedded in a homogeneous and isotropic background medium. They showed the uniqueness and existence of weak solutions by using the Fredholm alternative together with properties of the Dirichlet-to-Neumann map in both two and three dimensions. Moreover, an iterative method was designed to recover the interface from the multi-frequency near-field data. Charalambopoulos et al. [12] studied the inverse elastic scattering problem for penetrable isotropic bodies embedded in an isotropic host environment via the factorization method. Hu et al. [13] established some uniqueness results for a ball or a convex Lipschitz polyhedron and adapted the factorization method to recover the shape of a rigid body from the scattered S-waves or P-waves corresponding to all incident plane shear or pressure waves. Recently, a reverse time migration method was proposed by Chen and Huang [14]. In [15], Li et al. presented an iterative method to solve the inverse elastic scattering problem, in which the domain derivative expresses the rate of change of the scattered field with respect to the surface displacement; a continuation method using multiple frequency data was established for the inverse problem.
The method of fundamental solutions (MFS) is an effective method for solving both direct and inverse problems governed by partial differential equations (PDEs). Since it is very simple and easy to program, it has been attracting an increasing amount of attention in recent years. Various engineering problems, including the acoustic scattering problems [16,17] and the acoustic-elastic scattering problem [18], can be solved by the MFS. The MFS can also be used to solve the inverse acoustic scattering problems (see e.g., [19][20][21]). Alves et al. [22] used the MFS to solve non-homogeneous elastic wave equations.
In recent years, neural networks, one of the most popular schemes in machine learning, have been applied in many research studies, including inverse scattering problems [23,24]. The seq2seq neural network has been applied to natural language processing and machine translation in the literature [25,26]. In [27], a new SwitchNet with a sparse neural network structure was proposed to solve the wave equation-based backscattering problems. Neural networks are especially suitable for dealing with inaccurate and vague information processing problems involving many parameters and constraints, and they can fully approximate complex nonlinear maps. Yin et al. [23] proposed a two-layer neural network method to recover the location and shape of an obstacle by using phaseless far-field data.
In this paper, we consider inverse elastic scattering problems related to recovering the shape of a rigid body and a cavity by applying a two-layer gated recurrent unit (GRU) neural network. The convergence of the presented method will be verified. The contributions of this paper are summarized as follows.
(1) We propose a fast numerical scheme for solving the direct problem; (2) we prove the convergence of a two-layer GRU neural network; (3) we test the accuracy of the reconstruction method by numerical examples. The rest of the paper is arranged as follows. In Section 2, we formulate the elastic scattering problem and describe the numerical scheme for forward solutions. In Section 3, we introduce the architecture of the neural network. In Section 4, we perform theoretical analysis of the convergence of the network. In Section 5, numerical experiments are presented to verify the efficiency and effectiveness of the neural network.

Formulation and solution method
Let D ⊂ Ω be a two-dimensional elastic body with the smooth boundary ∂D, where Ω ⊂ R² is a subdomain. The exterior domain Dᶜ := R² \ D is supposed to be connected and filled with a homogeneous and isotropic elastic medium. The displacement u(x, t) satisfies

µ△u(x, t) + (λ + µ)∇(∇ · u(x, t)) = ρ ∂²u(x, t)/∂t².

In general, the displacement u(x, t) is time harmonic and given by u(x, t) = e^{iωt} v(x) (where i is the imaginary unit); thus, the displacement v(x) satisfies the Navier equation

µ△v + (λ + µ)∇(∇ · v) + ω²ρ v = 0 in Dᶜ, (2.1)

where ω > 0 is the angular frequency, λ, µ are the Lamé constants satisfying that µ > 0 and λ + µ > 0, and ρ is the density function, which is assumed to be constant.
The forward problem considered in this paper is, given an incident elastic plane wave

u^{incident} = d e^{i k_p d·x} + d^⊥ e^{i k_s d·x}, d = (cos ϑ, sin ϑ), d^⊥ = (− sin ϑ, cos ϑ), ϑ ∈ [0, 2π),

to find the scattered field v = (v₁, v₂)^⊤ that satisfies the Navier equation.
By the well-known Hodge decomposition, any solution v to (2.1) can be decomposed into v = v_p + v_s, where v_p := −(1/k_p²) grad div v is known as the compressional part of v and v_s := (1/k_s²) curl curl v as the shear part. Here, k_s = ω√(ρ/µ) is the shear wave number and k_p = ω√(ρ/(λ + 2µ)) is the compressional wave number. For a rigid body D, the scattered field v satisfies the first (Dirichlet) kind of boundary condition, i.e., it holds that

v = −u^{incident} on ∂D, (2.2)

so that the total displacement is zero.
If D is a cavity, the following second (Neumann) kind of boundary condition holds:

t_n v = −t on ∂D, (2.3)

with t := t_n u^{incident}. Here, the traction operator t_n is defined componentwise by

t_n v := Σ_{i,j=1}^{2} σ_{ij}(v) n_j ê_i, σ(v) := λ (div v) I + µ (∇v + (∇v)^⊤), (2.4)

where ê_i, i = 1, 2, designates the unit vector along the x_i axis and n = (n₁, n₂) denotes the unit normal direction pointing into Dᶜ. To ensure uniqueness, we require the scattered field v to satisfy the two-dimensional Kupradze-Sommerfeld radiation condition at infinity, that is,

lim_{r→∞} √r (∂v_α/∂r − i k_α v_α) = 0, r = |x|, α ∈ {p, s}.

In this paper, we will employ the MFS (see [28] for the acoustic scattering problem) to calculate the forward solution v. The main idea of the MFS is to approximate the scattered field in the solution
domain Dᶜ by using a linear combination of fundamental solutions [18,29] to the Navier equation (2.1) with N source points y_j ∈ D, i.e.,

v(x) ≈ Σ_{j=1}^{N} U(x, y_j) (a_j, b_j)^⊤, x ∈ Dᶜ. (2.5)

Here, a_j, b_j are the coefficients to be determined by the boundary condition on ∂D and U(x, y) is Green's tensor of the two-dimensional Lamé system of the form

U(x, y) = (1/µ) Φ_{k_s}(x, y) I + (1/ω²ρ) ∇_x ∇_x^⊤ [Φ_{k_s}(x, y) − Φ_{k_p}(x, y)],

where I is the 2-by-2 identity matrix and

Φ_k(x, y) = (i/4) H_0^{(1)}(k|x − y|)

is the fundamental solution of the Helmholtz equation associated with the wavenumber k. Here, H_0^{(1)} is the Hankel function of the first kind of order zero.
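As a concrete illustration, Green's tensor above can be evaluated numerically. The sketch below uses illustrative material parameters (λ = µ = ρ = 1, ω = π, none taken from the paper), computes Φ_k with `scipy.special.hankel1`, and approximates the Hessian ∇_x∇_x^⊤ by central differences rather than closed-form derivatives; it is a minimal sketch, not the authors' implementation.

```python
import numpy as np
from scipy.special import hankel1

# Illustrative material parameters (assumed for this sketch only)
lam, mu, rho, omega = 1.0, 1.0, 1.0, np.pi
ks = omega * np.sqrt(rho / mu)                # shear wave number k_s
kp = omega * np.sqrt(rho / (lam + 2.0 * mu))  # compressional wave number k_p

def phi(x, y, k):
    """Fundamental solution (i/4) H_0^{(1)}(k|x - y|) of the 2D Helmholtz equation."""
    r = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return 0.25j * hankel1(0, k * r)

def hessian_phi(x, y, k, h=1e-4):
    """Hessian of phi with respect to x by central differences (sketch only)."""
    x = np.asarray(x, dtype=float)
    H = np.zeros((2, 2), dtype=complex)
    for i in range(2):
        for j in range(2):
            e_i, e_j = np.eye(2)[i] * h, np.eye(2)[j] * h
            H[i, j] = (phi(x + e_i + e_j, y, k) - phi(x + e_i - e_j, y, k)
                       - phi(x - e_i + e_j, y, k) + phi(x - e_i - e_j, y, k)) / (4.0 * h * h)
    return H

def green_tensor(x, y):
    """Kupradze tensor U(x, y) of the 2D Navier equation."""
    U = phi(x, y, ks) / mu * np.eye(2)
    U = U + (hessian_phi(x, y, ks) - hessian_phi(x, y, kp)) / (omega ** 2 * rho)
    return U

U = green_tensor([1.3, 0.4], [0.0, 0.0])
```

Since the tensor is built from Φ_k I and a (symmetric) Hessian, the computed 2-by-2 matrix is symmetric, which gives a quick sanity check on the implementation.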
Remark 2.1. Following the idea in [29], the solution v(x) of this problem can be approximated by the single-layer potential function, with ψ ∈ (L²(∂B))², as follows:

v(x) ≈ (Sψ)(x) := ∫_{∂B} U(x, y) ψ(y) ds(y), x ∈ Dᶜ, (2.6)

where B ⊂ D. If we discretize the integral operator S, we can get a linear combination of the fundamental solutions with respect to some source points located on ∂B with the unknown density function ψ(y_j). It can be seen that (2.5) is a certain discrete form of (2.6) with some weight coefficients.
We respectively define the compressional and shear parts of U(x, y) as follows:

U_p(x, y) = −(1/ω²ρ) ∇_x ∇_x^⊤ Φ_{k_p}(x, y) (2.7)

and

U_s(x, y) = (1/µ) Φ_{k_s}(x, y) I + (1/ω²ρ) ∇_x ∇_x^⊤ Φ_{k_s}(x, y). (2.8)

Recalling the definitions of the traction vector (2.4) and the fundamental solution to the Navier equation, one can approximate the traction vector on the boundary ∂D as follows:

t_n v(x) ≈ Σ_{j=1}^{N} T(x, y_j) (a_j, b_j)^⊤, x ∈ ∂D. (2.9)

Here, the matrix T(x, y_j) is obtained by applying the traction operator t_n to each column of U(x, y_j). To determine the coefficients a_j, b_j, j = 1, 2, ..., N, we employ the collocation method by choosing M collocation points {x_j}_{j=1}^{M} ⊂ ∂D and imposing the boundary condition on ∂D.
Let d′ := (a₁, b₁, a₂, b₂, ..., a_N, b_N)^⊤ be the vector of 2N unknown parameters. It can be calculated by solving a linear system with 2M equations of the form

A d′ = h′, (2.10)

where the matrix A ∈ C^{2M×2N} on the left-hand side and the vector h′ ∈ C^{2M×1} on the right-hand side are defined depending on the boundary condition on ∂D: for the Dirichlet condition (2.2), they are assembled from U(x_i, y_j) and −u^{incident}(x_i), and for the Neumann condition (2.3), from T(x_i, y_j) and −t(x_i), respectively.
In our numerics, we choose M = N in all examples. The algebraic linear system (2.10) is solved directly by using Matlab with the command d′ = A\h′. Then we can get the scattered near-field data v on ∂Ω from (2.5). One can also get the shear part v_s|_{∂Ω} and the compressional part v_p|_{∂Ω} of v by replacing Green's tensor U by its shear and compressional parts (2.8) and (2.7), respectively. These synthetic data will be used as the measurement data in our inverse problem. In our examples below, we suppose, for simplicity, that the boundary ∂Ω can be parameterized as follows:

∂Ω = {(a + R cos θ, b + R sin θ) : θ ∈ [0, 2π)}.

This means that the measurement data are taken on the circle centered at (a, b) ∈ R² with the radius R > 0.
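The source-placement/collocation/backslash pattern above can be sketched compactly. Since assembling the full elastic tensor blocks is lengthy, the snippet below illustrates the same pattern on the scalar Helmholtz analogue of (2.5)-(2.10); the geometry, wavenumber and the manufactured exact solution are illustrative assumptions, and `numpy.linalg.lstsq` plays the role of Matlab's backslash.

```python
import numpy as np
from scipy.special import hankel1

k = 1.0        # illustrative wavenumber
N = M = 64     # numbers of sources and collocation points (M = N as in the text)
t = 2.0 * np.pi * np.arange(N) / N
sources = 0.5 * np.column_stack([np.cos(t), np.sin(t)])  # source points inside D
colloc = 1.0 * np.column_stack([np.cos(t), np.sin(t)])   # collocation points on the unit circle

def Phi(x, y):
    """Fundamental solution of the 2D Helmholtz equation (vectorized over points)."""
    r = np.linalg.norm(x - y, axis=-1)
    return 0.25j * hankel1(0, k * r)

# Manufactured "exact" exterior field: a point source placed inside the source curve.
z0 = np.array([0.2, 0.1])
exact = lambda x: Phi(x, z0)

# Assemble the collocation matrix A and the right-hand side h', then solve A d' = h'.
A = Phi(colloc[:, None, :], sources[None, :, :])  # M x N collocation matrix
h = exact(colloc)                                 # Dirichlet boundary data
d, *_ = np.linalg.lstsq(A, h, rcond=None)

# Evaluate the MFS approximation on a "measurement circle" point at radius R = 3.
x_meas = np.array([3.0, 0.0])
v_mfs = Phi(x_meas[None, :], sources) @ d
err = abs(v_mfs - exact(x_meas)) / abs(exact(x_meas))
```

Because the manufactured field radiates from inside the source curve, the least-squares solve reproduces it on the measurement circle to high accuracy, mirroring how the synthetic near-field data are generated in the elastic case.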
The inverse problem in this paper is to recover the shape of the obstacle or the cavity, namely ∂D, from knowledge of the near-field data on ∂Ω associated with a fixed incident field.

Assumption 2.1. The boundary of the obstacle D is assumed to take the form

∂D = {p(θ)(cos θ, sin θ) : θ ∈ [0, 2π)},

where p(θ) permits the following (truncated) Fourier representation:

p(θ) = a₀/2 + Σ_{m=1}^{Q} (a_m cos(mθ) + b_m sin(mθ)),

where Q ∈ N. Below, we collect the Fourier coefficients a₀, a_m and b_m into a vector y. In the remaining part of this paper, we shall consider the inverse problem of recovering y from our measurement data set.
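A minimal sketch of Assumption 2.1: given a candidate coefficient vector, the radial function p(θ) and the boundary points can be assembled as follows (the function name and the choice Q = 1 are illustrative).

```python
import numpy as np

def boundary(coeffs_a, coeffs_b, theta):
    """Boundary points p(θ)(cos θ, sin θ) with
    p(θ) = a0/2 + Σ_m (a_m cos mθ + b_m sin mθ).

    coeffs_a = (a0, a1, ..., aQ), coeffs_b = (b1, ..., bQ); names are illustrative.
    """
    p = np.full_like(theta, coeffs_a[0] / 2.0)
    for m, a_m in enumerate(coeffs_a[1:], start=1):
        p += a_m * np.cos(m * theta)
    for m, b_m in enumerate(coeffs_b, start=1):
        p += b_m * np.sin(m * theta)
    return p[:, None] * np.column_stack([np.cos(theta), np.sin(theta)])

theta = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
pts = boundary([2.0, 0.3], [0.1], theta)  # a Q = 1 example boundary
```

With only a₀ = 2 present, p(θ) ≡ 1 and the construction reduces to the unit circle, which is a convenient sanity check.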

Architecture of neural networks
In this section, we first introduce the main components of the neural network. We then describe how the neural network is constructed to deal with the inverse elastic scattering problem.
Figure 1 gives a schematic illustration of the working principle of the GRU cells (see e.g., [23,30]), i.e., the principal elements of the update network. This network consists of many GRU neurons. (1) The cell receives a stimulus, i.e., an input I(t) (t = 1, 2, ⋯, T), and responds to it. We describe this response as an activation function acting on the previous output h(t − 1) and the present stimulus I(t) to produce a reset gate r(t), which controls whether the candidate status h̃(t) depends on the prior status of learning.
(2) It combines the historical output h(t − 1) and the present input I(t) to obtain the update gate z(t), which is used to choose and forget memory. In other words, it can forget some information passed down in h(t − 1) and select and memorize the information in the candidate status h̃(t).
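The two gating steps above follow the standard GRU equations. The following is a minimal NumPy sketch of one cell, with biases omitted and random illustrative weights (not the authors' trained parameters).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(h_prev, x_t, W):
    """One GRU step: reset gate r(t), update gate z(t), candidate status, output h(t)."""
    hx = np.concatenate([h_prev, x_t])
    r = sigmoid(W["wr"] @ hx)                                       # reset gate r(t)
    z = sigmoid(W["wz"] @ hx)                                       # update gate z(t)
    h_cand = np.tanh(W["wh"] @ np.concatenate([r * h_prev, x_t]))   # candidate status
    return (1.0 - z) * h_prev + z * h_cand                          # new output h(t)

rng = np.random.default_rng(0)
d_h, d_x = 8, 4                     # illustrative hidden and input dimensions
W = {k: 0.1 * rng.standard_normal((d_h, d_h + d_x)) for k in ("wr", "wz", "wh")}
h = np.zeros(d_h)
for t in range(5):                  # feed a short random input sequence
    h = gru_cell(h, rng.standard_normal(d_x), W)
```

Since the output is a convex combination of the previous state and a tanh-bounded candidate, the hidden state stays in (−1, 1) componentwise, reflecting the "select and forget" behavior described above.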
For clarity, the notations used in the neural network are listed in Table 1.

Construction of the network
The whole model consists of three parts, namely, the encoding end, the middle status and the decoding end. The role of the encoding end is to extract data features, and the intermediate state serves to connect the encoding and decoding ends. The function of the decoder is to decode the information obtained by the encoder to obtain the output sequence. Next, the relevant details of the network structure are illustrated below.
Table 1. The notations used in GRU neurons.

r(t): the reset gate
z(t): the update gate
w_r: the reset gate weight
w_h: the candidate status weight
w_z: the update gate weight
σ(x) = 1/(1 + e^{−x}): the sigmoid activation function
tanh(x) = (e^x − e^{−x})/(e^x + e^{−x}): the hyperbolic tangent activation function
h(t): the GRU output
⊙: the Hadamard (elementwise) product of matrices
⊕: elementwise addition

Step 1: The scattered data features are extracted by using GRU neurons on the encoding end, i.e., h₁(t) = ζ_t(h₁(t − 1), I(t)), where ζ_t signifies the t-th GRU cell in the course of the work. That is, it is equivalent to extracting the input data features by using (3.1)-(3.5). The first layer thus extracts the scattered data as h₁ = (h₁(1), h₁(2), ⋯, h₁(T)).

Step 2: Extracting features again. Take the features extracted for the first time, i.e., h₁ = (h₁(1), h₁(2), ⋯, h₁(T)), as input to extract the features again by means of (3.1)-(3.5).

Step 3: First decoding process on the decoding end. The decoder side uses N GRU neurons to decode the second-layer output information h₂(T) into the obstacle boundary Fourier expansion coefficients y = (y(1), y(2), ⋯, y(N)), where ζ_n signifies the operation of the n-th GRU; it plays the same role as ζ_t in Step 1.

Step 4: Second decoding process at the decoder end. Get the output h₄ of the fourth layer by decoding the output h₃ of the third layer. That is, this layer preserves profitable information and discards unprofitable information. Finally, the latest output from the decoding side is converted into a predictive output by the fully connected layer. That is to say, we calculate ŷ = (ŷ(1), ŷ(2), ⋯, ŷ(N)), i.e., the predicted output of the network, from h₄ through the fully connected layer.
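Steps 1-4 can be sketched end to end as a small NumPy forward pass. The hidden dimension, the initialization and the way the context vector h₂(T) is fed to the decoder at every step are illustrative assumptions, not the authors' exact architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(h_prev, x_t, W):
    hx = np.concatenate([h_prev, x_t])
    r = sigmoid(W["wr"] @ hx)
    z = sigmoid(W["wz"] @ hx)
    h_cand = np.tanh(W["wh"] @ np.concatenate([r * h_prev, x_t]))
    return (1.0 - z) * h_prev + z * h_cand

def make_layer(rng, d_h, d_x):
    return {k: 0.1 * rng.standard_normal((d_h, d_h + d_x)) for k in ("wr", "wz", "wh")}

def forward(scattered, N_out, layers, W_fc):
    """Two GRU encoder layers, two GRU decoder layers, one fully connected output."""
    enc1, enc2, dec3, dec4 = layers
    d_h = W_fc.shape[1]
    h1, h1_seq = np.zeros(d_h), []
    for x_t in scattered:            # Step 1: encode the measured sequence
        h1 = gru_cell(h1, x_t, enc1)
        h1_seq.append(h1)
    h2 = np.zeros(d_h)
    for x_t in h1_seq:               # Step 2: extract features again
        h2 = gru_cell(h2, x_t, enc2)
    h3, h4, out = h2.copy(), np.zeros(d_h), []
    for _ in range(N_out):           # Steps 3-4: decode N outputs from the context h2(T)
        h3 = gru_cell(h3, h2, dec3)
        h4 = gru_cell(h4, h3, dec4)
        out.append(W_fc @ h4)        # fully connected layer -> predicted ŷ(n)
    return np.array(out).ravel()

rng = np.random.default_rng(1)
d_h, d_x, T, N_out = 8, 2, 16, 5
layers = [make_layer(rng, d_h, d_x), make_layer(rng, d_h, d_h),
          make_layer(rng, d_h, d_h), make_layer(rng, d_h, d_h)]
W_fc = 0.1 * rng.standard_normal((1, d_h))
y_hat = forward(rng.standard_normal((T, d_x)), N_out, layers, W_fc)
```

In training, y_hat would be compared with the true Fourier coefficient vector y through the loss function of Section 4.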

Theoretical analysis of convergence
This section is devoted to the convergence analysis of the neural network.
For the sake of convenience, we introduce some symbols. The input weights and the internal weights of every layer are gathered in the weight matrices W_I and W_L, respectively, and we denote W = (W_I, W_L). Here, W_I has the same form as W_L.
Next, we write the input of every layer as v_ℓ(w, ·), ℓ = 1, 2, 3, 4.
The input vectors for each layer of the encoding and decoding ends are given accordingly. For a vector x = (x₁, ⋯, x_n)^⊤, the activation function acts on it componentwise, i.e., σ(x) = (σ(x₁), ⋯, σ(x_n))^⊤, and the derivative of the activation function acts similarly.
The weight gradient of the fourth-layer neural network update gate can be calculated directly from (3.1)-(3.5), starting from the initial condition ŷ(0) ≡ 0. For any given initial weight w⁰, the weight sequence {w^k} can be generated from the iterative formula

w^{k+1} = w^k + ∆w^k, ∆w^k = −γ ∂L(w^k)/∂w, (4.5)

where L is the loss function (4.1) and γ > 0 is the learning rate. In order to establish the convergence of the neural network, we state the following theorem.
Theorem 4.1. Let the loss function be defined by formula (4.1), and let the weight sequence {w^k} be generated from the iterative formula (4.5). If Assumptions 4.1 and 4.2 are valid, then the following hold: (1) L(w^{k+1}) ≤ L(w^k); (2) there exists L* ≥ 0 such that lim_{k→∞} L(w^k) = L*.

Proof. First, we use the method employed in [32], which adopts an induction argument, to calculate ∆v₄(w^k, n), where δ(w^k, n) will be given in the following remark. Then, we apply the Taylor expansion to v₄(w^k, n). Next, substituting (4.7) into the first-order term of ∆v₄(w^k, n) of the resulting equation, and combining (4.2), (4.3) and (4.6) with Assumptions 4.1 and 4.2 and the Cauchy-Schwarz inequality, we obtain that the difference L(w^{k+1}) − L(w^k) is bounded above by −γ(1 − γC)‖∂L(w^k)/∂w‖² for some constant C > 0. If the learning rate is sufficiently small and satisfies that 0 < γ < 1/C, then the first conclusion of Theorem 4.1 is established.
Since {L(w^k)} is nonnegative and monotonically decreasing, it is bounded, so there exists an L* such that lim_{k→∞} L(w^k) = L*. The proof is completed.
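The monotonicity statement of Theorem 4.1 can be illustrated on a toy problem: gradient (subgradient) descent with a small learning rate γ on the mean absolute value loss (4.1) for a scalar linear model. The model and data below are illustrative, not the GRU network.

```python
import numpy as np

# Mean absolute value loss L(w) = (1/N) Σ |y(n) - ŷ(n)| for a scalar model ŷ = w x,
# minimized by the update w_{k+1} = w_k - γ ∂L/∂w.
x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x                      # targets generated with true weight w* = 2

def loss(w):
    return np.mean(np.abs(y - w * x))

def grad(w):
    # Subgradient of the MAE loss where y(n) - ŷ(n) ≠ 0 (cf. Definition 4.1)
    return np.mean(-np.sign(y - w * x) * x)

gamma, w = 0.01, 0.0             # small learning rate, initial weight w^0 = 0
losses = [loss(w)]
for _ in range(50):
    w = w - gamma * grad(w)      # iterative formula (4.5)
    losses.append(loss(w))
```

With γ small, the loss values decrease monotonically along the iteration, matching conclusion (1) of Theorem 4.1 on this toy example.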

Numerical examples
In this section, we will give some numerical examples to show the effectiveness of the presented method. Here, D denotes the set of shape parameters. For the elements in the testing set T′, the input data I_D is also updated as the shape D changes, and we randomly selected 2% of the scattered data as the testing set to get the final prediction results.
The inverse problem can be solved in two steps. First, for a given shape D, the elastic wave equation (2.1) is solved by using the MFS described in Section 2. Second, the network is trained to identify the obstacle from the scattered data, i.e., to approximate the Fourier coefficients of the obstacle.
To test the sensitivity to noise, we suppose that the noisy data are generated by adding Gaussian random noise pointwise, i.e.,

v^δ(x_i) := v(x_i)(1 + δ ξ(i)),

where δ ≥ 0 denotes the noise level and the random variables ξ(i) follow the standard Gaussian distribution. In all numerical examples, the parameters are given as follows.
(2) The symbol "-" signifies the exact boundary of an obstacle, and "--" signifies the predicted boundary of an obstacle. (3) The number of neurons of a GRU neural network is generally set as 2^p (p a positive integer), and the network parameters were determined through a large number of numerical experiments. The number of GRU neurons was set as 128, the dropout rate as 0.5, the batch size as 500 and the number of iterations as 40.
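The multiplicative noise model above can be sketched as follows; the synthetic near-field samples are illustrative.

```python
import numpy as np

def add_noise(v, delta, rng):
    """Pointwise multiplicative Gaussian noise v^δ(x_i) = v(x_i)(1 + δ ξ(i))."""
    xi = rng.standard_normal(v.shape)   # ξ(i) ~ N(0, 1)
    return v * (1.0 + delta * xi)

rng = np.random.default_rng(0)
v = np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 100))  # synthetic near-field samples
v_noisy = add_noise(v, 0.05, rng)                    # noise level δ = 5%
```

Setting δ = 0 recovers the exact data, so the noise level directly controls the relative perturbation of each sample.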
Remark 5.1. The scattering data are complex numbers, while the current neural network method can only deal with real numbers; thus, we take the real and imaginary parts of the scattering data as the scattering data information in the training data set. In addition, we use a random function to generate random numbers as the shape parameters in the training data set. This completes the preparation of the training set. After the training set is prepared, the network model is trained on the data in the training set through the two-layer network that has been built. Finally, the output of the network, i.e., the predicted values of the shape parameters, is obtained by applying the trained model to the test data set.
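The real/imaginary split described in Remark 5.1 can be sketched as follows (the stacking order is an illustrative choice).

```python
import numpy as np

def to_real_features(v):
    """Stack real and imaginary parts of complex scattered data into one real vector."""
    v = np.asarray(v)
    return np.concatenate([v.real, v.imag])

v = np.array([1.0 + 2.0j, -0.5 + 0.25j])
features = to_real_features(v)  # real parts first, then imaginary parts
```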
Example 5.1 (Dirichlet boundary condition). First, the scattered data used to train the network are noise free, and the scattered data are collected on a circle with the radius R = 3 centered at the origin. We investigate the numerical recovery results by changing the number of measurements T. The incident direction is d = (1, 0). The boundaries of the scatterers used in this example are parameterized as follows.
The polar diameter has the following (truncated) Fourier representation: p(θ) = a₀/2 + Σ_{m=1}^{Q} a_m cos(mθ). We take Q = 6, and the idea is to reconstruct the Fourier coefficients a_m. The coefficients used in the training process, e.g., a₀, were randomly sampled from fixed intervals. Figure 2 presents the recovery results for a fixed number T ∈ {4000, 20,000, 40,000}. From the numerical recovery results in Figure 2, we can see that the accuracy of the numerical results improves as the number T increases.
Second, in addition to the usual rounding errors, we also contaminated the forward scattered data by adding random noise. We investigate the stability of the inversion scheme by changing the noise level δ. Figure 3 presents the recovery results for the noise levels δ ∈ {5%, 10%, 15%}. From the fourth column in Figure 3, we can see that the errors become smaller as the noise level decreases. We can also see that the numerical reconstruction results are poorest where the body has high curvature. This meets our expectations, since more expansion coefficients are required near the corner.
Example 5.2 (Neumann boundary condition). For comparison, in this example we consider the scattering problem for cavities, i.e., the Neumann boundary condition is imposed. We consider a heart-shaped domain. The training sets are the same as in Example 5.1. We chose the direction d = (1, 0). The scattered data were collected on a circle with the radius R = 3 centered at the origin. The numerical recovery results for the noise levels δ ∈ {0%, 5%, 10%, 15%} with a fixed number T = 40,000 are shown in Figure 4. From this, we observe that the locations and shapes are still well captured. From the fourth row in Figure 4, we can see that the errors become smaller as the noise level decreases. We can also see that the numerical reconstruction results are poorest where the body has high curvature. This further shows that our method is independent of the physical properties of the underlying scatterers.

Example 5.3. The polar diameter has the following (truncated) Fourier representation: p(θ) = a₀/2 + Σ_{m=1}^{Q} b_m sin(mθ). We take Q = 6, and the idea is to reconstruct the Fourier coefficients a₀ and b_m. The coefficients used in the training process, e.g., a₀, were randomly sampled from fixed intervals. We chose the direction d = (1, 0). The scattered data were collected on a circle with the radius R = 3 centered at the origin. The numerical recovery results for the noise levels δ ∈ {5%, 10%, 15%} with a fixed number T = 40,000 are shown in Figure 5. The errors for all testing sets obtained with the different noise levels δ ∈ {5%, 10%, 15%} and T = 40,000 are demonstrated in the fourth column of Figure 5.
From this, we observe that the locations and shapes are still well captured. The above examples show the effectiveness of the presented method in the case of a single obstacle or cavity. The following example treats the case of two disjoint components. Meanwhile, the fact that D consists of two disjoint components is assumed to be known as a priori information, and the Fourier coefficients were tracked as two independent groups.
Example 5.4 (Mixed boundary conditions). As shown in Figure 6, we consider a scatterer with two disjoint components. We set the scatterer to be a combination of a triangle-shaped domain and a peanut-shaped domain. Suppose that D₁ is a big triangle-shaped domain of the Dirichlet kind, parameterized by x(θ) = 0.8(2 + 0.3 cos 3θ)(cos θ, sin θ).
D₂ is a mini peanut-shaped cavity with boundary ∂D₂, and the incident direction is d = (√3/2, 1/2) in this example. The scattered data were collected on a circle with the radius R = 3 centered at the origin. Figure 6 shows the numerical results obtained from the knowledge of the compressional near-field measurements, the shear near-field measurements and the full near-field measurements with δ = 10%. From these subfigures, we observe that the locations and shapes can be captured. This further shows that the presented method is independent of the physical properties of the underlying scatterers, even for two disconnected obstacles with different boundary conditions. We also note that the numerical error is larger than that in the above examples.
Example 5.5 (Two obstacles). As shown in Figures 7 and 8, the underlying scatterer D is given as the union of two disjoint obstacles D = D₁ ∪ D₂. We consider two cases in this example.

Definition 4.1. Let {y(n)}_{n=1}^{N} be the initial values of the boundary curve and {ŷ(n)}_{n=1}^{N} be the network prediction output. Define the common mean absolute value loss (error) function as

L(w) = (1/N) Σ_{n=1}^{N} |y(n) − ŷ(n)|. (4.1)

When y(n) − ŷ(n) ≠ 0, the derivative of the loss function (4.1) with respect to the weight w in (3.1)-(3.5) is well defined and can be computed by the chain rule.

Figure 2. Numerical results for recovering the heart-shaped and round-square domains, obtained with noise-free data and different numbers T ∈ {4000, 20,000, 40,000} for Example 5.1.

Example 5.6 (Dirichlet boundary condition). The last example considered here uses a different choice of the physical parameters. The Lamé constants are given as λ = 3.88 and µ = 2.56, and the frequency is ω = π. The scattered data were collected on a circle with the radius R = 3 centered at the origin. The incident direction is d = (−1/2, √3/2). The boundary of the scatterer is a heart-shaped domain parameterized by x(θ) = 0.5(1.1 + 1.6 cos²(θ/2) sin(θ/2))(cos θ, sin θ).

Figure 9 presents the recovery results for a fixed number T = 40,000 with the noise levels δ ∈ {5%, 10%, 15%}. It can be seen that the recovery results are independent of the physical parameters.