A fast solver for multi-particle scattering in a layered medium

In this paper, we consider acoustic or electromagnetic scattering in two dimensions from an infinite three-layer medium with thousands of wavelength-size dielectric particles embedded in the middle layer. Such geometries are typical of microstructured composite materials, and the evaluation of the scattered field requires a suitable fast solver for either a single configuration or for a sequence of configurations as part of a design or optimization process. We have developed an algorithm for problems of this type by combining the Sommerfeld integral representation, high order integral equation discretization, the fast multipole method and classical multiple scattering theory. The efficiency of the solver is illustrated with several numerical experiments.


Introduction
The problem of designing composite materials that exhibit a specific acoustic or electromagnetic response is an area of active research [2,26,30]. Examples include the design of random media with a well-defined macroscopic refraction (coherent scattering) [26] and the fabrication of metamaterials [30] for cloaking, near field imaging, etc. In many experiments, the materials are designed by incorporating large numbers of identical inclusions (particles) in a layered material. When the size of each particle is comparable to the wavelength of the incoming field and the distribution of particles is reasonably dense, then the interaction of the particles involves non-negligible multiple scattering effects and methods based on homogenization [26] are not applicable. Instead, the full Helmholtz or Maxwell equations should be solved at each iteration of the design process. Numerical simulation, in the absence of suitable fast algorithms, are impractical when thousands of particles are involved.
In this paper, we develop an algorithm that accelerates the computation of electromagnetic scattering when a large number of particles are embedded in the middle of a threelayer dielectric medium. Numerical experiments show that our solver takes 1-2 minutes to evaluate the scattered field for up to 5, 000 particles on a 2.3GHz laptop. Our method combines the Sommerfeld integral representation, a well-posed integral formulation, highorder discretization, multiple scattering theory and the fast multipole method. We focus on the two dimensional setting by assuming the material is invariant in the z direction. A related three-dimensional solver was considered in [15], but the particles were assumed to be distributed in free space. A principal contribution of this paper is the development of a mathematical framework that permits them to be embedded in a layer material (which is closer to being manufacturable). While we restrict our attention here to the three-layer case, the extension to an arbitrary layered medium is straightforward. Figure 1: Geometry of the three-layered medium, with a large number of dielectric particles embedded in the middle layer.
More precisely, we consider time-harmonic scattering (with time dependence e iωt ) from a three-layered medium as depicted in Fig. 1. The incident field is assumed to be driven by a point source located in the first (top) layer. The thickness of the middle layer is denoted by d. We assume the magnetic permeability µ is identical in each layer, while the electric permittivity is piecewise constant. There are two fundamental polarizations in the two dimensional setting to consider: the transverse magnetic (TM) polarization and the transverse electric (TE) polarization. In both cases, the Maxwell equations reduce to a scalar Helmholtz equation. For simplicity, we consider the TM polarization here, in which case the scattered field u s must satisfy the equation where k = ω √ µ is the wavenumber. We denote by k 1 , k 2 and k 3 the wavenumber for the three layers, and by k p the wavenumber for the particles. The scattered field also has to satisfy the Sommerfeld radiation condition at infinity [10]: where r = x 2 + y 2 . In order to develop an especially fast solution method, we make two further assumptions. First, as in the fast multi-particle scattering (FMPS) method of [15], we assume that the particles are well separated from each other -that is, the separation between particles is at least 10% of the particle size. Second, we assume that only a finite number of distinct particle shapes are included in the simulation. The first condition ensures that a multiple scattering formalism will be accurate and the second condition ensures that precomputation of single particle scattering matrices permits a dramatic reduction in the number of degrees of freedom necessary for the solver. The particles are not assumed to be symmetric and may be placed with arbitrary orientation. Both hypotheses are common in materials design (although there are exceptions).
An outline of the paper follows. In Section 2, we introduce the Sommerfeld integral and its application to layered materials (in the absence of inclusions). In Section 3, we review classical multiple scattering theory for circular particles. Section 4 extends the scattering formalism to non-circular particles and Section 5 develops analytical tools needed to go back and forth between the Sommerfeld integral formalism and multiple scattering theory. Section 5 also combines the techniques in the preceding sections and extends the FMPS method to layered media. Numerical examples are provided in Section 6 to illustrate the efficiency of the method, followed by some concluding remarks in Section 7.

The Sommerfeld integral for layered media
Wave propagation in a layered medium is a well-studied problem in acoustic and electromagnetic scattering theory. Nearly a century ago, Sommerfeld developed a spectral representation involving a Fourier integral in the "transverse" variable (the x-coordinate in Fig. 1) [8]. Assuming a point source is located at x 0 = (x 0 , y 0 ) in the top layer, with wavenumber k 1 , the corresponding field is given by the (two-dimensional) free space Green's function: 0 (x) is the first kind Hankel function of order zero. Combing the Fourier transform and contour integration [25], the Green's function can also be written in the form: It is important to note that the Sommerfeld integral (3) is conditionally convergent and as stated, requires that y = y 0 .
In the Sommerfeld approach ( [8]), we assume the upward scattered field u s 1 in the top layer can be expressed as where σ 1 (λ) is an unknown density on the upper interface y = 0. It is straghtforward to verify that u s 1 satisfies the Helmholtz equation with Helmholtz parameter k 1 . In the second layer, the scattered field u s 2 can be written in terms of contributions from both the upper (y = 0) and lower (y = −d) interfaces: u t 2 and u b 2 . These are given by where σ + 2 (λ) and σ − 2 (λ) are used to denote spectral density functions on the upper and lower interfaces.
Similarly, we can represent the scattered field u s 3 in the third layer with an unknown density σ 3 (λ) on the lower interface as Remark 1. The signs of the terms e ± √ λ 2 −ki 2 y and e ± √ λ 2 −ki 2 (y+d) in eqs. (4)-(7) ensure that evanescent modes (when |λ| > |k i |) decay away from each layer. (Physically, this is related to causality and is required in the derivation of formula [25] by countour integration).
It is worth noting that the four unknown densities σ 1 , σ + 2 , σ − 2 and σ 3 can be interpreted in two ways. First, they can simply be considered the spectral densities in the Fourier domain of a consistent representation for the Helmholtz equation. For those more familiar with potential theory, they can be viewed as the Fourier transforms of charge densities of four single layer potentials lying on the corresponding interfaces [3].
In the absence of any inclusions, the Sommerfeld representation for the field in each subdomain is derived from a "mode by mode" analysis. That is, the unknown functions σ 1 , σ + 2 , σ − 2 , and σ 3 are found by enforcing the continuity conditions at the interface for each value of the argument λ. For the case of electromagnetic scattering in TM polarization, when the permeability µ is constant in each layer, this requires that where [·] denotes the jump of a function along the interface, ∂/∂n is the normal derivative and u is the total field in each layer [10]. It is straightforward to check that the linear system to be solved for each λ takes the form: Definition 2.1. We will denote the 4 × 4 matrix above by A λ .
For the problem we consider here, the Sommerfeld integrals must be coupled to a representation of the field induced by the many particles present in the central layer. Before discussing the coupled system, however, we first summarize some well-known facts about scattering from a finite collection of inclusions in a homogeneous infinite medium.

Wave scattering for disks
Suppose now that we have an inclusion of dielectric material with k = ω √ p µ embedded in R 2 , assumed to consist of a dielectric with k 2 = ω √ 2 µ. For transverse magnetic(TM) polarization, the total electrical field u in the exterior of the inclusion satisfies the Helmholtz equation: ∆u + k 2 2 u = 0.
Further, the total field u can be written as the sum of the incident field u inc and the scattered field u s , where u s satisfies (11) and the Sommerfeld radiation condition, where r = x 2 + y 2 . Within the inclusion, the field u satisfies the Helmholtz equation with wavenumber k p , ∆u + k 2 p u = 0.
On the boundary of the inclusion, we must enforce the continuity conditions given by Eq (8) and (9).

A single disk
When the inclusion is a disk of radius R centered at the origin, it is straightforward to represent the solution using separation of variables, with in the exterior and in the interior. Here, (r, θ) are the polar coordinates of a point in the plane, H n (r) is the Hankel function of the first kind of order n and J n (r) is the Bessel function of order n [11,24]. We now expand the incident wave u inc and its normal derivative in the form: Enforcing the continuity conditions (8), (9) on the boundary of the disk for each Fourier mode, we easily obtain the following linear equation for mode n: where n ∈ N. Solving Eq. (17) determines the coefficients β n , γ n : It is straightforward to verify that the denominator in the preceding expressions cannot vanish if k and k p have positive real part and non-negative imaginary part [10,21].
Definition 3.1. The mapping between the incoming coefficients {α n } and outgoing coefficients {β n } is referred as the scattering matrix for the disk and denoted by S.
Remark 2. While we restrict our attention here to dielectric particles, the method can easily be extended to perfectly conducting disks. Since the interior field u in (15) is zero for perfect conductors, from (17), we have Remark 3. In the remainder of this paper, we will refer expansions based on Hankel functions, such as (14), as multipole expansions or H-expansions, and expansions based on Bessel functions, such as (15), as local expansions or J-expansions.

Multiple disks
Suppose now that we have M well separated, identical dielectric disks randomly distributed in a homogeneous medium. Each disk is assumed to have radius R and wavenumber k p and the background medium again has wavenumer k 2 . For each individual particle, the analysis can be carried out as above. We will denote by α m the incoming coefficients and by β m the outgoing coefficients for the m-th particle. We have where S p denotes the truncated (2p+1)×(2p+1) scattering matrix acting on the truncated expansion.
The principle difference between the single particle and multi-particle scattering problem is that, in the latter case, the incoming field experienced by each particle consists of two parts: the (applied) incident field u inc and the contribution to the scattered field u s from all of the other particles. In order to formulate the problem concisely, given the multipole expansion for disk j, we need some additional notation.
induces a field on disk l of the form Here, (r m , θ m ) and (r l , θ l ) denote the polar coordinates of a target point with respect to disk centers x m and x l , respectively and θ lm denotes the angle between (x m − x l ) and the x-axis.
Remark 5. We denote by T jm the translation operator that maps the outgoing coefficients β m from particle m to the local expansion α l centered at particle l. With this operator in place, the incoming coefficients α m for the m-th particle is where a m is the (truncated) local expansion (16) of the incident wave u inc on particle m. T jm is referred to as the multipole-to-local (M2L) translation operator [28]. Combining eqs. (21) and (24), one can easily eliminate the incoming coefficients α m and obtain the following linear system that only involves the outgoing coefficients: where The system (25) can be solved iteratively, using GMRES [29]. Since each translation operator T nm is dense, a naive matrix-vector product requires O((M (2p + 1)) 2 ) operations, where p is the order of the truncated expansion. FMM acceleration reduces the cost to O(M (2p + 1) 2 ) work, for which we refer the reader to [28,7]. Further, (25) has a simple diagonal preconditioner. Multiplying through by the block diagonal matrix S, results in the preconditioned system matrix I − ST . This significantly reduces the number of iterations.
We now extend the multiple scattering approach to arbitrarily shaped particles.
Remark 6. It is worth emphasizing that the multiple scattering theory as discussed here is hardly new. We refer the reader to [14,18,31] and the references therein.

Wave scattering for arbitrarily shaped particles
When the dielectric inclusions are of arbitrary shape, multiple scattering theory cannot be used quite so easily. Suppose, however, that an inclusion Ω is compactly supported with boundary ∂Ω and that it is composed of a homogeneous material with wavenumber k p , as above. Given the incident wave u inc and the boundary conditions (8), (9), the exterior scattered field u s and the field u within Ω have the following representation [10]: where S k and D k are the usual single layer and double layer potentials on ∂Ω, σ(y) and µ(y) are unknown charge and dipole densities that lie on the boundary ∂Ω. We will need the normal derivatives of S k and D k as well: By construction, the representations (26) and (27) satisfy the relevant Helmholtz equation in each domain. The single layer potential S k is weakly singular and the value is well-defined for x ∈ ∂Ω. The operators D k and N k are define on the boundary in the principal value sense (and have different limits when approaching the boundary from the interior and the exterior). The operator T k is hypersingular with its value on the boundary defined in the Hadamard finite part sense. For further details, we refer the reader to [10].
Enforcing the interface conditions (8), (9) and taking appropriate limits [10] yields the following system of Fredholm integral equations of the second kind: Remark 7. It is worth noting that, while T k is hypersingular, the difference kernel T k2 −T kp is only logarithmically singular and compact as are all the other difference operators in (32), at least for smooth boundaries. We use Nyström discretization for the system of equations based on the high order hybrid Gauss-trapezoidal rule of Alpert [1]. In this paper, we restrict our attention to smooth inclusions that are about one wavelength in size, so that 12 digits of accuracy are easily achieved with modest values of N using the Gauss-trapezoidal Wednesday, July 9, 14 Figure 2: Two inclusions and their enclosing disks. The scattering matrix S i for each inclusion Ω i with wavenumber k p is defined as the map from an incoming field on D i to the corresponding outgoing field. It is computed by solving a sequence of boundary value problems on the inclusion itself in a precomputation phase (see text). In this paper, we assume that all the inclusions are identical but may be rotated, as drawn here.
rule for logarithmic singularities of order 16. We refer the reader to [3] and the references therein for further details. The integral equation (32) was introduced in electromagnetics by Müller [23], and in the scalar case by Kress, Rokhlin, Haider, Shipman and Venakides [19,21,27].

The scattering matrix
Suppose now that we have M inclusions Ω 1 , . . . , Ω M that are identical up to rotation, and well separated in the sense that each inclusion Ω i lies within a disk D i of radius R so that the disks are not overlapping. (see Fig. 2).
In that case, we can sample the incoming field on the disk D j rather than Ω j as using a polar coordinate system centered on the disk D j . Let σ n and µ n denote the solution to the integral equation (32) with right-hand side u inc = J n (kr)e inθ , ∂u inc ∂n = kJ n (kr)e inθ . We may then precompute the multipole expansion from these source distributions where Here, y is the location of a point on ∂Ω j with respect to the center of disk D j and θ j (y) is the polar angle subtended with respect to the center of disk D j . The formula for β l is standard [28,7] and derived from the Graf addition theorem [24]. Definition 4.1. As before, the mapping between the incoming coefficients {α n } and outgoing coefficients {β n } is referred as the scattering matrix for the inclusion Ω j and denoted by S j .
The reason for permitting a different scattering matrix for each inclusion is that the Ω j may be distinct in terms of geometry or dielectric properties. For the sake of simplicity, we assume here that the wavenumbers are the same in each inclusion and that the shapes are the same up to rotation. This permits us to solve only 2p + 1 integral equations on a single prototype inclusion in the enclosing disk. The scattering matrix for each rotated copy is then trivial to construct. Moreover, we can easily store the densities σ n and µ n , since this requires only O (2N (2p + 1)) storage, where N is the number of points used to discretize the boundary ∂Ω. The amount of memory required to store the scattering matrix is O((2p + 1) 2 ). For modest values of N , as is the case in the present paper, we compute the LU factors of the integral equation system matrix corresponding to (31), (32) only once, at a cost of O(N 3 ) work. Each right-hand side corresponding to u inc = J n (k 2 R)e inθ and ∂u inc ∂n = k 2 J n (k 2 R)e inθ can then be solved for n = −p, . . . , p at a total cost of O(N 2 (2p+1)) work.

Multiple scattering
If we were interested in solving the multiple scattering problem in an infinite medium, we could now proceed as in the previous section. The number of degrees of freedom is only 2p + 1 per inclusion rather than N points per inclusions (the number needed to discretize the domain boundaries ∂Ω j ). For complicated inclusions, this permits a vast reduction in the number of degrees of freedom required and forms the basis for the FMPS method [15]. Moreover, the block-diagonal preconditioned multiple scattering equations are much better conditioned than the integral equation (31), (32) itself and FMM acceleration is particularly fast in this setting.
Remark 8. Extending the method to more than one type of substructure is straightforward as long as the assumption that the enclosed circles are well separated still holds. The additional cost is the bookkeeping for different scattering matrices of these substructures.

Multi-particle scattering in a layered medium
To this point, we have discussed the layered medium and multiple scattering problem spearately. For the full problem, we now assume that multiple inclusions have been placed in the middle of a three-layered medium. We assume that the inclusions are well separated, so that the multiple scattering formalism applies within the layer. Then, we may write where u 1 and u 3 denote the fields in the top and bottom half spaces and u 2 denotes the field in the central layer exterior to the scattering disks D j . u s 1 , u t 2 , u b 2 , and u s 3 are the Sommerfeld integrals from Section 2. Once u 2 is known, the field within the scattering disks and the inclusions themselves is easily obtained.

Evaluation of the Sommerfeld integral
Let us consider the function u t 2 defined by (6). Its computation is a standard problem in acoustic and electromagnetic scattering and often handled by contour deformation. It is typical to deform the integration contour by pushing it from the real line into the second Figure 3: The Sommerfeld contour in the complex λ plane: Each segment in the contour is discretized using Gauss-Legendre quadrature. The branch cut (shown in red) points upward from k and downward from −k. and fourth quadrants of the complex λ-plane in order to avoid the square root singularities in the integrand. One option is to use a hyperbolic tangent contour [3], which yields spectral accuracy with the trapezoidal rule and is extremely efficient. In our numerical simulation, we have chosen to use the piecewise smooth contour shown in Fig. 3 instead. This is slightly less efficient, but will permit us to evaluate the Sommerfeld integral using the non-uniform FFT, as explained further below. The contour consists of three segments: The branch cuts for the square root in the integrand are chosen to ensure that waves are decaying away from the interface. (up at k and down at −k as shown in Fig. 3).
We truncate Γ 1 and Γ 3 at a point t max > 0, where the integrand of u t 2 has decayed to a user-specified tolerance. Fortunately, the decay in the integrand is exponential once λ exceeds k 2 . (The precise rate of decay depends on the distance from the interface of the scattering disks and the point source generating the incoming field.) We let N S denote the number of points used in the quadrature for the Sommerfeld contour and note that each discretization point λ j on the contour corresponds to a plane wave. We use the same contour and the same N S values {λ j } for each of u s 1 , u t 2 , u b 2 , and u s 3 .

The full linear system
Let us denote by σ the discretized densities on the dielectric layers, σ = [ σ 1 , σ + 2 , σ − 2 , σ 3 ] T , and by β the multipole coefficients for all M particles in the central layer. Each of σ 1 , σ + 2 , σ − 2 , and σ 3 is of length N S and the full linear system for multiple scattering in the layered medium takes the form of a block 2 × 2 linear system: A itself is block diagonal 4N S × 4N S matrix with 4 × 4 blocks of the form A λ in (10), each such block corresponding to a distinct λ j in the contour integral discretization. The right-hand side component b is simply the right-hand side of (10) for each such λ j . The matrix D = ST − I is simply the multiple scattering system for the particles from (25). The off-diagonal blocks B and C are more complicated. B is a matrix that translates the multipole expansion coefficients to a Sommerfeld representation on the upper and lower interfaces of the layered medium, while C requires the evaluation of the Sommerfield integral contributions from the interfaces in terms of incoming local expansions on the scattering disks themselves. We turn now to the efficient application of the matrices B and C.

The Sommerfeld-to-local operator
A straightforward mechanism to map from the σ variables to local expansions on the M disks is to use the Jacobi-Anger formula [24].
Suppose now that we want to compute the contribution from σ + 2 to a local expansion on a disk centered at (x 1 , y 1 ). Using Lemma 5.1, it is easy to see that where φ = arccos(λ j /k 2 ), θ = arccos((x − x 1 )/r) and r = (x − x 1 ) 2 + (y − y 1 ) 2 . The analogous formula can be obtained for the contribution from σ − 2 . The cost of using formula (40) to compute the action of the C block in the system matrix above is clearly O(M N S (2p + 1)), where M denotes the number of particles and N S the number of discretization points λ j in the Sommmerfeld contour and p is the order of the expansions used in the multiple scattering representation. This is quite acceptable when either N S or M is small. For high frequency problems with many inclusions, where k 2 is large and N S = O(k 2 ), we have developed a more efficient scheme, based on the nonuniform FFT (NUFFT).

The Sommerfeld-to-local operator using the NUFFT
Instead of mapping the contribution from the Sommerfeld integral to each disk separately, we seek a fast algorithm for evaluating the integral on a grid of points in the central layer, after which we can use high order interpolation to get the desired local expansion.
Restricting our attention to u t 2 for a fixed value of y, we have where Note now that the integral on the right-hand side of (41) is a finite Fourier transform. If we could compute it rapidly, we would have an efficient method for evaluating the Sommerfeld integral at a fine grid in the x variable for a fixed y. The discretization points in t, however, lie at Gauss-Legendre nodes, so the FFT itself does not apply. Fortunately, the nonuniform FFT (NUFFT) of Dutt and Rokhlin [12,13] permits this to be done in nearly linear time. In our numerical simulations, we use the version discussed in [17,22]. The analogous method permits the rapid evaluation of the Sommerfeld integral on the contour Γ 3 . For the integral on Γ 2 , the NUFFT cannot be applied, but only a few discretization points are required, so we evaluate that contribution directly.
To provide rapid access to the field induced by the Sommerfeld integral at any location in the central layer, we superimpose on it a grid of n 1 × n 2 boxes that contain all of the M scattering disks. In each such box, we construct a tensor product m 1 ×m 2 Chebyshev mesh, which will permit qth order local interpolation by barycentric interpolation [4]. The cost for evaluation at all grid points is O ((n 2 m 2 ) (n 1 m 1 + N S ) log(n 1 m 1 + N S )) operations, using the NUFFT for each of the distinct n 2 m 2 locations in y.
Consider now one of the scattering disks D j of radius R. If we discretize the boundary of the disk using 2p + 1 equispaced points, evaluation of the induced field at each of the points requires O(m 1 m 2 ) operations, for a net cost of O(m 1 m 2 (2p + 1)) work. An FFT of order (2p + 1) converts these values into their Fourier transforms, which we denote by {u n }, for n = −p, · · · , p. From this, the n-th term a n in the incoming J-expansion is simply a n = u n J n (k 2 R) . (42) Remark 9. The formula (42) will fail if the value k 2 R is a zero of the function J n for any n from −p . . . , p. This can be avoided if we also compute the normal derivative of the Sommerfeld integral on the boundary of each scattering disk. If we denote by {u n } the Fourier coefficient of the normal derivative, it is easy to see that a n = u n J n (k 2 R) + u n kJ n (k 2 R) J 2 n (k 2 R) + (kJ n (k 2 R)) 2 , for j = −p, · · · , p.
The evaluation of the gradient of the Sommerfeld integral can be computed by an obvious modification of the formula (41) or (with a reduction in order) by computing the gradient of the tensor product Chebyshev series discussed above. In summary, it requires O(M m 1 m 2 (2p + 1)) operations to interpolate the field values on each of the M scattering disks and O(M (2p + 1) log(2p + 1)) operations to obtain the coefficients of the J-expansions. This completes the computation of the C block in the system matrix.

The multipole-to-Sommerfeld operator
The off-diagonal B block in (38) requires a formula for recasting the multipole expansion to the corresponding Sommerfeld representation on either the upper or lower interface of the layered medium. More precisely, each H-expansion in the central layer, centered on disk D j with center (x j , y j ) has a spectral representation on the upper layer y = 0 and the lower layer y = −d of the form: respectively. The formulae for σ + mp (λ) and σ − mp (λ) follow directly from the following theorem. Theorem 5.2. [6] Let (x j , y j ) denote the center of a multipole expansion in the central layer, with −d < y j < 0 and let (r, θ) denote the polar coordinates of a target point with respect to that center. Then, on the upper interface, and on the lower interface, Each multipole coefficient in the expansion about disk D j contributes to each of the N S discretization points in the Sommerfeld integrals, requiring a total of O ((2p + 1)N S M ) work. This, then, is the cost of applying the B block of the system matrix directly.

5.3.1
The multipole-to-Sommerfeld operator using the NUFFT Because of the computational complexity of applying the B block in the manner described above, it is important to develop a fast algorithm for the case where M and N S are large. We do so by essentially inverting the method of section 5.2.2. Assume first that all the centers of the H-expansions lie at the nodes of a uniform grid in the central layer and let us consider the contributions from the nth mode at each such grid point (x l , y j ) for a fixed horizontal line y = y j . If there are n 1 such expansion centers, with x coordinates {x l }, l = 1, · · · , n 1 , and we denote by a l n the coefficient for the nth mode of the H-expansion at location (x l , y j ), then the total contribution to the induced spectral coefficient σ + mp (λ j ) on the top layer is given by The formulae (48) imply that for each row, one can use the NUFFT to compute the induced coefficients for each discrete quadrature node λ j on Γ 1 or Γ 3 . As above, we use direct computation for the contributions to discretization nodes on Γ 2 . In the general case, the centers of the H-expansions are not aligned on a grid, but we can first shift the center of each H-expansion to the nearest grid point, using the multipole-to-multipole translation operator [28,7] based on the Graf addition theorem [24]. After M such shifts, we may apply the transformation of (48).
The total computational cost is O(M (2p + 1) 2 ) for shifting all the H-expansions and O (n 2 (2p + 1) (n 1 + N S ) log(n 1 + N S )) for the NUFFT-based work (see Table 1). The merits of the NUFFT-based schemes would become more apparent for larger N S .

Iterative solution of the system matrix
We will solve equation (38) using the iterative method GMRES [29]. However, the unknowns σ and β may be poorly scaled with respect to each other. However, A is block diagonal, as noted above, with simple 4 × 4 blocks. Thus, we first invert A directly and use GMRES on the Schur complement of (38). In other words, we solve the system instead. This is much better conditioned and involves only the β unknowns. The Schur complement formalism has a simple physical interpretation: it is, in essence, a reformulation of the scattering problem using the layered medium Green's function.

Numerical experiments
In this section, we illustrate the performance of our algorithm with three examples. For simplicity, we use a single class of inclusions, parametrized by x = (a 1 + a 2 cos(a 3 t)) cos(t), y = (a 1 + a 2 cos(a 3 t)) sin(t), for 0 ≤ t < 2π.
As discussed in section 4, inclusions with more complicated boundaries do not introduce any essential difficulty in our scheme except that the precomputation of the scattering matrix is a little more involved, particulalry if corners are present [5,20]. Given a fixed a 1 , a 2 and a 3 , multiple copies of the inclusion are randomly distributed in the central layer of the medium with random orientations. To ensure the inclusions are well separated but confined in a fixed region, we use a bin sorting algorithm to construct the random distribution. We begin with inclusions located on a regular grid and then perturb their positions randomly, accepting the random move if the inclusion remains inside the region and stays well separated from the others. Several such sweeps are carried out to randomize the positions further.
We have not, as yet, specified the parameter N S used to discretize the Sommerfeld integral in (37). While special techniques have been developed by many authors to handle sources near the interface (see [3,9,25]), we simply assume that the source defining the incoming field is at least 0.2 wavelengths from the top interface. More precisely, in our examples, the source point in the top layer is placed at (1, 1) (which is roughly 0.2 wavelengths away for wavenumber k 1 = 1). We also assume that the nearest the inclusions get to either one of the interfaces in the layered medium is at least 0.5 wavelengths. Under these assumptions, we let t max = max{|k 1 |, |k 2 |, |k 3 |} + 20, b = 0.2, and discretize Γ 1 and Γ 3 using 240 Gauss-Legendre points and Γ 2 by 20 Gauss-Legendre points. This is sufficient to achieve about 10 digits of accuracy.
All computations are carried out using a 2.3GHz Intel Core i5 laptop, with 4GB RAM.

Example 1: scattering from large numbers of inclusions
In our first example, we consider the scattering of inclusions defined by parameters a 1 = 0.12, a 2 = 0.04, and a 3 = 3 in eq. (50) with wavenumber k p = 2.0. To obtain the scattering matrix with p = 10, we solve the integral equation (31) and (32) by discretizing the boundary of the particle using N = 300 equispaced points. We assume the wavenumbers of the layered medium are given by k 1 = 1.0, k 2 = 3.0, k 3 = 1.0. The thickness of the central layer is determined by the parameter d = 32. We consider distributions of M = 100, 500, 1, 000, 5, 000 inclusions and solve the mulitple scattering problem using GMRES with FMM acceleration. We terminate the iteration once the residual is less than 10 −6 . Results are presented in Fig. 4 and 5. Figure 4: Real part of the total field with 5, 000 dielectric inclusions randomly distributed in a three-layered medium. The wavenumber for each particle is k p = 2.0 and the wavenumbers for the three layers are k 1 = 1.0, k 2 = 3.0, k 3 = 1.0. The size of each particle is approximately 0.1 wavelength for the wavenumber k 2 .
(a) (b) Figure 5: Convergence behavior of GMRES and the CPU time required for various numbers of inclusions embedded in either (a) free space or (b)a three layered medium. For (a), we set k 1 = k 2 = k 3 = 3.0 and for (b), we set k 1 = 1.0, k 2 = 3.0, k 3 = 1.0. Fig. 4 shows the total field in the case M = 5, 000. The field distortion due to the inclusions is apparent. It requires 127s to achieve 6 digits of accuracy. Fig. 5 shows the convergence behavior of GMRES as the number of inclusions is increased as well as the total CPU time. Clearly, more iterations are required for larger numbers of particles. Nevertheless, the time scales roughly linearly with the number of particles. In Fig. 5(a), we study the convergence rate when the background is homogeneous, by setting the material parameters to be the same for the three layers (k 1 = k 2 = k 3 ). As expected, convergence is more rapid than when the inclusions are embedded in a true layered medium, because of the multiple reflections from the interfaces themselves.

Example 2: scattering in high contrast materials
In our second example, we consider the same inclusion shape as above, with k p = 2, but with higher contrast materials. We fix the number of particles to be 200 and the thickness of the middle layer to be d = 12. We allow the wavenumber in the middle layer to vary from k 2 = 1 up to k 2 = 20. Results are shown in Figs. 6 and 7 for 6 digits of accuracy.
In Fig. 7, we compare the convergence behavior in an infinite medium (a) vs. a layer medium (b). Note that the convergence is slower at high constrast and that this effect is more pronounced in the layerd medium case, where the central layer involves strong scattering and reflection.

Example 3: scattering from smoothed pentagons
In our last example, we consider the scattering from a different inclusion shape, setting a 1 = 0.3, a 2 = 0.1, a 3 = 5 in eq. (50) with k p = 2.0. The inclusions are smoothed pentagons, as shown in Fig. 8. We discretize the boundary of the inclusion using N = 300 equispaced points and solve eq. (31) and (32) to obtain the scattering matrix with p = 10. We consider M = 100, 200, 500 and 1, 000 inclusions. For the three-layered medium, we set k 1 = 1.0, k 2 = 3.0, k 3 = 2.0. Results are shown in Figs. 8 and 9. Note that in order to obtain 6 digits of accuracy, 69.6 secs. are required for 1, 000 inclusions in a homogeneous background, while 140 seconds are required for the three layered medium. The time for convergence increases more or less in proportion to M 2 .

Conclusions
We have developed a fast algorithm to simulate electromagnetic scattering from a microstructured, three-layered material. Our methodology permits inclusions of arbitrary shape using a scattering matrix formalism combined with the use of Sommerfeld integrals to account for the influence of the layered material. We have designed efficient procedures to evaluate the Sommerfeld integral at arbitrary locations in the layered material using the non-uniform FFT and an effective preconditioner that allows the multiple scattering problem to be solved using GMRES with a modest number of iterations. As one would expect from physical considerations, the performance of the method degrades when the packing of inclusions is dense and when the contrast is high. While the method is suitable for parallel implementation, we are also investigating the possibility of replacing GMRES iteration with a fast direct solver [16].
Extension of the present method to the quasi-periodic case, where the incoming field impinges on a periodic microstructure will be reported at a later date.