Effect of Choices of Boundary Conditions on the Numerical Efficiency of Direct Solutions of Finite Difference frequency Domain Systems with Perfectly Matched Layers

Direct solvers are a common method for solving finite difference frequency domain (FDFD) systems that arise in numerical solutions of Maxwell's equations. In a direct solver, one factorizes the system matrix. Since the system matrix is typically very sparse, the fill-in of these factors is the single most important computational consideration in terms of time complexity and memory requirements. As a result, it is of great interest to determine ways in which this fill-in can be systematically reduced. In this paper, we show that in the context of commonly used perfectly matched boundary layer methods, the choice of boundary condition behind the perfectly matched boundary layer can be exploited to reduce fill-in incurred during the factorization, leading to significant gains of up to 40 percent in the efficiency of the factorization procedure. We illustrate our findings by solving linear systems and eigenvalue problems associated with the FDFD method.


Introduction
In any finite difference simulation, the domain must be truncated and a boundary condition is imposed. Common boundaries include the periodic boundary condition or the Dirichlet boundary condition. However, in many applications, one wants to simulate an "infinite" domain in the sense that any wave incident on the boundary should have no reflection. In one dimension, it is possible to eliminate all reflections via the Mur absorbing boundary condition [1]. However in n > 2 dimensions, there is no absorbing boundary condition that can eliminate reflections for waves incident from all angles. Instead, it is common to surround the computational domain by perfectly matched layers (PML), a specially designed anisotropic absorber [1,2]. The PML region is typically truncated using a boundary condition such as the periodic or the Dirichlet boundary condition.
The PML has been extensively used in computational electromagnetics, for both frequency and time-domain solutions of the Maxwell's equations [1][2][3][4][5]. In the time domain, significant efforts have been invested in reducing the reflection for a given number of layers in the PML region [6][7][8]. Various formalisms of the PML have also been developed to ensure strong suppression of reflection over broad bandwidth [2,9]. In the frequency domain, there are additional considerations in the applications of PML. For example, Ref. 10 has shown that the condition number of the system matrix, and hence the performance of iterative solutions of Maxwell's equations in the frequency domain, can vary drastically for different formalisms of PML, even though these formalisms offer similar wall-clock performance in time-domain algorithms.
In this paper we provide a discussion of PML in the context of direct solver for Maxwell's equations in the frequency domain. In such direct solvers, one typically performs an LU decomposition of the system matrix. We show that the fill-in in the LU decomposition strongly depends on the boundary condition that is used to truncate the PML. In arXiv:2202.13552v1 [math.NA] 28 Feb 2022 particular, the use of the Dirichlet boundary condition leads to far less fill-in and hence significant efficiency gain, as compared with the use of the periodic boundary condition. This result is in contrast with both time-domain algorithms, as well as with iterative algorithms in the frequency domain for the solutions of Maxwell's equations, where the use of these two boundary conditions do not significantly affect the performance of the algorithms. Our result is important for the development of direct solvers for Maxwell's equations in the frequency domain.
2 A Brief Review of the Finite Difference Frequency Domain Method 2.1 Linear system formulation The Maxwell's equation written for the H-field is: where ← → r denotes a three-dimensional dielectric tensor, ω is the frequency, H(r) denotes the vector magnetic field, M(r) is the vector magnetic source, and the magnetic permeability is assumed to be µ 0 in the entire computational domain. In the finite-difference frequency-domain (FDFD) method, one discretizes Eq. (1) via finite differences on a Yee grid [2,[11][12][13] to obtain a system of linear equations: The matrixÂ corresponds to the left hand side of Eq. (1) and x is a vector corresponding to the discretized version of H. b corresponds to the source term on the right hand side.
In order to obtain a solution to Eq. (2), a number of algorithms exist, such as Gaussian elimination [14]. However, a common method is to factorize the matrixÂ intoÂ =LÛ , whereL is a lower triangular matrix andÛ is an upper triangular matrix. The solution can then be obtained: Eq. (3) is typically called back-substitution. The computation of L −1 b and U −1 (L −1 b) is cheaper as compared with the full Gaussian elimination of the system matrixÂ [15,16]. However, the process of determining the factorization requires a modified version of Gaussian elimination, so the overall time complexity of using the LU method is the same as the Gaussian elimination [14]. On the other hand, if one has multiple right hand sides to solve, then the LU factorization becomes efficient as the L and U factors can be re-used.

Linear Eigenvalue System Formulation
The LU decomposition is also useful for solving eigenvalue problems in electromagnetics, an example of which is: Following the same discretization procedure as that leading to Eq. (2), we obtain the eigenvalue problem: whereÂ now corresponds to the left-hand side of Eq. (5) and λ = µ 0 ω 2 . Eigenvalues are typically determined iteratively by using the Arnoldi iteration and its variants where one repeatedly applies the matrixÂ to a test vector [17,18]. However, direct application of these methods can only produce the largest k eigenvalues of a system. In computational electromagnetics, one typically is interested instead in eigenvalues that are near a target value σ. For such a problem, a reformulation of the eigenvalue problem is required [17,19]. Instead of solving Eq. (5) we solve: whereÎ represents the identity matrix. The eigenvalues β can be related to eigenvalues of A via: β = 1/(λ − σ).
Thus the eigenvalues ofÂ that are closest to σ correspond to the largest eigenvalues of (Â − σI) −1 . In applying iterative methods to obtain the largest eigenvalues of (Â − σI) −1 , we need to repeatedly apply (Â − σÎ) −1 . In the FDFD method,Â is sparse, but (Â − σÎ) −1 is fully dense. For large problems it is too costly memory-wise to form (Â − σÎ) −1 explicitly. The efficient way to do this is to compute an LU factorization ofÂ − σÎ [20]. Thus, the LU factorization is important for the eigenvalue problems that arise in the FDFD method as well.

Brief Review of Boundary Conditions
In the FDFD method the computational domain is finite and a boundary condition needs to be applied on the surface of the computational domain. This is illustrated in Fig. 1, where we show an undirected graph representation of a simple square grid consisting of N = N x × N y nodes with N x = N y = 10. The points at which the fields are evaluated are termed nodes. We denote the fields evaluated at the nodes as x(i, j), where i is the row index and j is the column index. For a 2D problem, the Maxwell's equations can be written in terms of only E z or H z fields, where z is the out-of-plane direction, hence x can correspond either to the E z or the H z fields. The outermost nodes are the boundary nodes where the boundary condition is enforced. The edges represent the node-to-node coupling as derived from finite-difference approximation of Eq. (1). In Fig. 1a, we show the periodic boundary, which can be described as: x(N x , j) = x(0, j) and x(i, N y ) = x(i, 0). In Fig. 1b, we show a Dirichlet boundary condition: When x corresponds to the H(E) fields, the Dirichlet boundary condition corresponds to the perfectly magnetic (electric) conductor boundary condition.

(b)
PML Interior (a) In typical FDFD simulations one often surrounds a computation domain of interest with perfectly match layers (PML) [1,10]. These layers are used to ensure that any wave exiting the computational domain is nearly perfectly absorbed, which is important for simulating an open electromagnetic system. When the PML are used, the entire computational domain consists of the domain of interest and the PML regions. Outside the PML regions, the computational domain still needs to be truncated. Either the periodic or the Dirichlet boundary conditions can be used for this purpose.
The choice of either the periodic or the Dirichlet boundary conditions outside the PML regions does not affect their absorption properties. Moreover, in the finite difference time domain method, or when iterative solvers are used in the finite difference frequency domain method, these two boundary conditions result in very similar computational cost. Therefore, there has been very little attention paid to the choice between these two boundary conditions. As we will show in this paper, however, this choice in fact makes a significant difference when direct solvers are used in the finite difference frequency domain method, since the choice significantly influences the fill-in behaviors in the LU decomposition of the system matrix.

A Brief Review of the Fill-in of Sparse Linear Systems
In general,Â for Eq. (2) or Eq. (5) is sparse. A key consideration in the LU factorization is the issue of fill-in, which is the accumulation of additional non-zero elements inL andÛ outside those ofÂ. In general, fill-in during the factorization process for a sparse linear system depends sensitively on the detailed sparsity pattern of the system matrix.
Consider, for example, the following factorization of a matrix: where p denotes a matrix element which is nonzero (element values may be different). Because the first row has no zeros, the factorization process results inL andÛ factors that are dense [14]. In contrast, consider the factorization of an equivalent system as obtained from the left hand size of Eq. (7) by permuting the row index: TheL andÛ factors resulting from Gaussian elimination do not have any fill-in. In general, the order of the equations as well as the order of the variables can drastically change the fill-in. The operations which generate these reorderings can be expressed as preconditioners onÂ via a set of permutation matricesP andQ:Â →PÂQ, whereP performs row-wise andQ performs column-wise permutations. Unfortunately, the problem of finding the optimal permutation to produce the least amount of fill-in is NP-complete [21], which means there are no known, at-worst polynomial time algorithms to determine the optimal permutation. Instead, there is a wide array of different heuristic or approximate methods to reduce fill-in.
The state-of-the-art re-orderings are based on the exact minimum degree algorithm and their variants, such as the methods of approximate minimum degree (AMD) and nested dissection [22][23][24][25][26]. All of these methods effectively rely on the minimum degree heuristic, which is the observation that first eliminating variables in the factorization process with fewer couplings (or nonzeros in the row/column) tends to yield less fill-in overall.

Reducing Fill-in by Reducing Connectivity within the PML
As discussed in Section 2.3, the choice of boundary condition backing a PML can vary. As a result, we have the freedom to choose the boundary condition behind the PML. In this section, we now show how the choice of boundary condition behind the PML can affect the efficiency of sparse direct solvers, specifically impacting the fill-in during factorization.
From Figure 1, we can apply the minimum degree heuristic by counting the neighbors of the individual nodes on the grid. The number of neighbors corresponds to the number of nonzero elements in each row ofÂ. For the periodic boundary case shown in Fig. 1a, all nodes including the boundary have 4 neighbors. By the minimum degree heuristic, there is no preference in picking which rows to eliminate first. Meanwhile, for the Dirichlet boundary in Fig. 1b, the four corner nodes have 2 neighbors, the edge nodes have 3 neighbors and then all other nodes have 4 neighbors. By the minimum degree heuristic in this case, we can see that the rows corresponding to the boundary nodes should be eliminated first.
Finally, we can take advantage of the fact that the nodes adjacent to the Dirichlet boundary have fields that are generally close to zero, since the fields have been strongly by the PML. As a result, we can also further modify the connectivity to minimize the number of neighbors per node, as we show in Fig. 2:  Table 1: Changes in the number of couplings in a grid of N nodes for different boundary conditions. The 2nd column shows the analytic expression for the number of total couplings in the grid. The third column shows the values of these expressions for the specific case of N = 10000.
In Fig. 3, we show the influence of the boundary conditions on the fill-in behavior for the small 10×10 grids shown in Figs. 1 and 2. The use of such a small grid allows easier visualization of various quantities involved. We solve a 2D problem corresponding to setting all terms involving ∂ z to zero in Eq. (1). We simulate a magnetic source in vacuum for the magnetic field perpendicular H z to the 2D plane: where we assume r is isotropic, M z is the z-component of the magnetic source current. We consider the periodic boundary conditions, the Dirichlet boundary conditions, and the modified Dirichlet boundary conditions as discussed above where the connectivity at the boundary is reduced. The first column of Figure 3 shows the sparsity pattern of the system matrices for the three boundary conditions. The sparsity patterns of the three matrices are similar since they differ only on the boundary. Compared with the system matrix from the Dirichlet boundary condition (Fig. 3d), the matrix from the periodic boundary condition (Fig. 3a) has extra non-zero elements at the lower-left and upper-half corners in the figure due to the long-range coupling induced by the periodic boundary condition. The matrix from the modified Dirichlet (Fig. 3g) boundary condition has less non-zero elements near the diagonal as results from the modifications.
The second column in Figure 3 shows theL factor as obtained by directly factoring the corresponding matrices in the first column as is without any reordering. For the case with periodic boundary condition (Fig. 3b), theL factor fills in near the diagonal and the lower edge in the figure. The fill-in at the lower edge arises from the long-range coupling as induced by the periodic boundary condition. By switching to the Dirichlet boundary condition (Fig. 3e), the fill-in near the lower edge disappears and the number of non-zero elements in theL factor significantly decreases. Modifying the Dirichlet boundary condition further reduces the fill-in and the number of non-zero elements (Fig. 3h).
The third column in Figure 3 shows theL factor as obtained by factoring the corresponding matrices in the first column using a minimum degree heuristics to reorder the matrix [22]. Compared with the second column, the use of the minimum degree heuristics significantly reduces the number of fill-in for all three cases. Also, the number of non-zero elements reduces as we go from the case with the periodic boundary condition (Fig. 3c), to the case with the Dirichlet boundary condition (Fig. 3f), to the case with the modified Dirichlet boundary condition (Fig. 3i). TheL factor shown in Fig. 3i, which was obtained for the modified Dirichlet boundary condition using the minimum degree heuristics, has 456 non-zero element, as compared with the number of non-zero elements of 500 in the system matrix with the periodic boundary condition. The result in Figure 3 provides a direct demonstration that the use the modified Dirichlet boundary condition, in combination with the minimum degree heuristics, can significantly reduce the fill-ins.  Fig. 3, we show the sparsity pattern ofÂ corresponding to the finite difference discretization of the 10 × 10 grid with the a) periodic, d) Dirichlet boundary, and g) modified Dirichlet boundary with reduced connectivity at the boundary respectively. In the second column, we show the correspondingL factors for factorizing these three matrices as is. In the third column, we show correspondingL factors for factorizing these three matrices using a minimum degree heuristic to reorder the matrices. The number of nonzero elements (nnz) for each matrix is also provided in each figure.

Numerical Experiments
In this section, we show that the concept as described above can be applied to typical simulations of electromagnetic waves. We consider both the driven case with a source, and eigenmode calculations without a source. For both cases, we show that choosing either the periodic or the Dirichlet boundary conditions behind the PML does not affect the accuracy of the results, but the choice of the Dirichlet boundary condition leads to better numerical performance.

Driven Problem
We solve Eq. (9) as in the previous section but now the interior domain consists of N x × N y with N x = N y = 301 grid cells surrounded with a PML with a thickness of 30 cells to ensure minimal reflection [27]. The domain has a length a = 4λ where λ is the wavelength of the source.
First, we confirm that the use of either the periodic boundary or the modified Dirichlet boundary in conjunction with a PML indeed has no effects on the solution within the interior domain. In Fig. 4 we show the result where the  interior domain consits of vacuum and we put a point source at the center of the interior domain. For the two boundary conditions, the field distributions inside the computational domain are almost identical, as can be seen by comparing visually the field distributions shown in Fig. 4a and b, or by exampling the field distributions along a line in the computational domain as shown in Figure 4c.
For the matrixÂ corresponding to the driven problem, we show in Fig. 5 the difference in factorizing the matrix with the periodic boundary vs the Dirichlet boundary. In Fig. 5a and b, we show the fill-in of theL factor with the modified Dirichlet and periodic boundary respectively for the system shown in Figure 4. TheL factor from the modified Dirichlet boundary is noticeably sparser than theL corresponding to the periodic boundary.
In Fig. 5c, we vary the size the domain by keeping the domain to be a square shape and varying N x = N y . We plot the percentage reduction in the number of non-zero elements in theL factor as we change from the periodic to the modified Dirichlet boundary condition. The percentage reduction fluctuates significantly as N x varies, since the reordering strategies are heuristic. But in general, we see significant reduction for all N x 's. On average the percentage reduction decreases as N x increases. This is expected since the boundary, being a smaller and smaller fraction of the domain as N x increases, would lead to a smaller effect on the relative fill-in. What is remarkable is that the boundary couplings clearly have a disproportionately large effect on the fill-in ofL factor, even though the boundary nodes are often just 1% or less of the total nodes in the entire grid. Even with N x ≈ 1000, there are cases where the use of Dirichlet boundary condition results in 33% fewer nonzeros in theL as compared with the use of the periodic boundary condition.

Eigenvalue Problem
As a second example, we consider an eigenvalue problem that commonly arises in the study of nanophotonic waveguides [28,29]. In this problem, we consider a structure that is periodic along the x-direction. For a given operating frequency ω, we solve for the propagating wavevector k x . To develop the formalism, we substitute via the Bloch theorem H z (x, y) → H z (x, y)e ikxx in Eq. (9) to obtain: which is a quadratic eigenvalue problem of the form: where λ = k x and:K Eq. (11) can be reformulated into an equivalent linear, generalized eigenvalue problem of the form: Dλw =Bw (15) where w = [x, λx] with x denoting the vector formed by H z (x, y) and where: After discretization using a computational domain consisting of a single unit cell of the waveguide structure,0 denotes a matrix of zeros of dimension N x N y . In the waveguide eigenvalue problem, similar to the discussion in Section (2.2), we are typically interested in modes with wavevector near a particular value σ. Therefore, following the same derivation as in Section (2.2), the matrix that needs to factored isB − σD [17].
As an illustration we consider the waveguide structure schematically shown in Figure 6a [30,31]. The permittivity of the materials used is provided in the caption of Figure 6a. The waveguide consists of two walls. Each wall consists of a mixture of dielectric and a Drude metal. The thickness of each wall is 0.3 µm. The separation between the walls is 1.6 µm. The structure is periodic along the x-direction. The periodicity along the x-direction is 0.2 µm. The metal region in the wall has a width along the x-direction of 0.04 µm. Two unit cells are shown in Figure 6a. By operating this system at λ = 2 µm, the walls become near-completely reflective, creating a hollow-core waveguide with light guided in the vacuum region between the walls.
To simulate this structure we use a computational domain consisting of a single unit cell shown in Figure 6a. We discretize the computational domain with N x × N y grid cells where N x = 100 and N y = 150. Along the x-direction a periodic boundary condition is imposed at the edges of the computational domain. Along the y-direction we place PML regions at both edges. The PML regions consist of 10 grid cells. Outside the PML region we truncate the grid with either the modified Dirichlet or the periodic boundary conditions.
First, we validate that the use of different boundary conditions backing the PML does not affect the guided modes of the system. In Fig. 6b and c, we compare the real part of the field profiles of the structure for a sample mode with the same solved k x eigenvalue but with either the Dirichlet or periodic boundary conditions. We see an excellent agreement between the modal profiles. In Fig. 7, we compare the bandstructure of the waveguide with the periodic and Dirichlet boundary conditions backing the PML in the y-direction. To generate these bandstructures, we scan a discrete set of ω values, and look for the k x eigenvalues closest to 0. Additionally, due to spurious solutions that come from discretizing Eq. (10), a mode-filter must be implemented to produce the band-structure. Fig. 7a is the band structure with the Dirichlet boundary and Fig. 7b is the corresponding one with the periodic boundary. Again, we see that the band structures in both case are essentially identical, as expected.   Now we compare the fill-in of factorizing (B −σD) with the periodic versus the modified Dirichlet boundary conditions. In Fig. 8a-c we consider the case where σ = 0. In Fig. 8a, we show the sparsity pattern of the matrixB in Eq. (17). In Fig. 8b and c, we show the fill-in of theL factor ofB with the domain truncated by the periodic boundary and the modified Dirichlet boundary conditions, respectively. The fill-in patterns vary substantially between the two cases. Moreover, even though we have PML regions only along the y boundaries, we still achieve a 40% reduction in the number of nonzeros. In the second row of Fig. 8, we show a case with σ = 0. SinceB andD have different sparsity patterns, the sparsity pattern shown in Fig. 8d differ from that in Fig. 8a. In comparing Fig. 8e with Fig. 8f, we again see that the use of the modified Dirichlet boundary condition leads to less fill-in.

Conclusion
In summary, we consider direct solvers for the FDFD methods applied to Maxwell's equations. We demonstrate that the choice of boundary condition used behind the PML can have significant effects on the numerical efficiency of the LU factorization of the system matrix. With the PML the physical properties of the solutions do not depend on the  11) of the eigenvalue problem. The gray lines partition the matrix into N x N y × N x N y sub-blocks corresponding to those in Eqs. (17) and (16). The corresponding sparsity pattern of theL factor with the periodic boundary condition. c) The corresponding sparsity pattern of theL factor for the modified Dirichlet boundary condition. In the second row, we show the analogous result with σ = 1 (in units of λ p , the free-space wavelength at the plasma frequency).
choices of the boundary conditions. But we observe that the use of a modified Dirichlet boundary condition can lead to a reduction of fill-in by up to 40%, as compared to the commonly used periodic boundary conditions. Such a reduction translates to significant increase in computational speed and decrease in memory requirements for the direct solvers.
While in our paper we focus on the PML, our results should be applicable for other numerical absorbers, such as the adiabatic absorbers [32]. Also, while we illustrate our observations with two-dimensional simulations, the same observations should be applicable for three-dimensional simulations as well. Our observation should also be relevant for finite-element frequency-domain solvers.
Our result is important for enhancing the performance of direct solvers in the FDFD methods. In addition, this result should have implications for several numerical tasks in computational electromagnetics that can benefit from efficient LU decomposition of the system matrix, such as adjoint variable method [12,33], efficient line searches for gradient-based photonic optimization [34][35][36], and domain decomposition methods utilizing the Schur complement approach [37,38].