Sensitivity analysis of waveguide eigenvalue problems

We analyze the sensitivity of dielectric waveguides with respect to design parameters such as permittivity values or simple geometric dependencies. Based on a discretization using the Finite Integration Technique the eigenvalue problem for the wave number is shown to be nonHermitian with possibly complex solutions even in the lossless case. Nevertheless, the sensitivity can be obtained with negligible numerical effort. Numerical examples demonstrate the validity of the approach.


Introduction
Numerical simulations of waveguide structures by finite methods have been used for many years, and a number of eigenvalue formulations are available, based, e.g., on the Finite Integration Technique (FIT, Weiland, 1977), or the Finite Element method (FE, e.g. Farle et al., 2004). If the calculations aim mainly at the analysis of port planes of originally 3-D setups, the number of unknowns in the 2-D waveguide model is typically low, and results are available within seconds. However, the situation changes for optical applications such as fibers or integrated waveguides, where due to the small wavelengths in the optical regime the problem sizes (of the two-dimensional discrete model) sometimes exceed several 10 6 unknowns. If additionally the dependencies of the waveguide modes w.r.t. design parameters such as geometric dimensions or material parameters are searched for, it is desirable to have sophisticated approaches for fast parameter sweeps at hand. Several such approaches have been reported in the context of Model Order Reduction (MOR) techniques. In most cases they are based on projections of the system matrices by low-order projection matrices, with the frequency as the main design parameter. Recently, some effort has been taken to extend them to the so-called multi-variate case (Farle et al., Correspondence to: N. Burschäpers (burschaepers@tet.upb.de) 2006; Stavrakakis et al., 2009). The application to 2-D eigenvalue problems seems to be straight-forward, although only a few such attempts have been published (Polstyanko et al., 1997;Schultschik et al., 2008).
A different approach, the so-called sensitivity analysis of electromagnetic systems using adjoint techniques (Nikolova et al., 2004), has recently gained large interest. Starting with analytical differentiations of the algebraic matrix equations, compact formulas can be derived for the sensitivities of output quantities w.r.t. an arbitrary number of design parameters. Adjoint techniques have been applied to various formulations in electromagnetic modeling, but to our knowledge not yet to 2-D waveguide eigenvalue problems.
In this paper we apply a classical sensitivity analysis to the eigenvalue problem arising from a 2-D FIT-discretization of inhomogeneous dielectric waveguides. In Sect. 2 we show that the system matrix is non-Hermitian, supporting complex modes also in the lossless case. This has some consequences for the adjoint technique to be applied which will be addressed in Sect. 3, where we also discuss an efficient Algorithmic Differentiation (AD) method for the matrix derivatives. Some numerical examples in Sect. 4 demonstrate the validity and efficiency of the approach.

Waveguide eigenvalue problem using FIT
We consider a cross section of a dielectric waveguide (e.g., an optical fiber) and use the Finite Integration Technique, FIT, Weiland, 1977Weiland, , 1996, for the discretization of Maxwell's Equations in frequency domain. For sake of simplicity, a standard Cartesian mesh is used and PEC boundary conditions are imposed at a distance far enough from the core region. The derivation of the resulting eigenvalue problem has been described in detail in Weiland (1977); Schuhmann and Weiland (2001) and is only briefly revisited here.
The state variables of FIT are integral quantities which are defined on edges L i , L i and facets A i , A i of the primary grid 86 N. Burschäpers et al.: Sensitivity analysis of waveguide eigenvalue problems G and the dual grid G, respectively. These are the grid voltages and the grid fluxes (omitting currents) Using these definitions, Maxwell's equations (neglecting currents and charges) are transformed into a set of matrixvector-equations for the component vectors e , d , h , b , which are referred to as Maxwell's Grid Equations. The matrix C is the discrete curl-operator, S is the discrete divoperator of the primary grid, and S and C are the corresponding operators for the dual grid. In Cartesian grid systems with N P primary nodes these are structured matrices, e.g.
and the N P × N P -blocks P x , P y , P z can be identified as discrete partial differentiation operators (Weiland, 1996). From grid topology we find the relations SC = 0, S C = 0 and C = C T . The formulation is completed by the material relations (for linear media) Both material matrices M ε and M −1 µ are diagonal and may be complex (in frequency domain) to account for dielectric or magnetic losses. Their entries contain the locally averaged permittivity and permeability distribution as well as the metrics of the grid.
For the discretization of waveguide cross sections we use a two-dimensional Cartesian grid with N P = N x · N y primary nodes and assume a wave propagation E,H ∼ e −j k z z in zdirection, with k z the (unknown) wave number. Thus, the longitudinal differential operator becomes with the identity matrix I. Note, that applying Eq. (7) to (5) results in complex-valued matrices, which are only valid for one distinct k z . Besides, the transposed sub-matrices in C and S have to be replaced by the Hermitian expression P H z = +j k z I. In order to derive an eigenvalue formulation for the modes in such waveguide cross sections, we start with the discrete curl-curl eigenproblem and the divergence-free condition of the fields Equation (9) allows to eliminate the longitudinal components from the eigenvalue equation, since the expression (7) as well as the material matrices can be easily inverted: Finally a 2N P × 2N P -eigenvalue problem can be derived: For a fixed angular frequency ω we obtain a simple, non- Note that A is non-symmetric also in the lossless case: This is not a property of our specific formulation but rather of the physical setting itself, since it is well-known that dielectric waveguides may support so-called complex modes.
In the lossy case (with complex material matrices) the system is non-Hermitian and supports complex wave numbers k z = β − j α. Actually, dielectric losses can be allowed in the following without any restriction (except for some more effort in the algebraic solver for the eigenproblem).

Eigenvalue perturbation theory
We are interested in the sensitivity of the eigensolutions with respect to a number of design parameters such as geometric dimensions or permittivity values. For simplicity of notation we restrict here to one single parameter p and calculate the derivatives The derivation makes use of the left-eigenvectors y of the system (the eigenvector of the Hermitian matrix A H , hence: and the orthonormality condition of right-and lefteigenvectors (i,j denoting the index number of the eigensolution) Following the standard perturbation theory for eigenproblems (e.g., Nelson, 1976) we build the derivative d dp Multiplying from the left by y H yields, together with Eqs. (16) and (17), the desired eigenvalue sensitivity: Once λ has been calculated, Eq. (18) defines a linear system for the eigenvector derivative x . Its singularity can be easily removed using a normalization condition (Nelson, 1976), here defined again by y H x = 1.
To summarize, the ingredients to calculate the sensitivities are 1. the eigensolution {x,λ} itself, properly normalized, 2. the corresponding left-eigenvector y, 3. the derivative of the system matrix A .
Note that some care must be taken in the previous steps in case of multiple eigensolutions (degenerated modes).

Calculation of the left-eigenvector
Unfortunately, in a non-Hermitian system as given here, the left-and right eigenvectors are not identical. A symmetrization of the system (like, e.g., in lossless 3-D eigenproblems) is generally not available, since this loss of symmetry is due to the physics of inhomogeneous waveguides rather than an inefficient formulation.
However, physical considerations again can help to avoid an additional solver step as shown in the following. The orthogonality property (17) between the searched left eigenvector and the original right eigenvector (the transversal electric field) suggest that y may be related to the magnetic field in the guide which fulfills the continuous orthogonality relation for the fields of two modes i and j . It has already been shown previously (Schuhmann and Weiland, 2001) that this type of orthogonality can be reproduced within the discrete setting by n ( e (i) Fig. 1. Discrete evaluation of the E × H -based orthogonality relation in the waveguide cross section. The voltages e x ≈ E x x, e y ≈ E y y and h y ≈ H y y, h x ≈ H x x include the lengths which define the integration areas A yx , A yx (shaded gray).
As shown in Fig. 1 this formula directly rebuilds the desired cross product in a Cartesian system, and the area integration is implicitly included due to the integral character of the grid voltages e and h .
Obviously, the vector (now omitting the mode number j in the notation) seems to be the left-eigenvector we are looking for. To prove this assumption we can derive the eigenvalue formulation for the magnetic field components in a similar way than above for the electric field: Starting with the curl-curl problem which proves that y is the desired left-eigenvector. Besides the nice physical interpretation of this result (the transversal magnetic and electric fields of the modes constitute the leftand right-eigenvectors in this formulation), there is an even more important algebraic consequence: Although the matrix is non-symmetric, the left-eigenvector can be easily calculated without any additional solver step, simply by applying Faraday's law: Note that the z-components in e have to be calculated first from the eigenvector x by applying Eq. (10), and this step as well as the C-matrix in Eq. (25) contain the (square-root of the) eigenvalue k z .

Calculation of the matrix derivative
The next step is the calculation of the matrix derivative A = ∂ ∂p A in Eq. (19). We have implemented two variants: The first one depends on the type of parameter p and may only be available for simple constellations. If, e.g., the design parameter is related to the permittivity within the waveguide, the dependency of the material operator M ε (p) in Eqs. (12) and (13) can be analyzed by analytical differentiation, and the main task is to provide for an efficient implementation.
As a more general approach we also apply an Algorithmic Differentiation (AD) algorithm as described in Griewank and Walther (2008) and implemented in the ADOL-C package (ADOL-C). It enables the differentiation w.r.t. arbitrary parameters at low computational cost, once the original function has been coded in a computer program. As expected, the results are identical up to the level of numerical noise.
Note that AD is only applied to the matrix assembly in Eqs. (11)-(13). The parameter dependency in these formulas is considered as a function F : R n → R m , y = F(x) describing an algebraic mapping from R n to R m and should be defined by an evaluation procedure in a high-level computer language like Fortran or C. The technique of algorithmic differentiation, also called automatic differentiation, provides derivative information of arbitrary order for the code segment in the computer that evaluates F(x) within working accuracy. For this purpose, the basic differentiation rules such as, the product rule, the quotient rule etc., are applied to each statement of the given code segment. This local derivative information is then combined by the chain rule to calculate the overall derivatives. Hence the code is decomposed into a long sequence of simple evaluations, e.g., additions, multiplications, and calls to elementary functions such as sin(x) or exp(x), the derivatives of which can be easily calculated. Exploiting the chain rule yields the derivatives of the whole sequence of statements with respect to the input variables.
Over the past decades, extensive research activities led to a thorough understanding of AD and its two basic modes, the forward and the reverse mode. The complexity estimates for these approaches to compute derivative information are based on the operation count O F , i.e., the number of floating point operations required to evaluate F(x). Using the forward mode, one computes the required derivatives together with the function evaluation in one sweep as illustrated above. The forward mode yields one column of the Jacobian ∇F at no more than three times O F (Griewank and Walther, 2008). One row of ∇F, e.g., the gradient of a scalar-valued component function of F, is obtained using the reverse mode in its basic form also at no more than four times O F (Griewank and Walther, 2008). It is important to note that this bound for the reverse mode is completely independent of the number n of input variables.
For the problem formulation considered here, it suffices to apply the forward mode. The more sophisticated reverse mode will be required for more detailed problem descriptions to be considered in future work.

Numerical examples
We apply our algorithm to the fundamental modes in two inhomogeneous dielectric waveguides, where the permittivity distributions are given by simple one-dimensional parametrized models. Goal quantities are the wave number k z , or the effective refractive index n eff = k z /k 0 , respectively. In both cases their sensitivities have to be derived by a simple post-processing step from the derivative of the eigenvalue λ = −k 2 z . Parameter sweeps serve as a reference for the sensitivities. Note that the dimension of the system matrices has no influence on the performance and accuracy of the sensitivity analysis. Figure 2 shows the FIT 2-D computational grid and the geometry of a waveguide with dielectric inset. The parameter of the model is the relative permittivity p = ε 1 /ε 0 of the dielectric inset, and we calculate the wave number k z (p).

Graded index waveguide
The second example is a graded index waveguide with parabolic index profile. Figure 4 shows the 2-D computational grid, where due to the twofold symmetry only a quarter of the waveguide has to be discretized. Also shown is the profile of the refractive index n = √ ε r as a function of the radius ρ. The design parameter p = a is the core size of the waveguide, i.e. a geometrical parameter, which can of course easily be mapped on the permittivity distribution in each cell. Thus, the existence of the first derivatives of the matrices and the goal function w.r.t. the design parameter is assured. Figure 5 shows again the reference results together with a tangent using the first order derivative from the sensitivity analysis. The results agree very nicely.

Conclusions
The numerical examples show that the accurate calculation of the first derivative of the eigenvalue w.r.t. some design parameters can be performed at negligible computational cost. Although the original formulation is structurally non-Hermitian, the required left-eigenvector is available from the magnetic field of the modes without a further solving step. The main implementation effort has to be spent on providing a matrix derivative, where the AD approach promises to be both numerical efficient and generally applicable.
Some issues to be addressed in future work are the following: Although the result curves shown here are quite smooth (suggesting that the wave number depends only weakly on the design parameters), it might be desirable to have also sec-ond order derivatives. Following the ideas presented above, their calculation is straight-forward, provided that the differentiability of the operators involved is assured. The required second order matrix derivatives can again be obtained from the AD approach. Either second order derivatives or multiple first order derivatives at different expansion points may be used to attain broadband estimations of the parameter dependencies. Especially if applied to the frequency as parameter. this clearly is another link to related MOR approaches and the discussion on single-or multipoint approximations.
Finally, the sensitivity results can be very useful in optimization processes. Especially when derived for more than a single design parameter, the existing implementation will benefit from the generality of the AD approach.