Transformation Inverse Design

We present a new technique for the design of transformation-optics devices based on large-scale optimization to achieve the optimal effective isotropic dielectric materials within prescribed index bounds, which is computationally cheap because transformation optics circumvents the need to solve Maxwell's equations at each step. We apply this technique to the design of multimode waveguide bends (realized experimentally in a previous paper) and mode squeezers, in which all modes are transported equally without scattering. In addition to the optimization, a key point is the identification of the correct boundary conditions to ensure reflectionless coupling to untransformed regions while allowing maximum flexibility in the optimization. Many previous authors in transformation optics used a certain kind of quasiconformal map which overconstrained the problem by requiring that the entire boundary shape be specified a priori while at the same time underconstraining the problem by employing"slipping"boundary conditions that permit unwanted interface reflections.


Introduction
In this work, we introduce the technique of transformation inverse design, which combines the elegance of transformation optics [1][2][3][4][5] (TO) with the power of large-scale optimization (inverse design), enabling automatic discovery of the best possible transformation for given design criteria and material constraints. We illustrate our technique by designing multimode waveguide bends [6][7][8][9][10][11][12][13][14][15][16][17][18] and mode squeezers [17][18][19][20][21], then measuring their performance with finite element method (FEM) simulations. Most designs in transformation optics use either hand-chosen transformations [3, 11-14, 19, 22-29] (which often require nearly unattainable anisotropic materials), or quasiconformal and conformal maps [2, 6-10, 19, 30-48] which can automatically generate nearly-isotropic transformations (either by solving partial differential equations or by using grid generation techniques) but still require a priori specification of the entire boundary shape of the transformation. Further, neither technique can directly incorporate refractive-index bounds. On the other hand, most inverse design in photonics involves repeatedly solving computationally expensive Maxwell equations for different designs [49][50][51][52][53][54][55][56][57][58][59][60][61][62][63][64][65][66][67][68]. Transformation inverse design combines elements of both transformation optics and inverse design while overcoming their limitations. First, the use of optimization allows us to incor- porate arbitrary fabrication constraints while at the same time searching the correct space of transformations without unnecessarily underconstraining or overconstraining the problem. Second, instead of solving Maxwell's equations, we require only simple derivatives to be computed at each optimization step. This is because transformation optics works by using a coordinate transformation x ′ (x) that warps light in a desired way (e.g. mapping a straight waveguide to a bend, or mapping an object to a point or the ground for cloaking applications [3, 30, 37-40, 69, 70, 73]) and then employing transformed materials which are given in terms of the Jacobian J i j = ∂ x ′ j /∂ x i to mathematically mimic the effect of the coordinate transformation. This transforms all solutions of Maxwell's equations in the same way (as opposed to non-TO multimode devices which often have limited bandwidth and/or do not preserve relative phase between modes [57,68,[74][75][76][77][78][79][80][81][82]), and is therefore particularly attractive for designing multimode optical devices [17,31,[83][84][85] (such as mode squeezers, expanders, splitters, couplers, and multimode bends) with no intermodal scattering. (Before TO was first used for cloaking, electrostatic cloaking using anisotropic conductivities [71,72] had been proposed). Examples of such transformations are shown in Fig. 1. For a more comprehensive treatment of the physical and mathematical ideas of transformation media used in this paper, see [73].
One major difficulty with transformation optics is that most functions x ′ (x) yield highly anisotropic and magnetic materials. In principle, these transformed designs can be fabricated with anisotropic microstructures [26,[86][87][88] or naturally birefringent materials [22,27]. However, in the infrared regime (where metals are lossy) it is far easier to instead fabricate effectively isotropic dielectric materials, provided that the refractive index falls within the given bounds n min and n max of the fabrication process (for example, subwavelength nanostructures [33,[38][39][40]44,69,86,[89][90][91] or waveguides with variable thickness [92][93][94][95][96][97]). This requirement means that we would prefer to consider the subset of transformations that can be mapped to approximately isotropic dielectric materials.
The theory of transformation optics with nearly isotropic materials is intimately connected to the subjects of conformal maps (which are isotropic by definition [2,98,99]) and quasicon-formal maps [which in mathematical analysis are defined as any orientation-preserving transformation with bounded anisotropy (as quantified in Sec. 2.4)]. However, in transformation optics the term "quasiconformal" has become confusingly associated with only a single choice of quasiconformal map suggested by Li and Pendry [30]. In that work, Li and Pendry proposed minimizing a mean anisotropy with "slipping" boundary conditions (defined in Sec. 2.4), which turns out to yield a transformation that is essentially conformal up to a constant stretching (and thus anisotropy) everywhere. This map, which also happens to minimize the peak anisotropy given the slipping boundary conditions [30,100], is sometimes confusingly called "the quasiconformal map" [6,34,36,38,40]. However, we point out in Sec. 2.4 that slipping boundary conditions are not the correct choice if one wishes to ensure a reflectionless interface between transformed and untransformed regions. Instead, for interfaces to be reflectionless requires at least continuity of the transformation x ′ at the interface [85,[101][102][103] and, as we show in Sec. 2.3 for the case of isotropic dielectric media, continuity of the Jacobian J as well. If one fixes the transformation on part or all of the boundary (instead of just the corners) and minimizes the peak anisotropy, the result is called (in analysis) an extremal quasiconformal map [104][105][106][107][108][109]. We point out in Sec. 2.2 that this extremal quasiconformal map can never be conformal except in trivial cases. Additionally, previous work in quasiconformal transformation optics underconstrained the space of transformations in one way but overconstrained it in another. Li and Pendry's method, along with other work on extremal quasiconformal maps in mathematical analysis, assumed that the entire boundary shape of the transformed domain is specified a priori (even if the value of the transformation at the boundary is not specified). In contrast, transformation inverse design allows parts of the boundary shape to be freely chosen by the optimization, only fixing aspects of the boundary that are determined by the underlying problem (e.g. the input/output facets of the boundary in Fig. 1) as explained in Secs. 3.2, allowing a much larger space of transformations to be searched. Also, for such stricter boundary conditions, minimizing the mean anisotropy is not equivalent to minimizing the peak anisotropy [105,[110][111][112], and we argue below that the peak anisotropy is a better figure of merit for transformation optics in general.
We solve all of these problems by using large-scale numerical optimization to find the transformation with minimal peak anisotropy that exactly obeys continuity conditions at the boundary with untransformed regions. This allows the input/output interfaces to transition smoothly and continuously into untransformed devices while also satisfying fabrication constraints (e.g. bounds on the attainable refractive indices and bend radiii). A large space of arbitrary smoothly varying transformations (that satisfy the continuity conditions and fabrication constraints) is explored quickly and efficiently by parametrizing in a "spectral" basis [113,114] of Fourier harmonics and Chebyshev polynomials. The optimized transformation is then scalarized (as in the case of previous work on quasiconformal transformation optics) into an isotropic dielectric material that guides modes with minimal intermodal scattering and loss. In the case of a multimode bend, for which our design was recently fabricated and characterized [115], we achieve intermodal scattering at least an order of magnitude smaller than a conventional non-TO bend.
In Sec. 2.1, we review the equations of transformation optics. In Sec. 2.2, we describe situations where the transformation-designed material can be mapped to isotropic media. In Sec. 2.3, we point out that such isotropic transformations, due to their analyticity, always have undesirable interface discontinuities when coupled into untransformed regions. In Secs. 2.4 and 2.5, we review the techniques of quasiconformal mapping (as used in both the transformation optics and mathematical analysis literature) and scalarization of nearly isotropic transformations. We show that the inherent restrictions of quasiconformal mapping can be circumvented by directly optimizing the map using transformation inverse design. In Secs. 3.1 and 3.2, we design a nearly isotropic transformation for a 90 • -bend by perturbing from the highly anisotropic circular bend transformation. In Secs. 3.3 and 3.4, we set up the bend optimization problem and the spectral parameterization. In Sec. 4, we present the optimized structure, which reduces anisotropy by several orders of magnitude compared to the circular TO bend. In Sec. 4.1, we present finite element simulation results comparing our optimized design to the conventional non-TO bend and the circular TO bend. In Secs. 4.2 we show that minimizing the mean anisotropy can lead to pockets of high anisotropy (which in turn leads to greater intermodal scattering) while minimizing the peak does not. In Secs. 4.3, we discuss the tradeoff between the bend radius and the optimized anisotropy. In Sec. 5 we briefly present methods and results for applying transformation inverse design to optimize mode squeezers.

Transformation optics
The frequency domain Maxwell equations (fields ∼ e −iωt ), without sources or currents, in linear isotropic dielectric media [ε ε ε = ε(x), µ µ µ = µ 0 ] are Consider a coordinate transformation ∇ and the primed fields as E ′ ≡ J −1 E and H ′ ≡ J −1 H. One can then rewrite Eq. (1), after some rearrangement [1,116], as where the effects of the coordinate transformation have been mapped to the equivalent tensor materials This equivalence has become known as transformation optics (TO). It is actually the specific case of a much more general result from general relativity [118]. For a further discussion of space-time transformations and connections to negative refraction, see [73] and [119]. Most useful applications of TO require that the transformation be coupled to untransformed regions (e.g. the input and output straight waveguides in the case of a bend transformation, or the surrounding air region for the case of a ground-plane cloaking transformation). However, in order for TO to guarantee that the interface between transformed and untransformed regions be reflectionless, the transformation must be equivalent to a continuous transformation of all space that is the identity x ′ (x) = x in the "untransformed" regions, as depicted in Fig. 2. More generally, the untransformed regions can be simple rotations or translations, but when examining a particular interface, we can always choose the coordinates to be x ′ = x at that interface. It is clear by construction that continuous x ′ is sufficient for reflectionless interfaces [85,102,103], and this is in fact a necessary condition as well [101]. Although a general anisotropic transformation need only have x ′ (x) continuous at the interface , we show below that an isotropic transformation will also have a continuous J at the interface. These boundary conditions are essential for designing useful transformations without interface reflections.

Transformations to isotropic dielectric materials
For the vast majority of transformations, the materials in Eq. (3) are anisotropic tensors. However, for certain transformations, the tensors are effectively scalar. Suppose that the transforma- , making J block-diagonal (with the zz element independent of the xy block). Then, the xy block of J T J is isotropic if and only if the diagonal elements are equal and the off-diagonal elements vanish: In this case, the J part of Eq. (3) becomes This isotropy has different implications for transverse-magnetic (TM) polarized modes in 2D (which have E = Eẑ and H ·ẑ = 0) versus transverse-electric (TE) polarized modes (which have E ·ẑ = 0 and H = Hẑ). For TM-polarized modes, the fields E ′ , H ′ in the primed coordinate system are also TM-polarized and Eq. (2) becomes Hence, for TM modes, an isotropic transformation can be exactly mapped to an isotropic dielectric material. Similarly, for TE-polarized modes the equivalent material is isotropic and magnetic. However, if det J varies slowly compared to the wavelengths of the fields, then the transformation can approximately still be mapped to an isotropic dielectric material by making an eikonal approximation (as in [117] Ch. 8.10) and commuting 1/µ ′ with one of the curls in the Maxwell equations. In particular, Eq. (2) can be written: where the last term can be neglected for slowly varying transformations. Because the TM case is conceptually simpler and does not require this extra approximation, we work with it exclusively for the rest of this paper. Also, because the non-trivial aspects of the transformation occur in the xy plane, we hereafter use J to denote the xy block.

Conformal maps and uniqueness
If a transformation has an isotropic J , then the transformation preserves angles in the xy plane. Additionally, if det J > 0, then the transformation also preserves handness and orientation. The combination of these two properties is called a conformal map [98], and is the only case where the situation in Sec. 2.2 can be realized. We only consider transformations with det J > 0 in order to restrict ourselves to dielectric materials. Also, a detJ > 0 transformation coupled continuously to an untransformed (det J = 1) region would require singularities (detJ = 0) at some points. Conformal maps are described by analytic functions, which are of the form is the untransformed complex coordinate) and whose real and imaginary parts satisfy the Cauchy-Riemann equations of complex analysis [98,99]. However, true conformal maps cannot directly be used for transformation optics in typical applications, because of the impossibility of coupling them to untransformed regions with the boundary conditions discussed in Sec. 2.1. In particular, the uniqueness theorem of analytic functions [99,Thm. 10.39] tells us that if w ′ (w) = w in some region, then w ′ (w) = w everywhere (similarly for a simple rotation or translation in some regions).
As a corollary, in the limit where a transformation becomes more and more isotropic in the neighborhood of an interface, it must have a continuous J , not just a continuous x ′ (x). It is easy to see this explicitly in the example of Fig. 2: continuity of x ′ (x) at the interface requires that ∂ x ′ ∂ x = (1, 0) on both sides of the interface, which determines the first row of J . The isotropy of J T J then forces J = I. Therefore, in the sections that follow (where we search for approximately isotropic maps), we will impose the condition of continuous J as a boundary condition on our transformations. The resulting transformations are nearly isotropic in the interior and exactly isotropic on the interfaces. This condition, discussed at the end of Sec. 2.5, also has the useful consequence of producing a continuous refractive index n ′ = ε ′ µ ′ .

Quasiconformal maps and measures of anisotropy
Because true conformal maps cannot be used, one widely used alternative is to search for a nearly isotropic transformation, which can be approximated by an isotropic material at the cost of some scattering corrections to the exactly transformed modes of the nearly isotropic material. To do this, one must first quantify the measure of anisotropy that is to be minimized. The isotropy condition of Eq. (4) is equivalent to λ 1 = λ 2 , where λ 1 (x, y) ≥ λ 2 (x, y) are the two eigenvalues of J T J . While λ 1 − λ 2 works as a measure of anisotropy, it is convenient for optimization purposes to define differentiable measures that can be expressed directly in terms of the trace and determinant of J , and precisely such quantities have been developed in the literature on quasiconformal maps [104,105,109,120,121].
A general transformation is an arbitrary function of x and y or, equivalently, an arbitrary function w ′ (w,w), of w andw = x − iy, which may not be analytic in w. The anisotropy can be related to the Beltrami coefficient [105,109] The term quasiconformal map refers to any map that has bounded |µ B | < 1, which includes all non-singular sense preserving (detJ > 0) transformations. It can be shown that the linear distortion K [104,105] satisfies Various other measures of anisotropy have been defined in the grid generation literature [122,123], including the Winslow and Modified Liao functionals, which are given by Φ ≡´d 2 x (K + 1 K ) and Φ ≡´d 2 x (K 2 + 1 K 2 ), respectively. However, for the rest of this work we refer to the quantity K − 1 ≥ 0 as the "anisotropy", where K is the distortion function [104,105], defined as The tensor J T J det J is known as the distortion tensor [105]. As mentioned in the introduction, an extremal quasiconformal map is one that minimizes the peak anisotropy, given the shape of the transformed region and the values of the transformation on some or all of the boundary [104,105,109,120]. Because K, K − 1, K, and |µ B | are all monotonic functions of one another, they are equivalent for the purpose of finding an extremal quasiconformal map. However, K is numerically convenient because it is a differentiable function of the entries of J . These quantities are not generally equivalent for minimizing the mean anisotropy [105,[110][111][112], and we argue in Sec. 4.2 that the peak anisotropy is a better figure of merit. However, in the special case where the value of the transformation is only fixed at the corners of the domain and is allowed to vary freely in between (a "slipping"' boundary condition), Li and Pendry showed that it is equivalent to minimize the mean (either Winslow or modified-Liao) or the peak anisotropy, and that these yield a constant-anisotropy map (a uniform scaling of a conformal map) [100] that they and other authors have used for transformation optics [6,19,[31][32][33][34][35][36][37][38][39][40][41][42][43][44][45]. However, the slipping boundary will generally lead to reflections at the interface between the transformed and untransformed regions because of the resulting discontinuity in the transformation, which can only be reduced by making the transformation domain very large in cases (e.g. cloaking) with localized deformations. In order to design compact transformation-optics devices, especially for applications such as bends where the deformation is nonlocalized, we will instead impose continuity of the transformation and/or its Jacobian on the input/output facets of the domain, while at the same time allowing the shape of some or all of the boundary to vary (unlike all previous work on quasiconformal maps, to our knowledge).

Scalarization errors for nearly isotropic materials
The minimum-anisotropy quasiconformal map is then scalarized (as in [30]) by approximating it with an isotropic dielectric material. As shown in Sec. 2.2, a perfectly isotropic 2D transformation of a geometry with an isotropic dielectric material that guides TM modes E 0 , H 0 can be mapped to a transformed material and geometry that is also isotropic dielectric and guides TM modes E ′ 0 , H ′ 0 . This is exact for K = 1, but for a nearly isotropic transformation with K > 1, the equivalent permeability is µ µ µ ′ = I + ∆µ µ µ, where the anisotropic part ∆µ µ µ is proportional to K − 1 to lowest order. While ∆µ µ µ ′ = 0 cannot be fabricated using dielectric gradient index processes, one can neglect this small correction so that the actual fabricated material has permeability µ µ µ ′ approx = I. In practice, we absorb any ∆µ µ µ ′ into ε ′ by multiplying ε ′ by the average but this does not change the O (K − 1) error. A Born approximation [124][125][126] tells us that, given an exact transformation with no scattering, any small error of ∆ε ε ε and ∆µ µ µ will generically lead to scattered fields with magnitudes of O(|∆ε ε ε|+|∆µ µ µ|) and scattered power of O(|∆ε ε ε| 2 +|∆µ µ µ| 2 ). The modes of the approximate scalarized material µ µ µ ′ approx , ε ′ approx are then the exact guided modes plus scattered power corrections A similar analysis explains why we must explicitly impose continuity of J at the input/output facets of the domain. As explained in Sec. 2.3, a purely isotropic transformation in the neighborhood of the interface, along with a continuity of x ′ , would automatically yield continuous J , so one might hope that minimizing anisotropy would suffice to obtain a nearly continuous J . Unfortunately, as we show in the Appendix, the resulting discontinuity in det J (and hence the discontinuity in the refractive index) is of order O( √ K − 1), which would lead to O(K − 1) power loss due to reflections, much larger than the O[(K − 1) 2 ] power scattering from anisotropy in the interior. This would make it pointless to minimize the anisotropy in the interior, since the boundary reflections would dominate. In fact, our initial implementation of the bend optimization in Sec. 3.4 did not enforce continuity of J , and we obtained a large 2% index discontinuity at the endfacets for max x K − 1 ≈ 0.0005. Therefore, in Sec. 3.4 we impose continuity of J explicitly.

General optimization of anisotropy
In this paper, we directly minimize K using large-scale numerical optimization while keeping track of constraints on the transformation x ′ and its Jacobian J , as well as the engineering fabrication bounds n min and n max . By using numerical optimization, we can in principle achieve both a lower mean anisotropy and a lower peak anisotropy than by traditional quasiconformal mapping, since the optimization is also free to vary the boundary shape (with at most the input/output interfaces fixed, although in some cases their locations and shapes are allowed to vary as well). The minimization problem can be written, for example, as where K(x) is a functional norm taken over the domain of x ′ (x). We consider two possible norms: the L 1 norm (the mean K x ), and the L ∞ norm (max x K). We show in Sec. 4.2 that minimizing the mean can lead to pockets of high anisotropy which can cause increased scattering. Directly optimizing the peak anisotropy on the other hand, avoids such pockets while simultaneously keeping the mean nearly as low. The continuity of x ′ and J at the input/output interfaces, as well as other constraints on the interface locations, are imposed implicitly by the parametrization of x ′ (x) (as explained in Sec. 3.4).

Multimode Bend design
In this section, we design a bend transformation (depicted in Fig. 3) using general methods to (locally) solve the optimization problem of Eq. (8). In contrast, previous work on TO bend design either utilized materials that were either anisotropic or consisted of multiple stacked isotropic layers [11-15, 17, 18] or employed slipping boundary conditions [6][7][8][9][10] (which result in endfacet reflections when coupled to untransformed waveguide).

Simple circular bends
First, we consider a simple circular bend transformation (which we refer to hereafter as the circular TO bend) that maps a rectangular segment of length L and width unity (in arbitrary distance units to be determined later) into a bend with inner radius R and outer radius R + 1 (as shown in Fig. 3). For convenience, we choose the untransformed coordinates to be R ≤ x ≤ R+1 and − L 2 ≤ y ≤ L 2 , with the untransformed segment length L = πR 2 equal to the inner arclength of the bend. The transformation x ′ (x) can be written as where r = x and θ = y R . While x ′ is continuous at the input/output interfaces y = ± L 2 , one issue is that J is not continuous there, which can be seen from det J = x R = 1. Another issue is that µ µ µ ′ = µ 0 I is highly anisotropic. The anisotropy for this transformation is K(x, y) − 1 = x 2R + R 2x − 1, which has a peak value max x K − 1 ≈ 1 2R 2 for R ≫ 1 at the outer radius x = R + 1. Note that one can instead choose r = exp( πx 2L ), which gives the conformal bend x ′ + iy ′ = exp[ π 2L (x + iy)]. As explained in Sec. 2.3, this map has zero anisotropy, but neither x ′ nor J are continuous at the input/output interfaces, leading to large reflections there.

Generalized bend transformations
In order to address the problems of the circular TO bend, we look for minimum anisotropy and continuous-interface transformations of the form of Eq. (9), where the intermediate polar coordinates are now arbitrary functions r(x, y) and θ (x, y). The ratio L/R is now an optimization parameter. The Jacobian then satisfies We find that the optimization always seems to prefer a symmetric bend (and if the optimum is unique, it must be symmetric), so we impose a mirror symmetry in order to halve our search space: We also require interface continuity of x ′ and J (as discussed in Sec. 2.3), which give the conditions at y = ± L 2 :

Numerical optimization problem
Besides minimizing the objective function K, the optimization must keep track of several constraints. First, any fabrication method will bound the overall refractive index n ′ to lie between some values n min and n max . We choose units so that the width of the transformed region is unity (R ≤ x ≤ R + 1), and consider transforming a straight waveguide of width ∆ w < 1. ∆ w should be small enough so that the exponential tails of the waveguide modes are negligible outside the transformed region. In the straight waveguide segment to be transformed (as well as the straight waveguides to be coupled into the input and output interfaces of the bend), n(x) is high in the core x − R − 1 2 < ∆ w 2 and low in the cladding x − R − 1 2 > ∆ w 2 . For convenience, we write this refractive index as a product n(x) = n 0 p(x) of an overall refractive index n 0 and a normalized profile p(x) that is unity in the cladding and some value greater than unity in the core (determined by the ratio of the high and low index regions of the straight waveguide). The transformed refractive index is given by where the average eigenvalue µ ′ of the magnetic permeability Eq. (7) has been absorbed into the dielectric index. The overall refractive-index scaling n 0 is then allowed to freely vary as a parameter in the optimization. Second, like the circular TO bend, the optimum TO bend is expected to have a tradeoff between the bend radius and anisotropy. Because of this expected tradeoff, we can choose to either minimize R while keeping K fixed, or minimize K while keeping R fixed. We focus on the latter choice, since the bend radius is the more intuitive target quantity to know beforehand. Also, we find empirically that optimizing K converges much faster than optimizing R while yielding the same local minima. With these constraints, there are several ways to set up the optimization problem, depending on which norm we are minimizing. One method is to minimize the peak anisotropy max x K with x ∈ G for some grid G of some points to be defined in Sec. 3.4. However, the peak (the L ∞ norm) is not a differentiable function of the design parameters, so it should not be directly used as the objective function. Instead, we perform a standard transformation [114]: we introduce a dummy variable t and indirectly minimize the peak K using a differentiable inequality constraint between t and K(x) at all x ∈ G: min r(x), θ (x), n 0 , L,t t subject to : For comparison, we explain in Sec. 4.2 why the L ∞ norm is better to minimize than the L 1 norm (the mean anisotropy). The minimization of the L 1 norm, K x =´K dx dy/area [which is differentiable in terms of the parameters r(x), θ (x), n 0 , and L] is implemented as min r(x), θ (x), n 0 , L K x subject to : We use the circular bend r = x, θ = πy 2L = y R as a starting guess, and search the space of general transformations r(x), θ (x) by perturbing from this base case. (We only perform local optimization; not global optimization, but comment in Sec. 4.3 on a simple technique to avoid being trapped in poor local minima.) As explained in Sec. 3.4, the perturbations will be parameterized such that the symmetry and continuity constraints are satisfied automatically. Figure 3 shows a schematic of the bend transformation optimization process. First, the straight region is mapped to a circular bend. Then, the intermediate polar coordinates r and θ for every point x are perturbed, using an optimization algorithm described at the end of Sec. 3.4, and the desired norm (either L 1 and L ∞ ) of the anisotropy is computed. This process is repeated at each optimization step until the structure converges to a local minimum in K .

Spectral parameterization
To faciliate efficient computation of the objective and constraints, the functions r and θ can be written as the circular bend transformation plus perturbations parametrized in the spectral basis [113,114]: where the coordinate 2x − 2R − 1 has been centered appropriately for the domain [−1, 1] of degree-ℓ Chebyshev polynomials T ℓ . The sines and cosines have been chosen to satisfy the mirror-symmetry conditions of Eq. (10). The sine series also automatically satisfies the second continuity condition of Eq. (11). In order to satisfy the rest of the conditions, the following constraints are also imposed: These equations are solved to simply eliminate the C r, θ ℓN m coefficients before optimization. This spectral parametrization has several advantages over finite-element discretizations such as the piecewise-linear parameterization of [104]. First, the spectral basis converges exponentially for smooth functions [113]. We found that only a small number (N ℓ × N m < 100) of spectral coefficients C r, θ are needed to achieve very low-anisotropy (K − 1 ≈ 10 −4 ) transformations. Second, if the fabrication process favors slowly varying transformations (or if these are needed to make the eikonal approximation for the TE polarization, as in Sec. 2.2), this constraint may be imposed simply by using smaller N ℓ and N m .
With this spectral parameterization, the formulation of the optimization problem Eq. (12) The local optimization was performed using the derivative-free COBYLA non-linear optimization algorithm [127,128] in the NLopt package [129]. In principle, we can make the optimization faster by analytically computing the derivatives of the objective and constraints with respect to the design parameters and using a gradient-based optimization algorithm, but that is not necessary because both trJ T J and detJ , which determine all the non-trivial objective and constraint functions in this optimization problem, are so computationally inexpensive to evaluate that the convergence rate is not a practical concern.

Minimal peak anisotropy
A min K ∞ design is shown in Fig. 4, along with the scalarized circular TO bend for comparison. The bend radius was R = 2 and the number of spectral coefficients was N ℓ = 5, N m = 8. The objective and constraints were evaluated on a 100 × 140 grid G in x (Chebyshev points in the x direction and a uniform grid in the y direction). This design had max x K − 1 ≈ 5 × 10 −4 and mean K − 1 ≈ 10 −4 . In comparison, the circular TO bend of the same radius has max x K − 1 ≈ 0.1 and K − 1 ≈ 10 −2 .
The R = 2 optimized design structure was compared in finite-element Maxwell simulations (using the FEniCS code [130]), with the conventional non-TO bend [simply bending the waveguide profile around a circular arc with n ′ (x ′ ) = n(x)] and the scalarized circular TO bend. The four lowest-frequency modes of a multimode straight waveguide were injected at the input interface y = L 2 , and the scattered-power matrix T was computed using the measured fields at the output interface y = − L 2 . The scattered-power matrix is defined as where −θ is the propagation direction of the guided modes, E 0 j is the normalized electric field of the jth exactly guided mode of the non-scalarized material (µ µ µ ′ , ε ′ ), and H i is the actual magnetic field of the approximate scalarized material at the interface after injecting a normalized mode E 0 i at the input interface. This makes T i j equal to the power scattered into the jth output mode from the ith input mode. For a straight waveguide, which has no intermodal scattering, T = I. Figure 4 shows a dramatically improved T for the scalarized and optimized TO bend compared to the scalarized circular TO bend. [The rows and columns of T for the circular bend add up to less than one because some power has either been scattered out of the waveguide entirely, or some power has been scattered into fifth or higher-order modes. The rows and columns of T for the optimized bend add up to nearly 1, with the small deficiency due to the O (K − 1) out-of-bend and higher-order intermodal scattering as well as mesh-descretization error.] The electric-field profiles for the fundamental mode, displayed in Fig. 5, show a dramatic difference in the performance of the optimized structure versus the other structures. Both the conventional and circular TO bend show heavy intermodal scattering in the bend region, while the optimized transformation displays very little scattering.

Minimizing max versus minimizing mean
We found a clear difference between minimizing the peak anisotropy versus minimizing the mean. The results of an optimization run with R = 2.5, N ℓ = 3, and N m = 6 are shown in Fig. 6. Both structures had very low mean anisotropy K x − 1. The mean-minimized structure, at K − 1 ≈ 10 −5 , had a slightly lower mean than the peak-minimized structure which had K − 1 ≈ 1.5 × 10 −5 . However, in terms of the peak anisotropy, the peak-minimized structure is the clear winner by a factor of 2.5, with max x K − 1 ≈ 2 × 10 −4 as opposed to max x K − 1 ≈ 5 × 10 −4 for the mean-optimized structure. Both structures were scalarized and tested in finite-element Maxwell simulations of the four lowest-frequency modes of the straight waveguide. The scattered-power matrix shows that the difference in max x K resulted in an order of magnitude reduction in the intermodal scattering (as shown in the off-diagonal elements) and noticeably improved transmission, (especially in the element T 44 = 0.89 for the fourth mode). Fig. 6. Anisotropy profile and scattered-power matrices for optimized designs that minimize the mean and the peak, with R = 2.5, N ℓ = 3, and N m = 6.

Tradeoff between anisotropy and radius
In optimized structures, we found that max x K for the optimized bend, similar to the circular TO bend, decreases monotonically with R (as shown in Fig. 7). Unlike the circular bend, however, this tradeoff seems asymptotically exponential rather than O(R −2 ). In particular, there are two clearly different regimes for this tradeoff: a power law K − 1 ∼ R −4 at small R 3 and an exponential decay K − 1 ∼ exp(−0.34R), at larger R. The second regime was only attained after using successive optimization, because with only one independent optimization run the algorithm tended to get stuck in local minima. For successive optimization, the optimum structure is used as a starting guess for the next run, and the initial step size is set large enough so that the algorithm can reach better local minima than the previous one.
For R 3, we found that there are multiple local minima and that independent optimizations for different R tend to be trapped in suboptimal local minima, as shown by the open dots in Fig. 7. To avoid this problem, we used a "successive optimization" technique in which the optimal structure for smaller R is rescaled as the starting guess for local optima at a larger R, in order to stay along the exponential-tradeoff curve. (Another possible heuristic is "successive refinement" [131][132][133][134], in which optima for smaller N ℓ, m are used as starting points for optimizing using larger N ℓ, m .)

Mode squeezer
We also applied transformation inverse design to another interesting geometry: a mode squeezer that concentrates modes and their power in a small region in space, again with minimal intermodal scattering (quite unlike a conventional lens, which is intrinsically angle/modedependent), similar to the problem considered in [19] (which did not construct isotropic designs). We choose the untransformed region to be −1 ≤ x ≤ 1 and 0 ≤ y ≤ L. The goal of this transformation x ′ (x) is to focus the beam by minimizing the mid-beam width Fig. 7. Successive optimization with N ℓ = 5, N m = 8 results in a power law decaying tradeoff max x K − 1 ∼ R −4 at low R and an exponentially decaying tradeoff at higher R. For comparison, the unoptimized anisotropy for the circular TO bend is shown above.
As in Sec. 3.4, the transformation is written as a perturbation from the identity transformation (which was used as the starting guess) and parameterized in the spectral basis The sine series automatically satisfies mirror symmetry about y = L 2 and continuity of x ′ at the input/output interfaces y = 0, L. However, we found that constraining the coefficients C x, y to enforce continuity of J (as in Sec. 3.4) was not necessary (although it might give a better result) since the optimization algorithm only squeezed the center region while leaving the interfaces and the regions around them relatively untouched. In this problem, we could either minimize K for a fixed W or minimize W for a fixed K, and we happened to choose the latter. Finite-element Maxwell simulations, shown in Fig. 8, demonstrate that the optimized design is greatly superior to a simple Gaussian taper transformation designed by hand. The Gaussian transformation was given by x ′ (x) = x − xα exp[−β (y − L 2 ) 2 ], where β > 0 and 0 < α < 1. Superficially, the design seems similar to an "adiabatic" taper between a wide low-index waveguide and a narrow high-index waveguide, and it is known that any sufficiently gradual taper of this form would have low scattering due to the adiabatic theorem [135]. However, the optimized TO design is much too short to be in this adiabatic regime. If it were in the adiabatic regime, then taking the same design and simply stretching the index profile to be more gradual (a taper twice as long) would reduce the scattering, but in Fig. 8 we perform precisely this experiment and find that the stretched design increases the scattering.

Concluding remarks
The analytical simplicity of TO design-no Maxwell equations need be solved in order to warp light in a prescribed way-paradoxically makes the application of computational techniques more attractive in order to discover the best transformation by rapidly searching a large space of possibilities. Previous work on TO design used optimization to some extent, but overconstrained the transformation by fixing the boundary shape while underconstraining the boundary conditions required for reflectionless interfaces. In fact, even our present work imposes more constraints than are strictly necessary-as long as we require continuous J at the input/output interfaces, there is no conceptual reason why those interfaces need be flat. A better bend, for example, might be designed by constraining the location of only two corners (to fix the bend radius) and constraining only J on other parts of the endfacets. However, we already achieve an exponential tradeoff between radius and anisotropy, so we suspect that further relaxing the constraints would only gain a small constant factor rather than yielding an asymptotically faster tradeoff. In the case of the mode squeezer, one could certainly achieve better results by imposing the proper J constraints at the endfacets. It would also be interesting to apply similar techniques to ground-plane cloaking.
All TO techniques suffer from some limitations that should be kept in mind. First, TO seems poorly suited for optical devices in which one wants to discriminate between modes (e.g. a modal filter) or to scatter light between modes (e.g. a mode transformer). TO is ideal for devices in which it is desirable that all modes be transported equally, with no scattering. Even for the latter case (such as our multimode bend), however, TO designs almost certainly trade off computational convenience for optimality, because they impose a stronger constraint than is strictly required: TO is restricted to designs where the solutions at all points in the design are coordinate transformations of the original system, whereas most devices are only concerned with the solutions at the endfacets. For example, it is conceivable that a more compact multimode bend could be designed by allowing intermodal scattering within the bend as long as the modes scatter back to their original configurations by the endfacet; the interior of the bend might not even be a waveguide, and instead might be a resonant cavity of some sort [55,75,76,136]. However, optimizing over such structures seems to require solving Maxwell's equations in some form at each optimization step, which is far more computationally expensive than the TO design and, unlike the TO design, must be repeated for different wavelengths and waveguide designs.