Adjoint shape optimization applied to electromagnetic design

We present an adjoint-based optimization for electromagnetic design. It embeds commercial Maxwell solvers within a steepest-descent inverse-design optimization algorithm. The adjoint approach calculates shape derivatives at all points in space, but requires only two “forward” simulations. Geometrical shape parameterization is by the level set method. Our adjoint design optimization is applied to a Silicon photonics Y-junction splitter that had previously been investigated by stochastic methods. Owing to the speed of calculating shape derivatives within the adjoint method, convergence is much faster, within a larger design space. This is an extremely efficient method for the design of complex electromagnetic components. ©2013 Optical Society of America OCIS codes: (130.3120) Integrated optics devices; (230.7370) Waveguides; (230.1360) Beam splitters. References and links 1. P. Sandborn, N. Quack, N. Hoghooghi, J. B. Chou, J. Ferrara, S. Gambini, B. Behroozpour, L. Zhu, B. Boser, C. Chang-Hasnain, and M. C. Wu, “Linear frequency chirp generation employing opto-electronic feedback loop and integrated Silicon photonics,” in CLEO, (2013) pp. 5–6. 2. A. Sakai, T. Fukazawa, and T. Baba, “Low loss ultra-small branches in a silicon photonic wire waveguide,” IEICE Trans. Electron. E85, 1033–1038 (2002). 3. Y. Zhang, S. Yang, A. E. J. Lim, G. Q. Lo, C. Galland, T. Baehr-Jones, and M. Hochberg, “A compact and low loss Y-junction for submicron silicon waveguide,” Opt. Express 21(1), 1310–1316 (2013). 4. P. Sanchis, P. Villalba, F. Cuesta, A. Håkansson, A. Griol, J. V. Galán, A. Brimont, and J. Martí, “Highly efficient crossing structure for silicon-on-insulator waveguides,” Opt. Lett. 34(18), 2760–2762 (2009). 5. Y. Zhang, S. Yang, E. J. Lim, G. Lo, T. Baehr-Jones, and M. Hochberg, “A CMOS-compatible, low-loss and low-crosstalk silicon waveguide crossing,” IEEE Photon. Technol. Lett. 25(5), 422–425 (2013). 6. T. Tanemura, K. C. Balram, D. S. Ly-Gagnon, P. Wahl, J. S. White, M. L. Brongersma, and D. A. B. Miller, “Multiple-wavelength focusing of surface plasmons with a nonperiodic nanoslit coupler,” Nano Lett. 11(7), 2693–2698 (2011). 7. M. P. Bendsøe and N. Kikuchi, “Generating optimal topologies in structural design using a homogenization method,” Comput. Meth. In Appl. M. 71, 197–224 (1988). 8. M. P. Bendsoe and O. Sigmund, Topology Optimization Theory, Methods and Applications. (Springer, 2003) 9. T. Borrvall and J. Petersson, “Topology optimization of fluids in Stokes flow,” Int. J. Numer. Methods Fluids 41(1), 77–107 (2003). 10. J. S. Jensen and O. Sigmund, “Topology optimization for nano-photonics,” Las. Photon. Rev. 5(2), 308–321 (2011). 11. P. Seliger, M. Mahvash, C. Wang, and A. F. J. Levi, “Optimization of aperiodic dielectric structures,” J. Appl. Phys. 100(3), 034310 (2006). 12. W. R. Frei, D. A. Tortorelli, and H. T. Johnson, “Geometry projection method for optimizing photonic nanostructures,” Opt. Lett. 32(1), 77–79 (2007). 13. V. Liu and S. Fan, “Compact bends for multi-mode photonic crystal waveguides with high transmission and suppressed modal crosstalk,” Opt. Express 21(7), 8069–8075 (2013). 14. G. Veronis, R. W. Dutton, and S. Fan, “Method for sensitivity analysis of photonic crystal devices,” Opt. Lett. 29(19), 2288–2290 (2004). 15. A. F. J. Levi and I. G. Rosen, “A novel formulation of the adjoint method in the optimal design of quantum electronic devices,” SIAM J. Contr. Optim. 48(5), 3191–3223 (2010). #194605 $15.00 USD Received 26 Jul 2013; revised 19 Aug 2013; accepted 20 Aug 2013; published 6 Sep 2013 (C) 2013 OSA 9 September 2013 | Vol. 21, No. 18 | DOI:10.1364/OE.21.021693 | OPTICS EXPRESS 21693 16. G. Strang, Computational Science and Engineering, (Wellesley-Cambridge, 2007). 17. O. D. Miller, PhD thesis (2012), University of California at Berkeley, http://optoelectronics.eecs.berkeley.edu/ThesisOwenMiller.pdf 18. S. G. Johnson, M. Ibanescu, M. A. Skorobogatiy, O. Weisberg, J. D. Joannopoulos, and Y. Fink, “Perturbation theory for Maxwell’s equations with shifting material boundaries,” Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 65(6), 066611 (2002). 19. Lumerical FDTD Solutions, www.lumerical.com 20. S. Osher and J. A. Sethian, “Fronts propagating with curvature-dependent speed: algorithms based on HamiltonJacobi formulations,” J. Comput. Phys. 79(1), 12–49 (1988). 21. http://optoelectronics.eecs.berkeley.edu/PhotonicInverseDesign/


Introduction and motivations
Silicon photonics offer the unique ability of managing light through sub-wavelength Silicon waveguides patterned on chip, enabling extremely tight integration of photonic components and conventional CMOS electronics.Consequently, functions that previously required many separate components may now all be performed on single chips, reducing their cost, energy consumption and size [1].
Nevertheless a number of challenges remain, one of them being the efficient management of light at these scales.Indeed, while straight Silicon waveguides can have extremely low loss and enable excellent transport of light throughout the chip, other functions (such as splitters, waveguide crossings, multimode interferometers) suffer from the presence of evanescent fields outside the waveguide and imperfect reflections at the Silicon/oxide interface.These induce scattering loss, which can be highly detrimental to the total system performance.For this reason, a significant effort in photonic device topology optimization has taken place recently.This has drastically reduced the losses in Y-splitters [2,3], crosstalk and insertion losses in waveguide crossings [3,4], along with other more exotic components [4] and is effectively enabling better photonic circuits.
Most of these optimizations are based on heuristic optimization methods such as genetic optimization [4], particle swarm optimization [3,5], or other hybrid methods tailored for specific problems [6].Heuristic optimization relies on a somewhat limited parameterization of the solution space and subsequent random testing of a large number of different parameter sets.Because of the high computational cost of solving Maxwell's equations, these optimization methods may only be applied to relatively simple geometries, as they require the testing a very large number of different solutions in order to find a satisfactory one.
While this is perfectly suitable for the simple problems mentioned above, these methods will fail to perform in a reasonable amount of time for more complex geometries and functions.It is therefore necessary to have a more efficient way of performing topology optimization for general purposes.In our shape optimization approach, shape derivatives play an important role.In this paper we present an adjoint method to calculate shape derivatives by wrapping an inverse algorithm around commercial Maxwell solvers.Such efficient gradient descent methods unlock the possibility to optimize particularly complex structures, which has not previously been possible.

Presentation of the adjoint method for electromagnetic problems
The adjoint method enables the computation of shape derivatives at all points in space, with only two electromagnetic simulations per iteration.It has been extensively used for shape optimization in mechanical engineering [7][8][9] but has seen more limited use for photonic components [10][11][12][13][14], and more recently quantum electronics [15].Mathematical derivations of the adjoint method are available in optimization textbooks [8,16], but we will limit ourselves to a very simple example that intuitively illustrates the mathematical procedure when it is used in the context of electromagnetism.In our example, we want to maximize the absolute value of the electric field at a given point 0 x , given a geometrical region Ω in which we can change the electric permittivity ε at (vectors are written in bold).The change in figure of merit for a small change of dielectric permittivity where 0 ( ) is the value of the electric field at a given point before any change and 0 ( ) represents the change in electric field when the small dielectric modification is performed.
Some algebraic manipulations are needed to arrive at the derivative.The change in field at 0 x can be written for a small enough volume perturbation V Δ : is the Maxwell Green's function relating the electric field at 0 x to the induced polarization density ind p at x in the infinitesimal volume V Δ .new E is the electric field given the new dielectric distribution.If the change r ε Δ is small enough, we may approximate ( ) ( ) be the small parameter for the derivative.A similar line of reasoning results in almost the same final equation, albeit taking care to distinguish which components of E and D are continuous across the boundary [14,17,18].We can then rewrite (2) as Using the reciprocity of the Green's function 0 0 ( , ) ( , ) The mathematical method can be understood from the new adjoint electric field: which is the electrical field induced at x from an electric dipole at 0 x driven with amplitude 0 0 ( ) , as illustrated in Fig. 1.Thus, the gradient of the Figure-of-Merit can be obtained with only a single simulation, even though it provides the derivative with respect to permittivity at every point in the computational region Ω .The term 0 ( ) old x E is readily available from the original forward simulation.
Therefore with just one forward simulation (which is needed to calculate the FoM in all optimization schemes) plus one adjoint simulation, the shape derivative can be obtained over the entire design region, for arbitrarily many degrees of freedom.With the gradient of the Figure-of-Merit calculated, changes in the geometry can be introduced proportional to the gradient, known as the gradient descent method.Applied iteratively, this can then lead to an optimum.For a more detailed and general study of the adjoint method and more complex Figures-of-Merit we refer to [17].
The adjoint method is also extremely attractive since the overall iterative scheme can be wrapped around a commercial forward solver, such as the one used in [19].

Y-Splitter optimization example using the level set method for shape representation
A Y-splitter for λ = 1550nm vacuum wavelength light was optimized by the adjoint method to compare with state of the art Silicon photonic components optimized up to date [3].The material system (Silicon waveguide, Silicon dioxide cladding) and the constraints of small overall dimensions and minimum feature size were kept the same as in [3].For the minimum feature size a minimum radius of curvature of 200nm was imposed.The waveguide is 220nm thick, the most common choice for Silicon photonics.The two waveguide branches and their junction at the end of the splitter were left to be the same as in [3], although they also could have easily been optimized.The design region was the central 2µm × 2µm domain.
The method used in [3] is particle swarm optimization, which consists of calculating the Figure-of-Merit for a large population of randomly generated solutions and having the population evolve at every iteration using the information collected in the previous tests, until a satisfying solution is reached.By contrast, the adjoint method provides shape derivatives over the entire design region.The level set method, developed by Sethian and Osher [20], was chosen to represent the geometry.This enables a more flexible representation of a larger design space than, for example, spline interpolations used in [3][4][5].Level sets are particularly usable within an adjoint approach, since a very large number of shape derivatives are inside the Level Set, compared to the feasible number of variables in stochastic optimization.Note also that level set methods impose two-phase, binary materials throughout the optimization, compatible with practical engineering, but in contrast with [10], which optimizes a continuously variable permittivity.
The figure-of-merit that we employed was transmission into the fundamental mode of the bent output waveguides, which can be obtained from Poynting vectors: ( ) where m E and m H are the field profiles of the fundamental mode at the surface S , while E and H are the actual fields from the direct simulation at that surface.Thus Eq. ( 7) is the power transmission, corrected for the mode overlap.
Adapting the adjoint Eq. ( 6) to the new figure of merit ( 7) (and employing an additional magnetic Green's function, EM G , and magnetic symmetries [17]), we arrive at the adjoint field: where ( , ') EM x x G is the electromagnetic Green's function expressing the electric field at x due to a magnetic dipole at ' x .The adjoint simulation described Eq. ( 8) consists of sending the desired mode backwards into the splitter.This is analogous to Eq. ( 6), where the adjoint source was located at the measurement point of the Figure-of-Merit.This source problem can be solved with a standard Maxwell solver.FDTD is perfectly suited for this propagating wave problem.Also analogous to Eq (6) the phase of the adjoint source is set using old E and old H , from the forward simulation, as described in Eq (9).Once the adjoint simulation is performed, the derivative of the Figure-of-Merit with respect to dielectric permittivity at every point in the design region is calculated by combining the forward and adjoint simulations results into Eq.( 5).FDTD is perfectly suited to solve the direct and adjoint problem, which consists of propagating waves in a dielectric.Fig. 3. Coupling efficiency evolution during the optimization.The switch from 2d to 3d FDTD is visible at iteration 41.For comparison, the previous record of ref [3].was −0.13dB and required 1500 simulations.
This derivative is then used to modify the geometry of the splitter.Since we employed a level set description of geometry, the derivative is used as a velocity field to modify the level set shape.This has the effect of pushing out the geometry boundary when the derivative is positive and pushing it in when it is negative.Since the refractive index if Silicon is higher than that of Silicon dioxide this implements the imperative of the derivative at every point: The Figure-of-Merit benefits from an increase in the dielectric permittivity where the derivative is positive and vice-versa.The step-size criterion for each iteration is a fixed area of changing type in 2d, and a fixed volume in 3d.
The device was first optimized using 2d finite difference time domain (FDTD) simulations of a structure extruded infinitely in the 3rd dimension.In 2d, the effective index method is used and the Silicon is assigned the fictitious refractive index = 2.8, which mimics the proper in-plane wavevector of the correct 3d mode.Once iterative progress stopped in 2d (41 iterations), the problem was transferred to 3d for more iterations.Naturally the first 3d iteration is not as good as the optimized 2d device, since the effective index method is only an approximation.The optimal structure was computed within 51 iterations (102 simulations), achieving a record low insertion loss −0.07dB.By comparison, ref [3].achieved a minimal insertion loss −0.13dB, after 1500 simulations using particle swarm optimization.(Note that for such small attenuation, the simulation results are very sensitive to the simulation parameters, which may not have been perfectly identical to ref [3].).
Thus adjoint steepest descent, with much lower computational cost, can yield as good or better results than particle swarm optimizations, which take no advantage of the underlying Maxwell equation physics.Fig. 4. Geometry evolution during the optimization process and total coupling efficiency to the output waveguides.Iter indicates the iteration number, and the insertion loss is given in dB.The optimization is first carried out using a 2d approximation with an effective waveguide index = 2.8, which mimics the 3d in-plane propagation constant.The final iterative steps are carried out in full 3d FDTD.
The figure of merit evolution, as well as intermediate optimization steps, is presented in Figs. 3. and 4 respectively.There is a visible change between the 2d solution and the 3d solution, with a non-negligible efficiency improvement.This 3d improvement was only possible with the adjoint method, as the 3d computational cost limits the multiple simulations in particle swarm methods.The electric field intensity distribution of the final iteration is shown in Fig. 5.The large operating bandwidth of the optimized structure is shown in Fig. 6. and is good indication of the robustness of the design generated by the optimization.

Conclusion
As photonic and wireless components become an increasingly important part of electronics, it is evident that many problems will require electromagnetic optimization.The computational cost of solving Maxwell's equations is significant, and inefficient design optimization algorithms will become unacceptable.We have shown that the adjoint gradient decent method for shape optimization of sub-wavelength photonic devices can be readily implemented by embedding commercial Maxwell solvers within an inverse optimization algorithm.
For exploration of larger solution spaces where local optima may exist, this method may be augmented with a clever choice of Figure-of-Merit, as well as global optimization routines such as simulated annealing to provide efficient and powerful automated design of photonic components.
Adjoint-gradient-steepest-descent has already beaten the previous record for a manufacturable splitter within current Silicon photonics technology, at much less computational cost than previous methods.This opens the pathway to a more systematic, efficient, photonic component design optimization.The code used for this optimization is available at [21].

Fig. 1 .
Fig. 1.Adjoint method schematic: two simulations are needed for every iteration; the direct and the adjoint simulation.Sources for each simulation are drawn in red.

Fig. 2 .
Fig. 2. Top view of the optimized silicon splitter geometry obtained after 51 iterations of the Steepest Descent algorithm.Only the designable region geometry was allowed to change.The Silicon waveguide is 220nm thick, and the cladding is Silicon dioxide.

Fig. 5 .
Fig. 5. Simulated field intensity |E| 2 for the optimized structure at λ = 1550nm for a slice in the middle of the device.

Fig. 6 .
Fig. 6.Simulated insertion loss of the optimized device for wavelengths between 1.5 and 1.6 µm.The broad operating spectrum of the device is a good indicator of the robustness of the design.