Fourier modal method for inverse design of metasurface-enhanced micro-LEDs

We present a simulation capability for micro-scale light-emitting diodes (uLEDs) that achieves comparable accuracy to CPU-based finite-difference time-domain simulation but is more than 10^7 times faster. Our approach is based on the Fourier modal method (FMM) -- which, as we demonstrate, is well suited to modeling thousands of incoherent sources -- with extensions that allow rapid convergence for uLED structures that are challenging to model with standard approaches. The speed of our method makes the inverse design of uLEDs tractable, which we demonstrate by designing a metasurface-enhanced uLED that doubles the light extraction efficiency of an unoptimized device.


Introduction
Micro-scale light-emitting diodes (LEDs) with lateral dimensions near 1 m are of interest for a variety of applications including augmented reality displays [1,2].In this context, high light extraction efficiency (LEE) is essential due to the requirement for high brightness (e.g. for outdoor usage) and the need to operate for extended periods on battery power.It would be desirable to apply methods such as inverse design to obtain LED designs with higher performance; inverse design is a powerful technique that automatically discovers the topology and shapes of designs that minimize some objective function [3][4][5], and would be suitable to the creation of metasurfaces that enhance LED LEE.However, light generated in the LED is spatially incoherent; spatially incoherent sources (arising in e.g.spontaneous emission [6][7][8] and thermal emission [9,10]) are extremely expensive to model by standard approaches (requiring many independent simulations), making inverse design of structures including incoherent sources computationally intractable.Consequently, previous works on modeling and optimization of LEDs have focused on symmetric or low-dimensional cases, or used a widely-spaced dipoles to approximate a planar active region [11,12], leaving the challenge of general 3D LED inverse design unaddressed.
In this work, we present a new simulation capability based on the Fourier modal method (FMM) which overcomes these limitations.For the LED, our method gives results that are in excellent agreement with finite-difference time-domain (FDTD) simulations while being more than 10 7 times faster (Sec.4).This makes it practical for use in an inverse design setting, which we demonstrate by designing metasurface-enhanced LEDs that significantly improve LEE (Sec.6).The speed of our method also enables calculation of quantities such as high-resolution spatial maps of LEE, allowing new insights into the physics of LED devices.
Related to our work, novel factorization methods for multi-channel inverse design [13,14] and a trace formulation of photonic inverse design [15] have recently been introduced, which have lower computational cost compared to conventional inverse design formulations.Our work shows that-for problems such as the LED-with an appropriate choice of the simulation algorithm, the standard formulation may be sufficient.In particular, the FMM has several characteristics that enable efficient modeling of many incoherent sources.The FMM treats fields in periodic stratified media in a truncated Fourier basis, which has the potential to represent fields accurately with relatively few terms.As in other methods, with the FMM one solves a linear system to obtain the fields generated by a source; this system has dimension given by the size of the basis, which for FMM is related only to the details of the structure in two dimensions, as the third dimension is handled analytically.The resulting linear system can be relatively small, enabling direct solution methods to be employed.Finally, while constructing the linear system involves eigendecompositions and is expensive, the results of these operations can often be reused e.g. in an optimization setting where many layer profiles may be unchanged between iterations.
Despite these potential advantages, there are challenges that have discouraged the use of the FMM for problems such as the LED.For example, the original and common FMM formulation exhibits poor convergence in structures containing metals [16].In addition, FMM naturally treats periodic structures and sources; while LEDs are generally formed in periodic arrays, treatment of a source as periodic gives rise to nonphysical interference effects [17].We address these issues with a vector formulation of the FMM-including an improved method for automatically computing vector fields-and Brillouin zone integration [17][18][19], which dramatically improve convergence (Sec.5) and allow modeling of localized dipoles in a periodic LED array.
Our implementation of the FMM is FMMAX, which is based on Jax [20] and joins several recent codes that support automatic differentiation [21][22][23][24], enabling the gradient calculation needed for inverse design.Our code is distinguished by a flexible low-level programming interface which admits uncommon use cases such as the LED simulations carried out here, and is freely available at github.com/facebookresearch/fmmax.

Method
Throughout this section, we discuss extensions of the basic FMM needed to enable LED simulation and inverse design.For a full description of the FMM, we refer the reader to [25] and [26], which our implementation closely follows.

Vector FMM formulations
It has long been recognized that the original FMM formulation (referred to as FFT, as in [25]) converges well only when electric fields are tangent to material interfaces [16]; this was improved by vector FMM formulations, which introduce a local coordinate system with unit vectors that are tangent and normal to the boundaries of features.Vector formulations allow orientation-dependent Fourier factorizations of permittivity to be used for tangent and normal components of the electric displacement field, which is the essential change of the vector FMM methods from the original formulation [27].In an inverse design setting, the vector fields must be automatically generated from the evolving geometry.Various schemes have been developed based on interpolation [28] and functional minimization [25]; we adopt the latter due to its applicability to continuously varying permittivity, as is encountered during the course of optimization.
As in [25], the vector field t =   ,   ⊺ for a permittivity array  is obtained by minimizing a loss function L. High-quality vector fields should be smoothly varying and normal to the permittivity gradient, i.e. tangent to the interfaces in the structure.This is achieved using where the sum is over all the elements in t.The first term is minimized when t is correctly oriented, while the remaining terms are minimized when t varies slowly.The resulting optimization problem, can be solved by standard gradient-based methods.The choice of initial t ★ is consequential.In a novel method termed Jones direct, we compute an initial real-valued vector field (even for complex ) and convert to a complex Jones field using the method of [29].The final complex-valued t is obtained by directly optimizing this Jones field, and advantageously lacks both discontinuities and zeros.
A real-valued t can be obtained by avoiding the Jones conversion and directly optimizing a real t ★ ; this will lack discontinuities but include zeros, and is equivalent to the Pol method of [25].The field can be normalized to have magnitude 1 everywhere (the Normal method of [25]) or it can be converted to a Jones field as a post-processing step (the Jones method of [25]).In general, the Normal method will yield fields having discontinuities, while the Jones method will yield fields that lack discontinuities and zeros, but vary more rapidly than those obtained by the Jones direct method.Thus, the Jones direct scheme has favorable characteristics for generation of a local coordinate system represented in a Fourier basis.
Example vector fields for these methods are in Appendix A. Consistent with the theoretical benefits discussed above, we observe best convergence for the FMM with the Jones direct formulation (Appendix B) and use the method throughout this work.

Brillouin zone integration for aperiodic sources
Sources in periodic structures can be modeled in the FMM by expanding the source spatial profile in terms of the layer eigenmodes [26].However, in this approach both the sources and the structure are periodic, which is not the case in a real LED and can give rise to unphysical interference effects.
In principle, larger supercells including multiple LEDs could be modeled, but due to the unfavorable scaling of the FMM algorithm this may be prohibitive.Absorbing boundary conditions could also be introduced, as used previously in extensions of the FMM to accommodate aperiodic structures [30,31].A further alternative-the approach adopted here-is the Brillouin zone (BZ) integration method of [17][18][19], in which the electric field from an isolated source in a periodic structure is found by, where    is the field with Bloch-periodic boundary conditions for wavevector   , and   is the BZ area.The calculation of each    requires a simulation of a single unit cell, and thus the problem of modeling a large supercell is transformed into one of several smaller simulations.Throughout this work, we approximate integrals over the Brillouin zone by averaging over a regular   grid.When the grid has shape  × , we refer to the case as  ×  BZ integration; when no BZ integration is performed, we refer to this as the periodic dipole approximation (PDA).

Calculation of light extraction efficiency
For the LED modeled here, we are primarily concerned with the total LEE and the power radiated by the dipole.For a given wavelength, the LEE is obtained by, Here,    , ,  and     , ,  are the power emitted by a dipole, and power emitted by the dipole which is extracted from the LED.The powers are measured with planar monitors, obtained by summing over the powers in all the Fourier orders as described in [25], and are indexed by , , and   , corresponding to the dipole position, dipole orientation, and BZ point, respectively.We use a dipole-orientation-dependent weight   , with values of 1 for the -and -oriented dipoles and 0.1 for the -oriented dipoles.These generally have lower extraction efficiency and can be suppressed in material systems such as AlInGaP by appropriately straining the quantum wells [32].

𝜇LED structure and optimization problem
We now describe the LED geometry and corresponding optimization problem used throughout the rest of our analysis.We consider an array of simplified cylindrical LEDs with a 1.4 m pitch, which is illustrated in Fig. 1.The semiconductor region has a diameter of 1 m, the conducting oxide has a diameter of 0.8 m, and the passivation sidewall has a thickness of 0.1 m.Other layer thicknesses are optimizable parameters specified elsewhere.A spatial resolution of 10 nm is used for all layers.Refractive indices of the materials are as follows: semiconductor, 3.0; conducting oxide, 1.9 + 0.005; passivation, 1.5; and metal, 0.2 + 3.3.Values are chosen to be representative of AlInGaP, indium-tin oxide, silicon oxide, and gold, respectively.A high index semiconductor ( = 3.0) cylinder is embedded within a metallic substrate ( = 0.2 + 3.3) and separated by a thin passivation layer ( = 1.5) and transparent conducting oxide ( = 1.9 + 0.005).The metasurface "design region" of the LED unit cell resides on top of the semiconductor cavity and emits into air.The material of the metasurface is composed of a refractive index that is linearly interpolated between the index of the semiconductor and air.The metasurface itself is parameterized by a 2D grid of "pixels" [33] in / and projected into the -direction using a thickness chosen by the optimizer.Dipoles within the LED are located in a planar source region embedded within the semiconductor layer, which is free to move as the optimizer dictates.
We model dipoles in a plane within the semiconductor material, and consider -, -, and -oriented dipoles on a 20 nm grid (5847 total dipoles).The spatial profile of the dipoles is Gaussian with a 60 nm full-width at half-maximum.Power emitted by each dipole is measured using planar monitors positioned 50 nm from the dipole plane.We model a single plane, corresponding e.g. to a single quantum well active region, but note that it would be possible to model multi-quantum-well active region with no additional eigendecompositions, and therefore relatively cheaply.
We turn now to the LED optimization problem.In this paper, we exercise unconstrained optimization, as our primary aim is to demonstrate the functioning of the simulator and inverse design pipeline, rather than to obtain designs suitable for manufacture, or to form estimates of the potential performance of metasurface-enhanced LEDs.We use a monochromatic objective L ( ) = 1 −   ( ) for a 630 nm emission wavelength.Here,  represents the optimizable parameters, including layer thicknesses and the metasurface density .The density is a twodimensional array with values in the range [0, 1] from which the permittivity at each point is computed by interpolation between the permittivity of air and of semiconductor using the method of [34].
Although one could directly optimize the elements of the density array , this can produce xz slice (y=0) x-dipole at (0, 0) / = 630 nm FMM xz slice (y=0) FDTD xz slice (y=0) x-dipole at (0, 0. designs with very fine features that would not be practical to manufacture.Therefore, we choose to instead optimize a latent density ρ also in the range [0, 1], from which  is obtained by, where  is a Gaussian kernel with a full-width at half-maximum of 40 nm and  = 2.This expression is similar to the transforms commonly used in density-based topology optimization [35] and help ensure that  lacks features which are too small by encouraging an implicit lengthscale.Futhermore, this approach helps regularlize the optimization problem, which is solved using the L-BFGS-B algorithm [36].

Field calculation and validation
To establish the validity of our method, we first make a direct comparison of the steady-state fields computed by Meep [37], a mature FDTD solver, to those computed by FMM.We consider a simple LED lacking a metasurface, with semiconductor and conducting oxide thicknesses of 1.0 m and 0.1 m, respectively.Dipoles emitting at 630 nm are centered vertically in the semiconductor and positioned (with units of m) at (, ) = (0, 0) and (0, 0.2), both oriented in the -direction.
Figure 2 shows cross sections of the electric field magnitude computed by the two methods.The field structure for both dipole locations computed by FMM and FDTD are in excellent agreement; in particular, the nodal lines found by the two methods are nearly indistinguishable.
Our FMM calculation uses the PDA with the Jones direct formulation and  = 2000.
The FDTD calculation uses a spatial resolution of 200 voxels / m, periodic boundary conditions in the -and -directions, and perfectly electric conductor (PEC) boundary conditions in the z-direction.To mitigate backreflections from the upper z-boundary, we used a 2 m thick, adiabatic, conductive absorber layer above the LED.Using a nonlinear optimization algorithm [38], we fit a Lorentz-drude material model to the desired complex refractive index values at  = 630 nm to ensure simulation stability [39].The FDTD simulation using a broadband Gaussian pulse centered at 630 nm and terminated the simulation after the fields were sufficiently decayed.Just like the FMM solver, the spatial profile of the FDTD source was a normalized Gaussian with a FWHM of 60 nm.The steady-state fields for the 630 nm wavelength were recorded using rolling discrete-time Fourier transform monitors at the specified cross sections.

Solver convergence and performance
Next, we study the convergence and performance of the FMM in the context of our LED simulations and compare the results to FDTD.We note that the numerical accuracy of each algorithm is fundamentally dictated by the underlying discretization resolution; for FDTD, this corresponds to the actual spatial resolution of the grid, or the total number of simulation voxels, whereas with FMM this refers to the spatial-frequency resolution, or the total number of Fourier orders.To quantify the relative convergence characteristics of both algorithms, we simulated the LED structure and dipole configuration of Sec. 4 with multiple resolutions.
First, we examine the importance of the FMM formulation itself.Fig. 3 shows the LEE for the x-oriented dipoles at (, ) = (0, 0) and (0, 0.2) as a function of the Fourier orders  and for the Jones direct and FFT formulations.For the Jones direct formulation, the LEE for both dipoles quickly converges; the values with  = 4800 are 36.7%and 30.4% respectively, but with only  = 400 the values are 37.4% and 29.4%, in excellent agreement with the converged results.By contrast, the FFT formulation exhibits very poor convergence; as  increases the values trend in the direction of the Jones direct results, but they exhibit substantial fluctuation and remain far from the converged result even with  = 4800.
We now directly compare the convergence of FMM and FDTD.Fig. 3 also shows the convergence of LEE for both dipole positions calculated by FDTD with varying spatial resolution.With a resolution of 200 voxels / m (Δx=5 nm), we obtain values of 39.2% and 31.7%, in good agreement with the converged Jones direct result.Such a high spatial resolution for FDTD is expected due to the high-contrast metals within the LED model, although the application of subpixel-smoothing algorithms or other conformal meshing approaches compatible with dispersive media should accelerate the convergence [40].In combination with the field profiles in Fig. 2, this shows that the vector FMM can be used for modeling and LEE calculation of LEDs, with results comparable to high-resolution FDTD.
Despite the similarity in results between FMM and FDTD, there are major differences in the performance and computational burden of the two algorithms.Fig. 4 shows the elapsed time for the FMM and FDTD simulations in Fig. 3 as a function of Fourier order and resolution respectively.All calculations were performed using 10 cores of a 2.20GHz Intel ® Xeon ® CPU E5-2698 v4 with 512 GB of RAM and two NUMA nodes.Although FMMAX supports GPU acceleration, this was not used to allow a direct comparison.The FMM results are shown for the case where three incoherent dipoles are modeled, and where all ∼6000 dipoles are modeled; the results show the negligible incremental cost of simulating additional dipoles.
The FDTD elapsed times are shown for centered and offset dipoles; the centered-dipole simulation used odd -symmetry and even -symmetry to reduce the computational burden, of Fourier order and spatial resolution respectively.Results are given for both the centered and offset -oriented dipole.While the Jones direct FMM formulation of FMM converges for  ≥ 400, the FFT formulation fails to converge; the calculated efficiency trends in the direction of the Jones direct result, but does not reach agreement even for  = 4800.FDTD requires a resolution of 200 voxels / m (Δx=5 nm) to reach suitable convergence, largely due to the high-contrast metals within the LED model.while the offset dipole considered the full simulation volume and is representative of the general case.Considering this offset dipole, there is approximately a 3 × 10 3 speedup when comparing the 200 voxels / m FDTD results (the coarsest resolution that yields converged results) to the  = 800 Fourier-term results (the lowest  we used in the inverse design setting).
Practically speaking, the speedup is even more significant; if FDTD simulations were carried out for the ∼6000 dipoles considered in this work, we project that CPU time would be ∼2 × 10 7 times greater than for the equivalent calculation by FMM.If simulations considering multiple wavelengths were carried out, the advantage would be reduced; FDTD automatically handles multiple wavelengths within a single simulation, while with the FMM each wavelength requires an independent simulation.However, for the LED we have found that  (10) wavelengths are sufficient, and in any case these simulations can easily be distributed across many machines (the task is embarrassingly parallel).This makes FMM a strong choice for problems such as the LED.
We note a few other important observations regarding the convergence trends of each method.For low , for example, the vector field calculation contributes significantly to the total time of the FMM method, causing the non-polynomial scaling seen in Fig. 4. The FDTD elapsed times also exhibit some deviation from polynomial behavior and a crossover at 100 / m resolution.Meep requires the user to properly choose a hardware configuration (e.g. the proper number of processes to launch) to maximize the timestepping rate [41].Choosing these parameters correctly is highly dependent on the underlying simulation resolution, and most likely accounts for the aforementioned crossover point.The LED structure may also be more or less resonant for dipoles at different positions in a manner that depends on resolution; for the FDTD algorithm, this can also cause simulation times to vary.
While not clear from the data in Fig. 4, the FMM solver also offers advantages when used within an optimization pipeline where many solves are performed sequentially.Specifically, the eigendecomposition for layers whose cross-sections do not change can be cached and reused.With Jax, this is optimization is easily done by jit-ing (just-in-time compiling) the simulation and indicate an approximate ∼3 × 10 3 times speedup when comparing the 200 voxels / m FDTD simulation to the  = 800 Fourier term FMM simulation (resolutions where both algorithms are converged).Notably, the FDTD results correspond to a single dipole simulation.The expected computational cost for additional dipoles scales linearly with the number of dipoles.This is in direct contrast with the FMM method shown here, where the marginal cost of an additional ∼6000 dipoles is negligible.Therefore, the projected speedup for a full LED simulation is ∼2 × 10 7 .x-dipole at (0, 0)

PDA 2x2 3x3
x-dipole at (0, 0.2) all dipoles Fig. 5. Light extraction efficiency for various BZ integration schemes computed with the Jones direct FMM formulation and  = 2000.Values are for individual -oriented dipoles and a full calculation including all -, -, and -oriented dipoles in the basic LED.Each "scheme" is defined the regular grid of wavevectors in the first BZ used to approximate a BZ integral, such that 3 × 3 corresponds to nine evenly distributed points (three samples along   and three samples along   ).Importantly, for the LED problem, the LEE appears to converge with a relatively small number of points.
specifying that eigendecompositions be carried out at compile time.For the LED problem, the savings are substantial-while eigendecomposition for three patterned layers is needed at the first step, at subsequent steps only one eigendecomposition (for the metasurface) is needed.For the  = 800 case on our Xeon ® system, this results in a step time of 49 seconds for the second and subsequent steps, compared to 175 seconds for the first step.Finally, to study the effect of BZ integration, we calculated LEE using the PDA and BZ integration schemes up to 6 × 6, using the Jones direct formulation and  = 2000.The results are in Fig. 5; values are shown for the two -oriented dipoles and the total LEE considering all ∼6000 dipoles in the active region.The quality of the PDA apparently depends on the dipole position: for the dipole at (0, 0), going to 2 × 2 BZ integration and beyond only slightly increases  the computed LEE value.By contrast, the dipole at (0, 0.2) has a dramatic increase in the LEE when BZ integration is used, and a sizeable increase is also seen in the all-dipole case.This indicates that avoiding the PDA is critical in the calculation of the LEE for a LED.Motivated by these results, for the remainder of this work we make use of two simulation configurations.In the inverse design setting, to balance accuracy against speed, we use the Jones direct method with  = 800 and 2×2 BZ integration.For validation, we use  = 2000, also with Jones direct and 2×2 BZ integration.

Optimization results
To exercise our inverse design pipeline and examine the potential of metasurface-enhanced LEDs, we consider three cases: the first is the initial, unoptimized LED with structure discussed earlier, i.e. with no metasurface, conducting oxide thickness of 0.1 m, semiconductor thickness of 1 m, and dipoles are located 0.5 m above the oxide.In the second, the film thicknesses are optimized, using the given values as the initial solution.In the third case, both the film thicknesses and the metasurface density are optimizable.Here, the initial semiconductor thickness is reduced to 0.95 m and the initial metasurface thickness is 0.1 m, resulting in a structure whose optical thickness approximately matches that of the unoptimized LED.The latent density is uniformly initialized with a value of 0.5.
The thickness-optimized LED optimization proceeded for 17 iterations before the optimizer converged, reaching an LEE of 40.1% at 630 nm, compared to 28.4% for the initial LED.The optimized film thicknesses are given in Table 1.Changes to thicknesses from the initial values are relatively small, consistent with a loss landscape having many local minima, as frequently encountered in thin film optimization problems.The LED with optimized film thicknesses and metasurface proceeded for 69 iterations, ultimately reaching an LEE of 56.2%.The film thicknesses are given in Table 1, and the metasurface density is shown in Fig. 6.
To understand the mechanism for LEE improvement in the LEDs obtained by optimization, we computed high-resolution maps of the LEE and emitted power for the three LED structures.These are shown in Fig. 7.As noted, our calculation considers dipoles at ∼2000 locations in a plane; each pixel in a map corresponds to a single physical dipole location, with value found by appropriately combining those for the three dipole orientations at the location.It is evident that the large increase in LEE for the optimized LEDs is not simply due to a uniform increase in the LEE, but rather a nonuniform increase together with an enhancement in the dipole power in regions where the LEE is high.We can also observe sharp features in the LEE and emitted power.This suggests that simulations with few dipoles on a coarse grid may yield poor estimates of the total LEE.Finally, Fig. 8 shows the wavelength-dependent LEE for the three cases, and illustrates the strong enhancement of LEE at the 630 nm target wavelength.Away from the target, both optimized LEDs exhibit a significant drop in LEE, reaching values comparable to the unoptimized LED with less than 5 nm of wavelength shift.This suggests that a broadband objective will be required in order to obtain LED designs which do not have strong wavelength dependence to LEE.
We emphasize that the LEDs discussed above are merely examples of designs that can be found through optimization, and we have seen many other designs with different metasurface densities and associated maps of LEE and emitted power.In practice, an objective should be crafted which is minimized for devices having the actual characteristics needed for LED applications.

Conclusion
We have shown that the FMM can be used to model LED structures with accuracy comparable to high-resolution FDTD, while achieving a speedup greater than 10 7 compared to CPU-based FDTD in situations where a large number of incoherent dipoles is considered.This result is enabled by the inherent characteristics of the FMM, vector FMM formulations that dramatically improve convergence compared to the basic FMM scheme, and BZ integration to model localized sources.These features are available in FMMAX, a Jax-based implementation of the FMM which we are open-sourcing alongside this work.
We used FMMAX to optimize several LED designs, and showed that a metasurface-enhanced LED can increase LEE by 98% over an unoptimized design, and by 38% over a LED with optimized film thicknesses but no metasurface.This result indicates that metasurfaces are a promising technology to improve LED performance as required for applications such as AR displays.Next steps could include consideration of LED structures that are more realistic, use of optimization objectives that better target the actual requirements of LEDs in AR applications (e.g. which maximize directional emission), the incorporation of manufacturability constraints in the inverse design scheme, and alternate initialization for LED film thicknesses to obtain better local optima.It would also be interesting to study the new Jones direct FMM formulation across a wider of structures.
Finally, our method could be applied to a range of problems where spatially incoherent emission is considered, to problems where localized sources interact with a periodic structure, or to problems where FMM is more typically employed, such as modeling of gratings or photonic crystals.We then evaluated each design with various  and the three formulations.Overfitting to  = 800 is seen in all cases, but the results converge as  increases.Jones direct exhibits the best convergence of the three formulations.
this may not be expected.Further, in an inverse design setting one may encounter "overfitting" where an optimized design performs best when evaluated with simulation settings (i.e.number of Fourier orders, FMM formulation, and BZ integration scheme) used during the optimization procedure.This makes it challenging to identify a preferred formulation.Thus, to select the formulation best suited for LED design, we solved the inverse design problem from Sec. 3 with the Pol, Jones, and Jones direct methods (with  = 800), obtaining three separate designs.We then evaluated the total LEE of each design using the three formulations and Fourier orders up to  = 4800.The results are shown in Fig. 10.
The designs achieve similar performance and all show some degree of overfitting, where the simulation settings used for optimization yield a LEE higher than the converged result.However, increasing from  = 800 to  = 2000 (our validation setting, discussed in Sec. 5) is sufficient to obtain converged results.Of the three formulations, Pol converges least well, and exhibits some oscillation in the total LEE with an amplitude of ∼5%.Jones and Jones direct are quite similar, with Jones direct being slightly better on all three designs and avoiding the overfitting on the Pol-optimized design.The results support the selection of Jones direct and  = 2000 for validation, although a more thorough study of convergence for a wider range of structures would be valuable.
Disclosures.The authors declare no conflicts of interest.

Fig. 1 .
Fig.1.Side and top view of the LED unit cell analyzed throughout this work.A high index semiconductor ( = 3.0) cylinder is embedded within a metallic substrate ( = 0.2 + 3.3) and separated by a thin passivation layer ( = 1.5) and transparent conducting oxide ( = 1.9 + 0.005).The metasurface "design region" of the LED unit cell resides on top of the semiconductor cavity and emits into air.The material of the metasurface is composed of a refractive index that is linearly interpolated between the index of the semiconductor and air.The metasurface itself is parameterized by a 2D grid of "pixels"[33] in / and projected into the -direction using a thickness chosen by the optimizer.Dipoles within the LED are located in a planar source region embedded within the semiconductor layer, which is free to move as the optimizer dictates.

Fig. 2 .
Fig. 2. Cross sections of the steady-state electric field magnitude for dipoles at (, ) = (0, 0) and (0, 0.2) computed by FMM and FDTD.The  and  are taken through the center of the LED, while the  section is taken at the semiconductor/air interface.The methods show strong agreement at all slices and for both dipole positions, as demonstrated by the consistent representation of the complicated modal patterns.

Fig. 3 .
Fig.3.Convergence of the light extraction efficiency for FMM and FDTD as a function of Fourier order and spatial resolution respectively.Results are given for both the centered and offset -oriented dipole.While the Jones direct FMM formulation of FMM converges for  ≥ 400, the FFT formulation fails to converge; the calculated efficiency trends in the direction of the Jones direct result, but does not reach agreement even for  = 4800.FDTD requires a resolution of 200 voxels / m (Δx=5 nm) to reach suitable convergence, largely due to the high-contrast metals within the LED model.

Fig. 4 .
Fig.4.Comparison of elapsed time for FMM and FDTD LEE calculations.The results indicate an approximate ∼3 × 10 3 times speedup when comparing the 200 voxels / m FDTD simulation to the  = 800 Fourier term FMM simulation (resolutions where both algorithms are converged).Notably, the FDTD results correspond to a single dipole simulation.The expected computational cost for additional dipoles scales linearly with the number of dipoles.This is in direct contrast with the FMM method shown here, where the marginal cost of an additional ∼6000 dipoles is negligible.Therefore, the projected speedup for a full LED simulation is ∼2 × 10 7 .

Fig. 7 .Fig. 8 .
Fig. 7. Spatial maps of light extraction efficiency and emitted power for the initial, thickness-optimized, and metasurface-enhanced LED.Values are for 630 nm wavelength, averaged over dipole orientations, and for =2000 with 2×2 BZ integration.

Fig. 10 .
Fig.10.Light extraction efficiency as a function of Fourier orders  for three separate metasurface-enhanced LED designs.Each design was obtained by solving the inverse design problem of Sec. 3 using the Pol, Jones, and Jones direct formulations during the course of optimization.We then evaluated each design with various  and the three formulations.Overfitting to  = 800 is seen in all cases, but the results converge as  increases.Jones direct exhibits the best convergence of the three formulations.

Table 1 .
Thicknesses for the unoptimized and optimized LEDs.