Fast Eigensolver for plasmonic metasurfaces

Finding the wavevectors (eigenvalues) and wavefronts (eigenvectors) in nanostructured metasurfaces is cast as a problem of finding the complex roots of a non-linear equation. A new algorithm is introduced for solving this problem; example eigenvalues are obtained and compared against the results from a popular, yet much more computationally expensive method built on a matrix eigenvalue problem. In contrast to the conventional solvers, the proposed method always returns a set of ‘exact’ individual eigenvalues. First, by using the Lehmer-Schur algorithm, we isolate individual complex roots from each other, then use a zero-polishing method applied at the very final stage of ultimate eigenvalue localization. Exceptional computational performance, scalability, and accuracy are demonstrated. ©2014 Optical Society of America OCIS codes: (160.3918) Metamaterials; (050.1755) Computational electromagnetics methods; (050.6624) Subwavelength structures. References and links 1. N. Yu, P. Genevet, M. A. Kats, F. Aieta, J.-P. Tetienne, F. Capasso, and Z. Gaburro, “Light propagation with phase discontinuities: generalized laws of reflection and refraction,” Science 334(6054), 333–337 (2011). 2. M. A. Kats, P. Genevet, G. Aoust, N. Yu, R. Blanchard, F. Aieta, Z. Gaburro, and F. Capasso, “Giant birefringence in optical antenna arrays with widely tailorable optical anisotropy,” Proc. Natl. Acad. Sci. U.S.A. 109(31), 12364–12368 (2012). 3. F. Aieta, P. Genevet, N. Yu, M. A. Kats, Z. Gaburro, and F. Capasso, “Out-of-plane reflection and refraction of light by anisotropic optical antenna metasurfaces with phase discontinuities,” Nano Lett. 12(3), 1702–1706 (2012). 4. P. Genevet, N. Yu, F. Aieta, J. Lin, M. A. Kats, R. Blanchard, M. O. Scully, Z. Gaburro, and F. Capasso, “Ultrathin plasmonic optical vortex plate based on phase discontinuities,” Appl. Phys. Lett. 100(1), 013101–013103 (2012). 5. S. Larouche and D. R. Smith, “Reconciliation of generalized refraction with diffraction theory,” Opt. Lett. 37(12), 2391–2393 (2012). 6. X. Ni, S. Ishii, A. V. Kildishev, and V. M. Shalaev, “Ultra-thin, planar, Babinet-inverted plasmonic metalenses,” Light Sci. Appl. 2(4), e72 (2013). 7. S. Sun, Q. He, S. Xiao, Q. Xu, X. Li, and L. Zhou, “Gradient-index meta-surfaces as a bridge linking propagating waves and surface waves,” Nat. Mater. 11(5), 426–431 (2012). 8. A. V. Kildishev, A. Boltasseva, and V. M. Shalaev, “Planar photonics with metasurfaces,” Science 339(6125), 1232009 (2013). 9. L. Verslegers, P. B. Catrysse, Z. Yu, and S. Fan, “Planar metallic nanoscale slit lenses for angle compensation,” Appl. Phys. Lett. 95(7), 071112 (2009). 10. S. Ishii, A. V. Kildishev, V. M. Shalaev, K.-P. Chen, and V. P. Drachev, “Metal nanoslit lenses with polarization-selective design,” Opt. Lett. 36(4), 451–453 (2011). 11. R. E. Collin, “Reflection and Transmission at a Slotted Dielectric Interface,” Can. J. Phys. 34(4), 398–411 (1956). 12. C. B. Burckhardt, “Diffraction of a plane wave at a sinusoidally stratified dielectric grating,” J. Opt. Soc. Am. 56(11), 1502–1508 (1966). 13. H. Kogelnik, “Coupled wave theory for thick hologram gratings,” Bell Syst. Tech. J. 48(9), 2909–2947 (1969). 14. F. G. Kaspar, “Diffraction by thick, periodically stratified gratings with complex dielectric constant,” J. Opt. Soc. Am. 63(1), 37–45 (1973). 15. R. Magnusson and T. K. Gaylord, “Analysis of multiwave diffraction of thick gratings,” J. Opt. Soc. Am. 67(9), 1165–1170 (1977). 16. K. Knop, “Rigorous diffraction theory for transmission phase gratings with deep rectangular grooves,” J. Opt. #200494 $15.00 USD Received 15 Nov 2013; revised 26 Dec 2013; accepted 28 Dec 2013; published 13 Jan 2014 (C) 2014 OSA 1 February 2014 | Vol. 4, No. 2 | DOI:10.1364/OME.4.000288 | OPTICAL MATERIALS EXPRESS 288 Soc. Am. 68(9), 1206–1210 (1978). 17. L. C. Botten, M. S. Craig, R. C. McPhedran, J. L. Adams, and J. R. Andrewartha, “The finitely conducting lamellar diffraction grating,” Opt. Acta (Lond.) 28(8), 1087–1102 (1981). 18. M. G. Moharam and T. K. Gaylord, “Three-dimensional vector coupled-wave analysis of planar-grating diffraction,” J. Opt. Soc. Am. 73(9), 1105–1112 (1983). 19. L. Li, “Multilayer modal method for diffraction gratings of arbitrary profile, depth, and permittivity,” J. Opt. Soc. Am. A 10(12), 2581–2591 (1993). 20. E. Noponen and J. Turunen, “Eigenmode method for electromagnetic synthesis of diffractive elements with three-dimensional profiles,” J. Opt. Soc. Am. A 11(9), 2494–2502 (1994). 21. G. Granet and B. Guizal, “Efficient implementation of the coupled-wave method for metallic lamellar gratings in TM polarization,” J. Opt. Soc. Am. A 13(5), 1019–1023 (1996). 22. J. M. Jarem and P. P. Banerjee, “A nonlinear, transient analysis of twoand multi-wave mixing in a photorefractive material using rigorous coupled-wave diffraction theory,” Opt. Commun. 123(4-6), 825–842 (1996). 23. P. Lalanne and G. M. Morris, “Highly improved convergence of the coupled-wave method for TM polarization,” J. Opt. Soc. Am. A 13(4), 779–784 (1996). 24. E. Popov and M. Nevière, “Maxwell equations in Fourier space: fast-converging formulation for diffraction by arbitrary shaped, periodic, anisotropic media,” J. Opt. Soc. Am. A 18(11), 2886–2894 (2001). 25. B. Momeni and B. Rashidian, “Improved coupled wave analysis of two-dimensional planar multiple gratings,” IEEE Trans. Antenn. Propag. 52(1), 165–171 (2004). 26. A. David, H. Benisty, and C. Weisbuch, “Fast factorization rule and plane-wave expansion method for twodimensional photonic crystals with arbitrary hole-shape,” Phys. Rev. B 73(7), 075107 (2006). 27. X. J. Ni, Z. T. Liu, A. Boltasseva, and A. V. Kildishev, “The validation of the parallel three-dimensional solver for analysis of optical plasmonic bi-periodic multilayer nanostructures,” Appl. Phys., A Mater. Sci. Process. 100(2), 365–374 (2010). 28. A. V. Kildishev and U. K. Chettiar, “Cascading optical negative index metamaterials,” J. Appl. Comput. Electromagnetics Soc. 22, 172–183 (2007). 29. S. M. Rytov, “Electromagnetic properties of a finely stratified medium,” Sov. Phys. JETP 2, 466–475 (1956). 30. P. Yeh, A. Yariv, and C.-S. Hong, “Electromagnetic propagation in periodic stratified media. I. General theory,” J. Opt. Soc. Am. 67(4), 423–438 (1977). 31. F. Tisseur and K. Meerbergen, “The Quadratic Eigenvalue Problem,” SIAM Rev. 43(2), 235–286 (2001). 32. X. Ni, S. Ishii, M. D. Thoreson, V. M. Shalaev, S. Han, S. Lee, and A. V. Kildishev, “Loss-compensated and active hyperbolic metamaterials,” Opt. Express 19(25), 25242–25254 (2011). 33. F. S. Acton, Numerical Methods that Work (Math. Assoc. Amer., Washington, DC, 1990). 34. J. Elser, V. A. Podolskiy, I. Salakhutdinov, and I. Avrutsky, “Nonlocal effects in effective-medium resfponse of nanolayered metamaterials,” Appl. Phys. Lett. 90(19), 191109 (2007). 35. J. W. Brown and R. V. Churchill, Complex Variables and Applications (McGraw Hill, 2004). 36. M. J. D. Powell, A Hybrid Method for Nonlinear Equations (Gordon and Breach, 1970). 37. GNU Scientific Library, (1996–2013). http://www.gnu.org/s/gsl/. 38. X. Ni, Z. Liu, F. Gu, M. G. Pacheco, J. Borneman, and A. V. Kildishev, “PhotonicsSHA-2D: Modeling of single-period multilayer optical gratings and metamaterials,” (2012), doi: 10.4231/D3WS8HK4X. https://nanohub.org/resources/6977. 39. Z. Liu, K. P. Chen, X. Ni, V. P. Drachev, V. M. Shalaev, and A. V. Kildishev, “Experimental verification of two-dimensional spatial harmonic analysis at oblique light incidence,” J. Opt. Soc. Am. B 27(12), 2465–2470 (2010). 40. A. V. Kildishev, J. D. Borneman, X. J. Ni, V. M. Shalaev, and V. P. Drachev, “Bianisotropic effective parameters of optical metamagnetics and negative-index materials,” Proc. IEEE 99(10), 1691–1700 (2011). 41. M. Frigo and S. G. Johnson, “The design and implementation of FFTW 3,” Proc. IEEE 93(2), 216–231 (2005).


Introduction
Fabricating a lattice of direct or inverted nanoantennas by creating nanoscale perforations in a subwavelength-thin metal film [1][2][3][4][5][6][7][8] is a practical way of realizing working photonic devices built on the concept of the generalized Snell's law [1,5,6]. Using such ultra-thin metasurfaces for mimicking an ideal, continuous, and surface-confined phase gradient is always somewhat inexact; the lattice of individual structural units provides only an approximation (with discrete lateral distribution of finite phase steps) needed to generate a desired wavefront of the scattered cross-polarized light. Moreover, the nanoantenna-based metasurfaces can generate an additional, undesired scattered signal aligned with the incident polarization [1][2][3][4][5][6]. Nonetheless, this most general approach to creating 3D beam-shaping devices (lenses, axicons, phase plates, etc) may still be promising, provided that the designed functionality and features are beyond our reach through other means. Introduction of elemental nanostructures that support localized polaritonic modes (e.g., the conventional or Babinet-inverted plasmonic v-shape antennas and optical metamagnetics arranged of periodic coupled metal nanostrips) results in substantially stronger, sometimes exotic optical responses [6][7][8][9][10].
Even in a simple practical focusing device -plasmonic metal slit lenses [9,10], which consist of a number of short, metal-insulator-metal (MIM) waveguides, with width-adjustable effective propagation constants -the resulting electromagnetic modes are much more complicated than in periodic dielectric lattices. In such devices, the arrays of MIM waveguides are designed to generate the combined transmitted and reflected wavefronts, which propagate either out of the surface or into the substrate. Along with the focusing of incident light, the nanoslit metal lenses also generate surface modes on the lens interfaces and inside the slits. The combination of those effects complicates the theoretical and numerical analysis of such structures, as the coupling between the individual scattering elements (slits) and all the above modes must be taken into account.
One of the most popular methods for solving the general linear scattering problem for a periodic array of elemental sub-wavelength-scale structures is the Fourier Mode Matching, FMM (also known as the Rigorous Coupled Waves Analysis, RCWA, and the Spatial Harmonic Analysis, SHA) method [11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27][28]. Simply stated, by using the FMM method, one turns the wave equation inside the given planar metasurfaces into an eigenvalue problem and tries to get the possible wavefronts (eigenvectors) and propagation constants (eigenvalues). The solution is solely built on a set of material and geometrical properties which are preserved by translation along the array periodicity directions normal to the metasurface boundary. The convenience of using the Fourier basis implies the truncation of the entire set of eigenmodes, and requires expensive, poorly-scalable numerical solution of a matrix eigenvalue problem. Moreover, the usually convenient Fourier basis is not a good fit for the particular purpose of approximating the distribution of dielectric constants within a given structure, which are discontinuous within the structure. For this reason, not only the numerical accuracy of computing each mode depends substantially on the total amount of modes taken into account [27], but the appearance of polaritonic (plasmonic or phonic) modes requires a specific, fastconverging formulation of the matrix eigenvalue problem in hand.
The approach described in this paper provides a means to solve the eigenvalue problem for a realistic subwavelength-patterned cascaded multilayer structure, and therefore calculation of the complex scattering coefficients. The development of this approach is motivated by our previous theoretical, numerical, and experimental studies of angular dependences of reflectance and transmission of plasmonic metasurfaces [6]. Those initial numerical experiments provided us with detailed information about the overall performance and scalability of our 2D and 3D SHA solver built on the standard, scalable linear algebra libraries [27]. Brief analysis of those numerical experiments is presented in [27], which describes the theoretical treatment of the proposed approach. Instead of solving an approximate (truncated) matrix eigenvalue problem (MEP), this paper is built on finding complex roots of an equivalent, yet exact nonlinear equation (NLE) for a given number of the same eigenvalues.
There have been many studies of the conventional single-periodic or double-periodic MEP-based methods, including the reflection and transmission of cross-gratings and metasurfaces [6,10,27]. Discussions of the NLE equivalents are much fewer and date back to layered media classics such as Rytov and Yeh et al. [29,30]. It is the effective mode indices that are of prime interest here. Addressing the question of how they couple to external fields deserves a separate publication. For simplicity, we choose to test our approach with a singleperiod structure. The initial development of the related methods built on MEP [31] was done by using the Fourier expansion based on the lattice period to approximate individual natural wavefronts. With this approach, the known structure of the wavefront inside individual layers is not taken into account exactly, and hence the problem becomes similar to any boundaryvalue problem reformulated in terms of a convenient orthogonal basis in space. The NLE adopted in the paper is derived from a boundary-value problem, which is extensively used to calculate the propagation properties of lamellar structures [32]. plane wave vector k x , the equation for a transverse field is formulated independently for each layer of a given unpatterned lamellar structure. Then, the system of equations for adjacent layers is solved by expanding the fields in each layer in terms of plane waves with the corresponding perpendicular wave vector k y and by applying appropriate electromagnetic boundary conditions at the interfaces. Adding the periodic Bloch-Flouquet boundary conditions couples the propagation constant values k x along the lamellar interfaces, leading to a non-linear equation. Hence, our method deals with a set of proper wavefronts (eigenvectors) moving synchronously along the interfaces of a given lamellar structure with a corresponding propagation constant (eigenvalue). Our method uses natural, most efficient, piece-wise continuous basis functions or wavefronts. These functions behave as smooth functions within each layer, and as continuous functions at the interfaces.
The remaining outline of our text is arranged as follows: Section II casts the Maxwell curl equations in a transverse field representation for p-and s-polarized incidence. The most general electromagnetic modes are defined in each layer, ignoring its boundaries. These fields are used (i) to obtain a set of synchronous wavefronts, propagating in the x-direction as a general plane wave, and (ii) to form a non-linear equation for the eigenvalues in terms of the period, material arrangement, angle of incidence, and the free-space wavelength. Section III describes how the NLE is solved, using the Lehmer-Schur algorithm [33]. A brief account of some important numerical benchmarks is also given in Section III, with particular emphasis on the individual eigenvalue accuracy and parallelization properties.

Physical fundamentals
The formulation of the proposed method for bi-periodic structures is quite involved. For the sake of brevity in this paper, we use a simplified formulation relevant to single-periodic structures for in-plane oblique incidence with p-or s-polarized light. The complete formulation for bi-periodic 3D structures for arbitrary incidence will be published elsewhere. We consider a 2D cross-section of a single-periodic layered medium (with period δ along the y-direction). As we take a p-polarized plane wave incident at angle θ to the xdirection, a single component of the magnetic field (ˆh = H z ) sufficiently describes the propagation of light through the structure. The structure starts at 0 0 y = . Every i th layer is defined by its dielectric constant i ε and position i y of its boundary (see Fig. 1).
Using free-space wavelength λ as a geometrical scale reference, it is convenient to renormalize coordinates with a free-space wavenumber, 0  x k x y k y k δ δ and as in the optical range, 1 μ ≡ , a p-polarized plane wave in i th layer ( 1, and c are respectively the angular frequency and the speed of light. Plus and minus signs correspond to forward and backward propagating modes. Due to normalization (1), i g and x k imply dimensionless wavenumbers. We also use a normalized dispersion law, and include ι in the definition of the y-component of wave vector, Then, we take the Maxwell curl equations, the fields, and to define the boundary conditions (BC) at i th interface, , we also take into account the Bloch periodic boundary conditions

Systems of equations
We drop the monochromatic factor t e ιω − in (2), and write the BC (4) in a matrix-vector form, where ( ) , and matrix i m and its exact inverse 1 i − m are given by , ; then, from (6), we obtain a recurrence relation, 1 , where i t is defined by We stress that we can introduce i t only if matrices , Finally, we arrive at a homogeneous system of equations for 1 a , As (11) is the eigenproblem for 1 a , where the eigenvalue is equal to 1, the homogeneous system (11) has nontrivial solutions iff 0 − = t i , which results in the following characteristic equation: ( ) Moreover, as we have 2 1 s + unknowns ( 2s of i a ± , and a single x k , because each i g can be expressed through x k and i ε using the dispersion relation (3)), then the characteristic Eq.

Properties of matrix t
First, we calculate the determinant of matrix i t for , using (7) ( ) with a special case for s t from (10), Using the definition of t and Eqs. (13) and (14) we obtain 2 e . αδ = t (15) Then, (15) simplifies (12), providing the following equation, From (16), it follows that only the diagonal elements of t are important, so we could consider the following matrix

Binary medium
An example configuration for a binary single-periodic 2D structure is depicted in Fig. 1(b). The expression for K is given by Accordingly, the nonlinear Eq. (20) corresponds to the known result [34] Equation (21) is the main equation, which we use for testing the proposed method in our custom accuracy and performance tests. Once we take 0 α = (normal incidence) we obtain, matching the classical Rytov's result [29].

New algorithm implementation: theoretical description
While the problem of finding the roots (zeros) of the function on a complex plane is old and well explored, it remains rather difficult to solve. There are many "root polishing" algorithms available which can provide a given zero with a required accuracy. However, we must first isolate a zero from others in a given region with relatively good accuracy in order to get fast and reliable convergence of a given polishing algorithm (if the zero is of the first order). Here, we use the Lehmer-Schur algorithm [33] built on the "Argument Principle" (see for example [35]). Suppose that a function ( ) f z is meromorphic (analytic except for poles) from the domain interior to a positively oriented contour G , analytic and nonzero on G , then where Z is the number of zeros, P is the number of poles of the function in the domain, and ln′ denotes the logarithmic derivative of a given function ( ) Physics here corresponds to natural facts that the field in an element of the structure depends on the fields in the neighboring elements, and the wavefront of the field inside a structured unit cell is always different from that of a plane wave in a uniform media. First, we should determine a region to localize the roots, and use (23) to find the total number of zeros in the domain. Here, we choose a rectangle that includes the origin 0 z = . The rectangle should not be too large, as it is difficult (i) to excite modes with the propagation constants being much higher than that of the incident wave (we normalized all wave vectors to the wave vector of the incident light, meaning that the square shouldn't be more than a couple orders of magnitude larger than a unit square), or (ii) propagate the modes with substantial attenuation (i.e., with large imaginary parts of their propagation constants).
Next, the rectangular domain is split in half (in a real situation, any ratio may be used) over vertical and horizontal dimensions and use (24) again in order to locate "non-empty" (containing zeros) quadrants. Such iterations (splitting of zero-containing quadrants in four parts) are repeated until the resulting quadrants are small enough -meaning that zeros have been located close enough to ensure convergence of a given root-polishing algorithm. This process is schematically represented in Fig. 2, where crosses correspond to the values of x k (eigenvalues) of a binary periodic structure.
To test our approach, we consider a layer consisting of only two elemental materials (see Fig. 1(b)). This binary structure (Material 1) is composed of a gain-doped silica layer (with thickness 1 45 nm δ = and complex dielectric permittivity 1  Several consecutive splittings of the initial search area (a red rectangle) are shown in Fig.  2 by blue, orange, and black lines. The final zero-containing quadrants are shown as filled yellow rectangles. Overall, this is simply a generalization of the dichotomy method for the complex plane. As calculations for each quadrant are completely independent from each other, these calculations can be done in parallel.
It is convenient to use (25)  , Γ is split into relatively small steps, and is checked whether arguments on every step change by a small value with respect to 2π . If this change is greater than some threshold value, let's call it argument smoothness tolerance (say, 0.1), then we split our small interval in half and repeat calculation of the argument for both parts. Iterations are continued until this branch point is located, or it is understood that there is no branch point on this interval. At this point, arg f z has a jump of 2π , so we can correct our calculations of the argument function, which follows the principal branch only, by adding this jump to the sum of argument changes. It should be noted that instead of the argument function, the complex logarithm function can be used. During application of the Lehmer-Schur algorithm, we should watch closely for the conservation of the total number of zeros. If the total number of zeros is not conserved, it usually means that there is a zero on one of the boundaries and this boundary must be moved. After the zeros are separated, the hybrid Powell algorithm (a modification of 2D Newton's algorithm) [36] is used in order to obtain every zero with a desired accuracy. Currently, an implementation provided by the GNU Scientific Library [37] is used. As a reference point, we have used the simulation results from 2D spatial harmonics analysis (SHA), proposed in [28] and validated in [27,38] (2D-case) and in [27] (3D-case). The accuracy of the SHA method depends on the total number of Fourier modes. In the reference runs, 300 modes were taken. This number of modes is, with confidence, beyond the limit of convergence of the algorithm. This number is so high because the usual configuration of current nanostructures involves a complex piecewise constant function for the dielectric permittivity, which results in a slow convergence of the Fourier series. We compared the simulation results of the lowest propagating mode ( x k has the smallest imaginary value) as a function of the incident light's wavelength. Results are shown in Fig.  3(a) and 3(b), where the binary periodic structure (Material 2) had the following configuration: layer of silica doped with gain inclusions of thickness 1 20 nm δ = , and a layer of silver of thickness 2 20 nm δ = , incident angle 0 θ = . The optical constants of silver and doped silica were kept the same as in Material 1.

Accuracy of the proposed algorithm
We have used the following parameters of the algorithm: zeros tuning was started after the size of all the quadrants in the Lehmer-Schur algorithm smaller than curacy of the zero tuning was 16 10 − (the absolute value of the function at the zero-point). It can be seen that the same (up to the round-off error) results for 300 harmonics in SHA have been obtained. These results are very promising, taking into account the performance tests discussed later. Here, it should also be stressed that the proposed algorithm takes into account the piecewise character of the structure by using the natural proper functions. This means that the exact eigenvalues are obtained up to the accuracy of calculated zeros, which is limited from below by a round-off error of a given floating-point representation.
In Fig. 4, we demonstrate that the propagation constants ( x k ) from the MEP-SHA method converge to the results of the proposed method after increasing the number of harmonics. In this case (Material 3) the binary structure had the following composition: silica layer of thickness 1

Parallelization of the algorithm
After each iteration of the Lehmer-Schur process, a set of zero-containing quadrants is found. Every such quadrant is processed independently. Thus, we have a natural, parallel part of the algorithm. Another important part is the final tuning of zeros with Powell's algorithm, where each zero is independent. However, as we know that this part of the entire algorithm takes a small fraction of the total computation time, it is much more beneficial to optimize the calculation of ( )

( )
arg f x in the Lehmer-Schur process.
Currently, only the simplest Lehmer-Schur parallelization described above has been implemented. The shared memory model with the Linux implementation of POSIX threads as a parallelization mechanism is used. For our current requirements (simulation of metasurfaces), we are satisfied with the present performance, which is more than one order of magnitude faster than the traditional, MEP-based SHA. The comparison of the performances of SHA against the proposed algorithm is shown in Fig. 5. As one can see from Fig. 5(a), scalability of the proposed algorithm is better than in the case of SHA, but still far from linear. The total performance is much better, which is demonstrated in Fig. 5(b). Moreover, in order to obtain comparable accuracy in MEP-based SHA (even for the lowest evanescent modes), one must use hundreds of modes.
For the proposed approach, we have used a freely available gcc compiler and standard libraries, while for the SHA code we have used a highly optimized Intel compiler together with a specialized Intel Math Kernel Library. So we still have a room for further acceleration of the new algorithm implementation. Hence, the current speed up numbers at Fig. 5(b) can be considered as minimal values. All the performance tests have been performed on Intel Xeon 5450 CPU operating at 3.0 GHz under CentOS 5.6 Linux.

Conclusion
We propose an algorithm for simulating periodic metasurfaces and cascaded metamaterial structures. The current formalism is developed for 2D-geometry, but can be generalized for 3D-case without any difficulties. We decided to limit ourselves to the 2D-case in order to keep all derivations relatively simple and clear for understanding. The proposed algorithm was implemented, and its accuracy and performance have been thoroughly tested. The preliminary results are truly encouraging -both in terms of accuracy and speed. Even the proof-ofconcept, pilot version of the proposed algorithm is significantly (at least one order of magnitude) faster than the traditional, MEP-based implementation of SHA.  We must point out that even the simplest case of a slit in a metal film requires using at least 80 modes in the MEP-based SHA (or similar algorithms) in order to get any reasonable correspondence with experimental results [39]. In this case, our method offers more than one order-of-magnitude less time. At the same time, more involved configurations may require even more modes, as the main problem of the Fourier basis in MEP-based approaches is poor representation of piecewise constant distribution of dielectric constants in the media, which leads to slow convergence of higher frequency modes. This means that the further from the origin we go in the spatial spectrum, the more significant is the error (see Fig. 4). In other words, higher frequency modes obtained by SHA have a tendency to deviate from their exact values stronger than lower frequency modes. In contrast with the MEP-based approaches, the proposed method includes naturally the sharp boundaries. As a result, all calculated propagation constants are exact up to a desired level of accuracy, which is limited from below by a round-off error of floating-point representation.
One may speculate that the non-propagating modes are not that important. However, current metasurfaces and metamaterials operate mostly due to near-field coupling effects. In the case of cascaded layers (which is natural for construction of bulk metamaterials) the nearfield interaction becomes crucial. As it has already been demonstrated in [28], the significance of lower evanescent modes in cascaded structures increases dramatically with respect to single-layer case, and is still important even in thin symmetric structures on a substrate [40]. As a result, accurate calculation of the evanscent modes becomes absolutely necessary. This leads to an increasing number of modes in MEP-based methods, with N reaching several hundred modes. In this case, usage of our method is a must, because of several orders of magnitude advantage in performance.
As the number of required operations is dramatically decreased, the proposed solver demonstrates somewhat weak scalability, which is still better than the scalability of MEP-based methods. However, due to the exceptional speed enhancement of our new solver, even its current, non-optimal implementation exhibits very good performance. The solver already allows for fast global optimization of modern periodic metamaterials and metasurfaces by using a conventional, four-core desktop computer.
We truly believe that the proposed solver is a key to both fast computation of periodic nano-structures on conventional personal computers and feasible application of the optimization algorithms, which requires many repeating calculations of the given structure. Future work will include the implementation of the proposed algorithm into our free, in-the-cloud simulation tool [38], which currently employs a conventional, MEP-based version of 2D SHA. Speed-up techniques for spectral and parametric sweeps will also receive our special attention.