Fast and accurate electromagnetic solutions of finite periodic optical structures

Electromagnetic applications of periodic materials have become popular in many modern optical and RF applications. The accurate computation of the electromagnetic response of large structures requires solving problems with high number of unknowns. Fast methods are useful to deal with such big problems, but, in general they do not take advantage of the periodicity properties. Based on the behaviour of impedance matrices involved in the solution of the surface integral equations with the Method of Moments, an accelerated solution based on the FFT is implemented. The presented approach slots the original impedance matrix and it applies the FFT to calculate the exact solution of the matrix vector product in an iterative process. The proposed solution achieves a linear memory cost proportional to O(N ) and a computing time of O(N log N ), where N is the problem number of unknowns. Also, in this paper, the advantages of this technique are shown in the developed applications. c © 2017 Optical Society of America OCIS codes: (240.6695) Surface-enhanced Raman scattering; (160.3918) Metamaterials; (260.2110) Electromagnetic optics; (050.1755) Computational electromagnetic methods. References and links 1. M. E. Veysoglu, R. T. Shin, and J. A. Kong, “A Finite-Difference Time-Domain analysis of wave scattering from periodic surfaces: Oblique incidence case,” J. Electromagn. Waves Appl. 7, 1595–1607 (1993). 2. P. Yeh, A. Yariv, and C.-S. Hong, “Electromagnetic propagation in periodic stratified media. I. General theory,” J. Opt. Soc. Am. 67, 423–438 (1977). 3. O. Ergul and L. Gurel, The Multilevel Fast Multipole Algorithm (MLFMA) for Solving Large-Scale Computational Electromagnetics Problems (Wiley-IEEE Press, 2014). 4. O. Ergul and L. Gurel, “Accurate solutions of extremely large integral-equation problems in computational electromagnetics,” Proc. IEEE 101, 342–349 (2013). 5. J. M. Taboada, M. G. Araujo, F. Obelleiro, J. L. Rodriguez, and L. Landesa, “MLFMA-FFT parallel algorithm for the solution of extremely large problems in electromagnetics,” Proc. IEEE 101, 350–363 (2013). 6. O. Ergul and L. Gurel, “Fast and accurate analysis of large-scale composite structures with the parallel Multilevel Fast Multipole Algorithm,” J. Opt. Soc. Am. A 30, 509–517 (2013). 7. J. M. Taboada, M. G. Araujo, J. M. Bertolo, L. Landesa, F. Obelleiro, and J. L. Rodriguez, “MLFMA-FFT parallel algorithm for the solution of large-scale problems in electromagnetics,” Prog. Electromagn. Res. 105, 15–30 (2010). 8. O. Ergul, “Analysis of composite nanoparticles with surface integral equations and the Multilevel Fast Multipole Algorithm,” J. Opt. 14, 062701 (2012). 9. M. G. Araujo, J. M. Taboada, J. Rivero, D. M. Solis, and F. Obelleiro, “Solution of large-scale plasmonic problems with the multilevel fast multipole algorithm,” Opt. Lett. 37, 416–418 (2012). 10. X.-C. Nie, L.-W. Li, and N. Yuan, “Precorrected-FFT algorithm for solving combined field integral equations in electromagnetic scattering,” J. Electromagnet. Wave 16, 1171–1187 (2002). 11. V. Okhmatovski, M. Yuan, I. Jeffrey, and R. Phelps, “A three-dimensional precorrected FFT algorithm for fast Method of Moments solutions of the mixed-potential integral equation in layered media,” IEEE Trans. Microw. Theory Tech. 57, 3505–3517 (2009). 12. M. F. Catedra, R. P. Torres, J. Basterrechea, and E. Gago, The CG-FFT Method: Application of Signal Processing Techniques to Electromagnetics (Artech House, 1995). 13. R. F. Harrington, Field Computation by Moment Methods (Wiley-IEEE Press, 1993). 14. B. E. Barrowes, F. L. Teixeira, and J. A. Kong, “Fast algorithm for matrix-vector multiply of asymmetric multilevel block-Toeplitz matrices in 3-D scattering,” Microw. Opt. Tech. Lett. 31, 28–32 (2001). Vol. 25, No. 15 | 24 Jul 2017 | OPTICS EXPRESS 18031 #297676 https://doi.org/10.1364/OE.25.018031 Journal © 2017 Received 9 Jun 2017; accepted 29 Jun 2017; published 18 Jul 2017 15. Y. Zhang, J. Y. Li, C. H. Liang, and Y. J. Xie, “A fast analysis of large finite-slot phased arrays fed by rectangular waveguides,” J. Electromagnet. Wave 18, 715–727 (2004). 16. H. Zhai, Q. Chen, Q. Yuan, K. Sawaya, and C. Liang, “Analysis of large-scale periodic array antennas by CG-FFT combined with equivalent sub-array preconditioner,” IEICE Trans. Electron. E89B, 922–928 (2006). 17. E. H. Bleszynski, M. K. Bleszynski, and T. Jaroszewicz, “Block-Toeplitz fast integral equation solver for large finite periodic and partially periodic antenna arrays,” in “2003 IEEE Topical Conference on Wireless Communication Technology,” (2003), pp. 428–429. 18. E. H. Bleszynski, M. K. Bleszynski, and T. Jaroszewicz, “Block-Toeplitz fast integral equation solver for large finite periodic and partially periodic array systems,” IEICE Trans. Electron. E87C, 1586–1594 (2004). 19. J. M. Taboada, J. Rivero, F. Obelleiro, M. G. Araujo, and L. Landesa, “Method-of-moments formulation for the analysis of plasmonic nano-optical antennas,” J. Opt. Soc. Am. A 28, 1341–1348 (2011). 20. A. J. Pickles and M. B. Steer, “Effective permittivity of 3-D periodic composites with regular and irregular inclusions,” IEEE Access 1, 523–536 (2013). 21. S. Schlucker, “Surface-enhanced raman spectroscopy: Concepts and chemical applications,” Angew. Chem. Int. Ed. Engl. 53, 4756–4795 (2014). 22. R. A. Alvarez-Puebla and L. M. Liz-Marzan, “SERS-based diagnosis and biodetection,” Small 6, 604–610 (2010). 23. D. M. Solis, J. M. Taboada, L. Landesa, J. L. Rodriguez, and F. Obelleiro, “Squeezing Maxwell’s equations into the nanoscale,” Prog. Electromagn. Res. 154, 35–50 (2015).


Introduction
The electromagnetic analysis of large periodic structures is an interesting topic in the scope of computational electromagnetics, from antenna arrays, RF metamaterial and metasurfaces (based on periodic patterns of resonant or non-resonant cells) to new applications in the optical regime for plasmonic periodic structures, among others.Several methods have been proposed to deal with this kind of problems.
The Floquet's harmonic theory [1,2], based on series expansions, is useful for analysing the electromagnetic behaviour of periodic infinite body arrays.Although Floquet theory might also be used to approximate the electromagnetic response of a finite periodic structure, important issues in such systems, such as edge effects or stationary waves, would not be properly taken into account.
Solutions based on the discretization of the surface integral equations can be applied to solve large problems through iterative methods and acceleration techniques.Among them, the Multilevel Fast Multipole Algorithm (MLFMA) [3][4][5][6][7][8][9] is one of the most useful approaches.To obtain the solution, the MLFMA accelerates the Matrix Vector Product (MVP) in the framework of an iterative process.Its efficiency achieves O(N log(N )) (where N is the problem number of unknowns).The MLFMA does not take special advantage of periodic problems since it is designed for arbitrary shaped geometries.Another extended approach for fast MVP calculation is the precorrected Fast Fourier Transform (p-FFT) method [10,11].At the expense of a significant loss of accuracy, the p-FFT transforms a general problem in a new uniform threedimensional (3D) grid alternative scheme.The solution can be accelerated using the Fast Fourier Transform (FFT).The CG-FFT method [12] is a similar procedure for accelerating a 3D uniform grid discretized problem using the FFT.
Apart from the use in general non periodic problems, the FFT has been applied to solve, accurately, some kind of periodic structures.For example, in some specific antenna array systems, the solution of the Method of Moments (MoM) [13] deals with multilevel block-Toeplitz matrices that can be solved in a fast way using the FFT [14][15][16].For dealing with generic block-Toeplitz matrices, another FFT based approach is developed in [17,18] that obtains fast solutions for periodic antenna arrays.By taking advantage of the periodic nature of the composite, techniques in [14][15][16][17][18] accelerate the MVP without any loss of accuracy.Also, the technique proposed in this paper has these same advantages in addition to be valid for general periodic composites.
By taking pieces (slots) of the MoM impedance matrix, the solution of the MVP can be partitioned in several portions (also slots).Each one of these partial MVP's can be efficiently solved in a fast way using the FFT.This technique, so called slot FFT, is applied for obtaining, both, accurate and fast electromagnetic solutions of very and extremely large generic periodic composites, e.g.metamaterials or plasmonic periodic structures in the optical regime.
The slot FFT will be explained in detail in the following section.Two different applications in the optical regime are presented in the Results section.Finally, the significance of the accuracy in fast solutions of large periodic problems is explained in the Discussion section.

The slot FFT method for periodic problems
Let us consider a periodic electromagnetic problem of M bodies (Fig. 1) each of them identically modelled with the MoM and using n unknowns.Let Z be the N × N impedance matrix of the MoM, being N = nM the total number of unknowns of the problem [13,19].For the sake of simplicity, only the one dimensional (1D) periodicity will be explained.In this case, the impedance matrix can be block-wise expressed as: where the submatrix Z i , j is composed of n × n elements, representing the electromagnetic interaction of the radiating basis functions of the body j over the testing functions of body i.
In addition, since bodies are identically modelled and realizing that the interactions between elements depend only on their relative location, the submatrices fulfil the following properties: Replacing Eqs. ( 2)-( 4) into Eq.( 1), Z can be expressed as a block Toeplitz Matrix as follows: To simplify the notation let Z k be the matrix given by: Replacing Eq. ( 6) into Eqs.( 2)-( 4) we get Hence, matrix Z given by Eq. ( 5) can be expressed as: If the Z k elements were scalar (1 × 1 matrices), the MVP of martrix Z by a vector x would be solved as a convolution.In such a case, this MVP could be solved in O(N log N ) operations by using the FFT to accelerate the convolution.Nevertheless, this is only an hypothetical case and we need to deal with general block Toeplitz matrices in order to accelerate the MVP.
Transforming Z from block-Toeplitz to a standard Toeplitz matrix, the FFT algorithm is directly applied slotting the matrix and the solution.This transformation is the key point of the slot FFT method.
Let R i be the block row matrix satisfying: Hence, matrix Z given in Eq. ( 10) can be written as: Next, we define the special matrix R of dimensions n × (2N − n) which is a container of the R i matrices: It is easy to see that the R i matrices, 0 < i ≤ M, can be obtained from R taking columns from N − in + 1 to 2N − in.This property can be expressed using MATLAB notation as R i = R(:, (N − in + 1) : (2N − in)).
Let Z k be the n injected N × N matrices, with 0 < k ≤ n, defined as: In general, i − j − k + N + 1 > 0 in the above equation.For the very few elements in which the evaluation of the injected matrix Z k element is out of the range of R we can take, for example, zero or Not-a-Number values.
Matrix Z k is built by using only the k-row of R. Notice that the k + in row (with i = 0, . . ., M − 1) of the Z k matrix is equal to the k + in row of the real original Z matrix, that is: In addition, each row of Z k can be obtained by shifting the precedent row as: which implies that each one of the Z k matrices (k = 1 . . .n) is a Toeplitz matrix.
Let S k be the slotting matrix given by where ṅ represents a multiple of n. S k is a matrix with M periodically distributed ones in the diagonal.It satisfies the following properties: • The product of two slotting matrices is: • The sum of all the slotting matrices gives the identity matrix: hence, any matrix, A, can be written as: • The S k operator produces the same result when it is applied to Z and when it is applied to the applicable injected matrix.Using Eq. ( 15): Summing over k in Eq. ( 19) we obtain: Finally, replacing Eq. ( 18) in Eq. ( 20), the matrix Z is obtained from slotted versions of the injected matrices Z k as follows Then the MVP of the matrix Z by a vector x can be calculated using the Eq. ( 21) as: The computational cost of Eq. ( 22) is dominated by the product Z k x (the product of S k by a vector is a trivial replacement by zeros).Given that Z k is a Toeplitz matrix, Z k x is a convolution and it can be solved in a expedited way using the FFT in O(N log N ) operations.Since there are n MVP operations in Eq. ( 22), the total cost of evaluating the product Z x is O(nN log N ).
Notice that no calculation of injected matrices, Z k , is required in the slot FFT method.Also, the calculation of the MoM matrix, Z, is not required either.The method computes only one matrix, R, expressed in Eq. ( 13), i.e. it needs to store n × (2N − n) elements in memory.
In short, if n, the number of unknowns of each cell element, were negligible compared with M, the method would be virtually of order O(N log N ) and O(N ) in terms of CPU time and memory requirement, respectively.

Results
The following results were performed in a High Performance Computing (HPC) system composed by 4 Intel Xeon E7-4820 at 2.00GHz (eight cores each one, 32 cores in all), with 512GB of RAM memory.
In Fig. 2(a) the memory cost of the whole MoM and the slot FFT using periodic structures in 1D, 2D and 3D uniform grids is showed.These results are obtained for the scattering of a λ = 540nm wavelength electromagnetic wave impinging on 1D, 2D and 3D periodic compositions of gold nanospheres of radius 100nm and 280nm separation between centres; each sphere is modelled with n = 240 unknowns and using a number of spheres, M, from just a few units to 40000.A significant memory saving can be observed in the slot FFT, given that only R in Eq. ( 13) needs storage.In Fig. 2(b) the mean computing time of the MVP is shown.The differences between the O(nN log N ) cost of slot FFT and the O(N 2 ) direct cost of carrying out the MVP are observed.For more than 100 • 10 3 unknowns, costs for the direct MoM MVP solution are extrapolated for both memory usage and MVP time.Notice that in order to avoid bias in the comparison, these results were performed without parallelization.
The first example is the electromagnetic scattering analysis at a wavelength of λ = 550nm from a periodic dense metastructure composed of 110 × 110 × 110 nanocubes in a uniform 3D grid.The edge dimension of the nanocubes is 55nm and the separation between centres is 110nm.The composite parameters are r = 3 and μ r = 1 and the nanocubes are suspended in vacuum medium.The problem is composed of nearly 50 million unknowns.In Fig. 3 the Radar Cross Section (RCS) of the metastructure is showed.The RCS is calculated for normal plane wave incidence (θ = 90 o ) using the slot FFT to speed up the MVP.This result is compared to the RCS of two different homogeneous dielectric cubes of edge dimension 12.1μm and r = 1.1579 and r = 1.1638 (effective permittivity obtained with the Maxwell-Garnett and Bruggeman homogenization method [20], respectively).The computation of the RCS for the two entire homogeneous cubes were performed using the MLFMA algorithm [5].The results of the homogeneous effective cube using MLFMA needed 1, 7 million unknowns.This result proves that both homogenization formulas (Maxwell-Garnett and Bruggeman) provide accurate results for general purposes.Nevertheless, in this kind of large problems, we can make use   of the full-wave solution based on the slot FFT for a more accurate result.The three results show a good agreement.Notice that 30 degrees apart from the principal directions a slight difference between the electromagnetic scattering of the two homogeneous cubes and the real electromagnetic response of the large periodic composite object is revealed.The properties of the real and homogeneous material differs at the microscopic scale, which is important to describe the exact behaviour of the material.As second example, a plane-shaped array composed by gold nanospheres is analysed.In this example we study the surface-enhancement Raman spectroscopy (SERS) [21][22][23] to show the effects that appear in large SERS substrates.Some important effects in SERS may not be discovered with other widespread analysis techniques, e.g.approximating the structure by an infinite array or analysing it by a small portion of it, this giving a wrong prediction of the behavior of the substrate in a real scenario.This plays an important role in these kind of applications, as the SERS intensity distribution may present stationary wave-patterns effects.
The studied array was composed of 200 × 200 gold nanospheres located in an uniform 2D grid and the simulation was performed at a wavelength of 600nm, in a medium filled with water ( r = 1.7689) and illuminated by two incident plane waves perpendicular to the array, each of them linear polarized along the principal axes.The diameter of the nanospheres was 55nm and the separation between centres was 56nm.The discretization of the problem resulted in nearly 10 million unknowns.
For the sake of simplicity, SERS intensity was calculated using no Raman shift.This approach is valid for small Raman shift and yields close-fitting results in a faster way through the following expression [23]: where E is the total electric field in the presence of the SERS substrate and E i the field of the incident planewave.Figure 4(a) shows the SERS enhancement obtained after smoothing with a 2D Gaussian filter-to simulate the effect of a lens-.The representation is saturated so as to give a better illustration of the effects that appear along the xy plane.It is clearly discernible how the edge effect and the stationary wave pattern appear.This makes highly recommended the analysis of real case substrates.

Discussion
The presented approach, based on slotting partial FFT solutions (Eq.22), is a very efficient way to compute the solution of very or extremely large electromagnetic finite periodic problems.The amount of required memory of O(nN ) and the O(nN log N ) computing time allow us to analyse complicated periodic problems in the optical regime without accuracy loss.Although, for the sake of clarity and conciseness, only the 1D approach is formulated in this paper, their 2D and 3D counterparts can be formulated in an analogous fashion.In fact, in the results Section the use of 2D and 3D approaches were necessary.The method is straightforward to apply for any MoM implementation.Also, parallelization of the slot FFT does not require too much effort.

Fig. 2 .
Fig. 2. Comparison of the memory and computing time needed with MoM and slot FFT methods.

Fig. 3 .
Fig. 3. Scattering diagram of a 110 3 cube-shaped array and its equivalent homogenized by the Maxwell-Garnett and Bruggeman equivalent permittivities.

Figure 4 (
b) shows the SERS of a small part of the structure without filtering.The hotspots between nanospeheres can be appreciated, with peaks of 1.0824 • 10 5 arbitrary units.This figure also shows a saturated representation of the SERS intensity to give a better representation of the hotspots.(a) Representation of the SERS intensity obtained in the simulations for a 2D array of 200x200 gold nanospheres filtered by a Gaussian mask.(b) Detail of the region marked showing the hotspots between adjacent nanospheres.

Fig. 4 .
Fig. 4. Representation of the SERS intensity of a small part of the composite object.