Comparison of absorption simulation in semiconductor nanowire and nanocone arrays with the Fourier modal method, the finite element method, and the finite-difference time-domain method

For the design of nanostructured semiconductor solar cells and photodetectors, optics modelling can be a useful tool that reduces the need of time-consuming and costly prototyping. We compare the performance of three of the most popular numerical simulation methods for nanostructure arrays: the Fourier modal method (FMM), the finite element method (FEM) and the finite-difference time-domain (FDTD) method. The difference between the methods in computational time can be three orders of magnitude or more for a given system. The preferential method depends on the geometry of the nanostructures, the accuracy needed from the simulations, whether we are interested in the total, volume-integrated absorption or spatially resolved absorption, and whether we are interested in broadband or narrowband response. Based on our benchmarking results, we provide guidance on how to choose the method.


Introduction
Nanostructures enable new functionality for photodetectors and solar cells by inducing, for example, light trapping to increase the absorption of incident light [1][2][3]. However, the optical response of such nanostructures can be complicated and strongly dependent on the geometry of the design [4][5][6][7]. Therefore, when optimizing nanostructures for maximized absorption, optics modelling is a powerful tool allowing to reduce the need of time-consuming prototyping. In the optics modelling, the diffraction, scattering, and absorption of incident light are described by Maxwell's equations. The materials of the nanostructure are taken into account through wavelength-dependent refractive indexes of the materials. Here, we focus on nanostructure arrays in which we aim to absorb the incident light [1][2][3]. Thus, in our case of interest, the nanostructure array is not just an antireflection coating [8], but contains the photodetector or solar cell region [1][2][3].
To the best of our knowledge, a detailed comparison of different simulation methods for analysis of absorption in periodic semiconductor nanostructure arrays has not been performed. In contrast, for example for simulating diffraction gratings [9], anti-reflection coatings [10], plasmonic nanoparticles [11][12][13][14][15][16], metallic slit-groove systems [17], photonic crystals [18], photonic crystal line cavities [19], and photonic crystal fibers [20], numerical methods have been compared. However, due to the difference in materials, geometry, and type of optics problem, those results are not directly transferable to the case of absorption in periodic arrays of semiconductor nanostructures.

Geometry and materials
As test systems, we consider GaAs nanowire and nanocone arrays of circular cross-section on top of a GaAs substrate (see figure 1 for geometry and supporting information section 1.4, which is available online at stacks. iop.org/NANOX/1/030034/mmedia, for discussion of the possible effect of the cross-sectional shape on the simulation performance). GaAs is a direct bandgap semiconductor whose band gap corresponds to 872 nm in wavelength at room temperature [22]. In general, we expect our results to be relevant for a wide range of direct bandgap semiconductors of interest for nanostructured solar cells and photodetectors. The nanowires and nanocones are characterised by a (base) diameter D and length L, and they are placed in a square array of period P. Throughout the study, we consider a normally incident plane wave, which maximizes the incident power per unit area of nanostructure array. A detailed study of the modelling at non-normal incidence [23] is left for future work.

Methods
For the light scattering, we assume a model with linear, local, isotropic, non-magnetic, and time-harmonic Maxwell equations of the form and · ( ) l  = H r, 0,the two remaining Maxwell equations, follow. In these equations, ( ( )) l > r Im n , 0 induces absorption in the material. Below, we describe in some detail the three simulation methods used in this study, with more technical details given in supporting information section 1. Fourier modal method (FMM) For FMM [24][25][26], also known as the scattering matrix method or the rigorous coupled-wave analysis (RCWA), we use an in-house implementation in Fortran, where the numerically heavy vector and matrix operations are performed through the Intel MKL package [24]. This FMM implementation includes the N-vector field formulation [27] to describe the discontinuity of the refractive index in the x-y plane (see supporting information figure S9 for the drastic increase in the convergence rate when this N-field formulation is used). In FMM, we solve the scattering problem in the frequency domain, that is, at a pre-chosen λ in each modelling run.
For the normally incident light, we use an underlying Fourier basis of the from exp(ik x,a x)exp(ik y,b y) where k x,a =2πa/P x and k y,b =2πb/P y . Here, a and b are integers, and P x and P y are the period of the unit cell in the x and the y direction. Throughout this work, we use P x =P y =P. In the numerical implementation, we need to truncate the Fourier basis. Here, we truncate the basis through a=−m, −m+1, K m−1, m and b=−m, −m+1, K m−1, m. Thus, there are 2m+1 basis components in both the x and the y direction, that is, (2m+1) 2 basis components in total. In FMM, we project the Maxwell equations to this Fourier basis and work with the two in-plane electric field components E x and E y [24]. This projection gives matrices of the size where the additional factor of 2 originates from the two in-plane components. A larger m is expected to give more converged results, at the cost of a longer simulation time and a larger RAM requirement.
From the projected matrices, we can solve for optical eigenmodes in each z-invariant geometrical region of the system (see equation (9) in [24]). Next, we solve for the expansion coefficients of the eigenmodes in each such region, as induced by the incident plane wave. From these expansion coefficients, we can readily calculate the power flow through the x-y plane at any z position by performing the integration in the underlying Fourier basis functions [5]. Such a power flow can be used for example for calculating the reflectance R of the system and the transmittance T into the substrate at z=0, which in turn give the absorptance of the nanostructure array through A=1−R-T. To study the fields in the real-space, we need to perform a Fourier transform from the underlying Fourier basis to the real space [28] (see also supporting information section 1.1.4).

Finite element method (FEM)
For FEM, we use the Wave Optics Module in Comsol Multiphysics [29]. Also here, we solve the scattering problem in the frequency domain. In FEM, the volume of the system is discretized into a mesh of finite elements (see supporting information figure S8). The Maxwell equations are projected onto the finite elements, and we end up with a problem of the type Ax F =b where A contains information about the system, x F information of the electromagnetic field in the elements, and b the boundary conditions as set for example by the incident light. Thus, in FEM, we need to solve large equation systems.
The numerical accuracy of the solution depends on the choice for the size of the elements and the spatial distribution of the elements, which affects the number of elements and hence the number of degrees of freedom (NDOFs) in our equation set. A larger number of elements is expected to give more converged results, at the cost of a larger NDOFs, a longer computational time, and a larger RAM requirement. For further details on how the meshing was chosen, how the regions in the unit cell were defined, and how the computational speed and RAM usage of the linear equation solvers were tested, the reader is referred to section 1.2 of the supporting information document.
In FEM, we obtain as solution the electric field E(r,λ), from which spatially resolved absorption can be calculated, in a post-processing step, by integrating over the volume of interest. Here, A is obtained by integrating the spatially resolved absorption over the full nanostructure volume, that is, for 0<z<L.

Finite-difference time-domain (FDTD) method
For the FDTD method, we use the Lumerical FDTD Solutions software [30]. The spatial domain is discretized into a grid (see supporting information figure S8), and the spatial derivatives in the Maxwell equations are approximated by finite differences. Here, we use a uniform discretization with Δx=Δy=Δz. In contrast to FMM and FEM, FDTD performs the simulation in the time-domain. In FDTD, an incident pulse is sent toward the nanostructure array and forward-propagated with a time step of Δt through the system. Importantly, n(λ) cannot be used directly for the materials since FDTD works in the time-domain. Instead, a fitting of n(λ) to an oscillator form must be performed to represent the frequency dispersion in the time-domain (see supporting information figures S1 and S3 and supporting information section 1.3.1).
In the solution we need to keep track of the fields at each discretization point, but in contrast to FEM, no linear equation system needs to be solved. In FDTD, the convergence is dependent on the discretization step Δx, and more converged results are expected with decreasing Δx, at the cost of increased computational time and RAM usage.
In FDTD, to obtain the response of the system as a function of wavelength, we need to perform a Fourier transform from the time-domain to the frequency domain. For this purpose, we place spatial monitors at locations where the electromagnetic fields are recorded at each time step, and after the simulation, these fields are Fourier transformed. For example, to obtain information about T, we place a monitor spanning the x−y plane at z=0. After the simulation, the z-component of the Poynting vector at that surface is Fourier transformed and, for each wavelength of interest, integrated over the surface. Similarly, we can obtain R from a monitor placed above the nanostructure array, which in turn allows to calculate the absorptance with A=1 −R−T.

Symmetry reduction
The arrays shown in figure 1, with a single nanowire or nanocone located at x=0 and y=0 in the unit cell, show mirror symmetry about the x and the y direction, which is respected also by the normally incident plane wave. To speed up the calculations and to save RAM, we can then use symmetry reduction [21]. Due to the symmetry of x and y polarized light, we consider x polarized light without loss of generality.
In FMM, such a reduction is performed by using basis functions that respect the symmetry. In practice, we use a cos(k x,a x)cos(k y,b y) basis for E x and a sin(k x,a x)sin(k y,b y) basis for E y . This reduces the matrix size from , which for large m corresponds to a reduction by a factor of 4 in the number of both rows and columns.
In FEM and FDTD, with symmetry reduction, we model ¼ of the full unit cell, with 0xP/2 and 0yP/2 [21]. For the x-polarized incident light, we use perfect electric-conductor (PEC) boundary conditions along x=0 and x=P/2 and perfect magnetic-conductor (PMC) boundary conditions along y=0 and y=P/2. Thus, in FEM and FDTD, with the symmetry reduction, we reduce the volume of the simulation domain by a factor of 4.

Computational resources
Our focus is on simulations that can be performed on desktop workstations. The run times we report are benchmarked on a system with 64 GB of RAM and an Intel ® Xeon ® E3-1230 v5 central processing unit (CPU) with 4 physical cores, using Ubuntu 16.04.6 LTS with Linux 4.4.0−174-generic kernel. The run times are reported when using all four CPU cores for the calculations. For FEM and FDTD, we use all four CPU cores for a single instance of the software. In contrast, the FMM implementation is a single-core implementation, and we run four FMM implementation in parallel in a wavelength study, requiring four times the RAM of a single instance. See supporting information sections 1.1.3, 1.2.5, and 1.3.8 for additional details of the computational performance of FMM, FEM, and FDTD on the workstation used. Some of the computationally time-demanding simulation sweeps were performed at the Triton computational cluster available through the Aalto Science-IT project, with computational time and RAM usage extracted from the desktop workstation by considering selected test runs within the large sweeps.

Convergence metrics
Note that the nanostructure arrays considered in this study do not allow for analytical, exact solutions. In lack of exact reference values, we must approximate the convergence of the results. When presenting the convergence of the results below, we use the most converged results from any of the three methods as reference value.

Results
Below, we highlight the most important aspects that we found when comparing the three simulation methods. The supporting information document contains a large range of additional technical details in order to support the general conclusions here.

Validation of simulation methods
We start by considering a GaAs nanowire array of period P=400 nm. For the nanowires, we choose a diameter of D=160 nm and a length of L=2000 nm, which gives rather optimized absorption [5]. For the refractive index of GaAs, we used tabulated values [31] (see supporting information figure S1 for the wavelength dispersion of the refractive index). We show results also for a nanocone array with D=400 nm, P=400 nm, and L=960, which thus contains the same volume of GaAs as the considered nanowire array (for the reflectance and transmittance of the nanowire array and the nanocone array, see supporting information figure S2).
First, we validate that all three methods yield equivalent solutions for the light-scattering problem (figure 2). With regard to both the electric field distribution and the absorption spectrum, FMM, FEM, and FDTD give results that agree well. It is of note, however, that there are some visible discrepancies in the absorption spectrum modelled with FDTD compared to the spectrum modelled with FMM and FEM, especially at λ≈850 nm (figure 2(d)). This discrepancy originates from the oscillator fitting for the refractive index in the FDTD simulations. We used a large number of oscillators (nineteen) in this FDTD simulation, but, some discrepancy still remains in the refractive index and in the resulting A(λ) (see supporting information figures S1 and S3). Thus, the FDTD results are highly converged, but not to the same exact value as the FMM and FEM results. Therefore, any of these three methods works for calculating spectral and spatial properties for the nanostructure array. Then, the practical question that we aim to answer below becomes: at what computational cost do we obtain the results?

Convergence of absorptance in nanowires and nanocones
To obtain broadband information about the absorption in the nanowires or nanocones, we use the AM1.5G solar spectrum (see supporting information figure S6 for wavelength-dependent convergence of A(λ)). As a measure, we use the short-circuit current given by [33] ħ Here, E g =1.4225 eV [22] is the bandgap energy of GaAs at room temperature, and in equation (1), we assume that photons with energy below it give negligible contribution to the short-circuit current [33]. In practice, the lower limit on the integration in equation (1) is λ≈300 nm since the AM1.5G solar spectrum shows negligible intensity below that wavelength (see the light grey line in figure 2(d)). Throughout, the simulations are carried out for λ>270 nm. In equation (1), we assume that each absorbed photon gives rise to one charge carrier to the short-circuit current. The maximum value for j sc from equation (1), obtained with A(λ)=1, is 31.4 mA cm −2 .
For the nanowires and the nanocones in figure 2, we obtain j sc =27.1 mA cm −2 and j sc =23.4 mA cm −2 . In figure 3, we show the relative error in j sc as a function of run time. This relative error is defined as |j sc −j sc,ref |/j sc,ref where j sc,ref is a highly converged reference value.
For FDTD, we perform a single run that covers all wavelengths. For FMM and FEM, we perform runs with a wavelength step of Δλ=10 nm (see supporting information figure S4 for effect of Δλ). For this simulation, we For the electric field profiles with FMM, we paid attention to respect the Fourier factorization rules [28,32]. For FEM, we show a plane cut at y=0 of the solution E(r,λ). For FDTD, we placed for the electric field an x-z monitor at y=0. (d) Highly converged absorption spectra (left axis) for nanowires of D=160 nm, L=2000 nm, and P=400 and nanocones of D=400 nm, L=960 nm, and P=400, together with the AM1.5G spectrum in terms of incident photons (right axis). performed in FEM 64 such wavelength steps to cover the 270 to 900 nm wavelength range. For FMM, we modelled one wavelength per CPU core, that is, we run four FMM implementations in parallel to cover four wavelength steps in parallel, and 16 such batches to cover the 64 wavelengths.
For the nanowires, we used a highly converged result from FMM as a reference, and for the nanocones, we used a highly converged result from FEM as a reference (these two methods gave, respectively, the most converged results with our computational resources, as seen in figure 3). For FMM and FEM, we used the tabulated refractive index of GaAs. For more details about the convergence of each method, see supporting information figure S7 for convergence with increasing m in FMM, increasing NDOFs in FEM, and decreasing Δx in FDTD. Note that for FDTD, we calculated a second reference with FMM for nanowires and with FEM for nanocones with the refractive index resulting from the oscillator fitting in FDTD (see supporting information figure S3 for the difference in j sc when using tabulated and fitted refractive index).
For the nanowires, FMM appears to converge 2 orders of magnitude faster than FEM and 4 orders of magnitude faster than FDTD ( figure 3(a)). Here, FEM shows better than 10 −3 convergence even at the coarsest mesh used, and the run time did not decrease by trying to make the mesh coarser since the run time per wavelength point settled at approximately 4 s with the coarsest mesh used (see supporting information figure S12). Thus, if we require 10 −3 convergence, FEM appears to perform 2 orders of magnitude faster than FDTD, as estimated from linear extrapolation of the FDTD results toward 10 −3 relative error. However, if 10 −2 relative error is sufficient, FDTD performs an order of magnitude faster than FEM.
When moving to consider nanocones, FMM shows a drastic slowing down when trying to reach better than 10 −2 convergence ( figure 3(a)). This slowing down is associated with the staircase approximation [34], which we use to take into account the slanted sidewalls of the nanocone (see supporting information figure S5 for more details about the effect of the staircase approximation). In contrast, FEM and FDTD appear to perform slightly faster for the nanocones than for the nanowires. For nanocones, FMM appears as the method of choice for 10 −2 convergence, FDTD or FEM for 10 −3 convergence, and FEM for 10 −4 convergence.
Convergence of absorption in core-shell structure A high bandgap semiconductor shell is common for passivating the surface of GaAs nanowires [35,36]. To investigate the optics simulation of such systems, we consider a 20 nm thick AlGaAs shell around GaAs nanowires ( figure 4(a)). For the considered 80% Al content, with refractive index from [37], the shell starts absorbing noticeably at λ=500 nm ( figure 4(b)). Here, in equation (1) we use the absorptance in the core and the shell to give the corresponding j sc in each material, separately. The resulting values are j sc,core =25.1 mA cm −2 and j sc,shell =1.2 mA cm −2 . Compared to the calculation of j sc =j sc,core +j sc,shell , which can be obtained in FMM and FDTD from A=1-R-T, for j sc,core and j sc,shell we need to solve for spatially resolved absorption (see supporting information sections 1.1.4, 1.2.4, and 1.3.9 for technical details for such absorption calculation).
First, we notice that for all three methods, the relative error in j sc,shell is, at a given run time, approximately an order of magnitude worse than for j sc,core ( figure 4(c)). This slower convergence of j sc,shell originates from the approximately one order of magnitude lower absolute value of j sc,shell . Thus, the absolute convergence of j sc,shell and j sc,core is rather similar. Similarly as for j sc in the bare nanowires in figure 3, the convergence is the slowest in FDTD for all three of j sc , j sc,core , and j sc,shell . Also for these core-shell nanowires, FMM appears as the method of choice if we need only the total absorption, expressed through j sc . However, if j sc,shell is needed with better than 10 −2 relative error or j sc,core with better than 10 −3 relative error, FEM appears as the method of choice.

Convergence of absorption in quasi-random array
To test the performance of the simulation methods for larger systems, we consider quasi-random nanowire arrays that show disorder in the x-y plane [38,39] (see figure 5(a)). To create the quasi-random array, we started from the highly symmetric square array of nanowires used in figure 2 and increased the number of nanowires by an integer N in both the x and the y direction in the simulation domain. That is, for N, we consider N 2 nanowires in a supercell of size N×P in both the x and the y direction. Then, we moved each of the N 2 nanowires by a random distance in the range −0.25P to 0.25P in both the x and the y direction from their original position. Next, we gave to each nanowire a random diameter in the range from 60 to 260 nm. Finally, we scaled the diameter of all the nanowires in the supercell by the same factor in such a way that the volume of GaAs in the nanowires is equal to that in the N=1 square array of D=160 nm. See figure 5(a) for the resulting supercells for the N=1, 2, 3, 4, 5, and 6 that we consider here. Due to the broken symmetry for N2, we average the absorption of x and y polarized incident light. Here, in FEM and FDTD, we perform separate simulation for x and y polarized incident light. In FMM, a single simulation run yields separately the response for x and y polarized incident light.
We investigated the run time ( figure 5(b)) and RAM usage (figure 5(c)) to reach lower than 1% relative error in j sc . The reader is referred to supporting information table S1 for the m, NDOFs, and Δx used in the FMM, FEM, and FDTD, respectively, for the varying N, as well as for discussion of possible variation in their values for varying realizations of the random array configuration for given N. Note that since we run four parallel FMM implementations, one on each CPU core, the reported RAM usage for FMM is a factor of four higher than for a single instance of the FMM implementation. As reference for j sc to assess the convergence, we used a highly converged value from FMM with m=35.
For all the considered values for N, FEM was slower than FDTD and required more RAM. For N=1, FMM is an order of magnitude faster than FDTD and almost three orders of magnitude faster than FEM. When moving from N=1 to N=2, all methods slow down considerably since (i) the unit cell area increases by a factor of 4 and (ii) the mirror symmetry breaks around x=0 and y=0 making the simulation volume effectively an additional factor of 4 larger in FEM and FDTD and the basis size in FMM a factor of approximately 4 larger. At N3, FDTD becomes the method of choice for run time, and at N5, FDTD becomes the method of choice also for RAM usage. It appears that if we would to increase N beyond 6, also FEM will eventually become faster and less RAM requiring than FMM.

Discussion of additional dependencies
Size of the simulation domain in the x-y plane In figure 5, we increased the lateral size by considering unit cells of area NP×NP. From supporting information table S1, we see that the m required for FMM scales as N. Thus, with increasing unit cell area, which is proportional to N 2 , the computational time scales as N 5.8 (see figure S10(a)) and the memory requirement scales as N 4 . For FDTD, as expected, the required Δx stays rather constant with increasing N (supporting information table S1). Thus, with increasing N, the number of grid points in the FDTD solution increases as N 2 , and the run time and RAM usage scale proportionally with the number of grid points, that is, proportional to N 2 (see supporting information table S1). For FEM, the NDOFs used scales as N 2 due to the increase in the volume of the simulation domain by a factor of N 2 (see supporting information table S1). With the iterative solvers, we expect that run time and RAM usage increase proportionally to NDOFs and hence N 2 (see supporting information figure S12). However, since we had to use the direct solver due to the periodic boundary conditions in figure 5, the FEM run time is slightly superlinear in NDOFs (see supporting information figure S12(a) and supporting information table S1). RAM usage for a quasi-random array with N 2 nanowires in a supercell to reach less than 1% relative error in j sc . The period of each supercell is N×P with P=400 nm. Here, for N>1, that is, for the supercells that break the mirror symmetries along x=0 and y=0, we use periodic boundary conditions in all three methods.
Size of the simulation domain in the z-direction If we wish to consider the effect of varying nanowire length, FMM tends to stand out in its own class as a method, especially for the square array in figure 3. For FEM and FDTD, the simulation effort scales linearly with the number of lengths studied: for each length, a separate, new simulation is needed, and for simplicity, we assume that the computational effort for each individual simulation stays constant with varying length. In contrast, with FMM, since only the length of the nanowires is changed, we can re-use much of the previous calculations for a given length when changing the length. In our implementation, for the nanowires in figure 3, once we have calculated the results for a given length, we could reuse partial calculation results to such a high degree that the cost in run time to calculate the response for 30 additional lengths, at m=5, was equal to the cost of one simulation starting from scratch.
In reality, with increasing length, a larger simulation effort is expected in FEM and FDTD since the physical size of the simulation domain increases. For a large L, the nanowire domain dominates the extent of the simulation domain in the z direction. Based on the scaling with NDOFs in FEM and grid points in FDTD, we expect then the simulation time to scale linearly with L. In contrast, in FMM, the simulation time stays constant with increasing L at given basis size, which is typically set by the size of the simulation domain in the x-y plane.

Random length for nanowires
In figure 5, we considered a random nanowire array with N 2 nanowires where each nanowire is of length L. If we consider a case where each of the N 2 nanowires is of a random length, we expect FEM and FDTD to perform at approximately the same speed as when all the nanowires were of the same length L. In contrast, in FMM, we need to divide the system into N 2 slices in the z-direction to be able to consider z-invariant regions, as required by the method. Thus, FMM simulations are expected to slow down by a factor of N 2 relative to the case of constant length for all nanowires. We expect a similar slowing down of FMM if we consider nanocones instead of nanowires in the quasi-random array (since the nanocones require the staircase approximation in the zdirection in FMM). In contrast, we expect for FEM and FDTD a similar computational performance for a quasirandom nanocone array as for the quasi-random nanowire array in figure 5.

TCO contact layer
In many nanostructured solar cells and photodetectors, the electrical top contact layer is realized with a transparent conductive oxide (TCO) [1][2][3]. If such a TCO layer is of thickness t tco and planarized to cover the region L<z<L+t tco above the nanostructure array (see figure 1(b) in [40] for a schematic), FMM, FEM, and FDTD are expected to handle the simulations well.
In FDTD, we can use three power monitors instead of the two for R and T. The additional monitor is placed at z=L to allow a separate study of the absorption in the TCO and the nanostructures. Since we expect that t TCO <L, the increase in the calculation time due to the increase in the volume of the simulation domain is a minor effect. In FMM, we can monitor the power flow at z=L at just a 25% increase in run time compared to a calculation of only R and T (see supporting information section 2). However, for a nanowire array that originally consists of a single z-invariant region, we would now include an additional layer to the system, which thus increases the run time approximately by 100%. In FEM, we can separately integrate the spatially resolved absorption in the TCO and the nanostructures, as done for absorption in the core and the shell in figure 4. Similarly as for FDTD, also for FEM, due to the minor increase in the volume of the simulation domain, we expect only a minor increase in the calculation time.
However, if we consider a conformal TCO layer that wraps around a nanowire in the radial direction, with a dome-shaped TCO top (see figure 1(b) in [1]), the differences between the methods become apparent. With FEM, we can include the dome shape directly and with appropriate volume integration study the absorption in the nanowire and the TCO separately. For FDTD, we expect the calculation of the spatially resolved absorption in the TCO and the nanowire to be the largest numerical hurdle, as for the core-shell structure in figure 4. For FMM, we expect the dome-like top shape to be the largest numerical hurdle since it requires the staircase approximation, as for the nanocone-shape in figure 3(b).

Photogeneration rate
The j sc in equation (1) is the upper limit for the short-circuit current. It is reached when each photogenerated charge carrier contributes to the short-circuit current. In an actual device, there is typically noticeable spatial dependence in the probability for photogenerated carriers to contribute to the short-circuit current [40,41]. Therefore, for device simulations, we are often interested in G(x,y,z), the spatially resolved photogeneration rate [40,41]. Then, by performing for example drift-diffusion modelling of electron-hole transport with G(x, y, z) as a generation term for charge carriers, we can obtain an estimate for the short-circuit current, including varying extraction and recombination losses [40,41].
The G(x, y, z) can be obtained from any of FMM, FEM, or FDTD from the spatially resolved absorption (see equation (7) in the supporting material of [1]). In this case, the discussion for the choice of method is similar as for the core-shell structure: FEM yields readily the spatial profiles since the E(r,λ) required for the calculation of G(x, y, z) is obtained as a solution at each modelled wavelength. As long as no slanting regions in the z-direction exist, which would require the staircase approximation, FMM could also be a suitable choice, similarly as it was for resolving the absorption in the core and the shell in figure 4. FDTD will on the other hand need a memory heavy three-dimensional monitor and appears to give slower convergence than FEM.
Absorption along the axial direction In figures 2-5, we considered the absorptance integrated over the full nanostructure region. However, in some cases, we are interested in how the absorption varies along the axial direction [23]. Such information can be obtained by integrating the above G(x,y,z) over the x-y plane. However, for FMM and FDTD, we can perform the analysis numerically much less heavily. In FMM, we can integrate the power flow over the x-y plane at a given z position. From the variation of this power flow in the axial direction, we can obtain the axially resolved absorption. For FDTD, we can place additional power monitors at varying position in the axial direction. In our testing with nanowires of L=2000 nm and a resolution of 10 nm for the absorption in the axial direction (see supporting information section 2 for technical details), the axial absorption profile can be calculated in FMM with approximately 45% increase in calculation time without noticeable increase in RAM usage, in FDTD with an increase in calculation time on the order of 50% and increase of 1000% in RAM usage, and in FEM with a 1500% to 7000% increase in calculation time without increase in RAM usage (in FEM we used actually a volume integration to speed up the convergence, see supporting information section 2).

Broadband versus narrowband response
Above, we focused on the case of absorption of sunlight. For the GaAs considered, the broad spectral range from 270 to 900 nm was thus of interest. However, in some applications, we are interested in absorption over a narrow spectral range, for example in a photodetector designed for a single communication wavelength. Then, some additional differences between FMM and FEM compared to FDTD show up. In FMM and FEM, we modelled one wavelength at a time. Therefore, compared to the broadband simulations in figures 3-5 with 64 wavelength points, we could perform the computation at 1/64 of that cost when optimizing the geometry for a single wavelength with FMM or FEM. In FDTD, we obtain the full spectral response in a single run. However, compared to the broadband response, by considering a single wavelength, we can use a constant refractive index for GaAs, which saves approximately a factor of four in calculation time (see supporting information section 1.3.1). Thus, when moving from simulating a broadband response to simulating a narrowband response, FDTD is expected to slow down by approximately an order of magnitude relative to FMM and FEM.

Comparison to previous studies of simulation methods
As mentioned in the introduction, the type of optics problem affects strongly the preferred choice for the simulation method. Below, to highlight this aspect, we compare our present results with the general trends in some of the previous studies.
Finite ensembles of metallic nanoparticles have attracted interest lately, and methods for simulating their optical properties have been compared [11][12][13][14][15][16]. There, FEM has been often proposed as the method of choice due to its ability to adaptively resolve the strong electric field enhancement in the vicinity of the nanoparticles or in the gap between two nanoparticles [11,15]. For metallic nanoparticles, FMM is typically not considered, even if a finite ensemble of nanoparticles could be modelled with the aperiodic FMM, which is an extension of FMM to finite-sized scattering systems [42]. Thus, for metallic nanoparticles, FMM, which was highlighted as the method of choice in the present study for many of the semiconductor nanostructure arrays, is typically not even considered. Indeed, also for the case of simulating a finite-sized photonic crystal slab with a line defect, the aperiodic FMM underperformed computationally compared to FEM and FDTD [19].
Regarding other simulation methods not covered in the present work: Surface integral or boundary element methods have been proposed as a method of choice for simulating nanoparticles [13]. To the best of our knowledge, such methods have not been studied to a large degree for simulating periodic arrays like those in the present study. The discrete dipole approximation is another popular method for simulating the optical response of nanoparticles [13,15,43]. The method worked well for simulating the optical response of single semiconductor nanowires [43]. However, when we explored the use of the discrete dipole approximation for simulating the semiconductor nanowire arrays in the present study, the computational performance appeared considerably lower compared to FMM, FEM, and FDTD (results not shown).
The geometry of the nanostructure arrays in the present study has some similarities with those considered in a review on the simulation of sub-wavelength anti-reflective structures for solar module applications [10]. However, in the present study we focus on absorption in the nanostructure array itself whereas the focus on that review is on minimizing reflections, with the active, absorbing layer possibly further down into the structure. Also in that review [10], FMM, FEM, and FDTD are highlighted for modelling periodic nanostructures, but no quantitative comparison of their computational performance is included.

Conclusions
We showed how FMM, FEM, and FDTD can give highly converged results for optics simulations in semiconductor structures based on nanowire and nanocone arrays (figure 2). In FMM, for the nanowires in figure 3(a), only the number of Fourier basis functions, as controlled by the single parameter m, determines the convergence. In contrast, in FEM and FDTD, various parameters in the configuration of the simulation deck and solver settings affect the convergence (see supporting information sections 1.2 and 1.3). Thus, in both FEM and FDTD, it takes more effort to try to optimize the numerical performance, and this optimization could be dependent on the problem at hand. At the same time, for the GaAs nanowires in figure 3(a), FMM appears as the numerically most efficiently performing method for simulating the absorption of sun light. However, when we consider the nanocone geometry with slanted sidewalls that requires the staircase approximation in FMM, the choice of the method becomes more complicated as we saw in figure 3(b). If a 10 −2 relative error is sufficient, FMM still appears as the method of choice. However, for below 10 −3 relative error, FEM becomes the method of choice. Similarly, for the core-shell structures in figure 4, FMM is the method of choice if a 10 −2 relative error for absorption in the shell or 10 −3 relative error in the core is sufficient, and for lower relative error FEM becomes the method of choice. For the overall absorption in both the core and the shell, FMM appears as the method of choice no matter which relative error is tolerated. FDTD shows its strength for the quasi-random arrays in figure 5. There, with increasing number of nanowires in the supercell, FDTD becomes more and more preferable for simulations. In our test system, for more than 16 nanowires in the supercell, FDTD required the least of computational time as well as RAM, as compared to FMM and FEM. Compared to the above results for the broadband sun light, if we move to consider narrowband response, we expect FDTD to slow down by a factor of approximately 10 relative to FMM and FEM.
Importantly, the results are obtained for our test workstation with four CPU cores. The FMM code runs on a single CPU core, and to utilize the available CPU cores, we run multiple FMM codes in parallel for different wavelengths or geometries under study. Thus for FMM, the RAM usage scales with the number of available CPU cores. In a system with much more CPU cores available, a benchmarking of parallel computation performance should be performed. For example, it might turn out that it is beneficial to allocate only a limited number of CPU cores to a given FEM or FDTD instance and run multiple such instances in parallel. For FEM, we can split the runs in either wavelength or geometry and in FDTD in geometry. Then, the RAM usage scales as the number of parallel FEM or FDTD simulations. We note that the FMM, FEM, and FDTD implementations tested here did not support calculations on graphical processing units (GPUs). A benchmarking of GPU accelerated optics simulations [44] of nanostructure arrays would be a useful future study.