Semi-automatic engineering and tailoring of high-efficiency Bragg-reflection waveguide samples for quantum photonic applications

Semiconductor alloys of aluminum gallium arsenide (AlGaAs) exhibit strong second-order optical nonlinearities. This makes them prime candidates for the integration of devices for classical nonlinear optical frequency conversion or photon-pair production, for example, through the parametric down-conversion (PDC) process. Within this material system, Bragg-reflection waveguides (BRW) are a promising platform, but the specifics of the fabrication process and the peculiar optical properties of the alloys require careful engineering. Previously, BRW samples have been mostly derived analytically from design equations using a fixed set of aluminum concentrations. This approach limits the variety and flexibility of the device design. Here, we present a comprehensive guide to the design and analysis of advanced BRW samples and show how to automatize these tasks. Then, nonlinear optimization techniques are employed to tailor the BRW epitaxial structure towards a specific design goal. As a demonstration of our approach, we search for the optimal effective nonlinearity and mode overlap which indicate an improved conversion efficiency or PDC pair production rate. However, the methodology itself is much more versatile as any parameter related to the optical properties of the waveguide, for example the phasematching wavelength or modal dispersion, may be incorporated as design goals. Further, we use the developed tools to gain a reliable insight in the fabrication tolerances and challenges of real-world sample imperfections. One such example is the common thickness gradient along the wafer, which strongly influences the photon-pair rate and spectral properties of the PDC process. Detailed models and a better understanding of the optical properties of a realistic BRW structure are not only useful for investigating current samples, but also provide important feedback for the design and fabrication of potential future turn-key devices.

In this work, we show how to tailor the BRW structure towards a specific design goal. We optimize a simplified epitaxial layout for high conversion efficiency at an operating wavelength of choice. The conversion efficiency is related to the optical nonlinearity, but waveguide specific effects, like the modal overlap, have to be taken into account. Previous sample designs have often relied on complicated epitaxial layouts to guarantee high conversion efficiencies. Instead, our proposed waveguides are at least as bright, but with a greatly reduced complexity in fabrication. We can also show that the new found structure possesses favorable properties in terms of state engineering. While optimizing the entanglement of the state is not the main focus of this paper, the concepts presented here also strongly affect this topic.
Furthermore, predictability and consistency between a theoretical design and the experimental implementation is also an issue. Though rarely acknowledged, experiments often rely on large-number statistics to achieve progress, even in the broader field of semiconductor quantum-optics. Random fluctuations in the samples may or may not help with reaching the set goal, but fabricating a large number of devices often turns out to be beneficial. This is also the case with BRWs. Consequently, in order to follow a more rational approach, we discuss the influence of additional fabrication issues that make a real sample deviate from the specifications. We show that some of these deviations can even be exploited to increase the yield. Regarding the individual waveguides, for example, the operating wavelengths and their acceptance bands are strongly affected by the details of the fabrication. By incorporating this knowledge already at the design stage, more waveguides are usable in the end.
The specifics of BRWs make them prime candidates for such an optimization and accompanying comprehensive analysis. Since the design relies on the classical second-order optical nonlinearity, the properties of the waveguide are grounded in classical electrodynamics. Each epitaxial layer is at least 100 nm thick and the materials and their dispersion are well understood. This means that their behavior is well defined and should correspond perfectly to the specification.
The design process of a good BRW epitaxial structure is usually based on simplification, heuristics, experience, but also a great deal of intuition. For certain configurations, design equations are readily available [22][23][24][25], parts of the possible solution space have been computed [26][27][28] and alternative design objectives were studied [29,30]. As an example, one may choose certain aluminum concentrations, apply the design equations and iterate on the layer thicknesses until a working sample is found. Local optimization is then achieved by choosing two variables, such as the thicknesses of the core and its aluminum concentration, and mapping the desired metric. While this approach has worked well in the past, adding constraints or optimizing directly for combinations of certain process parameters is either tedious or unfeasible. By moving to a largely automated sample design procedure, the time consuming manual trial-and-error process is shortened significantly.
The algorithm presented here is general and also designed in a way to search for potentially unconventional structures. It is based on simulated annealing [31] and an autonomous BRW analysis procedure, which can automatically find phasematching, dispersion parameters, overlaps of the interacting modes and effective nonlinearities of a given structure. A great deal of effort is required to make the search more stable, as the numerical optimization procedure may stray far from established, working designs. A heuristic objective function is introduced and the influence of the annealing parameters on the convergence behavior is discussed. Having a reliable routine enables a straightforward analysis of other important parameters, like fabrication tolerances. In that sense, we not only discuss our methodology, but are able to present several candidates for a simpler-yet-better BRW epitaxial structure.
We emphasize that our approach is not fundamentally restricted to certain tools or devices. For example, we employ Wolfram Mathematica [32] for its versatility with postprocessing the data with the help of a large existing code base. We have carried out simpler but similar studies with scriptable industry standard tools like COMSOL [33] or Lumerical's Mode Solutions [34]. In these cases, however, engineering the device is often reduced to parameter sweeps. Instead, we aim for searching a complicated, high-dimensional space derived from a multifactorial fitness function-which is an 'objective-first' approach [35]. It can also be adapted to other device types, especially ones relying on the interaction of the higher order spatial modes.
Similar numerical optimization techniques have been employed for different optical systems, such as dielectric filters [36], computer-designed holographic elements [37] or generic nano-photonic structures [35].
Recently, interest has grown in applying them also in the design of quantum optical photonic devices. For example, one can find nontrivial poling patterns that shape the spectrum [38,39] of the generated photon-pairs. On a broader scope, it is also worth mentioning the work of Dolgaleva et al [40], who report on their efforts on tailoring four-wave-mixing based frequency converting waveguides made from AlGaAs.
The paper is organized in the following way: first, we review the relevant theory in section 2 to introduce the required quantities and notation. Then, in section 3.1, the peculiarities of the autonomous analysis of a BRW epitaxial structure are discussed. Section 3.2 focuses on the fine-tuning and behavior of the numerical optimization procedure. Finally, in section 4 we present and analyze the found samples, while in section 5 we investigate the effect of the thickness gradient of a real-world sample on the spectral properties of the photonpairs.

Theoretical background
BRWs rely on modal phasematching to achieve nonlinear interactions. They are primarily designed to convert light between the near infrared around 775 nm (NIR) and the telecom C-band around 1550 nm. While the primary goal in this paper is to optimize the efficiency of second-harmonic generation (SHG), the results are also beneficial for the parametric down-conversion (PDC) processes. For this, we need to calculate the optical eigenmodes of the structure, discuss the physics of the nonlinear interaction and define the relevant quantities, like operating wavelengths and modal overlaps. The latter are used as metric for the optimization in the following section 3.2.
Initially, only two sets of equations are important: first, for optimizing the epitaxial structure, we only need to take the one-dimensional, vertically oriented distribution of the refractive index n(y) into account, as depicted in figure 1(a). This greatly simplifies the design procedure without sacrificing too much accuracy. After optimization, the second step involves verifying an etched, two-dimensional waveguide, shown in figure 1(b). This is achieved by solving for the full-vectorial eigenmodes of Maxwell's equations over the two-dimensional refractive index distribution ( ) n x y , . Here, x is the coordinate along the plane, but perpendicular to the waveguide direction, y the vertical coordinate and etch direction 6 . The wave is then guided along the ridge in the z direction.
In the one-dimensional case, the wave equations for the TE-like modes, with the spatial components of the electric fields E μ and magnetic fields H μ (m = x y z , , ), of a dielectric waveguide are given by [41] In equations (1)- (3), ω is the angular vacuum frequency of the light, b = n k eff 0 is the propagation constant (eigenvalue) with the vacuum wavenumber k 0 of the light. The parameter m 0 is the vacuum permeability and c the speed of light in vacuum. Writing β in terms of the effective mode index n eff is convenient, and will be used further on. Similarly, the equations for TM-like modes are given by [41] Figure 1. Illustration of a BRW slab structure (a) and a ridge waveguide with 4 μm width (b), with the terminology and coordinate system convention used in this paper. All dimensions and values are to scale with sample 2M-1 from table 1. 6 x is not to be confused with the aluminum concentration x i of the -Al Ga n y H y and 5 We assume a one-dimensional refractive index distribution n(y). However, in the case of the two-dimensional refractive index distribution ( ) n x y , , the full vectorial Maxwell's equations are solved via a finite element method (FEM) with the help of the Wave Optics package in COMSOL [33] or a mode solver implemented in Matlab [42].
Equations (1) through (6) have interesting consequences for the nature of nonlinear processes in AlGaAs slab waveguides. Due to the m 43 crystal symmetry, modes can couple in different ways and the ridge directions are therefore restricted to the 110 and 110 axes in order to achieve the highest possible nonlinear couplings. In order to calculate the coupling of the various field components, we have to take the orientation of the crystal coordinates with respect to the laboratory frame into account. In the crystal frame, the mode fields transform as where R is the appropriate 3×3 rotation matrix. The transformed fields are plugged into the standard equation for the second-order nonlinear coupling with d being the 18-element tensor of the nonlinear coefficient. In m 43 all d ij are zero, with the exceptions of = = d d d 14 25 36 . In our sample, the two frames are rotated 45 • around the y axis via 1 0 , modes can propagate in the near infrared around 800 nm (NIR) and telecom range in the C-band around 1550 nm. For example, if we look at the x-component of equation (10) we see that the horizontally polarized TE-mode in the NIR range is linked to both TE and TM modes at telecom wavelengths (type II nonlinear process). In contrast, the vertically polarized y-component is only converted to a TE mode (type I). Note that E z , which is aligned in propagation direction, is also involved. In strongly confined waveguides like ours, all modes can have a non-negligible E z . This forward field component results in type0 processes that have been observed in BRWs [43]. In order to analyze such a process, a full-vectorial 2D treatment is necessary. For the purpose of optimization, however, the onedimensional approach is usually descriptive enough.
BRW slab structures are engineered in a way, so that for three modes at wavelengths l p , l s and l i , the respective effective mode indices fulfill the relation l l l l l l . The indices p, s and i denote the pump, signal and idler modes. Furthermore, we require that the three-mode spatial overlap [30,[44][45][46] i j k as well as the effective nonlinearity d eff , are non-zero. They are derived from the two-dimensional distribution of the fields n ( ) E x y , , with n = ( ) i j k , , indicating three modes, and the local material nonlinearity ( ) , which takes the coupling between modes i j k , , into account. In this way, the E ν are just the scalar electric fields and may correspond to either TE or TM polarization. This works well, because the modes are strongly polarized so we can ignore the weak field component in z-direction.
Note that in the case of unnormalized field profiles E i j k , , , like they often result from numerical mode solvers, appropriate normalization constants have to be found. Here, are used [30].
In this work, we optimize the SHG [2][3][4][5][6]47] since the photon-pair production via PDC is the faint counterpart of this bright three-wave mixing. Effectively, this means maximizing the total nonlinearity defined as x d eff . This is motivated by the normalized second harmonic generation efficiency h norm , which is the efficiency per unit length. In waveguides [45,46], which takes the spatial mode overlap ξ into account. As the fundamental wavelength l i is fixed for a certain device design and the effective mode index n eff usually only varies in the second digit in practice, the product x d eff mainly determines the interaction efficiency. Note that the choice of parameter combination of effective nonlinearity and mode overlap is somewhat arbitrary [46]; for example, the three-mode spatial overlap is related to the commonly reported effective mode area via In the sense of optimization, we choose d eff and ξ as both are to be maximized.
As the PDC and SHG processes both rely on energy and momentum conservation in the investigated structure, they depend on the same underlying physical parameters such as the dispersion or mode properties of the sample. Yang et al [48] derive the proper equations and Hamiltonian for the case of waveguided PDC. The equations are given in terms of unnormalized field profiles, with the Hamiltonian where the constant  takes the constant prefactors and phasematching along the propagation direction into account. This Hamiltonian  can be used to perform an ab-inito calculation of the quantum-optical state and the pair production rate. Moreover, the normalization constants  n of a mode are given by which are similar to the individual terms of the spatial overlap normalization constant  x in equation (13). The main difference to equation (13) is that they take the local refractive index ( ) n x y , and the local ratio of phase and group velocity ( of the material into account. If the structure involves materials with large group indices, like Al x -Ga x 1 As with low aluminum content values, the normalization may differ slightly from equation (14). Nevertheless, the total nonlinearity serves as a direct indicator for the pair production rate.
However, we remark that the choice of metric is not completely unambiguous. For example, even in the well-studied case of SHG, West et al [24] or Khurgin et al [49] prefer using just the SHG width, defined as h =n d width eff 3 eff 2 . Since it effectively represents a slightly different weighting of the raw mode overlap, the results are nonetheless very similar.
3. Numerical discovery and optimization of a BRW slab structure 3.1. Autonomous analysis of a BRW slab structure In order to find suitable BRW slab designs via numerical optimization, a fitness function has to be defined. Depending on the desired application, the fitness function requires knowledge about certain parameters of the slab. This includes-but is not limited to-the type of the phasematching processes, the mode profiles at the phasematching wavelengths, their dispersion characteristics or the (nonlinear) overlap integrals in equations (11) and (12). Therefore, for any practical optimization, a candidate BRW slab needs to be analyzed automatically with respect to these properties. Here, the challenge lies with making such an analysis as stable and reliable as possible. In this section, we give a detailed description of our proposed algorithm, the corresponding flowchart is shown in figure 2.
At the core of the analysis routine is an eigenmode solver for equation (1) and/or (4). If these two components are known, the orthogonal components can be derived by scaling (equations (2) and (5)) or differentiation (equations (3) and (6)). There are numerous approaches to determine the eigenmodes: either by transfer matrix analysis [23,50], finite-differences or FEM, the approach chosen in this work [32]. During analysis, the set of eigenmodes has to be re-calculated several times, so strict convergence checks are omitted at first. Sufficient numerical reliability is ensured by having a maximum finite-element cell size so that at least five cells make up the thinnest layers. This is usually enough to determine the effective mode index b = n k eff 0 , to at least four significant digits. At this level of accuracy, the uncertainty of the refractive index model is already dominant [47,51].
Depending on the choice of mode solver, physically irrelevant modes arising from computational artifacts are excluded at this stage. This reduces the computational load in the subsequent steps of the algorithm. In our case with FEM, radiative modes with significant non-zero imaginary propagation constants appear as solutions. The imaginary part of the propagation constant is linked to loss and we are expecting ours to be completely lossless. The threshold for maximum b ( ) Im is set to 10 −10 . This value is chosen by manually classifying the modes of several, typical simulation layouts: in our setup, the propagation constant of 'lossy' modes typically have an imaginary part in the range of 1 to -10 6 while for 'confined' modes, it is usually smaller than -10 12 . A small but finite value is chosen to account for floating point inaccuracies. These thresholds are justified for our simulation because of two reasons: the truncation of the computational grid and the real-valued refractive indices. First, the simple truncation of the computational grid means that the simulation region is effectively bounded by a metal box. Therefore, modes that would be radiative still are perfectly confined. By choosing a large enough simulation region the metal box will have a negligible effect on the confined waveguide modes 7 . Second, since all refractive indices in the model are real valued, no intrinsic loss is modeled. We know from extensive simulations and experiments [47] that the sample loss is mostly determined by sidewall roughness. For now, including material loss or even optimizing for it is unnecessary. Finally, the leakage through the DBRs into the substrate is mitigated by proper mirror design with at least 5 pairs [26], enabling us to discard many undesired modes.
Aside from the imaginary part, we can additionally exclude modes because of material-imposed constraints on the real part of β. Due to oxidation considerations for high aluminum concentrations, the relevant material refractive index is restricted to a lower bound of approximately 2.9. At low aluminum concentrations, the vicinity of the bandgap yields an upper bound of »3.6. Hence, the modes can already be pre-selected by their effective index. We choose a reasonable lower bound of > n 2.8 eff , to account for the confinement of the modes. With these pre-selections, only a few tens of modes remain for further analysis. In order to search for phasematching, two sets of modes have to be calculated: one for NIR wavelengths around 775 nm,  l { }, and one for telecom at 1550 nm,  l { } 2 . Each entry in the sets is a 3-tuple which stores the effective mode index, the polarization (TE or TM) as well as a normalized field profile e x (y) for TE or h x (y) for TM modes. Since the wavelengths are fixed for each set of eigenmodes, it is highly unlikely that a perfectly phasematched pair is found at first try; subsequent analysis steps have to be 'fuzzy' in this respect.
For computational efficiency, the telecom mode set is ordered by descending effective mode index first. Then, each telecom mode is consecutively compared to potential candidates from the NIR set  l { }. The intuition behind this approach is twofold: first, significant nonlinear interaction is usually only observed for fundamental modes [53]. Second, all materials involved exhibit normal dispersion in this wavelength rage. Consequently, the most promising candidates are quickly found if a high n eff mode in the telecom range can be matched to a NIR mode. The effective indices of the two modes are allowed to deviate up to 0.05, so only close matches are selected. Then, for each matched pair of modes, their overlap according to is calculated. Pairs are rejected if < o 1%. The threshold of 1% is sufficiently high to reject numerical artifacts from orthogonal modes, but already allows identification of potential phasematching. Since the semiconductor material is strongly dispersive, the mode shape-and so the overlap-may change considerably if the wavelength is changed. During optimization, an intermediate, perturbed design may yield phasematching wavelengths far from the desired value and thus far from the reference wavelengths l = 775 nm where the modes have been calculated. By choosing this low threshold, the algorithm is able to handle these conditions. Depending on the structure, multiple suitable NIR modes may be found for each telecom mode. The mode pairs are ranked according to their overlap value.
Next, the candidate modes are tracked across several wavelengths to find l ( ) n eff . This effectively means calculating the set of eigenmodes again for several distinct wavelengths. Since BRWs operate in a regime where the modal dispersion is mainly influenced by the materials, the first derivative l ¢ ( ) n eff can be heuristically estimated by averaging the slope of the local refractive index l ¢( ) n weighted with the squared normalized field profile ( ) e y In this way, the evolution of the effective mode index can be predicted quite accurately over a large range of wavelengths, even if the dispersion is steep. Note that for highly confined systems the estimated dispersion may differ significantly. Similar to the mode selection, the modes that are closest to the estimated effective mode index are compared to the reference mode by means of the overlap integral (18). Because of the expected slow variation of the mode shape with the wavelength, a threshold of 90% is chosen for equality. The modes are tracked incrementally, starting from the reference wavelengths l p ref, and l si ref, . While this approach seems quite involved, it is very stable for a wide range of geometries and tracking modes over large wavelength ranges (>100 nm (telecom) or >50 nm (NIR)). We require this large range not only because it covers the established telecom bands, but also to allow for a significant, temporary deviation of candidate samples during optimization. Usually, the eigenmodes have to be re-calculated only ten times to reliably track a mode. After tracking, the determined effective mode indices are interpolated via a second-order polynomial. By intersecting the two polynomials, the phasematching wavelength l pm is found. While the polynomial approximation is valid even slightly outside the range of wavelengths where the eigenmodes have been calculated, we restrict them to the explicitly specified range. In that sense, phasematching wavelengths outside the band of 1500-1600 nm are disregarded, as they lie far outside the desired telecom band. In the optimization, this essentially imposes a heavy penalty for the algorithm to not stray too far from the wavelengths of interest.
In the one-dimensional case, we only need to consider type0 interaction with TE-like modes. Strictly speaking, we have to consider that TE and TM modes are distinct due to the spatial derivative of the refractive index in equation (4). In practice, however, the normalized mode profiles of TE, e x (y), and TM, h x (y), modes of a given order are extremely similar with overlaps > o 99%, if the proper symmetry for the nonlinear couplings is factored in. Moreover, despite the different equations of TE and TM modes, the effective indices the two modes are equal to within 10 −3 . Taking into account these small deviations is well outside the scope of the optimization procedure. We note that in two dimensions the effect of the lateral confinement causes an effective mode index splitting on the order of 10 −2 for the two polarizations.
Finally, the eigenmodes at the phasematched wavelength are computed. Then, the dispersion profile, mode shapes and phasematching wavelength are known. We also calculate the spatial overlap(11) and effective nonlinearity (12), as they are the required parameters for optimizing the conversion efficiency.

Random optimization and simulated annealing
Even though the BRW epitaxial structure consists of 20 or more distinct layers, symmetry considerations, such as the alternating mirror pairs of the DBR or a symmetric core layout reduce the dimensionality of the optimization problem. The epitaxial structure is parameterized in terms of layer thicknesses t i and aluminum concentration x i . The material dispersion for each layer is then calculated via the model proposed by Gehrsitz et al [51]. The second-order nonlinearity depends on the local aluminum concentration and is taken from Ohashi et al [54], and is expressed relative to GaAs. Other parameters, such as the number of mirror pairs, are usually fixed. Additionally, there may be strict constraints on certain parameters. For example, the minimum aluminum concentration must be above 20% to avoid direct absorption because.
Due to complicated and convoluted nature of the problem, calculating explicit gradients for optimization purposes is next to impossible. Therefore, we use simulated annealing [31] as a black box optimizing technique that requires no gradients. In simulated annealing the so called neighbor function derives a new sample s new from the currently best design s best . Samples are defined by an associative array s , which store a parametrization of the structure. The parameterizations mainly consists of the layer thicknesses and aluminum concentrations but constraints are also included. The quality of each sample is judged via the energy function ( ) E s , which is to be minimized. The simulation progresses with the annealing temperature T. After each step, T is gradually reduced via geometric progression given by a = + The maximum achievable probability in the first case of equation (20) is limited to 1/2. In this way, moving away from an already good solution at the end of the optimization run is penalized. A random number generator decides with the probability p whether the new solution is accepted or rejected. Figure 4 depicts a detailed flowchart of our implementation of the simulated annealing algorithm. In our case, the annealing schedule and neighbor functions are chosen so that there is an extensive initial phase that resembles a random search. This choice is based on the observation that even a small random change in the sample layout can instantly lead to a very different-better or worse-solution. This essentially increases the coverage of the search space. To improve convergence, the algorithm is reset to the best solution of this run after 30 iterations without an accepted change in the energy function. If there is no change for 100 iterations, the algorithm terminates even if the current temperature is still above the final temperature of = -T 10 The neighbor function is defined in a way that it acts mostly local, but occasionally allows larger jumps. First, a random integer m between 1 and ( ) min 5, dimensionality is picked. Then, m random parameters in the set s best are selected. The parameter values are multiplied by a random factor sampled from a normal distribution (s = 0.02, m = 1). At this point constraints are taken into account. In our case, the aluminum concentrations <20% and >85% are simply clamped to the respective limit. Also, the thicknesses of the DBR mirror pairs t i are adjusted automatically to a design wavelength l = 773 nm 1 i As at the design wavelength and design temperature. Strictly speaking, if we change the mirror layer thicknesses, the effective mode index n eff will also change slightly. This converges with a straightforward fixed-point iteration, but we have to recalculate the set of eigenmodes multiple times. According to our experience, this is unnecessary, since the optimization works well if we simply set n eff to 3.15 in equation (21) to avoid the costly iterations. Because of the BRW geometry and the requirement of reasonable aluminum concentrations, the effective mode index is mostly limited to values between = n 3.13 eff and 3.18. The DBR mirrors will still guide the mode, as with enough mirror pairs, the stop-band is wide enough to tolerate a slight mismatch.
We define the energy function heuristically as where l pm is the calculated phasematching wavelength. Choosing the total nonlinearity x | | d eff in the energy function is motivated by equation (15). The absolute is taken because the FEM solver might flip the sign in the mode profile, which is carried over by the integrals in equations (12) and (11). For previous samples, the total nonlinearity x | | d eff is typically between 0.01 and 0.05, so the energy according to equation (22) is larger than 1. Improving the total nonlinearity slightly to 0.07 reduces the energy to below <0.3. The energy gradient with respect to the total nonlinearity is steep with 0.5 per 0.01. Furthermore, the function is specified in a way to allow some variation in the target phasematching wavelength without additional penalty. In equation (22), this window is approximately 4 nm wide. Outside this range, the quadratic term imposes a steep penalty of 1 per 7 nm of deviation. In that sense, optimizing the nonlinearity is prioritized if the phasematching wavelength is close to the target wavelength. The energy function could be even stricter, but it has proven practical to run the optimization several times with different annealing schedules and manually select the most suitable design. This is because one may encounter a good design early in the random search phase, as well as to avoid getting stuck in one of the countless local optima. The latter cannot be reasonably avoided because the optimization is still probabilistic.
Typical examples of the convergence behavior for different annealing schedules are shown in figure 3. The algorithm usually proceeds at a rate of about 150 iterations per hour on 16 cores of a Xeon E5-2660 workstation, taking around 20 GB of RAM. Note that only the routines within one iteration are parallelized, so one sample will be optimized per run.
If a suitable structure is found, 2D simulations in COMSOL [33] and Matlab [42] are employed for verification. In this way, the effect of the ridge widths and etch depths can be studied. The additional lateral confinement reduces the effective mode index depending on the mode shape and wavelength. The shift is larger for higher-order modes, so the real phasematching wavelength will be slightly higher than expected from slab simulations. As a rule of thumb, a 5 μm wide ridge will increase the phasematching wavelength by approximately 5 nm at telecom wavelengths if the waveguide is etched to the core layer. This effect is enhanced for the narrower waveguides. For example, waveguides thinner than 3 μm will shift by more than 10 nm. This is why a phasematching wavelength between 1544 and 1548 nm is targeted in the cost function (22). Due to the horizontal confinement it will shift to just above 1550 nm in the real sample.
Additionally, the etch depth not only affects the linear loss coefficient but is also mode-selective. If the ridge is etched slightly above the core layer, one may find a regime where the TM-mode is guided but the TE-mode is not. Therefore, a trade-off has to be found: etching deeper increases the loss due to the additional exposure of the mode fields to the sidewall roughness, but also assures that both polarizations will be guided efficiently. Furthermore, higher-order horizontal modes are also sensitive to the etch depth. Even if the waveguide is just etched above the core layer, additional modes may be found. They provide additional mode combinations for nonlinear interactions, however, their phasematching wavelengths are usually several nanometers away [4].

Discussion of potential samples
To demonstrate our approach, we present several designs that optimize the total nonlinearity x | | d eff of a matching layer sample [5,55] with the constraint of a reduced number of differing, discrete aluminum concentrations. We formulate the constraint in a way that in the molecular beam epitaxy (MBE) process, only two distinct effusion cells are required to grow this sample. A third aluminum concentration can then be derived by summing the two concentrations. A layout with matching layers is chosen because it provides additional degrees of freedom to fine-tune the sample and eases experimental characterization due to the larger core dimensions. The expanded fields in the core result in a more manageable numerical aperture in vertical direction.
In contrast to previous samples, we search for designs with two matching layers. We choose the inner layer of the first mirror pair with aluminum fraction x 2 as the second matching layer. The optimization is allowed to modify only the thickness of these layers, so that the final design is still similar to existing structures. The material composition constraints are formulated as where x c , x m and x i are the concentrations of the core, matching and mirror layers, respectively. Alternatively, these constraints allow another design where the outer matching layer is made of the same material as the core. The reduction in aluminum concentration increases the effective nonlinearity further, but the effect is expected to be small as the fields are still localized close to the center. An in-depth discussion of such a design, however, is beyond the scope of this work. It is imperative that the starting conditions are based on a working sample. Surprisingly, it makes little difference for convergence whether a very different sample, like the old design from [11], or an intermediate result from a previous optimization run is chosen. For the designs presented here, mostly the latter was chosen.
Three candidates in table 1 were accepted for further investigation, while only sample 2M-1 was fabricated. There are two main reasons for selecting this design: First, 2D simulations set the phasematching wavelength to ≈1553 nm. Due to the thickness gradient of the wafer discussed in section 5, the phasematching wavelengths shift to lower values the farther the waveguide is from the center. Since the convenient experimental wavelength range due to equipment restrictions is 1525-1560 nm (e.g. for PDC fluorescence [17]), a large portion of the wafer is usable. Thus, a design wavelength of slightly above 1550 nm is desired (see section 3.2). Second, the inner matching layer and core thicknesses sit right between sample 2M-2 and sample 2M-3, with very similar aluminum concentrations. This is a good indication that any deviation towards another sample will yield a sample with a usable phasematching wavelength and comparable total nonlinearity. In section 4.2, we present a more detailed study of the fabrication tolerances for this design.
In addition, 2D simulations predict that the group delay difference of the telecom TE and TM modes of sample 2M-1 is significantly smaller than for previous samples. We expect a value of about 6 fs mm −1 compared to 19 fs mm −1 of the old sample from [11]. The experimental value may even be smaller, as measurements show that the real delay of the old sample is closer to 11 fs mm −1 [17,18]. The difference results from inaccuracies in the refractive index models, which strongly affect the predictions of the group indices [17,47]. Having a group delay difference as small as possible is important for the state engineering of polarization entangled photons in schemes based on routing non-degenerate photon via dichroic mirrors [8,18] and wavelength-division multiplexers [12,16].
In line with previous work (e.g. [24]), the optimization landscape of the core and inner matching thickness of sample 2M-1 is plotted in figure 5. The solution space has been sampled8 10 4 times, which required a computation time of approximately 60 h on the workstation used. In principle, such plots can be used to finetune the design of certain parameters, but it is a very time-consuming task if more dimensions are involved. Table 1. Overview of some adiabatic slab designs with two matching layers and a reduced number of distinct aluminum content values. Tuples consist of layer thickness t and aluminum content x of the Al x -Ga x 1 As material. The layout of sample 2M-1 is also shown in figure 1. Furthermore, table 2 shows candidates with a three-matching-layer structure. The total nonlinearity is slightly, but significantly, increased, with stable phasematching wavelengths at 1545 nm. Sample 3-3 is particularly interesting: it converged to two matching layers, but with a better total nonlinearity than the comparable sample 2M-2 from table 1. It is convenient, if a shorter phasematching wavelength is desired. We conclude that three matching layers do not show a significant advantage in terms of total nonlinearity. It can be inferred, however, that a total nonlinearity of about´-7.2 10 2 presents a practically achievable limit for our design goals. Such samples might be of interest, if certain other optimization goals, for example a further reduction in group index difference of the telecom TE and TM modes, have to be met. In this case, it would also be advantageous to evaluate other aluminum concentrations for the matching layers.
Finally, we compare our results with samples from the literature, the corresponding values are given in table 3. Further, we discuss samples derived during the fabrication tolerance analysis next in section 4.2 and present them in table 4.

Fabrication tolerance and further samples
In any realistic fabrication process, there is discrepancy between the design specification and the produced sample. In our case, with MBE process, the sample is very uniform across the wafer. Usually, there is only a small thickness gradient radially from the center, which affects all layers proportionally and is discussed in section 5. However, the initial set points for the thickness and aluminum concentration of an individual layer may be Figure 5. Optimization landscape of sample 2M-1 with respect to the core and inner matching layer thicknesses. The shading indicates the total nonlinearity, while the red contours shows the corresponding phasematching wavelength in nanometers. Furthermore, the approximate minimum-energy region according to equation (22) is bounded by the dashed black lines, with the absolute minimum indicated by the solid black line. Due to the very similar parameters of the samples in table 1, all samples can be located in the plot. The finite numerical resolution of the analysis routine as well as the limited granularity of the sampling cause artifacts in the contours. The fine-scale structure in the total nonlinearity is a result of effective nonlinearity integral from equation (12), the overall trend is dominated by equation (11). Table 2. Overview of some adiabatic slab designs with three matching layers and a reduced number of distinct aluminum content values. Tuples consist of layer thickness t and aluminum content x of the Al x -Ga x 1 As material.
slightly different from the specification. By performing a Monte-Carlo analysis of randomly varied design parameters, we can estimate the effect of the fabrication tolerances on, for example, the phasematching wavelength and total nonlinearity. The autonomous analysis of a BRW structure presented in section 3.1 is well suited to perform such a task, as a large number of samples can be analyzed without manual intervention. Here, the errors of the layer thickness and aluminum concentrations are modeled with a uniform distribution ( , . In contrast to the simulated annealing optimization, the sample layout is not necessarily symmetric around the core but each layer thickness can change individually. The error of the aluminum concentration is assumed to be an error of the set-point of the individual effusion cells. Therefore, all layers of the same type share the same perturbed aluminum concentration.
In figure 6, the probability distributions of the phasematching wavelengths and nonlinearities for different error magnitudes are plotted. The thickness error is given in terms of a scaling error , while the aluminum concentration error is given in percentage points, i.e.
. With ≈10 μm above the substrate, the BRW epitaxial structure is rather thick for the MBE growth process. Reasonable errors are on the order of 5% for the thickness, and maximum 2 percentage points (p.p.) for the aluminum concentration.
As far as the average phasematching wavelengths of the probability distributions are concerned, figures 6(a) and (c) show that they are very stable. However, the full-width-at-half-maximum (FWHM) of the distribution increases with 3.1 nm in wavelength for each percentage point of thickness error and 2.1 nm/p.p. of concentration error. The observed broadening is mainly an effect of the thickness error. Since the magnitude of the coefficient is similar, the larger experimentally expected error values cause it to dominate in practice. The total nonlinearity in figures 6(b) and (d) behaves slightly differently, as the distribution is skewed towards low nonlinearities. Therefore, the mean is also shifted to lower values. Here, the error of the aluminum concentration dominates with a coefficient of´-22 10 2 /p.p. compared to´-7 10 2 /p.p. for the thickness.
This study shows that reducing the thickness error to less than 5% results in a disproportionate increase in fabrication accuracy. The constraints on the aluminum concentrations help to achieve this goal, as the complexity of the growth process is greatly reduced. Since the Monte Carlo analysis investigates random samples similar to the given layout, it may be utilized to optimize the design further. Note that the histograms of the nonlinearity in figure 6 do not have a sharp cut-off at values similar to tables 1 or 2, but show values up to »´-9 10 2 . However, most of these samples are unusable in practice due to either a wrong phasematching wavelength, unsuitable/unmatched Bragg mirrors or unpractical material compositions.
Nevertheless, one can still try to derive useful samples from the dataset. While the wrong phasematching wavelength is non-trivial to correct, we can use equation (21) to find the proper layer thicknesses of the mirror pairs. It turns out that almost all samples will either shift the phasematching wavelength to an undesired value or the total nonlinearity is reduced to numbers similar to tables 1 or 2. Moreover, for phasematching wavelengths below 1550 nm and a total nonlinearity >´-5 10 2 , the total nonlinearity is slightly anticorrelated ( =r 0.12) with the wavelength.
High nonlinearity samples, which have a proper mirror design and phasematching wavelength, are exclusively ones with aluminum concentrations lower than 20%. Samples with nonlinearities up to´-8.5 10 2 have been found and are presented in table 4. They show an slightly asymmetric layout which is most likely an artifact of the random search. However, as the etched ridges are also asymmetric with respect to the core, they warrant further investigation. If a phasematching wavelength of 1550 nm is targeted, aluminum concentrations of ≈18% leave little margin of error in fabrication. In this case, the room temperature bandgap [56] already shifts from 717 nm (1.719 eV) at 20% to 729 nm (1.693 eV). At a reasonable lower limit of 16%, the bandgap is already at 746 nm (1.653 eV). With a design wavelength of 775 nm (1.592 eV), there is a 120 meV separation from the bandgap. Below 100 meV, impurities in the semiconductor may also be excited [57]. Therefore, a reduction of the aluminum concentration is only a viable option if samples with phasematching wavelengths greater than 1560 nm are desired. around 775 nm or 1550 nm and a nominal scaling factor S=1. The resulting coefficients given in table 5.
As a first indication, figure 7(a) already shows that due to the pronounced impact of the scaling factor on the effective mode index in the NIR, the phasematching wavelength can be strongly influenced even by small changes in layer scaling. In our case, the phasematching wavelength is directly proportional to the scaling factor with a gradient of approximately 5 nm per percentage point. This is well applicable if the waveguide is oriented perpendicularly and symmetrically to the gradient. The total nonlinearity slightly increases by ≈1×10 −3 per Table 5. Dispersion coefficients of sample 2M-1 for equation (27), derived with the methods of the autonomous analysis in section 3.  Figure 8. Slices through the phasematching function at degenerate signal and idler frequencies. Inset 8(a) is a top view of the wafer that shows the location and orientation of the waveguides. For j =  90 , the waveguides coincide with the 110 axis. The dashed lines illustrate the contours of a constant scaling factor, the axis crossing corresponds to the center of the wafer, whereas the numbers mark the location on the wafer. We set q =  0 , so j is the polar angle of the midpoint. In (b) we plot two slices in each subfigure one with an r 0 close to the center (1 mm, blue), and one farther out (10 mm, orange). The dark dashed line shows the equivalent constant-Dk sinc-like phasematching function as comparison. The sinc is calculated for the scale at the midpoint of the waveguide. decreasing percentage point of the scaling factor. The increase is consistent with the anticorrelation of the phasematching wavelength with the total nonlinearity mentioned in section 4.2.
In general, however, the integral in equation (24) has to be calculated. We note that this is essentially a line integral across the scalar scaling field ( ) S r , with the waveguide parametrization r over which has to be solved numerically.
In figure 8 the scale-corrected phasematching functions calculated according to equation (29) are compared to the equivalent constant-Dk sinc-function for several waveguide positions, orientations and lengths. The previously mentioned shift in phasematching wavelength is clearly visible. The phasematching function quickly widens if the waveguide orientation is non-perpendicular to the direction of the thickness gradient. With respect to the sinc at the midpoint scale, the widening is almost exclusively towards longer wavelengths. An exception are the waveguides which are perpendicular to the gradient: here, the sinc-like side-lobes are suppressed on the long and amplified on the short wavelength side. This is because the local thickness scaling along the waveguide is symmetric with respect to the midpoint.
While the main goal of this investigation is to better understand the predicted and experimental nonlinear process parameters, the results can be utilized to increase the yield of future samples. For example, if it is desired to integrate a laser or passive components with a limited wavelength acceptance band, the thickness gradient can be exploited to design a fitting waveguide. A widening of the phasematching function, however, comes at the expense of conversion efficiency. From running this analysis on a variety of other BRW samples we conclude that he results are quantitatively applicable to other BRW structures.

Conclusion
We showed how to improve an existing multilayer BRW sample to yield an increased total nonlinearity with the help of numerical optimization techniques. The approach is general and allows a quick remodeling of a BRW structure towards different specific design goals, such as a certain phasematching wavelength and higher conversion efficiency. It further enables the autonomous exploration of a wide variety of different BRW layouts. Especially, we have demonstrated all the necessary numerical, algorithmic and physical concepts to simplify epitaxial structure.
Despite the simplification, we can present a design that has a 15%-25% higher total nonlinearity than in previous work. Additionally, we have carried out a rigorous, statistical analysis of the fabrication tolerances encountered in advanced BRW samples. We found that controlling the layer thicknesses is imperative, if tight tolerances on the operating wavelengths are required. A layer thickness error of 5% results in a widening in the distribution of the phasematching wavelengths of 20 nm FWHM. This error already makes it difficult to fabricate samples with reproducible optical properties. In contrast, even a comparatively large aluminum concentration error of 2% turns out to be almost negligible. However, the simplified structure paves the way to reducing all fabrication errors further through a targeted optimization of the growth process.
Furthermore, understanding the impact of wafer layout makes waveguides far from the wafer center more predictable. Thus, using smart, application-driven patterning, a larger area of the wafer is useful in practice. New samples based on the design methods presented here have been fabricated. If the higher nonlinearity can be verified in the experiment, one may optimize for pair production directly.