Fabrication tolerant chalcogenide mid-infrared multimode interference coupler design with applications for Bracewell nulling interferometry

: Understanding exoplanet formation and finding potentially habitable exoplanets is vital to an enhanced understanding of the universe. The use of nulling interferometry to strongly attenuate the central star’s light provides the opportunity to see objects closer to the star than ever before. Given that exoplanets are usually warm, the 4 µm Mid-Infrared region is advantageous for such observations. The key performance parameters for a nulling interferometer are the extinction ratio it can attain and how well that is maintained across the operational bandwidth. Both parameters depend on the design and fabrication accuracy of the subcomponents and their wavelength dependence. Via detailed simulation it is shown in this paper that a planar chalcogenide photonic chip, consisting of three highly fabrication tolerant multimode interference couplers, can exceed an extinction ratio of 60 dB in double nulling operation and up to 40 dB for a single nulling operation across a wavelength window of 3.9 to 4.2 µm. This provides a beam combiner with sufficient performance, in theory, to image exoplanets.


Introduction
The search for planets in star systems other than our own (exoplanets) is ongoing in the field of astrophysics. Ultimately, the atmospheric characterization of these exoplanets, especially ones considered Earth-like, is vital to understanding how prolific habitable planets are in the universe and ultimately if life, as we know it, exists outside of our solar system. Interferometry is key to future spectral analysis of exoplanets.
Amongst the available methods for direct imaging, interferometry has come to the forefront of exoplanet hunting after the infrared imaging of a planet in its early stages of formation was reported [1]. Interferometry, as an astronomical application, was first used by Michelson to image the moons of Jupiter [2] and other, larger, stellar phenomena [3,4]. Interferometry has the advantage of improved accuracy when measuring the Fourier components of barely resolved objects.
Space based instruments will be the ultimate incarnation for astronomical interferometry because of the distortions imposed by the Earth's atmosphere, however, ground based interferometers exploiting advanced adaptive optics (AO) systems [5,6] have significantly improved terrestrial systems to the point that useful science is now possible even with large apertures.
Exoplanetary light is many orders of magnitude weaker than that of the host star, as it is either a weak reflection of the host star (attenuated by distance and albedo) or comprises low (compared to the host star) temperature thermal emission for young exoplanets. To image planets in such systems clearly requires a means of blocking the host star's light to render the exoplanetary light detectable. As an example, for a planet in the HR 8799 system recently studied with direct imaging, the contrast ranges between 37 and 44 dB [7,8]. To cite another example providing contrast requirements, consider extrasolar giant planets (EGPs) that derive their luminosity directly from reprocessed starlight. Such planets have contrasts at 4 μm of ~100 dB at the orbit of our Jupiter (5 astronomical units, AU), decreasing to ~40 dB for orbits ~0.2 AU [9]. Such exoplanets have already been identified: for example HD 3651b, at a distance of only 11 pc, was discovered using radial velocity methods [10] and has a semimajor axis of 0.284 AU.
Common exoplanets that reside at a separation of ~1 AU from their hosts are too close to their host star to attain sufficient contrast with commonly used coronagraphic imaging methods where the central star is blocked behind a mask. Either nulling interferometry (first proposed by Bracewell in 1978 [11]) or advanced coronagraphic techniques such as complex phase mask based coronagraphs with pupil apodization to eliminate the telescope diffraction pattern are required (e.g [12].). Interferometric nullers have several advantages over even advanced coronagraphs; leaving aside the advantages in fabrication and alignment, nullers offer throughput and contrast that are significantly improved at small inner working angles (θ ~λ/D or smaller), meaning imaging can be performed much closer to the host star.
The advantages of using a nuller come at a price, and as with all broadband interferometers, path matching constraints, environmental stability, and alignment are all serious issues in bulk optics based interferometer implementations. It is clear that a photonic chip based interferometer has much to offer in ameliorating these difficulties. This has been clearly demonstrated in the Near-Infrared J and H band (1.33 µm and 1.55 µm) [13] with the first ever integrated optic device to combine stellar signals achieved in 2001 [14] in the H band. However, achieving the same architecture in the Mid-Infrared (MIR) is rather more challenging.
A recent review into the astronomical context of infrared interferometry [15] highlights the need for nulling interferometers. Designs of various infrared nulling architecture have been made in Lithium Niobate [16,17] and Silicon-Germanium [18] but, to our knowledge, no experimental nulling depths, theoretical projections of expected performance, nor extensive fabrication tolerance analyses incorporating real world inaccuracies have been published for interferometry devices in these materials.
In this paper a critical part of such an interferometer (the beam combiner) is first design optimized using simulations with the beam propagation method (BPM), and the performance across the operational bandwidth is investigated over the expected range of real world fabrication tolerances. Detailed simulations of 3 dB splitters in the form of multimode interference couplers (MMIs) are presented to show that a fabrication tolerant, high performance, wide bandwidth, planar waveguide design for a basic MIR nulling interferometer is possible. Such a device could then be utilized to find exoplanets either in a long baseline interferometer such as the Planet Formation Imager [19,20] or as the MIR beam combiner for an instrument such as Dragonfly [21] on a large ground based telescope.

Photonics background
The MIR is particularly favourable for exoplanet research as the contrast ratio between star and exoplanet light is maximized at two windows [7], specifically in the L (3 to 4 μm) and N (7.5 to 14 μm) bands [22]. The N band is most interesting for mature exoplanets at Earth-like temperatures, especially at 9.7 μm where the O 3 absorption is visible [23]. In the N band, however, the detector performance is typically two orders of magnitude worse compared to the L band [24], thermal emissions from room temperature optics (e.q. the telescope primary) and other environmental emissions decrease signal-to-noise ratios immensely, and diffractionlimited angular resolution is poorer. The L band is especially advantageous for the study of exoplanetary formation as the relatively high temperatures (~700K) of such bodies moves the black body emission peak to the L band region. As exoplanet formation is a hotly debated process, interferometric nullers in the 4 μm wavelength band are of current interest.
To create a monolithic chip at 4 μm, traditional silicate based glasses normally used in planar integrated optics are of no utility as they are no longer transparent. A number of potentially viable replacements exist for planar integrated optical circuits at this wavelength, namely Silicon on Sapphire [25], Silicon Germanium [26], Indium Phosphide [27], Aluminium Gallium Arsenide [28], Lithium Niobate [29], and Chalcogenide Glass (ChG) [30].
In addition to the obvious requirements for compact low loss circuitry there is a further requirement related to interfacing to the pupil plane sampling in the telescope. This is most easily achieved with microlens arrays having customized spatial distributions, that typically have a Numerical Aperture (NA) ~0.2 and below. Efficient coupling with this sort of NA requires large waveguides with low index contrast for them to be single mode. On the interferometer chip, however, reasonably large NA single mode waveguides are required in order to implement the <0.5 mm bend radii needed to make the circuits compact. Thus there is a fundamental mismatch in core sizes and index differences between the sampling and processing that has to be accommodated through a mode transformer of some type. The transformer also has to be intrinsically very low loss as telescope photons are scarce and scattered light from mismatched modes cannot be tolerated in the interferometer as it may produce spurious interference.
Chalcogenide glasses (ChGs) offer a more flexible alternative in finding a solution to this mismatch. Laser writing [31] of chalcogenides for example provides a low index change which eases coupling difficulties. Unfortunately this significantly limits bending radii which need to be quite small to minimize circuit size to enhance stability and losses. Chalcogenide glasses also offer entire families of glass composition that can be tuned to generate a very wide range of index contrasts. In addition, simple vertical tapering methods have been demonstrated in ChG with extremely low transition losses that can be used to adapt the mode size from a small high index contrast waveguide that can be sharply bent to one which is optimized for coupling with lower NA microlens arrays [32]. This design flexibility and the future scalability to N band was the factor that motivated this design study and ongoing work to implement the final design as a functional device.

Waveguide design and low NA mode matching
Initially the waveguide design follows from [30] and uses the combination of Germanium, Arsenic and Selenium (Ge 11.5 As 24 Se 64.5 ) as the core material and a Sulfur substituted analog (Ge 11.5 As 24 S 64.5 ) as the under-and over-cladding. These materials were chosen as they can be deposited by thermal evaporation in a bulk like state and so require no annealing [33] which would introduce birefringence through the resulting structural rearrangements. The refractive index of these glasses were measured by spectroscopic elipsometry as 2.609 and 2.279 respectively [30] at 4 μm. The maximum width and height for a single mode waveguide was determined by modelling using RSoft FEMSim in full vector mode for a square core buried in cladding glass. The resulting mode intensity profile for a 2 by 2 μm waveguide single moded down to 3.8 μm is shown in Fig. 1 for the TE (and equivalently due to symmetry also TM) propagating mode. The 1/e 2 mode field diameter was 2.6 μm, matching the central lobe of an Airy disc formed in air with a numerical aperture of 0.79, or a Gaussian with minimal truncation formed with an NA of 0.82. To attain high efficiency coupling to low NA systems, a 3-D taper with simultaneous width and height changes, or a pair of 2-D tapers can be used. The 2-D taper pair, whilst conceptually inferior, is considered as it eases mechanical alignment tolerances imposed by the fabrication methodology for the vertical taper [32]. The challenge is to keep the light confined in the fundamental mode of the now multimode waveguide, by designing to avoid mode coupling. Both device types were modelled with RSoft BEAMProp in full vector mode. Figure 2 shows simulations of a full 3-D and dual stage (vertical then lateral tapers) transforming a 10 by 10 μm waveguide fundamental mode to a 2 by 2 μm waveguide mode. It is clear from Fig. 2 that such devices work, but the transformation losses also need to be clarified. The calculated transmission across the wavelength window of interest is displayed in Fig. 3 as a function of taper length (sum of two tapers for the dual case), treating all light not in the fundamental mode of a 2 by 2 μm waveguide as lost.  Figure 3 shows that the single taper is better behaved (the monotonic increase indicating nelegible mode coupling) creating a device 2 times shorter than the two taper system for the same throughput of ~95%. For taper lengths >250 μm, the 3-D device is essentially lossless. The main issue is in the fabrication tolerances, which requires the lithographic mask for the waveguide width definition to be aligned to the mechanically defined vertical taper to a tolerance <10 μm. This is a non-trivial exercise as the taper deposition mask is usually mechanically machined and so there are large tolerances at play and deposited fiducials are necessarily large and of low edge sharpness due to the tapering.
The fall back option is the dual taper system where the vertical tapering is separated from the width tapering resulting in significantly relaxed lithographic alignment tolerances. However as can be seen in Fig. 3 there is non-monotonic behaviour occuring as a function of length which is a clear indicator of mode coupling occuring. However the magnitude of this is acceptably small resulting in only slightly lower transmission (96% vs 99.5%) than the full 3-D case by a total length of 400 μm. Thus the dual taper is also quite acceptable, requiring more space on the chip for the benefit of simpler fabrication.

The multimode interference coupler design and tolerances
The most basic component in an interferometer is the combiner/splitter. The port counts and coupling ratios vary depending on the particular nulling architecture, with the simplest type being the Bracewell Interferometer [11] that requires a single 2x2 splitter with 50% coupling ratios [34]. Once the Bracewell Interferometer has provided a proof of concept, the next step is to use the a interferometer Multimod evanescent co compared to typically larg example is th evanescent c fabrication to split (referred nulling interfe Prior anal especially for

MMI des
The requirem the two outpu MMI, as defin analyzed and Accordingly, where W is th material, N b this case 2) an to 1 to minim Equation positions 1/6t but with a len  (1) f the core ection (in ength (set to be at is similar s e tolerance MI, from and a null only free MProp in th of 500 orporated into the model. For simulations that used the same dimensions, wavelength and refractive index the input mode was invariant and thus the 2 by 2 μm waveguide mode, computed by FEMSim, was used as the launch and monitor mode; otherwise the launch mode was computed directly by BEAMProp and also used as the monitor for the output powers. Simulations indicated that the MMI length for a given width can be independently optimized in terms of equalization of the coupling ratios or the total throughput of the device [36]. Figure 5 shows the difference between these optimum lengths as a function of MMI width from BEAMProp simulations over MMI widths from 25 μm to 55 μm. The imbalance is found using Eq. (2), with bar port defined as that directly opposite to the input taper and the cross port being that diagonally opposite the input respectively. The imbalance defines a deviation from perfect 50/50 splitting (an imbalance of 0) and is used here in conjunction with the maximum transmission for device optimization.
Bar port Cross port Imbalance (%) 100 Bar port Cross port

MMI width and length dependence
Having an optimized design for the base MMI, its tolerance to fabrication errors needs to be elucidated. In terms of the MMI geometry, errors can occur independently in the length and width of the MMI during the mask fabrication step according to the accuracy of the mask writer, termed its critical dimension (CD) tolerance, the level to which the fabricated feature matches the design feature. During actual device fabrication, the length and width will be affected together by both the lithography and etching processes, and independently to any mask errors therefore adding cumulatively. During lithography, the initial imaging of the device onto the wafer, the final size of the printed feature compared to the mask feature size after resist development will depend upon a number of factors, (e.g. exposure dose, temperature humidity, develop time, etc). Similarly, during dry plasma etching, further size changes can occur depending upon the etch time, the degree of undercut or mask erosion, etc. Both processes affect all features in the same way causing the outline of the device to grow or  From Fig. 8 the imbalance has a similar dependence on the outline of the MMI to the width with an imbalance variation of ± 0.35% per ± 100 nm outline change.

Refractive index
The refractive index of the MMI will remain constant throughout a single device or a small group of devices (at least over a device <1 mm long), however there will be intra-and interwafer variations in the refractive indices of the core and cladding. Experience with the films involved over many depositions indicates a total variation of 0.01 in refractive index is the worst case. Accordingly, the performance of the optimized design above was modelled over a wide range of core and cladding indices as shown in Fig. 9, where the refractive indexes where varied independently: the number indicated by the vertical line was the design refractive index for the optimized device and the value used when varying the other index. As would be expected, the performance is relatively insensitive to the cladding index exhibiting an imbalance variation of ± 0.0015% per ± 0.005 change in cladding refractive index around the design point (2.279) and negligible change in the transmission. As suggested by Eq. (1), however, the performance is more sensitive to the core index. Around the design point (2.609), the imbalance change is ± 0.09% per ± 0.005 of refractive index change. The transmission again is effectively unchanged.

Layer thickness
The other remaining device geometry variable is the film thickness. The expected thickness variation over a single device (millimeter scale) would be very small (sub nanometer) but over centimeter scales and between wafer runs more significant variations occur, typically a maximum range of ± 0.5% or a total range of 20 nm for the design considered here. Figure 10 shows a simulation from BEAMProp of the effect of core thickness variation on the reference design.  For the ± 10 nm achievable accuracy the change in imbalance around the reference value of 2.00 μm is ± 0.011% for TE and ± 0.0010% for TM.

Waveguide taper width
Whilst the impact of changes in the taper width is captured in the lithography and etching outline error considered above, the effects of solely varying the taper widths are considered here as it offers some further design flexibility and increased fabrication robustness for tapers a little narrower than optimum. The tapers increase robustness and reduce insertion losses and originated from research outlined in [39]. The expected optimum width is calculated as 3/10th the MMI width, 13.5 μm for a 45 μm wide MMI. As is shown in Fig. 11, varying the width of both tapers simultaneously has minor effect on the transmission: decreasing the width from the "optimum" value of 13.5 μm there is only a gradual fall in the transmission until the decline accelerates for widths 8 μm and smaller. The taper length is chosen to be short enough to keep the light single moded by preventing mode coupling. In this simulation the length was set to 200 μm which is consistent with Fig. 3. As can also be seen in Fig. 11, there are in fact several alternate widths at which excellent imbalance can be obtained, and that the performance is not highly sensitive to the taper width.

Fabrication tolerance summary
The effects of the various possible sources of fabrication tolerance error on the imbalance are now summarized in Table 1. From this it is clear the dominant error sources are the random mask width error, and the lithography and etching bias outline errors.

Wavelength dependence
Whilst the wavelength is not a fabrication variable it is nonetheless a quantity of operational interest when nulling performance is considered. The bandwidth of a MMI is expected to be wavelength dependent due to the 1/λ term in Eq. (1) for the 3 dB coupling length. The reference design was modelled using BEAMProp across a range of wavelength with the results displayed in Fig. 12. The design of the MMI was optimized for a wavelength of 4 μm as is evident from the peak in the transmission in Fig. 12. The MMI transmission around the design wavelength of 4 μm, is 98 ± 6.5% for ± 200 nm of optical bandwidth. The imbalance is minimized at the design wavelength as desired, and to a first order approximation, varies linearly with a proportionality constant of 0.0126%/nm in the ± 200 nm operating bandwidth.

Beam combiner
The MMI has so far been presented on its own, however, three are required for the architecture proposed by Angel and Wolf [35] -which involved interfering two sets of two beams separately and then interfering the resulting output together. Fig. 13. Three MMIs in a 2:1 Angel and Wolf configuration [35]. The black lines in the centre of each MMI are only for simulation purposes they are in no way physical.
The design shown in Fig. 13 is as follows: The star light enters the first two MMIs and, by appropriate phase changes applied to the input light, is almost entirely directed to the outside (top and bottom in the diagram) output waveguides (termed the antinulls). To achieve this result the input wavefronts, to the two input MMIs, must have a 90° phase offset controllably imposed upon them. By tuning the phase offset, light from a close companion (incoherent with the star light), is directed into the (middle) third MMI. The middle MMI is a simple power combiner where no nulling is intended where an appropriate scan of the phase offset provides detailed information of surrounding light sources. This operation is herein termed single nulling due to a null of the host star only occurring at the first two MMIs and is similar to that used for the beam combination in Dragonfly [21]. A small proportion of the star light exits from each input MMI into the third MMI, and can be interfered again with appropriate imposed phase shifts to send most of it to one of the two central output waveguides. This is a different operational mode herein termed double nulling as per Angel and Woolf [35] and illustrated in Fig. 13 with the exception of one null port becoming an antinull port.
The figure of merit of a nulling interferometer is the Extinction ratio, which is defined as Null Extinction (dB) 10Log , AntiNull where the Null is the power from the on-axis star light that has been suppressed by the architecture in Fig. 13 and the AntiNull is the power in the channels the star is directed into. In Fig. 13 the null channels are the middle two waveguides (directly after the third MMI) and the AntiNull channels are the highlighted channels at the top and bottom of the frame. As shown in Table 1; the main sources of error are the, per MMI random mask width variation and the lithography and etch bias outline error. To evaluate the likely impact of these factors on the nulling performance over the operating bandwidth, a Monte Carlo simulation was performed where three random MMI widths are chosen within the error bounds ( ± 100 nm) and then a random outline error applied to all three MMIs together within the error bounds ( ± 100 nm). The chosen system then had the extinction ratio simulated across the operating window, and the process was repeated 100 times. To do this using BPM would however be unacceptably slow, and so the individual MMI performances were represented in matrix form with wavelength dependent parameters.
A series of targeted BPM simulations were used around the nominal design to fit the variation of the MMI coupling efficiency (k) and the transmission (α) across wavelength and over the full range of width and outline variations. The coupling matrix in Eq. (4), was then used for each MMI individually with the α and k over the wavelength range chosen from the interpolated table of values based on the random parameter choices (and i 2 = −1). The first two MMIs (as shown in Fig. 13) were set to have inputs of 1 and i to represent two incoming coherent wavefronts at a 90° phase difference. The middle MMI uses the outputs of the first two MMIs. All waveguide and bend losses have been excluded in this model as their impact would be minimal based on the short distance between the MMIs and any significant loss would be dependent on a range of different fabrication errors not discussed in this paper.  Figure 14 shows the results of the Monte Carlo simulation for 100 nullers dimensioned as described above for the TM mode extinction (worst case) of the nulling interferometer. Figure  14(a) illustrates that a 40 dB extinction is achieved across a bandwidth of 3.9 to 4.2 μm when simulating a single nulling operation whereas (b) has a 60 dB null using a double null operation over the same bandwidth.

Conclusion
A beam combiner for a Bracewell inspired nulling interferometer has been designed and rigorously simulated to elucidate fabrication tolerances. It has been shown that the selected coupler, an MMI, is robust against changes in width and the total outline providing a nulling device with a reasonably achromatic response over the bandwidth of 3.9 to 4.2 µm.
For the operational bandwidth of 3.9 to 4.2 µm, a > 40 dB null is expected in single nulling mode. An even better extinction can be reached with a smaller bandwidth, exceeding 60dB for narrowband observations though at the expense of total power arriving at the detector. In double nulling mode, extinction exceeding 60 dB is available over the whole 3.8 to 4.2 μm operating bandwidth. At these null depths, scientifically meaningful observations are achievable with, for example, EGPs and exoplanets within 1 AU of the host star. These types of measurements work in tandem with established coronagraphic measurements that cannot penetrate as close to the host star and thus cannot resolve planets inside the habitable zone. Work is currently in progress to fabricate and characterize the optimized device designed here, and verify the predicted performance experimentally.

Funding
This research was supported by the Australian Research Council (ARC) Centre of Excellence for Ultrahigh bandwidth Devices for Optic Systems (CUDOS) project CE110001018.