Accelerating Silicon Photonic Parameter Extraction using Artificial Neural Networks

We present a novel silicon photonic parameter extraction tool that uses artificial neural networks. While other parameter extraction methods are restricted to relatively simple devices whose responses are easily modeled by analytic transfer functions, this method is capable of extracting parameters for any device with a discrete number of design parameters. To validate the method, we design and fabricate integrated chirped Bragg gratings. We then estimate the actual device parameters by iteratively fitting the simultaneously measured group delay and reflection profiles to the artificial neural network output. The method is fast, accurate, and capable of modeling the complicated chirping and index contrast.


Introduction
Interest in integrated optics continues to grow as silicon photonics provides an affordable platform for areas like telecommunications, quantum information processing, and biosensing [1]. Silicon photonic devices typically contain features with sub-micron dimensions, owing to the platform's high index contrast and years of complementary metal-oxide-semiconductor (CMOS) process refinement.
While small features enable several innovative and scalable designs, they also induce an increased sensitivity to fabrication defects [2]. A fabrication defect of just one nanometer, for example, can cause a nanometer shift in the output spectrum of the silicon photonic device [3]. Understanding and characterizing these process defects is essential for device modeling and variability analysis [4,5]. Predicting and compensating for such sensitivity in the optical domain is difficult because typical simulation routines are computationally expensive and, in many cases, prohibitive.
To overcome these challenges, we propose a new parameter extraction method using artificial neural networks (ANN). We train an ANN to model the complex relationships between integrated chirped Bragg gratings (ICBG) [6,7] and their corresponding reflection and group delay profiles. We use the trained ANN to extract the physical parameters of various fabricated ICBGs using a nonlinear least squares fitting algorithm, a task that is computationally prohibitive using traditional simulation routines. We find that the proposed routine produces spectra that match the experimental reflection and group delay profiles of the ICBGs well.
Our work builds upon previous efforts that extract integrated photonic device parameters using analytic models. Chrostowski et al., for example, extract the group index across a wafer with 371 identical microring resonators (MRR) using an analytic formula describing the free spectral range (FSR) [8]. Similarly, Chen et al. derive both the effective and group indices from MRRs by fitting the full, analytic spectral transfer function to the experimental data [9]. Melati et al. extract phase and group index information from small lumped reflectors known as point reflector optical waveguides (PROW) using various analytic formulas [10].
Perhaps most similar to this work, Xing et al. build a regression model from data generated by an eigenmode solver that relates waveguide design parameters (e.g. width and thickness) to their corresponding effective indices [11]. They subsequently use this regression model, together with an analytic transfer matrix and a fitting routine, to extract the average waveguide width and thickness from various Mach-Zehnder Interferometer (MZI) devices. Our ANN parameter extraction method is fast, just like the analytic and regression models, but capable of modeling much more complicated devices, like ICBGs.
The rest of the paper is outlined as follows: first, we describe the data generation process necessary to train the ANN. Next, we describe the process of training the ANN. We then discuss the ICBG device design, fabrication, testing, and data calibration. Finally, we describe our nonlinear fitting algorithm and present our experimental results.

Data generation
ANNs model the relationship between inputs and outputs by cascading various nonlinear computational units known as neurons [12]. ANN training algorithms like backpropagation tune the weights of these neurons until the functional mapping adequately models the corresponding training set [13]. Several factors, like the ANN's architecture and even the training set itself, influence the training accuracy and speed of the ANN. In addition, the ANN may learn unintentional biases if the training set insufficiently represents the function space [14].
Consequently, it is important to adequately describe the ICBG using parameters that are simple and intuitive for the designer, but also comprehensive and descriptive in order to fully span the design space. To accomplish this, we parameterized the ICBG's design space using the length of the first ICBG period ($a_0$), the length of the last ICBG period ($a_1$), the number of gratings (NG), the ICBG's corrugation width ($\Delta w = w_1 - w_0$), and the wavelength ($\lambda$). Figure 5 illustrates an ICBG with each of these design parameters. We trained the ANN to output the reflection and group delay spectra of the simulated ICBG.
To accurately simulate the ICBG reflection and group delay responses for our training set, we used a layered-dielectric media transfer matrix method (LDMTMM) [15]. The method discretizes the ICBG into individual dielectric slabs, models each slab as an ideal waveguide, and propagates the fields through each slab using a transfer matrix. The effective index of each section is modeled using another ANN that parameterizes the wavelength as a function of waveguide width and thickness. This process is repeated for every wavelength point of interest.
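The slab-by-slab transfer matrix idea described above can be sketched as follows. This is a minimal illustration and not the authors' LDMTMM code: it uses the standard normal-incidence characteristic matrix for a lossless dielectric layer, splits each grating period into two half-period slabs, and takes two fixed effective indices (`n_wide`, `n_narrow`) as hypothetical stand-ins for the values that the paper obtains from a second ANN. All function names are assumptions.

```python
import cmath
import math


def slab_matrix(n, d, wavelength):
    """Characteristic matrix of one lossless dielectric slab at normal incidence."""
    delta = 2.0 * math.pi * n * d / wavelength  # phase thickness of the slab
    c, s = cmath.cos(delta), cmath.sin(delta)
    return ((c, 1j * s / n), (1j * n * s, c))


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def icbg_reflectivity(wavelength, a0, a1, n_periods, n_wide, n_narrow):
    """Power reflectivity of a linearly chirped grating (all lengths in micrometres).

    The period varies linearly from a0 (first period) to a1 (last period).
    n_wide and n_narrow are hypothetical effective indices of the two
    half-period slabs; the surrounding waveguide is assumed to have n_narrow.
    """
    m = ((1.0, 0.0), (0.0, 1.0))
    for k in range(n_periods):
        frac = k / (n_periods - 1) if n_periods > 1 else 0.0
        period = a0 + (a1 - a0) * frac
        for n in (n_wide, n_narrow):
            m = matmul2(m, slab_matrix(n, period / 2.0, wavelength))
    n_in = n_out = n_narrow
    b = m[0][0] + m[0][1] * n_out
    c = m[1][0] + m[1][1] * n_out
    r = (n_in * b - c) / (n_in * b + c)  # amplitude reflection coefficient
    return abs(r) ** 2


# Example sweep over the simulated band for one illustrative (made-up) design.
spectrum = [
    icbg_reflectivity(1.45 + 0.2 * i / 249, 0.317, 0.323, 250, 2.45, 2.40)
    for i in range(250)
]
```

Repeating the matrix product at every wavelength point is what makes the direct simulation costly inside a fitting loop, which is the motivation for replacing it with a trained ANN.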
We simulated over 100,000 grating configurations at 250 wavelength points from 1.45 µm to 1.65 µm, resulting in approximately 26,000,000 training points. We swept through 10 different corrugation widths, 11 different ICBG lengths, and 961 different chirping patterns [16].
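As a quick check of the dataset size, the sweep counts quoted above can be multiplied out. The variable names are illustrative only.

```python
# Sweep sizes quoted in the text.
n_corrugation_widths = 10
n_lengths = 11
n_chirp_patterns = 961
n_wavelengths = 250

n_configurations = n_corrugation_widths * n_lengths * n_chirp_patterns
n_training_points = n_configurations * n_wavelengths

print(n_configurations)   # 105710
print(n_training_points)  # 26427500
```

Both figures agree with the "over 100,000" configurations and "approximately 26,000,000" training points stated in the text.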
For a tool intended to perform parameter extraction on fabricated devices, it is important to use a generalized and abstracted model that is insensitive to minor fabrication defects. For example, small changes in the ICBG apodization profile greatly alter the expected spectral ringing. The LDMTMM method also exhibits significant ringing that corresponds to a very narrow parameter space. Furthermore, parameterized high-frequency ringing is rather difficult to capture efficiently using ANNs. We overcame these challenges by fitting the LDMTMM group delay and reflection profiles to a generalized skewed Gaussian function prior to training. Our resultant dataset corresponds to a much larger and more practical parameter space and significantly eases the ANN training process. Once the dataset was generated and processed, we proceeded to train the ANN. Often, this process must be repeated until a suitable parameter space is simulated. This design flow is illustrated in Figure 1.
Fig. 1. ... (1). Then, the reflection and group delay profiles are simulated using the transfer matrix method (2). This dataset is then fed into an ANN training algorithm (3). Often, this process must be repeated until the ANN can suitably express a large enough ICBG design space.
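The smoothing step described in this section, where a ringing profile is replaced by a fitted skewed Gaussian, might look like the sketch below. The functional form of the authors' skewed Gaussian is not available here, so the code assumes one common choice (a Gaussian multiplied by an error-function skew term). The parameter names, the synthetic "ringing" profile, and the initial guess are all hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf


def skewed_gaussian(x, amplitude, center, width, skew):
    """Assumed skewed-Gaussian shape: Gaussian envelope times an erf skew factor."""
    z = (x - center) / width
    return amplitude * np.exp(-0.5 * z**2) * (1.0 + erf(skew * z / np.sqrt(2.0)))


# Synthetic stand-in for a simulated profile: smooth shape plus fast ringing.
wavelengths = np.linspace(1500.0, 1600.0, 250)  # nm
true_params = (0.45, 1545.0, 12.0, 2.0)
ringing = 0.01 * np.sin(2.0 * np.pi * wavelengths / 1.5)
raw_profile = skewed_gaussian(wavelengths, *true_params) + ringing

# Least-squares fit; the fitted curve is what would be stored in the training set.
initial_guess = (0.5, 1550.0, 15.0, 1.0)
fit_params, _ = curve_fit(skewed_gaussian, wavelengths, raw_profile, p0=initial_guess)
smooth_profile = skewed_gaussian(wavelengths, *fit_params)
```

Storing `smooth_profile` instead of `raw_profile` discards the ringing, so the network only has to learn four slowly varying shape parameters per spectrum.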

ANN training
To identify a suitable ANN architecture, we performed a hyper-parameter optimization (HPO), where several ANNs with different architectures were trained simultaneously. We swept through common ANN architecture components, like the number of layers, the number of neurons for each layer, each neuron's activation function, and the batch size. We concurrently trained 1200 different ANNs on 1200 cores using Brigham Young University's Fulton Supercomputing Lab and the TensorFlow package [17]. Each simulation took approximately 12 hours. Figure 2 illustrates the HPO's results. We measured the accuracy and effectiveness of each network by tracking the mean squared error (MSE) and coefficient of determination ($R^2$) for all simulated permutations.
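For reference, the two metrics tracked during the HPO can be computed as below. This is a plain-Python sketch using the standard definitions of MSE and $R^2$; it is not taken from the authors' training scripts, and the function names are illustrative.

```python
def mean_squared_error(targets, predictions):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def r_squared(targets, predictions):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_target = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot
```

A network that always predicts the mean of the targets scores $R^2 = 0$, while a perfect fit scores $R^2 = 1$, which makes $R^2$ easier to compare across candidate architectures than the raw MSE.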
From the HPO, we chose to train an ANN with 8 layers. The layers had 32, 64, 128, 256, 128, 64, 32, and 16 neurons, respectively. Each neuron used a leaky ReLU activation function. No dropout was used.
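The chosen architecture can be written down as a small framework-free forward pass. The layer widths and the leaky ReLU activation come from the text; everything else is an assumption made for illustration: five inputs ($a_0$, $a_1$, NG, $\Delta w$, $\lambda$), two outputs (reflection and group delay), a linear output layer after the eight listed layers, a leaky slope of 0.01, and random (untrained) weights. The paper itself uses TensorFlow.

```python
import numpy as np

HIDDEN_WIDTHS = [32, 64, 128, 256, 128, 64, 32, 16]  # layer widths from the text
N_INPUTS = 5   # assumed: a0, a1, NG, corrugation width, wavelength
N_OUTPUTS = 2  # assumed: reflection and group delay at that wavelength


def leaky_relu(x, slope=0.01):
    """Leaky ReLU; the slope value is an assumption, not given in the text."""
    return np.where(x > 0.0, x, slope * x)


def init_network(rng):
    """Random weights and zero biases for every dense layer (untrained)."""
    sizes = [N_INPUTS] + HIDDEN_WIDTHS + [N_OUTPUTS]
    return [
        (rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out)), np.zeros(fan_out))
        for fan_in, fan_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(network, inputs):
    """Dense layers with leaky ReLU, followed by an assumed linear output layer."""
    activations = inputs
    for weights, biases in network[:-1]:
        activations = leaky_relu(activations @ weights + biases)
    weights, biases = network[-1]
    return activations @ weights + biases


network = init_network(np.random.default_rng(0))
n_parameters = sum(w.size + b.size for w, b in network)
batch = np.random.default_rng(1).uniform(size=(4, N_INPUTS))
outputs = forward(network, batch)  # shape (4, 2): one row per input sample
```

Under these assumptions the network has fewer than 100,000 trainable parameters, so a single evaluation is a handful of small matrix products. That is why the trained model can be called repeatedly inside a fitting loop.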

Device fabrication, measurement, and calibration
We designed 11 different ICBGs each with a linear chirp of 6 nm. We designed 5 of the devices with a reversed chirping, such that their resultant group delay profiles would be mirror images of their counterparts. Some devices were 750 grating periods long and the others were 250. We chose corrugation widths of 10 nm, 30 nm, and 50 nm.
To efficiently extract the reflection, transmission, and group delay profiles of the same ICBG, we designed an interrogator circuit using various Y-branches, directional couplers, and grating couplers. Figure 3 illustrates the circuit. We used the grating couplers to direct light on and off of the chip. We routed the light using the Y-branches and directional couplers. We interfered the reflection signal with the original reference signal using a Mach-Zehnder Interferometer (MZI) in order to measure the group delay information.
Our devices were fabricated at the University of Washington in collaboration with the University of British Columbia and the SiEPIC program on a 150 mm silicon-on-insulator (SOI) wafer with 220 nm thick silicon on 3 µm thick silicon dioxide and a hydrogen silsesquioxane resist (HSQ, Dow-Corning XP-1541-006). Electron beam lithography was performed using a JEOL JBX-6300FS system operated at 100 keV energy [18], 8 nA beam current, and 500 µm exposure field size. The silicon was removed from unexposed areas using inductively coupled plasma etching in an Oxford Plasmalab System 100. Cladding oxide was deposited using plasma enhanced chemical vapor deposition (PECVD) in an Oxford Plasmalab System 100.
To characterize the devices, a custom-built automated test setup [1] with automated control software written in Python was used. An Agilent 81600B tunable laser was used as the input source and Agilent 81635A optical power sensors as the output detectors. The wavelength was swept from 1500 to 1600 nm in 10 pm steps. A polarization maintaining (PM) fibre was used to maintain the polarization state of the light, to couple the TE polarization into the grating couplers [19]. A polarization maintaining fibre array fabricated by PLC Connections (Columbus OH, USA) was used to couple light in/out of the chip.
To estimate the reflection and group delay profiles from the measurement data, we calibrated out the band-limited spectral responses induced by the grating couplers, directional couplers, and Y-branches. Figure 4 illustrates this process for both the reflection and group delay data. For the reflection measurements, we first fit the data outside of the ICBG's bandwidth to a fourth order polynomial. We use this polynomial fit to remove the couplers' responses. We then relocate the noise floor by fitting, once again, the data outside of the ICBG's bandwidth. To extract the group delay, we fit the entire MZI interference pattern to a fourth order polynomial to remove the couplers' response. We then estimate the free spectral range (FSR) of the interferometer using a peak-tracking algorithm. From the FSR, along with the relative path length difference ($L_{ref}$) of approximately 200 µm, we can estimate the group delay ($\tau$) using the group index of the reference arm waveguide, $n_g(\lambda)$.
Fig. 3. Interrogation circuit used to extract the reflection, transmission, and group delay profiles of a single ICBG simultaneously. Light is routed on and off the chip using grating couplers. The group delay is extracted using a Mach-Zehnder Interferometer (MZI) formed by various directional couplers and Y-branches.
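The two calibration steps described in this section can be sketched as follows. This is an illustrative reconstruction, not the authors' code. The baseline removal fits a fourth order polynomial to out-of-band samples and subtracts it. The delay estimate uses the textbook interferometer relation $\Delta T = \lambda^2 / (c \cdot \mathrm{FSR})$ for the total delay difference between the arms, then subtracts the reference-path contribution $n_g L_{ref} / c$. That subtraction and its sign are assumptions about the circuit layout, and the group index value in the example is hypothetical.

```python
import numpy as np

C_UM_PER_PS = 299.792458  # speed of light in micrometres per picosecond


def remove_baseline(wavelength_um, power_db, in_band, order=4):
    """Subtract a polynomial fitted only to samples outside the device band."""
    poly = np.polynomial.Polynomial.fit(wavelength_um[~in_band], power_db[~in_band], order)
    return power_db - poly(wavelength_um)


def delay_difference_ps(wavelength_um, fsr_um):
    """Total delay difference between interferometer arms from the fringe spacing."""
    return wavelength_um**2 / (C_UM_PER_PS * fsr_um)


def device_group_delay_ps(wavelength_um, fsr_um, n_group, l_ref_um):
    """Assumed convention: device delay = total delay minus reference-path delay."""
    return delay_difference_ps(wavelength_um, fsr_um) - n_group * l_ref_um / C_UM_PER_PS


# Synthetic example: a quartic coupler envelope (dB) with a flat 8 dB device band.
wl = np.linspace(1.5, 1.6, 1001)
u = (wl - 1.55) / 0.05
envelope = -12.0 - 4.0 * u**2 + 1.5 * u**4
in_band = np.abs(wl - 1.55) < 0.01
measured = envelope + np.where(in_band, 8.0, 0.0)
flattened = remove_baseline(wl, measured, in_band)

# Delay for a 2 nm fringe spacing at 1550 nm with a 200 um reference path.
tau = device_group_delay_ps(1.55, 0.002, n_group=4.2, l_ref_um=200.0)
```

`Polynomial.fit` rescales the wavelength axis internally, which keeps the fourth order fit well conditioned even though the wavelengths span a narrow range.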

Experimental Results
To estimate the actual fabrication parameters of the ICBGs, we used the ANN in conjunction with a nonlinear least squares fitting routine within the SciPy package [20]. The routine initializes by calling the ANN using the original design parameters. The simulated reflection and group delay profiles are directly compared to the measurement data. From the residuals, the algorithm decides whether the current design is sufficiently similar to the measurement data or if further simulation is needed. Figure 5 illustrates this procedure.
Since the ICBG has fabrication limits that can be cast as parameter bounds, we chose to run a Trust Region Reflective (TRF) optimization algorithm within the nonlinear solver [21]. Specifically, we bounded the first and last ICBG periods ($a_0$ and $a_1$) between 312 nm and 328 nm and the corrugation width between 1 nm and 50 nm. The number of periods was fixed.
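The bounded fitting loop described above can be sketched with SciPy's `least_squares` and its `method="trf"` option. The trained ANN is not available here, so `surrogate_model` is a purely hypothetical smooth stand-in that maps ($a_0$, $a_1$, $\Delta w$) to a reflection and group delay spectrum. Its functional form, the effective index, and the "fabricated" parameter values are invented for the example. Only the bounds (312 nm to 328 nm for the periods, 1 nm to 50 nm for the corrugation width) come from the text.

```python
import numpy as np
from scipy.optimize import least_squares

WAVELENGTHS_NM = np.linspace(1500.0, 1600.0, 201)
N_EFF = 2.42  # hypothetical effective index, used only by the stand-in model


def surrogate_model(params):
    """Hypothetical stand-in for the trained ANN: returns (reflection, group delay)."""
    a0, a1, dw = params
    center = N_EFF * (a0 + a1)                      # Bragg-like centre wavelength, nm
    bandwidth = 5.0 + 2.0 * N_EFF * abs(a1 - a0)    # wider chirp, wider band, nm
    detuning = WAVELENGTHS_NM - center
    reflection = np.tanh(0.05 * dw) ** 2 * np.exp(-(detuning / bandwidth) ** 2)
    group_delay = 0.05 * (a1 - a0) * detuning       # ps; slope sign follows chirp direction
    return reflection, group_delay


def residuals(params, measured_reflection, measured_delay):
    """Stack reflection and group delay residuals so both are fitted together."""
    reflection, delay = surrogate_model(params)
    return np.concatenate([reflection - measured_reflection, delay - measured_delay])


design = np.array([318.0, 324.0, 30.0])       # starting point: as-designed values
fabricated = np.array([317.4, 324.9, 22.0])   # made-up "true" values to recover
measured_reflection, measured_delay = surrogate_model(fabricated)

result = least_squares(
    residuals,
    design,
    bounds=([312.0, 312.0, 1.0], [328.0, 328.0, 50.0]),  # bounds quoted in the text
    method="trf",
    args=(measured_reflection, measured_delay),
)
extracted = result.x
```

Fitting the reflection and group delay jointly matters in this toy model for the same reason it does in the experiment: the reflection band alone cannot distinguish a chirp from its reversed counterpart, while the sign of the group delay slope can.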
We chose to extract the parameters of three different ICBGs. After just 5 minutes of optimization on a 2012 MacBook Air (1.8 GHz Intel Core i5, 4 GB 1600 MHz DDR3 RAM), the solver converged on new parameters for all three devices that more reasonably reflect the measurement data. Figure 6 illustrates the algorithm's results compared to the fabrication data and the original design spectra. Not only do the algorithm's profiles match the data much better, but the extracted parameter differences are expected from the processes used to fabricate the devices. For example, the algorithm predicts a slightly wider chirping bandwidth and smaller corrugation width for all three devices. The E-beam raster grid's resolution approaches the chirping resolution of the ICBG (1 nm), so "snapping" from one grid point to the next results in slightly wider chirping bandwidths. The E-beam's resolution, along with the etch process, also tends to round the sharp ICBG corners, resulting in a lower net corrugation width.
Fig. 4. Calibration process used to extract the measured reflection and group delay responses. The reflection data is first fit to a fourth order polynomial outside of the expected bandwidth in order to remove the grating couplers' transfer function (a1). Next, the data is once again fit to a fourth order polynomial outside of the device's bandwidth to identify the noise floor (a2). The data is then normalized to unit power (a3). Similar to the reflection data, the group delay data is also fit to a fourth order polynomial to remove the grating couplers' response (b1). Next, the FSR is approximated using a peak-tracking algorithm (b2). From the FSR, the group delay is estimated (b3).
Fig. 5. Efficient and robust method to extract fabricated ICBG device parameters using ANNs and a nonlinear least-squares optimizer. First, the ANN simulates reflection and group delay spectra for the device's initial design parameters (1). Then, the simulations are compared directly to the measured data (2). If the results are sufficiently similar, the optimizer returns the device parameters (3). If not, the optimizer strategically simulates a new set of device parameters based on the residual error (4).
Fig. 6. The extracted reflection (a1, a2, a3) and group delay (b1, b2, b3) profiles (yellow) compared to the initial design profiles (red) and the calibrated measurement data (blue).
Other small differences between the extracted parameter sets and the fabricated data, like the Fabry-Perot resonances, are difficult to model with the current ANN abstraction. Capturing these defects would require a much more sophisticated, and possibly impractical, parameterization. Despite these small discrepancies, the fitting algorithm and ANN demonstrate a strong ability to extract parameters for complex silicon photonic devices.

Conclusion
We demonstrate a novel silicon photonic parameter extraction method using artificial neural networks. Our method is capable of extracting parameters for complicated devices, like integrated chirped Bragg gratings, without sacrificing the speed of traditional analytic methods. To validate our method, we fabricated and measured various integrated chirped Bragg gratings and extracted the actual parameters. Future work will explore other parameterizations and new devices, like adjustable splitters and directional couplers.