Multi-channel control for microring weight banks



Introduction
Advances in photonic integrated circuit (PIC) technologies [1][2][3][4][5] will generate opportunities for large-scale, low-cost optical processing systems. At the same time, a revival is underway in unconventional (neuro-inspired) microelectronic computing architectures, which aim to address energy efficiency limitations inherent in von Neumann computers [6][7][8]. Neuro-inspired hardware addresses these issues, in part, by distributing processing among many nodes and, as such, relies heavily on multi-access networking strategies in which connection strengths (i.e., "weights") are reconfigurable. It has long been recognized that optical physics is well suited to the analog interconnect problem, yet solutions based on holograms [9] and fiber [10] circuits have not led to integrated systems. We have previously proposed a PIC-compatible multi-access analog network called "broadcast-and-weight" [11], which combines results in multiwavelength networks [12], analog photonic links [13,14], and photonic neurons [15][16][17][18]. Broadcast-and-weight networks could open processing domains with unprecedented speed and complexity [19]. Figure 1(a) depicts the concept of a broadcast-and-weight network.
Broadcast-and-weight relies heavily on wavelength-division multiplexed (WDM) weighted addition. The analog network topology is "programmed" by controlling weight values. Microring resonator (MRR) implementations of weight banks, drawn in Fig. 1(b), have the advantages of compactness, WDM capability, and ease of tuning. On the other hand, MRR sensitivity to fabrication variations, thermal fluctuations, and thermal cross-talk presents a control problem. MRR control is an important topic for WDM demultiplexers [20], high-order filters [21], modulators [22], and delay lines [23]. MRR controllers are often based on online feedback control [24,25], but the unique requirements of an analog weight bank (continuous range of weights, input signals of unknown amplitude and shape, etc.) call for a feedforward control approach with offline pre-calibration performed at least once per fabricated device [26].
Prior work demonstrated feedforward control of an add/drop MRR filter edge for effecting a continuous range of transmission values. This enabled a single photonic weight with a range of -1 to +1 [27] and a precision of 3.1 bits [26] (i.e., a maximum error of ±0.117 over the range ±1, or a dynamic range of 9.33 dB). Calibration consisted of recording weights over the filter edge tuning range and interpolating. The results in [26] are preliminary in the sense that interpolation-based techniques are impractical for simultaneously controlling more than one MRR weight, an essential requirement for photonic weight banks and weighted optical networks. When weight interdependency or cross-talk is present, the control problem cannot be separated into $N$ isolated channels. The dimensionality of the full tuning range increases with $N$, necessitating $O(2^N)$ calibration measurements in general. The major contributions of the present work are full, simultaneous control of MRR weight banks and the development of tractable, $O(N)$ calibration methods for banks with any number of channels.
Model-based calibration is required for the two predominant sources of weight interdependency: thermal cross-talk and cross-gain saturation. In this work, we expand on preliminary results in [29], developing models whose parameters can be fit (i.e., calibrated) with an $O(N)$ routine of spectral and oscilloscope measurements. Whereas an interpolation-only approach with a resolution of 20 points per channel would require $20^4 = 160,000$ calibration measurements, the presented calibration routine takes roughly 4 × [10 (heater) + 20 (filter) + 4 (amplifier)] = 136 total calibration measurements. We then assess factors affecting weight precision, including the complexity of the thermal cross-talk model. We demonstrate simultaneous 4-channel MRR weight control with an accuracy of 3.8 bits and a precision of 4.0 bits (plus 1.0 sign bit) on each channel. While optimal weight resolution is still a topic of discussion in the neuromorphic electronics community [8], several state-of-the-art architectures with dedicated weight hardware have settled on 4-bit resolution [30,31]. Practical, accurate, and scalable MRR control techniques are a critical step towards large-scale analog processing networks based on MRR weight banks.
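The scaling argument above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function names are hypothetical, and the per-channel counts (10 heater, 20 filter, 4 amplifier measurements) are the approximate figures quoted in the text.

```python
def interpolation_only_count(points_per_channel: int, n_channels: int) -> int:
    """Grid calibration: every combination of tuning points across all channels."""
    return points_per_channel ** n_channels


def model_based_count(n_channels: int, heater: int = 10,
                      filt: int = 20, amplifier: int = 4) -> int:
    """Per-channel routine: heater sweep + filter edge samples + amplifier states."""
    return n_channels * (heater + filt + amplifier)


grid_count = interpolation_only_count(20, 4)
model_count = model_based_count(4)
print(grid_count, model_count)
```

Doubling the channel count doubles `model_count`, whereas `grid_count` grows as the power of the channel count.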

Methods
MRR weight bank samples were fabricated on silicon-on-insulator wafers at the Washington Nanofabrication Facility through the UBC SiEPIC rapid prototyping group [32]. Silicon thickness is 220 nm, and buried oxide thickness is 3 µm. Waveguides 500 nm wide were patterned by electron-beam lithography and fully etched to the buried oxide [33]. A hydrogen silsesquioxane resist (HSQ, Dow-Corning XP-1541-006) was spin-coated at 4000 rpm, then hotplate baked at 80 °C for 4 minutes. Electron-beam lithography was performed using a JEOL JBX-6300FS system operated at 100 keV energy, 8 nA beam current, 500 µm exposure field size, and an exposure dose of 2800 µC/cm². The resist was developed by immersion in 25% tetramethylammonium hydroxide (TMAH) for 4 minutes. Silicon was etched from unexposed areas by inductively coupled plasma etching in an Oxford Plasmalab System, with a chlorine gas flow of 20 sccm, pressure of 12 mT, ICP power of 800 W, bias power of 40 W, and a plate temperature of 20 °C.
The weight bank device pictured in Fig. 1(c) consists of two bus waveguides and four MRRs in a parallel add/drop configuration, each of which controls a single wavelength channel by tuning on or off resonance. The radii of the MRRs are 6.37, 6.90, 7.43, and 7.96 µm, respectively. Coupling region gaps are 200 nm, and neighboring MRRs are separated by 20 µm. Q-factors are approximately 10,000. The free spectral range of the first MRR is measured to be 15 nm, indicating an effective TE refractive index of 4.2. The sample is mounted on a temperature-controlled alignment stage and coupled to fiber with TE focusing subwavelength grating couplers [34].
The experimental setup shown in Fig. 2(a) consists of a multiwavelength reference input generator [35] that produces statistically independent signals by imparting channel-dependent delays on a 2 Gbps pseudo-random bit sequence (PRBS). These reference signals are shown in Fig. 2(b). A 4-channel, 13-bit digital-to-analog converter (DAC), NI PCI-6723, buffered to provide up to 80 mA per channel, tunes the electrical power dissipated in each MRR heater. The heaters share a common connection to reduce electrical I/O count. Since this common wire is not perfectly conducting, the effective common voltage can fluctuate with total current flow. Current-mode drivers are used to avoid this issue. The drop and through outputs of the MRR weight bank are amplified, their net delays matched, and detected by a balanced photodiode (PD). A transmission spectrum analyzer (not shown) is also connected to the device to simultaneously monitor the filter resonance peaks, tune them onto resonance with the WDM input signals, and assist in thermal model calibration. Figure 2(c) depicts tuning the bank from the initial state to the state in which all channels are on-resonance.
Although input signals to the MRR weight bank are not necessarily known during an operation phase, the calibration phase can take advantage of known reference inputs in order to simultaneously measure the effective weight of each channel. In this case, the references were delayed PRBS signals, each of which is stored as $x_i(t)$. If channel delays exceed one bit period, then the correlation $\langle x_i(t) \cdot x_j(t) \rangle_t$ approaches zero for a sufficiently long pattern (in this work, $2^7$ bits). All weights $\mu_i$ can then be determined by decomposing a single measurement $m(t)$ in terms of the stored references, Eq. (1).

The calibration routine estimates a mapping of applied current to weight, $\mathbf{i} \to \boldsymbol{\mu}$. The inverse of this mapping becomes the feedforward control rule for effecting a desired weight vector. We separate the map into physical stages for thermal tuning ($\mathbf{i} \to \Delta\boldsymbol{\lambda}$), MRR bank transmission ($\Delta\boldsymbol{\lambda} \to \mathbf{T}$), and actual detected weight ($\mathbf{T} \to \boldsymbol{\mu}$).
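The decomposition of one measured trace into all channel weights can be sketched as a least-squares projection onto the stored references. This is a minimal illustration under stated assumptions, not the routine used in the experiment: the PRBS7 generator, the 9-bit delay step, mean removal (i.e., AC-coupled detection), and the names `prbs7` and `decompose_weights` are all hypothetical choices made for the example.

```python
import numpy as np


def prbs7(n_bits: int = 127, seed: int = 0x7F) -> np.ndarray:
    """Generate PRBS7 bits with the LFSR polynomial x^7 + x^6 + 1."""
    state = seed & 0x7F
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return np.array(bits, dtype=float)


def decompose_weights(m: np.ndarray, refs: np.ndarray) -> np.ndarray:
    """Least-squares weights mu such that m(t) ~ sum_i mu_i * x_i(t).

    refs has shape (n_channels, n_samples); m has shape (n_samples,).
    """
    mu, *_ = np.linalg.lstsq(refs.T, m, rcond=None)
    return mu


# Assumed reference set: one PRBS delayed by a different number of bits per
# channel, with the mean removed.
base = prbs7()
base -= base.mean()
refs = np.stack([np.roll(base, 9 * k) for k in range(4)])

true_mu = np.array([0.5, -1.0, 0.25, 0.8])   # made-up weights for the demo
m = true_mu @ refs                           # simulated single measurement m(t)
est_mu = decompose_weights(m, refs)
print(np.round(est_mu, 3))
```

Least squares is used here instead of a plain correlation so that the small residual correlation between delayed copies of a finite-length pattern does not bias the estimate.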

Thermal cross-talk model
The temperature of an MRR waveguide is affected predominantly by the heater directly above it, but heat can also leak between nearby MRRs. The relationship between dissipated electrical power, $i^2 R$, and resonant wavelength shift, $\lambda - \lambda_0$, is linear and can be modeled by a matrix, $\mathbf{K}$ [36]. Assuming heater resistance is constant,

$$\boldsymbol{\lambda} = \boldsymbol{\lambda}_0 + \mathbf{K}\,\mathbf{i}^{2},$$

where the square of $\mathbf{i}$ is element-wise, $\boldsymbol{\lambda}_0$ is the resonant wavelength at zero tuning current, and $\mathbf{K}$ is a nearly diagonal matrix that describes the thermo-optic effect, heat transfer coupling, and heater resistance. Off-diagonal elements of $\mathbf{K}$ describe unintended heat transfer from a given heater to the filters of other channels, a.k.a. thermal cross-talk. Substituting $q_j \equiv i_j^2$ for notational clarity, this equation can be put in a differential form around the WDM signal wavelengths, $\boldsymbol{\lambda}_{sig}$, and the tuning current needed to bias the filters on-resonance with these signals, $\mathbf{q}_{bias}$,

$$\Delta\boldsymbol{\lambda} = \mathbf{K}\,\Delta\mathbf{q}, \qquad \Delta\boldsymbol{\lambda} \equiv \boldsymbol{\lambda} - \boldsymbol{\lambda}_{sig}, \qquad \Delta\mathbf{q} \equiv \mathbf{q} - \mathbf{q}_{bias}.$$

This linear model is simple to calibrate and invert, but it relies on an assumption of constant heater resistance. In general, heater resistance is also temperature dependent due to thermo-electric self-heating. For a single current-driven heater with ambient resistance $R_0$ and thermo-electric coefficient $\alpha$,

$$R(q) = \frac{R_0}{1 - \alpha R_0 q}, \qquad (7)$$

which is certainly not constant, and even has a singularity at $q = (\alpha R_0)^{-1}$, signifying a thermal runaway. Instead of combining the multivariate and nonlinear equations above, we simply note that non-constant resistance means that second and higher-order derivatives of $\Delta\boldsymbol{\lambda}$ in terms of $\Delta\mathbf{q}$ are non-zero but small enough that a Taylor expansion can incorporate the nonlinearities,

$$\Delta\boldsymbol{\lambda} = \sum_{d=1}^{D} \mathbf{K}_d\,\Delta\mathbf{q}^{d},$$
where $D$ is the model order, the exponentiation of $\Delta\mathbf{q}$ is element-wise, and the model now contains $D$ distinct $\mathbf{K}$ matrices. The Taylor approximation's main advantage for calibration is a simple method to fit the $\mathbf{K}$ matrices. The tuning current of each channel is swept over an operating range of interest (∼4 filter linewidths), while the wavelength shift of every filter peak is measured with the spectrum analyzer. Each peak shift function is fit with a $D$-order polynomial to obtain one element of each $\mathbf{K}$ matrix. The process is repeated for every channel. The $\mathbf{K}$ values found in this experiment are shown in Fig. 3. To prevent overfitting, there must be at least $D$ spectrum measurements per channel. We make $5DN$ measurements for added robustness and found that $D = 2$ made the thermal modelling error small enough not to limit overall precision, a point revisited in Section 3.
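The per-channel sweep-and-fit procedure can be sketched with synthetic data. Everything numerical below is made up for illustration (the matrix values, their units of nm per mA² and nm per mA⁴, and the ±50 mA² sweep range), and the helper names are hypothetical; only the structure follows the text: sweep one heater at a time, record every peak shift, and fit each shift curve with an order-$D$ polynomial whose coefficients fill one column of each matrix.

```python
import numpy as np


def forward_shift(dq, K):
    """Delta-lambda from Delta-q using sum_d K_d @ dq**d (element-wise power)."""
    return sum(Kd @ dq ** (d + 1) for d, Kd in enumerate(K))


def fit_thermal_model(dq_sweep, shifts, order):
    """Fit K_1..K_D from single-heater sweeps.

    dq_sweep: 1-D array of Delta-q values applied to one heater at a time.
    shifts[j]: array (len(dq_sweep), N) of all peak shifts while heater j is swept.
    """
    n = len(shifts)
    K = [np.zeros((n, n)) for _ in range(order)]
    for j in range(n):              # swept heater
        for i in range(n):          # observed filter peak
            coeffs = np.polyfit(dq_sweep, shifts[j][:, i], order)
            for d in range(1, order + 1):
                # np.polyfit lists the highest power first; the constant is last
                K[d - 1][i, j] = coeffs[order - d]
    return K


# Synthetic "device" with made-up values.
N, D = 4, 2
off_diag = np.ones((N, N)) - np.eye(N)
K_true = [0.010 * np.eye(N) + 0.0005 * off_diag,
          1e-6 * np.eye(N) + 1e-7 * off_diag]

dq_sweep = np.linspace(-50.0, 50.0, 5 * D)      # 5*D spectra per channel
shifts = []
for j in range(N):
    rows = []
    for dq_j in dq_sweep:
        dq = np.zeros(N)
        dq[j] = dq_j
        rows.append(forward_shift(dq, K_true))
    shifts.append(np.array(rows))

K_fit = fit_thermal_model(dq_sweep, shifts, D)
```

Each spectrum yields the shift of all $N$ peaks at once, so the loop above corresponds to $5DN$ spectra in total.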
The polynomial mapping ∆q → ∆λ must be inverted to provide a feedforward control rule.While it does not have a closed-form inverse, the following iterative solution converges quickly.
The iteration takes advantage of the fact that the thermo-electric effects on heater resistance, represented by the $\mathbf{K}_{d>1}$ matrices, are relatively small perturbations. As heaters are biased closer to the thermal runaway singularities of Eq. (7), the thermo-electric effects become stronger. This means that more steps are needed to converge and that higher orders of Taylor approximation must be used, necessitating more calibration measurements.
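One iteration with the properties described above is a fixed-point scheme that inverts the linear term exactly and treats the higher-order terms as a correction. The sketch below is an assumption about the form of the iteration, not a transcription of the paper's equation; the matrix values, the target shifts, and the function names are hypothetical.

```python
import numpy as np


def forward_shift(dq, K):
    """Delta-lambda from Delta-q using sum_d K_d @ dq**d (element-wise power)."""
    return sum(Kd @ dq ** (d + 1) for d, Kd in enumerate(K))


def invert_thermal_model(dlam_target, K, n_iter=20):
    """Fixed-point iteration treating K_2..K_D as perturbations of K_1."""
    K1_inv = np.linalg.inv(K[0])
    dq = K1_inv @ dlam_target                    # linear (D = 1) first guess
    for _ in range(n_iter):
        higher = sum(Kd @ dq ** (d + 2) for d, Kd in enumerate(K[1:]))
        dq = K1_inv @ (dlam_target - higher)
    return dq


# Made-up model values (nm per mA^2 and nm per mA^4) and target shifts (nm).
N = 4
off_diag = np.ones((N, N)) - np.eye(N)
K = [0.010 * np.eye(N) + 0.0005 * off_diag,
     1e-6 * np.eye(N) + 1e-7 * off_diag]
dlam_target = np.array([0.30, -0.20, 0.10, 0.40])

dq_cmd = invert_thermal_model(dlam_target, K)
residual = forward_shift(dq_cmd, K) - dlam_target
print(np.abs(residual).max())
```

With these values the second-order term is a small fraction of the linear term, so the residual shrinks by orders of magnitude per step; larger higher-order matrices would slow this down, in line with the remark about biasing near thermal runaway.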

Optical transmission interpolation
The transmission effect of each MRR filter edge is treated as an independent function, $f_i : \Delta\lambda_i \to T_i$, and calibrated with an interpolation-based approach originally developed for a single channel [26]. Twenty samples per filter are interpolated to get a continuous estimate of the forward function, $f_i(\Delta\lambda_i)$, and its inverse, $f_i^{-1}(T_i)$. This estimate is refined by taking a second set of samples that are nominally uniform in $T_i$. Calibrated edge transmission functions are shown in Fig. 3. The advantage of the interpolation approach is robustness to arbitrary and non-ideal filter edge shapes; however, it requires that the channel spacing be large enough that filters do not interact optically. In this case, a minimum channel spacing of about 150 GHz, seen in Fig. 2(c), gives sufficient isolation, but future work to increase channel density must reexamine the edge calibration approach.
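The two-pass interpolation can be sketched for a single filter edge. The Lorentzian `measure_edge` function below is only a stand-in for a real transmission measurement, and its linewidth (0.15 nm), extinction depth, and sweep range are assumed values; piecewise-linear interpolation via `np.interp` is likewise an assumption, since the text does not specify the interpolant.

```python
import numpy as np


def measure_edge(dlam, fwhm=0.15, depth=0.95):
    """Stand-in for a measurement: Lorentzian through-port transmission."""
    return 1.0 - depth / (1.0 + (2.0 * dlam / fwhm) ** 2)


# First pass: 20 samples uniform in detuning along one filter edge.
dlam_samples = np.linspace(0.0, 0.30, 20)
T_samples = measure_edge(dlam_samples)


def f_hat(dlam):
    """Interpolated forward function, detuning -> transmission."""
    return np.interp(dlam, dlam_samples, T_samples)


def f_hat_inv(T):
    """Interpolated inverse; valid because the sampled edge is monotonic."""
    return np.interp(T, T_samples, dlam_samples)


# Second pass: resample at detunings that are nominally uniform in T.
T_uniform = np.linspace(T_samples[0], T_samples[-1], 20)
dlam_refined = f_hat_inv(T_uniform)
T_refined = measure_edge(dlam_refined)


def f_refined_inv(T):
    """Refined inverse used as the feedforward rule for this stage."""
    return np.interp(T, T_refined, dlam_refined)


dlam_cmd = f_refined_inv(0.5)       # detuning commanded for T = 0.5
print(dlam_cmd, measure_edge(dlam_cmd))
```

Resampling uniformly in $T$ concentrates samples where the edge is steep, which is where a uniform detuning grid is coarsest in transmission.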

Cross-gain saturation model
EDFAs at the output of the weight bank in Fig. 2(a) are subject to slow-timescale cross-gain saturation, which depends on the weight of each channel in addition to absolute power levels that can fluctuate with polarization, ambient temperature, and fiber strain. The present fiber experiment must model this cross-gain saturation to obtain unbiased weight bank results. While optical amplifiers are not yet widely available on silicon PICs, semiconductor and rare-earth-ion amplifiers in silicon have been investigated [37,38] and could potentially use a similar model. We model the cross-gain saturation effect assuming two homogeneously broadened EDFAs in non-depleted pump regimes. In this model, $i$ indicates channel number and $T$ is the tunable microring through-port transmission. $P_{in}$ is input power, $T_c$ is net coupling efficiency, $\gamma$ is drop efficiency, $g_{ss}$ is amplifier small-signal gain, and $P_s$ is saturation power, which is not channel-dependent. $P_{amp}$ signifies total power incident on an EDFA. The superscripts $(+,-)$ respectively indicate the amplifiers on the through and drop output ports. Not all physical parameters are observable from weight measurements, but a parameterization in terms of the vectors $\mathbf{B}^{(+,-)}$ and $\mathbf{C}^{(+,-)}$ yields a fittable model, Eq. (14). These parameter vectors (totaling $4N$ parameters) can be fit (i.e., calibrated) with a series of $4N$ measurements at particular tuning states.

We introduce the notation $\mu_i^{(xy)}$ to signify the measured weight of channel $i$ when the transmission of that channel, $T_{j=i}$, is $x$ and the transmissions of the other channels, $T_{j \neq i}$, are $y$. For example, $\mu_2^{(10)}$ signifies the weight of channel 2 when it is transmitted to the through port ($T = 1$) and channels 1, 3, and 4 are coupled to the drop port ($T = 0$). The calibration procedure starts by measuring $\mu_i^{(11)}$ and $\mu_i^{(10)}$. The resulting equations, containing $2N$ unknown parameters and $2N$ known measurements, can be solved analytically as follows.
By summing this equation over all $i$ and rearranging, the sum of $C^{+}$ can be stated entirely in terms of measured weights, at which point it can be substituted into Eq. (17) to recover the individual $C^{+}$ parameters. The $B^{+}$ parameters then follow trivially from Eq. (15). The drop-port amplifier parameters, $C^{-}$ and $B^{-}$, follow an identical procedure upon measuring $\mu_i^{(00)}$ and $\mu_i^{(01)}$. The ability to decompose single measurements of $m(t)$ into all weights via Eq. (1) means that $\mu^{(11)}$ and $\mu^{(00)}$ only require one measurement each, while the dissimilar measurements call for distinct tuning states and therefore $N$ measurements per amplifier. In this derivation, it was assumed that complete switching down to $T = 0$ is possible, which is not always the case in practice. A more algebraically complex calibration technique with nonzero $T_{min}$ can be derived similarly, but is omitted here in the interest of space. Calibrated parameter values found for this experiment are shown in Fig. 3.
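The analytic solve described above can be illustrated with a saturable-gain form for the through-port amplifier. The specific form used here, $\mu_i = B_i T_i / (1 + \sum_j C_j T_j)$, is an assumption chosen to be consistent with the description (a per-channel scale $B$ and a saturation coefficient $C$ shared through the total amplifier input); it is not a transcription of the paper's equations, and the parameter values and function names are hypothetical.

```python
import numpy as np


def through_weight(T, B, C):
    """Assumed saturable-gain form for the through-port amplifier."""
    return B * T / (1.0 + C @ T)


def calibrate_amplifier(mu_all_on, mu_one_on):
    """Solve for B and C from mu^(11) and mu^(10) under the assumed form."""
    n = len(mu_all_on)
    r = mu_all_on / mu_one_on                 # r_i = (1 + C_i) / (1 + sum(C))
    c_sum = (n - r.sum()) / (r.sum() - 1.0)   # from summing r_i over all channels
    C = r * (1.0 + c_sum) - 1.0
    B = mu_all_on * (1.0 + c_sum)
    return B, C


# Made-up "true" parameters used to synthesize the calibration measurements.
B_true = np.array([1.00, 0.90, 1.10, 0.95])
C_true = np.array([0.05, 0.03, 0.08, 0.04])
N = len(B_true)

# mu^(11): all channels at T = 1, a single measurement.
mu11 = through_weight(np.ones(N), B_true, C_true)
# mu^(10): channel i at T = 1 and the others at T = 0, N measurements.
mu10 = np.array([through_weight(np.eye(N)[i], B_true, C_true)[i]
                 for i in range(N)])

B_fit, C_fit = calibrate_amplifier(mu11, mu10)
print(np.round(B_fit, 3), np.round(C_fit, 3))
```

Under this form, the ratio of the two measurements isolates $(1 + C_i)/(1 + \sum_j C_j)$, and summing the ratios over channels gives one equation in the single unknown $\sum_j C_j$.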
Once the forward model parameters have been calibrated, the mapping $\mathbf{T} \to \boldsymbol{\mu}$, Eq. (14), must be inverted in order to serve as a feedforward control rule. This is solved iteratively. The iteration converges quickly when the $C$ parameters are small, as in Fig. 3, which is the case when signal powers are less than the amplifier saturation power.
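A fixed-point inversion of the weight stage can be sketched under an assumed two-amplifier form in which the through and drop contributions each saturate with the total power on their own amplifier. Both the forward model and the update rule below are assumptions made for illustration (holding the two saturation factors fixed within a step makes the model linear in each $T_i$); parameter values and names are hypothetical.

```python
import numpy as np


def forward_weight(T, Bp, Cp, Bm, Cm):
    """Assumed balanced-detection weight: through term minus drop term."""
    through = Bp * T / (1.0 + Cp @ T)
    drop = Bm * (1.0 - T) / (1.0 + Cm @ (1.0 - T))
    return through - drop


def invert_weight(mu, Bp, Cp, Bm, Cm, n_iter=30):
    """Fixed-point iteration: freeze saturation factors, solve linearly for T."""
    T = np.full_like(mu, 0.5)
    for _ in range(n_iter):
        gp = 1.0 / (1.0 + Cp @ T)              # through-amplifier compression
        gm = 1.0 / (1.0 + Cm @ (1.0 - T))      # drop-amplifier compression
        T = (mu + Bm * gm) / (Bp * gp + Bm * gm)
    return T


# Made-up calibrated parameters and a target generated from known transmissions.
Bp = np.array([1.00, 0.90, 1.10, 0.95])
Cp = np.array([0.05, 0.03, 0.08, 0.04])
Bm = np.array([1.00, 1.00, 0.90, 1.05])
Cm = np.array([0.04, 0.06, 0.02, 0.05])
T_true = np.array([0.9, 0.1, 0.5, 0.7])
mu_target = forward_weight(T_true, Bp, Cp, Bm, Cm)

T_cmd = invert_weight(mu_target, Bp, Cp, Bm, Cm)
print(np.round(T_cmd, 4))
```

The saturation factors depend on the transmissions only through the small $C$ coefficients, which is why each step changes them little and the iteration settles quickly in this regime.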

Results
After the above calibration procedure is performed, the four-dimensional command weight is swept in two dimensions at a time while the actual weight is recorded. A sweep of command weight values over two channels is shown in Fig. 4. Sweeps over other pairs of channels (not shown) were seen to produce similar results. Mean error is less than 0.072 over the range, corresponding to a weight accuracy of 3.8 bits (i.e., 11.4 dB dynamic range), and dynamic error is less than 0.062 for a weight precision of 4.0 bits. Accuracy is 0.7 bits higher than in prior work with a single channel [26] due to procedure and setup changes that minimize polarization drift. The effects of using simplified models for thermal physics are shown in Fig. 5. When thermal cross-talk and self-heating are completely neglected (i.e., $D = 1$ and $\mathbf{K}_1$ is diagonal), accuracy is reduced to 2.8 bits. A constant resistance model (i.e., $D = 1$) is used for Fig. 5(b), yielding a small improvement to 3.0 bits. In both cases, mean errors in Fig. 5 show no clear trend, besides being less accurate towards more negative weight values. Surprisingly, introducing a linear cross-talk model barely improves weight accuracy. This can be explained by the sharp sensitivity of filter transmission to resonant wavelength. The sensitive response of the MRR filter edge necessitates very accurate thermal modelling; in this case, $D = 2$ provided a significant improvement. For the devices in this paper, we found $D = 3$ to yield negligible improvement over $D = 2$ since other factors limited precision; however, MRR weight banks with different biases, heater designs, materials, etc. may need increased Taylor orders for sufficient thermal model accuracy. An alternative to thermal tuning is depletion modulation [39], which could eliminate thermal cross-talk and the current-squared dependence, yet requires a more involved fabrication process with a partial etch of the top silicon layer and four dopant levels.

Fig. 5. Weight sweeps for simplified thermal cross-talk models over 5 iterations. The target grid (black), mean error vectors (red), and standard deviation ellipses (blue) are used as in Fig. 4(a). a) No model of thermal cross-talk is applied, and weight accuracy is 2.8 bits (8.4 dB dynamic range). b) A first-order $D = 1$ (i.e., constant resistance) model of thermal cross-talk is applied, and weight accuracy is 3.0 bits (9.0 dB dynamic range). In both sweeps, the amplifier cross-gain calibration model is applied in order to isolate the effect of thermal cross-talk modeling.

Discussion
We can draw several conclusions about weight accuracy and precision by examining Fig. 4.
Negative weight values are accurate on average, but have more variability because they correspond to the on-resonance condition where sensitivity to fluctuations is greater. The orientation of the error ellipses indicates a positive correlation between dynamic errors, pointing to intra-sweep power level fluctuations that affect all channels. Most likely, this is explained by drift in fiber temperature and strain affecting polarization and therefore the fiber-to-chip coupling efficiency of all channels ($T_c$ in Eq. (11)). Random non-repeatable errors with standard deviation 0.063 are likely attributable to polarization drift; however, accuracy is still limited by systematic errors, which repeat over multiple sweeps and have a maximum of 0.072. Several mechanisms could explain much of the discrepancy between DAC resolution (13 bits) and weight accuracy (3.8 bits), providing evidence that DAC resolution is limiting. Firstly, the dependence of transmission $T$ on $\Delta\lambda$ is sharply nonuniform over the tuning range of interest (Fig. 3). The ratio of maximum slope to mean slope is 5.3 in the worst case (ch. 2), representing 2.4 bits lost. This effect is somewhat intrinsic to filter edge tuning and difficult to improve without affecting performance. Secondly, the DAC driver is designed to have an 80 mA range in order to cover large initial biases; however, the weight tuning range (in this case ∼1 mA) is much smaller. This dynamic range mismatch reduces the usable resolution by a factor of 80, or 6.3 bits, accounting for the remaining discrepancy. Controller accuracy is therefore expected to improve by reducing the mismatch between the tuning range of interest and the driver range.
The range of interest for $\Delta q_1$ was $53^2 - 52^2 = 105\ \mathrm{mA}^2$. Supposing MRRs could start on-resonance with zero bias, either by careful fabrication or by flexible WDM wavelengths, a maximum current of $\sqrt{105} = 10.2$ mA would be needed to get the same wavelength shift (provided identical device/heater design). If the driver's full range could then be set to 10.2 mA, the dynamic range mismatch could be entirely eliminated, in theory. Reducing dynamic range mismatch is an important direction for making systematic controller errors negligible compared with dynamic errors, and also for reducing the DAC resolution needed to achieve a given accuracy.
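The bit-budget figures quoted in this discussion follow from base-2 logarithms of the relevant ratios. The short check below only reproduces that arithmetic from numbers stated in the text; the variable names are hypothetical.

```python
import math

# Accuracy from maximum mean error over a unit weight range.
accuracy_bits = math.log2(1.0 / 0.072)
dynamic_range_db = 10.0 * math.log10(1.0 / 0.072)

# Resolution lost to the nonuniform filter edge (max slope / mean slope).
slope_loss_bits = math.log2(5.3)

# Resolution lost to the mismatch between driver range and tuning range.
range_loss_bits = math.log2(80.0 / 1.0)

# Range of interest in q = i^2, and the equivalent current from zero bias.
dq_range = 53 ** 2 - 52 ** 2
i_max_zero_bias = math.sqrt(dq_range)

print(round(accuracy_bits, 1), round(dynamic_range_db, 1),
      round(slope_loss_bits, 1), round(range_loss_bits, 1),
      dq_range, round(i_max_zero_bias, 1))
```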
A natural question to ask of MRR weight banks is the minimum channel spacing. Just as the introduction of thermal cross-talk dictated a switch from interpolation-based calibration to model-based calibration, weight banks with dense channel spacing will be subject to optical cross-talk, requiring optical transmission modeling. In that regime, the weight bank cannot be broken into $N$ independent models or interpolated functions. Since all filters couple into the same output ports, a channel that partially couples through the "wrong" filter can still end up at the intended output, unlike in the case of a demultiplexer, wherein unavoidable cross-talk dictates the minimum channel spacing [40]. This suggests that model-based calibration of optical cross-talk in an MRR weight bank could be instrumental for increasing channel density and number. Further work in this direction will likely benefit from generalized models of waveguide circuits, such as those described in [41].

Conclusion
We have demonstrated simultaneous feedforward control of a 4-channel microring weight bank, which could play a major role in large scale processing networks on silicon photonic platforms.The primary enablers of this result were scalable calibration models of thermal cross-talk and amplifier cross-gain saturation.A weight accuracy of 3.8 bits was demonstrated, on par with corresponding state-of-the-art digital electronic neuromorphic hardware [31].Thermal models that neglect cross-talk and thermo-electric self-heating were found to be insufficient, reducing this accuracy by 1.0 and 0.8 bits, respectively.Parameterized calibration models for optical cross-talk could be developed for more advanced weight bank controllers.Further work could explore the limits of channel density in a single MRR weight bank, and the integration of multiple weight banks into a broadcast-and-weight network.

Fig. 1. a) Role of WDM weighted addition in a proposed on-chip analog photonic processing network [11]. Each E/O node (which could be any nonlinear modulator, direct-driven laser, or dynamical laser neuron) produces a signal modulated on a unique wavelength. Weighted addition banks produce electrical signals that drive the E/O converters [28]. b) Microring resonator (MRR) implementation of a WDM weight bank. Tuning MRRs between on- and off-resonance switches a continuous amount of optical power between drop and through ports. A balanced photodetector (PD) yields the sum and difference of weighted signals. c) Optical micrograph of the device under test, showing a bank of four thermally-tuned MRRs. d) Wide area micrograph, showing fiber-to-chip grating coupler ports.

Fig. 2. a) Experimental setup. An input generator creates uncorrelated signals on different wavelengths by time delaying a single PRBS. DFB: distributed feedback laser; AWG: arrayed-waveguide grating; PPG: pulse pattern generator; MZM: Mach-Zehnder modulator; FBG: fiber Bragg grating. The microring weight bank is thermally tuned by a current-mode DAC (digital-to-analog converter). Drop and through outputs are amplified by erbium-doped fiber amplifiers (EDFAs) and delay-matched before detection by a balanced photodetector (PD). A computer (CPU) executes the calibration routine. b) Time domain traces of reference input signals on different wavelength channels. c) Optical spectrum of WDM inputs (red) and transmission spectra of the drop port when tuning current is off (gray) and tuned onto resonance (blue), measured with a drop port spectrum analyzer (not shown).
Fig. 4. a) Two-dimensional weight sweep showing controller accuracy and precision. After the calibration procedure, the target weight is swept 5 times over a grid of values from -1 to 1 (black grid). Black points are measured weight data. Red lines show the mean offset from each target grid point. Blue ellipses indicate one standard deviation around the mean. Mean error magnitude is less than 0.072 over the span. Standard deviation remains below 0.063, with a tendency to be larger for negative weights. From this plot, we estimate that the weight can be controlled with an accuracy of 3.8 bits. b) Measured traces of 2 Gbps signals. [1-9] Output signals corresponding to points labeled in (a). The expected signal is in red, while measured traces are in blue. All time and voltage axes have identical scales.

Figure 4(b) shows time traces compared to expectation at several weight values. Traces 2 and 6 represent the original inputs and traces 8 and 4 their respective inverses. The sweep in Fig. 4(a) is used to analyze accuracy, a.k.a. mean error or repeatable error (red lines), and precision, a.k.a. dynamic error or non-repeatable error (blue ellipses).

#258667 Received 1 Feb 2016; revised 3 Apr 2016; accepted 4 Apr 2016; published 14 Apr 2016. (C) 2016 OSA. 18 Apr 2016 | Vol. 24, No. 8 | DOI:10.1364/OE.24.008895 | OPTICS EXPRESS 8904

Fig. 3. Diagram of modelling stages showing calibrated parameter values fit during this experiment. The bias stage puts variables in differential form around the state of all filters being on-resonance with the signals, $\boldsymbol{\lambda}_{sig}$. The heater stage models thermo-electric, heat transfer, and thermo-optic effects with a predominantly diagonal, linear $\mathbf{K}_1$ matrix and nonlinear corrections (order $D = 2$ shown). The filter stage consists of four independent interpolation-based estimates of the transmission along each MRR filter edge. The amplifier stage models absolute optical powers and fiber amplifier saturation characteristics preceding photodetection.