Towards a high-density photonic tensor core enabled by intensity-modulated microrings and photonic wire bonding

Luan, Enxiao; Yu, Shangxuan; Salmani, Mahsa; Nezami, Mohammadreza Sanadgol; Shastri, Bhavin J.; Chrostowski, Lukas; Eshaghi, Armaghan

doi:10.1038/s41598-023-27724-y

Download PDF

Article
Open access
Published: 23 January 2023

Towards a high-density photonic tensor core enabled by intensity-modulated microrings and photonic wire bonding

Enxiao Luan^1,2,
Shangxuan Yu²,
Mahsa Salmani¹,
Mohammadreza Sanadgol Nezami²,
Bhavin J. Shastri³,
Lukas Chrostowski² &
…
Armaghan Eshaghi¹

Scientific Reports volume 13, Article number: 1260 (2023) Cite this article

6713 Accesses
8 Citations
4 Altmetric
Metrics details

Subjects

Abstract

We propose a photonic processing unit for high-density analog computation using intensity-modulation-based microring modulators (IM-MRMs). The output signal at the fixed resonance wavelength is directly intensity modulated by changing the extinction ratio (ER) of the IM-MRM . Thanks to the intensity-modulated approach, the proposed photonic processing unit is less sensitive to the inter-channel crosstalk. Simulation results reveal that the proposed design offers a maximum of 17-fold increase in wavelength channel density compared to its wavelength-modulated counterpart. Therefore, a photonic tensor core of size 512 $\times $ 512 can be realized by current foundry lines. A convolutional neural network (CNN) simulator with 6-bit precision is built for handwritten digit recognition task using the proposed modulator. Simulation results show an overall accuracy of 96.76%, when the wavelength channel spacing suffers a 3-dB power penalty. To experimentally validate the system, 1000 dot product operations are carried out with a 4-bit signed system on a co-packaged photonic chip, where optical and electrical I/Os are realized using photonic and electrical wire bonding techniques. Study of the measurement results show a mean squared error (MSE) of 3.09$\times $10$^{-3}$ for dot product calculations. The proposed IM-MRM, therefore, renders the crosstalk issue tractable and provides a solution for the development of large-scale optical information processing systems with multiple wavelengths.

Parallel convolutional processing using an integrated photonic tensor core

Article 06 January 2021

Reconfigurable nonlinear photonic activation function for photonic neural network based on non-volatile opto-resistive RAM switch

Article Open access 06 October 2022

Nonlinear germanium-silicon photodiode for activation and monitoring in photonic neuromorphic networks

Article Open access 13 October 2022

Introduction

Computational requirements and energy expenditures have escalated rapidly, to either process the exponentially increased data generated by ultra-high-speed mobile networks or to address the demand for accelerating artificial intelligence ¹. However, current state-of-art electronic processors, which have developed with startlingly rapid progress in the past decades, are approaching their growth limit subject to Moore’s Law. It can be foreseen that if progress continues along the current route, these computational requirements will rapidly become prohibitive technically and economically ². Photonic platforms have been regarded as ideal candidates for analog processing of optical communication signals, providing a framework for a new class of information processing machines ³. Compared to electrical counterparts, photonic circuits have their predominant advantages: optical signals traveling at the speed of light can be manipulated by transmission modulation, experience lower attenuation and generate less heat as a function of distance ³. Many application-specific optical processors have been exploited for addressing mathematical ^4,5 and signal processing ^6,7 tasks with performance enhanced by orders of magnitude.

Integrated photonics has drawn tremendous attention due to its capability to generate, manipulate, and detect optical signals on a single chip. Leveraging photonic integrated circuits (PICs) manufactured using CMOS compatible processes, one can build up miniaturized photonic processing systems with high yield and low cost. According to the requirements of the light source, photonic processing systems can be divided into two categories, coherent architectures and multiwavelength architectures. For the coherent architecture, the coherent input light is employed in an array of beam splitters and phase shifters to perform matrix transforms using interference between different paths ³. The Mach-Zehnder interferometer-based (MZI) mesh is the dominant linear photonic processing network with coherent input signals. It is a well-studied and mature architecture for matrix multiplications in computational systems, including applications in optical neural networks ^8,9, quantum transport simulations ¹⁰, reconfigurable optical delay lines ¹¹ and singular value decomposition ¹². However, coherent optical interconnects exhibit sensitivity to the optical phase, which requires calibration after each MZI mesh layer ¹³. In addition, since coherent architectures require a single optical phase reference, only a single source laser can be employed. This required the laser to generate high optical power sufficient for the entire system. In contrast to coherent systems, multiwavelength architectures use incoherent signals generated by individual light sources at different wavelengths or a single source that produces multiple wavelengths to carry and process the information. Leveraging wavelength-division multiplexing (WDM), each input signal is an analog optical power at a given wavelength processed in parallel by a bank of modulators.

An integrated multiwavelength architecture, namely broadcast-and-weight, first proposed ¹⁴ and demonstrated ¹⁵ using cascaded microring resonators (MRRs) in the silicon photonic platforms, has proven its unique capabilities in information processing applications including in photonic neural networks ¹⁶, wireless signal processing ¹⁷$^{,}$ ¹⁸, and nonlinear programming ¹⁹. Incoming signals assigned to different wavelength carriers are wavelength division multiplexed and weighted by a photonic weight bank, that is realized by tuning MRR modulators. The signals are then summed up by full power differential detection. Typically, MRR modulators are wavelength-modulated. The resonant peak of the MRR shifts when the refractive index of the waveguide core changes through the thermo-optic effect or the plasma dispersion effect. This results in transmission intensity variation at the carrier wavelength. One of the drawbacks of this modulation technique, particularly in a large-scale system ²⁰, is the inter-channel crosstalk issue. Due to this issue, the optical modulation amplitude of the individual wavelength channels needs to be minimized, which, as a result, limits the scalability of the photonic processing system. Another multiwavelength architecture for photonic computing has been demonstrated recently by utilizing phase-change-material (PCM) memory crossbar arrays, which enables a speed of trillions of multiply-accumulate (MAC) operations per second ²¹. However, in this approach, signals are inscribed based on the absorption rate of the PCM patch on the waveguide. Part of the input light power is dissipated by the absorption. In addition, signals carried at different wavelengths are combined on the bus waveguide through directional couplers. These couplers will generate additional losses due to the broadband coupling. Although no inter-channel crosstalk is present, the crossbar system shows lower energy efficiency compared to the MRR-based system.

This paper investigates a fixed-wavelength intensity modulation scheme for a WDM-based photonic signal processing system. As shown in Figure 1a, interferometric coupling-based microring modulators (MRMs), which contain index-modulation components (shown in purple) in the coupling region and the resonator region, are introduced. Incoming WDM sources are first modulated by the all-pass MRMs in the modulation bank, and then weighted by the add-drop MRR filter arrays in the weight bank at each wavelength. By adjusting the coupling strength and the resonance condition, direct intensity modulation can be achieved at their fixed resonance wavelengths with negligible inter-channel crosstalk. At last, balanced-photodetectors (BPDs) are employed at the output for multiwavelength intensity summation. Simulations results show that when aiming for a 3-dB power penalty, IM-MRMs can improve the wavelength channel spacing by a factor of seventeen compared to the wavelength-modulated MRMs. Using photonic wire bonding (PWB) technique, the photonic chip that includes our proposed system based on IM-MRMs is assembled on a printed circuit board (PCB) and dot product calculation is demonstrated. PWB involves three-dimensional photosensitive polymer-based waveguide writing between different platforms after the die placement, which relaxes the requirement for fine alignment ³. Based on the experiments done using the packaged photonic chip, a mean squared error of 3.09 $\times $ 10$^{-3}$ is observed for 1000 multiplications with random input values. Moreover, to study the performance of the proposed scheme in a large-scale system, a convolutional neural network simulator is built using a co-simulation environment in Lumerical and Python. Simulations based on MNIST handwritten digit recognition dataset shows that the architecture can achieve an accuracy of 96.76%. The measurement and simulation results show that the proposed IM-MRMs can be the building block of the large-scale optical information processing systems which do not suffer from the inter-channel crosstalk.

Intensity-modulated microring modulators

Design

Interferometric coupling-based (or two-point coupled MZI-based) MRRs have been utilized for hitless-switching ²², high-speed modulation ²³, PAM-4 modulation ²⁴, photonic delay line ²⁵, single-photon sources ²⁶, and post-fabrication correction ²⁷. Recently, a wideband tunable add-drop multiplexer is reported using interferometeric coupling-based modulators ²⁸. The proposed design utilizes the similar intensity modulation scheme to achieve ’connection’ and ’disconnection’ of filter channels. However, this multiplexer needs 4 directly-coupled MRRs as the Vernier filter, and requires an additional photodetector for tuning and calibration, which inhibits its large-scale application at multiwavelengths.

The schematic in Figure 1b illustrates the proposed IM-MRM used in our high-density photonic processing system. By adding an unbalanced MZI acting as a tunable coupler, the effective coupling ratio changes when the sinusoidal spectral response of the MZI coupler shifts in wavelength ²². Accordingly, the extinction ratio of the resonant peak can be modulated in the MRM. The length difference $\Delta L$ between the unbalanced MZI coupler arms (as shown in Figure 1b) determines the free spectral range (FSR) of the device, according to the following equations:

$$\begin{aligned} FSR_\text {MZI} = \frac{FSR_\text {MRR}}{m}, \end{aligned}$$

(1)

$$\begin{aligned} m=\frac{\Delta L}{2\pi R}, \end{aligned}$$

(2)

where, $FSR_\text {MZI}$ and $FSR_\text {MRR}$ represent the FSR of the MZI coupler and the MRR, respectively, and R is the radius of the MRR. According to different non-negative values of m, the coupling spectrum and the ring resonance behave as either co-period or periodic resonance suppressed ²². To realize intensity modulation at a fixed wavelength, $\Delta L$ is set to be $\pi R$ (m = 1/2) to accommodate the phase shifting mechanism in the MZI arm. A device for larger values of m can be designed by sacrificing the footprint to improve the tuning efficiency. Analytical equations used towards the design of the IM-MRM can be found in Section S1 of the Supplementary Information.

The cross-section in Figure 1b shows a doped silicon waveguide, namely the photoconductive heater, to serve as the index-modulation and resonant peak-tracking component. N-doped resistive heaters integrated into MRRs have shown photoconductive effects with high responsivities for automatically tuning and stabilizing the filter’s resonant peak to the wavelength of interest ²⁹. Unlike germanium (Ge) deposition ³⁰ or PIN diode implantation ³¹, N-doped resistive heaters do not require dedicated defect implantation steps, additional material depositions, dedicated photodetectors, or optical power taps. This allows for a low-cost, compact and straightforward method for modulating large-scale MRR systems. Two types of N-doped photoconductive heaters are involved in our IM-MRM design. An in-coupler photoconductive heater (ICPH) serves as the effective coupling ratio modulator in the MZI coupler, and an in-resonator photoconductive heater (IRPH) serves as the resonant peak monitor and wavelength compensator in the MRR (shown in Figure 1b). Although N-doped photoconductive heaters have been employed in optical computing ¹⁶, biosensing ³² and data transmission ²⁹ systems, to our knowledge, this is the first demonstration of doped silicon-based multiwavelength optical information processing systems leveraging intensity modulation at a fixed wavelength.

Characterization and control

All the devices described in this paper were designed using the open-source layout editor KLayout and SiEPIC-Tools, and fabricated on a silicon-on-insulator (SOI) wafer with 220-nm-thick silicon and 2-$\upmu $m-thick buried oxide layers through a SiEPICfab Multi-Project Wafer fabrication run by Applied Nanotools Inc. Details of the optical characterization setup can be found in Section S2 of the Supplementary Information. Figure 2a shows a microscopic image of the proposed all-pass IM-MRM with integrated ICPH and IRPH components, which share the ground for space-saving consideration. N-doped and N$^{++}$-doped regions are false-colored. It is worth noting that due to the compact footprint (m = 1/2) and a shared ground, undesired effects caused by thermal and electrical crosstalk are observed between the ICPH and IRPH, which needs to be optimized in future designs (discussed in “Discussion and conclusion” Section). The radius is 15 $\upmu $m, and the coupling gap is 200 nm. The transmission spectrum of the fabricated IM-MRM, after calibration for removing the insertion loss of the grating couplers, is shown in Figure 2b. The FSR is measured at 12.5 nm, which is approximately doubled compared with a point-coupling-based MRM with the same radius. This matches the design above with $\Delta L = \pi R$, considering a group index of 3.85 for the N-doped waveguide. The noise in the transmission spectrum is because of the high return loss of the grating couplers suffering from high levels of back reflection in the system, which can be improved with optimization. A zoomed-in plot aiming for one resonant peak at 1526.25 nm is depicted in Figure 2c, indicating a quality factor (Q-factor) of $\sim $10,000 and an ER of 21 dB. In Figure 2d, IV curves for ICPH and IRPH are presented when no light is incident (dark current).

The resonance wavelength and ER of the IM-MRM can be manipulated by applying voltages across the IRPH ($V_\text {IRPH}$) and ICPH ($V_\text {ICPH}$) ³³. As shown in Figure 3a, when only $V_\text {IRPH}$ is applied (from 0 to 3.5 V), the resonant peak shows a 350 pm red-shift, and the ER remains at about 27 dB. While, when only applying $V_\text {ICPH}$, the resonant peak shifts and changes the ER from 27 dB to 1.25 dB (Figure 3b). Therefore, by adjusting both $V_\text {IRPH}$ and $V_\text {ICPH}$ and changing the ER of the resonant peak, we could realize an intensity modulation scheme, while the peak position is maintained without any wavelength shift. The flow diagram of the intensity-modulation algorithm is outlined in Figure 3c. First, a wavelength of interest ($\lambda $) is selected, which is achieved by applying a voltage to the IRPH ($V_\text {IRPH}^{\lambda }$). For information encoding, a “discrete analog” encoding/decoding scheme is employed to implement direct value mapping for translating the digital number to an analog value ^34,35. More details about the “discrete analog” scheme can be found in “Dot product engine” Section. Here, we apply a voltage to the ICPH ($V_\text {ICPH}^{w}$) to achieve a transmission at the resonant peak for representing the input values (w) according to the measured ER from Figure 3b. This causes a drift in the resonant peak to another wavelength $\lambda + \Delta \lambda $. To compensate for the unwanted drift, a wavelength offset scheme is employed using the photodetection mechanism of the IRPH. The voltage applied to the IRPH is swept from 0 to $V_\text {IRPH}^{\lambda }$ while $V_\text {ICPH}^{w}$ is kept fixed. the photocurrent generated in the IRPH at wavelength $\lambda $ is monitored to find the new resonant wavelength ($V_\text {IRPH}^{w}$). It is worth noting that changing the voltage applied to the IRPH offsets the wavelength shift and changes the ER simultaneously, although at much lower efficiency ($\Delta w<$ w). A calibration algorithm is used to tune $V_\text {ICPH}^{w}$ and minimize $\Delta w$. At last, the calibrated $V_\text {ICPH}^{w}$ and $V_\text {IRPH}^{w}$ voltage pair are stored for encoding the input value, w.

Normalized transmission spectra with discrete ER values of the resonant peak fixed at 1526.5 nm are depicted in Fig. 4a. By applying the voltage to the ICPH from 0 to 3.3 V followed by changing the corresponding voltage on the IRPH (obtained through the intensity-modulation algorithm), the ER of the resonant peak varies from -27 to -1.25 dB with no apparent wavelength shift. Figure 4b shows the calibrated output extracted from resonant peaks at 1526.5 nm in Fig. 4a with different $V_\text {ICPH}$ and $V_\text {IRPH}$ voltage pairs. It can be observed that when $V_\text {ICPH}$ is larger than 1.66 V, the transmitted light starts to increase because of the change of the coupling strength, and $V_\text {IRPH}$ drops from 3.46 to 0.13 V accordingly for wavelength offsetting. Figure 4c demonstrates the multi-level operation with 4-bit distinct output levels. The transmission returns to level 0 for erasure ($V_\text {ICPH}$ = 0 V and $V_\text {IRPH}$ = 3.46 V) between each level. Since the applied voltage and the transmitted output show a nonlinear relationship during the intensity modulation operation, a simple predistortion step is introduced by applying a series of voltage pairs with unequal increments to realize approximately linear output levels with an interval of $\sim $0.05 $\upmu $W in Fig. 4c. More precise linear distributions can be achieved by finely adjusting the voltage pairs for each power level. Three standard deviations (3$\sigma $) of 0.0048 $\upmu $W is observed in Fig. 4c (SNR = 10.4 dB); therefore 7.2-bit precision could be achieved. It has been reported that IRPH-based MRMs allow continuous, multichannel control of the weight bank with accuracy up to 8.5 bits ³⁶. By employing a “dithering” control scheme to monitor and stabilize the entire optical link, environmental changes in the system can be compensated with high precision ³⁶. By selecting predistorted voltage pairs applied to ICPH and IRPH, one can reliably move the transmitted power between these known intermediate levels with high repeatability (shown in Fig. 4(d) with a sampling rate of 1 Hz). Detailed information on the predistortion step can be found in Section S2 of the Supplementary Information.

Similarly, an add-drop IM-MRR filter with a radius of 15 $\upmu $m was also fabricated to serve as the positive and negative signed information encoding element. As shown in Fig 5a, another MZI coupler connecting add and drop ports of the IM-MRR is implemented, and both MZI couplers are designed symmetric to the ring with coupling gaps of 200 nm to reduce the IL ³⁷. Figure 5b presents through and drop ports transmission spectra. A reduced Q-factor of $\sim $1200 is observed compared to the all-pass one, due to the higher absorption loss in the resonator which contains the IRPH with twice the size. In Fig. 5c, IV curves are measured with no incident light. Normalized transmission spectra at through (blue) and drop (red) ports are measured by applying different $V_\text {ICPH}$ and $V_\text {IRPH}$ voltage pairs and are presented in Fig 5d along with the calibrated output values at resonance wavelength (Fig. 5e) following the procedure in Fig. 3c and “Characterization and control” Section. Measured weight values for the add-drop IM-MRR are plotted in Fig 5f. By subtracting the calibrated output power between drop and through ports, a weight range from -1 to 0.75 is obtained. The insertion loss of the modulator causes the 0.25-deduction of the maximum value at the positive side. Therefore, to get a symmetrical weight range relative to 0, the range of [-0.75, 0.75] is selected. Since multiplication can be interpreted as one input value being weighted by another, we define the mapping between voltage pairs and the transmitted power as the input mapping for the all-pass IM-MRM, and the weight mapping for the add-drop IM-MRR, respectively, in the rest of this paper.

Scalability

It is known that the number and the bandwidth of optical signals are restricted by the ability of the modulators to tune each wavelength channel independently ¹³. In WDM systems, two factors determine the maximum wavelength channel density, the FSR and the linewidth of the MRR. Namely, the resonator’s finesse ($\mathscr {F}$ = FSR/linewidth) sets the upper fan-in limit: $N\le \mathscr {F}$, where N is the number of wavelength channels. However, unlike demultiplexers, weight banks in the broadcast-and-weight architecture are reconfigurable, requiring independent modulations for every signal over a transmission range ¹³. In addition, weight banks contain two output ports (through and drop). Input signals carried by different wavelengths are proportionally multiplexed and demultiplexed by a series of add-drop MRRs in the weight banks, where the inter-channel crosstalk results from the overlap of the optical pass-bands of the modulators. Therefore, a wider wavelength channel spacing than the linewidth of MRRs is always necessary to minimize the crosstalk impact.

Impacts of crosstalk in MRR-based WDM systems have been numerically investigated, in which the level of crosstalk is specified by the level of isolation between adjacent channels ^38,39,40. As for the broadcast-and-weight architecture, we utilize a similar metric to allow for the crosstalk effect to be incorporated into the power budget of the weight banks. The metric is defined as the weighted power range for a signal when it is impaired by crosstalk relative to the range in the absence of crosstalk sources. It has been reported that a 3-dB power penalty caused by the crosstalk is observed in the weight banks tuned by wavelength modulation, when the minimum wavelength channel spacing falls between 3.41 and 4.61 times linewidth ¹³. Considering a finesse of 368 and a minimum wavelength channel spacing of 3.41 times linewidth, up to 108 wavelength channels can be supported in the broadcast-and-weight system ⁴¹. Finesse of 368 is experimentally demonstrated by a passive MRR with a radius of 1.5 $\upmu $m. However, it will be challenging to achieve with active modulation components ⁴². A more manufacturable MRR radius is 5 $\upmu $m for an active MRM with an acceptable estimated Q-factor of 10,000. Therefore, the resonant peak linewidth is around 153 pm, revealing that only 34 wavelength channels can be supported. As a result, a solution is required to address the scalability limit caused by the inter-channel crosstalk.

Intensity modulation at a fixed wavelength shows the potential for signal encoding with lower crosstalk penalty and higher wavelength channel density. Since the wavelength is kept fixed through the encoding process, no additional wavelength channel spacing is needed to accommodate wavelength drift. To investigate the scalability of the proposed IM-MRM system, we employed Lumerical’s tools for simulations ⁴³. A custom compact model for the IM-MRM was developed in Lumerical INTERCONNECT and used for the crosstalk penalty investigation. Figure. 6(a,c) show the transmission spectra of two types of cascaded IM-MRM systems, and the insets show the schematic of the cascaded rings. The first type is an all-pass MRM cascaded with an add-drop MRR filter (Type-I in Fig. 6a), which is used to investigate the inter-channel crosstalk between modulation banks and weight banks. For the second type, both MRMs are all-pass and cascaded in parallel (Type-II in Fig. 6c). It is used to investigate the inter-channel crosstalk in weight banks only. The wavelength channel spacing is normalized by the linewidth ($\delta \omega $ = $\mathscr {F}/N$ ¹³). This provides an objective comparison between MRM systems with varying linewidths and FSRs. By changing the ER of the resonant peak at Channel-1 (shown as Ch-1 in Fig. 6), the transmission at through (blue) and drop (red) ports at Channel-2 (shown as Ch-2 in Fig. 6) is passively changed due to the crosstalk phenomenon ($\delta \omega $ = 0.5 in both systems). It can be observed that the crosstalk impact in Fig. 6(a,c) are different. In Fig. 6a, the crosstalk penalty at Channel-2 is mainly due to the overlap of the Lorentzian-shaped pass-band at Channel-1, and only the drop port at Channel-2 is strongly affected. While, in Fig. 6c, the inter-channel crosstalk is more complex. Since these add-drop IM-MRMs share two parallel bus waveguides that are parallel-coupled to each MRM, a resonator-like coherent feedback path is created between the resonances of similar wavelengths. The coherent interaction is specifically severe when the resonances are closely spaced. It depends on the bus waveguide’s phase ¹³. The bus-waveguide-induced phase change affects the performance of two adjacent resonant peaks when the modulation method is based on wavelength shifting. For the intensity-modulation-based weight bank system proposed here, we only need to monitor the light intensity at individual resonance wavelengths where the coherent interference has a minimal impact on the amplitude ⁴⁴. Therefore, the transmission at the resonant peak wavelength is only determined by the ER. A detailed derivation of the coherent interference within multiple cascaded MRRs is presented in Section S3 of the Supplementary Information.

The 3-dB power penalty-based value mapping for two types of cascaded MRMs systems are depicted in Fig. 6(b,d) with $\delta \omega $ = 0.5 and 0.2, respectively, both of which present the attainable mapping range (red boxes in Fig. 6(b,d) close to 0.5. Unlike Fig. 6(a,c) which present the transmission spectra obtained by only applying voltage pairs to the IM-MRR at Channel-1, plots in Fig. 6(b,d) are obtained by modulating both IM-MRRs simultaneously. For Type-I systems, only positive values are considered that are obtained from the through port of the system. It requires $\delta \omega $ = 0.5 to allow both MRMs to be independently modulated between 0 and 0.48, as seen in Fig. 6b. For Type-II systems in Figure 6d, a symmetric weight range centred at 0 is investigated that is generated by the differential power between the drop and through ports at each wavelength channel. It allows a smaller wavelength channel spacing ($\delta \omega $ = 0.2) for the same 3-dB power penalty. Figure 6e shows the maximum mapping range as a function of the wavelength channel spacing for both types with $\delta \omega $ varying from 0.1 to 1. Detailed individual mapping plots for each wavelength channel spacing can be found in Section S4 of the Supplementary Information. Considering 3-dB as the tolerable power penalty benchmark, Type-I systems require $\delta \omega>$ 0.5 (blue), while Type-II systems require $\delta \omega $ = 0.2 (orange). Modulation banks can be broadband, such as electro-absorption modulators (EAMs) or electro-optic modulators (EOMs), which do not generate filtered pass-bands in transmission. By implementing broadband modulator as modulation banks, only Type-II crosstalk needs to be considered. The wavelength channel spacing is $\sim $17 times denser than the one from conventional wavelength-modulation-based MRMs (WM-MRMs) ^13,41 and hence, a tensor core with size of up to 578 can be realized. This demonstrates that the intensity modulation scheme in the MRM-based optical computing system improves the inter-channel crosstalk tolerance, and facilitates using more wavelength channels within one FSR.

It is worth mentioning that the noise due to signal beating may occur at the BPD readout when two resonant peaks are too close such that the difference of the optical frequencies is within the bandwidth of the detector. To eliminate this noise, a frequency filter is needed at the readout to remove the beating frequencies that are larger than the system’s sample rate. For the optical frequencies with a 5 GHz difference at the C-Band, the channel spacing is roughly 40 pm in wavelength. Considering $\delta \omega $ = 0.2 in Type-II system, the FWHM of each resonant peak is $\sim $200 pm. Therefore, to reduce the crosstalk caused by the beating signals with < 5 GHz, the Q-factor of the resonant peak needs to be lower than 7750.

Dot product engine

Co-packaging

A dot product computation system that includes a photonic chip with an all-pass IM-MRM and one add-drop IM-MRR weight filter is tested. The photonic chip is co-packaged on a custom PCB, leveraging photonic and electrical wire bonding for optical and electrical inputs/outputs (I/Os). We used a chip-on-board assembly method for co-packaging of the photonic chip. The three main steps for co-packaging are as follows. First, the photonic chip and V-grooves with single-mode fibers are directly mounted on a PCB substrate using UV curable epoxies. In the second step, photonic wire bonding is done using a photonic wire bonder. PWB is a state-of-the-art technique for implementing optical interconnects with polymer waveguides built by in-situ two-photon polymerization ^45,46, enabling flexible interconnection with a low insertion loss between different material platforms and components (III-V lasers ⁴⁷, SOAs ⁴⁸, optical fibers ⁴⁶ and silicon photonics chips). PWB avoids out-of-plane coupling; thus enabling dense optical I/Os with a pitch down to 25 $\upmu $m ⁴⁹. Furthermore, the PWB technique is fully automatic and has no active alignment requirements, making PWB suitable for mass production ⁴⁶. The third step is electrical wire bonding (EWB). At this step, the aluminum (Al) bond pads on the photonic chip are wire-bonded to the corresponding electroless nickel immersion gold pads on the PCB to implement chip-to-PCB electrical interconnects. An ultrasonic energy is used to attach an Al wire from the photonic chip pads to the PCB pads. We used the wedge-wedge bonder. More information for the co-packaging result can be found in Section S6 of the Supplementary Information.

Figure 7a shows the co-packaged photonic chip on the PCB. The solder under the photonic chip is exposed on the bottom side of the PCB for temperature control. A zoomed-in microscopic image in Fig. 7b presents the mounted photonic chip and V-grooves fiber arrays on the PCB, with optical and electrical I/Os to conduct the signal on and off the photonic chip. A dual-channel optical power meter is employed as the off-chip readout for O/E conversion and power subtraction. Figure 7c presents a series of magnified images focusing on photonic and electrical wire bonds, and the photonic chip. One input MRR (all-pass IM-MRM) for the input encoding and one weight MRR (add-drop IM-MRM) for the signal weighting are cascaded on the chip (Figure S8a), where electrical wires labelled with $V_\text {IRCH}$ and $V_\text {IRPH}$ are used to apply voltage pairs to each MRM. Drive and bias to the packaged photonic chip are provided by the source meter. When no voltage is applied, as-fabricated transmission spectra of the packaged photonic chip measured at through (blue) and drop (orange) ports are depicted in Fig. 7d. An insertion loss around -17.5 dB is observed (-7.5 dB insertion loss per PWB interface considering each IM-MRM has an insertion of -1.25 dB), which might be due to the under etching of the oxide opening at PWB interfaces and their rough sidewalls (see SEM images in Figure S11 of the Supplementary Information).

Crosstalk characterization

To experimentally evaluate the aforementioned inter-channel crosstalk, different voltage pairs are subsequently applied to both IM-MRMs in the co-packaged chips. Figure 8a shows the normalized transmission spectra at through (blue) and drop (red) ports. Two resonant peaks are observed in the through-port transmission and one in the drop-port transmission. Therefore, the left and right peaks are associated with the all-pass (Channel-1) and the add-drop (Channel-2) IM-MRR filters, respectively. The wavelength channel spacing is set to 0.7 nm ($\delta \omega $ = 0.5) by simply adjusting the $V_\text {IRPH}$ applied to the add-drop IM-MRR filter in Channel-2. By changing the voltage pair of the all-pass IM-MRM, the transmission intensity varies at Channel-1. At the same time, the drop-port transmission at Channel-2 changes accordingly due to crosstalk. As seen in Figure 8a, with different voltage pairs applied to the all-pass IM-MRM, the maximum normalized transmitted power at Channel-1 can reach to -3 dB, while under critical coupling condition at Channel-1, the insertion loss of the add-drop IM-MRR at Channel-2 drops 3 dB. Both observations are consistent with the simulation results in Fig. 6a.

We then applied different voltage pairs to the add-drop IM-MRR filter for generating the attainable mapping values at Channel-2 when different mapping values from 0 to 1 are pre-applied to the all-pass IM-MRM at Channel-1. The mapping value is represented by the detected intensity normalized by the maximum transmission (0 dBm). Figure 8b shows that the measured attainable mapping range for Channel-1 and Channel-2 is limited within [0, 0.5] (green square) due to the inter-channel 3-dB power penalty crosstalk. The variation in the color of dots in the plot represents the expected mapping value pre-applied to the all-pass IM-MRM at Channel-1. However, the measured value at Channel-1 (at the X-axis) is different from the predicted mapping value due to crosstalk. For example, for the dark blue dots (with an expected mapping value of 1.0 at Channel-1), the measured value at Channel-1 will equal 1.0 only when the value at Channel-2 is also set to 1.0. In this scenario, both MRMs represent 1.0 at each channel (no resonant peaks); thus, no inter-channel crosstalk exists. But when the measured value at Channel-2 is not 1.0, the measured value at Channel-1 will drop accordingly, even though the all-pass IM-MRM is consistently pre-applied with 1.0. In conclusion, the experimental data presented in Fig. 8 validates the reliability of the simulation using our custom compact model for IM-MRM and demonstrates that for $\delta \omega $ = 0.5, the power penalty between two adjacent channels is 3 dB.

Dot product calculation

We implement a 4-bit signed weighting system and 3-bit input encoding using the co-packaged chip for our proof-of-concept demonstration of dot product computation. We firstly characterize transmission levels for the all-pass IM-MRM (input MRR in Fig. 7c) and the add-drop IM-MRR filter (weight MRR in Figure 7(c)) with 3-bit and 4-bit precision, respectively, using the aforementioned intensity-modulation algorithm. Both of the IM-MRMs are intensity modulated at the same operational wavelength. Figure 9a illustrates 8 distinguishable power levels for the input MRR (solid blue line) and 16 distinguishable power levels for the weight MRR (solid orange line). The power levels for the weight MRRs are obtained after subtracting each power level between drop (dashed red line) and through (dashed green line) ports. For the input MRR, the power levels are distributed between 0 to 15 $\upmu $W while for the weight MRR, due to the IL, the maximum transmitted power at the drop port can only reach 11.25 $\upmu $W; thus limiting 16 power levels between 11.25 to -15 $\upmu $W. We employ the “discrete analog” scheme to encode inputs and weights to distinct power levels of the input MRR and the weight MRR, respectively ³⁵. By normalizing each quantized power level (solid lines in Figure 9a) by the maximum measured output power of 15 $\upmu $W, a point-to-point mapping is realized by endowing voltage pairs with digital information, which can be expressed as:

$$\begin{aligned} \begin{aligned} D = \frac{I_\text {volt pair}}{I_\text {max}}, \end{aligned} \end{aligned}$$

(3)

where, D is the correlated digital number, and $I_\text {volt pair}$ and $I_\text {max}$ are the voltage pair-applied transmission and the maximum transmission of the IM-MRM, respectively. For instance, the voltage pair that allows maximum transmission of the input MRR (15 $\upmu $W) is mapped to 1, and minimum transmission to 0. Therefore, voltage pairs for the 3-bit input MRR can realize the mapping of 8 discrete digital numbers ranging from 0 to 1, while for the 4-bit weight MRR, the 16 digital numbers can be mapped to the range of 0.75 to -1, due to the IL. Figure 9a shows that the power levels were not quantified uniformly as we expected especially for the Weight MRR’s output levels on the negative side, which we believe is due to localized thermal crosstalk and fluctuations. This is consistent with the fact that more applied voltages are required for negative weighting values in Fig. 5f, since more voltages mean more unwanted heat generation. However, the inhomogeneity is fully captured and reproduced in generated digital representations by exploiting the point-to-point mapping scheme. In terms of readout, decoding is required to convert the detected power at the BPD to a digital dot product result. To establish the decoding correlations, a look-up table is needed. In Fig. 9b, an experimental look-up table (128 points) for the co-packaged chip is generated by point-to-point applying digit-endowed voltage pairs to input and weight MRRs according to Figure 9a. The Y-axis shows the differential output power detected by the dual-channel optical power meter. The X-axis shows the dot product result of the mapped digital numbers to the co-packaged chip. The differential output power has an almost linear relationship with respect to dot product results with a first-order linear fit.

Figure. 9(c,d) demonstrate dot product computation using the proposed photonic chip. By randomly encoding digital numbers, selected from power levels in Figure 9a, as the input and the weight to the photonic chip, dot product results are obtained by decoding the readout according using the look-up table and compared with the expected dot product result. The evaluation result of 1000 dot product operations is shown in Figure 9c, with the inset showing a histogram of the error with an MSE of 8.11 $\times $ 10$^{-4}$. Since we only selected existing power levels from two IM-MRMs as inputs and weights, applied voltage pairs are precise; thus, computing errors are mainly from the system fluctuation. Then, dot product computation with random decimal floating-point number is also performed using the proposed photonic chip. Two random decimal floating-point numbers ranging within [0, 1] and [-1, 1] are firstly generated and rounded to the nearest digital number according to power levels in Fig. 9a, then encoded into the system using voltage pairs for the closest digital number. As for decoding, the fitted curve in the look-up table is employed. Figure 9d shows the evaluation result of 1000 dot product operations with decimal floating-point numbers. The inset shows a histogram of the error with an MSE of 3.09 $\times $ 10$^{-3}$. Additional rounding errors and fitting errors can explain the increased MSE during encoding and decoding processes, which can be reduced by improving the bit-precision of the system.

Performing convolution

We develop a CNN simulator for MNIST handwritten digit recognition, following the DEAP-CNN system described in Ref. ⁴¹, so as to evaluate the computational performance of the proposed IM-MRM system with limited channel spacing and also to compare the performance of different signal modulation schemes (IM-MRMs vs. WM-MRMs). The CNN starts with two convolutional layers, each of which with 8 kernels of size 3 $\times $ 3 with Rectified Linear Unit (ReLU) non-linear activation function. The convolutional layers are followed by an average pooling layer and, finally, the last two layers of the network are fully-connected layers.

To develop the IM-MRM-based CNN, as shown in Fig. 10a, the input laser beams are first multiplexed using WDM. Then the multiplexed signal is split into separate input channels in parallel with each channel realizing one of the filters. In our developed network 9 all-pass IM-MRMs are cascaded on each input channel, which can be intensity modulated individually to represent a subset of 9 pixels from the input image (28 $\times $ 28 pixels). Considering the 8 different filters of the convolution layer, the convolution part of our architecture comprises 8 input channels, which result to the total of 72 all-pass IM-MRMs that serve as modulation banks for input data encoding. Each input channel is followed by 9 add-drop IM-MRMs, representing the 3 $\times $ 3 kernel to weight the loaded wavelength-multiplex signal from modulation banks, i.e., inputs. Each MRM in weight banks is intensity-modulated at a unique wavelength, in line with operational wavelengths in modulation banks. Finally, output signals of weight banks are accumulated and converted into electrical signals by BPDs at the output of each input channel. Each single convolved pixel is then obtained by adding all electrical signals together through voltage adders. The output of each of the convolutional layer goes through an offline ReLU activation function followed by an average pooling layer with a 2 $\times $ 2 kernel. At the end, two fully-connected layers, with ReLU and Softmax non-linear activation functions, are added to complete the CNN architecture. We would like to note that the weights and biases of the developed CNN are obtained by offline training of the network on a Graphics Processing Unit (GPU) using TensorFlow (Python), while the number of epochs and the batch size are set to 10 and 32, respectively.

Transmission responses of 9 cascaded add-drop IM-MRMs are simulated using the aforementioned custom compact model in Lumerical INTERCONNECT. By evenly adjusting the radius of each MRM, from 15 to 16.6 $\upmu $m, output powers at through (blue) and drop (red) ports are calculated and presented in dashed curves in Fig 10b. The linewidth-normalized channel spacing is set to 0.5 between adjacent wavelength channels to introduce the power penalty crosstalk. In this scenario, the transmission response of each MRM represents the maximum value of 1. From Fig. 6d, we notice that when the mapping value in Channel-1 is set to 1, there is no room for modulating the mapping value in Channel-2. Therefore, to allow a [-0.5, 0.5] mapping range for each wavelength channel, all MRMs need to be pre-adjusted. Solid curves in Fig. 10b show the transmission spectra after pre-adjustment. Nine resonant peaks appear with the mapping value set to 0.5, which is realized by adding the voltage pair to each model. In Fig. 10c, attainable mapping ranges for nine cascaded IM-MRMs are presented as a function of applied voltage pairs by simulation. Compared with the measured voltage pair, simulation models do not involve the thermal crosstalk; thus resulting in different voltage pair results. A common mapping range of [-0.5, 0.5] is selected (red area) for all wavelength channels to realize the 3-dB power penalty in the system, although a larger range of [-0.75, 0.75] is available. Since for Type-II systems (orange curve in Fig. 6e), the maximum attainable value is 0.75 for $\delta \omega $ = 0.5. While the maximum possible range of only 0.5 is achieved for Type-I systems (blue curve in Fig. 6e), which eventually limits the system’s common range to [-0.5, 0.5] when connecting modulation banks with weight banks.

Frequency-domain simulation in Lumerical INTERCONNECT uses scattering data analysis to calculate the overall circuit response. It is done by solving a sparse matrix that represents the circuit as connected scattering matrices, each representing the frequency response of a single element. ⁵⁰. Having 72 IM-MRM elements in both modulation and weight banks separately would result in massive computation to obtain a single transmission response. Furthermore, considering 28 $\times $ 28 pixels in each MNIST input image, convolution calculations using the proposed simulation system in Lumerical INTERCONNECT will be time-consuming by continuous loading of 3 $\times $ 3 input subsets in strides of 1, let alone more than 10,000 MNIST handwritten images are employed for inference. A co-simulation pipeline using Lumerical Application Programming Interface (API) in a Python environment was developed to speed up the CNN simulation. The API can be used for developing scripts or programs via Python that treat Lumerical solvers as clients, and enable users to perform custom analyses, undertake enhanced optimization and visualization, produce plots, and automate complex workflows. Detailed information on the co-simulation pipeline can be found in Section S5 of the Supplementary Information.

By loading the kernel to the Python-Lumerical simulation pipeline, the CNN simulator performs the MNIST recognition task. For every input subset, normalized data is implemented to modulation banks by adjusting the transmission intensity of each all-pass IM-MRM. Then, the signal-encoded light is intensity-modulated by weight banks. Since the same kernel is applied across the set of inputs, weight banks are modified only when a new kernel is loaded. A single convolved pixel output is obtained at the readout in the electrical domain. A voltage is then added to represent the bias. Prediction accuracies versus different encoding precision of the MNIST recognition task solved by the proposed CNN simulator are shown in Fig. 10d. It reveals that with a limited wavelength channel spacing ($\delta \omega $ = 0.5) and mapping range of [-0.5, 0.5], performance for the proposed CNN architecture using IM-MRMs as weight banks can admit > 96.76$\%$ prediction accuracy when applying a 6-bit encoding precision or more. Individual results of the MNIST recognition task with each precision can be found in Figure S10 of the Supplementary Information.

Discussion and conclusion

Compared to digital accelerators in the latest CMOS processes, optical or optoelectronic MAC operations in SOI-based PIC platforms can process signals with lower latency and higher throughput ⁵¹. The optical link budget and total energy efficiency are discussed in this section to shed light on the advantages and challenges of implementing MAC operations using our proposed photonic processing unit with IM-MRMs.

In order to analyze the optical link budget and energy efficiency, an $N \times M$ vector-matrix multiplication system is implemented using the proposed IM-MRMs. A frequency comb source is utilized to provide multiple wavelengths from $\lambda _1$ to $\lambda _\text {N}$. The multiwavelength light that includes N carrier wavelengths is modulated by an array of high-speed EAMs or EOMs for the input vector encoding and then coupled to the photonic chip. The modulated input is split into M branches and weighted by weight banks. Each weight bank array contains N add-drop MRR filters. After weight banks, M receivers collect the multiwavelength signals and convert them to electrical information for memory access. All-pass IM-MRMs in the modulation bank are replaced with EAMs or EOMs, since the thermo-optic tuning bandwidth of the photoconductive heater (PH) is around 175 kHz ⁵². As a results, (i) the channel density can be further improved ($\delta \omega $ = 0.2 in Fig. 6e), and thus more wavelength channels can be realized ($\sim $580 wavelengths channels for 3-dB power penalty); (ii) the energy consumption can be reduced and (iii) the modulation speed can be enhanced using high-speed EAMs or EOMs. Recently, a high-speed evanescent-coupled Ge waveguide EAM has been proposed with simple fabrication processes on the SOI platform and modulation speed of 56 GHz and dynamic power consumption of 45 fJ/bit ⁵³. The schematic of the $N \times M$ photonic processing system is illustrated in Fig. 11. The optical link budget is calculated based on the following equation ⁵¹:

$$\begin{aligned} P_\text {out} = P_\text {laser} - P_\text {EAM} - P_\text {coupling} - P_\text {Si-prop} - P_\text {splitter} - P_\text {IL-MRM} - (N-1)P_\text {OBL-MRM} - P_\text {crosstalk} - P_\text {penalty}, \end{aligned}$$

(4)

where $P_\text {laser}$ is the optical power of the laser, $P_\text {EAM}$ is the insertion loss of the EAM, $P_\text {coupling}$ is the loss introduced by the attenuation of the single-mode optical fiber (SMF) and the fiber-to-chip coupling loss, $P_\text {Si-prop}$ is the silicon waveguide attenuation, $P_\text {splitter}$ is the splitter insertion and excess loss, $ P_\text {IL-MRM}$ is the insertion loss of the add-drop IM-MRR filter at the input vector wavelength, $ P_\text {crosstalk}$ is the inter-channel crosstalk penalty, and $P_\text {OBL-MRM}$ is the out-of-band insertion loss when the resonant peak is not aligned with input wavelengths, and $P_\text {penalty}$ is the network penalty due to extinction ratio, crosstalk, and laser relative intensity noise (RIN). Considering $P_\text {laser}$ = 0 dBm, $P_\text {coupling}$ = 1.6 dB ⁵¹, $P_\text {EAM}$ = 6.2 dB ⁵³, $P_\text {Si-prop}$ = 2.5 dB/cm $\times $ $3(N-1)L$ where L in centimeters is the waveguide spacing between the two MRMs, $P_\text {splitter}$ = 3.3 $\times $ log$_2M$ dB, $ P_\text {IL-MRM}$ = 1.25 dB, $ P_\text {crosstalk}$ = 3 dB, $P_\text {OBL-MRM}$ = 0.01 dB ⁵¹, and $P_\text {penalty}$ = 4.8 dB ⁵¹, the optical link budget as a function of the dimensions of the proposed photonic processing system is calculated and plotted in Fig. 12a. The total length of the waveguide is approximated as the length spanning the optical depth of the system, where the distance between two adjacent MRMs is set to 50 $\upmu $m to minimize the thermal crosstalk. As can be seen in Fig. 12a, the number of MRMs (N) in each array does not have a huge impact on the link budget, since only one MRM aligned with the input wavelength contributes 1.25-dB to the insertion loss, but the remaining out-of-band MRMs each contribute 0.01 dB. For M weight bank arrays, the attenuation introduced due to log$_2M$ splitters degrades the optical power and poses a limitation on the energy efficiency. In the optical link budget calculation, we estimate around 60 dB of loss for a 512 $\times $ 512 tensor core based on the proposed IM-MRR system. The loss can be compensated by either increasing the laser’s output power or adding SOAs to the system. It is worth mentioning that whole SOAs can pre-amplify the input signal, the non-linear gain-current curve would need to be calibration for ⁵¹.

The total energy consumption of the proposed system to overcome the shot noise and the capacitance of the PD at the output with a fixed n-bit precision can be expressed by the following equation:

$$\begin{aligned}{} & {} P_\text {total} = N (\frac{P_\text {laser}}{\rho _\text {SOA}}) + \frac{N}{M}\frac{hv}{\eta }\text {max}(2^{2n+1}, \frac{C_\text {d}V_\text {r}}{e})f_\text {mod} + N(E_\text {EAM}\times n)f_\text {mod} \nonumber \\{} & {} \quad \quad \quad \,\,+ NM\times P_\text {MRM} + N \times k P_\text {SOA} + M \times P_\text {rec}, \end{aligned}$$

(5)

where, hv is the photon energy for a center wavelength of 1550 nm, $C_\text {d}$ = 2.4 fF and $V_\text {r}$ = 1 V are the capacitance and driving voltage of the PD, respectively, and $\eta $ is the quantum efficiency of the detector ($\eta _\text {PD}$ = 80%), laser ($\eta _\text {laser}$ = 20%) and the optical loss through the $N \times M$ system. Considering the noise equivalent power of 0.214 pW/$\sqrt{HZ}$ at the readout, the total required power for 8.5-bit precision needs to be more than -22.6 dB with the sample rate of 5 GS/s. Based on the calculations done in Fig. 12a, this limits the size of the system to 20 $\times $ 20. Hence, SOAs are required in the system to compensate for the loss. $\rho _\text {SOA}$ is the efficiency enhancement due to an SOA ($\rho _\text {SOA} = 10^{G/10}$, where G is the gain of the SOA ⁵¹). The modulation speed of the system is denoted by $f_\text {mod}$. The energy associated with signal modulation and detection is dominated by EAMs ($E_\text {EAM}$ = 45 fJ/bit), add-drop IM-MRMs ($P_\text {MRM}$ = $I_\text {IRPH}V_\text {IRPH}$+$I_\text {ICPH}V_\text {ICPH}$) and ADCs ($P_\text {rec}\approx $ 200 mW). Assuming the system’s sample rate is 5 GS/s limited by ADCs and DACs with the input laser power of 10 dBm and an SOA gain of 17 dB ⁵⁴, the total energy efficiency (Joule per MAC) and total power consumption of an $N \times M$ photonic processing system with a fixed 8.5-bit precision are calculated and plotted in Fig. 12b. To enable 512 $\times $ 512 implementation, two SOAs are required in each channel ($k = 2$ in Equation 5). To clarify, the calculation does not include the power consumed by the temperature control system, which consumes 36 W in operation. The energy efficiency (dashed blue curve) increases with matrix size due to the fact that more MAC operations can be handled simultaneously in larger systems. It has been observed that photonic tensor cores outperform electronic counterparts as the matrix size exceeds 500 ⁵⁵. However, the total power consumption (dashed red curve) surges simultaneously since add-drop IM-MRR filters for weight banks consume considerable power ($P_\text {MRM}$ = $\sim $13.3 mW). One of the solutions is replacing the PH with low power index-modulation components, such as non-volatile PCMs in the weight bank. For PCMs, such as GST ⁵⁶ and GSST ⁵⁷, the refractive index (n) and the extinction coefficient (k) increases when the material changes its phase from amorphous to crystalline. Hence, they generate high absorption losses. Recently, a wide bandgap PCM Sb$_2$S$_3$ that allows for a strong optical phase modulation and low optical loss was experimentally demonstrated on the SOI platform at 1550 nm, with a figure of merit (FOM = $\Delta n/\Delta k$) of 10.8 ⁵⁸, which can be a potential candidate for our future design.

In summary, we have demonstrated a novel photonic processing unit using the intensity-modulation-based MRM. By utilizing an unbalanced MZI serving as a tunable coupler to couple the light into the ring resonator, the ER of the resonant peak can be directly modulated by changing its coupling strength. To lock the resonant peak at a fixed wavelength during modulation, another index modulator is implemented in the resonator to offset the wavelength shift during the ER modulation. Using the intensity modulation scheme at a fixed wavelength, we demonstrated that the wavelength channel spacing with 3-dB power penalty crosstalk tolerance is 17-fold denser than the conventional wavelength-modulated counterpart. As a result of the increased channel density, our proposed system allows up to 578 wavelength channels with a 3-dB power penalty when weight banks are designed using IM-MRMs with a radius of 5 $\upmu $m. A photonic dot product core was presented for the proof-of-concept demonstration. The photonic chip, containing one IM-MRM serving as the input encoder and one IM-MRM serving as the weight encoder, was integrated on a PCB through an electrical wire bonding/photonic wire bonding (EWB/PWB) co-packaging technique. Applying the “discrete analog” encoding/decoding scheme, a 3-bit input and 4-bit signed weighting dot product computation was realized in the optical domain. One thousand random decimal floating-point dot product results showed an MSE 3.09 $\times $ 10$^{-3}$, experimentally demonstrating the capability of our proposed IM-MZMs for optical information processing.

Future work will focus on: (i) optimizing the IM-MRM design to reduce the thermal crosstalk between the ICPH and the IRPH, such as moving the IRPH to the opposite side of the MRR and using a separate ground; (ii) replacing photoconductive heaters with non-volatile PCMs in the IM-MRR filter for weighting; (iii) decreasing the energy consumption and improving the modulation bandwidth of the system by replacing modulation banks with directly modulated lasers or high-speed EAMs and EOMs.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

Tait, A. N. (2021) Quantifying power use in silicon photonic neural networks. arXiv preprint arXiv:2108.04819
Thompson, N. C., Greenewald, K., Lee, K. & Manso, G. F. (2020) The computational limits of deep learning. arXiv preprint arXiv:2007.05558.
Shastri, B. J. et al. Photonics for artificial intelligence and neuromorphic computing. Nat. Photonics 15, 102–114 (2021).
Article ADS CAS Google Scholar
Estakhri, N. M., Edwards, B. & Engheta, N. Inverse-designed metastructures that solve equations. Science 363, 1333–1338 (2019).
Article ADS MATH Google Scholar
Xu, X.-Y. et al. A scalable photonic computer solving the subset sum problem. Sci. Adv. 6, eaay5853 (2020).
Article ADS Google Scholar
Zhang, W. & Yao, J. Photonic integrated field-programmable disk array signal processor. Nat. Communi. 11, 1–9 (2020).
ADS Google Scholar
Salmani, M., Eshaghi, A., Luan, E. & Saha, S. (2021) Photonic computing to accelerate data processing in wireless communications. arXiv preprint arXiv:2103.07406.
Shen, Y. et al. Deep learning with coherent nanophotonic circuits. Nat. Photonics 11, 441–446 (2017).
Article ADS CAS Google Scholar
Zhang, H. et al. An optical neural chip for implementing complex-valued neural network. Nat. Commun. 12, 1–11 (2021).
ADS Google Scholar
Harris, N. C. et al. Quantum transport simulations in a programmable nanophotonic processor. Nat. Photonics 11, 447–452 (2017).
Article ADS CAS Google Scholar
Pérez-López, D., Sánchez, E. & Capmany, J. Programmable true time delay lines using integrated waveguide meshes. J. Lightwave Technol. 36, 4591–4601 (2018).
Article ADS Google Scholar
Shokraneh, F., Nezami, M. S. & Liboiron-Ladouceur, O. Theoretical and experimental analysis of a 4 $\times $ 4 reconfigurable MZI-based linear optical processor. J. Lightwave Technol. 38, 1258–1267 (2020).
Article ADS CAS Google Scholar
Tait, A. N. (2018) Silicon Photonic Neural Networks. Ph.D. thesis, Princeton University, Princeton.
Tait, A. N., Nahmias, M. A., Shastri, B. J. & Prucnal, P. R. Broadcast and weight: An integrated network for scalable photonic spike processing. J. Lightwave Technol. 32, 4029–4041 (2014).
Article Google Scholar
Tait, A. N. et al. Neuromorphic photonic networks using silicon photonic weight banks. Sci. Rep. 7, 1–10 (2017).
Article CAS Google Scholar
Huang, C. et al. Demonstration of scalable microring weight bank control for large-scale photonic integrated circuits. APL Photonics 5, 040803 (2020).
Article ADS Google Scholar
Blow, E. C. et al. Broadband radio-frequency signal processing with neuromorphic photonics. In AI and Optical Data Sciences III Vol. 12019 157–162 (SPIE, Bellingham, 2022).
Google Scholar
Peng, H.-T. et al. (2021) A photonic-circuits-inspired compact network: Toward real-time wireless signal classification at the edge. arXiv preprint arXiv:2106.13865.
Huang, C. et al. A silicon photonic-electronic neural network for fibre nonlinearity compensation. Nat. Electron. 4, 837–844 (2021).
Article CAS Google Scholar
Prucnal, P. R., Shastri, B. J. & Teich, M. C. Neuromorphic photonics (CRC Press, Boca Raton, 2017).
Book Google Scholar
Feldmann, J. et al. Parallel convolutional processing using an integrated photonic tensor core. Nature 589, 52–58 (2021).
Article ADS CAS Google Scholar
Popović, M. (2008) Theory and design of High-index-contrast Microphotonic Circuits. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge.
Sacher, W. et al. Coupling modulation of microrings at rates beyond the linewidth limit. Optics Express 21, 9722–9733 (2013).
Article ADS CAS Google Scholar
Hai, M. S., Fard, M. M. P. & Liboiron-Ladouceur, O. A ring-based 25 Gb/s DAC-less PAM-4 modulator. IEEE J. Sel. Top. Quantum Electron. 22, 123–130 (2016).
Article ADS Google Scholar
Shan, W. et al. Broadband continuously tunable microwave photonic delay line based on cascaded silicon microrings. Opt. Express 29, 3375–3385 (2021).
Article ADS CAS Google Scholar
Preble, S. F. et al. On-chip quantum interference from a single silicon ring-resonator source. Phys. Rev. Appl. 4, 021001 (2015).
Article ADS Google Scholar
Shoman, H. et al. Compact wavelength-and bandwidth-tunable microring modulator. Opt. express 27, 26661–26675 (2019).
Article ADS CAS Google Scholar
Morichetti, F. et al. Polarization-transparent silicon photonic add-drop multiplexer with wideband hitless tuneability. Nat. Commun. 12, 1–7 (2021).
Article ADS Google Scholar
Jayatilleka, H. et al. Wavelength tuning and stabilization of microring-based filters using silicon in-resonator photoconductive heaters. Opt. Express 23, 25084–25097 (2015).
Article ADS CAS Google Scholar
Poulton, C. V., Dong, P. & Chen, Y.-K. (2015) Photoresistive microring heater with resonance control loop. In CLEO: Science and Innovations SM2I–3. Optical Society of America, Washington.
Zhang, Y., Li, Y., Feng, S. & Poon, A. W. Towards adaptively tuned silicon microring resonators for optical networks-on-chip applications. IEEE J. Sel. Topics Quantum Electron. 20, 136–149 (2014).
Article ADS Google Scholar
Chrostowski, L. et al. (2021) A silicon photonic evanescent-field sensor architecture using a fixed-wavelength laser. In Optical Interconnects XXI, vol 11692. pp. 116920, International Society for Optics and Photonics, Bellingham.
Luan, E., Saha, S., Semnani, B., Salmani, M. & Eshaghi, A. (2021) Interferometric coupling-based modulator for large-scale integrated photonic systems. In 2021 Conf. on Lasers and Electro-Optics Europe & European Quantum Electronics Conference (CLEO/Europe-EQEC), IEEE, NY.
Marquez, B. A. et al. Photonic pattern reconstruction enabled by on-chip online learning and inference. J. Phys. Photonics 3, 024006 (2021).
Article ADS Google Scholar
Guo, Z. (2021) Photonic Tensor Machine and Multi-level Encoding and Decoding in Wavelength-Multiplexed Photonic Processors. Ph.D. thesis, Queen’s University, Canada.
Zhang, W. et al. Microring weight banks control beyond 8.5-bits accuracy. arXiv preprint arXiv:2104.01164 (2021).
Geuzebroek, D. H. & Driessen, A. Ring-resonator-based wavelength filters. In Wavelength filters in fibre optics 341–379 (Springer, Singapore, 2006).
Chapter Google Scholar
Preston, K., Sherwood-Droz, N., Levy, J. S. & Lipson, M. Performance guidelines for WDM interconnects based on silicon microring resonators. In CLEO: 2011-Laser Science to Photonic Applications 1–2 (IEEE, Manhattan, 2011).
Google Scholar
Dong, P. et al. 1x4 reconfigurable demultiplexing filter based on free-standing silicon racetrack resonators. Optics Express 18, 24504–24509 (2010).
Article ADS CAS Google Scholar
Jayatilleka, H. et al. Crosstalk in SOI microring resonator-based filters. J. Lightwave Technol. 34, 2886–2896 (2016).
Article ADS CAS Google Scholar
Bangari, V. et al. Digital electronics and analog photonics for convolutional neural networks (DEAP-CNNs). IEEE J. Sel. Top. Quantum Electron. 26, 1–13 (2019).
Article Google Scholar
Xu, Q., Fattal, D. & Beausoleil, R. G. Silicon microring resonators with 1.5-$\upmu $m radius. Opt. Express 16, 4309–4315 (2008).
Article ADS Google Scholar
Ansys-Lumerical. (2022) Lumerical. https://www.lumerical.com/ (Accessed: 18 January).
Tait, A. N. et al. Microring weight banks. IEEE J. Sel. Top. Quantum Electron. 22, 312–325 (2016).
Article ADS Google Scholar
Lindenmann, N. et al. Photonic wire bonding: A novel concept for chip-scale interconnects. Optics Express 20, 17667–17677 (2012).
Article ADS CAS Google Scholar
Lindenmann, N. et al. Connecting silicon photonic circuits to multicore fibers by photonic wire bonding. J. Lightwave Technol. 33, 755–760 (2014).
Article ADS Google Scholar
Billah, M. R. et al. Hybrid integration of silicon photonics circuits and InP lasers by photonic wire bonding. Optica 5, 876–883 (2018).
Article ADS CAS Google Scholar
Xu, Y. et al. InP/silicon hybrid external-cavity lasers (ECL) using photonic wirebonds as coupling elements. In Optical Fiber Communication Conference M4H-6 (Optical Society of America, Washington, 2020).
Google Scholar
Blaicher, M. et al. Hybrid multi-chip assembly of optical communication engines by in situ 3D nano-lithography. Light Sci. Appl. 9, 1–11 (2020).
Article Google Scholar
Chrostowski, L. & Hochberg, M. Silicon photonics design: From devices to systems (Cambridge University Press, Cambridge, 2015).
Book Google Scholar
Al-Qadasi, M., Chrostowski, L., Shastri, B. & Shekhar, S. Scaling up silicon photonic-based accelerators: Challenges and opportunities. APL Photonics 7, 020902 (2022).
Article ADS CAS Google Scholar
Jayatilleka, H., Shoman, H., Chrostowski, L. & Shekhar, S. Photoconductive heaters enable control of large-scale silicon photonic ring resonator circuits. Optica 6, 84–91 (2019).
Article ADS CAS Google Scholar
Liu, Z. et al. 56 Gbps high-speed Ge electro-absorption modulator. Photonics Res. 8, 1648–1652 (2020).
Article CAS Google Scholar
Hui, R. Introduction to Fiber-Optic Communications (Academic Press, Cambridge, 2019).
Google Scholar
Nahmias, M. A. et al. Photonic multiply-accumulate operations for neural networks. IEEE J. Sel. Topics Quantum Electron. 26, 1–18 (2019).
Article Google Scholar
Zheng, J. et al. GST-on-silicon hybrid nanophotonic integrated circuits: A non-volatile quasi-continuously reprogrammable platform. Opt. Mater. Express 8, 1551–1561 (2018).
Article ADS CAS Google Scholar
Zhang, Y. et al. Broadband transparent optical phase change materials for high-performance nonvolatile photonics. Nat. Commun. 10, 1–9 (2019).
ADS Google Scholar
Fang, Z. et al. Non-volatile reconfigurable integrated photonics enabled by broadband low-loss phase change material. Adv. Opt. Mater. 9, 2002049 (2021).
Article CAS Google Scholar

Download references

Author information

Authors and Affiliations

Huawei Technologies Canada Co., Ltd. 19 Allstate Parkway, Markham, Ontario, L3R 5A4, Canada
Enxiao Luan, Mahsa Salmani & Armaghan Eshaghi
Department of Electrical and Computer Engineering, The University of British Columbia, 2332 Main Mall, Vancouver, British Columbia, V6T 1Z4, Canada
Enxiao Luan, Shangxuan Yu, Mohammadreza Sanadgol Nezami & Lukas Chrostowski
Department of Physics, Engineering Physics & Astronomy, Queen’s University, Kingston, Ontario, KL7 3N6, Canada
Bhavin J. Shastri

Authors

Enxiao Luan
View author publications
You can also search for this author in PubMed Google Scholar
Shangxuan Yu
View author publications
You can also search for this author in PubMed Google Scholar
Mahsa Salmani
View author publications
You can also search for this author in PubMed Google Scholar
Mohammadreza Sanadgol Nezami
View author publications
You can also search for this author in PubMed Google Scholar
Bhavin J. Shastri
View author publications
You can also search for this author in PubMed Google Scholar
Lukas Chrostowski
View author publications
You can also search for this author in PubMed Google Scholar
Armaghan Eshaghi
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

E.L. conceived the project and designed the research, provided, and analyzed the data and wrote the paper; M.S. provided CNN simulation pipeline and analyzed the simulation data; S.Y. performed and helped in the experiments, and analyzed the experimental data; S.Y. and M.S.N. performed the chip post-fabrication and co-packaging; S.Y., M.S., and M.S.N. provided advice and edited the paper. B.J.S., L.C., and A.E. supervised research and edited the paper. All authors contributed to the article and approved the submitted version.

Corresponding author

Correspondence to Enxiao Luan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Luan, E., Yu, S., Salmani, M. et al. Towards a high-density photonic tensor core enabled by intensity-modulated microrings and photonic wire bonding. Sci Rep 13, 1260 (2023). https://doi.org/10.1038/s41598-023-27724-y

Download citation

Received: 26 September 2022
Accepted: 06 January 2023
Published: 23 January 2023
DOI: https://doi.org/10.1038/s41598-023-27724-y

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.