Hardware Error Correction for MZI-Based Matrix Computation

With the rapid development of artificial intelligence, the electronic system has fallen short of providing the needed computation speed. It is believed that silicon-based optoelectronic computation may be a solution, where Mach–Zehnder interferometer (MZI)-based matrix computation is the key due to its advantages of simple implementation and easy integration on a silicon wafer, but one of the concerns is the precision of the MZI method in the actual computation. This paper will identify the main hardware error sources of MZI-based matrix computation, summarize the available hardware error correction methods from the perspective of the entire MZI meshes and a single MZI device, and propose a new architecture that will largely improve the precision of MZI-based matrix computation without increasing the size of the MZI’s mesh, which may lead to a fast and accurate optoelectronic computing system.


Introduction
Benefiting from the era of big data, the rapid growth of the Internet provides sufficient training datasets for functioning artificial intelligence (AI), which has also become a hot spot in the current technological revolution. However, artificial intelligence has conventionally relied on electronic processors, with its computing power being greatly determined by transistor numbers and capabilities; managing the massive amount of data necessitates additional computing resources. In the post-Moore era, the increasing transistor numbers can no longer keep up with the computing power demand of artificial intelligence, and electronic computation is therefore stuck in a bottleneck [1][2][3]. On the other hand, photons are bosons, which have the advantages of a higher transmission speed and can work well with electrons through the Einstein coefficients, which is conducive to the realization of ultra-high-speed optoelectronic computing. Consequently, more and more researchers are beginning to focus on optoelectronic computing [4][5][6][7][8][9].
The fabrication techniques for silicon photonics are advancing rapidly, enabling the creation of large-scale and complicated circuits and paving the way for low-loss and lowcost optoelectronic devices [10]. These devices can be produced using complementary metal-oxide-semiconductor (CMOS) fabs [11]. As a result, silicon-based optoelectronic computation has emerged as a prominent area of research [3]. At present, optoelectronic matrix computation mainly includes three implementation methods [12]: multi-plane light conversion (MPLC), the Mach-Zehnder interferometer method (MZI), and wavelength division multiplexing (WDM). The Mach-Zehnder interferometer method is simple and easy to integrate. It is one of the best choices to implement optoelectronic matrix computation [13].
The optoelectronic matrix computation implemented by the Mach-Zehnder interferometer is a kind of analog computation, in which computational precision is the most important and should be treated with special attention. However, the performance of silicon photonic devices is sensitive to the influence of environmental disturbances [14][15][16][17], fabrication [15,18], and device aging [15,19], including MZI devices. Particularly, the hardware errors generated by fabrication are the focus of research [20,21], which result in deviations between the matrix implemented by MZI and the actual needed matrix. The hardware errors involve beam splitter errors [20], phase errors [22], and errors caused by optical loss differences [23]. The hardware errors will accumulate with the increase of the scale of the MZI meshes, which limits the implementation of large-scale matrix computation based on MZI meshes. In order to overcome this limitation of hardware error, especially beam splitter error, researchers have proposed different MZI meshes or different MZI devices with higher precision.
In this paper, the Mach-Zehnder interferometer method of achieving optoelectronic matrix computation is introduced in Section 2. The origins of the hardware error of the Mach-Zehnder interferometer are analyzed in Section 3, followed by a synthesis of the studies of MZI hardware error correction in Section 4. Finally, a feasible approach to mitigate the MZI hardware error is proposed in Section 5.

Methods
In 1994, Reck et al. [24] proposed one type of triangular mesh based on the Mach-Zehnder interferometer and demonstrated that the unitary matrix transformation of any finite dimension could be implemented by MZI devices. They decomposed the Ndimensional unitary matrix into a series of two-dimensional unitary matrices, which were presented as a triangular mesh formed by the arrangement of MZIs, as shown in Figure 1. A common Mach-Zehnder interferometer consists of an external phase shifter (ϕ), an internal phase shifter (θ), and two 50:50 beam splitters (directional couplers (DC)) or multimode interference couplers (MMI)), as shown in Figure 2. For an MZI device, transmission matrix S is where P 1 and P 3 are the transmission matrices of the external phase shifter and the internal phase shifter, respectively; C 2 and C 4 are the transmission matrices of the beam splitter on the left side of the MZI and the beam splitter on the right side of the MZI, respectively.
Among them, the left and right beam splitters are considered as ideal 50:50 beam splitters. Hence, MZI transmission matrix S can be written as [25] where ϕ and θ are phases produced by the external phase shifter and the internal phase shifter, respectively. Obviously, we can produce different phases through these two phase shifters to implement two-dimensional unitary matrices with different elements. For an N-dimensional unitary matrix, the MZI's transmission matrix can be generalized as T n,m (θ, ϕ): where n and m represent the transmission matrix of the MZI between the nth and mth input ports of the signal entering the mesh. The dimension of an N-dimensional unitary matrix, U(N), can be reduced by right multiplying T n,m (θ, ϕ), namely, The N-dimensional unitary matrix's dimension can be continued to be reduced according to the above dimensionality reduction method, and finally, a diagonal matrix, D, with elements of modulo 1 is obtained. Let It can be seen from the previous theoretical derivation that the final implementation of the N-dimensional unitary matrix will be a triangular mesh with N-1 MZIs in the N row and N-2 MZIs in the N-1 row. Based on the above theory, in 2017, Shen et al. [26] experimentally demonstrated a cascaded mesh of 56 programmable MZIs with this triangular mesh, which improved the computational speed and power efficiency compared with that of traditional electronic processors.

Hardware Error
The N-dimensional unitary matrix implemented by MZI meshes has some deviations from the needed matrix in actual computation. On the one hand, the triangular mesh of the Reck design leads to an inconsistent optical loss of each output, and the larger mesh is, the greater the optical loss difference of each output, and the higher the total losses. On the other hand, more importantly, the beam splitter in MZI cannot implement the ideal 50:50 beam splitter due to the hardware error generated by the fabrication and the non-uniformity of the material, which leads to the deviation of the transmission matrix and affects the final computation.
The non-ideal 50:50 beam splitter transmission matrix can be expressed as follows: where, when ϕ = π/4, it is the ideal 50:50 beam splitter; ϕ will generally deviate from π/4 for the non-ideal 50:50 beam splitter, which leads to the deviation of the MZI transmission matrix. For the transmission matrix of the beam splitter on the left side of the MZI and the beam splitter on the right side of the MZI, ϕ can be expressed as ϕ 1 and ϕ 2 , respectively, and then MZI transmission matrix S can be expressed as Obviously, the MZI's transmission matrix changes because of two non-ideal beam splitters, which is inconsistent with the transmission matrix in the ideal case, which leads to the deviation between the N-dimensional unitary matrix implemented by the MZI meshes and the theoretical one.

Error Correction
In terms of the hardware error in the N-dimensional unitary matrix transformation implemented by the MZI meshes introduced in Section 3, researchers have made a lot of effort. This section will introduce the main hardware error correction methods for (1) the entire MZI meshes and (2) a single MZI device.

Rectangular Mesh
In 2016, Clements et al. [23] proposed a rectangular mesh, which was an improvement of the triangular mesh of Reck, as shown in Figure 3. Compared to the triangular mesh of Reck, the rectangular mesh has higher symmetry and a lower optical loss difference of each output. Additionally, the longest optical depth of the rectangular mesh is about half that of the triangular mesh, and the total optical loss is only half that of the triangular mesh. The improvement of the rectangular mesh is specifically manifested in the process of decomposition of the N-dimensional unitary matrix into a series of two-dimensional unitary matrices. The decomposition process does not reduce the dimensions by right multiplying T n,m (θ, ϕ), but by both right multiplying T n,m (θ, ϕ) and left multiplying T −1 n,m (θ, ϕ), as shown in Figure 4 and seen in the following: that is, For a matrix T n,m (θ, ϕ) and a diagonal matrix D, there are matrix T −1 n,m (θ, ϕ) and another diagonal matrix D , with T −1 n,m (θ, ϕ) D = D T n,m (θ, ϕ), and we obtain a rightmultiplication dimensionality reduction operation similar to that in Equation (7).
Shokraneh et al. [27] experimentally proved that due to the asymmetric distribution of MZIs in the triangular mesh, optical loss is greater on the triangular mesh than on the rectangular mesh in the computation process. They used a dataset that was perfectly classifiable to assess the classification performance of the two meshes in optical neural networks (ONN). Compared to the triangular mesh, the rectangular mesh is more phaseerror-tolerant and loss-tolerant. Thus, the rectangular mesh is commonly adopted as the fundamental unit for constructing ONN in various research studies [28][29][30]. However, the beam splitter errors will still impact the computed result of the rectangular mesh.

Fourier Structure
In order to solve hardware errors introduced by the non-ideal 50:50 beam splitter, Lopez-Pastor et al. [31] presented one kind of mesh composed of Fourier transforms and phase masks, which can implement the unitary matrix transformation of any finite dimension.
For an N-dimensional unitary matrix, the above Equation (13) can be written as can be decomposed by the diagonal matrix, permutation matrix and circulant matrix, and the decomposition of the unitary matrix can be obtained as follows: where A (i) and B (i) are and, Finally, The N-dimensional unitary matrix, U(N), can be decomposed into a product of phase masks and Fourier transforms. As shown in Figure 5, the gray rounded rectangles represent the Fourier transforms, and the colored rectangles represent the phase-mask diagonal matrices. Only two diagonal matrices per layer (denoted by red and yellow rectangles) depend on the unitary matrix being implemented, while the rest (denoted by blue rectangles) are fixed. In this structure, the MZI of each layer in the mesh is decomposed into the form of Fourier transforms and phase masks, and the Fourier transforms can be implemented by MMI. This mesh avoids the use of a large number of beam splitters, reduces the source of error to a certain extent, and is more conducive to the realization of a large-scale mesh for optoelectronic computation. However, this decomposition method is not as simple and easy to implement as the triangular mesh and rectangular mesh methods are. Additionally, an even-dimension unitary matrix requires 6N DFTs (the discrete Fourier transform) and 6N + 1 controllable phase masks, and thus the size of the mesh will also significantly increase.

Redundant Rectangular Mesh
Pai et al. [32] proposed two mesh improvements. The first is adding redundant tunable layers in the rectangular Mesh, called the redundant rectangular mesh (RRM), which can accelerate the optimization process of the mesh to implement the unitary matrix. The second is adding a redundant mesh composed of low-loss waveguide crossings or MZIs with fixed cross-state phase shifts, called the permuting rectangular mesh (PRM). As shown in Figure 6, Figure 6a is a schematic diagram of the RRM, where the green part represents the redundant tunable layers, and Figure 6b shows the PRM, where the gray part represents the additional low-loss waveguide crossings or MZIs with fixed cross-state phase shifts. This method of adding additional MZIs in the mesh increases the tunable degrees of freedom in the MZI's mesh. For a given unitary matrix, there is a supersaturated implementation schemes. Some unitary matrices that cannot be realized due to MZI imperfections can be realized by new equivalent schemes brought by the extra degrees of freedom in the redundant mesh [33]. However, the redundant mesh increases the size of the MZI meshes, which inevitably leads to an increase in the optical loss of the entire mesh. In a small mesh, the increased optical loss is within an acceptable range, and the computational precision of the MZI meshes is better improved.

A Single MZI Device
In addition to the above improvements in the MZI's mesh, more researchers focus on the improvement of a single MZI device. The improvement of a single MZI device is mainly carried out by adding redundant phase shifters and beam splitters to improve the tunable degrees of freedom of the MZI, so as to correct its hardware errors.
Miller et al. [35] presented a double Mach-Zehnder interferometer (DMZI), as shown in Figure 8. Two MZIs were used as tunable beam splitters to replace the two splitters on the left and right of the common MZI, forming a new MZI architecture. Now, we will illustrate the error correction process of DMZI. For instance, considering the MZI on the left, we should adjust its beam splitters to a 50:50 split. The transmission matrix of two beam splitters in this MZI can be expressed as P a and P b : where, R a and R b represent errors of the left beam splitter and the right beam splitter, respectively. The transmission matrix, P L , of the MZI as a beam splitter can be expressed as This can be rewritten as where, When we input optical signals, E 1 , from only one port of MZI, the output optical power, P out , of one of the two output ports is Thus, if this MZI is used as a 50:50 beam splitter, P 3 = 1 2 |E 1 | 2 ; hence, from Equation (25) considering cos 2 θ ≤ 1; hence, from Equation (26) which can be derived to give Therefore, the split ratios of the fabricated power from 85:15 to 15:85 can be compensated by adjusting the split ratio of the two tunable beam splitters back to 50:50. Two redundant beam splitters and two redundant phase shifters can completely correct the transmission matrix of MZI. Wilkes et al. [36] proposed a configuration algorithm for this DMZI, eventually achieving a 60 dB extinction ratio.
However, the size of the MZI also increases significantly. The increase in optical loss brought about by redundant beam splitters and phase shifters is also a problem to be considered in large-scale MZI meshes. Moreover, the transmission matrix of a real MZI as a beam splitter is different from that of the ideal one; when the ideal MZI is used as a 50:50 beam splitter, its transmission matrix R, from Equation (3), is: where, x = cos(ϕ 1 − ϕ 2 ); y = cos(ϕ 1 + ϕ 2 ); s = sin(θ/2); c = cos(θ/2). The corresponding output optical phase of the four elements in the matrix R can be represented on a complex plane, as shown in Figure 9. For a real MZI, when it is used as a 50:50 beam splitter, its transmission matrix R , from Equation (10), is where, x = cos(ϕ 1 − ϕ 2 ); y = cos(ϕ 1 + ϕ 2 ); s = sin(θ/2); c = cos(θ/2). The corresponding output optical phase of the four elements in the matrix R can be represented on a complex plane, as shown in Figure 10.  By comparison with Figures 11 and 12, it is evident that for a MZI with a beam splitter that is not an ideal 50:50 beam splitter, when it is used as a 50:50 beam splitter, the phase of the output optical signal will be altered, thus ultimately impacting the interference result in the subsequent optical path of the mesh and thus the final computed result.  Based on the above works, Hamerly et al. [20] proposed two MZI architectures (Figures 11 and 12). One is of a three-splitter MZI, which can correct generic errors and achieve a full range of split ratios. To realize the full range of the split ratio, it changes the position of "forbidden regions" caused by the error of the beam splitter, which are some unitary matrices that cannot be implemented because of the beam splitter error. The "forbidden regions" are displaced away from the cross state, rather than being eliminated completely. However, similarly to the architecture proposed by Suzuki, it does not completely correct the transmission matrix of the MZI. Furthermore, this MZI architecture does not incorporate an external phase shifter. Therefore, when it is formed into a mesh, it cannot correct the phase error from the previous layeR s MZIs.
The other one is MZI + crossing, which can only correct correlated device errors. Because the errors of the right and left beam splitters are consistent, the added cross waveguide rotates the "forbidden regions" by 180 • to achieve complete extinction. Thus, this architecture has bandwidth tolerance. However, due to the errors of the right and left beam splitters being usually inconsistent, this architecture only exhibits good bandwidth tolerance but cannot correct the beam splitter error and eliminate limitations to matrix computation. Compared to the Suzuki design and Miller design, these two MZI architectures are smaller in size and do not add redundant phase shifters, but the hardware error correction is not good enough.
Bandyopadhyay et al. [21] used redundant phase shifters to correct hardware errors and proposed a method of adding phase shifters to two ports of the output end of an MZI to locally correct the hardware error within an individual MZI. In this method, no additional beam splitters are added, and the increase in the size of a single device is small, as shown in Figure 13. For the MZI with non-ideal 50:50 beam splitters, the transmission matrix S , from Equation (10), can be rewritten as: To implement a desired unitary S (Equation (3)), we should find the θ , ϕ of each S (θ , ϕ ) = S. The condition to each S (θ , ϕ ) = S produces the following expression for θ : Therefore, the beam splitter errors restrict θ to the range Assuming that θ is in this range, the transmission matrix's S (θ , ϕ ), from Equation (31), can be rewritten as S = ie iθ /2 e iϕ e iδ a sin(θ/2) e iδ b cos(θ/2) e iϕ e iδ c cos(θ/2) −e iδ d sin(θ/2) , where δ a , δ b , δ c and δ d are phase errors the elements of S (θ , ϕ ), and for the unitary matrix requiring that δ a + δ d = δ b + δ c , we can correct those phase errors from Equation (36) to set S (θ , ϕ ) as equal to S. In accordance with Equation (36), the architecture shown in Figure 13 can be obtained. For different unitary matrices, corresponding error correction procedures must be implemented, and the correction method is complicated. It is important to note that this architecture can only correct the phase error. Although the transmission matrix is corrected, θ is restricted, meaning that some matrices cannot be implemented accurately. When θ is not in this range, there is a deviation between the desired matrix and the actual matrix.

Discussion
To overcome hardware errors in MZI meshes during the implementation of matrix computation, the primary strategies include increasing the tunable degrees of freedom of MZI meshes. This involves adding redundant beam splitters and phase shifters. However, incorporating a redundant mesh and improving a single MZI device may result in an increase in the size of the MZI mesh, as shown in Table 1. Therefore, it is crucial to develop a MZI mesh that can perform high-precision computation without increasing the device's size to achieve large-scale matrix computation.
In 2001, a study report [37] was conducted on tunable multimode interference couplers (MMI). By changing the position of the four-fold images of MMI out of the phase shift of the four fields, the split ratio of the two-fold images was controlled, which was then realized in the InP material. In 2008, May-Arrioja et al. [38] used local electrical modulation to control the split ratio, and in addition, changed the phase of the two-fold images [39,40] to realize the arbitrary split ratio of the self-image. A thermally modulated MMI using polymer materials was also introduced [41]. In 2019, Perez et al. [42] reported a thermally modulated dual-drive directional coupler (DD-DC) and experimentally proved that it can realize an arbitrary split ratio. The size of the tunable beam splitters proposed in these studies is relatively large, and some beam splitters are not integrated on a silicon platform.
However, these studies provide us with a novel perspective; by applying thermal or electrical modulation interference to the local optical field of the static beam splitter, we can also create a tunable beam splitter on a silicon platform without needing to increase its size. Here, we propose a new MZI architecture, where we replace the 50:50 beam splitter in the common MZI with a tunable DC/MMI that takes into account hardware error correction and mesh size, as shown in Figure 14. This improves the computational precision of the MZI's mesh without increasing its size. By adjusting the split ratios of the tunable DC/MMI, we can eliminate its split ratio deviations from 50:50. For instance, for a tunable MMI, when the phase of optical fields at the position of the four-fold images in the multimode interferometer is altered, it affects the interference results of the optical fields, culminating in changes in the intensity of the two images at the two-fold image position. This enables the adjustment of the beam split ratios. The tunable MMI has the theoretical capability to adjust split ratios from 100:0 to 0:100, enabling the adjustment of any fabricated power split ratios in the physical 2 × 2 MMI to the ideal 50:50 split. This correction completely eliminates beam splitter errors and enables high-precision MZI-based matrix computation.

Conclusions
To summarize, this paper introduces the method of using a MZI's mesh to implement any finite dimensional unitary matrix transformation, along with an analysis of the effects of hardware errors during matrix computation based on the MZI. Addressing MZIs' hardware errors is crucial to achieving large-scale matrix computation. To eliminate the hardware error of an MZI, various improvement works have been carried out on the entire MZI mesh and a single MZI device. It is important to note that correcting these errors can lead to an increase in an MZI's mesh size, which is also a significant concern. The tradeoff between hardware error correction and mesh size requires more innovative works in artificial intelligence, materials, optics, device manufacturing and other related fields.
In this paper, a new MZI architecture is proposed which replaces the 50:50 beam splitter in the common MZI with the tunable DC/MMI. This MZI is designed to take into account both hardware error correction and mesh size concerns. The tunable beam splitterbased MZI provides a new approach for more accurate large-scale matrix computation based on the MZI's mesh.