Multidimensional fiber echo state network analogue

Optical neuromorphic technologies enable neural network-based signal processing through specifically designed hardware and may confer advantages in speed and energy efficiency. However, advances of such technologies in bandwidth and/or dimensionality are often limited by the constraints of the underlying material. Optical fiber presents a well-studied low-cost solution with unique advantages for low-loss high-speed signal processing. The fiber echo state network analogue (FESNA), a fiber-based neuromorphic processor, was the first technology suitable for multichannel, high bandwidth (including THz) and dual-quadrature signal processing. Here we propose multidimensional FESNA (MD-FESNA) processing, which utilizes multi-mode fiber non-linearity. The developed MD-FESNA is thus the first neuromorphic technology that augments all the aforementioned advantages of FESNA with multidimensional spatio-temporal processing. We demonstrate the performance and flexibility of the technology on prediction tasks for hyperchaotic systems. These results pave the way for high-speed neuromorphic processing of multidimensional tasks and for hardware for spatio-temporal neural networks, and open new application avenues for fiber-based spatio-temporal multiplexing.


Introduction
Optical neuromorphic computing has attracted increasing attention over the past decades [1,2]. An important advantage of the technology is its ability to execute a neural network (NN) in the optical domain, providing a significant enhancement in speed. Among a variety of approaches [3][4][5], the reservoir computing (RC) architecture enables an optical NN implementation with relaxed training requirements. In particular, the echo state network (ESN) [6], a type of RC, has been realised on various computing platforms [7,8]. At the same time there is a surge in developments aimed at increasing the operational bandwidth of neuromorphic systems. Recently, optical ESNs utilizing a semiconductor laser [9] and silicon photonics [10] as the reservoir technology have been developed for processing signals of 20 GHz and 40 GHz bandwidth, respectively. We recently introduced a fiber-based neuromorphic technology, the fiber ESN analogue (FESNA) [11], which realizes signal mixing and the neural response by utilizing inherent fiber properties: dispersion and the Kerr non-linearity. This enabled dual-quadrature [11], high bandwidth (including THz) [12,13] and multi-channel [14] neuromorphic signal processing for the first time.
On the other hand, many applications in data analysis and machine learning, such as object recognition or decision making, require processing of multidimensional data. The increased complexity of the analysis also has an immediate effect on the speed, energy efficiency and complexity of the processor. It is therefore beneficial to incorporate more physical dimensions into the related neuromorphic hardware. Indeed, this is of high importance for many applications, including optical communications, emerging 6G and the Internet of Things (IoT), where space division multiplexing (SDM) is employed to enhance the transmission rate and requires multidimensional signal equalization to compensate for non-linear impairments [15][16][17][18], as well as smart fiber-based technologies, including lasers and sensors [19][20][21][22].
Moreover, many complex dynamical effects and systems can be analysed only by processing all dimensions simultaneously. One example is hyperchaos [23,24], which occurs only in multidimensional systems. Furthermore, the increasing interest in spatio-temporal NNs brings with it an interest in spatio-temporal neuromorphic platforms.
The development of fibers with multiple spatial modes and with properties engineered on demand [15,16,25] makes it possible to apply multi-mode fibers to neuromorphic computing. In particular, a multi-core fiber-based neuromorphic technology has already been demonstrated, in which the cores were utilized as virtual neurons and the strength of the synaptic interactions was varied by amplifiers [26].
Here we focus on a different task: incorporating spatial fiber modes (cores can be used similarly) to process all signal dimensions simultaneously, which makes it possible to solve complex machine learning problems. As in FESNA [11], we use temporal multiplexing to realize virtual neurons, and we augment it with spatial multiplexing to realize multidimensional signal interactions.
We show that by incorporating a multi-mode fiber (MMF) we can achieve multidimensional neuromorphic processing: multidimensional FESNA (MD-FESNA). This is beneficial for solving complex multidimensional tasks, which is demonstrated here numerically on the example of hyperchaotic systems. Moreover, MD-FESNA incorporates spatio-temporal dimensions, which are crucial for many effects and applications, including spatio-temporal NNs. Thus, MD-FESNA is the first multidimensional spatio-temporal neuromorphic technology.

Design
Here we describe the design of an optical neuromorphic architecture based on a multi-mode fiber for multidimensional signal processing. The design is based on the ESN, as it enables NN realisation with one non-linear element and a feedback loop (see figure 1(a)). A typical ESN comprises the following stages of signal processing:
(i) signal mixing through random weight matrices $W_{in}$ and $W$, which multiply the input $u_n$ and feedback $x_n$ signals, respectively: $x^{(i)}_{n+1} = W_{in} u_n + W x_n$. In the multidimensional case the matrices incorporate mixing among all signal dimensions.
(ii) non-linear transformation: $x^{(ii)}_{n+1} = f(x^{(i)}_{n+1})$, where $f$ is usually a sigmoid or tanh function. The resulting signal is then fed back for the subsequent mixing, and a copy is also collected at the receiver to form the state matrix $X$.
The state matrix is then used to obtain the optimal output weight matrix $W_{out}$ through linear regression: $y = W_{out} X$. Thus, the only optimized element is the output weight matrix, and its optimization requires only a linear regression. This property is an important advantage of the ESN over other NNs, whose training requires complex optimization algorithms.
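The two ESN stages and the linear-regression readout described above can be sketched in a few lines of NumPy. This is a minimal real-valued illustration, not the configuration used for the results reported here. The function names, the spectral-radius rescaling of $W$ (a common ESN practice, not stated in the text) and the ridge parameter are assumptions made for the example.

```python
import numpy as np

def make_weights(n_in, n_res, rng, rho=0.9):
    """Random W_in and W drawn uniformly from [-0.5, 0.5].

    Rescaling W to spectral radius `rho` is an assumption of this sketch.
    """
    W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(u, W_in, W, f=np.tanh):
    """Drive the reservoir with inputs u of shape (T, n_in).

    Returns the state matrix X of shape (n_res, T).
    """
    n_res = W.shape[0]
    x = np.zeros(n_res)
    X = np.zeros((n_res, u.shape[0]))
    for n in range(u.shape[0]):
        x_mixed = W_in @ u[n] + W @ x   # stage (i): signal mixing
        x = f(x_mixed)                  # stage (ii): non-linear transformation
        X[:, n] = x                     # copy collected into the state matrix
    return X

def train_readout(X, Y, ridge=1e-6):
    """Ridge regression for W_out in y = W_out X, with Y of shape (n_out, T)."""
    A = X @ X.T + ridge * np.eye(X.shape[0])
    return np.linalg.solve(A, X @ Y.T).T

# One-step-ahead prediction on a toy 4-dimensional signal (illustrative only).
rng = np.random.default_rng(0)
t = np.arange(400)[:, None]
u = 0.2 * np.sin(0.05 * t * np.arange(1, 5))
W_in, W = make_weights(4, 64, rng)
X = run_reservoir(u[:-1], W_in, W)
W_out = train_readout(X, u[1:].T)
prediction = W_out @ X
```

Only `W_out` is fitted; `W_in` and `W` stay fixed after being drawn, which is what makes the training a single linear solve.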
Since the first design of the fiber ESN analogue (FESNA) [11], where only the non-linear transformation was realised optically, we have shown that both the signal mixing and the non-linear transformation can be realised optically, using fiber dispersion and non-linearity, respectively (DM-FESNA) [13]. Unlike in conventional neuromorphic systems, here we assume the optical input signal u to be multidimensional: either the result of multidimensional digital data encoded into an analog multi-mode signal [15,16,27,28] or a multi-mode signal originating from communication systems (e.g. SDM). As shown in figure 1(c), each input data dimension or signal mode is mapped to the corresponding mode of the MMF (in the illustration there are 4 data/signal dimensions and, correspondingly, 4 MMF modes).
One of the advantages of utilizing fiber properties for both stages (i)-(ii) is that it enables high bandwidth signal processing [13]. Moreover, such a design enables the first multichannel neuromorphic processing, which is of high importance for many applications, in particular optical communications. Here we demonstrate that, augmented by a multi-mode fiber, it achieves simultaneous multidimensional signal processing: multidimensional FESNA (MD-FESNA) (see figure 1(b)). We can also significantly reduce the complexity of previous realisations, as MD-FESNA can be realised using one spool of fiber and only one pump. The setup follows the stages outlined above for the ESN:
(i) To achieve signal mixing we utilize fiber dispersion. To control the amount and strength of the mixing, one can change the accumulated dispersion D of the dispersion compensating fiber (DCF). Dispersion induces temporal mixing between signal samples, each of which is a multidimensional signal formed by a combination of the modes (see figure 1(c)).
(ii) To realize the non-linear transformation we use the inherent Kerr non-linearity of the fiber, which in the multi-mode case is described by the Manakov equation for M modes [29][30][31]. Each mode p = 1..M is transformed under the deterministic distortions described by the fiber losses α, the second-order dispersion $\beta_2$ and the non-linearity coefficient γ. The equation captures both a) weak and b) strong coupling regimes, with non-linear coupling coefficients a) $\kappa_{pp} = 8/9$, $\kappa_{pq} = 4/3$ and b) $\kappa = \kappa_{pp} = \kappa_{pq} = 8M/(3(2M+1))$. As an example, we use strong coupling in what follows.
The resulting effect can be explained as follows. The Kerr non-linearity induces a non-linear phase shift that depends on the signal power (dispersion is neglected here for explanation purposes only). By coupling the signal with the pump, $x^{(i)} + \xi$, one can achieve signal-pump beating, which in the limit of a strong pump (compared to the signal) results in harmonic oscillations of $x^{(ii)}$. In the previous FESNA designs we used parameters and a loop mirror design to achieve a sine transformation, with the aim of approximating the tanh function typically used for the ESN. However, in most cases this is not necessary and, as we show below, there is broad flexibility in the choice of setup parameters. Thus, the Kerr non-linearity results in mode mixing and a non-linear transformation, the properties of which can be governed by the pump and fiber parameters (see figure 1(b, c)).
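For reference, one common way of writing the coupled Manakov equations that is consistent with the coefficients quoted above is sketched below. The field envelopes $A_p$, the propagation coordinate $z$, the retarded time $t$ and the sign conventions are assumptions made here for illustration and may differ from those of [29][30][31].

```latex
\frac{\partial A_p}{\partial z}
  = -\frac{\alpha}{2}\,A_p
    - i\,\frac{\beta_2}{2}\,\frac{\partial^2 A_p}{\partial t^2}
    + i\,\gamma \Bigl( \kappa_{pp}\,|A_p|^2
                     + \sum_{q \neq p} \kappa_{pq}\,|A_q|^2 \Bigr) A_p ,
\qquad p = 1,\dots,M .
```

With the loss and dispersion terms dropped, the mode powers are conserved and this form reduces to a pure phase rotation over a fiber length $L$, $A_p(L) = A_p(0)\exp\bigl(i\gamma L\,(\kappa_{pp}|A_p|^2 + \sum_{q\neq p}\kappa_{pq}|A_q|^2)\bigr)$, which is the power-dependent non-linear phase shift discussed in the text.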
In MD-FESNA, unlike the ESN design above, we do not require mixing of modes in the signal mixing stage (thus, reducing dimensionality of the weight matrices to mixing of the signal samples in time only) as the signal modes will mix at the non-linear stage naturally through non-linearity.
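The two optical stages can be illustrated with a simplified numerical sketch. It is a lumped model under stated assumptions, not the simulation used for the results here. Dispersion is applied as an all-pass quadratic spectral phase (sign convention assumed), and the Kerr stage is lossless and dispersion-free with the strong-coupling coefficient. The pump ξ is simply added to every mode, and all units are arbitrary. All function names are hypothetical.

```python
import numpy as np

def strong_coupling_kappa(M):
    """Strong-coupling Manakov coefficient 8M / (3(2M + 1)) for M modes."""
    return 8.0 * M / (3.0 * (2 * M + 1))

def dispersion_mixing(A, D, dt):
    """Stage (i): temporal mixing of samples by accumulated dispersion D.

    A has shape (M, T): M modes, T time samples. Each mode is multiplied by
    the spectral phase exp(i D w^2 / 2); modes are not mixed at this stage.
    """
    omega = 2.0 * np.pi * np.fft.fftfreq(A.shape[1], d=dt)
    H = np.exp(0.5j * D * omega ** 2)
    return np.fft.ifft(np.fft.fft(A, axis=1) * H, axis=1)

def kerr_transform(A, pump, gamma, length, kappa):
    """Stage (ii): lumped Kerr phase shift with a co-propagating pump.

    In the strong-coupling regime every mode sees the same non-linear phase,
    proportional to the total power summed over all modes, which is what
    couples the modes to each other.
    """
    field = A + pump                                   # signal-pump coupling
    total_power = np.sum(np.abs(field) ** 2, axis=0, keepdims=True)
    return field * np.exp(1j * gamma * kappa * length * total_power)

# Two modes carrying weak signals next to a strong pump (arbitrary units).
rng = np.random.default_rng(0)
signal = 0.1 * (rng.normal(size=(2, 128)) + 1j * rng.normal(size=(2, 128)))
mixed = dispersion_mixing(signal, D=0.5, dt=1.0)
out = kerr_transform(mixed, pump=3.0, gamma=1.0, length=0.2,
                     kappa=strong_coupling_kappa(2))
```

Because the phase in `kerr_transform` depends on the power summed over modes, a change in one mode alters the output of the other. This is why no inter-mode mixing is needed in `dispersion_mixing`.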

Performance and discussion
To illustrate the performance of the proposed setup we test it on a prediction task for multidimensional chaotic systems. We consider hyperchaotic systems because they demonstrate the impact of mutually interconnected dimensions, which must be processed simultaneously. In particular, we use a well-studied hyperchaotic system, the 4D Rössler system [23,24], which in its standard form reads
$\dot{x} = -y - z$, $\dot{y} = x + a y + w$, $\dot{z} = b + x z$, $\dot{w} = -c z + d w$. (4)
The plane projection is plotted in figure 2(a) for a standard set of parameters, a = 0.25, b = 3, c = 0.5, d = 0.05, and initial conditions $x_0 = -10$, $y_0 = 6$, $z_0 = 0$, $w_0 = 10$. The equations are simulated by the Runge-Kutta method with step $10^{-4}$, then downsampled with sampling rate 16 and normalized to an average power of 0.05 to form the data for processing. Here we utilize two signal quadratures (real and imaginary) and two modes.
For benchmarking we compare the performance to that of a standard ESN architecture with four dimensions and the following parameters: reservoir size N = 256, numbers of training and testing samples $N_{tr} = 1000$ and $N_{test} = 2000$, a tanh activation function, and weights chosen randomly from the intervals [−0.5...0.5] and [−0.5i...0.5i] for all quadratures and modes. We evaluate the performance in terms of the mean squared error (MSE) normalized to the average signal power. The maximum MSE after ESN processing is $MSE_{ESN} = -20.5$ dB, compared to $MSE_{LR} = -8.5$ dB for a standard ridge linear regression. Note that in the ESN the mode mixing in matrix W is essential: without it the performance is comparable to that of linear regression. This highlights that processing the data dimensions as separate data streams is equivalent to linear regression, so a multidimensional approach is essential.
Table 1 gives the MD-FESNA parameter sets for a standard MMF and for a highly non-linear fiber (HNLF) [34,35]; parameters for the latter are given in brackets alongside the corresponding parameters for the standard MMF. HNLFs offer higher non-linearity (resulting in shorter fiber lengths), yet they have higher attenuation.
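The data preparation and the error metric described above can be sketched as follows. The right-hand side is the standard form of the 4D Rössler hyperchaotic system and is assumed to match the system used here. The integrator is a classical fourth-order Runge-Kutta scheme, and "average power" is taken as the sample mean of the squared amplitudes summed over the four dimensions. "Sampling rate 16" is read as keeping every 16th sample. These readings, and all function names, are assumptions of the sketch.

```python
import numpy as np

def rossler4d(s, a=0.25, b=3.0, c=0.5, d=0.05):
    """Right-hand side of the standard 4D Rossler hyperchaotic system."""
    x, y, z, w = s
    return np.array([-y - z, x + a * y + w, b + x * z, -c * z + d * w])

def rk4_trajectory(s0, n_steps, dt=1e-4):
    """Integrate with the classical 4th-order Runge-Kutta method."""
    s = np.asarray(s0, dtype=float)
    out = np.empty((n_steps, s.size))
    for k in range(n_steps):
        k1 = rossler4d(s)
        k2 = rossler4d(s + 0.5 * dt * k1)
        k3 = rossler4d(s + 0.5 * dt * k2)
        k4 = rossler4d(s + dt * k3)
        s = s + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        out[k] = s
    return out

def prepare_data(traj, step=16, avg_power=0.05):
    """Keep every `step`-th sample and rescale to the target average power."""
    data = traj[::step]
    power = np.mean(np.sum(data ** 2, axis=1))
    return data * np.sqrt(avg_power / power)

def nmse_db(prediction, target):
    """MSE normalized to the average signal power, in dB."""
    err = np.mean(np.abs(prediction - target) ** 2)
    ref = np.mean(np.abs(target) ** 2)
    return 10.0 * np.log10(err / ref)

# Short run from the initial conditions quoted in the text.
trajectory = rk4_trajectory([-10.0, 6.0, 0.0, 10.0], n_steps=3200)
dataset = prepare_data(trajectory)
```

One way to use two quadratures and two modes would be to map the four columns of `dataset` onto the real and imaginary parts of two complex mode amplitudes. The text does not specify which variable goes to which quadrature.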
The corresponding MSE is plotted in figure 2(b) as a function of pump power for different sets of MMF parameters. The HNLF and standard optical communications fiber scenarios demonstrate similar performance. In the latter case, however, higher pump powers are required to achieve peak performance, which is due to the longer fiber lengths (see Sets 2 and 3). The achieved minimum MSE is also larger than in the HNLF case, which is due to additional dispersion-induced signal-signal mixing that manifests itself at longer fiber lengths (see Sets 2 and 3). Even in the case of smaller non-linearity (Set 1) one can achieve high performance compared to linear regression, while Set 2 gives a result comparable to that of the ESN. Increasing the MMF length (Set 3) improves the performance further, surpassing that of the ESN with the tanh activation function. This can be explained by the choice of activation function: although tanh is commonly used in the ESN and other NNs, it is not necessarily the optimum function. The optimization of the pump power shows that the optimum performance is achieved at a pump power of around 5 dBm, with the corresponding MSE significantly better than that of linear regression.
Next we compare the performance of the proposed setup, with the same set of parameters, on a prediction task based on the 3D hyperchaotic Hénon map [36]: $x_{n+1} = a - y_n^2 - b z_n$; $y_{n+1} = x_n$; $z_{n+1} = y_n$.
The plane projection is plotted in figure 2(c) for a standard set of parameters, a = 1.9, b = 0.03, and initial conditions $x_0 = 0.1$, $y_0 = 0.2$, $z_0 = -0.1$. The chaotic 2D Hénon map is often used for benchmarking neuromorphic systems [37]. Here we utilize three modes and a single signal quadrature, the signal amplitude. We compare the performance to that of the standard ESN architecture with three dimensions and the same parameters as in the example above. The maximum MSE after the ESN processing is $MSE_{ESN} = -59$ dB, while linear regression returns $MSE_{LR} = -0.9$ dB.
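The 3D Hénon map above is straightforward to iterate. The sketch below generates a sequence with the quoted parameters and initial conditions. The function name and the array layout (one row per iterate, columns x, y, z) are choices made for this illustration.

```python
import numpy as np

def henon3d(n_samples, a=1.9, b=0.03, s0=(0.1, 0.2, -0.1)):
    """Iterate x_{n+1} = a - y_n^2 - b z_n, y_{n+1} = x_n, z_{n+1} = y_n."""
    x, y, z = s0
    out = np.empty((n_samples, 3))
    for n in range(n_samples):
        # Simultaneous update: the right-hand side uses the previous iterate.
        x, y, z = a - y ** 2 - b * z, x, y
        out[n] = (x, y, z)
    return out

sequence = henon3d(10)
```

Since y and z are delayed copies of x, the three columns carry the same scalar sequence at lags 0, 1 and 2. Each column could then be assigned to one of the three modes.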
The parameters of MD-FESNA are also kept the same as in the previous example (see table 1), which allows us to study the flexibility of the setup in solving different tasks with fixed parameters. The pump power is expected to increase, as it represents the total power distributed over the increased number of modes. The corresponding MSE, plotted in figure 2(d) as a function of pump power, demonstrates that with the same set of parameters (for example Set 2 or Set 3) one can achieve good performance for various tasks. Here the difference in performance between the highly non-linear and the standard MMF is less pronounced than in the previous example (see figure 2(b)), whose processing required less non-linearity yet was more sensitive to optimization.

Conclusions
In summary, we have proposed a fiber-based technology for dual-quadrature, high bandwidth, multidimensional neuromorphic signal processing. The setup incorporates both spatial and temporal dimensions, enabling high-speed performance. We demonstrated an application of the technology to the prediction of multidimensional hyperchaotic systems. The design is flexible and applicable to various tasks with a fixed set of parameters. The technology, based on low-cost off-the-shelf fibers, offers unique advantages such as high bandwidth and multidimensionality, and opens new opportunities for advanced fiber-based signal processing, addressing a variety of applications ranging from optical communications and 5/6G to lasers and sensors.