Conditions for reservoir computing performance using semiconductor lasers with delayed optical feedback

Photonic implementations of reservoir computing (RC) have been receiving considerable attention due to their excellent performance, hardware and energy efficiency, and speed. Here, we study a particularly attractive all-optical system using optical information injection into a semiconductor laser with delayed feedback. We connect its injection locking, consistency, and memory properties to the RC performance in a non-linear prediction task. We find that partial injection locking yields a good combination of consistency and memory. We are therefore able to provide a physical basis for identifying operational parameters suitable for prediction.
© 2017 Optical Society of America
OCIS codes: (190.3100) Instabilities and chaos; (320.7085) Ultrafast information processing; (200.4560) Optical data processing; (250.5960) Semiconductor lasers.


Introduction
Efficient information processing is among the most prominent challenges of modern society. Its importance becomes immediately apparent considering advances made in the wake of a large scale implementation of computers. In only a short period of time, computation has penetrated almost all aspects of professional and private lives. This development was driven by von Neumann computing on binary, electronic processors.
After the successful exploitation of the von Neumann concept, we currently witness the rise of various novel approaches to information processing. Among others, strong advances in computing have been made thanks to machine learning based on emulated Neural Networks (NNs). Diverging from the algorithmic computational principle of von Neumann machines, various types of NNs have shifted performance benchmarks in multiple fields. Reservoir Computers (RC) have outperformed previous approaches in time series prediction [1], while deep learning broke many records in image recognition [2] and won against the most renowned human players in the classical Chinese game of Go [3]. While these novel algorithms have demonstrated their relevance exhibiting exceptional computational performance, they have been suffering from inefficient implementations on hardware substrates. The implementation of the game Go in AlphaGo serves as an illustrating example, running on hundreds of CPUs and GPUs in parallel.
At their core, NNs can be considered as non-linear dynamical systems. Stimulated by injected information, they exhibit complex and high-dimensional dynamical responses. As such, implementations of these methods in appropriate physical systems, in particular high-speed photonic systems, could pave the way to a new type of computational device or substrate. However, a full implementation of NNs in dedicated physical hardware appeared elusive for a long time. The introduction of the reservoir computing principle has served as a disruptive step for this field [1]. In RC, the NN is usually a randomly connected recurrent neural network. Such random connections strongly reduce the demands placed on a hardware system, and as a result such reservoirs were quickly demonstrated in a considerable number of electronic [4], opto-electronic [5,6], and all-optical [7-9] delay systems as well as on a photonic chip [10].
Computational performance in most of these approaches is state of the art, but has rarely been linked to underlying physical system properties. The realization of RC employing a semiconductor laser was among the first demonstrations of an all-optical machine learning approach utilizing GHz bandwidth [8]. Here, we analyze this promising system in detail. Specifically, we investigate the injection of high-bandwidth, all-optical information into a semiconductor laser coupled to delayed feedback. We are able to directly connect physical properties of the induced dynamical responses to the system's performance in a non-linear prediction task. Until now, detailed analysis of such an information processing system was only available from numerical simulations [11-13]. Our approach allows for the experimental identification of the best operating conditions for time series prediction tasks in a semiconductor laser based reservoir computer.

Reservoir computing in delay systems
An RC scheme consists of three main parts: a reservoir, an input layer, and an output layer. The reservoir maps the input information onto a high-dimensional state space, a requirement for neural network computation. In traditional RC the reservoir is a network of randomly connected non-linear nodes. Recently it was shown that a single non-linear node can emulate a recurrent network by simply implementing multiple nodes within a delay line with delay time τ [4]. The nodes of such networks are called virtual nodes, and they form a unidirectional ring topology. The output of the non-linear node on an interval of duration θ is interpreted as the state of a virtual network node. Therefore, consecutive intervals θ of the non-linear node's response correspond to the outputs of consecutive virtual nodes. The value of θ is typically chosen as a fifth of the system's characteristic response time. By choosing θ shorter than this response time, the state of each node is mixed with the states of its neighbors, increasing the connectivity of the network. Due to the delayed feedback, the current state of each node is additionally influenced by its state one delay time earlier [4]. The network size depends on τ and θ, with the number of virtual nodes given by N = τ/θ.
The role of the input layer is to inject the information, usually a stream of data inputs. Each sample of input information is injected during a time interval of τ into the N virtual nodes. To obtain a large number of different transient responses to a data input, a temporal masking is applied to the inputs. The mask acts as weights multiplied on the input data. Creating constant injection weights for each virtual node, the input masking sequence is repeated for every interval τ. The signal injected into the reservoir therefore comprises consecutive sections with length τ of the mask multiplied by the inputs. The mask used in the following work is a piecewise function of N values randomly chosen from the set {-1, -0.6, -0.2, 0.2, 0.6, 1.0 }. Choosing the mask to have six possible values offers enough variability of responses for computation [14].
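The masking step described above can be sketched in a few lines. This is only an illustration, not the authors' code: the function names, the seed, and the three example inputs are hypothetical, while the six mask levels and the values τ = 66 ns and θ = 200 ps are taken from the text.

```python
import random

MASK_LEVELS = (-1.0, -0.6, -0.2, 0.2, 0.6, 1.0)  # six levels quoted in the text

def make_mask(n_nodes, levels=MASK_LEVELS, seed=0):
    """Draw one random weight per virtual node (the seed is an arbitrary choice)."""
    rng = random.Random(seed)
    return [rng.choice(levels) for _ in range(n_nodes)]

def masked_input(inputs, mask):
    """Repeat the mask once per input sample and scale it by that sample.

    Each input occupies one delay interval tau, i.e. len(mask) slots of
    duration theta, so the result has len(inputs) * len(mask) values.
    """
    return [u * m for u in inputs for m in mask]

# tau = 66 ns and theta = 200 ps give N = tau / theta = 330 virtual nodes.
tau_ps, theta_ps = 66_000, 200
n_nodes = tau_ps // theta_ps
mask = make_mask(n_nodes)
signal = masked_input([0.5, -1.0, 0.25], mask)  # three hypothetical input samples
```

Because the same mask is reused for every input, virtual node n always sees the input scaled by the same weight mask[n], which is what makes the mask act as a fixed set of input weights.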
The output layer performs the readout of the reservoir. The responses of the virtual nodes to the injected information are linearly combined using a set of output weights. The optimal output weights are determined in the training process by minimizing the mean square error of the output y(k) from the desired output ȳ(k). The output weights are trained such that their values remain constant for each node during the entire information processing task. The linear sum therefore creates a single computational result per delay time τ. More extensive explanations of the RC principle can be found in [4,15,16].
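As a minimal sketch of this training step, the output weights can be obtained by solving the least-squares normal equations. The text only states that the mean square error is minimized, so the solver, the small ridge term, and all function names below are assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def train_readout(states, targets, ridge=1e-8):
    """Least-squares output weights minimising sum_k (w . x(k) - target(k))^2.

    states[k] holds the N virtual-node responses to input sample k.
    `ridge` is an assumed small regulariser that keeps the system solvable.
    """
    n = len(states[0])
    xtx = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    xty = [sum(s[i] * t for s, t in zip(states, targets)) for i in range(n)]
    return solve_linear(xtx, xty)

def readout(states, weights):
    """One output value per delay interval: the weighted sum of node states."""
    return [sum(w * x for w, x in zip(weights, s)) for s in states]
```

The weights are computed once from the training samples and then kept fixed, so evaluating the reservoir on new data only requires the weighted sum in `readout`.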

Experimental setup
Our hardware implementation is based on a semiconductor laser with delayed optical feedback and modulated optical injection. This scheme already showed excellent performance employing reservoir computing techniques [8]. A scheme of the setup is depicted in Fig. 1, and consists of off-the-shelf telecommunication components except for the laser in the feedback loop, which is lacking an optical isolator. The setup can be sub-divided into a reservoir, injection, and detection section.
The reservoir is implemented by a semiconductor laser, called the response laser, coupled to an optical fibre delay loop. The response laser is a single-mode quantum well semiconductor laser emitting at λ = 1542 nm, with a longitudinal mode separation of 145 GHz, a side mode suppression ratio of 40 dB, and a threshold current of I_th = 11.24 mA. Temperature and bias current were stabilized with 0.01 K and 0.01 mA accuracy, respectively. The response laser was biased at I_bias = 11.10 mA, corresponding to 1 % below solitary threshold. At this current the laser exhibits chaotic dynamics under maximum feedback conditions, but remains in the stable off-state without external perturbation. The delayed feedback loop is realised using an optical circulator. The circulator defines a propagation direction within the feedback loop and suppresses unwanted reflections. A polarization controller and a variable attenuator control the feedback conditions. We define η as the attenuation applied to the feedback. A pair of optical couplers extracts light for detection and facilitates injection of the modulated optical signal. The external cavity round trip time was precisely measured by injecting a 1 ns light pulse into the delay loop and measuring the propagation delay. The delay time was measured to be τ = (66.000 ± 0.025) ns. For minimum feedback attenuation, η = 0 dB (maximum feedback), the response laser receives from the delay loop a power of 12 µW, corresponding to 21 % of the optical output power collected from the laser's fibre pigtail for these conditions. For η = 0 dB, the threshold current is reduced to I_th,fb = 10.55 mA. Feedback polarization is kept parallel to the response laser polarization direction. We choose θ = 200 ps, resulting in a network of N = τ/θ = 330 virtual nodes. The time scales of the intensity dynamics of semiconductor lasers lie in the GHz range, which motivates this choice of θ [11].
The injection arm's main component is the injection laser, which we use to optically introduce the information into the reservoir. This laser is a tunable single-mode DFB emitter with an integrated 30 dB optical isolator. The bias current of the injection laser was kept constant, and its emission frequency (ν_i) was tuned via temperature variation. We choose the spectral position of the response laser's solitary emission as reference (ν_r) to determine the frequency detuning ∆ν, specified in Eq. 1:
$$\Delta\nu = \nu_i - \nu_r \tag{1}$$
An arbitrary waveform generator (AWG) synthesises the electrical signal used for modulation at a rate of 5 GSamples/s with 8 bit resolution and 9.6 GHz analog bandwidth. A Mach-Zehnder amplitude modulator creates the information injection signal by modulation of the injected intensity. A polarization controller and a variable attenuator control the polarization and average power of the injection while keeping the peak-to-average power ratio constant. The average power injected into the delay loop was 0.7 mW, with the injected light polarized parallel to the response laser's polarization. As the polarization states of the injected and feedback light are parallel to the response laser's emission, only one polarization direction contributes throughout the experiment. We use an additional 50 dB optical isolator to suppress unwanted reflections.
A semiconductor optical amplifier and a tunable optical filter are used to increase the SNR before detecting the optical response of the reservoir. Used without the tunable optical filter, the amplifier's spontaneous emission noise limited the improvement in SNR. The SNR was further improved by removing a large fraction of the spontaneous emission with the tunable optical filter. The optical signal is recorded using a fast photodetector and a 16 GHz analog bandwidth oscilloscope recording at 40 GSamples/s.
Using this detection scheme we achieved SNR = 19 dB for minimum feedback, and SNR = 13 dB for maximum feedback. A 10 MHz resolution Optical Spectrum Analyser is used to record the optical spectra of the different optical signals.

Properties of dynamical information injection
Injecting light from a cw laser into another semiconductor laser has been studied extensively for a long time, showing that even this simple drive-response configuration gives rise to a variety of behaviors, depending on the power and detuning of the injected field [17-23]. Our setup additionally includes optical feedback of the response laser and optical injection modulated at fast timescales. It is well known that the dynamics of a semiconductor laser can become highly complex under the influence of optical feedback [22-25]. In past years there have been studies of the dynamical properties of semiconductor lasers with optical feedback under cw optical injection [26-28]. Yet not much is known when the injection is time-dependent. These, however, are the conditions under which RC is being performed, and a detailed characterisation is therefore crucial.

Spectral locking to signals
In this section we focus on the influence of the dynamical optical injection on the reservoir system by studying highly resolved optical spectra. This allows us to gain insight into the resulting injection locking properties and into the underlying dynamics of the system. In Fig. 2(a) we depict the optical spectrum of the response laser (reservoir) without injection for a bias current of I_bias = 11.10 mA and η = 0 dB. For these conditions, the reservoir exhibits a broad spectrum characteristic of the non-linear dynamics induced by delayed optical feedback. Moreover, we show the optical spectrum of the modulated injected signal detuned by ∆ν = −30 GHz for comparison. The injected spectrum shows a central narrow peak corresponding to the injection laser emission and sharp, symmetric sidebands. The sidebands are located at ν_sb = 5 GHz = θ^−1 and are caused by the temporal modulation of the injection signal.
Under dynamical optical injection, the reservoir shows different spectra depending on the spectral detuning of the injected light. We separate the response spectra into three categories: fully locked, partially locked, and unlocked. Examples of spectra from these three categories are shown in Fig. 2(b). Fully locked behavior corresponds to all of the response laser's emission being concentrated on frequency components of the injected field. In this case injection exerts a strong influence on the emission behavior of the reservoir.
The unlocked cases correspond to the response spectra being unperturbed despite the optical injection. The unlocked example, shown in dark gray in Fig. 2(b), displays the spectrum of the response laser unmodified by injection along with components corresponding to remainders of the injection. These injection components originate from the injected field being reflected at the response laser's front facet and from propagation of the injection through the response laser medium. Partial locking corresponds to an intermediate situation between the fully locked and unlocked categories. The optical spectrum exhibits frequency components from the injection as well as from the free-running delay laser dynamics. We show an example of such a case via the spectrum depicted in light gray in Fig. 2(b), where one can observe frequency components inside the range of the response spectrum without injection.
Next, we include the dependencies on frequency detuning and feedback attenuation in our evaluation. In Fig. 2(c) we classify the previously defined locking scenarios found in the (∆ν, η) plane for I_bias = 11.10 mA and P_inj = 0.7 mW. For high η, the response spectra are fully locked over the entire range of ∆ν studied. For lower η we are able to observe all three categories. As the injection detuning is varied from −60 to 0 GHz, the system transitions from unlocked to partially locked and finally to fully locked behavior. Full locking is restricted to ∆ν between 0 and −25 GHz for η = 0 dB. Partial locking is observed between the full locking and the unlocked regimes. We observe that the width of the full locking region depends on η. Under unlocking conditions, optical feedback is sufficiently strong to destabilize the system.

Dynamical consistency
Having obtained an overview of the spectral properties of the dynamic injection, in this section we study the consistency properties of the setup. Consistency is a fundamental feature of dynamical systems and relates to the reproducibility of system responses under repetitive injection of similar inputs. In the framework of dynamical systems, consistency has been extensively studied along with the concept of synchronization [29-31]. Furthermore, it is a topic of interest for photonic systems [28, 32-35]. Computational performance requires reproducible results, making consistency an essential condition for reservoir computing [12,15,36]. In the following, we therefore characterize the consistency properties of our system as a function of the spectral detuning of the dynamical optical drive signals.
The consistency of the dynamical responses will depend on properties associated with both the drive signal and the response system. In particular, properties of the injected signal such as its injection strength, bandwidth and frequency detuning will influence the susceptibility of the response system to the drive, and therefore affect consistency [12, 28, 33]. Additionally, the non-linear properties of the response system will critically determine the consistency properties. Furthermore, different sets of initial conditions caused by noise in the response system may result in different evolutions under injection of the same drive. This is a direct consequence of the complex phase-space of the system [31, 34]. Finally, noise can act as a consistency degrading factor, potentially inducing bubbling effects [37].
The injected signal is chosen to exhibit the same properties in terms of bandwidth and amplitude modulation as in the previous section. Consistency is characterized in two ways: using persistence plots and calculating the cross-correlation between responses for repeated drives. Persistence plots are obtained by overlapping 500 different responses to the same modulated injection, representing a distribution of possible responses at every point in time. Local consistency, identifying temporal regions with different levels of consistency, can be observed as vertical spreading of the distribution. In Figs. 3(a) and 3(b) we show persistence plots for the same time interval of the response for different spectral detunings, corresponding to different locking conditions. These persistence plots show different responses, highlighting the different non-linear transformations performed by the reservoir depending on operating conditions. In the persistence plots, local consistency varies significantly in a non-trivial fashion. This indicates changing local stability in phase space for different input segments. Direct observation of the local consistency shows that it is lower for the conditions shown in Fig. 3(b) than in Fig. 3(a); under full locking conditions consistency is higher than for partial locking. Temporal unlocking lowers the consistency of the responses for certain segments in time. Temporal reduction of local consistency can originate from the complex dynamical phase space of the system. The presence of unstable regions in phase space can result in a lowering of the temporal consistency, even in the case of full locking. In the case of unlocking, the response system typically exhibits inconsistent behavior.
While persistence plots offer information about local consistency, cross-correlations provide a global measure of consistency among responses. We refer to the cross-correlation coefficient as consistency correlation (CC). The CC is calculated from three different responses: for each pair of responses we calculate the correlation via Eq. 2,
$$CC_{i,j} = \frac{\langle (I_i(t) - \bar{I}_i)(I_j(t) - \bar{I}_j) \rangle}{\sigma_i \sigma_j} \tag{2}$$
where I_i(t) is the intensity of the response over time, Ī_i is the average intensity, σ_i is the standard deviation of the response, ⟨·⟩ denotes the time average, and the subscripts i and j denote different responses. The value of CC is obtained by averaging CC_{i,j} over the pairs.
In Fig. 3(c) we show the dependence of CC on ∆ν and η for I_bias = 11.10 mA and P_inj = 0.7 mW. These conditions match those of the injection locking properties in Sec. 3.1. We observe that CC is close to unity for high feedback attenuation regardless of frequency detuning. For low feedback attenuation CC is high close to ∆ν = 0 GHz and decreases as the frequency detuning becomes more negative. Comparing the consistency results in Fig. 3(c) with the spectral properties in Fig. 2, we find that for full locking we obtain high CC, and for partial locking intermediate values of CC. As locking is lost completely, CC decreases further. This illustrates the destabilizing role of the feedback and the stabilizing role of the injection. Overall, we conclude that the dynamical consistency of the responses depends on the spectral locking conditions.
Fig. 3. Persistence plots of the response for full locking (a) and partial locking (b) conditions. Parameters are I_bias = 11.10 mA, P_inj = 0.7 mW, η = 5 dB. Frequency detunings are ∆ν = −10 GHz and ∆ν = −30 GHz, respectively. Gray levels indicate the level of consistency. (c) Consistency correlation dependence in the (∆ν, η) plane for I_bias = 11.10 mA and P_inj = 0.7 mW.
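A pairwise consistency measure of this kind can be sketched as follows. The code assumes that the correlation is the normalised cross-correlation at zero time lag and that all responses are sampled on the same time grid; the function names are hypothetical.

```python
from itertools import combinations
from statistics import fmean, pstdev

def pair_correlation(a, b):
    """Normalised zero-lag cross-correlation between two intensity traces."""
    mean_a, mean_b = fmean(a), fmean(b)
    cov = fmean([(x - mean_a) * (y - mean_b) for x, y in zip(a, b)])
    return cov / (pstdev(a) * pstdev(b))

def consistency_correlation(responses):
    """Average the pairwise correlations over all distinct pairs of responses."""
    pairs = list(combinations(responses, 2))
    return fmean([pair_correlation(a, b) for a, b in pairs])
```

With three recorded responses there are three distinct pairs, so the returned value is the mean of three correlation coefficients. Identical responses give a value of 1, and uncorrelated responses give a value near 0.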

Memory properties
Next we study the memory properties of the system and its connection to spectral locking and consistency. Memory is the property of a system to retain previously injected information. This property is important when tasks require past information. For prediction tasks it is especially relevant to have information on the past to accurately infer future values. Nevertheless, memory needs to vanish after some time to allow responses to be affected only by the latest history. This property is referred to as fading memory [38].
Memory in an RNN originates from the recurrent network connections, allowing information to remain in the network over finite time [38]. Past information therefore mixes with the current input. In our photonic RNN, optical delayed feedback introduces recurrences resulting in the ring topology [4,8]. As a consequence, the optical reservoir possesses fading memory. Increasing the feedback attenuation η will reduce the relative strength of the recurrent connections of our RNN, and consequently its available memory.
To evaluate the memory properties we use a method introduced in [38]. We inject a stream of pseudo-random numbers y(k) uniformly distributed between 0 and 1 into the reservoir. We train the reservoir to retrieve the number ȳ_i(k) which was introduced i time steps before. Then, we calculate the correlation m_i between the original time trace and the reconstructed one as described in Eq. 3. This is the so-called linear memory. This correlation value measures how well the reservoir can hold information injected a certain amount of time before. We partitioned the data set into 800 samples for training and 200 samples for testing. We compute the sum of memory correlations to extract the system's memory capacity MC, corresponding to an accumulated measure of memory according to Eq. 4.
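The evaluation of memory correlations and memory capacity can be illustrated with the sketch below. It assumes that m_i is the plain correlation coefficient between the input delayed by i steps and its reconstruction; some formulations of memory capacity use the squared coefficient instead, which the hypothetical `squared` flag selects. The reconstructions are taken as given here, for example from a separately trained linear readout.

```python
from statistics import fmean, pstdev

def correlation(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a, mean_b = fmean(a), fmean(b)
    cov = fmean([(x - mean_a) * (y - mean_b) for x, y in zip(a, b)])
    return cov / (pstdev(a) * pstdev(b))

def memory_correlation(inputs, reconstruction, delay, squared=False):
    """m_i: correlation of the input delayed by `delay` steps with its estimate.

    reconstruction[k] is the readout's estimate of inputs[k - delay]; the
    first `delay` entries have no valid target and are skipped.
    """
    target = inputs[:len(inputs) - delay]
    estimate = reconstruction[delay:]
    c = correlation(target, estimate)
    return c * c if squared else c

def memory_capacity(inputs, reconstructions, squared=False):
    """Sum m_i over the delays i = 1, 2, ... for which an estimate exists."""
    return sum(memory_correlation(inputs, rec, i, squared)
               for i, rec in enumerate(reconstructions, start=1))
```

In practice the sum is truncated at a delay where m_i has decayed to the noise floor, which reflects the fading memory of the reservoir.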
We evaluate the memory properties of the system under the same conditions as before. Figure 4(a) depicts the memory correlations depending on the frequency detuning for a feedback attenuation of η = 5 dB. For a frequency detuning around −50 GHz, memory correlation values are low. As frequency detuning increases, memory correlation values rise and the memory horizon extends to further time steps. At −35 GHz detuning, memory correlation reaches a maximum, and then collapses to short memory as frequency detuning crosses −25 GHz. This short memory extends to zero frequency detuning. This shows that memory fades, and different memory lengths can be obtained by changing the frequency detuning.
Next, we evaluate the memory capacity. Figure 4(b) shows the memory capacity in the (∆ν, η) plane. For I_bias = 11.10 mA and P_inj = 0.7 mW, we find the highest memory capacity of MC = 8.3 for ∆ν = −30 GHz and η = 0 dB. The high values of memory capacity concentrate at low η. This is an expected observation, consistent with the already mentioned relation between η and available memory.
Here, we need to point out a particular feature of the obtained MC. Independent of ∆ν and η, we obtain MC ≥ 2. The reason for this is the AWG's finite analog bandwidth: the non-ideal impulse response of the AWG frequency-filters its output. As the AWG provides our injection signal, only memory capacities above MC = 2 can be considered to originate from the delayed feedback. We take this effect into account by considering only values of MC ≥ 2 in Fig. 4(b).
Comparing Fig. 4(b) with the spectral locking results in Fig. 2(c), we find that memory is the highest just outside the full locking region, following the boundary between full and partial locking as conditions change. This is because under full locking the injection is too dominant to allow feedback to affect the response and generate memory. Memory therefore relies on an interplay between locking and a remaining susceptibility to the delayed feedback.
Comparing Fig. 4(b) with Fig. 3(c), one can observe that the highest memory capacity does not coincide with the consistency correlation maxima. In addition, for low η, memory capacity and consistency correlation decrease as the frequency detuning becomes more negative. The best compromise between consistency and memory is found at the boundary between full and partial locking. Moreover, these results show that by varying the injection parameters we can to some extent tailor the properties for a specific task.
Evaluation of the memory capacity properties in the range of positive frequency detunings yields qualitatively comparable trends within the parameter range explored here. Therefore we only show results for the case of negative frequency detunings.

Performance for the Mackey-Glass prediction task
Having studied the main properties relevant for reservoir computing performance, we now apply these insights by studying the performance of the system in a prediction task. The Mackey-Glass chaotic time series prediction task is a benchmark task in machine learning to evaluate the ability of a system to predict future values of a chaotic time series. In this task, the reservoir has to predict the value δ steps ahead for a time series originating from the Mackey-Glass delay equation (Eq. 5),
$$\frac{dy(t)}{dt} = \frac{\alpha\, y(t-\tau)}{1 + y(t-\tau)^{\beta}} - \gamma\, y(t) \tag{5}$$
with τ = 17 (and usually α = 0.2, β = 10, γ = 0.1), exhibiting moderately chaotic dynamics [1,39]. We integrated the Mackey-Glass equation with an integration step of 0.17. The resulting time series is then downsampled with a sampling step of t_s = 3 to obtain the discrete time series y_k that was used in the prediction task. A data set of 6500 values was partitioned into 4000 samples for training and 2500 samples for testing.
The task is chosen because it requires several memory steps and high consistency correlations to obtain good prediction results. On the one hand, a fading memory covering the delay time is beneficial for this prediction task. The memory capacity of the reservoir should be equal to or larger than the number of samples needed to cover the delay time of the chaotic Mackey-Glass dynamics. Due to the chosen parameters in the Mackey-Glass task, this memory capacity should be above six. On the other hand, a high signal-to-noise ratio is beneficial for prediction tasks [14]. The performance is expected to improve in the regions of high consistency correlations.
The task involves time discretization of the time evolution, feeding the reservoir one value at a time. To minimize the prediction error, the reservoir must reproduce the true chaotic dynamics as closely as possible. To show successful prediction, the reservoir therefore needs consistency, memory, and the capability for high-dimensional mapping of the input. The prediction error is quantified by the Normalized Mean Square Error (NMSE), calculated via Eq. 6,
$$\mathrm{NMSE} = \frac{1}{N} \sum_{k=1}^{N} \frac{(\bar{y}_k - y_k)^2}{\sigma_y^2} \tag{6}$$
where ȳ_k is the predicted time trace, y_k is the original trace, σ_y^2 is the variance of y_k, and N is the length of the time trace.
We evaluate the performance of our system for this task for δ ranging from 1 to 3 in the (∆ν, η) plane, using I_bias = 11.10 mA and P_inj = 0.7 mW. By increasing δ we increase the memory requirements placed upon the RC to solve the prediction task. Results are shown in Fig. 5.
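The generation of such a series can be sketched as follows. The parameter values and the integration step are those quoted in the text, but the integration scheme (forward Euler), the constant initial history y0, and the downsampling rule are assumptions: since 3/0.17 is not an integer, this sketch simply keeps the integration sample closest to each multiple of t_s.

```python
def mackey_glass(n_steps, dt=0.17, tau=17.0, alpha=0.2, beta=10.0, gamma=0.1, y0=1.2):
    """Euler integration of dy/dt = alpha*y(t-tau)/(1 + y(t-tau)**beta) - gamma*y(t)."""
    delay = round(tau / dt)      # delay expressed in integration steps (100 here)
    y = [y0] * (delay + 1)       # constant history on [-tau, 0] (assumed)
    for _ in range(n_steps):
        y_tau = y[-delay - 1]    # value one delay time before the current one
        dydt = alpha * y_tau / (1.0 + y_tau ** beta) - gamma * y[-1]
        y.append(y[-1] + dt * dydt)
    return y[delay + 1:]         # drop the initial history

def downsample(series, dt=0.17, ts=3.0):
    """Keep the integration sample closest to each multiple of ts (assumed rule)."""
    n_out = int((len(series) - 1) * dt / ts) + 1
    return [series[round(k * ts / dt)] for k in range(n_out)]

series = mackey_glass(2000)
samples = downsample(series)
```

For a real experiment one would also discard an initial transient before partitioning the samples into training and test sets.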
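The error measure and the construction of δ-step-ahead targets can be written compactly. This is a minimal sketch with hypothetical function names; it normalizes by the population variance of the original trace.

```python
from statistics import fmean, pvariance

def prediction_pairs(series, delta):
    """Pair each input y_k with its target y_(k+delta), for delta >= 1."""
    return series[:-delta], series[delta:]

def nmse(predicted, original):
    """Mean squared prediction error divided by the variance of the original."""
    mse = fmean([(y - p) ** 2 for p, y in zip(predicted, original)])
    return mse / pvariance(original)
```

With this normalization a perfect prediction gives NMSE = 0, while always predicting the mean of the original trace gives NMSE = 1, which makes errors comparable across time series of different amplitude.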
As can be seen in Fig. 5(a), a simple one-timestep prediction (δ = 1) can successfully be implemented regardless of the reservoir's internal memory. The prediction's NMSE remains low even for a strong attenuation of the delayed feedback, yielding NMSE = 0.019 as the lowest error with close values in a wide region. The memory introduced by the AWG turns out to be sufficient for this task. Increasing δ, however, significantly changes the situation. In Figs. 5(b) (δ = 2) and 5(c) (δ = 3), we can clearly identify the impact of the memory due to the delayed feedback on the performance. For δ = 2, the best performance is found for η = 5 dB at ∆ν = −30 GHz with NMSE = 0.045. Focusing on δ = 3, Fig. 5(c), the lowest error (NMSE = 0.056) is also found for η = 5 dB at ∆ν = −30 GHz. Comparing these NMSE results with the consistency and memory properties shown in Fig. 3(c) and Fig. 4(b), we find that the best performance is attained in the region where high consistency and long memory overlap. This corroborates our previous analysis and underlines that a balance of consistency and memory is required for the chosen non-linear prediction task.
Fig. 5. NMSE for the Mackey-Glass prediction task for δ = 1 (a), δ = 2 (b), and for δ = 3 (c) in the (∆ν, η) plane using I_bias = 11.10 mA and P_inj = 0.7 mW. All panels share the gray scale.

Conclusions
In this work we have studied the influence of fundamental properties of an all-optical reservoir computing system based on a semiconductor laser with delayed feedback [8] on its information processing performance. We particularly concentrated on injection locking, consistency, and memory properties depending on the spectral detuning of the optical injection and the optical attenuation in the delayed feedback loop.
We found that under injection of a modulated signal, the response system can experience an intermediate state between the known unlocked and fully locked states characterised by partial locking. The consistency of the response to modulated injection showed a strong dependence on the locking conditions. For full locking, we observed the highest consistency, and for unlocking the lowest consistency. The evaluation of the memory in the system showed a strong dependence of memory correlation and memory capacity on the parameters. Memory was found to be the longest at the boundaries of full to partial locking. Altogether, the properties of our hardware implementation can be conveniently tuned via the detuning ∆ν. This offers the possibility to tailor the properties of the reservoir to a specific task and conditions with little effort. Finally, we have evaluated the performance of the reservoir for the Mackey-Glass prediction task. We demonstrated that the characterised properties of our reservoir computer determine the performance landscape, and that a compromise of consistency and memory results in the best prediction performance.

Outlook
While several attractive platforms for photonic reservoir computing have been suggested and implemented, we feel that establishing the link between fundamental optical system properties and computing performance, as presented here, represents a crucial step forward. Only providing such a basis will allow us to design, adapt and ultimately scale the reservoir computing schemes and enable qualitative advances in this emerging field. Interestingly, the approach to utilize photonic systems for unconventional computing also draws the perspective towards fundamental properties of the underlying optical systems which so far have not been addressed or sufficiently studied.
The obtained windows for good prediction performance lie well within the accessible parameter regimes and standard control techniques of photonic systems. The effects we described are in good agreement with the underlying physical principles of reservoir computing systems. Based on our results, it is now possible to design larger or fully implemented RC systems based on semiconductor lasers.