Effects of cavity nonlinearities and linear losses on silicon microring-based reservoir computing

Microring resonators (MRRs) are promising devices for time-delay photonic reservoir computing, but the impact of the different physical effects taking place in the MRRs on the reservoir computing performance is yet to be fully understood. We numerically analyze the impact of linear losses as well as thermo-optic and free-carrier effects relaxation times on the prediction error of the time-series task NARMA-10. We demonstrate the existence of three regions, defined by the input power and the frequency detuning between the optical source and the microring resonance, that reveal the cavity transition from linear to nonlinear regimes. One of these regions offers very low error in time-series prediction under relatively low input power and number of nodes while the other regions either lack nonlinearity or become unstable. This study provides insight into the design of the MRR and the optimization of its physical properties for improving the prediction performance of time-delay reservoir computing.


Introduction
Neuromorphic computing systems, which try to resemble the working mechanism of the human brain, are an interesting alternative to traditional Von Neumann architectures.Fundamental physical boundaries of electronics set some limits in future developments of current architectures to increase their computing capacity.Hence, neuromorphic computing appears to be a promising step in the development of novel artificial intelligence processors that can enhance the performance of current computing architectures and might extend Moore's law [1].Over the last decade, developments in integrated photonics have allowed the exploration of novel computing paradigms.Additionally, these developments have driven the photonic hardware realization of computing processing schemes and machine learning algorithms.Lower energy consumption, parallel computing and faster processing speed are the key potential benefits of photonic computing architectures that could address the limitations of traditional electronic circuits [2,3].Photonic neural networks, all-optical switching, optical spiking neurons and optical activation functions are some examples of the emergence of the photonic computing field [3].
Reservoir computing (RC) is a relatively recent computing paradigm in the recurrent neural networks (RNNs) family that offers a lower complexity of the training process with respect to conventional RNN and other neural network schemes [4,5].An RC architecture consists of an input layer, in which the data is assigned random fixed weights before being transferred to the reservoir layer, where the data is mapped into a higher dimensional space by means of interconnected nonlinear nodes with random and fixed connections.Using the response of the reservoir nodes, the weights in the output layer are trained to solve a specific target task, usually using ridge or linear regression.Based on the trained weights, RC can also make a prediction of the target task for subsequent input sequences that are unknown to the reservoir.Only the output layer is trained in RC, and this key feature considerably decreases the training time of this type of neural network [5,6].RC has been shown to have applications in time-series predictions, channel equalization, speech recognition, medical and financial applications, etc.Further details about RC architecture are available in comprehensive reviews on the subject [5,6].
Multiple works have demonstrated implementations of photonic RC, where usually the nonlinear nodes are achieved through the nonlinear behaviour of photonic devices [7][8][9][10][11][12][13][14][15][16][17][18][19][20] or the dynamics of nonlinear optical phenomena [21][22][23].Some of these works consist of blocks of photonic devices that perform as nonlinear nodes [7,11,12], but this leads to the scalability of RC being a challenge as well as the footprint of the photonic circuit being considerably increased.An alternative approach known as time-delay reservoir computing (TDRC) is to multiplex the nodes in time and use a single physical nonlinear node that typically receives feedback through a physical loop to boost the connectivity between the virtual nodes and the overall memory of RC.Several works regarding photonic TDRC can be found in the literature, e.g., using a Mach Zehnder modulator as the nonlinear node [9,10,14,17,19], semiconductor optical amplifiers [8] or the nonlinear dynamics of laser devices [13,15].A TDRC setup based on microring resonators (MRR), first studied in [18], demonstrated a good performance in time-series prediction tasks.Nonetheless, there was no clearly established relationship between the performance of RC and the physical effects that generate the nonlinear dynamics of the microring cavity.In [24] we reported initial studies on the impact of varying the relaxation times of such physical effects and numerically showed that it is possible to obtain frequency detuning and power regions with a prediction error lower than other RC implementations with a similar number of virtual nodes and input rate.
In this work, we extend our study of the MRR-based TDRC architecture from [24] by investigating the impact of the cavity waveguide linear loss on the performance of RC.This study also encompasses an analysis of how the amount of nonlinearity given by the cavity dynamics influences RC and how the behaviour of such dynamics can be used to improve the performance of this type of RC implementation.We also explore the impact of the generated nonlinear oscillations in the time-series prediction and show that it is a non-desirable effect for solving the discrete-time tenth-order nonlinear auto-regressive moving average (NARMA-10) task.Additionally, this work deepens the understanding of the performance thresholds and the fabrication requirements for the microring waveguide in order to achieve lower error prediction than similar numerical RC schemes.This improvement is shown for the prediction of the NARMA-10 sequence and can potentially be extended to other RC tasks.
The structure of the paper goes as follows: In section 2 we introduce the model used to mathematically describe the nonlinear dynamics of the MRR-based TDRC scheme.In this section, we also detail the major assumptions and values used for the optical parameters.In section 3 we present the details of each of the TDRC layers, including the mathematical description of the electric field of the processed optical signal at the different stages of our setup.In section 4 we describe the benchmark methodology when solving the NARMA-10 task.In section 5 we present the results of the setup when varying different physical parameters related to the MRR properties, and define input power vs. frequency detuning regions with different levels of prediction error.Afterwards, in section 6 we analyze the previous results and relate the region of low error prediction with the physical properties of the MRR and the dominance of each of the nonlinear effects.Section 7 summarizes the main conclusions.

Free-carrier nonlinearities in silicon MRR
Silicon microring resonators have been extensively studied in the field of photonic computing as their features have demonstrated applications in a variety of computing processes such as all-optical switching [25][26][27], optical logic gates [28], weight banks [29], photonic spiking neural networks [30], photonic accelerators [31] and photonic RC [16,18].The study in [18] focuses on the dependence of the MRR-based TDRC performance on the feedback intensity and the phase shift of the external feedback waveguide.This analysis is tested for the NARMA-10 and Mackey-Glass tasks.The dependence on the input power and detuning of the pump for the Santa Fe task is also studied.The system investigated in [16] presents a similar scheme but without external feedback and was tested on analog and binary operations.Furthermore, the impact of noise on the performance of the previous schemes with or without feedback was analyzed in [32].
To model the dynamics of a silicon MRR, we use the same mathematical model as in our previous study [24], similar to the one used in [18].In this model, we introduce in the input port of the MRR a quasi-monochromatic electric field   at an angular frequency   close to that of the resonance frequency of the cold MRR cavity  0 .This field triggers the generation of excess carriers due to two-photon absorption (TPA).The conversion of the optical mode energy to heat by the absorption of power results in the increase of the cavity temperature.The generated free carriers change the refractive index of the cavity waveguide through free-carrier dispersion (FCD), which in turn results in a blue shift of the resonance frequency.They also become a source of free-carrier absorption (FCA).FCA contributes to the total rise of heat in the cavity.The resulting thermo-optic (TO) effect also changes the refractive index of the cavity but in the opposite direction, causing a red shift of the resonance [33].All of these effects taking place inside the MRR cavity can be described by the temporal coupled mode theory (TCMT) through the following system of coupled differential equations, for an add-drop MRR configuration [18,[33][34][35][36][37]: where  is the modal amplitude within the resonator cavity, Δ represents the excess free-carrier density generated via TPA, and Δ is the temperature difference of the waveguide cavity with respect to the environment.The variation of  described in Eq. 1 is dependent on both  in and  add , where the latter denotes the electric field at the add port in an add-drop MRR configuration.1/ c is the decay rate of the cavity modal energy due to the coupling of each bus waveguide to the MRR.The terms   () and  tot () represent the total angular frequency detuning and losses rate in the MRR cavity, respectively.In Eq. ( 2) and (3),  and ℏ denote the speed of light in vacuum and the reduced Planck's constant, respectively. FC is the relaxation time of the free carriers,  th is the decay time of the TO effect, and  is the mass of the MRR.The terms  TPA ,  Si and  p refer to the TPA coefficient, refractive index and specific heat of silicon, respectively.Γ FCA and Γ th are the FCA and thermal confinement factors related to the fractional energy overlap of the mode with the differential temperature and excess of FCD within the silicon microring. FCA is the effective volume of the FCA. abs () is the total optical mode energy converted into absorbed power.The time-dependent terms   (),  tot () and  abs () have the following definitions [33,35,36]: FCA 2.36 m 3 [35]  TPA 2.59 m 3 [35] In Eq. 4, the total angular frequency detuning is a result of the sum of the detuning between  p and  0 , which we refer to in this work as Δ =  p −  0 , and the nonlinear detuning due to TO and FCD effects.d Si /d and d Si /d represent the TO and FCD coefficients of silicon.In Eq. ( 5) and ( 6), the terms  TPA and  FCA denote the losses due to TPA and FCA [33], and  is the linear attenuation of the waveguide. FCA is the total FCA cross-section.The values of the optical parameters used for the simulation of the photonic RC are listed in Table 1.
Our model does not consider the contribution of the frequency detuning due to Kerr effect in Eq. 1 as the induced refractive index change (of the cavity waveguide) is negligible compared to the change caused by TO and FCD effects [38] at the simulated input power range.However, it is important to point out that the Kerr effect becomes more relevant in the case of coupled cavities and different material platforms, as recently studied in [39].The model also does not consider any source of noise, nor does it consider the counterpropagating optical mode as the absence of backscattering is assumed.This assumption follows the approaches of [34][35][36][37][38].In order to apply the model to our RC tasks, we normalize and solve Eq. ( 1)-(3) using a 4 th -order Runge-Kutta method similar to the solutions of the TCMT equations described in [35,38].Throughout this work, we sweep the values of the nonlinear effects lifetimes  FC , and  th , as well as the attenuation value given by .Then, we evaluate the impact of varying those parameters on RC dynamics and performance.

MRR-based Time-delay photonic RC
The photonic TDRC simulated in this work (Fig. 1a), consists of an optical pump from a laser source which is modulated by the masked input data sequence of the RC.The virtual nodes are multiplexed in time by the masking signal () and a delay waveguide is added in order to increase the memory to the RC, with a length that matches the added delay.In [16,18], the authors demonstrated the memory capacity provided by the MRR cavity itself and by the external waveguide.The response of the RC is obtained through a photodetector connected to the drop port of the MRR.Afterwards, the training and testing of the output layer are performed using ridge regression.
The silicon MRR cavity acts as the single physical node of RC when nonlinear behaviour is induced.Indeed, this behaviour tends to become oscillatory as the FCD and TO effects cause an increase in cavity losses, which decreases the modal energy || 2 .This reduction of energy in turn diminishes the nonlinear effects together with their caused losses, and consequently, || 2 starts to rise again, forming a cyclic nonlinear behaviour known as self-pulsing (SP) [34,37,38].
In RC, the high nonlinearity is correlated with the dimensionality expansion that is required for computation tasks when their solution is not feasible in low dimensional input space, and a task-dependent correlation between higher dimensionality and better performance has been demonstrated [40,41].However, a higher level of nonlinearity and dimensionality does not always entail a better performance of RC [42], whereas, in time-series prediction tasks like the one analyzed in this work, it might even be detrimental for the memory capacity of the reservoir [43].Furthermore, in certain frequency detuning and input power conditions, a perturbation of the input optical signal can trigger self-pulsing of the cavity energy with ultra-short discontinuities or spikes, as previously studied on silicon MRR nonlinearities [37,44].Such fast pulse transitions can alter the stability of the RC dynamics, as they affect the computational consistency of the system, as defined in [45].In [39], it is also pointed out that in the case of coupled cavities under the influence of FCD effects, there is an optimum input power interval in which the increase of dimensionality comes before the loss of consistency.As further discussed in section 5, similar findings are achieved in this work where the RC reaches enough dimensionality to achieve good performance without exciting SP that could affect its stability.Further details from each of the layers of the simulated RC scheme are described in the following subsections.

Input layer
The 1-GBd input symbol sequence, () is multiplied by the mask (), whose random values are generated from a uniform distribution over the interval [0, +1].The size of () is determined by the number of nodes  of the RC.Unless another value is specified, this work uses  = 50.Therefore, every symbol, which has a duration of 1.0 ns, belonging to the sequence (), is masked into a virtual node over a time interval of  = 1.0 ns  = 20 ps, producing  () (Fig. 1b).In order to fulfil the requirement of our model of using a quasi-monochromatic input electric field by using a small modulation index, we add a bias  = 8.0 to  () which was optimized with respect to the performance of the RC (with the rest of the parameters fixed).The resulting masked input signal, X (), has a modulation index of less than 2%.X () is transformed into an electric signal that linearly modulates the optical field from an ideal noiseless continuous-wave laser source that generates an optical pump at a frequency  p and with average power  in .We denote the resulting linearly modulated optical signal power at the input port of the MRR as  in ().Consequently, we can mathematically describe the input sequence X () and the electric field  in at the input port of the MRR (Fig. 1c) corresponding to the input optical signal (square root of  in ()), in the following way [18]:

Reservoir layer
The Runge-Kutta solution of Eq. ( 1)-( 3) is obtained with a step  = 2.0 ps.This value was sufficiently small to obtain an accurate solution for the cavity nonlinearities of the system under investigation in [24].There are  =   = 10 solver steps per virtual node (500 per symbol for  = 50).Hereafter, we sample and hold  in () for each  th step of the Runge-Kutta solver over an interval  for each  th virtual node to build the input electric field Êin () used in our model solver : Similar to the mathematical description of the scheme performed in [18], the external waveguide provides a delay  d to the optical signal that links the through and the add port.Throughout the simulations of the RC, we use a value of  d = 0.5 ns for the delay waveguide.This value was optimized as a function of the ratio between   and the symbol duration (1.0 ns) where a delay of a duration of half the symbol length was found to be the optimum value.The solver steps equivalent to the delay time are determined as τd =  d / = 250.We assume no counterpropagating modes in the microring cavity.Hence, the electric field samples at the add and drop ports can be expressed as: where  = 0.95 represents the optimized coupling factor between the delay waveguide and the bus waveguides of the MRR.As in the case of ,  was optimized with respect to the performance of the RC (lowest prediction error).The obtained value of  is also close to the one used similarly in [18].The model takes into account the phase shift  d due to propagation through the delay waveguide for the optical pump with wavelength  p .In [18], an adjustable external phase shift (Δ) in the delay waveguide was used to improve performance.In this work, Δ = 0 is used, for which similar performance is achieved than for other values of Δ within a limited range of configurations of the setup.The results obtained in this work could be further improved by a more systematic optimization of Δ, which is out of the scope of this work. d is defined as:

Readout layer
We average the  step values of the Runge-Kutta solution for Êdrop () over the duration  for each  th virtual node to obtain  drop (): Next, we simulate the photodiode response of the RC by calculating the square of  drop ().
Once the response of the RC is obtained, we calculate the -size weight vector  for the output layer of the RC by using ridge regression with an optimized regularization parameter Λ = 10 −9 .The impact of varying Λ for different values of the physical parameters considered in this work is not analyzed and we keep it constant.The  elements of the predicted sequence ŷ(n) are then determined as follows:

Benchmark
To investigate the impact of the relaxation times of the studied nonlinear effects and the waveguide attenuation we test the system when solving a chaotic time-series prediction task like NARMA-10.This task requires the high dimensionality given by the nonlinearities of the cavity and the photodiode response, in addition to memory capabilities as it requires a memory of 10 steps in the past to be solved.The NARMA-10 time-series target equation can be expressed as [46]: For this task, the sequence  () is taken from a uniform distribution over the interval: [0.0, 0.5].We use 2000 data points for the training and 2000 new data points for the testing.The whole data batch is processed into the RC at once.Additional to the 4000 data points, we add 200 warm-up data points at the start of both the training and the testing sets.This allows the Runge-Kutta solution to settle and eliminates any memory dependency between the sequences.To measure the performance of the RC, we calculate the normalized mean square error (NMSE) of the predicted sets.This metric is an estimation of the prediction errors, and so the lower the error, the higher the performance of the RC.The NMSE between the predicted sequence ŷ(), and the target () with a standard deviation  2   and length  data , can be expressed as:

Results
In the following subsections, all the results are obtained from simulations under a  in range of -20 to +20 dBm and a Δ/2 range of ±300 GHz.For every simulation with the same value of , we use the same generated random mask sequence ().The calculated NMSE of our simulation results corresponds to the testing set and it is averaged over 10 different seeds used to generate () and consequently, the NARMA-10 sequence.As previously mentioned, a normalization process is carried out when solving Eq. ( 1)-(3).However, for visualization purposes, we present the denormalized quantities of Δ () and Δ () in our results.When not otherwise specified, we use the following values for the analyzed parameters:  th = 50 ns,  FC = 10 ns,  = 0.2 dB/cm, = 50.The value of each parameter analyzed was selected as a middle point within realistic ranges of previously reported values from studies in the literature that use MRRs for photonic computing applications [34][35][36][37][38]. First, using the previous parameters as the initial conditions we obtain the results of subsection 5.1.

𝑃 in vs Δ𝜔/2𝜋 regions of NMSE
We divide the parameter space  in vs Δ/2 according to the NMSE performance and identify three different behaviours.The corresponding regions are labelled A, B, and C, as shown in Fig. 2. The first region, A, occurs at higher detuning from the (cold) resonance of the MRR in the heat maps, in which the NMSE gradually approximates a constant value.The region C is located near the resonance of the MRR (although this is not necessarily the case for other parameter configurations, as further results demonstrate).It achieves an NMSE > 1.0, which indicates an inconsistent response of the RC which makes it unable to achieve accurate predictions.The third region, B is located in the middle of regions A and C at both the red-shift and blue-shift sides of the spectrum.This region is the one with the best performance of the RC (NMSE<0.2).In Fig. 3, we show a slightly different perspective of the previous result by simulating a frequency sweep for a set of  in values that highlight the transitions between the different regions.In order to determine the reasoning behind the constant value that region A appears to approximate when we increase either positive or negative detuning, we determine first the performance of the RC in the absence of nonlinear dynamics in the MRR.In this scenario, we can expect the photodiode response at the detection stage of Fig. 1a 3. NMSE as a function of Δ for different levels of  in ( FC = 10 ns,  th = 50 ns,  = 0.8 dB/cm).We also include the NMSE result obtained when the system uses a photodiode as single nonlinear element of the TDRC.
system.Therefore, we simulate our system using a photodiode as the nonlinear node and with absence of MRR nonlinearities, which we refer to as photodiode-based RC in this work.For such a setup, we obtain approximately the same NMSE threshold as the one to which region A converges in Fig. 3 (NMSE ≈ 0.3475).This indicates that the MRR approaches a linear regime value the further we detune   from  0 .The slope of this trend increases in low power levels and it decreases for 20 dBm, as the high power holds the FCD nonlinearities for a longer Δ range.Later in this work, we also study the characteristic response from each of the regions.

Impact of the thermo-optic decay time
The value of  th is related to the thermal conduction and geometry properties of the microring silicon waveguide and the cladding.It is possible to tailor  th by altering the thickness of the cladding or its material.It is also possible to control  th by etching trenches around the microring [33,38].In Fig. 4, we vary  th while fixing the values of  fc and .The NMSE of the testing set prediction is obtained for the aforementioned ranges of average input power and frequency detuning.The minimum NMSE obtained for each value of  th is shown in Table 2.
When increasing  th , the TO effect inside the cavity becomes dominant over the FCD effect.Hence, as mentioned in section 2, the TO effect causes a red shift of the frequency resonance of the MRR, which is more evident in moderately higher powers (above 0 dBm) [34].This red shift appears to be reflected in the behaviour of region C with respect to   .Region C is shifted to negative detunings (redshift of   ) also in values of  in above 0 dBm.On the other hand, as  th increases, region B gets narrower, and the minimum NMSE obtained is also higher.In general, increasing  th shows to be detrimental to the performance of the RC, resulting in a limited tolerance for practical demonstration.The results of Table 2 seem to indicate that the longer  th the higher the required power to achieve the minimum NMSE of the results of Fig. 4 (a-d).Nevertheless, as region C shifts towards negative detunings, the location of minimum NMSE shifts to positive detunings where now regions A and B are located.

Impact of the free-carrier relaxation time
The carrier's lifetime within the diffusion effective area of a microring cavity is determined by recombination processes such as the Shockley-Read-Hall (SRH), radiative and Auger recombinations. FC depends on the density of holes in the valence band and electrons in the conduction band [47].In practice, the value of  FC is usually approximated taking into account just the SRH recombination rate if the deviation of carrier density from thermal equilibrium is much larger than the density of traps [44]. FC can also be adjusted by improving the quality of the silicon-silica interfaces [38].Using the same methodology as in the previous subsection, we determine the RC performance when varying  FC while fixing the values of  th and  (Fig. 5).We vary  FC from the order of picoseconds to the order of tens of nanoseconds, so that we can analyze the impact of both reducing and increasing  FC to have a deeper understaning of how the ratio  FC / th affects the RC.This ratio has a significant influence on the nonlinearities behaviour as discussed in further details later in this work.The minimum NMSE values obtained under the previous conditions are shown in Table 3.For a low value of  FC (Fig. 5 a and b), the TO effect is dominant and, similarly to the results obtained in Fig. 4, region C is shifted towards negative Δ.However, the extension over the Δ/2 axis seems to be reduced with respect to Fig. 4 a) when  FC is reduced.As  FC increases to the order of tens of nanoseconds, the FCD effect becomes more dominant and region C shifts towards positive Δ, which means a blueshift of   .This behaviour of region C is similar to the results of the previous subsection but in the opposite direction of the spectrum.This is to be expected since TO and FCD effects cause a detuning of  0 in opposite directions.Unlike the previous case though, when region C shifts towards either negative or positive Δs its width gets narrower, which translates into a larger region B. This is of crucial importance as it opens a significantly larger window of  in vs Δ where RC operates with a very low NMSE and low power (particularly in Fig. 5 e and f).This finding is also summarized by the results listed in Table 3.

Impact of the waveguide linear attenuation
The attenuation of a silicon MRR is directly related to the fabrication quality of the MRR waveguide and it has been extensively studied as high-quality factor () silicon microring cavities have been pursued during the last decades [48].However, it is also important in the context of this work to consider that modifying  th or  FC could lead to a collateral impact on , and therefore, it is important to assess the influence of  on the RC performance in order to fully grasp the possible performance penalties of altering the dimensions or physical properties of the MRR when tuning other parameters.Under the same frequency and average power conditions as before, we increase the value of  while fixing the ones of  th and  FC (Fig. 6).The minimum NMSE obtained for each increasing value of  is shown in Table 4. Contrary to the previous results, altering the waveguide attenuation does not have a great impact on the position of region C when  in increases, although it does have an effect on its extension.A higher attenuation appears to increase the size of region C and in turn, makes region B narrower.Therefore, a relatively high- MRR (low attenuation) is desirable in terms of performance of the RC.

Decreasing the number of virtual nodes
In order to make a fair comparison with the study on the MRR-based RC with external feedback done in [18], we decrease the number of virtual nodes to 25 and simulate the RC for a sample set of configurations from the previous subsections (Fig. 7).This set corresponds to the minimum and maximum values of  th ,  FC and  simulated in the previous subsections.There is a slight increase in the obtained minimum NMSE as is expected from a neural network with fewer virtual nodes (Table 5).Likewise, when reducing  to 10 virtual nodes, further slight degradation of the performance is observed.For this case (Fig. 8), the reduction of the size over the parameter space of region B is more noticeable than in the case of  = 25.The minimum NMSE for  = 10 is shown in Table 6.The results indicate that even when the minimum error obtained increases, there is still no dependence of the defined  in vs Δ regions (A, B and C) on , as the obtained trends are very similar between the different values of .It is possible to argue that the gain in terms of minimum NMSE is relatively small when increasing  up to 50.This indicates that the dimensionality of the RC is mainly given by the MRR nonlinear dynamics, and, within the considered values,  does not impact the behaviour of the defined regions as much as  th ,  FC and .In Fig. 9 (a-f), we display an instance of the waveform of the MRR response at the drop port for each of the defined regions shown in Fig. 2. To switch between the regions, we fix the value of  in to 0 dBm, while varying Δ/2.Fig. 9(a-c) correspond to the response of the same 10 bits (10 ns) of the testing sequence set previously shown in Fig. 1a and 1b.By zooming out this sequence to capture 1000 bits (1 s), we obtain Fig. 9(d-f).The response of the RC in Fig. 9(a,d) (region A) shows a delayed linear transformation of the input in which the signal does not differ much from the input sequence.Due to the lack of nonlinearity, the prediction fails to resemble the original NARMA-10 sequence, resulting in a relatively high NMSE = 0.3357 as the system approaches the response of a photodiode-based RC as explained in subsection 5.1 for region A. Next, the waveform examples of region B (Fig. 9(b,e)) show a nonlinear transformation of the input sequence under stable conditions (avoiding SP) which gives enough dimensionality to the RC to achieve a low error (NMSE = 0.0269).Lastly, Region C, which can be observed in Fig. 9(c,f) shows SP oscillatory behaviour with various discontinuities or 'spikes' and fast transitions to values near zero amplitude (more visible in Fig. 9(f)).This behaviour of the SP affects the consistency of the RC response as in these time intervals RC is incapable of learning a consistent response to similar inputs in order to be tested with unknown data sets.

RC linear vs nonlinear regimes
The previous subsections qualitatively indicate the amount of nonlinear transformation the input goes through in each region.A way to quantitatively evaluate this is by determining the coefficient of determination,  2 , between  drop and  in so that we can quantify, on a range [0.0 − 1.0], how much the response of the RC (before photodetection) can be accounted as a linear transformation of the input.In other words, an  2 of 1.0 would indicate that the RC approximates a linear regime.We show the results for 6 instances of the configurations used in Fig. 2-4 (2 per subsection).As demonstrated in Fig. 10.Each of the 3 defined regions of the previous subsections matches specific levels of  2 .First, it confirms that the cavity gradually gets close to a linear regime as the  2 tends to 1.0 as we increase the detuning from  0 (Region A).Then, the best performance is obtained in a mixed state between a linear and nonlinear regime where  2 oscillates between ∼0.2 and ∼0.9.The location of this mixed state matches that of region B. Lastly, an  2 of 0 is obtained in the same area that corresponds to SP (region C), which indicates that there is no direct relation between the response of the RC and the input.Similar findings regarding the importance of the transition between linear and nonlinear regimes to enhance TDRC performance were previously found in [49] for a Mach Zehnder Modulator used as a nonlinear node of photonic TDRC.However, in their implementation, the TDRC performs a different time-series prediction task.

Dependence of the RC dynamics on ΔT and ΔN
Finally, we also assess the relation between the defined regions (A, B and C) over the  in vs Δ/2 parameter space, with Δ and Δ.To achieve this, we plot the average of Δ and Δ for the Runge-Kutta solution of Eq. ( 1)-( 3) of the whole testing set.By comparing their  in vs Δ/2 heatmaps with the ones related to the NMSE and  2 , as shown in Fig. 11, we can see a clear relation between the increase in free-carrier concentration and temperature, and the rise of SP with its previously stated effects on the RC response.Very low average Δ or Δ are not good either for the performance of the RC, as this shows to be highly correlated with the MRR approximating a linear regime.To achieve optimum performance, relatively small increases in temperature ∼(2.5 to 15 K) and a Δ between ∼10 18 and ∼10 22 appear to be the key requirements.However, the susceptibility of the MRR cavity to enable SP for a given Δ and  in also depends strongly on the ratio between the relaxation lifetimes of the nonlinear effects as discussed in section 6.

Discussion
In our study, the free-carrier and TO nonlinearities relaxation times ( FC ,  th ) and the waveguide attenuation  have been varied while fixing the other parameters in order to understand the individual contribution of each of them to the overall performance of the RC.However, in practice, each of them is highly correlated with the others and the potential changes that can be made to the MRR to optimize one of them will most probably affect the others.In the case of  FC and  th it is also essential to consider the ratio between them,  FC / th .As shown in [37,38], this ratio has a severe influence on the existence of SP, depending in turn on Δ and  in .In [37] the authors conclude that in order to enhance SP, a short  FC is required with respect to  th , but not too short (it should be longer than the photon lifetime, i.e., the relaxation time of the resonance  r =  Si   ).This condition is matched in the simulated parameters used to obtain the results of Fig. 4. By increasing  th , the ratio  FC / th becomes smaller and since  FC = 10 ns is higher than  r (∼0.14 ns for  = 0.8 dB/m), conditions are favourable to reinforce SP and extend its region.Due to the increase of  th , the TO effect is dominant and the SP region is shifted towards negative detuning (red-shift of the resonance) [38].
Hence, for a given , one way to diminish SP behaviour is to reduce  FC as much as possible, so that the requirement  FC < r is fulfilled.A few attempts to try to reduce  FC to sub-nanosecond magnitudes are found in the literature: A  FC of 55 ps is achieved in [26] using ion implantation in a silicon MRR, although the waveguides are penalized with additional 22 dB/cm propagation losses.Moreover, a  FC of 15 ps is reported in [27] using an oxygen-implanted silicon on insulator (SOI) MRR for all-optical switching purposes.Lastly, in [50], a p-i-n junction is integrated into an SOI waveguide to reduce  FC to 12.2 ps with an attenuation loss of 2 dB/cm.
Another way to keep a certain level of nonlinear dynamics without reinforcing SP would be to increase the ratio  FC / th [37].This can be done by decreasing  th so that both relaxation times are around the same order of magnitude (Fig. 4(a)), but this is more challenging as  th is very dependent on the quality of the fabrication process and the geometry of the silicon waveguide. th is also relatively fixed by the required width of the cladding material.So, the other way to increase  FC / th would be to increase  FC , and with it, the strength of free-carrier nonlinearities.Our results are coherent with these previous methods as the area of very high error (Region C) that corresponds to SP is reduced by making  FC either too small in comparison to  th and smaller than  r (order of picoseconds, Fig. 5(a)) or by making  FC / th >= 0.5 (Fig. 5(b)).Both scenarios provide enough nonlinearity to the RC and region B is extended to lower levels of power.
Regarding the linear losses, resonator cavities with a high -factor are indispensable to achieve the necessary nonlinear dynamics for the RC to operate with low NMSE [37].Higher linear losses contribute to the increase in heat due to power absorption and in turn, the TO effect becomes the dominant effect and the region of SP is widened.On the contrary, low linear waveguide losses (which normally translate into relatively high  MRRs) allow the dynamics of free-carrier nonlinearities to become more relevant within the cavity.As FCD becomes more dominant, the resonance is blue-shifted and the SP region shifts to positive detunings [37].Our results in Fig. 6 match the previously described physical dynamics.
Following the aforementioned insights from studies about SP in MRR cavities, therefore, provides a way to enhance the range of  in and Δ/2 in which region B occurs.These conditions are consistent with our results.The existence of this interval of low-error prediction (region B in our case) also fits well with the results of [39], where, as we mentioned before, a power interval is also found for FCD nonlinearities in which the RC is given enough dimensionality while keeping consistency using directly coupled cavities.
As a matter of performance comparison, we state previous works about photonic TDRC schemes solving the NARMA-10 task with the same number of virtual nodes ( = 50) and similar speed.An optoelectronic scheme using a Mach-Zehnder modulator achieved an NMSE = 0.168 ± 0.015 [10].Even though, we acknowledge that the results in [10] were obtained both in simulations and experimentally.Numerically, an NMSE = 0.103 ± 0.018 was obtained in [14].In order to perform a fair comparison of our results with [18], we simulate the system using the highest and lowest values of the nonlinear effects lifetimes and waveguide loss considered in this study, for  = 25 nodes.The minimum NMSE with that number of virtual nodes numerically obtained in [18] is 0.204 ± 0.026.In our results, we obtain a minimum NMSE of 0.0151 ± 0.0021, which is a considerable improvement over [18] and we could expand the region in which the RC achieves its best performances in comparison to our previous study [24].

Conclusion
Throughout this work, we have investigated the relationship between the performance of an MRR-based TDRC and the lifetimes of the free-carrier and TO effects as well as the impact of the waveguide attenuation.We simulate different values for the parameters ( FC ,  th and ), and define three  in vs Δ/2 regions in which each of them shows a very different level of error prediction.We characterize such regions by first showing qualitatively the waveform response of each region and then we quantify the differences in nonlinearity between such regions using  2 .We show that when the response at the drop port can be potentially represented by a linear transformation of the input sequence, the nonlinearities of the system are given mostly by the photodiode response at the detection stage.When the  2 between the output and input of the RC is very low, the system might run into the risk of obtaining a response too inconsistent to accurately calculate the weights during the training process due to the discontinuities and near-to-zero response of the RC caused by SP nonlinearities.There is an interval of average input power and angular frequency detuning in which enough dimensionality is given to the RC without becoming unstable due to SP.This work provides a further understanding of the physical conditions that are optimum to reduce SP while keeping a high dimensionality.In the areas that fulfil such conditions, our RC achieves low error at low power when solving a time-series prediction.Our results obtain an NMSE lower than other works that propose a similar  and speed.We also show that decreasing  does not have a great impact on the physical dynamics of this RC setup.Disclosures.The authors declare no conflicts of interest.Data Availability.Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Fig. 1 .
Fig.1.a) Photonic TDRC scheme using a silicon MRR with delayed feedback.b) Data sequence and masked input of the RC, taken from an individual testing set of the RC simulations.c) Corresponding electric field envelope of the signal at the input port of the MRR.The small modulation index approximates a quasi-monochromatic optical signal.
Fig.3.NMSE as a function of Δ for different levels of  in ( FC = 10 ns,  th = 50 ns,  = 0.8 dB/cm).We also include the NMSE result obtained when the system uses a photodiode as single nonlinear element of the TDRC.

-
Fig. 5. NMSE of the simulated RC for different values of  FC with  th = 50 ns,  = 0.8 dB/cm.

α
Fig. 6.NMSE of the simulated RC for as a function of , and  th = 50 ns,  FC = 10 ns.

Fig. 9 .
Fig. 9. a-c) Sampled waveforms at drop port corresponding to 10 bits of each of the three defined  in vs Δ/2 parameter space regions of the simulated RC when solving the NARMA-10 task with  in = 0 dBm, N = 50,  th = 50 ns,  FC = 10 ns,  = 0.8 dB/cm.a and d-f) 1.0 s Extended sampled waveforms under the same conditions.g-i) Samples of the target (blue) and predicted (orange) testing sets of a NARMA-10 sequence.

Fig. 11 .
Fig. 11.Comparison of the  in vs Δ/2 heatmaps for a) NMSE of the testing set.b)  2 between  drop and  in c) Average Δ of the testing set computing.d) Average Δ of the testing set computing. th = 50 ns,  FC = 10 ns,  = 0.8 dB/cm.

Table 1 .
Optical parameters used in the photonic RC simulations.

Table 4 .
Minimum NMSE of testing set prediction for selected values of  ( FC = 10 ns,  th = 50 ns).

Table 5 .
Minimum reached NMSE and corresponding parameters for the testing set prediction with  = 25.

Table 6 .
Minimum reached NMSE and corresponding parameters for the testing set prediction with  = 10.
5.6.Characteristic RC response of the  in vs Δ/2 regions of NMSE