Image edge detection with a photonic spiking VCSEL-neuron

We report both experimentally and in theory on the detection of edge features in digital images with an artificial optical spiking neuron based on a vertical-cavity surface-emitting laser (VCSEL). The latter delivers fast (< 100 ps) neuron-like optical spikes in response to optical inputs pre-processed using convolution techniques; hence representing image feature information with a spiking data output directly in the optical domain. The proposed technique is able to detect target edges of different directionalities in digital images by applying individual kernel operators and can achieve complete image edge detection using gradient magnitude. Importantly, the neuromorphic (brain-like) spiking edge detection of this work uses commercially sourced VCSELs exhibiting responses at sub-nanosecond rates (many orders of magnitude faster than biological neurons) and operating at the important telecom wavelength of 1300 nm; hence making our approach compatible with optical communication and data-centre technologies. Published by The Optical Society under the terms of the Creative Commons Attribution 4.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal


Introduction
Neuromorphic spiking models, developed to emulate the spiking responses of biological neurons in the brain [1], are attracting growing interest as new routes to novel computing paradigms are explored. The brain, with its vast interconnected networks of neurons, is known to be capable of performing complex, human-like decision making more efficiently than traditional von Neumann computing architectures, making the realisation of neuronal spike-based functionalities a key priority. Neuromorphic platforms based on electronic circuits have been developed in recent years for both research and technology applications, with systems such as Neurogrid [2], SpiNNaker [3], TrueNorth [4] and, more recently, the BrainScaleS 2 neuromorphic chip [5] coming to fruition. These impressive systems have thrived due to the maturity of CMOS technology; however, photonic approaches are also sought after by researchers due to the potentially high energy efficiency, large bandwidth, low crosstalk and fast spiking rates offered by light-based platforms. The push for photonic realisations has inspired a wave of optical spiking neuronal models based on numerous technologies such as phase change materials (PCMs) [6,7], resonant-tunnelling diodes (RTDs) [8,9], photonic cavities [10], optical modulators [11] and semiconductor lasers (SLs) [12-25]. SL technology has given rise to a large number of spiking neuronal models with devices such as quantum dot lasers [12,13], micropillar lasers [14,15] and lasers with saturable absorber sections [16], to name a few (see [17] for a review).
Among the most promising SL neuronal models are vertical-cavity surface-emitting lasers (VCSELs) [18-25]. These devices are reported to generate controllable excitable (spiking) dynamics along with key neuronal functionalities such as input thresholding and tonic spike activation [18,19]. The complex non-linear dynamics responsible for the excitable responses have also provided pathways to inhibitory dynamics [20], controllable spike rate-encoding for digital/analog-to-spike conversion at rates surpassing 1 GHz [21], and input integration [22,23] for realisations of logic and supervised pattern recognition systems. Additionally, theoretical studies into the achievable neuronal functionalities of VCSEL devices, based on the spin-flip model (SFM) [26,27], suggest their capability to perform learning based upon spike-timing dependent plasticity (STDP) as observed in biological neural networks [24,25]. The neuromorphic capabilities of these VCSEL-neurons are therefore exciting, and are further enhanced by the inherent advantages of VCSELs, such as their affordability, high-speed and low-energy operation, ease of integration into 2D arrays of devices, telecom wavelength compatibility with data centre technologies (C and O telecom bands) and sub-nanosecond spiking dynamics, demonstrated at up to 7 orders of magnitude faster than biological timescales.
Image interpretation and classification is one of the characteristic tasks enabled by artificial neural networks. This complex functionality, realised computationally using convolutional neural networks (CNNs), applies the traditional image processing technique of convolution in a large parallel network to achieve an overall recognition by assembling and comparing many smaller features [28]. Like neuromorphic spiking models, photonic realisations of CNNs have recently come under increasing investigation [29-33], as it is again expected that the energy efficiency, speed and inherent parallelism offered by light-based technology will benefit the computationally intensive convolution task. At the heart of these complete systems, numerous photonic technologies such as ring resonators and photodiodes [29,30], Mach-Zehnder interferometers [31,32] and PCMs [33] are being used to create elements that perform weighting, summation, thresholding and activation functionalities.
In this work, we propose and demonstrate edge-feature image detection using a single artificial spiking VCSEL-neuron. The latter fires optical spiking patterns in response to pre-convolved optical inputs, hence representing edge features (with different directionalities) in digital images with sub-ns long optical spikes. Specifically, we apply our photonic VCSEL-neuron technique to the edge detection of digital images at the key telecom wavelength of 1300 nm. Unlike reported implementations of CNNs, our technique utilises time-division multiplexed optical inputs to enable operation with a single VCSEL-neuron, hence reducing hardware requirements, and provides output data in a spiking representation, directly in the optical domain. We provide both experimental findings and theoretical results modelling the anticipated spiking responses of the VCSEL-neuron based on the SFM [26,27]. This paper is organised as follows: Sections 2 and 3 discuss the convolutional and experimental techniques applied to achieve image edge detection with a spiking photonic VCSEL-neuron. Section 4 describes the theoretical model used to validate the experimental findings. Section 5 provides theoretical and experimental results on vertical and horizontal image edge detection, and Section 6 discusses and provides results on image gradient magnitude detection.

Image processing and convolution technique for image edge detection
In this work, image convolution is incorporated alongside an experimental realisation of a photonic spiking VCSEL-neuron (via modulated optical injection) for the first time, as shown in Fig. 1. In this section, we detail the pre-processing image convolution technique utilised (Fig. 1(a)). Initially, digital greyscale source images undergo conversion to positive (black) and negative (white) integer (1 and −1) matrices. Source images that contain different directional features were selected, as shown in Fig. 2. The 28 × 28 pixel cross and saltire (Scotland's national flag) images (Fig. 2(a)-(b)) were chosen as they contain horizontal, vertical and diagonal features. The larger 50 × 50 pixel image in Fig. 2(c) corresponds to the logo of our institute, the Institute of Photonics (IOP) at the University of Strathclyde. The IOP logo contains additional curved features and is much larger in size, illustrating the versatility of the technique. In order to reveal edge information in these images, convolution is performed using a 2 × 2 kernel operator. The latter applies a weight to each pixel in a 2 × 2 region of the image and sums all 4 weighted pixel values. The 4-value sum corresponds to the destination pixel value in the new convolved image (as shown in Fig. 1(a)). Different features can be targeted for recognition by applying different kernel operators, and the 2 × 2 kernels are scanned along every pixel in a row, and every row in an image (sliding window method). By comparing neighbouring pixels in this convolution process, we are able to identify features that best match our selected kernel operator. This process can be summarised using the equation:

$$g_{p,q} = \sum_{i=0}^{M} \sum_{j=0}^{N} K_{i,j}\, f_{p+i,\,q+j}$$

where $g_{p,q}$ is the value of the destination pixel when the source image anchor-pixel $f_{p,q}$ is operated on by kernel $K$. An $(M+1) \times (N+1)$ pixel neighbourhood is operated on by the customisable kernel operator; in this work we set $M = N = 1$ to achieve a 2 × 2 neighbourhood array. No image buffer was used, hence both the number of rows and the number of columns in the new convolved image are reduced by 1.

Fig. 1. Image convolution technique utilised to achieve spiking image edge detection and the experimental setup employed to implement it. Image processing procedure (a) and experimental realisation of the VCSEL-neuron (b). In (a), a black and white source image is converted into positive (1) and negative (−1) integers before it is multiplied by a 2 × 2 kernel operator. The resulting image is converted into an RZ image input where the destination pixel value is taken and encoded into the tunable laser's optical intensity. In (b), light from a tunable laser is encoded with the convolved image input using a Mach-Zehnder intensity modulator (MZ). The intensity-encoded signal is injected into the spiking VCSEL-neuron, whose response is collected via the optical circulator and analysed using a fast real-time oscilloscope. Two polarisation controllers (PC), an optical isolator (OI) and a variable optical attenuator (VOA) are used to control the light signals within the fibre-optic based experimental setup of this work.

The destination pixel value $g_{p,q}$ is injected into the VCSEL-neuron (Fig. 1(b)), which then identifies, with ultrashort (sub-nanosecond) neuron-like spikes, which pixels in the original source image contain the target feature. In order to generate the input signals entering the VCSEL-neuron, time-division multiplexing was used. In the time-multiplexed image input, each destination pixel was sequentially allocated the same configurable pixel duration. The latter was set to 1.5 ns/pixel to coincide with the spiking refractory period of the VCSEL-neuron [22]. This, along with an inter-pixel return-to-zero (RZ) coding scheme, helped to ensure that all neighbouring pixels were capable of triggering a single spiking event per input. All destination pixel values were held for 0.25 ns before returning to zero.
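The sliding-window convolution described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' processing code: the kernel values are an assumption (the actual Kernels 1-4 are defined in Fig. 2 and are not reproduced in the text), and the 4 × 4 toy image is hypothetical. With black = +1 and white = −1, a ±1 kernel of this kind gives a full-amplitude value (±4) on a straight edge and a half-amplitude value (±2) at a corner.

```python
def convolve_2x2(image, kernel):
    """Sliding-window sum g[p][q] = sum_ij K[i][j] * f[p+i][q+j], no padding.

    The output has one row and one column fewer than the input.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            sum(kernel[i][j] * image[p + i][q + j]
                for i in range(2) for j in range(2))
            for q in range(cols - 1)
        ]
        for p in range(rows - 1)
    ]


# ASSUMPTION: hypothetical kernel that matches a white-to-black (-1 -> +1)
# transition from left to right, i.e. one polarity of vertical edge.
KERNEL_VERTICAL = [[-1, 1],
                   [-1, 1]]

# Hypothetical toy image: a 2 x 2 black square (+1) on a white (-1) background.
toy_image = [
    [-1, -1, -1, -1],
    [-1,  1,  1, -1],
    [-1,  1,  1, -1],
    [-1, -1, -1, -1],
]

convolved = convolve_2x2(toy_image, KERNEL_VERTICAL)
for row in convolved:
    print(row)
# [2, 0, -2]
# [4, 0, -4]
# [2, 0, -2]
```

In this toy case the matching edge yields +4, the inverse edge yields −4, and the corners of the square yield ±2.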
The time-division multiplexed image input was produced by a 12 GSa/s, 5 GHz bandwidth arbitrary waveform generator (AWG, Keysight M8190A) and fed to the VCSEL-neuron. The latter performs thresholding and spike activation operations to provide output data with a spiking representation (directly in the optical domain) of the edge features present in the injected digital images, as shown in Fig. 1(b). We note that the image data input, here demonstrated at 1.5 ns/pixel (based on the refractory period of the commercially obtained VCSEL-neuron), could be achieved at faster rates, e.g. with sub-ns long pixel inputs, by further optimisation of device design and fabrication, which is beyond the scope of this study. Similarly, we note that overall processing rates can be improved by expanding from a single device to a multi-device architecture with parallel artificial neurons. These scaling-up approaches are of high interest and we plan to explore such alternative system architectures in future work.
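The time-division multiplexed RZ input described above can be sketched as follows. The sample rate, pixel duration and hold time are those quoted in the text; the row-by-row raster order, the placement of the held value at the start of each pixel slot, the unscaled amplitudes and the function name `rz_waveform` are assumptions made for illustration.

```python
SAMPLE_RATE_GSA = 12.0  # AWG sample rate quoted in the text (GSa/s)
PIXEL_NS = 1.5          # time slot allocated to each destination pixel (ns)
HOLD_NS = 0.25          # time the pixel value is held before returning to zero (ns)


def rz_waveform(convolved_rows, sample_rate=SAMPLE_RATE_GSA,
                pixel_ns=PIXEL_NS, hold_ns=HOLD_NS):
    """Flatten a convolved image into a return-to-zero sample sequence.

    ASSUMPTION: pixels are read row by row, and each value is held at the
    start of its slot before the signal returns to zero.
    """
    samples_per_pixel = round(pixel_ns * sample_rate)  # 18 samples per pixel
    hold_samples = round(hold_ns * sample_rate)        # 3 samples held
    waveform = []
    for row in convolved_rows:
        for value in row:
            waveform.extend([float(value)] * hold_samples)
            waveform.extend([0.0] * (samples_per_pixel - hold_samples))
    return waveform


# Hypothetical 2 x 2 block of destination pixel values.
wave = rz_waveform([[4, 0], [-4, 2]])
print(len(wave))  # 72 samples: 4 pixels x 18 samples
```

Under these assumptions, the 27 × 27 convolved version of a 28 × 28 source image occupies 729 pixel slots, i.e. 729 × 1.5 ns = 1093.5 ns per image.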

Experimental VCSEL-neuron implementation for ultrafast spiking edge detection
The experimental setup used to implement the photonic VCSEL-neuron and perform spiking image edge detection is shown in the schematic diagram of Fig. 1(b). Similar to our previous work [23], modulated optical injection was used to trigger spiking responses from the VCSEL-neuron. Light from a 1300 nm tunable laser (TL, Santec TLS-210 V) was optically encoded with the time-division multiplexed image input (from the initial pre-processing step). A Mach-Zehnder (MZ) intensity modulator was configured to introduce drops of intensity in the tunable laser's light when subject to positive pulses from the image input. The encoded optical signal was passed to a coupler, where a power meter (PM) measured the input power, and a circulator injected the optical signal into the VCSEL-neuron. An amplified 9 GHz bandwidth photodetector (Thorlabs PDA8GS) was used to collect the output of the VCSEL-neuron, and a high-speed 13.5 GHz bandwidth, 40 GSa/s sampling rate, real-time oscilloscope (OSC, Agilent Infiniium DSO81304B) was used for temporal analysis. Throughout this work, the commercially sourced, fibre-pigtailed VCSEL device was biased with a current of 6.5 mA ($I_{th}$ = 2.96 mA) and temperature-stabilised at 298 K. Under these operating conditions, the VCSEL exhibited single-mode lasing and the presence of two linear, orthogonally polarised modes, namely a main lasing (parallel) and a subsidiary attenuated (orthogonal) polarisation mode. Optical injection was made at a frequency detuning ($\Delta f$) of −4.58 GHz from the peak of the subsidiary (orthogonal) mode, inducing polarisation switching during injection locking. An injection power of 152.7 µW was used to injection lock the device. Encoded intensity drops of sufficient amplitude were used to force the laser out of injection locking and into a regime of fast spike firing dynamics.
The stability of experimental parameters such as laser bias, temperature, injection power and detuning is important to the controllable nature of the excitable dynamics with condition deviations affecting the overall performance of the spiking system. The VCSEL-neuron is therefore performing input thresholding and spike activation, allowing the system to reveal target feature information through the triggering of fast neuromorphic spiking events in the optical domain at the key telecom wavelength of 1300 nm.

Theoretical analysis of the VCSEL-neuron with the spin flip model (SFM)
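We use a modified version of the well-known spin-flip model (SFM) [27] to evaluate theoretically the operation of the VCSEL-neuron and validate the experimental findings. The modified model used here includes an additional term that accounts for the convolution step, namely the injection of the time-varying post-kernel image input. The modified rate equations are:

$$\frac{dE_x}{dt} = -(k + \gamma_a)E_x - i(k\alpha + \gamma_p)E_x + k(1 + i\alpha)(N E_x + i n E_y) + k_{inj} E_{inj}(t)\, e^{i\Delta\omega_x t} + F_x$$

$$\frac{dE_y}{dt} = -(k - \gamma_a)E_y - i(k\alpha - \gamma_p)E_y + k(1 + i\alpha)(N E_y - i n E_x) + F_y$$

$$\frac{dN}{dt} = -\gamma_N \left[ N\left(1 + |E_x|^2 + |E_y|^2\right) - \mu + i n \left(E_y E_x^* - E_x E_y^*\right) \right]$$

$$\frac{dn}{dt} = -\gamma_s n - \gamma_N \left[ n\left(|E_x|^2 + |E_y|^2\right) + i N \left(E_y E_x^* - E_x E_y^*\right) \right]$$

where the subsidiary (orthogonally polarised) and solitary (parallel-polarised) lasing modes of the VCSEL are represented by subscripts $x$ and $y$ respectively. The field amplitudes of the subsidiary and solitary modes are represented by $E_x$ and $E_y$. $N$ is the total carrier inversion between conduction and valence bands and $n$ is the carrier inversion difference between spins of opposite polarity. $\gamma_a$ is the gain anisotropy (dichroism) rate, $\gamma_p$ is the linear birefringence rate, $\gamma_N$ is the decay rate of the carrier inversion and $\gamma_s$ is the spin-flip rate. $k$ is the field decay rate, $\alpha$ is the linewidth enhancement factor and $\mu$ is the normalised pump current ($\mu = 1$ represents the VCSEL's threshold). $E_{inj}$ represents the post-kernel image input created during the convolution process and $k_{inj}$ is the injection strength. The angular frequency detuning is defined as $\Delta\omega_x = \omega_{inj} - \omega_0$, where the central frequency $\omega_0 = (\omega_x + \omega_y)/2$ lies between the frequencies of the subsidiary mode, $\omega_x = \omega_0 + \alpha\gamma_a - \gamma_p$, and the solitary mode, $\omega_y = \omega_0 + \gamma_p - \alpha\gamma_a$. $\Delta f = f_{inj} - f_x$ is the frequency detuning between the injected field and the subsidiary mode, hence $\Delta\omega_x = 2\pi\Delta f + \alpha\gamma_a - \gamma_p$. The spontaneous emission noise terms $F_x$ and $F_y$ are calculated as:

$$F_x = \sqrt{\frac{\beta_{sp}\gamma_N}{2}} \left( \sqrt{N + n}\, \xi_1 + \sqrt{N - n}\, \xi_2 \right)$$

$$F_y = -i \sqrt{\frac{\beta_{sp}\gamma_N}{2}} \left( \sqrt{N + n}\, \xi_1 - \sqrt{N - n}\, \xi_2 \right)$$

where $\beta_{sp}$ represents the spontaneous emission strength and $\xi_{1,2}$ represent two independent Gaussian white noise terms of zero mean and unit variance. The model was solved using the fourth-order Runge-Kutta method and the following parameters: $\gamma_p = 128$ ns$^{-1}$, $\gamma_a = 2$ ns$^{-1}$, $\gamma_N = 0.5$ ns$^{-1}$, $\gamma_s = 110$ ns$^{-1}$, $\alpha = 2$, $k = 185$ ns$^{-1}$, $k_{inj} = 15$ ns$^{-1}$ and $\beta_{sp} = 10^{-5}$.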
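To make the numerical procedure concrete, the sketch below integrates a deterministic version of the SFM with optical injection into the subsidiary ($x$) mode using a fourth-order Runge-Kutta step. It is a hedged illustration under stated assumptions, not the authors' simulation: the spontaneous emission noise terms are omitted, the pump level `MU`, the time step `DT`, the initial state and the constant normalised injection amplitude are all assumptions, and the standard SFM form with an injection term in the $x$-mode equation is assumed. No attempt is made here to reproduce the reported spiking results.

```python
import cmath
import math

# Parameters quoted in the text (rates in ns^-1).
GAMMA_P, GAMMA_A, GAMMA_N, GAMMA_S = 128.0, 2.0, 0.5, 110.0
ALPHA, K, K_INJ = 2.0, 185.0, 15.0
DELTA_F = -4.58  # GHz, detuning quoted in the experimental section

# ASSUMPTIONS (not given in the text): pump level and integration step.
MU = 2.2
DT = 1e-3  # ns

# Angular detuning relative to the central frequency (rad/ns).
DELTA_OMEGA_X = 2 * math.pi * DELTA_F + ALPHA * GAMMA_A - GAMMA_P


def derivatives(t, state, e_inj):
    """Right-hand side of the noise-free SFM with injection into the x mode."""
    ex, ey, n_tot, n_diff = state
    intensity = abs(ex) ** 2 + abs(ey) ** 2
    # i(Ey Ex* - Ex Ey*) is purely real, so keep only the real part.
    cross = (1j * (ey * ex.conjugate() - ex * ey.conjugate())).real
    dex = (-(K + GAMMA_A) * ex - 1j * (K * ALPHA + GAMMA_P) * ex
           + K * (1 + 1j * ALPHA) * (n_tot * ex + 1j * n_diff * ey)
           + K_INJ * e_inj * cmath.exp(1j * DELTA_OMEGA_X * t))
    dey = (-(K - GAMMA_A) * ey - 1j * (K * ALPHA - GAMMA_P) * ey
           + K * (1 + 1j * ALPHA) * (n_tot * ey - 1j * n_diff * ex))
    dn_tot = -GAMMA_N * (n_tot * (1 + intensity) - MU + n_diff * cross)
    dn_diff = -GAMMA_S * n_diff - GAMMA_N * (n_diff * intensity + n_tot * cross)
    return (dex, dey, dn_tot, dn_diff)


def rk4_step(t, state, dt, e_inj):
    """One classical RK4 step; the injected amplitude is held over the step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = derivatives(t, state, e_inj)
    k2 = derivatives(t + dt / 2, shifted(state, k1, dt / 2), e_inj)
    k3 = derivatives(t + dt / 2, shifted(state, k2, dt / 2), e_inj)
    k4 = derivatives(t + dt, shifted(state, k3, dt), e_inj)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(e_inj_of_t, duration_ns, dt=DT):
    """Return the total intensity |Ex|^2 + |Ey|^2 after every time step."""
    state = (0j, 1 + 0j, 1.0, 0.0)  # ASSUMED initial state (Ex, Ey, N, n)
    intensities = []
    for step in range(int(round(duration_ns / dt))):
        t = step * dt
        state = rk4_step(t, state, dt, e_inj_of_t(t))
        intensities.append(abs(state[0]) ** 2 + abs(state[1]) ** 2)
    return intensities


# Constant, normalised injection amplitude (assumption) for 2 ns.
intensity = simulate(lambda t: 1.0, duration_ns=2.0)
print(round(DELTA_OMEGA_X, 3))  # -152.777 rad/ns
```

In a fuller model, `e_inj_of_t` would return the time-multiplexed image input, and the noise terms $F_x$ and $F_y$ would be added at each step.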

Vertical and horizontal edge detection in source images using a spiking VCSEL-neuron
The detection of horizontal and vertical edge features was first tested using the 'cross' source image in Fig. 2(a). Kernels 1-4 (shown in Fig. 2) were sequentially applied to the source image and the resulting input values were experimentally injected into the VCSEL-neuron. Kernels 1-2 target vertical lines that transition from white-to-black (Kernel 1) and black-to-white (Kernel 2) pixels. Kernels 3-4 in turn target horizontal lines that transition from white-to-black (Kernel 3) and black-to-white (Kernel 4) pixels. Figures 3(a-b) and 3(e-f) show the spiking responses measured at the output of the VCSEL-neuron when Kernels 1-4, respectively, are applied individually. Image reconstruction maps are built to depict the collected time series from the VCSEL-neuron as intensity colour maps, where the spiking responses appear yellow and the resting state appears blue. Pixels with spiking responses should indicate the presence of the target feature in the source image. As expected, Figs. 3(a) and 3(b) demonstrate the triggering of a spike in response to vertical edges in the cross image. In Fig. 3(c) the image input, created using vertical Kernel 1, is shown for row 10 of Fig. 3(a). The image input injects a positive pulse for the detection of a matching target feature and a negative pulse for the detection of an inverse target feature. The corresponding response of the VCSEL-neuron, shown in Fig. 3(d), demonstrates that only the positive pulse triggers a fast ∼100 ps spike at the output of the VCSEL-neuron, highlighting the detection of the target feature. Therefore, applying Kernel 2 as shown in Fig. 3 3(h). Here, as Kernel 3 is scanned horizontally along that specific row of the source image, we create an input consisting of multiple positive target detections. The VCSEL-neuron responds to this input by firing multiple spiking events, one for each of the target detections.
The spiking system is not triggered by the half-amplitude pulses (corresponding to corner edges), as the encoded input energy was not enough to cross the spiking activation threshold of the device.
Fig. 3. Input (c) and output (d) correspond to row 10 of (a). Input (g) and output (h) correspond to row 12 of (e). Pixel duration is set to 1.5 ns/pixel in all cases.
These results demonstrate that the experimental spiking system can threshold and activate spiking responses for horizontal edges when applying Kernels 3 and 4.
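The thresholding behaviour described above can be illustrated with an idealised stand-in for the VCSEL-neuron: a hard threshold applied to each convolved pixel value. This sketch does not model the laser dynamics. The input values (±4 on straight edges, ±2 at corners) follow from a hypothetical ±1 kernel applied to a ±1 image, and the threshold values of 3 and 1 are assumptions chosen only to sit above and below the half-amplitude level.

```python
def ideal_spike_map(convolved, threshold):
    """Return 1 where the input exceeds the threshold, else 0.

    Idealised stand-in for thresholding and spike activation; negative
    (inverse-feature) pulses never fire.
    """
    return [[1 if value > threshold else 0 for value in row]
            for row in convolved]


# Hypothetical convolved input: full-amplitude edges (+/-4) and
# half-amplitude corners (+/-2).
convolved = [[2, 0, -2],
             [4, 0, -4],
             [2, 0, -2]]

# ASSUMED threshold between the half- and full-amplitude levels.
spikes = ideal_spike_map(convolved, threshold=3)
for row in spikes:
    print(row)
# [0, 0, 0]
# [1, 0, 0]
# [0, 0, 0]

# With a threshold below the half-amplitude level, corners would also fire.
spikes_low = ideal_spike_map(convolved, threshold=1)
```

Only the positive full-amplitude pulse produces a spike at the higher threshold, mirroring the observation that inverse features and corner edges do not trigger the device.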
For comparison, Fig. 4 shows the theoretically calculated response of the VCSEL-neuron (using the model described in Section 4) when the same 'cross' image is injected into the model. From the calculated maps reconstructing the image (plotted in white and blue to distinguish them from the experimental findings of Fig. 3) we see excellent agreement with the measured results. The same number of spiking responses is activated when each kernel is applied to detect different vertical and horizontal image edge features. A similar spiking threshold is also achieved, preventing the activation of corner edges and giving an overall experimental accuracy of 100% in each of the 4 presented cases when compared with the numerical results. The spiking rate achieved in the theoretical results also showed a good correlation with the experiment, allowing the model to operate at 1.5 ns/pixel. Therefore, both experiment and theory agree that convolutions performed by horizontal and vertical Kernels 1-4 can successfully be thresholded by the VCSEL-neuron to encode target edge features with fast spiking responses directly in the optical domain. Additionally, we found that in both theory and experiment the spike activation threshold could be controlled by varying injection power and frequency detuning, and that adjusting it could enable the detection of non-target features with smaller-amplitude inputs (such as corners). Similarly, we expect that control of the activation threshold can be used to apply this convolution technique to non-binary, greyscale images to selectively trigger the detection of lower-contrast edge features.
Fig. 4. Simulations of the spiking response from the VCSEL-neuron, using the spin-flip model, when Kernels 1 (a), 2 (b), 3 (e) and 4 (f) are applied to the source 'cross' image before its injection into the system. Similar to Fig. 3, inputs and outputs are plotted for row 10 of (a) and row 12 of (e).

Gradient-based edge detection in source images with a spiking VCSEL-neuron
In order to progress beyond the detection of one individual directional edge per input, we look towards performing gradient edge detection. The gradient is a vector with both a direction and a magnitude, which provide information about the rate of change of pixel intensity. Specifically, calculating the magnitude of the gradient creates a set of data that can be injected into a VCSEL-neuron to perform spiking edge detection. The magnitude of the gradient $|G(x,y)|$ can be calculated using the following equation:

$$|G(x,y)| = \sqrt{G_x^2 + G_y^2}$$

where $G_x$ and $G_y$ are respectively the results of convolving a horizontal and a vertical kernel with the image. The horizontal and vertical kernels must be 90° rotations of one another. By combining the results of two kernel operators in this way, we can detect edges in our images irrespective of their direction. Consequently, the convolution results of Kernels 1 and 3 were combined in this way to produce the gradient magnitude. The latter was taken in place of the destination pixel value when creating the image input for injection into the VCSEL-neuron.
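The gradient-magnitude pre-processing step can be sketched as below. The two kernels are hypothetical ±1 examples that are 90° rotations of one another; the actual Kernels 1 and 3 are defined in Fig. 2 and are not reproduced in the text. The toy image is likewise an assumption.

```python
import math


def convolve_2x2(image, kernel):
    """Sliding-window 2 x 2 weighted sum without padding."""
    rows, cols = len(image), len(image[0])
    return [
        [
            sum(kernel[i][j] * image[p + i][q + j]
                for i in range(2) for j in range(2))
            for q in range(cols - 1)
        ]
        for p in range(rows - 1)
    ]


def gradient_magnitude(image, kernel_a, kernel_b):
    """Combine two directional convolutions as sqrt(Ga^2 + Gb^2) per pixel."""
    g_a = convolve_2x2(image, kernel_a)
    g_b = convolve_2x2(image, kernel_b)
    return [[math.hypot(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(g_a, g_b)]


# ASSUMPTION: hypothetical kernels, one a 90-degree rotation of the other.
KERNEL_VERTICAL = [[-1, 1],
                   [-1, 1]]
KERNEL_HORIZONTAL = [[-1, -1],
                     [1, 1]]

# Hypothetical toy image: a 2 x 2 black square (+1) on a white (-1) background.
toy_image = [
    [-1, -1, -1, -1],
    [-1,  1,  1, -1],
    [-1,  1,  1, -1],
    [-1, -1, -1, -1],
]

magnitude = gradient_magnitude(toy_image, KERNEL_VERTICAL, KERNEL_HORIZONTAL)
for row in magnitude:
    print([round(v, 2) for v in row])
# [2.83, 4.0, 2.83]
# [4.0, 0.0, 4.0]
# [2.83, 4.0, 2.83]
```

Because the magnitude discards the sign of each directional result, all four sides of the square give the same full-amplitude value, so a single thresholded input run can flag edges of either polarity and orientation.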
Gradient-based edge detection was performed on all three source images included in Fig. 2 and the results for both experimental and theoretical approaches are showcased in Figs. 5 and 6.  Fig. 2(a). The time series demonstrates that we now have a spiking response for both vertical edges (white-to-black and black-to-white transitions), and the image maps, built from the VCSEL-neuron's spiking output, also reveal the successful detection of the horizontal edges in the 'cross'. Figures 5(b) and 5(d) also demonstrate the modelled activation of spiking events for each edge of the 'cross' image, showing excellent agreement with the experimental results (an experimental accuracy of 99.18% when compared with the theoretical result). The experimental gradient edge detection of the 'Saltire' image (Fig. 2(b)) is shown in Fig. 5(e), illustrating also the successful detection of diagonal edges from a source image with our VCSEL-neuron. Despite gradient detection combining a horizontal and a vertical kernel, we see from the image map (Fig. 5(e)) and time series (Fig. 5(g)) that all diagonal edges were successfully detected in the 'Saltire' source image. The modelling of the VCSEL-neuron's response provided in Figs. 5(f) and 5(h) again shows excellent agreement with the experimental results (an experimental accuracy of 99.86% when compared with the theoretical findings), revealing the successful detection of all the diagonal edges. Figure 6 shows the experimental and theoretical response of the VCSEL-neuron to the gradient edge detection of the 'IOP' logo source image included in Fig. 2(c). This larger 50 × 50 pixel image contains both straight and curved lines (as shown in Fig. 2(c)), distributed unevenly across the image background. The image map and time series show that the gradient detection successfully reveals edges of every directionality. Despite the larger image, the pixel duration remained unchanged and only a longer overall time series was required.
Again, the modelling of the VCSEL-neuron (Figs. 6(c) and 6(d)) showed excellent agreement with the experimental results with an overall experimental accuracy of 99.67% compared to the numerical results. Spiking edge detection can therefore be successfully performed by calculating the magnitude of the gradient and using the VCSEL-neuron to threshold the convolved inputs.
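The text does not define how the quoted accuracy figures are computed. One reading that is numerically consistent with them is a pixel-wise agreement between the experimental and theoretical spike maps over the convolved image (27 × 27 = 729 pixels for the 28 × 28 images, 49 × 49 = 2401 pixels for the 50 × 50 image). The sketch below illustrates that assumed metric; the function name and the synthetic maps are hypothetical, and the mismatch counts are inferred from the percentages rather than reported by the authors.

```python
def agreement_percent(experimental, theoretical):
    """Percentage of pixels on which two binary spike maps agree.

    ASSUMPTION: this pixel-wise definition is inferred, not stated in the text.
    """
    total = 0
    matches = 0
    for row_e, row_t in zip(experimental, theoretical):
        for e, t in zip(row_e, row_t):
            total += 1
            matches += int(e == t)
    return 100.0 * matches / total


def synthetic_pair(size, mismatches):
    """Build two hypothetical size x size maps differing in `mismatches` pixels."""
    theory = [[0] * size for _ in range(size)]
    experiment = [row[:] for row in theory]
    for index in range(mismatches):
        experiment[index // size][index % size] = 1
    return experiment, theory


cross = agreement_percent(*synthetic_pair(27, 6))    # 723 of 729 pixels agree
saltire = agreement_percent(*synthetic_pair(27, 1))  # 728 of 729 pixels agree
iop = agreement_percent(*synthetic_pair(49, 8))      # 2393 of 2401 pixels agree
print(round(cross, 2), round(saltire, 2), round(iop, 2))
# 99.18 99.86 99.67
```

Under this assumption, the reported values of 99.18%, 99.86% and 99.67% would correspond to 6, 1 and 8 disagreeing pixels respectively.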

Conclusion
In summary, we demonstrate image edge detection using a single spiking VCSEL-neuron operating at the key telecom wavelength of 1300 nm. Our approach uses time-division multiplexing for image data injection; hence allowing for single VCSEL operation, that delivers the output data of image edge-features in a fast spiking representation (<100 ps long spikes) directly in the optical domain. The artificial neuronal model presented demonstrates the spiking edge detection of vertical, horizontal and diagonal straight lines using individual 2 × 2 kernel operators and traditional image convolution techniques. Building upon this, the identification and extraction of multiple straight and curved edges, irrespective of their directionality, was achieved in a single input run by calculating and thresholding image gradient magnitude. Furthermore, the experimental findings in this work are in excellent agreement with numerical simulations carried out using a modified version of the SFM. The commercially available and telecommunication compatible spiking photonic VCSEL-based neurons used in this proof-of-concept demonstration represent a first step towards the future development of high-speed neuromorphic photonic systems formed by interconnected networks of VCSEL neurons for enhanced image processing capability with additional functionalities (e.g. multiple input integration [23], photonic-hardware implemented weight control [24,25,32]) and pathways to help integrate with fields such as optical communications, directly in the optical domain.