Dual-Comb Real-Time Molecular Fingerprint Imaging

Hyperspectral imaging provides spatially resolved spectral information. Utilising dual frequency combs as active illumination sources, hyperspectral imaging with ultra-high spectral resolution can be implemented in a scan-free manner when a detector array is used for heterodyne detection. However, because it relies on low-noise detector arrays, this approach is currently limited to the near-infrared regime. Here, we show that dual-comb hyperspectral imaging can be performed with an uncooled near-to-mid-infrared detector by exploiting the detector array's high frame rate and the combs' high mutual coherence. The system simultaneously acquires hyperspectral data in 30 spectral channels across 16'384 pixels, from which molecule-specific gas concentration images can be derived. Artificial intelligence enables rapid data reduction and real-time image reconstruction. Owing to the detector array's sensitivity from 1 µm to 5 µm wavelength, this demonstration lays the foundation for versatile real-time imaging of molecular fingerprint signatures across the infrared wavelength regime.


I. INTRODUCTION
Hyperspectral imaging extends traditional imaging approaches by providing detailed spectral information for each pixel of an image [1]. As a general method it has been employed with impressive success across the scientific disciplines, including Earth remote sensing [2] and medical sciences [3]. Hyperspectral imaging instruments often rely on conventional visible or near-infrared cameras in conjunction with dispersive elements or filters. Achieving fast image acquisition as well as high spectral resolution in a large number of pixels remains challenging. Indeed, resolving the narrow optical absorption features of gases requires precision spectroscopic techniques, which typically do not offer spatial resolution. The capability to rapidly image and identify characteristic molecular fingerprints of gas molecules would open new opportunities for medical imaging diagnostics, environmental monitoring and industrial applications, including leak detection, process optimisation and identification of hazardous substances.
In order to permit rapid and reliable hyperspectral imaging of gas molecules, it is therefore desirable to implement precision spectroscopic techniques that combine multiplexed spatial and spectral acquisition. One particularly attractive approach is performing dual-frequency comb spectroscopy with an imaging detector array [4], which enables pixel-wise parallel spatial multiplexing and hence rapid acquisition of hyperspectral data without any moving mechanical parts.
In dual-frequency comb spectroscopy [5-9], two optical frequency combs (1 and 2) are used. Each represents a well-defined set of laser lines spaced by its respective repetition rate, $f_{\mathrm{rep}}^{(1)}$ and $f_{\mathrm{rep}}^{(2)}$ (differing by $\Delta f_{\mathrm{rep}} = |f_{\mathrm{rep}}^{(1)} - f_{\mathrm{rep}}^{(2)}|$), with a relative frequency offset $\Delta f_c \ll f_{\mathrm{rep}}^{(1,2)}$ between both combs. Simultaneous photodetection of the combined combs with a single detector results in a multi-heterodyne signal comprising periodic interferograms, each with a duration of $\Delta f_{\mathrm{rep}}^{-1}$. Fourier-transforming (at least one of) the interferograms yields the multi-heterodyne spectrum comprising beatnotes at frequencies $\Delta f_c + n \cdot \Delta f_{\mathrm{rep}}$ ($n = 0, \pm 1, \pm 2, \ldots$). Effectively, the optical spectrum is compressed by a factor of $(f_{\mathrm{rep}}^{(1)} + f_{\mathrm{rep}}^{(2)})/(2\Delta f_{\mathrm{rep}})$ and down-converted from the optical domain to multi-heterodyne frequencies around $\Delta f_c$. Direct dual-comb hyperspectral imaging is achieved when a dual-comb light source illuminates a sample and is then imaged onto a two-dimensional detector array where each pixel performs multi-heterodyne detection.
In this way, recent work demonstrated dual-comb hyperspectral imaging with a hyperspectral frame rate of 1 Hz and 5 spectral sampling points using a low-noise indium-gallium-arsenide (InGaAs) near-infrared detector array with a detector frame rate of 24 Hz [4]. In contrast to mid-infrared-capable detector arrays, InGaAs-based arrays offer low-noise detection; however, they are limited to near-infrared wavelengths and cover only a small portion of the infrared molecular fingerprint regime.
Here we show that an uncooled, near-to-mid-infrared lead-selenide (PbSe) photo-detector array, sensitive across the entire 1 µm to 5 µm wavelength range, can be used to perform dual-comb hyperspectral imaging. Although the array exhibits significant low-frequency flicker noise, its exceptionally high frame rate of up to 4 kHz permits heterodyne detection above the flicker-noise-dominated band. The high mutual coherence of the employed optical combs allows dense spacing (down to the Hz level) of the multi-heterodyne beatnotes and, in addition, can be leveraged to lower the contribution of thermal detector noise to the final signal. Hyperspectral images with approximately 30 high-resolution spectral channels are recorded with a hyperspectral frame rate of up to 10 Hz. Importantly, the array's near-to-mid-infrared sensitivity allows for direct operation in the molecular fingerprint region.
Moreover, this work addresses a second major challenge arising from the massively parallel data acquisition: while modern detector arrays can rapidly collect data, processing this data is not readily possible at the same rate, resulting in large memory requirements and time-consuming post-processing routines. Applications requiring direct or short-term actions (such as leak detection, chemical process monitoring or medical diagnostics) are then hindered by the time delay between the data acquisition and the processed image. To overcome this data reduction bottleneck, we show that artificial intelligence (AI) based on a deep convolutional neural network (CNN) [10-12] is capable of processing the data generated by the 16'384 hyperspectral pixels in real time on a personal computer, permitting molecule-specific live imaging of gases. Built on direct processing of time-domain interferograms, this method extends the AI toolbox for spectroscopy [13-17].

II. SETUP
In our experimental setup, a dual-comb source is used to illuminate a near-to-mid-infrared PbSe photo-detector array. The detector array is sensitive to light over the entire 1 µm to 5 µm wavelength range, which contains the characteristic spectral fingerprint signatures of a large number of gas molecules. The array, manufactured by NIT, consists of 128 × 128 pixels and can be read out with a maximal frame rate of 4 kHz, which corresponds to a maximal detectable heterodyne frequency of 2 kHz (Nyquist frequency). Between the dual-comb source and the detector, we arrange small nozzles through which acetylene gas (C2H2) can be released, resulting in a jet of gas that is probed by the large-diameter dual-comb illumination beam (Figure 1a). In this way, the spatial structure of the gas flow is projected onto the detector array, where the large number of pixels provides high spatial resolution.
Analysis of the detector's noise spectrum (Figure 1b) reveals prohibitively high flicker noise at low frequencies (below 100 Hz), including in particular pixel dark-value fluctuations. However, following the idea of lock-in detection, the high frame rate of the detector array provides access to a frequency band above 100 Hz for heterodyne beatnote detection, which exhibits significantly lower noise. We therefore aim to arrange all heterodyne beatnotes above 100 Hz (and below the Nyquist frequency), thereby overcoming the challenge of low-frequency flicker noise.
In order to allow dense encoding of spectral information in the desired heterodyne frequency interval, dual combs of high mutual coherence are required (the width of the heterodyne beatnotes needs to be smaller than their frequency spacing $\Delta f_{\mathrm{rep}}$). Such high-mutual-coherence dual-frequency combs have been demonstrated both in the near- and mid-infrared based on mode-locked lasers, electro-optic modulation and optical parametric oscillators [4,18-35]. In this work, near-infrared dual-frequency combs are generated via electro-optic modulation [36] in a fibre-based setup utilising polarisation-maintaining components. Their central wavelength is chosen to coincide with an absorption line of C2H2 gas (spectral line intensity of $4.882 \times 10^{-21}$ cm·molecule$^{-1}$). Specifically, a continuous wave (CW) laser at a wavelength of 1536.7 nm is split into two parts, from each of which a comb is derived; this method ensures high mutual coherence between both combs (Figure 1a). After splitting, the CW laser's frequency is shifted by 80 MHz and 80 MHz + $\Delta f_c$, respectively, creating a relative frequency offset $\Delta f_c$ between the centre frequencies of both combs. Next, optical combs are generated from the shifted CW laser lines by electro-optic modulation (similar to [34]). In this way, approximately 30 comb lines are generated in each comb and the combs' repetition rates are defined by the electro-optic modulation frequencies. As all modulation sources are referenced to a common 10 MHz clock, high mutual coherence (heterodyne beatnote width < 10 mHz) is readily achieved between both combs.
In order to densely sample and resolve the acetylene absorption line (FWHM approx. 10 GHz) we choose $f_{\mathrm{rep}}^{(1)} = 1$ GHz. The detector is operated with a 1 kHz frame rate, which is sufficient in our case and permits recording heterodyne beatnotes up to the Nyquist frequency of 500 Hz. The centre frequency and the spacing of the heterodyne beatnotes are set to $\Delta f_c = 250$ Hz and $\Delta f_{\mathrm{rep}} = 10$ Hz, respectively. As an aside, we point out that much wider comb spectra can be generated via non-linear spectral broadening [37], in particular for a similar configuration of the setup [38,39]. The detector's maximal frame rate of 4 kHz as well as the sub-Hz linewidth of the heterodyne beatnotes would in principle allow accommodating thousands of spectral channels for a different choice of $\Delta f_c$ and $\Delta f_{\mathrm{rep}}$ (a lower/higher value of $\Delta f_{\mathrm{rep}}$ would entail a reduced/increased acquisition rate).
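The placement of the beatnotes between the flicker-noise band and the Nyquist frequency can be checked with a few lines of arithmetic. The following is a minimal sketch using the operating values quoted in this work (1 kHz frame rate, 250 Hz centre frequency, 10 Hz spacing, 1 GHz line spacing); the exact line indexing (n = -14 ... 15 for 30 lines) and all variable names are assumptions made for illustration.

```python
# Sketch: where the multi-heterodyne beatnotes land for the quoted settings.
FRAME_RATE_HZ = 1000.0      # detector frame rate used in this work
DELTA_F_C_HZ = 250.0        # centre frequency of the heterodyne beatnotes
DELTA_F_REP_HZ = 10.0       # spacing of the heterodyne beatnotes
F_REP_HZ = 1.0e9            # optical comb line spacing
FLICKER_EDGE_HZ = 100.0     # flicker noise dominates below this frequency
N_LINES = 30                # "approximately 30" lines; exact count is assumed


def beatnote_frequencies(delta_f_c, delta_f_rep, n_lines):
    """Return delta_f_c + n * delta_f_rep for n_lines consecutive indices n."""
    first = -(n_lines // 2) + 1  # assumed indexing: n = -14 ... 15 for 30 lines
    return [delta_f_c + n * delta_f_rep for n in range(first, first + n_lines)]


beatnotes = beatnote_frequencies(DELTA_F_C_HZ, DELTA_F_REP_HZ, N_LINES)
nyquist_hz = FRAME_RATE_HZ / 2.0
in_band = all(FLICKER_EDGE_HZ < f < nyquist_hz for f in beatnotes)

# Compression factor from optical to heterodyne frequencies; the two
# repetition rates differ by only 10 Hz, so a single value is used here.
compression = F_REP_HZ / DELTA_F_REP_HZ
optical_span_hz = (N_LINES - 1) * F_REP_HZ

print(f"beatnotes from {beatnotes[0]:.0f} Hz to {beatnotes[-1]:.0f} Hz")
print(f"all above flicker band and below Nyquist: {in_band}")
print(f"compression factor: {compression:.1e}")
print(f"optical span covered: {optical_span_hz / 1e9:.0f} GHz")
```

Under these assumptions the 30 beatnotes occupy 110 Hz to 400 Hz, and the corresponding optical span of 29 GHz comfortably covers the approximately 10 GHz wide absorption line.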
The acquisition rate of 1000 detector frames per second across all 16'384 pixels allows massively parallel acquisition of data, with each pixel simultaneously recording the dual-comb time-domain interferograms that contain the spectral information. An example of several interferograms recorded by a single pixel of the detector is shown in Figure 2a (after removal of low-frequency components). Fourier transformation of the raw interferogram trace yields the multi-heterodyne spectrum, as shown in Figure 2b for different acquisition durations of 0.1 s, 1 s and 10 s (i.e. 1, 10 and 100 interferograms). The spectral envelope of the heterodyne beatnotes reflects the unabsorbed spectral envelope of the dual combs. For the shortest acquisition time (1 interferogram), the spectral resolution of the heterodyne spectrum corresponds to the frequency spacing of the heterodyne beatnotes. Longer acquisition durations provide higher spectral resolution in the heterodyne spectrum, and the heterodyne beatnotes appear as narrow spectral peaks. As the higher spectral resolution effectively rejects the incoherent white thermal noise contribution, the signal-to-noise ratio (SNR) of the heterodyne beatnotes grows proportionally with the square root of the acquisition duration. Even a single-interferogram acquisition provides a useful SNR of approximately 10. The high mutual coherence of the combs would in principle permit longer acquisition durations [34], and phase correction [19,40-45] can extend this well beyond 1000 s.
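The statement that a single interferogram resolves the beatnotes exactly at their spacing can be illustrated with a synthetic trace. The sketch below is an illustration only, not the acquisition code used here: it assumes 30 noise-free beatnotes of unit amplitude (the real spectral envelope is not flat) and uses a plain discrete Fourier transform from the Python standard library.

```python
import cmath
import math

FRAME_RATE_HZ = 1000.0
DELTA_F_C_HZ = 250.0
DELTA_F_REP_HZ = 10.0
N_SAMPLES = 100  # one interferogram: 0.1 s sampled at 1 kHz


def synth_interferogram(line_indices, n_samples):
    """Sum of unit-amplitude cosines at delta_f_c + n * delta_f_rep (assumed flat envelope)."""
    trace = []
    for k in range(n_samples):
        t = k / FRAME_RATE_HZ
        trace.append(sum(
            math.cos(2.0 * math.pi * (DELTA_F_C_HZ + n * DELTA_F_REP_HZ) * t)
            for n in line_indices
        ))
    return trace


def dft_magnitude(samples):
    """Magnitude of the DFT, normalised by the number of samples, up to Nyquist."""
    n = len(samples)
    mags = []
    for b in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * b * k / n)
                  for k, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags


line_indices = range(-14, 16)  # assumed indexing of the 30 comb lines
spectrum = dft_magnitude(synth_interferogram(line_indices, N_SAMPLES))
bin_width_hz = FRAME_RATE_HZ / N_SAMPLES
occupied_bins = [b for b, m in enumerate(spectrum) if m > 0.25]

print(f"bin width: {bin_width_hz:.0f} Hz")
print(f"occupied bins: {occupied_bins[0]} to {occupied_bins[-1]}")
```

With a 0.1 s record the frequency bins are 10 Hz wide, so each beatnote falls on exactly one bin. Recording ten interferograms narrows the bins to 1 Hz while the beatnotes stay far narrower than that, which is why the white-noise floor per bin drops and the SNR grows with the square root of the acquisition duration.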

III. RESULTS
For a first test of hyperspectral imaging, the C2H2 gas jet across the field of view is turned on. The heterodyne signal is recorded for each pixel and Fourier-transformed to yield the heterodyne spectra. For each pixel, the transmittance of the dual-comb light along its specific path is obtained by normalising the heterodyne spectrum by an unabsorbed reference heterodyne spectrum. In our case, the reference heterodyne spectrum is derived from 9 pixels in the top left corner of the detector (where only a negligible amount of gas is present). Alternatively, a pre-recorded reference spectrum may be used. An example of the transmittance signature recorded by a single pixel is shown in Figure 3a for a 10-second-long acquisition, showing very good agreement with the HITRAN database (residuals below 3%). In addition, standard-error bands for shorter acquisition durations (0.1 s and 1 s) are shown, revealing that absorption at the few-percent level can already be detected based on a single interferogram (0.1 s acquisition duration).
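The normalisation step can be written compactly. The following sketch uses made-up beatnote amplitudes in five channels and hypothetical function names; it assumes that both combs traverse the sample, so that the beatnote amplitude scales with the intensity transmittance, and it ignores any difference in illumination between the pixel and the reference pixels.

```python
def mean_spectrum(spectra):
    """Average several beatnote-amplitude spectra channel by channel."""
    count = len(spectra)
    return [sum(channel) / count for channel in zip(*spectra)]


def transmittance(pixel_spectrum, reference_spectrum):
    """Channel-wise ratio of a pixel's spectrum to the unabsorbed reference."""
    return [s / r for s, r in zip(pixel_spectrum, reference_spectrum)]


# Toy data: three gas-free reference pixels and one pixel behind the gas jet.
reference_pixels = [
    [1.0, 2.0, 4.0, 2.0, 1.0],
    [1.2, 2.4, 4.8, 2.4, 1.2],
    [0.8, 1.6, 3.2, 1.6, 0.8],
]
reference = mean_spectrum(reference_pixels)
pixel = [1.0, 1.8, 2.0, 1.8, 1.0]

t_values = transmittance(pixel, reference)
print([round(t, 3) for t in t_values])
```

In this toy example the central channel shows a transmittance of 0.5 while the outermost channels remain at 1, mimicking an absorption line centred on the comb.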
To image the spatial distribution of the C2H2 gas, the integrated concentration of C2H2 molecules along the light path is derived for each pixel. Generally, the absorption of light propagating through a sample is described by the Beer-Lambert law
$$\mathrm{d}I(\nu, l) = -\epsilon_m(\nu)\, c(l)\, I(\nu, l)\, \mathrm{d}l,$$
where $I$ is the intensity, $\nu$ is the optical frequency, $l$ is the spatial coordinate along the beam path, $\epsilon_m$ is the molar absorption coefficient and $c(l)$ the molar concentration of gas. Integrating both sides along the path from the dual-comb source to the detector array yields
$$-\ln T(\nu) = -\ln \frac{I_d(\nu)}{I_0(\nu)} = \epsilon_m(\nu) \int_0^L c(l)\, \mathrm{d}l,$$
where $T(\nu)$ is the measured transmittance (Figure 3a), $L$ is the distance between the dual-comb source and the detector array, $I_d(\nu) = I(\nu, L)$ is the intensity at the detector array and $I_0(\nu) = I(\nu, 0)$ is the intensity before absorption. A natural measure of the number of absorbing particles traversed by the beam is then defined as
$$C_{\mathrm{int}} = \int_0^L c(l)\, \mathrm{d}l,$$
which can be estimated by fitting the measured transmittance to a HITRAN model. The integrated concentration $C_{\mathrm{int}}$ is computed for each pixel based on the 10-second recording (100 interferograms) and shown in Figure 3b; the gas jet is well detected and imaged based on its infrared absorption signature. Profiles of $C_{\mathrm{int}}$ for different heights of the gas flow show a transverse expansion of the gas jet along its flow direction. While in our case only one gas species is imaged, the analysis can readily be generalised to include multiple gas species.
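Writing the integrated Beer-Lambert law as $-\ln T(\nu_k) = \epsilon_m(\nu_k)\, C_{\mathrm{int}}$ for each comb line $k$, the integrated concentration follows from a one-parameter least-squares fit. The sketch below illustrates this under strong simplifications: a Lorentzian profile with a hypothetical peak value stands in for the HITRAN line model, the data are noise-free, and all names are chosen for illustration.

```python
import math


def lorentzian_epsilon(detuning_hz, peak, fwhm_hz):
    """Stand-in absorption coefficient profile (NOT the HITRAN model)."""
    half = fwhm_hz / 2.0
    return peak * half ** 2 / (detuning_hz ** 2 + half ** 2)


def fit_integrated_concentration(transmittances, epsilons):
    """Least-squares C_int for the linear model -ln(T_k) = eps_k * C_int."""
    absorbances = [-math.log(t) for t in transmittances]
    numerator = sum(e * a for e, a in zip(epsilons, absorbances))
    denominator = sum(e * e for e in epsilons)
    return numerator / denominator


# 30 comb lines spaced by 1 GHz around the line centre (assumed indexing).
detunings_hz = [n * 1.0e9 for n in range(-14, 16)]
# Hypothetical peak value in m^2/mol; FWHM of 10 GHz as quoted in the text.
epsilons = [lorentzian_epsilon(d, peak=0.2, fwhm_hz=10.0e9) for d in detunings_hz]

true_c_int = 2.0  # mol/m^2, chosen for the example
measured = [math.exp(-e * true_c_int) for e in epsilons]

estimate = fit_integrated_concentration(measured, epsilons)
print(f"recovered C_int = {estimate:.6f} mol/m^2")
```

Because the model is linear in $C_{\mathrm{int}}$ once the logarithm is taken, the fit has a closed form; a full lineshape fit with additional free parameters (for instance linewidth or baseline) would require an iterative optimiser instead.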
Figure 3b demonstrates that hyperspectral imaging with the near-to-mid-infrared detector array is possible. However, the high data rate of approximately 35 Mb/s and the necessity of performing the analysis on each pixel require a large amount of memory and result in long computation times, in our case 20 to 30 minutes on a desktop computer for one frame.
Overcoming this data processing bottleneck is crucial in applications demanding fast reactions, such as leak detection or medical diagnostics. To this end, we demonstrate an AI-based approach that operates directly on single (temporal) interferograms and bypasses the conventional analysis presented above. A specifically designed one-dimensional convolutional neural network (CNN) is used to speed up the processing; it is implemented using the Keras library running on a TensorFlow backend [46]. The response of the CNN is invariant under translation of the input data along the time axis; a dedicated trigger or temporal alignment of the input interferogram is not required (any input of duration $\Delta f_{\mathrm{rep}}^{-1}$ is suitable). This property of the CNN greatly simplifies the data analysis and results in better versatility (i.e. applicability to unknown data) than densely connected networks with a comparable number of parameters. To avoid any unnecessary processing, the CNN directly takes a single amplitude-normalised interferogram with 100 sampling points as the input (Figure 4a). The first convolutional layer uses periodic boundary conditions, reflecting the periodic nature of the input, and a kernel size equal to the size of the input. In our case, we found that 8 filters in the first layer were sufficient; however, more filters can readily be added to extend the CNN's capabilities to analyse different gas species and mixtures thereof. Each subsequent layer has a kernel size (approximately) divided by two and a doubled number of filters, such that the number of processed features remains constant throughout the layers. Each layer's activation function was chosen to be the rectified linear unit [47], as it gave the best performance among a few test functions. The last layer directly provides the scalar integrated concentration value $C_{\mathrm{int}}$. As we detail below, the rapid and massively parallel recording capability of the system allows building large data sets for training, such that regularisation layers [48] to prevent over-fitting are not necessary. This neural network architecture allows skipping both the Fourier transformation and the time-consuming fitting of absorption profiles.
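The role of the periodic boundary conditions can be illustrated without any deep-learning library. The sketch below implements a single convolutional layer with circular (wrap-around) indexing and a rectified linear unit, using random placeholder weights rather than the trained network, and verifies that shifting the input interferogram in time merely shifts the resulting feature maps by the same amount. This shift equivariance is the ingredient behind trigger-free operation; how the later layers of the actual network combine the shifted features is not reproduced here.

```python
import random


def circular_conv_relu(signal, kernels):
    """One convolutional layer with periodic boundary conditions and ReLU."""
    n = len(signal)
    features = []
    for kernel in kernels:
        row = []
        for i in range(n):
            acc = sum(w * signal[(i + j) % n] for j, w in enumerate(kernel))
            row.append(max(0.0, acc))  # rectified linear unit
        features.append(row)
    return features


def roll(values, shift):
    """Cyclically shift a list to the right by `shift` positions."""
    shift %= len(values)
    return values[-shift:] + values[:-shift]


rng = random.Random(0)
N_POINTS = 100   # samples per interferogram
N_FILTERS = 8    # filters in the first layer
SHIFT = 17       # arbitrary time offset of the interferogram

# Placeholder input and weights (random numbers, not measured data).
interferogram = [rng.uniform(-1.0, 1.0) for _ in range(N_POINTS)]
kernels = [[rng.uniform(-1.0, 1.0) for _ in range(N_POINTS)]
           for _ in range(N_FILTERS)]

features = circular_conv_relu(interferogram, kernels)
shifted_features = circular_conv_relu(roll(interferogram, SHIFT), kernels)

n_features = sum(len(row) for row in features)
equivariant = all(
    abs(a - b) < 1e-12
    for row, shifted_row in zip(features, shifted_features)
    for a, b in zip(roll(row, SHIFT), shifted_row)
)
print(f"features after first layer: {n_features}")
print(f"shifted input gives shifted features: {equivariant}")
```

With 8 filters and 100 output positions the layer produces 800 features, the number quoted for the first layer in Figure 4a.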
While the capability of neural networks for fast data processing is widely recognised [49-51], the difficulty of building a reliable labelled training data set (containing training input data as well as the correct outcome) is often prohibitive to their use. In our case, a training data set is rapidly built by sending the dual-comb light through a 10 cm gas cell filled with acetylene and arranged between the dual-comb source and the detector (all light traverses the cell). Within 30 minutes we record 10-second-long data sets for 180 different integrated concentrations, ranging from 0 to a maximal integrated concentration of 4.16 mol·m$^{-2}$. In this way, a set of approximately 300 million interferograms over the full range of values for $C_{\mathrm{int}}$ is quickly obtained. In order to label each interferogram, i.e. to assign to it one of the 180 possible values of $C_{\mathrm{int}}$, the integrated concentration for each 10-second data set is derived as described above via HITRAN fitting. The signal of all pixels is combined for better precision in the derivation of the integrated concentration label. Note that the granularity of the output scale does not degrade the resolution in $C_{\mathrm{int}}$, which is limited by measurement noise (cf. Figure 3a). Moreover, fitting the HITRAN lineshape model is only used for labelling the training data set, not for the actual hyperspectral analysis. Complete independence from a theoretical lineshape model can be achieved if the training data is taken with a known gas concentration.
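The size of the training data set follows directly from the recording parameters, as the short bookkeeping sketch below shows. The pixel count, interferogram rate, recording duration and number of concentrations are taken from the text; the evenly spaced label values are a placeholder assumption, since the actual labels are obtained from HITRAN fits.

```python
N_PIXELS = 128 * 128               # 16'384 pixels
INTERFEROGRAMS_PER_SECOND = 10     # one interferogram every 0.1 s
RECORD_SECONDS = 10                # duration of each data set
N_CONCENTRATIONS = 180             # number of gas-cell fillings
MAX_C_INT = 4.16                   # mol/m^2, largest integrated concentration

per_data_set = N_PIXELS * INTERFEROGRAMS_PER_SECOND * RECORD_SECONDS
total_interferograms = per_data_set * N_CONCENTRATIONS

# Placeholder labels: evenly spaced values are an assumption for illustration.
labels = [MAX_C_INT * k / (N_CONCENTRATIONS - 1) for k in range(N_CONCENTRATIONS)]


def label_for(interferogram_index):
    """Every interferogram of a data set shares that data set's label."""
    return labels[interferogram_index // per_data_set]


print(f"interferograms per data set: {per_data_set:,}")
print(f"total interferograms: {total_interferograms:,}")
```

The total of 294'912'000 interferograms matches the figure of approximately 300 million stated above.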
Approximately $2 \times 10^6$ interferograms (equivalent to a recording time of only 12 s) from the training data set are randomly selected for training and validation of the CNN. The training proceeds over 100 epochs and takes approximately 8 hours on a desktop computer with a standard graphics processing unit. To improve the convergence of the learning process, the learning rate is divided by 2 every 10 epochs. A final standard deviation between the predictions and the expected output of 1.3 % of the maximum integrated concentration value is reached.
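The learning-rate schedule described above corresponds to a simple step decay. A minimal sketch is given below; the initial learning rate is a hypothetical value (it is not stated in the text) and the function name is chosen for illustration.

```python
def step_decay(initial_rate, epoch, drop_every=10, factor=0.5):
    """Learning rate after `epoch` epochs when it is halved every 10 epochs."""
    return initial_rate * factor ** (epoch // drop_every)


INITIAL_RATE = 1.0e-3  # hypothetical starting value, not taken from the text
N_EPOCHS = 100

schedule = [step_decay(INITIAL_RATE, epoch) for epoch in range(N_EPOCHS)]

print(f"epoch  0: {schedule[0]:.3e}")
print(f"epoch 10: {schedule[10]:.3e}")
print(f"epoch 99: {schedule[99]:.3e}")
```

Over 100 epochs the rate is halved nine times, i.e. reduced by a factor of 512 in total. A function of this shape, wrapped so that it takes the epoch index as its argument, could for example be handed to the `LearningRateScheduler` callback of Keras; whether the training here was set up in this way is not specified.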
To test the CNN's performance, we observe the dynamics of the emerging gas jet when the gas flow is turned on. The multi-heterodyne data of each pixel are processed for each 100 ms time window (single interferogram), so that good temporal resolution is achieved. From the series of reconstructed gas images, three snapshot frames, separated in time by 1 s, are shown in Figure 4c. In frame 0 no C2H2 gas has been released, in frame 10 the gas jet is emerging from one of several nozzles, and in frame 20 the gas jet from several nozzles is fully developed. The results in Figure 4c show that the CNN can reliably work on single interferograms (0.1 s acquisitions), permitting the observation of dynamic processes. Importantly, the trained CNN can process the data at a rate that exceeds the raw data recording rate, therefore enabling real-time molecule-specific imaging with a frame rate of 10 Hz. The CNN also alleviates the need for large memory storage by reducing the frame rate from 1000 raw heterodyne frames per second down to 10 gas images per second. If desired, the neural network could also be trained to output other parameters, e.g. gas temperature (based on line shapes), or be extended to multi-species imaging by adding outputs to the last layer and adjusting the training accordingly.
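The windowing of each pixel's data stream into single-interferogram inputs can be sketched as follows. The specific normalisation used here (removing the mean of each window and scaling to unit peak amplitude) is an assumption made for illustration, as the text only states that the input is amplitude-normalised; the trace is synthetic.

```python
import math

RAW_FRAME_RATE_HZ = 1000
WINDOW = 100                 # samples per interferogram (100 ms)
N_PIXELS = 128 * 128


def window_and_normalise(pixel_trace, window=WINDOW):
    """Split one pixel's trace into interferograms with zero mean and unit peak."""
    interferograms = []
    for start in range(0, len(pixel_trace) - window + 1, window):
        chunk = pixel_trace[start:start + window]
        mean = sum(chunk) / window
        centred = [v - mean for v in chunk]
        peak = max(abs(v) for v in centred) or 1.0  # avoid division by zero
        interferograms.append([v / peak for v in centred])
    return interferograms


# Synthetic 250-sample trace with an offset; the incomplete last window is dropped.
trace = [5.0 + math.sin(0.3 * k) for k in range(250)]
inputs = window_and_normalise(trace)

images_per_second = RAW_FRAME_RATE_HZ // WINDOW
interferograms_per_second = N_PIXELS * images_per_second

print(f"windows extracted: {len(inputs)}")
print(f"gas images per second: {images_per_second}")
print(f"interferograms to process per second: {interferograms_per_second:,}")
```

For real-time operation at 10 images per second, the network therefore has to evaluate 163'840 interferograms every second, while the output data volume is reduced by a factor of 100 relative to the raw frames.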

IV. CONCLUSION
In summary, we have shown that dual-comb precision hyperspectral imaging can be performed with an uncooled, high-frame-rate near-to-mid-infrared photo-detector array, enabling imaging of gases with molecular specificity. Hyperspectral data has been simultaneously recorded in 16'384 pixels with 30 spectral channels, and short acquisition times of 100 ms enabled observation of dynamic phenomena in an acetylene gas jet. If needed, a significantly larger number of spectral channels could be implemented, at the cost of a reduced image frame rate. Key to this demonstration are the high frame rate of the detector array and the high mutual coherence of the dual-comb illumination, which permit recording the heterodyne signal in a frequency band free of flicker noise (here above 100 Hz). Importantly, we have also shown that the high data rate resulting from the massively parallelised hyperspectral data acquisition can be handled in real time by a convolutional neural network, providing gas concentration images at a rate of 10 Hz. As the detector array is sensitive across the entire 1 µm to 5 µm wavelength range, our demonstration can readily be extended to cover the characteristic absorption fingerprints of a wide range of molecular species. Possible extensions of our demonstration include the use of high-repetition-rate mid-infrared quantum cascade [52,53] or microresonator combs [54,55] for broadband spectral imaging of transparent condensed-phase media.

FIG. 1. Dual-comb hyperspectral imaging and detector noise. a. Highly mutually coherent optical frequency combs are generated by electro-optic modulation (EOM) of a single continuous wave (CW) laser. The dual-comb light is sent through a sample, here a flow of absorbing acetylene (C2H2) gas, and then detected by a fast near-to-mid-infrared detector array. Each pixel of the 128 × 128 detector array simultaneously digitises the dual-comb multi-heterodyne interferograms, which contain spectral information about the sample that can be retrieved via Fourier transformation and normalisation. b. Single-pixel noise spectrum of the near-to-mid-infrared detector array.
Operating parameters of the setup (Section II): comb repetition rate $f_{\mathrm{rep}}^{(1)} = 1$ GHz; detector frame rate 1 kHz, which is sufficient in our case and permits recording heterodyne beatnotes up to the Nyquist frequency of 500 Hz; centre frequency and spacing of the heterodyne beatnotes $\Delta f_c = 250$ Hz and $\Delta f_{\mathrm{rep}} = 10$ Hz, respectively.

FIG. 2. Raw interferograms and spectrum. a. Raw dual-comb multi-heterodyne interferograms as recorded by a single pixel. b. Multi-heterodyne spectra obtained by Fourier-transforming the raw interferograms for different acquisition times (10 s in orange, 1 s in blue, 0.1 s in black).

FIG. 3. Single-pixel transmittance spectrum, comparison with HITRAN and integrated concentration image. a. Single-pixel dual-comb absorption spectrum of acetylene (C2H2) retrieved from a 10-second acquisition (blue dots) compared to a HITRAN fit (black line). The standard deviation of the absorption spectrum for shorter acquisition times (0.1 s and 1 s) is shown as bands (centred around the data based on the 10-second acquisition). b. Reconstructed integrated concentration image of an acetylene flow based on fits to the HITRAN model. Transverse absorption profiles are shown for three different positions along the gas flow (white curves).

FIG. 4. Neural network architecture, training history and AI results. a. A single 100-point interferogram is used as the input of the convolutional neural network. A first layer with periodic boundary conditions and 8 filters extracts 800 features from the input. Approximately halving the kernel width and doubling the number of filters in each subsequent layer keeps the number of features constant across the network. The last layer outputs the integrated concentration corresponding to the input interferogram. b. Training history of the network, achieving a standard deviation of 1.3 % of the maximal integrated concentration on the validation data set. c. Three selected frames from a movie that has been reconstructed in real time by the neural network. The frames show the dynamics of the C2H2 gas jet and are separated by 1 s each. The first frame is recorded before the gas jet is turned on, the second frame depicts the onset of gas emission and the last frame shows the established gas jet.