Compressive hyperspectral imaging in the molecular fingerprint band

: Spectrally-resolved imaging provides a spectrum for each pixel of an image that, in the mid-infrared, can enable its chemical composition to be mapped by exploiting the correlation between spectroscopic features and specific molecular groups. The compatibility of Fourier-transform interferometry with full-field imaging makes it the spectroscopic method of choice, but Nyquist-limited fringe sampling restricts the increments of the interferometer arm length to no more than a few microns, making the acquisition time-consuming. Here, we demonstrate a compressive hyperspectral imaging strategy that combines non-uniform sampling and a smoothness-promoting prior to acquire data at 15% of the Nyquist rate, providing a significant acquisition-rate improvement over state-of-the-art techniques. By illuminating test objects with a sequence of suitably designed light spectra, we demonstrate compressive hyperspectral imaging across the 700–1400 cm − 1 region in transmission mode. A post-processing analysis of the resulting hyperspectral images shows the potential of the method for efficient non-destructive classification of different materials on painted cultural heritage. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.


Introduction
Hyperspectral imaging in the mid-infrared has the potential to provide a spatial map of the chemical composition of an object or surface, with major applications in agriculture [1], healthcare [2] and technical art history [3]. An implementation based on full-field Fourier-transform spectroscopy (FTS) has the advantage that the spectral information is multiplexed [4], resulting in much higher throughput and superior signal-to-noise ratio (SNR) than filter-or grating-based spectrometers, while also accessing higher optical resolutions than are possible with competing techniques. Here, the light illuminating or scattered from a sample is modulated by passing through a Michelson interferometer, and the hyperspectral image is reconstructed from the Fourier transforms of the interferograms recorded on each pixel of the camera. Under conventional Nyquist-limited sampling, the interferometer path difference must be stepped in intervals of no more than half the shortest wavelength measured. This leads to a requirement for a few thousand camera frames to produce well sampled interferograms with few-cm −1 resolution in the 7-14-µm (700-1400 cm −1 ) fingerprint region. In the mid-infrared, the high cost of semiconductor-bandgap cameras-for example based on strained superlattice InAs/GaSb arrays [5]-prohibits their wider use, but the limited frame rates of lower-cost microbolometer arrays mean that conventional FTS imaging with these slower devices incurs acquisition times of several minutes. Here, we show how compressive sensing [6] can be used to significantly reduce the time taken to record spectroscopic images in the mid-infrared region, surpassing the Nyquist sampling criterion by a factor of 6.5×, while preserving spectroscopic features with sufficient precision to allow pixel-wise clustering of the materials present in an object (assuming a fixed number of classes).
Traditional signal compression approaches (using regular sampling) acquire measurements at conventional sampling rates before applying compression, providing no efficiency gain in the acquisition process itself. By contrast, compressive sensing exploits the fact that the signal-in the case of hyperspectral imaging, the infrared spectrum-is typically sparse in some transform basis or domain of representation, allowing it to be reconstructed from a smaller number of samples than would be necessary under Nyquist-Shannon sampling [7,8]. In hyperspectral FTS, this allows the interferometer path difference to be stepped at intervals greater than half of the shortest wavelength, directly reducing the acquisition time in proportion to the reduction in the number of samples. Considering only spectral compression, the goal of compressive hyperspectral imaging using FTS is to identify a minimal set of interferometer path differences, corresponding to a set of sinusoidally-modulated illumination spectra ( Fig. 1(a)), that sufficiently samples the object to allow spectral reconstruction using an appropriate sparsity-promoting prior. To demonstrate spectral reconstruction (even for an isolated pixel) we use a prior that decouples all the pixels which leads to a highly-parallelizable algorithm. However, more complex models accounting for spatial correlation could also be applied. In the following sections, we introduce the conceptual basis for this approach and how it has been implemented, first with a selection of chemically uniform plastic samples then, as an example of one important potential use case in the study of cultural heritage objects, with a set of paint samples containing different commercially available inorganic pigments. All spectroscopic results are validated by single-point ground-truth transmission spectra obtained using a conventional commercial imaging spectrometer. Variable-density subsampling strategy. A Michelson interferometer (a) is illuminated by a broadband source, and a set of prescribed path differences is used to tailor spectra, s i , with a discrete Fourier basis. Over the camera sensitivity region of 700-1400 cm −1 , these spectra are projected through a sample to interrogate its transmission spectrum, illustrated in (b) for polypropylene. The resulting interferogram (c) can be oversampled (blue), bandpass sampled (black circles), or under-sampled (red symbols). Unique subsampling is implemented according to the probability density function shown in green. Inset in (c) shows a zoomed region.

Concept of compressive hyperspectral Fourier-transform spectroscopy and computational reconstruction
Compressive sensing theory for Fourier-transform spectroscopy has undergone extensive development in recent years since the seminal works of Candes and Tao [9] and Tsaig and Donoho [10], who proposed an alternative sampling strategy to the prevalent sampling of bandpass signals at the Nyquist rate for signals that possess other properties, e.g. sparsity in a particular domain.
An example of such a sparsity-promoting domain of representation would be a spectral library of known materials expected to be present in the sample of interest. However, in our work, we make use of a more generic prior, providing wider applicability than is achievable using a fixed spectral library, which would need to be user-defined. Here, we assert that the spectra of interest are relatively smooth, and as such have sparse gradient transitions, an assumption that is justified by the typical shapes and feature widths in the absorption spectra of solid dielectric materials. Due to the nature of FTS, spectral compressive sensing is restricted to only subsampling the interferogram measurements. Spatial subsampling is not considered in this work, as the principal practical motivation is to reduce the acquisition time, which is more affected by the number of samples constituting the interferogram than the number of spatial locations probed, especially when using a camera as a detector. Three principal methods are available to subsample the interferogram: regular subsampling (equidistant coarse measurements) [11][12][13], random uniform subsampling (random interferometer positions drawn from a uniform distribution), and variable-density subsampling (random positions drawn from a non-uniform distribution) [14][15][16][17][18].
Here, a combination of regular subsampling and variable-density subsampling is employed to complement the characteristics of both the system (camera and source) and the spectra of interest. Moshtaghpour et al. [18] proposed a variable-sampling strategy, specifically for the case of a one-dimensional interferogram signal, which is adapted here. Measurements from all pixels are recorded using the same spectral mask to reduce the acquisition time. Using the same mask also simplifies the estimation process as it does not require storing individual masks for each pixel.
The image reconstruction problem can be formulated as an optimization problem that combines two terms: a data fidelity term, and a penalty term to control the level of sparsity / smoothness of the solution. In this work we present a reconstruction method which processes all the pixels independently. Although it does not account for spatial correlation between pixels it does offer scalability and a satisfactory compression ratio in practice. The classical convex minimization problem for compressive sensing, which is adopted here for each pixel, can be written as: where y p is the measured M × 1 interferogram time-signal for pixel p, x p is the Q × 1 resultant complex frequency spectrum, B is an M × N binary non-uniform sampling mask, F * is the conjugate transpose of a Q × N bandpass discrete Fourier transform (DFT) matrix, and D is a Q × Q one-dimensional discrete gradient (finite difference) matrix penalty function to promote sparsity of transitions in the spectral gradient. The first term is the fidelity term, using an l 2 norm r p 2 = (r 2 p,1 + · · ·+ r 2 p,Q ) 0.5 , scaled with the inverse of the noise variance σ 2 . The second term, using an l 1 norm r p 1 = (r p,1 + · · ·+ r p,Q ) 0.5 , penalizes the modulus of the (complex) gradient of x p . Its relative weight is controlled by the regularization parameter λ ≥ 0.
The construction of the sampling mask B utilizes variable density sampling alongside bandpass sampling. The bandpass sampling frequency f s is an integer factor n larger than f u , assuming the bandpass signal exists such that: 2f where the bandpass signal has frequency band (f l , f u ) [11,12]. Bandpass sampling with n = 3 was recently demonstrated in nano-FTIR holography by Schnell et al. [13]. Our empirical system bandwidth of 700-1400 cm −1 (determined by the camera sensitivity range) allows for bandpass sampling with n = 2, with an octave bandwidth at f l = 21.23 THz and f u = f s = 42.45 THz as illustrated in Fig. 1(b). In this analysis, an oversampled binary sampling mask (at 300% Nyquist-limited sampling) is an N × N identity matrix, where each row corresponds to a position on an equally spaced oversampled stage-position grid. A bandpass binary sampling mask where n = 2 is then a regular subsampling of one in every six rows of the oversampled binary mask from an initial offset row from one to six. This corresponds to a bandpass sampling rate of 50% Nyquist-limited sampling. A com pressive sampling mask B in Eq. (1) is constructed by applying a variable density sampling scheme to a bandpass binary sampling mask. The probability for each position in the interferogram or, equivalently, for each row in the bandpass binary sampling mask being chosen is given by a power-law of the form: where p is the probability between 0 and 1, Q is the number of unique points possible to be chosen, l is an integer from 1 to Q, α is the so-called cluster parameter, β is a threshold parameter, | · | defines the absolute value, and C Q,α,β is a normalization constant [18]. The sampling strategy is illustrated in Fig. 1(c), which shows one typical compressive sampling at 15% of Nyquist-limited sampling.
The alternating direction method of multipliers (ADMM) is utilized in the reconstruction algorithm, whereby variable-splitting is used to converge on an optimal solution given prior information including hard constraints [19]. The augmented Lagrangian problem can be constructed as: where ρ>0 is an arbitrary parameter ensuring the algorithm convergence, z p is a Q × 1 complex splitting variable, and d p is the Q × 1 scaled dual variable. The variable updates for the splitting variables at iteration k + 1 are: where the x k+1 p update is estimated by complex bi-conjugate gradient descent [20] and the z k+1 p update is achieved by applying the soft-thresholding operator on both real and imaginary parts [19]. Note that ρ only affects the convergence speed and it can be adjusted automatically using a heuristic method [19]. A set of simulations using synthetic data (shown in Fig. 2) was used to determine the optimal algorithm parameters (e.g. optimal λ) and subsampling level by comparing solutions from the reconstruction algorithm to an oversampled noise-free polypropylene (PP) transmission profile taken from an attenuated total reflectance (ATR) measurement database [21]. The simulated spectrum, displayed in Fig. 1(b), has the form e −a(ν−ν o ) 2 −α PP (ν) , representing an illumination source having a Gaussian profile and modulated by the reference spectrum of PP. This simulated spectrum was used to provide a ground-truth interferogram comprising points at 300% Nyquist-limited sampling ( Fig. 1(c), blue). The simulation ran over a wide range of signal-to-noise ratios and subsampling levels. There were 50 noise realizations of each interferogram created for each of the 16 chosen signal-to-noise ratios (SNR; defined as BF * x p 2 2 /Mσ 2 )-equally spaced on a logarithmic grid-for each of the nine chosen subsampling rates (SSR), from 50% to 10% Nyquist-limited sampling in increments of 5%. The 16 SNRs were centered around SNR=45, with σ 2 =100, which is the typical SNR observed in our actual measurements. The simulation was restricted to single pixels as the algorithm does not use information from neighboring pixels during the reconstruction process. Although the estimate of x p is a complex vector, our final spectrum estimate |x p | is its modulus. For each realization, we computed the normalized mean squared error (MSE) value by comparison with the ground-truth spectrum x p,0 as viewed in Fig. 2(b), defined as MSE = (|x p | − x p,0 ) 2 2 /x p,0 2 2 . The MSE values were recorded over several chosen regularization parameter λ values (spaced on a logarithmic grid) for each unique set of subsampling rate and SNR over all realizations ( Fig. 2(a)). The optimal regularization parameter value was selected as the value that yielded the lowest MSE over all realizations. This simulation provided a basis to experimental subsampling of an interferometric signal at a 15% Nyquist-limited sampling rate when the SNR >5.

Compressive acquisition hardware
The compressive hyperspectral imager is illustrated schematically in Fig. 3. A stabilized silicon nitride globar lamp (Thorlabs) was used as the light source, providing 200 mW over the 700-1400 cm −1 spectral range with 0.05% output power fluctuation (Fig. 3, inset). An aluminum reflector collected the light, directing a 35-mm beam towards a compact Michelson interferometer comprising a ZnSe beamsplitter, a ZnSe compensation plate and two gold retroreflectors. A ZnSe lens concentrated light on the sample plane, which was then transmitted to the LWIR camera. The system utilized a direct-drive linear motor stage (Physik Instrumente PIMag V-524) as the delay-scanning element, which provided the means to rapidly actuate the optical path difference to any chosen absolute position (random access positioning) with 250-mm s −1 velocity and 20-nm unidirectional repeatability. A highly accurate optical encoder permitted interferometry with long-wave infrared (LWIR) light without the need for the optical delay reference more commonly used in FTS, allowing for a simple configuration occupying around 0.3 m 2 . Hyperspectral imaging was performed in transmission, with a 4-cm −1 spectral resolution, corresponding to a maximum physical scan range of 1.25 mm. A series of images at various optical path differences was recorded with a microbolometer camera (FLIR A65SC) comprising a 640×512-pixel focal plane array with a pixel pitch of 17 µm and operating at 30 Hz. The camera had a noise equivalent temperature difference (NETD) of 50 mK and a digital output range of 14 bits. The spectral response of the camera covered the 7-14 µm band, allowing it to function as an optical bandpass filter that enabled bandpass sampling with n = 2. The acquisition and reconstruction of the hyperspectral data cubes were implemented under computer control and using the MATLAB language environment.

Transmission spectroscopy of pure plastics
The high spectral uniformity of pure plastics made them suitable for use in an initial evaluation of the LWIR hyperspectral imager. A composite sample ( Fig. 4(a)) containing four regions of different plastics-low-density polyethylene (LDPE), polycarbonate (PC), polypropylene (PP) and polystyrene (PS)-was used to assess the quality of the spectra obtained using different subsampling ratios. The sample was imaged in transmission mode ( Fig. 4(b)) using a 317×192pixel subset of the available 512×640 pixels due to the field of view of the camera. Specific absorption features of these compounds enable their accurate identification: for example, PC is the only one with carbonyl bonding, while having benzene rings in common with PS; LDPE and PP are straight chain polymers.
To prevent saturation of some of the strongest absorption features, each plastic sample was thinned from several mm to approximately 100 µm by mechanical milling / polishing. An interferometric data cube was acquired at 15% Nyquist-limited sampling rate, and the hyperspectral cube reconstructed as described in Section 2.1. Clustering of pixels into five groups (four plastics + noise) was achieved by fitting a Gaussian mixture model (GMM) whose labels were spatially correlated (Fig. 4(c)). This is possible using a Markov chain Monte Carlo sampler similar to that proposed by Altmann et al. [22] and which automatically adjusts the amount of spatial correlation during the segmentation process. Principal component analysis (PCA) [23] was first performed to project the signals reconstructed via ADMM and reduce their dimensionality before clustering, to lower the computational cost of the clustering step. For all the results presented in this paper, we used projections onto 10-dimensional principal subspaces and the final clustering step was performed with a user-defined number of clusters (e.g., five clusters in Fig. 4). The structure of the clustering map (Fig. 4(c)) and the clear separation between the first two principal components in a score plot (Fig. 4(d)) demonstrate that under-sampling at a rate of 15% of the Nyquist limit is sufficient to spectrally discriminate these materials rapidly and with high accuracy and spatial resolution, with acquisition taking around 15 seconds for a 30×30 mm 2 area.
Spectral transmission profiles, illustrated in Fig. 4(e), were obtained for the five groups of pixels by calculating the mean of their reconstructed spectra divided by the instrument spectrum. The reconstructed transmission spectra at 15% Nyquist-limited sampling correspond closely with the transmission reference spectra (Fig. 4(e)) recorded for the purposes of cross-validation with a conventional imaging spectrometer (see Supplement 1 for further information on this instrument). The spectra in Fig. 4(e) contain sufficient detail to assign specific features to individual molecular groups according to the wavenumbers of their peak absorptions annotated in Fig. 4(e). Full details of the band assignments are provided in Supplement 1. The omission of narrow spectral features was observed when under-sampling due to the nature of the sparsity-based prior which promotes sparse gradient transitions. However, as the comparison with conventional FTS data presented in Fig. 4(e) shows, this reconstruction artefact does not affect the chemical identification and qualitative spectral analysis of all plastic samples with high confidence.
The camera frame rate presents the primary limit on the acquisition speed of the hyperspectral imager, leading to a 270 second (4.5 minutes) acquisition time for measurements with 4-cm −1 spectral resolution at 300% Nyquist-limited sampling rate, and achieving an SNR of 96. Bandpass sampling reduces this acquisition time by a factor of six to 45 seconds, achieving this saving in time and data without information loss (despite a √ 6 reduction in SNR). In some regions, a saturation of absorption features was observed due to the spectral response function of the LWIR camera. A further reduction in acquisition time to 15 seconds was obtained using the 15% Nyquist-limited sampling scheme.
To compare the performance of the compressive spectral imaging approach presented here with the state-of-the-art we introduce a new metric, which is the product of the pixel sampling rate and the finesse of the hyperspectral imager. For an P-pixel imager, acquiring in a time τ an image with a spectral resolution of δν spanning an optical bandwidth ofν, we define the spatio-spectral acquisition rate (SSAR) as: This metric is defined to be invariant to the rate increase made possible by bandpass sampling (Eq. (2)), since it remains unchanged whenν and τ decrease by the same factor. It therefore provides a value that allows a comparison of hyperspectral imaging systems differing in the efficiencies of both their hardware and compressive sampling strategies. The estimated SSAR for the reported spectral imager is 3.9 pixels µs −1 , a factor of 3.25× improvement over recently published results by Primpke et al. [21] (1.2 pixels µs −1 ). Recently, hyperspectral imaging with an uncooled microbolometer camera was reported by Sugawara et al. [24], who obtained hyperspectral images and identification of four adhesives samples with 9 cm −1 resolution and 30 seconds measurement time using a 60-Hz microbolometer camera, achieving an SSAR of 0.16 pixels µs −1 . For our images, the total reconstruction time was around 100 seconds for the 317×192 pixel subset, when stopping the Dx − z 2 <10 −4 using a non-optimized MATLAB implementation [19]. The reconstruction time for the whole image could in principle be reduced to near instantaneous speeds both by using a more computationally efficient penalty function (e.g. using correlation between pixels) or by performing processing on a graphics processing unit (GPU), as recently demonstrated for spectral unmixing problems [25]. These standard approaches would permit near-real-time estimation, however such a focus on algorithm acceleration is beyond the scope of this paper and is left as future work.

Transmission spectroscopy of paint samples
Having validated the performance of the hyperspectral imager using homogeneous plastic samples in a high-SNR transmission modality, we next assessed it in the significantly more challenging context of inhomogeneous paint samples. These measurements represent a first step towards applications in the study of painted cultural heritage, where compressive imaging can minimize the infrared heating of delicate artefacts and, for large objects, significantly reduces the time needed for a hyperspectral imaging study. The composite sample, illustrated in Fig. 5(a), comprised four NaCl windows painted with the following pure pigments: Terre Verte, synthetic Ultramarine, Titanium White and Lead White. The powdered pigments were mixed with a 7% (vol:vol) rabbit skin glue:water as a binder. Further details of the chemical composition of these pigments appear in Supplement 1. The samples were hyperspectrally imaged in transmission and produced the 256×257-pixel LWIR image shown in Fig. 5(b). The lossy and non-uniform nature of these samples, with their look-alike spectral features, is representative of painted cultural heritage, contrasting with the sharper and more differentiated spectroscopy of plastics. Despite the lower SNR, the reconstructed spectra were of sufficient quality to permit automated pixel clustering, enabling classification of the paint specimens (Figs. 5(c) and 5(d)). The spectral features expected for each of the four pigments are found in the average hyperspectral spectra for each cluster, and the results were corroborated by independently measured point spectra (Fig. 5(e), gray lines) recorded by the same instrument used for the analysis of the plastic samples. The spectra display the characteristic bands expected from the chemical compositions of these pigments, and full details of the band assignments are presented in Supplement 1.
The quality of the reconstructed spectra, specimen clustering performance and rapid sampling rates provided in a compact instrument design is expected to enable applications of the hyperspectral imaging system in the non-invasive analysis of painted cultural heritage objects at SSARs orders of magnitude higher than recent reflection-FTS images by Sciutto et al. (0.00059 pixels µs −1 ) [26] and Rosi et al. [3] (0.027 pixels µs −1 ). Combined with fast and readily-available GPU-based reconstruction, optimized compressive approaches such as those we present here therefore promise a route to practical large-area hyperspectral imaging, complementing expensive and invasive techniques like electron dispersive X-ray spectroscopy. Wider application in cultural heritage applications will naturally require the extension to a reflection geometry because of the opacity of many painted objects.

Conclusions
Our results introduce and validate a new concept for Fourier-transform infrared hyperspectral imaging at considerably sub-Nyquist acquisition rates, based on combining bandpass and nonuniform compressive sampling, which allows data collection to be concentrated in the most information rich part of the interferogram. Practically, this is achieved by implementing a hyperspectral Fourier-transform spectroscopic imager using a random-access delay stage to sparsely sample the interferogram at predetermined unique and non-uniform delays, and forms a compressive sampling strategy optimized to the spectral response and acquisition rate of a microbolometer camera. We have shown that data acquired at only 15% of the Nyquist rate can be used to successfully reconstruct spectral features in transmission mode, achieving 20× faster acquisition than in the typical conventionally over-sampled case. The introduction of a new spatio-spectral acquisition rate metric allows the efficiency of different compressive hyperspectral imaging arrangements to be compared in a way that accounts for their measurement bandwidth and spectroscopic resolution, showing an improvement in the method presented here of at least 3.25× over the state of the art, and up to several orders of magnitude faster in specific cases of broadband hyperspectral imaging of artwork.
The potential of this work to enable real-time estimation from compressive measurements is expected to impact many applications, particularly when coupled with GPU-based reconstruction and post-processing. The reconstructed datasets we have presented demonstrate promising region-based information, and we have shown that by applying clustering methods after spectral processing the imager can readily differentiate different plastics and pigments. Future development should explore joint library-based reconstruction and segmentation to further enhance the accuracy and the sensitivity of spectral chemical differentiation, while also allowing faster algorithmic convergence. In particular, using regularization that leverages the spatial organization of natural images (e.g., as in [27]) is a clear next step. While in this work we have used a Michelson interferometer to illuminate the samples with a variety of sinusoidally modulated spectra, the approach can be extended more generally to use any spectral shaping device to produce an appropriate set of structured spectra, potentially allowing further improvement of the spatio-spectral acquisition rate.
Funding. Royal Academy of Engineering (RF201617/16/31); Engineering and Physical Sciences Research Council